Regression models

Member Event: Count Models Accelerator

July 30th, 2025 by Karen Grace-Martin

You’ll be excited to hear we’re doing another Statistics Skills Accelerator for our Statistically Speaking members: Count Models.

Stats Skills Accelerators are structured events focused on an important topic. They feature Stat’s Amore Trainings in a suggested order, as well as live Q&As specific to the Accelerator.

In August, our mentors will be running a new Accelerator. The first Q&A is August 6, 2025 at 3 pm ET, hosted by Jeff Meyer.

Count models are used when the outcome variable in a model or group comparison is a discrete count:

Number of eggs in a clutch
Number of days in intensive care
Number of aggressive incidents in detention

Count models come in a few types, and any of these can also be used for rates:

Poisson Regression is the simplest and is the basis for all the other models, but its assumptions are rarely met with real data.
Negative Binomial regression adds an extra parameter to a Poisson regression measure the extra variance that often occurs in real data.
Truncated count models work when the lowest values (often just zero) cannot occur. This happens when a count has to occur in order to be part of the population of interest.
Zero inflated count models are used when there are more zeros than expected. For this model, some zeros could have been something else and others couldn’t.
Hurdle models also work when there are more zeros than expected, but the process of having a zero is different. In these models, there is an actual “hurdle” one has to pass in order to have a non-zero count.
Logistic regression, when your count is out of of maximum number.

In this accelerator, learn about the different types of count models, how to understand their results, how to apply them to rates, and how to choose among them.

Note: This training is an exclusive benefit to members of the Statistically Speaking Membership Program and is a combination of watching recorded trainings and live events.

(more…)

No comments yet

Pros and Cons of Treating Ordinal Variables as Nominal or Continuous

July 3rd, 2025 by Karen Grace-Martin

There are not a lot of statistical methods designed just for ordinal variables. (There are a few, though.)

But that doesn’t mean that you’re stuck with few options. There are more than you’d think. (more…)

3 comments

Anatomy of a Normal Probability Plot

June 19th, 2025 by Karen Grace-Martin

A normal probability plot is extremely useful for checking normality assumptions. It’s more precise than a histogram, which can’t pick up subtle deviations. And yet it doesn’t suffer from too much power from large samples with tiny departures from normality or too little power from small samples with large departures from normality, as do tests like Shaprio-Wilkes.

The biggest problem with a normal probability plot is that it’s hard to read, especially if you’re not used to them. So let’s take a moment and walk through exactly how they work and what they tell you.

There are two versions of normal probability plot: Q-Q and P-P. I’ll start with the Q-Q. (more…)

6 comments

6 Types of Dependent Variables that will Never Meet the Linear Model Normality Assumption

February 18th, 2025 by Karen Grace-Martin

The linear model normality assumption, along with constant variance assumption, is quite robust to departures. That means that even if the assumptions aren’t met perfectly, the resulting p-values and confidence intervals will still be reasonable estimates.

This is great because it gives you a bit of leeway to run linear models, which are intuitive and (relatively) straightforward. This is true for both linear regression and ANOVA.

You do need to check the assumptions anyway, though. You can’t just claim robustness and not check. Why? Because some departures are so far off that the p-values and confidence intervals become inaccurate. And in many cases there are remedial measures you can take to turn non-normal residuals into normal ones.

But sometimes you can’t.

Sometimes it’s because the dependent variable just isn’t appropriate for a linear model. The (more…)

No comments yet

Regression Models: How do you know you need a polynomial term?

November 18th, 2024 by Karen Grace-Martin

You might be surprised to hear that not only can linear regression fit lines between a response variable Y and one or more predictor variables, X, it can fit curves too. There are many ways to do this, but the simplest is by adding a polynomial term.

So what is a polynomial term and how do you know you need one?

The linear parameters in a regression model

A linear regression model has a few key parameters. These include the intercept coefficient, the slope coefficient, and the residual variance.

That intercept defines the height of the regression line. It does so by measuring the height of the line at one specific point: when all X = 0.

The slope defines how much Y differs, on average, for each one unit difference in X. In other words, it measures the constant relationship between X and Y. Yes, there can be multiple Xs and each one has its own slope.

A polynomial term–a quadratic (squared) or cubic (cubed) term turns a linear regression model into a curve.

(more…)

No comments yet

The Data Analysis Pathway: Steps for Running any Statistical Analysis

September 10th, 2024 by Karen Grace-Martin

No matter what statistical model you’re running, you need to take the same steps. The order and the specifics of how you do each step will differ depending on the data and the type of model you use.

These steps are in 4 phases. Most people think of only the third as modeling. But the phases before this one are fundamental to making the modeling go well. It will be much, much easier, more accurate, and more efficient if you don’t skip them.

And there is no point in running the model if you skip the last phase. That’s where you communicate the results.

I’ve found that if I think of them all as part of the analysis, the modeling process is faster, easier, and makes more sense.

Phase 1: Define and Design

In the first five steps of running the model, the object is clarity. You want to make everything as clear as possible to yourself. The clearer things are at this point, the smoother everything will be. (more…)

28 comments