I am reviewing your notes from your workshop on assumptions. You have made it very clear how to analyze normality for regressions, but I could not find how to determine normality for ANOVAs. Do I check for normality for each independent variable separately? Where do I get the residuals? What plots do I run? Thank you!

I received this great question this morning from a past participant in my Assumptions of Linear Models workshop.

It’s one of those quick questions without a quick answer. Or rather, without a quick and useful answer. The quick answer is:

Do it exactly the same way. All of it.

The longer, useful answer is this: The assumptions are exactly the same for ANOVA and regression models. The normality assumption is that residuals follow a normal distribution. You usually see it like this:

*ε ~ i.i.d. N(0, σ²)*

But what it’s really getting at is the distribution of Y|X. That’s Y given the value of X. Because X values are considered fixed, they have no distributions. Residuals have the same distribution as Y|X, just centered at zero instead of at the mean of Y for that X. If residuals are normally distributed, it means that Y is normally distributed within a value of X (not necessarily overall).

The only difference between the models is that ANOVAs generally have only categorical predictor variables, whereas regressions tend to have mostly continuous ones. So while the assumption is the same, it plays out differently.

When predictors are continuous, it’s impossible to check for normality of Y separately for each individual value of X. There are too many values of X and there is usually only one observation at each value of X. So you have to use the residuals to check normality.

But when predictors are categorical, there are usually just a few values of X (the categories), and there are many observations at each value of X. So you’ll often see the normality assumption for an ANOVA stated as:

“The distribution of Y within each group is normally distributed.” It’s the same thing as Y|X and, *in this context*, it’s the same as saying the residuals are normally distributed.
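One way to see why the two statements agree is to write the model out for a one-way layout. The subscripts (observation i in group j) and the symbol μ_j for the group mean are notation introduced here for illustration:

```latex
Y_{ij} = \mu_j + \varepsilon_{ij}, \qquad
\varepsilon_{ij} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)
\quad\Longrightarrow\quad
Y_{ij} \mid \text{group } j \;\sim\; N(\mu_j, \sigma^2)
```

Adding a fixed constant μ_j to a normal error shifts its center but leaves its shape alone, so normal errors and normal Y within each group are the same claim.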

The concept of a residual seems strange in an ANOVA, and often in that context, you’ll hear them called “errors” instead of “residuals.” But they’re the same thing. It’s the distance between the actual value of Y and the mean value of Y for a specific value of X. Those distances have the same distribution as the Ys within that group.
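As a rough illustration of that definition, here is a minimal Python sketch using only the standard library. It assumes a one-way ANOVA (or a full factorial model, where every cell is its own group), so the fitted value for each observation is simply its group mean. The data and the function name `anova_residuals` are made up for the example.

```python
from statistics import mean

# Hypothetical data: response values (Y) keyed by group label (X).
groups = {
    "control": [4.0, 5.0, 6.0],
    "treatment": [7.0, 9.0, 11.0],
}

def anova_residuals(groups):
    """Return each observation's distance from its own group mean."""
    residuals = {}
    for label, values in groups.items():
        group_mean = mean(values)
        residuals[label] = [y - group_mean for y in values]
    return residuals

residuals = anova_residuals(groups)
print(residuals)
# {'control': [-1.0, 0.0, 1.0], 'treatment': [-2.0, 0.0, 2.0]}
```

Within each group the residuals are just the original values slid over so they center on zero, which is why their spread and shape match the Ys in that group.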

So in ANOVA, you actually have two options for testing normality. If there really are many values of Y for each value of X (each group), and there really are only a few groups (say, four or fewer), go ahead and check normality separately for each group.

But if you have many groups (a 2x2x3 ANOVA has 12 groups) or if there are few observations per group (it’s hard to check normality on only 20 data points), it’s often easier to just use the residuals and check them all together.

If you have a continuous covariate in the model as well, you’ve just lost option one, and residuals are the only way to go.

All GLM procedures have an option to save residuals. Once you do, run the same QQ plots to check normality as you would in regression.
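If your software doesn’t hand you the plot directly, the points behind a QQ plot are easy to compute. The sketch below is only one way to do it, in standard-library Python: the simulated data, the function name `qq_points`, and the plotting positions (i − 0.5)/n are all assumptions for the example (other plotting-position conventions exist), and it again treats each group mean as the fitted value.

```python
import random
from statistics import NormalDist, mean

def qq_points(sample):
    """Pair each sorted value with the standard normal quantile for its rank."""
    ordered = sorted(sample)
    n = len(ordered)
    # Plotting positions (i - 0.5) / n: one common convention among several.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return theoretical, ordered

# Simulated (made-up) data: three groups, different means, same spread.
random.seed(1)
groups = {
    "a": [random.gauss(10, 2) for _ in range(20)],
    "b": [random.gauss(14, 2) for _ in range(20)],
    "c": [random.gauss(18, 2) for _ in range(20)],
}

# Pool the residuals: each value minus its own group mean.
pooled = []
for values in groups.values():
    group_mean = mean(values)
    pooled.extend(y - group_mean for y in values)

theoretical, observed = qq_points(pooled)
# Each (theoretical, observed) pair is one point on the QQ plot.
for t, o in list(zip(theoretical, observed))[:3]:
    print(f"{t:8.3f} {o:8.3f}")
```

Plot `observed` against `theoretical` with whatever plotting tool you use; points that fall close to a straight line are consistent with normal residuals. The same function works on the raw Y values of a single group if you are checking normality group by group.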

Carsten says

Thank you so much for this explanation – barely anywhere clears up the confusion of why sometimes they say “normality” refers to part of the input (continuous data) and the output (residuals). Very useful!

Karen says

Thank you for this explanation.

I have a more general question:

Does this mean that I do not have to check the normality assumption for my variable overall (so not split up into groups), and instead I have to check the normality assumption within the groups?

Or do I always have to make sure that my overall variable is normally distributed in general and then I can proceed with the required statistical test and check for the normality assumption again (in this case normal distribution within groups)?

Thank you

Karen Grace-Martin says

Correct. Check normality within groups.

H K bagh says

Sir,

In ANOVA models (a generic case) it is assumed that Xs (independent factors) are non-normal.

Regression is a specific case of ANOVA.

However, if one forgoes the assumption of normality of Xs in regression model, chances are very high that the fitted model will go for a toss in future sample datasets.

Residual errors are normal, implies Xs are normal, since Ys are non-normal.

Further, capping, outliers, missing values, etc. cannot be applied to the dataset if Xs are considered non-normal.

thanks

H K bagh

Nausheen Sodhi says

Hi Karen! Thanks for the precise explanation on ANOVA’s normality assumption. But I have a doubt. So if I apply ANOVA and then test for normality of residuals, what to do if they are not normally distributed? Do I transform the data to make it normal and then apply ANOVA again? Thanks!

Max says

Hi Karen, thank you so much for this post. I got quite confused about the normality assumption and this cleared it up like nothing I have read so far! I have a few questions left, however. If I understood you correctly, you say that in ANOVA you can check the distribution of Y within each group as well as check the distribution of residuals in order to assess the normality assumption. 1) Why should you choose to look at the distribution of Y with few groups instead of always looking at the distribution of residuals, regardless of the number of groups? And 2) If you look at residuals, do you check the distribution of residuals PER group or overall? Thanks so much!

Sharon says

Hello was just wondering if you are testing for normality in different levels of the IV , would you use Shapiro Wilk if each level is less than 50 or does it have to be the total number of cases are less than 50 ?

Karen Grace-Martin says

Hi Sharon,

I don’t use Shapiro-Wilk. I use QQ Plots regardless of sample size.

Alicia says

Hi Karen,

I want to run an ANCOVA using R so as to evaluate the effect of a factor (which can be sex, age, area, etc., with several levels male/female, adult/subadult, a/b/c/d, etc.) on the relationship between length (covariate) and weight data (response variable). Prior to that, I have to check both normality and homogeneity of variances assumptions. Thanks to your enlightening post I now know that it is the residuals and not the variables that must fulfill normality.

But my question is, how am I supposed to check it? I mean, should I run the Shapiro-Wilk test for the logweight~loglength regression in each level of the factor? Taking sex as an example, for males and females separately? As there are some factors in my dataset which have many levels, I wonder if there is a quicker (although statistically correct) way to do so.

Thank you in advance!

estefaby says

Hi!

Can you tell me if it matters that data are non-normal? or we just have to verify the normality for the residuals?

Karen says

Yes. https://www.theanalysisfactor.com/assumptions-about-residuals/

Ruth says

What can you do if normality is not fulfilled for the residuals when running an ANOVA and multiple comparison of means by using S.plus?

Karen says

Hi Ruth,

It depends on exactly what kind of anova you’re doing and how the data are non-normal (ie, bimodal, skewed, really, really skewed). I’d need more information to tell you what to do, but here are some options:

1. Kruskal Wallis test

2. Transformation of Y

3. Some other type of model entirely. See this: https://www.theanalysisfactor.com/when-dependent-variables-are-not-fit-for-glm-now-what/

Karen says

Hi Baris,

I am honestly not sure if the number of levels themselves affect the Levene’s test. I don’t generally use it myself.

But I can tell you, those two tests are comparing two different things. The first is comparing 4 variances to each other. The other is comparing 20. So it’s entirely reasonable to believe that those 4 variances are similar, but once you split the data into 20 groups, not all 20 variances are the same.

Karen

Baris says

Dear Karen,

Thank you for your insightful blog posts. Although my question is not related to the normality assumption, it is about the other assumption of ANOVA: homogeneity of variance. I want to run a 2-way ANOVA using SPSS (it’s unbalanced). The first factor has 4 levels, the second one has 5 levels. When I do a one-way ANOVA with the 4-level factor and a log-transformed DV, the Levene test has a p-value >.05; but when I introduce the second factor, the Levene test p-value is <.05. Can the reason be that the factors have too many levels? Thank you.

Regards,