In a regression model, should you drop interaction terms if they’re not significant?

In an ANOVA, adding interaction terms still leaves the main effects as main effects. That is, as long as the data are balanced, the main effects and the interactions are independent. The main effect is still telling you if there is an overall effect of that variable, after accounting for other variables in the model.
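One way to see why balance matters is to look at the design columns for a 2×2 layout. The sketch below is only an illustration, and everything in it is an assumption rather than something from the post: the cell counts are made up, and effect coding (+1/-1) is just one convenient choice. With equal cell sizes, both main-effect columns are orthogonal to the interaction column. With unequal cell sizes, that can fail.

```python
def effect_columns(cell_counts):
    """Build effect-coded columns for a 2x2 design.

    cell_counts maps (a, b), with a and b each +1 or -1, to the
    number of observations in that cell (hypothetical counts).
    """
    A, B, AB = [], [], []
    for (a, b), n in cell_counts.items():
        A += [a] * n
        B += [b] * n
        AB += [a * b] * n  # interaction column is the product of the two
    return A, B, AB


def dot(u, v):
    """Dot product; zero means the two columns are orthogonal."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Made-up cell sizes: every cell has 2 observations.
balanced = {(1, 1): 2, (1, -1): 2, (-1, 1): 2, (-1, -1): 2}
# Made-up cell sizes: same total, but unequal cells.
unbalanced = {(1, 1): 3, (1, -1): 1, (-1, 1): 2, (-1, -1): 2}

A_bal, B_bal, AB_bal = effect_columns(balanced)
A_unb, B_unb, AB_unb = effect_columns(unbalanced)

bal_A_AB = dot(A_bal, AB_bal)   # 0: main effect A is orthogonal to A*B
bal_B_AB = dot(B_bal, AB_bal)   # 0: main effect B is orthogonal to A*B
unb_A_AB = dot(A_unb, AB_unb)   # nonzero: A now overlaps with A*B
```

In the unbalanced layout, the estimate for A depends on whether the interaction is in the model, which is the situation the "as long as the data are balanced" caveat is guarding against.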

But in regression, adding interaction terms makes the coefficients of the lower order terms conditional effects, not main effects. That means that the effect of one predictor is conditional on the value of the other. The coefficient of the lower order term isn’t the effect of that term. It’s the effect only when the other term in the interaction equals 0.
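A small numeric sketch may make this concrete. The coefficient values below are hypothetical, chosen only for illustration, and the model is assumed to be y = b0 + b1*x + b2*z + b3*x*z. The slope of x works out to b1 + b3*z, so b1 by itself is the slope only at z = 0. Centering z at another value changes the lower order coefficient without changing any prediction.

```python
# Hypothetical coefficients for y = b0 + b1*x + b2*z + b3*x*z
b0, b1, b2, b3 = 2.0, 1.5, -0.5, 0.25


def predict(x, z):
    return b0 + b1 * x + b2 * z + b3 * x * z


def slope_of_x(z):
    """Change in y for a one-unit increase in x, holding z fixed."""
    return predict(1.0, z) - predict(0.0, z)


slope_at_0 = slope_of_x(0.0)  # equals b1: the effect of x only when z = 0
slope_at_4 = slope_of_x(4.0)  # equals b1 + b3 * 4

# Re-express the same model with z centered at an assumed value of 4.
z_center = 4.0
c0 = b0 + b2 * z_center       # new intercept
c1 = b1 + b3 * z_center       # new "lower order" coefficient for x


def predict_centered(x, z):
    zc = z - z_center
    return c0 + c1 * x + b2 * zc + b3 * x * zc
```

Here `c1` equals `slope_at_4`, and `predict_centered` agrees with `predict` everywhere. The lower order coefficient moved from 1.5 to 2.5 only because the zero point of z moved.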

So if an interaction isn’t significant, should you drop it?

If you are just checking for the presence of an interaction to make sure you are specifying the model correctly, go ahead and drop it. The interaction uses up df, changes the meaning of the lower order coefficients, and complicates the model. So if you were just checking for it, drop it.
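One common way to do that kind of check is to fit the model with and without the interaction and compare them with an F test. The sketch below is a minimal illustration under stated assumptions: the data are made up, the helper names (`solve`, `rss`) are hypothetical, and the fit uses plain normal equations in pure Python rather than any particular statistics package.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def rss(X, y):
    """Residual sum of squares from a least-squares fit of y on X."""
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


# Made-up data: y depends on x and z, with no true interaction.
x = [0, 1, 2, 3, 0, 1, 2, 3]
z = [0, 0, 0, 0, 1, 1, 1, 1]
noise = [0.1, -0.2, 0.2, -0.1, -0.1, 0.2, -0.2, 0.1]
y = [1 + 2 * xi + 3 * zi + e for xi, zi, e in zip(x, z, noise)]

X_reduced = [[1.0, xi, zi] for xi, zi in zip(x, z)]         # no interaction
X_full = [[1.0, xi, zi, xi * zi] for xi, zi in zip(x, z)]   # with x*z

rss_reduced = rss(X_reduced, y)
rss_full = rss(X_full, y)

n, p_full, q = len(y), 4, 1   # q = number of terms being tested
f_stat = ((rss_reduced - rss_full) / q) / (rss_full / (n - p_full))
```

For these made-up numbers the F statistic comes out at about 0.08 on 1 and 4 df, far below the 5% critical value of roughly 7.71, so the interaction would not be significant here. Whether to drop it then depends on why it was in the model in the first place.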

But if you actually hypothesized an interaction that wasn’t significant, leave it in the model. The non-significant interaction means something in this case: it helps you evaluate your hypothesis. Taking it out can do more damage through specification error than leaving it in will through the loss of df.

The same is true in ANOVA models.

And as always, leave in any lower order terms, significant or not, for any higher order terms in the model. That means you have to leave in all insignificant two-way interactions for any significant 3-ways.
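The rule can be spelled out mechanically: for any interaction in the model, every term built from a subset of its factors stays in as well. The helper below is a hypothetical sketch, and the colon notation for interactions (as in `A:B:C`) is an assumption borrowed from common formula syntax, not something the post specifies.

```python
from itertools import combinations


def required_lower_order_terms(term):
    """List every lower order term implied by one interaction term.

    `term` uses an assumed colon notation, e.g. "A:B:C" for a
    three-way interaction among factors A, B, and C.
    """
    factors = term.split(":")
    return [
        ":".join(subset)
        for k in range(1, len(factors))      # all orders below the term's own
        for subset in combinations(factors, k)
    ]


three_way_needs = required_lower_order_terms("A:B:C")
# ['A', 'B', 'C', 'A:B', 'A:C', 'B:C']
two_way_needs = required_lower_order_terms("X1:Time")
# ['X1', 'Time']
```

So a significant `A:B:C` keeps all three main effects and all three two-way interactions in the model, whether or not any of them is significant on its own.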

Dear Karen

I have a main factor that is directly proportional to the response. I identified the significant and non-significant factors and interactions and then dropped the non-significant ones. When I ran the DOE again with only the significant ones, the effect of this main factor remained positive, as expected. Strangely, the estimated coefficient in uncoded units for this factor became negative. The low and high levels for this factor are positive (110 and 130). Is this possible? Why? Can the presence of interactions cause this? PS: all the factors and their levels are positive. Thanks in advance!

Hi Karen!

Thank you for the very useful article about interpreting interaction terms. I know that if we are looking at the coefficients, those “main effects” are not main effects once the interaction is in the model. They’re marginal effects. The coefficients can be used to interpret the magnitude of the marginal effects. I wonder, how can I determine the statistical significance of the marginal effects? Thank you in advance.

Dear Karen,

For some reason I am not able to see the other 11 comments to this post.

However, the following reference strongly suggests that non-significant interaction terms should NOT be included in statistical models.

Engqvist, L. (2005). The mistreatment of covariate interaction terms in linear model analyses of behavioural and evolutionary ecology studies. Animal Behaviour, 70, 967–971.

I would appreciate if you could comment on the discrepancy between your view and that presented in the article.

Best,

Theo

Hi Karen,

Thanks for a great article. I don’t quite understand the reason to leave in an insignificant term – how does it help you evaluate your hypothesis? The only hypothesis it would help evaluate is…whether there is interaction of those terms. Aren’t you more concerned with discerning the relationships (whatever they are) of the covariates to the dependent variable? Said another way: when and why would you NOT be trying to specify the model correctly?

I suppose your answer will have to do with the specification error you mentioned, with which I am only basically familiar.

Hi!

I hope you are doing well. This was a thoughtful read, but I was wondering: if I get a non-significant interaction term but the overall model (ANOVA) is significant, should I drop the interaction term from the regression model?

Anxiously waiting for your reply.

Regards

Dear Karen,

In my analysis I am interested in a three-way interaction including time. Basically it is X1*Time*X2; my dependent interacts with time and is dependent on the levels of my moderator. However, I also hypothesize a two-way interaction between X1 and time. This interaction is not significant, but my three-way interaction is significant. I know that you cannot interpret main effects when you include a two-way interaction. But what about interpreting two-way interactions when you include a three-way interaction? Thank you in advance,

Tom

Hi Tom,

You can often interpret a main effect when there are interactions. It depends on the pattern of means.

This is also true for two way and three way interactions. Depending on the nature of the three-way, the two way may or may not make sense on its own. You are hypothesizing a two way interaction for a reason, so does it still make sense in the presence of the three way?

Hi, I just wanted to know: what if my main effects turn non-significant or the sign of the coefficient reverses (such that it is counter-intuitive to theory), but the interaction terms are now significant and exactly as you would expect them to be theoretically? I am not sure how to interpret this.

Hi Muj,

If you’re looking at the coefficients, those “main effects” are not main effects once the interaction is in the model. They’re marginal effects. See this: http://www.theanalysisfactor.com/interpreting-lower-order-coefficients-when-the-model-contains-an-interaction/

Hi Karen,

I know that the coefficients of main effects can’t be interpreted separately anymore once the interaction is in the model. But how can I interpret the fact that the main effect turns non-significant when there’s a significant interaction?

May

Dear Karen,

Thank you for the article. It was very helpful. But I am still a bit confused, and it would be great if you could give me a hint. I have a regression model with 4 variables. Two of them do not have a significant coefficient, nor do they contribute to the adjusted R² or F / significance of F, so I dropped them. Further, there is one variable which has the greatest explanatory power. The last one is non-significant but adds something to the adjusted R². So I thought I would test whether there is some interaction going on, and as it turned out, the interaction is non-significant with a beta of 0. In this case, would it be reasonable to drop the interaction term and the idea of an interaction?

Thank you, Lukas

Hi Lukas,

Yes, the interaction tests something completely different from the main effects.

Thank you very much. I was highly confused about this same point, but your last line made everything clear: keep lower order terms for higher order interactions. Leave in a non-significant two-way interaction for a significant three-way interaction, and any main effect for a significant two-way interaction, as it consumes degrees of freedom in type III error.

I was just writing back what I interpreted from the article.

Thanks a ton. Saurabh Jangir

You’re welcome. Glad it was helpful. 🙂