Clarifications on Interpreting Interactions in Regression

In a previous post, Interpreting Interactions in Regression, I said the following:

In our example, once we add the interaction term, our model looks like:

Height = 35 + 4.2*Bacteria + 9*Sun + 3.2*Bacteria*Sun

Adding the interaction term changed the values of B1 and B2. The effect of Bacteria on Height is now 4.2 + 3.2*Sun. For plants in partial sun, Sun = 0, so the effect of Bacteria is 4.2 + 3.2*0 = 4.2. So for two plants in partial sun, a plant with 1000 more bacteria/ml in the soil would be expected to be 4.2 cm taller than a plant with less bacteria.

For plants in full sun, however, the effect of Bacteria is 4.2 + 3.2*1 = 7.4. So for two plants in full sun, a plant with 1000 more bacteria/ml in the soil would be expected to be 7.4 cm taller than a plant with less bacteria.

But I just received the following question about this explanation. I thought I’d respond here, in case I’m confusing other people as well.

The question was:

I was confused on how to interpret the interaction results. According to the post “For plants in full sun, however, the effect of Bacteria is 4.2 + 3.2*1 = 7.4.” I do not understand why the “sun” coefficient is not included, such that the effect of bacteria in full sun would be 9 + 4.2 + 3.2*1. Thanks for your help.

And here’s my answer:

Excellent question. First of all, you would need to include the 9 (the coefficient for full sun) to calculate the predicted, or mean, height for plants in full sun at any specific value of Bacteria that you decided to plug in.

Because Sun is dummy-coded, that 9 (Sun’s coefficient) represents the difference in mean plant heights for plants in full sun compared to those in partial sun ONLY when Bacteria=0.
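To make that concrete, here is a minimal sketch in Python that plugs values into the fitted equation from the post. The function names (predicted_height, sun_gap) are made up for illustration, and I'm assuming Bacteria is measured in thousands per ml, as in the original example. It shows that the gap between full-sun and partial-sun plants equals 9 only when Bacteria = 0, and grows by 3.2 for each additional unit of Bacteria.

```python
# Coefficients taken from the fitted model in the post.
INTERCEPT = 35.0
B_BACTERIA = 4.2
B_SUN = 9.0
B_INTERACTION = 3.2


def predicted_height(bacteria, sun):
    """Predicted mean height in cm.

    bacteria: soil bacteria count (assumed to be in 1000s per ml)
    sun: dummy code, 0 = partial sun, 1 = full sun
    """
    return (INTERCEPT
            + B_BACTERIA * bacteria
            + B_SUN * sun
            + B_INTERACTION * bacteria * sun)


def sun_gap(bacteria):
    """Full-sun minus partial-sun predicted height at one Bacteria value."""
    return predicted_height(bacteria, 1) - predicted_height(bacteria, 0)


# The gap is 9 only at Bacteria = 0; elsewhere it is 9 + 3.2 * Bacteria.
for bacteria in (0, 1, 2):
    print(bacteria,
          round(predicted_height(bacteria, 1), 1),
          round(sun_gap(bacteria), 1))
```

Notice that the 9 is needed for the predicted heights themselves, which is exactly where it belongs.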

But to know the effect of Bacteria levels on plant height, you don’t need to know the differences in means. The effect of a predictor variable, X, in a regression model is how much Y differs, on average, for a one-unit difference in X.

In this example, it’s the increase (or decrease) in plant height for each incremental difference in soil bacteria count.

That’s the slope in a simple linear regression.

The interaction is telling you that this increase is not the same for plants in full and partial sun.

So the coefficient of Bacteria on its own is not enough to tell you the effect of Bacteria on plant height. The coefficient of Bacteria is not an overall slope for Bacteria.

That's because the effect isn't constant. There are two different slopes (effects of Bacteria on height): one for full sun and one for partial sun.
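As a quick check, here is a small Python sketch that computes the effect of Bacteria the way it is defined above: the difference in predicted height for a one-unit difference in Bacteria, holding Sun fixed. The helper names are hypothetical, and the starting Bacteria value is arbitrary. The point is that the 9 appears in both predictions and cancels when you subtract.

```python
def predicted_height(bacteria, sun):
    """Predicted mean height (cm) from the fitted model in the post."""
    return 35 + 4.2 * bacteria + 9 * sun + 3.2 * bacteria * sun


def bacteria_effect(sun, bacteria=2.0):
    """Change in predicted height for one more unit of Bacteria.

    sun: 0 = partial sun, 1 = full sun
    bacteria: arbitrary starting value; the result does not depend on it
    """
    return predicted_height(bacteria + 1, sun) - predicted_height(bacteria, sun)


# Sun's coefficient (9) is in both terms of the subtraction, so it drops out.
print("partial sun:", round(bacteria_effect(0), 1))
print("full sun:   ", round(bacteria_effect(1), 1))
```

The two printed values match the slopes 4.2 + 3.2*0 and 4.2 + 3.2*1 from the original explanation.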


Comments

Bul says

June 28, 2019 at 11:56 am

Hi all, I need clarification about the significance of the coefficients. Do we require that all coefficients be significant? I mean the coefficient of Bacteria and the coefficient of the interaction of Bacteria with Sun. If only the interaction coefficient is significant, do we consider only its value for interpretation (without adding the coefficient of Bacteria)? Thank you for clarifying.

Seren says

November 5, 2018 at 8:54 am

If I am unsure whether the model requires an interaction term and I add it in anyway, could this cause the model to be incorrect? In other words, is it a safe bet to always add in an interaction term? Thank you.

- NEnduru says
  
  April 5, 2020 at 12:01 am
  
  Hi Seren,
  
  We add an interaction term if there is any relationship between two variables.
  For example, take predicting children’s food nutrition based on family size and income. Here the interaction between family size and income can make a significant difference.
  
  - Karen Grace-Martin says
    
    April 17, 2020 at 2:44 pm
    
    NEnduru,
    
    I would just clarify: the interaction isn’t about the relationship between two variables. It’s about how a third variable affects the relationship between two others. Subtle difference. See The Difference Between Interaction and Association.
    
A says

September 27, 2017 at 1:24 pm

The effect of Bacteria on Height can be interpreted as “how much Height differs for a one-unit difference in Bacteria”

Rewrite original formula:
Height_1 = 35 + 4.2*Bacteria + 9*Sun + 3.2*Bacteria*Sun
= 35 + 9*Sun + (4.2 + 3.2*Sun)*Bacteria

Assume Bacteria increases by 1:
Height_2 = 35 + 9*Sun + (4.2 + 3.2*Sun)* (Bacteria + 1)
= 35 + 9*Sun + (4.2 + 3.2*Sun)*Bacteria + (4.2 + 3.2*Sun)
= Height_1 + (4.2 + 3.2*Sun)

Therefore, Height_2 – Height_1 = 4.2 + 3.2*Sun

So the effect is 4.2 + 3.2*Sun

- Karen says
  
  January 29, 2018 at 12:22 pm
  
  Yes, exactly. That’s the whole idea of the interaction: the effect of bacteria on height is not the same in partial sun as it is in full sun.
  
ben says

March 28, 2017 at 4:21 pm

Would it be correct to say that the effect of bacteria AND sun is 9+4.2+3.2*1? Or what else would that sum mean?
Thanks for any input,
ben

Hendrik says

May 19, 2016 at 8:21 am

What if, instead of Bacteria, we had another dummy coded variable that interacted with Sun: Would this argument still hold?
