Clarifications on Interpreting Interactions in Regression

by Karen


In a previous post, Interpreting Interactions in Regression, I said the following:

In our example, once we add the interaction term, our model looks like:

Height = 35 + 4.2*Bacteria + 9*Sun + 3.2*Bacteria*Sun

Adding the interaction term changed the values of B1 and B2. The effect of Bacteria on Height is now 4.2 + 3.2*Sun. For plants in partial sun, Sun = 0, so the effect of Bacteria is 4.2 + 3.2*0 = 4.2. So for two plants in partial sun, a plant with 1000 more bacteria/ml in the soil would be expected to be 4.2 cm taller than a plant with less bacteria.

For plants in full sun, however, the effect of Bacteria is 4.2 + 3.2*1 = 7.4. So for two plants in full sun, a plant with 1000 more bacteria/ml in the soil would be expected to be 7.4 cm taller than a plant with less bacteria.

But I just received the following question about this explanation.  I thought I’d respond here, in case I’m confusing other people as well.

The question was:

I was confused on how to interpret the interaction results. According to the post “For plants in full sun, however, the effect of Bacteria is 4.2 + 3.2*1 = 7.4.” I do not understand why the “sun” coefficient is not included, such that the effect of bacteria in full sun would be 9 + 4.2 + 3.2*1. Thanks for your help.

And here’s my answer:

Excellent question.  First of all, you would need to include the 9 (the coefficient for full sun) to calculate the predicted, or mean, height for plants in full sun at any specific value of Bacteria that you decided to plug in.

Because Sun is dummy-coded, that 9 (Sun’s coefficient) represents the difference in mean plant heights for plants in full sun compared to those in partial sun ONLY when Bacteria=0.
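To make that concrete, here is a small Python sketch that plugs values into the fitted equation from the post. The function names `predicted_height` and `sun_gap` are illustrative choices, and Bacteria is assumed to be measured in units of 1000 bacteria/ml, as in the original example.

```python
def predicted_height(bacteria, sun):
    """Predicted (mean) height in cm from the fitted model in the post.

    bacteria: soil bacteria count, assumed to be in units of 1000/ml
    sun:      1 for full sun, 0 for partial sun (dummy coding)
    """
    return 35 + 4.2 * bacteria + 9 * sun + 3.2 * bacteria * sun


def sun_gap(bacteria):
    # Full sun minus partial sun, at the same Bacteria value.
    return predicted_height(bacteria, 1) - predicted_height(bacteria, 0)


for b in (0, 1, 2):
    print(f"Bacteria={b}: full sun - partial sun = {sun_gap(b):.1f} cm")
```

The gap between the two groups is 9 cm only at Bacteria = 0. At any other value of Bacteria, it is 9 + 3.2*Bacteria.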

But to know the effect of Bacteria levels on plant height, you don’t need to know the differences in means.  The effect of a predictor variable, X,  in a regression model is how much Y differs, on average, for a one-unit difference in X.

In this example, it’s the increase (or decrease) in plant height for each incremental difference in soil bacteria count.

That’s the slope in a simple linear regression.

The interaction is telling you that this increase is not the same for plants in full and partial sun.

So the coefficient of Bacteria on its own is not enough to tell you the effect of Bacteria on plant height.  The coefficient of Bacteria is not an overall slope for Bacteria.
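One way to see where the 9 goes, as a short algebraic sketch using the fitted equation from the post, is to collect the terms that multiply Bacteria:

```latex
\begin{aligned}
\text{Height} &= 35 + 4.2\,\text{Bacteria} + 9\,\text{Sun} + 3.2\,\text{Bacteria}\cdot\text{Sun} \\
              &= (35 + 9\,\text{Sun}) + (4.2 + 3.2\,\text{Sun})\,\text{Bacteria}
\end{aligned}
```

The 9 ends up in the intercept, 35 + 9*Sun, which moves the starting point of the line for plants in full sun. Only 4.2 + 3.2*Sun multiplies Bacteria, so only that part is the slope.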

The effect of Bacteria is not constant. There are two different slopes (effects of Bacteria on height): 7.4 for full sun and 4.2 for partial sun.
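As a quick check, the sketch below computes each slope directly from the definition above: the difference in predicted height for a one-unit difference in Bacteria, holding Sun fixed. The helper names and the `at` parameter (the starting Bacteria value) are hypothetical, introduced only for illustration.

```python
def predicted_height(bacteria, sun):
    # Fitted model from the post; sun is 1 for full sun, 0 for partial sun.
    return 35 + 4.2 * bacteria + 9 * sun + 3.2 * bacteria * sun


def bacteria_slope(sun, at=0.0):
    # Change in predicted height for a one-unit difference in Bacteria,
    # comparing two plants with the same Sun value.
    return predicted_height(at + 1, sun) - predicted_height(at, sun)


for sun, label in ((0, "partial sun"), (1, "full sun")):
    print(f"{label}: {bacteria_slope(sun):.1f} cm per unit of Bacteria")
```

Because both plants in each comparison share the same Sun value, the 9*Sun term appears in both predictions and cancels in the subtraction. That is why it never shows up in the effect of Bacteria.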


Learn more about the ins and outs of interpreting regression coefficients in our new On Demand workshop: Interpreting (Even Tricky) Regression Coefficients.

{ 2 comments… read them below or add one }

ben March 28, 2017 at 4:21 pm

would it be correct to say that the effect of bacteria AND sun is 9+4.2+3.2*1? or what else would that sum mean?
thanks for any input,


Hendrik May 19, 2016 at 8:21 am

What if, instead of Bacteria, we had another dummy coded variable that interacted with Sun: Would this argument still hold?

