When NOT to Center a Predictor Variable in Regression

by Karen Grace-Martin

There are two reasons to center predictor variables in any type of regression analysis–linear, logistic, multilevel, etc.

1. To lessen the correlation between a multiplicative term (interaction or polynomial term) and its component variables (the ones that were multiplied).

2. To make interpretation of parameter estimates easier.

I was recently asked: when is centering NOT a good idea?

Well, basically when it doesn’t help.

For reason #1, it will only help if you have multiplicative terms in a model. If you don’t have any multiplicative terms–no interactions or polynomials–centering isn’t going to help.
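
Here is a minimal sketch of reason #1, assuming NumPy and simulated data (the variables `x` and `z` are made up for illustration, not taken from any real study). It compares the correlation between a predictor and its square before and after mean-centering, and then checks that centering does nothing to the correlation between two distinct predictors. The near-zero centered correlation depends on `x` being roughly symmetric; with a skewed predictor the drop would be smaller.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated, purely illustrative predictors (not from any real data set).
x = rng.uniform(20, 60, size=1_000)          # an age-like variable far from 0
z = 0.5 * x + rng.normal(0, 5, size=1_000)   # a second, correlated predictor


def corr(a, b):
    """Pearson correlation between two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1]


# Mean-center both predictors.
xc = x - x.mean()
zc = z - z.mean()

# Multiplicative term: centering lowers its correlation with its component.
raw_poly = corr(x, x**2)
centered_poly = corr(xc, xc**2)

# Two distinct predictors: centering leaves their correlation unchanged.
raw_pair = corr(x, z)
centered_pair = corr(xc, zc)

print(f"corr(x, x^2)  raw: {raw_poly:.3f}   centered: {centered_poly:.3f}")
print(f"corr(x, z)    raw: {raw_pair:.3f}   centered: {centered_pair:.3f}")
```

The second comparison is the reason centering cannot lower a high VIF among predictors when the model has no interaction or polynomial terms: subtracting a constant does not change how two variables covary.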

For reason #2, centering especially helps interpretation of parameter estimates (coefficients) when:

a) you have an interaction in the model,

b) particularly if that interaction includes a continuous and a dummy-coded categorical variable, and

c) the continuous variable does not have a meaningful value of 0, or

d) 0 is a real value, but another value, such as a threshold point, is more meaningful. (For example, if you’re doing a study on the amount of time parents work, with a predictor of Age of Youngest Child, an Age of 0 is meaningful and will be in the data set, but centering at 5, when kids enter school, might be more meaningful.)
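
To make reason #2 concrete, here is a rough sketch using simulated data loosely modeled on the Age of Youngest Child example. Everything in it is an assumption made for illustration: the dummy-coded `group` variable, the coefficients used to generate `hours`, and the helper `fit`. It fits the same interaction model twice, once with raw age and once with age centered at 5, using ordinary least squares from NumPy.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Simulated, purely illustrative data (not from any real study).
age = rng.uniform(0, 17, size=n)      # age of youngest child, in years
group = rng.integers(0, 2, size=n)    # hypothetical dummy-coded 0/1 predictor
hours = (20 + 1.0 * age + 5.0 * group - 0.5 * age * group
         + rng.normal(0, 2, size=n))


def fit(age_var):
    """OLS fit of hours on [1, age_var, group, age_var * group]."""
    X = np.column_stack([np.ones(n), age_var, group, age_var * group])
    coefs, *_ = np.linalg.lstsq(X, hours, rcond=None)
    return coefs  # intercept, age slope, group coefficient, interaction


b_raw = fit(age)       # age measured from 0
b_c5 = fit(age - 5)    # age centered at 5 (school entry)

names = ["intercept", "age", "group", "age x group"]
for name, raw, centered in zip(names, b_raw, b_c5):
    print(f"{name:12s} raw: {raw:8.3f}   centered at 5: {centered:8.3f}")
```

In this sketch the age slope and the interaction coefficient come out identical in both fits. What moves is the intercept and the `group` coefficient: with raw age, the `group` coefficient is the group difference when the youngest child is 0; with age centered at 5, it is the group difference when the youngest child is 5.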

So when NOT to center:

1. If all continuous predictors have a meaningful value of 0.

2. If you have no interaction terms involving that predictor.

3. And if there are no other values that are particularly meaningful.

Comments

Raza says

August 15, 2020 at 11:50 pm

I have panel data, and there is an issue of multicollinearity: high VIF values.

1- I don’t have any interaction terms or dummy variables
2- I just want to reduce the multicollinearity and improve the coefficients

Would it be helpful to center all of my explanatory variables, just to resolve the issue of multicollinearity (huge VIF values)?

Thank you

Reply
Michiel says

June 14, 2019 at 5:29 pm

Dear Karen,

Is it necessary to create mean-centered variables for the dummy variables when you are creating interactions between two dummy variables?

Kind regards,
Michiel

Reply
- Karen Grace-Martin says
  
  August 22, 2019 at 1:39 pm
  
  No.
  
  Reply
Pascal says

May 1, 2019 at 9:53 am

Hello and thank you for your explanation.
There is still something that I don’t understand about centering in interactions, though. Let’s take the Bacteria (B) and Sun (S) example, assuming they are continuous variables with no possible 0 values. If we want to introduce an interaction in a regression, it is recommended to mean-center both variables. But then low values of B and S will become negative once centered, and therefore their product will become positive. In other words, lowB-lowS will have the same impact as highB-highS. What am I missing here? Thanks

Reply
- Karen Grace-Martin says
  
  May 9, 2019 at 9:52 am
  
  Hi Pascal,
  
  Since Sun is categorical, we wouldn’t center it. But even if you have two numerical predictors and center both, it doesn’t mean that lowB-lowS has the same *mean* as highB-highS. The interaction coefficient will not change if both predictors are centered. The interaction always measures the *change* in the effect (aka slope) of one variable for each one-unit change in the other.
  
  Reply
Rawiyah says

February 27, 2019 at 2:02 pm

Thank you so much!

Reply
Scott Stanley says

August 6, 2018 at 5:43 pm

For those who might be interested (and this is not dealing with the complexity of multilevel models for questions about centering), Hayes (2017) has a great section (9.1) starting on page 304 about the impact of centering predictors when you are testing moderation (i.e., when you have an interaction term in a regression equation), which is an example of when KGM says above it may be useful. He notes that centering will not change anything about testing the interaction term itself. It will only change what happens with the two variables that go into the product. So, assume variable X and variable W, and an interaction XW (W = moderator in Hayes’s notation): centering X and W will not impact the test or interpretation of the term for XW. It will change what you get for the CONDITIONAL results for X and W, however. Centering changes the interpretation of the conditional betas from the change in Y for a 1-unit change in X among those with a value of 0 (zero) on W to the change in Y for a 1-unit change in X among those at the mean of W.
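
The algebra behind this can be sketched in a few lines. The notation here is an assumption for illustration and is not taken from Hayes: $b_0, \dots, b_3$ are the coefficients of the uncentered model, $X_c = X - \bar{X}$ and $W_c = W - \bar{W}$ are the mean-centered predictors.

```latex
\begin{align*}
Y &= b_0 + b_1 X + b_2 W + b_3 XW \\
  &= b_0 + b_1 (X_c + \bar{X}) + b_2 (W_c + \bar{W})
     + b_3 (X_c + \bar{X})(W_c + \bar{W}) \\
  &= \underbrace{(b_0 + b_1\bar{X} + b_2\bar{W} + b_3\bar{X}\bar{W})}_{\text{new intercept}}
   + \underbrace{(b_1 + b_3\bar{W})}_{\text{effect of } X \text{ at } W = \bar{W}} X_c
   + \underbrace{(b_2 + b_3\bar{X})}_{\text{effect of } W \text{ at } X = \bar{X}} W_c
   + b_3 X_c W_c
\end{align*}
```

The coefficient on the product term is $b_3$ in both forms, so its estimate and test are unchanged, while the coefficients on $X_c$ and $W_c$ are the conditional effects evaluated at the mean of the other variable rather than at 0. It also shows why a low-low case and a high-high case do not get the same prediction even though $X_c W_c$ is positive for both: the $X_c$ and $W_c$ terms have opposite signs in the two cases.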

I highly recommend that book, as well as its treatment of this question in the simpler, non-MLM cases. He also notes, consistent with what KGM says above, that centering is only of much use (at least in non-MLM settings) if there is a multiplicative term or an interpretational issue, and apparently not because it changes the interaction test but because centering can make nonsensical conditional effects (e.g., when one variable cannot be zero in the real world) more interpretable. He says centering does indeed reduce the collinearity between X and XW, for example, but that this collinearity is not really an issue when interpreting the finding for XW in the model, which, of course, is the whole point of the moderation test. However, he notes that centering may still be useful if the VIF for XW is so high that your software will not run the model, even though the collinearity itself is not a problem for XW.

As an aside, Hayes takes a dim view of people messing much with interpreting the conditional effects when you have an interaction term, in any case, because people often misconstrue them as main effects.

Reply
Karthik Srinivasan says

October 16, 2016 at 3:25 am

Read this article: http://psycnet.apa.org/journals/met/12/2/121/. It answers all your questions.

Reply
Luis says

June 8, 2016 at 6:03 am

Dear Karen, some claims you make in this article are not true. Please see http://orm.sagepub.com/content/15/3/339.abstract for more information.

Reply
Lovinator says

June 7, 2016 at 4:26 am

Thank you, Thank you. This was just what I needed.

Reply
Steve says

January 22, 2016 at 5:45 pm

Is it always necessary to center variables when using multilevel analysis (especially when it is a logit)? Might one be able to not center (especially when it seems to change the significance of relevant variables)? Thanks

Reply
- Oliver says
  
  March 2, 2016 at 3:02 pm
  
  Hi Steve,
  
  Like you, I also had some multilevel models in which Level 2 predictors became non-significant once they were (grand-mean) centered. People keep saying that centering only changes the intercept value, but that’s not true. In my experience, a Level 2 predictor that is initially significant may no longer be significant after being centered. This is frustrating, especially when you’re not interested in interpreting the meaning of the intercept…
  
  Any idea, Karen?
  
  Reply
  - SAM says
    
    May 1, 2016 at 5:22 pm
    
    http://www.ncbi.nlm.nih.gov/pubmed/16394187
    
    Reply
    - SAM says
      
      May 1, 2016 at 5:24 pm
      
      The article is open source via Google scholar.
      
      Reply
Antenor says

September 10, 2015 at 5:59 pm

Hello Karen,

Good explanation, it was helpful to me. I was just wondering if you have a reference for these statements, some paper I can cite in scientific papers.

Reply
Elise says

May 28, 2015 at 9:52 pm

Does centering a variable change how you interpret the results? For example, if a Beta is positive or negative after a variable is centered?

Reply
- Karen says
  
  June 3, 2015 at 9:12 am
  
  Centering a variable won’t change its own coefficient.
  
  It will change the intercept, which may or may not be meaningful.
  
  It can also change other coefficients if the centered variable is involved in an interaction.
  
  Reply
  - Lauren says
    
    June 25, 2018 at 11:49 am
    
    I centered my independent variables to reduce collinearity and some of my variables went from being significant before centering to not significant after. The variables are all involved in interactions, so your last statement caught my eye. Can you recommend any resources for me to follow up on centering and interactions?
    
    Reply
    - Karen Grace-Martin says
      
      October 12, 2018 at 11:17 am
      
      Hi Lauren,
      
      Sure. I would start here: https://www.theanalysisfactor.com/interpreting-interactions-in-regression/
      
      Reply
Yan says

January 11, 2015 at 10:09 pm

Should we center a binary variable if we have an interaction between a binary variable and a continuous variable?

Reply
- Kerstin says
  
  November 28, 2018 at 1:33 pm
  
  I would love to know the answer to this as well.
  
  Reply
- Karen Grace-Martin says
  
  November 28, 2018 at 3:51 pm
  
  Hi Yan,
  
  That question has a very complicated answer. Most of the time, though, binary variables are dummy coded. If they are, then they have a specific meaning that works well in interactions. So you can change that coding to something that resembles centering for very specific reasons. But most of the time they are left as is.
  
  Reply
