Centering and Standardizing Predictors

by Karen Grace-Martin

I was recently asked whether centering a predictor variable in a regression model (subtracting the mean) has the same effect as standardizing it (converting it to a Z score). My response:

They are similar but not the same.

In centering, you are changing the values but not the scale. So a predictor that is centered at the mean has new values: the entire scale has shifted so that the mean now has a value of 0, but one unit is still one unit. The intercept will change, but the regression coefficient for that variable will not. Since the regression coefficient is interpreted as the effect on the mean of Y for each one-unit difference in X, it doesn't change when X is centered.
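
To see this with numbers, here is a minimal Python sketch. The data (years of experience and salary in thousands) and the helper `fit_line` are made up for illustration and are not from any real study. It fits a one-predictor least-squares regression twice, once on the raw X and once on mean-centered X.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data: years of experience (x), salary in $1000s (y).
x = [1, 3, 4, 6, 8, 9, 11]
y = [32, 38, 41, 45, 52, 55, 60]

# Centering: subtract the mean of x from every value.
x_centered = [xi - mean(x) for xi in x]

raw_intercept, raw_slope = fit_line(x, y)
cen_intercept, cen_slope = fit_line(x_centered, y)

print("raw:     ", raw_intercept, raw_slope)
print("centered:", cen_intercept, cen_slope)
```

The two slopes agree, while the intercept moves from the predicted Y at X = 0 to the predicted Y at the mean of X, which in a one-predictor model is simply the mean of Y.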

And incidentally, despite the name, you don't have to center at the mean. It is often convenient, but there can be advantages to choosing a more meaningful value that is also toward the center of the scale.
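
As a rough illustration of centering somewhere other than the mean, the sketch below subtracts a fixed reference value instead. The data, the helper `fit_line`, and the choice of 5 years as the "meaningful" value are all assumptions made for the example.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data: years of experience (x), salary in $1000s (y).
x = [1, 3, 4, 6, 8, 9, 11]
y = [32, 38, 41, 45, 52, 55, 60]

REFERENCE = 5  # hypothetical meaningful value, e.g. "5 years of experience"
x_shifted = [xi - REFERENCE for xi in x]

raw_intercept, raw_slope = fit_line(x, y)
ref_intercept, ref_slope = fit_line(x_shifted, y)

# Prediction from the raw model at the reference value.
predicted_at_ref = raw_intercept + raw_slope * REFERENCE

print("slope (raw, shifted):", raw_slope, ref_slope)
print("intercept after shifting:", ref_intercept)
print("raw-model prediction at reference:", predicted_at_ref)
```

The slope is again unchanged, and the new intercept is the predicted Y at the reference value, which is what makes a well-chosen centering point easy to interpret.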

But a Z-score also changes the scale. A one-unit difference now means a one-standard-deviation difference, so you will interpret the coefficient differently. This is usually done so you can compare coefficients for predictors that were measured on different scales. I can't think of an advantage for doing this for an interaction.
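
The change of scale can be sketched the same way. The data and the helper `fit_line` are again invented for illustration, and the sample standard deviation (`statistics.stdev`) is an assumed choice; the population version would work just as well.

```python
from statistics import mean, stdev

def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data: years of experience (x), salary in $1000s (y).
x = [1, 3, 4, 6, 8, 9, 11]
y = [32, 38, 41, 45, 52, 55, 60]

# Standardizing: subtract the mean, then divide by the standard deviation.
x_bar, sd_x = mean(x), stdev(x)
z = [(xi - x_bar) / sd_x for xi in x]

raw_intercept, raw_slope = fit_line(x, y)
z_intercept, z_slope = fit_line(z, y)

print("raw slope (per one unit of x):", raw_slope)
print("z slope (per one SD of x):    ", z_slope)
print("raw slope * SD of x:          ", raw_slope * sd_x)
```

The standardized slope equals the raw slope multiplied by the standard deviation of X, so it describes the same fitted line in different units. The intercept matches the centered case because standardizing includes centering.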

4 comments


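Why would standardizing remove collinearity? Just a small example in Python:
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

a = np.array([1, 2, 3, 4, 5, 6, 7, 3, 193, 2, 5])
df_nonstd = pd.DataFrame({'a': a, 'b': 10 * a})
# standardization
scaler = StandardScaler()
df_std = pd.DataFrame(scaler.fit_transform(df_nonstd), columns=df_nonstd.columns)
print(df_nonstd.corr().loc['a', 'b'], df_std.corr().loc['a', 'b'])  # both are 1, up to rounding

The correlation stays the same regardless of standardization.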



After applying mean centering, normalization, and standardization to scale the attributes, how can I compare their effects, please? (I am using R.)



What about centring a variable in a mixed model? I want to look at pig weight, in relation to the pen mean weight, so I pen-mean centred week 7 weight, week 10 weight and week 20 weight. This also had the benefit of reducing collinearity. I have been told I should have standardised the scores instead. Is this right/wrong?



Standardizing would have also removed the collinearity, like centering did. However, standardizing would also make the coefficients more interpretable. In essence, centering is part of the process of standardizing.

