How to Get Standardized Regression Coefficients When Your Software Doesn’t Want To Give Them To You

by Karen Grace-Martin

Standardized regression coefficients remove the unit of measurement of predictor and outcome variables.  They are sometimes called betas, but I don’t like to use that term because there are too many other, and too many related, concepts that are also called beta.

There are many good reasons to report them:

  • They serve as standardized effect size statistics.
  • They allow you to compare the relative effects of predictors measured on different scales.
  • They make journal editors and committee members happy in fields where they are commonly reported.

In most software, regression procedures report standardized regression coefficients by default, or at least offer them as an easy option.

But there are times you need to use some procedure that won’t compute standardized coefficients for you.

Often it makes more sense to use a general linear model procedure to run regressions. But the GLM procedures in SAS and SPSS don't give standardized coefficients.

Likewise, you won’t get standardized regression coefficients reported after combining results from multiple imputation.

Luckily, there’s a way to get around it.

A standardized coefficient is the same as an unstandardized coefficient between two standardized variables. We often learn to standardize the coefficient itself because that’s the shortcut.  But implicitly, it’s the equivalence to the coefficient between standardized variables that gives a standardized coefficient meaning.
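
For reference, that shortcut is just this equivalence written out as algebra. For any predictor X with unstandardized coefficient b (a standard identity, not tied to any particular software):

    standardized coefficient = b × (SD of X / SD of Y)

where the SDs are the sample standard deviations of that predictor and of the outcome.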

So all you have to do to get standardized coefficients is standardize your predictors and your outcome.

How?

The Steps

Remember all those Z-scores you had to calculate in Intro Stats?  It wasn’t the useless exercise you thought it was at the time.

Converting a variable to a Z-score is standardizing.

In other words, do these steps for Y, your outcome variable, and every X, your predictors:

1. Calculate the mean and standard deviation.

2. Create a new standardized version of each variable.  To get it, create a new variable in which you subtract the mean from the original value, then divide that by the standard deviation.

3. Use those standardized versions in the regression.
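
To make those steps concrete, here is a minimal sketch in R (the data, the data frame name, and the variable names are all made up purely for illustration):

    # Hypothetical example data: outcome y, numeric predictors x1 and x2
    set.seed(42)
    dat <- data.frame(x1 = rnorm(100, mean = 50, sd = 10),
                      x2 = rnorm(100, mean = 5, sd = 2))
    dat$y <- 2 + 0.3 * dat$x1 - 1.5 * dat$x2 + rnorm(100)

    # Steps 1 and 2: subtract each variable's mean and divide by its standard deviation
    dat$zy  <- (dat$y  - mean(dat$y))  / sd(dat$y)
    dat$zx1 <- (dat$x1 - mean(dat$x1)) / sd(dat$x1)
    dat$zx2 <- (dat$x2 - mean(dat$x2)) / sd(dat$x2)

    # Step 3: run the regression on the standardized versions.
    # The slopes on zx1 and zx2 are the standardized coefficients.
    summary(lm(zy ~ zx1 + zx2, data = dat))

(In R, scale() does the centering and dividing by the standard deviation in one call, if you prefer.)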

Could this take a while?  Yup.

But if that’s what the journal requires you report, just do it.

A nice advantage is that you can apply it, at least partially, even in regression models that can't usually accommodate standardized regression coefficients.

For example, in a logistic regression it doesn’t make sense to standardize Y because it’s categorical.  But you can standardize all your Xs to get rid of their units.

You can then interpret your odds ratios in terms of one standard deviation increases in each X, rather than one-unit increases.
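
Here is the same idea as a minimal R sketch (again, the data and variable names are invented for illustration):

    # Hypothetical example data: a 0/1 outcome y01 and numeric predictors x1 and x2
    set.seed(7)
    dat <- data.frame(x1 = rnorm(200, mean = 50, sd = 10),
                      x2 = rnorm(200, mean = 5, sd = 2))
    dat$y01 <- rbinom(200, 1, plogis(-1 + 0.05 * dat$x1 - 0.4 * dat$x2))

    # Standardize only the predictors; leave the categorical outcome as it is
    dat$zx1 <- (dat$x1 - mean(dat$x1)) / sd(dat$x1)
    dat$zx2 <- (dat$x2 - mean(dat$x2)) / sd(dat$x2)

    fit <- glm(y01 ~ zx1 + zx2, data = dat, family = binomial)

    # Odds ratios per one-standard-deviation increase in each predictor
    exp(coef(fit))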



Comments

  1. Ariadni says

    March 5, 2022 at 6:23 pm

    Thank you very much for this! I am doing a linear regression with both categorical and quantitative variables as covariates. I was wondering if I standardize the dependent variable and the independent quantitative variables, should I leave the categorical ones as they are? Will the beta values produced be accurate this way?

    Thank you so much in advance,
    Ariadni

    • Karen Grace-Martin says

      March 8, 2022 at 12:09 pm

      Yes, that’s exactly how to do it. Standardize both Y and the numerical Xs.

  2. Alex says

    April 24, 2020 at 11:54 am

    Hi Karen, thank you for the helpful and concise explanation. However, I just want to point out that in Step 2, you state: “To get it, create a new variable in which you subtract the mean from the original value, then divide that by the standard error.” I think this should say standard deviation instead of standard error to avoid confusion, as they are different things.

    • Karen Grace-Martin says

      April 30, 2020 at 1:51 pm

      Hi Alex,

      Yes, you’re right. Thanks, I just fixed it.

  3. Izabel says

    March 6, 2020 at 8:10 am

    Hello Karen,
Thank you so much for this explanation. However, I am struggling to figure out how to interpret the coefficients of a negative binomial regression in terms of SD. I have normalized all my predictors, but not my output (a count variable). I would like to know what the interpretation of my betas would be in this case.
    Thank you so much
    Izabel

  4. Maoliang Ling says

    January 14, 2020 at 8:05 am

Actually, obtaining standardized coefficients by standardizing raw data does not make sense at all in cases where the dependent variable is ordinal or dichotomous. In these cases, the standardized coefficients are actually the changes in the SD of the latent values of y due to a 1 SD change in x.

    • Karen Grace-Martin says

      January 24, 2020 at 10:07 am

I agree. That’s why you’d only standardize the numeric ones, not the categorical.

  5. Renata says

    September 7, 2019 at 12:15 pm

    Hey Karen, thank you so much for your post! 🙂 I have a question that has been bugging me. I understand that I can use the Standardized Coefficients to evaluate the importance of predictor in my model. My question is, can I use the T-Statistic of that coefficient for that purpose too? I believe I can use the P-Value of the T-Statistic for that. But I couldn’t find if this is true for the t-value for a coefficient! Is the t-value linearly correlated to the standardized coefficient, and so, maybe could be used for the same conclusions? Thank you so much in advance!

    • philip says

      November 27, 2019 at 10:02 pm

Neither a test statistic nor a p-value tells you anything about the importance of a coefficient. It simply tells you the probability of finding the current outcome if your coefficient were null. However, it is quite possible that one of your coefficients is a much stronger predictor while having a higher p-value than one with a low p-value that only marginally contributes. While a higher t-statistic implies a lower p-value, neither is directly related to the coefficient estimate. A greater coefficient does not necessarily lead to a lower p-value; it all depends on the precision in your data!

  6. Tim says

    August 31, 2018 at 4:23 am

    Hi Karen,

    Thanks for the tip!
    In fact, one reviewer has requested this.
    Is there a citation for this? Can I cite your book?

    Tim

    • Karen Grace-Martin says

      September 12, 2018 at 12:29 pm

      Hi Tim,
I don’t think that’s in the book. You can cite this page. I think there are specific ways to cite web pages, but it depends on your publication manual.

  7. Cinzia says

    January 11, 2018 at 12:26 pm

Hi, I need to perform a meta-analysis using the effect sizes of a continuous outcome from different cohort studies. The outcome can be on different scales, so I need to estimate a pooled effect size (SMD). So, from each study I perform a multivariate linear regression model, and I am thinking of taking out the standardized effect size (standardized coefficient) for a specific group variable included in the model and the relative standard error (SE). Then I pool the standardized coefficients and the relative SEs across the studies, so the pooled estimate will again be an SMD. Is it correct to use the SE of the standardized coefficient obtained from the regression?
    Thanks.
    Cinzia

  8. Lal says

    October 7, 2016 at 1:35 pm

    KAREN mam/sir, You greatly explained manual calculation of Standardized Regression Coefficients and I cross checked it. Hats off to you

  9. Rick Hass says

    July 22, 2016 at 11:37 am

Greg is right. However, I just compared intercept vs. no-intercept models in R with small sample sizes. These were linear models, 2 and 3 predictors for an outcome, with N’s between 20 and 43. Coefficient estimates, R-squared, and F were all very, very close between models. Not sure if this implies always omitting the intercept when using z-scored predictors and outcomes, or leaving it in, but just thought I’d give a simulation update.

  10. Antonia says

    July 21, 2016 at 6:32 am

    Hi Karen,
    Great explanation, thank you.
    I am regularly using log-transformed variables (they are latencies, which usually need log-transformations to account for normality). Would you standardize (or center) the log-transformed or the original variable; and if it’s the original variable, once it is standardized (or centered), would you log-transform it? Thank you!

  11. Shariq says

    July 16, 2016 at 2:38 pm

    Hi,

I am working on time series data, i.e. stock market indexes. I extracted the first principal component from 3 major stock market indexes. Before doing the PCA I had to standardize the data of all 3 indexes, which I did. My question is: the values I get for the first principal component will, as I understand it, also be standardized? And secondly, when I use the first principal component in the regression as the independent variable, do I have to standardize the data of my dependent variable (also another stock market index) before I can run the regression?

  12. Greg Kochanski says

    June 13, 2015 at 10:54 am

    This is not a strictly correct recipe.
    The reason it is incorrect is that if you have N data and you convert them to z-scores, you spend one degree of freedom in computing the mean and another degree of freedom in computing the variance.

So, you end up with N-2 degrees of freedom spread amongst N data points. What that means is that your N data points are not independent of each other. That should be clear if you imagine you have all but one data point; you can compute the last one because you know that the z-scores sum to zero.

    So, you have N-2 degrees of freedom, but if you follow this recipe, you don’t tell that to the statistical software. It is written to assume that your data are independent, and it therefore assumes you have N degrees of freedom. As a result, it will get the wrong answer.

If your model only uses up a small number of the degrees of freedom in the data, then you may be in good shape. For instance, if you have 1000 data and your model uses 200 degrees of freedom, it’s not going to make much difference. Having 800 or 798 degrees of freedom after building your model will change your computed significance levels by only a small amount, and your results should remain (reasonably) valid.

    But if you are building a very complex model that eats up most of the degrees of freedom of your data, this can get you bad results. If you start with (for example) 20 data, and your model has 15 free parameters, then, with z-scores you really have only 18 degrees of freedom in your data, rather than 20 before the model. After the model is built, there are only 3 (rather than 5) degrees of freedom left to estimate variances and significance levels. In that case, the difference between 3 d.o.f. and 5 d.o.f. will cause you to substantially over-estimate the significance of your conclusions.

    You can reduce the problem by half if you remove the “intercept” term from your model (as it should always be zero, if you build your model on z-scores). But you will still be off by one degree of freedom (because you normalized the standard deviation of the data) and there’s no easy way to work around that.

  13. Amber says

    January 22, 2015 at 4:28 pm

Thanks so much for your work. It has saved me numerous times. My question is how to get standardized standard errors to report with the standardized coefficients. I need to report standardized direct, indirect and total effects with associated standard errors (effects decomposition). While I can get the standardized effects from Stata, Std. Err. are only generated in relation to the unstandardized coefficients. Is there a way I can standardize an unstandardized standard error? Any help would be greatly appreciated!!

  14. Greg says

    December 18, 2014 at 4:09 am

I need to go the opposite direction. I have standardized scores but need to construct the corresponding prediction equation using raw variable readings.

  15. jeff says

    November 9, 2014 at 6:11 am

Do you know if the coefficients output by the Excel Analysis ToolPak are standardised or unstandardised?

  16. Robin says

    September 4, 2014 at 11:27 am

    One quick note about logit models. You correctly point out you shouldn’t standardize a dichotomous variable (I would probably argue not to standardize ordinal or categorical variables as well, as standardization implies continuous), and that you can standardize the X variables going into the model. Keep in mind that logit models are actually already standardized. See Williams (2009): http://www3.nd.edu/~rwilliam/oglm/RW_Hetero_Choice.pdf

    … in logit and probit models, coefficients are inherently standardized. Rather than standardizing by rescaling all variables to have a variance of one, as in OLS, the standardization is accomplished by scaling the variables and residuals so that the residual variances are either one (as in probit) or π^2/3 (as in logit). If residual variances differ across groups, the standardization will also differ, making comparisons of coefficients across groups inappropriate.

    Logit models can be very tricky to interpret when thinking about omitted variable bias (even if they are uncorrelated with your other independent variables) and when comparing across groups or samples.

  17. fernanda says

    August 23, 2014 at 11:58 am

    Hi, should I use standardized variables for linear mixed effect models?
    Thanks

  18. Delano says

    November 3, 2013 at 8:56 pm

    Okay… so I just read how to calculate z-scores, which I understand completely. I guess my confusion is when you say “In other words, do these steps for Y, your outcome variable, and every X, your predictors”. What do you mean by outcome variable and predictors? Outcome variable as in the dependent variable? Predictors as in the independent variables/factors/predictors? In this case, I don’t understand how to calculate the z-scores for the predictors unless they’re numeric. In my case, my predictors are discrete.

    • Karen says

      November 8, 2013 at 11:43 am

      Hi Delano,

Mathematically, you can still do it with dummy-coded predictor variables. The interpretation doesn’t make much sense, though, and therefore it’s usually better to just keep those coded 0/1. Standardized coefficients aren’t really meaningful for categorical predictors.

  19. Delano says

    November 3, 2013 at 8:31 pm

Hi, I’m kind of late with this, but this is a great post! I have been wondering how I could determine the relative effects of my predictors. However, I am not understanding the process. I get lost at step 2 when you say subtract the mean from the original value. What original value? Could you show the process using actual numbers? Thanks!!!

  20. tlyn says

    August 12, 2013 at 8:27 pm

Thank you for the post. However, if you have used the Multiple Imputation method, SPSS will not produce the standardised beta weights, but ALSO it won’t produce SDs for the pooled data… what is one to do in this situation? Many thanks!

    • Karen says

      September 5, 2013 at 4:48 pm

      Hi Tamlyn, just standardize all Xs and Y BEFORE doing the multiple imputation.

      • Emily Bilek says

        July 30, 2014 at 1:32 pm

        I need to calculate scale scores after I complete the MI (e.g. total anxiety score), but then this becomes a predictor in my regression. This means I can’t standardize the variable prior to running MI, so I am still struggling with how to find the pooled SDs.

  21. Aziz says

    May 14, 2013 at 2:34 am

    Thanks for the above,
    I was going to book an hour of consultation for this.
    Your site and the workshops are really amazing.

    Best wishes

    • Karen says

      May 14, 2013 at 11:05 am

      Thanks, Aziz!

