When is it important to use adjusted R-squared instead of R-squared?
R², the Coefficient of Determination, is one of the most useful and intuitive statistics we have in linear regression.
It tells you how well the model predicts the outcome and has some nice properties. But it also has one big drawback.
R²’s nice properties
First, it’s standardized. Every R² is on the scale of 0 to 1. The advantage is that we can look at its actual value to get an idea of how well the model is doing. Your model has an R² of .7? That’s pretty good. An R² of .08? Not so great.
Of course, different fields can expect and interpret different values of R² as being high or low. Actually, an R² of .7 might not be great for every field or every data set. But once you’re used to the types of values you get in your field, you can evaluate your model on its own, without worrying about the units of the variables contained in it.
Second, it’s intuitive. That standardized scale of 0 to 1 represents the proportion of variation in our response variable, Y, that is attributable to the predictors in the model. The more related those predictors collectively are to the response variable, the higher R² will be.
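To make the "proportion of variation" idea concrete, here is a minimal sketch that computes R² from observed and predicted values as 1 minus the ratio of the residual sum of squares to the total sum of squares. The function name r_squared and the toy numbers are made up for illustration, and the 0-to-1 reading assumes an ordinary least squares fit with an intercept.

```python
import numpy as np

def r_squared(y, y_hat):
    """R² = 1 - SS_res / SS_tot (hypothetical helper, for illustration only)."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)     # variation left unexplained by the model
    ss_tot = np.sum((y - y.mean()) ** 2)  # total variation in Y around its mean
    return 1.0 - ss_res / ss_tot

y = [2.0, 4.0, 6.0, 8.0]
print(r_squared(y, y))          # perfect predictions: all variation explained
print(r_squared(y, [5.0] * 4))  # always predicting the mean: none explained
```

Perfect predictions give an R² of 1, and a model that does no better than predicting the mean of Y gives an R² of 0.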
Third, you can use it as a measure of effect size for the model as a whole. This makes it particularly useful in sample size calculations.
R²’s big drawback
It does have one big drawback, though. In multiple regression, as you add predictors, R² will get bigger, or at best stay the same. Because of the way it’s calculated, it can never go down when you add predictors. That raises a few issues.
First, R² will go up even if those predictors don’t help predict Y. Sure, it won’t go up a lot, but it will gradually get bigger with more predictors.
And model complexity isn’t a good thing. If we’re going to add more predictors, we want to make sure they’re helpful.
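The sketch below illustrates this behavior on simulated data. It fits ordinary least squares with NumPy, then keeps adding columns of pure random noise as extra predictors and records R² each time. The data, the seed, and the helper name fit_r_squared are all assumptions chosen for the example, not anything from a real study.

```python
import numpy as np

def fit_r_squared(X, y):
    """Fit OLS with an intercept by least squares and return R² (illustrative helper)."""
    X1 = np.column_stack([np.ones(len(y)), X])     # prepend an intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)  # least-squares coefficients
    resid = y - X1 @ beta
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)            # one predictor that really is related to y
y = 2.0 * x + rng.normal(size=n)
noise = rng.normal(size=(n, 10))  # ten predictors unrelated to y

r2_values = []
for k in range(11):               # use the real predictor plus the first k noise columns
    X = np.column_stack([x, noise[:, :k]])
    r2_values.append(fit_r_squared(X, y))

for k, r2 in enumerate(r2_values):
    print(f"{k:2d} noise predictors: R² = {r2:.4f}")
```

Each added noise column can only reduce (or leave unchanged) the residual sum of squares, so the printed R² values never go down even though none of the extra predictors carries real information about Y.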
The advantage of Adjusted R-squared
Luckily, there is an alternative: Adjusted R².
Adjusted R² does just what it says: it adjusts the R² value. This adjustment is a penalty that is subtracted from R². The size of the penalty is based on the number of predictors and the sample size, scaled by how much variation the model leaves unexplained.
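For reference, the usual textbook definition can be rearranged to show the penalty explicitly. The symbols here are notation introduced for this sketch: n is the sample size, p is the number of predictors (not counting the intercept), and the bar marks the adjusted value.

```latex
\bar{R}^2 \;=\; 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}
          \;=\; R^2 \;-\; \underbrace{(1 - R^2)\,\frac{p}{n - p - 1}}_{\text{penalty}}
```

Written this way, the penalty grows as p increases and shrinks as n increases, which matches the description above.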
If you add a predictor that is useful in predicting Y, the adjusted R² will increase because the penalty will be smaller than the R² increase.
But if you add a predictor that is not useful in predicting Y, the adjusted R² will decrease because the penalty will outweigh the small increase in R².
In fact, while R² cannot be below 0, adjusted R² can. So it’s a super-useful way to tell if adding predictors to a model is adding useless complexity.
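A small worked example may help here. It assumes the standard formula, adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), with n observations and p predictors. The R² values plugged in are made-up numbers for illustration, not results from any real model.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R² for a model with p predictors (plus an intercept) fit to n observations."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

n = 30
base = adjusted_r_squared(0.500, n, p=2)     # starting model, about 0.463
useful = adjusted_r_squared(0.600, n, p=3)   # third predictor raises R² a lot, about 0.554
useless = adjusted_r_squared(0.502, n, p=3)  # third predictor barely raises R², about 0.445

# A weak model with a small sample: R² is positive, adjusted R² is not.
weak = adjusted_r_squared(0.02, n=20, p=3)   # about -0.164

print(f"base={base:.3f} useful={useful:.3f} useless={useless:.3f} weak={weak:.3f}")
```

In this sketch the useful predictor pushes adjusted R² above the starting model, the useless one pulls it below, and the weak model ends up with a negative value.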
So in multiple regression, always use Adjusted R².