But in SPSS, the GLM and Regression procedures each have options that the other lacks. How do you decide when to use GLM and when to use Regression?
GLM has these options that Regression doesn’t:
1. It will dummy code categorical variables for you. If you have only one or two binary categorical variables, this isn’t a huge advantage. But if you have several, and many of them are multi-category, this is a big advantage, both as a time saver, and for getting an overall p-value for the variable as a whole.
2. You can add in interactions. In Regression, you have to create each interaction as a separate variable. Once again, this can become very tedious, especially if those interactions contain dummy variables.
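To make concrete what GLM saves you from doing by hand, here is a minimal Python sketch (not SPSS syntax) of the columns you would have to create yourself before running Regression. The function names, the variable names `region` and `income`, and the data are all hypothetical. Treating the last sorted category as the reference level is an assumption chosen for illustration.

```python
def dummy_code(values, reference=None):
    """Return (names, rows) of 0/1 indicator columns, omitting the reference level."""
    levels = sorted(set(values))
    if reference is None:
        # Assumption for this sketch: the last sorted level is the reference.
        reference = levels[-1]
    kept = [level for level in levels if level != reference]
    names = [f"is_{level}" for level in kept]
    rows = [[1 if value == level else 0 for level in kept] for value in values]
    return names, rows


def interaction_columns(dummy_rows, covariate):
    """Multiply every dummy column by a continuous covariate, row by row."""
    return [[d * x for d in row] for row, x in zip(dummy_rows, covariate)]


# Hypothetical data: one three-category predictor and one continuous predictor.
region = ["north", "south", "west", "south", "north"]
income = [30.0, 45.0, 52.0, 38.0, 41.0]

names, dummies = dummy_code(region)               # two dummies for three categories
products = interaction_columns(dummies, income)   # two more columns for region*income

for row, prod in zip(dummies, products):
    print(row, prod)
```

A three-category predictor already needs two dummies plus two product terms for a single interaction, and each of those columns gets its own p-value rather than one overall test for the variable. Passing a different `reference` changes which category the coefficients are compared against.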
Regression has these options that GLM doesn’t:
1. It automatically gives standardized regression coefficients.
2. It will run stepwise procedures.
3. It will do multicollinearity diagnostics.
These are mainly an advantage when your model is exploratory in nature and contains only continuous variables. Of these three options, only the third is really useful when you are testing specific hypotheses that involve interactions and categorical predictors.
Remember, you can’t use standardized coefficients on dummy variables anyway (well, SPSS will let you, but they don’t mean anything). And the stepwise procedures are only useful with truly exploratory analyses, and even then you need to be able to test the models on another data set.
So my approach is generally to use GLM for my regression analysis, then rerun the model in Regression if I see a reason to be concerned about multicollinearity.
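For a sense of what a multicollinearity diagnostic measures, here is a minimal Python sketch of tolerance and the variance inflation factor (VIF) for the simplest case of exactly two predictors, where tolerance is 1 minus the squared correlation between them. The function names and data are hypothetical. With more than two predictors, each predictor's tolerance comes from regressing it on all the others, which this sketch does not do.

```python
def pearson_r(a, b):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    s_aa = sum((x - mean_a) ** 2 for x in a)
    s_bb = sum((y - mean_b) ** 2 for y in b)
    return s_ab / (s_aa * s_bb) ** 0.5


def tolerance_and_vif(x1, x2):
    """Two-predictor case only: tolerance = 1 - r^2 and VIF = 1 / tolerance."""
    r_squared = pearson_r(x1, x2) ** 2
    tolerance = 1.0 - r_squared
    return tolerance, 1.0 / tolerance


# Hypothetical predictors that are fairly strongly correlated.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]

tolerance, vif = tolerance_and_vif(x1, x2)
print(f"tolerance={tolerance:.3f}, VIF={vif:.3f}")
```

Low tolerance (equivalently, high VIF) is the signal that would prompt rerunning the model in Regression to look more closely.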
Edited to add:
A number of commenters below are wondering why the results aren’t matching between SPSS’s GLM and Linear Regression.
They will match if:
- You’re comparing apples to apples. Both procedures will give you a table of F statistics and can give a table of regression coefficients along with p-values, but they are labeled differently, look different, and don’t all appear by default. Make sure you’re not trying to compare p-values from regression coefficients in one to the p-values from the F table in the other. GLM doesn’t give you the regression coefficients by default. You have to ask for them: in GLM they’re called “Parameter Estimates,” and you request them under the Options button.
- When you dummy code your variables yourself in Regression, you’re matching GLM’s default coding. If you have them backwards, everything will look different. See these for more info: