I would love to promise that the reason there is so much confusing terminology in statistics is NOT because statisticians like to laugh at hapless users of statistics as they try to figure out already confusing concepts. See my post on the different meanings of the term “level” in statistics. (There are other examples: how many different meanings does “beta” have in statistics? I can think of three off the top of my head. That will have to be another post.)
But today I’m talking about the difference between multivariate and multiple, as they relate to regression.
A regression analysis with one dependent variable and 8 independent variables is NOT a multivariate regression. It’s a multiple regression. “Multivariate” ALWAYS refers to the dependent variable, not to the number of predictors.
So when you’re in SPSS, choose univariate GLM for this model, not multivariate.
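If it helps to see that outside of SPSS, here is a minimal sketch in Python with NumPy (my choice for illustration, not anything SPSS does internally). The data are simulated, and the names `X`, `y`, `true_beta`, and the sample size are all made up for the example. The thing to notice is the shape of the response: eight predictors, but only one column of Y.

```python
import numpy as np

rng = np.random.default_rng(0)   # simulated data, for illustration only
n, p = 200, 8                    # 200 cases, 8 independent variables

X = rng.normal(size=(n, p))                  # the 8 predictors
X1 = np.column_stack([np.ones(n), X])        # add an intercept column
true_beta = np.arange(p + 1, dtype=float)    # made-up coefficients
y = X1 @ true_beta + rng.normal(size=n)      # ONE dependent variable

# Ordinary least squares: an intercept plus one slope per predictor
beta_hat, *_ = np.linalg.lstsq(X1, y, rcond=None)

print(y.shape)         # (200,) -> a single response: univariate
print(beta_hat.shape)  # (9,)   -> intercept + 8 slopes: multiple regression
```

Many predictors make it multiple. One response keeps it univariate.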
I know what you’re thinking–but what about multivariate analyses like cluster analysis and factor analysis, where there is no dependent variable, per se?
Well, I respond, it’s not really about dependency. It’s about which variable’s variance is being analyzed. A regression model is really about the dependent variable. We’re just using the predictors to model the mean and the variation in the dependent variable.
Note: this is actually a situation where the subtle differences in what we call that Y variable can help. Calling it the outcome or response variable, rather than dependent, is more applicable to something like factor analysis.
So when to choose multivariate GLM? When you’re jointly modeling the variation in multiple response variables.
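Here is a rough sketch of what “jointly” means, again in Python with NumPy as an illustration rather than a picture of what SPSS computes. Everything is simulated: the three response variables, the coefficient matrix `B`, and the error structure in `L` are assumptions made up for the example. Now Y has several columns, and the model has something to say about how they vary together.

```python
import numpy as np

rng = np.random.default_rng(1)   # simulated data, for illustration only
n, p, q = 200, 8, 3              # 8 predictors, 3 response variables

X1 = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
B = rng.normal(size=(p + 1, q))  # made-up coefficients, one column per response

# Made-up error structure so the responses' errors are correlated
L = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [0.3, 0.2, 1.0]])
E = rng.normal(size=(n, q)) @ L.T
Y = X1 @ B + E                   # SEVERAL dependent variables

# Least squares with a matrix response: one column of estimates per response
B_hat, *_ = np.linalg.lstsq(X1, Y, rcond=None)

# Residual covariance across the responses (q x q)
resid = Y - X1 @ B_hat
Sigma_hat = resid.T @ resid / (n - p - 1)

print(B_hat.shape)      # (9, 3)
print(Sigma_hat.shape)  # (3, 3)
```

The columns of `B_hat` are the same estimates you would get from three separate multiple regressions. What the multivariate model adds is `Sigma_hat`, the covariance among the responses’ residuals, which is what multivariate tests draw on.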