The beauty of the Univariate GLM procedure in SPSS is that it is so flexible. You can use it to analyze regressions, ANOVAs, ANCOVAs with all sorts of interactions, dummy coding, etc.
The downside of this flexibility is that it is often confusing what to put where and what it all means.
So here’s a quick breakdown.
The Dependent Variable is, I hope, pretty straightforward. Put in your continuous dependent variable.
Fixed Factors are categorical independent variables. It does not matter if the variable is something you manipulated or something you are controlling for. If it’s categorical, it goes in Fixed Factors.
Now, you can put a categorical variable into Covariates, as long as it’s coded properly–dummy or effect coding are common. What you don’t want to do though, is to put a variable coded 1, 2, 3, 4, 5, 6 for the 6 categories into Covariates. SPSS will think those values are real numbers, and will fit a regression line.
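To make the difference concrete, here is a minimal sketch in Python (not SPSS) of what the two codings hand to the model. The data values and the `dummy_code` helper are made up for illustration, and the reference category is an arbitrary choice here.

```python
# Hypothetical data: a 6-category variable stored as the codes 1..6.
codes = [1, 2, 3, 4, 5, 6, 3, 1]


def dummy_code(values, reference):
    """Return one 0/1 indicator column per non-reference category."""
    levels = sorted(set(values))
    kept = [lev for lev in levels if lev != reference]
    return {lev: [1 if v == lev else 0 for v in values] for lev in kept}


# As a covariate, the codes form a single numeric column: one slope, so each
# step from one category to the next is forced to have the same effect.
as_covariate = [float(v) for v in codes]

# Dummy coded (here with category 6 as the reference), each remaining
# category gets its own column and therefore its own coefficient.
as_dummies = dummy_code(codes, reference=6)

print(len(as_dummies))  # number of indicator columns
print(as_dummies[3])    # indicator column for category 3
```

Six categories become five indicator columns, while the covariate version is a single column of numbers, which is why the model can only fit one straight line through it.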
There are a few things you should know about putting a categorical variable into Fixed Factors.
1. You don’t have to create dummy variables for a regression or ANCOVA. SPSS does that for you by default.
2. The default is for SPSS to create interactions among all fixed factors. So if you have 5 fixed factors and don’t want to test 5-way interactions that you’ll never be able to interpret, you’ll need to create a custom model by clicking Model and removing some of the interactions.
3. For any Fixed Factor, you can get marginal means (means adjusted for the other variables in the model) by clicking Options. These are generally easier to interpret than the parameter estimates for categorical variables, especially if you don’t have any continuous predictors in your model.
4. You can also get pairwise comparison tests for any Fixed Factor by clicking Post Hoc. You can’t get them for Covariates.
5. The default in SPSS is to dummy code any Fixed Factors for the Regression Parameter Estimates Table (which will only be output if you click Options–>Parameter Estimates). Furthermore, the default is to make the reference category the one that comes last alphabetically. So if your categories (what you typed into the data) are Male and Female, Male will be the default reference. Remember higher numbers come later alphabetically, so if you had coded your categories 0 and 1, SPSS will make 1 the reference group! This can create a lot of confusion, so you can change the default by choosing Contrast and making the reference group First. If you want a category in the middle to be the reference group, your only choice is to recode the variable so that that category comes last alphabetically.
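Two of the defaults in the list above are easy to check with a few lines of Python. This is only a sketch of the arithmetic, not of SPSS itself: the factor names are hypothetical, and a plain sort is used as a stand-in for how SPSS orders category values.

```python
from itertools import combinations


def full_factorial_terms(factors):
    """List every main effect and interaction in a full factorial model."""
    terms = []
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            terms.append("*".join(combo))
    return terms


def default_reference(categories):
    """Pick the category that sorts last, mimicking the default described above."""
    return sorted(categories)[-1]


factors = ["A", "B", "C", "D", "E"]  # hypothetical factor names
terms = full_factorial_terms(factors)
interactions = [t for t in terms if "*" in t]

print(len(terms), len(interactions))
print(default_reference(["Male", "Female"]))
print(default_reference([0, 1]))
```

With five factors the full factorial model has 31 terms, 26 of them interactions, which is why a custom model is usually worth the extra clicks.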
Most of the time, you won’t use Random Factors. Rather than calculating means for each category, as is done with Fixed Factors, SPSS calculates only a single variance for Random Factors. So if you want to compare the means, use Fixed Factors. In fact, if you have Random Factors, you should generally be using the Mixed procedure, which uses better algorithms for estimating effects of Random Factors.
Editor’s Update 10/9/09: In just a few weeks, I’ll be offering a 3-hour workshop on the ins and outs of SPSS GLM. We’ll cover the defaults, the menus and syntax, the meanings of all these terms, when you need each option, and what the results mean. Get more info and register at: http://theanalysisinstitute.com/workshops/SPSS-GLM/index.html





Thank you, very helpful.
my walk on path to statistical confidence has hereby begun. thanks!
Excellent! Enjoy the journey.
Thank you for sharing this information. Can I ask you some questions?
I found a study that adds the variable YEAR (the research period is 10 years) as a fixed factor, then interacts it with other variables/covariates. My question is, what is the function of this? And how should I interpret it if the YEAR variable is significant or not significant, either as a main effect or as an interaction effect?
Hi Nadya–The reason to add YEAR as a fixed factor is to control for any effect of YEAR on Y, the Dependent Variable. It may be that Y varies from year to year, and the main effect will quantify that.
The interaction between a covariate, X, and YEAR will indicate if the effect of X on Y varies across the years. For example, the effect of amount spent on advertising (X) on revenue (Y) probably changes from year to year. The effect might be smaller in years with a slow economy.
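Here is a small Python sketch of what that interaction means in terms of predicted values. Every number and year below is invented for illustration, with the last year treated as the reference category.

```python
# Hypothetical coefficients, with the last year as the reference category
# (year offsets and slope adjustments are relative to that year).
intercept = 100.0
slope_x = 2.0  # effect of X on Y in the reference year
year_offset = {2007: 5.0, 2008: -10.0, 2009: 0.0}        # main effect of YEAR
year_slope_adjust = {2007: 0.5, 2008: -1.5, 2009: 0.0}   # X*YEAR interaction


def predict(x, year):
    """Predicted Y for advertising spend x in a given year."""
    slope = slope_x + year_slope_adjust[year]
    return intercept + year_offset[year] + slope * x


def effect_of_x(year):
    """Change in predicted Y per one-unit increase in X within that year."""
    return predict(1.0, year) - predict(0.0, year)


for year in sorted(year_offset):
    print(year, effect_of_x(year))
```

The main effect of YEAR shifts the whole line up or down, while the interaction changes its slope, so the effect of X comes out different in each year.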
hi karen the Goddess of statistics,
for my research, my lecturer just advised me to put the categorical data (both dependent and independent variables are categorical, on a 1-5 Likert scale) into GLM, despite the dependent variable being categorical.
This is done in order to run a regression on multiple dependent variables.
Does this simplification make sense?
please answer. thank you so much
Hi Ning,
Aw shucks….
A lot of people are willing to make the assumption that 1-5 Likert scale data are valid as continuous.
Here are a few posts I’ve written on this topic: http://www.theanalysisfactor.com/can-likert-scale-data-ever-be-continuous/
http://www.theanalysisfactor.com/likert-scale-items-as-predictor-variables-in-regression/
http://www.theanalysisfactor.com/when-dependent-variables-are-not-fit-for-glm-now-what/
This last one includes ordinal variables, which is really what the Likert data is. I did a webinar on this, and you can download the recording here: The Other Regression Models Part 1: Binary, Ordinal, and Multinomial Logistic for Categorical Outcomes.
One thing to be aware of, which I talked about in the webinar, is that ordinal logistic regression (which technically you should use for ordered categories) has its own assumptions, which are often hard to meet for Likert data. So sometimes it’s an issue of the lesser evil of two inappropriate methods.
hi there,
I need some help. I want to know how to run a 2x2x2x2 mixed-design ANOVA in SPSS with Participant race (Black and White) as the between-subjects factor and Context condition (Verbal and Non Verbal) x Face race (Black and White) x Orientation (Upright and Inverted) as the within-participant factors. I want to know how to set up the data in Variable View and what steps to carry out, please. Please e-mail me at kingdom172002@hotmail.com
Done.
Hi Karen,
I have five dependent variables (interval) and four independent variables. One independent variable has 2 levels (yes/no), and the other 3 IVs are metric, but I can categorize them into 4 levels. I assume I can enter the metric IVs as covariates or use the 4-level categorical versions as fixed factors? In addition to these DVs and IVs, I have a number of categorical (demographic) variables, and I would like to see if they moderate the IV-DV relationship. Using GLM/MANOVA in SPSS, trying to use many variables gets very messy, but I get quite different results if I remove one or more variables. Can I (or should I) first check the demographics for a main effect only and then not use those that are not significant? If yes, can I do the same thing for the IVs, i.e. first test for main effects? My data collection is over, so my sample size is fixed, and the response rate was lower than expected. I feel like I need to get the number of variables down to have any chance of understanding the analysis. Plus, Box’s test will not run with a large number of variables and my sample size.
Hi John,
You’re right: when you have a lot of IVs, it does get messy fast. And the fact that you have interactions AND 5 DVs makes it that much messier.
So here is what I suggest.
1. The first thing you need to do is univariate descriptives on all variables. This really helps you get an idea of your variables. If a numerical variable has a normal-looking distribution, it’s much less reasonable to categorize it than if it’s bimodal, for example.
2. Then do bivariate relationships. See how the DVs relate to each other. If they’re unrelated, you don’t need the MANOVA.
See how the IVs relate to each other as well as to the DVs. Again, if the relationship between a numerical IV and the DVs is linear, it makes less sense to categorize it than if there are big jumps.
If the demographics are moderators, they may not have bivariate relationships with the DVs, but should still be part of the model. So as you do your model building, you might want to go with a top-down strategy–include all main effects and interactions. Remove any non-significant interactions and any non-significant main effects only if the interactions involving those variables are also taken out.
Karen
Hi Karen,
I’m running a GLM Repeated measures in SPSS, and for this model I have 2 within-subject factors, 1 between subject factor, and 5 co-variates.
The problem I am having is that I am unable to plot any of the co-variates in the model settings, nor do any of the co-variates show up as options under ‘Estimated Marginal Means.’
Do you know how to fix this, because the co-variates are important and I am unable to make this work.
Thanks!
Hi Dereck,
SPSS will only give you estimated marginal means and profile plots for categorical predictors, i.e. those in the Fixed Factors box.
You can do a work-around though. In the options, ask for “Parameter Estimates.” These will give you regression coefficients for the model. You can use these to plot predicted values to see the effect of the covariates. I find the easiest way to do it is to export the coefficients table to Excel, plug in possible values for the predictors, then use formulas in Excel to get the predicted values.
Karen
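The spreadsheet work-around described above amounts to the following arithmetic, sketched here in Python rather than Excel. The coefficients, the `group` factor, and the `age` covariate are all hypothetical, and the sketch assumes a single set of coefficients from a simple univariate model, not the repeated measures output.

```python
# Hypothetical Parameter Estimates: an intercept, one dummy-coded factor
# (group, with "control" as the reference category at 0), and one covariate.
coef = {
    "intercept": 10.0,
    "group": {"treatment": 3.0, "control": 0.0},
    "age": 0.25,
}


def predicted(group, age):
    """Predicted value of the dependent variable for one set of predictor values."""
    return coef["intercept"] + coef["group"][group] + coef["age"] * age


# Plug in a grid of covariate values for each group, as one would in a spreadsheet.
ages = [20, 40, 60]
grid = {g: [predicted(g, a) for a in ages] for g in coef["group"]}

for g, values in grid.items():
    print(g, values)
```

Plotting each group's predicted values against the covariate gives the kind of picture that the profile plots will not produce for covariates.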
Hi Karen,
I’m not quite sure how to translate the table that is created when I tell SPSS to make the “Parameter Estimates”. The table has a unique set of covariate coefficient values for each repeated measure, and I am not sure how to go from there. Can you suggest a web tutorial or maybe provide an example?
Thanks for the reply,
Dereck
Hi Dereck,
Yep. In SPSS repeated measures, it’s a bit tricky. It is actually reporting results from two different models (one is a univariate model and the other a multivariate–I’m sure you’ve seen tables that mention both). Some tables aren’t labelled, but parameter estimates are multivariate and go with the multivariate tests of within-subjects factors. There is nothing you can do about this.
So the parameter estimates are telling you the effect of each covariate on each within-subject measurement. So if your within-subjects factors are a 2×3, say, you’re going to get a separate coefficient for each covariate on each of the 6 outcome measures.
Honestly, there is a better solution, but it may not be easy. In any case, you really want to run this in Mixed, not GLM Repeated Measures. All the output is univariate, and it can easily handle this design.
Here are some resources to get you started:
Approaches to Repeated Measures Data: Repeated Measures ANOVA, Marginal, and Mixed Models–http://www.theanalysisfactor.com/repeated-measures-approaches/
Five Advantages of Running Repeated Measures ANOVA as a Mixed Model
http://www.theanalysisfactor.com/advantages-of-repeated-measures-anova-as-a-mixed-model/
Running Repeated Measures as a Mixed Model http://www.theanalysisinstitute.com/products/Product-Mixed-Model.html
Karen
Thanks for the response Karen. I will look into this right away.
Another question: if I wanted to run what would normally be a Repeated Measures analysis but with multiple dependent variables (i.e., a repeated measures multivariate test), will the Mixed Model be able to handle this?
Dereck
Hi Dereck,
Sorry, I thought I already responded to this. Maybe it didn’t go through.
Yes.
Karen
Thanks Karen,
I purchased your tutorial on the mixed model, and it worked great.
Regarding a Multivariate Mixed Model, currently my syntax for the univariate mixed model looks like this:
MIXED num_fixations BY viz_order viz_type task WITH verbalWM K6 ps_score bar_expert radar_expert
  /FIXED=task viz_order viz_type task*viz_order task*viz_type viz_order*viz_type task*viz_type*viz_order verbalWM K6 ps_score bar_expert radar_expert
    task*verbalWM task*K6 task*ps_score task*bar_expert task*radar_expert
    viz_order*verbalWM viz_order*K6 viz_order*ps_score viz_order*bar_expert viz_order*radar_expert | SSTYPE(3)
  /METHOD=REML
  /REPEATED=task | SUBJECT(subject_id) COVTYPE(cs).
Where can I specify more dependent variables? If I add any more values right after the MIXED keyword it does not work.
Dereck
Still looking for a reply or maybe a place where I can find the answer to my question?
Thank you!
Hi Dereck,
If it doesn’t work, then that may be an SPSS limitation. Theoretically it should be fine. I don’t know if other software will let you do this.
Theoretically, a MANOVA (multiple DVs) is the same as running a Factor Analysis on the response variables, creating Factor Scores, then using them as the dependent variable in an ANOVA. You might have to use that approach here. It’s slightly less elegant, but should give you the test you need.
Karen