SPSS GLM: Choosing Fixed Factors and Covariates

by Karen Grace-Martin

The beauty of the Univariate GLM procedure in SPSS is that it is so flexible. You can use it to analyze regressions, ANOVAs, ANCOVAs with all sorts of interactions, dummy coding, etc.

The downside of this flexibility is that it is often confusing what to put where and what it all means.

So here’s a quick breakdown.

The dependent variable is, I hope, pretty straightforward. Put in your continuous dependent variable.

Fixed Factors are categorical independent variables. It does not matter if the variable is something you manipulated or something you are controlling for. If it’s categorical, it goes in Fixed Factors.

Now, you can put a categorical variable into Covariates, as long as it’s coded properly (dummy and effect coding are common). What you don’t want to do, though, is put a variable coded 1, 2, 3, 4, 5, 6 for the 6 categories into Covariates. SPSS will treat those values as real numbers and fit a single regression line through them.
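To see why this matters, here is a minimal sketch in Python (not SPSS). The data and the helper names `dummy_code` and `fit_line` are made up for illustration. It treats the same six category codes two ways: as a numeric covariate, which forces one straight line through all six categories, and as dummy-coded indicators, which give each non-reference category its own column.

```python
from statistics import mean


def dummy_code(values, reference):
    """Return (levels, rows): one 0/1 indicator column per non-reference level."""
    levels = [lv for lv in sorted(set(values)) if lv != reference]
    rows = [[1 if v == lv else 0 for lv in levels] for v in values]
    return levels, rows


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b


# Hypothetical data: one observation per category; the codes 1..6 are only labels.
codes = [1, 2, 3, 4, 5, 6]
outcome = [10.0, 30.0, 20.0, 25.0, 5.0, 15.0]

# Codes treated as a covariate: a single slope shared by all six categories.
intercept, slope = fit_line(codes, outcome)
line_predictions = [intercept + slope * c for c in codes]

# Codes treated as categories: five 0/1 indicators, category 6 as the reference.
levels, indicators = dummy_code(codes, reference=6)

for code, y, pred, row in zip(codes, outcome, line_predictions, indicators):
    print(code, y, round(pred, 2), row)
```

With these made-up numbers the fitted line has a slightly negative slope and misses most of the category values, because the codes carry no real numeric meaning. The five indicators, together with an intercept for the reference category, leave room for every category to have its own mean.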

There are a few things you should know about putting a categorical variable into Fixed Factors.

1. You don’t have to create dummy variables for a regression or ANCOVA. SPSS does that for you by default.

2. The default is for SPSS to create interactions among all fixed factors. So if you have 5 fixed factors and don’t want to test 5-way interactions that you’ll never be able to interpret, you’ll need to create a custom model by clicking Model and removing some of the interactions.

3. For any Fixed Factor, you can get marginal means (means adjusted for the other variables in the model) by clicking Options. These are generally easier to interpret than the parameter estimates for categorical variables, especially if you don’t have any continuous predictors in your model.

4. You can also get paired comparison tests for any Fixed Factors by clicking Post Hoc. You can’t get them for Covariates.

5. The default in SPSS is to dummy code any Fixed Factors for the Regression Parameter Estimates Table (which will only be output if you click Options -> Parameter Estimates).

The default is to make the reference category the one that comes last alphabetically. So if your categories (what you typed into the data) are Male and Female, Male will be the default reference.

Remember that higher numbers sort later, so if you had coded your categories 0 and 1, SPSS will make 1 the reference group! This can create a lot of confusion, so you can change the default by clicking Contrasts and setting the reference category to First.

If you want a category in the middle to be the reference group, your only choice is to recode the variable so that that category comes last alphabetically.
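The sorting rule and the recoding trick can be sketched in a few lines of Python. This only mimics the rule as described above and is not SPSS code. The helper names `default_reference` and `recode_reference_last` are hypothetical, and Python's own sort order for strings is assumed to stand in for SPSS's alphabetical ordering.

```python
def default_reference(values):
    """Mimic the rule described above: the reference is the level that sorts last."""
    return sorted(set(values))[-1]


def recode_reference_last(values, reference):
    """Recode numeric levels so `reference` gets the highest code and sorts last."""
    new_code = max(values) + 1
    return [new_code if v == reference else v for v in values]


# String categories: "Male" sorts after "Female", so it becomes the reference.
print(default_reference(["Male", "Female", "Female"]))

# Numeric 0/1 coding: 1 sorts last, so 1 becomes the reference.
print(default_reference([0, 1, 1, 0]))

# A middle category (2 of 1, 2, 3) as the reference: give it the highest code.
income = [1, 2, 3, 2]
recoded = recode_reference_last(income, reference=2)
print(recoded, default_reference(recoded))
```

If you recode this way, remember to update the value labels so the output still shows which category the new code stands for.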

Most of the time, you won’t use Random Factors. Rather than calculating means for each category, as is done with Fixed Factors, SPSS calculates only a single variance for Random Factors. So if you want to compare the means, use Fixed Factors. In fact, if you have Random Factors, you should generally be using the Mixed procedure, which uses better algorithms for estimating effects of Random Factors.
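To put a number on point 2 in the list above, the following Python sketch enumerates every term in a full factorial model with 5 fixed factors. The factor names A through E and the helper `factorial_terms` are placeholders for illustration; nothing here calls SPSS.

```python
from itertools import combinations


def factorial_terms(factors):
    """List every main effect and interaction in a full factorial model."""
    terms = []
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            terms.append("*".join(combo))
    return terms


factors = ["A", "B", "C", "D", "E"]  # placeholder factor names
full = factorial_terms(factors)

# Anything with a "*" is an interaction; the rest are main effects.
interactions = [t for t in full if "*" in t]

# A custom model that keeps only main effects and two-way interactions.
two_way_model = [t for t in full if t.count("*") <= 1]

print(len(full), len(interactions), len(two_way_model))
```

The full factorial model has 31 terms, 26 of them interactions, while a custom model limited to main effects and two-way interactions has 15. That is the kind of trimming the Model button lets you do.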


Comments

Ellen says

December 21, 2021 at 2:51 am

Dear Karen,

Both the article and replies are very helpful to me. Thank you for your excellent work!

Merry Christmas and best regards,

Ellen

- Karen Grace-Martin says
  
  January 7, 2022 at 12:40 pm
  
  Thanks, Ellen!
  
Ellen says

December 16, 2021 at 4:10 am

Thanks a lot for this explicit narration. Very practical knowledge for me.

TAYYBA Rasool says

March 23, 2021 at 11:19 pm

Hello!
I want to apply factor analysis to my data. I have 7 factors in my data file. However, SPSS shows 9 factors when I apply factor analysis. How can I fix the number of factors at 7? Please guide me.
Thanks

- Karen Grace-Martin says
  
  December 6, 2021 at 12:37 pm
  
  Okay, if you’re talking about factor analysis, that is a totally different kind of factor than what we’re talking about here. See this: https://www.theanalysisfactor.com/confusing-statistical-term-6-factor/
  
ESHETU says

August 10, 2020 at 3:02 am

I have data about the communication of fathers and mothers with their sons and daughters. Which method is best for analyzing these data using SPSS?

Katie says

July 8, 2020 at 7:22 am

You indicate that dummy variables can be included as a covariate (rather than within fixed factors) but I can’t find details on how to interpret this type of analysis. eg. If I have 2 treatment groups I want to compare on an outcome whilst controlling for pre-treatment scores and a dummy variable (e.g. gender), can I put the dummy variable in covariates? If so, how do I interpret? And if not, how do I run the analyses specifically to control for gender and pre-treatment scores?

Giana says

May 15, 2020 at 6:57 am

Hi Karen,

I have two categorical covariables in my data gender (2 levels), and ethnicity (4 levels). I transformed them into dummy variables but when I put them together into the model, the location ones don’t yield any results. Is this because I am trying to compute something SPSS doesn’t do, or is it because I need to add a reference variable for both groups?

Pruthviraj says

February 13, 2020 at 12:05 pm

Hi Karen,

I am using the SPSS univariate GLM procedure. In the model, I have 3 fixed factors (with more than 2 levels each) and 1 covariable. When the covariable is put into the covariate box, the option for post hoc tests becomes unavailable. I need the post hoc table to rank the levels under each factor. Do you know how one can get a post hoc table even after filling the covariate box? Please let me know.

Thank you

- Karen Grace-Martin says
  
  April 17, 2020 at 2:56 pm
  
  Pruthviraj,
  
  Yeah, SPSS can’t do it. Your only option is to use a Bonferroni or Sidak correction in the Estimated Marginal Means section (and click the little box that says “Compare Means”). I agree, it’s frustrating.
  
Rhiannon says

February 28, 2018 at 7:28 am

Hi Karen!

If you’re coding for mean income, i.e. $10,000-20,000 coded as 1, $20,000-30,000 coded as 2, $30,000-40,000 coded as 3, etc., how do you control for this in a hierarchical regression model? It is coded categorically, using numbers to represent each income bracket.

best,
Rhiannon

- Karen Grace-Martin says
  
  May 17, 2018 at 9:58 am
  
  Rhiannon, you can only treat predictors as nominal or continuous in models. There is no way to take into account the ordering of categories.
  
Renae says

June 14, 2017 at 9:29 am

Hi, Karen!

I have a between-groups pretest posttest design which I’m analysing using ANCOVA (grouping variable has 2 levels; covariate is pretest score, DV is posttest score). My scatter plots show pretty clearly parallel regression slopes–HOORAY! But the custom GLM is showing a significant groupingvariable*covariate interaction, p = 0.04. Is that -just- acceptable?
If so, how can I write that up? (If not, what should I do instead? I’ve heard I can still run the GLM, but with… some sort of change? And change in what I call the analysis?)

Legrand says

May 24, 2017 at 5:54 am

Dear Karen,
I have a question about the assumptions that need to be satisfied when running an ANCOVA with a categorical covariate.
When the covariate is continuous, two assumptions need to be met : (1) The independent variable and the covariate are independent of each other, (2) There is no interaction between independent variable and the covariate.
Now, when the covariate is categorical, are there assumptions to be met? I would say that the second one would be that there is no interaction between the IV and the covariate, but is there any equivalent for the first one?
Thanks a lot for your help!

mumu says

May 2, 2017 at 3:06 am

Hi,

I found this argument of yours contradicted by some statisticians and supported by others. But I don’t perfectly understand your whole argument, so please help me understand this:

Now, you can put a categorical variable into Covariates, as long as it’s coded properly–dummy or effect coding are common. What you don’t want to do though, is to put a variable coded 1, 2, 3, 4, 5, 6 for the 6 categories into Covariates. SPSS will think those values are real numbers, and will fit a regression line.

what do you mean by that? can you re-word them? thanks

I. MOHAMMED says

December 24, 2016 at 5:21 pm

how feasible is the use of log-transformed dummy explanatory variable in regression analysis of cross-sectional data?
Please help me out.

Julie says

October 31, 2016 at 1:52 pm

Hi, I ran a repeated measures MANOVA and got a significant interaction between my repeated (continuous) measure and my continuous covariate. How can I best probe this interaction?

Karina says

May 16, 2014 at 6:57 am

Hello Karen,

my question is the following: I have one dependent variable which is measured by 3 questions on Likert scale. I also have 5 independent variables (also measured by 5-points Likert scale). I would like to compare the effect of 2 independent variables on the dependent one. According to you what is the best way of doing this? Should the questions of the Likert scale be put in a group?

Thank you very much in advance for your help!

Andreea says

May 15, 2014 at 9:55 am

Hello,
I have a question regarding predicted means in GLM….I want to do an error bar graph with predicted means outcome adjusted for various other factors (used the model to predict the outcome at mean values of the co-variates). Some of my covariates are categorical so I want to use GLM so that I don’t have to create dummies. However GLM doesn’t give me the option to save 95% prediction intervals for mean…How do I do it? thank you

Hristo says

February 20, 2014 at 2:17 pm

Hello Karen,

As I can see, you have a very good knowledge of statistics.
I have a question which may not be as difficult as the previous questions in the blog, but I would appreciate it if you could tell me how I determine the categorical variables in a sample. I am about to run a logistic regression and I have the dependent variable and the covariates, but I am not sure how to determine the categorical variable and which options to choose (classification plots, Hosmer-Lemeshow..). My research aims to determine the differences between rated and unrated banks, using financial and nonfinancial ratios. I will appreciate a little help!

Thanks in advance

Andreea says

January 14, 2014 at 9:22 am

Hello Karen,
I have a similar problem as Derek a while ago,
I have a model with continuous outcome, continuous predictor, continuous and categorical covariates, and categorical moderator (factor). I want to plot an interaction between my continuous predictor and my categorical (3 categories) moderator in Univariate GLM. However GLM only lets me plot estimated marginal means for categorical variables.
I understand the way to go forward in order to visualise an interaction between a factor and a covariate, I would need to plot the regression with the covariate for each level of the factor?
Can you please explain how to do that?
thanks, much appreciated
Andreea

- Karen says
  
  January 15, 2014 at 10:45 am
  
  Hi Andreea,
  
  You can’t do it within the univariate GLM. You have to use a scatterplot, then add the lines. There is an option in the chart editor to add a line from an equation. You’ll have to specify the lines based on the regression coefficients in order to include the effects of the covariates.
  
  Or just do it in Excel. Sometimes that’s easier.
  
  - Andreea says
    
    January 15, 2014 at 6:58 pm
    
    thank you, I think I simply have to Add Fit Line at subgroups as I understand ? However if I do decide to have a categorical predictor and categorical moderator (outcome is still continuous), I can do it with Univariate GLM by doing the estimated marginal means plot? Does that show my interaction for the adjusted regression model with covariates? thank you.
    
Adam says

December 2, 2013 at 5:58 pm

Karen,

I’m with the last user…I appreciate the time you’ve put into this. I also have a sort of similar question.

I’m interested in the interaction between two categorical variables. I can understand the simple output (x*y – F statistic – p value), but I don’t understand how to interpret the parameter estimates. The p value is associated with only one interaction it seems (x =0, y=0), and I’m interested in another one. I’ve played around with the ‘contrasts’ options but that doesn’t seem to be right. Suggestions?

Thanks!
Adam

- Karen says
  
  December 3, 2013 at 12:10 pm
  
  Thanks, Adam.
  
  There is only one interaction, and it reflects the difference in the mean differences.
  
  You have uncanny timing–“Interactions” is the topic of this month’s Brown Bag webinar. In any case, start here: https://www.theanalysisfactor.com/interactions-effect-coded-predictors/
  
  Karen
  
Peggy says

November 25, 2013 at 5:50 am

Hi Karen,

I recently discovered this website and it’s brilliant! It has helped me a lot with understanding statistics, so thanks for that!

May I ask one question? I have a categorical covariate (it’s not important which one is the reference group) and I coded it 1 and 2 (instead of using dummy or effect coding), and used it like that in my analysis. It was not significant. In what way can my output be wrong because of this? Or, in other words, what happens in your data/output if you use 1 and 2 for your variable instead of 0 and 1?

- Karen says
  
  November 25, 2013 at 3:32 pm
  
  Hi Peggy,
  
  First, thanks for the kind words.
  
  If you didn’t include any interactions and/or if you don’t try to interpret the intercept, it won’t make much difference at all.
  
  It’s cleaner to use 0/1, but not 100% necessary. The coefficient for that variable will still be the mean differences. If you do have an interaction, though, then the interpretation of the OTHER variable’s coefficient will be affected.
  
sarah says

October 31, 2013 at 2:51 am

Hi Karen, unsure if you still follow this thread but I am hoping you do!
I have 2 metric DVs and 4 metric IVs (which can all be assessed separately; they don’t have to present a relationship with one another). What is the best test to run? I have tried many different ones, e.g. ANOVA, multiple regression, etc., but none seem to be producing numbers, just empty boxes with error messages. Basically I want to compare my two DVs to my 4 IVs.

Thank You!

- Karen says
  
  December 3, 2013 at 12:08 pm
  
  Hi Sarah,
  
  With two DVs, you need to run some sort of multivariate test.
  
  But really, to figure out the right test you need to get more specific on your research question. Do you want to assess the joint relationship between the 2 DVs and the 4 IVs? Test if the mean on each DV is predicted by each of the IVs?
  
  Get very, very specific.
  
BJS says

May 5, 2013 at 11:44 am

I have a result showing that various brain regions differ between two groups but only after controlling for age, gender and a few other factors. Is it possible in SPSS 15 to create a bar graph of the result with the effect of the confounding variables removed?

- Karen says
  
  May 10, 2013 at 9:34 am
  
  Hi BJS, really good question. I don’t think so. Within GLM, you can get a line graph of the adjusted means (Estimated Marginal Means) OR you could export the EMMeans, then graph them in Excel.
  
Negin says

April 4, 2013 at 9:37 am

Hello
I think that I should use multivariate GLM for analyzing my data. I have both continuous and discrete data. Where should I put the discrete data in SPSS? Should I put the discrete data in the covariate box if there is not any relationship between the dependent variables?

- Karen says
  
  April 8, 2013 at 9:49 am
  
  Negin, are the discrete data the independent or dependent variables? It makes a big difference.
  
Doron Gothelf says

December 7, 2012 at 5:47 pm

Hi
I entered two categorical variables as fixed factors (group and gender) and one as covariate (age) into GLM univariate analysis.
In addition to the interactions I get between the fixed factors I would like to know the interaction of the group by age which I don’t receive in my SPSS output.
The question is how can I obtain in the SPSS univariate the interaction values of a group (fixed factor) by age (covariate)?
Many thanks
Doron

- Karen says
  
  December 12, 2012 at 11:57 am
  
  Hi Doron,
  
  You need to click on the Model button and click on “Custom.” That will allow you to specify the interactions you want.
  
  In syntax, do it in the Design statement.
  
  I go over all this in detail in the Running Regressions and ANCOVAs in SPSS GLM workshop. Here’s the link.
  
ishfaq says

November 24, 2012 at 12:56 pm

hi,
I have collected survey data for my project regarding wetland fisheries. One of my questions asked fishermen for suggestions to make the fishing occupation better. The respondents offered eleven such suggestions, with some of them (say 50) opting for suggestion one, some for suggestion two, and so forth. I want to use these suggestions as one predictor variable in my regression analysis. Can I use this as a nominal/ordinal (depending on their frequencies) variable, and if so, how do I enter data in Excel and SPSS for this variable?

- Karen says
  
  December 3, 2012 at 4:56 pm
  
  Hi Ishfaq,
  
  If I’m understanding it correctly, it sounds like you’d want to create a binary yes/no variable for each suggestion, which indicates whether each respondent made that suggestion. You’ll only be able to do this for popular suggestions–it won’t work, for example if only one or two people suggested something.
  
  Karen
  
Aso says

September 3, 2012 at 1:37 am

Hi Karen,
Amazingly written…

I have a quick question, If I want to perform two regression analysis based on one column depending on the numerical values of 1 and 2. How can I perform it? I want to say that for example, one column named NUM contains numerical data of 1 and 2 only. The other column is of FR. How will I perform the regression analysis, FR as dependent variable and NUM=1 and NUM=2 as independent variables? (I am a newbie so I know the question is a bit naive) I can understand that it’s a small thing but I am unable to resolve it.

Thanks!

- Aso says
  
  September 3, 2012 at 2:05 am
  
  Hi, just an update…. I did it. Thank You 🙂
  
  - Karen says
    
    September 3, 2012 at 5:24 pm
    
    Excellent. 🙂
    
    Karen
    
Lisa says

May 23, 2012 at 1:27 pm

Hi Karen,
I have run a univariate ANOVA looking at the effect of a 3 group categorical variable on a continuous outcome variable (with some continuous covariates also entered into the model) in two ways. First, I ran it by entering 2 dummy-coded variables into fixed factors (for the 3 group categorical variable). The results matched exactly what I found with a linear regression. Second, I ran it with the 3 group categorical variable entered as a fixed factor, on the assumption that SPSS would dummy code this variable for me. The results were similar to the previous univariate ANOVA, but not identical, which makes me concerned. Is SPSS really treating this fixed factor as a fixed factor?

Thank you very much,
Lisa

Tom says

April 24, 2012 at 7:20 pm

I have a question. I have a data set with sales as Y and total retail shelf space as X. I want to investigate diminishing returns, i.e. whether additional shelf space would contribute to more sales. I ran a regression or ARIMA model where I found that the squared shelf space was significant, thus there is a diminishing return.

My question is how to forecast this? When doing my analysis do I need to transform my data to log (Y) = Log (x)(shelf space) + 2log(x)(squared shelf space) or should I simply ignore the shelf space (the one which was not significant i.e. the non squared one) and only do my analysis with shelf space squared. Any help will be highly appreciated.

- Karen says
  
  April 25, 2012 at 4:19 pm
  
  Hi Tom,
  
  Let me make sure I understand. Where X is shelf space, you have a model with both X and X-squared? The X-squared term is significant, but X is not?
  
  I’m not sure where the logs come in, and I’m not an econometrician, so I’m not sure specifically how a forecast differs from a predicted value (any econometricians out there, feel free to comment). But if you’re trying to either describe the effect of shelf space on sales or get predicted values for sales based on shelf space, you want to include the X term as well as X squared. In fact, centering X at its mean helps the interpretation quite a bit. When you do that, the coefficient of X is the slope of the tangent line at the mean (the overall linear trend if X is reasonably symmetric) and the coefficient for X-squared is the amount of curvature.
  
  Hope that helps.
  Karen
  
  - Tom says
    
    April 26, 2012 at 12:07 pm
    
    Hi Karen,
    
    Thank you for your reply.
    And I also understand why X-squared should be there. However, the entire point of this exercise was to investigate whether diminishing return exists and X-squared being significant in the regression, stated this fact. The log-transformation is done, so regression method can be applied.
    
    Do you know, how diminishing return can be forecasted from the above example? Any input will be highly appreciated.
    
    Br,
    
    Tom
    
    - Karen says
      
      April 27, 2012 at 9:14 am
      
      Hi Tom,
      
      I really don’t. I’ve never been trained in economics, and I’m not sure how the statistics plays out in that context.
      
      Best,
      Karen
      
Sara says

April 16, 2012 at 7:05 am

Thanks for this post. Very useful.
I have a question on how to plot results from a univariate model. I have a model that includes one fixed factor (Age) and one covariate (Rate of learning, RoL). One of the main effects (Age) was significant, and so was the interaction (Age*RoL): in one Age group the relationship with RoL is significant and in the other one it is not. I wish to describe this result with a figure. Where can I find in the univariate output the values of the intercept and slope for the relationship with the covariate? Thank you very much
Sara

- Karen says
  
  April 16, 2012 at 12:37 pm
  
  Hi Sara,
  
  It’s called “parameter estimates” and it’s under options.
  
  Karen
  
Lisa says

April 3, 2012 at 2:43 pm

Thank you for this great posting. Is it possible to put more than one covariate into the model in SPSS GLM? When I have tried to do this, I get a message that “the following factors or covariates are not used in the model:” and then the label of one of my covariates. And sure enough, that covariate was not included in the model.
Thank you,
Lisa

- Karen says
  
  April 3, 2012 at 9:37 pm
  
  Hi Lisa,
  
  I’m guessing that you’re running a second GLM and adding the second covariate after all the boxes are already filled in. When you do this, SPSS doesn’t add the covariate automatically to the model. Putting it in the covariate box just defines it as continuous. To add it into the model, you need to click on the Model button, and move it over into the Model box.
  
  If you’re in syntax, you’d have to add it to the /DESIGN subcommand, not just put it after WITH in the UNIANOVA command. Same thing.
  
  Best,
  Karen
  
  - Silvano says
    
    August 14, 2019 at 3:51 am
    
    Thanks for the explanation! Great help.
    
    Best,
    Silvano.
    
cg1991 says

March 29, 2012 at 2:39 pm

Hi,

I’ve been using SPSS and have completed my analysis of a set of data. However, I have been asked to ‘Indicate whether the covariate exerted an independent effect on the outcome’. How would I know? I’ve been combing over this data for hours and have just come to a complete stop.

Thanks for any help 🙂

- Karen says
  
  April 2, 2012 at 10:11 am
  
  I’m not entirely sure what the reviewer means by that.
  
  It could be that they want to make sure its effect is not also explained by any other related predictors in the model. So run the model with and without the other predictors and see if the regression coefficient changed for the covariate in question.
  
  I would ask for clarification, though. There’s nothing wrong with doing that. 🙂
  
  Karen
  
James says

March 26, 2012 at 1:14 pm

Hi Karen,

Thank you so much for posting. I am having some trouble performing a logistic regression while accounting for fixed effects and controlling for other variables. My research is on the influence of financial aid on student retention. The dependent variable is a binary variable (retained or not retained). The independent variable, financial aid, is numerical. My professor wants me to account for the fixed effects of the institution (the data set consists of students from 27 different schools). I also want to control for the effects of being a minority (binary “yes” or “no”), gender (M or F), and ACT score.

How do I set this up in SPSS? How would I find the point of diminishing returns for the regression line? My professor mentioned taking into consideration the year in which a student received financial aid as well (this study is over a three year period). Is this also going to be a fixed effect? Anything you can provide me with would be helpful.

- Karen says
  
  April 2, 2012 at 9:23 am
  
  Hi James,
  
  I can answer some of your questions, but would need to talk over details with you about the others.
  
  Controlling for the effects of being a minority, gender, and ACT score are pretty straightforward. You’d have to dummy code minority and gender.
  
  If you’re not familiar with dummy coding, here are some resources:
  
  https://www.theanalysisfactor.com/complicated-models-with-tricky-effects/
  https://www.theanalysisfactor.com/about-dummy-variables-in-spss-analysis/
  https://www.theanalysisfactor.com/interpreting-linear-regression-parameters-a-walk-through-output/
  
  You could do the same thing with school, if you want to actually compare schools to each other, but it actually sounds like school would be better as a random effect. This sounds like a hierarchical data set, with students nested within school.
  
  So that means you’re starting to get into more complicated models (linear mixed model) and perhaps your advisor wants you to take the simpler approach of treating school as fixed. Since this is a logistic regression, this can get pretty complicated.
  
  And finally, about year, that depends on whether each student is measured each year or just one year. That’s a really important distinction.
  
  This is honestly the kind of question for which I have Quick Question Consultations. 🙂 There are just too many really important details, including your statistical background, to take into account.
  
  Best,
  Karen
  
Lianne says

March 25, 2012 at 11:13 am

Dear Karen,

I have one within subjects factor (treatment: 2 levels) and one between subjects factor (group: 2 levels). I am interested in the group*treatment interaction controlling for two categorical covariates, namely order (2 levels coded 1 and 2) and income (3 levels coded 1, 2 and 3). To my understanding, these categorical factors can be either put into the model as fixed factors or as covariates as dummy codes. For the latter I have dummy coded order as 0 and 1 and created two variables for dummy coding income (with level 1 as a reference).

My two questions are:
– If I am interested in group*treatment interaction controlled for order and income, should I model all interactions when including the two covariates as fixed factors in the analysis or only two way interactions (such that the effects of treatment, treatment*group, treatment*order and treatment*income is modelled)?
– The results are very different for the main effect of treatment when including the dummy coded covariates as covariates compared to including the covariates as fixed factors (and in the latter case only modelling 2 way interactions in order to be similar to the model when dummy coded covariates are used). How is this possible? And what is the right way to proceed? Note: the group*treatment interaction is not affected by these two different methods of including categorical covariates.

I hope you can help me further.

Kind regards,
Lianne

- Karen says
  
  April 2, 2012 at 9:09 am
  
  Hi Lianne,
  
  First, I assume you’re doing this in the GLM Repeated Measures since you have one within-subjects variable. If so…
  
  1. No, you don’t have to (although if those interactions make sense theoretically and those factors are crossed, you can). By only including main effect for order, for example, you’re basically testing if all means are higher or lower in each order. Including interactions between Order and the IVs would test if the effect of those IVs differs depending on order. So it all depends on what you want to control for.
  
  2. This is very hard to answer without more details. What results differ? The F-test? That’s surprising. The regression coefficients? That’s not. When you say you’re including 2-way interactions, are you including all 2-way interactions or just the original treatment*group interaction? The latter is the equivalent of putting them in as covariates.
  
  Best,
  Karen
  
Karen says

March 9, 2012 at 10:57 am

Hi Zara,

You’d need to make involvement a fixed factor if you have only two values. If you have actual numerical values for involvement, you could put that in as a covariate.

The difference is fixed factors will test the difference in the mean of the DV for each category of involvement (high vs. low).

Covariates will fit a regression line between numerical values of involvement and the DV. That’s why you need a numerical variable.

And no, you won’t be able to run a t-test if you have more than one IV.

Karen

zara says

March 9, 2012 at 7:16 am

Dear karen
I am currently doing research that assesses the effect of appeals (rational and emotional), gender (m, f) and the level of involvement (high, low) (3 IVs) on people’s attitude (7-point Likert) (DV). So I am running an ANOVA in SPSS. Now my question is: do I enter involvement as a covariate, or must I enter it as one of the fixed factors? And what is the difference between these two?
Another question: could I run a t-test in place of ANOVA?
Thank you so much

Lisa says

March 8, 2012 at 8:23 pm

Whoops–sorry for the second post. I meant to say that I am wondering about whether dummy coding is necessary for logistic regression analyses in SPSS 19.
Thank you very much,
Lisa

- Karen says
  
  March 9, 2012 at 10:54 am
  
  Hi Lisa,
  
  That was actually an important distinction.
  
  In the linear regression procedure, you’ll have to create your own dummy variables.
  
  In logistic regression, SPSS will do it for you. You have to click on “Categorical” and indicate which predictor variables are categorical. For some reason, you are allowed here to specify whether you want the first or last alphabetical value to be the reference group. Not sure why you can’t do that in linear regression or even GLM.
  
  Karen
  
Lisa says

March 8, 2012 at 7:52 pm

Hi Karen,
Your posts are great. One question–since SPSS automatically dummy codes the fixed factors in GLM, can I assume that if I run a linear regression with a 3-group categorical variable (coded as “nominal” in SPSS), that SPSS will do the dummy-coding? Or do I need to make the dummy coding myself? I am using SPSS 19.
Thank you,
Lisa

Reply
Emma says

March 5, 2012 at 12:50 pm

Hi Karen,
I am unsure of how to analyse my study using SPSS. I have a 2x2x2x2 design looking at the following:
ATS Score – high or low
Perpetrator gender – Male or female
Victim gender – male or female
Victim age – old or young

Reply
- Karen says
  
  March 9, 2012 at 10:49 am
  
  Hi Emma,
  
  In deciding how to analyze, there are other issues I would need to know, like the design (is everything between subjects? I assume they are, but I don’t know what ATS is, and I don’t know if you’re presenting different scenarios to the same subject with different perpetrators and victims); the measurement of the dependent variable (continuous, discrete, binary, etc.?).
  
  If the design is completely between subjects and the dependent variable is continuous, unbounded, and measured on an interval or ratio scale, then use SPSS GLM, put all your independent variables in as fixed factors. You may not want every possible interaction, but SPSS will put them in by default.
  
  If you have no idea whether your dependent variable meets those criteria, I would suggest starting here: https://www.theanalysisfactor.com/the-11-steps-for-statistical-modeling-in-any-regression-or-anova/
  https://www.theanalysisfactor.com/6-types-of-dependent-variables-that-will-never-meet-the-glm-normality-assumption/
  
  Karen
  
  Reply
Karen says

February 24, 2012 at 7:51 pm

Hi Naqeeb,

Thanks for your kind words. I’m glad you find it so helpful.

But I had to let you know, I already wrote that book! 🙂

It doesn’t cover everything in SPSS, but all of the basics up to linear and logistic regression. The Amazon link is in the right sidebar. –>

Karen

Reply
Naqeeb Ullah Khan says

February 24, 2012 at 11:25 am

Hi Karen

Thank you so much for the detailed information you provided. Believe me, you make statistics easier to understand than anything I have seen before.
I would like to request that you write a book about the various statistical procedures that can be done in SPSS and how to interpret them. I guarantee it will be a best seller. Do not go into theory; just discuss the practical aspects and interpretation. Your writing is so clear that even a 10-year-old child would understand it at once.

Thanks and best wishes

Naqeeb.

Reply
Naqeeb Ullah Khan says

February 3, 2012 at 12:07 pm

Dear Sir/Madame

I have data with 6 dependent variables (DVs) and 6 independent variables (IVs). All the DVs are continuous and all the IVs are categorical (some nominal and some ordinal).
They are as follows:
1. Age in years: 5 categories, ordinal
2. Nationality: 2 categories, nominal
3. Gender: 2 categories, nominal
4. Highest level of education: 6 categories, nominal
5. Major field of education: 6 categories, nominal
6. Size of the hospital (number of beds): 5 categories, ordinal

I want to run MANOVA/MANCOVA using GLM.

So my question is: which method should I run? And if it is MANCOVA, which variables should I take as covariates, and what would be the benefits of MANOVA over MANCOVA and vice versa? Thanks and regards

Reply
- Karen says
  
  February 10, 2012 at 6:18 pm
  
  Hi Naqeeb,
  
  If your DVs are correlated, you do need MANOVA, particularly if they make sense as a single construct and you want to test them as a unit.
  
  There’s no way to indicate an ordinal independent variable (and that’s not just SPSS, that’s linear models). So all would go into Fixed Factors. However, by default, SPSS will automatically put in all possible interactions among the fixed factors. That would be a mess in this model. You probably want no more than 3-way interactions. So you will have to click on Model and put the main effects and the 2-way and 3-way interactions into a custom model.
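  
  To see why the default would be a mess with six fixed factors, here is a quick count, sketched in Python with made-up short names for the six IVs. The `terms_up_to` helper is hypothetical and only enumerates terms; it does not fit anything.
  
  ```python
  from itertools import combinations
  
  # Short, hypothetical names for the six categorical IVs in the question.
  factors = ["age", "nationality", "gender", "education", "field", "hospital_size"]
  
  def terms_up_to(factors, max_order):
      """List main effects and interactions up to max_order, as 'a*b*c' strings."""
      return [
          "*".join(combo)
          for order in range(1, max_order + 1)
          for combo in combinations(factors, order)
      ]
  
  full_factorial = terms_up_to(factors, len(factors))
  custom_model = terms_up_to(factors, 3)
  
  print(len(full_factorial), "terms in the default full factorial model")
  print(len(custom_model), "terms with main effects, 2-way and 3-way interactions")
  ```
  
  With six factors the full factorial has 63 terms, while stopping at 3-way interactions leaves 41 (6 main effects, 15 two-way, 20 three-way).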
  
  If you want more info, the details of this kind of thing are EXACTLY the type of issue we cover in the upcoming SPSS GLM workshop. We do it in the univariate case, but all the options are the same. Look under Workshops at Running Regressions and ANCOVAs in SPSS GLM.
  
  Karen
  
  Reply
Karen says

November 8, 2011 at 12:26 pm

Hi Dereck,

If it doesn’t work, then that may be an SPSS limitation. Theoretically it should be fine. I don’t know if other software will let you do this.

Conceptually, a MANOVA (multiple DVs) is similar to running a Factor Analysis on the response variables, creating Factor Scores, then using them as the dependent variable in an ANOVA. The two are not identical, because the composites of the DVs are built differently, but you might have to use that approach here. It’s slightly less elegant, but should give you the test you need.

Karen

Reply
Dereck says

November 2, 2011 at 1:20 pm

Still looking for a reply or maybe a place where I can find the answer to my question?

Thank you!

Reply
Karen says

October 24, 2011 at 2:49 pm

Hi Dereck,

Sorry, I thought I already responded to this. Maybe it didn’t go through.

Yes.

Karen

Reply
- Dereck says
  
  October 27, 2011 at 2:58 pm
  
  Thanks Karen,
  
  I purchased your tutorial on the mixed model, and it worked great.
  
  Regarding a Multivariate Mixed Model, currently my syntax for the univariate mixed model looks like this:
  
  MIXED num_fixations BY viz_order viz_type task WITH verbalWM K6 ps_score bar_expert radar_expert
  /FIXED=task viz_order viz_type task*viz_order task*viz_type viz_order*viz_type task*viz_type*viz_order verbalWM K6 ps_score bar_expert radar_expert
  task*verbalWM task*K6 task*ps_score task*bar_expert task*radar_expert
  viz_order*verbalWM viz_order*K6 viz_order*ps_score viz_order*bar_expert viz_order*radar_expert | SSTYPE(3)
  /METHOD=REML
  /REPEATED=task | SUBJECT(subject_id) COVTYPE(cs).
  
  Where can I specify more dependent variables? If I add any more values right after the MIXED keyword it does not work.
  
  Dereck
  
  Reply
Dereck says

August 30, 2011 at 4:19 pm

Thanks for the response Karen. I will look into this right away.

Another question, if I wanted to run what would normally be a Repeated Measures but with multiple dependent variables (ie. like a repeated measures multivariate test) will the Mixed Model be able to handle this?

Dereck

Reply
Karen says

August 30, 2011 at 2:48 pm

Hi Dereck,

Yep. In SPSS repeated measures, it’s a bit tricky. It is actually reporting results from two different models (one is a univariate model and the other a multivariate–I’m sure you’ve seen tables that mention both). Some tables aren’t labelled, but parameter estimates are multivariate and go with the multivariate tests of within-subjects factors. There is nothing you can do about this.

So the parameter estimates are telling you the effect of each covariate on each within-subject measurement. So if your within-subjects factors are a 2×3, say, you’re going to get a separate coefficient for each covariate on each of the 6 outcome measures.
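
As a rough picture of what that table holds, here is a small Python sketch. The factor levels, the intercepts, and the slopes are all invented; in practice you would copy the numbers from your own Parameter Estimates table. Each of the 6 measures gets its own small regression equation, and plugging covariate values into it gives predicted values you can plot.

```python
from itertools import product

# Hypothetical 2x3 within-subjects design: 2 tasks x 3 visualization types.
tasks = ["task1", "task2"]
viz_types = ["bar", "radar", "line"]
cells = list(product(tasks, viz_types))  # the 6 repeated measures

# Made-up estimates: one intercept and one covariate slope per measure.
estimates = {
    cell: {"intercept": 10.0 + i, "slope": 0.5 * (i + 1)}
    for i, cell in enumerate(cells)
}

def predict(cell, covariate_value):
    """Predicted outcome for one repeated measure at a given covariate value."""
    row = estimates[cell]
    return row["intercept"] + row["slope"] * covariate_value

# Plug in a few plausible covariate values for every measure.
for cell in cells:
    print(cell, [predict(cell, value) for value in (0, 5, 10)])
```

With more than one covariate, each measure simply gains one more slope term in its equation.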

Honestly, there is a better solution, but it may not be easy. In any case, you really want to run this in Mixed. Not GLM repeated measures. All the output is univariate, and it can easily handle this design.

Here are some resources to get you started:

Approaches to Repeated Measures Data: Repeated Measures ANOVA, Marginal, and Mixed Models–https://www.theanalysisfactor.com/repeated-measures-approaches/

Five Advantages of Running Repeated Measures ANOVA as a Mixed Model
https://www.theanalysisfactor.com/advantages-of-repeated-measures-anova-as-a-mixed-model/

Running Repeated Measures as a Mixed Model http://www.theanalysisinstitute.com/products/Product-Mixed-Model.html

Karen

Reply
Dereck says

August 29, 2011 at 2:11 pm

Hi Karen,

I’m not quite sure how to interpret the table that is created when I tell SPSS to produce the “Parameter Estimates”. The table has a unique set of covariate coefficient values for each repeated measure, and I am not sure how to go from there. Can you suggest a web tutorial or maybe provide an example?

Thanks for the reply,
Dereck

Reply
Dereck says

August 19, 2011 at 7:32 pm

Hi Karen,

I’m running a GLM Repeated measures in SPSS, and for this model I have 2 within-subject factors, 1 between subject factor, and 5 co-variates.

The problem I am having is that I am unable to plot any of the co-variates in the model settings, nor do any of the co-variates show up as options under ‘Estimated Marginal Means.’

Do you know how to fix this, because the co-variates are important and I am unable to make this work.

Thanks!

Reply
- Karen says
  
  August 26, 2011 at 12:56 pm
  
  Hi Dereck,
  
  SPSS will only give you estimated marginal means and profile plots for categorical predictors, i.e. those in the Fixed Factors box.
  
  You can do a work-around though. In the options, ask for “Parameter Estimates.” These will give you regression coefficients for the model. You can use these to plot predicted values to see the effect of the covariates. I find the easiest way to do it is to export the coefficients table to Excel, plug in possible values for the predictors, then use formulas in Excel to get the predicted values.
  
  Karen
  
  Reply
Karen says

June 3, 2011 at 11:21 am

Hi John,

You’re right–when you have a lot of IVs, it does get messy fast. And the fact that you have interactions AND 4 DVs makes it that much messier.

So here is what I suggest.

1. The first thing you need to do is univariate descriptives on all variables. This really helps you get an idea of your variables. If a numerical variable has a normal-looking distribution, it’s much less reasonable to categorize it than if it’s bimodal, for example.

2. Then do bivariate relationships. See how the DVs relate to each other. If they’re unrelated, you don’t need the MANOVA.

See how the IVs relate to each other as well as to the DVs. Again, if the relationship between a numerical IV and the DVs is linear, it makes less sense to categorize it than if there are big jumps.

If the demographics are moderators, they may not have bivariate relationships with the DVs, but should still be part of the model. So as you do your model building, you might want to go with a top-down strategy–include all main effects and interactions. Remove any non-significant interactions and any non-significant main effects only if the interactions involving those variables are also taken out.
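
The rule in that last step (drop a main effect only when no interaction containing it remains) can be checked mechanically. Here is a minimal Python sketch; the variable names and the `orphaned_interactions` helper are hypothetical and only inspect a list of term names.

```python
def orphaned_interactions(terms):
    """Return (interaction, missing_main_effects) pairs for a list of model terms.

    Terms are strings such as "gender" or "gender*age".
    """
    main_effects = {term for term in terms if "*" not in term}
    problems = []
    for term in terms:
        parts = term.split("*")
        missing = [part for part in parts if part not in main_effects]
        if len(parts) > 1 and missing:
            problems.append((term, missing))
    return problems

# Hypothetical model after pruning: "age" was dropped but "iv1*age" was kept.
pruned = ["iv1", "gender", "iv1*gender", "iv1*age"]
print(orphaned_interactions(pruned))

# Removing the interaction as well restores a well-formed model.
fixed = [term for term in pruned if term != "iv1*age"]
print(orphaned_interactions(fixed))
```

An empty list means every interaction still has its main effects in the model.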

Karen

Reply
John says

June 2, 2011 at 11:32 pm

Hi Karen,
I have five dependent variables (interval) and four independent variables. One independent variable has 2 levels (yes/no); the other 3 IVs are metric, but I can categorize them into 4 levels. I assume I can either enter the metric IVs as covariates or use the 4-level categorical versions as fixed factors? In addition to these DVs and IVs, I have a number of categorical (demographic) variables, and I would like to see whether they moderate the IV-DV relationship. Using GLM/MANOVA in SPSS with many variables gets very messy, but I get quite different results if I remove one or more variables. Can I (or should I) first check the demographics for main effects only and then drop those that are not significant? If yes, can I do the same thing for the IVs, i.e. first test for main effects? My data collection is over, so my sample size is fixed, and the response rate was lower than expected. I feel like I need to get the number of variables down to have any chance of understanding the analysis. Plus, Box’s test will not run with a large number of variables and my sample size.

Reply
Dee Simons says

March 19, 2011 at 10:42 am

Hi there,
I need some help. I want to know how to run a 2x2x2x2 mixed-design ANOVA in SPSS with Participant race (Black and White) as the between-subjects factor and Context condition (Verbal and Non-Verbal) x Face race (Black and White) x Orientation (Upright and Inverted) as the within-participant factors. I want to know how to display these in Variable View and what steps to carry out, please. Please e-mail me at kingdom172002@hotmail.com

Reply
- Karen says
  
  March 25, 2011 at 12:53 pm
  
  Done. 🙂
  
  Reply
ning says

January 31, 2011 at 4:29 pm

Hi Karen, the Goddess of statistics,

For my research, my lecturer just advised me to enter the categorical data (both the dependent and independent variables are categorical, on a 1-5 Likert scale) into GLM despite the dependent variable being categorical.

This is done in order to run a regression on multiple dependent variables.
Does this simplification make sense?
Please answer. Thank you so much.

Reply
- Karen says
  
  February 25, 2011 at 12:05 pm
  
  Hi Ning,
  
  Aw shucks….
  
  A lot of people are willing to make the assumption that 1-5 Likert scale data are valid as continuous.
  
  Here are a few posts I’ve written on this topic: https://www.theanalysisfactor.com/can-likert-scale-data-ever-be-continuous/
  
  https://www.theanalysisfactor.com/likert-scale-items-as-predictor-variables-in-regression/
  
  https://www.theanalysisfactor.com/when-dependent-variables-are-not-fit-for-glm-now-what/
  
  This last one includes ordinal variables, which is really what the Likert data is. I did a webinar on this, and you can download the recording here: The Other Regression Models Part 1: Binary, Ordinal, and Multinomial Logistic for Categorical Outcomes.
  
  One thing to be aware of, which I talked about in the webinar, is that ordinal logistic regression (which technically you should use for ordered categories) has its own assumptions, which are often hard to meet for Likert data. So sometimes it’s an issue of choosing the lesser evil between two inappropriate methods.
  
  Reply
nadya says

September 16, 2009 at 6:27 am

Thank you for sharing this information. Can I ask you some questions?
I found a study that adds the variable YEAR (the research period is 10 years) as a fixed factor, then interacts it with other variables/covariates. My question is, what is the purpose of this? And how should I interpret it if the YEAR variable is significant or not significant, either as a main effect or as an interaction effect?

Reply
- Karen says
  
  September 17, 2009 at 2:43 pm
  
  Hi Nadya–The reason to add YEAR as a fixed factor is to control for any effect of YEAR on Y, the Dependent Variable. It may be that Y varies from year to year, and the main effect will quantify that.
  
  The interaction between a covariate, X, and YEAR will indicate if the effect of X on Y varies across the years. For example, the effect of amount spent on advertising (X) on revenue (Y) probably changes from year to year. The effect might be smaller in years with a slow economy.
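  
  As a toy illustration of that interaction, here is a Python sketch with invented advertising and revenue figures for two years. Fitting a separate slope for each year shows what "the effect of X varies across the years" means. This is only a sketch of the idea, not the test SPSS runs.
  
  ```python
  from statistics import mean
  
  def slope(xs, ys):
      """Least-squares slope of ys on xs."""
      x_bar, y_bar = mean(xs), mean(ys)
      sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
      sxx = sum((x - x_bar) ** 2 for x in xs)
      return sxy / sxx
  
  # Hypothetical (advertising, revenue) data; 2008 stands in for a slow economy.
  data = {
      2007: ([1, 2, 3, 4], [12, 14, 16, 18]),
      2008: ([1, 2, 3, 4], [11, 12, 13, 14]),
  }
  
  slopes = {year: slope(xs, ys) for year, (xs, ys) in data.items()}
  for year, value in slopes.items():
      print(f"{year}: revenue changes by {value:.2f} per unit of advertising")
  ```
  
  A significant X by YEAR interaction says that slopes like these differ by more than chance would explain.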
  
  Reply
claudia says

February 5, 2009 at 1:12 am

my walk on path to statistical confidence has hereby begun. thanks!

Reply
- admin says
  
  February 5, 2009 at 3:09 pm
  
  Excellent! Enjoy the journey. 🙂
  
  Reply
Rasmus says

January 5, 2009 at 5:03 pm

Thank you, very helpful.

Reply
