The beauty of the Univariate GLM procedure in SPSS is that it is so flexible. You can use it to analyze regressions, ANOVAs, ANCOVAs with all sorts of interactions, dummy coding, etc.

The downside of this flexibility is that it is often confusing what to put where and what it all means.

So here’s a quick breakdown.

The Dependent Variable box is, I hope, pretty straightforward. Put in your continuous dependent variable.

Fixed Factors are categorical independent variables. It does not matter if the variable is something you manipulated or something you are controlling for. If it’s categorical, it goes in Fixed Factors.

Now, you can put a categorical variable into Covariates, as long as it’s coded properly (dummy and effect coding are common). What you don’t want to do, though, is put a variable coded 1, 2, 3, 4, 5, 6 for its 6 categories into Covariates. SPSS will treat those values as real numbers and fit a single regression line through them, as if each step from one category to the next meant the same change in the dependent variable.
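As a minimal sketch of the "coded properly" route, the syntax below builds dummy variables by hand and enters them as covariates. The variable names (`outcome`, `region`, `region1`, `region2`) are hypothetical and not from the original post, and a 3-category variable is assumed to keep the example short.

```spss
* Hypothetical variables: outcome is continuous and region is coded 1, 2, 3.
* Build two 0/1 dummy variables, leaving region 3 as the reference category.
COMPUTE region1 = (region = 1).
COMPUTE region2 = (region = 2).
EXECUTE.

* The dummies enter as covariates, so each coefficient is a mean difference from region 3.
UNIANOVA outcome WITH region1 region2
  /PRINT=PARAMETER
  /DESIGN=region1 region2.
```

Putting `region` itself after `WITH` would instead estimate one slope across the codes 1, 2, 3, which is the situation described above.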

There are a few things you should know about putting a categorical variable into Fixed Factors.

1. You don’t have to create dummy variables for a regression or ANCOVA. SPSS does that for you by default.

2. The default is for SPSS to create interactions among all fixed factors. So if you have 5 fixed factors and don’t want to test 5-way interactions that you’ll never be able to interpret, you’ll need to create a custom model by clicking Model and removing some of the interactions.

3. For any Fixed Factor, you can get marginal means (means adjusted for the other variables in the model) by clicking Options. These are generally easier to interpret than the parameter estimates for categorical variables. Especially if you don’t have any continuous predictors in your model, it is much easier to interpret means than parameter estimates.

4. You can also get paired comparison tests for any Fixed Factors by clicking Post Hoc. You can’t get them for Covariates.

5. The default in SPSS is to dummy code any Fixed Factors for the Regression Parameter Estimates Table (which will only be output if you click Options–>Parameter Estimates).

The default is to make the reference category the one that comes last alphabetically. So if your categories (what you typed into the data) are Male and Female, Male will be the default reference.

Remember that higher numbers come later alphabetically, so **if you coded your categories 0 and 1, SPSS will make 1 the reference group!** This can create a lot of confusion, so you can change the default by choosing Contrast and making the reference group First.

If you want a category in the middle to be the reference group, your only choice is to recode the variable so that that category comes last alphabetically.
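The points in the list above can be sketched in syntax roughly as follows. This is an illustration under stated assumptions, not the syntax SPSS would paste for your model: `outcome`, `treatment`, `site`, `sex`, and `treatment_ref` are hypothetical names, and `treatment` is assumed to be coded 1, 2, 3 with category 2 as the desired reference group.

```spss
* Hypothetical data: outcome is continuous and treatment is coded 1, 2, 3.
* Move category 2 to the highest code so that it sorts last and becomes the reference.
RECODE treatment (2=4) (ELSE=COPY) INTO treatment_ref.
EXECUTE.

* Custom model with main effects and two-way interactions only, so no three-way term.
UNIANOVA outcome BY treatment_ref site sex
  /METHOD=SSTYPE(3)
  /POSTHOC=treatment_ref(TUKEY)
  /EMMEANS=TABLES(treatment_ref)
  /PRINT=PARAMETER
  /DESIGN=treatment_ref site sex treatment_ref*site treatment_ref*sex site*sex.
```

Here `/DESIGN` plays the role of the Model dialog, `/EMMEANS` requests the marginal means, `/POSTHOC` the paired comparisons, and `/PRINT=PARAMETER` the parameter estimates table.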

Most of the time, you won’t use Random Factors. Rather than calculating means for each category, as is done with Fixed Factors, SPSS calculates only a single variance for Random Factors. So if you want to compare the means, use Fixed Factors. In fact, if you have Random Factors, you should generally be using the Mixed procedure, which uses better algorithms for estimating effects of Random Factors.

Ellen says

Dear Karen,

Both the article and replies are very helpful to me. Thank you for your excellent work!

Merry Christmas and best regards,

Ellen

Karen Grace-Martin says

Thanks, Ellen!

Ellen says

Thanks a lot for this explicit narration. Very practical knowledge for me.

TAYYBA Rasool says

Hello!

I want to apply factor analysis to my data. I have 7 factors in my data file. However, SPSS shows 9 factors when I apply factor analysis. How can I fix the number of factors at 7? Please guide me.

Thanks

Karen Grace-Martin says

Okay, if you’re talking about factor analysis, that is a totally different kind of factor than what we’re talking about here. See this: https://www.theanalysisfactor.com/confusing-statistical-term-6-factor/

ESHETU says

I have data about the communication of fathers and mothers with their sons and daughters. Which method is best for analyzing these data using SPSS?

Katie says

You indicate that dummy variables can be included as a covariate (rather than within fixed factors) but I can’t find details on how to interpret this type of analysis. eg. If I have 2 treatment groups I want to compare on an outcome whilst controlling for pre-treatment scores and a dummy variable (e.g. gender), can I put the dummy variable in covariates? If so, how do I interpret? And if not, how do I run the analyses specifically to control for gender and pre-treatment scores?

Giana says

Hi Karen,

I have two categorical covariables in my data gender (2 levels), and ethnicity (4 levels). I transformed them into dummy variables but when I put them together into the model, the location ones don’t yield any results. Is this because I am trying to compute something SPSS doesn’t do, or is it because I need to add a reference variable for both groups?

Pruthviraj says

Hi Karen,

I am using spss univariate GLM procedure. In the model, I have 3 fixed factors (with more than 2 levels each) and 1 covariable. When the covariable is put into covariate box, option for post hoc is becoming unavailable. I need the post hoc table to rank the levels under each factor. Do you know how can one get post hoc table, even after filling covariate box? Please let me know.

Thank you

Karen Grace-Martin says

Pruthviraj,

Yeah, SPSS can’t do it. Your only option is to use a Bonferroni or Sidak correction in the Estimated Marginal Means section (and click the little box that says “Compare main effects”). I agree, it’s frustrating.
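A minimal sketch of that workaround in syntax, assuming hypothetical variable names (`outcome`, `f1`, `f2`, `f3`, `cov`) and a main-effects-only model, since the original question does not give the full design:

```spss
* Hypothetical variables: outcome is the DV, f1 f2 f3 are fixed factors, cov is the covariate.
* COMPARE requests pairwise comparisons of the adjusted means for f1.
UNIANOVA outcome BY f1 f2 f3 WITH cov
  /METHOD=SSTYPE(3)
  /EMMEANS=TABLES(f1) WITH(cov=MEAN) COMPARE ADJ(BONFERRONI)
  /DESIGN=f1 f2 f3 cov.
```

Swapping `ADJ(BONFERRONI)` for `ADJ(SIDAK)` gives the Sidak correction, and repeating the `/EMMEANS` line for `f2` and `f3` would cover the other factors.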

Rhiannon says

Hi Karen!

If you’re coding for mean income, i.e. $10,000-20,000 coded as 1, $20,000-30,000 coded as 2, $30,000-40,000 coded as 3, etc., how do you control for this in a hierarchical regression model, given that it is coded categorically with numbers representing each income bracket?

best,

Rhiannon

Karen Grace-Martin says

Rhiannon, you can only treat predictors as nominal or continuous in models. There is no way to take into account the ordering of categories.

Renae says

Hi, Karen!

I have a between-groups pretest posttest design which I’m analysing using ANCOVA (grouping variable has 2 levels; covariate is pretest score, DV is posttest score). My scatter plots show pretty clearly parallel regression slopes–HOORAY! But the custom GLM is showing a significant groupingvariable*covariate interaction, p = 0.04. Is that -just- acceptable?

If so, how can I write that up? (If not, what should I do instead? I’ve heard I can still run the GLM, but with… some sort of change? And change in what I call the analysis?)

Legrand says

Dear Karen,

I have a question about the assumptions that need to be satisfied when running an ANCOVA with a categorical covariate.

When the covariate is continuous, two assumptions need to be met : (1) The independent variable and the covariate are independent of each other, (2) There is no interaction between independent variable and the covariate.

Now, when the covariate is categorical, are there assumptions to be met? I would say that the second one would be that there is no interaction between the IV and the covariate, but is there any equivalent for the first one?

Thanks a lot for your help!

mumu says

Hi,

I found this argument of yours contradicted by some statisticians and supported by others. But I don’t fully understand your whole argument, so please help me with this:

Now, you can put a categorical variable into Covariates, as long as it’s coded properly–dummy or effect coding are common. What you don’t want to do though, is to put a variable coded 1, 2, 3, 4, 5, 6 for the 6 categories into Covariates. SPSS will think those values are real numbers, and will fit a regression line.

what do you mean by that? can you re-word them? thanks

I. MOHAMMED says

how feasible is the use of log-transformed dummy explanatory variable in regression analysis of cross-sectional data?

Please help me out.

Julie says

Hi, I ran a repeated measures MANOVA and got a significant interaction between my repeated (continuous) measure and my continuous covariate. How can I best probe this interaction?

Karina says

Hello Karen,

my question is the following: I have one dependent variable which is measured by 3 questions on Likert scale. I also have 5 independent variables (also measured by 5-points Likert scale). I would like to compare the effect of 2 independent variables on the dependent one. According to you what is the best way of doing this? Should the questions of the Likert scale be put in a group?

Thank you very much in advance for your help!

Andreea says

Hello,

I have a question regarding predicted means in GLM….I want to do an error bar graph with predicted means outcome adjusted for various other factors (used the model to predict the outcome at mean values of the co-variates). Some of my covariates are categorical so I want to use GLM so that I don’t have to create dummies. However GLM doesn’t give me the option to save 95% prediction intervals for mean…How do I do it? thank you

Hristo says

Hello Karen,

As I can see, you have a very good knowledge of statistics.

I have a question which may not be as difficult as the previous questions in the blog, but I would appreciate it if you could tell me how to determine the categorical variables in a sample. I am about to run a logistic regression and I have the dependent variable and the covariates, but I am not sure how to determine the categorical variable and which options to choose (classification plots, Hosmer-Lemeshow..). My research aims to determine the differences between rated and unrated banks, using financial and nonfinancial ratios. I will appreciate a little help!

Thanks in advance

Andreea says

Hello Karen,

I have a similar problem as Derek a while ago,

I have a model with continuous outcome, continuous predictor, continuous and categorical covariates, and categorical moderator (factor). I want to plot an interaction between my continuous predictor and my categorical (3 categories) moderator in Univariate GLM. However GLM only lets me plot estimated marginal means for categorical variables.

I understand the way to go forward in order to visualise an interaction between a factor and a covariate, I would need to plot the regression with the covariate for each level of the factor?

Can you please explain how to do that?

thanks, much appreciated

Andreea

Karen says

Hi Andreea,

You can’t do it within the univariate GLM. You have to use a scatterplot, then add the lines. There is an option in the chart editor to add a line from an equation. You’ll have to specify the lines based on the regression coefficients in order to include the effects of the covariates.

Or just do it in excel. Sometimes that’s easier.

Andreea says

thank you, I think I simply have to Add Fit Line at subgroups as I understand ? However if I do decide to have a categorical predictor and categorical moderator (outcome is still continuous), I can do it with Univariate GLM by doing the estimated marginal means plot? Does that show my interaction for the adjusted regression model with covariates? thank you.

Adam says

Karen,

I’m with the last user…I appreciate the time you’ve put into this. I also have a sort of similar question.

I’m interested in the interaction between two categorical variables. I can understand the simple output (x*y – F statistic – p value), but I don’t understand how to interpret the parameter estimates. The p value is associated with only one interaction it seems (x =0, y=0), and I’m interested in another one. I’ve played around with the ‘contrasts’ options but that doesn’t seem to be right. Suggestions?

Thanks!

Adam

Karen says

Thanks, Adam.

There is only one interaction, and it reflects the difference in the mean differences.

You have uncanny timing–“Interactions” is the topic of this month’s Brown Bag webinar. In any case, start here: https://www.theanalysisfactor.com/interactions-effect-coded-predictors/

Karen

Peggy says

Hi Karen,

I recently discovered this website and it’s brilliant! It has helped me a lot with understanding statistics, so thanks for that!

May I ask one question? I have a categorical covariate (it’s not important which one is the reference group) and I coded it 1 and 2 (instead of using dummy or effect coding), and used it like that in my analysis. It was not significant. In what way can my output be wrong because of this? Or, in other words, what happens in your data/output if you use 1 and 2 for your variable instead of 0 and 1?

Karen says

Hi Peggy,

First, thanks for the kind words.

If you didn’t include any interactions and/or if you don’t try to interpret the intercept, it won’t make much difference at all.

It’s cleaner to use 0/1, but not 100% necessary. The coefficient for that variable will still be the mean difference. If you do have an interaction, though, then the interpretation of the OTHER variable’s coefficient will be affected.

sarah says

Hi Karen, unsure if you still follow this thread but I am hoping you do!

I have 2 metric DVs and 4 metric IVs (which can all be assessed separately; they don’t have to present a relationship with one another). What is the best test to run? I have tried many different ones, i.e. ANOVA, multiple regression, etc., but none seem to be producing numbers, just empty boxes with error messages. Basically I want to compare my two DVs to my 4 IVs.

Thank You!

Karen says

Hi Sarah,

With two DVs, you need to run some sort of multivariate test.

But really, to figure out the right test you need to get more specific on your research question. Do you want to assess the joint relationship between the 2 DVs and the 4 IVs? Test if the mean on each DV is predicted by each of the IVs?

Get very, very specific.

BJS says

I have a result showing that various brain regions differ between two groups, but only after controlling for age, gender and a few other factors. Is it possible in SPSS 15 to create a bar graph of the result with the effect of the confounding variables removed?

Karen says

Hi BJS, really good question. I don’t think so. Within GLM, you can get a line graph of the adjusted means (Estimated Marginal Means) OR you could export the EMMeans, then graph them in Excel.

Negin says

Hello

I think that I should use multivariate GLM for analyzing my data. I have both continuous and discrete data. Where should I put the discrete data in SPSS? Should I put the discrete data in the covariate box if there is no relationship between the dependent variables?

Karen says

Negin, are the discrete data the independent or dependent variables? It makes a big difference.

Doron Gothelf says

Hi

I entered two categorical variables as fixed factors (group and gender) and one as covariate (age) into GLM univariate analysis.

In addition to the interactions I get between the fixed factors I would like to know the interaction of the group by age which I don’t receive in my SPSS output.

The question is how can I obtain in the SPSS univariate the interaction values of a group (fixed factor) by age (covariate)?

Many thanks

Doron

Karen says

Hi Doron,

You need to click on the Model button and click on “Custom.” That will allow you to specify the interactions you want.

In syntax, do it in the Design statement.
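As a rough sketch of what that looks like, assuming a hypothetical dependent variable named `outcome` alongside the `group`, `gender`, and `age` variables from the question:

```spss
* Hypothetical DV name is outcome, with group and gender as fixed factors and age as covariate.
* Listing group*age in DESIGN adds the factor-by-covariate interaction to the model.
UNIANOVA outcome BY group gender WITH age
  /METHOD=SSTYPE(3)
  /PRINT=PARAMETER
  /DESIGN=group gender group*gender age group*age.
```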

I go over all this in detail in the Running Regressions and ANCOVAs in SPSS GLM workshop. Here’s the link.

ishfaq says

hi,

I have collected survey data for my project regarding wetland fisheries. One of my questions asked fishermen for suggestions to make the fishing occupation better, and the respondents gave eleven such suggestions, with some of them (say 50) opting for suggestion one, some for suggestion two, and so forth. I want to use these suggestions as one predictor variable in my regression analysis, so can I use this as a nominal/ordinal (depending on their frequencies) variable, and if so, how do I enter data in Excel and SPSS for this variable?

Karen says

Hi Ishfaq,

If I’m understanding it correctly, it sounds like you’d want to create a binary yes/no variable for each suggestion, which indicates whether each respondent made that suggestion. You’ll only be able to do this for popular suggestions–it won’t work, for example if only one or two people suggested something.

Karen

Aso says

Hi Karen,

Amazingly written…

I have a quick question, If I want to perform two regression analysis based on one column depending on the numerical values of 1 and 2. How can I perform it? I want to say that for example, one column named NUM contains numerical data of 1 and 2 only. The other column is of FR. How will I perform the regression analysis, FR as dependent variable and NUM=1 and NUM=2 as independent variables? (I am a newbie so I know the question is a bit naive) I can understand that it’s a small thing but I am unable to resolve it.

Thanks!

Aso says

Hi, just an update…. I did it. Thank You 🙂

Karen says

Excellent. 🙂

Karen

Lisa says

Hi Karen,

I have run a univariate ANOVA looking at the effect of a 3 group categorical variable on a continuous outcome variable (with some continuous covariates also entered into the model) in two ways. First, I ran it by entering 2 dummy-coded variables into fixed factors (for the 3 group categorical variable). The results matched exactly what I found with a linear regression. Second, I ran it with the 3 group categorical variable entered as a fixed factor, on the assumption that SPSS would dummy code this variable for me. The results were similar to the previous univariate ANOVA, but not identical, which makes me concerned. Is SPSS really treating this fixed factor as a fixed factor?

Thank you very much,

Lisa

Tom says

I have a question…I have a data set with sales as Y and total retail shelf space (x)..i want to investigate the diminishing return i.e. whether additional shelf space would contribute to more sales. I did regression or Arima model where I found that the squared shelf space was significant, thus there is a diminishing return.

My question is how to forecast this? When doing my analysis do I need to transform my data to log (Y) = Log (x)(shelf space) + 2log(x)(squared shelf space) or should I simply ignore the shelf space (the one which was not significant i.e. the non squared one) and only do my analysis with shelf space squared. Any help will be highly appreciated.

Karen says

Hi Tom,

Let me make sure I understand. Where X is shelf space, you have a model with both X and X-squared? The X-squared term is significant, but X is not?

I’m not sure where the logs come in, and I’m not an econometrician, so I’m not sure specifically how a forecast differs from a predicted value (any econometricians out there, feel free to comment). But if you’re trying to either describe the effect of shelf space on sales or get predicted values for sales based on shelf space, you want to include the X term as well as X squared. In fact, centering X at its mean helps the interpretation quite a bit. When you do that, the coefficient of X is the slope of the tangent line at the mean (the overall linear trend if X is reasonably symmetric) and the coefficient for X-squared is the amount of curvature.
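A minimal sketch of the centering step, assuming hypothetical variable names `sales` and `shelf` (the new variables `shelf_mean`, `shelf_c`, and `shelf_c2` are likewise made up for illustration):

```spss
* Hypothetical variables: sales is the outcome and shelf is total shelf space.
* Add the grand mean of shelf to every case, then center and square.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=
  /shelf_mean=MEAN(shelf).
COMPUTE shelf_c = shelf - shelf_mean.
COMPUTE shelf_c2 = shelf_c**2.
EXECUTE.

* The shelf_c coefficient is the slope at mean shelf space and shelf_c2 carries the curvature.
UNIANOVA sales WITH shelf_c shelf_c2
  /PRINT=PARAMETER
  /DESIGN=shelf_c shelf_c2.
```

With this coding, a negative coefficient on `shelf_c2` together with a positive one on `shelf_c` would be the pattern consistent with diminishing returns around the mean.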

Hope that helps.

Karen

Tom says

Hi Karen,

Thank you for your reply.

And I also understand why X-squared should be there. However, the entire point of this exercise was to investigate whether diminishing return exists and X-squared being significant in the regression, stated this fact. The log-transformation is done, so regression method can be applied.

Do you know, how diminishing return can be forecasted from the above example? Any input will be highly appreciated.

Br,

Tom

Karen says

Hi Tom,

I really don’t. I’ve never been trained in economics, and I’m not sure how the statistics plays out in that context.

Best,

Karen

Sara says

Thanks for this post. Very useful.

I have a question about how to plot results from a univariate model. I have a model that includes one fixed factor (Age) and one covariate (Rate of learning, RoL). One of the main effects was significant (Age), and so was the interaction (Age*RoL): in one Age group the relationship with RoL is significant and in the other one it is not. I wish to describe this result with a figure. Where in the univariate output can I find the values of the intercept and slope for the relationship with the covariate? Thank you very much

Sara

Karen says

Hi Sara,

It’s called “parameter estimates” and it’s under options.

Karen

Lisa says

Thank you for this great posting. Is it possible to put more than one covariate into the model in SPSS GLM? When I have tried to do this, I get a message that “the following factors or covariates are not used in the model:” and then the label of one of my covariates. And sure enough, that covariate was not included in the model.

Thank you,

Lisa

Karen says

Hi Lisa,

I’m guessing that you’re running a second GLM and adding the second covariate after all the boxes are already filled in. When you do this, SPSS doesn’t add the covariate automatically to the model. Putting it in the covariate box just defines it as continuous. To add it into the model, you need to click on the Model button, and move it over into the Model box.

If you’re in syntax, you’d have to add it to the /DESIGN subcommand, not just put it after WITH in the UNIANOVA command. Same thing.
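For example, a sketch with hypothetical variable names (`outcome`, `group`, `cov1`, `cov2`):

```spss
* Hypothetical variables: outcome is the DV, group is a fixed factor, cov1 and cov2 are covariates.
* When DESIGN lists effects explicitly, a covariate named after WITH is used only if it appears there.
UNIANOVA outcome BY group WITH cov1 cov2
  /METHOD=SSTYPE(3)
  /DESIGN=group cov1 cov2.
```

If `cov2` were left out of `/DESIGN`, it would be declared but not estimated, which matches the warning described above.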

Best,

Karen

Silvano says

Thanks for the explanation! Great help.

Best,

Silvano.

cg1991 says

Hi,

I’ve been using SPSS and have completed my analysis of a set of data. However, I have been asked to ‘Indicate whether the covariate exerted an independent effect on the outcome’. How would I know? I’ve been combing over this data for hours and have just come to a complete stop.

Thanks for any help 🙂

Karen says

I’m not entirely sure what the reviewer means by that.

It could be that they want to make sure its effect is not also explained by any other related predictors in the model. So run the model with and without the other predictors and see if the regression coefficient changed for the covariate in question.

I would ask for clarification, though. There’s nothing wrong with doing that. 🙂

Karen

James says

Hi Karen,

Thank you so much for posting. I am having some trouble performing a logistic regression while accounting for fixed effects and controlling for other variables. My research is on the influence of financial aid on student retention. The dependent variable is a binary variable (retained or not retained). The independent variable, financial aid, is numerical. My professor wants me to account for the fixed effects of the institution (the data set consists of students from 27 different schools). I also want to control for the effects of being a minority (binary “yes” or “no”), gender (M or F), and ACT score.

How do I set this up in SPSS? How would I find the point of diminishing returns for the regression line? My professor mentioned taking into consideration the year in which a student received financial aid as well (this study is over a three-year period). Is this also going to be a fixed effect? Anything you can provide me with would be helpful.

Karen says

Hi James,

I can answer some of your questions, but would need to talk over details with you about the others.

Controlling for the effects of being a minority, gender, and ACT score are pretty straightforward. You’d have to dummy code minority and gender.

If you’re not familiar with dummy coding, here are some resources:

https://www.theanalysisfactor.com/complicated-models-with-tricky-effects/

https://www.theanalysisfactor.com/about-dummy-variables-in-spss-analysis/

https://www.theanalysisfactor.com/interpreting-linear-regression-parameters-a-walk-through-output/

You could do the same thing with school, if you want to actually compare schools to each other, but it actually sounds like school would be better as a random effect. This sounds like a hierarchical data set, with students nested within school.

So that means you’re starting to get into more complicated models (linear mixed model) and perhaps your advisor wants you to take the simpler approach of treating school as fixed. Since this is a logistic regression, this can get pretty complicated.

And finally, about year, that depends on whether each student is measured each year or just one year. That’s a really important distinction.

This is honestly the kind of question for which I have Quick Question Consultations. 🙂 There are just too many really important details, including your statistical background, to take into account.

Best,

Karen

Lianne says

Dear Karen,

I have one within subjects factor (treatment: 2 levels) and one between subjects factor (group: 2 levels). I am interested in the group*treatment interaction controlling for two categorical covariates, namely order (2 levels coded 1 and 2) and income (3 levels coded 1, 2 and 3). To my understanding, these categorical factors can be either put into the model as fixed factors or as covariates as dummy codes. For the latter I have dummy coded order as 0 and 1 and created two variables for dummy coding income (with level 1 as a reference).

My two questions are:

– If I am interested in group*treatment interaction controlled for order and income, should I model all interactions when including the two covariates as fixed factors in the analysis or only two way interactions (such that the effects of treatment, treatment*group, treatment*order and treatment*income is modelled)?

– The results are very different for the main effect of treatment when including the dummy coded covariates as covariates compared to including the covariates as fixed factors (and in the latter case only modelling 2 way interactions in order to be similar to the model when dummy coded covariates are used). How is this possible? And what is the right way to proceed? Note: the group*treatment interaction is not affected by these two different methods of including categorical covariates.

I hope you can help me further.

Kind regards,

Lianne

Karen says

Hi Lianne,

First, I assume you’re doing this in the GLM Repeated Measures since you have one within-subjects variable. If so…

1. No, you don’t have to (although if those interactions make sense theoretically and those factors are crossed, you can). By only including main effect for order, for example, you’re basically testing if all means are higher or lower in each order. Including interactions between Order and the IVs would test if the effect of those IVs differs depending on order. So it all depends on what you want to control for.

2. This is very hard to answer without more details. What results differ? The F-test? That’s surprising. The regression coefficients? That’s not. When you say you’re including 2-way interactions, are you including all 2-way interactions or just the original treatment*group interaction? The latter is the equivalent of putting them in as covariates.

Best,

Karen

Karen says

Hi Zara,

You’d need to make involvement a fixed factor if you have only two values. If you have actual numerical values for involvement, you could put that in as a covariate.

The difference is fixed factors will test the difference in the mean of the DV for each category of involvement (high vs. low).

Covariates will fit a regression line between numerical values of involvement and the DV. That’s why you need a numerical variable.

And no, you won’t be able to run a t-test if you have more than one IV.

Karen

zara says

Dear karen

I am currently doing research that assesses the effects of appeals (rational and emotional), gender (m, f) and level of involvement (high, low) (3 IVs) on people’s attitudes (7-point Likert) (DV). So I am running an ANOVA in SPSS. Now my question is: do I enter involvement as a covariate, or must I enter it as one of the fixed factors? And what is the difference between these two?

Another question: could I run a t-test in place of the ANOVA?

Thank you so much

Lisa says

Whoops–sorry for the second post. I meant to say that I am wondering about whether dummy coding is necessary for logistic regression analyses in SPSS 19.

Thank you very much,

Lisa

Karen says

Hi Lisa,

That was actually an important distinction.

In the linear regression procedure, you’ll have to create your own dummy variables.

In logistic regression, SPSS will do it for you. You have to click on “Categorical” and indicate which predictor variables are categorical. For some reason, you are allowed here to specify whether you want the first or last alphabetical value to be the reference group. Not sure why you can’t do that in linear regression or even GLM.
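In syntax, the reference-group choice is made on the `/CONTRAST` subcommand. A minimal sketch, assuming hypothetical variables `retained` (0/1 outcome), `program` (categorical), and `age` (continuous):

```spss
* Hypothetical variables: retained is a 0/1 outcome, program is categorical, age is continuous.
* Indicator(1) makes the first category of program the reference group.
LOGISTIC REGRESSION VARIABLES retained
  /METHOD=ENTER program age
  /CONTRAST (program)=Indicator(1).
```

Writing `Indicator` without the `(1)` leaves the last category as the reference.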

Karen

Lisa says

Hi Karen,

Your posts are great. One question–since SPSS automatically dummy codes the fixed factors in GLM, can I assume that if I run a linear regression with a 3-group categorical variable (coded as “nominal” in SPSS), that SPSS will do the dummy-coding? Or do I need to make the dummy coding myself? I am using SPSS 19.

Thank you,

Lisa

Emma says

Hi Karen,

I am unsure of how to analyse my study using SPSS. I have a 2x2x2x2 design looking at the following:

ATS Score – high or low

Perpetrator gender – Male or female

Victim gender – male or female

Victim age – old or young

Karen says

Hi Emma,

In deciding how to analyze, there are other issues I would need to know, like the design (is everything between subjects? I assume they are, but I don’t know what ATS is, and I don’t know if you’re presenting different scenarios to the same subject with different perpetrators and victims); the measurement of the dependent variable (continuous, discrete, binary, etc.?).

If the design is completely between subjects and the dependent variable is continuous, unbounded, and measured on an interval or ratio scale, then use SPSS GLM, put all your independent variables in as fixed factors. You may not want every possible interaction, but SPSS will put them in by default.

If you have no idea whether your dependent variable meets those criteria, I would suggest starting here: https://www.theanalysisfactor.com/the-11-steps-for-statistical-modeling-in-any-regression-or-anova/

https://www.theanalysisfactor.com/6-types-of-dependent-variables-that-will-never-meet-the-glm-normality-assumption/

Karen

Karen says

Hi Naqeeb,

Thanks for your kind words. I’m glad you find it so helpful.

But I had to let you know, I already wrote that book! 🙂

It doesn’t cover everything in SPSS, but all of the basics up to linear and logistic regression. The Amazon link is in the right sidebar. –>

Karen

Naqeeb Ullah Khan says

Hi Karen

Thank you so much for the detailed information you provided. Believe me, you make statistics easier to understand than anything I have seen before.

I would ask you to please write a book about the various statistical procedures that can be done in SPSS and how to interpret them. I guarantee it will be a best seller. Do not go into theory; just discuss the practical aspects and interpretation. Your writing is so clear that even a 10-year-old child would understand it at once.

Thanks and best wishes

Naqeeb.

Naqeeb Ullah Khan says

Dear Sir/Madame

I have data with 6 dependent variables (DVs) and 6 independent variables (IVs). All the DVs are continuous and the IVs are categorical [some are nominal and some are on ordinal scales].

They are as follows:

1. Age in Years :5 -Categories -Ordinal

2. Nationality: 2 -Categories. Nominal

3. Gender: 2 -Categories. Nominal

4. Highest level of education: 6-Categories- Nominal

5. Major Field of Education: 6-Categories-Nominal

6. Size of the Hospital (number of beds): 5 -Categories-Ordinal

I want to run MANOVA/MANCOVA using GLM.

So my question is: which method should I run? And if it is MANCOVA, which variables should I treat as covariates, and what would be the benefits of MANOVA over MANCOVA and vice versa? Thanks and regards

Karen says

Hi Naqeeb,

If your DVs are correlated, you do need MANOVA, particularly if they make sense as a single construct and you want to test them as a unit.

There’s no way to indicate an ordinal independent variable (and that’s not just SPSS, that’s linear models). So all would go into Fixed Factors. However, by default, SPSS will automatically put in all possible interactions among all fixed factors. That would be a mess in this model. You probably want no more than 3-way interactions. So you will have to click on Model and put the Main Effects and the 2-way and 3-way interactions into a custom model.
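To give a sense of how much the custom model trims, here is a small Python sketch (not SPSS syntax) that lists the terms for six fixed factors. The short factor names are hypothetical stand-ins for the six predictors in the question; the counts follow from simple combinatorics.

```python
from itertools import combinations

# Hypothetical short names for the six categorical predictors in the question.
factors = ["age", "nationality", "gender", "education", "field", "hospital_size"]

def model_terms(factors, max_order):
    """Return all main effects and interactions up to max_order, as tuples."""
    terms = []
    for order in range(1, max_order + 1):
        terms.extend(combinations(factors, order))
    return terms

# Full factorial: every main effect and every interaction up to 6-way.
full_factorial = model_terms(factors, len(factors))

# Custom model: main effects plus 2-way and 3-way interactions only.
custom = model_terms(factors, 3)

print(len(full_factorial))  # 63 terms
print(len(custom))          # 41 terms (6 + 15 + 20)

# Print the first few terms in the familiar A*B notation.
for term in custom[:8]:
    print("*".join(term))
```

Even the trimmed model has 41 terms, so it may be worth asking whether all of the 3-way interactions are really of interest.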

If you want more info, the details of this kind of thing are EXACTLY the type of issue we cover in the upcoming SPSS GLM workshop. We do it in the univariate case, but all the options are the same. Look under Workshops at Running Regressions and ANCOVAs in SPSS GLM.

Karen

Karen says

Hi Dereck,

If it doesn’t work, then that may be an SPSS limitation. Theoretically it should be fine. I don’t know if other software will let you do this.

Conceptually, a MANOVA (multiple DVs) is similar to running a Factor Analysis on the response variables, creating Factor Scores, then using them as the dependent variable in an ANOVA. You might have to use that approach here. It’s slightly less elegant, but should give you the test you need.

Karen

Dereck says

Still looking for a reply or maybe a place where I can find the answer to my question?

Thank you!

Karen says

Hi Dereck,

Sorry, I thought I already responded to this. Maybe it didn’t go through.

Yes.

Karen

Dereck says

Thanks Karen,

I purchased your tutorial on the mixed model, and it worked great.

Regarding a Multivariate Mixed Model, currently my syntax for the univariate mixed model looks like this:

MIXED num_fixations BY viz_order viz_type task WITH verbalWM K6 ps_score bar_expert radar_expert
  /FIXED=task viz_order viz_type task*viz_order task*viz_type viz_order*viz_type task*viz_type*viz_order verbalWM K6 ps_score bar_expert radar_expert
    task*verbalWM task*K6 task*ps_score task*bar_expert task*radar_expert
    viz_order*verbalWM viz_order*K6 viz_order*ps_score viz_order*bar_expert viz_order*radar_expert | SSTYPE(3)
  /METHOD=REML
  /REPEATED=task | SUBJECT(subject_id) COVTYPE(cs).

Where can I specify more dependent variables? If I add any more values right after the MIXED keyword it does not work.

Dereck

Dereck says

Thanks for the response Karen. I will look into this right away.

Another question, if I wanted to run what would normally be a Repeated Measures but with multiple dependent variables (ie. like a repeated measures multivariate test) will the Mixed Model be able to handle this?

Dereck

Karen says

Hi Dereck,

Yep. In SPSS repeated measures, it’s a bit tricky. It is actually reporting results from two different models (one is a univariate model and the other a multivariate–I’m sure you’ve seen tables that mention both). Some tables aren’t labelled, but parameter estimates are multivariate and go with the multivariate tests of within-subjects factors. There is nothing you can do about this.

So the parameter estimates are telling you the effect of each covariate on each within-subject measurement. So if your within-subjects factors are a 2×3, say, you’re going to get a separate coefficient for each covariate on each of the 6 outcome measures.
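As a quick illustration of why that table gets long, this Python sketch counts the coefficients for a 2×3 within-subjects design with five covariates. The level and covariate names are placeholders, not anything SPSS prints.

```python
from itertools import product

# Placeholder level names for a 2 x 3 within-subjects design.
factor_a = ["a1", "a2"]
factor_b = ["b1", "b2", "b3"]

# Placeholder names for five covariates.
covariates = ["cov1", "cov2", "cov3", "cov4", "cov5"]

# Each combination of within-subjects levels is its own outcome measure.
cells = [f"{a}_{b}" for a, b in product(factor_a, factor_b)]

# One covariate coefficient per covariate per outcome measure.
rows = [(cov, cell) for cov in covariates for cell in cells]

print(len(cells))  # 6 outcome measures
print(len(rows))   # 30 covariate coefficients
```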

Honestly, there is a better solution, but it may not be easy. In any case, you really want to run this in Mixed. Not GLM repeated measures. All the output is univariate, and it can easily handle this design.

Here are some resources to get you started:

Approaches to Repeated Measures Data: Repeated Measures ANOVA, Marginal, and Mixed Models–https://www.theanalysisfactor.com/repeated-measures-approaches/

Five Advantages of Running Repeated Measures ANOVA as a Mixed Model

https://www.theanalysisfactor.com/advantages-of-repeated-measures-anova-as-a-mixed-model/

Running Repeated Measures as a Mixed Model http://www.theanalysisinstitute.com/products/Product-Mixed-Model.html

Karen

Dereck says

Hi Karen,

I’m not quite sure how to interpret the table that is created when I ask SPSS for “Parameter Estimates”. The table has a unique set of covariate coefficient values for each repeated measure, and I am not sure how to go from there. Can you suggest a web tutorial or maybe provide an example?

Thanks for the reply,

Dereck

Dereck says

Hi Karen,

I’m running a GLM Repeated measures in SPSS, and for this model I have 2 within-subject factors, 1 between subject factor, and 5 co-variates.

The problem I am having is that I am unable to plot any of the co-variates in the model settings, nor do any of the co-variates show up as options under ‘Estimated Marginal Means.’

Do you know how to fix this, because the co-variates are important and I am unable to make this work.

Thanks!

Karen says

Hi Dereck,

SPSS will only give you estimated marginal means and profile plots for categorical predictors, i.e. those in the Fixed Factors box.

You can do a work-around though. In the options, ask for “Parameter Estimates.” These will give you regression coefficients for the model. You can use these to plot predicted values to see the effect of the covariates. I find the easiest way to do it is to export the coefficients table to Excel, plug in possible values for the predictors, then use formulas in Excel to get the predicted values.
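The same arithmetic can be done in a few lines of code instead of a spreadsheet. The sketch below uses Python with made-up coefficients for one two-level factor (`group`) and one covariate (`age`); none of these numbers come from real output. It assumes the usual GLM dummy coding, in which the last category is the reference and gets a coefficient of 0.

```python
# Hypothetical parameter estimates, as if typed in from the table.
coefs = {
    "Intercept": 10.0,
    "group=1": 2.5,   # group 1 compared with the reference group
    "group=2": 0.0,   # reference category (assumed to be the last one)
    "age": 0.5,       # slope for the covariate
}

def predict(group, age):
    """Predicted value = intercept + group effect + slope * covariate value."""
    return coefs["Intercept"] + coefs[f"group={group}"] + coefs["age"] * age

# Plug in a few plausible covariate values for each group.
for age in (20, 40, 60):
    print(age, predict(1, age), predict(2, age))
```

Plotting the predicted values against the covariate, one line per group, gives a picture comparable to a profile plot.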

Karen

Karen says

Hi John,

You’re right–when you have a lot of IVs, it does get messy fast. And the fact that you have interactions AND 4 DVs makes it that much messier.

So here is what I suggest.

1. The first thing you need to do is univariate descriptives on all variables. This really helps you get an idea of your variables. If a numerical variable has a normal-looking distribution, it’s much less reasonable to categorize it than if it’s bimodal, for example.

2. Then do bivariate relationships. See how the DVs relate to each other. If they’re unrelated, you don’t need the MANOVA.

See how the IVs relate to each other as well as to the DVs. Again, if the relationship between a numerical IV and the DVs is linear, it makes less sense to categorize it than if there are big jumps.

If the demographics are moderators, they may not have bivariate relationships with the DVs, but should still be part of the model. So as you do your model building, you might want to go with a top-down strategy–include all main effects and interactions. Remove any non-significant interactions and any non-significant main effects only if the interactions involving those variables are also taken out.
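The removal rule in that last sentence can be written down explicitly. This is only a Python sketch of one pruning pass, with invented p-values and variable names; in practice you would refit the model after each removal and look at the new tests.

```python
def removable_terms(p_values, alpha=0.05):
    """One pass of top-down pruning that respects model hierarchy.

    p_values maps each term (a frozenset of variable names) to its p-value.
    A term is removable only if it is non-significant AND no higher-order
    term still in the model contains it.
    """
    terms = set(p_values)
    removable = []
    for term in terms:
        if p_values[term] < alpha:
            continue  # significant: keep it
        # "term < other" is a proper-subset test on frozensets.
        if any(term < other for other in terms):
            continue  # part of an interaction that is still in the model
        removable.append("*".join(sorted(term)))
    return sorted(removable)

# Invented p-values for illustration only.
p_values = {
    frozenset({"iv"}): 0.01,
    frozenset({"sex"}): 0.40,
    frozenset({"region"}): 0.60,
    frozenset({"iv", "sex"}): 0.03,
    frozenset({"iv", "region"}): 0.55,
}

removable = removable_terms(p_values)
print(removable)  # ['iv*region']
```

Here `sex` stays despite its large p-value because `iv*sex` is significant, and `region` only becomes a candidate once `iv*region` has been dropped and the model refit.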

Karen

John says

Hi Karen,

I have five dependent variables (interval) and four independent variables. One independent variable has 2 levels (yes/no), and the other 3 IVs are metric, but I can categorize them into 4 levels. I assume I can either enter the metric IVs as covariates or use the 4-level categorical versions as fixed factors? In addition to these DVs and IVs, I have a number of categorical (demographic) variables that I would like to test as moderators of the IV-DV relationship. Using GLM/MANOVA in SPSS with this many variables gets very messy, and I get quite different results if I remove one or more variables. Can I (or should I) first check the demographics for a main effect only and then drop those that are not significant? If yes, can I do the same thing for the IVs, i.e. first test for main effects? My data collection is over, so my sample size is fixed, and the response rate was lower than expected. I feel like I need to get the number of variables down to have any chance of understanding the analysis. Plus, Box’s test will not run with a large number of variables and my sample size.

Dee Simons says

hi there,

I need some help. I want to know how to run a 2x2x2x2 mixed-design ANOVA in SPSS, with Participant race (Black and White) as the between-subjects factor and Context condition (Verbal and Non Verbal) x Face race (Black and White) x Orientation (Upright and Inverted) as the within-participant factors. I want to know how to set the data up in Variable View and what steps to carry out, please. Please e-mail me at kingdom172002@hotmail.com

Karen says

Done. 🙂

ning says

hi karen the Goddess of statistics,

For my research, my lecturer advised me to put the categorical data (both the dependent and the independent variables are categorical, on a 1-5 Likert scale) into GLM, even though GLM assumes a continuous dependent variable.

This is done so we can run a regression on multiple dependent variables.

Does this simplification make sense?

please answer. thank you so much

Karen says

Hi Ning,

Aw shucks….

A lot of people are willing to make the assumption that 1-5 Likert scale data are valid as continuous.

Here are a few posts I’ve written on this topic: https://www.theanalysisfactor.com/can-likert-scale-data-ever-be-continuous/

https://www.theanalysisfactor.com/likert-scale-items-as-predictor-variables-in-regression/

https://www.theanalysisfactor.com/when-dependent-variables-are-not-fit-for-glm-now-what/

This last one includes ordinal variables, which is really what the Likert data is. I did a webinar on this, and you can download the recording here: The Other Regression Models Part 1: Binary, Ordinal, and Multinomial Logistic for Categorical Outcomes.

One thing to be aware of, which I talked about in the webinar, is that ordinal logistic regression (which technically you should use for ordered categories) has its own assumptions, which are often hard to meet for Likert data. So sometimes it’s an issue of the lesser evil of two inappropriate methods.

nadya says

Thank you for sharing this information. Can I ask you some questions?

I found a study that added the variable YEAR (the research period is 10 years) as a fixed factor, then interacted it with other variables/covariates. My question is, what is the function of this? And how should I interpret it if YEAR is significant or not significant, either as a main effect or as an interaction effect?

Karen says

Hi Nadya–The reason to add YEAR as a fixed factor is to control for any effect of YEAR on Y, the Dependent Variable. It may be that Y varies from year to year, and the main effect will quantify that.

The interaction between a covariate, X, and YEAR will indicate if the effect of X on Y varies across the years. For example, the effect of amount spent on advertising (X) on revenue (Y) probably changes from year to year. The effect might be smaller in years with a slow economy.
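To make the interaction concrete, here is a Python sketch with invented coefficients. It assumes YEAR is dummy coded with the last year as the reference, so the coefficient for X is the slope in the reference year and each X*YEAR coefficient is the difference in slope for that year.

```python
# Invented estimates for the advertising (X) and revenue (Y) example.
slope_reference_year = 2.0  # B for X: the slope in the reference year (2010)

# B for X*YEAR: how much each year's slope differs from the reference year.
interaction = {2008: -0.5, 2009: -1.0, 2010: 0.0}

# The effect of X on Y in each year is the reference slope plus the difference.
slopes = {year: slope_reference_year + diff for year, diff in interaction.items()}

for year, slope in slopes.items():
    print(year, slope)
```

If the interaction is not significant, these year-specific slopes do not differ reliably, and a single slope for X describes all years.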

claudia says

my walk on path to statistical confidence has hereby begun. thanks!

admin says

Excellent! Enjoy the journey. 🙂

Rasmus says

Thank you, very helpful.