Can I use SPSS MIXED models for (a) ordinal logistic regression, and (b) multinomial logistic regression?
Every once in a while I get emailed a question that I think others will find helpful. This is definitely one of them.
My answer:
No.
(And by the way, this is all true in SAS as well. I’ll include the SAS versions in parentheses).
You can think of SPSS Mixed (SAS proc mixed) as the clustered-data version of SPSS GLM (proc glm). They have a lot of similarities in both their syntax and the kinds of models they can run.
Any model you can run in GLM, you can run in Mixed (but not vice-versa).
But both require an outcome variable that is unbounded, continuous, and measured on an interval or ratio scale.
So logistic regression, along with other generalized linear models, is out.
But there is another option (or two, depending on which version of SPSS you have).
You can run a Generalized Estimating Equations (GEE) model for a repeated measures logistic regression using the GEE procedure (the GENLIN command in syntax; proc genmod in SAS). It has a Repeated statement, and it can fit models equivalent to those fit in Mixed with a Repeated statement.
These are called population averaged models in both procedures, because you’re fitting a single model to all clusters, but controlling for within-cluster correlation.
In contrast are true Mixed Models, which actually fit a variance parameter for random effects, usually random intercepts and slopes. Rather than just control for within-cluster similarity in responses, they model it. Mixed models are run in Mixed using the Random statement.
(One of the reasons this gets so confusing is that for some designs, you can get the exact same results with either type of model. But they’re taking different routes to the same destination).
Mixed Models have a lot more flexibility than Population Averaged Models–you can, for example, run a 3-level mixed model, but Population Averaged Models are restricted to two levels.
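To make the distinction concrete for a logistic outcome, here is a minimal sketch in Python (not SPSS or SAS syntax). It assumes a random-intercept logistic model with made-up values: an intercept of -1.0, a slope of 1.0 for a binary predictor, and a normally distributed random intercept with a standard deviation of 2.0. It averages the cluster-specific probabilities over the random intercept, which is roughly what a population averaged model describes, and compares the resulting slope with the cluster-specific one.

```python
import math


def inv_logit(z):
    """Convert a log odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Convert a probability to log odds."""
    return math.log(p / (1.0 - p))


def marginal_prob(linear_predictor, sd, n_points=2001, width=8.0):
    """Average inv_logit(linear_predictor + u) over u ~ Normal(0, sd^2),
    using a trapezoid rule on the interval [-width*sd, width*sd]."""
    if sd == 0:
        return inv_logit(linear_predictor)
    lo, hi = -width * sd, width * sd
    step = (hi - lo) / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        u = lo + i * step
        density = math.exp(-0.5 * (u / sd) ** 2) / (sd * math.sqrt(2 * math.pi))
        weight = 0.5 if i in (0, n_points - 1) else 1.0
        total += weight * inv_logit(linear_predictor + u) * density * step
    return total


# Hypothetical values, chosen only for illustration
INTERCEPT = -1.0
SLOPE = 1.0                # cluster-specific log odds ratio
RANDOM_INTERCEPT_SD = 2.0  # spread of the cluster intercepts

p0 = marginal_prob(INTERCEPT, RANDOM_INTERCEPT_SD)
p1 = marginal_prob(INTERCEPT + SLOPE, RANDOM_INTERCEPT_SD)
population_averaged_slope = logit(p1) - logit(p0)

print(f"cluster-specific slope:    {SLOPE:.3f}")
print(f"population-averaged slope: {population_averaged_slope:.3f}")
```

Under these assumptions the population-averaged slope comes out smaller than the cluster-specific slope, and the gap shrinks as the random intercept standard deviation goes to zero. So with a logistic outcome the two kinds of models are generally answering slightly different questions, even when both are specified correctly.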
To run a true Mixed Model for logistic regression, you need to run a Generalized Linear Mixed Model using the GLMM procedure, which is only available as of version 19. In syntax, the command is GENLINMIXED.
(In SAS, use proc glimmix).
If you want to learn more about Mixed Models, check out our webinar recording: Random Intercept and Random Slope Models. It’s free.
{ 29 comments… read them below or add one }
Hi, you wrote: “to run a true Mixed Model for logistic regression, you need to run a Generalized Linear Mixed Model using the GLMM procedure, which is only available as of version 19”. Well, I have this version, and I need to run a mixed model of logistic regression. However, I don’t really know what to do in the first window of “data structure”. I don’t have any repeated measure, just subject ID and one random effect, which was clinic location. I don’t think it was truly random, though, because I “handpicked” the clinics from the whole city to match my needs.
The problem is that a reviewer of my manuscript wants me to account for the clinics in my logistic regression models as a random effect.
Can you direct me somewhere where I can get a “how to” explanation?
Thanks so much.
Hi Hilit,
First, make sure you really want to make clinic random. I’m not sure what “match my needs” implies, but if you actually want to compare them (Clinic A, with these characteristics, has a mean 3 points higher than Clinic F, with these characteristics), then make them a fixed effect. If the point is to control for multiple subjects at each clinic, then you do want it random.
Second, I don’t ever use the SPSS menus for mixed models. As you’ve noticed, they are completely unintuitive. I can’t imagine what they’re asking with “data structure”. It’s really easy to mis-specify a mixed model and not realize it, so you really need the control of using syntax.
Third, if all you need to do is add a random effect of clinic (which is what it looks like), that is the same as adding a random intercept for clinic. I recently did a webinar on random intercept and random slope models, and demonstrated using SPSS. It’s not exactly what you need help with, but it might clear some things up. I address there treating the clusters as a fixed or random effect, and show the syntax for specifying the random intercept. The syntax is slightly different in GLMM than in mixed, but it’s very similar.
You can get the recording at http://www.theanalysisfactor.com/learning/webinar21.html. It’s free.
Hi Karen,
I will look at the webinar as soon as I can (children and everything). Meanwhile, can you just tell me if what you suggest there can be done in a regular logistic regression analysis in SPSS (I mean not through the various GLMs, but using the logistic regression command)?
Thanks
Hi Hilit,
If you treat clinic as a fixed effect, you can. Just enter clinic as a categorical predictor variable in your model. This will only work, though, if you have no predictors measured at the clinic level (e.g., clinic size), because they’ll be confounded with clinic.
If you treat clinic as random, you will need the GLMM.
If you want more help, we can always set up a Quick Question consultation, but these models are too complex to advise much without seeing the data and research questions. The devil is definitely in the details.
Karen
So, what do you suggest I do?
Go see a consultant?
Yes, a good consultant could really clear things up in an hour or less (assuming there are no surprises). These are complicated models with many issues involved, and if you’re still learning, it would behoove you to get some guidance, if that’s an option.
Hi Karen,
Thanks for the information you have provided above. I want to run a binary logistic regression with one categorical predictor and one interval predictor, in addition to adding a random variable. Can I just confirm that this is not possible through SPSS 16, and I would need SPSS 19 to do this? If so, is there any information that you know of that shows you how to do the GLMM with a logistic DV in SPSS 19 without using syntax?
Any help would be appreciated!!
Hi Simon,
Confirmed–you need SPSS 19 to run that model.
I don’t know of any video resources on this. This procedure is pretty new in SPSS.
And I know you don’t want to use syntax, but I will say it really is a good idea to use syntax with models this complicated. It’s just too easy to make a mistake in the menus and not realize it….
Karen
Hi Karen,
Wondering if you can direct me to any information about how to run a GLMM with a logistic DV in SPSS using syntax?
Menus are extremely confusing.
Thanks
Farah
Hi Farah,
Hmm, it’s pretty new and I haven’t seen anything on GLMM beyond the manual. Have you looked at the Command Syntax Reference?
They should have some examples.
Unfortunately, the commands are similar to those in MIXED, but not the same.
I’ll try to put together a post that shows the same analysis in MIXED and GLMM.
Karen
Hi Karen,
I am also looking for SPSS command syntax and have the same problem. My dependent variable is categorical (with dichotomous and continuous predictors), and I need to run a multinomial logistic regression controlling for the random effects of site. Our data were collected at 3 different sites. Also, I saw your response about entering the site variable as a fixed effect predictor. I wonder if it will be OK to do that. I would appreciate any help.
Thank you.
Hi Bushra,
It should work as long as you don’t have any covariates measured at the site level. For example, if your sites are something like hospitals, and you had a predictor that was hospital size, it would be confounded with hospital if hospital was fixed. That’s one of the advantages of random factors.
If you don’t have any site-level covariates, then you’re golden.
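The confounding can be seen directly in a small sketch (Python, with invented data; the site labels and the `site_size` values are hypothetical). Once site enters as a fixed categorical predictor, any site-level covariate is an exact weighted sum of the site indicator columns, so the model has no information left to estimate a separate effect for it.

```python
# Hypothetical data: six subjects from three sites, plus one site-level covariate.
sites = ["A", "A", "B", "B", "C", "C"]
site_size = {"A": 120, "B": 300, "C": 75}  # invented values

levels = sorted(set(sites))

# Indicator (dummy) columns, one per site, as a fixed categorical predictor would use.
dummies = [[1 if s == level else 0 for level in levels] for s in sites]

# The site-level covariate as it appears in the data set: one value per subject.
size_column = [site_size[s] for s in sites]

# Rebuild that covariate from the site indicators alone.
rebuilt = [
    sum(d * site_size[level] for d, level in zip(row, levels))
    for row in dummies
]

print(size_column)
print(rebuilt)
print("perfectly collinear with site:", rebuilt == size_column)
```

Software usually drops one indicator and keeps an intercept instead, but the columns span the same space, so the conclusion is the same.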
Karen
Thank you so much Karen. I don’t have covariates measured at the site level. So I was able to use your approach.
Hello Karen!
I am a doctoral student from Germany running a study about the mental health of 2 different groups of students. I have several diagnoses as dependent variables (dichotomous), and I am studying the risk factors (dichotomous and continuous predictors) and whether there are differences between these two groups of students. I have 1 follow-up where these diagnoses were assessed again and two new risk factors were included. My question is: which of the 2 methods you suggested is better for my samples and study design? I am using SPSS 20 at the moment. And it would be great if you have some pointers on how to run this analysis, because I am not able to find any SPSS tutorial.
Best regards,
Marcela
Hi Marcela,
With only two time points, run the GEE. It’s much simpler and you’ll get the same results.
I would start here for info on GEE: http://jeromyanglim.blogspot.com/2009/11/generalized-estimating-equations.html
Jeromy gives a nice list there–I’m familiar with some of those websites he links to, and they’re very understandable.
Best,
Karen
Since one can run a mixed model for logistic regression in SPSS as of version 19 (i.e., GLMM), why was your initial answer to the question “No” at the beginning of the thread ?
Hi Michael,
Maybe I misread it, but the initial question was whether you could use SPSS MIXED, and I took that to mean the MIXED procedure in SPSS.
You have to use GENLINMIXED to fit a generalized linear mixed model. MIXED only fits linear mixed models (which assume normality of residuals and have an identity link function).
Karen
Hi Karen
I found very interesting advice here about generalized linear models using SPSS.
I am using SPSS 19 and would like to use a mixed model. I want to check the effect of 4 factors on seed viability. I checked normality, and I cannot assume a normal distribution. I can only use a Poisson or binomial distribution, since I can treat my data as either counts or binomial. I have a lot of zeros and also a lot of high values, so I have overdispersion. What is your suggestion for selecting a model? Is a GLMM with a Poisson distribution a good choice? Some other references recommend quasi-Poisson, zero-inflated models, and so on.
It would be great to have your ideas about using SPSS 19 for a mixed model.
All the best
Mehdi
Hi Mehdi,
It sounds like indeed GLMM with a Poisson (or some model in that family) would be a good choice. You don’t mention why you need a mixed model–do you have randomized blocks or repeated measures?
One way to test if a negative binomial is necessary due to over-dispersion is to run the model with a negative binomial distribution and a log link, and allow the negative binomial dispersion parameter to be estimated from the data. In the parameterization that SPSS and SAS use (variance = mean + k × mean²), an estimate close to 0 means a Poisson is adequate. If it’s clearly larger than 0, you need a negative binomial.
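As a rough first look before fitting anything, you can compare the variance of the counts to their mean. Here is a minimal Python sketch with invented counts. It reports the variance-to-mean ratio, which is about 1 for Poisson data, and a simple moment-based guess at the negative binomial dispersion k under the assumption variance = mean + k × mean², which is about 0 for Poisson data. It ignores predictors and random effects entirely, so treat it as a screen, not a replacement for estimating the parameter in the model.

```python
from statistics import mean, variance


def dispersion_summary(counts):
    """Return (mean, variance, variance/mean ratio, moment estimate of k)."""
    m = mean(counts)
    v = variance(counts)  # sample variance, n - 1 denominator
    ratio = v / m
    # Moment estimate of k from variance = mean + k * mean^2, floored at 0.
    k = max(0.0, (v - m) / m ** 2)
    return m, v, ratio, k


# Invented counts with many zeros and several large values
counts = [0, 0, 0, 0, 1, 0, 12, 0, 15, 0, 0, 9, 0, 14, 0, 2]

m, v, ratio, k = dispersion_summary(counts)
print(f"mean = {m:.2f}, variance = {v:.2f}")
print(f"variance / mean = {ratio:.2f}")
print(f"moment estimate of k = {k:.2f}")
```

A ratio far above 1 points to over-dispersion, but this sketch cannot tell over-dispersion apart from zero inflation. That takes a comparison of fitted models.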
These are tricky models, and if you aren’t up on them, you’ll want to do a lot of reading. I like the book by J. Scott Long, but there are others.
Karen
I wanted to know how to run an ordinal logistic regression in SPSS 19.0 when I have a mixed model. The ordinal response data are in the form: no response (1), minimal response (2), high response (3). I have two fixed predictors (location and treatment) and subjects that received both a treatment and a control (random effect?). I initially ran it under GLMM with data = ordinal and distribution = multinomial, and got output, but I am wondering whether I really want a multinomial; doesn’t this ignore the order effect of my data? Or should I be running this as a GEE with ordinal data and repeated measures? What is the difference?
Ian
Hi Ian,
You do want to include the ordering. The language used in SPSS GLMM is strange. I believe that as long as you specify ordinal, it is taking that into account.
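Here is a small sketch of what “taking the ordering into account” means, assuming a cumulative logit model, which is the usual ordinal choice (Python, with made-up thresholds and a made-up treatment effect; none of these numbers come from your data). A single coefficient shifts all the cumulative probabilities at once, so the categories are treated as ordered steps, not as unrelated groups.

```python
import math


def inv_logit(z):
    """Convert a log odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def ordinal_probs(thresholds, eta):
    """Category probabilities under a cumulative logit model, using
    P(Y <= j) = inv_logit(threshold_j - eta) with increasing thresholds."""
    cumulative = [inv_logit(t - eta) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


# Hypothetical thresholds separating no / minimal / high response
THRESHOLDS = [-0.5, 1.0]

control = ordinal_probs(THRESHOLDS, eta=0.0)
treated = ordinal_probs(THRESHOLDS, eta=1.2)  # invented treatment effect

for label, probs in (("control", control), ("treated", treated)):
    print(label, [round(p, 3) for p in probs])
```

With the sign convention assumed here, a positive effect moves probability from “no response” toward “high response”. Packages differ on that sign, so check the output notes. A nominal multinomial model would instead estimate a separate coefficient for each non-reference category.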
You could run it either as a GEE or a mixed model, from what you’ve said about your design. This is the quick response to what is the difference: GEE is a marginal model, and GLMM is a true mixed model. I’ve started writing a newsletter article on it in response to your question, and that should come out next week. To get you started until then, this article explains marginal and mixed models in a linear context: The Repeated and Random Statements in Mixed Models for Repeated Measures
Karen
Hi Karen,
I’ve got clustered data (familial design), and I need to run a multinomial logistic regression. Somebody told me to use the Genmod procedure, but I got this message: “The response variable as2 has 3 levels. A binary response must have two levels.” So, is the Genmod procedure really adequate to deal with multinomial regression?
Thanks a lot.
Hi Epitaf,
It can. Did you specify the distribution as multinomial? I believe it’s in the model statement, after the slash, include link=logit dist=multinomial.
Karen
Hi Karen,
Thanks for your answer !
Actually, I forgot to say the most important thing (I think). I was talking about nominal multinomial regression and it seems (but tell me if I’m wrong) that Genmod can do a GEE fit for ordinal multinomial data with residual correlated structures.
” Only the cumulative logit, cumulative probit, and cumulative complementary log-log link functions are available for the multinomial distribution.”
Thank you.
PS: your blog should be mandatory for all epidemiology and statistics students. It’s a gift. Don’t you plan to summarize all your science and your tips (#StatWisdom) in a magic book? Please, think about it!!!
Hi Epitaf,
You are right–I just looked it up. I’m surprised, but genmod can’t do nominal multinomial, only ordinal. CATMOD can, but can’t do GEE.
I found this paper, which discusses the issue in detail (much of it mathematical detail). It looks like the best option is to use proc glimmix with the METHOD=MMPL option. See section 4.2.2.
http://www.oliverkuss.de/science/publications/Kuss_McLerran_Second_Revision_CPMB.pdf
Karen
Oh, and thanks. Glad you find it helpful. I had not thought of a magic book, and will have to think about how to do that.
Thank you for the link !
Hi Karen,
I have a simple data set with 1 within-subjects factor (2 levels) and binary outcomes (0 or 1). As I understand it, I should do a repeated measures logistic regression. When I do this in SPSS under GEE, it does not run. It only works when I add a factor, but then it only estimates the effect of this added factor (corrected for my repeated measurement) and not that of my repeated measurement itself. Do you know if and how I can analyze my data using SPSS in a way that will show me an effect of my within-subjects factor (rather than correcting for it)?
It would be great if you could help, thanks a lot!
Lisa
Hmm, it should test your within subjects factor if you’re including it in the model.
I’m not 100% sure from the way you describe it, but if you’re only interested in the one within-subjects factor, you may be able to get away with a McNemar test. Much simpler.
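To show how little is involved, here is a minimal sketch of an exact McNemar test in Python (the counts are invented). It uses only the discordant pairs, meaning the subjects whose outcome differs between the two levels of the within-subjects factor, and asks whether they split evenly between the two directions.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant pair counts.

    b: subjects with outcome 0 at level 1 and 1 at level 2
    c: subjects with outcome 1 at level 1 and 0 at level 2
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts
b = 15
c = 5
p_value = mcnemar_exact(b, c)
print(f"exact McNemar p-value: {p_value:.4f}")
```

Subjects with the same outcome at both levels do not enter the test at all. In SPSS, the same test is available through Crosstabs and through the nonparametric tests for two related samples.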
Karen