Multinomial Logistic Regression
The multinomial (a.k.a. polytomous) logistic regression model is a simple extension of the binomial logistic regression model. It is used when the dependent variable has more than two nominal (unordered) categories.
Dummy coding of independent variables is quite common. In multinomial logistic regression, the dependent variable is dummy coded into multiple 1/0 variables. There is a dummy variable for every category but one, so if there are M categories, there will be M-1 dummy variables. Each category’s dummy variable has a value of 1 for its category and 0 for all others. One category, the reference category, doesn’t need its own dummy variable, as it is uniquely identified by all the other variables being 0.
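As a rough illustration of that coding scheme, here is a minimal Python sketch. The category names, the choice of "bus" as the reference category, and the helper name dummy_code are all made up for this example; statistical packages do this coding for you internally.

```python
# Hypothetical outcome with M = 3 nominal categories.
# "bus" is arbitrarily chosen as the reference category.
categories = ["bus", "car", "train"]
reference = "bus"

def dummy_code(outcome, categories, reference):
    """Return the M-1 indicator (1/0) variables for one observed outcome."""
    return {
        f"is_{cat}": int(outcome == cat)
        for cat in categories
        if cat != reference
    }

observed = ["car", "bus", "train", "car"]
coded = [dummy_code(y, categories, reference) for y in observed]

for y, row in zip(observed, coded):
    # The reference category shows up as all zeros.
    print(y, row)
```

With three categories there are two dummy variables, and the "bus" observation is identified only by both of them being 0.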
The multinomial logistic regression then estimates a separate binary logistic regression model for each of those dummy variables. The result is M-1 binary logistic regression models. Each one tells the effect of the predictors on the probability of success in that category, in comparison to the reference category. Each model has its own intercept and regression coefficients, so the predictors can affect each category differently.
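To make the "M-1 equations, each compared to the reference" idea concrete, here is a small Python sketch for a single predictor. The intercepts and slopes are invented numbers, not estimates from any data, and the function name multinomial_probs is just for illustration. Each non-reference category k has its own equation for the log odds of being in category k rather than the reference.

```python
import math

def multinomial_probs(x, params):
    """Category probabilities from M-1 (intercept, slope) pairs.

    Each pair defines log(P(category k) / P(reference)) = intercept + slope * x.
    Returns [P(reference), P(category 1), P(category 2), ...].
    """
    etas = [a + b * x for a, b in params]
    # The reference category's linear predictor is fixed at 0, so exp(0) = 1.
    denom = 1.0 + sum(math.exp(e) for e in etas)
    return [1.0 / denom] + [math.exp(e) / denom for e in etas]

# Made-up coefficients: the predictor raises the odds of category 1
# and lowers the odds of category 2, both relative to the reference.
params = [(0.5, 1.0), (-1.0, -0.3)]
probs = multinomial_probs(2.0, params)
print(probs)
```

Because every equation has its own slope, the same predictor can push the odds of one category up and another down, which is what "the predictors can affect each category differently" means in practice.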
Why not just run a series of binary regression models? You could, and people used to, before multinomial regression models were widely available in software. You will likely get similar results. But running them together means they are estimated simultaneously, which means the parameter estimates are more efficient: there is less overall unexplained error.
Ordinal Logistic Regression: The Proportional Odds Model
When the response categories are ordered, you could run a multinomial regression model. The disadvantage is that you are throwing away information about the ordering. An ordinal logistic regression model preserves that information, but it is slightly more involved.
In the Proportional Odds Model, the event being modeled is not having an outcome in a single category, as is done in the binary and multinomial models. Rather, the event being modeled is having an outcome in a particular category or any previous category.
For example, for an ordered response variable with three categories, the possible events are defined as:
- being in group 1
- being in group 2 or 1
- being in group 3, 2 or 1.
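In the proportional odds model, each event has its own intercept, but all events share the same regression coefficients. (The last event, being in any of the groups, is certain, so only the first M-1 events need an equation.) This means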
1. the overall odds of any event can differ, but
2. the effect of the predictors on the odds of an event is the same for every event, that is, at every split between lower and higher categories. This is an assumption of the model that you need to check. It is often violated.
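The two points above can be checked numerically with a short Python sketch. The intercepts and the coefficient are invented values, and the model is written here as logit P(Y <= j) = intercept_j + beta * x; the plus sign is an assumption of this sketch, not a statement about any particular package.

```python
import math

def cumulative_probs(x, intercepts, beta):
    """P(Y <= j | x) for j = 1..M-1, with one intercept per event and a shared beta."""
    return [1.0 / (1.0 + math.exp(-(a + beta * x))) for a in intercepts]

def odds(p):
    return p / (1.0 - p)

intercepts = [-1.0, 0.8]  # made-up values, one per cumulative event
beta = 0.6                # made-up value, shared by every event

at_x0 = cumulative_probs(0.0, intercepts, beta)
at_x1 = cumulative_probs(1.0, intercepts, beta)

# 1. The baseline odds differ from event to event (different intercepts).
baseline_odds = [odds(p) for p in at_x0]

# 2. A one-unit increase in x multiplies the odds of every event by the same factor.
odds_ratios = [odds(p1) / odds(p0) for p0, p1 in zip(at_x0, at_x1)]

print(baseline_odds)
print(odds_ratios)
```

Both odds ratios equal exp(beta), which is the "proportional odds" in the model's name. If the data really need a different coefficient at each split, this assumption is violated.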
The model is written somewhat differently in SPSS than usual, with a minus sign between the intercept and all the regression coefficients. This is a convention ensuring that for positive coefficients, increases in X values lead to an increase of probability in the higher-numbered response categories. In SAS, the sign is a plus, so increases in predictor values lead to an increase of probability in the lower-numbered response categories. Make sure you understand how the model is set up in your statistical package before interpreting results.
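A small Python sketch can show why the sign matters. It uses made-up thresholds and a made-up positive coefficient, and writes the model as logit P(Y <= j) = threshold_j + sign * beta * x, with sign = -1 standing in for the minus convention and sign = +1 for the plus convention described above. The function name and all values are illustrative only.

```python
import math

def category_probs(x, thresholds, beta, sign):
    """Category probabilities when logit P(Y <= j) = threshold_j + sign * beta * x."""
    cum = [1.0 / (1.0 + math.exp(-(t + sign * beta * x))) for t in thresholds]
    cum.append(1.0)  # being in the last category or any lower one is certain
    # Differences of successive cumulative probabilities give each category.
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

thresholds = [-1.0, 1.0]  # made-up values for a three-category response
beta = 0.8                # the same positive coefficient under both conventions

minus_low = category_probs(0.0, thresholds, beta, sign=-1)
minus_high = category_probs(2.0, thresholds, beta, sign=-1)
plus_low = category_probs(0.0, thresholds, beta, sign=+1)
plus_high = category_probs(2.0, thresholds, beta, sign=+1)

print("minus convention:", minus_low, "->", minus_high)
print("plus convention: ", plus_low, "->", plus_high)
```

With the minus sign, raising x shifts probability toward the highest-numbered category; with the plus sign, the same positive coefficient shifts it toward the lowest-numbered one. The fitted model is the same either way, only the sign of the reported coefficient flips.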
If you want to learn all the ins and outs of dealing with logistic regression, check out our 8-hour live workshop Binary, Ordinal, and Multinomial Logistic Regression.
{ 22 comments… read them below or add one }
I have an interesting analysis that I’m not sure how to analyze correctly. I am using all nominal data as IVs (subject membership in one of two groups=IV1 and subject gender=IV2) to predict the likelihood of each subject choosing to lay off five out of twenty-five people. Each subject, then, has a 1st, 2nd, 3rd, 4th, and 5th choice in order of who he would lay off. I have coded all 25 different choices for each subject as either 0 (subject didn’t choose that person) or 1 (subject chose that person). I have 136 subjects who completed the study.
I am clear about the between subjects analysis, but I’m not quite sure how to do the within subjects analysis. Any suggestions would be greatly appreciated!
Dale
Hi Dale,
Just to be clear, you’re ignoring the order here, right? 1st-5th choice, and just coding the response as 1/0?
I would look into a GEE analysis. It’s Generalized Estimating Equations, and it is an approach for repeated measures for generalized linear models.
I have to say, though, this IS an interesting analysis, and there may be other approaches to get at your research questions. For example, if there are characteristics of each of the 25 people, you may want to include them. Or the order may be interesting. But GEE would be a great place to start, and it is available in all the major stat packages.
-Karen
Hi everyone,
I am currently a PhD student in natural resource economics. As I develop my research proposal, entitled “The interaction of Poverty and Natural resources degradation”, two things come to mind.
The model needed is a two-stage regression approach in which poverty is a continuous endogenous variable and natural resources degradation is an ordered categorical variable.
This is where I am challenged and need your kind help.
First, the model has a mix of a logit (or probit) for the ordinal variable (natural resources degradation) and an OLS regression for the continuous endogenous variable (poverty, measured as per capita expenditure).
I can get the data from the household survey, but when I think about how to fit the model in Stata, I am confused about which commands to use and how to deal with a mixture of continuous and ordinal endogenous variables.
Would you be so kind as to send me some hints on how to deal with this issue?
With Kind regards
Darish
Hi Darish,
Unfortunately, I’m not a big Stata expert. I’ve used it before, but I’m not up on it enough to give you suggestions.
If anyone else can answer this or give Darish some hints, please do.
Given the nature of your model, though, I would suggest getting a hold of Long & Freese’s book on Categorical data analysis using Stata. I don’t know if it covers two-stage modelling, but it may at least get you started, and I know Long’s other book (which is a bit more theoretical) does cover mixture models.
Good luck,
Karen
Hi,
I am currently completing my dissertation, which uses the Polity IV index of democracy, measured on a 0-10 scale with 0 being ‘no democracy’ and 10 being a ‘full democracy’. My independent variable, diamond abundance, is measured on a continuous scale.
With such a dependent variable, is it wise to use a simple logistic regression? And should I recode my 0-10 measure of democracy into a dummy variable?
Can anyone help me with this? I will be very grateful!
Anton
Hi Anton,
Unless there is a cut point on your 11 point scale that is particularly meaningful, you probably don’t want to split it into a dummy variable. You will lose a lot of information that way.
You basically have two choices: 1. treat it as a continuous variable, which sometimes is a reasonable assumption, and run a linear regression model. 2. treat it as ordinal (which it inherently is), and run an ordinal logistic regression.
There’s a big debate on this, and both types of models have assumptions that may or may not be met here. A lot of people will make it sound like the OLS is clearly wrong here, but the ordinal regression also has assumptions that have to be met.
This post has more information: Can Likert Scale Data ever be Continuous?.
I have a question, but the data and subject matter are very confidential, so I would prefer a private email. Is this at all possible? Please let me know.
I can’t usually answer emails privately, but you’re welcome to set up a Quick Question consultation. That’s just what they’re for: when you are stuck on something and just need a quick bit of help. Just click on Consulting in the menus, and go to Quick Question.
Karen
Can Ordinal Logistic Regression be applied to analyse the factor weightings that predictively rank folks, say, in a quiz competition? Factors could be those such as age, education, job, relevant hobbies, specialisations etc.
Ten people, say, will finish with 10 different scores in a particular competition. What confuses me is that I have seen examples where perhaps 100 other competition results are grouped into one analysis run. There are 100 winners, but each winner has different abilities, which get lost in the analysis. Or a person finishing last (rank 10) in one competition, say, may have come out top (rank 1) in another, weaker competition.
I am thinking that each individual competition has to be normalised somehow so that it actually can be related sensibly to the other 99 competitions.
Any views?
Hi Robert,
Let me see if I understand. Each person has 10 different scores, each on, say, a different topic? You use the raw scores to create rankings for each topic?
I guess what I’m missing here is what are you trying to get from the analysis? How the predictors (age, education, etc) affect the overall ranking across all 10 topics? Or whether they affect the rankings on some topics but not others?
Karen
Hi,
How do we check the log linearity assumption for a quantitative predictor in a multinomial regression ?
Thanks.
Hi Epitaf,
There are two ways: graph it, and try non-linear terms to make sure they don’t fit better.
Karen
Hi Karen,
I am dealing with ordinal data for the first time and I am having a bit of a data dilemma. I am examining whether increased insight (awareness) of the fact that one has a mental illness (MI; measured continuously) predicts people’s increased perceptions of their own recovery from MI. Further, does experienced stigma (measured continuously) for having MI moderate this relationship.
Recovery (the DV) is measured using an empirically-validated ordinal 5-stage model which provides a score for each stage of recovery per participant (i.e., 5 scores per person). Hence, according to my hypothesis, high insight should be related to higher stages of recovery, and this relationship should change as experienced stigma changes.
Given that my DV is both ordinal and dependent (i.e., stages of recovery are divergently correlated the further apart they are conceptually; e.g., stage 5 has a stronger correlation with stage 4 than with stage 2) would you recommend a repeated-measures multinomial logistic regression?
Thanks,
Chris
I need to learn everything about logistic regression with more than two categories for my research. Thank you.
Hi Shymaa,
You may want to start with my webinar on multinomial and ordinal models, but there are also great books out there that discuss it as well, including Long’s Regression Models for Categorical and Limited Dependent Variables.
The webinar recording is free: http://www.theanalysisfactor.com/binary-ordinal-multinomial-logistic/
Karen
Hi
I have a categorical IV and the DV is order of opening 4 information boxes. Each participant has data like this:
infobox1: 1st
infobox2: 2nd
infobox3: 4th
infobox4: 3rd
Can I use linear regression/ANOVA for this, or do I need to use chi-square/logistic regression because it is really an ordinal variable?
I’d really appreciate your advice.
Anna
Hi Anna,
It really is an ordinal variable. Plus, when you ask people to rank things, it has the extra issue of dependence. If you know how people ranked the first three boxes, you know their answer to the fourth.
Karen
Hi Karen,
This is similar to Robert’s question back in March. My research is actually about language but might be easier to understand if I use the competition metaphor. I have lots of data about different competitors and the results of lots of three-competitor competitions. So, my IVs are three sets of competitor data, and my DV is the winning competitor. I want to build a model that will predict the winner. I understand that this will be a logistic regression, but what kind and how should I organise the data (for SPSS 19), given that my DV is also one of my IVs? A nudge in the right direction would be very much appreciated. Kevin
PS Your courses look very good!
Hi Kevin,
This may be the kind of question that requires a consultation because the answer is in the details, which means I’d have to ask about a dozen questions to make sure I understand correctly. But I’ll try to give you a nudge.
Your DV can’t be an IV as well. So that means you’re going to have to define your DV in such a way that it’s not the same as an IV. For example, the DV may be “Did this competitor win: Yes or No” and the IV is “Competitor ID: 1, 2, or 3.” That’s a little different than defining the DV as “Which competitor won: 1, 2, or 3.” Does that work?
Karen
Hi Karen,
Thanks for the nudge – that definitely helps. A consultation might be the way forward, but I want to put together a more detailed plan if I go down that route, so that you would be verifying and improving it rather than telling me how to build it from scratch. I hope that makes sense. Thanks again – a very helpful nudge.
Kevin
The values of the response variable are 0, 1, 2, 3, 4. The observations come from 253 primary school children. Can I use a Poisson regression model, or any other count data model, to model the data?
Hi Bereket,
That sounds like it’s eligible. I can’t tell you it’s the way to go without more info. Are those actual counts?
Karen