How to Decide Between Multinomial and Ordinal Logistic Regression Models

A great tool to have in your statistical tool belt is logistic regression.

It comes in many varieties and many of us are familiar with the variety for binary outcomes.

But multinomial and ordinal varieties of logistic regression are also incredibly useful and worth knowing.

They can be tricky to decide between in practice, however.  In some — but not all — situations you could use either.

So let’s look at how they differ, when you might want to use one or the other, and how to decide.

The Basics

Both multinomial and ordinal models are used for categorical outcomes with more than two categories.

The simplest decision criterion is whether that outcome is nominal (i.e., no ordering to the categories) or ordinal (i.e., the categories have an order).

It should be that simple.

Here’s why it isn’t:

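1. While there is only one logistic regression model appropriate for nominal outcomes, there are quite a few for ordinal outcomes, including proportional odds, partial proportional odds, adjacent-category, and continuation-ratio models.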

These models account for the ordering of the outcome categories in different ways. Most software, however, offers you only one model for nominal and one for ordinal outcomes.

2. The most common of these models for ordinal outcomes is the proportional odds model. It has a strong assumption with two names — the proportional odds assumption or parallel lines assumption.

It essentially means that the predictors have the same effect on the odds of moving to a higher-order category everywhere along the scale.
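In symbols, assuming an outcome Y with ordered categories 1, ..., J and a predictor vector x (notation introduced here for illustration, not taken from any particular package), the proportional odds model can be written as below. Sign conventions differ across software, so treat this as one common parameterization rather than the one your program necessarily uses.

```latex
% Cumulative logit model with one intercept per cutpoint and shared slopes.
\[
\log \frac{P(Y \le j \mid x)}{P(Y > j \mid x)} = \alpha_j - x^\top \beta,
\qquad j = 1, \dots, J - 1 .
\]
% A one-unit increase in predictor x_k multiplies the odds of Y > j
% by the same factor at every cutpoint j.
\[
\frac{\mathrm{odds}(Y > j \mid x_k + 1)}{\mathrm{odds}(Y > j \mid x_k)} = e^{\beta_k}
\quad \text{for every } j .
\]
```

Only the intercepts change with the cutpoint; the slopes do not, which is where the name parallel lines comes from.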

The problem?

This assumption is rarely met in real data, yet it is a requirement for the only ordinal model available in most software.
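One informal way to see whether the assumption looks plausible for a single binary predictor is to collapse the outcome at each cutpoint and compare the resulting log odds ratios. The sketch below does this in plain Python. The function name and the counts are hypothetical, and this is a rough descriptive check, not a formal test of the assumption.

```python
import math


def cumulative_log_odds_ratios(counts_group0, counts_group1):
    """Log odds ratio of being above each cutpoint, group 1 vs. group 0.

    Each argument lists counts over the ordered outcome categories, lowest
    category first. Returns one value per cutpoint (J - 1 values).
    Assumes every collapsed cell is nonzero.
    """
    if len(counts_group0) != len(counts_group1):
        raise ValueError("both groups need the same number of categories")
    results = []
    for cut in range(1, len(counts_group0)):
        # Collapse the outcome into "at or below the cutpoint" vs. "above it".
        low0, high0 = sum(counts_group0[:cut]), sum(counts_group0[cut:])
        low1, high1 = sum(counts_group1[:cut]), sum(counts_group1[cut:])
        results.append(math.log((high1 / low1) / (high0 / low0)))
    return results


# Hypothetical counts for a 4-category ordinal outcome in two groups.
control = [40, 30, 20, 10]
treated = [20, 30, 25, 25]

log_ors = cumulative_log_odds_ratios(control, treated)
for cut, value in enumerate(log_ors, start=1):
    print(f"cutpoint {cut}: log odds ratio = {value:.3f}")
```

If the values are close to one another, a single shared slope is at least plausible. Large differences suggest the predictor's effect changes along the scale.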

3. If you have a nominal outcome variable, it never makes sense to choose an ordinal model. Your results would be gibberish and you would be violating assumptions all over the place.

(That makes one choice simple!)

In contrast, you can run a nominal model for an ordinal variable and not violate any assumptions. But you may not be answering the research question you’re really interested in if it incorporates the ordering.
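One way to see what the nominal model gives up is to count what each model estimates. The sketch below assumes the usual forms of the two models (a separate intercept and set of slopes for each non-reference category in the multinomial logit, and one intercept per cutpoint plus a single shared set of slopes in the proportional odds model). The function names are made up for this illustration.

```python
def multinomial_logit_param_count(n_categories, n_predictors):
    """Parameters in a multinomial (generalized) logit model."""
    # One intercept and one full set of slopes per non-reference category.
    return (n_categories - 1) * (n_predictors + 1)


def proportional_odds_param_count(n_categories, n_predictors):
    """Parameters in a proportional odds (cumulative logit) model."""
    # One intercept per cutpoint, plus a single set of slopes shared by all.
    return (n_categories - 1) + n_predictors


# Hypothetical example: 5 outcome categories, 3 predictor columns.
print(multinomial_logit_param_count(5, 3))   # 16
print(proportional_odds_param_count(5, 3))   # 7
```

The nominal model returns a separate set of coefficients for each category compared with the reference, and none of them says anything about moving up or down the scale.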

4. The names. Most software refers to a model for an ordinal variable as an ordinal logistic regression (which makes sense, but isn’t specific enough).

In contrast, it will call a model for a nominal variable a multinomial logistic regression (wait – what?).

It gets better.

Some software procedures require you to specify the distribution for the outcome and the link function, not the type of model you want to run for that outcome. Both ordinal and nominal variables, as it turns out, have multinomial distributions.

What differentiates them is the version of logit link function they use. So if you don’t specify that part correctly, you may not realize you’re actually running a model that assumes an ordinal outcome on a nominal outcome. Not good.

A link function with a name like “mlogit,” “multinomial logit,” or “generalized logit” assumes no ordering.

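A link function with a name like “clogit” or “cumulative logit” assumes ordering, so only use this if your outcome really is ordinal. (Check your software’s documentation, though: in Stata, for example, clogit is the command for conditional logistic regression, which is a different model.)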
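To make the difference concrete, here is a minimal sketch of how each link turns a linear predictor into category probabilities, for one predictor and three outcome categories. The function names and all coefficient values are hypothetical, and the cumulative logit uses one common sign convention that your software may not share.

```python
import math


def generalized_logit_probs(x, intercepts, slopes):
    """Category probabilities under a generalized (baseline-category) logit.

    intercepts and slopes hold one value per non-reference category. The
    last category is the reference, with its linear predictor fixed at 0.
    """
    linear = [a + b * x for a, b in zip(intercepts, slopes)] + [0.0]
    exps = [math.exp(v) for v in linear]
    total = sum(exps)
    return [e / total for e in exps]


def cumulative_logit_probs(x, cutpoints, slope):
    """Category probabilities under a cumulative logit with one shared slope.

    Assumes logit P(Y <= j) = cutpoint_j - slope * x, with increasing cutpoints.
    """
    cumulative = [1 / (1 + math.exp(-(c - slope * x))) for c in cutpoints] + [1.0]
    previous = [0.0] + cumulative[:-1]
    # Each category's probability is the difference of adjacent cumulative ones.
    return [hi - lo for hi, lo in zip(cumulative, previous)]


# Hypothetical coefficients, chosen only for illustration.
nominal_probs = generalized_logit_probs(1.0, intercepts=[0.5, -0.2], slopes=[1.0, -0.5])
ordinal_probs = cumulative_logit_probs(1.0, cutpoints=[-1.0, 1.0], slope=0.8)

print([round(p, 3) for p in nominal_probs])
print([round(p, 3) for p in ordinal_probs])
```

The generalized logit needs a slope for every non-reference category and never uses the order of the categories. The cumulative logit needs only one slope, but it only makes sense if the categories can be lined up from low to high.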

Confusing, right?

To summarize:

If you have a nominal outcome, make sure you’re not running an ordinal model.

If you have an ordinal outcome and the proportional odds assumption is met, you can run the cumulative logit version of ordinal logistic regression.

If you have an ordinal outcome and your proportional odds assumption isn’t met, you can:

     1. Run a different ordinal model

     2. Run a nominal model as long as it still answers your research question
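The summary above can be sketched as a small helper. This is only a restatement of the rules in this article; the function name, argument names, and returned labels are made up for illustration.

```python
def suggest_models(outcome_is_ordinal, proportional_odds_met=False,
                   nominal_answers_question=False):
    """Return the model options the summary points to, as a list of labels."""
    if not outcome_is_ordinal:
        # Nominal outcome: an ordinal model is never appropriate.
        return ["multinomial logit"]
    if proportional_odds_met:
        return ["cumulative logit (proportional odds)"]
    # Ordinal outcome, but the proportional odds assumption is not met.
    options = ["a different ordinal model"]
    if nominal_answers_question:
        options.append("multinomial logit")
    return options


print(suggest_models(outcome_is_ordinal=False))
print(suggest_models(outcome_is_ordinal=True, proportional_odds_met=True))
print(suggest_models(outcome_is_ordinal=True, nominal_answers_question=True))
```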


Comments

  1. Jesuloba Stephen Owojori says

    Hello, my independent and dependent variables are both on Likert scales. Can I use multinomial logistic regression?

  2. Tom Sorger says

    Thanks again. It is just puzzling that you obtain different rankings for the same dataset when you reverse the dependent and independent variables i.e. ANOVA versus Nominal Logistic Regression.

  3. Tom Sorger says

    For a nominal outcome, can you please expand on:
    a) why there can be a contradiction between ANOVA and nominal logistic regression;
    b) why it is incorrect to compare all possible ranks using ordinal logistic regression.

    • Karen Grace-Martin says

      Hi Tom, I don’t really understand these questions.

      a) You would never run an ANOVA and a nominal logistic regression on the same variable. The ANOVA results would be nonsensical for a categorical variable.
      b) I’m not sure what ranks you’re referring to.

      • Tom Sorger says

        Hi Karen, thank you for the reply. Please let me clarify.
        a) There are four organs, each with the expression levels of 250 genes. We wish to rank the organs w/respect to overall gene expression.
        ANOVA: compare 250 responses as a function of organ i.e. compare mean response in each organ. This gives order LHKB.
        Nominal Regression: rank 4 organs (dependent) based on 250 x 4 expression levels. This gives order LKHB.
        b) Why not compare all possible rankings by ordinal logistic regression?

  4. Tom Sorger says

    Why can the ordinal and nominal logistic regressions yield contradictory results from the same dataset? We have 4 x 1000 observations from four organs. NomLR yields the following ranking: LKHB, P ~ e-05.
    ANOVA yields: LHKB (!), P ~ e-05.
    OrdLR assuming the ANOVA result, LHKB, P ~ e-06.
    Why does NomLR contradict ANOVA?
    Is it incorrect to conduct OrdLR based on ANOVA?

    • Karen Grace-Martin says

      Hi Tom,

      Ordinal and nominal logistic regression test different hypotheses and estimate different log odds. So there is no direct logical “If ordinal says this, nominal will say that.”

      By ANOVA I’m assuming you mean the linear model and not, for example, the table that is often labeled ANOVA? If so, it doesn’t even make sense to compare ANOVA and logistic regression results because they are used for different types of outcome variables. If you have a categorical outcome variable, don’t run ANOVA.

  5. Ngozi Louis Uzomah says

    I have a dependent variable with five nominal categories and 20 independent variables measured on a 5-point Likert scale. There are also other independent variables such as gender (2 categories), age group (5 categories), educational level (4 categories), and place of origin (3 categories). I am using multinomial regression. Do I have to convert any independent variables into dummies, and which ones are supposed to go into Factors and Covariates in SPSS?

    • Karen Grace-Martin says

      Ngozi,

      SPSS calls categorical independent variables Factors and numerical independent variables Covariates. Anything you put into the Factor box SPSS will dummy code for you. Not every procedure has a Factor box though. For example, in Linear Regression, you have to dummy code yourself. In Binary Logistic, you can specify those factors using the Categorical button and it will still dummy code for you.

  6. Bobby Thapa says

    Hi,

    When do we make dummy variables? Is it done only in multiple logistic regression, or do we have to make them in binary logistic regression too?

    • Karen Grace-Martin says

      Whenever you have a categorical variable in a regression model, whether it’s a predictor or response variable, you need some sort of coding scheme for the categories. The 1/0 coding of the categories in binary logistic regression is dummy coding, yes.

  7. RAHUL SINGH says

    What should the reference category be in MLR, and how is the comparison between the reference and each of the other categories in MLR more useful than BLR?

  8. george says

    Question?

    So I think my data fit ordinal logistic regression because I have nominal and ordinal data. My predictor variable is a construct (X) which is comprised of 3 subscales (x1 + x2 + x3 = X), and I wish to run the analysis based on a “hierarchical/stepwise” theoretical regression framework.
    Should I run 3 independent regression analyses, one for each of the 3 subscales of my construct, or run just one analysis (“X” with 3 levels) and still use a hierarchical/stepwise theoretical regression approach with ordinal logistic regression?

  9. Katrina Dunlap says

    Hi there. This was very helpful. But let’s say that you have a variable with the following outcomes: Almost always, Most of the time, Some of the time, Rarely, Never, Don’t Know, and Refused. These 6 categories can be reduced to 4. However, I am not sure if there is an order or not because “Don’t know” and “Refused” are confusing to me. Thoughts?

    • Shahzeb says

      It always depends on the research questions you are trying to answer but apparently “Don’t Know” and “Refused” seem to have very different meanings. In case you might want to group them as “No information gained”, you would definitely be able to consider the groupings as ordinal.

  10. Stephen says

    Hi,
    Let’s say the outcome is three states: State 0, State 1 and State 2. How about a situation where the sample goes through State 0, State 1 and State 2 but can also go from State 0 to State 2 or from State 2 to State 1? Would you consider this ordered or unordered?

