Five Ways to Analyze Ordinal Variables (Some Better than Others)

There are not a lot of statistical methods designed just to analyze ordinal variables.

But that doesn’t mean that you’re stuck with few options.  There are more than you’d think.

Some are better than others, but it depends on the situation and research questions.

Here are five options when your dependent variable is ordinal.

1. Analyze ordinal variables as if they’re nominal

Ordinal variables are fundamentally categorical. One simple option is to ignore the order in the variable’s categories and treat it as nominal. There are many options for analyzing categorical variables that have no order.

This can make a lot of sense for some variables, for example when there are few categories and the order isn’t central to the research question.

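The biggest advantage of this approach is that you won’t violate any assumptions about the order or spacing of the categories. The cost is that you throw away the ordering information, which can reduce your power to detect effects that follow that order.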
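To make this concrete, here is a minimal sketch of one such nominal analysis: a chi-square test of independence using SciPy's `chi2_contingency`. The counts and group labels are made up for illustration and are not from any real study.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts: rows are two groups, columns are the ordinal
# response categories (low, medium, high) treated as unordered labels.
observed = np.array([
    [20, 15, 5],   # group A
    [10, 15, 15],  # group B
])

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p_value:.4f}")

# Reordering the columns leaves the statistic unchanged, which shows that
# the test uses no information about the order of the categories.
shuffled = observed[:, [2, 0, 1]]
chi2_shuffled, p_shuffled, _, _ = chi2_contingency(shuffled)
```

The fact that shuffling the columns changes nothing is exactly what "ignoring the order" means in practice.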

2. Analyze ordinal variables as if they’re numeric

Because the ordering of the categories often is central to the research question, many data analysts do the opposite: ignore the fact that the ordinal variable really isn’t numerical and treat the numerals that designate each category as actual numbers.

This approach requires the assumption that the distance between each pair of adjacent categories is equal. And that can be very difficult to justify.

So think long and hard about whether you’re able to justify this assumption.
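One way to see why the assumption matters is to compare two scorings that both respect the order of the categories. The sketch below uses invented responses and an invented alternative scoring (the top category scored 10 instead of 4) purely for illustration.

```python
from statistics import mean

# Hypothetical responses on a 4-category ordinal scale, coded 1-4.
group_a = [1, 1, 2, 2, 4, 4]
group_b = [2, 2, 3, 3, 3, 3]

# Two scorings that both preserve the order of the categories.
equal_spacing = {1: 1, 2: 2, 3: 3, 4: 4}
unequal_spacing = {1: 1, 2: 2, 3: 3, 4: 10}  # top category is far away

def mean_difference(scores):
    """Mean of group A minus mean of group B under a given scoring."""
    a = mean(scores[x] for x in group_a)
    b = mean(scores[x] for x in group_b)
    return a - b

diff_equal = mean_difference(equal_spacing)
diff_unequal = mean_difference(unequal_spacing)
print(f"equal spacing: {diff_equal:.2f}, unequal spacing: {diff_unequal:.2f}")
```

With these made-up data, group A has the lower mean under equal spacing and the higher mean under the unequal scoring. Any conclusion about means depends on a spacing that the ordinal scale itself does not supply.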

3. Non-parametric tests

Some good news: there are other options.

Many non-parametric statistics are based on ranking numerical values. Ranks are themselves ordinal: they tell you about the order, but nothing about the distance between values.

Just like other ordinal variables.

So while we think of these tests as useful for numerical data that are non-normal or have outliers, they work for ordinal variables as well, especially when there are more than just a few ordered categories.

Common rank-based non-parametric tests include Kruskal-Wallis, Spearman correlation, Wilcoxon-Mann-Whitney, and Friedman.

Each test has a specific test statistic based on those ranks, depending on whether the test is comparing groups or measuring an association.
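As a rough sketch, three of these tests can be run with SciPy as shown here. The ratings are hypothetical 5-point responses invented for the example. The Friedman test, for three or more related samples, is available as `scipy.stats.friedmanchisquare` and is left out to keep the example short.

```python
from scipy.stats import kruskal, mannwhitneyu, spearmanr

# Hypothetical 5-point ratings from three independent groups.
group_a = [1, 2, 2, 3, 3, 3, 4]
group_b = [2, 3, 3, 4, 4, 5, 5]
group_c = [1, 1, 2, 2, 2, 3, 3]

# Kruskal-Wallis: compares three or more independent groups.
h_stat, p_kw = kruskal(group_a, group_b, group_c)

# Wilcoxon-Mann-Whitney: compares two independent groups.
u_stat, p_mw = mannwhitneyu(group_a, group_b, alternative="two-sided")

# Spearman correlation: association between two ordinal variables
# measured on the same cases (hypothetical paired ratings).
satisfaction = [1, 2, 2, 3, 4, 4, 5]
loyalty = [1, 1, 3, 3, 3, 5, 5]
rho, p_rho = spearmanr(satisfaction, loyalty)

print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {p_kw:.3f}")
print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_mw:.3f}")
print(f"Spearman rho = {rho:.2f}, p = {p_rho:.3f}")
```

Ordinal data with few categories produce many ties, so it is worth checking how your software handles them.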

The limitation of these tests, though, is they’re pretty basic. Sure, you can compare groups one-way ANOVA style or measure a correlation, but you can’t go beyond that. You can’t, for example, include an interaction between two independent variables or include covariates.

You need a real model to do that.

4. Ordinal logistic & probit regression

There aren’t many tests that are set up just for ordinal variables, but there are a few. Among the most commonly used are ordinal logistic (or probit) regression models.

There are a few different ways of specifying the logit link function so that it preserves the ordering in the dependent variable. The most commonly available in software is the cumulative link function, which models the odds of being at or below each category versus above it. Under the usual proportional odds assumption, each predictor has a single effect on the odds of being in a higher category, and that effect is the same at every cutpoint between categories.

These models are complex, have their own assumptions, and can take some practice to interpret. But they are also sometimes exactly what you need.

They are a very good tool to have in your statistical toolbox.
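To show what the cumulative logit link does, here is a small sketch that computes category probabilities and cumulative odds by hand. It does not fit a model. The thresholds and coefficient are invented values, and the sign convention P(Y <= j) = logistic(threshold_j - beta * x) is one common parameterization; software packages differ on this.

```python
import math

def logistic(z):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-z))

def category_probabilities(x, beta, thresholds):
    """Category probabilities under a cumulative logit model.

    Assumes P(Y <= j | x) = logistic(threshold_j - beta * x), so a
    positive beta shifts probability toward higher categories.
    """
    cumulative = [logistic(t - beta * x) for t in thresholds] + [1.0]
    probs = [cumulative[0]]
    for lower, upper in zip(cumulative, cumulative[1:]):
        probs.append(upper - lower)
    return probs

def odds_of_higher(x, beta, threshold):
    """Odds of falling above a given cutpoint: P(Y > j) / P(Y <= j)."""
    p_at_or_below = logistic(threshold - beta * x)
    return (1.0 - p_at_or_below) / p_at_or_below

# Hypothetical values: three cutpoints (four categories), one predictor.
thresholds = [-1.0, 0.5, 2.0]
beta = 0.8

probs_x0 = category_probabilities(0.0, beta, thresholds)
probs_x1 = category_probabilities(1.0, beta, thresholds)

# The odds ratio for a one-unit increase in x is the same at every
# cutpoint and equals exp(beta): the proportional odds assumption.
odds_ratios = [
    odds_of_higher(1.0, beta, t) / odds_of_higher(0.0, beta, t)
    for t in thresholds
]
print([round(r, 3) for r in odds_ratios], round(math.exp(beta), 3))
```

For fitting a model to real data in Python, statsmodels provides `OrderedModel` in `statsmodels.miscmodels.ordinal_model`. Check its documentation for the parameterization it uses before interpreting coefficients.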

5. Rank transformations

Another model-based approach combines the advantages of ordinal logistic regression and the simplicity of rank-based non-parametrics.

The basic idea is a rank transformation: transform each ordinal outcome score into the rank of that score and run your regression, two-way ANOVA, or other model on those ranks.

The thing to remember, though, is that all results need to be interpreted in terms of the ranks. Just as a log transformation on a dependent variable puts all the means and coefficients on a log(DV) scale, the rank transformation puts everything on a rank scale. Your interpretations are going to be about mean ranks, not means.
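A minimal sketch of the idea, assuming a single numeric predictor and made-up data: rank the outcome with SciPy's `rankdata`, then fit an ordinary least squares line to the ranks with NumPy.

```python
import numpy as np
from scipy.stats import rankdata

# Hypothetical data: an ordinal outcome (1-5) and a numeric predictor.
outcome = np.array([1, 2, 2, 3, 3, 4, 4, 5, 5, 5])
predictor = np.array([0.5, 1.0, 1.8, 2.1, 2.9, 3.2, 4.0, 4.4, 5.1, 5.9])

# Rank-transform the outcome; tied scores share the average of their ranks.
outcome_ranks = rankdata(outcome, method="average")

# Ordinary least squares on the ranks, with an intercept column.
design = np.column_stack([np.ones_like(predictor), predictor])
coefs, *_ = np.linalg.lstsq(design, outcome_ranks, rcond=None)
intercept, slope = coefs

# The slope is the expected change in the outcome's rank, not in the
# original 1-5 score, for a one-unit increase in the predictor.
print(f"slope on the rank scale: {slope:.2f}")
```

Here the two scores of 2 both get rank 2.5 and the three scores of 5 all get rank 9, so the model's coefficients describe movement along those ranks.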

 


Comments

  1. Jacob O. Wobbrock, Ph.D. says

    For #5, it has been shown that interaction effects under simple rank transformations explode Type I error rates. Put simply, you can’t just rank-transform your DV and run, say, a two-way ANOVA and safely interpret the interaction effect. An increasingly common fix is to use the ALIGNED rank transform (ART), which aligns the response for each main effect and interaction separately, before ranking. This procedure preserves correct Type I error rates for all effects, including interactions. We created an R package called ARTool that performs the ART procedure and gives ANOVA table results. We also have a Windows executable for aligning-and-ranking data. You can find out more on our project page here: http://depts.washington.edu/acelab/proj/art/

  2. shimuye nigusse says

    1. I am running an ordinal logistic regression in Stata, but when I tried the parallel lines test using the command oparallel, it reported for some explanatory variables that the Hessian is not negative semidefinite, and for others that the full model can’t be estimated due to perfect prediction. How can I solve this problem, please?
    2. I have two categorical dependent variables, both ordinal. What is the best way to analyze my data: a separate ordinal logistic regression for each dependent variable, or a single model that combines them?
    Sincerely!

