Odds ratios are one of those concepts in statistics that are just really hard to wrap your head around. Although probability and odds both measure how likely it is that something will occur, probability is just so much easier to understand for most of us.

I’m not sure if it’s just a more intuitive concept, or if it’s something we’re just taught so much earlier that it’s more ingrained. In either case, without a lot of practice, most people won’t have an immediate understanding of how likely something is if it’s communicated through odds.

So why not always use probability?

The problem is that probability and odds have different properties that give odds some advantages in statistics. For example, in logistic regression the odds ratio represents the constant effect of a predictor X on the likelihood that one outcome will occur.

The key phrase here is *constant effect*. In regression models, we often want a measure of the unique effect of each X on Y. If we try to express the effect of X on the likelihood of a categorical Y having a specific value through probability, the effect is not constant.

What that means is there is no way to express in one number how X affects Y in terms of probability. The effect of X on the probability of Y has different values depending on the value of X.
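To make this concrete, here is a minimal sketch in Python. The intercept and slope are made-up values chosen only for illustration, not estimates from any real model. It computes the odds ratio and the change in probability for a one-unit increase in X, starting from several values of X.

```python
import math

def predicted_probability(x, intercept=-3.0, slope=0.5):
    """Logistic model with hypothetical coefficients (illustration only)."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))

def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

rows = []
for x in [0, 2, 4, 6, 8]:
    p_now = predicted_probability(x)
    p_next = predicted_probability(x + 1)  # one-unit increase in X
    odds_ratio = odds(p_next) / odds(p_now)
    prob_change = p_next - p_now
    rows.append((x, odds_ratio, prob_change))
    print(f"x={x}: odds ratio = {odds_ratio:.3f}, "
          f"change in probability = {prob_change:.3f}")
```

Under these assumed coefficients, the odds ratio comes out the same at every starting value (it equals exp(0.5), about 1.649), while the change in probability differs from row to row.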

So while we would love to use probabilities because they’re intuitive, you’re just not going to be able to describe that effect in a single number. So if you need to communicate that effect to a research audience, you’re going to have to wrap your head around odds ratios.

#### What About Probabilities?

What you can do, and many people do, is to use the logistic regression model to calculate predicted probabilities at specific values of a key predictor, usually when holding all other predictors constant.

This is a great approach to use together with odds ratios. The odds ratio is a single summary score of the effect, and the probabilities are more intuitive.
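As a rough sketch of that combined approach, the Python below computes predicted probabilities at two values of a key predictor while holding a second predictor constant, and reports the odds ratio alongside them. The coefficients, the predictor names (`treatment`, `age`), and the value age is held at are all hypothetical; in practice they would come from your fitted model.

```python
import math

# Hypothetical fitted coefficients on the log-odds scale (not from real data).
COEFS = {"intercept": -1.0, "treatment": -0.8, "age": 0.03}

def predicted_probability(treatment, age):
    """Predicted probability of the outcome from the assumed logistic model."""
    log_odds = (COEFS["intercept"]
                + COEFS["treatment"] * treatment
                + COEFS["age"] * age)
    return 1.0 / (1.0 + math.exp(-log_odds))

AGE_HELD_AT = 40  # assumed value at which the other predictor is held constant

p_control = predicted_probability(treatment=0, age=AGE_HELD_AT)
p_treated = predicted_probability(treatment=1, age=AGE_HELD_AT)
odds_ratio = math.exp(COEFS["treatment"])  # single-number summary of the effect

print(f"Predicted probability, control: {p_control:.3f}")
print(f"Predicted probability, treated: {p_treated:.3f}")
print(f"Odds ratio for treatment:       {odds_ratio:.3f}")
```

Changing `AGE_HELD_AT` changes both predicted probabilities and the gap between them, but leaves the odds ratio untouched, which is why the two kinds of summary complement each other.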

Presenting probabilities without the corresponding odds ratios can be problematic, though.

First, when X, the predictor, is categorical, the effect of X *can* be effectively communicated through a difference or ratio of probabilities. The probability a person has a relapse in an intervention condition compared to the control condition makes a lot of sense.

But the p-value the logistic regression reports for that effect tests the odds ratio. It *is not* the p-value for the difference in probabilities.

If you present a table of probabilities at different values of X, most research audiences will, at least in their minds, make those difference comparisons between the probabilities. They do this because they’ve been trained to do this in linear models.

These differences in probabilities don’t line up with the p-values in logistic regression models, though. And this can get quite confusing.

Second, when X, the predictor, is continuous, the odds ratio is constant across values of X. But probabilities aren’t.

It works exactly the same way as interest rates. I can tell you that an annual interest rate is 8%. So at the end of the year, you’ll earn $8 if you invested $100, or $40 if you invested $500. The rate stays constant, but the actual amount earned differs based on the amount invested.

Odds ratios work the same. An odds ratio of 1.08 will give you an 8% increase in the odds at any value of X.

Likewise, the actual difference in the probability (or in the odds) depends on the value of X, just as the dollars earned depend on the amount invested.

So if you do decide to report the increase in probability at different values of X, you’ll have to do it at low, medium, and high values of X. You can’t use a single number on the probability scale to convey the *relationship* between the predictor and the probability of a response.
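Here is a small Python sketch of why one number won’t do on the probability scale. It applies the odds ratio of 1.08 from the example above to three baseline probabilities. The baselines of 0.05, 0.50, and 0.95 are assumptions standing in for low, medium, and high values of X.

```python
ODDS_RATIO = 1.08  # an 8% increase in the odds per one-unit increase in X

def probability_after_one_unit(p_before, odds_ratio=ODDS_RATIO):
    """Multiply the odds by the odds ratio, then convert back to a probability."""
    odds_before = p_before / (1.0 - p_before)
    odds_after = odds_before * odds_ratio
    return odds_after / (1.0 + odds_after)

changes = {}
for p_before in (0.05, 0.50, 0.95):  # assumed low, medium, and high baselines
    p_after = probability_after_one_unit(p_before)
    changes[p_before] = p_after - p_before
    print(f"baseline {p_before:.2f} -> {p_after:.4f} "
          f"(change of {changes[p_before]:.4f})")
```

The odds go up by the same 8% in every row, but the increase in probability is largest near a baseline of 0.50 and much smaller near 0.05 or 0.95.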

It takes more than a single number, and it’s not “the effect of X on Y,” but sometimes it’s a better way to communicate what is really going on, especially to non-research audiences.

#### 9 Comments

How do I compute probabilities in logistic regression with Stata?

You write

The key phrase here is constant effect. In regression models, we often want a measure of the unique effect of each X on Y. If we try to express the effect of X on the likelihood of a categorical Y having a specific value through probability, the effect is not constant.

But sometimes don’t you want the effect of X on the categorical variable to not be constant? Like when you are building a predictive model with an individual X that shows different behavior depending on whether X is high or low.

Thank you. This helped me explain to reviewer 1 why the request for predicted probabilities rather than odds ratios was respectfully declined. Succinct, clear, and intuitive. Much appreciated.

How do I interpret odds ratios in an ordered multinomial logit?

This was an extremely clear explanation explained in a simple manner.

Can someone tell me how to transform odds ratios into logistic beta coefficients? Suppose the odds of becoming diabetic when someone is obese is 4. What would be the corresponding value of the beta coefficient in a logistic regression?

Thank you so much for any clue.

Kiza.

This was an extremely intuitive explanation. I couldn’t find an answer like this elsewhere. Thank you!

How do I interpret multinomial logistic regression?

Hi Tesfaye,

That’s a really good question, and how I’d answer depends on, say, whether you already understand binary logistic regression.

I would suggest starting with these two webinar recordings:

Binary, Ordinal, and Multinomial Logistic Regression for Categorical Outcomes

Understanding Probability, Odds, and Odds Ratios in Logistic Regression.

They’re both free.

The former describes multinomial logistic regression and how interpretation differs from binary. The latter goes into more detail about how to interpret an odds ratio.

Karen