Logistic Regression


online workshops

Logistic Regression for Binary, Ordinal, and Multinomial Outcomes

Sooner or later, you’re going to have to answer a research question with a categorical dependent variable. As you may have already encountered, no matter how many ways you transform or try to finagle the data, you just can’t force it into a linear regression or ANOVA. So what do you do? Logistic regression: A researcher’s best friend when it comes to categorical outcome variables. learn more


the craft of statistical analysis free webinars

Binary, Ordinal, and Multinomial Logistic Regression for Categorical Outcomes

Logistic regression is one of the most useful tools you can have in your statistical toolbox. The different types can be used in a common data situation where linear models can’t: when the outcome variable is categorical. They are a little trickier to learn than linear models, but once you get the idea, you’ll see that they’re well within your reach. learn more

Understanding Probability, Odds, and Odds Ratios in Logistic Regression

Odds ratios are the bane of many data analysts. Interpreting them can be like learning a whole new language. We discuss how to interpret the odds ratios in binary logistic regression. learn more


statistically speaking member trainings

Those Darn Ratios!

Ratios are everywhere in statistics—coefficient of variation, hazard ratio, odds ratio, the list goes on. You see them reported in the literature and in your output. You comment on them in your reports. You even (kinda) understand them. Or, maybe, not quite? learn more

Logistic Regression for Count and Proportion Data

Most of us know that binary logistic regression is appropriate when the outcome variable has two possible outcomes: success and failure. There are two more situations that are also appropriate for binary logistic regression, but they don’t always look like they should be. learn more

Generalized Linear Models

Generalized linear models are designed to work with outcomes that aren’t normally distributed, but have other recognizable characteristics, such as being counts, proportions, or belonging to categories. They are often exactly what you need when you just can’t get a normal distribution to fit. learn more

A Primer on Exponents and Logarithms for the Data Analyst

Ah, logarithms. They were frustrating enough back in high school. (If you even got that far in high school math.) And they haven’t improved with age, now that you can barely remember what you learned in high school. And yet… they show up so often in data analysis. learn more

Analysis of Ordinal Variables: Options Beyond Nonparametrics

There are many types and examples of ordinal variables: percentiles, ranks, Likert scale items, to name a few. These are especially hard to know how to analyze – some people treat them as numerical, others emphatically say not to. learn more

ROC Curves

ROC Curves are incredibly useful in evaluating any model or process that predicts group membership of individuals. Any ROC can tell you how well a process or model distinguishes between true and false positives and negatives. learn more

Types of Regression Models and When to Use Them

Linear, Logistic, Tobit, Cox, Poisson, Zero Inflated… The list of regression models goes on and on before you even get to things like ANCOVA or Linear Mixed Models. learn more


articles at the analysis factor

get started with logistic regression concepts

Introduction to Logistic Regression

Linear regression is commonly used when the response variable is continuous. One assumption of linear models is that the residual errors follow a normal distribution. This assumption fails when the response variable is categorical, so an ordinary linear model is not appropriate. We present a regression model for a response variable that is dichotomous–having two categories. learn more

What is a Logit Function and Why Use Logistic Regression?

One of the big assumptions of linear models is that the residuals are normally distributed. Unfortunately, categorical response variables are not. No matter how many transformations you try, you’re just never going to get normal residuals from a model with a categorical response variable. learn more
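As a rough illustration of what the logit function does, here is a minimal Python sketch. The function names `logit` and `inverse_logit` are chosen here for illustration and do not come from the article. The logit takes a probability, which is stuck between 0 and 1, and stretches it onto an unbounded scale (the log of the odds), which is the scale a linear predictor can work on.

```python
import math


def logit(p: float) -> float:
    """Map a probability in (0, 1) to log-odds on the whole real line."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inverse_logit(log_odds: float) -> float:
    """Map log-odds back to a probability between 0 and 1."""
    # Two branches so math.exp never receives a large positive argument.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        z = logit(p)
        print(f"p={p:.1f}  logit={z:+.3f}  back to p={inverse_logit(z):.3f}")
```

Probabilities below 0.5 map to negative log-odds, 0.5 maps to zero, and probabilities above 0.5 map to positive log-odds.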

Chi-Square Test vs. Logistic Regression: Is a Fancier Test Better?

I recently received a great question, and one of wider interest: “why is using regression or logistic regression ‘better’ than doing bivariate analysis such as Chi-square?” There are a number of different reasons I’ve seen. learn more

Why Use Odds Ratios in Logistic Regression

Odds ratios are one of those concepts in statistics that are just really hard to wrap your head around. Although probability and odds both measure how likely it is that something will occur, probability is just so much easier to understand for most of us. learn more

Logistic Regression Analysis: Understanding Odds and Probability

Probability and odds measure the same thing: the likelihood or propensity or possibility of a specific outcome. People use the terms odds and probability interchangeably in casual usage, but that is unfortunate. It just creates confusion because they are not equivalent. learn more
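To make the difference concrete, here is a small Python sketch of the conversion between the two scales. The helper names are hypothetical and used only for illustration. Odds are the probability that something happens divided by the probability that it does not, so the two numbers agree closely only when the probability is small.

```python
def odds_from_probability(p: float) -> float:
    """Odds = p / (1 - p); undefined when p equals 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


def probability_from_odds(odds: float) -> float:
    """Probability = odds / (1 + odds)."""
    if odds < 0.0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Illustrative probabilities only, not data from any study.
    for p in (0.01, 0.10, 0.50, 0.80):
        print(f"probability={p:.2f}  odds={odds_from_probability(p):.3f}")
```

A probability of 0.50 corresponds to odds of 1 (an even chance), while a probability of 0.80 corresponds to odds of about 4, which is why the two terms cannot be swapped freely.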

When Linear Models Don’t Fit Your Data, Now What?

When your dependent variable is not continuous, unbounded, and measured on an interval or ratio scale, linear models don’t fit. The data just will not meet the assumptions of linear models. But there’s good news: other models exist for many types of dependent variables. learn more

advanced topics in logistic regression

How to Interpret Odds Ratios when a Categorical Predictor Variable has More than Two Levels

One great thing about logistic regression, at least for those of us who are trying to learn how to use it, is that the predictor variables work exactly the same way as they do in linear regression. Dummy coding, interactions, quadratic terms–they all work the same way. learn more

Generalized Linear Models in R, Part 1: Calculating Predicted Probability in Binary Logistic Regression

Ordinary Least Squares regression provides linear models of continuous variables. However, much data of interest to statisticians and researchers are not continuous and so other methods must be used to create useful predictive models. learn more

Generalized Linear Models in R, Part 2: Understanding Model Fit in Logistic Regression Output

Last time, we saw how to create a simple Generalized Linear Model on binary data using the glm() command. We continue with the same glm on the mtcars data set (modeling the vs variable on the weight and engine displacement). learn more

SPSS Procedures for Logistic Regression

Need to run a logistic regression in SPSS? Turns out, SPSS has a number of procedures for running different types of logistic regression. Some types of logistic regression can be run in more than one procedure. For some unknown reason, some procedures produce output that others don’t. So it’s helpful to be able to use more than one. learn more

Logistic Regression Models: Reversed Odds Ratios in SAS Proc Logistic–Use ‘Descending’

If you’ve ever been puzzled by odds ratios in a logistic regression that seem backward, stop banging your head on the desk. Odds are (pun intended) you ran your analysis in SAS Proc Logistic. Proc Logistic has a strange little default. learn more

Effect Size Statistics in Logistic Regression

Effect size statistics are expected by many journal editors these days. If you’re running an ANOVA, t-test, or linear regression model, it’s pretty straightforward which ones to report. Things get trickier, though, once you venture into other types of models. learn more

How to Get Standardized Regression Coefficients When Your Software Doesn’t Want To Give Them To You

Standardized regression coefficients remove the unit of measurement of predictor and outcome variables. They are sometimes called betas, but I don’t like to use that term because there are too many other, and too many related, concepts that are also called beta. There are many good reasons to report them. learn more

Measures of Predictive Models: Sensitivity and Specificity

With any model, such as one that flags a transaction as likely enough to be fraudulent to shut it down, you’re never going to hit 100% accuracy. And when the model is wrong, there’s a tradeoff between tightening standards to catch the credit card thieves and annoying customers who are just trying to stock up at Trader Joe’s. learn more
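As a rough sketch of the two measures named in the title, the Python below computes sensitivity and specificity from the four cells of a classification table. The function name and the counts are made up for illustration and do not come from the article. Sensitivity is the share of truly fraudulent transactions the model catches, and specificity is the share of legitimate transactions it leaves alone.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-table counts.

    tp: fraud flagged as fraud       fn: fraud that slipped through
    tn: legitimate left alone        fp: legitimate wrongly flagged
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one actual positive and one actual negative")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical counts, purely for illustration.
    sens, spec = sensitivity_specificity(tp=90, fn=10, tn=950, fp=50)
    print(f"sensitivity={sens:.2f}  specificity={spec:.2f}")
```

Tightening the standard for flagging a transaction typically moves cases from the false negative cell to the true positive cell, but also from true negative to false positive, which is the tradeoff described above.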

Explaining Logistic Regression Results to Non-Statistical Audiences

I received an e-mail from a researcher in Canada that asked about communicating logistic regression results to non-researchers. It was an important question, and there are a number of parts to it. learn more

Models for Repeated Measures Continuous, Categorical, and Count Data

Lately, I’ve gotten a lot of questions about learning how to run models for repeated measures data that isn’t continuous. Mostly categorical. But once in a while discrete counts. A typical study is in linguistics or psychology where each subject is asked to answer some Yes/No question on each of many trials. learn more

ordinal and multinomial logistic regression

Logistic Regression Models for Multinomial and Ordinal Variables

The multinomial (a.k.a. polytomous) logistic regression model is a simple extension of the binomial logistic regression model. It is used when the dependent variable has more than two nominal (unordered) categories. learn more

Opposite Results in Ordinal Logistic Regression—Solving a Statistical Mystery

A number of years ago when I was still working in the consulting office at Cornell, someone came in asking for help interpreting their ordinal logistic regression results. The client was surprised because all the coefficients were backwards from what they expected, and they wanted to make sure they were interpreting them correctly. learn more

Opposite Results in Ordinal Logistic Regression, Part 2

I received the following email from a reader after sending out the last article in the series. And I agreed I’d answer it here in case anyone else was confused. learn more

Generalized Ordinal Logistic Regression for Ordered Response Variables

learn more