The Analysis Factor


Statistical Consulting, Resources, and Statistics Workshops for Researchers


What is a Logit Function and Why Use Logistic Regression?

by Karen Grace-Martin

One of the big assumptions of linear models is that the residuals are normally distributed.  This doesn’t mean that Y, the response variable, has to also be normally distributed, but it does have to be continuous, unbounded and measured on an interval or ratio scale.

Unfortunately, categorical response variables are none of these.

No matter how many transformations you try, you’re just never going to get normal residuals from a model with a categorical response variable.

There are a number of alternatives though, and one of the most popular is logistic regression.

In many ways, logistic regression is very similar to linear regression.  One big difference, though, is the logit link function.

The Logit Link Function

A link function is simply a function of the mean of the response variable Y that we use as the response instead of Y itself.

All that means is that when Y is categorical, we model the logit of P, the probability of one category of Y, instead of Y itself:

$$\ln\left(\frac{P}{1-P}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k$$

The logit function is the natural log of the odds that Y equals one of the categories.  For mathematical simplicity, we’re going to assume Y has only two categories and code them as 0 and 1.

This is entirely arbitrary–we could have used any numbers.  But these make the math work out nicely, so let’s stick with them.

P is defined as the probability that Y=1.  So for example, those Xs could be specific risk factors, like age, high blood pressure, and cholesterol level, and P would be the probability that a patient develops heart disease.
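To make the definition concrete, here is a minimal sketch of the logit and its inverse in Python. The function names `logit` and `inverse_logit` are chosen for illustration and do not come from any particular library.

```python
import math

def logit(p):
    """Natural log of the odds p / (1 - p); defined only for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))

def inverse_logit(eta):
    """Map a value on the logit (log-odds) scale back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

# Probabilities below 0.5 give negative logits, 0.5 gives zero,
# and probabilities above 0.5 give positive logits.
for p in (0.1, 0.5, 0.9):
    eta = logit(p)
    print(p, round(eta, 3), round(inverse_logit(eta), 3))
```

Because P is bounded by 0 and 1 while its logit is unbounded, the logit can sit on the left side of a linear equation without the predicted values ever leaving the valid range for a probability.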

Why Bother With This Logit Function?

Well, if we used Y as the outcome variable and tried to fit a line, it wouldn’t be a very good representation of the relationship.  The following graph shows an attempt to fit a line between one X variable and a binary outcome Y.

You can see a relationship there–higher values of X are associated with more 0s and lower values of X have more 1s.  But it’s not a linear relationship.

[Figure: Bad fitting model]
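One way to see the problem numerically is to fit an ordinary least-squares line to a binary outcome. The data below are made up for illustration (they are not the data behind the graph), and the sketch uses only the textbook formulas for a one-predictor regression.

```python
# Hypothetical data: x is one predictor, y is a binary outcome.
# Lower x values have more 1s, higher x values have more 0s.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept for a single predictor.
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

def predict(x_new):
    """Fitted value of the straight line at x_new."""
    return intercept + slope * x_new

for x_new in (1, 5, 10):
    print(x_new, round(predict(x_new), 3))
```

With these made-up values the fitted line runs above 1 at the low end of X and below 0 at the high end, so it cannot be read as a probability there.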

Okay, fine.  But why mess with logs and odds?  Why not just use P as the outcome variable?  Everyone understands probability.

Here’s the same graph with probability on the Y axis:

It’s closer to being linear, but it’s still not quite there.  Instead of a linear relationship between X and P, we have a sigmoidal or S-shaped relationship.
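The S-shape can be sketched with a few numbers. The intercept and slope below are hypothetical, picked only so the curve falls as X rises, as in the graphs described above. Equal steps in X produce unequal steps in P but equal steps in the logit of P.

```python
import math

# Hypothetical coefficients on the logit scale (not estimates from real data).
B0, B1 = 5.0, -1.0

def prob(x):
    """P(Y = 1) at x when the logit of P is the line B0 + B1 * x."""
    return 1.0 / (1.0 + math.exp(-(B0 + B1 * x)))

def logit(p):
    return math.log(p / (1.0 - p))

xs = [1, 3, 5, 7, 9]                      # evenly spaced X values
ps = [prob(x) for x in xs]

# Change between neighbouring X values, on each scale.
p_steps = [ps[i + 1] - ps[i] for i in range(len(ps) - 1)]
logit_steps = [logit(ps[i + 1]) - logit(ps[i]) for i in range(len(ps) - 1)]

print([round(s, 3) for s in p_steps])      # small, large, large, small
print([round(s, 3) for s in logit_steps])  # the same step every time
```

The changes in P are small in the tails and large in the middle, which is the sigmoid, while the changes in the logit are constant, which is what a linear relationship with X means.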

But it turns out that there are a few functions of P that do form reasonably linear relationships with X.  These include:

  • Arcsine of the square root
  • Complementary log-log
  • Probit
  • Logit
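For readers who want to compare these transformations side by side, the following is a small sketch using only the Python standard library. The function names are illustrative, the formulas are the usual textbook definitions, and the probit uses `statistics.NormalDist` (Python 3.8 or later).

```python
import math
from statistics import NormalDist

def logit(p):
    """Log of the odds."""
    return math.log(p / (1.0 - p))

def probit(p):
    """Inverse of the standard normal cumulative distribution function."""
    return NormalDist().inv_cdf(p)

def cloglog(p):
    """Complementary log-log: log(-log(1 - p))."""
    return math.log(-math.log(1.0 - p))

def arcsine_sqrt(p):
    """Arcsine of the square root of p, in radians."""
    return math.asin(math.sqrt(p))

print("   p    logit  probit cloglog  arcsin")
for p in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(f"{p:4.2f} {logit(p):8.3f} {probit(p):7.3f} "
          f"{cloglog(p):7.3f} {arcsine_sqrt(p):7.3f}")
```

The logit and probit are symmetric around P = 0.5, where both equal zero. The complementary log-log is not symmetric, and the arcsine transformation stays bounded between 0 and π/2.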

The logit function is particularly popular because, believe it or not, its results are relatively easy to interpret.  But many of the others work just as well.

Once we fit this model, we can back-transform the estimated regression coefficients from the log-odds scale by exponentiating them.  The resulting odds ratios let us interpret the conditional effect of each X.
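As a rough illustration of that back-transformation, suppose a fitted model gave the coefficients below. They are hypothetical values for a single predictor (age in years), not results from any real analysis.

```python
import math

# Hypothetical fitted coefficients on the log-odds scale.
B0 = -4.0    # intercept
B_AGE = 0.08 # change in log-odds per additional year of age

def odds(age):
    """Odds that Y = 1 at a given age under the hypothetical model."""
    return math.exp(B0 + B_AGE * age)

# Exponentiating the coefficient gives the odds ratio for a one-unit increase.
odds_ratio = math.exp(B_AGE)
print(round(odds_ratio, 3))

# The same ratio holds wherever the one-year step is taken.
print(round(odds(51) / odds(50), 3), round(odds(71) / odds(70), 3))
```

The odds ratio is constant across ages, which is what makes it a convenient summary. The corresponding change in probability is not constant, because P follows the S-shaped curve.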



Comments

  1. Shahab Hadaegh says

    October 17, 2019 at 3:07 pm

    Given the non-linearity of the transformation, can back-transforming the estimated coefficients result in bias?
  2. Lisa says

    May 13, 2019 at 6:25 am

    Hello,
    what’s the difference between performing a logistic regression and performing a linear regression based on a [previously computed] logit as dependent variable? Many thanks in advance.

    • Karen Grace-Martin says

      October 16, 2020 at 3:10 pm

      Hi Lisa,

      A logit as dependent variable doesn’t really work when the outcome is 1/0. You’d have to group observations to come up with a value of p in the logit–the proportion of 1s. That’s the beauty of the link function. It does that for you.
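The grouping step described in this reply can be sketched as follows. The groups and outcomes are invented for illustration, and `empirical_logit` is a hypothetical helper name.

```python
import math

# Hypothetical 0/1 outcomes, grouped by some category of X.
groups = {
    "low":  [0, 0, 0, 1, 0],
    "mid":  [0, 1, 1, 0, 1, 0],
    "high": [1, 1, 0, 1, 1],
}

def empirical_logit(outcomes):
    """Logit of the proportion of 1s, or None when it is undefined."""
    p = sum(outcomes) / len(outcomes)
    if p == 0.0 or p == 1.0:
        return None  # log(0) or division by zero: no finite logit
    return math.log(p / (1.0 - p))

for name, outcomes in groups.items():
    print(name, empirical_logit(outcomes))

# A single 0/1 observation has p equal to 0 or 1, so its logit is undefined.
print(empirical_logit([1]), empirical_logit([0]))
```

This is why a previously computed logit only exists for grouped data. Any group made up entirely of 0s or entirely of 1s, including every single observation on its own, has no finite logit.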

  3. afvrefz says

    May 31, 2018 at 1:24 pm

    Karen,

    Can you list all possible functions used in GRL please ?

    Example: logistic function, arcsine of the square root, …

    Thank you
  4. may says

    May 16, 2018 at 12:30 am

    How can I use the arcsine of the square root as a model in logistic regression for binomial data?
    • Karen Grace-Martin says

      May 17, 2018 at 9:54 am

      May, you can’t. The arcsine of the square root is an alternative to logistic regression, but it’s arcane. It is still recommended sometimes, but it’s an ad hoc way of fitting a binary outcome into a normal model. It’s better to just do the logistic regression.
  5. faisal says

    January 4, 2018 at 4:52 am

    Aoa, please tell me whether I should use a probit or logit model if I have one dependent variable and one independent variable with five control variables. The dependent variable is a dummy variable (0 and 1), and one control variable also takes the values 1 and 0.
    • Karen Grace-Martin says

      March 7, 2019 at 2:31 pm

      In that situation, you should be able to use either probit or logistic.

  6. Hamedi says

    February 16, 2017 at 4:21 am

    In GLM models, is it possible to use a function of the median instead of a function of the mean of the response in the logit link? I would like to use a distribution whose median is simpler than its mean. Could I use the median instead of the mean?
    • Karen Grace-Martin says

      March 7, 2019 at 2:30 pm

      Hi Hamedi,

      Not that I’ve ever heard. The median of a binary outcome would be either 0 or 1. The mean is somewhere in between – it’s the proportion of 1s.

  7. Tom says

    August 4, 2016 at 3:46 am

    What will be the form of the logit function when Y is a binary variable such that y=0 with probability p and y=1 with probability (1-p)?
    Thanks
    • Karen Grace-Martin says

      March 7, 2019 at 2:28 pm

      Hi Tom,

      The form will be the same. You’re just switching what is a success (Y=1) with what is a failure (Y=0).
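As a short worked step (an editorial illustration, not part of the original reply), swapping the roles of success and failure replaces p with 1 - p inside the logit:

```latex
\operatorname{logit}(1-p)
  = \ln\!\left(\frac{1-p}{p}\right)
  = -\ln\!\left(\frac{p}{1-p}\right)
  = -\operatorname{logit}(p)
```

So the functional form is unchanged. Only the sign flips, which means every coefficient in the linear predictor changes sign when the coding of Y is reversed.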

  8. V. Mahdavi says

    June 27, 2016 at 6:04 pm

    Is a negative slope possible for the line in a probit chart?
    • Karen Grace-Martin says

      March 7, 2019 at 2:29 pm

      If you mean a negative coefficient for a probit, then yes.
  9. Raj says

    June 26, 2016 at 11:46 pm

    Hello,
    I have some doubts about adding an interaction term to a model. When should we add an interaction term, and how do we interpret its coefficient? If you have a paper or book on this, kindly send it to me. It would help me a lot.
    Thanks
    • Karen says

      July 1, 2016 at 11:44 am

      Hi Raj,

      I’m not sure I can think of anything written on interpreting interactions in logistic regression, but we do cover this in the logistic regression workshop.

      The very basic idea, though, is that the odds ratio for an interaction is the ratio of odds ratios. It’s hard to explain without knowing if the terms you are interacting are continuous or categorical, but that’s the basic definition. It takes me a good half hour to go over this in the workshop. 🙂
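The "ratio of odds ratios" idea can be sketched for the simplest case. The example assumes two binary (0/1) predictors, here called A and B, and uses hypothetical coefficients. The continuous case mentioned in the reply is not covered.

```python
import math

# Hypothetical coefficients on the log-odds scale for the model
# logit(P) = B0 + B_A * a + B_B * b + B_AB * a * b, with a and b coded 0/1.
B0, B_A, B_B, B_AB = -1.0, 0.5, 0.3, 0.4

def odds(a, b):
    """Odds that Y = 1 for a given combination of the two predictors."""
    return math.exp(B0 + B_A * a + B_B * b + B_AB * a * b)

# Odds ratio for A, computed separately at each level of B.
or_a_when_b0 = odds(1, 0) / odds(0, 0)
or_a_when_b1 = odds(1, 1) / odds(0, 1)

# The exponentiated interaction coefficient is the ratio of those two odds ratios.
ratio_of_odds_ratios = or_a_when_b1 / or_a_when_b0
print(round(or_a_when_b0, 3), round(or_a_when_b1, 3))
print(round(ratio_of_odds_ratios, 3), round(math.exp(B_AB), 3))
```

In this setup, exp(B_AB) tells you how many times larger (or smaller) the odds ratio for A is when B = 1 than when B = 0.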


Copyright © 2008–2021 The Analysis Factor, LLC. All rights reserved.