The Analysis Factor

Statistical Consulting, Resources, and Statistics Workshops for Researchers

Poisson Regression Analysis for Count Data

by Karen Grace-Martin

Some dependent variables cannot be made normally distributed, no matter how many transformations you try.  The most common culprits are count variables: variables that measure the count or rate of some event in a sample.  Some examples I’ve seen from a variety of disciplines are:

Number of eggs in a clutch that hatch
Number of domestic violence incidents in a month
Number of times juveniles needed to be restrained during tenure at a correctional facility
Number of infected plants per transect

A common quality of these variables is that 0 is the mode–the most common value.  1 is the next most common, 2 the next, and so on.  In variables with low expected counts (number of cars in a household, number of degrees earned), this is often more pronounced.  No monotonic transformation (log, square root, etc.) can ever move the mode from the end of the distribution to the middle as a normal distribution requires.
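To see why a transformation cannot help here, consider a small sketch. The counts below are made up for illustration (they are not from any real study), and the log(y + 1) transformation is just one example of a monotonic transformation. Because a monotonic transformation preserves the ordering of the values, the most common value is still the smallest value afterward.

```python
import math
from collections import Counter

# Hypothetical counts with 0 as the most common value (illustration only).
counts = [0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3]

# Mode of the raw counts.
raw_mode = Counter(counts).most_common(1)[0][0]

# Apply a monotonic transformation: log(y + 1).
transformed = [math.log1p(c) for c in counts]
transformed_mode = Counter(transformed).most_common(1)[0][0]

# The mode is still the smallest value, at the end of the distribution.
print("raw mode:", raw_mode)
print("transformed mode:", transformed_mode)
print("mode is still the minimum:", transformed_mode == min(transformed))
```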

But a least-squares normal model doesn’t work for a few more reasons.

First, count variables can’t be below 0.  Negative counts just don’t make sense.  But a normal model has no bounds, so any value is possible, and it can produce negative predicted values.
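A minimal sketch of the problem, using invented data: the x and y values below are hypothetical, and the fit is an ordinary least-squares line computed by hand. Every observed count is 0 or more, yet the fitted line dips below 0 at the largest x.

```python
# Hypothetical data: counts that fall toward zero as x increases.
x = [0, 1, 2, 3, 4, 5]
y = [9, 5, 3, 1, 0, 0]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Ordinary least-squares slope and intercept for a single predictor.
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

predictions = [intercept + slope * xi for xi in x]

for xi, yi, pi in zip(x, y, predictions):
    print(f"x={xi}  observed={yi}  predicted={pi:.2f}")

# The observed counts are never negative, but the last prediction is.
print("any negative prediction:", any(p < 0 for p in predictions))
```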

Second, with these variables the variance is often not constant, which violates a basic assumption of ordinary least squares regression.  Instead, it goes up with the mean of Y.
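As a rough illustration, suppose the counts follow a Poisson distribution (an assumption made here only for the sketch). The code below computes the mean and variance directly from the Poisson probabilities for three hypothetical means and shows the variance rising along with the mean, rather than staying constant.

```python
import math

def poisson_mean_and_variance(mu, max_k=200):
    """Mean and variance computed from Poisson probabilities for k = 0..max_k."""
    ks = range(max_k + 1)
    # Work on the log scale to avoid overflow in mu**k and k!.
    probs = [math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1)) for k in ks]
    mean = sum(k * p for k, p in zip(ks, probs))
    variance = sum((k - mean) ** 2 * p for k, p in zip(ks, probs))
    return mean, variance

# Hypothetical expected counts, chosen only for illustration.
results = {mu: poisson_mean_and_variance(mu) for mu in (1.0, 5.0, 20.0)}

for mu, (mean, variance) in results.items():
    print(f"expected count {mu}: mean={mean:.4f}, variance={variance:.4f}")
```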

Another common approach is to collapse the counts into two categories (no eggs hatched vs. at least one hatched) or into a few ordered categories (no eggs hatched, 1-2 eggs hatched, 3 or more eggs hatched), then run a binary or ordinal Logistic Regression Model.  This can work, but it throws away real information and often lowers power.

As it happens, count variables often follow a Poisson distribution, and can therefore be modeled with a Poisson Regression Model.  Poisson Regression Models are similar to Logistic Regression in many ways: they both use Maximum Likelihood Estimation, and they both relate the predictors to the mean of the dependent variable through a link function rather than modeling it directly.  Anyone familiar with Logistic Regression will find the leap to Poisson Regression easy to handle.
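For readers who want to see what Maximum Likelihood Estimation looks like for this model, here is a minimal sketch. It fits a Poisson regression with an intercept and one predictor by Newton-Raphson, using only the Python standard library. The data and the function name `fit_poisson` are hypothetical, and in practice you would use your statistical software's built-in routine rather than code like this.

```python
import math

# Hypothetical data: one predictor x and a count outcome y.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0, 1, 0, 2, 3, 2, 5, 8]

def fit_poisson(x, y, max_iter=50, tol=1e-10):
    """Newton-Raphson for the model log(mu_i) = b0 + b1 * x_i."""
    b0 = math.log(sum(y) / len(y))  # start at the log of the overall mean
    b1 = 0.0
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix (negative Hessian), a 2x2 symmetric matrix.
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system for the update.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

b0, b1 = fit_poisson(x, y)
fitted = [math.exp(b0 + b1 * xi) for xi in x]

print(f"intercept (log scale): {b0:.4f}")
print(f"slope (log scale):     {b1:.4f}")
print("fitted counts:", [round(m, 2) for m in fitted])
```

Note that the fitted counts are always positive, because they are computed as exp(b0 + b1 * x).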

There are a few issues to keep in mind, though.

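1. The link function (the transformation applied to the expected value of Y) is the natural log.  So all parameter estimates are on the log scale and need to be exponentiated for interpretation: an exponentiated coefficient is the multiplicative change in the expected count for a one-unit increase in the predictor.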

2. It is often necessary to include an exposure or offset term in the model to account for how much opportunity each individual had for the event to occur.  A clutch with more eggs will have more opportunity for chicks to hatch.

3. One assumption of Poisson Models is that the mean and the variance are equal (conditional on the predictors), but this assumption is often violated, usually because the variance is larger than the mean (overdispersion). This can be dealt with by using a dispersion parameter if the difference is small or a negative binomial regression model if the difference is large.

4. Sometimes there are many, many more zeros than even a Poisson Model would indicate.  This generally means there are two processes going on–there is some threshold that needs to be crossed before an event can occur.  A Zero Inflated Poisson Model is a mixture model that simultaneously estimates the probability of crossing the threshold, and once crossed, how many events occur.
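The four issues above can be sketched with a few lines of arithmetic. Everything in this block is an illustration under stated assumptions: the coefficient values, exposures, and mixing probability are invented, and the helper names (`expected_count`, `pearson_dispersion`, `zip_prob_zero`) are hypothetical, not part of any package.

```python
import math

# Issue 1: coefficients are on the log scale, so exponentiate to interpret.
b0, b1 = -1.0, 0.25           # hypothetical estimates
rate_ratio = math.exp(b1)     # multiplicative change in expected count per unit of x

# Issue 2: an offset enters as log(exposure) with its coefficient fixed at 1,
# so log(mu) = log(exposure) + b0 + b1 * x.
def expected_count(exposure, b0, b1, x):
    return exposure * math.exp(b0 + b1 * x)

small_clutch = expected_count(5, b0, b1, x=2)    # 5 eggs at risk
large_clutch = expected_count(10, b0, b1, x=2)   # 10 eggs at risk

# Issue 3: a common overdispersion check is the Pearson chi-square
# divided by the residual degrees of freedom; values well above 1
# suggest the variance exceeds the mean.
def pearson_dispersion(observed, fitted, n_params):
    chi2 = sum((y - m) ** 2 / m for y, m in zip(observed, fitted))
    return chi2 / (len(observed) - n_params)

# Issue 4: in a Zero Inflated Poisson model, a zero can come from either
# process: never crossing the threshold (probability pi) or crossing it
# and still having zero events.
def zip_prob_zero(pi, mu):
    return pi + (1 - pi) * math.exp(-mu)

plain_zero = math.exp(-2.0)               # P(Y = 0) under Poisson with mean 2
inflated_zero = zip_prob_zero(0.3, 2.0)   # same mean, 30% structural zeros

print(f"rate ratio: {rate_ratio:.3f}")
print(f"expected hatched, 5 vs 10 eggs: {small_clutch:.3f} vs {large_clutch:.3f}")
print(f"P(0) Poisson: {plain_zero:.3f}, zero inflated: {inflated_zero:.3f}")
```

Doubling the exposure doubles the expected count while leaving the coefficients untouched, which is the whole point of the offset.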

Poisson and Negative Binomial Regression for Count Data
Learn when you need to use Poisson or Negative Binomial Regression in your analysis, how to interpret the results, and how they differ from similar models.

Tagged With: Count data, Least Squares Regression, logistic regression, Negative Binomial Regression, Poisson Regression, Zero Inflated

Reader Interactions

Comments

  1. Francois BOCQUIER says

    December 19, 2017 at 11:24 am

    Laura

    https://www.theanalysisfactor.com/poisson-regression-analysis-for-count-data/

    Thank you so much for your tutorials, I cannot reach the above link, could you send it ?
    Best regards,
    Francois

    • Karen says

      January 2, 2018 at 1:19 pm

      Please try http://thecraftofstatisticalanalysis.com/poisson-negative-binomial-regression-models

  2. Laura says

    January 21, 2016 at 2:11 pm

    https://www.theanalysisfactor.com/poisson-regression-analysis-for-count-data/

    “This page contains a link that is broken. By search Poisson on the Institute the topic doesn’t even appear. Can you help me locate the webinar?
    If you’d like to learn more about the different models available for Count data, you can download a free recording of the webinar: Poisson and Negative Binomial Regression for Count Data. It’s free.”

    • Karen says

      August 16, 2016 at 3:29 pm

      Hey Laura,

      You should be able to download the webinar here. If you still have trouble, just shoot us an email at support@analysisfactor.com and we’ll help you out.


Copyright © 2008–2022 The Analysis Factor, LLC. All rights reserved.