Poisson and Negative Binomial Regression Models

Poisson or Negative Binomial? Using Count Model Diagnostics to Select a Model

March 19th, 2018 by Jeff Meyer

How do you choose between Poisson and negative binomial models for discrete count outcomes?

One key criterion is the size of the variance relative to the mean after accounting for the effect of the predictors. A previous article discussed the concept of a variance that is larger than the model assumes: overdispersion.

(Underdispersion is also possible, but much less common.)
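
To make the variance-to-mean comparison concrete, here is a minimal sketch of one widely used diagnostic, the Pearson dispersion statistic (the Pearson chi-square divided by the residual degrees of freedom). It is not necessarily either of the checks the article goes on to describe. The counts are made up for illustration, and the sketch assumes an intercept-only Poisson model, so every fitted mean equals the sample mean.

```python
def pearson_dispersion(counts, fitted_means, n_params):
    """Pearson chi-square divided by residual degrees of freedom."""
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted_means))
    return chi2 / (len(counts) - n_params)


# Hypothetical counts, for illustration only.
counts = [0, 1, 1, 2, 3, 5, 8, 12]

# Assumption: intercept-only Poisson fit, so each fitted mean is the sample mean.
sample_mean = sum(counts) / len(counts)
fitted = [sample_mean] * len(counts)

dispersion = pearson_dispersion(counts, fitted, n_params=1)
print(round(dispersion, 2))
```

Under a Poisson model this statistic should be close to 1. A value well above 1, as with these made-up counts, points toward overdispersion and a negative binomial model.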

There are two ways to check for overdispersion: (more…)

Getting Accurate Predicted Counts When There Are No Zeros in the Data

March 12th, 2018 by Jeff Meyer

We previously examined why linear regression and negative binomial regression were not viable models for predicting the expected length of stay in the hospital for people with the flu. A linear regression model was not appropriate because our outcome variable, length of stay, was discrete, not continuous.

A negative binomial model wasn’t the proper choice because the minimum length of stay is one day, not zero. Negative binomial and Poisson models can only be used on data where each observation’s outcome could possibly be a zero count.

We need to use a truncated negative binomial model to analyze the expected length of stay of people admitted to the hospital who have the flu. Calculating the expected length of stay is an easy task once we create our model. (more…)
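
As a rough sketch of what that calculation involves, the code below computes the expected count from a zero-truncated negative binomial model. It assumes the common NB2 parameterization (variance = mu + alpha * mu^2). Here `mu` stands for the mean of the untruncated distribution implied by the fitted coefficients and `alpha` for the dispersion parameter. Both values are hypothetical, and the function names are illustrative rather than taken from any statistical package.

```python
def nb2_prob_zero(mu, alpha):
    """P(Y = 0) for a negative binomial (NB2) with mean mu and dispersion alpha."""
    return (1.0 + alpha * mu) ** (-1.0 / alpha)


def truncated_mean(mu, alpha):
    """Expected count given Y > 0: the untruncated mean rescaled by P(Y > 0)."""
    return mu / (1.0 - nb2_prob_zero(mu, alpha))


# Hypothetical fitted values, for illustration only.
mu, alpha = 2.0, 0.5
print(nb2_prob_zero(mu, alpha))             # probability mass removed by truncation
print(round(truncated_mean(mu, alpha), 3))  # expected length of stay, in days
```

Because zeros are excluded, the truncated mean is always larger than `mu`. Reporting `mu` itself as the expected length of stay would understate it.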

The Problem with Linear Regression for Count Data

February 26th, 2018 by Jeff Meyer

Imagine this scenario:

This year’s flu strain is very vigorous. The number of people checking in at hospitals is rapidly increasing. Hospitals are desperate to know if they have enough beds to handle those who need their help.

You have been asked to analyze a previous year’s length-of-stay data for people who were admitted to the hospital with the flu. The predictors in your data set are the age group, gender and race of those admitted. You also have an indicator that signifies whether the hospital was privately or publicly run.

(more…)

Member Training: Making Sense of Statistical Distributions

August 1st, 2017 by guest contributor

Many who work with statistics are already functionally familiar with the normal distribution, and maybe even the binomial distribution.

These common distributions are helpful in many applications, but what happens when they just don’t work?

This webinar will cover a number of statistical distributions, including the:

Poisson and negative binomial distributions (especially useful for count data)
Multinomial distribution (for responses with more than two categories)
Beta distribution (for continuous percentages)
Gamma distribution (for right-skewed continuous data)
Bernoulli and binomial distributions (for probabilities and proportions)
And more!

We’ll also explore the relationships among statistical distributions, including those you may already use, like the normal, t, chi-squared, and F distributions.

Note: This training is an exclusive benefit to members of the Statistically Speaking Membership Program and part of the Stat’s Amore Trainings Series. Each Stat’s Amore Training is approximately 90 minutes long.

(more…)

What Are Nested Models?

July 28th, 2017 by Karen Grace-Martin

Pretty much all of the common statistical models we use, with the exception of OLS Linear Models, use Maximum Likelihood estimation.

This includes favorites like:

All Generalized Linear Models, including logistic, probit, beta, Poisson, and negative binomial regression
Linear Mixed Models
Generalized Linear Mixed Models
Parametric Survival Analysis models, like Weibull models
Structural Equation Models

That’s a lot of models.

If you’ve ever learned any of these, you’ve heard that some of the statistics that compare model fit in competing models require (more…)

Analyzing Zero-Truncated Count Data: Length of Stay in the ICU for Flu Victims

January 9th, 2017 by Jeff Meyer

It’s that time of year: flu season.

Let’s imagine you have been asked to identify the factors that will help a hospital predict the length of stay in the intensive care unit (ICU) once a patient is admitted.

The hospital tells you that once a patient is admitted to the ICU, the patient has a day count of one. As soon as the stay reaches 24 hours plus 1 minute, the patient has stayed an additional day.

Clearly this is count data. There are no fractions, only whole numbers.
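
The counting rule can be sketched in a few lines. This is an illustration only: it assumes stays are recorded in whole minutes and that the same rule repeats for every further 24-hour period, which the hospital’s description implies but does not spell out. The function name is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def icu_day_count(minutes_in_icu):
    """Day count under the rule above: admission counts as day one,
    and each started 24-hour period after that adds a day."""
    if minutes_in_icu < 0:
        raise ValueError("minutes_in_icu must be non-negative")
    # Integer ceiling division; max() keeps the count at one on admission.
    return max(1, (minutes_in_icu + MINUTES_PER_DAY - 1) // MINUTES_PER_DAY)


print(icu_day_count(0))            # just admitted
print(icu_day_count(24 * 60))      # exactly 24 hours
print(icu_day_count(24 * 60 + 1))  # 24 hours plus 1 minute
```

The smallest value the function can return is one, never zero, which is what makes the outcome zero-truncated.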

To help us explore this analysis, let’s look at real data from the State of Illinois. We know the patients’ ages, gender, race and type of hospital (state vs. private).

A partial frequency distribution looks like this: (more…)
