The Analysis Factor

Statistical Consulting, Resources, and Statistics Workshops for Researchers

Count vs. Continuous Variables: Differences Under the Hood

by Jeff Meyer, MBA, MPA

One of the most important concepts in data analysis is that the analysis needs to be appropriate for the scale of measurement of the variable. Decisions about scale tend to focus on levels of measurement: nominal, ordinal, interval, and ratio.

These levels of measurement tell you about the amount of information in the variable. But there are other ways of distinguishing the scales that are also important and often overlooked.

For example, ratio-level variables can be of two types: continuous and discrete. Continuous variables can take on any value within an interval of the number line, whereas discrete variables can take on only separate, countable values, most commonly integers.

This is important in statistics because we measure the probabilities differently for discrete and continuous distributions.

The probability of each value of a discrete random variable is described through a probability distribution. At its simplest, this is a list of all the values in the set and the probability of each value occurring.

But the probabilities of many discrete random variables follow patterns that can be described with a mathematical function. This function is called a probability mass function (pmf).

As you can imagine, there are numerous discrete probability distributions. A few you may have heard of are the Bernoulli, binomial, hypergeometric, discrete uniform, and Poisson distributions. There are explicit criteria that determine which probability distribution is appropriate for a specific discrete random variable.
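To make the idea of a pmf concrete, here is a minimal sketch of the Poisson pmf, P(X = k) = lam^k * exp(-lam) / k!, written with only Python's standard library. The function name `poisson_pmf` and the rate of 2 are illustrative choices, not something taken from the original post.

```python
import math

def poisson_pmf(k, lam):
    """Probability that a Poisson(lam) variable equals the non-negative integer k."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

lam = 2.0  # illustrative rate (the Poisson mean); an assumption for this sketch
probs = [poisson_pmf(k, lam) for k in range(60)]

# Print the probability of each of the first few values.
for k in range(9):
    print(f"P(X = {k}) = {probs[k]:.4f}")

# A pmf assigns a probability to every non-negative integer, and those
# probabilities sum to 1 (the tail beyond k = 59 is negligible for lam = 2).
total = sum(probs)
print(f"sum of probabilities = {total:.6f}")
```

The function plays the role of the "list of all the values and the probability of each value" described above, except that the list is generated by a formula rather than written out by hand.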

Count data are a good example. A count variable is discrete because it consists of non-negative integers. Even so, there is not one specific probability distribution that fits all count data sets.

The Poisson Distribution

The Poisson distribution often fits count data. It fits well when the mean of the variable is equal to its variance. So how do you determine that?

Run a summary of your variable in your statistical software package and compare the mean to the variance. If the standard deviation is listed instead of the variance, just square the standard deviation. If they are almost equal, then that’s a good sign.
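As a rough sketch of that check, the snippet below computes the mean and the variance (by squaring the standard deviation) of a small, made-up set of counts. The data and the helper name `dispersion_summary` are hypothetical; in practice you would read these numbers off your own software's summary output.

```python
import statistics

def dispersion_summary(counts):
    """Return (mean, variance, variance / mean) for a list of counts.

    Assumes the mean is greater than zero.
    """
    mean = statistics.mean(counts)
    sd = statistics.stdev(counts)   # sample standard deviation
    variance = sd ** 2              # square the SD to recover the variance
    return mean, variance, variance / mean

# Hypothetical count data, invented for illustration only.
counts = [0, 1, 2, 2, 3, 1, 4, 2, 0, 3, 2, 1, 5, 2, 3, 1, 2, 0, 4, 2]

mean, variance, ratio = dispersion_summary(counts)
print(f"mean = {mean:.3f}, variance = {variance:.3f}, ratio = {ratio:.3f}")
```

A ratio near 1 is consistent with a Poisson distribution; a ratio well above 1 suggests the variance exceeds the mean.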

You can also run a Q-Q plot of your data against a Poisson distribution. Although Q-Q plots are most often used to check normality, they can compare data against any of a number of distributions.
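The numbers behind such a plot can be sketched without any plotting library: sort the observed counts and pair each one with the Poisson quantile at the matching plotting position. This is only one way to build the comparison; the helper names, the (i + 0.5) / n plotting positions, and the sample data are assumptions made for this illustration.

```python
import math

def poisson_ppf(q, lam):
    """Smallest integer k with P(X <= k) >= q for X ~ Poisson(lam), 0 < q < 1."""
    k = 0
    term = math.exp(-lam)            # P(X = 0)
    cdf = term
    while cdf < q and term > 0:
        k += 1
        term *= lam / k              # P(X = k) obtained from P(X = k - 1)
        cdf += term
    return k

def poisson_qq_pairs(counts):
    """Pair theoretical Poisson quantiles with the sorted observed counts."""
    n = len(counts)
    lam = sum(counts) / n            # estimate the Poisson mean from the data
    observed = sorted(counts)
    theoretical = [poisson_ppf((i + 0.5) / n, lam) for i in range(n)]
    return list(zip(theoretical, observed))

# Hypothetical count data, invented for illustration only.
counts = [0, 1, 2, 2, 3, 1, 4, 2, 0, 3, 2, 1, 5, 2, 3, 1, 2, 0, 4, 2]

pairs = poisson_qq_pairs(counts)
for theoretical, observed in pairs:
    print(theoretical, observed)
```

If the data follow a Poisson distribution, plotting these pairs should give points close to the line y = x.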

But many count variables fail these tests.

Below are two graphs generated with Poisson and negative binomial probability distribution functions. Each has 5,000 observations. The mean of the Poisson data is 2, the variance is 1.99, and the range is from 0 to 8. The mean of the negative binomial data is 2, the variance is 4.16, and the range is from 0 to 15.

The negative binomial distribution contains an extra parameter that allows the variance to be greater than the mean. If you tried to fit a data set with that mean and variance to a Poisson distribution, the data would be considered overdispersed relative to the Poisson, which is a sign of poor fit.

Notice that there is a probability for each non-negative value on the x axis, beginning with zero. A Poisson or negative binomial random number generator will only create non-negative integers.
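A comparison along these lines can be simulated with the sketch below. It is not the code used for the graphs in this post: the samplers, the seed, and the shape parameter of 2 (which gives a theoretical variance of 2 + 2²/2 = 4) are assumptions chosen to produce a similar pattern, and the sample statistics will not match the post's figures exactly.

```python
import math
import random
import statistics

def sample_poisson(rng, lam):
    """Draw one Poisson(lam) value with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k = 0
    product = rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k

def sample_negative_binomial(rng, mean, shape):
    """Draw one negative binomial value as a gamma-Poisson mixture.

    The variance is mean + mean**2 / shape, so it always exceeds the mean.
    """
    lam = rng.gammavariate(shape, mean / shape)  # gamma rate with the requested mean
    return sample_poisson(rng, lam)

rng = random.Random(2021)  # arbitrary seed so the run is repeatable
n = 5000
poisson_draws = [sample_poisson(rng, 2.0) for _ in range(n)]
negbin_draws = [sample_negative_binomial(rng, 2.0, 2.0) for _ in range(n)]

for name, draws in (("Poisson", poisson_draws),
                    ("negative binomial", negbin_draws)):
    print(f"{name}: mean={statistics.mean(draws):.2f} "
          f"variance={statistics.variance(draws):.2f} "
          f"range={min(draws)} to {max(draws)}")
```

Both samples contain only non-negative integers, but the negative binomial sample spreads further to the right because its variance is larger than its mean.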

The Normal Distribution

If the mean of a Poisson or negative binomial variable is high enough, its distribution will be approximately symmetric and bell-shaped. It will look like a normal distribution, except for one key distinction: normal variables are truly continuous, not discrete. They can take on any real value.

As a result, there are an infinite number of possible values (2.30546 is a different value than 2.30547). It makes no sense to calculate the probability that X is any exact value in a continuous variable, because the probability of any single exact value is zero.

With continuous variables, the probability of a value falling within a range is calculated instead. For example, there is a 95% probability that a value from a normal distribution will fall within 1.96 standard deviations of the mean of that distribution.
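The 95% figure can be checked with Python's standard-library `statistics.NormalDist`. The second half of this sketch uses a normal distribution with mean 2 and variance 2 (an illustrative choice) to show how the probability of an interval shrinks toward zero as the interval narrows around a single value.

```python
import math
from statistics import NormalDist

# Probability that a standard normal value lies within 1.96 SDs of the mean.
z = NormalDist()  # mean 0, standard deviation 1
within = z.cdf(1.96) - z.cdf(-1.96)
print(f"P(-1.96 <= Z <= 1.96) = {within:.4f}")

# Normal distribution with mean 2 and variance 2 (so SD = sqrt(2)).
x = NormalDist(mu=2.0, sigma=math.sqrt(2.0))

# Probability of landing in ever narrower intervals starting at 2.30546.
interval_probs = {}
for width in (1.0, 0.01, 0.00001):
    interval_probs[width] = x.cdf(2.30546 + width) - x.cdf(2.30546)
    print(f"width {width}: probability = {interval_probs[width]:.8f}")
```

The narrower the interval, the smaller the probability, which is why probabilities for continuous variables are stated for ranges rather than for exact values.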

To show you the difference, I created a set of 5,000 random values from a normal distribution with a mean and variance of 2. The range of the data is -2.512433 to 7.461702. Included next to its graph is the graph of the Poisson variable with a mean and variance of 2.

Note the following:

  • The ranges differ a lot (values are less than zero for the continuous variable).

  • There is a large difference in the number of unique observations (4,999 for the continuous set and 9 for the discrete Poisson set).
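A comparison of this kind can be reproduced in spirit with the sketch below. It is not the author's code: the seed and the Poisson sampler are assumptions, so the exact range and the number of unique values will differ from the figures quoted above.

```python
import math
import random

def sample_poisson(rng, lam):
    """Draw one Poisson(lam) value with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k = 0
    product = rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k

rng = random.Random(2021)  # arbitrary seed so the run is repeatable
n = 5000

# Normal with mean 2 and variance 2, so the standard deviation is sqrt(2).
normal_draws = [rng.gauss(2.0, math.sqrt(2.0)) for _ in range(n)]
poisson_draws = [sample_poisson(rng, 2.0) for _ in range(n)]

for name, draws in (("normal", normal_draws), ("Poisson", poisson_draws)):
    print(f"{name}: min={min(draws)}, max={max(draws)}, "
          f"unique values={len(set(draws))}")
```

The normal sample includes negative values and almost every draw is distinct, while the Poisson sample is confined to a handful of non-negative integers.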

The take-away here?

Examine your outcome variable to determine whether it is discrete or continuous. If it is discrete, find the probability distribution function that best matches its make-up.

Jeff Meyer is a statistical consultant with The Analysis Factor, a stats mentor for Statistically Speaking membership, and a workshop instructor.
