The Analysis Factor

Statistical Consulting, Resources, and Statistics Workshops for Researchers

When Listwise Deletion works for Missing Data

by Karen Grace-Martin

You may have never heard of listwise deletion for missing data, but you’ve probably used it.

Listwise deletion means that any individual in a data set is deleted from an analysis if they’re missing data on any variable in the analysis.

It’s the default in most software packages.

Although the simplicity of it is a major advantage, it causes big problems in many missing data situations.

But not always.  If you happen to have one of the uncommon missing data situations in which listwise deletion doesn’t cause problems, it’s a reasonable solution.

You hear a lot about its problems because most data sets don’t fit two conditions that must hold for listwise deletion to work well.

So let’s talk about those two conditions and what the problems are when they’re not met.

When Listwise Deletion Works

  1. The Data are Missing Completely at Random

When the incomplete cases that are dropped differ from the complete cases still in the sample, then the carefully selected random sample is no longer reflective of the entire population.

You’ve now got a biased sample and biased results.  That’s not good.

You can’t trust those results to be reflective of the population.

But sometimes the cases with missing data are no different than the complete cases—they are a purely random subset of the data.  This is called Missing Completely at Random (MCAR).

If this holds, there won’t be any bias in analyses based on complete cases.
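
To make the contrast concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration (a hypothetical outcome with mean 50 and SD 10, a 30% MCAR dropout rate, and a non-MCAR rule in which high values are more likely to go missing); it is not drawn from any real data set.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical outcome: mean 50, SD 10 (illustrative values only).
n = 20_000
y = [random.gauss(50, 10) for _ in range(n)]

# MCAR: every case has the same 30% chance of being dropped.
mcar_kept = [v for v in y if random.random() > 0.30]

# Not MCAR: cases with high values are far more likely to be dropped
# (55% for values above 50, 5% otherwise).
not_mcar_kept = [v for v in y if random.random() > (0.55 if v > 50 else 0.05)]

full_mean = statistics.mean(y)
mcar_mean = statistics.mean(mcar_kept)
not_mcar_mean = statistics.mean(not_mcar_kept)

print(f"all cases:            {full_mean:.2f}")
print(f"complete, MCAR:       {mcar_mean:.2f}")
print(f"complete, not MCAR:   {not_mcar_mean:.2f}")
```

Under these assumptions the MCAR complete-case mean should sit very close to the full-sample mean, while the non-MCAR complete-case mean should be pulled noticeably downward, because the high values are the ones that went missing.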

  2. You have sufficient power anyway, even though you lost part of your data set

Dropping more than a few cases from a data set can have dramatic consequences for sample size.  Since statistical power is directly tied to sample size, losing one results in losing the other.

But listwise deletion doesn’t always drop so many cases that power suffers.  If the percentage of missing data is very small or your sample was very large to begin with, you may still have adequate power to detect meaningful effects.

There is one caveat here though.  It’s possible to have only a small percentage of observations missing overall, yet still lose a large part of the sample to listwise deletion.  This is the situation that’s most problematic for listwise deletion.

This happens when an analysis includes many variables, and each is missing for a few unique cases.  Say you have a data set with 200 observations and use 10 variables in a regression model.  If each variable is missing on the same 10 cases, you end up with 190 complete cases, 5% missing.  Not bad.

But if you have a different 10 cases missing on each variable, you will lose 100 cases (10 cases by 10 variables).  With only 5% of the data values missing, you end up with 100 complete cases: 50% of the sample is gone.  Not so good.
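
The arithmetic in this example can be checked with a short sketch. The helper name `complete_cases` and the row indices are hypothetical, chosen only to reproduce the two scenarios described above.

```python
n_obs, n_vars, n_missing_per_var = 200, 10, 10

def complete_cases(missing_rows_by_var, n_obs):
    """Count rows observed on every variable, i.e. rows kept by listwise deletion."""
    dropped = set().union(*missing_rows_by_var)
    return n_obs - len(dropped)

# Scenario A: the same 10 cases are missing on every variable.
same_rows = [set(range(10)) for _ in range(n_vars)]

# Scenario B: a different 10 cases are missing on each variable.
different_rows = [set(range(v * 10, v * 10 + 10)) for v in range(n_vars)]

# In both scenarios the share of missing data values is identical.
cell_missing_rate = n_vars * n_missing_per_var / (n_obs * n_vars)

complete_same = complete_cases(same_rows, n_obs)
complete_different = complete_cases(different_rows, n_obs)

print(f"share of values missing: {cell_missing_rate:.0%}")
print(f"complete cases, same rows missing:      {complete_same}")
print(f"complete cases, different rows missing: {complete_different}")
```

The share of missing values is the same in both scenarios; only the overlap of the missing cases across variables changes how much of the sample survives.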

How to Tell if Listwise Deletion is Reasonable

Before you just assume that listwise deletion is an adequate approach, it is important to establish that these two conditions are met.

Spend some time doing missing data diagnosis to understand patterns and randomness of missingness.  Like testing assumptions in linear models, there isn’t one definitive test to tell you if assumptions are met for listwise deletion.  It’s more an exercise in gathering evidence that assumptions aren’t clearly violated.
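
As one way to start gathering that evidence, the sketch below tabulates missingness patterns and compares a fully observed variable between complete and incomplete cases. The records and variable names (`age`, `income`, `score`) are hypothetical, and this is only one informal check, not a definitive test of MCAR.

```python
from collections import Counter
from statistics import mean

# Hypothetical records; None marks a missing value. "age" is fully observed.
rows = [
    {"age": 23, "income": 31,   "score": 3.4},
    {"age": 27, "income": 35,   "score": None},
    {"age": 34, "income": 42,   "score": 3.9},
    {"age": 41, "income": None, "score": 3.6},
    {"age": 45, "income": 58,   "score": 4.1},
    {"age": 52, "income": None, "score": 4.4},
    {"age": 58, "income": None, "score": None},
    {"age": 63, "income": 70,   "score": 4.0},
]
variables = ["age", "income", "score"]

def is_complete(row):
    """True if the row would survive listwise deletion."""
    return all(row[v] is not None for v in variables)

# Share of cases missing on each variable.
missing_rate = {v: sum(r[v] is None for r in rows) / len(rows) for v in variables}

# How often each missingness pattern occurs (True = missing), in variable order.
patterns = Counter(tuple(r[v] is None for v in variables) for r in rows)

complete = [r for r in rows if is_complete(r)]
incomplete = [r for r in rows if not is_complete(r)]

# A large gap on an observed variable is evidence against MCAR.
age_complete = mean(r["age"] for r in complete)
age_incomplete = mean(r["age"] for r in incomplete)

print(missing_rate)
print(patterns)
print(f"complete cases: {len(complete)} of {len(rows)}")
print(f"mean age, complete: {age_complete}, incomplete: {age_incomplete}")
```

If the two groups look similar on the observed variables, that is consistent with MCAR but does not prove it; a clear difference suggests the complete cases are not a random subset.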

And if either of these conditions is clearly violated, there are now other good ways to deal with missing data, including maximum likelihood and multiple imputation approaches.



Comments

  1. Jonathan Bartlett says

    May 10, 2014 at 5:17 am

    Hi Karen

    Just a small note to add regarding the missingness assumption required for listwise deletion (sometimes also called complete case analysis). As you wrote, if data are missing completely at random, it will be unbiased. When your analysis consists of fitting a regression model, it will also be unbiased provided missingness is independent of the outcome (dependent) variable, conditional on the covariates (independent variables). Under this condition the data need not be MCAR, and in certain setups can even be missing not at random, yet the listwise deletion analysis is still unbiased.

    I wrote a blog post on this last year, which may be useful:
    http://thestatsgeek.com/2013/07/06/when-is-complete-case-analysis-unbiased/
