The Analysis Factor

Statistical Consulting, Resources, and Statistics Workshops for Researchers

Answers to the Missing Data Quiz

by Karen Grace-Martin

In my last post, I gave a little quiz about missing data.  This post has the answers.

If you want to try it yourself before you see the answers, go here. (It’s a short quiz, but if you’re like me, you find testing yourself irresistible.)

True or False?

1. Imputation is really just making up data to artificially inflate results.  It’s better to just drop cases with missing data than to impute.

Answer: False!

Imputation has gotten a bad rap because early imputation methods, like mean imputation, bias your results pretty badly.  And single imputation underestimates standard errors.

But imputation has come a long way, baby!

Multiple imputation, when done well, gives pretty much the same unbiased results, with full power, as the full non-missing data set.
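One reason multiple imputation avoids the standard error problem of single imputation is the way results are combined across the imputed data sets. The sketch below shows the standard pooling formulas (Rubin's rules) for a single parameter. The function name `pool_rubin` and the three estimates and variances are made up for illustration; they are not from the quiz or from any particular software package.

```python
import statistics

def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed data sets using Rubin's rules.

    estimates: the m point estimates (e.g. one regression coefficient per data set)
    variances: the m squared standard errors of those estimates
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists with at least two imputations")
    pooled = statistics.fmean(estimates)      # combined point estimate
    within = statistics.fmean(variances)      # average sampling variance
    between = statistics.variance(estimates)  # variance across the imputations
    total = within + (1 + 1 / m) * between    # total variance of the pooled estimate
    return pooled, total ** 0.5

# Hypothetical results from m = 3 imputed data sets (illustrative numbers only)
estimates = [1.0, 1.2, 0.8]
variances = [0.04, 0.05, 0.03]

pooled_estimate, pooled_se = pool_rubin(estimates, variances)
single_se = variances[0] ** 0.5  # what one imputed data set alone would report

print(f"pooled estimate: {pooled_estimate:.3f}")
print(f"pooled SE:       {pooled_se:.3f}")
print(f"single SE:       {single_se:.3f}")
```

The between-imputation term is the uncertainty about the missing values themselves. A single imputed data set has no way to measure it, which is why its standard errors come out too small.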

2. I can just impute the mean for any missing data.  It won’t affect results, and improves power.

Answer: False!

As I just said, mean imputation is bad imputation.  It does improve power, but your results will be so biased, the improved power won’t help much.  Sure, your results might be significant, but they’re the wrong results!
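To see the bias for yourself, here is a minimal simulation sketch. It assumes two correlated normal variables and 40% of one of them missing completely at random; the sample size, seed, and effect sizes are arbitrary choices for illustration, not values from any real study.

```python
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation, written out so the sketch needs only the standard library."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(42)  # fixed seed so the run is repeatable
n = 5000

# Simulated "true" data: x and y are correlated at about 0.6
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.6 * xi + rng.gauss(0, 0.8) for xi in x]

# Knock out roughly 40% of x completely at random
missing = [rng.random() < 0.4 for _ in range(n)]

# Mean imputation: replace every missing x with the mean of the observed x values
observed_mean = statistics.fmean(xi for xi, m in zip(x, missing) if not m)
x_imputed = [observed_mean if m else xi for xi, m in zip(x, missing)]

var_full = statistics.variance(x)
var_imputed = statistics.variance(x_imputed)
corr_full = pearson(x, y)
corr_imputed = pearson(x_imputed, y)

print(f"variance of x:  full = {var_full:.2f}, mean-imputed = {var_imputed:.2f}")
print(f"corr(x, y):     full = {corr_full:.2f}, mean-imputed = {corr_imputed:.2f}")
```

Every imputed case sits exactly at the mean, so the variance of x shrinks and its correlation with y is pulled toward zero, even though the sample size looks complete.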

3. Multiple Imputation is fine for the predictor variables in a statistical model, but not for the response variable.

Answer: False!

It’s true that imputing the response doesn’t add any new information to your regression model.  But if you have missing data in the predictors as well, simultaneously imputing both response and predictors improves those predictor imputations.

4. Multiple Imputation is always the best way to deal with missing data.

Answer: False!

It often is, and it’s a good approach.  But it’s not always easy to do well, and it is a large-sample technique.

If you’re running a linear or log-linear model (like a regression or linear mixed model), maximum likelihood techniques give the same great, unbiased, uninflated, full power results that multiple imputation does.

But you don’t have to spend the time and resources imputing anything.

5. When imputing, it’s important that the imputations be plausible data points.

Answer: False!

It’s counter-intuitive, but it’s not actually important that imputations be plausible data points.  The important thing when imputing is that your parameter estimates–your means, regression coefficients, or whatever it is you’re using this data to estimate–be accurate.  Not the imputed data itself.

There are a number of situations, like imputing categorical data, where you actually get better parameter estimates when the imputed data itself aren’t plausible values.

6. Missing data isn’t really a problem if I’m just doing simple statistics, like chi-squares and t-tests.

Answer: False!

It’s not the analysis you’re doing, but the percent, pattern, and randomness of the missing data that determine how problematic missing data are.

Even simple statistics need to be accurate and unbiased.  How important is it that your results are correct?

7. The worst thing that missing data does is lower sample size and reduce power.

Answer: False!

The loss of power from listwise deletion–the default in most software–can be quite devastating.

But even worse are the other two effects of missing data: biased parameter estimates and biased standard errors.  They, in essence, make your results, including p-values, wrong.

And they’re worse than low power because you can’t tell they’re wrong.  If you lose half your sample and have no significant results, you notice.  If the regression coefficients or standard errors aren’t what they’re supposed to be, there’s no way to tell.

That makes it worse in my book.
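Here is a small simulation sketch of that silent bias. It assumes a response y that is much more likely to be missing when a predictor x is high; the 0.5 cutoff, the 80% missingness rate, the seed, and the sample size are all invented for illustration.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable
n = 5000

# Simulated "true" data: y depends on x, and the true mean of y is 0
x = [rng.gauss(0, 1) for _ in range(n)]
y = [xi + rng.gauss(0, 1) for xi in x]

# y is far more likely to be missing when x is high (not missing completely at random)
y_observed = [
    None if (xi > 0.5 and rng.random() < 0.8) else yi
    for xi, yi in zip(x, y)
]

# Listwise deletion: keep only the cases where y was observed
complete_cases = [yi for yi in y_observed if yi is not None]

full_mean = statistics.fmean(y)
cc_mean = statistics.fmean(complete_cases)

print(f"cases kept:              {len(complete_cases)} of {n}")
print(f"mean of y, full data:    {full_mean:.2f}")
print(f"mean of y, complete-case: {cc_mean:.2f}")
```

In real data you only ever see the complete-case number. Most of the sample is still there and nothing in the output flags a problem, yet the estimated mean sits well below the true one.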

—————————————————————————————————–

How did you do?  (BTW, it took me years of seminars, reading, and trying things out to figure this all out.)

Comments

  1. Conrad Zygmont says

    May 11, 2011 at 4:56 am

    Hi Karen,

    I am very excited to see that you will be tackling this important and often ignored or obfuscated topic. I am very interested in robust alternatives to ML procedures, and will be watching with interest to see how the multiple imputation procedures you address deal with various data that violate the normal distribution assumptions.

    Thank you for your great work and service to those navigating the space of statistical knowledge and uncertainty.

    Kind Regards,
    Conrad

  2. Anick Lamarche says

    May 16, 2010 at 9:50 pm

    Hi

    I was wondering if there will be other “missing data” seminars upcoming (in the near future). I just missed the May 6th workshop and this sounds like what I really need!

    Anick

    • Karen says

      May 20, 2010 at 3:11 pm

      Hi Anick,

      I try to do each of the workshops once a year. Once I’ve done each one a few times, I may be able to do them more often, but at this point each one requires a big effort. So we’re looking at spring.

      However, one of my goals for the summer is to create a home study version of the workshop based on the recordings and transcripts. So keep an eye out for that. If you’ve signed up for our newsletter mailing list, you’ll get an announcement once it’s ready.

      Karen

