Missing Data: Criteria for Choosing an Effective Approach

by Karen


In choosing an approach to missing data, there are a number of things to consider.  But you need to keep in mind what you’re aiming for before you can even consider which approach to take.

There are three criteria we’re aiming for with any missing data technique:

1. Unbiased parameter estimates:  Whether you’re estimating means, regressions, or odds ratios, you want your parameter estimates to be accurate representations of the actual population parameters.  In statistical terms, that means the estimates should be unbiased.  If all the assumptions of your statistical test are met, the sample is randomly selected, and no data are missing, you can be confident that estimates are unbiased.  But missing data (and many of the solutions for it) can mess with that nice property.

2. Adequate power: Case deletion–dropping cases with missing data–lowers sample size, and therefore lowers power.  At least theoretically.  However, if you don’t have a problem with bias, and you are getting significant results, you have adequate power to detect your effects.  End of story.

3. Accurate standard errors, and therefore p-values and confidence intervals: In the world of statistical inference, not just description, we need not only the parameter estimates to be accurate, but the standard errors of those estimates as well.  Many approaches to missing data, such as single imputation of any type, underestimate standard errors.  This means p-values are too small, confidence intervals are too narrow, and you, the researcher, end up claiming effects that really aren’t there.
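
To make the third criterion concrete, here is a minimal sketch of how mean imputation shrinks the standard error of a mean. Everything in it is an assumption made for illustration: the simulated normal data, the 40% missing rate, the seed, and the function names `standard_error` and `mean_imputation_demo` are not from any particular study or package. Values are deleted completely at random, so the comparison is only about standard errors, not bias.

```python
import math
import random
import statistics


def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def mean_imputation_demo(n=200, missing_rate=0.4, seed=1):
    """Compare complete-case and mean-imputed standard errors (illustrative only)."""
    rng = random.Random(seed)
    # Hypothetical data: n draws from a normal distribution (mean 50, SD 10).
    full = [rng.gauss(50, 10) for _ in range(n)]
    # Delete values completely at random; None marks a missing value.
    observed = [x if rng.random() > missing_rate else None for x in full]

    # Case deletion: analyze only the observed values.
    complete_cases = [x for x in observed if x is not None]

    # Single (mean) imputation: fill every hole with the observed mean.
    fill = statistics.mean(complete_cases)
    imputed = [fill if x is None else x for x in observed]

    return {
        "n_complete": len(complete_cases),
        "mean_complete": fill,
        "mean_imputed": statistics.mean(imputed),
        "se_complete": standard_error(complete_cases),
        "se_imputed": standard_error(imputed),
    }


result = mean_imputation_demo()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The two means come out the same, but the imputed standard error is smaller. The filled-in values add no spread while the analysis treats the sample as if all 200 values had been observed, so the data look more precise than they really are.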


Modern approaches like Multiple Imputation and Full Information Maximum Likelihood meet all three criteria for many missing data problems.  But simpler techniques can adequately meet them as well in specific situations.

Want to learn more about how to handle Missing Data? In this 5-part workshop, you’ll learn all about two fabulous modern missing data techniques: multiple imputation and maximum likelihood.

{ 2 comments… read them below or add one }

Andreas December 7, 2009 at 7:28 am

I would like to thank you for the information and help I got understanding and utilizing the EM-imputation technique rather than the mean-replace it seems my department uses. Not even my professors had heard of this technique, and it made me embarrassed to hear that they still thought mean replace was the bees’ knees.

Again, thanks for great information that really helped out my master-thesis in org-psy!


Karen December 7, 2009 at 10:59 am

Hi Andreas,

I like “bees’ knees.” I’ll have to use that. 🙂

And mean imputation is still pretty widespread. There are a LOT of researchers that still haven’t heard about the newer and better missing data techniques. So thanks for helping me spread the word.

– Karen


