The Second Problem with Mean Imputation

by Karen


A previous post discussed the first reason not to use mean imputation as a way of dealing with missing data: it does not preserve the relationships among variables.

A second reason is that any type of single imputation underestimates error variation in any statistic that uses the imputed data.  Because the imputations are themselves estimates, there is some error associated with them.  But your statistical software doesn’t know that.  It treats them as real data.
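To see the shrinkage in action, here is a minimal sketch using only Python's standard library. The sample size, the 40% missingness rate, the seed, and the variable names are all made up for illustration. It fills the holes with the observed mean and then computes the standard error of the mean the way software would, treating every filled-in value as a real observation.

```python
import math
import random
import statistics

random.seed(0)  # arbitrary seed, only so the run is repeatable

n = 200  # hypothetical sample size
full = [random.gauss(50, 10) for _ in range(n)]

# Delete roughly 40% of the values completely at random (an assumption).
observed = [x for x in full if random.random() > 0.4]
n_missing = n - len(observed)

# Mean imputation: every missing value is replaced by the observed mean.
obs_mean = statistics.mean(observed)
imputed = observed + [obs_mean] * n_missing

# Standard error of the mean from the values we actually observed...
se_observed = statistics.stdev(observed) / math.sqrt(len(observed))
# ...and from the imputed data, as software would compute it.
se_imputed = statistics.stdev(imputed) / math.sqrt(len(imputed))

print(f"observed cases: {len(observed)}, imputed cases: {n_missing}")
print(f"SE from observed values only: {se_observed:.3f}")
print(f"SE after mean imputation:     {se_imputed:.3f}")
```

The imputed values sit exactly at the mean, so they add nothing to the sum of squared deviations while still inflating the sample size. Both effects push the standard error down, even though no new information was collected.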

Ultimately, because your standard errors are too low, so are your p-values.  Now you’re making more Type I errors than your alpha level says you should, without realizing it.
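A small simulation can make the Type I error problem concrete. The sketch below is illustrative only: the sample size, missingness rate, number of simulated data sets, and the helper name `rejects` are assumptions, and it uses a normal critical value of 1.96 as an approximation to a one-sample t-test. The null hypothesis (mean of zero) is true in every simulated data set, so every rejection is a Type I error. The analysis of observed cases only serves here as a simple benchmark, which is reasonable in this setup because the values are deleted completely at random.

```python
import math
import random
import statistics


def rejects(sample, crit=1.96):
    """Approximate two-sided test of H0: mean = 0 at the 5% level."""
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return abs(statistics.mean(sample) / se) > crit


random.seed(1)  # arbitrary seed, only so the run is repeatable
n, n_sims, p_missing = 100, 2000, 0.4  # hypothetical settings

naive_hits = 0   # rejections after mean imputation
honest_hits = 0  # rejections using the observed cases only
for _ in range(n_sims):
    full = [random.gauss(0, 1) for _ in range(n)]  # the null is true
    observed = [x for x in full if random.random() > p_missing]
    imputed = observed + [statistics.mean(observed)] * (n - len(observed))
    naive_hits += rejects(imputed)
    honest_hits += rejects(observed)

naive_rate = naive_hits / n_sims
honest_rate = honest_hits / n_sims
print(f"Type I error rate, mean-imputed data:   {naive_rate:.3f}")
print(f"Type I error rate, observed cases only: {honest_rate:.3f}")
```

With settings like these, the rejection rate for the mean-imputed data should land well above the nominal 5%, while the observed-cases rate should stay close to it.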

A better approach?  Multiple Imputation or Full Information Maximum Likelihood.
