Missing Data

I don’t need to tell you, missing data stinks. After getting stuck on a big problem with missing data many years ago, I started studying what to do about it in a big way.

Answers to Questions from the Missing Data Webinar

These questions were originally asked in a live webinar. We didn’t get through all the questions, so I’m answering many of them in this series. If you want to listen to the full webinar, you can get the recording on this page. It’s free.

Online Workshops

Approaches to Dealing with Missing Data


Missing Data in the Context of Data Analysis


Missing Data

by Paul Allison

Very reader-friendly. One of “the little green Sage books.” This is an excellent overview that covers much of what a data analyst needs to know, and it is very accessible. This is the book to start with. And it is very reasonably priced.

Analysis of Incomplete Multivariate Data

by Joseph Schafer

This book is the basis for Joe’s series of multiple imputation programs in S-Plus. It is somewhat more readable than Little & Rubin (below).

Statistical Analysis with Missing Data, Second Edition

by Roderick Little & Donald Rubin

This is the Missing Data Bible. It can get pretty technical at times, but can be worth working through.

Journal Articles

  • Allison, P.D. (2000). Multiple Imputation for Missing Data: A Cautionary Tale. Sociological Methods and Research, 28, 301-309.
  • Allison, P.D. (1987). Estimation of linear models with incomplete data. In C. Clogg (Ed.), Sociological Methodology. San Francisco: Jossey-Bass, 71-103.
  • Graham, J. W., & Hofer, S. M. (2000). Multiple imputation in multivariate research. In T. D. Little, K. U. Schnabel, & J. Baumert (Eds.), Modeling longitudinal and multiple-group data: Practical issues, applied approaches, and specific examples. Mahwah, NJ: Erlbaum. This chapter is a very user-friendly description of the use of Joe Schafer’s NORM program, with an illustrative empirical example. (Also see Schafer & Olsen, below, for the same kind of information.)
  • Graham, J. W., Hofer, S.M., Donaldson, S.I., MacKinnon, D.P., & Schafer, J.L. (1997). Analysis with missing data in prevention research. In K. Bryant, M. Windle, & S. West (Eds.), The science of prevention: Methodological advances from alcohol and substance abuse research (pp. 325-366). Washington, D.C.: American Psychological Association. In the context of an empirical example, this chapter discusses and illustrates the pros and cons of four acceptable and readily available methods: (a) raw data maximum likelihood with Amos; (b) multiple imputation with NORM; (c) multiple imputation with EMCOV; and (d) EM algorithm (with EMCOV) and bootstrap. We show how the following “old” methods fall far short of a desirable treatment of missing data: listwise deletion, pairwise deletion, and mean substitution.
  • Graham, J.W. & Donaldson, S.I. (1993). Evaluating interventions with differential attrition: The importance of nonresponse mechanisms and use of follow-up data. Journal of Applied Psychology, 78, 119-128.
  • Horton, N. J. & Lipsitz, S.R. (2001). Multiple Imputation in Practice: Comparison of Software Packages for Regression Models with Missing Variables. The American Statistician, 55, 244-254.
  • Horton, N.J. & Kleinman, K.P. (2007). Much Ado about Nothing: A Comparison of Missing Data Methods and Software to Fit Incomplete Data Regression Models. The American Statistician, 61, 79-90.
  • Muthén, B.O., Kaplan, D., & Hollis, M. (1987). On structural equation modeling with data that are not missing completely at random. Psychometrika, 52, 431-462.
  • Wothke, W. (2000). Longitudinal and multi-group modeling with missing data. (Adobe PDF format.) In T.D. Little, K.U. Schnabel, & J. Baumert (Eds.), Modeling longitudinal and multiple group data: Practical issues, applied approaches and specific examples. Mahwah, NJ: Lawrence Erlbaum Associates. (Reproduced with permission.)
  • Schafer, J.L. & Graham, J.W. (2002). Missing Data: Our View of the State of the Art. Psychological Methods, 7, 147-177. This is a very well-written overview of the new approaches to dealing with missing data. Joe Schafer is one of the top statisticians doing research on missing data techniques, and John Graham runs the statistical consulting center at Penn State. Together they explain these new techniques in understandable ways.

Web Sources