Missing Data

Missing Data Diagnosis in Stata: Investigating Missing Data in Regression Models

January 4th, 2016 by

In the last post, we examined how to use the same sample when running a set of regression models with different predictors.

Adding a predictor with missing data causes cases that had been included in previous models to be dropped from the new model.

Using different samples in different models can lead to very different conclusions when interpreting results.
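To make the problem concrete, here is a minimal Python sketch (not the Stata workflow from the post). The data, the variable names (`outcome`, `attend`, `income`), and the helper `has_all` are all made up for illustration. It counts how many cases each model would use and compares the outcome for cases that survive the added predictor against those that get dropped.

```python
# Hypothetical toy data; None marks a missing value.
rows = [
    {"id": 1, "outcome": 10.0, "attend": 1, "income": 50.0},
    {"id": 2, "outcome": 12.0, "attend": 0, "income": None},
    {"id": 3, "outcome": 9.0, "attend": 1, "income": 42.0},
    {"id": 4, "outcome": 15.0, "attend": 0, "income": None},
    {"id": 5, "outcome": 11.0, "attend": None, "income": 38.0},
]


def has_all(row, variables):
    """True if the row has a non-missing value on every listed variable."""
    return all(row[v] is not None for v in variables)


def mean(values):
    return sum(values) / len(values)


# Assumed predictor sets: the second model adds income.
model3_vars = ["outcome", "attend"]
model4_vars = ["outcome", "attend", "income"]

model3_sample = [r for r in rows if has_all(r, model3_vars)]
model4_sample = [r for r in rows if has_all(r, model4_vars)]

# Cases used by the smaller model but lost once income is added.
dropped = [r for r in model3_sample if not has_all(r, model4_vars)]

n3 = len(model3_sample)
n4 = len(model4_sample)
mean_kept = mean([r["outcome"] for r in model4_sample])
mean_dropped = mean([r["outcome"] for r in dropped])

print(f"model 3 n = {n3}, model 4 n = {n4}")
print(f"mean outcome, kept: {mean_kept}; dropped: {mean_dropped}")
```

If the dropped cases look quite different from the kept ones on the outcome, as they do in this toy data, that is a hint that the change in sample, and not only the new predictor, may be driving the change in coefficients.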

Let’s look at how to investigate the effect of the missing data on the regression models in Stata.

The coefficient for the variable “frequent religious attendance” was negative 58 in model 3 and then rose to a positive 6 in model 4 when income was included. Results (more…)


When Listwise Deletion Works for Missing Data

February 25th, 2014 by

You may have never heard of listwise deletion for missing data, but you’ve probably used it.

Listwise deletion means that any individual in a data set is deleted from an analysis if they’re missing data on any variable in the analysis.

It’s the default in most software packages.
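As a rough illustration of that definition, here is a short Python sketch. The function name `listwise_delete` and the toy variables `y`, `x1`, and `x2` are hypothetical, and `None` stands in for a missing value. Note that a case is only dropped for missing data on a variable that is actually in the analysis.

```python
def listwise_delete(rows, analysis_vars):
    """Keep only rows with no missing value (None) on any analysis variable."""
    return [
        row for row in rows
        if all(row.get(v) is not None for v in analysis_vars)
    ]


# Hypothetical toy data set.
data = [
    {"id": "a", "y": 3.1, "x1": 1.0, "x2": 7.0},
    {"id": "b", "y": None, "x1": 2.0, "x2": 5.0},
    {"id": "c", "y": 2.4, "x1": None, "x2": 6.0},
    {"id": "d", "y": 4.0, "x1": 3.0, "x2": None},
]

# Case "d" is missing only x2, which is not in this analysis, so it stays.
kept_ids = [r["id"] for r in listwise_delete(data, ["y", "x1"])]

# Adding x2 to the analysis removes case "d" as well.
kept_ids_with_x2 = [r["id"] for r in listwise_delete(data, ["y", "x1", "x2"])]

print(kept_ids, kept_ids_with_x2)
```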

Although the simplicity of it is a major advantage, it causes big problems in many missing data situations.

But not always.  If you happen to have one of the uncommon missing data situations in which (more…)


Two Recommended Solutions for Missing Data: Multiple Imputation and Maximum Likelihood

September 10th, 2012 by

Two methods for dealing with missing data, vast improvements over traditional approaches, have become available in mainstream statistical software in the last few years.

Both of the methods discussed here require that the data are missing at random, meaning that whether a value is missing is not related to the missing values themselves. If this assumption holds, the resulting estimates (i.e., regression coefficients and standard errors) will be unbiased with no loss of power.

The first method is Multiple Imputation (MI). Just like the old-fashioned imputation (more…)


Do Top Journals Require Reporting on Missing Data Techniques?

June 3rd, 2011 by

Q: Do most high impact journals require authors to state which method has been used on missing data?

I don’t usually get far enough in the publishing process to read journal requirements.

But based on my conversations with researchers who both review articles for journals and deal with reviewers’ comments on their own, I can offer this response.

I would be shocked if journal editors at top journals didn’t want information about the missing data technique.  If you leave it out, they’ll either assume you didn’t have missing data or are using defaults like listwise deletion. (more…)


Is Multiple Imputation Possible in the Context of Survival Analysis?

May 27th, 2011 by

Sure.  One of the big advantages of multiple imputation is that you can use it for any analysis.

It’s one of the reasons large data libraries use it: no matter how researchers are using the data, the missing data is handled the same way, and handled well.

I say this with two caveats. (more…)


Computing Cronbach’s Alpha in SPSS with Missing Data

July 16th, 2010 by

I recently received this question:

I have a scale which I want to run Cronbach’s alpha on. One response category for all items is ‘not applicable’. I want to run Cronbach’s alpha requiring that at least 50% of the items must be answered for the scale to be defined. Where this is the case, I want all missing values on that scale replaced by the average of the non-missing items on that scale. Is this reasonable? How would I do this in SPSS?
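To pin down exactly what the question is asking for, here is the rule sketched in Python. This is not SPSS syntax and not a recommendation. It assumes ‘not applicable’ has already been recoded to missing (`None`), and the function name `fill_scale_items` and its `min_answered` parameter are made up for illustration.

```python
def fill_scale_items(responses, min_answered=0.5):
    """Apply the rule from the question to one respondent's scale items.

    Returns the items with each missing value (None) replaced by the mean
    of that respondent's answered items, or None when fewer than
    `min_answered` of the items were answered (scale left undefined).
    """
    answered = [r for r in responses if r is not None]
    if not responses or len(answered) / len(responses) < min_answered:
        return None
    person_mean = sum(answered) / len(answered)
    return [person_mean if r is None else r for r in responses]


# Three of four items answered: the gap is filled with the mean of 4, 2, 3.
filled = fill_scale_items([4, None, 2, 3])

# Only one of four items answered: below 50%, so the scale is undefined.
undefined = fill_scale_items([5, None, None, None])

# Exactly half answered: meets the "at least 50%" threshold.
borderline = fill_scale_items([2, 4, None, None])

print(filled, undefined, borderline)
```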

My Answer:

In RELIABILITY, the SPSS command for running a Cronbach’s alpha, the only options for Missing Data (more…)