Computing Cronbach's Alpha in SPSS with Missing Data

I recently received this question:

I have a scale which I want to run Cronbach’s alpha on. One response category for all items is ‘not applicable’. I want to run Cronbach’s alpha requiring that at least 50% of the items must be answered for the scale to be defined. Where this is the case, I want all missing values on that scale replaced by the average of the non-missing items on that scale. Is this reasonable? How would I do this in SPSS?

My Answer:

In RELIABILITY, the SPSS command for running a Cronbach’s alpha, the only options for Missing Data are to include or exclude User-Defined missing data. And by exclude, they mean listwise deletion.
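To make listwise deletion concrete, here is a minimal Python sketch (not SPSS syntax). It drops every case that has any missing item and then computes alpha from the item variances and the variance of the total score. The function names and the tiny data set are made up for illustration, and `None` stands in for a missing or ‘not applicable’ response.

```python
from statistics import variance

def listwise_delete(rows):
    """Keep only cases with no missing (None) item scores."""
    return [r for r in rows if all(v is not None for v in r)]

def cronbach_alpha(rows):
    """Cronbach's alpha for complete cases (each row is one case)."""
    k = len(rows[0])  # number of items on the scale
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical 3-item scale, 6 respondents, 2 with a missing item.
data = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, None],
    [3, 3, 4],
    [1, 2, 1],
    [None, 4, 4],
]

complete = listwise_delete(data)
alpha = cronbach_alpha(complete)
print(len(complete), round(alpha, 2))
```

In this toy example a third of the respondents are dropped before alpha is computed, even though each of them answered two of the three items.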

So the only way to include cases with at least 50% observed data would be to impute their missing values in a separate step before you run the reliability analysis.
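The 50% rule itself is just a count of answered items per case. Here is a small Python sketch of that selection step, assuming missing or ‘not applicable’ responses are coded as `None`; the function names and data are hypothetical.

```python
def answered_fraction(row):
    """Share of items in this case that have a non-missing response."""
    return sum(v is not None for v in row) / len(row)

def eligible_cases(rows, min_fraction=0.5):
    """Cases with at least min_fraction of their items answered."""
    return [r for r in rows if answered_fraction(r) >= min_fraction]

# Hypothetical 4-item scale.
data = [
    [4, None, 3, 5],        # 75% answered: keep
    [None, None, 2, 4],     # 50% answered: keep
    [None, None, None, 1],  # 25% answered: drop
    [3, 3, 4, 4],           # 100% answered: keep
]

kept = eligible_cases(data)
print(len(kept))
```

Only the cases that pass this filter would then go on to the imputation step.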

And while you could impute the mean, I highly recommend you do not. While mean imputation maintains the mean of each separate variable, it does not maintain the relationships among variables.

In fact, with a lot of imputed values all sitting right at the mean, the correlations with other variables become much lower. And since scale reliability depends entirely on the correlations among the items in your scale, you will severely underestimate your scale reliability if you have more than a few cases with missing data.
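A deliberately extreme toy example shows the attenuation. In this Python sketch, two hypothetical items agree perfectly wherever both are observed, but filling the two missing values with the item mean pulls the correlation down sharply. The data and function names are invented for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

item_a = [1, 2, 3, 4, 5, 6]
item_b = [None, 2, 3, 4, 5, None]  # two missing responses

# Correlation using only cases where both items were answered.
pairs = [(a, b) for a, b in zip(item_a, item_b) if b is not None]
r_complete = pearson([a for a, _ in pairs], [b for _, b in pairs])

# Mean imputation: replace each missing value with the observed item mean.
observed = [b for b in item_b if b is not None]
mean_b = sum(observed) / len(observed)
item_b_imputed = [mean_b if b is None else b for b in item_b]
r_imputed = pearson(item_a, item_b_imputed)

print(round(r_complete, 2), round(r_imputed, 2))
```

Here the observed-case correlation is 1.0, while the correlation after mean imputation falls to about 0.53. Real data will rarely be this dramatic, but the direction of the bias is the same.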

Since you’re doing a Cronbach’s alpha, you could do a single imputation that is based on other variables, such as a regression or an EM imputation. This kind of imputation will preserve the relationships among the variables on your scale without inflating them.
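As a rough illustration of the idea behind regression imputation, the Python sketch below fits a simple least-squares line on the complete cases and uses it to predict the missing values of one item from another. This is a simplified, deterministic version with a single predictor and made-up data; it is not what SPSS runs internally, and adding a random residual to each prediction is a common refinement that avoids overstating the relationship.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y on x."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def regression_impute(xs, ys):
    """Fill missing (None) ys with predictions from the complete cases."""
    pairs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    slope, intercept = fit_line([x for x, _ in pairs], [y for _, y in pairs])
    return [intercept + slope * x if y is None else y for x, y in zip(xs, ys)]

item_a = [1, 2, 3, 4, 5, 6]
item_b = [2, None, 3, 5, None, 6]  # hypothetical item with two gaps

filled = regression_impute(item_a, item_b)
print([round(v, 2) for v in filled])
```

Unlike mean imputation, the filled-in values follow the respondent's score on the other item, so the association between the two items is not dragged toward zero.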

The general downside of single imputation is that SPSS will think that the imputed values were true, observed values. It will therefore underestimate standard errors.

But Cronbach’s alpha doesn’t have a standard error and is not involved in a hypothesis test. So for this purpose, the downside isn’t a big deal.

If you were doing a hypothesis test or any statistical analysis based on p-values, the best option would be to conduct a Multiple Imputation on the missing values. It’s often the only good one if you have more than about 10% of data missing (that’s 10% of all values, not of cases).
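The distinction between percent of values and percent of cases is easy to blur, so here is a small Python sketch that computes both for a made-up data set (`None` marks a missing value; the variable names are hypothetical).

```python
# Hypothetical data: 4 cases by 5 items.
data = [
    [4, 5, 3, 4, 5],
    [2, None, 3, 2, 1],
    [5, 5, 4, 4, 5],
    [3, 3, None, 2, 3],
]

n_values = sum(len(row) for row in data)
n_missing_values = sum(v is None for row in data for v in row)
n_incomplete_cases = sum(any(v is None for v in row) for row in data)

pct_values_missing = 100 * n_missing_values / n_values
pct_cases_incomplete = 100 * n_incomplete_cases / len(data)

print(pct_values_missing, pct_cases_incomplete)
```

In this example only 10% of the values are missing, yet half of the cases are incomplete and would be lost to listwise deletion.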

Both the single and multiple imputation techniques are available in the SPSS Missing Values Analysis module. Multiple imputation was added in version 17, but single imputation is available in earlier versions.

Comments

Dominique says

December 10, 2019 at 9:40 am

Hi. I’m having trouble because I have questionnaires with missing answers in some of the questions. I use a Likert scale (1-5) for the choices. Any suggestions on how to properly determine the Cronbach’s alpha in this case? I’ve tried computing it in two ways: leaving the missing answers blank and assigning a 0 as the answer. The final computation varies slightly between the two. Please help….

Bruce Weaver says

August 10, 2015 at 11:26 am

Both the RELIABILITY and FACTOR procedures in SPSS can take matrix data as input. So if one does not need standard errors, confidence intervals, or p-values, one can use the EM covariances (or correlations) as input. Unfortunately, the MVA procedure, which generates those EM estimates, does not have a /MATRIX sub-command for writing out the matrix data set. To get around that problem, Hillary Maxwell and I wrote a couple macros which are described in the following article:

http://www.tqmp.org/RegularArticles/vol10-2/p143/p143.pdf

The supplementary materials can be downloaded here:

https://sites.google.com/a/lakeheadu.ca/bweaver/Home/statistics/spss/my-spss-page/emcorr

I hope this helps.

Bella99 says

August 11, 2014 at 5:19 am

Hi Karen
I have data measuring 3 variables with 12 questions (four questions per variable), and each variable uses a Likert Scale. I have substituted the missing data (where people do not answer the question) with the same number, 6. The Cronbach alpha is fine when I input a numeral instead of a missing value, but when I leave the box blank it is very low. My question is does the Cronbach alpha calculate based on relationships between values, so the 6 value is fine, or should I leave the space blank because there was no value? Then it has excluded cases, whereas with the 6 it has none.

Thank you for your help! I hope I am clear…
Belinda

Jill Avery says

May 30, 2013 at 1:50 pm

Karen, thanks for your very helpful suggestions on how to deal with missing variables in a reliability analysis in SPSS — I’ve been pulling my hair out about how to do it!

- Karen says
  
  June 6, 2013 at 5:28 pm
  
  Cool! Glad it helped.
  
Daphne says

March 6, 2013 at 10:53 am

Hi Karen, if participants fill in the same multi-item scale more than once (for example, in a repeated measures design), do I have to calculate a separate Cronbach’s alpha for each time the scale is used, or is there a way to get a single Cronbach’s alpha? Many thanks!

- Karen says
  
  March 7, 2013 at 1:43 pm
  
  Hi Daphne,
  
  I honestly don’t know. That is a multilevel data situation and I don’t know of a version of Cronbach’s alpha for that. Anyone else know if it exists?
  
  Karen
  
- Michelle says
  
  December 8, 2021 at 12:11 am
  
  Hi, can you help me run Cronbach’s alpha in SPSS? The result is very low, but based on my survey it should be high. SPSS said that the covariance matrix is 0.
  
loraine says

February 23, 2013 at 3:34 am

How do I input the values in SPSS for reliability testing if my test is forced choice or answerable by yes or no?

- Karen says
  
  March 4, 2013 at 11:04 am
  
  Hi Loraine,
  
  I am not sure. There are many reliability measures, and it’s one of those issues I have to look up myself every time.
  
  Karen
  
keto says

January 29, 2012 at 1:58 pm

I want to know: can I use Cronbach’s alpha for questions that are not on a Likert scale? If not, what type of tool can I use to measure reliability?
Thanks a lot

- Karen says
  
  January 31, 2012 at 3:54 pm
  
  Hi Keto,
  
  I’m not sure I understand your question. Are you trying to see which questions don’t load reliably with the others? And if your data aren’t measured on likert scales, how are they measured?
  
  Karen
  
Don says

December 28, 2011 at 2:26 pm

SPSS does remove those who do not answer all questions on the survey, and mean substitution will affect your variance. You can use relicheck.com. This is an online survey site that has a Cronbach analysis as part of the results. The analysis on the site accounts for missing data.

hana says

May 14, 2011 at 9:41 am

I would like to use EM for my missing values in SPSS. I have data for over 1000 participants, each completing several questionnaires. Do I enter all my data into the EM at once or do I do it separately for each questionnaire?

Also, I would only like values imputed for cases with at least 50% items responded to. How do I do this for EM in SPSS?

Thank you

- Karen says
  
  May 19, 2011 at 11:36 am
  
  Hi Hannah,
  
  If the questionnaires are long (with many items), you may have trouble putting them all in at once (read: SPSS crashes). If the questionnaires are independent of each other, you can do them separately.
  
  But remember EM means and correlations are unbiased, but standard errors are too small. If you need accurate standard errors, you’ll want a multiple imputation.
  
  And to impute cases only if at least 50% of the items are present, you’ll have to use a workaround. There isn’t an option in SPSS that I know of to do it directly.
  
  - Susanna says
    
    September 16, 2013 at 10:59 am
    
    Hello! I have the same problem. If I do Multiple Imputation (I have 1.8% missing data, 300 items/9 questionnaires and 600 participants), do I impute the questionnaires separately and combine them later somehow, or do I impute them all at once?
    Most of the items are Likert-scale and I treat them as continuous, but some are nominal (the participants can choose between 3 different sentences). Is that a problem?
    Please help me!
    
    - Karen says
      
      September 25, 2013 at 10:16 am
      
      Hi Susanna,
      
      With that many items and scales, I think you will need to impute the entire scales, although your % missing data is very low. I would strongly suggest looking at John Graham’s 2009 article. He addresses this issue of imputing scales and scale items directly.
      
