# Observed Values Less Than 5 in a Chi-Square Test: No Biggie


I was recently asked this question about Chi-square tests.  This question comes up a lot, so I thought I’d share my answer.

I have to compare two sets of categorical data in a 2×4 table. I cannot run the chi-square test because most of the cells contain values less than five and a couple of them contain values of 0. Is there any other test that I could use that overcomes the limitations of chi-square?

1. The assumption of the Chi-square test is not that the observed value in each cell is at least 5. It is that the expected value in each cell is at least 5. (The expected value for each cell is (row total × column total) / overall total.) When the observed values are low, the row and column totals often are too, so low observed values and low expected values tend to go together, but not always.

2. If you still find that your expected values are too low, use a Fisher’s Exact test. Not all standard statistical software will do one for a 2×4 table, but it can be done. StatXact will do it. It is great software if you can get access to it.

(If you’re wondering, SPSS will only do a Fisher’s Exact for a 2×2 table).
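To make the two points above concrete, here is a minimal Python sketch using only the standard library. It computes the expected counts with the row total × column total / overall total formula, and then, for a small 2×c table, an exact p-value by enumerating every table with the same margins and summing the probabilities of those that are no more likely than the observed one (the usual two-sided Fisher-Freeman-Halton definition). The function names `expected_counts` and `fisher_exact_2xc` and the 2×4 table are made up for illustration. Brute-force enumeration is only practical for small counts, and this is not meant to describe how StatXact or any other package computes the test.

```python
from itertools import product
from math import comb


def expected_counts(table):
    """Expected count per cell: row total * column total / overall total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def fisher_exact_2xc(table):
    """Two-sided exact p-value for a 2 x c table by full enumeration.

    With all margins fixed, the first row follows a multivariate
    hypergeometric distribution. The p-value is the total probability
    of every possible first row that is no more likely than the
    observed one.
    """
    first_row, second_row = table
    col_totals = [a + b for a, b in zip(first_row, second_row)]
    n_first = sum(first_row)
    denominator = comb(sum(col_totals), n_first)

    def weight(row):
        # Integer numerator of the hypergeometric probability, so the
        # comparison below is exact (no floating-point ties).
        w = 1
        for a, c in zip(row, col_totals):
            w *= comb(c, a)
        return w

    observed = weight(first_row)
    numerator = 0
    # Try every candidate first row; keep those matching the row total.
    for candidate in product(*(range(c + 1) for c in col_totals)):
        if sum(candidate) == n_first:
            w = weight(candidate)
            if w <= observed:
                numerator += w
    return numerator / denominator


# Hypothetical 2x4 table with small counts and two zero cells.
table = [[3, 1, 0, 4],
         [2, 5, 3, 0]]

expected = expected_counts(table)
low_cells = sum(e < 5 for row in expected for e in row)
print("Expected counts:", [[round(e, 2) for e in row] for row in expected])
print("Cells with expected count below 5:", low_cells)
print("Exact p-value:", fisher_exact_2xc(table))
```

The check that matters for the chi-square test is the one on `expected`, not on `table`. A table can contain a 0 and still have every expected count at 5 or more.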


Shivam dixit March 29, 2017 at 11:42 am

I have a 2×2 crosstab in which one cell has a value of 0. What kind of test should I use for this crosstab?

| | Positive | Negative |
|---|---|---|
| male | 38 | 6 |
| female | 30 | 0 |

Spoorthi December 21, 2016 at 4:51 am

Hi all,

Below is my data. I want to generate a chi-square p-value for 5 tests together. I have data for 3 tests (1, 2, 3) but don’t have counts for tests 4 and 5. These should be included in the model to get the p-value. Could anyone suggest what I should do?

data test;
test=1; trt=1; count=6;output;
test=1; trt=2; count=6;output;
test=2; trt=1; count=10;output;
test=2; trt=2; count=6;output;
test=3; trt=1; count=8;output;
test=3; trt=2; count=6;output;
test=4; trt=1; count=0;output;
test=4; trt=2; count=0;output;
test=5; trt=1; count=0;output;
test=5; trt=2; count=0;output;
run;

proc sort data=test;by test trt;run;
proc freq data=test order=data;
table test*trt/chisq;
weight count/zeros;
run;

*When I run this code I get the warning below and no chi-square p-value;

Statistics for Table of test by trt

Row or column sum zero. No statistics computed for this table.

Sample Size = 42

najam December 8, 2016 at 6:37 pm

I am using GraphPad Prism, and for a 2×2 contingency table I am using Fisher’s exact test because my total sample size is less than 100. But sometimes it gives very odd results that are not acceptable, like:

26 4
83 18
It gives a significant p-value for these 4 samples out of the total sample size of one city.

Lan April 30, 2016 at 8:06 am

I have a 3×2 crosstab in which one cell has a value of 0. What kind of test should I use for this crosstab?

| Positive | Negative |
|---|---|
| 4 | 17 |
| 6 | 20 |
| 0 | 13 |

Dora Marques August 5, 2014 at 8:36 am

Hello Karen. Thank you for your useful input. I have a 4×2 crosstab with more than 20% of expected frequencies lower than 5, and I am currently using SPSS. I found on the internet that we can use Fisher’s exact test on crosstabs bigger than 2×2 in SPSS, as a Fisher-Freeman-Halton Exact Test. Is it the same as running the first test? Once again, thanks for your help.

Priya July 16, 2014 at 1:34 pm

Thanks, Karen! Great explanation. This helps a lot.

I use Stata, which does chi-squared tests for 2×2 tables and will also provide expected cell frequencies so you can check your numbers.

tab var1 var2, expected chi
tab var1 var2, expected exact

Byron July 25, 2013 at 11:03 am

In many cases, Fisher’s exact test can be too conservative. The mid-p quasi-exact test or N-1 chi-square may be good alternatives.

D.a. January 29, 2013 at 4:30 pm

Nice one. This cleared things up for me: observed versus expected.
I also had 0 values, but my expected values are of course more than 5.

Karen January 29, 2013 at 5:11 pm

Excellent. 🙂

maryam August 8, 2009 at 3:45 pm

Thanks to the writer of the main text for introducing StatXact, and also to Andreas for introducing R. Would somebody kindly tell me how I can access this software? By the way, I have a 3×9 table, with two main variables and one variable as COUNT. I have to run a Chi-square test, but several cells contain zeros and I get a warning from SPSS, so I’m afraid my results will not be valid. What can I do if I can’t get access to the software you mentioned? I would be grateful if a helpful hero would leave me a message at maryam_na_li@yahoo.com, as I have to submit my thesis within a few days and I’m still stuck with this data!

Andreas June 22, 2009 at 4:54 am

Karen – Yes I believe so. Any 2-dimensional matrix will do – regardless of size.

Try:

v1 <- c(2,4,1,5,2,1,7,8,2)
v2 <- c(2,3,6,1,2,5,1,7,4)
v <- cbind(v1,v2)
fisher.test(v)

Sincerely

admin June 24, 2009 at 5:51 pm

Thanks. I haven’t used R yet, but should probably start. I used to use SPlus, which I understand is the same or very similar, so I’m sure it wouldn’t be much of a stretch. But I find that most of my clients use either SPSS or SAS. I suspect R is going to infiltrate the market more and more, though.

andreas June 19, 2009 at 1:06 pm

Fisher’s exact test was the reason I started learning R. R is free and the command is simply fisher.test(name of contingency table)