We all want rules of thumb even though we know they can be wrong, misleading or misinterpreted.
Rules of Thumb are like Urban Myths or like a bad game of ‘Telephone’. The actual message gets totally distorted over time.
For example, you may have heard this one: “The Chi-Square test is invalid if we have fewer than 5 observations in a cell”.
I frequently hear this mis-understood and incorrect statement.
The correct statement is not about the observed in each cell. Those can be less than 5. It’s the EXPECTED count that needs to be >5 per cell*.
Why is this important?
The Chi-square statistic follows a chi-square distribution asymptotically with df=n-1. That means we can use the chi-square distribution to calculate an accurate p-value only for large samples. (That’s where the asymptotically comes in). For small samples, it doesn’t work.
How large of a sample is needed?
One that is large enough that the expected value for each cell is at least 5. Those expected values come from the total sample size, and the corresponding total frequencies of each row and column. So if any row or column totals in your contingency table are small, or together are relatively small, you’ll have an expected value that’s too low.
Software packages typically supply a warning when this occurs. For example, it’s the criterion SAS uses.
Look at the table below, which shows observed counts between two categorical variables, A and B. The observed counts are the actual data. You can see that out of a total sample size of 48, 28 are in the B1 category and 20 are in the B2 category.
Likewise, 33 are in the A1 category and 15 are in the A2 category. Inside the box are the individual cells, which give the counts for each combination of the two A categories and two B categories.
The Expected counts come from the row totals, column totals and the overall total, 48. For example, in the A2, B1 cell, we expect a count of 8.75. It is an easy calculation: (Row Total * Column Total)/Total. So (28*15)/48.
The more different the observed and expected counts are from each other, the larger the chi-square statistic.
Notice in the Observed Data there is a cell with a count of 3. But the expected counts are all >5. If the expected counts are less than 5 then a different test should be used (e.g. Fisher’s Exact Test).
But is 5 the true minimum?
There are other suggested guidelines too.
According to Cochran (1952, 1954), all expected counts should be 10 or greater. If < 10, but >=5, Yates’ Correction for continuity should be applied.
More recent standards for a 2 x 2 Table (Campbell 2007) say Fisher’s Exact and Yates Correction are too conservative and proposes alternative tests depending on the study design.
For tables larger than 2 x 2 Yates, Moore & McCabe (1999), state “No more than 20% of the expected counts should be less than 5 and all individual expected counts should be greater or equal to 1. Some expected counts can be <5, provided none <1, and 80% of the expected counts should be equal to or greater than 5.”
The Minitab manual criteria are: If either variable has only 2 or 3 categories, then either
— all cells must have expected counts of at least 3 or
— all cells must have expected counts of at least 2 and 50% or fewer have expected counts below 5
If both variables have 4 to 6 levels then either:
— all cells have expected counts of at least 2, or
— all cells have expected counts of at least 1 and 50% or fewer cells have expected counts of < 5
In summary, different sources have different criteria for the Chi-square test to be valid. All criteria refer to the expected cell counts and not the observed data.
There are alternative tests for small cell counts such as Fisher’s Exact Test and Yates correction. Lastly, sometimes it might be necessary to collapse across categories to obtain adequate cell counts.
*The ‘5’ probably came from Fisher.