# The Difference Between a Chi-Square Test and a McNemar Test

You may have heard the McNemar test described as a repeated measures version of the chi-square test of independence. This is basically true, and I want to show you how the two tests differ and what exactly each one is testing.

First of all, although Chi-Square tests can be used for larger tables, McNemar tests can only be used for a 2×2 table.  So we’re going to restrict the comparison to 2×2 tables.

### The Chi-square test

Here’s an example of a contingency table that would typically be tested with a Chi-Square Test of Independence:

The Chi-Square will test whether Experiencing Joint Pain is associated with running more than 25km/week.

How is it doing that?

The chi-square statistic itself is calculated from the counts of people in each of the four cells of the table and the resulting row and column totals.

But it essentially boils down to a comparison of the two purple percentages. You’ll notice each of these percentages is based on the row total. In other words, the 75 non-runners answering Yes to Joint Pain represent 26% of the 290 non-runners*.

But 33% of the 1165 Runners said Yes, they’ve experienced joint pain.  A higher proportion of runners than non-runners are experiencing joint pain.
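As a quick check of those two row percentages, here is a minimal Python sketch. The cell counts are an assumption pieced together from the numbers quoted in this post (75 of 290 non-runners, and 380 of 1165 runners, answering Yes); the variable and function names are just illustrative.

```python
# Cell counts assumed from the figures quoted in the text.
table = {
    "non-runner": {"no": 215, "yes": 75},   # row total 290
    "runner": {"no": 785, "yes": 380},      # row total 1165
}

def row_percent_yes(row):
    """Percentage of a row's total that answered Yes to joint pain."""
    return 100 * row["yes"] / (row["no"] + row["yes"])

for group, row in table.items():
    print(f"{group}: {row_percent_yes(row):.0f}% report joint pain")
```

Each percentage uses its own row total as the denominator, which is the comparison the chi-square test is making.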

If those percentages were the same, the chi-square test statistic would be zero and it would mean that whether someone runs tells you nothing about whether they have joint pain.

So if those percentages were the same, we’d conclude the two variables are not associated.

Our percentages aren’t the same, which suggests that running and joint pain are associated. Whether the difference is bigger than chance alone would produce is what the p-value tells us. (Feel free to check the p-value on this example.)
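If you want to check the p-value by hand, the following is a rough sketch in plain Python, assuming the cell counts 215, 75, 785 and 380 implied by the text. It computes the Pearson chi-square without a continuity correction, so software that applies Yates' correction by default will report a slightly different value. The function name `chi_square_2x2` is hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Rows: non-runners, runners. Columns: No, Yes (assumed layout).
stat, p_value = chi_square_2x2(215, 75, 785, 380)
print(f"chi-square = {stat:.2f}, p = {p_value:.3f}")

# Identical row percentages (one third Yes in both rows) give a statistic of zero.
same_stat, _ = chi_square_2x2(100, 50, 200, 100)
print(same_stat)
```

With these counts the statistic comes out a little under 5, with a p-value between 0.02 and 0.03.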

*As a non-runner myself, I’m being strict here in the definition of a “runner” as someone who runs at least 25km/week. All others I’m calling non-runners for simplicity.

### The McNemar Test

A McNemar test does something different.

The McNemar test is not testing for independence, but for consistency in responses across two variables.

Here is a table with the exact same counts, but different variables.  Now we’re comparing whether someone experiences joint pain before and after some treatment.  We want to test whether the treatment worked to change people from Yes to No.

But the McNemar recognizes that some people will move from Yes to No and others from No to Yes just randomly.  If the treatment is having no effect, the number of people who move from No to Yes should be about equal to those who move in the other direction.

But if there is a direction to the movement, we’ll see it because one of those purple boxes will be different from the other.

The 215 people who said No at both time points and the 380 people who said Yes at both are irrelevant to this comparison. We’re just interested in whether the people who change answers do so randomly or not.
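To make that concrete, here is a small sketch of the usual McNemar chi-square, which uses only the two cells where people changed their answer. Which of the two off-diagonal counts (75 and 785) is No-to-Yes and which is Yes-to-No is an assumption here, since it depends on the table layout; the statistic is the same either way. The function name `mcnemar` and its argument names are illustrative, and no continuity correction is applied.

```python
import math

def mcnemar(no_no, no_yes, yes_no, yes_yes):
    """McNemar chi-square (no continuity correction) for a before/after 2x2 table."""
    # Only the two discordant cells enter the statistic; no_no and yes_yes are unused.
    stat = (no_yes - yes_no) ** 2 / (no_yes + yes_no)
    p_value = math.erfc(math.sqrt(stat / 2))  # upper tail of chi-square, 1 df
    return stat, p_value

# Argument names read as before_after, e.g. yes_no = Yes before, No after (assumed).
stat, p_value = mcnemar(no_no=215, no_yes=75, yes_no=785, yes_yes=380)
print(f"McNemar chi-square = {stat:.1f}")

# Changing the people who never switched leaves the statistic untouched.
stat_other, _ = mcnemar(no_no=10, no_yes=75, yes_no=785, yes_yes=9000)
print(stat == stat_other)

# Equal movement in both directions gives a statistic of zero.
balanced, _ = mcnemar(no_no=215, no_yes=80, yes_no=80, yes_yes=380)
print(balanced)
```

Because 75 and 785 are so far apart, the statistic is very large and the p-value is effectively zero: the movement is clearly not random in this made-up example.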

In the McNemar test, we can compare counts directly, because the comparison is not based on row totals.  But if changing to percentages makes interpretation easier, that’s fine too.  Just make sure you use percentage of the total sample, not percentage of the row totals, as we did for Chi-square.
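A short sketch of that percentage calculation, again assuming the counts above and an assumed direction for the two changer cells. The denominator is the full sample of 1455 people, not a row total.

```python
# Before/after counts; names read as before_after (assumed layout).
no_no, no_yes, yes_no, yes_yes = 215, 75, 785, 380
total = no_no + no_yes + yes_no + yes_yes  # 1455 people, each measured twice

# Percentages of the TOTAL sample, not of a row.
pct_no_to_yes = 100 * no_yes / total
pct_yes_to_no = 100 * yes_no / total
print(f"No -> Yes: {pct_no_to_yes:.1f}% of everyone")
print(f"Yes -> No: {pct_yes_to_no:.1f}% of everyone")

# The gap between the two equals the change in the overall Yes rate.
pct_yes_before = 100 * (yes_no + yes_yes) / total
pct_yes_after = 100 * (no_yes + yes_yes) / total
print(f"Yes before: {pct_yes_before:.1f}%, Yes after: {pct_yes_after:.1f}%")
```

Using the total as the denominator keeps the two changer percentages on the same scale, so their difference is exactly the drop in the overall share of people reporting joint pain.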

yugoh September 12, 2017 at 6:58 am

when using McNemar’s test, what will be the sample (N)?

Timo April 29, 2017 at 9:36 am

Nice explanation of the differences between these two tests for categorical data. I love it.

Senjuti Kabir April 11, 2017 at 12:47 am

If I want to compare a test value (e.g. interferon-gamma level) in the same group of people before and after treatment or any intervention, which will be the ideal test?

zulaikha January 17, 2017 at 11:12 pm

What if we have pre/post data but an independent variable with 3 groups? It’s 3 by 2. Can we proceed with McNemar?

Karen January 18, 2017 at 11:05 am

Nope. McNemar only works in a 2×2 table.

larry July 29, 2016 at 2:58 pm

What about Cochran’s Q? I thought it was an extension of the McNemar test for tables larger than 2×2.

Sabrina July 15, 2016 at 10:39 am

Hi, I was wondering whether either of these tests would be appropriate for assessing the association between repeat questions which were inserted into a survey as a measure of internal validity.

Lauren May 22, 2016 at 11:03 pm

Hi, I am wondering whether there is a chi sq repeated measures equivalent where the data is not binary? I know you can use McNemar for 2×2 but my data includes >2 categories (likert not at all = 1 to very much = 5) and >2 time points (e.g. intake, progress 1, exit, follow up 1).
Thanks very much.