One area in statistics where I see conflicting advice is how to analyze pre-post data. I’ve seen this myself in consulting. A few years ago, I received a call from a distressed client. Let’s call her Nancy.
Nancy had asked an advisor how to run a repeated measures analysis. The advisor told her that a repeated measures analysis was actually inappropriate for her data.
Nancy was sure repeated measures was appropriate, and the advice left her fearing that she had grossly misunderstood a very basic tenet of her statistical training.
The Study Design
Nancy had measured a response variable at two time points for two groups: an intervention group that received a treatment and a control group that did not. Participants were randomly assigned to one of the two groups.
She measured each participant before and after the intervention.
Analyzing the Pre-Post Data
Nancy was sure that this was a classic repeated measures experiment. It has one between-subjects factor (treatment group) and one within-subjects factor (time).
The advisor insisted that this was a classic pre-post design, and that the way to analyze pre-post data is not with a repeated measures ANOVA, but with an ANCOVA.
In ANCOVA, the dependent variable is the post-test measure. The pre-test measure is not an outcome, but a covariate. This model assesses the differences in the post-test means after accounting for pre-test values.
The advisor said repeated measures ANOVA is only appropriate if the outcome is measured multiple times after the intervention. The more the advisor insisted repeated measures didn’t work in Nancy’s design, the more confused Nancy got.
The Research Question
This kind of situation happens all the time: a colleague, a reviewer, or a statistical consultant insists that you need to do the analysis differently. Sometimes they’re right, but sometimes, as was true here, the two analyses answer different research questions.
Nancy’s research question was whether the mean change in the outcome from pre to post differed in the two groups.
This is directly measured by the time*group interaction term in the repeated measures ANOVA.
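With only two time points, the time*group interaction test is equivalent to comparing change scores (post minus pre) between the two groups. Here is a minimal sketch on simulated data; all of the sample sizes, variable names, and effect sizes below are hypothetical, chosen just to illustrate the test:

```python
# Sketch: with two time points, the time*group interaction in a repeated
# measures ANOVA is equivalent to comparing change scores between groups.
# All names and simulated values are hypothetical.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 50  # participants per group (first n = control, last n = treatment)

subject = rng.normal(0, 3, size=2 * n)           # person-level variation
pre = 50 + subject + rng.normal(0, 2, size=2 * n)
gain = np.r_[np.zeros(n), np.full(n, 5.0)]       # treatment gains 5 points
post = 50 + subject + gain + rng.normal(0, 2, size=2 * n)

change = post - pre                              # subject variation cancels
t_stat, p_value = stats.ttest_ind(change[n:], change[:n])
print(f"mean change difference: {change[n:].mean() - change[:n].mean():.2f}")
print(f"interaction test: t = {t_stat:.2f}, p = {p_value:.4f}")
# For two groups, the F statistic for time*group in the repeated measures
# ANOVA equals t_stat squared.
```

The subtraction is what the within-subjects part of the ANOVA accomplishes: each participant serves as their own control, so stable person-to-person differences drop out of the comparison.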
The ANCOVA approach answers a different research question: whether the post-test means, adjusted for pre-test scores, differ between the two groups.
In the ANCOVA approach, the whole focus is on whether one group has a higher mean after the treatment. It’s appropriate when the research question is about the mean value at the end, not about gains, growth, or changes.
The adjustment for the pre-test score in ANCOVA has two benefits. One is to ensure that any post-test differences truly result from the treatment and aren’t a leftover effect of (usually random) pre-test differences between the groups.
The other is to account for variation around the post-test means that comes from variation in where participants started at pre-test.
So when the research question is about the difference in means at post-test, this is a great option. It’s very common in medical studies because the focus there is on the size of the treatment effect.
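The ANCOVA model itself is just a regression of the post-test score on group, with the pre-test score as a covariate. A minimal sketch on simulated data (all column names and values are hypothetical):

```python
# Sketch of the ANCOVA approach: post-test score is the outcome,
# pre-test enters as a covariate. All names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 50  # participants per group
subject = rng.normal(0, 3, size=2 * n)           # person-level variation
pre = 50 + subject + rng.normal(0, 2, size=2 * n)
gain = np.r_[np.zeros(n), np.full(n, 5.0)]       # treatment gains 5 points
post = 50 + subject + gain + rng.normal(0, 2, size=2 * n)

df = pd.DataFrame({"group": ["control"] * n + ["treatment"] * n,
                   "pre": pre, "post": post})

# The group coefficient is the adjusted difference in post-test means.
ancova = smf.ols("post ~ pre + group", data=df).fit()
print(ancova.params["group[T.treatment]"])
```

Note that the group coefficient answers the post-test question (how far apart are the groups at the end, adjusting for where they started), not the change question.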
As it turned out, the right analysis to accommodate Nancy’s design and answer her research question was the repeated measures ANOVA. (For the record, a linear mixed model also works. It has some advantages, but in this design the results are identical.)
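For completeness, here is a minimal sketch of the mixed-model version, again on hypothetical simulated data: stack the pre and post measurements in long format, give each participant a random intercept, and read the differential change off the time-by-group interaction coefficient.

```python
# Sketch: a linear mixed model with a random intercept per participant;
# the time-by-group interaction estimates differential change.
# All column names and simulated values are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 50  # participants per group
subject = rng.normal(0, 3, size=2 * n)
pre = 50 + subject + rng.normal(0, 2, size=2 * n)
gain = np.r_[np.zeros(n), np.full(n, 5.0)]       # treatment gains 5 points
post = 50 + subject + gain + rng.normal(0, 2, size=2 * n)

wide = pd.DataFrame({"id": np.arange(2 * n),
                     "group": ["control"] * n + ["treatment"] * n,
                     "pre": pre, "post": post})

# Long format: one row per measurement, as a mixed model expects.
long = wide.melt(id_vars=["id", "group"], value_vars=["pre", "post"],
                 var_name="time", value_name="score")
long["t"] = (long["time"] == "post").astype(int)  # 0 = pre, 1 = post

# Random intercept per participant handles the repeated measurement.
mm = smf.mixedlm("score ~ t * group", long, groups=long["id"]).fit()
print(mm.params["t:group[T.treatment]"])  # estimated differential change
```

With only two time points this estimate matches the change-score comparison; the mixed model’s advantages (unbalanced data, more time points, missing measurements) show up in richer designs.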
The person she’d asked for advice was in a medical field, and had been trained on the ANCOVA approach.
Either approach works well in the right situation. The one thing that doesn’t work is to combine the two approaches.
I’ve started to see data analysts attempt to use the baseline pre-test score as both a covariate and the first outcome measure in a repeated measures analysis, particularly when there is more than one post-test measurement.
That doesn’t work: each approach already removes subject-specific variation, so combining them attempts to remove the same variation twice.