Repeated measures ANOVA is the approach most of us learned in stats classes, and it works very well in certain designs.

But it’s a bit limited in what it can do. Sometimes trying to fit a data set into a repeated measures ANOVA requires too much data gymnastics: averaging across repetitions or pretending a continuous predictor isn’t.

These data gymnastics mean you’re throwing away good information and under-accounting for true variation among repetitions.

There are a few specific design and data situations that will eliminate repeated measures ANOVA as a reasonable approach.

Let’s go through seven of these and what the options are instead.

1. **Missing data on the outcome**

One of the biggest problems with traditional repeated measures ANOVA is missing data on the response variable.

The problem is that repeated measures ANOVA treats each measurement as a separate variable. Because it uses listwise deletion, if one measurement is missing, the entire case gets dropped.

What to use instead: Mixed models treat each occasion as a different observation of the same variable. So you may lose the measurement with missing data, but you keep that individual’s other responses.
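To make the difference concrete, here is a minimal sketch in plain Python. The subjects, values, and variable names are made up for illustration. It compares how many observations survive listwise deletion in a wide layout with how many survive in the long layout that mixed models use.

```python
# Hypothetical wide-format data: one entry per subject, one value per occasion.
# None marks a missing measurement.
wide = {
    "s1": [5.1, 5.6, 6.0],
    "s2": [4.8, None, 5.9],
    "s3": [5.3, 5.7, None],
    "s4": [4.9, 5.2, 5.8],
}

# Listwise deletion: a single missing value drops the whole subject.
complete_cases = {s: ys for s, ys in wide.items() if None not in ys}

# Long format: one row per (subject, occasion) observation.
# Only the missing measurement itself is lost.
long_rows = [
    (subject, occasion, y)
    for subject, ys in wide.items()
    for occasion, y in enumerate(ys, start=1)
    if y is not None
]

n_observed = sum(y is not None for ys in wide.values() for y in ys)
n_listwise = sum(len(ys) for ys in complete_cases.values())

print(f"observed values:              {n_observed}")
print(f"kept after listwise deletion: {n_listwise}")
print(f"kept in long format:          {len(long_rows)}")
```

In this toy data, two missing values cost four additional good measurements under listwise deletion, while the long layout keeps every observed value.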

2. **Unbalanced number of repeats across individuals**

A related problem is imbalance in the number of repeated responses from each individual.

This is common in observed data, where the number of repeats is uncontrollable. You measure a response each time some event happens.

Repeated measures ANOVA treats each response as a different variable. This causes two problems.

First, you will have a different number of response variables for each individual. If some have missing data in the last few responses, they’ll get dropped. (That dropping again. Ugh).

Second, the ANOVA will compare the responses to each other, assuming that each one represents a different condition. Here they don’t—they’re really interchangeable. But there is no way to turn off that comparison.

What to use instead: A mixed model can handle unequal repeats.

3. **When time is continuous**

In some repeated measures studies, each repeat occurs under a different experimental condition, so there is a qualitative difference among the repeats. No problem here.

In others, the amount of time that has passed between repeats is important (or equivalently, the amount of space if the repeats are, say, along a transect). In other words, you want to treat the within-subjects effect of time as a continuous variable.

This is theoretically valid and reasonable, but repeated measures ANOVA can only account for categorical repeats.

There are contrasts that allow you to order the categories and simulate a trend over time, but they’re not truly treating time as continuous.

What to use instead: A mixed model can treat time as a truly continuous effect.
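As a rough illustration of why the distinction matters, the sketch below fits a straight line to one hypothetical subject measured at unequally spaced days. This is ordinary least squares on a single subject, not a mixed model, and the numbers are invented. It only shows how a trend over the occasion number differs from a trend over actual elapsed time.

```python
def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

# Hypothetical subject measured on days 0, 1, 7 and 30.
days = [0, 1, 7, 30]
# Response constructed to grow by exactly 0.5 units per day.
response = [10 + 0.5 * d for d in days]

# Treating the repeats as ordered categories 1..4 ignores the spacing.
occasion = [1, 2, 3, 4]

slope_per_day = ols_slope(days, response)
slope_per_occasion = ols_slope(occasion, response)

print(f"slope per day:      {slope_per_day:.2f}")
print(f"slope per occasion: {slope_per_occasion:.2f}")
```

The per-day slope recovers the rate of change built into the data. The per-occasion slope depends on when the measurements happened to be taken, so it would change under a different schedule.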

4. **Time-varying covariates**

In some studies, the important predictor variables are measured on each repeat, right along with the response.

Repeated measures ANOVA needs each repeat in its own column. Because of that wide-data format, there’s no way to specify that each covariate variable should only predict the response measured on the same repeat.

What to use instead: A marginal or mixed model can incorporate time-varying covariates.
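The key step is restructuring the data so that each covariate value sits on the same row as the response from the same occasion. Here is a minimal sketch in plain Python. The variable names (`stress`, `mood`), the values, and the `to_long` helper are all hypothetical.

```python
# Hypothetical wide-format data: a covariate (stress) and a response (mood),
# each measured on three occasions per subject.
wide = [
    {"id": "s1", "stress_1": 3, "stress_2": 5, "stress_3": 4,
     "mood_1": 7, "mood_2": 4, "mood_3": 6},
    {"id": "s2", "stress_1": 2, "stress_2": 2, "stress_3": 6,
     "mood_1": 8, "mood_2": 8, "mood_3": 3},
]

def to_long(rows, covariate, response, n_occasions):
    """Return one row per subject and occasion, pairing the covariate
    with the response measured on that same occasion."""
    long_rows = []
    for row in rows:
        for t in range(1, n_occasions + 1):
            long_rows.append({
                "id": row["id"],
                "occasion": t,
                covariate: row[f"{covariate}_{t}"],
                response: row[f"{response}_{t}"],
            })
    return long_rows

long_rows = to_long(wide, "stress", "mood", n_occasions=3)
for r in long_rows:
    print(r)
```

In this long layout, a model with `stress` as a predictor of `mood` automatically uses the stress value from the matching occasion, which the wide layout cannot express.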

5. **Three-level models**

If the subjects themselves are not only measured multiple times, but also clustered into some other groups, you’ve got a three-level model.

For example, you may have students measured over time, but students are also clustered within classrooms.

Streams may be measured over time, but are also clustered into watersheds.

Patients measured over time are also clustered into medical centers.

In all these cases, the repeated measures ANOVA can account for the repeats over time, but not the clustering.

What to use instead: A mixed model can incorporate multiple levels.

6. **Repeats across people and items**

There is a repeated measures design that occurs in specific experimental studies common in linguistics and psychology. These are studies in which each subject is repeatedly measured across many trials. Each trial contains one item, and there are multiple items for each condition.

An example may be to measure reaction time of 50 participants to each of 20 high-frequency and 20 low-frequency words.

It’s clear that the 40 reaction times are repeated across each participant, and we need to account for the fact that multiple responses from the same subject are correlated. After all, some participants will always be faster than others.

But each word also has 50 repeated measurements (one per participant) and those are also likely to be correlated to each other. Some words will elicit faster times than others, even within the same condition.

Repeated measures ANOVA can only account for repeats across one type of unit: participants or items, but not both at once.

What to use instead: A mixed model with crossed random effects.
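To see what "crossed" means in the word example, the sketch below simulates a data set with that structure in plain Python. All effect sizes, the seed, and the names are invented for illustration, and nothing here fits a model. It only shows that every reaction time belongs to one participant and one word simultaneously.

```python
import random
from collections import defaultdict
from statistics import mean

random.seed(1)  # arbitrary seed so the simulation is repeatable

participants = [f"p{i:02d}" for i in range(1, 51)]
items = ([(f"hi{j:02d}", "high") for j in range(1, 21)]
         + [(f"lo{j:02d}", "low") for j in range(1, 21)])

# Made-up effects: some participants are faster overall, some words are faster.
participant_effect = {p: random.gauss(0, 60) for p in participants}
item_effect = {word: random.gauss(0, 30) for word, _ in items}
condition_mean = {"high": 550, "low": 620}  # hypothetical means in ms

# One trial per participant-word pair: the two groupings are fully crossed.
trials = []
for p in participants:
    for word, freq in items:
        rt = (condition_mean[freq] + participant_effect[p]
              + item_effect[word] + random.gauss(0, 40))
        trials.append({"participant": p, "item": word,
                       "frequency": freq, "rt": rt})

by_participant = defaultdict(list)
by_item = defaultdict(list)
for t in trials:
    by_participant[t["participant"]].append(t["rt"])
    by_item[t["item"]].append(t["rt"])

participant_means = [mean(v) for v in by_participant.values()]
item_means = [mean(v) for v in by_item.values()]
print(f"trials: {len(trials)}")
print(f"participant means range: {min(participant_means):.0f} to {max(participant_means):.0f} ms")
print(f"item means range:        {min(item_means):.0f} to {max(item_means):.0f} ms")
```

A mixed model with crossed random effects would typically give each grouping its own random intercept. In R's lme4, for instance, one common way to write that is `rt ~ frequency + (1 | participant) + (1 | item)`.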

7. **Non-continuous outcomes**

Finally, repeated measures ANOVA has assumptions of normality within each grouping factor.

Sure, it’s robust to small departures from this assumption. And if the outcome variable is continuous, unbounded, and measured on an interval or ratio scale, you may be able to solve non-normality with a transformation.

But if you’ve got categorical outcomes or count outcomes, it’s not going to work. Luckily, there are other options.

What to use instead: Generalized Estimating Equations (GEE) or a Generalized Linear Mixed Model (GLMM).

7 comments

Good explanation. I have a question: we record values on TWO occasions, with the same participants each time. Can we run a repeated measures ANOVA, or should I go for a paired-samples t-test (non-parametric)?

My variables are continuous in nature and my data is not distributed normally.

Karen,

Thanks for your explanation.

Currently, I am running an experiment with 5 independent variables and two dependent variables (response time and correctness). Correctness is a binary response, so I used GEE. Also, since there are missing data for the response time, I used a mixed model. However, I found a significant result for one independent variable which has three levels, and I would like to do multiple comparisons to know where the significant result comes from. Do you have any suggestions for how to do these multiple comparisons? Or should I look at the Estimates of Fixed Effects table?

Thanks.

Best,

Amanda

Karen,

I really enjoyed reading this. I was struck by a line in your first paragraph: “Sometimes trying to fit a data set into a repeated measures ANOVA requires too much data gymnastics: averaging across repetitions or pretending a continuous predictor isn’t”.

This is the situation I am currently in. I am running a repeated measures analysis in SPSS, and I have two predictor variables: one is continuous (which I entered as a covariate) and the other categorical (which I entered as a between-subjects factor). I want these two predictor variables to interact, so I altered the syntax and added the interaction in the “design” line. The interaction is significant, and now I am trying to interpret it. I can split the file by the categorical predictor and determine the level of the categorical variable that is moderated by my continuous predictor. Because GLM does not produce a beta coefficient, I am having a hard time knowing the direction of this association.

I feel like I might be missing something obvious. Any thoughts you have are greatly appreciated.

Erin

Hi Erin,

I think what you’re missing is that GLM does produce a beta coefficient. You need to use /Print Solution.

However, in Repeated Measures GLM, it may not be what you want. I suspect you’ll have to use Mixed instead of RM Anova.

You may find this helpful: http://www.theanalysisfactor.com/resources/mixed-multilevel-models/

Beautiful explanation. However, I’m trying to analyze a dataset and predict a binary dependent variable measured once, several years after obtaining multiple measurements on an independent variable (a time-varying covariate) in an unbalanced design. As the dependent variable is only measured once, I’m uncertain as to the correct approach in analyzing this. I have previously done Cox analyses with time-varying covariates, but I’ve never seen an approach with time-varying covariates for logistic regression. Any ideas?

Hi Mike,

There are a few options, but the most common would be to summarize the time-varying covariate with something like its max, its mean, or the slope of its change over time, and use that as a predictor. If there aren’t too many time points of this variable, you can also use each value as a covariate.

Really well structured explanation.