When Does Repeated Measures ANOVA Not Work for Repeated Measures Data?

Repeated measures ANOVA is the approach most of us learned in stats classes for repeated measures and longitudinal data. It works very well in certain designs.

But it’s limited in what it can do. Sometimes trying to fit a data set into a repeated measures ANOVA requires too much data gymnastics: averaging across repetitions or pretending a continuous predictor isn’t.

These data gymnastics mean you’re throwing away good information and under-accounting for true variation among repetitions.

There are a few specific design and data situations that will eliminate repeated measures ANOVA as a reasonable approach.

Let’s go through seven of these and what the options are instead.

1. Missing Data on the outcome

One of the biggest problems with traditional repeated measures ANOVA is missing data on the response variable.

The problem is that repeated measures ANOVA treats each measurement as a separate variable. Because it uses listwise deletion, if one measurement is missing, the entire case gets dropped.

What to use instead: Marginal and mixed models treat each occasion as a different observation of the same variable. So you may lose the measurement with missing data, but not all other responses from the same subject.
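
To make the difference concrete, here is a minimal Python sketch. The data, subject IDs, and function names are all made up for illustration. It counts how many responses survive listwise deletion in the wide layout versus how many survive when the same data are stacked into long format, the layout that marginal and mixed models generally work from.

```python
# Hypothetical data: 4 subjects, 3 occasions each; None marks a missing response.
wide = {
    "s1": [5.1, 5.9, 6.4],
    "s2": [4.8, None, 6.0],
    "s3": [5.5, 6.1, None],
    "s4": [5.0, 5.7, 6.2],
}

def listwise_complete(wide_data):
    """Keep only subjects observed on every occasion (wide-format behavior)."""
    return {s: ys for s, ys in wide_data.items() if None not in ys}

def to_long(wide_data):
    """One row per observed (subject, occasion, response); drop only missing cells."""
    return [
        (s, occasion, y)
        for s, ys in wide_data.items()
        for occasion, y in enumerate(ys, start=1)
        if y is not None
    ]

complete = listwise_complete(wide)
long_rows = to_long(wide)

print(sum(len(ys) for ys in complete.values()), "responses kept after listwise deletion")
print(len(long_rows), "responses kept in long format")
```

In this toy data set, two missing cells cost two whole subjects under listwise deletion, while the long layout loses only the two missing cells.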

2. Unbalanced number of repeats across individuals

A related problem is imbalance in the number of repeated responses from each individual.

This is common in observed data, where the number of repeats is uncontrollable. You measure a response each time some occurrence happens.

Repeated measures ANOVA treats each response as a different variable. This causes two problems.

First, you will have a different number of response variables for each individual. If some have missing data in the last few responses, they’ll get dropped. (That dropping again. Ugh).

Second, the ANOVA will compare the responses to each other, assuming that each one represents a different condition. Here they don’t—they’re really interchangeable. But there is no way to turn off that comparison.

What to use instead: A mixed model can handle unequal repeats.

3. When time is continuous

In some repeated measures studies, each repeat occurs under a different experimental condition. There is a qualitative difference among the repeats. No problem here.

In others, the amount of time that has passed between repeats is important. (Or equivalently, the amount of space if the repeats are, say, along a transect.) In other words, you want to treat the within-subjects effect of time as a continuous, quantitative variable.

This is theoretically valid and reasonable, but repeated measures ANOVA can only account for categorical repeats.

There are contrasts that allow you to order the categories and simulate a trend over time, but they’re not truly treating time as continuous.

What to use instead: A marginal or mixed model can treat time as a truly continuous effect.
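
As a rough illustration of why the spacing matters, the Python sketch below fits a straight line to one subject’s responses twice: once against equally spaced occasion codes, which is what ordered-category trend contrasts typically assume by default, and once against the actual elapsed weeks. The measurement weeks, the responses, and the helper `ols_slope` are hypothetical.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

weeks = [0, 1, 4, 12]                 # actual elapsed time (unequal gaps)
occasion = [1, 2, 3, 4]               # equally spaced category codes
response = [10.0, 11.0, 14.0, 22.0]   # rises exactly 1 unit per week

slope_per_occasion = ols_slope(occasion, response)
slope_per_week = ols_slope(weeks, response)

print("slope per occasion:", round(slope_per_occasion, 2))
print("slope per week:", round(slope_per_week, 2))
```

The responses here are perfectly linear in weeks, yet against occasion codes the same numbers look like an accelerating curve, and the slope is no longer in interpretable time units.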

4. Time-varying covariates

In some studies, the important predictor variables are measured on each repeat, right along with the response.

Repeated measures ANOVA needs the data in wide format, with each repeat of the response in its own column. In that format, there’s no way to specify that each measurement of the covariate should predict only the corresponding response.

What to use instead: A marginal or mixed model can incorporate time-varying covariates.
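
Here is a small Python sketch of what that pairing looks like once the data are stacked into long format. The variable names (`stress` as the time-varying covariate, `anxiety` as the response) and all values are invented for the example.

```python
# Hypothetical wide records: covariate and response each measured at 3 occasions.
wide_rows = [
    {"id": "s1", "stress1": 3, "stress2": 5, "stress3": 4,
     "anx1": 10, "anx2": 14, "anx3": 12},
    {"id": "s2", "stress1": 2, "stress2": 2, "stress3": 6,
     "anx1": 8, "anx2": 9, "anx3": 15},
]

def stack(rows, occasions=(1, 2, 3)):
    """Turn each wide record into one row per occasion, pairing the
    covariate value with the response measured at the same occasion."""
    long_rows = []
    for row in rows:
        for t in occasions:
            long_rows.append({
                "id": row["id"],
                "time": t,
                "stress": row[f"stress{t}"],
                "anxiety": row[f"anx{t}"],
            })
    return long_rows

long_rows = stack(wide_rows)
for r in long_rows:
    print(r)
```

Each long-format row now carries the covariate value that belongs to that particular response, so a single `stress` term in a marginal or mixed model relates each measurement only to its own occasion.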

5. Three (or more) level models

If the subjects themselves are not only measured multiple times, but also clustered into some other groups, you’ve got a three-level model.

For example, you may have students measured over time, but students are also clustered within classrooms.

Streams may be measured over time, but are also clustered into watersheds.

Patients measured over time are also clustered into medical centers.

In all these cases, the repeated measures ANOVA can account for the repeats over time, but not the clustering.

What to use instead: A mixed model can incorporate multiple levels.
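
A quick way to see the three levels is to look at the identifiers a long-format data set would need. The Python sketch below uses invented classroom and student IDs and simply counts the units at each level; it does not fit a model.

```python
# Hypothetical long-format rows: (classroom, student, occasion, score).
rows = [
    ("roomA", "stu1", 1, 71), ("roomA", "stu1", 2, 75),
    ("roomA", "stu2", 1, 64), ("roomA", "stu2", 2, 70),
    ("roomB", "stu3", 1, 80), ("roomB", "stu3", 2, 83),
    ("roomB", "stu4", 1, 77), ("roomB", "stu4", 2, 79),
    ("roomB", "stu4", 3, 82),  # unequal repeats are fine in this layout
]

n_observations = len(rows)                                   # level 1
n_students = len({(room, stu) for room, stu, _, _ in rows})  # level 2
n_classrooms = len({room for room, _, _, _ in rows})         # level 3

print(n_observations, "observations within",
      n_students, "students within",
      n_classrooms, "classrooms")
```

A mixed model can use both the student and the classroom identifier as grouping variables, which is the piece the repeated measures ANOVA layout has no slot for.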

6. Repeated measures across people and items

There is a repeated measures design that occurs in specific experimental studies. It’s common in linguistics and psychology. These are studies in which each subject is repeatedly measured across many trials. Each trial contains one item, and there are multiple items for each condition.

An example would be measuring the reaction time of 50 participants to each of 20 high-frequency and 20 low-frequency words.

It’s clear that the 40 reaction times are repeated across each participant, and we need to account for the fact that multiple responses from the same subject are correlated. After all, some participants will always be faster than others.

But each word also has 50 repeated measurements (one per participant) and those are also likely to be correlated to each other. Some words will elicit faster times than others, even within the same condition.

Repeated measures ANOVA can account for repeats across only one type of unit: participants or words, but not both at once.

What to use instead: A mixed model with crossed random effects.
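
The crossed structure is easier to see when the design is written out. This Python sketch builds the trial list for the example above, with made-up participant and word IDs and no reaction times, and counts how many trials belong to each participant and to each word.

```python
from collections import Counter
from itertools import product

# Hypothetical IDs for the 50 participants and 40 words in the example.
participants = [f"p{i:02d}" for i in range(1, 51)]
words = ([(f"hi{j:02d}", "high") for j in range(1, 21)]
         + [(f"lo{j:02d}", "low") for j in range(1, 21)])

# Every participant sees every word, so the two groupings are crossed,
# not nested: each trial belongs to one participant AND one word.
trials = [
    {"participant": p, "word": w, "frequency": freq}
    for p, (w, freq) in product(participants, words)
]

per_participant = Counter(t["participant"] for t in trials)
per_word = Counter(t["word"] for t in trials)

print(len(trials), "trials")
print(set(per_participant.values()), "trials per participant")
print(set(per_word.values()), "trials per word")
```

Because every trial sits in both a participant cluster and a word cluster, a model needs a random effect for each grouping to account for both sources of correlation.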

7. Non-continuous outcomes

Finally, repeated measures ANOVA has assumptions of normality within each factor.

Sure, it’s robust to small departures from this assumption. And if the outcome variable is continuous, unbounded, and measured on an interval or ratio scale, you may be able to solve non-normality with a transformation.

But if you’ve got categorical outcomes or count outcomes, it’s not going to work. Luckily, there are other options.

What to use instead: Generalized Estimating Equations (GEE) or a Generalized Linear Mixed Model (GLMM).

Comments

Adrian Olszewski says

August 1, 2023 at 2:40 pm

One should not forget, however, that GEE may be (quite likely!) biased under missingness other than MCAR, which is rather rare and often unrealistic. To make it work under MAR:
– use DR GEE (doubly robust)
– use IPW (inverse probability weighting) GEE
– use MI GEE (multiple imputation GEE). This has the additional bonus of keeping the sample size.
I’d like to repeat that: do NOT use GEE on data with missing observations if you suspect that MCAR may not hold. That is a reasonable approach, bearing in mind that MNAR (missing not at random) can NEVER be ruled out, so MCAR becomes rather wishful thinking. MAR may be a good compromise, but sensitivity analysis should always be performed against some common MNAR patterns.

Kali Prescott says

January 21, 2023 at 11:30 pm

Hello,

I have a case where my experiment was a classic repeated measures design. We sampled 4 independent replicates repeatedly over time for gases. However, we used a novel sampling technique to measure gases that previously could not be measured, and there is a high probability of data loss at low concentrations. The result is that my data is extremely zero-inflated. However, because of the unreliability of the measuring method, I can’t assume these are true zeros. When I convert all zeros in my dataset to NA and treat them as “non-detects” rather than a concentration of 0, my data becomes normal for many gases. But with the NAs I cannot run a repeated measures ANOVA. What would you suggest as an alternative model? Based on your suggestion of a linear mixed model, I attempted a Mixed Model with Repeated Measures using the mmrm package in R, but have had issues with the covariance structure of the model (likely a coding issue, but I would like to know if I’m on the right track).

Ramona says

December 2, 2022 at 7:56 am

Hi,

I had a question that might be related to your first and second points. What issues could I encounter if I run an rmANOVA on 20 participants at 3 different time points (before, during and after training) where one of the time points has fewer observations per participant (1/3 as many) than the other two? Would this make the time points not comparable? What could I use as an alternative?

- Karen Grace-Martin says
  
  December 21, 2022 at 3:26 pm
  
  You’re right. This is a perfect example of an analysis where RMAnova isn’t a good choice.
  
  A similar and better choice is to run a marginal model using maximum likelihood. Different software packages have different names for this model, but in SPSS and SAS you’d use the mixed procedure with a repeated statement.
  
Steven says

May 28, 2022 at 3:45 am

Hello, I am wondering whether it matters how many dependent variables you use. For example, in my study I measured pre and post for shoulder strength and tendon size. Does each of these need a separate ANOVA, or can they be combined? The units for strength are newtons, and strength measures go to about 200. However, the units for tendon thickness are mm (not sure if this makes any difference).

- Karen Grace-Martin says
  
  June 15, 2022 at 11:11 am
  
  A separate analysis for each dependent variable is the simplest approach. There are ways to combine, but they get very complex and may not answer your research question.
  
Rebecca Bokoch says

March 3, 2021 at 12:29 am

If you are running a simple repeated measures ANOVA, looking at change over time (time 1, time 2, time 3) for a measure of anxiety, and one of the measures of anxiety (time 1) is not normally distributed but the other two are (time 2 and time 3), do you:
1) transform just anxiety at time 1?
2) transform all measures of anxiety at time 1, 2, & 3?
3) don’t transform any of the measures of anxiety

If you do have to use a transformation, how do you interpret the results? Please let me know if you can advise. Thank you!

- Karen Grace-Martin says
  
  December 6, 2021 at 12:58 pm
  
  Rebecca,
  
  It won’t help to transform just time 1 because then your scaling will be totally off. And transforming all three time points might cause more problems than it solves. This might be one of those situations where there just isn’t a clear best way to do it. So you have to look at everything you’ve got and make the best decision you can. In this case, it would really depend on the way in which time 1 anxiety isn’t normal. Like, is it skewed right, bimodal, uniform?
  
Richard Anderson says

July 7, 2020 at 2:38 pm

I think there’s a problem with saying that repeated-measures ANOVA can’t handle the following:

“5. Three (or more) level models
If the subjects themselves are not only measured multiple times, but also clustered into some other groups, you’ve got a three-level model.”

That’s just a repeated-measures ANOVA with “X” as a within-subject factor (whichever repeated-measure “X” refers to) and “Class” as a between-subject factor.

For example, you may have students measured over time, but students are also clustered within classrooms.

Sophia says

February 22, 2020 at 8:05 pm

Hi there, I have a question regarding repeated measures ANOVA. Does this test check for random effects in your data set? Does it tell you if there’s a special relationship between data points (e.g. subject 1 and subject 2 have similar values across different time points.) Does it check for if “when the values for subject 1 go up, the values for subject 2 go up too”?
Thank you,
Sophia

Roos says

June 16, 2019 at 4:18 am

Hi! I have to perform a repeated measure in SPSS and this works for the example that I found on YouTube. However, if I enter my own data, the outcome of the Mauchly’s Test of Sphericity is a . by significance, a 0 by df, and 1,000 for almost all other values (except for approx. chi-square which is ,000). Could someone help me, because I do not know what I did wrong?

- Karen Grace-Martin says
  
  August 22, 2019 at 11:59 am
  
  Hi Roos,
  
  It’s too hard to tell from your description what happened. I can tell you Mauchly’s test isn’t a very good test of sphericity anyway, but it’s strange you didn’t even get a value.
  
- mohammed says
  
  February 20, 2020 at 12:49 pm
  
  If the number of repeats is two, you will not get results for this test. You need at least 3 repeats.
  
Ray says

May 1, 2019 at 1:36 pm

Hi. I have a design where participants view images repeatedly, and the images have 3 levels. I have a continuous predictor (i.e., scale measuring life history). Can a mixed model (LME) be appropriate for this type of design? Thank you.

- Karen Grace-Martin says
  
  May 9, 2019 at 9:49 am
  
  Hi Ray,
  
  It sounds like it, but I would need to know a lot more detail before I could give you accurate advice about the analysis to take for any given study.
  
William says

June 19, 2018 at 3:57 am

Thanks! This REALLY helped!!

Lena says

June 14, 2018 at 9:17 am

Hi Karen,

Thank you very much for providing this info.

I would be thankful if you could give me some quick feedback regarding the best analysis for my situation. I have one sample which went through a physical activity program. We measured participants at baseline and at 10-week and 20-week follow-up (no control condition). My retention is very poor: baseline = 58 participants, 10 weeks = 39, and 20 weeks = 21. Our outcomes are a few tests on a continuous scale (time, repetitions).

I would be thankful for your tip.

Greetings,
Lena

- Karen Grace-Martin says
  
  October 26, 2018 at 5:18 pm
  
  Hi Lena,
  
  Unfortunately, there isn’t a clear answer. It really depends on why people are dropping out and how much you can assume randomness of dropout.
  
Faraz Farooqi says

February 21, 2017 at 8:12 am

Good explanation. I have a question: we record values on TWO occasions, with the same participants each time. Can we run a repeated measures ANOVA, or should I go for a paired-samples t-test (non-parametric)?
My variables are continuous in nature and my data is not distributed normally.

Amanda says

February 15, 2017 at 5:02 am

Karen,

Thanks for your explanation.

Currently, I am running an experiment with 5 independent variables and two dependent variables (response time and correctness). Correctness is a binary response, and I used GEE. Also, since there are missing data for the response time, I used a mixed model. However, when I found a significant result for one independent variable which has three levels, I wanted to do multiple comparisons to find out where the significant results come from. Is there any suggestion you could provide for doing these multiple comparisons? Or should I look into the Estimates of Fixed Effects table?

Thanks.

Best,

Amanda

Erin says

July 10, 2015 at 4:38 pm

Karen,

I really enjoyed reading this. I was struck by a line in your first paragraph “Sometimes trying to fit a data set into a repeated measures ANOVA requires too much data gymnastics—averaging across repetitions or pretending a continuous predictor isn’t”.

This is the situation I am currently in. I am running a repeated measures in SPSS, and I have two predictor variables, one is continuous (which I entered as a co-variate) and the other categorical (which I entered as a between-subject factor). I want these two predictor variables to interact, so what I did was alter the syntax and added the interaction in the “design” line. The interaction is significant, and now I am trying to interpret the interaction. I can split the file by the categorical predictor and determine the level of the categorical variable that is moderated by my continuous predictor. Because GLM does not produce a beta coefficient, I am having a hard time knowing the direction of this association.

I feel like I might be missing something obvious. Any thoughts you have are greatly appreciated.

Erin

- Karen says
  
  July 14, 2015 at 9:43 am
  
  Hi Erin,
  
  I think what you’re missing is that GLM does produce a beta coefficient. You need to use /Print parameter~~Solution~~.
  
  However, in Repeated Measures GLM, it may not be what you want. I suspect you’ll have to use Mixed instead of RM Anova.
  
  You may find this helpful: https://www.theanalysisfactor.com/resources/mixed-multilevel-models/
  
  - Gerard says
    
    May 15, 2020 at 9:41 am
    
    Hi Karen,
    
    Great website. I’m wondering as well how to use the /PRINT SOLUTION command in the GLM function in SPSS to get a beta coefficient. The only outputs I have are F statistics & the p-value for each co-variate.
    Whenever I try to type /PRINT=SOLUTION or SOLUTION into the syntax it generates an error. It seems /PRINT SOLUTION is a ‘mixed’ syntax?
    At my wits end at the moment!
    
    Many thanks!
    
    - Karen Grace-Martin says
      
      June 2, 2020 at 4:25 pm
      
      Hi Gerard,
      
      Oh, I think you’re right. Sorry about that. Try /print parameter in GLM. (I’ll fix that).
      
      One nice thing about SPSS is if you can type the first letter of an option, it will give you a drop down menu of all the possible options. So if one isn’t working, you can see what does.
      
Mike Lyell says

October 22, 2014 at 7:30 am

Beautiful explanation. However, I’m trying to analyze a dataset and predict a binary dependent variable measured once, several years after obtaining multiple measurements on an independent variable (time-varying covariate) in an unbalanced design. As the dependent variable is only measured once, I’m uncertain as to the correct approach in analyzing this. I have previously done Cox analyses with time-varying covariates, but I’ve never seen an approach with time-varying covariates for logistic regression. Any ideas?

- Karen says
  
  October 23, 2014 at 2:01 pm
  
  Hi Mike,
  
  There are a few options, but the most common would be to summarize the time-varying covariate with something like its max, its mean, or the slope of its change over time, and use that as a predictor. If there aren’t too many time points of this variable, you can also use each value as a covariate.
  
Craig Marsden says

September 24, 2014 at 3:06 pm

Really well structured explanation.
