# Approaches to Repeated Measures Data: Repeated Measures ANOVA, Marginal, and Mixed Models

In a recent post, I discussed the differences between repeated measures and longitudinal data, and some of the issues that come up in each one.

I want to expand on that discussion and walk through the three approaches you can take to analyze repeated measures data.

For a few, very specific designs, you can get the exact same results from all three approaches.  This, I find, has always made it difficult to figure out what each one is doing, and how to apply them to OTHER designs.

For the purposes of discussion here, I’m going to define repeated measures data as repeated measurements of the same outcome variable on the same individual.  The individual is often a person, but could just as easily be a plant, animal, colony, company, etc.  For simplicity, I’ll use “individual.”

Beyond that, anything goes.  Measurements can be repeated over time or space; time can itself be an important factor in the experiment or not; each individual can have 2 or 20 measurements.

Approach 1: Repeated Measures Multivariate ANOVA/GLM

When most researchers think of repeated measures, they think ANOVA.  In my personal experience, repeated measures designs are usually taught in ANOVA classes, and this is how they are taught.

The data are set up with one row per individual, so the individual is the unit of analysis.  This is called the wide format.

The multiple measures of the outcome variable are in multiple columns of data, and each is considered a different variable.  It’s a multivariate approach and is run as a MANOVA, so the model equation has multiple dependent variables and multiple residuals. (SPSS users: this is the approach taken by the Repeated Measures (RM) GLM procedure.)

The biggest advantage of this approach is its conceptual simplicity.  It makes sense.  But it has a lot of assumptions that can be very difficult to meet in all but very limited experimental situations.

These include balanced data (if even one observation is missing, the subject will get dropped) and equal correlations among response variables.  It also has the limitation that it cannot do post-hoc tests on the repeated measures factor, which I consider a huge limitation.
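To make the wide format and the missing-data problem concrete, here is a minimal Python sketch. The data, the column names (`t1`, `t2`, `t3`), and the helper `complete_cases` are all hypothetical. It only illustrates how a complete-case procedure discards an individual who is missing a single measurement; it is not what any particular software package runs internally.

```python
# Hypothetical wide-format data: one row per individual,
# one column per repeated measurement (None marks a missing value).
wide = [
    {"id": 1, "t1": 5.1, "t2": 5.9, "t3": 6.4},
    {"id": 2, "t1": 4.8, "t2": None, "t3": 6.0},  # one missing observation
    {"id": 3, "t1": 5.5, "t2": 6.1, "t3": 6.9},
]


def complete_cases(rows, measure_cols):
    """Keep only the individuals with a value in every measurement column."""
    return [r for r in rows if all(r[c] is not None for c in measure_cols)]


kept = complete_cases(wide, ["t1", "t2", "t3"])
print([r["id"] for r in kept])  # individual 2 is dropped entirely
```

Individual 2 contributed two valid measurements, but neither is used once the row is removed.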

It tends to work well in many experimental situations, where each measurement is taken under a different experimental condition.

Approach 2: The Marginal Multilevel Model

The second approach assumes the repeated responses make up multilevel data.  The outcome is a single variable, and another variable is needed to indicate the condition or time of each measurement.  This requires that each subject have multiple rows of data in the spreadsheet. This is called the long format, or stacked data, and it changes the unit of analysis from the subject to each measurement occasion.
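The reshaping from wide to long can be sketched in a few lines of Python. The function `wide_to_long`, the column names, and the data are hypothetical. Leaving out missing measurements rather than whole individuals is an assumption of this sketch, not a rule stated above.

```python
def wide_to_long(rows, id_col, measure_cols):
    """Stack wide rows into one row per individual per measurement occasion.

    Missing measurements (None) are simply left out as rows.
    """
    long_rows = []
    for r in rows:
        for occasion in measure_cols:
            if r[occasion] is not None:
                long_rows.append(
                    {id_col: r[id_col], "occasion": occasion, "y": r[occasion]}
                )
    return long_rows


# Hypothetical wide data: two individuals, three occasions.
wide = [
    {"id": 1, "t1": 5.1, "t2": 5.9, "t3": 6.4},
    {"id": 2, "t1": 4.8, "t2": None, "t3": 6.0},
]

long_rows = wide_to_long(wide, "id", ["t1", "t2", "t3"])
for row in long_rows:
    print(row)
```

Each printed row is now one measurement occasion, with `occasion` playing the role of the extra variable that indicates the condition or time.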

In a marginal model (AKA, the population averaged model), the model equation is written just like any linear model.  There is a single response and a single residual.  The difference between the marginal model and a linear model is that the residuals are not assumed to be independent with constant variance.

In a marginal model, we can directly estimate the correlations among each individual’s residuals.  (We do assume the residuals across different individuals are independent of each other). We can specify that they are equally correlated, as in the RM ANOVA, but we’re not limited to that assumption.  Each correlation can be unique, or measurements closer in time can have higher correlations than those farther away.  There are a number of common patterns that the residuals tend to take.

Likewise, the residual variances don’t have to be equal as they do in the RM ANOVA.
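As an illustration of what these patterns look like, the sketch below builds two commonly used residual correlation structures and then allows the variances to differ across occasions. The function names and the numeric values are hypothetical, and the code only constructs the matrices; it does not estimate anything from data.

```python
def compound_symmetry(n, rho):
    """Equal correlation rho between every pair of occasions."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def ar1(n, rho):
    """First-order autoregressive: correlation rho**|i - j| shrinks with lag."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def to_covariance(corr, sds):
    """Scale a correlation matrix by per-occasion standard deviations."""
    n = len(sds)
    return [[corr[i][j] * sds[i] * sds[j] for j in range(n)] for i in range(n)]


cs = compound_symmetry(3, 0.5)   # every off-diagonal entry is 0.5
ar = ar1(3, 0.5)                 # adjacent occasions 0.5, two apart 0.25
cov = to_covariance(ar, [1.0, 2.0, 3.0])  # unequal variances: 1, 4, 9

for name, matrix in [("CS", cs), ("AR(1)", ar), ("AR(1) covariance", cov)]:
    print(name, matrix)
```

An unstructured pattern, where each correlation is unique, would simply be a matrix with every off-diagonal entry estimated separately.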

So in cases where the assumptions of equal variances and equal correlations are not met, we can get much better fitting models by using a marginal model.  The other big advantage is that, by taking a univariate approach, we can do post-hoc tests on the repeated measures factor.

Approach 3: The Linear Mixed Model

Like the marginal model, the linear mixed model requires the data be set up in the long or stacked format.

It too controls for non-independence among the repeated observations for each individual, but it does so in a conceptually different way.  Rather than just estimate the correlation among an individual’s repeated observations, it actually adds one or more random effects for Individuals to the model.

The model equation therefore includes extra terms for any random effects.  They take the form of additional residual terms, each of which has its own variance to be estimated.

This literally means the model is controlling for the effects of individual.  The simplest mixed model, the random intercept model, controls for the fact that some individuals always have higher values than others.  By controlling for this variation, we’ve taken it out of the original residual.
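A small sketch may help show what the random intercept does to the residual. Assume the model is y_ij = fixed part + u_i + e_ij, where u_i is the random intercept for individual i and e_ij is the remaining residual, independent of u_i. The symbols, function names, and variance values here are assumptions for illustration only.

```python
def random_intercept_covariance(n, var_intercept, var_residual):
    """Implied covariance of n repeated measures on one individual.

    Under y_ij = fixed part + u_i + e_ij, every pair of measurements on
    the same individual shares u_i, so each covariance equals
    Var(u_i), and each variance equals Var(u_i) + Var(e_ij).
    """
    return [
        [var_intercept + (var_residual if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]


def intraclass_correlation(var_intercept, var_residual):
    """Implied correlation between any two measurements on one individual."""
    return var_intercept / (var_intercept + var_residual)


cov = random_intercept_covariance(3, var_intercept=2.0, var_residual=6.0)
icc = intraclass_correlation(2.0, 6.0)
print(cov[0])  # [8.0, 2.0, 2.0]
print(icc)     # 0.25
```

The implied correlation is the same for every pair of occasions, which is the equal-correlation pattern. That is one reason the approaches can agree in simple designs.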

Individual growth curve models are a specific type of mixed model that uniquely models each individual’s value of the outcome over time.  They are particularly useful when the research question is about how covariates affect not only the value of the dependent variable, but its change over time.

The biggest advantage of mixed models is their incredible flexibility.  They can handle clustered individuals as well as repeated measures (even in the same model).  They can handle crossed random effects, where there are repeated measures not only on an individual, but also on each stimulus.

Time can easily be considered continuous or categorical, and covariates can be measured just once per individual or repeatedly at each observation.  Unbalanced data are no problem, and even if some outcomes are missing for some individuals, they won’t be dropped from the model.

The biggest disadvantage of mixed models, at least for someone new to them, is their incredible flexibility. It’s easy to mis-specify a mixed model, and this is a place where a little knowledge is definitely dangerous.

Learn more about repeated measures analysis using mixed models in our most popular workshop: Analyzing Repeated Measures Data: GLM and Mixed Models Approaches.

{ 22 comments… read them below or add one }

E Roest July 31, 2011 at 10:16 am

Thank you for your clarification. As a matter of fact, I’m struggling with this material since I’m looking for an appropriate method to analyze my thesis data.

Could someone here maybe help me out a bit? I have a relatively simple design, yet the analysis is somewhat complicated. I have a 2×2 between-subjects design in which I measure product evaluation. So far, nothing complicated. But one of my between-subject factors has 4 different replications, resulting in 16 different experimental conditions.
Moreover, I’m also using 4 (within-subject) replications (different product categories), resulting in an unbalanced mixed-design. I understand that the latter fact rules out Repeated Measures ANOVA as an option. Someone suggested using Linear Regression, but unless I used some kind of Repeated Measures Regression, I will violate some assumptions.

Any suggestions?

Karen August 26, 2011 at 1:29 pm

Hi Elvin,

First, I apologize for the late response. I was out of town with very haphazard internet access (and I’m only now getting over the withdrawal symptoms).

Yes, you’re going to need to run a mixed model. You have a variation on a randomized block design. Most Design of Experiments text books have chapters on these models.

You’re right that the repeated measures anova won’t work if it’s unbalanced. You are going to need to run a mixed model. If you want some background info (it won’t answer all your questions or give you step-by-step instructions on how to run it, but it will give you a framework in which to approach it), I would recommend my webinar Fixed and Random Factors in Mixed Models. It’s free.

karim August 8, 2011 at 6:28 pm

My research question is to examine the effects of odors on consumer behavior and the moderating effects of sex and need for stimulation on the relationship between odor and consumer behavior. My variables are:
1 – Independent variable: odors (three experimental conditions: presence of odor x, presence of odor y, no odor)
2 – Dependent variable (continuous variables): reactions of consumers
3 – moderating variables: need for stimulation (continuous variable: a number of Likert scale items) and Sex (Categorical variable)

My question: how do I test the moderating effect of need for stimulation in SPSS, given that the measurement scale for this variable has several items? Can we recode it into two categories (high vs. low need for stimulation), treating respondents who score below 4 on the 7-point Likert scale as having a low need for stimulation and those who score above 4 as having a high need? What is the exact procedure in SPSS?

Thank you

Karen August 26, 2011 at 1:06 pm

Hi Karim,

I’m assuming all your variables are between subjects b/c you didn’t say otherwise.

A moderating effect is just an interaction. So put in an interaction term between Odor and Need for Stimulation to test that moderation. You can do this regardless of whether the moderator is continuous or categorical. It doesn’t matter. You do need to interpret them differently b/c you have to use the regression coefficients. If you’re not familiar with interpreting interactions in regression, I would recommend watching this webinar recording: Interpreting Linear Regression Parameters: A Walk Through Output. It’s free.

You have to be careful about categorizing continuous predictors. It CAN make sense, but usually you just lose information.
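The interaction term described in this reply can be sketched as a row of predictors for one respondent. The function `design_row`, the coding choices (dummy coding with "none" as the reference, and centering the continuous moderator), and the numbers are all hypothetical. Centering is an optional choice made here for illustration, not something the reply requires.

```python
def design_row(odor, need, need_mean):
    """Predictors for one respondent.

    Columns: intercept, odor x dummy, odor y dummy (reference = "none"),
    centered need for stimulation, and the two odor-by-need products
    that carry the moderation (interaction) effect.
    """
    is_x = 1.0 if odor == "x" else 0.0
    is_y = 1.0 if odor == "y" else 0.0
    need_c = need - need_mean  # keep the moderator continuous, just centered
    return [1.0, is_x, is_y, need_c, is_x * need_c, is_y * need_c]


row = design_row("x", need=5.5, need_mean=4.0)
print(row)  # [1.0, 1.0, 0.0, 1.5, 1.5, 0.0]
```

The coefficients on the last two columns are the ones that test whether the effect of each odor depends on need for stimulation.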

Ali April 19, 2012 at 9:23 am

Hi Karen,
Love your site! I was just wondering if you could expand a little on your explanation for approach 1 (GLM) and post-hocs. In particular, what should you do if you have a significant interaction between your RM factor and one/more between-subjects factors? Is it OK to run ANOVAs/ANCOVAs at each level of the RM factor, or is the error term not appropriate?
(To put this slightly rambling question in context, I am using RM ANOVA/ANCOVA with brain imaging data. I have a region of interest, ROI, and because I have measured the ROI’s volume in the left and right hemispheres, my within-subjects factor has those 2 levels. I want to look at asymmetry effects, and also effects of clinical group and gender (and their interactions), and also build a second model that includes whole brain volume as a continuous covariate/control variable.) Thank you so much for any advice you have!!

Karen April 19, 2012 at 11:11 am

Hi Ali,

Those are tricky. In a between subjects model, you would run simple effects and change the error term (Keppel’s Design & Analysis has a nice chapter on this). But there are multiple error terms in a repeated measures GLM.

I would personally rerun it as a marginal model in mixed software, and use the multiple comparisons options in the EMMeans (SPSS) or LSMeans (SAS) statement. You have a pretty straightforward model.

Karen

Ali April 23, 2012 at 3:17 am

Thanks, Karen!!

Gabriela April 27, 2012 at 2:29 pm

Hi Karen,
could you write a bit more about post hoc tests in mixed models? In my research I have a repeated measure and 2 between-subject variables, each with two levels (for example, sex and age [young and old]). I have difficulties running post hoc tests for the interaction of the between-subject variables. Could you give me a tip on how to deal with it?
Thanks for any advice!

Karen May 1, 2012 at 9:24 am

Hi Gabriela,

That should work pretty well. Usually the tricky part is post-hoc tests for the within-subjects factors.

Which software are you using? I can tell you better how to approach it.

Karen

Gissele May 7, 2012 at 3:43 pm

Hi Karen
I really need your help. I am working in child welfare and have the following variables to analyze:
-repeated administrations of a tool (strengths and needs, which has questions on social support, income, drug use etc).
-whether they are an adult/child
-where they are now (child in care, at home etc)
What I am really interested in is seeing whether our involvement (the intervention) had any effect on strengths and needs. I anticipate that the strengths will go up and the needs will go down over time (over our intervention). However, I am unsure how to analyze these data. Originally I thought about a repeated measures ANOVA, but I am not happy about the fact that you need complete data or it will boot out the cases. Imputing missing data is not feasible as my data are ordinal. I thought about regression, but do I have to use polytomous?
Any advice will really help,
Gissele.

Karen May 9, 2012 at 11:30 am

Hi Gissele,

It’s hard to tell you how to proceed without all the details, but if you have missing data in the response variable, you’re right, repeated measures ANOVA won’t work. Actually, even without the repeat, if the outcome is ordinal, no ANOVA will work.

You probably need a mixed model, and since it’s ordinal, a Generalized Linear Mixed Model. Here are two resources that might help parts of this. If you want help putting it all together, I can only suggest a Quick Question consultation so that we get all the details straight.

This webinar recording is an example of testing effects over time: Random Intercept and Random Slope Models

And this article discusses GLMM: Five Extensions of the General Linear Model

Demewoz Haile August 12, 2012 at 8:52 am

Hi Karen,
First of all, thank you very much for clarifying my statistical confusion about mixed model analysis for repeated measures. I find the website very useful for researchers. We need more explanation with statistical software like SPSS and Stata.
Thank you
Demewoz Haile

Karen September 3, 2012 at 5:25 pm

Hi Demewoz–You’re welcome. Glad you found it helpful.

Karen

Paola December 13, 2012 at 2:58 am

Hi Karen,
I am trying to use repeated measures ANOVA for plant varietal treatments that were sampled across time. Is it OK to use it for plant populations instead of individual plants? Second, I am using JMP software. How can I approach the post-hoc test? Thanks,

Paola

Karen December 19, 2012 at 11:27 am

It can be okay if the plant populations are the unit of measurement. I’d have to see it to be sure.

And I haven’t used JMP in a while, so don’t know the post hoc tests off the top of my head. If anyone else knows, please feel free to comment. Karen

Pardeep Kumar February 12, 2013 at 1:01 pm

Sir, please tell me the statistical technique by which I can compare the multiple measurements of multiple treatments. Like many recovery measures of blood pH of different exercises.

Karen February 13, 2013 at 2:52 pm

Hi Pardeep,

I would need a lot more information to answer that. I can interpret it a few different ways.

Karen

JF Cardin March 14, 2013 at 4:23 pm

Hi karen,

I am doing a GLM repeated measures analysis with levels of burnout measured at Pre and Post. I also have 5 categorical IV’s (2 of them with 2 categories and the other 3 with 3 categories). If I put all of them in the model and just ask for a design with 2-way interactions (time*each of the IV’s), I see a significant increase of Burnout for one category of one of my IV’s (level of education). However, if I run the same model but this time including 3-way interactions (time*IV*IV) in the design, the significant increase observed for one category of level of education (in the 2-way interactions) is now not significant. In fact, there is now a significant DECREASE for the other 2 categories of that variable. Does it make sense? Why is it that including 3-way interactions in the model changes all the results for the 2-way interactions? Is it because I have too many variables/interactions in my model? Should I just look at my model including the 3-way interactions instead of doing it step-by-step? Thank you so much. JF

Karen March 15, 2013 at 11:03 am

Hi JF,

It’s hard to tell without a thorough examination of the output what is going on.

It’s possible that there really are 3-way interactions, but when you don’t include them in the model, their effects are getting “pushed” into the 2-ways.

I would suggest doing a bunch of graphing. Plot the EMmeans across the 3 way interactions, then the two-way. See where the differences in means are the same across time and where they aren’t. You could also do this through investigation of means from a table, but I personally find it much easier to see if I plot them.

Karen

JF Cardin March 26, 2013 at 11:01 am

Thank you Karen, I will do what you proposed. And thanks for everything you are doing to help us!

Gareth April 5, 2013 at 3:32 pm

Can you help me see which approach I should use?

I’m doing an education study and I’m trying to see the effect of website usage on quiz scores in a college plant identification class. There were 30 students in the class and they took 10 weekly quizzes over the course of the semester. They also had a website where they could log in and use a study tool. The website kept track of the time they were online.

So my dependent variable is quiz score (continuous) and my predictor variable is website usage over the previous seven days (also continuous).

What I essentially want is a scatterplot with a regression line, showing whether using the website improved the quiz score. But to get there I need to control for differences between the students and differences between the quizzes. Each student took 10 repeated measures, and those 10 measurements were slightly different from week to week. So if I’m looking at my data set in the wide format, I need to control for differences between the rows and differences between the columns.

I tried using “General Linear Model > Repeated Measures” in SPSS, but I can’t figure out how to tell the program that website usage is a single, continuous predictor variable. And I also can’t figure out how to control for the shifting class average from week to week. Can you give me any guidance?

Thanks!