The Analysis Factor

Statistical Consulting, Resources, and Statistics Workshops for Researchers

Approaches to Repeated Measures Data: Repeated Measures ANOVA, Marginal, and Mixed Models

by Karen Grace-Martin

In a recent post, I discussed the differences between repeated measures and longitudinal data, and some of the issues that come up in each one.

I want to expand on that discussion and describe the three approaches you can take to analyze repeated measures data.

For a few, very specific designs, you can get the exact same results from all three approaches.  This, I find, has always made it difficult to figure out what each one is doing, and how to apply them to OTHER designs.

For the purposes of discussion here, I’m going to define repeated measures data as repeated measurements of the same outcome variable on the same individual.  The individual is often a person, but could just as easily be a plant, animal, colony, company, etc.  For simplicity, I’ll use “individual.”

Beyond that, anything goes.  Measurements can be repeated over time or space; time can itself be an important factor in the experiment or not; each individual can have 2 or 20 measurements.

Approach 1: Repeated Measures Multivariate ANOVA/GLM

When most researchers think of repeated measures, they think ANOVA.  In my personal experience, repeated measures designs are usually taught in ANOVA classes, and this is how they are taught.

The data are set up with one row per individual, so the individual is the unit of analysis.  This is called the wide format.

The multiple measures of the outcome variable are in multiple columns of data; each is considered a different variable.  It’s a multivariate approach and is run as a MANOVA, so the model equation has multiple dependent variables and multiple residuals. (SPSS users: this is the approach taken by the Repeated Measures (RM) GLM procedure.)

The biggest advantage of this approach is its conceptual simplicity.  It makes sense.  But it has a lot of assumptions that can be very difficult to meet in all but very limited experimental situations.

These include balanced data (if even one observation is missing, the subject will get dropped) and equal correlations among response variables.  It also has the limitation that it cannot do post-hoc tests on the repeated measures factor, which I consider a huge limitation.
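To make the cost of the balanced-data requirement concrete, here is a minimal Python sketch. The subject IDs and scores are made up for illustration, and the code only mimics listwise deletion on wide-format data; it is not what any particular software package runs internally.

```python
# Hypothetical wide-format data: one entry per individual,
# one value per measurement occasion. None marks a missing measurement.
wide = {
    "s1": [5.1, 5.9, 6.4],
    "s2": [4.8, None, 6.0],
    "s3": [5.5, 6.1, 6.8],
    "s4": [None, 5.2, 5.9],
}

# Listwise deletion: keep only individuals with every measurement present.
complete = {
    sid: ys for sid, ys in wide.items()
    if all(y is not None for y in ys)
}

# Count measurements that were actually observed vs. those still usable.
n_observed = sum(y is not None for ys in wide.values() for y in ys)
n_used = sum(len(ys) for ys in complete.values())

print("individuals kept:", sorted(complete))
print("measurements observed:", n_observed)
print("measurements used after listwise deletion:", n_used)
```

With these made-up values, two of the four individuals are dropped, so only 6 of the 10 observed measurements enter the analysis.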

It tends to work well in many experimental situations, where each measurement is taken under a different experimental condition.

Approach 2: The Marginal Multilevel Model

The second approach assumes the repeated responses make up multilevel data.  The outcome is a single variable, and another variable is needed to indicate the condition or time of each measurement.  This requires that each subject have multiple rows of data in the spreadsheet. This is called the long format, or stacked data, and it changes the unit of analysis from the subject to each measurement occasion.
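As a rough illustration of the long format, the following Python sketch stacks a small wide data set into one row per individual per measurement occasion. The column names (id, t1, t2, t3, occasion, y) and the values are hypothetical, and the helper wide_to_long is written here for illustration rather than taken from any library.

```python
# Hypothetical wide-format data: one row per individual, one column per occasion.
wide_rows = [
    {"id": 1, "t1": 10.0, "t2": 12.0, "t3": 15.0},
    {"id": 2, "t1": 8.0, "t2": None, "t3": 11.0},  # one missing measurement
]


def wide_to_long(rows, id_key, occasion_keys):
    """Stack wide rows into one row per individual per measurement occasion."""
    long_rows = []
    for row in rows:
        for occasion in occasion_keys:
            value = row[occasion]
            if value is None:
                # Skip only the missing occasion; the individual stays in the data.
                continue
            long_rows.append(
                {id_key: row[id_key], "occasion": occasion, "y": value}
            )
    return long_rows


long_rows = wide_to_long(wide_rows, "id", ["t1", "t2", "t3"])
for r in long_rows:
    print(r)
```

The outcome now lives in a single column (y), and the occasion column plays the role of the condition or time indicator. Note that individual 2 contributes two rows instead of being removed.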

In a marginal model (AKA, the population averaged model), the model equation is written just like any linear model.  There is a single response and a single residual.  The difference between the marginal model and a linear model is that the residuals are not assumed to be independent with constant variance.
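One common way to write this down is sketched below. The notation is an assumption for illustration (i indexes individuals, j indexes measurement occasions, and R_i is the residual covariance matrix for individual i); it is not taken from any specific textbook or software manual.

```latex
% Marginal model: a single response and a single residual per observation
\[
  y_{ij} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + \varepsilon_{ij},
  \qquad
  E(\boldsymbol{\varepsilon}_i) = \mathbf{0},
  \qquad
  \operatorname{Cov}(\boldsymbol{\varepsilon}_i) = \mathbf{R}_i
\]
% The ordinary linear model is the special case of independent residuals
% with constant variance:
\[
  \mathbf{R}_i = \sigma^2 \mathbf{I}_{n_i}
\]
```

Here the vector of residuals for individual i collects that individual's n_i residuals. The fixed-effects part is identical to a linear model; only the assumption about R_i changes.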

In a marginal model, we can directly estimate the correlations among each individual’s residuals.  (We do assume the residuals across different individuals are independent of each other.) We can specify that they are equally correlated, as in the RM ANOVA, but we’re not limited to that assumption.  Each correlation can be unique, or measurements closer in time can have higher correlations than those farther away.  There are a number of common patterns that the residual correlations tend to take.
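Two of these patterns can be sketched in a few lines of Python. The function names and the value rho = 0.5 are illustrative assumptions; the first matrix has equal correlations (often called compound symmetry) and the second is a first-order autoregressive pattern, in which correlations shrink as measurements get farther apart in time.

```python
def compound_symmetry(n, rho):
    """Correlation matrix with the same correlation rho between every pair."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def ar1(n, rho):
    """First-order autoregressive pattern: correlation rho ** (lag between occasions)."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def show(name, matrix):
    """Print a small matrix with two decimals per entry."""
    print(name)
    for row in matrix:
        print("  " + "  ".join(f"{value:.2f}" for value in row))


cs = compound_symmetry(4, 0.5)
ar = ar1(4, 0.5)
show("Equal correlations (compound symmetry):", cs)
show("AR(1):", ar)
```

An unstructured pattern would instead estimate every off-diagonal entry separately, at the cost of many more parameters.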

Likewise, the residual variances don’t have to be equal as they do in the RM ANOVA.

So in cases where the assumptions of equal variances and equal correlations are not met, we can get much better fitting models by using a marginal model.  The other big advantage is that, by taking a univariate approach, we can do post-hoc tests on the repeated measures factor.

Approach 3: The Linear Mixed Model

Like the marginal model, the linear mixed model requires the data be set up in the long or stacked format.

It too controls for non-independence among the repeated observations for each individual, but it does so in a conceptually different way.  Rather than just estimating the correlation among an individual’s repeated observations, it actually adds one or more random effects for individuals to the model.

The model equation therefore includes extra terms for the random effects.  They take the form of additional residual terms, each of which has its own variance to be estimated.

This literally means the model is controlling for the effects of individual.  The simplest mixed model, the random intercept model, controls for the fact that some individuals always have higher values than others.  By controlling for this variation, we’ve taken it out of the original residual.
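A sketch of the random intercept model in equation form may help. The symbols are assumptions chosen for illustration: u_i is the random intercept for individual i, and the two variance components are written as sigma_u squared and sigma squared.

```latex
% Random intercept model: one extra residual-like term per individual
\[
  y_{ij} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_i + \varepsilon_{ij},
  \qquad
  u_i \sim N(0, \sigma_u^2),
  \qquad
  \varepsilon_{ij} \sim N(0, \sigma^2)
\]
% With u_i and \varepsilon_{ij} independent, the implied variance, covariance,
% and correlation for two measurements j \neq k on the same individual are:
\[
  \operatorname{Var}(y_{ij}) = \sigma_u^2 + \sigma^2,
  \qquad
  \operatorname{Cov}(y_{ij}, y_{ik}) = \sigma_u^2,
  \qquad
  \operatorname{Corr}(y_{ij}, y_{ik}) = \frac{\sigma_u^2}{\sigma_u^2 + \sigma^2}
\]
```

The between-individual variation is carried by u_i rather than left in the residual. Under these assumptions every pair of measurements on the same individual has the same correlation, which is one way to see why this model can agree with an equal-correlation analysis in simple balanced designs.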

Individual growth curve models are a specific type of mixed model that uniquely models each individual’s value of the outcome over time.  They are particularly useful when the research question is about how covariates affect not only the value of the dependent variable, but its change over time.
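As a hedged sketch, a simple linear growth curve model with one individual-level covariate could be written as follows. The symbols are illustrative assumptions: t_ij is the time of measurement j for individual i, w_i is a covariate measured once per individual, and G is the covariance matrix of the random intercept and slope.

```latex
% Linear growth curve model with a random intercept and a random slope for time
\[
  y_{ij} = (\beta_0 + u_{0i}) + (\beta_1 + u_{1i})\, t_{ij}
         + \beta_2 w_i + \beta_3 w_i t_{ij} + \varepsilon_{ij}
\]
\[
  \begin{pmatrix} u_{0i} \\ u_{1i} \end{pmatrix} \sim N(\mathbf{0}, \mathbf{G}),
  \qquad
  \varepsilon_{ij} \sim N(0, \sigma^2)
\]
```

In this form, the coefficient on w_i alone describes how the covariate shifts the value of the outcome, while the coefficient on the product of w_i and t_ij describes how the covariate changes the rate of change over time.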

The biggest advantage of mixed models is their incredible flexibility.  They can handle clustered individuals as well as repeated measures (even in the same model).  They can handle crossed random effects, where there are repeated measures not only on an individual, but also on each stimulus.

Time can easily be considered continuous or categorical, and covariates can be measured just once per individual or repeatedly at each observation.  Unbalanced data are no problem, and even if some outcomes are missing for some individuals, they won’t be dropped from the model.

The biggest disadvantage of mixed models, at least for someone new to them, is their incredible flexibility. It’s easy to mis-specify a mixed model, and this is a place where a little knowledge is definitely dangerous.

 


Tagged With: Marginal Model, mixed model, Population Averaged Model, Repeated Measures

Related Posts

  • Six Differences Between Repeated Measures ANOVA and Linear Mixed Models
  • The Repeated and Random Statements in Mixed Models for Repeated Measures
  • The Difference Between Clustered, Longitudinal, and Repeated Measures Data
  • Linear Mixed Models for Missing Data in Pre-Post Studies

Comments

  1. Hank says

    May 31, 2020 at 6:56 pm

    Hi Karen,
    great article and linked video!
    The data I have suggest to me that a linear mixed model would be most appropriate, but it would be great to get your take on it.

    The outcome I have increases linearly and looks normally distributed at baseline and follow-up measurements. We want to see if the rate of increase in the outcome variable differs between groups defined by categories of two binary variables (we want to know if either of the binary variables has an effect and whether there is an interaction).
    Number of measurements of the outcome varies between subject as does the time between the measurements.
    We want to control for a few covariates (we’re only interested in the effects of abovementioned two binary variables) and those covariates are both categorical and continuous (for example age and gender).
    Do you think a linear mixed model might work, or might something like a marginal multilevel model be more appropriate?

  2. Richard Anderson says

    May 15, 2019 at 4:23 am

    This is helpful. But I think your Approach 1 is conflating two distinct conceptualizations: repeated measures versus multivariate. In a repeated measures ANOVA, while “the multiple measures of the outcome variable are in multiple columns of data,” each is considered a *level* (of one or more variables), not “a different variable.”

    • Karen Grace-Martin says

      May 16, 2019 at 11:14 am

      Hi Richard,

      You may think of those as different categories of the same variable, but that’s not what is happening mathematically. When you run a proc glm in SAS with the repeated statement or a Repeated Measures ANOVA in SPSS’s glm, it is literally treating those as different variables. All the multivariate output that you get is MANOVA output.

      And slightly off topic, and I know I’m alone on this, but I try not to use “level” to designate “values of a factor/categorical variable” because people mix it up with levels in a multilevel model. Two different meanings of the same word within the same context. Very confusing. But I am reading your comment as meaning the former.

  3. Larry says

    August 4, 2017 at 10:38 am

    You say that the multivariate approach to repeated measures has the limitation that it cannot do post-hoc tests on the repeated measures factor. Why can’t you use Hotelling’s t-test to do pairwise comparisons or a STP procedure to compare more than two categories? That is what I was taught.

    • Karen Grace-Martin says

      May 16, 2019 at 11:14 am

      Hi Larry,

      Then you’re not accounting for inflated type I error from multiple testing.

  4. Pim says

    June 26, 2016 at 6:59 am

    Hi there,

    I have hit a problem. I have conducted two tests and I want to do a RMANOVA.

    Problem is : the first test contained ten questions, the second test contained 11 questions.

    So if a student got 10 points on the first test and 11 on the second he scored the same grade but SPSS doesn’t recognize that.

    Is there any way in which I can account for that difference and still run a RMANOVA test that actually says something valid?

    Kind regards,
    Pim

    • Karen Grace-Martin says

      May 16, 2019 at 11:23 am

      Hi Pim,
      I created that same problem for myself in my senior thesis when I was an undergrad.

      This is actually a tricky situation because you have proportions, which aren’t appropriate for ANOVA anyway. I would have to ask you a few questions to really give good advice.

  5. Mark says

    April 14, 2016 at 11:51 am

    Hi there,
    I have a problem on SPSS and really need some help, if possible.

    I was originally trying to perform a repeated measures ANCOVA to investigate the effects of time (3 time points) on a continuous outcome variable. I have no missing cases. However, I need to include a time-varying covariate (also a continuous variable). From following some stats books, it appears that a linear mixed model will allow this. I have set the data up in long form, as suggested, and attempted to run the analysis (Analyze-Mixed Model-Linear). I have entered a categorical variable of SUBJECT (coded 1-73 for the number of subjects I have) into the subjects box, and the categorical variable of TIME (coded 1-3) into the repeated box.

    On the next box, I have placed my measured dependent variable into the corresponding box, my covariate (time varying continuous measurement) in the covariate box and I’ve placed my TIME factor into the Factors box.

    After this, I’m unsure what steps to take in terms of specifying random or fixed effects etc. I basically just need to know if there is a time effect when the covariate is included or not and how to interpret the SPSS output.

    Any help would be much appreciated!

    Thanks Karen

  6. Charan says

    March 28, 2016 at 2:12 pm

    Hi

    What do you mean by model and parameter matrices for repeated measures data in case of multivariate analysis??

  7. Jennifer says

    March 19, 2016 at 6:53 pm

    Hi Karen,

    Thank you for this article! I have a question, if that’s alright –

    I am trying to do a mixed design ANOVA (2 levels for the between-subj factor and 4 levels for the within-subj factor), and all I have are the means, standard deviations, and sample sizes. Since I can’t use the regular “click/drop-down menu” method, I am writing syntax for this analysis.

    I found how to do a one-way ANOVA (http://www-01.ibm.com/support/docview.wss?uid=swg21476127) and how to do a simple 2×2 or factorial ANOVA (http://www-01.ibm.com/support/docview.wss?uid=swg21475358) using syntax. I’ve been trying to play with this to make a syntax for a mixed design ANOVA, but I am not succeeding; I only keep coming up with a 2×4 factorial ANOVA, which isn’t right based on my data – I definitely need a mixed design ANOVA model.

    Any suggestions are much appreciated!! Thank you!

Copyright © 2008–2021 The Analysis Factor, LLC. All rights reserved.