Repeated Measures

The Difference Between Crossed and Nested Factors

December 18th, 2023 by

One of those tricky, but necessary, concepts in statistics is the difference between crossed and nested factors.

As a reminder, a factor is any categorical independent variable. In experiments, or any randomized designs, these factors are often manipulated. Experimental manipulations (like Treatment vs. Control) are factors.

Observational categorical predictors, such as gender, time point, poverty status, etc., are also factors. Whether the factor is observational or manipulated won’t affect the analysis, but it will affect the conclusions you draw from the results.

(more…)


The Wide and Long Data Format for Repeated Measures Data

December 2nd, 2023 by

One issue in data analysis that feels like it should be obvious, but often isn’t, is setting up your data.

The kinds of issues involved include:

  • What is a variable?
  • What is a unit of observation?
  • Which data should go in each row of the data matrix?

Answering these practical questions is one of those skills that comes with experience, especially in complicated data sets.

Even so, it’s extremely important. If the data isn’t set up right, the software won’t be able to run any of your analyses.

And in many data situations, you will need to set up the data in different ways for different parts of the analysis. (more…)
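As a rough illustration of the two layouts named in the title, here is a minimal sketch in plain Python. The data, the column names (subject, score_t1, and so on), and the helper wide_to_long are all hypothetical, made up for this example. It assumes the usual definitions: wide format has one row per subject with one column per measurement, and long format has one row per subject per measurement.

```python
# Hypothetical wide-format data: one row per subject, one column per time point.
wide = [
    {"subject": 1, "score_t1": 10, "score_t2": 12, "score_t3": 15},
    {"subject": 2, "score_t1": 8, "score_t2": 9, "score_t3": 11},
]


def wide_to_long(rows, id_col, prefix):
    """Return one row per subject per measurement occasion."""
    long_rows = []
    for row in rows:
        for col, value in row.items():
            if col.startswith(prefix):
                long_rows.append({
                    id_col: row[id_col],
                    "time": int(col[len(prefix):]),  # "score_t2" -> 2
                    "score": value,
                })
    return long_rows


long_data = wide_to_long(wide, id_col="subject", prefix="score_t")
for row in long_data:
    print(row)
```

Here the unit of observation changes from the subject (2 rows) to the measurement occasion (6 rows). In practice a data library or your statistical package would do this reshaping for you; the loop only shows what the reshaping means.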


The Difference Between Clustered, Longitudinal, and Repeated Measures Data

May 22nd, 2023 by

What is the difference between Clustered, Longitudinal, and Repeated Measures Data?  You can use mixed models to analyze all of them. But the issues involved and some of the specifications you choose will differ.

Just recently, I came across a nice discussion about these differences in West, Welch, and Galecki’s (2007) excellent book, Linear Mixed Models.

It’s a common question. There is a lot of overlap in both the study design and in how you analyze the data from these designs.

West et al. give a very nice summary of the three types. Here’s a paraphrasing of the differences as they explain them:

  • In clustered data, the dependent variable is measured once for each subject, but the subjects themselves are somehow grouped (students grouped into classes, for example). There is no ordering to the subjects within the group, so their responses should be equally correlated.
  • In repeated measures data, the dependent variable is measured more than once for each subject. Usually, there is some independent variable (often called a within-subject factor) that changes with each measurement.
  • In longitudinal data, the dependent variable is measured at several time points for each subject, often over a relatively long period of time.

A Few Observations

West and colleagues also make the following good observations:

1. Dropout is usually not a problem in repeated measures studies, in which all data collection occurs in one sitting.  It is a huge issue in longitudinal studies, which usually require multiple contacts with participants for data collection.

2. Longitudinal data can also be clustered.  If you follow those students for two years, you have both clustered and longitudinal data.  You have to deal with both.

3. It can be hard to distinguish between repeated measures and longitudinal data if the repeated measures occur over time.  [My two cents:  A pre/post/followup design is a classic example].

4. From an analysis point of view, it doesn’t really matter which one you have.  All three are types of hierarchical, nested, or multilevel data. You would analyze them all with some sort of mixed or multilevel analysis.  You may of course have extra issues (like dropout) to deal with in some of these.

My Own Observations

I agree with their observations, and I’d like to add a few from my own experience.

1. Repeated measures don’t have to be repeated over time.  They can be repeated over space (the right knee gets the control operation and the left knee gets the experimental operation). They can also be repeated over condition (each subject gets both the high and low cognitive load condition).  Longitudinal studies are pretty much always over time.

This becomes an issue mainly when you are choosing a covariance structure for the within-subject residuals (as determined by the Repeated statement in SAS’s Proc Mixed or SPSS Mixed).  An auto-regressive structure is often needed when some repeated measurements are closer to each other than others (over either time or space).  This is not an issue with purely clustered data, since there is no order to the observations within a cluster.

2. Time itself is often an important independent variable in longitudinal studies, but in repeated measures studies, it is usually confounded with some independent variable.

When you’re deciding on an analysis, it’s important to think about the role of time.  Time is not important in an experiment, where each measurement is a different condition (with order often randomized).  But it’s very important in a study designed to measure changes in a dependent variable over the course of 3 decades.

3. Time may be measured with some proxy like Age or Order.  But it’s still really about time.

4. A longitudinal study does not have to be over years.  You could be measuring reaction time every second for a minute.  In cases like this, dropout isn’t an issue, although time is an important predictor.

5. Consider whether it makes sense to think about time as continuous or categorical.  If you have only two time points, even if you have numerical measurements for them, there is no point in treating it as continuous.  A line through two points fits perfectly by construction, so you need at least three time points to evaluate a linear trend, and more time points give a clearer picture of its shape.

6. Longitudinal data can be analyzed with many statistical methods, including structural equation modeling and survival analysis.  You only use multilevel modeling if the dependent variable is measured repeatedly and if the point of the model is to see how it changes (or differs).
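To make the covariance structure point from observation 1 concrete, here is a minimal sketch of the two correlation patterns it contrasts. This is plain Python, not SAS or SPSS syntax, and the function names and the value of RHO are hypothetical. It assumes four equally spaced measurements. Compound symmetry gives every pair of measurements the same correlation, which suits unordered observations within a cluster. A first-order auto-regressive structure, AR(1), lets the correlation shrink as measurements get farther apart.

```python
def compound_symmetry(n_times, rho):
    """Every pair of distinct measurements shares the same correlation rho."""
    return [[1.0 if i == j else rho for j in range(n_times)]
            for i in range(n_times)]


def ar1(n_times, rho):
    """Correlation decays as rho ** lag, where lag is the gap in time steps."""
    return [[rho ** abs(i - j) for j in range(n_times)]
            for i in range(n_times)]


RHO = 0.5  # hypothetical correlation between adjacent measurements
cs = compound_symmetry(4, RHO)
ar = ar1(4, RHO)

for name, matrix in (("compound symmetry", cs), ("AR(1)", ar)):
    print(name)
    for row in matrix:
        print("  ".join(f"{value:.3f}" for value in row))
```

Under AR(1) the first and fourth measurements correlate at 0.5 cubed, or 0.125, while under compound symmetry they still correlate at 0.5. Which pattern is plausible depends on whether the ordering of the measurements carries meaning.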

Naming a data structure, design, or analysis is most helpful if it is so specific that it defines yours exactly. Your repeated measures analysis may not be like the repeated measures example you’re trying to follow. Rather than trying to name the analysis or the data structure, think about the issues involved in your design, your hypotheses, and your data. Work with them accordingly.



Three Designs that Look Like Repeated Measures, But Aren’t

June 19th, 2020 by

Repeated measures is one of those terms in statistics that sounds like it could apply to many design situations. In fact, it describes only one.

A repeated measures design is one where each subject is measured repeatedly over time, space, or condition on the dependent variable.

These repeated measurements on the same subject are not independent of each other. They’re clustered: they are more correlated with each other than they are with responses from other subjects, even subjects in the same condition.  (more…)


Member Training: Elements of Experimental Design

August 1st, 2019 by

Whether or not you run experiments, there are elements of experimental design that affect how you need to analyze many types of studies.

The most fundamental of these are replication, randomization, and blocking. These key design elements come up in studies under all sorts of names: trials, replicates, multi-level nesting, repeated measures. Any data set that requires mixed or multilevel models has some of these design elements. (more…)


What is the Purpose of a Generalized Linear Mixed Model?

September 10th, 2018 by

If you are new to using generalized linear mixed effects models, or if you have heard of them but never used them, you might be wondering about the purpose of a GLMM.

Mixed effects models are useful when we have data with more than one source of random variability. For example, an outcome may be measured more than once on the same person (repeated measures taken over time).

When we do that we have to account for both within-person and across-person variability. A single measure of residual variance can’t account for both.
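A small simulation can show the two sources of variability. This is only a sketch under stated assumptions: the outcome is continuous and normally distributed (the ordinary linear case, not a generalized one), each person has a random intercept, and the sample sizes and standard deviations are hypothetical values chosen for illustration. The variance components are recovered with simple moment calculations rather than by fitting a mixed model.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is reproducible

N_PEOPLE = 2000    # hypothetical number of people
N_MEASURES = 5     # hypothetical number of repeated measures per person
BETWEEN_SD = 2.0   # assumed spread of person-level averages
WITHIN_SD = 1.0    # assumed spread of measurements within a person

# Each measurement = the person's own random effect + occasion-level noise.
data = []
for _ in range(N_PEOPLE):
    person_effect = rng.gauss(0.0, BETWEEN_SD)
    data.append([person_effect + rng.gauss(0.0, WITHIN_SD)
                 for _ in range(N_MEASURES)])

# Within-person variance: average the variance of each person's own measures.
within_var = statistics.mean([statistics.variance(person) for person in data])

# Across-person variance: the variance of person means, minus the share of
# that variance that is only within-person noise averaged over the measures.
person_means = [statistics.mean(person) for person in data]
between_var = statistics.variance(person_means) - within_var / N_MEASURES

# A single pooled variance lumps both sources together.
total_var = statistics.variance([y for person in data for y in person])

print(f"within-person variance: {within_var:.2f}")
print(f"across-person variance: {between_var:.2f}")
print(f"single pooled variance: {total_var:.2f}")
```

The pooled variance comes out near the sum of the two components, which is why one residual variance cannot describe both. A mixed model estimates the components separately, typically by maximum likelihood rather than the moment calculations used here.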

(more…)