Whether or not you run experiments, there are elements of experimental design that affect how you need to analyze many types of studies.
The most fundamental of these are replication, randomization, and blocking. These key design elements come up in studies under all sorts of names: trials, replicates, multi-level nesting, repeated measures. Any data set that requires mixed or multilevel models has some of these design elements.
We often talk about nested factors in mixed models — students nested in classes, observations nested within subject.
But in all but the simplest designs, it’s not that straightforward.
One of those tricky, but necessary, concepts in statistics is the difference between crossed and nested factors.
As a reminder, a factor is any categorical independent variable. In experiments, or any randomized designs, these factors are often manipulated. Experimental manipulations (like Treatment vs. Control) are factors.
Observational categorical predictors, such as gender, time point, poverty status, etc., are also factors. Whether the factor is observational or manipulated won’t affect the analysis, but it will affect the conclusions you draw from the results.
When there is only one factor in a design, you don’t have to worry about crossing and nesting. But once you have at least two factors, you need to understand whether they are nested or crossed. It’s an important design feature that affects the analyses you can and should conduct.
Crossing and Nesting
Two factors are crossed when every category of one factor co-occurs in the design with every category of the other factor. In other words, there is at least one observation in every combination of categories for the two factors.
A factor is nested within another factor when each category of the first factor co-occurs with only one category of the other. In other words, an observation has to be within one category of Factor 2 in order to have a specific category of Factor 1. Not all combinations of categories are represented.
If two factors are crossed, you can estimate an interaction between them. If they are nested, you cannot, because you do not have every category of one factor paired with every category of the other.
When you’re not sure whether two factors in your design are crossed or nested, the easiest way to tell is to run a cross tabulation of those factors.
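As a rough illustration of that check, here is a minimal Python sketch using only the standard library. The function names (`cross_tab`, `is_crossed`, `is_nested`) and the toy data are hypothetical and not part of the study described below; in practice you would more likely use your statistical package's cross tabulation command.

```python
from collections import Counter

def cross_tab(a, b):
    """Count observations in each (a, b) combination of two parallel lists."""
    return Counter(zip(a, b))

def is_crossed(a, b):
    """True if every category of a co-occurs with every category of b."""
    cells = cross_tab(a, b)
    return all((x, y) in cells for x in set(a) for y in set(b))

def is_nested(inner, outer):
    """True if each category of inner appears in only one category of outer."""
    seen = {}
    for i, o in zip(inner, outer):
        # setdefault stores the first outer category seen for this inner one
        if seen.setdefault(i, o) != o:
            return False
    return True

# Hypothetical toy data, one entry per observation
treatment = ["Drug", "Drug", "Placebo", "Placebo"]
sex = ["F", "M", "F", "M"]
school = ["A", "A", "B", "B"]
classroom = ["c1", "c2", "c3", "c4"]

print(is_crossed(treatment, sex))     # True: all four combinations occur
print(is_nested(classroom, school))   # True: each classroom is in one school
print(is_crossed(classroom, school))  # False: e.g. c1 never occurs in school B
```

An empty cell in the cross tabulation is what separates the two cases: crossed factors have none, while nested factors leave every cell empty except one per category of the nested factor.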
Here is an example. In this study, 27 men in their early 20s were randomized into one of three physical training groups: endurance, strength, or concurrent training. The subjects in every group were measured on a number of physical health measures at two time points: pre and post.
Group and Time are Crossed
The two factors of interest, Training group and Time, are crossed, as there are 9 observations from each training group at each time point. In other words, each Training group is represented at every Time point. A cross tabulation of Training group by Time shows this.
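The Group by Time cross tabulation can be reconstructed from the design alone. The sketch below builds only the layout of the study (no outcome data), with one row per measurement. The variable names are illustrative, and the assignment of subjects 19-27 to the concurrent group is an assumption that follows the pattern of consecutive blocks of nine.

```python
from collections import Counter

groups = ["Endurance", "Strength", "Concurrent"]
times = ["Pre", "Post"]

# One row per measurement: (subject, group, time).
# Subjects are assumed to be numbered 1-27 in blocks of 9 per group.
rows = [(s, groups[(s - 1) // 9], t) for s in range(1, 28) for t in times]

# Cross tabulate Group by Time
counts = Counter((g, t) for _, g, t in rows)

print(f"{'':12}" + "".join(f"{t:>6}" for t in times))
for g in groups:
    print(f"{g:12}" + "".join(f"{counts[(g, t)]:>6}" for t in times))
```

Every one of the six cells holds 9 observations, so no combination of Group and Time is empty.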
Subjects and Time are Crossed
However, there is a third factor that needs to be taken into account because it’s a repeated measures study: Subject.
If we had different people at each time point as well as in each group, subject would not be an important factor and we could stop there. Group and time would still be crossed. We would miss out on some of the efficiency advantages that we get from repeated measures, though, so let’s keep going.
In repeated measures, subject itself becomes a factor. Subject is crossed with time because each subject appears at every time point. Again, this is easy to see in the cross tabulation. Every subject has at least one value at every time point.
Subjects are Nested within Group
But you can see in the Subject*Groups cross tabulation that each subject has observations in only one group. Subjects 1-9 are in the Endurance training group only. Subjects 10-18 are in the Strength training group, etc. Because each subject was assigned to only one training group, subject and group are not crossed. Rather, subject is nested within training group.
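Both subject-level checks can be sketched in a few lines of Python. As before, this uses only the study layout, assumes subjects are numbered in consecutive blocks of nine (with 19-27 in the concurrent group), and uses illustrative variable names.

```python
from collections import defaultdict

groups = ["Endurance", "Strength", "Concurrent"]
times = ["Pre", "Post"]

# One row per measurement: (subject, group, time); layout only, no outcomes
rows = [(s, groups[(s - 1) // 9], t) for s in range(1, 28) for t in times]

# Collect the non-empty cells of each subject's row in the two cross tabulations
groups_per_subject = defaultdict(set)
times_per_subject = defaultdict(set)
for subject, group, time in rows:
    groups_per_subject[subject].add(group)
    times_per_subject[subject].add(time)

# Nested: every subject appears in exactly one group
nested_in_group = all(len(g) == 1 for g in groups_per_subject.values())
# Crossed: every subject appears at every time point
crossed_with_time = all(t == set(times) for t in times_per_subject.values())

print("Subject nested within Group:", nested_in_group)   # True
print("Subject crossed with Time:", crossed_with_time)   # True
```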
In traditional multivariate approaches to analyzing repeated measures data, we ignore issues of nesting and crossing and use different names for these same concepts. Factors that Subject is nested within, like Training group, are called between-subjects factors. Factors that Subject is crossed with, like Time, are called within-subjects factors.
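That translation between the two vocabularies can be written down directly. The helper below, `classify_factor`, is a hypothetical name for a minimal sketch: it labels a factor by looking at how many of its categories each subject appears in. The study layout again assumes consecutive blocks of nine subjects per group.

```python
def classify_factor(subjects, factor):
    """Label a factor relative to Subject, given parallel lists of values."""
    levels = set(factor)
    seen = {}
    for s, f in zip(subjects, factor):
        seen.setdefault(s, set()).add(f)
    if all(len(v) == 1 for v in seen.values()):
        return "between-subjects"   # Subject is nested within the factor
    if all(v == levels for v in seen.values()):
        return "within-subjects"    # Subject is crossed with the factor
    return "neither (partially crossed)"

groups = ["Endurance", "Strength", "Concurrent"]
times = ["Pre", "Post"]
rows = [(s, groups[(s - 1) // 9], t) for s in range(1, 28) for t in times]

subject_col = [r[0] for r in rows]
group_col = [r[1] for r in rows]
time_col = [r[2] for r in rows]

print(classify_factor(subject_col, group_col))  # between-subjects
print(classify_factor(subject_col, time_col))   # within-subjects
```

The third return value covers designs where some subjects miss some categories, which is one of the situations where thinking in terms of crossing and nesting shows more than the between/within labels do.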
Those concepts are helpful and valid. But you can see bigger design and analysis issues if you translate them into crossed and nested factors. This becomes very, very important when you expand your analysis of repeated measures beyond traditional approaches to mixed models approaches.
It also becomes extremely important in clustered designs, which don’t necessarily have repeated measures, but do have some sort of nesting of individuals within some larger group.
The combinations of nesting and crossing in designs with many factors can get quite complex. It gets even more confusing when you also have to decide whether to make factors fixed or random. Remember to use the cross tabulations to help you sort it out.