Understanding Random Effects in Mixed Models

In fixed-effects models (e.g., regression, ANOVA, generalized linear models), there is only one source of random variability. This source of variance is the random sample we take to measure our variables.

It may be patients in a health facility, for whom we take various measures of their medical history to estimate their probability of recovery. Or random variability may come from individual students in a school system, and we use demographic information to predict their grade point averages.

We call the variability across individuals the “residual” variance (in linear models, its estimate of σ² is also called the mean squared error). It is the variability left unexplained by the predictors in the model (the fixed effects).
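To make the residual variance concrete, here is a minimal sketch for the simplest case: a linear regression with a single predictor. The function name and the toy data are illustrative assumptions, not part of any particular package.

```python
def residual_variance(x, y):
    """Mean squared error of a one-predictor least-squares fit.

    Illustrative helper (hypothetical name): fits y = b0 + b1 * x and
    returns the residual sum of squares divided by n - 2.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Two coefficients were estimated, so the residual degrees of freedom are n - 2.
    return sum(r ** 2 for r in residuals) / (n - 2)


# Made-up data: hours studied per week and GPA for five students.
hours = [2, 4, 6, 8, 10]
gpa = [2.4, 2.9, 3.0, 3.4, 3.6]
print(round(residual_variance(hours, gpa), 4))
```

Everything the single predictor does not explain ends up in this one number. That is the sense in which a fixed-effects model has only one source of random variability.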

Multiple Sources of Random Variability

Mixed effects models—whether linear or generalized linear—are different in that there is more than one source of random variability in the data.

In addition to patients, there may also be random variability across the doctors of those patients. In addition to students, there may be random variability from the teachers of those students.

Some doctors’ patients may have a greater probability of recovery, and others may have a lower probability, even after we have accounted for the doctors’ experience and other measurable traits. Some teachers’ students will have higher GPAs than other teachers’ students, even after we account for teaching methods.

Random Effects: Intercepts and Slopes

We account for these differences by including random effects in the model. Random intercepts allow the outcome to be higher or lower overall for each doctor or teacher; random slopes allow the effect of a predictor (a fixed effect) to differ from one doctor or teacher to the next.
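One way to see what random intercepts and random slopes do is to write down how data would be generated under such a model. The sketch below is a simulation under assumed values, not a fitting routine. The function name, the predictor (weekly study hours), and all parameter values are hypothetical, and simulated GPAs are not constrained to a realistic range.

```python
import random


def simulate_students(n_teachers, n_students, beta0, beta1,
                      sd_intercept, sd_slope, sd_residual, seed=0):
    """Simulate (teacher, hours, gpa) rows from a random intercept and slope model."""
    rng = random.Random(seed)
    rows = []
    for teacher in range(n_teachers):
        # One draw per teacher: these are the random effects.
        u0 = rng.gauss(0.0, sd_intercept)  # shifts this teacher's intercept
        u1 = rng.gauss(0.0, sd_slope)      # shifts this teacher's slope
        for _ in range(n_students):
            hours = rng.uniform(0.0, 10.0)
            # One draw per student: the residual variability.
            e = rng.gauss(0.0, sd_residual)
            gpa = (beta0 + u0) + (beta1 + u1) * hours + e
            rows.append((teacher, hours, gpa))
    return rows


rows = simulate_students(n_teachers=3, n_students=4, beta0=2.5, beta1=0.1,
                         sd_intercept=0.3, sd_slope=0.02, sd_residual=0.2)
for teacher, hours, gpa in rows[:4]:
    print(teacher, round(hours, 2), round(gpa, 2))
```

Setting both `sd_intercept` and `sd_slope` to zero collapses this to an ordinary regression in which every teacher shares the same line, which is the fixed-effects model described earlier.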

What do these random effects mean? How do we interpret them? We usually talk about them in terms of their variability, instead of focusing on them individually.

Using the patient/doctor data as an example, this allows us to make “broad level” inferences about the larger population of patients, which do not depend on a particular doctor. In other words, we can now incorporate (instead of ignore) doctor-to-doctor variability in patient recovery, and improve our ability to describe how fixed effects relate to outcomes.

Variance of Random Effects

We can also talk directly about the variability of random effects, similar to how we talk about residual variance in linear models.

There is no general rule for whether a given amount of variability is large or small, but subject-matter experts can judge the standard deviations of the random effects relative to the scale of the outcome.

For example, if teacher-averaged GPAs vary around the overall average with an SD of only 0.02 GPA points, the teachers may be considered rather uniform; however, if teacher-averaged GPAs vary around the overall average with an SD of 0.5 GPA points, it would seem that individual teachers can make a large difference in their students’ success.
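A rough way to put these two SDs on the outcome scale is to ask where most teachers would fall. The sketch below assumes the teacher effects are normally distributed (the usual mixed-model assumption), so about 95% of teachers lie within 1.96 SDs of the overall average. The overall average of 3.0 is a made-up value, and the function name is hypothetical.

```python
def typical_range(overall_mean, sd, z=1.96):
    """Approximate central 95% range of group averages under a normal assumption."""
    return overall_mean - z * sd, overall_mean + z * sd


for sd in (0.02, 0.5):
    low, high = typical_range(3.0, sd)
    print(f"SD = {sd}: about 95% of teacher averages between {low:.2f} and {high:.2f}")
```

With an SD of 0.02 the range is only a few hundredths of a GPA point wide, while with an SD of 0.5 it spans roughly two full GPA points.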

(For an additional way to look at variability in linear mixed effects models, see Karen’s blog post on the intraclass correlation coefficient, or ICC.)
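For reference, in a linear mixed model with only a random intercept, the ICC is the share of the total unexplained variance that lies between groups. The symbol names below are notational assumptions: \sigma_u^2 is the variance of the random intercepts (for example, between teachers) and \sigma^2 is the residual variance.

```latex
\mathrm{ICC} = \frac{\sigma_u^2}{\sigma_u^2 + \sigma^2}
```

An ICC near 0 means the groups are nearly interchangeable; an ICC near 1 means most of the unexplained variability is between groups rather than between individuals within a group.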

Individual Random Effects

Finally, we can talk about individual random effects, although we usually don’t. This was not the original purpose of mixed effects models, although it has turned out to be useful in certain applications. Software programs do provide access to the predicted random effects (best linear unbiased predictors, or BLUPs) associated with each random subject, such as each doctor or teacher.

BLUPs are the differences between the intercept for each random subject and the overall intercept (or slope for each random subject and the overall slope). In some software, such as SAS, these are accompanied by standard errors, t-tests, and p-values.
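To give a feel for how a BLUP is computed, here is a minimal sketch for the simplest case only: a linear model with a random intercept and no predictors, where the overall mean and both variance components are treated as known. Real software estimates all of these from the data, and the GLMM case (such as recovery probabilities) has no closed form this simple. The function name and numbers are hypothetical.

```python
def blup_random_intercept(group_values, overall_mean, var_between, var_residual):
    """Predicted random intercept for one group in a random-intercept-only model.

    Assumes overall_mean, var_between, and var_residual are known values;
    in practice they are estimated when the model is fitted.
    """
    n = len(group_values)
    group_mean = sum(group_values) / n
    # Shrinkage factor between 0 and 1: closer to 1 when the group is large
    # or when groups truly differ a lot relative to the residual noise.
    shrinkage = var_between / (var_between + var_residual / n)
    return shrinkage * (group_mean - overall_mean)


# Made-up example: one teacher with four students who all have a GPA of 3.5,
# an overall average of 3.0, between-teacher variance 0.25, residual variance 1.0.
print(blup_random_intercept([3.5, 3.5, 3.5, 3.5], 3.0, 0.25, 1.0))
```

The raw difference between this teacher's average and the overall average is 0.5, but the predicted random intercept is pulled toward zero because four students carry limited information. This shrinkage is why BLUPs for subjects with little data stay close to the overall intercept.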

In the case of the patient/doctor data set (assuming no random slopes for easier interpretation), a small p-value for an individual doctor’s random intercept would indicate that the doctor’s typical patient recovery probability is significantly different from an average doctor’s typical patient recovery probability.

These standard errors and p-values are adjusted so that they account for all of the fixed effects in the model as well as the random variability among patients. Clearly, this information could be of interest to the doctor’s place of work, or to a patient who is choosing a doctor.


Reader Interactions


  1. Ramesha says

    Excellent explanation. I have a question, I would like to know about what message that plot SD and residual SD line indicates in a caterpillar plot used to explain the mixed effect model. Here plot is a random effect and tree height, soil variables and other are fixed effects.
