Concepts You Need to Understand to Run a Mixed or Multilevel Model

Have you ever been told you need to run a mixed (aka: multilevel) model and been thrown off by all the new vocabulary?

It happened to me when I first started my statistical consulting job, oh so many years ago. I had learned mixed models in an ANOVA class, so I had a pretty good grasp on many of the concepts.

But when I started my job, SAS had just recently come out with Proc Mixed, and it was the first time I had to actually implement a true multilevel model.  I was out of school, so I had to figure it out on the job.

And even with my background, I had a pretty steep learning curve to get to a point where it made sense.  Sure, I was able to figure out the steps, but there are some pretty tricky situations and complicated designs out there.

To implement it well, you need a good understanding of the big picture, and how the small parts fit into it.  That’s what took me a while.

Luckily in my job, I was able to see many, many different designs, and that helped me figure out which issues mattered in which contexts.  (I also had two really great mentors).

I realize you, as a researcher, don’t have that vantage point, which is why I develop workshops–to give you the big picture intuitive understanding, and to show you how the big picture relates to specific steps and decisions you need to make in specific situations.

To get started, it really helps to understand certain statistical concepts, some of which are specific to mixed models and some of which are more general.

Some concepts related to regression and ANOVA:

You may recognize some of these as the topics of some of my newsletter articles, workshops, and free webinars.  I focus on these because they’re the topics I see researchers struggle with as they try to learn harder models, including mixed models.

These are topics that you really want to understand before you ever attempt a mixed model.  Because they’re so universal, I assume participants in my repeated measures workshop already understand them (though we review quite a few of them).

By the way, not all of them currently have links, so I’ll fill these in as I write more articles.

Interpreting intercepts and regression coefficients


Dummy Coding

How ANOVA and regression are the same model

Assessing model fit


Correlation and Covariance

Model building

Crossed and Nested Factors
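
To make a few of these concrete, here is a minimal sketch of how dummy coding ties ANOVA and regression together. The data values and the group names (control, drug_a, drug_b) are made up purely for illustration, and the small solver is a stand-in for whatever your statistical software does internally. The point it shows: with dummy coding, the regression intercept is the reference group's mean, and each coefficient is that group's mean minus the reference mean.

```python
# Toy data: an outcome measured in three groups (hypothetical values).
data = {
    "control": [4.0, 5.0, 6.0],
    "drug_a": [7.0, 8.0, 9.0],
    "drug_b": [1.0, 2.0, 3.0],
}
reference = "control"
others = [g for g in data if g != reference]

# Design matrix: an intercept column plus one 0/1 dummy per non-reference group.
X, y = [], []
for group, values in data.items():
    for v in values:
        X.append([1.0] + [1.0 if group == g else 0.0 for g in others])
        y.append(v)


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


# Ordinary least squares via the normal equations: (X'X) beta = X'y.
p = len(X[0])
XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
beta = solve(XtX, Xty)

# The ANOVA view: plain group means.
means = {g: sum(v) / len(v) for g, v in data.items()}

print("intercept:", round(beta[0], 6), "| reference mean:", means[reference])
for g, b in zip(others, beta[1:]):
    diff = means[g] - means[reference]
    print(g, "coefficient:", round(b, 6), "| mean difference:", diff)
```

Changing the reference group changes what the intercept and coefficients mean, but not the fitted group means, which is why interpreting coefficients depends so heavily on the coding scheme.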

Some concepts that are inherent to mixed models:

The following are the concepts that aren’t relevant to all regression models, but are extremely important in mixed models. You may have come across some of them in other areas of statistics as well–they’re not all unique to mixed models.

These are the topics we go over in great detail in the Repeated Measures workshop, particularly how they relate to repeated measures and longitudinal designs.  I’ve linked to some free resources we have on these topics to get you started.

Intra Class Correlation

Maximum Likelihood Estimation, Deviance, and -2 Log Likelihood

Fixed and Random Factors

Covariance Structures, as well as the meaning of specific structures, including Compound Symmetry, Autoregressive, Toeplitz, Unstructured, and others

Information Criteria, like AIC, BIC, AICc

Marginal and Mixed Models

Random Intercepts and Slopes

G matrix and R matrix (and just to make it harder, these are sometimes called D and Sigma, respectively)

Missing at Random
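
Several of these concepts are connected, and a small numeric sketch can show how. The example below assumes the simplest mixed model, a random intercept model, with made-up variance components (the values of tau2, sigma2, and the cluster size n are hypothetical). It computes the intraclass correlation as the between-cluster share of total variance, then builds the implied covariance matrix for one cluster as V = Z G Z' + R. Under these assumptions V has the compound symmetry pattern, and its off-diagonal correlation equals the intraclass correlation.

```python
# Hypothetical variance components for a random intercept model.
tau2 = 4.0     # between-cluster (random intercept) variance: the single entry of G
sigma2 = 12.0  # within-cluster residual variance: the diagonal of R
n = 3          # number of observations in one cluster

# Intraclass correlation: share of total variance that lies between clusters.
icc = tau2 / (tau2 + sigma2)

# Z is a column of ones because every observation in the cluster shares one intercept.
Z = [[1.0] for _ in range(n)]
G = [[tau2]]
R = [[sigma2 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Implied covariance among the cluster's observations: V = Z G Z' + R.
V = [[Z[i][0] * G[0][0] * Z[j][0] + R[i][j] for j in range(n)] for i in range(n)]

# Convert covariances to correlations.
corr = [[V[i][j] / (V[i][i] * V[j][j]) ** 0.5 for j in range(n)] for i in range(n)]

print("ICC:", icc)
for row in V:
    print(row)
print("off-diagonal correlation:", corr[0][1])
```

Every diagonal entry of V is tau2 + sigma2 and every off-diagonal entry is tau2, which is exactly what compound symmetry means. More flexible structures such as Autoregressive or Unstructured relax that equal-covariance pattern.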



Reader Interactions


  1. Cynthia Park says

    Hi Karen,
    You were extraordinarily helpful with my dissertation and I am forever grateful. I still find myself in need of a well-written paragraph or two regarding why there are so many degrees of freedom used in determining effect. I understand why…due to the regression nature of mixed models. But I keep getting this from my advisor and other well-respected clingers to ANOVA: “You will always find a significant effect with so many degrees of freedom.” Any help would be more than welcome. Miss our times together!

    • Trey says

      Yes, you are much more likely to find significant effects in any traditional statistical approach when you have a large sample (which leads to higher degrees of freedom). That does not, however, mean it is a problem. In fact, it is quite the opposite. A large sample gives you much tighter confidence intervals, which are therefore less likely to include zero, so your effects are more likely to be significant. Where their argument would have validity is if the effect is significant statistically but not substantively. Too often researchers “star gaze” and view something as important because it is significant when the effect is of little consequence.
      One example: I was running a logistic regression on a sample of 5 million observations predicting school discipline. The coefficient for per-capita income was significant (p<.001). However, the coefficient was essentially zero, such that per-capita income would have to change by thousands of dollars to have even a trivial impact on the probability of being disciplined. Yes, the value was significant because of the large sample, but that did not make it incorrect; that is the most likely relationship that exists given the data. However, I would be incorrect to assert that per-capita income was an important contributor to discipline because the relationship was close to zero.
      Long story short, don't apologize for large samples, but consider substantive significance.

  2. christine meng says

    Hi Karen, I am currently using linear mixed effects models in SPSS to analyze data that are hierarchical in nature, specifically students nested in classrooms. My understanding is that linear mixed effects models can be used to analyze multilevel data. While I understand the steps used to run linear mixed effects models in SPSS, I am having difficulty understanding how I can account for the nested structure (students nested within classrooms) using linear mixed effects models. Do I simply aggregate variables measured at level-1 into level-2 by centering the level-1 variables around the mean of all cases within the same classroom? Thank you, Christine
