Have you ever been told you need to run a mixed (aka: multilevel) model and been thrown off by all the new vocabulary?

It happened to me when I first started my statistical consulting job, oh so many years ago. I had learned mixed models in an ANOVA class, so I had a pretty good grasp on many of the concepts.

But when I started my job, SAS had just recently come out with Proc Mixed, and it was the first time I had to actually implement a true multilevel model. I was out of school, so I had to figure it out on the job.

And even with my background, I had a pretty steep learning curve to get to a point where it made sense. Sure, I was able to figure out the steps, but there are some pretty tricky situations and complicated designs out there.

To implement it well, you need a good understanding of the big picture, and how the small parts fit into it. That’s what took me a while.

Luckily, in my job I was able to see many, many different designs, and that helped me figure out which issues mattered in which contexts. (I also had two really great mentors.)

I realize you, as a researcher, don’t have that vantage point, which is why I develop workshops: to give you an intuitive understanding of the big picture, and to show you how it relates to the specific steps and decisions you need to make in specific situations.

To get started, it really helps to at least understand certain statistical concepts, some of which are specific to mixed models and some of which are more general.

### Some concepts related to regression and ANOVA:

You may recognize some of these as the topics of some of my newsletter articles, workshops, and free webinars. I focus on these because they’re the topics I see researchers struggle with as they try to learn harder models, including mixed models.

These are topics that you really want to understand before you ever attempt a mixed model. Because they’re so universal, I assume participants in my repeated measures workshop already understand them (though we review quite a few of them).

By the way, not all of them currently have links, so I’ll fill these in as I write more articles.

Interpreting intercepts and regression coefficients

How ANOVA and regression are the same model

### Some concepts that are inherent to mixed models:

The following are the concepts that aren’t relevant to all regression models, but are extremely important in mixed models. You may have come across some of them in other areas of statistics as well–they’re not all unique to mixed models.

These are the topics we go over in great detail in the Repeated Measures workshop, particularly how they relate to repeated measures and longitudinal designs. I’ve linked to some free resources we have on these topics to get you started.

Maximum Likelihood Estimation, Deviance, and -2 Log Likelihood

Covariance Structures, as well as the meaning of specific structures, including Compound Symmetry, Autoregressive, Toeplitz, Unstructured, and others

Information Criteria, like AIC, BIC, AICC

G matrix and R matrix (and just to make it harder, these are sometimes called D and Sigma, respectively)
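To make a couple of these concrete, here is a minimal Python sketch (not taken from any particular package) of two of the covariance structures and the information criteria listed above. The function names and the example numbers are made up for illustration, and the information criteria are written in the common smaller-is-better form based on -2 Log Likelihood. Software packages differ in which sample size they plug into BIC and AICC, so treat `n_obs` as an assumption to check against your software's documentation.

```python
import math


def compound_symmetry(n_times, variance, covariance):
    """Compound Symmetry: one variance on the diagonal and one shared
    covariance for every pair of time points."""
    return [
        [variance if i == j else covariance for j in range(n_times)]
        for i in range(n_times)
    ]


def ar1(n_times, variance, rho):
    """First-order Autoregressive: the correlation between two time
    points shrinks as rho raised to the number of steps between them."""
    return [
        [variance * rho ** abs(i - j) for j in range(n_times)]
        for i in range(n_times)
    ]


def information_criteria(neg2_loglik, n_params, n_obs):
    """AIC, BIC, and AICC in smaller-is-better form.

    neg2_loglik: -2 Log Likelihood of the fitted model
    n_params:    number of estimated parameters (illustrative assumption)
    n_obs:       sample size used in the penalty (software-dependent)
    """
    aic = neg2_loglik + 2 * n_params
    bic = neg2_loglik + n_params * math.log(n_obs)
    # AICC adds a small-sample correction to AIC.
    aicc = aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    return {"AIC": aic, "BIC": bic, "AICC": aicc}


if __name__ == "__main__":
    # Hypothetical values: 4 time points, variance 4, covariance 1.
    for row in compound_symmetry(4, 4.0, 1.0):
        print(row)
    # Hypothetical values: 4 time points, variance 1, rho 0.5.
    for row in ar1(4, 1.0, 0.5):
        print(row)
    # Hypothetical fit: -2LL of 100 with 3 parameters and 50 observations.
    print(information_criteria(100.0, 3, 50))
```

Notice that Compound Symmetry needs only two parameters no matter how many time points there are, and so does this version of Autoregressive, while an Unstructured matrix would need a separate parameter for every variance and covariance. That difference in parameter count is exactly what the information criteria penalize when you compare structures.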

### Comments

Hi Karen,

You were extraordinarily helpful with my dissertation and I am forever grateful. I still find myself in need of a well-written paragraph or two regarding why there are so many degrees of freedom used in determining effect. I understand why…due to the regression nature of mixed models. But I keep getting this from my advisor and other well-respected clingers to ANOVA: “You will always find a significant effect with so many degrees of freedom.” Any help would be more than welcome. Miss our times together!

Cynthia

Cynthia:

Yes, you are much more likely to find significant effects in any traditional statistical approach when you have a large sample (which leads to higher degrees of freedom). That does not, however, mean it is a problem. In fact, it is quite the opposite. A large sample gives you much tighter confidence intervals, which are therefore less likely to contain zero, so effects are more likely to be significant. Where their argument would have validity is if the effect is significant statistically but not substantively. Too often researchers “star gaze” and view something as important because it is significant when the effect is of little consequence.

One example: I was running a logistic regression on a sample of 5 million observations predicting school discipline. The coefficient for per-capita income was significant (p<.001). However, the coefficient was essentially zero, such that per-capita income would have to change by thousands of dollars to have even a trivial impact on the probability of being disciplined. Yes, the value was significant because of the large sample, but that did not make it incorrect; it is the most likely relationship given the data. However, I would be incorrect to assert that per-capita income was an important contributor to discipline, because the relationship was close to zero.

Long story short, don't apologize for large samples, but consider substantive significance.

Hi Karen, I am currently using linear mixed effects models in SPSS to analyze data that are hierarchical in nature, specifically students nested in classrooms. My understanding is that linear mixed effects models can be used to analyze multilevel data. While I understand the steps that are used to run linear mixed effects models in SPSS, I am having difficulty understanding how I can account for the nested structure (students nested within classrooms) using these models. Do I simply aggregate variables measured at level-1 into level-2 by centering the level-1 variables around the mean of all cases within the same classroom? Thank you, Christine