• Skip to primary navigation
  • Skip to main content
  • Skip to primary sidebar
The Analysis Factor

The Analysis Factor

Statistical Consulting, Resources, and Statistics Workshops for Researchers

  • Home
  • Our Programs
    • Membership
    • Online Workshops
    • Free Webinars
    • Consulting Services
  • About
    • Our Team
    • Our Core Values
    • Our Privacy Policy
    • Employment
    • Collaborate with Us
  • Statistical Resources
  • Contact
  • Blog
  • Login

Regression Diagnostics in Generalized Linear Mixed Models

by Kim Love 2 Comments

What are the best methods for checking a generalized linear mixed model (GLMM) for proper fit?

This question comes up frequently.

Unfortunately, it isn’t as straightforward as it is for a general linear model.

In linear models the requirements are easy to outline: linear in the parameters, normally distributed and independent residuals, and homogeneity of variance (that is, similar variance at all values of all predictors).

For linear models, there are well-described and well-implemented methods for checking each of these, both visual/descriptive methods and statistical tests.

It is not nearly as easy for GLMMs.

Assumption: Random effects come from a normal distribution

Let’s start with one of the more familiar elements of GLMMs, which is related to the random effects. There is an assumption that random effects—both intercepts and slopes—are normally distributed.

These are relatively easy to export to a data set in most statistical software (including SAS and R). Personally, I much prefer visual methods of checking for normal distributions, and typically go right to making histograms or normal probability plots (Q-Q plots) of each of the random effects.

If the histograms look roughly bell-shaped and symmetric, or the Q-Q plots generally fall close to a diagonal line, I usually consider this to be good enough.

If the random effects are not reasonably normally distributed, however, there are not simple remedies.  In a general linear model outcomes can be transformed. In GLMMs they cannot.

Research is currently being conducted on the consequences of mis-specifying the distribution of random effects in GLMMs. (Outliers, of course, can be handled the same way as in generalized linear models—except that an entire random subject, as opposed to a single observation, may be examined.)

Assumption: The chosen link function is appropriate

Additional assumptions of GLMMs are more related to the generalized linear model side. One of these is the relationship of the numeric predictors to the parameter of interest, which is determined by the link function.

For both generalized linear models and GLMMs, it is important to understand that the most typical link functions (e.g., the logit for binomial data, the log for Poisson data) are not guaranteed to be a good representation of the relationship of the predictors with the outcomes.

Checking this assumption can become quite complicated as models become more crowded with fixed and random effects.

One relatively simple (though not perfect) way to approach this is to compare the predicted values to the actual outcomes.

With most GLMMs, it is best to compare averages of outcomes to predicted values. For example, with binomial models, one could take all of the values with predicted values near 0.5, 0.15, 0.25, etc., and average the actual outcomes (the 0s and 1s).  You can then plot these average values against the predicted values.

If the general form of the model is correct, the differences between the predicted values and the averaged actual values will be small. (Of course how small depends on the number of observations and variance function).

No “patterns” in these differences should be obvious.

This is similar to the idea of the Hosmer-Lemeshow test for logistic regression models. If you suspect that the form of the link function is not correct, there are remedies. Possibilites include changing the link function, transforming numeric predictors, or (if necessary) categorizing continuous predictors.

Assumption: Appropriate estimation of variance

Finally, it is important to check the variability of the outcomes. This is also not as easy as it is for linear models, since the variance is not constant and is a function of the parameter being estimated.

Fortunately, this is one of the easier assumptions to check. One of the fit statistics your statistical software produces is a generalized chi-square that compares the magnitude of the model residuals to the theoretical variance.

The chi-square divided by its degrees of freedom should be approximately 1. If this statistic is too large, then the variance is “overdispersed” (larger than it should be). Alternatively, if the statistic is too small, the variance is “underdispersed.”

While the best way to approach this varies by distribution, there are options to adjust models for overdispersion that result in more conservative p-values.

What is GLMM and When Should You Use It?
When you have multilevel or repeated data and normality just isn't happening, you may need GLMM. Get started learning Generalized Linear Mixed Models and when and how to apply them to your data.

Tagged With: generalized linear mixed model

Related Posts

  • R-Squared for Mixed Effects Models
  • Understanding Random Effects in Mixed Models
  • What is the Purpose of a Generalized Linear Mixed Model?
  • Models for Repeated Measures Continuous, Categorical, and Count Data

Reader Interactions

Comments

  1. Hedye says

    April 22, 2021 at 2:24 pm

    How would one “… and average the actual outcomes (the 0s and 1s).” in GLMM?

    Reply
  2. Katherine Cliff says

    November 10, 2020 at 6:41 am

    I have been reading about the assumptions for GLMM. One of these is that the random effects should be normally distributed.
    How do you test for this?
    Is this similar to the final assumption of appropriate estimation of variance?

    Reply

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Please note that, due to the large number of comments submitted, any questions on problems related to a personal study/project will not be answered. We suggest joining Statistically Speaking, where you have access to a private forum and more resources 24/7.

Primary Sidebar

This Month’s Statistically Speaking Live Training

  • Member Training: Introduction to SPSS Software Tutorial

Upcoming Free Webinars

Poisson and Negative Binomial Regression Models for Count Data

Upcoming Workshops

  • Analyzing Count Data: Poisson, Negative Binomial, and Other Essential Models (Jul 2022)
  • Introduction to Generalized Linear Mixed Models (Jul 2022)

Copyright © 2008–2022 The Analysis Factor, LLC. All rights reserved.
877-272-8096   Contact Us

The Analysis Factor uses cookies to ensure that we give you the best experience of our website. If you continue we assume that you consent to receive cookies on all websites from The Analysis Factor.
Continue Privacy Policy
Privacy & Cookies Policy

Privacy Overview

This website uses cookies to improve your experience while you navigate through the website. Out of these, the cookies that are categorized as necessary are stored on your browser as they are essential for the working of basic functionalities of the website. We also use third-party cookies that help us analyze and understand how you use this website. These cookies will be stored in your browser only with your consent. You also have the option to opt-out of these cookies. But opting out of some of these cookies may affect your browsing experience.
Necessary
Always Enabled
Necessary cookies are absolutely essential for the website to function properly. This category only includes cookies that ensures basic functionalities and security features of the website. These cookies do not store any personal information.
Non-necessary
Any cookies that may not be particularly necessary for the website to function and is used specifically to collect user personal data via analytics, ads, other embedded contents are termed as non-necessary cookies. It is mandatory to procure user consent prior to running these cookies on your website.
SAVE & ACCEPT