Pretty much all of the common statistical models we use, with the exception of OLS Linear Models, rely on Maximum Likelihood estimation.

This includes favorites like:

- All Generalized Linear Models, including logistic, probit, beta, Poisson, negative binomial regression
- Linear Mixed Models
- Generalized Linear Mixed Models
- Parametric Survival Analysis models, like Weibull models
- Structural Equation Models

That’s a lot of models.

If you’ve ever learned any of these, you’ve heard that some of the statistics that compare model fit across competing models (specifically, the likelihood ratio test, which is based on model deviance) require that the models be nested. This is particularly important while you’re doing model building. You need to know which model fits better.
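The likelihood ratio test itself is simple to compute once you have the two models' log-likelihoods. Here's a minimal sketch (the function name and inputs are my own, not from any particular package): twice the difference in log-likelihoods is compared to a chi-square distribution whose degrees of freedom equal the difference in parameter counts.

```python
from scipy.stats import chi2

def likelihood_ratio_test(llf_reduced, llf_full, df_diff):
    """LRT for two models fit to the same data, where the reduced
    model is nested in the full model. df_diff is the difference
    in the number of estimated parameters."""
    lr_stat = 2 * (llf_full - llf_reduced)  # difference in deviance
    p_value = chi2.sf(lr_stat, df_diff)     # upper-tail chi-square probability
    return lr_stat, p_value
```

For example, if the reduced model has a log-likelihood of -110 and the full model -105, with 2 extra parameters, the test statistic is 10 on 2 degrees of freedom. The test is only valid when the models are nested, which is exactly why the distinction in this article matters.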

This can get really confusing because we often talk about variables being nested. For example, you may hear:

- Plot of trees is nested within Treatment
- Students are nested within Teacher
- Transects are nested within Location
- Word is nested within List

While this *concept* of nesting is the same as the one we’re applying to models, it’s a different *application* of the concept.

### Nested Models

So what are we talking about when we talk about nested models?

Model A is nested in Model B if the parameters in Model A are a subset of the parameters in Model B.

That’s it.

Let’s look at an example.

We are predicting the Height of a shrub from the bacteria in the soil, which is measured continuously, and from the dummy-coded variable Sun, which has a value of 1 for a location in full sun and a value of 0 for a location in partial sun.

**Model A**: Height_{i} = β_{0} + β_{1}*Bacteria + β_{2}*Sun + β_{3}*Bacteria*Sun + ε_{i}

This model has five parameters: β_{0}, β_{1}, β_{2}, β_{3} are all obvious, but there is one more, usually stated as an afterthought: σ^{2}

σ^{2} is the variance of the errors, ε_{i}. Sometimes you’ll see this written after the model, to make sure that this parameter and model assumptions are directly stated:

ε_{i} ~ i.i.d. N(0, σ^{2})

In your output you’ll see estimates of all five parameters. The four regression coefficients will usually be in one table, and the residual variance is often in another. But they’re all there.
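You can see this directly by fitting Model A in software. Here's a sketch using statsmodels with simulated data (the data values and coefficient choices are made up purely for illustration): the fitted result exposes the four regression coefficients in one attribute and the residual variance estimate in another.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data, purely illustrative
rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame({
    "Bacteria": rng.normal(50, 10, n),   # continuous soil bacteria measure
    "Sun": rng.integers(0, 2, n),        # 1 = full sun, 0 = partial sun
})
df["Height"] = (10 + 0.2 * df["Bacteria"] + 5 * df["Sun"]
                + 0.1 * df["Bacteria"] * df["Sun"]
                + rng.normal(0, 2, n))

# Model A: Bacteria * Sun expands to Bacteria + Sun + Bacteria:Sun
model_a = smf.ols("Height ~ Bacteria * Sun", data=df).fit()

print(model_a.params)  # the four regression coefficients
print(model_a.scale)   # the residual variance estimate (the fifth parameter)
print(model_a.llf)     # the log-likelihood, used in the LRT
```

Note that `model_a.scale` is the usual unbiased residual variance estimate (sum of squared residuals divided by residual degrees of freedom), which differs slightly from the pure ML estimate, but it is the fifth parameter either way.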

Okay, so let’s compare this model to one in which we add a few covariates:

**Model B**: Height_{i} = β_{0} + β_{1}*Bacteria + β_{2}*Sun + β_{3}*Bacteria*Sun + β_{4}*Soil Nitrogen level + β_{5}*Plant density + ε_{i}

However, let’s consider a third model. Say we realized there was no real interaction between Soil Bacteria and Sun, so we remove it to get Model C.

**Model C**: Height_{i} = β_{0} + β_{1}*Bacteria + β_{2}*Sun + β_{4}*Soil Nitrogen level + β_{5}*Plant density + ε_{i}

Model A has five parameters: β_{0}, β_{1}, β_{2}, β_{3}, and σ^{2}

Model B has seven: β_{0}, β_{1}, β_{2}, β_{3}, β_{4}, β_{5}, and σ^{2}

Model C has six: β_{0}, β_{1}, β_{2}, β_{4}, β_{5}, and σ^{2}

So Model A is nested in Model B.

Model C is nested in Model B.

But C and A are not nested. Each one contains parameters that the other doesn’t.

I’ve shown this example with fixed effects parameters (the regression coefficients), but it works the same way when we compare models with different variance or covariance parameters, as occurs when we add random or repeated effects.