Pretty much all of the common statistical models we use, with the exception of OLS Linear Models, rely on Maximum Likelihood estimation.

This includes favorites like:

- All Generalized Linear Models, including logistic, probit, beta, Poisson, negative binomial regression
- Linear Mixed Models
- Generalized Linear Mixed Models
- Parametric Survival Analysis models, like Weibull models
- Structural Equation Models

That’s a lot of models.

If you’ve ever learned any of these, you’ve heard that some of the statistics that compare model fit across competing models (specifically, the likelihood ratio test, which is based on model deviance) require that the models be nested. This is particularly important while you’re doing model building. You need to know which model fits better.
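The likelihood ratio test itself is simple to compute once you have the two models' log-likelihoods. Here's a minimal sketch (the function name and inputs are my own, not from any particular package): twice the difference in log-likelihoods is compared to a chi-square distribution whose degrees of freedom equal the difference in parameter counts.

```python
from scipy.stats import chi2

def likelihood_ratio_test(llf_reduced, llf_full, df_diff):
    """LRT for two models fit to the same data, where the reduced
    model is nested in the full model. df_diff is the difference
    in the number of estimated parameters."""
    lr_stat = 2 * (llf_full - llf_reduced)  # difference in deviance
    p_value = chi2.sf(lr_stat, df_diff)     # upper-tail chi-square probability
    return lr_stat, p_value
```

For example, if the reduced model has a log-likelihood of -110 and the full model -105, with 2 extra parameters, the test statistic is 10 on 2 degrees of freedom. The test is only valid when the models are nested, which is exactly why the distinction in this article matters.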

This can get really confusing because we often talk about variables being nested. For example, you may hear:

- Plot of trees is nested within Treatment
- Students are nested within Teacher
- Transects are nested within Location
- Word is nested within List

While this *concept* of nesting is the same as the one we’re applying to models, it’s a different *application* of the concept.

### Nested Models

So what are we talking about when we talk about nested models?

Model A is nested in Model B if the parameters in Model A are a subset of the parameters in Model B.

That’s it.

Let’s look at an example.

We are predicting the Height of a shrub from the bacteria in the soil, which is measured continuously, and from the dummy-coded variable Sun, which has a value of 1 for a location in full sun and a value of 0 for a location in partial sun.

**Model A**: Height_{i} = β_{0} + β_{1}*Bacteria + β_{2}*Sun + β_{3}*Bacteria*Sun + ε_{i}

This model has five parameters: β_{0}, β_{1}, β_{2}, β_{3} are all obvious, but there is one more, usually stated as an afterthought: σ^{2}

σ^{2} is the variance of the errors, ε_{i}. Sometimes you’ll see this written after the model, to make sure that this parameter and model assumptions are directly stated:

ε_{i} ~ i.i.d. N(0, σ^{2})

In your output you’ll see estimates of all five parameters. The four regression coefficients will usually be in one table, and the residual variance is often in another. But they’re all there.
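You can see this directly by fitting Model A in software. Here's a sketch using statsmodels with simulated data (the data values and coefficient choices are made up purely for illustration): the fitted result exposes the four regression coefficients in one attribute and the residual variance estimate in another.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data, purely illustrative
rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame({
    "Bacteria": rng.normal(50, 10, n),   # continuous soil bacteria measure
    "Sun": rng.integers(0, 2, n),        # 1 = full sun, 0 = partial sun
})
df["Height"] = (10 + 0.2 * df["Bacteria"] + 5 * df["Sun"]
                + 0.1 * df["Bacteria"] * df["Sun"]
                + rng.normal(0, 2, n))

# Model A: Bacteria * Sun expands to Bacteria + Sun + Bacteria:Sun
model_a = smf.ols("Height ~ Bacteria * Sun", data=df).fit()

print(model_a.params)  # the four regression coefficients
print(model_a.scale)   # the residual variance estimate (the fifth parameter)
print(model_a.llf)     # the log-likelihood, used in the LRT
```

Note that `model_a.scale` is the usual unbiased residual variance estimate (sum of squared residuals divided by residual degrees of freedom), which differs slightly from the pure ML estimate, but it is the fifth parameter either way.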

Okay, so let’s compare this model to one in which we add a few covariates:

**Model B**: Height_{i} = β_{0} + β_{1}*Bacteria + β_{2}*Sun + β_{3}*Bacteria*Sun + β_{4}*Soil Nitrogen level + β_{5}*Plant density + ε_{i}

However, let’s consider a third model. Say we realized there was no real interaction between Soil Bacteria and Sun, so we remove it to get Model C.

**Model C**: Height_{i} = β_{0} + β_{1}*Bacteria + β_{2}*Sun + β_{4}*Soil Nitrogen level + β_{5}*Plant density + ε_{i}

Model A has five parameters: β_{0}, β_{1}, β_{2}, β_{3}, and σ^{2}

Model B has seven: β_{0}, β_{1}, β_{2}, β_{3}, β_{4}, β_{5}, and σ^{2}

Model C has six: β_{0}, β_{1}, β_{2}, β_{4}, β_{5}, and σ^{2}

So Model A is nested in Model B.

Model C is nested in Model B.

But C and A are not nested. Each one contains parameters that the other doesn’t.

I’ve shown this example with fixed effects parameters (the regression coefficients), but it works the same way when we compare models with different variance or covariance parameters, as occurs when we add random or repeated effects.