What Is Specification Error in Statistical Models?

June 8th, 2022

When we think about model assumptions, we tend to focus on assumptions like independence, normality, and constant variance. The other big assumption, which is harder to see or test, is that there is no specification error. The assumption of linearity is part of this, but it’s actually a bigger assumption.

What is this assumption of no specification error? (more…)

Overfitting in Regression Models

August 9th, 2021

The practice of choosing predictors for a regression model, called model building, is an area of real craft.

There are many possible strategies and approaches, and they all work well in some situations. Every one of them requires making a lot of decisions along the way. As you make decisions, one danger to look out for is overfitting: creating a model that is too complex for the data. (more…)
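To make the idea of "too complex for the data" concrete, here is a minimal sketch that fits a simple and a more flexible polynomial to the same small simulated sample. Everything in it is an assumption for illustration: the simulated linear relationship, the sample sizes, the noise level, and the helper names `fit_polynomial` and `mean_squared_error`. The more complex model always fits the training sample at least as well, but it typically does worse on fresh data drawn from the same process.

```python
import random


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial coefficients, solved via the normal equations."""
    n = degree + 1
    # Build X'X and X'y for a design matrix with columns 1, x, x^2, ...
    xtx = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    xty = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    aug = [row[:] + [rhs] for row, rhs in zip(xtx, xty)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    coefs = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * coefs[j] for j in range(i + 1, n))
        coefs[i] = (aug[i][n] - known) / aug[i][i]
    return coefs


def mean_squared_error(coefs, xs, ys):
    """Average squared difference between predictions and observed values."""
    preds = [sum(c * x ** k for k, c in enumerate(coefs)) for x in xs]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)


def simulate(n):
    """Hypothetical data: a truly linear relationship plus random noise."""
    xs = [random.uniform(-1, 1) for _ in range(n)]
    ys = [1.0 + 2.0 * x + random.gauss(0, 0.5) for x in xs]
    return xs, ys


random.seed(0)
train_x, train_y = simulate(12)   # small sample used to fit the models
test_x, test_y = simulate(200)    # fresh data from the same process

results = {}
for degree in (1, 6):
    coefs = fit_polynomial(train_x, train_y, degree)
    results[degree] = (
        mean_squared_error(coefs, train_x, train_y),
        mean_squared_error(coefs, test_x, test_y),
    )
    print(f"degree {degree}: train MSE = {results[degree][0]:.3f}, "
          f"test MSE = {results[degree][1]:.3f}")
```

Comparing the training error with the error on data the model has not seen is one simple way to spot a model that has started fitting noise rather than signal.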

What It Really Means to Remove an Interaction From a Model

September 17th, 2020

When you’re model building, a key decision is which interaction terms to include and which to remove.

As a general rule, the default in regression is to leave them out. Add interactions only with a solid reason. It would seem like data fishing to simply add in all possible interactions.

And yet, that’s a common practice in most ANOVA models: put in all possible interactions and only take them out if there’s a solid reason. Even many software procedures default to creating interactions among categorical predictors.
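To get a feel for how quickly "all possible interactions" grows, here is a small sketch that enumerates every interaction among a set of categorical predictors. The factor names are hypothetical and the `a:b` notation is just one common way of writing an interaction term. With four factors there are already 11 interaction terms (6 two-way, 4 three-way, and 1 four-way).

```python
from itertools import combinations

# Hypothetical categorical predictors, used only for illustration.
factors = ["treatment", "sex", "site", "age_group"]

# Every interaction is a subset of two or more factors.
interactions = [
    ":".join(combo)
    for order in range(2, len(factors) + 1)
    for combo in combinations(factors, order)
]

for term in interactions:
    print(term)
print(len(interactions), "interaction terms for", len(factors), "factors")
```

In general, p factors allow 2^p - p - 1 interaction terms, which is part of why a full factorial specification becomes unwieldy as predictors are added.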


Simplifying a Categorical Predictor in Regression Models

January 14th, 2020

One of the many decisions you have to make when model building is which form each predictor variable should take. One specific version of this decision is whether to combine categories of a categorical predictor.

The greater the number of parameter estimates in a model, the greater the number of observations that are needed to keep power constant. The parameter estimates in a linear (more…)
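As a rough illustration of how combining categories reduces the number of parameters, here is a sketch that assumes reference (dummy) coding, where a predictor with k categories contributes k - 1 indicator variables. The education variable, its levels, the choice of which levels to merge, and the helper `dummy_code` are all hypothetical.

```python
def dummy_code(values, reference):
    """Return {level: 0/1 indicator list} for every level except the reference."""
    levels = sorted(set(values) - {reference})
    return {level: [1 if v == level else 0 for v in values] for level in levels}


# Hypothetical education variable with four categories.
education = ["high school", "bachelor", "master", "doctorate",
             "bachelor", "high school", "master", "bachelor"]

full = dummy_code(education, reference="high school")

# Collapse the three degree categories into one (an illustrative choice only).
collapse = {"bachelor": "degree", "master": "degree", "doctorate": "degree"}
simplified_values = [collapse.get(v, v) for v in education]
simplified = dummy_code(simplified_values, reference="high school")

print(len(full), "dummy variables before combining")
print(len(simplified), "dummy variable after combining")
```

Here the four-category version needs three dummy variables, while the combined two-category version needs only one, so the model has two fewer parameters to estimate.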

Descriptives Before Model Building

January 28th, 2019

One approach to model building is to use all predictors that make theoretical sense in the first model. For example, a first model for determining birth weight could include mother's age, education, marital status, race, weight gain during pregnancy and gestation period.

The main effects of this model show that a mother’s education level and marital status are insignificant.

Member Training: Model Building Approaches

January 1st, 2019

There is a bit of art and experience to model building. You need to build a model to answer your research question, but how do you build a statistical model when there are no instructions in the box?

Should you start with all your predictors or look at each one separately? Do you always take out non-significant variables and do you always leave in significant ones?