Assumptions of Linear Models are about Errors, not the Response Variable

March 19th, 2024 by

I recently received a great question in a comment about whether the assumptions of normality, constant variance, and independence in linear models are about the errors, εi, or the response variable, Yi.

The asker had a situation where Y, the response, was not normally distributed, but the residuals were.

Quick Answer: It’s just the errors.
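To see how the asker's situation can arise, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration, not data from the original question: the skewed predictor, the coefficients 1 and 3, the sample size, and the small `skewness` helper. The errors are drawn from a normal distribution, yet Y inherits the skew of the predictor.

```python
import numpy as np

def skewness(values):
    """Sample skewness: third central moment over variance to the 1.5 power."""
    values = np.asarray(values, dtype=float)
    centered = values - values.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5

rng = np.random.default_rng(0)
n = 5000

# Hypothetical setup: a strongly right-skewed predictor and normal errors.
x = rng.exponential(scale=2.0, size=n)
errors = rng.normal(loc=0.0, scale=1.0, size=n)
y = 1.0 + 3.0 * x + errors

# Fit a simple linear regression by least squares.
# np.polyfit returns coefficients from highest degree to lowest.
slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (intercept + slope * x)

print(f"skewness of Y:         {skewness(y):.2f}")
print(f"skewness of residuals: {skewness(residuals):.2f}")
```

In a run like this, Y should come out clearly right-skewed while the residuals stay close to symmetric, because the skew in Y comes entirely from X rather than from the errors.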

In fact, if you look at any (good) statistics textbook on linear models, you’ll see below the model, stating the assumptions: (more…)

Outliers and Their Origins

November 11th, 2016 by

Outliers are one of those realities of data analysis that no one can avoid.

Those pesky extreme values cause biased parameter estimates, non-normality in otherwise beautifully normal variables, and inflated variances.

Everyone agrees that outliers cause trouble with parametric analyses. But not everyone agrees that they’re always a problem, or what to do about them even if they are.

Sometimes a nonparametric or robust alternative is available — and sometimes not.

There are a number of approaches in statistical analysis for dealing with outliers and the problems they create. It’s common for committee members or Reviewer #2 to have very strong opinions that there is one and only one good approach.

Two approaches that I’ve commonly seen are: 1) delete outliers from the sample, or 2) winsorize them (i.e., replace the outlier value with one that is less extreme).
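As a rough sketch of what these two approaches do to a sample, the snippet below applies each to simulated data. The lognormal sample and the 5th/95th percentile cutoffs are assumptions made for illustration; in practice the rule for flagging outliers varies. Every value here is a valid draw from the same distribution, so any shift in the estimates comes from the treatment itself.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sample: right-skewed, with every extreme value being valid.
sample = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)

# Assumed cutoff rule: flag anything outside the 5th to 95th percentile range.
lower, upper = np.percentile(sample, [5, 95])

# Approach 1: delete the flagged values.
trimmed = sample[(sample >= lower) & (sample <= upper)]

# Approach 2: winsorize, replacing flagged values with the nearest cutoff.
winsorized = np.clip(sample, lower, upper)

for label, data in [("original", sample),
                    ("deleted", trimmed),
                    ("winsorized", winsorized)]:
    print(f"{label:>10}: n={data.size:5d}  "
          f"mean={data.mean():.3f}  sd={data.std(ddof=1):.3f}")
```

With a skewed sample like this one, both treatments pull the mean and standard deviation below those of the full sample. Deletion also shrinks the sample size, while winsorizing keeps it but piles values up at the cutoffs.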

The problem with both of these “solutions” is that they also cause problems — biased parameter estimates and underweighted or eliminated valid values. (more…)