influential outliers Archives - The Analysis Factor

Outliers are one of those realities of data analysis that no one can avoid.

Those pesky extreme values cause biased parameter estimates, non-normality in otherwise beautifully normal variables, and inflated variances.

Everyone agrees that outliers cause trouble with parametric analyses. But not everyone agrees that they’re always a problem, or what to do about them even if they are.

Ways to Deal With Outliers

Sometimes a non-parametric or robust alternative is available.

And sometimes not.

There are a number of approaches in statistical analysis for dealing with outliers and the problems they create.

It’s common for committee members or Reviewer #2 to have Very. Strong. Opinions. that there is one and only one good approach.

Two approaches that I’ve commonly seen are:

1) delete outliers from the sample, or

2) winsorize them (i.e., replace the outlier value with one that is less extreme).

Limitations of these Solutions

The problem with both of these “solutions” is that they also cause problems: biased parameter estimates and underweighted or eliminated valid values.
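To make the two approaches concrete, here is a minimal sketch in Python with NumPy. The function names `trim` and `winsorize`, the 5th/95th percentile cutoffs, and the simulated sample are all assumptions made for illustration; they are not a recommendation for where to draw the line.

```python
import numpy as np

def trim(values, lower_pct=5, upper_pct=95):
    """Approach 1: delete every value outside the chosen percentiles."""
    values = np.asarray(values, dtype=float)
    lo, hi = np.percentile(values, [lower_pct, upper_pct])
    return values[(values >= lo) & (values <= hi)]

def winsorize(values, lower_pct=5, upper_pct=95):
    """Approach 2: pull every value outside the percentiles back to the cutoff."""
    values = np.asarray(values, dtype=float)
    lo, hi = np.percentile(values, [lower_pct, upper_pct])
    return np.clip(values, lo, hi)

# Hypothetical sample: 99 ordinary values plus one extreme case.
rng = np.random.default_rng(0)
sample = np.append(rng.normal(50, 10, size=99), 200.0)

trimmed = trim(sample)
winsorized = winsorize(sample)

for name, x in [("original", sample), ("trimmed", trimmed), ("winsorized", winsorized)]:
    print(f"{name:>10}: n={len(x):3d}  mean={x.mean():6.2f}  var={x.var(ddof=1):8.2f}")
```

Notice that neither function touches only the one extreme case. Trimming also throws out perfectly ordinary values in both tails, and winsorizing overwrites them, so the variance shrinks either way. That is the limitation described above, in miniature.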


I recently had this question in consulting:

I’ve got 12 out of 645 cases with Mahalanobis distances above the critical value, so I removed them and reran the analysis, only to find that another 10 cases were now outside the value. I removed these, and another 10 appeared, and so on until I have removed over 100 cases from my analysis! Surely this can’t be right!?! Do you know any way around this? It is really slowing down my analysis and I have no idea how to sort this out!!

And this was my response:

I wrote an article about dropping outliers. As you’ll see, you can’t just drop outliers without a REALLY good reason. Being influential is not in itself a good enough reason to drop data.
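If you want to see why the cycle in the question never seems to end, here is a rough simulation in Python with NumPy. Everything in it is an assumption for illustration: the data are 645 simulated cases on two heavy-tailed variables (not the client's data), the cutoff is the chi-square critical value at p < .001 with 2 degrees of freedom (which has the closed form -2·ln(.001)), and `mahalanobis_sq` and `iterative_removal` are hypothetical helper names.

```python
import numpy as np

def mahalanobis_sq(X):
    """Squared Mahalanobis distance of each row from the sample centroid."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(X, rowvar=False))
    # Row-wise quadratic form: centered[i] @ inv_cov @ centered[i]
    return np.einsum("ij,jk,ik->i", centered, inv_cov, centered)

def iterative_removal(X, critical_value, max_rounds=50):
    """Drop flagged cases, recompute distances, repeat until none are flagged."""
    X = np.asarray(X, dtype=float)
    removed_per_round = []
    for _ in range(max_rounds):
        flagged = mahalanobis_sq(X) > critical_value
        n_flagged = int(flagged.sum())
        if n_flagged == 0:
            break
        removed_per_round.append(n_flagged)
        X = X[~flagged]
    return X, removed_per_round

# Simulated stand-in data: 645 cases, 2 variables, heavy tails, no data errors.
rng = np.random.default_rng(42)
data = rng.standard_t(df=3, size=(645, 2))

# Chi-square cutoff for 2 df at p < .001 (for 2 df the quantile is -2 * ln(alpha)).
critical = -2.0 * np.log(0.001)

kept, removed = iterative_removal(data, critical)
print("cases removed in each round:", removed)
print("cases remaining:", len(kept))
```

Each round of deletion shrinks the covariance matrix, so the cases that remain sit relatively farther from the centroid than they did before, and some of them now cross the same cutoff. The distances are measured against a yardstick that the deletions themselves keep shortening.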

