A Reason to Not Drop Outliers

by Karen Grace-Martin

I recently had this question in consulting:

I’ve got 12 out of 645 cases with Mahalanobis’s Distances above the critical value, so I removed them and reran the analysis, only to find that another 10 cases were now outside the value. I removed these, and another 10 appeared, and so on until I have removed over 100 cases from my analysis! Surely this can’t be right!?! Do you know any way around this? It is really slowing down my analysis and I have no idea how to sort this out!!

And this was my response:

I wrote an article about dropping outliers. As you’ll see, you can’t just drop outliers without a REALLY good reason. Being influential is not, in itself, a good enough reason to drop data.
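To see why the cycle in the question keeps going, here is a minimal sketch on simulated data (not the client’s data). It assumes two variables, 645 cases drawn from a heavy-tailed distribution, and a chi-square cutoff at alpha = 0.001, which is a common convention but still an assumption here. The function names are made up for illustration. Each time the flagged cases are dropped, the covariance matrix shrinks, so cases that were previously inside the cutoff can now fall outside it.

```python
import math
import random


def mahalanobis_sq(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    # Closed-form inverse of the 2x2 covariance matrix.
    return [
        (syy * (x - mx) ** 2
         - 2 * sxy * (x - mx) * (y - my)
         + sxx * (y - my) ** 2) / det
        for x, y in points
    ]


def prune_repeatedly(points, alpha=0.001, max_rounds=50):
    """Drop cases above the cutoff, recompute, and repeat (illustration only)."""
    # Chi-square with 2 df has P(X > c) = exp(-c / 2), so c = -2 * ln(alpha).
    cutoff = -2.0 * math.log(alpha)
    kept = list(points)
    removed_per_round = []
    for _ in range(max_rounds):
        if len(kept) < 10:  # too few cases left to estimate a covariance
            break
        d2 = mahalanobis_sq(kept)
        survivors = [p for p, d in zip(kept, d2) if d <= cutoff]
        dropped = len(kept) - len(survivors)
        if dropped == 0:
            break
        removed_per_round.append(dropped)
        kept = survivors
    return kept, removed_per_round


def heavy_tailed_point(rng, df=3):
    """One bivariate t-like draw: normal coordinates over a shared chi-square scale."""
    scale = math.sqrt(sum(rng.gauss(0, 1) ** 2 for _ in range(df)) / df)
    return (rng.gauss(0, 1) / scale, rng.gauss(0, 1) / scale)


rng = random.Random(1)
data = [heavy_tailed_point(rng) for _ in range(645)]
kept, removed = prune_repeatedly(data)
print("cases removed in each round:", removed)
print("total removed:", sum(removed), "of", len(data))
```

With heavy-tailed data like this, the printed list will typically show several rounds of removals rather than one. The extra rounds come from re-estimating the mean and covariance on a trimmed sample, not from anything wrong with the cases themselves.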


3 comments


Hey! Thanks for sharing this!
I have a question and unfortunately I couldn’t find the answer. I hope someone here can help.
I know how to remove outliers, but what I don’t understand is whether I should remove them based only on my dependent variable, or on the other variables too. For example, if I want to estimate salary from age and education, …
should I remove records that are outliers in age?



Hi Karen, the newsletter and the dropping outliers link do not seem to be available. Is there a way I can get the answer to the question posted above?




Hi Meenu, I fixed it.

