The Analysis Factor StatWise Newsletter
Volume 1, Issue 1
July, 2008
In This Issue

A Note from Karen

Featured Article: Outliers: To Drop or Not To Drop?

Resource of the Month

What's New

About Us

 
A Note from Karen

Welcome, welcome, welcome to our premiere issue!  I really hope you enjoy StatWise, and all the other resources, programs, and services we’re cooking up for you here at The Analysis Factor.  Thanks for joining me on this journey!

As you may know, I spent 7 years as a statistical consultant in the statistical consulting office at Cornell University.  I learned so much there about being a really good statistical consultant: not only knowing a lot about statistics, but also understanding the pressures and issues researchers face, giving good customer service, and communicating in a way that researchers understand.

Since very, very few universities have statistical consulting at all, much less consulting with a fabulous service focus, my goal is to make the best of Cornell’s legacy available to researchers anywhere. 

This newsletter is the first step toward that goal.  (Over the next year, I will be adding many, many resources and learning opportunities.)  In today’s article, I’m clarifying a very basic but practical issue that researchers often have questions about: outliers.  Everyone’s got them.  When do you get rid of them?

Also please check out the Resource of the Month.  Each month, I will highlight a book, web site, or other great resource that will help you practice statistics.  For our first month, I couldn’t help but offer 3 fabulous resources for learning and using SPSS.

Featured Article: When is it Legitimate to Drop Outliers?

Outliers are one of those statistical issues that everyone knows about, but most people aren’t sure how to deal with.  Most parametric statistics, like means, standard deviations, and correlations, and every statistic based on these, are highly sensitive to outliers.  And since the assumptions of common statistical procedures, like linear regression and ANOVA, are also based on these statistics, outliers can really mess up your analysis.
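To see how sensitive these statistics are, here is a minimal sketch in Python.  It is only an illustration: the data are made up, and the helper name pearson_r is an assumption of this example, not something from a particular package.  It computes the mean, standard deviation, and correlation for five ordinary points, then again after adding a single extreme point.

```python
import statistics


def pearson_r(xs, ys):
    """Pearson correlation, computed directly from its definition."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Made-up data: five ordinary points with no linear relationship.
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 1, 3]

# The same data plus one extreme point at (20, 20).
x_out = x + [20]
y_out = y + [20]

for label, xs, ys in [("without outlier", x, y), ("with outlier", x_out, y_out)]:
    print(
        f"{label:16s} mean(y) = {statistics.fmean(ys):5.2f}  "
        f"sd(y) = {statistics.stdev(ys):5.2f}  "
        f"r = {pearson_r(xs, ys):5.2f}"
    )
```

With these particular numbers, one added point moves the mean of y from 2.4 to about 5.3, the standard deviation from about 1.3 to about 7.3, and the correlation from essentially 0 to about 0.97.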

Despite all this, as much as you’d like to, it is NOT acceptable to drop an observation just because it is an outlier.  Outliers can be legitimate observations and are sometimes the most interesting ones.  It’s important to investigate the nature of the outlier before deciding.

  1. If it is obvious that the outlier is due to incorrectly entered or measured data, you should drop the outlier:

    For example, I once analyzed a data set in which a woman’s weight was recorded as 19 lbs.  I knew that was physically impossible.  Her true weight was probably 91, 119, or 190 lbs, but since I didn’t know which one, I dropped the outlier.  

    This also applies to a situation in which you know the datum did not accurately measure what you intended.  For example, if you are testing people’s reaction times to an event, but you see that a participant is not paying attention and is randomly hitting the response key, you know it is not an accurate measurement.

  2. If the outlier does not change the results but does affect assumptions, you may drop the outlier.  If you do, note it in a footnote of your paper.

    Neither the presence nor absence of the outlier in the graph below would change the regression line:

    graph-1

  3. More commonly, the outlier affects both results and assumptions.  In this situation, it is not legitimate to simply drop the outlier.  You may run the analysis both with and without it, but you should state, at least in a footnote, which data points were dropped and how the results changed.

    graph-2

  4. If the outlier creates a significant association, you should drop the outlier and should not report any significance from your analysis.

    In the following graph, the relationship between X and Y is clearly created by the outlier.  Without it, there is no relationship between X and Y, so the regression coefficient does not truly describe the effect of X on Y.

    graph-3
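The "run it with and without" check behind situations 2 through 4 can be sketched in a few lines of Python.  This is a hedged illustration, not a prescribed procedure: both data sets are invented, and the function names ols_fit and fit_with_and_without are hypothetical helpers written for this example.

```python
def ols_fit(xs, ys):
    """Least-squares slope and intercept for a simple linear regression."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_with_and_without(xs, ys, outlier_index):
    """Fit the line twice: on all points, and with one point left out."""
    keep = [i for i in range(len(xs)) if i != outlier_index]
    with_fit = ols_fit(xs, ys)
    without_fit = ols_fit([xs[i] for i in keep], [ys[i] for i in keep])
    return with_fit, without_fit


# Made-up data resembling situation 2: the extreme point sits on the trend.
x_on_trend = [1, 2, 3, 4, 5, 20]
y_on_trend = [2.1, 3.9, 6.2, 7.8, 10.0, 40.0]

# Made-up data resembling situation 4: the extreme point creates the trend.
x_creates = [1, 2, 3, 4, 5, 20]
y_creates = [3, 1, 4, 1, 3, 20]

for label, xs, ys in [("on trend", x_on_trend, y_on_trend),
                      ("creates trend", x_creates, y_creates)]:
    (slope_with, _), (slope_without, _) = fit_with_and_without(xs, ys, 5)
    print(f"{label:14s} slope with = {slope_with:.2f}, without = {slope_without:.2f}")
```

In the first data set the slope barely moves when the extreme point is removed.  In the second, the slope is close to 1 with the point and essentially 0 without it, which is the pattern described in situation 4.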

So in those cases where you shouldn’t drop the outlier, what do you do? 

One option is to try a transformation.  Square root and log transformations both pull in high numbers.  This can make assumptions work better if the outlier is on the dependent variable and can reduce the impact of a single point if the outlier is on an independent variable.
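As a rough illustration of how these transformations pull in high numbers, here is a short Python sketch.  The values are made up, and relative_gap is a crude measure invented for this example: it expresses the distance between the two largest values in units of the range of all the other values.  Keep in mind that a square root needs non-negative values and a log needs strictly positive ones.

```python
import math

# Made-up, strictly positive values with one very large observation.
values = [1, 4, 9, 16, 10000]

sqrt_values = [math.sqrt(v) for v in values]   # requires values >= 0
log_values = [math.log10(v) for v in values]   # requires values > 0


def relative_gap(vals):
    """Gap between the two largest values, in units of the range of the rest."""
    ordered = sorted(vals)
    rest = ordered[:-1]
    return (ordered[-1] - rest[-1]) / (rest[-1] - rest[0])


for label, vals in [("raw", values), ("sqrt", sqrt_values), ("log10", log_values)]:
    print(f"{label:6s} relative gap = {relative_gap(vals):8.2f}")
```

On the raw scale the largest value sits hundreds of "ranges" away from the rest; after a square root that shrinks to 32, and after a log to a little over 2.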

Another option is to try a different model.  This should be done with caution, but it may be that a non-linear model fits better.  For example, in situation 3 above, perhaps an exponential curve fits the data with the outlier intact.
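Here is one simple way to compare a straight line with an exponential curve, sketched in Python.  Everything in it is an assumption made for illustration: the data are invented so that the far point lies near an exponential trend, the helper names are hypothetical, and the exponential curve is fitted the easy way, as a straight line on log(y), which minimizes error in log units rather than in the original units.

```python
import math


def ols_fit(xs, ys):
    """Least-squares slope and intercept for a simple linear regression."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, predictions):
    """Share of the variance in ys explained by the predictions."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up data: the point at x = 12 looks extreme next to a straight line.
x = [1, 2, 3, 4, 5, 12]
y = [2.8, 3.5, 5.0, 6.4, 9.2, 73.2]

# Straight line: y = intercept + slope * x
slope, intercept = ols_fit(x, y)
linear_pred = [intercept + slope * xi for xi in x]

# Exponential curve y = a * exp(b * x), fitted as a straight line on log(y).
b, log_a = ols_fit(x, [math.log(yi) for yi in y])
exp_pred = [math.exp(log_a + b * xi) for xi in x]

# Both R-squared values are computed on the original scale of y.
r2_linear = r_squared(y, linear_pred)
r2_exp = r_squared(y, exp_pred)
print(f"linear R^2 = {r2_linear:.3f}, exponential R^2 = {r2_exp:.3f}")
```

For these invented numbers the exponential curve accommodates the far point much better than the straight line does.  A better fit alone does not justify the model, though; it still has to make sense for your data and research area.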

Whichever approach you take, you need to know your data and your research area well.  Try different approaches, and see which make theoretical sense.

Resource of the Month

While SPSS is relatively straightforward to learn and use, like any software program, it has its challenges.  And, unfortunately, the manuals aren’t always the most comprehensive.  These three resources are excellent for anyone learning or using SPSS (all are free or very inexpensive):

The University of Texas at Austin’s Division of Statistics and Scientific Computation has an online SPSS tutorial with four modules: 

  • Getting Started
  • Descriptive and Inferential Statistics
  • Displaying Data
  • Data Manipulation and Advanced Topics

http://ssc.utexas.edu/consulting/tutorials/index.html

The Academic & Technology Services at UCLA has many, many resources to help you learn and use SPSS (and many other statistical software packages) at:

http://www.ats.ucla.edu/stat/spss/default.htm

My favorite “How to use SPSS” book is Using SPSS for Windows: Analyzing and Understanding Data, by Samuel B. Green, Neil J. Salkind, and Theresa M. Akey.

Chapters are short and each covers a different statistical technique. There is a description of the technique and examples of when it should be used, as well as step-by-step instructions for doing it in SPSS. It is an SPSS book, not a statistics book--a good review of the statistical methods, but not enough for beginners.

What's New

Well, StatWise!  Again, I really hope you enjoy it.

We are planning on adding many resources and programs this year.  This section will keep you updated as we add each one.

We are currently offering one-on-one statistical consulting (advising and answering questions so you can practice statistics) as well as statistical project work (give us the data and we handle the rest).  Please contact us if you need either service.

About Us

Karen Grace-Martin is the owner and founder of The Analysis Factor.  Our philosophy is that statistics, as an applied skill, is learned best within the context of a researcher’s own data.  Researchers at every stage of their career therefore need ongoing statistics training and support.  The Analysis Factor offers statistical consulting, projects, resources, and learning programs that empower social science researchers to become confident, able, and skilled statistical practitioners.

Karen spent seven years as a statistical consultant in the statistical consulting office at Cornell University.  While there, she learned how to be a great statistical advisor—not only developing excellent statistical skills, but understanding the pressures and issues researchers are facing, giving fabulous customer service, and communicating technical ideas at a level each client understands. 

You can learn more about Karen Grace-Martin and The Analysis Factor at analysisfactor.com.

Please forward this newsletter to colleagues who you think would find it useful. Your recommendation is how we grow.
