The Analysis Factor

Statistical Consulting, Resources, and Statistics Workshops for Researchers

The Problem with Using Tests for Statistical Assumptions

by Karen Grace-Martin

Every statistical model and hypothesis test has assumptions.

And yes, if you’re going to use a statistical test, you need to check whether those assumptions are reasonable to whatever extent you can.

Some assumptions are easier to check than others. Some are so obviously reasonable that you don’t need to do much to check them most of the time. And some have no good way of being checked directly, so you have to use situational clues.

There are many nuances with assumptions as well. How serious a violation is depends on the details of your particular study and data set: sample sizes, imbalance in the data across groups, whether the study is exploratory or confirmatory, and so on.

And here’s the kicker: the simple rules your stats professor told you to use to test assumptions were absolutely sufficient when you were learning about tests and assumptions. But now that you’re doing real data analysis? It’s time to dig into the details.

So here are some guidelines about checking assumptions that should help you make decisions about which to check and when to conclude that you’ve checked enough.

Before you begin

  1. Make sure you understand what the assumptions are and what they mean. There is a lot of misinformation and vague information out there about assumptions.
  2. Don’t forget that when assumption violations are serious, they call all of your results into question. This really is important.

Don’t rely on a single statistical test to decide if another test’s assumptions have been met.

There are many tests, like Levene’s test for homogeneity of variance, the Kolmogorov-Smirnov test for normality, and Bartlett’s test of sphericity, whose main use is to test the assumptions of another test.

These tests probably have other uses, but this is how I’ve generally seen them used.

These tests provide useful information about whether an assumption is being met. But they’re just one piece of information that you should use to decide if the assumption is reasonable.

Because, nuances.

Let’s take Levene’s test of homogeneity of variance as an example. It’s often used in ANOVA as the sole decision criterion. And software makes it so easy.

But it’s too simple.

  1. It relies too much on p-values, and therefore on sample sizes. Given the same difference in variances, Levene’s test will tend to have a smaller p-value in a large sample than in a small one. So it’s very likely that you’re overstating a problem with the assumption in large samples and understating it in small samples. You can’t ignore the actual size of the difference in the variances when making this decision. So sure, look at the p-value, but also look at the actual variances and how much bigger some are than others. (In other words, look at the effect size, not just the p-value.)
  2. The ANOVA is generally considered robust to violations of this assumption when sample sizes across groups are equal. So even if Levene’s is significant, moderately different variances may not be a problem in balanced data sets. Keppel (1992) suggests that a good rule of thumb is that if sample sizes are equal, robustness should hold until the largest variance is more than 9 times the smallest variance.
  3. This robustness goes away the more unbalanced the samples are. So you need to use judgment here, taking into account both the imbalance and the actual difference in variances.
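
To make the first point concrete, here is a minimal simulation sketch in Python, assuming NumPy and SciPy are available. The function name levene_at_size, the standard deviation ratio of 1.5, and the sample sizes are illustrative choices and do not come from any real study. It draws two normal groups with the same underlying difference in spread at several sample sizes and reports both the observed variance ratio and the Levene p-value.

```python
import numpy as np
from scipy import stats


def levene_at_size(n_per_group, sd_ratio=1.5, seed=0):
    """Simulate two normal groups whose SDs differ by sd_ratio, then run Levene's test.

    Returns the Levene p-value and the observed ratio of sample variances.
    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    group_a = rng.normal(loc=0.0, scale=1.0, size=n_per_group)
    group_b = rng.normal(loc=0.0, scale=sd_ratio, size=n_per_group)

    # center="median" is SciPy's default (the Brown-Forsythe variant).
    result = stats.levene(group_a, group_b, center="median")

    # Effect size: how many times larger is one sample variance than the other?
    variance_ratio = np.var(group_b, ddof=1) / np.var(group_a, ddof=1)
    return float(result.pvalue), float(variance_ratio)


for n in (20, 200, 2000):
    p_value, ratio = levene_at_size(n)
    print(f"n per group = {n:5d}  variance ratio = {ratio:5.2f}  Levene p = {p_value:.4g}")
```

Under these assumptions the true variance ratio is 1.5 squared, or 2.25, at every sample size. What changes as n grows is mainly the p-value, which is why the p-value alone is a poor guide to how different the variances are.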

Gather Evidence Instead

Use the result of Levene’s test as one piece of evidence in making your decision.

In addition to that and the ratio rule of thumb, other pieces of evidence could include:

  • A graph of your data. Is there an obvious difference in spread across groups?
  • A different test of the same assumption, if it exists. (Hartley’s F-max is another one for equal variances). See if the results match.
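
One way to pull these pieces together is sketched below in Python, assuming NumPy and SciPy. The helper name variance_evidence and the example data are hypothetical. Two caveats apply. First, SciPy does not, as far as I know, provide Hartley’s F-max test with its critical values, so the sketch only computes the largest-to-smallest variance ratio, which is the F-max statistic itself. Second, the second test used here is scipy.stats.bartlett, which is Bartlett’s test for equal variances (not the sphericity test) and is known to be sensitive to non-normality.

```python
import numpy as np
from scipy import stats


def variance_evidence(groups):
    """Collect several pieces of evidence about equal variances across groups.

    groups: dict mapping a group label to a sequence of observations.
    Returns a dict with group sizes, sample variances, the max/min variance
    ratio, and p-values from two different tests of equal variances.
    """
    samples = {label: np.asarray(values, dtype=float) for label, values in groups.items()}
    sizes = {label: int(x.size) for label, x in samples.items()}
    variances = {label: float(np.var(x, ddof=1)) for label, x in samples.items()}

    levene = stats.levene(*samples.values(), center="median")
    bartlett = stats.bartlett(*samples.values())

    return {
        "n": sizes,  # imbalance matters for robustness
        "variance": variances,
        "max_min_variance_ratio": max(variances.values()) / min(variances.values()),
        "levene_p": float(levene.pvalue),
        "bartlett_p": float(bartlett.pvalue),
    }


# Hypothetical data for three groups, for illustration only.
example = {
    "control": [4.1, 5.0, 5.3, 4.7, 5.9, 4.4],
    "low": [3.2, 6.1, 5.5, 4.0, 6.8, 4.9],
    "high": [1.5, 8.2, 6.9, 2.8, 9.4, 5.1],
}

for key, value in variance_evidence(example).items():
    print(key, value)
```

The output is meant to be read side by side rather than reduced to a single yes or no: group sizes show the imbalance, the variance ratio shows the effect size, and the two p-values show whether different tests agree. A box plot or similar graph of the same groups would cover the visual check.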

Consider each piece of evidence within the wider data context.

Now make a judgment. Take into account the sample sizes when interpreting p-values from any tests.

I know it’s not comfortable to make judgments based on uncertain information. Experience helps here.

But remember, in data analysis, it’s impossible not to.

Be transparent about what you did and what evidence you based your decision on.

Tagged With: ANOVA, checking assumptions, levene's test

Copyright © 2008–2021 The Analysis Factor, LLC. All rights reserved.