What is Analysis of Means?

The classic way to compare means in analysis of variance is to examine pairwise differences in means after an F test. It’s great for many problems, but sometimes your research question isn’t about pairwise differences.

Pairwise differences are not ideal if your research question is whether, like the Sesame Street song, one of these groups is not like the others. Or perhaps you are trying to ensure that groups conform to a common standard. Analysis of Means (ANOM) helps in both of these settings.

The traditional hypothesis

You’ve seen this hypothesis for the standard ANOVA setting in your introductory Statistics class.

$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k \quad \text{versus} \quad H_1: \mu_i \ne \mu_j \text{ for at least one pair } i \ne j$$

This is a very common approach in a comparison of k independent groups. Test for any deviation from the null hypothesis using an F test. If this is statistically significant, then use a Tukey follow-up test to see which pairs of means differ from one another.

The comparison to the overall mean hypothesis

Suppose you are in a setting like the one described above, where there is an expectation of similarity between each group mean and the overall mean. Then, you would write your hypothesis as

 

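$$H_0: \mu_i = \bar{\mu} \text{ for all } i = 1, \ldots, k \quad \text{versus} \quad H_1: \mu_i \ne \bar{\mu} \text{ for at least one } i$$

where $\bar{\mu}$ is the overall mean.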

This approach is not that common, but you can find it in most statistical software programs under the name “Analysis of Means” or its acronym, ANOM.

It fits in well with a setting where you hope that all the groups produce consistent results. If any do not, then you want to identify the group or groups that deviate from the norm and study factors that may account for the deviation.

The ANOM approach compares each group mean to the overall mean. It is easy to implement and it lends itself to a simple graphical display.
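As a sketch of what that comparison looks like for equal group sizes, the decision limits are usually written as below. The notation is an assumption added here rather than taken from the original figures: $\bar{y}_{\cdot\cdot}$ is the overall mean, $s_p$ is the pooled within-group standard deviation, $k$ is the number of groups, $n$ is the number of observations per group, and $h$ is the ANOM critical value.

$$\bar{y}_{\cdot\cdot} \pm h \, s_p \sqrt{\frac{k-1}{k\,n}}$$

A group mean that falls outside these limits differs significantly from the overall mean. The square-root factor comes from the variance of $\bar{y}_i - \bar{y}_{\cdot\cdot}$, which is $\sigma^2 (k-1)/(kn)$ when every group has $n$ observations.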

You need a table of critical values, which depend on alpha (the overall Type I error rate), k (the number of groups), and n (the number of observations within each group). Some tables use the degrees of freedom for error in place of n.
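If no table is at hand, a critical value can be approximated by simulation. The sketch below is one way to do it, assuming normal data, equal group sizes, and a two-sided test. The function name `anom_critical_value` and its parameters are made up for this illustration, and a published table or statistical software remains the better source for a real analysis.

```python
import math
import random

def anom_critical_value(k, n, alpha=0.05, reps=20000, seed=1):
    """Monte Carlo approximation of the two-sided ANOM critical value.

    k: number of groups, n: observations per group (equal sizes assumed).
    """
    rng = random.Random(seed)
    scale = math.sqrt((k - 1) / (k * n))
    max_stats = []
    for _ in range(reps):
        means = []
        ss_within = 0.0
        for _ in range(k):
            # Simulate one group under the null hypothesis (all means equal).
            y = [rng.gauss(0.0, 1.0) for _ in range(n)]
            m = sum(y) / n
            means.append(m)
            ss_within += sum((v - m) ** 2 for v in y)
        grand = sum(means) / k
        s_pooled = math.sqrt(ss_within / (k * (n - 1)))
        # Largest standardized distance of any group mean from the overall mean.
        max_stats.append(max(abs(m - grand) for m in means) / (s_pooled * scale))
    max_stats.sort()
    # Empirical (1 - alpha) quantile of the simulated maximum.
    return max_stats[min(reps - 1, math.ceil((1 - alpha) * reps) - 1)]

h = anom_critical_value(k=5, n=4)
print(f"approximate critical value: {h:.2f}")
```

Because this is a Monte Carlo estimate, the result varies a little with the seed and the number of replications.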

You don’t need to compute the traditional F statistic for Analysis of Variance first, because the Analysis of Means approach controls the overall Type I error rate. This protects you from the accusation of p-hacking, even if the number of groups is very large.

It’s important to editorialize a bit here. Deviating from the norm could be a good thing, a bad thing, or an indifferent thing. Your goal is not to use statistics to hunt out different groups to reward or punish them. You are using statistics to understand whether deviations from the norm occur and then to study the deviating groups to understand why they deviate.

A simple example

Imagine that you are a crop scientist, comparing five genetically modified strains of eggplant. You developed these strains to produce a natural insecticide, Cry1Ac, in the eggplant leaves.

You hope that all strains produce a comparable amount of insecticide. If any strain produces a lot more or a lot less insecticide, you will investigate the full genetic profile of that strain to see why it is different from the others.

 


The data comes from
Hautea DM et al. Field Performance of Bt Eggplants (Solanum melongena L.) in the Philippines: Cry1Ac Expression and Control of the Eggplant Fruit and Shoot Borer (Leucinodes orbonalis Guenée). PLoS One. 2016; 11(6): e0157498.
and my description of the testing scenario is a bit simpler than the analysis done in that paper.

Here are the summary statistics.

 

[Table: summary statistics (group means and standard deviations) for the five strains]

The raw data was not available, but you can calculate the analysis of means limits using the group means and standard deviations. You can plot the individual means versus the limits provided by analysis of means calculations.
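A rough sketch of that calculation for equal group sizes follows. The means, standard deviations, group size, and critical value in the usage lines are hypothetical placeholders, not the values from the paper. The critical value `h` must come from an ANOM table or software for your alpha, number of groups, and error degrees of freedom.

```python
import math

def anom_limits(means, sds, n, h):
    """Return (lower, upper) ANOM decision limits from summary statistics.

    Assumes every group has the same sample size n.
    """
    k = len(means)
    grand_mean = sum(means) / k
    # With equal n, the pooled variance is the average of the group variances.
    s_pooled = math.sqrt(sum(s ** 2 for s in sds) / k)
    half_width = h * s_pooled * math.sqrt((k - 1) / (k * n))
    return grand_mean - half_width, grand_mean + half_width

# Hypothetical inputs for illustration only (not the eggplant data).
means = [10.0, 12.0, 11.0, 9.0, 13.0]
sds = [2.0, 2.0, 2.0, 2.0, 2.0]
lower, upper = anom_limits(means, sds, n=4, h=2.88)  # h is a placeholder value
outside = [i for i, m in enumerate(means) if m < lower or m > upper]
print(f"limits: {lower:.2f} to {upper:.2f}; groups outside: {outside}")
```

Any group whose index appears in `outside` has a mean beyond the decision limits and would be flagged as differing from the overall mean.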

 

[Figure: plot of the individual strain means versus the Analysis of Means limits]

Notice that all five means lie inside the limits. None of the five strains shows a statistically significant difference from the overall mean.

You may prefer a one-sided test in this setting, such as testing whether any strain is deficient in Cry1Ac levels. You can make a very easy modification to get one-sided tests.
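The modification is not spelled out here. One common version, sketched below as an assumption, keeps only the lower decision limit and replaces the two-sided critical value with a one-sided one, which is smaller. The simulation approximates such a one-sided critical value for normal data with equal group sizes; `one_sided_anom_critical_value` is a hypothetical helper name.

```python
import math
import random

def one_sided_anom_critical_value(k, n, alpha=0.05, reps=20000, seed=1):
    """Monte Carlo approximation of a one-sided (lower) ANOM critical value."""
    rng = random.Random(seed)
    df = k * (n - 1)  # error degrees of freedom
    scale = math.sqrt((k - 1) / (k * n))
    stats = []
    for _ in range(reps):
        # Means of n standard normal observations have sd 1 / sqrt(n).
        means = [rng.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(k)]
        grand = sum(means) / k
        # The pooled variance is distributed as chi-square(df) / df,
        # and chi-square(df) is Gamma(shape=df/2, scale=2).
        s_pooled = math.sqrt(rng.gammavariate(df / 2.0, 2.0) / df)
        # Largest standardized shortfall of any group below the overall mean.
        stats.append(max(grand - m for m in means) / (s_pooled * scale))
    stats.sort()
    return stats[min(reps - 1, math.ceil((1 - alpha) * reps) - 1)]

h_lower = one_sided_anom_critical_value(k=5, n=4)
# A group would be flagged as deficient if its mean falls below
# overall_mean - h_lower * s_pooled * sqrt((k - 1) / (k * n)).
print(f"approximate one-sided critical value: {h_lower:.2f}")
```

The simulation draws the group means and the pooled variance separately, which is valid for normal data because the two are independent.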

Caveats

The graph gets a bit messier if the sample size differs from one group to another.
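To see why, note that with unequal sample sizes each group gets its own pair of limits. The sketch below assumes the usual form in which the overall mean is the sample-size-weighted mean and the half-width for group i scales with sqrt((N - n_i) / (N * n_i)), where N is the total sample size. The function name and inputs are hypothetical, and the critical value `h` for an unbalanced design depends on the particular sample sizes, so the value used here is only a placeholder.

```python
import math

def anom_limits_unequal(means, sds, ns, h):
    """Return a list of (lower, upper) ANOM limits, one pair per group."""
    total_n = sum(ns)
    # The overall mean weights each group mean by its sample size.
    grand_mean = sum(n * m for n, m in zip(ns, means)) / total_n
    # Pooled standard deviation with total_n - k error degrees of freedom.
    df_error = total_n - len(ns)
    s_pooled = math.sqrt(
        sum((n - 1) * s ** 2 for n, s in zip(ns, sds)) / df_error
    )
    limits = []
    for n in ns:
        half_width = h * s_pooled * math.sqrt((total_n - n) / (total_n * n))
        limits.append((grand_mean - half_width, grand_mean + half_width))
    return limits

# Hypothetical inputs for illustration only; h is a placeholder value.
sizes = [4, 8, 16]
limits = anom_limits_unequal(
    means=[10.0, 12.0, 11.0], sds=[2.0, 2.5, 1.5], ns=sizes, h=2.5
)
for n, (low, high) in zip(sizes, limits):
    print(f"n={n}: {low:.2f} to {high:.2f}")
```

Smaller groups get wider limits, so the plot shows stepped limits rather than a single pair of horizontal lines. With equal sample sizes the half-width reduces to the balanced form, since (N - n) / (N * n) equals (k - 1) / (k * n) when N = k * n.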

More importantly, the only comparison that you can make with the analysis of means approach is a comparison to the overall mean. If two groups are both statistically significantly higher than the overall mean, you cannot make a comparison between those two groups without losing control of the overall Type I error rate.

You also need to specify the Analysis of Means hypothesis prior to looking at your data. Peeking at the data and then choosing your hypothesis is cheating.

The Analysis of Means provides a simple approach to testing a different type of hypothesis. Because it reduces the number of comparisons from all possible pairwise differences to a comparison of each group to the overall mean, you gain some precision and can summarize your results in a simple, easily understood graph.

By Steve Simon

About the Author

Steve Simon works as an independent statistical consultant and as a part-time faculty member in the Department of Biomedical and Health Informatics at the University of Missouri-Kansas City. He has previously worked at Children’s Mercy Hospital, the National Institute for Occupational Safety and Health, and Bowling Green State University.

Steve has over 90 peer-reviewed publications, four of which have won major awards. He has written one book, Statistical Evidence in Medical Trials, and is the author of a major website about Statistics, Research Design, and Evidence Based Medicine, www.pmean.com. One of his current areas of interest is using Bayesian models to forecast patient accrual in clinical trials. Steve received a Ph.D. in Statistics from the University of Iowa in 1982.
