Stage 1

A Post-hoc Test for Kruskal-Wallis

May 8th, 2023 by

If you’ve ever run a one-way analysis of variance (ANOVA), you’re familiar with post-hoc tests. The ANOVA omnibus test only tells you whether any groups differ in their means. But if you want to explore which specific group mean is different from which, you need to follow up with a post-hoc test. (more…)


How to Pick an R Package

April 24th, 2023 by

One big advantage of R is its breadth. If anything has been done in statistics, there is an R package that will do it.

The problem is that sometimes there are four packages that will do it. This is a big problem with R (and with Python, for that matter). (more…)


What is the Mann-Whitney U Test?

April 13th, 2023 by

When you need to compare a numeric outcome for two groups, what analysis do you think of first? Chances are, it’s the independent samples t-test. But that’s not the only option, or always the best one. In many situations, the Mann-Whitney U test is a better option.

The non-parametric Mann-Whitney U test is also called the Mann-Whitney-Wilcoxon test, or the Wilcoxon rank sum test. Non-parametric means that the hypothesis it’s testing is not about the parameter of a particular distribution.

It is part of a subgroup of non-parametric tests that are rank based. That means that the specific values of the outcomes are not important, only their order. In other words, we will be ranking the outcomes.
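To make the idea of ranking concrete, here is a minimal sketch in plain Python. The function name `average_ranks` and the data are invented for illustration, and giving tied values the mean of their ranks is the common convention assumed here, not something the article specifies.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


# Hypothetical outcomes, invented for illustration only.
outcomes = [3.1, 250.0, 3.1, 7.4]
ranks = average_ranks(outcomes)
print(ranks)  # [1.5, 4.0, 1.5, 3.0]
```

Note that the extreme value 250.0 simply receives the top rank. Replacing it with 8.0 would give exactly the same ranks, which is what it means for only the order to matter.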

Like the t-test, this analysis tests whether two independent groups have similar typical outcomes. You can use it with numeric data, but unlike the t-test, it also works with ordinal data. Like the t-test, it is designed for comparisons, and not for estimation or prediction.

The biggest difference from the t-test is that it does not compare means. The Mann-Whitney U test determines whether a random observation from one group tends to be higher (or lower) than a random observation from the other group. Imagine choosing two observations, one from each group, over and over again. This test will determine whether one group is more likely to have the higher values.
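The "over and over again" comparison can be sketched by checking every possible pair directly. This is only an illustration under stated assumptions: the function name `mann_whitney_u` and the data are hypothetical, ties are counted as half a win (the usual convention), and the sketch computes the U statistic and the estimated probability but not a p-value.

```python
from itertools import product


def mann_whitney_u(group_a, group_b):
    """Return (U for group A, share of pairs in which A's value is higher).

    Every observation in A is compared with every observation in B.
    A tie counts as half a win.
    """
    wins = 0.0
    for a, b in product(group_a, group_b):
        if a > b:
            wins += 1.0
        elif a == b:
            wins += 0.5
    n_pairs = len(group_a) * len(group_b)
    return wins, wins / n_pairs


# Hypothetical data, invented for illustration only.
treatment = [12, 15, 18, 21]
control = [9, 11, 15, 16]

u_stat, prob_higher = mann_whitney_u(treatment, control)
print(u_stat, prob_higher)  # 12.5 0.78125
```

A share near 0.5 would suggest neither group tends to have the higher values. In practice you would use a statistical package to get the accompanying p-value.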

The t-test has its own advantages: it is a straightforward comparison of means, there are versions for equal and unequal variances in the two groups, and many people are familiar with it.

(more…)


Member Training: Using Macros, Loops, and Functions in Stata to Manage Your Data Software Tutorial

March 31st, 2023 by

Many data sets are challenging and time-consuming to work with because the data are seldom in an optimal format.

(more…)


Best Practices for Formatting Date Variables

March 9th, 2023 by

Formatting date variables seems like it should be straightforward, but sadly, it’s not.

If you are given data that includes dates, expect confusion. Dates can be represented in many different ways. (more…)
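As a small illustration of that confusion, the same text can be read as two different dates depending on the format you assume. The value below is hypothetical, and the sketch uses Python's standard `datetime` module only as an example tool.

```python
from datetime import datetime

raw = "03/04/2023"  # hypothetical value, invented for illustration

# Read as month/day/year (common in the US).
as_month_first = datetime.strptime(raw, "%m/%d/%Y").date()

# Read as day/month/year (common in much of Europe).
as_day_first = datetime.strptime(raw, "%d/%m/%Y").date()

print(as_month_first.isoformat())  # 2023-03-04
print(as_day_first.isoformat())    # 2023-04-03
```

Writing dates out in an unambiguous form such as ISO 8601 (year-month-day) is one way to avoid the problem once the intended format is known.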


What is a Dunnett’s Test?

January 10th, 2023 by

I’m a big fan of Analysis of Variance (ANOVA). I use it all the time. I learn a lot from it. But sometimes it doesn’t test the hypothesis I need. In this article, we’ll explore a test that is used when you care about a specific comparison among means: Dunnett’s test. (more…)