Six terms that mean something different statistically and colloquially

by Kim Love and Karen Grace-Martin

Statistics terminology is confusing.

Sometimes different terms are used to mean the same thing, often in different fields of application. Sometimes the same term is used to mean different things. And sometimes very similar terms are used to describe related but distinct statistical concepts.

But the terms that cause the most trouble when communicating with non-researchers are those whose colloquial meaning in English differs from their technical definition in statistics. This is particularly difficult because the two definitions are often similar, though not identical.

Let’s take a look at six of these.

1. Significance

This is, for sure, the big one. You’re probably familiar with the difference between statistical significance, generally indicating a p-value that is below a threshold, and the colloquial meaning of large or important.

A significant other is important. A significant raise is large. A statistically significant difference may be neither. The term has been so widely misunderstood that many statisticians are calling for its demise.
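To see how a difference can be statistically significant without being large, here is a minimal sketch. It assumes a two-sample z-test with a known, common standard deviation and equal group sizes. The function name and the scenario (a 0.1-point difference on a scale with a standard deviation of 15) are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(mean_diff: float, sd: float, n_per_group: int) -> float:
    """Two-sided p-value for a two-sample z-test with equal group sizes.

    Assumes a known standard deviation `sd` that is the same in both groups.
    """
    standard_error = sd * sqrt(2 / n_per_group)
    z = mean_diff / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical scenario: the same tiny difference, two very different sample sizes.
p_small_sample = two_sided_p_value(mean_diff=0.1, sd=15, n_per_group=100)
p_huge_sample = two_sided_p_value(mean_diff=0.1, sd=15, n_per_group=1_000_000)

print(f"n = 100 per group:       p = {p_small_sample:.3f}")
print(f"n = 1,000,000 per group: p = {p_huge_sample:.2g}")
```

The difference is identical in both cases. Only the sample size changes, and with a large enough sample the p-value drops below the usual .05 threshold even though the difference itself stays tiny.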

2. Odds

In everyday English, people use the terms odds and probability interchangeably. In statistics, they measure the same general construct – how likely an event is to occur – on different scales. This difference in scales has a huge impact on how you interpret the value.

Odds measure the probability of an outcome relative to the probability that the outcome doesn’t occur: p/(1-p). They range from zero to infinity, and a value of 1 indicates even odds: the outcome is as likely to occur as not.

Probability is just the numerator, p. It ranges from zero to one, and a value of 0.5 indicates that the outcome is as likely to occur as not.

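So while you can easily convert back and forth, odds of .8 mean something very different from a probability of .8: odds of .8 correspond to a probability of about .44, while a probability of .8 corresponds to odds of 4.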
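The conversion between the two scales can be sketched in a few lines. The function names here are just illustrative. The formulas are odds = p/(1-p) from above and its inverse, p = odds/(1+odds).

```python
def probability_to_odds(p: float) -> float:
    """Convert a probability in [0, 1) to odds, p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must be at least 0 and less than 1")
    return p / (1 - p)


def odds_to_probability(odds: float) -> float:
    """Convert odds in [0, infinity) back to a probability, odds / (1 + odds)."""
    if odds < 0:
        raise ValueError("odds cannot be negative")
    return odds / (1 + odds)


print(probability_to_odds(0.5))  # 1.0: even odds
print(probability_to_odds(0.8))  # about 4: four times as likely to occur as not
print(odds_to_probability(0.8))  # about 0.44: less likely to occur than not
```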

3. Bias

In colloquial English, bias means prejudice. It’s bad.

Bias isn’t a good thing in statistics either, but the term doesn’t carry that inherent value judgment.

Bias is a measure of the difference between the value of a population parameter and the theoretical mean value of a statistic that estimates that parameter.

For example, in a simple linear model, the parameter β1 is the slope of the regression line in the population. Since we don’t know its value, we estimate it by calculating b1, the slope of the regression line in a representative sample. Though we know b1 won’t have exactly the same value as β1, we expect that the average value of b1, across the many samples we could have taken, will. Any difference between that theoretical average value of b1 and the true population value is the bias of that estimator.
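A small simulation can make the idea of a theoretical average concrete. This is only a sketch: the population values (intercept 1, slope 2, error standard deviation 3), the sample size, and the helper names are all assumptions chosen for illustration.

```python
import random


def ols_slope(xs, ys):
    """Least-squares slope b1 = Sxy / Sxx for a simple linear regression."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate_slopes(beta0, beta1, n, reps, seed=0):
    """Draw `reps` samples of size `n` from the population and fit each one."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        xs = [rng.uniform(0, 10) for _ in range(n)]
        ys = [beta0 + beta1 * x + rng.gauss(0, 3) for x in xs]
        slopes.append(ols_slope(xs, ys))
    return slopes


TRUE_SLOPE = 2.0  # hypothetical population value of β1
slopes = simulate_slopes(beta0=1.0, beta1=TRUE_SLOPE, n=30, reps=2000)
mean_slope = sum(slopes) / len(slopes)
estimated_bias = mean_slope - TRUE_SLOPE

print(f"b1 ranged from {min(slopes):.2f} to {max(slopes):.2f}")
print(f"average b1 = {mean_slope:.3f}, estimated bias = {estimated_bias:.3f}")
```

Individual values of b1 land on both sides of the true slope, but their average sits very close to it, which is what an unbiased estimator looks like.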

Statistical bias can come from using an estimator with a known bias. But since we know which estimators those are and can avoid or correct them, bias more often comes from having an unrepresentative sample.
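One familiar estimator with a known bias is the sample variance computed by dividing by n rather than n-1. The sketch below uses made-up settings: a normal population with variance 4 and samples of size 5. With those settings, theory says the divide-by-n version should average (n-1)/n × 4 = 3.2.

```python
import random


def variance_estimates(sample):
    """Return (divide-by-n, divide-by-(n-1)) variance estimates for one sample."""
    n = len(sample)
    mean = sum(sample) / n
    sum_of_squares = sum((x - mean) ** 2 for x in sample)
    return sum_of_squares / n, sum_of_squares / (n - 1)


TRUE_VARIANCE = 4.0  # hypothetical population: normal with SD 2
SAMPLE_SIZE = 5
REPS = 20_000

rng = random.Random(1)
total_divide_by_n = 0.0
total_divide_by_n_minus_1 = 0.0
for _ in range(REPS):
    sample = [rng.gauss(0, 2) for _ in range(SAMPLE_SIZE)]
    by_n, by_n_minus_1 = variance_estimates(sample)
    total_divide_by_n += by_n
    total_divide_by_n_minus_1 += by_n_minus_1

mean_divide_by_n = total_divide_by_n / REPS
mean_divide_by_n_minus_1 = total_divide_by_n_minus_1 / REPS

print(f"true variance:           {TRUE_VARIANCE}")
print(f"average, dividing by n:   {mean_divide_by_n:.2f}")
print(f"average, dividing by n-1: {mean_divide_by_n_minus_1:.2f}")
```

Because this bias is known, it is easy to correct, which is why dividing by n-1 is the usual default.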

4. Correlation

In statistics, a correlation is a specific measurement. Yes, there are different correlation coefficients, like Spearman, Pearson, and polychoric, but all have a few characteristics:
– They measure the direction and strength of association between two variables
– They range from -1 to 1, with 0 indicating no association
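As a rough illustration of two of these coefficients, the sketch below computes Pearson and Spearman correlations by hand. The data are made up, and the ranking helper assumes there are no tied values. Real software handles ties more carefully.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation: strength and direction of linear association."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Ranks from 1 to n. Assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Made-up data: y always increases with x, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

print(round(pearson(x, y), 3))   # about 0.943: strong, but not perfectly linear
print(round(spearman(x, y), 3))  # 1.0: perfectly monotonic
```

Both values fall between -1 and 1 and share a sign, but they answer slightly different questions about the association.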

The colloquial definition is much broader. It can mean any connection, match, or co-occurrence between individual events. For example: “The correlation between the machine’s failure and a loose connection in the joint coupling.”

5. Error

Colloquially, an error is a mistake.

In regression models, an error is the difference between the value of an outcome variable for one individual and the value predicted by the model. There’s no mistake involved here. Just variation.
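In code, this kind of error is simply a subtraction. The fitted line and the three data points below are hypothetical, chosen only to show the arithmetic.

```python
def predict(x: float) -> float:
    """Hypothetical fitted regression line: predicted y = 2 + 0.5 * x."""
    return 2 + 0.5 * x


# Made-up observations for three individuals.
x_values = [2, 4, 6]
observed = [3.5, 3.0, 5.5]

# Error for each individual: observed value minus the model's prediction.
errors = [y - predict(x) for x, y in zip(x_values, observed)]
print(errors)  # [0.5, -1.0, 0.5]
```

Nobody did anything wrong to produce the -1.0. That individual’s outcome simply falls below the line.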

There are also other specific uses of error, such as “standard error,” “sampling error” and “measurement error,” all of which are about variation, not mistakes.

6. Random

The technical definition: “a phenomenon is random if individual outcomes are uncertain, but there is nonetheless a regular distribution of outcomes in a large number of repetitions” – Moore and McCabe

And while this is one usage of random in everyday English, the word also often means strange or unexpected. For example: “There is a random pineapple in my yard.”
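The technical definition is easy to see in a simulation. The sketch below flips a simulated fair coin (the seed and the number of flips are arbitrary choices) and tracks the proportion of heads as the number of flips grows.

```python
import random

rng = random.Random(42)  # arbitrary seed so the run is repeatable

# Each flip is True (heads) with probability 0.5.
flips = [rng.random() < 0.5 for _ in range(100_000)]

# Individual outcomes are uncertain...
print("first ten flips:", ["H" if flip else "T" for flip in flips[:10]])

# ...but the proportion of heads settles down over many repetitions.
for n in (10, 100, 1_000, 10_000, 100_000):
    proportion_heads = sum(flips[:n]) / n
    print(f"after {n:>7,} flips: proportion of heads = {proportion_heads:.4f}")

final_proportion = sum(flips) / len(flips)
```

No single flip can be predicted, yet the long-run proportion ends up very close to one half. That regular pattern is what makes the coin random in the statistical sense, unlike the pineapple.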
