One of the biggest challenges in learning statistics and data analysis is learning the lingo. It doesn’t help that half of the notation is in Greek (literally).

The terminology in statistics is particularly confusing because often the same word or symbol is used to mean completely different concepts.

I know it feels that way, but it really isn’t a master plot by statisticians to keep researchers feeling ignorant.

*Really.*

It’s just that a lot of the methods in statistics were created by statisticians working in different fields: economics, psychology, medicine, and yes, straight statistics. Certain fields often have specific types of data that come up a lot and that require specific statistical methodologies to analyze.

Economics needs time series, psychology needs factor analysis. Et cetera, et cetera.

But separate fields developing statistics in isolation has some ugly effects.

Sometimes different fields develop the same technique, but use *different names* or notation.

Other times different fields use the same name or notation for *different techniques* they developed.

And of course, there are those terms with slightly different names, often used in similar contexts, but with different meanings. These are never used interchangeably, but they’re easy to confuse if you don’t use this stuff every day.

And sometimes, there are different terms for subtly different concepts, but people use them interchangeably. (I am guilty of this myself.) It’s not a big deal if you understand those subtle differences. But if you don’t, it’s a mess.

And it’s not just fields–it’s software, too.

SPSS uses different names for the exact same thing in different procedures. In GLM, a continuous independent variable is called a Covariate. In Regression, it’s called an Independent Variable.

Likewise, SAS has a Repeated statement in its GLM, Genmod, and Mixed procedures. All three address the same concept (repeated measures), but they handle it in drastically different ways.

So by the time the fields come together and realize they’re all doing the same thing, people in different fields, or using different software procedures, are already used to their own terminology. So we’re stuck with different versions of the same word or method.

So anyway, I am beginning a series of blog posts to help clear this up. Hopefully it will be a good reference you can come back to when you get stuck.

We’ve expanded on this list with a member training, if you’re interested.

If you have good examples, please post them in the comments. I’ll do my best to clear things up.

### Why Statistics Terminology is Especially Confusing

### Confusing Statistical Term #1: Independent Variable

### Confusing Statistical Terms #2: Alpha and Beta

### Confusing Statistical Term #3: Levels

### Confusing Statistical Terms #4: Hierarchical Regression vs. Hierarchical Model

### Confusing Statistical Term #5: Covariate

### Confusing Statistical Term #6: Factor

### Same Statistical Models, Different (and Confusing) Output Terms

### Confusing Statistical Term #7: GLM

### Confusing Statistical Term #8: Odds

### Confusing Statistical Term #9: Multiple Regression Model and Multivariate Regression Model

### Confusing Statistical Term #10: Mixed and Multilevel Models

### Confusing Statistical Terms #11: Confounder

### Six terms that mean something different statistically and colloquially

### Confusing Statistical Term #13: MAR and MCAR Missing Data

yutong Liu says

Hello, I’m seeking clarification about the terminology of “fixed effects” and “random effects”. In methods such as multi-level regression analysis and dynamic structural equation modeling (DSEM), it’s common to report the results in terms of these “fixed effects” and “random effects”. However, as they often appear together, this can become quite confusing. I’m hoping you can help clarify this for me. Thank you very much in advance!

Karen Grace-Martin says

You might find these articles helpful:

The Difference Between Random Factors and Random Effects

Specifying Fixed and Random Factors in Mixed Models

JML says

Great series. How about clarifying part, partial, and semi-partial? This is a classic example of where different fields/programs use different terminology for the same thing, or the same terms for different things! A handy reference would be most helpful. Thanks!

Karen says

Thanks, JML. Great suggestion. I have to look them up every time, too. 🙂 I’ll add it to the queue.

Karen

Karen Grace-Martin says

I can say that part and semi-partial are the same thing, and both are different from partial.

It’s actually really difficult to explain the difference without getting quite technical and verbose. This is the best explanation I’ve seen, though you may find it’s quite technical and verbose. 🙂

http://faculty.cas.usf.edu/mbrannick/regression/Partial.html
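
For readers who would rather see the distinction numerically, here is a minimal sketch in Python. It assumes NumPy and simulated data; the variables `x1`, `x2`, `y` and the helpers `residuals` and `corr` are illustrative names, not anything taken from the linked page. The part (semi-partial) correlation removes the control variable `x2` from the predictor `x1` only, while the partial correlation removes `x2` from both `x1` and `y`.

```python
import numpy as np


def residuals(target, control):
    """Residuals of `target` after a least-squares regression on `control` (with intercept)."""
    X = np.column_stack([np.ones_like(control), control])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return target - X @ coef


def corr(a, b):
    """Pearson correlation between two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1]


# Simulated data (an assumption for illustration): x1 and x2 are correlated,
# and both contribute to y.
rng = np.random.default_rng(0)
n = 500
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)
y = 0.5 * x1 + 0.7 * x2 + rng.normal(size=n)

x1_res = residuals(x1, x2)  # x1 with x2 removed
y_res = residuals(y, x2)    # y with x2 removed

# Part (semi-partial): x2 is removed from the predictor only.
semipartial = corr(y, x1_res)

# Partial: x2 is removed from both the predictor and the outcome.
partial = corr(y_res, x1_res)

print(f"part / semi-partial correlation: {semipartial:.3f}")
print(f"partial correlation:             {partial:.3f}")
```

The two numbers share the same numerator; the partial correlation just divides by one extra factor that is never larger than 1, so its absolute value is never smaller than that of the part correlation.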

Karen says

Denis–Thanks. I’ll get on those.

Jim–Thanks. As someone trained in psychology and statistics (in a statistics department), I’ve always found econometricians to be using entirely different terminology. But with a bit of explanation, we can often find the common concepts to the different words, and work well together.

But engineers seem to be speaking a whole different language. LOL. I had a conversation once with an engineer where I had no idea what he was talking about until I finally realized he meant “variable” when he said “parameter.” That was bizarre.

Jim Hurdle says

Wonderful idea to publish this topic. It actually could almost be extended to a book. I am an economist so I have the “discipline” terms used in my graduate education. Sometimes, in reviews of articles by mathematical statisticians, I find some confusing terminology that turns out to be just a renaming of a concept I already know.

So keep on this route, your contribution will be appreciated!

Denis says

Mean, moment, significance are some of those mean words, indeed!

;D