One of the most confusing things about statistical analysis is the different vocabulary used for the same, or nearly-but-not-quite-the-same, concepts.

Sometimes this happens just because the same analysis was developed separately within different fields and named twice.

So people in different fields use different terms for the same statistical concept. Try collaborating with a colleague from another field and you may find yourself baffled by the seemingly crazy statistics they insist on.

Other times, one term implies a level of detail that the wider, more generic term does not. That detail is usually about the roles the variables or effects play, which in turn shapes how you interpret the output.

**Predictor vs. Independent Variable**

One simple example of this is the difference between an independent variable and a predictor variable. Predictor Variable is a more generic term. It can refer to any X variable in a statistical model.

A predictor can predict the outcome Y or explain some of its variance, but there are no other implications about how it relates to Y or other predictors.

There are many subtypes of predictor variables. These distinct types imply something about their relationships to the other variables and about how you should interpret their effects.

Mathematically, there is absolutely no difference in how they are entered into the model or in how your software calculates their effects. The difference is entirely in interpretation.
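To see this concretely, here is a minimal Python sketch using NumPy and simulated data. The variable names (`treatment`, `age`), the coefficients, and the helper `fit_ols` are all made up for illustration. The point is only that the fitting routine receives columns of numbers and has no notion of which one you call the Independent Variable and which the Covariate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulated data (hypothetical): a manipulated 0/1 "independent variable"
# and a continuous "covariate".
treatment = rng.integers(0, 2, size=n).astype(float)
age = rng.normal(40, 10, size=n)
y = 1.0 + 2.0 * treatment + 0.05 * age + rng.normal(0, 1, size=n)


def fit_ols(columns, outcome):
    """Ordinary least squares with an intercept.

    `columns` maps predictor names to 1-D arrays. Returns name -> coefficient.
    """
    names = list(columns)
    design = np.column_stack([np.ones(len(outcome))] + [columns[k] for k in names])
    coefs, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return dict(zip(["intercept"] + names, coefs))


# Enter the predictors in either order, under either label.
fit_a = fit_ols({"treatment": treatment, "age": age}, y)
fit_b = fit_ols({"age": age, "treatment": treatment}, y)

for name in ("intercept", "treatment", "age"):
    print(name, round(fit_a[name], 4), round(fit_b[name], 4))
```

Whichever column you think of as the variable of interest, the two fits give the same estimates, up to floating-point rounding.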

Examples include: Explanatory Variable, Independent Variable, Covariate, Control Variable, Factor, Grouping Variable, and probably a few others.

Let’s use Independent Variable as an example. Calling a predictor an Independent Variable (IV) has a few implications. The truly tricky part is that these implications can also differ by field, so you need to be careful when someone uses this term.

Independent Variable can imply any or all of the following:

- The Independent Variable has a causal effect on the Dependent Variable, Y.
- The IV is categorical and experimentally manipulated.
- The Independent Variable is the primary predictor of interest. The other variables in the model are there so we can control for their effects, but the IV is the one we’re mostly interested in.

And, of course, some people are so used to using the term Independent Variable that they use it for any predictor.

**Moderation vs. Interaction**

Another example is moderation. I’ve had conversations with seasoned researchers who were shocked to discover that *moderation is simply an interaction effect*.

But again, interaction is a little more generic than moderation. Moderation distinguishes between the roles of the two variables involved in the interaction.

So, for example, when we say X and Z interact in their effects on an outcome variable Y, there is no real distinction between the role of X and the role of Z. They are both considered predictor variables.

The interaction tells us that the effect of X on Y is different at different values of Z.

It also tells us that the effect of Z on Y is different at different values of X.

You can interpret it either way. In some studies, it makes more logical sense to interpret in one direction, but it’s just a matter of preference.
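One common way to write this down, assuming a linear model with a single product term (the coefficient labels here are illustrative, not tied to any particular software's output), is:

$$
E[Y \mid X, Z] = \beta_0 + \beta_1 X + \beta_2 Z + \beta_3 XZ
$$

Taking the effect of each predictor in turn while holding the other fixed gives:

$$
\frac{\partial E[Y \mid X, Z]}{\partial X} = \beta_1 + \beta_3 Z,
\qquad
\frac{\partial E[Y \mid X, Z]}{\partial Z} = \beta_2 + \beta_3 X
$$

The same coefficient, $\beta_3$, appears in both expressions, which is why the interaction can be read in either direction.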

When we talk about moderation, though, there is a specific role to X and Z. One is assigned as the Independent Variable and the other as the Moderator.

Here, X counts as the Independent Variable in the third sense listed above: its effect is of primary interest.

The Moderator, Z, is the predictor that changes the effect of the Independent Variable, X, on Y. So the idea is that we’re not really interested in whether Z predicts Y on its own. We’re really interested in how it changes the primary effect of X on Y.

Mathematically, there is no difference between X and Z. They are both entered into statistical software in the same way.

But in the concept of moderation, there is a clear distinction between the role of the Independent Variable and that of the Moderator.

In other words, the concepts distinguish how the effects of each predictor are interpreted.
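As a rough numerical illustration, the sketch below fits a model with a product term to simulated data and then reads the result both ways. Everything in it is an assumption made for the example: the data, the true coefficients, and the helper names `slope_of_x_at` and `slope_of_z_at`. It uses a plain least-squares fit via NumPy rather than any particular statistics package.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Simulated data (hypothetical): we choose to call x the Independent Variable
# and z the Moderator, but the fit below does not know or care.
x = rng.normal(size=n)
z = rng.normal(size=n)
y = 0.5 + 1.0 * x + 0.3 * z + 0.8 * x * z + rng.normal(scale=0.5, size=n)

# Design matrix: intercept, X, Z, and the product term X*Z.
design = np.column_stack([np.ones(n), x, z, x * z])
b0, b_x, b_z, b_xz = np.linalg.lstsq(design, y, rcond=None)[0]


def slope_of_x_at(z_value):
    """Moderation reading: estimated effect of X on Y at a given value of Z."""
    return b_x + b_xz * z_value


def slope_of_z_at(x_value):
    """The reverse reading: estimated effect of Z on Y at a given value of X."""
    return b_z + b_xz * x_value


for z_value in (-1.0, 0.0, 1.0):
    print(f"effect of X when Z = {z_value:+.0f}: {slope_of_x_at(z_value):.2f}")

for x_value in (-1.0, 0.0, 1.0):
    print(f"effect of Z when X = {x_value:+.0f}: {slope_of_z_at(x_value):.2f}")
```

Both sets of slopes come from the same four estimated coefficients. Reporting only the first set, the effect of X at chosen values of Z, is what treating Z as the Moderator amounts to.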

**Comments**

I did a study with several predictor variables (selected through univariate analysis, then ran a multivariate step analysis) – that analysis identified the

the highest predictor to outcome had several predictor variables highly correlated to it.

treatment credibility and working alliance with a therapist were highly related to treatment adherence – which itself highly related to outcome.

Is treatment adherence a moderator variable? Should path analysis or some other analysis be used?

Hi Andrew,

Not necessarily. Moderation (which just means an interaction) is not the same as association (aka correlation). Here is an article on that: The Difference Between Interaction and Association

Thank you for this excellent article. Where do mediators and path variables fit into this picture – are they also similar concepts? Furthermore, you say moderator Z changes the effect of X on Y – if Z is categorical, what’s the difference between moderation/interaction and stratifying the model by values of Z? Many thanks.

“Control variable is one which we want to keep unchanged in our experiment so that we can filter out its impact on dependent variable of interest. If I am not interested in finding out what is the impact of control variables on my dependent variable; but wanna use them so as to get a pure relation between dependent variable and independent variable of interest.”

What’s the difference between “Control Variable” and “Moderator”? I’m sure there is a thin line, but I’m not able to explain it explicitly with examples. Please help.