One of the most confusing things about statistical analysis is the different vocabulary used for the same, or nearly-but-not-quite-the-same, concepts.
Sometimes this happens just because the same analysis was developed separately within different fields and named twice.
As a result, researchers in different fields end up using different terms for the same statistical concept. Try to collaborate with a colleague in another field and you may find yourself baffled by the crazy statistics they’re insisting on.
Other times, one term implies a level of detail that isn’t true of the wider, more generic term. That detail is often about the role a variable or effect plays in the model, and therefore about how you interpret the output.
Predictor vs. Independent Variable
One simple example of this is the difference between an independent variable and a predictor variable. Predictor Variable is a more generic term. It can refer to any X variable in a statistical model.
A predictor can predict the outcome Y or explain some of its variance, but there are no other implications about how it relates to Y or other predictors.
There are many subtypes of predictor variables. These distinct types imply something about their relationships to the other variables and about how you should interpret their effects.
Mathematically, there is absolutely no difference in how they are entered into the model or in how your software calculates their effects. The difference is entirely in interpretation.
Let’s use Independent Variable as an example. Calling a predictor an Independent Variable (IV) has a few implications. The truly tricky part is that these implications can also differ by field, so you need to be careful when someone uses this term.
Independent Variable can imply any or all of the following:
- The Independent Variable has a causal effect on the Dependent Variable, Y.
- The IV is categorical and experimentally manipulated.
- The Independent Variable is the primary predictor of interest. The other variables in the model are there so we can control for their effects, but the IV is the one we’re mostly interested in.
And, of course, some people are so used to using the term Independent Variable that they use it for any predictor.
Moderation vs. Interaction
Another example is moderation. I’ve had conversations with seasoned researchers who were shocked to discover that moderation is simply an interaction effect.
But again, interaction is a little more generic than moderation. Moderation distinguishes between the roles of the two variables involved in the interaction.
So, for example, when we say X and Z interact in their effects on an outcome variable Y, there is no real distinction between the role of X and the role of Z. They are both considered predictor variables.
The interaction tells us that the effect of X on Y is different at different values of Z.
It also tells us that the effect of Z on Y is different at different values of X.
You can interpret it either way. In some studies, one direction makes more logical sense than the other, but that choice comes from the research question, not from the math.
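One way to see why both readings are available is to write out the model. The sketch below assumes an ordinary linear model with a single product term. The coefficient names b_0 through b_3 and the error term e are notation introduced here for illustration, not taken from any particular software output.

```latex
\begin{align*}
Y &= b_0 + b_1 X + b_2 Z + b_3 XZ + e \\
  &= b_0 + (b_1 + b_3 Z)\,X + b_2 Z + e \\
  &= b_0 + b_1 X + (b_2 + b_3 X)\,Z + e
\end{align*}
```

Under that assumption, the second line shows the slope of X, b_1 + b_3 Z, changing with Z, and the third line shows the slope of Z, b_2 + b_3 X, changing with X. The same coefficient b_3 drives both readings.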
When we talk about moderation, though, there is a specific role to X and Z. One is assigned as the Independent Variable and the other as the Moderator.
Here, the Independent Variable is “independent” in the sense of the third implication listed above: its effect is of primary interest.
The Moderator, Z, is the predictor that changes the effect of the Independent Variable, X, on Y. So the idea is that we’re not really interested in whether Z predicts Y on its own. We’re really interested in how it changes the primary effect of X on Y.
Mathematically, there is no difference between X and Z. They are both entered into statistical software in the same way.
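To make this concrete, here is a minimal sketch that fits the same interaction model twice, once treating X as the Independent Variable and once treating Z as the Independent Variable. Everything in it is an assumption for illustration: the toy data are made up, the helper names (ols, slope_of_x_at, slope_of_z_at) are hypothetical, and the tiny exact solver stands in for whatever regression routine your software provides. It uses only the Python standard library so that it runs anywhere.

```python
from fractions import Fraction


def ols(design, y):
    """Exact least-squares coefficients via the normal equations (A'A) b = A'y."""
    k = len(design[0])
    # Augmented matrix [A'A | A'y], built with exact rational arithmetic.
    aug = []
    for i in range(k):
        row = [Fraction(sum(r[i] * r[j] for r in design)) for j in range(k)]
        row.append(Fraction(sum(r[i] * yi for r, yi in zip(design, y))))
        aug.append(row)
    # Gauss-Jordan elimination on the augmented matrix.
    for col in range(k):
        pivot = next(r for r in range(col, k) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


# Made-up toy data, for illustration only.
x = [0, 1, 2, 3, 0, 1, 2, 3]
z = [0, 0, 0, 0, 1, 1, 1, 1]
y = [2, 3, 5, 5, 5, 8, 11, 15]

# Labeling 1: X is the "Independent Variable", Z is the "Moderator".
# Columns: intercept, X, Z, X*Z.
fit_xz = ols([[1, xi, zi, xi * zi] for xi, zi in zip(x, z)], y)

# Labeling 2: the roles are swapped, so Z is entered first.
# Columns: intercept, Z, X, Z*X.
fit_zx = ols([[1, zi, xi, zi * xi] for xi, zi in zip(x, z)], y)

b0, b_x, b_z, b_xz = fit_xz


def slope_of_x_at(z_value):
    """Moderation reading 1: the effect of X on Y at a given value of Z."""
    return b_x + b_xz * z_value


def slope_of_z_at(x_value):
    """Moderation reading 2: the effect of Z on Y at a given value of X."""
    return b_z + b_xz * x_value


print("X-first fit:", [str(b) for b in fit_xz])
print("Z-first fit:", [str(b) for b in fit_zx])
print("Slope of X at Z=0 and Z=1:", slope_of_x_at(0), slope_of_x_at(1))
print("Slope of Z at X=0 and X=3:", slope_of_z_at(0), slope_of_z_at(3))
```

The two fits contain the same numbers, with only the positions of the X and Z coefficients exchanged, and the interaction coefficient is identical in both. Which of the two slope functions you report is the part that the word "moderation" decides.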
But in the concept of moderation, there is a clear distinction between the effect of the Independent Variable and that of the Moderator.
In other words, the concepts distinguish how the effects of each predictor are interpreted.