Statistical models such as general linear models (linear regression, ANOVA, mixed models) and generalized linear models (logistic, Poisson, proportional hazards regression, etc.) all have the same general form. On the left side of the equation are one or more response variables, Y. On the right side are one or more predictor variables, X, and their coefficients, B. The variables on the right side, X, can take many forms and are called by many names.
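One common way to write that shared form, using the Y, X, and B named above, is sketched here. The error term \varepsilon and the link function g are standard notation assumed for illustration, not symbols from the text. The first line is the general linear model; the second is the form usually given for generalized linear models such as logistic and Poisson regression.

```latex
\[
Y = XB + \varepsilon
\]
\[
g\bigl(E[Y]\bigr) = XB
\]
```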
There are subtle distinctions in the meanings of these names, but they are often used interchangeably. Even worse, statistical software packages use different names for similar concepts, even among their own procedures. This quest for accuracy often creates confusion. (It’s hard enough without switching the words!)
Here are some common terms that all refer to a variable in a model that is proposed to affect or predict another variable.
- Independent Variable: It implies causality: the independent variable affects the dependent variable. Used predominantly in ANOVA, but often in regression as well. It can be either continuous or categorical.
- Predictor Variable: It does not imply causality. A predictor variable is simply useful for predicting the value of the response variable. Used predominantly in regression. Predictor variables can be continuous or categorical.
- Predictor: Same as Predictor Variable.
- Covariate: A continuous predictor variable. Used in both ANCOVA (analysis of covariance) and regression. Some people use this term for all predictor variables in regression, but strictly it means continuous predictors. Adding a covariate to ANOVA (analysis of variance) turns it into ANCOVA.
- Factor: A categorical predictor variable. It may or may not indicate a cause/effect relationship with the response variable (this depends on the study design, not the analysis). Independent variables in ANOVA are almost always called factors. In regression, they are often referred to as indicator variables, categorical predictors, or dummy variables. They are all the same thing in this context.
- Grouping Variable: Same as a factor. Used in SPSS in the independent samples t-test.
- Fixed factor: A categorical independent variable in which the specific categories are themselves of interest, often chosen by the experimenter. Examples include experimental treatments or demographic categories, such as sex and race. If you’re not doing a mixed model (and you should know if you are), all your factors are fixed factors. For a more thorough explanation of fixed and random factors, see Specifying Fixed and Random Factors in Mixed or Multi-Level Models.
- Random factor: A categorical independent variable in which the categories are treated as a random sample from a larger population of possible categories, rather than being of interest in themselves. Generally used in mixed modeling. Examples include subjects or random blocks. For a more thorough explanation of fixed and random factors, see Specifying Fixed and Random Factors in Mixed or Multi-Level Models.
- Dummy variable: A categorical variable that has been dummy coded. Dummy coding (also called indicator coding) is usually used in regression models, but not ANOVA. A dummy variable can have only two values: 0 and 1. When a categorical variable has more than two values, it is recoded into multiple dummy variables.
- Indicator variable: See dummy variable.
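To make the dummy coding described in the list concrete, here is a minimal sketch in plain Python. The function name `dummy_code`, the `is_` column prefix, and the treatment data are all hypothetical, made up for illustration. The sketch also assumes the usual reference-category convention: one category gets no column of its own, so a factor with k categories becomes k - 1 dummy variables.

```python
def dummy_code(values, reference=None):
    """Recode a categorical variable into 0/1 dummy variables.

    Assumes reference coding: the reference category gets no column,
    so k categories yield k - 1 dummy variables.
    """
    levels = sorted(set(values))
    if reference is None:
        # Assumption for this sketch: default to the first level alphabetically.
        reference = levels[0]
    if reference not in levels:
        raise ValueError(f"reference {reference!r} is not a category in the data")
    kept = [lvl for lvl in levels if lvl != reference]
    # Each dummy variable is 1 where the observation falls in that category, else 0.
    return {f"is_{lvl}": [1 if v == lvl else 0 for v in values] for lvl in kept}


# Hypothetical factor with three categories.
treatment = ["control", "drug_a", "drug_b", "drug_a", "control"]
dummies = dummy_code(treatment, reference="control")

for name, column in dummies.items():
    print(name, column)
```

Observations in the reference category ("control" here) are the rows where every dummy variable is 0. Most statistical packages build these columns automatically when a variable is declared as a factor, which is part of why the same variable can appear under different names.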