
What’s in a Name? Moderation and Interaction, Independent and Predictor Variables

April 14th, 2014

One of the most confusing things about statistical analysis is the different vocabulary used for the same, or nearly-but-not-quite-the-same, concepts.


Sometimes this happens just because the same analysis was developed separately within different fields and named twice.

So people in different fields use different terms for the same statistical concept.  Try to collaborate with a colleague in a different field and you may find yourself awed by the crazy statistics they’re insisting on.

Other times, one term implies a level of detail that isn’t true of the wider, more generic term.  That detail is often about how the roles of variables or effects affect the interpretation of output. (more…)


Should You Always Center a Predictor on the Mean?

December 2nd, 2011

Centering predictor variables is one of those simple but extremely useful practices that is easily overlooked.

It’s almost too simple.

Centering simply means subtracting a constant from every value of a variable.  What it does is redefine the 0 point for that predictor to be whatever value you subtracted.  It shifts the scale over, but retains the units.
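As a quick illustration, here is a minimal sketch (assuming NumPy is available; the data are simulated) of what that shift does to a regression fit:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(50, 100, size=200)        # raw predictor
y = 1.5 * x + rng.normal(0, 5, size=200)  # response

# Slope and intercept with the raw predictor
b1_raw, b0_raw = np.polyfit(x, y, 1)

# Center: subtract the mean, so 0 now means "average x"
x_centered = x - x.mean()
b1_cen, b0_cen = np.polyfit(x_centered, y, 1)

print(np.isclose(b1_raw, b1_cen))    # slope is unchanged
print(np.isclose(b0_cen, y.mean()))  # intercept is now the mean of y
```

The units haven’t changed, so the slope is identical; only the intercept moves, because 0 on the predictor now refers to a different (and usually more meaningful) point.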

The effect is that the slope between that predictor and the response variable doesn’t (more…)


The Distribution of Independent Variables in Regression Models

January 19th, 2010

While there are a number of distributional assumptions in regression models, there are none about the distribution of the predictor (i.e. independent) variables.

This is because regression models are directional. In a correlation, there is no direction: Y and X are interchangeable. If you switched them, you’d get the same correlation coefficient.

But regression is inherently a model about the outcome variable. What predicts its value and how well? The nature of how predictors relate to it (more…)


Likert Scale Items as Predictor Variables in Regression

May 22nd, 2009

I was recently asked whether it’s okay to treat a Likert scale as continuous when it’s a predictor in a regression model.  Here’s my reply.  The researcher asked about logistic regression, but the same answer applies to all regression models.

1. There is a difference between a Likert scale item (a single 1–7 rating, e.g.) and a full Likert scale, which is composed of multiple items.  If it is a full Likert scale, with a combination of multiple items, go ahead and treat it as numerical. (more…)
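To make the distinction concrete, here is a small simulated sketch (assuming NumPy; the item counts and coefficient values are illustrative, not from the original question) of summing multiple Likert items into a full scale score and using that score as a numerical predictor:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 300

# Five Likert items, each scored 1-7, summed into a full scale score (5-35)
items = rng.integers(1, 8, size=(n, 5))
scale_score = items.sum(axis=1).astype(float)

# Simulated outcome driven by the scale score
y = 2.0 + 0.3 * scale_score + rng.normal(0, 1, size=n)

# Treating the full scale as numerical in an ordinary regression
slope, intercept = np.polyfit(scale_score, y, 1)
print(round(slope, 2), round(intercept, 2))
```

With many items combined, the summed score takes on enough distinct values that treating it as numerical is reasonable; a single 1–7 item is a different matter.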


The Distribution of Independent Variables in Regression Models

April 9th, 2009

I often hear concern about the non-normal distributions of independent variables in regression models, and I am here to ease your mind.

There are NO assumptions in any linear model about the distribution of the independent variables.  Yes, you only get meaningful parameter estimates from nominal (unordered categories) or numerical (continuous or discrete) independent variables.  But no, the model makes no assumptions about them.  They do not need to be normally distributed or continuous.

It is useful, however, to understand the distribution of predictor variables to find influential outliers or concentrated values.  A highly skewed independent variable may be made more symmetric with a transformation.
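A quick simulation sketch (assuming NumPy; the data and coefficients are made up for illustration) shows the point: even a strongly skewed predictor poses no problem for the regression itself, and the slope is still recovered accurately.

```python
import numpy as np

rng = np.random.default_rng(42)

# A strongly right-skewed predictor (exponential, nothing like normal)
x = rng.exponential(scale=2.0, size=5000)
y = 4.0 + 0.7 * x + rng.normal(0, 1.0, size=5000)

# Ordinary least squares still recovers the true slope and intercept
slope, intercept = np.polyfit(x, y, 1)
print(round(slope, 2), round(intercept, 2))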



Centering and Standardizing Predictors

December 5th, 2008

I was recently asked about whether centering (subtracting the mean) a predictor variable in a regression model has the same effect as standardizing (converting it to a Z score).  My response:

They are similar but not the same.

In centering, you are changing the values but not the scale.  So a predictor that is centered at the mean has new values–the entire scale has shifted so that the mean now has a value of 0, but one unit is still one unit.  The intercept will change, but the regression coefficient for that variable will not.  Since the regression coefficient is interpreted as the effect on the mean of Y for each one unit difference in X, it doesn’t change when X is centered.

And incidentally, despite the name, you don’t have to center at the mean.  It is often convenient, but there can be advantages to choosing a more meaningful value that is also toward the center of the scale.

But a Z-score also changes the scale.  A one-unit difference now means a one-standard-deviation difference.  You will interpret the coefficient differently.  This is usually done so you can compare coefficients for predictors that were measured on different scales.  I can’t think of an advantage to doing this for an interaction.
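The contrast can be sketched in a few lines (assuming NumPy; the data are simulated for illustration): centering leaves the slope alone, while standardizing rescales it by the predictor’s standard deviation.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(100, 15, size=500)        # e.g., a score on a 0-200 scale
y = 0.4 * x + rng.normal(0, 3, size=500)

slope_raw, _ = np.polyfit(x, y, 1)

# Centering: the values shift but the scale stays; the slope is identical
slope_cen, _ = np.polyfit(x - x.mean(), y, 1)

# Standardizing: one unit is now one standard deviation,
# so the slope is rescaled by the SD of x
z = (x - x.mean()) / x.std()
slope_std, _ = np.polyfit(z, y, 1)

print(np.isclose(slope_cen, slope_raw))           # centered slope matches raw
print(np.isclose(slope_std, slope_raw * x.std())) # standardized slope = raw * SD
```

The centered coefficient is still “change in Y per one unit of X,” while the standardized one is “change in Y per one standard deviation of X,” which is what makes it comparable across predictors on different scales.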