A polynomial term, such as a quadratic (squared) or cubic (cubed) term, turns a linear regression model into a curve. But because it is X that is squared or cubed, not the Beta coefficient, it still qualifies as a linear model. This makes it a nice, straightforward way to model curves without having to fit complicated non-linear models.
But how do you know if you need one, that is, when a linear model isn’t the best model? (more…)
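To make the "still linear" point concrete, here is a minimal sketch in Python with NumPy. The data are hypothetical and noise-free, generated from a made-up curve; the only point is that the squared term enters the design matrix as one more column, and ordinary least squares handles it like any other predictor.

```python
import numpy as np

# Hypothetical, noise-free data from a known curve: y = 2 + 0.5*x - 0.3*x^2
x = np.linspace(-3, 3, 25)
y = 2.0 + 0.5 * x - 0.3 * x**2

# Design matrix: intercept, X, and X squared.
# The squared column is just another predictor.
X = np.column_stack([np.ones_like(x), x, x**2])

# Ordinary least squares: the model is linear in the coefficients,
# even though the fitted line is a curve in x.
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2 = coefs
print(b0, b1, b2)
```

Because the example data contain no noise, the estimates recover the three coefficients used to generate them (up to floating-point error).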
Part 1 outlined one issue in deciding whether to put a categorical predictor variable into Fixed Factors or Covariates in SPSS GLM. That issue dealt with how SPSS automatically creates dummy variables from any variable in Fixed Factors.
There is another key default to keep in mind. SPSS GLM will automatically create interactions between any and all variables you specify as Fixed Factors.
If you put 5 variables in Fixed Factors, you’ll get a lot of interactions. SPSS will automatically create all 2-way, 3-way, 4-way, and even a 5-way interaction among those 5 variables. (more…)
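To see how quickly this adds up, here is a small Python sketch that counts the interaction terms in a full factorial model. The factor names are placeholders, not SPSS syntax.

```python
from itertools import combinations

# Hypothetical factor names standing in for 5 variables in Fixed Factors
factors = ["A", "B", "C", "D", "E"]

# Number of k-way interactions for k = 2..5
counts = {
    k: len(list(combinations(factors, k)))
    for k in range(2, len(factors) + 1)
}
total = sum(counts.values())

print(counts)  # {2: 10, 3: 10, 4: 5, 5: 1}
print(total)   # 26
```

That is 26 interaction terms on top of the 5 main effects.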
If your graduate statistical training was anything like mine, you learned ANOVA in one class and Linear Regression in another. My professors would often say things like “ANOVA is just a special case of Regression,” but give vague answers when pressed.
It was not until I started consulting that I realized how closely related ANOVA and regression are. They’re not only related, they’re the same thing. Not a quarter and a nickel, but different sides of the same coin.
So here is a very simple example that shows why. When someone showed me this, a light bulb went on, even though I already knew both ANOVA and multiple linear (more…)
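One way to check the equivalence yourself is sketched below in Python with NumPy. This is an illustration on made-up data, not necessarily the example from the full post: it computes the one-way ANOVA F statistic from sums of squares, then fits a regression on dummy variables and computes the overall F from that model.

```python
import numpy as np

# Hypothetical data: three groups, four observations each
groups = {
    "a": np.array([4.0, 5.0, 6.0, 5.0]),
    "b": np.array([7.0, 8.0, 6.0, 7.0]),
    "c": np.array([10.0, 9.0, 11.0, 10.0]),
}
y = np.concatenate(list(groups.values()))
n, k = y.size, len(groups)
grand_mean = y.mean()

# One-way ANOVA: between-group and within-group sums of squares
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups.values())
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups.values())
f_anova = (ss_between / (k - 1)) / (ss_within / (n - k))

# Regression on dummy variables, with group "a" as the reference category
labels = np.repeat(list(groups.keys()), [g.size for g in groups.values()])
X = np.column_stack([np.ones(n), labels == "b", labels == "c"]).astype(float)
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ coefs
ss_model = ((fitted - grand_mean) ** 2).sum()
ss_resid = ((y - fitted) ** 2).sum()
f_reg = (ss_model / (k - 1)) / (ss_resid / (n - k))

print(f_anova, f_reg)  # the two F statistics agree
print(coefs)           # reference-group mean, then differences from it
```

The intercept is the mean of the reference group, and each dummy coefficient is the difference between that group's mean and the reference mean, which is the same information the ANOVA summarizes.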
In a Regression model, should you drop interaction terms if they’re not significant?
In an ANOVA, adding interaction terms still leaves the main effects as main effects. That is, as long as the data are balanced, the main effects and the interactions are independent. The main effect is still telling (more…)
A Linear Regression Model with an interaction between two predictors (X1 and X2) has the form:
Y = B0 + B1*X1 + B2*X2 + B3*X1*X2
It doesn’t really matter if X1 and X2 are categorical or continuous, but let’s assume they are continuous for simplicity.
One important concept is that B1 and B2 are not main effects, the way they would be if (more…)
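A short Python sketch shows what the coefficients do in this model. The coefficient values are invented for illustration. Algebraically, the effect of X1 on Y is B1 + B3*X2, so it equals B1 only when X2 is 0.

```python
# Hypothetical coefficients, chosen only for illustration
B0, B1, B2, B3 = 1.0, 2.0, -1.0, 0.5


def predict(x1, x2):
    """Predicted Y from the interaction model."""
    return B0 + B1 * x1 + B2 * x2 + B3 * x1 * x2


def slope_of_x1(x2):
    """Change in predicted Y for a one-unit increase in X1, holding X2 fixed."""
    return predict(1.0, x2) - predict(0.0, x2)


for x2 in (0.0, 1.0, 4.0):
    print(x2, slope_of_x1(x2))
# 0.0 2.0
# 1.0 2.5
# 4.0 4.0
```

The slope for X1 changes with every value of X2, which is why B1 on its own describes the effect of X1 only at X2 = 0.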
I just came across this great article by Frank Harrell: Problems Caused by Categorizing Continuous Variables
It’s from the Vanderbilt University biostatistics department, so the examples are all medical, but the points hold for any field.
It goes right along with my recent post, Continuous and Categorical Variables: The Trouble with Median Splits.