
Five Advantages of Running Repeated Measures ANOVA as a Mixed Model

May 13th, 2009 by

There are two ways to run a repeated measures analysis. The traditional way is to treat it as a multivariate test–each response is considered a separate variable. The other way is to run it as a mixed model. While the multivariate approach is easy to run and quite intuitive, there are a number of advantages to running a repeated measures analysis as a mixed model.

First I will explain the difference between the approaches, then briefly describe some of the advantages of using the mixed models approach. (more…)


Interpreting Interactions: When the F test and the Simple Effects disagree.

May 11th, 2009 by

The way to follow up on a significant two-way interaction between two categorical variables is to check the simple effects.  Most of the time the simple effects tests give a very clear picture about the interaction.  Every so often, however, you have a significant interaction, but no significant simple effects.  It is not a logical impossibility. They are testing two different, but related hypotheses.

Assume your two independent variables are A and B.  Each has two values: 1 and 2.  The interaction tests whether A1 - B1 = A2 - B2 (the null hypothesis). The simple effects test whether A1 - B1 = 0 and A2 - B2 = 0 (the null hypotheses).

If you have a crossover interaction, you can have A1 - B1 slightly positive and A2 - B2 slightly negative. While neither is significantly different from 0, they are significantly different from each other.
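
To see the arithmetic behind this, here is a minimal sketch with made-up numbers. The two simple-effect estimates (+0.3 and -0.3) and their standard error (0.2) are hypothetical, and the sketch assumes the two simple effects come from independent groups and that a large-sample z test with a 1.96 cutoff is reasonable.

```python
import math

# Hypothetical estimates, not taken from any real study.
simple_effect_1 = 0.3    # first difference, slightly positive
simple_effect_2 = -0.3   # second difference, slightly negative
se_simple = 0.2          # assumed standard error of each simple effect

# The interaction contrast is the difference between the two simple effects.
interaction = simple_effect_1 - simple_effect_2

# Assuming independent groups, the variances of the two estimates add.
se_interaction = math.sqrt(se_simple ** 2 + se_simple ** 2)

z_simple_1 = simple_effect_1 / se_simple      # 1.5
z_simple_2 = simple_effect_2 / se_simple      # -1.5
z_interaction = interaction / se_interaction  # about 2.12

critical = 1.96  # two-sided 5% cutoff under the normal approximation
print("simple effect 1 significant:", abs(z_simple_1) > critical)
print("simple effect 2 significant:", abs(z_simple_2) > critical)
print("interaction significant:   ", abs(z_interaction) > critical)
```

The interaction contrast is twice the size of either simple effect, but its standard error is only about 1.4 times as large, so it can clear the cutoff when neither simple effect does.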

And it is highly useful for answering many research questions to know if the differences in the means in one condition equal the differences in the means for the other. It might be true that it’s not testing a hypothesis you’re interested in, but in many studies, all the interesting effects are in the interactions.

 


Why Logistic Regression for Binary Response?

May 5th, 2009 by

Logistic regression models can seem pretty overwhelming to the uninitiated.  Why not use a regular regression model?  Just turn Y into an indicator variable–Y=1 for success and Y=0 for failure.

For some good reasons.

1. It doesn’t make sense to model Y as a linear function of the parameters because Y has only two values.  You just can’t make a line out of that (at least not one that fits the data well).

2. The predicted values can be any positive or negative number, not just 0 or 1.

3. The values of 0 and 1 are arbitrary. The important part is not to predict the numerical value of Y, but the probability that success or failure occurs, and the extent to which that probability depends on the predictor variables.
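
The second reason is easy to check for yourself. The sketch below fits an ordinary least-squares line to a small, invented 0/1 response (the data and the `fit_line` helper are made up for illustration) and looks at the fitted values.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Invented data: Y = 1 for success, Y = 0 for failure.
xs = list(range(10))
ys = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

intercept, slope = fit_line(xs, ys)
predictions = [intercept + slope * x for x in xs]

# The fitted values run from about -0.13 to about 1.13,
# so some of them are impossible as probabilities.
print(min(predictions), max(predictions))
```

Even inside the observed range of X, the line predicts values below 0 and above 1.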

So okay, you say.  Why not use a simple transformation of Y, like probability of success–the probability that Y=1?

Well, that doesn’t work so well either.

Why not?

1. The right-hand side of the equation can be any number, but the left-hand side can only range from 0 to 1.

2. It turns out the relationship is not linear, but rather follows an S-shaped (or sigmoidal) curve.

To obtain a linear relationship, we need to transform this response, Pr(success), as well.

As luck would have it, there are a few functions that:

1. are not restricted to values between 0 and 1

2. will form a linear relationship with our parameters

These functions include:

Arcsine

Probit

Logit

All three of these work just as well, but (believe it or not) the Logit function is the easiest to interpret.
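
For the logit in particular, the transformation is just the log of the odds, log(p / (1 - p)). Here is a minimal sketch of it and its inverse; the function names are my own and not from any particular package.

```python
import math

def logit(p):
    """Log odds of a probability p, for 0 < p < 1."""
    return math.log(p / (1 - p))

def inverse_logit(z):
    """Map any real number back to a probability between 0 and 1."""
    return 1 / (1 + math.exp(-z))

# Probabilities near 0 map to large negative numbers, 0.5 maps to 0,
# and probabilities near 1 map to large positive numbers.
for p in (0.01, 0.25, 0.5, 0.75, 0.99):
    print(p, round(logit(p), 3))
```

Because the logit can take any value on the real line, it no longer clashes with a right-hand side that can be any number.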

But as it turns out, you can’t just run the transformation then do a regular linear regression on the transformed data.  That would be way too easy, but it would also give inaccurate results.  Logistic Regression uses a different method for estimating the parameters, maximum likelihood, which gives better results–better meaning unbiased, with lower variances.
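
To give a feel for what that estimation involves, here is a rough sketch of maximum likelihood for a logistic regression with one predictor, using Newton-Raphson updates. The data are invented, the `fit_logistic` helper is hypothetical, and real software adds convergence checks and standard errors that this sketch leaves out.

```python
import math

def fit_logistic(xs, ys, n_iter=25):
    """Newton-Raphson for logit(p) = b0 + b1 * x; returns (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        ps = [1 / (1 + math.exp(-(b0 + b1 * x))) for x in xs]
        # Gradient of the log-likelihood with respect to b0 and b1.
        g0 = sum(y - p for y, p in zip(ys, ps))
        g1 = sum(x * (y - p) for x, y, p in zip(xs, ys, ps))
        # Information matrix (negative Hessian), built from weights p(1 - p).
        ws = [p * (1 - p) for p in ps]
        h00 = sum(ws)
        h01 = sum(w * x for w, x in zip(ws, xs))
        h11 = sum(w * x * x for w, x in zip(ws, xs))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system by hand.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Invented data; the 0s and 1s overlap, so a finite solution exists.
xs = list(range(10))
ys = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

b0, b1 = fit_logistic(xs, ys)
fitted = [1 / (1 + math.exp(-(b0 + b1 * x))) for x in xs]
print(b0, b1)
```

Unlike the straight-line fit, every fitted value here is a legitimate probability between 0 and 1.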

 


SPSS GLM or Regression? When to use each

April 23rd, 2009 by

Regression models are just a subset of the General Linear Model, so you can use GLM procedures to run regressions.  It is what I usually use.

But in SPSS, the GLM and Regression procedures each have options that aren’t available in the other.  How do you decide when to use GLM and when to use Regression?

GLM has these options that Regression doesn’t: (more…)


EM Imputation and Missing Data: Is Mean Imputation Really so Terrible?

April 15th, 2009 by

I’m sure I don’t need to explain to you all the problems that occur as a result of missing data.  Anyone who has dealt with missing data—that means everyone who has ever worked with real data—knows about the loss of power and sample size, and the potential bias in your data that comes with listwise deletion.


Listwise deletion is the default method for dealing with missing data in most statistical software packages.  It simply means excluding from the analysis any cases with data missing on any variables involved in the analysis.
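
As a small illustration, here is what listwise deletion does to a tiny, made-up data set. The variable names and values are hypothetical, and `None` stands in for a missing value.

```python
# Made-up cases; None marks a missing value.
cases = [
    {"id": 1, "age": 34, "income": 52000, "score": 7.1},
    {"id": 2, "age": None, "income": 48000, "score": 6.4},
    {"id": 3, "age": 29, "income": None, "score": 8.0},
    {"id": 4, "age": 41, "income": 61000, "score": 5.9},
]

# Only the variables involved in the analysis matter for deletion.
analysis_vars = ["age", "income", "score"]

def listwise_delete(rows, variables):
    """Keep only the rows with no missing value on any analysis variable."""
    return [row for row in rows if all(row[v] is not None for v in variables)]

complete_cases = listwise_delete(cases, analysis_vars)
print([row["id"] for row in complete_cases])  # cases 1 and 4 remain
```

Two of the four cases are dropped even though each is missing only one value, which is where the loss of sample size comes from.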

A very simple, and in many ways appealing, method devised to (more…)


Checking Assumptions in ANOVA and Linear Regression Models: The Distribution of Dependent Variables

April 10th, 2009 by

Here’s a little reminder for those of you checking assumptions in regression and ANOVA:

The assumptions of normality and homogeneity of variance for linear models are not about Y, the dependent variable.  (If you think I’m either stupid, crazy, or just plain nit-picking, read on.  This distinction really is important.) (more…)