
What Makes a Statistical Analysis Wrong?

January 21st, 2010 by

One of the most anxiety-laden questions I get from researchers is whether their analysis is “right.”

I’m always slightly uncomfortable with that word. Often there is no one right analysis.

It’s like finding Mr. or Ms. or Mx. Right. Most of the time, there is not just one Right. But there are many that are clearly Wrong.

But there are characteristics of an analysis that make it work. Let’s take a look.

What Makes an Analysis Right?

Luckily, what makes an analysis right is easier to define than what makes a person right for you. It pretty much comes down to two things: whether the assumptions of the statistical method are being met and whether the analysis answers the research question.

Assumptions are very important. A test needs to reflect the measurement scale of the variables, the study design, and issues in the data. A repeated measures study design requires a repeated measures analysis. A binary dependent variable requires a categorical analysis method.

If you don’t match the measurement scale of the variables to the appropriate test or model, you’re going to have trouble meeting assumptions. And yes, there are ad-hoc strategies to make assumptions seem more reasonable, like transformations. But these strategies are not a cure-all and you’re better off sticking with an analysis that fits the variables.

But within those general categories of appropriate analysis, there are often many analyses that meet assumptions.

A logistic regression or a chi-square test both handle a binary dependent variable with a single categorical predictor. But a logistic regression can answer more research questions. It can incorporate covariates, directly test interactions, and calculate predicted probabilities. A chi-square test can do none of these.
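To make the contrast concrete, here is a minimal Python sketch using a made-up 2x2 table (the counts and the group names `treatment` and `control` are hypothetical, not from any real study). It computes the Pearson chi-square statistic by hand, then the logistic regression coefficients. For a single binary predictor those coefficients have a closed form: the intercept is the log odds in the reference group and the slope is the log odds ratio. With covariates or interactions there is no such shortcut, and you would fit the model iteratively in statistical software.

```python
import math

# Hypothetical counts: rows are predictor groups, columns are the binary outcome.
table = {
    "treatment": {"yes": 30, "no": 20},
    "control": {"yes": 15, "no": 35},
}


def pearson_chi_square(table):
    """Pearson chi-square statistic for a two-way table of counts."""
    rows = list(table)
    cols = list(next(iter(table.values())))
    row_totals = {r: sum(table[r].values()) for r in rows}
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    n = sum(row_totals.values())
    chi2 = 0.0
    for r in rows:
        for c in cols:
            expected = row_totals[r] * col_totals[c] / n
            chi2 += (table[r][c] - expected) ** 2 / expected
    return chi2


def logistic_from_2x2(table, reference, comparison):
    """Closed-form logistic coefficients; valid only for one binary predictor."""
    odds_ref = table[reference]["yes"] / table[reference]["no"]
    odds_cmp = table[comparison]["yes"] / table[comparison]["no"]
    intercept = math.log(odds_ref)          # log odds in the reference group
    slope = math.log(odds_cmp / odds_ref)   # log odds ratio
    return intercept, slope


def predicted_probability(intercept, slope, x):
    """Inverse logit of the linear predictor (x = 0 reference, 1 comparison)."""
    return 1 / (1 + math.exp(-(intercept + slope * x)))


chi2 = pearson_chi_square(table)
# For 1 degree of freedom, the upper-tail chi-square probability is erfc(sqrt(x/2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

intercept, slope = logistic_from_2x2(table, "control", "treatment")

print("chi-square:", round(chi2, 3), "p-value:", round(p_value, 4))
print("odds ratio:", round(math.exp(slope), 3))
print("P(yes | control):", round(predicted_probability(intercept, slope, 0), 3))
print("P(yes | treatment):", round(predicted_probability(intercept, slope, 1), 3))
```

The chi-square test gives one statistic and p-value for the association. The logistic coefficients also give an odds ratio and a predicted probability for each group, and the same model can be extended with covariates.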

So you get different information from different tests. They answer different research questions.

An analysis that is correct from an assumptions point of view is useless if it doesn’t answer the research question. A data set can spawn an endless number of statistical tests and models that don’t answer the research question. And you can spend an endless number of days running them.

When to Think about the Analysis

The real bummer is that it’s not always clear that the analyses aren’t relevant until you are all done with the analysis and start to write up the research paper.

That’s why writing out the research questions in theoretical and operational terms is the first step of any statistical analysis.

It’s absolutely fundamental.

And I mean writing them in minute detail. Issues of mediation, interaction, subsetting, control variables, et cetera, should all be blatantly obvious in the research questions.

This is one step where it can be extremely valuable to talk to your statistical consultant. It’s something we do all the time in the Statistically Speaking membership.

Thinking about how to analyze the data before collecting the data can keep you from hitting a dead end. It can be very obvious, once you think through the details, that the analysis available to you based on the data won’t answer the research question.

Whether the answer is what you expected or not is a different issue.

So when you are concerned about getting an analysis “right,” clearly define the design, variables, and data issues. And check whatever assumptions you can!

But most importantly, get explicitly clear about what you want to learn from this analysis.

Once you’ve done this, it’s much easier to find the statistical method that answers the research questions and meets assumptions. And if this feels pointless because you don’t know which analysis matches that combination, that’s what we’re here for. Our statistical team at The Analysis Factor can help you with any of these steps.

 


The Distribution of Independent Variables in Regression Models

January 19th, 2010 by

While there are a number of distributional assumptions in regression models, there are no assumptions about the distribution of any predictor (i.e. independent) variables.

It’s because regression models are directional. In a correlation, there is no direction: Y and X are interchangeable. If you switched them, you’d get the same correlation coefficient.
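The symmetry of correlation, and the contrast with a regression slope, can be checked with a small Python sketch. The data values below are made up for illustration, and the helper names (`covariance`, `correlation`, `slope`) are just this example’s own.

```python
import math

# Hypothetical data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.5]


def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    """Sample covariance; covariance(a, a) is the sample variance of a."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)


def correlation(a, b):
    """Pearson correlation coefficient."""
    return covariance(a, b) / math.sqrt(covariance(a, a) * covariance(b, b))


def slope(outcome, predictor):
    """Least-squares slope from regressing outcome on a single predictor."""
    return covariance(outcome, predictor) / covariance(predictor, predictor)


# Swapping the variables leaves the correlation unchanged...
print("r(x, y):", correlation(x, y))
print("r(y, x):", correlation(y, x))

# ...but changes the regression slope, because the outcome has changed.
print("slope of y on x:", slope(y, x))
print("slope of x on y:", slope(x, y))
```

The two slopes differ, yet their product equals the squared correlation, which is one way to see that the correlation carries no direction while each regression does.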

But regression is inherently a model about the outcome variable. What predicts its value and how well? The nature of how predictors relate to it (more…)


Answers to the Interpreting Regression Coefficients Quiz

January 16th, 2010 by

Yesterday I gave a little quiz about interpreting regression coefficients.  Today I’m giving you the answers.

If you want to try it yourself before you see the answers, go here. (It’s truly little, but if you’re like me, you just cannot resist testing yourself.)

True or False?

1. When you add an interaction to a regression model, you can still evaluate the main effects of the terms that make up the interaction, just like in ANOVA. (more…)


Interpreting (Even Tricky) Regression Coefficients – A Quiz

January 15th, 2010 by

Here’s a little quiz:

True or False?

1. When you add an interaction to a regression model, you can still evaluate the main effects of the terms that make up the interaction, just like in ANOVA.

2. The intercept is usually meaningless in a regression model. (more…)


Making Dummy Codes Easy to Keep Track of

January 14th, 2010 by

Here’s a little tip.

When you construct dummy variables, make it easy on yourself to remember which code is which. Heck, if you want to be really nice, make it easy for anyone else who will analyze the data or read the results.

Build the codes into the dummy variable’s name.

So instead of a variable named Gender with values of 1=Female and 0=Male, call the variable Female.

Instead of a set of dummy variables named MaritalStatus1 with values of 1=Married and 0=Single, along with MaritalStatus2 with values 1=Divorced and 0=Single, name the same variables Married and Divorced.

And if you’re new to dummy coding, this has the extra bonus of making the dummy coding intuitive.  It’s just a set of yes/no variables about all but one of your categories.
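As a rough illustration, here is one way this naming convention might look in plain Python. The function `dummy_code` and the sample values are hypothetical, written only for this sketch. Each non-reference category becomes a yes/no variable named after the category itself.

```python
def dummy_code(values, reference):
    """Return one dict per observation, with a 0/1 variable for every
    category except the reference. Each variable is named after its category."""
    categories = sorted(set(values) - {reference})
    return [{cat: int(value == cat) for cat in categories} for value in values]


# Hypothetical marital status data; "Single" is the reference category.
marital_status = ["Married", "Single", "Divorced", "Married"]

for status, row in zip(marital_status, dummy_code(marital_status, "Single")):
    print(status, row)
```

A 1 on `Married` means married and a 0 simply means not married. The reference category is the one that scores 0 on every dummy variable.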

 


Interpreting Regression Coefficients in Models other than Ordinary Linear Regression

January 5th, 2010 by

Someone who registered for my upcoming Interpreting (Even Tricky) Regression Models workshop asked if the content applies to logistic regression as well.

The short answer: Yes.

The long-winded detailed explanation of why this is true and the one caveat:

One of the greatest things about regression models is that they all have the same set up: (more…)