I recently received this email, which I thought was a great question, and one of wider interest…
Hello Karen,
I am an MPH student in biostatistics and I am curious about using regression for tests of association in applied statistical analysis. Why is using regression, or logistic regression, “better” than doing a bivariate analysis such as Chi-square?
I read a lot of studies in graduate school, and it seems like half of them use Chi-square to test for association between variables, and the other half, who just seem to be trying to be fancy, conduct some complicated regression-adjusted-for-controlled-by model. But the end results seem to be the same. I have worked with some professionals that say simple is better, and that using Chi-square is just fine, but I have worked with other professors that insist on building models. It also just seems so much simpler to do Chi-square when you are doing primarily categorical analysis.
My professors don’t seem to be able to give me a simple justified answer, so I thought I’d ask you. I enjoy reading your site and plan to begin participating in your webinars.
Thank you!
(more…)
Linear models (both OLS regression and ANOVA) are quite robust to departures from the assumptions of normality and constant variance. That means that even if the assumptions aren’t met perfectly, the resulting p-values will still be reasonable estimates.
But you need to check the assumptions anyway, because some departures are severe enough that the p-values become inaccurate. And in many cases there are remedial measures you can take to turn non-normal residuals into normal ones.
But sometimes you can’t.
Sometimes it’s because the dependent variable just isn’t appropriate for a linear model. The (more…)
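As a rough illustration of what checking the normality and constant-variance assumptions can look like in practice, here is a minimal Python sketch. It is not the procedure from the post: the simulated data, the function name check_residuals, and the choice of diagnostics (a Shapiro-Wilk test on the residuals, and a Spearman correlation between fitted values and absolute residuals as an informal check for non-constant variance) are all assumptions made for the example. Residual plots are usually at least as informative as these tests.

```python
import numpy as np
from scipy import stats


def check_residuals(x, y):
    """Fit y = b0 + b1*x by OLS and return simple residual diagnostics.

    Hypothetical helper for illustration only.
    """
    # Design matrix with an intercept column.
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    resid = y - fitted

    # Normality of residuals: Shapiro-Wilk test.
    shapiro_stat, shapiro_p = stats.shapiro(resid)

    # Constant variance (informal): do absolute residuals trend with
    # the fitted values? A strong correlation suggests they do.
    rho, rho_p = stats.spearmanr(fitted, np.abs(resid))

    return {
        "intercept": float(beta[0]),
        "slope": float(beta[1]),
        "resid_mean": float(resid.mean()),
        "shapiro_p": float(shapiro_p),
        "spread_rho": float(rho),
        "spread_p": float(rho_p),
    }


# Simulated data (an assumption for this example): a linear relationship
# with normal, constant-variance errors.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 2.0 + 0.5 * x + rng.normal(0, 1, size=200)

diag = check_residuals(x, y)
for name, value in diag.items():
    print(f"{name}: {value:.4f}")
```

Small p-values from either check are a prompt to look more closely at the residuals, not an automatic verdict that the model is unusable.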
I was recently asked this question about Chi-square tests. This question comes up a lot, so I thought I’d share my answer.
I have to compare two sets of categorical data in a 2×4 table. I cannot run the chi-square test because most of the cells contain values less than five and a couple of them contain values of 0. Is there any other test that I could use that overcomes the limitations of chi-square?
And here is my answer: (more…)
There are 4 questions you must answer to choose an appropriate statistical analysis.
1. What is your Research Question?
2. What is the scale of measurement of the variables used to answer the research question?
3. What is the Design? (between subjects, within subjects, etc.)
4. Are there any data issues? (missing, censored, truncated, etc.)
If you have not already, read about these in more detail.
(more…)
One of the most common situations in which researchers get stuck with statistics is choosing which statistical methodology is appropriate to analyze their data. If you start by asking the following four questions, you will be able to narrow things down considerably.
Even if you don’t know the implications of your answers, answering these questions will clarify issues for you. It will help you decide what information to seek, and it will make any conversations you have with statistical advisors more efficient and useful.
1. What is your research question? (more…)
I first encountered the Great Likert Data Debate in 1992 in my first statistics class in my psychology graduate program.
My stats professor was a brilliant mathematical psychologist and taught the class unlike any psychology grad class I’ve ever seen since. Rather than learn ANOVA in SPSS, we derived the Method of Moments using Matlab. While I didn’t understand half of what was going on, this class roused my curiosity and led me to take more theoretical statistics classes. The rest is history.
A large section of the class was dedicated to the fact that Likert data was not interval and therefore not appropriate for statistics that assume normality such as ANOVA and regression. This was news to me. Meanwhile, most of the rest of the field either ignored or debated this assertion.
Sixteen years later, the debate continues. A good discussion of it can be found on the Research Methodology blog by Hisham bin Md-Basir, which offers thoughtful entries summarizing methodological articles in the social and design sciences.
To be fair, though, this blog entry summarizes an article on the “Likert scales are not interval” side of the debate. For a balanced listing of references, see Can Likert Scale Data Ever Be Continuous?