Last week I had the pleasure of teaching a webinar on Interpreting Regression Coefficients. We walked through the output of a somewhat tricky regression model—it included two dummy-coded categorical variables, a covariate, and a few interactions.
As always seems to happen, our audience asked an amazing number of great questions. (Seriously, I’ve had multiple guest instructors compliment me on our audience and their thoughtful questions.)
We had so many that although I spent about 40 minutes answering (more…)
Predictor variables in statistical models can be treated as either continuous or categorical.
Usually, this is a very straightforward decision.
Categorical predictors, like treatment group, marital status, or highest educational degree, should be specified as categorical.
Likewise, continuous predictors, like age, systolic blood pressure, or percentage of ground cover, should be specified as continuous.
But there are numerical predictors that aren’t continuous. Sometimes it makes sense to treat these as continuous, and sometimes as categorical.
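To make the choice concrete, here is a minimal sketch of what each treatment does to the design matrix. The variable `n_children`, its values, and both helper functions are hypothetical and made up for illustration; real software builds these columns for you once you declare the predictor as continuous or categorical.

```python
def as_continuous(values):
    """Treat the predictor as continuous: one numeric column, one slope."""
    return [[float(v)] for v in values]


def as_dummy_coded(values, reference=None):
    """Treat the predictor as categorical: k-1 indicator (dummy) columns.

    The reference level gets no column; it is absorbed into the intercept.
    Defaults to the smallest observed level as the reference.
    """
    levels = sorted(set(values))
    ref = levels[0] if reference is None else reference
    non_ref = [lv for lv in levels if lv != ref]
    rows = [[1 if v == lv else 0 for lv in non_ref] for v in values]
    return non_ref, rows


# Hypothetical numeric predictor that is not truly continuous.
n_children = [0, 1, 2, 3, 1, 0]

cont = as_continuous(n_children)
levels, dummies = as_dummy_coded(n_children)

print("continuous columns: ", len(cont[0]))     # a single slope to estimate
print("dummy columns:      ", len(dummies[0]))  # one coefficient per non-reference level
print("non-reference levels:", levels)
```

Treated as continuous, the model estimates a single slope and so assumes each additional unit has the same effect. Treated as categorical, each non-reference level gets its own coefficient, at the cost of more parameters.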
(more…)
At The Analysis Factor, we are on a mission to help researchers improve their statistical skills so they can do amazing research.
We all tend to think of “Statistical Analysis” as one big skill, but it’s not.
Over the years of training, coaching, and mentoring data analysts at all stages, I’ve realized there are four fundamental stages of statistical skill:
Stage 1: The Fundamentals
Stage 2: Linear Models
Stage 3: Extensions of Linear Models
Stage 4: Advanced Models
There is also a stage beyond these where the mathematical statisticians dwell. But that stage is required for such a tiny fraction of data analysis projects that we’re going to ignore it for now.
If you try to master the skill of “statistical analysis” as a whole, it’s going to be overwhelming.
And honestly, you’ll never finish. It’s too big of a field.
But if you can work through these stages, you’ll find you can learn and do just about any statistical analysis you need to. (more…)
Survival analysis isn’t just a single model.
It’s a whole set of tests, graphs, and models that are all used in slightly different data and study design situations. Choosing the most appropriate model can be challenging.
In this article I will describe the most common types of tests and models in survival analysis, how they differ, and some challenges to learning them.
(more…)
Every statistical model and hypothesis test has assumptions.
And yes, if you’re going to use a statistical test, you need to check whether those assumptions are reasonable to whatever extent you can.
Some assumptions are easier to check than others. Some are so obviously reasonable that you don’t need to do much to check them most of the time. And some have no good way of being checked directly, so you have to use situational clues.
(more…)
Most of us know that binary logistic regression is appropriate when the outcome variable has two possible outcomes: success and failure.
Binary logistic regression is also appropriate in two more situations, but they don’t always look like they should be.
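As a reminder of what the model does with a success/failure outcome, here is a minimal sketch of how a logistic regression turns a linear predictor into a probability of success. The coefficients `b0` and `b1` and the predictor values are hypothetical, chosen only for illustration and not estimated from any data.

```python
import math


def inverse_logit(eta):
    """Map a log-odds value (the linear predictor) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical intercept and slope on the log-odds scale.
b0, b1 = -1.0, 0.5

for x in (0, 2, 4):
    log_odds = b0 + b1 * x
    p_success = inverse_logit(log_odds)
    print(f"x={x}: log-odds={log_odds:+.2f}, P(success)={p_success:.3f}")
```

The coefficients are on the log-odds scale, which is why the predicted probability does not change by a constant amount for each one-unit change in the predictor.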
(more…)