Previous Posts
Many of us love performing statistical analyses but hate writing them up in the Results section of the manuscript. We struggle with big-picture issues (What should I include? In what order?) as well as minutia (Do tables have to be double-spaced?).
What is a Confounder? Confounder (also called confounding variable) is one of those statistical terms that confuses a lot of people. Not because it represents a confusing concept, but because of how it’s used. (Well, it’s a bit of a confusing concept, but that’s not the worst part). It has slightly different meanings to different […]
Predicting future outcomes, the next steps in a process, or the best choice(s) from an array of possibilities are all essential needs in many fields. The predictive model is used as a decision making tool in advertising and marketing, meteorology, economics, insurance, health care, engineering, and would probably be useful in your work too!
There are a number of simplistic methods available for tackling the problem of missing data. Unfortunately there is a very high likelihood that each of these simplistic methods introduces bias into our model results. Multiple imputation is considered to be the superior method of working with missing data. It eliminates the bias introduced by the […]
The concept of “hazard” is similar, but not exactly the same as, its meaning in everyday English. If you’re not familiar with Survival Analysis, it’s a set of statistical methods for modelling the time until an event occurs.Let’s use an example you’re probably familiar with — the time until a PhD candidate completes their dissertation.
One issue with using tests of significance is that black and white cut-off points such as 5 percent or 1 percent may be difficult to justify. Significance tests on their own do not provide much light about the nature or magnitude of any effect to which they apply. One way of shedding more light on […]
Oops—you ran the analysis you planned to run on your data, carefully chosen to answer your research question, but your residuals aren’t normally distributed. Maybe you’ve tried transforming the outcome variable, or playing around with the independent variables, but still no dice. That’s ok, because you can always turn to a non-parametric analysis, right? Well, […]
What are the best methods for checking a generalized linear mixed model (GLMM) for proper fit? This question comes up frequently. Unfortunately, it isn’t as straightforward as it is for a general linear model. In linear models the requirements are easy to outline: linear in the parameters, normally distributed and independent residuals, and homogeneity of […]
Survey questions are often structured without regard for ease of use within a statistical model. Take for example a survey done by the Centers for Disease Control (CDC) regarding child births in the U.S. One of the variables in the data set is “interval since last pregnancy”. Here is a histogram of the results.
A great tool to have in your statistical tool belt is logistic regression. It comes in many varieties and many of us are familiar with the variety for binary outcomes. But multinomial and ordinal varieties of logistic regression are also incredibly useful and worth knowing. They can be tricky to decide between in practice, however. […]