Latest Blog Posts

Predicting future outcomes, the next steps in a process, or the best choice(s) from an array of possibilities is an essential need in many fields. Predictive models are used as decision-making tools in advertising and marketing, meteorology, economics, insurance, health care, and engineering, and would probably be useful in your work too!

When our research question is focused on the frequency of occurrence of an event, we will typically use a count model to analyze the results. There are numerous count models. A few examples are Poisson, negative binomial, zero-inflated Poisson, and truncated negative binomial. There are specific requirements as to which count model to use; the models are not interchangeable. But regardless of the model we use, there is a very important prerequisite that they all share.

There are a number of simplistic methods available for tackling the problem of missing data. Unfortunately, each of these simplistic methods is very likely to introduce bias into our model results. Multiple imputation is considered to be the superior method of working with missing data. It eliminates the bias introduced by the […]

The concept of “hazard” is similar to, but not exactly the same as, its meaning in everyday English. If you’re not familiar with Survival Analysis, it’s a set of statistical methods for modelling the time until an event occurs. Let’s use an example you’re probably familiar with: the time until a PhD candidate completes their dissertation.

One issue with using tests of significance is that black-and-white cut-off points such as 5 percent or 1 percent may be difficult to justify. Significance tests on their own do not shed much light on the nature or magnitude of any effect to which they apply. One way of shedding more light on […]

Oops—you ran the analysis you planned to run on your data, carefully chosen to answer your research question, but your residuals aren’t normally distributed. Maybe you’ve tried transforming the outcome variable, or playing around with the independent variables, but still no dice. That’s ok, because you can always turn to a non-parametric analysis, right? Well, […]

by Kim Love, PhD
What are the best methods for checking a generalized linear mixed model (GLMM) for proper fit? This question comes up frequently. Unfortunately, it isn’t as straightforward as it is for a general linear model. In linear models the requirements are easy to outline: linear in the parameters, normally distributed and independent […]

Survey questions are often structured without regard for ease of use within a statistical model. Take, for example, a survey done by the Centers for Disease Control and Prevention (CDC) regarding births in the U.S. One of the variables in the data set is “interval since last pregnancy”. Here is a histogram of the results.

A great tool to have in your statistical tool belt is logistic regression. It comes in many varieties and many of us are familiar with the variety for binary outcomes. But multinomial and ordinal varieties of logistic regression are also incredibly useful and worth knowing. They can be tricky to decide between in practice, however. […]

You probably learned about the four levels of measurement in your very first statistics class: nominal, ordinal, interval, and ratio. Knowing the level of measurement of a variable is crucial when working out how to analyze the variable. Failing to correctly match the statistical method to a variable’s level of measurement leads either to nonsense […]