When our research question focuses on how often an event occurs, we will typically use a count model to analyze the data. There are numerous count models; common examples include Poisson, negative binomial, zero-inflated Poisson, and truncated negative binomial.

There are specific requirements as to which count model to use. The models are not interchangeable. But regardless of the model we use, there is a very important prerequisite that they all share.



Member Training: Multiple Imputation for Missing Data

There are a number of simplistic methods available for tackling the problem of missing data. Unfortunately, each of these methods is very likely to introduce bias into our model results. Multiple imputation is considered to be the superior method of working with missing data. It eliminates the bias introduced by the […]


What Is a Hazard Function in Survival Analysis?

The concept of “hazard” is similar to, but not exactly the same as, its meaning in everyday English. If you’re not familiar with Survival Analysis, it’s a set of statistical methods for modelling the time until an event occurs. Let’s use an example you’re probably familiar with: the time until a PhD candidate completes their dissertation.


How to Interpret the Width of a Confidence Interval

One issue with using tests of significance is that black-and-white cut-off points such as 5 percent or 1 percent may be difficult to justify. Significance tests on their own reveal little about the nature or magnitude of any effect to which they apply. One way of shedding more light on […]


Member Training: Non-Parametric Analyses

Oops—you ran the analysis you planned to run on your data, carefully chosen to answer your research question, but your residuals aren’t normally distributed. Maybe you’ve tried transforming the outcome variable, or playing around with the independent variables, but still no dice. That’s ok, because you can always turn to a non-parametric analysis, right? Well, […]


Regression Diagnostics in Generalized Linear Mixed Models

by Kim Love, PhD

What are the best methods for checking a generalized linear mixed model (GLMM) for proper fit? This question comes up frequently. Unfortunately, it isn’t as straightforward as it is for a general linear model. In linear models the requirements are easy to outline: linear in the parameters, normally distributed and independent […]


Recoding a Variable from a Survey Question to Use in a Statistical Model

Survey questions are often structured without regard for ease of use within a statistical model. Take, for example, a survey done by the Centers for Disease Control and Prevention (CDC) regarding childbirths in the U.S. One of the variables in the data set is “interval since last pregnancy”. Here is a histogram of the results.


How to Decide Between Multinomial and Ordinal Logistic Regression Models

A great tool to have in your statistical tool belt is logistic regression. It comes in many varieties, and many of us are familiar with the variety for binary outcomes. But multinomial and ordinal varieties of logistic regression are also incredibly useful and worth knowing. They can be tricky to decide between in practice, however. […]


Member Training: Determining Levels of Measurement: What Lies Beneath the Surface

You probably learned about the four levels of measurement in your very first statistics class: nominal, ordinal, interval, and ratio. Knowing the level of measurement of a variable is crucial when working out how to analyze the variable. Failing to correctly match the statistical method to a variable’s level of measurement leads either to nonsense […]


Eight Ways to Detect Multicollinearity

Multicollinearity can affect any regression model with more than one predictor. It occurs when two or more predictor variables overlap so much in what they measure that their effects are indistinguishable. When the model tries to estimate their unique effects, it goes wonky (yes, that’s a technical term). So for example, you may be interested in […]
