At times it is necessary to convert a continuous predictor into a categorical predictor.  For example, income per household is shown below.

These data are censored: all family income above $155,000 is recorded as $155,000. A further explanation about censored and truncated data can be found here. Because of this censoring, it would be incorrect to use this variable as a continuous predictor.
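As a rough illustration of the conversion, here is a minimal Python sketch that maps a household income to a category. Only the $155,000 top-code comes from the text; the other cut points, the labels, and the function name `income_category` are hypothetical choices made for this example.

```python
from bisect import bisect_right

# Hypothetical cut points in dollars. Only the 155,000 top-code is from the
# article; the lower boundaries are made up for illustration.
CUT_POINTS = [25_000, 50_000, 100_000, 155_000]
LABELS = [
    "under 25k",
    "25k to under 50k",
    "50k to under 100k",
    "100k to under 155k",
    "155k or more (top-coded)",
]


def income_category(income):
    """Return the category label for a household income in dollars."""
    # bisect_right counts how many cut points are <= income, which is
    # exactly the index of the matching label.
    return LABELS[bisect_right(CUT_POINTS, income)]


print(income_category(42_000))
print(income_category(155_000))
```

Every censored value lands in the top category, so the model no longer treats $155,000 as an exact income.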



A Useful Graph for Interpreting Interactions between Continuous Variables

What’s a good method for interpreting the results of a model with two continuous predictors and their interaction? Let’s start by looking at a model without an interaction.  In the model below, we regress a subject’s hip size on their weight and height. Height and weight are centered at their means.
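Centering a predictor at its mean simply subtracts the sample mean from every value. A minimal Python sketch, using made-up height values and a hypothetical helper named `center`:

```python
from statistics import fmean


def center(values):
    """Subtract the sample mean from each value."""
    mean = fmean(values)
    return [v - mean for v in values]


# Hypothetical heights in centimeters, for illustration only.
heights = [160.0, 170.0, 180.0]
print(center(heights))
```

After centering, a value of zero corresponds to a subject of average height, which makes the intercept and main effects easier to interpret.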


February 2019 Member Webinar: What’s the Best Statistical Package for You?

Choosing statistical software is part of The Fundamentals of Statistical Skill and is a necessary step toward learning a second software package (something we recommend to anyone progressing from Stage 2 to Stage 3 and beyond). You have many choices for software to analyze your data: R, SAS, SPSS, and Stata, among others. They are all quite good, but […]


Descriptives Before Model Building

One approach to model building is to use all predictors that make theoretical sense in the first model. For example, a first model for determining birth weight could include mother’s age, education, marital status, race, weight gain during pregnancy and gestation period. The main effects of this model show that a mother’s education level and […]


The Secret to Importing Excel Spreadsheets into SAS

My poor colleague was pulling her hair out in frustration today. Here’s what happened: She was trying to import an Excel spreadsheet into SAS, and it didn’t work. Here’s what to do.


Using Predicted Means to Understand Our Models

The expression “can’t see the forest for the trees” often comes to mind when reviewing a statistical analysis. We get so involved in reporting “statistically significant” results and p-values that we fail to explore the grand picture of our results. It’s understandable that this can happen. We have a hypothesis to test. We go through a […]


The Difference Between Random Factors and Random Effects

Mixed models are hard. They’re abstract, they’re a little weird, and there is not a common vocabulary or notation for them. But they’re also extremely important to understand because many data sets require their use. Repeated measures ANOVA has too many limitations. It just doesn’t cut it any more. One of the most difficult parts […]


January 2019 Member Webinar: Model Building Approaches

There is a bit of art and experience to model building. You need to build a model to answer your research question, but how do you build a statistical model when there are no instructions in the box? Should you start with all your predictors or look at each one separately? Do you always take […]


How to Understand a Risk Ratio of Less than 1

When a model has a binary outcome, one common effect size is a risk ratio. As a reminder, a risk ratio is simply a ratio of two probabilities. (The risk ratio is also called relative risk.)
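To make the definition concrete, here is a minimal Python sketch that computes a risk ratio from counts in two groups. The counts and the function name `risk_ratio` are hypothetical and chosen only for illustration.

```python
def risk_ratio(events_a, n_a, events_b, n_b):
    """Ratio of the event probability in group A to that in group B."""
    risk_a = events_a / n_a  # probability of the event in group A
    risk_b = events_b / n_b  # probability of the event in group B
    return risk_a / risk_b


# Hypothetical counts: 10 of 100 in group A and 20 of 100 in group B
# experience the event.
rr = risk_ratio(10, 100, 20, 100)
print(rr)
```

With these made-up counts the probability in group A is half the probability in group B, so the risk ratio is below one.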

Recently I have had a few questions about risk ratios less than one.

A predictor variable with a risk ratio of less than one is often labeled a “protective factor” (at least in Epidemiology). This can be confusing because in our typical understanding of those terms, it makes no sense that a risk be protective.

So how can a RISK be protective?


Removing the Intercept from a Regression Model When X Is Continuous

In a recent article, we reviewed the impact of removing the intercept from a regression model when the predictor variable is categorical. This month we’re going to talk about removing the intercept when the predictor variable is continuous. Spoiler alert: You should never remove the intercept when a predictor variable is continuous. Here’s why.
