How to Understand a Risk Ratio of Less than 1

When a model has a binary outcome, one common effect size is a risk ratio. As a reminder, a risk ratio is simply a ratio of two probabilities. (The risk ratio is also called relative risk.)

Recently I have had a few questions about risk ratios less than one.

A predictor variable with a risk ratio of less than one is often labeled a “protective factor” (at least in epidemiology). This can be confusing because, given our everyday understanding of those terms, it makes no sense for a risk to be protective.

So how can a RISK be protective?
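As a quick illustration (with made-up risks, not data from any real study), a risk ratio is just one probability divided by another, so a value below 1 simply means the outcome is less likely in the exposed group:

```python
# Hypothetical example: probability of an outcome in exposed vs. unexposed groups
def risk_ratio(risk_exposed, risk_unexposed):
    """Ratio of two probabilities (also called relative risk)."""
    return risk_exposed / risk_unexposed

# Suppose 10% of the exposed group and 20% of the unexposed group
# experience the outcome:
rr = risk_ratio(0.10, 0.20)
print(rr)  # 0.5 -- exposure halves the probability, i.e. it is "protective"
```

A ratio of 0.5 says the exposed group's risk is half the unexposed group's, which is why such a predictor gets the "protective" label.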

Read the full article →

Removing the Intercept from a Regression Model When X Is Continuous

In a recent article, we reviewed the impact of removing the intercept from a regression model when the predictor variable is categorical. This month we’re going to talk about removing the intercept when the predictor variable is continuous. Spoiler alert: You should never remove the intercept when a predictor variable is continuous. Here’s why.
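A toy sketch (with fabricated data, not from the full article) shows what goes wrong: when the true intercept is nonzero, forcing the fitted line through the origin distorts the slope estimate.

```python
# Toy data generated from y = 10 + 2x (true intercept 10, true slope 2)
x = [0, 1, 2, 3, 4]
y = [10 + 2 * xi for xi in x]

# OLS slope with an intercept: cov(x, y) / var(x)
mx, my = sum(x) / len(x), sum(y) / len(y)
slope_with = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
              / sum((xi - mx) ** 2 for xi in x))

# Regression through the origin: sum(x * y) / sum(x^2)
slope_without = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)

print(slope_with)     # 2.0  -- recovers the true slope
print(slope_without)  # ~5.33 -- biased, because the true intercept is 10
```

The no-intercept fit has to "absorb" the missing intercept into the slope, which is one way to see why dropping the intercept with a continuous predictor is a bad idea.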

Read the full article →

Rescaling Sets of Variables to Be on the Same Scale

by Christos Giannoulis, PhD

Attributes are often measured using multiple variables with different upper and lower limits. For example, we may have five measures of political orientation, each with a different range of values. Each variable is measured in a different way. The measures have a different number of categories and the low and high […]
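One common way to put such measures on the same scale (a minimal sketch, using invented measures rather than the article's political-orientation data) is min-max rescaling to the 0-1 range:

```python
def rescale_01(values):
    """Min-max rescale a list of numbers to the 0-1 range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Two hypothetical measures with different ranges, put on a common 0-1 scale:
measure_a = [1, 2, 3, 4, 5]        # e.g. a 1-5 response scale
measure_b = [10, 25, 40, 55, 70]   # e.g. a 10-70 score

print(rescale_01(measure_a))  # [0.0, 0.25, 0.5, 0.75, 1.0]
print(rescale_01(measure_b))  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

After rescaling, each variable's minimum maps to 0 and its maximum to 1, so variables with different numbers of categories and different limits become directly comparable.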

Read the full article →

Member Training: Those Darn Ratios!

Ratios are everywhere in statistics—coefficient of variation, hazard ratio, odds ratio, the list goes on. Join Elaine Eisenbeisz as she presents an overview of the how and why of the various ratios we often use in statistical practice.

Read the full article →

Statistical Models for Truncated and Censored Data

Can we ignore the fact that a variable is bounded and just run our analysis as if the data weren't bounded?

Read the full article →

Your Questions Answered from the Interpreting Regression Coefficients Webinar

Q16: The different reference group definitions (between R and SPSS) seem to give different significance values. Is that because they are testing different hypotheses? (e.g. “Is group 1 different from the reference group?”)

A: Yes. Because they’re using different reference groups, we have different hypothesis tests and therefore different p-values.
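A toy illustration of why (using hypothetical group means, not the webinar's data): with dummy coding, each coefficient estimates a group's mean minus the reference group's mean, so changing the reference changes which differences—and which hypotheses—the coefficients test.

```python
# Hypothetical group means for a three-group predictor
means = {"g1": 10.0, "g2": 12.0, "g3": 15.0}

def dummy_coefs(means, reference):
    """With dummy coding, each coefficient is that group's mean
    minus the reference group's mean."""
    return {g: m - means[reference] for g, m in means.items() if g != reference}

print(dummy_coefs(means, "g1"))  # {'g2': 2.0, 'g3': 5.0}
print(dummy_coefs(means, "g3"))  # {'g1': -5.0, 'g2': -3.0}
```

With g1 as reference, the coefficients test "g2 vs. g1" and "g3 vs. g1"; with g3 as reference, they test "g1 vs. g3" and "g2 vs. g3". Different comparisons mean different p-values, even though the overall model fit is the same.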

Read the full article →

Member Training: Meta-analysis

Meta-analysis is the quantitative pooling of data from multiple studies. Meta-analysis done well has many strengths, including statistical power, precision in effect size estimates, and providing a summary of individual studies. But not all meta-analyses are done well.

Read the full article →

Should I Specify a Model Predictor as Categorical or Continuous?

Predictor variables in statistical models can be treated as either continuous or categorical.

Usually, deciding which way to specify each predictor is very straightforward.

Categorical predictors, like treatment group, marital status, or highest educational degree should be specified as categorical.

Likewise, continuous predictors, like age, systolic blood pressure, or percentage of ground cover should be specified as continuous.

But some numerical predictors aren't continuous. Sometimes it makes sense to treat these as continuous, and sometimes as categorical.

Read the full article →

Count vs. Continuous Variables: Differences Under the Hood

One of the most important concepts in data analysis is that the analysis needs to be appropriate for the scale of measurement of the variable. These decisions about scale tend to focus on level of measurement: nominal, ordinal, interval, and ratio.

These levels of measurement tell you about the amount of information in the variable. But there are other ways of distinguishing the scales that are also important and often overlooked.

Read the full article →

Differences in Model Building Between Explanatory and Predictive Models

Suppose you are asked to create a model that will predict who will drop out of a program your organization offers. You decide you will use a binary logistic regression because your outcome has two values: “0” for not dropping out and “1” for dropping out.

Most of us were trained in building models for the purpose of understanding and explaining the relationships between an outcome and a set of predictors. But model building works differently for purely predictive models. Where do we go from here?

Read the full article →