linear regression

Understanding Interactions Between Categorical and Continuous Variables in Linear Regression

May 14th, 2018 by

We’ve looked at the interaction effect between two categorical variables. Now let’s make things a little more interesting, shall we?

What if our predictors of interest, say, are a categorical and a continuous variable? How do we interpret the interaction between the two? (more…)
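As a rough sketch of what such an interaction means (it is not taken from the full post): when the categorical predictor has two levels, the model y = b0 + b1*x + b2*g + b3*x*g amounts to fitting a separate line for each group, and the interaction coefficient b3 is the difference between the two slopes. The data, the variable names and the `fit_line` helper below are all made up for illustration.

```python
def fit_line(x, y):
    """Simple least-squares line; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data: one continuous predictor, one two-level factor.
x = [1.0, 2.0, 3.0, 4.0]
y_control = [3.0, 4.0, 5.0, 6.0]      # reference group (g = 0)
y_treated = [4.0, 7.0, 10.0, 13.0]    # comparison group (g = 1)

int_control, slope_control = fit_line(x, y_control)
int_treated, slope_treated = fit_line(x, y_treated)

# Coefficients of y = b0 + b1*x + b2*g + b3*x*g, recovered from the two lines.
b0 = int_control                  # intercept of the reference group
b1 = slope_control                # slope of x in the reference group
b2 = int_treated - int_control    # group difference when x = 0
b3 = slope_treated - slope_control  # interaction: difference in slopes

print(b0, b1, b2, b3)
```

Read this way, b1 is no longer "the" effect of x. It is the effect of x in the reference group only, and b1 + b3 is the effect in the other group.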


Why ANOVA is Really a Linear Regression, Despite the Difference in Notation

April 23rd, 2018 by

When I was in graduate school, stat professors would say “ANOVA is just a special case of linear regression.” But they never explained why.

And I couldn’t figure it out.

The model notation is different.

The output looks different.

The vocabulary is different.

The focus of what we’re testing is completely different. How can they be the same model?

(more…)
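One way to see the equivalence for yourself is a minimal sketch like the following, which assumes a one-way design with dummy (reference) coding. It computes the ANOVA F statistic from sums of squares, then fits a regression on dummy variables and computes the regression F. The data and the `solve` and `ols` helpers are hypothetical and written only for this illustration.

```python
# Hypothetical outcome values for three groups (illustration only).
groups = {"A": [4.0, 5.0, 6.0], "B": [7.0, 8.0, 9.0], "C": [1.0, 2.0, 3.0]}

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)

y = [v for vals in groups.values() for v in vals]
n, k = len(y), len(groups)
grand_mean = sum(y) / n

# One-way ANOVA: between-group and within-group sums of squares.
ss_between = sum(len(v) * (sum(v) / len(v) - grand_mean) ** 2 for v in groups.values())
ss_within = sum((obs - sum(v) / len(v)) ** 2 for v in groups.values() for obs in v)
f_anova = (ss_between / (k - 1)) / (ss_within / (n - k))

# Regression: intercept plus dummy variables for B and C (A is the reference).
X = [[1.0, float(name == "B"), float(name == "C")]
     for name, vals in groups.items() for _ in vals]
coefs = ols(X, y)
fitted = [sum(c * xi for c, xi in zip(coefs, row)) for row in X]
ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
ss_model = sum((fi - grand_mean) ** 2 for fi in fitted)
f_reg = (ss_model / (k - 1)) / (ss_resid / (n - k))

print(coefs)           # intercept and the two dummy coefficients
print(f_anova, f_reg)  # the two F statistics
```

With this toy data, the intercept is the mean of group A (5) and the dummy coefficients are the differences of the B and C means from it (3 and -3), up to floating-point rounding. The fitted values are the group means, so the residual sum of squares equals the within-group sum of squares and the two F statistics agree.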


Getting Accurate Predicted Counts When There Are No Zeros in the Data

March 12th, 2018 by

We previously examined why linear regression and negative binomial regression were not viable models for predicting the expected length of stay in the hospital for people with the flu. A linear regression model was not appropriate because our outcome variable, length of stay, is a discrete count rather than a continuous measure.

A negative binomial model wasn’t the proper choice because the minimum length of stay is one day, not zero. Standard negative binomial and Poisson models assume that a count of zero is possible for every observation.

We need to use a truncated negative binomial model to analyze the expected length of stay of people admitted to the hospital who have the flu. Calculating the expected length of stay is an easy task once we create our model. (more…)
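To give a feel for that calculation, here is a minimal sketch under stated assumptions: an NB2 parameterization with untruncated mean mu and dispersion alpha (variance mu + alpha*mu^2), and a log link. Under those assumptions the probability of a zero is (1 + alpha*mu)^(-1/alpha), and the expected count given that it is at least one is mu divided by one minus that probability. The coefficient values and function names are hypothetical, not output from the model in the post.

```python
import math

def nb_prob_zero(mu, alpha):
    """P(Y = 0) for a negative binomial with mean mu and dispersion alpha (NB2)."""
    return (1.0 + alpha * mu) ** (-1.0 / alpha)

def zero_truncated_nb_mean(mu, alpha):
    """E[Y | Y > 0]: the untruncated mean rescaled by the probability of a nonzero count."""
    return mu / (1.0 - nb_prob_zero(mu, alpha))

# Hypothetical log-link coefficients (illustration only).
intercept, coef_private = 1.2, -0.3
alpha = 0.5

for private in (0, 1):
    mu = math.exp(intercept + coef_private * private)  # untruncated mean
    print(private, round(mu, 3), round(zero_truncated_nb_mean(mu, alpha), 3))
```

The truncated mean is always larger than mu, because the zeros that would pull the average down cannot occur. Simply exponentiating the linear predictor would therefore understate the expected length of stay.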


Member Training: Using Transformations to Improve Your Linear Regression Model

March 5th, 2018 by

Transformations don’t always help, but when they do, they can improve your linear regression model in several ways simultaneously.

They can help you better meet the linear regression assumptions of normality and homoscedasticity (i.e., equal variances). They can also help avoid some of the artifacts caused by boundary limits in your dependent variable, and sometimes even remove a difficult-to-interpret interaction.

(more…)


The Problem with Linear Regression for Count Data

February 26th, 2018 by

Imagine this scenario:

This year’s flu strain is very vigorous. The number of people checking in at hospitals is rapidly increasing. Hospitals are desperate to know if they have enough beds to handle those who need their help.

You have been asked to analyze the length of stay of people who were admitted to the hospital with the flu in a previous year. The predictors in your data set are the age group, gender and race of those admitted. You also have an indicator of whether the hospital was privately or publicly run.

(more…)


Member Training: Quantile Regression: Going Beyond the Mean

September 1st, 2017 by

In your typical statistical work, chances are you have already used quantiles such as the median or the 25th and 75th percentiles as descriptive statistics.

But did you know quantiles are also valuable in regression, where they can answer a broader set of research questions than standard linear regression?

In standard linear regression, the focus is on estimating the mean of a response variable given a set of predictor variables.

In quantile regression, we can go beyond the mean of the response variable. Instead, we can understand how predictor variables predict (1) the entire distribution of the response variable or (2) one or more relevant features (e.g., center, spread, shape) of this distribution.

For example, quantile regression can help us understand not only how age predicts the mean or median income, but also how age predicts the 75th or 25th percentile of the income distribution.

Or we can see how the inter-quartile range (the distance between the 25th and 75th percentiles) is affected by age. Perhaps the range becomes wider as age increases, signaling that an increase in age is associated with an increase in income variability.
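To make the idea of targeting a quantile rather than the mean concrete, here is a small sketch of the check (pinball) loss that quantile regression minimizes. For simplicity it fits only a constant, with no predictors, in which case the minimizer is a sample quantile. The income values and function names are made up for illustration.

```python
def pinball_loss(q, tau, ys):
    """Check loss: under-predictions are weighted by tau, over-predictions by 1 - tau."""
    return sum(tau * (y - q) if y >= q else (1 - tau) * (q - y) for y in ys)

def best_constant(tau, ys):
    """The observed value that minimizes the check loss for quantile level tau."""
    return min(sorted(ys), key=lambda q: pinball_loss(q, tau, ys))

# Hypothetical incomes, in thousands (illustration only).
incomes = [20, 25, 30, 40, 55, 70, 120]

q25 = best_constant(0.25, incomes)
q50 = best_constant(0.50, incomes)
q75 = best_constant(0.75, incomes)
print(q25, q50, q75, q75 - q25)
```

A quantile regression model replaces the constant with a linear function of the predictors and minimizes the same loss, once for each quantile level of interest. Comparing the fitted 25th and 75th percentile lines is then one way to see whether the inter-quartile range widens with age.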

In this webinar, we will help you become familiar with the power and versatility of quantile regression by discussing topics such as:

  • Quantiles – a brief review of their computation, interpretation and uses;
  • Distinction between conditional and unconditional quantiles;
  • Formulation and estimation of conditional quantile regression models;
  • Interpretation of results produced by conditional quantile regression models;
  • Graphical displays for visualizing the results of conditional quantile regression models;
  • Inference and prediction for conditional quantile regression models;
  • Software options for fitting quantile regression models.

Join us on this webinar to understand how quantile regression can be used to expand the scope of research questions you can address with your data.


Note: This training is an exclusive benefit to members of the Statistically Speaking Membership Program and part of the Stat’s Amore Trainings Series. Each Stat’s Amore Training is approximately 90 minutes long.

(more…)