continuous variable

The Impact of Removing the Constant from a Regression Model: The Categorical Case

December 9th, 2016 by Jeff Meyer

In a simple linear regression model, how the constant (a.k.a., intercept) is interpreted depends upon the type of predictor (independent) variable.

If the predictor is categorical and dummy-coded, the constant is the mean value of the outcome variable for the reference category only. If the predictor variable is continuous, the constant equals the predicted value of the outcome variable when the predictor variable equals zero.
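Both interpretations can be checked with a small sketch. The data below are made up for illustration, and simple_ols is a hypothetical helper written for this example, not a function from any statistics package.

```python
from statistics import mean


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical dummy-coded predictor: 0 = reference category, 1 = comparison category.
group = [0, 0, 0, 1, 1, 1]
outcome = [10.0, 12.0, 14.0, 20.0, 22.0, 24.0]

intercept, slope = simple_ols(group, outcome)
reference_mean = mean(y for g, y in zip(group, outcome) if g == 0)
comparison_mean = mean(y for g, y in zip(group, outcome) if g == 1)

# The constant matches the reference-category mean; constant + slope matches the other mean.
print(intercept, reference_mean)
print(intercept + slope, comparison_mean)

# Hypothetical continuous predictor generated from y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
cont_intercept, cont_slope = simple_ols(x, y)

# The constant is the predicted outcome at x = 0, even though no observation has x = 0.
print(cont_intercept, cont_slope)
```

In the dummy-coded case the slope is the difference between the two category means, which is why the constant alone describes only the reference category.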

Removing the Constant When the Predictor Is Categorical

When your predictor variable X is categorical, the results are logical. Let’s look at an example. (more…)

Member Training: Working with Truncated and Censored Data

July 1st, 2016 by Jeff Meyer

Statistically speaking, when we see a continuous outcome variable we often worry about outliers and how these extreme observations can impact our model.

But have you ever had an outcome variable with no outliers because there was a boundary value at which accurate measurements couldn’t be or weren’t recorded?

Examples include:

Income data where all values above $100,000 are recorded as $100k or greater
Soil toxicity ratings where the device cannot measure values below 1 ppm
Number of arrests where there are no zeros because the data set came from police records where all participants had at least one arrest

These are all examples of data that are truncated or censored. Failing to account for the truncation or censoring will produce biased estimates.
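To get a feel for the size of the problem, here is a minimal simulation sketch based on the income example. The log-normal shape and its parameters are assumptions chosen only for illustration. Censoring is modeled as recording any value above the cap as the cap itself, and truncation as dropping those observations entirely.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

CAP = 100_000  # top-coding threshold from the income example

# Hypothetical "true" incomes; distribution and parameters are assumptions.
true_income = [random.lognormvariate(11, 0.6) for _ in range(10_000)]

# Censored: high earners stay in the data, but their income is recorded as the cap.
censored = [min(v, CAP) for v in true_income]

# Truncated: high earners never make it into the data set at all.
truncated = [v for v in true_income if v <= CAP]

print(f"n true / censored / truncated: {len(true_income)} / {len(censored)} / {len(truncated)}")
print(f"mean of true incomes:      {mean(true_income):,.0f}")
print(f"mean of censored incomes:  {mean(censored):,.0f}")
print(f"mean of truncated incomes: {mean(truncated):,.0f}")
```

Under these assumptions both naive means fall below the true mean, and the truncated mean falls below the censored one, because the truncated sample has lost its highest earners altogether.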

This webinar will discuss what truncated and censored data are and how to identify them.

There are several different models that are used with this type of data. We will go over each model and discuss which type of data is appropriate for each model.

We will then compare the results of models that account for truncated or censored data to those that do not. From this you will see what possible impact the wrong model choice has on the results.

Note: This training is an exclusive benefit to members of the Statistically Speaking Membership Program and part of the Stat’s Amore Trainings Series. Each Stat’s Amore Training is approximately 90 minutes long.

(more…)

Member Training: Confidence Intervals

April 6th, 2015 by Karen Grace-Martin

A Science News article from July 2014 was titled “Scientists’ grasp of confidence intervals doesn’t inspire confidence.” Perhaps that is why only 11% of the articles in the 10 leading psychology journals in 2006 reported confidence intervals in their statistical analysis.

How important is it to be able to create and interpret confidence intervals?

The American Psychological Association Publication Manual, which sets the editorial standards for over 1,000 journals in the behavioral, life, and social sciences, has begun emphasizing parameter estimation and de-emphasizing Null Hypothesis Significance Testing (NHST).

Its most recent edition, the sixth, published in 2009, states “estimates of appropriate effect sizes and confidence intervals are the minimum expectations” for published research.

In this webinar, we’ll clear up the ambiguity about what exactly a confidence interval is and how to interpret one in table and graph formats. We will also explore how confidence intervals are calculated for continuous and dichotomous outcome variables in various types of samples, and look at the impact sample size has on the width of the interval. We’ll also discuss related concepts such as equivalence testing.
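As a rough preview of the calculations, here is a minimal sketch using the normal approximation with a 95% critical value of 1.96. The function names mean_ci and proportion_ci are hypothetical, the data are simulated, and the proportion interval is the simple Wald form, which is only one of several methods and not necessarily the one covered in the webinar.

```python
import math
import random
from statistics import mean, stdev

Z_95 = 1.96  # normal-approximation critical value for a 95% interval (assumption)


def mean_ci(sample, z=Z_95):
    """Approximate confidence interval for the mean of a continuous outcome."""
    m = mean(sample)
    half_width = z * stdev(sample) / math.sqrt(len(sample))
    return m - half_width, m + half_width


def proportion_ci(successes, n, z=Z_95):
    """Wald confidence interval for the proportion of a dichotomous outcome."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


random.seed(1)  # fixed seed so the sketch is reproducible

# Simulated samples from the same hypothetical population, differing only in size.
small = [random.gauss(50, 10) for _ in range(25)]
large = [random.gauss(50, 10) for _ in range(2500)]

small_lo, small_hi = mean_ci(small)
large_lo, large_hi = mean_ci(large)
print(f"n=25   width: {small_hi - small_lo:.2f}")
print(f"n=2500 width: {large_hi - large_lo:.2f}")

# Hypothetical dichotomous outcome: 40 successes out of 100 trials.
prop_lo, prop_hi = proportion_ci(40, 100)
print(f"proportion interval: {prop_lo:.3f} to {prop_hi:.3f}")
```

Because the half-width is divided by the square root of the sample size, a sample 100 times larger gives an interval roughly one tenth as wide.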

By the end of the webinar, we anticipate your grasp of confidence intervals will inspire confidence.

(more…)

When a Variable’s Level of Measurement Isn’t Obvious

July 14th, 2014 by Karen Grace-Martin

A central concept in statistics is the level of measurement of a variable. It’s so important to everything you do with data that it’s usually taught within the first week in every intro stats class.

But even something so fundamental can be tricky once you start working with real data. (more…)

When Can Count Data be Considered Continuous?

January 13th, 2012 by Karen Grace-Martin

Last month I did a webinar on Poisson and negative binomial models for count data. With a few hundred participants, we ran out of time to get through all the questions, so I’m answering some of them here on the blog.

These questions all relate to when it’s appropriate to treat count data as continuous and run the more familiar and simpler linear model.

Q: Do you have any guidelines or rules of thumb as far as how many discrete values an outcome variable can take on before it makes more sense to just treat it as continuous?

The issue usually isn’t a matter of how many values there are. (more…)

Beyond Median Splits: Meaningful Cut Points

June 26th, 2009 by Karen Grace-Martin

I’ve talked a bit about the arbitrary nature of median splits and all the information they just throw away.

But I have found that as a data analyst, it is incredibly freeing to be able to choose whether to make a variable continuous or categorical and to make the switch easily. Essentially, this means you need to be (more…)
