This month’s Topic Webinar

Working with Truncated and Censored Data
with Jeff Meyer

Statistically speaking, when we see a continuous outcome variable we often worry about outliers and how these extreme observations can impact our model.

But have you ever had an outcome variable with no outliers because there was a boundary value at which accurate measurements couldn’t be or weren’t recorded?

Examples include:

  • Income data where all values above $100,000 are recorded as $100k or greater
  • Soil toxicity ratings where the device cannot measure values below 1 ppm
  • Number of arrests where there are no zeros because the data set came from police records where all participants had at least one arrest

These are all examples of data that is truncated or censored. Failing to account for the truncation or censoring will produce biased estimates.
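To make the bias concrete, here is a minimal simulation sketch based on the income example. The lognormal parameters and sample size are made-up assumptions for illustration, not real income data. It contrasts censoring (the observation is kept but recorded at the boundary) with truncation (the observation never enters the data set) and compares the naive means.

```python
import random
import statistics

random.seed(0)

# Hypothetical "true" incomes; the lognormal parameters are illustrative only.
true_income = [random.lognormvariate(11.0, 0.6) for _ in range(20_000)]

LIMIT = 100_000  # boundary value from the income example

# Censored: every observation is kept, but values above the limit
# are recorded as the limit itself.
censored = [min(x, LIMIT) for x in true_income]

# Truncated: observations above the limit are missing from the data entirely.
truncated = [x for x in true_income if x <= LIMIT]

true_mean = statistics.fmean(true_income)
censored_mean = statistics.fmean(censored)
truncated_mean = statistics.fmean(truncated)

print(f"rows kept: censored={len(censored)}, truncated={len(truncated)}")
print(f"true mean:      {true_mean:,.0f}")
print(f"censored mean:  {censored_mean:,.0f}")
print(f"truncated mean: {truncated_mean:,.0f}")
```

Both naive means fall below the true mean, and the truncated sample also loses rows. A quick way to tell the two apart in practice is to look for a spike of observations sitting exactly at the boundary (censoring) versus no observations at or beyond it (truncation).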

This webinar will discuss what truncated and censored data is and how to identify it.

There are several different models that are used with this type of data. We will go over each model and discuss which type of data is appropriate for each model.

We will then compare the results of models that account for truncated or censored data to those that do not. From this you will see how much the wrong model choice can affect the results.

About the instructor

Jeff Meyer is a statistical consultant, instructor and writer for the Analysis Factor.

Jeff has an MBA from the Thunderbird School of Global Management and an MPA with a focus on policy from NYU Wagner School of Public Service.

Topic Webinar: Wed, July 20, 2016 3:00 PM EDT (check day & time in your area)

Note: this webinar is available to Data Analysis Brown Bag members.

Could you use some affordable ongoing statistical training with the opportunity to ask questions about statistical topics? Consider joining our Data Analysis Brown Bag program.



June 2016 Topic Webinar: Zero Inflated Models

This webinar will explore two ways of modeling zero-inflated data: the Zero Inflated model and the Hurdle model. Both assume there are two different processes: one that affects the probability of a zero and one that affects the actual values, and both allow different sets of predictors for each process.
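The two-process idea can be sketched as a small data-generating simulation. This is only an illustration under assumed settings: the Poisson count process, the 0.3 zero probability, and the rate of 2.0 are hypothetical choices, and the function names are made up for this example. In the zero-inflated version, zeros can come from either process; in the hurdle version, every zero comes from the first process and the second process produces only positive counts.

```python
import math
import random

random.seed(1)

def poisson(lam):
    """Draw one Poisson(lam) value using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= threshold:
            return k
        k += 1

def zero_inflated(p_structural_zero, lam):
    # Process 1: with some probability the unit is an "always zero".
    if random.random() < p_structural_zero:
        return 0
    # Process 2: otherwise draw an ordinary count, which can itself be zero.
    return poisson(lam)

def hurdle(p_zero, lam):
    # Process 1: a single yes/no step decides zero versus positive.
    if random.random() < p_zero:
        return 0
    # Process 2: positive counts come from a Poisson truncated at zero
    # (redraw until the value is positive).
    while True:
        y = poisson(lam)
        if y > 0:
            return y

n = 10_000
zi = [zero_inflated(0.3, 2.0) for _ in range(n)]
hu = [hurdle(0.3, 2.0) for _ in range(n)]

zi_zero_share = zi.count(0) / n
hu_zero_share = hu.count(0) / n
print(f"share of zeros, zero-inflated: {zi_zero_share:.3f}")
print(f"share of zeros, hurdle:        {hu_zero_share:.3f}")
```

With the same settings, the zero-inflated data contain more zeros than the hurdle data, because the count process adds its own zeros on top of the structural ones.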

Read the full article →

Incorporating Graphs in Regression Diagnostics with Stata

In our upcoming Linear Models in Stata workshop, we will explore ways to find observations that influence the model. This is done in Stata via post-estimation commands. As the name implies, all post-estimation commands are run after running the model (regression, logit, mixed, etc.)…

Read the full article →

Free May Craft of Statistical Analysis Webinar: Unlocking the Power of Stata’s Macros and Loops

There are many steps to analyzing a dataset. One of the first steps is to create tables and graphs of your variables in order to understand what is behind the thousands of numbers on your screen. But the type of table and graph you create depends upon the type of variable you are looking at…

Read the full article →

Linear Regression in Stata: Missing Data and the Stories it Might Tell

In a previous blog post we examined how to use the same sample when comparing the differences among regression models. Using different samples in our models could lead to erroneous conclusions when interpreting our models. But excluding observations can also result in inaccurate results…

Read the full article →

Issues with Truncated Data

Can we ignore the fact that a variable is bounded and just run our analysis as if the data wasn’t bounded?

Read the full article →

May 2016 Topic Webinar: Communicating Statistical Results: When to use tables vs graphs to tell the data’s story

In this webinar, we will discuss when tables and graphs are (and are not) appropriate and how people tend to engage with each of these media…

Read the full article →

April 2016 Topic Webinar: An Introduction to Kaplan-Meier Curves

In this talk, you will see a simple example of this using fruit fly data, and learn how to interpret the Kaplan-Meier curve to estimate survival probabilities and survival percentiles…

Read the full article →

The four models you meet in Structural Equation Modeling

In a previous post (Why do I need to have knowledge of multiple regression to understand SEM?) we showed how a multiple regression model could be conceptualized using Structural Equation Model path diagrams. That’s the simplest SEM you can create, but its real power lies in expanding on that regression model. Here I will discuss 4 ways to do that…

Read the full article →

Zero One Inflated Beta Models for Proportion Data

Like logistic and Poisson regression, beta regression is a type of generalized linear model. It works nicely for proportion data because the values of a variable with a beta distribution must fall between 0 and 1. It’s a bit of a funky distribution in that its shape can change a lot depending on the values of the mean and dispersion parameters. Here are a few examples of the possible shapes of a beta distribution, with different means and variances…
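As a rough numerical sketch of how the shape depends on those parameters, the snippet below assumes the common mean/precision parameterization, where the two shape parameters are mean × precision and (1 − mean) × precision, and higher precision means less dispersion. The helper names and the four settings are hypothetical, chosen only for illustration.

```python
import random
import statistics

random.seed(2)

def beta_shapes(mean, precision):
    """Convert a mean in (0, 1) and a precision > 0 to the two shape parameters."""
    return mean * precision, (1.0 - mean) * precision

def describe(a, b):
    """Give a coarse verbal label for the density's shape."""
    if a < 1 and b < 1:
        return "U-shaped"
    if a > 1 and b > 1:
        return "single interior peak"
    return "monotone or flat"

# (mean, precision) pairs chosen only to show different shapes.
settings = [(0.5, 0.5), (0.5, 10.0), (0.2, 10.0), (0.8, 3.0)]

results = []
for mean, precision in settings:
    a, b = beta_shapes(mean, precision)
    draws = [random.betavariate(a, b) for _ in range(5_000)]
    results.append((mean, describe(a, b), statistics.fmean(draws), draws))
    print(f"mean={mean}, precision={precision}: a={a:.2f}, b={b:.2f}, "
          f"{describe(a, b)}, sample mean={statistics.fmean(draws):.3f}")
```

Every simulated value stays within the 0 to 1 range, while the same mean of 0.5 gives a U shape at low precision and a single peak at high precision.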

Read the full article →