Issues in Coding Missing Values

October 11th, 2023 by

There’s no mincing words here. Missing values can cause problems for every statistician. That’s true for a lot of reasons, but it can start with something as simple as the choices made when coding missing values in a data set. Here are a few examples.

Example 1: The Null License Plate

Researcher Joseph Tartaro thought it would be funny to get the following California vanity license plate: (more…)

Recoding a Variable from a Survey Question to Use in a Statistical Model

March 18th, 2019 by

Survey questions are often structured without regard for ease of use within a statistical model.

Take, for example, a survey conducted by the Centers for Disease Control and Prevention (CDC) on births in the U.S. One of the variables in the data set is “interval since last pregnancy”. Here is a histogram of the results.


Stata Loops and Macros for Large Data Sets: Quickly Finding Needles in the Hay Stack

August 7th, 2015 by

I recently opened a very large data set titled “1998 California Work and Health Survey” compiled by the Institute for Health Policy Studies at the University of California, San Francisco. There are 1,771 observations and 345 variables. (more…)

R Is Not So Hard! A Tutorial, Part 18: Re-Coding Values

August 29th, 2014 by

One data manipulation task that you need to do in pretty much any data analysis is to recode data. It’s almost never the case that the data are set up exactly the way you need them for your analysis.

In R, you can re-code an entire vector or array at once. To illustrate, let’s set up a vector that has missing values.

A <- c(3, 2, NA, 5, 3, 7, NA, NA, 5, 2, 6)
A

[1]  3  2 NA  5  3  7 NA NA  5  2  6

We can replace all missing values with another number (such as zero) as follows: (more…)
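One common way to do this in base R is logical indexing with is.na(). The following is a minimal sketch of that idea, not necessarily the approach the full tutorial takes. The name B is introduced here only so that the original vector A stays intact.

```r
# Vector with missing values, as defined above
A <- c(3, 2, NA, 5, 3, 7, NA, NA, 5, 2, 6)

# Work on a copy (B is a hypothetical name) so A is left unchanged
B <- A

# is.na(B) is TRUE wherever B is missing; assign 0 to those positions
B[is.na(B)] <- 0

print(B)
# [1] 3 2 0 5 3 7 0 0 5 2 6
```

Keeping A unchanged makes it easy to compare the recoded vector with the original afterwards.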

An Easy Way to Reverse Code Scale Items

June 29th, 2012 by

Before you run a Cronbach’s alpha or factor analysis on scale items, it’s generally a good idea to reverse code items that are negatively worded so that a high value indicates the same type of response on every item.

For example, say you have 20 items, each on a 1 to 7 scale. For most items, a 7 may indicate a positive attitude toward some issue, but for a few items, a 1 indicates a positive attitude.

I want to show you a very quick and easy way to reverse code them using a single command line. This works in any software. (more…)
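A widely used way to reverse code an item is to subtract each response from the scale maximum plus the scale minimum. On a 1 to 7 scale, that is 8 minus the response. The sketch below shows that arithmetic in R. It is an assumption about the kind of one-liner meant here, not necessarily the exact command from the full post, and the item name q5 and its values are made up for illustration.

```r
# Hypothetical responses to a negatively worded item on a 1 to 7 scale
q5 <- c(1, 7, 4, 2, NA, 6)

scale_min <- 1
scale_max <- 7

# Reverse code: 1 becomes 7, 2 becomes 6, ..., 7 becomes 1; NA stays NA
q5_rev <- (scale_max + scale_min) - q5

print(q5_rev)
# [1]  7  1  4  6 NA  2
```

The same subtraction can be written as a compute or generate statement in most statistical packages, since it needs nothing beyond basic arithmetic.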

Recoding Variables in SPSS Menus and Syntax

March 11th, 2011 by

SPSS offers two choices under the recode command: Into Same Variable and Into Different Variables.

The command Into Same Variable replaces existing data with new values, whereas Into Different Variables adds a new variable to the data set.

In almost every situation, you want to use Into Different Variables, because Into Same Variable overwrites the values in the existing variable.

So if you notice a mistake after you’ve recoded, you can’t fix it, because the original values are gone.

And you may not even notice the mistake, because there is nothing left to test the recoded values against.

And that’s just dangerous. (more…)
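The same principle applies outside SPSS. As a rough illustration in R (used here only to match the R example elsewhere on this page), the sketch below recodes into a new column and then cross-tabulates old against new values to check the result. The data frame survey, the columns age and age_group, and the cut points are all hypothetical.

```r
# Hypothetical data: ages of six respondents
survey <- data.frame(age = c(19, 34, 52, 67, 41, 25))

# Recode into a NEW column (age_group) and leave the original age untouched
survey$age_group <- cut(
  survey$age,
  breaks = c(0, 29, 49, Inf),
  labels = c("under 30", "30 to 49", "50 and over")
)

# Because age is still there, the recode can be checked against it
print(table(survey$age, survey$age_group))
```

Each row of the cross-tabulation should have its count in exactly one group column. A count in an unexpected column points to a mistake in the cut points, and the untouched age column lets you redo the recode.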