recode

Getting Started with Stata Tutorial #11: Editing Variables Using recode and recast

May 12th, 2025 by

From our last posts in this series, you should be comfortable with how Stata handles data editing, as well as with making your own variables. In this post, we’ll talk about commands that edit the content or storage type of your variables in Stata: recode and recast. Let’s start off with the recode command.

(more…)


Getting Started with Stata Tutorial #10: Four Commands to Create New Variables in Stata

April 29th, 2025 by

From our last article, you should feel comfortable with the idea of editing and saving data sets in Stata. In this article, we’ll explain how to create new variables in Stata using replace, generate, egen, and clonevar.

(more…)


Getting Started with Stata Tutorial #9: Saving, Reordering, and Dropping Data

March 17th, 2025 by

Stata makes it a breeze to edit or clean your data. If you’re unfamiliar with using data sets in Stata, check out these blog posts to get a good grasp on importing and browsing data in Stata.

For this tutorial we will be using Stata’s “auto” data set. If you haven’t loaded it in yet, type

(more…)


Issues in Coding Missing Values

October 11th, 2023 by

There’s no mincing words here. Missing values can cause problems for every statistician. That’s true for a lot of reasons, but it can start with simple issues of choices made when coding missing values in a data set. Here are a few examples.

Example 1: The Null License Plate

Researcher Joseph Tartaro thought it would be funny to get the following California vanity license plate: (more…)


Recoding a Variable from a Survey Question to Use in a Statistical Model

March 18th, 2019 by

Survey questions are often structured without regard for ease of use within a statistical model.

Take, for example, a survey conducted by the Centers for Disease Control and Prevention (CDC) on childbirths in the U.S. One of the variables in the data set is “interval since last pregnancy”. Here is a histogram of the results.

(more…)


Stata Loops and Macros for Large Data Sets: Quickly Finding Needles in the Hay Stack

August 7th, 2015 by

I recently opened a very large data set titled “1998 California Work and Health Survey” compiled by the Institute for Health Policy Studies at the University of California, San Francisco. There are 1,771 observations and 345 variables. (more…)