From our last posts in this series, you should be comfortable with how Stata handles data editing, as well as with making your own variables. In this post, we’ll talk about commands that edit the content or storage type of your variables in Stata: recode and recast. Let’s start off with the recode command.
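As a quick preview, here is a minimal sketch of both commands using Stata's built-in auto data set. The new variable name rep3 and the value labels are made up for illustration, so treat this as one possible usage rather than the only way to do it.

```stata
* Load the example data set that ships with Stata
sysuse auto, clear

* recode: collapse the five-point repair record into three groups,
* storing the result in a new variable (rep3 is a hypothetical name)
recode rep78 (1 2 = 1 "Poor") (3 = 2 "Average") (4 5 = 3 "Good"), generate(rep3)
tabulate rep78 rep3, missing

* recast: change the storage type of a variable without changing its values
describe mpg
recast float mpg
describe mpg
```

Using generate() leaves the original rep78 untouched, which makes it easy to check the new coding against the old one with tabulate.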
(more…)
From our last article, you should feel comfortable with the idea of editing and saving data sets in Stata. In this article, we’ll explain how to create new variables in Stata using replace, generate, egen, and clonevar.
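The sketch below shows one plausible use of each of the four commands on the auto data set. The new variable names (price_k, mean_mpg, mean_mpg_origin, rep78_copy) are invented for this example.

```stata
* Load the example data set that ships with Stata
sysuse auto, clear

* generate: create a new variable from an expression
generate price_k = price / 1000

* replace: overwrite the values of a variable that already exists
replace price_k = round(price_k, 0.1)

* egen: create a variable with functions that work across observations
egen mean_mpg = mean(mpg)
egen mean_mpg_origin = mean(mpg), by(foreign)

* clonevar: copy a variable along with its labels and storage type
clonevar rep78_copy = rep78
```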
(more…)
Stata makes it a breeze to edit or clean your data. If you’re unfamiliar with using data sets in Stata, check out these blog posts to get a good grasp on importing and browsing data in Stata.
For this tutorial we will be using Stata’s “auto” data set. If you haven’t loaded it in yet, type
(more…)
Once you’ve imported your data into Stata, the next step is usually to examine it.
Before you work on building a model or running any tests, you need to understand your data. Ask yourself these questions:
- Is every variable marked as the appropriate type?
- Are missing observations coded consistently and marked as missing?
- Do I want to exclude any variables or data points?
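A few commands can help answer each of these questions. This is only a sketch on the auto data set, and the choice of variables (rep78, trunk) is arbitrary.

```stata
* Load the example data set that ships with Stata
sysuse auto, clear

* Is every variable the appropriate type? Check storage types and labels
describe
codebook rep78

* Are missing observations coded consistently? Count missing values per variable
misstable summarize
list make price if missing(rep78)

* Do I want to exclude anything? These change only the copy in memory
drop if missing(rep78)
drop trunk
```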
(more…)
In our previous posts, we’ve relied on Stata’s pre-loaded datasets to perform analyses. But when you’re working with your own data, you’ll need to know how to import it into Stata.
To demonstrate how this process works, we will use the Iris dataset from UCI.
Download the dataset, then move it to whichever directory you intend to use for Stata files.
There are three main ways of importing data in Stata: use the menus to import the data, refer to the dataset by its full file path, or change your working directory to the one containing your data and then refer to the dataset by its file name. (more…)
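The two command-line approaches might look like the sketch below. The folder path is hypothetical, and the example assumes the downloaded file is named iris.data and is a comma-separated file with no header row, so adjust both to match your own download.

```stata
* Using the menus: look under File > Import for the delimited text option

* Referring to the file by its full location (hypothetical path)
import delimited using "C:/Users/me/Documents/stata/iris.data", varnames(nonames) clear

* Changing the working directory first, then using just the file name
cd "C:/Users/me/Documents/stata"
import delimited using "iris.data", varnames(nonames) clear

* Confirm what was read in
describe
```

Changing the directory once is usually less typing when you plan to load or save several files in the same folder.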
If you’ve tried coding in Stata, you may have found it strange. The syntax rules are straightforward, but different from what I expected.
I had experience coding in Java and R before I ever used Stata. Because of this, I expected commands to be followed by parentheses, which make the structure of the code easy to read.
Stata does not work this way.
An Example of How Stata Code Works
To see the way Stata handles a linear regression, go to the command line and type
h reg or help regress
You will see a help page pop up, with this Syntax line near the top.
(If you need a refresher on getting help in Stata, watch this video by Jeff Meyer.)

This is typical of how Stata code looks. (more…)
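For reference, the syntax line on the regress help page follows the pattern regress depvar [indepvars] [if] [in] [weight] [, options], where the pieces in square brackets are optional. A short sketch on the auto data set shows how that pattern translates into commands; the particular variables and the level(90) option are chosen only for illustration.

```stata
* Load the example data set that ships with Stata
sysuse auto, clear

* command, then the dependent variable, then the independent variables
regress mpg weight foreign

* The same pattern with an if qualifier, and an option after the comma
regress mpg weight if foreign == 0, level(90)
```

Note that there are no parentheses around the arguments: spaces separate the variables, and a single comma separates the main command from its options.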