Loops in Stata: Making coding easy

by Jeff Meyer


We’ve already discussed using macros in Stata to simplify and shorten code.

Another great tool in your coding tool belt is loops. Loops allow you to run the same command for several variables at one time without having to write separate code for each variable.

This discussion could go on for pages because there is so much you can do with a loop, but I won’t be able to cover it all just now. Uh oh, fifteen minutes to Judge Wapner.

Let me begin by explaining the loop I included at the end of an earlier article on efficient coding in Stata:

(I already explained how the first line of this code is a macro and why it's useful. Read that first if you haven't already.)

local continuous educat exper wage age

foreach var of local continuous {
    graph box `var', saving(`var', replace)
}

The two most common commands to begin a loop are foreach and forvalues.

The foreach command loops through a list, while forvalues loops through numbers. The first line of the loop above is very similar to how you would create a macro.

The line begins with the command foreach followed by the name I want to use to represent each item in the group (here var, chosen exactly as you would choose a macro name). The words “of local” tell Stata that the list it will act on is stored in the local macro whose name follows.

The word continuous is the name of the local macro that I explained previously. Note that after “of local” the macro name is written without the ` and ' symbols. In this situation foreach var of local continuous is the same as foreach var in educat exper wage age. I could use either one in my loop.
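As a quick check of that equivalence, here is a minimal sketch. It assumes a small made-up data set in place of the wages data (the values are invented for illustration), and it uses summarize rather than graph box so that nothing is written to disk.

```stata
* Toy data standing in for the wages data set (values are invented)
clear
set obs 5
generate educat = _n
generate exper  = _n * 2
generate wage   = _n * 10
generate age    = 20 + _n

local continuous educat exper wage age

* Form 1: read the list from the local macro named continuous
foreach var of local continuous {
    summarize `var'
}

* Form 2: type the same list out after the word "in"
foreach var in educat exper wage age {
    summarize `var'
}
```

Both loops should produce the same four summary tables. Run the block as a whole from a do-file, because a local macro only exists for the run that defines it.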

The first line of the loop ends with the open brace “{”. This symbol tells Stata that some action, which starts on the next line, will be performed on each item in the list.

On the second line of the loop I asked Stata to create a box plot of each variable (educat, exper, wage, and age) and save it.

Inside the parentheses of saving is the name I want to use for the saved graph, followed by replace, which tells Stata to overwrite any existing graph with the same name in the directory where I am saving it.

Stata will now create the graph for the first variable in my list and save it.

The closing brace “}” found on the third line tells Stata to return to the beginning, the “{” symbol, and perform the same action on the next variable in the list. Stata continues to do this until all variables have been used.
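The forvalues command mentioned earlier follows the same repeat-until-done pattern, only over numbers instead of a list. A minimal sketch, where the counter name i and the running total are placeholders chosen for illustration:

```stata
* forvalues steps through a range of numbers: here 1, 2, 3
local total = 0
forvalues i = 1/3 {
    display "pass `i'"
    local total = `total' + `i'
}
display "sum of 1 to 3: `total'"
```

Each pass through the braces uses the next number in the range, just as foreach uses the next variable in the list.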

Using Loops to Define Missing Data Codes

Time for one more example. It is not uncommon to open a data set and find missing data coded as “99” or “999” or some other number. Stata recognizes the period “.” as missing data for numeric variables, so to analyze the data set you will have to fix this.

There are at least two commands that can be used to do this: replace and recode. I will give you an example using the command replace.
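To see what each command does before any loop is involved, here is a rough sketch on single variables. The tiny data set is invented for illustration, and 666 is assumed to be the missing-data code.

```stata
* Toy data: 666 stands in for a missing-data code (values are invented)
clear
input educat exper
12  5
666 666
16  8
end

* Option 1: replace, applied to educat only
replace educat = . if educat == 666

* Option 2: recode, applied to exper only
recode exper (666 = .)

list
```

Either line has to be repeated for every variable that uses the code, which is the repetition a loop removes.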

Since we are working with variables I need to start my loop with the command foreach. The next step is to decide upon the name I want to use to represent the group. It is common practice in the Stata “help” files to use the name “var” to represent variables, so I will do the same.

So my code so far looks like this: foreach var. If I were going to list all of the variables one by one, the next word in my code would be in. But I’m going to use a shortcut, a variable list. As a result, the next part of my command is of varlist.

So now my command looks like:

foreach var of varlist

The shortcut for listing a series of variables is the dash, “-”. In this case I type the first variable, followed by the dash, and end with the last variable. Stata then includes every variable stored between those two, in the order they appear in the data set. Using the wages data set I would have educat-Race2.
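One way to see which variables a dash range picks up is to list them with ds before looping. This is only a sketch: the toy data set below is invented, and its variable order is an assumption, not the actual layout of the wages data.

```stata
* Toy data with an assumed variable order (not the real wages data set)
clear
input educat exper wage age Race2
12 5 15 30 1
end

* List every variable stored from educat through Race2
ds educat-Race2
display "`r(varlist)'"
```

If the variables were stored in a different order, the same range would cover a different set, so it is worth checking before running a loop over it.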

The first line of my code is complete:

foreach var of varlist educat-Race2 {

Next I tell Stata what I want it to do to my variables, which is to use “.” instead of the number currently used in the data set for missing data. Assuming 666 is used for missing data, the code for changing it to “.” is:

replace `var' = . if `var' == 666

Remember that the last line of my code must be the closing brace “}”. The complete code to change all missing data to “.” is:

foreach var of varlist educat-Race2 {
    replace `var' = . if `var' == 666
}

When I run this code Stata will take the first variable from the variable list and replace 666 with a period. It will then go to the next variable and work its way through the entire list.
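To try the loop without the wages data, here is a self-contained sketch. The three-variable data set is made up for illustration, and it assumes every variable in the range is numeric (replace with “.” would stop with an error on a string variable).

```stata
* Toy data standing in for the wages data set (values are invented)
clear
input educat exper Race2
12   5   1
666  10  0
16   666 666
end

foreach var of varlist educat-Race2 {
    replace `var' = . if `var' == 666
}

* Check: count how many 666 codes remain in any of the three variables
count if educat == 666 | exper == 666 | Race2 == 666
```

After the loop the count should come back as 0, and list will show “.” wherever 666 used to be.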

It’s as simple as that. Loops can significantly reduce the number of lines of code that you have to write.

Imagine how much time it would take you if you had a hundred variables and you had to write the code for each individual variable.

The fewer lines of code you have, the less time you spend writing it and the fewer chances you have to make mistakes.


Jeff Meyer is a statistical consultant, instructor and writer for The Analysis Factor.

6 comments

Nina

Hi Jeff,
I have been trying to replicate your code for some of the variables in my dataset. and I always get the error ” too few quotes” r(132).

foreach var in `continuous'{
graph box `var’, saving(`var’,replace)
}

I am not sure where the quotes ought to be, do let me know if this is right.

Thank you for your help!


Jeff Meyer

Hi Nina,

It looks like your error is in the “foreach” statement. I see I made the same mistake in the article; my apologies for that.

If you are using a local macro for continuous your code should be:
foreach var of local continuous{

Everything else looks fine. I have found that if I copy and paste from a pdf file, sometimes I have to re-type the ` and ' symbols wherever they are used in the code. If you run “help foreach” you will see the options you have for running the “foreach” statement.

Hope this works!
Jeff


Abdi Billow

Hello Jeff,
Thanks so much for the explicit explanation on the foreach loop. Could you also help me on the other loops like For… Next. For Each… Next. Do….Do While and Do Until and While… Wend. Thanks


Jeff Meyer

Hi,
I’m not familiar with Do While and Do Until and While. You can create loops within loops using both foreach and forvalues. The inner loop runs through all of its values for the first value of the outer loop, then runs through them again for the second value of the outer loop, and so on. I find it helpful to construct a flow chart to think through the process of what I want to create and then use the foreach and forvalues structure to run it.

Jeff


Majda

Very Useful blog !
I just have a question concerning the following code :

foreach var in `continuous'{
graph box `var’, saving(`var’,replace)
}

Once it is done how do you display the graphs ? I know that they are in memory but where ?

Thank you for your help


Jeff Meyer

Hi, glad you found it useful. To put your graphs in a location on your computer where you can find them, add a line of code above the loop that changes the directory to where you want the graphs saved. An example is:
cd "C:\graphs"

You will need to create the folder first unless you use a folder that is already in existence.

