Loops in Stata: Making coding easy

We’ve already discussed using macros in Stata to simplify and shorten code.

Another great tool in your coding tool belt is loops. Loops allow you to run the same command for several variables at one time without having to write separate code for each variable.

This discussion could go on for pages and pages because there is so much you can do with a loop. But I won’t be able to do that just now. Uh oh, fifteen minutes to Judge Wapner.

Let me begin by explaining the loop I included at the end of the earlier macro article on efficient coding in Stata:

(I already explained how the first line of this code is a macro and why it’s useful. Read that first if you haven’t already).

local continuous educat exper wage age

foreach var of local continuous {
    graph box `var', saving(`var', replace)
}

The two most common commands to begin a loop are foreach and forvalues.

The foreach command loops through a list, while forvalues loops through a range of numbers. The first line of the loop above is very similar to how you would create a macro.
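The examples in this article all use foreach. As a minimal sketch of forvalues, assuming nothing beyond base Stata (the counter name i and the local macro total are illustrative names, not from the article):

```stata
* Loop over the integers 1 through 3 and keep a running total
local total = 0
forvalues i = 1/3 {
    display "i = `i'"
    local total = `total' + `i'
}
* The three passes add 1 + 2 + 3
display "total = `total'"
```

The range 1/3 means "from 1 to 3 in steps of 1"; a range such as 10(10)30 would step through 10, 20, and 30.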

The line begins with the command foreach followed by the name I want to use to represent each member of the group (chosen exactly the same way as a macro name). The words “of local” tell Stata that the list to loop over is stored in the local macro whose name follows.

The name continuous refers to the local macro that I explained previously. In this situation foreach var of local continuous is the same as foreach var in educat exper wage age. I could use either one in my loop.
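The two forms can be compared side by side. This is a sketch only: because the wages data set is not reproduced here, it assumes the auto data set that ships with Stata as a stand-in, and the variables price, mpg, and weight come from that data set rather than from the article.

```stata
* Stand-in data (an assumption): the auto data set shipped with Stata
sysuse auto, clear

* Store the list once in a local macro
local continuous price mpg weight

* Form 1: "of local" takes the macro NAME, typed without quotes
foreach var of local continuous {
    summarize `var'
}

* Form 2: "in" takes the list itself, typed out
foreach var in price mpg weight {
    summarize `var'
}
```

Both loops should produce the same three summaries. One thing to watch: a local macro only exists within the do-file, or the batch of selected lines, in which it was defined, so the local line and the loop need to be run together.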

The first line of the loop ends with the open brace “{”. This symbol tells Stata that some action, which starts on the next line, will be performed on each member of the group named on that line.

On the second line of the loop I asked Stata to create a box plot for each of the variables educat, exper, wage, and age and to save each one.

Inside the parentheses of saving() is the name I want to give the saved graph, here the variable’s own name, followed by the replace option, which overwrites any existing graph with the same name in the directory where I am saving it.

Stata will now create the graph for the first variable in my list and save it.

The closing brace “}” on the third line tells Stata to return to the opening brace “{” and perform the same action on the next variable in the list. Stata continues to do this until all variables have been used.

Using Loops to Define Missing Data Codes

Time for one more example. It is not uncommon to open up a data set and find the code for missing data to be “99” or “999” or some other number. Stata uses the period, “.”, as its missing value for numeric variables. So before you analyze the data set you will have to fix this.

There are at least two commands that can be used to do this, replace and recode. I will give you an example using the command replace.
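For reference before the replace walkthrough, here is a hedged sketch of the recode route. The three-row data set is invented purely for illustration; only the variable names educat, exper, and Race2 and the missing code 666 are borrowed from the example that follows.

```stata
* Toy data invented for illustration: 666 marks a missing value
clear
input educat exper Race2
12   5   1
666 10   0
16  666 666
end

* recode accepts a variable list, so no loop is needed here:
* every 666 in educat through Race2 becomes system missing
recode educat-Race2 (666 = .)

* mvdecode (not mentioned in the article) is another option:
* mvdecode educat-Race2, mv(666)

list
```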

Since we are working with variables I need to start my loop with the command foreach. Next step is to decide upon the name I want to use to represent the group. It is common practice in the Stata “help” files to use the name “var” to represent variables so I will do the same.

So my code so far looks like this: foreach var. If I were going to list all of the variables one by one, the next word in my code would be in. But I’m going to use a shortcut and supply a variable list, so the next part of my command is of varlist.

So now my command looks like:

foreach var of varlist

The shortcut for listing a series of variables is the dash, “-”. I type the first variable, then the dash, then the last variable, and Stata includes every variable stored between them in the data set. Using the wages data set I would have educat-Race2.
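Because a dash range follows the order in which variables are stored, it can help to check what a range covers before looping over it. A small sketch, again assuming the auto data set that ships with Stata as a stand-in (the local name covered is illustrative):

```stata
* Stand-in data (an assumption): the auto data set shipped with Stata
sysuse auto, clear

* List the variables covered by the range price-rep78, in storage order
ds price-rep78

* ds leaves the expanded list in r(varlist); keep a copy in a local
local covered `r(varlist)'
display "`covered'"
```

In auto, the variables price, mpg, and rep78 are stored next to each other, so the range expands to those three.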

The first line of my code is complete:

foreach var of varlist educat-Race2 {

Next I tell Stata what I want it to do to my variables, which is use “.” instead of the number currently being used in the data set for missing data. Assuming 666 is used for missing data, the code for changing it to “.” is:

replace `var' = . if `var' == 666

Remember that the last line of my code must be the closing brace “}”. The complete code to change all missing data to “.” is:

foreach var of varlist educat-Race2 {
    replace `var' = . if `var' == 666
}

When I run this code Stata will take the first variable from the variable list and replace 666 with a period. It will then go to the next variable and work its way through the entire list.
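A self-contained version of this loop can be tried on a toy data set. Everything in the data below is invented for illustration, including the string variable region, which is added only to show one hedge: comparing a string variable with 666 stops the loop with a type mismatch error, so this sketch skips non-numeric variables.

```stata
* Toy data invented for illustration; region is a hypothetical string variable
clear
input educat exper Race2 str6 region
12   5   1   "north"
666 10   0   "south"
16  666 666  "west"
end

foreach var of varlist educat-region {
    * confirm returns a nonzero code for string variables; capture hides the error
    capture confirm numeric variable `var'
    if _rc == 0 {
        replace `var' = . if `var' == 666
    }
}

list
```

If every variable in the range is numeric, the simpler loop above does the same job without the check.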

It’s as simple as that. Loops can significantly reduce the number of lines of code that you have to write.

Imagine how much time it would take you if you had a hundred variables and you had to write the code for each individual variable.

The fewer lines of code you have, the less time you spend writing it and the fewer chances you have to make mistakes.

 


Comments

  1. Yomi says

    Very informative, thank you. I have a question: what kind of loop will allow me to run separate bivariate analyses in one go? Let’s say I wanted to do a chi2 analysis for one outcome and 7 demographic variables. Is there code that spares me from having to run each bivariate analysis separately?

    Thank you for your time,
    -Yomi

    • Jeff Meyer says

      Hi Yomi,

      It follows a similar setup. You state your outcome variable (I’ll call it “age_bracket”) within the loop, and in the foreach line you state the 7 demographic variables.

      foreach var of varlist education-gender {
          tabulate age_bracket `var', chi2 exact
      }

      If you want a bivariate regression and your outcome is income:

      foreach var of varlist education-gender {
          reg income `var'
      }

      Jeff

  2. Nina says

    Hi Jeff,
    I have been trying to replicate your code for some of the variables in my dataset, and I always get the error “too few quotes” r(132).

    foreach var in `continuous'{
    graph box `var’, saving(`var’,replace)
    }

    I am not sure where the quotes ought to be, do let me know if this is right.

    Thank you for your help!

    • Jeff Meyer says

      Hi Nina,

      It looks like your error is in the “foreach” statement, which I see I got wrong in the article. My apologies for that.

      If you are using a local macro for continuous your code should be:
      foreach var of local continuous {

      Everything else looks fine. I have found that if I copy and paste from a PDF file, I sometimes have to retype the ` and ' symbols wherever they are used in the code. If you run “help foreach” you will see the options you have for the “foreach” statement.

      Hope this works!
      Jeff

  3. Abdi Billow says

    Hello Jeff,
    Thanks so much for the explicit explanation of the foreach loop. Could you also help me with other loops, such as For...Next, For Each...Next, Do...While, Do...Until, and While...Wend? Thanks

    • Jeff Meyer says

      Hi,
      I’m not familiar with Do While, Do Until, or While...Wend. You can create loops within loops using both foreach and forvalues. For the first value of the outer loop, the inner loop runs through all of its values; Stata then moves to the second value of the outer loop and runs the inner loop again, and so on. I find it helpful to construct a flow chart to help me think through the process of what I want to create and then use the foreach and forvalues structure to run it.
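A minimal sketch of a loop within a loop, assuming nothing beyond base Stata. All names here (group, i, j, count) are illustrative. The last part uses while, which is also a Stata command, although this thread does not cover it.

```stata
* Outer loop over a list of words, inner loop over numbers
local count = 0
foreach group in a b {
    forvalues i = 1/3 {
        display "group `group', i = `i'"
        local count = `count' + 1
    }
}
* 2 outer values times 3 inner values
display "inner body ran `count' times"

* A while loop repeats until its condition is no longer true
local j = 1
while `j' <= 3 {
    display "j = `j'"
    local j = `j' + 1
}
```

The inner forvalues loop finishes all three of its values for group a before the outer loop moves on to group b.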

      Jeff

  4. Majda says

    Very Useful blog !
    I just have a question concerning the following code :

    foreach var in `continuous'{
    graph box `var’, saving(`var’,replace)
    }

    Once it is done, how do you display the graphs? I know that they are in memory, but where?

    Thank you for your help

    • Jeff Meyer says

      Hi, glad you found it useful. To put your graphs in a location on your computer where you can find them, add a line of code above the loop that changes the working directory to where you want the graphs saved. An example is:
      cd "C:\graphs"

      You will need to create the folder first unless you use a folder that is already in existence.
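To display a saved graph again, one option is graph use. A short sketch, assuming the auto data set that ships with Stata as a stand-in and a working directory you are allowed to write to (the file name mpg.gph simply follows from the variable name):

```stata
* Stand-in data (an assumption): the auto data set shipped with Stata
sysuse auto, clear

* Show the current working directory, which is where saving() writes
pwd

* Draw a box plot and save it as mpg.gph in that directory
graph box mpg, saving(mpg, replace)

* Later, redisplay the saved graph from disk
graph use mpg.gph
```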

