The Analysis Factor

Statistical Consulting, Resources, and Statistics Workshops for Researchers

Loops in Stata: Making coding easy

by Jeff Meyer

We’ve already discussed using macros in Stata to simplify and shorten code.

Another great tool in your coding tool belt is loops. Loops allow you to run the same command for several variables at one time without having to write separate code for each variable.

This discussion could go on for pages and pages because there is so much you can do with a loop. But I won’t be able to do that just now. Uh oh, fifteen minutes to Judge Wapner.

Let me begin by explaining the loop I included in the macro article at the end of an earlier article on efficient coding in Stata:

(I already explained how the first line of this code is a macro and why it’s useful. Read that first if you haven’t already).

local continuous educat exper wage age

foreach var of local continuous {
    graph box `var', saving(`var', replace)
}

The two most common commands to begin a loop are foreach and forvalues.

The foreach command loops through a list of items, while forvalues loops through a range of numbers. The first line of the loop above is set up much like the line that creates a macro.

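The line begins with the command foreach, followed by the name I want to use to represent each member of the group, here var (exactly the same as naming a macro). The words “of local” tell Stata that the list to loop over is stored in the local macro whose name follows. If I typed the list out directly, I would use the word “in” instead, and Stata would perform the action on whatever follows “in”.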

The name continuous refers to the local macro that I explained previously. In this situation, foreach var of local continuous is the same as foreach var in educat exper wage age. I could use either one in my loop.
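To make that equivalence concrete, here is a minimal sketch that runs both forms and collects the names each loop visits. The macro names from_macro and from_list are hypothetical helpers added for illustration. No data set is needed, because the loops only handle the names as text.

```stata
* The list from the article, stored in a local macro
local continuous educat exper wage age

* Form 1: loop over the contents of the local macro
local from_macro
foreach var of local continuous {
    local from_macro `from_macro' `var'
}

* Form 2: loop over the same names typed out after "in"
local from_list
foreach var in educat exper wage age {
    local from_list `from_list' `var'
}

* Both lines show the same four names in the same order
display "`from_macro'"
display "`from_list'"
```

Note that the macro name is written without quotes after of local, while the loop variable inside the braces needs the left quote ` and the straight apostrophe '.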

The first line of the loop ends with the open brace “{”. This symbol tells Stata that some action, which starts on the next line, will be performed on each item in the list.

On the second line of the loop I asked Stata to create a box plot of each of the variables educat, exper, wage, and age and to save each plot.

Inside the parentheses of saving() is the name I want to use for the saved graph, followed by replace, which tells Stata to overwrite any existing graph with the same name in the directory where I am saving it.

Stata will now create the graph for the first variable in my list and save it.

The closing brace “}” on the third line tells Stata to return to the beginning, the “{” symbol, and perform the same action on the next variable in the list. Stata continues to do this until all variables have been used.
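The walk-through above only uses foreach. For comparison, here is a minimal sketch of forvalues, which steps through a range of numbers instead of a list. The counter name i, the range 1/3, and the macro total are my own choices for illustration and are not part of the wages example.

```stata
* Running total, kept in a local macro
local total = 0

* Loop over the numbers 1, 2, 3; the current number is stored in the local macro i
forvalues i = 1/3 {
    display "Pass number `i'"
    local total = `total' + `i'
}

display "Sum of the counter values: `total'"
```

The braces work the same way as in foreach: everything between them runs once for each value of the counter.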

Using Loops to Define Missing Data Codes

Time for one more example. It is not uncommon to open up a data set and find the code for missing data to be “99” or “999” or some other number. Stata recognizes the period, “.”, as missing data for numeric variables. So to analyze the data set you will have to fix this.

There are at least two commands that can be used to do this, replace and recode. I will give you an example using the command replace.

Since we are working with variables, I need to start my loop with the command foreach. The next step is to decide on the name I want to use to represent the group. It is common practice in the Stata “help” files to use the name “var” to represent variables, so I will do the same.

So my code so far looks like this: foreach var. If I were going to list all of the variables one by one, the next word in my code would be in. But I’m going to use a shortcut and use a variable list. As a result, the next part of my command is of varlist.

So now my command looks like:

foreach var of varlist

The shortcut for listing a series of variables is the dash key, “-”. In this case I type the first variable, followed by the dash, and end with the last variable. The range covers every variable stored between those two in the data set. Using the wages data set I would have educat-Race2.

The first line of my code is complete:

foreach var of varlist educat-Race2 {

Next I tell Stata what I want it to do to my variables, which is use “.” instead of the number currently being used in the data set for missing data. Assuming 666 is used for missing data, the code for changing it to “.” is:

replace `var' = . if `var' == 666

Remember that the last line of my code must be the closing brace “}”. The complete code to change all missing data codes to “.” is:

foreach var of varlist educat-Race2 {
    replace `var' = . if `var' == 666
}

When I run this code Stata will take the first variable from the variable list and replace 666 with a period. It will then go to the next variable and work its way through the entire list.
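To try the loop without the wages data set, here is a self-contained sketch. The three-row toy data set is made up for illustration and only borrows the variable names educat, exper, and Race2. The second loop is an optional check, not part of the original code.

```stata
* Toy data (hypothetical values): 666 marks a missing value
clear
input educat exper Race2
12 666 1
666 5 2
16 8 666
end

* The loop from the article: swap 666 for Stata's missing value
foreach var of varlist educat-Race2 {
    replace `var' = . if `var' == 666
}

* Check: count the 666 codes left in each variable and stop with an error if any remain
foreach var of varlist educat-Race2 {
    count if `var' == 666
    assert r(N) == 0
}
```

If the assert lines finish without an error, every 666 has been converted. The recode command mentioned earlier offers an alternative that needs no loop: recode educat-Race2 (666 = .).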

It’s as simple as that. Loops can significantly reduce the number of lines of code that you have to write.

Imagine how much time it would take you if you had a hundred variables and you had to write the code for each individual variable.

The fewer lines of code you have, the less time you spend writing code and the fewer chances you have to make mistakes.


Jeff Meyer is a statistical consultant with The Analysis Factor, a stats mentor for Statistically Speaking membership, and a workshop instructor. Read more about Jeff here.

Tagged With: loops, Stata

Comments

  1. Yomi says

    July 1, 2019 at 11:49 pm

    Very informative, thank you. I have a question: what kind of loop will allow me to run separate bivariate analyses simultaneously? Let’s say I wanted to do a chi2 analysis for one outcome and 7 demographic variables. Is there code that spares me from having to run each bivariate analysis separately?

    Thank you for your time,
    -Yomi

    • Jeff Meyer says

      July 2, 2019 at 4:36 pm

      Hi Yomi,

      It follows a similar setup. You state your outcome variable (I’ll call it “age_bracket”) within the loop, and in the foreach line you state the 7 demographic variables.

      foreach var of varlist education-gender {
          tabulate age_bracket `var', chi2 exact
      }

      If you want a bivariate regression and your outcome is income:

      foreach var of varlist education-gender {
          reg income `var'
      }

      Jeff
  2. Nina says

    October 19, 2018 at 7:04 am

    Hi Jeff,
    I have been trying to replicate your code for some of the variables in my dataset, and I always get the error “too few quotes” r(132).

    foreach var in `continuous'{
    graph box `var’, saving(`var’,replace)
    }

    I am not sure where the quotes ought to be, do let me know if this is right.

    Thank you for your help!

    • Jeff Meyer says

      October 19, 2018 at 10:13 am

      Hi Nina,

      It looks like your error is in the “foreach” statement, a mistake I see I made in the article. My apologies for that.

      If you are using a local macro for continuous your code should be:
      foreach var of local continuous{

      Everything else looks fine. I have found that if I copy and paste from a pdf file, sometimes I have to re-type the ` and ' symbols wherever they are used in the code. If you run “help foreach” you will see the options you have for running the “foreach” statement.

      Hope this works!
      Jeff

  3. Abdi Billow says

    May 3, 2018 at 9:19 am

    Hello Jeff,
    Thanks so much for the explicit explanation on the foreach loop. Could you also help me on the other loops like For… Next. For Each… Next. Do….Do While and Do Until and While… Wend. Thanks

    • Jeff Meyer says

      May 3, 2018 at 9:51 am

      Hi,
      I’m not familiar with Do While and Do Until and While. You can create loops within loops using both foreach and forvalues. The command within the command is done for the first value of the first loop and then uses the second value to run through the command within the command. I find it helpful to construct a flow chart to help me think through the process of what I want to create and then use the foreach and forvalues structure to run it.

      Jeff

  4. Majda says

    April 17, 2018 at 4:25 am

    Very Useful blog !
    I just have a question concerning the following code :

    foreach var in `continuous'{
    graph box `var’, saving(`var’,replace)
    }

    Once it is done how do you display the graphs ? I know that they are in memory but where ?

    Thank you for your help

    • Jeff Meyer says

      April 17, 2018 at 10:30 am

      Hi, glad you found it useful. To put your graphs in a location on your computer where you can find them, add a line of code above the loop that changes the directory to where you want the graphs saved. An example is:
      cd "C:\graphs"

      You will need to create the folder first unless you use a folder that is already in existence.

Copyright © 2008–2021 The Analysis Factor, LLC. All rights reserved.