The Analysis Factor

Macros in Stata, Why and How to Use Them

by Jeff Meyer

We finished the last article about Stata with the confusing coding of:

local continuous educat exper wage age

foreach var in `continuous' {
    graph box `var', saving(`var', replace)
}

I admit it looks like a foreign language. Let me explain how simple it is to understand. Let’s look at the first line from above: local continuous educat exper wage age

Stata allows you to use a single word, such as “continuous”, to represent many other words.  In Stata this process is known as a macro.

Please note that a macro in Stata is not the same as a macro in Microsoft Excel.

Writing macros in Excel can be long and involved.

Writing a macro in Stata is very easy.

How to write a simple macro in Stata

A macro in Stata begins with the word “global” or “local”.  The command global tells Stata to store everything in the command line in its memory until you exit Stata.  If you open another data set before exiting, the global macro will still be in memory.

The command local tells Stata to keep everything in the command line in memory only until the program or do-file ends.

If you plan on analyzing only one data set then the global command shouldn’t cause you any problems.

If you will be analyzing more than one data set then you might want to consider using the local command.
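The difference between the two commands can be sketched in a few lines. This is a minimal illustration, not code from the previous article: it assumes Stata's built-in auto data set (loaded with sysuse) in place of the wages data, so the variable names price, mpg, and weight are stand-ins. A global macro is referenced with a leading $, while a local macro is wrapped in single quotes.

```stata
* Assumption: the built-in auto data set stands in for the wages data.
sysuse auto, clear

* A global macro stays in memory until you exit Stata or drop it.
global continuous price mpg weight
summarize $continuous

* A local macro lasts only until the do-file or program ends.
local continuous price mpg weight
summarize `continuous'

* Globals persist across data sets, so drop them when you are done.
macro drop continuous
```

Run the whole block at once from a do-file. If you run it a few lines at a time, the local macro is gone by the time the second summarize is executed.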

If you remember from the previous article, the wages data set had four continuous variables.

In the first line of my code above, local continuous educat exper wage age, I am using the word “continuous” to represent the four variables educat, exper, wage, and age.

Some commands in Stata allow you to analyze more than one variable at a time. For example, I might want to run the following commands in order to see what my continuous variables look like.

tab1 educat exper wage age
codebook educat exper wage age
summarize educat exper wage age

But why bother with a macro?

Using a macro allows me to simplify my work, which will reduce the potential for errors and keep it organized.

The code below will perform the exact same actions as the code above.

local continuous educat exper wage age
tab1 `continuous'
codebook `continuous'
summarize `continuous'

The command local tells Stata to use the word continuous to represent the variables educat, exper, wage, and age. I then substitute the word continuous for the variable names in the last three lines.

Note that each time you use the word continuous in a command line you must begin with the left single quote ` (the backtick, to the left of the 1 key on a US keyboard) and finish with the right single quote ' (the plain apostrophe, to the right of the ; key). Curly "smart" quotes pasted from a word processor or web page will not work.
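A quick way to check the quoting is to display the macro's contents. This is a minimal sketch; it needs no data set because display only prints the text stored in the macro. The opening quote is the backtick and the closing quote is the plain apostrophe.

```stata
* Define the macro, then print what it expands to.
local continuous educat exper wage age
display "`continuous'"

* Without the quotes, Stata reads continuous as a literal word
* and looks for a variable with that name.
```

If the quotes are typed correctly, display prints the four variable names. If it prints a blank line, the macro name is misspelled or the macro was defined in a different run of the do-file.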

Using a macro to represent several variables may not seem like a big deal, so why bother with it? But wait, there is more.

For example, you might have to run numerous linear regressions, using several of the same predictor variables in each regression you run. After reviewing your results you might decide to eliminate one of the variables.

If you didn’t use a macro, you will have to go back to every line of code and remove it. If you did use a macro, you will only have to go back to the line of code that defines the macro and remove the variable from the group.

Remove it once and it is gone. How easy!
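The regression scenario above might look like the following. This is a sketch under stated assumptions: it uses Stata's built-in auto data set rather than the wages data, and the macro name predictors and the choice of variables are purely illustrative.

```stata
* Assumption: the built-in auto data set, not the article's wages data.
sysuse auto, clear

* Define the shared predictors once.
local predictors mpg weight length

* Every model picks up the same list.
regress price `predictors'
regress price `predictors' foreign
regress price `predictors' foreign headroom
```

To drop length from all three models, delete it from the local line and rerun the do-file. None of the regress lines need to change.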

Macros for formatting tables and graphs

Another great use for macros is for creating tables and graphs.  Unless you are Raymond (of Rain Man fame), there is no way you will remember your favorite options when creating a graph.

What are options for a graph?  To name just a few, the major and minor tick labels for the X and Y axis, the X and Y axis scale properties, whether to have a legend, placement of the legend, placement of the title, etc. etc.

This is an example of the coding for a graph that I once created:

ms(O) mc(gs0) msize(small)) (line populatn_hat d, sort lcolor(gs0)) (line populatn_hat_p d, sort lcolor(gs0) lpattern(dash)) (line populatn_hat_m d, sort lcolor(gs0) lpattern(dash)), xline(0, lcolor(gs0)) title("Quadratic fit", color(gs0)) ytitle("Population in district") legend(label(1 "Vote shares") label(2 "Quadratic fit"))

It would sure be easier to use a one-word macro to represent all of the above.

I suggest you create a do-file template for tables and graphs (see my previous article on creating do-file templates).

In the template you create various macros that contain the options for your tables and graphs.

Be sure to document (by using // or “*” as discussed in the previous article) in your do-file what the table or graph will look like.

Perhaps the best reason for using macros for your table and graph formatting is that it ensures consistent formatting.

If you decide you want to tweak how the tables in your research paper look, you will only need to make a change to the macro.  This saves you from having to go back to every table and change the coding.
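Here is one way the idea could look for graphs. It is only a sketch: the macro names markeropts and fitopts are hypothetical, the data come from Stata's built-in auto data set, and the options are borrowed from the graph code shown earlier rather than from any template of the author's.

```stata
* Assumption: the built-in auto data set, used only for illustration.
sysuse auto, clear

* Formatting choices defined once, near the top of the do-file.
* Black, small, circular markers and a black dashed fit line.
local markeropts ms(O) mc(gs0) msize(small)
local fitopts lcolor(gs0) lpattern(dash)

* Each graph reuses the same options.
twoway (scatter price mpg, `markeropts') ///
       (lfit price mpg, `fitopts'), ///
       title("Price versus mileage", color(gs0))

twoway (scatter price weight, `markeropts') ///
       (lfit price weight, `fitopts'), ///
       title("Price versus weight", color(gs0))
```

Changing msize(small) to msize(medium) in the local line restyles both graphs the next time the do-file runs.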

In my next article I will show you another way to save time and effort in Stata through looping.


Jeff Meyer is a statistical consultant with The Analysis Factor, a stats mentor for Statistically Speaking membership, and a workshop instructor.

Comments

  1. Miranda says

    February 19, 2021 at 12:56 pm

    For macros and loops, is it possible to define large blocks of variables that are not contiguous in the dataset? As a simple example, let’s say I have a dataset that examines the look, feel, and taste of apples, bananas, and oranges with a varlist as follows:

    v1= apple look q1
    v2= apple look q2
    v3= apple look q3
    v4= apple feel q1
    v5= apple feel q2
    v6= apple feel q3
    v7= apple taste q1
    v8= apple taste q2
    v9= apple taste q3
    v10-18 = bananas
    v19-27 = oranges

    Let’s say I’m only interested in taste, and I want to create a macro or a loop to root out miscoded cases among those variables. I want to create a single macro that includes the following variable groups: v7-v9, v16-v18, and v25-v27. Is there a straightforward way to do this without having to type out each variable?

    • Jeff Meyer says

      February 20, 2021 at 10:03 am

      You can use an if statement within your loop.

  2. Mekdes says

    July 6, 2019 at 3:19 am

    I found it helpful. Thank you so much.

  3. irina says

    June 26, 2019 at 9:58 am

    thank you for such a simple explanation with great examples! Saved in my personal notes. Thank you sooooo much!

  4. Moses Otieno says

    June 26, 2018 at 3:28 am

    This is awesome! Have learnt thoroughly on macros!!!!

