Creating Graphs in Stata: From Percentiles to Observe Trends (Part 2)

In a previous post we discussed the difficulties of spotting meaningful information when we work with a large panel data set.

Observing the data collapsed into groups, such as quartiles or deciles, is one approach to tackling this challenging task.  We showed how this can be easily done in Stata using just 10 lines of code.

As promised, we will now show you how to graph the collapsed data.

There are two commands for graphing panel data in Stata: xtline, which is an official Stata command, and profileplot, which was written by a third party. The command xtline has more options and, as a result, creates more professional-looking graphs.

To use xtline the data must be in long format. To use profileplot the data needs to be in wide format. We will use the xtline command.

So the first step is to reshape the data from wide to long. We will use the percentile variable “ptl” as the identifier. Every variable containing wage data (wage1985, wage1986, etc.) shares the prefix “wage”. This common prefix is known as the “stub”. The data in every variable whose name begins with the stub “wage” are transferred into the new variable “wage”.

Next, we decide on a name for the new variable that will contain whatever is to the right of “wage” in the names of the variables containing wage data. In this case it will contain the years 1985, 1986, etc., so we will call it “year”.

[Images: the data in wide format (left) and in long format (right)]

Here is the code for reshaping from wide to long:

reshape long wage, i(ptl) j(year)
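
To see what this command does, here is a minimal sketch on a tiny data set. The wage values, the two percentile groups and the three years are made up for illustration; they are not the data from this post.

```stata
* Hypothetical toy data in wide format: one row per percentile group
clear
input ptl wage1985 wage1986 wage1987
10 100 110 120
20 200 210 220
end

* Stack the wage* variables; the suffix after the stub "wage" goes into year
reshape long wage, i(ptl) j(year)

* Long format: one row per percentile group per year
list, sepby(ptl)
```

After the reshape there is one row for each combination of ptl and year, and a single wage variable.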

Now we have to tell Stata which variable is the “identifier” and which variable is “time”.

xtset ptl year
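
If you want to confirm that the settings took effect, one option is to look at what xtset reports and stores. The sketch below again uses made-up data, already in long format.

```stata
* Hypothetical toy data in long format
clear
input ptl year wage
10 1985 100
10 1986 110
20 1985 200
20 1986 210
end

* Declare the panel identifier and the time variable
xtset ptl year

* Typing xtset with no arguments redisplays the current settings
xtset

* xtset also stores the variable names in r()
display "panel variable: `r(panelvar)'"
display "time variable:  `r(timevar)'"
```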

All that is left is creating the graph:

xtline wage, overlay title("Income by Deciles") ylabel(, angle(horizontal)) ///
    note("Note: Wages adjusted by CPI") ///
    legend(on cols(3) order(1 "10th ptl" 2 "20th ptl" 3 "30th ptl" ///
        4 "40th ptl" 5 "50th ptl" 6 "60th ptl" ///
        7 "70th ptl" 8 "80th ptl" 9 "90th ptl")) ///
    plot1opts(lwidth(medthick)) plot2opts(lwidth(medthin)) plot3opts(lwidth(medthick)) ///
    plot4opts(lwidth(medthin)) plot5opts(lwidth(medthick)) plot6opts(lwidth(medthin)) ///
    plot7opts(lwidth(medthick)) plot8opts(lwidth(medthin)) plot9opts(lwidth(medthick))

Several options were used. The number of columns and the text in the legend were changed. For visual effect, the widths of the lines in the graph were alternated between medium thick and medium thin. A title at the top and a note at the bottom of the graph were also added.
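
Typing nine plot#opts() options by hand is repetitive. As an alternative sketch (not the approach used above), the options can be built in a loop and stored in a local macro. The data here are generated only so the example runs on its own, and the macro names opts and w are arbitrary.

```stata
* Hypothetical data: 9 percentile groups observed in 1985-1987
clear
set obs 9
generate ptl = _n * 10
expand 3
bysort ptl: generate year = 1984 + _n
generate wage = ptl * 100 + 50 * (year - 1985)
xtset ptl year

* Build plot1opts() ... plot9opts(), alternating the line width
local opts
forvalues i = 1/9 {
    * Odd-numbered lines are medium thick, even-numbered lines medium thin
    local w = cond(mod(`i', 2), "medthick", "medthin")
    local opts `opts' plot`i'opts(lwidth(`w'))
}

* Pass the assembled options to xtline
xtline wage, overlay `opts' legend(cols(3))
```

The local macro only exists while the do-file is running, so the loop and the xtline command need to be run together.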


Jeff Meyer is a statistical consultant with The Analysis Factor, a stats mentor for Statistically Speaking membership, and a workshop instructor.

