*by Jeff Meyer*

Like many people with graduate degrees, I have used a number of statistical software packages over the years.

Through work and school I have used EViews, SAS, SPSS, R, and Stata.

Some were more difficult to use than others, but if you used them often enough you would become proficient enough to take on the task at hand (though some packages required greater usage of George Carlin’s 7 dirty words).

There was always one caveat which determined which package I used.

EViews was quickly eliminated from the selection process due to its narrow focus. SAS came fully loaded but I had to be fully loaded with cash in order to afford it (btw, I’m not).

So the final contenders were R, SPSS, and Stata (wow, three choices, just like the TV show “House Hunters International”).

### Why not R

R had a lot to offer. I remembered a friend’s favorite saying, “If it’s free, it’s for me.”

My biggest concern with R was remembering how to use it. With the potential for infrequent use I might end up spending too much time relearning how to use it.

### Why not SPSS

I had used SPSS at graduate school and had the student version on my desktop at home. I had read that SPSS was the choice of most social scientists.

Graphs (clients love graphs) were easy to create, copy, and paste into a Word document. If I forgot the code I could always revert to the menus. So I decided to use SPSS.

To begin learning the things they don’t teach you in graduate school but you had better know if you don’t want to make a fool of yourself, I started taking Karen’s workshops.

The important topic of “bootstrapping” came up and I opened up SPSS to begin practicing.

I believe I used five of George Carlin’s words when I discovered that it was not included in the student version of SPSS. I opened up Stata (which I had on my desktop as well) and found the option (along with plenty of documentation on how to use it).

I then took Karen’s workshop on “Missing Data” and quickly ran into problems recreating her results because they required an add-on module. I contacted SPSS and was told the missing data module was an additional $1,500.

Once again I opened up Stata and found that everything that I needed was included in the version of Stata that I owned.

### Why Stata

Once I settled on using Stata as my primary statistical software package I realized how much it has to offer besides being less expensive.

Like SPSS, Stata allows you to write code or use menus to perform your analysis.

Stata has two primary menu tabs: Graphics and Statistics. Within “Statistics” there are twenty-one sub-tabs and numerous tabs within those tabs. Within “Graphics” there are twenty-one tabs as well (you think the people at Stata like to play blackjack?).

I admit, this can be a bit daunting and time consuming if you are trying to find a specific function. But wait, there’s more.

In the command box you can type “help” and what you are looking for. For example, if I want to run a logistic regression I simply type “help logistic regression” and *documentation for running a logistic regression opens*.
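As a rough sketch, these are the kinds of lines you might type into the command box. The second line assumes you do not know the exact command name and want a keyword search instead.

```stata
* Open the help file for the logistic command
help logistic

* Keyword search when you are not sure of the command name
search logistic regression
```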

Within the help guide is “Menu” which gives the path you take to use the menu method of running the command. In this case it is: Statistics > Binary outcomes > Logistic regression.

The key point here is I don’t waste time trying to find what I’m looking for. After running the command through the menu method the code shows up in the “Review” box. I can then copy and paste the command into a “do-file”.

A “do-file” is the text document that allows you to submit more than one command to Stata at once.
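Here is a minimal sketch of what a do-file might contain. The file name `example.do` and the choice of variables are only illustrative. The data come from the `auto` dataset that ships with Stata.

```stata
* example.do: a hypothetical do-file using Stata's built-in auto dataset
clear all
sysuse auto

* Summarize two variables, then fit a logistic regression
summarize mpg weight
logistic foreign mpg weight
```

Typing `do example` in the command box (with the file saved in the working directory) would run all of these commands in order.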

Stata allows you to have more than one do-file opened at a time. This is a big plus because it makes it easy to copy and paste from other project do-files into the current do-file. Using do-files is significantly quicker than using the menus if you have created template do-files, especially for creating graphs.

There are so many options for creating a graph. It takes less than a minute to copy from a template and paste the commands into your current project.
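A graph template might look something like the sketch below. The titles, scheme, and output file name are placeholders I made up for illustration, and the data are again Stata's built-in `auto` dataset. The `///` line continuation works inside a do-file, not in the command box.

```stata
* Hypothetical graph template: swap in your own variables and labels
sysuse auto, clear

twoway scatter mpg weight, ///
    title("Mileage by vehicle weight") ///
    xtitle("Weight (lbs.)") ytitle("Miles per gallon") ///
    scheme(s1mono)

* Save the graph as an image that can be placed in a Word document
graph export "mpg_weight.png", replace
```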

Stata is extremely efficient at running repetitive analyses when you incorporate macros and loops in a do-file. This sounds like it may be difficult but it’s not.
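To give a feel for it, here is a small sketch that stores a list of variables in a local macro and loops over it. The variable list and the regression are assumptions chosen only to show the pattern, using the built-in `auto` dataset.

```stata
sysuse auto, clear

* Store a list of predictors in a local macro (the list is illustrative)
local predictors mpg weight length

* Run the same simple regression once per predictor
foreach var of local predictors {
    display "Predictor: `var'"
    regress price `var'
}
```

Adding a fourth variable to the macro would add a fourth regression without touching the loop.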

How well is Stata supported by StataCorp? On average Stata sends out updated files every two months with new features and/or fixes to reported glitches. The reference guide for Stata 13 is 281 pages filled with examples and links to the data sets used in the examples.

The professional community also provides incredible support. Stata allows third party written commands (also known as modules) to be imported into the software.

The website http://ideas.repec.org/s/boc/bocode.html is a warehouse for hundreds of third party written commands which have been tested before being made public. Running a search for “logistic” returned 128 results.
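Finding and installing one of these commands from inside Stata might look like the sketch below. It assumes the package you want is hosted on the SSC archive and that you have an internet connection. `estout` is just one example of a user-written package.

```stata
* Search Stata's resources and the web for user-written commands
findit logistic

* Install a user-written package from the SSC archive
ssc install estout

* Read the help file that comes with the installed package
help estout
```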

The bottom line is Stata will run every analysis that the other major statistical packages can, if not more.

It is a very efficiently organized program to learn to use. Third party professionals are continuously offering new functions. Stata adds new features without charging a “new” version fee.

All this and the added bonus is it’s reasonably priced and has no add-on charges.

*Jeff Meyer is a statistical consultant with The Analysis Factor, a stats mentor for Statistically Speaking membership, and a workshop instructor. Read more about Jeff here*.

### Comments

Thank you and thanks to the comments section for very useful distinctions. I know SPSS but wanted to learn one more package.

I took a course on Stata and R. For me, Stata is the winner, hands down, between those 2 options. The learning curve on R is very steep, and it’s extremely confusing to have to deal with 2 very different syntaxes (“base R” and “tidyverse”). It’s like having 2 separate programming languages to learn. Yes, R is free to use, but most likely your time isn’t free, and you’ll spend a lot more time learning R than Stata to get the same things done. I came away with the feeling that R is written for people who like computer programming to do statistics while Stata is written for people who want to focus on getting a statistical analysis done. For beginners, the pull-down menus for statistics in Stata are really convenient (they don’t exist in R). Compared to SPSS (which I’ve also used), the Stata output is much cleaner.

QUITE VALUABLE LESSONS

Well said!

Thanks for the contribution and great suggestions on why to use Stata.

I completely agree that Stata is absolutely fantastic in terms of help commands, flexibility, add-ons, and a vibrant user community. For now, it is my package of choice (and the one my school department favors). You’ve listed some great reasons for ‘Why Stata,’ and I don’t mean my response to become a why-this-vs-that which is the best (all user preference), but I wanted to add a few things for readers that really aren’t sure where to start or things to consider. Having said that, my major package proficiencies are intermediate SPSS and Stata, beginner R, and never used SAS or Eviews.

Stata, SPSS, and R can all have multiple do/syntax files open at once, allowing you to run commands or copy and paste between their respective files. So I’m not really sure if the do/syntax file types are a strong reason for selecting one over the other. If anything, SPSS has a nice (or annoying) feature of drop menus with commands that usually appear when typing syntax commands.

SPSS seems designed rather well for users that prefer menus–far more so than Stata, which suffers from tab and drop option complexities that are not so intuitive. However, there are SO many regular, if uncommon, things that SPSS menus don’t cover that are readily done in its syntax or done natively in Stata menus.

Although R is tough as nails for a (re-)learning curve compared to Stata, its customization, add-ons, and active user community are massive; yet its help files are not nearly as helpful or intuitive as those in Stata. Also, learning about or locating a package or command in R doesn’t seem as simple as Stata’s findit command.

I’m not sure which version of Stata you used for comparison to SPSS, but comparing any student version of a package probably isn’t a fair representation–unless all compared are student versions.

Overall Stata is great, but there are two things about it that really drive me batty. [warning-rant ahead] First is how Stata saves datasets with regard to its version. For example, it is ridiculous that my professor saves a dataset (as normal) using Stata13, but my Stata12 version cannot open it because it wasn’t specifically saved as a Stata12 version type. I can understand if there are major systematic differences in the program that warrant version type differences, but unfortunately Stata does not make this apparent in the naming convention or file extension (e.g. it is always .dta, unlike MS Word’s .doc and .docx). Second is the output screen, which (perhaps because I’m not using the latest version) cannot be selectively or entirely cleared. Nor is the output easily/quickly copied to other documents like a word processor or spreadsheet for presentation-quality formatting (SPSS wins here hands down v. Stata & R).

So yes, at the end of the day–Why Stata for me–because I find it to be a really great balance of options, it’s powerful, used in many of the places I work, and universally accepted enough that I can find many datasets of interest already in Stata format (which importing & converting data, especially with poor or incomplete codebooks is a whole other topic).

Shawn, thanks for your comments.

Regarding the second item that drives you batty, I will be writing a blog explaining my trials and tribulations with producing documentation of my Stata outputs. In the beginning I spent more time trying to print my work in a legible manner than I did on the analysis.

With regard to comparing student versions of Stata and SPSS, the Stata student version only limits the number of variables and observations that you can have in a data set. SPSS limits the types of analysis you can run.

Thanks,

Jeff

Shawn, there is the command “saveold” that might help you with the issue of not being able to use the data sets that your professor is using. If your professor uses the command “saveold ” anyone using Stata 11 or Stata 12 will be able to use the data set.

Jeff Meyer

also, you can download free modules using findit e.g. ‘findit use13’ that will make older versions forwards compatible, so this becomes a non-issue