Stata

Using Stata Efficiently to Understand Your Data

October 3rd, 2014 by

Most statistical software packages use a spreadsheet format for viewing the data. This helps you get a feeling for what you will be working with, especially if the data set is small.

But what if your data set contains numerous variables and hundreds or thousands of observations? There is no way you can get warm and fuzzy by browsing through a large data set.

To help you get a good feel for your data, you will need to use your software’s command or syntax editor to write a series of commands for reviewing your data. Sounds complicated.
(more…)


Why Use Stata?

September 15th, 2014 by

Like many people with graduate degrees, I have used a number of statistical software packages over the years.

Through work and school I have used Eviews, SAS, SPSS, R, and Stata.

Some were more difficult to use than others, but if you used them often enough you would become proficient enough to take on the task at hand (though some packages required greater usage of George Carlin’s 7 dirty words).

There was always one caveat which determined which package I used. (more…)


Ten Ways Learning a Statistical Software Package is Like Learning a New Language

January 31st, 2014 by

Someone recently asked me whether they needed to learn R. In responding, it struck me that this is another way that learning a stat package is like learning a new language.

The metaphor is extremely helpful for deciding when and how to learn a new stat package, and to keep you going when the going gets rough. (more…)


Opposite Results in Ordinal Logistic Regression, Part 2

July 22nd, 2013 by

I received the following email from a reader after sending out the last article: Opposite Results in Ordinal Logistic Regression—Solving a Statistical Mystery.

And I agreed I’d answer it here in case anyone else was confused.

Karen’s explanations always make the bulb light up in my brain, but not this time.

With either output,
The odds of 1 vs >1 is exp[-2.635] = 0.07, i.e. unlikely to be 1, much more likely (14.3x) to be >1
The odds of ≤2 vs >2 is exp[-0.812] = 0.44, i.e. somewhat unlikely to be ≤2, more likely (2.3x) to be >2
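
(For anyone who wants to check the reader's arithmetic, here is a minimal sketch in Python. It is not part of the original email; the variable names are mine, and the two threshold values are simply the ones quoted above.)

```python
import math

# Threshold (intercept) estimates quoted in the reader's email.
threshold_1 = -2.635
threshold_2 = -0.812

# Cumulative odds are the exponentiated thresholds.
odds_1 = math.exp(threshold_1)   # odds of 1 vs >1
odds_2 = math.exp(threshold_2)   # odds of <=2 vs >2

# Inverting the odds gives "how many times more likely" the higher categories are.
for label, odds in [("1 vs >1", odds_1), ("<=2 vs >2", odds_2)]:
    print(f"{label}: odds = {odds:.2f}, inverse = {1 / odds:.1f}x")
```

The email's 14.3x appears to come from inverting the rounded odds (1/0.07); inverting the unrounded value gives a figure slightly under 14.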

SAS – using the usual regression equation
If NAES increases by 1 these odds become (more…)


Opposite Results in Ordinal Logistic Regression—Solving a Statistical Mystery

July 5th, 2013 by

A number of years ago when I was still working in the consulting office at Cornell, someone came in asking for help interpreting their ordinal logistic regression results.

The client was surprised because all the coefficients were backwards from what they expected, and they wanted to make sure they were interpreting them correctly.

It looked like the researcher had done everything correctly, but the results were definitely bizarre. They were using SPSS and the manual wasn’t clarifying anything for me, so I did the logical thing: I ran it in another software program. I wanted to make sure the problem was with interpretation, and not in some strange default or (more…)


EM Imputation and Missing Data: Is Mean Imputation Really so Terrible?

April 15th, 2009 by

I’m sure I don’t need to explain to you all the problems that occur as a result of missing data.  Anyone who has dealt with missing data—that means everyone who has ever worked with real data—knows about the loss of power and sample size, and the potential bias in your data that comes with listwise deletion.


Listwise deletion is the default method for dealing with missing data in most statistical software packages.  It simply means excluding from the analysis any cases with data missing on any variables involved in the analysis.
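
To make that definition concrete, here is a small sketch in plain Python. The data set and the `listwise_delete` helper are hypothetical, made up purely for illustration; real packages do this for you behind the scenes.

```python
# Hypothetical toy data: None marks a missing value.
cases = [
    {"id": 1, "age": 34, "income": 52000, "score": 7},
    {"id": 2, "age": None, "income": 48000, "score": 5},
    {"id": 3, "age": 29, "income": None, "score": 6},
    {"id": 4, "age": 41, "income": 61000, "score": 8},
    {"id": 5, "age": 38, "income": 57000, "score": None},
]

def listwise_delete(rows, variables):
    """Keep only cases with no missing value on any variable in the analysis."""
    return [r for r in rows if all(r[v] is not None for v in variables)]

# Only variables involved in the analysis matter: case 5 is missing 'score',
# but it survives an analysis that uses only age and income.
two_vars = listwise_delete(cases, ["age", "income"])
three_vars = listwise_delete(cases, ["age", "income", "score"])

print([r["id"] for r in two_vars])    # cases kept with two variables
print([r["id"] for r in three_vars])  # cases kept with three variables
```

Notice how quickly the sample shrinks: each case here is missing at most one value, yet adding a third variable to the analysis leaves only two of the five cases.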

A very simple, and in many ways appealing, method devised to (more…)