OptinMon

Getting Started with Stata Tutorial #6: How Stata Code Works

July 18th, 2024 by

If you’ve tried coding in Stata, you may have found it strange. The syntax rules are straightforward, but different from what I’d expect.

I had experience coding in Java and R before I ever used Stata. Because of this, I expected commands to be followed by parentheses, and for this to make it easy to read the code’s structure.

Stata does not work this way.

An Example of how Stata Code Works

To see the way Stata handles a linear regression, go to the command line and type

h reg or help regress

You will see a help page pop up, with this Syntax line near the top.

(If you need a refresher on getting help in Stata, watch this video by Jeff Meyer.)

This is typical of how Stata code looks. (more…)


Seven Steps for Data Cleaning

June 20th, 2024 by

Ever consider skipping the important step of cleaning your data? It’s tempting but not a good idea. Why? It’s a bit like baking.stage 1

I like to bake. There’s nothing nicer than a rainy Sunday with no plans, and a pantry full of supplies. I have done my shopping, and now it’s time to make the cake. Ah, but the kitchen is a mess. I don’t have things in order. This is no way to start.

First, I need to clear the counter, wash the breakfast dishes, and set out my tools. I need to take stock, read the recipe, and measure out my ingredients. Then it’s time for the fun part. I’ll admit, in my rush to get started I have at times skipped this step.

(more…)


An Introduction to Repeated Measures Designs

May 23rd, 2024 by

There are many designs that could be considered Repeated Measures design, and they all have one key feature: you measure the outcome variableStage 2 for each subject on several occasions, treatments, or locations.

Understanding this design is important for avoiding analysis mistakes. For example, you can’t treat multiple observations on the same subject as independent observations.

Example

Suppose that you recruit 10 subjects (more…)


Getting Started with Stata Tutorial #5: The Stata Do-File

May 4th, 2024 by

From our first Getting Started with Stata posts, you should be comfortable navigating the windows and menus of Stata. We can now get into  programming in Stata with a do-file.

Why Do-Files?

A do-file is a Stata file that provides a list of commands to run. You can run an entire do-file at once, or you can highlight and run particular lines from the file.

If you set up your do-file correctly, you can just click “run” after opening it. The do-file will set you to the correct directory, open your dataset, do all analyses, and save any graphs or results you want saved.

I’ll start off by saying this: Any analysis you want to run in Stata can be run without a do-file, just using menus and individual commands in the command window. But you still should make a do-file for the following reason:

Reproducibility (more…)


Too Many Colors Spoil the Graph

March 26th, 2024 by

When you draw a graph- either a bar chart, a scatter plot, or even a pie chart, you have the choice of a broad range of colors that you can use. R, for example, has 657 different colors from aliceblue to yellowgreen. SAS has 13 shades of orange, 33 shades of blue, and 47 shades of green. They even have different shades of black.

You have a wealth of colors, but you can’t use all of them in the same graph. The ideal number of colors is 2.

(more…)


Assumptions of Linear Models are about Errors, not the Response Variable

March 19th, 2024 by

Stage 2I recently received a great question in a comment about whether the assumptions of normality, constant variance, and independence in linear models are about the errors, εi, or the response variable, Yi.

The asker had a situation where Y, the response, was not normally distributed, but the residuals were.

Quick Answer:  It’s just the errors.

In fact, if you look at any (good) statistics textbook on linear models, you’ll see below the model, stating the assumptions: (more…)