
Issues in Coding Missing Values

October 11th, 2023 by

There’s no mincing words here. Missing values can cause problems for every statistician. That’s true for many reasons, but the trouble can start with the simple choices made when coding missing values in a data set. Here are a few examples.

Example 1: The Null License Plate

Researcher Joseph Tartaro thought it would be funny to get the following California vanity license plate: (more…)


Six Common Types of Statistical Contrasts

September 18th, 2023 by

When you learned analysis of variance (ANOVA), it’s likely that the emphasis was on the ANOVA table, with its Sums of Squares and F tests, followed by a post-hoc test. But ANOVA is quite flexible in how it can compare means. A large part of that flexibility comes from its ability to perform many types of statistical contrast.

That F test can tell you if there is evidence your categories are different from each other, which is a start. It is, however, only a start. Once you know at least some categories’ means are different, your next question is “How are they different?” This is what a statistical contrast can tell you.

What is a Statistical Contrast?

A statistical contrast is a comparison of a combination of the means of two or more categories. In practice, contrasts are usually performed as a follow-up to the ANOVA F test. Most statistical programs include contrasts as an optional part of an ANOVA. (more…)
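As a rough sketch of what this can look like in software, the Stata commands below fit a one-way ANOVA and then request two sets of contrasts. The auto data set and the use of rep78 as the grouping variable are assumptions made purely for illustration.

```stata
* Illustrative only: the auto data and rep78 are assumed stand-ins
sysuse auto, clear

* One-way ANOVA of price across repair-record categories
anova price rep78

* Compare each category's mean with the reference (first) category
contrast r.rep78

* Compare each category's mean with the mean of the next category
contrast a.rep78
```

The r. and a. prefixes are two of Stata's contrast operators; which one fits depends on the question you are asking about the means.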


Getting Started with Stata Tutorial #3: the Graphics Menu

September 11th, 2023 by

In part 2 of this series, we got started on the various menus in Stata. This post covers an important menu that you’ll probably use often: the Graphics menu.

What’s in the Graphics menu

The Graphics menu provides an impressive variety of options for creating just about any graph you might need.

Take a look at the menu. It includes everything from univariate graphs like bar charts and pie charts to more complex, multivariate plots. Go ahead and explore some of the graphs available in the menu.

A comprehensive resource for a full understanding of the graphics you can do in Stata is the Stata Graphics Reference Manual, which is a free pdf download from the Stata web site. At nearly 800 pages, it’s not a quick read, but it is excellent.

A much quicker read is the Stata Data Visualization Cheat Sheet, pages 5 – 6.

Browsing this two-page resource will tell you a lot about what you can do in Stata graphics. This includes not only which kinds of graphs you can create, but how to customize a graph’s appearance, apply themes, and save plots.
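To give a flavor of those last two items in syntax, here is a minimal sketch. The scheme s1color and the file name price_mpg.png are assumptions chosen for illustration, not recommendations from the cheat sheet.

```stata
* Illustrative only: the scheme name and file name are assumptions
sysuse auto, clear

* Apply a built-in theme (Stata calls it a scheme) to a single graph
scatter price mpg, scheme(s1color)

* Save the graph currently displayed as an image file
graph export "price_mpg.png", replace
```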

But first let’s explore how easy it is to create a simple, but customized plot using only the menus.

An Example of creating a Scatter plot using menus

To show an example, we’ll use the auto data. If you haven’t loaded the data in your current session, type the following into your command line:

sysuse auto

Note that you could also open this data set using the File menu, but this is a command that is so simple, it’s faster to just type it into the command line.
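One detail worth knowing: if the data currently in memory have unsaved changes, Stata will refuse to load a new data set. A minimal sketch, assuming you’re happy to discard whatever is in memory:

```stata
* The clear option discards the data in memory before loading auto
sysuse auto, clear

* Optional: list the variables to confirm the data loaded
describe
```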

As you’ll see, every time you use the menus, Stata fills in the associated command for you.

Now say we want to make a scatter plot with price on the y axis, and mpg on the x axis, but only for observations where the gear ratio is less than 3. We want this graph to have red triangles representing points, and we want it to have informative titles:

We click on Graphics -> Twoway graph. In the plots window click Create and select Basic plots -> Scatter.

Choose price as the y variable and mpg as the x variable; don’t press accept yet.

Under Marker properties choose Triangle as the symbol and Red as the color. Notice that you can also change the size or opacity of points or mark particular observations.

Click accept, then click accept on the next page.

Under the if/in tab, type “gear_ratio<3” so only observations with a gear ratio less than 3 are plotted; click on Y axis.

Under the Y axis tab type the title “Price”, and under the X axis tab type the title “MPG”. Note how you can also change properties of the axis.

In the Titles tab type an appropriate title for the graph – I chose “Price and Mileage”.

We don’t want to see a legend so in the Legend tab choose “hide legend”.

You can now press ok and should see the following graph:

[Figure: price and mileage scatterplot]

You’ll also see that Stata wrote this code to the console:

twoway (scatter price mpg, mcolor(red) msymbol(triangle)) if gear_ratio<3, ytitle(`"Price"') xtitle(`"MPG"') title(`"Price and Mileage"') legend(off)

Now you know how to make this graph and similar ones with syntax as well! If you’re ever having trouble creating a certain chart using code, the graph menu can provide an easier way to select the options you want.
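As a sketch of what a “similar one” might look like, the version below changes only the marker options and splits the command across several lines. The blue circles are an arbitrary choice for illustration, and the /// line continuation is meant for a do-file rather than the command line.

```stata
sysuse auto, clear

* Same graph as above, with blue circles instead of red triangles
twoway (scatter price mpg, mcolor(blue) msymbol(circle)) ///
    if gear_ratio < 3,                                    ///
    ytitle("Price") xtitle("MPG")                         ///
    title("Price and Mileage") legend(off)
```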

Note: when you make plots in Stata menus, make sure to always make a new plot rather than layering on an old one. If you press create when you already have a plot selected, your new scatterplot will layer on top of the old one.
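In syntax, layering simply means more than one plot inside a single twoway command. A minimal sketch, assuming the auto data; the fitted line is an illustrative second layer, not part of the example above:

```stata
sysuse auto, clear

* Two layers in one graph: a scatter plot plus a linear fit line
twoway (scatter price mpg) (lfit price mpg)

* A fresh graph with a single layer, for comparison
twoway (scatter price mpg)
```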

by James Harrod

Getting Started with Stata Tutorial #4: Do-files

About the Author:

James Harrod interned at The Analysis Factor in the summer of 2023. He plans to continue into a career as an actuary, and hopes to continue finding interesting ways of educating people about statistics. James is well-versed in R and Stata programming and enjoys teaching the intuition behind common statistical methods. James is a 2023 graduate of the University of Rochester with bachelor’s degrees in Statistics and Economics.

 


What is Analysis of Means?

August 21st, 2023 by

The classic way to compare means in analysis of variance is examining pairwise differences in means after an F test. It’s great for many problems, but sometimes your research question isn’t about pairwise differences.

Pairwise differences are not ideal if your research question is whether, like the Sesame Street song, one of these groups is not like the others. Or perhaps you are trying to ensure that groups conform to a common standard. Analysis of Means (ANOM) helps in both settings. (more…)


Getting Started with Stata Tutorial #2: Stata Menus

August 11th, 2023 by

A great way to get started with Stata is using its menus.

The first part of this Tutorial Series introduced you to Stata’s windows. You can now begin learning how to use Stata to work with data.

Across the top are 8 tabs: File, Edit, Data, Graphics, Statistics, User, Window, and Help.

[Figure: Stata menu tabs]

We will not go through every option within the Stata menus. Instead, we’ll highlight a few options to get you started. In this article, we’ll start with three of the most useful menus: File, Data, and Help, along with those helpful icons under the menus.

In our next article, we’ll look at two more: Graphics and Statistics.
(more…)


What is a Randomized Complete Block Design?

July 24th, 2023 by

Designing experiments would always be simple if we could just randomly assign subjects to different treatment conditions with no other restrictions. Unfortunately, that doesn’t always work.

For example, there are many experimental situations where the subjects aren’t independent of each other. The subjects that are related to each other are combined into clusters called “blocks.” Blocking can arise from the practicalities of running an experiment efficiently, or you can plan it intentionally as a way to reduce random variance.

In either case, this is a randomized complete block design. It’s a great design to become familiar with because it will greatly expand your ability to create and analyze experiments.

How It Works

When you have subjects that share characteristics with one another, it can sometimes be difficult to isolate those characteristics directly. This makes it hard to record them as additional variables. By identifying the subjects that are similar, you can still capture how those characteristics affect the outcome. Subjects that are similar are grouped into “blocks.”

From there, you can make treatment assignments so that you put subjects from the same block into different treatment groups.
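One way to sketch that assignment step in Stata is to shuffle subjects within each block and hand out the treatments in shuffled order. Everything here is assumed for illustration: 4 blocks, 3 subjects per block, 3 treatments, and the variable names.

```stata
* Illustrative only: 4 blocks of 3 subjects, 3 treatments (all assumed)
clear
set seed 2023
set obs 12

* Subjects 1-3 form block 1, subjects 4-6 form block 2, and so on
generate block = ceil(_n / 3)

* Random number used to shuffle subjects within each block
generate u = runiform()

* Within each block, hand out treatments 1, 2, 3 in shuffled order
bysort block (u): generate treatment = _n

* Each block-by-treatment cell should contain exactly one subject
tabulate block treatment
```

Because every block receives every treatment exactly once, the design is “complete,” and the within-block shuffle supplies the randomization.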

Why different treatment groups? Suppose subjects from the same block were assigned to the same treatment group. (more…)