Author: Trent Buskirk, PhD.
What do you do when you hear the word error? Do you think you made a mistake?
Well, in survey statistics, error could imply that things are as they should be. That might be the best news yet: error could mean that things are as they should be.
Let’s break this down a bit more before you think this might be a typo or, even worse, an error.
Why Sampling Always Creates Error
In sampling theory there are two basic ways to get information about a target population. You measure everyone (you take a census) or you measure a subset of the population (you take a sample).
If you choose the sample wisely using some sort of random sample design, you should get a reasonable estimate of the population value you care about based on the sample.
Let’s say you want to know how much television people in your hometown (the target population) watch on a typical day. You would love to measure this on all 40,000 residents in your town, but practically that’s too many people to ask all at once.
So instead of polling everyone you decide to take a random sample of 40 residents. Now in this sample there may be folks who don’t watch any television and many who do.
If you had taken a different random sample of 40 residents, it is possible that every one of them watched some television. So your estimate for the average television watching will differ slightly across the samples.
Naturally, not every sample of 40 residents will produce the same estimate. The only way it could is if everyone in the entire population watches exactly the same amount of television on a typical day.
The fact that no two samples are likely to be exactly the same means two things. First, estimates derived from them are also likely to be slightly different. Second, both estimates can provide information about the entire population.
Sampling error is this variation in estimates that results simply because samples differ.
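To see this variation for yourself, here is a minimal sketch in Python that draws five random samples of 40 from a town of 40,000. The town's viewing habits are invented for illustration (roughly 10% watch no television and everyone else watches between half an hour and six hours a day), so treat the numbers as assumptions, not survey results.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

N = 40_000  # town size from the example
n = 40      # sample size from the example

# Hypothetical population (made-up numbers): about 10% watch no TV,
# the rest watch between 0.5 and 6 hours on a typical day.
population = [
    0.0 if random.random() < 0.10 else random.uniform(0.5, 6.0)
    for _ in range(N)
]
true_mean = statistics.mean(population)

# Draw five independent simple random samples (without replacement)
# and compute the average viewing time in each one.
sample_means = [
    statistics.mean(random.sample(population, n)) for _ in range(5)
]

print(f"Population mean: {true_mean:.2f} hours")
for i, m in enumerate(sample_means, start=1):
    print(f"Sample {i} mean: {m:.2f} hours (off by {m - true_mean:+.2f})")
```

Each sample average should land near the population average, but no two are likely to match exactly. That spread is the sampling error.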
In survey statistics, where samples are far more common than censuses, having error in estimates is in fact as it should be.
The size of the error depends in part on three things: the size of the sample, how variable the thing you are measuring is in the population, and the sampling design.
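The first two of those factors can be sketched with the textbook standard error of a sample mean under simple random sampling. The function name and the example values below (a standard deviation of 2 hours, a town of 40,000) are assumptions chosen for illustration, and other sampling designs would call for a different formula.

```python
import math

def standard_error_of_mean(sd, n, N=None):
    """Standard error of a sample mean under simple random sampling.

    sd: standard deviation of the measurement in the population
        (computed with the usual N - 1 divisor)
    n:  sample size
    N:  population size; if given, the finite population correction
        sqrt(1 - n/N) is applied
    """
    se = sd / math.sqrt(n)
    if N is not None:
        se *= math.sqrt(1 - n / N)
    return se

# Bigger samples shrink the error: quadrupling n roughly halves it.
for n in (40, 160, 640):
    print(n, round(standard_error_of_mean(sd=2.0, n=n, N=40_000), 3))

# More variable populations inflate the error in direct proportion.
for sd in (1.0, 2.0, 4.0):
    print(sd, round(standard_error_of_mean(sd=sd, n=40, N=40_000), 3))
```

The third factor, the sampling design, does not show up here because the formula assumes simple random sampling throughout.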
Margin of Error
In polling, we see a related concept called the margin of error. Margin of error quantifies the degree of sampling error present in a sample estimate.
You may have seen this in news coverage of a poll. Newscasters might say Candidate X has 45% of the vote with a margin of error of 3 percentage points (up or down).
Assuming simple random sampling, a poll can usually achieve this level of accuracy with a sample of about 1,200 residents. That is usually far fewer people than a census would require, yet it still gives a fairly accurate estimate of the true support for Candidate X in the entire population.
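If you want to check the arithmetic, here is a rough sketch using the usual normal approximation for a proportion. It assumes the conventional 95% confidence level (a multiplier of 1.96), which the news report in the example does not actually state, and the function names are made up for this illustration.

```python
import math

Z_95 = 1.96  # normal multiplier for 95% confidence (an assumption)

def margin_of_error(p, n, z=Z_95):
    """Approximate margin of error for a proportion p from a simple random sample of n."""
    return z * math.sqrt(p * (1 - p) / n)

def sample_size_for(moe, p=0.5, z=Z_95):
    """Smallest n whose margin of error is at most moe (p=0.5 is the worst case)."""
    return math.ceil(z**2 * p * (1 - p) / moe**2)

# Candidate X at 45% support with 1,200 respondents
moe = margin_of_error(p=0.45, n=1200)
print(f"Margin of error: {moe * 100:.1f} percentage points")  # about 2.8

# Sample size needed for a 3-point margin in the worst case
print(sample_size_for(0.03))  # 1068
```

Under these assumptions a sample of roughly 1,100 to 1,200 people is enough for a margin of about 3 points, no matter how large the population is, as long as the sample is a small fraction of it.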
We know we’ll have error because we use samples instead of censuses. Even so, we can work to minimize the amount of sampling error by using efficient sampling designs.