Be careful about reading polls

MAJOR BREAKING NEWS (NBC NEWS): New National Poll Shows Massive Swing Toward Democrats in Voter Enthusiasm; Republican Enthusiasm Is Down 5% and Democratic Enthusiasm Up 4% Over Just a Few Weeks—Wholly Erasing the “Enthusiasm Gap”

Please RETWEET this stunning eleventh-hour news! pic.twitter.com/3rMgDjG5DL
— Seth Abramson ((SethAbramson?)) November 6, 2022

Why am I pointing out this poll? I am not one to call people out specifically. However this is the wrong take here. Let me explain why.

What is the margin of error (MOE)?

The margin of error refers to a range of uncertainty that the researchers have about the actual number. So what this means is that the actual value can be anywhere within the range of the margin of error + the number. So from the example from above, it is likely that anywhere between about 69.9 and 76.1 percent of Republicans or Democrats are interested in the Midterm election.

How is this margin of error calculated, and why? Why can’t they just tell us what the actual number is?

We have what is referred to the population. This is the overall group of people we are interested in studying. In this case, this would be eligible American voters. However, we can’t ask the amount of interest that all eligible American voters have in the Midterm elections. So… we have to collect a subset of people that are part of that population. This is what is referred to as a sample.

Since our sample is not our population, we are somewhat uncertain how closely our particular sample overlaps with our population on interest in the midterms. Our margin of error is an estimation of this uncertainty.

However, this uncertainty of our sample overlapping with our population is assumed to be the case under close to ideal circumstances.

The limitations of the margin of error

It is rare, however, that we actually are in this world of ideal circumstances. This means that even our margin of error is likely an underestimate of the difference between our population and our sample.

As G. Elliot Morris recommends to take the margin of error with a grain of salt:

2, about uncertainty:

- The true margin of error in a survey is much wider than its sampling error. In this example, we moved the topline vote shares for Tim Ryan and JD Vance BY UP TO TWELVE PERCENTAGE POINTS just by changing how the data was processed. (True fx cd be larger.)
— G. Elliott Morris ((gelliottmorris?)) October 20, 2022

Why?

Think of error as the difference between the predicted or estimated value of something (like a poll) and the actual value.

There is more than just the error that come from differences that come up from a sample that may just naturally be different than our population. There may also be systematic sources of this error that increase how inaccurate we are about our guess of the population’s views of things.

The Total Survey Error is a popular theory and framework for researchers and pollsters who rely on surveys. In one of the more comprehensive summaries of this issue, Weisberg (2005) demonstrates that this is just the tip of the iceberg.

At each and every step of a survey, we are making choices that can make our estimate of what the population things less accurate.

The margin of error does not weigh each and every one of these potential problems.

So what does this mean for the tweet above?

The margin of error is a very generous estimation of the accuracy of the poll above.

Let’s interpret it.

In October 69% of the Democrats in the sample reported interest in the Midterm elections while 78% of Republicans did. In November, 73% of Democrats and Republicans in the sample reported interest in the Midterm elections. However, these numbers reflect interest by those included in the subset of people asked. If we want to understand whether this reflects the views of the population writ large, we need to include a margin of error to reflect our uncertainty about it in ideal conditions. The authors of the poll did that for us. The margin of error is +/- 3.1 points. What this means for the interpretation above: In October, between r 69-3.1 and r 69+3.1 of Democrats in the population are estimated to be interested in the Midterm elections while somewhere between r 78-3.1 and r 78+3.1 percent of Republicans did. In November, anywhere between r 73-3.1 and r 73+3.1 Republicans and Democrats in the population are likely to be interested in the Midterm elections. All of this assumes, however, that the only problem is the natural differences that may be between a sample of at most about a couple of thousand of people and all eligible American voters.

The differences between reported levels of interest among Republicans and Democrats between October and November are not much larger than 3.1 points. But as we know, these margin of error calculations are often generous to the pollster (in terms of making them appear less uncertain than we actually are). If we consider this, then we should expect we really are probably more like 6 or 7 points off from the population once we account for a whole bunch of uncertainty that comes beyond this sampling error.

All in all, it means that if we are being honest with ourselves, Republicans did not move between October and November nor did Democrats.

What does all of this mean for the forecasts of the midterm elections… well, it is essentially a toss up at this point. We’ll see what happens tomorrow and the upcoming weeks as the votes are tallied.

this is a good question. my prediction is that democrats will either win more votes than republicans or they won’t. thx 4 asking https://t.co/1UWlUxdrYF
— G. Elliott Morris ((gelliottmorris?)) November 4, 2022

References

Weisberg, Herbert F. 2005. The Total Survey Error Approach: A Guide to the New Science of Survey Research. Chicago, IL: The University of Chicago Press.