How To Read The Polls: General Election Edition
Time to prepare yourself for the incoming deluge of polling.
Nate Silver is out with a very useful guide to reading, understanding, and putting into context the plethora of national and state-level polls that will be hitting us on a nearly daily basis from now until Election Day. His final point is perhaps the most pertinent:
There have been only 16 presidential elections since World War II. That simply isn’t a lot of data, and overly specific conclusions from them, like “no recent president has been re-elected with an unemployment rate over 8.0 percent” or “no recent incumbent has lost when he did not face a primary challenge,” are often not very meaningful in practice and will generally not carry much predictive weight.
Another point to keep in mind is how seriously to treat the numbers for demographic subgroups:
The sample sizes on subpopulations in a poll — like Hispanics, young voters or evangelical Christians — are much smaller than for all voters as a whole and therefore contain much larger margins of error. For instance, a poll that surveys 600 respondents, of whom 75 are Hispanic, has a margin of error of about plus or minus 11 points on that subgroup. And that is under ideal circumstances; in practice, some subgroups (including Hispanics) are harder to get on the phone than others.
It’s easy to write the “Candidate X has problems among Group Y” stories, but very often they are just weaving narratives from statistical noise. Unless the demographic patterns are clear and consistent across several different polls, these stories are usually worth ignoring.
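To see where that "plus or minus 11 points" figure comes from, here is a minimal sketch using the textbook margin-of-error formula for a simple random sample at 95 percent confidence. The function name and the worst-case 50/50 split are my assumptions for illustration; real pollsters adjust for weighting and design effects, which usually makes the true margin larger.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error, in percentage points.

    Assumes a simple random sample; proportion=0.5 is the worst case.
    """
    return 100 * z * math.sqrt(proportion * (1 - proportion) / sample_size)

# The example from Silver's guide: 600 respondents, 75 of them Hispanic.
full_sample = margin_of_error(600)       # roughly 4 points
hispanic_subgroup = margin_of_error(75)  # roughly 11 points

print(f"All respondents (n=600): +/- {full_sample:.1f} points")
print(f"Hispanic subgroup (n=75): +/- {hispanic_subgroup:.1f} points")
```

Cutting the sample to one-eighth of its size nearly triples the margin of error, which is why a swing of several points in a subgroup is often nothing more than noise.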
There are other things to keep in mind, such as paying attention to whether a poll result is among likely voters, registered voters, or just “all adults.” Of those three groups, “adults” probably has the least predictive value, since it is likely capturing people who have no intention of voting in November or aren’t even registered to vote.
For all three groups, it’s also important to keep an eye on the sample weighting among Republicans, Democrats, and Independents. With a polling sample of a few hundred, or even a thousand, respondents, it’s fairly easy for a Romney v. Obama result to be thrown off by a sample that over-represents one party or another, or that doesn’t contain enough independents. Of course, determining what the right sample balance is isn’t always easy, because it requires at least some guesswork in figuring out what turnout on Election Day is likely to be. Typically, though, a poll whose D/R/I breakdown is wildly different from previous elections and other polls should be looked at skeptically. Some pollsters don’t release complete crosstabs for their polls, while others, like Rasmussen, keep that data behind a paywall and charge extra for access. If the data isn’t available and the poll results seem out of the norm, then it’s probably a good idea to be at least a little skeptical.
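To illustrate how much the party mix alone can move a topline number, here is a rough sketch. Every number in it is hypothetical and made up for the example (the support levels within each party and both D/R/I breakdowns); it is not drawn from any actual poll.

```python
def topline(support_by_party, party_shares):
    """Overall support: support within each group times that group's share of the sample."""
    return sum(support_by_party[group] * party_shares[group] for group in party_shares)

# Hypothetical share of each party group backing Obama (illustrative only).
obama_support = {"D": 0.90, "R": 0.08, "I": 0.45}

# Two hypothetical samples that differ only in their party breakdown.
sample_a = {"D": 0.38, "R": 0.32, "I": 0.30}
sample_b = {"D": 0.33, "R": 0.37, "I": 0.30}

result_a = topline(obama_support, sample_a)
result_b = topline(obama_support, sample_b)

print(f"Sample A: {result_a:.1%}")
print(f"Sample B: {result_b:.1%}")
```

With identical opinions inside each party, shifting five points of the sample from Democrats to Republicans moves the topline from about 50 percent to about 46 percent. That is why an unusual D/R/I breakdown deserves a second look.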
That leads to what might be the best piece of advice that Silver gives:
This ought to be obvious, but you should generally be looking for a trend to show up in several different polls from several different polling firms before you start to view it as newsworthy. Again, this differs a little bit from the primaries because there is less of a premium on recency in the general election; you’re usually better off waiting for another (or better yet two or three more) data points.
The easiest way to do this is to take an average of recent polls, as sites like Real Clear Politics do. The technique that FiveThirtyEight uses is a little fancier, taking a weighted average of polls based on their past accuracy as well as their methodological standards. However, the gains from doing this are modest as compared with the simple average method. In contrast, taking even that simple polling average provides for considerable gains in accuracy over any one poll taken alone.
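The difference between the two approaches Silver describes can be sketched in a few lines. The poll margins and the weights below are hypothetical numbers chosen for illustration, and the weighting here is a plain weighted mean, not FiveThirtyEight's actual model, which is more involved.

```python
def simple_average(polls):
    """Unweighted mean of the poll margins."""
    return sum(p["margin"] for p in polls) / len(polls)

def weighted_average(polls):
    """Mean of the poll margins, weighted by each poll's assigned weight."""
    total_weight = sum(p["weight"] for p in polls)
    return sum(p["margin"] * p["weight"] for p in polls) / total_weight

# Hypothetical polls: margin is Obama minus Romney in points; weight is an
# assumed stand-in for a pollster's track record and methodology.
polls = [
    {"pollster": "Firm A", "margin": 4.0, "weight": 1.0},
    {"pollster": "Firm B", "margin": -1.0, "weight": 0.5},
    {"pollster": "Firm C", "margin": 2.0, "weight": 1.5},
    {"pollster": "Firm D", "margin": 3.0, "weight": 1.0},
]

simple = simple_average(polls)
weighted = weighted_average(polls)

print(f"Simple average:   {simple:+.2f}")
print(f"Weighted average: {weighted:+.2f}")
```

In this made-up example the individual polls range from -1 to +4, while the two averages land within half a point of each other. That matches Silver's point: most of the gain comes from averaging at all, and the weighting is a refinement.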
Four years ago, RealClearPolitics proved itself invaluable to political junkies because it was the first site to introduce the concept of the polling average and keep track of it on a daily basis. Now, in addition to RCP, we’ve got Nate Silver’s incomparable data crunching, along with the poll trackers at The Huffington Post and Talking Points Memo. The cool thing about the HuffPo and TPM trackers is that they include options that let you control what data shows up in the charts they generate. You can restrict the chart to a certain date range, restrict the types of polls that are included in the average, and restrict the specific pollsters that are included. RCP remains the go-to place, but its charts aren’t manipulable in this manner, at least not yet.
The other part of Silver’s comment here is worth taking to heart too. It’s very easy to get caught up in the excitement of a new poll, especially if it’s one that shows something dramatic (Obama up! Obama down! Romney losing support among Hispanics!). I’ll admit to getting caught in that trap more than once myself. However, one poll doesn’t necessarily tell us much of anything, especially given some of the caveats I noted above. What matters is the trend, especially since we’re talking about an election that is still seven months away. There will be ups and downs between now and then. Both candidates are likely to get a bounce from their respective conventions, for example. And, of course, the economy will continue to be the major influence on the attitudes of voters. So, let’s all try to give these polls the attention they deserve, and none that they don’t.
Update: Here’s another useful tool, this one from The New York Times: an interactive chart of exit polling for all presidential elections going back to 1980.