A Note on 2016 Polling
Because sometimes a comment starts to become a post.
In a post about Joe Biden’s current poll numbers, I responded to a comment which called into question the 2016 polling. After all, the polls said that Hillary Clinton was ahead, and she lost. Ergo: polling sucks! Or, less dramatically, the question arises as to why we should trust polling now when it got it oh so very wrong that last go-round.
The point is, however, that the polling didn’t get it all that wrong in 2016. Indeed, let’s look at the actual numbers.
The official final numbers for the 2016 election are as follows:
- HRC: 48.18%
- DJT: 46.09%
So, HRC won the popular vote by 2.09%.
Now, the final RCP polling average was:
- HRC: 46.8%
- DJT: 43.6%
That suggested HRC winning the popular vote by 3.2%.
The difference between the final total and the polling is 1.11 percentage points. That is pretty darn close. It is anything but a failure of polling. The polling called the popular vote winner and got the numbers quite close. Note, too, that the RCP average in 2012 had more of a gap between polling and final results than in 2016, but no one wails about how bad the polling was, because the guy we all expected to win, won. We have to remind ourselves how much we allow expectations to cloud our understanding of numbers.
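To make the comparison explicit, here is a trivial back-of-the-envelope sketch (illustrative only, using the figures quoted above):

```python
# National numbers quoted in this post (percentages).
final = {"HRC": 48.18, "DJT": 46.09}    # official final result
rcp_avg = {"HRC": 46.8, "DJT": 43.6}    # final RCP polling average

final_margin = final["HRC"] - final["DJT"]       # actual popular-vote margin
polled_margin = rcp_avg["HRC"] - rcp_avg["DJT"]  # margin implied by the polls
miss = polled_margin - final_margin              # how far off the average was

print(f"Actual margin: HRC +{final_margin:.2f}")
print(f"Polled margin: HRC +{polled_margin:.2f}")
print(f"Polling miss:  {miss:.2f} percentage points")
```

The point of the sketch is simply that the national polling average missed the popular-vote margin by about a single point.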
So, the polling in 2016 was pretty good–it was how the polling was used by some analysts that was the problem (linked, too, with media and the public doing a poor job of understanding what they were looking at). Throw in the fact that most people still seem not to really understand the Electoral College (they think it is a magic formula created by The Founders instead of an allocation rule that distorts the outcome) and you get a narrative that the polling was waaay off.
And yes, there were predictive models by Sam Wang and by Natalie Jackson at HuffPo that predicted, with >98% confidence, that HRC would win (as compared to Nate Silver’s model, which gave HRC a 71.4% chance to Trump’s 28.6%).
Look, if you are taking a bet, you’d rather have 7 in 10 odds than 3 in 10, but an almost 30% chance of something is still a really good chance that it happens. If there is a 30% chance of rain, it is not unreasonable to take an umbrella. And even if you take the 7 in 10 bet you can still lose (would you bet your annual salary on 7 in 10 odds?).
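The umbrella point can be made with a quick simulation (illustrative only, using Silver’s 28.6% figure from above):

```python
import random

random.seed(1)
p_underdog = 0.286   # Silver's 2016 probability of a Trump win
trials = 100_000     # simulate many hypothetical "elections"

# Count how often the 28.6% event actually happens.
upsets = sum(random.random() < p_underdog for _ in range(trials))

print(f"Underdog wins in {upsets / trials:.1%} of simulated elections")
```

Run enough hypothetical elections and the underdog wins roughly 29% of the time, which is why a 28.6% chance is a warning to pack an umbrella, not a forecast of impossibility.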
As I noted in the comments, a .286 batting average is quite good in MLB.
To be clear as to why I commented in the first place, and why I am pushing back: it is simply wrong to say that the polls were wrong in 2016. Yes, Wang’s prediction was grossly wrong, as was Natalie Jackson’s at HuffPo. But if the lesson people think they learned from 2016 is that we should distrust the polls (and therefore that we should dismiss polls now), they learned the wrong lesson.
And look, I thought HRC was going to win—it was the more probable outcome going into election night and, again, she did win the popular vote. And, yes, I was surprised (more than I should have been) by the Trump wins in MI, WI, and PA (especially PA). And, yes, polling in those states was inadequate.
The Wisconsin polling was way off (HRC +6.5%, but Trump winning by 0.7%), but polling there also stopped several days before the election (i.e., the state was under-polled). PA was closer (HRC +1.9%, with Trump winning by 0.7%). Michigan was between the two, with HRC’s advantage in the average being 3.4%, but with Trump winning by 0.3%. It is worth noting that the last poll noted in the RCP average in Michigan had Trump at +3.4%.
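Tallying the state-level misses described above in one quick sketch (illustrative, using the margins quoted in this post; positive numbers are HRC leads):

```python
# Final RCP averages vs. actual results in the three decisive states,
# as quoted above. Margins are HRC-minus-Trump, in percentage points.
states = {
    "WI": {"polled": 6.5, "actual": -0.7},
    "PA": {"polled": 1.9, "actual": -0.7},
    "MI": {"polled": 3.4, "actual": -0.3},
}

for name, m in states.items():
    miss = m["polled"] - m["actual"]  # how far the average overstated HRC
    print(f"{name}: polled HRC {m['polled']:+.1f}, actual {m['actual']:+.1f}, "
          f"miss {miss:.1f} points")
```

The contrast with the roughly one-point national miss is the whole story: Wisconsin’s average was off by more than seven points, while the national average was off by about one.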
The general lessons here, therefore, are as follows:
- The polling itself, especially the national polling, in 2016 was pretty good (despite media narratives that were reinforced by the emotions of media consumers, whether it was elation at Trump winning when he seemed doomed or despair because Trump won when he seemed doomed).
- State-level polling is more problematic than national polling. It is expensive and less polling is done in some states (although I can guarantee a lot of polling dollars will flow into MI, PA, and WI this year).
- In general, the broader population (often including reporters) doesn’t understand basic polling issues (including basic issues like margin of error) and very frequently fails to grasp probability (see, e.g., the thriving lottery industry in this country).
- Models and predictions can create false understanding.
- Nate Silver has the best track record out there to this point over several electoral cycles (but he isn’t a wizard or soothsayer).
- Sam Wang and Natalie Jackson at HuffPo blew it in 2016. That doesn’t mean that the polling in 2016 failed. It means that they screwed up their applications of the polling data.
Now, am I saying that, because Biden has good numbers now, he is going to win? No, not in the least. I do think that there are some good signs in the numbers for him (such as his ability to hit 50% and the general stability of his lead). It is also true that there is a reasonable ability to infer EC outcomes from national numbers when the margins are large enough.
Still, I expect even more state-level polling in 2020 than we saw in 2016 and there is still a lot of time to go.
I am saying that it is likely, however, that pollsters will be able to accurately gauge, within a reasonable margin of error, what public support is for the candidates, just as they did in 2016.