More on this “Stolen” Election

One of the things that many people seem to be hanging their hat on in terms of the “Bush Stole the Election” meme is the exit polling. Early in the day, leaked exit polls showed that Kerry had some pretty substantial leads in key battleground states. People then look at the margin of error (MOE for short) and try to do some back-ass-ward calculation to come up with some completely stupid “probability” that Bush stole the election.

I’m sorry to do this to all you budding probability theorists, but the above kind of stuff is just plain and simple crap. These exit polls rely on a view of probability and statistics known as Frequentist. What does that mean? It means you rely on a view of probability that rests on the frequency of an event happening over a large number of trials (if you are guessing this isn’t the only view of probability, you are quite right). The first immediate problem is that the Frequentist approach really cannot be used for one-time events. Sporting events are a good example. The weather, that year’s team, the injured players for that game, etc. all play a role in determining the outcome. Hence you cannot have repeated trials of the same game over and over to get some sort of handle on the probabilities of which team wins. Now it is not a huge leap from the sporting event example to elections. Elections are one-time events. We cannot go back in time and re-run the election; never mind making sure that, if we could go back in time, everything happened exactly like it did the first time through. So why do people use Frequentist statistics on elections? It is easy. Just about every canned software package out there has Frequentist techniques built in and almost totally ignores other approaches to statistics (e.g., the Bayesian approach, which does not have the above problem with assigning probabilities to one-time events).

Now why do all canned statistics packages have Frequentist techniques? Because the technique is always the same. There is nothing different if you apply linear regression analysis to agricultural data than if you are applying it to astrophysics. The techniques are always the same. Always. The nice thing (and in my view also the bad thing) about Frequentist statistics is that it is mechanistic. Basically it provides a very nice way for people with little or no grasp of statistics to crunch numbers and get a Result. That magic thing…the Result. What is even better is if the Result is Statistically Significant. With this you have something Important. Now this is nice if you know about statistics, are doing research, and don’t want to be bothered always writing up code to crunch your numbers. The problem is that it also means any Joe Schmoe can come along, dump some numbers in there, and crunch away. Who cares whether it is valid to use that technique, whether the data is messy, etc.? Crunch, crunch, crunch.

So wherein lies the danger with all of this? Let’s suppose we have two candidates for President. The polling data shows that one candidate is leading. The polls have a 95% confidence level attached to them. Wow, that candidate is going to win. After all, the Result is Statistically Significant, right? Well, two days later it breaks that the candidate is also a pedophile and has a 9-year-old child as a lover (this example has been shamelessly stolen from Deirdre McCloskey). Whoops, the trailing candidate now wins by a landslide. But, but, but…those polls. Why, the trailing candidate must have STOLEN THE ELECTION!

No, no, and no! This extreme example highlights one of the features of Frequentist statistics that is almost never…ever mentioned by pollsters. The level of confidence or MOE is a pre-experimental measure. What does that mean? It means that the confidence for Frequentist results is due to the performance of the technique in repeated trials. That is, prior to actually observing the data, there is a 95% probability that the technique will capture the true value of the parameter of interest (POI). However, once the data is observed, the probability degenerates to the trivial case, either 0 or 1 (100%). So those exit polls were based on the notion that 95% of the time they work, but once we have the data in hand, either they are right or they are wrong. Further, there is nothing random about Frequentist results once the data has been obtained. All components of confidence intervals, estimates, etc. are constants. I normally wouldn’t point out that a constant is not random, but I think in this case it has to be pointed out: a constant is not random.
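For anyone who wants to see the pre-experimental point in action, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: a hypothetical true vote share of 0.51, simple random samples of 500 voters, the textbook normal-approximation interval, and made-up function names. None of it comes from the actual exit polls.

```python
import math
import random


def poll_interval(true_share, n, rng, z=1.96):
    """Simulate one poll of n voters and return its 95% interval (low, high).

    true_share is a hypothetical, normally unobservable parameter;
    z=1.96 is the usual normal-approximation multiplier for 95%.
    """
    votes = sum(1 for _ in range(n) if rng.random() < true_share)
    p_hat = votes / n
    moe = z * math.sqrt(p_hat * (1 - p_hat) / n)  # margin of error
    return p_hat - moe, p_hat + moe


def coverage(true_share, n, trials, seed=0):
    """Fraction of repeated polls whose interval contains true_share."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(trials):
        low, high = poll_interval(true_share, n, rng)
        if low <= true_share <= high:
            covered += 1
    return covered / trials


def single_poll_contains(true_share, n, seed=1):
    """For one realized poll the answer is simply True or False."""
    low, high = poll_interval(true_share, n, random.Random(seed))
    return low <= true_share <= high


if __name__ == "__main__":
    # Performance of the procedure over many repeated polls.
    print("coverage over repeated polls:", coverage(0.51, 500, 1000))
    # One poll, data in hand: the interval either has the value or not.
    print("this one poll contains the true share:", single_poll_contains(0.51, 500))
```

Over many repeated polls the coverage should land somewhere near 95%, which is a statement about the procedure. The single poll only ever returns True or False, because once the numbers are in, the interval is just two constants.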

Another problem with Frequentist statistics is the properties that are attached to the parameters that are being estimated. For example, in the case of an election we want to know the probability that Kerry (or Bush) is going to win. The Frequentist view holds that this is a fixed constant that exists…somewhere, and that by sampling we can make inferences about this unobservable fixed constant (feeling a little uncomfortable about this notion? Well hey, that’s your problem; I personally take the Bayesian view). As the example above shows, this isn’t always the case. Sure, you could try to get around this problem by saying, “Well, sure, the probability isn’t fixed for all time, but it is a fixed constant at each instant that moves over time, and any poll is simply an inference about that fixed constant at that moment in time.” But this actually undermines the view that early exit polling meant something 12 hours later, when the polls were closed and it was looking very much like Bush was going to win. I bet if we sent Zogby and all the others back to their phones and they did another poll, the results would look much different.

The bottom line is that you have to be very, very careful when using Frequentist results. Using the results of Frequentist analysis to try and gin up some sort of probability about the election being stolen is difficult at best. First is the problem that the only probabilities associated with those results are trivial (0 or 1). Second is that even a statistically significant result at one point in time does not have to mean that the result is always statistically significant. Third is that, strictly speaking, we shouldn’t be using Frequentist measures for things like elections. So, to all you wannabe statistical investigators, budding probability theorists, and whackos who think that Bush stole the election and you can prove it with exit polling…do yourself a favor. Buy a textbook on probability theory and statistics, preferably one that provides a side-by-side discussion of both Frequentist and Bayesian approaches. At least you won’t look so Goddamned incoherent.

FILED UNDER: 2004 Election
Steve Verdon
About Steve Verdon
Steve has a B.A. in Economics from the University of California, Los Angeles and attended graduate school at The George Washington University, leaving school shortly before starting work on his dissertation when his first child was born. He works in the energy industry and prior to that worked at the Bureau of Labor Statistics in the Division of Price Index and Number Research. He joined the staff at OTB in November 2004.

Comments

  1. ken says:

    It looks to me, James, that you are giving far too much thought and energy to an issue that no one takes seriously this time around. Now if you would only spend as much effort on the stolen 2000 election you might discover something of value.

  2. James Joyner says:

    Ken: The post is by Steve Verdon.

    There are several prominent Democrats actually expounding the “stolen election” theory, however, including sitting Members of Congress.

  3. Sgt Fluffy says:

    Wow, That made my head hurt

  4. capt joe says:

    Moore is already planning on doing a sequel to F911.

    I bet this becomes the centerpoint to his new movie.

  5. Ron says:

    Steve, I knew this was you before I finished the first paragraph. I can’t believe James lets you do this on his blog! Stevelish, indeed 🙂

  6. Gary says:

    Does this mean Kerry actually did win the election, before he lost it?

    It’s a great explanation. Works for me. But my liberal friends’ eyes will glaze after 30 seconds, and then they’ll go back to, “yea, but he still stole the election”.

  7. Joe Carter says:

    An excellent explanation of the flaws in the frequentist approach. My only quibble is that you didn’t explain why using Bayesian probabilities would be any better.

    Though I only have a rudimentary grasp of the Bayesian approach, I think it is much more interesting and useful. Still, I think it would have been instructive to hear a defense of that method.

  8. ken says:

    James, sorry to confuse you with someone else. I didn’t realize you started having guest bloggers.

  9. Anjin-San says:

    Democrats need to deal with the fact we got waxed by a raging mediocracy like Bush.

    We lost. Don’t point fingers, look in the mirror.

  10. Super Fly says: