## Combining Forecasts

James Hamilton has a good post on why you should combine forecasts. The basic story is that each forecast carries a different amount of information, so by combining forecasts you'll most likely end up with better information than either forecast in isolation. For those of you who want the technical aspects, Hamilton gives the following example:

> Suppose we have available two polls that have surveyed voters for a particular election. The first surveyed 1,000 voters, and found that 52% of those surveyed favored candidate Jones, with a margin of error of plus or minus 3.2%. [By the way, in case you've forgotten your Stat 101, those margins of error for purposes of evaluating the null hypothesis of no difference between the candidates can be approximated as (1/N)^0.5, or 0.032 when N = 1,000]. The second poll surveyed 500 voters, of whom 54% favored candidate Jones, with the margin of error for the second poll of plus or minus 4.5%. Would you (a) throw out the second poll, because it's less reliable than the first, and (b) then conclude that the evidence for Candidate Jones is unpersuasive, because the null hypothesis of no difference between the candidates is within the first poll's margin of error?
>
> If that's the conclusion you reach, you're really not making proper use of the data in hand. You should instead be reasoning that, between the two polls, we have in fact surveyed 1,500 voters, of whom a total of 520 + 270 = 790 or 52.7% favor Jones. In a poll of 1,500 people, the margin of error would be plus or minus 2.6%. So, even though neither poll alone is entirely convincing, the two taken together make a pretty good case that Jones is in the lead.
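Hamilton's arithmetic is easy to reproduce. Here is a quick sketch (my code, not his) using the rule-of-thumb margin of error quoted above:

```python
import math

def margin_of_error(n):
    """Rule-of-thumb margin of error for a 50/50 null: roughly (1/N)^0.5."""
    return 1.0 / math.sqrt(n)

def pool_polls(polls):
    """Pool polls given as (sample_size, share_for_candidate) pairs."""
    total_n = sum(n for n, _ in polls)
    favor = sum(n * share for n, share in polls)  # 520 + 270 = 790 voters
    return favor / total_n, margin_of_error(total_n)

print(round(margin_of_error(1000), 3))  # 0.032 -- the first poll
print(round(margin_of_error(500), 3))   # 0.045 -- the second poll

share, moe = pool_polls([(1000, 0.52), (500, 0.54)])
print(f"{share:.1%} +/- {moe:.1%}")     # 52.7% +/- 2.6%
```

Neither poll alone rules out a tie, but the pooled estimate of 52.7% clears the pooled 2.6% margin of error.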

We saw something similar to this with the latest release of unemployment numbers that both Prof. Hamilton and I discussed.

Prof. Hamilton also raises the issue of bias, something many statisticians are concerned with. For example, econometricians of the frequentist school will tell you that their estimators are BLUE: the Best Linear Unbiased Estimator. However, as Prof. Hamilton points out, it might be worth accepting some level of bias in exchange for a smaller mean squared error:

> In doing so, we acknowledge that we may make a systematic error in inference that you will avoid, but we will nevertheless be closer to the truth most of the time than you will if there are substantial benefits to bringing in extra data.
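A concrete illustration of that tradeoff (my sketch, not from Hamilton's post): the sample variance that divides by n is biased, while the n − 1 version is unbiased, yet for normal data the biased version typically lands closer to the truth on average.

```python
import random

random.seed(42)

# Monte Carlo check: the biased variance estimator (divide by n) can have a
# smaller mean squared error than the unbiased one (divide by n - 1).
# Illustrative numbers only: 10 normal observations, true variance 1.
true_var = 1.0
n, trials = 10, 100_000
se_biased = se_unbiased = 0.0
for _ in range(trials):
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    mean = sum(xs) / n
    ss = sum((x - mean) ** 2 for x in xs)
    se_biased += (ss / n - true_var) ** 2         # biased estimator's error
    se_unbiased += (ss / (n - 1) - true_var) ** 2  # unbiased estimator's error

mse_biased = se_biased / trials
mse_unbiased = se_unbiased / trials
print(mse_biased < mse_unbiased)  # True: the biased estimator wins on MSE
```

The biased estimator makes a systematic error, but its smaller variance more than compensates, which is exactly the point being made above.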

In other words, just because an estimator or estimate is biased doesn't mean we should ignore it completely. Just some food for thought the next time you see some estimates or forecasts.

Mightn’t the Central Limit Theorem come into play here?
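It does: the margins of error above lean on the normal approximation that the Central Limit Theorem delivers. A poll share is an average of many Bernoulli draws, so its sampling distribution is approximately normal with standard deviation √(p(1−p)/N). A small simulation (illustrative numbers, not from the post) bears that out:

```python
import random
import statistics

random.seed(1)

# Each simulated poll averages n Bernoulli(p) draws; the CLT says the
# resulting shares should be roughly normal with sd sqrt(p*(1-p)/n).
p, n, trials = 0.527, 1500, 4000
shares = [sum(random.random() < p for _ in range(n)) / n for _ in range(trials)]

observed = statistics.pstdev(shares)
predicted = (p * (1 - p) / n) ** 0.5
print(round(observed, 4), round(predicted, 4))  # the two should nearly match
```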

This is what RealClearPolitics does. In 2004 their final electoral map looked like this. Only Wisconsin was wrong.