Nate Silver has a curious post up today. He points out that Obama is leading Romney by an average of 2.3% (according to Real Clear Politics, slightly more according to other averages) in Ohio polls. He then goes through all recent examples when there were at least three polls in a state ten days before an election. He shows that with one exception, these polls accurately predicted the winner in every case in which the winner was ahead by at least 1.5 points.
Okay, but this is a small data set, it includes state polls with a greater than 2.3% lead that wound up being decided by razor-thin margins in favor of the poll leader, and it leaves much important information out. For example, in 2004 Bush was behind in the Ohio polls by .9%, but actually won by 2.4%, a spread of 3.3 points, more than the current Obama-Romney margin. That especially pertinent datum appears nowhere in the post. [Silver’s chart does tell us that Bush won, but isn’t it pertinent to discuss Ohio specifically, and also to point out that he won by 2.4%, a 3.3-point spread, and not, say, by .1%, a 1-point spread?]
And isn’t the broader issue how often state polls are off by at least 2.3%? Just for example, Silver’s chart shows that Bush was leading by .9% in the Florida polls in 2004. He actually won by 5.0%, a difference of 4.1 points. That didn’t change the “outcome,” but isn’t it very relevant to how secure a 2.3% lead in the polls is? [I randomly chose this as the first and only non-Ohio poll vs. outcome I looked up. It may be a huge anomaly. If so, Silver surely has the data at hand to tell us that.]
If I were investigating the issue, I’d want to know three things: (1) How often are the state polls ten days out off by at least 2.3%? (2) How often are state polls ten days out off by at least 2.3% in a direction that favors the poll leader? and (3) Very specifically, how often do the state polls favor an incumbent president by at least 2.3% more than he gets on election day? And of course, it would also be useful to know if Ohio polls specifically tend to underestimate the Republican vote (they also did in 2008, but only by about .8%). This would, I think, overall tell you a lot more about how significant the 2.3% lead in Ohio is than what Silver tells us today.
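To make the three questions concrete, here is a minimal Python sketch of how one might tally them. Everything about the structure is my own assumption: the `StateRace` record, its field names, and the convention of writing margins as Republican minus Democrat are hypothetical choices, not anything from Silver's post. The only data in it are the two 2004 examples cited above (Ohio and Florida), so the resulting shares mean nothing statistically; a real check would need the full set of state poll averages.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StateRace:
    state: str
    year: int
    poll_margin: float    # Republican minus Democrat, poll average ten days out (points)
    actual_margin: float  # Republican minus Democrat, election result (points)
    incumbent_is_rep: Optional[bool]  # None when no incumbent president is running


def poll_error(race):
    """Positive when the polls overstated the Republican, negative when they understated him."""
    return race.poll_margin - race.actual_margin


def off_by(race, threshold):
    """Question 1: were the polls off by at least `threshold` in either direction?"""
    return abs(poll_error(race)) >= threshold


def overstated_leader(race, threshold):
    """Question 2: did the polls flatter whoever was ahead by at least `threshold`?"""
    if race.poll_margin == 0:
        return False  # no poll leader to flatter
    sign = 1 if race.poll_margin > 0 else -1
    return sign * poll_error(race) >= threshold


def overstated_incumbent(race, threshold):
    """Question 3: did the polls flatter the incumbent president by at least `threshold`?"""
    if race.incumbent_is_rep is None:
        return False  # open-seat race; still counted in the denominator below
    sign = 1 if race.incumbent_is_rep else -1
    return sign * poll_error(race) >= threshold


def share(races, test, threshold=2.3):
    """Fraction of all races for which `test` holds at the given threshold."""
    return sum(test(r, threshold) for r in races) / len(races)


# The only two data points cited in this post, both from 2004 with Bush as incumbent.
RACES = [
    StateRace("Ohio", 2004, poll_margin=-0.9, actual_margin=2.4, incumbent_is_rep=True),
    StateRace("Florida", 2004, poll_margin=0.9, actual_margin=5.0, incumbent_is_rep=True),
]

if __name__ == "__main__":
    for label, test in [
        ("off by 2.3+ either way", off_by),
        ("off by 2.3+ favoring the poll leader", overstated_leader),
        ("off by 2.3+ favoring the incumbent", overstated_incumbent),
    ]:
        print(f"{label}: {share(RACES, test):.2f}")
```

On these two races the polls missed by 3.3 and 4.1 points, both times understating Bush, so only Ohio (where Kerry led the polls) counts as an error favoring the poll leader, and neither counts as an error favoring the incumbent.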
UPDATE: Obviously, one would rather be in Obama’s position in the polls than in Romney’s. I just don’t see why, if we’re trying to measure just how much better a position it is, we’d use a cruder measure (how often candidates overcame deficits) with a smaller dataset over a more specific measure with a larger dataset (e.g., my number 2 above), even if it’s just for discussion in a blog post as opposed to being his formal forecast model.
And of course, none of this has any effect on the outcome of the election–I don’t understand those who think Silver is intentionally biasing either his model or his blog posts, as if he could somehow wish Obama to victory by giving encouraging news to Democrats who follow politics closely. Not to mention that he has a strong reputational stake in accuracy.
But I find the “horserace” aspect of this election (as opposed to the typical election year pablum emanating from both candidates, which I find just depressing, and as opposed to their actual policies, which are worse than depressing from my libertarian standpoint) especially interesting, and find it fun to discuss. If you are such a partisan that the only thing you have to contribute in the comments is to call either Silver or me a partisan idiot, please don’t comment. Substantive comments, including and especially those that explain, logically, why the author concludes that Silver has the better take on this, are most welcome.