In his latest debacle, Nate Silver was among the many pollsters and pundits who in 2016 forecast a very high probability of a Hillary Clinton win (after earlier giving Donald Trump only a 2% probability of making it through his primaries). This high “probability” was of course mistaken, and it sets an extraordinarily high bar for the actual delivery of outcomes. His forecast sat in the high-80% range for a couple of months prior to the election.
These high probabilities are nothing new for Nate Silver, but they also provide an opportunity to examine how poor polling works and how we might be better served ignoring it and instead listening to one another. No one wins all of the time. And one would have to conclude, based on Nate Silver’s statistics, that he would be correct ~85% of the time, instead of really being incorrect 85% of the time! Looking under the hood of his national forecast, we can see his even more disastrous state-by-state analysis, which people enjoyed but which fed into his poor overall forecast. Now keep in mind that one is evaluated, as a forecaster, on actually forecasting better than the average bloke. For example, everyone knows that a top-seeded basketball team will beat a bottom-seeded basketball team in NCAA March Madness. So if everyone guesses that, and everyone then turns out to be right, that doesn’t mean everyone is an above-average mastermind!
What would be a good baseline for presidential elections? The “dumb forecaster” concept in the journal literature of the American Statistical Association, where I have served on an editorial panel, is to simply guess the same election results as those of the previous election. No more, no less. Put differently, one doesn’t need to think any further than to state that every state’s 2016 electoral result will be the same as it was in 2012!
And yet such a dummy would have gotten 44 of the 50 states correct. The 6 states the dummy got wrong are obviously the 6 states that flipped (notably, they all flipped to Donald Trump). Most pundits, including our parents and children, if asked to guess the electoral map using only past election outcomes and no polling “insight” or punditry, would also have averaged about 44 states correct. That’s 88%: so we might think we’re virtuosi, but we’re all just doing the same as a dummy who got 44 states correct without a single thought about this election.
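To make the dummy baseline concrete, here is a minimal sketch in Python (my own illustration; the tiny three-state dictionaries are placeholder data, not the full 50-state record) of how such a rule would be scored:

```python
# Minimal sketch of the "dumb forecaster" baseline: predict that every state
# simply repeats its previous presidential result, then score that guess.

def baseline_accuracy(prev_winner, actual_winner):
    """Share of states where the prior election's winner repeated."""
    states = actual_winner.keys()
    correct = sum(prev_winner[s] == actual_winner[s] for s in states)
    return correct / len(states)

# Illustrative fragment only (not the full map): Florida and Ohio flipped
# from 2012 to 2016, Texas repeated, so the dummy scores 1 of 3 here.
prev_2012 = {"FL": "D", "OH": "D", "TX": "R"}
actual_2016 = {"FL": "R", "OH": "R", "TX": "R"}
print(baseline_accuracy(prev_2012, actual_2016))  # 0.33; nationally 44/50 = 88%
```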
What’s worse is that all of the infrastructure backing Nate Silver led him to forecast fewer than 40 states correctly! That is statistically a lot worse than a dummy, and it completely exposes multiple issues with his models. Of the 6 states that flipped, he was correct on only 2, which almost completely accounts for his catastrophic breakdown. In a handful of other states he incorrectly flipped the result for no reason, and each time he was wrong, further weakening his perceived forecasting shrewdness.
Let’s peer into the 6 states that flipped, and Nate Silver’s analysis heading into the election:
| State | Electoral votes (Nate Silver wrong or correct) | Nate Silver’s probability of Clinton winning the state | Nate Silver’s probability the state could tip the election |
| --- | --- | --- | --- |
| Florida | 29 (wrong) | 51% | 17% |
| Pennsylvania | 20 (wrong) | 76% | 11% |
| Michigan | 16 (wrong) | 79% | 11% |
| Wisconsin | 10 (wrong) | 82% | 4% |
| Ohio | 18 (correct) | 65% | 7% |
| Iowa | 6 (correct) | 73% | 1% |
Nate Silver generally had about a 72% probability of being correct in each of the first four states above (totaling a monstrous 75 electoral votes, and he got every one of them wrong). And in more than a handful of other states that he incorrectly flipped, he gave each roughly a 75% probability of being correct. But his overall election probability was lower, at 71%. This is mathematically inconsistent. Through the central limit theorem applied to the sum of a series of variables (see page 18), the overall forecast must carry lower uncertainty and therefore a more robust probability than the individual states. Professor Nassim Taleb exposed yet another flaw: when we fit a stochastic model to election probabilities, with forecasting uncertainty much higher than recognized (by Nate Silver and many others), the election probabilities (assuming the surveys were valid and unbiased to begin with) should have been essentially tamped down to coin flips in the many months before the election (something we stated as well).
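As a rough illustration of the aggregation point, here is a toy Monte Carlo sketch in Python (my own construction, not Mr. Silver’s actual model; treating the states as independent coins and requiring any 2 of the 4 are assumptions purely for demonstration). Combining several states that each carry a 51%-82% probability yields an overall probability well above any single one of them, not below:

```python
# Toy aggregation sketch: each state is an independent "coin" with its own win
# probability; the combined chance of clearing a modest threshold exceeds any
# individual state's probability.
import random

def simulate_overall(p_states, needed, trials=100_000):
    """Estimate the probability of carrying at least `needed` of the states."""
    wins = 0
    for _ in range(trials):
        carried = sum(random.random() < p for p in p_states)
        if carried >= needed:
            wins += 1
    return wins / trials

# Illustrative inputs: the four flipped-state probabilities from the table,
# with an assumed requirement of carrying any 2 of the 4.
print(simulate_overall([0.51, 0.76, 0.79, 0.82], needed=2))  # about 0.94
```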
So now that we know these probabilities were totally off the map, what does this mean for someone utilizing these forecasts? We’ve also presented in the rightmost column the chance that a co-variant polling shift wouldn’t catch all of these states at the same time. To be a tipping state implies that this is the most marginal surprise of the election. Nate Silver got four tipping states wrong, and they all had large electoral vote counts that mattered. He was also incorrect when he confidently swaggered that the probabilities of those same four states mattering were each just 4% to 17%. So we supposedly had a rare 4% probability event, combined with a rare 17% probability event, and so on. This shows an extreme degree of confidence that each state’s forecast was in fact independent of the others, when, as we saw with the complete missing of the Donald Trump movement, Nate Silver’s models were overly reliant on Hillary Clinton winning to begin with, assuming any Donald Trump surprise in one location would be countered by Hillary Clinton in another (it wasn’t). So we see from these low 4%-17% probabilities the preposterous assertion that Mr. Silver’s election outcome was simply this extreme a fluke, with nothing more for anyone to learn. After all, this pollster has a live track record of one decade, tops. The correct analysis instead is that Nate Silver’s models are wholly inconsistent between his national forecast and his state forecasts, and are poor replicas of what’s happening on the landscape, and dummies simply taking guesses are correct more often. Nor is this a popular vote issue: in the many states Nate Silver got wrong, Hillary Clinton lost the popular vote in addition to the electoral votes for those states (there is no mathematical defense for over-sampling and over-campaigning in California).
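To see how much work the independence assumption is doing, here is a small companion sketch (again my own toy model, not anything from FiveThirtyEight; the single shared-shock construction and its 0.9 weight are assumptions chosen only to show the direction of the effect). It compares the chance that all four tipping-state surprises occur together when polling errors are independent versus when they share one co-variant shift:

```python
# Toy comparison: joint probability of all four "tipping" surprises firing in
# the same simulated election, with and without a shared (co-variant) shock.
import random

def joint_surprise_probability(p_each, shared_weight=0.0, trials=500_000):
    """Chance that every state's rare event occurs together; shared_weight
    blends one common shock into each state's draw."""
    joint = 0
    for _ in range(trials):
        shock = random.random()  # one common polling shift across all states
        draws = [shared_weight * shock + (1 - shared_weight) * random.random()
                 for _ in p_each]
        if all(d < p for d, p in zip(draws, p_each)):
            joint += 1
    return joint / trials

tip_chances = [0.17, 0.11, 0.11, 0.04]  # rightmost column of the table above
print(joint_surprise_probability(tip_chances, shared_weight=0.0))  # ~1 in 12,000
print(joint_surprise_probability(tip_chances, shared_weight=0.9))  # ~100x larger
```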
Now, we got to this point because we have poorly constructed samples. This has been happening for far too long, and usually no one notices because someone may be close to the right answer, including a dummy. Rasmussen Polling is a great example in this case, with a poor forecast for the 2016 election (albeit correct by chance alone), and suddenly their 50%-ish current job approval rating seems to have outsized credibility. The second gap is the mapping from the surveys to what matters in the booth: people consistently saying one thing, but thinking or doing another. There is not enough sampling to overcome some things, including those who don’t like to be sampled. We can also have, for example, a liberal pollster paid for by a liberal media organization (or showcased in recent contender Jon Ossoff’s campaign e-mails!), and then it’s very easy to see how the cast of important assumptions about how they survey people’s responses will suddenly be completely skewed in one direction. Without that, the liberal pollster would get no attention. Take care. This whole game leaves a public that is later bewildered, having assumed one version of the story was far more appealing and righteous than it actually was. Such as this partisan, fat-tail devotee, below.
Tweet from Statistical Ideas (@salilstatistics), June 16, 2017: “attacks on congress = ratio/rate as on public, though these mortality-cause contrasts are dumb. 1954 capital incident wounded 5. cc @nntaleb” pic.twitter.com/IKg5LmeH64