
Friday, March 25, 2016

Mercurial pollster results

Short-term update: shared by the founders of both KDNuggets and Naked Capitalism, and enjoyed by leading academic professors.  Also see the latest article (though the pre-Trump video is here).

On March 1, Trump had a 94% probability to win Alaska, but he instead lost.  A week later on March 8, Clinton had a >99% probability to win Michigan, but she instead lost.  These are not two small failures out of a large lot of primaries, but rather two failed predictions out of the 19 (by then) in which Nate Silver gave his predicted victor a 90% or greater chance of winning.  These 19 forecasts evidence overconfidence in the outcomes, as he claimed an average of a sky-high 98% probability of winning.  Instead, the true success rate of the predictions through Michigan was a far softer and less stable <90%.  Polling is not a sport where merely being right more than half the time is mind-blowing.  Any fool can dumbly "predict" by selecting whoever is leading in the polls, just as any Wall Street "strategist" can, and they all do dumbly "predict" each year that markets will rise.  When assessing this weak polling performance, it is clear that the 2016 record will be made severely worse for Mr. Silver if he fails again in the near future.  Though as we show, he has made changes to help prevent that.

In this article we explore the probability science behind subjectively forecasting from over-fitted polling data; show where Mr. Silver's forecasts fall shy of his trumpeted confidence intervals (due to poor changes on his end more so than simply "bad luck"); and show how, since this double-whammy of failures (Alaska, Michigan), we see statistically different forecasting behavior: he is far more selective, speculating on a smaller fraction of primaries going forward, and when he does forecast he places significantly weaker confidence on his own predictions (in essence now making them ineffectual).  Elections are an important part of our democratic process, and this process continues with a few more primaries in the next couple of weeks (Wisconsin being the grandest).  Now is the time to scrutinize election forecasters, be fair-minded and always go to the booth regardless, and share these insights with colleagues whose motivation is otherwise being altered by desultory polling.

To know how many correct election forecasts one should expect, one merely needs to sum up the claimed probabilities of selecting the correct winner!  Given that, thankfully, not one "forecast" was as egregious as 0% or 100% confident, and the probabilities were generally similar to one another, summing them yields a single metric that one can probabilistically contrast against the observed results.  Between the two parties, there have been 51 primary contests (40 through March 8).  Of those 40 through Michigan, Mr. Silver made predictions on 31 of them.  See the layout of the predictions in the table below (also feel free to see and share the freely available raw data):


                    (A)              (B)                        (C)                         (B/A)           (C/B)
                    # of primaries   # with a Silver forecast   # forecast at >=90% prob.   Participation   Claimed conf.
Up to Mar. 8        40               31                         19                          78%             61%
Mar. 8 to Jun. 14   55               12                         6                           22%             50%
Total               95               43                         25
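As a quick check, the ratio columns follow directly from the counts.  A minimal sketch in Python, with the table's figures hard-coded:

```python
# Reproducing the (B/A) and (C/B) ratio columns from the table's counts.
rows = {
    "Up to Mar. 8":      (40, 31, 19),   # (A) primaries, (B) forecasts, (C) >=90%
    "Mar. 8 to Jun. 14": (55, 12, 6),
}
for label, (a, b, c) in rows.items():
    print(f"{label}: participation {b/a:.1%}, claimed conf. {c/b:.1%}")
# Up to Mar. 8:      participation 77.5%, claimed conf. 61.3%
# Mar. 8 to Jun. 14: participation 21.8%, claimed conf. 50.0%
```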


Now, of the 31 predictions, 19 were claimed to have a confidence of >=90%.  We sum up these 19 overconfident predictions below:

92% + 92% + 94% + 94% + 94% + 96% + 97% + 12 × (>99%) = >18.5

These 19 predictions have a claimed average chance of success of 98%!  And the sum implies one should expect ~18.5 correct predictions out of 19, i.e., nearly all of them.  Instead, Mr. Silver was correct on only 17 of the 19, well shy of the 98% accuracy claim.
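A minimal sketch of this arithmetic in Python, treating each ">99%" entry as exactly 0.99 (an assumption, since those probabilities are only bounded below):

```python
# The 19 claimed win probabilities, with each ">99%" taken as 0.99 (assumption).
probs = [0.92, 0.92, 0.94, 0.94, 0.94, 0.96, 0.97] + [0.99] * 12

expected_correct = sum(probs)              # mean of a sum of Bernoulli trials
print(f"Expected correct: {expected_correct:.2f} of {len(probs)}")          # 18.47
print(f"Average claimed confidence: {expected_correct / len(probs):.1%}")   # 97.2%
print(f"Actual record: 17 of {len(probs)} = {17 / len(probs):.1%}")         # 89.5%
```

With the ">99%" entries taken somewhat above 0.99, the claimed average approaches the 98% quoted above; the actual record of 89.5% falls well short either way.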

We can now perform a hypothesis test to see the chance that such an underperformance could occur from chance alone (and this doesn't even take into account the fact that any fool could have simply "predicted" the leading candidate in the polls to win).  We know that such a high proportion expectation deserves an asymmetrical confidence interval.  This requires advanced probability formulas that we'll discuss in a later article (but essentially we skew the confidence interval to provide a wider reach towards 50% as opposed to towards 100%).  Mr. Silver's results sit at the boundary of a true two-tail ~75% confidence interval.  For probability wonks, it doesn't matter if we use exact methods, or a binomial or Poisson model approximation.  In English, the pollster's results are significantly weak, and any additional near-term election failure would drive the current critical one-tail level down from 12% to just 3.5%!  Just as we expect our airbags to deploy in a car crash >99% of the time, an actual 89% success rate in deployment would be considered a failure, and we have just shown it has a low probability of occurring from "bad luck" alone.  Mr. Silver's recent failure would be excused if it happened only once in every 10 or so Presidential election cycles.  Yet his high-profile failures occur much more frequently than this (here, here) and hence are doubtful to be from chance alone!
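The exact version of this test is a Poisson binomial tail: if the claimed probabilities were accurate (and the contests independent, which is itself an assumption), how often would we see 17 or fewer correct out of 19?  A sketch, again taking each ">99%" as 0.99:

```python
import math

# Claimed win probabilities, with ">99%" taken as 0.99 (assumption).
probs = [0.92, 0.92, 0.94, 0.94, 0.94, 0.96, 0.97] + [0.99] * 12

# Exact Poisson binomial distribution via dynamic programming:
# after folding in each forecast, dist[k] = P(exactly k correct so far).
dist = [1.0]
for p in probs:
    dist = [(dist[k] * (1 - p) if k < len(dist) else 0.0)
            + (dist[k - 1] * p if k > 0 else 0.0)
            for k in range(len(dist) + 1)]

print(f"Exact P(17 or fewer correct): {sum(dist[:18]):.1%}")            # ~9.5%

# A Poisson approximation on the number of misses gives nearly the same answer.
lam = len(probs) - sum(probs)                    # expected misses, ~0.53
print(f"Poisson P(2+ misses): {1 - math.exp(-lam) * (1 + lam):.1%}")    # ~9.9%
```

Both land in the neighborhood of the ~12% one-tail level cited above; the exact figure shifts with how the ">99%" entries are treated, but the conclusion that this is an unlikely outcome under the claimed confidences is unchanged.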

This very topic was brought to my attention through a phone conversation with Michael Shedlock (whom Nobel Laureate Paul Krugman once admired, unlike here).  Mr. Shedlock was intuitively alerted to Mr. Silver's mercurial treatment of "There’s no forecast for (state) yet because there isn’t enough recent polling."  It is clear to us that he is massaging the surveyed polling results using nebulous means, and then attaching false confidence of success to the predictions.  It is elementary probability theory that the prediction confidence interval must be wider than that of the underlying polls, and hence win probabilities as high as Mr. Silver's are universally incredible.
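To illustrate why the forecast's interval must be at least as wide as the polls': under a toy normal model (all numbers here are hypothetical, not Mr. Silver's), a candidate's win probability is Φ(lead/σ), and every source of error beyond poll sampling (house effects, late swings, turnout) widens σ and pulls the probability back toward 50%:

```python
import math

def win_probability(lead_pts: float, sigma_pts: float) -> float:
    """P(win) if the true margin is normally distributed around the polled lead."""
    return 0.5 * (1 + math.erf(lead_pts / (sigma_pts * math.sqrt(2))))

poll_sigma = 4.0                             # sampling error alone (hypothetical)
total_sigma = math.hypot(poll_sigma, 5.0)    # plus model error (hypothetical)

for lead in (3, 6, 9):
    print(f"{lead}-pt lead: polls-only {win_probability(lead, poll_sigma):.0%}, "
          f"with model error {win_probability(lead, total_sigma):.0%}")
# 3-pt lead: polls-only 77%, with model error 68%
# 6-pt lead: polls-only 93%, with model error 83%
# 9-pt lead: polls-only 99%, with model error 92%
```

Claiming >99% from anything short of a blowout lead therefore requires assuming the extra model error away, which is precisely the nebulous massaging described above.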

To give credit where it's due, Mr. Silver has an aesthetically alluring website interface with reasonably easy access to recent predictions.  However, anyone wanting to get to the bottom of the data and its discrepancies in bulk would find it difficult.  More important is that viewers expect to see consistency in the methods employed, and here we instead see differences in the forecasting done subsequent to the recent mortifying misses in Alaska and Michigan.

We see in the table above [column (B/A)] that he is now issuing far fewer forecasts, and being far more selective in choosing primaries where the spreads are much more obvious: from predicting on 78% of the primaries before March 8, to just 22% now.  This is akin to someone asking out less attractive people on dates after their ego is bruised by a string of rejections from more attractive people.

Even more disconcerting, we see in the table above [column (C/B)] that the share of forecasts carrying >=90% assuredness in his selected candidate winning has continued to come down since March 8, and continues to come down in upcoming primary predictions: from 61% of his forecasts before March 8, to ~55% from March 8 to today, to 40% for upcoming elections.  Since March 8, this averages to just 50% on the "fewer and more selective" primary forecasts that Mr. Silver now chooses to predict.

In summary, one should be cautious in evaluating a discretionary and biased overlay on underlying data.  Wall Street strategists perform worse than a coin flip, as we show in the linked article above.  And Mr. Silver's predictive accuracy is made worse through his efforts (presumably) to optimize for the strongest probability outcome.  The underlying polling data is freely available, and serves a more important purpose for you and your colleagues than someone's trumpeted "crystal ball".  There is always an allure to someone claiming magical clairvoyance, though rarely do we see it exist (here, here).

Free email subscription enjoyed by thousands (note you must complete the process by accepting the immediate email confirmation):

