Statistical Ideas: Presidential coin flip

Short-term update: top Trump surrogate and Transition team advisor shares our latest analysis. Also tweeted here by the great Nassim Taleb. Polling statistics from Statistical Ideas topped 4 million reads in the past month. Latest article is here.

Back in March 2016, we computed that then-candidate Donald Trump had a nearly 1/3 chance of winning in a general election match-up versus Hillary Clinton. To many seers in the mainstream media and in the election forecasting community, going into 2016, this probability seemed preposterously high. Huffington Post’s editorial board refused to cover Mr. Trump, the NeverTrump movement was spawned, and Nate Silver gave him only a 2% chance of eventually securing his party’s nomination. Shortly after his numerous gaffes (here, here) in his own primary season forecasting results, the latter pundit conceded in June that he was narrow-minded, and then gave Mr. Trump the same 1/3 chance we gave a few months earlier (in March 2016).

Now much has happened since June of this year, particularly for those closely involved with these campaigns. At the end of July (GOP convention) and again in mid-September (after the 9/11 “overheated” health scare), Donald Trump matched Mrs. Clinton in the polls. The spread has otherwise generally remained favorable to Hillary, though it has held at approximately 3% since mid-September, including after the televised debates.

This 3% spread is far less than the 5%+ spread she was basking in during multiple periods from late spring 2016 onwards. The Presidential election is now just several weeks away, and the polls are getting a more solemn stare. Pollsters, though, are still flip-flopping, apprehensively unsure of how to deal with an election season that can easily humiliate them yet again.

Therein lies the probability and statistics milieu that we are in. A 3% spread is still being given 3:1 odds (75% chance for Hillary Clinton) based on market bets and the related volatility in same. Yet we saw after the Brexit vote over the summer that such bookies can often be quite mistaken. And in this case they are certainly wide of the mark! How can the spread narrow from 5%+ to 3%, and yet the winning probability for that spread-leader rise? It can’t. Not to mention that there are extra qualms that are missing in this case. Most notable are the stochastic nature of polling results from here (through Election Day), systematic selection mismatch (not necessarily bias in expectation) in the polls, a high sense of melancholy and uncertainty among undecided voters, and the hasty rise in 3rd party candidates. One must consider all of the uncertainties here, and all of the links in this article, which include my previous analysis of mainstream and market-based forecasters.

The last few uncertainties (the unknown unknowns) have compelling empirical levels associated with them, based on data gathered by a team of researchers (including Andrew Gelman, also of Columbia University). Scraping together a disparate swath of state-level electoral college data sources, where we happened to see tighter margins of error versus the errors for gubernatorial and congressional elections, their paper shows an outrageous 7:4 ratio in the error for just 2-person competitions (obviously from probability theory this error would lead to even higher spread ratios with the introduction of multiple 3rd party candidates).

To pursue this lead, we asked all four of these researchers for their opinion about the history of these uncertainties, and which candidate might be leading the uncertainty source “this time”. The response, from David Rothschild, was to counter that this might not be a great concern in this case, and that they could “predict expected movement [of people claiming support for 3rd party candidates] towards major party candidates”. And we certainly appreciate the judgment on this article nonetheless! I also asked my friend and close Donald Trump advisor, Anthony Scaramucci, for thoughts on these topics and how his candidate was able to narrow the spread over the course of 2016. He has shared this information with his powerful network and provided multiple shrewd thoughts (off the record) to help ground some of the ideas for this article. Last, we directly reached out to a Hillary Clinton strategist and fundraisers for counsel on this article and will fill in with a quote as soon as that is received.

Now factoring in both the recent 3% spread we mentioned early in this article, as well as the increased uncertainty risk we saw in this research (plus both the stochastic nature of how intentions may fluctuate and the weak Bayesian priors many have inherent in their prediction models), the probability of Donald Trump winning is actually closer to 45%. And all 3rd party candidates have <<1% probability of winning. In the chart above we see the basic probability in the markets is ~25%, and if the conditionals in green or blue ever came true then uncertainty on the horizontal axis would tighten a bit. That's not all; what we still see in purple is the knockoff movement that subjective overlays can still have on market predictions. Instead, the primary probability difference we've always seen in our "accurate level" versus the markets is in the orange principal direction (and other leading forecasters have relished that about our research). Or closer to 45%. And of course as the weeks move closer to November 8, the probability for Trump (if he's still lagging then) will lower slightly per the red anchor.

This all shows how tight the race really is. Essentially a coin flip, at this point. Not only would a coin flip lead to superior results versus prejudiced forecasters (supplanting new insights with greater weights than the larger body of modeled data), but a coin flip (even if weighted just a little bit “unfairly”) would often result in the spread-laggard eventually winning an upcoming national election!

To give a sense of that, assume that a coin is weighted 55% towards H, and 45% towards T. Tossing the coin three times could be used to represent three upcoming presidential elections, where the probability remains similar to where it is right now. Given this set-up, there would be an 83% probability that the spread-laggard would win at least one of the three upcoming elections (not sure of the order but certainly at least one of them nonetheless). And there is only a 17% chance that the spread-leader would win all of the next three elections. And even if the probability of H were as high as 75% (hence signifying a 50% spread versus T at 25%), the probability of this spread-leader winning all three elections is only 42%!
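The coin-toss arithmetic can be checked with a few lines of Python. This is only a minimal sketch: it assumes the three elections are independent tosses with the same fixed weighting, and the function names are hypothetical, chosen just for illustration. The chance of the leader sweeping all three is the leader's probability cubed (0.55 cubed is about 0.166), and the chance of the laggard winning at least once is its complement.

```python
def p_leader_sweeps(p_leader: float, n_elections: int = 3) -> float:
    """Probability the spread-leader wins every one of n independent elections.

    Assumption: each election is an independent toss with the same weighting.
    """
    return p_leader ** n_elections


def p_laggard_wins_at_least_once(p_leader: float, n_elections: int = 3) -> float:
    """Complement of a leader sweep: the laggard wins one or more elections."""
    return 1.0 - p_leader_sweeps(p_leader, n_elections)


if __name__ == "__main__":
    # The two coin weightings discussed in the text: 55/45 and 75/25.
    for p in (0.55, 0.75):
        sweep = p_leader_sweeps(p)
        upset = p_laggard_wins_at_least_once(p)
        print(f"P(H)={p:.2f}: leader sweeps all three {sweep:.1%}, "
              f"laggard wins at least once {upset:.1%}")
```

Changing `n_elections` shows how quickly even a heavily favored leader's chance of an unbroken run decays as more elections are strung together.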

So that is your real November surprise! The surprise that despite all the delirious and deceptive odds we provide for consumption before the election, unless you energetically get involved and play fair with pollsters, the fate of The Free World might be reduced to more-or-less a coin flip.

Thursday, October 6, 2016
