Statistical Ideas: Data hounds released

The markets can't go down in the final month (December) of the year, correct? That's what the PhDs at Kensho would have you believe as they began promoting their high-profile offerings through different organizations, such as CNBC and Wall Street research firms. Kensho, which states that it combines top science pedigree from leading Wall Street and Silicon Valley companies (e.g., Goldman Sachs and Google), has trained its supercomputers to slice and dice the ocean of detailed market data and spout a dizzying array of the best trading ideas for you. But like the ubiquitous Stock Trader's Almanac that never made anyone wealthy, does this modern-day version work on even something simple? Is this the magical panacea that has solved for everything there is to know, allowing Main Street to finally enjoy a level playing field with the systems of elite hedge funds?

Let's examine this one big Santa Claus rally call a little more carefully and see what we can learn more generally:

"[B]etween Nov. 25 and Dec. 31. The S&P 500 performed positively 100 percent of the time. During this time period the median return ... for the S&P 500 was 2.2 percent."

First of all, they quote statistics in a form similar to "x out of the past y times", where for the December trade x=10 and y=10. Is this a good sample, and is the size statistically significant? What happened 11 or 12 years ago and earlier? All of this is left out of the Kensho story. Second of all, they fine-tuned the trading period to always cover precisely November 25 onwards, instead of some other start date, such as noon on November 23, or December 1. One would falsely conclude from their results that there is a nearly 100% chance that we'd see a positive December, likely close to their highly precise-sounding +2.2% (or an S&P of 2110), or at least something good. And a significant gain at that (just annualize what 2.2% monthly comes to!). Sadly a chorus of financial professionals (e.g., Bespoke, James Cramer, MAN hedge fund CEO, etc.) has fallen in line with supportive rationale for this inevitable December rally, and with the idea that computers dispassionately just get things right more often than humans (odd, since systematic trading strategies have not performed well during this recovery). But let's leave common sense aside, and prematurely make way for the Dow 18000 celebratory hats on the floor of the New York Stock Exchange!
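
As a rough check on the two numbers above, the sketch below annualizes a 2.2% monthly gain and asks how low the true yearly probability of a positive window could be while still producing 10 positive years out of 10 at least 5% of the time. It assumes, purely for illustration, that each year is an independent draw with the same probability; the function names are hypothetical.

```python
def annualize(period_return: float, periods_per_year: int = 12) -> float:
    """Compound a per-period return over a full year."""
    return (1.0 + period_return) ** periods_per_year - 1.0


def lower_bound_all_successes(n: int, alpha: float = 0.05) -> float:
    """Smallest success probability p for which n successes in n trials
    still happens with probability alpha, i.e. the solution of p**n = alpha.

    Assumption (not from the article): trials are independent and share
    the same p.
    """
    return alpha ** (1.0 / n)


if __name__ == "__main__":
    print(f"2.2% per month, annualized: {annualize(0.022):.1%}")
    print(f"10 of 10 is consistent with p as low as: "
          f"{lower_bound_all_successes(10):.1%}")
```

Under these assumptions, the monthly figure compounds to roughly 30% a year, and a perfect 10-for-10 record cannot rule out a true probability in the mid-70% range.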

What we've witnessed, squeezed into recent weeks, is mass absurdity. If event A has happened 0 of the past 8 years, and event B has happened 5 of the past 11 years, then what exactly can we conclude about event B (on an absolute basis or relative to event A)? In general, nothing. Event A is rain during the Super Bowl, and event B is a Super Bowl win by a team whose name starts with "New" (i.e., New England, New Orleans, New York). In reality the probabilities of event A and event B might actually be very close to one another, and somewhere between the wide values given by 0/8 and 5/11. It would be foolish to attend the next Super Bowl without first checking the weather forecast, or to blindly wager on the next Super Bowl winner without better understanding the actual teams playing. Here, too, we simply have a small sample size underlying the comparison.
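
To see how little 0/8 and 5/11 pin down, one option is to put an approximate 95% interval around each proportion. The sketch below uses the Wilson score interval, which is my choice of method rather than anything from the original analysis, and it assumes the years are independent trials with a fixed probability.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%).

    Assumption: trials are independent with a constant success probability.
    """
    p_hat = successes / trials
    denom = 1.0 + z * z / trials
    center = (p_hat + z * z / (2 * trials)) / denom
    half = z * math.sqrt(
        p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials)
    ) / denom
    # Clamp to [0, 1] to guard against floating-point spill-over.
    return max(0.0, center - half), min(1.0, center + half)


event_a = wilson_interval(0, 8)    # event A: 0 of the past 8 years
event_b = wilson_interval(5, 11)   # event B: 5 of the past 11 years
intervals_overlap = event_a[1] > event_b[0]

if __name__ == "__main__":
    print(f"Event A: {event_a[0]:.2f} to {event_a[1]:.2f}")
    print(f"Event B: {event_b[0]:.2f} to {event_b[1]:.2f}")
    print(f"Intervals overlap: {intervals_overlap}")
```

The two intervals are wide and overlap, which is consistent with the point that the underlying probabilities could be close to one another.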

Yet the way these market statistics are popping up, as if from some divine program, gives the sense that they mean considerably more than they should. And they are presented in a way one wouldn't normally think about probability in everyday life. See these related articles, which different university syllabi link to, on thinking through combinatorial probabilities (here, here). We can also set up a web calculator in the future, on our web portal, if there is interest in having something automatically solve custom probabilities such as this for you.

If one data mines for narrow, spurious results, then surely they will be found. The best way to study market data is to look at all of it more robustly and ensure a large sample size grounded in fact, instead of trying to fish out sparks of strange signals that mean nothing. As I once heard Professor Shiller say concerning those chasing market results, "if you're looking for trouble, you're going to find it."
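
The data-mining point can be illustrated with a toy calculation: scan enough candidate trading windows that carry no signal at all, and a perfect "10 out of 10" record becomes likely to turn up somewhere. Everything in this sketch is hypothetical, including the 250 candidate windows, the 60% base rate, and the simplifying assumption that windows are independent of one another (real overlapping date ranges are not).

```python
import random


def prob_some_perfect_window(n_windows: int, n_years: int, p_up: float) -> float:
    """Chance that at least one of n_windows independent windows is positive
    in all n_years, when every outcome is up with probability p_up."""
    return 1.0 - (1.0 - p_up ** n_years) ** n_windows


def simulate_best_record(n_windows: int, n_years: int, p_up: float, seed: int = 0) -> int:
    """Simulate signal-free windows and return the best 'x out of n_years' count."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    records = [
        sum(rng.random() < p_up for _ in range(n_years))
        for _ in range(n_windows)
    ]
    return max(records)


if __name__ == "__main__":
    # Hypothetical inputs: 250 candidate windows, 10 years, 60% base rate.
    chance = prob_some_perfect_window(250, 10, 0.6)
    print(f"Chance of at least one perfect window: {chance:.0%}")
    print(f"Best simulated record: {simulate_best_record(250, 10, 0.6)} of 10")
```

With these made-up inputs, the closed-form function puts the chance of finding at least one perfect 10-for-10 window at roughly 78%, even though no window has any real edge.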

Those who bought into this December obsession are no different from those who believed (at the turn of the millennium) that stocks could never lose to bonds over a decade-length period, for no other reason than that this was the pattern over the previous 5-10 decades. Or people, such as Art Cashin, who state, with all the seriousness one can muster, that 2015 will certainly be an up year, only because the year ends with the number "5". Kensho came out promising a fun and enticing +2.2% since November 25, and instead the market has handed out a wild -2% so far (that's nowhere near "at least something good"). Anyone who bought a call option (a leveraged bet) on the S&P at 2110 (or even an at-the-money 2065) is now feeling stupid, thanks in part to pedigreed people confusing others with careless statistics.

Of course anything is possible between here and year-end (the point of this article). But the main storyline won't change: if the market doesn't deliver a positive year-end rally as many of these data hounds suggested, then this should be something to consider as a counterargument to the "this time it's different" way of thinking. It is perhaps hard to imagine now, but a year from now the newly promoted statistic might be to buy December (yet again), since it would have worked an impressive "90% of the time with a median return of 2%". Mastermind hackers will always have a receptive audience with money to waste.

4 comments:

Anonymous, December 12, 2014 at 5:44 AM
Totally agree. Sadly Kensho is a goldmine for CNBC and others. It gives them the ability to spit out interesting statistics that sound like news/analysis but in all likelihood are useless. Now anytime they don't have something newsworthy to talk about (which is most of the time) they can just ask Kensho to spit out a statistic they can talk about.
Anonymous, December 17, 2014 at 10:33 PM
I agree with you totally, but do not get what the events A and B are in saying that in 10 out of 10 years, SPY return was positive. Care to elaborate?

Wednesday, December 10, 2014