Statistical Ideas: Jobs: outliers, and elections

October 2 temporary update: this web log article was first on Bloomberg reads

In this article we examine two different perspectives on U.S. employment data, and show how statistical analysis can look at essentially the same data set in different ways. Our first topic concerns the concept of an "outlier", a term that has shown up often recently as failed pundits have used it to excuse (this is merely one example) their poor forecasting of the previous jobs report (for August). The second topic concerns how employment conditions might be expected to change, depending on the results of next month's U.S. mid-term elections. These two topics cover employment discussions from a couple of recently popular perspectives, from brokers to politicians, and show how a probabilist would consider the information.

Perspective one
The most recent (August) jobs report surprised many pundits with a gain of "only" 142,000 jobs. If an economist or professional speculator has not recently had to personally struggle to find work, then it would be easy to naïvely assume the job market just continues to surge higher at the same high rate, notwithstanding that these high monthly job gains (from earlier in 2014) have not otherwise been seen in very many years. If we do no more than take the 6 months before August for granted, then we'd anchor our calculations on an August jobs guess of 240,000 (and distorted mass layoffs data doesn't help), a value nearly 100,000 more than the actual August reading. Such an over-reliance on a single high point-estimate not only signals weak marksmanship, but is then often complemented with equally poor self-reflection, when the pundit states that the failure wasn't their own fault: "August was an outlier".

August could not be an outlier, basically because such a term requires a much more substantial amount of data; the August value would have to be, by multiple accounts, remarkably out of bounds. It's easy, but false, for a failed pundit to throw around such a technical term when it is hard to prove. Ironically it is sometimes easy to disprove, as we do here. There is, in fact, no one easy formula for categorizing something as an outlier. Some formulas come close, such as the Mahalanobis distance. But even using simple, crude rules-of-thumb, the ~100,000 jobs miss is only a 1 standard error event (see FAQ1 on page 7 of the report).
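The crude rule-of-thumb above can be sketched in a few lines. This is only an illustration: the function name is ours, and the standard error of roughly 100,000 jobs is an assumption backed out of the "1 standard error event" statement, not a figure taken directly from the report.

```python
def miss_in_standard_errors(forecast, actual, standard_error):
    """Express a forecast miss as a multiple of the standard error."""
    return (forecast - actual) / standard_error


anchor_forecast = 240_000  # 6-month average through July (from the text)
actual_august = 142_000    # reported August jobs gain (from the text)
assumed_se = 100_000       # ASSUMPTION: implied by the "1 standard error" remark

z = miss_in_standard_errors(anchor_forecast, actual_august, assumed_se)
print(f"miss = {anchor_forecast - actual_august:,} jobs")  # miss = 98,000 jobs
print(f"miss in standard errors = {z:.2f}")                # miss in standard errors = 0.98
```

A miss of about one standard error sits well inside the two or three standard errors that common rules-of-thumb would ask for before a reading is even considered extreme.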

Let's use a plain eyeball test to help clarify this. Look at the time series below, but with your right hand cover up the rightmost vertical data for August 2014.

If we look at the 6-month average through July, it was 240,000, as we stated. However it should by that point be clear, again with just the naked eye, that the job values have generally been descending since early Q2. A forecast of a sub-200,000 reading for August, or at least of some additional choppiness, would have been rare but not out of line with the data shown. The actual 142,000 reading shows a failed forecast by many, not a sudden outlier in that month's reading. This web log has documented other systematic examples of failed forecasts (see here for an example). When a professional pundit fails to call an economic value to within even the right ball park, they must not escape blame.

Let's take the 10 most recent monthly job gains above, and see how they look in sequence when each is paired with the prior month's reading. We show this in the chart below, and highlight the recent August reading by coloring that marker grey.

So for example, we see that the month prior to August (July) had a monthly jobs gain of 212,000. This value, shown against the horizontal axis, is in the middle of the pack and is not an outlier. For the same grey marker, the value for the subsequent month (August's 142,000) is shown against the vertical axis. That too is assuredly not an outlier. Yes it's below average, but it is hardly so remarkably out of bounds, within the data cluster shown, for anyone to genuinely suggest that it is an outlier.
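For readers who want something more formal than the eyeball test, the Mahalanobis distance mentioned earlier can be applied to exactly this kind of paired data. The sketch below is a minimal pure-Python version for two dimensions; the function name and the choice of the sample (n-1) covariance are our assumptions, and no jobs figures are hard-coded because the chart's underlying values are not reproduced in this article.

```python
import math


def mahalanobis_2d(points, target):
    """Mahalanobis distance of `target` from the cloud of 2-D `points`.

    `points` is a list of (prior_month, current_month) pairs; the sample
    covariance (n - 1 denominator) is used, which is an assumption here.
    """
    n = len(points)
    if n < 3:
        raise ValueError("need at least 3 points to estimate a covariance")
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    s_xx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    s_yy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    s_xy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    det = s_xx * s_yy - s_xy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = target[0] - mean_x, target[1] - mean_y
    # Quadratic form using the closed-form inverse of the 2x2 covariance.
    d_squared = (s_yy * dx * dx - 2 * s_xy * dx * dy + s_xx * dy * dy) / det
    return math.sqrt(d_squared)
```

To use it, one would pass the (prior month, current month) pairs read off the chart as `points`, and the grey marker (212,000 for July, 142,000 for August) as `target`. A distance that is small relative to the rest of the cluster would agree with the visual conclusion above, though with only 10 points any such number should be treated as rough.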

Perspective two

Leaving the micro-calls on employment aside, let's turn our attention to another recently popular discussion theme: the mid-term elections. They are just a month away! It would be great to build a decision framework that assesses the markets and economy, gaming out the likely differences in results based on election outcomes. This is a bit of a strange task: to take a relatively small, discrete number of elections (relative to the sea of continuous market data), and suggest that in the U.S. the former can establish statistically significant differences in the latter.

What further complicates things is that not every two or four year election cycle is the same, given that a party's continuity in office is just as important as the election result itself. For example, we wouldn't mathematically model the second, consecutive Obama election victory the same way we would model his first election victory (the second victory didn't yield a change in political parties). Tracking the differences in economics under a political party during its entire streak in office (even when this covers multiple Presidents) is the task we take on in this article. One of the take-aways from this exercise is that showing a small number of summary statistics can again be misleading. We would need, if possible, more granular insights in order to make better informed decisions. A second take-away is that when looking at election data, we should consider the (statistical) independence of political parties and leadership changes, on a two or four year cycle.

We write this article on the eve of both the September 2014 labor report release, and next month's U.S. mid-term elections. The overall analysis here, though, is critically performed at the level of full-term cycles. The President is not elected at mid-terms; what does occur is a change in Congressional power, which is far more complicated and sometimes ambiguous to model. Due to averaging math, there is a little less of a sample size when any of us move our analysis from the first term's end to the mid-terms of a second term (75% of full elections). For moving two-year Congressional election cycle analysis to the same reference point, there is even less of a sample size (50% of full elections). See the respective calculations below:

(100%+50%) / 2

= 75%

200% * (50%^2)

= 50%

In this article we're going to focus on changes in the important area of labor conditions, as opposed to financial markets, though changes in the latter could just as easily have been analyzed, following a similar playbook to the one shown here. We also use a little more than the past 40 years of data. As noted earlier, we look strictly at changes in political party, and measure the unemployment rate performance during each party's time in office. Other popularly circulated analysis shows, for example, a confusing calculation of George HW Bush's term in office as an independent observation of Republican performance. How could it be independent of the White House previous to his term, when he had just served for 8 years as the Vice President of the U.S.? Rhetorical question! Our correction for this is to show something different here, which is to treat the entire Republican run as one series, covering all three terms of Reagan and the elder Bush.

The descriptive analysis that emanates from this small time series of data shows that Democrats (in blue) typically inherit a 7.6% unemployment rate, and exit the White House with a 1.2 percentage point decrease in the unemployment rate. The Republicans (in red) typically inherit a 5.5% unemployment rate, and exit the White House with a 2.0 percentage point increase in the unemployment rate. In the charts below T1 represents the time of the new political leadership, and ch. represents the total change while that party remains consecutively in office.

The issue with this sort of summary statistic is that it would confuse many into simply believing that Democrats have a better performance against the unemployment rate. Of course the story is much more complicated. If we dig deeper into the data, then we see that there have been 72 recessionary months during the past several decades of the study. Only 11 of those were when Democrats were in office. In the chart below, we partition the party performances based on growth months and recession months.

Let's spend some time on the chart above. We see that the performance of Democrats in growth periods is not necessarily better than that of Republicans, since the former inherited a generally weaker economy where perhaps an even stronger improvement in the unemployment rate could be expected. During recessionary periods, for which the Republicans tend to get chastised more, they generally preside over less painful recessions (albeit historically they have had more time in recessionary periods, at 61 months or 22% of the Republican president months). The typical Republican recession starts at a 5.4% unemployment rate, and rises by 2.5 percentage points. The typical Democratic recession, on the other hand, starts at a higher 7.0% unemployment rate, and still rises by 1.7 percentage points. The current unemployment rate is 6.1% (one can see details on the front page of the labor department report linked further above), which suggests that we are closer to the type of environment where Republicans would generally be elected into the White House.
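The partition arithmetic above can be checked quickly. The sketch below uses only the figures quoted in the text; the variable names are ours, and the implied count of Republican president months is an approximation, since it is backed out of the rounded 22% figure.

```python
# Figures quoted in the text.
total_recession_months = 72
dem_recession_months = 11
rep_recession_months = total_recession_months - dem_recession_months  # 61

dem_share = dem_recession_months / total_recession_months
rep_share = rep_recession_months / total_recession_months

# 61 months is stated to be 22% of Republican president months, so the
# implied total is only approximate (22% is a rounded figure).
implied_rep_months = rep_recession_months / 0.22

# Typical recession by party: (starting unemployment rate, rise in points).
typical_recession = {"Republican": (5.4, 2.5), "Democratic": (7.0, 1.7)}
ending_rate = {
    party: round(start + rise, 1)
    for party, (start, rise) in typical_recession.items()
}

print(f"Democratic share of recession months: {dem_share:.1%}")   # 15.3%
print(f"Republican share of recession months: {rep_share:.1%}")   # 84.7%
print(f"Implied Republican president months: about {implied_rep_months:.0f}")  # about 277
print(ending_rate)  # {'Republican': 7.9, 'Democratic': 8.7}
```

So the typical Democratic recession shows a smaller rise, but it both starts and ends at a higher unemployment rate, which is part of why a single summary number per party can mislead.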

Sure, there are somewhat conflicting signals from the two perspectives we discuss above. Some will argue heavily against the second perspective (as though it means much more than it does), suggesting that the Democrats are here unfairly blamed for the recent global financial crisis; that's also wrong. Mathematically, the argument fails normal logic, since the entire economic impact of the recession in the Democrats' values charted above is symmetrically amplified in reverse in their later recovery performance. We can recall that the economic recovery, which I helped lead an important part of, has so far taken place with the Democratic party continuously in the White House. Even prior to our statistical partitioning of the analysis (between recession and growth months), the performance of Democratic and Republican heads didn't show a strong enough pattern that could be replicated elsewhere; elsewhere as in across time, or disaggregated to the state level.

As an example, the political conclusions here do happen to look similar for New York state governors' economic performances. But these national conclusions are the exact opposite when applied to another large state on the other coast: California. And other regions, such as the District of Columbia, suffer from a lack of data, since there weren't enough changes to Republican leadership over the past 40 years to facilitate any probability conclusion at all.

This feeds into our overall article message. It is difficult to draw meaningful market or economic conclusions based exclusively on recent data, or even on broad themes such as changes in political party. Also, a somewhat promising statistical signal from one perspective can just as quickly be contradicted by a less optimistic signal from another perspective.

Final housekeeping: Here are some unrelated items inserted at the end of this web log article. The first thing is that our previous article was miscategorized as spam by some people's e-mail service providers. We therefore suggest that you clean up your junk folders, just a couple times a year. When you do, you can explicitly filter important messages, such as this one, into a special folder so that your e-mail service provider is disallowed from miscategorizing any similar messages going forward. The second thing is that the Statistics Topics book has been top ranked, two months straight, on this prominent reader's monthly sales list (new list out soon). To celebrate this achievement and to broadly make the book accessible to my current autumn university students, I'll have reduced prices for the next week. Also Significance magazine is, this week only, soliciting brief nominations at this e-mail for the best statistics book published in 2014. Please give back to our collective community of learning by writing reviews for Significance, Amazon, or elsewhere. It's very easy to do as well: just focus on important themes such as why you like a certain book, how it differs from other related books you have read, whether it is a book you have recommended to others, and what in the book you have been able to apply in your daily life. Thanks much for everyone's support; this web log will sometime soon turn two years old, and it has been (as you know) a tremendous pleasure.

Tuesday, September 30, 2014
