Statistical Ideas: Distribution risks

From the simplest parametric probability distributions to the most sophisticated nonparametric ones, it is essential to learn the key elements of managing one's distribution risk. This could be the risk of a particular income scenario, the risk of what age you will die, or any number of other unknowns. Knowing this will help you make more informed life decisions about whether to face a risk by yourself (e.g., longevity risk) or to share it broadly with many others. Some of the ideas here need to be thought through carefully, since readily available government actuary tables do not exist for many practically relevant population clusters.

First, let's look at the plainest distribution, which is the Bernoulli (or a simplified binomial). See some preparatory articles on this (here, here, here, here). We start with one person "X", who we might consider to be ourselves. Person X is expected to have either an income of 0 with probability 50%, or an income of 1, also with probability 50%. So the expected income for X is 0.5. Now we can introduce a second person "Y", with greater income potential (and an expected income of 1.5). If we expect that Person Y's income would correlate with Person X's (e.g., either they are both in the higher scenario or both in the lower scenario), then the average income between them is either 0.5 or 1.5, depending on the scenario. See the upper left diagram below.

Other things could happen, of course. Looking at the lower right diagram in the illustration above, we consider that multiple things may happen simultaneously in some theoretical closed-form model. For example, when X and Y are "together" (e.g., married, business partners, or neighbors), some dynamics could work on them jointly. Their association might lead Person X to have a greater income distribution and Person Y to have a lower one. Additionally, their distributions may be inversely correlated (as shown by the crossing solid arrows), such that when X has the lower of his or her two income possibilities, Y has the higher of his or her two income possibilities, and vice versa. In such a scenario (as shown with the dashed arrows), both individuals' incomes are more homogenized, and their joint incomes are more consistent (in this case they were spreadless at 1 each). Peruse each of the four diagrams above.
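The two-person arithmetic above can be checked by enumerating the joint scenarios. This is only a minimal sketch: the article gives Person Y's expected income (1.5) but not the two underlying values, so the values 1 and 2 below are an assumption, and the names `same_direction` and `opposite_direction` are illustrative. It covers the simple case where the individual distributions are left unshifted.

```python
from fractions import Fraction

# Person X earns 0 or 1, as stated in the article.
X_LOW, X_HIGH = 0, 1
# ASSUMPTION: Person Y earns 1 or 2, chosen so that E[Y] = 1.5.
Y_LOW, Y_HIGH = 1, 2
HALF = Fraction(1, 2)

# Each joint scenario is (probability, X income, Y income).
same_direction = [(HALF, X_LOW, Y_LOW), (HALF, X_HIGH, Y_HIGH)]
opposite_direction = [(HALF, X_LOW, Y_HIGH), (HALF, X_HIGH, Y_LOW)]

def pair_averages(scenarios):
    """Average income of the pair within each scenario."""
    return [Fraction(x + y, 2) for _, x, y in scenarios]

def expected_pair_average(scenarios):
    """Probability-weighted average income of the pair."""
    return sum(p * Fraction(x + y, 2) for p, x, y in scenarios)

for name, scenarios in [("same direction", same_direction),
                        ("opposite direction", opposite_direction)]:
    print(name, pair_averages(scenarios), expected_pair_average(scenarios))
```

Under these assumed values, the expected pair average is 1 in both cases. Only the spread differs: 0.5 versus 1.5 when the incomes move together, and 1 in both scenarios when they move in opposite directions.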

While the probability math was easier to see with the basic distributions above, let's see how this quick lesson applies to a more complex real-world distribution. Here we will repurpose the 1950 actuary table that we analyzed in the "Centenarian risk" article. It is worth getting comfortable with this table now, since we will continue to reuse this data in future articles. In the left illustration below, we show in red the simulated, expected deaths of people born in 1950 (who we focus our attention on since they are just now retiring in 2015). This distribution is the actuarial equivalent of Person X above, or of Person Y if they are perfectly correlated (r=1) with Person X. A careful reader will have observed a relatively high death rate in the year following live birth (~3%), which makes the first year the only time when the typical remaining lifetime increases after one survives that age!

In green we simply show a simulation of what a second random person's death distribution would be. Of course, taken by themselves (r=0), his or her death distribution would be nearly the same as the national distribution shown for Person X. In fact, by definition the initial actuary table (for the national population) applies to all people, so regardless of how one synthetically tries to sample an (inversely) correlated variable, its distribution will always remain the same. They all share the same expected death age of 68.
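One way to sample a second person whose death age is correlated with the first while keeping the same marginal distribution is a Gaussian copula. The sketch below is an assumption about method, not necessarily how the simulations in this article were produced. The `AGES` and `PROBS` table is a made-up toy distribution, not the 1950 actuary table, and the parameter `r` is the correlation of the latent normal variables, which is only approximately the correlation of the resulting death ages.

```python
import math
import numpy as np

# HYPOTHETICAL toy death-age distribution (NOT the 1950 actuary table).
AGES = np.array([0, 30, 50, 65, 75, 85, 95, 105])
PROBS = np.array([0.03, 0.04, 0.10, 0.18, 0.25, 0.25, 0.12, 0.03])

# Standard normal CDF, applied elementwise.
_phi = np.vectorize(lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))

def sample_pair(r, n, seed=0):
    """Draw n (X, Y) death-age pairs with latent normal correlation r."""
    rng = np.random.default_rng(seed)
    z1 = rng.standard_normal(n)
    # z2 is standard normal and has correlation r with z1.
    z2 = r * z1 + math.sqrt(1.0 - r * r) * rng.standard_normal(n)
    cdf = np.cumsum(PROBS)

    def to_age(z):
        # Map normal -> uniform -> age via the inverse of the toy CDF.
        idx = np.searchsorted(cdf, _phi(z))
        return AGES[np.minimum(idx, len(AGES) - 1)]

    return to_age(z1), to_age(z2)

for r in (1.0, 0.0, -0.5):
    x, y = sample_pair(r, 50_000)
    print(f"r={r:+.1f}  mean X={x.mean():.2f}  mean Y={y.mean():.2f}")
```

Whatever value of `r` is chosen, both people are drawn through the same inverse CDF, so their individual death-age distributions (and hence their expected death ages) agree up to sampling noise.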

In the right illustration immediately above, we see that the average death age of the pairing of the first and second person is equal whether we look at the red (r=1) or the green (r=0) distribution. This summary statistic is far more relevant when thinking about personal risk assessment. We also show in blue what an inversely correlated (r=-0.5) scenario looks like. This is most similar to the diagram in the upper right of the topmost illustration in this article, whereas the red distribution immediately above, for Y=X (or r=1), is most similar to the diagram in the upper left of the topmost illustration. The red arrows show that the left and the right red distributions are equal.

We can see with the blue arrow that while the expected death age of every pairing of Person X and Person Y is still 68, the dispersion about this age is smallest when Person Y is most inversely correlated with Person X. Notice that even in the top right diagram of the topmost illustration, the average incomes were both 1 (instead of being spread between 0.5 and 1.5, as in the top left diagram).
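The pattern in the pair averages follows from a standard identity. As a sketch, assume X and Y share the same mean \mu and variance \sigma^2 (symbols introduced here for illustration) and have Pearson correlation r:

```latex
\begin{align*}
\mathbb{E}\!\left[\tfrac{X+Y}{2}\right]
  &= \tfrac{1}{2}(\mu + \mu) = \mu, \\
\operatorname{Var}\!\left[\tfrac{X+Y}{2}\right]
  &= \tfrac{1}{4}\left(\sigma^2 + \sigma^2 + 2r\sigma^2\right)
   = \frac{\sigma^2 (1 + r)}{2}.
\end{align*}
% r = 1    : variance \sigma^2     (no reduction)
% r = 0    : variance \sigma^2 / 2
% r = -0.5 : variance \sigma^2 / 4
```

The mean of the pair average does not depend on r, while its variance shrinks as r falls from 1 toward -1.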

The paramount lesson we get from this is that by adding more people to the actuarial risk pool, the average death age stays the same regardless of the correlation between people (here, here). Additionally, the typical scatter about the average death age comes down considerably (while the distribution also looks more Gaussian), as should be expected when averaging any set of identically distributed, mound-shaped probability distributions. For the independent, green distribution (r=0), as an example, the highest average death age generally falls from 110 to about 95 (see here and here for details). Lastly, we leveraged the national life table for all of our work here, since the idea of relationships between locations (e.g., the bottom two diagrams of the topmost illustration) cannot be flexibly modeled with government actuary tables. One can find a table for, say, married couples, but not one for guessing differences among custom sets of people, such as students enrolled in the same college.

Thursday, July 30, 2015
