The definition of insanity is doing something over and over again and expecting a different result.
Variations of this captivating expression have been around since the early 1980s (though often misattributed, a generation earlier, to the late physicist Albert Einstein). The underlying ideas, however, have roots etched into major religions, such as the ideology of Taoism. The world is more complicated, though, and natural probability models can properly clash with well-worn expressions like “this time it’s different.” In this article we explore many specific ideas where closed-form probability frameworks can be distinctly leveraged to help think through risk choices where the assumption of normality breaks down.
Sometimes life events present explicitly abnormal risks, as in a continuous model; other times we see experiences that behave like an annuity; and other times still we show modern techniques for thinking through lump-sum models, random walk models, and the manipulation of multiple normal distributions. The latter two topics can be thought of in terms of life phenomena having ordered chaos versus creative chaos. Regardless, we show here that in many situations people overestimate the accumulation of these abnormal risks over their lives (while people generally underestimate the chance that any one risk, assumed to be normal-tailed, can many times be fat-tailed).
Russian roulette
We’ll discuss many examples in this article, to develop the appropriate probability theory and concepts. The first comes from the lethal game of Russian roulette, except here we have a modified version of the game in which the player is unaware of the number of rounds (say 0, 1, or 2) loaded into the 6-chamber revolver. After a few safe attempts, where the trigger was pulled without incident, can we always assert that future pulls of the trigger will mimic the past? The few historical observations we would have in this case would, after all, be consistent with the argument that no bullets will be fired in the future. Instead, it should be obvious in this example that it would not be insanity to think that the next pull of the trigger could be different. In real life we often witness just a few historical observations of an event, lack the appropriate context, and still prematurely conclude with extraordinary confidence that it would be insane to think next time is different. Yet the next time could be very different (and could also arrive with irreversible consequences!)
Since this is a probability-focused site, it should be noted that the model for the expected value of Russian roulette depends on whether the chamber is spun between attempts of the game (“with repetition” in probability parlance) or not spun (“without repetition”). Without spinning, and assuming a single bullet was loaded into the chamber, the expected number of attempts before the bullet is discharged is clearly 3.5 (the mean of the six equally likely chamber positions, 1 through 6). The 20th century civil rights leader Malcolm X wrote in his autobiography that he once performed this stunt to show his criminal peers that he was ready to die, by pulling the trigger of a pistol against his head a few times.
The “with repetition” framework is similar to a simple transition-matrix model that we often see in biological and economic systems. We can see this analysis here for an example with fund managers wanting to remain in a certain quartile of performance. For our simplified two-state Russian roulette, we have a special form of a Pascal probability model (named after the dazzling 17th century French mathematician). When the number of possible “failures” is a fixed integer, this becomes a negative binomial distribution. And when we only care about the game ending with one of these failures (a discharged round to the head), the probability model further collapses to a geometric distribution.
p(k) = (1-p)^(k-1) · p

where k is the number of attempts (k = 1, 2, 3, …), and p is the probability of failure on any one attempt (here 1/6).
Now for the typical number of attempts in this “with repetition” example, we have this calculus:

E(K) = sum_{k=1→∞} k(1-p)^(k-1) p
= p · sum_{k=1→∞} k(1-p)^(k-1)
= -p · sum_{k=1→∞} d/dp (1-p)^k
= -p · d/dp sum_{k=0→∞} (1-p)^k     (the k=0 term is constant, so its derivative is 0)
= -p · d/dp [1/(1-(1-p))]
= -p · d/dp [p^(-1)]
= -p · (-p^(-2))
= p^(-1)
= 6
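As a numerical check on this closed form, a short simulation (a sketch of my own, not from the original derivation) estimates the expected number of attempts in the "with repetition" game:

```python
import random

def attempts_until_fire(p=1/6, rng=random):
    """Simulate 'with repetition' Russian roulette: the chamber is spun
    before every pull, so each pull fires independently with probability p."""
    k = 1
    while rng.random() >= p:  # survive this pull with probability 1-p
        k += 1
    return k

random.seed(42)
trials = [attempts_until_fire() for _ in range(100_000)]
mean_attempts = sum(trials) / len(trials)
print(round(mean_attempts, 2))  # close to the theoretical 1/p = 6
```

With 100,000 draws the sampling error on the mean is only about 0.02, so the estimate sits tightly around 6.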
Of course this distribution is memoryless, similar to the property of exponential decay models. In other words, after 6 successful attempts, the expected number of further attempts before death is neither suddenly never (expecting this time it won’t be different) nor suddenly imminent. It continues to remain 6, as though the previous attempts never happened.
So conditional on a constant y prior number of successes, the typical total number of attempts is:

E(K | K>y) = sum_{k=y+1→∞} k(1-p)^(k-y-1) p
= sum_{k=y+1→∞} (k-y)(1-p)^(k-y-1) p + y · sum_{k=y+1→∞} (1-p)^(k-y-1) p
= sum_{j=1→∞} j(1-p)^(j-1) p + y · 1     (substituting j = k-y; the second sum totals a full probability distribution)
= E(K) + y
= p^(-1) + y
= 6 + y
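The memoryless identity E(K | K>y) = 6 + y can likewise be verified numerically; this is a sketch under the same p = 1/6 assumption, conditioning on y = 6 prior safe pulls:

```python
import random

random.seed(7)
p = 1/6

def attempts():
    # number of pulls until the gun fires, spinning between pulls
    k = 1
    while random.random() >= p:
        k += 1
    return k

samples = [attempts() for _ in range(200_000)]

y = 6  # condition on surviving 6 prior pulls
survivors = [k for k in samples if k > y]
conditional_mean = sum(survivors) / len(survivors)
print(round(conditional_mean, 1))  # close to 6 + y = 12
```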
Traditional annuity-due
Next, let’s shift to the risks associated with a compounding growth vehicle; say a lump-sum payment or an annuity-due (payment at the start of each period). Say that we have scenarios where one has $32 at risk, over 32 years of their lifetime. There are a couple of different ways this can occur. The first is similar to someone having been gifted $32 at the start of their career, and then investing those proceeds over their subsequent 32-year career. In this first way, the ending risk associated with the growth of this payment would be normally distributed with a standard deviation equal to σ√t, where σ is the independent, annual standard deviation of the returns, and t equals 32 years. The assumption of independent and identically distributed (i.i.d.) returns is important here, since independence implies zero covariance (the converse does not hold, as we can see with Y=X² for a symmetric X: it is uncorrelated with X yet clearly dependent). So now the variance and standard deviation of the sum of 2 i.i.d. variables, each with standard deviation σ, is:
Variance(X1+X2)
= Covariance(X1,X1) + Covariance(X1,X2) + Covariance(X2,X1) + Covariance(X2,X2)
= Variance(X1) + 2Covariance(X1,X2) + Variance(X2)
= 2Variance(X1) + 2·0
= 2σ²

Standard deviation(X1+X2)
= √2 · √Variance(X1)
= √2 · σ
With t i.i.d. variables, the covariance matrix above (where t=2) simply generalizes to σ√t, since all of the covariances between the different pairs of variables equal 0. In these cases, it’s also easy to see how to cut this overall volatility risk in half:

½(σ√t)
= (½σ)√t
= σ√(¼t)
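The two halving routes are algebraically identical, which a few lines of arithmetic confirm (σ = 40% and t = 32 here are just this article's illustrative values):

```python
import math

sigma, t = 0.40, 32

full_risk    = sigma * math.sqrt(t)        # σ√t
half_sigma   = (sigma / 2) * math.sqrt(t)  # (½σ)√t: halve the volatility
quarter_time = sigma * math.sqrt(t / 4)    # σ√(¼t): quarter the horizon

# both routes land on exactly half the original volatility risk
assert abs(half_sigma - quarter_time) < 1e-12
assert abs(half_sigma - full_risk / 2) < 1e-12
print(round(full_risk, 4))  # ≈ 2.2627
```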
In other words, the $32 lump sum could be invested at half the annual risk level (clearly the expected return of this growth would also be cut meaningfully), or the time horizon would need to come down from 32 years to ¼·32, or just 8 years. Now most people would not be interested in cutting their risk exposure in half by investing only in the first 8 years of their career, followed by not investing for the remaining 24 years. We are also making the assumption here that one lives for the entire 32 years, a problem that is undoubtedly more complicated, as we’ll show, when this assumption is false.
The second way one could have $32 at risk over 32 years is by accumulating this level steadily during that time: say $1 at time 0, $1 at time 1, another $1 at time 2, and so on. This is a more realistic view of how one accumulates risk exposure over a career. In this case there is no simple risk model that can mathematically show, in closed form, what the final risk distribution should be. But we will see and rationalize that this resulting, more practical risk exposure is not normally distributed. See the illustration below, where we generate hundreds of simulations to demonstrate the various distributions of outcomes.
In the illustration above we see (in blue) the outcome distribution for the first, lump-sum example, using both a 10% annual σ and a 40% annual σ. And (in red) we see the outcome distribution for the second example, the annuity-due, again using 10% σ and 40% σ.

Let’s now review the summary statistics for all 4 distributions above. The first 2 distributions are normally distributed (so lacking skew, and with normal tails). The fourth distribution (40% annuity) is platykurtic, with a visibly tighter distribution than the fatter tails associated with the normal distribution (as seen in the 40% lump-sum distribution above). The third distribution (10% annuity) follows the same distribution as the 40% annuity, but interestingly, given the lower σ, it is exceptionally difficult to distinguish its risk profile from a normal distribution (e.g., see the overlap with the 10% lump-sum distribution).
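The contrast between the lump-sum and annuity-due families can be reproduced with a small Monte Carlo. The sketch below is my own minimal model of the setup described (simple i.i.d. gross returns, 2,000 paths): a $32 lump sum compounding for 32 years versus 32 annual $1 contributions each compounding from its own start date.

```python
import random
import statistics

random.seed(1)
T, sigma, sims = 32, 0.40, 2000

lump, annuity = [], []
for _ in range(sims):
    returns = [random.gauss(0.0, sigma) for _ in range(T)]
    # lump sum: $32 compounds through all T years
    w_lump = 32.0
    for r in returns:
        w_lump *= (1.0 + r)
    # annuity-due: $1 added at the start of each year, then compounded
    w_ann = 0.0
    for r in returns:
        w_ann = (w_ann + 1.0) * (1.0 + r)
    lump.append(w_lump)
    annuity.append(w_ann)

# later contributions are exposed for fewer years, so the annuity's
# terminal dispersion is far tighter than the lump sum's
print(statistics.stdev(annuity) < statistics.stdev(lump))  # True
```

This matches the picture above: the red (annuity) outcome distributions sit visibly tighter than the blue (lump-sum) ones at the same σ.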
So the risk distribution for this second, more realistic way of being exposed to $32 of risk over one’s career can be modeled as a normal distribution in some annuity cases! Notably, when there is a steady flow of risk accumulation that is subject to very low risk. We’ll see that theme throughout this article, and it has implications for answering the same question of how to reduce one’s risk, where only in this case can we borrow some of the same risk formulas shown in the lump-sum section above (albeit the definition of σ would be different).
In both of these cases, we can better understand how the answer to the question of how much portfolio risk to carry at a given age will vary from the simple rule of thumb of investing 100 minus your age in risk (or zero if one is fortunate enough to be a centenarian). So if one is 20 years old, this would be 100-20, or 80%, of their portfolio in risk. And 32 years later, at age 52, this would be 100-52, or 48%, of their portfolio in risk. One need not take these proportions of risk and career length too literally (obviously everyone’s personal situation and the market offerings at any given time would influence these choices), but rather think of this framework as a directional impression for how one might think about risk over time under the different approaches, especially for any lower-σ volatility scenario. One should be able to consider what possible outcome someone is protecting himself or herself from that leads to a noticeably linear risk model, and such a lower risk profile in most of the early years (e.g., in 2/3 of the initial years one is underinvested, and in 1/3 of the final years one is overinvested). Incidentally, this latter point of being overinvested in the final several years also applies to most forms of the recently in-vogue target-date or “lifestyle” investing.
And consider what this risk means: if the growth coefficient of variation (articles on this coefficient here, here, here) is, say, 2, then the recovery from a “risk” in the later years of one’s career would be made up relatively quickly (e.g., in about 1/6 of the final years instead of 1/3). This would rotate the figurative blue distribution of risk upwards, as if pivoting the entire blue curved line from the initial value of 100% in a counter-clockwise fashion, such that the blue line currently positioned at (A) comes up to about where (A’) currently is. Mortality and valuations would both work to adjust this (A’) level only slightly.
Increasing or decreasing annuities
Other serious mathematical entanglements ensue in the real world (and we’ll intrepidly tackle them here), where there are somewhat ordered changes in the annuity progression over time. For example, the level of the annuity can increase or decrease in a step-wise function, making modeling from past observations generally more difficult. The motivations are clear, as one can have an escalating amount of risk during much of one’s life, and other than through open-form simulation there is no way to model this. One can also have declining balances, where the amount of risk reduces over time; for example, if one has a mortgage paid off over a career, or a retirement nest egg drawn down on a fixed schedule. Both of these ideas for changing annuity structures, with or without a fixed amount of overall variability, have been written about by this author in a top annual publication of the Society of Actuaries (Dollar Cost Averaging Risk, p.17). Equity valuations on individual companies operate with similarly unfitting approximations, as an income stream is brought to present value with a constant discount rate, and then projected into the future against various risk scenarios.
Additionally, as with the traditional annuity-due example, the overall risk profile does not always grow in a smooth convex manner, as it does in the lump-sum example. The rationale is that the amount of risk grows with time, but the balance behind that risk can be marginalized on an annual basis. These forces can oppose one another and create a maximum value in the middle of the life cycle.
Even here, we revert to the common themes of this article: in a few closed-form situations we can model low-risk-relative-to-absolute-return experiences differently from most risk events. Another common lesson is that the amount of risk one can take on during a career or life cycle is generally higher (though not so much as to employ leverage, where other risk/return considerations and costs come into play), unless there are truly abnormal, asymmetric risks so severe (e.g., Russian roulette) that “different” (to be argued later) is something that cannot be “naturally” recovered from. This is unlike situations such as financial market or economic shocks, where excess leverage or risk in one’s career is not necessary; instead one needs to stay focused on the correct level of risk to take on, conditional on the context of happening to be in a high-return, low-risk period.
Advanced random walk
Now let’s look at modeling the uncertainty associated with a lifetime, interest-compounding stream of income, so that we can answer important questions about the level of risk we should be exposed to through our lives. To do this, we must consider two critical aspects of this stream of future random events. The first is when repetition generally shows similar results, and the second is when repetition fails and shows dissimilar results. This is not being insane, but rather a prudent process within risk management. The results here provide supplemental understanding to our previous discussion of why one should not have an age-based rule for how much of a portfolio should be in risk during a career.
Having a better understanding of what factors would influence a typical decision on portfolio risk, averaging across all equal-aged people and across all economic cycles, poses difficult questions. Yet this is ultimately more fundamental than the popular media and financial advisor debates hanging off the sides, such as ignoring these probability models and simply advising citizens based upon an inconsistent set of quick rules that confusingly mix in a guess on current asset valuation levels.
The point of this article is that we should always be thinking differently about this sort of risk modeling framework, given the types of probability ideas that can be expanded upon, as we are learning to do here. It is instead insanity to not think something different can occur, and to not independently think about life choices reflecting unknown unknowns. This includes the insanity of recycling awkward spiritual advice from counselors.
Helping fuel this crazed drive, for thinking about how much risk to take on, is the need to fill a void where no closed-form theoretical actuarial models can assist in looking at two important tweaks to this problem. One is incorporating a probability of death at a random time between when one starts a career and when one retires; speaking of Russian roulette odds, most seem to ignore that, per some mortality tables, a 20-year-old has about a 1-in-5 chance of dying before his or her 65th birthday. The other is the ability to incorporate high real income growth during one’s career, for those to whom this applies. Of course there are numerous unique considerations one needs to think about for one’s personal risk modeling as well, such as family structure.
It’s therefore easy, but sloppy, to discard some of these and try open-form modeling to derive what our best decisions should be. The issue here is that we too often allow our computer skills to conflate with random creative biases in order to come up with a risk decision. We’ve seen this in other examples, from public policy to, unfortunately, even academia. The focus should instead be on thinking through the problems, as we do with the Monty Hall problem in Chapter 2 of Statistics Topics.
Some random walk processes that assist in modeling assume a random interest rate (this could be reversed to look at variations in risky asset returns) and imply that the logarithmic function applied to the returns is normally distributed [e.g., ln(1+I_t) - ln(1+I_{t-1}) ~ ε_t]. Of course this helps with modeling, since the addition or subtraction of two i.i.d. normal distributions is also normally distributed. And as we showed in the initial annuity-due examples, this probability assumption can be repeated numerous times, to reflect the length t of a lifetime process.
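The convenience of the log transform is that multi-period compounding becomes a sum of i.i.d. normal terms; a brief sketch (the 10% ε volatility and seed are illustrative assumptions, not from the article):

```python
import math
import random

random.seed(3)
T = 32
eps = [random.gauss(0.0, 0.10) for _ in range(T)]  # ε_t ~ N(0, 10%)

# log model: the ln growth terms accumulate additively...
log_growth = sum(eps)

# ...which equals the log of the product of the gross growth factors
gross = 1.0
for e in eps:
    gross *= math.exp(e)

assert abs(math.log(gross) - log_growth) < 1e-9
# the T-period log growth is N(0, σ√T); a 5-sigma sanity bound
assert abs(log_growth) < 5 * 0.10 * math.sqrt(T)
```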
Other random walk processes assume that the combination of two normal distributions through multiplication (and we’ll also show this here through division) can also result in an approximately normal distribution (e.g., ln I_t - ln I_{t-1} ~ ε_t). This is akin to how a continuous return and a discrete arithmetic return are very similar when the return is near zero. These other forms of normal distribution combinations provide a greater number of closed-form modeling functions that can be performed to better understand the resulting distribution.
Even though a variation of this unconventional version immediately above is noted at the end of the popular Bowers’ Actuarial Mathematics book, it has limitations that we’ll see. Typically we should not be able to multiply or divide two normal distributions together and have the result be a normal distribution. If both distributions are centered about zero, then we’ll see here that one popular exception applies (for the case of multiplication, of course!) One of the easier normality tests to apply, to see the range-bound nature of other exceptions, is the Jarque-Bera test, named after two 20th century economists wrestling with how to approximate the limits of higher-order moments relative to those of a normal distribution.
The formula is simpler to use than Wilks’ test, and the statistic approaches a χ² distribution with 2 degrees of freedom (so heavily right-skewed):

JB = (n-k+1)/6 · [Skew² + ¼(Kurtosis-3)²]
One can examine articles here on the higher-order moments, such as skew and kurtosis. A search of this blog, for example, shows this great reference on the theory of the small-sample approximation of the normal distribution, the Student’s t, named after a fledgling 20th century statistician who was barred by his employer (the Guinness brewery in Dublin) from publishing under his real name. And, unquestionably, at this point in the article you might be pouring yourself a well-earned glass of Guinness. But this is where we’ll show some important mathematical properties you won’t see elsewhere about the theoretical relationship between the normal trajectory of lump-sum risk and the application of the Jarque-Bera test for the normal distribution. This distribution, having a skew of 0 and a kurtosis of 3 (excess kurtosis of 0), results in a Jarque-Bera lower bound of 0. We can assume the k degrees-of-freedom adjustment is minimal for the simulation:

JB ~ n/6 · [0² + ¼(0)²]
~ n/6 · [0 + 0]
~ 0
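Jarque-Bera is easy to compute by hand from sample moments. Below is a stdlib-only sketch (ignoring the small k adjustment) comparing a normal sample against a fat-tailed Laplace sample; the choice of Laplace as the fat-tailed comparison is mine:

```python
import random

def jarque_bera(xs):
    """JB = n/6 * [skew^2 + (kurtosis-3)^2 / 4], from sample moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)

random.seed(5)
normal_sample = [random.gauss(0, 1) for _ in range(5000)]
# fat-tailed sample: Laplace, via a difference of exponentials (excess kurtosis 3)
fat_sample = [random.expovariate(1) - random.expovariate(1) for _ in range(5000)]

# the normal sample's JB hugs its χ²(2) reference; the Laplace's explodes
print(jarque_bera(normal_sample) < jarque_bera(fat_sample))  # True
```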
To understand how high the absolute level of the expected return needs to be (in relation to the underlying standard deviation) for normally distributed returns, multiplied or divided, to also yield a normal distribution, we run a simulation with a high n per graphic data point, in the thousands. The result is that, within the range shown, the return needs to be nearly twice the value of the standard deviation. See position (C) on the illustration below. This is not typically the case, though the convex function does reflect that for very high or very low reward-to-risk ratios, one can be more certain of the forward risk distribution of one’s outcomes. See position (B) on the illustration below for a representative area of a high risk-to-absolute-reward ratio.
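The normality breakdown when multiplying two normals can be seen with the same Jarque-Bera machinery. This sketch (my own, with illustrative parameters) compares a zero-mean product, position (B) territory, against a product where the mean is twice the standard deviation, toward position (C):

```python
import random

def jb(xs):
    # Jarque-Bera statistic from sample skew and kurtosis
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    return n / 6 * ((m3 / m2 ** 1.5) ** 2 + (m4 / m2 ** 2 - 3) ** 2 / 4)

random.seed(11)
n = 5000
# reward/risk = 0: the product of two zero-mean normals is sharply non-normal
low_ratio  = [random.gauss(0, 1) * random.gauss(0, 1) for _ in range(n)]
# reward/risk = 2: the product is far closer to normal (much smaller JB)
high_ratio = [random.gauss(2, 1) * random.gauss(2, 1) for _ in range(n)]

print(jb(high_ratio) < jb(low_ratio))  # True
```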
Now we’ve covered a variety of applications of lump-sum and annuity math, and we see that even here there are outcome differences not subject to quick and lazy “rules of thumb.” For example, the lump-sum model originally looked like a good reflection of summing normal distributions, and here it is still good, but for the random walk we are geometrically multiplying or dividing risks. And annuity-dues were bad applications for the normal distribution, but here they can be fine in either of two situations: first, where we have fixed risk (see the increasing or decreasing annuities), or second, in the advanced random walk example, only when there is a high return/risk ratio. This is a lot to keep track of initially! But it shows again the beautiful and complicated nature of nature, and how less polished people would unknowingly sweep over these details with detrimental guidance.
What we further see from the simulations illustrated above is that the closer these reward-to-risk ratios are to 0, the more uncertain we are in describing the future risk distribution of lifetime performance outcomes. The dark blue dots, which we often see in the addition and subtraction diagram, reflect a Jarque-Bera of <3, or a mostly uniform degree of normality in the resulting distribution, as is theoretically the case. The green dots, on the other hand, are for a Jarque-Bera of between 3 and 10. The remaining space is the most non-normal range of dots; those are in red.
Staying with Jarque-Bera for our article, let’s thoroughly explore the important differences and similarities of Jarque-Bera versus other normality tests. The key for the approximation is that Jarque-Bera uses only 2 higher-order statistics, which may be best for small data sets, but it still does not consume any other information concerning the shape of the distribution beyond those two moments. Note that we already have Chebyshev’s inequality (named after the 19th century Russian mathematician Пафну́тий Льво́вич Чебышёв) for a lower-order test using variance (or even σ). Other, broader tests include Wilks’. But again, a larger dataset would be needed to compensate for the non-parametric nature of that test.
We should also note that in a number of other blog articles here we have explored in depth other important abnormal estimations and manipulations. Convolution theory is one of those stimulating applications, particularly in risk modeling. These examples (here, here, here, here) are generally leptokurtic, though that is a lesser point when dealing with the development of risk where growth factors are involved, as they are with most of our annuity and lump-sum examples.
As we noted before with the annuity example, open-form simulations and advanced analytics are not always best. They don’t train one’s brain to think outside the box. This is another form of looking at something incorrect over and over, and rationalizing it as correct. One should instead strive to see things differently if one hopes to ever be prepared for the “unexpected.” During the recent financial crisis we saw examples of how people were caught off-guard by homogeneously thinking with the herd, and then seeing risk unfold for the second time in a decade. That got Lehman Brothers’ Fuld into trouble (before we rescued a number of similar institutions with TARP) in ways he still doesn’t know (here, here). Having more nuanced approaches allows one to better think “around the corner” and, similar to tending a bonsai planting, be cautious in applying just the right amount of modeling attention where needed (not too much and not too little).
Creative chaos
A fifth and final example is where we have an annuity type of model that adjusts (e.g., increasing or decreasing) and where the risk disturbances over time are also changing in scale. Such cases lead to extreme models with abnormal risks that must be modeled through a Monte Carlo type of simulation (a probability model like some of the earliest probability applications, invented for the purpose of considering schemes used in the southern France gambling halls). On a related note, one can enjoy this sophisticated gambler’s ruin answer here or here. However, we can imagine that normality may still exist on the fringes if the volatility is kept low! Think about that, since it isn’t intuitive and yet it’s a powerful insight to consider. One can think about the distributions in life from accumulated fallen tree-leaf patterns, or the distribution of passengers among train cars, or how much risk is encumbered in a small amount of space.
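As a side note on the gambler's ruin reference, the classic fair-game result — starting with a units and quitting at 0 or N, the chance of reaching N is a/N — can be checked with a quick simulation (the starting stake and goal here are hypothetical values chosen for illustration):

```python
import random

def reaches_goal(start, goal, rng):
    """Fair coin random walk: +1 or -1 each step, stopping at 0 or goal."""
    w = start
    while 0 < w < goal:
        w += 1 if rng.random() < 0.5 else -1
    return w == goal

rng = random.Random(2)
start, goal, trials = 3, 10, 20_000
wins = sum(reaches_goal(start, goal, rng) for _ in range(trials))
print(round(wins / trials, 2))  # near the closed form start/goal = 0.3
```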
Similar to how we see above the chaotic assembly in which the leaves were generated and then have fallen all over (though mostly nearer the middle), the philosophical claim that in life we’ll never see something that is different this time is neither always true nor always false. On the contrary, our life experiences will always depend on knowing the context we personally come across, since life patterns always unfold in ways perhaps different from the last. And to parents’ delight everywhere, some of this understanding also only comes with age.
The meaning of “different” in expecting something different is also a source of vagary. Risk is different from non-risk, but it’s also different from other risks we’ve seen in the past. Risks do not always unfold in predictable ways (in terms of frequency, severity, distribution shape, or the way the trend develops). One can see Chapter 4 of Statistics Topics, concerning stochastic modeling.
Surely there will be many times when future repetitions continue to produce generally similar results. But we should also always expect that unknown, non-normal risks will occur, which are more difficult to properly model and protect against. The future will not simply always even itself out. To be clear, the location of (B) and (C) in the diagram is where we need to understand where we currently are in the financial markets. It is not insane, therefore, to expect different results based on context, while being cautious toward those who claim to offer such well risk-adjusted opportunities.
So it may only be insanity, after all, to be either overly fearful or overly confident that life occurrences will be different “next time” in some sort of predictable way!
Lastly, in other news: there are many updates worth noting. The market call made in our last article, and published in a WSJ network’s MarketWatch column, came true to the day, with what we noted would be a 1.3% S&P drop to 2103. And the article before that, on the “big data” numbers behind facial recognition, was featured by many news outlets, helping catapult my face(s) recently to the top of the viral #HowOldRobot. Additionally, I have just joined the academic council of FutureAdvisor, alongside great economists and a former endowment CIO. Also, in addition to board advisory work on both coasts, I have just announced that all proceeds from my bestselling book Statistics Topics have been donated to the American Statistical Association. We will continue working to bring first-class and free (even of 3rd-party commercials) probability and statistics education to anyone who wants to benefit from it. Lastly, this blog has reached a new readership milestone to be proud of, crossing >1/3 million reads on the site, and >1 million reads across all sites. Below we can again enjoy some of the current top-decile articles written on this blog that received notable attention from leading professionals around the globe. Thanks much for the support; what a ride!