
Tuesday, July 29, 2014

Convolution theory

When examining numerical outcomes that are built up from multiple ordinal or nominal results, convolution theory is an important probability framework to understand.  Say that we are running a biological experiment where each individual treatment result can be assigned a random integer score from 1 through 4, where 1 is directionally the most favorable score.  The results are scaled against historical norms such that, in each random experiment, each integer score has an equal 25% chance of occurring.

One now repeats this experiment three times (either all for one individual, or once in each of three nearly identical individuals), and then computes the average of those three scores.  The probability of getting an average score of 1 is fairly easy to compute: (25%)³, or about 1.6%.  But what about the probability of getting an average score of 2?  Would that probability be similar to the one for an average score of 1 (i.e., (25%)³), or perhaps a few times that amount (e.g., 3 or 4 times (25%)³)?  The answer is that it is actually 10 times more likely to have an average score of 2 than it is to have an average score of 1.  This higher probability is the topic of this article, and it has important applications, spanning from bioscience experiments to investment performance.

This is also similar to a probability question assigned to Rutgers’ graduate biostatistics and business degree students earlier this year (here).  There we used the analogy of consecutively tossing a throw-back tetrahedron-shaped die, popular in classic Dungeons & Dragons games.  The die could look like an equilateral pyramid, or a barrel.  For this shape, there are 64 permutations possible from three die rolls, where repetition is allowed:

(₄P₁)³ = [4!/(4−1)!]³ = 4³ = 64

For more notes on probability terms, such as “permutation”, please use the search bar on this web log to find other articles on those topics.  So there is only one way to get an average of 1 from three die rolls: rolling a 1 on each of the three trials (i.e., 1-1-1).  That is one permutation of the 64 permutations we said were possible.  Also, there are three ways to get an average of 1.3, since there are 3 permutations where one attains two die rolls of 1 along with one die roll of 2 (e.g., 1-1-2, 1-2-1, and 2-1-1).

But how many ways are there to get an average roll of 2?  Of course we have the clear permutation of straight 2 rolls (2-2-2).  But we can also get the same average from the three permutations of two die rolls of 1 combined with one die roll of 4.  And there are six permutations with one die roll of 1, another of 2, and a third of 3 (e.g., 1-2-3, 1-3-2, 2-1-3, 2-3-1, 3-1-2, and 3-2-1).  This totals 10 (1+3+6) permutations, again out of the 64, or a probability of about 16%.
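To make the counting concrete, here is a minimal brute-force sketch (our own, not from the original post) that enumerates all 64 ordered outcomes and confirms the counts of 1, 3, and 10.  The helper name count_with_average is simply an illustrative choice.

```python
# A minimal sketch: brute-force check of the permutation counts above,
# using only Python's standard library.
from itertools import product
from fractions import Fraction

SIDES = (1, 2, 3, 4)                       # the four equally likely scores
rolls = list(product(SIDES, repeat=3))     # all 4**3 = 64 ordered outcomes

def count_with_average(target):
    """Count ordered three-roll outcomes whose mean equals `target`."""
    return sum(1 for r in rolls if Fraction(sum(r), 3) == target)

print(len(rolls))                          # 64
print(count_with_average(Fraction(1)))     # 1  -> only 1-1-1
print(count_with_average(Fraction(4, 3)))  # 3  -> 1-1-2 in any order
print(count_with_average(Fraction(2)))     # 10 -> 1-1-4, 1-2-3, 2-2-2 patterns
```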

For reference, we can show the mathematics here for solving the entire probability distribution.  If this is not of interest, then one can skip over the frequency distribution below and continue reading just after it.

Allocation of the 64 permutations across the different possible averages from three 4-sided die rolls:
1:       1, which is {1,1,1}
1.3:     3, the Perms are {1,1,2}, {1,2,1}, {2,1,1}
1.7:     6, which is 3 Perms of {1,2,2} and 3 Perms of {1,1,3}
2:       10, which is 3 Perms of {1,1,4}, 6 Perms of {1,2,3}, and {2,2,2}
2.3:     12 Perms each for avg = 2.3 and 2.7, computed as [64 − 2×(1+3+6+10)]/2 from the other averages
2.7:     12, due to symmetry with avg = 2.3 above
3:       10, due to symmetry with avg = 2 above
3.3:     6, which is 3 Perms of {3,3,4} and 3 Perms of {2,4,4}
3.7:     3, the Perms are {4,4,3}, {4,3,4}, {3,4,4}
4:       1, which is {4,4,4}
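The same table can be produced directly from the convolution view that gives this article its name: the distribution of the sum of three scores is the single-die distribution convolved with itself three times.  The sketch below is our own illustration of that idea (the function and variable names are ours), not code from the original post.

```python
# Build the three-dice sum distribution by repeated convolution of the
# single-die distribution (value -> number of ways).
from collections import Counter
from fractions import Fraction

def convolve(dist_a, dist_b):
    """Convolve two integer-valued count distributions (dicts of value -> count)."""
    out = Counter()
    for a, ca in dist_a.items():
        for b, cb in dist_b.items():
            out[a + b] += ca * cb
    return out

one_die = {s: 1 for s in (1, 2, 3, 4)}     # 1 way to show each face

three_dice = one_die
for _ in range(2):                          # convolve twice more -> sum of 3 dice
    three_dice = convolve(three_dice, one_die)

for total in sorted(three_dice):
    avg = Fraction(total, 3)
    print(f"avg {float(avg):.1f}: {three_dice[total]} of 64 permutations")
```

Summing the counts through an average of 2 gives 1 + 3 + 6 + 10 = 20 of 64, or roughly 31%, which is the cumulative figure used below in the comparison with the binomial curve.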

We can visualize the results above in the illustration below.  The red discrete bars represent the probability distribution for the experiment's “average” result, shown along the x-axis.  The green curve is a reference binomial approximation to the normal/Gaussian distribution.  In general, as the sample size increases (e.g., the number of trials or repetitions), the binomial distribution's moment properties approach those of the normal; for example, excess kurtosis (or fat tails) disappears.

[Chart: probability distribution of the average score from three 4-sided die rolls, shown as red discrete bars, with a green binomial/normal reference curve]
We should notice that the convolution model (in red) has heavier tails than the binomial approximation (in green).  At an average score of exactly 2, the probability under either theoretical model is roughly the same.  However, there is a 6 percentage-point greater probability of attaining an average score of 2 or better if one uses the convolution theory approach (31% versus 25%).

The idea here, that there are multiple ways of attaining a favorable score of 1, or even 2, has been discussed previously on this web log, in contexts ranging from relevant experiments by Nobel Laureate biologists to investment risk (here, here).  We discuss in those previous links that convolution ideas do not require the simplicity of a fixed frequency count of experiments or trials, and hence they are powerful probability measures for us to work with in a range of risk applications.

Before connecting these ideas for those interested in evaluating the nuances of investment risk and return performance, let’s continue to look at the biological system we started above, but now consider results from a broader range of models: a single toss of the 4-sided die, and also five tosses of the same 4-sided die.  For illustrative ease, we only show the top half of the distribution for the 5-toss model.

[Chart: probability distributions of the average score for one toss and for five tosses of the 4-sided die (top half shown for the 5-toss model)]
We notice that the probabilities associated with an average of exactly 2, or an average of exactly 1, both decrease with additional die tosses.  This is partly due to the additional discrete outcomes, which now must share some of the total 100% probability allocation.  But what is most important to see here is that as we have more die tosses, the probability of attaining a perfect average of 1 decreases, while the probability of attaining an average score of just 2 or better increasingly exceeds that from the binomial distribution model.  Additionally, the excess kurtosis rapidly decreases for the binomial model, while it decreases less quickly under our convolution approach.
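The same convolution machinery extends directly to any number of tosses.  The sketch below (again our own illustration, with illustrative names) reports, for n = 1, 3, and 5 tosses, the probability of an average of exactly 1 and of an average of 2 or better.

```python
# Extend the convolution idea to n = 1, 3, 5 tosses of a 4-sided die.
from collections import Counter

def sum_distribution(n, sides=(1, 2, 3, 4)):
    """Counts of each possible total from n independent tosses of one die."""
    dist = Counter({0: 1})
    for _ in range(n):
        nxt = Counter()
        for total, count in dist.items():
            for s in sides:
                nxt[total + s] += count
        dist = nxt
    return dist

for n in (1, 3, 5):
    dist = sum_distribution(n)
    all_outcomes = 4 ** n
    p_avg_1 = dist[n] / all_outcomes                                  # average exactly 1
    p_avg_le_2 = sum(c for t, c in dist.items() if t <= 2 * n) / all_outcomes
    print(f"n={n}: P(avg=1)={p_avg_1:.4%}, P(avg<=2)={p_avg_le_2:.2%}")
```

Under these assumptions the printed probabilities fall from roughly 25% and 50% at one toss, to about 1.6% and 31% at three tosses, and to about 0.098% and 22% at five tosses, consistent with the figures quoted in the fund-manager discussion below.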

We know from the prior investment web log article (which was also recently cited in The New York Times) that the probability for a mainstream fund manager to perform in the top quartile in each of 5 years is 0.098% ((25%)⁵).  For fund managers to be in the top half in each of 5 years, the probability is 3% ((50%)⁵).  While the extreme top-quartile metric of 0.098% exactly matches what we get from convolution theory (see the chart label above, at the 1.0 average score), the probability for the top-half metric instead falls short by a wide margin (10% versus 3%).  And even looking at an average score of just 2 or better, we have the same 7 percentage-point spread between the convolution approach and the binomial approximation (22% versus 15%).

This enhanced probability occurs because, with convolution properties, there are many more ways to attain the desired average score of 2 (other than simply scoring a 1 or 2 in each of the 5 years).  This implies that an investor looking to select a fund manager who typically outperforms, defined as being a 1st or 2nd quartile performer in a long-run average sense (as opposed to in a consistent, every-year sense), needs to understand this nuance.  And the result is that, for similar risk, we should focus our probability attention instead on one's average performance.
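To see the contrast from the investor's perspective, here is a rough Monte Carlo sketch (our own illustration, not from the cited article), under the idealized assumption that a manager's annual quartile rank is uniform on 1 through 4 and independent across years.

```python
# Contrast "top-half every single year" with "top-half on average" over 5 years,
# assuming each year's quartile rank is an independent uniform draw from 1..4.
import random

random.seed(0)
TRIALS = 200_000
YEARS = 5

every_year_top_half = 0
top_half_on_average = 0
for _ in range(TRIALS):
    ranks = [random.randint(1, 4) for _ in range(YEARS)]
    if all(r <= 2 for r in ranks):
        every_year_top_half += 1
    if sum(ranks) / YEARS <= 2:
        top_half_on_average += 1

print(f"top-half every year: {every_year_top_half / TRIALS:.1%}")   # about 3%
print(f"top-half on average: {top_half_on_average / TRIALS:.1%}")   # about 22%
```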

As a final note, consider the continuity-correction probability of having an average score residing strictly between an average of 2 and an average of 3.  This middle ground also includes the true 50th percentile average.  The ever-larger probability we can see for 1, 3, and 5 die tosses is 0%, 38%, and a majority 56%, respectively.
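For those who want to check these middle-ground figures, here is a short brute-force sketch (again our own illustration); full enumeration remains cheap for up to five 4-sided dice.

```python
# Probability that the average of n tosses lies strictly between 2 and 3.
from itertools import product

for n in (1, 3, 5):
    outcomes = list(product((1, 2, 3, 4), repeat=n))
    between = sum(1 for r in outcomes if 2 < sum(r) / n < 3)
    print(f"n={n}: P(2 < avg < 3) = {between / len(outcomes):.1%}")
# prints 0.0%, 37.5%, and 56.6% under these assumptions
```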
