
Saturday, February 6, 2016

Kevin Bacon’s 4.5 degrees of separation

Short-term update: among shares by CEOs in various fields, this was the 2nd share in the past few days by an expert contributor to The Wall Street Journal (this time a distinguished technology executive).  Feel free to read and share this analysis with anyone interested in better understanding the drivers behind their own, or their company's, global exposure.

Are we just 3.5 degrees of separation away from one another?  That’s what facebook would have you believe, as their blog shows paid “research” modeled against their “1.6 billion users”.  In reality, this claim is inflated and relies on some sleight of hand, just as other claims from them and other technology giants have been (here, here).  Not everyone on earth is a facebook user (it is completely blocked in China), and not every facebook account belongs to a real person whose online network mirrors the people they actually know in the physical world (which is why there is a block feature!).  We see examples both of celebrities with many thousands of “friends” who are really followers, and of celebrities with only a handful of friends because they are not really facebook users at all.  As we’ll quickly show in our analysis here, only if you have, say, 360 facebook friends, and they each also have 360 friends (none of whom know one another), and so forth, can you arrive at the 3.6 degrees of separation that facebook says is the average among their 1.6 billion users.  But it’s academic fantasy to assume that every facebook user has a well-above-average 360 friends, and that their friends have the same high 360 friends, etc.  Someone starting with such a high friend count will see their network expand quickly into friends of friends whose own counts sit closer to, or just below, the facebook average.  This dilution in friend count therefore pushes one’s degrees of separation above 3.6.  Also, even among your family and friends, and their friends, some will know one another, so they are not unique, randomly drawn people among the 1.6 billion users.  This slower marginal expansion of the network likewise requires more degrees of separation than facebook’s stated 3.6 average.

So even if you start with an above-average 360 friends and then make these adjustments, you will soon find yourself requiring closer to 4.5 degrees of separation to know everyone (not the 3.5-3.6 that facebook claims).  Even more if you expand from facebook to the rest of human civilization (astounding to some, ~6 billion of the world’s population are not on facebook).  So these differences in degrees of separation are quite a bit to make a fuss over.  One extra degree is worth the equivalent of you knowing not just your friends, but also being friends with all of their friends.

Before we dig into the math, which is quite a bit easier than the confusing, “hash”-laden technical description they vaguely put out to explain their model, let’s show facebook’s distribution of degrees of separation.  Then we can discuss reality.

My own degrees of separation is shown to be marginally higher than that of the American facebook founder, Mark Zuckerberg.  This is odd, since Mr. Zuckerberg is one of the most famous people on the planet, and even the subject of a biographical movie.  Note also that our Google+ accounts are within the same order of magnitude of page views (mine with 2.1 million views versus his with 3.1 million views).  These facts should hint that something is a little awry in how these social media data can mislead, and that we need some level of confidence (intervals) to better grasp any news statistic (e.g., facebook users are typically 3.5 degrees of separation apart).


Notice in their illustration above that they show a range about their 3.6 degrees of separation (or 3.5, as the media continually headlines the statistic).  This “range” is roughly 2.7 on the low side (note that facebook limits the number of friends one can have to the low thousands, so ironically the curve beyond that is only possible outside that site), and 4.6 on the high side.  Quite tight, as we’ll see later.  They also state that “most” facebook users have at least 100 friends.  We can generously infer from probability models that such a comment might mean a typical number of facebook friends of 200, which is what we will use throughout this analysis (in truth even a higher value such as 360 wouldn’t change the results at all).  Let’s now show a range of possible scenarios for what the degrees of separation look like, given your starting number of facebook friends.


As we noted before, the range of output shows the flexibility in thinking about the distribution and the errors, something facebook did not provide with their single distribution further above.  We’ll start with the dark blue curve, which represents the example we just went through, with 360 friends all having 360 friends who don’t know one another.  Following the horizontal axis to roughly 360, and then going up to the corresponding degrees of separation in dark blue, we see it comes to the 3.6 that we noted is facebook's stated average in their modeled “distribution”.  The mathematics behind knowing a constant number of friends of friends, none of whom know one another, is as follows.

friends^dos ~ 1.6 billion
dos * ln(friends) ~ ln(1.6 billion)
dos ~ 21.2 / ln(friends)

Where “friends” is the number of facebook friends, and “dos” is degrees of separation.  So for 360 friends, where ln(360) is about 5.9, the final equation above gives 3.6 dos:

3.6 ~ 21.2 / 5.9

Using this formula, we see that someone with 100 friends has 4.6 degrees of separation (the higher end of facebook's stated range), and someone with 2550 friends has 2.7 degrees of separation (the lower end of that range).  That’s some range in number of facebook friends!  But for our typical facebook user with a generously allotted 200 friends, this equates to 4.0 degrees of separation (not the 3.5/3.6 they claim).  And this dark blue line is only a minimum starting point for the degrees of separation.  The adjustments that follow only add to it.
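For readers who want to check the arithmetic, here is a minimal Python sketch of the no-overlap formula above.  The function name and the constant are illustrative choices of mine, not anything from facebook's own write-up, and the 1.6 billion user count is simply the figure quoted in this post.

```python
import math

FACEBOOK_USERS = 1.6e9  # user count quoted in this post (an input assumption)


def degrees_of_separation(friends, population=FACEBOOK_USERS):
    """Solve friends**dos = population for dos.

    Assumes every person has the same number of friends and that
    none of those friends know one another (the dark blue curve).
    """
    return math.log(population) / math.log(friends)


# Friend counts discussed in the text.
for friends in (100, 200, 360, 2550):
    print(friends, round(degrees_of_separation(friends), 1))
```

Run as written, this reproduces the figures quoted above: 4.6, 4.0, 3.6 and 2.7 degrees of separation for 100, 200, 360 and 2550 friends respectively.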

One should tweak this assumption based on how quickly one's friends of friends approach the typical count of 200, regardless of how high one's own initial number of friends is.  For someone starting with 360 friends, the degrees of separation rise from 3.6 to 3.9 (represented by the light blue curve), assuming all of one’s friends immediately have 200 friends and all further layers of the network stay at that level.  If we instead assume the network morphs gradually from, say, 360 friends to 200 friends only later as we move through the network (e.g., friends of friends), then the degrees of separation go to 3.8 instead of the 3.6 starting point.  We see this on the medium blue curve, and either way the point is that the degrees of separation rise from facebook's stated 3.6.
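The light blue case can be sketched in the same way.  The snippet below covers only the "immediate drop" scenario, where you have your own friend count and everyone beyond you has the typical 200; the gradual case depends on how fast the count decays, which is not spelled out here, so it is left out.  The function and parameter names are hypothetical.

```python
import math

FACEBOOK_USERS = 1.6e9  # user count quoted in this post (an input assumption)


def dos_with_immediate_drop(own_friends, typical_friends=200,
                            population=FACEBOOK_USERS):
    """Solve own_friends * typical_friends**(dos - 1) = population for dos.

    The first step out uses your own friend count; every later step
    uses the typical count.  No overlap between friends is assumed.
    """
    remaining = math.log(population) - math.log(own_friends)
    return 1 + remaining / math.log(typical_friends)


print(round(dos_with_immediate_drop(360), 1))  # light blue curve at 360 friends
```

For 360 starting friends this gives about 3.9, matching the light blue curve, and for someone who starts at exactly 200 friends it collapses back to the 4.0 of the simple formula.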

Now sticking with the individual with 360 friends: even if those friends all knew 360 other people, some of them will know one another and hence are not unique.  We would be mistakenly double or triple counting.  So in the red curves we instead assume that only ½ of all new friends are unique within the 1.6 billion users, while the other ½ are overlaps already in one’s network, with that unique share halving again at each further step.  Enjoy the formulae below for that set-up.

friends * [friends/2] * [friends/4] * … * [friends/2^(dos-1)] ~ 1.6 billion
friends^dos / 2^[0+1+2+…+(dos-1)] ~ 1.6 billion
dos*ln(friends) - dos/2*(dos-1)*ln(2) ~ ln(1.6 billion)

So for 200 friends, for example:

dos*5.3 - dos/2*(dos-1)*0.69 ~ 21.2

Instead of 4.0 degrees of separation, therefore, someone with the typical 200 facebook friends now has ~6 degrees of separation!  Now we are talking about the theoretical limit previously, in the physical world, named after the American actor Kevin Bacon.  So for the typical facebook user, we should assume that they are somewhere between the blue and the much higher red degrees-of-separation curves (albeit, of course, much closer to the blue curve).  And they are also closer to the higher, medium-color curves than to the dark curves.  The medium curves represent friends of friends whose counts quickly come down to 200 within the network.  And if you have slightly fewer friends than 200, there simply is no mathematical solution that lets your decaying friend count, among people who only partially know one another, ever expand to reach everyone.  On the flip side, if you have more than a ¼ million actual friends, then you have the potential to know everyone within just 2 degrees of separation!
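The overlap equation is a quadratic in dos, so it can be solved directly.  The sketch below takes the smaller root and returns None when no real root exists; the function name is hypothetical and the halving-overlap rule is exactly the assumption stated above, nothing more.

```python
import math

FACEBOOK_USERS = 1.6e9  # user count quoted in this post (an input assumption)


def dos_with_overlap(friends, population=FACEBOOK_USERS):
    """Solve dos*ln(friends) - dos/2*(dos-1)*ln(2) = ln(population).

    Rearranged as a*dos**2 - b*dos + c = 0 with a = ln(2)/2,
    b = ln(friends) + a and c = ln(population).  Returns the smaller
    root, or None when the network can never reach the whole population.
    """
    a = math.log(2) / 2
    b = math.log(friends) + a
    c = math.log(population)
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        return None  # the decaying network stalls before reaching everyone
    return (b - math.sqrt(discriminant)) / (2 * a)


for friends in (150, 200, 360, 250_000):
    print(friends, dos_with_overlap(friends))
```

With 200 friends the root comes out just under 6, with a ¼ million friends it falls below 2, and with 150 friends there is no solution at all.  Under these particular assumptions the cutoff works out to roughly 160 friends.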

So in total, with these possible adjustments, the typical facebook user must have somewhere between 4.0 and 4.5 degrees of separation (notice the encircled region highlighted on the chart above).  That is much greater than the roughly 3.5-3.6 they state.  And the actual range in degrees of separation is also a world of difference larger than what they show.  It is instead closer to the distribution on the right (there are no axis units associated with the dark probability curve, but imagine twice the vertical units shown for the histogram), more strongly reflecting probabilistic reality: a detectable low end at <<2.0 (not 2.7), and a detectable high end at >>6.0 (not 4.6).  Another way of interpreting the chart below is to appreciate that while in facebook’s top chart there is less than a ¼ chance of degrees of separation being at least 4.0, in the chart below we estimate that being at least 4.0 has a greater than ½ chance of occurring.

What we learn from doing this analysis is that the true degrees of separation between a facebook user and all other 1.6 billion users is closer to 4.5.  And if we consider this for all humans, not just facebook users, the answer of nearly 5 or 6 sits even further above facebook’s hyped 3.5/3.6 degrees of separation.  Adding people to the planet, unless our individual networks also enlarge in the real world (which ironically they don’t if we are too busy sitting alone on facebook!), only increases our degrees of separation; it does not reduce them.  It is interesting that the primary driver of our degrees of separation is how many friends we start with in our network (not the quality of those friends).  And in order to avoid statistics abuse, facebook and other technology companies should show a range of scenarios to give a sense of precision, as we were easily able to do here (quickly and for free).

In a world of cratering technology stock prices (note the recent record lows on LinkedIn and Twitter), knowing a true gauge of a social media site’s reach makes a world of difference.  Treating news from these companies with both thrill and skepticism is also justified.
