Friday, June 19, 2015

Games and Monty Hall

The Monty Hall problem is an absorbing conditional probability puzzle based on the American television game show Let's Make a Deal.  It is the same problem featured in the movie adaptation of Bringing Down the House, where an M.I.T. professor uses the problem as part of a classroom case study for card games.  It was originally posed in a letter several decades ago (though infamous variants existed >100 years ago).  We'll show the solution to this original problem (also in Chap. 2 of Statistics Topics), as well as some lessons from solving new variations we propose in this web log article.

Original puzzle (1 car, 2 goats)
Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1 [but the door is not opened], and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?

The correct solution is to always switch doors.  A large number of people, including PhDs and famed scientists, have been publicly, and profanely, wrong in insisting that switching doors is an incorrect solution (which is why they are not real statisticians!).  Most people instead believe either that there is now a 50-50 chance that the car is behind either of the two remaining doors, or that the case for their original Door 1 choice is now strengthened.  For this article we'll keep the solutions easier by always relating the "host's door" to Door 2 (instead of Door 3 as stated explicitly in the letter).

This game could be framed as a Bayesian problem, where we start off with an understanding that the initial probability of the car hidden behind any of the three doors is an equal 1/3.  One car, three doors, easy enough.  And so for the initial Door 1 choice, this probability of 1/3 must stay that way through the entire puzzle, no matter what.  Due to symmetry we continue to visualize the solution below, with the host selecting Door 2 to reveal a goat.   


|  | Door 1 (your selection) | Door 2 | Door 3 |
|---|---|---|---|
| Initial probability for where car is | 1/3 | 1/3 | 1/3 |
| Transitioned probability for where car is after host selects Door 2 | 1/3 | 0 | 1/3 + 1/3 = 2/3 |
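
The two rows of the table can be sanity-checked empirically.  What follows is a minimal simulation sketch, not part of the original solution: the function names `play_round` and `win_rate` are made up for illustration, and it assumes the host always opens a goat door other than the contestant's, choosing at random when two such doors are available.

```python
import random

def play_round(switch, rng):
    """Play one round of the 3-door game; return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Assumption: the host knows where the car is and opens a goat door
    # that is not the contestant's pick (at random if two qualify).
    host = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one door that is neither the original pick nor the host's door.
        pick = next(d for d in doors if d != pick and d != host)
    return pick == car

def win_rate(switch, trials=100_000, seed=0):
    """Estimate the win probability over many simulated rounds."""
    rng = random.Random(seed)
    wins = sum(play_round(switch, rng) for _ in range(trials))
    return wins / trials

if __name__ == "__main__":
    print("stay:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

With enough trials the "stay" estimate should settle near 1/3 and the "switch" estimate near 2/3, matching the bottom row of the table.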


The English mathematician Bayes developed the underlying probability law bearing his name.  The basic Bayesian idea is to update one’s understanding of a distribution based upon new information, which in this case is the game host revealing that there is a goat behind Door 2.

We can see the formulae below, and they might take a little bit of staring at before they are usable, which is why we go through intuitive examples here in the article.  To apply Bayes to our Monty Hall problem, think of event B as a goat originally being behind a given door, such as the one you initially chose, and event A as the “revised” outcome of a goat being behind one of the "not chosen" doors after the host reveals what’s behind one of them.  Ultimately we know event B occurs, in that a goat is behind the host's opened door, even though at the start of the game we don't know which door that is.

Say we wanted to know the probability of event A, conditioned on event B:

p(A | B)*p(B)
= p(B | A)*p(A)

p(A | B) 
= p(B | A)*p(A)/p(B)

While the general formulation is shown above, sometimes we need to further re-describe p(B) to attain a mathematical solution. 

p(B) 
= p(B | A)*p(A) + p(B | A complement)*p(A complement)
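
As a hedged numerical illustration of the two formulae, the sketch below plugs in one concrete choice of events.  These definitions are an assumption made for this example only (they are one common way to set the problem up, not necessarily the framing used elsewhere in this article): A is "the car is behind Door 3" and B is "the host opens Door 2", with the contestant on Door 1 and a host who picks at random when he has two goat doors to choose from.

```python
from fractions import Fraction

# Assumed event definitions (for this sketch only):
#   A = the car is behind Door 3
#   B = the host opens Door 2, given that you picked Door 1
p_A = Fraction(1, 3)
p_not_A = 1 - p_A

# Car behind Door 3: the host's only goat door (other than yours) is Door 2.
p_B_given_A = Fraction(1)

# If A is false, the car is behind Door 1 or Door 2, each with probability 1/2.
#   Car behind Door 1: host opens Door 2 or Door 3 at random, so 1/2.
#   Car behind Door 2: host cannot open Door 2, so 0.
p_B_given_not_A = Fraction(1, 2) * Fraction(1, 2) + Fraction(1, 2) * 0

# p(B) = p(B | A)*p(A) + p(B | A complement)*p(A complement)
p_B = p_B_given_A * p_A + p_B_given_not_A * p_not_A

# p(A | B) = p(B | A)*p(A)/p(B)
p_A_given_B = p_B_given_A * p_A / p_B

print(p_B, p_A_given_B)  # prints: 1/2 2/3
```

Under these assumptions the posterior for Door 3 comes out to 2/3, in line with the table earlier in the article.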

Now let’s return to the initial 1/3 probability of the car being behind any of the three doors.  If we set aside Door 1 (i.e., the initial contestant-selected door), then the probability of the car being behind either Door 2 or Door 3 is 2/3.  And these door events are of course mutually exclusive and collectively exhaustive (MECE).  So there is now a 2/3 probability of the car being behind Door 3 alone, since the host later shows that the goat is (and the car is not) behind Door 2.  Note that the 2/3 probability is not an insignificant difference; it's 100% greater than the initial 1/3 probability behind each door.

First variation (1 car, 3 goats)
It’s nice to suddenly master the original Monty Hall example, but what if the problems of tomorrow always look a little different from what you're used to?  How would you calculate the transformed probabilities, and know what hedging strategy to deploy, when tackling newer questions that are twists on the original?  With this in mind, we’ll work out the solutions to two variations of the Monty Hall problem.  How does the example change if there is 1 car and 3 goats behind a total of 4 doors?  Or if there are 2 cars and 2 goats behind 4 doors?

We can reason out the first variation of the Monty Hall problem first, since most people find this variation more intuitive.  The initial probability of the car being behind any of the four doors is now an equal ¼.  You select a door, leaving three “not selected” doors which total 3*¼ probability among them.  Once the host reveals a goat behind one of those 3 “not selected” doors, the remaining two “not selected” doors share the transitioned ¾ probability of the car between them: ½*¾ (or 3/8) each.  Your initially selected door still has a ¼ (or 2/8) probability.  

Since 3/8>2/8, it is in your best gaming interest to change your selection to one of the two remaining “not selected” doors.  Each of these latter doors has a 50% greater chance of hiding the car than your initial selection (3/8 versus 2/8).  The original Monty Hall problem, while less intuitive to most people, exhibited an even larger 100% difference, as shown above, for those who decide to switch doors.
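
The same reasoning extends to any number of doors with a single car.  The sketch below is an illustrative generalization rather than something from the original puzzle: the helper name `stay_and_switch` is hypothetical, and it assumes the host opens exactly one goat door among the doors you did not select.

```python
from fractions import Fraction

def stay_and_switch(n_doors):
    """Exact win probabilities with 1 car, n_doors doors, and one goat door opened by the host."""
    if n_doors < 3:
        raise ValueError("need at least 3 doors")
    stay = Fraction(1, n_doors)
    # The other n_doors - 1 doors hold the remaining probability; after one goat
    # is revealed it is shared equally among the n_doors - 2 still-closed ones.
    switch = (1 - stay) / (n_doors - 2)
    return stay, switch

for n in (3, 4):
    stay, switch = stay_and_switch(n)
    print(n, stay, switch, switch / stay)
# prints:
# 3 1/3 2/3 2
# 4 1/4 3/8 3/2
```

The last column is the ratio of the switched door's probability to the initial door's: 2 (a 100% gain) for three doors and 3/2 (a 50% gain) for four.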

Second variation (2 cars, 2 goats)
Now we'll start with reasoning out our second variation of the Monty Hall problem (if there are 2 cars and 2 goats, all behind 4 doors).  And we'll see how far we can go in learning from the alterations in the assumptions.  First though, let’s test your hypothesis of what the final solution will look like, taking notice of how you make logical deductions as well.  Relative to the probability associated with your initial selection, the transformed probability of the switched selection is:
[A] about 1.0 times the initial selection
[B] about 1.33 times
[C] about 1.5 times (=the 50% gain solved for in the first variation of 1 car and 3 goats)
[D] about 1.67 times
[E] about 2.0 times (=the 100% gain solved for in the original Monty Hall problem of 1 car and 2 goats)

The initial probability of a car being behind any given one of the four doors is an equal 2/4 (or ½).  This is a high probability, and for your initially chosen door, it remains with you throughout the puzzle.  One can't add these probabilities together for a car behind all doors (½*4), since we must instead use combination theory (search this blog for combinations for some samples).  But here in the puzzle, as in many situations in life, things get more complicated.  We have to consider the conditional probabilities after the host selects, say, Door 2, depending on what was behind your initial Door 1 selection.


|  | Door 1 (your selection) | Door 2 | Door 3 | Door 4 |
|---|---|---|---|---|
| 50% initial probability for car in Door 1 |  | Goat | 50% probability for a car after host selects Door 2 | 50% probability for a car after host selects Door 2 |
| 50% initial probability for goat in Door 1 |  | Goat | 100% probability for a car after host selects Door 2 | 100% probability for a car after host selects Door 2 |
| Transitioned probability for car after host selects Door 2 | 50% like always | 0 | 75%, only conditional on if selected | 75%, only conditional on if selected |


For example, there is a 50% chance there was a car already behind Door 1 (recall we stated upfront there was an equal 2/4 chance).  In these 50% of cases, the probability of the other car being behind each of the two "not selected" doors (Door 3 and Door 4) is 50%.  In the other 50% of cases, where there was not a car behind Door 1, the host uncovering Door 2 with a goat means there is a 100% probability that whichever "not selected" door you pick (Door 3 or Door 4) contains a car.

So the transition probability becomes:

p(A)*p(B | A) + p(A complement)*p(B | A complement)


This keeps the general form of the formula from the top of the article, but of course with different event definitions for variables A and B.

p(initial door=car)*p(second door=car | given initial door=car)
+ p(initial door=goat)*p(second door=car | given initial door=goat)
= 50%*50% + 50%*100%
= 75%
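
The 75% figure can also be checked by brute-force enumeration.  The following is a sketch under stated assumptions, not a definitive model of the game: the function name `switch_win_probability` is hypothetical, the contestant is fixed on the first door (index 0), every car layout is equally likely, the host opens one goat door among the "not selected" doors (at random if more than one is available), and the switcher then picks one of the remaining "not selected" doors at random.

```python
from fractions import Fraction
from itertools import combinations

def switch_win_probability(n_doors=4, n_cars=2):
    """Exact chance of winning a car by switching, with the contestant on door 0."""
    layouts = list(combinations(range(n_doors), n_cars))  # equally likely car placements
    others = list(range(1, n_doors))                      # the "not selected" doors
    total = Fraction(0)
    for cars in layouts:
        goats = [d for d in others if d not in cars]
        for host in goats:  # host opens each available goat door equally often
            remaining = [d for d in others if d != host]
            wins = sum(d in cars for d in remaining)
            total += (Fraction(1, len(layouts))
                      * Fraction(1, len(goats))
                      * Fraction(wins, len(remaining)))
    return total

p_switch = switch_win_probability(4, 2)
p_stay = Fraction(2, 4)
print(p_switch, p_switch / p_stay)  # prints: 3/4 3/2
```

Calling the same function with `(3, 1)` or `(4, 1)` reproduces the 2/3 and 3/8 results from the earlier sections, which is a useful cross-check on the enumeration itself.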


We notice in this quick MECE partitioning of the Bayes puzzle we are able to get an answer of [C] (75% is 50% more than the initial probability of 2/4).  And while 3/4 here is higher than the 2/3 we got in the original Monty Hall puzzle, it is important to note that having more cars in relation to doors (as we have here) implies a greater chance to discover a car behind a "not selected" door if a contestant switches their choice.  Also note in the presentation above that the bottom line is not MECE but a summary (MECE sums to 100% and 50%+75%+75%>100%).  

It also makes sense in this article to appreciate that probability models, from Las Vegas to the rest of the real world, are sometimes not constant.  One always needs to be adaptable to changing assumptions, and to know how to break through when thinking through different assortments of related probability problems.  Future situations may continue to expose new twists, similar to the ones we showed in the article above.


Unrelated: We've had a number of important blog shares.  The first is for our "Fine, if you ignore history" via The Big Picture (by Bloomberg contributor).  And for our most recent "Volatility-products + prayer" article given the current high stock prices, it was stylishly among the top read business articles anywhere per Street EYE; nicely shared by a number of people including most recently by Tom Keene (Bloomberg editor-at-large), Howard Lindzon (head of investment fund and of Stock Twits), Zero HedgeCFA, Abnormal Returns, and a Federal Reserve advisor.

1 comment:

  1. A comment is needed based on many offline notes. The puzzle has been infamous across time and cultures. It was even in a recent Ian McEwan novel. But not completely original as earlier variants existed >100yrs ago, including the established Bertrand's box paradox...

    There are 3 boxes:
    1. a box containing 2 gold coins,
    2. a box containing 2 silver coins,
    3. a box containing 1 gold coin and 1 silver coin.

    After choosing a box at random and withdrawing 1 coin at random, if that happens to be a gold coin, what is the probability that the remaining coin is gold?
