Tuesday, February 19, 2013

An Elaboration on the Monty Hall Problem

Many, many years ago I breezed into a Mensa meeting and saw then Chancellor of the Triple Nine Society, Cyd Bergdorf, and Ron Hoeflin working busily on some kind of problem.  I knew Cyd personally, but I knew Ron only by reputation.  Being the gadfly that I was, I walked over, sat down and asked what they were working on.  It was the now famous 'Monty Hall Problem.'  They explained it to me.  It goes like this:

Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?

I immediately said, 'You switch, of course.  And, by the way, the qualification that the host knows what is behind the door is unnecessary.  I'm going to go get a drink.'  This remark that the qualification is not needed has led to a never-ending dispute with other high-IQ people, most recently Garth Zietsman and Rick Rosner.  So, I am going to put the argument here where it may be able to garner a larger audience.

To make it very clear, I am stating that the reservation is unnecessary precisely in the context of the problem as stated.  There are a whole lot of Bayesian assumptions that, I argue, do not apply because there is nothing in the problem that states that it should be taken as one in a series of similar games.

First, I will reproduce verbatim what I found to be the most clearly stated positions of Rick Rosner and Garth Zietsman.  If you want to read the full dialogue, it is available here

Rick Rosner
Three doors, you pick one, then Monty randomly opens one. 1/3 chance the game is wrecked, because Monty prematurely revealed a car - he's never supposed to reveal the car until you make your final choice of door. 2/3 chance the game isn't wrecked. Two equal possibilities among the 2/3 of games that aren't wrecked - car is behind your first choice, or car is behind the door Monty didn't open. Random door-opening seems to either wreck games or leave an equal probability among the remaining unopened doors.

Garth Zietsman
There is an urn with one blue and two green balls. A selects a ball followed by B (randomly) who notes his ball is green. If A had selected a blue ball B would have had two ways to select a green ball. If A had selected a green ball B would have had one way in which to select a green ball. However A had two ways in which to select a green and one way to select a blue ball. So the chance of A having selected a blue given that B's was green is 1*2/(1*2 + 2*1) = 1/2.

When B can look into the urn and deliberately pick out a green then his probability of selecting green is always 1 and the probability of A selecting blue given that B has green is simply the initial probability of his selecting blue.


Michael Ferguson
Now both Rick and Garth are correct if we assume that the game described is one element in a string of games where n>>1.  However, that is not stated in the problem.  We have no reason to assume that this game has ever been played before or ever will be played again.  If that was the problem, they would be correct.  We could expect to improve our odds from one third to two thirds if Monty knows what is behind the doors and chooses to always open a door with a goat behind it.  We would expect that our odds would be the same by either staying or switching if Monty doesn't know what is behind the doors.  However, that is not the problem as stated.

When we flip a coin, the natural odds of a fair toss coming up heads are 1/2.  If we choose one door out of three, the natural odds of choosing a car are 1/3.  One might ask how these natural odds of 1/3 could somehow change to 1/2.  The answer is that they do not.  Something very different is the cause.

Suppose we play 12 games.  We would expect to choose a car door 4 times.  That gives us the 4/12 = 1/3 natural odds.  If Monty always opens a goat door we didn't choose, the remaining unopened door will hide the car in the other 8 games, so the odds if we switch are 8/12, or 2/3.  However, if Monty doesn't know, we will have 4 games where we chose a car door, 4 games where the unchosen door has a car, and 4 games where Monty, as Rick puts it, wrecks the game.  So, now, switch or not, we should expect to win 4 cars.
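
As a rough check of the twelve-game tallies above, here is a minimal Python sketch that simulates many repeated games and scales the counts to "per 12 games."  The function names, the number of simulated games, and the random seed are illustrative assumptions, not part of the original problem, and the sketch only speaks to the repeated-game setting described in this paragraph.

```python
import random

def play(host_knows, rng):
    """Play one game and report which choice wins, or 'wrecked'."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    others = [d for d in doors if d != pick]
    if host_knows:
        # The host deliberately opens a goat door the player did not choose.
        opened = rng.choice([d for d in others if d != car])
    else:
        # The host opens one of the two unchosen doors at random.
        opened = rng.choice(others)
    if opened == car:
        return "wrecked"
    return "stay" if pick == car else "switch"

def per_twelve(host_knows, games=120_000, seed=0):
    """Simulate many games and scale the tallies to a block of 12 games."""
    rng = random.Random(seed)
    counts = {"stay": 0, "switch": 0, "wrecked": 0}
    for _ in range(games):
        counts[play(host_knows, rng)] += 1
    return {k: 12 * v / games for k, v in counts.items()}

if __name__ == "__main__":
    print("host knows:       ", per_twelve(True))
    print("host doesn't know:", per_twelve(False))
```

With a knowing host the tallies should come out near 4 wins for staying and 8 for switching per 12 games; with an ignorant host, near 4 for staying, 4 for switching, and 4 wrecked games.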

So the odds do not actually change from the natural 1/3 to 1/2.  Rather, we have created a subset that doesn't include four unopened car doors.  This artificially decreases the probability that the unchosen door has a car.  In other words, and this is absolutely the crux of the issue, the lowering of the odds does not reside in the individual games but rather is a characteristic of the subset that is created.  The change in odds is directly a result of opening car doors.

In other words, by resorting to Bayesian reasoning, Garth and Rick are calling into existence games where car doors are opened when, in fact, there is no indication in the language of the problem that there ever will be played any other games, with or without Monty knowing what is behind the door.  In the only case of the game that we know to exist, Monty revealed a goat.

This game naturally belongs to the set of game strings of length n in which no car doors are opened.  By naturally, I mean that there is no way to take this game out of that set.  If I could state that this game belongs to a set of n games in which Monty Hall knows what is behind the doors, I could take it out of that set by stating that Monty Hall doesn't know what is behind the doors.  But it irrevocably belongs to this set.

I then state that a common factor of all game strings, regardless of the value of n, is that we expect that there are twice as many cars behind the unchosen doors as there are behind the chosen doors.  To illustrate the significance I use the following thought problem.

Suppose we play twelve simultaneous games. We choose one door in each of twelve sets of three doors. Our expectation is that we will choose a car four times and will choose a goat the other eight times. Then Monty opens one door in each set of three doors and reveals a goat. Now, we have chosen a car door four times and the twelve unchosen doors contain the other eight cars. Clearly, we should switch doors when asked.

Now after all this transpires, Monty tells us that he didn't know what was behind all the doors. We are surprised because the probability of him choosing only goats in twelve straight games is (2/3)^12, or roughly 1 in 130. Yet, it does not change the proper strategy; there are still only four cars behind the chosen doors and eight behind the unchosen doors.

Now suppose that we play three simultaneous games. The logic is the same. Our chosen doors hide only one car and the unchosen doors hide two. We should switch. We are not so surprised when Monty tells us that he didn't know what was behind the doors because the probability of him choosing only goats has risen to (2/3)^3, or roughly 1 in 3.4. In fact, for n games in which no car doors are opened, no matter the value of n, switching doubles our chances.
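
The (2/3)^12 and (2/3)^3 figures above can be checked with exact fractions.  This is a small sketch that assumes, as the text does, that a host opening an unchosen door at random reveals a goat with probability 2/3 in each game; the function name is made up for illustration.

```python
from fractions import Fraction

def all_goats_probability(n):
    """Chance that a randomly opening host reveals a goat in all n games."""
    return Fraction(2, 3) ** n

for n in (3, 12):
    p = all_goats_probability(n)
    print(f"n={n}: p = {p} (about 1 in {float(1 / p):.1f})")
# n=3: p = 8/27 (about 1 in 3.4)
# n=12: p = 4096/531441 (about 1 in 129.7)
```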

This happens because there is nothing magical about Monty knowing or not. The advantage falls from 2:1 to 1:1 by the process of games with opened car doors being eliminated from the pool. In other words, the odds do not change one iota UNTIL a car door is opened. When we are presented one game in which a goat was revealed and n=1, there are no car door openings that can change the odds of having chosen a car door from 1/3 to 1/2. Consequently, in one game in which a goat was revealed, my initial probability of 1/3 of having chosen a car door remains. The unchosen door still contains 2/3 of the probability.


For some reason, even people at the highest IQ level can have difficulty in grasping this.  I want to make it clear that I am not arguing against Bayesian probability.  I am only arguing that it is applicable only in those cases where multiple 'runs' of the game are taking place.
 

5 comments:

  1. I'm not surprised people have difficulty grasping this because it doesn't make sense. And btw you are arguing against Bayesian probability which is not reliant on performing a large number of trials (that's called Frequentist probability)

    You acknowledge that over a large number of trials the probability of winning by switching (for ignorant Monty) is 1/2 but in a single trial it is 2/3. Don't you see the contradiction there? What you are saying is that probability theory doesn't apply to a single outcome.

    You make a point of saying the problem only relates to a 'single game', then spend a considerable amount of time showing the results of multiple trials - I'm not sure what the point of that was.

    You also make several statements that are either incorrect or dependent on specific conditions existing, for example:
    "Yet, it does not change the proper strategy; there are still only four cars behind the chosen doors and eight behind the unchosen doors". Simply not true, there could be any number of cars (between 0 and 12) behind the chosen doors with varying probability values, and, given that Monty has randomly revealed 12 goats from the 12 pairs of remaining doors you should revise your original estimate (upwards) that there are only 4 cars behind your chosen doors.

    And:
    "in one game in which a goat was revealed, my initial probability of 1/3 of having chosen a car door remains. The unchosen door still contains 2/3 of the probability" is only true under certain well-defined conditions.

    There is a difference in probability calculations between an event that DID NOT happen and an event that CANNOT happen.

    The Bayesian calculation for the problem where Monty can open a car door but didn't:

    P(A|B) = P(A).P(B|A)/P(B) and
    P(A) and P(B) are 'a priori' probabilities.
    P(A) = Prob(Player's door contains car) = 1/3
    P(B) = Prob(Host opens a goat door) = 2/3
    P(A|B) = (1/3 * 1)/(2/3) = 1/2

    Contrast with the problem where Monty cannot open a car door:
    P(A) = Prob(Player's door contains car) = 1/3
    P(B) = Prob(Host opens a goat door) = 1
    P(A|B) = (1/3 * 1)/1 = 1/3

    Both solutions only rely on 'a priori' probability estimates and not the results of a large number of trials.

  2. Marley52,

    You are correct as to my assertion. Bayesian probability calculations are not valid when one or more of the initial probabilities have already collapsed. In other words the proper calculation is not all cases from the initial state, because we are not at the initial state. The proper calculation is the probability of those cases where a donkey was exposed.

    Let's use coin flips as an example. If we are going to flip a coin twice, there is a 25% probability that we will flip two heads. However, if we flip a heads and then ask you what the probability is that we will flip two heads the answer is 50%, because we have already flipped the first one. Easy to understand here but difficult when it is a donkey door. I don't understand why.

  3. Bayesian probability calculations are used to solve conditional probability problems of the type "What is the probability of Event A happening GIVEN that Event B has (already) happened?", so I can't agree with your statement "..are not valid when one or more of the initial probabilities have already collapsed".

    The proper calculation is to consider all possible outcomes of the probability event (including those that couldn't happen, those that could happen but didn't happen, and those that could happen and did happen) and assign probabilities to each possible outcome.

    So, in the problem where Monty doesn't know which door the car is behind, opening the door the contestant picked (an outcome that couldn't happen) is assigned a probability of zero, opening the car door (an outcome that could happen but didn't happen) is assigned a probability of 1/3, opening a goat door (an outcome that could happen and did happen) is assigned a probability of 2/3.

    I don't see the relevance of the coin flip "analogy", the coin flips are independent events, Monty opening a door you didn't pick isn't.

    Compare the above to the standard MHP, where the probability Monty opens the car door is assigned a probability of zero because it is an outcome that couldn't happen, and the probability Monty opens a goat door is assigned a probability of 1 because it is certain to happen.

    The difference in probabilities between the outcomes of the two problems is why you get different results

    What you don't do is assign probabilities based on a single result of an experiment, as you are trying to do ("he opened a goat door therefore the probability he opens a goat door is 1" is false reasoning)


    Replies
    1. I agree with Marley52 except when he says that it doesn't make sense ;-)
      The way I see it in the case Monty chooses a door after you at random is:
      A priori you have 6 equally probable cases (3 choices for you times the 2 following for Monty).
      2 of these include Monty revealing the car.
      So when Monty reveals a goat door it means you are within the other 4.
      Amongst these 4 cases, 2 include you having chosen the car door, hence the 1/2 probability to win, switching or not.

    2. I meant it was Michael's original post that didn't make sense :).
      I think he got his prior and posterior probabilities mixed up.
      Good explanation BTW of why a random opening results in a 50/50 choice

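
For readers who want to check the numbers quoted in the comments, the following Python sketch reproduces the two Bayes' rule calculations from the first comment and the six-case count from the reply.  The helper name posterior and the choice to fix the car behind door 0 are assumptions made for illustration; the sketch only restates the commenters' arithmetic.

```python
from fractions import Fraction
from itertools import permutations

def posterior(prior, likelihood, evidence):
    """Bayes' rule: P(A|B) = P(A) * P(B|A) / P(B)."""
    return prior * likelihood / evidence

# A = player's door hides the car, B = host opens a goat door.
p_a = Fraction(1, 3)
random_host = posterior(p_a, Fraction(1), Fraction(2, 3))  # host may open the car door
knowing_host = posterior(p_a, Fraction(1), Fraction(1))    # host never opens the car door

# Cross-check the random-host case by listing the six equally likely
# (player door, host door) pairs, with the car fixed behind door 0.
CAR = 0
cases = list(permutations(range(3), 2))
goat_cases = [(player, host) for player, host in cases if host != CAR]
player_has_car = [case for case in goat_cases if case[0] == CAR]
enumerated = Fraction(len(player_has_car), len(goat_cases))

print(random_host, knowing_host, enumerated)  # 1/2 1/3 1/2
```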