I’ve been reading Leonard Mlodinow’s The Drunkard’s Walk: How Randomness Rules our Lives. The book covers the Monty Hall problem, Bayes’s Theorem, availability bias, the illusion of control and so forth. If these are unfamiliar, look no further for an entertaining account.
On the other hand, I can’t say that I learned much I didn’t already know. Nevertheless, I still enjoyed reading the book – it’s well written and filled with interesting nuggets (Did you know that the great mathematician Paul Erdos refused to believe that you should switch doors?). If you teach probability theory or intro stats you will find lots of good examples to brighten up your lectures.
One problem did intrigue me. Suppose that a family has two children. What is the probability that both are girls? Ok, easy. Probability of a girl is one half, probabilities are independent thus probability of two girls is 1/2*1/2=1/4.
Now what is the probability of having two girls if at least one of the children is a girl? A little bit harder. Temptation is to say that if one is a girl the probability of the other being a girl is 1/2 so the answer is 1/2. That’s wrong because you are not told which of the two children is a girl and that makes a difference. Better approach is to note that without any additional information there are four possibilities of equal likelihood for the sex of two children (B,B), (G,B), (B,G), (G,G). If we know that at least one is a girl we can remove (B,B) so three equally likely possibilities, (G,B), (B,G), (G,G), remain and of these 1 has two girls so the answer is 1/3.
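The counting argument above can be checked by brute-force enumeration. This is a minimal sketch, assuming the four ordered (older, younger) pairs are equally likely; the helper name `prob` is made up for illustration.

```python
from fractions import Fraction
from itertools import product

# All ordered (older, younger) sex pairs, assumed equally likely.
families = list(product("BG", repeat=2))

def prob(event, given=lambda f: True):
    """P(event | given), found by counting equally likely outcomes."""
    pool = [f for f in families if given(f)]
    return Fraction(sum(1 for f in pool if event(f)), len(pool))

two_girls = lambda f: f == ("G", "G")
at_least_one_girl = lambda f: "G" in f
older_is_girl = lambda f: f[0] == "G"

p_unconditional = prob(two_girls)                    # no information
p_given_a_girl = prob(two_girls, at_least_one_girl)  # "at least one is a girl"
p_given_older_girl = prob(two_girls, older_is_girl)  # a specific child is a girl

print(p_unconditional, p_given_a_girl, p_given_older_girl)  # 1/4 1/3 1/2
```

The third number is the tempting wrong answer to the second question: it is what you get when you are told which child is the girl.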
Ok, now here is the stumper. What is the probability of a family having two girls if one of the children is a girl named Florida?
At first it seems impossible that knowing the name should make a difference. Surely, the answer is 1/3 just as before? After all, every child has a name. But knowing the name does make a difference. Here’s a hint, Florida is a rare name.















Is it because Florida is a rare name, the girl named Florida is more likely to be the second girl? This makes (G,G) more likely than (B,G) or (G,B).
I hope I am not embarrassing myself with this argument.
Half of one of course.
But I’m going to name both of my girls Florida just to spite him!
actually i’m sure theres something missing there…heh =)
What if the family’s last name is ‘Florida’?
P(G=2 | G=Florida) = P(G=Florida | G=2) P(G=2) / P(G=Florida). I guess the probability is pretty big if P(G=Florida) is small.
Does her dad have a scar on his cheek and an evil eye?
My colleagues and I have spent lots of time arguing over this question, as one of us insists on using it in interviews. I claim the problem is ill-defined, i.e. it is not yet a well-posed mathematical problem, because even though the English sounds correct, the “preparation procedure” for the situation is ambiguous.
For example, is “Florida” a constant or a random variable? If I searched for a family with a girl of this name, making “Florida” a constant, then the answer is 1/2. If I took any family such that there is at least one girl and asked for a name and reported the answer “Florida” (making “Florida” a random variable), then this name provides no new information (after all every child DOES have a name) and the answer is 1/3.
Rarity of name plays no role here. We may as well assume all names are unique, for example perhaps names are DNA sequences.
In fact, one can extend this reasoning to the “easy” part of the problem, the first part. How did I choose this family before presenting the problem? Perhaps I chose the family to have EXACTLY one girl or EXACTLY two girls. Then the probabilities are 0 and 1 respectively. The reason that Part 1 seems unambiguous is that there seems to be a “canonical” preparation procedure that everyone assumes, i.e. the family is uniformly randomly chosen from all families having two children and at least one girl.
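The two preparation procedures in this comment can be compared exactly. The sketch below is an illustration under assumptions that are not in the comment: each girl is independently named Florida with probability x, sexes are independent 50/50, and in the second procedure the girl who is asked is chosen at random. The function names are hypothetical.

```python
from fractions import Fraction

def search_for_florida(x):
    """Procedure A: keep every two-child family that contains a girl named Florida."""
    x = Fraction(x)
    gg = Fraction(1, 4) * (1 - (1 - x) ** 2)  # two girls, at least one is Florida
    mixed = Fraction(1, 2) * x                # GB or BG, and the girl is Florida
    return gg / (gg + mixed)

def ask_a_random_girl(x):
    """Procedure B: take a family with at least one girl, ask one of its
    girls (picked at random) for her name, and hear 'Florida'."""
    x = Fraction(x)
    gg = Fraction(1, 4) * x     # whichever girl is asked, she is Florida w.p. x
    mixed = Fraction(1, 2) * x  # the only girl is Florida w.p. x
    return gg / (gg + mixed)

rare = Fraction(1, 1000)
print(search_for_florida(rare))  # 1999/3999, just under 1/2
print(ask_a_random_girl(rare))   # 1/3
```

Under these assumptions Procedure B gives 1/3 for every x, while Procedure A gives (2-x)/(4-x), which is close to 1/2 only when the name is rare.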
Although I am just an organic chemist, and we never use statistics or probability (thank God), I agree with AntiAntiCamper. Everyone is rare along some axis, so what is true of the Florida girl is true of any sibling for a slightly different wording of the problem. In other words, as AntiAntiCamper points out, since rare is not numerically defined, one can use DNA data to arrive at the limit of uniqueness (or close to it, say 1 out of 80 billion). Since every human who is not an identical twin is genetically unique, we can make a “Florida”-like statement about any sibling.
I have a comment/question for those here who are more versed in probability than I am.
I claim that it does not matter if you know or do not know which child (older/younger) was revealed to you UNLESS which child is to be revealed is part of the conditions of the test. It does you no good to be told afterward which child was revealed to you. Here is my reasoning:
You have a family with “at least one girl.” As we have seen above, this means that there is a 1/3 possibility of the other child being a girl. The reason for this is that only one of the four cases can be absolutely rejected (BB). The other three remain and are all equally likely. This appears to me to be totally sound.
However, if we are now told “the older child is a girl,” the claim is that the probability changes to 1/2 that the other child is also a girl because we can now also reject BG, leaving two equally likely possibilities.
But if this claim is true, then I believe that the following line of reasoning is valid:
At least one of the children is a girl and therefore there must be a girl child that is the older or younger child. We have no information about which child she may be, so she is 50% likely to be the older child and 50% likely to be the younger child. If the above claim is true, then if we know that she is the older child then the younger child is a 50% shot to be another girl, and if we know she is the younger child then the older child is a 50% shot to be another girl. Since 50% * 50% + 50% * 50% = 50%, I claim that even before we know whether the child is the older or younger child, the probability is 50% that the unknown child is a girl. But this is a contradiction, since it has been established that the probability before we know which child it is is only 1/3. Therefore, I claim that it does no good to know which child is the girl, UNLESS that requirement is stipulated as part of the test.
In other words, if you consider all cases where the older child is a girl, half are immediately rejected because the older child is a boy and the probability is 1/2 that the other child is a girl. However, if either the older or younger child or both could be a girl and you are simply told which it is after knowing that the family has “at least one girl,” meaning that you could be told either “older” or “younger,” you have not actually acquired any new information and the probability remains 1/3.
I feel that I am wrong about this. I feel that this is likely because I am moving from “at least one child” to positively selecting one of the children to be the “at least one child,” and that this is probably an error. Nonetheless, I would like feedback. I’m not scared of math, so if you need to use it, by all means, do so.
If a family already has a girl named Florida, why would they also name their other girl Florida? You’d think this is a trivial point but it completely affects the general case of Mlodinow’s solution, which he gave in a comment in the WSJ link above.
He writes,
“According to the rules of conditional probability the chances a family has two girls if it has a girl named Florida are thus:
(Total Probability of GF-GN, GN-GF, and GF-GF) / (Total Probability of all 5 of the above events) =
= [.25x*x + 2*.25x*(1-x)] / [.25x*x + 2*.25x*(1-x) + 2*.25x]
= [2x-x*x] / [4x – x*x]”
x is the probability that a girl would be named Florida. So if x is very small, then the solution approaches 1/2. However, as x increases, the solution decreases.
But what if the question was,
“What is the probability of a family having two girls if one of the girls is named Jen?”
Assuming that 5% of girls are named Jen, the answer would be 49.4%. Close to 50%, but not a trivial difference.
If 10% of girls are named Jen, then the answer would be 48.7%. But this assumes, per Mlodinow’s solution, that 1% of all families with two girls would name both of their girls Jen!
So Mlodinow’s solution doesn’t really work for cases like “girls named Florida” where names aren’t given out randomly. But it still works if “girls named Florida” is replaced with a trait that is truly random, even if the solution is somewhat counterintuitive.
*And yes, regardless of whether a family would name both of their daughters Florida the answer is still approximately 1/2, but the way Mlodinow arrives at his solution assumes names are given out randomly.
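The figures quoted in this comment can be reproduced from the quoted expression. This is a minimal sketch; the function name `p_two_girls` is made up, and it keeps the quoted solution's assumption that each girl is independently named with probability x.

```python
from fractions import Fraction

def p_two_girls(x):
    """P(two girls | at least one girl has the name), where x is the
    chance that any given girl has that name (independent across girls)."""
    x = Fraction(x)
    quarter = Fraction(1, 4)
    # named-named, named-other and other-named girl pairs
    numerator = quarter * x * x + 2 * quarter * x * (1 - x)
    # add the two boy-and-named-girl cases
    denominator = numerator + 2 * quarter * x
    return numerator / denominator

for x in (Fraction(1, 1000000), Fraction(1, 20), Fraction(1, 10)):
    print(x, f"{float(p_two_girls(x)):.1%}")
# 1/1000000 50.0%
# 1/20 49.4%
# 1/10 48.7%
```

The expression simplifies to (2-x)/(4-x), so it falls from 1/2 toward 1/3 as the name becomes more common.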
One half. Because families with two girls have two chances to name one “Florida” while those with only one girl have only one chance. It doesn’t quite work out because there is some chance that both will be named “Florida” (in a strict probability sense), but that’s why it’s important that the name is rare, so the chance of that is negligible.
I have Bruce Schechter’s book on Erdos, and he says that Erdos was convinced a few days later by Ron Graham at Bell Labs.
Yes, I agree that this question is ill-posed. At the WSJ, they phrased it like this:
You know that a certain family has two children, and you remember that at least one is a girl with a very unusual name (that, say, one in a million females share), but you can’t recall whether both children are girls. What is the probability that the family has two girls — to the nearest percentage point?
But as others have suggested, we can rephrase it thus:
You know that a certain family has two children, and you remember that at least one is a girl with a combination of genes (that, say, one in a million females share), but you can’t recall whether both children are girls. What is the probability that the family has two girls — to the nearest percentage point?
This is now true of anyone, since you can always just use any arbitrarily large gene set (or sequence) till you come up with one that is sufficiently unique.
John Lynch,
You’re double counting. Consider the three cases with at least one girl: GG, GB, BG. Assume the older child is a girl: then there is a 50% chance that the younger is a girl. Assume the younger child is a girl: again, there is a 50% chance the older is a girl.
But you cannot then add .5*.5 + .5*.5. Why? You need to subtract out the intersection. You’re adding up “older child a girl times younger child a girl” to “younger child a girl and older child a girl” which is the same thing.
Mathematically, if one kid must be a girl, where YG is younger girl and OG is older girl: P(GG) = P(OG)P(YG|OG) + P(YG)P(OG|YG) - P(OG and YG) = 2/3*1/2 + 2/3*1/2 - 1/3 = 1/3.
Following up on my earlier comment/question:
From AntiCamperCamper:
“For example, is “Florida” a constant or a random variable? If I searched for a family with a girl of this name, making “Florida” a constant, then the answer is 1/2. If I took any family such that there is at least one girl and asked for a name and reported the answer “Florida” (making “Florida” a random variable), then this name provides no new information (after all every child DOES have a name) and the answer is 1/3.”
This makes me feel that my earlier hypothesis is correct, but ACC knows the terminology better than I do. In my early problem, if which child the girl is is a constant, then the probability of the other child being a girl is 1/2, but if it is a random variable then it does not affect the outcome because each child has to be either older or younger.
Think of it as a game where two coins are each hidden under a cup. At the start, you know nothing, so the probability of HH under the cups is 1/4. If the person running the game looks under the cups and then tells you that at least one of them is H, then the probability of HH is now 1/3. At this point, if you tell the person running the “game” to reveal an H to you and he agrees, it cannot be the case that the probability that the unrevealed coin is H is now 1/2. This is because one coin HAD to be revealed, and it HAD to be an H, and an H HAD to be hidden, and it was equally likely which coin was going to be revealed (assuming that in the HH case, the revealed coin is chosen randomly). If the probability changed to 1/2, then by my reasoning above, it must also have been 1/2 prior to the revelation (since each possible revelation was equally likely).
This is massively different from the case where the person running the game must reveal the coin under a specific, predetermined cup even after telling you that at least one coin is an H. In this case, if it turns out to be an H, then the probability of the other coin being an H is 1/2. If it is a T, then the probability of the other being an H is exactly 1, since at least one must be an H. When we talk about the probabilities changing when we are told that the older child is a girl, this is only true if the older child HAD to be revealed to us, boy or girl.
In any case, I now feel that my original hypothesis is right. Feedback still welcome, of course.
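The two versions of the cup game can be worked out exactly. This is a sketch under the assumptions stated in the comment: the host has already said truthfully that at least one coin is H, and in the HH case he picks which cup to lift at random. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# States left after the host says "at least one is H"; each has probability 1/3.
states = [s for s in product("HT", repeat=2) if "H" in s]

def reveal_some_head():
    """Host lifts a cup that covers an H (at random if both do).
    Returns P(hidden coin is H | cup i was lifted) for i = 0, 1."""
    joint = {}  # (lifted cup, hidden coin) -> probability
    for s in states:
        heads = [i for i in (0, 1) if s[i] == "H"]
        for i in heads:
            key = (i, s[1 - i])
            joint[key] = joint.get(key, 0) + Fraction(1, 3) / len(heads)
    return {i: joint[(i, "H")] / (joint[(i, "H")] + joint[(i, "T")])
            for i in (0, 1)}

def reveal_fixed_cup(cup=0):
    """Host must lift a predetermined cup, whatever it covers.
    Returns P(hidden coin is H | face shown) for each face."""
    result = {}
    for face in "HT":
        pool = [s for s in states if s[cup] == face]
        result[face] = Fraction(sum(s[1 - cup] == "H" for s in pool), len(pool))
    return result

print(reveal_some_head())  # 1/3 for either cup
print(reveal_fixed_cup())  # 1/2 after an H, 1 after a T
```

Under these assumptions the numbers agree with the comment: learning which cup held the promised H changes nothing, while a forced reveal of a fixed cup does.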
The skeptics are off-base, the question is fine as is.
The relevant assumptions are these, there is nothing unusual about them:
1) 50% of all births in two-child families are male, 50% female.
2) For any two distinct births X,Y:
Gender[X] is conditionally independent of Gender[Y];
Gender[X] is conditionally independent of Name[Y];
Name[X] is conditionally independent of Name[Y];
3) P(Name[Y]=Florida | Gender[Y]=Female) = 0.001 or something else small.
Now here’s the proof:
There are nine possible events in a two-child family.
The probabilities below follow directly from the premises (1), (2) and (3).
P(Female-Florida, Female-Florida) = 0.5*0.001 * 0.5*0.001
P(Female-Florida, Female-NonFlorida) = 0.5*0.001 * 0.5*0.999
P(Female-Florida, Male) = 0.5*0.001 * 0.5
P(Female-NonFlorida, Female-Florida) = 0.5*0.999 * 0.5*0.001
P(Female-NonFlorida, Female-NonFlorida) = 0.5*0.999 * 0.5*0.999
P(Female-NonFlorida, Male) = 0.5*0.999 * 0.5
P(Male, Female-Florida) = 0.5 * 0.5*0.001
P(Male, Female-NonFlorida) = 0.5 * 0.5*0.999
P(Male, Male) = 0.5 * 0.5
Work it out: find the aggregate probability mass of:
a) All events that contain two females at least one of whom is named Florida,
b) All events that contain a female named Florida.
You will find that (a)/(b) is about equal to 0.5. QED
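For anyone who would rather not do the sum by hand, here is a short sketch that builds the nine events from premises (1) to (3) and takes the ratio exactly. The value 0.001 is the one used above; the dictionary keys are just labels.

```python
from fractions import Fraction
from itertools import product

x = Fraction(1, 1000)  # premise (3): P(Name = Florida | Female)

# Probability of each kind of single birth, from premises (1) and (3).
child = {
    "Female-Florida": Fraction(1, 2) * x,
    "Female-NonFlorida": Fraction(1, 2) * (1 - x),
    "Male": Fraction(1, 2),
}

# Premise (2): births are independent, so the nine joint events multiply.
events = {(a, b): child[a] * child[b] for a, b in product(child, repeat=2)}

# (a) two females, at least one of whom is named Florida
mass_a = sum(p for pair, p in events.items()
             if "Male" not in pair and "Female-Florida" in pair)
# (b) any event containing a female named Florida
mass_b = sum(p for pair, p in events.items() if "Female-Florida" in pair)

ratio = mass_a / mass_b
print(ratio, float(ratio))  # 1999/3999, about 0.499875
```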
It is hard to tell now that I know what is actually meant, but I was originally taken in by the “one child is x, what is the probability of the other being x” question. The way you put it (“at least one child is x”) makes it clear that you are picking among them. However, when I first encountered the problem, it merely noted the status of one child, without letting on that both children had been searched to find said “x”.
I think an intuitive way to see it is that in the three allowed cases (G,B), (B,G), (G,G), knowing that one of the girls is named Florida reduces the probability of (G,B) and (B,G) almost twice as much as the probability of (G,G), because the chance of finding a Florida in one of the spots of (G,G) is almost twice that in the other pairs.
If you are picking people randomly to find Floridas, you are almost twice as likely to find her in a (G,G) pair as in either a (G,B) or (B,G) pair. Likewise, starting from a random population with equal numbers of (G,B), (B,G) and (G,G) pairs and filtering out all pairs not containing a Florida, you will keep a (G,G) pair almost twice as often as the other kinds of pairs, as you have two chances of falling on a Florida in the (G,G) pair.
If the frequency of Floridas is 1/1000:
f: named Florida
nf: not named Florida
Knowing a (G,B) or (B,G) pair:
P(Gf, B)=1/1000 * 1
the probability of a Florida is 1/1000
P(Gnf,B)=999/1000 * 1
the probability of no Florida is 999/1000
Knowing a (G,G) pair:
(Gf,Gnf) = 1/1000*999/1000 ~1/1000
(Gnf,Gf) =999/1000*1/1000 ~1/1000
(Gf,Gf)=1/1000*1/1000 ~ 0
the probability of a Florida is ~ 2/1000
(Gnf,Gnf)=999/1000*999/1000 ~ 998/1000
the probability of no Florida is ~ 998/1000
2/1000 is twice 1/1000
This throws the entire answer to the first question (1/3) into doubt for me. Why? Well, let’s say we have our family with at least one girl. That girl has to have a name, yes? Let’s call that name “N” and any others “X”. So our choices are: (GN, GX), (GX, GN), (GN, B), (B, GN), and (GN, GN). It is very rare for someone to name both their daughters the same name! Therefore the answer to the original question is ~50%.
Aren’t you all making this more complex than it needs to be?
There are three cases
1. you know that one of the children is a girl. So the possibilities are;
GB BG GG.
As Tyler said, only 1 of 3 gives us two girls
2. you know the first (or second) is a girl. So the possibilities are;
GB GG
So with more information than the first instance, 1 of 2 gives us two girls
3. you know one of the girls is unique (has the name florida). Note; we ignore the case of both children having the same name or one being transgendered or a hermaphrodite etc.
So the possibilities are
BG(named Florida) G(named Florida)B G(named Florida)G GG(named Florida)
Which gives us 2 of 4 or 1 of 2 chances that both are girls
I have been pleasantly surprised how cordial and respectful everyone has been toward one another. That was until Radford Neal implied this is actually “endless unprofitable debate with people who refuse to think”.
OK. I admit I was a bit too preemptively ornery. The underlying truth here, though, is that some people do get locked into one, incorrect, way of thinking, and it’s very hard to dislodge them from this.
There are wider implications to these puzzles. In AI, some people spent decades debating questions like “You’re told Twiggy is a bird, do you assume that Twiggy can fly?”, but “Now you’re told that Twiggy is a penguin, do you still think that Twiggy can fly?”, etc. These all are isolated from any context that would allow one to assign probabilities. That may be a problem with the “Florida” puzzle, though I think the intended interpretation is fairly obvious. In my puzzle, the context is established by narrative to avoid such problems.
Is the original question affected by the fact that it is assumed (by the 1/3 proponents) that the questioner has complete knowledge and a possible fixed answer in mind (“I will find a girl”)?
If the situation were such that there were two rooms containing either a boy or a girl. The questioner instead of looking at both, only looks in one and says “one child is x” or “at least one child is x”. Is this not a reasonable situation for a person to assume? In this case, I don’t see how one can affect the other, just as one coin toss does not affect the next.
On the other hand, if the questioner was looking for an “at least one girl” family, then by default you are ruling out 1/4 of chances, not by probability, but by choice.
If the questioner was not looking for a certain “x”, but did look into both rooms and chooses a random child to describe, does that change the answer from random to “unbalanced”?
Part of the reason I question it is that the “Monty Hall” problem works because there is total knowledge which affects which door is removed. If “Monty” removed a random other door, possibly making the prize unattainable, then it would make no difference to change.
I am not sure we should be treating the events as independent. We are assuming, for example, that BG and GB are independent possibilities. Since they will be “selected” simultaneously for observation, then the two possibilities are not independent.
If I introduce you to two of my children, and they are both boys, then the possibility of an unrelated child being a boy is 50%. The two possibilities are simply BBG and BBB.
If the third child happens to be related, it does not expand the possibilities to include GBB and BGB. The age of the children was never discussed. The only two independent possibilities are:
1. Two boys and a girl. Or
2. Three boys.
The order is not relevant.
Holy crap, I’ve just skimmed most of the comments, but I think you guys are over thinking this. Work through it just like the first problem.
Here are all of the possibilities for two children if we allow three “types” (B-boy, F-girl named Florida, G-girl not named Florida):
BB, BF, BG
GB, GF, GG
FB, FF, FG
The problem says we need at least one F, which leaves us with:
BF, GF, FB, FG, FF
We need to make one assumption: that Florida is a rare name, and the probability of two Floridas in one family is nearly zero, so now we have four equally likely possibilities:
BF, GF, FB, FG
Two of these are 2 girls, so the probability is 2/4 or 1/2.
It may be more accurate to say the probability is ever so slightly greater than 1/2… and assuming nothing about the probability of the name Florida, we can say the probability of two girls lies somewhere between 1/2 and 3/5.
The difference here is that pinning down one of the names changes the GG case, in that the order of the two girls now matters. In the general case, a girl is a girl, it doesn’t matter how they’re ordered, we can say it’s symmetrical. Stephen, in your case where you say “whatever her name is, call it N” you still have a symmetrical case, because you can say that of both girls, so the GG case just becomes NN and there’s no change (p=1/3). The probability only changes if you single out ONE of the girls as somehow special.
1/500
Just because “Florida” is a rare name doesn’t mean it’s rare for a particular family. If dad says, “If I have a daughter I want to name her after my grandmother Florie,” and mom is down with that, then P(Florida|Girl) = 1. If Florida is a special name, and not a flippant choice, it’s much more likely that the first-born daughter will take the name.
So I don’t think it’s the rarity of the name that matters, but knowing anything identifying about the girl (Florida, the older one, the blond one, etc.) changes the odds to 1/2, as RobbL put so succinctly.
David,
I think the rarity helps in that it distinguishes between what is likely to be shared between any siblings (or within sexes). In other words, if it is likely that any girls will share the same characteristic (both parents are blond, therefore you would expect most or all of the children to have blond hair), then it doesn’t tell you anything about how the questioner made his/her distinguishing observation.
Mike Sackton has the best explanation of the fundamental difference between the two questions (with no messy probability calculations).
Tac-Tics, your second paragraph there is wrong. The thing about the (B,B), (G,B), (B,G), (G,G) approach is that each of those pairs is equally likely. If you take the set approach that (B,G) = (G,B), then sure you have three states, but that one is twice as likely as the other two, and the numbers still work out to 1/3.
But I have to admit the Florida version of the question gives me headaches. I understand the logic of the nine states approach giving you an answer of 1/2; but I have a hard time convincing myself that approach is the correct one to take…
It’s easy to see that if the older child is a girl then the probability of both girls is 1/2, ie, the probability that the second child is a girl. More generally, if some specific child is a girl — the older child, or in this case the one with the memorable name — then the probability that the other child is a girl is 1/2.
This also works if you meet the parents pushing one of their kids in a stroller and the kid is a girl. You’ve learned slightly more than “at least one is a girl”. You’ve learned that one specific, essentially randomly chosen child is a girl. So the probability that the other one is a girl is 1/2.
It’s like putting the kids in an urn and pulling one out and seeing it’s a girl.
I don’t think stating the order of birth necessarily helps you, as it could be post hoc information retrieval. If you had assurances beforehand that the questioner was going to report the sex of the first born, then that helps you. If you don’t know whether they are going to report either the first or second born, even if they tell you after the fact, it doesn’t help you.
This is very similar to another well-known problem, which perhaps makes it easier to understand:
I have a deck of cards, shuffled randomly (all permutations equally likely). I deal out the top two cards, face down.
a) What is the probability both are red?
b) I look at the cards, and truthfully tell you one is red. What is the probability both are red?
c) The first card I dealt, I flip over, and you see it is red. What is the probability both are red?
d) I look at the cards, and truthfully tell you one is the ace of hearts. What is the probability both are red?
Here we can use approximate probabilities, so we can just say that the answer to (a) is 1/4, not (26*25)/(52*51).
This avoids all the confusion about “choosing girls vs choosing families”, etc. The key point, of course, is whether (d) is like (c) or like (b)!
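The four card questions can be settled exactly by enumerating every ordered deal of two cards. This is a sketch that reads “one is red” in (b) as “at least one is red” and “one is the ace of hearts” in (d) as “the ace of hearts is among the two”; both readings are assumptions. It uses exact fractions rather than the approximate ones suggested above.

```python
from fractions import Fraction
from itertools import permutations

ranks = "A23456789TJQK"
deck = [rank + suit for suit in "HDSC" for rank in ranks]  # e.g. "AH" = ace of hearts
deals = list(permutations(deck, 2))                        # 52 * 51 ordered deals

def red(card):
    return card[1] in "HD"

def prob(event, given=lambda d: True):
    """P(event | given), counting equally likely ordered deals."""
    pool = [d for d in deals if given(d)]
    return Fraction(sum(1 for d in pool if event(d)), len(pool))

both_red = lambda d: red(d[0]) and red(d[1])

answer_a = prob(both_red)                                    # no information
answer_b = prob(both_red, lambda d: red(d[0]) or red(d[1]))  # at least one red
answer_c = prob(both_red, lambda d: red(d[0]))               # first card is red
answer_d = prob(both_red, lambda d: "AH" in d)               # ace of hearts present

print(answer_a, answer_b, answer_c, answer_d)  # 25/102 25/77 25/51 25/51
```

Under this reading (d) comes out exactly equal to (c), roughly 1/2, while (b) is roughly 1/3.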
anobdemus,
Any unambiguously feminine name, even one as common as “Jen,” is enough to uniquely identify one child and change the problem. Requiring P(2 Jens) > 0 is nonsensical.
In the first case, the information you have is that there are two children. You don’t know there’s even one girl. P = 1/4.
In the second case you know there is one girl, eliminating B-B. P = 1/3.
In the third case you learn there is one girl and she has a name. Names are unique within a family and so you have Florida-B, Florida-G. P = 1/2.
I think in the second case you can and should make the assumption that the girl has a name even if you don’t know it, and therefore the second case always degenerates to the third case.
Ah, I see my mistake. When you’re guaranteed to have at least one girl as one of your two children, the events are no longer independent of each other (since having a boy first determines the sex of your second child).
Carry on.
In my universe there are 3 families. I’m going to tell you my labels for them but I won’t tell you their REAL names. However, I WILL tell you that they all have unique names. M = male, F = female.
Family 1: M1 F1
Family 2: M2 F2
Family 3: F3 F4
I’ve uniformly randomly chosen a family and put them in Room A. We all agree that there is at least one girl in Room A and the probability that there are two girls in Room A is 1/3, right?
OK, I’ve just spoken with a girl in Room A and her name is Hawaii.
What is the probability NOW that there are two girls in the room?
I hope “mk” and Benoit are still lurking about. They gave the best challenges to my early post.
Observations:
1. If all names are equally rare, then discovering a name cannot change the probabilities as it is clear we get no new information.
2. These probability puzzlers ultimately must be solved by using Bayes’s Theorem. Informal probabilistic reasoning is simply not sufficient for these subtleties.
3. The Bayes’s Theorem solution is itself extremely subtle in this case.
4. This is a really great problem to think about.
I think MikeP has it.
When you can’t tell the girls apart, you could double-count them.
Jordan and Hawaii and you happen to know Jordan is a girl, but you don’t know which one she is.
Jordan and Hawaii and you happen to know Hawaii is a girl, but you don’t know which one she is.
It’s one possibility whichever one of them you know about. Two chances that the one you don’t know about is a boy, one chance the one you don’t know about is a girl.
But if you can tell them apart then there are only two choices for the one you don’t know about.
The statistics turns weirder for the cases where you can’t tell things apart. We assume we can’t tell electrons apart or protons apart etc. And we get weird statistics concerning elementary particles. Coincidence? You decide.
That is to say, you need to make the problem about individuals (microstates), rather than members of a group (macrostates).
Rarity does play a role. …assuming that a family would not name both girls Florida.
These statements are not both correct. Rarity does not necessarily play a role if the appearances of the trait are not independent. If, as you explicitly mention, a family does not give both siblings the same name, then the rareness of the name is utterly immaterial.
In particular, even if all girls were named Florida except those with a sister named Florida, then the probability is still 1/2.
Intuitively, rarity of the name is a proxy for uniqueness of the trait among the siblings. But if the trait is already unique — e.g., a name — then rareness does not matter.
Let’s go back and rephrase the exact same problem a different way.
A trickster has a cabinet with three drawers. In each drawer he has two coins. In two of the drawers he has one gold coin and one silver coin. In the other drawer he has two gold coins.
He picks a drawer at random and shows you one coin. It turns out to be a gold coin. What’s the chance the other coin is also gold?
That chance is 1/2. There was a 2/3 chance that he picked a drawer that had a silver coin in it, in the first place. If he did that, there was a 1/2 chance he’d show you the silver coin and then you’d know for sure that it wasn’t the drawer with two gold coins. But that didn’t happen. So of the four remaining possibilities, two of them are with the drawer with two gold coins.
Now suppose that the trickster picks a drawer at random, and then shows you a gold coin from it. Now the chance is 1/3 that the other coin is gold. It was a 1/3 chance to begin with, and there’s always a gold coin for him to show you so that doesn’t change the odds at all.
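Both versions of the drawer trick can be counted directly. This is a minimal sketch, assuming each drawer is equally likely to be picked and, in the first version, each coin in the drawer is equally likely to be shown. The function names are made up.

```python
from fractions import Fraction

drawers = [("gold", "silver"), ("gold", "silver"), ("gold", "gold")]

def random_coin_shown():
    """He shows a coin at random and it happens to be gold.
    Returns P(other coin is gold | shown coin is gold)."""
    outcomes = [(d, i) for d in drawers for i in (0, 1)]  # six equally likely
    shown_gold = [(d, i) for d, i in outcomes if d[i] == "gold"]
    return Fraction(sum(d[1 - i] == "gold" for d, i in shown_gold),
                    len(shown_gold))

def gold_coin_always_shown():
    """He deliberately shows a gold coin. Every drawer has one,
    so seeing gold rules nothing out."""
    return Fraction(sum(d.count("gold") == 2 for d in drawers), len(drawers))

print(random_coin_shown())       # 1/2
print(gold_coin_always_shown())  # 1/3
```

The first function keeps four of the six (drawer, coin) outcomes, two of them from the double-gold drawer, which is the “four remaining possibilities” count described above.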
If the father could be talking about either child, then there are two different ways for the other to be a boy. G1B G2B G1G2. If the father is definitely talking about one of them then there’s only one way for the other to be a boy. G1B.
Similarly with for example the double-slit experiment from physics. If you know which slit the photon or the electron went through, then the result is very different from the case in which you don’t know.
That should be: P(A|B,C) = P(A|B) = 1/3.
MikeP,
In your example, why would P(Z) not equal P(Y,T), the joint probability of at least one girl and the likelihood of Florida?
If P(Z) = P(Y,T), then P(Z) = P(T)P(Y|T).
And since the probability of at least one girl is completely independent of the popularity of the name “Florida,” P(Y|T) = P(Y).
I think P(Z|X) ~ 2P(T) is fallacious, because you are changing X’s meaning from, “chance of family of 2 girls,” to “looked at any 2 girls.”
If you try to use Bayes by saying, “If I look at enough N I’ll find R,” or P(R|N) = P(N|R)P(R)/P(N), you find that P(N) has no logical meaning. It’s 1. So P(R|N) = P(R).
I think you have to consider Z shorthand for both Y and T, and determine P(X|Y,T), which remains 1/3.
In your example, why would P(Z) not equal P(Y,T), the joint probability of at least one girl and the likelihood of Florida?
Y contains the following possibilities:
G B
G G
Z contains the following possibilities:
GT B
GT G
GT GT
I don’t think there is any way you can distribute T across Y to get Z.
In particular, the second and third cases of Z are vastly different and are completely uncaptured by saying that P(Z)=P(Y)P(T). And it is exactly the second case that contributes all the weight of probability that turns the 1/3 of the generalized girl into the 1/2 of the specialized girl, while the problem is interesting only because P(GT GT) is insignificant or zero.
I think P(Z|X) ~ 2P(T) is fallacious, because you are changing X’s meaning from, “chance of family of 2 girls,” to “looked at any 2 girls.”
I don’t agree. The probability that at least one girl has trait T given there are two girls is the ambient probability one has the trait plus the ambient probability the other has the trait minus the ambient (or forced) probability both have the trait.
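The inclusion-exclusion step described above can be written out as a small sketch. The function name and the `p_both` parameter are hypothetical. By default it assumes the two girls get the trait independently, and passing `p_both=0` models the "forced" case where siblings cannot share the trait.

```python
from fractions import Fraction


def p_at_least_one_has_trait(t, p_both=None):
    """P(at least one of two girls has trait T), by inclusion-exclusion.

    t      -- ambient probability that a single girl has the trait
    p_both -- probability both have it; defaults to t*t (independence assumed)
    """
    t = Fraction(t)
    if p_both is None:
        p_both = t * t
    return t + t - Fraction(p_both)


rare = Fraction(1, 100)
print(p_at_least_one_has_trait(rare))            # 2T - T^2
print(p_at_least_one_has_trait(rare, p_both=0))  # exactly 2T when sharing is impossible
```

For a rare trait the two results are nearly the same, which is why the approximation P(Z|X) ~ 2P(T) is used in the discussion.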
Mike P Says:
You could do this with any trait, so long as the negation of the trait is not correlated with sex.
And then gives some examples of traits.
I do not think that “any” trait is sufficient. I believe that the trait needs to allow one to specify what happened in the original random event. I.e., two separate children were born; does the trait allow us to specify which of these two children was female?
Knowing that the oldest was female does allow us to specify what happened in the original random event.
Knowing that the right handed baby is female does not, however, allow us to specify. Even if we know that only one baby is right handed then we still can’t make inferences about what happened in the original sample space. The situation would be different, however, if we knew in advance that there was going to be one right handed and one left handed baby. But this is not the case.
When we know in advance that there is one child called Florida, and she is female, then we can specify which baby was female, and the probability that the other is female is 1/2.
Even if we know that only one baby is right handed then we still can’t make inferences about what happened in the original sample space.
Exactly. And since we cannot make inferences about the other sibling from this meager sex-independent information, we are stuck with the a priori probability that a child is a girl: 1/2. It is exactly the lack of information conveyed by calling out traits of one child that are not possessed by the other that prevents us from lumping cases together and complicating the probability the other is a girl.
Put another way, I can, with probability 1, tell you something about one of my children that is not true of the other child. If the negation of that something has nothing to do with sex, I am giving you absolutely no information about the sex of my other child. The probability it is a girl remains the a priori 1/2.
However, if we meet a girl on the street who tells us that her name is Florida, and that she has one sibling, the probability that her sibling is female is 1/3.
Actually, this is the very simplest case. You meet a girl. She tells you she has a sibling. She can tell you anything and everything about herself and, so long as she does not implicate her sibling in doing so, the probability that her sibling is a girl is 1/2.
Nope.
It is true that only a third of all two-child families with girls in them have two girls in them. But two-girl families have twice as many girls to contribute to walking down streets. So half the girls from two-child families you’ll meet walking down the street come from families with two girls in them.
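The difference between counting families and counting girls can be checked by brute-force enumeration. This is a minimal sketch that assumes the four birth orders are equally likely and that every girl is equally likely to be the one you meet. The variable names are my own.

```python
from fractions import Fraction
from itertools import product

# The four equally likely two-child families: BB, BG, GB, GG.
families = list(product("BG", repeat=2))

# Counting families: among families with at least one girl, how many have two?
families_with_girl = [f for f in families if "G" in f]
p_counting_families = Fraction(
    sum(f == ("G", "G") for f in families_with_girl), len(families_with_girl)
)

# Counting girls: among all girls in these families, how many have a sister?
girls = [(f, i) for f in families for i in (0, 1) if f[i] == "G"]
p_counting_girls = Fraction(sum(f[1 - i] == "G" for f, i in girls), len(girls))

print(p_counting_families)  # fraction of girl-containing families with two girls
print(p_counting_girls)     # fraction of girls whose sibling is also a girl
```

The two-girl family shows up once in the first list but contributes two entries to the second, which is where the two different answers come from.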
See my Jul 09, 2008 at 08:57 PM comment upthread for how the problem statement decides for us whether we are counting families or counting girls.
Your distinction between counting girls and counting families is not correct.
I toss two coins. I show you one coin and it is a head. The probability that the other coin is a head is 1/3. Try this at home.
A couple has two kids. You meet one child and she is female. The probability that the other is female is 1/3.
P(HH) = 1/4. P(HH|H) = 1/3. Notice that the coin you looked at is a 1943 steel penny, but that doesn’t affect its chance of coming up heads.
You are using P(A|B) = P(B|A)P(A)/P(B) and creating a hybrid B = HeadsIsSteel or GirlIsFlorida. Your P(B|A) defines A as “N=2”, but your P(A) and P(A|B) define A as TwoHeads.
I’d like to hear your explanation why you shouldn’t use
P(A|B,C) = P(C|A,B)P(A|B)P(B) / (P(B|C)P(C))
Where B,C is the joint probability of the first coin you look at being both heads (B) and steel (C).
Since P(Steel|HH,H) = P(Steel) and P(H|Steel) = P(H), P(HH|H,Steel) = P(HH|H). The first coin being steel does not tell you about both coins being heads. You are being tricked into throwing out your prior: P(HH|HIsSteel) -> P(H), or P(GG|GIsFlorida) -> P(G).
It totally violates my intuition, but the answer is 1/2 (in the limit of Florida being a very unlikely name). I think the simplest way to see this is by making a list of mutually exclusive and exhaustive possibilities, assigning probabilities to each.
To start with I’ll assume that names are assigned independently, so that it is possible for siblings to have the same name.
I’m defining the probability to be: “Probability that a family with two children has two girls, given that they have at least one girl, and her name is Florida” (*)
Here are the possibilities (note these are mutually exclusive and exhaustive, and you can check that the probabilities sum to 1):
P(BB) = 1/4
P(BF) = 1/4 T
P(BN) = 1/4 (1-T)
P(FB) = 1/4 T
P(NB) = 1/4 (1-T)
P(FN) = 1/4 T (1-T)
P(NF) = 1/4 (1-T) T
P(NN) = 1/4 (1-T)^2
P(FF) = 1/4 T^2
(BB means two boys, BF means the first child is a boy and the second a girl named Florida, FF means two girls both named Florida, etc. T is the probability that a girl is named Florida.)
Our desired probability is …
P(FN + NF + FF) / P(FN + NF + FF + BF + FB) =
[(1/4) (T(1-T) + (1-T)T + T^2)] / [(1/4) (T(1-T) + (1-T)T + T^2 + T + T)] =
[(1/4) (2T-T^2)] / [(1/4) (4T-T^2)]
So the answer here is (2T-T^2)/(4T-T^2). Note that if *all* girls are named Florida, this reduces to 1/3, so we have the correct limit. For T<<1, we get the 1/2 answer.
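The table above can be checked mechanically. Here is a minimal sketch that rebuilds the nine outcomes from per-child probabilities and conditions on at least one girl named Florida. The function name is my own, and it keeps the assumption stated above that names are assigned independently, so two sisters can both be called Florida.

```python
from fractions import Fraction
from itertools import product


def p_two_girls_given_florida(t):
    """P(two girls | at least one girl named Florida), for name probability t > 0."""
    t = Fraction(t)
    # Per-child outcomes: B = boy, F = girl named Florida, N = girl with another name.
    child = {"B": Fraction(1, 2), "F": t / 2, "N": (1 - t) / 2}
    numerator = Fraction(0)
    denominator = Fraction(0)
    for first, second in product(child, repeat=2):
        p = child[first] * child[second]
        if "F" in (first, second):          # at least one girl named Florida
            denominator += p
            if "B" not in (first, second):  # and both children are girls
                numerator += p
    return numerator / denominator


for t in (Fraction(1), Fraction(1, 10), Fraction(1, 1000)):
    # Compare the enumeration with the closed form (2T - T^2) / (4T - T^2).
    closed_form = (2 * t - t * t) / (4 * t - t * t)
    print(t, p_two_girls_given_florida(t), closed_form)
```

Using exact fractions avoids any rounding questions: the enumeration and the closed form agree term for term, and the value moves from 1/3 at T = 1 toward 1/2 as T shrinks. T = 0 is excluded because the conditioning event then has probability zero.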