
CALIFORNIA INSTITUTE OF TECHNOLOGY

Ma 2b Introduction to Probability and Statistics

KC Border Winter 2013

Assignment 2: More interesting probability calculations


Due Tuesday, January 22 by noon at 253 Sloan

Additional Instructions: For each exercise please rate its difficulty (on a scale of your choosing, just explain it), and record how much time you spent on it. When asked for a probability or an expectation, give both a formula and an explanation for why you used that formula, and also give a numerical value when available.

Exercise 1 (The World Series) (50 pts)

The World Series is a tournament between the champions of the USA's National League and American League to decide the U.S. Major League Baseball champion. At present, it is won by the first team to win four games out of a possible seven. Since baseball games do not end in ties, at most seven games are ever played.¹

It is often said that baseball is a "game of inches." This means that small changes in the physical outcomes of a given play can lead to loss or victory. It also means that the outcome of a game between two teams is effectively random. Let us say that if the probability $p$ that Team A beats Team B is strictly greater than 1/2, then Team A is better than Team B. Note that it is possible (with probability $1 - p$) for the better team to lose a game.

Frederick Mosteller [1] estimated (based on data from 44 Series from the first half of the 20th century) that the probability that the better team wins any given World Series game is 0.65 and that the outcomes of the games are stochastically independent. Recently I redid his calculation for all 108 Series through 2012 and came up with 0.59. (You will have a chance to figure this out later in the course.)

1. Let $p$ be the probability that Team A wins any given game. The probability that Team B wins is thus $1 - p$. Assuming independence of successive games, write the general formula for the probability $w(p, n)$ that Team A wins a Series that requires $n$ wins (out of at most $2n - 1$ games), as a function of $p$. (Assume the probability of a tie game is zero.) (Hint: This is a sum of binomial probabilities.) Does it matter if the Series stops after one team wins $n$ games or if all $2n - 1$ games must be played?

¹ This is a lie. In the past (before night baseball) there were three World Serieses that had a tie game, when the game was called on account of darkness. There were also three Serieses that were a best-of-nine format.

2. Find the formula for the probability distribution of the length (in games) of the Series. (This is a little bit tricky: Remember Team A might win the Series or Team B might win.)
3. The current World Series is a best-4-out-of-7 series ($n = 4$). What is the probability that Team A wins the World Series if $p = 0.6$?
4. Suppose the probability $p$ that the better team wins is 0.6. How many games would be needed to be 90% sure that the Series winner was indeed the better team? To be 95% sure? (Hint: I think the simplest way to do this is just to make a table of $w(0.6, n)$ for different values of $n$.)

Exercise 2 (The missing women) (50 pts)

For the sake of argument let us assume that the probability of being born a boy $P(B)$ is the same as the probability of being born a girl $P(G)$, namely 1/2. Let us assume that the sexes of different children are stochastically independent, and that there are no multiple births or adoptions. In this case we would expect the population to be half male and half female. Or would we?

According to Nobel Prize winning economist Amartya Sen [2], due to differential mortality, in Europe and North America there are about 105 females for every 100 males. But in other countries the ratio is considerably lower. The number of females per 100 males is 94 in China, 93 in India, 92 in Pakistan, and 84 in Saudi Arabia (which has a large migrant male workforce). These latter countries are sometimes described as having "missing women."

A possible explanation might be that in some countries, parents prefer to have boys, so they may continue to have children until they have a boy or maybe two boys. Let's see if this works.

1. In a country with a one-child policy, what is the expected number of boys per family? What is the expected number of girls? What is the expected total number of children?
2. Suppose the parents choose to stop having children once they have a boy or get to four girls, whichever comes first. What is the expected number of boys per family? What is the expected number of girls? What is the expected total number of children?
3. Suppose the parents choose to stop having children once they have two boys or get to four girls, whichever comes first. What is the expected number of boys per family? What is the expected number of girls? What is the expected total number of children?
4. Suppose the parents choose to stop having children once they have a boy, no matter how many children that takes. What is the expected number of boys per family? What is the expected number of girls? What is the expected total number of children? (Hint: This involves an infinite series.)


5. In the last case what is the expected percentage of boys in a family's children? (Hint: Remember Jensen's Inequality.) (Hint: I had to look up the sum of the resulting series.)

Exercise 3 (Introduction to order statistics) (40 pts)

Let $X_1, \dots, X_n$ be independent and identically distributed random variables with common cumulative distribution function $F$.

1. Let $Y_{(n)} = \max\{X_1, \dots, X_n\}$. Express the cumulative distribution function of $Y_{(n)}$ in terms of $F$ and $n$.
2. Let $Y_{(1)} = \min\{X_1, \dots, X_n\}$. Express the cumulative distribution function of $Y_{(1)}$ in terms of $F$ and $n$. (Hint: It is easier to compute $\operatorname{Prob}\{Y_{(1)} > t\}$.)
3. What can you say about $Y_{(1)}$ and $Y_{(n)}$ in terms of stochastic dominance? (Refer to Lecture 5 notes.)

A random variable $X$ has an exponential distribution with parameter $\lambda$ if its cumulative distribution function is

$$F(t) = \begin{cases} 1 - e^{-\lambda t}, & t \ge 0, \\ 0, & t \le 0. \end{cases}$$

4. If $\lambda_1 < \lambda_2$, what can you say about exponential random variables with parameters $\lambda_1$ and $\lambda_2$ in terms of stochastic dominance?
5. Consider the minimum of $n$ independent exponential random variables with common parameter $\lambda$. What is its cdf? What is interesting about its cdf?
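Answers to all three exercises can be sanity-checked by simulation before they are written up. The following is a minimal Python sketch under the assumptions stated in the exercises: independent games won by Team A with probability p, independent births with each sex having probability 1/2, and independent exponential draws with rate parameter called `lam` here. The function names, trial counts, and seeds are illustrative choices, not part of the assignment, and the estimates are only approximate.

```python
import math
import random


def play_series(p, n, rng):
    """Simulate one series that ends when either team reaches n wins.

    Returns (team_a_won, games_played).
    """
    wins_a = wins_b = 0
    while wins_a < n and wins_b < n:
        if rng.random() < p:
            wins_a += 1
        else:
            wins_b += 1
    return wins_a == n, wins_a + wins_b


def estimate_series(p, n, trials=100_000, seed=0):
    """Estimate P(Team A wins) and the distribution of the series length."""
    rng = random.Random(seed)
    a_wins = 0
    length_counts = {}
    for _ in range(trials):
        won, games = play_series(p, n, rng)
        a_wins += won
        length_counts[games] = length_counts.get(games, 0) + 1
    lengths = {g: c / trials for g, c in sorted(length_counts.items())}
    return a_wins / trials, lengths


def estimate_family(boys_wanted, max_girls=None, trials=100_000, seed=0):
    """Estimate the mean number of boys and of girls per family.

    Parents stop at `boys_wanted` boys, or at `max_girls` girls when a
    cap is given (max_girls=None means no cap).
    """
    rng = random.Random(seed)
    total_boys = total_girls = 0
    for _ in range(trials):
        boys = girls = 0
        while boys < boys_wanted and (max_girls is None or girls < max_girls):
            if rng.random() < 0.5:
                boys += 1
            else:
                girls += 1
        total_boys += boys
        total_girls += girls
    return total_boys / trials, total_girls / trials


def estimate_min_max_cdf(n, lam, t, trials=100_000, seed=0):
    """Estimate P(min <= t) and P(max <= t) for n i.i.d. exponential draws."""
    rng = random.Random(seed)
    min_hits = max_hits = 0
    for _ in range(trials):
        # Inverse-transform sampling from F(t) = 1 - exp(-lam * t).
        xs = [-math.log(1.0 - rng.random()) / lam for _ in range(n)]
        min_hits += min(xs) <= t
        max_hits += max(xs) <= t
    return min_hits / trials, max_hits / trials
```

For instance, `estimate_series(0.6, 4)` returns an estimate of w(0.6, 4) together with the empirical distribution of the Series length, `estimate_family(1, 4)` corresponds to question 2 of Exercise 2, and `estimate_min_max_cdf(3, 2.0, 0.2)` gives empirical values to compare against the cdfs derived in Exercise 3. A simulation of this kind can only suggest that a formula is right or wrong; it does not replace the derivation.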

References
[1] F. Mosteller. 1952. The world series competition. Journal of the American Statistical Association 47:355–380.
[2] A. K. Sen. 1998. Mortality as an indicator of economic success and failure. Economic Journal 108:1–25.
