STATISTICS 30001 - CLASSES 15/21

TUTORING SESSION 5

Exercise 1

The probability that, after a first purchase, a customer of a flagship store makes a second one is
equal to 15%. Consider 250 randomly chosen customers among those that have already placed an
order at the store. What is the probability that more than 40 but at most 50 customers will make
a second purchase?

SOL Note that this solution procedure is SLIGHTLY different from the solution done in class. Here we
use point (3) of the AMAZING ULTIMATE RECAP. In class we used points (1) and (2) of the
AMAZING ULTIMATE RECAP. Of course the two solutions are equivalent, so choose your favourite
approach.
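Observe that the probability of the event we’re after is equivalent to

$$P\left(\frac{40}{n} < \hat{p}_n \le \frac{50}{n}\right)$$

and that

$$np(1 - p) = 250 \cdot 0.15 \cdot 0.85 = 31.875.$$

Hence, we can approximate the distribution of $\hat{p}_n$ by a normal distribution with mean $p$
and standard deviation $\sqrt{p(1 - p)/n} = 0.0226$. Consequently,

$$
\begin{aligned}
P\left(\frac{40}{n} < \hat{p}_n \le \frac{50}{n}\right)
&= P\left(\frac{0.16 - p}{\sqrt{p(1 - p)/n}} < \frac{\hat{p}_n - p}{\sqrt{p(1 - p)/n}} \le \frac{0.2 - p}{\sqrt{p(1 - p)/n}}\right) \\
&\approx P\left(\frac{0.01}{0.0226} < Z \le \frac{0.05}{0.0226}\right) \\
&= P(0.44 < Z \le 2.21) = 0.9864 - 0.67 = 0.3164
\end{aligned}
$$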
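
As a minimal sketch, the normal approximation for Exercise 1 can be checked numerically. The variable names are illustrative and not part of the original solution. The exact binomial sum is added only for comparison; since the hand calculation uses no continuity correction, the two values need not agree closely.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 250, 0.15   # sample size and probability of a second purchase
lo, hi = 40, 50    # event: more than 40 but at most 50 customers

# Normal approximation for the sample proportion
# (no continuity correction, as in the hand calculation).
se = sqrt(p * (1 - p) / n)
std_normal = NormalDist()
approx = std_normal.cdf((hi / n - p) / se) - std_normal.cdf((lo / n - p) / se)

# Exact binomial probability P(41 <= X <= 50), for comparison only.
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(lo + 1, hi + 1))

print(f"normal approximation: {approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```

The small gap between the printed approximation and 0.3164 comes from rounding the z-values to two decimals in the table lookup.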

Exercise 2

A survey on the music industry has shown that, in the last 5 years, the percentage of records with
retail price lower than 5 dollars has considerably increased.
1. In the population of records produced in the last 5 years, the average retail price is 7.2
dollars and the standard deviation is 1.1 dollars. Using the previous information,
approximately compute the percentage of records with retail price lower than 5 dollars.
2. How would you modify the answer to the previous question if you knew that the variable
Retail Price followed a symmetric and bell-shaped distribution?
3. And how would you change the answer if you knew that Retail Price was normally
distributed?
SOL
1. Knowing only the population average and standard deviation, and having no additional
information on the shape of the distribution, we can rely on Chebychev’s Inequality only.
According to Chebychev’s Inequality, an interval of the form (µ - kσ; µ + kσ) contains at least
(1 - 1/k²)·100% of the values of the population. Since 5 dollars is the lower endpoint of one
such interval, with k = 2 (µ - kσ = 7.2 - 2*1.1 = 5), we observe

(5; 9.4) = (7.2 - 2*1.1, 7.2 + 2*1.1) with k = 2.

At least (1 - 1/2²)·100% = 75% of the records produced in the last 5 years have a price
between 5 and 9.4 dollars. Consequently, at most 25% of the records have a price outside the
considered interval and, a fortiori, at most 25% have a price smaller than 5 dollars.
Without any additional piece of information, it is not possible to state how this 25% of
values is divided between the two remaining intervals, (0, 5) and above 9.4.

2. The advantage of Chebychev’s Inequality is its applicability to any population. However,
many large-sized populations are approximately symmetric and bell-shaped, with a large
proportion of values concentrated around the average. In this case, it is possible to
apply the Empirical Rule, which for the interval (5, 9.4), centered on the mean and with a
half-width of 2σ, suggests a percentage equal to approximately 95%. The remaining 5% lies
outside the interval and, since the distribution is symmetric, is split equally between the
two tails. We can therefore state that the percentage of records with a retail price smaller
than 5 dollars is (approximately) 2.5%.

3. If the distribution of Retail Price (X) is exactly normal, we can directly compute

$$P(X < 5) = P\left(Z < \frac{5 - 7.2}{1.1}\right) = P(Z < -2) = 1 - P(Z < 2) = 1 - 0.9772 = 0.0228 \rightarrow 2.28\%$$
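
The three answers to Exercise 2 can be put side by side with a short sketch. The variable names are illustrative, and the 95% figure in step 2 is the Empirical Rule's rounded value rather than an exact quantity.

```python
from statistics import NormalDist

mu, sigma = 7.2, 1.1   # population mean and standard deviation (dollars)
threshold = 5.0        # price of interest

# Number of standard deviations between the threshold and the mean.
k = (mu - threshold) / sigma

# 1. Chebychev: at most 1/k^2 of the values lie outside (mu - k*sigma, mu + k*sigma),
#    so at most that share lies below the threshold.
chebychev_bound = 1 / k**2

# 2. Empirical Rule: about 95% within 2 sigma; by symmetry half of the rest is in each tail.
empirical_tail = (1 - 0.95) / 2

# 3. Normal distribution: P(X < 5) straight from the normal CDF.
normal_tail = NormalDist(mu, sigma).cdf(threshold)

print(f"Chebychev upper bound: {chebychev_bound:.2%}")
print(f"Empirical Rule:        {empirical_tail:.2%}")
print(f"Normal distribution:   {normal_tail:.2%}")
```

Each extra assumption about the shape of the distribution tightens the answer: an upper bound of 25%, then roughly 2.5%, then about 2.28%.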

Exercise 3

The record company is evaluating whether to assign a bonus to those artists willing to produce a
record longer than 50 minutes. To evaluate the costs of this strategy, compute the probability that a
record lasts longer than 50 minutes. Assume that length of a record is normally distributed with
mean 42 (minutes) and standard deviation 8.

SOL Denoting by X the length of a record, we have

$$P(X > 50) = P\left(Z > \frac{50 - 42}{8}\right) = P(Z > 1) = 1 - P(Z < 1) = 1 - 0.8413 = 0.1587$$

The probability that a record will last longer than 50 minutes is 15.87%.
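
A minimal numerical check of this result, using the normal model stated in the exercise (the variable names are illustrative):

```python
from statistics import NormalDist

# Record length in minutes, assumed normal with mean 42 and standard deviation 8.
length = NormalDist(mu=42, sigma=8)

# P(X > 50) is the complement of the CDF at 50.
p_longer = 1 - length.cdf(50)

print(f"P(X > 50) = {p_longer:.4f}")
```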

Exercise 4
Carlo estimates that on a given evening out the probability that he spends less than 15 Euro is 17%.
Given a sample of 100 evenings out, what is the probability that Carlo spends less than 15 Euro in
at least 20 of them?

SOL In this case, np(1-p) = 100·0.17·0.83 = 14.11 > 5. Therefore, we can approximate the binomial
distribution by a normal distribution with mean np = 100·0.17 = 17 and variance np(1-p) = 14.11.
The required probability is then given by:
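
As a minimal sketch, the normal approximation above can be evaluated numerically for P(X ≥ 20), where X counts the evenings with spending below 15 Euro. Whether a continuity correction is intended is not stated, so both variants are shown as assumptions, alongside the exact binomial value for comparison. The variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.17                   # number of evenings and success probability
mean = n * p                       # np = 17
sd = sqrt(n * p * (1 - p))         # sqrt(14.11)
approx = NormalDist(mean, sd)

# Variant A (assumption): no continuity correction, P(X >= 20) ~ P(Z >= (20 - 17)/sd).
p_plain = 1 - approx.cdf(20)

# Variant B (assumption): continuity correction, P(X >= 20) ~ P(Z >= (19.5 - 17)/sd).
p_corrected = 1 - approx.cdf(19.5)

# Exact binomial tail P(X >= 20), for comparison only.
p_exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(20, n + 1))

print(f"no correction:   {p_plain:.4f}")
print(f"with correction: {p_corrected:.4f}")
print(f"exact binomial:  {p_exact:.4f}")
```
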

Exercise 5
The expected number of confetti in a confetti box is 997 confetti, with standard deviation equal to
150. A random sample of 50 boxes is drawn. Calculate the probability that the average number of
confetti contained in the 50 boxes is greater than 1000.

SOL Since n > 30, we apply the Central Limit Theorem and assume that the sample mean is approximately
normally distributed:

$$\bar{X} \approx N\left(997,\ \frac{150^2}{50}\right), \qquad \frac{\sigma}{\sqrt{n}} = \frac{150}{\sqrt{50}} = 21.213$$

Following this assumption, we can calculate the probability that the average number of confetti
contained in the 50 boxes is greater than 1000:

$$P(\bar{X} > 1000) = P\left(Z > \frac{1000 - 997}{21.213}\right) = P(Z > 0.1414) = 1 - P(Z < 0.1414) = 0.4438.$$
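
The same calculation can be sketched numerically, assuming the Central Limit Theorem approximation holds for n = 50 (the variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 997, 150, 50   # population mean, standard deviation, sample size

# Standard error of the sample mean under the CLT approximation.
se = sigma / sqrt(n)

# P(sample mean > 1000) under the approximating normal distribution.
p_greater = 1 - NormalDist(mu, se).cdf(1000)

print(f"standard error = {se:.3f}")
print(f"P(mean > 1000) = {p_greater:.4f}")
```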

Exercise 6
Two different machines (an old one and a new one) are used to fill bags with flour. There is a
suspicion that the new machine produces heavier bags. It is reasonable to assume that the
weight in grams of the bags filled by the two machines is described by two normal random
variables (call them X and Y) with unknown expected values and with known variances both equal
to 1200. Two samples of 20 and 25 bags are extracted at random from among the bags produced by
the old and the new machine, respectively.
1. What are the expected value and the variance of the difference between the two sample
means? How is such difference distributed?
2. Assuming that μX = μY, determine the value k such that P(X̄ − Ȳ < k) = 0.8.

SOL