
Statistics ~ Midterm Review

Short Answer
Scenario 1-12
We all know that the body temperature of a healthy person is 98.6°F. In reality, the actual body temperature
of individuals varies. Here are dotplots and summary statistics of the body temperatures for 90 healthy
individuals (45 males and 45 females).

Variable  Total Count    Mean  SE Mean  StDev  Minimum      Q1  Median      Q3  Maximum
Female             45  98.344    0.111  0.745   96.400  97.900  98.400  98.800  100.000
Males              45  98.122    0.110  0.735   96.300  97.600  98.100  98.600   99.500

1. Use Scenario 1-12. Determine if there are any outliers in each distribution. Show your work.
2. Use Scenario 1-12. Draw parallel boxplots of these two distributions. Be sure to label the plots and provide a
scale.
3. Use Scenario 1-12. Write a few sentences comparing the body temperatures of healthy males and females.
Scenario 1-13
The data below is the number of unprovoked attacks by alligators on people in Florida each year for a 33-year
period.
 6  12   2   4  17   4   6  10   3   9  13
 9  15  14   6  18   1   9   6   6  11  24
14  14   5  17  17   5  13  22  20   3   5

4. Use Scenario 1-13. (a) Construct a histogram for this distribution. Choose an appropriate bin width, and be
sure to provide a label and scale for each axis.
(b) Based on your histogram, what numerical measures of center and spread would be best to use for this
distribution? Explain your choice.
Scenario 2-7
The Dow Jones Industrial Average (The Dow) is an index measuring the stock performance of 30 large
American companies, and is often used as a measure of overall economic growth in the United States. Below
is Minitab output describing the daily percentage changes in the Dow for the first three months of 2009 and
the first three months of 2010. (Note that the market was open for 61 days during the first three months of
each year. A negative value indicates a percentage decrease in the index for that day).
Descriptive Statistics: Dow 2009, Dow 2010

Variable   N    Mean  StDev  Minimum      Q1  Median     Q3  Maximum
Dow 2009  61  -0.198  2.331   -4.600  -1.530  -0.310  1.150    6.820
Dow 2010  61   0.078  0.821   -2.640  -0.270   0.110  0.465    1.660

Both distributions are approximately Normally distributed.


5. Use Scenario 2-7. Consider a day when the Dow increased by 1%. In which year, 2009 or 2010, would such a
day be considered a better day for the stock market, relative to other days in that year? Provide appropriate
statistical calculations to support your answer.
6. Use Scenario 2-7. Based on these data, estimate the number of days that the Dow decreased by more than 1%
in these 61 days.
7. Use Scenario 2-7. Estimate the 19th percentile of daily change for the first three months of 2010.
8. Normal body temperature varies by time of day. A series of readings was taken of the body temperature of
a subject. The mean reading was found to be 36.5°C with a standard deviation of 0.3°C. If you wanted to
convert the temperatures to the Fahrenheit scale, what would the new mean and standard deviation be?
(Note: F = C(1.8) + 32).
Scenario 2-8
A local post office weighs outgoing mail and finds that the weights of first-class letters are approximately
Normally distributed with a mean of 0.69 ounces and a standard deviation of 0.16 ounces.
9. Use Scenario 2-8. What is the 60th percentile of first-class letter weights?
10. Use Scenario 2-8. First-class letters weighing more than 1 ounce require additional postage. What proportion
of first-class letters at this post office require additional postage?
Scenario 3-8
A certain psychologist counsels people who are getting divorced. A random sample of ten of her patients
provided the data in the following scatterplot, where x = number of years of courtship before marriage, and y
= number of years of marriage before divorce.

Below is the computer output for the regression of length of marriage versus length of courtship.

11. Use Scenario 3-8. Describe what the scatterplot reveals about the relationship between length of courtship and
length of marriage.
12. Use Scenario 3-8. Suppose a new point at (4.5, 8), that is, years of courtship = 4.5 and years of marriage = 8,
were added to the plot. What effect, if any, will this new point have on the correlation between courtship
duration and marriage duration? Explain.
13. Use Scenario 3-8. What is the slope of the regression line? Interpret the slope in the context of this problem.
14. Use Scenario 3-8. Explain what the quantity S = 2.74982 measures in the context of this problem.
Scenario 3-9
One weekend, a statistician notices that some of the cars in his neighborhood are very clean and others are
quite dirty. He decides to explore this phenomenon, and asks 15 of his neighbors how many times they wash
their cars each year and how much they paid in car repair costs last year. His results are in the table below:
                      x = number of car washes per year    y = repair costs for last year
Mean                  6.4                                  $955.30
Standard deviation    3.78                                 $323.50

The correlation for these two variables is r = -0.71.


15. Use Scenario 3-9. Find the equation of the least-squares regression line (with y as the response variable).
16. Use Scenario 3-9. What percentage of the variation in repair costs can be explained by the number of times
per year a car is washed?
17. Use Scenario 3-9. Based on these data, can we conclude that washing your car frequently will reduce repair
costs? Explain.
Scenario 4-6
Read the following brief article about aspirin and alcohol.
Aspirin may enhance impairment by alcohol
Aspirin, a longtime antidote for the side effects of drinking, may actually enhance alcohol's effect,
researchers at the Bronx Veterans Affairs Medical Center say. In a report on a study published in the Journal
of the American Medical Association, the researchers said they found that aspirin significantly lowered the
body's ability to break down alcohol in the stomach. As a result, five volunteers who had a standard breakfast
and two extra-strength aspirin tablets an hour before drinking had blood alcohol levels 30 percent higher than
each had when they drank alcohol alone. Each volunteer consumed the equivalent of a glass and a half of
wine.
"That 30 percent could make the difference between sobriety and impairment," said Dr. Charles S. Lieber,
medical director of the Alcohol Research and Treatment Center at the Bronx center, who was co-author of the
report with Dr. Risto Roine.
18. Use Scenario 4-6. Explain why this is an experiment and not an observational study.
19. Use Scenario 4-6. Identify the explanatory and response variables.
20. Use Scenario 4-6. Identify the experimental design used in this study. Justify your answer.

21. Use Scenario 4-6. In the second sentence above is the phrase, "researchers said they found that aspirin
significantly lowered the body's ability to break down alcohol." What is the statistical meaning of the word
"significantly" in the context of this study?
22. Use Scenario 4-6. This was a controlled experiment. Describe how it was controlled and explain the purpose
of doing so.
Scenario 4-7
High blood pressure adds to the workload of the heart and arteries and may increase the risk of heart attacks.
If not treated, this condition can also lead to heart failure, kidney failure, or stroke. We wish to test the
effectiveness of Angiotensin-converting enzyme (ACE) inhibitors as a treatment for high blood pressure.
23. Use Scenario 4-7. It is well known that men and women may react differently to common cardiovascular drug
treatments. What sort of experimental design would you choose for this study, and why?
24. Use Scenario 4-7. Explain why an experiment involving 600 men and 500 women is preferable to one
involving 60 men and 50 women.
25. Use Scenario 4-7. Assume that 600 men and 500 women suffering from high blood pressure are available for
the study. Describe a design for this experiment. Be sure to include a description of how you assign
individuals to the treatment groups.
26. Bias is present in each of the following sampling designs. In each case, identify the type of bias involved and
state whether you think the sample result obtained is lower or higher than the actual value for the population.
(a) A political pollster seeks information about the proportion of American adults who oppose gun controls.
He asks an SRS of 1000 American adults: "Do you agree or disagree with the following statement:
Americans should preserve their constitutional right to keep and bear arms." A total of 910, or 91%, said,
"Agree" (that is, 910 out of the 1000 oppose gun controls).
(b) A flour company in Minneapolis wants to know what percent of local households bake at least twice a
week. A company representative calls 500 randomly-selected households during the daytime and finds that
50% of those who responded bake at least twice a week.
Scenario 5-10
An airline estimates that the probability that a random call to their reservation phone line results in a
reservation being made is 0.31. This can be expressed as P(call results in reservation) = 0.31. Assume each
call is independent of other calls.
27. Use Scenario 5-10. Describe what the Law of Large Numbers says in the context of this probability.
28. Use Scenario 5-10. What is the probability that none of the next four calls results in a reservation?
29. Use Scenario 5-10. You want to estimate the probability that exactly one of the next four calls results in a
reservation being made. Describe the design of a simulation to estimate this probability. Explain clearly how
you will use the partial table of random digits below to carry out your simulation.
30. Use Scenario 5-10. Carry out 5 trials of your simulation. Mark on or above each line of the table so that
someone can clearly follow your method.
188   87370  88099  89695  87633  76987  85503  26257  51736
189   88296  95670  74932  65317  93848  43988  47597  83044
190   79485  92200  99401  54473  34336  82786  05457  60343
191   40830  24979  23333  37619  56227  95941  59494  86539
192   32006  76302  81221  00693  95197  75044  46596  11628

31. A grocery store examines its shoppers' product selection and calculates the following: The probability that a
randomly-chosen shopper buys apples is 0.21, that the shopper buys potato chips is 0.36, and that the shopper
buys both apples and potato chips is 0.09.
(a) Let A = "Randomly-chosen shopper buys apples," and C = "Randomly-chosen shopper buys potato chips."
Sketch a Venn diagram or two-way table that summarizes the probabilities above.
(b) Find each of the following:
i. The probability that a randomly-selected shopper buys apples or potato chips.
ii. The probability that a randomly-selected shopper buys potato chips or doesn't buy apples.
iii. The probability that a randomly-selected shopper doesn't buy apples and doesn't buy potato chips.
32. Wile E. Coyote is pursuing the Road Runner across Great Britain toward Scotland. The Road Runner chooses
his route randomly, such that there is a probability of 0.8 that he'll take the high road and 0.2 that he'll take
the low road. If he takes the high road, the probability that Wile E. catches him is 0.01. If he takes the low
road, the probability he gets caught is 0.05. Find the probability that he took the high road, given that he was
caught.
Scenario 5-11
The table below gives the distribution of students at a certain high school for two categorical variables, grade
year and the student's answer to the question, "Do you eat regularly in the school cafeteria?"

                       Grade
Eat in cafeteria?    9th   10th   11th   12th   Totals
YES                  130    175    122     68      495
NO                    18     34     88    170      310
Totals               148    209    210    238      805

33. Use Scenario 5-11. If you choose a student at random, what is the probability he or she eats regularly in the
cafeteria?
34. Use Scenario 5-11. If you choose a student at random, what is the probability he or she eats regularly in the
cafeteria, given that he or she is in 10th grade?
35. Use Scenario 5-11. If you choose a student at random, are the events "10th grade" and "eats regularly in the
cafeteria" independent? Explain how you know.
36. Your statistics class has 26 students in it: 14 girls and 12 boys. Your teacher uses a calculator to select two
students at random to solve a problem on the board.
(a) Draw a tree diagram representing the data.
(b) What is the probability that both students are girls?

(c) Given that the second student chosen is a girl, what is the probability that the first student was also a girl?

Statistics ~ Midterm Review


Answer Section
SHORT ANSWER
1. ANS:
For females: Q1 = 97.9 and Q3 = 98.8; 1.5 x IQR = 1.35.
Q1 - 1.35 = 96.55, so 96.4 is an outlier; Q3 + 1.35 = 100.15, so no high outliers. For males: Q1 = 97.6 and
Q3 = 98.6; 1.5 x IQR = 1.5. Q1 - 1.5 = 96.1, so no low outliers; Q3 + 1.5 = 100.1, so no high outliers.
REF: Test 1A
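The 1.5 x IQR check in answer 1 can be sketched in a few lines of Python. The function name outlier_fences and the dictionary layout are illustrative choices, not part of the original solution. Only the minimum and maximum are tested, since the summary table gives no other individual values.

```python
# 1.5 x IQR rule applied to the summary statistics from Scenario 1-12.
def outlier_fences(q1, q3):
    """Return the (lower, upper) fences for the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

summaries = {
    "Female": {"min": 96.4, "q1": 97.9, "q3": 98.8, "max": 100.0},
    "Males": {"min": 96.3, "q1": 97.6, "q3": 98.6, "max": 99.5},
}

flags = {}
for group, s in summaries.items():
    low, high = outlier_fences(s["q1"], s["q3"])
    # If the minimum (maximum) is inside the fence, no value can be outside it.
    flags[group] = {"low_outlier": s["min"] < low, "high_outlier": s["max"] > high}
    print(f"{group}: fences = ({low:.2f}, {high:.2f}), {flags[group]}")
```

A minimum below the lower fence shows that at least one low outlier exists; it does not say how many values fall below the fence.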
2. ANS:
See boxplots in solution for #12, Test 1A on TPS4 website or in TRB.
REF: Test 1A
3. ANS:
Both distributions are roughly symmetric and have similar variability, since the IQRs are 0.9 and 1.0 degrees
Fahrenheit. Females typically have slightly higher body temperatures: the median for females is 98.4°F and
the median for males is 98.1°F.
REF: Test 1A
4. ANS:
(a) Student histograms will vary with chosen bin width.
(b) Since the data is skewed to the right, it would be best to use median and interquartile range as measures of
center and spread, respectively.
REF: Test 1A
5. ANS:
In 2009, the z-score for 1% was z = (1 - (-0.198)) / 2.331 ≈ 0.51. In 2010, the z-score for 1% was
z = (1 - 0.078) / 0.821 ≈ 1.12. This means that 1% had a higher relative standing in 2010 than in 2009. (Percentiles
for a 1% increase were 69.6% in 2009 and 86.9% in 2010.)
REF: Test 2A
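The comparison in answer 5 rests on standardizing the same 1% change against each year's mean and standard deviation. Below is a minimal sketch using Python's standard-library NormalDist. The helper z_score and the variable names are illustrative, and the Normal model is the one the scenario itself assumes.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Standardized value: how many SDs x lies above the mean."""
    return (x - mean) / sd

# (mean, standard deviation) of daily % change, from the Minitab output.
dow = {"2009": (-0.198, 2.331), "2010": (0.078, 0.821)}

results = {}
for year, (mean, sd) in dow.items():
    z = z_score(1.0, mean, sd)
    percentile = NormalDist().cdf(z)  # area to the left of z
    results[year] = (z, percentile)
    print(f"{year}: z = {z:.3f}, percentile = {percentile:.1%}")
```

The larger z-score in 2010 reflects that year's much smaller standard deviation, not a larger raw change.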
6. ANS:
Using the 2009 values, z = (-1 - (-0.198)) / 2.331 ≈ -0.34; the percentile for z = -0.34 is 0.3669. 61 days x 0.3669 = 22.38, or about 22 days.
REF: Test 2A
7. ANS:
z for 19th percentile (from Table A) is -0.88. So -0.88(0.821) + 0.078 = -0.644, or about a 0.64% decrease.
REF: Test 2A
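Answers 6 and 7 can be checked without Table A. The sketch below assumes question 6 refers to 2009, the year whose mean and standard deviation give z = -0.34. The variable names are illustrative. Software uses unrounded z-values, so results differ slightly from the table-based answers.

```python
from statistics import NormalDist

# Normal models built from the Minitab summaries in Scenario 2-7.
dow_2009 = NormalDist(mu=-0.198, sigma=2.331)
dow_2010 = NormalDist(mu=0.078, sigma=0.821)

# Question 6 (assumed to refer to 2009): expected number of the 61 days
# on which the Dow fell by more than 1%.
p_drop = dow_2009.cdf(-1.0)
expected_days = 61 * p_drop

# Question 7: 19th percentile of daily change in 2010.
p19_2010 = dow_2010.inv_cdf(0.19)

print(f"P(change < -1%) in 2009 = {p_drop:.4f}, about {expected_days:.0f} days")
print(f"19th percentile in 2010 = {p19_2010:.3f}%")
```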
8. ANS:
Mean = 36.5(1.8) + 32 = 97.7°F. Standard deviation = 0.3(1.8) = 0.54°F.

REF: Test 2A
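Answer 8 uses the fact that adding a constant shifts the mean but not the standard deviation, while multiplying rescales both. The sketch below illustrates this with a handful of hypothetical Celsius readings (invented for the demonstration, not taken from the problem) and then applies the same rule to the summary values in question 8.

```python
from statistics import mean, stdev

# Hypothetical Celsius readings, invented only to illustrate the rule.
celsius = [36.1, 36.4, 36.5, 36.7, 36.8]
fahrenheit = [c * 1.8 + 32 for c in celsius]

mean_c, sd_c = mean(celsius), stdev(celsius)
mean_f, sd_f = mean(fahrenheit), stdev(fahrenheit)

# The mean follows the full transformation; the SD is only multiplied by 1.8.
mean_matches = abs(mean_f - (mean_c * 1.8 + 32)) < 1e-9
sd_matches = abs(sd_f - sd_c * 1.8) < 1e-9
print(mean_matches, sd_matches)

# Same rule applied to the summary values given in question 8.
new_mean = 36.5 * 1.8 + 32  # about 97.7 degrees F
new_sd = 0.3 * 1.8          # about 0.54 degrees F
print(round(new_mean, 1), round(new_sd, 2))
```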
9. ANS:
z for 60th percentile is 0.25, and x = 0.69 + 0.25(0.16) = 0.73 ounces.

REF: Test 2A
10. ANS:
z = (1 - 0.69) / 0.16 ≈ 1.94, which has a proportion of 1 - 0.9738 = 0.0262 of letters above it requiring additional
postage.
REF: Test 2A
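Answers 9 and 10 are a percentile lookup and a tail area for the same Normal model. A minimal sketch with the standard-library NormalDist follows. The variable names are illustrative, and the tail area comes out near 0.0263 rather than 0.0262 because the z-value is not rounded to 1.94.

```python
from statistics import NormalDist

# Letter weights in ounces, Normal model from Scenario 2-8.
weights = NormalDist(mu=0.69, sigma=0.16)

# Question 9: 60th percentile of letter weight.
p60 = weights.inv_cdf(0.60)

# Question 10: proportion of letters heavier than 1 ounce.
prop_over_one_ounce = 1 - weights.cdf(1.0)

print(f"60th percentile = {p60:.2f} oz")
print(f"P(weight > 1 oz) = {prop_over_one_ounce:.4f}")
```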
11. ANS:
There appears to be a moderately strong, positive, linear relationship between length of courtship and length
of marriage.
REF: Test 3A
12. ANS:
The correlation would decrease, since this point is well outside the linear pattern in the other points, so it
weakens the linear association.
REF: Test 3A
13. ANS:
Slope = 2.4559. For each 1-year increase in length of courtship, predicted length of marriage increases by
2.4559 years.
REF: Test 3A
14. ANS:
The observed values for length of marriage for these ten couples were, on average, 2.7498 years away from the
marriage length values predicted by the regression line.
REF: Test 3A
15. ANS:

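Slope = r(s_y / s_x) = -0.71(323.50 / 3.78) ≈ -60.76

Y-intercept = y-bar - (slope)(x-bar) = 955.30 - (-60.76)(6.4) ≈ 1344.18

So predicted repair cost = 1344.18 - 60.76(number of car washes per year).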

REF: Test 3A
16. ANS:
r^2 = (-0.71)^2 = 0.5041, or 50.41%.
REF: Test 3A
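Questions 15 and 16 need only the summary statistics in Scenario 3-9: the slope is r times the ratio of standard deviations, the line passes through the point of means, and r squared gives the fraction of variation explained. A minimal sketch, with illustrative variable names:

```python
# Summary statistics from Scenario 3-9.
x_bar, s_x = 6.4, 3.78        # car washes per year
y_bar, s_y = 955.30, 323.50   # repair costs in dollars
r = -0.71

# Least-squares slope and intercept from summary statistics.
slope = r * s_y / s_x
intercept = y_bar - slope * x_bar

# Fraction of variation in repair costs explained by the regression.
r_squared = r ** 2

print(f"predicted cost = {intercept:.2f} + ({slope:.2f}) * washes")
print(f"r^2 = {r_squared:.4f}")
```

Rounding the slope before computing the intercept can shift the last digit of the intercept, so hand-calculated answers may differ slightly.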
17. ANS:
No. Since this was not a controlled experiment, there could be lurking variables that are responsible for the
association observed here. Perhaps the frequency with which drivers wash their cars is confounded with
other good car-maintenance habits, such as changing the car's oil frequently.

REF: Test 3A
18. ANS:
This is an experiment because treatments (aspirin and alcohol; alcohol only) are imposed on the subjects.
REF: Test 4A
19. ANS:
Explanatory variable: Aspirin consumption; Response: blood alcohol content.
REF: Test 4A
20. ANS:
Matched pairs experiment. Each subject was given both treatments (aspirin before drinking and no aspirin)
and thus acted as his own pair.
REF: Test 4A
21. ANS:
"Significantly" means that the difference found in the subjects' blood alcohol level between the two treatments
was large enough that it was unlikely to have arisen from chance variation.
REF: Test 4A
22. ANS:
Each subject acted as his or her own control, drinking alcohol alone and alcohol with aspirin. Comparing
each subject's blood alcohol under both treatments allowed the researchers to isolate the impact of aspirin
from any other variables.
REF: Test 4A
23. ANS:
A randomized block design, blocking by gender, will reduce the impact that differences between the
responses of men and women to the treatment might have on variability arising from random assignment.
REF: Test 4A
24. ANS:
A larger number of subjects (greater replication) decreases the impact of random variation on experimental
results, thereby increasing our ability to distinguish the effects of the treatment.
REF: Test 4A
25. ANS:
First create two blocks, one made up of the 600 men and one of the 500 women. Then assign the men
numbers from 001 to 600 and the women numbers from 001 to 500. Choose 3-digit numbers from the
random number table, ignoring repeats and unassigned numbers, until you have selected 300 men. Then
begin elsewhere in the table and follow the same procedure to randomly select 250 women. These subjects
will be treated with ACE, and the remaining subjects will receive a placebo. Compare changes in blood
pressure between the ACE group and the control group.
REF: Test 4A
26. ANS:
(a) Wording of question bias: it's possible that using the term "constitutional right" generates a positive
response more often than a question that does not mention the constitution, so that 91% is an overestimate of
the proportion who oppose gun controls. (b) Non-response bias: people without jobs outside the home are more
likely to be home during the daytime, and probably have more time to bake. This would make 50% an overestimate.

REF: Test 4A
27. ANS:
As the number of calls becomes larger and larger, the proportion of calls resulting in a reservation will get
closer and closer to 0.31.
REF: Test 5B
28. ANS:
P(none of four calls results in a reservation) = (1 - 0.31)^4 = (0.69)^4 ≈ 0.2267.
REF: Test 5B
29. ANS:
Assign the digits 01 through 31 to calls that result in a reservation being made and 32 through 00 to other
calls. Choose four two-digit numbers from the random digits table and determine how many of the four calls
result in reservations being made. Do this many times. The proportion of those trials in which exactly one call
results in a reservation is the probability estimate.
REF: Test 5B
30. ANS:
Let R = reservation made and O = other call. Starting on line 188, the first five trials are: OORO = 1 call;
OOOO = no calls; OOOO = no calls; OORR = 2 calls; RORO = 2 calls. Probability estimate is 1/5 = 0.20.
REF: Test 5B
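The five trials in answer 30 can be reproduced mechanically from line 188 of the random digit table. The sketch below follows the digit assignment in answer 29 (01 through 31 for a reservation, 32 through 00 for any other call). The function names are illustrative, and the exact binomial value is included only for comparison.

```python
# Line 188 of the random digit table from Scenario 5-10.
LINE_188 = "87370 88099 89695 87633 76987 85503 26257 51736"

def label_call(pair):
    """01-31 -> reservation (R); 32-99 and 00 -> other call (O)."""
    return "R" if 1 <= int(pair) <= 31 else "O"

def run_trials(digits, n_trials, calls_per_trial=4):
    """Read consecutive two-digit pairs and group them into trials."""
    digits = digits.replace(" ", "")
    pairs = [digits[i:i + 2] for i in range(0, len(digits) - 1, 2)]
    trials = []
    for t in range(n_trials):
        chunk = pairs[t * calls_per_trial:(t + 1) * calls_per_trial]
        trials.append("".join(label_call(p) for p in chunk))
    return trials

trials = run_trials(LINE_188, 5)
estimate = sum(t.count("R") == 1 for t in trials) / len(trials)

# Exact binomial probability of exactly one reservation in four calls.
exact = 4 * 0.31 * 0.69 ** 3

print(trials, estimate, round(exact, 4))
```

Five trials give only a rough estimate. The exact value is about 0.41, so many more trials would be needed for the simulated proportion to settle near it.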
31. ANS:
(a) See Venn diagram and two-way table in solutions for #12, Test 5B on TPS4 website or in TRB.
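(b) i.) P(A or C) = 0.21 + 0.36 - 0.09 = 0.48.
ii.) P(C or not A) = 1 - P(A and not C) = 1 - (0.21 - 0.09) = 0.88.
iii.) P(not A and not C) = 1 - P(A or C) = 1 - 0.48 = 0.52.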
REF: Test 5B
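The three probabilities in question 31(b) follow from the general addition rule and the complement rule. A minimal sketch, with illustrative variable names:

```python
# Probabilities given in question 31.
p_a, p_c, p_both = 0.21, 0.36, 0.09

# i. General addition rule: P(A or C).
p_a_or_c = p_a + p_c - p_both

# ii. "Chips or not apples" excludes only shoppers who buy apples but not chips.
p_a_only = p_a - p_both
p_c_or_not_a = 1 - p_a_only

# iii. "Neither" is the complement of "A or C".
p_neither = 1 - p_a_or_c

print(round(p_a_or_c, 2), round(p_c_or_not_a, 2), round(p_neither, 2))
```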
32. ANS:
P(high road | caught) = (0.8)(0.01) / [(0.8)(0.01) + (0.2)(0.05)] = 0.008 / 0.018 ≈ 0.444.
REF: Test 5B
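Question 32 is a direct application of Bayes' rule: weight each route's prior probability by the chance of capture on that route, then normalize. The helper posterior below is an illustrative name, not something from the original solution.

```python
def posterior(prior, likelihood):
    """Bayes' rule over named outcomes: returns P(outcome | evidence)."""
    joint = {k: prior[k] * likelihood[k] for k in prior}
    total = sum(joint.values())  # P(evidence)
    return {k: v / total for k, v in joint.items()}

prior = {"high": 0.8, "low": 0.2}        # P(route)
p_caught = {"high": 0.01, "low": 0.05}   # P(caught | route)

result = posterior(prior, p_caught)
print({route: round(p, 3) for route, p in result.items()})
```

Although the high road is four times as likely to be chosen, capture is five times as likely on the low road, so the posterior probability for the high road falls below one half.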
33. ANS:
P(eats regularly in cafeteria) = 495/805 ≈ 0.615.
REF: Test 5C
34. ANS:
P(eats regularly in cafeteria | 10th grade) = 175/209 ≈ 0.837.
REF: Test 5C
35. ANS:
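No: P(eats regularly in cafeteria | 10th grade) = 175/209 ≈ 0.837, which is not equal to P(eats regularly in cafeteria) = 495/805 ≈ 0.615.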

REF: Test 5C
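Questions 33 through 35 compare a marginal probability with a conditional one from the two-way table in Scenario 5-11. The sketch below stores the counts in a nested dictionary (an illustrative layout) and checks independence by comparing the two probabilities.

```python
# Counts from the two-way table in Scenario 5-11.
counts = {
    "9th": {"YES": 130, "NO": 18},
    "10th": {"YES": 175, "NO": 34},
    "11th": {"YES": 122, "NO": 88},
    "12th": {"YES": 68, "NO": 170},
}

total = sum(sum(row.values()) for row in counts.values())
total_yes = sum(row["YES"] for row in counts.values())

# Question 33: marginal probability of eating regularly in the cafeteria.
p_yes = total_yes / total

# Question 34: conditional probability given 10th grade.
tenth = counts["10th"]
p_yes_given_10th = tenth["YES"] / (tenth["YES"] + tenth["NO"])

# Question 35: the events are independent only if these two agree.
independent = abs(p_yes - p_yes_given_10th) < 1e-9

print(round(p_yes, 3), round(p_yes_given_10th, 3), independent)
```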
36. ANS:
(a) Diagram to be completed in class.

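(b) P(both girls) = (14/26)(13/25) = 182/650 = 0.28.

(c) P(first is a girl | second is a girl) = P(both girls) / P(second is a girl) = (182/650) / (350/650) = 0.52.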
REF: Test 5C
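The tree diagram in question 36 can be worked through with exact fractions, multiplying along branches and adding across the branches that end in "second student is a girl". The variable names are illustrative.

```python
from fractions import Fraction

girls, boys = 14, 12
total = girls + boys

# Branch probabilities for selecting two students without replacement.
p_gg = Fraction(girls, total) * Fraction(girls - 1, total - 1)  # girl, then girl
p_bg = Fraction(boys, total) * Fraction(girls, total - 1)       # boy, then girl

# (b) Both students are girls.
p_both_girls = p_gg

# (c) P(first is a girl | second is a girl): add the branches ending in
# "girl second", then divide the girl-girl branch by that sum.
p_second_girl = p_gg + p_bg
p_first_given_second = p_gg / p_second_girl

print(p_both_girls, float(p_both_girls))
print(p_first_given_second, float(p_first_given_second))
```

The conditional probability 13/25 equals the chance that the second student is a girl given that the first was, which reflects the symmetry of drawing without replacement.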
