
Introduction

Chapter 1

Check Your Understanding, Page 5:

1. The cars in the student parking lot.


2. He measured the car's model (categorical), year (quantitative), color (categorical), number of cylinders
(quantitative), gas mileage (quantitative), whether it has a navigation system (categorical), and weight
(quantitative).

Exercises, page 7:

1.1 Type of wood, type of water repellent and paint color are categorical. Paint thickness and weathering
time are quantitative.
1.2 Gender, Race, and Smoker status are categorical. Age, systolic blood pressure and level of calcium
are quantitative.
1.3 (a) The individuals are the AP Statistics students who completed a questionnaire on the first day of
class. (b) The categorical variables are gender (female or male), handedness (right or left), and favorite
type of music (classical, gospel, rock, rap, country, R&B, top 40, oldies, etc.). The quantitative variables
are height (in inches), amount of time the student is expecting to spend on homework (in minutes per
week), and the total value of coins in a student's pocket (in cents). (c) The highlighted individual is a
female who is right-handed. She is 58 inches tall, spends 60 minutes on homework, prefers Alternative
music and has 76 cents in her pocket.
1.4 (a) The individuals are roller coasters opened in 2009. (b) The categorical variables are Roller
coaster (the name of the coaster), type (steel or wood) and the design (sit down, flying). The quantitative
variables are height (in feet), speed (in mph) and duration (in seconds). (c) The highlighted roller coaster
is the Prowler, a wood, sit-down type coaster. Its height is 102.3 feet, its speed is 51.2 mph and the
duration of the ride is 150 seconds.
1.5 Student answers will vary; for comparison, recent U.S. News rankings have used measures such as
academic reputation (measured by surveying college and university administrators), retention rate,
graduation rate, class sizes, faculty salaries, student-faculty ratio, percentage of faculty with highest
degree in their fields, quality of entering students (ACT/SAT scores, high school class rank,
enrollment-to-admission ratio), financial resources, and the percentage of
alumni who give to the school. Examples of categorical variables would include region of the country
and type of institution (2-year college, 4-year college, university).
1.6 Student answers will vary. One possible answer is as follows. Reality shows (yes or no indicating
whether the student watches reality shows), Music (yes or no indicating whether the student watches
music videos, concerts, or documentaries about musicians and singers), Time (the average amount of
time, in minutes per day, spent watching television), Network (the average number of network
programs (shows, movies, sporting events, etc.) watched per week on ABC, CBS, NBC, and FOX).
The categorical variables are Reality shows and Music, and the quantitative variables are Time and
Network.
1.7 b
1.8 c

Chapter 1: Exploring Data

Section 1.1
Check Your Understanding, page 14:

1. There are 2367 females, 2459 males and 4826 total in the data set. This means that
$\frac{2367}{4826} \times 100\% = 49\%$ are female and 51% are male.
2. A bar chart is given below. The percents of males and females are very similar.

Check Your Understanding, page 17:

1. A 50-50 chance: Female = $\frac{696}{1416} \times 100\% = 49.15\%$ and Male =
$\frac{720}{1416} \times 100\% = 50.85\%$. The rest are shown in the table for number 2.
2.
                               Female          Male            All
                               Count   Row %   Count   Row %   Count   Row %
A 50-50 chance                   696   49.15     720   50.85    1416  100.00
A good chance                    663   46.66     758   53.34    1421  100.00
Almost certain                   486   44.88     597   55.12    1083  100.00
Almost no chance                  96   49.48      98   50.52     194  100.00
Some chance but probably not     426   59.83     286   40.17     712  100.00
All                             2367   49.05    2459   50.95    4826  100.00

Cell contents: count and percent of row.
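As a rough check on the row percents for number 2, the sketch below recomputes them from the female and male counts listed for each opinion. The dictionary layout and the helper name `row_percents` are illustrative choices, not part of the original solution.

```python
# Counts (female, male) for each opinion, copied from the answer to number 2.
counts = {
    "A 50-50 chance": (696, 720),
    "A good chance": (663, 758),
    "Almost certain": (486, 597),
    "Almost no chance": (96, 98),
    "Some chance but probably not": (426, 286),
}


def row_percents(female, male):
    """Return (female %, male %) of the row total, rounded to 2 decimals."""
    total = female + male
    return round(100 * female / total, 2), round(100 * male / total, 2)


# Percent of each row (opinion) that is female and male.
table = {opinion: row_percents(f, m) for opinion, (f, m) in counts.items()}

for opinion, (pct_female, pct_male) in table.items():
    print(f"{opinion:30s} {pct_female:6.2f} {pct_male:6.2f}")
```

Each row's two percents sum to 100, which is what distinguishes a conditional (row) distribution from the marginal distribution in the "All" row.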

The Practice of Statistics for AP*, 4/e

Exercises, page 22:

1.9 (a) The percent of cars with other colors is 100 − 20 − 17 − 17 − 13 − 12 − 11 − 5 − 3 − 2 = 0%.
Apparently this list of colors constitutes nearly all possible car colors. (b) A bar graph is given below.

(c) It would be appropriate to make a pie chart of these data because we have all possible colors listed
here.
1.10 (a) The percent of spam that falls in the other category is 100 − 19 − 20 − 7 − 7 − 6 − 25 − 9 = 7%.
(b) A bar graph is given below.

(c) If you include the category of other, then it would be appropriate to make a pie chart, as all possible
types of spam would be included. Without the category of other, it would not be appropriate to make a
pie chart.
1.11 (a) A pie chart would also be appropriate since all days are accounted for in the dataset. A bar
graph is given below.


(b) Perhaps induced or C-section births are often scheduled for weekdays.
1.12 (a) A bar graph is given below.

(b) You need to know the total number of deaths among people aged 15-24 in the US in 2005 so that you
can figure out how many deaths would be categorized as other.
1.13 Estimates will vary, but should be close to the actual reported numbers (which can be found at the
Census Bureau Web site): 64% Mexican, 9% Puerto Rican.
1.14 Estimates will vary but should be close to: 20% Business, 12% Social Science.
1.15 (a) A pie chart could not be used for these data because the given percents represent fractions of
different age groups, rather than parts of a single whole. (b) A bar graph is given below.


1.16 (a) Movie attendance seems to drop off as people get older. A bar graph is given below.

(b) A pie chart could not be used for these data because the given percents represent fractions of different
age groups, rather than parts of a single whole. (c) We do not know how many people are in each age
group.
1.17 (a) The pictures should be proportional to the number of students they represent. As drawn, it
appears that most of the students arrived by car but in reality most came by bus (14 took the bus, 9 came
in cars).
(b) A bar graph is given below.

1.18 (a) The vertical scale does not begin at 0 which over-emphasizes the difference in cholesterol.
(b) The drop in cholesterol is much smaller than it appeared in the original graph. A bar graph is given
below.


1.19 (a) This table describes 133 people, of whom 36 were buyers of coffee filters made of recycled
paper. (b) To find the marginal distribution of opinion we need to know the total numbers of people with
each opinion: $\frac{49}{133} \times 100\% = 36.84\%$ said higher, $\frac{32}{133} \times 100\% = 24.06\%$ said the same, and
$\frac{52}{133} \times 100\% = 39.10\%$ said lower. 36.84 + 24.06 = 60.9% of consumers think the quality of the
recycled product is the same or higher than the quality of other filters.
1.20 (a) The sum of the six counts is 5375 students. The proportion of these students who smoke is
$\frac{1004}{5375} = 0.1868$, so the percent of smokers is 18.68%. (b) The marginal distribution of parents' smoking
behavior is shown below.
                        Count   Percent
Neither parent smokes   1356    25.23%
One parent smokes       2239    41.66%
Both parents smoke      1780    33.12%

1.21 There were 36 buyers and 97 nonbuyers among the respondents, so (for example)
$\frac{20}{36} \times 100\% = 55.56\%$ of buyers rated the quality as higher. Similar arithmetic with the buyers and
nonbuyers rows gives the two conditional distributions of opinion, shown in the table below. We see that
buyers are much more likely to consider recycled filters higher in quality, though 25% still think they are
lower in quality. We cannot draw any conclusion about causation: It may be that some people buy
recycled filters because they start with a high opinion of recycled products, or it may be that use
persuades people that the quality is high.
           Buyers   Nonbuyers
Higher     55.56%   29.90%
The same   19.44%   25.77%
Lower      25.00%   44.33%

1.22 The three conditional distributions are shown in the table below.
                         Neither parent smokes   One parent smokes   Both parents smoke
Student does not smoke   86.14%                  81.42%              77.53%
Student smokes           13.86%                  18.58%              22.47%
The conditional distributions reveal what many people expect: parents have a substantial influence on
their children. Students who smoke are more likely to come from families where one or more of their
parents smoke.
1.23 The biggest difference between Europeans and Americans in their color choice for cars is the
distinction between white/pearl and silver. Americans are much more likely to choose white/pearl, while
Europeans are much more likely to choose silver. The only other differences worth mentioning are that
Europeans are more likely to choose black or gray than Americans, while Americans are more likely to
choose red than Europeans.
1.24 (a) Two side-by-side bar graphs are shown below. Each graph presents a slightly different view of
the same percentages.


(b) The graph above on the left provides an easy comparison of luxury cars with SUV/truck/van
percentages for each color. The heights of the bars are different for every color except silver, with the
biggest differences in white pearl and white. The graph above on the right shows the colors separately for
each type of vehicle. Once again the differences in preferences are clear from the different shapes of the
bars for the two types of cars. Black was the most popular color choice for luxury vehicles while white
was most popular for SUVs.
1.25 State: What is the relationship between whether someone belongs to an environmental organization
and their use of a snowmobile for visitors to Yellowstone National Park? Plan: We suspect that
belonging to an environmental group will reduce the chances that someone will use a snowmobile, so
we'll compare the conditional distributions for snowmobile use for those who belong to an environmental
organization and for those who don't. Do: We'll make a side-by-side bar graph to compare the
snowmobile use of the two groups of people. (See graph below.) Conclude: Those who are members of
an environmental group are much more likely to have never used a snowmobile (nearly 70% of this group
has never used a snowmobile), whereas only about 35% of the other group has never used a snowmobile.

Those in an environmental group are less likely to have rented or owned a snowmobile than those not in
an environmental group.

1.26 State: Do the data support the idea that people who get angry easily tend to have more heart
disease? Plan: We suspect that people with different anger levels will have different rates of CHD, so
we'll compare the conditional distributions of CHD for each anger level. Do: We'll display the
conditional distributions in a table to compare the rate of CHD occurrence for each of the three anger
levels. (See table below.) Conclude: The conditional distributions show that, while CHD occurrence is
quite small overall, the percent of the population with CHD does increase as the anger level increases.
         Low anger   Moderate anger   High anger
CHD      1.70%       2.3%             4.3%
No CHD   98.30%      97.7%            95.7%
Total    100%        100%             100%
1.27 d
1.28 b
1.29 d
1.30 d
1.31 e
1.32 c
1.33 Answers will vary. Two possibilities are given below.
10   40        30   20
50    0        30   20
1.34 (a)
        Hit   No hit
Joe     120   380
Moe     130   370
(b) The overall batting averages are 0.240 for Joe and 0.260 for Moe. Moe has the better overall batting
average. Two separate tables, one for each type of pitcher, are shown below. Against left-handed
pitchers, Joe's batting average is 0.200 and Moe's batting average is 0.100. Against right-handed
pitchers, Joe's batting average is 0.400 and Moe's batting average is 0.300. Joe is better against both
kinds of pitchers.
       Left-handed pitchers           Right-handed pitchers
       Hit   No hit                   Hit   No hit
Joe     80   320               Joe     40    60
Moe     10    90               Moe    120   280
(c) Both players do better against right-handed pitchers than against left-handed pitchers. Joe spent 80%
of his at-bats facing left-handers, while Moe only faced left-handers 20% of the time.
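The reversal in 1.34 (Moe leads overall, Joe leads against each kind of pitcher) can be checked with a short sketch. It uses only the hit and no-hit counts from the tables in part (b); the dictionary layout and the name `batting_average` are illustrative assumptions.

```python
# (hits, no-hits) copied from the two tables in 1.34(b).
results = {
    "Joe": {"left": (80, 320), "right": (40, 60)},
    "Moe": {"left": (10, 90), "right": (120, 280)},
}


def batting_average(hits, no_hits):
    """Hits divided by total at-bats."""
    return hits / (hits + no_hits)


# Batting average against each kind of pitcher.
by_pitcher = {
    player: {hand: batting_average(*cell) for hand, cell in splits.items()}
    for player, splits in results.items()
}

# Overall batting average: pool hits and no-hits across both kinds of pitcher.
overall = {
    player: batting_average(
        sum(hits for hits, _ in splits.values()),
        sum(no_hits for _, no_hits in splits.values()),
    )
    for player, splits in results.items()
}

for player in results:
    print(player, by_pitcher[player], round(overall[player], 3))
```

Pooling weights each player's averages by how often he faced each kind of pitcher, which is why Joe's heavy share of at-bats against left-handers pulls his overall average below Moe's.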
1.35 (a) The two-way table is shown below. (b) Overall, 11.88% of white defendants and 10.24% of
black defendants receive the death penalty. For white victims, 12.58% of white defendants and 17.46%
of black defendants receive the death penalty. For black victims, 0% of white defendants and 5.83% of
black defendants receive the death penalty. (c) The death penalty is more likely when the victim was
white (14.02%) rather than black (5.36%). Because most convicted killers are of the same race as their
victims, whites are more often sentenced to death.
                  Death penalty   No death penalty
White defendant   19              141
Black defendant   17              149
1.36 (a) The individuals are vehicles. (b) The variables are make/model (categorical), vehicle type
(categorical), transmission type (categorical), number of cylinders (quantitative), city MPG (quantitative)
and highway MPG (quantitative).

Section 1.2
Check Your Understanding, page 31:

1. This distribution is skewed to the right.
2. The median is 1.5 and the mean is 1.75.
3. The number of siblings varies from 0 to 6.
4. There are two potential outliers: those students reporting 5 and 6 siblings.

Check Your Understanding, page 34:

1. In general, it appears that females have more pairs of shoes than males. The median report for the
males was 9 pairs while the female median was 26. The females also have a larger range of
57 − 13 = 44 in comparison to the range for the males of 38 − 4 = 34. Finally, both males and females
have distributions that are skewed to the right, though the distribution for the males is more heavily
skewed as evidenced by the three likely outliers at 22, 35 and 38. The females do not have any likely
outliers.
2. b
3. b
4. b

Check Your Understanding, page 39:


1. The graph is below:


2. The distribution is roughly symmetric and bell-shaped. The median IQ appears to be between 110 and
120 and the IQs vary from 80 to 150. There do not appear to be any outliers.

Check Your Understanding, page 41:


1. The graph is given below:

2. The graph is given below:

3. This is a bar graph because the horizontal axis divides the observations up into categories.
4. It would not be correct to describe this graph as right-skewed because the x-variable is categorical, not
quantitative.


Exercises, page 42:

1.37 (a) The graph is shown below:

(b) The data are roughly symmetric with a center of 6 hours. The range is 11 − 3 = 8 hours. There do not
appear to be any outliers.
1.38 (a) A dotplot for the total number of gold medals for a sample of countries is shown below.

The overall distribution is skewed to the right with a mode of 0, which indicates that many countries did
not win any gold medals. China, with 51 gold medals, is clearly unusual, as is the US with 36 and even
Great Britain with 19. The rest of the countries earned 7 or fewer, and the majority earned none. (b) No, this does
not seem to be a representative sample since 17 out of the 30 countries in the sample (or about 57%) won
medals. Overall, only about 27% won medals.
1.39 (a) The two dots above −2 represent games where the opposing team won by 2 goals. (b) Only
two of the 34 differentials are negative, which indicates that the U.S. women's soccer team had a very
good season. The team scored at least as many goals as their opponents in 32 of 34 games. In one game
they beat the other team by 8 goals, a very unusual event in soccer.
1.40 (a) The dot above 6 represents a car that got 6 mpg more on the highway than it did in city driving.
(b) From the dotplot we see that the EPA mpg rating is higher on the highway than in the city for all of
the cars. Most of the cars got at least 9 miles per gallon more on the highway than in the city. Five of the
cars got 7 more miles per gallon on the highway than in the city while only two cars got less than 7 miles
per gallon more on the highway than in the city.
1.41 (a) Answers will vary. One possible dotplot is given below:

(b) As coins get older they get taken out of circulation and new coins are introduced. So most coins in
someone's pocket will be from recent years, but there may be a few from previous years.


1.42 The shape of this distribution is fairly uniform. That is, all of the numbers appear with about the
same frequency in the last digit of telephone numbers.
1.43 Based on the dotplots, the average ratings were higher for the students in the internal reasons group.
Both distributions are roughly symmetric with a range of about 20. But the internal reasons distribution
has a center of 21 whereas the external reasons distribution has a center of 17.
1.44 Based on the dotplots, it appears that the claim is partially true. The middle shelf does have cereals
with the most sugar. However, the bottom shelf has more cereals that have very little sugar and the top
shelf has cereals with a wide range of sugar values.
1.45 (a) If we had not split the stems, most of the data would appear on just a few stems. (b) 16.0%.
The high percentage for Utah may be due to the Mormon Church. (c) Ignoring Utah, the data is roughly
symmetric around 13% with a spread of roughly 3.5%.
1.46 (a) If we had not split the stems, most of the data would appear on just a few stems. (b) Key: 2|3
means that an 8-ounce serving of that soft drink has 23 mg of caffeine. (c) This distribution is somewhat
skewed to the right. The center is 28 mg and the values range from 15 mg to 47 mg. All of these drinks
meet the USFDA's limit.
1.47 (a) The stemplots are given below:
Without splitting stems
6|0 3 5 5 7
7|0 1 2 4 4 8 8 9 9 9
8|1 1 3 6 6 7
9|0 6

With splitting stems


6|0 3
6|5 5 7
7|0 1 2 4 4
7|8 8 9 9 9
8|1 1 3
8|6 6 7
9|0
9|6
Student preferences may vary, but the split stems on the right show more detail. (b) The distribution is
relatively symmetric, with center near 780 mm (the median is 784 mm), and range of
957 − 604 = 353 mm. (c) Monsoon rainfall was below average in 18 of the 23 El Niño years, and only
exceeded 900 mm in one of those years.


1.48 (a) and (b) The stemplots are shown below. The stemplot with the split stems shows the skewness,
gaps, and outliers more clearly. (c) The distribution of the amount of money spent by shoppers at this
supermarket is skewed to the right, with a median of $28.07 and a range of $93.34 − $3.11 = $90.23.
There are a few gaps (from $62 to $69 and $71 to $82) and some outliers on the high end ($86 and $93).
Stem-and-leaf of Dollar  N = 50
Leaf Unit = 1.0
0|399
1|1345677889
2|000123455668888
3|25699
4|1345579
5|0359
6|1
7|0
8|366
9|3

Stem-and-leaf of Dollar  N = 50
Leaf Unit = 1.0
0|3
0|99
1|134
1|5677889
2|0001234
2|55668888
3|2
3|5699
4|134
4|5579
5|03
5|59
6|1
6|
7|0
7|
8|3
8|66
9|3

1.49 (a) Not only are most responses multiples of 10; many are multiples of 30 and 60. Most
people will round their answers when asked to give an estimate like this. The students who
claimed 360 minutes (about 6 hours) and 300 minutes (about 5 hours) may have been
exaggerating. (b) The stemplots suggest that women (claim to) study more than men. The
approximate centers are 175 minutes for women and 120 minutes for men.
                        Women |   | Men
                              | 0 | 0 3 3 3 3 4
                          9 6 | 0 | 6 6 6 7 9 9 9 9
              2 2 2 2 2 2 2 1 | 1 | 2 2 2 2 2 2 2
8 8 8 8 8 8 8 8 8 8 7 5 5 5 5 | 1 | 5 5 8
                      4 4 4 0 | 2 | 0 0 3 4 4
                              | 2 |
                              | 3 |
                            6 | 3 |

1.50 (a) The stemplot is given below:

       Division I-AAA |    | Division V-AA
                    8 |  3 | 6 7 7
              7 7 6 1 |  4 | 4 4 5 6
        8 6 5 4 3 2 1 |  5 | 4 8
8 7 5 5 4 4 4 3 3 2 2 |  6 | 0 2 6 6 7 7 9
          8 7 6 4 1 1 |  7 | 2 4
                    7 |  8 | 6
                    1 |  9 | 2 3 6 8 8
                    6 | 10 |

(b) The Division I-AAA scores are reasonably symmetric and bell-shaped with a median of 63.5
and a range of scores of 106 − 38 = 68. The Division V-AA scores are more uniform with scores
varying from 36 to 98 and a median of 66. While the Division V-AA median is higher than the
Division I-AAA median, the scores are more variable, with many lower scores and many higher
scores.

1.51 (a) The distribution is slightly skewed to the left, although not strikingly so if one ignores
the low outlier(s). (b) Answers will vary. The center is between 0% and 2.5%. (c) The highest
return was between 10% and 12.5%. Ignoring only one low outlier, the lowest return was
between −17.5% and −15%. If we ignore two low outliers, the lowest return was between
−12.5% and −10%. (d) About 37% of these months (102 out of 273) had negative returns.
1.52 The distribution of lengths of words in Shakespeare's plays is skewed to the right with a
center between 5 and 6 letters. The smallest word contains one letter and the largest word
contains 12 letters, so the range is 11 letters.
1.53 (a) The graph is given below:

(b) The distribution is roughly symmetric. Based on the histogram, the center is near 23 minutes,
and the range is 30.9 − 15.5 = 15.4 minutes. There do not appear to be any outliers.


1.54 (a) The graph is given below:

(b) The distribution of the emissions is skewed to the right with center near 3 metric tons per
person. The range is 19.6 − 0.1 = 19.5 metric tons per person and there appear to be three outliers:
Canada, Australia and the United States.
1.55 The graph is given below:

The distribution is somewhat skewed to the left with center at 35. The smallest DRP score is 14
and the largest DRP score is 54, so the scores have a range of 40. There are no gaps or outliers.
1.56 The graph is given below:

This distribution is roughly symmetric. There are no clear outliers, though some may suggest that
the approximate 10 minute drive times are outliers.


1.57 (a) The histogram below shows that the distribution is skewed to the right with a single
peak. The center is at 4 letters, with a range of 15 − 1 = 14 letters. There are no gaps or outliers.

(b) There are more 2-, 3-, and 4-letter words in Shakespeare's plays and more very long words in
Popular Science articles.
1.58 (a) The histogram is given below:

(b) The distribution of chest sizes is remarkably symmetric with center around 40 inches and
range of 48 − 33 = 15 inches. This information is important so that the military can plan for
having the correct distribution of uniform sizes.
1.59 It is difficult to effectively compare the salaries of the two teams with these two histograms
because the scale on the x-axis is very different from one graph to the other.
1.60 Both distributions are skewed to the right, but the Yankees have a longer right tail. This
means that the Yankees have a higher center and a larger spread. The median salary for the
Yankees appears to be about $400,000 whereas the Phillies have a median of about $300,000.
The range of the Yankees' salaries is $3,600,000 − $0 = $3,600,000 whereas the Phillies have
a range of $1,600,000 − $0 = $1,600,000 (obviously the lowest salary is not $0, but we do not
know from the histogram exactly what the lowest salary is). Finally, the Yankees have an outlier
somewhere between $3,200,000 and $3,600,000.


1.61 Answers will vary. A possible bar graph is given below:

1.62 Answers will vary. A possible bar graph is given below:

1.63 (a) The percents for women sum to 100.1% due to roundoff error. (b) Relative frequency
histograms are shown below since there are considerably more men than women.


(c) Both histograms are skewed to the right, with the women's salaries generally lower than the
men's. The peak for women is the interval from $20,000 to $25,000, and the peak for men is the
interval from $25,000 to $30,000. The range of salaries is the same, with salaries in the smallest
and largest intervals for both genders.
1.64 Among those Vietnamese who are younger, there are more males than females. This seems
to be true until those who are aged 35 or more. At that point the females seem to outnumber the
males. Most Vietnamese are young with a peak frequency for those in their 20s. After 30 the
numbers start to decline fairly rapidly.
1.65 (a) Two bar graphs are shown below for comparison:

(b) The two distributions are very different. The distribution of scores on the statistics exam is
roughly symmetric with a peak at 3. The distribution of scores on the AB calculus exam shows a
very different pattern, with a peak at 1 and another slightly lower peak at 5. The College Board
considers 3 or above to be a passing score. The percents of students passing the exams are
very close (59.5% for calculus AB and 58.8% for statistics). Some students might be tempted to
argue that the calculus exam is easier because a higher percent of students score 5. However,
there is a larger percent of students who score 1 on the calculus exam. From these two
distributions it is impossible to tell which exam is easier. (Note: Grade setting depends on a
variety of factors, including the difficulty of the questions, scoring standards, and the
implementation of scoring standards. The distributions above do not include any information
about the ability of the students taking the exam. If we have a less able group of students, then
scores would be lower, even on an easier exam.)
1.66 It appears that the prediction for China in 2050 suggests that until age 65 or so each age
group will have more men than women. After age 65, it appears that each age group will have
slightly more women than men. The largest group of Chinese will be in their late 50s and early
60s, in 2050. In fact, other than the late 50s and early 60s, the distribution looks quite uniform
(equal numbers of people at different age levels) until the mid 80s and above.
1.67 (a) This histogram represents the amount of studying. We would expect most students to
study some, but not a huge amount. Any outliers would likely be high outliers, leading to a right-skewed distribution. (b) This graph represents the right vs. left-handed variable. About 90% of the
population is right-handed and since 0 represents right-handed people we would expect a much
higher bar at 0 than at 1. (c) This graph represents the gender of the students. We would expect
a more even distribution among the males and females than we would for the right-handed and
left-handed students. (d) This histogram represents the heights of the students. The distribution
of heights is usually symmetric and bell-shaped.
1.68 (a) Radio stations are categorical, so use a bar graph with one bar for each station. (b) Since
hours studied per week is quantitative, either use a dotplot, stemplot, or a histogram. (c) Since
calories is quantitative, either use a dotplot, stemplot, or a histogram.
1.69 a
1.70 d
1.71 c
1.72 b
1.73 b
1.74 d
1.75 (a) The individuals are Major League Baseball players who were on the roster on opening
day of the 2009 season. (b) There are six variables besides name. Two of them are categorical
(team, position) and the other 4 are quantitative (age, height, weight and salary). (c) Age is
measured in years, height in feet and inches, weight in pounds and salary in dollars.
1.76 (a) Generally, more people love the newer devices such as the iPod, Broadband and
HDTV. Those less loved are older technologies like cable tv and pay tv. (b) It would not be
appropriate to make a pie chart with these data because the categories are not dividing up the
whole into pieces. Individuals could be represented in more than one bar.
1.77 (a) There were 10 + 9 + 24 + 61 + 206 + 548 = 858 observations in all. Of those,
10 + 61 = 71 were elite players. So $\frac{71}{858} \times 100\% = 8.28\%$ were elite soccer players. There were
10 + 9 + 24 = 43 who had arthritis, which means that 5% of the people had arthritis. (b) 10 of the
71 elite players had arthritis. This means that 14.1% had arthritis. 10 of the 43 people who had
arthritis were elite players. This means that 23.3% of those with arthritis were elite soccer
players.
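The two conditional percents in 1.77(b) divide the same count (10) by different totals, which the following sketch makes explicit. The elite counts (10 and 61) come from the answer itself; assigning 9 and 206 to the non-elite players and 24 and 548 to those who did not play is an assumption inferred from the 4.2% figures quoted in 1.78.

```python
# (with arthritis, without arthritis) for each group.
# The labels on the last two rows are an assumption inferred from 1.78.
counts = {
    "elite": (10, 61),
    "non-elite": (9, 206),
    "did not play": (24, 548),
}

total = sum(yes + no for yes, no in counts.values())      # all observations
elite_total = sum(counts["elite"])                         # all elite players
arthritis_total = sum(yes for yes, _ in counts.values())   # all with arthritis
elite_with_arthritis = counts["elite"][0]

pct_elite = 100 * elite_total / total
pct_arthritis = 100 * arthritis_total / total

# Same numerator, different denominators.
pct_arthritis_given_elite = 100 * elite_with_arthritis / elite_total
pct_elite_given_arthritis = 100 * elite_with_arthritis / arthritis_total

print(round(pct_elite, 2), round(pct_arthritis, 2))
print(round(pct_arthritis_given_elite, 1), round(pct_elite_given_arthritis, 1))
```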
1.78 The percent of each group who have arthritis is 14.1% for the elite soccer players, 4.2% for
the non-elite soccer players and 4.2% for the people who did not play. This suggests an
association between playing elite soccer and developing arthritis. This can also be seen in the
following bar graph:

Section 1.3
Check Your Understanding, page 55:

1. Since the distribution is skewed to the right, we would expect the mean to be larger than the
median.
2. The mean is $\frac{5 + 10 + 10 + 15 + \cdots + 85}{20} = 31.25$ minutes, which is bigger than the median of
22.5 minutes.
3. If we divided the travel time up evenly among all 20 people, each would have a 31.25 minute
travel time.
4. In this case, since the distribution is skewed, the median would be a better measure of the
center of the distribution.

Check Your Understanding, page 61:

1. The data, when ordered, are: 307 311 311 313 317 318 318 326 338 353. Therefore, the
minimum is 307 and the maximum is 353. The median is the average of the 5th and 6th
observations and is $\frac{317 + 318}{2} = 317.5$. The first quartile is the median of the bottom 5
observations, or the 3rd observation: 311. The third quartile is the median of the top 5
observations, or the 8th observation: 326. So the five-number summary is 307, 311, 317.5, 326,
353.


2. The IQR is 326 − 311 = 15. This is the range of the middle half of the data.
3. 1.5 × IQR = 1.5(15) = 22.5. Any outliers occur below 311 − 22.5 = 288.5 or above
326 + 22.5 = 348.5. There are no observations below 288.5. However, there is one observation
above 348.5: the value 353. It is an outlier.
4. The graph is given below. Note that Minitab computes the quartiles differently and so does
not find the highest point to be an outlier.
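The steps in items 1 through 3 can be sketched in a few lines. The helper below follows the method used in this answer (each quartile is the median of the half of the ordered data on its side of the median), which is why it can disagree with software such as Minitab; the function name `five_number_summary` is an illustrative choice.

```python
from statistics import median


def five_number_summary(values):
    """Min, Q1, median, Q3, max, with quartiles as medians of each half."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]           # observations below the median's position
    upper = data[half + n % 2:]   # observations above the median's position
    return data[0], median(lower), median(data), median(upper), data[-1]


data = [307, 311, 311, 313, 317, 318, 318, 326, 338, 353]
minimum, q1, med, q3, maximum = five_number_summary(data)

# 1.5 x IQR rule for flagging outliers.
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print((minimum, q1, med, q3, maximum), iqr, (low_fence, high_fence), outliers)
```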

Check Your Understanding, page 64:


1. The mean is $\frac{67 + 72 + 76 + 76 + 84}{5} = 75$. If the total of all of the heights was the same, but
all players were the same height, they would be 75 inches tall.
2. The table is given below:
Observation   Deviation       Squared Deviation
67            67 − 75 = −8    (−8)² = 64
72            72 − 75 = −3    (−3)² = 9
76            76 − 75 = 1     1² = 1
76            76 − 75 = 1     1² = 1
84            84 − 75 = 9     9² = 81
Total                         156

3. The variance is the sum of the squared deviations (taken from the Total line) divided by 4
(the number of observations − 1). In this case that means that the variance is $\frac{156}{4} = 39$ inches
squared. The standard deviation is the square root of the variance. In this case that is 6.24
inches.
4. The players' heights vary about 6.24 inches from the mean height of 75 inches on average.
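The deviation table and the variance calculation in items 2 and 3 can be reproduced with the following sketch, which assumes only the five heights given in item 1. Variable names are illustrative.

```python
from math import sqrt

heights = [67, 72, 76, 76, 84]  # inches, from item 1

mean = sum(heights) / len(heights)
deviations = [h - mean for h in heights]       # observation minus mean
squared = [d ** 2 for d in deviations]         # squared deviations

# Sample variance divides by n - 1, not n.
variance = sum(squared) / (len(heights) - 1)
std_dev = sqrt(variance)

print(mean, deviations, sum(squared), variance, round(std_dev, 2))
```

The deviations always sum to zero, which is why they are squared before being averaged.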

Exercises, page 70:

1.79 The mean of Joey's first 14 quiz scores is $\frac{86 + 84 + \cdots + 93}{14} = \frac{1190}{14} = 85$. If Joey had scored
the same number of points on the first 14 quizzes, but the scores had all been the same, then he
would have scored an 85 on each quiz.


1.80 The mean weight for the 7 defensive linemen on the 2009 Dallas Cowboys is
$\frac{306 + 305 + 315 + 303 + 318 + 309 + 285}{7} = 305.86$ pounds. If the same number of pounds were
spread equally among the 7 men, they would all weigh 305.86 pounds.
1.81 (a) Putting the scores in order: 74 75 76 78 80 82 84 86 87 90 91 93 96 98. Since
there are 14 scores, the median is the mean of the 7th and 8th scores. Therefore the median is
$\frac{84 + 86}{2} = 85$. About half of the scores are lower than 85 and about half are larger than 85. (b) If
Joey had a 0 for the 15th quiz then the sum of his quiz scores would still be 1190, leading to a
mean of $\frac{1190}{15} = 79.33$. To find the median, we add the 0 to the beginning of the list in part (a).
Since there are now 15 measurements, the median would be the 8th measurement, which is 84.
Notice that the median did not change much but the mean did. This shows that the mean is not
resistant to outliers, but the median is.
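A quick way to see the resistance argument in 1.81 is to compute both measures before and after adding the 0. This sketch uses the 14 ordered scores from part (a) and Python's standard `statistics` module.

```python
from statistics import mean, median

# The 14 quiz scores from part (a).
scores = [74, 75, 76, 78, 80, 82, 84, 86, 87, 90, 91, 93, 96, 98]

# Part (b): the same scores plus a 0 on the 15th quiz.
with_zero = scores + [0]

print(mean(scores), median(scores))                      # before the 0
print(round(mean(with_zero), 2), median(with_zero))      # after the 0
```

The single low score moves the mean by more than 5 points but moves the median by only 1.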
1.82 (a) Putting the weights in order: 285 303 305 306 309 315 318. Since there are 7
measurements, the median is the 4th. Therefore the median is 306 pounds. About half of the men
weigh less than 306 pounds and half weigh more than 306 pounds. (b) If the lowest weight were
265 instead of 285, then the mean would become smaller, but the median would not change. This
is because the median is resistant to outliers, but the mean is not.
1.83 The mean is $60,954 and the median is $48,097. The distribution of salaries is likely to be
quite right skewed because of a few people who have a very large income. When a distribution is
skewed to the right, the mean is larger than the median since the tail values pull the mean toward them.
1.84 The mean house price is $216,400 and the median is $172,600. The distribution of house
prices is likely to be quite skewed to the right because of a few very expensive homes. When a
distribution is skewed to the right, the mean is larger than the median since the tail values pull the
mean toward them.
1.85 The team's annual payroll is 1.2(25) = 30, or $30 million. No, you would not be able to
calculate the team's annual payroll from the median because you cannot determine the sum of all
25 salaries from the median.
1.86 The mean salary is $60,000. Seven of the eight employees (everyone but the owner) earned
less than the mean. The median is $22,000. An unethical recruiter would report the mean salary
as the "typical" or "average" salary. The median is a more accurate depiction of a typical
employee's earnings, because it is not influenced by the outlier of $270,000.
1.87 (a) Estimate the frequencies of the bars (from left to right): 10, 40, 42, 58, 105, 60, 58, 38,
27, 18, 20, 10, 5, 5, 1 and 3 (although answers may vary slightly, the frequencies must sum to
500). Using these values, we can estimate the mean by adding 2 ten times, 3 forty times, ..., and
17 three times. This is equivalent to multiplying the value of each bar (2 through 17) by its
frequency or height. This gives us a sum of 3504. The mean is then estimated by dividing by the
number of responses: $\bar{x} = 3504/500 = 7.01$. We estimate the median by finding the average of the
250th and 251st values. The median is found to be 6. (b) Since the median is less than the mean,
we would use the median to argue that shorter domain names are more popular.
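The estimates in 1.87(a) can be reproduced from the bar heights alone. The sketch below is one way to do it in plain Python; the helper `value_at` is a hypothetical name introduced here, and the frequencies are the estimates read from the histogram, so slightly different readings would give slightly different results.

```python
# Estimated bar heights for domain-name lengths 2 through 17 (read from the histogram).
lengths = list(range(2, 18))
freqs = [10, 40, 42, 58, 105, 60, 58, 38, 27, 18, 20, 10, 5, 5, 1, 3]

n = sum(freqs)                                      # number of responses
total = sum(x * f for x, f in zip(lengths, freqs))  # each value times its frequency
est_mean = total / n

def value_at(position):
    """Return the value of the observation at a 1-based position in sorted order."""
    running = 0
    for x, f in zip(lengths, freqs):
        running += f
        if position <= running:
            return x
    raise ValueError("position exceeds the number of observations")

# n is even, so the median averages the two middle observations.
est_median = (value_at(n // 2) + value_at(n // 2 + 1)) / 2

print(n, total, round(est_mean, 2), est_median)
```

Walking the cumulative frequencies avoids expanding the histogram into a list of 500 individual values.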


1.88 (a) Estimate the frequencies of the bars (from left to right): 15, 11, 15, 11, 8, 5, 3, 3, 3
(although answers may vary slightly, the frequencies must sum to 74). We estimate the median
by finding the average of the 37th and 38th values. The median is found to be 2. The first quartile
is the median of the lower 37 observations. This means that it is the value of the 19th observation.
This is found to be 1. The third quartile is the median of the upper 37 observations, which means
that it is the value of the 56th observation. This is found to be 4. (b) Using these values, we can
estimate the mean by adding 0 fifteen times, 1 eleven times, ..., and 8 three times. This is
equivalent to multiplying the value of each bar (0 through 8) by its frequency or height. This
gives us a sum of 194. The mean is then estimated by dividing by the number of responses:
$\bar{x} = 194/74 = 2.62$.
1.89 (a) Putting the data in order we get: 74 75 76 78 80 82 84 86 87 90 91 93 96 98.
There are 14 observations here so the first quartile is the median of the bottom 7 observations.
This means that it is the value of the 4th observation. We find it to be 78. The third quartile is the
median of the top 7 observations, so it is the value of the 11th observation. We find it to be 91.
So IQR = 91 - 78 = 13. The middle 50% of the data have a spread of 13 points. (b) Any outliers
are below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR. These are computed to be 78 - 1.5(13) = 58.5 and
91 + 1.5(13) = 110.5. There are no points outside of these bounds, so there are no outliers.
1.90 (a) Putting the data in order we get: 285 303 305 306 309 315 318. Since there are 7
data points, the median is the 4th and is not included in either the lower half or the upper half of
the data set. The first quartile is the middle of the bottom 3 observations, or 303. The third
quartile is the middle of the top 3 observations, or 315. Therefore IQR = 315 - 303 = 12. The
middle 50% of the weights have a spread of 12 pounds. (b) Any outliers are below Q1 - 1.5 × IQR
or above Q3 + 1.5 × IQR. These are computed to be 303 - 1.5(12) = 285 and 315 + 1.5(12) = 333.
While we do have an observation of exactly 285, it is not lower than the boundary we computed,
so it is not designated as an outlier. This data set has no outliers.
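The outlier checks in 1.89 and 1.90 follow the same recipe, which can be sketched as below. This is an illustrative implementation, not a library routine: the function names are hypothetical, and the quartiles are computed as medians of the lower and upper halves (excluding the overall median when the count is odd), which is the convention these solutions use. Statistical software may use other quartile definitions and report slightly different values.

```python
from statistics import median

def quartiles(data):
    """Q1 and Q3 as medians of the lower and upper halves.

    When the number of observations is odd, the overall median is
    excluded from both halves.
    """
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]
    return median(lower), median(upper)

def outlier_fences(data):
    """Return (lower fence, upper fence) for the 1.5 x IQR rule."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def outliers(data):
    low, high = outlier_fences(data)
    # Strict inequalities: a value exactly on a fence is not an outlier.
    return [x for x in data if x < low or x > high]

quiz_scores = [74, 75, 76, 78, 80, 82, 84, 86, 87, 90, 91, 93, 96, 98]
lineman_weights = [285, 303, 305, 306, 309, 315, 318]

print(outlier_fences(quiz_scores), outliers(quiz_scores))
print(outlier_fences(lineman_weights), outliers(lineman_weights))
```

The strict comparison is what keeps the 285-pound lineman from being flagged: his weight sits exactly on the lower fence, not below it.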
1.91 (a) Using a stemplot to put the data in order:
 0 | 0001133557889
 1 | 4
 2 | 5569
 3 |
 4 | 24
 5 | 2
 6 |
 7 | 2
 8 |
 9 | 28
10 |
11 | 8
Key: 1|4 represents 14 messages sent
We now find that the median is 9, the first quartile is 3 and the third quartile is 43. The IQR is
40. So designate anything below 3 - 1.5(40) = -57 or above 43 + 1.5(40) = 103 as outliers. This
means that the value of 118 is an outlier. The boxplot produced by computer software is shown
below.


(b) The article claims that teens send 1742 texts a month. This works out to be about 58 texts a
day (assuming a 30-day month). That seems pretty high given this data set. Twenty-one of the
25 students sent fewer than that; in fact, about half of the students sent fewer than 10 messages
(about 1/6 of the amount claimed in the article).
1.92 (a) The median is the average of the ranked scores in the middle two positions (the 15th and
16th ranked scores). The median is 87.75. Half of the students scored less than 87.75 and half
scored more than 87.75. Q1 is the score one-quarter up the list of ordered scores, 82. Q3 is the
score three-quarters up the ordered list of scores, 93. IQR = 93 - 82 = 11. The middle 50% of the
scores have a range of 11 points. Any observation above Q3 + 1.5 × IQR = 93 + 1.5(11) = 109.5 or
below Q1 - 1.5 × IQR = 82 - 1.5(11) = 65.5 is considered an outlier. Thus, the scores 43 and 45 are
outliers. The boxplot produced by computer software is shown below.

(b) Most students did quite well. In fact, over three-quarters of the class scored higher than 80,
typically a grade of B.
1.93 (a) Since the data are recorded as the difference (texts - calls), positive numbers indicate
students who had more text messages than calls. It appears from the boxplot as though the first
quartile is 0, which means that approximately 25% made more calls than texts, while the remaining
75% had more texts than calls. So this does support the article's conclusion. (b) No, we cannot make
any more general conclusion. The sample was not a random sample and there may be some
commonality among his students that affected their responses to this question.
1.94 (a) Minimum = 3, maximum = 55. The median divides the ordered data in half and is
therefore the 26th ordered observation, 8. Q1 is the median of the first half of the distribution and
is therefore the 13th ordered observation, 4. Q3 is the median of the last half of the distribution and
is therefore the 39th ordered observation, 12. IQR = 12 - 4 = 8. An outlier is any observation that
is less than Q1 - 1.5 × IQR = 4 - 12 = -8 or greater than Q3 + 1.5 × IQR = 12 + 12 = 24. Thus there are
4 outliers: 27, 31, 34 and 55. The boxplot is shown below.

(b) Use the median and IQR rather than the mean and standard deviation because the distribution
is right skewed.
1.95 (a) The stock fund varied between about -3.5% and 3%. (b) The median return for the
stock fund was slightly positive, about 0.1%, while the median real estate fund return appears to
be close to 0%. (c) The stock fund is much more variable. It has higher positive returns, but also
higher negative returns.
1.96 All five income distributions are skewed to the right. As highest education level rises, the
median, quartiles, and extremes rise; that is, all five points on the boxplot increase. Additionally,
the width of the box (the IQR) and the distance from one extreme to the other (the difference
between the 5th and 95th percentiles) also increase, meaning that the distributions become more
and more spread out.
1.97 (a) The mean phosphate level is $\bar{x} = 32.4/6 = 5.4$ mg/dl. The standard deviation is
$s_x = \sqrt{2.06/5} = 0.6419$ mg/dl. Details are provided below.

         $x_i$    $x_i - \bar{x}$    $(x_i - \bar{x})^2$
          5.6       0.2                0.04
          5.2      -0.2                0.04
          4.6      -0.8                0.64
          4.9      -0.5                0.25
          5.7       0.3                0.09
          6.4       1.0                1.00
Total    32.4       0                  2.06

(b) A typical phosphate level differs from the mean level by about 0.6419 mg/dl.
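The hand calculation in 1.97(a) can be retraced step by step. The sketch below mirrors the columns of the details table (deviations, squared deviations, division by n - 1) and then compares the result with `statistics.stdev`, which uses the same n - 1 definition. The variable names are illustrative.

```python
from math import sqrt
from statistics import stdev

phosphate = [5.6, 5.2, 4.6, 4.9, 5.7, 6.4]  # mg/dl, the six measurements in 1.97

n = len(phosphate)
x_bar = sum(phosphate) / n                   # mean
deviations = [x - x_bar for x in phosphate]  # x_i - x_bar
sum_sq = sum(d ** 2 for d in deviations)     # sum of squared deviations
s_x = sqrt(sum_sq / (n - 1))                 # divide by n - 1, not n

print(round(x_bar, 2), round(sum_sq, 2), round(s_x, 4))
print(round(stdev(phosphate), 4))            # library value, for comparison
```

Dividing by n instead of n - 1 would give a smaller value, so it matters which definition a calculator or software package uses.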
1.98 (a) Mean = $\bar{x} = (7 + 7 + 9 + 9)/4 = 8$. The average amount of sleep that the first four students
got last night was 8 hours. The deviations are 7 - 8 = -1, 7 - 8 = -1, 9 - 8 = 1, 9 - 8 = 1. The
standard deviation is then
$s_x = \sqrt{\frac{(-1)^2 + (-1)^2 + 1^2 + 1^2}{4 - 1}} = \sqrt{\frac{4}{3}} = 1.15$ hours.
(b) The distance between a typical response and the mean response is 1.15 hours. (c) No, it would
not be safe to make this generalization. This is not a random sample and it is not likely that the
first 4 students to arrive in the classroom are representative of the entire class in terms of the
amount of sleep they got last night.
1.99 (a) It looks like the distribution is skewed to the right because the mean is much larger than
the median. (b) The standard deviation is 21.6974. The distance between a typical response and
the mean response is $21.6974. (c) The first quartile is 19.27 and the third quartile is 45.4 so the
IQR is 45.4 - 19.27 = 26.13. Any points below 19.27 - 1.5(26.13) = -19.925 or above
45.4 + 1.5(26.13) = 84.595 are outliers. Since the maximum point is 93.34, there is at least one
high outlier.
1.100 (a) It would appear that the distribution for the female doctors is more likely to be
symmetric since the mean and median are relatively close together (19.1 and 18.5 respectively).
The mean and median for the male doctors are quite far apart (41.333 and 34 respectively). (b)
The IQR measures the range of the middle 50% of the data. This does not take outliers into
consideration. The standard deviation, however, uses every point and is not resistant to outliers.
So, while the middle 50% of the data set may look very similar, if one data set has many more
outliers, it will have a larger standard deviation. (c) It does appear that males perform more
C-sections. Each of the numbers in the 5-number summary was larger for the males than for the
females.
1.101 Yes, IQR is resistant. Answers will vary. Consider the simple data set 1, 2, 3, 4, 5, 6, 7, 8.
The median = 4.5, Q1 = 2.5, Q3 = 6.5, and IQR = 4. Changing any value outside the interval
between Q1 and Q3 will have no effect on the IQR. For example, if 8 is changed to 88, the IQR
will still be 4.
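The example in 1.101 is easy to verify directly. The sketch below assumes the same quartile convention as the solution (medians of the lower and upper halves); the function name `iqr` is hypothetical. The range is printed alongside for contrast, since it is not resistant.

```python
from statistics import median

def iqr(data):
    """IQR with quartiles taken as medians of the lower and upper halves."""
    ordered = sorted(data)
    half = len(ordered) // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[-half:])
    return q3 - q1

original = [1, 2, 3, 4, 5, 6, 7, 8]
modified = [1, 2, 3, 4, 5, 6, 7, 88]  # largest value changed from 8 to 88

print(iqr(original), iqr(modified))
# Range of each data set, for contrast with the IQR.
print(max(original) - min(original), max(modified) - min(modified))
```

The IQR is unchanged by the extreme value, while the range jumps from 7 to 87.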
1.102 Variable A has a larger standard deviation because more of the observations have values
further from the mean. Because of the bell-shape to the distribution of variable B, more of the
observations have values quite close to the mean.
1.103 (a) One possible answer is 1, 1, 1, 1. (b) 0, 0, 10, 10. (c) For (a), any set of four identical
numbers will have $s_x = 0$. For (b), the answer is unique; here is a rough description of why. We
want to maximize the "spread-out-ness" of the numbers (which is what standard deviation
measures), so 0 and 10 seem to be reasonable choices based on that idea. We also want to make
each individual squared deviation $(x_1 - \bar{x})^2$, $(x_2 - \bar{x})^2$, $(x_3 - \bar{x})^2$ and
$(x_4 - \bar{x})^2$ as large as possible. If we choose 0, 10, 10, 10 (or 10, 0, 0, 0) we make the first
squared deviation $7.5^2$, but the other three are only $2.5^2$. Our best choice is two at each
extreme, which makes all four squared deviations equal to $5^2$.
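The uniqueness claim in 1.103(c) can also be checked by brute force. The sketch below assumes the four numbers must be whole numbers from 0 to 10 with repeats allowed; that restriction is an assumption based on the answers given, since the exercise statement is not reproduced here. It tries every such set and counts how many reach the smallest and largest standard deviation.

```python
from itertools import combinations_with_replacement
from statistics import stdev

# Assumption: four whole numbers from 0 to 10, repeats allowed.
# Each candidate is a sorted tuple, so every set is listed exactly once.
candidates = list(combinations_with_replacement(range(11), 4))

smallest = min(candidates, key=stdev)
largest = max(candidates, key=stdev)

print("smallest s_x:", smallest, stdev(smallest))
print("largest s_x: ", largest, round(stdev(largest), 3))

# How many distinct sets reach each extreme?
n_min = sum(1 for c in candidates if stdev(c) == 0)
n_max = sum(1 for c in candidates if abs(stdev(c) - stdev(largest)) < 1e-12)
print(n_min, n_max)
```

Under this assumption, every set of four identical numbers ties for the minimum, while only one set attains the maximum, which agrees with the argument above.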

1.104 (a) This could be used to measure the center since we are averaging the 25th and 75th
percentiles, effectively finding a middle point between these positions. It would be resistant to
outliers, because any outliers would occur further out in the tail. (b) This could be used as a
measure of spread since it finds the distance between the smallest and largest values and then
divides by 2. It gives half of the range. This measure, however, would not be resistant to
outliers, since if outliers exist, they would be, by definition, either the max, the min, or both.
1.105 State: Do the data indicate that women have better study habits and attitudes towards
learning than men? Plan: We will draw side-by-side boxplots for each group. We will compute
the 5-number summary, the mean and the standard deviation for the scores of each group. Then
we will compare the groups using both graphical and numerical summaries. Do: The boxplots
are given below, as is a table of the numerical summaries.

Variable   N    Mean     StDev   Minimum   Q1       Median   Q3       Maximum
Women      18   141.06   26.44   101.00    123.25   138.50   156.75   200.00
Men        20   121.25   32.85    70.00     95.00   114.50   144.50   187.00

Conclude: It appears from the boxplot and the numerical summaries that the women have higher
values for all of the components of the 5-number summary. They also have a higher mean and a
smaller standard deviation. This means that not only are their scores higher, but there is less
variability to their scores.
1.106 State: Do the different types of flowers have different lengths? Plan: We will look at
side-by-side boxplots, the 5-number summaries, the means and the standard deviations for all
three kinds of flowers. Do: The boxplots are given below, as is a table of the numerical
summaries.

Variable   Mean     StDev   Minimum   Q1       Median   Q3       Maximum
H. bihai   47.597   1.213   46.340    46.690   47.120   48.293   50.260
red        39.711   1.799   37.400    38.070   39.160   41.690   43.090
yellow     36.180   0.975   34.570    35.450   36.110   36.820   38.130

Conclude: H. bihai is clearly the tallest variety: the shortest bihai was over 3 mm taller than the
tallest red. Red is generally taller than yellow, with a few exceptions. Another noteworthy fact:
The red variety is more variable than either of the other varieties. Our overall conclusion, then, is
that the researchers appear to be correct in their assumption that the three varieties have different
lengths.
1.107 d
1.108 b
1.109 a
1.110 a
1.111 (a) Yes, a pie chart is appropriate here since the categories (method of communication)
are mutually exclusive (each student chose one method by which they most often communicated
with friends) and are parts of a whole. (b) The graph should not be described as skewed to the
right because this is a bar graph. The categories could be graphed in any order. Skewness only
describes quantitative data.
1.112 The histogram is given below:

This distribution is basically symmetric with a center around 170 cm and values ranging from
145.5 cm to 191 cm. There do not appear to be any outliers.
1.113 Women appear to be more likely to engage in behaviors that are indicative of habits of
mind. They are especially more likely to revise papers to improve their writing (about 55% of
females report this as opposed to about 37% of males). The difference is a little less for seeking
feedback on their work. In that case about 49% of the females did this as opposed to about 38%
of males.


Chapter Review Exercises (page 75)


R1.1 (a) The individuals are movies. (b) The variables are name (categorical), Year
(quantitative, in years), Rating (categorical), Time (quantitative, in minutes), Genre (categorical)
and Box office sales (quantitative, in dollars). (c) This movie is Avatar, released in 2009. It was
rated PG-13, runs 162 minutes, is an action film, and had box office sales of $1,141,340,297.
R1.2 The bar chart is given below.

R1.3 (a) It is the areas of the phones that should be in proportion, not just the heights. For
example, the picture for send/receive text messages should be roughly twice the size of the
picture for camera when it is actually much more than twice as large. (b) It would not be
appropriate to make a pie chart for these data because they do not describe parts of a whole.
Students were free to answer in more than one category. (c) A bar graph is given below.

R1.4 (a) There were a total of 78 + 49 + 21 + 4 + 21 + 46 = 219 students who responded. Of those,
78 + 49 + 21 = 148 were Facebook users. So the percent of Facebook users is
(148/219) × 100% = 67.6%. This is part of a marginal distribution because it compares the total in one
column to the overall total in the table. (b) There were 78 + 4 = 82 younger students and 78 of
those were Facebook users. So the percentage of younger students who are Facebook users is
(78/82) × 100% = 95.1%. There were 148 Facebook users so (78/148) × 100% = 52.7% of Facebook users
were younger.
R1.5 There does appear to be an association between age and Facebook status. Looking at the
conditional distributions of Facebook status for each of the age groups, we get the following
table:
                     Facebook user?
Age                  Yes      No
Younger (18-22)      95.1%     4.9%
Middle (23-27)       70.0%    30.0%
Older (28 and up)    31.3%    68.7%
This can also be seen graphically in the bar graph given below:

From both the table and the graph we can see that the older the student is, the less likely they
are to be a Facebook user. For younger students, about 95% are users. That drops to
70% for middle students and drops even further to 31.3% for older students.
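The marginal and conditional percentages in R1.4 and R1.5 can be recomputed from the six counts quoted in R1.4. The sketch below is illustrative; the dictionary layout and variable names are choices made here, not part of the exercise.

```python
# Counts from the two-way table: (Facebook users, non-users) in each age group.
counts = {
    "Younger (18-22)": (78, 4),
    "Middle (23-27)": (49, 21),
    "Older (28 and up)": (21, 46),
}

total = sum(yes + no for yes, no in counts.values())
total_yes = sum(yes for yes, _ in counts.values())

# Marginal distribution: share of all respondents who are Facebook users.
marginal_yes = 100 * total_yes / total

# Conditional distribution of Facebook status within each age group.
conditional = {
    age: (100 * yes / (yes + no), 100 * no / (yes + no))
    for age, (yes, no) in counts.items()
}

print(f"Overall: {marginal_yes:.1f}% are Facebook users")
for age, (pct_yes, pct_no) in conditional.items():
    print(f"{age:<18} {pct_yes:5.1f}% yes {pct_no:5.1f}% no")
```

Each conditional percentage divides by the size of its own age group, whereas the marginal percentage divides by the overall total; mixing up the two denominators is the usual source of error in problems like these.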
R1.6 (a) A stemplot is shown below.
Stem-and-leaf of density   N = 29
Leaf Unit = 0.010
48  8
49
50  7
51  0
52  6799
53  04469
54  2467
55  03578
56  12358
57  59
58  5


(b) The distribution is roughly symmetric with one value (4.88) that is somewhat low. The center
of the distribution is between 5.4 and 5.5. The densities have a range of 5.85 - 4.88 = 0.97 and there
are no outliers. (c) Since the distribution is roughly symmetric, we can use the mean to estimate
the Earth's density to be about 5.45 in these units.
R1.7 (a) A histogram is shown below.

The survival times are right skewed, as expected. The median survival time is 102.5 days and the
range of survival times is 598 - 43 = 555 days. There are several high outliers with survival
times above 500 days. (b) The boxplot is shown below:

R1.8 (a) About 20% of low-income and 33% of high-income households consisted of two
people. (b) The majority of low-income households, but only about 7% of high-income
households, consist of one person. One-person households often have less income because
they would include many young people who have no job, or have only recently started
working. (Income generally increases with age.)
R1.9 (a) Since the standard deviation is 0.3004 ppm, a typical measurement of mercury per can of
tuna will be about 0.3004 ppm from the mean. (b) The IQR = 0.380 - 0.0708 = 0.3092 so any point
below 0.0708 - 1.5(0.3092) = -0.393 or above 0.38 + 1.5(0.3092) = 0.8438 would be considered
an outlier. Since the smallest value is 0.012, there are no low outliers. But the largest value is
1.50, which is larger than 0.8438. So there is at least one high outlier. (c) The distribution of the
amount of mercury in cans of tuna is highly skewed to the right. The median is 0.18 ppm and the
IQR is 0.3092 ppm.


R1.10 The albacore tuna generally has more mercury. Its minimum, first quartile, median and
third quartile are all larger than the respective values for light tuna. But that doesn't mean that
light tuna is always better. It has a much bigger spread of values with some cans having as much
as twice the amount of mercury as the largest amount in the albacore tuna. Both distributions are
skewed to the right with at least a couple of outliers on the right. The albacore tuna has a median
of 0.4 ppm and the light tuna has a median of 0.16 ppm. But the albacore has a smaller amount of
variation with an IQR of 0.1675 whereas the light tuna has an IQR of 0.2883.

AP Statistics Practice Test (page 78)


T1.1 d. Age and earned income are quantitative while marital status is categorical.
T1.2 e. The pie chart tells us what percent were manufactured in various countries. It does not
tell us anything about the actual gas mileages of the cars.
T1.3 b. US has the most cars, followed by Japan and then Germany. The next largest is Sweden
followed by France and Italy.
T1.4 b. Putting the measurements in order, the median is the 5th observation. The values in
order are: L 4.5 5.2 5.5 6.0 8.7 8.9 H H
T1.5 c. Just under half (about 62 of the 136) had under $10.
T1.6 c. The first quartile is somewhere between 0 and 10 and the third quartile is between 20
and 30. So the largest that the IQR could possibly be is 30 - 0 = 30.
T1.7 b. The third quartile is between the 30th and 31st observations.
T1.8 c. The mean salary of all workers will be somewhere between the mean salaries of the two
groups separately, however where it will be between these two numbers will depend on how
many workers are in each individual group.
T1.9 e. Among the small companies, 125 of the 200 surveys sent were returned. This is 62.5%.
T1.10 b. Among the small companies, 62.5% responded. Only 40.5% of the medium and 20%
of the large companies responded.
T1.11 d. Actually, high concentrations appear to have better weed control (fewer weeds growing)
than lower concentrations.
T1.12 (a) The histogram is given below.


(b) The first quartile is the median of the bottom 15 data points, or the 8th data value. Therefore
it is 30 minutes. The third quartile is the 23rd data point (the median of the top 15 data points),
which is 77 minutes. So the IQR = 77 - 30 = 47 minutes. Any point below 30 - 1.5(47) = -40.5 or above
77 + 1.5(47) = 147.5 is an outlier. So the observation of 151 minutes is an outlier. (c) It would be
better to use the median and IQR to describe the center and spread of this distribution because it
is skewed to the right.
T1.13 (a) The table is given below.

                           Diabetic Status
Birth Defects   Nondiabetic   Prediabetic   Diabetic   Totals
None            754           362           38         1154
One or more      31            13            9           53
Totals          785           375           47         1207

(b) The table is given below.

                           Diabetic Status
Birth Defects   Nondiabetic   Prediabetic   Diabetic
None            96.1%         96.5%         80.9%
One or more      3.9%          3.5%         19.1%

(c) The graph is given below.

(d) There does appear to be an association. Nondiabetics and Prediabetics appear to have babies
with birth defects at about the same rate. However, those with Diabetes have a much higher rate
of babies with birth defects.

T1.14 (a) The longest that any battery lasted was between 550 and 559 hours. (b) Someone
might prefer to use Brand X because it has a higher minimum lifetime. (c) On the other hand,
some might prefer Brand Y because it has a higher median lifetime.
T1.15 Given below are side-by-side boxplots and descriptive statistics for both the American
League and the National League.

Variable          N    Mean    StDev   Minimum   Q1      Median   Q3      Maximum
American League   14   56.93   12.69   35.00     47.50   57.50    68.00   77.00
National League   14   50.14   11.13   29.00     45.00   50.50    57.00   67.00

The data suggest that the number of home runs is somewhat lower in the National League. All 5
numbers in the 5-number summary and the mean are less for the National League teams than for
the American League teams. However, there is more variability among the American League
teams with a standard deviation of 12.69 compared to 11.13 for the National League. Both
distributions are reasonably symmetric with no outliers.
