for
FINAL EXAM
Chapter 1
Population
the complete collection of
elements (scores, people,
measurements, etc.) to be studied
Sample
a sub-collection of elements
drawn from a population
Final Review. Triola, Essentials of Statistics, Third Edition. Copyright 2008. Pearson Education, Inc.
Definitions
Discrete - Countable
Continuous - Measurements with no
gaps
Levels of Measurement
Nominal - names only
Ordinal - names with some order
Interval - differences are meaningful, but there is no natural zero
Ratio - differences are meaningful and there is a natural zero
Methods of Sampling
Random
Systematic
Convenience
Stratified
Cluster
Chapters 2,3
Frequency table classes: 0-4, 5-9, 10-14, 15-19, 20-24 (frequency 11 for class 15-19)
• Class Boundaries
• Class Midpoints
• Class Width
Frequency Tables
Axial Load    Frequency    Relative Frequency    Cumulative Frequency
200 - 209          9            0.051                    9
210 - 219          3            0.017                   12
220 - 229          5            0.029                   17
230 - 239          4            0.023                   21
240 - 249          4            0.023                   25
250 - 259         14            0.080                   39
260 - 269         32            0.183                   71
270 - 279         52            0.297                  123
280 - 289         38            0.217                  161
290 - 299         14            0.080                  175
Histogram of Axial Load Data
[Figure: histogram with Frequency (0 to 60) on the vertical axis and class boundaries 199.5, 209.5, ..., 299.5 on the horizontal axis]
Important Distributions
Normal
Uniform
Skewed Right
Skewed Left
Stem-and-Leaf Plots
Data:
10 11 15 23 27 28 38 38 39 39
40 41 44 45 46 46 52 57 58 65
Stem | Leaves
1    | 015
2    | 378
3    | 8899
4    | 014566
5    | 278
6    | 5
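The stem-and-leaf construction can be sketched in a few lines of Python. This is only an illustration, assuming two-digit whole numbers; the helper name `stem_and_leaf` is hypothetical and not from the slides.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group values by tens digit (stem) and collect units digits (leaves)."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # e.g. 27 -> stem 2, leaf 7
        rows[stem].append(str(leaf))
    return {stem: "".join(leaves) for stem, leaves in rows.items()}

# The 20 values listed on the slide
data = [10, 11, 15, 23, 27, 28, 38, 38, 39, 39,
        40, 41, 44, 45, 46, 46, 52, 57, 58, 65]

plot = stem_and_leaf(data)
for stem, leaves in plot.items():
    print(stem, "|", leaves)
```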
Measures of
Center
Mean
Median
Mode
Midrange
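As a rough illustration, the four measures of center can be computed with Python's standard `statistics` module. The data reused here are the 20 values from the stem-and-leaf slide; using that data set for this purpose is an assumption made only for the example.

```python
import statistics

data = [10, 11, 15, 23, 27, 28, 38, 38, 39, 39,
        40, 41, 44, 45, 46, 46, 52, 57, 58, 65]

mean = statistics.mean(data)              # sum of values / number of values
median = statistics.median(data)          # middle value of the sorted data
modes = statistics.multimode(data)        # most frequent value(s)
midrange = (min(data) + max(data)) / 2    # (lowest + highest) / 2

print(mean, median, modes, midrange)
```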
Mean from a frequency table
$\bar{x} = 14.4$ (rounded to one more decimal place than the data)
Quiz Scores    Midpoints    Frequency
0-4                2
5-9                7
10-14             12
15-19             17           11
20-24             22
Measure of Variation
Range = (highest score) - (lowest score)
Measure of Variation
Standard Deviation
a measure of variation of the scores
about the mean
(average deviation from the mean)
17
Measure of Variation
Variance
standard deviation squared
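A small sketch of the three measures of variation, again using the 20 values from the stem-and-leaf slide as illustrative data (an assumption for this example). `statistics.stdev` and `statistics.variance` compute the sample versions.

```python
import statistics

data = [10, 11, 15, 23, 27, 28, 38, 38, 39, 39,
        40, 41, 44, 45, 46, 46, 52, 57, 58, 65]

value_range = max(data) - min(data)     # highest score - lowest score
s = statistics.stdev(data)              # sample standard deviation
variance = statistics.variance(data)    # sample variance = s squared

print(value_range, round(s, 1), round(variance, 1))
```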
Same Means ($\bar{x} = 4$), Different Standard Deviations
[Figure: four frequency histograms over the values 1 to 7, with s = 0, s = 0.8, s = 1.0, and s = 3.0]
$\bar{x} - 2s$ (minimum usual value)
$\bar{x} + 2s$ (maximum usual value)
Range $\approx 4s$, so $s \approx \frac{\text{Range}}{4}$
Rough Estimates of
Usual Sample Values
minimum usual value = (mean) - 2 × (standard deviation)
minimum = $\bar{x} - 2s$
maximum usual value = (mean) + 2 × (standard deviation)
maximum = $\bar{x} + 2s$
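[Figure 2-13: bell-shaped distribution. 68% of values fall within 1 standard deviation of the mean and 95% within 2 standard deviations. Band areas, left to right: 0.1% (below $\bar{x} - 3s$), 2.4%, 13.5%, 34%, 34%, 13.5%, 2.4%, 0.1% (above $\bar{x} + 3s$).]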
Measures of Position
z score
Population: $z = \frac{x - \mu}{\sigma}$
Sample: $z = \frac{x - \bar{x}}{s}$
Interpreting Z Scores
[Figure: z-score number line from -3 to 3. Ordinary values: $-2 \le z \le 2$. Unusual values: $z < -2$ or $z > 2$.]
Other Measures of
Position
Quartiles and Percentiles
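Finding the kth percentile (Figure 3-6)
1. Sort the data. (Arrange the data in order of lowest to highest.)
2. Compute $L = \frac{k}{100} \cdot n$, where n = number of values and k = percentile in question.
3. Is L a whole number?
   Yes: The value of $P_k$ is midway between the Lth value and the next value in the sorted data. Add the Lth value and the next value, then divide the total by 2.
   No: Change L by rounding it up to the next larger whole number. The value of $P_k$ is the Lth value, counting from the lowest.
Example: 200 201 204 206 206 208 208 209 215 217 218
Find the 75th percentile.
$L = \frac{75}{100} \cdot 11 = 8.25$, which is not a whole number, so round up to L = 9. $P_{75}$ is the 9th value, 215.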
Quartiles
$Q_1 = P_{25}$
$Q_2 = P_{50}$
$Q_3 = P_{75}$
Boxplot
pulse rates (beats per minute) of smokers
Sorted data (n = 22):
52 52 60 60 60 60 63 63 66 67 68
69 71 72 73 75 78 80 82 83 88 90
5-number summary
• Minimum - 52
• first quartile Q1 - 60
• Median - 68.5
• third quartile Q3 - 78
• Maximum - 90
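The percentile procedure (compute L = (k/100)·n, then either average two neighbors or round L up) and the 5-number summary can be checked with a short Python sketch. The function name `percentile` is hypothetical, and k is assumed to be a whole number between 1 and 99.

```python
def percentile(sorted_values, k):
    """kth percentile using the locator L = (k / 100) * n."""
    n = len(sorted_values)
    whole, remainder = divmod(k * n, 100)   # exact integer arithmetic for L
    if remainder == 0:
        # L is a whole number: average the Lth value and the next value
        return (sorted_values[whole - 1] + sorted_values[whole]) / 2
    # L is not whole: round up to whole + 1 and take that value (1-based)
    return sorted_values[whole]

# Percentile example: 11 values, k = 75, L = 8.25 rounds up to 9
values = sorted([200, 201, 204, 206, 206, 208, 208, 209, 215, 217, 218])
p75 = percentile(values, 75)

# Pulse rates of smokers from the boxplot slide
pulse = sorted([52, 52, 60, 60, 60, 60, 63, 63, 66, 67, 68,
                69, 71, 72, 73, 75, 78, 80, 82, 83, 88, 90])
five_number = (pulse[0], percentile(pulse, 25), percentile(pulse, 50),
               percentile(pulse, 75), pulse[-1])

print(p75, five_number)
```

Library routines such as `statistics.quantiles` use interpolation rules that can give slightly different quartiles, which is why the slide's procedure is coded directly here.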
Boxplot (Box-and-Whisker Diagram)
[Figure: boxplot on a scale from 50 to 90 with minimum 52, $Q_1$ = 60, median 68.5, $Q_3$ = 78, maximum 90]
Chapters 4 and 5
30
Fundamentals of
Probability
31
P(A)
32
33
Rule 1
Relative frequency approach
Throwing a die 100 times and getting
15 threes
P(3) = 0.150
Rule 2
Classical approach
P(3 on a die) = 1/6 = 0.167
34
Probability Limits
• The probability of an impossible event is 0.
• The probability of an event that is certain to occur is 1.
$0 \le P(A) \le 1$
(0 = impossible to occur, 1 = certain to occur)
Complementary Events
The complement of event A, denoted by $\bar{A}$, consists of all outcomes in which event A does not occur.
$P(\bar{A})$ (read "not A")
• give the exact fraction or decimal, or
• round the final result to three significant digits
P(struck by lightning last year) ≈ 0.00000143
Definitions
Compound Event
Any event combining 2 or more
events
Notation
P(A or B) = P (event A occurs or
event B occurs or they
both occur)
Disjoint Events
A = Green ball
B = Blue ball
[Figure: disjoint events, labeled with the fractions 4/8, 1/8, and 5/8]
Overlapping Events
A = Even number
B = Number greater than 5
[Figure: overlapping events on the numbers 6, 7, 8, 9, 10; some events are counted twice]
Contingency Table
                 Homicide   Robbery   Assault   Totals
Stranger             12        379       727      1118
Acqu. or Rel.        39        106       642       787
Unknown              18         20        57        95
Totals               69        505      1426      2000
Complementary Events
A and $\bar{A}$ are disjoint events.
All simple events are either in A or $\bar{A}$.
$P(A) + P(\bar{A}) = 1$
Definitions
Independent Events
Two events A and B are independent if the
occurrence of one does not affect the
probability of the occurrence of the other.
Dependent Events
If A and B are not independent, they are
said to be dependent.
P(Ace) = $\frac{4}{52}$, P(King | Ace) = $\frac{4}{51}$
P(drawing Ace, then a King) = $\frac{4}{52} \cdot \frac{4}{51} = \frac{16}{2652} \approx 0.00603$
DEPENDENT EVENTS
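The multiplication for dependent events can be verified with exact fractions. This is a minimal sketch; the variable names are illustrative only.

```python
from fractions import Fraction

p_ace = Fraction(4, 52)             # 4 aces among 52 cards
p_king_given_ace = Fraction(4, 51)  # 4 kings among the 51 remaining cards

# Multiplication rule for dependent events (no replacement)
p_ace_then_king = p_ace * p_king_given_ace

print(p_ace_then_king, float(p_ace_then_king))
```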
Independent Events
Two selections
With replacement
P(both good) = P(good and good) = $\frac{4}{5} \cdot \frac{4}{5} = \frac{16}{25} = 0.64$
$0.60^8 = 0.0168$
Small Samples
from
Large Populations
If a small sample is drawn from a large population (if $n \le 5\%$ of N), you can
treat the events as independent.
48
Chapter 4
49
Probability Distribution
x (# of correct)    P(x)
0                   .05
1                   .10
2                   .25
3                   .40
4                   .15
5                   .05
[Figure: Probability Histogram of P(x), from 0.0 to 0.5, against # of correct answers, from 0 to 5]
Requirements for
Probability Distribution
$\sum P(x) = 1$
where x assumes all possible values
$0 \le P(x) \le 1$
for every value of x
Mean
$\mu = \sum [x \cdot P(x)]$
Variance
$\sigma^2 = \sum [x^2 \cdot P(x)] - \mu^2$
Standard Deviation
$\sigma = \sqrt{\sum [x^2 \cdot P(x)] - \mu^2}$
x    P(x)
0    .05
1    .10
2    .25
3    .40
4    .15
5    .05
$\mu = 2.7$
$\sigma = 1.2$
$\sigma^2 = 1.3$
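The mean, variance, and standard deviation above follow from the formulas $\mu = \sum [x \cdot P(x)]$ and $\sigma^2 = \sum [x^2 \cdot P(x)] - \mu^2$. The sketch below uses exact fractions so that the unrounded values (2.65, 1.3275, about 1.152) are visible before they are rounded to the slide's 2.7, 1.3, and 1.2.

```python
from fractions import Fraction
import math

# P(x) for x = number of correct answers, as listed on the slide
table = {0: "0.05", 1: "0.10", 2: "0.25", 3: "0.40", 4: "0.15", 5: "0.05"}
dist = {x: Fraction(p) for x, p in table.items()}

total = sum(dist.values())                                   # must equal 1
mu = sum(x * p for x, p in dist.items())                     # mean
variance = sum(x ** 2 * p for x, p in dist.items()) - mu ** 2
sigma = math.sqrt(variance)                                  # standard deviation

print(float(mu), float(variance), sigma)
```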
53
Binomial Experiment
Definition
1. The procedure must have a fixed number of trials.
2. The trials must be independent. (The outcome of any individual trial doesn't affect the probabilities in the other trials.)
3. Each trial must have all outcomes classified into two categories.
4. The probabilities must remain constant for each trial.
Binomial Probability
Formula
$P(x) = \frac{n!}{(n - x)!\,x!} \cdot p^x \cdot q^{n-x}$
n = 15
x     P(x)
0     0.206
1     0.343
2     0.267
3     0.129
4     0.043
5     0.010
6     0.002
7     0.0+
8     0.0+
9     0.0+
10    0.0+
11    0.0+
12    0.0+
13    0.0+
14    0.0+
15    0.0+
(0.0+ denotes a positive probability that rounds to 0.000.)
Binomial Probability
Formula
$P(x) = {}_nC_x \cdot p^x \cdot q^{n-x}$
${}_nC_x$: number of outcomes with exactly x successes among n trials
$p^x \cdot q^{n-x}$: probability of x successes among n trials for any one particular order
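A minimal sketch of the binomial probability formula using `math.comb` for ${}_nC_x$. The function name is hypothetical. The value p = 0.10 is an assumption: it is not stated on the slides, but it reproduces the n = 15 table shown earlier.

```python
from math import comb

def binomial_probability(n, x, p):
    """P(x) = nCx * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

# Assumed p = 0.10 (not given on the slide); n = 15 as in the table
probabilities = [round(binomial_probability(15, x, 0.10), 3) for x in range(16)]
print(probabilities)
```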
58
20C 3
17
59
Mean $\mu = n \cdot p$
Variance $\sigma^2 = n \cdot p \cdot q$
Standard Deviation $\sigma = \sqrt{n \cdot p \cdot q}$
$\mu = 1.4$ crashes
$\sigma = 1.1$ crashes
$\mu - 2\sigma = 1.4 - 2(1.1) = -0.8$ (or 0)
$\mu + 2\sigma = 1.4 + 2(1.1) = 3.6$
The usual number of US Air crashes out of seven randomly
selected crashes should be between -0.8 (or 0) and 3.6.
Four crashes would be unusual !
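The crash example combines $\mu = np$, $\sigma = \sqrt{npq}$, and the "within 2 standard deviations" rule for usual values. In the sketch below, n = 7 comes from the slide, while p = 0.20 is an assumption inferred from $\mu = np = 1.4$; values are rounded to one decimal place before applying the rule, as on the slide.

```python
import math

n = 7        # randomly selected crashes (from the slide)
p = 0.20     # assumed: inferred from mu = n * p = 1.4, not stated on the slide
q = 1 - p

mu = round(n * p, 1)                    # 1.4
sigma = round(math.sqrt(n * p * q), 1)  # 1.058... rounds to 1.1

minimum_usual = mu - 2 * sigma
maximum_usual = mu + 2 * sigma

def is_unusual(x):
    """A value is unusual if it lies outside mu +/- 2 sigma."""
    return x < minimum_usual or x > maximum_usual

print(minimum_usual, maximum_usual, is_unusual(4))
```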
Chapter 6
Normal Probability
Distributions
62
6- 2
The Standard Normal
Distribution
63
64
Definition
Standard Normal Distribution
a normal probability distribution that has a
mean of 0 and a standard deviation of 1, and the
total area under its density curve is equal to 1.
Table A-2
• Designed only for standard normal distribution
• Is on two pages: negative z-scores and positive z-scores
• Body of table is a cumulative area from the left up to a vertical boundary
• Avoid confusion between z-scores and areas
• Z-score hundredths is across the top row
Table A-2
[Figure: standard normal curve with x and z axes. With $\mu = 0$ and $\sigma = 1$, $z = \frac{x - 0}{1} = x$, and Area = Probability.]
Example:
$\mu = 0$
$\sigma = 1$
94.29% of the thermometers will give a reading below 1.58 degrees for freezing water.
Example:
$P(z > -1.23) = 0.8907$
Example:
$P(z < -2.00) = 0.0228$
$P(z < 1.50) = 0.9332$
$P(-2.00 < z < 1.50) = 0.9332 - 0.0228 = 0.9104$
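The Table A-2 lookups in these examples can be reproduced with `statistics.NormalDist`, whose `cdf` method returns the cumulative area from the left, just like the body of the table. This is only a sketch for checking the three results.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Area to the left of z = 1.58 (the thermometer example)
p_below = standard_normal.cdf(1.58)

# Area to the right of z = -1.23 is 1 minus the area to its left
p_above = 1 - standard_normal.cdf(-1.23)

# Area between two z scores is the difference of the cumulative areas
p_between = standard_normal.cdf(1.50) - standard_normal.cdf(-2.00)

print(round(p_below, 4), round(p_above, 4), round(p_between, 4))
```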
[Figure: bell-shaped distribution. 68% of values fall within 1 standard deviation of the mean and 95% within 2 standard deviations. Band areas, left to right: 0.1%, 2.4%, 13.5%, 34%, 34%, 13.5%, 2.4%, 0.1%.]
Notation
$P(a < z < b)$: between a and b
$P(z > a)$: greater than, at least, more than, not less than
$P(z < a)$: less than, at most, no more than, not greater than
6-3
Applications of
Normal Distributions
77
Converting to Standard
Normal Distribution
$z = \frac{x - \mu}{\sigma}$
Figure 6-12
$z = \frac{38.8 - 36.0}{1.4} = 2.00$
79
80
6.2 6.3
Finding Values of
Normal Distributions
81
$x = \mu + (z \cdot \sigma)$
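The two conversions are inverses of each other, which a short sketch makes explicit. The function names are hypothetical; the numbers come from the z = 2.00 example above.

```python
def z_score(x, mu, sigma):
    """Convert a value x to a standard score: z = (x - mu) / sigma."""
    return (x - mu) / sigma

def x_value(z, mu, sigma):
    """Convert a standard score back to a value: x = mu + z * sigma."""
    return mu + z * sigma

z = z_score(38.8, mu=36.0, sigma=1.4)
x = x_value(2.00, mu=36.0, sigma=1.4)
print(round(z, 2), round(x, 1))
```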
82
83
84
16.5
85
6-5
86
87
n becomes larger.
88
Notation
the mean of the sample means
$\mu_{\bar{x}} = \mu$
the standard deviation of sample means
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
(often called standard error of the mean)
Chapter 7
Estimates and
Sample Sizes
92
Definition
Confidence Interval
(or Interval Estimate)
a range (or an interval) of values used to
estimate the true value of the population
parameter
Lower # < population parameter < Upper #
As an example
93
$\hat{p} - E < p < \hat{p} + E$
where
$E = z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}}$
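p = population proportion
$\hat{p} = \frac{x}{n}$ = sample proportion of x successes in a sample of size n (pronounced "p-hat")
$\hat{q} = 1 - \hat{p}$ = sample proportion of failures in a sample of size n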
95
96
97
98
Example:
99
Example:
$E = 1.96 \sqrt{\frac{(0.51)(0.49)}{829}}$
E = 0.03403
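The margin of error and the resulting interval for a proportion can be checked numerically. This sketch uses the slide's values ($z_{\alpha/2} = 1.96$, $\hat{p} = 0.51$, n = 829); the function name is hypothetical.

```python
import math

def margin_of_error_proportion(z_crit, p_hat, n):
    """E = z * sqrt(p_hat * q_hat / n), with q_hat = 1 - p_hat."""
    q_hat = 1 - p_hat
    return z_crit * math.sqrt(p_hat * q_hat / n)

p_hat = 0.51
E = margin_of_error_proportion(1.96, p_hat, 829)
interval = (p_hat - E, p_hat + E)   # p_hat - E < p < p_hat + E

print(round(E, 5), tuple(round(v, 3) for v in interval))
```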
Example:
101
Example:
102
$\sigma$ Not Known
$\bar{x} - E < \mu < \bar{x} + E$
where
$E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
Critical t values
df          .005 (one tail)  .01 (one tail)   .025 (one tail)  .05 (one tail)   .10 (one tail)   .25 (one tail)
            .01 (two tails)  .02 (two tails)  .05 (two tails)  .10 (two tails)  .20 (two tails)  .50 (two tails)
1           63.657           31.821           12.706           6.314            3.078            1.000
2           9.925            6.965            4.303            2.920            1.886            .816
3           5.841            4.541            3.182            2.353            1.638            .765
4           4.604            3.747            2.776            2.132            1.533            .741
5           4.032            3.365            2.571            2.015            1.476            .727
6           3.707            3.143            2.447            1.943            1.440            .718
7           3.500            2.998            2.365            1.895            1.415            .711
8           3.355            2.896            2.306            1.860            1.397            .706
9           3.250            2.821            2.262            1.833            1.383            .703
10          3.169            2.764            2.228            1.812            1.372            .700
11          3.106            2.718            2.201            1.796            1.363            .697
12          3.054            2.681            2.179            1.782            1.356            .696
13          3.012            2.650            2.160            1.771            1.350            .694
14          2.977            2.625            2.145            1.761            1.345            .692
15          2.947            2.602            2.132            1.753            1.341            .691
16          2.921            2.584            2.120            1.746            1.337            .690
17          2.898            2.567            2.110            1.740            1.333            .689
18          2.878            2.552            2.101            1.734            1.330            .688
19          2.861            2.540            2.093            1.729            1.328            .688
20          2.845            2.528            2.086            1.725            1.325            .687
21          2.831            2.518            2.080            1.721            1.323            .686
22          2.819            2.508            2.074            1.717            1.321            .686
23          2.807            2.500            2.069            1.714            1.320            .685
24          2.797            2.492            2.064            1.711            1.318            .685
25          2.787            2.485            2.060            1.708            1.316            .684
26          2.779            2.479            2.056            1.706            1.315            .684
27          2.771            2.473            2.052            1.703            1.314            .684
28          2.763            2.467            2.048            1.701            1.313            .683
29          2.756            2.462            2.045            1.699            1.311            .683
Large (z)   2.575            2.327            1.960            1.645            1.282            .675
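n = 12
$\bar{x} = 26{,}227$
s = 15,873
$\alpha = 0.05$
$\alpha/2 = 0.025$
$t_{\alpha/2} = 2.201$
$E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}} = \frac{(2.201)(15{,}873)}{\sqrt{12}} = 10{,}085.3$
$\bar{x} - E < \mu < \bar{x} + E$
$26{,}227 - 10{,}085.3 < \mu < 26{,}227 + 10{,}085.3$
$\$16{,}141.7 < \mu < \$36{,}312.3$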
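The same confidence interval can be reproduced in a few lines. The critical value 2.201 is taken from the t table (df = 11, .05 two tails) rather than computed, since the standard library has no t distribution; the function name is hypothetical.

```python
import math

def t_interval(x_bar, s, n, t_crit):
    """Confidence interval for a mean when sigma is not known."""
    E = t_crit * s / math.sqrt(n)      # margin of error
    return E, x_bar - E, x_bar + E

E, lower, upper = t_interval(x_bar=26227, s=15873, n=12, t_crit=2.201)
print(round(E, 1), round(lower, 1), round(upper, 1))
```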
106
107
$n = \frac{(z_{\alpha/2})^2 \, \hat{p}\hat{q}}{E^2}$   Formula 7-2
$n = \frac{(z_{\alpha/2})^2 \cdot 0.25}{E^2}$   Formula 7-3
108
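$n = \frac{[z_{\alpha/2}]^2 \, \hat{p}\hat{q}}{E^2} = \frac{[1.645]^2 (0.169)(0.831)}{0.04^2} = 237.51965$, rounded up to 238 households
$n = \frac{[z_{\alpha/2}]^2 (0.25)}{E^2} = \frac{(1.645)^2 (0.25)}{0.04^2} = 422.81641$, rounded up to 423 households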
110
$E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$ (solve for n by algebra)
$n = \left[\frac{z_{\alpha/2} \, \sigma}{E}\right]^2$   Formula 7-5
111
Example:
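$\alpha = 0.01$
$z_{\alpha/2} = 2.575$
E = 0.25
s = 1.065
$n = \left[\frac{z_{\alpha/2} \, \sigma}{E}\right]^2 = \left[\frac{(2.575)(1.065)}{0.25}\right]^2 \approx 120.3$, rounded up to 121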
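The three sample-size formulas (7-2, 7-3, 7-5) share one pattern: compute the raw value, then always round up to the next whole number. A sketch with hypothetical function names, using the numbers from the examples above:

```python
import math

def sample_size_proportion(z_crit, E, p_hat=None):
    """Formula 7-2 when p_hat is known; Formula 7-3 (0.25) when it is not."""
    pq = p_hat * (1 - p_hat) if p_hat is not None else 0.25
    return math.ceil(z_crit ** 2 * pq / E ** 2)

def sample_size_mean(z_crit, sigma, E):
    """Formula 7-5: n = (z * sigma / E)^2, rounded up."""
    return math.ceil((z_crit * sigma / E) ** 2)

n_known = sample_size_proportion(1.645, 0.04, p_hat=0.169)
n_unknown = sample_size_proportion(1.645, 0.04)
n_mean = sample_size_mean(2.575, 1.065, 0.25)
print(n_known, n_unknown, n_mean)
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the requested E.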
112
Chapter 8
Hypothesis Testing
Claim:
113
H0:
H1:
114
Test Statistic
The test statistic is a value computed from
the sample data, and it is used in making
the decision about the rejection of the null
hypothesis.
Test statistic for proportions
$z = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}}$
Test statistic for mean
$t = \frac{\bar{x} - \mu_{\bar{x}}}{s / \sqrt{n}}$
Test statistic for standard deviation
$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$
Critical Region
Set of all values of the test statistic that
would cause a rejection of the
null hypothesis
Critical
Regions
118
Critical Value
Any value that separates the critical region
(where we reject the null hypothesis) from the
values of the test statistic that do not lead to
a rejection of the null hypothesis
Reject H0
Fail to reject H0
Critical Value
( z score )
Two-tailed, Right-tailed,
Left-tailed Tests
The tails in a distribution are the
extreme regions bounded
by critical values.
120
Decision Criterion
Traditional method:
Reject H0 if the test statistic falls
within the critical region.
Fail to reject H0 if the test
statistic does not fall within the
critical region.
121
Figure 8-7
Comprehensive
Hypothesis Test
123
Example:
$\hat{p} = 46 / 821 = 0.0560$
H0: p = 0.078
H1: p < 0.078
$\alpha = 0.01$
$z = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}} = \frac{0.056 - 0.078}{\sqrt{\frac{(0.078)(0.922)}{821}}} = -2.35$
Critical value: z = -2.33
The test statistic z = -2.35 falls within the critical region: reject H0.
There is sufficient evidence to support the claim that the air bag hospitalization rate is lower than the 7.8% rate for automatic safety belts.
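The traditional-method decision for this left-tailed test can be sketched as follows. The critical value -2.33 is taken from the slide; the variable names are illustrative.

```python
import math

p_hat = 0.056      # sample proportion, 46 / 821 rounded as on the slide
p = 0.078          # proportion claimed in H0
n = 821
q = 1 - p

# Test statistic for a proportion
z = (p_hat - p) / math.sqrt(p * q / n)

critical_value = -2.33                # alpha = 0.01, left-tailed
reject_h0 = z < critical_value        # test statistic in the critical region?

print(round(z, 2), reject_h0)
```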
124
8-5
Testing a Claim about a Mean:
$\sigma$ Not Known
125
Example:
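Sample weights (lb): 270 273 258 204 254 228 282
n = 7, df = 6
$\bar{x} = 252.7$ lb
s = 27.6 lb
$\alpha = 0.01$
Critical value: t = 3.143
$t = \frac{\bar{x} - \mu_{\bar{x}}}{s / \sqrt{n}} = \frac{252.7 - 165}{27.6 / \sqrt{7}} = 8.407$
The test statistic t = 8.407 exceeds 3.143: Reject H0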
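A quick check of the sample statistics and the t test statistic. The slide's rounded values ($\bar{x} = 252.7$, s = 27.6) are used for the test statistic so the result matches 8.407; the critical value 3.143 comes from the t table (df = 6, .01 one tail).

```python
import math
import statistics

weights = [270, 273, 258, 204, 254, 228, 282]
n = len(weights)

x_bar = round(statistics.mean(weights), 1)   # 252.7
s = round(statistics.stdev(weights), 1)      # 27.6

mu_claimed = 165
t = (x_bar - mu_claimed) / (s / math.sqrt(n))

critical_value = 3.143                 # df = n - 1 = 6, alpha = 0.01, right tail
reject_h0 = t > critical_value

print(x_bar, s, round(t, 3), reject_h0)
```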
127
Example:
270
273
258
204
254
228
282
Final conclusion:
There is sufficient evidence to support the claim that the
sample comes from a population with a mean greater
than 165 lbs.
128
8-6
Testing a Claim about a
Standard Deviation
or
Variance
129
Chi-Square Distribution
Test Statistic
$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$
n = sample size
$s^2$ = sample variance
$\sigma^2$ = population variance (given in null hypothesis)
130
131
n = 81
df = 80
$\alpha = 0.05$
$\alpha/2 = 0.025$
Critical values (Table A-4): $\chi^2 = 57.153$ and $\chi^2 = 106.629$
Test statistic: $\chi^2 = \frac{(n - 1)s^2}{\sigma^2} = 114.586$
Because 114.586 > 106.629, the test statistic falls within the critical region: Reject H0
Claim: $\sigma \ne 43.7$ (SUPPORT)
H0: $\sigma = 43.7$ (REJECT)
H1: $\sigma \ne 43.7$
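Table 8-3 Hypothesis Tests
Parameter: Proportion
  Conditions: $np \ge 5$ and $nq \ge 5$
  Distribution and Test Statistic: Normal: $z = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}}$
  Critical and P-values: Table A-2
Parameter: Mean
  Conditions: $\sigma$ not known and normally distributed or $n \ge 30$
  Distribution and Test Statistic: Student t: $t = \frac{\bar{x} - \mu_{\bar{x}}}{s / \sqrt{n}}$
  Critical and P-values: Table A-3
Parameter: Standard Deviation or Variance
  Conditions: Population normally distributed
  Distribution and Test Statistic: Chi-Square: $\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$
  Critical and P-values: Table A-4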
Chapter 10
Correlation
and
Regression
Overview
Paired Data
• is there a relationship?
• if so, what is the equation?
• use the equation for prediction
138
Definition
v Correlation
exists between two variables
when one of them is related to
the other in some way
139
Definition
v Scatterplot (or scatter diagram)
is a graph in which the paired
(x,y) sample data are plotted with
a horizontal x axis and a vertical y
axis. Each individual (x,y) pair is
plotted as a single point.
[Figure: scatterplot of Weight (lb.) against Length (in.) with the points (37, 34), (53, 80), (67.5, 344), (68.5, 360), (72, 348), (72, 416), (73, 332), (73.5, 262)]
Figure 10-2 Scatter Plots
(a) Positive
(b) Strong positive
(c) Perfect positive
(d) Negative
(e) Strong negative
(f) Perfect negative
(g) No Correlation (no linear correlation)
Definition
• Linear Correlation Coefficient r
measures strength of the linear
relationship between paired x- and y-quantitative values in a
sample
145
Definition
Linear Correlation Coefficient r
$r = \frac{n \sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2} \, \sqrt{n(\sum y^2) - (\sum y)^2}}$
Formula 10-1
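Formula 10-1 translates directly into code. The sketch below applies it to the eight (length, weight) points labeled on the scatterplot slide; the function name is hypothetical, and the resulting r for this data set is computed here rather than quoted from the slides.

```python
import math

def linear_correlation(xs, ys):
    """Linear correlation coefficient r by Formula 10-1."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# (length in., weight lb.) pairs read from the scatterplot labels
lengths = [53.0, 67.5, 72.0, 72.0, 73.5, 68.5, 73.0, 37.0]
weights = [80, 344, 416, 348, 262, 360, 332, 34]

r = linear_correlation(lengths, weights)
print(round(r, 3))
```

Swapping the two lists gives the same r, which illustrates the property that r does not depend on which variable is called x.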
146
Rounding the
Linear Correlation Coefficient r
• Round to three decimal places so that it can be compared to critical values in Table A-5
• Use calculator or computer if possible
147
[Table A-5: critical values of r for $\alpha = .05$ and $\alpha = .01$]
Properties of the
Linear Correlation Coefficient r
1. $-1 \le r \le 1$
2. Value of r does not change if all values of
either variable are converted to a different
scale.
3. The value of r is not affected by the choice of
x and y. Interchange x and y and the value of r
will not change.
4. r measures strength of a linear relationship.
$H_1: \rho \ne 0$
151
• Test statistic: r
• Critical values: Refer to Table A-5 (no degrees of freedom)
[Figure: r axis from -1 to 1. Reject $\rho = 0$ for $r \le -0.811$ or $r \ge 0.811$; fail to reject $\rho = 0$ in between. Sample data: r = 0.828]
0.27
1.41
2.19
2.83
2.19
1.81
0.85
3.05
y Household
$\alpha = 0.05$
n = 8
$H_0: \rho = 0$
$H_1: \rho \ne 0$
Test statistic is r = 0.842
153
$\alpha = 0.05$
$H_0: \rho = 0$
$H_1: \rho \ne 0$
Test statistic is r = 0.842
Table A-5 critical values of r
n      α = .05    α = .01
4      .950       .999
5      .878       .959
6      .811       .917
7      .754       .875
8      .707       .834
9      .666       .798
10     .632       .765
11     .602       .735
12     .576       .708
13     .553       .684
14     .532       .661
15     .514       .641
16     .497       .623
17     .482       .606
18     .468       .590
19     .456       .575
20     .444       .561
25     .396       .505
30     .361       .463
35     .335       .430
40     .312       .402
45     .294       .378
50     .279       .361
60     .254       .330
70     .236       .305
80     .220       .286
90     .207       .269
100    .196       .256
[Figure: r axis from -1 to 1. Reject $\rho = 0$ for $r \le -0.707$ or $r \ge 0.707$; fail to reject $\rho = 0$ in between. Sample data: r = 0.842]
155
10.3
Regression
156
Regression
Definition
• Regression Equation
Given a collection of paired data, the regression
equation
$\hat{y} = b_0 + b_1 x$
algebraically describes the relationship between the
two variables
• Regression Line
(line of best fit or least-squares line)
157
$\hat{y}$ is the dependent variable (response variable)
$\hat{y} = b_0 + b_1 x$ (compare $y = mx + b$)
$b_0$ = y-intercept
$b_1$ = slope
158
159
$b_1 = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}$ (slope)   Formula 10-2
$b_0 = \bar{y} - b_1 \bar{x}$ (y-intercept)   Formula 10-3
160
Rounding
the y-intercept b0 and the
slope b1
• Round to three significant digits
• If you use Formulas 10-2 and 10-3, try not to round intermediate values, or carry at least six significant digits.
x Length (in.):  53.0  67.5  72.0  72.0  73.5  68.5  73.0  37.0
y Weight (lb.):  80    344   416   348   262   360   332   34
$b_0 = -352$ (rounded)
$b_1 = 9.66$ (rounded)
$\hat{y} = -352 + 9.66x$
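The slope and intercept can be reproduced from the eight (length, weight) pairs shown on the scatterplot slide. This is a sketch with a hypothetical function name; it also evaluates the rounded equation at x = 60 inches.

```python
def least_squares(xs, ys):
    """Return (b0, b1) for the regression line y-hat = b0 + b1 * x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
    b0 = sum_y / n - b1 * sum_x / n                                # y-intercept
    return b0, b1

lengths = [53.0, 67.5, 72.0, 72.0, 73.5, 68.5, 73.0, 37.0]
weights = [80, 344, 416, 348, 262, 360, 332, 34]

b0, b1 = least_squares(lengths, weights)
b0_rounded, b1_rounded = round(b0), round(b1, 2)   # three significant digits here

predicted_weight = b0_rounded + b1_rounded * 60    # y-hat for a 60 in. bear
print(b0_rounded, b1_rounded, round(predicted_weight, 1))
```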
[Figure: scatterplot of Weight (lb.), 0 to 500, against Length (in.), 35 to 75]
Predictions
In predicting a value of y based on some
given value of x ...
1. If there is not a significant linear correlation, the best predicted y-value is $\bar{y}$.
2. If there is a significant linear correlation, the best predicted y-value is found by substituting the x-value into the regression equation.
164
165
$\hat{y} = -352 + 9.66x$
What is the weight of a bear that is 60 inches long?
Since the data does have a significant positive linear
correlation, we can use the regression equation
for prediction.
$\hat{y} = -352 + 9.66(60) = 227.6$ pounds
Chapter 11
Multinomial Experiments
And
Contingency Tables
170
11-2
Multinomial Experiments
171
Definition
Goodness-of-fit test
used to test the hypothesis that an
observed frequency distribution fits
(or conforms to) some claimed
distribution
172
Goodness-of-Fit Test
Notation
O = observed frequency
173
Expected Frequencies
If all expected frequencies are equal:
$E = \frac{n}{k}$
174
Expected Frequencies
If the expected frequencies are not all equal:
$E = n \cdot p$
each expected frequency is found by multiplying
the sum of all observed frequencies by the
probability for the category
Key Question
We need to measure the
discrepancy between O and E;
the test statistic will involve
their difference:
O-E
Test Statistic
$\chi^2 = \sum \frac{(O - E)^2}{E}$
Critical Values
1. Found in Table A-4 using k - 1 degrees of freedom, where k = number of categories
2. Goodness-of-fit hypothesis tests are always right-tailed.
Multinomial Experiment:
Goodness-of-Fit Test
H0: No difference between
observed and expected
probabilities
H1: at least one of the
probabilities is different
from the others
H0: p1 = p2 = p3 = . . . = pk
H1: at least one of the probabilities is
different from the others
179
Day                  Mon    Tues   Wed    Thurs  Fri
Observed accidents   31     42     18     25     31
H0: p1 = p2 = p3 = p4 = p5
H1: at least one of the probabilities is different from the others
180
Day                       Mon      Tues     Wed      Thurs    Fri
O: Observed accidents     31       42       18       25       31
E: Expected accidents     29.4     29.4     29.4     29.4     29.4
(O - E)^2 / E             0.0871   5.4000   4.4204   0.6585   0.0871
Test Statistic:
$\chi^2 = \sum \frac{(O - E)^2}{E} = 0.0871 + 5.4000 + 4.4204 + 0.6585 + 0.0871 = 10.6531$
Critical value: $\chi^2 = 9.488$
[Figure: chi-square distribution. Fail to reject p1 = p2 = p3 = p4 = p5 to the left of 9.488; reject p1 = p2 = p3 = p4 = p5 to the right.]
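The goodness-of-fit computation for equal expected frequencies can be sketched as follows. The critical value 9.488 (k - 1 = 4 degrees of freedom) is taken from the slide rather than computed.

```python
observed = [31, 42, 18, 25, 31]        # accidents, Mon through Fri

n = sum(observed)                      # total number of accidents
k = len(observed)                      # number of categories
expected = n / k                       # E = n / k when all E are equal

# Each category's contribution (O - E)^2 / E, then the test statistic
contributions = [(o - expected) ** 2 / expected for o in observed]
chi_square = sum(contributions)

critical_value = 9.488                 # Table A-4 value shown on the slide
reject_h0 = chi_square > critical_value   # right-tailed test

print(expected, round(chi_square, 4), reject_h0)
```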
182
$\alpha = 0.05$
Test Statistic falls within the critical region: REJECT the null hypothesis
Claim: Accidents occur with the same proportion (frequency);
that is, p1 = p2 = p3 = p4 = p5
H0: p1 = p2 = p3 = p4 = p5
H1: at least one of the probabilities is different from the others
183
[Figure: chi-square distribution with critical value $\chi^2 = 9.488$ ($\alpha = 0.05$). Fail to reject p1 = p2 = p3 = p4 = p5 to the left; reject to the right.]
Test Statistic falls within the critical region: REJECT the null hypothesis
We reject the claim that the accidents occur with equal proportions
(frequency) on the 5 workdays. (Although it appears Wednesday
has a lower accident rate, arriving at such a conclusion would
require other methods of analysis.)
Example: Frequencies of M&Ms
n = 100
Color     E = np                  Observed frequency   Expected frequency   (O - E)^2 / E
Brown     (100)(0.30) = 30        33                   30                   0.3
Yellow    (100)(0.20) = 20        26                   20                   1.8
Red       (100)(0.20) = 20        21                   20                   0.05
Orange    (100)(0.10) = 10        8                    10                   0.4
Green     (100)(0.10) = 10        7                    10                   0.9
Blue      (100)(0.10) = 10        5                    10                   2.5
Test Statistic
$\chi^2 = \sum \frac{(O - E)^2}{E} = 5.95$
Critical value: $\chi^2 = 11.071$ ($\alpha = 0.05$)
[Figure: chi-square distribution. Fail to reject to the left of 11.071; reject to the right.]
189
11-3
Contingency Tables
190
Definition
• Contingency Table (or two-way frequency table)
a table in which frequencies
correspond to two variables.
(One variable is used to categorize rows,
and a second variable is used to
categorize columns.)
Contingency tables have at least two
rows and at least two columns.
Definition
• Test of Independence
tests the null hypothesis that there is
no association between the row
variable and the column variable.
(The null hypothesis is the statement
that the row and column variables are
independent.)
192
Tests of Independence
H0 : The row variable is independent of the
column variable
H1 : The row variable is dependent (related to)
the column variable
This procedure cannot be used to establish a
direct cause-and-effect link between the variables in
question.
Dependence means only there is a relationship
between the two variables.
Test of Independence
Test Statistic
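$\chi^2 = \sum \frac{(O - E)^2}{E}$
Critical Values
1. Found in Table A-4 using (r - 1)(c - 1) degrees of freedom, where r is the number of rows and c is the number of columns
$E = \frac{(\text{row total})(\text{column total})}{\text{grand total}}$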
194
195
                            Homicide   Robbery   Assault   Row Total
Stranger                        12        379       727       1118
Acquaintance or Relative        39        106       642        787
Column Total                    51        485      1369       1905
196
Observed frequencies, with expected frequencies (E) in parentheses
                            Homicide      Robbery         Assault         Row Total
Stranger                    12 (29.93)    379 (284.64)    727 (803.43)    1118
Acquaintance or Relative    39 (21.07)    106 (200.36)    642 (565.57)    787
Column Total                51            485             1369            1905
$E = \frac{(1118)(51)}{1905} = 29.93$
$E = \frac{(1118)(485)}{1905} = 284.64$
etc.
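The expected frequencies and the test statistic for the test of independence can be sketched as below. Note that the code keeps the expected frequencies unrounded, so its total (about 119.33) differs slightly from a total obtained by first rounding each E to two decimal places. The critical value 5.991 is the Table A-4 value quoted on the slides.

```python
observed = [
    [12, 379, 727],    # Stranger: Homicide, Robbery, Assault
    [39, 106, 642],    # Acquaintance or Relative
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# E = (row total)(column total) / (grand total) for every cell
expected = [[r * c / grand_total for c in column_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)   # (r - 1)(c - 1)
reject_independence = chi_square > 5.991            # alpha = 0.05, df = 2

print(round(expected[0][0], 2), round(chi_square, 2), df, reject_independence)
```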
$\chi^2 = \sum \frac{(O - E)^2}{E}$
Each cell: observed frequency, (E), [(O - E)^2 / E]
                            Homicide                 Robbery                    Assault
Stranger                    12 (29.93) [10.741]      379 (284.64) [31.281]      727 (803.43) [7.271]
Acquaintance or Relative    39 (21.07) [15.258]      106 (200.36) [44.439]      642 (565.57) [10.329]
$\frac{(O - E)^2}{E} = \frac{(12 - 29.93)^2}{29.93} = 10.741$
198
Test Statistic:
$\chi^2 = 10.741 + 31.281 + 7.271 + 15.258 + 44.439 + 10.329 = 119.319$
Critical value: $\chi^2 = 5.991$ ($\alpha = 0.05$)
[Figure: chi-square distribution. Fail to reject independence to the left of 5.991; reject independence to the right.]
Reject independence
200
Test Statistic: $\chi^2 = 119.319$
With $\alpha = 0.05$ and (r - 1)(c - 1) = (2 - 1)(3 - 1) = 2 degrees of freedom, the critical value is $\chi^2 = 5.991$.
The test statistic falls within the critical region: Reject independence
201
Definition
Test of Homogeneity
tests the claim that different populations
have the same proportions of some
characteristics
202
[Table: Taxi has usable seat belt? (Yes / No) by city; legible entries: Pittsburgh, 42, 74, 87, 70]
Claim: The 3 cities have the same proportion of taxis with usable seat belts
H0: The 3 cities have the same proportion of taxis with usable seat belts
H1: The proportion of taxis with usable seat belts is not the same in all 3 cities
Critical value: $\chi^2 = 5.991$ ($\alpha = 0.05$)
[Figure: chi-square distribution. Fail to reject homogeneity to the left of 5.991; reject homogeneity to the right.]