Week 14: Chapter 9.1-9.2
CI & Hypotheses Tests for Two Large Samples

Chapter 9

Identifying the Target Parameter

Parameter: $\mu_1 - \mu_2$
Key words: mean difference; difference in averages
Type of data: quantitative

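Comparing Two Population Means: Independent Sampling

Point Estimators

$\bar{x}_1 - \bar{x}_2$ estimates $\mu_1 - \mu_2$.

To construct a confidence interval or conduct a hypothesis test, we need the standard deviation:

Single sample: $\sigma_{\bar{x}} \approx \dfrac{s}{\sqrt{n}}$

Two samples: $\sigma_{\bar{x}_1 - \bar{x}_2} \approx \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$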

Sampling Distribution for $(\bar{x}_1 - \bar{x}_2)$

1. The mean of the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is $(\mu_1 - \mu_2)$. That is:
   $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$
2. If the two samples are independent, the standard deviation of the sampling distribution is:
   $\sigma_{\bar{x}_1 - \bar{x}_2} \approx \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
3. The sampling distribution of $\bar{x}_1 - \bar{x}_2$ is approximately normal for large samples.
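The standard deviation in item 2 follows from the variance rule for independent random variables. A brief derivation, assuming each sample mean $\bar{x}_i$ has variance $\sigma_i^2/n_i$ (the single-sample result) and that $s_i^2$ is a reasonable stand-in for $\sigma_i^2$ when the samples are large:

```latex
\begin{align*}
\operatorname{Var}(\bar{x}_1 - \bar{x}_2)
  &= \operatorname{Var}(\bar{x}_1) + \operatorname{Var}(\bar{x}_2)
     && \text{(independent samples)} \\
  &= \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}, \\
\sigma_{\bar{x}_1 - \bar{x}_2}
  &= \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}
   \approx \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
     && \text{(large samples)}.
\end{align*}
```

The variances add even though the means are subtracted, because the sign is squared away when taking the variance.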

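Large Sample CI for $\mu_1 - \mu_2$

A $100(1-\alpha)\%$ confidence interval for $\mu_1 - \mu_2$ is:

$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} \approx (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$

Conditions Required for Valid Large-Sample Inferences about $\mu_1 - \mu_2$
1. The two samples are randomly and independently selected from the target populations.
2. The sample sizes are both large ($n_1 \ge 30$ and $n_2 \ge 30$).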

Example: College Retention

                      Private Colleges    Public Universities
n                     71                  32
Mean                  78.17               84
Standard Deviation    9.55                9.88
Variance              91.17               97.64

What does a 95% confidence interval tell us about retention rates?

Example: College Retention

Label the parameters

$\mu_1$ = the mean retention for first-year students at private institutions.
$\mu_2$ = the mean retention for first-year students at public institutions.

The parameter of interest is $\mu_1 - \mu_2$, the difference in mean retention between private and public institutions.

Example: College Retention

Verify the required conditions
1. Assume the two populations are independent and a sample from each type of institution is selected randomly.
2. Both samples are large ($n_1 = 71 \ge 30$ and $n_2 = 32 \ge 30$).

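Example: College Retention

Calculate the Confidence Interval

The 95% confidence interval for $\mu_1 - \mu_2$ is:

$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
$= (78.17 - 84) \pm 1.96\sqrt{\dfrac{91.17}{71} + \dfrac{97.64}{32}}$
$= -5.83 \pm 4.08$
$= (-9.91, -1.75)$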
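The interval calculation can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `large_sample_ci` and its parameter names are illustrative choices, not part of the course material.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_ci(mean1, var1, n1, mean2, var2, n2, confidence=0.95):
    """Large-sample z interval for mu1 - mu2 from summary statistics."""
    # z_{alpha/2}: upper-tail critical value of the standard normal.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Estimated standard error of (x1_bar - x2_bar).
    se = sqrt(var1 / n1 + var2 / n2)
    diff = mean1 - mean2
    return diff - z * se, diff + z * se


# Sample 1 = private colleges, sample 2 = public universities.
lower, upper = large_sample_ci(78.17, 91.17, 71, 84, 97.64, 32)
print(f"95% CI for mu1 - mu2: ({lower:.2f}, {upper:.2f})")
```

Both endpoints are negative, so the interval lies entirely below zero, which is what the interpretation of the example relies on.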

Example: College Retention

Interpretation

Using this estimation procedure over and over again for different samples, approximately 95% of the confidence intervals formed in this manner will enclose the true difference $\mu_1 - \mu_2$.

Therefore, we are 95% confident that the mean retention for private institutions is between 1.75 and 9.91 less than the mean retention for public institutions.

Comparing Means from Two Independent Populations (large sample)

Label the parameters
$\mu_1$ = the population mean for population I
$\mu_2$ = the population mean for population II

Hypotheses
Null Hypothesis
$H_0: \mu_1 - \mu_2 = D_0$

Alternative Hypotheses
$H_a: \mu_1 - \mu_2 < D_0$
$H_a: \mu_1 - \mu_2 > D_0$
$H_a: \mu_1 - \mu_2 \ne D_0$
$D_0$ = hypothesized difference between the means

Comparing Means from Two Independent Populations (large sample)

Test
We will use a large-sample z-test.

Required Conditions
1. Assume the two populations are independent and a sample from each population is selected randomly.
2. Samples are large.

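Comparing Means from Two Independent Populations (large sample)

Level of Significance
$\alpha = ?$ (usually given; if not, choose $\alpha = .05$)

Rejection region
$z > z_{\alpha}$ for an upper-tailed test ($H_a: \mu_1 - \mu_2 > D_0$)
$z < -z_{\alpha}$ for a lower-tailed test ($H_a: \mu_1 - \mu_2 < D_0$)
$z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$ for a two-tailed test ($H_a: \mu_1 - \mu_2 \ne D_0$)

Calculation of the test statistic

$z = \dfrac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sigma_{(\bar{x}_1 - \bar{x}_2)}}$

where $\sigma_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} \approx \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$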

Comparing Means from Two Independent Populations (large sample)

Decision and conclusion
If the calculated value of the test statistic belongs to the rejection region, we reject $H_0$; otherwise we fail to reject $H_0$.
Write the conclusion in the context of the problem.

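Making a Decision Using the p-Value

1. Upper-tailed test, $H_a: \mu_1 - \mu_2 > D_0$:
   p-value $= P(z \ge z_{\text{cal}})$
2. Lower-tailed test, $H_a: \mu_1 - \mu_2 < D_0$:
   p-value $= P(z \le z_{\text{cal}})$
3. Two-tailed test, $H_a: \mu_1 - \mu_2 \ne D_0$:
   p-value $= 2P(z \ge |z_{\text{cal}}|)$

Decision Rule:
If p-value $< \alpha$, we reject $H_0$.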
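The three p-value rules map directly onto the standard normal CDF. A small sketch, assuming Python's standard-library `statistics.NormalDist`; the helper name `z_p_value` and the tail labels `'upper'`, `'lower'`, `'two'` are hypothetical names chosen for this illustration.

```python
from statistics import NormalDist


def z_p_value(z_cal, tail):
    """p-value of a large-sample z test for the given tail.

    tail is 'upper', 'lower', or 'two' (labels chosen for this sketch).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "upper":
        return 1 - std_normal.cdf(z_cal)
    if tail == "lower":
        return std_normal.cdf(z_cal)
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z_cal)))
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


alpha = 0.05
p = z_p_value(1.5, "two")  # illustrative z value, not taken from the slides
print(f"p-value = {p:.4f}; reject H0: {p < alpha}")
```

Because the normal distribution is continuous, using strict or non-strict inequalities in the rules gives the same p-value.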

Example: College Retention 2

                      Private Colleges    Public Universities
n                     71                  32
Mean                  78.17               84
Standard Deviation    9.55                9.88
Variance              91.17               97.64

Test the hypothesis that there is no significant difference in retention at privates and publics. Use $\alpha = .05$.

Example: College Retention 2

Label the parameters
$\mu_1$ = the mean retention for first-year students at private institutions.
$\mu_2$ = the mean retention for first-year students at public institutions.

Hypotheses
Null Hypothesis
$H_0: \mu_1 - \mu_2 = 0$
Alternative Hypothesis
$H_a: \mu_1 - \mu_2 \ne 0$

Example: College Retention 2

Test
We will use a large-sample z-test to compare the mean retention of private and public institutions.

Verifying required conditions
1. Assume the two populations are independent and a sample from each type of institution is selected randomly.
2. Samples are large.

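Example: College Retention 2

Level of Significance
$\alpha = 0.05$

Rejection Region
$z < -1.96$ or $z > 1.96$

Calculation of test statistic

$z_{\text{cal}} = \dfrac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{(78.17 - 84) - 0}{\sqrt{\dfrac{91.17}{71} + \dfrac{97.64}{32}}} = \dfrac{-5.83}{2.08} = -2.799$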

Example: College Retention 2

Decision and Conclusion
Since $z_{\text{cal}} = -2.799$ is less than $-1.96$, $z_{\text{cal}}$ belongs to the rejection region, so we reject the null hypothesis.
We conclude that there is sufficient evidence at the 5% level of significance that the mean retention rates at private and public institutions differ.

Example: College Retention 2

Using the p-value approach
p-value $= 2P(z < -2.799) = 2 \times 0.0025 \approx .0050$

OR
p-value $= 2P(z > 2.799) = 2 \times 0.0025 \approx .0050$

Decision and Conclusion
Since p-value $< \alpha$, we reject $H_0$.
We conclude that there is sufficient evidence at the 5% level of significance that the mean retention rates at privates and publics differ.
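The test statistic, the rejection-region decision, and the p-value for this example can be reproduced in a few lines. This is a sketch using only the standard library; the variable names are illustrative. Full-precision arithmetic gives a statistic of about -2.80 and a p-value of about 0.0051, which agree with the slide values (-2.799 and .0050, the latter from a rounded table entry) up to rounding.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics (sample 1 = private, sample 2 = public).
mean1, var1, n1 = 78.17, 91.17, 71
mean2, var2, n2 = 84.0, 97.64, 32
d0 = 0.0      # hypothesized difference under H0
alpha = 0.05

se = sqrt(var1 / n1 + var2 / n2)              # estimated standard error
z_cal = ((mean1 - mean2) - d0) / se           # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = .05

reject = abs(z_cal) > z_crit                       # rejection-region rule
p_value = 2 * (1 - NormalDist().cdf(abs(z_cal)))   # two-tailed p-value

print(f"z_cal = {z_cal:.2f}, critical value = {z_crit:.2f}")
print(f"p-value = {p_value:.4f}, reject H0: {reject}")
```

The two decision rules always agree: the statistic falls in the two-tailed rejection region exactly when the p-value is below $\alpha$.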

Week 14: Chapter 9.2 (cont.)


CI & Hypotheses Tests for Two Small Samples

Small Sample Confidence Interval for $(\mu_1 - \mu_2)$ with Equal Variances

Required conditions:
1. The two samples are randomly selected in an independent manner from the two target populations.
2. Both sampled populations have distributions that are approximately normal.
3. The population variances are equal, but unknown.
   Check: We will use the following rule to check the equality of population variances:
   $\dfrac{1}{2} \le \dfrac{s_1}{s_2} \le 2$
4. Sample sizes are small.

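Small Sample CI for $\mu_1 - \mu_2$

A $100(1-\alpha)\%$ confidence interval for $\mu_1 - \mu_2$ is:

$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,(n_1 + n_2 - 2)}\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$

where

$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$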

Example: Teaching Reading

TABLE 9.2 Reading Test Scores for Slow Learners

New Method:       80  80  79  81  76  66  71  76  70  85
Standard Method:  79  62  70  68  73  76  86  73  72  68  75  66
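The summary statistics used later in this example can be recomputed from the raw scores. A minimal sketch with the standard library; it assumes the first ten scores in Table 9.2 belong to the new method and the remaining twelve to the standard method, a grouping that reproduces the means and standard deviations quoted in the example (76.4 and 5.8348; 72.3333 and 6.3437).

```python
from statistics import mean, stdev

# Scores from Table 9.2, grouped by teaching method (grouping is an
# assumption checked against the summary statistics in the example).
new_method = [80, 80, 79, 81, 76, 66, 71, 76, 70, 85]
standard_method = [79, 62, 70, 68, 73, 76, 86, 73, 72, 68, 75, 66]

for label, scores in (("New", new_method), ("Standard", standard_method)):
    # stdev() is the sample standard deviation (divides by n - 1).
    print(f"{label}: n = {len(scores)}, mean = {mean(scores):.4f}, "
          f"s = {stdev(scores):.4f}")
```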

Example: Teaching Reading

Label the parameters
$\mu_1$ = the mean reading test score of slow learners taught with the new method.
$\mu_2$ = the mean reading test score of slow learners taught with the standard method.

$\mu_1 - \mu_2$ = the difference in mean reading test scores between slow learners taught with the new method and those taught with the standard method.

Example: Teaching Reading

Verifying required conditions
1. The samples are randomly and independently selected from the populations of slow learners taught by the new method and by the standard method.
2. The test scores are approximately normally distributed for both teaching methods.

Example: Teaching Reading

Verifying required conditions
3. The variance of the test scores is the same for the two populations.
   Check: We will use the following rule to check the equality of population variances:
   $\dfrac{1}{2} \le \dfrac{s_1}{s_2} \le 2$
   Here $\dfrac{s_1}{s_2} = \dfrac{5.8348}{6.3437} = .9198$, which is between $\dfrac{1}{2}$ and 2.
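The rule of thumb for equal variances is easy to wrap in a small helper. A sketch only; the function name `sds_comparable` is hypothetical, and the rule itself is the one stated on the slide, not a formal test of equal variances.

```python
def sds_comparable(s1, s2):
    """Apply the slide's rule of thumb: 1/2 <= s1/s2 <= 2.

    Returns (rule_satisfied, ratio).
    """
    ratio = s1 / s2
    return 0.5 <= ratio <= 2.0, ratio


# Sample standard deviations from the reading example.
ok, ratio = sds_comparable(5.8348, 6.3437)
print(f"s1/s2 = {ratio:.4f}; equal-variance assumption plausible: {ok}")
```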

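Example: Teaching Reading

The 95% confidence interval for $\mu_1 - \mu_2$ is

$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,(n_1 + n_2 - 2)}\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$

where $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$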

Example: Teaching Reading

Given,

                   Mean       Std. Dev.    n
New Method         76.4       5.8348       10
Standard Method    72.3333    6.3437       12

$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \dfrac{(10 - 1)(5.8348)^2 + (12 - 1)(6.3437)^2}{10 + 12 - 2} = 37.45$

df $= n_1 + n_2 - 2 = 10 + 12 - 2 = 20$

$t_{\alpha/2} = t_{.025} = 2.086$

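Example: Teaching Reading

You can calculate:

$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$
$= (76.4 - 72.33) \pm t_{.025}\sqrt{37.45\left(\dfrac{1}{10} + \dfrac{1}{12}\right)}$
$= 4.07 \pm 2.086 \times 2.62$
$= 4.07 \pm 5.47$
$= (-1.40, 9.54)$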
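The pooled-variance interval can be verified with a short script. This is a sketch under stated assumptions: the function names are illustrative, and the critical value 2.086 is taken from the slide (a t table with 20 degrees of freedom) rather than computed, since the standard library has no t distribution. Carrying full precision gives roughly (-1.40, 9.53); the slide's upper endpoint of 9.54 reflects intermediate rounding.

```python
from math import sqrt


def pooled_variance(s1, n1, s2, n2):
    """Pooled estimate s_p^2 of the common population variance."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)


def pooled_t_ci(mean1, s1, n1, mean2, s2, n2, t_crit):
    """Small-sample interval for mu1 - mu2; t_crit comes from a t table."""
    sp2 = pooled_variance(s1, n1, s2, n2)
    margin = t_crit * sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = mean1 - mean2
    return diff - margin, diff + margin


# Sample 1 = new method, sample 2 = standard method.
sp2 = pooled_variance(5.8348, 10, 6.3437, 12)
lower, upper = pooled_t_ci(76.4, 5.8348, 10, 72.3333, 6.3437, 12, t_crit=2.086)
print(f"s_p^2 = {sp2:.2f}")
print(f"95% CI for mu1 - mu2: ({lower:.2f}, {upper:.2f})")
```

The interval contains 0, so at the 95% level the data do not rule out equal mean scores for the two methods.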

Example: Teaching Reading

Interpretation
Using this estimation procedure over and over again for different samples, approximately 95% of the confidence intervals formed in this manner will enclose the true difference $\mu_1 - \mu_2$.
Therefore, we are 95% confident that the difference in mean test scores between the new method of teaching and the standard method falls somewhere in the interval from $-1.40$ to $9.54$.

Comparing Means from Two Independent Populations (small sample)

Label the parameters
$\mu_1$ = the population mean for population I
$\mu_2$ = the population mean for population II

Hypotheses
Null Hypothesis
$H_0: \mu_1 - \mu_2 = D_0$

Alternative Hypotheses
$H_a: \mu_1 - \mu_2 < D_0$
$H_a: \mu_1 - \mu_2 > D_0$
$H_a: \mu_1 - \mu_2 \ne D_0$
$D_0$ = hypothesized difference between the means

Comparing Means from Two Independent Populations (small sample)

Test
We will use the small-sample two-sample t-test.

$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \sim t$ with $(n_1 + n_2 - 2)$ df

Required Conditions
1. The two samples are randomly selected in an independent manner from the two target populations.
2. Both sampled populations have distributions that are approximately normal.
3. The population variances are equal, but unknown.

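Comparing Means from Two Independent Populations (small sample)

Level of significance
$\alpha = ?$ (usually given; if not, choose $\alpha = .05$)

Rejection Region
$t > t_{\alpha}$ for an upper-tailed test ($H_a: \mu_1 - \mu_2 > D_0$)
$t < -t_{\alpha}$ for a lower-tailed test ($H_a: \mu_1 - \mu_2 < D_0$)
$t > t_{\alpha/2}$ or $t < -t_{\alpha/2}$ for a two-tailed test ($H_a: \mu_1 - \mu_2 \ne D_0$)

Calculation of test statistic

$t_{\text{cal}} = \dfrac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \sim t$ with $(n_1 + n_2 - 2)$ df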

Comparing Means from Two Independent Populations (small sample)

Decision and conclusion
If the calculated value of the test statistic belongs to the rejection region, we reject $H_0$; otherwise we fail to reject $H_0$.
Write the conclusion in the context of the problem.

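Finding p-Values for a Test

1. Upper-tailed test, $H_a: \mu_1 - \mu_2 > D_0$:
   p-value $= P(t_{n_1 + n_2 - 2} > t_{\text{cal}})$
2. Lower-tailed test, $H_a: \mu_1 - \mu_2 < D_0$:
   p-value $= P(t_{n_1 + n_2 - 2} < t_{\text{cal}})$
3. Two-tailed test, $H_a: \mu_1 - \mu_2 \ne D_0$:
   p-value $= 2P(t_{n_1 + n_2 - 2} > |t_{\text{cal}}|)$

Decision Rule:
If p-value $< \alpha$, we reject $H_0$.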

Example: Class Time

Does class time affect performance?
The test performance of students in two sections of international trade, meeting at different times, was compared.

                      8:00 AM Class    9:30 AM Class
Mean                  78               82
Standard Deviation    14               17
Variance              196              289
n                     21               21

With $\alpha = .05$, test $H_0: \mu_1 = \mu_2$.

Example: Class Time

Label the parameters
Let $\mu_1$ be the mean test score of students in the 8:00 am section of the international trade class.
Let $\mu_2$ be the mean test score of students in the 9:30 am section of the international trade class.

Hypotheses
Null Hypothesis
$H_0: \mu_1 - \mu_2 = 0$

Alternative Hypothesis
$H_a: \mu_1 - \mu_2 \ne 0$

Example: Class Time

Test
We will use the small-sample t-test.

Required Conditions
1. Assume the 8:00 am and 9:30 am classes are independent and samples from these classes are chosen randomly.
2. Both sampled populations have distributions that are approximately normal.
3. Since $\dfrac{s_1}{s_2} = \dfrac{14}{17} = .824$ lies between $\dfrac{1}{2}$ and 2, we assume that the population variances are equal, but unknown.

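Example: Class Time

Level of significance
$\alpha = 0.05$

Rejection Region
$t < -2.021$ or $t > 2.021$ (note that df $= 40$)

Calculation of test statistic

$t_{\text{cal}} = \dfrac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \dfrac{(78 - 82) - 0}{\sqrt{(242.5)\left(\dfrac{1}{21} + \dfrac{1}{21}\right)}} = -.832$

where

$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \dfrac{20(196) + 20(289)}{40} = 242.5$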
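The pooled variance and test statistic for the class-time example can be checked as follows. A sketch with illustrative variable names; the critical value 2.021 is copied from the slide (t table, 40 df) rather than computed.

```python
from math import sqrt

mean1, var1, n1 = 78.0, 196.0, 21   # 8:00 AM class
mean2, var2, n2 = 82.0, 289.0, 21   # 9:30 AM class
d0 = 0.0                            # hypothesized difference under H0
t_crit = 2.021                      # t_{.025} with 40 df, from the slide

# Pooled variance: weighted average of the two sample variances.
sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
t_cal = ((mean1 - mean2) - d0) / sqrt(sp2 * (1 / n1 + 1 / n2))

# Two-tailed rejection-region rule.
reject = abs(t_cal) > t_crit
print(f"s_p^2 = {sp2}, t_cal = {t_cal:.3f}, reject H0: {reject}")
```

With equal sample sizes the pooled variance is simply the average of the two sample variances, (196 + 289) / 2 = 242.5.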

Example: Class Time

Decision and Conclusion
Since $t_{\text{cal}} = -.832$ is greater than $-2.021$ and less than $2.021$, $t_{\text{cal}}$ does not belong to the rejection region, so we do not reject the null hypothesis.
We conclude that there is insufficient evidence at the 5% level of significance to say that the mean test scores of students in the 8:00 am and 9:30 am sections of the international trade class differ.

Example: Class Time

Decision using the p-value approach
p-value $= 2P(t_{40} < -.832)$; since $P(t_{40} < -.832) > .10$, the p-value is $> .20$.

OR
p-value $= 2P(t_{40} > .832)$; since $P(t_{40} > .832) > .10$, the p-value is $> .20$.

Decision and conclusion
Since p-value $> \alpha$, we do not reject $H_0$.
We conclude that there is insufficient evidence at the 5% level of significance to say that the mean test scores of students in the 8:00 am and 9:30 am sections of the international trade class differ.

Week 14: Chapter 9.3


Paired Differences

Comparison of Paired Samples

Paired data
Calculate the difference within each pair: $d = x_1 - x_2$.
Paired t-tests use these differences: they calculate the mean of the differences and the standard error of the differences.

Paired Difference Confidence Interval for $\mu_d = \mu_1 - \mu_2$

Let $\mu_d$ be the mean of the population of differences between Population I and Population II.
A $100(1-\alpha)\%$ confidence interval for $\mu_d = \mu_1 - \mu_2$ is

$\bar{x}_d \pm t_{\alpha/2}\dfrac{s_d}{\sqrt{n_d}}$

where
$\bar{x}_d$ = sample mean difference
$s_d$ = sample standard deviation of differences
$n_d$ = number of pairs observed
$t_{\alpha/2}$ = t-critical value with $n_d - 1$ df

Paired Difference Confidence Interval for $\mu_d = \mu_1 - \mu_2$

Required Conditions
1. A random sample of differences is selected from the target population of differences.
2. The population of differences has a distribution that is approximately normal.

Example: Starting Salaries

Data on Annual Salaries for Matched Pairs of College Graduates

Pair    Male       Female     Diff = Male - Female
1       $29,300    $28,800     500
2       $41,500    $41,600    -100
3       $40,400    $39,800     600
4       $38,500    $38,500       0
5       $43,500    $42,600     900
6       $37,800    $38,000    -200
7       $69,500    $69,200     300
8       $41,200    $40,100    1100
9       $38,400    $38,200     200
10      $59,200    $58,500     700

Compare the starting salaries using a 95% confidence interval.
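The paired differences and their summary statistics can be computed directly from the table. A minimal sketch with the standard library; it assumes the entry printed as "$38,00" for the sixth pair is $38,000 (consistent with that pair's listed difference of -200) and that the fourth pair's difference is 0.

```python
from math import sqrt
from statistics import mean, stdev

# Salaries by pair, in table order (pair 6 female value assumed $38,000).
male = [29300, 41500, 40400, 38500, 43500, 37800, 69500, 41200, 38400, 59200]
female = [28800, 41600, 39800, 38500, 42600, 38000, 69200, 40100, 38200, 58500]

# d = male - female for each matched pair.
diffs = [m - f for m, f in zip(male, female)]

d_bar = mean(diffs)             # sample mean difference
s_d = stdev(diffs)              # sample standard deviation of differences
se = s_d / sqrt(len(diffs))     # standard error of the mean difference

print(f"differences: {diffs}")
print(f"mean = {d_bar:.1f}, s_d = {s_d:.4f}, standard error = {se:.4f}")
```

Once the differences are formed, the analysis is a one-sample problem on `diffs`; the two original salary columns are no longer needed.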

Example: Starting Salaries

Label the parameters
$\mu_1$ = the mean starting salary for male recent college graduates
$\mu_2$ = the mean starting salary for female recent college graduates
The parameter of interest is $\mu_d = \mu_1 - \mu_2$, the mean of the paired differences in starting salary between male and female recent college graduates.

Example: Starting Salaries

Verifying all required conditions
1. Assume that a random sample of 10 differences is selected from all starting salary differences between recent college graduate males and females.
2. The starting salary differences between recent college graduate males and females have a distribution that is approximately normal.

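Example: Starting Salaries

The 95% confidence interval for $\mu_d = \mu_1 - \mu_2$ is computed from the summary statistics:

Column    n     Mean    Std. Dev.
Diff      10    400     434.6135

df $= n_d - 1 = 10 - 1 = 9$
$t_{\alpha/2} = t_{.025} = 2.262$

$\bar{x}_d \pm t_{\alpha/2}\dfrac{s_d}{\sqrt{n_d}} = 400 \pm 2.262 \times \dfrac{434.6135}{\sqrt{10}} = (89.0962, 710.9038)$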
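The paired interval can be reproduced from the summary statistics. A sketch only; `paired_t_ci` is an illustrative name, and the critical value 2.262 is the rounded table value from the slide. With that rounded value the endpoints come out near 89.12 and 710.88; the slide's (89.0962, 710.9038) appears to use a more precise critical value.

```python
from math import sqrt


def paired_t_ci(d_bar, s_d, n_d, t_crit):
    """Interval for mu_d: d_bar +/- t_crit * s_d / sqrt(n_d)."""
    margin = t_crit * s_d / sqrt(n_d)
    return d_bar - margin, d_bar + margin


# Summary statistics of the 10 salary differences; t_{.025} with 9 df.
lower, upper = paired_t_ci(400, 434.6135, 10, t_crit=2.262)
print(f"95% CI for mu_d: ({lower:.2f}, {upper:.2f})")
```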

Example: Starting Salaries

Interpretation
Using this estimation procedure over and over again for different samples, approximately 95% of the confidence intervals formed in this manner will enclose the true mean difference $\mu_d = \mu_1 - \mu_2$.
Therefore, we are 95% confident that the true mean difference between the starting salaries of male and female recent college graduates falls between $89.10 and $710.90.

Hypothesis Test for Paired Difference $\mu_d = \mu_1 - \mu_2$ (small samples)

Label the parameters
$\mu_1$ = the population mean for population I
$\mu_2$ = the population mean for population II

Hypotheses
Null Hypothesis
$H_0: \mu_d = D_0$

Alternative Hypothesis
$H_a: \mu_d < D_0$
$H_a: \mu_d > D_0$
$H_a: \mu_d \ne D_0$
$D_0$ = hypothesized difference between the means

Hypothesis Test for Paired Difference $\mu_d = \mu_1 - \mu_2$ (small samples)

Test
We will use the small-sample single-sample t-test for paired data.

Required conditions
1. A random sample of differences is selected from the target population of differences.
2. The population of differences has a distribution that is approximately normal.

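Hypothesis Test for Paired Difference $\mu_d = \mu_1 - \mu_2$ (small samples)

Level of Significance
$\alpha = ?$ (usually given; if not, choose $\alpha = .05$)

Rejection Region
$t > t_{\alpha}$ for an upper-tailed test ($H_a: \mu_d > D_0$)
$t < -t_{\alpha}$ for a lower-tailed test ($H_a: \mu_d < D_0$)
$t > t_{\alpha/2}$ or $t < -t_{\alpha/2}$ for a two-tailed test ($H_a: \mu_d \ne D_0$)

Calculation of test statistic

$t_{\text{cal}} = \dfrac{\bar{x}_d - D_0}{s_d/\sqrt{n_d}} \sim t$ with $(n_d - 1)$ df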

Hypothesis Test for Paired Difference $\mu_d = \mu_1 - \mu_2$ (small samples)

Decision and conclusion
If the calculated value of the test statistic belongs to the rejection region, we reject $H_0$; otherwise we fail to reject $H_0$.

Write the conclusion in the context of the problem.

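Finding p-Values for a Test

1. Upper-tailed test, $H_a: \mu_d > D_0$:
   p-value $= P(t_{n_d - 1} > t_{\text{cal}})$
2. Lower-tailed test, $H_a: \mu_d < D_0$:
   p-value $= P(t_{n_d - 1} < t_{\text{cal}})$
3. Two-tailed test, $H_a: \mu_d \ne D_0$:
   p-value $= 2P(t_{n_d - 1} > |t_{\text{cal}}|)$

Decision Rule
If p-value $< \alpha$, we reject $H_0$.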

Example: Tongue Twisters

Assume that a random sample of 21 differences in reading response times between the tongue twister group and the control group was selected for the reading experiment.

Example: Tongue Twisters

Label the parameters
$\mu_1$ = the mean reading response time for the tongue twister group
$\mu_2$ = the mean reading response time for the control group
Parameter of Interest: $\mu_d = \mu_1 - \mu_2$

Hypotheses
Null Hypothesis
$H_0: \mu_d = 0$
Alternative Hypothesis
$H_a: \mu_d \ne 0$

Example: Tongue Twisters

Test
We will use the t-test for paired data.

Required conditions
1. We are told that the sample was selected randomly.
2. Assume that the difference in reading response times between the tongue twister group and the control group has a normal distribution.

Example: Tongue Twisters

Level of Significance
$\alpha = .05$

Calculation of test statistic

$t_{\text{cal}} = \dfrac{.25 - 0}{.78/\sqrt{21}} = 1.47$

p-value
p-value $= 2P(t_{20} > 1.47)$; since $P(t_{20} > 1.47) > .05$, the p-value is $> .10$.
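The paired test statistic can be confirmed with a few lines. This sketch reads .25 as the sample mean difference and .78 as the sample standard deviation of the differences, which is what their positions in the formula imply; the variable names are illustrative.

```python
from math import sqrt

d_bar = 0.25   # sample mean difference (as read from the slide)
s_d = 0.78     # sample standard deviation of differences (as read from the slide)
n_d = 21       # number of pairs
d0 = 0.0       # hypothesized mean difference under H0

t_cal = (d_bar - d0) / (s_d / sqrt(n_d))
df = n_d - 1

print(f"t_cal = {t_cal:.2f} with {df} df")
```

The p-value bound on the slide then comes from a t table for 20 df; the standard library does not include a t distribution, so that step is not computed here.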

Example: Tongue Twisters

Decision and conclusion

Since p-value $> \alpha$, we fail to reject $H_0$.

At the 5% level of significance, we do not have sufficient evidence to say that the mean reading response times of the tongue twister group and the control group differ.