
TESTING THE DIFFERENCES

Differences in Means from Large Samples



1 Instead of comparing the sample mean with the population mean, there are instances where two
sample means are compared.

For example, in experimental research, the mean obtained from an experimental group (a group
receiving experimental treatment) needs to be compared with the mean of the control group (the
group without experimental treatment) so as to determine whether the treatment truly has an
effect or not.

Another example is when comparing two different brands of product, say two brands of
cement. If it is desired to determine whether one is better than the other in terms of average
compressive strengths, a comparison of the sample means is needed.

2 The same basic steps used in hypothesis testing are used in comparing the means of two samples.
The null hypothesis, $H_0$, and the alternative hypothesis, $H_1$, are also used.

In a two-tailed test, the null hypothesis $H_0: \mu_1 = \mu_2$ states that there is no difference between
the two means while the alternative hypothesis $H_1: \mu_1 \neq \mu_2$ states that there is a difference. Both
hypotheses must be stated together as follows:
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$

If the two means have no difference, subtracting one from the other gives zero. If the means differ,
the difference will not be zero. With these, the hypotheses may also be written as:

$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 \neq 0$

A right-tailed test will have these hypotheses:
$H_0: \mu_1 \leq \mu_2$
$H_1: \mu_1 > \mu_2$

which can also be written as:
$H_0: \mu_1 - \mu_2 \leq 0$
$H_1: \mu_1 - \mu_2 > 0$

The hypotheses for a left-tailed test are:
$H_0: \mu_1 \geq \mu_2$
$H_1: \mu_1 < \mu_2$

or:
$H_0: \mu_1 - \mu_2 \geq 0$
$H_1: \mu_1 - \mu_2 < 0$


3 There is no need to know the population means, however.
4 At times, instead of 0, the difference may be a specific value $k$.
Thus,
In a two-tailed test:
$H_0: \mu_1 - \mu_2 = k$
$H_1: \mu_1 - \mu_2 \neq k$
In a right-tailed test:
$H_0: \mu_1 - \mu_2 \leq k$
$H_1: \mu_1 - \mu_2 > k$
In a left-tailed test:
$H_0: \mu_1 - \mu_2 \geq k$
$H_1: \mu_1 - \mu_2 < k$




5 The assumptions used to test the difference are:
The samples must be independent of each other, meaning there must be no relationship
between the subjects in each sample.
The populations from which the samples were obtained must be normally distributed and the
standard deviations of the variable must be known, or the sample sizes must be equal to or
greater than 30. Herein lies the definition of large samples.

6 When the foregoing conditions are met, the z-test is used, in which the test value is obtained
using:

$$z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Where:
$\bar{X}_1, \bar{X}_2$ = sample means
$\mu_1, \mu_2$ = population means
When the null hypothesis is $\mu_1 = \mu_2$, then $\mu_1 - \mu_2 = 0$
$s_1, s_2$ = sample standard deviations
$n_1, n_2$ = sample sizes
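
As a minimal sketch of the z test value defined above, the calculation can be written in Python. The function name `two_sample_z` and its parameter names are illustrative assumptions and not part of these notes; only the standard `math` module is used.

```python
import math


def two_sample_z(mean1, mean2, s1, s2, n1, n2, hypothesized_diff=0.0):
    """Return the z test value for the difference between two sample means.

    hypothesized_diff is mu1 - mu2 under the null hypothesis (0 by default).
    """
    # Standard error of the difference between the two sample means
    standard_error = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return ((mean1 - mean2) - hypothesized_diff) / standard_error


# Wage figures quoted in this section (Php/day), two samples of 50 laborers
z = two_sample_z(265.26, 241.83, 16.86, 14.49, 50, 50)
print(round(z, 2))  # 7.45
```

Passing a nonzero `hypothesized_diff` covers the case where the null hypothesis states a specific difference $k$ instead of 0.
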
7 Example
In a survey it was found that the average wage for construction laborers in City A is Php
265.26/day while in City B it is Php 241.83/day. Assume that the data were obtained from two
samples of 50 laborers each and that the standard deviations were Php 16.86/day and Php
14.49/day, respectively. At a significance level of 0.05, can it be concluded that there is a
significant difference in the rates?

Solution:
State the hypotheses and identify the claim
$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 \neq 0$ (claim)

Find the critical value(s)
Inasmuch as $n \geq 30$ and the standard deviations are known, refer to the
z-distribution table.
$\alpha = 0.05$
Since this is a two-tailed test, each tail area is half of the total tail area.
area = $0.5 - \alpha/2$
area = $0.5 - 0.05/2$
area = 0.475 (area to be identified in the table)

From the table, this area corresponds to 1.90 + 0.06 = 1.96
C.V. = +1.96
C.V. = -1.96
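Determine the test value using

$$z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

$\bar{X}_1 = 265.26$
$\bar{X}_2 = 241.83$
$\mu_1 - \mu_2 = 0$
$s_1 = 16.86$
$s_2 = 14.49$
$n_1 = 50$
$n_2 = 50$

$$z = \frac{(265.26 - 241.83) - 0}{\sqrt{\dfrac{(16.86)^2}{50} + \dfrac{(14.49)^2}{50}}}$$

$z = 7.45$ (test value)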

Decide on whether to accept or reject the null hypothesis
The test value $z = 7.45$ exceeds C.V. = +1.96

Inasmuch as the test value is in the rejection region, the null hypothesis must be
rejected (while the alternative hypothesis is accepted)

Summarize the results
There is enough evidence to support the claim that the means are not equal and that
there is a significant difference between the wage rates of laborers in City A and City B
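
The table lookup and the decision step of this example can be sketched in Python as follows. This is only an illustration: the helper names `z_critical` and `reject_null` are assumptions, and the critical value is computed with `statistics.NormalDist` from the standard library instead of being read from a printed z-distribution table.

```python
from statistics import NormalDist


def z_critical(alpha, tails=2):
    """Positive critical value of the standard normal distribution."""
    # A two-tailed test splits alpha equally between the two tails
    tail_area = alpha / 2 if tails == 2 else alpha
    return NormalDist().inv_cdf(1 - tail_area)


def reject_null(z, alpha, tail="two"):
    """Return True when the test value z falls in the rejection region."""
    if tail == "two":
        return abs(z) > z_critical(alpha, tails=2)
    if tail == "right":
        return z > z_critical(alpha, tails=1)
    if tail == "left":
        return z < -z_critical(alpha, tails=1)
    raise ValueError("tail must be 'two', 'right', or 'left'")


print(round(z_critical(0.05), 2))  # 1.96
print(reject_null(7.45, 0.05))     # True
```
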

8 The confidence interval for the difference between two means can also be found. When hypothesizing
that the difference between two means is zero ($\mu_1 - \mu_2 = 0$), and the confidence interval actually
contains zero, then the null hypothesis is accepted. Otherwise it is rejected.

9 The confidence interval for the difference between two means is given by:

$$(\bar{X}_1 - \bar{X}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < (\mu_1 - \mu_2) < (\bar{X}_1 - \bar{X}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

Note that the sample standard deviation $s$ can be used in place of $\sigma$ when $n \geq 30$

10 Example

Find the 95% confidence interval for the difference of the two means in the example in item 7

Solution:
95% = 0.95
½ (0.95) = 0.475

In the standard normal distribution table (z-distribution), locate the value nearest to 0.475, then
find the corresponding z-value.

The value 0.4750 is found at the intersection of the row 1.9 and column 0.06.
Thus, $z_{\alpha/2}$ = 1.9 + 0.06
= 1.96

From item 7
$\bar{X}_1 = 265.26$
$\bar{X}_2 = 241.83$
$s_1 = 16.86$
$s_2 = 14.49$
$n_1 = 50$
$n_2 = 50$

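Using

$$(\bar{X}_1 - \bar{X}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < (\mu_1 - \mu_2) < (\bar{X}_1 - \bar{X}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

$$(265.26 - 241.83) - 1.96\sqrt{\frac{(16.86)^2}{50} + \frac{(14.49)^2}{50}} < (\mu_1 - \mu_2) < (265.26 - 241.83) + 1.96\sqrt{\frac{(16.86)^2}{50} + \frac{(14.49)^2}{50}}$$

$17.27 < (\mu_1 - \mu_2) < 29.59$ (confidence interval)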



Note that since the confidence interval does not contain 0, the null hypothesis must be
rejected, which is the same conclusion as in item 7
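
A short Python sketch of the large-sample confidence interval is given here as an illustration. The function name `z_interval_for_difference` is an assumption, and $z_{\alpha/2}$ is obtained from `statistics.NormalDist` in place of the printed table, so it carries more decimal places than 1.96.

```python
import math
from statistics import NormalDist


def z_interval_for_difference(mean1, mean2, s1, s2, n1, n2, confidence=0.95):
    """Return (lower, upper) bounds for mu1 - mu2 using the z distribution."""
    # z value that leaves (1 - confidence) / 2 in each tail
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    difference = mean1 - mean2
    return difference - margin, difference + margin


# Wage figures from the example in this section
lower, upper = z_interval_for_difference(265.26, 241.83, 16.86, 14.49, 50, 50)
print(round(lower, 2), round(upper, 2))  # 17.27 29.59
print(lower <= 0 <= upper)               # False, so zero is not in the interval
```
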

Differences in Variances and Standard Deviations

11 Differences in variances and standard deviations are determined using the F-test
12 If two independent samples are selected from two normally distributed populations in which the
population variances are equal, and if the sample variances $s_1^2$ and $s_2^2$ are compared as $s_1^2 / s_2^2$,
the sampling distribution of this ratio of variances is the F-distribution.

13 Characteristics of the F-distribution:
The values of F cannot be negative because variances are never negative
The distribution is positively skewed
The mean of F is approximately equal to 1
The distribution is a family of curves based on the degrees of freedom of the variance of the
numerator and the degrees of freedom of the variance of the denominator

14 The expression for the F-test is given by:

$$F = \frac{s_1^2}{s_2^2}$$

Where $s_1^2$ is the larger of the two variances. The sample from which this larger variance is
obtained has a sample size also designated as $n_1$.

There are also two degrees of freedom: $n_1 - 1$ = d.f.N. (numerator) and $n_2 - 1$ = d.f.D. (denominator)
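
The rule that the larger variance goes in the numerator can be sketched in Python as below. The function name `f_test_value` is a hypothetical helper; the figures in the call (variances 36 and 10, sample sizes 26 and 18) are the heart-rate figures used in this section.

```python
def f_test_value(var_a, n_a, var_b, n_b):
    """Return (F, d.f.N., d.f.D.) with the larger variance as the numerator."""
    if var_a >= var_b:
        larger, n_larger, smaller, n_smaller = var_a, n_a, var_b, n_b
    else:
        larger, n_larger, smaller, n_smaller = var_b, n_b, var_a, n_a
    # Degrees of freedom follow the sample that supplied each variance
    return larger / smaller, n_larger - 1, n_smaller - 1


# The order of the arguments does not matter; the function sorts them
print(f_test_value(10, 18, 36, 26))  # (3.6, 25, 17)
```

Because the larger variance is always on top, F is never less than 1, which is why only the right-hand critical value is needed.
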

15 The F-distribution table depends on the $\alpha$ value. This means that each $\alpha$ value has its own
F-distribution table.

For a right-tailed or left-tailed test, identify the table corresponding to the $\alpha$ value;

For a two-tailed test, identify the table corresponding to $\alpha/2$

16 Example
A medical researcher wishes to determine whether the variances of the heart rates (in beats per
minute) of laborers who smoke cigarettes and laborers who do not smoke differ. The non-smokers
have a variance of 10 and a sample size of 18, while the smokers have a sample size of 26
with a variance of 36. At a significance level of 0.05, determine if there is enough evidence for the
claim.

Solution:
State the hypotheses and identify the claim
$H_0: \sigma_1^2 - \sigma_2^2 = 0$
$H_1: \sigma_1^2 - \sigma_2^2 \neq 0$ (claim)

Find the critical value(s)
$\alpha = 0.05$

Note that since the variance of 36 is larger, this is $s_1^2$, so $n_1 = 26$.
Thus, d.f.N. = $n_1 - 1$ = 26 - 1 = 25
d.f.D. = $n_2 - 1$ = 18 - 1 = 17
Since this is a two-tailed test, use $\alpha/2$ = 0.05/2 = 0.025
Refer to the 0.025 F-distribution table, not the 0.05 table

Note that d.f.N. = 25 is not found in the table.
In situations like this, use the nearest smaller degree of freedom.
The closest smaller d.f.N. in the table is 24.
Thus, using d.f.N. = 24 and d.f.D. = 17,
C.V. = +2.56

Determine the test value using

$$F = \frac{s_1^2}{s_2^2}$$

$s_1^2 = 36$
$s_2^2 = 10$

$$F = \frac{36}{10}$$

$F = 3.6$ (test value)

Decide on whether to accept or reject the null hypothesis
The test value $F = 3.6$ exceeds C.V. = 2.56

Inasmuch as the test value is in the rejection region, the null hypothesis must be
rejected (while the alternative hypothesis is accepted)

Summarize the results
There is enough evidence to support the claim that the variances of the heart rates of
smokers and non-smokers are different.

Small Independent Samples

17 A sample is small when its size is less than 30
18 The t test is used when the population standard deviations are unknown and one or both samples
are less than 30 in size. The populations must be normally or approximately normally distributed.
The samples must also be independent, meaning they are not related.
19 However, there is a need to check whether the variances are equal or not equal. For this
the F test is used.

20 When the variances are assumed equal, the test value is given by:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

Where the variables are as previously defined
Degrees of freedom are equal to $n_1 + n_2 - 2$

When it is hypothesized that there is no difference in means, of course, $\mu_1 - \mu_2$ is 0
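
The equal-variance test value can be sketched in Python as follows. The function name `pooled_t` is an assumption made for illustration; the figures in the call are the salary figures (means 26,800 and 25,400, standard deviations 600 and 450, sample sizes 10 and 8) that appear in the example of this section.

```python
import math


def pooled_t(mean1, mean2, s1, s2, n1, n2, hypothesized_diff=0.0):
    """Return (t, d.f.) for two small independent samples, equal variances."""
    degrees_of_freedom = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their d.f.
    pooled_variance = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / degrees_of_freedom
    standard_error = math.sqrt(pooled_variance) * math.sqrt(1 / n1 + 1 / n2)
    t = ((mean1 - mean2) - hypothesized_diff) / standard_error
    return t, degrees_of_freedom


t, df = pooled_t(26800, 25400, 600, 450, 10, 8)
print(round(t, 2), df)  # 5.47 16
```
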


21 When the variances are NOT equal, the test value is given by:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Degrees of freedom are equal to the smaller of $n_1 - 1$ or $n_2 - 1$
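
For the unequal-variance case, a comparable sketch is shown below. The function name `unequal_variance_t` and the numbers in the call are hypothetical and chosen only to illustrate the formula and the degrees-of-freedom rule; they do not come from these notes.

```python
import math


def unequal_variance_t(mean1, mean2, s1, s2, n1, n2, hypothesized_diff=0.0):
    """Return (t, d.f.) for two small independent samples, unequal variances."""
    standard_error = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    t = ((mean1 - mean2) - hypothesized_diff) / standard_error
    # Conservative rule from the notes: the smaller of n1 - 1 and n2 - 1
    degrees_of_freedom = min(n1 - 1, n2 - 1)
    return t, degrees_of_freedom


# Hypothetical data: means 50 and 45, s = 6 and 3, n = 9 and 16
t, df = unequal_variance_t(50, 45, 6, 3, 9, 16)
print(round(t, 2), df)  # 2.34 8
```
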

22 Example
A researcher wants to determine whether the salaries of civil engineers employed by British firms
are higher than those of civil engineers employed by Arabian-owned firms. Sample salaries of civil
engineers from both types of firms were obtained, from which the means and standard deviations of
their salaries were computed, as follows:

                      British        Arabian
Mean                  USD 26,800     USD 25,400
Standard deviation    USD 600        USD 450
Sample size           10             8

At a significance level of 0.01, can it be concluded that civil engineers in British firms receive
more than those in Arabian companies?

Solution:

First determine whether the variances can be considered equal using the F test
State the hypotheses and identify the claim
$H_0: \sigma_1^2 - \sigma_2^2 = 0$ (claim)
$H_1: \sigma_1^2 - \sigma_2^2 \neq 0$

Find the critical value(s)
$\alpha = 0.01$

The standard deviation of 600 is bigger than 450, so the former is $s_1$ while the latter is
$s_2$. From which,

$s_1 = 600$, $n_1 = 10$, $s_2 = 450$, $n_2 = 8$,

d.f.N. = $n_1 - 1$ = 10 - 1 = 9
d.f.D. = $n_2 - 1$ = 8 - 1 = 7
For a two-tailed test, use $\alpha/2$ = 0.01/2 = 0.005

Refer to the 0.005 F-distribution table to obtain C.V.
C.V. = +8.51
Determine the test value using

$$F = \frac{s_1^2}{s_2^2}$$

$s_1^2 = (600)^2$
$s_2^2 = (450)^2$

$$F = \frac{(600)^2}{(450)^2}$$

$F = 1.78$ (test value)

Decide on whether to accept or reject the null hypothesis
The test value $F = 1.78$ is less than C.V. = 8.51

The test value does not fall within the rejection region;
accept the null hypothesis (and reject the alternative hypothesis)

Summarize the results
There is not enough evidence to reject the claim that the variances of the salaries of civil
engineers employed by British and Arabian firms are equal

Since the sample sizes are small, use the t test for determining whether there is a difference
in means.

State the hypotheses and identify the claim
$H_0: \mu_1 \leq \mu_2$
$H_1: \mu_1 > \mu_2$ (claim)

Find the critical value(s)
$\alpha = 0.01$

Since the variances are equal, use d.f. = $n_1 + n_2 - 2$
d.f. = 10 + 8 - 2
= 16

From the t-distribution table,
C.V. = +2.583

Determine the test value using

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

$\bar{X}_1 = 26800$
$\bar{X}_2 = 25400$
$\mu_1 - \mu_2 = 0$ (since there is no specified difference between the means)
$s_1 = 600$
$s_2 = 450$
$n_1 = 10$
$n_2 = 8$

$$t = \frac{(26800 - 25400) - 0}{\sqrt{\dfrac{(10 - 1)(600)^2 + (8 - 1)(450)^2}{10 + 8 - 2}}\,\sqrt{\dfrac{1}{10} + \dfrac{1}{8}}}$$

$t = 5.47$ (test value)

Decide on whether to accept or reject the null hypothesis
The test value $t = 5.47$ exceeds C.V. = 2.583, which means the test value is in the rejection
region.
Inasmuch as the test value is in the rejection region, the null hypothesis must be rejected
(while the alternative hypothesis is accepted)

Summarize the results
There is enough evidence to support the claim that the salaries of civil engineers
employed by British firms are higher than those paid by Arabian firms.







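23 The confidence interval for the difference of two means when samples are small and
independent can also be determined as follows:

When variances are EQUAL:

$$(\bar{X}_1 - \bar{X}_2) - t_{\alpha/2}\sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < (\mu_1 - \mu_2) < (\bar{X}_1 - \bar{X}_2) + t_{\alpha/2}\sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

When variances are NOT EQUAL:

$$(\bar{X}_1 - \bar{X}_2) - t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < (\mu_1 - \mu_2) < (\bar{X}_1 - \bar{X}_2) + t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$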
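
Both small-sample intervals can be sketched with one Python function. This is an illustration under stated assumptions: the name `t_interval_for_difference` is hypothetical, and the critical value $t_{\alpha/2}$ must be supplied by the reader from a t-distribution table because the standard library has no t distribution. The value 2.921 used in the call is an assumed table value for a 99% interval with 16 degrees of freedom; it does not appear in these notes.

```python
import math


def t_interval_for_difference(mean1, mean2, s1, s2, n1, n2,
                              t_critical, equal_variances=True):
    """Return (lower, upper) bounds for mu1 - mu2 from small independent samples."""
    if equal_variances:
        pooled = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
        standard_error = math.sqrt(pooled) * math.sqrt(1 / n1 + 1 / n2)
    else:
        standard_error = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    margin = t_critical * standard_error
    difference = mean1 - mean2
    return difference - margin, difference + margin


# Salary figures from this section; t_critical = 2.921 is an assumed table value
lower, upper = t_interval_for_difference(26800, 25400, 600, 450, 10, 8, 2.921)
print(round(lower, 2), round(upper, 2))
```
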





Small Dependent Samples

24 Dependent samples are those that are related. For instance, to test the effect of a certain drug on
vision, the subjects are pretested (subjects are tested BEFORE applying the drug). Data are gathered
from this sample. Then the drug is applied, after which the subjects are tested again and data are
obtained. Since the subjects are the same, the samples (two samples, one with no drug while the
other receives the drug) are related. In cases such as these, the pretest will have an influence on
the results of the posttest (test administered AFTER treatment or application of whatever is being
determined). This must therefore be taken into account.

Samples can also be dependent when the subjects are matched. For instance, to study the
effectiveness of learning by computers as compared to the traditional lecture-discussion method,
students may be paired. Those with the same IQs are paired together, and afterward the two
students in each pair are assigned to different sample groups: one student to computer instruction
and the other to the group taught by the traditional method. Evidently, the two sample groups
are related and are thus dependent.

25 A special t test for dependent means is used in which the hypotheses are:

For a two-tailed test:
$H_0: \mu_D = 0$
$H_1: \mu_D \neq 0$

For a right-tailed test:
$H_0: \mu_D \leq 0$
$H_1: \mu_D > 0$

For a left-tailed test:
$H_0: \mu_D \geq 0$
$H_1: \mu_D < 0$

Where: $\mu_D$ = expected mean of the differences of the matched pairs

26 The general procedure in finding the test value involves the following:
Find the difference of the values of the pairs of data
$D = X_1 - X_2$

Find the mean of the differences

$$\bar{D} = \frac{\sum D}{n}$$

where $n$ is the number of data pairs
Find the standard deviation of the differences

$$s_D = \sqrt{\frac{\sum D^2 - \dfrac{(\sum D)^2}{n}}{n - 1}}$$

Find the estimated standard error of the differences

$$s_{\bar{D}} = \frac{s_D}{\sqrt{n}}$$

Finally, find the test value using:

$$t = \frac{\bar{D} - \mu_D}{s_D / \sqrt{n}}$$

d.f. = $n - 1$

$\mu_D$ must be in accord with the hypothesis
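
The steps of this procedure can be sketched in Python as below. The function name `paired_t` is an assumption; the before and after readings in the call are the six cholesterol readings (mg/dL) tabulated in this section.

```python
import math


def paired_t(before, after, hypothesized_mean_diff=0.0):
    """Return (mean of D, s_D, t, d.f.) for paired data, following the steps above."""
    n = len(before)
    differences = [x1 - x2 for x1, x2 in zip(before, after)]  # D = X1 - X2
    sum_d = sum(differences)
    sum_d_squared = sum(d ** 2 for d in differences)
    mean_d = sum_d / n
    s_d = math.sqrt((sum_d_squared - sum_d ** 2 / n) / (n - 1))
    standard_error = s_d / math.sqrt(n)
    t = (mean_d - hypothesized_mean_diff) / standard_error
    return mean_d, s_d, t, n - 1


before = [210, 235, 208, 190, 172, 244]
after = [190, 170, 210, 188, 173, 228]
mean_d, s_d, t, df = paired_t(before, after)
print(round(mean_d, 1), round(s_d, 1), round(t, 3), df)
```

Carrying full precision through every step gives a test value of about 1.608. Rounding $\bar{D}$ to 16.7 and $s_D$ to 25.4 before dividing, as is done by hand, gives 1.610; the decision is the same either way.
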

27 Example
A researcher desires to determine whether a person's cholesterol level will change if the person eats
oats for breakfast every day. Six subjects were selected and their cholesterol levels in mg/dL
were measured. They then ate oats for breakfast every day for 4 weeks, after which their
cholesterol levels were measured again, with data as shown. At a significance level of 0.10, can it be
concluded that the cholesterol level has changed? Assume normal distribution.

Subject           1     2     3     4     5     6
Before ($X_1$)    210   235   208   190   172   244
After ($X_2$)     190   170   210   188   173   228

Solution:

State the hypotheses and identify the claim

$H_0: \mu_D = 0$ and $H_1: \mu_D \neq 0$ (claim)
Note that "change" can mean either higher or lower. This is thus a two-tailed test

Find the critical value
$\alpha = 0.10$
For a two-tailed test, use $\alpha/2 = 0.05$
d.f. = 6 - 1
d.f. = 5

From the t-distribution table,
C.V. = +2.015 and C.V. = -2.015

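Determine the test value using

$$\bar{D} = \frac{\sum D}{n}$$, mean of the differences

$$\bar{D} = \frac{100}{6} = 16.7$$

$$s_D = \sqrt{\frac{\sum D^2 - \dfrac{(\sum D)^2}{n}}{n - 1}}$$, standard deviation of the differences

$$s_D = \sqrt{\frac{4890 - \dfrac{(100)^2}{6}}{6 - 1}} = 25.4$$

$$t = \frac{\bar{D} - \mu_D}{s_D / \sqrt{n}}$$

$$t = \frac{16.7 - 0}{25.4 / \sqrt{6}} = 1.610$$ (test value)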

Decide on whether to accept or reject the null hypothesis
The test value $t = 1.610$ is less than C.V. = +2.015 and greater than C.V. = -2.015, which
means it does not fall in the rejection region.
Inasmuch as the test value is not in the rejection region, the null hypothesis must be
accepted (while the alternative hypothesis is rejected)

Summarize the results
There is not enough evidence to support the claim that eating oats for breakfast
every day for 4 weeks will change the cholesterol level


28 The confidence interval for the mean difference for small dependent samples is given by:

$$\bar{D} - t_{\alpha/2}\frac{s_D}{\sqrt{n}} < \mu_D < \bar{D} + t_{\alpha/2}\frac{s_D}{\sqrt{n}}$$

Recall the usefulness of the confidence interval when hypothesizing that differences are zero, as
noted in items 8 and 10
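
As an illustration of this interval, the sketch below applies it to the summary figures of the cholesterol example ($\bar{D} = 16.7$, $s_D = 25.4$, $n = 6$) with the table value 2.015 found there, which corresponds to a 90% interval. The function name `paired_interval` is an assumption.

```python
import math


def paired_interval(mean_d, s_d, n, t_critical):
    """Return (lower, upper) bounds for the mean difference of paired data."""
    margin = t_critical * s_d / math.sqrt(n)
    return mean_d - margin, mean_d + margin


lower, upper = paired_interval(16.7, 25.4, 6, 2.015)
print(round(lower, 2), round(upper, 2))  # -4.19 37.59
print(lower <= 0 <= upper)               # True, so zero lies inside the interval
```

Since the interval contains zero, it points to the same decision as the test: the null hypothesis is not rejected.
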




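Table of differences for the cholesterol example in item 27:

Before ($X_1$)   After ($X_2$)   $D = X_1 - X_2$   $D^2 = (X_1 - X_2)^2$
210              190             20                400
235              170             65                4225
208              210             -2                4
190              188             2                 4
172              173             -1                1
244              228             16                256
                                 $\sum D = 100$    $\sum D^2 = 4890$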
