
Hypothesis Testing

In a history class at a university it is known that students’ scores on the final exam have a normal distribution with mean µ = θ0 and variance σ². The History Department introduces a new teaching method and wants to determine if this new method will increase final exam scores.

Null Hypothesis H0 : µ = θ0 vs Alternative Hypothesis HA : µ = θA > θ0

Let X1, X2, . . . , Xn be the final exam scores of the n students in the class.

Under H0 the joint probability density is denoted by

L(X1, . . . , Xn; θ0), called the likelihood function under H0,

and under HA the joint probability density is denoted by

L(X1, . . . , Xn; θA), called the likelihood function under HA.


The testing of H0 : µ = θ0 vs HA : µ = θA > θ0, which is a one-sided test, can be based on the ratio

LR(X1, . . . , Xn; θ0, θA) = L(X1, . . . , Xn; θA) / L(X1, . . . , Xn; θ0), called the likelihood ratio.

After the students take the exam, LR can be computed. If the value of LR is too large, this is evidence that the new teaching method is better than the original method, and H0 would be rejected in favor of HA.
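As an illustration, the likelihood ratio can be computed directly from the normal density. The following is a minimal sketch in Python; the scores and the values θ0 = 70, θA = 75, σ = 10 are hypothetical numbers chosen only for illustration, not values taken from the example above.

import numpy as np
from scipy.stats import norm

# Hypothetical setup: known standard deviation and the two candidate means.
theta0, thetaA, sigma = 70.0, 75.0, 10.0

# Hypothetical final exam scores for n students.
scores = np.array([72.0, 81.0, 68.0, 77.0, 74.0, 79.0, 70.0, 83.0])

# The likelihood under each hypothesis is the product of normal densities;
# working on the log scale avoids numerical underflow.
log_L0 = norm.logpdf(scores, loc=theta0, scale=sigma).sum()
log_LA = norm.logpdf(scores, loc=thetaA, scale=sigma).sum()

LR = np.exp(log_LA - log_L0)   # L(X; thetaA) / L(X; theta0)
print(f"sample mean = {scores.mean():.2f}, likelihood ratio = {LR:.3f}")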
It can be shown that LR varies directly with X̄, the mean of the students’ scores (for the normal model, log LR = n(θA − θ0)(X̄ − (θ0 + θA)/2)/σ², which is increasing in X̄ since θA > θ0). So the testing can equivalently be based on X̄ or on

T(X̄) = (X̄ − θ0) / (σ/√n), which has a standard normal distribution under H0.

T(X̄) is the test statistic.

H0 would be rejected for large values of T(X̄). This can be formalized or quantified by setting the significance level (alpha level, α-level) of the test or the critical value of the test, tc:

P( T(X̄) ≥ tc | H0 ) = α, where tc is the critical value.

[Figure: Distribution of T(X̄) = (X̄ − θ0)/(σ/√n) under H0.]

The set of values of X1, X2, . . . , Xn where T(X̄) ≥ tc is called the critical region or rejection region, where H0 would be rejected. Denote the observed value of T(X̄) by to. If to ≥ tc then H0 would be rejected; otherwise H0 is accepted. The p-value of the test is the probability that T(X̄) under H0 is more extreme than to:

p-value = P( T(X̄) ≥ to | H0 ).
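A minimal sketch of this one-sided test in Python, using the same kind of hypothetical numbers (θ0 = 70, σ = 10, α = 0.05); the standard normal quantile gives tc and the upper-tail probability gives the p-value.

import numpy as np
from scipy.stats import norm

theta0, sigma, alpha = 70.0, 10.0, 0.05          # hypothetical values
scores = np.array([72.0, 81.0, 68.0, 77.0, 74.0, 79.0, 70.0, 83.0])
n = len(scores)

# Test statistic T(Xbar) = (Xbar - theta0) / (sigma / sqrt(n)), N(0,1) under H0.
t_obs = (scores.mean() - theta0) / (sigma / np.sqrt(n))

t_c = norm.ppf(1 - alpha)      # critical value: P(T >= t_c | H0) = alpha
p_value = norm.sf(t_obs)       # p-value = P(T >= t_obs | H0)

print(f"t_obs = {t_obs:.3f}, t_c = {t_c:.3f}, p-value = {p_value:.4f}")
print("reject H0" if t_obs >= t_c else "do not reject H0")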
There are two possible errors in hypothesis testing.

Type I error – reject H0 when H0 is true

P(Type I error) = P( T(X̄) ≥ tc | H0 ) = α

Type II error – accept H0 when H0 is not true

P(Type II error) = P( T(X̄) < tc | HA ) = 1 – P( T(X̄) ≥ tc | HA )

= 1 – Power of the test

where Power of the test = probability of rejecting H0 when HA is true

= P( T(X̄) ≥ tc | HA )

High Power is a desirable property of a test.

For a fixed sample size, increasing the Power (by lowering tc) would also increase P(Type I error).
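Since under HA : µ = θA the statistic T(X̄) is normal with mean (θA − θ0)/(σ/√n) and variance 1, the Power at a particular θA can be computed directly. A sketch with hypothetical values (θ0 = 70, θA = 75, σ = 10, n = 25, α = 0.05):

import numpy as np
from scipy.stats import norm

theta0, thetaA, sigma, n, alpha = 70.0, 75.0, 10.0, 25, 0.05   # hypothetical values

t_c = norm.ppf(1 - alpha)                         # one-sided critical value
shift = (thetaA - theta0) / (sigma / np.sqrt(n))  # mean of T(Xbar) under HA

# Power = P(T >= t_c | HA) = P(Z >= t_c - shift) for Z ~ N(0, 1).
power = norm.sf(t_c - shift)
print(f"power = {power:.3f}, P(Type II error) = {1 - power:.3f}")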


The above example is based on the Neyman-Pearson Lemma.
Neyman-Pearson Lemma: For testing H0 : µ = θ0 vs HA : µ = θA > θ0, a test which rejects H0 when

LR(X1, . . . , Xn; θ0, θA) = L(X1, . . . , Xn; θA) / L(X1, . . . , Xn; θ0) ≥ k for (X1, . . . , Xn) in C, the critical region,

where α = P( (X1, . . . , Xn) in C | H0 ), is a uniformly most powerful test.
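In the normal case the region LR ≥ k can be rewritten as a threshold on X̄, using log LR = n(θA − θ0)(X̄ − (θ0 + θA)/2)/σ². The sketch below (hypothetical values of θ0, θA, σ, n and k) converts a likelihood-ratio cutoff k into the equivalent cutoff on X̄:

import numpy as np

theta0, thetaA, sigma, n, k = 70.0, 75.0, 10.0, 25, 3.0   # hypothetical values

# log LR as a function of the sample mean (normal case, known sigma):
#   log LR = n*(thetaA - theta0)*(xbar - (theta0 + thetaA)/2) / sigma**2
# Solving log LR >= log k for xbar gives the equivalent cutoff on the sample mean.
xbar_cutoff = (theta0 + thetaA) / 2 + sigma**2 * np.log(k) / (n * (thetaA - theta0))
print(f"LR >= {k} is equivalent to Xbar >= {xbar_cutoff:.3f}")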

But the Neyman-Pearson Lemma is generally only applicable to one-sided tests:


H0 : µ = θ0 vs HA : µ = θA > θ0
or vs HA : µ = θA < θ0
With the restriction to one-sided tests, the Lemma has limited applicability.

In the example of the History Department introducing a new teaching method, if the interest is in testing whether the new teaching method changes the mean of the final exam, then one is using a two-sided test:

H0 : µ = θ0 vs HA : µ ≠ θ0
For a two-sided test one uses the Likelihood Ratio Test which involves
maximizing the likelihood under H0 and under HA .

L(X1, . . . ,Xn; θ0) = maximum of the likelihood under H0

L(X1, . . . ,Xn; µ ≠ θ0) = maximum of the likelihood under HA

The two-sided test is based on the ratio L(X1, . . . , Xn; µ ≠ θ0) / L(X1, . . . , Xn; θ0).
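Under HA the likelihood is maximized at the MLE µ̂ = X̄ (with σ known), and in this normal case 2·log of the ratio equals T(X̄)², which is why the two-sided test reduces to |T(X̄)| being large. A short numerical check of this identity, with hypothetical scores:

import numpy as np
from scipy.stats import norm

theta0, sigma = 70.0, 10.0                       # hypothetical values
scores = np.array([72.0, 81.0, 68.0, 77.0, 74.0, 79.0, 70.0, 83.0])
n, xbar = len(scores), scores.mean()

# Under HA the likelihood is maximized at the MLE mu_hat = Xbar (sigma known).
log_L_HA = norm.logpdf(scores, loc=xbar, scale=sigma).sum()
log_L_H0 = norm.logpdf(scores, loc=theta0, scale=sigma).sum()

two_log_ratio = 2 * (log_L_HA - log_L_H0)
t_sq = ((xbar - theta0) / (sigma / np.sqrt(n))) ** 2

print(f"2*log(ratio) = {two_log_ratio:.4f}, T(Xbar)^2 = {t_sq:.4f}")  # these agree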

For the example of the history final exam the test statistic would be the same, T(X̄) = (X̄ − θ0)/(σ/√n), but values of T(X̄) that are too small or too large would lead to rejection of H0, or

Reject H0 when │T(X̄)│ ≥ uc, the critical value of the two-sided test.

Selecting uc or α so that P( │T(X̄)│ ≥ uc | H0 ) = α,

or P( T(X̄) ≤ -uc or T(X̄) ≥ uc | H0 ) = α,

where │T(X̄)│ = │(X̄ − θ0)/(σ/√n)│ ≥ uc is the critical region or rejection region of the two-sided test.

[Figure: Distribution of T(X̄) = (X̄ − θ0)/(σ/√n) under H0.]

Again, denote the observed value of T(X̄) by to. Reject H0 if │to│ ≥ uc.

p-value = P( │T(X̄)│ ≥ │to│ | H0 ).
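A sketch of the two-sided version in Python, again with hypothetical numbers; the critical value uc puts α/2 in each tail, and the p-value doubles the one-tail probability.

import numpy as np
from scipy.stats import norm

theta0, sigma, alpha = 70.0, 10.0, 0.05          # hypothetical values
scores = np.array([72.0, 81.0, 68.0, 77.0, 74.0, 79.0, 70.0, 83.0])
n = len(scores)

t_obs = (scores.mean() - theta0) / (sigma / np.sqrt(n))

u_c = norm.ppf(1 - alpha / 2)                    # P(|T| >= u_c | H0) = alpha
p_value = 2 * norm.sf(abs(t_obs))                # p-value = P(|T| >= |t_obs| | H0)

print(f"|t_obs| = {abs(t_obs):.3f}, u_c = {u_c:.3f}, p-value = {p_value:.4f}")
print("reject H0" if abs(t_obs) >= u_c else "do not reject H0")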
P(Type I error) = P( │T(X̄)│ ≥ uc | H0 ) = α

P(Type II error) = P( │T(X̄)│ < uc | HA ) = 1 – P( │T(X̄)│ ≥ uc | HA )

= 1 – Power of the test

where Power of the test = probability of rejecting H0 when HA is true

= P( │T(X̄)│ ≥ uc | HA ) = P( T(X̄) ≤ -uc or T(X̄) ≥ uc | HA ).
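The Power of the two-sided test at a particular alternative µ = θA can be computed from the same normal shift δ = (θA − θ0)/(σ/√n), since T(X̄) has mean δ and variance 1 under HA. A sketch with hypothetical values:

import numpy as np
from scipy.stats import norm

theta0, thetaA, sigma, n, alpha = 70.0, 75.0, 10.0, 25, 0.05   # hypothetical values

u_c = norm.ppf(1 - alpha / 2)                     # two-sided critical value
delta = (thetaA - theta0) / (sigma / np.sqrt(n))  # mean of T(Xbar) under HA

# Power = P(T <= -u_c or T >= u_c | HA), with T ~ N(delta, 1) under HA.
power = norm.cdf(-u_c - delta) + norm.sf(u_c - delta)
print(f"power = {power:.3f}, P(Type II error) = {1 - power:.3f}")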

Note: When selecting the alpha-level for a one-sided test, it is usually appropriate to use half the alpha value that would be used for a two-sided test, so that the one-sided and two-sided tests have the same critical value.
