LESSON 20:
PRINCIPLE OF HYPOTHESIS TESTING
So far we have talked about estimating a confidence interval along with the probability (the confidence level) that the true population parameter lies within this interval under repeated sampling. We now apply the principles of statistical inference to hypothesis testing.
By the end of this chapter you should be able to
• Understand what hypothesis testing is
• Examine issues relating to the determination of the level of significance
• Apply large-sample tests of hypotheses to management situations
• Use the SPSS package to carry out hypothesis tests and interpret computer output, including p-values
What is Hypothesis Testing?
What is a Hypothesis?
A hypothesis is an assumption that we make about a population parameter. It can be any assumption about a population parameter, not necessarily one based on statistical data. For example, it can be based on the gut feel of a manager. Managerial hypotheses are based on intuition; the market place decides whether the manager's intuitions were in fact correct.
In fact managers propose and test hypotheses all the time. For example:
• A manager who says 'if we drop the price of this car model by Rs 15000, we'll increase sales by 25000 units' is stating a hypothesis. To test it in reality we have to wait until the end of the year and count sales.
• A manager's estimate that sales per territory will grow on average by 30% in the next quarter is also an assumption or hypothesis.
How would the manager go about testing this assumption? Suppose he has 70 territories under him.
• One option for him is to audit the results of all 70 territories and determine whether the average growth is greater than or less than 30%. This is a time-consuming and expensive procedure.
• Another way is to take a sample of territories and audit sales results for them. Once we have our sample sales growth figure, it is likely that it will differ somewhat from our assumed rate. For example we may get a sample rate of 27%. The manager is then faced with the problem of determining whether his assumed or hypothesized rate of growth of sales is correct or the sample rate of growth is more representative.
To test the validity of our assumption about the population we collect sample data and determine the sample value of the statistic. We then determine whether the sample data support our hypothesized assumption regarding the average sales growth.
How is this Done?
If the difference between our hypothesized value and the sample value is small, then it is more likely that our hypothesized value of the mean is correct. The larger the difference, the smaller the probability that the hypothesized value is correct.
In practice, however, very rarely is the difference between the sample mean and the hypothesized population value large enough or small enough for us to be able to accept or reject the hypothesis prima facie. We cannot accept or reject a hypothesis about a parameter simply on intuition; instead we need to use objective criteria based on sampling theory to accept or reject the hypothesis.
Hypothesis testing is the process of making inferences about a population based on a sample. The key question therefore in hypothesis testing is: how likely is it that a population such as the one we have hypothesized would produce a sample such as the one we are looking at?
Hypotheses Testing-The theory
Null Hypothesis
In testing our hypotheses we must state the assumed or hypothesized value of the population parameter before we begin sampling. The assumption we wish to test is called the null hypothesis and is symbolized by Ho.
For example, if we want to test the hypothesis that the population mean is 500, we would write it as:
Ho: µ=500
If we use the hypothesized value of a population mean in a problem we represent it symbolically as: µHo.
The term null hypothesis has its origins in pharmaceutical testing, where the null hypothesis is that the drug has no effect, i.e., there is no difference between a sample treated with the drug and untreated samples.
Alternative Hypothesis
If our sample results fail to support the null hypothesis we must conclude that something else must be true. Whenever we reject the null hypothesis, the alternative hypothesis is the one we have to accept. This is symbolized by Ha.
There are three possible alternative hypotheses for any Ho, i.e.:
Ha: µ≠500 (the population mean is not equal to 500)
Ha: µ>500 (the population mean is greater than 500)
Ha: µ<500 (the population mean is less than 500)
Understanding Level of Significance
The purpose of testing a hypothesis is not to question the computed value of the sample statistic but to make a judgment about the difference between the sample statistic and the hypothesized population parameter. Therefore the next step, after stating our null and alternative hypotheses, is to decide what
RESEARCH METHODOLOGY
Figure 1
1. We have now determined that our calculated value of z indicates that the sample mean lies two standard errors (SE) to the right of the hypothesized population mean on the standard normal scale.
2. Our level of significance is 5%. We now determine the critical value of z at the 0.05 level of significance for a two-tailed test. This value is 1.96.
3. A comparison between the observed z and the z permissible by our given level of significance: observed z: 2; critical z: 1.96.
4. Since the observed value of z is greater than the critical value of z, we can infer that the difference between the value of the sample mean and the hypothesized population mean is too large at the 5% level of significance to be attributed to sampling variation. Hence we reject the null hypothesis. The manager would reject the consignment of aluminum sheets as not meeting the required specification level.
Example 2
How many standard errors around the hypothesized value should we use to be 99.44% certain that we accept the hypothesis when it is true?
This problem requires that we leave a probability of 1 - .9944 = .0056 in the tails. Since it is a two-tailed test we have to halve this probability to determine z such that there is .0056/2 = .0028 area in each tail.
Area under one half of the normal curve = .5 - .0028 = .4972
Looking up the normal tables we find that, for positive values of z, a probability of .4972 is associated with z = 2.77. This is illustrated in figure 2 below.
The level of significance is demonstrated diagrammatically below in figure 3. Here .95 of the area under the curve is where we would accept the null hypothesis. The two coloured parts under the curve, representing a total of 5% of the area under the curve, are the regions where we would reject the null hypothesis.
Figure 3
A word of caution regarding areas of acceptance and rejection. Even if our sample statistic does fall in the non-shaded region, this does not prove that our Ho is true. The sample results merely do not provide statistical evidence to reject the hypothesis. This is because the only way a hypothesis can be accepted or rejected with certainty is for us to know the true population parameter. Therefore we say the sample data are such as to cause us to not reject the null hypothesis.
Selecting a Level of Significance
There is no single standard or universal level of significance for testing hypotheses. The level of significance at which we want to test a hypothesis is chosen by the manager based on his evaluation of the costs and benefits associated with acceptance or rejection of a null hypothesis. We can however note a few points regarding this issue:
1. The higher the level of significance, the greater the probability of rejecting Ho when it is true.
This is illustrated in figure 4 below, which shows three levels of significance: .01, .1, .50. The location of the sample statistic is also shown in each of the three distributions. It obviously remains the same.
• In fig 4a & b we would accept Ho that the sample mean does not differ significantly from the population mean.
• In fig 4c we would reject Ho.
Why? Because our level of significance of .5 is so high that we would rarely accept Ho when it is not true and frequently reject Ho when it is true.
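The critical values quoted in this lesson (1.96 at the 5% level, and 2.77 for the 99.44% acceptance region of Example 2) come from normal tables. As a rough cross-check, they can be reproduced with Python's standard library. This is only a sketch; the function name `two_tailed_critical_z` is a hypothetical helper and not part of the lesson.

```python
from statistics import NormalDist

def two_tailed_critical_z(alpha):
    """Return z such that a total area of alpha lies in the two tails combined."""
    # Half of alpha sits in each tail, so look up the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)

z_05 = two_tailed_critical_z(0.05)          # 5% level of significance
z_ex2 = two_tailed_critical_z(1 - 0.9944)   # Example 2: 99.44% acceptance region

print(round(z_05, 2), round(z_ex2, 2))
```

The same helper can be used for any other two-tailed level of significance in place of a table lookup.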
Activities
1. What do we mean when we reject a hypothesis on the basis
of a sample?
we then determine z critical such that the entire 5% lies on either the right side (upper-tailed test) or on the left side (lower-tailed test). This is illustrated by the coloured regions in figures 5a & 5b.
Figure 5a
Figure 5b
The procedure for testing the hypothesis remains the same as in the two-tailed case. The only difference will be in the value of z critical, which is determined by placing the entire level of significance on only one side of the normal distribution.
An Example will Clarify the Situation
A hospital uses large quantities of packaged doses of a particular drug. Excessive doses will pass harmlessly out of the system. Insufficient doses do not produce the desired medical treatment. The hospital has purchased the drug from the same manufacturer for many years. The hospital inspects 50 doses randomly and finds the mean dose to be 99.75cc. The population standard deviation of doses is 2cc.
Our problem suggests that the hospital faces a problem if doses are significantly less than 100cc, as patients' treatment will be affected. However if the dose is more than 100cc there appears to be no major problem. Therefore the null hypothesis remains unchanged; as the alternative hypothesis we are only interested in testing whether the sample mean dose is significantly below 100cc.
The hypotheses can be stated as follows:
Ho: µ=100cc
Ha: µ<100cc
This is a left-tailed test and the coloured region corresponds to the .10 level of significance. The acceptance region consists of 40% on the left side of the distribution plus the entire 50% on the right side for a total area of 90%. This is shown in figures 5a & 5b.
SE = σ/√n = 2/√50 = 0.2829
We can now calculate the standardized z statistic:
z = (x̄ - µHo)/SE = (99.75 - 100)/0.2829 = -0.88
z critical = -1.28
Therefore since -0.88 > -1.28, the sample mean lies within the acceptance region and the hospital can accept the null hypothesis: the observed mean of the sample is not significantly different from the hypothesized mean dose.
Applications of One Tailed Tests
Many managerial situations call for a one-tailed test. Typically, if a problem requires you to test whether the sample statistic is significantly:
• More than a hypothesized population value
• Less than a hypothesized population value
a one-tailed test is appropriate.
If the problem requires us to assess whether the sample statistic differs from a hypothesized population value in either direction, then we use a two-tailed test.
Example
1. A highway safety engineer decides to test the load bearing capacity of a bridge that is 20 years old. Considerable data are available from similar tests on the same type of bridges. Which type of test is appropriate, one-tailed or two-tailed? If the minimum load bearing capacity of this bridge must be 10 tons, what are the null and alternative hypotheses?
The engineer would be interested in whether a bridge of this age could withstand the minimum load bearing capacity necessary for safety purposes. She therefore wants its capacity to be above a certain minimum level; so a one-tailed test, i.e., a right-tailed test, would be used.
The hypotheses are:
Ho: µ=10 tons Ha: µ>10 tons
2. Hinton Press hypothesizes that the average life of its press is 14500 hours. The standard deviation of a press life is 2100 hours. From a sample of 25 presses the company finds the sample mean life to be 13000 hours. At a .01 significance level should the company conclude that the average press life is less than the hypothesized 14500 hours?
Our problem requires us to assess whether average press life is significantly less than the hypothesized press life. Therefore it is a one-tailed test.
Ho: µ=14500 Ha: µ<14500 n=25, σ=2100, α=.01
SE = 2100/√25 = 420
z = (13000 - 14500)/420 = -3.57
z critical for a one-tailed test = -2.33
-3.57 < -2.33 implies we should reject Ho. The average life is significantly less than the hypothesized life.
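The hospital-dose and press-life examples in this lesson both follow the same left-tailed procedure: compute the standard error, standardize the sample mean, and compare it with a negative critical z. A minimal Python sketch of that procedure is given here, assuming a known population standard deviation. The function name `left_tailed_z_test` is a hypothetical helper, not something defined in the lesson.

```python
from math import sqrt
from statistics import NormalDist

def left_tailed_z_test(sample_mean, mu0, sigma, n, alpha):
    """Left-tailed z test with known population standard deviation.

    Returns (z, z_critical, reject), where reject is True when z < z_critical.
    """
    se = sigma / sqrt(n)                      # standard error of the mean
    z = (sample_mean - mu0) / se              # standardized sample mean
    z_critical = NormalDist().inv_cdf(alpha)  # negative whenever alpha < 0.5
    return z, z_critical, z < z_critical

# Hospital doses: n = 50, mean 99.75cc, sigma = 2cc, alpha = .10
hospital = left_tailed_z_test(99.75, 100, 2, 50, 0.10)
# Hinton Press: n = 25, mean 13000 hours, sigma = 2100 hours, alpha = .01
press = left_tailed_z_test(13000, 14500, 2100, 25, 0.01)

print(hospital)
print(press)
```

For a right-tailed test the comparison would be reversed, with the critical value taken from the upper tail instead.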
Delhi?
3. Macroswift estimated last year that 35% of its potential software buyers were planning to wait to purchase the new operating system, Window Panes, until an upgrade had been released. After an advertising campaign to reassure the public, Macroswift surveyed 3000 people and found 950 who were still skeptical. At the 5% level of significance can the company conclude the proportion of skeptical people has decreased?
Hypothesis Tests of Differences between Means
So far we have examined the case where we are testing the results of a sample against a hypothesized value of a population parameter. We now turn to the case where we wish to compare the parameters for two different populations and determine whether these differ from each other. In this case we are not really interested in the actual value of the two parameters but in the relation between the two parameters, i.e., is there a significant difference between them.
Examples of hypotheses of this type are:
• Whether female employees earn less than males for the same work.
• A drug manufacturer may need to compare the reactions of one group of animals administered the drug and the control group.
• A company may want to see if the proportion of promotable employees in one installation is different from another.
In each case we are not interested in the specific values of the individual parameters so much as in the relation between the two parameters.
The core problem reduces to one of determining whether the means from two samples taken from two different populations are significantly different from each other. If they are not, we can hypothesize that the two samples are not significantly different from each other.
Theoretical Basis
Shown below are three different but related distributions.
Figure 1a shows the population distributions for two different populations 1 and 2. They have respectively the following characteristics:
Means µ1 and µ2 and standard deviations σ1 and σ2.
Figure 1b shows the respective sampling distributions of the sample means. These distributions are defined by the following statistics:
Mean of sampling distribution of sample means: µ1 and µ2
Standard deviation of the sampling distribution of the mean, or the standard error of the sample mean: σx̄1 and σx̄2.
We have two populations 1 and 2 with means µ1 and µ2 and standard deviations σ1 and σ2. The associated sampling distributions are those of the sample means x̄1 and x̄2. However what we are now interested in is the difference between the two values of the sample means, i.e., the sampling distribution of the difference between the sample means x̄1 - x̄2. How do we derive this distribution?
Suppose we take a random sample from the distribution of Population 1 and another random sample from the distribution of Population 2. If we then subtract the two sample means, we obtain the difference x̄1 - x̄2, which is positive if x̄1 is larger than x̄2 and vice versa. By constructing a distribution of all possible sample differences x̄1 - x̄2 we end up with a distribution of the difference between sample means, shown below in figure 1c.
Figure 1a & 1b
Figure 1c
The mean of this distribution is µ1 - µ2.
The standard deviation of the distribution of the difference between sample means is called the standard error of the difference between two means.
The testing procedure for a hypothesis is similar to the earlier cases.
Ho: µ1 = µ2
Ha: µ1 ≠ µ2 α=.05
The z statistic = [(x̄1 - x̄2) - (µ1 - µ2)Ho] / σx̄1-x̄2
Since we will usually be testing for equality between the two population means:
(µ1 - µ2)Ho = 0 since µ1 = µ2
An example will make the process clearer:
A manpower statistician is asked to determine whether hourly wages of semi-skilled labour are the same in two cities. The result of the survey is given in the table below. The company wants to test the hypothesis at the .05 level of significance that there is no significant difference between the hourly wage rate across the two cities.
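The z statistic for the difference between two sample means can be sketched in a few lines of Python. The sketch assumes large samples and the usual textbook expression for the standard error of the difference, √(σ1²/n1 + σ2²/n2), which does not survive in the text of this lesson. The function name `two_sample_z` and all input values are hypothetical illustrations, not the survey data referred to in the lesson.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2, hypothesized_diff=0.0):
    """z statistic for the difference between two sample means (large samples)."""
    # Assumed textbook formula for the standard error of the difference.
    se_diff = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((mean1 - mean2) - hypothesized_diff) / se_diff

# Hypothetical illustration values only.
z = two_sample_z(mean1=52.0, mean2=50.0, sigma1=6.0, sigma2=8.0, n1=100, n2=100)
z_critical = NormalDist().inv_cdf(1 - 0.05 / 2)   # two-tailed test, alpha = .05
reject_h0 = abs(z) > z_critical

print(round(z, 2), round(z_critical, 2), reject_h0)
```

With these made-up inputs the standard error of the difference is 1.0, so z is simply the difference between the two sample means.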
b. Using α = 0.01, test whether the two samples can reasonably be considered to have come from populations with the same mean.
z = [(86 - 82) - 0]/1.296 = 3.09
Since 3.09 > 2.58, we reject Ho and it is reasonable to conclude that the two samples come from different populations.
p2: sample proportion of success in sample 2
n1: sample size 1
n2: sample size 2
Standard error of the difference between two proportions
Since we do not know the population proportions, we need to estimate them from sample statistics.
We hypothesize that there is no difference between the two proportions. In that case our best estimate of the overall population proportion of successes is the combined proportion of successes in both samples. This can be calculated from the following formula:
p̂ = (n1 p1 + n2 p2)/(n1 + n2)
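As an illustration of the combined-proportion approach for two proportions, the sketch below pools the two sample proportions, forms the estimated standard error of their difference, and computes z. It uses the day-care survey figures that appear in this lesson's activities (52% of 200 households versus 40% of 150 households, tested at the .04 level). The function name `two_proportion_z` is a hypothetical helper, and the standard-error expression is the usual textbook one, assumed here rather than taken from the lesson.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(p1, n1, p2, n2):
    """z statistic for Ho: the two population proportions are equal."""
    pooled = (n1 * p1 + n2 * p2) / (n1 + n2)   # combined proportion of successes
    # Assumed textbook formula for the estimated standard error of p1 - p2.
    se_diff = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se_diff

z = two_proportion_z(0.52, 200, 0.40, 150)
z_critical = NormalDist().inv_cdf(1 - 0.04 / 2)   # two-tailed test, alpha = .04
reject_h0 = abs(z) > z_critical

print(round(z, 2), round(z_critical, 2), reject_h0)
```

The result agrees with the answer quoted for that activity (z calculated of about 2.23 against a critical z of about 2.05).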
The estimated standard error of the difference between two proportions is as follows:
estimated SE of (p1 - p2) = √(p̂q̂/n1 + p̂q̂/n2), where p̂ is the combined proportion of successes in both samples and q̂ = 1 - p̂
The standard z statistic in this case is calculated as:
z = [(p1 - p2) - (p1 - p2)Ho] / estimated SE of (p1 - p2)
forms had errors. The manager wants to test at the .15 level of significance the hypothesis that the personal appearance method produces fewer errors. This is a one-tailed test. The procedure for this is the same as for carrying out a one-tailed test for comparing sample means. The data is as follows:
2. Two different areas of a large Eastern city are being considered as sites for day-care centers. Of 200 households surveyed in one section, the proportion in which the mother worked full-time was 0.52. In another section of the city, 40 percent of the 150 households surveyed had mothers working at full-time jobs. At the 0.04 level of significance is there a significant difference in the proportions of working mothers in the two areas of the city?
Ans: z calculated = 2.23 > z critical = 2.05. Therefore reject Ho.
4. On Friday, 11 stocks in a sample of 40 of the 2500 stocks traded on the BSE advanced, i.e., the price of their shares increased. In a sample of 60 BSE stocks on Thursday, 24 had advanced. At α = .10, can you conclude that a smaller proportion of BSE stocks advanced on Friday than did on Thursday?
5. A coal-fired power plant is considering two different systems for reducing pollution. The first system has reduced the emission of pollutants to acceptable levels 68% of the time, as determined from 200 air samples. The second, more expensive system has reduced the emission of pollutants to acceptable levels 76% of the time, as determined on the basis of 250 air samples. If the expensive system is significantly more effective than the inexpensive system, the management of the power plant will install the expensive system. Which system will be installed if the management uses a significance level of .02 in making its decision?
In this chapter we will wrap up our analysis of hypothesis testing for large samples. By now you should have a good idea how to apply the principles of hypothesis testing to different types of managerial problems. However these days any management problem that we may wish to analyze generates such a large volume of data that it is virtually impossible to analyze and test hypotheses manually. Given the widespread availability of computers and statistical packages we can easily run such tests on the computer. Therefore it becomes important to understand how these tests are run on the computer and how to interpret the results.
Obviously the basic theory and principles of statistical analysis do not change when a test is carried out on computer. However there are some differences in the way the level of significance is presented. Computer outputs usually present the prob value or p-value. We shall look at what prob values mean and compare them with conventional tests of significance.
We shall also look at what is considered a good hypothesis test. This is done by measuring the power of a test. This concept will also be presented in detail.
Probability Values
So far we have tested a hypothesis at a given level of significance. In other words, before we take the sample we specify how unlikely the observed result will have to be in order for us to reject Ho. For example, we reject Ho if the probability of observing a sample mean this far away from the hypothesized population mean is less than 5%, where 5% is our externally given level of significance.
There is another way to approach the decision whether to accept or reject Ho which does not require us to prespecify the level of significance before taking a sample. In this case we take a sample and ask what the probability is of getting a sample mean this far away from µHo. This is called the probability value or p-value of the sample mean.
The two methods are equivalent and essentially represent two sides of a coin.
• In the earlier case we prespecify a level of probability (α) and compare the observed probability of getting a sample statistic with it.
• We now ask what is the probability of getting such a result. This is termed the p-value of a statistic.
Once the p-value is determined the decision maker can then weigh all relevant factors and decide whether to accept or reject Ho without being bound by a prespecified level of significance.
The p-value can also be more informative. For example, if we reject Ho at α=.05, we only know that the observed value was at least 1.96 SE away from the hypothesized mean. A p-value tells us the exact probability of getting a sample mean at least as far from the hypothesized mean as the one observed.
The concept will be made clearer with the help of an example.
Example
A machine is used to cut Swiss cheese into blocks of specified weight. On the basis of experience the weight of a block has a standard deviation of .3 gm. The machine is currently set to cut blocks of 12 gm. A sample of 25 blocks is found to have an average weight of 12.15 gm. Should we conclude the machine needs to be recalibrated?
Since this is a two-tailed test we need to determine the probability of observing a value of x̄ at least as far away from 12 as 12.15 or 11.85 gm if Ho is true.
We therefore need to calculate the probability
P(x̄ > 12.15 or x̄ < 11.85) if Ho is true.
Our hypothesis can be stated as:
Ho: µ=12 Ha: µ≠12
σ = .3 n = 25 x̄ = 12.15
σx̄ = σ/√n = .3/5 = .06
We can then convert x̄ to a standard z score:
z = (x̄ - µHo)/σx̄ = (12.15 - 12)/.06 = 2.5
From the normal tables we can find that the probability that z is greater than 2.5 is .5 - .4938 = .0062
Since this is a two-tailed test the p-value is 2 × .0062 = .0124. This information is shown in figure 1 below.
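The p-value calculation for the cheese-block example in this lesson (blocks set at 12 gm, standard deviation .3 gm, sample of 25 with z = 2.5) can be checked without tables. The sketch below is one possible way to do it with Python's standard library, taking the sample mean as 12.15 gm, which is the value consistent with z = 2.5. The function name `two_tailed_p_value` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_p_value(sample_mean, mu0, sigma, n):
    """Two-tailed p-value for a z test with known population standard deviation."""
    se = sigma / sqrt(n)              # standard error of the mean
    z = (sample_mean - mu0) / se      # standardized sample mean
    # Probability of a result at least this far from mu0 in either direction.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

z, p = two_tailed_p_value(12.15, 12, 0.3, 25)
alpha = 0.05
reject_h0 = alpha > p                 # rule of thumb: reject Ho when alpha > p

print(round(z, 2), round(p, 4), reject_h0)
```

Changing `alpha` to 0.01 reverses the decision, which matches the observation that .0124 is the borderline level of significance for this sample.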
Given the above information the cheese packer can now decide whether to recalibrate the machine or not. As we can see the p-value is very low, so he will probably go in for recalibration.
The conventional hypothesis test at the .05 level of significance is also illustrated in figure 1. On the basis of the z test he would have rejected Ho at the 5% level, since 2.5 > 1.96. However at a significance level of .01 we would have accepted the hypothesis, as the critical z value would have been 2.58.
The p-value tells us the largest significance level at which we would have accepted Ho, i.e., .0124, and the associated z value (±2.5). Thus at any level of significance above .0124 we would reject Ho.
Uses of p-values
Use of p-values saves the tedium of looking up tables. The smaller the probability value, the greater the significance of the finding. The simple rule of thumb is: as long as α > p, reject Ho.
For example, suppose we have a p-value of .01 and α=.05. This means that the probability of getting a sample result at least as extreme as ours, if Ho is true, is .01. We compare this with our standard for accepting or rejecting Ho, which is .05. Since .05 > .01 we reject Ho, as the probability of getting such a result is much lower than our level of significance.
Example 2
The Coffee Institute has claimed that more than 40% of American adults regularly have a cup of coffee with breakfast. A random sample of 450 individuals showed that 200 of them were regular coffee drinkers at breakfast. What is the prob value of a test of hypothesis seeking to show that the Coffee Institute's claim was correct?
Activities
1. A car retailer thinks that a 40000 mile claim for tyre life by the manufacturer is too high. She carefully records the mileage obtained from a sample of 64 such tyres. The mean turns out to be 38,500 miles. The standard deviation of the life of all tyres of this type has previously been calculated by the manufacturer to be 7,600 miles. Assuming that the mileage is normally distributed, determine the largest significance level at which we would accept the manufacturer's mileage claim, that is, at which we would not conclude the mileage is significantly less than 40,000 miles.
2. The North Carolina Department of Transportation has claimed that at most 18 percent of passenger cars exceed 70 mph on Interstate 40 between Raleigh and Durham. A random sample of 300 cars found 48 cars exceeding 70 mph. What is the prob value for a test of hypothesis seeking to show the NCDOT's claim is correct?
cut sections of tubing used in pressure-measuring devices. The length of the sections is normally distributed with a standard deviation of 0.06". Twenty-five pieces have been cut with the machine set to cut sections 5.00" long. When these pieces were measured, their mean length was found to be 4.97". Use prob values to determine whether the machine should be recalibrated because the mean length is significantly different from 5.00".
3. SAT Services advertises that 80 percent of the time, its preparatory course will increase an individual's score on the College Board exams by at least 50 points on the combined verbal and quantitative total score. Lisle Johns, SAT's marketing director, wants to see whether this is a reasonable claim. Lisle has reviewed the records of 125 students who took the course and found that 94 of them did, indeed, increase their scores by at least 50 points. Use prob values to determine whether SAT's ads should be changed because the percentage of students whose scores increase by 50 or more points is significantly different from 80 percent.
Using the computer to Test the Hypotheses
These days in actual managerial situations hypothesis tests are rarely done manually. Therefore it is important that students can interpret computer output generated for hypothesis tests by various standard statistical analysis packages. The most popular of these packages are SPSS and Minitab. Broadly all programmes follow the same principles. Instead of comparing the calculated z value with the critical value for a predetermined level of significance, most packages display the prob values or p-values.
To accept or reject a hypothesis we compare the level of significance (α) and the p-value. If α > p-value we reject Ho at the relevant level of significance, and vice versa.
An example will help show how the computer outputs results for hypothesis testing.
Example
While designing a test it was expected that the average grade would be 75%, i.e., 56.25 out of 75. This hypothesis was tested against actual test results for a sample of 199 students.
Ho: µ=56.25 Ha: µ≠56.25
This hypothesis was tested at α=.05
The computer output for this test, produced with the Minitab package, is shown in Table 1. The observed t value for this test was -15.45, with an associated (two-tailed) prob value of 0.0000. Because this prob value is less than our significance level of α = 0.05, we must reject Ho and conclude that the test did not achieve the desired level of difficulty.
T test for a mean
Test for µ=56.25 vs µ≠56.25
Variable    N      Mean     Stdev    SE Mean    T         P-value
Result      199    45.281   10.014   0.710      -15.45    0.00
Table 1
T test for difference between two sample means
Here we test hypotheses of equality of two means.
caliber of teaching being done by the graduate-student teaching assistants. As a result, they decided to test whether students in sections taught by the graduate TAs really did worse in the exam than those students in sections taught by the faculty.
If we let the TAs' sections be sample 1 and the faculty's sections be sample 2, then the appropriate hypotheses for testing this concern are
Ho: µ1 = µ2
Ha: µ1 < µ2
The underlying population variance is assumed to be equal for both samples.
The Minitab output for doing this test is given below. The test results are reported assuming that the two population variances are equal. If we can assume that the two variances are equal, then the test reported by Minitab is the test using a pooled estimate for σ². This is shown in Table 2.
Table 2
Two sample T-Test
Instrnum    N     Mean     Stdev    SE Mean
1           89    44.93    9.76     1.0
What does the data tell us regarding the efficacy of the TAs? The prob value is quite high, i.e., .33. Therefore if we compare this with a level of significance of .05 (α = 0.05) we would accept the null hypothesis that there is no difference in results between TAs and faculty. In this case we would accept the null hypothesis at any level of significance up to .33.
Measuring the Power of a Test
What should a good hypothesis test do?
Ideally α and β (the probabilities of Type I and Type 2 errors) should both be small. A Type I error occurs when we reject a null hypothesis that is true. α (the significance level of the test) is the probability of making a Type I error. Once we decide on the significance level, there is nothing else we can do about α.
A Type 2 error occurs when we accept a null hypothesis that is false; the probability of a Type 2 error is β. Ideally a manager would want a hypothesis test to reject a null hypothesis when it is false. Suppose the null hypothesis is false. Then managers would like the hypothesis test to reject it all the time. Unfortunately, hypothesis tests cannot be foolproof; sometimes when the null hypothesis is false, a test does not reject it, and a Type 2 error is made.
When the null hypothesis is false, µ (the true population mean) does not equal µHo (the hypothesized population mean); instead, µ equals some other value. For each possible value of µ for which the alternative hypothesis is true, there is a different probability (β) of incorrectly accepting the null hypothesis.
Of course, we would like this β (the probability of accepting a null hypothesis when it is false) to be as small as possible or, equivalently, we would like 1-β (the probability of rejecting a null hypothesis when it is false) to be as large as possible.
Therefore a high value of 1-β (something near 1.0) means the test is working quite well (i.e., it is rejecting the null hypothesis when it is false); a low value of 1-β (something near 0.0) means that the test is working very poorly (i.e., not rejecting the null hypothesis when it is false).
The value of 1-β is a measure of how well the test is working and is known as the power of the test. If we plot the values of 1-β for each value of µ for which the alternative hypothesis is true, the resulting curve is known as a power curve. This can be explained better with the help of an example.
Example
We were deciding whether to accept a drug shipment. Our test indicates that we should reject the null hypothesis if the standardized sample mean is less than -1.28, that is, if the sample mean dosage is less than 100.00 - 1.28(0.2829), or 99.64 cc.
In Figure 2a, we show a left-tailed test. In Figure 2b, we show the power curve which is plotted by computing the values of 1
Figure 2a
Point C on the power curve in Figure 2 b shows population
mean dosage is 99.42 cc. Given that the population mean is 99.42
cc, we must compute the probability that the mean of a random
sample of 50 doses from this population will be less than 99.64 cc
(the point below which we decided to reject the null hypothesis
i.e., the value of the dose at which we rejected the null hypothesis.
This shown in Figure 2c .
Figure 2c
We had computed the standard error of the mean to be 0.2829 cc.
So 99.64 cc is (99.64 - 99.42)/0.2829 = 0.78 standard errors above
99.42 cc, the value we are taking as the true population mean.
The probability of observing a sample mean less than 99.64 cc
and thus rejecting the null hypothesis is 0.7823 when we take the
true population mean to be µ = 99.42 cc. This is given by the
colored area in Figure 2c. Thus, the power of the test (1 - β)
at µ = 99.42 is 0.7823. This simply means that if µ = 99.42, the
probability that this test will reject the null hypothesis when it is
false is 0.7823.
Point D in Figure 2b shows a population mean dosage of
99.61 cc. We then ask what is the probability that the mean of
a random sample of 50 doses from this population will be less
than 99.64 cc and thus cause the test to reject the null hypothesis.
This is illustrated in Figure 2d. Here we see that 99.64 is (99.64 -
99.61)/0.2829, or 0.11 standard error above 99.61 cc. The
probability of observing a sample mean less than 99.64 cc and
thus rejecting the null hypothesis is 0.5438, the colored area in
Figure 2d. Thus, the power of the test (1 - β) at µ = 99.61 cc is
0.5438.
Figure 2d
Using the same procedure at point E, we find the power of the
test at µ = 99.80 cc is 0.2843; this is illustrated as the colored area in
Figure 2e.
Figure 2e
As we can see, the values of 1 - β continue to decrease to the right
of point E. This is because as the population mean gets closer
and closer to 100.00 cc, the power of the test (1 - β) gets closer and
closer to the probability of rejecting the null hypothesis when the
population mean is exactly 100.00 cc. This probability is nothing
but the significance level of the test, which in this case is 0.10. The
curve terminates at point F, which lies at a height of 0.10 directly
over the population mean (100.00 cc).
What does the Power Curve in Figure 2b Tell Us?
As the shipment becomes less satisfactory (as the doses in the
shipment become smaller), our test is more powerful (it has a
greater probability of recognizing that the shipment is
unsatisfactory). It also shows us, however, that because of
sampling error, when the dosage is only slightly less than 100.00
cc, the power of the test to recognize this situation is quite low.
Thus, if having any dosage below 100.00 cc is completely
unsatisfactory, the test we have been discussing would not be
appropriate.
Example:
Before the 1973 oil embargo and subsequent increase in oil price,
petrol usage in the US had grown at a seasonally adjusted rate of
.57% per month, with a standard deviation of .10% per month.
In 15 randomly chosen months between 1975 and 1985, petrol
usage grew at an average rate of .33% per month. At a .01 level of
significance, can you conclude that the growth in the use of petrol
had decreased as a result of the embargo? Compute the power of
the test for µ = .50, .45 and .40% per month.
σ = .10, n = 15
Ho: µ = .57, Ha: µ < .57
At α = .01, the lower limit of the acceptance region is:
µHo - 2.33 σ/√n = .57 - 2.33(.10)/√15 = .510
Since the sample mean of .33 lies below .510, we reject Ho and
conclude that the growth rate had decreased.
a. At µ = .50, the power of the test is:
P(x̄ < .510) = P(z < (.510 - .50)/(.10/√15))
= P(z < .39) = .5 + .1517 = .6517
b. At µ = .45, the power of the test is:
P(x̄ < .510) = P(z < (.510 - .45)/(.10/√15))
= P(z < 2.32) = .5 + .4898 = .9898
c. At µ = .40, the power of the test is:
P(x̄ < .510) = P(z < (.510 - .40)/(.10/√15))
= P(z < 4.26) = .5 + .5000 = 1.00
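The arithmetic in the petrol-usage example can be reproduced with a few lines of code. This is a minimal sketch under the example's own assumptions (a lower-tailed z-test with σ = .10, n = 15 and α = .01). The variable and function names are hypothetical, and the exact normal quantile is used in place of the rounded table value 2.33.

```python
from math import sqrt
from statistics import NormalDist

MU_0 = 0.57     # hypothesized growth rate, % per month
SIGMA = 0.10    # population standard deviation, % per month
N = 15          # number of sampled months
ALPHA = 0.01    # significance level

se = SIGMA / sqrt(N)                       # standard error of the mean
z_alpha = NormalDist().inv_cdf(ALPHA)      # about -2.33 for a lower-tailed test
lower_limit = MU_0 + z_alpha * se          # lower limit of the acceptance region

# The observed sample mean was .33% per month.
reject_h0 = 0.33 < lower_limit


def power(true_mean):
    """Probability that the sample mean falls below the lower limit
    when the population mean is true_mean."""
    return NormalDist(mu=true_mean, sigma=se).cdf(lower_limit)


powers = {mu: power(mu) for mu in (0.50, 0.45, 0.40)}

print(f"lower limit of acceptance region: {lower_limit:.3f}")
print(f"reject Ho: {reject_h0}")
for mu, p in powers.items():
    print(f"power at mu = {mu:.2f}: {p:.4f}")
```

The computed powers match the table-based figures to about two decimal places. The small gaps come from rounding z to two decimals when reading a printed normal table.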
1. A manufacturer of petite women’s sportswear has
hypothesized that the average weight of the women
buying its clothing is 110 pounds. The company takes two
samples of its customers and finds one sample’s estimate of
the population mean is 98 pounds, and the other sample
produces a mean weight of 122 pounds. In the test of the
company’s hypothesis that the population mean is 110
pounds versus the hypothesis that the mean does not equal 110
pounds, is one of these sample values more likely to lead
us to accept the null hypothesis? Why or why not?
2. On an average day, about 5 percent of the stocks on the New
York Stock Exchange set a new high for the year. On Friday, Sept 18th,
1992, the Dow Jones closed at 3,282 on a robust
volume of over 136 million shares traded. A random
sample of 120 stocks showed that 16 had set new annual
highs that day. Using a significance level of 0.01, should we
conclude that more stocks than usual set new highs on that
day?
3. A finance developed a theory that predicted that closed-end
equity funds should sell at a premium of about 5% on
average. Assuming that the discount/premium population
is approximately normally distributed, does the sample
information support his theory? Test at the .05 level of
significance.
4. A company recently criticized for not paying women as much
as men claims that its average salary paid to all employees is
$23,500. From a random sample of 29 women, the average
salary was calculated to be $23,000. If the population
standard deviation is known to be $1,250 for these jobs,
determine whether we could reasonably (within 2 standard
errors) expect to find $23,000 as the sample mean if, in fact,
the company’s claim is true.
5. A manufacturer of vitamins for infants inserts a coupon for
a free sample of its product in a package that is distributed
at hospitals to new parents. Historically about 18% of the
coupons have been redeemed. Given current trends for
having fewer children and starting families later, the firm
suspects that today’s parents are better educated on average
and as a result more likely to use vitamin supplements for
their infants. A sample of 1500 new parents redeemed 295
coupons. Does this support, at a significance level of 2 percent,
the firm’s beliefs about today’s new parents?
6. From a total of 10,200 loans made by a state employees
credit union in the most recent five-year period, 350 were
sampled to determine what proportion was made to
women. This sample showed that 39% of the loans were
made to women employees. A complete census of loans 5
years ago showed that 41% were women borrowers. At a
significance level of .02, can you conclude that the proportion
of loans made to women has changed significantly in the last
five years?
Points to Ponder
• Hypothesis testing can be viewed as a six-step procedure:
• Establish a Null Hypothesis as well as an alternative
Hypothesis. It is a one-tailed test of significance if the
alternative hypothesis states the direction of difference. If
no direction of difference is given, it is a two-tailed test.
• Choose the statistical test on the basis of the assumption
about the population distribution and measurement level.
The form of the data can also be a factor. In light of these
considerations, one typically chooses the test that has the
greatest power efficiency or ability to reduce decision error.
• Select the desired level of confidence. While α = 0.05 is
the most frequently used level, many others are also used.
The α is the significance level that we desire and is set in
advance of the study.
• Compute the actual test value of the data.
• Obtain the critical test value, usually by referring to a table for
the appropriate type of distribution.
• Interpret the result by comparing the actual test value with
the critical test value.
Notes