Hypothesis Testing
1. What is a Hypothesis?
2. What is Hypothesis Testing?
3. Hypothesis Testing Examples (One Sample Z Test).
4. Hypothesis Test on a Mean (TI 83).
5. Bayesian Hypothesis Testing.
6. More Hypothesis Testing Articles
See also:
Critical Values
What is the Null Hypothesis?
What is a Hypothesis?
Andreas Cellarius hypothesis, showing the planetary motions.
A hypothesis is an educated guess about something in the world around you. It should
be testable, either by experiment or observation. For example:
A new medicine you think might work.
A way of teaching you think might be better.
A possible location of new species.
A fairer way to administer standardized tests.
It can really be anything at all as long as you can put it to the test.
If I (decrease the amount of water given to herbs) then (the herbs will increase
in size).
If I (give patients counseling in addition to medication) then (their overall
depression scale will decrease).
If I (give exams at noon instead of 7) then (student test scores will improve).
If I (look in this certain location) then (I am more likely to find new species).
A good hypothesis statement should:
Hypothesis testing in statistics is a way for you to test the results of a survey or
experiment to see if you have meaningful results. You’re basically testing whether your
results are valid by figuring out the odds that your results have happened by chance. If
your results may have happened by chance, the experiment won’t be repeatable and so
has little use.
Hypothesis testing can be one of the most confusing aspects for students, mostly
because before you can even perform a test, you have to know what your null
hypothesis is. Often, those tricky word problems that you are faced with can be
difficult to decipher. But it’s easier than you think; all you need to do is:
Step 2: State the Alternate Hypothesis. The claim is that the students have above
average IQ scores, so:
H1: μ > 100.
The fact that we are looking for scores “greater than” a certain point means that this
is a one-tailed test.
Step 5: Find the rejection region area (given by your alpha level above) from
the z-table. An area of .05 is equal to a z-score of 1.645.
Step 7: If the test statistic from Step 6 is greater than the critical value from Step 5,
reject the null hypothesis. If it’s less, you cannot reject the null hypothesis. In this
case, it is greater (4.56 > 1.645), so you can reject the null.
*This process is made much easier if you use a TI-83 or Excel to calculate the z-score
(the “critical value”).
See:
Critical z value TI 83
Z Score in Excel
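The comparison between the critical value and the test statistic can also be checked in a few lines of Python. This is a minimal sketch, assuming a right-tailed test with a standard normal reference distribution; the variable names are illustrative, and the two numbers (an alpha of .05 and a test statistic of 4.56) are the ones quoted in the steps above.

```python
from statistics import NormalDist

alpha = 0.05        # rejection-region area used in Step 5
z_observed = 4.56   # test statistic quoted in the text

# A right-tailed test puts the whole of alpha in the upper tail,
# so the critical value is the (1 - alpha) quantile of N(0, 1).
z_critical = NormalDist().inv_cdf(1 - alpha)

reject_null = z_observed > z_critical
print(f"critical z = {z_critical:.3f}")
print("reject H0" if reject_null else "cannot reject H0")
```

The critical value comes out at roughly 1.645, the same figure a z-table gives for an upper-tail area of .05.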
Sample problem: A sample of 200 people has a mean age of 21 with a population
standard deviation (σ) of 5. Test the hypothesis that the population mean is 18.9 at α
= 0.05.
Step 1: State the null hypothesis. In this case, the null hypothesis is that the population
mean is 18.9, so we write:
H0: μ = 18.9
Step 2: State the alternative hypothesis. We want to know if our sample, which has a
mean of 21 instead of 18.9, really is different from the population, therefore our
alternate hypothesis:
H1: μ ≠ 18.9
Step 3: Press Stat then press the right arrow twice to select TESTS.
Step 7: Arrow down to Calculate and press ENTER. The calculator shows the p-value:
p = 2.87 × 10^-9
This is smaller than our alpha value of .05. That means we should reject the null
hypothesis.
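If you don’t have a TI-83 to hand, the same two-tailed z-test can be sketched in Python. This is an illustration only, using the numbers from the sample problem (n = 200, sample mean 21, σ = 5, hypothesized mean 18.9); the variable names are made up for the example. The p-value it prints should be of the same tiny order of magnitude as the calculator’s.

```python
from math import erfc, sqrt

n = 200              # sample size
sample_mean = 21.0   # observed sample mean
sigma = 5.0          # population standard deviation
mu_0 = 18.9          # mean under the null hypothesis
alpha = 0.05

z = (sample_mean - mu_0) / (sigma / sqrt(n))

# Two-tailed p-value, 2 * P(Z > |z|). erfc keeps precision in the far tail,
# where 1 - cdf(z) would lose digits to rounding.
p_value = erfc(abs(z) / sqrt(2))

reject_null = p_value < alpha
print(f"z = {z:.2f}, p = {p_value:.2e}, reject H0: {reject_null}")
```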
Bayesian Hypothesis Testing: What is it?
Bayesian hypothesis testing helps to answer the question: Can the results from a test or
survey be repeated?
Why do we care if a test can be repeated? Let’s say twenty people in the same village
came down with leukemia. A group of researchers find that cell-phone towers are to
blame. However, a second study found that cell-phone towers had nothing to do with
the cancer cluster in the village. In fact, they found that the cancers were completely
random. If that sounds impossible, it actually can happen! Clusters of cancer can
happen simply by chance. There could be many reasons why the first study was faulty.
One of the main reasons could be that they just didn’t take into account that
sometimes things happen randomly and we just don’t know why.
P Values.
It’s good science to let people know if your study results are solid, or if they could have
happened by chance. The usual way of doing this is to test your results with a p-value.
A p-value is a number that you get by running a hypothesis test on your data. A p-value
of 0.05 (5%) or less is conventionally treated as statistically significant, meaning results
this extreme would be unlikely to occur by chance alone if the null hypothesis were true. However,
there’s another way to test the validity of your results: Bayesian hypothesis testing.
This type of testing gives you another way to test the strength of your results.
Arguments against.
1. Including prior data or knowledge isn’t justifiable.
2. It is difficult to calculate compared to non-Bayesian testing.
Example
Not so long ago, people believed that the world was flat.
Step 1: Figure out the hypothesis from the problem. The hypothesis is usually hidden in
a word problem, and is sometimes a statement of what you expect to happen in the
experiment. The hypothesis in the above question is “I expect the average recovery
period to be greater than 8.2 weeks.”
Step 2: Convert the hypothesis to math. Remember that the average is sometimes
written as μ.
Broken down into (somewhat) English, that’s H1 (The hypothesis): μ (the average) > (is
greater than) 8.2
Step 3: State what will happen if the hypothesis doesn’t come true. If the recovery
time isn’t greater than 8.2 weeks, there are only two possibilities, that the recovery
time is equal to 8.2 weeks or less than 8.2 weeks.
H0: μ ≤ 8.2
Broken down again into English, that’s H0 (The null hypothesis): μ (the average) ≤ (is
less than or equal to) 8.2
Step 1: State what will happen if the experiment doesn’t make any difference. That’s
the null hypothesis–that nothing will happen. In this experiment, if nothing happens,
then the recovery time will stay at 8.2 weeks.
H0: μ = 8.2
Broken down into English, that’s H0 (The null hypothesis): μ (the average) = (is equal to)
8.2
Step 2: Figure out the alternate hypothesis. The alternate hypothesis is the opposite of
the null hypothesis. In other words, what happens if our experiment makes a
difference?
H1: μ ≠ 8.2
In English again, that’s H1 (The alternate hypothesis): μ (the average) ≠ (is not equal
to) 8.2
Contents:
1. Independent Variable
2. Predictor Variable
Example: you want to know how calorie intake affects weight. Calorie intake is your
independent variable and weight is your dependent variable. You can choose the
calories given to participants, and you see how that independent variable affects the
weights. You may decide to include a control variable of age in your study to see if it
affects the outcome.
The above graph shows the independent variable of male or female plotted on the
x-axis. “Male” or “Female” is unchangeable by you, the researcher, or anything you can
perform in your experiment. On the other hand, the dependent variable of “mean
vocabulary scores” is potentially changed by which independent variable is assigned. In
other words, the mean vocabulary scores depend on the independent variable: whether
the participant is male or female.
Another way of looking at independent variables is that they cause something (or are
thought to cause something). In the above example, the independent variable is calorie
consumption. That’s thought to cause weight gain (or loss).
Independent variables are also called the “inputs” for functions. They are traditionally
plotted on the x-axis of a graph. In statistics, an independent variable is also sometimes
called:
A controlled variable.
An explanatory variable.
An exposure variable (in reliability theory).
A feature (in machine learning and pattern recognition).
An input variable.
A manipulated variable.
A predictor variable.
A regressor (in regression analysis).
A risk factor (in medical statistics).
A predictor variable has essentially the same meaning as an independent variable. It’s
plotted on the x-axis, and it affects a dependent variable. However, it’s not exactly the
same, as you use the term in very specific situations:
In regression analysis, where the predictor variable is also called a regressor.
The other variable (comparable to the dependent variable) is called a criterion
variable.
In non-experimental studies, where it is the presumed “cause.” For example,
scores on a math test indicate an aptitude for engineering. “Scores on the math
test” are the predictor variables and engineering aptitude is the criterion variable.
Woman(1).
Man(2).
Transgender woman(3).
Transgender man(4).
When you only have two classes coded 0 or 1, it’s called a dummy variable. Dummy
variables can make it easier to understand the results from a regression analysis. Other
codings, like 2/3 or 8/9 can also be used (they just make the output more difficult to
comprehend).
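As a quick illustration of 0/1 coding, here is a minimal sketch. The class labels ("control" and "treatment") are invented for the example and don’t come from any particular data set.

```python
# Hypothetical two-class variable; the labels are invented for illustration.
groups = ["control", "treatment", "treatment", "control", "treatment"]

# Dummy coding: 1 marks membership of the chosen class, 0 marks the other.
dummy = [1 if g == "treatment" else 0 for g in groups]

print(dummy)
```

With this coding, the regression coefficient on the dummy variable reads directly as the difference between the two classes, which is why 0/1 is easier to interpret than 2/3 or 8/9.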
Select a maximum of one predictor variable for every five observations, if your
predictive model is good.
Use a maximum of one predictor variable for every ten observations if your
predictive model is weak, or if you have a slew of variables to choose from.
If you have categorical variables, treat each included one as half of a normal
predictor.
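These rules of thumb amount to simple arithmetic, sketched below. The function names and the sample size of 60 are hypothetical; only the 5-to-1 and 10-to-1 ratios and the half weight for categorical variables come from the list above.

```python
def max_predictors(n_observations: int, model_is_good: bool) -> int:
    """Upper bound on predictor count under the rule of thumb above."""
    observations_per_predictor = 5 if model_is_good else 10
    return n_observations // observations_per_predictor


def predictor_count(n_numeric: int, n_categorical: int) -> float:
    """Count predictors, weighting each categorical variable as one half."""
    return n_numeric + 0.5 * n_categorical


n_obs = 60  # hypothetical sample size
used = predictor_count(n_numeric=4, n_categorical=2)
print(used, max_predictors(n_obs, True), max_predictors(n_obs, False))
```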
If you have trouble figuring out which of your variables is the independent one, and
which is the dependent one, try inserting the variables into the following sentence:
Source: NIH.GOV.
Potential Confusion
You, the researcher, define your variables when you set up your experiment.
Your hypothesis statement is what determines whether a variable is dependent or
independent. Any variable can be an independent variable (IV) or a dependent
variable (DV). For example, let’s say you are interested in studying the health benefits of
walking. You write the following two hypothesis statements:
1. A more nutritious diet leads to more daily walking.
2. More daily walking leads to increased happiness.
Both of the statements above are valid (assuming they correctly describe what you are
trying to test with your experiment). However, walking is the DV in statement 1 and
the IV in statement 2.
Example 3: A researcher studies how different drug doses affect the progression of a
disease and compares the intensity and frequency of symptoms when different doses
are given. The IV is the dose given and the DV is the intensity and frequency of
symptoms. The intensity and frequency of symptoms “depends” on the dose of drug
given.
Example 4: You are studying how tutoring affects SAT scores. Your independent
variable(IV) is tutoring and the dependent variable(DV) is test scores. The test scores
“depend” on the tutoring.
Q1: You are conducting an experiment to see if exposure to more sunlight increases
happiness levels for workers who typically spend the entire day in windowless offices.
1. Sunlight.
2. Happiness level.
3. Windowless offices.
4. Time of day.
1. The greenhouse.
2. Water level, fertilizer and nutrient levels.
3. How tall the plants grow.
4. Optimal resources.
Q3: A researcher suspects that a cholera outbreak is happening because of tainted wells
in the city. Most of the cases are clustered around public wells that draw their water
from the underground aquifer.
Q4: Studies have shown that condom use is effective in controlling the spread of HIV.
However, studies also show that a combination of two HIV medications (tenofovir and
emtricitabine) can also control the spread of the disease.
1. Tenofovir.
2. Emtricitabine.
3. Both 1 and 2.
4. HIV.
Original map by John Snow showing the clusters of cholera cases in the London
epidemic of 1854
Solution to Q1:
Q1: The correct answer is 2, happiness level. Happiness levels depend upon the amount
of sunlight. If you try any of the other combinations, none make sense in the statement
“x depends on y.” For example, “sunlight depends on happiness” doesn’t make a whole
lot of sense. Plus, the clue was in the hypothesis statement itself (exposure to more
sunlight increases happiness).
Solution to Q2:
Q2: The correct answer is 3, how tall the plants grow (how tall the plants grow
depends on the resources used).
Solution to Q3:
Q3: The correct answer is 2, cholera. The cholera outbreak depends upon (i.e. is a result
of) the polluted water supply from the aquifer.
Solution to Q4:
Q4: The correct answer is 4, HIV. Controlling the spread of HIV depends upon condom
use and the medications listed.
Put another way, the dependent variable is the variable that is being measured by you,
the experimenter. In psychology, the DV is often a score of some type. For example, a
score on a memorization task, an IQ test, or a depression scale.
For example, let’s say you were investigating how health is affected by age,
socioeconomic status, or heart disease. The independent variables (i.e. age 0-18,
18-64, 65+) are placed in the columns. Health (perhaps measured on a scale from 1
to 10 with 10 being the best) is placed in the rows. Placing your data using this
standardized format makes it easier to interpret results.
An experimental variable.
An explained variable.
A measured variable.
An outcome variable.
An output variable.
A responding variable.
A regressand (in regression analysis).
A response variable.
Outcome variable.
A simple example: let’s say you were interested in whether snack foods improved test
scores. In an experimental study you could separate students into two groups, feed one
group snacks while taking a test and deny the other group (the control group) access to
food. In the non-experimental case, you would find a group of students (say, in an
entire college) and separate the students into those who eat snacks during a test and
those who do not. You could then observe their performance on a test.
Expert opinion.
One or more case reports.
Program evaluations. These are studies designed to see whether a program is
meeting its goals.
Quality improvement methods (Plan-Do-Study-Act), used to measure or
redefine standards.
Case control studies; performed after an event has happened. Data is gathered
and the researcher attempts to find the cause based on this historical data.
Cohort studies: similar to case control but the participants are gathered before
any event has happened. For example, a group of 1,000 people age 40-50 might
be studied for 10 years to see who develops heart disease.
Hypothesis Testing
In statistics, a hypothesis has to be set and defined in the course of a statistical survey or research
study. It is termed a statistical hypothesis, and it is an assumption about a population parameter. It
is by no means certain, however, that this hypothesis will prove to be true. Hypothesis testing refers
to the predefined formal procedures that statisticians use to decide whether to accept or reject
hypotheses. It can be defined as the process of choosing between hypotheses about a particular
probability distribution on the basis of observed data.
Hypothesis testing is a core topic in statistics. In research, a hypothesis is a proposed explanation
of a phenomenon. The null hypothesis is the hypothesis that the researcher sets out to challenge.
Generally, the null hypothesis represents the current explanation of, or view on, the feature that
the researcher is going to test. Hypothesis testing consists of tests that determine which outcomes
would lead to rejection of the null hypothesis at a specified level of significance. This helps to
establish whether the results carry enough information, given that conventional wisdom is used to
set up the null hypothesis.
Hypothesis testing is used in the context of a research study, to evaluate and analyze the results
of that study. Let us learn more about this topic.
Hypothesis testing is one of the most important concepts in statistics. A statistical hypothesis is an
assumption about a population parameter. This assumption may or may not be true. The
methodology employed by the analyst depends on the nature of the data used and the goals of the
analysis. The goal is to either accept or reject the null hypothesis.
1. Test Statistic
The decision whether to accept or reject the null hypothesis is based on this value. The
test statistic is computed from a defined formula based on a distribution such as t, z or F. If the
calculated test statistic is less than the critical value, we accept the hypothesis; otherwise, we reject it.
The z test statistic is used for testing the mean of a large sample. The test statistic is given by
$z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
where $\bar{x}$ is the sample mean, $\mu$ is the population mean, $\sigma$ is the population standard
deviation and $n$ is the sample size.
2. Level of Significance
The level of significance is the probability threshold at which a null hypothesis is accepted or
rejected; it is the probability of rejecting the null hypothesis when it is in fact true. The
level of significance is denoted by $\alpha$.
3. Critical Value
The critical value is the value that divides the range of the test statistic into two regions: the
acceptance region and the rejection region. If the computed test statistic falls in the rejection
region, we reject the hypothesis. Otherwise, we accept the hypothesis. The critical value depends
upon the level of significance and the alternative hypothesis.
The alternative hypothesis is one-sided if it states that the parameter is larger (or smaller) than
the null hypothesis value. It is two-sided if it states only that the parameter is different from the
null hypothesis value. The null hypothesis is usually tested against an alternative hypothesis (H1).
The alternative hypothesis can take one of three forms:
5. P - Value
The P-value is the probability that the test statistic takes a value as extreme as, or more extreme
than, the one observed, assuming that the null hypothesis is true. Put another way, it is the
probability of seeing the observed difference, or a greater one, just by chance if the null hypothesis
is true. The larger the P-value, the weaker the evidence against the null hypothesis.
Hypothesis Benefits and Process
Type 1: When we accept the research hypothesis even though the null hypothesis is in fact
correct.
Type 2: When we reject the research hypothesis even though the null hypothesis is incorrect.
Hypothesis testing begins with a hypothesis about the population parameter. Data are then collected
from an appropriate sample, and the information obtained from the sample is used to decide how
likely it is that the hypothesized population parameter is correct. The purpose of hypothesis testing
is not to question the computed value of the sample statistic but to make a judgement about the
difference between that sample statistic and a hypothesized population parameter.
We illustrate the five steps to hypothesis testing in the context of testing a specified value for a
population proportion. The procedure for hypothesis testing is given below:
1) Simple Hypothesis
If a hypothesis specifies the population completely, that is, both its functional form and its
parameters, it is called a simple hypothesis.
Example:
The hypothesis “Population is normal with mean 15 and standard deviation 5” is a simple
hypothesis.
2) Composite Hypothesis or Multiple Hypothesis
If the hypothesis does not specify the population completely in terms of its parameters, then it
is a composite hypothesis or multiple hypothesis.
Example:
The hypothesis “Population is normal with mean 15” is a composite or multiple hypothesis, because
the standard deviation is left unspecified.
3) Parametric Hypothesis
A hypothesis that specifies only the parameters of the probability density function is called a
parametric hypothesis.
Example:
If a hypothesis specifies only the form of the density function in the population, it is called a non-
parametric hypothesis.
Example:
A null hypothesis can be defined as a statistical hypothesis that is stated for possible acceptance.
It is the original hypothesis. Any hypothesis other than the null hypothesis is called an alternative
hypothesis. When the null hypothesis is rejected, we accept the alternative hypothesis. The null
hypothesis is denoted by H0 and the alternative hypothesis by H1.
Example:
When we want to test if the population mean is 30, then the null hypothesis is “Population mean is 30”
and the alternative hypothesis is “Population mean is not 30”.
Logic of Hypothesis Testing
The logic of hypothesis testing is similar to the principle of "presumed innocent until proven guilty". In
hypothesis testing, we assume that the null hypothesis is a possible truth until the sample data
conclusively demonstrate otherwise. A hypothesis test is a statistical method that uses sample data
to evaluate a hypothesis about a population.
Rejecting the null hypothesis when it is true is called a Type I error, whereas accepting the null
hypothesis when it is false is called a Type II error. The probability of a Type I error is the level
of significance, $\alpha$; the probability of a Type II error is denoted by $\beta$.
Example:
Suppose a toy manufacturer and its main supplier agreed that the quality of each shipment will
meet a particular benchmark. Our null hypothesis is that the quality is 90%. If we reject the
shipment when the quality is in fact at least 90%, we have committed a Type I error. If we accept the
shipment when the quality is in fact less than 90%, we have committed a Type II error.
1. Only one of the two errors, Type I or Type II, is possible at a time.
2. The power of a test is defined as 1 minus the probability of a Type II error: Power = $1 - \beta$.
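To make the relation between the Type II error probability and power concrete, here is a minimal sketch for a right-tailed z-test. Every number in it (the two means, the standard deviation, the sample size and the significance level) is hypothetical and chosen only for illustration; none comes from the text.

```python
from math import sqrt
from statistics import NormalDist

# All numbers below are hypothetical, chosen only for illustration.
mu_0, mu_true = 100.0, 105.0   # mean under H0 and assumed true mean
sigma, n, alpha = 15.0, 30, 0.05

std_normal = NormalDist()
z_critical = std_normal.inv_cdf(1 - alpha)      # right-tailed cutoff
shift = (mu_true - mu_0) / (sigma / sqrt(n))    # true mean in standard-error units

# Type II error: the statistic stays below the cutoff although H0 is false.
beta = std_normal.cdf(z_critical - shift)
power = 1 - beta
print(f"beta = {beta:.3f}, power = {power:.3f}")
```

Increasing the sample size or the gap between the two means pushes the Type II error probability down and the power up.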
Step 4: Finding the critical value with required level of significance and degrees of freedom
The problem of multiple hypothesis testing arises when more than one hypothesis is
tested simultaneously for statistical significance. Multiple hypothesis testing occurs in a wide
variety of fields and for a variety of purposes.
An alternative way of viewing multiple hypothesis testing is as a multiple decision problem. When
considering multiple testing problems, the concern is with Type I errors when hypotheses are true
and Type II errors when they are false. The evaluation of the procedures is based on criteria
involving the balance between these errors.
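One way to see why simultaneous testing is a problem is to compute the chance of at least one Type I error across several tests. The sketch below assumes the tests are independent and that every null hypothesis is true; the function name and the choice of 0.05 as the per-test level are illustrative assumptions.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """P(at least one Type I error) across independent tests at level alpha."""
    return 1 - (1 - alpha) ** n_tests


alpha = 0.05
for n_tests in (1, 5, 10, 20):
    rate = familywise_error_rate(alpha, n_tests)
    print(f"{n_tests:>2} tests: {rate:.3f}")
```

With ten tests the chance of at least one false rejection is already around 40%, far above the 5% that applies to a single test.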
Bayesian hypothesis testing involves specifying a hypothesis and collecting evidence that supports
or does not support that statistical hypothesis. The amount of evidence can be used to specify the
degree of belief in a hypothesis in probabilistic terms. The probability of a hypothesis can become
very high or very low. Hypotheses with a high probability are accepted as true, and those with a low
probability are rejected as false.
Bayesian hypothesis testing works just like any other type of Bayesian inference. Let us consider
the case where we are considering only two hypotheses, $H_1$ and $H_2$.
The probability of our data, $P(\vec{x})$, takes into account the possibility of each hypothesis under
consideration to be true:
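For two hypotheses this is the law of total probability, $P(\vec{x}) = P(\vec{x} \mid H_1) P(H_1) + P(\vec{x} \mid H_2) P(H_2)$, and Bayes' theorem then gives the posterior probability of each hypothesis. A minimal sketch follows; the function name, the prior and the two likelihood values are all hypothetical, picked only to show the arithmetic.

```python
def posteriors(prior_h1: float, likelihood_h1: float, likelihood_h2: float):
    """Return (P(H1 | x), P(H2 | x)) when H1 and H2 are the only candidates."""
    prior_h2 = 1 - prior_h1
    # P(x): the data's probability, weighted over both hypotheses.
    evidence = likelihood_h1 * prior_h1 + likelihood_h2 * prior_h2
    return likelihood_h1 * prior_h1 / evidence, likelihood_h2 * prior_h2 / evidence


# Hypothetical inputs: equal priors, data four times likelier under H1.
post_h1, post_h2 = posteriors(prior_h1=0.5, likelihood_h1=0.8, likelihood_h2=0.2)
print(f"P(H1|x) = {post_h1:.2f}, P(H2|x) = {post_h2:.2f}")
```

With equal priors, the posterior probabilities are simply the likelihoods rescaled to sum to one.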
This weight, expressed as a probability, is called the level of significance (p-value) of the
statistical test. It is the probability of obtaining a value of the statistic that is as likely or
more likely to lead to rejection of $H_0$ as the actual observed value of the test statistic,
assuming that the null hypothesis is true.
If the level of significance is a small value, then the sample data fail to support the null hypothesis
and we reject $H_0$. If the level of significance is a large value, then we fail to reject the null hypothesis.
Solved Example
Question: XYZ Company, with a very small turnover, is taking feedback on permanent
employees. During the feedback process, it was found that the average age of XYZ
employees is 20 years. The relevance of the data was verified by taking a random sample of
a hundred workers, whose average age turns out to be 19 years with a standard deviation of 2
years. Should XYZ continue to make its claim, or should it make changes?
Solution:
2. State the Significance Level: Since the company would like to maintain its present
message to new human resources, XYZ selects a fairly weak significance
level ($\alpha$ = 0.05). Because this is a two-tailed analysis, half of the alpha will be
assigned to each tail of the distribution. In this case the critical values are Z =
+1.96 and -1.96.
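The arithmetic for the rest of this example can be sketched as follows. This is a minimal illustration, assuming a large-sample z-test in which the sample standard deviation stands in for the population value, and taking a significance level of 0.05, the level that corresponds to the critical values of +1.96 and -1.96 quoted in the solution. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

claimed_mean = 20.0   # average age claimed by the company
sample_mean = 19.0    # average age in the sample
sample_sd = 2.0       # sample standard deviation, used in place of sigma
n = 100
alpha = 0.05          # level that matches the +/-1.96 critical values

z = (sample_mean - claimed_mean) / (sample_sd / sqrt(n))
z_critical = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed cutoff

reject_claim = abs(z) > z_critical
print(f"z = {z:.2f}, critical = +/-{z_critical:.2f}, reject H0: {reject_claim}")
```

Under these assumptions the test statistic falls well outside the acceptance region, so the sample does not support the claimed average age of 20 years.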