
Agenda

 Review of Statistics
Statistical Inference
Hypothesis Testing
Type I Errors, Type II Errors, p-value (1 hr)
One-Tailed Test, Two-Tailed Test (2 hrs)

EduPristine | (Confidential)
Statistical inference

 Statistical inference is the process of drawing conclusions about a population from a sample drawn from that population
 Estimation and hypothesis testing are the two branches of statistical inference
Estimation: Using sample statistics (mean, variance, etc.) to estimate the corresponding population parameters
Hypothesis testing: Judging whether a hypothesis about the population is supported by the sample estimates

Hypothesis testing

 A statistical hypothesis test is a method of making statistical decisions from and about
experimental data
 Null-hypothesis testing answers the question:
"How well do the findings fit the possibility that chance factors alone might be responsible?"
Example: Does your score of 6/10 imply that I am a good teacher?

 There are five ingredients to any statistical test:


Null Hypothesis
Alternate Hypothesis
Test Statistic
Rejection / Critical Region
Conclusion

Properties of point estimators
 Linearity
 Unbiasedness
 Minimum Variance
 Efficiency
 Best Linear unbiased estimator (BLUE)
 Consistency

 Unbiased estimator: The expected value of the estimator is equal to the true value of the parameter

 Efficient estimator: Considering only the unbiased estimators of a parameter, the one which
has least variance is called the efficient estimator

 Consistent estimator: The estimator which approaches the true value of its parameter as the
sample size increases
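
The unbiasedness property can be checked exactly on a toy example. The sketch below is illustrative only: the three-value population and the helper names are assumptions, not part of the course material. It enumerates every possible sample of size 2 (drawn with replacement) and averages two variance estimators over all of them, which gives each estimator's expected value.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy population (an assumption for illustration only).
population = [1, 2, 3]
n = 2  # sample size

def mean(values):
    """Exact arithmetic mean as a Fraction."""
    return Fraction(sum(values), len(values))

def variance(values, denominator):
    """Sum of squared deviations from the mean, divided by `denominator`."""
    m = mean(values)
    return sum((Fraction(v) - m) ** 2 for v in values) / denominator

# True population variance (divide by the population size).
true_variance = variance(population, len(population))

# Every equally likely sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

# Expected value of an estimator = its average over all possible samples.
expected_unbiased = mean([variance(s, n - 1) for s in samples])
expected_biased = mean([variance(s, n) for s in samples])

print(true_variance, expected_unbiased, expected_biased)
```

Dividing by n - 1 gives an estimator whose expected value matches the population variance, while dividing by n systematically underestimates it in this example.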

Launching a niche course for MBA students?

 Christos, brand manager for a leading financial training center, wants to introduce a new niche finance
course for MBA students. He met some industry stalwarts and found that, with the skills acquired by
attending such a course, the students would be in high demand in the job market
 He meets a random sample of 100 students and discovers the following characteristics of the market
Mean household income = $20,000
Interest level in students = high
Current knowledge of students for the niche concepts = low
 Christos strongly believes the course would be adequately profitable if students have the buying
power for the course. They would be able to afford the course only if the mean household income is
greater than $19,000
 Would you advise Christos to introduce the course?
What should be the hypothesis?
Hint: What is the point at which the decision changes (19,000 or 20,000)?
What about the alternate hypothesis?
What other information do you need to ensure that the right decision is arrived at?
Hint: confidence intervals / significance levels?
Hint: Is there any other factor apart from mean, which is important? How do I move from population parameters
to standard errors?
What is the risk still remaining, when you take this decision?
Hint: Type-I / II errors?
Hint: P-value

Identifying the critical sample mean value: sampling distribution

 To reach a final decision, Christos has to make a general inference (about the population)
from the sample data
 Criterion: Mean income across all households in the market area under consideration
If the mean population household income is greater than $19,000, then Christos should introduce
the product line into the new market

 Christos's decision making is equivalent to either accepting or rejecting the hypothesis:
The population mean household income in the new market area is greater than $19,000
 The term one-tailed signifies that all z-values that would cause Christos to reject H0 are in
just one tail of the sampling distribution
μ -> Population mean
H0: μ ≤ $19,000
Ha: μ > $19,000

Christos's criterion for decision making

[Figure: sampling distribution of the sample mean centered on $19,000, with the critical value (Xc) marked in the right tail]

 Sample mean values greater than $19,000 (that is, x̄ values on the right-hand side of the
sampling distribution centered on μ = $19,000) suggest that H0 may be false
 More importantly, the farther to the right x̄ is, the stronger the evidence against H0

Reject H0 if the sample mean exceeds Xc

Computing the criterion value
 The standard deviation (s) for the sample of n = 100 households is $4,000. The standard error of the
mean (s_x̄) is given by:
s_x̄ = s / √n = $4,000 / √100 = $400
 The critical mean household income x̄c is found through the following two steps:
Determine the critical z-value, zc. For α = 0.05:
zc = 1.645
Substitute the values of zc, s_x̄, and μ (under the assumption that H0 is "just" true):
x̄c = μ + zc × s_x̄ = $19,000 + 1.645 × $400 = $19,658

Decision Rule:
If the sample mean household income is greater than $19,658, reject the null hypothesis and
introduce the new course
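
The two steps and the decision rule can be reproduced numerically. This is a minimal sketch using only the figures from the case; the variable names are illustrative assumptions, and it relies on `statistics.NormalDist` from the Python standard library (Python 3.8 or later) for the normal quantile.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 19_000    # population mean when H0 is "just" true
s = 4_000        # sample standard deviation
n = 100          # sample size
alpha = 0.05     # significance level
x_bar = 20_000   # observed sample mean

# Standard error of the mean: s / sqrt(n).
standard_error = s / sqrt(n)

# Step 1: critical z-value for a one-tailed (right-tail) test.
z_critical = NormalDist().inv_cdf(1 - alpha)

# Step 2: critical sample mean on the income scale.
x_critical = mu_0 + z_critical * standard_error

# Decision rule: reject H0 if the sample mean exceeds the critical value.
reject_h0 = x_bar > x_critical

print(standard_error, round(z_critical, 3), round(x_critical), reject_h0)
```

With the case figures, the critical mean rounds to $19,658, so the observed $20,000 falls in the rejection region.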

Test statistic

 The value of the test statistic is simply the z-value corresponding to x̄ = $20,000:
z = (x̄ - μ) / s_x̄ = ($20,000 - $19,000) / $400 = 2.5
 In this case, since the observed sample statistic ($20,000) is greater than the
critical value ($19,658), the null hypothesis is rejected
 There is a significant difference between the hypothesized population parameter and
the observed sample statistic
 Mean income > $19,000 => launch the course

[Figure: sampling distribution centered on μ = $19,000 (z = 0) with α = 0.05 in the right tail. The critical value x̄c = μ + zc × s_x̄ = $19,658 (zc = 1.645) separates the "Do not reject H0" region from the "Reject H0" region; the observed x̄ = $20,000 (z = 2.5) lies in the rejection region]
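
The same decision can be made on the z-scale instead of the income scale. The sketch below is an illustration under the case's figures; the function name `z_statistic` is a hypothetical helper, not something defined in the course material.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(x_bar, mu_0, s, n):
    """z-value of the sample mean under H0, using the standard error s / sqrt(n)."""
    return (x_bar - mu_0) / (s / sqrt(n))

z = z_statistic(x_bar=20_000, mu_0=19_000, s=4_000, n=100)

# Right-tail critical z-value at the 5% significance level.
z_critical = NormalDist().inv_cdf(1 - 0.05)

# Comparing z with z_critical is equivalent to comparing the sample mean
# with the critical sample mean.
reject_h0 = z > z_critical

print(z, reject_h0)
```

Working on the z-scale makes the result independent of units, which is why critical values are usually tabulated as z-values.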

Errors in estimation
 Please note:
You are inferring for a population, based only on a sample
This is no proof that your decision is correct
It's just a hypothesis
There is still a chance that your inference is wrong
How do I quantify the probability of error in inference?

                          Actual: H0 is true            Actual: H0 is false
Inference: H0 is true     Correct decision              Type II error
                          Confidence level = 1 - α      P(Type II error) = β
Inference: H0 is false    Type I error                  Correct decision
                          Significance level = α        Power = 1 - β

 Type I and Type II Errors
Type I error occurs if the null hypothesis is rejected
when it is true
Type II error occurs if the null hypothesis is not rejected
when it is false

 Significance Level
α -> Significance level
the upper-bound probability of a Type I error
1 - α -> Confidence level
the complement of the significance level
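
To see how these quantities relate, the sketch below computes them for the Christos case. The Type II error probability depends on the true population mean, which the case does not give, so `mu_true = 20_000` is a purely hypothetical value chosen for illustration; the other figures come from the case.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 19_000               # mean under H0
s, n, alpha = 4_000, 100, 0.05
mu_true = 20_000            # HYPOTHETICAL true mean (assumption, not from the case)

standard_error = s / sqrt(n)
x_critical = mu_0 + NormalDist().inv_cdf(1 - alpha) * standard_error

# Type I error: rejecting H0 (sample mean above x_critical) when mu = mu_0.
type_i = 1 - NormalDist(mu=mu_0, sigma=standard_error).cdf(x_critical)

# Type II error: not rejecting H0 (sample mean at or below x_critical)
# when the true mean is actually mu_true.
beta = NormalDist(mu=mu_true, sigma=standard_error).cdf(x_critical)

# Power: probability of correctly rejecting a false H0.
power = 1 - beta

print(round(type_i, 3), round(beta, 3), round(power, 3))
```

Under this assumed true mean, the test would miss the difference roughly one time in five; a true mean closer to $19,000 would give a larger β and lower power.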

P-value: actual significance level

 P-value
The probability of obtaining an observed value of x̄ (from the sample) as high as
$20,000 or more when the actual population mean (μ) is only $19,000 = 0.00621
This value is sometimes called the actual significance level, or the p-value
Calculated probability of rejecting the null hypothesis (H0) when that hypothesis (H0)
is true (Type I error)
 The actual significance level of 0.00621 in this case means that the odds are about
62 out of 10,000 that a sample mean income of $20,000 or more would have
occurred entirely due to chance (when the population mean income
is $19,000)

[Figure: sampling distribution centered on μ = $19,000 (z = 0), showing the "Do not reject H0" and "Reject H0" regions for α = 0.05 and the p-value = 0.00621 as the right-tail area beyond the observed sample mean]
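
The p-value quoted above can be reproduced as the right-tail area of the standard normal distribution beyond z = 2.5. This is a minimal sketch with illustrative variable names, using `statistics.NormalDist` from the Python standard library.

```python
from math import sqrt
from statistics import NormalDist

mu_0, x_bar = 19_000, 20_000   # hypothesized mean and observed sample mean
s, n, alpha = 4_000, 100, 0.05

# z-value of the observed sample mean under H0.
z = (x_bar - mu_0) / (s / sqrt(n))

# One-tailed p-value: probability of a sample mean this high or higher
# when the population mean is mu_0.
p_value = 1 - NormalDist().cdf(z)

# Reject H0 when the p-value is below the chosen significance level.
reject_h0 = p_value < alpha

print(round(p_value, 5), reject_h0)
```

Comparing the p-value with α leads to the same decision as comparing the sample mean with the critical value, and it also shows how far inside the rejection region the result lies.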

