Statistical Inference

Statistics
It is a branch of mathematics used to summarize, analyze, and
interpret a group of numbers or observations.
Types of Statistics
Descriptive Statistics :
It summarizes data to make sense of a list of numeric values.
Inferential Statistics :
It is used to infer or generalize observations made with
samples to the larger population from which they
were selected. Broadly, it is classified into the theory of
estimation and the testing of hypotheses.

Estimation & Testing of Hypothesis
Estimation
The method to estimate the value of a population
parameter from the value of the corresponding sample
statistic.

Testing of Hypothesis
A procedure that uses sample data to evaluate a claim or belief
about an unknown parameter value.

Types of Estimation

Point estimation
It is the value of a sample statistic that is used as the
single most likely value of the unknown population
parameter.

Interval estimation
It is a range of values that is likely to contain the
population parameter value with a specified level of
confidence.
Properties of a good estimator
Consistency -
The statistic tends to get closer to the population parameter
as the sample size increases.
Unbiasedness -
E(Statistic) = Parameter, i.e., the expected value of the
statistic equals the parameter.
Efficiency -
Refers to the size of the standard error (SE). E.g., for normally
distributed data the SE of the sample median is greater than the
SE of the sample mean, so the sample mean is more efficient.
Sufficiency -
Refers to how much of the sample information the statistic uses.
E.g., the sample mean uses the value of every observation, while
the sample median uses only their ordering, so the mean makes
fuller use of the sample.
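The efficiency claim can be checked with a small simulation. The sketch below is illustrative only: the normal population, the sample size of 25, the number of trials, and the function name `standard_errors` are all assumptions made for the demonstration, not part of the original notes.

```python
import random
import statistics

def standard_errors(n=25, trials=2000, seed=0):
    """Estimate the SE of the sample mean and the sample median by simulation.

    Assumes a standard normal population (an illustrative choice).
    """
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    # The standard deviation of a statistic across repeated samples
    # is its standard error.
    return statistics.stdev(means), statistics.stdev(medians)

se_mean, se_median = standard_errors()
print(f"SE(mean)   ~ {se_mean:.3f}")
print(f"SE(median) ~ {se_median:.3f}")
```

Under these assumptions the median's standard error comes out larger than the mean's, which is what "the sample mean is more efficient" means.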
Drawback of point estimation
No information is available regarding its reliability, i.e.,
how close it is to the true population parameter.

In fact, the probability that a single sample statistic
exactly equals the population parameter is
extremely small.
Interval Estimation
Confidence interval = Point estimate ± margin of error

Margin of error = (critical value of Z or t at the chosen
confidence level, e.g. 90%, 95%, and so on)
× (standard error of the particular statistic)
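As a minimal sketch of this recipe, the code below builds a Z-based interval from a point estimate and its standard error. The function names and the numbers (point estimate 50, standard error 2) are hypothetical, chosen only to show the arithmetic.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for a confidence level such as 0.95."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

def confidence_interval(point_estimate, standard_error, confidence=0.95):
    """Point estimate +/- (critical value * standard error)."""
    margin = z_critical(confidence) * standard_error
    return point_estimate - margin, point_estimate + margin

# Hypothetical inputs: point estimate 50, standard error 2
low, high = confidence_interval(50.0, 2.0, confidence=0.95)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

Raising the confidence level raises the critical value, so the interval widens.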
Estimation
Population mean: e.g., average salary
Population proportion: e.g., stock market
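Interval Estimation for population mean ($\mu$)
Sample size: large sample ($n \ge 30$)
Formulae:

Known SD ($\sigma$):
$$\bar{x} \pm Z_{\alpha/2} \, \frac{\sigma}{\sqrt{n}}$$

Unknown SD ($\sigma$):
$$\bar{x} \pm Z_{\alpha/2} \, \frac{S}{\sqrt{n}}$$

Sample mean square ($S^2$):
$$S^2 = \frac{1}{n-1} \sum (x - \bar{x})^2$$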
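The large-sample formulae can be sketched as follows. The salary data are generated at random purely for illustration, and the function names are assumptions; when `sigma` is not supplied, the sample standard deviation S (with the n - 1 divisor) is used in its place.

```python
import math
import random
from statistics import NormalDist, fmean

def sample_sd(data):
    """S = sqrt( sum((x - xbar)^2) / (n - 1) )."""
    n = len(data)
    xbar = fmean(data)
    return math.sqrt(sum((x - xbar) ** 2 for x in data) / (n - 1))

def mean_ci_large_sample(data, confidence=0.95, sigma=None):
    """xbar +/- z_{alpha/2} * sd / sqrt(n); uses sigma if known, else S."""
    n = len(data)
    xbar = fmean(data)
    sd = sigma if sigma is not None else sample_sd(data)
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * sd / math.sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical data: 36 salaries (arbitrary units), n >= 30
rng = random.Random(1)
salaries = [rng.gauss(40.0, 5.0) for _ in range(36)]

low, high = mean_ci_large_sample(salaries, confidence=0.95)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```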
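Interval Estimation for population mean ($\mu$)
Sample size: small sample ($n < 30$)
Formulae:

Known SD ($\sigma$):
$$\bar{x} \pm Z_{\alpha/2} \, \frac{\sigma}{\sqrt{n}}$$

Unknown SD ($\sigma$):
$$\bar{x} \pm t_{\alpha/2} \, \frac{S}{\sqrt{n}}$$

Sample mean square ($S^2$):
$$S^2 = \frac{1}{n-1} \sum (x - \bar{x})^2$$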
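For the small-sample case with unknown SD, a sketch might look like the following. The ten observations are hypothetical. The critical value is passed in from a t table, because the Python standard library has no t distribution; 2.262 is the usual table entry for $\alpha/2 = 0.025$ with n - 1 = 9 degrees of freedom.

```python
import math
from statistics import fmean, stdev

def mean_ci_small_sample(data, t_critical):
    """xbar +/- t_{alpha/2} * S / sqrt(n), with t taken from a t table."""
    n = len(data)
    xbar = fmean(data)
    s = stdev(data)  # sample SD, uses the n - 1 divisor
    margin = t_critical * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical sample of n = 10 observations
data = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7, 12.5, 12.1]

# t table value for alpha/2 = 0.025 and 9 degrees of freedom
T_025_DF9 = 2.262

low, high = mean_ci_small_sample(data, T_025_DF9)
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
```

The t critical value is larger than the corresponding Z value (1.96), so the interval is wider than the Z-based interval for the same data.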
Interval Estimation for population proportion ($p$)
With $\hat{p}$ the sample proportion:
$$\hat{p} \pm Z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
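A minimal sketch of the proportion interval follows. The counts (250 out of 550) are reused from the cable TV example in the Z-test part of these notes; the function name and the 95% confidence level are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """p_hat +/- z_{alpha/2} * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - margin, p_hat + margin

low, high = proportion_ci(250, 550, confidence=0.95)
print(f"95% CI for p: ({low:.4f}, {high:.4f})")
```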
Test of hypothesis
Hypothesis
Statements about characteristics of populations,
denoted as H.
Types of Hypothesis
Null & Alternative hypothesis
Simple & Composite hypothesis
Hypothesis Testing
Null Hypothesis -
The hypothesis actually tested is called the null hypothesis. It is
denoted as $H_0$. It is the claim that is initially assumed to be true.
It may usually be considered the skeptic's hypothesis: nothing
new or interesting is happening here!

Alternative Hypothesis -
The other hypothesis, assumed true if the null is false, is the
alternative hypothesis. It is denoted as $H_1$ or $H_a$. $H_a$ may
usually be considered the researcher's hypothesis.

These two hypotheses are mutually exclusive and exhaustive, so
that one is true to the exclusion of the other.

The possible conclusions from a hypothesis-testing analysis are
"reject $H_0$" or "fail to reject $H_0$".
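Hypothesis Testing
Simple hypothesis -
It specifies the distribution completely, e.g. a single parameter
value such as $\mu = \mu_0$ (with any other parameters known).

Composite hypothesis -
It does not specify the distribution completely, e.g. a range of
values such as $\mu > \mu_0$ or $\mu \ne \mu_0$.

The direction of the alternative determines the number of tails.
One-tailed test:
$H_0: \mu_1 = \mu_2$
$H_a: \mu_1 > \mu_2$ or $\mu_1 < \mu_2$

Two-tailed test:
$H_0: \mu_1 = \mu_2$
$H_a: \mu_1 \ne \mu_2$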

Examples of Hypothesis :
There exists a positive relationship between
attendance and result.
Bankers assume that high-income earners are more
profitable than low-income earners.

Rules for Hypotheses
$H_0$ is always stated as an equality claim involving
parameters.
$H_a$ is an inequality claim that contradicts $H_0$. It may
be one-sided (using either > or <) or two-sided
(using ≠).
A test of hypotheses is a method for using sample
data to decide whether the null hypothesis should be
rejected.
Rejection region - Values of the test statistic for
which we reject the null in favor of the alternative
hypothesis.

Test Procedure

A test procedure is specified by
1. A test statistic, a function of the sample data
on which the decision is to be based.
2. (Sometimes, not always!) A rejection region,
the set of all test statistic values for which $H_0$
will be rejected.
Hypothesis Testing

                  Test result
True state        H0 true (fail to reject)   H0 false (reject)
H0 true           Correct decision           Type I error
H0 false          Type II error              Correct decision

$\alpha = P(\text{Type I error})$, $\beta = P(\text{Type II error})$
Goal: keep $\alpha$, $\beta$ reasonably small.
Errors in Hypothesis Testing
A type I error consists of rejecting the null
hypothesis $H_0$ when it is true.
A type II error consists of not rejecting $H_0$ when
$H_0$ is false.
$\alpha$ and $\beta$ are the probabilities of type I and
type II error, respectively.
Level $\alpha$ Test
A test corresponding to the significance level $\alpha$ is
called a level $\alpha$ test. A test with significance level
$\alpha$ is one for which the type I error probability is
controlled at the specified level.
Sometimes, the experimenter will fix the value of
$\alpha$, also known as the significance level.
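What "controlled at the specified level" means can be illustrated by simulation. In the sketch below the null hypothesis is true by construction, so every rejection is a type I error. The population values (mean 500, SD 1, n = 36), the number of trials, and the function name are assumptions chosen for the demonstration.

```python
import math
import random
from statistics import NormalDist, fmean

def type_i_error_rate(alpha=0.05, n=36, trials=2000, seed=0):
    """Fraction of two-tailed Z-tests that reject H0 when H0 is actually true."""
    rng = random.Random(seed)
    mu0, sigma = 500.0, 1.0  # samples are drawn with mean mu0, so H0 holds
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / math.sqrt(n))
        if abs(z) > z_crit:
            rejections += 1  # rejecting a true H0 is a type I error
    return rejections / trials

rate = type_i_error_rate(alpha=0.05)
print(f"Observed type I error rate: {rate:.3f}")
```

The observed rate should land close to the chosen significance level, up to simulation noise.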
Steps in Hypothesis-Testing Analysis
1. Set up $H_0$ and $H_a$
2. Identify the nature of the sampling distribution curve
and specify the appropriate test statistic
3. Determine whether the hypothesis test is one-tailed or
two-tailed
4. Taking into account the specified significance level,
determine the critical value (two critical values for a
two-tailed test) for the test statistic from the appropriate
statistical table
5. State the decision rule for rejecting $H_0$
6. Compute the value for the test statistic from the sample
data
7. Using the decision rule specified in step 5, either reject
$H_0$ or fail to reject $H_0$
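Steps 3 to 7 can be sketched as a small helper for Z-tests. The function name `z_test_decision` and its `tail` argument are hypothetical. Critical values are computed from the standard normal distribution rather than read from a table, and the test statistic from step 6 is assumed to be already computed.

```python
from statistics import NormalDist

def z_test_decision(z_stat, alpha, tail):
    """Apply the decision rule for a Z-test.

    tail is 'left', 'right' or 'two' (step 3); alpha is the significance level.
    """
    std_normal = NormalDist()
    if tail == "left":
        critical = std_normal.inv_cdf(alpha)              # step 4
        reject = z_stat < critical                        # steps 5 and 7
    elif tail == "right":
        critical = std_normal.inv_cdf(1.0 - alpha)
        reject = z_stat > critical
    elif tail == "two":
        critical = std_normal.inv_cdf(1.0 - alpha / 2.0)  # +/- critical
        reject = abs(z_stat) > critical
    else:
        raise ValueError("tail must be 'left', 'right' or 'two'")
    return "reject H0" if reject else "fail to reject H0"

# Hypothetical test statistics
print(z_test_decision(2.10, alpha=0.05, tail="two"))
print(z_test_decision(1.00, alpha=0.05, tail="two"))
```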

Large sample test (Z-test)

Test for single mean:
A cinema hall has a cool drinks fountain supplying
orange drinks and colas. When the machine is turned on, it fills
a 550 ml cup with 500 ml of the required drink. The
manager has 3 problems.
1. The clients have been complaining that the machine supplies less than
500 ml.
2. The manager wants to make sure that the amount of cola does not exceed
500 ml.
3. The manager wants to minimize customer complaints and at the same time
does not want any overflow.
In the case of the cinema hall, suppose n = 36, the sample
mean = 499 ml, and the specifications of the machine give
the standard deviation of the output as 1 ml. The
significance level is 10%.
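A worked sketch of the cinema hall example follows. Mapping the manager's three problems to a left-tailed, a right-tailed, and a two-tailed test is an interpretation, not something stated in the notes. In every case $H_0$ is $\mu = 500$, and the test statistic is $Z = (\bar{x} - \mu_0) / (\sigma / \sqrt{n})$.

```python
import math
from statistics import NormalDist

# Figures from the cinema hall example
n, sample_mean, mu0, sigma, alpha = 36, 499.0, 500.0, 1.0, 0.10

# Test statistic for a single mean with known sigma
z = (sample_mean - mu0) / (sigma / math.sqrt(n))

std_normal = NormalDist()
# Problem 1 (assumed left-tailed):  Ha: mu < 500
reject_left = z < std_normal.inv_cdf(alpha)
# Problem 2 (assumed right-tailed): Ha: mu > 500
reject_right = z > std_normal.inv_cdf(1 - alpha)
# Problem 3 (assumed two-tailed):   Ha: mu != 500
reject_two = abs(z) > std_normal.inv_cdf(1 - alpha / 2)

print(f"z = {z:.2f}")
print("Left-tailed: ", "reject H0" if reject_left else "fail to reject H0")
print("Right-tailed:", "reject H0" if reject_right else "fail to reject H0")
print("Two-tailed:  ", "reject H0" if reject_two else "fail to reject H0")
```

With z = -6, the sample supports the complaint that the machine dispenses less than 500 ml, and gives no evidence that it dispenses more.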


Test for difference of means:
Business Today has conducted a survey in
Sonepur and Muzaffarpur on the hourly wages of
laborers. The results of the survey are as follows.
Town          Mean hourly wage   S.D.       Sample size
Sonepur       Rs. 8.95           Rs. 0.40   200
Muzaffarpur   Rs. 9.10           Rs. 0.60   175
Business Today wants to test, at the 0.05 significance
level, the hypothesis that there is no difference
between hourly wages for the landless laborers in the
two towns.
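The wage survey can be worked through as a two-tailed Z-test with $H_0: \mu_1 = \mu_2$. This sketch assumes the sample standard deviations may stand in for the population values because both samples are large; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Survey figures: (mean hourly wage, SD, sample size)
mean1, sd1, n1 = 8.95, 0.40, 200   # Sonepur
mean2, sd2, n2 = 9.10, 0.60, 175   # Muzaffarpur
alpha = 0.05

# Under H0 the hypothesised difference mu1 - mu2 is 0
standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
z = (mean1 - mean2) / standard_error

z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
reject = abs(z) > z_crit

print(f"z = {z:.3f}, critical value = +/-{z_crit:.3f}")
print("reject H0" if reject else "fail to reject H0")
```

Under these assumptions |z| exceeds the critical value, so the hypothesis of equal hourly wages is rejected at the 5% level.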


Test for proportion:
A cable TV operator claims that 40% of the homes in a
city have opted for his services. Before sponsoring
advertisements on the local cable channel, a company
conducted a survey and found that 250 out of 550 persons
had cable TV services from the operator.
On the basis of this data, can we accept the claim of the
cable TV operator at the 1% significance level?
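A sketch of the cable TV example follows. It treats the question as a two-tailed test of $H_0: p = 0.40$ against $H_a: p \ne 0.40$, which is an assumption, since the notes do not state the alternative. The standard error uses the hypothesised proportion $p_0$.

```python
import math
from statistics import NormalDist

# Survey figures and the operator's claim
successes, n, p0, alpha = 250, 550, 0.40, 0.01

p_hat = successes / n
# Standard error computed under H0, i.e. with p0 rather than p_hat
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
reject = abs(z) > z_crit

print(f"p_hat = {p_hat:.4f}, z = {z:.3f}, critical value = +/-{z_crit:.3f}")
print("reject H0" if reject else "fail to reject H0")
```

The statistic falls only slightly beyond the 1% critical value, so the rejection of the 40% claim is a close call.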


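Single Mean
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

Difference Mean
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$

Proportion
$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}$$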
Critical values of Z

Level of significance ($\alpha$)         10%      5%       1%
Critical values for two-tailed test     ±1.64    ±1.96    ±2.58
Critical values for left-tailed test    -1.28    -1.64    -2.33
Critical values for right-tailed test    1.28     1.64     2.33
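These table entries can be reproduced from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `critical_values` is an assumption. The computed values carry more decimals than the table, for example 1.645 where the table shows 1.64.

```python
from statistics import NormalDist

def critical_values(alpha):
    """Return (two-tailed, left-tailed, right-tailed) Z critical values."""
    std_normal = NormalDist()
    two_tailed = std_normal.inv_cdf(1 - alpha / 2)   # used as +/- value
    left_tailed = std_normal.inv_cdf(alpha)
    right_tailed = std_normal.inv_cdf(1 - alpha)
    return two_tailed, left_tailed, right_tailed

for alpha in (0.10, 0.05, 0.01):
    two, left, right = critical_values(alpha)
    print(f"alpha = {alpha:.2f}: two-tailed +/-{two:.3f}, "
          f"left {left:.3f}, right {right:.3f}")
```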
