D Sampling

Sampling and Sampling Distributions
Statistical Inference

The purpose of statistical inference is to obtain information about a population based on a sample drawn from the population. A population is the set of all the elements of interest in a study. A sample is a subset of the population.

The sample results provide only estimates of the values of the population characteristics. It is hoped that the value of $\bar{x}$ obtained from a sample is a good approximation of $\mu$, that the numerical value of $s^2$ closely approximates $\sigma^2$, and so forth.


We study techniques that allow us to use statistics and probability theory to draw conclusions about populations and to assess the accuracy of these conclusions.

A parameter is a numerical characteristic of a population. With proper sampling methods, the sample results will provide good estimates of the population characteristics. However, we do not expect the sample results to exactly equal the population characteristics.

In this chapter we show how random sampling can be used to select a sample from a population, show how data from a simple random sample can be used to compute estimates of the population mean, population standard deviation, and population proportion, and introduce the concept of a sampling distribution.
Simple Random Sampling

Finite Population
A simple random sample from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected. Replacing each sampled element before selecting subsequent elements is called sampling with replacement.
Sampling without replacement is the procedure used most often. In large sampling projects, computer-generated random numbers are often used to automate the sample selection process.
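The selection procedure described above can be sketched with Python's standard `random` module. The population of 500 numbered elements, the sample size of 10, and the fixed seed are all assumptions made only for illustration.

```python
import random

# Hypothetical finite population: elements numbered 1..500 (an assumption).
population = list(range(1, 501))
n = 10

# Fixed seed only so that the sketch gives repeatable output.
rng = random.Random(42)

# Sampling without replacement: an element cannot be selected twice.
sample_without = rng.sample(population, n)

# Sampling with replacement: every draw is made from the full population,
# so the same element may appear more than once.
sample_with = rng.choices(population, k=n)

print("without replacement:", sample_without)
print("with replacement:   ", sample_with)
```

With `rng.sample`, every possible subset of size n is equally likely, which matches the definition of a simple random sample from a finite population.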

Infinite Population
A simple random sample from an infinite population is a sample selected such that the following conditions are satisfied. Each element selected comes from the same population. Each element is selected independently.

The population is usually considered infinite if it involves an ongoing process that makes listing or counting every element impossible. The random number selection procedure cannot be used for infinite populations.
Point Estimation
In point estimation we use the data from the sample to compute a value of a sample statistic that serves as an estimate of a population parameter. The computed value is referred to as a point estimate.
We refer to $\bar{x}$ as the point estimator of the population mean $\mu$. $s$ is the point estimator of the population standard deviation $\sigma$. $\bar{p}$ is the point estimator of the population proportion $p$.
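As a minimal sketch of the three point estimators, the following computes $\bar{x}$, $s$, and $\bar{p}$ from a small sample. The eight observations and the yes/no attribute attached to each are invented for illustration and do not come from the slides.

```python
import statistics

# Hypothetical sample of n = 8 elements: (measured value, has the attribute?).
sample = [
    (2, True), (4, False), (4, True), (4, True),
    (5, False), (5, True), (7, True), (9, False),
]

values = [value for value, _ in sample]

x_bar = statistics.mean(values)    # point estimate of the population mean
s = statistics.stdev(values)       # sample standard deviation (divides by n - 1)
p_bar = sum(flag for _, flag in sample) / len(sample)  # sample proportion

print(x_bar, round(s, 3), p_bar)
```

Note that `statistics.stdev` divides by n - 1, which is the usual definition of the sample standard deviation $s$; `statistics.pstdev` would divide by n instead.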
INTRODUCTION OF SAMPLING DISTRIBUTIONS

One of the most common statistical procedures is the use of a sample mean to make inferences about the population mean. However, on each repetition of selecting a sample and calculating the mean, we can anticipate obtaining a different value for the sample mean.
SAMPLING DISTRIBUTIONS

The probability distribution for all possible values of the sample mean is called the sampling distribution of the sample mean. Just as we can talk about the probability distribution for all possible values of the sample mean, we can also calculate the probability distribution for all possible values of the sample standard deviation and sample proportion.
Sampling Distribution of $\bar{x}$
The sampling distribution of $\bar{x}$ is the probability distribution of all possible values of the sample mean $\bar{x}$.

Expected Value of $\bar{x}$
$E(\bar{x}) = \mu$
where $\mu$ = the population mean

Standard Deviation of $\bar{x}$
Finite Population: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$
Infinite Population: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
A finite population is treated as being infinite if n/N < .05. $\sqrt{(N - n)/(N - 1)}$ is the finite correction factor. $\sigma_{\bar{x}}$ is referred to as the standard error of the mean.
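The two standard error formulas can be written as one small function. This is a sketch; the function name `standard_error` and the example values (sigma = 10, n = 25, N = 100) are assumptions chosen for illustration.

```python
import math

def standard_error(sigma, n, N=None):
    """Standard error of the sample mean.

    sigma: population standard deviation
    n:     sample size
    N:     population size; None means the population is treated as infinite
    """
    se = sigma / math.sqrt(n)
    if N is not None:
        # Finite correction factor for sampling from a population of size N.
        se *= math.sqrt((N - n) / (N - 1))
    return se

print(standard_error(10, 25))                    # infinite population
print(round(standard_error(10, 25, N=100), 4))   # finite population, n/N = .25
print(round(standard_error(10, 25, N=10000), 4)) # n/N well under .05
```

When n/N is small the correction factor is close to 1, which is why the finite population can then be treated as infinite.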
Sampling Distribution of $\bar{x}$

If we use a large (n > 30) simple random sample, the central limit theorem enables us to conclude that the sampling distribution of $\bar{x}$ can be approximated by a normal probability distribution. When the simple random sample is small (n < 30), the sampling distribution of $\bar{x}$ can be considered normal only if we assume the population has a normal probability distribution.
Properties of Point Estimators

Unbiasedness
If the expected value of the sample statistic is equal to the population parameter being estimated, the sample statistic is said to be an unbiased estimator of the population parameter.
Efficiency
The point estimator with the smallest standard deviation is said to have the greatest relative efficiency.
Consistency
A point estimator is consistent if the values of the point estimator tend to become closer to the population parameter as the sample size increases.
Developing Sampling Distributions

Suppose there's a population of size N = 4.

Random variable, X, is the age of individuals, measured in years.
Values of X: 18, 20, 22, 24 (individuals A, B, C, D). Everyone in this population is one of these 4 ages.
Population Characteristics
Summary Measures
$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$
$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} = 2.236$
Population Distribution
[Figure: bar chart of P(X) against X for A (18), B (20), C (22), D (24); Uniform Distribution]
All Possible Samples of Size n = 2

16 Samples (Samples Taken with Replacement)
1st Obs    2nd Observation
           18       20       22       24
18         18,18    18,20    18,22    18,24
20         20,18    20,20    20,22    20,24
22         22,18    22,20    22,22    22,24
24         24,18    24,20    24,22    24,24

16 Sample Means
1st Obs    2nd Observation
           18       20       22       24
18         18       19       20       21
20         19       20       21       22
22         20       21       22       23
24         21       22       23       24
Sampling Distribution of All Sample Means

The 16 sample means form the Sample Means Distribution.
# in sample = 2, # in Sampling Distribution = 16
[Figure: Sample Means Distribution, bar chart of $P(\bar{X})$ against $\bar{X}$ = 18, 19, 20, 21, 22, 23, 24]
Summary Measures for the Sampling Distribution

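$\mu_{\bar{x}} = \frac{\sum_{i=1}^{N} \bar{X}_i}{N} = \frac{18 + 19 + 19 + \cdots + 24}{16} = 21$
$\sigma_{\bar{x}} = \sqrt{\frac{\sum_{i=1}^{N} (\bar{X}_i - \mu_{\bar{x}})^2}{N}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$
Here N = 16 is the number of sample means.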
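The summary measures above can be checked by enumerating all 16 samples directly. The following sketch uses only the four ages from the example; the variable names are illustrative.

```python
import math
from collections import Counter
from itertools import product

ages = [18, 20, 22, 24]   # the population from the example
n = 2                     # sample size

# All ordered samples of size 2 taken with replacement: 4 * 4 = 16 samples.
samples = list(product(ages, repeat=n))
means = [sum(sample) / n for sample in samples]

# Mean and standard deviation of the 16 sample means (divide by 16, not 15).
mu_xbar = sum(means) / len(means)
sigma_xbar = math.sqrt(sum((m - mu_xbar) ** 2 for m in means) / len(means))

# Sampling distribution: probability of each distinct sample mean.
distribution = {m: count / len(means) for m, count in sorted(Counter(means).items())}

print(len(samples), mu_xbar, round(sigma_xbar, 2))
print(distribution)
```

The enumeration reproduces the mean of 21 and the standard deviation of 1.58, and the resulting distribution peaks at 21 rather than being uniform like the population.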
Comparing the Population with its Sampling Distribution

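Population: $\mu = 21$, $\sigma = 2.236$
[Figure: population distribution, P(X) against X for A (18), B (20), C (22), D (24)]

Sample Means Distribution (n = 2): $\mu_{\bar{x}} = 21$, $\sigma_{\bar{x}} = 1.58$
[Figure: sample means distribution, $P(\bar{X})$ against $\bar{X}$ = 18, 19, 20, 21, 22, 23, 24]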
Results for other sample sizes
Sample Size    Mean    Variance    St. Dev
1              21      5           2.236
2              21      2.5         1.581139
3              21      1.666667    1.290994
4              21      1.25        1.118034
5
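The rows for sample sizes 1 to 4 can be reproduced by repeating the enumeration for each n. This is a sketch under the same assumption as the example (samples taken with replacement from the four ages); the helper name `sampling_distribution_summary` is made up for illustration.

```python
import statistics
from itertools import product

ages = [18, 20, 22, 24]

def sampling_distribution_summary(n):
    """Mean, variance and standard deviation of all 4**n sample means."""
    means = [sum(sample) / n for sample in product(ages, repeat=n)]
    # pvariance/pstdev divide by the number of sample means, as in the slides.
    return (statistics.fmean(means),
            statistics.pvariance(means),
            statistics.pstdev(means))

for n in range(1, 5):
    mean, variance, st_dev = sampling_distribution_summary(n)
    print(n, round(mean, 6), round(variance, 6), round(st_dev, 6))
```

The mean stays at 21 for every sample size, while the variance equals the population variance of 5 divided by n.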
Properties of Sample Mean as Estimator of Population Mean

Expected value of sample mean is population mean: $E(\bar{X}) = \mu$ (UNBIASED)
Among UNBIASED estimators, the mean has the SMALLEST variance (EFFICIENT)
The standard deviation of the sample mean is the standard error: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
As n increases, $\sigma_{\bar{x}}$ decreases (CONSISTENT)
When the Population is Normal Sampling Distribution is Also Normal

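Population Distribution: $\mu = 50$, $\sigma = 10$
Central Tendency: $\mu_{\bar{x}} = \mu$
Variation: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
Sampling Distributions:
n = 4: $\mu_{\bar{X}} = 50$, $\sigma_{\bar{X}} = 5$
n = 16: $\mu_{\bar{X}} = 50$, $\sigma_{\bar{X}} = 2.5$
[Figure: normal population distribution and the two narrower normal sampling distributions, all centered at 50]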
Central Limit Theorem

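As the sample size gets large enough, the sampling distribution becomes almost normal regardless of the shape of the population.
[Figure: sampling distributions of $\bar{X}$ approaching the normal shape as the sample size grows]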
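The central limit theorem can be illustrated with a small simulation. This is a sketch, not part of the original slides: the exponential population (mean 1, standard deviation 1, strongly right-skewed), the sample size of 30, the 5000 repetitions, and the seed are all assumptions.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def sample_mean(n):
    """Mean of n independent draws from an exponential population (mean 1)."""
    return statistics.fmean([rng.expovariate(1.0) for _ in range(n)])

n = 30
means = [sample_mean(n) for _ in range(5000)]

center = statistics.fmean(means)   # should be close to the population mean, 1
spread = statistics.stdev(means)   # should be close to sigma / sqrt(n)

# Share of sample means within one standard error of the center;
# for a normal distribution this share is about 0.68.
within_one = sum(abs(m - center) <= spread for m in means) / len(means)

print(round(center, 3), round(spread, 3), round(1 / math.sqrt(n), 3))
print(round(within_one, 3))
```

Even though the population is far from normal, the simulated sample means cluster around the population mean with a spread near $\sigma/\sqrt{n}$.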
When The Population is Not Normal

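Population Distribution: $\mu = 50$, $\sigma = 10$
Central Tendency: $\mu_{\bar{x}} = \mu$
Variation: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
Sampling Distributions:
n = 4: $\mu_{\bar{X}} = 50$, $\sigma_{\bar{X}} = 5$
n = 30: $\mu_{\bar{X}} = 50$, $\sigma_{\bar{X}} = 1.8$
[Figure: non-normal population distribution and the two sampling distributions, both centered at 50]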
Example: Sampling Distribution

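With $\mu = 8$, $\sigma = 2$ and n = 25, the standard error is $\sigma_{\bar{X}} = 2/\sqrt{25} = .4$.
$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{7.8 - 8}{2/\sqrt{25}} = -.50$
$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{8.2 - 8}{2/\sqrt{25}} = .50$
The area between Z = -.50 and Z = .50 is .1915 + .1915 = .3830.
[Figure: Sampling Distribution of $\bar{X}$ with $\sigma_{\bar{X}} = .4$ over 7.8, 8, 8.2, and Standardized Normal Distribution with $\mu = 0$, $\sigma = 1$; shaded area .3830]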
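The example can be checked numerically with `statistics.NormalDist` from the Python standard library. The values 8, 2 and 25 are read from the formulas of the example; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 8, 2, 25
standard_error = sigma / math.sqrt(n)   # 2 / 5 = 0.4

# Standardize both limits of the interval for the sample mean.
z_low = (7.8 - mu) / standard_error
z_high = (8.2 - mu) / standard_error

# Area under the standard normal curve between the two z values.
std_normal = NormalDist()
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(round(z_low, 2), round(z_high, 2), round(prob, 4))
```

The computed area differs from the table-based .3830 only in the fourth decimal place, because the table value .1915 is itself rounded.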
Sampling Distribution of Sample Proportion
Figure 8.12: Using sample proportion to make an inference about the population proportion
Example 8.2: In a population of razor blades, 15% are defective. What is the probability of randomly selecting 90 razor blades and finding 10 or fewer defective?
Figure 8.13: The probability of randomly selecting 90 razor blades and finding 10 or fewer defective
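The worked solution for Example 8.2 is contained in Figure 8.13, which is not reproduced here. One common approach, assumed in this sketch, is the normal approximation to the sampling distribution of the sample proportion, without a continuity correction; the source may have used a slightly different method or rounding.

```python
import math
from statistics import NormalDist

p = 0.15     # population proportion of defective blades
n = 90       # sample size
p_hat = 10 / n   # sample proportion corresponding to 10 defective blades

# Standard error of the sample proportion.
standard_error = math.sqrt(p * (1 - p) / n)

# z score of the observed sample proportion, then the lower-tail probability.
z = (p_hat - p) / standard_error
prob = NormalDist().cdf(z)

print(round(p_hat, 4), round(standard_error, 4), round(z, 2), round(prob, 4))
```

Under this approximation the probability of finding 10 or fewer defective blades comes out at roughly 15%.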
Statistical Inference: Estimation for Single Populations

Statistical inference is the branch of statistics which deals with uncertainty in decision making and provides a basis for making scientific decisions.
Types of Estimates
We can make two types of estimates about the population. They are referred to as point estimates and interval estimates.
A point estimate is the sample statistic that is used to estimate the population parameter.
An interval estimate is the range of values within which a researcher or an employee can say with some confidence that the population parameter falls. This range is called the confidence interval.
Using the Z Statistic for Estimating Population Mean
The z statistic can be used for estimating the population parameter on the basis of the sample statistic.
Confidence interval for estimating population mean
The confidence interval with the associated probability can be calculated as below:
$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
Figure 9.1: z scores for confidence interval in relation to alpha
In estimation, any confidence level can be applied; however, the most widely used levels are 90 %, 95%, and 99%.
Figure 9.2: Distribution of sample means for 99% confidence interval
Example 9.1: A researcher has taken a random sample of size 70 from a population with a sample mean of 35 and a population standard deviation of 4.62. Construct a 90% confidence interval to estimate the population mean.
This result implies that the researcher is 90% confident that the population mean will lie between 34.091 and 35.909. The point estimate is 35.
Confidence interval for estimating population mean $\mu$, when $\sigma$ is unknown and sample size is large (n ≥ 30)
Example 9.3: In order to estimate the customer loyalty for a particular product, a researcher poses the following question to a sample of 100 customers: How many years have you been continuously using this product? This sample yielded a mean period of 8 years with a sample standard deviation of 2 years. Construct a 95% confidence interval for estimating the population mean.
Example 9.3 (Solution)
This result implies that the researcher is 95% confident that the population mean (average years after purchase in the population) will lie between 7.608 years and 8.392 years.
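The intervals reported for Examples 9.1 and 9.3 can be reproduced with a short function. This is a sketch; the name `z_interval` is made up for illustration, and for Example 9.3 the sample standard deviation is used in place of the unknown population value, as the large-sample case allows.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, std_dev, n, confidence):
    """Confidence interval for the population mean based on the z statistic."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # e.g. about 1.645 for 90%
    margin = z * std_dev / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Example 9.1: n = 70, sample mean 35, population standard deviation 4.62, 90%.
low_91, high_91 = z_interval(35, 4.62, 70, 0.90)
print(round(low_91, 3), round(high_91, 3))

# Example 9.3: n = 100, sample mean 8, sample standard deviation 2, 95%.
low_93, high_93 = z_interval(8, 2, 100, 0.95)
print(round(low_93, 3), round(high_93, 3))
```

Small differences in the third decimal place against the slide values can arise because tables round the z value (1.645, 1.96).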
Estimating Population Mean Using the t Statistic (Small Sample Case)
In the case of small sample size (n < 30), the z formula discussed earlier is not applicable. The problem can be solved by using the t statistic.
Figure 9.10 : Comparison of standard normal curve with two t distributions having degrees of freedom 10 and 20, respectively
The t Distribution
The t distribution, developed by William Gosset, is a family of similar probability distributions, with a specific t distribution depending on a parameter known as the degrees of freedom.
As sample size n increases, t distribution values tend to approach the standard normal curve.
The difference between tabular values of t and z becomes negligible as sample size increases. This is a reason why many researchers use the z distribution for large samples.
The t Distribution (Contd.)
Figure 9.11 : Comparison of standard normal curve with two t curves with sample size n = 10 and n = 22
Degrees of Freedom
The number of degrees of freedom indicates the number of values that are free to vary in a random sample. The degrees of freedom can be understood as the number of independent observations for a source of variation minus the number of independent parameters estimated in computing the variation.
Confidence interval to estimate population parameter $\mu$, when population standard deviation is unknown and the population is normally distributed.
Example 9.4: The personnel department of an organization wants to apply cost-cutting measures for improving efficiency. As the first step, the personnel department wants to curtail telephone expenses incurred by employees. For this, the personnel department has taken a random sample of 10 employees and gathered the following data about telephone expenses (in thousand rupees) in the previous year: 10, 12, 24, 23, 11, 14, 15, 34, 16, 23. Construct a 95% confidence interval to estimate the average telephone expenses of the employees in the population.
Solution:
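The worked solution is not reproduced in this text, so the following is only a sketch of how the interval could be computed. The critical value 2.262 is the standard t table entry for 9 degrees of freedom and an upper-tail area of .025; it is hard-coded here because the Python standard library has no t distribution.

```python
import math
import statistics

# Telephone expenses (in thousand rupees) from Example 9.4.
expenses = [10, 12, 24, 23, 11, 14, 15, 34, 16, 23]

n = len(expenses)
x_bar = statistics.mean(expenses)   # sample mean
s = statistics.stdev(expenses)      # sample standard deviation (n - 1 divisor)

# t table value for n - 1 = 9 degrees of freedom, 95% confidence (assumption:
# taken from a standard t table, not computed).
t_critical = 2.262

margin = t_critical * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(round(x_bar, 2), round(s, 3), round(lower, 3), round(upper, 3))
```

Under these assumptions the 95% interval for the average telephone expenses runs from roughly 12.8 to 23.6 thousand rupees.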
Confidence Interval Estimation for Population Proportion

Confidence interval to estimate the population proportion p
Example 9.5: A research company conducted a survey on 300 randomly selected tax payers. It found that out of 300 tax payers, 180 tax payers have filled the SARAL form correctly. Construct a 95% confidence interval to estimate the percentage of tax payers who have filled the form correctly in the population.
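The solution to Example 9.5 is not reproduced in this text. A minimal sketch, assuming the usual large-sample interval $\bar{p} \pm z_{\alpha/2} \sqrt{\bar{p}(1 - \bar{p})/n}$, is given below; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 300          # taxpayers surveyed
correct = 180    # taxpayers who filled in the form correctly

p_bar = correct / n   # sample proportion, the point estimate of p

# Standard error of the sample proportion and the 95% z value (about 1.96).
standard_error = math.sqrt(p_bar * (1 - p_bar) / n)
z = NormalDist().inv_cdf(0.975)

margin = z * standard_error
lower, upper = p_bar - margin, p_bar + margin

print(p_bar, round(lower, 4), round(upper, 4))
```

Expressed as percentages, the interval under this assumption runs from about 54.5% to 65.5%.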
End
