
Sampling and Sampling Distributions

Statistical Inference


The purpose of statistical inference is to obtain information about a population based on a sample drawn from the population. A population is the set of all the elements of interest in a study. A sample is a subset of the population.

Statistical Inference


The sample results provide only estimates of the values of the population characteristics. It is hoped that the value of $\bar{x}$ obtained from a sample is a good approximation of $\mu$, that the numerical value of $s^2$ closely approximates $\sigma^2$, and so forth.

 

Statistical Inference


We study techniques that allow us to use statistics and probability theory to draw conclusions about populations and to assess the accuracy of these conclusions.

Statistical Inference


A parameter is a numerical characteristic of a population. With proper sampling methods, the sample results will provide good estimates of the population characteristics. However, we do not expect the sample results to exactly equal the population characteristics.

Statistical Inference


In this chapter we show how random sampling can be used to select a sample from a population, show how data from a simple random sample can be used to compute estimates of the population mean, population standard deviation, and population proportion, and introduce the concept of a sampling distribution.

Simple Random Sampling




Finite Population

A simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected. Replacing each sampled element before selecting subsequent elements is called sampling with replacement.

Simple Random Sampling

Sampling without replacement is the procedure used most often. In large sampling projects, computer-generated random numbers are often used to automate the sample selection process.
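As a rough illustration of automated sample selection, the sketch below draws a simple random sample both ways from a hypothetical finite population. The population of 500 numbered elements, the sample size of 10, and the seed are assumptions made only for this example.

```python
import random

# Hypothetical finite population: element IDs 1..500 (assumed for illustration).
population = list(range(1, 501))
n = 10

rng = random.Random(42)  # fixed seed so the illustration is repeatable

# Sampling without replacement: no element can be selected twice.
without_replacement = rng.sample(population, n)

# Sampling with replacement: every draw is made from the full population,
# so the same element may appear more than once.
with_replacement = rng.choices(population, k=n)

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
```

`random.sample` gives every possible subset of size n the same chance of selection, which is the finite-population definition used above.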

Simple Random Sampling




Infinite Population

A simple random sample from an infinite population is a sample selected such that the following conditions are satisfied. Each element selected comes from the same population. Each element is selected independently.

Simple Random Sampling


The population is usually considered infinite if it involves an ongoing process that makes listing or counting every element impossible. The random number selection procedure cannot be used for infinite populations.

Point Estimation
In point estimation we use the data from the sample to compute the value of a sample statistic, which serves as an estimate of a population parameter.

Point Estimation

We refer to $\bar{x}$ as the point estimator of the population mean $\mu$.
$s$ is the point estimator of the population standard deviation $\sigma$.
$\bar{p}$ is the point estimator of the population proportion $p$.
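The three point estimators can be computed directly from sample data. The following is a minimal sketch; the eight salary values and the yes/no flags are made-up numbers used only to show the calculation.

```python
import statistics

# Hypothetical sample of 8 elements (values invented for illustration).
salaries = [49, 53, 51, 47, 55, 50, 52, 48]          # e.g. in thousands
completed_training = [True, False, True, True,
                      False, True, True, False]      # a yes/no characteristic

x_bar = statistics.mean(salaries)     # point estimate of the population mean
s = statistics.stdev(salaries)        # point estimate of sigma (n - 1 divisor)
p_bar = sum(completed_training) / len(completed_training)  # point estimate of p

print(f"x_bar = {x_bar}, s = {s:.3f}, p_bar = {p_bar}")
```

Note that `statistics.stdev` uses the n - 1 divisor, which is the usual choice when s is used to estimate the population standard deviation.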

INTRODUCTION OF SAMPLING DISTRIBUTIONS




One of the most common statistical procedures is the use of a sample mean to make inferences about the population mean. However, on each repetition of selecting a sample and calculating the mean, we can anticipate obtaining a different value for the sample mean.

SAMPLING DISTRIBUTIONS


The probability distribution for all possible values of the sample mean is called the sampling distribution of the sample mean. Just as we can talk about the probability distribution for all possible values of the sample mean, we can also calculate the probability distribution for all possible values of the sample standard deviation and the sample proportion.

Sampling Distribution of $\bar{x}$

The sampling distribution of $\bar{x}$ is the probability distribution of all possible values of the sample mean $\bar{x}$.

Expected Value of $\bar{x}$

$$E(\bar{x}) = \mu$$
where $\mu$ = the population mean

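Sampling Distribution of $\bar{x}$

Standard Deviation of $\bar{x}$

Finite Population:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$$

Infinite Population:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$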

Sampling Distribution of $\bar{x}$

A finite population is treated as being infinite if $n/N < .05$. The term $\sqrt{(N - n)/(N - 1)}$ is the finite population correction factor, and $\sigma_{\bar{x}}$ is referred to as the standard error of the mean.
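The rule for when to apply the correction factor can be sketched as a small helper. The function name `standard_error` and the example numbers (σ = 10, n = 50, N = 500 or 5000) are assumptions for illustration, not values from the slides.

```python
import math

def standard_error(sigma, n, N=None):
    """Standard error of the mean.

    N=None means an infinite population. For a finite population the
    correction factor is applied only when n/N is at least .05.
    """
    se = sigma / math.sqrt(n)
    if N is not None and n / N >= 0.05:
        se *= math.sqrt((N - n) / (N - 1))
    return se

print(standard_error(10, 50))           # infinite population
print(standard_error(10, 50, N=500))    # n/N = .10, correction applied
print(standard_error(10, 50, N=5000))   # n/N = .01, treated as infinite
```

The correction factor is always below 1, so it can only shrink the standard error; when n/N is small it is so close to 1 that it is ignored.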

Sampling Distribution of $\bar{x}$

If we use a large ($n \geq 30$) simple random sample, the central limit theorem enables us to conclude that the sampling distribution of $\bar{x}$ can be approximated by a normal probability distribution. When the simple random sample is small ($n < 30$), the sampling distribution of $\bar{x}$ can be considered normal only if we assume the population has a normal probability distribution.

Properties of Point Estimators




Unbiasedness
If the expected value of the sample statistic is equal to the population parameter being estimated, the sample statistic is said to be an unbiased estimator of the population parameter.

Efficiency
The point estimator with the smallest standard deviation is said to have the greatest relative efficiency.

Consistency
A point estimator is consistent if the values of the point estimator tend to become closer to the population parameter as the sample size increases.

Developing Sampling Distributions


Suppose there's a population of four individuals: A, B, C, and D.

Random variable, X, is the age of individuals, measured in years.
Values of X: 18, 20, 22, 24
Everyone in this population is one of these 4 ages.

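Population Characteristics

Summary Measures

$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$$

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} = 2.236$$

Population Distribution

[Figure: bar chart of P(X) for A (18), B (20), C (22), and D (24), each with probability .25]

Uniform Distribution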

All Possible Samples of Size n = 2

16 Samples (Samples Taken with Replacement)

1st Obs   2nd Observation
          18      20      22      24
18        18,18   18,20   18,22   18,24
20        20,18   20,20   20,22   20,24
22        22,18   22,20   22,22   22,24
24        24,18   24,20   24,22   24,24

16 Sample Means

1st Obs   2nd Observation
          18   20   22   24
18        18   19   20   21
20        19   20   21   22
22        20   21   22   23
24        21   22   23   24
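The enumeration of the 16 samples and their means can be reproduced in a few lines. This is only a sketch of the counting argument; the variable names are chosen for this example.

```python
from collections import Counter
from itertools import product

ages = [18, 20, 22, 24]  # population values from the example

# All ordered samples of size 2, drawn with replacement.
samples = list(product(ages, repeat=2))
sample_means = [sum(sample) / len(sample) for sample in samples]

# How often each distinct sample mean occurs among the samples.
counts = Counter(sample_means)
for mean in sorted(counts):
    share = counts[mean] / len(samples)
    print(f"{mean:4.0f}  {counts[mean]}/{len(samples)} = {share:.4f}")
```

Dividing each count by the number of samples gives the probability attached to that value of the sample mean.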

Sampling Distribution of All Sample Means

# in sample = 2, # in Sampling Distribution = 16

Sample Means Distribution (relative frequencies of the 16 sample means)

Sample Mean   18      19      20      21      22      23      24
P             .0625   .1250   .1875   .2500   .1875   .1250   .0625

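Summary Measures for the Sampling Distribution

Here the sums run over the N = 16 sample means $\bar{X}_i$.

$$\mu_{\bar{x}} = \frac{\sum_{i=1}^{N} \bar{X}_i}{N} = \frac{18 + 19 + 19 + \cdots + 24}{16} = 21$$

$$\sigma_{\bar{x}} = \sqrt{\frac{\sum_{i=1}^{N} (\bar{X}_i - \mu_{\bar{x}})^2}{N}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$$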

Comparing the Population with its Sampling Distribution

Population: $\mu = 21$, $\sigma = 2.236$
[Figure: bar chart of P(X) for A (18), B (20), C (22), and D (24)]

Sample Means Distribution, n = 2: $\mu_{\bar{x}} = 21$, $\sigma_{\bar{x}} = 1.58$
[Figure: bar chart of P over the sample means 18, 19, 20, 21, 22, 23, 24]

Results for other sample sizes

Sample Size   Mean   Variance   St. Dev
1             21     5          2.236
2             21     2.5        1.581139
3             21     1.666667   1.290994
4             21     1.25       1.118034
5
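The rows of this table can be checked by brute-force enumeration of every ordered sample drawn with replacement. The helper name `sampling_distribution_summary` is an assumption made for this sketch.

```python
import math
from itertools import product

ages = [18, 20, 22, 24]  # population values from the example

def sampling_distribution_summary(values, n):
    """Mean, variance, and standard deviation of the sample mean over
    all ordered samples of size n drawn with replacement."""
    means = [sum(sample) / n for sample in product(values, repeat=n)]
    mu = sum(means) / len(means)
    variance = sum((m - mu) ** 2 for m in means) / len(means)
    return mu, variance, math.sqrt(variance)

for n in range(1, 5):
    mu, variance, std_dev = sampling_distribution_summary(ages, n)
    print(f"{n}  {mu:.0f}  {variance:.6f}  {std_dev:.6f}")
```

In each row the variance equals the population variance of 5 divided by n, which is the square of $\sigma / \sqrt{n}$.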

Properties of Sample Mean as Estimator of Population Mean

• UNBIASED: the expected value of the sample mean is the population mean, $E(\bar{X}) = \mu$.
• EFFICIENT: among UNBIASED estimators, the mean has the SMALLEST variance.
• CONSISTENT: the standard error is $\sigma_{\bar{x}} = \sigma / \sqrt{n}$, so as n increases, $\sigma_{\bar{x}}$ decreases.

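When the Population is Normal, the Sampling Distribution is Also Normal

Population Distribution: $\mu = 50$, $\sigma = 10$

Central Tendency: $\mu_{\bar{x}} = \mu$
Variation: $\sigma_{\bar{x}} = \sigma / \sqrt{n}$

Sampling Distributions ($\mu_{\bar{x}} = 50$)
n = 4: $\sigma_{\bar{x}} = 5$
n = 16: $\sigma_{\bar{x}} = 2.5$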

Central Limit Theorem


As the sample size gets large enough, the sampling distribution becomes almost normal, regardless of the shape of the population.

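When The Population is Not Normal

Population Distribution: $\mu = 50$, $\sigma = 10$

Central Tendency: $\mu_{\bar{x}} = \mu$
Variation: $\sigma_{\bar{x}} = \sigma / \sqrt{n}$

Sampling Distributions ($\mu_{\bar{x}} = 50$)
n = 4: $\sigma_{\bar{x}} = 5$
n = 30: $\sigma_{\bar{x}} = 1.8$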
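A small simulation can make these figures concrete. The slides do not say which non-normal population is pictured, so the sketch below assumes a right-skewed one: 40 plus an exponential variable with mean 10, which has mean 50 and standard deviation 10. The seed and the number of repetitions are also arbitrary choices.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

def draw():
    # Assumed population: 40 + Exponential(mean 10).
    # Right-skewed, with mean 50 and standard deviation 10.
    return 40 + rng.expovariate(1 / 10)

def simulate_sample_means(n, repetitions=10000):
    """Draw many samples of size n and return their sample means."""
    return [statistics.fmean(draw() for _ in range(n))
            for _ in range(repetitions)]

for n in (4, 30):
    means = simulate_sample_means(n)
    print(f"n = {n}: mean of sample means = {statistics.fmean(means):.2f}, "
          f"std. dev. of sample means = {statistics.pstdev(means):.2f}")
```

The simulated standard deviations should come out near $\sigma / \sqrt{n}$ for each sample size, and a histogram of the n = 30 means would look much closer to a bell curve than the population does.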

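Example: Sampling Distribution

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{7.8 - 8}{2 / \sqrt{25}} = -.50$$

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{8.2 - 8}{2 / \sqrt{25}} = .50$$

Sampling Distribution: centered at 8 with $\sigma_{\bar{X}} = .4$; the interval from 7.8 to 8.2 is shaded.
Standardized Normal Distribution: $\mu = 0$, $\sigma = 1$

The area between $Z = -.50$ and $Z = .50$ is $.1915 + .1915 = .3830$.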
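The same probability can be checked numerically with the standard library's normal distribution. The values μ = 8, σ = 2, and n = 25 are read off the Z formulas in this example; the variable names are chosen for the sketch.

```python
from statistics import NormalDist

mu, sigma, n = 8, 2, 25
standard_error = sigma / n ** 0.5          # 2 / sqrt(25)

# Standardize the two limits of the interval.
z_low = (7.8 - mu) / standard_error
z_high = (8.2 - mu) / standard_error

# Probability that the sample mean falls between 7.8 and 8.2.
sampling_dist = NormalDist(mu, standard_error)
probability = sampling_dist.cdf(8.2) - sampling_dist.cdf(7.8)

print(f"z from {z_low:.2f} to {z_high:.2f}, probability = {probability:.4f}")
```

The result may differ from the table-based figure in the last decimal place, because printed normal tables round each tail area separately.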

Sampling Distribution of the Sample Proportion

Figure 8.12: Using sample proportion to make an inference about the population proportion

Example 8.2: In a population of razor blades, 15% are defective. What is the probability of randomly selecting 90 razor blades and finding 10 or fewer defective?

Figure 8.13: The probability of randomly selecting 90 razor blades and finding 10 or fewer defective
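The worked solution for Example 8.2 sits in Figure 8.13, which is not reproduced here. One plausible way to approach it, assuming the normal approximation to the sampling distribution of the sample proportion and no continuity correction, is sketched below.

```python
from statistics import NormalDist

p = 0.15            # population proportion of defective blades
n = 90              # sample size
p_hat = 10 / n      # sample proportion for 10 defective blades

# Standard error of the sample proportion under the normal approximation.
standard_error = (p * (1 - p) / n) ** 0.5

z = (p_hat - p) / standard_error
probability = NormalDist().cdf(z)   # P(sample proportion <= 10/90)

print(f"z = {z:.3f}, probability = {probability:.4f}")
```

An exact binomial calculation, or a version with a continuity correction, would give a somewhat different figure, so the output should be read as an approximation.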

Statistical Inference: Estimation for Single Populations

Statistical Inference


Statistical inference is the branch of statistics which deals with uncertainty in decision making and provides a basis for making scientific decisions.

Types of Estimates

• We can make two types of estimates about the population. They are referred to as point estimates and interval estimates.

• A point estimate is the sample statistic that is used to estimate the population parameter.

• An interval estimate is the range of values within which a researcher can say with some confidence that the population parameter falls. This range is called a confidence interval.

Using the Z Statistic for Estimating Population Mean

• The z statistic can be used for estimating the population parameter on the basis of the sample statistic.

Confidence interval for estimating population mean $\mu$

The confidence interval with the associated probability can be calculated as below:

$$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \leq \mu \leq \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

Figure 9.1: z scores for confidence interval in relation to alpha

In estimation, any confidence level can be applied; however, the most widely used levels are 90%, 95%, and 99%.

Figure 9.2: Distribution of sample means for 99% confidence interval

Example 9.1: A researcher has taken a random sample of size 70 from a population with a standard deviation of 4.62 and obtained a sample mean of 35. Construct a 90% confidence interval to estimate the population mean.

This result implies that the researcher is 90% confident that the population mean will lie between 34.091 and 35.909. The point estimate is 35.
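The calculation behind Example 9.1 can be sketched as follows. The helper name `z_confidence_interval` is an assumption for this example, and the z value is computed from the standard normal distribution instead of being read from a table.

```python
from statistics import NormalDist

def z_confidence_interval(sample_mean, std_dev, n, confidence):
    """Two-sided z interval for a population mean."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z value for alpha/2 in each tail
    margin = z * std_dev / n ** 0.5
    return sample_mean - margin, sample_mean + margin

lower, upper = z_confidence_interval(sample_mean=35, std_dev=4.62,
                                     n=70, confidence=0.90)
print(f"{lower:.3f} to {upper:.3f}")
```

The limits should land very close to those quoted above; a difference in the third decimal place can arise from rounding the z value taken from a table.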

Confidence interval for estimating population mean $\mu$, when $\sigma$ is unknown and sample size is large ($n \geq 30$)

Example 9.3: In order to estimate the customer loyalty for a particular product, a researcher poses the following question to a sample of 100 customers: "How many years have you been continuously using this product?" This sample yielded a mean period of 8 years with a sample standard deviation of 2 years. Construct a 95% confidence interval for estimating the population mean.

Example 9.3 (Solution)

This result implies that the researcher is 95% confident that the population mean (average years after purchase in the population) will lie between 7.608 years and 8.392 years.
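The arithmetic for Example 9.3 can be sketched as below, assuming the large-sample interval is formed by using the sample standard deviation s in place of σ.

```python
from statistics import NormalDist

n = 100
sample_mean = 8.0    # years
sample_sd = 2.0      # years, used in place of the unknown sigma
confidence = 0.95

z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z * sample_sd / n ** 0.5

lower = sample_mean - margin
upper = sample_mean + margin
print(f"{lower:.3f} years to {upper:.3f} years")
```

With n = 100 the standard error is 2 / 10 = 0.2, so the margin is just the z value multiplied by 0.2.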

Estimating Population Mean Using the t Statistic (Small Sample Case)
In the case of small sample size (n < 30), the z formula discussed earlier is not applicable. The problem can be solved by using the t statistic.

Figure 9.10 : Comparison of standard normal curve with two t distributions having degrees of freedom 10 and 20, respectively

The t Distribution

• The t distribution, developed by William Gosset, is a family of similar probability distributions, with a specific t distribution depending on a parameter known as the degrees of freedom.

• As sample size n increases, t distribution values tend to approach the standard normal curve.

• The difference between tabular values of t and z becomes negligible as sample size increases. This is a reason why many researchers use the z distribution for large samples.

The t Distribution (Contd.)

Figure 9.11 : Comparison of standard normal curve with two t curves with sample size n = 10 and n = 22

Degrees of Freedom

The number of degrees of freedom indicates the number of values that are free to vary in a random sample. The degrees of freedom can be understood as the number of independent observations for a source of variation minus the number of independent parameters estimated in computing the variation.

Confidence interval to estimate population mean $\mu$, when population standard deviation is unknown and the population is normally distributed:

$$\bar{x} \pm t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}}$$

Example 9.4: The personnel department of an organization wants to apply cost-cutting measures for improving efficiency. As the first step, the personnel department wants to curtail telephone expenses incurred by employees. For this, the personnel department has taken a random sample of 10 employees and gathered the following data about telephone expenses (in thousand rupees) in the previous year: 10, 12, 24, 23, 11, 14, 15, 34, 16, 23. Construct a 95% confidence interval to estimate the average telephone expenses of the employees in the population.
Solution:
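The worked solution is not reproduced in this text, so the following is only a sketch of how the interval could be computed. It assumes the usual t interval with n - 1 = 9 degrees of freedom and hard-codes the critical value 2.262 from a standard t table.

```python
import statistics

# Telephone expenses in thousand rupees, from Example 9.4.
expenses = [10, 12, 24, 23, 11, 14, 15, 34, 16, 23]

n = len(expenses)
sample_mean = statistics.mean(expenses)
sample_sd = statistics.stdev(expenses)     # n - 1 divisor

# Critical value t(0.025, 9 degrees of freedom) from a standard t table
# (hard-coded assumption; the standard library has no t distribution).
t_critical = 2.262

margin = t_critical * sample_sd / n ** 0.5
lower = sample_mean - margin
upper = sample_mean + margin

print(f"mean = {sample_mean:.1f}, s = {sample_sd:.3f}")
print(f"95% interval: {lower:.3f} to {upper:.3f} thousand rupees")
```

Because the sample is small, the t critical value is noticeably larger than the 1.96 used for a z interval, which widens the interval.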

Confidence Interval Estimation for Population Proportion


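Confidence interval to estimate the population proportion p, where $\bar{p}$ is the sample proportion:

$$\bar{p} \pm z_{\alpha/2} \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$$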

Example 9.5: A research company conducted a survey on 300 randomly selected tax payers. It found that out of 300 tax payers, 180 tax payers have filled the SARAL form correctly. Construct a 95% confidence interval to estimate the percentage of tax payers who have filled the form correctly in the population.
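The solution to Example 9.5 is not shown in this text. A minimal sketch of the calculation, assuming the usual normal-approximation interval for a proportion, is given below.

```python
from statistics import NormalDist

n = 300                 # tax payers surveyed
filled_correctly = 180  # tax payers who filled the form correctly
confidence = 0.95

p_bar = filled_correctly / n                       # sample proportion
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z * (p_bar * (1 - p_bar) / n) ** 0.5

lower = p_bar - margin
upper = p_bar + margin
print(f"sample proportion = {p_bar:.0%}")
print(f"95% interval: {lower:.1%} to {upper:.1%}")
```

The interval is expressed in percentage terms because the example asks for the percentage of tax payers in the population.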

End
