
Estimation In Statistics

STEI ITB Bayu Rima Aditya

Introduction
In statistics, estimation refers to the process by which one makes inferences about a population, based on information obtained from a sample. Statisticians use sample statistics to estimate population parameters. For example: a. Sample means are used to estimate population means; b. Sample proportions, to estimate population proportions.

Point Estimate vs Interval Estimate


Point estimate. A point estimate of a population parameter is a single value of a statistic. For example:
a. The sample mean x̄ is a point estimate of the population mean μ.
b. The sample proportion p is a point estimate of the population proportion P.

Interval estimate. An interval estimate is defined by two numbers, between which a population parameter is said to lie. For example: a < μ < b is an interval estimate of the population mean μ. It indicates that the population mean is greater than a but less than b.

Confidence Intervals
Statisticians use a confidence interval to express the precision and uncertainty associated with a particular sampling method. A confidence interval consists of three parts:
1. A confidence level.
2. A statistic.
3. A margin of error.

For example: Suppose we compute an interval estimate of a population parameter. We might describe this interval estimate as a 95% confidence interval. This means that if we used the same sampling method to select different samples and compute different interval estimates, the true population parameter would fall within a range defined by the sample statistic ± margin of error 95% of the time.

Confidence Level
The probability part of a confidence interval is called a confidence level. The confidence level describes the likelihood that a particular sampling method will produce a confidence interval that includes the true population parameter. For example: Suppose we collected all possible samples from a given population, and computed a confidence interval for each sample. Some confidence intervals would include the true population parameter; others would not. A 95% confidence level means that 95% of the intervals contain the true population parameter; a 90% confidence level means that 90% of the intervals contain the population parameter.

Margin of Error
In a confidence interval, the range of values above and below the sample statistic is called the margin of error.

For example: Suppose the local newspaper conducts an election survey and reports that the independent candidate will receive 30% of the vote. The newspaper states that the survey had a 5% margin of error and a confidence level of 95%. These findings result in the following confidence interval: We are 95% confident that the independent candidate will receive between 25% and 35% of the vote.

Example
Which of the following statements is true?
I. When the margin of error is small, the confidence level is high.
II. When the margin of error is small, the confidence level is low.
III. A confidence interval is a type of point estimate.
IV. A population mean is an example of a point estimate.

(A) I only
(B) II only
(C) III only
(D) IV only
(E) None of the above

Solution
The correct answer is (E). The confidence level is not affected by the margin of error. When the margin of error is small, the confidence level can be low or high or anything in between. A confidence interval is a type of interval estimate, not a type of point estimate. A population mean is not an example of a point estimate; a sample mean is an example of a point estimate.

Standard Error
The standard error is an estimate of the standard deviation of a statistic. This lesson shows how to compute the standard error, based on sample data. The standard error is important because it is used to compute other measures, like confidence intervals and margins of error.

Notation
Population parameter
N: Number of observations in the population
Ni: Number of observations in population i
P: Proportion of successes in population
Pi: Proportion of successes in population i
μ: Population mean
μi: Mean of population i
σ: Population standard deviation
σp: Standard deviation of p
σx̄: Standard deviation of x̄

Sample statistic
n: Number of observations in the sample
ni: Number of observations in sample i
p: Proportion of successes in sample
pi: Proportion of successes in sample i
x̄: Sample estimate of population mean
x̄i: Sample estimate of μi
s: Sample estimate of σ
SEp: Standard error of p
SEx̄: Standard error of x̄

Standard Deviation of Sample Estimates


Statisticians use sample statistics to estimate population parameters. Naturally, the value of a statistic may vary from one sample to the next. The variability of a statistic is measured by its standard deviation.
Statistic: Standard Deviation
Sample mean, x̄: σx̄ = σ / sqrt( n )
Sample proportion, p: σp = sqrt [ P(1 - P) / n ]
Difference between means, x̄1 - x̄2: σx̄1-x̄2 = sqrt [ σ1² / n1 + σ2² / n2 ]
Difference between proportions, p1 - p2: σp1-p2 = sqrt [ P1(1-P1) / n1 + P2(1-P2) / n2 ]
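
The formula for the standard deviation of the sample mean can be checked empirically. The sketch below is only an illustration: the population values (mean 50, standard deviation 10), the sample size, and the number of repeated samples are all hypothetical choices, and the population is assumed to be normal. It draws many samples, records each sample mean, and compares the spread of those means with σ / sqrt( n ).

```python
import math
import random
import statistics

# Hypothetical population parameters, chosen only for illustration.
MU, SIGMA = 50.0, 10.0
N = 25            # observations per sample
TRIALS = 4000     # number of repeated samples

rng = random.Random(0)  # fixed seed so the run is repeatable

# Draw many samples and record the mean of each one.
sample_means = [
    statistics.fmean([rng.gauss(MU, SIGMA) for _ in range(N)])
    for _ in range(TRIALS)
]

empirical_sd = statistics.stdev(sample_means)   # spread of the sample means
theoretical_sd = SIGMA / math.sqrt(N)           # sigma / sqrt(n) = 10 / 5 = 2.0

print(f"empirical   sd of sample mean: {empirical_sd:.3f}")
print(f"theoretical sd of sample mean: {theoretical_sd:.3f}")
```

The two printed values should be close to each other; the empirical value varies slightly with the seed and the number of trials.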

Standard Error of Sample Estimates


Sadly, the values of population parameters are often unknown, making it impossible to compute the standard deviation of a statistic. When this occurs, use the standard error.
Statistic: Standard Error
Sample mean, x̄: SEx̄ = s / sqrt( n )
Sample proportion, p: SEp = sqrt [ p(1 - p) / n ]
Difference between means, x̄1 - x̄2: SEx̄1-x̄2 = sqrt [ s1² / n1 + s2² / n2 ]
Difference between proportions, p1 - p2: SEp1-p2 = sqrt [ p1(1-p1) / n1 + p2(1-p2) / n2 ]
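
The four standard error formulas translate directly into small functions. This is a minimal sketch; the function names are made up for illustration, and the input values in the usage lines are hypothetical.

```python
import math

def se_mean(s, n):
    """Standard error of a sample mean: s / sqrt(n)."""
    return s / math.sqrt(n)

def se_proportion(p, n):
    """Standard error of a sample proportion: sqrt[p(1 - p) / n]."""
    return math.sqrt(p * (1 - p) / n)

def se_diff_means(s1, n1, s2, n2):
    """Standard error of a difference between two sample means."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

def se_diff_proportions(p1, n1, p2, n2):
    """Standard error of a difference between two sample proportions."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Hypothetical inputs, for illustration only.
print(f"{se_mean(0.4, 900):.4f}")
print(f"{se_proportion(0.30, 400):.4f}")
print(f"{se_diff_means(3, 9, 4, 16):.4f}")
print(f"{se_diff_proportions(0.5, 100, 0.5, 100):.4f}")
```

Every input is a sample attribute (a sample size or a sample statistic), which is what distinguishes these from the standard deviation formulas that require population parameters.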

Example
Which of the following statements is true?
I. The standard error is computed solely from sample attributes.
II. The standard deviation is computed solely from sample attributes.
III. The standard error is a measure of central tendency.

(A) I only
(B) II only
(C) III only
(D) All of the above
(E) None of the above

Solution
The correct answer is (A). The standard error can be computed from a knowledge of sample attributes - sample size and sample statistics. The standard deviation cannot be computed solely from sample attributes; it requires a knowledge of one or more population parameters. The standard error is a measure of variability, not a measure of central tendency.

Margin of Error
In a confidence interval, the range of values above and below the sample statistic is called the margin of error.

For example: Suppose we wanted to know the percentage of adults that exercise daily. We could devise a sample design to ensure that our sample estimate will not differ from the true population value by more than, say, 5 percent (the margin of error) 90 percent of the time (the confidence level).

How to Compute the Margin of Error


The margin of error can be defined by either of the following equations:
1. Margin of error = Critical value × Standard deviation of the statistic
2. Margin of error = Critical value × Standard error of the statistic

If you know the standard deviation of the statistic, use the first equation to compute the margin of error. Otherwise, use the second equation.

How to Find the Critical Value #1


The critical value is a factor used to compute the margin of error. The central limit theorem states that the sampling distribution of a statistic will be normal or nearly normal, if any of the following conditions apply:
1. The population distribution is normal.
2. The sampling distribution is symmetric, unimodal, without outliers.
3. The sampling distribution is moderately skewed, unimodal, without outliers.
4. The sample size is 30 or greater, without outliers.

How to Find the Critical Value #2


When one of these conditions is satisfied, the critical value can be expressed as a t score or as a z score. To find the critical value, follow these steps:
1. Compute alpha (α): α = 1 - (confidence level / 100)
2. Find the critical probability (p*): p* = 1 - α/2
3. To express the critical value as a z score, find the z score having a cumulative probability equal to the critical probability.
4. To express the critical value as a t score, follow these steps:
   a) Find the degrees of freedom (DF). When estimating a mean score or a proportion from a single sample, DF is equal to the sample size minus one. For other applications, the degrees of freedom may be calculated differently. We will describe those computations as they come up.
   b) The critical t score is the t score having degrees of freedom equal to DF and a cumulative probability equal to the critical probability (p*).
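
Steps 1 to 3 can be sketched in a few lines of Python using the standard library's `statistics.NormalDist`. The function name `critical_z` is a made-up helper for illustration. Only the z score case is covered: the standard library has no t distribution, so the t score in step 4 would need an external package (SciPy's `scipy.stats.t.ppf` is one option).

```python
from statistics import NormalDist

def critical_z(confidence_level_percent):
    """Critical z score for a two-sided interval at the given confidence level."""
    # Step 1: alpha = 1 - (confidence level / 100)
    alpha = 1 - confidence_level_percent / 100
    # Step 2: critical probability p* = 1 - alpha / 2
    critical_probability = 1 - alpha / 2
    # Step 3: z score whose cumulative probability equals p*
    return NormalDist().inv_cdf(critical_probability)

for level in (90, 95, 99):
    print(f"{level}% confidence level -> z = {critical_z(level):.3f}")
```

For the common 90%, 95%, and 99% levels this gives roughly 1.645, 1.960, and 2.576.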

Example
Nine hundred (900) high school freshmen were randomly selected for a national survey. Among survey participants, the mean grade-point average (GPA) was 2.7, and the standard deviation was 0.4. What is the margin of error, assuming a 95% confidence level?

(A) 0.013
(B) 0.025
(C) 0.500
(D) 1.960
(E) None of the above

Solution
The correct answer is (B). To compute the margin of error, we need to find the critical value and the standard error of the mean. To find the critical value, we take the following steps:
1. Compute alpha (α): α = 1 - (confidence level / 100) = 1 - 0.95 = 0.05
2. Find the critical probability (p*): p* = 1 - α/2 = 1 - 0.05/2 = 0.975
3. Find the critical z score. Since the sample size is large, the sampling distribution will be roughly normal in shape. Therefore, we can express the critical value as a z score. For this problem, it will be the z score having a cumulative probability equal to 0.975. Using the Normal Distribution Table, we find that the critical value is 1.96.

Next, we find the standard error of the mean, using the following equation:
SEx̄ = s / sqrt( n ) = 0.4 / sqrt( 900 ) = 0.4 / 30 ≈ 0.013

And finally, we compute the margin of error (ME):
ME = Critical value × Standard error = 1.96 × 0.013 ≈ 0.025

This means we can be 95% confident that the mean grade point average in the population is 2.7 plus or minus 0.025, since the margin of error is 0.025.
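
The same calculation can be reproduced in Python as a rough check. This sketch uses only the numbers given in the problem; the variable names are arbitrary. Note that it carries the standard error at full precision, so the margin of error comes out near 0.026, whereas rounding the standard error to 0.013 before multiplying gives the 0.025 shown in the worked solution.

```python
import math
from statistics import NormalDist

n = 900                  # sample size
s = 0.4                  # sample standard deviation of GPA
confidence_level = 95    # percent

# Critical value: z score with cumulative probability 1 - alpha / 2.
alpha = 1 - confidence_level / 100
critical_probability = 1 - alpha / 2
z = NormalDist().inv_cdf(critical_probability)

# Standard error of the mean, then margin of error.
standard_error = s / math.sqrt(n)
margin_of_error = z * standard_error

print(f"critical z      = {z:.2f}")
print(f"standard error  = {standard_error:.4f}")
print(f"margin of error = {margin_of_error:.4f}")
```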

Confidence Interval
Statisticians use a confidence interval to describe the amount of uncertainty associated with a sample estimate of a population parameter.

How to Interpret Confidence Intervals #1


Example: Suppose that a 90% confidence interval states that the population mean is greater than 100 and less than 200. How would you interpret this statement? Some people think this means there is a 90% chance that the population mean falls between 100 and 200. This is incorrect. Like any population parameter, the population mean is a constant, not a random variable. It does not change. The probability that a constant falls within any given range is always 0.00 or 1.00.

How to Interpret Confidence Intervals #2


The confidence level describes the uncertainty associated with a sampling method. Suppose we used the same sampling method to select different samples and to compute a different interval estimate for each sample. Some interval estimates would include the true population parameter and some would not. A 90% confidence level means that we would expect 90% of the interval estimates to include the population parameter; a 95% confidence level means that 95% of the intervals would include the parameter.
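
This repeated-sampling interpretation can be illustrated with a small simulation. The sketch below makes several assumptions that are not part of the lesson: a normal population with a hypothetical mean of 150 and a known standard deviation of 40, samples of 50 observations, and 2,000 repeated samples. It builds a 90% interval from each sample and counts how many of them contain the true mean.

```python
import math
import random
from statistics import NormalDist, fmean

# Hypothetical population and sampling setup, for illustration only.
MU, SIGMA = 150.0, 40.0
N = 50
TRIALS = 2000
LEVEL = 0.90

rng = random.Random(1)  # fixed seed so the run is repeatable

# Margin of error = critical value * standard deviation of the sample mean
# (the population sigma is treated as known in this sketch).
z = NormalDist().inv_cdf(1 - (1 - LEVEL) / 2)
margin = z * SIGMA / math.sqrt(N)

hits = 0
for _ in range(TRIALS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    xbar = fmean(sample)
    # Does this sample's interval contain the fixed, true mean?
    if xbar - margin < MU < xbar + margin:
        hits += 1

coverage = hits / TRIALS
print(f"{coverage:.1%} of the intervals contain the true mean")
```

The population mean never changes in the simulation; only the intervals move from sample to sample, and the share that covers the mean should land close to 90%.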

Confidence Interval Data Requirements


To express a confidence interval, you need three pieces of information:
1. Confidence level
2. Statistic
3. Margin of error

Given these inputs, the range of the confidence interval is defined by the sample statistic ± margin of error. And the uncertainty associated with the confidence interval is specified by the confidence level. Note: Often, the margin of error is not given; we must calculate it.

How to Construct a Confidence Interval #1


There are four steps to constructing a confidence interval:
1. Identify a sample statistic. Choose the statistic (sample mean, sample proportion) that you will use to estimate a population parameter.
2. Select a confidence level. As we noted in the previous section, the confidence level describes the uncertainty of a sampling method. Often, researchers choose 90%, 95%, or 99% confidence levels; but any percentage can be used.

How to Construct a Confidence Interval #2


3. Find the margin of error. If you are working on a homework problem or a test question, the margin of error may be given. Often, however, you will need to compute the margin of error, based on one of the following equations.
   Margin of error = Critical value × Standard deviation of statistic
   Margin of error = Critical value × Standard error of statistic
4. Specify the confidence interval. The uncertainty is denoted by the confidence level. And the range of the confidence interval is defined by the following equation.
   Confidence interval = sample statistic ± Margin of error

Example
Suppose we want to estimate the average weight of an adult male in Dekalb County, Georgia. We draw a random sample of 1,000 men from a population of 1,000,000 men and weigh them. We find that the average man in our sample weighs 180 pounds, and the standard deviation of the sample is 30 pounds. What is the 95% confidence interval?

(A) 180 ± 1.86
(B) 180 ± 3.0
(C) 180 ± 5.88
(D) 180 ± 30
(E) None of the above

Solution
The correct answer is (A). To specify the confidence interval, we work through the four steps below.

1. Identify a sample statistic. Since we are trying to estimate the mean weight in the population, we choose the mean weight in our sample (180) as the sample statistic.
2. Select a confidence level. In this case, the confidence level is defined for us in the problem. We are working with a 95% confidence level.
3. Find the margin of error. The key steps are shown below.
   a. Find the standard error. The standard error (SE) of the mean is: SE = s / sqrt( n ) = 30 / sqrt( 1000 ) = 30 / 31.62 = 0.95
   b. Find the critical value. The critical value is a factor used to compute the margin of error. To express the critical value as a t score, follow these steps.
      Compute alpha (α): α = 1 - (confidence level / 100) = 0.05
      Find the critical probability (p*): p* = 1 - α/2 = 1 - 0.05/2 = 0.975
      Find the degrees of freedom (df): df = n - 1 = 1000 - 1 = 999
      The critical value is the t score having 999 degrees of freedom and a cumulative probability equal to 0.975. From the t Distribution Calculator, we find that the critical value is 1.96.
      Note: We might also have expressed the critical value as a z score. Because the sample size is large, a z score analysis produces the same result to two decimal places: a critical value equal to 1.96.
   c. Compute the margin of error (ME): ME = critical value × standard error = 1.96 × 0.95 = 1.86
4. Specify the confidence interval. The range of the confidence interval is defined by the sample statistic ± margin of error. And the uncertainty is denoted by the confidence level. Therefore, this 95% confidence interval says that the population mean falls within the interval 180 ± 1.86.
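
The four steps can be wrapped in a short Python function and applied to this example. This is a sketch, not the lesson's own method: `mean_confidence_interval` is a hypothetical helper, and it uses a z score for the critical value rather than the t score, which is an assumption that is reasonable here only because the sample is large.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sample_sd, n, confidence_level_percent):
    """Return (lower, upper, margin) for a mean, using a z critical value.

    Assumes a large sample, so the z score stands in for the t score.
    """
    alpha = 1 - confidence_level_percent / 100
    critical_value = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sample_sd / math.sqrt(n)
    margin = critical_value * standard_error
    return sample_mean - margin, sample_mean + margin, margin

# Values from the Dekalb County example: mean 180, sd 30, n = 1,000, 95% level.
low, high, margin = mean_confidence_interval(180, 30, 1000, 95)
print(f"180 ± {margin:.2f} -> ({low:.2f}, {high:.2f})")
```

With these inputs the margin rounds to 1.86, giving an interval from about 178.14 to 181.86 pounds.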
