ENGINEERING DATA ANALYSIS


POINT ESTIMATION

 Suppose we have an unknown population parameter, such as a population mean μ or a population proportion p, which we'd
like to estimate. We can't possibly survey the entire population. So, of course, we do what comes naturally and take a random
sample from the population, and use the resulting data to estimate the value of the population parameter. Of course, we want
the estimate to be "good" in some way.

 Point estimation is the process of finding an approximate value of some parameter – such as the mean (average) of a
population – from a random sample of that population.

UNBIASED ESTIMATOR

 In statistics, estimation (or inference) refers to the process by which one makes inferences (e.g. draws conclusions) about a
population, based on information obtained from a sample.
 A statistic is any measurable quantity calculated from a sample of data (e.g. the average). This is a stochastic variable as, for
a given population, it will in general vary from sample to sample.
 An estimator is any quantity calculated from the sample data which is used to give information about an unknown quantity in
the population (the estimand).
 An estimate is the particular value of an estimator that is obtained by a particular sample of data and used to indicate the
value of a parameter.
Example:

 Population: people in this room


 Sample I: people sitting in the middle row
 Sample II: people whose names start with the letter M
 Statistic: average height
 I can use this statistic as an estimator for the average height of the population obtaining different results from the two samples

Definition
 The bias of an estimator is the difference between the expectation value over its PDF (i.e. its mean value) and the population
value.

 An estimator is called unbiased if its bias b = 0, and biased otherwise.

 An unbiased estimator is a statistic used to approximate a population parameter that is “accurate” in the sense that, on
average, it neither overestimates nor underestimates the parameter.

 This essentially means that it is an unbiased estimator when the mean of the statistic’s sampling distribution is equal to the
population’s parameter.

Example:
1. Show that the sample mean is an unbiased estimator of the population mean: 𝐸(𝑋̅) = 𝜇
2. Show that the sample variance is an unbiased estimator of the population variance: 𝐸(𝑠²) = 𝜎²
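Both results can be checked empirically with a short simulation (a sketch, not part of the notes; the population parameters μ = 10 and σ = 2 below are made up for illustration):

```python
import random
import statistics

# Draw many samples from a population with known mu and sigma^2, then
# check that the averages of x-bar and s^2 (computed with the n-1
# divisor) land near mu and sigma^2, as unbiasedness predicts.
random.seed(1)
mu, sigma, n, trials = 10.0, 2.0, 5, 20000

means, variances = [], []
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(statistics.mean(sample))
    variances.append(statistics.variance(sample))  # divides by n - 1

print(round(statistics.mean(means), 1))      # close to mu = 10.0
print(round(statistics.mean(variances), 1))  # close to sigma^2 = 4.0
```

Note that `statistics.variance` uses the n − 1 divisor; dividing by n instead would make the average of the variance estimates fall systematically below σ².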

Useful properties:
𝑬(∑ 𝑿𝒊) = ∑ 𝑬(𝑿𝒊)
𝑬(𝒄𝑿) = 𝒄𝑬(𝑿)
𝑬(𝑪) = 𝑪

EDA: Point Estimation


∑ᵢ₌₁ⁿ 𝑪 = 𝒏𝑪
𝑬(𝑿̅²) = 𝝈²/𝒏 + 𝝁²

Rules for the Variance


 The variance of a constant is zero.
𝑽(𝒄) = 𝟎
 Adding a constant value to a random variable does not change the variance.
𝑽(𝑿 + 𝒄) = 𝑽(𝑿)
 Multiplying a random variable by a constant increases the variance by the square of the constant.
𝑽(𝒄𝑿) = 𝒄²𝑽(𝑿)
 The variance of the sum of two or more independent random variables is the sum of their respective variances.
𝑽(𝑿 + 𝒀) = 𝑽(𝑿) + 𝑽(𝒀)

Example:
Let X be a random variable with E(X) = 100 and V(X) = 15. Find the following:
a. 𝑉(3𝑋 + 10)
b. 𝑉(−2𝑋)
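Working the example with the rules above (a quick check, not part of the notes): adding the constant 10 does nothing, and multiplying by a constant squares it, so V(3X + 10) = 9 · 15 and V(−2X) = 4 · 15.

```python
# Apply the variance rules to the example: V(X) = 15.
v_x = 15

v_a = 3**2 * v_x       # V(3X + 10) = V(3X) = 9 * V(X) = 135
v_b = (-2)**2 * v_x    # V(-2X) = 4 * V(X) = 60

print(v_a, v_b)  # 135 60
```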

Goodness of an Estimator
Three of the measures that we will use to assess the goodness of an estimator are its bias, its mean-square error and its standard
error.
1. If 𝜃̂ is an estimator of 𝜃, then the bias of 𝜃̂ is given by
𝐵(𝜃̂) = 𝐸(𝜃̂) − 𝜃
2. The mean-square error of 𝜃̂ is given by
𝑀𝑆𝐸(𝜃̂) = 𝐸[(𝜃̂ − 𝜃)²]
𝑀𝑆𝐸(𝜃̂) = 𝑉𝑎𝑟(𝜃̂) + [𝐵(𝜃̂)]²
3. If 𝜃̂ is an estimator of 𝜃, then the standard error of 𝜃̂ is simply its standard deviation.
𝑆𝐸(𝜃̂) = √𝑉𝑎𝑟(𝜃̂)
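These three measures can be estimated by simulation for a known biased estimator, the variance estimator that divides by n instead of n − 1 (an illustrative setup, not from the notes; its theoretical bias is −σ²/n):

```python
import random
import statistics

# Simulate many samples, compute the divide-by-n variance estimator for
# each, and estimate its bias, MSE and standard error.
random.seed(2)
mu, sigma, n, trials = 0.0, 1.0, 4, 40000
theta = sigma**2  # true parameter: sigma^2 = 1

estimates = []
for _ in range(trials):
    x = [random.gauss(mu, sigma) for _ in range(n)]
    m = sum(x) / n
    estimates.append(sum((xi - m)**2 for xi in x) / n)  # divides by n

bias = statistics.mean(estimates) - theta               # B = E[theta_hat] - theta
mse = statistics.mean((e - theta)**2 for e in estimates)
se = statistics.variance(estimates) ** 0.5              # SE = sqrt(Var)

print(round(bias, 2))  # close to -sigma^2/n = -0.25
```

The simulation also confirms the decomposition above: the estimated MSE matches Var(𝜃̂) + [B(𝜃̂)]² up to simulation noise.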

MEAN SQUARE ERROR


 The mean square error indicates the “average” squared error of the estimates, computed over all possible random samples of
size n.
Example:
1. Determine whether the estimator 𝑝̂ = 𝑋/𝑛 of the parameter 𝑝 from a 𝐵𝑖𝑛(𝑛, 𝑝) distribution is unbiased.
2. Determine MSE(𝑝̂).
3. Determine SD(𝑝̂).
4. Determine the MSE of the sample mean.
5. Determine the standard error of the sample mean.
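A sketch of the binomial part of the example, using the standard results E(𝑝̂) = np/n = p (so 𝑝̂ is unbiased, bias = 0), Var(𝑝̂) = p(1 − p)/n, hence MSE(𝑝̂) = Var(𝑝̂) and SD(𝑝̂) = √(p(1 − p)/n). The function name and the values n = 100, p = 0.3 are illustrative, not from the notes:

```python
import math

def phat_summary(n, p):
    """Bias, MSE and SD of p_hat = X/n for X ~ Bin(n, p)."""
    bias = 0.0                 # E[X/n] = np/n = p, so p_hat is unbiased
    mse = p * (1 - p) / n      # MSE = Var + bias^2 = Var since bias = 0
    sd = math.sqrt(mse)
    return bias, mse, sd

bias, mse, sd = phat_summary(100, 0.3)
print(round(mse, 4), round(sd, 4))  # 0.0021 0.0458
```

The same pattern answers parts 4 and 5: the sample mean is unbiased, so MSE(𝑋̅) = Var(𝑋̅) = σ²/n and SE(𝑋̅) = σ/√n.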


STATISTICAL INTERVALS

 A confidence interval estimate for 𝜇 is an interval of the form 𝑙 ≤ 𝜇 ≤ 𝑢, where the end-points 𝑙 and 𝑢 are computed from
the sample data. Because different samples will produce different values of 𝑙 and 𝑢, these end-points are values of random
variables L and U, respectively. Suppose that we can determine values of L and U such that the following probability
statement is true:
𝑃{𝐿 ≤ 𝜇 ≤ 𝑈} = 1 − 𝛼
where 0 ≤ 𝛼 ≤ 1. There is a probability of 1 − 𝛼 of selecting a sample for which the confidence interval will contain the true
value 𝜇.

Confidence Interval on the Mean of a Normal Distribution, Variance Known


 If 𝑥̅ is the sample mean of a random sample of size n from a normal population with known variance 𝜎², a 100(1 − 𝛼)%
confidence interval on 𝜇 is given by
𝑥̅ − 𝑍𝛼/2 (𝜎/√𝑛) ≤ 𝜇 ≤ 𝑥̅ + 𝑍𝛼/2 (𝜎/√𝑛)

where 𝑍𝛼/2 is the upper 100𝛼/2 percentage point of the standard normal distribution.

 The interpretation of a confidence interval (for example, on 𝜇) is

“The observed interval [𝑙, 𝑢] brackets the true value of 𝜇, with confidence 100(1 − 𝛼)%.”

Example:
Ten measurements of impact energy (J) on specimens of A238 steel cut at 60 degrees Celsius are as follows: 64.1, 64.7, 64.5,
64.6, 64.5, 64.3, 64.6, 64.8, 64.2, and 64.3. Assume that impact energy is normally distributed with 𝜎 = 1 J. We want to find a
95% confidence interval for 𝜇, the mean impact energy.
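The interval can be computed directly from the formula above, with 𝑍₀.₀₂₅ = 1.96 for 95% confidence:

```python
import math

# Impact energy data from the example; sigma = 1 J is known.
data = [64.1, 64.7, 64.5, 64.6, 64.5, 64.3, 64.6, 64.8, 64.2, 64.3]
sigma, z = 1.0, 1.96  # z_{alpha/2} for alpha = 0.05

n = len(data)
xbar = sum(data) / n                      # sample mean = 64.46
half_width = z * sigma / math.sqrt(n)     # 1.96 / sqrt(10)

lower, upper = xbar - half_width, xbar + half_width
print(round(lower, 2), round(upper, 2))   # 63.84 65.08
```

So the 95% two-sided confidence interval is 63.84 ≤ 𝜇 ≤ 65.08.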

One-sided Confidence Bounds


 A 100(1 − 𝛼)% upper-confidence bound for 𝝁 is
𝜇 ≤ 𝑢 = 𝑥̅ + 𝑍𝛼 (𝜎/√𝑛)

 A 100(1 − 𝛼)% lower-confidence bound for 𝝁 is
𝑙 = 𝑥̅ − 𝑍𝛼 (𝜎/√𝑛) ≤ 𝜇

Example: Using the same data as above, find the one-sided 95% lower confidence bound for the mean impact energy.
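For a one-sided bound, all of 𝛼 = 0.05 sits in a single tail, so 𝑍₀.₀₅ = 1.645 replaces 1.96:

```python
import math

# Same impact energy data; one-sided lower 95% bound with z_alpha = 1.645.
data = [64.1, 64.7, 64.5, 64.6, 64.5, 64.3, 64.6, 64.8, 64.2, 64.3]
sigma, z_alpha = 1.0, 1.645

xbar = sum(data) / len(data)
lower = xbar - z_alpha * sigma / math.sqrt(len(data))
print(round(lower, 2))  # 63.94
```

That is, 𝜇 ≥ 63.94 J with 95% confidence; note this lower bound is higher than the two-sided interval's lower endpoint because the one-sided bound uses a smaller z value.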
