Point Estimation, Interval Estimation, and Testing
October 1, 2007
Outline
1 Point Estimation
Sampling Distributions for Point Estimators
Small Sample Properties
Large Sample Properties
2 Interval Estimation
Sampling Distributions for Interval Estimators
Small Sample Properties
Large Sample Properties
3 Testing
Some Statistical Decision Theory
Sampling Distributions for Test Statistics
p-Values, Rejection Regions, and CIs
Point Estimation
Often we use a statistic to estimate (or guess) the value of a parameter; we denote such an estimator with a hat (e.g. θ̂). This kind of estimation is known as point estimation.
Point estimates are realized values of an estimator, and hence they are not random (e.g. x̄).
[Figure: histogram of income (density scale; income from 0 to 20)]
[Figure: histogram of income with the population density overlaid]
We may not have enough data to get a good estimate of the density (the infinite-data histogram), but we may have enough data to estimate one characteristic (parameter) of the density. Often we choose the balance point (the mean) as our parameter of interest.
[Figure: histogram of income with the population density and its balance point marked]
Clearly, some of these estimators are better than others (which ones?), but how can we
define “better”?
Illustrative Example:
X = the number of times a respondent voted in the last two presidential elections.
We will assume three possible values: {0, 1, 2}.

Assume P(x) = 1/4 for x = 0, 1/2 for x = 1, and 1/4 for x = 2.

Assume n = 2.
Exercise:
1 List all the possible samples
2 Calculate the probability of each sample under repeated sampling
3 Form the sampling distribution for the sample mean
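A minimal Python sketch of this exercise (the distribution and n = 2 come from the setup above; the code enumerates all samples exactly rather than simulating):

from itertools import product

# P(X = x) for the voting variable described above
px = {0: 0.25, 1: 0.50, 2: 0.25}
n = 2

# 1. List all possible samples of size n; the draws are independent,
#    so the probability of a sample is the product of the marginals.
samples = {s: px[s[0]] * px[s[1]] for s in product(px, repeat=n)}

# 2./3. Sampling distribution of the sample mean: group samples by x-bar.
sampling_dist = {}
for s, p in samples.items():
    xbar = sum(s) / n
    sampling_dist[xbar] = sampling_dist.get(xbar, 0.0) + p

for xbar in sorted(sampling_dist):
    print(f"P(Xbar = {xbar}) = {sampling_dist[xbar]:.4f}")
# Output: .0625 at 0, .25 at 0.5, .375 at 1, .25 at 1.5, .0625 at 2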
ANES Example
The next slide shows an approximation of this procedure for the four proposed
estimators. I simulated 10,000 data sets of size n from the density shown at the
beginning of the lecture notes.
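The original simulation code is not shown; a sketch along the following lines would reproduce the idea. The income density here is a stand-in assumption (a right-skewed gamma with mean 16), as is the sample size, and the four estimators are those examined below: Y1, ½(Y1 + Yn), the constant 7, and the sample mean Ȳn.

import numpy as np

rng = np.random.default_rng(0)
n, reps = 25, 10_000  # the notes only say "size n"; n = 25 is an assumption

# Stand-in for the income density from the start of the notes:
# a right-skewed gamma with mean 4 * 4 = 16 (an assumption).
data = rng.gamma(shape=4.0, scale=4.0, size=(reps, n))

mu_hat1 = data[:, 0]                        # Y1: first observation only
mu_hat2 = 0.5 * (data[:, 0] + data[:, -1])  # average of first and last
mu_hat3 = np.full(reps, 7.0)                # the constant 7
mu_hat4 = data.mean(axis=1)                 # Y-bar: the sample mean

for name, est in [("muHat1", mu_hat1), ("muHat2", mu_hat2),
                  ("muHat3", mu_hat3), ("muHat4", mu_hat4)]:
    print(f"{name}: mean = {est.mean():6.2f}, sd = {est.std():5.2f}")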
[Figure: simulated sampling distributions of muHat1, muHat2, muHat3, and muHat4]
Bias
Bias is the expected difference between the estimator and the parameter. Bias is not
the difference between an estimate and the parameter.
Bias(θ̂) = E[θ̂ − θ]
         = E[θ̂] − θ

Bias(X̄n) = E[X̄n − E[X]]
          = E[µ̂ − µ]
          = 0
Example
1 E[Y1] = µ
2 E[½(Y1 + Yn)] = ½(µ + µ) = µ
3 E[7] = 7
4 E[Ȳn] = (1/n)·nµ = µ
Estimators 1, 2, and 4 all get the right answer on average. Which is better?
[Figure: simulated sampling distributions of muHat1, muHat2, muHat3, and muHat4 (repeated)]
Election Example
Let π be the proportion of voters who will vote for the Republican candidate in the 2008
general election. Let’s examine two estimators.
1 µ̂ = Y1 = 1 if vote Republican, 0 otherwise
2 µ̂ = class guess
Which is unbiased?
Variance
All else equal, we prefer estimators with small variance. In particular, if two estimators
are unbiased, we prefer the estimator with the smaller variance.
Low variance means that under repeated sampling, the estimates are likely to be
similar.
Note that this doesn’t necessarily mean that a particular estimate is close to the true
parameter value.
Note also that the standard deviation from a sampling distribution is often called the
standard error.
Variance
1 V[Y1] = σ²
2 V[½(Y1 + Yn)] = ¼·V[Y1 + Yn] = ¼(σ² + σ²) = σ²/2
3 V[7] = 0
4 V[Ȳn] = (1/n²)·nσ² = σ²/n
Among the unbiased estimators, the sample average has the smallest variance. This means that Estimator 4 (the sample average) is likely to be closer to the true value µ than Estimators 1 and 2.
In order to fully understand this, it is helpful to again look at the sampling distributions.
[Figure: simulated sampling distributions of muHat1, muHat2, muHat3, and muHat4 (repeated)]
∑ᵢ(xᵢ − a)² = ∑ᵢ{(xᵢ − x̄) + (x̄ − a)}²
            = ∑ᵢ{(xᵢ − x̄)² + 2(x̄ − a)(xᵢ − x̄) + (x̄ − a)²}
            = ∑ᵢ(xᵢ − x̄)² + 2(x̄ − a)∑ᵢ(xᵢ − x̄) + n(x̄ − a)²
            = ∑ᵢ(xᵢ − x̄)² + n(x̄ − a)²

where all sums run over i = 1, …, n; the cross term vanishes because ∑ᵢ(xᵢ − x̄) = 0.
Show that X̄ is the best linear unbiased estimator for µ (i.e. the smallest-variance estimator among linear unbiased estimators).
1 Use E[∑ᵢ wᵢXᵢ] = µ to derive something about ∑ᵢ wᵢ.
2 Simplify V[∑ᵢ wᵢXᵢ].
3 Write each wᵢ in this simplified expression as 1/n + cᵢ.
4 ...
MSE is the expected squared difference between the estimator and the parameter.
MSE is not the squared difference between an estimate and the parameter.
Furthermore, MSE can be written as the Bias squared plus the Variance.
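Written out, for an estimator θ̂ of θ:

MSE(θ̂) = E[(θ̂ − θ)²] = Bias(θ̂)² + V[θ̂]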
Example
Assume an i.i.d. sample and recall the two possible definitions of sample variance:
S0n² = (1/n) ∑ᵢ (Xᵢ − X̄n)²

S1n² = (1/(n − 1)) ∑ᵢ (Xᵢ − X̄n)²
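A quick simulation makes the difference concrete. This sketch (normal data, σ² = 4, and n = 5 are arbitrary choices) estimates the expectation of each estimator under repeated sampling; only S1n² is unbiased for σ², though the bias of S0n² vanishes as n grows:

import numpy as np

rng = np.random.default_rng(1)
n, reps, sigma2 = 5, 100_000, 4.0  # small n makes the bias visible

x = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=(reps, n))
xbar = x.mean(axis=1, keepdims=True)
ss = ((x - xbar) ** 2).sum(axis=1)

s0_sq = ss / n          # divides by n: biased, E = (n-1)/n * sigma^2
s1_sq = ss / (n - 1)    # divides by n-1: unbiased

print(f"E[S0^2] ~= {s0_sq.mean():.3f}  (theory: {(n-1)/n*sigma2:.3f})")
print(f"E[S1^2] ~= {s1_sq.mean():.3f}  (theory: {sigma2:.3f})")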
Asymptotic Unbiasedness
E[θ̂n] → θ as n → ∞
[Figure: sampling densities f(θ̂) for n = 1, n = 10, and n = 100]
Consistency
θ̂n →p θ
[Figure: sampling densities f(X̄n) for n = 1, n = 10, and n = 100]
An estimator θ̂n with a possibly unknown sampling distribution has asymptotic sampling distribution F if
1 θ̂n has a sampling distribution described by cdf Fn, and
2 Fn →d F as n → ∞.
[Figure: sampling distributions of muHat4 for n = 1, 2, 10, and 30]
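The pattern in that figure is easy to reproduce. A sketch, again using a hypothetical skewed income density (gamma with mean 16) since the original data are not included: for each n, simulate many samples, compute muHat4 = Ȳn, and watch the sampling distribution tighten around the mean.

import numpy as np

rng = np.random.default_rng(2)
reps = 10_000

for n in (1, 2, 10, 30):
    # Hypothetical skewed "income" density (gamma, mean 16), as before.
    mu_hat4 = rng.gamma(4.0, 4.0, size=(reps, n)).mean(axis=1)
    q05, q50, q95 = np.percentile(mu_hat4, [5, 50, 95])
    print(f"n = {n:2d}: 5% = {q05:5.1f}, median = {q50:5.1f}, 95% = {q95:5.1f}")
# The 5%-95% spread shrinks toward the mean (16) as n grows.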
An interval estimate takes the form [θ̂lower, θ̂upper].
Example: Party ID
QUESTION:
---------
Generally speaking, do you usually think of yourself as a
REPUBLICAN, a DEMOCRAT, an INDEPENDENT, or what?
Would you call yourself a STRONG [Democrat/Republican] or
a NOT VERY STRONG [Democrat/Republican]?
Do you think of yourself as CLOSER to the Republican
Party or to the Democratic party?
VALID CODES:
------------
0. Strong Democrat (2/1/.)
1. Weak Democrat (2/5-8-9/.)
2. Independent-Democrat (3-4-5/./5)
3. Independent-Independent
(3/./3-8-9 ; 5/./3-8-9 if not apolitical)
4. Independent-Republican (3-4-5/./1)
5. Weak Republican (1/5-8-9/.)
6. Strong Republican (1/1/.)
Let X be a discrete random variable describing PID with the following distribution.
x      0    1    2    3    4    5    6
f(x)  .16  .15  .17  .10  .12  .14  .16
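As a sanity check, the mean and standard deviation of X can be computed directly from this table; a short Python sketch:

import numpy as np

x = np.arange(7)                                   # PID codes 0..6
f = np.array([.16, .15, .17, .10, .12, .14, .16])  # f(x) from the table

mu = (x * f).sum()                         # E[X]
sigma = np.sqrt(((x - mu) ** 2 * f).sum())  # sd of X
print(f"mu = {mu:.2f}, sigma = {sigma:.2f}")  # mu = 2.93, sigma ~ 2.08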
Interval Estimates

[Figure: interval estimates from 10 samples plotted on the PID scale, 0 to 6]
============================================================================
B1. INTRO THERMOMETERS PRE
============================================================================
[Figure: histograms of the hcFTS and jeFTS feeling thermometer scores, 0 to 100]
[Figure: interval estimates of µ̂ for repeated samples, on the 0 to 100 scale]
Coverage Probability
Coverage probability is the probability that an interval estimator contains the true value
of the parameter.
P(θ̂lower ≤ θ ≤ θ̂upper ) = 1 − α
Question:
What is the probability that an interval estimate contains the true value of the parameter? For example,

[x̄ − 1.96·s/√n, x̄ + 1.96·s/√n]
(µ̂ − µ)/(σ/√n) ∼ N(0, 1)

P(−1.96 ≤ (µ̂ − µ)/(σ/√n) ≤ 1.96) = 95%

P(µ̂ − 1.96·σ/√n ≤ µ ≤ µ̂ + 1.96·σ/√n) = 95%

µ̂ ± 1.96·σ/√n
P(−1.96 ≤ (µ̂ − µ)/(σ/√n) ≤ 1.96) = 95%

P(−zα/2 ≤ (µ̂ − µ)/(σ/√n) ≤ zα/2) = 100(1 − α)%

P(µ̂ − zα/2·σ/√n ≤ µ ≤ µ̂ + zα/2·σ/√n) = 100(1 − α)%

We usually construct the 100(1 − α)% confidence interval with the following formula:

µ̂ ± zα/2·σ/√n
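A simulation can verify the claimed coverage. This sketch (normal data with known σ; the values of µ, σ, and n are arbitrary choices) constructs x̄ ± 1.96·σ/√n over many repeated samples and counts how often the interval contains µ:

import numpy as np

rng = np.random.default_rng(3)
mu, sigma, n, reps = 50.0, 10.0, 25, 100_000

x = rng.normal(mu, sigma, size=(reps, n))
xbar = x.mean(axis=1)
half = 1.96 * sigma / np.sqrt(n)   # half-width, sigma known

covered = (xbar - half <= mu) & (mu <= xbar + half)
print(f"empirical coverage: {covered.mean():.3f}")  # close to 0.95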
Question:
Why not 100% confidence?
Suppose we model JE FTS scores as normally distributed with σ unknown. Recall that if X1, ..., Xn ∼ i.i.d. N(µ, σ²), then

(µ̂ − µ)/(σ/√n) ∼ N(0, 1)
Question:
Why can’t our previous interval be used?
µ̂ ± zα/2·σ/√n

requires the unknown σ. We can instead estimate the standard error:

ŜE[µ̂] = S/√n
If Z ∼ N(0, 1) and Y ∼ χ²ν are independent, then

X ≡ Z/√(Y/ν)

follows a tν distribution.
If a sample (X1, . . . , Xn) of any size n is taken from a normal distribution with known mean and unknown variance, then the sampling distribution of the sample mean minus the known mean, divided by the estimated standard error, will have the t distribution with ν = n − 1 degrees of freedom.
100(1 − α)% t-Intervals

(µ̂ − µ)/(σ̂/√n) ∼ tn−1

P(−tn−1,α/2 ≤ (µ̂ − µ)/(σ̂/√n) ≤ tn−1,α/2) = 100(1 − α)%

P(µ̂ − tn−1,α/2·σ̂/√n ≤ µ ≤ µ̂ + tn−1,α/2·σ̂/√n) = 100(1 − α)%

We usually construct the 100(1 − α)% confidence interval with the following formula:

µ̂ ± tn−1,α/2·σ̂/√n
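With scipy this is a few lines. A sketch, where the data are fabricated placeholders standing in for the JE FTS scores (which are not reproduced in the notes):

import numpy as np
from scipy import stats

# Placeholder data standing in for JE FTS scores (0-100 thermometer).
y = np.array([55, 70, 40, 85, 60, 30, 75, 50, 65, 45], dtype=float)

n = len(y)
alpha = 0.05
mu_hat = y.mean()
se_hat = y.std(ddof=1) / np.sqrt(n)            # S / sqrt(n)
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)  # t_{n-1, alpha/2}

lower, upper = mu_hat - t_crit * se_hat, mu_hat + t_crit * se_hat
print(f"{100*(1-alpha):.0f}% CI for mu: [{lower:.1f}, {upper:.1f}]")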
Without making an assumption about the population distribution, we will often not know
the sampling distribution of the interval estimator, and therefore, we will not know the
coverage probability.
P(θ̂lower,n ≤ θ ≤ θ̂upper,n) → 1 − α as n → ∞
If

(µ̂n − µ)/(1/√n) →d N(0, σ²)

and

σ̂n →p σ,

then it can be shown (e.g. via Slutsky's theorem) that

(µ̂n − µ)/(σ̂n/√n) →d N(0, 1).

Therefore, our normal-quantile confidence intervals will have valid asymptotic coverage (as will t-quantile intervals).
[Figure: densities of the t1, t4, and t15 distributions]
[Figure: estimated sampling densities of µ̂ for the Clinton and Edwards feeling thermometer scores]
Suppose we can somehow model the probabilities for the various outcomes conditional
on the true state of the world.
We would like α (the probability of Type I error) and β (the probability of Type II error) to be small, but it may be difficult to achieve both goals.
The standard statistical approach is to pick a small level for α (e.g. 5%), and then try to
minimize β given this constraint.
As in our previous example, let µ be the expected value of JE FTS for the population.
Let's assume the population mean for HC FTS is 55 (i.e. equal to the sample mean).
Here are two possible hypothesis tests:

H0: µ = 55
H1: µ ≠ 55

H0: µ ≤ 55
H1: µ > 55
Test Statistics
A test statistic is a function of the sample and the null hypothesis (and may provide
evidence against the null hypothesis).
Examples:
1 If H0: µ = 55, then X̄ − 55 would be a test statistic.
2 If H0: µ ≤ 55, then X̄ − 55 would be a test statistic.
Why does the second test statistic make sense given the inequality in the null
hypothesis?
Let µ0 be the “null” value of the parameter µ (e.g. 55). Then the one sample t-statistic
can be written as the following:
(X̄ − µ0)/(S/√n)
Notice that being a function of the sample, this t-statistic will have a sampling
distribution.
A null distribution is the sampling distribution for the test statistic when the null
hypothesis is true. More exactly, the null distribution is the sampling distribution for the
test statistic when θ = θ0 .
For our example, the null distribution is the sampling distribution of the t-statistic
(X̄ − 55)/(S/√n)
when µ = 55.
Suppose we model JE FTS scores as normally distributed with σ unknown. Recall that
if X1 , ..., Xn ∼i.i.d. N(µ, σ 2 ) , then
(X̄ − 55)/(S/√n) ∼ tn−1
when µ = 55.
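One way to see this null distribution concretely is to simulate it: draw repeated samples with µ = 55, compute the t-statistic each time, and compare the simulated quantiles to those of tn−1. A sketch (n and σ are arbitrary choices):

import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, reps = 20, 50_000
mu0 = 55.0

# Draw samples under the null (mu = 55); sigma = 15 is an arbitrary choice.
x = rng.normal(mu0, 15.0, size=(reps, n))
t_stats = (x.mean(axis=1) - mu0) / (x.std(axis=1, ddof=1) / np.sqrt(n))

# Compare simulated quantiles to the theoretical t quantiles.
for q in (0.025, 0.5, 0.975):
    sim = np.quantile(t_stats, q)
    theo = stats.t.ppf(q, df=n - 1)
    print(f"q = {q}: simulated {sim:6.2f}, theoretical {theo:6.2f}")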
Null Distribution

[Figure: the null distribution of the test statistic (a t density from −3 to 3)]
p-Value
The p-value is the probability under the null distribution of getting a sample at least as
extreme as the one we got.
Examples:

H1: µ ≠ 55 ⇒ p-value = P(tstat ≥ |tobs| ∪ tstat ≤ −|tobs| | µ = 55)

H1: µ > 55 ⇒ p-value = P(tstat ≥ tobs | µ = 55)
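Both p-values can be computed from the tail probabilities of the tn−1 distribution. A sketch, reusing the fabricated FTS placeholder data from the t-interval example above:

import numpy as np
from scipy import stats

y = np.array([55, 70, 40, 85, 60, 30, 75, 50, 65, 45], dtype=float)
n, mu0 = len(y), 55.0

t_obs = (y.mean() - mu0) / (y.std(ddof=1) / np.sqrt(n))

# Two-sided: P(|t| >= |t_obs|) under t_{n-1}
p_two = 2 * stats.t.sf(abs(t_obs), df=n - 1)
# One-sided (H1: mu > 55): P(t >= t_obs)
p_one = stats.t.sf(t_obs, df=n - 1)

print(f"t_obs = {t_obs:.3f}, two-sided p = {p_two:.3f}, one-sided p = {p_one:.3f}")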
[Figure: two-sided p-value: shaded area beyond tobs and −tobs under the null density]

[Figure: one-sided p-value: shaded area beyond tobs under the null density]
Rejection Regions
Recall that α is the probability of Type I Error. Often we want to limit α to 5% while
minimizing the probability of Type II Error. This can be accomplished in the following
manner.
[Figure: two-sided rejection region (α = 5%), with fences at the critical values and tobs marked]
[Figure: one-sided rejection region (α = 5%), with a fence at the critical value and tobs marked]
[Figure: two-sided rejection region (α = 5%), with fences and both tobs and −tobs marked]
[Figure: one-sided rejection region (α = 5%), with a fence and tobs marked]
Rejection Regions and CIs (α = 5%)

[Figure: rejection-region fences and the confidence interval shown together against f(x̄ | H0), for x̄ from 50 to 60]
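The figure illustrates the duality between tests and confidence intervals: a level-α two-sided t-test rejects H0: µ = µ0 exactly when µ0 falls outside the 100(1 − α)% t-interval. A sketch checking this numerically with the same fabricated placeholder data as above:

import numpy as np
from scipy import stats

y = np.array([55, 70, 40, 85, 60, 30, 75, 50, 65, 45], dtype=float)
n, alpha = len(y), 0.05
se = y.std(ddof=1) / np.sqrt(n)
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
ci = (y.mean() - t_crit * se, y.mean() + t_crit * se)

for mu0 in (50.0, 55.0, 70.0):
    t_obs = (y.mean() - mu0) / se
    reject = abs(t_obs) > t_crit             # test decision
    outside = not (ci[0] <= mu0 <= ci[1])    # CI decision
    assert reject == outside                 # the duality holds
    print(f"mu0 = {mu0}: reject = {reject}, outside CI = {outside}")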