Chapter 9 and Chapter 11

Sections 9.6 to 9.9,
Sections 11.1 to 11.4

Monday, June 7th


Previous Lecture
We have seen how to find the sampling
distribution and the confidence intervals
for cases with categorical variables, and
more specifically for:
Proportion of one sample
Difference of two population proportions
This lecture
Now we will see how to find sampling
distributions for cases where we have
quantitative data, and more specifically for:
Mean of one population
Difference of the means of two dependent samples
Difference of the means of two independent samples
Mean of one
population
Example
Let's say that I am interested in finding the
distribution of the mean weight of male college
students. Then there are four cases:
Case 1: known population mean and known
population standard deviation
Case 2: known population mean but unknown
population standard deviation
Case 3: unknown population mean and known
population standard deviation
Case 4: unknown population mean and unknown
population standard deviation
Refresh our memory
Population mean is denoted by: $\mu$
Sample mean is denoted by: $\bar{x}$
Population standard deviation is denoted by: $\sigma$
Sample standard deviation is denoted by: $s$
IMPORTANT: The four cases that follow
are true under the condition that the
weight of my population follows a normal
curve
Back to our Example
Case 1:
I know that the mean weight of the population
of all male college students is 180 lbs and the population
standard deviation is 30. Then I have a normal
distribution with
Mean = 180
Standard deviation = 30
And we can assume that for a sample of n
observations the distribution of $\bar{x}$ is normal with
mean 180 and standard deviation $\sigma/\sqrt{n} = 30/\sqrt{n}$
Of course this is not the case 99.9% (why?) of the
time, so this case is not that interesting
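As a quick numerical check of Case 1, the sketch below computes the standard deviation of the sample mean, $\sigma/\sqrt{n}$, for a few sample sizes. The function name `sd_of_sample_mean` and the sample sizes are illustrative choices, not part of the lecture.

```python
import math

def sd_of_sample_mean(sigma, n):
    """Standard deviation of the sample mean when sigma is known."""
    return sigma / math.sqrt(n)

# Population values from Case 1: mean 180 lbs, standard deviation 30.
# The sample sizes below are hypothetical, chosen only for illustration.
for n in (4, 25, 100):
    print(n, sd_of_sample_mean(30, n))
```

The center of the distribution stays at 180 while its spread shrinks as n grows.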
Example
Case 2:
I know that the weight of all male college students has mean 180 with
unknown population standard deviation. In this case we take a
sample of n people. Then we can assume that the sample mean
$\bar{x}$ follows a normal distribution with:
Mean = $\mu$ = 180
Standard error = $s/\sqrt{n}$
where s is the sample standard deviation
CONDITIONS:
Population is normally distributed.
Sample size n>30
Example
Case 3:
The weight of all male college students
has unknown mean, and I know that the population standard
deviation is 30. In this case we take a sample of
n people. Then we can assume that the
sample mean $\bar{x}$ follows a normal distribution
with:
Mean = $\bar{x}$ (calculated from the sample and used as the estimate of the unknown $\mu$)
Standard deviation = $\sigma/\sqrt{n}$, where $\sigma = 30$
Example
Case 4:
The weight of all male college students has
unknown mean and unknown standard deviation. In
this case we take a sample of n people. Then we can
assume that the sample mean $\bar{x}$ follows a normal
distribution with:
Mean = $\bar{x}$ (found in the sample and used as the estimate of the unknown $\mu$)
Standard error = $s/\sqrt{n}$, where s is the sample standard
deviation
CONDITIONS:
Population is normally distributed.
Sample size n>30
What happens in cases 2 and 4 if
n<30?
Then the sampling distribution is the same
as when n>30, but with the extra condition
that the population has a bell-shaped curve.
To find the confidence interval, though,
we use the t-distribution with n-1
degrees of freedom, which we will discuss
later
Confidence interval for the mean of
one population
The confidence interval always keeps the same format. That
is, it will be:
Sample estimate $\pm$ Multiplier $\times$ Standard error
For the mean of one population this translates to the
following:
$\bar{x} \pm \text{Multiplier} \times \sigma/\sqrt{n}$
$\bar{x} \pm \text{Multiplier} \times s/\sqrt{n}$
depending on whether you know the population standard deviation or not
How to find multiplier
In cases 1 and 3 we know that we have a normal
distribution, so the multiplier will be found using
the tables of the standard normal.
Example: I know that the population of women's
heights has standard deviation 10. I ask 20
women what their height is. My sample has
average 63. Find a 95% confidence interval for
the mean height of a woman. (Which case is
this?)
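One way to work this example is sketched below. It reads the problem as Case 3 (unknown mean, known population standard deviation), so the multiplier comes from the standard normal. The variable names are illustrative, and `statistics.NormalDist` stands in for the printed normal table.

```python
import math
from statistics import NormalDist

sigma = 10          # known population standard deviation
n = 20              # sample size
x_bar = 63          # sample mean
confidence = 0.95

# Multiplier z*: the (1 + confidence) / 2 quantile of the standard normal.
z_star = NormalDist().inv_cdf((1 + confidence) / 2)
standard_error = sigma / math.sqrt(n)
margin = z_star * standard_error
low, high = x_bar - margin, x_bar + margin
print(round(z_star, 3), round(low, 2), round(high, 2))
```

Under this reading the interval runs from roughly 58.6 to 67.4.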
How to find multiplier
In cases 2 and 4 we know that we have a normal
distribution only if n>30, so the multiplier will be
found using the tables of the standard normal.
Example: I ask 50 women what their height is.
My sample has average 63 and standard
deviation of 10. Find a 90% confidence interval
for the mean height of a woman. (Which case is
this?)
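A possible worked version of this example follows, reading it as Case 4 with n>30, so the sample standard deviation is used in the standard error and the multiplier still comes from the standard normal. The helper `z_interval` is a hypothetical name introduced only for this sketch.

```python
import math
from statistics import NormalDist

def z_interval(estimate, standard_error, confidence):
    """Estimate +/- z* x standard error, with z* from the standard normal."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_star * standard_error
    return estimate - margin, estimate + margin

n, x_bar, s = 50, 63, 10
low, high = z_interval(x_bar, s / math.sqrt(n), 0.90)
print(round(low, 2), round(high, 2))
```

With these numbers the interval is approximately 60.7 to 65.3.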
How to find multiplier
What happens though in the following
example:
Example: I ask 20 women what their
height is. My sample has average 63 and
standard deviation of 10. Find a 90%
confidence interval for the mean height of
a woman. (Which case is this?)
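Here the sample is small and the population standard deviation is unknown, so a sketch of the calculation uses the t-distribution with n-1 degrees of freedom. The Python standard library has no t quantile function, so the multiplier below is an assumption: a value read from a printed t table for 19 degrees of freedom at 90% confidence.

```python
import math

n, x_bar, s = 20, 63, 10
df = n - 1  # degrees of freedom for one sample

# Assumption: multiplier taken from a t table (19 degrees of freedom,
# 90% confidence, i.e. 5% in each tail).
t_star = 1.729

standard_error = s / math.sqrt(n)
margin = t_star * standard_error
low, high = x_bar - margin, x_bar + margin
print(df, round(low, 2), round(high, 2))
```

The interval, roughly 59.1 to 66.9, is wider than the one a normal multiplier of about 1.645 would give for the same data.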
t-distribution
In the previous example we have seen the
t-distribution. The t-distribution is
characterized by the degrees of freedom,
which for one sample are equal to n-1, where n is
the sample size.
Difference of the means
of two samples
Examples
I am teaching a Stat 200 class and I am
giving them two midterms during the
semester. I want to see how they improve
from one midterm to another.
I am teaching two Stat 200 classes and I
am giving them the same midterm and I
am trying to see if one class is different
from the other
Examples
First case: Dependent samples
Second case: Independent samples
Dependent samples
I am teaching a Stat 200 class and I am
giving them two midterms during the
semester. I want to see how they improve
from one midterm to another.
So how do I find a sampling distribution for
the mean of the differences?
Steps
Put the grades of each student together.
Calculate the differences of the two grades,
which are denoted by $d_i$
Find the average of those differences,
denoted by $\mu_d$
Find the standard deviation of the
differences, denoted by $\sigma_d$
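The steps above can be sketched in a few lines. The grades are hypothetical numbers invented for illustration, and the class is treated as the whole population, so the population standard deviation (`statistics.pstdev`) is used.

```python
from statistics import mean, pstdev

# Hypothetical grades for five students, paired by position.
midterm_1 = [70, 65, 80, 90, 75]
midterm_2 = [78, 70, 82, 95, 85]

# Step 2: difference for each student (second midterm minus first).
differences = [b - a for a, b in zip(midterm_1, midterm_2)]

# Steps 3 and 4: mean and standard deviation of the differences.
mu_d = mean(differences)
sigma_d = pstdev(differences)
print(differences, mu_d, round(sigma_d, 3))
```

Pairing by student is what makes the samples dependent: each difference comes from one person measured twice.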
Problems
The previous steps for finding the mean and
standard deviation are easy to do
when we have a small
population like a Stat 200 class.
But what happens if we want to compare
the weight gain in a single day of the whole
population of the USA? That means we have
to measure everyone in the morning and
at night. How easy is that?
Solution
We take a sample!!
Now if we have a sample, that means that
we do not know the population mean and
standard deviation. So we have that the
sample mean, denoted by $\bar{d}$, follows a
normal distribution with
Mean = $\bar{d}$
Standard error = $s_d/\sqrt{n}$
Condition
The normal distribution in the previous
slide is true only if
the distribution of the differences is bell-shaped
the sample size is large.
By large we mean n>30
What if n<30?
Then the sampling distribution is the same
as when n>30, but with the extra condition
that we have a bell-shaped curve.
To find the confidence interval, though,
we use the t-distribution with n-1
degrees of freedom.
Confidence Interval for dependent
samples
General form of this interval is:
Sample estimate $\pm$ Multiplier $\times$ Standard error
In this case this translates to
$\bar{d} \pm \text{Multiplier} \times s_d/\sqrt{n}$
How to find multiplier
If n>30 refer to the normal distribution
If n<30 refer to the t-distribution with n-1
degrees of freedom
Example 1
I am giving two tests to 40 students in a
Stat 200 class. The average of the
differences between the scores of the two
tests is 12 and the standard deviation of the
differences is 4.
What is the sampling distribution in this case?
What is the 95% Confidence Interval for the
mean of the differences?
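A sketch of how Example 1 might be worked: with n = 40 > 30 the multiplier comes from the standard normal, and the standard error is $s_d/\sqrt{n}$. Variable names are illustrative only.

```python
import math
from statistics import NormalDist

n, d_bar, s_d = 40, 12, 4

# Sampling distribution: approximately normal, centered at d_bar,
# with this standard error.
standard_error = s_d / math.sqrt(n)

z_star = NormalDist().inv_cdf(0.975)  # 95% confidence, two-sided
low = d_bar - z_star * standard_error
high = d_bar + z_star * standard_error
print(round(standard_error, 4), round(low, 2), round(high, 2))
```

The standard error is about 0.63 and the interval is roughly 10.8 to 13.2.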
Example 2
I am giving two tests to 20 students in a
Stat 200 class. The average of the
differences between the scores of the two
tests is 12 and the standard deviation of the
differences is 4.
What is the sampling distribution in this case?
What is the 95% Confidence Interval for the
mean of the differences?
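The same calculation with n = 20 < 30 switches to the t-distribution with n-1 = 19 degrees of freedom. As a labeled assumption, the multiplier below is a value read from a printed t table rather than computed.

```python
import math

n, d_bar, s_d = 20, 12, 4
df = n - 1

# Assumption: t table value for 19 degrees of freedom at 95% confidence
# (2.5% in each tail).
t_star = 2.093

standard_error = s_d / math.sqrt(n)
low = d_bar - t_star * standard_error
high = d_bar + t_star * standard_error
print(df, round(standard_error, 4), round(low, 2), round(high, 2))
```

The interval, roughly 10.1 to 13.9, is wider than in Example 1 because the sample is smaller and the t multiplier is larger.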
Independent Samples
I am teaching two Stat 200 classes and I
am giving them the same midterm and I
am trying to see if one class is different
from the other.
So how do I find a sampling distribution for
the difference of the means?
Example
We denote one of the two classes as
Class 1 and the other as Class 2.
We find the means of the two classes,
denoted by: $\mu_1$, $\mu_2$
Then we find the standard deviations of the
scores of the two classes, denoted by
$\sigma_1$, $\sigma_2$
Problems
Now we have the same problems as in the
dependent case. What happens if the two
populations are large and we cannot
measure everybody?
Then we have to find sample means, which
we denote by $\bar{x}_1$, $\bar{x}_2$, and the sample
standard deviations, denoted by $s_1$, $s_2$
Comparison
Now, the objective is to find the sampling
distribution of the differences.
When the samples are independent, the
sampling distribution of the
difference between the two means follows
a normal distribution with:
Mean = $\bar{x}_1 - \bar{x}_2$
Standard error = $\sqrt{s_1^2/n_1 + s_2^2/n_2}$
Conditions
Both populations of measurements are
bell-shaped
Both sample sizes $n_1$, $n_2$ should be large;
that means both greater than 30.
What if n1 or n2 < 30?
Then the sampling distribution is the same
as when n1, n2 > 30,
but with the extra condition that we have a
bell-shaped curve.
To find the confidence interval, though,
we use the t-distribution with
different degrees of freedom, which we will
see later
n1, n2 < 30
We have two cases:
Unpooled case (General Case)
Pooled case (Assuming equal variances)
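Unpooled case
The degrees of freedom are equal to the
following approximation, which is called
Welch's approximation:
$$ df \approx \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2} $$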
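Because the formula is awkward to evaluate by hand, a small function may help. This is a sketch; the name `welch_df` is hypothetical, and the input values are the ones from the Stat 200 example with 20 male and 23 female students that appears later in these slides.

```python
def welch_df(s1, n1, s2, n2):
    """Welch's approximation to the degrees of freedom (unpooled case)."""
    a = s1 ** 2 / n1   # s1^2 / n1
    b = s2 ** 2 / n2   # s2^2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

df = welch_df(2.8, 20, 3.0, 23)
print(round(df, 2))
```

The result is usually not a whole number; when reading a printed t table it is common to round it down to the nearest tabulated value.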
A conservative approach
Now, because that formula is a total mess,
we can use a conservative approximation
that says that the degrees of freedom are
equal to the minimum of
$n_1 - 1$ and $n_2 - 1$
Pooled case
In this case we assume that the two independent
variables have equal variances, and so equal
standard deviations.
Then the common standard deviation is called the
pooled standard deviation and is calculated
by:
$s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}$
The degrees of freedom in this case are
calculated by: $n_1 + n_2 - 2$
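The pooled standard deviation can be computed as in the sketch below. The function name `pooled_sd` is hypothetical, and the inputs are again taken from the Stat 200 example with 20 and 23 students that appears later in these slides.

```python
import math

def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation under the equal-variance assumption."""
    numerator = (n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2
    return math.sqrt(numerator / (n1 + n2 - 2))

s_p = pooled_sd(2.8, 20, 3.0, 23)
df = 20 + 23 - 2
print(round(s_p, 3), df)
```

The pooled value always falls between the two sample standard deviations, closer to the one from the larger sample.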
Confidence intervals for the
difference of means of independent
samples
First of all, the general format of
confidence intervals is:
Sample estimate $\pm$ Multiplier $\times$ Standard error
which in this case translates to:
$(\bar{x}_1 - \bar{x}_2) \pm \text{Multiplier} \times \text{Standard error}$
To find the standard error and the multiplier you
have to check which case the problem falls in.
If n1, n2 both > 30
Then the difference of the
means of independent samples follows a
normal distribution and the multiplier will
be found in the table of the standard normal.
The standard error in this case is equal to
$\sqrt{s_1^2/n_1 + s_2^2/n_2}$
If one of n1, n2 < 30
Then we have to decide if we have the unpooled
or pooled case:
If we have the unpooled case then the standard
error is equal to the one in the previous slide:
$\sqrt{s_1^2/n_1 + s_2^2/n_2}$
For the multiplier we have to look at the table of the
t-distribution with degrees of freedom equal to
Welch's approximation or, if clearly stated, we
can use the conservative approach
If one of n1, n2 < 30
If we have the pooled case then:
The standard error is equal to
$s_p \sqrt{1/n_1 + 1/n_2}$
The multiplier can be found in the table of the
t-distribution with $n_1 + n_2 - 2$ degrees of
freedom
Example 1
I ask the height of 100 PSU male students
and 150 PSU female students. The sample
average for males is 72 inches and for females 66
inches. The sample standard deviation for
males is 9 and for females is 7.
Find the sampling distribution of the difference of
the two means
Find the 90% confidence interval for the
difference of the two means
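One way to work this example is sketched below. Both samples are larger than 30, so the multiplier comes from the standard normal. The difference is taken as males minus females, which is a choice made here rather than something the slide fixes.

```python
import math
from statistics import NormalDist

n1, x1, s1 = 100, 72, 9   # males
n2, x2, s2 = 150, 66, 7   # females

diff = x1 - x2
standard_error = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

z_star = NormalDist().inv_cdf(0.95)  # 90% confidence, two-sided
low = diff - z_star * standard_error
high = diff + z_star * standard_error
print(diff, round(standard_error, 4), round(low, 2), round(high, 2))
```

The difference is 6 inches with a standard error of about 1.07, giving an interval of roughly 4.2 to 7.8.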
Example 2
I have a Stat 200 class with 20 male students
and 23 female students and I give them a test.
The sample average for males is 82 and for
females 86. The sample standard deviation for
males is 2.8 and for females is 3.
Find the sampling distribution of the difference of
the two means using the unpooled case with Welch's
approximation
Find the 90% confidence interval for the
difference of the two means.
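A sketch of the unpooled calculation with Welch's approximation follows. Two assumptions are made here: the degrees of freedom are rounded down to 40 to match a row of a printed t table, and the multiplier 1.684 is the table value for 40 degrees of freedom at 90% confidence. The difference is taken as males minus females.

```python
import math

n1, x1, s1 = 20, 82, 2.8   # males
n2, x2, s2 = 23, 86, 3.0   # females

a, b = s1 ** 2 / n1, s2 ** 2 / n2
standard_error = math.sqrt(a + b)
welch = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
df = math.floor(welch)     # rounded down for table lookup

# Assumption: t table value for 40 degrees of freedom, 5% in each tail.
t_star = 1.684

diff = x1 - x2
low = diff - t_star * standard_error
high = diff + t_star * standard_error
print(df, round(standard_error, 4), round(low, 2), round(high, 2))
```

The interval is roughly -5.5 to -2.5, entirely below zero.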
Example 3
I have a Stat 200 class with 20 male students
and 23 female students and I give them a test.
The sample average for males is 82 and for
females 86. The sample standard deviation for
males is 2.8 and for females is 3.
Find the sampling distribution of the difference of
the two means using the unpooled case with the
conservative approach
Find the 90% confidence interval for the
difference of the two means.
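With the conservative approach only the degrees of freedom change: they become the smaller of $n_1 - 1$ and $n_2 - 1$. The sketch below assumes a t table value of 1.729 for 19 degrees of freedom at 90% confidence, and again takes the difference as males minus females.

```python
import math

n1, x1, s1 = 20, 82, 2.8   # males
n2, x2, s2 = 23, 86, 3.0   # females

df = min(n1 - 1, n2 - 1)   # conservative degrees of freedom
standard_error = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

# Assumption: t table value for 19 degrees of freedom, 5% in each tail.
t_star = 1.729

diff = x1 - x2
low = diff - t_star * standard_error
high = diff + t_star * standard_error
print(df, round(low, 2), round(high, 2))
```

Fewer degrees of freedom mean a larger multiplier, so this interval is slightly wider than the one based on Welch's approximation.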
Example 4
I have a Stat 200 class with 20 male students
and 23 female students and I give them a test.
The sample average for males is 82 and for
females 86. The sample standard deviation for
males is 2.8 and for females is 3.
Find the sampling distribution of the difference of
the two means using the pooled case
Find the 90% confidence interval for the
difference of the two means.
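For the pooled case, a sketch of the calculation uses the pooled standard deviation and $n_1 + n_2 - 2 = 41$ degrees of freedom. The multiplier 1.683 is an assumed t table value for 41 degrees of freedom at 90% confidence; many printed tables only list 40, where the value is 1.684.

```python
import math

n1, x1, s1 = 20, 82, 2.8   # males
n2, x2, s2 = 23, 86, 3.0   # females

df = n1 + n2 - 2
s_p = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
standard_error = s_p * math.sqrt(1 / n1 + 1 / n2)

# Assumption: t value for 41 degrees of freedom, 5% in each tail.
t_star = 1.683

diff = x1 - x2
low = diff - t_star * standard_error
high = diff + t_star * standard_error
print(df, round(s_p, 3), round(low, 2), round(high, 2))
```

Because the two sample standard deviations are close, the pooled interval is nearly identical to the unpooled Welch interval.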
