
Confidence Intervals

(Figure: an interval of 4 plus or minus 2)

A Confidence Interval is a range of values we are fairly sure our true value lies in.

Example: Average Height

We measure the heights of 40 randomly chosen men, and get a mean height of 175cm.

We also know the standard deviation of men's heights is 20cm.

The 95% Confidence Interval (we show how to calculate it later) is:

175cm ± 6.2cm

This says the true mean of ALL men (if we could measure all their heights) is likely to be between 168.8cm and 181.2cm.

But it might not be!

The "95%" says that 95% of experiments like we just did will include the true mean, but 5% won't.

So there is a 1-in-20 chance (5%) that our Confidence Interval does NOT include the true mean.

Calculating the Confidence Interval

Step 1: find the number of observations n, calculate their mean X, and standard deviation s

Using our example:
• Number of observations: n = 40
• Mean: X = 175
• Standard Deviation: s = 20

Note: we should use the standard deviation of the entire population, but in many cases we won't know it.
We can use the standard deviation for the sample if we have enough observations (at least n=30, hopefully more).

Step 2: decide what Confidence Interval we want: 95% or 99% are common choices. Then find the "Z" value for that Confidence Interval here:

Confidence Interval    Z
80%                    1.282
85%                    1.440
90%                    1.645
95%                    1.960
99%                    2.576
99.5%                  2.807
99.9%                  3.291

For 95% the Z value is 1.960
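
The Z values in the table can be reproduced from the standard normal distribution. The short Python sketch below is an illustration only and is not part of the original page: the helper name `z_for_confidence` is made up here, and it relies on `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist


def z_for_confidence(level: float) -> float:
    """Two-sided Z value for a confidence level given as a proportion (e.g. 0.95).

    Hypothetical helper, named here for illustration only.
    """
    if not 0.0 < level < 1.0:
        raise ValueError("level must be strictly between 0 and 1")
    # A two-sided interval leaves (1 - level) / 2 in each tail,
    # so the Z value is the standard normal quantile at 0.5 + level / 2.
    return NormalDist().inv_cdf(0.5 + level / 2.0)


# Print a table in the same layout as the one above.
for level in (0.80, 0.85, 0.90, 0.95, 0.99, 0.995, 0.999):
    print(f"{level:.1%}  {z_for_confidence(level):.3f}")
```

Rounded to three decimals, the printed values should agree with the table, which is a quick way to check a Z value for a confidence level the table does not list.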

Step 3: use that Z in this formula for the Confidence Interval

X ± Z × s/√n

Where:
• X is the mean
• Z is the chosen Z-value from the table above
• s is the standard deviation
• n is the number of observations

And we have:

175 ± 1.960 × 20/√40

Which is:

175cm ± 6.20cm

In other words: from 168.8cm to 181.2cm

The value after the ± is called the margin of error

The margin of error in our example is 6.20cm

Calculator

We have a Confidence Interval Calculator to make life easier for you.

Another Example

Example: Apple Orchard

Are the apples big enough?

There are hundreds of apples on the trees, so you randomly choose just 46 apples and get:
• a Mean of 86
• a Standard Deviation of 6.2

So let's calculate:

X ± Z × s/√n

We know:
• X is the mean = 86
• Z is the Z-value = 1.960 (from the table above for 95%)
• s is the standard deviation = 6.2
• n is the number of observations = 46
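
With the values listed above, the calculation can be sketched in a few lines of Python. This is an illustration under the page's own assumption of a Z-based interval (mean ± Z × s/√n); the function name `confidence_interval` is hypothetical and does not come from the original.

```python
import math


def confidence_interval(mean, s, n, z=1.960):
    """Return (low, high, margin) for a Z-based confidence interval.

    Hypothetical helper: margin of error = z * s / sqrt(n).
    """
    if n <= 0:
        raise ValueError("n must be a positive number of observations")
    margin = z * s / math.sqrt(n)
    return mean - margin, mean + margin, margin


# Apple orchard example: mean 86, standard deviation 6.2, 46 apples, 95% (Z = 1.960).
low, high, margin = confidence_interval(mean=86, s=6.2, n=46)
print(f"86 ± {margin:.2f}  ({low:.2f} to {high:.2f})")
```

Calling the same function with the height example (mean 175, s 20, n 40) gives a margin of about 6.20, matching the earlier result.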
86 ± 1.960 × 6.2/√46 = 86 ± 1.79

So the true mean (of all the hundreds of apples) is likely to be between 84.21 and 87.79

True Mean

Now imagine we get to pick ALL the apples straight away, and get them ALL measured by the packing machine (this is a luxury not normally found in statistics!)

And the true mean turns out to be 84.9

Let's lay all the apples on the ground from smallest to largest:

(Figure: each apple is a green dot, except our observations which are blue)

Our result was not exact ... it is random after all ... but the true mean is inside our confidence interval of 86 ± 1.79 (in other words 84.21 to 87.79)

But the true mean might not be inside the confidence interval but 95% of the time it will!

95% of all "95% Confidence Intervals" will include the true mean.

Maybe we had this sample, with a mean of 83.5:

(Figure: each apple is a green dot, our observations are marked purple)

That does not include the true mean. Expect that to happen 5% of the time for a 95% confidence interval.

So how do we know if the sample we took is one of the "lucky" 95% or the unlucky 5%? Unless we get to measure the whole population like above we simply don't know.

This is the risk in sampling, we might have a bad sample.

Example in Research

Here is a Confidence Interval used in actual research on extra exercise for older people:

What is it saying? Looking at the "Male" line we see:
• 1,226 Men (47.6% of all people)
• had a "HR" (see below) with a mean of 0.92,
• and a 95% Confidence Interval (95% CI) of 0.88 to 0.97 (which is also 0.92±0.05)

"HR" is a measure of health benefit (lower is better), so that line says that the true benefit of exercise (for the wider population of men) has a 95% chance of being between 0.88 and 0.97

* Note for the curious: "HR" is used a lot in health research and means "Hazard Ratio" where lower is better, so an HR of 0.92 means the subjects were better off, and 1.03 means slightly worse off.

Standard Normal Distribution

It is all based on the idea of the Standard Normal Distribution, where the Z value is the "Z-score"

For example the Z for 95% is 1.960, and here we see the range from -1.96 to +1.96 includes 95% of all values:

(Figure: from -1.96 to +1.96 standard deviations is 95%)

Applying that to our sample looks like this:

(Figure: also from -1.96 to +1.96 standard deviations, so includes 95%)

Conclusion

The Confidence Interval is based on Mean and Standard Deviation. Its formula is:

X ± Z × s/√n

Where:
• X is the mean
• Z is the Z-value from the table below
• s is the standard deviation
• n is the number of observations

Confidence Interval    Z
80%                    1.282
85%                    1.440
90%                    1.645
95%                    1.960
99%                    2.576
99.5%                  2.807
99.9%                  3.291

Confidence Level

In survey sampling, different samples can be randomly selected from the same population; and each sample can often produce a different confidence interval. Some confidence intervals include the true population parameter; others do not.
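
The idea that repeated samples give different intervals, only some of which contain the true parameter, can be illustrated with a small simulation. The sketch below is not from the original text and rests on stated assumptions: the population is taken to be normal with mean 84.9 (the true mean from the apple example) and a hypothetical population standard deviation of 6.2, each sample has 46 observations, and the function name `coverage` is made up for illustration.

```python
import math
import random
import statistics


def coverage(true_mean=84.9, true_sd=6.2, n=46, z=1.960, trials=2000, seed=1):
    """Fraction of simulated Z-based intervals that contain the true mean.

    Assumptions (not from the source): normal population, hypothetical
    population standard deviation, fixed seed for repeatability.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one random sample and build its interval: mean ± z * s / sqrt(n).
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean = statistics.fmean(sample)
        s = statistics.stdev(sample)
        margin = z * s / math.sqrt(n)
        if mean - margin <= true_mean <= mean + margin:
            hits += 1
    return hits / trials


print(f"fraction of intervals containing the true mean: {coverage():.3f}")
```

Under these assumptions the printed fraction should land near 0.95, typically a little below it because each sample's own standard deviation is used in place of the population value.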
A confidence level refers to the percentage of all possible samples that can be expected to include the true population parameter. For example, suppose all possible samples were selected from the same population, and a confidence interval were computed for each sample. A 95% confidence level implies that 95% of the confidence intervals would include the true population parameter.

Confidence Level: What is it?

When a poll is reported in the media, a confidence level is often included in the results. For example, a survey might report a 95 percent confidence level. But what exactly does this mean? At first glance you might think that it means it’s 95 percent accurate. That’s close to the truth, but like many things in statistics, it’s actually a little more defined.

Real Life Example

Example: A recent article on Rasmussen Reports states that “38% of Likely U.S. Voters now say their health insurance coverage has changed because of Obamacare”. If you scroll down to the bottom of the article, you’ll see this line: “The margin of sampling error is +/- 3 percentage points with a 95% level of confidence.”

It’s impractical to survey all 300 million+ U.S. residents, so it’s impossible to know exactly how many people would actually respond “yes my health insurance has changed.” We take a sample (say, 2,000 people) and, using good statistical techniques like simple random sampling, take our “best guess” at what that actual figure is (we call that unknown figure a population parameter).

What a 95 percent confidence level is saying is that if the poll or survey were repeated over and over again, the intervals it produced would contain the actual population figure about 95 percent of the time.

What about “+/- 3 percentage points”?

The width of the confidence interval tells us more about how certain (or uncertain) we are about the true figure in the population. This width is stated as a plus or minus (in this case, +/- 3) and is called the margin of error. When the margin of error and confidence level are put together, you get a spread of percentages, which is the confidence interval. In this case, you would expect the results to be 35 (38-3) to 41 (38+3) percent, 95% of the time.

Factors that Affect Confidence Intervals (CI)
• Population size: this does not usually affect the CI but can be a factor if you are working with small and known groups of people.
• Sample Size: the smaller your sample, the less likely it is you can be confident the results reflect the true population parameter.
• Percentage: Extreme answers come with better accuracy. For example, if 99 percent of voters are for gay marriage, the chances of error are small. However, if 49.9 percent of voters are “for” and 50.1 percent are “against” then the chances of error are bigger.

0% and 100% Confidence Level

A 0% confidence level means you have no faith at all that if you repeated the survey you would get the same results. A 100% confidence level means there is no doubt at all that if you repeated the survey you would get the same results. In reality, you would never publish the results from a survey where you had no confidence at all that your statistics were accurate (you would probably repeat the survey with better techniques). A 100% confidence level doesn’t exist in statistics, unless you surveyed an entire population, and even then you probably couldn’t be 100 percent sure that your survey wasn’t open to some kind of error or bias.

Confidence Coefficient

The confidence coefficient is the confidence level stated as a proportion, rather than as a percentage. For example, if you had a confidence level of 99%, the confidence coefficient would be .99.

In general, the higher the coefficient, the more certain you are that your results are accurate. For example, a .99 coefficient indicates more confidence than a coefficient of .89. It’s extremely rare to see a coefficient of 1 (meaning that you are positive without a doubt that your results are completely, 100% accurate). A coefficient of zero means that you have no faith that your results are accurate at all.

The following table lists confidence coefficients and the equivalent confidence levels.

Confidence coefficient (1 – α)    Confidence level ((1 – α) × 100%)
0.90                              90%
0.95                              95%
0.99                              99%
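
The sample-size factor mentioned above can be made concrete with the margin of error Z × s/√n used for the Z-based interval earlier in this document. The sketch below is illustrative only: the function name `margin_of_error` is hypothetical, and the standard deviation of 6.2 is borrowed from the apple example purely as a sample value.

```python
import math


def margin_of_error(s, n, z=1.960):
    """Margin of error z * s / sqrt(n) for a Z-based interval (hypothetical helper)."""
    if n <= 0:
        raise ValueError("n must be a positive number of observations")
    return z * s / math.sqrt(n)


# Same standard deviation and confidence level, increasing sample sizes.
for n in (30, 46, 100, 400, 1600):
    print(f"n = {n:5d}  margin = ±{margin_of_error(6.2, n):.2f}")
```

Because n sits under a square root, quadrupling the sample size only halves the margin of error, so small samples give noticeably wider intervals at the same confidence level.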
