TWO MAIN BRANCHES OF STATISTICS:

DESCRIPTIVE STATISTICS
INFERENTIAL STATISTICS
Definition
Descriptive Statistics consists of
methods for organizing, displaying, and
describing data by using tables, graphs,
and summary measures.
Case Study: How Much Did Companies Spend
on Ads in 2011?
Case Study: How Women Rate Their Lives
Definition
Inferential Statistics consists of
methods that use sample results to
help make decisions or predictions
about a population.
Data can be used in different ways, and the two main areas
are descriptive statistics and inferential statistics.

Descriptive statistics consists of the collection,
organization, summarization, and presentation of data.
EX: "The average age of the students is 14 years."

Inferential statistics consists of generalizing from
samples to populations, performing estimations and
hypothesis testing, determining relationships among
variables, and making predictions.
EX: "The relationship between smoking and lung cancer."
Inferential Statistics
• Estimation
• e.g., Estimate the population
mean weight using the sample
mean weight
• Hypothesis testing
• e.g., Test the claim that the
population mean weight is 70 kg

Inference is the process of drawing conclusions or making decisions
about a population based on sample results.
Estimation
• Estimation Defined
• Confidence Levels
• Confidence Intervals
• Confidence Interval Precision
• Standard Error of the Mean
• Sample Size
• Standard Deviation
• Confidence Intervals for Proportions
Estimation Defined:

• Estimation – A process whereby we select a
random sample from a population and use
a sample statistic to estimate a population
parameter.
Point and Interval Estimation
• Point Estimate – A sample statistic used to
estimate the exact value of a population
parameter
• Confidence interval (interval estimate) – A
range of values defined by the confidence
level within which the population parameter is
estimated to fall.
The Process of Estimation
Margin of Error and the Interval Estimate

A point estimator cannot be expected to provide the
exact value of the population parameter.

An interval estimate can be computed by adding and
subtracting a margin of error to the point estimate.

Point Estimate ± Margin of Error

The purpose of an interval estimate is to provide
information about how close the point estimate is to
the value of the parameter.
Margin of Error and the Interval Estimate

The general form of an interval estimate of a
population mean is

$\bar{x} \pm \text{Margin of Error}$
Interval Estimate of a Population Mean:
σ Known
• In order to develop an interval estimate of a
population mean, the margin of error must be
computed using either:
  • the population standard deviation σ, or
  • the sample standard deviation s
• σ is rarely known exactly, but often a good estimate
can be obtained based on historical data or other
information.
• We refer to such cases as the σ known case.
Confidence Levels:
• Confidence Level – The likelihood, expressed as a
percentage or a probability, that a specified interval
will contain the population parameter.
• 95% confidence level – there is a .95 probability that a
specified interval DOES contain the population mean.
In other words, there are 5 chances out of 100 (or 1
chance out of 20) that the interval DOES NOT contain
the population mean.
• 99% confidence level – there is 1 chance out of 100
that the interval DOES NOT contain the population
mean.
The confidence level = (1 − α) × 100%
where α = probability that the confidence
interval does not contain the true
population parameter.

α corresponds to the level of significance

The critical value is the value that indicates the
point beyond which lies the rejection region.

In hypothesis testing, if the absolute value
of your test statistic is greater than the
critical value, reject the null hypothesis.
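Interval Estimate of a Population Mean:
σ Known
There is a 1 − α probability that the value of a
sample mean will provide a margin of error of $z_{\alpha/2}\,\sigma_{\bar{x}}$
or less.

[Figure: sampling distribution of $\bar{x}$. The middle 1 − α of all $\bar{x}$ values lies within $z_{\alpha/2}\,\sigma_{\bar{x}}$ on either side of the center, with an area of α/2 in each tail.]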
Interval Estimate of a Population Mean:
σ Known
• Interval estimate of the population mean μ:

$\bar{x} \pm z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$

where: $\bar{x}$ is the sample mean
1 − α is the confidence coefficient
$z_{\alpha/2}$ is the z value providing an area of
α/2 in the upper tail of the standard
normal probability distribution
σ is the population standard deviation
n is the sample size
Interval Estimate of a Population Mean:
σ Known
• Values of $z_{\alpha/2}$ for the Most Commonly Used
Confidence Levels

Confidence                   Table
Level        α      α/2      Look-up Area   $z_{\alpha/2}$
90%          .10    .05      .9500          1.645
95%          .05    .025     .9750          1.960
99%          .01    .005     .9950          2.576
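
The table values can be reproduced numerically. The sketch below is only an illustration, not part of the original slides: the helper name `z_critical` is hypothetical, and it relies on `statistics.NormalDist` from the Python standard library to invert the standard normal cumulative area.

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Return z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1.0 - confidence
    # The "table look-up area" is the cumulative area to the left of z_{alpha/2}.
    lookup_area = 1.0 - alpha / 2.0
    return NormalDist().inv_cdf(lookup_area)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```

Rounded to three decimals, the printed values agree with the 1.645, 1.960, and 2.576 listed in the table.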
Meaning of Confidence

Because 90% of all the intervals constructed using
$\bar{x} \pm 1.645\,\sigma_{\bar{x}}$ will contain the population mean,
we say we are 90% confident that the interval
$\bar{x} \pm 1.645\,\sigma_{\bar{x}}$ includes the population mean μ.

We say that this interval has been established at the
90% confidence level.

The value .90 is referred to as the confidence
coefficient.
Example Unit 4, page 8

The mean score of a random sample of 49
Grade 11 students who took the first
periodic test is calculated to be 78. The
population variance is known to be 0.16.
a. Find the 95% confidence interval for the
mean score of all Grade 11 students.
b. Find the lower and upper confidence
limits.
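
A possible way to work this example step by step is sketched below. It assumes the σ known formula applies, and it first converts the given variance to a standard deviation (σ = √0.16 = 0.4), a step that is easy to miss. Variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

n = 49
x_bar = 78.0
variance = 0.16
sigma = sqrt(variance)  # the problem gives the variance, so sigma = 0.4

# z_{alpha/2} for 95% confidence (area of 0.975 to the left), about 1.960
z = NormalDist().inv_cdf(0.975)

margin = z * sigma / sqrt(n)  # margin of error E
lower, upper = x_bar - margin, x_bar + margin

print(f"E = {margin:.3f}; {lower:.3f} < mu < {upper:.3f}")
```

Under these assumptions the margin of error is about 0.112, so the lower and upper confidence limits are roughly 77.888 and 78.112.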
Exercises 4.1, page 21

Find α in the following confidence levels:
3. 92% 5. 96%

Find the critical value $z_{\alpha/2}$ that corresponds to
the given confidence level:
9. 92% 11. 96%

Find the margin of error:
25. 9,849.30 < µ < 10,150.70
27. 12,328.96 < µ < 12,671.04
Exercises 4.1, page 21

Find the margin of error E:
31. Confidence Level: 90%
σ = 12
n = 40

33. Confidence Level: 99%
σ = 18 kg
n = 60
Exercises 4.1, page 21

Find the confidence interval:
35. Sample mean $\bar{x}$ = 65.50
margin of error = 3.50
Exercises 4.1, page 21

Find the minimum sample size required to
estimate an unknown population mean µ
using the given data.
37. Confidence level: 95%
margin of error = 130
σ = 400

39. Confidence level: 90%
margin of error = 12
σ = 50
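
The slides do not state a sample size formula for a mean. The sketch below assumes the usual one, obtained by solving E = z_{α/2}·σ/√n for n and rounding up: n = (z_{α/2}·σ/E)². The function name `min_sample_size_mean` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def min_sample_size_mean(confidence: float, sigma: float, margin: float) -> int:
    """Smallest n for which z_{alpha/2} * sigma / sqrt(n) <= margin."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    # Always round up: rounding down would give a margin larger than requested.
    return ceil((z * sigma / margin) ** 2)


n_37 = min_sample_size_mean(0.95, sigma=400, margin=130)
n_39 = min_sample_size_mean(0.90, sigma=50, margin=12)
print(n_37, n_39)
```

With these inputs the sketch gives 37 and 47 for the two exercises shown.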
Exercises 4.1, page 21

Find the lower confidence limit
43. Margin of error = 810.90
upper confidence limit = 2,310.50

Find the upper confidence limit
51. Margin of error = 301
lower confidence limit = 1,199
Exercises 4.1, page 21

Find the 90% confidence interval of a
population mean µ for the following values.
53. n = 49, $\bar{x}$ = 56, σ = 0.8

Find the 99% confidence interval of a
population mean µ for the following values.
57. n = 49, $\bar{x}$ = 72, σ = 0.78
Exercises 4.1, page 25

61. The mean and standard deviation of the
scores of 45 Grade 10 students who took the
final test are calculated to be 76 and 12.5,
respectively. Find the 95% confidence
interval for the mean score of all Grade 10
students.
Interval Estimate of a Population Mean:
σ Known

A survey of 30 emergency room patients
found that the average waiting time for
treatment was 174.3 minutes. Assuming
that the population standard deviation is
46.5 minutes, find the best point
estimate of the population mean and the
99% confidence interval of the
population mean.
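
As a rough check on this example, the sketch below wraps the σ known interval in a small reusable function. The name `z_interval` is hypothetical. The best point estimate of the population mean is the sample mean itself.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar: float, sigma: float, n: int, confidence: float):
    """Return (lower, upper) for x_bar +/- z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


point_estimate = 174.3  # the sample mean is the point estimate of mu
low, high = z_interval(point_estimate, sigma=46.5, n=30, confidence=0.99)
print(f"point estimate = {point_estimate}, {low:.1f} < mu < {high:.1f}")
```

Under these assumptions the 99% interval runs from about 152.4 to about 196.2 minutes.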
When σ Is Unknown
• Cannot use the z distribution
  • 2 uncertain values: μ and σ
  • need a wider interval to be confident
• Student’s t distribution
  • also bell-shaped, like the normal distribution
  • width depends on how well s
approximates σ
Student’s t Distribution

• The t distribution was discovered by
William S. Gosset in 1908.
• Gosset was an Oxford graduate in
mathematics and worked for the
Guinness Brewery in Dublin.
• He used the pseudonym Student to
avoid getting fired for doing statistics
on the job!
t Distribution

A t distribution with more degrees of freedom has
less dispersion.

As the degrees of freedom increase, the difference
between the t distribution and the standard
normal probability distribution becomes smaller
and smaller.
t Distribution
[Figure: the standard normal distribution compared with t distributions with 20 and 10 degrees of freedom, plotted on a common z, t axis centered at 0.]
Characteristics of the t Distribution
The t distribution is similar to the standard
normal distribution in these ways:
1. It is bell-shaped.
2. It is symmetric about the mean.
3. The mean, median, and mode are equal
to 0 and are located at the center of the
distribution.
4. The curve never touches the x axis.
Characteristics of the t Distribution

The t distribution differs from the standard
normal distribution in the following ways:
1. The variance is greater than 1.
2. The t distribution is actually a family of
curves based on the concept of degrees
of freedom, which is related to sample
size.
3. As the sample size increases, the t
distribution approaches the standard
normal distribution.
The Concept of Degrees of Freedom
Many statistical distributions use the
concept of degrees of freedom, and the
formulas for finding the degrees of freedom
vary for different statistical tests. The
degrees of freedom are the number of
values that are free to vary after a sample
statistic has been computed, and they tell
the researcher which specific curve to use
when a distribution consists of a family of
curves.
The t-distribution formula

$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$

Where:
df = n − 1
$\bar{x}$ = sample mean
µ = population mean
s = standard deviation of the sample
n = sample size
Example Unit 4.2, page 27

A student researcher wants to determine
whether the mean score in mathematics of
the 25 students in Grade 8 Section Newton is
significantly different from the school mean
of 89. The mean and the standard deviation
of the scores of the students in Section
Newton are 95 and 15, respectively. Assume
95% confidence level.
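
One way to carry out this example is sketched below, using the t formula and the critical value rule stated earlier. The two-tailed critical value for df = 24 is copied from a standard t table rather than computed, and the variable names are illustrative.

```python
from math import sqrt

n = 25
x_bar = 95.0  # mean score of Section Newton
mu = 89.0     # school mean being tested
s = 15.0      # sample standard deviation

df = n - 1
t_stat = (x_bar - mu) / (s / sqrt(n))

# Assumption: two-tailed critical value at the 95% level for df = 24,
# taken from a standard t table (not computed here).
T_CRITICAL_DF24_95 = 2.064

reject_null = abs(t_stat) > T_CRITICAL_DF24_95
print(f"df = {df}, t = {t_stat:.3f}, reject H0: {reject_null}")
```

Here t = 2.000, which does not exceed 2.064, so under these assumptions the section mean is not significantly different from the school mean at the 95% level.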
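Formula for a Specific Confidence Interval for the
Mean When σ Is Unknown

$\bar{x} - t_{\alpha/2} \dfrac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2} \dfrac{s}{\sqrt{n}}$

The degrees of freedom are n − 1.

Assumptions for Finding a Confidence Interval for
a Mean When σ Is Unknown
1. The sample is a random sample.
2. Either n ≥ 30 or the population is normally
distributed if n < 30.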
Example:
The data represent a sample of the number
of home fires started by candles for the past
several years. (Data are from the National
Fire Protection Association.) Find the 99%
confidence interval for the mean number of
home fires started by candles each year.

5460 5900 6090 6310 7160 8440 9930

$s^2 = \dfrac{n \sum X^2 - \left( \sum X \right)^2}{n(n-1)}$
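
The candle-fire example can be sketched as follows. The code applies the shortcut variance formula above to the seven data values, cross-checks it against `statistics.stdev`, and then forms the interval $\bar{x} \pm t_{\alpha/2}\, s/\sqrt{n}$. The critical value for df = 6 is copied from a standard t table, and all names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

fires = [5460, 5900, 6090, 6310, 7160, 8440, 9930]
n = len(fires)

x_bar = mean(fires)

# Shortcut formula: s^2 = (n * sum(X^2) - (sum X)^2) / (n * (n - 1))
s_squared = (n * sum(x * x for x in fires) - sum(fires) ** 2) / (n * (n - 1))
s = sqrt(s_squared)
s_check = stdev(fires)  # library value, used only as a cross-check

# Assumption: t_{alpha/2} for 99% confidence with df = 6,
# taken from a standard t table (not computed here).
T_CRITICAL_DF6_99 = 3.707

margin = T_CRITICAL_DF6_99 * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"mean = {x_bar:.1f}, s = {s:.1f}")
print(f"{lower:.1f} < mu < {upper:.1f}")
```

With these data the sample mean is about 7041.4 and s is about 1610.3, giving an interval of roughly 4785 to 9298 fires per year.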
Visits to Networking Sites A sample of 10
networking sites for a specific month has a
mean of 26.1 and a standard deviation of 4.2.
Find the 99% confidence interval of the true
mean. Assume that the variable is
approximately normally distributed.
Thunderstorm Speeds A meteorologist who
sampled 13 thunderstorms found that the
average speed at which they traveled across
a certain state was 15 miles per hour. The
standard deviation of the sample was 1.7
miles per hour. Find the 99% confidence
interval of the mean. If a meteorologist
wanted to use the highest speed to predict
the times it would take storms to travel
across the state in order to issue warnings,
what figure would she likely use?
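
Both examples above (networking sites and thunderstorm speeds) give only summary statistics, so a small helper is enough. This is a sketch: `t_interval_99` is a hypothetical name, and the 99% critical values for df = 9 and df = 12 are copied from a standard t table rather than computed.

```python
from math import sqrt

# Assumption: two-sided 99% critical values t_{alpha/2}, keyed by degrees of
# freedom, taken from a standard t table.
T_99 = {9: 3.250, 12: 3.055}


def t_interval_99(x_bar: float, s: float, n: int):
    """Return (lower, upper) for x_bar +/- t_{alpha/2} * s / sqrt(n), df = n - 1."""
    t = T_99[n - 1]
    margin = t * s / sqrt(n)
    return x_bar - margin, x_bar + margin


sites = t_interval_99(26.1, 4.2, 10)
storms = t_interval_99(15.0, 1.7, 13)
print(f"networking sites: {sites[0]:.1f} < mu < {sites[1]:.1f}")
print(f"thunderstorms:    {storms[0]:.1f} < mu < {storms[1]:.1f}")
```

Under these assumptions the intervals are about 21.8 to 30.4 for the networking sites and about 13.6 to 16.4 miles per hour for the thunderstorms. The upper limit of the second interval is a natural candidate for the "highest speed" question.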
Confidence Intervals and Sample Size
for Proportions

The Third Quarter 2018 Social Weather Survey,
conducted from September 15-23, 2018, found that
74% of adult Filipinos have much trust in
President Rodrigo Duterte.

• The value 74% is called a proportion.
• A proportion represents a part of a whole.
• Proportions can also represent probabilities.
A point estimator of the population proportion
p is given by the statistic

$\hat{p} = \dfrac{x}{n}$

Where:
$\hat{p}$ = sample proportion
x = number of elements in the
sample having the characteristic
of interest
n = sample size
Example 1
In a random sample of 120 teachers, 48 of
them have a master’s degree. Find the value
of $\hat{p}$.

$\hat{q}$ = proportion of the sample that DOES NOT
possess the characteristic of interest

$\hat{q} = 1 - \hat{p}$
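
For Example 1, the arithmetic is short enough to check directly. The variable names below are illustrative.

```python
n = 120  # teachers sampled
x = 48   # teachers with a master's degree

p_hat = x / n      # sample proportion with the characteristic
q_hat = 1 - p_hat  # sample proportion without it

print(f"p_hat = {p_hat:.2f}, q_hat = {q_hat:.2f}")
```

This gives a sample proportion of 0.40, and therefore 0.60 for its complement.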
The Central Limit Theorem for the sample proportion leads
to the following formula, which can be used to
calculate probabilities for the sample proportion $\hat{p}$.
The formula can be used only if np > 5 and nq > 5.

$z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}}$

Where:
$\hat{p}$ = sample proportion
p = population proportion
n = sample size
q = 1 − p
Example
If 15% of the batteries produced daily by a
company are defective, what is the
probability of randomly selecting 70
batteries and finding 14 or more of them
defective?
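
A minimal sketch of this example follows. It applies the z formula for a sample proportion as given, with no continuity correction (an assumption, since the slides do not mention one), and uses `statistics.NormalDist` for the upper-tail area.

```python
from math import sqrt
from statistics import NormalDist

p = 0.15  # population proportion of defective batteries
n = 70    # batteries selected
x = 14    # threshold count of defectives
q = 1 - p

# The formula is stated to apply only if np > 5 and nq > 5.
if not (n * p > 5 and n * q > 5):
    raise ValueError("normal approximation conditions are not met")

p_hat = x / n
z = (p_hat - p) / sqrt(p * q / n)

# P(p_hat >= 0.20) is the area to the right of z under the standard normal curve.
prob = 1 - NormalDist().cdf(z)

print(f"p_hat = {p_hat:.2f}, z = {z:.2f}, P = {prob:.2f}")
```

Here np = 10.5 and nq = 59.5, so the condition holds. The result is z of about 1.17 and a probability of about 0.12.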
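Formula for a Specific Confidence Interval
for a Proportion

$\hat{p} - z_{\alpha/2} \sqrt{\dfrac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2} \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$

Rounding Rule for a Confidence Interval for a
Proportion: Round off to three decimal places.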
Example
A sample of 150 students was chosen at
random from all Grade 11 students in a
private school in Pampanga. The sample
indicated that 75% of them were in favor
of having an educational trip. Find a) 95%
and b) 99% confidence intervals for the
proportion of all students who are in favor
of having an educational trip.
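
The two intervals in this example can be sketched with one helper. It assumes the usual large-sample interval $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$ and applies the three-decimal rounding rule. The function name `proportion_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(p_hat: float, n: int, confidence: float):
    """Return (lower, upper) for a proportion, rounded to three decimal places."""
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * q_hat / n)
    return round(p_hat - margin, 3), round(p_hat + margin, 3)


ci_95 = proportion_interval(0.75, 150, 0.95)
ci_99 = proportion_interval(0.75, 150, 0.99)
print("95%:", ci_95)
print("99%:", ci_99)
```

Under these assumptions the 95% interval is 0.681 < p < 0.819 and the 99% interval is 0.659 < p < 0.841. The higher confidence level produces the wider interval.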
Formula for Minimum Sample Size Needed for
Interval Estimate of a Population Proportion

$n = \hat{p}\hat{q} \left( \dfrac{z_{\alpha/2}}{E} \right)^2$

where E is the margin of error.

• If necessary, round up to obtain a whole number.
• If some approximation of $\hat{p}$ is known (e.g., from a
previous study), that value can be used in the formula.
• If no approximation of $\hat{p}$ is known, you should use $\hat{p}$ = 0.5.
This value will give a sample size sufficiently large to
guarantee an accurate prediction, given the confidence
interval and the error of estimate.
Example
A researcher wishes to estimate, with 95%
confidence, the proportion of people who
own a home computer. A previous study
shows that 40% of those interviewed had a
computer at home. The researcher wishes
to be accurate within 2% of the true
proportion. Find the minimum sample size
necessary.
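
A sketch of this calculation, assuming the sample size formula $n = \hat{p}\hat{q}\,(z_{\alpha/2}/E)^2$ with rounding up, is shown below. The function name is hypothetical. The second call shows the conservative choice $\hat{p}$ = 0.5 for comparison.

```python
from math import ceil
from statistics import NormalDist


def min_sample_size_proportion(confidence: float, margin: float,
                               p_hat: float = 0.5) -> int:
    """Smallest n giving the requested margin of error for a proportion."""
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Round up so the achieved margin is no larger than the one requested.
    return ceil(p_hat * q_hat * (z / margin) ** 2)


n_with_prior = min_sample_size_proportion(0.95, 0.02, p_hat=0.40)
n_without_prior = min_sample_size_proportion(0.95, 0.02)
print(n_with_prior, n_without_prior)
```

With the previous study's estimate of 40%, the minimum sample size comes out to 2305. Without any prior estimate it would be 2401.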
