Descriptive Statistics
Collect
Organize
Summarize
Display
Analyze
Inferential Statistics
Predict and forecast values of population parameters
Test hypotheses about values of population parameters
Make decisions
Qualitative - Categorical or Nominal. Examples: Color, Gender, Nationality
Quantitative - Measurable or Countable. Examples: Temperatures, Salaries, Number of points scored on a 100 point exam
Scales of Measurement
Nominal Scale - groups or classes. Example: Gender
Ordinal Scale - order matters. Example: Ranks
Interval Scale - difference or distance matters; has an arbitrary zero value. Example: Temperatures
Ratio Scale - ratio matters; has a natural zero value. Example: Salaries
Population (N)
Sample (n)
Why Sample?
Census of a population may be:
Impossible
Impractical
Too costly
Example 1-2
A large department store collects data on sales made by each of its salespeople. The number of sales made on a given day by each of 20 salespeople is shown below, sorted by magnitude.
Sorted Sales
6 9 10 12 13 14 14 15 16 16 16 17 17 18 18 19 20 21 22 24
Quartiles
First Quartile: position (20+1)(25/100) = 5.25, so Q1 = 13 + (.25)(1) = 13.25
Median: position (20+1)(50/100) = 10.5, so Median = 16 + (.5)(0) = 16
Third Quartile: position (20+1)(75/100) = 15.75, so Q3 = 18 + (.75)(1) = 18.75
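The position rule used above can be sketched in Python. This is a minimal illustration, assuming the (n+1)p/100 position convention shown on the slide; the function name `percentile` is a hypothetical helper, and other software may use a different interpolation rule and give slightly different quartiles.

```python
def percentile(sorted_values, p):
    """Return the p-th percentile using the (n + 1) * p / 100 position rule."""
    n = len(sorted_values)
    position = (n + 1) * p / 100       # 1-based position, possibly fractional
    k = int(position)                  # whole part of the position
    fraction = position - k            # fractional part of the position
    if k < 1:                          # position falls before the first value
        return float(sorted_values[0])
    if k >= n:                         # position falls at or after the last value
        return float(sorted_values[-1])
    lower, upper = sorted_values[k - 1], sorted_values[k]
    # Interpolate between the k-th and (k+1)-th sorted values.
    return lower + fraction * (upper - lower)


sales = [6, 9, 10, 12, 13, 14, 14, 15, 16, 16,
         16, 17, 17, 18, 18, 19, 20, 21, 22, 24]

q1 = percentile(sales, 25)
median = percentile(sales, 50)
q3 = percentile(sales, 75)
print(q1, median, q3)  # 13.25 16.0 18.75
```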
Measures of Central
Tendency
Median
Mode
Mean
Measures of Variability
Range
Interquartile range
Variance
Standard Deviation
Other summary
measures:
Skewness
Kurtosis
Median
Mode
Mean (Average)
Median = 16 + (.5)(0) = 16
[Dot plot of the sorted sales data]
Mode = 16
The mode is the most frequently occurring value. It
is the value with the highest frequency.
Sample Mean
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{317}{20} = 15.85$$
[Dot plot of the sorted sales data]
Mean = 15.85
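The three measures of central tendency for the sales data can be checked with Python's standard `statistics` module. This is a small sketch, not part of the original slides; note that `statistics.median` averages the two middle values, which agrees with the position rule here because both middle values are 16.

```python
import statistics

sales = [6, 9, 10, 12, 13, 14, 14, 15, 16, 16,
         16, 17, 17, 18, 18, 19, 20, 21, 22, 24]

mean = statistics.mean(sales)      # sum of values divided by n: 317 / 20
median = statistics.median(sales)  # average of the 10th and 11th sorted values
mode = statistics.mode(sales)      # most frequently occurring value

print(mean, median, mode)
```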
Range
Difference between maximum and minimum values
Interquartile Range
Difference between third and first quartile (Q3 - Q1)
Variance
Average of the squared deviations from the mean (divide by N for a population, by n - 1 for a sample)
Standard Deviation
Square root of the variance
Rank:   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
Sales:  6  9 10 12 13 14 14 15 16 16 16 17 17 18 18 19 20 21 22 24
Minimum = 6
First Quartile: Q1 = 13 + (.25)(1) = 13.25
Third Quartile: Q3 = 18 + (.75)(1) = 18.75
Maximum = 24
Range: Maximum - Minimum = 24 - 6 = 18
Interquartile Range: Q3 - Q1 = 18.75 - 13.25 = 5.5
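Population Variance
$$\sigma^2 = \frac{\sum_{i=1}^{N}(x_i-\mu)^2}{N} = \frac{\sum_{i=1}^{N}x_i^2 - \frac{\left(\sum_{i=1}^{N}x_i\right)^2}{N}}{N}, \qquad \sigma = \sqrt{\sigma^2}$$
Sample Variance
$$s^2 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n}x_i^2 - \frac{\left(\sum_{i=1}^{n}x_i\right)^2}{n}}{n-1}, \qquad s = \sqrt{s^2}$$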
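    x    x - x̄    (x - x̄)²     x²
    6    -9.85     97.0225     36
    9    -6.85     46.9225     81
   10    -5.85     34.2225    100
   12    -3.85     14.8225    144
   13    -2.85      8.1225    169
   14    -1.85      3.4225    196
   14    -1.85      3.4225    196
   15    -0.85      0.7225    225
   16     0.15      0.0225    256
   16     0.15      0.0225    256
   16     0.15      0.0225    256
   17     1.15      1.3225    289
   17     1.15      1.3225    289
   18     2.15      4.6225    324
   18     2.15      4.6225    324
   19     3.15      9.9225    361
   20     4.15     17.2225    400
   21     5.15     26.5225    441
   22     6.15     37.8225    484
   24     8.15     66.4225    576
  317     0       378.5500   5403   (column totals)

$$s^2 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1} = \frac{378.55}{20-1} = \frac{378.55}{19} = 19.923684$$
$$s^2 = \frac{\sum_{i=1}^{n}x_i^2 - \frac{\left(\sum_{i=1}^{n}x_i\right)^2}{n}}{n-1} = \frac{5403 - \frac{317^2}{20}}{20-1} = \frac{5403 - \frac{100489}{20}}{19} = \frac{378.55}{19} = 19.923684$$
$$s = \sqrt{s^2} = \sqrt{19.923684} = 4.46$$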
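Both routes to the sample variance (the definition and the computational shortcut) can be verified numerically. The sketch below is illustrative only and uses plain Python arithmetic; the variable names are assumptions, not notation from the slides.

```python
import math

sales = [6, 9, 10, 12, 13, 14, 14, 15, 16, 16,
         16, 17, 17, 18, 18, 19, 20, 21, 22, 24]
n = len(sales)

# Definition: sum of squared deviations from the mean, divided by n - 1.
mean = sum(sales) / n
sum_sq_dev = sum((x - mean) ** 2 for x in sales)
s2_definition = sum_sq_dev / (n - 1)

# Shortcut: uses only the sum of x and the sum of x squared.
sum_x = sum(sales)
sum_x2 = sum(x * x for x in sales)
s2_shortcut = (sum_x2 - sum_x ** 2 / n) / (n - 1)

# Standard deviation is the square root of the variance.
s = math.sqrt(s2_definition)

print(round(s2_definition, 6), round(s2_shortcut, 6), round(s, 2))
```

The two variance values agree up to floating-point rounding, which is why the shortcut form is convenient for hand calculation.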
Exhaustive
Every observation is assigned to a group
Frequency Distribution
Spending Class ($)        Frequency f(x)    Relative Frequency f(x)/n
0 to less than 100              30                0.163
100 to less than 200            38                0.207
200 to less than 300            50                0.272
300 to less than 400            31                0.168
400 to less than 500            22                0.120
500 to less than 600            13                0.070
Total                          184                1.000
Cumulative Frequency Distribution
Spending Class ($)        Cumulative Frequency F(x)    Cumulative Relative Frequency F(x)/n
0 to less than 100              30                          0.163
100 to less than 200            68                          0.370
200 to less than 300           118                          0.641
300 to less than 400           149                          0.810
400 to less than 500           171                          0.929
500 to less than 600           184                          1.000
The cumulative frequency of each group is the sum of the frequencies of that and all preceding groups.
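The relative, cumulative and cumulative relative columns all follow from the class frequencies. A minimal sketch, assuming the six spending classes and frequencies from the tables; the variable names are illustrative.

```python
from itertools import accumulate

classes = ["0 to less than 100", "100 to less than 200", "200 to less than 300",
           "300 to less than 400", "400 to less than 500", "500 to less than 600"]
freq = [30, 38, 50, 31, 22, 13]

n = sum(freq)                            # total number of observations
rel_freq = [f / n for f in freq]         # f(x) / n
cum_freq = list(accumulate(freq))        # running total: F(x)
cum_rel_freq = [c / n for c in cum_freq] # F(x) / n

for label, f, r, c, cr in zip(classes, freq, rel_freq, cum_freq, cum_rel_freq):
    print(f"{label:<22} {f:>4} {r:>7.3f} {c:>5} {cr:>7.3f}")
```

Computed directly, the last relative frequency (13/184) rounds to 0.071; the table lists 0.070, presumably so that the rounded column sums to exactly 1.000.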
Histogram
[Figure: Frequency Histogram]
[Figure: Relative Frequency Histogram]
Skewness
Measure of asymmetry of a frequency distribution
Skewed to left
Symmetric or unskewed
Skewed to right
Kurtosis
Measure of flatness or peakedness of a frequency
distribution
Platykurtic (relatively flat)
Mesokurtic (normal)
Leptokurtic (relatively peaked)
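[Figure: Skewness - distribution skewed to left]
[Figure: Skewness - symmetric distribution]
[Figure: Skewness - distribution skewed to right]
[Figure: Kurtosis - Platykurtic, flat distribution]
[Figure: Kurtosis - Mesokurtic, not too flat and not too peaked]
[Figure: Kurtosis - Leptokurtic, peaked distribution]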
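The slides describe skewness and kurtosis qualitatively and give no formulas. As one possible way to put numbers on them, the sketch below uses the moment-based definitions (third and fourth central moments scaled by powers of the second); this choice of formula is an assumption, and statistical packages often apply small-sample corrections that give slightly different values.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n   # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n   # fourth central moment
    skew = m3 / m2 ** 1.5            # negative: skewed left, positive: skewed right
    excess_kurt = m4 / m2 ** 2 - 3   # about 0: mesokurtic, below 0: platykurtic, above 0: leptokurtic
    return skew, excess_kurt


sales = [6, 9, 10, 12, 13, 14, 14, 15, 16, 16,
         16, 17, 17, 18, 18, 19, 20, 21, 22, 24]

skew, excess_kurt = skewness_and_excess_kurtosis(sales)
print(f"skewness = {skew:.3f}, excess kurtosis = {excess_kurt:.3f}")
```

For the sales data the skewness comes out negative, consistent with the mean (15.85) lying slightly below the median (16).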
Chebyshev's Theorem
Empirical Rule
Applies only to roughly mound-shaped and
symmetric distributions
Specifies approximate percentages of observations
within a given number of standard deviations from the
mean
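Chebyshev's Theorem
At least $1 - \frac{1}{k^2}$ of the observations lie within $k$ standard deviations of the mean:
$$k = 2: \quad 1 - \frac{1}{2^2} = 1 - \frac{1}{4} = \frac{3}{4} = 75\%$$
$$k = 3: \quad 1 - \frac{1}{3^2} = 1 - \frac{1}{9} = \frac{8}{9} = 89\%$$
$$k = 4: \quad 1 - \frac{1}{4^2} = 1 - \frac{1}{16} = \frac{15}{16} = 94\%$$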
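The bound is a one-line calculation for any k greater than 1. A minimal sketch, with `chebyshev_lower_bound` as a hypothetical helper name:

```python
def chebyshev_lower_bound(k):
    """Minimum share of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2


for k in (2, 3, 4):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.0%}")
```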
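Empirical Rule
Approximately 68% of the observations lie within 1 standard deviation of the mean
Approximately 95% lie within 2 standard deviations of the mean
Nearly all (about 99.7%) lie within 3 standard deviations of the mean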
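As a rough check, the shares of the 20 sales observations that fall within 1, 2 and 3 sample standard deviations of the mean can be counted directly. This is an illustration only: the rule is approximate, and a sample of 20 is small, so close agreement should not be expected in general.

```python
import statistics

sales = [6, 9, 10, 12, 13, 14, 14, 15, 16, 16,
         16, 17, 17, 18, 18, 19, 20, 21, 22, 24]

mean = statistics.mean(sales)   # 15.85
s = statistics.stdev(sales)     # sample standard deviation, about 4.46


def share_within(k):
    """Share of observations within k standard deviations of the mean."""
    inside = sum(1 for x in sales if abs(x - mean) <= k * s)
    return inside / len(sales)


shares = [share_within(k) for k in (1, 2, 3)]
print(shares)  # [0.7, 0.95, 1.0]
```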
Pie Charts
Categories represented as percentages of total
Bar Graphs
Heights of rectangles represent group frequencies
Frequency Polygons
Height of line represents frequency
Ogives
Height of line represents cumulative frequency
Time Plots
Represents values over time
[Figure: Pie Chart]
[Figure: Bar Chart - Fig. 1-11 Airline Operating Expenses and Revenues (Average Revenues, Average Expenses)]
[Figure: Frequency Polygon - Relative Frequency versus Sales]
[Figure: Ogive - cumulative relative frequency versus Sales]
[Figure: Time Plot - Monthly Steel Production (Problem 1-46), Millions of Tons by Month]
Stem-and-Leaf Displays
Quick-and-dirty listing of all observations
Conveys some of the same information as a histogram
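A stem-and-leaf display is simple enough to build by hand or in a few lines of code. The sketch below applies it to the 20 sales values from the earlier example, assuming the tens digit as the stem and the units digit as the leaf; `stem_and_leaf` is a hypothetical helper name.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group each value's units digit (leaf) under its tens digit (stem)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        stems[stem].append(leaf)
    return dict(stems)


sales = [6, 9, 10, 12, 13, 14, 14, 15, 16, 16,
         16, 17, 17, 18, 18, 19, 20, 21, 22, 24]

display = stem_and_leaf(sales)
for stem in range(min(display), max(display) + 1):
    leaves = " ".join(str(leaf) for leaf in display.get(stem, []))
    print(f"{stem} | {leaves}")
```

Each row lists every observation in that stem, so the row lengths trace the same shape a histogram with class width 10 would show.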
Box Plots
Median
Lower and upper quartiles
Maximum and minimum
Box Plot
Elements of a Box Plot
Outlier
Lower Outer Fence: Q1 - 3(IQR)
Lower Inner Fence: Q1 - 1.5(IQR)
Smallest data point not below inner fence
Q1
Median
Q3
Interquartile Range (IQR)
Upper Inner Fence: Q3 + 1.5(IQR)
Upper Outer Fence: Q3 + 3(IQR)
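The fences can be computed for the sales data from the earlier example. This is a minimal sketch, assuming the (n+1)p/100 position rule for quartiles; `percentile` is a hypothetical helper, and the code simply lists any points lying outside the inner fences.

```python
def percentile(sorted_values, p):
    """p-th percentile by the (n + 1) * p / 100 position rule."""
    n = len(sorted_values)
    position = (n + 1) * p / 100
    k = int(position)
    fraction = position - k
    if k < 1:
        return float(sorted_values[0])
    if k >= n:
        return float(sorted_values[-1])
    return sorted_values[k - 1] + fraction * (sorted_values[k] - sorted_values[k - 1])


sales = [6, 9, 10, 12, 13, 14, 14, 15, 16, 16,
         16, 17, 17, 18, 18, 19, 20, 21, 22, 24]

q1, q3 = percentile(sales, 25), percentile(sales, 75)
iqr = q3 - q1

inner_fences = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
outer_fences = (q1 - 3 * iqr, q3 + 3 * iqr)

# Points outside the inner fences are candidates for outliers.
beyond_inner = [x for x in sales if x < inner_fences[0] or x > inner_fences[1]]
# Whiskers end at the most extreme points still inside the inner fences.
whisker_low = min(x for x in sales if x >= inner_fences[0])
whisker_high = max(x for x in sales if x <= inner_fences[1])

print(inner_fences, outer_fences, beyond_inner, whisker_low, whisker_high)
```

For these data no point falls outside the inner fences, so the whiskers extend to the minimum (6) and maximum (24).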