
Summarizing and Describing

Numerical Data
Lectures 3+4+5 Topics
Measures of Central Tendency
Mean, Median, Mode
Measures of Variation
The Range, Variance and
Standard Deviation
Shape
Symmetric, Skewed, Skewness, Kurtosis
Summary Measures
Central Tendency: Mean, Median, Mode
Variation: Range, Variance, Standard Deviation, Coefficient of Variation
Measures of Central Tendency
Central Tendency: Mean, Median, Mode
The Mean (Arithmetic Mean, Average)
It is the arithmetic average of the data values.
Sample mean: \( \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n} \)
The Most Common Measure of Central Tendency
Affected by Extreme Values (Outliers)

[Figure: two dot plots. Left, axis from 0 to 10: Mean = 5. Right, axis from 0 to 14: Mean = 6.]
The Arithmetic Mean
This is the most popular and useful measure of central location.
Mean = (Sum of the observations) / (Number of observations)
Sample mean: \( \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \), where n is the sample size.
Population mean: \( \mu = \frac{\sum_{i=1}^{N} x_i}{N} \), where N is the population size.
Example 4.1
The reported times spent on the Internet by 10 adults are 0, 7, 12, 5, 33, 14, 8, 0, 9, and 22 hours. Find the mean time spent on the Internet.
\( \bar{x} = \frac{\sum_{i=1}^{10} x_i}{10} = \frac{0 + 7 + \cdots + 22}{10} = 11.0 \) hours
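The calculation in Example 4.1 can be checked with a few lines of Python. This is only a minimal sketch: the helper name `sample_mean` is an assumption made for illustration, and the data are the ten times listed in the example.

```python
# Times (in hours) spent on the Internet by the 10 adults of Example 4.1.
times = [0, 7, 12, 5, 33, 14, 8, 0, 9, 22]


def sample_mean(values):
    """Arithmetic mean: sum of the observations divided by their number."""
    return sum(values) / len(values)


mean_time = sample_mean(times)
print(mean_time)  # 11.0
```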
Example 4.2
Suppose the telephone bills represent the population of measurements (N = 200). The population mean is
\( \mu = \frac{\sum_{i=1}^{200} x_i}{200} = \frac{42.19 + 38.45 + \cdots + 45.77}{200} = 43.59 \)
Weighted mean for data grouped by categories or variants
\( \bar{x} = \frac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i} \)
When many of the measurements have the same value, the measurements can be summarized in a frequency table. Suppose the number of children in a sample of 16 families was recorded as follows:

NUMBER OF CHILDREN   0   1   2   3
NUMBER OF FAMILIES   3   4   7   2   (16 families)

\( \bar{x} = \frac{\sum_{i=1}^{4} x_i f_i}{\sum_{i=1}^{4} f_i} = \frac{x_1 f_1 + x_2 f_2 + x_3 f_3 + x_4 f_4}{16} = \frac{3(0) + 4(1) + 7(2) + 2(3)}{16} = \frac{24}{16} = 1.5 \)
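The frequency-table calculation above can be sketched in Python as follows. The function name `weighted_mean` is an assumption chosen for illustration; the values and frequencies are taken from the table of 16 families.

```python
# Frequency table from the example: number of children and number of families.
children = [0, 1, 2, 3]
families = [3, 4, 7, 2]


def weighted_mean(values, freqs):
    """Weighted mean: sum of x_i * f_i divided by the sum of the frequencies f_i."""
    return sum(x * f for x, f in zip(values, freqs)) / sum(freqs)


mean_children = weighted_mean(children, families)
print(mean_children)  # 1.5
```

The same function also covers the class-midpoint approximation for grouped data, if the class midpoints are passed as `values`.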
The Median
Important Measure of Central Tendency
In an ordered array, the median is the
middle number.
If n is odd, the median is the middle number.
If n is even, the median is the average of the 2
middle numbers.
Not Affected by Extreme Values
[Figure: two dot plots. Left, axis from 0 to 10: Median = 5. Right, axis from 0 to 14: Median = 5.]
The Median
The median of a set of observations is the value that falls in the middle when the observations are arranged in order of magnitude (ranked increasingly).

Example 4.3
Find the median of the time spent on the Internet for the adults of Example 4.1.
Even number of observations: 0, 0, 5, 7, 8, 9, 12, 14, 22, 33
The median is the average of the two middle values: (8 + 9) / 2 = 8.5

Comment
Suppose only 9 adults were sampled (exclude, say, the longest time (33)).
Odd number of observations: 0, 0, 5, 7, 8, 9, 12, 14, 22
The median is the middle value: 8
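As a rough check of Example 4.3, the sketch below uses `statistics.median` from the Python standard library on the Example 4.1 data, with and without the longest time (33). It also compares how much the mean and the median move when that extreme value is dropped; the variable names are assumptions made for illustration.

```python
import statistics

# Times (in hours) from Example 4.1.
times = [0, 7, 12, 5, 33, 14, 8, 0, 9, 22]

# Even number of observations: average of the two middle values.
median_all = statistics.median(times)

# Odd number of observations: drop the longest time (33), take the middle value.
times_without_max = sorted(times)[:-1]
median_nine = statistics.median(times_without_max)

# The mean reacts much more strongly to the extreme value than the median does.
mean_all = sum(times) / len(times)
mean_nine = sum(times_without_max) / len(times_without_max)

print(median_all, median_nine)            # 8.5 8
print(mean_all, round(mean_nine, 2))      # 11.0 8.56
```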
The Mode
A Measure of Central Tendency
Value that Occurs Most Often
Not Affected by Extreme Values
There May Not be a Mode
There May be Several Modes
Used for Either Numerical or Categorical Data

[Figure: two dot plots. One, axis from 0 to 6: No Mode. The other, axis from 0 to 14: Mode = 9.]
The Mode
The Mode of a set of observations is the
variable value that occurs most frequently.
A set of data may have one mode (or modal class), or two or more modes.

For large data sets, the modal class is much more relevant than a single-value mode.
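The three situations described for the mode (one mode, several modes, no mode) can be illustrated with a small helper. This is a sketch under stated assumptions: the function name `modes` and the convention of returning an empty list when no value repeats are choices made here, and the second and third data sets are hypothetical.

```python
from collections import Counter


def modes(values):
    """Return every value tied for the highest frequency.

    Returns an empty list when no value occurs more than once (no mode).
    """
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs exactly once: there is no mode
    return sorted(v for v, c in counts.items() if c == top)


# Data from Example 4.1: the value 0 occurs twice, all others once.
print(modes([0, 7, 12, 5, 33, 14, 8, 0, 9, 22]))  # [0]
# Hypothetical data with two modes.
print(modes([1, 1, 2, 2, 3]))                     # [1, 2]
# Hypothetical data with no mode.
print(modes([1, 2, 3]))                           # []
```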
Approximating Descriptive Measures for Data Grouped by Classes
Approximating descriptive measures for grouped data may be needed in two cases:
when approximate values suffice,
when only secondary grouped data are available.
\( \bar{x} = \frac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i} \), where \( x_i \) is the class midpoint and \( f_i = n_i \) is the class frequency.
Example 4.13
Approximate the mean and standard deviation of the telephone call durations problem (example ), as represented by the frequency distribution.

Class i   Class limits   Frequency n_i   Midpoint x_i   x_i n_i
1         2-5            3               3.5            10.5
2         5-8            6               6.5            39.0
3         8-11           8               9.5            76.0
.         .              .               .              .
6         17-20          2               18.5           37.0
Total                    n = 30                         312.0

Approximate mean: \( \bar{x} \approx \frac{312.0}{30} = 10.4 \)
Real value: \( \bar{x} = 10.26 \)
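Median and Mode

Median
\( Me = x_0 + K \cdot \frac{\frac{1}{2}\left(\sum n_i + 1\right) - \sum_{i=1}^{Me-1} n_i}{n_{Me}} \)
where \( x_0 \) is the lower limit of the median class, \( K \) is the class width, \( \sum_{i=1}^{Me-1} n_i \) is the cumulative frequency of the classes before the median class, and \( n_{Me} \) is the frequency of the median class.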
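Mode
\( Mo = x_0 + K \cdot \frac{\Delta_1}{\Delta_1 + \Delta_2} \)
where \( x_0 \) is the lower limit of the modal class, \( K \) is the class width, \( \Delta_1 \) is the difference between the frequency of the modal class and that of the preceding class, and \( \Delta_2 \) is the difference between the frequency of the modal class and that of the following class.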
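The interpolation formulas for the median and the mode of grouped data can be sketched as below. Everything here is illustrative: the function names, the assumption of equal class widths, and the frequency table (classes 0-10, 10-20, ..., 40-50) are hypothetical and do not come from the lecture's examples.

```python
def grouped_median(lower_limits, width, freqs):
    """Interpolated median: Me = x0 + K * ((sum(n_i) + 1) / 2 - cum_before) / n_Me."""
    position = (sum(freqs) + 1) / 2
    cumulative = 0  # cumulative frequency of the classes before the current one
    for lower, n_i in zip(lower_limits, freqs):
        if cumulative + n_i >= position:
            return lower + width * (position - cumulative) / n_i
        cumulative += n_i
    raise ValueError("empty frequency table")


def grouped_mode(lower_limits, width, freqs):
    """Interpolated mode: Mo = x0 + K * d1 / (d1 + d2), using the first modal class."""
    m = freqs.index(max(freqs))
    before = freqs[m - 1] if m > 0 else 0
    after = freqs[m + 1] if m < len(freqs) - 1 else 0
    d1 = freqs[m] - before  # modal frequency minus preceding class frequency
    d2 = freqs[m] - after   # modal frequency minus following class frequency
    return lower_limits[m] + width * d1 / (d1 + d2)


# Hypothetical table: five classes of width 10 with these frequencies.
lower_limits = [0, 10, 20, 30, 40]
freqs = [2, 5, 8, 4, 1]

median_value = grouped_median(lower_limits, 10, freqs)
mode_value = grouped_mode(lower_limits, 10, freqs)
print(median_value)          # 24.375
print(round(mode_value, 4))  # 24.2857
```

In this hypothetical table both measures fall in the 20-30 class, which has the highest frequency and also contains the middle position.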
Relationship among Mean, Median, and Mode

If a distribution is symmetrical, the mean, median and mode coincide.
If a distribution is not symmetrical, and is skewed to the left or to the right, the three measures differ.
A positively skewed distribution (skewed to the right): Mode < Median < Mean
A negatively skewed distribution (skewed to the left): Mean < Median < Mode
Measures of Variation
Variation: Range, Variance (Population Variance, Sample Variance), Standard Deviation (Population Standard Deviation, Sample Standard Deviation), Coefficient of Variation
\( CV = \left( \frac{S}{\bar{X}} \right) \cdot 100\% \)
The Range
Measure of Variation
Difference Between Largest & Smallest
Observations:
Absolute Range = \( x_{\text{Largest}} - x_{\text{Smallest}} \)
Relative Range = \( (x_{\text{Largest}} - x_{\text{Smallest}}) / \text{mean} \)

Ignores How Data Are Distributed:


[Figure: two dot plots on an axis from 7 to 12 with differently distributed values. In both, Range = 12 - 7 = 5.]
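A minimal sketch of the two range measures, applied to the Internet-time data of Example 4.1 (the choice of that data set and the variable names are assumptions made for illustration):

```python
# Times (in hours) from Example 4.1.
times = [0, 7, 12, 5, 33, 14, 8, 0, 9, 22]

# Absolute range: largest observation minus smallest observation.
absolute_range = max(times) - min(times)

# Relative range: absolute range divided by the mean.
mean_time = sum(times) / len(times)
relative_range = absolute_range / mean_time

print(absolute_range)  # 33
print(relative_range)  # 3.0
```

Only the two extreme observations enter the calculation, which is why the range ignores how the remaining values are distributed.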
Deviation
Individual deviation from the mean = \( x_i - \bar{x} \)

Overall deviation = 0, because \( \sum (X_i - \bar{X}) = 0 \)

Instead, sum the squared deviations, \( \sum (X_i - \bar{X})^2 \), or the absolute values of the deviations, \( \sum |x_i - \bar{x}| \).
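The cancellation of the deviations can be seen numerically. The sketch below reuses the Example 4.1 data (an assumption made for illustration) and computes the plain, squared, and absolute sums of deviations.

```python
# Times (in hours) from Example 4.1; their mean is 11.0.
times = [0, 7, 12, 5, 33, 14, 8, 0, 9, 22]
mean_time = sum(times) / len(times)

# Individual deviations from the mean.
deviations = [x - mean_time for x in times]

# Positive and negative deviations cancel, so their sum is zero.
sum_of_deviations = sum(deviations)

# Squaring or taking absolute values removes the signs.
sum_of_squares = sum(d ** 2 for d in deviations)
sum_of_absolutes = sum(abs(d) for d in deviations)

print(sum_of_deviations)  # 0.0
print(sum_of_squares)     # 922.0
print(sum_of_absolutes)   # 74.0
```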
Variance
Important Measure of Variation
Shows Variation About the Mean
Computed as an arithmetic mean of
squared deviations or as a square mean of
individual deviations
For the Population: \( \sigma^2 = \frac{\sum (X_i - \mu)^2}{N} \) (use N in the denominator)
For the Sample: \( s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1} \) (use n - 1 in the denominator)
Standard Deviation
Most Important Measure of Variation
Shows Variation About the Mean:
For the Population: \( \sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} \) (use N in the denominator)
For the Sample: \( s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} \) (use n - 1 in the denominator)
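Sample Standard Deviation
\( s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} \)

Data: \( X_i \): 10 12 14 15 17 18 18 24

n = 8, Mean = 16

\( s = \sqrt{\frac{(10-16)^2 + (12-16)^2 + (14-16)^2 + (15-16)^2 + (17-16)^2 + (18-16)^2 + (18-16)^2 + (24-16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} = 4.3095 \)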
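The sample standard deviation for this data set can be worked through step by step in Python. This is a sketch with illustrative variable names. Note that the value 18 occurs twice in the data, so it contributes two squared deviations to the sum.

```python
import math

# The eight observations from the slide.
data = [10, 12, 14, 15, 17, 18, 18, 24]

n = len(data)
mean = sum(data) / n  # 16.0

# Squared deviation of every observation from the mean (eight terms in total).
squared_deviations = [(x - mean) ** 2 for x in data]
sum_of_squares = sum(squared_deviations)  # 130.0

# Sample variance uses n - 1 in the denominator; the standard deviation is its square root.
sample_variance = sum_of_squares / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(round(sample_sd, 4))  # 4.3095
```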
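Comparing Standard Deviations
Data: \( X_i \): 10 12 14 15 17 18 18 24

N = 8, Mean = 16

\( s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{130}{7}} = 4.3095 \)
\( \sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = \sqrt{\frac{130}{8}} = 4.0311 \)

The value of the standard deviation is larger when the data are treated as a sample.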
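The comparison between the two denominators can also be made with the Python standard library: `statistics.stdev` divides by n - 1 (sample) and `statistics.pstdev` divides by N (population). The sketch below applies both to the eight observations from the slide; the variable names are assumptions.

```python
import statistics

# The eight observations from the slide.
data = [10, 12, 14, 15, 17, 18, 18, 24]

# Treating the data as a sample: denominator n - 1.
s = statistics.stdev(data)

# Treating the data as a population: denominator N.
sigma = statistics.pstdev(data)

print(round(s, 4), round(sigma, 4))
print(s > sigma)  # True: dividing by n - 1 gives the larger value
```

Because the sum of squared deviations is the same in both cases, the smaller denominator always produces the larger standard deviation.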
Comparing Standard Deviations
[Figure: three dot plots of AGE on a common axis from 11 to 21.]
Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = 0.9258
Data C: Mean = 15.5, s = 4.57
Coefficient of Variation

Measure of Relative Variation


Always expressed as a percentage (%) or as a coefficient
Shows Variation Relative to Mean
Used to Compare 2 or More Groups
Formula (for Sample): \( CV = \left( \frac{S}{\bar{X}} \right) \cdot 100\% \)
Comparing Coefficients of Variation
Stock A: Average Price last year = $50, Standard Deviation (sd) = $5
Stock B: Average Price last year = $100, Standard Deviation (sd) = $5
Coefficient of Variation: \( CV = \left( \frac{S}{\bar{X}} \right) \cdot 100\% \)
Stock A: CV = (5 / 50) x 100% = 10%
Stock B: CV = (5 / 100) x 100% = 5%
Both average prices are representative.
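The stock comparison can be reproduced with a short function. This is a minimal sketch; the name `coefficient_of_variation` is an assumption, and the inputs are the average prices and standard deviations stated for Stock A and Stock B.

```python
def coefficient_of_variation(sd, mean):
    """Coefficient of variation in percent: 100 * standard deviation / mean."""
    return 100 * sd / mean


cv_a = coefficient_of_variation(sd=5, mean=50)    # Stock A
cv_b = coefficient_of_variation(sd=5, mean=100)   # Stock B

print(cv_a)  # 10.0
print(cv_b)  # 5.0
```

Both stocks have the same standard deviation, yet Stock A varies twice as much relative to its average price.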
Shape
Describes How Data Are Distributed between the smallest and largest values
Measures of Shape: Symmetric or skewed
Left-Skewed or Negative Skewness: Mean < Median < Mode
Symmetric: Mean = Median = Mode
Right-Skewed or Positive Skewness: Mode < Median < Mean
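The slides name skewness without giving a formula. As an illustration only, the sketch below uses the moment coefficient of skewness (third central moment divided by the second central moment raised to the power 3/2), which is one common definition and is an assumption here, not necessarily the coefficient intended in the lecture. It is applied to the Example 4.1 data.

```python
import statistics

# Times (in hours) from Example 4.1.
times = [0, 7, 12, 5, 33, 14, 8, 0, 9, 22]


def moment_skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments).

    Assumes the values are not all identical, so that m2 is positive.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


skew = moment_skewness(times)
mean_time = sum(times) / len(times)
median_time = statistics.median(times)

# A positive coefficient indicates a right-skewed distribution.
print(round(skew, 2))          # 1.02
print(median_time < mean_time) # True, as expected for positive skew
```

For this data the mode (0), median (8.5) and mean (11.0) also appear in the order Mode < Median < Mean, matching the right-skewed pattern.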
Box plot: graphical presentation of central tendency measures (CTM)
Central tendency measures summary
Discussed Measures of Central Tendency
Mean, Median, Mode
Addressed Measures of Variation
The Range, Variance,
Standard Deviation, Coefficient of Variation
Determined Shape of Distributions
Symmetric or Skewed
Coefficient of skewness
Left-skewed: Mean < Median < Mode. Symmetric: Mean = Median = Mode. Right-skewed: Mode < Median < Mean.
