
FURTHER STATISTICS

MEASURES OF DISPERSION AND SKEWNESS


Objectives
By the end of the session, you should be able to:
1. Calculate and interpret the range;
2. Calculate and interpret the interquartile range;
3. Calculate and interpret the quartile deviation.
The Range
The range is the simplest measure of dispersion (spread).
It is the difference between the highest and lowest values in a data set:
Range = Highest value – Lowest value
Example 1
Find the range:
(a) 84, 88, 76, 77, 79, 89, 82, 85, 87, 88, 80
(b) 34, 54, 56, 33, 22, 45, 52, 66
Solution
a. Highest value is 89
Lowest value is 76
Range = 89 – 76 = 13

b. Highest value is 66
Lowest value is 22
Range = 66 – 22 = 44
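The two answers in Example 1 can be checked with a short Python sketch. The helper name `value_range` is an assumption made for this illustration only; it is not part of the notes.

```python
def value_range(data):
    """Return the range: highest value minus lowest value."""
    return max(data) - min(data)


# Data sets (a) and (b) from Example 1
range_a = value_range([84, 88, 76, 77, 79, 89, 82, 85, 87, 88, 80])
range_b = value_range([34, 54, 56, 33, 22, 45, 52, 66])

print(range_a, range_b)  # 13 44
```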
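QUARTILES
Quartiles, Q, are values that divide an ordered set of data into four equal parts.
Q1 – the first (lower) quartile: one quarter of the observations are less than or equal to it.
Q2 – the second quartile, which is the median: half of the observations are less than or equal to it.
Q3 – the third (upper) quartile: three quarters of the observations are less than or equal to it.
INTERQUARTILE RANGE
The interquartile range is the difference between the third and first quartiles, $Q_3 - Q_1$, where
$Q_1 = \frac{1}{4}(n + 1)$th observation and $Q_3 = \frac{3}{4}(n + 1)$th observation.
If n is even, then $Q_1 = \frac{n}{4}$th observation and $Q_3 = \frac{3n}{4}$th observation.
When the position is not a whole number (for example the 2.5th observation), take the mean of the two neighbouring observations.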

EXAMPLE 2
Find the interquartile range of 84, 88, 76, 77, 79, 89, 82, 85, 87, 88, 80.
Arrange the observations in ascending order:
76, 77, 79, 80, 82, 84, 85, 87, 88, 88, 89
n = 11, so $Q_1 = \frac{11 + 1}{4} = 3$, i.e. the 3rd observation.
$Q_1 = 79$

$Q_3 = \frac{3(11 + 1)}{4} = 9$, i.e. the 9th observation.
$Q_3 = 88$

Interquartile Range
$Q_3 - Q_1 = 88 - 79 = 9$

QUARTILE DEVIATION (SEMI-INTERQUARTILE RANGE)
The quartile deviation is one-half of the interquartile range:
Quartile Deviation = $\frac{Q_3 - Q_1}{2}$
From Example 2:
$\frac{88 - 79}{2} = \frac{9}{2} = 4.5$
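The positional rule used in these notes can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names `observation_at` and `quartiles` are invented for the example, the data set has at least three observations, and a fractional position such as 2.5 is taken as the mean of the two neighbouring observations. Other textbooks and software packages use different quartile conventions and may give slightly different values.

```python
import math


def observation_at(sorted_data, position):
    """Value at a 1-based position in sorted data.

    A fractional position is interpolated between its two neighbours,
    so a position ending in .5 gives the mean of the two.
    """
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    low_value = sorted_data[lower - 1]
    high_value = sorted_data[upper - 1]
    return low_value + fraction * (high_value - low_value)


def quartiles(data):
    """Return (Q1, Q3) using the (n + 1)/4 rule for odd n and n/4 for even n."""
    ordered = sorted(data)
    n = len(ordered)
    if n < 3:
        raise ValueError("need at least three observations")
    if n % 2 == 1:
        q1_pos, q3_pos = (n + 1) / 4, 3 * (n + 1) / 4
    else:
        q1_pos, q3_pos = n / 4, 3 * n / 4
    return observation_at(ordered, q1_pos), observation_at(ordered, q3_pos)


# Example 2 data
q1, q3 = quartiles([84, 88, 76, 77, 79, 89, 82, 85, 87, 88, 80])
iqr = q3 - q1
quartile_deviation = iqr / 2
print(q1, q3, iqr, quartile_deviation)  # 79.0 88.0 9.0 4.5
```

The same function handles an even number of observations: for 43, 34, 45, 35, 31, 21, 15, 17, 22, 40 it returns 19.0 and 37.5.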
1. For each of the following data sets, calculate:

a. The Range
b. The Interquartile Range
c. The Quartile Deviation (Semi-Interquartile Range)

I. 1, 3, 5, 7, 9, 11, 23, 25, 27, 29, 31
II. 43, 34, 45, 35, 31, 21, 15, 17, 22, 40
III. 11, 13, 15, 17, 19, 11, 23, 45, 47, 49, 31
IV. 113, 134, 145, 315, 311, 121, 115, 171, 322, 430

2. Compare the following pairs of data sets:

a. I and II
b. III and IV
c. II and III

Solution
a. RANGE
I. Ascending order: 1, 3, 5, 7, 9, 11, 23, 25, 27, 29, 31
Range = 31 – 1 = 30

II. Ascending order: 15, 17, 21, 22, 31, 34, 35, 40, 43, 45
Range = 45 – 15 = 30

III. Ascending order: 11, 11, 13, 15, 17, 19, 23, 31, 45, 47, 49
Range = 49 – 11 = 38

IV. Ascending order: 113, 115, 121, 134, 145, 171, 311, 315, 322, 430
Range = 430 – 113 = 317

b. The Interquartile Range

I. Ascending order: 1, 3, 5, 7, 9, 11, 23, 25, 27, 29, 31
There are 11 observations.
$Q_1 = \frac{11 + 1}{4}$th = 3rd observation
$Q_1 = 5$
$Q_3 = \frac{3(11 + 1)}{4}$th = 9th observation
$Q_3 = 27$
Interquartile Range = $Q_3 - Q_1 = 27 - 5 = 22$

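II. 15, 17, 21, 22, 31, 34, 35, 40, 43, 45
$n = 10$
$Q_1 = \frac{10}{4}$th = 2.5th observation, the mean of the 2nd and 3rd observations
$Q_1 = \frac{17 + 21}{2} = 19$
$Q_3 = \frac{3(10)}{4}$th = 7.5th observation, the mean of the 7th and 8th observations
$Q_3 = \frac{35 + 40}{2} = 37.5$

Interquartile Range = $Q_3 - Q_1 = 37.5 - 19 = 18.5$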

III. 11, 11, 13, 15, 17, 19, 23, 31, 45, 47, 49
$Q_1 = \frac{11 + 1}{4}$th = 3rd observation
$Q_1 = 13$
$Q_3 = \frac{3(11 + 1)}{4}$th = 9th observation
$Q_3 = 45$

Interquartile Range = $Q_3 - Q_1 = 45 - 13 = 32$
IV. 113, 115, 121, 134, 145, 171, 311, 315, 322, 430
$Q_1 = \frac{10}{4}$th = 2.5th observation
$Q_1 = \frac{115 + 121}{2} = 118$
$Q_3 = \frac{3(10)}{4}$th = 7.5th observation
$Q_3 = \frac{311 + 315}{2} = 313$

Interquartile Range = $Q_3 - Q_1 = 313 - 118 = 195$

c. Quartile Deviation (Semi-Interquartile Range)

Quartile Deviation = one-half of the interquartile range

I. $\frac{22}{2} = 11$
II. $\frac{18.5}{2} = 9.25$
III. $\frac{32}{2} = 16$
IV. $\frac{195}{2} = 97.5$

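Data sets in ascending order:
I. 1, 3, 5, 7, 9, 11, 23, 25, 27, 29, 31
II. 15, 17, 21, 22, 31, 34, 35, 40, 43, 45
III. 11, 11, 13, 15, 17, 19, 23, 31, 45, 47, 49
IV. 113, 115, 121, 134, 145, 171, 311, 315, 322, 430
a. I and II
Both have the same range (30), but I is more spread out in its central part than II (interquartile range 22 against 18.5).
b. III and IV
IV is far more widely spread than III (range 317 against 38; interquartile range 195 against 32).
c. II and III
III is more widely spread than II (range 38 against 30; interquartile range 32 against 18.5).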

The Range and Interquartile Range from data in a Frequency Distribution

Range = Midpoint of Highest class – Midpoint of Lowest class
OR
Range = Highest value – Lowest value
Example 3
Find the range for the data in the frequency table below:
Marks   40 – 44   45 – 49   50 – 54   55 – 59   60 – 64   65 – 69   70 – 74
Freq.      4        11        20        31        19        11         4

Solution
Midpoint of Lowest Class = (40 + 44)/2 = 42
Midpoint of Highest Class = (70 + 74)/2 = 72
Range = 72 – 42 = 30
OR
Range = 74 – 40 = 34
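Both versions of the grouped range can be reproduced with a small Python sketch. The variable and function names (`classes`, `midpoint`) are assumptions for this illustration; the class limits are those of the marks table above.

```python
# (lower limit, upper limit) of each class in the marks table
classes = [(40, 44), (45, 49), (50, 54), (55, 59), (60, 64), (65, 69), (70, 74)]


def midpoint(limits):
    """Midpoint of a class given its (lower, upper) limits."""
    low, high = limits
    return (low + high) / 2


# Range from class midpoints: highest class midpoint minus lowest class midpoint
range_by_midpoints = midpoint(classes[-1]) - midpoint(classes[0])

# Range from class limits: highest value minus lowest value
range_by_limits = classes[-1][1] - classes[0][0]

print(range_by_midpoints, range_by_limits)  # 30.0 34
```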

Interquartile Range = $Q_3 - Q_1$
$Q_1 = L + \frac{\frac{n}{4} - CF}{f} \times i$        $Q_3 = L + \frac{\frac{3n}{4} - CF}{f} \times i$

Where:
L - lower class boundary of the class containing the quartile
n - total number of frequencies
CF - the cumulative frequency of all the classes preceding the class in which the quartile lies
f - the frequency of the quartile class
i - the width (class size) of the class in which the quartile lies.

Example 4
Find the interquartile range for the data set in Example 3.
Classes    Class Boundaries    Freq.    Cumulative Freq.
40 - 44    39.5 - 44.5           4             4
45 - 49    44.5 - 49.5          11            15
50 - 54    49.5 - 54.5          20            35
55 - 59    54.5 - 59.5          31            66
60 - 64    59.5 - 64.5          19            85
65 - 69    64.5 - 69.5          11            96
70 - 74    69.5 - 74.5           4           100

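The first quartile is the $\frac{100}{4}$ = 25th observation, which lies in the 50 – 54 class (the lower quartile class).
L = 49.5
CF = 15
f = 20
i = 54.5 – 49.5 = 5
$Q_1 = L + \frac{\frac{n}{4} - CF}{f} \times i$
$= 49.5 + \frac{\frac{100}{4} - 15}{20} \times 5$
$= 49.5 + \frac{10}{20} \times 5$
$= 49.5 + 2.5 = 52$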
The third quartile $Q_3$ is the $\frac{3(100)}{4}$ = 75th observation, which lies in the 60 – 64 class (the $Q_3$ class).
L = 59.5
CF = 66
f = 19
i = 64.5 – 59.5 = 5
$Q_3 = 59.5 + \frac{\frac{3(100)}{4} - 66}{19} \times 5$
$Q_3 = 59.5 + 2.37$
$Q_3 = 61.87$
Interquartile Range = $Q_3 - Q_1 = 61.87 - 52 = 9.87$
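The interpolation formula for grouped data lends itself to a short function. The sketch below is one possible way to code it, not a definitive implementation: the name `grouped_quartile` and its parameters are assumptions for this illustration, the classes are given by their class boundaries, and the quartile class is taken to be the first class whose cumulative frequency reaches kn/4.

```python
def grouped_quartile(boundaries, freqs, k):
    """Quartile Q_k (k = 1 or 3) from a frequency table.

    boundaries: list of (lower, upper) class boundaries
    freqs: class frequencies, in the same order
    Applies Q_k = L + ((k*n/4 - CF) / f) * i.
    """
    n = sum(freqs)
    target = k * n / 4        # position of the quartile
    cumulative = 0            # CF: frequency of all preceding classes
    for (lower, upper), f in zip(boundaries, freqs):
        if cumulative + f >= target:
            width = upper - lower  # i: class width
            return lower + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("quartile position lies beyond the table")


# Frequency table from Example 4
boundaries = [(39.5, 44.5), (44.5, 49.5), (49.5, 54.5), (54.5, 59.5),
              (59.5, 64.5), (64.5, 69.5), (69.5, 74.5)]
freqs = [4, 11, 20, 31, 19, 11, 4]

q1 = grouped_quartile(boundaries, freqs, 1)
q3 = grouped_quartile(boundaries, freqs, 3)
print(round(q1, 2), round(q3, 2), round(q3 - q1, 2))  # 52.0 61.87 9.87
```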

Exercise 6.2
Find the:
a) Range
b) Interquartile Range
for each of the following tables.
1.
Weight (gm)   116 - 118   119 - 121   122 - 124   125 - 127   128 - 130
Freq.             7          19          28          16           5

2.
Mark    1 – 10   11 – 20   21 – 30   31 – 40   41 – 50   51 – 60
Freq.     2         6         9        15        12         6

3.
Age     15 – 19   20 – 24   25 – 29   30 – 34   35 – 39
Freq.      4        15        19        16         6
Solution
1. Range = 130 − 116 = 14
OR
Midpoint of class containing the highest value = $\frac{128 + 130}{2} = 129$
Midpoint of class containing the lowest value = $\frac{116 + 118}{2} = 117$
Range = 129 − 117 = 12

Classes      Class Boundaries    Freq.    Cumulative Freq.
116 - 118    115.5 - 118.5         7             7
119 - 121    118.5 - 121.5        19            26
122 - 124    121.5 - 124.5        28            54
125 - 127    124.5 - 127.5        16            70
128 - 130    127.5 - 130.5         5            75

$Q_1$ position = $\frac{75}{4}$ = 18.75th observation

$Q_1$ class is 119 – 121

L = 118.5
n = 75
CF = 7
f = 19
i = 121.5 − 118.5 = 3
$Q_1 = L + \frac{\frac{n}{4} - CF}{f} \times i$
$Q_1 = 118.5 + \frac{\frac{75}{4} - 7}{19} \times 3$
$Q_1 = 118.5 + 1.86 = 120.36$

Third Quartile $Q_3$
$Q_3$ position = $\frac{3(75)}{4}$ = 56.25th observation

$Q_3$ class is 125 – 127

L = 124.5
n = 75
CF = 54
f = 16
i = 127.5 – 124.5 = 3
$Q_3 = L + \frac{\frac{3n}{4} - CF}{f} \times i$
$Q_3 = 124.5 + \frac{\frac{3(75)}{4} - 54}{16} \times 3$
$Q_3 = 124.5 + 0.42 = 124.92$

Interquartile Range = $Q_3 - Q_1 = 124.92 - 120.36 = 4.56$


2. Range = 60 − 1 = 59
OR
Midpoint of class containing the smallest value = $\frac{1 + 10}{2} = 5.5$
Midpoint of class containing the highest value = $\frac{51 + 60}{2} = 55.5$

Range = 55.5 − 5.5 = 50


Classes    Class Boundaries    Freq.    Cumulative Freq.
1 – 10     0.5 - 10.5            2             2
11 – 20    10.5 - 20.5           6             8
21 – 30    20.5 - 30.5           9            17
31 – 40    30.5 - 40.5          15            32
41 – 50    40.5 - 50.5          12            44
51 – 60    50.5 - 60.5           6            50

$Q_1$ position = $\frac{50}{4}$ = 12.5th observation

$Q_1$ class is 21 – 30
L = 20.5
n = 50
CF = 8
f = 9
i = 30.5 − 20.5 = 10
$Q_1 = 20.5 + \frac{\frac{50}{4} - 8}{9} \times 10$
$Q_1 = 20.5 + 5 = 25.5$

$Q_3$ position = $\frac{3(50)}{4}$ = 37.5th observation

$Q_3$ class is 41 – 50
L = 40.5
n = 50
CF = 32
f = 12
i = 50.5 − 40.5 = 10
$Q_3 = 40.5 + \frac{\frac{3(50)}{4} - 32}{12} \times 10$
$Q_3 = 40.5 + 4.58 = 45.08$

Interquartile Range = $Q_3 - Q_1 = 45.08 - 25.5 = 19.58$

3. Range = 39 – 15 = 24
OR
Midpoint of class containing the lowest value = $\frac{15 + 19}{2} = 17$
Midpoint of class containing the highest value = $\frac{35 + 39}{2} = 37$

Range = 37 − 17 = 20

Classes    Class Boundaries    Freq.    Cumulative Freq.
15 – 19    14.5 - 19.5           4             4
20 – 24    19.5 - 24.5          15            19
25 – 29    24.5 - 29.5          19            38
30 – 34    29.5 - 34.5          16            54
35 – 39    34.5 - 39.5           6            60

$Q_1$ position = $\frac{60}{4}$ = 15th observation, so the $Q_1$ class is 20 – 24

L = 19.5
n = 60
CF = 4
f = 15
i = 24.5 − 19.5 = 5
$Q_1 = 19.5 + \frac{\frac{60}{4} - 4}{15} \times 5$
$Q_1 = 19.5 + 3.67 = 23.17$

$Q_3$ position = $\frac{3(60)}{4}$ = 45th observation, so the $Q_3$ class is 30 – 34

L = 29.5
n = 60
CF = 38
f = 16
i = 34.5 − 29.5 = 5
$Q_3 = 29.5 + \frac{\frac{3(60)}{4} - 38}{16} \times 5$
$Q_3 = 29.5 + 2.19 = 31.69$

Interquartile Range = $Q_3 - Q_1 = 31.69 - 23.17 = 8.52$

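THE VARIANCE AND STANDARD DEVIATION

Population Variance
$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$   OR   $\sigma^2 = \frac{\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}}{N}$

Sample Variance
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$   OR   $s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}$
The standard deviation is the square root of the variance.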
Compute the variance for the sample data:
484, 503, 496, 510, 491, and 516.
Solution
n = 6
Mean, $\bar{x} = \frac{484 + 503 + 496 + 510 + 491 + 516}{6} = \frac{3000}{6} = 500$

$x_i$     $x_i - \bar{x}$     $(x_i - \bar{x})^2$
484          -16                 256
503            3                   9
496           -4                  16
510           10                 100
491           -9                  81
516           16                 256
$\sum (x_i - \bar{x})^2 = 718$

$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
$s^2 = \frac{718}{6 - 1} = \frac{718}{5} = 143.6$
Compute the variance for the population data: 6, 4, 5, 3, 4
Solution
$x$     $x^2$
6        36
4        16
5        25
3         9
4        16
$\sum x = 22$     $\sum x^2 = 102$

$\sigma^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{N}}{N} = \frac{102 - \frac{22^2}{5}}{5} = 1.04$
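The two forms of each variance formula (the definitional form and the shortcut form) can be compared with a small Python sketch. The function names `variance` and `variance_shortcut` and the `sample` flag are assumptions for this illustration. For the sample 484, 503, 496, 510, 491, 516 the sum of squared deviations is 718, so the sample variance is 718/5 = 143.6.

```python
def variance(data, sample=True):
    """Definitional form: sum of squared deviations from the mean,
    divided by n - 1 for a sample or by N for a population."""
    n = len(data)
    mean = sum(data) / n
    squared_deviations = sum((x - mean) ** 2 for x in data)
    return squared_deviations / (n - 1 if sample else n)


def variance_shortcut(data, sample=True):
    """Shortcut form: (sum of x^2 - (sum of x)^2 / n) over the same divisor."""
    n = len(data)
    total = sum(data)
    total_squares = sum(x * x for x in data)
    return (total_squares - total ** 2 / n) / (n - 1 if sample else n)


sample_data = [484, 503, 496, 510, 491, 516]
population_data = [6, 4, 5, 3, 4]

sample_var = variance(sample_data, sample=True)
population_var = variance_shortcut(population_data, sample=False)
print(round(sample_var, 2), round(population_var, 2))  # 143.6 1.04
```

Both functions should agree for the same data up to floating-point rounding; the shortcut form only avoids computing each deviation separately.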
Exercise 6.3
Calculate the variance and the standard deviation for each of the following:
1. Sample data: 1, 3, 5, 7, 9, 11
2. Sample data: 45, 35, 31, 21, 15, 17, 22, 40
3. Population data: 1, 2, 3, 5, 3, 4
4. Population data: 6.1, 5.9, 4.8, 10.2, 9.6, 6.1
Solution
1. Mean, $\bar{x} = \frac{1 + 3 + 5 + 7 + 9 + 11}{6} = \frac{36}{6} = 6$

$x$     $x - \bar{x}$     $(x - \bar{x})^2$
1         -5                25
3         -3                 9
5         -1                 1
7          1                 1
9          3                 9
11         5                25
$\sum (x - \bar{x})^2 = 70$

Sample variance, $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{70}{6 - 1} = \frac{70}{5} = 14$
Sample standard deviation, $s = \sqrt{14} = 3.74$

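2. Mean, $\bar{x} = \frac{45 + 35 + 31 + 21 + 15 + 17 + 22 + 40}{8} = \frac{226}{8} = 28.25$

$x_i$     $x_i^2$
45        2025
35        1225
31         961
21         441
15         225
17         289
22         484
40        1600
$\sum_{i=1}^{8} x_i = 226$     $\sum_{i=1}^{8} x_i^2 = 7250$

Sample variance, $s^2 = \frac{\sum_{i=1}^{8} x_i^2 - \frac{\left(\sum_{i=1}^{8} x_i\right)^2}{n}}{n - 1} = \frac{7250 - \frac{(226)^2}{8}}{8 - 1} = 123.64$
Sample standard deviation, $s = \sqrt{123.64} = 11.12$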

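3. Mean, $\mu = \frac{1 + 2 + 3 + 5 + 3 + 4}{6} = \frac{18}{6} = 3$

$x_i$     $x_i - \mu$     $(x_i - \mu)^2$
1          -2               4
2          -1               1
3           0               0
5           2               4
3           0               0
4           1               1
$\sum_{i=1}^{6} x_i = 18$     $\sum_{i=1}^{6} (x_i - \mu)^2 = 10$

Population variance, $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \frac{10}{6} = 1.67$

Population standard deviation, $\sigma = \sqrt{1.67} = 1.29$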


4. Mean, $\mu = \frac{6.1 + 5.9 + 4.8 + 10.2 + 9.6 + 6.1}{6} = \frac{42.7}{6} = 7.12$

$x_i$      $x_i^2$
6.1         37.21
5.9         34.81
4.8         23.04
10.2       104.04
9.6         92.16
6.1         37.21
$\sum_{i=1}^{6} x_i = 42.7$     $\sum_{i=1}^{6} x_i^2 = 328.47$

Population variance, $\sigma^2 = \frac{\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}}{N} = \frac{328.47 - \frac{(42.7)^2}{6}}{6} = 4.10$
Population standard deviation, $\sigma = \sqrt{4.10} = 2.02$
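The four answers to Exercise 6.3 can be cross-checked with Python's standard `statistics` module, which provides `variance` and `stdev` for sample data and `pvariance` and `pstdev` for population data. The variable names below are chosen for this illustration only.

```python
import statistics

# Data sets from Exercise 6.3
sample_1 = [1, 3, 5, 7, 9, 11]
sample_2 = [45, 35, 31, 21, 15, 17, 22, 40]
population_3 = [1, 2, 3, 5, 3, 4]
population_4 = [6.1, 5.9, 4.8, 10.2, 9.6, 6.1]

# Sample data: divisor n - 1
print(statistics.variance(sample_1), round(statistics.stdev(sample_1), 2))
print(round(statistics.variance(sample_2), 2), round(statistics.stdev(sample_2), 2))

# Population data: divisor N
print(round(statistics.pvariance(population_3), 2), round(statistics.pstdev(population_3), 2))
print(round(statistics.pvariance(population_4), 2), round(statistics.pstdev(population_4), 2))
```

Choosing between the sample and population functions matters: applying `pvariance` to sample data would divide by n instead of n − 1 and give a smaller value.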
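THE VARIANCE AND STANDARD DEVIATION FROM DATA IN A FREQUENCY DISTRIBUTION TABLE
Formulas
Sample Variance: $s^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n - 1}$   OR   $s^2 = \frac{\sum f_i x_i^2 - \frac{\left(\sum f_i x_i\right)^2}{n}}{n - 1}$
Population Variance: $\sigma^2 = \frac{\sum f_i (x_i - \mu)^2}{N}$   OR   $\sigma^2 = \frac{\sum f_i x_i^2 - \frac{\left(\sum f_i x_i\right)^2}{N}}{N}$
Here $x_i$ is the midpoint of class i, $f_i$ is its frequency, the sums run over all classes, and n (or N) is the total frequency $\sum f_i$.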

Exercise 6.4
Calculate the variance and the standard deviation for each of the following data sets.
Sample data
1. Ages of pupils attending a birthday party.
Ages (yr)    1 – 3   4 – 6   7 – 9   10 – 12   13 – 15
Frequency      2       5      10        5         3

2. Distance walked for charity.
Distance (km)   1 – 5   6 – 10   11 – 15   16 – 20   21 – 25
Frequency         14       9        11        10         6

Population data
1. Distribution of ages of a group of 50 people in a village.
Age          10 – 19   20 – 29   30 – 39   40 – 49   50 – 59   60 – 69
Frequency       5         11        10        12         8         4

2. Number of children per family in a village.
No. of children   1 – 3   4 – 6   7 – 9   10 – 12   13 – 15   16 – 18
No. of families     3       5       7        4         3         2

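Solution
Sample data 1: ages of pupils
Age        $f$    $x$    $fx$    $x^2$    $fx^2$
1 – 3       2      2       4       4        8
4 – 6       5      5      25      25      125
7 – 9      10      8      80      64      640
10 – 12     5     11      55     121      605
13 – 15     3     14      42     196      588
$\sum f = 25$     $\sum fx = 206$     $\sum fx^2 = 1966$
$(\sum fx)^2 = 42436$

Mean, $\bar{x} = \frac{\sum fx}{\sum f} = \frac{206}{25} = 8.24$

$s^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1} = \frac{1966 - \frac{42436}{25}}{25 - 1} = 11.19$
Sample standard deviation, $s = \sqrt{11.19} = 3.35$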
Sample data 2: distance walked for charity
Distance (km)   $f$    $x$    $fx$    $x - \bar{x}$    $(x - \bar{x})^2$    $f(x - \bar{x})^2$
1 - 5           14      3      42       -8.5             72.25              1011.5
6 - 10           9      8      72       -3.5             12.25               110.25
11 - 15         11     13     143        1.5              2.25                24.75
16 - 20         10     18     180        6.5             42.25               422.5
21 - 25          6     23     138       11.5            132.25               793.5
Total           50            575                                           2362.5

Mean, $\bar{x} = \frac{\sum fx}{\sum f} = \frac{575}{50} = 11.5$

Sample variance, $s^2 = \frac{\sum f(x - \bar{x})^2}{n - 1} = \frac{2362.5}{50 - 1} = 48.21$

Sample standard deviation, $s = \sqrt{48.21} = 6.94$


Population data 1: ages of 50 people in a village
Age        $f$    Class midpoint $x$    $fx$     $x - \mu$    $(x - \mu)^2$    $f(x - \mu)^2$
10 - 19     5        14.5               72.5      -23.8         566.44           2832.2
20 - 29    11        24.5              269.5      -13.8         190.44           2094.84
30 - 39    10        34.5              345         -3.8          14.44            144.4
40 - 49    12        44.5              534          6.2          38.44            461.28
50 - 59     8        54.5              436         16.2         262.44           2099.52
60 - 69     4        64.5              258         26.2         686.44           2745.76
Total      50                         1915                                      10378

Mean, $\mu = \frac{\sum fx}{\sum f} = \frac{1915}{50} = 38.3$

Population variance, $\sigma^2 = \frac{\sum f(x - \mu)^2}{\sum f} = \frac{10378}{50} = 207.56$

Population standard deviation, $\sigma = \sqrt{207.56} = 14.41$

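Population data 2: number of children per family
No. of children   $f$    $x$    $fx$    $x^2$    $fx^2$
1 - 3              3      2       6       4       12
4 - 6              5      5      25      25      125
7 - 9              7      8      56      64      448
10 - 12            4     11      44     121      484
13 - 15            3     14      42     196      588
16 - 18            2     17      34     289      578
$\sum f = 24$     $\sum fx = 207$     $\sum fx^2 = 2235$
$(\sum fx)^2 = 42849$

Mean, $\mu = \frac{\sum fx}{\sum f} = \frac{207}{24} = 8.625$

Population variance, $\sigma^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{N}}{N} = \frac{2235 - \frac{(207)^2}{24}}{24} = \frac{2235 - 1785.375}{24} = 18.73$

Population standard deviation, $\sigma = \sqrt{18.73} = 4.33$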
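The shortcut formula for grouped data can be sketched in Python as below. The function name `grouped_variance` and its parameters are assumptions for this illustration; each class is represented by its midpoint, as in the tables above. The sketch is applied to the pupils' ages (sample data) and to the number of children per family (population data).

```python
import math


def grouped_variance(class_limits, freqs, sample=True):
    """Variance from a frequency table using class midpoints.

    Applies (sum f*x^2 - (sum f*x)^2 / n) / divisor, where the divisor
    is n - 1 for sample data and n for population data.
    """
    midpoints = [(low + high) / 2 for low, high in class_limits]
    n = sum(freqs)
    sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
    sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1 if sample else n)


# Sample data: ages of pupils attending a birthday party
ages = [(1, 3), (4, 6), (7, 9), (10, 12), (13, 15)]
age_var = grouped_variance(ages, [2, 5, 10, 5, 3], sample=True)

# Population data: number of children per family
children = [(1, 3), (4, 6), (7, 9), (10, 12), (13, 15), (16, 18)]
children_var = grouped_variance(children, [3, 5, 7, 4, 3, 2], sample=False)

print(round(age_var, 2), round(math.sqrt(age_var), 2))            # 11.19 3.35
print(round(children_var, 2), round(math.sqrt(children_var), 2))  # 18.73 4.33
```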
