Objectives
Set up a frequency distribution for a mass of data. Calculate the mean, median and mode for grouped data. Calculate and interpret other measures of location such as the deciles, quartiles and percentiles.
Calculate the standard deviation, variance, mean deviation and quartile deviation for grouped data. Construct histograms, bar charts, frequency polygons, pie charts and ogives. Describe a given set of data in terms of skewness and kurtosis.
Statistical data should be arranged in a manner that allows a reader to distinguish their essential features. Depending on the type of data and the objectives of the person presenting the information, data may be presented using one or a combination of three forms: textual, tabular and graphical.
8/27/2011
Frequency Distribution
When the data include a large number of observations, it is convenient to group the values into mutually exclusive classes and show the number of observations occurring in each class in tabular form. A frequency distribution is an arrangement of data that shows the frequency of occurrence of values falling within arbitrarily defined ranges of the variable, known as class intervals. The smallest and largest values that fall in a given interval are called class limits.
Textual Form: data are presented in paragraph form, especially when they are purely qualitative or when very few numbers are involved.
The class frequency is the number of observations falling in a particular class, while the midpoint between the upper and lower class limits is called the class mark (midpoint).
Problem:
Solution:
Computing for the range:
$$R = 82 - 28 = 54$$
Computing for the class interval:
$$i = \frac{54}{10} = 5.4 \approx 5$$
We choose 5 because it is an odd number. If i = 5, the lowest limit should be 25. We choose 25 because it is the largest multiple of the chosen interval that is smaller than the smallest value in the set. If the lowest limit is 25, the bottom interval should be 29 - 25. The interval 29 - 25 contains the lowest score (28).
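The setup steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores are whole numbers, the interval size has already been chosen, and the function name `class_setup` is hypothetical.

```python
def class_setup(smallest, largest, interval):
    """Return the range, the lowest class limit and the (lower, upper) classes."""
    value_range = largest - smallest
    # Lowest limit: largest multiple of the interval not exceeding the smallest value.
    lowest_limit = (smallest // interval) * interval
    classes = []
    lower = lowest_limit
    while lower <= largest:
        # Assumes integer scores, so each class spans `interval` whole values.
        classes.append((lower, lower + interval - 1))
        lower += interval
    return value_range, lowest_limit, classes


value_range, lowest_limit, classes = class_setup(28, 82, 5)
print(value_range)               # 54
print(lowest_limit)              # 25
print(classes[0], classes[-1])   # (25, 29) (80, 84)
print(len(classes))              # 12
```

With a smallest score of 28, a largest score of 82 and i = 5, the sketch gives twelve classes running from 25 - 29 up to 80 - 84.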
Classes
f 1 1 1 4 4 7 6 6 6 3 0
84 - 80 79 - 75 74 - 70 69 - 65 64 - 60 59 - 55 54 - 50 49 - 45 44 - 40 39 - 35 34 - 30 29 - 25
MEAN
$N = \sum f = 40$
Midpoint Method
After the f column, make another column and enter the midpoint ($X_m$) of each class. Multiply each frequency by its midpoint and enter the product in the next column. Label the column $fX_m$ and get the sum of the column. Use the formula:
$$\bar{x} = \frac{\sum fX_m}{N}$$
Short Method
Choose a class at or near the middle of the distribution to be designated as the origin. After the f column, construct the deviation column (d). Mark the chosen class zero. In succession, write -1, -2 and so on for classes lower in value than the origin. In like manner, write 1, 2, 3 and so on for classes greater in value than the origin. Construct the $f \times d$ column and get the algebraic sum. Use the formula:
$$\bar{x} = z + \left(\frac{\left(\sum fd\right)_{alg}}{N}\right) i$$
where $z$ is the midpoint of the class designated as the origin and $i$ is the class interval size.
Problem:
For the given frequency distribution, compute for the mean using:
a. Midpoint Method
b. Short Method

| Classes | f |
|---|---|
| 54-50 | 4 |
| 49-45 | 7 |
| 44-40 | 12 |
| 39-35 | 10 |
| 34-30 | 9 |
| 29-25 | 6 |
| 24-20 | 2 |
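| Classes | f | $X_m$ | $fX_m$ | d | fd |
|---|---|---|---|---|---|
| 54-50 | 4 | 52 | 208 | 3 | 12 |
| 49-45 | 7 | 47 | 329 | 2 | 14 |
| 44-40 | 12 | 42 | 504 | 1 | 12 |
| 39-35 | 10 | 37 | 370 | 0 | 0 |
| 34-30 | 9 | 32 | 288 | -1 | -9 |
| 29-25 | 6 | 27 | 162 | -2 | -12 |
| 24-20 | 2 | 22 | 44 | -3 | -6 |

$N = 50$, $\sum fX_m = 1905$, $\sum fd = 11$

Midpoint Method:
$$\bar{x} = \frac{\sum fX_m}{N} = \frac{1905}{50} = 38.1$$
Short Method:
$$\bar{x} = z + \left(\frac{\left(\sum fd\right)_{alg}}{N}\right) i = 37 + \left(\frac{11}{50}\right)(5) = 38.1$$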
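As a cross-check, both methods can be sketched in Python for the distribution with frequencies 4, 7, 12, 10, 9, 6, 2 over the classes 54-50 down to 24-20. The variable names are illustrative, and the origin is assumed to be the class 39-35, as in the worked solution.

```python
# Classes listed from highest to lowest, as in the table.
classes = [(50, 54), (45, 49), (40, 44), (35, 39), (30, 34), (25, 29), (20, 24)]
freqs = [4, 7, 12, 10, 9, 6, 2]
interval = 5

midpoints = [(lo + hi) / 2 for lo, hi in classes]
n = sum(freqs)

# Midpoint method: mean = sum(f * Xm) / N
sum_fxm = sum(f * xm for f, xm in zip(freqs, midpoints))
mean_midpoint = sum_fxm / n

# Short method: origin at class 39-35 (index 3), so z = 37.
origin = 3
z = midpoints[origin]
# Classes above the origin get +1, +2, ...; classes below get -1, -2, ...
deviations = [origin - k for k in range(len(classes))]
sum_fd = sum(f * d for f, d in zip(freqs, deviations))
mean_short = z + (sum_fd / n) * interval

print(n, sum_fxm, sum_fd)        # 50 1905.0 11
print(round(mean_midpoint, 1))   # 38.1
print(round(mean_short, 1))      # 38.1
```

Both routes give the same mean because the short method only shifts and rescales the midpoints before averaging.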
MEDIAN
Steps:
1. Find $\frac{N}{2}$.
2. Find the accumulated sum of the frequencies up to the sum that contains $\frac{N}{2}$.
3. Use the formula:
$$Md = L + \left(\frac{\frac{N}{2} - cf}{f}\right) i$$
where
L = lower limit of the class which contains N/2
f = frequency of the class containing N/2
cf = cumulative sum that approaches or is equal to N/2
MODE
Rough Mode (R. Mo): obtained by inspection and equal to the $X_m$ of the class having the highest frequency.
Theoretical Mode (T. Mo):
$$T.\,Mo = 3Md - 2\bar{x}$$
Problem:
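For the given frequency distribution in the previous problem, compute for the:
a. Median
b. R. Mode
c. T. Mode

$i = 5$
$$\frac{N}{2} = \frac{50}{2} = 25$$
$$Md = L + \left(\frac{\frac{N}{2} - cf}{f}\right) i = 35 + \left(\frac{25 - 17}{10}\right)(5) = 39$$
R. Mode = 42
Since $Md = 39$ and $\bar{x} = 38.1$,
$$T.\,Mode = 3Md - 2\bar{x} = 3(39) - 2(38.1) = 40.8$$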
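The median and mode computations can be sketched as follows. This is a minimal illustration that follows the slide's convention of using the lower class limit for L; the function name `grouped_median` is hypothetical.

```python
def grouped_median(lower_limits, freqs, interval):
    """Median for grouped data; classes ordered from lowest to highest."""
    half = sum(freqs) / 2
    cf = 0  # cumulative frequency below the current class
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cf + f >= half:
            return lower + ((half - cf) / f) * interval
        cf += f
    raise ValueError("empty distribution")


# Same distribution as before, now listed from the lowest class upward.
lower_limits = [20, 25, 30, 35, 40, 45, 50]
freqs = [2, 6, 9, 10, 12, 7, 4]
interval = 5

midpoints = [lo + (interval - 1) / 2 for lo in lower_limits]
mean = sum(f * xm for f, xm in zip(freqs, midpoints)) / sum(freqs)

median = grouped_median(lower_limits, freqs, interval)
rough_mode = midpoints[freqs.index(max(freqs))]   # Xm of the most frequent class
theoretical_mode = 3 * median - 2 * mean

print(median)                       # 39.0
print(rough_mode)                   # 42.0
print(round(theoretical_mode, 1))   # 40.8
```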
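$$Q_k = L + \left(\frac{\frac{kN}{4} - cf}{f}\right) i$$
$$D_k = L + \left(\frac{\frac{kN}{10} - cf}{f}\right) i$$
$$P_k = L + \left(\frac{\frac{kN}{100} - cf}{f}\right) i$$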
Problem:
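For the given frequency distribution in the previous problem, compute for:
a. $Q_1$
b. $D_3$
c. $P_{88}$

| Classes | f | cf |
|---|---|---|
| 54-50 | 4 | 50 |
| 49-45 | 7 | 46 |
| 44-40 | 12 | 39 |
| 39-35 | 10 | 27 |
| 34-30 | 9 | 17 |
| 29-25 | 6 | 8 |
| 24-20 | 2 | 2 |

$N = 50$, $i = 5$

Computing for $Q_1$:
$$\frac{kN}{4} = \frac{(1)50}{4} = 12.5$$
$$Q_1 = L + \left(\frac{\frac{kN}{4} - cf}{f}\right) i = 30 + \left(\frac{12.5 - 8}{9}\right)(5) = 32.5$$
Computing for $D_3$:
$$\frac{kN}{10} = \frac{(3)50}{10} = 15$$
$$D_3 = L + \left(\frac{\frac{kN}{10} - cf}{f}\right) i = 30 + \left(\frac{15 - 8}{9}\right)(5) = 33.89$$
Computing for $P_{88}$:
$$\frac{kN}{100} = \frac{(88)50}{100} = 44$$
$$P_{88} = L + \left(\frac{\frac{kN}{100} - cf}{f}\right) i = 45 + \left(\frac{44 - 39}{7}\right)(5) = 48.57$$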
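Quartiles, deciles and percentiles share one formula and differ only in the divisor (4, 10 or 100). A single helper can therefore cover all three. The sketch below assumes the slide's convention of using the lower class limit for L; the name `grouped_quantile` is hypothetical.

```python
def grouped_quantile(lower_limits, freqs, interval, k, parts):
    """k-th quantile out of `parts` equal parts (4, 10 or 100) for grouped data.

    Classes must be ordered from lowest to highest.
    """
    target = k * sum(freqs) / parts
    cf = 0  # cumulative frequency below the current class
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cf + f >= target:
            return lower + ((target - cf) / f) * interval
        cf += f
    raise ValueError("k must not exceed parts")


lower_limits = [20, 25, 30, 35, 40, 45, 50]
freqs = [2, 6, 9, 10, 12, 7, 4]

q1 = grouped_quantile(lower_limits, freqs, 5, 1, 4)
d3 = grouped_quantile(lower_limits, freqs, 5, 3, 10)
p88 = grouped_quantile(lower_limits, freqs, 5, 88, 100)

print(round(q1, 2))    # 32.5
print(round(d3, 2))    # 33.89
print(round(p88, 2))   # 48.57
```

Calling the same helper with `k=1, parts=2` reproduces the median of 39 for this distribution.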
MEASURES OF VARIATION
RANGE
The range is computed as the difference between the upper limit of the highest class interval and the lower limit of the lowest class interval.
VARIANCE
$$\sigma^2 = \frac{\sum f(x_m - \bar{x})^2}{N}$$
STANDARD DEVIATION
$$\sigma = \sqrt{\frac{\sum f(x_m - \bar{x})^2}{N}}$$
MEAN DEVIATION
$$D = \frac{\sum f\,\lvert x_m - \bar{x} \rvert}{N}$$
QUARTILE DEVIATION
$$Q = \frac{Q_3 - Q_1}{2}$$
Problem:
For the given frequency distribution, determine:
a. variance
b. standard deviation
c. mean deviation
d. quartile deviation
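| Classes | f | $X_m$ | $fX_m$ | $x_m - \bar{x}$ | $(x_m - \bar{x})^2$ | $f(x_m - \bar{x})^2$ |
|---|---|---|---|---|---|---|
| 89-85 | 1 | 87 | 87 | 31.5 | 992.25 | 992.25 |
| 84-80 | 1 | 82 | 82 | 26.5 | 702.25 | 702.25 |
| 79-75 | 2 | 77 | 154 | 21.5 | 462.25 | 924.50 |
| 74-70 | 3 | 72 | 216 | 16.5 | 272.25 | 816.75 |
| 69-65 | 4 | 67 | 268 | 11.5 | 132.25 | 529.00 |
| 64-60 | 4 | 62 | 248 | 6.5 | 42.25 | 169.00 |
| 59-55 | 7 | 57 | 399 | 1.5 | 2.25 | 15.75 |
| 54-50 | 6 | 52 | 312 | -3.5 | 12.25 | 73.50 |
| 49-45 | 6 | 47 | 282 | -8.5 | 72.25 | 433.50 |
| 44-40 | 6 | 42 | 252 | -13.5 | 182.25 | 1093.50 |
| 39-35 | 3 | 37 | 111 | -18.5 | 342.25 | 1026.75 |
| 34-30 | 1 | 32 | 32 | -23.5 | 552.25 | 552.25 |

$N = 44$, $\sum fX_m = 2443$, $\sum f(x_m - \bar{x})^2 = 7329$

$$\bar{x} = \frac{\sum fX_m}{N} = \frac{2443}{44} = 55.5$$
$$\sigma^2 = \frac{\sum f(x_m - \bar{x})^2}{N} = \frac{7329}{44} = 166.57$$
Since $\sigma^2 = 166.57$,
$$\sigma = \sqrt{166.57} = 12.906$$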
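The variance and standard deviation can be verified with a short Python sketch. It divides by N as in the formulas above, and it uses the unrounded mean (2443/44) rather than 55.5, which is an assumption of this sketch; the results agree with the slide values to the decimals shown.

```python
import math

# Class midpoints (Xm) and frequencies, from class 89-85 down to 34-30.
midpoints = [87, 82, 77, 72, 67, 62, 57, 52, 47, 42, 37, 32]
freqs = [1, 1, 2, 3, 4, 4, 7, 6, 6, 6, 3, 1]

n = sum(freqs)
mean = sum(f * xm for f, xm in zip(freqs, midpoints)) / n

# Population-style variance for grouped data: sum of f * (Xm - mean)^2 over N.
variance = sum(f * (xm - mean) ** 2 for f, xm in zip(freqs, midpoints)) / n
std_dev = math.sqrt(variance)

print(n)                    # 44
print(round(mean, 1))       # 55.5
print(round(variance, 2))   # 166.57
print(round(std_dev, 3))    # 12.906
```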
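| Classes | f | cf | $f\,\lvert x_m - \bar{x} \rvert$ |
|---|---|---|---|
| 89-85 | 1 | 44 | 31.5 |
| 84-80 | 1 | 43 | 26.5 |
| 79-75 | 2 | 42 | 43.0 |
| 74-70 | 3 | 40 | 49.5 |
| 69-65 | 4 | 37 | 46.0 |
| 64-60 | 4 | 33 | 26.0 |
| 59-55 | 7 | 29 | 10.5 |
| 54-50 | 6 | 22 | 21.0 |
| 49-45 | 6 | 16 | 51.0 |
| 44-40 | 6 | 10 | 81.0 |
| 39-35 | 3 | 4 | 55.5 |
| 34-30 | 1 | 1 | 23.5 |

$N = 44$, $\sum f\,\lvert x_m - \bar{x} \rvert = 465$

Mean deviation:
$$D = \frac{\sum f\,\lvert x_m - \bar{x} \rvert}{N} = \frac{465}{44} = 10.6$$
Quartile deviation:
$$\frac{kN}{4} = \frac{1(44)}{4} = 11$$
$$Q_1 = 45 + \left(\frac{11 - 10}{6}\right)(5) = 45.83$$
$$\frac{kN}{4} = \frac{3(44)}{4} = 33$$
The 33rd observation falls in the class 64-60, whose lower limit is 60 and below which the cumulative frequency is 29:
$$Q_3 = 60 + \left(\frac{33 - 29}{4}\right)(5) = 65$$
$$Q = \frac{Q_3 - Q_1}{2} = \frac{65 - 45.83}{2} = 9.585$$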
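The mean deviation can be checked the same way. This minimal sketch uses the unrounded mean, which is an assumption of the sketch; for this distribution it gives the same total of absolute deviations as the rounded mean of 55.5.

```python
# Class midpoints (Xm) and frequencies, from class 89-85 down to 34-30.
midpoints = [87, 82, 77, 72, 67, 62, 57, 52, 47, 42, 37, 32]
freqs = [1, 1, 2, 3, 4, 4, 7, 6, 6, 6, 3, 1]

n = sum(freqs)
mean = sum(f * xm for f, xm in zip(freqs, midpoints)) / n

# Mean deviation: frequency-weighted average of absolute deviations from the mean.
sum_abs_dev = sum(f * abs(xm - mean) for f, xm in zip(freqs, midpoints))
mean_deviation = sum_abs_dev / n

print(round(sum_abs_dev, 1))      # 465.0
print(round(mean_deviation, 1))   # 10.6
```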
HISTOGRAM
The histogram is similar to a bar chart but the bases of each bar are the class boundaries rather than class limits.
FREQUENCY POLYGON
A frequency polygon is a line graph of class frequencies plotted against class marks.
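The two horizontal scales mentioned above, class boundaries for the histogram and class marks for the frequency polygon, can be derived from the class limits. The sketch below assumes the usual convention that a boundary lies halfway between the upper limit of one class and the lower limit of the next; the function names are hypothetical.

```python
def class_boundaries(classes):
    """Boundaries halfway between adjacent class limits (classes in ascending order)."""
    gap = classes[1][0] - classes[0][1]   # e.g. 25 - 24 = 1
    half = gap / 2
    return [(lo - half, hi + half) for lo, hi in classes]


def class_marks(classes):
    """Midpoint of each class."""
    return [(lo + hi) / 2 for lo, hi in classes]


classes = [(20, 24), (25, 29), (30, 34), (35, 39), (40, 44), (45, 49), (50, 54)]

print(class_boundaries(classes)[0])   # (19.5, 24.5)
print(class_marks(classes))           # [22.0, 27.0, 32.0, 37.0, 42.0, 47.0, 52.0]
```

Adjacent boundaries coincide (24.5 closes one class and opens the next), which is why histogram bars touch while bar-chart bars do not.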
Problem:
For the following frequency distribution, construct:
a. bar graph
b. histogram
c. frequency polygon

f: 4 7 12 10 9 6 2

BAR GRAPH
[Figure: bar graph of frequency (0 to 15) for the classes 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54]
HISTOGRAM
[Figure: histogram of frequency (0 to 15) against class boundaries]
FREQUENCY POLYGON
[Figure: frequency polygon of frequency (0 to 15) for the classes 20-24 through 50-54]
PIE CHART
Problem:
The following table classifies enrolment in a certain university. Construct a pie chart to show the enrolment distribution.
(Ogive Curve)
An ogive curve is a line graph obtained by plotting values from the tabular arrangement by class intervals whose frequencies are cumulated. From this curve, the centile rank of a certain score can be determined. A centile rank denotes the percentage of scores that fall below a specified score in a distribution.
Problem:
Construct the ogive curve for the given frequency distribution. What scores correspond to C50 and C88? What is the centile rank of a score of 50?
| Classes | f |
|---|---|
| 64-60 | 2 |
| 59-55 | 12 |
| 54-50 | 20 |
| 49-45 | 32 |
| 44-40 | 46 |
| 39-35 | 58 |
| 34-30 | 64 |
| 29-25 | 58 |
| 24-20 | 42 |
| 19-15 | 23 |
| 14-10 | 15 |
| 9-5 | 4 |

$N = 376$

[Figure: Ogive Curve, plotted against the upper limits (UL) 9, 14, 19, ..., 64 of the classes]
C50 = 33
C88 = 48
Score 50 = C91
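Reading centiles off the ogive amounts to linear interpolation between the plotted points. The sketch below is one way to do that numerically. It assumes that cumulative percentages are plotted at the upper class limits and that the curve starts at 0% one interval below the first upper limit. The helper names are hypothetical, and values read from a hand-drawn graph can differ from the interpolated ones by about a unit.

```python
# Upper class limits (UL) and frequencies, from the lowest class to the highest.
upper_limits = [9, 14, 19, 24, 29, 34, 39, 44, 49, 54, 59, 64]
freqs = [4, 15, 23, 42, 58, 64, 58, 46, 32, 20, 12, 2]
n = sum(freqs)  # 376

# Ogive points: x = upper limit, y = cumulative percentage.
xs = [upper_limits[0] - 5] + upper_limits   # assumed starting point at 0%
ys = [0.0]
running = 0
for f in freqs:
    running += f
    ys.append(100 * running / n)


def interpolate(x, x_points, y_points):
    """Piecewise-linear interpolation; x_points must be increasing."""
    for k in range(1, len(x_points)):
        if x_points[k - 1] <= x <= x_points[k]:
            slope = (y_points[k] - y_points[k - 1]) / (x_points[k] - x_points[k - 1])
            return y_points[k - 1] + slope * (x - x_points[k - 1])
    raise ValueError("value outside the plotted range")


def score_at_centile(c):
    return interpolate(c, ys, xs)


def centile_rank(score):
    return interpolate(score, xs, ys)


print(round(score_at_centile(50), 1))   # 32.6
print(round(score_at_centile(88), 1))   # 47.3
print(round(centile_rank(50), 1))       # 92.0
```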
KURTOSIS (ku)
Kurtosis refers to the flatness or peakedness of a frequency distribution. It describes the shape of the curve of one distribution relative to another. The coefficient of kurtosis is given by:
$$ku = \frac{Q}{P_{90} - P_{10}}$$
Types of Kurtosis
leptokurtic (ku < 0.263)
mesokurtic (ku = 0.263)
platykurtic (ku > 0.263)
SKEWNESS (sk)
Skewness refers to the symmetry or asymmetry of a frequency distribution. The coefficient of skewness is given by:
$$sk = \frac{3(\bar{x} - md)}{s}$$
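[Figure: frequency curves showing the relative positions of the mean, median and mode]
Positively skewed: $Mo < Md < \bar{X}$
Negatively skewed: $\bar{X} < Md < Mo$
Problem:
For a certain frequency distribution, the following data are given:
$s = 13.7$, $md = 147$, $Q_1 = 138$, $\bar{x} = 147$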
Solution:
$$ku = \frac{Q}{P_{90} - P_{10}} = \frac{\frac{Q_3 - Q_1}{2}}{P_{90} - D_1}$$
$$sk = \frac{3(\bar{x} - md)}{s}$$
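The two coefficients are simple enough to express as small functions. The sketch below uses the formulas above. The function names and all numerical inputs in the usage lines are hypothetical illustration values, not data from the problem, and the tolerance used to decide when ku counts as "equal" to 0.263 is likewise an assumption.

```python
def skewness(mean, median, s):
    """Coefficient of skewness: 3 * (mean - median) / s."""
    return 3 * (mean - median) / s


def kurtosis(q1, q3, p10, p90):
    """Percentile coefficient of kurtosis: quartile deviation over (P90 - P10)."""
    quartile_deviation = (q3 - q1) / 2
    return quartile_deviation / (p90 - p10)


def kurtosis_type(ku, tolerance=0.001):
    """Classify against 0.263; the tolerance is an assumed rounding margin."""
    if abs(ku - 0.263) <= tolerance:
        return "mesokurtic"
    return "leptokurtic" if ku < 0.263 else "platykurtic"


# Hypothetical values, for illustration only.
sk = skewness(mean=150, median=147, s=13.7)
ku = kurtosis(q1=138, q3=158, p10=130, p90=166)

print(round(sk, 3))        # 0.657
print(round(ku, 3))        # 0.278
print(kurtosis_type(ku))   # platykurtic
```

A positive sk indicates that the mean lies above the median (tail to the right); a negative sk indicates the opposite.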
Student Activity
1. Define each of the following:
a. class mark
b. ogive
c. histogram
d. frequency polygon
2. What advantages does each of the following forms of presenting data offer?
a. textual
b. tabular
c. graphical
3. Distinguish between:
a. class limits and class boundaries
b. skewness and kurtosis
4. Give the class mark, the class boundaries and the interval size for each of the following:
a. 10 - 19
b. 1.5 - 5.0
c. 12.85 - 13.43
Part II. Solve the following using Microsoft Excel Applications.
The list below gives the weekly food budget and weekly incomes for 39 households.
| Food Budget | Weekly Income | Food Budget | Weekly Income |
|---|---|---|---|
| 1598 | 1553 | 1639 | 1636 |
| 1680 | 1740 | 1655 | 1677 |
| 1660 | 1652 | 1736 | 1761 |
| 1583 | 1581 | 1587 | 1603 |
| 1476 | 1481 | 1622 | 1605 |
| 1633 | 1634 | 1689 | 1631 |
| 1717 | 1692 | 1700 | 1765 |
| 1596 | 1561 | 1613 | 1688 |
| 1613 | 1566 | 1615 | 1667 |
| 1607 | 1626 | 1458 | 1479 |
| 1728 | 1699 | 1750 | 1747 |
| 1672 | 1685 | 1700 | 1673 |
| 1572 | 1589 | 1654 | 1641 |
| 1634 | 1571 | 1625 | 1613 |
| 1461 | 1443 | 1565 | 1521 |
| 1726 | 1712 | 1563 | 1583 |
| 1732 | 1724 | 1566 | 1542 |
| 1620 | 1628 | 1587 | 1567 |
| 1616 | 1564 | 1584 | 1610 |
| 1579 | 1526 | | |

1. Construct a frequency distribution table for food budget using i = 25 and determine:
a. mean
b. median
c. rough and theoretical mode
d. skewness
2. Construct a frequency distribution table for weekly income using i = 25 and determine:
a. standard deviation
b. mean deviation
c. quartile deviation
d. kurtosis
3. Plot a bar chart for food budget and superimpose on it the frequency polygon for weekly income.
4. Take the difference between weekly income and food budget for each household and construct a frequency distribution and cumulative frequency distribution.
5. Plot the ogive curve for the data in (4). What score corresponds to a centile rank of 71?