Measures of Spread or Dispersion describe how closely a set of data clusters around its centre:
1. Range
2. Interquartile Range (IQR)
3. Standard Deviation
4. Variance
Measures of Position (Ranking Data):
1. Percentiles
2. Quartiles
3. Z-Scores
Measures of Position
Determine the position of a value, relative to other values, in a set of data.
Measures of Position (Ranking Data):
1. Percentiles
2. Quartiles
3. Z-Scores
Quartiles are required to determine interquartile ranges.
Data must be arranged in order to determine percentiles and quartiles.
1. Percentiles
Divide a set of ordered data into 100 intervals with equal numbers of values.
k percent of the data are less than or equal to the kth percentile, P_k.
(100 − k) percent of the data are greater than or equal to the kth percentile, P_k.
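The percentile definition can be sketched in code. The example below is a minimal illustration that assumes the nearest-rank convention (the smallest ordered value with at least k percent of the data at or below it); other conventions interpolate between values and give slightly different answers. The function name `percentile_nearest_rank` and the sample values are hypothetical.

```python
import math


def percentile_nearest_rank(data, k):
    """Return the k-th percentile (0 < k <= 100) using the nearest-rank rule.

    Assumption: nearest-rank is only one of several percentile conventions.
    """
    if not data:
        raise ValueError("data must not be empty")
    if not 0 < k <= 100:
        raise ValueError("k must be in the interval (0, 100]")
    ordered = sorted(data)                    # percentiles need ordered data
    rank = math.ceil(k * len(ordered) / 100)  # 1-based position in the ordered list
    return ordered[rank - 1]


# Illustrative values only
scores = [7, 9, 12, 13, 24, 29]
p50 = percentile_nearest_rank(scores, 50)
p90 = percentile_nearest_rank(scores, 90)
print(p50, p90)
```

With this rule, at least k percent of the values are less than or equal to the returned value, which matches the definition above.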
2. Quartiles
Divide a set of ordered data into four groups with equal numbers of values.
Median = Second Quartile (Q2).
The median divides the data into two equally sized groups.
3. Z-Scores
A z-score is the number of standard deviations that a datum is from the mean.
To calculate it, divide the deviation of the datum from the mean by the standard deviation.
Values below the mean have negative z-scores, values above the mean have positive z-scores, and values equal to the mean have a z-score of zero.
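As a rough sketch of the calculation just described, the snippet below divides each deviation from the mean by the standard deviation. It assumes the population standard deviation (`statistics.pstdev`); the slides do not say which version is intended, and `statistics.stdev` would be the sample alternative. The function name `z_score` and the data are illustrative only.

```python
import statistics


def z_score(value, data):
    """Number of standard deviations that `value` lies from the mean of `data`."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)  # assumption: population standard deviation
    if sd == 0:
        raise ValueError("standard deviation is zero; z-score is undefined")
    return (value - mean) / sd


# Illustrative data: mean is 5 and population standard deviation is 2
data = [2, 4, 4, 4, 5, 5, 7, 9]
z_high = z_score(9, data)  # above the mean, so positive
z_mean = z_score(5, data)  # equal to the mean, so zero
z_low = z_score(2, data)   # below the mean, so negative
print(z_high, z_mean, z_low)
```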
Implications of Z-Scores
Z-scores can be used to rank any set of data, using the standard deviation as the unit of measure.
A z-score of −0.072 indicates a value approximately 7% of a standard deviation (0.072 standard deviations) below the mean.
A z-score of 0.46 indicates a value approximately half a standard deviation (0.46 standard deviations) above the mean.
Measures of Spread
While measures of central tendency are used to estimate "normal" values of a dataset, measures of dispersion are important for describing the spread of the data, or its variation around a central value.
Measures of Spread or Dispersion:
1. Range
2. Interquartile Range (IQR)
3. Standard Deviation
4. Variance
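1. Range
Simply put, the range is the difference between the largest and smallest values in the data.
Not always the best measure, since it depends only on the two extreme values.
A box & whisker plot shows it graphically.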
Example
Data points include: 7, 9, 12, 13, 24, 29
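For these data points the range can be checked with a couple of lines. This is a minimal sketch; the variable names are illustrative.

```python
data = [7, 9, 12, 13, 24, 29]

# Range = largest value minus smallest value
data_range = max(data) - min(data)
print(data_range)  # 29 - 7 = 22
```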
2. Interquartile Range (IQR)
To find IQR:
1. Find the median (Q2).
2. Find the medians of the upper and lower halves of the data (Q3 & Q1).
3. IQR is the difference between Q3 & Q1 (it spans the middle 50% of the data).
NOTE:
A box & whisker plot shows the IQR graphically.
A smaller IQR means less spread (more consistent data).
Outliers have little impact on IQR values.
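The three steps can be sketched as follows. This is one possible reading of the procedure: it assumes that, when the number of values is odd, the median itself is left out of both halves (textbooks differ on this point). The function name `quartiles` is hypothetical, and the data are the six points from the range example.

```python
import statistics


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Assumption: for an odd number of values, the median is excluded
    from both the lower and the upper half.
    """
    ordered = sorted(data)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    q1 = statistics.median(lower)    # step 2: median of the lower half
    q2 = statistics.median(ordered)  # step 1: overall median
    q3 = statistics.median(upper)    # step 2: median of the upper half
    return q1, q2, q3


q1, q2, q3 = quartiles([7, 9, 12, 13, 24, 29])
iqr = q3 - q1                        # step 3: IQR = Q3 - Q1
print(q1, q2, q3, iqr)
```

Here the lower half is 7, 9, 12 and the upper half is 13, 24, 29, so Q1 = 9, Q3 = 24 and IQR = 15.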
IQR Examples:
Solution
a) Q2 = (18 + 21)/2 = 19.5
   Q1 = (14 + 17)/2 = 15.5
   Q3 = (25 + 27)/2 = 26
   IQR = Q3 − Q1 = 26 − 15.5 = 10.5
b) Q2 = 47
   Q1 = 40
   Q3 = 51
   IQR = 51 − 40 = 11
3. Standard Deviation
Mathematicians' choice for measuring the spread of data.
= Square root of the average of the squared differences between each data point and the mean
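Population:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$
Grouped Population:
$$\sigma = \sqrt{\frac{\sum_{i} f_i (x_i - \mu)^2}{N}}, \qquad N = \sum_{i} f_i$$
Sample:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
Grouped Sample:
$$s = \sqrt{\frac{\sum_{i} f_i (x_i - \bar{x})^2}{n - 1}}, \qquad n = \sum_{i} f_i$$
Here $\mu$ is the population mean, $\bar{x}$ is the sample mean, and $f_i$ is the frequency of the value (or interval) $x_i$.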
Represents the typical distance of each piece of data from the mean (the square root of the average squared distance).
If data is clustered about the mean, there is little dispersion & a low standard deviation.
If data is spread out, it is widely scattered & has a high standard deviation.
Outliers have a larger impact on the standard deviation (σ) since every piece of data is considered.
Use the mid-value/mid-point of each interval as x for grouped data.
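A minimal sketch of the population, sample and grouped calculations is shown below. The function name `std_dev`, its parameters and all data values are hypothetical; for grouped data the `values` are taken to be the interval mid-points and `freqs` their frequencies.

```python
import math


def std_dev(values, freqs=None, sample=True):
    """Standard deviation of raw data, or of grouped data when `freqs` is given.

    sample=True divides by n - 1 (sample); sample=False divides by n (population).
    """
    if freqs is None:
        freqs = [1] * len(values)  # raw data: every value counts once
    if len(freqs) != len(values):
        raise ValueError("values and freqs must have the same length")
    n = sum(freqs)
    denom = n - 1 if sample else n
    if denom <= 0:
        raise ValueError("not enough data")
    mean = sum(f * x for f, x in zip(freqs, values)) / n
    squared_diffs = sum(f * (x - mean) ** 2 for f, x in zip(freqs, values))
    return math.sqrt(squared_diffs / denom)


# Illustrative raw data (mean 5)
data = [2, 4, 4, 4, 5, 5, 7, 9]
pop_sd = std_dev(data, sample=False)  # divides by n
samp_sd = std_dev(data)               # divides by n - 1, so slightly larger

# Illustrative grouped data: interval mid-points and their frequencies
midpoints = [5, 15, 25]
frequencies = [2, 5, 3]
grouped_pop_sd = std_dev(midpoints, freqs=frequencies, sample=False)
print(pop_sd, samp_sd, grouped_pop_sd)
```

The sample version is always a little larger than the population version for the same data, because it divides by n − 1 instead of n.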
4. Variance (σ²)
= another measure of dispersion/spread
Equal to the square of the standard deviation
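The relationship between variance and standard deviation can be checked with Python's standard `statistics` module, as a quick sketch. The data values are illustrative only.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

# Population versions (divide by n)
pop_var = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)

# Sample versions (divide by n - 1)
samp_var = statistics.variance(data)
samp_sd = statistics.stdev(data)

# In both cases the variance is the square of the standard deviation
print(math.isclose(pop_sd ** 2, pop_var))
print(math.isclose(samp_sd ** 2, samp_var))
```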
Example 1: Calculate Sample Standard Deviation
Sample Standard Deviation is then:
Example 2: Calculate Sample Standard Deviation
Sample Standard Deviation is then: