Properties - Describing Quantitative Data

Properties describing quantitative data
Central tendency: the numerical value around which most of the other observations in the data set tend to cluster or group.
Variation: the extent to which values are dispersed around the central value.
Skewness: the extent to which numerical values depart from a symmetrical distribution around the central value.
Requisites of a measure of central tendency

It should be rigidly defined.
It should be based on all the observations.
It should be easy to understand and calculate.
It should have sampling stability.
It should not be unduly affected by extreme observations.
MEASURES OF CENTRAL TENDENCY

Averages of Position: The Mode, The Median
Mathematical Averages: The Mean
[Figure: The Symmetrical Distribution, The Positively Skewed Distribution, The Negatively Skewed Distribution]
Mode
A measure of central tendency.
The value that occurs most often.
Not affected by extreme values.
Used for either numerical or categorical data.
There may be no mode or several modes.
[Figure: two number-line plots, one with Mode = 9 and one with no mode]
The mode is a measure of location identified by the most frequently occurring value in a set of data.
Sales during a 20-day period (data in ascending order): 53, 56, 57, 58, 58, 60, 61, 63, 63, 64, 64, 65, 65, 67, 68, 71, 71, 71, 71, 74
Mode = 71 (it occurs four times, more often than any other value)
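As a quick check of the sales example, the sketch below counts how often each value occurs and reports the most frequent one. It uses Python's standard `statistics.multimode`; the variable names are illustrative and not part of the original slides.

```python
from collections import Counter
from statistics import multimode

# Sales during the 20-day period, as listed above
sales = [53, 56, 57, 58, 58, 60, 61, 63, 63, 64,
         64, 65, 65, 67, 68, 71, 71, 71, 71, 74]

modes = multimode(sales)            # every value tied for the highest count
count = Counter(sales)[modes[0]]    # how many times the mode occurs
print(modes, count)                 # [71] 4
```

`multimode` returns a list, so a result with several entries signals a distribution with several modes.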
Mode for frequency distribution

Frequency distribution of sales per day

Sales Volume (Class Interval)    No. of Days (Frequency)
53-56                            2
57-60                            4
61-64                            5
65-68                            4
69-72                            4
72 and above                     1
Mode: The Category or Score with the Largest Frequency (or %)
The mode is always a category or score.
The mode is not necessarily the category with the majority (more than 50% of the cases).
The mode is the only measure of central tendency for nominal variables.
Some distributions are bimodal.
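Mode for grouped data:
$M_0 = L + \dfrac{f_m - f_{m-1}}{2f_m - f_{m-1} - f_{m+1}} \times h$
where $L$ is the lower limit of the modal class, $f_m$ is its frequency, $f_{m-1}$ and $f_{m+1}$ are the frequencies of the classes just before and just after it, and $h$ is the class width.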
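A minimal sketch of the grouped-data mode formula, applied to the sales frequency distribution above. The function name `grouped_mode` is hypothetical. The lower boundary of 60.5 and width of 4 are assumptions: they treat the modal class 61-64 as the continuous interval 60.5 to 64.5.

```python
def grouped_mode(lower, f_prev, f_modal, f_next, width):
    """Mode for grouped data: L + (fm - fm-1) / (2fm - fm-1 - fm+1) * h."""
    denom = 2 * f_modal - f_prev - f_next
    if denom == 0:
        raise ValueError("formula undefined: modal class ties both neighbours")
    return lower + (f_modal - f_prev) / denom * width

# Modal class 61-64 (frequency 5); neighbours 57-60 and 65-68 (frequency 4 each).
# Assumption: class boundaries 60.5-64.5, hence lower = 60.5 and width = 4.
mode_sales = grouped_mode(lower=60.5, f_prev=4, f_modal=5, f_next=4, width=4)
print(mode_sales)  # 62.5
```

The grouped estimate depends on how the classes are drawn, so it need not equal the most frequent raw value.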
THE MEDIAN measuring qualitative characters
The median is a measure of central tendency for variables which are at least ordinal. The median represents the exact middle of a distribution.
It is the score that divides the distribution into two equal parts
Finding the Median in sorted data
How satisfied are you with your health insurance?
Responses of 7 individuals:
very dissatisfied
very satisfied
somewhat satisfied
very dissatisfied
somewhat dissatisfied
somewhat satisfied
very satisfied
Total (N) = 7
To locate the median

Arrange the responses in order from lowest to highest (or highest to lowest):
Response
very dissatisfied
very dissatisfied
somewhat dissatisfied
somewhat satisfied (the middle case = Median)
somewhat satisfied
very satisfied
very satisfied
Summary :Locating the Median with N=Odd

The median is the response associated with the middle case.
You find the middle case by: (N + 1) / 2
Since N = 7, the middle case is the (7 + 1) / 2, or the 4th case.
The response associated with the 4th case is somewhat satisfied.
Therefore the median is: somewhat satisfied.
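The steps for an odd N can be sketched as follows. The `SCALE` list is an assumption that spells out the ordering of the four response categories; the other names are illustrative.

```python
# Ordinal scale from lowest to highest satisfaction (assumed ordering).
SCALE = ["very dissatisfied", "somewhat dissatisfied",
         "somewhat satisfied", "very satisfied"]

responses = ["very dissatisfied", "very satisfied", "somewhat satisfied",
             "very dissatisfied", "somewhat dissatisfied",
             "somewhat satisfied", "very satisfied"]

ordered = sorted(responses, key=SCALE.index)   # sort by rank, not alphabetically
n = len(ordered)
assert n % 2 == 1, "this sketch covers only odd N"
middle_case = (n + 1) // 2                     # 1-based position of the middle case
median_response = ordered[middle_case - 1]     # convert to a 0-based index
print(middle_case, median_response)            # 4 somewhat satisfied
```

Sorting by position in the scale matters here because an alphabetical sort would not respect the ordinal ranking.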
To locate the median (N=Even)

Suicide rates of cities
7.44, 10.00, 12.26, 12.61, 13.38, 14.11, 14.30, 14.78
The median is located halfway between the two middle cases. When the variable is interval, we can average the two middle cases:
Median = (12.61 + 13.38) / 2 = 12.995
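A short sketch of the even-N case using the eight rates listed above. It averages the two middle cases by hand and compares the result with Python's standard `statistics.median`; variable names are illustrative.

```python
from statistics import median

rates = [7.44, 10.00, 12.26, 12.61, 13.38, 14.11, 14.30, 14.78]

ordered = sorted(rates)
n = len(ordered)
lower_mid = ordered[n // 2 - 1]          # 4th case
upper_mid = ordered[n // 2]              # 5th case
by_hand = (lower_mid + upper_mid) / 2    # halfway between the middle cases
print(by_hand, median(rates))
```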
Median
Robust measure of central tendency.
Not affected by extreme values.
[Figure: two number-line plots, one running from 0 to 10 and one extending to 14; Median = 5 in both]
In an ordered array, the median is the middle number

If n or N is odd, the median is the middle number.
If n or N is even, the median is the average of the two middle numbers.
Median for grouped data

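Age of Automobiles    Frequency    Cumulative Frequency
0-4                   13           13
4-8                   29           42
8-12                  48           90     (median class)
12-16                 22           112
16-20                 8            120
Total                 120

$\text{Med} = L + \dfrac{(n/2) - cf}{f} \times h$
where $L$ is the lower limit of the median class, $cf$ is the cumulative frequency of the class before it, $f$ is the frequency of the median class, and $h$ is the class width.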
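As a worked check of the formula on the automobile table above, taking n as the total frequency of 120 and the class width as 4:

```latex
\[
\frac{n}{2} = \frac{120}{2} = 60
\;\Rightarrow\; \text{median class } 8\text{-}12,\quad
L = 8,\quad cf = 42,\quad f = 48,\quad h = 4
\]
\[
\text{Med} = 8 + \frac{60 - 42}{48} \times 4 = 8 + 1.5 = 9.5
\]
```

The median class is 8-12 because it is the first class whose cumulative frequency (90) reaches 60.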
Partition Values: Quartiles, Deciles, and Percentiles
Quartiles divide an ordered data set into 4 equal parts. The 2nd quartile is the median.
Deciles divide an ordered data set into 10 equal parts. The 5th decile is the median.
Percentiles divide an ordered data set into 100 equal parts. The 50th percentile is the median.
Quartile for grouped data: $Q_i = L + \dfrac{i(n/4) - cf}{f} \times h$; $i = 1, 2, 3$
Decile for grouped data: $D_i = L + \dfrac{i(n/10) - cf}{f} \times h$; $i = 1, 2, \ldots, 9$
Percentile for grouped data: $P_i = L + \dfrac{i(n/100) - cf}{f} \times h$; $i = 1, 2, \ldots, 99$
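The three formulas differ only in the number of parts, so one helper can cover them all. The sketch below is an illustration, not a library routine: `grouped_partition` and its parameters are hypothetical names, and the data are the automobile age classes from the grouped-median table.

```python
def grouped_partition(bounds, freqs, i, parts):
    """i-th partition value when the data are split into `parts` equal parts.

    bounds: class limits, e.g. [0, 4, 8] for classes 0-4 and 4-8
    freqs:  class frequencies, one per class
    """
    n = sum(freqs)
    target = i * n / parts          # i(n/4), i(n/10) or i(n/100)
    cf = 0                          # cumulative frequency before the class
    for k, f in enumerate(freqs):
        if f > 0 and cf + f >= target:
            lower = bounds[k]
            width = bounds[k + 1] - bounds[k]
            return lower + (target - cf) / f * width
        cf += f
    raise ValueError("i must lie between 0 and parts")

bounds = [0, 4, 8, 12, 16, 20]      # Age of Automobiles classes
freqs = [13, 29, 48, 22, 8]

q1 = grouped_partition(bounds, freqs, 1, 4)       # first quartile
q2 = grouped_partition(bounds, freqs, 2, 4)       # second quartile
d5 = grouped_partition(bounds, freqs, 5, 10)      # fifth decile
p50 = grouped_partition(bounds, freqs, 50, 100)   # fiftieth percentile
print(round(q1, 2), q2, d5, p50)                  # 6.34 9.5 9.5 9.5
```

The second quartile, fifth decile and fiftieth percentile all return the same value, which matches the statement that each of them is the median.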
Objectives of an Average
Determine one single value that may be used to describe the characteristics of the entire series.
Facilitate comparison at a particular point of time.
Facilitate statistical inference.
Help in the decision-making process.
The Mean
Mean: the arithmetic average obtained by adding up all the scores and dividing by the total number of scores.
$\bar{Y} = \dfrac{\sum Y}{N}$
where
$Y$ = the raw scores of the variable y
$\bar{Y}$ = the mean of y
$\sum Y$ = the sum of all the y scores
$N$ = the number of observations
Crime Rate in Indian Cities:Finding the Mean

CITY          Crime Rate per 1000
Mumbai        29.3
Delhi         28.9
Kolkatta      32.9
Chennai       36.5
Banglore      25
Hyderabad     14.7
Baroda        58.4
Chandigarh    48.8
Meerut        12.8
Bhopal        21.8
Honolulu      3.4
Jaipur        6.6
Patna         40.6
Kanpur        12.9
Ajmer         19.8
Total         392.4

$\bar{Y} = \dfrac{\sum Y}{N} = \dfrac{392.4}{15} = 26.16$
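The same calculation can be reproduced in a few lines. This is a minimal sketch using the 15 crime rates listed above; the variable names are illustrative.

```python
# Crime rate per 1000 for the 15 cities, in the order listed above
crime_rates = [29.3, 28.9, 32.9, 36.5, 25, 14.7, 58.4, 48.8,
               12.8, 21.8, 3.4, 6.6, 40.6, 12.9, 19.8]

total = sum(crime_rates)        # sum of all the Y scores
n = len(crime_rates)            # number of observations
mean_rate = total / n
print(n, round(total, 1), round(mean_rate, 2))   # 15 392.4 26.16
```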
Sample statistic: a numerical value, computed from the data of a sample, that is used as a summary measure for estimation or hypothesis testing.
Population parameter: a numerical value, computed from the data of the population, that is used as a summary measure.
Mean (Arithmetic Mean)

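Mean (arithmetic mean) of data values
Sample mean (n = sample size):
$\bar{X} = \dfrac{\sum_{i=1}^{n} X_i}{n} = \dfrac{X_1 + X_2 + \cdots + X_n}{n}$
Population mean (N = population size):
$\mu = \dfrac{\sum_{i=1}^{N} X_i}{N} = \dfrac{X_1 + X_2 + \cdots + X_N}{N}$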
Arithmetic mean of ungrouped raw data

Direct method
Indirect method (short-cut method)
Finding the mean in a frequency distribution

When data are arranged in a frequency distribution, we must give each score its proper weight by multiplying it by its frequency. We use the following formula to calculate the mean:
$\bar{Y} = \dfrac{\sum fY}{N}$
where
$\bar{Y}$ = the mean
$fY$ = a score multiplied by its frequency
$\sum fY$ = the sum of all the fYs
$N$ = the total number of cases in the distribution
Calculating the Mean from a Frequency Distribution

# of Children (Y)    Frequency (f)    Frequency*Y (fY)
0                    12               0
1                    25               25
2                    733              1466
3                    333              999
4                    183              732
5                    26               130
6                    15               90
7                    12               84
Total                1339             3526

$\bar{Y} = \dfrac{\sum fY}{N} = \dfrac{3526}{1339} \approx 2.6$
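A minimal sketch of the frequency-distribution calculation, using the number-of-children table above. The dictionary layout is just one convenient way to hold the table; the names are illustrative.

```python
# Frequency table: number of children (Y) -> frequency (f)
freq = {0: 12, 1: 25, 2: 733, 3: 333, 4: 183, 5: 26, 6: 15, 7: 12}

n = sum(freq.values())                         # total number of cases
sum_fy = sum(y * f for y, f in freq.items())   # sum of all the fYs
mean_children = sum_fy / n
print(n, sum_fy, round(mean_children, 1))      # 1339 3526 2.6
```

Multiplying each score by its frequency gives the same total as listing all 1339 scores individually and adding them up.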
Weighted Mean
Used when values are grouped by frequency or relative importance

Example: Sample of 26 Repair Projects
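Days to Complete    Frequency
5                   4
6                   12
7                   8
8                   2

Weighted Mean Days to Complete:
$\bar{X}_W = \dfrac{\sum w_i X_i}{\sum w_i} = \dfrac{(4 \times 5) + (12 \times 6) + (8 \times 7) + (2 \times 8)}{4 + 12 + 8 + 2} = \dfrac{164}{26} \approx 6.31 \text{ days}$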
Indirect method
The human resource manager at a city hospital began a study of the overtime hours of the registered nurses. Fifteen nurses were selected at random and the following overtime hours were recorded during a month:
13 13 12 15 17 15 5 12 6 7 12 10 9 13 12 5 9 6 10 5 6 9 6 9 12
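A sketch of the indirect (short-cut) method on the overtime values exactly as listed above, which number 25. The method picks an assumed mean A, averages the deviations d = x - A, and adds that average back to A. The choice A = 10 is an arbitrary assumption made here for illustration; any value gives the same mean.

```python
# Overtime hours as listed above (25 values)
hours = [13, 13, 12, 15, 17, 15, 5, 12, 6, 7, 12, 10, 9,
         13, 12, 5, 9, 6, 10, 5, 6, 9, 6, 9, 12]

A = 10                                     # assumed mean (arbitrary choice)
deviations = [x - A for x in hours]        # d = x - A
n = len(hours)
mean_shortcut = A + sum(deviations) / n    # indirect method
mean_direct = sum(hours) / n               # direct method, for comparison
print(sum(deviations), mean_shortcut, mean_direct)
```

Both methods agree; the short-cut form simply keeps the numbers being summed small, which helps when working by hand.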
Arithmetic mean of grouped (classified) data (Direct & Step deviation method)
The following distribution gives the pattern of overtime work done by 100 employees of a company. Calculate the average overtime work done per employee.

Overtime    No. of Employees (f)    Mid Value (m)    d = (m - A)/5    fd
10-15       11                      12.5             -2               -22
15-20       20                      17.5             -1               -20
20-25       35                      22.5             0                0
25-30       20                      27.5             1                20
30-35       8                       32.5             2                16
35-40       6                       37.5             3                18
Total       100                                                       12

Here A = 22.5, the mid value of the class for which d = 0.
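A sketch of the step deviation method on the overtime distribution above, with the direct method alongside as a check. It assumes the standard form of the method, mean = A + (sum of fd / N) x h, with A = 22.5 and h = 5 as in the table; variable names are illustrative.

```python
classes = [(10, 15), (15, 20), (20, 25), (25, 30), (30, 35), (35, 40)]
freqs = [11, 20, 35, 20, 8, 6]

h = 5                                            # class width
mids = [(lo + hi) / 2 for lo, hi in classes]     # mid values m
A = 22.5                                         # assumed mean
d = [(m - A) / h for m in mids]                  # step deviations
n = sum(freqs)
sum_fd = sum(f * di for f, di in zip(freqs, d))

mean_step = A + sum_fd / n * h                   # step deviation method
mean_direct = sum(f * m for f, m in zip(freqs, mids)) / n   # direct method
print(sum_fd, mean_step, mean_direct)
```

Both routes give an average of about 23.1 per employee.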
Geometric Mean
The geometric mean of a set of n numbers is defined as the nth root of the product of the n numbers. It is used to average percents, indexes, and relatives. The formula is:
$\bar{X}_G = \sqrt[n]{X_1 \times X_2 \times \cdots \times X_n}$, with $X_i > 0$
It measures the change over more than one period more directly.
Geometric Mean $\le$ Arithmetic Mean
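A small sketch of the geometric mean definition. The growth factors are hypothetical values invented for illustration; they do not come from the slides. The hand calculation is compared with Python's standard `statistics.geometric_mean`.

```python
from math import prod
from statistics import geometric_mean

# Hypothetical growth factors for three periods (illustration only)
factors = [1.10, 1.25, 0.96]

n = len(factors)
gm_by_hand = prod(factors) ** (1 / n)   # nth root of the product
gm_library = geometric_mean(factors)
am = sum(factors) / n                   # arithmetic mean, for comparison
print(gm_by_hand, gm_library, am)
```

For these factors the geometric mean comes out below the arithmetic mean, as expected for positive values that are not all equal.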
Relationship between Mean, Median and Mode

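For moderately skewed distributions, the following empirical relationship holds approximately:
$M_0 = 3 \times \text{Median} - 2 \times \text{Mean}$
OR
$\text{Mean} - M_0 = 3\,(\text{Mean} - \text{Median})$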
The Shape of Distributions
Distributions can be either symmetrical or skewed, depending on whether there are more frequencies at one end of the distribution than the other.
Symmetrical Distributions
A distribution is symmetrical if the frequencies at the right and left tails of the distribution are identical, so that if it is divided into two halves, each will be the mirror image of the other.
In a unimodal symmetrical distribution the mean, median, and mode are identical.
1.4. Shape of a Distribution

Describes how data is distributed.
Symmetric or skewed:

Left-Skewed                      Symmetric               Right-Skewed
Mean < Median < Mode             Mean = Median = Mode    Mode < Median < Mean
(Longer tail extends to left)                            (Longer tail extends to right)
Choosing a Measure of Central Tendency

IF variable is Nominal: Mode
IF variable is Ordinal: Mode or Median (or both)
IF variable is Interval-Ratio and distribution is Symmetrical: Mode, Median or Mean
IF variable is Interval-Ratio and distribution is Skewed: Mode or Median
Calculate the mean, median and mode for the following data pertaining to marks in statistics. There are 80 students in the class and the test is of 140 marks.

Marks more than    No. of Students
0                  80
20                 76
40                 50
60                 28
80                 18
100                9
120                3
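A possible first step for this exercise is to turn the "more than" cumulative counts into ordinary class frequencies, since the grouped-data formulas work on class frequencies. The sketch below assumes the last class is closed at the test maximum of 140 marks; variable names are illustrative.

```python
more_than = [0, 20, 40, 60, 80, 100, 120]    # "marks more than"
students = [80, 76, 50, 28, 18, 9, 3]        # cumulative counts
max_marks = 140                              # assumed upper limit of the last class

uppers = more_than[1:] + [max_marks]
following = students[1:] + [0]               # nobody scores above the maximum
class_freqs = [a - b for a, b in zip(students, following)]

for lo, hi, f in zip(more_than, uppers, class_freqs):
    print(f"{lo}-{hi}: {f}")
```

The resulting frequencies sum to 80, matching the class size, and can then be fed into the grouped mean, median and mode formulas.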
