
# Properties describing quantitative data

Three properties describe a set of quantitative data:

- Central tendency: the numerical value of an observation around which most of the other values in the data set tend to cluster or group.
- Variation: the extent to which values are dispersed around the central value.
- Skewness: the extent to which the values depart from a symmetrical distribution around the central value.

## Requisites of a measure of central tendency

A measure of central tendency should:

- be rigidly defined;
- be based on all the observations;
- be easy to understand and calculate;
- have sampling stability;
- not be unduly affected by extreme observations.

## MEASURES OF CENTRAL TENDENCY

- Averages of position: the mode, the median
- Mathematical averages: the mean
- Distribution shapes considered: symmetrical, positively skewed, negatively skewed

Mode

- A measure of central tendency
- The value that occurs most often
- Not affected by extreme values
- Used for either numerical or categorical data
- There may be no mode or several modes

[Figure: two dot plots. In the first (axis 0 to 14), Mode = 9; in the second (axis 0 to 6), there is no mode.]

The mode is a measure of location identified by the most frequently occurring value in a set of data.

Sales during a 20-day period (in ascending order): 53, 56, 57, 58, 58, 60, 61, 63, 63, 64, 64, 65, 65, 67, 68, 71, 71, 71, 71, 74

Mode = 71, the value that occurs most often (four times).
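As a quick check of the sales example, the most frequent value can be found with Python's standard library. This is only a sketch; the variable names are illustrative.

```python
from statistics import multimode

# Sales over the 20-day period, as listed in the text
sales = [53, 56, 57, 58, 58, 60, 61, 63, 63, 64, 64, 65, 65, 67,
         68, 71, 71, 71, 71, 74]

# multimode returns every value tied for the highest frequency,
# so a bimodal data set would yield two entries
modes = multimode(sales)
print(modes)
```

Because `multimode` returns a list, the same call also covers data sets with several modes.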

## Mode for frequency distribution

| Sales Volume (Class Interval) | No. of Days (Frequency) |
|---|---|
| 53-56 | 2 |
| 57-60 | 4 |
| 61-64 | 5 |
| 65-68 | 4 |
| 69-72 | 4 |
| 72 and above | 1 |

## Mode: The Category or Score with the Largest Frequency (or %)

- The mode is always a category or score.
- The mode is not necessarily the category with the majority (more than 50% of the cases).
- The mode is the only measure of central tendency for nominal variables.
- Some distributions are bimodal.

## THE MEDIAN measuring qualitative characters

The median is a measure of central tendency for variables which are at least ordinal. The median represents the exact middle of a distribution.
It is the score that divides the distribution into two equal parts

Finding the median in sorted data

How satisfied are you with your health insurance? Responses of 7 individuals:

1. very dissatisfied
2. very satisfied
3. somewhat satisfied
4. very dissatisfied
5. somewhat dissatisfied
6. somewhat satisfied
7. very satisfied

Total (N) = 7

## To locate the median

Arrange the responses in order from lowest to highest (or highest to lowest):

| Case | Response |
|---|---|
| 1 | very dissatisfied |
| 2 | very dissatisfied |
| 3 | somewhat dissatisfied |
| 4 | somewhat satisfied (the middle case = median) |
| 5 | somewhat satisfied |
| 6 | very satisfied |
| 7 | very satisfied |

## Summary: Locating the Median with N = Odd

The median is the response associated with the middle case. You find the middle case by $(N + 1)/2$. Since $N = 7$, the middle case is the $(7 + 1)/2$, or the 4th, case. The response associated with the 4th case is "somewhat satisfied". Therefore the median is: somewhat satisfied.
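The odd-N procedure can be sketched in a few lines of Python. The four-level `scale` list is an assumption drawn from the response categories that appear in the example; it is used only to rank the answers.

```python
# Assumed ordinal scale, lowest to highest (taken from the categories in the example)
scale = ["very dissatisfied", "somewhat dissatisfied",
         "somewhat satisfied", "very satisfied"]

# The seven survey responses, in the order they were collected
responses = ["very dissatisfied", "very satisfied", "somewhat satisfied",
             "very dissatisfied", "somewhat dissatisfied",
             "somewhat satisfied", "very satisfied"]

ordered = sorted(responses, key=scale.index)   # rank each response on the scale
n = len(ordered)
middle_case = (n + 1) // 2                     # valid when N is odd: (7 + 1) / 2 = 4
median_response = ordered[middle_case - 1]     # 1-based case number to 0-based index
print(middle_case, median_response)
```

Sorting by position on the scale, rather than alphabetically, is what makes the median meaningful for an ordinal variable.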

## To locate the median (N=Even)

Suicide rates of cities:
7.44, 10.00, 12.26, 12.61, 13.38, 14.11, 14.30, 14.78

The median is located halfway between the two middle cases. When the variable is interval, we can average the two middle cases:

$$\text{Median} = \frac{12.61 + 13.38}{2} = 12.995 \quad \text{(12.99 when truncated to two decimals)}$$
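For an even number of cases, the averaging step can be checked as follows. This is a small sketch using the eight rates listed; `statistics.median` applies the same rule and is shown only for comparison.

```python
from statistics import median

rates = [7.44, 10.00, 12.26, 12.61, 13.38, 14.11, 14.30, 14.78]

ordered = sorted(rates)
n = len(ordered)                                   # N = 8, an even number
lower, upper = ordered[n // 2 - 1], ordered[n // 2]  # the 4th and 5th cases
manual = (lower + upper) / 2                       # halfway between the two middle cases
print(lower, upper, manual, median(rates))
```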

Median

- Robust measure of central tendency
- Not affected by extreme values

[Figure: two dot plots, one on an axis from 0 to 10 and one on an axis extended to 14 by extreme values; Median = 5 in both.]

## In an ordered array, the median is the middle number

- If n or N is odd, the median is the middle number.
- If n or N is even, the median is the average of the two middle numbers.

## Median for grouped data

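| Age of Automobiles | Frequency | Cumulative Frequency |
|---|---|---|
| 0-4 | 13 | 13 |
| 4-8 | 29 | 42 |
| 8-12 (median class) | 48 | 90 |
| 12-16 | 22 | 112 |
| 16-20 | 8 | 120 |
| Total | 120 | |

$$\text{Med} = L + \frac{(n/2) - cf}{f} \times h$$

where $L$ is the lower boundary of the median class, $n$ the total frequency, $cf$ the cumulative frequency of the class preceding the median class, $f$ the frequency of the median class, and $h$ the class width.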
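The grouped-data formula can be applied to the automobile table with a short function. This is a sketch that assumes equal class widths; `grouped_median` is a name introduced here for illustration.

```python
def grouped_median(lower_bounds, freqs, width):
    """Median of grouped data: L + ((n/2 - cf) / f) * h, assuming equal class widths."""
    n = sum(freqs)
    cum = 0  # cumulative frequency of the classes before the current one (cf)
    for L, f in zip(lower_bounds, freqs):
        if cum + f >= n / 2:          # first class whose cumulative frequency reaches n/2
            return L + (n / 2 - cum) / f * width
        cum += f
    raise ValueError("no observations")

# Age of automobiles: classes 0-4, 4-8, 8-12, 12-16, 16-20
lower_bounds = [0, 4, 8, 12, 16]
freqs = [13, 29, 48, 22, 8]

med = grouped_median(lower_bounds, freqs, width=4)
print(med)  # 8 + (60 - 42) / 48 * 4
```

With n/2 = 60 the median class is 8-12, so L = 8, cf = 42, f = 48 and h = 4, giving a median of 9.5.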

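Partition Values: Quartiles, Deciles, and Percentiles

- Quartiles divide an ordered data set into 4 equal parts; the 2nd quartile is the median.
- Deciles divide an ordered data set into 10 equal parts; the 5th decile is the median.
- Percentiles divide an ordered data set into 100 equal parts; the 50th percentile is the median.

The formulas follow the same pattern as the grouped median, with $L$, $cf$ and $f$ now referring to the class that contains the required partition value.

Quartile for grouped data:

$$Q_i = L + \frac{i(n/4) - cf}{f} \times h, \quad i = 1, 2, 3$$

Decile for grouped data:

$$D_i = L + \frac{i(n/10) - cf}{f} \times h, \quad i = 1, 2, \ldots, 9$$

Percentile for grouped data:

$$P_i = L + \frac{i(n/100) - cf}{f} \times h, \quad i = 1, 2, \ldots, 99$$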
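Since quartiles, deciles and percentiles differ only in the number of parts, one function can cover all three. The sketch below reuses the automobile age distribution and assumes equal class widths; `partition_value` and its parameters are illustrative names, not part of the original material.

```python
def partition_value(lower_bounds, freqs, width, i, parts):
    """i-th partition value: L + ((i * n / parts - cf) / f) * h, equal class widths assumed."""
    n = sum(freqs)
    target = i * n / parts   # position of the required value, e.g. i * n / 4 for quartiles
    cum = 0                  # cumulative frequency before the current class (cf)
    for L, f in zip(lower_bounds, freqs):
        if f > 0 and cum + f >= target:
            return L + (target - cum) / f * width
        cum += f
    raise ValueError("no observations")

lower_bounds = [0, 4, 8, 12, 16]     # classes 0-4, 4-8, 8-12, 12-16, 16-20
freqs = [13, 29, 48, 22, 8]

q1 = partition_value(lower_bounds, freqs, 4, i=1, parts=4)
q2 = partition_value(lower_bounds, freqs, 4, i=2, parts=4)
d5 = partition_value(lower_bounds, freqs, 4, i=5, parts=10)
p50 = partition_value(lower_bounds, freqs, 4, i=50, parts=100)
print(q1, q2, d5, p50)
```

The second quartile, fifth decile and fiftieth percentile all come out equal, which matches the statement that each of them is the median.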


Objectives of an Average

- Determine one single value that may be used to describe the characteristics of the entire series.
- Facilitate comparison at a particular point of time.
- Facilitate statistical inference.
- Help in the decision-making process.

The Mean
Mean: the arithmetic average obtained by adding up all the scores and dividing by the total number of scores.

$$\bar{Y} = \frac{\sum Y}{N}$$

where

- $Y$ = the raw scores of the variable $Y$
- $\bar{Y}$ = the mean of $Y$
- $\sum Y$ = the sum of all the $Y$ scores
- $N$ = the number of observations

## Crime Rate in Indian Cities: Finding the Mean

| City | Crime rate per 1000 |
|---|---|
| Mumbai | 29.3 |
| Delhi | 28.9 |
| Kolkata | 32.9 |
| Chennai | 36.5 |
| Bangalore | 25 |
| Hyderabad | 14.7 |
| Baroda | 58.4 |
| Chandigarh | 48.8 |
| Meerut | 12.8 |
| Bhopal | 21.8 |
| Honolulu | 3.4 |
| Jaipur | 6.6 |
| Patna | 40.6 |
| Kanpur | 12.9 |
| Ajmer | 19.8 |
| Total | 392.4 |

$$\bar{Y} = \frac{\sum Y}{N} = \frac{392.4}{15} = 26.16$$
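The arithmetic in this example is easy to verify. A minimal sketch, with the fifteen rates entered in the order given:

```python
# Crime rates per 1000 for the 15 cities, in the order listed
rates = [29.3, 28.9, 32.9, 36.5, 25, 14.7, 58.4, 48.8,
         12.8, 21.8, 3.4, 6.6, 40.6, 12.9, 19.8]

total = sum(rates)     # sum of all the Y scores
n = len(rates)         # N, the number of observations
mean = total / n
print(round(total, 1), n, round(mean, 2))
```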

Sample statistic: a numerical value used as a summary measure, computed from the data of a sample and used for estimation or hypothesis testing.

Population parameter: a numerical value used as a summary measure, computed from the data of the whole population.

## Mean (Arithmetic Mean)

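The mean (arithmetic mean) of a set of data values:

Sample mean ($n$ = sample size):

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$

Population mean ($N$ = population size):

$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$$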

## Arithmetic mean of ungrouped raw data

- Direct method
- Indirect method (short-cut method)

## Finding the mean in a frequency distribution

When data are arranged in a frequency distribution, we must give each score its proper weight by multiplying it by its frequency. We use the following formula to calculate the mean:

$$\bar{Y} = \frac{\sum fY}{N}$$

where

- $\bar{Y}$ = the mean
- $fY$ = a score multiplied by its frequency
- $\sum fY$ = the sum of all the $fY$s
- $N$ = the total number of cases in the distribution

## Calculating the Mean from a Frequency Distribution

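| No. of Children (Y) | Frequency (f) | Frequency × Y (fY) |
|---|---|---|
| 0 | 12 | 0 |
| 1 | 25 | 25 |
| 2 | 733 | 1466 |
| 3 | 333 | 999 |
| 4 | 183 | 732 |
| 5 | 26 | 130 |
| 6 | 15 | 90 |
| 7 | 12 | 84 |
| Total | 1339 | 3526 |

$$\bar{Y} = \frac{\sum fY}{N} = \frac{3526}{1339} \approx 2.63$$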
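The frequency-table calculation can be reproduced directly from the two columns. A minimal sketch; the list names are illustrative.

```python
# Number of children (Y) and the frequency (f) of each score
children = [0, 1, 2, 3, 4, 5, 6, 7]
freq = [12, 25, 733, 333, 183, 26, 15, 12]

n = sum(freq)                                            # N, total number of cases
sum_fy = sum(f * y for f, y in zip(freq, children))      # sum of all the fY products
mean_children = sum_fy / n
print(n, sum_fy, round(mean_children, 2))
```

Multiplying each score by its frequency gives the same total as listing all 1339 cases individually, which is why the column totals are enough to find the mean.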

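## Weighted Mean

Used when values are grouped by frequency or relative importance.

Example: Sample of 26 Repair Projects

| Days to Complete | Frequency |
|---|---|
| 5 | 4 |
| 6 | 12 |
| 7 | 8 |
| 8 | 2 |

$$\bar{X}_W = \frac{\sum w_i x_i}{\sum w_i} = \frac{(4 \times 5) + (12 \times 6) + (8 \times 7) + (2 \times 8)}{4 + 12 + 8 + 2} = \frac{164}{26} = 6.31 \text{ days}$$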

Indirect method

The human resource manager at a city hospital began a study of the overtime hours of the registered nurses. Fifteen nurses were selected at random and the following overtime hours were recorded during a month:

13 13 12 15 17 15 5 12 6 7 12 10 9 13 12 5 9 6 10 5 6 9 6 9 12
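The short-cut (assumed mean) method takes deviations from a convenient value A and corrects it by their average: mean = A + Σ(x − A)/n. The sketch below uses the overtime values exactly as listed (25 of them) and an assumed mean of A = 10, which is an arbitrary choice made here for illustration.

```python
# Overtime hours exactly as listed in the text
hours = [13, 13, 12, 15, 17, 15, 5, 12, 6, 7, 12, 10, 9, 13, 12,
         5, 9, 6, 10, 5, 6, 9, 6, 9, 12]

A = 10                                    # assumed mean (illustrative choice)
deviations = [x - A for x in hours]       # d = x - A for every observation
mean_shortcut = A + sum(deviations) / len(hours)

mean_direct = sum(hours) / len(hours)     # direct method, for comparison
print(sum(deviations), mean_shortcut, mean_direct)
```

Any choice of A gives the same mean; a value close to the centre of the data simply keeps the deviations small.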

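Arithmetic mean of grouped (classified) data (Direct & Step-deviation method)

The following distribution gives the pattern of overtime work done by 100 employees of a company. Calculate the average overtime work done per employee.

| Overtime | No. of Employees (f) | Mid Value (m) | d = (m − A)/5 | fd |
|---|---|---|---|---|
| 10-15 | 11 | 12.5 | -2 | -22 |
| 15-20 | 20 | 17.5 | -1 | -20 |
| 20-25 | 35 | 22.5 | 0 | 0 |
| 25-30 | 20 | 27.5 | 1 | 20 |
| 30-35 | 8 | 32.5 | 2 | 16 |
| 35-40 | 6 | 37.5 | 3 | 18 |
| Total | 100 | | | 12 |

Here A = 22.5 is the assumed mean (the mid value for which d = 0) and 5 is the class width.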
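Both methods can be carried through on the overtime table. The step-deviation formula used here, mean = A + (Σfd / Σf) × h, is the standard one; it is written out as a sketch with A = 22.5 and h = 5 read off the table.

```python
# Mid values (m) and number of employees (f) for classes 10-15 ... 35-40
mid = [12.5, 17.5, 22.5, 27.5, 32.5, 37.5]
freq = [11, 20, 35, 20, 8, 6]

A, h = 22.5, 5                                  # assumed mean and class width
d = [(m - A) / h for m in mid]                  # step deviations: -2, -1, 0, 1, 2, 3
fd = [f * di for f, di in zip(freq, d)]         # -22, -20, 0, 20, 16, 18

n = sum(freq)
mean_step = A + sum(fd) / n * h                 # step-deviation method
mean_direct = sum(f * m for f, m in zip(freq, mid)) / n   # direct method
print(sum(fd), mean_step, mean_direct)
```

Both routes give an average of 23.1 per employee; the step-deviation route only keeps the intermediate numbers small.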

Geometric Mean
The geometric mean of a set of n numbers is defined as the nth root of the product of the n numbers. It is used to average percents, indexes, and relatives. The formula (for $X_i > 0$) is:

$$\bar{X}_G = \sqrt[n]{X_1 X_2 \cdots X_n}$$

It more directly measures the change over more than one period, and Geometric Mean ≤ Arithmetic Mean.
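A short sketch shows why the geometric mean suits multi-period change. The three growth factors below are hypothetical values chosen for illustration (they do not come from the text): +10%, +20% and −10% in successive periods.

```python
import math
from statistics import geometric_mean

# Hypothetical period-over-period growth factors (illustrative only)
growth = [1.10, 1.20, 0.90]

n = len(growth)
gm = math.prod(growth) ** (1 / n)      # nth root of the product of the n numbers
am = sum(growth) / n                   # arithmetic mean, for comparison

# Applying the geometric mean n times reproduces the overall change
overall = math.prod(growth)
print(gm, am, gm ** n, overall, geometric_mean(growth))
```

Compounding the geometric mean over the three periods recovers the overall change exactly, whereas the arithmetic mean overstates it; this is the sense in which the geometric mean never exceeds the arithmetic mean.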

## Relationship between Mean, Median and Mode

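$$M_0 = 3\,\text{Median} - 2\,\text{Mean}$$

where $M_0$ is the mode, or equivalently

$$\text{Mean} - M_0 = 3(\text{Mean} - \text{Median})$$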

## The Shape of Distributions

Distributions can be either symmetrical or skewed, depending on whether there are more frequencies at one end of the distribution than the other.

Symmetrical Distributions

A distribution is symmetrical if the frequencies at the right and left tails of the distribution are identical, so that if it is divided into two halves, each will be the mirror image of the other.
In a unimodal symmetrical distribution the mean, median, and mode are identical.

Shape describes how data is distributed: symmetric or skewed.

| Left-Skewed | Symmetric | Right-Skewed |
|---|---|---|
| Mean < Median < Mode | Mean = Median = Mode | Mode < Median < Mean |
| Longer tail extends to left | | Longer tail extends to right |
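The ordering of mean and median gives a rough indication of the direction of skew. The sketch below applies that rule of thumb to the crime-rate data from the mean example; `skew_direction` is a name introduced here, and equality of mean and median is only suggestive of symmetry, not proof of it.

```python
from statistics import mean, median

def skew_direction(values):
    """Rule-of-thumb skew check: compare the mean with the median."""
    m, med = mean(values), median(values)
    if m > med:
        return "right-skewed"   # mean pulled toward the longer right tail
    if m < med:
        return "left-skewed"    # mean pulled toward the longer left tail
    return "symmetric"

# Crime rates per 1000 from the mean example
rates = [29.3, 28.9, 32.9, 36.5, 25, 14.7, 58.4, 48.8,
         12.8, 21.8, 3.4, 6.6, 40.6, 12.9, 19.8]

print(median(rates), skew_direction(rates))
```

For these rates the median is 25 and the mean is 26.16, so the rule points to a mild right skew.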

## Choosing a Measure of Central Tendency

- If the variable is nominal: Mode
- If the variable is ordinal: Mode or Median (or both)
- If the variable is interval-ratio and the distribution is skewed: Mode or Median

Calculate the mean, median and mode for the following data pertaining to marks in statistics. There are 80 students in the class and the test is of 140 marks.

| Marks more than | No. of Students |
|---|---|
| 0 | 80 |
| 20 | 76 |
| 40 | 50 |
| 60 | 28 |
| 80 | 18 |
| 100 | 9 |
| 120 | 3 |
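One possible way to work this exercise is sketched below. It converts the "more than" cumulative counts into class frequencies, then uses the grouped mean and grouped median formulas and estimates the mode from the empirical relation Mode = 3 Median − 2 Mean. Two assumptions are made: the last class is taken as 120-140 because the test is out of 140, and the mode is only approximated through the empirical relation rather than computed from the modal class.

```python
lower = [0, 20, 40, 60, 80, 100, 120]      # lower class limits
more_than = [80, 76, 50, 28, 18, 9, 3]     # students scoring more than each limit
width = 20                                 # class width (last class assumed 120-140)

# Class frequency = students above this limit minus students above the next one
freqs = [a - b for a, b in zip(more_than, more_than[1:] + [0])]
n = sum(freqs)

# Grouped mean from class mid values
mids = [L + width / 2 for L in lower]
mean_marks = sum(f * m for f, m in zip(freqs, mids)) / n

# Grouped median: L + ((n/2 - cf) / f) * h
cum = 0
for L, f in zip(lower, freqs):
    if cum + f >= n / 2:
        median_marks = L + (n / 2 - cum) / f * width
        break
    cum += f

# Mode estimated from the empirical relation Mode = 3 Median - 2 Mean
mode_estimate = 3 * median_marks - 2 * mean_marks
print(freqs, mean_marks, round(median_marks, 2), round(mode_estimate, 2))
```

The ordering mode < median < mean in the result is consistent with a right-skewed distribution of marks.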