
Properties describing quantitative data

Central tendency: the numerical value of an observation around which most other observations in the data set tend to cluster or group.
Variation: the extent to which values are dispersed around the central value.
Skewness: the extent to which values depart from a symmetrical distribution around the central value.

Requisites of a measure of central tendency


It should be rigidly defined.
It should be based on all the observations.
It should be easy to understand and calculate.
It should have sampling stability.
It should not be unduly affected by extreme observations.

MEASURES OF CENTRAL TENDENCY


Averages of Position
  The Mode
  The Median
Mathematical Averages
  The Mean
The Symmetrical Distribution
The Positively Skewed Distribution
The Negatively Skewed Distribution

Mode
A measure of central tendency
Value that occurs most often
Not affected by extreme values
Used for either numerical or categorical data
There may be no mode or several modes
[Figure: two dot plots on number lines. In the first (axis 0 to 14), Mode = 9; in the second (axis 0 to 6), there is no mode.]

The mode is a measure of location identified by the most frequently occurring value in a set of data.
Sales during a 20-day period (data in ascending order):
53, 56, 57, 58, 58, 60, 61, 63, 63, 64, 64, 65, 65, 67, 68, 71, 71, 71, 71, 74

Mode = 71 (it occurs four times, more often than any other value)
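As a minimal sketch, the mode of the 20-day sales data can be found by counting occurrences. The helper name `modes` is hypothetical; it returns every value tied for the highest frequency, so a bimodal set yields two values.

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest frequency, in first-seen order."""
    counts = Counter(values)
    top = max(counts.values())
    return [v for v, c in counts.items() if c == top]

sales = [53, 56, 57, 58, 58, 60, 61, 63, 63, 64,
         64, 65, 65, 67, 68, 71, 71, 71, 71, 74]
print(modes(sales))  # [71]
```

When every value occurs equally often, the function returns all of them, which corresponds to the "no mode" case.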

Mode for frequency distribution


Frequency distribution of sales per day

Sales Volume (Class Interval)    No. of Days (Frequency)
53-56                            2
57-60                            4
61-64                            5
65-68                            4
69-72                            4
72 and above                     1

Mode: the category or score with the largest frequency (or %)

The mode is always a category or score.
The mode is not necessarily the category with the majority (more than 50% of the cases).
The mode is the only measure of central tendency for nominal variables.
Some distributions are bimodal.

Mode for grouped data:
$$M_0 = L + \frac{f_m - f_{m-1}}{2f_m - f_{m-1} - f_{m+1}} \times h$$
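The grouped-data mode formula can be sketched in code. The reading of the symbols is an assumption, since the slides do not define them: L is the lower boundary of the modal class, f_m its frequency, f_(m-1) and f_(m+1) the frequencies of the neighbouring classes, and h the class width. For the sales table, the sketch also assumes class boundaries of 52.5, 56.5, ... (a continuity correction on the integer limits) and a width of 4. The helper `grouped_mode` is hypothetical.

```python
def grouped_mode(lower, freqs, width):
    """Estimate the mode from equal-width classes.

    lower -- lower boundary of the first class
    freqs -- class frequencies, in class order
    width -- common class width h
    """
    # Index of the modal class (the first one if several classes tie).
    m = max(range(len(freqs)), key=freqs.__getitem__)
    f_m = freqs[m]
    f_prev = freqs[m - 1] if m > 0 else 0
    f_next = freqs[m + 1] if m + 1 < len(freqs) else 0
    L = lower + m * width
    return L + (f_m - f_prev) / (2 * f_m - f_prev - f_next) * width

# Sales-per-day table; boundary 52.5 and width 4 are assumptions.
print(grouped_mode(52.5, [2, 4, 5, 4, 4, 1], 4))  # 62.5
```

The formula is undefined when the modal class and both neighbours have the same frequency, because the denominator is then zero; the sketch does not guard against that case.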

THE MEDIAN measuring qualitative characters

The median is a measure of central tendency for variables which are at least ordinal. The median represents the exact middle of a distribution.
It is the score that divides the distribution into two equal parts

Finding the Median in sorted data
How satisfied are you with your health insurance?
Responses of 7 individuals:
very dissatisfied
very satisfied
somewhat satisfied
very dissatisfied
somewhat dissatisfied
somewhat satisfied
very satisfied

Total (N) = 7

To locate the median


Arrange the responses in order from lowest to highest (or highest to lowest):

Response
very dissatisfied
very dissatisfied
somewhat dissatisfied
somewhat satisfied (the middle case = Median)
somewhat satisfied
very satisfied
very satisfied

Summary: Locating the Median with N = Odd

The median is the response associated with the middle case.
You find the middle case by: (N + 1) / 2
Since N = 7, the middle case is the (7 + 1) / 2, or the 4th, case.
The response associated with the 4th case is "somewhat satisfied".
Therefore the median is: somewhat satisfied.
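The procedure for an odd number of ordinal responses can be sketched as follows. The scale ordering is taken from the survey example; the names `SCALE` and `ordinal_median` are hypothetical, and the sketch deliberately handles only odd N.

```python
# Ordered scale, lowest to highest (from the health-insurance example).
SCALE = ["very dissatisfied", "somewhat dissatisfied",
         "somewhat satisfied", "very satisfied"]

def ordinal_median(responses, scale=SCALE):
    """Median category for an odd number of ordinal responses."""
    n = len(responses)
    if n % 2 == 0:
        raise ValueError("this sketch only handles odd N")
    ordered = sorted(responses, key=scale.index)  # rank by position on the scale
    middle_case = (n + 1) // 2                    # 1-based position of the middle case
    return ordered[middle_case - 1]

responses = ["very dissatisfied", "very satisfied", "somewhat satisfied",
             "very dissatisfied", "somewhat dissatisfied",
             "somewhat satisfied", "very satisfied"]
print(ordinal_median(responses))  # somewhat satisfied
```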

To locate the median (N=Even)


Suicide rates of cities
7.44, 10.00, 12.26, 12.61, 13.38, 14.11, 14.30, 14.78

The median is located halfway between the two middle cases. When the variable is interval, we can average the two middle cases.
Median = (12.61 + 13.38) / 2 = 25.99 / 2 = 12.995 (12.99 when truncated to two decimals)
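A short sketch of the rule for numeric data covers both the odd and the even case. The function is written out by hand for illustration; Python's standard `statistics.median` behaves the same way.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if N is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

rates = [7.44, 10.00, 12.26, 12.61, 13.38, 14.11, 14.30, 14.78]
print(median(rates))  # about 12.995, the average of 12.61 and 13.38
```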

Median
Robust measure of central tendency
Not affected by extreme values

[Figure: two dot plots on number lines, one spanning 0 to 10 and one extended to 14; Median = 5 in both.]

In an ordered array, the median is the middle number.
If n or N is odd, the median is the middle number.
If n or N is even, the median is the average of the two middle numbers.

Median for grouped data


Age of Automobiles    Frequency    Cumulative Frequency
0-4                   13           13
4-8                   29           42
8-12                  48           90     (median class)
12-16                 22           112
16-20                 8            120
Total                 120

$$\text{Med} = L + \frac{(n/2) - cf}{f} \times h$$
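As a worked check of the formula on the automobile table, assume the usual reading of the symbols, which the slides do not spell out: L is the lower limit of the median class, cf the cumulative frequency of the classes before it, f its frequency, and h its width. With n = 120, the position n/2 = 60 falls in the class 8-12, so L = 8, cf = 42, f = 48 and h = 4:

```latex
\[
\text{Med} = 8 + \frac{60 - 42}{48} \times 4 = 8 + 1.5 = 9.5
\]
```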

Partition Values: Quartiles, Deciles, and Percentiles
Quartiles divide an ordered data set into 4 equal parts (2nd Quartile = Median).
Deciles divide an ordered data set into 10 equal parts (5th Decile = Median).
Percentiles divide an ordered data set into 100 equal parts (50th Percentile = Median).

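Quartile for grouped data:
$$Q_i = L + \frac{i(n/4) - cf}{f} \times h, \quad i = 1, 2, 3$$

Decile for grouped data:
$$D_i = L + \frac{i(n/10) - cf}{f} \times h, \quad i = 1, 2, \ldots, 9$$

Percentile for grouped data:
$$P_i = L + \frac{i(n/100) - cf}{f} \times h, \quad i = 1, 2, \ldots, 99$$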
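The three formulas differ only in the number of parts, so one function can sketch all of them. The helper `grouped_partition` and its parameter names are hypothetical; it assumes equal-width classes, takes L as the lower limit of the class containing position i(n/parts), and takes cf as the cumulative frequency before that class. The data are the automobile-age table.

```python
def grouped_partition(lower_limits, freqs, width, i, parts):
    """i-th partition value when the data are split into `parts` equal parts.

    parts=4 gives quartiles, parts=10 deciles, parts=100 percentiles.
    """
    n = sum(freqs)
    target = i * n / parts
    cf = 0  # cumulative frequency of the classes before the current one
    for L, f in zip(lower_limits, freqs):
        if f > 0 and cf + f >= target:
            return L + (target - cf) / f * width
        cf += f
    raise ValueError("target position lies beyond the data")

lower_limits = [0, 4, 8, 12, 16]
freqs = [13, 29, 48, 22, 8]

q2 = grouped_partition(lower_limits, freqs, 4, 2, 4)     # second quartile
d5 = grouped_partition(lower_limits, freqs, 4, 5, 10)    # fifth decile
p50 = grouped_partition(lower_limits, freqs, 4, 50, 100) # fiftieth percentile
print(q2, d5, p50)  # 9.5 9.5 9.5
```

All three calls return the median, which matches the statement that the 2nd quartile, 5th decile and 50th percentile coincide.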


Objectives of an Average
Determine one single value that may be used to describe the characteristics of the entire series.
Facilitate comparison at a particular point of time.
Facilitate statistical inference.
Help in the decision-making process.

The Mean
Mean. The arithmetic average obtained by adding up all the scores and dividing by the total number of scores.

$$\bar{Y} = \frac{\sum Y}{N}$$

where
Y = raw scores of the variable y
$\bar{Y}$ = the mean of y
$\sum Y$ = the sum of all the y scores
N = the number of observations

Crime Rate in Indian Cities: Finding the Mean

CITY          Crime Rate per 1000
Mumbai        29.3
Delhi         28.9
Kolkata       32.9
Chennai       36.5
Bangalore     25
Hyderabad     14.7
Baroda        58.4
Chandigarh    48.8
Meerut        12.8
Bhopal        21.8
Honolulu      3.4
Jaipur        6.6
Patna         40.6
Kanpur        12.9
Ajmer         19.8
Total         392.4

$$\bar{Y} = \frac{\sum Y}{N} = \frac{392.4}{15} = 26.16$$
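The same calculation as a short sketch: sum the 15 crime rates and divide by the number of observations. The variable names are illustrative only.

```python
crime_rates = [29.3, 28.9, 32.9, 36.5, 25, 14.7, 58.4, 48.8,
               12.8, 21.8, 3.4, 6.6, 40.6, 12.9, 19.8]

n = len(crime_rates)      # N, the number of observations
total = sum(crime_rates)  # sum of all Y scores
mean_rate = total / n
print(n, round(total, 1), round(mean_rate, 2))  # 15 392.4 26.16
```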

Sample statistic: a numerical value computed from sample data and used as a summary measure for estimation or hypothesis testing.

Population parameter: a numerical value computed from population data and used as a summary measure.

Mean (Arithmetic Mean)


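Mean (arithmetic mean) of data values

Sample mean (n = sample size):
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$

Population mean (N = population size):
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$$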

Arithmetic mean of ungrouped raw data


Direct method
Indirect method (short-cut method)

Finding the mean in a frequency distribution


When data are arranged in a frequency distribution, we must give each score its proper weight by multiplying it by its frequency. We use the following formula to calculate the mean:

$$\bar{Y} = \frac{\sum fY}{N}$$

where
$\bar{Y}$ = the mean
fY = a score multiplied by its frequency
$\sum fY$ = the sum of all the fY's
N = the total number of cases in the distribution

Calculating the Mean from a Frequency Distribution


# of Children (Y)    Frequency (f)    Frequency*Y (fY)
0                    12               0
1                    25               25
2                    733              1466
3                    333              999
4                    183              732
5                    26               130
6                    15               90
7                    12               84
Total                1339             3526

$$\bar{Y} = \frac{\sum fY}{N} = \frac{3526}{1339} \approx 2.6$$
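A brief sketch of the frequency-distribution mean, using the number-of-children table. Each score is weighted by its frequency before summing; the variable names are illustrative.

```python
children = [0, 1, 2, 3, 4, 5, 6, 7]                 # Y
frequency = [12, 25, 733, 333, 183, 26, 15, 12]     # f

n_cases = sum(frequency)                                  # N
sum_fy = sum(f * y for f, y in zip(frequency, children))  # sum of all fY
mean_children = sum_fy / n_cases
print(n_cases, sum_fy, round(mean_children, 2))  # 1339 3526 2.63
```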

Weighted Mean

Used when values are grouped by frequency or relative importance


Example: Sample of 26 Repair Projects
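Days to Complete    Frequency
5                   4
6                   12
7                   8
8                   2

Weighted Mean Days to Complete:

$$\bar{X}_W = \frac{\sum w_i x_i}{\sum w_i} = \frac{(4 \times 5) + (12 \times 6) + (8 \times 7) + (2 \times 8)}{4 + 12 + 8 + 2} = \frac{164}{26} \approx 6.31 \text{ days}$$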

Indirect method
The human resource manager at a city hospital began a study of the overtime hours of the registered nurses. Fifteen nurses were selected at random and the following overtime hours were recorded during a month:
13 13 12 15 17 15 5 12 6 7 12 10 9 13 12 5 9 6 10 5 6 9 6 9 12

Arithmetic mean of grouped (classified) data (Direct & Step deviation method)
The following distribution gives the pattern of overtime work done by 100 employees of a company. Calculate the average overtime work done per employee.

Overtime    No. of Employees (f)    Mid Value (m)    d = (m - A)/5, A = 22.5    fd
10-15       11                      12.5             -2                         -22
15-20       20                      17.5             -1                         -20
20-25       35                      22.5             0                          0
25-30       20                      27.5             1                          20
30-35       8                       32.5             2                          16
35-40       6                       37.5             3                          18
Total       100                                                                 12
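The two methods can be compared in a short sketch. It assumes the standard step-deviation formula, mean = A + h × (Σfd / Σf), with assumed mean A = 22.5 (the mid value where d = 0) and class width h = 5; the slides list the table but do not state this formula explicitly.

```python
freqs = [11, 20, 35, 20, 8, 6]
mids = [12.5, 17.5, 22.5, 27.5, 32.5, 37.5]
A, h = 22.5, 5   # assumed mean and class width (the table uses d = (m - A)/5)

n = sum(freqs)

# Direct method: weight each mid value by its frequency.
direct_mean = sum(f * m for f, m in zip(freqs, mids)) / n

# Step-deviation method: work with the small deviations d = (m - A) / h.
d = [(m - A) / h for m in mids]
sum_fd = sum(f * di for f, di in zip(freqs, d))
step_mean = A + h * sum_fd / n

print(sum_fd, round(direct_mean, 2), round(step_mean, 2))  # 12.0 23.1 23.1
```

Both methods give the same result; the step-deviation form only keeps the intermediate numbers small, which mattered more for hand calculation.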

Geometric Mean
Geometric Mean of a set of n numbers is defined as the nth root of the product of the n numbers and is used to average percents, indexes, and relatives. The formula is: (Xi > 0)

$$\bar{X}_G = \sqrt[n]{X_1 \times X_2 \times \cdots \times X_n}$$

More directly measures the change over more than one period.
Geometric Mean ≤ Arithmetic Mean
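A minimal sketch of the geometric mean, computed through logarithms to avoid overflow on long products. The growth factors below are hypothetical illustration data, not from the slides; a factor of 1.05 stands for a 5% increase in one period.

```python
import math

def geometric_mean(values):
    """n-th root of the product of n positive numbers."""
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical growth factors for three periods.
factors = [1.05, 1.20, 0.90]
gm = geometric_mean(factors)
am = sum(factors) / len(factors)
print(gm <= am)  # True
```

Compounding `gm` over the three periods reproduces the overall change (the product of the factors), which is why it suits averaging rates of change across periods.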

Relationship between Mean, Median and Mode


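M0 = 3 Median - 2 Mean
OR
Mean - Mode = 3 (Mean - Median)

(This is an empirical relationship that holds approximately for moderately skewed distributions.)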

The Shape of Distributions

Distributions can be either symmetrical or skewed, depending on whether there are more frequencies at one end of the distribution than the other.

Symmetrical Distributions

A distribution is symmetrical if the frequencies at the right and left tails of the distribution are identical, so that if it is divided into two halves, each will be the mirror image of the other.
In a unimodal symmetrical distribution the mean, median, and mode are identical.

1.4. Shape of a Distribution


Describes how data is distributed: symmetric or skewed.

Left-Skewed: Mean < Median < Mode (longer tail extends to left)
Symmetric: Mean = Median = Mode
Right-Skewed: Mode < Median < Mean (longer tail extends to right)
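The ordering of mean and median gives a rough indication of shape, sketched below. This is only a heuristic built on the orderings listed above, not a formal skewness measure; the helper name, the tolerance parameter and the sample lists are all hypothetical.

```python
from statistics import mean, median

def shape_from_mean_median(values, tol=1e-9):
    """Rough shape label from the ordering of mean and median."""
    m, md = mean(values), median(values)
    if abs(m - md) <= tol:
        return "symmetric"
    return "right-skewed" if m > md else "left-skewed"

print(shape_from_mean_median([1, 2, 3, 4, 5]))      # symmetric
print(shape_from_mean_median([1, 2, 3, 4, 20]))     # right-skewed
print(shape_from_mean_median([1, 17, 18, 19, 20]))  # left-skewed
```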

Choosing a Measure of Central Tendency


IF variable is Nominal: Mode
IF variable is Ordinal: Mode or Median (or both)
IF variable is Interval-Ratio and distribution is Symmetrical: Mode, Median or Mean
IF variable is Interval-Ratio and distribution is Skewed: Mode or Median

Calculate the mean, median and mode for the following data pertaining to marks in statistics. There are 80 students in the class and the test is out of 140 marks.

Marks more than    No. of Students
0                  80
20                 76
40                 50
60                 28
80                 18
100                9
120                3
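One way to approach the exercise is sketched below. It first turns the "more than" cumulative counts into ordinary class frequencies, then applies the grouped-data formulas for mean, median and mode. The sketch assumes classes of width 20 with the last class running from 120 to 140 (the stated maximum), and uses class mid values for the mean.

```python
lower = [0, 20, 40, 60, 80, 100, 120]
more_than = [80, 76, 50, 28, 18, 9, 3]
width = 20  # assumed common class width; last class taken as 120-140

# Class frequency = students above this limit minus students above the next limit.
freqs = [a - b for a, b in zip(more_than, more_than[1:] + [0])]
n = sum(freqs)

# Mean from class mid values.
mids = [L + width / 2 for L in lower]
mean_marks = sum(f * m for f, m in zip(freqs, mids)) / n

# Median: first class whose cumulative frequency reaches n/2.
cf = 0
for L, f in zip(lower, freqs):
    if cf + f >= n / 2:
        median_marks = L + (n / 2 - cf) / f * width
        break
    cf += f

# Mode: class with the largest frequency and its two neighbours.
m = freqs.index(max(freqs))
f_prev = freqs[m - 1] if m > 0 else 0
f_next = freqs[m + 1] if m + 1 < len(freqs) else 0
mode_marks = lower[m] + (freqs[m] - f_prev) / (2 * freqs[m] - f_prev - f_next) * width

print(freqs)  # [4, 26, 22, 10, 9, 6, 3]
print(round(mean_marks, 2), round(median_marks, 2), round(mode_marks, 2))
```

Under these assumptions the mean exceeds the median, which exceeds the mode, the ordering associated with a right-skewed distribution.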