where $N = \sum f_i$.
When $r = 1$, $\mu_1 = 0$. If $r = 2$, then $\mu_2 = \sigma^2$, the variance.
Raw moments: The r-th moment about any point $a$ is defined by
$$\mu'_r = \frac{\sum_{i=1}^{n} f_i (x_i - a)^r}{N}, \quad \text{where } N = \sum f_i.$$
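In particular,
$$\mu_0 = \frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^0}{N} = \frac{\sum f_i}{N} = \frac{N}{N} = 1$$
$$\mu'_0 = \frac{\sum_{i=1}^{n} f_i (x_i - a)^0}{N} = \frac{\sum f_i}{N} = \frac{N}{N} = 1$$
$$\mu_1 = \frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})}{N} = \frac{\sum f_i x_i}{N} - \bar{x}\,\frac{\sum f_i}{N} = \bar{x} - \frac{N\bar{x}}{N} = \bar{x} - \bar{x} = 0$$
$$\mu'_1 = \frac{\sum_{i=1}^{n} f_i (x_i - a)}{N} = \frac{\sum f_i x_i}{N} - a = \bar{x} - a = d \quad \text{(deviation)}$$
$$\mu_2 = \frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^2}{N} = \sigma^2 \quad \text{(variance)}$$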
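The definitions above can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the function name `moment` and the small data set are hypothetical and chosen only for illustration.

```python
def moment(values, freqs, r, about=None):
    """r-th moment of a frequency distribution.

    Taken about the point `about`; if `about` is None, the mean is used,
    which gives the central moment mu_r.
    """
    n = sum(freqs)                                   # N = sum of f_i
    if about is None:
        about = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * (x - about) ** r for x, f in zip(values, freqs)) / n


# Hypothetical data, chosen only for illustration.
values = [1, 2, 3, 4]
freqs = [1, 3, 4, 2]

mu0 = moment(values, freqs, 0)            # always 1
mu1 = moment(values, freqs, 1)            # 0, up to floating-point rounding
mu2 = moment(values, freqs, 2)            # the variance
raw1 = moment(values, freqs, 1, about=2)  # mean - 2, the deviation d
```

For this data the mean is 2.7, so the first raw moment about 2 is 0.7, in line with $\mu'_1 = \bar{x} - a$.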
Relations among Moments:
COURSE CODE:EMIS-506, BUSINESS STATISTICS
DR. K. M. SALAH UDDIN, ASSISTANT PROFESSOR
DEPARTMENT OF MANAGEMENT INFORMATION SYSTEMS (MIS)
UNIVERSITY OF DHAKA
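$$\mu_2 = \mu'_2 - (\mu'_1)^2$$
$$\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3$$
$$\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4$$
In general,
$$\mu_r = \mu'_r - {}^{r}C_1\,\mu'_{r-1}\,\mu'_1 + {}^{r}C_2\,\mu'_{r-2}\,(\mu'_1)^2 - {}^{r}C_3\,\mu'_{r-3}\,(\mu'_1)^3 + {}^{r}C_4\,\mu'_{r-4}\,(\mu'_1)^4 - \cdots$$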
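The relations above are the binomial expansion of $(x - \bar{x})^r = \big((x - a) - \mu'_1\big)^r$. A minimal Python sketch of that conversion follows; the function name `central_from_raw` is hypothetical, and the sample input is the set of raw moments 2, 20, 40, 50 used in Problem 01 of these notes.

```python
from math import comb


def central_from_raw(raw):
    """Convert raw moments [mu'_1, ..., mu'_r] about any point a
    into central moments [mu_1, ..., mu_r]."""
    m = [1.0] + list(raw)      # m[k] = mu'_k, with mu'_0 = 1
    d = m[1]                   # mu'_1 = mean - a
    central = []
    for r in range(1, len(m)):
        # mu_r = sum over k of (-1)^k * C(r, k) * mu'_{r-k} * (mu'_1)^k
        mu_r = sum((-1) ** k * comb(r, k) * m[r - k] * d ** k
                   for k in range(r + 1))
        central.append(mu_r)
    return central


print(central_from_raw([2, 20, 40, 50]))  # prints [0.0, 16.0, -64.0, 162.0]
```

The last two terms of the sum combine, which is why the closed forms end in $2(\mu'_1)^3$ and $-3(\mu'_1)^4$ rather than showing every binomial term.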
SKEWNESS
Skewness: Skewness is the lack of symmetry of a distribution. If the frequency curve of a
distribution has a longer tail to the right of the central maximum than to the left, the
distribution is said to be skewed to the right, or to have positive skewness. If the reverse is
true, it is said to be skewed to the left, or to have negative skewness.
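1st Coefficient of Skewness,
$$\beta_1 = \frac{\text{mean} - \text{mode}}{\text{standard deviation}} = \frac{\bar{x} - \text{mode}}{\sigma}$$
2nd Coefficient of Skewness,
$$\beta_1 = \frac{3(\text{mean} - \text{median})}{\text{standard deviation}} = \frac{3(\bar{x} - \text{median})}{\sigma}$$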
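Skewness can also be determined with the help of moments. Karl Pearson suggested
$$\beta_1 = \frac{\mu_3^2}{\mu_2^3}$$
(since $\beta_1$ itself cannot be negative, the sign of the skewness is taken to be the same as that of $\mu_3$).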
For a symmetrical distribution, $\beta_1 = 0$.
A distribution is said to be skewed if
(i) the mean, median and mode give different values;
(ii) $Q_1$ and $Q_3$ are not equidistant from the median ($Q_2$).
KURTOSIS
Kurtosis: The degree of peakedness or flatness of a distribution relative to a normal distribution is called kurtosis.
Kurtosis,
$$\beta_2 = \frac{\mu_4}{\mu_2^2}$$
For a normal distribution $\beta_2 = 3$.
If $\beta_2 = 3$, the curve is mesokurtic;
if $\beta_2 > 3$, the curve is leptokurtic;
if $\beta_2 < 3$, the curve is platykurtic.
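The moment-based measures of skewness and kurtosis can be sketched in a few lines of Python. This is an illustration only: the function names, the symbol `gamma1` for the signed square root of $\beta_1$, the tolerance `tol` and the sample moments are all assumptions, not taken from the notes.

```python
def shape_coefficients(mu2, mu3, mu4):
    """Return (beta1, gamma1, beta2) from the central moments."""
    beta1 = mu3 ** 2 / mu2 ** 3    # never negative
    gamma1 = mu3 / mu2 ** 1.5      # signed root of beta1, sign of mu3
    beta2 = mu4 / mu2 ** 2
    return beta1, gamma1, beta2


def kurtosis_type(beta2, tol=1e-9):
    """Classify a curve by comparing beta2 with the normal value 3."""
    if abs(beta2 - 3) <= tol:
        return "mesokurtic"
    return "leptokurtic" if beta2 > 3 else "platykurtic"


# Hypothetical central moments, chosen only for illustration.
beta1, gamma1, beta2 = shape_coefficients(mu2=4, mu3=2, mu4=60)
print(beta1, gamma1, beta2)     # prints 0.0625 0.25 3.75
print(kurtosis_type(beta2))     # prints leptokurtic
```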
Problem 01: The first four moments of a distribution about the value 5 of the variable are 2, 20, 40 and 50 respectively. Show that the mean is 7. Also find the moments about the mean, the skewness and the kurtosis.
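Solution: Given that $A = 5$, $\mu'_1 = 2$, $\mu'_2 = 20$, $\mu'_3 = 40$ and $\mu'_4 = 50$.
We have to find the moments about the mean.
$$\bar{x} = A + \mu'_1 = 5 + 2 = 7$$
$$\mu_1 = 0$$
$$\mu_2 = \mu'_2 - (\mu'_1)^2 = 20 - 2^2 = 16$$
$$\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3 = 40 - 3 \cdot 20 \cdot 2 + 2 \cdot 2^3 = -64$$
$$\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4 = 50 - 4 \cdot 40 \cdot 2 + 6 \cdot 20 \cdot 2^2 - 3 \cdot 2^4 = 162$$
Skewness,
$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(-64)^2}{16^3} = \frac{4096}{4096} = 1$$
and kurtosis,
$$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{162}{16^2} = \frac{162}{256} = 0.63$$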
Problem 02: The first four central moments of a distribution are 0, 16, -36 and 120. Comment on the skewness and kurtosis of the distribution.
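Solution: Given that $\mu_1 = 0$, $\mu_2 = 16$, $\mu_3 = -36$, $\mu_4 = 120$.
Coefficient of skewness (the signed square root of $\beta_1$),
$$\gamma_1 = \frac{\mu_3}{\sqrt{\mu_2^3}} = \frac{-36}{4^3} = -0.5625$$
The distribution is negatively skewed.
Kurtosis,
$$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{120}{16^2} = 0.469$$
Since $\beta_2 < 3$, the distribution is platykurtic.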
Problem 03: Find the skewness and kurtosis of the following distribution.
Wages     20–30   30–40   40–50   50–60   60–70   70–80
Workers   7       10      15      8       8       2
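Solution:
Wages    Mid value $x_i$   Frequency $f_i$   $d_i = x_i - 45$   $f_i d_i$   $f_i d_i^2$
20–30    25                7                 -20                -140        2800
30–40    35                10                -10                -100        1000
40–50    45 (= a)          15                0                  0           0
50–60    55                8                 10                 80          800
60–70    65                8                 20                 160         3200
70–80    75                2                 30                 60          1800
Total                      $\sum f_i = 50$                      $\sum f_i d_i = 60$   $\sum f_i d_i^2 = 9600$
Mean,
$$\bar{x} = a + \frac{\sum f_i d_i}{\sum f_i} = 45 + \frac{60}{50} = 46.2$$
$$\text{Mode} = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i = 40 + \frac{15 - 10}{30 - 10 - 8} \times 10 = 40 + \frac{50}{12} = 44.17$$
Standard Deviation (S.D.),
$$\sigma = \sqrt{\frac{\sum f d^2}{\sum f} - \left(\frac{\sum f d}{\sum f}\right)^2} = \sqrt{\frac{9600}{50} - \frac{3600}{2500}} = \sqrt{192 - 1.44} = 13.8$$
Skewness,
$$\beta_1 = \frac{\text{mean} - \text{mode}}{\text{standard deviation}} = \frac{46.2 - 44.17}{13.8} = 0.147$$
Kurtosis,
$$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{79892.659}{(190.56)^2} = 2.20$$
Here $\mu_2 = \sigma^2 = 190.56$, and $\mu_4$ follows from the relations among moments with $\mu'_1 = 1.2$, $\mu'_2 = 192$, $\mu'_3 = \sum f d^3 / N = 1200$ and $\mu'_4 = \sum f d^4 / N = 84000$.
Since $\beta_2 < 3$, the distribution is platykurtic.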
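The arithmetic of Problem 03 can be reproduced with a short Python sketch. The variable names are illustrative; the assumed mean 45 and the class limits are those of the problem, and the mode step assumes the modal class is not the first or last class.

```python
from math import sqrt

lower = [20, 30, 40, 50, 60, 70]   # lower class limits (wages)
freq = [7, 10, 15, 8, 8, 2]        # number of workers
width, a = 10, 45                  # class width and assumed mean

mid = [lo + width / 2 for lo in lower]   # mid values 25, 35, ..., 75
n = sum(freq)


def raw(r):
    """r-th moment about the assumed mean a."""
    return sum(f * (x - a) ** r for x, f in zip(mid, freq)) / n


m1, m2, m3, m4 = raw(1), raw(2), raw(3), raw(4)
mean = a + m1
mu2 = m2 - m1 ** 2
mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
sd = sqrt(mu2)

# Grouped-data mode; the modal class is the one with the largest frequency.
k = freq.index(max(freq))
f0, f1, f2 = freq[k - 1], freq[k], freq[k + 1]
mode = lower[k] + (f1 - f0) / (2 * f1 - f0 - f2) * width

skew = (mean - mode) / sd
beta2 = mu4 / mu2 ** 2

print(round(mean, 2), round(mode, 2), round(sd, 1))  # prints 46.2 44.17 13.8
print(round(skew, 3), round(beta2, 2))               # prints 0.147 2.2
```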
CORRELATION AND REGRESSION
Correlation: If two variables x and y vary in such a way that an increase in the one is
accompanied by an increase or decrease in the other, then the variables are said to be
correlated. An analysis of the covariation of two or more variables is usually called
correlation.
There are various types of correlation:
(i) Positive correlation: If, on average, one variable increases as the other increases, or decreases as the other decreases, then the correlation is said to be positive.
x    y
80   50
70   45
60   31
40   20
30   10

x    y
10   15
11   20
12   22
18   25
20   37
(ii) Negative correlation: If one variable increases as the other decreases, or vice versa, then the correlation is said to be negative.
x    y
100  10
90   20
60   30
40   40
30   50

x    y
20   40
30   30
40   22
60   16
80   15
(iii) Linear correlation: If the amount of change in one variable tends to bear a constant ratio to the amount of change in the other, then the correlation is said to be linear.
(iv) Non-linear correlation: If the amount of change in one variable does not bear a constant ratio to the amount of change in the other, then the correlation is said to be non-linear.
There are various methods of studying correlation:
(i) Scatter diagram method;
(ii) Karl Pearson's coefficient of correlation;
(iii) Spearman's rank correlation coefficient; and
(iv) Method of least squares.
(i) Scatter diagram method: The simplest method for studying correlation between two variables is a special type of dot chart called a scatter diagram. In this method the given data are plotted on graph paper as dots. If all the points lie on a straight line rising from the lower left-hand corner to the upper right-hand corner, the correlation is said to be perfectly positive. On the other hand, if all the points lie on a straight line falling from the upper left-hand corner to the lower right-hand corner, the correlation is said to be perfectly negative.
(ii) Karl Pearson's coefficient of correlation: Karl Pearson's coefficient of correlation ($r$) between two variables $x$ and $y$ is defined by
$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}}$$
where $\bar{x}$ and $\bar{y}$ are the respective means of $x$ and $y$.
If deviations are taken from an assumed mean $a$, then
$$r = \frac{N\sum (x - a)(y - a) - \sum (x - a)\sum (y - a)}{\sqrt{N\sum (x - a)^2 - \left(\sum (x - a)\right)^2}\;\sqrt{N\sum (y - a)^2 - \left(\sum (y - a)\right)^2}}$$
(iii) Spearman's rank correlation coefficient: Spearman's rank correlation coefficient is defined by
$$R = 1 - \frac{6\sum d^2}{N(N^2 - 1)}$$
where $d$ is the difference between the two ranks of each paired item in the two series.
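Probable Error: The standard error of the coefficient of correlation is
$$\text{S.E.}_r = \frac{1 - r^2}{\sqrt{N}}$$
and the probable error is obtained from it as $\text{P.E.}_r = 0.6745 \times \text{S.E.}_r$.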
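As a small illustration, the error of the correlation coefficient can be computed as below. This sketch assumes the usual textbook forms, S.E. $= (1 - r^2)/\sqrt{N}$ and P.E. $= 0.6745 \times$ S.E.; the function names and the sample values $r = 0.8$, $N = 25$ are hypothetical.

```python
from math import sqrt


def standard_error(r, n):
    """Standard error of a correlation coefficient r from n pairs."""
    return (1 - r ** 2) / sqrt(n)


def probable_error(r, n):
    """Probable error, taken as 0.6745 times the standard error."""
    return 0.6745 * standard_error(r, n)


# Hypothetical values, chosen only for illustration.
se = standard_error(0.8, 25)   # (1 - 0.64) / 5, about 0.072
pe = probable_error(0.8, 25)   # about 0.0486
```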
Properties of Coefficient of Correlation:
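Property 1. Prove that $|r| \le 1$, or $-1 \le r \le +1$.
Proof: We have
$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}}$$
Let
$$a = \frac{x - \bar{x}}{\sqrt{\sum (x - \bar{x})^2}}, \qquad b = \frac{y - \bar{y}}{\sqrt{\sum (y - \bar{y})^2}},$$
so that $\sum a^2 = 1$, $\sum b^2 = 1$ and $\sum ab = r$. Then
$$\sum (a + b)^2 = \sum a^2 + 2\sum ab + \sum b^2 = 1 + 2r + 1 = 2(1 + r) \ge 0$$
$$\Rightarrow 1 + r \ge 0 \;\Rightarrow\; r \ge -1 \quad \ldots (1)$$
Similarly,
$$\sum (a - b)^2 = \sum a^2 - 2\sum ab + \sum b^2 = 1 - 2r + 1 = 2(1 - r) \ge 0$$
$$\Rightarrow 1 - r \ge 0 \;\Rightarrow\; -r \ge -1 \;\Rightarrow\; r \le 1 \quad \ldots (2)$$
From (1) and (2) we can write
$$-1 \le r \le 1, \quad \text{i.e. } |r| \le 1$$
(Proved)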
Problem 1: Ten students obtained the following percentage marks in Mathematics and Statistics. Calculate the coefficient of correlation.
Math. 78 36 98 25 75 82 90 62 65 39
Stat. 84 51 91 60 68 62 86 58 53 47
Solution: Let the marks in the two subjects be denoted by $x$ and $y$ respectively.
$x$    $y$    $x - \bar{x}$   $y - \bar{y}$   $(x - \bar{x})^2$   $(y - \bar{y})^2$   $(x - \bar{x})(y - \bar{y})$
78     84     13              18              169                 324                 234
36     51     -29             -15             841                 225                 435
98     91     33              25              1089                625                 825
25     60     -40             -6              1600                36                  240
75     68     10              2               100                 4                   20
82     62     17              -4              289                 16                  -68
90     86     25              20              625                 400                 500
62     58     -3              -8              9                   64                  24
65     53     0               -13             0                   169                 0
39     47     -26             -19             676                 361                 494
$\sum x = 650$   $\sum y = 660$   0   0   5398   2224   2704
Here $\bar{x} = \frac{650}{10} = 65$, $\bar{y} = \frac{660}{10} = 66$, $\sum (x - \bar{x})^2 = 5398$, $\sum (y - \bar{y})^2 = 2224$ and $\sum (x - \bar{x})(y - \bar{y}) = 2704$.
$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}} = \frac{2704}{\sqrt{5398 \times 2224}} = \frac{2704}{3464.85} = 0.78 \quad \text{(Ans.)}$$
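The result can be checked with a short Python sketch of both forms of Pearson's formula. The function names are illustrative, and using two separate assumed values `a` (for x) and `b` (for y) is an assumption of this sketch; the coefficient does not depend on which constants are chosen.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from deviations about the actual means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def pearson_r_assumed(xs, ys, a, b):
    """Same coefficient from deviations about assumed values a and b."""
    n = len(xs)
    dx = [x - a for x in xs]
    dy = [y - b for y in ys]
    num = n * sum(u * v for u, v in zip(dx, dy)) - sum(dx) * sum(dy)
    den = sqrt((n * sum(u * u for u in dx) - sum(dx) ** 2)
               * (n * sum(v * v for v in dy) - sum(dy) ** 2))
    return num / den


maths = [78, 36, 98, 25, 75, 82, 90, 62, 65, 39]
stats = [84, 51, 91, 60, 68, 62, 86, 58, 53, 47]

r_direct = pearson_r(maths, stats)
r_assumed = pearson_r_assumed(maths, stats, a=60, b=60)
print(round(r_direct, 2), round(r_assumed, 2))  # prints 0.78 0.78
```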
Problem 2: The ranks of ten students in Mathematics and Statistics are given below. Calculate the coefficient of rank correlation.
Student A B C D E F G H I J
Math. 9 10 6 5 7 2 4 8 1 3
Stat. 1 2 3 4 5 6 7 8 9 10
Solution:
Student   $R_1$   $R_2$   $d = R_1 - R_2$   $d^2$
A         9       1       8                 64
B         10      2       8                 64
C         6       3       3                 9
D         5       4       1                 1
E         7       5       2                 4
F         2       6       -4                16
G         4       7       -3                9
H         8       8       0                 0
I         1       9       -8                64
J         3       10      -7                49
Total                                       $\sum d^2 = 280$
We know
$$R = 1 - \frac{6\sum d^2}{N(N^2 - 1)} = 1 - \frac{6 \times 280}{10(100 - 1)} = -0.697$$
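The rank correlation above can be reproduced with a few lines of Python. This is a sketch with an illustrative function name; it assumes there are no tied ranks, as in this problem.

```python
def spearman_rank(r1, r2):
    """Spearman's R = 1 - 6*sum(d^2) / (N*(N^2 - 1)), assuming no ties."""
    n = len(r1)
    d2 = sum((p - q) ** 2 for p, q in zip(r1, r2))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


maths_rank = [9, 10, 6, 5, 7, 2, 4, 8, 1, 3]
stats_rank = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

R = spearman_rank(maths_rank, stats_rank)
print(round(R, 3))  # prints -0.697
```

The negative value says that students ranked high in one subject tend to be ranked low in the other.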
REGRESSION
Regression:
If the scatter diagram indicates some relationship between the variables x and y, then the dots of the scatter diagram will be concentrated around a curve. This curve is called the curve of regression.
The method used for estimating the unknown values of one variable corresponding to the known values of another variable is called regression analysis.
Equation of regression line: The average relationship between x and y can be described by the linear equation $y = a + bx$, whose geometrical representation is a straight line.
The values of $a$ and $b$ are given by the equations
$$\sum y = na + b\sum x \quad \text{and} \quad \sum xy = a\sum x + b\sum x^2$$
Problem 01: Calculate the regression line from the following data
x 4.3 4.5 5.9 5.6 6.1 5.2 3.8 2.1
y 12.6 12.1 11.6 11.8 11.4 11.8 13.2 14.1
Solution:
$x$     $y$      $xy$     $x^2$
4.3     12.6     54.18    18.49
4.5     12.1     54.45    20.25
5.9     11.6     68.44    34.81
5.6     11.8     66.08    31.36
6.1     11.4     69.54    37.21
5.2     11.8     61.36    27.04
3.8     13.2     50.16    14.44
2.1     14.1     29.61    4.41
$\sum x = 37.5$   $\sum y = 98.6$   $\sum xy = 453.82$   $\sum x^2 = 188.01$
Let $y = a + bx$ be the equation of the regression line of $y$ on $x$, where $a$ and $b$ are given by the equations
$$\sum y = na + b\sum x \;\Rightarrow\; 98.6 = 8a + 37.5b \quad \ldots (1)$$
and
$$\sum xy = a\sum x + b\sum x^2 \;\Rightarrow\; 453.82 = 37.5a + 188.01b \quad \ldots (2)$$
Solving (1) and (2) we get $a \approx 15.53$, $b \approx -0.684$.
The required line is $y = 15.53 - 0.684x$.
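As a check on the arithmetic, the two normal equations can be solved in closed form. The Python sketch below uses an illustrative function name and obtains $b$ by eliminating $a$ from the two equations.

```python
def fit_line(xs, ys):
    """Solve the normal equations of y = a + b*x for (a, b)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # slope
    a = (sy - b * sx) / n                           # intercept, from eq. (1)
    return a, b


xs = [4.3, 4.5, 5.9, 5.6, 6.1, 5.2, 3.8, 2.1]
ys = [12.6, 12.1, 11.6, 11.8, 11.4, 11.8, 13.2, 14.1]

a, b = fit_line(xs, ys)
print(round(a, 2), round(b, 3))  # prints 15.53 -0.684
```

The negative slope matches the data: y falls as x rises.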