
COURSE CODE:EMIS-506, BUSINESS STATISTICS

DR. K. M. SALAH UDDIN, ASSISTANT PROFESSOR


DEPARTMENT OF MANAGEMENT INFORMATION SYSTEMS (MIS)
UNIVERSITY OF DHAKA

Combined Standard Deviation:
It is possible to compute the combined standard deviation of two or more groups. The combined
standard deviation of two groups is denoted by $\sigma_{12}$ and is computed as follows:

\[
\sigma_{12} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}}
\]

Where,
$\sigma_{12}$ = combined standard deviation
$\sigma_1$ = standard deviation of the first group
$\sigma_2$ = standard deviation of the second group
$d_1 = \bar{X}_1 - \bar{X}_{12}$; $d_2 = \bar{X}_2 - \bar{X}_{12}$


The above formula can be extended to find out the standard deviation of three or more
groups. For example, combined standard deviation of three groups:
\[
\sigma_{123} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_3\sigma_3^2 + N_1 d_1^2 + N_2 d_2^2 + N_3 d_3^2}{N_1 + N_2 + N_3}}
\]

where $d_1 = \bar{X}_1 - \bar{X}_{123}$; $d_2 = \bar{X}_2 - \bar{X}_{123}$; $d_3 = \bar{X}_3 - \bar{X}_{123}$

Example:
The number of workers employed, the mean weekly wage (in dollars) and the standard
deviation (in dollars) in each branch of a company are given below. Calculate the mean wage
and the standard deviation for all the workers of the company taken together.

Branch   No. of workers employed   Weekly mean wage (in dollars)   Standard deviation (in dollars)
A        50                        1413                            60
B        60                        1420                            70
C        90                        1415                            80
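Solution:
Combined mean wage of the three branches:

\[
\bar{X}_{123} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2 + N_3\bar{X}_3}{N_1 + N_2 + N_3}
= \frac{(50 \times 1413) + (60 \times 1420) + (90 \times 1415)}{50 + 60 + 90}
= \frac{283200}{200} = \$1416
\]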


Combined standard deviation of the three branches:

\[
\sigma_{123} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_3\sigma_3^2 + N_1 d_1^2 + N_2 d_2^2 + N_3 d_3^2}{N_1 + N_2 + N_3}}
\]
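\[
\begin{aligned}
d_1 &= \bar{X}_1 - \bar{X}_{123} = 1413 - 1416 = -3 \\
d_2 &= \bar{X}_2 - \bar{X}_{123} = 1420 - 1416 = 4 \\
d_3 &= \bar{X}_3 - \bar{X}_{123} = 1415 - 1416 = -1
\end{aligned}
\]

\[
\sigma_{123} = \sqrt{\frac{50(60)^2 + 60(70)^2 + 90(80)^2 + 50(-3)^2 + 60(4)^2 + 90(-1)^2}{50 + 60 + 90}}
= \sqrt{\frac{1051500}{200}} = \$72.51
\]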
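The combined mean and combined standard deviation can also be checked numerically. The following is a minimal Python sketch, not part of the original notes; the function name `combined_mean_sd` and its argument names are illustrative assumptions. It uses the fact that each group contributes its own variance plus the squared deviation of its mean from the combined mean.

```python
from math import sqrt

def combined_mean_sd(sizes, means, sds):
    """Return (combined mean, combined SD) for groups described by size, mean and SD.

    Hypothetical helper for illustration; works for any number of groups.
    """
    total = sum(sizes)
    mean = sum(n * m for n, m in zip(sizes, means)) / total
    # N_i * sigma_i^2 + N_i * d_i^2, with d_i = group mean - combined mean
    pooled = sum(n * (s ** 2 + (m - mean) ** 2)
                 for n, m, s in zip(sizes, means, sds))
    return mean, sqrt(pooled / total)

# Branches A, B and C from the example above
mean, sd = combined_mean_sd([50, 60, 90], [1413, 1420, 1415], [60, 70, 80])
print(round(mean, 2), round(sd, 2))  # 1416.0 72.51
```

Because the deviations enter only as squares, the sign convention chosen for each deviation does not affect the result.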

Moments, Skewness and Kurtosis

Moments: Moments are important measures in statistics that describe the shape and
nature of a distribution. The skewness and kurtosis of a distribution can be found by using
moments. If $x_1, x_2, x_3, \ldots, x_n$ are the n values assumed by the variable x, then the quantity

\[
\mu'_r = \frac{x_1^r + x_2^r + x_3^r + \cdots + x_n^r}{n} = \frac{\sum x^r}{n}
\]

is called the r-th moment about zero, or simply the r-th moment. The first moment, with
r = 1, is the arithmetic mean $\bar{x}$.
Central / corrected moments: The r-th moment about the mean $\bar{x}$ is defined by

\[
\mu_r = \frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^r}{N}, \quad \text{where } N = \sum f_i .
\]

When r = 1, $\mu_1 = 0$. If r = 2, then $\mu_2 = \sigma^2$, the variance.
Raw moments: The r-th moment about any point a is defined by

\[
\mu'_r = \frac{\sum_{i=1}^{n} f_i (x_i - a)^r}{N}, \quad \text{where } N = \sum f_i .
\]
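In particular,

\[
\mu_0 = \frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^0}{N} = \frac{\sum f_i}{N} = \frac{N}{N} = 1
\]

\[
\mu'_0 = \frac{\sum_{i=1}^{n} f_i (x_i - a)^0}{N} = \frac{\sum f_i}{N} = \frac{N}{N} = 1
\]

\[
\mu_1 = \frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})}{N} = \frac{\sum f_i x_i}{N} - \frac{\bar{x} \sum f_i}{N} = \bar{x} - \frac{N\bar{x}}{N} = \bar{x} - \bar{x} = 0
\]

\[
\mu'_1 = \frac{\sum_{i=1}^{n} f_i (x_i - a)}{N} = \frac{\sum f_i x_i}{N} - a = \bar{x} - a = d \quad \text{(deviation)}
\]

\[
\mu_2 = \frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^2}{N} = \sigma^2 \quad \text{(variance)}
\]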
Relations among Moments:
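\[
\mu_2 = \mu'_2 - (\mu'_1)^2
\]

\[
\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3
\]

\[
\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4
\]

In general,

\[
\mu_r = \mu'_r - {}^{r}C_1\,\mu'_{r-1}\,\mu'_1 + {}^{r}C_2\,\mu'_{r-2}\,(\mu'_1)^2 - {}^{r}C_3\,\mu'_{r-3}\,(\mu'_1)^3 + {}^{r}C_4\,\mu'_{r-4}\,(\mu'_1)^4 - \cdots
\]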

SKEWNESS

Skewness: Skewness is the lack of symmetry of a distribution. If the frequency curve of a
distribution has a longer tail to the right of the central maximum than to the left, the
distribution is said to be skewed to the right, or to have positive skewness. If the reverse is
true, it is said to be skewed to the left, or to have negative skewness.
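1st Coefficient of Skewness,
\[
\beta_1 = \frac{\text{mean} - \text{mode}}{\text{standard deviation}} = \frac{\bar{x} - \text{mode}}{\sigma}
\]

2nd Coefficient of Skewness,
\[
\beta_1 = \frac{3(\text{mean} - \text{median})}{\text{standard deviation}} = \frac{3(\bar{x} - \text{median})}{\sigma}
\]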

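Skewness can also be determined with the help of moments. Karl Pearson suggested

\[
\beta_1 = \frac{\mu_3^2}{\mu_2^3}
\]

Since $\beta_1$ itself is never negative, the direction of skewness is taken from the sign of $\mu_3$; that is, $\sqrt{\beta_1} = \mu_3 / \mu_2^{3/2}$ is given the same sign as $\mu_3$.
For a symmetrical distribution, $\beta_1 = 0$.
A distribution is said to be skewed if (i) the mean, median and mode give different values;
(ii) $Q_1$ and $Q_3$ are not equidistant from the median ($Q_2$).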

KURTOSIS


Kurtosis: The degree of peakedness or flatness of a distribution relative to a normal distribution
is called kurtosis.

Kurtosis,
\[
\beta_2 = \frac{\mu_4}{\mu_2^2}
\]

For a normal distribution $\beta_2 = 3$. Accordingly,
if $\beta_2 = 3$ then the curve is mesokurtic;
if $\beta_2 > 3$ then the curve is leptokurtic;
if $\beta_2 < 3$ then the curve is platykurtic.










Problem 01: The first four moments of a distribution about the value 5 of the variable are 2,
20, 40 and 50 respectively. Show that the mean is 7. Also find the moments about the mean,
skewness and kurtosis.

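Solution: Given that $A = 5$, $\mu'_1 = 2$, $\mu'_2 = 20$, $\mu'_3 = 40$ and $\mu'_4 = 50$.
We have to find the moments about the mean.

\[
\bar{x} = A + \mu'_1 = 5 + 2 = 7
\]

\[
\begin{aligned}
\mu_1 &= 0 \\
\mu_2 &= \mu'_2 - (\mu'_1)^2 = 20 - (2)^2 = 16 \\
\mu_3 &= \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3 = 40 - 3(20)(2) + 2(2)^3 = -64 \\
\mu_4 &= \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4 = 50 - 4(40)(2) + 6(20)(2)^2 - 3(2)^4 = 162
\end{aligned}
\]

Skewness,
\[
\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(-64)^2}{(16)^3} = \frac{4096}{4096} = 1
\]

and kurtosis,
\[
\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{162}{(16)^2} = \frac{162}{256} = 0.63
\]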
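The conversion from moments about an arbitrary point to moments about the mean can be sketched in Python as a check on the arithmetic. This is an illustrative sketch rather than part of the original notes; the function name `central_from_raw` is an assumption. It applies the general binomial relation $\mu_r = \sum_{k=0}^{r} (-1)^k \, {}^{r}C_k \, \mu'_{r-k} (\mu'_1)^k$, which reduces to the formulas for $\mu_2$, $\mu_3$ and $\mu_4$ given under "Relations among Moments".

```python
from math import comb

def central_from_raw(raw):
    """Convert moments about an arbitrary point into moments about the mean.

    raw[k-1] is the k-th moment about the point A (hypothetical argument layout).
    Returns [mu_1, mu_2, ..., mu_n].
    """
    m = [1] + list(raw)   # m[0] = 1 is the zeroth moment
    d = m[1]              # first raw moment, i.e. mean - A
    central = []
    for r in range(1, len(m)):
        # mu_r = sum over k of (-1)^k * C(r, k) * mu'_{r-k} * d^k
        central.append(sum((-1) ** k * comb(r, k) * m[r - k] * d ** k
                           for k in range(r + 1)))
    return central

# Moments about A = 5 from Problem 01
print(central_from_raw([2, 20, 40, 50]))  # [0, 16, -64, 162]
```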


Problem 02: The first four central moments of a distribution are 0, 16, -36 and 120. Comment on the
skewness and kurtosis of the distribution.

Solution: Given that $\mu_1 = 0$, $\mu_2 = 16$, $\mu_3 = -36$, $\mu_4 = 120$.

Coefficient of skewness,
\[
\gamma_1 = \sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}} = \frac{-36}{4^3} = -0.5625
\]
The distribution is negatively skewed.

Kurtosis,
\[
\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{120}{(16)^2} = 0.469
\]
Since $\beta_2 < 3$, the distribution is platykurtic.
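The moment-based coefficients and the three-way kurtosis classification can be written as a small Python sketch. It is an illustration only; the function name `shape_coefficients` and the tolerance parameter `tol` (used to decide when $\beta_2$ counts as equal to 3 in floating-point arithmetic) are assumptions, not part of the notes.

```python
def shape_coefficients(mu2, mu3, mu4, tol=1e-9):
    """Return (gamma1, beta1, beta2, kind) from the central moments mu2, mu3, mu4."""
    gamma1 = mu3 / mu2 ** 1.5      # signed skewness, same sign as mu3
    beta1 = mu3 ** 2 / mu2 ** 3    # Pearson's beta_1, never negative
    beta2 = mu4 / mu2 ** 2         # kurtosis
    if abs(beta2 - 3) <= tol:
        kind = "mesokurtic"
    elif beta2 > 3:
        kind = "leptokurtic"
    else:
        kind = "platykurtic"
    return gamma1, beta1, beta2, kind

# Central moments from Problem 02
gamma1, beta1, beta2, kind = shape_coefficients(16, -36, 120)
print(gamma1, round(beta2, 3), kind)
```

For the moments of Problem 02 this gives a skewness of -0.5625 and a kurtosis of about 0.469, which falls in the platykurtic case.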

Problem 03: Find skewness and kurtosis of the following distribution

Wages     20--30   30--40   40--50   50--60   60--70   70--80
Workers   7        10       15       8        8        2



Solution:

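Taking the assumed mean a = 45:

Wages    Mid value x_i   Frequency f_i   d_i = x_i - 45   f_i d_i   f_i d_i^2
20--30   25              7               -20              -140      2800
30--40   35              10              -10              -100      1000
40--50   45 (= a)        15              0                0         0
50--60   55              8               10               80        800
60--70   65              8               20               160       3200
70--80   75              2               30               60        1800
Total                    50                               60        9600

Mean,
\[
\bar{x} = a + \frac{\sum f_i d_i}{\sum f_i} = 45 + \frac{60}{50} = 46.2
\]

\[
\text{Mode} = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i = 40 + \frac{15 - 10}{30 - 10 - 8} \times 10 = 40 + \frac{50}{12} = 44.17
\]

Standard Deviation (S.D.),
\[
\sigma = \sqrt{\frac{\sum f d^2}{\sum f} - \left(\frac{\sum f d}{\sum f}\right)^2} = \sqrt{\frac{9600}{50} - \frac{3600}{2500}} = \sqrt{192 - 1.44} = 13.8
\]

Skewness,
\[
\beta_1 = \frac{\text{mean} - \text{mode}}{\text{standard deviation}} = \frac{46.2 - 44.17}{13.8} = 0.147
\]

Kurtosis,
\[
\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{79892.659}{(190.56)^2} = 2.20
\]

Here $\mu_2 = \sigma^2 = 190.56$, and $\mu_4$ follows from the moments about a = 45: with $\sum f d^3 = 60000$ and $\sum f d^4 = 4200000$, we have $\mu'_1 = 1.2$, $\mu'_2 = 192$, $\mu'_3 = 1200$, $\mu'_4 = 84000$, so that $\mu_4 = 84000 - 4(1200)(1.2) + 6(192)(1.2)^2 - 3(1.2)^4 = 79892.659$.
Since $\beta_2 < 3$, the distribution is platykurtic.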
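The grouped-data calculation for Problem 03 can be reproduced directly from the class mid-values and frequencies. The sketch below is illustrative only: the helper name `grouped_central_moment` is an assumption, and it computes the central moments straight from deviations about the mean rather than through an assumed mean, which gives the same values.

```python
from math import sqrt

mids = [25, 35, 45, 55, 65, 75]   # class mid-values for the wage classes 20--30 ... 70--80
freqs = [7, 10, 15, 8, 8, 2]      # number of workers in each class

def grouped_central_moment(mids, freqs, r):
    """r-th central moment of a grouped frequency distribution."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(mids, freqs)) / n
    return sum(f * (x - mean) ** r for x, f in zip(mids, freqs)) / n

mu2 = grouped_central_moment(mids, freqs, 2)
mu4 = grouped_central_moment(mids, freqs, 4)
sd = sqrt(mu2)
beta2 = mu4 / mu2 ** 2
print(round(sd, 2), round(beta2, 2))
```

Up to rounding, this reproduces the standard deviation of about 13.8, the fourth central moment of about 79892.66 and the kurtosis of about 2.20.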
















CORRELATION AND REGRESSION

Correlation: If two variables x and y vary in such a way that an increase in the one is
accompanied by an increase or decrease in the other, then the variables are said to be
correlated. An analysis of the covariation of two or more variables is usually called
correlation.

There are various types of correlation:

(i) Positive correlation: If one variable is increasing and the other, on average, is also
increasing, or if one variable is decreasing and the other, on average, is also decreasing, then
the correlation is said to be positive.

x    y         x    y
80   50        10   15
70   45        11   20
60   31        12   22
40   20        18   25
30   10        20   37

(ii) Negative correlation: If one variable is increasing while the other is decreasing, or vice versa,
then the correlation is said to be negative.

x    y         x    y
100  10        20   40
90   20        30   30
60   30        40   22
40   40        60   16
30   50        80   15

(iii) Linear correlation: If the amount of change in one variable tends to bear a constant
ratio to the amount of change in the other, then the correlation is said to be linear.

(iv) Non-linear correlation: If the amount of change in one variable does not bear a constant ratio to the
amount of change in the other, then the correlation is said to be non-linear.

There are various methods of studying correlation:

(i) Scatter diagram method;
(ii) Karl Pearson's coefficient of correlation;
(iii) Spearman's rank correlation coefficient; and
(iv) Method of least squares

(i) Scatter diagram method: The simplest method for studying correlation between two variables
is a special type of dot chart called a scatter diagram. When this method is used, the given data
are plotted on graph paper in the form of dots. If all the points lie on a straight line rising
from the lower left-hand corner to the upper right-hand corner, correlation is said to be
perfectly positive. On the other hand, if all the points lie on a straight line falling from the
upper left-hand corner to the lower right-hand corner, correlation is said to be perfectly
negative.

(ii) Karl Pearson's coefficient of correlation: Karl Pearson's coefficient of correlation (r)
between two variables x and y is defined by

\[
r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}}
\]

where $\bar{x}$ and $\bar{y}$ are the respective means of x and y.

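If deviations are taken from assumed means (a for x and b for y), then

\[
r = \frac{N \sum (x - a)(y - b) - \sum (x - a) \sum (y - b)}{\sqrt{\left[ N \sum (x - a)^2 - \left( \sum (x - a) \right)^2 \right] \left[ N \sum (y - b)^2 - \left( \sum (y - b) \right)^2 \right]}}
\]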

(iii) Spearman's rank correlation coefficient: Spearman's rank correlation coefficient is
defined by

\[
R = 1 - \frac{6 \sum d^2}{N(N^2 - 1)}
\]

where d is the difference between the two ranks of each pair of items in the two series.


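Probable Error: The standard error of the coefficient of correlation is

\[
\text{S.E.}(r) = \frac{1 - r^2}{\sqrt{N}}
\]

and the probable error is obtained as 0.6745 times the standard error:

\[
\text{P.E.}(r) = 0.6745 \times \frac{1 - r^2}{\sqrt{N}}
\]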














Properties of Coefficient of Correlation:

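Property 1. Prove that $|r| \le 1$, or $-1 \le r \le +1$.
Proof: We have

\[
r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}}
\]

Let

\[
a = \frac{x - \bar{x}}{\sqrt{\sum (x - \bar{x})^2}}, \qquad b = \frac{y - \bar{y}}{\sqrt{\sum (y - \bar{y})^2}}
\]

so that $\sum a^2 = 1$, $\sum b^2 = 1$ and $\sum ab = r$. Then

\[
\begin{aligned}
\sum (a + b)^2 &= \sum a^2 + 2\sum ab + \sum b^2 \\
&= 1 + 2r + 1 \\
&= 2(1 + r) \ge 0 \\
\Rightarrow\ 1 + r &\ge 0 \\
\Rightarrow\ r &\ge -1 \qquad \ldots (1)
\end{aligned}
\]

Similarly,

\[
\begin{aligned}
\sum (a - b)^2 &= \sum a^2 - 2\sum ab + \sum b^2 \\
&= 1 - 2r + 1 \\
&= 2(1 - r) \ge 0 \\
\Rightarrow\ 1 - r &\ge 0 \\
\Rightarrow\ r &\le 1 \qquad \ldots (2)
\end{aligned}
\]

From (1) and (2) we can write

\[
-1 \le r \le 1, \quad \text{i.e. } |r| \le 1
\]

(Proved)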
Problem 1: Ten students obtained the following percentages of marks in Mathematics and Statistics.
Calculate the coefficient of correlation.

Math. 78 36 98 25 75 82 90 62 65 39
Stat. 84 51 91 60 68 62 86 58 53 47

Solution: Let the marks in the two subjects be denoted by x and y respectively.

x      y      x - x̄    y - ȳ    (x - x̄)^2   (y - ȳ)^2   (x - x̄)(y - ȳ)
78     84     13       18       169         324         234
36     51     -29      -15      841         225         435
98     91     33       25       1089        625         825
25     60     -40      -6       1600        36          240
75     68     10       2        100         4           20
82     62     17       -4       289         16          -68
90     86     25       20       625         400         500
62     58     -3       -8       9           64          24
65     53     0        -13      0           169         0
39     47     -26      -19      676         361         494
Σx=650 Σy=660 0        0        5398        2224        2704

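Here $\bar{x} = \frac{650}{10} = 65$, $\bar{y} = \frac{660}{10} = 66$, $\sum (x - \bar{x})^2 = 5398$, $\sum (y - \bar{y})^2 = 2224$ and
$\sum (x - \bar{x})(y - \bar{y}) = 2704$.

\[
r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}} = \frac{2704}{\sqrt{5398 \times 2224}} = \frac{2704}{3464.85} = 0.78 \quad \text{(Ans.)}
\]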
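The same coefficient can be computed in a few lines of Python as a check on the table. This is a minimal sketch, assuming plain Python lists as input; the function name `pearson_r` is illustrative and not part of the original notes.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's coefficient of correlation from deviations about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))  # sum of products of deviations
    sxx = sum((x - mx) ** 2 for x in xs)                    # sum of squared x-deviations
    syy = sum((y - my) ** 2 for y in ys)                    # sum of squared y-deviations
    return sxy / sqrt(sxx * syy)

math_marks = [78, 36, 98, 25, 75, 82, 90, 62, 65, 39]
stat_marks = [84, 51, 91, 60, 68, 62, 86, 58, 53, 47]
print(round(pearson_r(math_marks, stat_marks), 2))  # 0.78
```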

Problem 2: The ranks of ten students in Mathematics and Statistics are given below. Calculate
the rank coefficient of correlation.

Student A B C D E F G H I J
Math. 9 10 6 5 7 2 4 8 1 3
Stat. 1 2 3 4 5 6 7 8 9 10









Solution:
Student   R_1   R_2   d = R_1 - R_2   d^2
A         9     1     8               64
B         10    2     8               64
C         6     3     3               9
D         5     4     1               1
E         7     5     2               4
F         2     6     -4              16
G         4     7     -3              9
H         8     8     0               0
I         1     9     -8              64
J         3     10    -7              49


Here $\sum d^2 = 280$ and N = 10. We know

\[
R = 1 - \frac{6 \sum d^2}{N(N^2 - 1)} = 1 - \frac{6 \times 280}{10(100 - 1)} = 1 - \frac{1680}{990} = -0.697
\]
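The rank correlation for this problem can be verified with a short Python sketch. The function name `spearman_rank` is an illustrative assumption, and the formula as written assumes that there are no tied ranks.

```python
def spearman_rank(ranks1, ranks2):
    """Spearman's rank correlation coefficient, assuming no tied ranks."""
    n = len(ranks1)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks1, ranks2))  # sum of squared rank differences
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

math_ranks = [9, 10, 6, 5, 7, 2, 4, 8, 1, 3]
stat_ranks = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(round(spearman_rank(math_ranks, stat_ranks), 3))  # -0.697
```

The negative value indicates that students ranked high in one subject tend to be ranked low in the other.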































REGRESSION

Regression:
If the scatter diagram indicates some relationship between the variables x and y, then the dots of
the scatter diagram will be concentrated around a curve. This curve is called the curve of
regression.
The method used for estimating the unknown values of one variable corresponding to
the known values of another variable is called regression analysis.

Equation of regression line: The average relationship between x and y can be described by
the linear equation $y = a + bx$, whose geometrical presentation is a straight line.
The values of a and b are given by the equations

\[
\sum y = na + b \sum x \quad \text{and} \quad \sum xy = a \sum x + b \sum x^2
\]

Problem 01: Calculate the regression line from the following data

x 4.3 4.5 5.9 5.6 6.1 5.2 3.8 2.1
y 12.6 12.1 11.6 11.8 11.4 11.8 13.2 14.1

Solution:

x           y           xy            x^2
4.3         12.6        54.18         18.49
4.5         12.1        54.45         20.25
5.9         11.6        68.44         34.81
5.6         11.8        66.08         31.36
6.1         11.4        69.54         37.21
5.2         11.8        61.36         27.04
3.8         13.2        50.16         14.44
2.1         14.1        29.61         4.41
Σx = 37.5   Σy = 98.6   Σxy = 453.82  Σx^2 = 188.01

Let $y = a + bx$ be the equation of the regression line of y on x, where a and b are given by
the equations

\[
\sum y = na + b \sum x \ \Rightarrow\ 98.6 = 8a + 37.5b \qquad \ldots (1)
\]

and

\[
\sum xy = a \sum x + b \sum x^2 \ \Rightarrow\ 453.82 = 37.5a + 188.01b \qquad \ldots (2)
\]

Solving (1) and (2) we get $a = 15.53$, $b = -0.684$.

The required line is $y = 15.53 - 0.684x$.
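The two normal equations can be solved in closed form, which gives a convenient numerical check. The sketch below is illustrative only; the function name `fit_line` is an assumption. It eliminates a from the normal equations to get $b = \dfrac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$ and then recovers a from the first equation.

```python
def fit_line(xs, ys):
    """Least-squares line y = a + b*x obtained by solving the two normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)  # slope
    a = (sy - b * sx) / n                          # intercept, from sum(y) = n*a + b*sum(x)
    return a, b

xs = [4.3, 4.5, 5.9, 5.6, 6.1, 5.2, 3.8, 2.1]
ys = [12.6, 12.1, 11.6, 11.8, 11.4, 11.8, 13.2, 14.1]
a, b = fit_line(xs, ys)
print(round(a, 2), round(b, 3))
```

For the data of Problem 01 this gives approximately a = 15.53 and b = -0.684, so y falls by about 0.68 for each unit increase in x.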
