RS – 4 – Multivariate Distributions

Chapter 4
Multivariate Distributions (k ≥ 2)

Multivariate Distributions

All the results derived for the bivariate case can be generalized to k
RVs.

The joint distribution of X1, X2, …, Xk will have the form:

p(x1, x2, …, xk)   when the RVs are discrete
F(x1, x2, …, xk)   when the RVs are continuous


Joint Probability Function

Definition: Joint probability function

Let X1, X2, …, Xk denote k discrete random variables. Then
p(x1, x2, …, xk)
is the joint probability function of X1, X2, …, Xk if

1. 0 \le p(x_1, \dots, x_k) \le 1

2. \sum_{x_1} \cdots \sum_{x_k} p(x_1, \dots, x_k) = 1

3. P[(X_1, \dots, X_k) \in A] = \sum_{(x_1, \dots, x_k) \in A} p(x_1, \dots, x_k)

Joint Density Function

Definition: Joint density function

Let X1, X2, …, Xk denote k continuous random variables. Then

f(x_1, x_2, \dots, x_k) = \frac{\partial^k}{\partial x_1 \, \partial x_2 \cdots \partial x_k} F(x_1, x_2, \dots, x_k)

is the joint density function of X1, X2, …, Xk if

1. f(x_1, \dots, x_k) \ge 0

2. \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, \dots, x_k) \, dx_1 \cdots dx_k = 1

3. P[(X_1, \dots, X_k) \in A] = \int \cdots \int_A f(x_1, \dots, x_k) \, dx_1 \cdots dx_k


Example: The Multinomial distribution

Suppose that we observe an experiment that has k possible outcomes
{O1, O2, …, Ok} independently n times.
Let p1, p2, …, pk denote the probabilities of O1, O2, …, Ok respectively.
Let Xi denote the number of times that outcome Oi occurs in the n
repetitions of the experiment.
Then the joint probability function of the random variables X1, X2, …,
Xk is

p(x_1, \dots, x_k) = \frac{n!}{x_1! \, x_2! \cdots x_k!} \, p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}

Example: The Multinomial distribution

Note: p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k} is the probability of a particular
sequence of length n containing

x1 outcomes O1
x2 outcomes O2
…
xk outcomes Ok

\frac{n!}{x_1! \, x_2! \cdots x_k!} = \binom{n}{x_1 \, x_2 \cdots x_k}

is the number of ways of choosing the positions for the x1
outcomes O1, x2 outcomes O2, …, xk outcomes Ok.


Example: The Multinomial distribution

\binom{n}{x_1 \, x_2 \cdots x_k} = \binom{n}{x_1} \binom{n - x_1}{x_2} \binom{n - x_1 - x_2}{x_3} \cdots \binom{x_k}{x_k}

= \frac{n!}{x_1!(n - x_1)!} \cdot \frac{(n - x_1)!}{x_2!(n - x_1 - x_2)!} \cdot \frac{(n - x_1 - x_2)!}{x_3!(n - x_1 - x_2 - x_3)!} \cdots

= \frac{n!}{x_1! \, x_2! \cdots x_k!}

Hence

p(x_1, \dots, x_k) = \frac{n!}{x_1! \, x_2! \cdots x_k!} \, p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k} = \binom{n}{x_1 \, x_2 \cdots x_k} p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}

This is called the Multinomial distribution.

Example: The Multinomial distribution

Suppose that an earnings announcement has three possible outcomes:
O1 – Positive stock price reaction (30% chance)
O2 – No stock price reaction (50% chance)
O3 – Negative stock price reaction (20% chance)
Hence p1 = 0.30, p2 = 0.50, p3 = 0.20.
Suppose today 4 firms released earnings announcements (n = 4).
Let X = the number that result in a positive stock price reaction,
Y = the number that result in no reaction, and Z = the number
that result in a negative reaction.
Find the distribution of X, Y and Z. Compute P[X + Y ≥ Z].

p(x, y, z) = \frac{4!}{x! \, y! \, z!} (0.30)^x (0.50)^y (0.20)^z, \qquad x + y + z = 4


Table: p(x, y, z)

                          z
 x   y       0       1       2       3       4
 0   0       0       0       0       0       0.0016
 0   1       0       0       0       0.0160  0
 0   2       0       0       0.0600  0       0
 0   3       0       0.1000  0       0       0
 0   4       0.0625  0       0       0       0
 1   0       0       0       0       0.0096  0
 1   1       0       0       0.0720  0       0
 1   2       0       0.1800  0       0       0
 1   3       0.1500  0       0       0       0
 1   4       0       0       0       0       0
 2   0       0       0       0.0216  0       0
 2   1       0       0.1080  0       0       0
 2   2       0.1350  0       0       0       0
 2   3       0       0       0       0       0
 2   4       0       0       0       0       0
 3   0       0       0.0216  0       0       0
 3   1       0.0540  0       0       0       0
 3   2       0       0       0       0       0
 3   3       0       0       0       0       0
 3   4       0       0       0       0       0
 4   0       0.0081  0       0       0       0
 4   1       0       0       0       0       0
 4   2       0       0       0       0       0
 4   3       0       0       0       0       0
 4   4       0       0       0       0       0

P[X + Y ≥ Z] = 0.9728

From the table, the only cells with positive probability where X + Y < Z
are (x, y, z) = (0, 0, 4), (0, 1, 3) and (1, 0, 3), with probabilities
0.0016, 0.0160 and 0.0096. Hence

P[X + Y ≥ Z] = 1 − (0.0016 + 0.0160 + 0.0096) = 1 − 0.0272 = 0.9728
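The pmf and the probability above can be reproduced numerically. A minimal sketch, assuming SciPy is available (scipy.stats.multinomial implements the multinomial pmf derived earlier):

```python
# Numerical check of the earnings-announcement example (assumes SciPy).
from itertools import product
from scipy.stats import multinomial

n, p = 4, [0.30, 0.50, 0.20]

# Sum p(x, y, z) over all outcomes with x + y + z = 4 and x + y >= z
prob = sum(
    multinomial.pmf([x, y, z], n=n, p=p)
    for x, y, z in product(range(n + 1), repeat=3)
    if x + y + z == n and x + y >= z
)
print(round(prob, 4))  # 0.9728
```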


Example: The Multivariate Normal distribution

Recall the univariate normal distribution

f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}

and the bivariate normal distribution

f(x, y) = \frac{1}{2\pi \sigma_x \sigma_y \sqrt{1 - \rho^2}} \, e^{-\frac{1}{2(1 - \rho^2)}\left[\left(\frac{x - \mu_x}{\sigma_x}\right)^2 - 2\rho\left(\frac{x - \mu_x}{\sigma_x}\right)\left(\frac{y - \mu_y}{\sigma_y}\right) + \left(\frac{y - \mu_y}{\sigma_y}\right)^2\right]}

Example: The Multivariate Normal distribution

The k-variate Normal distribution is given by:

f(x_1, \dots, x_k) = f(\mathbf{x}) = \frac{1}{(2\pi)^{k/2} |\Sigma|^{1/2}} \, e^{-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu})' \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})}

where

\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_k \end{pmatrix}, \qquad
\boldsymbol{\mu} = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_k \end{pmatrix}, \qquad
\Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1k} \\ \sigma_{12} & \sigma_{22} & \cdots & \sigma_{2k} \\ \vdots & \vdots & & \vdots \\ \sigma_{1k} & \sigma_{2k} & \cdots & \sigma_{kk} \end{pmatrix}
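As a sketch of how this density is evaluated in practice, the code below computes the closed-form expression and checks it against SciPy's implementation. The values of μ and Σ are illustrative, not from the text.

```python
# Evaluating the k-variate normal density (k = 2 here; mu, Sigma, x are
# illustrative values; assumes NumPy and SciPy).
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([0.0, 1.0])                   # mean vector
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])              # covariance matrix
x = np.array([0.5, 1.5])

# Closed-form expression from the slide
k = len(mu)
d = x - mu
quad = d @ np.linalg.solve(Sigma, d)        # (x - mu)' Sigma^{-1} (x - mu)
f_manual = np.exp(-0.5 * quad) / ((2 * np.pi) ** (k / 2) * np.sqrt(np.linalg.det(Sigma)))

# SciPy's implementation agrees
f_scipy = multivariate_normal(mean=mu, cov=Sigma).pdf(x)
assert np.isclose(f_manual, f_scipy)
```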


Marginal joint probability function

Definition: Marginal joint probability function

Let X1, X2, …, Xq, Xq+1, …, Xk denote k discrete random variables
with joint probability function
p(x1, x2, …, xq, xq+1, …, xk ).
Then the marginal joint probability function of X1, X2, …, Xq is

p_{12 \cdots q}(x_1, \dots, x_q) = \sum_{x_{q+1}} \cdots \sum_{x_k} p(x_1, \dots, x_k)

When X1, X2, …, Xq, Xq+1, …, Xk are continuous, the marginal
joint density function of X1, X2, …, Xq is

f_{12 \cdots q}(x_1, \dots, x_q) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, \dots, x_k) \, dx_{q+1} \cdots dx_k

Conditional joint probability function

Definition: Conditional joint probability function

Let X1, X2, …, Xq, Xq+1, …, Xk denote k discrete random variables
with joint probability function
p(x1, x2, …, xq, xq+1, …, xk ).
Then the conditional joint probability function of X1, X2, …, Xq given
Xq+1 = xq+1, …, Xk = xk is

p_{1 \cdots q \mid q+1 \cdots k}(x_1, \dots, x_q \mid x_{q+1}, \dots, x_k) = \frac{p(x_1, \dots, x_k)}{p_{q+1 \cdots k}(x_{q+1}, \dots, x_k)}

For the continuous case, we have:

f_{1 \cdots q \mid q+1 \cdots k}(x_1, \dots, x_q \mid x_{q+1}, \dots, x_k) = \frac{f(x_1, \dots, x_k)}{f_{q+1 \cdots k}(x_{q+1}, \dots, x_k)}


Independence

Definition: Independence of sets of vectors

Let X1, X2, …, Xq, Xq+1, …, Xk denote k continuous random
variables with joint probability density function
f(x1, x2, …, xq, xq+1, …, xk ).
Then the variables X1, X2, …, Xq are independent of Xq+1, …, Xk if

f(x_1, \dots, x_k) = f_{1 \cdots q}(x_1, \dots, x_q) \, f_{q+1 \cdots k}(x_{q+1}, \dots, x_k)

A similar definition holds for discrete random variables.

Independence

Definition: Mutual Independence

Let X1, X2, …, Xk denote k continuous random variables with joint
probability density function
f(x1, x2, …, xk ).
Then the variables X1, X2, …, Xk are called mutually independent if

f(x_1, \dots, x_k) = f_1(x_1) \, f_2(x_2) \cdots f_k(x_k)

A similar definition holds for discrete random variables.


Multivariate marginal pdfs - Example

Let X, Y, Z denote 3 jointly distributed random variables with joint
density function

f(x, y, z) = \begin{cases} K(x^2 + yz) & 0 \le x \le 1, \; 0 \le y \le 1, \; 0 \le z \le 1 \\ 0 & \text{otherwise} \end{cases}

Find the value of K.
Determine the marginal distributions of X, Y and Z.
Determine the joint marginal distributions of
X, Y
X, Z
Y, Z

Multivariate marginal pdfs - Example

Solution: Determining the value of K.

1 = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y, z) \, dx \, dy \, dz = \int_0^1 \int_0^1 \int_0^1 K(x^2 + yz) \, dx \, dy \, dz

= K \int_0^1 \int_0^1 \left[ \frac{x^3}{3} + xyz \right]_{x=0}^{x=1} dy \, dz = K \int_0^1 \int_0^1 \left( \frac{1}{3} + yz \right) dy \, dz

= K \int_0^1 \left[ \frac{y}{3} + \frac{y^2}{2} z \right]_{y=0}^{y=1} dz = K \int_0^1 \left( \frac{1}{3} + \frac{z}{2} \right) dz

= K \left[ \frac{z}{3} + \frac{z^2}{4} \right]_0^1 = K \left( \frac{1}{3} + \frac{1}{4} \right) = K \, \frac{7}{12} = 1 \quad \text{if } K = \frac{12}{7}


Multivariate marginal pdfs - Example

The marginal distribution of X:

f_1(x) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y, z) \, dy \, dz = \frac{12}{7} \int_0^1 \int_0^1 (x^2 + yz) \, dy \, dz

= \frac{12}{7} \int_0^1 \left[ x^2 y + \frac{y^2}{2} z \right]_{y=0}^{y=1} dz = \frac{12}{7} \int_0^1 \left( x^2 + \frac{z}{2} \right) dz

= \frac{12}{7} \left[ x^2 z + \frac{z^2}{4} \right]_0^1 = \frac{12}{7} \left( x^2 + \frac{1}{4} \right) \quad \text{for } 0 \le x \le 1

Multivariate marginal pdfs - Example

The marginal distribution of X, Y:

f_{12}(x, y) = \int_{-\infty}^{\infty} f(x, y, z) \, dz = \frac{12}{7} \int_0^1 (x^2 + yz) \, dz

= \frac{12}{7} \left[ x^2 z + y \frac{z^2}{2} \right]_{z=0}^{z=1}

= \frac{12}{7} \left( x^2 + \frac{1}{2} y \right) \quad \text{for } 0 \le x \le 1, \; 0 \le y \le 1


Multivariate marginal pdfs - Example

Find the conditional distribution of:
1. Z given X = x, Y = y
2. Y given X = x, Z = z
3. X given Y = y, Z = z
4. Y, Z given X = x
5. X, Z given Y = y
6. X, Y given Z = z
7. Y given X = x
8. X given Y = y
9. X given Z = z
10. Z given X = x
11. Z given Y = y
12. Y given Z = z

Multivariate marginal pdfs - Example

The marginal distribution of X, Y:

f_{12}(x, y) = \frac{12}{7} \left( x^2 + \frac{1}{2} y \right) \quad \text{for } 0 \le x \le 1, \; 0 \le y \le 1

Thus the conditional distribution of Z given X = x, Y = y is

\frac{f(x, y, z)}{f_{12}(x, y)} = \frac{\frac{12}{7}(x^2 + yz)}{\frac{12}{7}\left(x^2 + \frac{1}{2} y\right)} = \frac{x^2 + yz}{x^2 + \frac{1}{2} y} \quad \text{for } 0 \le z \le 1


Multivariate marginal pdfs - Example

The marginal distribution of X:

f_1(x) = \frac{12}{7} \left( x^2 + \frac{1}{4} \right) \quad \text{for } 0 \le x \le 1

Then, the conditional distribution of Y, Z given X = x is

\frac{f(x, y, z)}{f_1(x)} = \frac{\frac{12}{7}(x^2 + yz)}{\frac{12}{7}\left(x^2 + \frac{1}{4}\right)} = \frac{x^2 + yz}{x^2 + \frac{1}{4}} \quad \text{for } 0 \le y \le 1, \; 0 \le z \le 1

Expectations for Multivariate Distributions

Definition: Expectation
Let X1, X2, …, Xn denote n jointly distributed random variables with
joint density function
f(x1, x2, …, xn ).
Then

E[g(X_1, \dots, X_n)] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} g(x_1, \dots, x_n) \, f(x_1, \dots, x_n) \, dx_1 \cdots dx_n


Expectations for Multivariate Distributions - Example

Let X, Y, Z denote 3 jointly distributed random variables with joint
density function

f(x, y, z) = \begin{cases} \frac{12}{7}(x^2 + yz) & 0 \le x \le 1, \; 0 \le y \le 1, \; 0 \le z \le 1 \\ 0 & \text{otherwise} \end{cases}

Determine E[XYZ].

Solution:

E[XYZ] = \int_0^1 \int_0^1 \int_0^1 xyz \, \frac{12}{7}(x^2 + yz) \, dx \, dy \, dz

= \frac{12}{7} \int_0^1 \int_0^1 \int_0^1 \left( x^3 yz + x y^2 z^2 \right) dx \, dy \, dz

Expectations for Multivariate Distributions - Example

E[XYZ] = \frac{12}{7} \int_0^1 \int_0^1 \int_0^1 \left( x^3 yz + x y^2 z^2 \right) dx \, dy \, dz

= \frac{12}{7} \int_0^1 \int_0^1 \left[ \frac{x^4}{4} yz + \frac{x^2}{2} y^2 z^2 \right]_{x=0}^{x=1} dy \, dz = \frac{3}{7} \int_0^1 \int_0^1 \left( yz + 2 y^2 z^2 \right) dy \, dz

= \frac{3}{7} \int_0^1 \left[ \frac{y^2}{2} z + \frac{2 y^3}{3} z^2 \right]_{y=0}^{y=1} dz = \frac{3}{7} \int_0^1 \left( \frac{z}{2} + \frac{2 z^2}{3} \right) dz

= \frac{3}{7} \left[ \frac{z^2}{4} + \frac{2 z^3}{9} \right]_0^1 = \frac{3}{7} \left( \frac{1}{4} + \frac{2}{9} \right) = \frac{3}{7} \cdot \frac{17}{36} = \frac{17}{84}
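A one-line numerical check of this result (a sketch, assuming SciPy):

```python
# Check that E[XYZ] = 17/84 by direct numerical integration (assumes SciPy).
from scipy.integrate import tplquad

val, _ = tplquad(lambda z, y, x: x * y * z * (12 / 7) * (x**2 + y * z),
                 0, 1, 0, 1, 0, 1)
print(val, 17 / 84)   # both ~0.2024
```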


Some Rules for Expectations – Rule 1

1. E[X_i] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} x_i \, f(x_1, \dots, x_n) \, dx_1 \cdots dx_n = \int_{-\infty}^{\infty} x_i \, f_i(x_i) \, dx_i

Thus you can calculate E[Xi] either from the joint distribution of
X1, …, Xn or from the marginal distribution of Xi.

Proof:

\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} x_i \, f(x_1, \dots, x_n) \, dx_1 \cdots dx_n

= \int_{-\infty}^{\infty} x_i \left[ \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, \dots, x_n) \, dx_1 \cdots dx_{i-1} \, dx_{i+1} \cdots dx_n \right] dx_i

= \int_{-\infty}^{\infty} x_i \, f_i(x_i) \, dx_i

Some Rules for Expectations – Rule 2

2. E[a_1 X_1 + \cdots + a_n X_n] = a_1 E[X_1] + \cdots + a_n E[X_n]

This property is called the Linearity property.

Proof:

\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} (a_1 x_1 + \cdots + a_n x_n) \, f(x_1, \dots, x_n) \, dx_1 \cdots dx_n

= a_1 \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} x_1 \, f(x_1, \dots, x_n) \, dx_1 \cdots dx_n + \cdots + a_n \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} x_n \, f(x_1, \dots, x_n) \, dx_1 \cdots dx_n

Some Rules for Expectations – Rule 3

3. (The Multiplicative property) Suppose X1, …, Xq are
independent of Xq+1, …, Xk. Then

E\left[ g(X_1, \dots, X_q) \, h(X_{q+1}, \dots, X_k) \right] = E\left[ g(X_1, \dots, X_q) \right] E\left[ h(X_{q+1}, \dots, X_k) \right]

In the simple case when k = 2, and g(X) = X and h(Y) = Y:

E[XY] = E[X] \, E[Y]

if X and Y are independent.

Some Rules for Expectations – Rule 3

Proof:

E\left[ g(X_1, \dots, X_q) \, h(X_{q+1}, \dots, X_k) \right]

= \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} g(x_1, \dots, x_q) \, h(x_{q+1}, \dots, x_k) \, f(x_1, \dots, x_k) \, dx_1 \cdots dx_k

= \int \cdots \int g(x_1, \dots, x_q) \, h(x_{q+1}, \dots, x_k) \, f_1(x_1, \dots, x_q) \, f_2(x_{q+1}, \dots, x_k) \, dx_1 \cdots dx_q \, dx_{q+1} \cdots dx_k

= \int \cdots \int h(x_{q+1}, \dots, x_k) \, f_2(x_{q+1}, \dots, x_k) \left[ \int \cdots \int g(x_1, \dots, x_q) \, f_1(x_1, \dots, x_q) \, dx_1 \cdots dx_q \right] dx_{q+1} \cdots dx_k

= E\left[ g(X_1, \dots, X_q) \right] \int \cdots \int h(x_{q+1}, \dots, x_k) \, f_2(x_{q+1}, \dots, x_k) \, dx_{q+1} \cdots dx_k


Some Rules for Expectations – Rule 3

= E\left[ g(X_1, \dots, X_q) \right] \int \cdots \int h(x_{q+1}, \dots, x_k) \, f_2(x_{q+1}, \dots, x_k) \, dx_{q+1} \cdots dx_k

= E\left[ g(X_1, \dots, X_q) \right] E\left[ h(X_{q+1}, \dots, X_k) \right]

Some Rules for Variance – Rule 1

1. Var[X + Y] = Var[X] + Var[Y] + 2 Cov[X, Y]

where Cov[X, Y] = E[(X - \mu_X)(Y - \mu_Y)]

Proof:

Var[X + Y] = E\left[ \left( (X + Y) - \mu_{X+Y} \right)^2 \right]

where \mu_{X+Y} = E[X + Y] = \mu_X + \mu_Y

Thus,

Var[X + Y] = E\left[ \left( (X - \mu_X) + (Y - \mu_Y) \right)^2 \right]

= E\left[ (X - \mu_X)^2 + 2 (X - \mu_X)(Y - \mu_Y) + (Y - \mu_Y)^2 \right]

= Var[X] + 2 Cov[X, Y] + Var[Y]


Some Rules for Variance – Rule 1

Note: If X and Y are independent, then

Cov[X, Y] = E[(X - \mu_X)(Y - \mu_Y)] = E[X - \mu_X] \, E[Y - \mu_Y] = 0 \cdot 0 = 0

and Var[X + Y] = Var[X] + Var[Y]

Some Rules for Variance – Rule 1 – ρXY

Definition: Correlation coefficient
For any two random variables X and Y, define the correlation coefficient
ρXY to be:

\rho_{XY} = \frac{\text{Cov}[X, Y]}{\sqrt{\text{Var}[X] \, \text{Var}[Y]}} = \frac{\text{Cov}[X, Y]}{\sigma_X \sigma_Y}

Thus \text{Cov}[X, Y] = \rho_{XY} \, \sigma_X \sigma_Y

and

\text{Var}[X + Y] = \sigma_X^2 + \sigma_Y^2 + 2 \rho_{XY} \sigma_X \sigma_Y = \sigma_X^2 + \sigma_Y^2

if X and Y are independent.


Some Rules for Variance – Rule 1 – ρXY

Recall \rho_{XY} = \frac{\text{Cov}[X, Y]}{\sqrt{\text{Var}[X] \, \text{Var}[Y]}} = \frac{\text{Cov}[X, Y]}{\sigma_X \sigma_Y}

Property 1. If X and Y are independent, then ρXY = 0 (since Cov(X, Y) = 0).

The converse is not necessarily true. That is, ρXY = 0 does not imply
that X and Y are independent.

Example:

 y\x      6    8    10   fy(y)
  1      .2    0    .2    .4
  2       0   .2     0    .2
  3      .2    0    .2    .4
 fx(x)   .4   .2    .4     1

E(X) = 8, E(Y) = 2, E(XY) = 16, so Cov(X, Y) = 16 − 8·2 = 0.
But P(X=6, Y=2) = 0 ≠ P(X=6)·P(Y=2) = .4 × .2 = .08
⇒ X and Y are not independent.
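This example can be checked mechanically. A short sketch (the pmf table is copied from the slide; everything else is NumPy bookkeeping):

```python
# Zero covariance but not independent (assumes NumPy).
import numpy as np

xs = np.array([6, 8, 10])
ys = np.array([1, 2, 3])
p = np.array([[.2, 0, .2],        # rows: y = 1, 2, 3; columns: x = 6, 8, 10
              [0, .2, 0],
              [.2, 0, .2]])

EX = xs @ p.sum(axis=0)                     # 8.0
EY = ys @ p.sum(axis=1)                     # 2.0
EXY = (p * np.outer(ys, xs)).sum()          # 16.0
print(EXY - EX * EY)                        # Cov(X, Y) = 0

# Yet P(X=6, Y=2) = 0 while P(X=6) P(Y=2) = .4 * .2 = .08
print(p[1, 0], p.sum(axis=0)[0] * p.sum(axis=1)[1])
```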

Some Rules for Variance – Rule 1 – ρXY

Property 2. −1 ≤ ρXY ≤ 1, and |ρXY| = 1 if there exist a and b such that

P[Y = bX + a] = 1

where ρXY = +1 if b > 0 and ρXY = −1 if b < 0.

Proof: Let U = X - \mu_X and V = Y - \mu_Y.

Let g(b) = E\left[ (V - bU)^2 \right] \ge 0 for all b.

We will pick b to minimize g(b).

g(b) = E\left[ (V - bU)^2 \right] = E\left[ V^2 - 2bVU + b^2 U^2 \right] = E[V^2] - 2b \, E[VU] + b^2 E[U^2]

Some Rules for Variance – Rule 1 – ρXY

Taking the first derivative of g(b) w.r.t. b:

g(b) = E\left[ (V - bU)^2 \right] = E[V^2] - 2b \, E[VU] + b^2 E[U^2]

g'(b) = -2 E[VU] + 2b \, E[U^2] = 0 \;\Rightarrow\; b = b_{\min} = \frac{E[VU]}{E[U^2]}

Since g(b) ≥ 0, then g(b_min) ≥ 0:

g(b_{\min}) = E[V^2] - 2 b_{\min} E[VU] + b_{\min}^2 E[U^2]

= E[V^2] - 2 \frac{\left( E[VU] \right)^2}{E[U^2]} + \frac{\left( E[VU] \right)^2}{E[U^2]}

= E[V^2] - \frac{\left( E[VU] \right)^2}{E[U^2]} \ge 0
Some Rules for Variance – Rule 1 – ρXY

E[V^2] - \frac{\left( E[VU] \right)^2}{E[U^2]} \ge 0

Thus \frac{\left( E[VU] \right)^2}{E[U^2] \, E[V^2]} \le 1

or \rho_{XY}^2 = \frac{\left( E[(X - \mu_X)(Y - \mu_Y)] \right)^2}{E[(X - \mu_X)^2] \; E[(Y - \mu_Y)^2]} \le 1

\Rightarrow \; -1 \le \rho_{XY} \le 1


Some Rules for Variance – Rule 1 – ρXY

Note: g(b_{\min}) = E[V^2] - 2 b_{\min} E[VU] + b_{\min}^2 E[U^2] = E\left[ (V - b_{\min} U)^2 \right] = 0

if and only if \rho_{XY}^2 = 1.

This will be true if P[V - b_{\min} U = 0] = 1, i.e.,

P\left[ (Y - \mu_Y) - b_{\min}(X - \mu_X) = 0 \right] = 1

P[Y = b_{\min} X + a] = 1 \quad \text{where } a = \mu_Y - b_{\min} \mu_X

Some Rules for Variance – Rule 1 – ρXY

• Summary:

−1 ≤ ρXY ≤ 1, and |ρXY| = 1 if there exist a and b such that

P[Y = bX + a] = 1

where

b = b_{\min} = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{E[(X - \mu_X)^2]} = \frac{\text{Cov}[X, Y]}{\text{Var}[X]} = \frac{\rho_{XY} \sigma_X \sigma_Y}{\sigma_X^2} = \rho_{XY} \frac{\sigma_Y}{\sigma_X}

and a = \mu_Y - b_{\min} \mu_X = \mu_Y - \rho_{XY} \frac{\sigma_Y}{\sigma_X} \mu_X


Some Rules for Variance – Rule 2

2. Var[aX + bY] = a^2 \, \text{Var}[X] + b^2 \, \text{Var}[Y] + 2ab \, \text{Cov}[X, Y]

Proof:

Var[aX + bY] = E\left[ \left( (aX + bY) - \mu_{aX+bY} \right)^2 \right]

with \mu_{aX+bY} = E[aX + bY] = a \mu_X + b \mu_Y

Thus,

Var[aX + bY] = E\left[ \left( (aX + bY) - (a \mu_X + b \mu_Y) \right)^2 \right]

= E\left[ a^2 (X - \mu_X)^2 + 2ab (X - \mu_X)(Y - \mu_Y) + b^2 (Y - \mu_Y)^2 \right]

= a^2 \, \text{Var}[X] + 2ab \, \text{Cov}[X, Y] + b^2 \, \text{Var}[Y]

Some Rules for Variance – Rule 3

3. Var[a_1 X_1 + \cdots + a_n X_n]

= a_1^2 \, \text{Var}[X_1] + \cdots + a_n^2 \, \text{Var}[X_n]
\; + 2 a_1 a_2 \, \text{Cov}[X_1, X_2] + \cdots + 2 a_1 a_n \, \text{Cov}[X_1, X_n]
\; + 2 a_2 a_3 \, \text{Cov}[X_2, X_3] + \cdots + 2 a_2 a_n \, \text{Cov}[X_2, X_n]
\; + \cdots + 2 a_{n-1} a_n \, \text{Cov}[X_{n-1}, X_n]

= \sum_{i=1}^{n} a_i^2 \, \text{Var}[X_i] + 2 \sum_{i < j} a_i a_j \, \text{Cov}[X_i, X_j]

= \sum_{i=1}^{n} a_i^2 \, \text{Var}[X_i] \quad \text{if } X_1, \dots, X_n \text{ are mutually independent}


The mean and variance of a Binomial RV

We have already computed these by other methods:
1. Using the probability function p(x).
2. Using the moment generating function mX(t).
Now, we will apply the previous rules for means and variances.

Suppose that we have observed n independent repetitions of a
Bernoulli trial.

Let X1, …, Xn be n mutually independent random variables, each
having a Bernoulli distribution with parameter p and defined by

X_i = \begin{cases} 1 & \text{if repetition } i \text{ is } S \; (\text{prob} = p) \\ 0 & \text{if repetition } i \text{ is } F \; (\text{prob} = q) \end{cases}

The mean and variance of a Binomial RV

\mu = E[X_i] = 1 \cdot p + 0 \cdot q = p

\sigma^2 = \text{Var}[X_i] = (1 - p)^2 p + (0 - p)^2 q = (1 - p)^2 p + p^2 (1 - p) = (1 - p)(p - p^2 + p^2) = (1 - p) p = pq

• Now X = X1 + … + Xn has a Binomial distribution with parameters
n and p. Then, X is the total number of successes in the n repetitions.

\mu_X = E[X_1] + \cdots + E[X_n] = p + \cdots + p = np

\sigma_X^2 = \text{Var}[X_1] + \cdots + \text{Var}[X_n] = pq + \cdots + pq = npq
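A quick simulation check of these two formulas (a sketch; n and p are illustrative):

```python
# Monte Carlo check that E[X] = np and Var[X] = npq (assumes NumPy).
import numpy as np

rng = np.random.default_rng(1)
n, p = 20, 0.3
draws = rng.binomial(n, p, size=500_000)
print(draws.mean(), n * p)                  # ~6.0
print(draws.var(), n * p * (1 - p))         # ~4.2
```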


Conditional Expectation

Definition: Conditional Joint Probability Function

Let X1, X2, …, Xq, Xq+1, …, Xk denote k continuous random variables
with joint probability density function
f(x1, x2, …, xq, xq+1, …, xk ).
Then the conditional joint probability function of X1, X2, …, Xq
given Xq+1 = xq+1, …, Xk = xk is

f_{1 \cdots q \mid q+1 \cdots k}(x_1, \dots, x_q \mid x_{q+1}, \dots, x_k) = \frac{f(x_1, \dots, x_k)}{f_{q+1 \cdots k}(x_{q+1}, \dots, x_k)}


Definition: Conditional Expectation

Let U = h(X1, X2, …, Xq, Xq+1, …, Xk). Then the conditional
expectation of U given Xq+1 = xq+1, …, Xk = xk is

E[U \mid x_{q+1}, \dots, x_k] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h(x_1, \dots, x_k) \, f_{1 \cdots q \mid q+1 \cdots k}(x_1, \dots, x_q \mid x_{q+1}, \dots, x_k) \, dx_1 \cdots dx_q

Note: This will be a function of xq+1, …, xk.

Conditional Expectation of a Function - Example

Let X, Y, Z denote 3 jointly distributed RVs with joint density function

f(x, y, z) = \begin{cases} \frac{12}{7}(x^2 + yz) & 0 \le x \le 1, \; 0 \le y \le 1, \; 0 \le z \le 1 \\ 0 & \text{otherwise} \end{cases}

Determine the conditional expectation of U = X² + Y + Z given
X = x, Y = y.

Integration over z gives us the marginal distribution of X, Y:

f_{12}(x, y) = \frac{12}{7} \left( x^2 + \frac{1}{2} y \right) \quad \text{for } 0 \le x \le 1, \; 0 \le y \le 1


Conditional Expectation of a Function - Example

Then, the conditional distribution of Z given X = x, Y = y is

\frac{f(x, y, z)}{f_{12}(x, y)} = \frac{\frac{12}{7}(x^2 + yz)}{\frac{12}{7}\left(x^2 + \frac{1}{2} y\right)} = \frac{x^2 + yz}{x^2 + \frac{1}{2} y} \quad \text{for } 0 \le z \le 1

Conditional Expectation of a Function - Example

The conditional expectation of U = X² + Y + Z given X = x, Y = y:

E[U \mid x, y] = \int_0^1 (x^2 + y + z) \, \frac{x^2 + yz}{x^2 + \frac{1}{2} y} \, dz

= \frac{1}{x^2 + \frac{1}{2} y} \int_0^1 (x^2 + y + z)(x^2 + yz) \, dz

= \frac{1}{x^2 + \frac{1}{2} y} \int_0^1 \left[ y z^2 + \left( y (x^2 + y) + x^2 \right) z + x^2 (x^2 + y) \right] dz

= \frac{1}{x^2 + \frac{1}{2} y} \left[ y \frac{z^3}{3} + \left( y (x^2 + y) + x^2 \right) \frac{z^2}{2} + x^2 (x^2 + y) z \right]_{z=0}^{z=1}

= \frac{1}{x^2 + \frac{1}{2} y} \left[ \frac{1}{3} y + \frac{1}{2} \left( y (x^2 + y) + x^2 \right) + x^2 (x^2 + y) \right]


Conditional Expectation of a Function - Example

Thus the conditional expectation of U = X² + Y + Z given X = x, Y = y is

E[U \mid x, y] = \frac{1}{x^2 + \frac{1}{2} y} \left[ \frac{1}{3} y + \frac{1}{2} \left( y (x^2 + y) + x^2 \right) + x^2 (x^2 + y) \right]

= \frac{1}{x^2 + \frac{1}{2} y} \left[ (x^2 + y) \left( x^2 + \frac{1}{2} y \right) + \frac{1}{2} x^2 + \frac{1}{3} y \right]

= x^2 + y + \frac{\frac{1}{2} x^2 + \frac{1}{3} y}{x^2 + \frac{1}{2} y}

A Useful Tool: Iterated Expectations

Theorem
Let (x1, x2, …, xq, y1, y2, …, ym) = (x, y) denote q + m RVs.
Let U(x1, x2, …, xq, y1, y2, …, ym) = g(x, y). Then,

E[U] = E_y\left[ E[U \mid y] \right]

Var[U] = E_y\left[ \text{Var}[U \mid y] \right] + \text{Var}_y\left[ E[U \mid y] \right]

The first result is commonly referred to as the Law of iterated expectations.
The second result is commonly referred to as the Law of total variance or
variance decomposition formula.


A Useful Tool: Iterated Expectations

Proof (in the simple case of 2 variables, X and Y):

First, we prove the Law of iterated expectations. Let U = g(X, Y). Then

E[U] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y) \, f(x, y) \, dx \, dy

E[U \mid Y] = E[g(X, Y) \mid Y] = \int_{-\infty}^{\infty} g(x, y) \, f_{X \mid Y}(x \mid y) \, dx = \int_{-\infty}^{\infty} g(x, y) \, \frac{f(x, y)}{f_Y(y)} \, dx

hence

E_Y\left[ E[U \mid Y] \right] = \int_{-\infty}^{\infty} E[U \mid y] \, f_Y(y) \, dy

A Useful Tool: Iterated Expectations

E_Y\left[ E[U \mid Y] \right] = \int_{-\infty}^{\infty} E[U \mid y] \, f_Y(y) \, dy

= \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} g(x, y) \, \frac{f(x, y)}{f_Y(y)} \, dx \right] f_Y(y) \, dy

= \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} g(x, y) \, f(x, y) \, dx \right] dy

= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y) \, f(x, y) \, dx \, dy = E[U]


A Useful Tool: Iterated Expectations

Now, for the Law of total variance:

Var[U] = E[U^2] - \left( E[U] \right)^2

= E_Y\left[ E[U^2 \mid Y] \right] - \left( E_Y\left[ E[U \mid Y] \right] \right)^2

= E_Y\left[ \text{Var}[U \mid Y] + \left( E[U \mid Y] \right)^2 \right] - \left( E_Y\left[ E[U \mid Y] \right] \right)^2

= E_Y\left[ \text{Var}[U \mid Y] \right] + \left\{ E_Y\left[ \left( E[U \mid Y] \right)^2 \right] - \left( E_Y\left[ E[U \mid Y] \right] \right)^2 \right\}

= E_Y\left[ \text{Var}[U \mid Y] \right] + \text{Var}_Y\left[ E[U \mid Y] \right]

A Useful Tool: Iterated Expectations - Example

Example:
Suppose that a rectangle is constructed by first choosing its length, X,
and then choosing its width, Y.
Its length X is selected from an exponential distribution with mean
μ = 1/λ = 5. Once the length has been chosen, its width, Y, is selected
from a uniform distribution from 0 to half its length.
Find the mean and variance of the area of the rectangle, A = XY.


A Useful Tool: Iterated Expectations - Example

Solution:

f_X(x) = \frac{1}{5} e^{-x/5} \quad \text{for } x \ge 0

f_{Y \mid X}(y \mid x) = \frac{1}{x/2} \quad \text{if } 0 \le y \le x/2

f(x, y) = f_X(x) \, f_{Y \mid X}(y \mid x) = \frac{1}{x/2} \cdot \frac{1}{5} e^{-x/5} = \frac{2}{5x} e^{-x/5} \quad \text{if } 0 \le y \le x/2, \; x \ge 0

We could compute the mean and variance of A = XY from the
joint density f(x, y).

A Useful Tool: Iterated Expectations - Example

E[A] = E[XY] = \int \int xy \, f(x, y) \, dx \, dy = \int_0^{\infty} \int_0^{x/2} xy \, \frac{2}{5x} e^{-x/5} \, dy \, dx = \int_0^{\infty} \int_0^{x/2} \frac{2}{5} y \, e^{-x/5} \, dy \, dx

E[A^2] = E[X^2 Y^2] = \int \int x^2 y^2 f(x, y) \, dx \, dy = \int_0^{\infty} \int_0^{x/2} x^2 y^2 \frac{2}{5x} e^{-x/5} \, dy \, dx = \int_0^{\infty} \int_0^{x/2} \frac{2}{5} x y^2 e^{-x/5} \, dy \, dx

and Var[A] = E[A^2] - \left( E[A] \right)^2


A Useful Tool: Iterated Expectations - Example

E[A] = \int_0^{\infty} \int_0^{x/2} \frac{2}{5} y \, e^{-x/5} \, dy \, dx = \frac{2}{5} \int_0^{\infty} \left[ \frac{y^2}{2} \right]_{y=0}^{y=x/2} e^{-x/5} \, dx

= \frac{2}{5} \int_0^{\infty} \frac{x^2}{8} e^{-x/5} \, dx = \frac{1}{20} \int_0^{\infty} x^2 e^{-x/5} \, dx

= \frac{1}{20} \cdot 2! \, 5^3 = \frac{250}{20} = \frac{25}{2} = 12.5

(using \int_0^{\infty} x^k e^{-x/\theta} \, dx = k! \, \theta^{k+1})

A Useful Tool: Iterated Expectations - Example

E[A^2] = \int_0^{\infty} \int_0^{x/2} \frac{2}{5} x y^2 e^{-x/5} \, dy \, dx = \frac{2}{5} \int_0^{\infty} \left[ \frac{y^3}{3} \right]_{y=0}^{y=x/2} x \, e^{-x/5} \, dx

= \frac{2}{5} \int_0^{\infty} \frac{x^4}{24} e^{-x/5} \, dx = \frac{1}{60} \int_0^{\infty} x^4 e^{-x/5} \, dx

= \frac{1}{60} \cdot 4! \, 5^5 = \frac{24 \cdot 3125}{60} = 1250

Thus Var[A] = E[A^2] - \left( E[A] \right)^2 = 1250 - (12.5)^2 = 1093.75


A Useful Tool: Iterated Expectations - Example

Now, let's use the previous theorem. That is,

E[A] = E[XY] = E_X\left[ E[XY \mid X] \right]

and Var[A] = Var[XY] = E_X\left[ \text{Var}[XY \mid X] \right] + \text{Var}_X\left[ E[XY \mid X] \right]

Now E[XY \mid X] = X \, E[Y \mid X] = X \cdot \frac{X}{4} = \frac{1}{4} X^2

and \text{Var}[XY \mid X] = X^2 \, \text{Var}[Y \mid X] = X^2 \, \frac{(X/2 - 0)^2}{12} = \frac{1}{48} X^4

This is because, given X, Y has a uniform distribution from 0 to X/2.

A Useful Tool: Iterated Expectations - Example

Thus E[A] = E[XY] = E_X\left[ E[XY \mid X] \right] = E_X\left[ \frac{1}{4} X^2 \right] = \frac{1}{4} E_X[X^2] = \frac{1}{4} \mu_2

where \mu_2 = the 2nd moment of the exponential distribution with \lambda = \frac{1}{5}.

Note: \mu_k = \frac{k!}{\lambda^k} for the exponential distribution.

Thus E[A] = \frac{1}{4} \mu_2 = \frac{1}{4} \cdot \frac{2!}{(1/5)^2} = \frac{2 \cdot 25}{4} = 12.5

• The same answer as previously calculated!! And no integration needed!


A Useful Tool: Iterated Expectations - Example

Now E[XY \mid X] = \frac{1}{4} X^2 and \text{Var}[XY \mid X] = \frac{1}{48} X^4

Also Var[A] = Var[XY] = E_X\left[ \text{Var}[XY \mid X] \right] + \text{Var}_X\left[ E[XY \mid X] \right]

E_X\left[ \text{Var}[XY \mid X] \right] = E_X\left[ \frac{1}{48} X^4 \right] = \frac{1}{48} \mu_4 = \frac{1}{48} \cdot \frac{4!}{(1/5)^4} = \frac{4! \, 5^4}{48}

\text{Var}_X\left[ E[XY \mid X] \right] = \text{Var}_X\left[ \frac{1}{4} X^2 \right] = \left( \frac{1}{4} \right)^2 \text{Var}_X[X^2]

= \left( \frac{1}{4} \right)^2 \left\{ E_X[X^4] - \left( E_X[X^2] \right)^2 \right\} = \left( \frac{1}{4} \right)^2 \left( \mu_4 - \mu_2^2 \right)

A Useful Tool: Iterated Expectations - Example

\text{Var}_X\left[ E[XY \mid X] \right] = \left( \frac{1}{4} \right)^2 \left( \mu_4 - \mu_2^2 \right) = \left( \frac{1}{4} \right)^2 \left[ 4! \, 5^4 - \left( 2! \, 5^2 \right)^2 \right]

= \frac{(24 - 4) \, 5^4}{16} = \frac{20 \cdot 5^4}{16} = \frac{5^5}{4}

Thus Var[A] = Var[XY] = E_X\left[ \text{Var}[XY \mid X] \right] + \text{Var}_X\left[ E[XY \mid X] \right]

= \frac{4! \, 5^4}{48} + \frac{5^5}{4} = \frac{5^4}{2} + \frac{5^5}{4} = 5^4 \left( \frac{1}{2} + \frac{5}{4} \right) = 1093.75

• The same answer as previously calculated!! And no integration needed!
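The two-stage construction also lends itself to a direct Monte Carlo check (a sketch; the sample size is illustrative):

```python
# Monte Carlo check of E[A] = 12.5 and Var[A] = 1093.75 (assumes NumPy).
import numpy as np

rng = np.random.default_rng(2)
n = 2_000_000
X = rng.exponential(scale=5.0, size=n)      # length: exponential with mean 5
Y = rng.uniform(0.0, X / 2)                 # width: uniform on (0, X/2), given X
A = X * Y
print(A.mean(), A.var())                    # ~12.5 and ~1093.75
```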


The Multivariate MGF

Definition: Multivariate MGF
Let X1, X2, …, Xq be q random variables with a joint density
function given by f(x1, x2, …, xq). The multivariate MGF is

m_{\mathbf{X}}(\mathbf{t}) = E_X[\exp(\mathbf{t}' \mathbf{X})]

where t′ = (t1, t2, …, tq) and X = (X1, X2, …, Xq)′.

If X1, X2, …, Xn are n independent random variables, then

m_{\mathbf{X}}(\mathbf{t}) = \prod_{i=1}^{n} m_{X_i}(t_i)

The MGF of a Multivariate Normal

Definition: MGF for the Multivariate Normal
Let X1, X2, …, Xq be q normal random variables. The multivariate
normal MGF is

m_{\mathbf{X}}(\mathbf{t}) = E_X[\exp(\mathbf{t}' \mathbf{X})] = \exp\left( \mathbf{t}' \boldsymbol{\mu} + \frac{1}{2} \mathbf{t}' \Sigma \mathbf{t} \right)

where t = (t1, t2, …, tq)′, X = (X1, X2, …, Xq)′ and μ = (μ1, μ2, …, μq)′.
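The formula can be verified by simulation at any fixed t: the sample average of exp(t′X) should approach exp(t′μ + ½t′Σt). A sketch with illustrative values:

```python
# Simulation check of the multivariate normal MGF at one point t (NumPy).
import numpy as np

rng = np.random.default_rng(3)
mu = np.array([0.5, -1.0])
Sigma = np.array([[1.0, 0.4],
                  [0.4, 0.8]])
t = np.array([0.3, 0.2])

closed = np.exp(t @ mu + 0.5 * t @ Sigma @ t)
X = rng.multivariate_normal(mu, Sigma, size=1_000_000)
mc = np.exp(X @ t).mean()
print(closed, mc)                           # close to each other
```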


Review: The Transformation Method

Theorem
Let X denote a random variable with probability density
function f(x) and let U = h(X).
Assume that h(x) is either strictly increasing (or decreasing).
Then the probability density of U is:

g(u) = f\left( h^{-1}(u) \right) \left| \frac{d h^{-1}(u)}{du} \right| = f(x) \left| \frac{dx}{du} \right|

The Transformation Method (many variables)

Theorem
Let x1, x2, …, xn denote random variables with joint probability
density function
f(x1, x2, …, xn ).
Let
u1 = h1(x1, x2, …, xn)
u2 = h2(x1, x2, …, xn)
…
un = hn(x1, x2, …, xn)
define an invertible transformation from the x's to the u's.


The Transformation Method (many variables)

Then the joint probability density function of u1, u2, …, un is given by:

g(u_1, \dots, u_n) = f(x_1, \dots, x_n) \left| \frac{d(x_1, \dots, x_n)}{d(u_1, \dots, u_n)} \right| = f(x_1, \dots, x_n) \, |J|

where

J = \frac{d(x_1, \dots, x_n)}{d(u_1, \dots, u_n)} = \det \begin{pmatrix} \frac{dx_1}{du_1} & \cdots & \frac{dx_1}{du_n} \\ \vdots & & \vdots \\ \frac{dx_n}{du_1} & \cdots & \frac{dx_n}{du_n} \end{pmatrix}

is the Jacobian of the transformation.

Example: Distribution of x+y and x-y

Suppose that x1, x2 are independent with density functions f1(x1) and
f2(x2).

Find the distribution of u1 = x1 + x2 and u2 = x1 − x2.

Solution: Solving for x1 and x2, we get the inverse transformation:

x_1 = \frac{u_1 + u_2}{2}, \qquad x_2 = \frac{u_1 - u_2}{2}

The Jacobian of the transformation is

J = \frac{d(x_1, x_2)}{d(u_1, u_2)} = \det \begin{pmatrix} \frac{dx_1}{du_1} & \frac{dx_1}{du_2} \\ \frac{dx_2}{du_1} & \frac{dx_2}{du_2} \end{pmatrix}


Example: Distribution of x+y and x-y

J = \det \begin{pmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & -\frac{1}{2} \end{pmatrix} = \left( \frac{1}{2} \right)\left( -\frac{1}{2} \right) - \left( \frac{1}{2} \right)\left( \frac{1}{2} \right) = -\frac{1}{2}

The joint density of x1, x2 is

f(x1, x2) = f1(x1) f2(x2)

Hence the joint density of u1 and u2 is:

g(u_1, u_2) = f(x_1, x_2) \, |J| = f_1\left( \frac{u_1 + u_2}{2} \right) f_2\left( \frac{u_1 - u_2}{2} \right) \frac{1}{2}

Example: Distribution of x+y and x-y

From g(u_1, u_2) = f_1\left( \frac{u_1 + u_2}{2} \right) f_2\left( \frac{u_1 - u_2}{2} \right) \frac{1}{2}

we can determine the distribution of u1 = x1 + x2:

g_1(u_1) = \int_{-\infty}^{\infty} g(u_1, u_2) \, du_2 = \int_{-\infty}^{\infty} f_1\left( \frac{u_1 + u_2}{2} \right) f_2\left( \frac{u_1 - u_2}{2} \right) \frac{1}{2} \, du_2

Put v = \frac{u_1 + u_2}{2}. Then \frac{u_1 - u_2}{2} = u_1 - v and \frac{dv}{du_2} = \frac{1}{2}.


Example: Distribution of x+y and x-y

Hence

g_1(u_1) = \int_{-\infty}^{\infty} f_1\left( \frac{u_1 + u_2}{2} \right) f_2\left( \frac{u_1 - u_2}{2} \right) \frac{1}{2} \, du_2 = \int_{-\infty}^{\infty} f_1(v) \, f_2(u_1 - v) \, dv

This is called the convolution of the two densities f1 and f2.

Example (1): Convolution formula - The Gamma distribution

Let X and Y be two independent random variables such that X and Y
have an exponential distribution with parameter λ.

We will use the convolution formula to find the distribution of
U = X + Y. (We already know the distribution of U: gamma.)

g_U(u) = \int_{-\infty}^{\infty} f_X(u - y) \, f_Y(y) \, dy = \int_0^u \lambda e^{-\lambda (u - y)} \, \lambda e^{-\lambda y} \, dy

= \lambda^2 e^{-\lambda u} \int_0^u dy = \lambda^2 u \, e^{-\lambda u}

This is the gamma distribution with α = 2.
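A quick simulation check that the sum of two independent exponentials matches this gamma density (a sketch, assuming SciPy; λ is illustrative):

```python
# Sum of two Exp(lam) draws vs. Gamma(alpha=2, scale=1/lam) (assumes SciPy).
import numpy as np
from scipy.stats import gamma, kstest

lam = 1.5
rng = np.random.default_rng(4)
u = rng.exponential(1 / lam, size=100_000) + rng.exponential(1 / lam, size=100_000)

# g_U(u) = lam^2 u exp(-lam u) is the Gamma(2, 1/lam) density
print(kstest(u, gamma(a=2, scale=1 / lam).cdf).pvalue)   # p-value should not be tiny
```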


Example (2): The ex-Gaussian distribution

Let X and Y be two independent random variables such that:

1. X has an exponential distribution with parameter λ.

2. Y has a normal (Gaussian) distribution with mean μ and standard
deviation σ.

We will use the convolution formula to find the distribution of
U = X + Y.

(This distribution is used in psychology as a model for response
time to perform a task.)

Example (2): The ex-Gaussian distribution

Now f_1(x) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0 \\ 0 & x < 0 \end{cases}

f_2(y) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{(y - \mu)^2}{2\sigma^2}}

The density of U = X + Y is:

g(u) = \int_{-\infty}^{\infty} f_1(v) \, f_2(u - v) \, dv = \int_0^{\infty} \lambda e^{-\lambda v} \, \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{(u - v - \mu)^2}{2\sigma^2}} \, dv


Example (2): The ex-Gaussian distribution

g(u) = \frac{\lambda}{\sqrt{2\pi}\,\sigma} \int_0^{\infty} e^{-\frac{(u - v - \mu)^2}{2\sigma^2} - \lambda v} \, dv

= \frac{\lambda}{\sqrt{2\pi}\,\sigma} \int_0^{\infty} e^{-\frac{(u - v - \mu)^2 + 2\sigma^2 \lambda v}{2\sigma^2}} \, dv

= \frac{\lambda}{\sqrt{2\pi}\,\sigma} \int_0^{\infty} e^{-\frac{v^2 - 2(u - \mu)v + (u - \mu)^2 + 2\sigma^2 \lambda v}{2\sigma^2}} \, dv

= \lambda \, e^{-\frac{(u - \mu)^2}{2\sigma^2}} \, \frac{1}{\sqrt{2\pi}\,\sigma} \int_0^{\infty} e^{-\frac{v^2 - 2\left[(u - \mu) - \sigma^2 \lambda\right] v}{2\sigma^2}} \, dv

Example (2): The ex-Gaussian distribution

g(u) = \lambda \, e^{-\frac{(u - \mu)^2}{2\sigma^2}} \, \frac{1}{\sqrt{2\pi}\,\sigma} \int_0^{\infty} e^{-\frac{v^2 - 2\left[(u - \mu) - \sigma^2 \lambda\right] v}{2\sigma^2}} \, dv

Completing the square in v:

= \lambda \, e^{-\frac{(u - \mu)^2 - \left[(u - \mu) - \sigma^2 \lambda\right]^2}{2\sigma^2}} \, \frac{1}{\sqrt{2\pi}\,\sigma} \int_0^{\infty} e^{-\frac{\left( v - \left[(u - \mu) - \sigma^2 \lambda\right] \right)^2}{2\sigma^2}} \, dv

= \lambda \, e^{-\frac{(u - \mu)^2 - \left[(u - \mu) - \sigma^2 \lambda\right]^2}{2\sigma^2}} \, P[V \ge 0]


Example (2): The ex-Gaussian distribution

where V has a Normal distribution with mean

\mu_V = (u - \mu) - \sigma^2 \lambda

and variance σ².

That is,

g(u) = \lambda \, e^{-\frac{(u - \mu)^2 - \left[(u - \mu) - \sigma^2 \lambda\right]^2}{2\sigma^2}} \, \Phi\left( \frac{(u - \mu) - \sigma^2 \lambda}{\sigma} \right)

where Φ(z) is the cdf of the standard Normal distribution.
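The closed form can be checked against a brute-force evaluation of the convolution integral. A sketch, assuming SciPy; the exponent is rewritten as −λ(u − μ) + ½λ²σ², which is algebraically the same as the expression above:

```python
# Closed-form ex-Gaussian density vs. numerical convolution (assumes SciPy).
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

lam, mu, sigma = 0.5, 3.0, 1.0      # illustrative parameter values
u = 6.0

closed = lam * np.exp(-lam * (u - mu) + 0.5 * lam**2 * sigma**2) \
         * norm.cdf(((u - mu) - sigma**2 * lam) / sigma)

# Direct convolution: integral of f1(v) f2(u - v) over v >= 0
conv, _ = quad(lambda v: lam * np.exp(-lam * v) * norm.pdf(u - v, mu, sigma),
               0, np.inf)
print(closed, conv)                 # both the same
```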

The ex-Gaussian distribution

[Figure: plot of the ex-Gaussian density g(u) over 0 ≤ u ≤ 30]


Distribution of Quadratic Forms

We will present different theorems for the case when the RVs are normal
variables:

Theorem 7.1. If y ~ N(μy, Σy), then z = Ay ~ N(Aμy, A Σy A′),
where A is a matrix of constants.

Theorem 7.2. Let the n×1 vector y ~ N(0, In). Then y′y ~ χ²_n.

Theorem 7.3. Let the n×1 vector y ~ N(0, σ² In) and M be a symmetric
idempotent matrix of rank m. Then,

y′My / σ² ~ χ²_{tr(M)}

Proof: Since M is symmetric, it can be diagonalized with an orthogonal
matrix Q. That is, Q′MQ = Λ (with Q′Q = I).
Since M is idempotent, all its characteristic roots are either zero or
one. Thus,

Q' M Q = \Lambda = \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}

Note: dim(I) = rank(M) (the number of non-zero roots is the rank of
the matrix). Also, since Σi λi = tr(Λ) = tr(M), dim(I) = tr(M).

Let v = Q′y. Then
E(v) = Q′E(y) = 0
Var(v) = E[vv′] = E[Q′y y′Q] = Q′ E[y y′] Q = Q′ (σ² In) Q = σ² In
⇒ v ~ N(0, σ² In)

Then,

\frac{y' M y}{\sigma^2} = \frac{v' Q' M Q \, v}{\sigma^2} = \frac{1}{\sigma^2} v' \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix} v = \frac{1}{\sigma^2} \sum_{i=1}^{\text{tr}(M)} v_i^2 = \sum_{i=1}^{\text{tr}(M)} \left( \frac{v_i}{\sigma} \right)^2

Thus, y′My/σ² is the sum of tr(M) squared N(0, 1) variables. It follows
a χ²_{tr(M)}.


Theorem 7.4. Let the n×1 vector y ~ N(μy, Σy). Then,

(y - \mu_y)' \Sigma_y^{-1} (y - \mu_y) \sim \chi^2_n

Proof:
Recall that there exists a non-singular matrix A such that AA′ = Σy.
Let v = A⁻¹(y − μy) (a linear combination of normal variables)
⇒ v ~ N(0, In)
⇒ v′v = (y − μy)′ Σy⁻¹ (y − μy) ~ χ²_n (using Theorem 7.2).

Theorem 7.5
Let the n×1 vector y ~ N(0, I) and M be an n×n matrix. Then, the
characteristic function of y′My is |I − 2itM|^{−1/2}.

Proof:

\varphi_{y'My}(t) = E_y\left[ e^{it \, y'My} \right] = \frac{1}{(2\pi)^{n/2}} \int e^{it \, y'My} \, e^{-y'y/2} \, dy = \frac{1}{(2\pi)^{n/2}} \int e^{-y'(I - 2itM)y/2} \, dy

This is the normal density with Σ⁻¹ = (I − 2itM), except for the
determinant |I − 2itM|^{−1/2}, which should be in the denominator.

Theorem 7.6
Let the n×1 vector y ~ N(0, I), M be an n×n idempotent matrix of
rank m, let L be an n×n idempotent matrix of rank s, and suppose
ML = 0. Then y′My and y′Ly are independently distributed χ² variables.

Proof:
By Theorem 7.3, both quadratic forms are distributed as χ² variables. We
only need to prove independence. From Theorem 7.5, we have

\varphi_{y'My}(t) = E_y\left[ e^{it \, y'My} \right] = |I - 2itM|^{-1/2}

\varphi_{y'Ly}(t) = E_y\left[ e^{it \, y'Ly} \right] = |I - 2itL|^{-1/2}

The forms will be independently distributed if \varphi_{y'(M+L)y} = \varphi_{y'My} \, \varphi_{y'Ly}.
That is,

\varphi_{y'(M+L)y}(t) = E_y\left[ e^{it \, y'(M+L)y} \right] = |I - 2it(M + L)|^{-1/2} = |I - 2itM|^{-1/2} \, |I - 2itL|^{-1/2}

Since |I − 2itM| |I − 2itL| = |(I − 2itM)(I − 2itL)| = |I − 2it(M + L) − 4t² ML|,
the above result will be true only when ML = 0.
