
1. Two Random Variables


• One random variable can be considered as a mapping from the
sample space to the real line.
• Two random variables can be considered as a mapping from the
sample space to the plane.
• Note that given the outcome of the experiment, both X and Y are
determined simultaneously.

Joint Probability Mass Function

• Assume that $(X, Y)$ takes on a countable set of values, i.e., both X and Y are discrete random variables.
• The joint probability mass function specifies the probabilities of the product-form event $\{X = x_j\} \cap \{Y = y_k\}$:
$$p_{X,Y}(x_j, y_k) = P[\{X = x_j\} \cap \{Y = y_k\}]$$
• The probability of any event A is obtained by summing over all outcomes in A:
$$P[A] = \sum_{(x_j, y_k) \in A} p_{X,Y}(x_j, y_k)$$
• The sum over all outcomes is 1:
$$\sum_{x_j} \sum_{y_k} p_{X,Y}(x_j, y_k) = 1$$
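In code, a joint pmf can be stored as a table and event probabilities computed by summation. A minimal Python sketch, assuming a small hypothetical pmf (the numbers are illustrative, not from the slides):

```python
import numpy as np

# Hypothetical joint pmf of (X, Y): rows index x_j in {0, 1, 2}, columns y_k in {0, 1}.
p_xy = np.array([[0.10, 0.20],
                 [0.30, 0.15],
                 [0.15, 0.10]])

assert np.isclose(p_xy.sum(), 1.0)  # the sum over all outcomes is 1

# P[A] for the event A = {X + Y <= 1}: sum the pmf over all (x_j, y_k) in A.
xs, ys = np.arange(3), np.arange(2)
in_A = xs[:, None] + ys[None, :] <= 1
print(p_xy[in_A].sum())  # 0.10 + 0.20 + 0.30 = 0.6
```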

Marginal Statistics

• The marginal probability distribution of a random variable in a joint distribution is simply the probability of the random variable when considered in isolation.
• The marginal pmf of X is
$$p_X[x_j] = P[X = x_j] = P[X = x_j, \; Y = \text{anything}] = \sum_{k=1}^{\infty} p_{X,Y}[x_j, y_k]$$
• Similarly,
$$p_Y[y_k] = \sum_{j=1}^{\infty} p_{X,Y}[x_j, y_k]$$
Example

• Toss two fair dice, and define X = the number of dice showing an even face, Y = the number of dice showing an odd face, and Z = the number of dice showing a face of 4 or more. Then
$$P[\{X = 2\} \cap \{Y = 2\}] = 0 \qquad\text{while}\qquad P[\{X = 2\} \cap \{Z = 2\}] = \frac{1}{9}$$
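These values can be checked by enumerating the 36 equally likely outcomes. A sketch, where the definitions of X, Y, and Z above are reconstructions chosen to be consistent with the stated probabilities:

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # 36 equally likely (die1, die2) pairs

def prob(event):
    return Fraction(sum(event(d1, d2) for d1, d2 in outcomes), len(outcomes))

# Reconstructed definitions: X = # even faces, Y = # odd faces, Z = # faces >= 4.
X = lambda d1, d2: (d1 % 2 == 0) + (d2 % 2 == 0)
Y = lambda d1, d2: (d1 % 2 == 1) + (d2 % 2 == 1)
Z = lambda d1, d2: (d1 >= 4) + (d2 >= 4)

print(prob(lambda d1, d2: X(d1, d2) == 2 and Y(d1, d2) == 2))  # 0
print(prob(lambda d1, d2: X(d1, d2) == 2 and Z(d1, d2) == 2))  # 1/9
```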
Joint Cumulative Distribution Function

• The joint cumulative distribution function of two random variables X and Y is defined as
$$F_{X,Y}(x, y) = P[\{X \le x\} \cap \{Y \le y\}], \qquad x, y \in \mathbb{R}$$
[Figure: the semi-infinite region $\{X \le x, Y \le y\}$ below and to the left of the point (x, y).]
Joint density

• The joint probability density function (pdf) f(x, y) of X and Y is the derivative of the joint distribution:
$$f(x, y) = \frac{\partial^2 F(x, y)}{\partial x \, \partial y}$$
• Properties:
$$F(x, y) = \int_{-\infty}^{y} \int_{-\infty}^{x} f(\alpha, \beta)\, d\alpha\, d\beta$$
$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(\alpha, \beta)\, d\alpha\, d\beta = 1$$
$$P[(X, Y) \in A] = \iint_A f(\alpha, \beta)\, d\alpha\, d\beta$$
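These properties can be verified numerically for a concrete density. A sketch using the hypothetical pdf f(x, y) = x + y on the unit square (zero elsewhere), which integrates to 1:

```python
from scipy.integrate import dblquad

# Hypothetical joint pdf: f(x, y) = x + y on the unit square, 0 elsewhere.
f = lambda y, x: x + y  # dblquad expects func(y, x), integrating over y first

# Total probability mass is 1.
total, _ = dblquad(f, 0, 1, 0, 1)
print(total)  # 1.0

# P[(X, Y) in A] for A = {x + y <= 1}: integrate f over the region A.
p_a, _ = dblquad(f, 0, 1, 0, lambda x: 1 - x)
print(p_a)  # 1/3
```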
The joint pdf is not a probability

• Note that the value of the density function is not a probability.


• However, it is large in regions of the plane with high probability and small in regions of the plane with low probability:
$$P[x < X \le x + dx, \; y < Y \le y + dy] = \int_{y}^{y+dy} \int_{x}^{x+dx} f(\alpha, \beta)\, d\alpha\, d\beta \approx f(x, y)\, dx\, dy$$
[Figure: an infinitesimal dx-by-dy rectangle at the point (x, y) in the (X, Y) plane.]

• It is true that f(x,y) ≥ 0 for all x and y. However, it is not necessarily true that f(x,y)
≤ 1.
Marginal densities

• The marginal densities fX(x) and fY(y) can always be recovered from the joint density f(x, y):
$$f_X(x) = \frac{dF_X(x)}{dx} = \frac{dF_{X,Y}(x, \infty)}{dx} = \frac{d}{dx} \int_{-\infty}^{x} \left[ \int_{-\infty}^{\infty} f_{X,Y}(\alpha, \beta)\, d\beta \right] d\alpha = \int_{-\infty}^{\infty} f_{X,Y}(x, \beta)\, d\beta$$
$$f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(\alpha, y)\, d\alpha$$
[Figure: for fX(x), integrate the joint density along a vertical line at x; for fY(y), integrate along a horizontal line at y.]


• However, it is not generally possible to determine the joint density from the
marginal densities.
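A quick numerical check of the marginalization integral, again for the hypothetical pdf f(x, y) = x + y on the unit square, whose marginal is f_X(x) = x + 1/2:

```python
import numpy as np
from scipy.integrate import quad

# The same hypothetical joint pdf as before: f(x, y) = x + y on the unit square.
f = lambda x, y: x + y

# Marginal f_X(x) = integral of f(x, y) over y; analytically this is x + 1/2.
for x in [0.0, 0.25, 0.5, 1.0]:
    fx, _ = quad(lambda y: f(x, y), 0, 1)
    print(x, fx, np.isclose(fx, x + 0.5))  # True for every x
```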
Independence

• Definition: Two random variables X and Y are said to be independent or statistically independent if, for any events $A_X$ and $A_Y$ on the real line,
$$P[\{X \in A_X\} \cap \{Y \in A_Y\}] = P[X \in A_X]\, P[Y \in A_Y]$$

• In other words, the probability of a product-form event can be expressed as the product of the probabilities of the events in X and Y.
• When random variables are independent, knowledge of the probabilities of the RVs in isolation is sufficient to specify the probabilities of joint events.

Theorem

The following three statements are equivalent:


1. X and Y are independent.
2. $F_{X,Y}(x, y) = F_X(x)\, F_Y(y)$ for all x and y.
3. $f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$ for all x and y.

Proof:
(1) → (2): Set $A_X = \{X \le x\}$ and $A_Y = \{Y \le y\}$.
(2) → (3): Differentiate $F_{X,Y}(x, y)$ with respect to x and y.
(3) → (1): For any $A_X$ and $A_Y$,
$$P[\{X \in A_X\} \cap \{Y \in A_Y\}] = \int_{A_Y} \int_{A_X} f_{X,Y}(x, y)\, dx\, dy = \left[ \int_{A_X} f_X(x)\, dx \right] \left[ \int_{A_Y} f_Y(y)\, dy \right] = P[X \in A_X]\, P[Y \in A_Y]$$
Additional facts about independent RVs

• Discrete random variables X and Y are independent if and only if the joint pmf is the product of the marginal pmfs:
$$p_{X,Y}(x_j, y_k) = p_X(x_j)\, p_Y(y_k)$$
• If X and Y are independent, then any functions (linear or nonlinear) of X and Y, e.g.
$$W = g(X), \qquad Z = h(Y),$$
are also independent.
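The discrete criterion suggests a direct test: compare the joint pmf table with the outer product of its marginals. A sketch (both tables are hypothetical):

```python
import numpy as np

# A joint pmf is independent iff it equals the outer product of its marginals.
def is_independent(p_xy, tol=1e-12):
    p_x = p_xy.sum(axis=1)
    p_y = p_xy.sum(axis=0)
    return np.allclose(p_xy, np.outer(p_x, p_y), atol=tol)

# A product-form pmf (independent) vs. the earlier hypothetical table (not).
print(is_independent(np.outer([0.2, 0.8], [0.5, 0.3, 0.2])))                  # True
print(is_independent(np.array([[0.10, 0.20], [0.30, 0.15], [0.15, 0.10]])))  # False
```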

Conditional Probability Density Functions

If X is continuous, then P[X = x] = 0. Thus, we define
$$P[A \mid X = x] = \lim_{h \to 0} P[A \mid x < X \le x + h]$$
We can define the conditional cdf of Y given X by
$$F_{Y|X}(y \mid x) = \lim_{h \to 0} \frac{P[\{Y \le y\} \cap \{x < X \le x + h\}]}{P[x < X \le x + h]} = \lim_{h \to 0} \frac{\int_{x}^{x+h} \int_{-\infty}^{y} f_{X,Y}(\alpha, \beta)\, d\beta\, d\alpha}{\int_{x}^{x+h} f_X(\alpha)\, d\alpha}$$
$$= \lim_{h \to 0} \frac{h \int_{-\infty}^{y} f_{X,Y}(x, \beta)\, d\beta}{h\, f_X(x)} = \frac{\int_{-\infty}^{y} f_{X,Y}(x, \beta)\, d\beta}{f_X(x)} \qquad \text{if } f_X(x) \ne 0$$
Differentiating with respect to y,
$$f_{Y|X}(y \mid x) = \frac{f_{X,Y}(x, y)}{f_X(x)}$$
Graphical Interpretation

• The conditional pdf is the probability that Y is in the infinitesimal strip (y, y + dy), given that X is in the infinitesimal strip (x, x + dx).
• The conditional pdf is the mass of the red column divided by the mass of the entire yellow strip.
• It can also be interpreted as a slice of the joint density at X = x, normalized to unit area:
$$f_{Y|X}(y \mid x) = \frac{f_{X,Y}(x, y)}{\int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dy}$$
[Figure: the joint density over the (X, Y) plane, with a vertical strip at X = x.]
Properties of the Conditional Density
• Computing the joint density:
$$f_{X,Y}(x, y) = f_{X|Y}(x \mid y)\, f_Y(y) \qquad\text{or}\qquad f_{X,Y}(x, y) = f_{Y|X}(y \mid x)\, f_X(x)$$
• Computing a marginal density (total probability):
$$f_Y(y) = \int_{-\infty}^{\infty} f_{Y|X}(y \mid x)\, f_X(x)\, dx \qquad \text{(continuous random variables)}$$
$$p_Y(j) = \sum_k p_{Y|X}(j \mid k)\, p_X(k) \qquad \text{(discrete random variables)}$$
• Bayes' theorem:
$$f_{X|Y}(x \mid y) = \frac{f_{Y|X}(y \mid x)\, f_X(x)}{\int_{-\infty}^{\infty} f_{Y|X}(y \mid x)\, f_X(x)\, dx}$$
• If X and Y are independent,
$$f_{X|Y}(x \mid y) = f_X(x) \qquad\text{and}\qquad f_{Y|X}(y \mid x) = f_Y(y)$$
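The discrete total-probability and Bayes formulas map directly onto array operations. A sketch with hypothetical tables:

```python
import numpy as np

# Hypothetical prior p_X over k = 0, 1 and conditional pmf p_{Y|X}(j | k):
# rows index j, columns index k.
p_x = np.array([0.4, 0.6])
p_y_given_x = np.array([[0.9, 0.2],
                        [0.1, 0.8]])

p_y = p_y_given_x @ p_x  # total probability: p_Y(j) = sum_k p_{Y|X}(j|k) p_X(k)
print(p_y)               # [0.48 0.52]

# Bayes: p_{X|Y}(k | j) = p_{Y|X}(j | k) p_X(k) / p_Y(j), here for j = 0.
p_x_given_y0 = p_y_given_x[0] * p_x / p_y[0]
print(p_x_given_y0)      # [0.75 0.25]
```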
Removing Conditioning via Expectation
Since E[Y | X] is a function of X, it is also a random variable. Taking the expectation over X, the conditioning disappears:
$$E_X\big[\, E[Y \mid X] \,\big] = E[Y]$$
(the outer expectation is taken over X, the inner one over Y).

This is an "expectation" version of the total probability theorem.

More generally, for any function g,
$$E_X\big[\, E[g(X, Y) \mid X] \,\big] = E[g(X, Y)]$$
2. N Random Variables

• An N-dimensional random vector is a mapping from a probability space to $\mathbb{R}^N$.
• For example, N = 3:
[Figure: an outcome in the sample space S is mapped to the point $(X_1, X_2, X_3)$ in $\mathbb{R}^3$.]
Joint Cumulative Distribution Function
$$F_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n) = P[X_1 \le x_1, X_2 \le x_2, \dots, X_n \le x_n]$$

• To eliminate a variable, set its argument to infinity:
$$F_{X_1, \dots, X_{n-1}}(x_1, \dots, x_{n-1}) = F_{X_1, \dots, X_n}(x_1, \dots, x_{n-1}, \infty)$$
Joint Probability Density/Mass Functions
• Joint probability density function:
$$f_{X_1, \dots, X_n}(x_1, \dots, x_n) = \frac{\partial^n}{\partial x_1 \cdots \partial x_n} F_{X_1, \dots, X_n}(x_1, \dots, x_n)$$
• For any set A,
$$P[A] = \int \cdots \int_A f_{X_1, \dots, X_n}(x_1, \dots, x_n)\, dx_1\, dx_2 \cdots dx_n$$
• Joint probability mass function:
$$p_{X_1, \dots, X_n}(x_1, \dots, x_n) = P[X_1 = x_1, X_2 = x_2, \dots, X_n = x_n]$$
• For any set A,
$$P[A] = \sum_{(x_1, \dots, x_n) \in A} p_{X_1, \dots, X_n}(x_1, \dots, x_n)$$

Marginal Statistics

• Eliminate a variable from a pdf (pmf) by integrating (summing) it out:
$$f_{X_1 \cdots X_{n-1}}(x_1, \dots, x_{n-1}) = \int_{-\infty}^{\infty} f_{X_1 \cdots X_n}(x_1, \dots, x_{n-1}, x_n)\, dx_n$$
$$p_{X_1 \cdots X_{n-1}}(x_1, \dots, x_{n-1}) = \sum_{x_n} p_{X_1 \cdots X_n}(x_1, \dots, x_{n-1}, x_n)$$
• Get marginals by successive integrations (summations):
$$f_{X_1}(x_1) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_{X_1 \cdots X_n}(x_1, x_2, \dots, x_n)\, dx_2 \cdots dx_n$$
$$p_{X_1}(x_1) = \sum_{x_2} \cdots \sum_{x_n} p_{X_1 \cdots X_n}(x_1, x_2, \dots, x_n)$$
Conditional densities:
$$f_{X_1 X_2 | X_3}(x_1, x_2 \mid x_3) = \frac{f_{X_1 X_2 X_3}(x_1, x_2, x_3)}{f_{X_3}(x_3)}$$
$$f_{X_1 | X_2 X_3}(x_1 \mid x_2, x_3) = \frac{f_{X_1 X_2 X_3}(x_1, x_2, x_3)}{f_{X_2 X_3}(x_2, x_3)}$$

Conditional probability mass functions:
$$p_{X_1 X_2 | X_3}(x_1, x_2 \mid x_3) = \frac{p_{X_1 X_2 X_3}(x_1, x_2, x_3)}{p_{X_3}(x_3)}$$
$$p_{X_1 | X_2 X_3}(x_1 \mid x_2, x_3) = \frac{p_{X_1 X_2 X_3}(x_1, x_2, x_3)}{p_{X_2 X_3}(x_2, x_3)}$$
Independence

• The following statements are equivalent


1. $X_1, X_2, \dots, X_n$ are independent, i.e., for all $A_1, A_2, \dots, A_n$,
$$P[X_1 \in A_1, X_2 \in A_2, \dots, X_n \in A_n] = P[X_1 \in A_1]\, P[X_2 \in A_2] \cdots P[X_n \in A_n]$$
2. $F_{X_1, \dots, X_n}(x_1, \dots, x_n) = F_{X_1}(x_1)\, F_{X_2}(x_2) \cdots F_{X_n}(x_n)$
3. $f_{X_1, \dots, X_n}(x_1, \dots, x_n) = f_{X_1}(x_1)\, f_{X_2}(x_2) \cdots f_{X_n}(x_n)$
Expected value of a function of random variables
Z  g ( X 1 , X 2 ,..., X n )
Suppose Z is a function of n random variables X1, X2, … Xn.
  
E[ Z ]    ...
  
g ( x1 , x2 ,..., xn ) f ( x1 , x2 ,..., xn )dx1dx2 ...dxn

The expected value of Z can be computed from the joint density:

E[ Z ]  ... g ( x , x ,..., x ) p( x , x ,..., x )


x1 x2 xn
1 2 n 1 2 n

If the Xi are discrete,

23
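In practice, E[Z] is often estimated by Monte Carlo rather than by the n-fold integral. A sketch for the hypothetical choice g(x1, x2) = max(x1, x2) with i.i.d. Uniform(0, 1) inputs, whose exact mean is 2/3:

```python
import numpy as np

rng = np.random.default_rng(1)

# Monte Carlo estimate of E[Z] for Z = g(X1, X2) = max(X1, X2),
# with X1, X2 i.i.d. Uniform(0, 1); the exact value is 2/3.
x1, x2 = rng.uniform(size=(2, 1_000_000))
print(np.maximum(x1, x2).mean())  # about 0.6667
```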
Linearity of the Expectation

• For any two random variables, the mean of the sum is the sum of the means. Let Z = X + Y:
$$E[Z] = E[X + Y] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x + y)\, f(x, y)\, dx\, dy$$
$$= \int_{-\infty}^{\infty} x \left[ \int_{-\infty}^{\infty} f(x, y)\, dy \right] dx + \int_{-\infty}^{\infty} y \left[ \int_{-\infty}^{\infty} f(x, y)\, dx \right] dy$$
$$= \int_{-\infty}^{\infty} x\, f_X(x)\, dx + \int_{-\infty}^{\infty} y\, f_Y(y)\, dy = E[X] + E[Y]$$
• More generally, the order of expectation and any linear operation can be swapped. For example,
$$E\left[ \sum_i a_i X_i \right] = \sum_i a_i\, E[X_i], \qquad \text{where the } a_i \text{ are real constants.}$$
Expectation of the product

• If X and Y are independent, then the expectation of the product is the product of the expected values:
$$E[XY] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} xy\, f(x, y)\, dx\, dy = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} xy\, f_X(x)\, f_Y(y)\, dx\, dy \quad \text{(by independence)}$$
$$= \left[ \int_{-\infty}^{\infty} x\, f_X(x)\, dx \right] \left[ \int_{-\infty}^{\infty} y\, f_Y(y)\, dy \right] = E[X]\, E[Y]$$
• More generally, if the Xi are independent and the gi(·) are arbitrary functions,
$$E[g_1(X_1)\, g_2(X_2) \cdots g_n(X_n)] = E[g_1(X_1)]\, E[g_2(X_2)] \cdots E[g_n(X_n)]$$
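A quick simulation check, with hypothetical distributions for X and Y, including a dependent case where the factorization fails:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

# Independent X ~ Uniform(0, 1) and Y ~ Exponential(1): E[XY] = E[X] E[Y] = 0.5.
x, y = rng.uniform(size=n), rng.exponential(size=n)
print((x * y).mean(), x.mean() * y.mean())  # both approximately 0.5

# Dependence breaks the factorization: with Y = X, E[XY] = E[X^2] = 1/3 != 1/4.
print((x * x).mean())  # about 0.3333
```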
Moments and Central Moments

Joint moment (the jk-th moment):
$$E[X^j Y^k] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^j y^k f(x, y)\, dx\, dy$$

Central moment:
$$E[(X - E[X])^j\, (Y - E[Y])^k]$$

The first and second moments are the most widely used:

means: $E[X]$, $E[Y]$
variances: $E[(X - E[X])^2] = \mathrm{VAR}[X]$, $E[(Y - E[Y])^2] = \mathrm{VAR}[Y]$
covariance: $E[(X - E[X])(Y - E[Y])] = \mathrm{COV}(X, Y)$
correlation: $E[XY]$
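Sample versions of these quantities are available directly in NumPy. A sketch with a hypothetical correlated pair:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000

# Correlated pair: Y = X + noise, so COV(X, Y) = VAR[X] = 1/12 for X ~ Uniform(0, 1).
x = rng.uniform(size=n)
y = x + rng.normal(scale=0.1, size=n)

print(np.cov(x, y)[0, 1])  # sample covariance, about 1/12 = 0.0833
print(np.mean(x * y))      # correlation E[XY] = E[X^2] = 1/3 here
```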
Variance of Sum
Let Z = X + Y, where X and Y are random variables. The variance of Z is
$$\mathrm{VAR}[Z] = \mathrm{VAR}[X] + 2\,\mathrm{COV}(X, Y) + \mathrm{VAR}[Y]$$

Proof:
$$\mathrm{VAR}[Z] = E[(Z - E[Z])^2] = E[(X + Y - E[X] - E[Y])^2] = E\big[ \big( (X - E[X]) + (Y - E[Y]) \big)^2 \big]$$
$$= E[(X - E[X])^2] + 2\, E[(X - E[X])(Y - E[Y])] + E[(Y - E[Y])^2] = \mathrm{VAR}[X] + 2\,\mathrm{COV}(X, Y) + \mathrm{VAR}[Y]$$

Note: If X and Y are uncorrelated, then $\mathrm{VAR}[Z] = \mathrm{VAR}[X] + \mathrm{VAR}[Y]$, i.e., the variance of the sum is the sum of the variances.
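A numerical check of the identity for a hypothetical correlated pair, where the cross term is clearly visible:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1_000_000

# Correlated X and Y (Y contains X as a component), so COV(X, Y) > 0.
x = rng.uniform(size=n)
y = x + rng.normal(scale=0.1, size=n)
z = x + y

lhs = z.var()
rhs = x.var() + 2 * np.cov(x, y)[0, 1] + y.var()
print(lhs, rhs)  # equal up to sampling error, and larger than x.var() + y.var()
```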
