
Uncorrelatedness and Independence

Uncorrelatedness: Two r.v.'s x and y are uncorrelated if

$C_{xy} = E[(x - m_x)(y - m_y)^T] = 0$

or equivalently

$R_{xy} = E[xy^T] = E[x]E[y^T] = m_x m_y^T$


White random vector: This is defined to be a random vector with zero mean and unit covariance (correlation) matrix:

$m_x = 0, \quad R_x = C_x = I$

Example: What will be the mean and covariance of a white random vector under an orthogonal transform? Let T denote an orthogonal matrix (i.e. $T^T T = T T^T = I$). Such a matrix defines an orthogonal transform, a rotation of the coordinate system that preserves distances in the space. Thus, define y = Tx.

Hence,

$m_y = \ldots = 0$

and

$C_y = \ldots = I$
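The elided steps are not spelled out in the notes; one way to fill them in, using linearity of expectation and the orthogonality of T, is:

$m_y = E[Tx] = T\,E[x] = T m_x = 0$

$C_y = E[(y - m_y)(y - m_y)^T] = E[T x x^T T^T] = T\,C_x\,T^T = T I T^T = I$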
Therefore, an orthogonal transform preserves whiteness.

Example: Calculate $R_x$ for $x = As + n$, where $s$ is a random signal with correlation matrix $R_s$ and the noise vector $n$ has zero mean and is uncorrelated with the signal (a worked sketch is given below, after the definition of independence).

Independence: Two random variables x, y are statistically independent if

$p_{x,y}(x, y) = p_x(x)\, p_y(y)$

i.e. if the joint pdf of (x, y) factors into the product of the marginal probability densities $p_x$ and $p_y$.
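For the $R_x$ example above, a worked sketch (the intermediate steps are mine; they use only the stated assumptions that $m_n = 0$ and $n$ is uncorrelated with $s$, so that $E[s n^T] = E[n s^T] = 0$):

$R_x = E[(As + n)(As + n)^T] = A\,E[s s^T]\,A^T + A\,E[s n^T] + E[n s^T]\,A^T + E[n n^T] = A R_s A^T + R_n$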

From the definition of statistical independence it follows that

$E[g(x)h(y)] = E[g(x)]\,E[h(y)]$

where g, h are any absolutely integrable functions. Similarly, for random vectors the definition of statistical independence reads

$p_{x,y}(x, y) = p_x(x)\, p_y(y)$

and the corresponding property reads

$E[g(x)h(y)] = E[g(x)]\,E[h(y)]$

Properties:
- the statistical independence of two r.v.'s implies their uncorrelatedness
- independence is a stronger property than uncorrelatedness; only for Gaussian variables do uncorrelatedness and independence coincide.
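A minimal numerical illustration of this gap (a sketch of my own; NumPy and the particular choice y = x² are not from the notes): if x is standard normal and y = x², then x and y are uncorrelated but clearly not independent.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)   # x ~ N(0, 1)
y = x**2                           # y is a deterministic function of x, hence dependent on x

# Uncorrelated: cov(x, y) = E[x^3] - E[x] E[x^2] = 0
print("cov(x, y)   ≈", np.cov(x, y)[0, 1])
# Not independent: with g(x) = x^2 and h(y) = y, E[g(x) h(y)] != E[g(x)] E[h(y)]
print("E[x^2 y]    ≈", np.mean(x**2 * y))           # close to E[x^4] = 3
print("E[x^2] E[y] ≈", np.mean(x**2) * np.mean(y))  # close to 1
```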

Example: Consider the discrete random vector (X, Y) from our example, with joint probabilities p(X, Y) (rows: Y, columns: X; the right column and bottom row give the marginals):

Y \ X  |  0      1      2     |  p_Y
-------+----------------------+------
  0    |  1/18   1/9    1/6   |  6/18
  1    |  1/9    1/18   1/6   |  6/18
  2    |  1/6    1/9    1/18  |  6/18
-------+----------------------+------
 p_X   |  6/18   5/18   7/18  |

Are X, Y independent? To check, let us construct a table whose entries are the products of the corresponding marginal probabilities of X and Y.

Y \ X  |       0             1             2
-------+------------------------------------------
  0    |  6/18 · 6/18   6/18 · 5/18   6/18 · 7/18
  1    |  6/18 · 6/18   6/18 · 5/18   6/18 · 7/18
  2    |  6/18 · 6/18   6/18 · 5/18   6/18 · 7/18

These products do not equal the corresponding joint probabilities (e.g. 6/18 · 6/18 = 1/9 ≠ 1/18 = p(0, 0)). Hence X, Y are not independent. Are they uncorrelated?

$E(XY) = \sum_{x}\sum_{y} x\, y\, p(x, y)$

$= 0\cdot0\cdot\tfrac{1}{18} + 0\cdot1\cdot\tfrac{1}{9} + 0\cdot2\cdot\tfrac{1}{6} + 1\cdot0\cdot\tfrac{1}{9} + 1\cdot1\cdot\tfrac{1}{18} + 1\cdot2\cdot\tfrac{1}{9} + 2\cdot0\cdot\tfrac{1}{6} + 2\cdot1\cdot\tfrac{1}{6} + 2\cdot2\cdot\tfrac{1}{18} = \tfrac{15}{18}$

However, $E(X)E(Y) = \tfrac{19}{18}$, so $\mathrm{Cov}(X, Y) = \tfrac{15}{18} - \tfrac{19}{18} = -\tfrac{2}{9} \neq 0$ and hence X, Y are correlated.
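The arithmetic above can be checked with a few lines of code (a sketch; the NumPy array layout, with rows indexed by Y and columns by X, mirrors the table):

```python
import numpy as np

# Joint probabilities p(Y = i, X = j): rows are Y = 0, 1, 2, columns are X = 0, 1, 2
P = np.array([[1/18, 1/9,  1/6],
              [1/9,  1/18, 1/6],
              [1/6,  1/9,  1/18]])
vals = np.array([0, 1, 2])

pX = P.sum(axis=0)   # marginal of X: [6/18, 5/18, 7/18]
pY = P.sum(axis=1)   # marginal of Y: [6/18, 6/18, 6/18]

print("independent:", np.allclose(P, np.outer(pY, pX)))   # False
E_XY = sum(x * y * P[y, x] for x in vals for y in vals)
print("E[XY]    =", E_XY)                                  # 15/18
print("E[X]E[Y] =", (vals @ pX) * (vals @ pY))             # 19/18
```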

Central limit theorem (CLT)


Classical probability is concerned with random variables and sequences of independent, identically distributed (iid) r.v.'s. A very important case is the sequence of partial sums of iid r.v.'s:

$x_k = \sum_{i=1}^{k} z_i$

Consider the normalised variables

$y_k = \dfrac{x_k - m_{x_k}}{\sigma_{x_k}}$

where $m_{x_k}$ and $\sigma_{x_k}^2$ are the mean and variance of $x_k$. The central limit theorem asserts that the distribution of $y_k$ converges to the standard normal distribution as $k \to \infty$. An analogous formulation of the CLT holds for random vectors.

The CLT justifies the use of Gaussian variables for modelling random phenomena.

In practice, sums of even a relatively small number of r.v.'s will appear approximately Gaussian, even if the individual components are not identically distributed.
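A quick simulation of this effect (a sketch; the choice of uniform summands and k = 12 is mine, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
k, n = 12, 100_000

# n realisations of the partial sum of k iid uniform(0, 1) variables, standardised
z = rng.uniform(0.0, 1.0, size=(n, k))
x = z.sum(axis=1)
y = (x - x.mean()) / x.std()

# The empirical moments of y are close to those of N(0, 1)
print("mean     ≈", y.mean())             # ~ 0
print("variance ≈", y.var())              # ~ 1
print("kurtosis ≈", np.mean(y**4) - 3)    # ~ 0 (a single standardised uniform has kurtosis -1.2)
```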

Conditional probability
Conditional density: Consider random vectors x, y with marginal pdfs $p_x(x)$ and $p_y(y)$, respectively, and joint pdf $p_{x,y}(x, y)$. The conditional density of x given y is defined as

$p_{x|y}(x|y) = \dfrac{p_{x,y}(x, y)}{p_y(y)}$

Similarly, the conditional density of y given x is defined as

$p_{y|x}(y|x) = \dfrac{p_{x,y}(x, y)}{p_x(x)}$

Conditional probability distributions allow us to address questions such as: what is the probability density of a random vector x given that a random vector y takes a fixed value $y_0$? For statistically independent r.v.'s the conditional densities equal the respective marginal densities.

Example: Consider the bivariate discrete random vector from before (rows: Y, columns: X):

Y \ X  |  0      1      2     |  p_Y
-------+----------------------+------
  0    |  1/18   1/9    1/6   |  6/18
  1    |  1/9    1/18   1/6   |  6/18
  2    |  1/6    1/9    1/18  |  6/18
-------+----------------------+------
 p_X   |  6/18   5/18   7/18  |

The conditional probability function of Y given X = 1 is

     Y      |       0              1              2
------------+--------------------------------------------
 p(Y | X=1) |  (1/9)/(5/18)   (1/18)/(5/18)   (1/9)/(5/18)
            |     = 2/5           = 1/5           = 2/5
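The same conditional distribution can be read off the joint table numerically (a sketch, reusing the rows-Y / columns-X layout assumed above):

```python
import numpy as np

# Joint table from the example: rows are Y = 0, 1, 2, columns are X = 0, 1, 2
P = np.array([[1/18, 1/9,  1/6],
              [1/9,  1/18, 1/6],
              [1/6,  1/9,  1/18]])

pX = P.sum(axis=0)                  # marginal of X
cond_Y_given_X1 = P[:, 1] / pX[1]   # p(Y | X = 1) = p(Y, X = 1) / p_X(1)
print(cond_Y_given_X1)              # [0.4, 0.2, 0.4] = [2/5, 1/5, 2/5]
```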

Bayes rule: From the definitions of the conditional densities we obtain the following alternative formulas for the joint pdf:

$p_{x,y}(x, y) = p_{y|x}(y|x)\, p_x(x) = p_{x|y}(x|y)\, p_y(y)$

From the above follows the so-called Bayes rule for calculating the conditional density of y given x:

$p_{y|x}(y|x) = \dfrac{p_{x|y}(x|y)\, p_y(y)}{p_x(x)}$

where the denominator can be calculated by integration:

$p_x(x) = \int p_{x|y}(x|\eta)\, p_y(\eta)\, d\eta$

Bayes rule allows us to compute the posterior density $p_{y|x}(y|x)$ of y given the observed vector x, provided the prior distribution $p_y(y)$ is known or assumed.

Conditional expectations:

$E[g(x, y)\,|\,y] = \int g(\xi, y)\, p_{x|y}(\xi|y)\, d\xi$

The conditional expectation is a random variable: it depends on the r.v. y. The following relationship holds:

$E[g(x, y)] = E\big[E[g(x, y)\,|\,y]\big]$
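A minimal numerical illustration of Bayes rule with a discrete prior, where the normalising integral reduces to a sum (a sketch with toy numbers of my own, not from the notes):

```python
import numpy as np

# Discrete prior over y (two values) and a Gaussian likelihood p(x | y)
p_y = np.array([0.7, 0.3])      # prior p_y(y) for y = 0, 1
means = np.array([0.0, 2.0])    # mean of x under each value of y

def likelihood(x, m, var=1.0):
    return np.exp(-(x - m)**2 / (2 * var)) / np.sqrt(2 * np.pi * var)

x_obs = 1.5
lik = likelihood(x_obs, means)       # p_{x|y}(x_obs | y)
p_x = np.sum(lik * p_y)              # denominator: a sum replaces the integral here
posterior = lik * p_y / p_x          # Bayes rule: p_{y|x}(y | x_obs)
print(posterior, posterior.sum())    # posterior over y, sums to 1
```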

The family of multivariate Gaussian densities:

$p_x(x) = \dfrac{1}{(2\pi)^{n/2} (\det C_x)^{1/2}} \exp\!\left(-\tfrac{1}{2}(x - m_x)^T C_x^{-1} (x - m_x)\right)$

where n is the dimension of x, $m_x$ is the mean, and $C_x$ is the covariance matrix of x, assumed to be strictly positive definite.

Properties:
- $m_x$ and $C_x$ uniquely define the Gaussian pdf
- closed under linear transforms: if x is a Gaussian random vector, then y = Ax is also Gaussian, with $m_y = A m_x$ and $C_y = A C_x A^T$
- the marginal and conditional densities are Gaussian
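The closure under linear transforms is easy to check by simulation (a sketch; the 2-D setting, the parameters, and the map A are illustrative assumptions of mine):

```python
import numpy as np

rng = np.random.default_rng(2)

m_x = np.array([1.0, -1.0])
C_x = np.array([[2.0, 0.5],
                [0.5, 1.0]])
A = np.array([[1.0, 2.0],
              [0.0, 1.0]])

x = rng.multivariate_normal(m_x, C_x, size=200_000)   # samples of x
y = x @ A.T                                            # y = A x, sample by sample

print("m_y ≈", y.mean(axis=0), "   A m_x =", A @ m_x)
print("C_y ≈\n", np.cov(y, rowvar=False), "\nA C_x A^T =\n", A @ C_x @ A.T)
```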

Uncorrelatedness and geometric structure: If the covariance matrix $C_x$ of the multidimensional Gaussian density is not diagonal, then the components of x are not independent. $C_x$ is a symmetric and positive definite matrix, hence it can be represented as

$C_x = E D E^T = \sum_{i=1}^{n} \lambda_i\, e_i e_i^T$

where E is an orthogonal matrix containing the eigenvectors of $C_x$ as its columns and $D = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$ is a diagonal matrix containing the corresponding eigenvalues of $C_x$. The transform

$u = E^T (x - m_x)$

rotates the data so that the components of u are uncorrelated ($C_u = E^T C_x E = D$ is diagonal) and, being jointly Gaussian, also independent.
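A small simulation of this decorrelating rotation (a sketch; the parameters are illustrative, and np.linalg.eigh is used for the eigendecomposition):

```python
import numpy as np

rng = np.random.default_rng(3)

# Correlated 2-D Gaussian data (toy parameters of my own choosing)
m_x = np.array([0.0, 0.0])
C_x = np.array([[3.0, 1.2],
                [1.2, 1.0]])
x = rng.multivariate_normal(m_x, C_x, size=100_000)

# Eigendecomposition C_x = E D E^T: eigh returns eigenvalues and eigenvectors (columns of E)
eigvals, E = np.linalg.eigh(C_x)

# u = E^T (x - m_x): the sample covariance of u is approximately diag(eigvals)
u = (x - m_x) @ E
print("cov(u) ≈\n", np.cov(u, rowvar=False))
print("eigenvalues:", eigvals)
```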

The cross-section of a Gaussian pdf at a constant value of the density is a hyper-ellipsoid

$(x - m_x)^T C_x^{-1} (x - m_x) = c$

centered at the mean, with axes parallel to the eigenvectors of $C_x$ and the corresponding eigenvalues giving the variances along those axes.

Higher-order Statistics
Consider a scalar r.v. x with probability density function $p_x(x)$. The j-th moment of x is

$\alpha_j = E[x^j] = \int \xi^j\, p_x(\xi)\, d\xi$

and the j-th central moment of x is

$\mu_j = E[(x - \alpha_1)^j] = \int (\xi - m_x)^j\, p_x(\xi)\, d\xi$


Skewness and kurtosis: The third central moment, called skewness, provides a measure of the asymmetry of the pdf. The fourth-order statistic called kurtosis indicates the non-Gaussianity of a r.v. For a zero-mean r.v. it is defined as

$\mathrm{kurt}(x) = E[x^4] - 3\,(E[x^2])^2$

Distributions with negative kurtosis are called subgaussian (usually flatter than the Gaussian, or multimodal). Distributions with positive kurtosis are called supergaussian (usually more sharply peaked than the Gaussian, with longer tails).

Properties of kurtosis:
- for two statistically independent r.v.'s x, y: $\mathrm{kurt}(x + y) = \mathrm{kurt}(x) + \mathrm{kurt}(y)$
- for any scalar a: $\mathrm{kurt}(ax) = a^4\, \mathrm{kurt}(x)$

Example: The Laplacian density has pdf

$p_x(x) = \dfrac{\lambda}{2} \exp(-\lambda |x|)$

Example: The exponential power family of pdfs (with zero mean) contains the Gaussian, Laplacian and uniform pdfs as special cases:

$p_x(x) = C \exp\!\left(-\dfrac{|x|^{\nu}}{\nu\, E[|x|^{\nu}]}\right)$

For $\nu = 2$ the above pdf is equivalent to the Gaussian pdf:

$p_x(x) = C \exp\!\left(-\dfrac{|x|^2}{2\, E[|x|^2]}\right) = C \exp\!\left(-\dfrac{x^2}{2\sigma_x^2}\right)$

$\nu = 1$ gives the Laplacian pdf and $\nu \to \infty$ yields the uniform pdf.
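A quick empirical check of the sub/supergaussian categories above (a sketch; the sample-based kurtosis estimate and the sample sizes are my own choices):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000

def kurt(x):
    """kurt(x) = E[x^4] - 3 (E[x^2])^2 for a (sample-)centred x."""
    x = x - x.mean()
    return np.mean(x**4) - 3 * np.mean(x**2)**2

print("uniform   (subgaussian):  ", kurt(rng.uniform(-1.0, 1.0, n)))   # negative
print("gaussian  (kurtosis ~ 0): ", kurt(rng.standard_normal(n)))      # ~ 0
print("laplacian (supergaussian):", kurt(rng.laplace(0.0, 1.0, n)))    # positive
```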

