Académique Documents
Professionnel Documents
Culture Documents
The I-Measure
Raymond W. Yeung 2010
Department of Information Engineering
The Chinese University of Hong Kong
An Example
X
1
X
2
H ( ) ,
X
1
X
2
H ( )
X
1
X
2
H ( )
X
2
X
1
I ( ) ;
X
1
H ( )
2
X H ( )
Substitution of Symbols
H/I
,
;
|
(
X
1
X
2
) = H(X
1
|X
2
)
(
X
2
X
1
) = H(X
2
|X
1
),
(
X
1
X
2
) = I(X
1
; X
2
)
2. Inclusion-Exclusion formulation in set-theory
(
X
1
X
2
) =
(
X
1
) +
(
X
2
)
(
X
1
X
2
)
corresponds to
H(X
1
, X
2
) = H(X
1
) +H(X
2
) I(X
1
; X
2
)
in information theory.
3.1 Preliminaries
Denition 3.1 The eld F
n
generated by sets
X
1
,
X
2
, ,
X
n
is the collection
of sets which can be obtained by any sequence of usual set operations (union,
intersection, complement, and dierence) on
X
1
,
X
2
, ,
X
n
.
Denition 3.2 The atoms of F
n
are sets of the form
n
i=1
Y
i
, where Y
i
is either
X
i
or
X
c
i
, the complement of
X
i
.
Example 3.3
The sets
X
1
and
X
2
generate the eld F
2
.
There are 4 atoms in F
2
.
There are a total of 16 sets in F
2
Denition 3.4 A real function dened on F
n
is called a signed measure if
it is set-additive, i.e., for disjoint A and B in F
n
,
(A B) = (A) +(B).
Remark () = 0.
Example 3.5
X
1
X
2
A signed measure on F
2
is completely specied by the values on the
atoms
(
X
1
X
2
), (
X
c
1
X
2
), (
X
1
X
c
2
), (
X
c
1
X
c
2
)
The value of on other sets in F
2
are obtained by set-additivity.
Section 3.3
Construction of the I-Measure !*
Let
X be a set corresponding to a r.v. X.
N
n
= {1, 2, , n}.
Universal set
=
iN
n
X
i
.
Empty atom of F
n
A
0
=
iN
n
X
c
i
A is the set of other atoms of F
n
, called non-empty atoms. |A| = 2
n
1.
A signed measure on F
n
is completely specied by the values of on
the nonempty atoms of F
n
.
Notations For nonempty subset G of N
n
:
X
G
= (X
i
, i G)
X
G
=
iG
X
i
Theorem 3.6 Let
B =
X
G
: G is a nonempty subset of N
n
.
Then a signed measure on F
n
is completely specied by {(B), B B}, which
can be any set of real numbers.
Proof of Theorem 3.6
|A| = |B| = 2
n
1
u column k-vector of (A), A A
h column k-vector of (B), B B
Obviously can write h = C
n
u, where C
n
is a unique k k matrix.
On the other hand, for each A A, (A) can be expressed as a linear
combination of (B), B B by applying
(A B C) = (AC) +(B C) (A B C)
(AB) = (A B) (B).
(see Appendix 3.A) That is, u = D
n
h.
Then u = (D
n
C
n
)u, showing that D
n
= (C
n
)
1
is unique.
Two Lemmas
Lemma 3.7
(A B C) = (A C) +(B C) (A B C) (C).
Lemma 3.8
I(X; Y |Z) = H(X, Z) +H(Y, Z) H(X, Y, Z) H(Z).
Construct the I-Measure
on F
n
using by dening
(
X
G
) = H(X
G
)
for all nonempty subsets G of N
n
.
, G
of N
n
where G and G
are
nonempty:
(
X
G
X
G
X
G
) = I(X
G
; X
G
|X
G
)
G
(
X
G
X
G
) = I(X
G
; X
G
)
G = G
(
X
G
X
G
) = H(X
G
|X
G
)
G = G
and G
(
X
G
) = H(X
G
)
Theorem 3.9
dened on F
n
.
Can employ set-theoretic tools to manipulate expressions of Shannons
information measures.
Proof of Theorem 3.9
(
X
G
X
G
X
G
)
=
(
X
GG
) +
(
X
G
G
)
(
X
GG
G
)
(
X
G
)
= H(X
GG
) +H(X
G
G
) H(X
GG
G
) H(X
G
)
= I(X
G
; X
G
|X
G
),
In order that
(
X
G
) = H(X
G
)
for all nonempty subsets G of N
n
.
Thus
is nonnegative for n = 2.
For n = 3,
(
X
1
X
2
X
3
) = I(X
1
; X
2
; X
3
) can be negative.
Example 3.10
X
1
, X
2
i.i.d. binary r.v.s uniform on {0, 1}
X
3
= X
1
+ X
2
mod 2
Easy to check:
H(X
i
) = 1, for all i
X
1
, X
2
, X
3
are pairwise independent, so that
H(X
i
, X
j
) = 2 and I(X
i
; X
j
) = 0, for all i = j
Under these constraints, I(X
1
; X
2
; X
3
) = 1.
3.5 Information Diagrams
X
1
X
1
X
2
H ( )
X
1
H ( )
X
2
X
3
X
1
I ( ) ; X
3
; ; X
1
I ( ) X
3
X
2
; X
1
I X
3
X
2
( )
, X
3
X
2
X
1
H ( )
0
0 0
1 1
1
1
X
1
X
2
X
3
The information diagram for Example 3.10
X
1
X
2
X
3
X
4
Theorem 3.11 If there is no constraint on X
1
, X
2
, , X
n
, then
can take
any set of nonnegative values on the nonempty atoms of F
n
.
Proof
Let Y
A
, A A be mutually independent r.v.s.
Dene X
i
, i = 1, 2, , n by
X
i
= (Y
A
: A A and A
X
i
).
Claim: X
1
, X
2
, , X
n
so constructed induce the I-Measure
such that
(A) = H(Y
A
), for all A A.
which are arbitrary nonnegative numbers.
Consider
H(X
G
) = H(X
i
, i G)
= H((Y
A
: A A and A
X
i
), i G)
= H(Y
A
: A A and A
X
G
)
=
AA:A
X
G
H(Y
A
)
On the other hand,
H(X
G
) =
(
X
G
) =
AA:A
X
G
(A)
Thus
AA:A
X
G
H(Y
A
) =
AA:A
X
G
(A)
One solution is
(A) = H(Y
A
), for all A A.
By the uniqueness of
X
3
X
2
can be suppressed.
The values of
(
X
1
;
X
2
;
X
3
) =
(
X
1
;
X
3
) = I(X
1
; X
3
)
Thus,
is a measure.
X
3
X
1
X
2
For n = 4,
X
1
X
c
2
X
3
X
c
4
X
1
X
c
2
X
3
X
4
X
1
X
c
2
X
c
3
X
4
X
1
X
2
X
c
3
X
4
X
c
1
X
2
X
c
3
X
4
The information diagram can be displayed in two dimensions.
The values of
is a measure.
X
1
X
4
X
3
X
2
...
X
1
X
2
X
n -1
X
n
For a general n, the information diagram can be displayed in two dimen-
sions because certain atoms can be suppressed.
The values of
is a measure.
See Ch. 12 for a detailed discussion in the context of Markov random eld.
3.6 Examples of Applications
To obtain information identities is WYSIWYG.
To obtain information inequalities:
If
is nonnegative, if A B, then
(A)
(B)
because
(A)
(A) +
(B A) =
(B)
If
1
:
(X
1
, X
2
) (X
3
, X
4
)
(X
1
, X
3
) (X
2
, X
4
)
2
:
(X
1
, X
2
, X
3
) X
4
(X
1
, X
2
, X
4
) X
3
Each (conditional) independency forces
X
4
).
Analysis of Example 12.12
1
2