
COMP-761: Quantum Information Theory
Winter 2009
Lecture 4: January 20
Lecturer: Patrick Hayden
Scribe: Mazen Al Borno

4.1 Erasure Channel

Back in 1948, it was a surprise to find that positive rates were achievable in general. If it seems obvious to you, keep in mind that the rate achievable using repetition codes over a non-trivial erasure channel is exactly zero: the only way to drive the error probability to zero by repetition is to repeat each bit an unbounded number of times, so the rate $1/n$ tends to zero.
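
To make the repetition-code claim concrete, here is a minimal numerical sketch (my own illustration, not from the lecture). It assumes a binary erasure channel with erasure probability p; an n-fold repetition code fails only when all n copies are erased, so its error probability is p^n while its rate is 1/n.

    # Rate vs. error probability for an n-fold repetition code on a binary
    # erasure channel with erasure probability p (illustrative value).
    p = 0.5

    for n in (1, 2, 4, 8, 16, 32):
        rate = 1.0 / n      # one message bit per n channel uses
        p_err = p ** n      # decoding fails only if every copy is erased
        print(f"n = {n:2d}   rate = {rate:.4f}   P_err = {p_err:.2e}")

The error probability vanishes only as n grows, and the rate vanishes with it, which is why repetition alone cannot give a positive rate with arbitrarily small error.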

4.2 Jointly Typical Sequences


The set of jointly typical sequences $A_\epsilon^{(n)} \subseteq \mathcal{X}^n \times \mathcal{Y}^n$ with respect to $p(x, y)$ is
$$A_\epsilon^{(n)} = \Big\{ (x^n, y^n) \in \mathcal{X}^n \times \mathcal{Y}^n : \Big|{-\tfrac{1}{n}} \log p(x^n) - H(X)\Big| < \epsilon,\ \Big|{-\tfrac{1}{n}} \log p(y^n) - H(Y)\Big| < \epsilon,\ \Big|{-\tfrac{1}{n}} \log p(x^n, y^n) - H(X, Y)\Big| < \epsilon \Big\}.$$
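
The three membership conditions translate directly into a test. Below is a small sketch, assuming the joint pmf is given as a matrix pxy with pxy[x, y] = p(x, y) over finite integer alphabets; the function and argument names are illustrative choices, not notation from the lecture.

    import numpy as np

    def entropy(q):
        """Shannon entropy (in bits) of a probability vector or matrix q."""
        q = np.asarray(q, dtype=float).ravel()
        q = q[q > 0]
        return float(-np.sum(q * np.log2(q)))

    def is_jointly_typical(xs, ys, pxy, eps):
        """Check whether (x^n, y^n) belongs to A_eps^(n) for the joint pmf pxy."""
        pxy = np.asarray(pxy, dtype=float)
        px, py = pxy.sum(axis=1), pxy.sum(axis=0)        # marginals p(x), p(y)
        HX, HY, HXY = entropy(px), entropy(py), entropy(pxy)

        xs, ys = np.asarray(xs), np.asarray(ys)
        sx = -np.mean(np.log2(px[xs]))                   # -(1/n) log p(x^n)
        sy = -np.mean(np.log2(py[ys]))                   # -(1/n) log p(y^n)
        sxy = -np.mean(np.log2(pxy[xs, ys]))             # -(1/n) log p(x^n, y^n)

        return (abs(sx - HX) < eps and
                abs(sy - HY) < eps and
                abs(sxy - HXY) < eps)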

Theorem 4.2.1 (Joint Asymptotic Equipartition (Joint AEP) Theorem).

1. If $(x^n, y^n) \sim p(x^n, y^n) = \prod_{j=1}^{n} p(x_j, y_j)$ (i.i.d. pairs), then $\Pr\{(x^n, y^n) \in A_\epsilon^{(n)}\} \to 1$ as $n \to \infty$.

2. $|A_\epsilon^{(n)}| \le 2^{n(H(X,Y)+\epsilon)}$, and for every $\epsilon > 0$, $|A_\epsilon^{(n)}| \ge (1-\epsilon)\, 2^{n(H(X,Y)-\epsilon)}$ for $n$ sufficiently large.

3. If $(\tilde{x}^n, \tilde{y}^n) \sim p(x^n)\, p(y^n)$ (i.e., same marginals as $(x^n, y^n)$, but $\tilde{x}^n$ and $\tilde{y}^n$ are independent), then
$$\Pr\{(\tilde{x}^n, \tilde{y}^n) \in A_\epsilon^{(n)}\} \le 2^{-n(I(X;Y)-3\epsilon)} \quad \text{and} \quad \Pr\{(\tilde{x}^n, \tilde{y}^n) \in A_\epsilon^{(n)}\} \ge (1-\epsilon)\, 2^{-n(I(X;Y)+3\epsilon)}$$
for $n$ sufficiently large.
Proof (of the Joint AEP):

1. Follows after applying the AEP three times.

2. Same procedure as for the AEP.

3. We first provide an intuitive argument. The size of the $X$-typical set is about $2^{nH(X)}$ and the size of the $Y$-typical set is about $2^{nH(Y)}$. However, the size of the jointly typical set is only about $2^{nH(X,Y)}$. Therefore, assuming all sequences are equiprobable,
$$\Pr\{(\tilde{x}^n, \tilde{y}^n) \in A_\epsilon^{(n)}\} \approx \frac{2^{nH(X,Y)}}{2^{nH(X)}\, 2^{nH(Y)}} = 2^{-nI(X;Y)}.$$
Formally,
$$\Pr\{(\tilde{x}^n, \tilde{y}^n) \in A_\epsilon^{(n)}\} = \sum_{(x^n, y^n) \in A_\epsilon^{(n)}} p(x^n)\, p(y^n) \le |A_\epsilon^{(n)}|\, 2^{-n(H(X)-\epsilon)}\, 2^{-n(H(Y)-\epsilon)} \le 2^{n(H(X,Y)+\epsilon)}\, 2^{-n(H(X)-\epsilon)}\, 2^{-n(H(Y)-\epsilon)} = 2^{-n(I(X;Y)-3\epsilon)}.$$
Proving the lower bound on $\Pr\{(\tilde{x}^n, \tilde{y}^n) \in A_\epsilon^{(n)}\}$ is similar.
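
Part 3 can be sanity-checked numerically. The sketch below (my own illustration, not from the lecture) takes the joint distribution of a uniform input through a binary symmetric channel with crossover 0.1, draws independent sequences with the same marginals, and compares a Monte Carlo estimate of $\Pr\{(\tilde{x}^n, \tilde{y}^n) \in A_\epsilon^{(n)}\}$ with the bounds $2^{-n(I(X;Y)-3\epsilon)}$ and $(1-\epsilon)\, 2^{-n(I(X;Y)+3\epsilon)}$; all parameter values are arbitrary choices.

    import numpy as np

    rng = np.random.default_rng(0)

    # Joint pmf: uniform X through a BSC with crossover q (illustrative).
    q = 0.1
    pxy = np.array([[(1 - q) / 2, q / 2],
                    [q / 2, (1 - q) / 2]])
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)

    H = lambda v: float(-np.sum(v[v > 0] * np.log2(v[v > 0])))
    HX, HY, HXY = H(px), H(py), H(pxy.ravel())
    I = HX + HY - HXY                     # mutual information I(X;Y)

    n, eps, trials = 20, 0.1, 50000
    hits = 0
    for _ in range(trials):
        # Independent sequences with the correct marginals (both uniform here).
        xs = rng.integers(0, 2, size=n)
        ys = rng.integers(0, 2, size=n)
        sx = -np.mean(np.log2(px[xs]))
        sy = -np.mean(np.log2(py[ys]))
        sxy = -np.mean(np.log2(pxy[xs, ys]))
        if abs(sx - HX) < eps and abs(sy - HY) < eps and abs(sxy - HXY) < eps:
            hits += 1

    print("Monte Carlo estimate:", hits / trials)
    print("upper bound 2^{-n(I-3eps)}:", 2.0 ** (-n * (I - 3 * eps)))
    print("lower bound (1-eps) 2^{-n(I+3eps)}:", (1 - eps) * 2.0 ** (-n * (I + 3 * eps)))

With these small parameters the estimate typically lands between the two bounds, though n is far too small for the exponents to be tight.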


4.3 Shannon's Noisy Coding Achievability

Fix $p(x)$. Let $\epsilon > 0$. You want to generate a $(2^{nR}, n)$ code.


1. Choose $x^n(1), x^n(2), \ldots, x^n(2^{nR})$ i.i.d. according to $p^n(x^n) = \prod_{j=1}^{n} p(x_j)$.

2. Use typical set decoding: decode $y^n$ as the unique $w$ such that $(x^n(w), y^n) \in A_\epsilon^{(n)}$. If no such $w$ exists, it's a failure.
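
A minimal end-to-end simulation of this random-coding construction, assuming a binary symmetric channel with crossover probability 0.1 and a uniform input distribution $p(x)$; the block length, rate, epsilon, and all names below are illustrative choices rather than anything fixed by the lecture.

    import numpy as np

    rng = np.random.default_rng(1)

    # Channel and input distribution: uniform X through a BSC (illustrative).
    crossover = 0.1
    pxy = np.array([[(1 - crossover) / 2, crossover / 2],
                    [crossover / 2, (1 - crossover) / 2]])
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)

    H = lambda v: float(-np.sum(v[v > 0] * np.log2(v[v > 0])))
    HX, HY, HXY = H(px), H(py), H(pxy.ravel())

    def jointly_typical(xs, ys, eps):
        """Membership test for A_eps^(n) as defined in Section 4.2."""
        sx = -np.mean(np.log2(px[xs]))
        sy = -np.mean(np.log2(py[ys]))
        sxy = -np.mean(np.log2(pxy[xs, ys]))
        return abs(sx - HX) < eps and abs(sy - HY) < eps and abs(sxy - HXY) < eps

    # Step 1: random codebook of 2^{nR} codewords drawn i.i.d. from p(x).
    n, R, eps = 200, 0.05, 0.15           # R < I(X;Y) - 3*eps (about 0.08 here)
    M = int(round(2 ** (n * R)))
    codebook = rng.integers(0, 2, size=(M, n))

    # Step 2: typical-set decoding.
    def decode(yn, eps):
        """Return the unique jointly typical message index, or None on failure."""
        hits = [w for w in range(M) if jointly_typical(codebook[w], yn, eps)]
        return hits[0] if len(hits) == 1 else None

    errors, trials = 0, 20
    for _ in range(trials):
        w = int(rng.integers(0, M))
        yn = codebook[w] ^ (rng.random(n) < crossover)   # pass x^n(w) through the BSC
        if decode(yn, eps) != w:
            errors += 1
    print(f"M = {M} codewords, empirical block error rate = {errors / trials:.2f}")

With these small parameters most trials already decode correctly; the occasional failure comes from the true pair falling outside $A_\epsilon^{(n)}$, matching error source (a) in the analysis that follows.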
We begin by estimating $E_c P_e^{(n)}$, which is the average over all codes $c$ of the average error probability. There are two possible sources of error for a given $w$:

(a) $(x^n(w), y^n)$ is not jointly typical.

(b) There exists a $w' \neq w$ such that $(x^n(w'), y^n)$ is jointly typical.

Let $M = 2^{nR}$. Then
$$E_c P_e^{(n)} = E_c \frac{1}{M} \sum_{w=1}^{M} P_e^{(n)}(w) = \frac{1}{M} \sum_{w=1}^{M} E_c P_e^{(n)}(w) = E_c P_e^{(n)}(1) \quad \text{[by the symmetry of the code with respect to permutation of messages]}.$$

Error types: let $y^n$ be generated by the channel from input $x^n(1)$.

(a) $\Pr\{(x^n(1), y^n) \notin A_\epsilon^{(n)}\} < \epsilon$ for $n$ sufficiently large.

(b) If $w' \neq 1$, then $x^n(w')$ and $x^n(1)$ are independent, which implies that $y^n$ and $x^n(w')$ are independent. By the Joint AEP,
$$\Pr\{(x^n(w'), y^n) \in A_\epsilon^{(n)}\} \le 2^{-n(I(X;Y)-3\epsilon)},$$
$$\Pr\{\exists\, w' \neq w \text{ such that } (x^n(w'), y^n) \in A_\epsilon^{(n)}\} \le (2^{nR} - 1)\, 2^{-n(I(X;Y)-3\epsilon)}.$$
The last step is justified by the union bound $\Pr\{A \cup B\} \le \Pr\{A\} + \Pr\{B\}$. If $R < I(X;Y) - 3\epsilon$, then $\Pr\{\exists\, w' \neq w \text{ such that } (x^n(w'), y^n) \in A_\epsilon^{(n)}\} < \epsilon$ for $n$ sufficiently large. Therefore, $E_c P_e^{(n)} < 2\epsilon$ for $n$ sufficiently large.

Since the expectation over codes of $P_e^{(n)}$ is no more than $2\epsilon$, there must exist a code with this average error rate. Starting from that code, we now have to find another code with a good worst-case error criterion $\max_w P_e^{(n)}(w)$. Expurgation: throw away the half of the codewords with the worst $P_e^{(n)}(w)$. What remains will have $P_e^{(n)}(w) \le 4\epsilon$.

Proof: Suppose not. Then all tossed codewords must have $P_e^{(n)}(w) > 4\epsilon$, so
$$P_e^{(n)} = \frac{1}{M} \sum_{w=1}^{M} P_e^{(n)}(w) \ge \frac{1}{M} \sum_{w \text{ tossed}} P_e^{(n)}(w) > \frac{1}{M} \cdot \frac{M}{2} \cdot 4\epsilon = 2\epsilon.$$
This contradicts the known bound on $P_e^{(n)}$.

The expurgated code has a rate of $\frac{1}{n} \log\!\left[2^{nR}/2\right] = R - \frac{1}{n}$. Why can't we do better?


Proof (the converse): Assume for the moment that we have a code at rate $R$ where $P_e^{(n)} = 0$. (We'll relax the assumption in the next lecture.) Assign a uniform distribution to the messages. We have a Markov chain $w \to x^n \to y^n \to \hat{w} = w$. Then
$$\begin{aligned}
nR = H(w) &= I(w; y^n) + H(w | y^n) \qquad (H(w|y^n) = 0 \text{ since } P_e^{(n)} = 0)\\
&\le I(x^n; y^n) \qquad \text{(by the data-processing inequality)}\\
&\stackrel{(*)}{\le} \sum_{j=1}^{n} I(x_j; y_j)\\
&\le n \max_{p(x)} I(X; Y).
\end{aligned}$$
Why $(*)$?
$$I(x^n; y^n) = H(y^n) - H(y^n | x^n) = H(y^n) - \sum_{j=1}^{n} H(y_j | x^n, y_1, \ldots, y_{j-1}).$$
Since the channel is memoryless, $P(y_j | x^n, y_1, \ldots, y_{j-1}) = P(y_j | x_j)$, so
$$I(x^n; y^n) = H(y^n) - \sum_{j=1}^{n} H(y_j | x_j) \le \sum_{j=1}^{n} \big[ H(y_j) - H(y_j | x_j) \big] \ \text{(by subadditivity)} \ = \sum_{j=1}^{n} I(x_j; y_j).$$

4.4 Binary Symmetric Channel

Reminder: $p(0|0) = p(1|1) = 1 - p$ and $p(0|1) = p(1|0) = p$.


$$I(X; Y) = H(Y) - H(Y|X) = H(Y) - \sum_{x} p(x)\, H(Y | X = x) = H(Y) - H_2(p) \le 1 - H_2(p),$$
where $H_2(p)$ is the binary entropy ($H_2(p) = H(X)$ for $X \sim \mathrm{Bernoulli}(p)$) and the last step uses $H(Y) \le 1$ for binary $Y$. For the uniform input $p(x) = \frac{1}{2}$, the output $Y$ is also uniform, so $I(X; Y) = 1 - H_2(p)$. Therefore, $\max_{p(x)} I(X; Y) = 1 - H_2(p)$.
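
A quick numerical check of this capacity formula; the sketch and the sample values of p are my own illustration, not part of the lecture.

    import numpy as np

    def h2(p):
        """Binary entropy H2(p) in bits."""
        if p in (0.0, 1.0):
            return 0.0
        return float(-p * np.log2(p) - (1 - p) * np.log2(1 - p))

    for p in (0.0, 0.05, 0.11, 0.25, 0.5):
        print(f"p = {p:.2f}   capacity 1 - H2(p) = {1 - h2(p):.4f}")

The capacity is symmetric under swapping p and 1 - p, and it drops to zero only at p = 1/2, where the output carries no information about the input.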
