
Digital Communications 2008-09
L. Paura, Lecture 2
Data Compression

Summary:
Information source model
Measure of information: entropy of a discrete random variable
Entropy of an information source
Conditional entropy
Source Coding Theorem (Shannon's first theorem)
Huffman coding
Lempel-Ziv coding

Reference: Proakis, Salehi (2nd ed.), Chapter 4
Information Source Model

Examples of information sources:
Audio broadcasting system: speech signal
Video broadcasting system: video signal
Fax transmission system: monochromatic image
Communication system among computers: sequence of ASCII symbols or sequence of binary symbols
Information Source Model

Digital source:
It is a source whose output is a sequence of symbols belonging to a finite, discrete alphabet (N possible symbols), emitted at a rate of 1/T symbols per second.

Digital Source → ..., X_0, X_1, ...

A digital source can be modelled as a discrete-time random signal, namely a sequence of random variables.
Information Source Model

Example: binary memoryless source
The source emits symbols {0, 1}, statistically independent of each other and identically distributed, every T seconds.

Binary Memoryless Source → ..., 0, 0, 1, 0, 0, 1, 0, 1, ...

The sequence of RVs X_i is iid (independent and identically distributed): Discrete Memoryless Source (DMS).
Information Source Model

The characterization of the sequence X_i is simple: we only have to assign one parameter

Pr(X_i = 1) = p,   Pr(X_i = 0) = 1 - p

The probability that the source produces an arbitrary block of symbols in the time interval kT is evaluated by exploiting the statistical independence among the symbols.

Example: Pr(010110100) = p^4 (1-p)^5

In general, if m is the number of 1s, one has:

Pr(X_1 ... X_k) = p^m (1-p)^(k-m)

Given the value of p, the source is completely statistically characterized.
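
As a quick check, here is a minimal Python sketch of this block-probability computation (not part of the original lecture); the block string is the one from the example, while the value of p is an arbitrary illustrative choice.

    # Probability of a block emitted by a binary memoryless source:
    # Pr(block) = p^m * (1-p)^(k-m), with m = number of 1s and k = block length.
    def block_probability(block: str, p: float) -> float:
        m = block.count("1")
        k = len(block)
        return p ** m * (1 - p) ** (k - m)

    # Example block from the slide; p = 0.25 is an arbitrary illustrative value.
    print(block_probability("010110100", p=0.25))  # = p^4 * (1-p)^5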
Example of source coding

Let S be a BMS producing the symbols 1 and 0 with p = Pr(1) = 3/4.
If no coding is performed, to store a sequence of 10,000 symbols we need a memory amount of M = 10,000 binary locations (bits).

Let us consider the following binary code, which maps pairs of source symbols to codewords:

pair 00 → c_1 = 000
pair 01 → c_2 = 001
pair 10 → c_3 = 01
pair 11 → c_4 = 1

The codeword lengths are l_1 = 3, l_2 = 3, l_3 = 2, l_4 = 1, and the average length is

E[l] = Σ_{i=1}^{4} p_i l_i = 3·(1/4)(1/4) + 3·(1/4)(3/4) + 2·(3/4)(1/4) + 1·(3/4)(3/4) = 27/16 ≈ 1.69

M ≈ 1.69 × 5,000 = 8,450 binary memory locations (bits)
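
The average length above can be reproduced with a short Python sketch (a check of the slide's numbers, not part of the original lecture); the dictionary simply restates the pair-to-codeword table.

    # Average codeword length of the pair code, with p = Pr(1) = 3/4.
    p = 3 / 4
    code = {"00": "000", "01": "001", "10": "01", "11": "1"}

    def pair_prob(pair: str) -> float:
        # Pairs of iid bits: multiply p for each '1' and (1 - p) for each '0'.
        prob = 1.0
        for bit in pair:
            prob *= p if bit == "1" else 1 - p
        return prob

    avg_len = sum(pair_prob(pair) * len(cw) for pair, cw in code.items())
    print(avg_len)            # 27/16 = 1.6875 bits per pair
    print(avg_len * 5_000)    # ≈ 8,437 bits for 5,000 pairs (the slide rounds E[l] to 1.69, giving 8,450)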
Example of source coding

8,450 < 10,000

We have saved memory in storing our information by representing the most likely pair of binary symbols (namely, 11 in our example) with the shortest codeword (unit length).

Representation of the code by a binary tree:
Rule:
two branches depart from each node;
denote the upper branch by 1 and the lower branch by 0.

Codewords: 000, 001, 01, 1

No codeword is a prefix of any other codeword: prefix code.
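
A small sketch (the function name is mine) that verifies the prefix-free property for this codeword set:

    # A code is a prefix code if no codeword is a proper prefix of another codeword.
    codewords = ["000", "001", "01", "1"]

    def is_prefix_free(words) -> bool:
        return not any(
            a != b and b.startswith(a)   # a would be a proper prefix of b
            for a in words
            for b in words
        )

    print(is_prefix_free(codewords))     # True: the code above is a prefix code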
Example of source coding

How much memory can we save at most?
Shannon's first theorem (the Source Coding Theorem) gives us the answer:
the lowest number of binary symbols needed to represent every source symbol without distortion (distortion-less, or noiseless, coding) is given by the average amount of information carried by a source symbol.
Measure of the information

The information of an event is associated with its uncertainty level, namely with the probability of that event.

Example: the match Napoli-Roma
Possible events:
V = the Napoli team wins
S = the Napoli team loses
P = the Napoli team draws

Pr(V) + Pr(S) + Pr(P) = 1
Measure of the information

Let us assume that we know the three probabilities:

Pr(V) = 0.001,  Pr(S) = 0.99,  Pr(P) = 0.009

Pr(V) < Pr(P) << Pr(S)   →   I(V) > I(P) >> I(S)

The information of an event is a decreasing function of its probability.
Measure of the information of a discrete random variable

The information associated with two independent events A and B must be the sum of the individual informations of A and B, namely

I(AB) = I(A) + I(B)

Let us define the information I(a_i) associated with the symbol a_i:

I(a_i) = log(1/p(a_i)) = -log p(a_i)

This is the self-information of the symbol a_i.
Measure of the information of a discrete random variable

If the logarithm is in base 2, the self-information I(.) is measured in bits (binary units).
If the logarithm is in base e, the self-information I(.) is measured in nats (natural units).

1 nat ≈ 1.443 bits
Measure of the information of a discrete random variable

I(a_i) = log(1/p(a_i)) = -log p(a_i)

Such a definition of I(.) assures two properties:
1. I(.) is decreasing in p(a_i)
2. additivity for independent symbols (thanks to the logarithm!):

I(a_i, b_j) = -log p(a_i, b_j) = -log[p(a_i) p(b_j)] = I(a_i) + I(b_j)

Moreover:
3. I(.) depends only on the probability of a_i, namely I[p(a_i)]
4. I(.) is a continuous function of p(a_i) = p_i
Measure of the information of a discrete random variable

The correspondence a_i → I(p_i) defines a discrete RV X.

H(X) = E[I(X)] = Σ_{i=1}^{N} p_i I(a_i) = Σ_{i=1}^{N} p_i log(1/p_i)

This is the average information carried by the observation of a single output of the source: the entropy of the random variable X.
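
A minimal Python sketch of this definition, assuming base-2 logarithms (bits); the pmf below is an arbitrary example, not taken from the lecture:

    import math

    def entropy(pmf) -> float:
        # H(X) = sum_i p_i * log2(1/p_i), in bits; terms with p_i = 0 contribute nothing.
        return sum(p * math.log2(1 / p) for p in pmf if p > 0)

    print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits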
Measure of the information of a discrete random variable

Example: binary source with parameter p

H_b(X) = Σ_{i=1}^{2} p_i log(1/p_i) = -p log p - (1-p) log(1-p) = H_b(p)

[Plot of H_b(X) versus p: H_b rises from 0 at p = 0, peaks at 1 bit for p = 0.5, and returns to 0 at p = 1.]

The uncertainty (the information) is maximum when the symbols are equiprobable: p = 0.5 → H_b = 1.
If p = 0 or p = 1 there is no uncertainty: H_b(X) = 0.
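
A short Python sketch of the binary entropy function. The value at p = 3/4 ties back to the earlier coding example: the pair code spends 1.6875/2 ≈ 0.84 bits per source symbol, still slightly above H_b(3/4) ≈ 0.81.

    import math

    def binary_entropy(p: float) -> float:
        # H_b(p) = -p*log2(p) - (1-p)*log2(1-p), with H_b(0) = H_b(1) = 0.
        if p in (0.0, 1.0):
            return 0.0
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    print(binary_entropy(0.5))   # 1.0 bit: maximum uncertainty (equiprobable symbols)
    print(binary_entropy(0.75))  # ≈ 0.811 bits: the BMS with p = 3/4 of the coding example
    print(binary_entropy(1.0))   # 0.0 bits: no uncertainty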
Entropy of a pair of RVs

Example:
Let us consider a joint trial whose sample space is S = S_1 × S_2, with S_1 = {V, P, S} and S_2 = {G, NG}, where
G  = Totti is in the team
NG = Totti is not in the team

We can define a RV over each single trial, characterized by the sample space S_i:
S_1 → X
S_2 → Y

H(X) measures the average information carried by the trial S_1.
H(Y) measures the average information carried by the trial S_2.
Entropy of a pair of RVs

How much is the average amount of information carried by S, namely by the pair (X, Y) of RVs?

Let us define the information carried by the pair of results x_i and y_j:

I(x_i, y_j) = log(1/p(x_i, y_j)) = -log p(x_i, y_j)

where p(x_i, y_j) = Pr(X = x_i, Y = y_j).

Is I(X, Y) equal to I(X) + I(Y)? In general, NO!!!
Entropy of a pair of RVs

(x_i, y_j) → I(x_i, y_j)

Therefore I(X, Y) is a RV, function of the RVs X and Y.

H(X, Y) = E[I(X, Y)] = -Σ_{i,j} p(x_i, y_j) log p(x_i, y_j)

This is the joint entropy of the RVs X and Y.
Entropy of a vector of RVs

H(X_1, X_2, ..., X_n) = -Σ_{x_1, x_2, ..., x_n} p(x_1, x_2, ..., x_n) log p(x_1, x_2, ..., x_n)

This is the average information carried by the block of n RVs.

Property of additivity:
if the n RVs are statistically independent, it can easily be shown that

H(X_1, X_2, ..., X_n) = Σ_{i=1}^{n} H(X_i)
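A quick numerical check of the additivity property for n = 2 independent RVs; the two pmfs below are arbitrary example values:

    import math

    def entropy(pmf) -> float:
        return sum(p * math.log2(1 / p) for p in pmf if p > 0)

    # Two independent RVs with arbitrary pmfs; their joint pmf is the product p(x) * p(y).
    px = [0.5, 0.25, 0.25]
    py = [0.7, 0.3]
    joint = [p * q for p in px for q in py]

    print(entropy(joint))              # H(X1, X2)
    print(entropy(px) + entropy(py))   # H(X1) + H(X2): same value
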
Conditional Entropy

I(x_i | y_j) = -log p(x_i | y_j)

where p(x_i | y_j) is the probability that x_i occurs given (conditionally on) the occurrence of the event y_j.

H(X|Y) = -Σ_{x_i, y_j} p(x_i, y_j) log p(x_i | y_j)
Conditional Entropy

More generally:

H(X_n | X_1, X_2, ..., X_{n-1}) = -Σ_{x_1, x_2, ..., x_n} p(x_1, x_2, ..., x_n) log p(x_n | x_1, x_2, ..., x_{n-1})
Conditional Entropy

H(X, Y) = H(X) + H(Y|X)

In fact:

H(X, Y) = -Σ_{x,y} p(x, y) log p(x, y)
        = -Σ_{x,y} p(x, y) log [p(x) p(y|x)]               (Bayes relation)
        = -Σ_{x,y} p(x, y) [log p(x) + log p(y|x)]          (property of the log)
        = -Σ_{x,y} p(x, y) log p(x) - Σ_{x,y} p(x, y) log p(y|x)
        = H(X) + H(Y|X)

where we used Σ_y p(x, y) = p(x).
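The chain rule can also be checked numerically on a small joint pmf; the probabilities below are arbitrary example values, chosen so that X and Y are dependent:

    import math

    # Arbitrary example joint pmf p(x, y) for x, y in {0, 1}; X and Y are dependent.
    p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

    def marginal_x(p):
        px = {}
        for (x, _), v in p.items():
            px[x] = px.get(x, 0.0) + v      # p(x) = sum_y p(x, y)
        return px

    def H_joint(p):
        return -sum(v * math.log2(v) for v in p.values() if v > 0)

    def H_X(p):
        return -sum(v * math.log2(v) for v in marginal_x(p).values() if v > 0)

    def H_Y_given_X(p):
        px = marginal_x(p)
        # H(Y|X) = -sum_{x,y} p(x, y) * log p(y|x), with p(y|x) = p(x, y) / p(x)
        return -sum(v * math.log2(v / px[x]) for (x, _), v in p.items() if v > 0)

    print(H_joint(p_xy))                   # H(X, Y)
    print(H_X(p_xy) + H_Y_given_X(p_xy))   # H(X) + H(Y|X): same value
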
Conditional Entropy

More generally:

H(X_1, X_2, ..., X_n) = H(X_1) + H(X_2 | X_1) + H(X_3 | X_1, X_2) + ... + H(X_n | X_1, X_2, ..., X_{n-1})
                      = H(X_1) + Σ_{i=1}^{n-1} H(X_{i+1} | X_1, ..., X_i)

Chain rule for the entropy
Conditional Entropy

If X and Y are statistically independent:

H(X|Y) = H(X)
H(Y|X) = H(Y)

In fact, in such a case p(x|y) = p(x, y)/p(y) = p(x)p(y)/p(y) = p(x), so

H(X|Y) = -Σ_{x,y} p(x, y) log p(x|y)
       = -Σ_{x,y} p(x) p(y) log p(x)
       = -Σ_x p(x) log p(x) · Σ_y p(y)     (the second sum equals 1)
       = H(X)

Conditioning has no effect.
Properties of the Entropy

H(X_1, X_2, ..., X_n) is a function only of p(x_1, x_2, ..., x_n).

If the RVs X_i are jointly statistically independent, H(X_1, X_2, ..., X_n) is a function of the marginals p(x_i) only.

The entropy is non-negative:

H(X_1, X_2, ..., X_n) ≥ 0

In fact I(x_1, x_2, ..., x_n) ≥ 0, namely it is a non-negative RV; the entropy is the expectation of I(x_1, x_2, ..., x_n), hence it is non-negative as well.
Information rate of a discrete source

H = lim_{K→∞} H(X_1, X_2, ..., X_K) / K
  = lim_{K→∞} (average information carried by a block of K symbols) / K
  = information carried by an arbitrary source symbol

N.B.: the limit exists if the source is stationary.
Information rate of a discrete memoryless source (DMS)

H(X_1, X_2, ..., X_K) = H(X_1) + H(X_2) + ... + H(X_K) = K H(X)

(the first equality follows from statistical independence, the second from stationarity)

H = lim_{K→∞} H(X_1, X_2, ..., X_K) / K = lim_{K→∞} K H(X) / K = H(X)

Discrete Memoryless Source → X_i

X_i is a digital random signal.
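
As a small check of this result, the sketch below builds the K-fold product pmf of a DMS for an arbitrary example pmf and a small block length K, and confirms that H(X_1, ..., X_K)/K equals H(X):

    import math
    from itertools import product

    def entropy(pmf) -> float:
        return sum(p * math.log2(1 / p) for p in pmf if p > 0)

    pmf = [0.5, 0.3, 0.2]   # arbitrary example single-symbol pmf
    K = 3                   # small block length

    # For a DMS the block probability is the product of the single-symbol probabilities.
    block_pmf = [math.prod(ps) for ps in product(pmf, repeat=K)]

    print(entropy(block_pmf) / K)   # H(X_1, ..., X_K) / K
    print(entropy(pmf))             # H(X): same value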
Information rate of a discrete memoryless source (DMS)

For a DMS (Discrete Memoryless Source) the rate H, i.e. the average number of bits carried per source symbol, is equal to the entropy of an arbitrary symbol X:

H = H(X)

H(X) is an average parameter of the RV X.
H is an average parameter of a random signal (a digital signal).
Measure of the information of a DMS

Example: a source with one-sided bandwidth of 4 kHz is sampled at the Nyquist rate. Assuming that the sequence of samples can be roughly modelled as a discrete memoryless source with source alphabet

A = {-2, -1, 0, 1, 2}  with probabilities  {1/2, 1/4, 1/8, 1/16, 1/16}

evaluate the information rate of the DMS.
Measure of the information of a DMS

H = H(X) = Σ_{i=1}^{5} p(x_i) log(1/p(x_i))
  = (1/2) log 2 + (1/4) log 4 + (1/8) log 8 + (1/16) log 16 + (1/16) log 16
  = 15/8 bits/sample

The Nyquist rate is 2 × 4,000 = 8,000 samples/sec.
Since each sample carries 15/8 bits, the source rate in bits/sec is:

(15/8) × 8,000 = 15,000 bits/sec
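A quick numerical check of this example in Python:

    import math

    # Source alphabet probabilities of the example DMS.
    pmf = [1/2, 1/4, 1/8, 1/16, 1/16]

    H = sum(p * math.log2(1 / p) for p in pmf)   # bits/sample
    nyquist_rate = 2 * 4_000                     # samples/sec for a 4 kHz one-sided bandwidth

    print(H)                  # 1.875 = 15/8 bits/sample
    print(H * nyquist_rate)   # 15,000 bits/sec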