
Digital Communications 2008-09
L. Paura, Lecture 2
Data Compression

Summary:
Information source model
Measure of information: entropy of a discrete random variable
Entropy of an information source
Conditional entropy
Source Coding Theorem (Shannon's first theorem)
Huffman coding
Lempel-Ziv coding

Reference: Proakis, Salehi (2nd ed.), Chapter 4
Information Source Model

Examples of information sources:
Audio broadcasting system: speech signal
Video broadcasting system: video signal
Fax transmission system: monochromatic image
Communication system among computers: sequence of ASCII symbols or sequence of binary symbols
Information Source Model

Digital source:
It is a source whose output is a sequence of symbols belonging to a finite, discrete alphabet (N possible symbols), emitted at a rate of 1/T symbols per second.

Digital Source → ..., X_0, X_1, ...

A digital source can be modelled as a discrete-time random signal, namely a sequence of random variables.
Information Source Model

Example: binary memoryless source
The source emits symbols {0, 1}, statistically independent of each other and identically distributed, every T seconds.

Binary Memoryless Source → ..., 0, 0, 1, 0, 0, 1, 0, 1, ...

The sequence of RVs X_i is iid (independent and identically distributed): Discrete Memoryless Source (DMS).
Information Source Model

The characterization of the sequence X_i is simple: we only have to assign one parameter

Pr(X_i = 1) = p,   Pr(X_i = 0) = 1 - p

The probability that the source produces an arbitrary block of symbols in the time interval kT is evaluated by exploiting the statistical independence among the symbols.

Example: Pr(010110100) = p^4 (1-p)^5

In general, if m is the number of 1s, one has:

Pr(X_1 ... X_k) = p^m (1-p)^(k-m)

Given the value of p, the source is completely statistically characterized.
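
As a quick check, here is a minimal Python sketch of this block-probability computation (not part of the original lecture); the block string is the one from the example, while the value of p is an arbitrary illustrative choice.

    # Probability of a block emitted by a binary memoryless source:
    # Pr(block) = p^m * (1-p)^(k-m), with m = number of 1s and k = block length.
    def block_probability(block: str, p: float) -> float:
        m = block.count("1")
        k = len(block)
        return p ** m * (1 - p) ** (k - m)

    # Example block from the slide; p = 0.25 is an arbitrary illustrative value.
    print(block_probability("010110100", p=0.25))  # = p^4 * (1-p)^5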
Example of source coding

Let S be a BMS producing the symbols 1 and 0 with p = Pr(1) = 3/4.
If no coding is performed, to store a sequence of 10,000 symbols we need a memory amount of M = 10,000 binary locations (bits).

Let us consider the following binary code, which maps pairs of source symbols to codewords:

pair 00 → c_1 = 000
pair 01 → c_2 = 001
pair 10 → c_3 = 01
pair 11 → c_4 = 1

The codeword lengths are l_1 = 3, l_2 = 3, l_3 = 2, l_4 = 1, and the average length is

E[l] = Σ_{i=1}^{4} p_i l_i = 3·(1/4)(1/4) + 3·(1/4)(3/4) + 2·(3/4)(1/4) + 1·(3/4)(3/4) = 27/16 ≈ 1.69

M ≈ 1.69 × 5,000 = 8,450 binary memory locations (bits)
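
The average length above can be reproduced with a short Python sketch (a check of the slide's numbers, not part of the original lecture); the dictionary simply restates the pair-to-codeword table.

    # Average codeword length of the pair code, with p = Pr(1) = 3/4.
    p = 3 / 4
    code = {"00": "000", "01": "001", "10": "01", "11": "1"}

    def pair_prob(pair: str) -> float:
        # Pairs of iid bits: multiply p for each '1' and (1 - p) for each '0'.
        prob = 1.0
        for bit in pair:
            prob *= p if bit == "1" else 1 - p
        return prob

    avg_len = sum(pair_prob(pair) * len(cw) for pair, cw in code.items())
    print(avg_len)            # 27/16 = 1.6875 bits per pair
    print(avg_len * 5_000)    # ≈ 8,437 bits for 5,000 pairs (the slide rounds E[l] to 1.69, giving 8,450)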
Example of source coding

8,450 < 10,000

We have saved memory in storing our information by representing the most likely pair of binary symbols (namely, 11 in our example) with the shortest codeword (unit length).

Representation of the code by a binary tree:
Rule:
two branches depart from each node;
denote the upper branch by 1 and the lower branch by 0.

Codewords: 000, 001, 01, 1

No codeword is a prefix of any other codeword: prefix code.
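
A small sketch (the function name is mine) that verifies the prefix-free property for this codeword set:

    # A code is a prefix code if no codeword is a proper prefix of another codeword.
    codewords = ["000", "001", "01", "1"]

    def is_prefix_free(words) -> bool:
        return not any(
            a != b and b.startswith(a)   # a would be a proper prefix of b
            for a in words
            for b in words
        )

    print(is_prefix_free(codewords))     # True: the code above is a prefix code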
Example of source coding

How much memory can we save at most?
Shannon's first theorem (the Source Coding Theorem) gives us the answer:
the lowest number of binary symbols needed to represent every source symbol without distortion (distortion-less, or noiseless, coding) is given by the average amount of information carried by a source symbol.
Measure of the information

The information of an event is associated with its uncertainty level, namely with the probability of that event.

Example: the match Napoli-Roma
Possible events:
V = the Napoli team wins
S = the Napoli team loses
P = the Napoli team draws

Pr(V) + Pr(S) + Pr(P) = 1
Measure of the information

Let us assume that we know the three probabilities:

Pr(V) = 0.001,  Pr(S) = 0.99,  Pr(P) = 0.009

Pr(V) < Pr(P) << Pr(S)   →   I(V) > I(P) >> I(S)

The information of an event is a decreasing function of its probability.
Measure of the information of a discrete random variable

The information associated with two independent events A and B must be the sum of the individual informations of A and B, namely

I(AB) = I(A) + I(B)

Let us define the information I(a_i) associated with the symbol a_i:

I(a_i) = log(1/p(a_i)) = -log p(a_i)

This is the self-information of the symbol a_i.
Measure of the information of a discrete random variable

If the logarithm is in base 2, the self-information I(.) is measured in bits (binary units).
If the logarithm is in base e, the self-information I(.) is measured in nats (natural units).

1 nat ≈ 1.443 bits
Measure of the information of a discrete random variable

I(a_i) = log(1/p(a_i)) = -log p(a_i)

Such a definition of I(.) assures two properties:
1. I(.) is decreasing in p(a_i)
2. additivity for independent symbols (thanks to the logarithm!):

I(a_i, b_j) = -log p(a_i, b_j) = -log[p(a_i) p(b_j)] = I(a_i) + I(b_j)

Moreover:
3. I(.) depends only on the probability of a_i, namely I[p(a_i)]
4. I(.) is a continuous function of p(a_i) = p_i
Measure of the information of a discrete random variable

The correspondence a_i → I(p_i) defines a discrete RV X.

H(X) = E[I(X)] = Σ_{i=1}^{N} p_i I(a_i) = Σ_{i=1}^{N} p_i log(1/p_i)

This is the average information carried by the observation of a single output of the source: the entropy of the random variable X.
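
A minimal Python sketch of this definition, assuming base-2 logarithms (bits); the pmf below is an arbitrary example, not taken from the lecture:

    import math

    def entropy(pmf) -> float:
        # H(X) = sum_i p_i * log2(1/p_i), in bits; terms with p_i = 0 contribute nothing.
        return sum(p * math.log2(1 / p) for p in pmf if p > 0)

    print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits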
Measure of the information of a discrete random variable

Example: binary source with parameter p

H_b(X) = Σ_{i=1}^{2} p_i log(1/p_i) = -p log p - (1-p) log(1-p) = H_b(p)

[Plot of H_b(X) versus p: H_b rises from 0 at p = 0, peaks at 1 bit for p = 0.5, and returns to 0 at p = 1.]

The uncertainty (the information) is maximum when the symbols are equiprobable: p = 0.5 → H_b = 1.
If p = 0 or p = 1 there is no uncertainty: H_b(X) = 0.
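
A short Python sketch of the binary entropy function. The value at p = 3/4 ties back to the earlier coding example: the pair code spends 1.6875/2 ≈ 0.84 bits per source symbol, still slightly above H_b(3/4) ≈ 0.81.

    import math

    def binary_entropy(p: float) -> float:
        # H_b(p) = -p*log2(p) - (1-p)*log2(1-p), with H_b(0) = H_b(1) = 0.
        if p in (0.0, 1.0):
            return 0.0
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    print(binary_entropy(0.5))   # 1.0 bit: maximum uncertainty (equiprobable symbols)
    print(binary_entropy(0.75))  # ≈ 0.811 bits: the BMS with p = 3/4 of the coding example
    print(binary_entropy(1.0))   # 0.0 bits: no uncertainty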
Entropy of a pair of RVs

Example:
Let us consider a joint trial whose sample space is S = S_1 × S_2, with S_1 = {V, P, S} and S_2 = {G, NG}, where
G  = Totti is in the team
NG = Totti is not in the team

We can define a RV over each single trial, characterized by the sample space S_i:
S_1 → X
S_2 → Y

H(X) measures the average information carried by the trial S_1.
H(Y) measures the average information carried by the trial S_2.
Entropy of a pair of RVs

How much is the average amount of information carried by S, namely by the pair (X, Y) of RVs?

Let us define the information carried by the pair of results x_i and y_j:

I(x_i, y_j) = log(1/p(x_i, y_j)) = -log p(x_i, y_j)

where p(x_i, y_j) = Pr(X = x_i, Y = y_j).

Is I(X, Y) equal to I(X) + I(Y)? In general, NO!!!
Entropy of a pair of RVs

(x_i, y_j) → I(x_i, y_j)

Therefore I(X, Y) is a RV, function of the RVs X and Y.

H(X, Y) = E[I(X, Y)] = -Σ_{i,j} p(x_i, y_j) log p(x_i, y_j)

This is the joint entropy of the RVs X and Y.
Entropy of a vector of RVs

H(X_1, X_2, ..., X_n) = -Σ_{x_1, x_2, ..., x_n} p(x_1, x_2, ..., x_n) log p(x_1, x_2, ..., x_n)

This is the average information carried by the block of n RVs.

Property of additivity:
if the n RVs are statistically independent, it can easily be shown that

H(X_1, X_2, ..., X_n) = Σ_{i=1}^{n} H(X_i)
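A quick numerical check of the additivity property for n = 2 independent RVs; the two pmfs below are arbitrary example values:

    import math

    def entropy(pmf) -> float:
        return sum(p * math.log2(1 / p) for p in pmf if p > 0)

    # Two independent RVs with arbitrary pmfs; their joint pmf is the product p(x) * p(y).
    px = [0.5, 0.25, 0.25]
    py = [0.7, 0.3]
    joint = [p * q for p in px for q in py]

    print(entropy(joint))              # H(X1, X2)
    print(entropy(px) + entropy(py))   # H(X1) + H(X2): same value
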
Conditional Entropy

I(x_i | y_j) = -log p(x_i | y_j)

where p(x_i | y_j) is the probability that x_i occurs given (conditionally on) the occurrence of the event y_j.

H(X|Y) = -Σ_{x_i, y_j} p(x_i, y_j) log p(x_i | y_j)
Conditional Entropy

More generally:

H(X_n | X_1, X_2, ..., X_{n-1}) = -Σ_{x_1, x_2, ..., x_n} p(x_1, x_2, ..., x_n) log p(x_n | x_1, x_2, ..., x_{n-1})
Conditional Entropy

H(X, Y) = H(X) + H(Y|X)

In fact:

H(X, Y) = -Σ_{x,y} p(x, y) log p(x, y)
        = -Σ_{x,y} p(x, y) log [p(x) p(y|x)]               (Bayes relation)
        = -Σ_{x,y} p(x, y) [log p(x) + log p(y|x)]          (property of the log)
        = -Σ_{x,y} p(x, y) log p(x) - Σ_{x,y} p(x, y) log p(y|x)
        = H(X) + H(Y|X)

where we used Σ_y p(x, y) = p(x).
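The chain rule can also be checked numerically on a small joint pmf; the probabilities below are arbitrary example values, chosen so that X and Y are dependent:

    import math

    # Arbitrary example joint pmf p(x, y) for x, y in {0, 1}; X and Y are dependent.
    p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

    def marginal_x(p):
        px = {}
        for (x, _), v in p.items():
            px[x] = px.get(x, 0.0) + v      # p(x) = sum_y p(x, y)
        return px

    def H_joint(p):
        return -sum(v * math.log2(v) for v in p.values() if v > 0)

    def H_X(p):
        return -sum(v * math.log2(v) for v in marginal_x(p).values() if v > 0)

    def H_Y_given_X(p):
        px = marginal_x(p)
        # H(Y|X) = -sum_{x,y} p(x, y) * log p(y|x), with p(y|x) = p(x, y) / p(x)
        return -sum(v * math.log2(v / px[x]) for (x, _), v in p.items() if v > 0)

    print(H_joint(p_xy))                   # H(X, Y)
    print(H_X(p_xy) + H_Y_given_X(p_xy))   # H(X) + H(Y|X): same value
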
Conditional Entropy

More generally:

H(X_1, X_2, ..., X_n) = H(X_1) + H(X_2 | X_1) + H(X_3 | X_1, X_2) + ... + H(X_n | X_1, X_2, ..., X_{n-1})
                      = H(X_1) + Σ_{i=1}^{n-1} H(X_{i+1} | X_1, ..., X_i)

Chain rule for the entropy
Conditional Entropy

If X and Y are statistically independent:

H(X|Y) = H(X)
H(Y|X) = H(Y)

In fact, in such a case p(x|y) = p(x, y)/p(y) = p(x)p(y)/p(y) = p(x), so

H(X|Y) = -Σ_{x,y} p(x, y) log p(x|y)
       = -Σ_{x,y} p(x) p(y) log p(x)
       = -Σ_x p(x) log p(x) · Σ_y p(y)     (the second sum equals 1)
       = H(X)

Conditioning has no effect.
Properties of the Entropy

H(X_1, X_2, ..., X_n) is a function only of p(x_1, x_2, ..., x_n).

If the RVs X_i are jointly statistically independent, H(X_1, X_2, ..., X_n) is a function of the marginals p(x_i) only.

The entropy is non-negative:

H(X_1, X_2, ..., X_n) ≥ 0

In fact I(x_1, x_2, ..., x_n) ≥ 0, namely it is a non-negative RV; the entropy is the expectation of I(x_1, x_2, ..., x_n), hence it is non-negative as well.
Information rate of a discrete source

H = lim_{K→∞} H(X_1, X_2, ..., X_K) / K
  = lim_{K→∞} (average information carried by a block of K symbols) / K
  = information carried by an arbitrary source symbol

N.B.: the limit exists if the source is stationary.
Information rate of a discrete memoryless source (DMS)

H(X_1, X_2, ..., X_K) = H(X_1) + H(X_2) + ... + H(X_K) = K H(X)

(the first equality follows from statistical independence, the second from stationarity)

H = lim_{K→∞} H(X_1, X_2, ..., X_K) / K = lim_{K→∞} K H(X) / K = H(X)

Discrete Memoryless Source → X_i

X_i is a digital random signal.
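
As a small check of this result, the sketch below builds the K-fold product pmf of a DMS for an arbitrary example pmf and a small block length K, and confirms that H(X_1, ..., X_K)/K equals H(X):

    import math
    from itertools import product

    def entropy(pmf) -> float:
        return sum(p * math.log2(1 / p) for p in pmf if p > 0)

    pmf = [0.5, 0.3, 0.2]   # arbitrary example single-symbol pmf
    K = 3                   # small block length

    # For a DMS the block probability is the product of the single-symbol probabilities.
    block_pmf = [math.prod(ps) for ps in product(pmf, repeat=K)]

    print(entropy(block_pmf) / K)   # H(X_1, ..., X_K) / K
    print(entropy(pmf))             # H(X): same value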
Information rate of a discrete memoryless source (DMS)

For a DMS (Discrete Memoryless Source) the rate H, i.e. the average number of bits carried per source symbol, is equal to the entropy of an arbitrary symbol X:

H = H(X)

H(X) is an average parameter of the RV X.
H is an average parameter of a random signal (a digital signal).
Measure of the information of a DMS

Example: a source with one-sided bandwidth of 4 kHz is sampled at the Nyquist rate. Assuming that the sequence of samples can be roughly modelled as a discrete memoryless source with source alphabet

A = {-2, -1, 0, 1, 2}  with probabilities  {1/2, 1/4, 1/8, 1/16, 1/16}

evaluate the information rate of the DMS.
Measure of the information of a DMS

H = H(X) = Σ_{i=1}^{5} p(x_i) log(1/p(x_i))
  = (1/2) log 2 + (1/4) log 4 + (1/8) log 8 + (1/16) log 16 + (1/16) log 16
  = 15/8 bits/sample

The Nyquist rate is 2 × 4,000 = 8,000 samples/sec.
Since each sample carries 15/8 bits, the source rate in bits/sec is:

(15/8) × 8,000 = 15,000 bits/sec
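A quick numerical check of this example in Python:

    import math

    # Source alphabet probabilities of the example DMS.
    pmf = [1/2, 1/4, 1/8, 1/16, 1/16]

    H = sum(p * math.log2(1 / p) for p in pmf)   # bits/sample
    nyquist_rate = 2 * 4_000                     # samples/sec for a 4 kHz one-sided bandwidth

    print(H)                  # 1.875 = 15/8 bits/sample
    print(H * nyquist_rate)   # 15,000 bits/sec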