
ECNG 3001 Communication Systems II

Topic 2: The Information Source

- Review of digital information
- Modeling information sources
- Information content of a source
- Fundamental limits on representation of information

Review: Analog, Digital, Binary

Review: Bits and Symbols

[Figure: voltage vs. time waveforms of 4-level symbols, amplitude levels V1-V4, one symbol period each -- (a) 4-level symbol for logic 00; (b) 4-level symbol for logic 01]
Examples of Symbols

- 1 bit per symbol
- 2 bits per symbol: 00, 01, 10, 11

What are the bit and symbol rates if 2 symbols are transmitted per second in each of the above cases?

Bit and Symbol Rates

- Bit: binary digit, 1 or 0
  - E.g. 1011010 is 7 bits long
  - Bit rate: bits per second (bps)
- Symbol: smallest unit of data transmitted at one time
  - Symbol rate (or baud rate): symbols per second
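The relationship between the two rates can be sketched in a few lines; the function name below is illustrative, not part of any library:

```python
import math

def bit_rate(symbol_rate, levels):
    """Bit rate (bps) for a given symbol rate (baud) and number of signal levels."""
    bits_per_symbol = math.log2(levels)  # each M-level symbol carries log2(M) bits
    return symbol_rate * bits_per_symbol

# At 2 symbols per second (the question on the earlier slide):
print(bit_rate(2, 2))  # binary symbols: 1 bit/symbol -> 2.0 bps
print(bit_rate(2, 4))  # 4-level symbols: 2 bits/symbol -> 4.0 bps
```

So the symbol rate is 2 baud in both cases, while the bit rate doubles when 4-level symbols are used.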

Review: Symbols and Signals

[Figure: mapping of the 2-bit data values 00, 01, 10, 11 onto distinct signal levels (amplitude E)]

Signal, Bit, Symbol, Message

Consider the message "Hello":

Digital Comms System Model

Walk through the sets of bits, symbols, signals, and messages at the:
- Transmitter
- Receiver

Information Sources

- An information source produces a time-varying, random output
- The properties of its random output depend on the nature of the source (e.g. its bandwidth, amplitude, statistics, etc.)
- A mathematical model is required to measure the information content of a source
- The simplest model is the Discrete Memoryless Source (DMS): a discrete-time, discrete-amplitude random process in which all (discrete) random outputs are generated independently and with the same probability distribution

The Discrete Memoryless Source

Information source output sequence: ..., X-2, X-1, X0, X1, X2, ...

- The output of a DMS is a discrete random variable, X, whose value does not depend on those of previous outputs
- Let A = {a1, a2, ..., aN} denote the set of source output symbols, i.e. the set from which the random variable X takes its values
- Let pi = P(X = ai) for i = 1, 2, ..., N denote the probability distribution of the random variable, i.e. the probability pi that the random variable X will take on some value ai out of the set A
- The DMS is fully described by the set A, called its alphabet, and the probabilities {pi}, i = 1, ..., N
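A DMS can be sketched in a few lines of code; the alphabet and probabilities below are made-up illustrative values:

```python
import random

alphabet = ["a1", "a2", "a3"]   # hypothetical alphabet A (N = 3)
probs = [0.5, 0.3, 0.2]         # p_i = P(X = a_i); must sum to 1

def dms_outputs(n):
    """Draw n source outputs, each independent of all previous ones (memoryless)."""
    return random.choices(alphabet, weights=probs, k=n)

print(dms_outputs(10))
```

Because each draw uses the same fixed distribution and ignores earlier outputs, the sequence has exactly the independence property that defines the DMS.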

DMS Example

An information source is described by the alphabet A = {0, 1} and probabilities P(Xi = 1) = 1 - P(Xi = 0) = p.

This discrete memoryless source:
- Is a binary source, as it generates sequences of 1s and 0s
- Produces, in each regular time interval, a value of either 1 or 0 with probabilities p1 = p and p0 = 1 - p
- In the special case in which p = 0.5, the source is called a binary symmetric source, or BSS

Developing a Measure of Info

- Consider an information source that describes weather conditions in Trinidad
- The set, A, of possible source outputs is {very hot (45 C), hot (35 C), warm (25 C), cool (15 C), cold (5 C), very cold (-5 C)}
- Do all possible source outputs yield the same amount of information?
- Would an output of "very cold" reveal more information than one of "hot"?

What is More Info Content (1)?

- Yes: since it is already known that the climate of Trinidad is generally hot, an indication of "hot" would not reveal as much information as one of "very cold"
- Intuitively, the amount of information conveyed by a source output decreases with the probability of occurrence of that source output
- Assuming a discrete source, less probable source outputs convey more information

What is More Info Content (2)?

- Now consider the case in which the info from each source output, aj, comprises 2 independent parts (e.g. temperature, aj1, and air pollution level, aj2)
- Revealing temperature information does not provide any information about pollution, and vice versa: they are independent of each other
- Intuitively, the amount of information provided by revealing the combined source output is the sum of the information conveyed by each of its components

Impact of Probability Perturbations?

- Now consider the case in which the likelihood of experiencing a very cold temperature changes a little
- The info content is not drastically altered (a very cold temperature is still highly unlikely)
- Intuitively, the measure of information content is a continuous function of the probability of occurrence of a given source output (no discontinuities are present in the function)

Information Content of an Output

From intuition, then, we conclude that:
- The information content of a particular output aj depends only on the probability, pj, of aj, and not on its value (j denoting the jth source output). This is known as the self-information, denoted by I(pj)
- I(pj) is a decreasing function of its argument
- If pj = pj1 * pj2, then I(pj) = I(pj1) + I(pj2)
- I(pj) is a continuous function of pj

Modeling Information Content

- The only function that satisfies the properties of information content (prev. slide) is the logarithm, i.e. I(p) = -log(p) (the negative sign reflects the inverse relationship)
- The base of the log determines the units of information, e.g. for base 2, information content is expressed in bits per source symbol or bits per sample
- Thus self-information can be defined as I(pi) = -log2(pi) bits per source output
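The definition, and the additivity property from the previous slide, can be checked directly (a minimal sketch):

```python
import math

def self_information(p):
    """Self-information I(p) = -log2(p) in bits, for an output of probability p."""
    return -math.log2(p)

# Less probable outputs convey more information
print(self_information(0.5))    # 1.0 bit
print(self_information(0.125))  # 3.0 bits

# Independent parts add: I(p1 * p2) = I(p1) + I(p2)
print(self_information(0.5 * 0.25) == self_information(0.5) + self_information(0.25))  # True
```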

Info Content (Entropy) of a Source

The information content of a DMS, also known as its entropy H(X), is the weighted sum of the self-information of all possible source outputs:

H(X) = Σ_{i=1}^{N} pi I(pi)

The entropy of a discrete random variable X is:

H(X) = -Σ_{i=1}^{N} pi log2(pi) = Σ_{i=1}^{N} pi log2(1/pi)  bits/symbol
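A direct implementation of the entropy sum (a sketch; the usual convention 0 log 0 = 0 handles zero-probability symbols):

```python
import math

def entropy(probs):
    """H(X) = sum(-p_i * log2(p_i)) in bits/symbol; p = 0 terms contribute 0."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0 bits/symbol (equiprobable)
print(entropy([1.0]))                     # 0.0 (a certain output carries no info)
```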

Binary Entropy

Consider a binary memoryless source:
- Let p = probability of a 0 occurring
- And 1 - p = probability of a 1 occurring

The binary entropy function is:

H(X) = -p log2(p) - (1 - p) log2(1 - p)

Observe that H(X) = 0 if p = 0 or p = 1

Binary Entropy Function

[Figure: plot of H(X) = -p log2(p) - (1 - p) log2(1 - p) versus p; H(X) = 0 at p = 0 and p = 1, with the maximum of 1 bit at p = 0.5]
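The shape of the curve can be checked numerically (a sketch; the endpoint convention H(0) = H(1) = 0 is made explicit):

```python
import math

def binary_entropy(p):
    """H(p) = -p log2(p) - (1-p) log2(1-p); H(0) = H(1) = 0 by convention."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(binary_entropy(0.0))  # 0.0 -- no uncertainty
print(binary_entropy(0.5))  # 1.0 -- maximum uncertainty
print(binary_entropy(0.9))  # ~0.47 -- skewed source, below the maximum
```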

Max Entropy of a DMS

- H(X) is maximized when the symbols are equiprobable, i.e. for a BDMS, when p = 1 - p = 0.5
- For an alphabet of any size N:

H_max = -Σ_{i=1}^{N} (1/N) log2(1/N) = Σ_{i=1}^{N} (1/N) log2(N)

H_max = log2(N) bits/symbol
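This limit is easy to verify numerically (a sketch):

```python
import math

def entropy(probs):
    """H(X) = sum(-p log2 p) in bits/symbol."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

# Equiprobable symbols achieve H_max = log2(N) ...
for N in (2, 4, 8):
    print(N, entropy([1 / N] * N), math.log2(N))  # the two values agree

# ... and any unequal distribution falls below it
print(entropy([0.7, 0.1, 0.1, 0.1]) < math.log2(4))  # True
```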

Example: Equi-probable Outputs

Consider a keypad comprising only A and B keys, which are struck with equal probability. The keypad generates messages of various lengths:

keys/msg | Possible messages
2        | AA, AB, BA, BB
3        | AAA, AAB, ABA, ABB, BAA, BAB, BBA, BBB

- Is this an example of a Discrete Memoryless Source? Why so?
- What is the alphabet of this source?
- For each message size, what is the probability of guessing a message?
- How does the entropy of the source change, if at all:
  - with msg length?
  - if the alphabet were larger?
  - if the keys were struck with unequal probability?
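The questions above can be explored numerically; this sketch assumes the two keys are struck independently with equal probability:

```python
import math

p_key = 0.5                                  # P(A) = P(B)
h_per_key = -2 * (p_key * math.log2(p_key))  # entropy per key struck: 1 bit

for keys_per_msg in (2, 3):
    n_messages = 2 ** keys_per_msg           # 4 two-key msgs, 8 three-key msgs
    p_guess = 1 / n_messages                 # chance of guessing a whole message
    total_bits = keys_per_msg * h_per_key    # info grows with message length ...
    print(keys_per_msg, n_messages, p_guess, total_bits)
    # ... but the per-symbol entropy h_per_key stays at 1 bit regardless
```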

Outputs of Unequal Probabilities

- It is not always the case that every output has an equal probability of occurrence
- In English, the letter "e" occurs most often, and the letter "q" is generally followed by the letter "u"

Consider a BDMS with outputs of unequal probabilities of occurrence, P(0) = p and P(1) = q:
- Every 0 conveys -log2(p) bits of info
- Every 1 conveys -log2(q) bits of info

What is the avg information content (entropy) for a source output comprising y symbols?

Avg Info, Unequal Probability

In a msg comprising y symbols:
- No. of 0 symbols in msg = P(0) x (total no. of symbols) = (p)(y)
- No. of 1 symbols in msg = P(1) x (total no. of symbols) = (q)(y)

The avg info in the message is:

H = (no. of times 0 occurs x self-information for 0) + (no. of times 1 occurs x self-information for 1)
  = -[(p)(y) log2(p)] - [(q)(y) log2(q)] bits

Avg info per symbol = avg info in msg / no. of symbols in msg:

H_av = -(y)[p log2(p) + q log2(q)] / (y) = -(p log2(p) + q log2(q)) bits/symbol
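The counting argument above can be checked in a few lines (a sketch with arbitrary p and y):

```python
import math

p, y = 0.25, 1000   # P(0) = p, P(1) = q = 1 - p; y symbols in the message
q = 1 - p

# Total info in the message: each 0 contributes -log2(p), each 1 contributes -log2(q)
h_msg = -(p * y) * math.log2(p) - (q * y) * math.log2(q)   # bits

# Average info per symbol is the total divided by y
h_av = h_msg / y
print(h_av, -(p * math.log2(p) + q * math.log2(q)))  # the two expressions agree
```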

Beyond Binary

- The previous example used only 2 symbols (0 and 1)
- Generally, there can be a max of N symbols

Thus:

H = -[p1(y) log2(p1) + p2(y) log2(p2) + ... + pN(y) log2(pN)] = -(y) Σ_{i=1}^{N} pi log2(pi) bits

And:

H_av = -(y) Σ_{i=1}^{N} pi log2(pi) / (y) = -Σ_{i=1}^{N} pi log2(pi) bits/symbol

Redundancy

- When the entropy is less than the maximum, the message is said to contain redundancy
- The redundancy of a message is given by:

R = [(H_max - H_av) / H_max] x 100%

What do you think the advantages and disadvantages of redundancy are?
- Can redundancy facilitate error detection? If so, how?
- Does redundancy reduce the rate of info transmission? If so, how?
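A sketch of the redundancy formula, reusing the entropy sum from the earlier slides:

```python
import math

def entropy(probs):
    return sum(-p * math.log2(p) for p in probs if p > 0)

def redundancy_pct(probs):
    """R = (H_max - H_av) / H_max x 100%, with H_max = log2(N) for N symbols."""
    h_max = math.log2(len(probs))
    return (h_max - entropy(probs)) / h_max * 100

print(redundancy_pct([0.25, 0.25, 0.25, 0.25]))   # 0.0  (equiprobable: no redundancy)
print(redundancy_pct([0.5, 0.25, 0.125, 0.125]))  # 12.5
```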

Example 1

Consider two possible events, one of which occurs with probability p1 = 0.8.
- Determine the information content of each individual event
- What does this result imply?
- Determine the average information
Example 2

Determine the average information of a message that is made up of two symbols, given that the probability of occurrence of each is as follows: p = 1; q = 0.
- What does this result indicate?

Example 3

A system can send out a group of four pulses, each of width 1 ms, and with equal probability of having an amplitude of 0, 1, 2 or 3 V. The four pulses are always followed by a pulse of amplitude -1 V to separate the groups.
- Sketch a typical pulse sequence
- What is the average rate of information generated by this system?
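One way to set up the calculation (a sketch, assuming the fixed -1 V separator pulse carries no information):

```python
import math

levels = 4                          # amplitudes 0, 1, 2, 3 V, equiprobable
bits_per_pulse = math.log2(levels)  # 2 bits per information pulse
pulses_per_group = 4                # the -1 V separator pulse carries no info
group_duration_ms = 5               # 4 info pulses + 1 separator, 1 ms each

rate_bps = pulses_per_group * bits_per_pulse * 1000 / group_duration_ms
print(rate_bps)  # 1600.0
```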

Example 4

The probabilities of Example 3 are altered such that the 0 V level occurs one-half of the time on average, the 1 V level occurs one-quarter of the time on average, and the remaining levels occur one-eighth of the time each.
- Find the average rate of generation of information
- Determine the redundancy
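A sketch of the same setup with the altered probabilities, reusing the 5 ms group structure of Example 3:

```python
import math

probs = [0.5, 0.25, 0.125, 0.125]   # P(0 V), P(1 V), P(2 V), P(3 V)

h = sum(-p * math.log2(p) for p in probs)   # entropy in bits per info pulse
h_max = math.log2(len(probs))               # 2 bits for 4 equiprobable levels

rate_bps = 4 * h * 1000 / 5                 # 4 info pulses per 5 ms group
redundancy = (h_max - h) / h_max * 100
print(h, rate_bps, redundancy)  # 1.75 1400.0 12.5
```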

Limits re Info Representation

- An information source requires, on average, a minimum of H(X) bits per source output for error-free representation
- This imposes a fundamental limit on the coding of the source information for transmission in a communications system

Summary

- A DMS is a simple model for an information source
- The info content or entropy of a source ...
  - Is a measure of its randomness (uncertainty)
  - Is the weighted sum of the self-information of all possible source o/ps
  - Is expressed in bits / source symbol or bits / sample
  - Is maximized when symbol probabilities are equal
  - Is less than max when the message contains redundancy
  - Is ... for binary

Reading

Mandatory reading:
- Proakis, J., and Masoud Salehi. Fundamentals of Communication Systems: Sections 1.2, 2.1.2 (pp. 25 and 26 only), Intro to Ch. 12, 12.1-12.1.2

Recommended reading:
- Proakis, J., and Masoud Salehi. Fundamentals of Communication Systems: Sections 5.1-5.3