
ECNG 3001 Communication Systems II

Topic 2: The Information Source

- Review of digital information
- Modeling information sources
- Information content of a source
- Fundamental limits on representation of information

Review: Analog, Digital, Binary

Review: Bits and Symbols

[Figure: voltage vs. time waveforms of 4-level symbols, amplitude levels V1-V4, one symbol period each -- (a) 4-level symbol for logic 00; (b) 4-level symbol for logic 01]
Examples of Symbols

- 1 bit per symbol
- 2 bits per symbol: 00, 01, 10, 11

What are the bit and symbol rates if 2 symbols are transmitted per second in each of the above cases?

Bit and Symbol Rates

- Bit: binary digit, 1 or 0
  - E.g. 1011010 is 7 bits long
  - Bit rate: bits per second (bps)
- Symbol: smallest unit of data transmitted at one time
  - Symbol rate (or baud rate): symbols per second
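The relationship between the two rates can be sketched in a few lines; the function name below is illustrative, not part of any library:

```python
import math

def bit_rate(symbol_rate, levels):
    """Bit rate (bps) for a given symbol rate (baud) and number of signal levels."""
    bits_per_symbol = math.log2(levels)  # each M-level symbol carries log2(M) bits
    return symbol_rate * bits_per_symbol

# At 2 symbols per second (the question on the earlier slide):
print(bit_rate(2, 2))  # binary symbols: 1 bit/symbol -> 2.0 bps
print(bit_rate(2, 4))  # 4-level symbols: 2 bits/symbol -> 4.0 bps
```

So the symbol rate is 2 baud in both cases, while the bit rate doubles when 4-level symbols are used.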

Review: Symbols and Signals

[Figure: mapping of the 2-bit data values 00, 01, 10, 11 onto distinct signal levels (amplitude E)]

Signal, Bit, Symbol, Message

Consider the message "Hello":

Digital Comms System Model

Walk through the sets of bits, symbols, signals, and messages at the:
- Transmitter
- Receiver

Information Sources

- An information source produces a time-varying, random output
- The properties of its random output depend on the nature of the source (e.g. its bandwidth, amplitude, statistics, etc.)
- A mathematical model is required to measure the information content of a source
- The simplest model is the Discrete Memoryless Source (DMS): a discrete-time, discrete-amplitude random process in which all (discrete) random outputs are generated independently and with the same probability distribution

The Discrete Memoryless Source

Information source output sequence: ..., X-2, X-1, X0, X1, X2, ...

- The output of a DMS is a discrete random variable, X, whose value does not depend on those of previous outputs
- Let A = {a1, a2, ..., aN} denote the set of source output symbols, i.e. the set from which the random variable X takes its values
- Let pi = P(X = ai) for i = 1, 2, ..., N denote the probability distribution of the random variable, i.e. the probability pi that the random variable X will take on some value ai out of the set A
- The DMS is fully described by the set A, called its alphabet, and the probabilities {pi}, i = 1, ..., N
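A DMS can be sketched in a few lines of code; the alphabet and probabilities below are made-up illustrative values:

```python
import random

alphabet = ["a1", "a2", "a3"]   # hypothetical alphabet A (N = 3)
probs = [0.5, 0.3, 0.2]         # p_i = P(X = a_i); must sum to 1

def dms_outputs(n):
    """Draw n source outputs, each independent of all previous ones (memoryless)."""
    return random.choices(alphabet, weights=probs, k=n)

print(dms_outputs(10))
```

Because each draw uses the same fixed distribution and ignores earlier outputs, the sequence has exactly the independence property that defines the DMS.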

DMS Example

An information source is described by the alphabet A = {0, 1} and probabilities P(Xi = 1) = 1 - P(Xi = 0) = p.

This discrete memoryless source:
- Is a binary source, as it generates sequences of 1s and 0s
- Produces, in each regular time interval, a value of either 1 or 0 with probabilities p1 = p and p0 = 1 - p
- In the special case in which p = 0.5, the source is called a binary symmetric source, or BSS

Developing a Measure of Info

- Consider an information source that describes weather conditions in Trinidad
- The set, A, of possible source outputs is {very hot (45 C), hot (35 C), warm (25 C), cool (15 C), cold (5 C), very cold (-5 C)}
- Do all possible source outputs yield the same amount of information?
- Would an output of "very cold" reveal more information than one of "hot"?

What is More Info Content (1)?

- Yes: since it is already known that the climate of Trinidad is generally hot, an indication of "hot" would not reveal as much information as one of "very cold"
- Intuitively, the amount of information conveyed by a source output decreases with the probability of occurrence of that source output
- Assuming a discrete source, less probable source outputs convey more information

What is More Info Content (2)?

- Now consider the case in which the info from each source output, aj, comprises 2 independent parts (e.g. temperature, aj1, and air pollution level, aj2)
- Revealing temperature information does not provide any information about pollution, and vice versa: they are independent of each other
- Intuitively, the amount of information provided by revealing the combined source output is the sum of the information conveyed by each of its components

Impact of Probability Perturbations?

- Now consider the case in which the likelihood of experiencing a very cold temperature changes a little
- The info content is not drastically altered (a very cold temperature is still highly unlikely)
- Intuitively, the measure of information content is a continuous function of the probability of occurrence of a given source output (no discontinuities are present in the function)

Information Content of an Output

From intuition, then, we conclude that:
- The information content of a particular output aj depends only on the probability, pj, of aj, and not on its value (j denoting the jth source output). This is known as the self-information, denoted by I(pj)
- I(pj) is a decreasing function of its argument
- If pj = pj1 * pj2, then I(pj) = I(pj1) + I(pj2)
- I(pj) is a continuous function of pj

Modeling Information Content

- The only function that satisfies the properties of information content (prev. slide) is the logarithm, i.e. I(p) = -log(p) (the negative sign reflects the inverse relationship)
- The base of the log determines the units of information, e.g. for base 2, information content is expressed in bits per source symbol or bits per sample
- Thus self-information can be defined as I(pi) = -log2(pi) bits per source output
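The definition, and the additivity property from the previous slide, can be checked directly (a minimal sketch):

```python
import math

def self_information(p):
    """Self-information I(p) = -log2(p) in bits, for an output of probability p."""
    return -math.log2(p)

# Less probable outputs convey more information
print(self_information(0.5))    # 1.0 bit
print(self_information(0.125))  # 3.0 bits

# Independent parts add: I(p1 * p2) = I(p1) + I(p2)
print(self_information(0.5 * 0.25) == self_information(0.5) + self_information(0.25))  # True
```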

Info Content (Entropy) of a Source

The information content of a DMS, also known as its entropy H(X), is the weighted sum of the self-information of all possible source outputs:

H(X) = Σ_{i=1}^{N} pi I(pi)

The entropy of a discrete random variable X is:

H(X) = -Σ_{i=1}^{N} pi log2(pi) = Σ_{i=1}^{N} pi log2(1/pi)  bits/symbol
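A direct implementation of the entropy sum (a sketch; the usual convention 0 log 0 = 0 handles zero-probability symbols):

```python
import math

def entropy(probs):
    """H(X) = sum(-p_i * log2(p_i)) in bits/symbol; p = 0 terms contribute 0."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0 bits/symbol (equiprobable)
print(entropy([1.0]))                     # 0.0 (a certain output carries no info)
```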

Binary Entropy

Consider a binary memoryless source:
- Let p = probability of a 0 occurring
- And 1 - p = probability of a 1 occurring

The binary entropy function is:

H(X) = -p log2(p) - (1 - p) log2(1 - p)

Observe that H(X) = 0 if p = 0 or p = 1

Binary Entropy Function

[Figure: plot of H(X) = -p log2(p) - (1 - p) log2(1 - p) versus p; H(X) = 0 at p = 0 and p = 1, with the maximum of 1 bit at p = 0.5]
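The shape of the curve can be checked numerically (a sketch; the endpoint convention H(0) = H(1) = 0 is made explicit):

```python
import math

def binary_entropy(p):
    """H(p) = -p log2(p) - (1-p) log2(1-p); H(0) = H(1) = 0 by convention."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(binary_entropy(0.0))  # 0.0 -- no uncertainty
print(binary_entropy(0.5))  # 1.0 -- maximum uncertainty
print(binary_entropy(0.9))  # ~0.47 -- skewed source, below the maximum
```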

Max Entropy of a DMS

- H(X) is maximized when the symbols are equiprobable, i.e. for a BDMS, when p = 1 - p = 0.5
- For an alphabet of any size N:

H_max = -Σ_{i=1}^{N} (1/N) log2(1/N) = Σ_{i=1}^{N} (1/N) log2(N)

H_max = log2(N) bits/symbol
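This limit is easy to verify numerically (a sketch):

```python
import math

def entropy(probs):
    """H(X) = sum(-p log2 p) in bits/symbol."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

# Equiprobable symbols achieve H_max = log2(N) ...
for N in (2, 4, 8):
    print(N, entropy([1 / N] * N), math.log2(N))  # the two values agree

# ... and any unequal distribution falls below it
print(entropy([0.7, 0.1, 0.1, 0.1]) < math.log2(4))  # True
```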

Example: Equi-probable Outputs

Consider a keypad comprising only A and B keys, which are struck with equal probability. The keypad generates messages of various lengths:

keys/msg | Possible messages
2        | AA, AB, BA, BB
3        | AAA, AAB, ABA, ABB, BAA, BAB, BBA, BBB

- Is this an example of a Discrete Memoryless Source? Why so?
- What is the alphabet of this source?
- For each message size, what is the probability of guessing a message?
- How does the entropy of the source change, if at all:
  - with msg length?
  - if the alphabet were larger?
  - if the keys were struck with unequal probability?
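The questions above can be explored numerically; this sketch assumes the two keys are struck independently with equal probability:

```python
import math

p_key = 0.5                                  # P(A) = P(B)
h_per_key = -2 * (p_key * math.log2(p_key))  # entropy per key struck: 1 bit

for keys_per_msg in (2, 3):
    n_messages = 2 ** keys_per_msg           # 4 two-key msgs, 8 three-key msgs
    p_guess = 1 / n_messages                 # chance of guessing a whole message
    total_bits = keys_per_msg * h_per_key    # info grows with message length ...
    print(keys_per_msg, n_messages, p_guess, total_bits)
    # ... but the per-symbol entropy h_per_key stays at 1 bit regardless
```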

Outputs of Unequal Probabilities

- It is not always the case that every output has an equal probability of occurrence
- In English, the letter "e" occurs most often, and the letter "q" is generally followed by the letter "u"

Consider a BDMS with outputs of unequal probabilities of occurrence, P(0) = p and P(1) = q:
- Every 0 conveys -log2(p) bits of info
- Every 1 conveys -log2(q) bits of info

What is the avg information content (entropy) for a source output comprising y symbols?

Avg Info, Unequal Probability

In a msg comprising y symbols:
- No. of 0 symbols in msg = P(0) x (total no. of symbols) = (p)(y)
- No. of 1 symbols in msg = P(1) x (total no. of symbols) = (q)(y)

The avg info in the message is:

H = (no. of times 0 occurs x self-information for 0) + (no. of times 1 occurs x self-information for 1)
  = -[(p)(y) log2(p)] - [(q)(y) log2(q)] bits

Avg info per symbol = avg info in msg / no. of symbols in msg:

H_av = -(y)[p log2(p) + q log2(q)] / (y) = -(p log2(p) + q log2(q)) bits/symbol
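The counting argument above can be checked in a few lines (a sketch with arbitrary p and y):

```python
import math

p, y = 0.25, 1000   # P(0) = p, P(1) = q = 1 - p; y symbols in the message
q = 1 - p

# Total info in the message: each 0 contributes -log2(p), each 1 contributes -log2(q)
h_msg = -(p * y) * math.log2(p) - (q * y) * math.log2(q)   # bits

# Average info per symbol is the total divided by y
h_av = h_msg / y
print(h_av, -(p * math.log2(p) + q * math.log2(q)))  # the two expressions agree
```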

Beyond Binary

- The previous example used only 2 symbols (0 and 1)
- Generally, there can be a max of N symbols

Thus:

H = -[p1(y) log2(p1) + p2(y) log2(p2) + ... + pN(y) log2(pN)] = -(y) Σ_{i=1}^{N} pi log2(pi) bits

And:

H_av = -(y) Σ_{i=1}^{N} pi log2(pi) / (y) = -Σ_{i=1}^{N} pi log2(pi) bits/symbol

Redundancy

- When the entropy is less than the maximum, the message is said to contain redundancy
- The redundancy of a message is given by:

R = [(H_max - H_av) / H_max] x 100%

What do you think the advantages and disadvantages of redundancy are?
- Can redundancy facilitate error detection? If so, how?
- Does redundancy reduce the rate of info transmission? If so, how?
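A sketch of the redundancy formula, reusing the entropy sum from the earlier slides:

```python
import math

def entropy(probs):
    return sum(-p * math.log2(p) for p in probs if p > 0)

def redundancy_pct(probs):
    """R = (H_max - H_av) / H_max x 100%, with H_max = log2(N) for N symbols."""
    h_max = math.log2(len(probs))
    return (h_max - entropy(probs)) / h_max * 100

print(redundancy_pct([0.25, 0.25, 0.25, 0.25]))   # 0.0  (equiprobable: no redundancy)
print(redundancy_pct([0.5, 0.25, 0.125, 0.125]))  # 12.5
```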

Example 1

Consider two possible events, one of which occurs with probability p1 = 0.8.
- Determine the information content of each individual event
- What does this result imply?
- Determine the average information
Example 2

Determine the average information of a message that is made up of two symbols, given that the probability of occurrence of each is as follows: p = 1; q = 0.
- What does this result indicate?

Example 3

A system can send out a group of four pulses, each of width 1 ms, and with equal probability of having an amplitude of 0, 1, 2 or 3 V. The four pulses are always followed by a pulse of amplitude -1 V to separate the groups.
- Sketch a typical pulse sequence
- What is the average rate of information generated by this system?
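One way to set up the calculation (a sketch, assuming the fixed -1 V separator pulse carries no information):

```python
import math

levels = 4                          # amplitudes 0, 1, 2, 3 V, equiprobable
bits_per_pulse = math.log2(levels)  # 2 bits per information pulse
pulses_per_group = 4                # the -1 V separator pulse carries no info
group_duration_ms = 5               # 4 info pulses + 1 separator, 1 ms each

rate_bps = pulses_per_group * bits_per_pulse * 1000 / group_duration_ms
print(rate_bps)  # 1600.0
```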

Example 4

The probabilities of Example 3 are altered such that the 0 V level occurs one-half of the time on average, the 1 V level occurs one-quarter of the time on average, and the remaining levels occur one-eighth of the time each.
- Find the average rate of generation of information
- Determine the redundancy
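A sketch of the same setup with the altered probabilities, reusing the 5 ms group structure of Example 3:

```python
import math

probs = [0.5, 0.25, 0.125, 0.125]   # P(0 V), P(1 V), P(2 V), P(3 V)

h = sum(-p * math.log2(p) for p in probs)   # entropy in bits per info pulse
h_max = math.log2(len(probs))               # 2 bits for 4 equiprobable levels

rate_bps = 4 * h * 1000 / 5                 # 4 info pulses per 5 ms group
redundancy = (h_max - h) / h_max * 100
print(h, rate_bps, redundancy)  # 1.75 1400.0 12.5
```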

Limits re Info Representation

- An information source requires, on average, a minimum of H(X) bits per source output for error-free representation
- This imposes a fundamental limit on the coding of the source information for transmission in a communications system

Summary

- A DMS is a simple model for an information source
- The info content or entropy of a source ...
  - Is a measure of its randomness (uncertainty)
  - Is the weighted sum of the self-information of all possible source o/ps
  - Is expressed in bits / source symbol or bits / sample
  - Is maximized when symbol probabilities are equal
  - Is less than max when the message contains redundancy
  - Is ... for binary

Reading

Mandatory reading:
- Proakis, J., and Masoud Salehi. Fundamentals of Communication Systems: Sections 1.2, 2.1.2 (pp. 25 and 26 only), Intro to Ch. 12, 12.1-12.1.2

Recommended reading:
- Proakis, J., and Masoud Salehi. Fundamentals of Communication Systems: Sections 5.1-5.3