
Chapter 5 Information Theory and Coding

In communication systems, information theory, pioneered by C. E. Shannon, deals with the mathematical formulation of information transfer from one place to another. It is concerned with source coding and channel coding. Source coding attempts to minimize the number of bits required to represent the source output at a given level of fidelity. Channel coding, on the other hand, is used so that information can be transmitted through the channel with a specified reliability. Information theory answers two fundamental questions of communication theory: What is the limit of data compression? (Answer: the entropy of the data, H(X), is its compression limit.) What is the ultimate transmission rate of communication? (Answer: the channel capacity C is its rate limit.) Information theory also suggests means by which systems can approach these ultimate limits of communication.

5.1 Amount of Information

In general, the function of a communication system is to convey information from the transmitter to the receiver; the job of the receiver is therefore to identify which one of the allowable messages was transmitted. The measure of information is related to the uncertainty of events: commonly occurring messages convey little information, whereas rare messages carry more information. This idea is captured by a logarithmic measure of information, first proposed by R. V. L. Hartley. Consider a discrete source whose outputs x_i, with i = 1, 2, ..., M, form a sequence of symbols chosen from a finite set {x_1, x_2, ..., x_M}, the alphabet of the source. The message symbols are emitted from the source with the probability distribution P(x_i). The discrete message source can therefore be modeled mathematically as a discrete random process, that is, a sequence of random variables taking values from this set with probability distribution P(x_i). Now let the source select and transmit a message x_i, and let us further assume that the receiver has correctly identified the message. Then the system has conveyed an amount of information given by

I(x_i) = \log_2 \frac{1}{P(x_i)} = -\log_2 P(x_i)        (5.1)

The amount of information I(x_i), also called the self-information of x_i, is a dimensionless number, but by convention the unit of bits is used when the logarithm is taken to base 2. As stated, a message with a small probability of occurrence carries a larger self-information.
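For a quick numerical check of eq. (5.1), the self-information can be computed directly; the short Python sketch below is illustrative (the function name and probabilities are not from the notes):

import math

def self_information(p):
    """Self-information in bits of a symbol with probability p, eq. (5.1)."""
    return -math.log2(p)

# A rare symbol carries more information than a common one.
print(self_information(0.5))    # 1.0 bit
print(self_information(0.125))  # 3.0 bits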

5.1.1 Average Information (Entropy)


A source is said to be a discrete memoryless source (DMS) if its symbols form a sequence of independent and identically distributed random variables. The average information of a discrete memoryless source, denoted H(X), is found as the expected value of the self-information I(x_i) and is formally defined as the entropy of the source:

H(X) = E\{I(x_i)\} = \sum_{i=1}^{M} P(x_i) \log_2 \frac{1}{P(x_i)} = -\sum_{i=1}^{M} P(x_i) \log_2 P(x_i)        (5.2)
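Equation (5.2) can be evaluated directly for any finite distribution; a minimal Python sketch follows (the distribution used is illustrative):

import math

def entropy(probs):
    """Entropy H(X) in bits of a discrete memoryless source, eq. (5.2).
    Symbols with zero probability contribute nothing to the sum."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A four-symbol source with probabilities 1/2, 1/4, 1/8, 1/8
print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits per symbol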

Example 5.1 Consider a binary source that emits the symbols 1 and 0 with probabilities P(1) = p and P(0) = 1 - p, respectively. The entropy of this binary source is given by

H(X) = -p \log_2 p - (1 - p) \log_2 (1 - p)

When plotted for different values of p, H(X) is seen to be maximum at p = 0.5, that is, for the case of equally likely symbols. This example illustrates the general result that for a source with M symbols

H(X) \leq \log_2 M

The equality is achieved when the distribution is uniform (equally likely symbols), that is, P(x_i) = 1/M for all i.

Figure 5.1 Entropy of a binary memoryless Source
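The curve of Figure 5.1 can be reproduced numerically; the short sketch below (the function name and step size of the sweep are illustrative) confirms that the maximum occurs at p = 0.5:

import math

def binary_entropy(p):
    """H(p) = -p log2 p - (1 - p) log2 (1 - p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Sweep p over [0, 1] and locate the maximum of H(p).
ps = [i / 100 for i in range(101)]
p_max = max(ps, key=binary_entropy)
print(p_max, binary_entropy(p_max))  # 0.5 1.0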

5.1.2 Information Rate


If the source generates messages at the rate of r messages per second, then the information rate is defined as

R = r H(X)        (5.3)

which is the average number of bits of information per second.
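For instance, with illustrative numbers: the binary source of Example 5.1 with p = 0.5 has H(X) = 1 bit per symbol, so if it emits r = 1000 messages per second its information rate is R = r H(X) = 1000 bits per second.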

5.2 Source Coding

Source coding is the process of efficiently representing the source output, with the design aim of reducing redundancy. It can also reduce fluctuations in the information rate from the source, so that surges of symbols from message sequences of high-probability symbols do not overload the channel.

Source coding can generally be classified as lossless or lossy. In lossless source coding, usually referred to as lossless data compression or data compaction, the decoding operation reproduces the original source output perfectly. In lossy source coding, referred to as lossy data compression or simply data compression, the encoding operation is not reversed perfectly.

5.2.1 Coding for a Discrete Memoryless Source


Assuming that each symbol in the source alphabet will be mapped into a binary codeword, we can classify a source code as fixed-length or variable-length. In a fixed-length code, the number of bits that represents each source symbol is fixed. In a variable-length code, shorter codewords are assigned to more likely source symbols and longer codewords to less probable source symbols, in such a way that the average codeword length is minimized. Consider a discrete memoryless source (DMS) producing symbols from an alphabet of M symbols with a priori probabilities P(x_i). Let the symbols be encoded into variable-length codewords in which the i-th codeword has length n_i. The average codeword length \bar{L} for this source is given by

\bar{L} = \sum_{i=1}^{M} P(x_i) n_i   (bits per codeword)        (5.4)

The source-coding theorem states that any discrete memoryless source can be losslessly encoded with a code whose average codeword length is arbitrarily close to, but not less than, the entropy of the source. Mathematically,

\bar{L} \geq H(X)        (5.5)

Coding Efficiency
A measure of the efficiency of a source-encoding method can be obtained by comparing the average number of binary digits per source symbol, \bar{L}, to the entropy H(X):

\eta = \frac{H(X)}{\bar{L}} \times 100\%        (5.6)

From eqs. (5.5) and (5.6) it can be seen that the maximum amount of information that can be conveyed by an L-bit binary codeword is L bits, that is,

H_{max} = L        (5.7)
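As an illustration, eqs. (5.4) and (5.6) can be evaluated with a short sketch; the function names are illustrative, and the probabilities and codeword lengths below are those of Code II in Example 5.2 further on:

import math

def average_length(probs, lengths):
    """Average codeword length of eq. (5.4)."""
    return sum(p * n for p, n in zip(probs, lengths))

def efficiency(probs, lengths):
    """Coding efficiency of eq. (5.6), in percent."""
    H = -sum(p * math.log2(p) for p in probs if p > 0)
    return 100.0 * H / average_length(probs, lengths)

probs = [0.5, 0.25, 0.125, 0.125]   # source probabilities
lengths = [1, 2, 3, 3]              # codeword lengths of Code II
print(average_length(probs, lengths))  # 1.75 bits per codeword
print(efficiency(probs, lengths))      # 100.0 percent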

Fixed-Length Codewords
For equiprobable source symbols, let the source encoder assign a unique set of L binary digits to each symbol. Since there are M possible symbols, the number of binary digits per symbol required for unique encoding is

L = \log_2 M   when M is a power of 2        (5.8)

and

L = \lfloor \log_2 M \rfloor + 1   when M is not a power of 2        (5.9)

where \lfloor x \rfloor denotes the largest integer less than x. Eq. (5.8) shows that when M is a power of 2 and the source symbols are equiprobable, L = H(X); hence a fixed-length code of L bits per symbol attains 100% efficiency. If M is not a power of 2 but the source symbols are still equally likely, L differs from H(X) by at most 1 bit per symbol. When \log_2 M \gg 1, the efficiency of this encoding scheme is high. On the other hand, when M is small, the efficiency of the fixed-length code can be increased by encoding a sequence of J symbols at a time. To accomplish the desired encoding, we require M^J unique codewords. Since sequences of N binary digits can accommodate 2^N possible codewords, N must be selected so that 2^N \geq M^J, that is, N \geq J \log_2 M. Hence, the minimum integer value of N required is

N = \lfloor J \log_2 M \rfloor + 1        (5.10)

The average number of bits per source symbol is now N/J = L, and the inefficiency has thus been reduced by approximately a factor of 1/J relative to the symbol-by-symbol encoding described above. By making J sufficiently large, the efficiency of the encoding procedure, measured by the ratio J H(X)/N, can be made as close to unity as desired.
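A small sketch of eqs. (5.8)-(5.10) for equiprobable symbols (the alphabet size M and the block lengths J below are illustrative) shows the efficiency J H(X)/N approaching unity as J grows:

import math

def fixed_length_bits(M):
    """Bits per symbol for a fixed-length code on M equiprobable symbols,
    eqs. (5.8) and (5.9)."""
    L = math.log2(M)
    return int(L) if L.is_integer() else int(L) + 1

def block_code_bits(M, J):
    """Minimum codeword length N for blocks of J symbols, eq. (5.10)."""
    x = J * math.log2(M)
    return int(x) if x.is_integer() else int(x) + 1

M = 5                        # not a power of 2
H = math.log2(M)             # entropy of M equiprobable symbols, about 2.32 bits
print(fixed_length_bits(M))  # 3 bits per symbol, efficiency about 77%
for J in (1, 4, 16):
    N = block_code_bits(M, J)
    print(J, N, J * H / N)   # efficiency improves with larger J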

Variable-Length Codewords
When the source symbols are not equiprobable, a more efficient encoding method is to use variable-length codewords. An example of such encoding is the Morse code, which dates back to the nineteenth century. In the Morse code, the letters that occur more frequently are assigned short codewords and those that occur infrequently are assigned long codewords. Following this general philosophy, we may use the probabilities of occurrence of the different source symbols in the selection of the codewords. The problem is to devise a method for selecting and assigning the codewords to the source symbols. This type of encoding is called entropy coding.

Example 5.2 Suppose that a DMS with output symbols a1, a2, a3, a4 and corresponding probabilities P(a1) = 1/2, P(a2) = 1/4, P(a3) = P(a4) = 1/8 is encoded as shown below.

Symbol   P(ak)   Code I   Code II   Code III
a1       1/2     1        0         0
a2       1/4     00       10        01
a3       1/8     01       110       011
a4       1/8     10       111       111

Consider a sequence of coded bits 001001 ...

Code I is a variable-length code that has a basic flaw. For the sequence of bits above, the first symbol, corresponding to 00, is a2. However, the next four bits are ambiguous (not uniquely decodable): they may be decoded either as a4 a3 or as a1 a2 a1. Perhaps the ambiguity could be resolved by waiting for additional bits, but such a decoding delay is highly undesirable. We shall consider only codes that are decodable instantaneously, that is, without any decoding delay.

Code II in the table above is uniquely and instantaneously decodable. It is convenient to represent the codewords of this code graphically as terminal nodes of a tree, as shown in Fig. 5.2(a) below. We observe that the digit 0 indicates the end of a codeword for the first three codewords. This characteristic, plus the fact that no codeword is longer than three binary digits, makes this code instantaneously decodable. Note that no codeword in this code is a prefix of any other codeword. In general, the prefix condition requires that for a given codeword C_k of length k having elements (b_1, b_2, ..., b_k), there is no other codeword of length l < k with elements (b_1, b_2, ..., b_l), for 1 \leq l \leq k - 1.
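To see the prefix condition at work, the short sketch below decodes a bit stream instantaneously using Code II of Example 5.2 (the function name and the test sequence are illustrative):

def decode_prefix(bits, codebook):
    """Instantaneously decode a bit string with a prefix code.
    codebook maps codeword strings to symbols; since no codeword is a
    prefix of another, each match can be emitted immediately."""
    symbols, current = [], ""
    for b in bits:
        current += b
        if current in codebook:
            symbols.append(codebook[current])
            current = ""
    return symbols

# Code II of Example 5.2 satisfies the prefix condition.
code_ii = {"0": "a1", "10": "a2", "110": "a3", "111": "a4"}
print(decode_prefix("0101100111", code_ii))  # ['a1', 'a2', 'a3', 'a1', 'a4']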

Fig. 5.2 Code trees: (a) code tree for Code II; (b) code tree for Code III

In other words, there is no codeword of length l < k that is identical to the first l binary digits of another codeword of length k > l. This property makes the codewords instantaneously decodable. Code III given in the table has the tree structure shown in Fig. 5.2(b). We note that in this case the code is uniquely decodable but not instantaneously decodable; clearly, this code does not satisfy the prefix condition. The main objective of variable-length coding is to devise a systematic procedure for constructing uniquely decodable codes that are efficient in the sense that the average number of bits per source symbol, eq. (5.4), is minimized.

Huffman Source Coding
In Huffman coding, fixed-length blocks of the source output are mapped to variable-length binary blocks. This is called fixed-to-variable-length coding. The idea is to map the more frequently occurring fixed-length sequences to shorter binary sequences and the less frequently occurring ones to longer binary sequences. In variable-length coding, synchronization is a problem: there must be one and only one way to break the received binary sequence into codewords. Example 5.2 above illustrates this point. The idea in Huffman coding is therefore to choose codeword lengths such that more probable sequences have shorter codewords. If we can map each source output of probability p_i to a codeword of length approximately \log_2 (1/p_i) and at the same time ensure unique decodability, we can achieve an average codeword length of approximately H(X).

Huffman codes are uniquely decodable instantaneous codes with minimum average codeword length; in this sense they are optimal. By optimality we mean that, among all codes that satisfy the prefix condition (and are therefore uniquely and instantaneously decodable), Huffman codes have the minimum average codeword length. A Huffman code is constructed as follows:

1. List the source symbols in order of decreasing probability.
2. Construct a tree to the right by combining the two source symbols of smallest probability into a new symbol whose probability is the sum of the two previous probabilities; repeat this step on the reduced set of symbols until a single symbol of probability 1 remains.
3. Label the branches of the final tree with 0 on the lower branch and 1 on the upper branch (or vice versa); the codeword of each source symbol is obtained by reading the branch labels from the root of the tree back to that symbol.

Example 5.3 Consider a discrete source with alphabet {x1, x2, ..., x5} having probabilities of occurrence p1 = p2 = 0.1, p3 = p4 = 0.2, and p5 = 0.4. The coding process is shown below.
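The same coding process can be carried out programmatically; the sketch below (a minimal implementation using heapq; the 0/1 labelling convention and tie-breaking are one of several equally valid choices, so the individual codewords may differ from the tree construction, but the average length does not):

import heapq

def huffman_code(probs):
    """Build a Huffman code for a dict {symbol: probability} by repeatedly
    merging the two least probable entries and prefixing their codewords
    with 0 and 1 respectively."""
    heap = [(p, i, [sym]) for i, (sym, p) in enumerate(probs.items())]
    codes = {sym: "" for sym in probs}
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)
        p2, _, group2 = heapq.heappop(heap)
        for s in group1:
            codes[s] = "0" + codes[s]
        for s in group2:
            codes[s] = "1" + codes[s]
        # unique tie-breaker in the middle field avoids comparing symbol lists
        heapq.heappush(heap, (p1 + p2, len(codes) + len(heap), group1 + group2))
    return codes

# Example 5.3: p1 = p2 = 0.1, p3 = p4 = 0.2, p5 = 0.4
probs = {"x1": 0.1, "x2": 0.1, "x3": 0.2, "x4": 0.2, "x5": 0.4}
codes = huffman_code(probs)
print(codes)
print(round(sum(probs[s] * len(codes[s]) for s in probs), 2))  # 2.2 bits

The resulting average codeword length, 2.2 bits per symbol, can be compared with the source entropy H(X) of about 2.12 bits, consistent with eq. (5.5).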

Lempel-Ziv Algorithm (Reading Assignment)
