Ch 3 Huffman Coding

Outline
3.1 Overview
3.2 The Huffman Coding Algorithm
    3.2.1 Minimum Variance Huffman Codes
    3.2.2 Optimality of Huffman Codes (*)
    3.2.3 Length of Huffman Codes (*)
    3.2.4 Extended Huffman Codes (*)
3.3 Nonbinary Huffman Codes (*)
3.4 Adaptive Huffman Coding
    3.4.1 Update Procedure
    3.4.2 Encoding Procedure
    3.4.3 Decoding Procedure
3.5 Golomb Codes
3.6 Rice Codes
    3.6.1 CCSDS Recommendation for Lossless Compression
3.7 Tunstall Codes
3.8 Applications of Huffman Coding
    3.8.1 Lossless Image Compression
    3.8.2 Text Compression
    3.8.3 Audio Compression
3.9 Summary
3.10 Projects and Problems
3.1 Overview
In this chapter, we describe a very popular coding algorithm called the Huffman coding algorithm:
- A procedure for building Huffman codes when the probability model for the source is known
- A procedure for building codes when the source statistics are unknown
- A new technique for code design that is in some sense similar to the Huffman coding approach
- Some applications
The two least probable symbols can be assigned codewords of the same length that differ only in the last bit: if the codeword for the least probable symbol were longer, we could drop its final bit(s) and still obtain a distinct codeword of k bits. As these codewords correspond to the least probable symbols in the alphabet, no other codeword can be longer than these codewords; therefore there is no danger that the shortened codeword would become the prefix of some other codeword.
In this case, the least probable symbols are a3' and a1. Therefore,
c(a3') = α3 ∗ 0    c(a1) = α3 ∗ 1
and, since c(a3') = α2 and α1 = α2 ∗ 1,
c(a4) = α1 ∗ 0 = α2 ∗ 10    c(a5) = α1 ∗ 1 = α2 ∗ 11
(Here each αi denotes a binary string and ∗ denotes concatenation.)
The average length for this code is
l = 0.4×1 + 0.2×2 + 0.2×3 + 0.1×4 + 0.1×4 = 2.2 bits/symbol.
A measure of the efficiency of this code is its redundancy: the difference between the average length and the entropy. In this case, the redundancy is 2.2 − 2.122 = 0.078 bits/symbol.
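The merge procedure and the numbers above can be checked with a short sketch (the function name is my own; only the Huffman merge pattern is used, so any optimal tie-breaking yields the same average length):

```python
import heapq
import math

def huffman_lengths(probs):
    """Return codeword lengths produced by the Huffman merge procedure."""
    # Heap entries: (probability, tie-breaker, symbol indices in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)   # two least probable entries
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1               # each merge adds one bit to members
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

P = [0.4, 0.2, 0.2, 0.1, 0.1]            # a2, a1, a3, a4, a5
L = huffman_lengths(P)
avg = sum(p * l for p, l in zip(P, L))
H = -sum(p * math.log2(p) for p in P)
print(avg, round(avg - H, 3))            # average length and redundancy
```

Running this confirms an average length of 2.2 bits/symbol and a redundancy of about 0.078 bits/symbol.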
Figure 3.1 The Huffman encoding procedure. The symbol probabilities are listed in parentheses.

Figure 3.2 Building the binary Huffman tree. Notice the similarity between Figures 3.1 and 3.2. This is not surprising, as they are a result of viewing the same procedure in two different ways.
Table 3.6 Reduced four-letter alphabet.
Letter   Probability   Codeword
a2       0.4           c(a2)
a4'      0.2           α1
a1       0.2           c(a1)
a3       0.2           c(a3)

Table 3.8 Reduced two-letter alphabet.
Letter   Probability   Codeword
a3''     0.6           α3
a2       0.4           c(a2)
The average length for this code is
l = 0.4×2 + 0.2×2 + 0.2×2 + 0.1×3 + 0.1×3 = 2.2 bits/symbol.
These two codes are identical in terms of their redundancy. However, the variance of the codeword lengths is significantly different.
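The difference in variance can be checked directly (a small sketch; the probabilities and the two length sets are those of the codes above):

```python
# Two optimal codes for P = (0.4, 0.2, 0.2, 0.1, 0.1): same average
# length, very different variance of codeword length.
P = [0.4, 0.2, 0.2, 0.1, 0.1]
first = [1, 2, 3, 4, 4]    # lengths of the first Huffman code
minvar = [2, 2, 2, 3, 3]   # lengths of the minimum variance code

def avg_len(lengths):
    return sum(p * l for p, l in zip(P, lengths))

def length_var(lengths):
    m = avg_len(lengths)
    return sum(p * (l - m) ** 2 for p, l in zip(P, lengths))

print(avg_len(first), length_var(first))    # ≈ 2.2, 1.36
print(avg_len(minvar), length_var(minvar))  # ≈ 2.2, 0.16
```

Both codes average 2.2 bits/symbol, but the variance drops from 1.36 to 0.16, which matters when the output is sent over a fixed-rate channel through a buffer.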
Figure 3.4 Two Huffman trees corresponding to the same probabilities.

3.4 Adaptive Huffman Coding
Sibling property: nodes y(2j−1) and y(2j) are siblings for 1 ≤ j < n, and the node number of their parent is greater than the node numbers of y(2j−1) and y(2j).
If the source alphabet (a1, …, am) has size m, pick e and r such that m = 2^e + r and 0 ≤ r < 2^e. The letter ak is encoded as the (e+1)-bit binary representation of k−1 if 1 ≤ k ≤ 2r; otherwise, ak is encoded as the e-bit binary representation of k−r−1.
Ex (m = 26, so e = 4 and r = 10):
a1:  [1 ≤ 2×10]  send the 5-bit representation of 1−1 = 0
a2:  [2 ≤ 2×10]  send the 5-bit representation of 2−1 = 1
a22: [22 > 2×10] send the 4-bit representation of 22−10−1 = 11
As transmission progresses, nodes corresponding to the symbols transmitted are added to the tree, and the tree is reconfigured using an update procedure.
For a22, 22 − 10 − 1 = 11, so the fixed code is 1011.
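A sketch of this fixed code (the function name is my own; m = 26, e = 4, r = 10 as in the example):

```python
def fixed_code(k, e=4, r=10):
    """Fixed (initial) code for letter a_k, where m = 2**e + r."""
    if 1 <= k <= 2 * r:
        return format(k - 1, '0{}b'.format(e + 1))   # (e+1)-bit code of k-1
    return format(k - r - 1, '0{}b'.format(e))       # e-bit code of k-r-1

print(fixed_code(1))    # a1  -> 00000
print(fixed_code(2))    # a2  -> 00001
print(fixed_code(22))   # a22 -> 1011
```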
Both transmitter and receiver:
- start with the same tree structure
- use the identical update procedure
Figure 3.6 (b) Update procedure. The final step of the flowchart: Is this the root node? If yes, stop; if no, go to the parent node and repeat.
[Tree diagrams omitted: the adaptive Huffman tree is initialized as a single NYT node with weight 0 and node number 51; the figures show the trees after (a), (aa), and (aar).]
(a) For a: send the binary code 00000, since the index of a is 1.
[Tree diagrams omitted: for each new symbol, the old NYT node gives birth to a new NYT node and an external node for the symbol; node weights are incremented along the path to the root, and nodes are swapped where necessary to preserve the sibling property.]
(aard) v: send 000, the code for the NYT node, then send the fixed code for v. Since the index of v is 22 and 22 − 10 − 1 = 11, the fixed code is 1011.
Encoding procedure (flowchart): read in a symbol. Is this the first appearance of the symbol?
Yes: send the code for the NYT node followed by the index in the NYT list.
No: the code is the path from the root node to the node corresponding to the symbol.
Then update the tree and read the next symbol.
[Tree diagrams for (aardv), including the node swaps, omitted.]
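The encoding and update procedures above can be sketched together (a minimal FGK-style implementation; class and method names are my own, symbols are passed as 1-based alphabet indices, and m = 26, e = 4, r = 10 as in the example):

```python
class Node:
    def __init__(self, num, weight=0, symbol=None, parent=None):
        self.num, self.weight, self.symbol = num, weight, symbol
        self.parent, self.left, self.right = parent, None, None

class AdaptiveHuffman:
    def __init__(self, m=26):
        self.e = m.bit_length() - 1
        self.r = m - (1 << self.e)               # m = 2**e + r
        self.root = self.nyt = Node(2 * m - 1)   # tree starts as the NYT node
        self.nodes = [self.root]
        self.leaf = {}                           # symbol index -> external node

    def fixed_code(self, k):
        if k <= 2 * self.r:
            return format(k - 1, '0{}b'.format(self.e + 1))
        return format(k - self.r - 1, '0{}b'.format(self.e))

    def path(self, node):                        # code = path from the root
        bits = []
        while node.parent is not None:
            bits.append('0' if node is node.parent.left else '1')
            node = node.parent
        return ''.join(reversed(bits))

    def encode(self, k):
        if k in self.leaf:                       # symbol seen before
            out = self.path(self.leaf[k])
        else:                                    # first appearance: NYT + fixed
            out = self.path(self.nyt) + self.fixed_code(k)
        self.update(k)
        return out

    def swap(self, a, b):            # swap subtrees, numbers stay with position
        a.num, b.num = b.num, a.num
        pa, pb = a.parent, b.parent
        if pa is pb:
            pa.left, pa.right = pa.right, pa.left
            return
        if pa.left is a: pa.left = b
        else: pa.right = b
        if pb.left is b: pb.left = a
        else: pb.right = a
        a.parent, b.parent = pb, pa

    def update(self, k):
        if k not in self.leaf:       # NYT gives birth to new NYT + external node
            old = self.nyt
            old.left = Node(old.num - 2, parent=old)
            old.right = Node(old.num - 1, symbol=k, parent=old)
            self.nyt, self.leaf[k] = old.left, old.right
            self.nodes += [old.left, old.right]
            node = old.right
        else:
            node = self.leaf[k]
        while node is not None:
            # Swap with the highest-numbered node of the same weight,
            # unless that node is this node's parent.
            block = [n for n in self.nodes if n.weight == node.weight
                     and n is not node and n is not node.parent]
            if block:
                leader = max(block, key=lambda n: n.num)
                if leader.num > node.num:
                    self.swap(node, leader)
            node.weight += 1
            node = node.parent

coder = AdaptiveHuffman()
for k in (1, 1, 18, 4, 22):          # a a r d v
    print(coder.encode(k))           # 00000 1 010001 0000011 0001011
```

For the sequence aardv this emits 00000, 1, 010001, 0000011, 0001011, matching the slides' walk-through (fixed code for the first a, path code for the second, NYT code plus fixed code for each first appearance).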
Transmitted sequence for (aardv) (the code for the NYT node precedes each first-appearance fixed code):
000001010001000001100010110
Figure 3.9 (c) Flowchart of the decoding procedure, fixed-code branch: read e bits to obtain a number p. Is the e-bit number p less than r? If yes, read one more bit (p becomes the (e+1)-bit value); if no, add r to p. The decoded letter is the (p+1)-th element in the NYT list. Repeat until the last bit of the received message
000001010001000001100010110
has been read.
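The flowchart's fixed-code branch can be sketched as follows (the function name is my own; bits arrive as a sequence of '0'/'1' characters, with e = 4 and r = 10 as before):

```python
def decode_fixed(bits, e=4, r=10):
    """Decode one fixed code; returns the 1-based index of the letter."""
    it = iter(bits)
    p = 0
    for _ in range(e):                  # read e bits
        p = 2 * p + int(next(it))
    if p < r:                           # is the e-bit number p less than r?
        p = 2 * p + int(next(it))       # yes: read one more bit
    else:
        p = p + r                       # no: add r to p
    return p + 1                        # (p+1)-th element in the NYT list

print(decode_fixed('00000'))   # -> 1  (a)
print(decode_fixed('10001'))   # -> 18 (r)
print(decode_fixed('1011'))    # -> 22 (v)
```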
Table 3.24 Compression using Huffman codes on pixel difference values.
Image Name   Bits/Pixel   Total Size (bytes)   Compression Ratio
Sena         4.02         32,968               1.99
Sensin       4.70         38,541               1.70
Earth        4.13         33,880               1.93
Omaha        6.42         52,643               1.24
The test images are 256×256 gray-scale raw images.
Figure 3.10 Test images. ftp://ftp.mkp.com/pub/Sayood/uncompressed_software/datasets/images/
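As a quick sanity check on Table 3.24 (assuming 8-bit originals of 256 × 256 = 65,536 bytes; the quoted bits/pixel figures are rounded, so the computed sizes and ratios agree with the table only approximately):

```python
orig_bytes = 256 * 256                 # 8-bit, 256x256 original
for name, bpp in [('Sena', 4.02), ('Sensin', 4.70),
                  ('Earth', 4.13), ('Omaha', 6.42)]:
    comp_bytes = bpp * orig_bytes / 8  # compressed size implied by bits/pixel
    print(name, round(comp_bytes), round(orig_bytes / comp_bytes, 2))
```

The compression ratio is simply 8 divided by the bits/pixel figure, since the originals use 8 bits per pixel.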
Adaptive Huffman coder
- Advantages: can be used as an on-line or real-time coder
- Disadvantages: more vulnerable to errors; more difficult to implement
16-bit audio samples can take 65,536 distinct values, so a Huffman coder would require 65,536 distinct (variable-length) codewords. In most applications, a codebook of this size is not practical. Approaches for such large alphabets:
- Recursive indexing (Chapter 8)
- Others [reference: #180]
Table 3.29 Huffman codes of differences of 16-bit CD-quality audio.
File Name   Original File Size (bytes)   Entropy (bits)   Estimated Compressed File Size (bytes)   Compression Ratio
Mozart      939,862                      9.7              569,792                                  1.65
Cohn        402,442                      10.4             261,590                                  1.54
Mir         884,020                      10.9             602,240                                  1.47
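The estimated sizes in Table 3.29 follow from coding the 16-bit samples at roughly the entropy, i.e. estimated size ≈ original size × entropy/16 (a sketch; the entropies are quoted to one decimal, so the results agree with the table only to within a few bytes):

```python
files = [('Mozart', 939862, 9.7), ('Cohn', 402442, 10.4), ('Mir', 884020, 10.9)]
for name, orig, entropy in files:
    est = orig * entropy / 16          # 16 bits/sample coded at ~entropy bits
    print(name, round(est), round(orig / est, 2))
```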