ystem) so
that this interface is easier to implement while still having
enough bandwidth to accommodate most lines. Since the entropy
coder only compresses data statistically, in some very rare but
possible cases, the actual output rate of the entropy coder within
a line may be higher than the reduced interface bandwidth. Some
data have to be discarded by the multiplexer in these cases. The
degradation caused by this loss of data is minimized by coding
the important subbands first. Thus, if some data need to be
discarded, only the less important subbands are affected. A
special codeword may be needed to mark this forced-end-of-line
case. The entropy decoder is an inverse of the entropy coder,
except that some error-handling functions are included.
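The discard policy described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that whole subbands are dropped (the paper does not say whether a subband may be truncated part-way), that each subband's coded data are available as a bit string ordered from most to least important, and that the forced-end-of-line marker is a caller-supplied bit pattern. The name `multiplex_line` and its parameters are hypothetical.

```python
from typing import List, Tuple


def multiplex_line(subband_codes: List[str], budget: int,
                   forced_eol: str) -> Tuple[str, int]:
    """Fit one line's coded subbands into `budget` bits.

    subband_codes: coded bits per subband, most important first (assumed order).
    forced_eol:    hypothetical marker appended when data are discarded.
    Returns (output bits, number of subbands discarded).
    """
    total = sum(len(code) for code in subband_codes)
    if total <= budget:
        # Common case: the whole line fits, nothing is discarded.
        return "".join(subband_codes), 0

    out = ""
    kept = 0
    room = budget - len(forced_eol)  # leave space for the marker
    for code in subband_codes:
        if len(out) + len(code) > room:
            break                    # this and all later subbands are dropped
        out += code
        kept += 1
    return out + forced_eol, len(subband_codes) - kept
```

For example, with three subbands of 4, 3, and 4 coded bits and a 10-bit budget, only the first subband survives and the marker follows it; because the important subbands come first, the loss is confined to the less important ones.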
III. THE IMPLEMENTATION OF THE ENTROPY CODER AND
DECODER
In this section, the structures of the entropy coder and decoder
based on a parallel approach are described first. Then full-custom
VLSI implementations of these two experimental prototype chips
are introduced.
A. The Entropy Coder
The entropy coder consists of two major parts: RLC and VLC
coders. The input data are run-length coded first, and then
variable-length coded. In the RLC coder, input data that are not
part of a zero run are passed through; otherwise, the length of the
zero run is determined and properly encoded. For both cases,
4 Many source coders preserve the sample rate.
LEI AND SUN: ENTROPY CODING SYSTEM FOR DIGITAL HDTV APPLICATIONS 149
Fig. 2. Block diagram of two-path entropy coder.
Fig. 3. VLC coder.
one extra bit is added to indicate whether the symbol represents
a zero run or a single sample. The RLC coder can be easily
implemented by a counter, some registers, and logic gates. When
a zero run is present, the RLC coder generates no output until
the last zero or the maximum run length is reached. Therefore, the
output of the RLC coder is not continuous, and the operation of
the downstream VLC coder is gapped.
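The behavior of the RLC coder can be sketched in software as follows. This is a behavioral illustration only, not the counter-and-register circuit. The maximum run length (`MAX_RUN`) is an assumed value, since the text does not state it, and the (flag, value) tuple stands in for the extra indicator bit.

```python
from typing import Iterable, List, Tuple

MAX_RUN = 255  # assumed maximum run length; not specified in the paper


def rlc_encode(samples: Iterable[int],
               max_run: int = MAX_RUN) -> List[Tuple[int, int]]:
    """Return (flag, value) symbols.

    flag 1: value is the length of a zero run.
    flag 0: value is a single nonzero sample, passed through unchanged.
    """
    out: List[Tuple[int, int]] = []
    run = 0
    for s in samples:
        if s == 0:
            run += 1
            if run == max_run:       # maximum run length reached: emit now
                out.append((1, run))
                run = 0
        else:
            if run:                  # the last zero of a run has been seen
                out.append((1, run))
                run = 0
            out.append((0, s))
    if run:                          # flush a run that ends the input
        out.append((1, run))
    return out
```

No symbol is produced while a run is still being counted, which is why the downstream VLC coder sees a gapped input.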
The VLC coder maps the input data into variable-length
codewords, concatenates them, and segments them into
16-bit words for output. The parallel VLC coder shown in Fig. 3
encodes each codeword in one clock cycle regardless of its
length. The functions of some major circuit components in Fig.
3 are as follows. The PLA does the table look-up of the
codewords. The barrel shifter BS1 concatenates these codewords
and the barrel shifter BS2 segments them into 16-bit
words for output. The function of a 4-bit accumulator of code
lengths is mimicked by the barrel shifter BS3 and the register
L1. The carry-out of the accumulator forms the output-available
signal of the VLC coder. The signal en in Fig. 3 is an enable
signal which is derived from the data-available signal of the RLC
coder. When there is a zero run in the source data, operation of
the VLC coder is suspended until the RLC coder obtains its run
length. During this time period, en is low and the registers in
the VLC coder retain their old data.
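A functional model of the concatenate-and-segment operation may help fix ideas. The sketch below processes one codeword per loop iteration (standing in for one clock cycle) and emits a 16-bit word whenever more than 16 bits have accumulated, so that the residual stays between 1 and 16 bits after the first symbol. The codebook is borrowed from the decoder example of Fig. 6 for illustration. Flushing the final residual is not modeled, and the function name is hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# Example codebook taken from Fig. 6 (symbol -> codeword).
CODEBOOK: Dict[str, str] = {
    "a": "00", "b": "01", "c": "100", "d": "101",
    "e": "110", "f": "1110", "g": "11110", "h": "11111",
}


def vlc_encode(symbols: Iterable[str], codebook: Dict[str, str],
               word_bits: int = 16) -> Tuple[List[str], str]:
    """Return (list of full output words, residual bits not yet output)."""
    words: List[str] = []
    residual = ""                       # role of the residual bits in W1
    for sym in symbols:                 # one codeword per clock cycle
        residual += codebook[sym]       # concatenation (role of BS1)
        if len(residual) > word_bits:   # carry-out: output available
            words.append(residual[:word_bits])   # segmentation (role of BS2)
            residual = residual[word_bits:]
    return words, residual
```

Because a codeword is at most 16 bits and the residual is at most 16 bits, at most one output word can become available per cycle, which is what allows the hardware to sustain one codeword per clock.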
The codeword look-up table can be implemented by a read-only
memory (ROM), programmable logic array (PLA), or
random access memory (RAM). Using RAM, a user-programmable
VLC can be implemented. However, the size will be
larger, the speed will be slower than with the other two approaches,
and extra circuitry is needed for preloading the codebooks. A
ROM is more suitable when the number of codebook entries is
2^n, where n is an integer; otherwise some address locations are
150 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 1, NO. 1, MARCH 1991
wasted. For these reasons, a PLA is used in our current
implementation. Although there is only one PLA shown in Fig. 3,
multiple paged PLAs will be implemented to allow different
tables for different subband signals to achieve higher compression
efficiency.
Registers W0 and L0 store the results of the PLA table
look-up, i.e., the codeword and the code length, respectively. The
maximal code length is 16 bits in our implementations. The codeword
from the PLA and in W0 is left-adjusted and stuffed with 1's, if
necessary, on the right. The first bit is on the left. The code
length is represented by 16 bits in a decoded (one-hot) form, i.e., the
position of the only 1 indicates the length. W1 stores the
concatenated previous codewords which have not yet been output.
We will call these codeword bits the residual bits. W2 is the
output latch. L1 records the number of the residual bits in W1.
C3 indicates whether there is an output available in the output
latch W2.
The parallel concatenation of codewords is done by barrel
shifters BS1 and BS2. Functionally, the output of a barrel
shifter is a sliding window on its input data. BS1 and BS2 both
provide 16-bit wide windows on their 31 input bits. BS3 has a
32-bit wide window on its 47 input bits. They all have 16
different shift positions. BS1 is controlled by the current code
length stored in L0 and shifts the codeword from W0 into W1
so that the rightmost bit of W1 is the last bit of the codeword.
Consequently, the data stored in register W1 are ready to be
concatenated with the next codewords. The barrel shifter BS2 is
controlled by the number of residual bits as recorded by L1. The
residual bits in W1 are left-adjusted by BS2. If the number of
residual bits plus the current code length is more than 16, the
output of BS2 contains the first 16 bits which have not yet been
output but are ready for output.
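The sliding-window view of a barrel shifter explains the input widths quoted above: a window of w bits with p shift positions needs w + p - 1 input bits, i.e., 16 + 15 = 31 and 32 + 15 = 47. A minimal sketch follows, with bits held in a string and the shift counted from the left; both choices are assumptions made for illustration, and `barrel_window` is a hypothetical name.

```python
def barrel_window(bits: str, shift: int, width: int = 16,
                  positions: int = 16) -> str:
    """Return the `width`-bit window that starts `shift` bits from the left."""
    if len(bits) != width + positions - 1:
        raise ValueError("input must be width + positions - 1 bits long")
    if not 0 <= shift < positions:
        raise ValueError("shift out of range")
    return bits[shift:shift + width]
```

With the defaults, `barrel_window` accepts exactly 31 input bits; calling it with `width=32` requires 47 input bits, matching the dimensions given for BS3.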
The combination of the barrel shifter BS3 and the latch L1
functions as a 4-bit accumulator of code lengths. If the residual
bit length is greater than 16, the right 16 bits of BS3's output are
all 0's and C3 is set to 1 to indicate output-available. The 16-bit
pattern from the latch L1 is partially duplicated into a 31-bit
input to BS3 so that the barrel shifter functions as a rotator. The
other 16 input bits of BS3 are connected to "0" for detecting
the carry-out condition. If the right 16 bits of BS3's output are
all 0's, there is a carry-out. The left 16 bits of BS3's output
update the number of residual bits in W1. The number of
residual bits is usually between 1 and 16, except at the beginning
of the operation when it is zero. Since L1 is 16 bits long, it can
only represent 1 to 16 in the decoded form. Thus, at the
beginning of the operation, L1 needs to be set to 16, which is
modulo-16 equivalent to zero, and C2 is set to 1 to indicate this
is a zero, not 16. The example in Fig. 4 illustrates how this VLC
coder works.
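The one-hot accumulator can be modeled behaviorally as shown below. The sketch represents a count k (1 to 16) as an integer with only bit k - 1 set, shifts that pattern by the new code length into a 32-bit field, and treats an empty lower half as the carry-out. The bit ordering and the helper names are assumptions; the actual wiring of BS3 (left and right halves, duplicated inputs) is not reproduced.

```python
from typing import Tuple

MASK16 = 0xFFFF


def one_hot(n: int) -> int:
    """Encode n (1..16) as a 16-bit pattern with a single 1."""
    if not 1 <= n <= 16:
        raise ValueError("n must be in 1..16")
    return 1 << (n - 1)


def from_one_hot(pattern: int) -> int:
    """Decode a one-hot pattern back to 1..16."""
    return pattern.bit_length()


def accumulate(l1: int, length_code: int) -> Tuple[int, int]:
    """Add a one-hot code length to the one-hot residual count.

    Returns (new residual count, carry), where carry is 1 when the sum
    exceeds 16, i.e., when a 16-bit output word is available.
    """
    shift = from_one_hot(length_code)          # 1..16
    wide = l1 << shift                         # single 1 in a 32-bit field
    carry = int((wide & MASK16) == 0)          # lower half empty: wrapped
    new_l1 = (wide | (wide >> 16)) & MASK16    # rotate within 16 bits
    return new_l1, carry
```

Starting from the initial value 16 with a 3-bit codeword, this model yields a residual count of 3 together with a carry; that carry does not correspond to a real output word, which is presumably why the text introduces a separate flag to mark the initial 16 as a zero.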
There are several reasons for using the barrel shifter BS3
instead of a 4-bit accumulator: 1) a barrel shifter is faster than an
accumulator; 2) since the output is already a 16-bit decoded
pattern, a 4-to-16 decoder is not required for BS2; and 3) the
decoded representation of a code length reduces the capacitive
loading on the bit-line in the PLA code length OR-plane.
Consequently, using the barrel shifter results in faster circuitry and
also saves design time since the design of the barrel shifter is
available anyway.
5 The 16-bit maximal code length limitation has been found to incur very
little penalty (less than 1%) on the coding rates of all the test sequences we
used. The circuits discussed here can also be modified easily to
accommodate longer code lengths.
Fig. 4. Example of VLC coder operations.
B. The Entropy Decoder
The entropy decoder contains a VLC decoder followed by an
RLC decoder. It performs the inverse function of the entropy
coder. The RLC decoder passes the VLC-decoded codewords
through if they are not run-length codes; otherwise it outputs the
specified number of zeros. During a zero run, while zeros are
being output, the operation of the VLC decoder must be
suspended. Thus, the output of the RLC decoder is continuous, but
the operation of the VLC decoder needs to be intermittent, in
analogy to the VLC coder. Similar to the RLC coder, the RLC
decoder can also be easily implemented by a counter, some
registers, and logic gates.
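The continuous output and intermittent input of the RLC decoder can be sketched with a generator that yields one sample per output clock, together with a flag telling whether a new VLC-decoded symbol was consumed in that clock. The (flag, value) symbol format and the function name are illustrative assumptions, not the chip's interface.

```python
from typing import Iterable, Iterator, Tuple


def rlc_decode_clocked(
        symbols: Iterable[Tuple[int, int]]) -> Iterator[Tuple[int, bool]]:
    """Yield (sample, fetched) once per output clock.

    symbols: (flag, value) pairs; flag 1 means a zero run of length value,
             flag 0 means a single sample equal to value.
    fetched: True when a new symbol was taken from the VLC decoder in this
             clock; False while the VLC decoder is suspended during a run.
    """
    for flag, value in symbols:
        if flag == 0:
            yield value, True
        else:
            for i in range(value):
                yield 0, i == 0   # only the first zero consumes a symbol
```

For the symbols (0, 5), (1, 3), (0, 7), five samples are produced in five consecutive clocks, but the VLC decoder is asked for data in only three of them.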
The VLC decoder is more difficult to implement than the VLC
coder. The input to the VLC decoder is a bit stream without
explicit word boundaries. The VLC decoder has to decode a
codeword, determine its length, and shift the input data stream
by the number of bits corresponding to the decoded code length,
before decoding the next codeword. These are recursive
operations that cannot be pipelined.
A block diagram of a parallel VLC decoder is shown in Fig.
5. The functions of its major components are described as
follows. The PLA is the codebook table. It matches a codeword
and outputs the corresponding symbol and code length. The code
length is accumulated by barrel shifter BS1 and register D2. The
barrel shifter BS0 then shifts its opening window to the next
codeword according to this accumulated code length. An
example of the operation of this VLC decoder is shown in Fig. 6.
In principle, a decoding table could be implemented by a
ROM; however, it would require a 2^16-word ROM, which would
be very wasteful. It is much more efficient to use a
content-addressable memory (CAM) [12] or a PLA [10], whose sizes are
determined only by the number of codebook entries. A
user-programmable VLC decoder can be implemented by using a
CAM [12]; however, it would result in a circuit much larger and
slower than a circuit using a PLA. In the following discussion,
use of a PLA is assumed. The operation of the circuitry is as
follows.
The input data are stored in registers D0 and D1. The 16-bit
pattern in D2 represents the number of decoded bits (i.e., the
accumulated code length) in D1. The number can lie between 0
and 15. This pattern controls the barrel shifter BS0 so that the
undecoded bits appear at the output of the barrel shifter.
The AND-plane of the PLA essentially performs a parallel
pattern matching on the data stream. When a codeword is
matched, the corresponding word-line in the PLA AND-plane is
activated, which enables the corresponding word transistors in
the OR-planes to output the decoded codeword and the code
length.
The decoded code length is used to control the second barrel
shifter BS1, whose function is that of a 4-bit accumulator, analogous to
Fig. 5. VLC decoder.
Example codebook (symbol, codeword, code length):
a  00     2
b  01     2
c  100    3
d  101    3
e  110    3
f  1110   4
g  11110  5
h  11111  5
Input bit stream: 01111100000111011001000110010101
Fig. 6. Example of VLC decoder operations.
the BS3 in the VLC coder. BS1 shifts the pattern of D2's output
according to the newly decoded code length. The resultant new
pattern corresponds to the accumulated code length. This new
pattern controls BS0 so as to output the correct window of 16
bits for the next decoding cycle.
When the accumulated code length exceeds 15, a carry-out bit
becomes 1. It indicates that all the bits in D1 have been used and
that D0 may not contain the whole next codeword. In this case,
when the gapped ready clock generated by the RLC decoder
latches the decoded output, a read signal is generated. The
contents of D0 are loaded into D1, a new 16-bit word is loaded
into D0, and the barrel shifter shifts to the new position, all at
the same time, to prepare for the next decoding cycle.
If the accumulated code length does not exceed 15, the
carry-out signal is 0. Since the maximum code length is 16 and
at least 16 bits of data in D0 and D1 are not used yet, there are
always enough bits for the next decoding cycle. D0 and D1 are
left unchanged. The new accumulated code length pattern simply
controls BS0 to shift to the correct position for the next
decoding cycle. Thus, VLC decoding is achieved in one clock
cycle regardless of the code length.
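The decoding cycle described in this subsection can be traced with a small behavioral model. The sketch below uses the example codebook and bit stream of Fig. 6, keeps D1 and D0 as 16-bit strings, and records the decoded symbol, the accumulated code length, and the carry-out for each cycle. It is an illustration under stated assumptions rather than a model of the circuit: the PLA match is replaced by a prefix search, and D0 is padded with zeros when the input runs out.

```python
from typing import Dict, List, Tuple

# Example codebook of Fig. 6 (codeword -> symbol).
CODEBOOK: Dict[str, str] = {
    "00": "a", "01": "b", "100": "c", "101": "d",
    "110": "e", "1110": "f", "11110": "g", "11111": "h",
}


def vlc_decode(words: List[str], codebook: Dict[str, str],
               n_symbols: int) -> List[Tuple[str, int, int]]:
    """Decode n_symbols and return (symbol, accumulated length, carry) per cycle."""
    pad = "0" * 16                    # assumed filler when no input remains
    source = iter(words)
    d1 = next(source)
    d0 = next(source, pad)
    acc = 0                           # decoded bits in D1 (role of D2), 0..15
    trace = []
    for _ in range(n_symbols):
        window = (d1 + d0)[acc:acc + 16]      # role of BS0
        for codeword, symbol in codebook.items():
            if window.startswith(codeword):   # stands in for the PLA match
                break
        else:
            raise ValueError("no codeword matches the window")
        acc += len(codeword)                  # role of BS1 and D2
        carry = int(acc > 15)
        if carry:                             # D1 used up: D0 -> D1, read new word
            acc -= 16
            d1, d0 = d0, next(source, pad)
        trace.append((symbol, acc, carry))
    return trace
```

Decoding the first eight symbols of the Fig. 6 stream gives the accumulated lengths 2, 7, 9, 11, 15, 2, 4, 6, with the single carry-out occurring on the sixth symbol, where the codeword straddles the boundary between the two 16-bit words.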
C. A Full-custom IC Implementation
Since the circuit components of the entropy coder are similar
to those of the entropy decoder and the speed requirement of the
entropy decoder is more difficult to achieve, we will focus on the
entropy decoder here. The mask size and the simulated speed of
some critical parts, the barrel shifters and the PLA, are shown in
Fig. 7. Chip layout of entropy decoder.
TABLE I
THE LAYOUT SIZE AND SIMULATION SPEED OF THE CRITICAL PARTS

Technology              1.2 μm double-metal CMOS
Barrel shifter BS0      Mask size: 304 μm × 271 μm
                        Delay time: 1.5 ns from input and shift to output
Barrel shifter BS1      Mask size: 598 μm × 282 μm
                        Delay time: ~1.5 ns from input and shift to output
PLA (with 160 entries)  Mask size: 557 μm × 2937 μm
                        Evaluation time: ~4.2 ns to wordlength OR-plane output,
                        including an output buffer (assuming the maximum number
                        of entries with the same length is 60)
Table I. A mask layout of the decoder chip is shown in Fig. 7.
Six codebooks are included in the chip. The core of the chip
contains about 37K transistors in an area of about 3.8 mm × 4.0
mm. An operating speed higher than the required 52 MHz is
anticipated. For our usage, the maximum number of entries in a
codebook is 160. However, extension to more codebook entries
is easy.
The PLAs are implemented using Domino CMOS circuits
[15], which achieve high-speed operation with low power
consumption [16]. The AND-plane and OR-plane of the PLA are
precharged in the first half of the clock cycle and evaluated in the
second. During the precharge time of the VLC decoder, the signals
are propagating through D0, D1, BS0, and the PLA address
buffers. Thus, the time during the precharge is not wasted.
The critical path of the VLC decoder includes the PLA
AND-plane evaluation time, the code length OR-plane
evaluation time, and the BS1 delay time. The PLA AND-plane
evaluation is sped up by employing larger-than-minimum size
transistors. The OR-plane evaluation time is very dependent on
the capacitive loading on the bit-lines. Since the code length is
fully decoded, the transistors in the code length OR-plane are
very sparsely populated. This greatly reduces the capacitive
loading and increases the evaluation speed. To further speed up
the operation in the OR-plane, each bit is implemented by
multiple transistors sharing drain diffusions. This improves the
ratio of transistor strength to load capacitance. Also, the
word-lines from the PLA address decoder are buffered between the
code length OR-plane and the decoded-word OR-plane. This
minimizes the capacitive loading on the word-lines which
address the code length OR-plane.
To minimize the capacitive loading on the bit-lines of the
codeword OR-plane, the transistors on the OR-plane are
populated in such a way that no bit-line ever has more than 50%
occupancy of transistor drains on it. If a bit-line has more "1"
entries, the polarity of that bit-line is inverted and the output
polarity is corrected by an inverting sense amplifier [17].
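The polarity rule can be illustrated with a short sketch. It assumes, for illustration only, that each "1" entry in a table column corresponds to one transistor drain on that bit-line; the function names are hypothetical and the sketch is unrelated to the authors' PLA generator.

```python
from typing import List, Tuple


def plan_bit_lines(table: List[str]) -> Tuple[List[bool], List[str]]:
    """Choose a polarity per bit-line so that at most half the entries are 1.

    table: one bit string per PLA entry, all of equal length.
    Returns (invert flag per column, table as it would be stored).
    """
    n_rows = len(table)
    invert = [sum(row[c] == "1" for row in table) * 2 > n_rows
              for c in range(len(table[0]))]
    stored = ["".join("1" if (bit == "1") != inv else "0"
                      for bit, inv in zip(row, invert))
              for row in table]
    return invert, stored


def read_entry(stored_row: str, invert: List[bool]) -> str:
    """Model the sense amplifiers: re-invert the columns stored inverted."""
    return "".join("1" if (bit == "1") != inv else "0"
                   for bit, inv in zip(stored_row, invert))
```

For the table 110, 100, 111, 101, only the first column is inverted; it then holds no transistors at all, while the other two columns stay at exactly 50% occupancy.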
The PLA layouts are generated by a PLA generator written in
the C language. This PLA generator makes the chips
mask-programmable for other systems using different codebooks.
IV. CODEWORD SYNCHRONIZATION AND ERROR
CONCEALMENT
One major concern in using variable-length codes is their error
propagation property. An erroneous bit from transmission or
storage of the encoded bit stream will cause the codeword to be
misinterpreted and, as the codeword's length is not fixed, this
may result in a loss of synchronization of the bit stream.
Decoding errors may propagate to the subsequent source
symbols. Although resynchronization may naturally occur after a
while [18], [19], or it can be guaranteed by careful design of the
code [20]-[22], the number of decoded symbols may not be
correct. In video or image coding applications, this would result
in a shift of part of the reconstructed picture, which is very
objectionable. A technique for resynchronizing both the
codeword and the sample position is the use of synchronizing words
at suitable intervals. These synchronizing words have to be
recognizable whether the decoding is synchronized or not. A
codeword with this property is called a clear codeword.
Basically, a clear codeword is a codeword which cannot be
generated by any concatenation of other codewords. The end-of-line
(EOL) codeword used in the international digital facsimile
coding standard [3] is an example of a clear codeword.
The identification of these clear codewords is not dependent
on the correct decoding of their preceding symbols; they can be
identified by their special codeword patterns. Thus, if there is
any bit error in the coded data stream, the error propagation is
confined at most until the next clear codeword. Furthermore,
since the number of codewords between each pair of clear
codewords is known, most of the errors can be detected by
counting the decoded samples between the clear codewords.
Such error detection not only can prevent a possible position
shift of the decoded samples following the erroneous segment,
but also can activate error concealment mechanisms for the
erroneous segment. For example, if a bad line is detected, we
may repeat the previous line. If the erroneous segment is a
high-frequency subband, we can retain the other correct subbands
and only replace the erroneous subband by zeros. According to
the study in [23], these two error concealment techniques are
very effective.
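The detection-by-counting idea and the two concealment options can be sketched as follows. The sketch assumes that the decoder has already split its output into segments at the clear codewords and that every segment should contain the same known number of samples; the function name and the `zero_fill` switch are hypothetical.

```python
from typing import List


def conceal(segments: List[List[int]], expected_len: int,
            zero_fill: bool = False) -> List[List[int]]:
    """Replace segments whose decoded sample count is wrong.

    zero_fill=False: repeat the previous good segment (e.g., previous line).
    zero_fill=True:  replace the segment by zeros (e.g., a high-frequency
                     subband whose loss is least visible).
    """
    out: List[List[int]] = []
    previous = [0] * expected_len          # used if the very first segment is bad
    for seg in segments:
        if len(seg) != expected_len:       # miscount: an error is detected
            seg = [0] * expected_len if zero_fill else list(previous)
        out.append(seg)
        previous = seg
    return out
```

Because every output segment has exactly the expected length, samples after an erroneous segment keep their correct positions, which avoids the picture shift described above.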
If multiple variable-length coded bit streams have to be
multiplexed together, usually they cannot be multiplexed directly in a
word-interleaving fashion because each bit stream has a different
number of words. By using clear codewords as segment
delimiters, they can be multiplexed segment by segment. These clear
Fig. 8. Example of construction of code tree with shorter clear codeword.
codewords are recognizable to the demultiplexer and the
segments can be demultiplexed without the requirement of
variable-length decoding first.
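Segment-wise multiplexing and demultiplexing can be sketched without any variable-length decoding. The example assumes the 8-bit clear codeword 11111110 (seven 1's followed by a 0) derived in the example at the end of this section, and assumes that the coded segments themselves never contain seven consecutive 1's. Function names are hypothetical.

```python
import re
from typing import List, Tuple


def multiplex(streams: List[List[str]], clear: str = "11111110") -> str:
    """Interleave the streams segment by segment, ending each with `clear`."""
    out = []
    for group in zip(*streams):      # one segment from each stream in turn
        for segment in group:
            out.append(segment + clear)
    return "".join(out)


def split_segments(bits: str, ones: int = 7) -> Tuple[List[str], str]:
    """Split at clear codewords consisting of `ones` 1's followed by a 0.

    Returns (segments, leftover bits after the last clear codeword).
    """
    pattern = re.compile("1{%d,}0" % ones)
    segments = []
    pos = 0
    while True:
        match = pattern.search(bits, pos)
        if match is None:
            break
        # Extra leading 1's in the run belong to the segment's last codeword.
        cut = match.end() - (ones + 1)
        segments.append(bits[pos:cut])
        pos = match.end()
    return segments, bits[pos:]
```

A demultiplexer can then hand the recovered segments to the individual decoders in round-robin order; a segment ending in 1's is still recovered correctly because only the last seven 1's before the 0 are taken as the delimiter.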
Although the clear codeword, EOL, is used in the
international digital FAX coding standard [3], the design of an optimal
Huffman codebook which also includes clear codewords has not
been shown. A clear codeword cannot be obtained automatically
from the Huffman algorithm. Usually, the codewords generated
by the Huffman algorithm are not clear, i.e., they can be
generated by a concatenation of other codewords. In order to
make a clear codeword, a reserved codeword has to be extended
by several bits. Naturally, it is desired to make these extension
bits as few as possible. With the code lengths given by the
Huffman algorithm, there are many different codes. For different
codes or different reserved codeword patterns, the number of
extension bits needed may be different.
A method of finding an efficient code with a clear codeword
will be introduced here. As shown heuristically in [24], a good
bit pattern for the reserved codeword is all 1's (or all 0's).
(Since they are equivalent, we will only discuss the former case
without loss of generality.) The all-one reserved codeword
tends to require fewer extension bits than others in order to
make it clear. To convert the reserved all-ones codeword into a
clear codeword, a suffix of a few more 1's followed by a 0 is
needed. The suffix 0 after the all 1's pattern in the clear
codeword is needed to mark the end of the clear codeword.
Otherwise, an all 1's clear codeword cannot be clearly located
when it succeeds a codeword with suffix 1's or when it precedes
a codeword with prefix 1's.
If the reserved codeword and the clear codeword have
the patterns described above, it is simple to observe that, in order
to obtain a shorter clear codeword, we should arrange the code tree
such that a concatenation of codewords other than the reserved
codeword will not form a long segment of consecutive 1's.
For any code tree, all codewords (except the reserved
all-ones codeword) must contain at least one 0; otherwise the
prefix property⁶ is contradicted. Thus, we only have to consider
the consecutive 1's formed by the concatenation of two
codewords. The longest possible run of 1's in the concatenation of
codewords, other than the reserved codeword, is formed by the
codeword with the longest suffix of 1's followed by the codeword
with the longest prefix of 1's. If the reserved codeword is n 1's,
there must be at least one codeword having a prefix of n - 1 1's
and one 0. Also, no codeword except the reserved codeword has
a prefix of 1's longer than n - 1, as this would contradict the
prefix property. Thus, for the codewords other than the reserved
one, the maximal number of prefix 1's is always n - 1.
Therefore, in order to obtain the shortest clear codeword, we
only need to minimize the maximal number of consecutive 1's in
the suffixes of the codewords. The resulting clear codeword is
the reserved all 1's followed by this maximal number of suffix 1's
and a 0. Thus, knowing the code length for each codeword from
the Huffman algorithm, we can rearrange the shape of the code
tree such that the longest suffix of 1's of the codewords (excluding
the reserved codeword) is minimized.
In the tree representation of a variable-length code, the
number of leaves on level k is the number of codewords with
length k, while the root is viewed as level 0. The key idea for
minimizing the suffix 1's of codewords is the assignment of
nodes with the longest suffix 1's (except the all-one reserved
pattern) as leaves on each level. Thus, the nodes with longer
suffix 1's are terminated into leaves, thereby preventing the suffix
1's from growing to the next level. The assignment will result in a code
tree with the shortest suffix 1's of its codewords. This code tree is
thus the optimal code tree in the sense that its reserved all-one
codeword needs the fewest extension bits to make it clear.
An example of this optimal code tree is shown in Fig. 8. In
this example, the variable-length code contains three 3-bit
codewords, six 4-bit codewords, six 5-bit codewords, and four 6-bit
codewords. In total, there are 19 codewords, including the
reserved codeword with a code length of 5 bits. The first and second
levels have no leaves since no codeword is one or two bits long.
On the third level, there are 8 nodes and three of them must be
assigned as leaves, since three 3-bit codewords are required.
The nodes with longer suffix 1's (except the all-one node) are
assigned to be leaves first. For example, node 011 has the
longest suffix 1's (except the all-one pattern), so it is the first
node we want to assign as a leaf. Besides this, nodes 001 and
101, which have the second longest suffix 1's, are also assigned
as leaves. The leaf assignment for the other levels is similar
except that, on level 5, the node 11111 is chosen first as a leaf since
it is the required 5-bit reserved codeword. In this example, the
longest suffix of 1's is 2 bits. To make the reserved 11111 clear, it
has to be extended by three more bits to 11111110.
6 No codeword is a prefix of any other codeword.
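The level-by-level leaf assignment can be checked on this example with a short sketch. It assumes that ties among nodes with equally long suffix 1's may be broken arbitrarily (the text does not specify a rule), so the individual codewords may differ from those of Fig. 8 even though the resulting clear codeword is the same. Function names are hypothetical.

```python
from typing import Dict, List


def suffix_ones(code: str) -> int:
    """Number of consecutive 1's at the end of a codeword."""
    return len(code) - len(code.rstrip("1"))


def build_code(leaf_counts: Dict[int, int], reserved_len: int) -> List[str]:
    """Assign leaves level by level, longest suffix 1's first.

    leaf_counts:  {code length: number of codewords}, reserved one included.
    reserved_len: length of the reserved all-ones codeword.
    """
    nodes = [""]                       # open nodes on the current level
    codewords: List[str] = []
    for level in range(1, max(leaf_counts) + 1):
        nodes = [n + b for n in nodes for b in "01"]
        all_ones = "1" * level

        def priority(code: str):
            if code == all_ones:
                # Reserved pattern: first on its own level, last elsewhere.
                return (0, 0) if level == reserved_len else (2, 0)
            return (1, -suffix_ones(code))

        nodes.sort(key=priority)
        want = leaf_counts.get(level, 0)
        codewords.extend(nodes[:want])     # these nodes become leaves
        nodes = nodes[want:]               # the rest branch to the next level
    return codewords


def clear_codeword(codewords: List[str], reserved: str) -> str:
    """Reserved all-ones word, plus the longest suffix run of 1's, plus a 0."""
    longest = max(suffix_ones(c) for c in codewords if c != reserved)
    return reserved + "1" * longest + "0"
```

For the code-length profile of the example (three 3-bit, six 4-bit, six 5-bit, and four 6-bit codewords, with a 5-bit reserved codeword), the sketch selects 011, 001, and 101 on level 3 and arrives at the clear codeword 11111110.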
V. SUMMARY
In this paper, a complete entropy coding system for HDTV
applications is described. Parallel structures of the entropy coder
and decoder are introduced. This parallel entropy coder (or
decoder) encodes (or decodes) each codeword in one clock cycle
regardless of its code length. Thus, the required clock rate is
lower and a parallel processing system is also easy to design.
The parallel entropy coder and decoder are implemented in
two experimental prototype chips which are capable of encoding
and decoding 52 million samples/s. Clear codewords are
introduced for variable-length codeword synchronization and
multiplexing. A systematic method of designing an efficient code with
clear codewords is described.
ACKNOWLEDGMENT
We would like to thank J. A. Bellisio, P. E. Fleischer, and
M. E. Lukacs for stimulating discussions and valuable
comments. We would also like to thank S. Palaniraj for helping us
with the layouts and writing the PLA generator.
REFERENCES
[1] W. K. Pratt, Digital Image Processing. New York: Wiley,
1978, p. 632.
[2] D. A. Huffman, "A method for the construction of minimum
redundancy codes," Proc. IRE, vol. 40, pp. 1098-1101, Sept.
1952.
[3] R. Hunter and A. H. Robinson, "International digital facsimile
coding standards," Proc. IEEE, vol. 68, no. 7, pp. 854-867,
July 1980.
[4] CCITT SGXV, "Draft Revised Recommendation H.261-Video
codec for audiovisual services at p × 64 kbit/s," COM XV-R
17-E, CCITT Study Group XV-Report R 17, Specialist Group
on Coding for Visual Telephony, Jan. 1990.
[5] W. H. Chen and W. K. Pratt, "Scene Adaptive Coder," IEEE
Trans. Commun., vol. COM-32, no. 3, pp. 225-232, Mar.
1984.
[6] T.-C. Chen, P. E. Fleischer, and S.-M. Lei, "A subband scheme
for advanced TV coding in BISDN applications," presented at
3rd Int. Workshop on HDTV, Italy, Aug. 1989.
[7] P. E. Fleischer, T.-C. Chen, and S.-M. Lei, "Coding of
advanced TV for BISDN using multiple subbands," Proc. of Int.
Symp. on Circuits and Systems, New Orleans, LA, pp.
1314-1318, May 1990.
[8] R. Ballart and Y.-C. Ching, "SONET: Now it's the standard
optical network," IEEE Communications Magazine, Mar.
1989.
[9] J. L. Sicre and A. Leger, "Silicon complexity of VLC decoder
vs Q-coder," JPEG N258, ISO/JTC1/SC2/WG8, CCITT SGVII,
Feb. 1989.
[10] J. W. Peake, "Decompaction," IBM Technical Disclosure
Bulletin, vol. 26, no. 9, pp. 4794-4797, Feb. 1984.
[11] M. E. Lukacs, "Variable word length coding for a high data rate
DPCM video coder," in Proc. Picture Coding Symp., pp.
54-56, 1986.
[12] M.-T. Sun, K.-M. Yang, and K.-H. Tzou, "A high-speed
programmable VLSI for decoding variable-length codes,"
Applications of Digital Image Processing XII, A. G. Tescher, ed.,
Proc. SPIE 1153, Aug. 1989.
[13] M.-T. Sun and S.-M. Lei, "A parallel variable-length-code
decoder for advanced television applications," presented at 3rd
Int'l Workshop on HDTV, Italy, Aug. 1989.
[14] S.-M. Lei, M.-T. Sun, K. Ramachandran, and S. Palaniraj,
"VLSI implementation of an entropy coder and decoder for
advanced TV applications," in Proc. of Int. Symp. on Circuits
and Systems, New Orleans, LA, pp. 3030-3033, May 1990.
[15] R. H. Krambeck, C. M. Lee, and H. S. Law, "High speed
compact circuits with CMOS," IEEE J. Solid-State Circuits,
vol. SC-17, pp. 614-619, June 1982.
[16] J. A. Pretorius, A. S. Shubat, and A. T. Salama, "Charge
redistribution and noise margins in domino CMOS logic," IEEE
Trans. Circuits Syst., vol. CAS-33, no. 8, pp. 786-793, Aug.
1986.
[17] P. C. Rossbach, R. W. Linderman, and D. M. Gallagher, "An
optimizing XROM silicon compiler," Proc. IEEE Custom
Integrated Circuits Conf., Portland, OR, pp. 13-16, May 4-7,
1987.
[18] J. C. Maxted and J. P. Robinson, "Error recovery for variable
length codes," IEEE Trans. Inform. Theory, vol. IT-31, no.
6, pp. 794-801, Nov. 1985.
[19] B. Rudner, "Construction of minimum-redundancy codes with an
optimum synchronizing property," IEEE Trans. Inform.
Theory, vol. IT-17, pp. 478-487, July 1971.
[20] T. J. Ferguson and J. H. Rabinowitz, "Self-synchronizing
Huffman codes," IEEE Trans. Inform. Theory, vol. IT-30,
no. 4, pp. 687-693, July 1984.
[21] P. G. Neumann, "Self-Synchronizing Sequential Coding with
Low Redundancy," Bell Sys. Tech. Journal, vol. 50, no. 3, pp.
951-981, Mar. 1971.
[22] P. G. Neumann, "Efficient Error-Limiting Variable-Length
Codes," IRE Trans. Inform. Theory, vol. IT-8, pp. 292-304,
July 1962.
[23] D. S. Lee and K. H. Tzou, "Hierarchical DCT coding of HDTV
for ATM networks," Proc. ICASSP, vol. 4, pp. 2249-2252,
Apr. 1990.
[24] S.-M. Lei, "The construction of efficient variable-length codes
with clear synchronizing codewords for digital video
applications," Packet Video '91, Kyoto, Japan, Mar. 18-19,
1991.
[25] T.-C. Chen, P. E. Fleischer, and K.-H. Tzou, "Multiple
Block-size Transform Coding for Video Using a Subband Structure,"
IEEE Trans. Circuits Syst. Video Technol., vol. 1, no. 1, Mar.
1991.
Shaw-Min Lei (S'87-M'88) received the B.S.
and M.S. degrees from the National Taiwan
University, Taipei, R.O.C., in 1980 and 1982,
and the Ph.D. degree from the University of
California, Los Angeles, in 1988, all in
electrical engineering.
From 1982 to 1984, he was an Instructor of
Electrical Engineering at the Naval Academy,
Taiwan. He has been with Bellcore, Red Bank, NJ,
since August 1988. Presently, he is a Member
of Technical Staff in the Digital Video District.
His current research interests include video coding, HDTV signal
processing, digital filter structure, VLSI architecture for digital signal
processing, data compression, and error control coding.
Ming-Ting Sun (S'79-M'85-SM'89) received
the B.S. degree from National Taiwan
University in 1976, the M.S. degree from the
University of Texas at Arlington in 1981, and the
Ph.D. degree from the University of California,
Los Angeles, in 1985, all in electrical
engineering.
He has been with Bellcore, Red Bank, NJ,
since 1985, where he is a Member of Technical
Staff. His research interests include VLSI
architecture and algorithms for video processing,
digital signal processing, and adaptive filters.
Dr. Sun received an Award of Excellence from Bellcore in 1987. He
has authored about 30 publications and has been awarded a
patent. He is the Chairman of the IEEE CAS Standards Committee and
is an Associate Editor of the IEEE TRANSACTIONS ON CIRCUITS AND
SYSTEMS FOR VIDEO TECHNOLOGY.