The JPEG2000 Still Image
Compression Standard
Majid Rabbani
Eastman Kodak Research Laboratories
Majid.Rabbani@kodak.com
Diego Santa Cruz
Swiss Federal Institute of Technology, Lausanne
(EPFL)
Diego.SantaCruz@epfl.ch
Table of Contents
Introduction and Background
Image Compression Standardization Activities
What is JPEG?
Baseline JPEG Encoder/Decoder
Baseline JPEG Pros and Cons
JPEG2000 Compression Paradigm
JPEG2000 Objectives
JPEG2000 Timetable
The JPEG 2000 Standard
Preprocessing
Reversible Color Transform
Irreversible Color Transform
Discrete Wavelet Transform (DWT)
1D Discrete Wavelet Transform (DWT)
Example of Analysis Filter Banks
2D Wavelet Decomposition
Bi-Orthogonal Filter Banks
Filter Normalization
Examples
Signal Symmetric Boundary Extension
Subband L2 Norms
Complexity Issues
The Lifting Scheme
Integer (5,3) Filter Lifting Example
Integer (13,7) Filter Lifting Example
Daubechies (9,7) Filter Lifting Example
Integer-to-Integer Transforms
JPEG2000 DWT Choices
Quantization
Quantizer Choices in Part 1
Uniform Scalar Quantizer with Deadzone
The Human Visual System (HVS)
Embedded Quantization
Entropy (Tier 1) Coding
Information Content and Ideal Codelength
Arithmetic Coding
Code-blocks in Wavelet Domain
Code-block Size Restrictions
R-D Optimized Embedded Coding
Bit-Plane Coding Passes
Bit-Plane Coding Example
Entropy (Tier 2) Coding
Tier 2 role
Example of bit-plane pass coded data
Layers
Precinct partition
Packets
Packet head encoding: Tag trees
Packet output to codestream
Layer (SNR) progressive example
Resolution progressive example
Codestream
Rate allocation
Rate allocation principle
Post-compression rate allocation
Efficient rate-distortion estimation
Visual frequency weighting
Region of Interest (ROI) coding
Region of Interest coding principle
ROI mask
Encoding ROIs: General scaling
Encoding ROIs: Maxshift
General scaling Pros & Cons
Maxshift Pros & Cons
ROI Maxshift example
Error resilience
Error-prone channels
Error effects
Protecting code-block data
Protecting packet heads
File format: JP2
JP2
JP2: basic boxes
JP2: color
Part 2 Extensions
Trellis Coded Quantization (TCQ)
Visual masking
Arbitrary wavelets
Single sample overlap (SSO)
Multiple component transformation
JPX file format
Other Part 2 extensions
Part 1 Amendments
AMD-1: Profiles
AMD-2: sYCC colorspace (JP2)
AMD-3: Efficient geometric manipulations (proposal)
Other JPEG 2000 Features
JPEG 2000 Feature Summary
JPEG 2000 performance assessment
Which algorithms to compare to?
MPEG-4 VTC
JPEG-LS
PNG
SPIHT
Test images
Test conditions
Lossless compression
Non-progressive lossy compression
SNR progressive lossy compression
ROI Maxshift coding
Error resilience
Supported functionality: subjective evaluation
Introduction and
Background
Image Compression Standards
Facilitates the exchange of compressed data between
various devices and applications.
Economy of scale: permits common hardware/software to
be used for a wide range of products, thus lowering the
cost and shortening the development time.
Provides reference points for the expected quality of
compressed images.
International Standard Organizations
ISO/IEC
International Organization for Standardization
Deals with information processing, e.g., image storage
and retrieval.
ITU-T
International Telecommunication Union, Telecommunication Standardization Sector
Formerly known as CCITT.
Deals with information transmission.
Image Compression Standards
Binary (bi-level) images:
Group 3 & 4 (1980); JBIG (1994); JBIG2 (ongoing)
Continuous-tone still images:
JPEG (1992); JPEG2000 (IS Part 1 in Dec. 2000, other
parts ongoing)
Image sequences (moving pictures):
H.261 (1990); H.263 (1995); H.263+ (1997); H.263L
MPEG-1 (1994); MPEG-2 (1995);
MPEG-4 (1999); MPEG-7 (new)
What Is JPEG?
The JPEG (Joint Photographic Experts Group)
committee, formed in 1986, has been chartered with the
"Digital compression and coding of continuous-tone still
images"
Joint between ISO and ITU-T
Has developed standards for the lossy, lossless, and
nearly lossless compression of still images in the past
decade
Website: www.jpeg.org
JPEG Summary
The JPEG committee has published the following standards:
ISO/IEC 10918-1 | ITU-T Rec. T.81: Digital Compression
and Coding of Continuous-Tone Still Images: Requirements
and guidelines
ISO/IEC 10918-2 | ITU-T Rec. T.83: Compliance testing
ISO/IEC 10918-3 | ITU-T Rec. T.84: Extensions
ISO/IEC 14495-1 | ITU-T Rec. T.87: Lossless and Near-Lossless Compression of Continuous-Tone Still Images: Baseline
ISO/IEC 15444-1 | ITU-T Rec. T.800: JPEG2000 Image
Coding System.
Examples of JPEG Applications
Consumer imaging (digital cameras, picture disk, etc.)
Professional imaging (desktop publishing, graphic arts,
digital cameras, etc.)
Medical imaging
Remote sensing
Internet imaging
Scanning and printing
Image databases
Mobile
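Baseline JPEG Encoder Block Diagram
[Figure: 8×8 Blocks → FDCT → Quantizer → Huffman Encoder → Header + Compressed Data. The Quantizer uses the Quantization Tables; the Huffman Encoder uses the Huffman Tables.]
Baseline JPEG Decoder Block Diagram
[Figure: Header + Compressed Data → Huffman Decoder → Dequantizer → IDCT → Reconstructed Image Data. The Huffman Decoder uses the Huffman Tables; the Dequantizer uses the Quantization Tables.]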
Baseline JPEG Pros and Cons
Advantages
Memory efficient
Low complexity
Compression efficiency
Visual model utilization
Robustness
Disadvantages
Single resolution
Single quality
No target bit rate
No lossless capability
No tiling
No region of interest
Blocking artifacts
Poor error resilience
JPEG2000 Compression Paradigm
Encode
Decode
Coding efficiency
Multiresolution
Target bit rate
Quality scalability
Lossless to lossy progression
Tiling
Improved error resilience
Flexible bit stream syntax
Idempotent recompression
File format
0.125 bpp (64:1 CR)
0.25 bpp (32:1 CR)
0.50 bpp (16:1 CR)
1.0 bpp (8:1 CR)
Overall 0.25 bpp, ROI 0.75 bpp, BG 0.10 bpp
JPEG2000 Objectives
Advanced standardized image coding system to serve
applications into the next millennium
Address areas where current standards fail to produce the
best quality or performance
Provide capabilities to markets that currently do not use
compression
Provide an open system approach to imaging applications
The JPEG 2000 Standard
Part 1: Core Image Coding System (royalty and fee free)
Part 2: Extensions (some technology covered by IPR)
Part 3: Motion JPEG 2000
Part 4: Conformance Testing
Part 5: Reference Software
Part 6: Compound Image File Format
JPEG 2000 Timetable
Part  Title                                CFP    WD     CD     FCD    FDIS   IS
1     JPEG 2000 Core Image Coding System   97/03  99/03  99/12  00/03  00/10  00/12
2     JPEG 2000 Image Coding Extensions    97/03  00/03  00/08  00/12  01/07  01/11
3     Motion JPEG 2000                     99/12  00/07  00/12  01/03  01/07  01/11
4     Conformance Testing                  99/12  00/07  00/12  01/07  01/11  02/03
5     Reference Software                   99/12  00/03  00/07  00/12  01/07  01/11
6     Compound Image File Format           97/03  00/12  01/03  01/07  01/11  02/03
JPEG2000 Fundamental Building Blocks
[Figure: Original Image Data → Pre-Processing → Discrete Wavelet Transform (DWT) → Uniform Quantizer with Deadzone → Adaptive Binary Arithmetic Coder (Tier-1 Coding) → Bitstream Organization (Tier-2 Coding) → Compressed Image Data]
Pre-Processing
The input image is partitioned into rectangular and non-overlapping tiles of equal size (except possibly for those
tiles at the image borders) that are compressed
independently using their own set of specified compression
parameters.
The unsigned sample values in each component are level
shifted (DC offset) by subtracting a fixed value from each
sample to make its value symmetric around zero.
The level-shifted values can be subjected to a forward
point-wise inter-component transformation to decorrelate
the color data. A restriction is that the components must
have identical bit-depths and dimensions.
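The first two pre-processing steps can be sketched in a few lines of Python. This is only an illustration: the function names and the tile size are made up for the example, and the offset 2^(B-1) for B-bit unsigned samples is an assumption here, since the slide only says "a fixed value".

```python
import numpy as np

def split_into_tiles(image, tile_h, tile_w):
    """Return non-overlapping tiles in raster order; border tiles may be smaller."""
    tiles = []
    for top in range(0, image.shape[0], tile_h):
        for left in range(0, image.shape[1], tile_w):
            tiles.append(image[top:top + tile_h, left:left + tile_w])
    return tiles

def level_shift(samples, bit_depth):
    """Subtract 2**(bit_depth - 1) (assumed offset) from unsigned samples."""
    return samples.astype(np.int32) - (1 << (bit_depth - 1))

# A 5x7 single-component image split into 4x4 tiles (hypothetical sizes).
image = np.arange(5 * 7, dtype=np.uint8).reshape(5, 7)
tiles = split_into_tiles(image, 4, 4)
shifted = level_shift(tiles[0], 8)
```

Each tile would then be transformed and coded on its own, which is what allows independent decoding of image regions.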
Reversible Color Transform (RCT)
Two color transforms have been defined in JPEG2000.
The reversible color transform (RCT) that is integer-to-integer and is intended for lossless coding.
The irreversible color transform (ICT) that is the same
as the conventional RGB to YCbCr transform.
Forward RCT:
$$ Y = \left\lfloor \frac{R + 2G + B}{4} \right\rfloor, \qquad C_b = B - G, \qquad C_r = R - G $$
Inverse RCT:
$$ G = Y - \left\lfloor \frac{C_b + C_r}{4} \right\rfloor, \qquad R = C_r + G, \qquad B = C_b + G $$
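A minimal sketch of the RCT, assuming integer (level-shifted) samples held in NumPy arrays and using floor division for the integer rounding. The function names are illustrative only. It shows that the inverse recovers the input exactly.

```python
import numpy as np

def rct_forward(r, g, b):
    """Forward RCT on integer arrays; // is floor division."""
    y = (r + 2 * g + b) // 4
    cb = b - g
    cr = r - g
    return y, cb, cr

def rct_inverse(y, cb, cr):
    """Inverse RCT; recovers G first, then R and B."""
    g = y - (cb + cr) // 4
    r = cr + g
    b = cb + g
    return r, g, b

# Random level-shifted 8-bit samples (test data, not from the slides).
rng = np.random.default_rng(0)
r, g, b = rng.integers(-128, 128, size=(3, 1000))
y, cb, cr = rct_forward(r, g, b)
r2, g2, b2 = rct_inverse(y, cb, cr)
```

The round trip is exact because C_b + C_r = R + B - 2G, so the two floor terms differ by exactly G.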
Irreversible Color Transform (ICT)
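The ICT is the same as the conventional YCbCr transform for
the representation of image and video signals:
$$ Y = 0.299\,(R - G) + G + 0.114\,(B - G) $$
$$ C_b = 0.564\,(B - Y) \quad \text{and} \quad C_r = 0.713\,(R - Y) $$
$$ \begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix} =
\begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.169 & -0.331 & 0.500 \\ 0.500 & -0.419 & -0.081 \end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix} $$
$$ \begin{bmatrix} R \\ G \\ B \end{bmatrix} =
\begin{bmatrix} 1.0 & 0.0 & 1.4021 \\ 1.0 & -0.3441 & -0.7142 \\ 1.0 & 1.7718 & 0.0 \end{bmatrix}
\begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix} $$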
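The matrix form of the ICT can be tried out directly. The sketch below uses the rounded coefficients from the slide, so the round trip is only approximately the identity; the sample pixels and function names are made up for the example.

```python
import numpy as np

# Coefficients as printed on the slide (rounded values).
FORWARD = np.array([[0.299, 0.587, 0.114],
                    [-0.169, -0.331, 0.500],
                    [0.500, -0.419, -0.081]])
INVERSE = np.array([[1.0, 0.0, 1.4021],
                    [1.0, -0.3441, -0.7142],
                    [1.0, 1.7718, 0.0]])

def ict_forward(rgb):
    """rgb has shape (..., 3); returns (Y, Cb, Cr) along the last axis."""
    return rgb @ FORWARD.T

def ict_inverse(ycc):
    """ycc has shape (..., 3); returns (R, G, B) along the last axis."""
    return ycc @ INVERSE.T

rgb = np.array([[10.0, -20.0, 30.0],
                [127.0, 127.0, 127.0],
                [-128.0, 0.0, 64.0]])
ycc = ict_forward(rgb)
back = ict_inverse(ycc)
max_err = np.max(np.abs(back - rgb))
```

A gray pixel (R = G = B) maps to zero chroma, and the small residual in `max_err` comes from the rounded coefficients, which is one reason this transform is called irreversible.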
Discrete Wavelet Transform
(DWT)
Transformation in JPEG2000 Part 1
[Figure: Original Image Data → Discrete Wavelet Transform (DWT) → Wavelet Coefficients]
Multi-resolution image representation is inherent to DWT.
The full-frame nature of the transform decorrelates the
image across a larger scale and eliminates blocking
artifacts at high compression ratios.
Use of integer DWT filters allows for both lossless and
lossy compression within a single compressed bitstream.
DWT provides a frequency band decomposition of the
image where each subband can be quantized according to
its visual importance.
1D Discrete Wavelet Transform (DWT)
Two procedures for performing the DWT exist that lead to
identical results: (i) Convolution, and (ii) Lifting.
The forward discrete wavelet transform (DWT)
decomposes a 1D sequence (e.g., line of an image) into two
sequences (called subbands), each with half the number of
samples, according to the following procedure:
The 1D sequence is separately lowpass and highpass
filtered.
The filtered signals are downsampled by a factor of two to
form the lowpass and highpass subbands.
The two filters are called the analysis filterbank.
The 1D Two-Band DWT
[Figure: analysis filterbank. The input x(n) is filtered by the lowpass filter h0 to give the lowpass output and by the highpass filter h1 to give the highpass output.]
Example of Analysis Filter-Bank
1D signal: 100 100 100 100 200 200 200 200 ...
Lowpass filter h0: (-1 2 6 2 -1)/8
Highpass filter h1: (-1 2 -1)/2
Before downsampling:
Lowpass: 100 100 87.5 112.5 187.5 212.5 200 200 ...
Highpass: 0 0 0 -50 50 0 0 0 ...
After downsampling:
Lowpass: 100 112.5 212.5 200 ...
Highpass: 0 0 50 ...
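The numbers in this example can be reproduced with a short NumPy sketch. Two things are assumptions of the sketch rather than statements from the slide: the finite 8-sample signal is extended by mirroring at both ends, and the lowpass subband keeps the odd-indexed samples while the highpass subband keeps the even-indexed ones.

```python
import numpy as np

x = np.array([100, 100, 100, 100, 200, 200, 200, 200], dtype=float)
h0 = np.array([-1, 2, 6, 2, -1]) / 8   # lowpass analysis filter
h1 = np.array([-1, 2, -1]) / 2         # highpass analysis filter

def filter_centered(signal, taps):
    """Apply a symmetric odd-length filter centered on each sample,
    mirroring the signal at both ends (whole-sample symmetric)."""
    reach = len(taps) // 2
    extended = np.pad(signal, reach, mode="reflect")
    return np.convolve(extended, taps, mode="valid")

low = filter_centered(x, h0)    # before downsampling
high = filter_centered(x, h1)   # before downsampling

low_sub = low[1::2]     # assumed phase: odd-indexed samples
high_sub = high[0::2]   # assumed phase: even-indexed samples
```

Each subband ends up with half the number of samples of the input, as described for the forward DWT.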
Inverse DWT
During the inverse DWT, each subband is interpolated by a
factor of two by inserting zeros between samples and then
filtering each resulting sequence with the corresponding
lowpass, g0, or highpass, g1, synthesis filterbank.
The filtered sequences are added together to form an
approximation to the original signal.
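[Figure: the lowpass subband (100, 112.5, 212.5, 200, ...) and the highpass subband (0, ..., 50, ...) are interpolated, filtered, and added to give the reconstructed signal 100 100 100 100 200 200 200 200 ...]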
The 1D Two-Band DWT
[Figure: x(n) passes through the analysis filterbank (lowpass h0, highpass h1) and then through the synthesis filterbank (lowpass g0, highpass g1) to produce y(n).]
Ideally, it is desired to choose the analysis filterbank (h0 and
h1), and the synthesis filterbank (g0 and g1), in such a way so as
to make the overall distortion zero, i.e., x(n) = y(n). This is
called the perfect reconstruction property.
1-Level, 2-D Wavelet Decomposition (DC=1, AC=4)
2-Level, 2-D Wavelet Decomposition (DC=1, AC=4)
3-Level, 2-D Wavelet Decomposition (DC=1, AC=4)
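2D Wavelet Decomposition
[Figure: subband layout of a 3-level 2-D decomposition with horizontal and vertical axes: LL3, HL3, LH3, HH3 at the coarsest level, then HL2, LH2, HH2, then HL1, LH1, HH1.]
Resolution 0: LL3
Res 1: Res 0 + LH3 + HL3 + HH3
Res 2: Res 1 + LH2 + HL2 + HH2
Res 3: Res 2 + LH1 + HL1 + HH1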
Bi-Orthogonal Filter Banks
Most wavelet-based image compression systems use a class
of analysis/synthesis filters known as bi-orthogonal filters:
The basis functions corresponding to h0(n) and g1(n) are
orthogonal; and the basis functions for h1(n) and g0(n) are
orthogonal.
Linear-phase (symmetrical) and perfect reconstruction.
Unequal length; odd-length filters differ by an odd
multiple of two (e.g., 7/9), while even-length filters differ
by an even multiple of two (e.g., 6/10).
Symmetric boundary extension.
Bi-Orthogonal Filter Symmetry Conditions
Let h0 denote the lowpass finite impulse response (FIR)
analysis filter, and h1 denote the highpass FIR analysis
filter in a bi-orthogonal filter bank. Two situations occur:
Both filters are odd-length. Then h0 and h1 are both
symmetric and are referred to as whole-sample
symmetric (WSS) filters.
Both filters are even-length. Then h0 is symmetric and
h1 is antisymmetric, and they are referred to, respectively, as
half-sample symmetric (HSS) and half-sample
antisymmetric (HSA).
Due to the symmetry conditions, only half of the filter
coefficients need to be transmitted.
WSS, HSS, and HSA Filters
[Figure: example HSS, HSA, and WSS filter shapes plotted over sample positions -1, 0, +1.]
Bi-Orthogonal DWT
[Figure: x(n) passes through the analysis filterbank (lowpass h0, highpass h1) and then through the synthesis filterbank (lowpass g0, highpass g1) to produce y(n).]
h0 is orthogonal to g1
h1 is orthogonal to g0
Filter Coefficient Normalization
The expression $\sum_n h_0(n)$ denotes the DC gain of the
analysis lowpass filter, while $\sum_n (-1)^n h_1(n)$ denotes the
Nyquist gain of the highpass analysis filter. The synthesis
filters are related to the analysis filters by the following
alias cancellation relations:
$$ g_0(n) = \alpha\,(-1)^n h_1(n), \qquad g_1(n) = \alpha\,(-1)^n h_0(n) $$
where $\alpha$ is a normalization constant given by:
$$ \alpha = \frac{2}{\sum_n h_0(n) \sum_n (-1)^n h_1(n) + \sum_n h_1(n) \sum_n (-1)^n h_0(n)} $$
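As a worked instance of these relations, the sketch below evaluates the gains, the constant α, and the synthesis filters for the (5,3) filter pair. The index convention is an assumption of the example: h0 is centered at n = 0 and h1 at n = -1, following the filter tables in this deck. Filters are stored as dictionaries mapping n to the tap value.

```python
def dc_gain(h):
    """Sum of the taps."""
    return sum(h.values())

def nyquist_gain(h):
    """Sum of the taps weighted by (-1)**n."""
    return sum((-1 if n % 2 else 1) * v for n, v in h.items())

def synthesis_from_analysis(h0, h1):
    """Evaluate alpha, g0, and g1 from the relations above."""
    alpha = 2.0 / (dc_gain(h0) * nyquist_gain(h1)
                   + dc_gain(h1) * nyquist_gain(h0))
    g0 = {n: alpha * (-1 if n % 2 else 1) * v for n, v in h1.items()}
    g1 = {n: alpha * (-1 if n % 2 else 1) * v for n, v in h0.items()}
    return alpha, g0, g1

# (5,3) analysis filters; index placement is an assumption (see lead-in).
h0 = {-2: -1/8, -1: 2/8, 0: 6/8, 1: 2/8, 2: -1/8}
h1 = {-2: -1/2, -1: 1.0, 0: -1/2}

alpha, g0, g1 = synthesis_from_analysis(h0, h1)
```

With this convention the lowpass synthesis filter comes out as (1 2 1)/2, i.e. with a DC gain of two, which complements the analysis DC gain of one.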
Daubechies (9,7) Filter
The following filter has been normalized to a DC gain of
one, and a Nyquist gain of two, as is implemented in the
JPEG2000 standard.
n     h0(n)              n       h1(n)
0     +0.602949018236    -1      +1.115087052456
±1    +0.266864118442    -2, 0   -0.591271763114
±2    -0.078223266528    -3, 1   -0.057543526228
±3    -0.016864118442    -4, 2   +0.091271763114
±4    +0.026748757410
Daubechies (10,18) Filter
n       h0(n)              h1(n)
-1, 0   +0.536628801791    +0.440781829293
-2, 1   +0.054299075394    -0.115519002860
-3, 2   -0.111388018824    -0.060571607153
-4, 3   +0.000058297264    +0.009733420188
-5, 4   +0.020401844374    +0.021802742673
-6, 5                      +0.001787592313
-7, 6                      -0.006683900685
-8, 7                      +0.000001928418
-9, 8                      +0.000674873932
Daubechies (6,10) Filter
n       h0(n)              h1(n)
-1, 0   +0.557543526228    +0.869813136679
-2, 1   +0.033728236885    -0.188640851913
-3, 2   -0.091271763114    -0.095087384971
-4, 3                      -0.009884638967
-5, 4                      +0.026748757410
(2,10) Integer Filter
n       h0(n)    h1(n)
-1, 0   +1/2     +1
-2, 1            -22/128
-3, 2            -22/128
-4, 3            +3/128
-5, 4            +3/128
CRF* (13,7) & Sweldens (13,7) Integer Filters
n     h0(n) (SWE)   h0(n) (CRF)    n       h1(n)
0     +348/512      +164/256       -1      +1
±1    +144/512      +80/256        -2, 0   -9/16
±2    -63/512       -31/256        -3, 1   0
±3    -16/512       -16/256        -4, 2   +1/16
±4    +18/512       +14/256
±5    0             0
±6    -1/512        -1/256
* Canon Research France
(5,3) and (2,6) Integer Filters
(5,3) filter:
n     h0(n)    n       h1(n)
0     +6/8     -1      +1
±1    +2/8     -2, 0   -1/2
±2    -1/8
(2,6) filter:
n       h0(n)    h1(n)
-1, 0   +1/2     +1
-2, 1            -1/128
-3, 2            -1/128
Symmetric Boundary Extensions
Consider a 1D signal of length N0. On the analysis side:
For transformation by WSS filters, the signal is
extended to a whole-sample symmetric signal of length
2N0 - 2 and made periodic.
For HS-type filters, it is extended to a HSS signal of
length 2N0 and made periodic.
In each case, the filters are extended with zeros to length N0
and applied by N0-periodic convolution.
The type of symmetric boundary extension required for
the synthesis side depends on the filter symmetries and the
odd or even size of the signal.
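The two extensions differ only in whether the end samples are repeated. A minimal sketch, with illustrative function names, that builds one period of each extension:

```python
import numpy as np

def wss_period(x):
    """One period (length 2*N0 - 2) of the whole-sample symmetric extension:
    the signal is mirrored about its end samples, which are not repeated."""
    return np.concatenate([x, x[-2:0:-1]])

def hss_period(x):
    """One period (length 2*N0) of the half-sample symmetric extension:
    the signal is mirrored about the half-sample points, so the end
    samples are repeated."""
    return np.concatenate([x, x[::-1]])

x = np.array([1, 2, 3, 4])   # N0 = 4
wss = wss_period(x)
hss = hss_period(x)
```

For this input, `wss` is 1 2 3 4 3 2 and `hss` is 1 2 3 4 4 3 2 1; repeating either period gives the periodic signal that the filters are applied to.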
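Symmetric Boundary Extensions
[Figure: the original signal on samples 0 to N0-1; its WSS extension, with one period covering samples 0 to 2N0-3; and its HSS extension, with one period covering samples 0 to 2N0-1.]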
Subband L2-Norms
In an orthonormal transform (such as the DCT), the mean-squared error (MSE) in the image domain and the
transform domain are the same.
For quantized wavelet coefficients, under certain
assumptions on the quantization noise, the MSE of the
reconstructed image can be approximately expressed as a
weighted sum of the MSE of the wavelet coefficients,
where the weight for each subband is given by its L2-norm.
The DWT filter normalization impacts both the L2-norm
and the dynamic range of each subband.
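One way to arrive at such weights is to build the equivalent 1-D synthesis filter for each decomposition level and take products of their L2-norms. The sketch below does this for the (5,3) filter, assuming separable filtering and the synthesis filters (1 2 1)/2 and (-1 -2 6 -2 -1)/8 that correspond to an analysis DC gain of one and Nyquist gain of two. The function names and the dictionary keys are choices made for the example.

```python
import numpy as np

def upsample(taps, factor):
    """Insert factor - 1 zeros between consecutive taps."""
    out = np.zeros((len(taps) - 1) * factor + 1)
    out[::factor] = taps
    return out

def subband_l2_norms(g0, g1, levels):
    """Norms of the 2-D synthesis basis functions per subband,
    computed as products of 1-D equivalent-filter norms."""
    norms = {}
    low_prev = np.array([1.0])   # equivalent lowpass filter of finer levels
    for level in range(1, levels + 1):
        factor = 2 ** (level - 1)
        low = np.convolve(low_prev, upsample(g0, factor))
        high = np.convolve(low_prev, upsample(g1, factor))
        n_low, n_high = np.linalg.norm(low), np.linalg.norm(high)
        norms[f"{level}LL"] = n_low * n_low
        norms[f"{level}HL"] = norms[f"{level}LH"] = n_low * n_high
        norms[f"{level}HH"] = n_high * n_high
        low_prev = low
    return norms

g0 = np.array([1, 2, 1]) / 2            # (5,3) lowpass synthesis filter
g1 = np.array([-1, -2, 6, -2, -1]) / 8  # (5,3) highpass synthesis filter
norms = subband_l2_norms(g0, g1, 3)
```

The intermediate LL entries (1LL, 2LL) are by-products of the recursion; only the coarsest LL band is an actual subband of the 3-level decomposition.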
Subband L2-Norms After 2D, 3-Level DWT
          (√2, √2) Normalization    (1,2) Normalization
Subband   (5,3)      (9,7)          (5,3)      (9,7)
3LL       0.67188    1.05209        5.37500    8.41675
3HL       0.72992    1.04584        2.91966    4.18337
3LH       0.72992    1.04584        2.91966    4.18337
3HH       0.79297    1.03963        1.58594    2.07926
2HL       0.79611    0.99841        1.59222    1.99681
2LH       0.79611    0.99841        1.59222    1.99681
2HH       0.92188    0.96722        0.92188    0.96722
1HL       1.03833    1.01129        1.03833    1.01129
1LH       1.03833    1.01129        1.03833    1.01129
1HH       1.43750    1.04044        0.71875    0.52022
3-Level DWT with (9,7) Filter Scaled by the L2-Norm
3-Level DWT with (5,3) Filter Scaled by the L2-Norm
DWT Complexity Issues
The complexity of the DWT depends on:
Filter sizes
Floating-point vs. integer filters
Except for a few special cases, e.g., the (5,3) integer filter,
the DWT is generally more computationally complex (~2×
to 3×) than the block-based DCT.
As a full-frame transform, the DWT also requires
significantly more memory than the DCT. However, line-based and/or lifting implementations can significantly
reduce the memory requirements.
Line-Based DWT
There is no need to buffer the entire image in order to perform the wavelet transform.
Depending on the filter size and the number of wavelet decomposition levels used, a line of wavelet coefficients can be made available only after processing a few lines of the input image.
The Lifting Scheme
The lifting scheme is an alternative method of computing
the wavelet coefficients, with the following advantages:
Often requires less computation (actual savings depend
on the specific filter bank).
Requires less memory and the wavelet coefficients can
be calculated in-place.
Can be easily adapted to produce integer-to-integer
wavelet transforms for lossless compression.
Backward transform is easy to find and has the same
complexity as the forward transform.
Does not require explicit signal extension at boundaries.
The Lifting Algorithm
The lifting algorithm can be described in three steps:
Split step: First, the original signal, x_k, is split into
even and odd subsequences. This is sometimes referred
to as the lazy wavelet transform:
$$ s_i^0 \leftarrow x_{2i}, \qquad d_i^0 \leftarrow x_{2i+1} $$
Lifting step: This step is executed as N sub-steps,
where the odd and the even sequences are transformed
using the prediction and update coefficients P_n(k) and
U_n(k), whose values depend on the wavelet filters being
used. The relationships are:
$$ d_i^n = d_i^{n-1} + \sum_k P_n(k)\, s_k^{n-1}, \qquad n \in [1, 2, \ldots, N] $$
$$ s_i^n = s_i^{n-1} + \sum_k U_n(k)\, d_k^n, \qquad n \in [1, 2, \ldots, N] $$
Normalization step: Finally, normalization factors are
applied to get the wavelet coefficients:
$$ s_i^N \leftarrow K_0\, s_i^N, \qquad d_i^N \leftarrow K_1\, d_i^N $$
where s_i^N and d_i^N are, respectively, the lowpass and the
highpass wavelet coefficients and K_0 and K_1 are the
corresponding normalization factors.
Near image boundaries, at each adder in the lattice where a
left-hand input is missing, the mirror-image right-hand
input is added twice. Similarly, where a right-hand input is
missing, the mirror-image left-hand input is added twice.
This results in an implicit signal boundary extension.
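Lifting Example for (5,3) Filter
[Figure: lifting lattice for the (5,3) filter. The input sequence s_0^0, d_0^0, s_1^0, d_1^0, ... feeds prediction branches with weight -1/2 that produce the highpass output d_i^1, and update branches with weight 1/4 that produce the lowpass output s_i^1.]
$$ d_i^1 = d_i^0 - \tfrac{1}{2}\,(s_i^0 + s_{i+1}^0), \qquad s_i^1 = s_i^0 + \tfrac{1}{4}\,(d_{i-1}^1 + d_i^1) $$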
Lattice Structure for (13,7) Integer Filters
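[Figure: lifting lattice for the (13,7) integer filters, mapping the input sequence to the highpass and lowpass outputs through branches weighted by the coefficients a, b, c, and d.]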
CRF (13 x 7): a = 1/16, b = 9/16, c = 1/16, d = 5/16
SWE (13 x 7): a = 1/16, b = 9/16, c = 1/32, d = 9/32
Lifting Example for (9,7) Filter
[Figure: lifting lattice for the (9,7) filter. The input sequence passes through four intermediate stages with branch weights p1, u1, p2, and u2, followed by scaling by K1 to give the highpass output and by K0 to give the lowpass output.]
p1 = -1.586134342, u1 = -0.052980118
p2 = +0.882911075, u2 = +0.443506852
K1 = 1.230174104, K0 = 1/K1
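The lattice can be written as four lifting sub-steps plus scaling. The sketch below uses the coefficients listed on the slide and handles boundaries by mirroring the missing neighbor, as described for the lifting algorithm. It is a sketch under two assumptions: the input has even length, and the helper names are invented for the example.

```python
import numpy as np

P1, U1 = -1.586134342, -0.052980118
P2, U2 = 0.882911075, 0.443506852
K1 = 1.230174104
K0 = 1.0 / K1

def _next(s):
    """s[i+1], reusing the last sample where the right neighbor is missing."""
    return np.append(s[1:], s[-1])

def _prev(d):
    """d[i-1], reusing the first sample where the left neighbor is missing."""
    return np.insert(d[:-1], 0, d[0])

def forward_97(x):
    """Forward (9,7) lifting of an even-length 1-D signal."""
    x = np.asarray(x, dtype=float)
    s, d = x[0::2].copy(), x[1::2].copy()   # split (lazy wavelet)
    d += P1 * (s + _next(s))
    s += U1 * (_prev(d) + d)
    d += P2 * (s + _next(s))
    s += U2 * (_prev(d) + d)
    return K0 * s, K1 * d                   # lowpass, highpass

def inverse_97(low, high):
    """Undo the lifting steps in reverse order with opposite signs."""
    s, d = low / K0, high / K1
    s = s - U2 * (_prev(d) + d)
    d = d - P2 * (s + _next(s))
    s = s - U1 * (_prev(d) + d)
    d = d - P1 * (s + _next(s))
    x = np.empty(len(s) + len(d))
    x[0::2], x[1::2] = s, d
    return x

x = np.array([100, 100, 100, 100, 200, 200, 200, 200], dtype=float)
low, high = forward_97(x)
x_rec = inverse_97(low, high)
flat_low, flat_high = forward_97(np.full(8, 50.0))
```

A constant input gives a lowpass output equal to the input and a highpass output of essentially zero, consistent with a DC gain of one for the analysis lowpass filter.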
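Lifting Block Diagram
[Figure: Forward transform: x_i enters a phase splitter producing s_i^0 and d_i^0; the ladder stages P_1(z), U_1(z), ..., P_N(z), U_N(z) are applied in turn, followed by scaling by K_0 to give the lowpass output s_i^N and by K_1 to give the highpass output d_i^N. Inverse transform: scaling by K_0^{-1} and K_1^{-1}, the same ladder stages in reverse order with subtraction in place of addition, and a phase joiner that outputs x_i.]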
Integer-to-Integer Transforms
[Figure: the lifting block diagram with a quantizer (Q_P1, Q_U1, ..., Q_PN, Q_UN) placed at the output of each ladder filter P_1(z), U_1(z), ..., P_N(z), U_N(z), in both the forward transform (phase splitter to lowpass and highpass outputs) and the inverse transform (lowpass and highpass inputs to phase joiner).]
Integer-to-Integer Transforms
The lifting structure can be easily modified to create
integer-to-integer transforms for lossless compression.
On the analysis side, a quantizer function Q is added at
the output of each ladder filter.
On the synthesis side, the same quantizer function Q is
added to the output of each ladder filter.
Two possible choices of quantizers are rounding to the
nearest integer and truncation.
In JPEG2000 Part 1, quantizers are used after the
prediction and update stages to create a reversible version
of the (5,3) filter: $Q_{P1}(x) = -\lfloor -x \rfloor$, $Q_{U1}(x) = \lfloor x + 1/2 \rfloor$.
Integer (5,3) Filter Specification in Part 1
The reversible (5,3) DWT in Part 1 is based on the lifting
implementation using a WSS signal extension. Denoting
the extended 1D input sequence by xext(n), and the output
sequence by y(n), the odd output coefficients are given by:
$y(2n+1) = x_{ext}(2n+1) - \left\lfloor \dfrac{x_{ext}(2n) + x_{ext}(2n+2)}{2} \right\rfloor$
Even coefficients are computed from the even values of the
extended input signal and the odd output coefficients:
$y(2n) = x_{ext}(2n) + \left\lfloor \dfrac{y(2n-1) + y(2n+1) + 2}{4} \right\rfloor$
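The two equations translate directly into integer arithmetic. The sketch below is illustrative only: it handles a single decomposition level of an even-length sequence, realizes the symmetric extension by mirroring indices at the two ends, and uses hypothetical function names. Python's `//` operator is floor division, which is what the reversible transform requires for negative values.

```python
def forward_53(x):
    """Reversible (5,3) analysis of an even-length integer sequence.

    Returns y interleaved as in the equations: odd positions hold the
    highpass coefficients, even positions the lowpass coefficients.
    """
    n = len(x)
    assert n % 2 == 0 and n >= 2
    y = [0] * n
    for i in range(1, n, 2):                       # odd (highpass) outputs
        right = x[i + 1] if i + 1 < n else x[i - 1]    # mirror: x(n) = x(n-2)
        y[i] = x[i] - (x[i - 1] + right) // 2
    for i in range(0, n, 2):                       # even (lowpass) outputs
        left = y[i - 1] if i > 0 else y[1]             # mirror: y(-1) = y(1)
        y[i] = x[i] + (left + y[i + 1] + 2) // 4
    return y


def inverse_53(y):
    """Exact inverse of forward_53: undo the update step, then the prediction."""
    n = len(y)
    assert n % 2 == 0 and n >= 2
    x = [0] * n
    for i in range(0, n, 2):
        left = y[i - 1] if i > 0 else y[1]
        x[i] = y[i] - (left + y[i + 1] + 2) // 4
    for i in range(1, n, 2):
        right = x[i + 1] if i + 1 < n else x[i - 1]
        x[i] = y[i] + (x[i - 1] + right) // 2
    return x
```

Since the inverse subtracts exactly the integer that the forward transform added, the round trip is lossless for any integer input.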
83
JPEG2000 DWT Choices
JPEG2000 Part 1 only allows successive powers-of-two
splitting of the LL band and the use of two DWT filters:
The lifted integer (5,3) filter, which provides lossless
capability and reduced complexity (faster than DCT),
but at the expense of some loss in coding efficiency.
The Daubechies (9,7) floating-point filter, which provides
superior coding efficiency. The analysis filters are
normalized to a DC gain of one and a Nyquist gain of 2.
Part 2 allows for arbitrary size filters (user-specified in the
header), arbitrary wavelet decomposition trees, and
different filters in the horizontal vs. vertical directions.
84
Quantization
85
Quantization and Dequantization
At the encoder, the scalar quantization operation maps a
given signal value to a quantizer index, which is then
encoded as part of the compressed bit stream.
At the decoder, the quantizer index is decoded and
converted into the corresponding quantized value. This
process is sometimes referred to as dequantization.
[Figure: input signal → Quantization (encoder) → quantizer indices → Dequantization (decoder) → quantized signal.]
86
Quantization in JPEG2000 Part 1
[Figure: wavelet coefficients → Uniform Quantizer with Deadzone → quantizer indices.]
Uniform quantization with deadzone is used to quantize all
the wavelet coefficients.
For each subband b, a basic quantizer step size $\Delta_b$ is
selected by the user and is used to quantize all the
coefficients in that subband.
The choice of the quantizer step size for each subband can
be based on visual models and is likened to the q-table
specification in the JPEG DCT.
87
Uniform Scalar Quantizer with Deadzone
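Quantization rule:
$q = \mathrm{sign}(y) \left\lfloor \dfrac{|y|}{\Delta_b} \right\rfloor$
where y is the input to the quantizer, $\Delta_b$ is the quantizer
step size, q is the resulting quantizer index, sign(y) denotes
the sign of y, |y| denotes the absolute value of y, and $\lfloor x \rfloor$
denotes the largest integer not larger than x.
[Figure: quantizer bins along the input axis with indices -3, -2, -1, +1, +2, +3 on either side of the deadzone; the deadzone has width $2\Delta_b$ and every other bin has width $\Delta_b$.]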
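As a quick illustration of the quantization rule, the following sketch maps a coefficient to its index. The function name and the use of a plain Python float for the step size are assumptions made for the example.

```python
import math


def quantize(y, step):
    """Deadzone uniform scalar quantizer: q = sign(y) * floor(|y| / step)."""
    sign = (y > 0) - (y < 0)          # +1, 0 or -1
    return sign * math.floor(abs(y) / step)
```

Every input with magnitude below one step size maps to index 0, so the zero bin is twice as wide as the others. This is the deadzone.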
88
Example
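Encoder input value = -21.82
Quantizer index = $-\lfloor 21.82/10 \rfloor$ = -2
[Figure: deadzone quantizer with step size 10 and decision thresholds at -30, -20, -10, +10, +20, +30; the input -21.82 falls in the bin with index -2.]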
89
Dequantization Rule
Dequantization rule:
$z = [\,q + r\,\mathrm{sign}(q)\,]\,\Delta_b$ for $q \neq 0$,
$z = 0$ otherwise,
where q is the quantizer index, $\Delta_b$ is the quantizer step size,
z is the reconstructed (quantized) signal value, sign(q)
denotes the sign of q, and r is the reconstruction bias.
r = 0.5 results in midpoint reconstruction (no bias).
r < 0.5 biases the reconstruction towards zero. A popular
value for r is 0.375.
In JPEG2000 Part 1, the parameter r can be chosen freely
by the decoder.
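The dequantization rule can be sketched as follows. The function name and the default r = 0.5 are choices made for this example; the bias is left to the decoder.

```python
def dequantize(q, step, r=0.5):
    """Reconstruct z = (q + r * sign(q)) * step for q != 0, and z = 0 for q == 0."""
    if q == 0:
        return 0.0
    sign = 1 if q > 0 else -1
    return (q + r * sign) * step
```

With a step size of 10, index 2 is reconstructed at 25 for r = 0.5 and at 23.75 for r = 0.375. Negative indices mirror these values.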
90
Example
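Quantizer index = -2
Reconstructed value for r = 0.5 (midpoint): (-2 - 0.5) × 10 = -25
Reconstructed value for r = 0.375: (-2 - 0.375) × 10 = -23.75
[Figure: midpoint reconstruction levels at -25, -15, +15, +25 for indices -2, -1, +1, +2; the input -21.82 (index -2) is reconstructed at -25.]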
91
Centroid Uniform Threshold Quantizer
For a uniform threshold quantizer (UTQ) with a given step size, the mean-squared
quantization error can be minimized if the dequantizer
reconstructs the signal to the centroid of the signal
probability distribution enclosed by the quantization bin as
opposed to the midpoint of the bin. This only affects the
decoder and has no impact on the encoder.
[Figure: probability density p(y) over the quantization bins (deadzone of width $2\Delta_b$, other bins of width $\Delta_b$), comparing centroid reconstruction with midpoint reconstruction.]
92
93
94
Human Visual System CSF
The frequency-dependent behavior of the human visual
system (HVS) can be characterized by its response to
harmonic (sinusoidal) functions.
For each sinusoid with a given frequency, the amount of
contrast needed to elicit a criterion level of response from a
neuron is called the contrast threshold.
The inverse of the contrast threshold is called the contrast
sensitivity and, when plotted as a function of frequency, is
referred to as the contrast sensitivity function (CSF).
The luminance CSF peaks at around 5 cycles/degree, and
rapidly drops off to almost zero at 50 cycles/degree. The
chrominance CSF drops even faster.
95
Example of Luminance CSF
[Figure: example luminance CSF, plotting contrast sensitivity (0.0 to 1.0) against spatial frequency (0 to 30 cycles/degree).]
96
97
(Courtesy of S. Daly @ Sharp Laboratories of America)
99
100
101
102
Embedded Quantization in Part 1
Unlike JPEG Baseline, where the resulting quantizer index
q is encoded as a single symbol, in JPEG2000 it is
encoded one bit at a time, starting from the MSB and
proceeding to the LSB.
During this progressive encoding, the quantized wavelet
coefficient is called insignificant if the quantizer index q
is still zero. Once the first nonzero bit is encoded, the
coefficient becomes significant and its sign is encoded.
If the p least significant bits of the quantizer index still
remain to be encoded, the reconstructed sample at that
stage is identical to the one obtained by using a UTQ with
deadzone with a step size of $\Delta_b 2^p$.
103
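Embedded Quantization by Bit-Plane Coding
[Figure: a quantizer index split into bit planes, from the MSB (weight $2^{N-1}$) through planes N-2, N-3, ... down to the LSB (plane 0).]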
104
Embedded Quantization Example
Wavelet coefficient = 83; Quantizer step size = 3
Quantizer index = $\lfloor 83/3 \rfloor$ = 27 = 00011011
Dequantized value based on fully decoded index:
(27 + 0.5) × 3 = 82.5
Dequantized value after decoding 6 BPs:
Decoded index = 000110 = 6; Step size = 12
Dequantized value = (6 + 0.5) × 12 = 78
Dequantized value after decoding 4 BPs:
Decoded index = 0001 = 1; Step size = 48
Dequantized value = (1 + 0.5) × 48 = 72
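The example can be reproduced with a short sketch that drops the p undecoded least significant bits and scales the step size by 2^p. The function name and argument layout are hypothetical, and midpoint reconstruction (r = 0.5) is assumed by default.

```python
def embedded_reconstruction(q, step, total_bits, decoded_bits, r=0.5):
    """Reconstruct a coefficient when only the top `decoded_bits` bit planes
    of a `total_bits`-bit quantizer index q have been decoded."""
    p = total_bits - decoded_bits        # least significant bits still missing
    partial = abs(q) >> p                # index as seen by the decoder
    if partial == 0:
        return 0.0                       # coefficient is still insignificant
    sign = 1 if q > 0 else -1
    return sign * (partial + r) * step * (1 << p)


index = 83 // 3                          # 27 = 0b00011011
full = embedded_reconstruction(index, 3, 8, 8)      # all 8 bit planes
six = embedded_reconstruction(index, 3, 8, 6)       # effective step 12
four = embedded_reconstruction(index, 3, 8, 4)      # effective step 48
```

Each truncation behaves exactly like a coarser deadzone quantizer, which is what makes the bit stream embedded.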
105
Entropy (Tier 1) Encoding
106
Entropy Coding in JPEG2000 Part 1
[Figure: quantizer indices → Adaptive Binary Arithmetic Coder (Tier-1 coder) → compressed bitstream.]
Context-based adaptive binary arithmetic coding is used in
JPEG2000 to efficiently compress each individual bit
plane.
The binary value of a sample in a block of a bit plane of a
subband is coded as a binary symbol with the JBIG2 MQ-Coder.
107
Information Content and Ideal Codelength
Consider a source that generates symbols $s_i$ with
probability $p(s_i)$ from an alphabet of size n.
The information content of the symbol $s_i$ is denoted by
$I(s_i)$ and is given by:
$I(s_i) = \log_2 \dfrac{1}{p(s_i)} = -\log_2 p(s_i)$ bits
An ideal coder that is capable of producing fractional bits
would be able to encode each symbol using exactly the
same number of bits as its information content, $I(s_i)$. The
quantity $I(s_i)$ is called the ideal codelength of the symbol
$s_i$. According to Shannon's noiseless coding theorem, no
coder can perform better than the ideal coder.
108
Information Content and Entropy
Symbol         Probability   Ideal codelength   Huffman codelength   Huffman codeword
A              0.60          0.7370             1                    0
B              0.30          1.737              2                    10
C              0.05          4.322              3                    110
D              0.05          4.322              3                    111
Average rate                 1.40 bps           1.50 bps
The average information content (or ideal codelength) of the
symbols in a source is called its entropy and is given by:
$H(S) = \sum_{i=1}^{n} p(s_i)\, I(s_i) = -\sum_{i=1}^{n} p(s_i) \log_2 p(s_i)$ bits/symbol
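The figures in the table can be checked numerically. The sketch below uses hypothetical helper names and takes the probabilities and Huffman codelengths of the four-symbol source as given.

```python
from math import log2


def information(p):
    """Ideal codelength, in bits, of a symbol with probability p."""
    return -log2(p)


def entropy(probs):
    """Average information content of a source, in bits per symbol."""
    return sum(p * information(p) for p in probs if p > 0)


def average_length(probs, lengths):
    """Average rate of a code that spends lengths[i] bits on symbol i."""
    return sum(p * n for p, n in zip(probs, lengths))


probs = [0.60, 0.30, 0.05, 0.05]            # symbols A, B, C, D
ideal_rate = entropy(probs)                  # about 1.40 bits/symbol
huffman_rate = average_length(probs, [1, 2, 3, 3])   # about 1.50 bits/symbol
```

The gap between the two rates is the price Huffman coding pays for being restricted to whole-bit codeword lengths.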
109
Example: Skewed Binary Source
Probability    Ideal codelength
0.999          0.0014434
0.001          9.9657843
Average rate: 0.0114077 bps (ideal), 1 bps (Huffman)
Huffman coding can become inefficient for binary sources,
especially when the symbol probabilities have a large
skew.
110
Limitations of Huffman Coding
Since the Huffman codeword lengths have to be integers, Huffman
coding becomes inefficient if the symbol probabilities
deviate significantly from negative powers of two, especially
when p(s) is much larger than 1/2. Performance can only be
improved by encoding blocks of symbols, thus increasing
memory requirements and complexity.
The code becomes inefficient if a mismatch exists between the
actual source statistics and the assumed code statistics,
hence the need for two-pass (custom) Huffman coding.
Further, there is no local adaptation when there is a large
variation of symbol statistics within an image.
When using a large number of conditional contexts, the cost
of storing multiple codebooks can be prohibitive.
111
Arithmetic Coding
An arithmetic coder can be thought of as an encoding
device that accepts at its input the symbols in a source
sequence along with their corresponding probability
estimates, and produces at its output a code stream with a
length equal to the combined ideal codelengths of the input
symbols. Practical implementations of arithmetic coders
exceed the ideal codelength by a few percent (e.g., 6%).
[Figure: the source symbol sequence and its probability estimates enter the arithmetic coder, which outputs the codeword sequence.]
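The link between probability estimates and code length can be illustrated with idealized interval narrowing. This is only a sketch of the principle using exact fractions and a fixed probability for symbol 0; it is not the MQ-Coder, which uses finite-precision approximations and adaptive estimates. All names are hypothetical.

```python
from fractions import Fraction
from math import ceil, log2


def narrow_interval(bits, p_zero):
    """Shrink [0, 1) once per symbol; returns (low, width) of the final interval."""
    low, width = Fraction(0), Fraction(1)
    for b in bits:
        if b == 0:
            width *= p_zero                 # keep the lower part
        else:
            low += width * p_zero           # keep the upper part
            width *= 1 - p_zero
    return low, width


def ideal_codelength(bits, p_zero):
    """Sum of -log2(p) over the symbols of the sequence."""
    return sum(-log2(p_zero if b == 0 else 1 - p_zero) for b in bits)


def codeword_bits(width):
    """Bits sufficient to name a binary fraction that lies inside the interval."""
    return ceil(-log2(width)) + 1
```

The final interval width equals the product of the symbol probabilities, so its negative base-2 logarithm is exactly the combined ideal codelength of the sequence. For nine 0s followed by a 1 with P(0) = 0.9, that is about 4.7 bits, against 10 bits for any code that spends at least one bit per symbol.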
112
Conditioning Contexts
In general, the probability of a sample having a certain
value is influenced by the value of its neighbors. Thus, the
symbol probabilities can be conditioned on the values of
the symbols in a neighborhood surrounding them. For a
given neighborhood configuration, each combination of the
neighboring samples denotes a conditioning context.
The arithmetic coder is particularly efficient in encoding
sequences where the probabilities change in each context.
[Figure: example array of binary sample values, mostly 0 with a small cluster of 1s, illustrating a sample and its conditioning neighborhood.]
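The benefit of conditioning can be measured on any binary array by comparing its zeroth-order entropy with its entropy conditioned on a causal neighborhood. The sketch below is an assumption-laden toy: it uses only the left and upper neighbors as context (not the 7-neighbor context of the Lena example), treats samples outside the array as 0, and uses hypothetical function names.

```python
from collections import Counter
from math import log2


def entropy_from_counts(counts):
    """Entropy, in bits, of the empirical distribution given by the counts."""
    counts = [c for c in counts if c]
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts)


def zeroth_order_entropy(plane):
    """Entropy of the samples with no conditioning context."""
    return entropy_from_counts(Counter(v for row in plane for v in row).values())


def conditional_entropy(plane):
    """Entropy of each sample given its left and upper neighbors."""
    joint = Counter()
    for y, row in enumerate(plane):
        for x, v in enumerate(row):
            left = row[x - 1] if x > 0 else 0
            up = plane[y - 1][x] if y > 0 else 0
            joint[(left, up), v] += 1
    total = sum(joint.values())
    h = 0.0
    for ctx in {c for c, _ in joint}:
        counts = [joint[ctx, 0], joint[ctx, 1]]
        h += sum(counts) / total * entropy_from_counts(counts)
    return h


plane = [[0, 0, 1, 1] for _ in range(4)]    # left half 0, right half 1
```

For this toy bit plane the unconditioned entropy is 1 bit/sample, while the conditional entropy is much lower because most contexts predict the sample perfectly.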
113
Example: Entropy of Lena MSB
Conditioning contexts can
capture the redundancy in the
image:
No conditioning contexts: entropy = 1.0 bit/pixel
7-neighbor conditioning context: entropy = 0.14 bits/pixel
Most significant bit plane
114
Example: Entropy of Lena LSB
No conditioning contexts: entropy = 1.0 bit/pixel
7-neighbor conditioning context: entropy = 1.0 bits/pixel
Least significant bit plane
115
JPEG2000 Entropy Coder
Each bit plane is further broken down into blocks (e.g., 64 × 64).
The blocks are coded independently (i.e., the bitstream
for each block can be decoded independent of other data)
using three coding passes. The coding progresses from the
most significant bitplane to the least significant bitplane.
[Figure: a coding block of a bit plane of a subband.]
116
Codeblock Size Restrictions
The nominal dimensions of a codeblock are free
parameters specified by the encoder and can vary from one
tile to another, but are subject to the following constraints:
Each dimension should be an integer power of two.
The total number of coefficients in a block should not
exceed 4096.
Height of the codeblock can not be less than 4.
Codeblocks from all resolutions are constrained to have
the same size (except due to constraints imposed by the
precinct size).
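The first three constraints are easy to check programmatically. The sketch below encodes only the rules as stated on this slide, under a hypothetical function name; it does not model the additional limits imposed by the precinct size.

```python
def is_valid_codeblock_size(width, height):
    """Check the nominal code-block size rules listed above."""
    def is_power_of_two(n):
        return n > 0 and n & (n - 1) == 0

    return (is_power_of_two(width)
            and is_power_of_two(height)
            and width * height <= 4096
            and height >= 4)
```

For example, 64 × 64 and 128 × 32 pass, while 128 × 64 fails because it holds 8192 coefficients.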
117
R-D Optimized Embedded Coding
To generate an optimally embedded bit stream, the data that
results in the most reduction in distortion for the least increase
in bit rate should be coded first. The path 1-2-3 has a superior
embedded R-D performance over the path 4-5-6, even though
they both start at (R1, D1) and end at (R2, D2).
[Figure: rate versus distortion between (R1, D1) and (R2, D2), comparing two orderings of the coding passes A, B, C.]
A, B, C = Three passes of coding bitplane p
R1 = Rate before coding bitplane p
R2 = Rate after coding bitplane p
D1 = Distortion before coding bitplane p
D2 = Distortion after coding bitplane p
118
JPEG2000 Entropy Coder
The binary value of a sample in a block of a bit plane of a
subband is coded as a binary symbol with the JBIG2 MQ-Coder, which is a context-based adaptive arithmetic coder.
Each bitplane of each block of a subband is encoded in
three sub-bit-plane passes instead of a single pass. The
bitstream can be truncated at the end of each pass. This
allows for:
Optimal embedding, so that the information that results
in the most reduction in distortion for the least increase
in file size is encoded first.
A larger number of bitstream truncation points to
achieve finer SNR scalability.
119
Significance Propagation Pass
The first pass in a new bit plane is called the significance
propagation pass. A symbol is encoded if it is
insignificant but at least one of its eight-connected
neighbors is significant as determined from the previous
bit plane and the current bit plane based on coded
information up to that point. (These locations have the
highest probability of becoming significant.)
The probability of the binary symbol at
a given location of a bitplane of a
block of a subband is modeled by a
context formed from the significance
values of its neighbors. A total of nine
contexts are used.
[Figure: a sample with its eight-connected neighbors.]
120
Refinement and Cleanup Passes
Refinement (REF): Next, the significant coefficients are
refined by their bit representation in the current bitplane.
Cleanup: Finally, all the remaining coefficients in the bitplane are encoded. (Note: the first pass of the MSB bitplane
of a subband is always a cleanup pass.)
The coding for the first and third passes is identical, except
for the run coding that is employed in the third pass.
The maximum number of contexts used in any pass is no
more than nine, thus allowing for extremely rapid probability
adaptation that decreases the cost of independently coded
segments.
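The assignment of samples to the three passes can be sketched as below. This is a simplified illustration, not the normative scan: it visits samples in plain raster order rather than the stripe-oriented scan used by the standard, it ignores context formation and the arithmetic coder entirely, and the function name and the pass labels "SPP", "REF" and "CLN" are made up for the example.

```python
def classify_passes(bits, significant):
    """Label each sample of one bit plane with the pass that codes it.

    bits:        2D list of 0/1, the current bit plane of a code-block
    significant: 2D list of bools, significance state from earlier bit planes
    Returns (labels, updated significance state).
    """
    h, w = len(bits), len(bits[0])
    sig = [row[:] for row in significant]          # updated as coding proceeds
    was_sig = [row[:] for row in significant]      # state before this bit plane
    labels = [[None] * w for _ in range(h)]

    def has_significant_neighbor(y, x):
        return any(sig[j][i]
                   for j in range(max(y - 1, 0), min(y + 2, h))
                   for i in range(max(x - 1, 0), min(x + 2, w))
                   if (j, i) != (y, x))

    # Pass 1: insignificant samples with at least one significant neighbor.
    for y in range(h):
        for x in range(w):
            if not sig[y][x] and has_significant_neighbor(y, x):
                labels[y][x] = "SPP"
                if bits[y][x]:
                    sig[y][x] = True               # influences later samples
    # Pass 2: samples that were already significant before this bit plane.
    for y in range(h):
        for x in range(w):
            if was_sig[y][x]:
                labels[y][x] = "REF"
    # Pass 3: everything that is left.
    for y in range(h):
        for x in range(w):
            if labels[y][x] is None:
                labels[y][x] = "CLN"
                if bits[y][x]:
                    sig[y][x] = True
    return labels, sig
```

In the first bit plane nothing is significant yet, so every sample falls into the cleanup pass, consistent with the note above.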
121
Coding results by bit plane (bytes per coding pass, with compression ratio and quality after decoding up to that bit plane):
Bit plane   Sig. Prop.   Refine   Cleanup   Total bytes   Compression ratio   RMSE    PSNR (dB)   % refined   % insig.
1           0            0        21        21            12483 : 1           39.69   16.16       0           99.99
2           18           0        24        42            4161 : 1            29.11   18.85       0.01        99.95
3           38           13       57        108           1533 : 1            21.59   21.45       0.05        99.89
4           78           37       156       271           593 : 1             16.58   23.74       0.11        99.77
5           224          73       383       680           233 : 1             12.11   26.47       0.23        99.43
6           551          180      748       1479          101 : 1             8.65    29.39       0.58        98.68
7           1243         418      1349      3010          47 : 1              6.02    32.54       1.32        97.09
8           2315         932      2570      5817          23 : 1              4.18    35.70       2.91        93.99
9           4593         1925     5465      11983         11.2 : 1            2.90    38.87       6.01        87.66
10          10720        3917     12779     27416         5.16 : 1            1.78    43.12       12.34       71.88
11          25421        8808     5438      39667         2.90 : 1            0.90    49.00       28.12       46.80
[Figure: BP1 enlarged by 8 (256 × 256)]
[Figure: BP2 enlarged by 8 (256 × 256)]
134
Tier-2 Entropy Coding
135
Tier 2 role
Tier 1 generates a collection of bitstreams
One independent bitstream for each codeblock
Each bitstream is embedded
Tier 2 multiplexes the bitstreams for inclusion in the
codestream and signals the ordering of the resulting coded
bitplane passes in an efficient manner
Tier 2 coded data can be rather easily parsed
Tier 2 enables SNR, resolution, spatial, ROI and arbitrary
progression and scalability
136
Example of bitplane pass coded data
[Figure: a 256x256 image with a two-level wavelet decomposition into subbands LL2, HL2, LH2, HH2, HL1, LH1, HH1, partitioned into 64x64 codeblocks.]
137
Example of bitplane pass coded data
[Figure: coded data of the significance, refinement and cleanup passes for bit planes BP1 (MSB) to BP6, arranged by codeblock within subbands LL2, HL2, LH2, HH2, HL1, LH1, HH1.]
138
Lowest resolution, Highest quality
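[Figure: bit planes BP1 (MSB) to BP6 across the codeblocks of subbands LL2, HL2, LH2, HH2, HL1, LH1, HH1, highlighting the coded data included for this case.]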
139
Medium resolution, Highest quality
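[Figure: bit planes BP1 (MSB) to BP6 across the codeblocks of subbands LL2, HL2, LH2, HH2, HL1, LH1, HH1, highlighting the coded data included for this case.]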
140
Highest resolution, Highest quality
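[Figure: bit planes BP1 (MSB) to BP6 across the codeblocks of subbands LL2, HL2, LH2, HH2, HL1, LH1, HH1, highlighting the coded data included for this case.]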
141
Highest resolution, Target SNR quality
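[Figure: bit planes BP1 (MSB) to BP6 across the codeblocks of subbands LL2, HL2, LH2, HH2, HL1, LH1, HH1, highlighting the coded data included for this case.]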
142
Highest resolution, Target Visual quality
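[Figure: bit planes BP1 (MSB) to BP6 across the codeblocks of subbands LL2, HL2, LH2, HH2, HL1, LH1, HH1, highlighting the coded data included for this case.]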
143
Layers
Layer: a collection of some consecutive bitplane coding
passes from all codeblocks in all subbands and
components. Each codeblock can contribute an arbitrary
number of bitplane coding passes to a layer.
Each layer successively increases the image quality. Most
often associated with SNR or visual quality levels.
Layers are explicitly signalled and can be arbitrarily
determined by the encoder
The number of layers can range from 1 to 65535, and is typically
around 20. Larger numbers are intended for interactive
sessions where each layer is generated depending on user
feedback.
144
Example of layer organization
[Figure: bit planes BP1 (MSB) to BP6 across the codeblocks of subbands LL2, HL2, LH2, HH2, HL1, LH1, HH1, with the coded data grouped into Layer 1, Layer 2, Layer 3 and Layer 4.]
145
Precinct Partition
Precinct partition: divides a resolution level of a
component into rectangles of size $2^{PPx} \times 2^{PPy}$ samples.
The precinct size (PPx, PPy) is encoder selectable for
each resolution level.
Each precinct is independently coded by Tier-2.
The codeblock size at a resolution level is constrained by
the corresponding precinct size.
By default precincts are very large (PPx = PPy = 15).
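To get a feel for the numbers, the sketch below counts the precincts covering a resolution level. It is a simplification under stated assumptions: the resolution level is taken to start at the origin of the precinct grid, whereas in general the count also depends on the tile-component offsets. The function name is hypothetical.

```python
def precinct_grid(width, height, ppx, ppy):
    """Number of precincts (across, down) covering a width x height resolution
    level that starts at the origin, for precincts of 2**ppx by 2**ppy samples."""
    pw, ph = 1 << ppx, 1 << ppy
    across = -(-width // pw)      # ceiling division
    down = -(-height // ph)
    return across, down
```

With the default PPx = PPy = 15 a precinct spans 32768 × 32768 samples, so a typical resolution level fits in a single precinct. Choosing PPx = PPy = 7 splits a 512 × 512 level into a 4 × 4 grid.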
146
Precincts and codeblocks: example
[Figure: precincts and codeblocks within the subbands, marking codeblock boundaries, precinct boundaries and subband boundaries.]
147
Packets
Packet: compressed data representing a specific tile, layer,
component, resolution level and precinct.
Packet header signals:
Which codeblocks are included in the packet
The number of most significant all-zero bitplanes
skipped by the entropy coder, for each newly included
codeblock
The number of included coding passes for each codeblock
The length of included coded data for each codeblock
Packet body: concatenation of included coded image data
148
Precinct usage examples
Limit the packet size for decreased buffering needs.
Limit codeblock size in specific resolutions.
For example the codeblock size at lower resolution
levels can be reduced so that the buffering needs for a
line based DWT are reduced. At higher resolution
levels larger codeblocks are used, which increases the
compression ratio.
Enhance spatial progression.
By limiting the packet size at each resolution level a
better data multiplexing for a top to bottom progression
is possible.
149
Packet head encoding: Tag trees
Tag tree: Hierarchical representation of a 2D array of non-negative integer values, where successively reduced
resolutions form a tree. The value at every node of the tree
is the minimum value of its (up to four) children. The 2D
array to be coded is at the lowest level.
Each node has an associated current value, which is
initialized to zero.
A zero bit indicates that the current value is less than the
node's value and thus should be incremented by one. A one bit
indicates that the current value is equal to the node's value.
Coding starts at the top node, and a child cannot be coded
until the parent is entirely coded.
150
Tag tree example
[Figure: a three-level tag tree with root q0(0,0) = 1, intermediate nodes q1(0,0) = 1 and q1(1,0) = 1, and leaf values q2(0,0) = 1, q2(1,0) = 3, q2(2,0) = 2, ...]
Coding q2(0,0) = 1 → 0 1 1 1 (q0(0,0) > 0, q0(0,0) = 1, q1(0,0) = 1, q2(0,0) = 1)
Coding q2(1,0) = 3 → 0 0 1 (q2(1,0) > 1, q2(1,0) > 2, q2(1,0) = 3)
Coding q2(2,0) = 2 → 1 0 1 (q1(1,0) = 1, q2(2,0) > 1, q2(2,0) = 2)
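The bit sequences of the example can be reproduced with a small tag-tree encoder. The sketch below is a simplified model under stated assumptions: it codes each leaf value completely, without the threshold mechanism that packet headers use to code values incrementally across layers. The class name and the 2 × 4 leaf array are hypothetical, with the array chosen so that its first three leaves and their ancestors match the values shown in the example.

```python
class TagTree:
    """Minimal tag-tree encoder that codes leaf values completely."""

    def __init__(self, leaves):
        # levels[0] is the leaf array, levels[-1] is the 1x1 root.
        self.levels = [[list(row) for row in leaves]]
        while len(self.levels[-1]) > 1 or len(self.levels[-1][0]) > 1:
            prev = self.levels[-1]
            h, w = (len(prev) + 1) // 2, (len(prev[0]) + 1) // 2
            self.levels.append(
                [[min(prev[y][x]
                      for y in range(2 * j, min(2 * j + 2, len(prev)))
                      for x in range(2 * i, min(2 * i + 2, len(prev[0]))))
                  for i in range(w)]
                 for j in range(h)])
        # Per-node state shared with the decoder: current value and "done" flag.
        self.current = [[[0] * len(l[0]) for _ in l] for l in self.levels]
        self.done = [[[False] * len(l[0]) for _ in l] for l in self.levels]

    def encode(self, x, y):
        """Return the bits emitted when leaf (x, y) is coded."""
        bits, bound = [], 0
        for lvl in range(len(self.levels) - 1, -1, -1):    # root to leaf
            cx, cy = x >> lvl, y >> lvl
            value = self.levels[lvl][cy][cx]
            cur = max(self.current[lvl][cy][cx], bound)    # parent is a lower bound
            while cur < value:
                bits.append(0)                             # "value is larger"
                cur += 1
            if not self.done[lvl][cy][cx]:
                bits.append(1)                             # "value is reached"
                self.done[lvl][cy][cx] = True
            self.current[lvl][cy][cx] = cur
            bound = cur
        return bits


tree = TagTree([[1, 3, 2, 3],
                [2, 2, 1, 4]])
first = tree.encode(0, 0)
second = tree.encode(1, 0)
third = tree.encode(2, 0)
```

Nodes that have already been coded cost nothing when they are revisited, which is how the tree exploits the spatial redundancy between neighboring leaves.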
151
Tag tree usage
Signal in which layer a codeblock is first included. For a
packet of layer l code the state l(m,n) > l, where l(m,n) is
the layer in which codeblock (m,n) is first included.
Signal the number of most-significant all-zero bitplanes for
newly included codeblocks, using another tag tree.
The index of the first layer in which a codeblock is
included and the number of all zero bitplanes are spatially
redundant. Tag trees efficiently exploit this redundancy in
a simple manner.
There are two independent tag trees for each precinct,
resolution level, tile and component: one for codeblock
first inclusion and one for number of all zero bitplanes.
152
Packet output to codestream
The packets for each tile are output to the codestream in
one of several predefined orders:
Layer - resolution level - component - position
Resolution level - layer - component - position
Component - position - resolution level - layer
Resolution level - position - component - layer
Position - component - resolution level - layer
Arbitrary progression order changes can occur in the
codestream.
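Each progression order amounts to a different nesting of four loops. The sketch below generates the packet sequence for an order given as a string of the letters L (layer), R (resolution level), C (component) and P (position, i.e. precinct). It is a simplification that assumes the same number of precincts at every resolution level and component; the function name is hypothetical.

```python
from itertools import product


def packet_sequence(order, layers, resolutions, components, precincts):
    """Yield packet identifiers in the given progression order.

    order: four-letter string such as "LRCP"; the first letter is the
           outermost loop and the last letter the innermost.
    """
    sizes = {"L": layers, "R": resolutions, "C": components, "P": precincts}
    if sorted(order) != sorted("LRCP"):
        raise ValueError("order must be a permutation of the letters L, R, C, P")
    for idx in product(*(range(sizes[k]) for k in order)):
        yield dict(zip(order, idx))


lrcp = list(packet_sequence("LRCP", 2, 2, 1, 1))
rlcp = list(packet_sequence("RLCP", 2, 2, 1, 1))
```

In the layer-first order all resolutions of layer 0 precede layer 1 (SNR progressive), whereas in the resolution-first order all layers of resolution 0 come first (resolution progressive).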
153
Layer (SNR) progressive example
156
Resolution progressive example
160
Codestream
Codestream: compressed image data with all the signaling
required to properly decompress it.
Composed of a main header and tile headers, which specify coding
parameters in a hierarchical way, plus the encoded data for
each tile.
The compressed data for a tile can be broken up into tile-parts, and the different tile-parts interleaved in the
codestream to allow for non-tile progressiveness.
The codestream is the minimum exchange format for
JPEG 2000 encoded data, but usually the codestream is
embedded in a file format.
161
Rate allocation
162
Rate allocation principle
Rate allocation is the process of targeting a
specific compression ratio with the best possible quality
(MSE, visual or other) for each layer and/or the entire
codestream. Possible types are:
None: compression ratio is determined solely by the
quantization step sizes and image content.
Iterative: quantization step sizes are adjusted according
to the obtained compression ratio and the operation is repeated.
Post-compression: rate allocation is performed after the
image data has been coded, in one step.
Others (Lagrangian, scan-based, etc.)
Not standardized by JPEG 2000: encoder choice.
163
Rate allocation example
Original image: 8 bits/pixel
Target: 4:1 compression ratio, 4 layers, logarithmically spaced
[Figure: RMSE (0 to 9) versus rate (0 to 2.5 bits/pixel), marking Layer 1 to Layer 4 and the discarded data.]
Result:
Layer   Rate (bpp)   RMSE
1       0.25         8.4
2       0.50         5.4
3       1.00         3.2
4       2.00         1.6
164
Post-compression rate allocation
For each codeblock $B_i$, a distortion measure $D_i^j$ and a rate
$R_i^j$ are associated with the end of each coding pass j.
For an additive error measure the total distortion D and
rate R, subject to proper normalization, are
$R = \sum_i R_i^{n_i}$ and $D = \sum_i D_i^{n_i}$
where $n_i$ is the truncation point (i.e. end of last included
coding pass) of codeblock $B_i$.
MSE and weighted MSE are additive if the wavelet
transform is orthogonal or the quantization errors for
individual coefficients are uncorrelated. In practice neither
condition is met, but the approximation is good enough for
rate allocation purposes.
165
Post-compression rate allocation
Constrained optimization problem: find the $n_i$ that
minimize D given $R \le R_{max}$.
Well-known solution using the Lagrange multipliers
method: the problem is equivalent to minimizing
$R + \lambda D$, or equivalently
$\sum_i \left( R_i^{n_i} + \lambda D_i^{n_i} \right)$
where $\lambda$ is adjusted to achieve $R = R_{max}$ (or as close as
possible).
For each codeblock the convex hull of the points $(R_i^j, D_i^j)$ is
computed. Then $n_i$ is simply the largest truncation point
such that the D-R slope is not smaller than $\lambda^{-1}$.
The value of $\lambda$ can be iteratively adjusted to match $R_{max}$.
Example
[Figure: distortion (MSE, 0 to 60) versus rate (bytes, 0 to 150) for one codeblock, showing the convex hull, the (R, D) points not on the convex hull, a line of slope $\lambda^{-1}$, and the resulting truncation point.]
167
Efficient rate-distortion estimation
The rate at the end of each coding pass can be calculated
from the MQ arithmetic coder using its internal state.
The reduction in distortion incurred by the coding of each
bit of a quantization index can be calculated by functions
that depend only on the quantizer step size, the wavelet
filter normalization and the bitplane (Annex J.14.4).
Such functions can be implemented efficiently using
lookup tables. Typically 6 and 7 bit lookup tables are
enough.
For increased accuracy, the fractional bits of the
quantization indices are kept internally, but not coded.
168
Visual frequency weighting
The eye's contrast sensitivity threshold varies with spatial
frequency as per the CSF. In addition, each
subband represents a particular frequency range.
The MSE distortion measure of a codeblock can be
weighted by the average sensitivity threshold for the codeblock's subband to increase the compression ratio for the
same visual quality.
Alternatively, quantization step sizes can be adjusted as per
the visual weighting tables, but this is less flexible.
MSE weights depend on viewing conditions, e.g., display
resolution and viewing distance. Recommended weighting
tables are provided in an informative Annex (J.12.4) of the
standard.
169
Progressive visual frequency weighting
At high compression ratios image quality is low and
viewing distance is typically large. At lower compression
ratios typical viewing distances are smaller.
In a progressive SNR setting, varying the visual weights
with the bit rate allows better visual quality at all stages of
transmission. A different set of weights for the MSE measure
is applied for various bit-rate ranges (weight set W(0) for up
to 0.125 bpp, set W(1) from 0.125 to 0.25, ...).
Only the coded data inclusion order is affected.
Applicable to lossy-to-lossless progressive codestreams as
well.
170
Example visual RMSE weighting tables
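[Figure: wavelet subband diagram labeled with example RMSE weights; values as extracted: 1, 1, 0.561, 1, 0.727, 0.178, 0.561, 0.284, 0.178, 0.044.]
Viewing distance: 200 samples (e.g., 10 inches on a 200 dpi display)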
171
Region of Interest (ROI)
coding
172
Region of Interest coding principle
Region of Interest (ROI) coding allows a non-uniform
distribution of quality. The ROI is coded with a higher
quality than the background (BG). A higher compression
ratio can be achieved with the same or higher quality inside
ROIs.
Static ROIs are defined at encoding time and are suitable
for storage, fixed transmission, remote sensing, etc.
Commonly referred to as ROI coding.
Dynamic ROIs are defined interactively by a user in a
client/server situation during a progressive transmission.
Suitable for telemedicine, PDAs, mobile communications,
etc. They can be achieved by the dynamic generation of
layers matching the user's request.
173
ROI mask
The ROI mask defines which wavelet coefficients (ROI
coefficients) contribute to reconstructing the ROI. It is a
binary mask which depends on
The ROI shape in the image domain
The DWT synthesis filter lengths
It is obtained by following the wavelet reconstruction
process backwards: at each decomposition level the mask
at the current subband (or image initially) is extended
horizontally and vertically by the synthesis filter lengths
and subsampled.
For rectangular ROIs a fast algorithm exists for deriving
the ROI mask. No true mask bitmap is required.
174
Lossless ROI mask generation
[Figure: the ROI in the image domain and the final ROI mask derived from it.]
175
Encoding ROIs: General scaling
[Figure: bit planes (MSB to LSB) of the ROI and BG coefficients, before (original) and after general scaling.]
The quantized wavelet coefficients of the ROI are shifted upwards by s
bitplanes, while BG ones remain unmodified. As a consequence ROI
coefficients are coded earlier. With multiple ROIs, different scaling
values s are applied to each ROI. The value of s is recorded in the
codestream header for each ROI. At the decoder ROI coefficients are
unshifted prior to dequantization.
The quality differential between ROI and BG is controlled by s.
The ROI mask is required at both the encoder and the decoder.
176
Encoding ROIs: Maxshift
In maxshift mode, the value of s is set so that the least significant bit of
all shifted ROI coefficients is above the most significant nonzero bit
of all BG coefficients. At the decoder a simple threshold can be used to
distinguish ROI from BG coefficients.
No ROI mask is required at the decoder and arbitrary shaped ROIs are
possible. There is no need to include ROI shape in the codestream.
The encoder is free to generate the ROI mask in any way (e.g., LL
subband can be considered all ROI).
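The maxshift idea can be demonstrated on a flat list of signed quantizer indices. The sketch below is illustrative only: it uses sign-magnitude shifting on Python integers, derives s from the largest BG magnitude, and uses hypothetical function names. An actual codec applies this per tile-component and signals s in the codestream.

```python
def maxshift_encode(coeffs, roi_mask):
    """Shift ROI coefficients above every BG coefficient; returns (shifted, s)."""
    bg_max = max((abs(c) for c, in_roi in zip(coeffs, roi_mask) if not in_roi),
                 default=0)
    s = bg_max.bit_length()              # smallest s with 2**s > every BG magnitude
    shifted = []
    for c, in_roi in zip(coeffs, roi_mask):
        if in_roi:
            mag = abs(c) << s            # scale the magnitude up by s bit planes
            shifted.append(mag if c >= 0 else -mag)
        else:
            shifted.append(c)
    return shifted, s


def maxshift_decode(shifted, s):
    """Recover the coefficients using only the threshold 2**s (no mask needed)."""
    out = []
    for v in shifted:
        if abs(v) >= (1 << s):           # only ROI coefficients can be this large
            mag = abs(v) >> s
            out.append(mag if v > 0 else -mag)
        else:
            out.append(v)                # BG coefficient, or an ROI zero
    return out


coeffs = [5, -3, 12, 0, -7, 1]
mask = [False, True, True, True, False, False]
shifted, s = maxshift_encode(coeffs, mask)
restored = maxshift_decode(shifted, s)
```

A zero-valued ROI coefficient stays zero after shifting and is treated as BG by the threshold test. This is harmless because its value is unchanged.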
177
General scaling Pros & Cons
Pros:
The ROI / BG quality differential can be freely adjusted by
the user at the encoder.
Multiple ROIs with different quality differentials are
possible within the same tile-component.
Cons:
Only supported in Part 2.
Decoder must generate ROI mask.
Encoder must generate ROI mask in the standard way (i.e.
no optimizations possible).
Limited ROI shapes: rectangle and ellipse.
178
Maxshift Pros & Cons
Pros:
Arbitrary ROI shapes supported, including disjoint ones.
No need to include ROI shape in codestream and no ROI
mask required at the decoder.
Encoder can optimize ROI mask to improve visual
quality.
For example the entire LL subband can be considered
as ROI to provide a low-resolution BG together with
the ROI.
179
Maxshift Pros & Cons
Cons:
No control over the ROI / BG quality differential.
The differential depends directly on the quantization
step size and the dynamic range of the original image.
Almost all the BG data appears after all the ROI data
has been decoded.
Only one ROI per tile-component is supported.
180
ROI Maxshift example
ROI covers 5% of image, 2 lowest resolution levels in ROI mask. Magnified portion shown.
ROI
256:1
45:1 (almost all ROI decoded)
No ROI
181
ROI Maxshift example (cont'd)
ROI covers 5% of image, 2 lowest resolution levels in ROI mask. Magnified portion shown.
ROI
16:1
4:1 (complete decode)
No ROI
182
Error resilience
183
Error-prone channels
When delivering compressed images across error-prone
channels any transmission error can severely affect the
decoded image quality. This is especially true since variable
length coding is used in the codeblock entropy coding and
packet heads.
Error types can be random errors, burst errors and missing
bytes (i.e. network packet loss).
Since each codeblock is independently coded, an error in a
codeblock's bitstream will be contained within that codeblock. Nevertheless severe distortion can occur in the case
of an error.
Packet heads are interdependent and thus fragile.
184
Error effects
In a packet body: corrupted arithmetically coded data for
some codeblock: wrong symbols are decoded and wrong
contexts are formed for subsequent bitplanes → severe
distortion.
In a packet head: wrong body length can be decoded,
codeblock data can be assigned to wrong codeblocks and
subsequent packets cannot be correctly located → total
synchronization loss.
Bytes missing (i.e. network packet loss): combined effects
of error in packet head and body.
185
Protecting codeblock data
Segmentation symbols: a special symbol sequence is
coded with a fixed context at the end of each bitplane. If
the wrong sequence is decoded an error has occurred and
the last bitplane is corrupted (at least).
Regular predictable termination: the arithmetic coder is
terminated at the end of each coding pass using a special
algorithm (predictable termination). The decoder
reproduces the termination and if it does not find the same
unused bits at the end an error has occurred in the last
coding pass (at least).
Both mechanisms can be freely mixed but slightly decrease
the compression efficiency.
186
Protecting packet heads
SOP resynchronization marker: every packet head can be
preceded by an SOP marker with a sequence index. If an
SOP marker with the correct index is not found just before
the packet head an error has occurred. In such a case the
next unaffected packet is searched in the codestream and
decoding proceeds from there. The SOP marker cannot be
emulated by uncorrupted coded data.
PPM/PPT markers: the packet head content can be moved
to the main or tile headers in the codestream and transmitted
through a channel with a much lower error rate.
Precincts: precincts can limit the spatial coverage of a
packet and thus contain packet head errors within a small
image area.
187
Error resilience example
16:1 compression ratio. Transmission error rate $10^{-5}$. No errors in codestream header.
Magnified portion shown.
No transmission errors
No error resilience
Full error resilience
188
File format: JP2
189
JP2
JP2 is the optional JPEG 2000 file format to encapsulate
JPEG 2000 codestreams.
Although codestreams provide sufficient information to
properly decode an image, JP2 provides additional, but
important, information about it.
JP2 is based on the concept of boxes. Each box is a
contiguous stream of data containing type and length
information.
Special boxes, the superboxes, can contain other boxes,
leading to a hierarchical structure.
File extension: .jp2
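The box structure makes a JP2 file easy to walk. The sketch below splits a byte string into top-level boxes, assuming the usual layout of a 4-byte big-endian length followed by a 4-byte type, where a length of 1 announces an 8-byte extended length and a length of 0 means the box runs to the end of the data. The function name is hypothetical, and superboxes are returned unparsed; their payload can be passed to the same function again.

```python
import struct


def parse_boxes(data):
    """Split a byte string into (box_type, payload) pairs."""
    boxes, pos = [], 0
    while pos + 8 <= len(data):
        length, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if length == 1:                        # 64-bit extended length follows
            (length,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif length == 0:                      # box extends to the end of the data
            length = len(data) - pos
        if length < header or pos + length > len(data):
            raise ValueError("malformed box at offset %d" % pos)
        boxes.append((box_type, data[pos + header:pos + length]))
        pos += length
    return boxes


# A 12-byte signature box followed by a made-up box of type b"demo".
sample = bytes.fromhex("0000000c6a5020200d0a870a") + struct.pack(">I4s", 12, b"demo") + b"abcd"
parsed = parse_boxes(sample)
```

The payload of the first box contains a carriage return, a line feed and a byte with the high bit set, which is how the transmission error detection listed among the basic boxes can catch transfers that altered line endings or stripped the eighth bit.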
190
JP2: basic boxes
File type identification
Transmission error detection (7-bit email, ASCII ftp, etc.)
Image size
Number of components and component bit depth
Capture and default display resolutions
Vendor specific information: XML formatted or
proprietary identified by a UUID
Opacity (i.e. alpha channel)
Accurate color interpretation
JP2: color
Color interpretation can be specified in two ways:
enumerated or restricted ICC profiles.
Enumerated colorspaces allow for sRGB and nonlinear gray.
Restricted ICC profiles allow for a nonlinearity plus a 3x3
transformation matrix needed to convert decompressed
data to the profile connection space in XYZ colorspace.
Both truecolor and palette based color are supported.
The specified colorspace is applied to the decompressed
data, after the inverse multiple component transformation.
Part 2 Extensions
Trellis Coded Quantization (TCQ)
Trellis coded quantization applies a spatially varying scalar
quantization on the wavelet coefficients, by choosing
among four scalar quantizers for each coefficient.
Quantizer indices from supersets of these quantizers along
with the quantizer transitions in the form of a trellis are
sufficient to properly reconstruct the quantized
coefficients.
The full benefit of TCQ is realized with the Lagrangian
rate allocation algorithm, which is a form of iterative rate
allocation.
Visual masking
The visual masking effect can mask quantization artifacts,
where the image acts as the background signal.
The visual masking extension introduces a nonlinearity
between the DWT and the quantizer that exploits the visual
masking effect.
Improvements can be observed mostly in:
Low amplitude textures (e.g., skin)
Zero-width edges in artificial imagery (e.g., graphs)
Nonlinearity: x_i → y_i, where y_i keeps the sign of x_i (y_i = sign(x_i) ...)
and depends on x_i and the gain term gain_b.
Arbitrary wavelets
Arbitrary wavelet decomposition
Provides better control over the decorrelation process
by adjusting the bandpass extent of the wavelet
subbands.
Arbitrary wavelet filters: whole sample filters and truly
arbitrary filters
Can be used to tune the wavelet filter to the image
characteristics to obtain better compression.
Arbitrary wavelet decomposition and filters allow
transcoding from other wavelet based standards (e.g.,
the FBI's WSQ).
Single sample overlap (SSO)
Tiling typically introduces boundary artifacts at medium
and high compression ratios.
The SSO extension provides a way to perform the wavelet
transform with an overlap of one sample between tiles.
This almost completely removes the artifacts: in practice
they become invisible.
The SSO extension also provides a way to perform a block
based wavelet transform, which reduces memory usage,
without blocking artifacts.
Multiple component transformations
Multicomponent transformations in Part 2 lift the
limitation to ICT and RCT.
Two types are supported:
Array based linear transformations. Similar to most
common color transformations. Allows for component
prediction transformations such as DPCM and more complex
ones such as the Karhunen-Loève Transform (KLT).
A wavelet transform across components. Arbitrary
wavelet filters allowed.
Both types can be performed reversibly or irreversibly.
JPX file format
Extension of JP2, same structure
File extension:
.jpf
Multiple codestreams, with composition and animation
Multiple coding algorithms:
JPEG 2000, MMR, JPEG, JBIG, JPEG-LS, JBIG2,
etc.
Codestream fragmentation (in boxes, files and/or URLs)
Region editing, fast image servers, optional high-quality, etc.
JPX file format
General ICC profiles
Digital signature for arbitrary byte streams in file
Metadata formats
JPX (XML based)
MPEG-7
General XML
Intellectual property and rights information (XML)
Other Part 2 extensions
Variable DC offset: an encoder specified DC offset can be
used to improve the data distribution of image samples of
very skewed data.
Variable scalar quantization: the width of the quantizer
deadzone can be adjusted to improve the visual appearance
of low level textures.
Region of interest: general scaling is allowed.
Nonlinear amplitude transformation: a nonlinear
transformation of the decoded image samples can be
specified. Gamma and lookup table styles allowed. A
common use is to perceptually flatten sensors with linear
responses from 12 to 8 bits prior to compression.
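The 12-to-8-bit use case can be illustrated with a gamma-style lookup table. This is only a sketch: the exponent 2.2 and the function names gamma_lut and inverse_gamma_lut are assumptions for illustration, not values taken from Part 2.

```python
def gamma_lut(in_bits=12, out_bits=8, gamma=2.2):
    """Forward table: linear in_bits samples to perceptually flattened out_bits codes."""
    in_max = (1 << in_bits) - 1
    out_max = (1 << out_bits) - 1
    return [round(out_max * (v / in_max) ** (1.0 / gamma)) for v in range(in_max + 1)]

def inverse_gamma_lut(in_bits=12, out_bits=8, gamma=2.2):
    """Inverse table a decoder could apply to return to the linear domain."""
    in_max = (1 << in_bits) - 1
    out_max = (1 << out_bits) - 1
    return [round(in_max * (c / out_max) ** gamma) for c in range(out_max + 1)]
```

The forward table is applied before compression; signalling the inverse in the codestream lets the decoder restore approximately linear samples after decoding.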
Part 1 Amendments
AMD1: Profiles
AMD1 introduces Profile-0 and Profile-1 that restrict the
set of possible values for the coding parameters and
options, in order to allow low complexity implementations.
Profile-0 main restrictions:
Tiles 128x128 or smaller, or no tiles
Component subsampling: 1, 2 or 4
No geometric manipulations
Code-blocks 32x32 or 64x64
No packet heads in codestream headers (PPM/PPT
markers)
AMD1: Profiles
Profile-1 main restrictions:
Tiles 1024x1024 or smaller, or no tiles
Code-blocks 64x64 or smaller
Profile-2 has no restrictions:
All legal values in the codestream syntax are allowed
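The listed restrictions can be turned into a small check. The sketch below covers only the main restrictions named on these slides (the full amendment has more), and the function name lowest_profile and its parameters are hypothetical.

```python
def lowest_profile(tile_size, codeblock_size, subsampling, packet_heads_in_ppm_ppt):
    """Return 0, 1 or 2: the most restrictive profile the parameters satisfy.

    tile_size: (width, height) or None when the image is not tiled.
    codeblock_size: (width, height) of the code-blocks.
    subsampling: iterable of component subsampling factors.
    packet_heads_in_ppm_ppt: True if PPM/PPT markers carry the packet heads.
    """
    def tiles_within(limit):
        return tile_size is None or (tile_size[0] <= limit and tile_size[1] <= limit)

    cb_w, cb_h = codeblock_size
    if (tiles_within(128)
            and (cb_w, cb_h) in ((32, 32), (64, 64))
            and all(s in (1, 2, 4) for s in subsampling)
            and not packet_heads_in_ppm_ppt):
        return 0
    if tiles_within(1024) and cb_w <= 64 and cb_h <= 64:
        return 1
    return 2
```

For example, an untiled image with 64x64 code-blocks and subsampling factors 1 and 2 falls in Profile-0, while 16x16 code-blocks push the same image to Profile-1.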
AMD2: sYCC colorspace (JP2)
In Part 1 the JP2 file format supports only sRGB and nonlinear grayscale as enumerated (i.e. named) colorspaces.
Most current JPEG applications use the YCbCr colorspace.
Initial JPEG 2000 applications will probably require it.
Motion JPEG 2000 (Part 3) also requires it.
AMD2 adds the sYCC colorspace, which is an sRGB
based YCbCr colorspace, to the list of allowed enumerated
ones.
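As an illustration of an sRGB based YCbCr colorspace, the sketch below converts 8-bit nonlinear sRGB values with the familiar JFIF-style matrix. The coefficients, the offset of 128 and the function name srgb_to_sycc are assumptions made for illustration and should be checked against the sYCC definition before use.

```python
def srgb_to_sycc(r, g, b):
    """Convert 8-bit nonlinear sRGB values to (Y, Cb, Cr) as floats.

    Uses the luma weights 0.299/0.587/0.114 and centers the chroma
    components on 128. No rounding or clipping is applied here.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr
```

Neutral colors map to Cb = Cr = 128, so gray and white pixels carry no chroma signal; saturated colors can exceed the 8-bit range and need clipping when stored as integers.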
AMD3: Efficient geometric manipulations
(proposal)
Not accepted as Part 1 amendment. Included in Part 2.
Part 1 allows performing image flips and 90 degree
rotations in the compressed domain (i.e. no inverse DWT
required). However, the code-block and precinct partitions
can change and a repartition becomes necessary,
demanding high memory usage.
The proposal introduces an offset for code-block and precinct
partitions and thus a repartition becomes unnecessary.
In this case the geometric manipulations can be performed
independently on each code-block and the memory usage
is very low. The only requirement is to entropy decode and
recode each code-block to use the new (i.e. rotated)
scanning pattern.
Other JPEG 2000 Features
Tiling of large images provides for independent processing
of different image regions.
The canvas coordinate system allows for efficient
recompression of cropped images.
Rich codestream syntax provides means for transcoding of
the data for streaming, resolution progression, quality
progression, or any mixture thereof.
JPEG2000 Feature Summary
Improved coding efficiency (up to 30% compared to DCT)
Multiresolution representation
Quality scalability (SNR or visual)
Target bit rate (constant bit rate applications)
Lossless to lossy progression
Improved error resilience
Tiling
Rich bit stream syntax (layering, packet partitions, canvas
coordinate system, etc.)
Rich file format
JPEG 2000 performance
assessment
Assessing JPEG 2000 performance
JPEG 2000: new state-of-the-art standard, but
How does it compare to other existing standards?
How well is the offered functionality supported?
How efficient is it?
Which algorithms to compare to?
Other new standards: state-of-the-art
MPEG-4 Visual Texture Coding (VTC)
JPEG-LS
Established standards: popularly used
JPEG: baseline, progressive, lossless, etc.
PNG (undergoing ISO standardization)
Well known DWT based algorithms: reference
SPIHT
MPEG-4 VTC
Multiscale zerotree wavelet entropy coding (MZTE)
Dyadic wavelet transform (Daub. 9/3 filter)
Similar to EZW and ZTE
Zerotree and quantized symbols arithmetically coded
DC values encoded by a predictive scheme
Quantization
Single (SQ): no SNR scalability
Multiple (MQ): limited SNR scalability
Bi-level (BQ): general SNR scalability
MPEG-4 VTC (cont'd)
Scanning
Tree depth (TD): no resolution scalability (EZW)
Band by band (BB): provides resolution scalability
Scalability
Bitstreams are always resolution progressive
Each resolution level is SNR scalable (except SQ)
Support for object based coding
No lossless coding
JPEG-LS
Low Complexity Lossless Compression (LOCO-I)
Prediction w/ context modeling
Golomb coding (of power-of-two order)
Flat region detector
Causal neighborhood of the current sample X:
C B D
A X
The predictor is based on a median edge detector for the
non-adaptive part.
The context modeling is based on the local gradients (D-B,
B-C, C-A). It determines the adaptive part of the prediction
and the Golomb coder order.
Near-lossless coding is obtained using a maximum
allowable sample error.
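The fixed part of the predictor and the gradients that drive the context model can be sketched as below, with A, B, C, D the left, upper, upper-left and upper-right neighbors of X. The Golomb coder shown is a plain power-of-two (Rice) code with one common bit convention; the function names are hypothetical, and the adaptive bias correction, residual mapping and length limiting of the real coder are omitted.

```python
def med_predict(a, b, c):
    """Median edge detector: fixed (non-adaptive) prediction of X from A, B, C."""
    if c >= max(a, b):
        return min(a, b)      # edge detected: take the smaller neighbor
    if c <= min(a, b):
        return max(a, b)      # edge detected: take the larger neighbor
    return a + b - c          # smooth region: planar prediction

def local_gradients(a, b, c, d):
    """Local gradients (D-B, B-C, C-A) used to select the coding context."""
    return d - b, b - c, c - a

def rice_encode(n, k):
    """Golomb code of order 2**k for n >= 0: unary quotient, then k remainder bits."""
    quotient = n >> k
    remainder = format(n & ((1 << k) - 1), "b").zfill(k) if k else ""
    return "0" * quotient + "1" + remainder
```

In a flat area all three gradients are zero, which is the condition the flat region detector looks for.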
PNG
Lossless only
Predictive scheme
Current value X predicted from A, B, C
Causal neighborhood of X:
C B
A X
Five possible predictors
Predictors chosen on a line-by-line basis
Entropy coding
Deflate method: LZ77 coupled with Huffman
Same as used in ZIP file compression
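The line-by-line prediction followed by Deflate can be sketched as follows for 8-bit grayscale rows (one byte per pixel). The five predictors are the PNG filter types None, Sub, Up, Average and Paeth; the rule used to pick a predictor per line (smallest sum of absolute residuals) is a common heuristic assumed here, and the function names are hypothetical.

```python
import zlib

def paeth(a, b, c):
    """Paeth predictor: the neighbor closest to a + b - c."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    return b if pb <= pc else c

def predict(ftype, a, b, c):
    """Prediction of X for filter types 0..4 from left (A), up (B), up-left (C)."""
    return (0, a, b, (a + b) // 2, paeth(a, b, c))[ftype]

def filter_row(ftype, row, prev):
    """Residuals (modulo 256) of one row given the previous row."""
    out = bytearray()
    for i, x in enumerate(row):
        a = row[i - 1] if i else 0
        c = prev[i - 1] if i else 0
        out.append((x - predict(ftype, a, prev[i], c)) & 0xFF)
    return bytes(out)

def encode_rows(rows):
    """Filter each row with its best predictor, then Deflate the result."""
    prev = bytes(len(rows[0]))
    raw = bytearray()
    for row in rows:
        cost = lambda f: sum(min(v, 256 - v) for v in filter_row(f, row, prev))
        best = min(range(5), key=cost)
        raw.append(best)                 # filter type byte precedes each row
        raw += filter_row(best, row, prev)
        prev = row
    return zlib.compress(bytes(raw), 9)
```

This produces only the compressed image data stream, not a complete PNG file with its chunk structure and checksums.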
SPIHT
SPIHT: Set Partitioning in Hierarchical Trees, by Said and
Pearlman.
DWT based: S+P reversible transform for lossy and
lossless compression and Daubechies (9,7) filter for lossy
compression.
Exploits self similarity across scales using set partitioning.
The wavelet coefficients are ordered into sets using a
parent-child relationship and their significance at
successively finer quantization levels. The binary
decisions can optionally be arithmetically coded.
Produces SNR scalable bitstreams only.
Test images
Cafe 2048x2560
Cmpnd1 512x768
Bike 2048x2560
All images have 8 bits per pixel
Test images (cont'd)
Target 512x512
Aerial2 2048x2048
Chart 1688x2347
US 512x448
All images have 8 bits per pixel
Test conditions
Software:
JPEG 2000: Verification Model 6.1
MPEG-4 VTC: MoMuSys VM of Aug. 1999
JPEG: Independent JPEG Group version 6b
JPEG-LS: UBC / SPMG, version 2.2
JPEG lossless: Cornell University's codec, version 1.0
PNG: libpng version 1.0.3
SPIHT: codec with arithmetic coder, version 8.01
Machine:
500 MHz Pentium III, 512 kB half-speed L2 cache, 512
MB RAM (SDRAM) running under Linux 2.2.12.
Lossless compression ratios
[Figure: bar chart of compression ratio (axis 0 to 10) for JPEG 2000, JPEG-LS,
LJPEG, PNG and SPIHT on bike, cafe, cmpnd1, chart, aerial2, target, us and the
average.]
JPEG 2000: default options with (5,3) reversible filter; JPEG-LS: default options; JPEG
lossless (LJPEG): optimized Huffman tables and best predictor; PNG: maximum
compression setting and best predictor; SPIHT: S+P filter with arithmetic coding.
Lossless encoding times
[Figure: bar chart of compression time (microseconds/pixel, axis 0 to 5) for
JPEG 2000, JPEG-LS, LJPEG, PNG and SPIHT on bike, cafe, cmpnd1, chart, aerial2,
target, us and the average.]
Lossless decoding times
[Figure: bar chart of decompression time (microseconds/pixel, axis 0 to 3) for
JPEG 2000, JPEG-LS, LJPEG, PNG and SPIHT on bike, cafe, cmpnd1, chart, aerial2,
target, us and the average.]
Lossless results
JPEG-LS is the best overall performer, with the best compression
ratios and fastest execution. JPEG 2000 compression ratios
are, in general, close to those of JPEG-LS. PNG and SPIHT
come close behind.
A notable exception to the general trend is target and, to a
lesser extent, cmpnd1. Because of LZ77, the regular
structure of target is very well exploited by PNG.
JPEG 2000 is considerably slower than JPEG-LS or
LJPEG but faster than the others at compression.
JPEG 2000 performs well with various image types and
reasonably better than the other DWT algorithm with non-natural images.
Non-progressive lossy compression
[Figure: PSNR (dB, axis 22 to 44) versus bitrate (bpp) for JPEG 2000 R,
JPEG 2000 NR, JPEG, VTC, SPIHT R and SPIHT NR.]
One bitstream generated for each bitrate. Average across all images. JPEG 2000: default
options; JPEG: baseline with flat quantization tables and optimized Huffman tables;
MPEG-4 VTC: single quantization; SPIHT: arithmetic coding.
Non-progressive results
JPEG 2000 with the non-reversible (9,7) filter outperforms
all other algorithms at all bitrates.
The reversible (5,3) filter incurs a small penalty for the
capability of lossless decoding, but still outperforms all
other algorithms (except SPIHT @ 2 bpp).
JPEG exhibits a considerable quality difference at all
bitrates, with a PSNR that is 2.7 to 1.8 dB lower.
The difference in compression efficiency of JPEG 2000
over the other algorithms gets larger as the compression
ratio increases.
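The quantities used in these comparisons can be computed as in the sketch below, which assumes the usual definitions for 8-bit images (peak value 255); the function names are hypothetical. For an 8-bit image, 0.5 bpp corresponds to a 16:1 compression ratio.

```python
import math

def bits_per_pixel(compressed_bytes, width, height):
    """Bitrate of a compressed image in bits per pixel."""
    return 8.0 * compressed_bytes / (width * height)

def compression_ratio(bpp, bit_depth=8):
    """Ratio between the original bit depth and the compressed bitrate."""
    return bit_depth / bpp

def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    mse = sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)
```

A 2048x2560 test image stored in 327680 bytes is therefore at 0.5 bpp, and a difference of 2 dB in PSNR corresponds to a ratio of about 1.58 in mean squared error.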
Non-progressive encoding times
[Figure: bar chart of average compression time (microseconds/pixel, axis 0 to 2)
versus bitrate (bpp) for JPEG 2000 R, JPEG 2000 NR, JPEG, VTC, SPIHT R and
SPIHT NR. Four bars exceed the axis and are labeled 3.84, 4.00, 4.36 and 5.09.]
Non-progressive decoding times
[Figure: bar chart of average decompression time (microseconds/pixel, axis 0 to
1.8) versus bitrate (bpp) for JPEG 2000 R, JPEG 2000 NR, JPEG, VTC, SPIHT R and
SPIHT NR. Four bars exceed the axis and are labeled 4.72, 4.90, 5.37 and 6.24.]
Execution times
All DWT based algorithms are substantially slower than
JPEG, which is DCT based.
The MPEG-4 VTC implementation appears to be
unreasonably slow (bad implementation?).
In JPEG 2000 the reversible (5,3) filter is considerably
faster than the irreversible (9,7) one, due to integer-only
arithmetic and smaller filter lengths.
JPEG 2000 encoding times are constant at all bitrates due
to the VM implementation: the image is coded to a very
high bitrate and then the resulting data is truncated.
SNR progressive lossy compression
[Figure: PSNR (dB, axis 22 to 44) versus bitrate (bpp) for JPEG 2000 R,
JPEG 2000 NR, PJPEG, VTC, SPIHT R and SPIHT NR.]
One bitstream generated at 2 bpp and decoded at 0.25, 0.5, 1 and 2 bpp. Average across
all images. JPEG 2000: multiple layers; JPEG: progressive (successive refinement) and
optimized Huffman tables; MPEG-4 VTC: multiple quantization; SPIHT: arithmetic
coding.
SNR progressive encoding times
[Figure: bar chart of compression time (microseconds/pixel, axis 0 to 3.50) for
JPEG 2000 R, JPEG 2000 NR, PJPEG, VTC, SPIHT R and SPIHT NR on bike, cafe,
cmpnd1, chart, aerial2, target, us and the average. Eight bars exceed the axis
and are labeled 11.80, 11.91, 11.01, 11.43, 12.16, 10.86, 11.18 and 11.5.]
SNR progressive decoding times
[Figure: bar chart of average decompression time (microseconds/pixel, axis 0 to
2.00) versus bitrate (bpp) for JPEG 2000 R, JPEG 2000 NR, PJPEG, VTC, SPIHT R
and SPIHT NR. Four bars exceed the axis and are labeled 5.20, 7.04, 9.15 and
11.57.]
SNR progressive results
JPEG suffers from a considerable degradation when
partially decoding progressive bitstreams.
All other algorithms exhibit little or no penalty when
generating progressive bitstreams. Otherwise results are
similar to the non-progressive ones.
JPEG 2000 and SPIHT do not exhibit any significant
execution overhead when generating progressive
bitstreams.
Conversely, JPEG and MPEG-4 VTC show a considerable
increase in execution time over the non-progressive case.
This is probably due to the use of multiple scans in JPEG
and multiple quantization in MPEG-4 VTC.
ROI Maxshift coding*
[Figure: PSNR (dB, axis 15 to 55) versus decoded bitrate (bpp, logarithmic axis
from 0.01 to 10) for the region of interest (ROI) and the background (BG) of
three codestreams: LS, 1.98 bpp and 0.86 bpp.]
Bike image, ROI covers 5%. Lowest 2 decomposition levels in the ROI mask. Three
SNR progressive codestreams with multiple layers: one lossless (LS @ 4.58 bpp) and
two lossy (1.98 and 0.86 bpp).
* JJ2000 JPEG 2000 reference software codec used instead of VM
Error resilience
[Figure: PSNR (dB, axis 0 to 45) versus bitrate (bpp) for JPEG 2000 R,
JPEG 2000 NR and JPEG, with one series per bit error rate: BER 0, 1e-6, 1e-5
and 1e-4.]
Symmetric binary transmission channel with random errors. No errors on codestream
headers. Average across all images. JPEG: maximum amount of restart markers; JPEG
2000: packet heads in main header, regular predictable termination and segmentation
symbols. Protections in both formats amount to less than 1% overhead.
Error resilience visual results
Bit error rate = 10^-5
JPEG 16:1 CR
JPEG 2000 16:1 CR
Images with median quality, of 200 runs
Error resilience visual results
Bit error rate = 10^-4
JPEG 16:1 CR
JPEG 2000 16:1 CR
Images with median quality, of 200 runs
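The transmission conditions used in these experiments (a symmetric binary channel with random, independent bit errors and an error-free header) can be simulated with a sketch like the one below. The function name corrupt and the protected parameter are hypothetical, and the seeded generator is only there to make runs repeatable.

```python
import random

def corrupt(data, ber, rng, protected=0):
    """Flip each bit of data independently with probability ber.

    The first `protected` bytes (e.g. the codestream headers) are left intact.
    """
    out = bytearray(data)
    for bit in range(protected * 8, len(out) * 8):
        if rng.random() < ber:
            out[bit // 8] ^= 1 << (bit % 8)
    return bytes(out)
```

Repeating the simulation with different seeds (200 runs in the slides), decoding each corrupted codestream and keeping the image with median quality reproduces the selection procedure stated above.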
Supported functionality: subjective evaluation
JPEG 2000, JPEG-LS
lossless compression performance
lossy compression performance
progressive bitstreams
Region of Interest (ROI) coding
arbitrary shaped objects
random access
low complexity
error resilience
non-iterative rate control
genericity
+++
+++++
+++++
+++
++
++
+++
+++
+++
++++
+
+++++
++
+++
JPEG
MPEG-4 VTC
PNG
(+)
+++
++
+++++
++
++
++++
+++
(+)
++
+
? (+++)
+
++
+++
+
+++
+
+++
+ : supported, the more marks the better
- : not supported
( ) : separate mode required