
Applications of the wavelet transform in image processing

Øyvind Ryan
Department of Informatics, University of Oslo
email: oyvindry@ifi.uio.no

12 November 2004
Abstract

Mathematical methods applied in the most recent image formats are presented. First of all, the application of the wavelet transform in JPEG2000 is reviewed. JPEG2000 is a standard established by the same group which created the widely used JPEG standard, and it was designed to address some of the shortcomings of JPEG. Also presented are other recently established image formats that have wavelet transforms as part of their codec. Other components of modern image compression systems are also covered, together with the mathematical and statistical methods they use.
1 Preliminaries
All image formats covered here use an image transform, quantization and coding. Each of these is described for the different formats in question. The transforms mentioned below can be extended separably to two dimensions for applications to image processing. Therefore, for simplicity, we state our results in one dimension only; the separation process is described later. We use the value $m$ for the block dimension (or, more precisely, the number of channels); for our wavelet transforms we will always have $m = 2$. We will associate a transform with $m$ filters, so that for $m = 2$ we will have only two filters. In signal processing terms, these are to be interpreted as low-pass and high-pass filters.

If we denote the signal by $x$, the block dimension (or number of channels) describes the size of the block partitioning of the signal. JPEG applies only block transforms, meaning that if the signal is split into blocks $x = \{x[i]\}$ (each $x[i]$ a vector of dimension $m$), the transformed signal $y = \{y[i]\}$ is given by

$$y[i] = A^T x[i]$$

for some $m \times m$ matrix $A$. This way we lose influence from different blocks, for instance for pixels on block boundaries. This is what gives rise to the blockiness artifact in JPEG. Important block transforms are sketched below.
Sponsored by the Norwegian Research Council, project nr. 160130/V30.
1.1 KLT (Karhunen-Loeve Transform)
The KLT is the unique transform which decorrelates its input. To be precise, define the covariance matrix $C_X$ of a random vector $X$ by

$$C_X = E\left[ (X - \mu_X)(X - \mu_X)^T \right].$$

If the KLT transform is called $K$, then the random vector $Y = K^T X$ has uncorrelated components, i.e. $C_Y = K^T C_X K$ is a diagonal matrix. This transform is what gives rise to principal component analysis (PCA). Among linear transforms, the KLT minimizes the MSE when keeping a given number of its principal components (when the principal components are ranked in decreasing order).

The drawback of the KLT is that we need to recompute the transform each time the statistics of the source change. By its nature, it cannot be separable either.
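The decorrelating property is easy to check numerically. The following is a minimal numpy sketch of my own (not part of any standard discussed here): it estimates $C_X$ from samples, takes $K$ as the eigenvector matrix of the covariance sorted by decreasing eigenvalue, and verifies that the transformed samples have an approximately diagonal covariance.

```python
import numpy as np

def klt(samples):
    """Estimate the KLT from sample vectors (one per row): K has the eigenvectors
    of the sample covariance as columns, sorted by decreasing eigenvalue."""
    mean = samples.mean(axis=0)
    cov = np.cov(samples - mean, rowvar=False)      # C_X = E[(X - mu)(X - mu)^T]
    eigvals, eigvecs = np.linalg.eigh(cov)          # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1]               # principal components first
    return mean, eigvecs[:, order]

# Example: decorrelate 8-dimensional blocks drawn from a correlated source.
rng = np.random.default_rng(0)
x = rng.standard_normal((1000, 8)) @ np.triu(np.ones((8, 8)))
mean, K = klt(x)
y = (x - mean) @ K                                  # y = K^T (x - mean), row by row
print(np.round(np.cov(y, rowvar=False), 2))         # approximately diagonal
```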
1.2 DFT (Discrete Fourier Transform)

The DFT is defined by the transform matrix

$$A = \frac{1}{\sqrt{m}} \left( e^{-2\pi i pq/m} \right)_{p,q},$$

which is unitary. The DFT has an efficient implementation through the FFT. It is also separable. One drawback of the DFT is that the transform works badly when the end points ($x_0$ and $x_{m-1}$) are far apart. If the full Fourier transform were applied in this case, many higher Fourier components would be introduced to compensate for this.
1.3 DCT
The DCT is defined by

$$A = \left( c_q \cos\left( 2\pi f_q \left( p + \tfrac{1}{2} \right) \right) \right)_{q,p}, \qquad f_q = \frac{q}{2m},$$

where

$$c_q = \begin{cases} \sqrt{1/m} & \text{if } q = 0, \\ \sqrt{2/m} & \text{if } q \neq 0. \end{cases}$$

The DCT can be constructed through the DFT by symmetrically extending the input sequence about the last point, applying the $2m$-point DFT, and recovering the first $m$ points afterwards. It is separable since the DFT is, and the FFT can be used as an efficient implementation of the DCT. There is no need to adapt the transform to the statistics of the source material (as with the KLT); the DCT is a robust approximation to the KLT for natural image sources. It is used in JPEG, MPEG and CCITT H.261.

One drawback is that there is no way to use the DCT for lossless compression, since the outputs of the transform are not integers.
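As an illustration of the construction just described (my own sketch, not taken from any standard), the following numpy snippet builds the DCT matrix from the formula above, computes the same transform by mirroring the input about its last point and taking a $2m$-point FFT, and checks that the two agree and that the matrix is orthonormal.

```python
import numpy as np

def dct_matrix(m):
    """Orthonormal DCT matrix with entries c_q cos(2 pi f_q (p + 1/2)), f_q = q/(2m)."""
    q = np.arange(m)[:, None]
    p = np.arange(m)[None, :]
    c = np.full(m, np.sqrt(2.0 / m)); c[0] = np.sqrt(1.0 / m)
    return c[:, None] * np.cos(np.pi * q * (p + 0.5) / m)

def dct_via_fft(x):
    """DCT computed by mirroring x about its last point and taking a 2m-point FFT."""
    m = len(x)
    y = np.concatenate([x, x[::-1]])                # symmetric extension, length 2m
    Y = np.fft.fft(y)[:m]                           # keep the first m DFT coefficients
    c = np.full(m, np.sqrt(2.0 / m)); c[0] = np.sqrt(1.0 / m)
    return c * np.real(np.exp(-1j * np.pi * np.arange(m) / (2 * m)) * Y) / 2

x = np.random.default_rng(1).standard_normal(8)
A = dct_matrix(8)
print(np.allclose(A @ A.T, np.eye(8)))              # the transform is orthonormal
print(np.allclose(A @ x, dct_via_fft(x)))           # both constructions agree
```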
2 JPEG (baseline)
2.1 Transform
The DCT is used as the transform of the input signal, after the input has been level shifted. If the elements of the input signal are 8 bits, level shifting means subtracting 128 from numbers in $[0, 256)$, producing numbers in $[-128, 128)$. The block dimension is always 8.
2.2 Quantization
Uniform midtread quantization (midtread means that 0 is the midpoint of the quantization interval about 0) is used. A quantization table, consisting of the step sizes for each coefficient quantizer (the table has size 8×8), is emitted with the data itself. Use smaller values in this table if you want less loss during encoding. The coefficients after quantization are called labels. The first label in a block is called the DC coefficient, the rest are called AC coefficients. Higher AC coefficients are typically rounded to 0, which provides us with good compression.
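A minimal sketch of such a quantizer follows; the step-size table below is made up for illustration and is not the example table from the JPEG standard.

```python
import numpy as np

def quantize(block, table):
    """Uniform midtread quantization: label = round(coefficient / step)."""
    return np.round(block / table).astype(int)

def dequantize(labels, table):
    """Reconstruction used by the decoder: coefficient ~ label * step."""
    return labels * table

# Hypothetical 8x8 step-size table: small steps for low frequencies,
# larger steps (more loss) for high frequencies.
table = 16 + 2.0 * (np.arange(8)[:, None] + np.arange(8)[None, :])
coeffs = np.random.default_rng(2).standard_normal((8, 8)) * 50
labels = quantize(coeffs, table)
print(labels)    # most high-frequency labels end up as 0 for typical image blocks
```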
2.3 Coding
AC and DC coefficients are coded differently. Differences between successive DC labels are coded, instead of the DC label itself. This difference is not coded for AC labels. Labels are partitioned into categories $\{0\}, \{-1, 1\}, \{-3, -2, 2, 3\}, \ldots$, of sizes $2^0, 2^1, 2^2, \ldots$, numbered $0, 1, 2, 3, \ldots$. Category numbers are Huffman coded. Coding is done in zig-zag scan order. Each label is coded with first the Huffman code of the category number, followed by the value within the category (the number of bits required equals the category number). The zig-zag scan order ensures that many coefficients are zero near the end of the traversal; these are skipped with an end-of-block code.
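The sketch below (my own illustration; the actual Huffman tables and marker syntax of the standard are omitted) shows the zig-zag scan, the category of a label, and the symbol stream produced for one block.

```python
import numpy as np

def zigzag_order(n=8):
    """(row, col) visiting order of the zig-zag scan of an n x n block."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[1] if (rc[0] + rc[1]) % 2 == 0 else rc[0]))

def category(label):
    """Category of a label: 0 for 0, and k when 2**(k-1) <= |label| < 2**k."""
    return int(abs(label)).bit_length()

def code_block(labels, previous_dc):
    """Symbol stream for one 8x8 block of quantized labels: the DC label is coded
    as a difference, AC labels follow in zig-zag order, and trailing zeros are
    replaced by a single end-of-block symbol."""
    dc_diff = labels[0, 0] - previous_dc
    symbols = [("DC", category(dc_diff), dc_diff)]
    ac = [labels[r, c] for (r, c) in zigzag_order()[1:]]
    while ac and ac[-1] == 0:
        ac.pop()                                   # trailing zeros are skipped ...
    symbols += [("AC", category(v), v) for v in ac]
    symbols.append(("EOB",))                       # ... and signalled with an end-of-block code
    return symbols

labels = np.zeros((8, 8), dtype=int)
labels[0, 0], labels[0, 1], labels[1, 0] = 26, -3, 2
print(code_block(labels, previous_dc=20))
```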
Drawbacks and limitations of the JPEG standard are

1. blockiness, due to splitting the image into independent parts,

2. blocks are always processed sequentially (there is no way to obtain other types of progression),

3. a lossy version only, since quantization is always performed on floating point values.
3 JPEG2000
3.1 Wavelet transform basics
We follow the introduction to subband transforms and wavelets used in [4]. A lighter introduction with more examples can be found in [5]. Going through these two references together can be very helpful. Instead of applying a block transform, one can attempt a transform where one block influences many other (surrounding) blocks. This may reduce the blockiness, even if the transformed signal in the end is partitioned into independent blocks anyway. We will consider a subband transform as our candidate for such a transform:

Definition 1 An (analysis) subband transform for the set of $m \times m$ matrices $\{A[i]\}_{i \in \mathbb{Z}}$ is defined by

$$y[n] = \sum_{i \in \mathbb{Z}} A^T[i]\, x[n-i].$$

Definition 2 A (synthesis) subband transform for the set of $m \times m$ matrices $\{S[i]\}_{i \in \mathbb{Z}}$ is defined by

$$x[n] = \sum_{i \in \mathbb{Z}} S[i]\, y[n-i].$$
Figure 1: One dimensional convolutional transform.
These definitions can be thought of as one dimensional convolutional transforms, as shown in figure 1. The analysis transform produces a transformed signal from an input signal, while the synthesis transform should recover the input signal from the transformed signal. We say that we have perfect reconstruction if there exists a synthesis transform exactly inverting the analysis transform. JPEG2000 applies subband transforms with only two channels ($m = 2$), as opposed to JPEG's block transform with eight channels. So artifacts like blocking may be removed when using subband transforms, even if the number of channels is decreased.
3.2 Expressing transforms with filter banks

One can write

$$y[nm+q] = (x \star h_q)[mn], \qquad 0 \le q < m, \qquad (1)$$

where the filter bank $\{h_q\}_{0 \le q < m}$ is defined by $h_q[mi - j] = (A^T[i])_{q,j}$. This expresses the analysis operation through filter banks.

One can also write

$$x = \sum_{q=0}^{m-1} (\tilde{y}_q \star g_q),$$

where the filter bank $\{g_q\}_{0 \le q < m}$ is defined by $g_q[mi + j] = (S[i])_{j,q}$, and where

$$\tilde{y}_q[i] = \begin{cases} y_q[i/m] & \text{if } m \text{ divides } i, \\ 0 & \text{otherwise.} \end{cases}$$

This expresses the synthesis operation through filter banks. When constructing subband transforms from wavelets, we will construct the transform by first finding a filter bank from the scaling function of the wavelet.
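In other words, the analysis operation is "filter with each $h_q$ and keep every $m$-th output sample", and the synthesis operation is "insert $m-1$ zeros between the samples of each subband, filter with $g_q$ and add". A small numpy sketch of my own, using the Haar filters with one possible sign and delay convention (chosen only for illustration), is:

```python
import numpy as np

def analysis(x, filters, m=2):
    """y_q[n] = (x * h_q)[m n]: filter with each h_q, then keep every m-th sample."""
    return [np.convolve(x, h)[::m] for h in filters]

def synthesis(subbands, filters, m=2):
    """x = sum_q (upsampled y_q) * g_q: insert m-1 zeros between samples, then filter."""
    out = 0
    for y, g in zip(subbands, filters):
        up = np.zeros(m * len(y))
        up[::m] = y
        out = out + np.convolve(up, g)
    return out

# Haar filters: a two-channel pair with perfect reconstruction.
h0 = g0 = np.array([1, 1]) / np.sqrt(2)
h1 = np.array([1, -1]) / np.sqrt(2)
g1 = np.array([-1, 1]) / np.sqrt(2)

x = np.random.default_rng(3).standard_normal(16)
xr = synthesis(analysis(x, [h0, h1]), [g0, g1])
print(np.allclose(xr[1:17], x))   # reconstruction agrees with x up to a one-sample delay
```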
3.3 Expressing transforms in terms of vectors
Let $\langle \cdot, \cdot \rangle$ be the inner product in $\ell^2(\mathbb{Z})$. One can write $y_q[n] = \langle x, a_q^{(n)} \rangle$, where

$$a_q[k] = \overline{h_q[-k]}, \qquad a_q^{(n)}[k] = a_q[k - mn]$$

are the analysis vectors. This expresses the analysis operation in terms of the analysis vectors.

One can also write $x = \sum_{q=0}^{m-1} \sum_n y_q[n]\, s_q^{(n)}$, where

$$s_q[k] = g_q[k], \qquad s_q^{(n)}[k] = s_q[k - mn]$$

are the synthesis vectors. This expresses the synthesis operation in terms of the synthesis vectors.
3.4 Orthonormal and biorthogonal transforms
Definition 3 An orthonormal subband transform is a transform for which the synthesis vectors are orthonormal.

It is easy to show that for orthonormal subband transforms, the analysis and synthesis vectors are equal ($s_q = a_q$ for all $q$), and the analysis and synthesis matrices are reversed versions of one another, $A[i] = S[-i]$. Orthonormal subband transforms are the natural extension of orthonormal (unitary) block transforms.

If the analysis system is given by filters $h_0, h_1$, and the synthesis system is given by filters $g_0, g_1$, one can calculate the end-to-end transfer function of analysis combined with synthesis. In order to avoid aliasing, one finds that

$$\hat{h}_0(\omega + \pi)\,\hat{g}_0(\omega) + \hat{h}_1(\omega + \pi)\,\hat{g}_1(\omega) = 0, \qquad (2)$$

and if we in addition want perfect reconstruction,

$$\hat{h}_0(\omega)\,\hat{g}_0(\omega) + \hat{h}_1(\omega)\,\hat{g}_1(\omega) = 2. \qquad (3)$$
Example 1 Let us take a look at a popular definition of orthonormal subband transforms through filter banks. This is an alternative definition of what is called Quadrature Mirror Filters (QMF). Given a low-pass prototype $f$,

$$h_0[k] = g_0[k] = f[k],$$
$$h_1[k] = g_1[k] = (-1)^{k+1} f(-(k+1)).$$

Note that

$$g_1[n] = (-1)^{n+1} g_0[-(n-1)], \qquad (4)$$

or $\hat{g}_1(\omega) = e^{-i\omega}\,\overline{\hat{g}_0(\omega + \pi)}$ in the Fourier domain. Note also that

$$h_1[n] = (-1)^{n+1} h_0[-(n+1)], \qquad (5)$$

or $\hat{h}_1(\omega) = e^{i\omega}\,\overline{\hat{h}_0(\omega + \pi)}$ in the Fourier domain. These relations will be used when we construct biorthogonal transforms below.

It is not hard to see that the alias condition (2) is satisfied, and that perfect reconstruction (3) is satisfied if

$$\left| \hat{h}_0(\omega) \right|^2 + \left| \hat{h}_0(\omega + \pi) \right|^2 = 2. \qquad (6)$$

It is also not hard to show that the $\{h_0[i - 2n]\}_n$, $\{h_1[i - 2n]\}_n$ are orthonormal and give rise to an orthonormal subband transform if $f = h_0$ satisfies this.
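For a concrete check (my own sketch, using the Haar prototype as an assumed example), the snippet below builds $h_1$ from the prototype according to the formula above and verifies condition (6) on a frequency grid.

```python
import numpy as np

# Low-pass prototype for the Haar transform, f[0] = f[1] = 1/sqrt(2).
f_support = np.array([0, 1])
f = np.array([1.0, 1.0]) / np.sqrt(2)

# h1[k] = (-1)**(k+1) * f(-(k+1)), which lives on k = -2, -1 for this prototype.
h1_support = -(f_support[::-1] + 1)
h1 = np.array([(-1.0) ** (k + 1) * f[-(k + 1)] for k in h1_support])
print(h1_support, np.round(h1, 3))

def dtft(taps, support, w):
    """Discrete-time Fourier transform of a finitely supported filter."""
    return sum(t * np.exp(-1j * w * k) for t, k in zip(taps, support))

w = np.linspace(0, np.pi, 5)
power = np.abs(dtft(f, f_support, w)) ** 2 + np.abs(dtft(f, f_support, w + np.pi)) ** 2
print(np.allclose(power, 2.0))   # |h0(w)|^2 + |h0(w + pi)|^2 = 2 holds for this prototype
```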
Example 2 Lapped orthogonal transform with cosine modulated filter bank: the analysis vectors are defined by

$$a_q[k] = c_q[k] = \begin{cases} \frac{1}{\sqrt{m}} \cos\left( 2\pi f_q \left( k - \frac{m-1}{2} \right) \right) & \text{if } -m \le k < m, \\ 0 & \text{otherwise,} \end{cases}$$

where the cosine frequencies are $f_q = \frac{q + \frac{1}{2}}{2m}$, $0 \le q < m$. It is easy to verify that these analysis vectors give rise to an orthonormal subband transform, where all analysis matrices are 0, except for $A[0]$ and $A[1]$. Such transforms are called lapped transforms.

One can get a more general family of lapped transforms by defining

$$a_q[k] = c_q[k]\, w[k],$$

where the windowing sequence $w$ satisfies

$$w[k] = w[-1-k], \qquad 0 \le k < m,$$
$$w^2[k] + w^2[m-1-k] = 2, \qquad 0 \le k < m.$$

One can show that any window sequence satisfying these assumptions gives rise to a new lapped orthonormal transform. These transforms work well, and by choosing the windowing sequence wisely, one can obtain very good frequency discrimination between the subbands.

The only thing we miss in the above example is linear phase (linear phase means that the filter sequence is symmetric or anti-symmetric about some point). Linear phase will ensure that filter applications preserve the support of the filter, which is a very nice property to use in an implementation. As it turns out, we cannot get this in addition to orthonormality: one can show that there exist no two-channel ($m = 2$) nontrivial, linear phase, finitely supported orthonormal subband transforms. We therefore extend our transforms to the following class.
Definition 4 A biorthogonal subband transform is a transform for which

$$\langle s_{q_1}^{(n_1)}, a_{q_2}^{(n_2)} \rangle = \delta[q_1 - q_2]\,\delta[n_1 - n_2], \qquad 0 \le q_1, q_2 < m, \quad n_1, n_2 \in \mathbb{Z}. \qquad (7)$$

Contrary to the case for orthonormal transforms, there exist two-channel nontrivial, linear phase, finitely supported biorthogonal subband transforms. Biorthogonal transforms are important in image compression also because they may approximate orthonormal transforms well. It is not hard to see that biorthogonality is equivalent to perfect reconstruction.
Example 3 We will construct biorthogonal subband transforms in the following way: we start with filters $h_0, g_0$, and construct filters $h_1, g_1$ using equations (5) and (4). To get a biorthogonal transform, we must construct $h_0, g_0$ jointly so that equation (7) is satisfied.

Alias cancellation and perfect reconstruction in this case reduce to

$$\hat{h}_0(\omega + \pi)\,\hat{g}_0(\omega) = \overline{\hat{h}_0(\omega)\,\hat{g}_0(\omega + \pi)},$$
$$\hat{h}_0(\omega)\,\hat{g}_0(\omega) + \overline{\hat{h}_0(\omega + \pi)\,\hat{g}_0(\omega + \pi)} = 2.$$

These equations are normally specified another way, in which one uses functions associated with wavelets: $m_0(\omega) = \frac{1}{\sqrt{2}}\,\hat{g}_0(\omega)$ and $\tilde{m}_0(\omega) = \frac{1}{\sqrt{2}}\,\hat{h}_0(\omega)$.
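As a numerical illustration of the perfect reconstruction condition (a sketch of my own, using the √2-normalized 5/3 filter pair that reappears in the lossless JPEG2000 example later, written as zero-phase symmetric filters so that delay factors and conjugates can be ignored):

```python
import numpy as np

# A sqrt(2)-normalized 5/3 low-pass pair, written as zero-phase symmetric filters.
h0 = np.sqrt(2) * np.array([-1/8, 1/4, 3/4, 1/4, -1/8])   # analysis low-pass, taps at -2..2
g0 = np.sqrt(2) * np.array([1/4, 1/2, 1/4])                # synthesis low-pass, taps at -1..1

def dtft(taps, center, w):
    """DTFT of a filter whose middle tap sits at index `center`."""
    k = np.arange(len(taps)) - center
    return np.sum(taps[:, None] * np.exp(-1j * np.outer(k, w)), axis=0)

w = np.linspace(0, 2 * np.pi, 64)
pr = dtft(h0, 2, w) * dtft(g0, 1, w) + dtft(h0, 2, w + np.pi) * dtft(g0, 1, w + np.pi)
print(np.allclose(pr, 2.0))    # the perfect-reconstruction condition evaluates to 2
```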
3.5 Multi-resolution analysis (MRA)
We turn now to the concept of constructing biorthogonal/orthonormal
transforms from wavelets.
Definition 5 A multi-resolution analysis (MRA) on $L^2(\mathbb{R})$ is a set of subspaces

$$\cdots \subset V^{(2)} \subset V^{(1)} \subset V^{(0)} \subset V^{(-1)} \subset V^{(-2)} \subset \cdots$$

satisfying the following properties.

(MR-1) $\overline{\bigcup_{m \in \mathbb{Z}} V^{(m)}} = L^2(\mathbb{R})$.

(MR-2) $\bigcap_{m \in \mathbb{Z}} V^{(m)} = \{0\}$.

(MR-3) $x(t) \in V^{(0)} \Leftrightarrow x(2^{-m} t) \in V^{(m)}$.

(MR-4) $x(t) \in V^{(0)} \Leftrightarrow x(t - n) \in V^{(0)}$.

(MR-5) There exists an orthonormal basis $\{\phi_n\}_{n \in \mathbb{Z}}$ for $V^{(0)}$ such that $\phi_n(t) = \phi(t - n)$. The function $\phi(t)$ is called the scaling function.

Since $V^{(0)} \subset V^{(-1)}$, MR-3 and MR-4 show that we can write

$$\phi(t) = \sum_{n=-\infty}^{\infty} g_0[n]\, \phi(2t - n)$$

for some sequence $g_0$. $g_0$ is to be thought of as a low-pass prototype. This equation is called the two-scale equation. From the MR properties one can deduce from the two-scale equation that the vectors $\{g_0[i - 2n]\}_n$ are orthonormal. From example 1, we can associate it with a function $f$, and obtain an orthonormal subband transform this way. The high-pass filter obtained in this way, $g_1$, can be used to construct a function

$$\psi(t) = \sum_{n=-\infty}^{\infty} g_1[n]\, \phi(2t - n).$$

This function is called the mother wavelet, and has the nice property that its translated dilations $\psi_n^{(m)}(t) = 2^{-m/2}\, \psi(2^{-m} t - n)$ are orthonormal functions spanning $L^2(\mathbb{R})$.
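The two-scale equation also gives a simple way to picture the scaling function numerically: the cascade algorithm (my own illustration, not part of the JPEG2000 machinery). Starting from a single impulse and repeatedly upsampling by two and filtering with $g_0$ (which sums to 2 in the convention of the two-scale equation above) produces samples converging to $\phi$.

```python
import numpy as np

def cascade(g0, levels=8):
    """Approximate the scaling function by iterating the two-scale equation:
    start from an impulse, then repeatedly upsample by 2 and filter with g0."""
    phi = np.array([1.0])
    for _ in range(levels):
        up = np.zeros(2 * len(phi))
        up[::2] = phi
        phi = np.convolve(up, g0)
    return phi   # samples of an approximation to phi on a grid of spacing 2**-levels

# Two-scale filter of the linear spline scaling function (it sums to 2 in this convention);
# the cascade converges to the piecewise-linear "hat" function.
g0 = np.array([0.5, 1.0, 0.5])
phi = cascade(g0)
print(round(phi.max(), 3))   # peak value 1.0, attained at the centre of the hat
```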
3.5.1 Interpretation of MRA in image processing
MRA has the following interpretation with respect to image processing. The input signal is represented as an element of $V^{(0)}$, by using the components of the signal as coefficients for the translates of the scaling function:

$$x(t) = \sum_{n \in \mathbb{Z}} y_0^{(0)}[n]\, \phi(t - n).$$

Define $W^{(m)}$ as the span of the $\{\psi_n^{(m)}(t)\}_n$. It is not difficult to show that

1. $V^{(m)}$ and $W^{(m)}$ are orthogonal subspaces,

2. $V^{(m)} = V^{(m+1)} \oplus W^{(m+1)}$,

3. the coefficients in such a decomposition can be obtained by filtering with $h_0$ and $h_1$ respectively.

Note that points 1 and 2 imply that $V^{(0)} = \bigoplus_{i > 0} W^{(i)}$. We need to explain point 3 further. Equation (1) produces, through filtering with $h_0, h_1$, two sequences (i.e. the two polyphase components of the transformed signal) from an input signal. We let $y_0^{(0)}$ be the input signal, and let the two sequences produced be $y_0^{(1)}$ and $y_1^{(1)}$. Then one can show (by also using the two-scale equation)

$$\underbrace{\sum_{n \in \mathbb{Z}} y_0^{(0)}[n]\, \phi(t - n)}_{V^{(0)}} = \underbrace{\sum_{n \in \mathbb{Z}} y_0^{(1)}[n]\, \phi_n^{(1)}(t)}_{V^{(1)}} + \underbrace{\sum_{n \in \mathbb{Z}} y_1^{(1)}[n]\, \psi_n^{(1)}(t)}_{W^{(1)}},$$

which explains point 3. This can be done iteratively, by writing

$$\underbrace{\sum_{n \in \mathbb{Z}} y_0^{(1)}[n]\, \phi_n^{(1)}(t)}_{V^{(1)}} = \underbrace{\sum_{n \in \mathbb{Z}} y_0^{(2)}[n]\, \phi_n^{(2)}(t)}_{V^{(2)}} + \underbrace{\sum_{n \in \mathbb{Z}} y_1^{(2)}[n]\, \psi_n^{(2)}(t)}_{W^{(2)}},$$
and so on. Therefore, we obtain wavelet coefficients (i.e. coefficients in $W^{(m)}$) by iterative applications of the filters $h_0, h_1$.

The interpretation of the wavelet subspaces $W^{(i)}$ in image processing is in terms of resolution: wavelet subspaces at higher indices can be thought of as image content at lower resolution, and the subspace $V^{(m)}$ for high $m$ can be thought of as a basis for obtaining a low resolution approximation of the image. If one writes

$$V^{(0)} = V^{(2)} \oplus W^{(2)} \oplus W^{(1)},$$

and decomposes a signal $x = y_0^{(2)} + y_1^{(2)} + y_1^{(1)}$ into components in these subspaces, one has in addition two approximations to the signal: $y_0^{(2)} + y_1^{(2)}$ and $y_0^{(2)}$, where the first one is a better approximation than the last one. One can view these approximations as versions of the signal with higher frequencies dropped, since the coefficients are obtained through filtering with $h_0$, which can be viewed as a low-pass filter. The effect of dropping high frequencies in the approximation can be seen especially at sharp edges in an image. These get more blurred, since they cannot be represented exactly at lower frequencies.
Transforms in image processing are two-dimensional, so we need a few comments on how we implement a separable transform. When a two-dimensional transform is separable, we can calculate it by applying the corresponding one-dimensional transform to the columns first, and then to the rows. When filtering, we have four possibilities (sketched in code after this paragraph):

1. low-pass filter to rows, followed by low-pass filter to columns (LL-coefficients),

2. low-pass filter to rows, followed by high-pass filter to columns (HL-coefficients),

3. high-pass filter to rows, followed by low-pass filter to columns (LH-coefficients),

4. high-pass filter to rows, followed by high-pass filter to columns (HH-coefficients).

When a separable transform is applied, only the LL-coefficients may need further decomposition. When this decomposition is done at many levels, we get the subband decomposition in figure 2. A similar type of decomposition is sketched for FBI fingerprint compression in section 3.12.
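A minimal numpy sketch of one level of such a separable decomposition, using the Haar filters as stand-ins for the JPEG2000 kernels (the subband names follow the list above):

```python
import numpy as np

def analysis_1d(x, h0, h1):
    """One-dimensional two-channel analysis: filter and keep every second sample."""
    return np.convolve(x, h0)[::2], np.convolve(x, h1)[::2]

def dwt2_level(image, h0, h1):
    """One level of a separable 2D transform: transform the columns, then the rows."""
    lo_cols, hi_cols = [], []
    for col in image.T:
        lo, hi = analysis_1d(col, h0, h1)
        lo_cols.append(lo)
        hi_cols.append(hi)
    L, H = np.array(lo_cols).T, np.array(hi_cols).T       # column-filtered halves

    def rows(block):
        lo, hi = zip(*(analysis_1d(row, h0, h1) for row in block))
        return np.array(lo), np.array(hi)

    (LL, LH), (HL, HH) = rows(L), rows(H)                 # names as in the list above
    return LL, HL, LH, HH

h0 = np.array([1, 1]) / np.sqrt(2)
h1 = np.array([1, -1]) / np.sqrt(2)
image = np.random.default_rng(4).standard_normal((16, 16))
subbands = dwt2_level(image, h0, h1)
print([s.shape for s in subbands])   # four subbands, each roughly half size in each direction
```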
The wavelet subspace decomposition in two dimensions has a similar form:

$$V^{(m)} = V^{(m+1)} \oplus W_{0,1}^{(m+1)} \oplus W_{1,0}^{(m+1)} \oplus W_{1,1}^{(m+1)},$$
and the mother wavelet basis functions are expressed in terms of the synthesis filters by

$$\psi_{0,1}(s_1, s_2) = 2 \sum_{n_1, n_2 \in \mathbb{Z}} g_0[n_1]\, g_1[n_2]\, \phi(2s_1 - n_1, 2s_2 - n_2),$$
$$\psi_{1,0}(s_1, s_2) = 2 \sum_{n_1, n_2 \in \mathbb{Z}} g_1[n_1]\, g_0[n_2]\, \phi(2s_1 - n_1, 2s_2 - n_2),$$
$$\psi_{1,1}(s_1, s_2) = 2 \sum_{n_1, n_2 \in \mathbb{Z}} g_1[n_1]\, g_1[n_2]\, \phi(2s_1 - n_1, 2s_2 - n_2).$$
Figure 2: Passband structure for a two dimensional subband transform with three levels.
Example 4 With three different resolutions, our subband decomposition can be written

$$V^{(0)} = V^{(2)} \oplus W_{0,1}^{(2)} \oplus W_{1,0}^{(2)} \oplus W_{1,1}^{(2)} \oplus W_{0,1}^{(1)} \oplus W_{1,0}^{(1)} \oplus W_{1,1}^{(1)} = LL_2 \oplus HL_2 \oplus LH_2 \oplus HH_2 \oplus HL_1 \oplus LH_1 \oplus HH_1.$$

Contributions from these subspaces appear in the same order as above in a JPEG2000 file, and including more of the subspaces results in higher resolution. We demonstrate this here with a computer generated file. Figures 3 through 9 show file sizes and images at all decomposition levels in this case. Also shown graphically are the subbands which are dropped; these are blacked out. We see that we gradually lose resolution when dropping more and more wavelet subband spaces, but that even at the lowest resolution the image is recognizable, even though the file size is reduced from 105 kb to 17 kb. This is a very nice property for usage in web browsers. Note that these file sizes are calculated by replacing the contents of the dropped subbands with zeroes. They are close to the number of bytes obtained if the subbands were dropped in their entirety.

If we can find a wavelet with nice properties, most wavelet coefficients are close to 0, and can thus be dropped in a lossy compression scheme.
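The experiment in figures 3 through 9 is easy to reproduce in outline. The sketch below assumes the PyWavelets package is available ('bior2.2' is its name for a biorthogonal spline kernel closely related to the 5/3 transform discussed later); it decomposes an image, zeroes out the finest detail subbands and reconstructs a lower-resolution approximation. The actual file sizes quoted above were produced with a JPEG2000 codec, which this sketch does not replace.

```python
import numpy as np
import pywt   # assumes the PyWavelets package

rng = np.random.default_rng(5)
image = rng.standard_normal((64, 64))

# [approximation, (detail subbands level 2), (detail subbands level 1)]
coeffs = pywt.wavedec2(image, 'bior2.2', level=2)
coeffs[-1] = tuple(np.zeros_like(c) for c in coeffs[-1])   # drop the finest detail subbands
lowres = pywt.waverec2(coeffs, 'bior2.2')                  # lower-resolution approximation
print(image.shape, lowres.shape)
```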
3.5.2 Biorthogonal wavelets
Given that orthonormality in (MR-5) is replaced with linear independence, we can follow the same reasoning as for orthonormality to create a mother wavelet function. The wavelet coefficient subspaces will not be orthogonal in this case. When iterating the filters $g_0, g_1$, we decompose into subspaces spanned by the scaling function $\phi$ and mother wavelet $\psi$. When we iterate the filters $h_0, h_1$, we decompose into subspaces spanned by the dual scaling function $\tilde{\phi}$ and dual mother wavelet $\tilde{\psi}$. Scaling and dual scaling functions differ only in the biorthogonal case. We can deduce a biorthogonal subband transform under appropriate conditions. Expressed with mother wavelets, the criterion for constructing a biorthogonal wavelet becomes

$$\langle \psi_n^{(m)}, \tilde{\psi}_{\tilde{n}}^{(\tilde{m})} \rangle = \delta[n - \tilde{n}]\,\delta[m - \tilde{m}], \qquad \forall\, n, m, \tilde{n}, \tilde{m}.$$
Figure 3: File with no loss, i.e. all wavelet subband spaces included. Its size is 105 kb.

Figure 4: We then remove the $W_{1,1}^{(1)}$-coefficients. Its size is 94 kb.

Figure 5: We then remove the $W_{1,0}^{(1)}$-coefficients also. Its size is 84 kb.

Figure 6: We then remove the $W_{0,1}^{(1)}$-coefficients also. Its size is 73 kb.

Figure 7: We then remove the $W_{1,1}^{(2)}$-coefficients also. Its size is 57 kb.

Figure 8: We then remove the $W_{1,0}^{(2)}$-coefficients also. Its size is 37 kb.

Figure 9: Finally, we remove the $W_{0,1}^{(2)}$-coefficients also. Its size is 17 kb.
Since we assume linear phase, we will work with biorthogonal wavelets from now on.

Not all biorthogonal subband transforms and orthonormal transforms give rise to wavelets. In order for a filter bank to give rise to a wavelet, one can show that $m_0$ (defined in example 3) must have a zero at $\pi$, and that the number of zeroes affects the smoothness of the scaling function: more zeroes means a smoother scaling function.

Daubechies found all FIR filters with $N$ zeroes at $\pi$ which give rise to orthonormal wavelets. Using similar calculations, Cohen, Daubechies, and Feauveau [3] found FIR filters of odd length, linear phase and with delay-normalized transforms with $N$, $\tilde{N}$ zeroes at $\pi$ (these must both be even), which give rise to biorthogonal wavelets. These are

$$m_0(\omega) = \left( \cos\frac{\omega}{2} \right)^N p_0(\cos\omega), \qquad \tilde{m}_0(\omega) = \left( \cos\frac{\omega}{2} \right)^{\tilde{N}} \tilde{p}_0(\cos\omega),$$

where $p_0(x)\,\tilde{p}_0(x)$ is an arbitrary factorization of the polynomial (set $M = \frac{N + \tilde{N}}{2}$)

$$P(x) = \sum_{n=0}^{M-1} \binom{M + n - 1}{n} \left( \frac{1 - x}{2} \right)^n.$$

The factorization of the polynomial $P(x)$ is not completely arbitrary, since we must group complex conjugate roots together to get real-valued filter coefficients.
Example 5 If we set $p_0(x) \equiv 1$, we obtain biorthogonal wavelets with filter banks consisting of dyadic fractions only. If we in addition set $N = \tilde{N} = 2$, we obtain the Spline 5/3 transform. This is used by JPEG2000 for lossless compression. One can show that definition 1 simplifies to
$$y[n] = \sqrt{2}\left( \begin{pmatrix} -\tfrac{1}{8} & -\tfrac{1}{4} \\ 0 & \tfrac{1}{8} \end{pmatrix} x[n+1] + \begin{pmatrix} \tfrac{3}{4} & -\tfrac{1}{4} \\ \tfrac{1}{4} & \tfrac{3}{4} \end{pmatrix} x[n] + \begin{pmatrix} -\tfrac{1}{8} & 0 \\ \tfrac{1}{4} & -\tfrac{1}{8} \end{pmatrix} x[n-1] \right)$$
in this case. Similarly, definition 2 simplifies to
$$x[n] = \sqrt{2}\left( \begin{pmatrix} 0 & 0 \\ \tfrac{1}{4} & 0 \end{pmatrix} y[n+1] + \begin{pmatrix} \tfrac{1}{2} & -\tfrac{1}{4} \\ \tfrac{1}{4} & \tfrac{1}{2} \end{pmatrix} y[n] + \begin{pmatrix} 0 & \tfrac{1}{4} \\ 0 & 0 \end{pmatrix} y[n-1] \right)$$
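In practice, the reversible 5/3 path of JPEG2000 is evaluated with lifting steps and integer arithmetic rather than with the convolution form above. A minimal sketch of the lifting form (my own, using circular boundary handling for brevity where the standard prescribes symmetric extension) is:

```python
import numpy as np

def fwd53(x):
    """Reversible 5/3 analysis in lifting form (integer arithmetic, even-length input,
    circular boundary handling)."""
    x = np.asarray(x, dtype=int)
    even, odd = x[0::2].copy(), x[1::2].copy()
    # predict step: each odd sample minus the average of its even neighbours
    odd -= (even + np.roll(even, -1)) // 2
    # update step: each even sample plus a rounded quarter of its odd neighbours
    even += (np.roll(odd, 1) + odd + 2) // 4
    return even, odd          # low-pass and high-pass labels

def inv53(even, odd):
    even = even - (np.roll(odd, 1) + odd + 2) // 4
    odd = odd + (even + np.roll(even, -1)) // 2
    x = np.empty(2 * len(even), dtype=int)
    x[0::2], x[1::2] = even, odd
    return x

x = np.random.default_rng(6).integers(0, 256, 32)
lo, hi = fwd53(x)
print(np.array_equal(inv53(lo, hi), x))   # lossless: exact integer reconstruction
```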
Example 6 If we split the zeroes at $\pi$ as equally as we can, and the factors in $p_0(x)$ and $\tilde{p}_0(x)$ equally also, and also set $N = \tilde{N} = 4$, we obtain the wavelet JPEG2000 uses for lossy compression. This is the CDF 9/7 transform. For the lossless transform above, only $A[-1], A[0], A[1], S[-1], S[0], S[1]$ were nonzero. For this lossy transform, only $A[-2], A[-1], A[0], A[1], A[2], S[-2], S[-1], S[0], S[1], S[2]$ are nonzero.

5/3 and 9/7 above refer to the numbers of nonzero coefficients in the corresponding filters.
3.6 Transform
The wavelet transform is used, with different wavelet kernels. The transform may be skipped in its entirety (typically done so in lossless compression), and it may also be applied fully or only partially. The DWT has efficient implementations, both in the lossy and the lossless case.
3.7 Quantization
Deadzone scalar quantization is the quantization method used by JPEG2000. Extensions to the standard open up for other quantization schemes.
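A sketch of a deadzone quantizer (my own illustration; the step size and the mid-bin reconstruction offset are chosen here for demonstration, and the standard's signalling of step sizes is omitted):

```python
import numpy as np

def deadzone_quantize(coeffs, step):
    """Deadzone scalar quantizer: sign(c) * floor(|c| / step); the bin around zero
    is twice as wide as the others, which sends more coefficients to zero."""
    return np.sign(coeffs) * np.floor(np.abs(coeffs) / step)

def deadzone_dequantize(labels, step, r=0.5):
    """Reconstruction for nonzero labels, placed a fraction r into the bin."""
    return np.sign(labels) * (np.abs(labels) + r) * step * (labels != 0)

c = np.random.default_rng(7).standard_normal(10) * 4
q = deadzone_quantize(c, step=2.0)
print(q, deadzone_dequantize(q, step=2.0))
```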
3.8 Coding
MQ coding is used. The image is split into tiles (not smaller blocks as in JPEG), with a typical size of 512×512. Each tile is decomposed into constituent parts, using particular filter banks. Demands on the code-stream are:

1. spatial random access into the bitstream,

2. a distortion scalable bitstream,

3. progression scalability,

4. resolution scalability.
3.9 Applications to video compression
MPEG4
3.10 Applications to speech recognition
PCA (Principal component analysis) is a common technique in speech recognition.
3.11 Applications to face recognition
Elastic bunch graph matching. Gabor wavelets.
Figure 10: Decomposition structure employed in the FBI fingerprint compression standard.
3.12 Applications to fingerprint compression and matching

The FBI uses its own standard [1] for compressing fingerprint images. Compression algorithms with wavelet-based transformations were selected in competition with compression using fractal transformations. The FBI's standard has similarities with the JPEG2000 standard, and especially with an extension to the JPEG2000 standard. It uses another subband decomposition; this is demonstrated in figure 10.

Further decomposition of the LH-, HL- and HH-bands like this may improve compression somewhat, since the effect of the filter bank application may be thought of as an approximative orthonormalization process. The extension to the JPEG2000 standard also opens up for this type of more general subband decompositions. In the FBI's standard we may also use many different wavelets, with the coefficients of the corresponding filter banks signalled in the code-stream. The only constraint on the filters is that there should be no more than 32 nonzero coefficients. This is much longer than for lossy compression in JPEG2000 (9 nonzero coefficients). This may be necessary, since fingerprint images have many more sharp edges than most natural images. The FBI has its own set of filters which it recommends. The JPEG2000 extension opens up for user-definable wavelet kernels also. The coding is done differently in the FBI's standard: Huffman coding is used, with tables calculated on a per image basis. It turns out to be impossible (at least so far) to find a lossless compression algorithm for fingerprint images with a compression ratio of more than 2:1. Similar phenomena can be observed with JPEG2000: if an image with many sharp edges is wavelet-transformed, the compressed data may be larger compared to when the image is NOT wavelet-transformed (many small coefficients are obtained after wavelet transformation, and we would obtain compression in the lossy case only, since these would be quantized to 0 by the lossy transformation). JPEG2000 solves this by not making the wavelet transform mandatory: it can be applied down to a given level only, or skipped altogether.
3.13 Other applications of wavelets
Blending of multiple images [2]. If several lighting sources are combined,
the result may be obtained by combining a set of basis images. The
combination can be done very fast in the wavelet domain.
References
[1] WSQ Gray-scale Fingerprint Image Compression Specification.

[2] I. Drori and D. Lischinski. Fast multiresolution image operations in the wavelet domain. IEEE Transactions on Visualization and Computer Graphics, 9(3):395-412, 2003.

[3] A. Cohen, I. Daubechies, and J.-C. Feauveau. Biorthogonal bases of compactly supported wavelets. Communications on Pure and Applied Mathematics, 45(5):485-560, June 1992.

[4] David S. Taubman and Michael W. Marcellin. JPEG2000: Image Compression Fundamentals, Standards and Practice. Kluwer Academic Publishers, 2002.

[5] Khalid Sayood. Introduction to Data Compression. Academic Press, 2000.