Image Compression

EEE 6209 Digital Image Processing
Outline
Motivation of Image Compression
Fundamentals of Compression
Compression Models and Standards
Basic Compression Methods for Grayscale Image
Basic Compression Methods for Binary Images
Block Transform Coding and JPEG
Optimal Quantization
Wavelet Image Compression and JPEG-2000
Image Watermarking
Assignments
Dr. S. M. Mahbubur Rahman

Motivation
Data storage for color movie formats:
Standard Definition (SD): (720×480) pixels × (3×8) bits/pixel
High Definition (HD): (1920×1080) pixels × (3×8) bits/pixel
A two-hour SD movie at 30 frames/sec requires 224 GB, i.e., 27 dual-layer DVDs of 8.5 GB each. HD requires even more!
To fit this movie on a single DVD, the compression ratio must exceed 26.
An 8-megapixel digital camera needs 24 MB to store one full-color image, so a 1 GB portable flash memory holds only 41 images. Thus, image compression is a must.
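The storage figures quoted on this slide can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 GB = 10^9 bytes) and 3 bytes per pixel; the function name `raw_video_bytes` is only illustrative.

```python
import math

def raw_video_bytes(width, height, fps, seconds, bytes_per_pixel=3):
    """Uncompressed size of a video in bytes (3 bytes = 3x8 bits per pixel)."""
    return width * height * bytes_per_pixel * fps * seconds

# Two-hour SD movie at 30 frames/sec
sd_gb = raw_video_bytes(720, 480, 30, 2 * 3600) / 1e9
dvds = math.ceil(sd_gb / 8.5)             # number of 8.5 GB dual-layer DVDs
min_ratio = sd_gb / 8.5                   # compression ratio needed for one DVD

# 8-megapixel full-color image (24 MB) on a 1 GB flash memory
images_per_gb = int(1e9 // (8e6 * 3))

print(round(sd_gb), dvds, round(min_ratio, 1), images_per_gb)
```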

Motivation
Data communication over the Internet at home:
Phone line (dial-up modem): 56 kbps
Broadband (coaxial cable): 12 Mbps
Transmission times for a single (128×128) pixels × (3×8) bits/pixel image are about 7 sec and 0.03 sec, respectively.
Image compression may reduce this transmission time by a factor of 2 to 10.
Compression of images and video plays a vital role in realizing next-generation home-delivered IPTV.

Fundamentals
Data and information are not the same! Information is conveyed through data. Different representations of data may carry the same information. A representation that contains irrelevant or repeated data is said to contain redundant data.
Compression ratio: $C = \dfrac{b}{b'}$, with $C > 0$
Data redundancy: $R = 1 - \dfrac{1}{C}$, with $0 \le R \le 1$ when $C \ge 1$
Here, $b$ and $b'$ denote the numbers of bits in two representations carrying exactly the same information.
Example: $C = 10$ gives $R = 0.9$, meaning that 90% of the data are redundant.

Redundancies
Image compression types: (i) lossless compression (e.g., medical, legal, etc.) and (ii) lossy compression (e.g., movies).
A. Coding Redundancy: Each piece of information is assigned a code word, i.e., a sequence of bits. Variable-length code words can be assigned to exploit this redundancy. It may be used for lossless compression.
B. Spatial/Temporal Redundancy: In an image, neighboring pixels are highly spatially correlated (or dependent). Frames are also temporally correlated. The correlations may be used to extract redundant data, and hence be exploited for lossless compression.
C. Context Redundancy: In an image, certain pixels may be ignored by the human visual system (HVS) or be extraneous to the intended purpose. This redundancy is exploited for lossy compression.

Redundancies
Figure: examples of redundancies. Coding redundancy (certain intensities are more probable than others), spatial redundancy (each row has the same intensity), and context redundancy (a single intensity for the entire region).

Coding Redundancy
Let the discrete random variable $r_k$ in the interval $[0, L-1]$ represent the intensities of an image of size $(M \times N)$.
Probability of the k-th intensity, which occurs $n_k$ times:
$p_r(r_k) = \dfrac{n_k}{MN}, \quad k = 0, 1, 2, \ldots, L-1$
Let $l(r_k)$ be the number of bits used to represent the intensity $r_k$. The average number of bits required to represent each pixel is then
$L_{avg} = \sum_{k=0}^{L-1} l(r_k)\, p_r(r_k)$
Coding gain or compression ratio:
$C = \dfrac{m\,MN}{L_{avg}\,MN} = \dfrac{m}{L_{avg}}$, where $L = 2^m$

Coding Redundancy
Variable-length coding assigns shorter code words to more probable intensity levels.
Expected number of bits required to represent a pixel:
$L_{avg} = 0.25 \times 2 + 0.47 \times 1 + 0.25 \times 3 + 0.03 \times 3 = 1.81$ bits/pixel
Compression ratio: $C = 8/1.81 = 4.42$
Redundancy: $R = 1 - 1/4.42 = 0.774$
Coding gain may be achieved in a reversible way.
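The average code length, compression ratio, and redundancy of this example can be reproduced as follows. This is a minimal sketch that assumes an 8-bit fixed-length code as the reference representation; the helper name `average_code_length` is illustrative.

```python
def average_code_length(probs, lengths):
    """L_avg = sum of l(r_k) * p_r(r_k) over all intensities."""
    return sum(p * l for p, l in zip(probs, lengths))

probs = [0.25, 0.47, 0.25, 0.03]    # intensity probabilities from the example
lengths = [2, 1, 3, 3]              # variable-length code word lengths in bits

l_avg = average_code_length(probs, lengths)
c = 8 / l_avg                       # compression ratio against 8 bits/pixel
r = 1 - 1 / c                       # relative data redundancy

print(round(l_avg, 2), round(c, 2), round(r, 3))
```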

Spatial Redundancy
The intensities along each horizontal line are equal, and thus maximally correlated.
The histogram reveals that these intensities are equiprobable, and hence coding redundancy cannot be exploited.
Run-length coding forms a code word by combining bits for an intensity with bits for its run length.
In the example, the coding gain is $(256 \times 256 \times 8)/([256 + 256] \times 8) = 128$.
In practical images (and video), pixels may be predicted from spatially (and temporally) correlated neighboring pixels. Thus, a reversible mapping procedure can be used for compression.

Context Redundancy
The intensities over the entire region appear to be the same, for which a coding gain of $(256 \times 256 \times 8)/8 = 65536$ may be achieved.
The histogram, and the image obtained by histogram equalization, reveal that intensities 125 through 131 are actually present.
In practical images, irrelevant information is removed by a quantization process, which is irreversible.

Image Entropy
Assume that the intensities, represented by symbols, arise from a zero-memory source (statistically independent symbols).
Image entropy: $\tilde{H} = -\sum_{k=0}^{L-1} p_r(r_k) \log_2 p_r(r_k)$
$\tilde{H}$ gives the lower limit on the number of bits per pixel required to represent the independent intensity values (i.e., the image).
The information perceived visually by humans is not always the same as the measured information of an image (see the entropy values of the three images: $\tilde{H} = 1.6614$, $8.0$, and $1.566$ bits/pixel).
Images are treated as finite-memory sources so that blocks of correlated pixels are coded with a bit rate lower than $\tilde{H}$.
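The first-order entropy estimate can be computed directly from the intensity histogram. The sketch below assumes the image is supplied as a flat list of pixel values; `image_entropy` is an illustrative name, not part of any library.

```python
import math
from collections import Counter

def image_entropy(pixels):
    """First-order entropy estimate in bits/pixel from relative frequencies."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# An image in which all 256 intensities are equally likely needs 8 bits/pixel,
# while an image with a single intensity carries no information.
print(image_entropy(list(range(256))), abs(image_entropy([7] * 16)))
```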
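
Image Fidelity
Objective metric:
$SNR = \dfrac{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \hat{f}(x,y)^2}{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} [\hat{f}(x,y) - f(x,y)]^2}$
where $f(x,y)$ is the original image and $\hat{f}(x,y)$ is its approximation after compression.
Subjective metric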

Image Fidelity
Figure: three approximations of an image with RMS errors of 5.17, 15.67, and 14.17.
An objective measure gives a good indication of the quality of compressed images (compare the first image with the other two).
A subjective measure is also necessary to assess quality (compare the last two).
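The objective fidelity measures can be sketched in a few lines. The code below assumes images stored as lists of rows of equal size; the function names `rms_error` and `ms_snr` are illustrative, and the small arrays are made-up test data rather than values from the slides.

```python
import math

def rms_error(original, approx):
    """Root-mean-square error between an image and its approximation."""
    n = len(original) * len(original[0])
    sq = sum((a - o) ** 2
             for row_o, row_a in zip(original, approx)
             for o, a in zip(row_o, row_a))
    return math.sqrt(sq / n)

def ms_snr(original, approx):
    """Mean-square signal-to-noise ratio of the approximation.
    Undefined (division by zero) when the two images are identical."""
    signal = sum(a * a for row in approx for a in row)
    noise = sum((a - o) ** 2
                for row_o, row_a in zip(original, approx)
                for o, a in zip(row_o, row_a))
    return signal / noise

f = [[10, 20], [30, 40]]        # hypothetical original
f_hat = [[12, 18], [30, 44]]    # hypothetical approximation
print(rms_error(f, f_hat), ms_snr(f, f_hat))
```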

Compression Model
The functional blocks are an encoder and a decoder (together called a codec).
Examples include web browsers, commercial image-editing software, and DVD players.

Codec
Encoder
The mapper usually transforms the image (or video) data to reduce spatial (and temporal) redundancy.
Obtaining less correlated or uncorrelated transform coefficients is a reversible process.
Quantization of the transform coefficients follows a preset fidelity criterion and is an irreversible process.
The shortest code words are assigned to the most frequently occurring quantized values, thus reducing coding redundancy.
Decoder
The symbol decoder and inverse mapper cannot recover the information lost due to quantization.

Compression Standards
Standards are sanctioned by bodies including ISO, IEC, and ITU-T (formerly CCITT) for images, and SMPTE (standard VC-1) and CMII (standard AVS) for video.
Binary Image Standards (table)
Continuous-Tone Image Standards (table)
Video Standards (table)

Huffman Coding
Huffman coding is optimal under the constraint that source symbols are coded one at a time.
Source reduction is done by ordering the symbol probabilities from highest to lowest and repeatedly combining the two least probable symbols into a compound symbol.
Codes are then assigned so that the code word lengths run from shortest to longest as the symbols run from most to least probable.
Example of source reduction for six symbols (figure)

Huffman Coding
The entropy of the six symbols is 2.14 bits/pixel.
Average length of the Huffman-coded symbols (see Table):
$L_{avg} = 0.4 \times 1 + 0.3 \times 2 + 0.1 \times 3 + 0.1 \times 4 + 0.06 \times 5 + 0.04 \times 5 = 2.2$ bits/pixel
It is called a block code, since each source symbol is mapped to a fixed sequence of code bits.
The code is uniquely decodable using a lookup table (examining the bits from left to right).
Example of code assignment for six symbols (table): the string 010100111100 is parsed as 01010 - 011 - 1 - 1 - 00 and decodes to a3 a1 a2 a2 a6.
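Source reduction can be sketched with a priority queue that repeatedly merges the two least probable groups. The code below is an illustration under stated assumptions: the pairing of the symbol names a4 and a5 with the probabilities 0.1 and 0.04 is inferred rather than given, and only the code lengths are computed, since ties can be broken in several ways that all give the same average length.

```python
import heapq
from itertools import count

def huffman_code_lengths(probs):
    """Return {symbol: code length in bits} for a dict of probabilities."""
    tie = count()  # unique tie-breaker so the heap never compares lists
    heap = [(p, next(tie), [s]) for s, p in probs.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(probs, 0)
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)   # two least probable groups
        p2, _, group2 = heapq.heappop(heap)
        for s in group1 + group2:             # each merge adds one bit
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, next(tie), group1 + group2))
    return lengths

probs = {"a1": 0.1, "a2": 0.4, "a3": 0.06, "a4": 0.1, "a5": 0.04, "a6": 0.3}
lengths = huffman_code_lengths(probs)
l_avg = sum(probs[s] * lengths[s] for s in probs)
print(round(l_avg, 2))
```

The resulting average length can be compared with the entropy of the source to judge how close the code is to the lower bound.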

Huffman Coding
The 8-bit image Lena of size 256×256 has an entropy of 7.3838 bits/pixel.
A MATLAB implementation of Huffman coding achieves a rate of 7.428 bits/pixel.
Computational complexity for J distinct source symbols: (i) J source probabilities, (ii) (J-2) source reductions, and (iii) (J-2) code assignments.
In practice, JPEG and MPEG use near-optimal Huffman coding, wherein the code tables are precomputed from experimental data.

Golomb Coding
Golomb coding applies to a nonnegative integer n (n ≥ 0) and a positive divisor m (m > 0); the code is denoted $G_m(n)$.
Steps for generating Golomb codes:
Step 1: Find the unary code of the quotient $q = \lfloor n/m \rfloor$, where $\lfloor x \rfloor$ denotes the largest integer less than or equal to x. The unary code of an integer q is q 1s followed by a zero.
Step 2: Find the binary code of the remainder
$r = \begin{cases} (n \bmod m) \text{ truncated to } (k-1) \text{ bits}, & 0 \le (n \bmod m) < c \\ (n \bmod m) + c \text{ truncated to } k \text{ bits}, & \text{otherwise} \end{cases}$

Golomb Coding
where $k = \lceil \log_2 m \rceil$, $c = 2^k - m$, and $\lceil x \rceil$ denotes the smallest integer greater than or equal to x.
If $m = 2^k$, then $c = 0$ and $r = n \bmod m$ truncated to k bits. This generates the Golomb-Rice code, wherein the division is done through a computationally efficient binary shift.
Step 3: Concatenate the unary and binary codes obtained in Steps 1 and 2.
Example: obtaining $G_4(9) = 11001$
Unary code: $\lfloor 9/4 \rfloor = \lfloor 2.25 \rfloor = 2$, whose unary code is 110
Binary code: 9 mod 4 = 1001 mod 0100 = 0001, truncated to 2 bits = 01
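The three steps can be sketched as a short function. This is a minimal illustration that returns the code as a string of '0' and '1' characters; the function name `golomb` and the use of integer bit operations for $\lceil \log_2 m \rceil$ are choices made here, not part of the slides.

```python
def golomb(n, m):
    """Return the Golomb code G_m(n) as a bit string."""
    if n < 0 or m <= 0:
        raise ValueError("n must be >= 0 and m must be > 0")
    q, rem = divmod(n, m)
    unary = "1" * q + "0"            # Step 1: unary code of the quotient
    k = (m - 1).bit_length()         # k = ceil(log2(m)), integers only
    c = (1 << k) - m                 # c = 2^k - m
    if rem < c:                      # Step 2: (k-1)-bit remainder ...
        value, width = rem, k - 1
    else:                            # ... or k-bit remainder offset by c
        value, width = rem + c, k
    binary = format(value, "b").zfill(width) if width > 0 else ""
    return unary + binary            # Step 3: concatenate both parts

print(golomb(9, 4))
```

When m is a power of 2, c is zero and the remainder is always written with k bits, which is the Golomb-Rice case.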

Golomb Coding
Golomb codes are optimal for integers that follow a geometrically distributed PMF
$P(n) = (1 - \rho)\rho^n, \quad 0 < \rho < 1$
with the optimal divisor
$m = \left\lceil \dfrac{\log_2 (1 + \rho)}{\log_2 (1/\rho)} \right\rceil$
Golomb codes for different choices of m (see Table).
An improper choice of m for the PMF can even produce data expansion instead of compression!

Golomb Coding
In practice, Golomb codes are used for differences of image intensities, e.g., in JPEG-LS.
The negative differences are suitably mapped to obtain nonnegative integers. An example of a mapping function is
$M(n) = \begin{cases} 2n, & n \ge 0 \\ 2|n| - 1, & n < 0 \end{cases}$
Variable-length Golomb codes are uniquely decodable.


Arithmetic Coding
Arithmetic coding generates non-block codes, i.e., a single code word is assigned to an entire message (a sequence of symbols) instead of one code word per symbol.
The code word defines a subinterval of [0,1), which is narrowed according to the PMF of the symbols in the message.
Example: the five-symbol message a1 a2 a3 a3 a4 from a four-symbol source (figure)

Arithmetic Coding
Notable issues that impede compression performance:
An extra indicator is required for end-of-message.
Finite-precision arithmetic is required for assigning bits (the constraint is the shortest interval of a long message).
The maximum number of bits required for a symbol of the length-5 message is 3, since 5 lies between $2^2 = 4$ and $2^3 = 8$. The resolution of each interval is $1/8$.
Total number of bits required for the message as per entropy:
$k = -\log_2 \prod_{i=1}^{5} p_i = -\sum_{i=1}^{5} \log_2 p_i = -\log_2 (0.2 \times 0.2 \times 0.4 \times 0.4 \times 0.2) = -\log_2 (0.00128) = 9.6096 \rightarrow 10$ bits/message
Average: 2 bits/symbol!
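The interval narrowing for the message a1 a2 a3 a3 a4 can be traced with exact fractions. The sketch below assumes the subintervals of [0,1) are laid out in the order a1, a2, a3, a4 with probabilities 0.2, 0.2, 0.4, 0.2; the function name `arithmetic_interval` is illustrative, and no end-of-message indicator or finite-precision handling is included.

```python
import math
from fractions import Fraction

def arithmetic_interval(message, pmf):
    """Return the final [low, high) interval that identifies the message."""
    # Cumulative subintervals of [0, 1), in the order the symbols are listed.
    bounds, start = {}, Fraction(0)
    for symbol, p in pmf.items():
        bounds[symbol] = (start, start + p)
        start += p
    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        width = high - low
        s_low, s_high = bounds[symbol]
        low, high = low + width * s_low, low + width * s_high
    return low, high

pmf = {"a1": Fraction(1, 5), "a2": Fraction(1, 5),
       "a3": Fraction(2, 5), "a4": Fraction(1, 5)}
low, high = arithmetic_interval(["a1", "a2", "a3", "a3", "a4"], pmf)
bits = math.ceil(-math.log2(float(high - low)))   # bits for the whole message
print(float(low), float(high), bits)
```

The width of the final interval equals the product of the symbol probabilities, which is why its negative base-2 logarithm gives the bit count quoted above.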

Arithmetic Coding
Variants of arithmetic coding include the Q-coder and the MQ-coder, which are used in JBIG and JPEG-2000.
The probabilities are estimated adaptively from the contexts of neighboring image pixels, and the context selects the probability estimate used for coding a symbol.
Number of probability estimates for the three contexts shown (figure): $2^1 = 2$, $2^8 = 256$, and $2^5 = 32$.

LZW Coding
Lempel-Ziv-Welch (LZW) coding exploits spatial redundancy, whereas the previous coding methods exploit coding redundancy.
LZW assigns fixed-length codes to variable-length sequences of source symbols that appear in an image.
No prior knowledge of the symbol probabilities is required.
Common formats such as GIF, TIFF, and PDF use LZW. PNG was created to avoid LZW licensing and uses DEFLATE instead.
A finite-length codebook (dictionary) holds the variable-length symbol sequences and their fixed-length codes. Sequences are added to the dictionary as they are encountered during the left-to-right, top-to-bottom scan of the image.

LZW Coding
LZW coding of a 4×4, 8-bit image (see Table):
39 39 126 126
39 39 126 126
39 39 126 126
39 39 126 126
The coding dictionary may need to be flushed or reinitialized to handle overflow.
The decoding dictionary is identical to the coding dictionary.
LZW-based compression is error-free.
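The dictionary construction can be sketched for the 4×4 image above. This is a minimal encoder that assumes the dictionary is initialized with the 256 single-pixel values and is allowed to grow without limit (no flushing); `lzw_encode` is an illustrative name.

```python
def lzw_encode(pixels, bits=8):
    """Encode a sequence of pixel values into a list of LZW codes."""
    dictionary = {(v,): v for v in range(1 << bits)}  # single-pixel entries
    next_code = 1 << bits                             # first free code (256)
    current = ()
    output = []
    for pixel in pixels:
        candidate = current + (pixel,)
        if candidate in dictionary:
            current = candidate                  # keep extending the sequence
        else:
            output.append(dictionary[current])   # emit the recognized sequence
            dictionary[candidate] = next_code    # learn the new sequence
            next_code += 1
            current = (pixel,)
    if current:
        output.append(dictionary[current])
    return output

image = [[39, 39, 126, 126]] * 4
codes = lzw_encode([p for row in image for p in row])  # row-by-row scan
print(codes)
```

Sixteen 8-bit pixels are reduced to ten codes here; the actual saving depends on how many bits are used per code.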

Run Length Coding
Repeating intensities along a row can be compressed by storing the number of repetitions and the intensity itself.
This coding is used in CCITT, JBIG2, JPEG, MPEG, BMP, etc.
In BMP, run-length encoding uses two bytes: the first byte gives the number of consecutive pixels, i.e., the run length, for the pixel value given in the second byte (see Table). The second byte can be EOL, EOI, MOV, or a color index, as specified.
The performance of run-length encoding (RLE) can be further improved by variable-length coding (e.g., Huffman coding) of the runs.
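A run-length coder for one image row can be sketched as (run length, intensity) pairs. This illustration does not follow the BMP byte layout: it places no limit on the run length and has no EOL, EOI, or MOV markers; the function names are illustrative.

```python
from itertools import groupby

def rle_encode(row):
    """Encode a row as a list of (run length, intensity) pairs."""
    return [(len(list(group)), value) for value, group in groupby(row)]

def rle_decode(pairs):
    """Expand (run length, intensity) pairs back into the row."""
    return [value for run, value in pairs for _ in range(run)]

row = [5, 5, 5, 9, 9, 0]
pairs = rle_encode(row)
print(pairs, rle_decode(pairs) == row)

# A row without repeated intensities doubles in size: one pair per pixel.
print(len(rle_encode([1, 2, 3, 4])))
```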

Run Length Coding
Expansion may happen instead of compression when repeated intensities are absent; hence, RLE works better for binary images than for grayscale images.
The CCITT Group 3 (1D) and Group 4 (2D) standards for binary image compression use RLE. These standards were originally designed for sending facsimile (FAX) over telephone lines.
Approximate run-length entropy for a binary image:
$H_{RL} = \dfrac{H_0 + H_1}{L_0 + L_1}$
where $H_0$ and $H_1$ are the entropies of the black and white runs, and $L_0$ and $L_1$ are the average lengths of the black and white runs, respectively.

Symbol-Based Coding
Frequently occurring sub-images are treated as symbols (tokens).
The image is represented by a symbol dictionary and a set of triplets {(x1, y1, t1), (x2, y2, t2), ...}, where each (xi, yi) pair specifies the location of a symbol in the image and the token ti is the address of that symbol in the dictionary.
Example: a bi-level image that requires $9 \times 51 \times 1 = 459$ bits.
Number of bits required for the triplets (each requiring 3 bytes) and the dictionary bitmaps:
$(6 \times 3 \times 8) + [(9 \times 7) + (6 \times 7) + (6 \times 6)] = 285$ bits
Compression achieved: $C = 459/285 = 1.61$

Symbol-Based Coding
In JBIG2, binary images are compressed by segmenting them into three types of non-overlapping regions: (i) text (characters), (ii) halftone (patterns on a regular grid), and (iii) generic (art and noise).
Text regions use symbol-based coding. If the reference character bitmaps (templates) in the dictionary are not identical to the actual bitmaps in the image, the result is called perceptually lossless coding. The difference between the bitmaps may be stored for perfectly lossless coding (see (c) in Figure).
Figure: (a) Lossless, (b) Visually lossless, (c) Difference

Bit-Plane Coding
An image with more than two intensities may be decomposed into m bit planes using the base-2 polynomial
$a_{m-1} 2^{m-1} + a_{m-2} 2^{m-2} + \cdots + a_1 2^1 + a_0 2^0$
Each bit plane may be compressed using any of the available binary image compression techniques.
A major problem of this approach is that a small change in intensity may change all the bit planes (including the MSB). Example: $127_{10} = (01111111)_2$ whereas $128_{10} = (10000000)_2$.
An alternative decomposition such as the m-bit Gray code reduces this problem, since successive code words differ in only one bit. This code is generated as
$g_i = a_i \oplus a_{i+1}, \quad 0 \le i \le m-2$
$g_{m-1} = a_{m-1}$
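The Gray-code rule $g_i = a_i \oplus a_{i+1}$ is equivalent to XOR-ing a value with itself shifted right by one bit. The following sketch illustrates this for 8-bit intensities; the function names `to_gray` and `bit_planes` are illustrative.

```python
def to_gray(value):
    """m-bit Gray code: g_i = a_i XOR a_(i+1), with the MSB unchanged."""
    return value ^ (value >> 1)

def bit_planes(value, m=8):
    """Bits of a pixel value from the MSB plane down to the LSB plane."""
    return [(value >> i) & 1 for i in range(m - 1, -1, -1)]

for v in (127, 128):
    print(v, format(v, "08b"), format(to_gray(v), "08b"))

# Successive intensities differ in exactly one Gray-coded bit plane.
print(all(bin(to_gray(v) ^ to_gray(v + 1)).count("1") == 1 for v in range(255)))
```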

Bit-Plane Coding
Figure: a monochrome image, with the binary and Gray-code representations of its four most significant bit planes.
Figure: the binary and Gray-code representations of the four least significant bit planes of the image.

Bit-Plane Coding
The Gray code for 127 is 01000000 and that for 128 is 11000000.
Including the overhead of the PDF representation, the image bit planes are compressed by the JBIG2 lossless algorithm using both the binary and the Gray-coded bit representations (see Table).
Overall compression ratio for the binary representation: $C = 678676/503916 = 1.35$
The compression ratio for the Gray-code representation is better than that of the binary representation: $C = 678676/475964 = 1.43$

Block-Transform Coding
The image is partitioned into non-overlapping blocks (sub-images) of fixed size (say, 8×8).
The transformation decorrelates the pixel data and packs as much information as possible into a small number of transform coefficients.
The quantization process selectively eliminates the coefficients that carry the least information.
The quantized coefficients are coded adaptively or non-adaptively, preferably with a variable-length coder.
Figure: encoder and decoder block diagrams

Block-Transform Coding
The selection of the transform depends on its packing capacity and computational complexity.
Let the size of the image be $(M \times N)$ and that of a sub-image be $(n \times n)$.
General relations for a sub-image $g(x,y)$, with $r(x,y,u,v)$ and $s(x,y,u,v)$ being the forward and inverse kernels, respectively:
Forward: $T(u,v) = \sum_{x=0}^{n-1} \sum_{y=0}^{n-1} g(x,y)\, r(x,y,u,v), \quad u,v = 0,1,2,\ldots,n-1$
Inverse: $g(x,y) = \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} T(u,v)\, s(x,y,u,v), \quad x,y = 0,1,2,\ldots,n-1$
The 2D kernels may be implemented in a separable manner:
$r(x,y,u,v) = r_1(x,u)\, r_2(y,v)$
$s(x,y,u,v) = s_1(x,u)\, s_2(y,v)$

FFT-Based Coding
The DFT is a well-known transformation that may be used.
The kernels for the DFT are
$r(x,y,u,v) = \dfrac{1}{n} e^{-j 2\pi (ux + vy)/n}$
$s(x,y,u,v) = \dfrac{1}{n} e^{j 2\pi (ux + vy)/n}$
The coefficients are complex valued.
A computational benefit may be obtained by choosing n as a power of 2, for which the FFT can be used.
The basis functions consist of both sine and cosine functions.
Figure: 2D basis functions of the DFT

WHT-Based Coding
The Walsh-Hadamard transform (WHT) is computationally very simple, since its basis functions consist of alternating +1 and -1 values.
The kernels for the WHT are
$r(x,y,u,v) = s(x,y,u,v) = \dfrac{1}{n} (-1)^{\sum_{i=0}^{m-1} [b_i(x) p_i(u) + b_i(y) p_i(v)]}$
where $n = 2^m$ and
$p_0(u) = b_{m-1}(u)$
$p_1(u) = b_{m-1}(u) + b_{m-2}(u)$
$p_2(u) = b_{m-2}(u) + b_{m-3}(u)$
$\vdots$
$p_{m-1}(u) = b_1(u) + b_0(u)$
$b_k(z)$ denotes the k-th bit (counting from right to left) of the binary representation of z.
The summations in the exponent use modulo-2 arithmetic.
Figure: checkerboard-pattern 2D basis functions of the WHT for n = 4

DCT-Based Coding
The DCT is used in JPEG. The DCT has only cosine basis functions.
The kernels for the DCT are
$r(x,y,u,v) = s(x,y,u,v) = \alpha(u)\,\alpha(v) \cos\left[\dfrac{(2x+1)u\pi}{2n}\right] \cos\left[\dfrac{(2y+1)v\pi}{2n}\right]$
where
$\alpha(u) = \begin{cases} \sqrt{1/n}, & u = 0 \\ \sqrt{2/n}, & u = 1,2,3,\ldots,n-1 \end{cases}$
The coefficients are real valued.
The decorrelation property of the DCT is close to that of the KLT (ideal).
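The DCT kernel can be applied to a small block by direct summation. The sketch below is written for clarity rather than speed (a practical codec would use a fast algorithm), uses only the standard library, and the names `dct_kernel`, `dct2`, and `idct2` are illustrative. The 4×4 test block holds made-up values.

```python
import math

def dct_kernel(x, u, n):
    """1D DCT kernel alpha(u) * cos((2x+1) u pi / 2n)."""
    alpha = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
    return alpha * math.cos((2 * x + 1) * u * math.pi / (2 * n))

def dct2(block):
    """Forward 2D DCT of an n x n block using the separable kernel."""
    n = len(block)
    return [[sum(block[x][y] * dct_kernel(x, u, n) * dct_kernel(y, v, n)
                 for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]

def idct2(coeffs):
    """Inverse 2D DCT; the same kernel is used because s = r."""
    n = len(coeffs)
    return [[sum(coeffs[u][v] * dct_kernel(x, u, n) * dct_kernel(y, v, n)
                 for u in range(n) for v in range(n))
             for y in range(n)] for x in range(n)]

flat = [[10.0] * 4 for _ in range(4)]
T = dct2(flat)                      # all energy packs into T(0,0)
block = [[52, 55, 61, 66], [70, 61, 64, 73], [63, 59, 55, 90], [67, 61, 68, 104]]
restored = idct2(dct2(block))
print(round(T[0][0], 6), round(restored[0][0], 6))
```

A constant block produces a single nonzero coefficient, which is the packing behavior that block-transform coding relies on.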

Figure: 2D basis functions of the DCT for n = 4

Reconstruction from Transforms
Images are reconstructed after truncating 50% of the total coefficients, and the errors are compared (in terms of image details).
Figure: reconstructed images and errors from the original for the FFT, WHT, and DCT

Truncation of Coefficients
Inverse transform in matrix form:
$G = \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} T(u,v)\, S_{uv}$
with the basis functions
$S_{uv} = \begin{bmatrix} s(0,0,u,v) & s(0,1,u,v) & \cdots & s(0,n-1,u,v) \\ s(1,0,u,v) & s(1,1,u,v) & \cdots & s(1,n-1,u,v) \\ \vdots & \vdots & \ddots & \vdots \\ s(n-1,0,u,v) & s(n-1,1,u,v) & \cdots & s(n-1,n-1,u,v) \end{bmatrix}$
Truncation by a masking function:
$\hat{G} = \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} \chi(u,v)\, T(u,v)\, S_{uv}$
$\chi(u,v) = \begin{cases} 0, & T(u,v) \text{ satisfies the truncation criterion} \\ 1, & \text{otherwise} \end{cases}$
Mean squared error (due to the approximation) to be minimized:
$MSE = E\{\|G - \hat{G}\|^2\} = \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} \sigma^2_{T(u,v)} [1 - \chi(u,v)]$
The MSE depends on the variances of the coefficients at the locations (u,v).


Comparison of Transforms
WHT-based compression is simple to implement.
Sinusoidal transforms (DFT/DCT) have high packing efficiency.
The packing efficiency of the KLT is the highest in the MSE sense when coefficients are truncated, but it requires a computationally expensive, data-dependent basis.
Why is the DCT chosen instead of the DFT for image compression?
The DCT allows efficient computation on real data, as opposed to complex data for the DFT.
The DCT minimizes the blocking artifacts at the boundaries between sub-images.

Boundaries for DCT & DFT
In the DFT, the signal is truncated and assumed to be periodic (n-point periodicity). Thus, the Gibbs phenomenon occurs at the discontinuous boundaries of sub-images.
In the DCT, the signal is assumed to be even symmetric (2n-point periodicity). Thus, artifacts at the boundaries are reduced.
Figure: boundary behavior of the DFT and the DCT

Size of Sub-Image
The sub-image size is chosen as a power of 2. Common choices are (8×8) and (16×16).
Figure: reconstructed outputs using 25% of the DCT coefficients with 2×2, 4×4, and 8×8 sub-images

Bit Allocation
Bit allocation refers to the truncation, quantization, and coding of transform coefficients.
Zonal coding: the transform coefficients with maximum variance carry most of the information and are thus retained.
Coefficients may be normalized by their variances and then uniformly quantized, OR
an optimal quantizer that considers the PDF of the coefficients may be used. The zeroth coefficient and the other coefficients are modeled by Rayleigh and Gaussian PDFs, respectively.
Figure: typical zonal mask and the number of bits assigned, which is related to $\log_2 \sigma^2_{T(u,v)}$

Bit Allocation
Threshold coding: the transform coefficients with the largest magnitudes are retained for reconstruction.
Thresholds may be global for an image, local to each sub-image, or may even depend on the location within a sub-image.
The retained coefficients of a sub-image are reordered into a 1D sequence for run-length coding.
Figure: typical threshold mask and the reordering into a 1D sequence

Zonal or Threshold Coding
Figure: DCT-based compression using threshold coding and zonal coding

Controlling Compression Level
Let the masking process $\chi(u,v)\,T(u,v)$ be redefined using a normalization matrix Z:
$\hat{T}(u,v) = \mathrm{round}\left[\dfrac{T(u,v)}{Z(u,v)}\right]$
Coefficients of the reconstructed image: $\dot{T}(u,v) = \hat{T}(u,v)\, Z(u,v)$
Let a value of $Z(u,v)$ be c. Quantization then assigns the integer value k to $\hat{T}(u,v)$ when
$kc - \dfrac{c}{2} \le T(u,v) < kc + \dfrac{c}{2}$
The number of bits is controlled by k.
Figure: quantization steps and a typical normalization mask
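Normalization, rounding, and denormalization can be sketched as element-wise operations. The coefficient and normalization values below are made-up illustrations, not the JPEG tables; the `scale` parameter is an assumption that stands for using Z, 2Z, 4Z, and so on. Rounding is done with floor(x + 0.5) so that it matches the interval $kc - c/2 \le T < kc + c/2$.

```python
import math

def quantize(T, Z, scale=1):
    """T_hat(u,v) = round(T(u,v) / (scale * Z(u,v)))."""
    return [[math.floor(t / (scale * z) + 0.5) for t, z in zip(t_row, z_row)]
            for t_row, z_row in zip(T, Z)]

def dequantize(T_hat, Z, scale=1):
    """Denormalized coefficients T_hat(u,v) * scale * Z(u,v)."""
    return [[q * scale * z for q, z in zip(q_row, z_row)]
            for q_row, z_row in zip(T_hat, Z)]

T = [[-415, -29], [7, 56]]     # hypothetical transform coefficients
Z = [[16, 11], [12, 12]]       # hypothetical normalization values

T_hat = quantize(T, Z)
print(T_hat, dequantize(T_hat, Z))
print(quantize(T, Z, scale=2))  # coarser steps give smaller integers
```

A larger normalization value maps more coefficients to zero, which lowers the bit rate at the cost of a larger reconstruction error.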

Controlling Compression Level
Figure: DCT-based compression with a varying normalization array. From left to right and top to bottom: normalization by Z, 2Z, 4Z, 8Z, 16Z, and 32Z.

JPEG
Features: DCT-based lossy baseline coding standard.
Applicable to progressive reconstruction systems.
Lossless (reversible) compression is also possible.
Steps of compression with an example:
Partition the image into sub-images of size 8×8.
Level shift by subtracting $2^{k-1}$, where k = 8 is the number of bits per pixel.

JPEG
Compute the forward DCT using n = 8.
Scale the DCT coefficients with the normalization matrix.
Reorder the scaled coefficients in the specified zigzag pattern.
Use variable-length coding (with run-length coding) for the binary representation.
The compression ratio is 5.6:1.
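The zigzag reordering can be sketched by sorting the block positions along anti-diagonals and alternating the direction of travel. The function names are illustrative, and a 4×4 block is used only to keep the printed result short; the same code applies to n = 8.

```python
def zigzag_indices(n=8):
    """(row, column) positions of an n x n block in zigzag order."""
    def key(pos):
        i, j = pos
        diagonal = i + j
        # Odd diagonals run downward (row increasing), even ones upward.
        return (diagonal, i if diagonal % 2 else -i)
    return sorted(((i, j) for i in range(n) for j in range(n)), key=key)

def zigzag(block):
    """Reorder a square block into a 1D zigzag sequence."""
    return [block[i][j] for i, j in zigzag_indices(len(block))]

block = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
print(zigzag(block))
```

Because the quantized high-frequency coefficients are mostly zero, this ordering tends to produce long zero runs at the end of the sequence.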


JPEG
Steps of reconstruction from the compressed image:
Decode the bits to recover the coefficient values and rearrange them into an array of size 8×8.
Rescale in accordance with the normalization array.

JPEG
Compute the inverse DCT of the denormalized array.
Level shift by adding $2^{k-1}$ to the inverse-transformed pixels.
Estimated error of the sub-image (figure): RMSE = 5.8 only.

JPEG
Figure: visual output of JPEG compression at two levels of compression. Each row shows the compressed image, the scaled error, and a zoomed-in section.

Predictive Coding
Predictive coding eliminates the redundancies among neighboring pixels in space or time.
This coding is computationally very efficient, and is thus used for the compression of video sequences.
Lossless coding (figure: encoder and decoder block diagrams)
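Lossless predictive coding can be sketched with a first-order predictor that uses the previous pixel of the same row. The predictor weight `alpha` and the choice to transmit the first pixel unchanged are assumptions made for this illustration; the pixel values are made up.

```python
def predictive_encode(row, alpha=1.0):
    """Return the first pixel followed by the integer prediction errors."""
    errors = [row[0]]
    for n in range(1, len(row)):
        prediction = int(round(alpha * row[n - 1]))
        errors.append(row[n] - prediction)
    return errors

def predictive_decode(errors, alpha=1.0):
    """Rebuild the row by adding each error to the same prediction."""
    row = [errors[0]]
    for e in errors[1:]:
        prediction = int(round(alpha * row[-1]))
        row.append(e + prediction)
    return row

row = [100, 102, 103, 103, 101]
errors = predictive_encode(row)
print(errors, predictive_decode(errors) == row)
```

The errors cluster around zero, so a variable-length code applied to them needs fewer bits than one applied to the pixels themselves.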

Predictive Coding
Lossy coding includes a quantizer.
A feedback loop may be used so that quantization error does not accumulate at the decoder.
Figure: encoder and decoder block diagrams

Optimal Predictor
Let a pixel be predicted from m of its neighboring pixels.
MSE of prediction:
$E\{e^2(n)\} = E\left\{\left[f(n) - \sum_{i=1}^{m} \alpha_i f(n-i)\right]^2\right\}$
The MMSE prediction is obtained with
$\boldsymbol{\alpha} = R^{-1} \mathbf{r}$
where
$\boldsymbol{\alpha} = \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_m \end{bmatrix}, \quad \mathbf{r} = \begin{bmatrix} E\{f(n) f(n-1)\} \\ E\{f(n) f(n-2)\} \\ \vdots \\ E\{f(n) f(n-m)\} \end{bmatrix}$
$R = \begin{bmatrix} E\{f(n-1) f(n-1)\} & E\{f(n-1) f(n-2)\} & \cdots & E\{f(n-1) f(n-m)\} \\ E\{f(n-2) f(n-1)\} & E\{f(n-2) f(n-2)\} & \cdots & E\{f(n-2) f(n-m)\} \\ \vdots & \vdots & \ddots & \vdots \\ E\{f(n-m) f(n-1)\} & E\{f(n-m) f(n-2)\} & \cdots & E\{f(n-m) f(n-m)\} \end{bmatrix}$
Estimating the correlations in the matrix R is not so simple when the pixels follow non-Gaussian characteristics.
If f(n) has zero mean and variance $\sigma^2$:
$\sigma_e^2 = \sigma^2 - \sum_{i=1}^{m} \alpha_i E\{f(n) f(n-i)\}$

Optimal Predictor
Assume a 2D Markov image source with Gaussian statistics and the separable correlation function
$E\{f(x,y) f(x-i, y-j)\} = \sigma^2 \rho_v^i \rho_h^j$
where $\rho_v^i$ and $\rho_h^j$ are the vertical and horizontal correlations for delays i and j, respectively.
Considering a fourth-order linear predictor
$\hat{f}(x,y) = \alpha_1 f(x, y-1) + \alpha_2 f(x-1, y-1) + \alpha_3 f(x-1, y) + \alpha_4 f(x-1, y+1)$
where
$\alpha_1 = \rho_h, \quad \alpha_2 = -\rho_v \rho_h, \quad \alpha_3 = \rho_v, \quad \alpha_4 = 0$
A practical implementation that keeps the predicted pixel within the allowed range requires
$\sum_{i=1}^{m} \alpha_i \le 1$
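
Optimal L-level Lloyd-Max Quantizer
Input signal s with PDF $p(s)$
Quantized signal $t = q(s)$
Quantization error $e(s) = s - q(s)$
Mean of the quantization error:
$\mu = E[s - q(s)] = \sum_i \int_{s_{i-1}}^{s_i} (s - t_i)\, p(s)\, ds$
Variance $\sigma^2$ of the quantization error:
$\sigma^2 = E[(s - q(s))^2] = \sum_i \int_{s_{i-1}}^{s_i} (s - t_i)^2 p(s)\, ds$
Here the sums run over the L quantization intervals $(s_{i-1}, s_i]$, where $s_i$ are the decision levels and $t_i$ the reconstruction levels.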

The quantization error is minimum provided
$\sigma^2_{opt} = \min_{s_i, t_i} \sum_i \int_{s_{i-1}}^{s_i} (s - t_i)^2 p(s)\, ds$
The variance may be seen as
$\sigma^2 = \cdots + \int_{s_{i-1}}^{s_i} (s - t_i)^2 p(s)\, ds + \int_{s_i}^{s_{i+1}} (s - t_{i+1})^2 p(s)\, ds + \cdots$
The quantization error may be minimized by setting the following two derivatives to zero:
$\dfrac{\partial \sigma^2}{\partial t_i} = 0$ and $\dfrac{\partial \sigma^2}{\partial s_i} = 0$
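
The first derivative: $\dfrac{\partial \sigma^2}{\partial t_i} = 0$
$0 + \cdots - 2 \int_{s_{i-1}}^{s_i} (s - t_i)\, p(s)\, ds + \cdots + 0 = 0$
$\int_{s_{i-1}}^{s_i} s\, p(s)\, ds = t_i \int_{s_{i-1}}^{s_i} p(s)\, ds$
$t_i = \dfrac{\int_{s_{i-1}}^{s_i} s\, p(s)\, ds}{\int_{s_{i-1}}^{s_i} p(s)\, ds}, \quad i = -\dfrac{L}{2}, \ldots, \dfrac{L}{2}$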
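
The second derivative: $\dfrac{\partial \sigma^2}{\partial s_i} = 0$
Leibniz integral rule:
$\dfrac{d}{d\alpha} \int_{a(\alpha)}^{b(\alpha)} f(x,\alpha)\, dx = f(b(\alpha),\alpha) \dfrac{db(\alpha)}{d\alpha} - f(a(\alpha),\alpha) \dfrac{da(\alpha)}{d\alpha} + \int_{a(\alpha)}^{b(\alpha)} \dfrac{\partial}{\partial \alpha} [f(x,\alpha)]\, dx$
Applying the rule to the two terms of the variance that contain $s_i$:
$0 + \cdots + (s_i - t_i)^2 p(s_i) \dfrac{ds_i}{ds_i} - (s_{i-1} - t_i)^2 p(s_{i-1}) \dfrac{ds_{i-1}}{ds_i} + \int_{s_{i-1}}^{s_i} \dfrac{\partial}{\partial s_i} [(s - t_i)^2 p(s)]\, ds$
$+ (s_{i+1} - t_{i+1})^2 p(s_{i+1}) \dfrac{ds_{i+1}}{ds_i} - (s_i - t_{i+1})^2 p(s_i) \dfrac{ds_i}{ds_i} + \int_{s_i}^{s_{i+1}} \dfrac{\partial}{\partial s_i} [(s - t_{i+1})^2 p(s)]\, ds + \cdots + 0 = 0$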
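
The second derivative: $\dfrac{\partial \sigma^2}{\partial s_i} = 0$
Since the neighboring decision levels and the integrands do not depend on $s_i$, only two terms remain:
$(s_i - t_i)^2 p(s_i) - (s_i - t_{i+1})^2 p(s_i) = 0$
$(s_i - t_i)^2 = (s_i - t_{i+1})^2$
$s_i = \begin{cases} 0, & i = 0 \\ \dfrac{t_i + t_{i+1}}{2}, & i = -\dfrac{L}{2}+1, \ldots, \dfrac{L}{2}-1 \\ \pm\infty, & i = \pm\dfrac{L}{2} \end{cases}$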

The values of $s_i$ and $t_i$ are estimated iteratively.
Lloyd-Max quantizers for a Laplacian PDF of unit variance (see Table)

1. Guess the initial decision levels $s_i$.
2. Calculate the centroids $t_i$:
$t_i = \dfrac{\sum_{s_{i-1} < s \le s_i} s\, p(s)}{\sum_{s_{i-1} < s \le s_i} p(s)}$
3. Calculate the decision levels $s_i$:
$s_i = \dfrac{t_i + t_{i+1}}{2}$
4. Repeat steps 2 and 3 until the variance is no longer reduced.
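The iteration can be sketched on a set of samples, where the empirical distribution of the samples stands in for the PDF $p(s)$. This is an illustration under stated assumptions: the initial decision levels are spaced uniformly over the data range, an empty interval keeps a placeholder level at its finite edge, and the loop stops when the levels no longer change (or after a fixed number of iterations) rather than by monitoring the variance. The name `lloyd_max` and the sample values are illustrative.

```python
def lloyd_max(samples, levels, iterations=50):
    """Return (decision levels, reconstruction levels) fitted to the samples."""
    lo, hi = min(samples), max(samples)
    # Step 1: initial interior decision levels, uniform over the data range.
    decisions = [lo + (hi - lo) * i / levels for i in range(1, levels)]
    recon = []
    for _ in range(iterations):
        # Step 2: centroid of the samples that fall in each interval.
        edges = [float("-inf")] + decisions + [float("inf")]
        new_recon = []
        for a, b in zip(edges[:-1], edges[1:]):
            cell = [s for s in samples if a < s <= b]
            fallback = b if a == float("-inf") else a   # empty-cell placeholder
            new_recon.append(sum(cell) / len(cell) if cell else fallback)
        # Step 3: decision levels halfway between neighboring centroids.
        new_decisions = [(t0 + t1) / 2
                         for t0, t1 in zip(new_recon[:-1], new_recon[1:])]
        if new_recon == recon and new_decisions == decisions:
            break                                       # Step 4: converged
        recon, decisions = new_recon, new_decisions
    return decisions, recon

samples = [0, 0, 1, 1, 10, 10, 11, 11]
decisions, recon = lloyd_max(samples, levels=2)
print(decisions, recon)
```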

Wavelet Coding
The basis functions of wavelets have finite duration, as opposed to the infinite duration of the cosine basis functions of the DCT.
The DWT coefficients of images possess space-frequency localization characteristics.
Hence, sub-image partitioning is not necessary for DWT-based compression, which in turn reduces blocking artifacts even at high compression ratios.
Figure: encoder and decoder block diagrams

Wavelet Selection
Factors to be considered in choosing the basis functions:
Computational complexity: reduce the filter length. The two-tap Haar wavelet is good in this respect.
Orthogonality: error-free reconstruction with high packing capacity. Daubechies wavelets and symlets are good in this sense.
Symmetry: reduces border artifacts in reconstruction. Biorthogonal wavelets are good for this.
Not all factors can be satisfied simultaneously.
JPEG-2000 uses the near-orthogonal (biorthogonal) Cohen-Daubechies-Feauveau (CDF) wavelet basis functions.

Wavelet Selection
Figure: output of a 3-level wavelet decomposition for four wavelet basis functions (Haar, symlets, Daubechies, and CDF)

Wavelet Selection
Comparison of the wavelet basis functions in terms of the number of filter taps as well as packing capacity (see Table).
Packing capacity is compared in terms of the number of zeroed coefficients for a fixed threshold level.
CDF is good for packing capacity, although it has a higher number of filter taps.

Level Selection
With an increasing number of levels, the number of approximation coefficients reduces significantly (see Table).
Almost 95% of the detail coefficients may be truncated with almost no significant change in the reconstruction error.
Increasing the level of decomposition introduces extra computational load.

Quantizer Selection
Quantization performance may be improved by:
Introducing a dead zone: a larger quantization interval around zero-magnitude coefficients.
Scale adaptation: quantization steps are adapted across scales.
Figure: impact of the dead-zone interval on wavelet coding

JPEG 2000
Overall Features
Flexibility in compression and access, e.g., a portion of the image may be processed.
Wavelet coefficients are quantized subband-adaptively and then arithmetically coded using a bit-plane algorithm.
Encoding Features
The R, G, B components are DC level shifted.
The individual components may optionally undergo an irreversible decorrelation transform:
$Y_0(x,y) = 0.299 I_0(x,y) + 0.587 I_1(x,y) + 0.114 I_2(x,y)$
$Y_1(x,y) = -0.16875 I_0(x,y) - 0.33126 I_1(x,y) + 0.5 I_2(x,y)$
$Y_2(x,y) = 0.5 I_0(x,y) - 0.41869 I_1(x,y) - 0.08131 I_2(x,y)$

JPEG 2000
The image is divided into tiles, i.e., rectangular arrays of pixels, taking the aspect ratio into account.
Each tile component may be processed independently.
Lossless compression uses the 5/3 biorthogonal wavelet, and lossy compression uses the 9/7 biorthogonal wavelet (i.e., CDF).
The implementation uses a lifting-based fast wavelet transform (FWT).
The standard does not specify the number of scales for the FWT.
The transformed coefficient $a_b(u,v)$ of subband b is quantized to $q_b(u,v)$ with quantization step $\Delta_b$ as
$q_b(u,v) = \mathrm{sgn}[a_b(u,v)] \cdot \mathrm{floor}\left[\dfrac{|a_b(u,v)|}{\Delta_b}\right]$
$\Delta_b = 2^{R_b - \varepsilon_b} \left(1 + \dfrac{\mu_b}{2^{11}}\right)$

JPEG 2000
where $R_b$ is the nominal dynamic range,
$\varepsilon_b$ is the number of bits for the exponent of the real coefficients, and
$\mu_b$ is the number of bits for the mantissa of the real coefficients.
In the case of lossless compression: $\mu_b = 0$, $R_b = \varepsilon_b$, and $\Delta_b = 1$.
In the case of lossy compression: $\mu_b = \mu_0$ and $\varepsilon_b = \varepsilon_0 + n_b - N_L$,
where $n_b$ is the number of subband decomposition levels from the tile component to subband b, and $N_L$ is the number of scales.
Example of the tile components of a 2-scale wavelet transform (see Figure)

JPEG 2000
Tile components are coded individually, one bit plane at a time, starting from the MSB bit plane.
Each bit plane is arithmetically coded in one of three passes: significance propagation, magnitude refinement, and cleanup.
Codes of similar passes are grouped into layers, which are finally partitioned into packets, the units of the encoded stream.
Decoding Features
In decoding, the number of bit planes may be chosen arbitrarily.
Inverse quantization is done by
$R_{q_b}(u,v) = \begin{cases} (\bar{q}_b(u,v) + r\, 2^{M_b - N_b(u,v)})\, \Delta_b, & \bar{q}_b(u,v) > 0 \\ (\bar{q}_b(u,v) - r\, 2^{M_b - N_b(u,v)})\, \Delta_b, & \bar{q}_b(u,v) < 0 \\ 0, & \bar{q}_b(u,v) = 0 \end{cases}$

JPEG 2000
where $R_{q_b}$ denotes the inverse-quantized coefficients,
$M_b$ is the number of bit planes encoded,
$N_b$ is the number of bit planes decoded, and
$r$ ($0 \le r < 1$) is the reconstruction parameter (e.g., 0.5).
The implementation uses the lifting-based inverse FWT.
The components may optionally undergo the inverse decorrelation transform:
$I_0(x,y) = Y_0(x,y) + 1.402 Y_2(x,y)$
$I_1(x,y) = Y_0(x,y) - 0.34413 Y_1(x,y) - 0.71414 Y_2(x,y)$
$I_2(x,y) = Y_0(x,y) + 1.772 Y_1(x,y)$
Finally, the decoded output requires undoing the DC level shift.

JPEG 2000
Figure: visual output of the JPEG 2000 approximation, with the error and a zoomed-in section

Watermarking
Watermarking embeds data in an image (the opposite of compression) in order to make an assertion about the image.
Uses of watermarking include copyright identification, user identification or fingerprinting, authenticity determination, automated monitoring, and copy protection.
There are two types of watermarking: perceptible and imperceptible.
Imperceptible watermarking usually takes the HVS into account.
Generation of an additively marked image $f_w$ from the watermark w:
$f_w = (1 - \alpha) f + \alpha w$
where $\alpha$ ($0 < \alpha \le 1$) controls the imperceptibility.
Generation of a watermarked image by embedding specified bits in the last b LSBs (written here for 8-bit images):
$f_w = 2^b \left\lfloor \dfrac{f}{2^b} \right\rfloor + \left\lfloor \dfrac{w}{2^{8-b}} \right\rfloor$
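LSB embedding can be sketched with integer shifts. The code below assumes 8-bit pixels and an 8-bit watermark whose b most significant bits replace the b least significant bits of the image pixel; this bit layout is an assumption made for the illustration, and the function names are illustrative.

```python
def embed_lsb(pixel, mark, b=2, depth=8):
    """Replace the b LSBs of a pixel with the b MSBs of the watermark pixel."""
    return ((pixel >> b) << b) | (mark >> (depth - b))

def extract_lsb(marked_pixel, b=2, depth=8):
    """Recover the embedded bits, scaled back to the watermark's MSB positions."""
    return (marked_pixel & ((1 << b) - 1)) << (depth - b)

marked = embed_lsb(200, 255)      # 11001000 with 11 written into the 2 LSBs
print(marked, extract_lsb(marked))

# The marked pixel never differs from the original by more than 2^b - 1.
print(max(abs(embed_lsb(p, m) - p) for p in range(256) for m in (0, 255)))
```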

Watermarking
Watermark encoding and decoding system:
The watermark w to be embedded is usually generated from a pseudo-random sequence obtained from a public/private key.
Watermark matching is done by comparison with a suitable threshold.
Figure: encoder and decoder block diagrams

Watermarking
Figure: visible watermarking and extraction; invisible watermarking and extraction

Watermarking
A watermarked image may undergo attacks, e.g., compression, filtering, cropping, rotation, re-sampling, etc.
A watermarking algorithm should be robust, so that the watermark remains intelligible after attacks.
Robust watermarking algorithms use spread-spectrum techniques, so that the embedded information (payload) is spread over the whole frequency range of the image.
Effect of JPEG compression on the estimated watermark (see Figure).
Image Compression
Watermarking
Comparison of marking strengths on images using a DCT-based
algorithm. The marked image and its difference from the unmarked
image are shown in the figure.
Image Compression
Watermarking
General principle of a transform-based (DCT, DFT, DWT, etc.)
watermarking encoder:
Forward transform the image and find the selected
coefficients (e.g., those of largest magnitude) to be modified.
Generate the watermark and modify the selected image
coefficients.
Inverse transform the marked coefficients.
General principle of a transform-based watermark decoder:
Compute the forward transform of the image concerned.
Compute the watermark from the image according to the known
embedding process.
Make a decision based on the similarity between the original and
the estimated watermark.
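The encoder and decoder steps above can be sketched on a single image row using a 1-D DCT. Several things here are assumptions made for illustration: the multiplicative embedding rule c' = c(1 + alpha * w), a decoder that has access to the original row, a Gaussian pseudo-random watermark seeded by a key, and all function names. A 2-D image would apply the same transform along rows and columns.

```python
import math
import random


def dct(x):
    """Orthonormal 1-D DCT-II."""
    n = len(x)
    return [math.sqrt((1 if k == 0 else 2) / n)
            * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
            for k in range(n)]


def idct(c):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(c)
    return [sum(math.sqrt((1 if k == 0 else 2) / n) * c[k]
                * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for k in range(n))
            for i in range(n)]


def make_watermark(key, length):
    """Pseudo-random watermark derived from a key (assumed Gaussian)."""
    rng = random.Random(key)
    return [rng.gauss(0.0, 1.0) for _ in range(length)]


def embed(signal, watermark, alpha=0.1):
    """Mark the largest-magnitude non-DC coefficients."""
    c = dct(signal)
    order = sorted(range(1, len(c)), key=lambda k: abs(c[k]), reverse=True)
    selected = order[:len(watermark)]
    for k, w in zip(selected, watermark):
        c[k] *= 1 + alpha * w  # assumed multiplicative embedding rule
    return idct(c), selected


def estimate(marked, original, selected, alpha=0.1):
    """Invert the embedding rule using the original coefficients."""
    c_marked, c_orig = dct(marked), dct(original)
    return [(c_marked[k] - c_orig[k]) / (alpha * c_orig[k]) for k in selected]


# Usage: embed a 4-element watermark in one 16-pixel row.
row = [52, 55, 61, 66, 70, 61, 64, 73, 63, 59, 55, 90, 109, 85, 69, 72]
w = make_watermark(key=1234, length=4)
marked, selected = embed(row, w)
w_hat = estimate(marked, row, selected)
```

Without an attack, `w_hat` matches `w` up to floating-point error; after compression or filtering the estimate degrades, which is why the final decision is based on a similarity measure rather than exact equality.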
Image Compression
Watermarking
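The correlation coefficient may be used to measure the similarity
between the estimated version $\hat{w}$ of the K-element watermark and the
watermark $w$ in question, given by
$$
\gamma = \frac{\sum_{i=1}^{K} (\hat{w}_i - \bar{\hat{w}})(w_i - \bar{w})}
{\sqrt{\sum_{i=1}^{K} (\hat{w}_i - \bar{\hat{w}})^2 \, \sum_{i=1}^{K} (w_i - \bar{w})^2}}
$$
A decision threshold T may be used for the decision:
H1: $\gamma \ge T$ (watermark exists)
H0: $\gamma < T$ (watermark does not exist)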
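The similarity measure and the threshold test can be written in a few lines. This is a sketch: the function names are introduced here, and the threshold value in the usage lines is an arbitrary choice for illustration, not a recommended setting.

```python
import math


def correlation(w_est, w):
    """Correlation coefficient between estimated and reference watermarks."""
    k = len(w)
    mean_est = sum(w_est) / k
    mean_w = sum(w) / k
    num = sum((a - mean_est) * (b - mean_w) for a, b in zip(w_est, w))
    den = math.sqrt(sum((a - mean_est) ** 2 for a in w_est)
                    * sum((b - mean_w) ** 2 for b in w))
    # A constant sequence has zero variance; treat it as uncorrelated.
    return num / den if den else 0.0


def watermark_exists(w_est, w, threshold):
    """H1 (True) if the correlation reaches the threshold, else H0 (False)."""
    return correlation(w_est, w) >= threshold


# Usage with an illustrative threshold T = 0.9.
reference = [2.0, 4.0, 6.0]
print(watermark_exists([1.0, 2.0, 3.0], reference, threshold=0.9))
print(watermark_exists([3.0, 2.0, 1.0], reference, threshold=0.9))
```

Because the means are subtracted and the result is normalized, the measure is insensitive to a constant offset or positive gain applied to the estimated watermark.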
Image Compression
Watermarking
Effect of attacks on the similarity measure (figure):
(a), (b) JPEG lossy compression with increasing strength; (c) smoothing;
(d) AWGN; (e) histogram equalization; (f) rotation.
Image Compression
Assignment
Problem #1
Image Compression
Assignment
Problem #2
Devise a dictionary to obtain the LZW-encoded output for the 8-bit image of size
4×8 given in Problem #1.
Problem #3
Problem #4
Image Compression
Assignment
Problem #5
