CMPT 365 Multimedia Systems: Lossy Compression

Lossy Compression

Fall 2012
Lossless vs Lossy Compression
If the compression and decompression processes
induce no information loss, then the compression
scheme is lossless; otherwise, it is lossy.
Why is lossy compression possible?

[Figure: the original image and three lossy reconstructions, labeled Compression Ratio: 7.7, Compression Ratio: 12.3, and Compression Ratio: 33.9]

Outline

Quantization
Uniform
Non-uniform
Vector quantization
Transform coding
DCT
Quantization
The process of representing a large (possibly infinite)
set of values with a much smaller set.
Example: A/D conversion
An efficient tool for lossy compression
Review
Encoder: Transform → Quantization → Entropy coding
channel
Decoder: Entropy decoding → Inverse Quantization → Inverse Transform
Review: Basic Idea
[Figure: an input value $x$ falls into one of Bin 0 to Bin 4; the quantizer outputs the bin index, which goes through entropy coding and entropy decoding; the dequantizer (inverse quantizer) maps the index to the reconstruction value $\hat{x}$ of that bin]
Quantization is a function that maps an input interval to one integer
Can reduce the bits required to represent the source.
Reconstructed result is generally not the original input
Terminology:
Decision boundaries $b_i$: bin boundaries
Reconstruction levels $y_i$: output value of each bin by the dequantizer.
Model of Quantization
Quantization: $q = A(x)$
Inverse Quantization: $\hat{x} = B(q) = B(A(x)) = Q(x)$
B(x) is not exactly the inverse function of A(x), because $\hat{x} \neq x$
Quantization error: $e(x) = x - \hat{x}$
Combining quantizer and de-quantizer: $x \to A \to q \to B \to \hat{x}$ is equivalent to $x \to Q \to \hat{x}$, or to adding $-e(x)$ to the input: $\hat{x} = x - e(x)$
Rate-Distortion Tradeoff
Things to be determined:
Number of bins
Bin boundaries
Reconstruction levels
A tradeoff between rate and distortion:
To reduce the size of the encoded bits, we need to reduce the number of bins
Fewer bins → more reconstruction error
[Figure: rate-distortion curve (distortion versus rate) with two operating points, A and B]
Measure of Distortion
Quantization error: $e(x) = x - \hat{x}$
Mean Squared Error (MSE) for Quantization
Average quantization error of all input values
Need to know the probability distribution of the input
Number of bins: M
Decision boundaries: $b_i$, i = 0, ..., M
Reconstruction Levels: $y_i$, i = 1, ..., M
Reconstruction: $\hat{x} = y_i$ iff $b_{i-1} < x \le b_i$
MSE:
$$MSE_q = \int_{-\infty}^{\infty} (x - \hat{x})^2 f(x)\,dx = \sum_{i=1}^{M} \int_{b_{i-1}}^{b_i} (x - y_i)^2 f(x)\,dx$$
Same as the variance of e(x) if $\mu_e = E\{e(x)\} = 0$ (zero mean).
Definition of Variance:
$$\sigma_e^2 = \int_{-\infty}^{\infty} (e - \mu_e)^2 f(e)\,de$$
Rate-Distortion Optimization
Two Scenarios:
Given M, find $b_i$ and $y_i$ that minimize the MSE.
Given a distortion constraint D, find M, $b_i$ and $y_i$ such that the MSE ≤ D.
Uniform Quantizer
All bins have the same size except possibly for the two outer intervals:
$b_i$ and $y_i$ are spaced evenly
The spacing of $b_i$ and $y_i$ are both $\Delta$ (step size)
Uniform Midrise Quantizer
[Figure: staircase of Reconstruction versus Input; decision boundaries at 0, ±Δ, ±2Δ, ±3Δ; reconstruction levels at ±0.5Δ, ±1.5Δ, ±2.5Δ, ±3.5Δ]
Even number of reconstruction levels
0 is not a reconstruction level
Uniform Midtread Quantizer
[Figure: staircase of Reconstruction versus Input; decision boundaries at ±0.5Δ, ±1.5Δ, ±2.5Δ; reconstruction levels at 0, ±Δ, ±2Δ, ±3Δ]
Odd number of reconstruction levels
0 is a reconstruction level
$y_i = \frac{1}{2}(b_{i-1} + b_i)$ for inner intervals.
Midtread Quantizer
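Quantization mapping:
$$q = A(x) = \mathrm{sign}(x) \left\lfloor \frac{|x|}{\Delta} + 0.5 \right\rfloor$$
Output is an index
De-quantization mapping:
$$\hat{x} = B(q) = q\,\Delta$$
Example:
$x = -1.8\Delta$, $q = -2$.
[Figure: midtread staircase of Reconstruction versus Input; decision boundaries at ±0.5Δ, ±1.5Δ, ±2.5Δ; reconstruction levels at 0, ±Δ, ±2Δ, ±3Δ]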
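The midtread mapping above can be sketched in a few lines of Python. This is only an illustration: the function names `midtread_quantize` and `midtread_dequantize` and the step size `delta = 1.0` are assumptions made for the example, not part of the slides.

```python
import math

def midtread_quantize(x, delta):
    """Quantization mapping A(x): return the integer bin index q."""
    sign = -1 if x < 0 else 1
    return sign * math.floor(abs(x) / delta + 0.5)

def midtread_dequantize(q, delta):
    """De-quantization mapping B(q): return the reconstruction level q * delta."""
    return q * delta

delta = 1.0                             # step size, chosen only for illustration
x = -1.8 * delta
q = midtread_quantize(x, delta)         # -2, as in the example above
x_hat = midtread_dequantize(q, delta)   # -2.0
error = x - x_hat                       # quantization error e(x) = x - x_hat
```

The error stays within half a step of zero for any input, which is the granular-noise behaviour of a uniform quantizer.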
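Uniform Quantization of a Uniformly Distributed Source
Input X: uniformly distributed in $[-X_{max}, X_{max}]$: $f(x) = 1 / (2X_{max})$
Number of bins: M (even for midrise quantizer)
Step size is easy to get: $\Delta = 2X_{max} / M$.
$b_i = (i - M/2)\,\Delta$
[Figure: M = 8 midrise quantizer on $[-X_{max}, X_{max}]$; decision boundaries $b_0 = -4\Delta$, $b_1 = -3\Delta$, ..., $b_4 = 0$, ..., $b_8 = 4\Delta$; reconstruction levels $y_1 = -3.5\Delta$, $y_2 = -2.5\Delta$, ..., $y_8 = 3.5\Delta$]
e(x) is uniformly distributed in $[-\Delta/2, \Delta/2]$.
[Figure: sawtooth plot of e(x) versus x, ranging between $-0.5\Delta$ and $0.5\Delta$ for x from $-4\Delta$ to $4\Delta$]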
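Uniform Quantization of a Uniformly Distributed Source
MSE
$$MSE_q = \int (x - \hat{x})^2 f(x)\,dx = \sum_{i=1}^{M} \int_{b_{i-1}}^{b_i} (x - y_i)^2 f(x)\,dx = \frac{M}{2X_{max}} \int_{0}^{\Delta} \left(x - \frac{\Delta}{2}\right)^2 dx = \frac{M}{2X_{max}} \cdot \frac{1}{12}\Delta^3 = \frac{1}{12}\Delta^2$$
M increases, $\Delta$ decreases, MSE decreases
Variance of a random variable uniformly distributed in $[-\Delta/2, \Delta/2]$:
$$\sigma_q^2 = \int_{-\Delta/2}^{\Delta/2} (x - 0)^2 \frac{1}{\Delta}\,dx = \frac{1}{12}\Delta^2$$
Optimization: Find M such that MSE ≤ D
$$\frac{1}{12}\Delta^2 \le D \;\Rightarrow\; \frac{1}{12}\left(\frac{2X_{max}}{M}\right)^2 \le D \;\Rightarrow\; M \ge X_{max}\sqrt{\frac{1}{3D}}$$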
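As a numerical check of the $\Delta^2/12$ result and of the bound on M, the sketch below integrates the MSE bin by bin with a midpoint rule. The function names, the number of integration steps, and the rounding of M up to an even number (for a midrise quantizer) are assumptions made for this illustration.

```python
import math

def uniform_midrise_mse(x_max, m, steps_per_bin=1000):
    """Numerically integrate (x - y_i)^2 f(x) over all M bins (midpoint rule)."""
    delta = 2.0 * x_max / m
    pdf = 1.0 / (2.0 * x_max)               # f(x) of the uniform source
    dx = delta / steps_per_bin
    total = 0.0
    for i in range(1, m + 1):
        b_lo = (i - 1 - m / 2) * delta      # decision boundary b_{i-1}
        y_i = b_lo + delta / 2              # reconstruction level of bin i
        for k in range(steps_per_bin):
            x = b_lo + (k + 0.5) * dx
            total += (x - y_i) ** 2 * pdf * dx
    return total

def min_bins_for_distortion(x_max, d):
    """Smallest even M with delta^2 / 12 <= D, i.e. M >= x_max / sqrt(3 D)."""
    m = math.ceil(x_max / math.sqrt(3.0 * d))
    return m if m % 2 == 0 else m + 1

mse_8_bins = uniform_midrise_mse(1.0, 8)     # close to 0.25**2 / 12
m_needed = min_bins_for_distortion(1.0, 0.01)
```

With $X_{max} = 1$ and D = 0.01 the bound gives M ≥ 5.77, so six bins are enough while four are not.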
Signal to Noise Ratio (SNR)
Variance is a measure of signal energy
Let $M = 2^n$
Each bin index is represented by n bits
$$SNR\,(dB) = 10\log_{10}\frac{\text{Signal Energy}}{\text{Noise Energy}} = 10\log_{10}\frac{(1/12)(2X_{max})^2}{(1/12)\Delta^2} = 10\log_{10}\frac{(2X_{max})^2}{(2X_{max}/M)^2} = 10\log_{10} M^2 = 10\log_{10} 2^{2n} = (20\log_{10} 2)\,n \approx 6.02\,n \text{ dB}$$
If n → n+1, $\Delta$ is halved, noise variance reduces to 1/4,
and SNR increases by 6 dB.
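The 6 dB-per-bit rule can be checked directly from the two variances. The sketch below assumes a source uniform on $[-X_{max}, X_{max}]$ as above; the function name `uniform_snr_db` and the default `x_max = 1.0` are illustrative choices.

```python
import math

def uniform_snr_db(n_bits, x_max=1.0):
    """SNR in dB of an M = 2^n level uniform quantizer on a uniform source."""
    m = 2 ** n_bits
    delta = 2.0 * x_max / m
    signal_energy = (2.0 * x_max) ** 2 / 12.0   # variance of the uniform source
    noise_energy = delta ** 2 / 12.0            # variance of the quantization error
    return 10.0 * math.log10(signal_energy / noise_energy)

# SNR for 1 to 8 bits per sample; each extra bit adds about 6.02 dB.
snr_by_bits = {n: uniform_snr_db(n) for n in range(1, 9)}
```

The result does not depend on `x_max`, since both energies scale with its square.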
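Uniform Quantization of a Non-uniformly Distributed Source
Many data are non-uniformly distributed and unbounded.
No closed-form solution for step size in general.
[Figure: M = 8 uniform quantizer on an unbounded source; decision boundaries $b_0 = -\infty$, $b_1 = -3\Delta$, ..., $b_4 = 0$, ..., $b_7 = 3\Delta$, $b_8 = \infty$; reconstruction levels $y_1 = -3.5\Delta$, ..., $y_8 = 3.5\Delta$]
Unbounded error in the first and the last bin
[Figure: plot of e(x) versus x; granular noise between $-0.5\Delta$ and $0.5\Delta$ inside $[-4\Delta, 4\Delta]$, overload noise growing without bound outside]
The optimal step size $\Delta$ can be solved by numerical methods.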
Non-uniform Quantization
Uniform quantizer is not optimal if source is not uniformly distributed
For given M, to reduce MSE, we want narrow bin when f(x) is high and wide bin when f(x) is low
[Figure: a peaked pdf f(x) centered at 0, with narrow bins near the peak and wide bins in the tails]
$$\sigma_q^2 = \int (x - \hat{x})^2 f(x)\,dx = \sum_{k=1}^{M} \int_{b_{k-1}}^{b_k} (x - y_k)^2 f(x)\,dx$$
Lloyd-Max Quantizer
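$$\sigma_q^2 = \int (x - \hat{x})^2 f(x)\,dx = \sum_{k=1}^{M} \int_{b_{k-1}}^{b_k} (x - y_k)^2 f(x)\,dx$$
Also known as pdf-optimized quantizer
Given M, the optimal $b_i$ and $y_i$ that minimize MSE, satisfying
Lagrangian condition: $\frac{\partial \sigma_q^2}{\partial y_i} = 0$, $\frac{\partial \sigma_q^2}{\partial b_i} = 0$.
$$\frac{\partial \sigma_q^2}{\partial y_i} = 0 \;\Rightarrow\; y_i = \frac{\int_{b_{i-1}}^{b_i} x f(x)\,dx}{\int_{b_{i-1}}^{b_i} f(x)\,dx}$$
$y_i$ is the centroid of interval $[b_{i-1}, b_i]$.
[Figure: pdf f(x) with the centroid $y_i$ marked inside the interval $[b_{i-1}, b_i]$]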
Lloyd-Max Quantizer
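If f(x) = c (uniformly distributed source):
$$y_i = \frac{\int_{b_{i-1}}^{b_i} x f(x)\,dx}{\int_{b_{i-1}}^{b_i} f(x)\,dx} = \frac{c \int_{b_{i-1}}^{b_i} x\,dx}{c\,(b_i - b_{i-1})} = \frac{\frac{1}{2}(b_i^2 - b_{i-1}^2)}{b_i - b_{i-1}} = \frac{1}{2}(b_i + b_{i-1})$$
$$\frac{\partial \sigma_q^2}{\partial b_i} = 0 \;\Rightarrow\; b_i = \frac{y_i + y_{i+1}}{2}$$
$b_i$ is the midpoint of $y_i$ and $y_{i+1}$
[Figure: pdf f(x) with boundaries $b_{i-1}$, $b_i$, $b_{i+1}$ and reconstruction levels $y_i$, $y_{i+1}$ marked]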
Lloyd-Max Quantizer
How to find optimal $b_i$ and $y_i$ simultaneously?
A deadlock:
Reconstruction levels depend on decision levels
Decision levels depend on reconstruction levels
Solution: iterative method!
Summary of conditions for optimal quantizer:
Given $b_i$, can find the corresponding optimal $y_i$:
$$y_i = \frac{\int_{b_{i-1}}^{b_i} x f(x)\,dx}{\int_{b_{i-1}}^{b_i} f(x)\,dx}$$
Given $y_i$, can find the corresponding optimal $b_i$:
$$b_i = \frac{y_i + y_{i+1}}{2}$$
Lloyd Algorithm (Sayood pp. 267)
1. Start from an initial set of reconstruction values $y_i$.

2. Find all decision levels: $b_i = \frac{y_i + y_{i+1}}{2}$

3. Compute MSE: $\sigma_q^2 = \sum_{k=1}^{M} \int_{b_{k-1}}^{b_k} (x - y_k)^2 f(x)\,dx$

4. Stop if MSE changes little from last time.

5. Otherwise, update $y_i$: $y_i = \frac{\int_{b_{i-1}}^{b_i} x f(x)\,dx}{\int_{b_{i-1}}^{b_i} f(x)\,dx}$, go to step 2.
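The five steps can be sketched in Python. One assumption to note: instead of integrating a known pdf f(x), this version runs on a training set of samples, so the centroid becomes a sample mean and the MSE becomes an average over the samples. The function name, the tolerance, and the toy data are illustrative only.

```python
def lloyd_quantizer(samples, levels, tol=1e-9, max_iter=1000):
    """Lloyd iteration on a training set (an empirical stand-in for the pdf f(x)).

    samples: list of floats; levels: initial reconstruction values y_i.
    Returns (levels, boundaries, mse).
    """
    y = sorted(levels)
    prev_mse = float("inf")
    for _ in range(max_iter):
        # Step 2: decision levels are midpoints of adjacent reconstruction levels.
        b = [(y[i] + y[i + 1]) / 2.0 for i in range(len(y) - 1)]
        # Assign each sample to its bin (b_{k-1} < x <= b_k).
        bins = [[] for _ in y]
        for x in samples:
            k = sum(1 for edge in b if x > edge)
            bins[k].append(x)
        # Step 3: empirical MSE for the current levels and boundaries.
        mse = sum((x - y[k]) ** 2 for k, xs in enumerate(bins) for x in xs) / len(samples)
        # Step 4: stop when the MSE changes little.
        if prev_mse - mse < tol:
            break
        prev_mse = mse
        # Step 5: move each level to the centroid (mean) of its bin.
        y = [sum(xs) / len(xs) if xs else y[k] for k, xs in enumerate(bins)]
    return y, b, mse

samples = [0.0, 0.1, 0.2, 0.9, 1.0, 1.1, 5.0, 5.2]    # toy training data
levels, boundaries, mse = lloyd_quantizer(samples, [0.0, 1.0, 2.0])
```

On this toy data the levels settle near 0.1, 1.0, and 5.1. Like the pdf-based version, the iteration only guarantees a local optimum, so the result depends on the initial levels.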

Vector Quantization
Group the input symbols into blocks of L samples
Find (from a codebook) the representative code-vector for
each block
Encode the index of the selected code-vector
Difficulties:
How to generate the codebook?
How to find the best code-vector for each input block
efficiently?

Comparison with Huffman:
Huffman coding is lossless
VQ is lossy
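The encoding side of the procedure above (block the input, search the codebook, emit an index) can be sketched as follows. The codebook values, the block length L = 2, and the function names are hypothetical; the search is a plain exhaustive nearest-neighbour search under squared Euclidean distance, which is the simplest choice rather than an efficient one.

```python
def vq_encode(samples, codebook):
    """Split samples into blocks of L and return the nearest code-vector index per block."""
    block_len = len(codebook[0])
    indices = []
    for start in range(0, len(samples) - block_len + 1, block_len):
        block = samples[start:start + block_len]
        # Exhaustive nearest-neighbour search under squared Euclidean distance.
        distances = [sum((a - c) ** 2 for a, c in zip(block, code)) for code in codebook]
        indices.append(distances.index(min(distances)))
    return indices

def vq_decode(indices, codebook):
    """Replace each index by its code-vector to rebuild the (lossy) signal."""
    out = []
    for idx in indices:
        out.extend(codebook[idx])
    return out

codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 1.0), (2.0, -2.0)]   # hypothetical, L = 2
samples = [0.9, 1.2, 0.1, -0.2, 1.8, -2.1]
indices = vq_encode(samples, codebook)        # [1, 0, 3]
reconstruction = vq_decode(indices, codebook)
```

The exhaustive search costs one distance computation per code-vector per block, which is why efficient search is listed as a difficulty.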
Vector Quantization
Codebook Design
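Scalar quantizer:
Smaller bin for larger f(x)
[Figure: pdf f(x) centered at 0 with non-uniform bins]
Vector quantizer:
Assign more code-vectors to regions
with higher joint probability.
[Figure: 2-D scatter plot of sample pairs, both axes from about -4 to 4, with code-vectors placed more densely where samples cluster]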
Basic VQ Procedure
Limitation of VQ
Codebook size grows exponentially:
Desired bit rate: R bits / sample
Each vector contains L samples
Vector index size: RL bits
2^(RL) possible vectors in the codebook
Example:
R = 0.25 bits / sample, L = 16 samples / vector
2^(RL) = 16 entries in the codebook
R = 2 bits / sample, L = 16 samples / vector
2^(RL) = 2^32 = 4-billion entries in the codebook!
Not practical for high rates
Delay: the encoder must collect all L samples of a block before it can code the vector.
Why Transform Coding?
Transform
From one domain/space to another space
Time -> Frequency
Spatial/Pixel -> Frequency
Purpose of transform
Remove correlation between input samples
Transform most energy of an input block into a few
coefficients
Small coefficients can be discarded by quantization without too
much impact on reconstruction quality
[Figure: encoder block diagram: Transform → Quantization → Entropy coding]
1-D Example
Fourier Transform
1-D Example
Application (besides compression)
Boost bass/audio equalizer
Noise cancellation
1-D Example
http://www.mathdemos.org/mathdemos/trigsounddemo/trigsounddemo.html
Sine wave/sound/piano

www.sagebrush.com/mousing.htm
An electronic instrument that allows direct control of pitch and
amplitude
1-D Example
Smooth signals have strong DC (direct current, or zero frequency) and low
frequency components, and weak high frequency components
[Figure: three panels plotted against Sample Index (1 to 8): Original Input, DFT Magnitudes, and DCT Coefficients; the DC end and the high-frequency end of the coefficient axes are marked]
2-D Example
Apply transform to each 8x8 block
[Figure: Original Image and its 2-D DCT Coefficients (min = -465.37, max = 1789.00)]
Histograms of source and DCT coefficients
[Figure: histogram of the source pixel values (horizontal axis 0 to 300) and histogram of the DCT coefficients (horizontal axis -500 to 2000, counts up to 3 x 10^5)]
Most transform coefficients are around 0.
Desired for compression
Rationale behind Transform
If Y is the result of a linear transform T of the
input vector X in such a way that the components
of Y are much less correlated, then Y can be
coded more efficiently than X.
If most information is accurately described by
the first few components of a transformed
vector, then the remaining components can be
coarsely quantized, or even set to zero, with little
signal distortion.

AR(1) Model
Most data have some correlation
AR(1) model: First order Auto-regressive Model
$$x(n) = \rho\, x(n-1) + e(n)$$
x(n): input samples
$\rho$: correlation coefficient
e(n): white noise
zero mean,
E{e(i) e(j)} = 0 for i ≠ j.
A very good model for natural images when $\rho$ is close to 1. (Each
pixel is closely related to its neighbor).
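A short simulation makes the role of the correlation coefficient concrete. The sketch assumes Gaussian white noise with unit variance and a fixed seed; both are choices for this illustration, since the model only requires zero-mean, uncorrelated noise.

```python
import random

def generate_ar1(rho, n, seed=0):
    """Generate n samples of x(n) = rho * x(n-1) + e(n) with Gaussian white noise e(n)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, 1.0))
    return x

def lag1_correlation(x):
    """Estimate the correlation between neighboring samples."""
    mean = sum(x) / len(x)
    num = sum((x[i] - mean) * (x[i - 1] - mean) for i in range(1, len(x)))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

signal = generate_ar1(rho=0.95, n=20000)
estimate = lag1_correlation(signal)     # should land close to rho
```

With rho near 1 neighboring samples are almost equal, which is the redundancy a transform is meant to remove.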

Matrix Representation of Transform
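Linear transform is an N x N matrix:
$$y_{N \times 1} = T_{N \times N}\, x_{N \times 1} \qquad (x \to T \to y)$$
Inverse Transform:
$$x = T^{-1} y \qquad (x \to T \to y \to T^{-1} \to x)$$
Unitary Transform (aka orthonormal):
$$T^{-1} = T^T \qquad (x \to T \to y \to T^T \to x)$$
For unitary transform: rows/cols have unit norm and are
orthogonal to each other
$$T T^T = I \;\Leftrightarrow\; t_i\, t_j^T = \delta_{ij} = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases}$$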
Discrete Cosine Transform (DCT)
The DCT is close to the optimal transform (the KL transform) but much
simpler and faster
Definition:
$$C_{i,j} = a \cos\left(\frac{(2j+1)\, i\, \pi}{2N}\right), \quad i, j = 0, \ldots, N-1,$$
$$a = \sqrt{1/N} \text{ for } i = 0, \qquad a = \sqrt{2/N} \text{ for } i = 1, \ldots, N-1.$$
Matlab function:
dct(eye(N));
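For readers without Matlab, the same matrix can be built directly from the definition. This is a plain-Python sketch; the helper names `dct_matrix` and `gram` are made up for the example. It also checks the orthonormality property $T T^T = I$.

```python
import math

def dct_matrix(n):
    """Build the N x N DCT matrix C with C[i][j] = a * cos((2j + 1) i pi / (2N))."""
    rows = []
    for i in range(n):
        a = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
        rows.append([a * math.cos((2 * j + 1) * i * math.pi / (2 * n)) for j in range(n)])
    return rows

def gram(t):
    """Return T T^T; it should be the identity for an orthonormal T."""
    n = len(t)
    return [[sum(t[i][k] * t[j][k] for k in range(n)) for j in range(n)] for i in range(n)]

C4 = dct_matrix(4)      # second row is about [0.6533, 0.2706, -0.2706, -0.6533]
G4 = gram(C4)           # numerically the 4 x 4 identity
```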
DCT
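Definition:
$$C_{i,j} = a \cos\left(\frac{(2j+1)\, i\, \pi}{2N}\right), \quad i, j = 0, \ldots, N-1,$$
$$a = \sqrt{1/N} \text{ for } i = 0, \qquad a = \sqrt{2/N} \text{ for } i = 1, \ldots, N-1.$$
N = 2 (Haar Transform):
$$C_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$$
$$\begin{bmatrix} y_0 \\ y_1 \end{bmatrix} = C_2 \begin{bmatrix} x_0 \\ x_1 \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} x_0 + x_1 \\ x_0 - x_1 \end{bmatrix}$$
$y_0$ captures the mean of $x_0$ and $x_1$ (low-pass)
$x_0 = x_1 = 1$ ⇒ $y_0 = \sqrt{2}$ (DC), $y_1 = 0$
$y_1$ captures the difference of $x_0$ and $x_1$ (high-pass)
$x_0 = 1$, $x_1 = -1$ ⇒ $y_0 = 0$ (DC), $y_1 = \sqrt{2}$.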

DCT
Magnitude Frequency Responses of 2-point DCT:
Can be obtained by freqz( ) in Matlab.
[Figure: Magnitude Response (dB), from -40 to 5, versus Normalized Frequency, from 0 to 0.5, for the two rows of $C_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$: one low-pass curve (passing DC) and one high-pass curve. Plot title: DC Att. > 403.0103, Mirr Att. > 324.2604, Stopband > 50, Coding Gain = 5.055 dB]
4-point DCT
Four subbands
0.5000 0.5000 0.5000 0.5000
0.6533 0.2706 -0.2706 -0.6533
0.5000 -0.5000 -0.5000 0.5000
0.2706 -0.6533 0.6533 -0.2706
[Figure: Magnitude Response (dB), from -40 to 5, versus normalized frequency, from 0 to 0.5, for the four rows of the 4-point DCT. Plot title: DC Att. > 406.0206, Mirr Att. > 324.2604, Stopband > 8.3456, Coding Gain = 7.5701 dB]
8-point DCT
Eight subbands
[Figure: Magnitude Response (dB), from -40 to 5, versus normalized frequency, from 0 to 0.5, for the eight rows of the 8-point DCT. Plot title: DC Att. > 409.0309, Mirr Att. > 320.1639, Stopband > 9.9559, Coding Gain = 8.8259 dB]
Example
$x = [100\ 110\ 120\ 130\ 140\ 150\ 160\ 170]^T$;
8-point DCT:
[381.8377, -64.4232, 0.0, -6.7345, 0.0, -2.0090, 0.0, -0.5070]
Most of the energy is in the first 2 coefficients.
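The example can be reproduced from the DCT definition with a direct, unoptimized sum. The function name `dct_1d` and the energy-fraction calculation are additions for illustration.

```python
import math

def dct_1d(x):
    """Orthonormal N-point DCT of a list, computed directly from the definition."""
    n = len(x)
    y = []
    for i in range(n):
        a = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
        y.append(a * sum(x[j] * math.cos((2 * j + 1) * i * math.pi / (2 * n)) for j in range(n)))
    return y

x = [100, 110, 120, 130, 140, 150, 160, 170]
y = dct_1d(x)                    # starts with about 381.8377, -64.4232, 0.0, ...
energy = [c * c for c in y]
fraction_in_first_two = sum(energy[:2]) / sum(energy)   # share of energy in y[0], y[1]
```

Because the transform is orthonormal, the total energy of `y` equals that of `x`, so the fraction is a fair measure of energy compaction.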
[Figure: two plots against sample index 1 to 8: the input x (values between 50 and 250) and its 8-point DCT coefficients (values between -100 and 400)]
Block Transform
Divide input data into blocks (2D)
Encode each block separately (sometimes with information from
neighboring blocks)
Examples:
Most DCT-based image/video coding standards

2-D DCT Basis
[Figure: 2-D DCT basis images for the 2-point DCT and for the 4-point DCT]
2-D Separable DCT
X: N x N input block
T: N x N transform
A = TX: Apply T to each column of X
$B = X T^T$: Apply T to each row of X
2-D Separable Transform:
Apply T to each row
Then apply T to each column
$$Y = T X T^T$$
Inverse Transform:
$$X = T^T Y T$$
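The separable transform pair can be sketched with plain nested lists. The helper names and the 2 x 2 input block are illustrative; T is the 2-point DCT, and the inverse formula relies on T being orthonormal.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def forward_2d(t, x):
    """Y = T X T^T: transform the columns, then the rows."""
    return matmul(matmul(t, x), transpose(t))

def inverse_2d(t, y):
    """X = T^T Y T (valid when T is orthonormal)."""
    return matmul(matmul(transpose(t), y), t)

s = 1.0 / math.sqrt(2.0)
T = [[s, s], [s, -s]]            # 2-point DCT
X = [[1.0, 2.0], [3.0, 4.0]]     # illustrative input block
Y = forward_2d(T, X)             # approximately [[5, -1], [-2, 0]]
X_back = inverse_2d(T, Y)        # recovers X up to rounding
```

Matrix multiplication is associative, so applying T to the rows first or to the columns first gives the same Y.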
2-D 8-point DCT Example
Original Data:
89 78 76 75 70 82 81 82
122 95 86 80 80 76 74 81
184 153 126 106 85 76 71 75
221 205 180 146 97 71 68 67
225 222 217 194 144 95 78 82
228 225 227 220 193 146 110 108
223 224 225 224 220 197 156 120
217 219 219 224 230 220 197 151
2-D DCT Coefficients (after rounding to integers):
1155 259 -23 6 11 7 3 0
-377 -50 85 -10 10 4 7 -3
-4 -158 -24 42 -15 1 0 1
-2 3 -34 -19 9 -5 4 -1
1 9 6 -15 -10 6 -5 -1
3 13 3 6 -9 2 0 -3
8 -2 4 -1 3 -1 0 -2
2 0 -3 2 -2 0 0 -1
Most energy is in the upper-left corner
Interpretation of Transform
Forward transform y = Tx (x is N x 1 vector)
Let $t_i$ be the i-th row of T
$y_i = t_i x = \langle t_i^T, x \rangle$ (Inner product)
$y_i$ measures the similarity between x and $t_i$
Higher similarity → larger transform coefficient
Inverse transform:
$$x = T^T y = \begin{bmatrix} t_0^T & t_1^T & \cdots & t_{N-1}^T \end{bmatrix} y = \sum_{i=0}^{N-1} y_i\, t_i^T$$
x is the weighted combination of $t_i$.
Rows of T are called basis vectors.
Interpretation of 2-D Transform
2-D basis matrices:
Outer products of basis vectors: $\{ t_i^T t_j \}$, $i, j = 0, \ldots, N-1$.
$$Y = T X T^T, \qquad X = T^T Y T$$
Proof:
Define $S_{i,j}$: the (i, j)-th entry is Y(i, j), all others are 0.
$$X = T^T Y T = T^T \left( \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} S_{i,j} \right) T = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} Y(i,j)\, t_i^T t_j$$
X is the weighted combination of basis matrices.
2-D DCT Basis Matrices
[Figure: 2-D DCT basis matrices for the 2-point DCT, the 4-point DCT, and the 8-point DCT]
Further Exploration
Textbooks:
Introduction to Data Compression by Khalid Sayood
Vector Quantization and Signal Compression by Allen Gersho and Robert M. Gray
Digital Image Processing by Rafael C. Gonzalez and Richard E. Woods
Probability and Random Processes with Applications to Signal Processing by Henry Stark and John W. Woods
A Wavelet Tour of Signal Processing by Stephane G. Mallat
Web sites:
A link to an excellent article, "Image Compression: from DCT to Wavelets: A Review".
