Approaches
Lossless: information preserving; low compression ratios.
Lossy: not information preserving; high compression ratios.
Data vs. Information
Data and information are not synonymous terms! Data are the means by which information is conveyed. Data compression aims to reduce the amount of data required to represent a given quantity of information while preserving as much information as possible.
Ex1:
"Your subject faculty, AK, will meet you at SAC101 in SAC Building at 5 minutes past 9:00 am tomorrow morning."
"AK Sir will meet you at SAC101 at 5 minutes past 9:00 am tomorrow morning."
"AK Sir will meet you at SAC101 at 9:00 am tomorrow."
Each version conveys essentially the same information using progressively less data.
Data Redundancy
Let n1 and n2 denote the number of information-carrying units in two data sets representing the same information (n1 original, n2 after compression).
Compression ratio: C = n1 / n2
Relative data redundancy: R = 1 - 1/C
Example: if C = 10 (i.e., 10:1), then R = 0.9, so 90% of the original data is redundant.
Three basic redundancy types: coding redundancy, interpixel redundancy, and psychovisual redundancy.
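A minimal sketch of these definitions; the 256x256 image size and the assumed compression to 1 bit/pixel are illustrative numbers, not from the slides.

```python
# Compression ratio C = n1/n2 and relative redundancy R = 1 - 1/C
# for a hypothetical 256x256, 8-bit image compressed to 1 bit/pixel.
def compression_ratio(n1, n2):
    """C = n1 / n2, where n1 = original size, n2 = compressed size."""
    return n1 / n2

def relative_redundancy(c):
    """R = 1 - 1/C: fraction of the original data that is redundant."""
    return 1.0 - 1.0 / c

n1 = 256 * 256 * 8   # original: 8 bits per pixel
n2 = 256 * 256 * 1   # assumed compressed size: 1 bit per pixel
C = compression_ratio(n1, n2)
print(C)                       # 8.0
print(relative_redundancy(C))  # 0.875
```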
Coding Redundancy
Code: a list of symbols (letters, numbers, bits, etc.).
Code word: a sequence of symbols used to represent a piece of information or an event (e.g., a gray level).
Expected value:
E(X) = Σ_x x P(X = x)
Average code length: L_avg = E(l(r_k)) = Σ_k l(r_k) P(r_k), where l(r_k) is the length of the code word assigned to gray level r_k.
Example: assigning shorter code words to the more probable gray levels reduces L_avg below that of a fixed-length code.
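The expected-value formula applied to code lengths can be sketched as follows; the probabilities and code lengths below are assumed for illustration.

```python
# Average code length L_avg = sum over k of l(r_k) * P(r_k) for a
# fixed-length code versus a variable-length code on the same source.
probs  = [0.4, 0.3, 0.2, 0.1]  # assumed P(r_k) for four gray levels
fixed  = [2, 2, 2, 2]          # fixed-length 2-bit code
varlen = [1, 2, 3, 3]          # shorter code words for likelier levels

def avg_length(lengths, probs):
    return sum(l * p for l, p in zip(lengths, probs))

print(avg_length(fixed, probs))   # ~2.0 bits/symbol
print(avg_length(varlen, probs))  # ~1.9 bits/symbol
```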
Interpixel Redundancy
Interpixel redundancy implies that any pixel value can be reasonably predicted from the values of its neighbors.
Correlation: (f ∘ g)(x) = ∫ f(a) g(x + a) da
Autocorrelation: the case f(x) = g(x).
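A discrete sketch of autocorrelation on a single image row; the pixel values are assumed for illustration, and show that neighboring pixels are strongly correlated.

```python
# Normalized discrete autocorrelation of one image row (the f = g case).
row = [10, 12, 11, 13, 12, 14, 13, 15]  # assumed pixel values

def autocorr(f, shift):
    """Autocorrelation at a given shift, normalized by the zero-shift value."""
    n = len(f) - shift
    num = sum(f[i] * f[i + shift] for i in range(n)) / n
    den = sum(v * v for v in f) / len(f)
    return num / den

print(autocorr(row, 0))        # 1.0 at zero shift
print(autocorr(row, 1) > 0.9)  # True: neighbors are highly correlated
```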
Example: [figure omitted: sample images and their autocorrelation]
Psychovisual Redundancy
The human eye does not respond with equal sensitivity to all visual information; some of it can be eliminated without significantly impairing perceived quality.
Example: requantizing from 8 bits/pixel to 4 bits/pixel gives C = 8/4 = 2:1.
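A minimal sketch of the 8-to-4-bit requantization behind the 2:1 ratio above; reconstructing to the middle of each bin is an assumption of this sketch.

```python
# Uniform requantization of 8-bit pixels to 4 bits (psychovisual redundancy).
def quantize_8_to_4(pixel):
    """Keep only the 4 most significant bits of an 8-bit value."""
    return pixel >> 4

def dequantize(level):
    """Map a 4-bit level back to the middle of its 8-bit bin."""
    return (level << 4) + 8

p = 200
q = quantize_8_to_4(p)
print(q)              # 12
print(dequantize(q))  # 200
```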
Modeling Information
Information generation is assumed to be a probabilistic process.
A random event E with probability P(E) contains I(E) = -log P(E) units of information!
The average information per source output is the expected value of I:
E = Σ_{k=0}^{L-1} I(r_k) P(r_k)

Entropy
Using base-2 logarithms, the average information per pixel (the entropy) is
H = -Σ_{k=0}^{L-1} P(r_k) log2 P(r_k)   units/pixel
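The entropy formula can be sketched directly from the normalized gray-level histogram; the input here is the same two-level 4x4 image used in the LZW example later in these notes.

```python
import math

# First-order entropy estimate: H = -sum p * log2(p) over the
# normalized gray-level histogram of the image.
def entropy(pixels):
    hist = {}
    for p in pixels:
        hist[p] = hist.get(p, 0) + 1
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in hist.values())

img = [39, 39, 126, 126] * 4  # 4x4 image with two equally likely levels
print(entropy(img))           # 1.0 bit/pixel
```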
Redundancy (revisited)
Redundancy: R = L_avg - H
where L_avg is the average code-word length and H is the entropy; R approaches 0 as the code approaches the entropy bound.
Entropy Estimation
It is not easy to estimate H reliably!
A first-order estimate is computed from the gray-level histogram of the image; estimating H on a transformed image (e.g., pixel differences) can give a lower value.
Better than before (i.e., H = 1.81 for the original image). However, a better transformation could be found, since a first-order estimate is still only an upper bound on the true entropy.
Fidelity Criteria
How close is the reconstructed image f̂(x, y) to the original f(x, y)?
Criteria: subjective (rating by human observers) and objective (e.g., RMS error, mean-square SNR).
Lossless Compression
Huffman Coding
A variable-length coding technique; an optimal code (i.e., it minimizes the number of code symbols per source symbol) when the symbol probabilities are known.
Huffman Coding/Decoding
After the code has been created, coding/decoding can be implemented using a simple look-up table; decoding is unambiguous because no code word is a prefix of another.
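A sketch of Huffman code construction (repeatedly merge the two least probable sources); the six-symbol source and its probabilities are an assumed example, not taken from the slides.

```python
import heapq

def huffman_code(probs):
    """probs: dict symbol -> probability; returns dict symbol -> bit string."""
    # Heap items: (probability, tiebreak index, {symbol: code-so-far}).
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    n = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)  # two least probable sources
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, n, merged))
        n += 1
    return heap[0][2]

probs = {"a1": 0.4, "a2": 0.3, "a3": 0.1, "a4": 0.1, "a5": 0.06, "a6": 0.04}
code = huffman_code(probs)
avg = sum(probs[s] * len(code[s]) for s in probs)
print(avg)  # ~2.2 bits/symbol
```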
Arithmetic Coding
No one-to-one correspondence between source symbols and code words: an entire sequence of source symbols is assigned a single arithmetic code word, i.e., a subinterval of [0, 1), allowing a fractional number of bits per symbol to be represented.
1) Start with the interval [0, 1).
2) Subdivide [0, 1) based on the probability of each symbol αi.
3) Update the interval by processing source symbols one at a time.
Example
Encode the message a1 a2 a3 a3 a4, where P(a1) = 0.2, P(a2) = 0.2, P(a3) = 0.4, P(a4) = 0.2 (so a1 → [0.0, 0.2), a2 → [0.2, 0.4), a3 → [0.4, 0.8), a4 → [0.8, 1.0)).
Successive intervals: [0, 1) → [0, 0.2) → [0.04, 0.08) → [0.056, 0.072) → [0.0624, 0.0688) → [0.06752, 0.0688).
The message a1 a2 a3 a3 a4 is encoded using 3 decimal digits (any number in the final interval, e.g., 0.068).
Arithmetic Decoding
Decode 0.572 with the same model (a1 → [0.0, 0.2), a2 → [0.2, 0.4), a3 → [0.4, 0.8), a4 → [0.8, 1.0)).
At each step, find the subinterval containing the code value, output the corresponding symbol, and subdivide that subinterval according to the probability distribution values:

  Boundary      Step 1   Step 2   Step 3   Step 4    Step 5
  upper (a4)    1.0      0.8      0.72     0.592     0.5728
  a3 / a4       0.8      0.72     0.688    0.5856    0.57152
  a2 / a3       0.4      0.56     0.624    0.5728    0.56896
  a1 / a2       0.2      0.48     0.592    0.5664    0.56768
  lower (a1)    0.0      0.4      0.56     0.56      0.5664
  Decoded       a3       a3       a1       a2        a4

Decoded message: a3 a3 a1 a2 a4.
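A float-based sketch of the interval arithmetic above, using the same four-symbol model; real arithmetic coders use integer arithmetic with renormalization to avoid precision loss on long messages.

```python
# Minimal arithmetic encoder/decoder for the a1..a4 model of the example.
model = {"a1": (0.0, 0.2), "a2": (0.2, 0.4), "a3": (0.4, 0.8), "a4": (0.8, 1.0)}

def encode(message):
    low, high = 0.0, 1.0
    for sym in message:
        lo, hi = model[sym]
        width = high - low
        low, high = low + lo * width, low + hi * width
    return low, high  # any number in [low, high) encodes the message

def decode(value, n):
    out = []
    low, high = 0.0, 1.0
    for _ in range(n):
        width = high - low
        scaled = (value - low) / width
        for sym, (lo, hi) in model.items():
            if lo <= scaled < hi:
                out.append(sym)
                low, high = low + lo * width, low + hi * width
                break
    return out

low, high = encode(["a1", "a2", "a3", "a3", "a4"])
print(low, high)         # final interval ~ [0.06752, 0.0688): 0.068 lies inside
print(decode(0.572, 5))  # ['a3', 'a3', 'a1', 'a2', 'a4']
```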
LZW Coding
Assigns fixed-length code words to variable-length symbol sequences.
Patented algorithm, US 4,558,302. Included in the GIF, TIFF, and PDF file formats.
A codebook (or dictionary) needs to be constructed. Initially, the first 256 entries of the dictionary are assigned to the single pixel values 0, 1, ..., 255.
Consider a 4x4, 8-bit image:
  39  39 126 126
  39  39 126 126
  39  39 126 126
  39  39 126 126

Initial dictionary:
  Location   Entry
  0          0
  1          1
  ...        ...
  255        255

Processing the first two pixels:
- Is 39 in the dictionary? Yes.
- What about 39-39? No.
- Then add 39-39 at entry 256.
Example
Sequence: 39 39 39 39 39 39 39 39 126 126 126 126 126 126 126 126
Concatenated sequence: CS = CR + P, where CR is the currently recognized sequence and P is the pixel being processed.

CR = empty
If CS is found in the dictionary:
    (1) no output
    (2) CR = CS
else:
    (1) output D(CR)
    (2) add CS to the dictionary D
    (3) CR = P
Decoding LZW
The dictionary which was used for encoding need not be sent with the image.
Can be built on the fly by the decoder as it reads the received code words.
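The CS/CR/P procedure above can be sketched as follows; the dictionary is keyed by pixel tuples and new entries continue from 255, and the decoder rebuilds it on the fly exactly as described.

```python
# LZW encoding/decoding over pixel values, dictionary preloaded with
# the 256 single-pixel entries.
def lzw_encode(pixels):
    d = {(i,): i for i in range(256)}
    out, cr = [], ()
    for p in pixels:
        cs = cr + (p,)
        if cs in d:
            cr = cs            # keep growing the recognized sequence CR
        else:
            out.append(d[cr])  # output D(CR)
            d[cs] = len(d)     # add CS to the dictionary
            cr = (p,)          # restart from the current pixel
    if cr:
        out.append(d[cr])
    return out

def lzw_decode(codes):
    d = {i: (i,) for i in range(256)}
    prev = d[codes[0]]
    out = list(prev)
    for c in codes[1:]:
        # Special case: code not yet in the dictionary (sequence ends
        # with its own first symbol).
        entry = d[c] if c in d else prev + (prev[0],)
        out.extend(entry)
        d[len(d)] = prev + (entry[0],)  # rebuild the dictionary on the fly
        prev = entry
    return out

img = [39, 39, 126, 126] * 4   # the 4x4 image from the example
codes = lzw_encode(img)
print(codes)  # [39, 39, 126, 126, 256, 258, 260, 259, 257, 126]
assert lzw_decode(codes) == img
```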
Lossless Predictive Coding
Predict each pixel value based on its neighbors (e.g., a linear combination) to get a predicted image; subtracting the prediction from the actual image yields a differential or residual image, which is easier to encode.
Run-Length Coding
Encodes a run of symbols into two bytes: (symbol, count).
Can compress any type of data, but cannot achieve high compression ratios compared to other compression methods.
Most useful for data with long runs, e.g., binary images.
e.g., (0,1)(1,1)(0,1)(1,0)(0,2)(1,4)(0,2)
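A minimal (symbol, count) coder sketch; the short binary input is assumed for illustration.

```python
# Run-length encode/decode as (symbol, count) pairs.
def rle_encode(seq):
    out = []
    i = 0
    while i < len(seq):
        j = i
        while j < len(seq) and seq[j] == seq[i]:
            j += 1                 # extend the current run
        out.append((seq[i], j - i))
        i = j
    return out

def rle_decode(pairs):
    return [s for s, n in pairs for _ in range(n)]

bits = [0, 1, 1, 1, 1, 0, 0]
print(rle_encode(bits))  # [(0, 1), (1, 4), (0, 2)]
assert rle_decode(rle_encode(bits)) == bits
```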
Lossy Compression
Transform the image into a domain where compression is easier (i.e., where most of the visual information is packed into a few coefficients), then quantize and encode.
The image is approximated from only K of its N transform coefficients, K << N:
f̂(x, y) = Σ_{u=0}^{K-1} Σ_{v=0}^{K-1} T(u, v) h(x, y, u, v)
Transform Selection
DCT
forward:
C(u, v) = α(u) α(v) Σ_{x=0}^{N-1} Σ_{y=0}^{N-1} f(x, y) cos[(2x+1)uπ / 2N] cos[(2y+1)vπ / 2N]
inverse:
f(x, y) = Σ_{u=0}^{N-1} Σ_{v=0}^{N-1} α(u) α(v) C(u, v) cos[(2x+1)uπ / 2N] cos[(2y+1)vπ / 2N]
where
α(u) = √(1/N) if u = 0, √(2/N) if u > 0   (and likewise α(v) = √(1/N) if v = 0, √(2/N) if v > 0)
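A direct (unoptimized, O(N^4)) sketch of the forward 2D DCT defined above; the constant 4x4 test block is an assumed input, chosen so all the energy lands in the DC coefficient.

```python
import math

def alpha(u, n):
    """Normalization factor: sqrt(1/N) for u = 0, sqrt(2/N) otherwise."""
    return math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)

def dct2(f):
    """Forward 2D DCT of an NxN block by direct evaluation."""
    n = len(f)
    return [[alpha(u, n) * alpha(v, n) * sum(
                f[x][y]
                * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                for x in range(n) for y in range(n))
             for v in range(n)]
            for u in range(n)]

# A constant block concentrates all its energy in C(0, 0).
block = [[8] * 4 for _ in range(4)]
C = dct2(block)
print(C[0][0])             # 32.0 (= N * block mean)
print(abs(C[0][1]) < 1e-9)  # True: no energy in higher frequencies
```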
DCT (contd)
Basis set of functions for a 4x4 image (i.e., cosines of different frequencies).
DCT (contd)
[Figure: reconstructions using the DFT, WHT, and DCT with 8 x 8 subimages; quoted RMS errors include 1.78 and 1.13.]
DCT (contd)
DCT minimizes "blocking artifacts" (i.e., visible boundaries between subimages).
DFT: the implicit n-point periodicity gives rise to discontinuities at the block boundaries!
DCT: the implicit 2n-point (mirrored) periodicity prevents such discontinuities!
DCT (contd)
Subimage size selection: reconstruction error decreases as the subimage size grows.
[Figure: original image and reconstructions using 2 x 2, 4 x 4, and 8 x 8 subimages.]
Lossy Predictive Coding
Predict each pixel value based on certain neighbors (e.g., a linear combination) to get a predicted image; subtracting it from the original yields a differential or residual image.
The difference between the original and predicted images is what gets coded, e.g., using Huffman coding.
Block diagram: input x_m → predictor → prediction p_m; difference d_m = x_m - p_m → quantizer → entropy encoder.
It differs from the lossless scheme in that (i) the residual undergoes quantization and (ii) the pixels are predicted from the reconstructed values of certain neighbours.
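A toy 1-D sketch of the scheme above; the previous-sample predictor, the uniform quantizer step of 4, and the signal values are all assumptions of this sketch (real image coders predict from 2-D neighbours).

```python
# DPCM sketch: predict each sample from the previous *reconstructed* value,
# quantize the difference d_m, and let the decoder mirror the reconstruction.
def quantize(d, step=4):
    return round(d / step)           # transmitted symbol (small integer)

def dpcm_encode(samples, step=4):
    codes, recon = [], 0
    for x in samples:
        q = quantize(x - recon, step)
        codes.append(q)
        recon = recon + q * step     # track what the decoder will have
    return codes

def dpcm_decode(codes, step=4):
    out, recon = [], 0
    for q in codes:
        recon = recon + q * step
        out.append(recon)
    return out

sig = [10, 12, 13, 15, 40, 41, 43]
codes = dpcm_encode(sig)
print(codes)               # mostly small values -> low entropy
print(dpcm_decode(codes))  # close to the original (within step/2)
```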
Subband Coding
Analyze the image to produce components containing frequencies in well-defined bands (i.e., subbands), e.g., using the wavelet transform.
Optimize quantization/coding in each subband separately.
Vector Quantization
Develop a dictionary of fixed-size vectors (i.e., code vectors), typically blocks of pixels.
Each image block is then encoded by the index of its closest code vector.
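A toy sketch of nearest-code-vector encoding; the hand-picked codebook of 2-pixel code vectors is an assumption (real systems learn the codebook from training data, e.g., with LBG/k-means).

```python
# Vector quantization: encode each block as the index of the nearest
# code vector (squared Euclidean distance).
codebook = [(0, 0), (128, 128), (255, 255), (0, 255)]  # assumed code vectors

def nearest(block):
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], block)))

def vq_encode(blocks):
    return [nearest(b) for b in blocks]

def vq_decode(indices):
    return [codebook[i] for i in indices]

blocks = [(10, 5), (130, 120), (250, 251)]
idx = vq_encode(blocks)
print(idx)             # [0, 1, 2]
print(vq_decode(idx))  # [(0, 0), (128, 128), (255, 255)]
```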
Fractal Coding
What is a fractal?
A rough or fragmented geometric shape that can be
split into parts, each of which is (at least approximately) a reduced-size copy of the whole.
Idea: partition the image into segments (using segmentation techniques based on edges, color, texture, etc.) and look them up in a library of IFS (iterated function system) codes.