Image Compression
1 picture = 512 x 512 x 3 bytes ≈ 786 KB
1 second = 786 KB x 30 = 23.5 MB
1 minute = 23.5 MB x 60 = 1.4 GB
1 hour = 1.4 GB x 60 = 84 GB
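These figures follow directly from the raw frame size; a quick check in decimal units at 30 frames per second (the slide's 84 GB comes from rounding 1.4 GB x 60):

```python
frame_bytes = 512 * 512 * 3   # one 512x512 RGB picture, 1 byte per channel
second = frame_bytes * 30     # 30 frames per second
minute = second * 60
hour = minute * 60

print(frame_bytes)            # 786432 bytes, about 786 KB
print(round(hour / 1e9, 1))   # 84.9 GB of raw video per hour
```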
Lecture 13 - Image Compression
Image Compression
Fundamentals: the aim is to
- remove redundancy present in the data in a way which makes image reconstruction possible
- reduce the amount of data required to represent a digital image
- transform a 2-D pixel array into a statistically uncorrelated data set
Image Compression
Let n1 and n2 denote the number of information-carrying units in two data sets that represent the same information. The relative data redundancy RD of the first data set is

RD = 1 − 1/CR, where CR = n1/n2 is the compression ratio.

(i) n2 = n1: CR = 1, RD = 0 => the first data set contains no redundant data.
(ii) n2 << n1: CR → ∞, RD → 1 => significant compression, highly redundant data.
(iii) n2 >> n1: CR → 0, RD → −∞ => the second data set contains more data than the original representation.
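The two quantities above are straightforward to compute; a minimal sketch with hypothetical unit counts n1 (original) and n2 (compressed):

```python
def compression_ratio(n1, n2):
    """CR = n1 / n2: original information units over compressed units."""
    return n1 / n2

def relative_redundancy(n1, n2):
    """RD = 1 - 1/CR: fraction of the first data set that is redundant."""
    return 1 - 1 / compression_ratio(n1, n2)
```

For example, compressing 10 units down to 1 gives CR = 10 and RD = 0.9: nine-tenths of the original data was redundant.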
Interpixel redundancy
including spatial redundancy, geometric redundancy and interframe redundancy
Psycho-visual Redundancy
Certain information has less relative importance in normal visual processing and can be eliminated.
Coding Redundancy
Let r1, r2, ..., rL be the possible grey-level values, and pr(rk) the probability of occurrence of rk. Let l(rk) be the number of bits used to represent the value rk. Then the average number of bits required to represent each pixel value is

Lavg = Σk=1..L l(rk) pr(rk)
Coding Redundancy
A Variable Length Coding Example
Coding Redundancy
l(rk) increases as pr(rk) decreases: the shortest code words in Code 2 are assigned to the grey levels that occur most frequently in the image.
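The saving from variable-length coding can be checked numerically. The distribution and code-length profiles below are hypothetical (not the slide's table), with the variable-length profile chosen to satisfy the Kraft inequality:

```python
# Hypothetical grey-level probabilities (illustrative only)
probs = [0.4, 0.3, 0.1, 0.1, 0.06, 0.04]

fixed_lengths = [3] * 6            # natural 3-bit fixed-length code
var_lengths = [1, 2, 3, 4, 5, 5]   # a valid prefix-code length profile

def l_avg(lengths, probs):
    """Average code length: sum of l(r_k) * p_r(r_k)."""
    return sum(l * p for l, p in zip(lengths, probs))

fixed_avg = l_avg(fixed_lengths, probs)   # about 3.0 bits/symbol
var_avg = l_avg(var_lengths, probs)       # about 2.2 bits/symbol
```

Assigning short codes to frequent levels cuts the average from 3.0 to 2.2 bits per pixel, a compression ratio of about 1.36.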
Inter-pixel Redundancy
Also called spatial or geometric redundancy. There is often correlation between adjacent pixels, i.e. the values of the neighbours of an observed pixel can often be predicted from the value of the observed pixel. Run-length coding exploits this redundancy.
Psycho-visual Redundancy
The human visual system does not rely on quantitative analysis of individual pixel values when interpreting an image; an observer searches for distinct features and mentally combines them into recognizable groupings. In this process certain information is relatively less important than other information; this information is called psychovisually redundant.
Psycho-visual Redundancy
Psychovisually redundant image information can be identified and removed, a process referred to as quantization. Quantization eliminates data and therefore results in lossy data compression. Reconstruction errors introduced by quantization can be evaluated objectively or subjectively, depending on the application's needs and specifications.
Improved grey-scale (IGS) quantization: the four least significant bits of the previous sum are added to the current grey level; if the four most significant bits of the current grey level are 1111, however, nothing is added instead. The four most significant bits of the resulting sum are used as the coded pixel value.

If MSB(1~4) of GrayLevel_i = 1111 then Sum_i = GrayLevel_i
else Sum_i = GrayLevel_i + LSB(5~8) of Sum_(i-1)
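The IGS rule above can be sketched directly, assuming 8-bit grey levels coded down to their top four bits (the function name is mine):

```python
def igs_quantize(pixels):
    """Improved grey-scale (IGS) quantization: 8-bit grey levels -> 4-bit codes."""
    prev_sum = 0
    codes = []
    for p in pixels:
        if p & 0xF0 == 0xF0:           # four MSBs are 1111: add nothing (avoids overflow)
            s = p
        else:
            s = p + (prev_sum & 0x0F)  # add the four LSBs of the previous sum
        codes.append(s >> 4)           # four MSBs of the sum become the coded value
        prev_sum = s
    return codes
```

The carried-over low bits act as pseudo-random noise that breaks up the false contours plain truncation to 4 bits would create.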
Psycho-visual Redundancy
Quantization effects
Fidelity Criteria
f(x, y): the input image. f̂(x, y): the decompressed output.

error: e(x, y) = f̂(x, y) − f(x, y)

total error: Σx=0..M−1 Σy=0..N−1 [f̂(x, y) − f(x, y)]

root-mean-square error:

erms = { (1/MN) Σx=0..M−1 Σy=0..N−1 [f̂(x, y) − f(x, y)]² }^(1/2)

mean-square signal-to-noise ratio:

SNRms = Σx Σy f̂(x, y)² / Σx Σy [f̂(x, y) − f(x, y)]²

(often quoted in decibels as 10 log10 SNRms)
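A minimal sketch of these objective fidelity measures, taking images as flat pixel lists (the function names are mine):

```python
import math

def erms(f, fhat):
    """Root-mean-square error between original f and reconstruction fhat."""
    n = len(f)
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(f, fhat)) / n)

def snr_ms(f, fhat):
    """Mean-square signal-to-noise ratio of the reconstruction."""
    signal = sum(b * b for b in fhat)
    noise = sum((b - a) ** 2 for a, b in zip(f, fhat))
    return signal / noise
```

A perfect reconstruction gives erms = 0; larger SNRms means the error energy is small relative to the reconstructed signal energy.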
Source encoder: removes input redundancies. Channel encoder: increases the noise immunity of the source encoder's output.
Compression Model
The encoder is made up of a source encoder, which removes input redundancies, and a channel encoder, which increases the noise immunity of the source encoder's output. If the channel between the encoder and decoder is noise-free (not prone to error), the channel encoder and decoder are omitted.
Compression Model
The source encoder is responsible for reducing or eliminating any coding, interpixel, or psychovisual redundancies in the input image. As the output of the source encoder contains little redundancy, it would be highly sensitive to transmission noise without the controlled redundancy added by channel encoding.
[Figure: source encoder (mapper → quantizer → symbol encoder) feeding the channel; source decoder (symbol decoder → inverse mapper) producing f̂(x, y).]
Mapper: transforms the input data into a (usually non-visual) format designed to reduce interpixel redundancies in the input image, e.g. run-length coding. Quantizer: reduces the accuracy of the mapper's output in accordance with some pre-established fidelity criterion. Symbol coder: creates a fixed- or variable-length code to represent the quantizer output and maps the output in accordance with the code, e.g. variable-length encoding.
Classification
Lossless compression means that the data file is compacted without losing any information; the file re-created from the compressed form is identical to the original. Lossy compression means that some, hopefully small, amount of information is lost in the compression process; the re-created file is very similar to, but not identical to, the original image. Lossy methods exploit both data redundancy and human perception properties.
Run-length coding
Run-length coding replaces long sequences of the same value with a code indicating the value that is repeated and the number of times it occurs in the sequence. Here each pair is (number of preceding zeros, value):

Input sequence: 0,0,-3,5,1,0,-2,0,0,0,0,2,-4,3,-2,0,0,0,1,0,0,-2
Run-length sequence: (2,-3)(0,5)(0,1)(1,-2)(4,2)(0,-4)(0,3)(0,-2)(3,1)(2,-2)
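A sketch of this zero-run scheme, emitting one (zero count, value) pair per non-zero value (the trailing-zero convention is an assumption of mine; the slide's example ends on a non-zero value):

```python
def zero_run_encode(seq):
    """Encode a sequence as (number of preceding zeros, value) pairs."""
    pairs, zeros = [], 0
    for v in seq:
        if v == 0:
            zeros += 1
        else:
            pairs.append((zeros, v))
            zeros = 0
    if zeros:                     # trailing zeros: flush with a 0-valued sentinel pair
        pairs.append((zeros, 0))
    return pairs
```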
Run-length coding
RLE compression is only efficient with files that contain lots of repetitive data. An image with many colors that is very busy in appearance, such as a photograph, will not encode very well: the complexity of the image is expressed as a large number of different colors, and because of this complexity there will be relatively few runs of the same color.
Huffman Coding
When coding the symbols of an information source individually, Huffman coding yields the smallest possible number of code symbols per source symbol. The resulting code is optimal for a fixed value of n, subject to the constraint that the source symbols be coded one at a time.
Example
A = {a1, a2, a3, a4, a5}
P(a2) = 0.4, P(a1) = P(a3) = 0.2, P(a4) = P(a5) = 0.1
Example
[Figure: Huffman source reductions with 0/1 branch labels: a4 (0.1) and a5 (0.1) combine into 0.2; that node combines with a3 (0.2) into 0.4; the 0.4 node combines with a1 (0.2) into 0.6; finally 0.6 combines with a2 (0.4) into 1.0.]
Example
Huffman code lengths:

Symbol   Probability   Code length (bits)
a2       0.4           1
a1       0.2           2
a3       0.2           3
a4       0.1           4
a5       0.1           4

Lavg = 0.4 x 1 + 0.2 x 2 + 0.2 x 3 + 0.1 x 4 + 0.1 x 4 = 2.2 bits/symbol
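A heap-based sketch that derives Huffman code lengths for this source. Tie-breaking among equal probabilities may give a length profile different from the slide's (1, 2, 3, 4, 4), but any Huffman code for this source achieves the same average of 2.2 bits/symbol:

```python
import heapq

def huffman_lengths(probs):
    """Return the code length per symbol for a Huffman code (heap-based sketch)."""
    heap = [(p, i, (sym,)) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    lengths = {sym: 0 for sym in probs}
    counter = len(heap)                     # unique tie-breaker for equal probabilities
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)  # merge the two least probable nodes...
        p2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:
            lengths[s] += 1                 # ...pushing their symbols one level deeper
        heapq.heappush(heap, (p1 + p2, counter, syms1 + syms2))
        counter += 1
    return lengths

probs = {'a1': 0.2, 'a2': 0.4, 'a3': 0.2, 'a4': 0.1, 'a5': 0.1}
lengths = huffman_lengths(probs)
l_avg = sum(probs[s] * lengths[s] for s in probs)   # 2.2 bits/symbol
```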
Arithmetic Coding
Arithmetic coding bypasses the idea of replacing an input symbol with a specific code. It replaces a stream of input symbols with a single floating-point output number. Arithmetic coding is especially useful when dealing with sources with small alphabets, such as binary sources, and alphabets with highly skewed probabilities.
Arithmetic Coding
(1) Assign each source symbol its probability and initial subinterval:

Source symbol   Probability   Initial subinterval
a1              0.2           [0.0, 0.2)
a2              0.2           [0.2, 0.4)
a3              0.4           [0.4, 0.8)
a4              0.2           [0.8, 1.0)
Arithmetic Coding

(2) Encode the message a1 a2 a3 a3 a4 by repeatedly narrowing the interval: [0, 1) → [0, 0.2) → [0.04, 0.08) → [0.056, 0.072) → [0.0624, 0.0688) → [0.06752, 0.0688).
Arithmetic Coding
(3) Any number within the range [0.06752, 0.0688), for example 0.068, can be used to represent the message.
Arithmetic Coding
Encoding algorithm for arithmetic coding:

low = 0.0; high = 1.0;
while not EOF do
    range = high - low;
    read(c);
    high = low + range * high_range(c);
    low = low + range * low_range(c);
end do
output(low);
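The pseudocode translates almost line for line into Python; the symbol subintervals are passed as a dict (here the earlier example table):

```python
def arith_encode(message, ranges):
    """Arithmetic encoder: 'ranges' maps symbol -> (low_range, high_range)."""
    low, high = 0.0, 1.0
    for c in message:
        rng = high - low
        lo_c, hi_c = ranges[c]
        high = low + rng * hi_c   # update high before low, as in the pseudocode
        low = low + rng * lo_c
    return low, high              # any number in [low, high) encodes the message

ranges = {'a1': (0.0, 0.2), 'a2': (0.2, 0.4), 'a3': (0.4, 0.8), 'a4': (0.8, 1.0)}
low, high = arith_encode(['a1', 'a2', 'a3', 'a3', 'a4'], ranges)
```

For the message a1 a2 a3 a3 a4 this reproduces the final interval [0.06752, 0.0688) to within floating-point precision.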
Arithmetic Decoding
Decoding is the inverse process. Since 0.06752 falls between 0.0 and 0.2, the first character must be a1. To remove the effect of a1 from 0.06752, first subtract the low value of a1's range, 0.0, giving 0.06752; then divide by the width of a1's range, 0.2. This gives a value of 0.3376.
Arithmetic Decoding
Then determine where 0.3376 lands: it is in the range of the next letter, a2. The process repeats until the remaining value is 0 or the known length of the message is reached.
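The decoding steps can be sketched with exact rational arithmetic (fractions built from decimal strings), so interval-boundary comparisons are not disturbed by floating-point rounding:

```python
from fractions import Fraction as F

def arith_decode(value, ranges, length):
    """Decode 'length' symbols from the encoded value (all bounds as strings)."""
    value = F(value)
    out = []
    for _ in range(length):
        for sym, (lo, hi) in ranges.items():
            if F(lo) <= value < F(hi):
                out.append(sym)                            # value falls in sym's range
                value = (value - F(lo)) / (F(hi) - F(lo))  # remove sym's effect
                break
    return out

ranges = {'a1': ('0.0', '0.2'), 'a2': ('0.2', '0.4'),
          'a3': ('0.4', '0.8'), 'a4': ('0.8', '1.0')}
```

Decoding 0.06752 with a known message length of 5 recovers a1 a2 a3 a3 a4.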
Example

Symbol   Probability   Initial subinterval
1        0.80          [0.00, 0.80)
2        0.02          [0.80, 0.82)
3        0.18          [0.82, 1.00)

Message: 1 3 2 1

Encoding:

New character   Low value   High value
(start)         0.0         1.0
1               0.0         0.8
3               0.656       0.800
2               0.7712      0.77408
1               0.7712      0.773504

The encoder outputs low = 0.7712.
Example

Decoding r = 0.7712:

r        c (range)      low    range   r_next = (r − low) / range
0.7712   1 [0.0, 0.8)   0.0    0.8     (0.7712 − 0) / 0.8 = 0.964
0.964    3 [0.82, 1.0)  0.82   0.18    (0.964 − 0.82) / 0.18 = 0.8
0.8      2 [0.8, 0.82)  0.8    0.02    (0.8 − 0.8) / 0.02 = 0
0        1 [0.0, 0.8)   0.0

Decoded message: 1 3 2 1
Arithmetic Coding
In summary, the encoding process is simply one of narrowing the range of possible numbers with every new symbol. The new range is proportional to the predefined probability attached to that symbol. Decoding is the inverse procedure, in which the range is expanded in proportion to the probability of each symbol as it is extracted.
[Figure: lossless predictive encoder: each input pixel and the predictor output feed a subtractor whose prediction error goes to the symbol encoder, producing the compressed image.]

Each successive pixel of the input image, denoted f_n, is fed to the encoder. The output of the predictor is rounded to the nearest integer, denoted f̂_n, and the prediction error e_n = f_n − f̂_n is coded by the symbol encoder.
[Figure: lossless predictive decoder: the decoded prediction error is added (+) to the predictor output to reconstruct the decompressed image.]
The predictor forms f̂_n = round[ Σi=1..m a_i f_(n−i) ], where m is the order of the linear predictor, round is a function used to denote the rounding or nearest-integer operation, and the a_i for i = 1, 2, ..., m are prediction coefficients.
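A sketch of the simplest case: a first-order predictor with m = 1 and a1 = 1, i.e. predict each pixel from its predecessor (the zero initial prediction is an assumption of mine):

```python
def predictive_encode(pixels):
    """Lossless predictive coding with fhat_n = previous pixel (m = 1, a1 = 1)."""
    errors, prev = [], 0
    for f in pixels:
        errors.append(f - prev)   # prediction error e_n = f_n - fhat_n
        prev = f
    return errors

def predictive_decode(errors):
    pixels, prev = [], 0
    for e in errors:
        prev = prev + e           # f_n = e_n + fhat_n
        pixels.append(prev)
    return pixels
```

The error sequence is small for smooth image rows, so it compresses much better than the raw pixels, and decoding reverses it exactly.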
Lossy Coding
Scalar Quantization Transform Coding Discrete Cosine Transform Wavelet Transform
Quantization
[Figure: a quantizer maps each input value to one of a finite set of output levels.]
Quantization
[Figure: input-output staircase characteristics of a linear (uniform) quantizer and a non-linear quantizer: input level on the horizontal axis, output codewords on the vertical axis.]
Encoder: converts transform coefficients into levels (quantization). Decoder: converts levels into reconstructed transform coefficients (inverse quantization).
Quantization
The amount of compression and the loss of image quality depend on the number of levels produced by the quantizer.
- A large number of levels: precision is only slightly reduced, low compression.
- A small number of levels: significant reduction in precision, high compression.
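A uniform (linear) scalar quantizer sketch; the step size controls the number of levels, and hence the compression/quality trade-off:

```python
def quantize(x, step):
    """Map an input value to a level index; larger steps mean fewer levels."""
    return round(x / step)

def dequantize(level, step):
    """Reconstruct the representative value for a level."""
    return level * step
```

With step 2, the input 7.3 maps to level 4 and is reconstructed as 8; the reconstruction error is at most half the step size.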
Transform Coding
At the heart of the majority of video coding systems and standards. Spatial image data is inherently difficult to compress: neighboring samples are highly correlated (interrelated) and the energy tends to be evenly distributed across an image, making it difficult to discard data or reduce the precision of data without adversely affecting image quality.
Transform Coding
- Compact the energy in the image: concentrate the energy into a small number of significant coefficients.
- Decorrelate the data: minimize the interdependencies between coefficients.
- Discarding insignificant data then has a minimal effect on image quality.
Transform Coding
Discrete Fourier Transform (DFT)
- uses e^(jθ) = cos θ + j sin θ as its basis functions
- Fast Fourier Transform (FFT): O(n log n)
- not so popular in image compression, because its performance is not good enough and the computational load of complex-number arithmetic is heavy
Transform Coding
Discrete Cosine Transform (DCT)
- uses cosine functions as its basis functions
- performance approaches the KLT
- fast algorithms exist
- most popular in image compression applications
- the periodicity implied by the DCT causes less blocking effect than the DFT
- can be implemented with a 2n-point FFT
- used in JPEG, H.26x, MPEG
Transform Coding
The transform efficiently puts most of the image information into relatively few coefficients. Quantizer: bit allocation determines the number of bits for coding each coefficient; more bits are used for the lower-frequency components.
Bit allocation
Two ways of truncating coefficients:
1. zonal coding: the retained coefficients are selected on the basis of maximum variance
2. threshold coding: the retained coefficients are selected according to maximum magnitude
Zonal coding. Principle: according to information theory, the transform coefficients with maximum variance carry the most picture information and should be retained in the coding process. The process can be viewed as multiplying each transformed coefficient by the corresponding element in a zonal mask.
Zonal coding
1: retain 0: eliminate
Threshold coding. Principle: the transform coefficients with the largest magnitudes contribute most to reconstructed subimage quality. Threshold coding is the most often used adaptive transform coding approach due to its simplicity.
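Threshold coding reduces to a per-coefficient magnitude test; a minimal sketch (the function name is mine):

```python
def threshold_code(coeffs, thresh):
    """Keep transform coefficients whose magnitude reaches the threshold; zero the rest."""
    return [[c if abs(c) >= thresh else 0 for c in row] for row in coeffs]
```

Applied to a DCT block, this discards the many small high-frequency coefficients while preserving the few large ones that dominate reconstruction quality.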
The 2-D DCT of an N x N block f(x, y) is

F(u, v) = C(u) C(v) Σx=0..N−1 Σy=0..N−1 f(x, y) cos[(2x+1)uπ / 2N] cos[(2y+1)vπ / 2N]

where C(u), C(v) = (1/N)^(1/2) for u, v = 0;
C(u), C(v) = (2/N)^(1/2) otherwise.
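A direct, unoptimized (O(N^4)) implementation of that definition:

```python
import math

def dct2(block):
    """2-D DCT-II of an N x N block, computed straight from the formula."""
    N = len(block)

    def C(u):
        return math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = C(u) * C(v) * s
    return out
```

For a constant 8x8 block all of the energy lands in the DC coefficient F(0, 0); every AC coefficient is zero, which is the energy-compaction property exploited by quantization.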
8x8 DCT
[Figure: layout of the 8×8 block of DCT coefficients: the DC coefficient sits at the top left; the remaining AC coefficients run from low frequencies near the DC term, through medium frequencies, to high frequencies at the bottom right.]
8x8 DCT
Pixel domain (8×8 block):

 52  63  62  63  67  79  85  87
 55  59  59  58  61  65  71  79
 61  66  68  71  68  60  64  69
 66  90 113 122 104  70  59  68
 70  61 109  85 144 104 154 106
126  88  77  68  55  61  65  76
 64  69  66  70  68  58  65  78
 73  72  73  69  70  75  83  94

DCT → frequency domain; IDCT → back to the pixel domain.

Frequency domain (the top-left 609 is the DC coefficient; the rest are AC coefficients):

609 -29 -62  25  55 -20  -1   3
  7 -21 -62   9  11  -7  -6   6
-46   8  77 -25 -30  10   7  -5
-50  13  35 -15  -9   6   0   3
 11  -8 -13  -2  -1   1  -4   1
-10   1   3  -3  -1   0   2  -1
 -4  -1   2  -1   2  -3   1  -2
 -1  -1  -1  -2  -1  -1   0  -1