
IS502: MULTIMEDIA DESIGN FOR INFORMATION SYSTEM

MULTIMEDIA DATA COMPRESSION
Presenter Name: Mahmood A.Moneim
Supervised By: Prof. Hesham A.Hefny
Winter 2014

Multimedia Data Compression

Compression reduces the size of data:
- It reduces storage space and hence storage cost.
- It reduces the time to retrieve and transmit data.

compression ratio = original data size / compressed data size
[Diagram: original data --compress--> compressed data; compressed data --decompress--> decompressed data]

Lossless and Lossy Compression

- Compression ratios of lossy compressors are generally much higher than those of lossless compressors, e.g. 100:1 (lossy) vs. 2:1 (lossless).
- Lossless compression is essential in applications such as text file compression.
- Lossy compression is acceptable in many imaging and voice applications, e.g. JPEG, MP3, etc.


Kinds of Lossless Compressors

[1] Model and code
- The source is modeled as a stochastic process.
- The probabilities (or statistics) are given or acquired.

[2] Dictionary-based
- There is no explicit model and no explicit statistics gathering. Instead, a codebook (or dictionary) is used to map source words into codewords.


Model and Code

Examples:
- Shannon code
- Huffman code
- Arithmetic code


Dictionary-Based

Examples:
- LZ family
- Run-length code


Basics of Information Theory

Entropy is a measure of the disorder (uncertainty) of a system; for a data source, it gives the lower bound, in bits per symbol, on any lossless code.
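
For a source whose symbols occur with probabilities p_i, the entropy is H = -sum_i p_i log2(p_i) bits per symbol. A minimal Python sketch (the function name is an illustrative choice) that computes this from symbol counts:

from collections import Counter
from math import log2

def entropy(message):
    # H = -sum(p_i * log2(p_i)) over the symbol probabilities p_i
    counts = Counter(message)
    total = len(message)
    return -sum((n / total) * log2(n / total) for n in counts.values())

print(entropy("HELLO"))  # about 1.92 bits/symbol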


Shannon-Fano Algorithm

To illustrate the algorithm, suppose the symbols to be coded are the characters of the word HELLO. The frequency counts of the symbols are H:1, E:1, L:2, O:1.

The top-down algorithm proceeds as follows:
- Sort the symbols according to their frequency counts, in descending order.
- Recursively divide the symbols into two parts, each with approximately the same total count, until every part contains only one symbol (a Python sketch follows).
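
A minimal Python sketch of this top-down procedure (the function name and the exact tie-breaking of the split point are illustrative choices, not from the slides):

def shannon_fano(symbols):
    # symbols: list of (symbol, count) pairs, sorted by count, descending.
    if len(symbols) == 1:
        return {symbols[0][0]: ""}
    # Find the split that keeps the two halves' total counts roughly equal.
    total, running, split = sum(c for _, c in symbols), 0, 0
    for i, (_, c) in enumerate(symbols):
        if i > 0 and running + c > total / 2:
            break
        running += c
        split = i + 1
    codes = {}
    for s, code in shannon_fano(symbols[:split]).items():
        codes[s] = "0" + code      # first part gets prefix 0
    for s, code in shannon_fano(symbols[split:]).items():
        codes[s] = "1" + code      # second part gets prefix 1
    return codes

print(shannon_fano([("L", 2), ("H", 1), ("E", 1), ("O", 1)]))
# {'L': '0', 'H': '10', 'E': '110', 'O': '111'} -- 10 bits for HELLO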


Coding Tree for HELLO by the Shannon-Fano Algorithm

[Figure: the Shannon-Fano coding tree for HELLO]

Cont.

Entropy of HELLO: H = (2/5) log2(5/2) + 3 × (1/5) log2(5) ≈ 1.92 bits/symbol. The Shannon-Fano code uses 10 bits for the 5 symbols (2 bits/symbol), close to this bound.

Huffman Code

Huffman coding, illustrated with a manageable example:

Letter    Frequency (%)
A         25
B         15
C         10
D         20
E         30

Huffman Code: Code Formation

- Assign a weight (its frequency) to each character.
- Merge the two lightest weights into one root node whose weight is the sum of the two.
- Repeat until only one tree is left.
- Traverse the tree from the root to each leaf; at each node, assign 0 to the left branch and 1 to the right (a Python sketch follows).
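
A minimal Python sketch of this merge loop using a heap (representing each partial tree as a dict of partial codes is an illustrative shortcut, not the slides' method):

import heapq
from itertools import count

def huffman_codes(freqs):
    # Each heap item is (weight, tiebreak, {symbol: partial code}).
    tiebreak = count()
    heap = [(w, next(tiebreak), {s: ""}) for s, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # lightest tree
        w2, _, right = heapq.heappop(heap)   # second lightest tree
        merged = {s: "0" + c for s, c in left.items()}         # 0 = left
        merged.update({s: "1" + c for s, c in right.items()})  # 1 = right
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]

print(huffman_codes({"A": 25, "B": 15, "C": 10, "D": 20, "E": 30}))

Ties between equal weights can be merged in any order, so several equally optimal code tables exist for the same frequencies.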

Huffman Code: Code Interpretation

- Prefix property: the code for a character never appears as the prefix of another character's code (verify this on the codes above).
- The receiver keeps reading bits until the accumulated bits match a code, then emits that character and starts over.
- Exercise: extract the string from 01110001110110110111.

Example. Find the Huffman codes and the compression ratio (C.R.) for Table 1, assuming that the uncompressed representation takes 8 bits per character and that the size of the Huffman table is not part of the compressed size.

Table 1:

Char    Freq    Huffman Code    (equivalent code with 0/1 swapped)
A       90      00              11
B       60      01              10
C       50      10              01
D       20      111             000
E       12      1101            0010
F       8       11001           00110
G       7       110000          001111
H       3       110001          001110

Huffman Tree

                250
              /     \
           150       100
          /   \     /   \
         A     B   C     50
                        /  \
                      30    D
                     /  \
                   18    E
                  /  \
                10    F
               /  \
              G    H

Char    Freq    Huffman Code
A       90      00
B       60      01
C       50      10
D       20      111
E       12      1101
F       8       11001
G       7       110000
H       3       110001

C.R. = (250 × 8) / (2×90 + 2×60 + 2×50 + 3×20 + 4×12 + 5×8 + 6×7 + 6×3)
     = 2000 / 608 ≈ 3.29

Decompression - Huffman Codes

Char    Huffman Code
A       00
B       01
C       10
D       111
E       1101
F       11001
G       110000
H       110001

Compress DEAF using the above Huffman codes.
Ans.: 111 1101 00 11001

Decompress 110001 1101 00 111.
Ans.: HEAD
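
A minimal Python sketch of the receiver loop described earlier, checked against the two exercises above (the dict layout is an illustrative choice):

def huffman_decode(bits, codes):
    inverse = {code: sym for sym, code in codes.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:   # prefix property: the first match is the symbol
            out.append(inverse[buf])
            buf = ""
    return "".join(out)

codes = {"A": "00", "B": "01", "C": "10", "D": "111",
         "E": "1101", "F": "11001", "G": "110000", "H": "110001"}
print(huffman_decode("11111010011001", codes))   # -> DEAF
print(huffman_decode("110001110100111", codes))  # -> HEAD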

Arithmetic Compression

Arithmetic compression is based on interpreting a character string as a single real number.

Letter    Frequency (%)    Subinterval [p, q)
A         25               [0.00, 0.25)
B         15               [0.25, 0.40)
C         10               [0.40, 0.50)
D         20               [0.50, 0.70)
E         30               [0.70, 1.00)

Arithmetic Compression: Coding CABAC

Generate subintervals of decreasing length; the subintervals depend uniquely on the string's characters and their frequencies. If the current interval [x, y) has width w = y - x, a character with subinterval [p, q) narrows it to

    x' = x + w·p,    y' = x + w·q

Step 1: C  [0, 1)               -> [0.4, 0.5)               (p = 0.40, q = 0.50)
Step 2: A  [0.4, 0.5)           -> [0.4, 0.425)             (p = 0.00, q = 0.25)
Step 3: B  [0.4, 0.425)         -> [0.40625, 0.41)          (p = 0.25, q = 0.40)
Step 4: A  [0.40625, 0.41)      -> [0.40625, 0.4071875)     (p = 0.00, q = 0.25)
Step 5: C  [0.40625, 0.4071875) -> [0.406625, 0.40671875)   (p = 0.40, q = 0.50)

Final representation: the midpoint of the last interval, 0.406671875.
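
A minimal Python sketch of this interval-narrowing loop (plain floats are fine at this toy scale; real coders use scaled integer arithmetic to avoid precision loss):

def arith_encode(message, intervals):
    x, y = 0.0, 1.0
    for ch in message:
        p, q = intervals[ch]
        w = y - x                      # current interval width
        x, y = x + w * p, x + w * q    # zoom into the character's subinterval
    return (x + y) / 2                 # any number in [x, y) identifies the string

intervals = {"A": (0.00, 0.25), "B": (0.25, 0.40),
             "C": (0.40, 0.50), "D": (0.50, 0.70), "E": (0.70, 1.00)}
print(arith_encode("CABAC", intervals))  # ~0.406671875, the midpoint above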

Arithmetic Compression: Extracting CABAC

N         Interval [p, q)    Width    Character    N - p     (N - p) / width
0.4067    [0.40, 0.50)       0.10     C            0.0067    0.067
0.067     [0.00, 0.25)       0.25     A            0.067     0.268
0.268     [0.25, 0.40)       0.15     B            0.018     0.12
0.12      [0.00, 0.25)       0.25     A            0.12      0.48
0.48      [0.40, 0.50)       0.10     C            0.08      0.8

When to stop? A terminal character is added to the original character set and encoded; during decompression, the process stops once it is encountered.
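
The extraction table above is exactly this loop; a minimal Python sketch (the message length is passed explicitly here as an illustrative stand-in for the terminal character the slide describes):

def arith_decode(number, intervals, length):
    out, n = [], number
    for _ in range(length):
        for ch, (p, q) in intervals.items():
            if p <= n < q:              # find the subinterval containing N
                out.append(ch)
                n = (n - p) / (q - p)   # rescale: (N - p) / width
                break
    return "".join(out)

intervals = {"A": (0.00, 0.25), "B": (0.25, 0.40),
             "C": (0.40, 0.50), "D": (0.50, 0.70), "E": (0.70, 1.00)}
print(arith_decode(0.4067, intervals, 5))  # -> CABAC, matching the table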

LZW Algorithm

LZW Compression:

Begin
    s = next input character;
    while not EOF
    {
        c = next input character;
        if s + c exists in the dictionary
            s = s + c;
        else
        {
            output the code for s;
            add the string s + c to the dictionary with a new code;
            s = c;
        }
    }
    output the code for s;
End
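
A direct Python transcription of the pseudocode above (the starting dictionary and code numbering follow the ABABBABCABABBA example on the next slide):

def lzw_compress(text, alphabet):
    dictionary = {ch: i + 1 for i, ch in enumerate(alphabet)}
    next_code = len(alphabet) + 1
    s, output = text[0], []
    for c in text[1:]:
        if s + c in dictionary:
            s = s + c                       # grow the current match
        else:
            output.append(dictionary[s])    # emit the code for s
            dictionary[s + c] = next_code   # register the new string
            next_code += 1
            s = c
    output.append(dictionary[s])            # flush the final match
    return output

print(lzw_compress("ABABBABCABABBA", "ABC"))  # [1, 2, 4, 5, 2, 3, 4, 6, 1]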


LZW for the String ABABBABCABABBA

The dictionary initially contains only three characters: A = 1, B = 2, C = 3.

s      c     Output    New dictionary entry
A      B     1         AB = 4
B      A     2         BA = 5
A      B     -         (AB is in the dictionary; s = AB)
AB     B     4         ABB = 6
B      A     -         (BA is in the dictionary; s = BA)
BA     B     5         BAB = 7
B      C     2         BC = 8
C      A     3         CA = 9
A      B     -         (AB is in the dictionary; s = AB)
AB     A     4         ABA = 10
A      B     -         (AB is in the dictionary; s = AB)
AB     B     -         (ABB is in the dictionary; s = ABB)
ABB    A     6         ABBA = 11
A      EOF   1         (output the code for the final s)

Output code sequence: 1 2 4 5 2 3 4 6 1.

LZW Decompression

Begin
    s = NIL;
    while not EOF
    {
        k = next input code;
        entry = dictionary entry for k;
        output entry;
        if s != NIL
            add the string s + first character of entry to the dictionary
            with a new code;
        s = entry;
    }
End

Cont.

The input code for the decoder is 1 2 4 5 2 3 4 6 1. Starting from the same initial dictionary (A = 1, B = 2, C = 3), the decoder rebuilds entries 4-11 exactly as the encoder did and recovers ABABBABCABABBA.
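
A minimal Python sketch of the decoder; the else branch handles the one case the simplified pseudocode above skips, a code that is not yet in the dictionary (it does not occur in this example):

def lzw_decompress(codes, alphabet):
    dictionary = {i + 1: ch for i, ch in enumerate(alphabet)}
    next_code = len(alphabet) + 1
    s = dictionary[codes[0]]
    output = [s]
    for k in codes[1:]:
        if k in dictionary:
            entry = dictionary[k]
        else:
            entry = s + s[0]   # code the encoder issued on this very step
        output.append(entry)
        dictionary[next_code] = s + entry[0]   # rebuild the encoder's entry
        next_code += 1
        s = entry
    return "".join(output)

print(lzw_decompress([1, 2, 4, 5, 2, 3, 4, 6, 1], "ABC"))  # ABABBABCABABBA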

Run Length Encoding

Huffman coding requires:
- frequency values;
- bits grouped into characters or units.

Many kinds of data do not fall into this category:
- machine code files
- facsimile data (bits corresponding to the light and dark areas of a page)
- video signals

Run Length Encoding

For such files, RLE is used. Instead of sending a long run of 0s or 1s, it sends only how many bits are in the run. Since roughly 70%-80% of a typed page is white space, RLE is very effective there.

Run Length Encoding

Runs with different characters: send the actual character together with the run length.

HHHHHHHUFFFFFFFFFYYYYYYYYYYYDGGGGG
code = 7, H, 1, U, 9, F, 11, Y, 1, D, 5, G

Savings in bits (considering ASCII): the original 34 characters take 34 × 8 = 272 bits; the 6 (count, character) pairs take 12 bytes = 96 bits if each count fits in one byte, saving 176 bits (a Python sketch follows).
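
A minimal Python sketch of character run-length encoding, reproducing the pairs above (itertools.groupby already yields maximal runs):

from itertools import groupby

def rle_encode(text):
    # One (run length, character) pair per maximal run.
    return [(len(list(group)), ch) for ch, group in groupby(text)]

print(rle_encode("HHHHHHHUFFFFFFFFFYYYYYYYYYYYDGGGGG"))
# [(7, 'H'), (1, 'U'), (9, 'F'), (11, 'Y'), (1, 'D'), (5, 'G')]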

QUESTIONS?
