Video Communication 2009

Video communication
Enrico Magli
Dept. of Electronics - Politecnico di Torino (Italy)
enrico.magli@polito.it http://www1.tlc.polito.it/sas-ipl
Credits
Parts of the material used in this course have been inspired by or taken from the work of the following people:
Yao Wang (Polytechnic University, New York, USA)
Bernd Girod (Stanford University, USA)
Dapeng Wu (Univ. of Florida, USA)
Y. Guo, C. Neumann (Thomson Inc.)
D. Purandare (Univ. of Central Florida, USA)
Information theory and compression
General transmission scheme
Encoder/Modulator (source coding + channel coding) → Channel → Decoder/Demodulator (channel decoding + source decoding)
Source coding
Typical signals contain redundancy:
Coding (representation) redundancy
Correlation between adjacent samples
Psychovisual redundancy

Goal: to reduce the intrinsic redundancy of the source signal, to find a more compact signal representation
Notation
Let n1 and n2 be the number of information-carrying units in two data sets that represent the same information
Compression ratio: $CR = n_1 / n_2$
A compression algorithm searches for a representation with CR > 1
Coding redundancy
Data is not equal to information
Data is the means by which information is conveyed
The same story can be told with a different number of words, depending on whether the teller is long-winded or short and to the point
This is the coding redundancy
Coding redundancy

White is the most likely value in this picture
Encoding each pixel with the same no. of bits leads to coding redundancy
Variable length coding is the solution to coding redundancy
Coding redundancy
Let $p(r_k) = n_k / n$ be the probability of occurrence of gray level $r_k$, $k = 0, 1, 2, \ldots, L-1$
Let $r_k$ be represented by $l(r_k)$ bits; the average number of bits to represent each pixel is
$L_{ave} = \sum_{k=0}^{L-1} l(r_k)\, p(r_k)$
If $l(r_k) = m$ for every $k$, then $L_{ave} = m$
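The average length formula can be checked numerically. The sketch below is only an illustration: the 4-level histogram and the two sets of codeword lengths are hypothetical values chosen for the example, not data from the slides.

```python
def average_length(probs, lengths):
    """Average bits per pixel: L_ave = sum over k of l(r_k) * p(r_k)."""
    return sum(p * l for p, l in zip(probs, lengths))

# Hypothetical 4-level image in which the brightest level dominates.
probs = [0.05, 0.10, 0.15, 0.70]
fixed = [2, 2, 2, 2]      # fixed-length code: m = 2 bits for every level
variable = [3, 3, 2, 1]   # shorter codewords for the more probable levels

l_fixed = average_length(probs, fixed)
l_var = average_length(probs, variable)
print(l_fixed, l_var)
```

With these assumed numbers the fixed-length code costs 2 bits per pixel, while the variable-length assignment costs about 1.45 bits per pixel.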
Coding redundancy
It makes sense to assign fewer bits to those $r_k$ for which $p(r_k)$ is larger
This achieves data compression, as $L_{ave}$ is lower
Variable length codes are used
Inter-pixel redundancy

Large areas of the image are uniform
This means correlation among pixels (adjacent pixels are almost the same)
This is not solved using variable length coding (which works on each single pixel)
Psychovisual redundancy
Human perception of the information of an image does not involve quantitative analysis of every pixel
Pixel values can be modified up to a given extent without significant subjective degradation
Modifications must involve psychovisually redundant information (not easy to define)
The image is irreversibly altered
Psychovisual redundancy
The brain searches for distinguishing features and mentally combines them into objects (recognizable groups of pixels)
Use of prior knowledge for interpretation (face, wall, poster)
If the wall were slightly different this could not be perceived
Data compression
Data compression algorithms can be divided into two categories:
Lossless coding: the compressed signal is equal to the original (no coding errors)
Lossy coding: a controlled amount of error is tolerated, according to human subjective sensing capability
Lossless source coding

Mainly for data (e.g. PC files)
Sensitive applications (biomedical images, remote sensing data)
Output = input (no losses)
Represents the signal samples (or data codewords) in a more efficient way
Achievable compression ratios in the case of natural images: 2.5-3
Lossy source coding

A certain degree of distortion is accepted between output and input
Distortion should not be apparent
Lost data cannot be recovered
Much larger achievable compression ratios: 50 and more
Introduction
Information theory was invented by Claude Shannon in the 1940s
It has set the mathematical framework for digital communications
Information theory teaches us
how to measure information
how to represent information efficiently
how to reliably transmit information across communication channels
Measuring information
A discrete memoryless source generates symbols from a set X of M elements (alphabet); each symbol is characterized by its probability of occurrence $p_i$
$X = \{x_i\}_{i=1}^{M}$, with probabilities $\{p_i\}_{i=1}^{M}$
How do we measure the amount of information carried by message $x_i$?
Properties of information
The amount of information carried by a message decreases as its probability increases:
$I(x_j) > I(x_i)$ if $p_j < p_i$
Statistically independent messages:
$P(x_i, x_j) = P(x_i) P(x_j) \Rightarrow I(x_i, x_j) = I(x_i) + I(x_j)$
Definition of information
$I(x_i) = \log_2 \frac{1}{p_i}$
and is measured in bits
Average amount of information carried by the memoryless source: entropy (bits/symbol)
$H(X) = \sum_{i=1}^{M} p_i I(x_i) = \sum_{i=1}^{M} p_i \log_2 \frac{1}{p_i}$
Examples
Source 1: X = {x1, x2, x3, x4}, with p1 = 1/2, p2 = 1/4, p3 = 1/8, p4 = 1/8: H(X) = 1.75 bits/symbol
Source 2: X = {x1, x2, x3, x4}, with p1 = p2 = p3 = p4 = 1/4: H(X) = 2 bits/symbol
Equiprobable sources carry more information, and are more difficult to compress
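The two entropy values quoted in the examples can be reproduced with a few lines of code. This is a minimal sketch; the function name `entropy` is just an illustrative choice.

```python
import math

def entropy(probs):
    """First order entropy H(X) = sum_i p_i * log2(1 / p_i), in bits/symbol."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

h_skewed = entropy([1 / 2, 1 / 4, 1 / 8, 1 / 8])   # non-uniform source
h_uniform = entropy([1 / 4] * 4)                   # equiprobable source
print(h_skewed, h_uniform)
```

The non-uniform source gives 1.75 bits/symbol and the equiprobable one gives 2 bits/symbol, which is also the upper limit log2(4) for a 4-symbol alphabet.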
Bounds on Entropy
Theorem: the first order entropy of a memoryless M-symbol alphabet is limited by
$H(X) \le \log_2 M$
Example: 8 bit quantizer ($M = 2^8$)
$H(X) \le 8$ bit/symbol
with equality if the symbols are equiprobable
Noiseless coding theorem

(Shannon) It is possible to code without any loss of information a memoryless source alphabet with entropy H(X) bits/symbol, using $H(X) + \varepsilon$ bits/symbol
$\varepsilon$ is a quantity that can be made arbitrarily small by considering increasingly larger blocks of symbols to be coded
Bounds on lossless coding

The average codeword length of a lossless coder cannot be less than the entropy
Entropy represents the target average number of bits/symbol of a lossless encoder
Coding efficiency: $\eta = H(X) / \bar{n}$
where $\bar{n}$ is the average codeword length
if not memoryless
If the source is correlated, the first order entropy does not represent a bound to the average codeword length
All the previous results still hold, replacing first order entropy with the entropy rate $H_\infty(X)$, which takes into account correlation amongst symbols
$H_\infty(X) = \lim_{N \to \infty} \frac{1}{N} H_N(X)$
$H_N(X) = -\sum_{i_1=1}^{M} \sum_{i_2=1}^{M} \cdots \sum_{i_N=1}^{M} P(x_1 = i_1, x_2 = i_2, \ldots, x_N = i_N) \log_2 P(x_1 = i_1, x_2 = i_2, \ldots, x_N = i_N)$
Lossless - Introduction
The goal of lossless compression is to minimize the average length of the compressed symbols, exploiting statistical properties of the data:
probability distribution
correlation (redundancy) of the data
No distortion is accepted: minimize the rate for zero distortion
Introduction to lossy compression

Advantages of lossy compression:
Higher compression ratios
Low distortion of the decoded data
Possibility to shape the error between original and decoded data
Disadvantages:
The decoded signal does not exactly match the original
Examples of lossy compression

Speech/Audio: GSM speech compression, MP3 audio
Image compression: JPEG, JPEG2000
Video compression: MPEG-2, H.263
Lossy compression

Consider an i.i.d. discrete-time random process X
Main difference with respect to lossless compression: we accept some distortion, i.e. we reconstruct $X^* \ne X$
A single letter distortion measure for a length-m data vector is defined as
$\rho_m(x, x^*) = \frac{1}{m} \sum_{j=0}^{m-1} \rho(x_j, x_j^*)$
with $\rho(\cdot)$ a nonnegative component by component distortion measure
Examples of distortion measures

Hamming metric:
$\rho(x_i, x_j^*) = 0$ if $x_i = x_j^*$, and $\rho(x_i, x_j^*) = 1$ if $x_i \ne x_j^*$
Euclidean metric:
$\rho(x_i, x_j^*) = (x_i - x_j^*)^2$
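As a small illustration of the per-letter distortion averaged over a length-m vector, the sketch below evaluates both measures on the same pair of vectors. The vectors `x` and `x_star` are hypothetical sample values, and the function names are illustrative.

```python
def hamming(a, b):
    """Hamming measure: 0 when the symbols match, 1 otherwise."""
    return 0 if a == b else 1

def squared_error(a, b):
    """Euclidean (squared error) measure."""
    return (a - b) ** 2

def average_distortion(x, x_star, rho):
    """Single letter distortion: (1/m) * sum_j rho(x_j, x_j*)."""
    m = len(x)
    return sum(rho(a, b) for a, b in zip(x, x_star)) / m

x = [3, 5, 2, 8]        # hypothetical source vector
x_star = [3, 4, 2, 6]   # hypothetical reconstruction

d_hamming = average_distortion(x, x_star, hamming)
d_squared = average_distortion(x, x_star, squared_error)
print(d_hamming, d_squared)
```

Two of the four samples differ, so the Hamming distortion is 0.5, while the squared error distortion is (0 + 1 + 0 + 4) / 4 = 1.25.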
Source code
A source code Q describes a source X by an approximation X*, such that:
the distortion between X and X* is equal to D
the rate necessary to transmit X* losslessly is R

X → Quantizer → I* → Entropy coder → (rate R bit/sample) → Entropy decoder → I* → Dequantizer → X* (distortion D between X and X*)
Rate-distortion function
Given a desired expected distortion $E[\rho_m(X, X^*)] \le D$, the rate-distortion function R(D) is the minimum rate at which we can guarantee the existence of a source code that represents X with X*, so that X* is encoded with a rate of R bit/sample, and $E[\rho_m(X, X^*)] \le D$
Example of R-D function

[Figure: R-D curve, with rate R on the vertical axis; annotations: "lossless coding" and "maximum distortion"]
Operational R-D function

In practice the R-D function is very difficult to compute for realistic sources.
Usually one employs the operational R-D function, which is the set of practically achievable R-D points for a given sample realization of the source and a specific code
Operational R-D function

Example: consider compressing an image with JPEG at different quality factors
[Figure: operational R-D curve, rate R versus distortion D]
Huffman Coding
Problem statement
Given a source X emitting symbols $a_i$ with probability $P(a_i)$
Find a compact representation $c(a_i) = c_i$
Objective
If $l_i$ represents the length of codeword $c_i$, we want to minimize the average length
$\bar{l} = \sum_i l_i P(a_i)$
Huffman Coding
The average length $\bar{l} = \sum_i l_i P(a_i)$ will be minimized if
$\forall a_i, a_j \mid P(a_i) \ge P(a_j) \Rightarrow l_i \le l_j$
Variable length coding (VLC): the shortest codewords are allocated to the most probable symbols
Huffman Coding

We must guarantee that the codewords $c(a_i) = c_i$ are uniquely decodable
Huffman coding is based on the idea of prefix-free coding
No codeword can be a prefix of another codeword:
$\forall c_i, c_j$ with $i \ne j$ and $l_i \le l_j$, $c_i$ differs from the first $l_i$ bits of $c_j$
Huffman Coding
Example: cod. 1 is not prefix free, cod. 2 is prefix free

Symbol   cod. 1   cod. 2
a        0        0
b        01       10
c        11       11

[Figure: binary trees of cod. 1 and cod. 2]
Prefix-free codes are represented with a binary tree where internal nodes do not represent codewords (the codewords are only the leaves of the tree)
Huffman Coding
The construction of Huffman codes was proposed in D.A. Huffman, "A method for the construction of minimum redundancy codes", Proc. of the IRE, 1952
Code construction example

Memoryless source X = {a, b, c, d, e}, P(a_i) = {0.4, 0.2, 0.2, 0.1, 0.1}
The two least probable symbols are grouped to form a tree node
The sum of the probabilities of the two symbols is attributed to the tree node: P({d,e}) = P(d) + P(e) = 0.2
Code construction example

The procedure is iterated considering both the remaining symbols and the created trees
[Figure: construction of the Huffman tree; the node probabilities are 0.2, 0.4, 0.6 and 1.0]
Resulting code:
a (0.4): 1
b (0.2): 01
c (0.2): 000
d (0.1): 0010
e (0.1): 0011
$\bar{l} = \sum_i l_i P(a_i) = 2.2$ bps
$H(X) = -\sum_i P(a_i) \log_2 P(a_i) = 2.122$ bps
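The construction can be sketched in code. This is a minimal illustration using a priority queue, not the author's implementation. It computes only the codeword lengths, and because ties between equal probabilities can be broken in different ways, the individual lengths may differ from the tree in the example while the average length is the same.

```python
import heapq
import itertools
import math

def huffman_lengths(probs):
    """Return {symbol: codeword length} for a Huffman code of the given source."""
    counter = itertools.count()   # tie-breaker so heap entries never compare lists
    heap = [(p, next(counter), [s]) for s, p in probs.items()]
    heapq.heapify(heap)
    lengths = {s: 0 for s in probs}
    while len(heap) > 1:
        # Group the two least probable nodes into a new tree node.
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1       # every symbol below the new node gains one bit
        heapq.heappush(heap, (p1 + p2, next(counter), s1 + s2))
    return lengths

source = {"a": 0.4, "b": 0.2, "c": 0.2, "d": 0.1, "e": 0.1}
lengths = huffman_lengths(source)
avg_len = sum(source[s] * lengths[s] for s in source)
h = -sum(p * math.log2(p) for p in source.values())
print(lengths, round(avg_len, 3), round(h, 3))
```

For this source the average length is 2.2 bits/symbol against an entropy of about 2.122 bits/symbol.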
Huffman Coding
Coding efficiency
$H(X) \le \bar{l} < H(X) + 1$
Tighter upper bounds are available, given the maximum probability value $p_M$:
$\bar{l} < H(X) + p_M$, if $p_M \ge 0.5$
$\bar{l} < H(X) + p_M + 0.086$, if $p_M < 0.5$
Huffman Coding
Coding efficiency can be poor with a small alphabet with unbalanced probabilities ($p_M > 0.5$)
a (0.8): 0
b (0.18): 11
c (0.02): 10
$H(X) = 0.816$ bps, $\bar{l} = 1.2$ bps
Extended Huffman Coding

Extended Huffman coding is obtained by coding n-tuples of symbols

Example, with $H(X) = 0.816$ bps
Single symbols: a (0.8): 0, b (0.18): 11, c (0.02): 10, giving $\bar{l} = 1.2$ bps
Pairs of symbols:
aa (0.64): 0
ab (0.144): 11
ac (0.016): 10101
ba (0.144): 100
bb (0.0324): 1011
bc (0.0036): 10100100
ca (0.016): 101000
cb (0.0036): 1010011
cc (0.0004): 10100101
giving $\bar{l} = 0.8614$ bps
Huffman Coding
Extended Huffman coding efficiency is
$H(X) \le \bar{l} < H(X) + \frac{1}{n}$
(where n is the number of grouped symbols)
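The benefit of grouping symbols can be verified numerically for the three-symbol source a (0.8), b (0.18), c (0.02). The sketch below is an illustrative check under the assumption of a memoryless source, so that the probability of an n-tuple is the product of the symbol probabilities. The helper `huffman_avg_length` is a hypothetical name; it relies on the fact that the average length of a Huffman code equals the sum of the probabilities of the internal tree nodes.

```python
import heapq
import itertools
import math

def huffman_avg_length(probs):
    """Average codeword length of a Huffman code for the given probabilities."""
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged            # each merge adds one bit to all symbols below it
        heapq.heappush(heap, merged)
    return total

def extended_rate(symbol_probs, n):
    """Bits per source symbol when n-tuples of symbols are Huffman coded."""
    tuple_probs = [math.prod(t) for t in itertools.product(symbol_probs, repeat=n)]
    return huffman_avg_length(tuple_probs) / n

p = [0.8, 0.18, 0.02]
h = -sum(q * math.log2(q) for q in p)
rate_1 = extended_rate(p, 1)
rate_2 = extended_rate(p, 2)
print(round(h, 3), round(rate_1, 4), round(rate_2, 4))
```

Coding single symbols costs 1.2 bits/symbol, coding pairs costs about 0.8614 bits/symbol, and both stay within the interval from H(X) to H(X) + 1/n.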
Simplified VLC
An easy and sub-optimal VLC coding technique is known as run-length coding
It is based on the assumption that a given symbol is repeated in long runs
Typical applications: fax, B/W images
The symbol and the length of its run are coded
Example: X = 000000100000000010000001
Code: 6, 9, 6
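The example above codes the lengths of the runs of zeros that precede each 1. A minimal sketch of this variant follows; it assumes a binary string in which every run of zeros is terminated by a 1, as in the example, and the function names are illustrative.

```python
def zero_run_lengths(bits):
    """Lengths of the runs of '0' that precede each '1' in a binary string."""
    runs, count = [], 0
    for b in bits:
        if b == "0":
            count += 1
        else:
            runs.append(count)
            count = 0
    return runs

def rebuild(runs):
    """Inverse operation: each run of zeros is followed by a single '1'."""
    return "".join("0" * r + "1" for r in runs)

x = "000000100000000010000001"
code = zero_run_lengths(x)
print(code)
```

For the sequence of the example this gives the runs 6, 9, 6, and `rebuild(code)` returns the original string.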
Basics of images
Light is part of the EM wave
Illuminating and reflecting light
Human Eye
[Figure labels: cones, rods]
Human eye: some features
The range of intensity that we can perceive is impressive (on the order of $10^{10}$)
The HVS cannot operate over such a range simultaneously
Brightness adaptation is used
Brightness discrimination is poor at low levels of illumination (Weber law)
Sensitive to edges (high contrast zones)
Colors
Sensing colors
The 7 million cones in the human eye can be divided into 3 categories, able to sense red (R), green (G), blue (B)
RGB color model
Trichromatic color mixing
RGB vs. CMY
Color representation models
YCbCr color space
An important color space for video applications is the so-called YCbCr

Luminance: Y = 0.299 R + 0.587 G + 0.114 B
Chrominance: Cb = B - Y, Cr = R - Y
Y corresponds to the black and white TV signal
Cb/Cr can be used by color TV to generate R, G, B
The HVS is much less sensitive to Cb, Cr (they can be compressed to a large extent without impairing the perceived quality)
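A minimal sketch of the conversion follows, using the luminance weights 0.299, 0.587 and 0.114 and the unscaled color differences Cb = B - Y, Cr = R - Y. Practical standards add scaling factors and offsets to the chrominance components, which are left out here; the function names are illustrative.

```python
def rgb_to_ycbcr(r, g, b):
    """Luminance plus the two unscaled color differences."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, b - y, r - y

def ycbcr_to_rgb(y, cb, cr):
    """Inverse: recover R and B from the differences, then solve Y for G."""
    r = cr + y
    b = cb + y
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b

gray = rgb_to_ycbcr(128, 128, 128)
orange = rgb_to_ycbcr(255, 128, 0)
back = ycbcr_to_rgb(*orange)
print(gray, orange, back)
```

A gray pixel has zero chrominance because the three weights sum to 1, and the inverse mapping returns the original R, G, B values up to rounding.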
Image Transforms part I
Outline
Introduction
Fourier Transform
DFT
Introduction
An image can be described in space or frequency
Spatial frequency: the rate of change of an image
Representation in space domain: picture = collection of brightness levels
Representation in frequency domain: picture = collection of spatial frequency components
Space vs. frequency
[Figure: four example images: dark, low frequency; dark, high frequency; bright, low frequency; bright, high frequency]
Fourier Transform
The Fourier Transform is used to decompose an image into sine and cosine components
Used in a wide range of applications: image analysis, filtering, reconstruction and compression
As we are only concerned with digital images, we will only consider the (2D) Discrete Fourier Transform (DFT)
Example of image frequency representation

Images that are pure cosines have a particularly simple FT.
Pure horizontal cosine of 8 cycles and pure vertical cosine of 32 cycles.
The FT just has a single component, represented by 2 bright spots symmetrically placed about the center of the FT image.
The center of the image is the origin of the frequency coordinate system.
Example of image frequency representation
Images of 2D cosines with both horizontal and vertical components: (left) 4 cycles horizontally and 16 cycles vertically; (right) 32 cycles horizontally and 2 cycles vertically.
For real images, the FT is symmetrical about the origin, so the 1st and 3rd (2nd and 4th) quadrants are the same.
If the image is symmetrical about the x-axis, 4-fold symmetry results.
Discrete Fourier Transform

The DFT is the sampled Fourier Transform and therefore does not contain all frequencies forming an image, but only a set of samples which is large enough to fully describe the spatial domain image.
The number of frequencies corresponds to the number of pixels in the spatial domain image, i.e. the images in the spatial and Fourier domain are of the same size
Two-dimensional DFT
A square image x(n,m) of size N×N has the two-dimensional DFT (2-D DFT):
$F(k,l) = \frac{1}{N^2} \sum_{n=0}^{N-1} \sum_{m=0}^{N-1} x(n,m) \exp\left[-j 2\pi \left(\frac{kn}{N} + \frac{lm}{N}\right)\right]$
F(k,l) is obtained by multiplying the image with the corresponding base function and summing the result.
Two-dimensional DFT
The base functions are sine and cosine waves with increasing frequencies
F(0,0) represents the DC-component, which corresponds to the average brightness, and F(N-1,N-1) represents the highest frequency.
Separability of 2-D DFT

A double sum has to be calculated for each image point. However, because the DFT is separable, it can be written as
$F(k,l) = \frac{1}{N} \sum_{m=0}^{N-1} P(k,m) \exp\left(-j 2\pi \frac{lm}{N}\right)$
$P(k,m) = \frac{1}{N} \sum_{n=0}^{N-1} x(n,m) \exp\left(-j 2\pi \frac{kn}{N}\right)$
Separability of 2-D DFT

The spatial domain image is first transformed into an intermediate image using 1-D DFT applied to the rows
This intermediate image is then transformed into the final image, again using 1-D DFT applied to columns
This procedure decreases the number of required computations
Complexity of 2-D DFT: $O(N^2 \log_2 N)$
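The separability property can be checked on a small image. The sketch below is only an illustration: it uses a plain 1-D DFT with a 1/N factor in each pass (so 1/N^2 overall) rather than an FFT, so it shows the row-column decomposition but not the O(N^2 log2 N) cost. The 4x4 image values are hypothetical.

```python
import cmath

def dft_1d(v):
    """1-D DFT with a 1/N normalization factor."""
    n = len(v)
    return [sum(v[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)) / n
            for k in range(n)]

def dft_2d_separable(img):
    """Row-column 2-D DFT: 1-D DFT along one index, then along the other."""
    first = [dft_1d(row) for row in img]
    second = [dft_1d(list(col)) for col in zip(*first)]
    return [list(r) for r in zip(*second)]       # transpose back to F[k][l]

def dft_2d_direct(img):
    """Direct evaluation of the double sum with the 1/N^2 factor."""
    n = len(img)
    return [[sum(img[a][b] * cmath.exp(-2j * cmath.pi * (k * a + l * b) / n)
                 for a in range(n) for b in range(n)) / n ** 2
             for l in range(n)] for k in range(n)]

img = [[1, 2, 3, 4], [5, 6, 7, 8], [4, 3, 2, 1], [0, 1, 0, 1]]  # hypothetical image
f_sep = dft_2d_separable(img)
f_dir = dft_2d_direct(img)
max_diff = max(abs(f_sep[k][l] - f_dir[k][l]) for k in range(4) for l in range(4))
mean = sum(map(sum, img)) / 16
print(max_diff, f_sep[0][0], mean)
```

Both routes give the same coefficients up to rounding, and F(0,0) equals the average brightness of the image.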
Properties of 2-D DFT

The DFT produces a complex valued image
It is displayed with two images, typically magnitude and phase. Only the magnitude is usually displayed
The Fourier domain image has a much greater range than the image in the spatial domain. Hence, its values are usually calculated and stored as float values and represented in log scale
Magnitude and phase spectra
The images are horizontal cosines of 8 cycles, differing only by a 1/2 cycle lateral shift.
Both have the same magnitude spectrum. The phase spectrum would be different, of course.
Inversion of 2-D DFT

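The Fourier image can be re-transformed to the spatial domain:
$x(n,m) = \sum_{k=0}^{N-1} \sum_{l=0}^{N-1} F(k,l) \exp\left[j 2\pi \left(\frac{kn}{N} + \frac{lm}{N}\right)\right]$
(the $1/N^2$ factor appears only once, here in the forward transform)
Both amplitude and phase information are relevant for the reconstruction of the image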
Effect of phase on reconstruction
[Figure: images (a) and (b), and a reconstructed image]
The reconstructed image is obtained from the frequency domain using amplitude information from (b) and phase information from (a)
2-D DFT: example 1

(a) image (b) section A-B
(c) 1-d FFT of section A-B (d) 2-D FFT of image
2-D DFT : example 2
(a) Chest radiograph
(b) 2-D Fourier spectrum of (a)
broad range of spatial frequencies significant vertical and horizontal features, due to ribs and vertebral column
2-D DFT: example 3

The DFTs tend to have bright lines perpendicular to lines in the original letter. If the letter has circular segments, then so does the FT.
2-D DFT: example 4

The concentric ring structure in the DFT of the white pellets image is due to each individual pellet. If we took the DFT of just one pellet, we would still get this pattern.
The fact that there are many pellets, and information about exactly where each one is, is contained mostly in the phase.
The coffee beans have less symmetry and are more variably colored, so they do not show the same ring structure. You may be able to detect a faint "halo" in the coffee DFT. What do you think this is from?
2-D DFT: example 5
The girl looks very similar to the ape except for the hat
Effect of edge between hat and hair
2-D DFT: example 6
The first image is all black except for a single pixel wide stripe from the top left to the bottom right.
The second image is totally random.
General transform coding scheme
Pixel values X → Reversible transform → Y → Quantization → Entropy coding (with bit allocation)
Why do we need to introduce a transform domain?
The objective is to represent the original data X in a new domain Y, more suitable for quantization and coding
General transform coding scheme

X → Reversible transform → Y → Quantization → Entropy coding (with bit allocation)
Quantization (lossy coding only) depends on
desired bit rate
statistics of the various transformed coefficients
distortion of the reconstructed signal
Entropy coding
Any binary encoding technique (run length, Huffman, arithmetic, ...)
Transform Coding
Transforms are able to decorrelate data
The coefficients in the transformed domain are more suitable for the subsequent quantization operation
In the transformed domain few coefficients concentrate most of the signal energy
Coefficients are decorrelated, therefore scalar quantization is nearly optimum
The Karhunen-Loeve Transform

Also called the Hotelling Transform
The KLT is a data dependent transform
Let X denote a random data vector of length N, m be its (vector) mean value and C be its N x N covariance matrix:
$C = E\{(X - m)(X - m)^T\}$

The matrix C is real and symmetric, and hence can be diagonalized using its eigenvectors
The eigenvectors $e_i$ of C are given by
$C e_i = \lambda_i e_i$
where $\lambda_i$ are the corresponding eigenvalues

Let us consider a matrix A whose columns correspond to the eigenvectors of C, arranged in increasing eigenvalue order
Let us consider the transformation $Y = A^T (X - m)$
Y is zero mean and has covariance matrix:
$C_y = E(Y Y^T) = E[A^T (X - m)(X - m)^T A] = A^T C A = \Lambda$
where $\Lambda$ is the diagonal eigenvalue matrix

The elements in the transformed domain are uncorrelated
If only the top K coefficients are kept, corresponding to the K largest eigenvalues, the mean square error between the original vector X and its reconstruction from the truncated Y is theoretically minimum
The KLT sets the bound on compression efficiency, but is computationally impractical
Discrete Cosine Transform

The 1-D discrete cosine transform (DCT) is defined as
$C(u) = \alpha(u) \sum_{x=0}^{N-1} f(x) \cos\left[\frac{(2x+1) u \pi}{2N}\right]$, $u = 0, 1, \ldots, N-1$
$\alpha(0) = \sqrt{\frac{1}{N}}$, $\alpha(u) = \sqrt{\frac{2}{N}}$ for $u = 1, \ldots, N-1$
Inverse DCT
Similarly, the inverse DCT (IDCT) is defined as
$f(x) = \sum_{u=0}^{N-1} \alpha(u) C(u) \cos\left[\frac{(2x+1) u \pi}{2N}\right]$, $x = 0, 1, \ldots, N-1$
with $\alpha(u)$ defined as before
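A direct, unoptimized sketch of the 1-D DCT and IDCT follows, assuming the usual orthonormal scaling with alpha(0) = sqrt(1/N) and alpha(u) = sqrt(2/N) otherwise. The sample vectors are hypothetical and the code is an illustration, not a fast implementation.

```python
import math

def alpha(u, n):
    """Normalization factor of the orthonormal DCT."""
    return math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)

def dct(f):
    """1-D DCT: C(u) = alpha(u) * sum_x f(x) cos((2x+1) u pi / 2N)."""
    n = len(f)
    return [alpha(u, n) * sum(f[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              for x in range(n))
            for u in range(n)]

def idct(c):
    """1-D IDCT: f(x) = sum_u alpha(u) C(u) cos((2x+1) u pi / 2N)."""
    n = len(c)
    return [sum(alpha(u, n) * c[u] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                for u in range(n))
            for x in range(n)]

flat = dct([10.0] * 8)                         # constant signal
ramp = [52, 55, 61, 66, 70, 61, 64, 73]        # hypothetical row of pixels
ramp_back = idct(dct(ramp))
print([round(v, 3) for v in flat])
print([round(v, 3) for v in ramp_back])
```

A constant signal produces only a DC coefficient (here 80 / sqrt(8)), and applying the IDCT to the DCT coefficients returns the original samples up to rounding.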
2-D DCT
The two-dimensional DCT is obtained by applying the 1-D transform to the rows and columns independently
The corresponding transform is
$C(u,v) = \alpha(u) \alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y) \cos\left[\frac{(2x+1) u \pi}{2N}\right] \cos\left[\frac{(2y+1) v \pi}{2N}\right]$, $u, v = 0, 1, \ldots, N-1$
Inverse 2-D DCT

Analogously, the inverse 2-D transform is
$f(x,y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \alpha(u) \alpha(v) C(u,v) \cos\left[\frac{(2x+1) u \pi}{2N}\right] \cos\left[\frac{(2y+1) v \pi}{2N}\right]$, $x, y = 0, 1, \ldots, N-1$
DCT basis functions

Basis functions of the 8x8 DCT
When the DCT is applied to an 8x8 image, it yields an 8x8 matrix of weighted values corresponding to how much of each basis function is present in the image
An 8x8 image that just contains one shade of gray will yield only a weighted value for the upper left hand DCT basis function (which has no frequencies in the x or y direction).
Transform Coding: DCT

For N → ∞, the DCT tends to the KLT (it tends to diagonalize the covariance matrix)
The input data stream must be divided into blocks before applying the transform
The correlation across the block boundaries is not removed
Example of 2-D DCT

[Figure: image and its DCT]
Test image: Lenna
Interpretation of DCT basis functions

The top-left basis function represents zero spatial frequency (DC coefficient)
Along the top row the basis functions have increasing horizontal spatial frequency content.
Down the left column the functions have increasing vertical spatial frequency content.
DFT vs. DCT periodicity

[Figure: DFT periodicity (with discontinuities at the period boundaries) vs. DCT periodicity (period 2n)]
Why DCT not FFT?
DCT can approximate lines well with fewer coefficients
Blocking artifacts less pronounced
Better approximation to the KLT
Used in the JPEG standard
DFT (example)
DFT (25% samples retained)
Absolute Error (MSE= 5.1345)
Quantization
Goal of quantization: to represent a real number in $(-\infty, +\infty)$ as an integer number, i.e. an element of a discrete and finite set of $2^N$ possible values (N bit quantizer).
Bit rate: $B = N f_s$
Uniform Quantization
[Figure: uniform quantizer; original signal (thresholds xi, xi+1) vs. quantized signal (levels yi, yi+1)]
Errors: granularity and overload
Quantization techniques
Uniform quantization is (almost) optimal when the input signal is memoryless
Quantization techniques:
Scalar quantization
Non-uniform quantization
Robust quantization
Pdf-optimized quantization (Lloyd-Max)
Entropy-constrained quantization
Vector quantization
Alternative to transforms: linear prediction

Linear prediction:
estimate the value of the current pixel x[n] as the linear combination of past pixels: x*[n] = a1 x[n-1] + a2 x[n-2] + ...
instead of x[n], encode the prediction error e[n] = x[n] - x*[n]
the decoder recovers x[n] = e[n] + x*[n]
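The three steps above can be sketched as a lossless predictive coder. This is only an illustration: the first order predictor with a1 = 1 and the sample values are hypothetical choices, and past samples that do not exist yet are treated as zero.

```python
def predict(history, coeffs):
    """x*[n] = a1 x[n-1] + a2 x[n-2] + ...; missing past samples count as 0."""
    return sum(a * history[-(i + 1)] for i, a in enumerate(coeffs) if i < len(history))

def encode(x, coeffs):
    """Replace each sample by its prediction error e[n] = x[n] - x*[n]."""
    return [x[n] - predict(x[:n], coeffs) for n in range(len(x))]

def decode(errors, coeffs):
    """Rebuild x[n] = e[n] + x*[n], predicting from already decoded samples."""
    x = []
    for e in errors:
        x.append(e + predict(x, coeffs))
    return x

samples = [100, 102, 104, 103, 101]   # hypothetical row of pixel values
coeffs = [1]                          # hypothetical predictor: x*[n] = x[n-1]
errors = encode(samples, coeffs)
decoded = decode(errors, coeffs)
print(errors, decoded)
```

After the first sample the prediction errors are small numbers around zero, which is what makes them cheaper to code than the pixel values themselves.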
Linear prediction (DPCM)

[Figure: DPCM encoder (e[n] = x[n] - P(x[n])) and decoder (x[n] = e[n] + P(x[n]))]
e[n] can be quantized more efficiently than x[n]
Example of third order LP
P(x) = aA + bB + cC
E = X - P(x)
Practical DPCM scheme
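[Figure: practical DPCM scheme; the encoder quantizes the prediction error (Q), giving e[n] + q[n] after inverse quantization (Q-1), and predicts from the reconstructed signal xs[n] through H(z); the decoder applies Q-1 and the same prediction loop with H(z)]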
The JPEG coding standard
International standards
Organizations that define standards:
ISO (JTC 1 SC 29 WG 01/11): JPEG, MPEG, JPEG 2000
ITU: H.261, H.263, H.264
Why standards?
Interoperability
International standards (contd)

Who defines standards:
Companies
Academia
Advantages of using a standard:
provides interoperability
Disadvantages:
technology in the standards is some years old
Carrying out the technical work

A few weekly meetings per year
A few intermediate meetings

Call for proposals
Working draft
Final committee draft
Final Draft International Standard
Final Publication Draft
The copyright issue
Some technologies used in JPEG are covered by patents:
IBM, AT&T, and Mitsubishi for arithmetic coder
Forgent for Huffman tables (?)
baseline algorithm, royalty-free
advanced algorithm, with license fees
Goal:
Participants in JPEG are required to agree to provide royalty-free licenses for technology that they bring into the standard, for the baseline version of the algorithm.
International standards (contd)
What is standardized?
Source data → Multimedia encoder → Syntax (defined by standard) → Multimedia decoder
Roadmap to international image coding standards
JPEG: baseline lossy compression; extension (hierarchical, progressive); lossless compression
JPEG-LS: lossless compression; near-lossless compression
JPEG 2000: lossy compression; lossless compression; extensions
JPEG
This standardized image compression scheme is designed to work on full-color or gray-scale digital images
JPEG defines a baseline algorithm, plus extensions for Progressive and Hierarchical Coding
It foresees a separate lossless mode (Huffman or Arithmetic coding)
JPEG block scheme
Color space decomposition: RGB → YUV (subsampled)
Application of the algorithm to each component
JPEG
The coding steps:
transformation of the image into a suitable color space
application of an 8x8 block DCT
quantization
zig-zag reading
entropy (lossless) coding
JPEG compression
A weighted scalar quantization is applied to each transformed coefficient in every block
Quantized DC values are coded by DPCM from block to block
Zig-zag reordering
Encoding of zero-runs
Entropy coding
JPEG Quantization Matrices
Divide each entry of the DCT coefficient matrix by the corresponding entry in the quantization matrix:
Fq(u,v) = round[F(u,v)/Q(u,v)]
Quality factor to control quality
Quantization tables are contained in the JPEG file, with image information
Flexibility with quantization tables (?)
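The quantization rule Fq(u,v) = round[F(u,v)/Q(u,v)] can be illustrated on a small block. In this sketch the 2x2 coefficient block and the table entries are hypothetical values chosen for the example (a real JPEG block is 8x8). Note that Python's `round` rounds exact halves to the nearest even integer, which does not matter for these numbers.

```python
def quantize(F, Q):
    """Fq(u,v) = round(F(u,v) / Q(u,v)), entry by entry."""
    return [[round(f / q) for f, q in zip(f_row, q_row)]
            for f_row, q_row in zip(F, Q)]

def dequantize(Fq, Q):
    """Decoder side: multiply each index by its quantization step."""
    return [[v * q for v, q in zip(v_row, q_row)]
            for v_row, q_row in zip(Fq, Q)]

F = [[-415.4, -30.2], [5.0, -21.9]]   # hypothetical DCT coefficients
Q = [[16, 11], [12, 12]]              # hypothetical quantization steps

Fq = quantize(F, Q)
F_rec = dequantize(Fq, Q)
print(Fq, F_rec)
```

The small coefficient 5.0 is quantized to zero, and the reconstructed values differ from the originals by at most half a quantization step: this is where the loss is introduced.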

[Figures: Original block; DCT (rows); DCT (columns); Quantized DCT; Reconstructed block; Abs error vs. original]
JPEG entropy coding

The zig-zag scanned coefficients are encoded as a sequence of couples of symbols:
Symbol 1: (RUNLENGTH, SIZE)
Symbol 2: (AMPLITUDE)
Runlength: nr. of zero samples preceding the current sample (0-15 or EOB)
Size: nr. of quantization bits for the current sample
Amplitude: quantized sample value
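The symbol formation can be sketched as follows for the AC coefficients of one block, already in zig-zag order. This is a simplified illustration under stated assumptions: SIZE is taken as the number of bits of the absolute amplitude, a run of sixteen zeros is emitted as the (15, 0) symbol as in baseline JPEG, and the Huffman coding of the symbols is not shown. The coefficient values are hypothetical.

```python
def ac_symbols(ac):
    """Turn zig-zag ordered AC coefficients into ((RUNLENGTH, SIZE), AMPLITUDE) pairs."""
    symbols = []
    last_nonzero = max((i for i, v in enumerate(ac) if v != 0), default=-1)
    run = 0
    for v in ac[:last_nonzero + 1]:
        if v == 0:
            run += 1
            if run == 16:                     # sixteen zeros in a row
                symbols.append(((15, 0), None))
                run = 0
        else:
            size = abs(v).bit_length()        # bits needed for the amplitude
            symbols.append(((run, size), v))
            run = 0
    if last_nonzero < len(ac) - 1:
        symbols.append(("EOB", None))         # only zeros remain in the block
    return symbols

ac = [5, 0, 0, -3, 0, 0, 0, 1] + [0] * 55     # hypothetical 63 AC coefficients
symbols = ac_symbols(ac)
print(symbols)
```

The 63 coefficients collapse into three (RUNLENGTH, SIZE) symbols with their amplitudes plus one end-of-block symbol.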
Codestream syntax
The codestream consists of
Markers and marker segments (to carry auxiliary information)
Data

Marker structure: Marker code | Length | Marker data
JPEG syntax
FFD8 (Start Of Image)
FFE0 (JFIF marker)
FFDB (Define Quantization Table)
FFC4 (Define Huffman Table)
FFC0 (Start Of Frame)
FFDA (Start Of Scan)
FFD0-FFD7 (Restart Markers)
FFFE (Comment)
FFD9 (End Of Image)
JPEG lossless mode
JPEG performance
Quality max - Size: 61k
Quality med - Size: 14k
Quality low - Size: 4k
Original image: encoded @ 24 bits per pixel
Quality 95/100: 3.926 bits per pixel (bpp), CR = 24/3.926 = 6.1
Quality 50/100: 1.067 bits per pixel (bpp), CR = 22.5
Quality 25/100: 0.705 bits per pixel (bpp), CR = 34.0
Quality 5/100 (min. useful): 0.291 bits per pixel (bpp), CR = 82.5
JPEG
Disadvantages:
blocking effect for non smooth images
image correlation is not removed across block boundaries
only possible dynamic range is 8 or 12 bpp
non unified version for lossless and lossy compression
Fourier-like basis functions
poor performance at low bit rate
The use of low bit rate coding algorithms becomes necessary (JPEG 2000)
Video coding
Analog video
Progressive and interlaced scans
Color TV broadcasting and receiving
Why not use RGB directly?
Digitizing a raster video
RGB → YCbCr
Chrominance subsampling formats
Digital video formats
2D motion estimation
Notation
Motion representation
Block based motion estimation
Block matching algorithm
Exhaustive block matching algorithm
Complexity of integer-pel EBMA
Sample Matlab script for integer-pel EBMA
Fractional accuracy EBMA
Half-pel accuracy EBMA
Bilinear interpolation
Pros and cons with EBMA
Fast algorithms for BMA
Video coding using motion compensation
Characteristics of typical videos
Key ideas in video compression: hybrid video coding
Different coding modes
Temporal prediction
Block matching algorithm for motion estimation
Multiple reference frame temporal prediction
Spatial prediction
Motion compensated video
Macroblocks in 4:2:0 color format
MB coding in I-mode (assuming no intra prediction)
MB coding in P-mode
MB coding in B-mode
Coding mode selection
Rate control
Loop filtering
Video coding standards
Scalable coding
Bitstream scalability
Illustration of scalable coding
Quality (SNR) scalability by multistage quantization
Spatial/temporal scalability through down/upsampling
Scalability in MPEG-2
Fine granularity scalability (FGS) in MPEG-4
Drift problem in scalable codecs
How to solve the drift problem?
Trade-off between coding efficiency and drift
Video coding standards and applications
H.261 video coding standard
DCT coefficient quantization
Motion estimation/compensation
Variable length coding
Parameter selection and rate control
H.263 video coding standard
Improvements over H.261
PB-picture mode
Performance of H.261 and H.263
MPEG-1 overview
MPEG-1 vs. H.261
Group of pictures in MPEG
MPEG-2 overview
MPEG-2 vs. MPEG-1
DCT modes
MPEG-2 scalability
SNR-scalable encoder
Spatially-scalable encoder
Temporally scalable encoder
Profiles and levels in MPEG-2
MPEG-4 overview
Object-based coding
Object description hierarchy in MPEG-4
Example of scene composition
Coding of texture with arbitrary shape
Shape-adaptive DCT
MPEG-4 shape coding
Mesh animation
Body and face animation
MPEG-4 video coding efficiency tools
H.264/AVC
Introduction
Started as ITU recommendation
Now joint ISO and ITU effort (JVT)
ITU H.264/AVC, MPEG-4 Part 10
Targets: bit rate reduction by a factor of 2, at the same quality, with respect to other standards, at the expense of much higher complexity
Comparison of video coders (QCIF, 30 fps, 100 kbit/s)

Original
H.263 baseline (33 dB)
H.263+ (33.5 dB)
MPEG-4 core (33.5 dB)
H.264 (42 dB)
H.264/AVC applications
Relationship to other standards
H.264/AVC structure
H.264/AVC profiles
Baseline: core compression capabilities, plus error resilience. Suitable for videoconference, mobile video
Main: high compression and quality (e.g., broadcasting)
Extended: added features for efficient streaming
H.264 video coding layer
Partitioning of a frame
Flexible Macroblock Ordering (FMO)
Common elements with other standards
H.264 motion compensation accuracy
Macroblock partitioning
Multiple reference frames
Macroblock type
Each MB can be encoded in one of the following modes:
INTRA
Intra 4x4: 9 prediction modes for Y
Intra 16x16: 4 prediction modes for Y
INTER
prediction with square blocks (16x16, 8x8, 4x4)
prediction with rectangular blocks (8x16, 16x8, 4x8, 8x4)
RATE DISTORTION OPTIMIZATION
Intra prediction
MBs to be coded in Intra mode can be predicted from the already coded MBs in the same slice
(Intra 16x16)
DCT and inverse transform
H.264 4x4 transform
4x4 DCT
Deblocking filter
The in-loop filter improves visual quality and PSNR.
The filter in H.264/AVC is elaborate, and adapts at several levels:
slice level
edge level (filtering strength is dependent on coding residuals)
sample level (thresholds allow the filter to be turned off for given pixels)
strong filter for very flat MBs
Deblocking filter: subjective results (Intra)
Deblocking filter: subjective results (Inter)
Entropy Coding
CAVLC (Context-adaptive Variable Length Coding)
uses exp-Golomb codes for all symbols except transform coefficients uses Huffman-like tables for transform coefficients
CABAC (Context-based Adaptive Binary Arithmetic Coding)
CABAC
S-pictures
Comparison of H.264 to MPEG-4
Rate Allocation
How does one select the optimal coding mode for each MB?
Lagrangian optimization: for each MB and for each coding mode a cost function is computed.
The mode minimizing the cost function is used for that MB.
This aims at the maximum PSNR for the given bit-rate, at the expense of a very high complexity
Lagrangian R-D optimization

Cost function:
J = D + λR
where:
D = distortion using the current options (using SAD)
R = bit-rate using the current options
λ = Lagrange parameter (used to set the bit-rate)
250
Lagrangian R-D optimization

Given QP (i.e., the bit-rate), for every possible set of coding parameters (coded block pattern, intra and inter coding modes, reference frame, motion vectors), compute:
  the distortion D associated with that set of parameters
  the rate R associated with that set of parameters
  the cost J = D + λR associated with that set of parameters
Select the set of parameters that minimizes J.
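The selection rule above can be sketched in a few lines. This is only an illustration: the function name select_mode, the mode labels and the distortion and rate numbers are made up for the example, and a real encoder would obtain D and R by actually coding the MB with each option.

```python
def select_mode(candidates, lam):
    """Pick the option minimizing J = D + lam * R.

    candidates: dict mapping a mode label to (distortion, rate_in_bits).
    Returns (best_mode, best_cost).
    """
    best_mode, best_cost = None, float("inf")
    for mode, (distortion, rate) in candidates.items():
        cost = distortion + lam * rate        # Lagrangian cost J
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost


# Hypothetical measurements for one macroblock (SAD distortion, bits).
measurements = {
    "INTRA_16x16": (1200, 40),
    "INTER_16x16": (900, 65),
    "INTER_8x8": (700, 120),
}

low_rate_choice = select_mode(measurements, lam=20)   # a large lambda favors cheap modes
high_rate_choice = select_mode(measurements, lam=1)   # a small lambda favors low distortion
```

Raising the Lagrange parameter shifts the choice toward modes that spend fewer bits, which is how it sets the bit-rate.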
251
Performance: H.264 vs. MPEG-4
252
Network Adaptation Layer
253
Data partitioning
The symbols contained in a slice are partitioned into different types:

0  TYPE_HEADER    Picture or Slice Headers
1  TYPE_MBHEADER  Macroblock header information
2  TYPE_MVD       Motion Vector Data
3  TYPE_CBP       Coded Block Pattern
4  TYPE_2x2DC     2x2 DC Coefficients
5  TYPE_COEFF_Y   Luma AC Coefficients
6  TYPE_COEFF_C   Chroma AC Coefficients
7  TYPE_EOS       End-of-Stream Symbol
254
NAL for IP networks

1 slice is mapped to 2 (or 3) packets

First packet (high priority): TYPE_HEADER, TYPE_MBHEADER, TYPE_MVD, TYPE_EOS, TYPE_CBP
Second packet (low priority): TYPE_2x2DC, TYPE_COEFF_Y, TYPE_COEFF_C
255
Error concealment
It is not normative.
It works on single MBs.

256
INTRA Concealment
Pixel value = (15x(16-3) + 21x(16-12) + 32x(16-7) + 7x(16-8)) / (13+4+9+8) =18
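The arithmetic in this example can be reproduced with a short sketch. It assumes that the numbers subtracted from 16 are the distances from the lost pixel to the boundary pixels of the four neighbouring MBs, so that closer pixels get larger weights. The function name conceal_pixel is hypothetical.

```python
def conceal_pixel(neighbors, block_size=16):
    """Weighted average of boundary pixels; closer pixels get larger weights.

    neighbors: list of (pixel_value, distance) pairs, with 0 <= distance < block_size.
    """
    weights = [block_size - dist for _, dist in neighbors]
    weighted_sum = sum(value * w for (value, _), w in zip(neighbors, weights))
    return weighted_sum / sum(weights)


# Values from the slide: four boundary pixels and their assumed distances.
example = [(15, 3), (21, 12), (32, 7), (7, 8)]
estimate = conceal_pixel(example)   # 623 / 34, which truncates to 18
```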
257
INTER Concealment
258
Error control
Steps involved in a communication session
260
End-to-end delay
261
Challenges for video communications
262
Conventional source coding is not good enough
263
Spatial/temporal error propagation
264
Drift
265
Effect of transmission errors
266
QoS requirements of typical video applications
267
Interactive two-way visual communications
268
One-way video streaming
269
Major types of communication networks
270
Characteristics of major video communications applications
271
Error control techniques for video
272
Transport level error control
273
Channel coding basics
274
FEC for video transmission
275
Delay-constrained ARQ
276
Error resilient encoding
277
Reversible variable length coding
278
Coding mode selection based on network conditions
279
Layered coding with unequal error protection
280
Multiple description coding
281
Generic two description coder
282
Challenges for multiple description video coding
283
Video redundancy coding in H.263+
284
Decoder error concealment
285
Error concealment techniques
286
Sample error concealment results
287
Encoder-decoder interactive error control
288
Video transport using path diversity
289
Why using multiple paths
290
Video streaming
A brief history of streaming media
292
Internet media streaming
293
What is streaming video?
294
Outline
295
Time-varying available bandwidth
296
Time-varying delay
297
Effect of packet loss
298
Unicast vs. multicast
299
Heterogeneity for multicast
300
Architecture for video streaming
301
Video compression
302
Application of layered video
303
Application-layer QoS control
304
Source-based rate control
305
Receiver-based rate control
306
Continuous Media Distribution Services
307
Continuous media distribution services
The aim is to provide QoS and achieve efficiency for streaming video/audio over the best-effort Internet. Continuous media distribution services include:
1. network filtering
2. application-level multicast
3. content replication
308
1) Network Filtering
Network filtering aims to maximize video quality during network congestion. The filter receives the clients' requests and adapts the stream sent by the server accordingly.
on the data plane control plane
309
1) Network Filtering (contd)

Typically, frame-dropping filters are used as network filters. The receiver can change the bandwidth of the media stream by sending requests to the filter to increase or decrease the frame-dropping rate. To decide when to do so, the receiver continuously measures the packet loss ratio.
310
2) Application-Level Multicast
Application-level multicast aims to build a multicast service on top of the Internet. Media multicast networks can be built from an interconnection of content-distribution networks, and could support peering relationships at the application level or at the streaming-media/content layer.
311
3) Content Replication
1) Mirroring
Mirroring places copies of the original multimedia files on other machines scattered around the Internet. In this way, clients can retrieve multimedia data from the nearest duplicate server. Disadvantages: expensive, ad hoc, and slow.
2) Caching
Caching makes local copies of the content that clients retrieve. It is based on the belief that different clients will load many of the same contents.
312
Receiver-driven layered multicast
313
Streaming Servers
314
Streaming server
315
Streaming Servers
Streaming servers are required to process multimedia data under timing constraints. A streaming server typically consists of the following three subsystems:
  Communicator
  Operating system
  Storage system
316
Real-Time Operating System

1) Process Management
The operating system must use real-time scheduling techniques. There are two basic algorithms:
  Earliest deadline first (EDF)
    each task is assigned a deadline, and
    the tasks are processed in the order of increasing deadlines.
  Rate-monotonic scheduling
    each task is assigned a static priority according to its request rate.
    the higher the rate, the higher the priority.
    the tasks are processed in the order of priorities.
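The two scheduling rules can be sketched as follows. This is a simplified, non-preemptive illustration: the function names and the task lists are hypothetical, and a real-time operating system would also handle periodic releases and preemption.

```python
import heapq


def edf_order(tasks):
    """Return task names in order of increasing deadline.

    tasks: list of (name, deadline) pairs.
    """
    # The index breaks ties between equal deadlines in arrival order.
    heap = [(deadline, index, name) for index, (name, deadline) in enumerate(tasks)]
    heapq.heapify(heap)
    order = []
    while heap:
        _, _, name = heapq.heappop(heap)
        order.append(name)
    return order


def rate_monotonic_priorities(rates):
    """Assign static priorities: the higher the request rate, the higher the priority.

    rates: dict mapping a task name to its request rate. Priority 0 is the highest.
    """
    ranked = sorted(rates, key=lambda name: -rates[name])
    return {name: priority for priority, name in enumerate(ranked)}


edf_result = edf_order([("audio", 20), ("video", 40), ("disk", 10)])
rm_result = rate_monotonic_priorities({"audio": 50, "video": 25, "control": 1})
```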
317
Real-Time Operating System (contd)

2) Resource Management
Resources in a multimedia server include CPUs, memories, and storage devices. Resource management involves admission control and resource allocation.
  Both can be deterministic or statistical.
318
Real-Time Operating System (contd)

3) File Management
The file system provides access and control functions for file storage and retrieval. There are two basic approaches:
  Keep each file on a single disk (a file is not scattered across several disks).
  Organize files on distributed storage such as disk arrays.
319
Storage System
1) Increase throughput with data striping
Under data striping schemes, a multimedia file is scattered across multiple disks and the disk array can be accessed in parallel. An important issue is to balance the load of the most heavily loaded disks to avoid overload situations while keeping latency small.
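A minimal sketch of round-robin striping is shown below. The block size, the number of disks and the function names are assumptions made for the example, and disks are modelled as in-memory lists rather than real devices.

```python
def stripe(data: bytes, num_disks: int, block_size: int):
    """Scatter data across num_disks in round-robin blocks of block_size bytes."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        disks[block_index % num_disks].append(data[offset:offset + block_size])
    return disks


def read_back(disks):
    """Reassemble the file by visiting the disks in the same round-robin order."""
    blocks = []
    for row in range(max(len(disk) for disk in disks)):
        for disk in disks:
            if row < len(disk):
                blocks.append(disk[row])
    return b"".join(blocks)


media_file = bytes(range(10))
layout = stripe(media_file, num_disks=3, block_size=2)
```

Consecutive blocks land on different disks, so a sequential read can fetch several blocks in parallel.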
320
Storage System (contd)

2) Increase capacity with tertiary and hierarchical storage
To keep the storage cost down, tertiary storage (e.g., tape, CD-ROM) must be added.
Under the hierarchical storage architecture, only a fraction of the total storage is kept on disks, while the major remaining portion is kept on a tertiary tape system.
321
Hierarchical Storage
322
Storage System (contd)

3) Fault tolerance
Fault tolerance ensures uninterrupted service even in the presence of disk failures. There are two techniques:
  Error-correcting (parity-encoding): adds a small storage overhead
  Mirroring: incurs at least twice as much storage volume
Tradeoff between reliability and complexity.
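The parity-encoding idea can be illustrated with a single XOR parity block, as a sketch. The block contents and function names are hypothetical, and only one failed disk can be recovered with this simple scheme.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks plus one parity disk (small storage overhead).
data_blocks = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data_blocks)

# Suppose the disk holding data_blocks[1] fails: XOR the survivors with the parity.
recovered = xor_blocks([data_blocks[0], data_blocks[2], parity])
```

Mirroring would instead store a full second copy of every block, which is simpler to recover from but doubles the storage volume.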
323
Dynamic stream switching: SureStreams
324
Dynamic stream switching: SP-frames
325
SP-frames (contd)
326
SP-frames: performance gain
327
Media Synchronization
328
Media Synchronization
Media synchronization refers to maintaining the temporal relationships within one data stream and between various media streams. Each component on the transport path affects the data in a different way.
They all inevitably introduce delays and delay variations.
329
Media Synchronization (contd)

There are three levels of synchronization:
  Intra-stream synchronization: the media layer
  Inter-stream synchronization: the stream layer
  Inter-object synchronization: the object layer
330

The most widely used method to specify the temporal relations is time-stamping:
  At the source, a stream is time-stamped to keep temporal information.
  At the destination, the application presents the streams according to their temporal relation.
331

Preventive
  Designed to minimize synchronization errors as data is transported from the server to the user, i.e., to minimize latencies and jitter.
Corrective
  Compensates when synchronization errors occur, e.g., the Stream Synchronization Protocol (SSP).
332
Protocols for Streaming Video
333
Protocol stack for Internet streaming media
334
Protocols for Streaming Video
Network-layer protocol
  network addressing
  IP
Transport protocol
  end-to-end network transport functions
  UDP, TCP, real-time transport protocol (RTP), and real-time control protocol (RTCP)
Session control protocol
  defines the messages and procedures to control the delivery of the multimedia data during an established session
  RTSP and the session initiation protocol (SIP)
335
Protocol Stacks for Media Streaming
336
Transport Protocols
UDP and TCP support functions such as multiplexing, error control, congestion control, or flow control. Since TCP retransmission introduces unacceptable delays, UDP is typically employed for streaming applications.
337
Transport Protocols (contd)

RTP is a data transfer protocol, while RTCP is a control protocol. In an RTP session, participants periodically send RTCP packets to convey feedback on the quality of data delivery and membership information.
338

RTP provides the following functions:
  Time-stamping
  Sequence numbering
  Payload type identification
  Source identification
    SSRC (Synchronization SouRCe identifier)
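The four functions listed above map directly onto fields of the 12-byte fixed RTP header. The sketch below parses that header with the standard struct module; the function name parse_rtp_header and the sample packet values are made up for the example, and CSRC entries and header extensions are ignored.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Parse the 12-byte fixed RTP header (CSRC list and extensions are not handled)."""
    if len(packet) < 12:
        raise ValueError("an RTP packet needs at least 12 header bytes")
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "padding": bool(byte0 & 0x20),
        "extension": bool(byte0 & 0x10),
        "csrc_count": byte0 & 0x0F,
        "marker": bool(byte1 & 0x80),
        "payload_type": byte1 & 0x7F,   # payload type identification
        "sequence_number": seq,         # sequence numbering
        "timestamp": timestamp,         # time-stamping
        "ssrc": ssrc,                   # source identification
    }


# Hypothetical packet: version 2, payload type 96, followed by a dummy payload.
sample = struct.pack("!BBHII", 0x80, 96, 1234, 90000, 0xDEADBEEF) + b"payload"
header = parse_rtp_header(sample)
```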
339

Basically, RTCP provides the following services:
  QoS feedback
  Participant identification
    RTCP SDES
  Control packets scaling
  Inter-media synchronization
  Minimal session control information
340
Session Control Protocols

Main functions of RTSP are:
  To support VCR-like control operations.
  To provide means for choosing delivery channels and delivery mechanisms.
  To establish and control streams of continuous audio and video media:
    Media retrieval
    Adding media to an existing session
341
Session Control Protocols (contd)

Session Initiation Protocol
SIP can also create and terminate sessions with one or more participants. SIP supports user mobility by proxying and redirecting requests to the user's current location.
342
Peer-to-peer networking
Outline
Introduction and Overview
Popular P2P Applications
P2P Video-on-Demand
Conclusions and Future of P2P
344
P2P Introduction and Overview
P2P Introduction and Overview - Outline

Part I: History, motivation and evolution
  History: Napster and beyond
  What is Peer-to-peer?
  Why Peer-to-peer?

Brief P2P technologies overview
  Unstructured p2p-overlays
  Structured p2p-overlays
346
History, motivation and evolution
P2P represented ~65% of Internet Traffic at end 2006
1999:
Napster, first widely used p2p-application

347
Napster, first widely used p2p-application

The application: a p2p application for the distribution of mp3 files
  Each user can contribute its own content
How it works: central index server
  Maintains a list of all active peers and their available content
Distributed storage and download
  Client nodes also act as file servers
  All downloaded content is shared

348
History, motivation and evolution - Napster (contd)

Initial join
  Peers connect to the Napster server
  Peers transmit the current listing of their shared files to the server
[Figure: peers join the central index server]

349

Content search
  A peer sends a song request to the Napster server
  The Napster server checks its song database and returns a list of matched peers
[Figure: 1) query and 2) answer between a peer and the central index server]
350

File retrieval
  The requesting peer contacts the peer having the file directly and downloads it
[Figure: 1) request and 2) download between two peers, bypassing the central index server]
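The three Napster steps (join, search, retrieval) can be sketched with a toy central index. The class name CentralIndex, the peer addresses and the file names are hypothetical. The actual file transfer happens directly between peers and is not modelled here.

```python
class CentralIndex:
    """Toy model of a central index server: it stores who shares what, not the files."""

    def __init__(self):
        self.shared = {}                     # peer address -> set of file names

    def join(self, peer, file_names):
        """Initial join: the peer uploads the listing of its shared files."""
        self.shared[peer] = set(file_names)

    def leave(self, peer):
        """Remove a peer that is no longer active."""
        self.shared.pop(peer, None)

    def search(self, file_name):
        """Content search: return the peers that hold the requested file."""
        return sorted(peer for peer, files in self.shared.items() if file_name in files)


index = CentralIndex()
index.join("10.0.0.1", ["song_a.mp3", "song_b.mp3"])
index.join("10.0.0.2", ["song_b.mp3"])
matches = index.search("song_b.mp3")   # the requester then contacts one of these peers
```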
351
History, motivation and evolution - File Download
Napster was the first simple but successful P2P application. Many others followed.
P2P File Download Protocols:
  1999: Napster
  2000: Gnutella, eDonkey
  2001: Kazaa
  2002: eMule, BitTorrent
352
Definition of Peer-to-peer (or P2P)
A peer-to-peer (or P2P) computer network is a network that relies primarily on the computing power and bandwidth of the participants in the network rather than concentrating it in a relatively small number of servers. A pure peer-to-peer network does not have the notion of clients or servers, but only equal peer nodes that simultaneously function as both "clients" and "servers" to the other nodes on the network. This model of network arrangement differs from the client-server model where communication is usually to and from a central server.
Taken from the wikipedia free encyclopedia - www.wikipedia.org
353
It is a broad definition with lots of applications
  P2P-File download: Napster, Gnutella, KaZaa, eDonkey, ...
  P2P-Communication: VoIP, Skype, Messaging, ...
  P2P-Computation: seti@home
  P2P-Streaming: PPLive, ESM, ...
  P2P-Gaming
  P2P-Video-on-Demand
354
History, motivation and evolution Applications
P2P is not restricted to file download!

Application type: File Download, Streaming, Telephony, Video-on-Demand, Gaming
P2P Protocols:
  1999: Napster, End System Multicast (ESM)
  2000: Gnutella, eDonkey
  2001: Kazaa
  2002: eMule, BitTorrent
  2003: Skype
  2004: PPLive
  Today: TVKoo, TVAnts, PPStream, SopCast
  Next: Video-on-Demand, Gaming
355
Why is P2P so successful?

Scalable: it's all about sharing resources
  No need to provision servers or bandwidth
  Each user brings its own resource
  E.g. resistant to flash crowds
    flash crowd = a crowd of users all arriving at the same time
  Resources could be: files to share; upload bandwidth; disk storage; ...

356
Why is P2P so successful? (contd)

Cheap: no infrastructure needed
Everybody can bring its own content (at no cost)
  Homemade content
  Ethnic content
  Illegal content
  But also legal content
High availability: content accessible most of the time
357
P2P-Overlay
Build a graph at the application layer, and forward packets at the application layer
It is a virtual graph
  The underlying physical graph is transparent to the user
  Edges are TCP connections or simply an entry holding a neighboring node's IP address
The graph has to be continuously maintained (e.g. check if nodes are still alive)
358
P2P-Overlay (contd)
Overlay
Source
Underlay
Source
359
The P2P enabling technologies

Unstructured p2p-overlays
  Generally random overlay
  Used for content download, telephony, streaming
Structured p2p-overlays
  Distributed Hash Tables (DHTs)
  Used for node localization, content download, streaming
360
Unstructured p2p-overlays
Unstructured p2p-overlays do not really care how the overlay is constructed
  Peers are organized in a random graph topology
    E.g., a new node randomly chooses three existing nodes as neighbors
    Flat or hierarchical
  Build your p2p-service based on this graph
Several proposals
  Gnutella
  KaZaA/FastTrack
  BitTorrent
361
Unstructured p2p-overlays (contd)
Unstructured p2p-overlays are just a framework; you can build many applications on top of it
Unstructured p2p-overlays pros & cons
  Pros
    Very flexible: copes with node churn
    Supports complex queries (unlike structured overlays)
  Cons
    Content search is difficult: there is a tradeoff between generated traffic (overhead) and the horizon of the partial view
In the following we detail these applications:
  Skype
  BitTorrent
362
One Example of usage of unstructured overlays

Typical problem in unstructured overlays: how to do content search and query?
  Flooding
Example of flooding (similar to Gnutella):
[Figure: a "Search Britney Spears" query is flooded through the overlay; the node holding the entry answers "Found entry!", followed by notify and upload]
Limits on flooding
  Limited scope: send only to a subset of your neighbors
  Time-To-Live: limit the number of hops per message
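Flooding with a Time-To-Live can be sketched as a breadth-first traversal of the overlay graph. The topology, node names and content placement below are invented for the example, and the message count simply counts every forwarded copy of the query, including duplicates received by nodes that have already seen it.

```python
from collections import deque


def flood_search(graph, content, start, wanted, ttl):
    """Flood a query from start; return (nodes holding wanted, messages sent).

    graph: dict node -> list of neighbors. content: dict node -> set of items.
    ttl: maximum number of hops a query may travel.
    """
    visited = {start}
    queue = deque([(start, ttl)])
    hits, messages = [], 0
    while queue:
        node, remaining = queue.popleft()
        if wanted in content.get(node, ()):
            hits.append(node)
        if remaining == 0:
            continue                          # TTL expired: do not forward further
        for neighbor in graph[node]:
            messages += 1                     # every forwarded copy costs traffic
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, remaining - 1))
    return hits, messages


overlay = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C", "E"], "E": ["D"]}
holdings = {"D": {"song"}, "E": {"song"}}
near = flood_search(overlay, holdings, "A", "song", ttl=2)
far = flood_search(overlay, holdings, "A", "song", ttl=3)
```

A larger TTL widens the search horizon but generates more traffic, which is the tradeoff named among the cons of unstructured overlays.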
363
Survey of popular P2P applications
BitTorrent - Components
In the initial version of BitTorrent, a torrent is composed of:
A single content
  The content is cut down into pieces
  Pieces are cut down into blocks, which are the transmission units between peers
  The protocol only accounts for transferred pieces: partially received pieces cannot be served by a peer
A single central tracker
  The central tracker has
    the list of all participating peers (accessing or serving the file)
    the list of all pieces of the file, and their respective hash values
One or more seeds
  Seeds have the entire file
Many leechers
  Leechers download the file
365
BitTorrent Peer-set
Peer-set
  The list of neighbors a peer is allowed to communicate with
Peer-set construction
  Each peer (seed or leecher) contacts the tracker and gets a list of peers participating in the same session
  Typically 50 peers are chosen at random by the tracker for each peer
  The peer-set is augmented by peers connecting directly to you
  The peer-set size is limited to 80 peers
366
BitTorrent - Algorithms
Two components in the BitTorrent downloading algorithm:
  Peer selection determines from whom to download a piece
  Piece selection determines which piece to download
367
Tit for Tat
Based on the English saying meaning "equivalent retaliation" ("tip for tap"), an agent using this strategy will respond in kind to an opponent's previous action. If the opponent previously was cooperative, the agent is cooperative. If not, the agent is not. This strategy depends on the following conditions, which have allowed it to become the most prevalent strategy for the Prisoner's Dilemma:

1. Unless provoked, the agent will always cooperate
2. If provoked, the agent will retaliate
3. The agent is quick to forgive

368
BitTorrent - Peer selection

Choke algorithm
  Choking is a temporary refusal to upload
  Each peer unchokes a fixed number of peers (default = 4)
    3 peers on a tit-for-tat basis
    1 peer on an optimistic unchoke basis
369
BitTorrent - Peer selection (contd)

Tit-for-tat peer selection
  Select the 3 peers from which you downloaded the most and that are interested in your chunks
  Peer selection is done every 10 seconds, based on the download rates of the last 30 seconds
370

Optimistic unchoke peer selection
  Select one peer at random that is interested in your chunks, regardless of the current download rate from it
  Rotates every 30 seconds
  Reason:
    To discover currently unused connections that are better than the ones being used
    Corresponds to always cooperating on the first move in the prisoner's dilemma
371

Anti-snubbing
  When a remote peer has uploaded no data in 60 s, the local peer assumes that it has been snubbed
  In that case the local peer refuses to upload to it, except for the optimistic unchoking
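The peer selection rules above can be sketched as one round of the choke algorithm. This is a simplified illustration: the function name select_unchoked, the peer names and the rates are hypothetical, the 10 s and 30 s timers are left to the caller, and anti-snubbing is not modelled.

```python
import random


def select_unchoked(download_rates, interested, rng=random, regular_slots=3):
    """Return (tit_for_tat_peers, optimistic_peer) for one choke round.

    download_rates: dict peer -> rate received from that peer over the last 30 s.
    interested: set of peers that are interested in our pieces.
    """
    # Sort by name first so that ties are broken deterministically.
    ranked = sorted(sorted(interested), key=lambda p: download_rates.get(p, 0), reverse=True)
    regular = ranked[:regular_slots]          # best uploaders to us: tit-for-tat
    others = ranked[regular_slots:]
    optimistic = rng.choice(others) if others else None   # random, ignores the rate
    return regular, optimistic


rates = {"a": 50, "b": 10, "c": 30, "d": 40, "e": 0}
regular_peers, optimistic_peer = select_unchoked(
    rates, {"a", "b", "c", "d", "e"}, rng=random.Random(0)
)
```

The optimistic slot gives newcomers with nothing to offer yet a chance to obtain their first pieces.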
372
BitTorrent - Piece selection

Random first piece
  Only applies if the leecher has downloaded fewer than 4 pieces (chunks)
  Choose randomly the next piece to download
  Lets you quickly download your first pieces, so that you have pieces to reciprocate with in the choke algorithm
373
BitTorrent - Piece selection (contd)

Local rarest first policy
  Determine the pieces that are rarest among your peers and download those first
  Ensures that the most common pieces are left until the end of the download
  Rarest first also ensures that a large variety of pieces is downloaded from the seed
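Both piece selection policies can be combined in one short sketch. The function name pick_piece and the piece sets are hypothetical, and the threshold of 4 pieces for the random first policy is taken from the text above.

```python
import random
from collections import Counter


def pick_piece(have, peer_pieces, rng=random, random_first_threshold=4):
    """Choose the next piece to request, or None if no peer offers a new piece.

    have: set of piece indices already downloaded.
    peer_pieces: dict peer -> set of piece indices that the peer holds.
    """
    availability = Counter()
    for pieces in peer_pieces.values():
        availability.update(pieces - have)    # count copies of each missing piece
    if not availability:
        return None
    if len(have) < random_first_threshold:
        return rng.choice(sorted(availability))          # random first piece
    rarest = min(availability.values())
    return rng.choice(sorted(p for p, n in availability.items() if n == rarest))


peers = {"p1": {0, 4, 5}, "p2": {4, 5, 6}, "p3": {4}}
steady_state_choice = pick_piece({0, 1, 2, 3}, peers)    # piece 6 has a single copy
startup_choice = pick_piece(set(), peers, rng=random.Random(1))
```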
374
BitTorrent - Summary
Efficient file download thanks to simple incentive mechanisms
  Local rarest first
    High piece entropy
  Tit-for-tat
    Avoids free-riding
    Optimizes resource utilization
Space for improvement?
  Steady state very stable and efficient
  Startup phase still unstable with some inefficiencies
  Is there an advantage of deploying BitTorrent on Set-Top-Boxes?
  Is BitTorrent adapted to mobile terminals/DTN networks?
  Possible usage of network coding?
375
Skype Overlay
Protocol not fully understood today
  Proprietary protocol
  Content and control messages are encrypted
Protocol reuses concepts of the FastTrack overlay used by KaZaA
Builds upon an unstructured overlay
  Combines
    distributed index servers
    a flat unstructured network among index servers
  Two tier hierarchy
    Super Nodes (SN)
    Ordinary Nodes (ON)
376
Skype Overlay (contd)
Super Nodes (SN)
Connect to each other, building a flat unstructured overlay (similar to the Gnutella overlay)
Ordinary Nodes (ON)
Connect to Super Nodes that act as a directory server (similar to the index server in Napster)
Skype login server

Only central component
Stores and verifies usernames and passwords
Stores the buddy list
377
Skype Overlay (contd)
Skype login server

Message exchange during login for authentication
SN
ON
Neighbor relationship
378
How is the overlay constructed? - Super Node Lists

Each node keeps a host cache with a list of Super Nodes' IP addresses
  Up to 200 entries
Some Super Nodes' IP addresses are hard-coded
  Super Nodes provided by Skype
These lists are used to locate a node's Super Node at login
379
How is the overlay constructed? Login

Contact the login server and authenticate
Advertise your presence to other peers
  Contact a Super Node
  Contact your buddies (through the Super Node), and notify them of your presence
380
Super Nodes Index servers

Super Nodes are index servers
  I.e. index of locally connected Skype users (and their IP addresses)
If a buddy is not found in the local index of a Super Node
  Spread the node search to neighboring Super Nodes
  Not clear how this is implemented
    Possibly flood the request, similar to Gnutella
381
Super Nodes Relay nodes

Super Nodes also act as relay nodes
  Enables NAT traversal
  Avoids congested or faulty paths
382

Alice
would like to call Bob (or inversely)
Alice
Bob
383

Alice
would like to call Bob (or inversely)

Contact Call Relay Node
Alice
Skype relay node Bob
384
Super Node election
When does an ordinary node become a super node?
  High bandwidth, public IP address, but details not clear
  Highly dynamic
    Super Node churn, short Super Node session time
[Figure: churn and session time]
385
Super Node election

A
world map of Skype Super Nodes
386
Skype - Summary
VoIP has other requirements than file download
  Delay
  Jitter
The Skype network seems to handle these constraints in spite of
  High node churn
Protocol not fully understood
387
Conclusion and future of p2p
P2P Attracting Attentions from Commercial World

NBC Universal goes peer-to-peer (worldmedia.com)
BitTorrent raised $8.75 million in venture capital
  Teamed with CacheLogic to work for BT
Startups providing P2P live programs: pplive, coolstreaming
BBC legal download platform: iMP / Kontiki
  Allows users in the UK to download BBC TV and radio programs via a program guide for up to 7 days after broadcast
389
P2P Attracting Attentions from Commercial World
Microsoft is active
  Peer-to-Peer library
  Acquisition of Groove
  Avalanche
  RedCarpet
  P2P Windows update
Google and Apple are not using P2P... Yet (?)
  They face mounting costs with video
  Google
    Google video is online
    Bought YouTube
    Bought chinese p2p-company Xunlei Network Technology
  Apple
    iTunes changed the world of music
    Will it change the world of video?
    iTV will be a digital media adapter with HDD
390
Will P2P Go Beyond Desktop?

Current device requirement
  CPU, memory, and disk space requirement
  Platforms supported
  Internet connection requirement
Three categories of p2p application
  File downloading
    BitTorrent already on some Set-Top-Boxes and DSL routers
  Voice
    Skype on mobile phones
  Video
    Not yet
391
Will P2P Go Beyond Desktop? (Discussion)

Mobile P2P?
  What benefits does p2p offer on mobile devices?
    ???
  What are potential issues?
    Power
    Connection speed
    ???
P2P on set-top box?
  ???
Other consumer electronic devices?
  ???
392
Future of P2P - Ad-hoc P2P

Opportunistically use all available technologies!
Access the knowledge and resources of devices you cross in the street
GSM
Local P2P content search
  What is currently the best place to find a cab?
  What are the results of yesterday's soccer match?
393
Future of P2P - Ad-hoc P2P (contd)

Your requests or messages are stored and forwarded
  Enables p2p communication even if there is no direct path between two peers at a given moment in time
394
Conclusions and Future of P2P
More commercial P2P applications
  Conflicts between legal and illegal content sharing will continue
  More p2p used in commercial environments
    Reduce distribution cost and compete with illegal content
Secure P2P
Better performance
  More intelligent sharing
  More scalable
  Handle churn better
  Competing with other technology: YouTube
Supporting diversity: long tail content
Supporting community
Relationship with ISPs
Become ubiquitous application ??

395
Peer-to-peer media streaming
Growth of Internet traffic

[Figure: Cisco's global consumer Internet traffic forecast (2007), in PB/month, for 2007 to 2011, by category: IPTV, Video streaming, P2P, VoIP, Web/Data, Gaming, Other video]
397
What is IPTV?
IPTV (Internet Protocol Television) is a system where a digital television service is delivered by using Internet Protocol over a network infrastructure. IPTV is typically supplied by a service provider using a closed network infrastructure. This closed network approach is in competition with the delivery of TV content over the public Internet, called Internet Television. In businesses, IPTV may be used to deliver television content over corporate LANs.

398
What is peer-to-peer TV?

The term P2PTV refers to peer-to-peer (P2P) software applications designed to redistribute video streams in real time on a P2P network. The distributed video streams are typically TV channels from all over the world, but may also come from other sources.

399
Joost
Joost is a system for distributing TV shows and other forms of video using P2PTV technology
Created by the founders of Skype and Kazaa
Has signed up more than a million beta testers and is on track for an end-of-year launch
Uses H.264 video coding
400
Introduction
Advent of multimedia technology and broadband surge lead to:
  Excessive usage of P2P applications, which includes:
    Sharing of large videos over the Internet
  Video-on-Demand (VoD) applications
  P2P media streaming applications
BitTorrent-like P2P models are suitable for bulk file transfer
P2P file sharing has no issues like QoS:
  No need to play back the media in real time
  Downloading takes a long time; many users do it overnight
401
Introduction Contd.
P2P media streaming is non-trivial:
  Need to play back the media in real time
    Quality of Service
  Procure future media stream packets
    Needs reliable neighbors and effective management
  High churn rate: users join and leave in between
    Needs robust network topology to overcome churn
  Internet dynamics and congestion in the interior of the network
    Degrades QoS
  Fairness policies like tit-for-tat are extremely difficult to apply
    High bandwidth users have no incentive to contribute
402
P2P Media Streaming
Media streaming is extremely expensive
  1 hour of video encoded at 300 kbps = 128.7 MB
  Serving 1000 users would require 125.68 GB
A media server cannot serve everybody in the swarm
In P2P streaming:
  Peers form an overlay of nodes on top of the Internet
  Nodes in the overlay are connected by direct paths (virtual or logical links); in reality, they are connected by many physical links in the underlying network
  Nodes offer their uplink bandwidth while downloading and viewing the media content
  Takes load off the server
  Scalable
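The cost figures quoted above can be checked with a few lines of arithmetic. The sketch assumes that the slide uses binary units (1 MB = 2^20 bytes, 1 GB = 2^30 bytes) and decimal kilobits for the bit-rate; the function name is hypothetical.

```python
def stream_size_bytes(bitrate_bps, duration_s):
    """Number of bytes needed to carry a constant bit-rate stream."""
    return bitrate_bps * duration_s / 8


one_hour = stream_size_bytes(300_000, 3600)        # 135,000,000 bytes
size_mb = one_hour / 2**20                         # about 128.7 MB per viewer
total_gb = one_hour * 1000 / 2**30                 # about 125.7 GB for 1000 viewers
```

Under these assumptions the result agrees with the 128.7 MB figure; the 125.68 GB figure corresponds to dividing the rounded value 128.7 x 1000 by 1024.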
403
P2P Sharing
Content Distribution Tool
1 Server 2 3 5 4
File is chopped into pieces
3
404
Major Approaches
Major approaches
  Content Distribution Networks like Akamai
    Expensive: only large infrastructure can afford
  Client-Server Model
    Not scalable
  Application Layer Multicast
    Alternative to IP Multicast
  Peer-to-Peer Based
    Most viable and simple to use and deploy
    No setup cost
    Scalable
405
Content Distribution Networks (CDNs)
CDN nodes are deployed in multiple locations, often over multiple backbones
These nodes cooperate with each other to satisfy an end user's request
A user request is sent to the nearest CDN node, which has a cached copy
QoS improves as the end user receives the best possible connection
Yahoo mail uses Akamai
406
Roadmap to Internet media streaming

Media Streaming
  Application Layer Multicast
    Tree Based [NICE, ZigZag, SpreadIT]
    Mesh Based [ESM, Narada]
  Peer-to-Peer [CoolStreaming, PPLive, SOPCast, TV Ants, Feidian]
407
Application Layer Multicast (ALM)
Very sparse deployment of IP Multicast due to technical and administrative reasons
In ALM:
  Multicasting is implemented at end hosts instead of network routers
  Nodes form unicast channels or tunnels between them
  Overlay construction algorithms can be applied more easily at end hosts
  End hosts need a lot of bandwidth
Most ALM approaches form a tree-based topology:
  Simple to use
  Ineffective in case of churn and node failures, as it incurs high recovery time
408
ALM Methodologies
Tree Based
  Content flows from server to nodes in a tree-like fashion: every node forwards the content to its children, which in turn forward it to their children
  One point of failure for a complete subtree
  High recovery time
  Noted tree-based approaches: NICE, SpreadIT, ZigZag
Mesh Based
  Overcomes tree-based flaws
  Nodes maintain state information of many nodes
  High control overhead
  Noted mesh-based approaches include Narada and ESM from CMU
409
Tree Based ALM
410
Mesh Based ALM
411
Peer-to-Peer Streaming Models
Design flaws in ALM led to current-day P2P streaming models based on chunk-driven technology
Media content is broken down into small pieces and disseminated in the swarm
Neighboring nodes use a gossip protocol to exchange buffer information
Nodes trade unavailable pieces
Robust and scalable
Most noted approach in recent years: CoolStreaming
  PPLive, SOPCast, Feidian, TV Ants are derivatives of CoolStreaming
  Proprietary, and working philosophy not published
  Reverse-engineered, and measurement studies released
412
CoolStreaming

File is chopped by the server and disseminated in the swarm
A node, upon arrival, obtains a peer list of 40 nodes from the server
Nodes contact these nodes for media content
In steady state, every node typically has 4-8 neighbors; it periodically shares its buffer content map with its neighbors
Nodes exchange the unavailable content
Real-world deployed and highly successful system
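The buffer map exchange can be illustrated with a small scheduling sketch. This is not the published CoolStreaming scheduler: it is a simplified heuristic, assumed for illustration, that requests the segments with the fewest suppliers first and spreads requests over the least loaded neighbor. All names and segment numbers are hypothetical.

```python
def schedule_requests(my_map, neighbor_maps):
    """Decide which neighbor to ask for each missing segment.

    my_map: set of segment ids already buffered locally.
    neighbor_maps: dict neighbor -> set of segment ids advertised in its buffer map.
    Returns a dict segment id -> chosen neighbor.
    """
    suppliers = {}
    for neighbor, buffer_map in neighbor_maps.items():
        for segment in buffer_map - my_map:
            suppliers.setdefault(segment, []).append(neighbor)

    plan = {}
    load = {neighbor: 0 for neighbor in neighbor_maps}
    # Segments with the fewest potential suppliers are scheduled first.
    for segment in sorted(suppliers, key=lambda s: (len(suppliers[s]), s)):
        chosen = min(sorted(suppliers[segment]), key=lambda n: load[n])
        plan[segment] = chosen
        load[chosen] += 1
    return plan


request_plan = schedule_requests({1, 2}, {"A": {1, 2, 3, 4}, "B": {3, 5}})
```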
413
Metrics
Quality of Service
  Jitter-less transmission
  Low end-to-end latency
Uplink utilization
  High uplink throughput leads to scalable P2P systems
Robustness and Reliability
  Churn, node failure or departure should not affect QoS
Scalability
Fairness
  Determined in terms of content served (Share Ratio)
  No user should be forced to upload much more than what it has downloaded
Security
  Implicitly affects above metrics
414
Quality of Service

Most important metric
Jitter: unavailability of stream content at play time causes jitter
  Jitter-less transmission ensures good media playback
  Continuous supply of stream content ensures no jitter
Latency: difference in time between playback at server and user
  Lower latency keeps users interested
    A live event, e.g. a soccer match, would lose importance in crucial moments if the transmission is delayed
  Reducing hop count reduces latency
415
Uplink Utilization
Uplink is the scarcest and most important resource in the swarm
The sum of the uplinks of all nodes is the load taken off the server
Utilization = Uplink used / Uplink available
Needs effective node organization and topology to maximize uplink utilization
  High uplink throughput means more bandwidth in the swarm, and hence it leads to scalable P2P systems
416
Robustness and Reliability
A robust and reliable P2P system should be able to provide an acceptable level of QoS under the following conditions:
  High churn
  Node failure
  Congestion in the interior of the network
These conditions affect QoS
Efficient peering techniques and node topology ensure robust and reliable P2P networks
417
Scalability
Serve as many users as possible with an acceptable level of QoS
An increasing number of nodes should not degrade QoS
An effective overlay node topology and high uplink throughput ensure scalable systems
418
Fairness
Measured in terms of content served to the swarm
  Share Ratio = Uploaded Volume / Downloaded Volume
Randomness in the swarm causes severe disparity
  Many nodes upload a huge volume of content
  Many nodes get a free ride with no or very little contribution
There must be an incentive for an end user to contribute
P2P file sharing systems like BitTorrent use a tit-for-tat policy to stop free riding
This is not easy to use in streaming, as nodes procure pieces in real time and applying tit-for-tat can cause delays
419
Security
Implicitly affects other P2P streaming metrics
Mainly 4 types of attacks:
  Malicious garbled payload insertion
  Free rider: selfish user only downloads, with no uploads
  Whitewasher: after being kicked out, comes again with a new identity. Such nodes use IP spoofing
  DDoS attack: one or more nodes collectively launch a DoS attack on the media server to bring the system down
Many attacks on P2P file sharing systems, but very few on streaming
  The possibility cannot be ruled out
Current Issues
High buffering time
About half a minute for popular streaming channels and around 2 minutes for less popular ones
Some nodes lag behind their peers by more than 2 minutes in playback time
A better peering strategy is needed
Uneven distribution of uplink bandwidth (unfairness)
Huge volumes of cross-ISP traffic

ISPs use bandwidth throttling to limit bandwidth usage
This degrades the QoS perceived at the user end
Suboptimal uplink utilization

