
Video communication

Enrico Magli
Dept. of Electronics - Politecnico di Torino (Italy)
enrico.magli@polito.it http://www1.tlc.polito.it/sas-ipl

Credits
Parts of the material used in this course have been inspired by or taken from the work of the following people:
Yao Wang (Polytechnic University, New York, USA)
Bernd Girod (Stanford University, USA)
Dapeng Wu (Univ. of Florida, USA)
Y. Guo, C. Neumann (Thomson Inc.)
D. Purandare (Univ. of Central Florida, USA)

Information theory and compression

General transmission scheme

[Block diagram: Encoder/Modulator (source coding + channel coding) -> Channel -> Decoder/Demodulator (channel decoding + source decoding)]

Source coding
Typical signals contain redundancy
    Coding (representation) redundancy
    Correlation between adjacent samples
    Psychovisual redundancy
Goal:
    to reduce the intrinsic redundancy of the source signal
    to find a more compact signal representation

Notation
Let n1 and n2 be the number of information-carrying units in two data sets that represent the same information
Compression ratio: $CR = n_1 / n_2$
A compression algorithm searches for a representation with CR > 1

Coding redundancy
Data is not equal to information
Data is the means by which information is conveyed
The same story can be told with a different number of words, depending on whether the teller is long-winded or short and to the point!
This is the coding redundancy

Coding redundancy

White is the most likely value in this picture
Encoding each pixel with the same number of bits leads to coding redundancy
Variable length coding is the solution to coding redundancy

Coding redundancy
Let $p(r_k) = n_k / n$ be the probability of occurrence of gray level $r_k$, $k = 0, 1, 2, \ldots, L-1$
Let $r_k$ be represented by $l(r_k)$ bits; the average number of bits to represent each pixel is
$L_{ave} = \sum_{k=0}^{L-1} l(r_k)\, p(r_k)$
If $l(r_k) = m$ for every k, then $L_{ave} = m$

Coding redundancy
It makes sense to assign fewer bits to those r_k for which p(r_k) is larger
This achieves data compression, as L_ave becomes lower than with a fixed-length code
Variable length codes are used

Inter-pixel redundancy

Large areas of the image are uniform
This means correlation among pixels (adjacent pixels are almost the same)
This is not solved using variable length coding (which works on each single pixel)

Psychovisual redundancy
Human perception of the information of an image does not involve quantitative analysis of every pixel
Pixel values can be modified up to a given extent without significant subjective degradation
Modifications must involve psychovisually redundant information (not easy to define)
The image is irreversibly altered

Psychovisual redundancy

The brain searches for distinguishing features and mentally combines them into objects (recognizable groups of pixels)
Use of prior knowledge for interpretation (face, wall, poster)
If the wall were slightly different, the change would not be perceived

Data compression
Data compression algorithms can be divided into two categories:
    Lossless coding: the decoded signal is equal to the original (no coding errors)
    Lossy coding: a controlled amount of error is tolerated, according to human subjective perception

Lossless source coding


Mainly for data (e.g. PC files)
Sensitive applications (biomedical images, remote sensing data)
Output = input (no losses)
Represents the signal samples (or data codewords) in a more efficient way
Achievable compression ratios in the case of natural images: 2.5-3

Lossy source coding


A certain degree of distortion is accepted between output and input
Distortion should not be apparent
Lost data cannot be recovered
Much larger achievable compression ratios: 50 and more

Introduction
Information theory was invented by Claude Shannon in the 1940s
It has set the mathematical framework for digital communications
Information theory teaches us
    how to measure information
    how to represent information efficiently
    how to reliably transmit information across communication channels

Measuring information
A discrete memoryless source generates symbols from a set X of M elements (alphabet); each symbol is characterized by its probability of occurrence p_i
$X = \{x_i\}_{i=1}^{M}, \qquad \{p_i\}_{i=1}^{M}$
How do we measure the amount of information carried by message x_i?

Properties of information
The amount of information carried by a message decreases as its probability increases
$I(x_j) > I(x_i)$ if $p_j < p_i$
For statistically independent messages, information is additive:
$P(x_i, x_j) = P(x_i) P(x_j) \Rightarrow I(x_i, x_j) = I(x_i) + I(x_j)$

Definition of information

$I(x_i) = \log_2 \frac{1}{p_i}$
and is measured in bits
Average amount of information carried by the memoryless source: entropy (bits/symbol)
$H(X) = \sum_{i=1}^{M} p_i I(x_i) = \sum_{i=1}^{M} p_i \log_2 \frac{1}{p_i}$

Examples
Source 1: X = {x1, x2, x3, x4}, p1 = 1/2, p2 = 1/4, p3 = 1/8, p4 = 1/8: H(X) = 1.75 bits/symbol
Source 2: X = {x1, x2, x3, x4}, p1 = p2 = p3 = p4 = 1/4: H(X) = 2 bits/symbol
Equiprobable sources carry more information, and are more difficult to compress
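The two entropy values above can be checked numerically. The following is a minimal sketch; the function name `entropy` and the variable names are illustrative choices, not part of the course material.

```python
import math

def entropy(probabilities):
    """First-order entropy H(X), in bits/symbol, of a memoryless source."""
    # Symbols with zero probability contribute nothing to the sum.
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)

skewed = [1 / 2, 1 / 4, 1 / 8, 1 / 8]   # first example source
uniform = [1 / 4] * 4                   # second (equiprobable) source

print(entropy(skewed))    # entropy of the skewed source
print(entropy(uniform))   # entropy of the equiprobable source
```

The equiprobable source reaches log2(4) = 2 bits/symbol, the largest entropy a 4-symbol alphabet can have.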

Bounds on Entropy
Theorem: the first order entropy of a memoryless M-symbol alphabet is limited by
$H(X) \le \log_2 M$
Example: 8 bit quantizer ($M = 2^8$)
$H(X) \le 8$ bit/symbol
with equality if the symbols are equiprobable

Noiseless coding theorem


(Shannon) It is possible to code without any loss of information a memoryless source alphabet with entropy H(X) bits/symbol, using $H(X) + \epsilon$ bits/symbol
$\epsilon$ is a quantity that can be made arbitrarily small by considering increasingly larger blocks of symbols to be coded

Bounds on lossless coding


The average codeword length of a lossless coder cannot be less than the entropy
Entropy represents the target average number of bits/symbol of a lossless encoder
Coding efficiency: $\eta = H(X) / n$
where n is the average codeword length

if not memoryless
If the source is correlated, the first order entropy does not represent a bound on the average codeword length
All the previous results still hold, replacing first order entropy with the entropy rate $H_\infty(X)$, which takes into account correlation among symbols
$H_\infty(X) = \lim_{N \to \infty} \frac{1}{N} H_N(X)$
$H_N(X) = -\sum_{i_1=1}^{M} \sum_{i_2=1}^{M} \cdots \sum_{i_N=1}^{M} P(x_1 = i_1, x_2 = i_2, \ldots, x_N = i_N) \log_2 P(x_1 = i_1, x_2 = i_2, \ldots, x_N = i_N)$

Lossless - Introduction

The goal of lossless compression is to minimize the average length of the compressed symbols, exploiting statistical properties of the data
    probability distribution
    correlation (redundancy) of the data
No distortion is accepted: minimize the rate for zero distortion

Introduction to lossy compression


Advantages of lossy compression:
    Higher compression ratios
    Low distortion of the decoded data
    Possibility to shape the error between original and decoded data
Disadvantages:
    The decoded signal does not exactly match the original

Examples of lossy compression


Speech/Audio:
    GSM speech compression
    MP3 audio
Image compression:
    JPEG
    JPEG2000
Video compression:
    MPEG-2
    H.263

Lossy compression

Consider an i.i.d. discrete-time random process X
Main difference with respect to lossless compression: we accept some distortion, i.e. we reconstruct $X^* \ne X$
A single letter distortion measure for a length-m data vector is defined as
$\rho_m(x, x^*) = \frac{1}{m} \sum_{j=0}^{m-1} \rho(x_j, x_j^*)$
with $\rho(\cdot)$ a nonnegative component by component distortion measure

Examples of distortion measures


Hamming metric
$\rho(x_i, x_j^*) = 0$ if $x_i = x_j^*$, and $\rho(x_i, x_j^*) = 1$ if $x_i \ne x_j^*$
Euclidean metric
$\rho(x_i, x_j^*) = (x_i - x_j^*)^2$
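As a small illustration, the two component measures can be plugged into the length-m average defined earlier. This is only a sketch; the function names and the sample vectors are made up for the example.

```python
def hamming(x, x_star):
    """Hamming component distortion: 0 if equal, 1 otherwise."""
    return 0 if x == x_star else 1

def squared_error(x, x_star):
    """Euclidean (squared error) component distortion."""
    return (x - x_star) ** 2

def average_distortion(x, x_star, rho):
    """Single letter distortion of a length-m vector: mean of rho over the components."""
    if len(x) != len(x_star):
        raise ValueError("vectors must have the same length")
    return sum(rho(a, b) for a, b in zip(x, x_star)) / len(x)

original = [3, 5, 2, 8]   # hypothetical source samples
decoded = [3, 4, 2, 6]    # hypothetical reconstruction

print(average_distortion(original, decoded, hamming))
print(average_distortion(original, decoded, squared_error))
```

With the squared error measure, the averaged quantity is the usual mean squared error (MSE).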

Source code
A source code Q describes a source X by an approximation X*, such that:
    the distortion between X and X* is equal to D
    the rate necessary to transmit X* losslessly is R
[Block diagram: X -> Quantizer -> I* -> Entropy coder -> (rate R bit/sample) -> Entropy decoder -> I* -> Dequantizer -> X*; distortion D is measured between X and X*]

Rate-distortion function
Given a desired expected distortion $E[\rho_m(X, X^*)] \le D$, the rate-distortion function R(D) is the minimum rate at which we can guarantee the existence of a source code that represents X with X*, so that X* is encoded with a rate of R bit/sample, and $E[\rho_m(X, X^*)] \le D$

Example of R-D function


[Figure: R-D curve (rate R), with the points corresponding to lossless coding and to maximum distortion marked]

Operational R-D function


In practice the R-D function is very difficult to compute for realistic sources
Usually one employs the operational R-D function, which is the set of practically achievable R-D points for a given sample realization of the source and a specific code

Operational R-D function


Example: consider compressing an image with JPEG at different quality factors
[Figure: operational R-D curve, rate R versus distortion D]
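A rough sketch of how an operational R-D curve can be traced. Instead of JPEG (which would need an image library), it assumes a much simpler code: a uniform scalar quantizer followed by an ideal entropy coder, whose rate is approximated by the empirical entropy of the quantization indices. The Gaussian test samples, step sizes, and function names are all assumptions made for this example.

```python
import math
import random
from collections import Counter

def empirical_entropy(symbols):
    """Empirical first-order entropy (bits/sample) of a sequence of symbols."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def operational_rd_point(samples, step):
    """Return (rate, distortion) obtained with a uniform quantizer of the given step."""
    indices = [round(s / step) for s in samples]
    reconstruction = [i * step for i in indices]
    rate = empirical_entropy(indices)                        # bits/sample
    distortion = sum((s - r) ** 2
                     for s, r in zip(samples, reconstruction)) / len(samples)  # MSE
    return rate, distortion

random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(5000)]      # one sample realization
rd_curve = [operational_rd_point(samples, step) for step in (0.25, 0.5, 1.0, 2.0)]
for rate, distortion in rd_curve:
    print(f"R = {rate:.3f} bit/sample, D = {distortion:.4f}")
```

Each step size gives one operating point; coarser steps move along the curve toward lower rate and higher distortion, which is what varying the JPEG quality factor does.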

Huffman Coding
Problem statement
    Given a source X emitting symbols a_i with probability P(a_i)
    Find a compact representation c(a_i) = c_i
Objective
    If l_i represents the length of codeword c_i, we want to minimize the average length
$\bar{l} = \sum_i l_i P(a_i)$

Huffman Coding

The average length $\bar{l} = \sum_i l_i P(a_i)$ will be minimized if
$\forall a_i, a_j : P(a_i) \ge P(a_j) \Rightarrow l_i \le l_j$
Variable length coding (VLC): the shortest codewords are allocated to the most probable symbols

Huffman Coding

We must guarantee that the codewords c(a_i) = c_i are uniquely decodable
Huffman coding is based on the idea of prefix free coding
No codeword can be a prefix of another codeword: for all c_i, c_j with $l_i \le l_j$ and $i \ne j$, c_i must differ from the first l_i bits of c_j

Huffman Coding

Example: code 1 is not prefix free, code 2 is prefix free
symbol   code 1   code 2
a        0        0
b        01       10
c        11       11
[Figure: binary trees of code 1 and code 2]
Prefix free codes are represented with a binary tree where internal nodes do not represent codewords (the codewords are only the leaves of the tree)
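The prefix free property of the two example codes can be checked mechanically. This is a minimal sketch; `is_prefix_free` is a name chosen here, and it relies on the fact that, after lexicographic sorting, a codeword that is a prefix of another one is immediately followed by a word that starts with it.

```python
def is_prefix_free(codewords):
    """Return True if no codeword is a prefix of another codeword."""
    words = sorted(codewords)
    # Only neighbors in sorted order need to be compared.
    return not any(b.startswith(a) for a, b in zip(words, words[1:]))

code1 = {"a": "0", "b": "01", "c": "11"}   # "0" is a prefix of "01"
code2 = {"a": "0", "b": "10", "c": "11"}

print(is_prefix_free(code1.values()))
print(is_prefix_free(code2.values()))
```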

Huffman Coding
Huffman code construction was proposed in D. A. Huffman, "A method for the construction of minimum-redundancy codes", Proc. of the IRE, 1952

Code construction example


Memoryless source X = {a, b, c, d, e}, P(a_i) = {0.4, 0.2, 0.2, 0.1, 0.1}
The two least probable symbols are grouped to form a tree node
The sum of the probabilities of the two symbols is attributed to the tree node: P({d,e}) = P(d) + P(e) = 0.2
[Figure: symbols a (0.4), b (0.2), c (0.2) and the new node {d, e} (0.2)]

Code construction example


The procedure is iterated considering both the remaining symbols and the created trees
[Figure: Huffman tree built by successive merges, with node probabilities 0.2, 0.4, 0.6 and 1.0]
Resulting code:
symbol (probability)   codeword
a (0.4)                1
b (0.2)                01
c (0.2)                000
d (0.1)                0010
e (0.1)                0011
$\bar{l} = \sum_i l_i P(a_i) = 2.2$ bps (bits per symbol)
$H(X) = -\sum_i P(a_i) \log_2 P(a_i) = 2.122$ bps
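The construction can be sketched with a priority queue. This is an illustrative implementation, not the course's own code: the function name `huffman_code` and the 0/1 labeling of branches are arbitrary, and a different tie-breaking rule may give different codewords than the slide while keeping the same average length.

```python
import heapq
import itertools
import math

def huffman_code(probabilities):
    """Build a Huffman code for a dict {symbol: probability}; returns {symbol: bitstring}."""
    tie = itertools.count()  # unique tie-breaker so that dicts are never compared
    heap = [(p, next(tie), {s: ""}) for s, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, group0 = heapq.heappop(heap)   # least probable node
        p1, _, group1 = heapq.heappop(heap)   # second least probable node
        merged = {s: "0" + w for s, w in group0.items()}
        merged.update({s: "1" + w for s, w in group1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))  # node gets the summed probability
    return heap[0][2]

source = {"a": 0.4, "b": 0.2, "c": 0.2, "d": 0.1, "e": 0.1}
code = huffman_code(source)
average_length = sum(source[s] * len(w) for s, w in code.items())
source_entropy = -sum(p * math.log2(p) for p in source.values())
print(code)
print(f"average length = {average_length:.3f} bps, entropy = {source_entropy:.3f} bps")
```

For this source, every valid Huffman tree yields an average length of 2.2 bits per symbol, slightly above the entropy of about 2.122.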

Huffman Coding
Coding efficiency
$H(X) \le \bar{l} < H(X) + 1$
Tighter upper bounds are available given the maximum probability value p_M:
$\bar{l} < H(X) + p_M$, if $p_M \ge 0.5$
$\bar{l} < H(X) + p_M + 0.086$, if $p_M < 0.5$

Huffman Coding
Coding efficiency can be poor with a small alphabet with unbalanced probabilities (p_M > 0.5)
symbol (probability)   codeword
a (0.8)                0
b (0.18)               11
c (0.02)               10
H(X) = 0.816 bps, l = 1.2 bps

Extended Huffman Coding


Extended Huffman coding is obtained by coding n-tuples of symbols
Example (n = 2), H(X) = 0.816 bps
Single symbols (l = 1.2 bps):
a (0.8)       0
b (0.18)      11
c (0.02)      10
Pairs of symbols (l = 0.8614 bps):
aa (0.64)     0
ab (0.144)    11
ac (0.016)    10101
ba (0.144)    100
bb (0.0324)   1011
bc (0.0036)   10100100
ca (0.016)    101000
cb (0.0036)   1010011
cc (0.0004)   10100101
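The figure of 0.8614 bits per symbol can be verified directly from the pair table. The sketch below just redoes the arithmetic; the dictionary layout is an arbitrary choice for the example.

```python
import math

# (probability, codeword) for each pair of symbols, as listed in the example
pairs = {
    "aa": (0.64, "0"),        "ab": (0.144, "11"),        "ac": (0.016, "10101"),
    "ba": (0.144, "100"),     "bb": (0.0324, "1011"),     "bc": (0.0036, "10100100"),
    "ca": (0.016, "101000"),  "cb": (0.0036, "1010011"),  "cc": (0.0004, "10100101"),
}

bits_per_pair = sum(p * len(word) for p, word in pairs.values())
bits_per_symbol = bits_per_pair / 2          # each codeword carries n = 2 symbols
source_entropy = -sum(p * math.log2(p) for p in (0.8, 0.18, 0.02))

print(f"{bits_per_symbol:.4f} bps (entropy {source_entropy:.3f} bps)")
```

Coding pairs brings the average length from 1.2 bps down to about 0.861 bps, much closer to the entropy of about 0.816 bps.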

Huffman Coding
Extended Huffman coding efficiency is
$H(X) \le \bar{l} < H(X) + \frac{1}{n}$
(where n is the number of grouped symbols)

Simplified VLC
An easy and sub-optimal VLC coding technique is known as run-length coding
It is based on the assumption that a given symbol is repeated in long runs
    Fax, B/W images
The symbol and the length of its run are coded
Example
    X = 000000100000000010000001
    Code: 6, 9, 6
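A minimal sketch of the run-length example, under the convention suggested by it: each coded value is the length of a run of zeros terminated by a single 1. The function names, and the decision to reject input that ends with unterminated zeros, are assumptions of this sketch.

```python
def zero_run_lengths(bits):
    """Encode a binary string as the lengths of the zero runs preceding each 1."""
    runs, run = [], 0
    for b in bits:
        if b == "0":
            run += 1
        elif b == "1":
            runs.append(run)   # a 1 closes the current run of zeros
            run = 0
        else:
            raise ValueError("input must contain only '0' and '1'")
    if run:
        raise ValueError("trailing zeros without a terminating 1 are not handled here")
    return runs

def expand_runs(runs):
    """Inverse operation: rebuild the binary string from the run lengths."""
    return "".join("0" * r + "1" for r in runs)

x = "000000100000000010000001"
print(zero_run_lengths(x))
```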

Basics of images

Light is part of the EM wave

Illuminating and reflecting light

Human Eye
[Figure labels: cones, rods]

Human Eye

Human eye: some features

The range of intensity that we can perceive is impressive (on the order of 10^10)
The HVS cannot operate over such a range simultaneously
Brightness adaptation is used
Brightness discrimination is poor at low levels of illumination (Weber law)
Sensitive to edges (high contrast zones)

Colors

Sensing colors

The 7 million cones in the human eye can be divided into 3 categories, able to sense red (R), green (G), blue (B)
RGB color model

Trichromatic color mixing

RGB vs. CMY

Color representation models

YCbCr color space

An important color space for video applications is the so-called YCbCr
Luminance: Y = 0.299 R + 0.587 G + 0.114 B
Chrominance: Cb = B - Y, Cr = R - Y
Y corresponds to the black and white TV signal
Cb/Cr can be used by color TV to generate R, G, B
The HVS is much less sensitive to Cb, Cr (they can be compressed to a large extent without impairing the perceived quality)
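A small sketch of the color transform, using the ITU-R BT.601 luma weights 0.299, 0.587, 0.114 and the plain differences Cb = B - Y, Cr = R - Y. Practical standards additionally scale and offset Cb and Cr; that step is deliberately left out here, and the function names are illustrative.

```python
def rgb_to_ycbcr(r, g, b):
    """Luminance and (unscaled) chrominance differences of an RGB triplet."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, b - y, r - y

def ycbcr_to_rgb(y, cb, cr):
    """Inverse transform: recover R and B from the differences, then solve for G."""
    r = cr + y
    b = cb + y
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b

print(rgb_to_ycbcr(0.5, 0.5, 0.5))   # a gray pixel has (almost exactly) zero chrominance
print(rgb_to_ycbcr(1.0, 0.0, 0.0))   # pure red
```

Gray pixels carry no chrominance, which is why Y alone is enough for a black and white display.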

Image Transforms part I

Outline
Introduction
Fourier Transform
DFT

Introduction
An image can be described in space or frequency
Spatial frequency: the rate of change of an image
Representation in space domain: picture = collection of brightness levels
Representation in frequency domain: picture = collection of spatial frequency components

Space vs. frequency

[Figure: four example images: dark / low frequency, dark / high frequency, bright / low frequency, bright / high frequency]

Fourier Transform

The Fourier Transform is used to decompose an image into sine and cosine components
Used in a wide range of applications: image analysis, filtering, reconstruction and compression
As we are only concerned with digital images, we will only consider the (2D) Discrete Fourier Transform (DFT)

Example of image frequency representation


Images that are pure cosines have a particularly simple FT
[Figure: pure horizontal cosine of 8 cycles and pure vertical cosine of 32 cycles, with their FTs]
The FT just has a single component, represented by 2 bright spots symmetrically placed about the center of the FT image
The center of the image is the origin of the frequency coordinate system.

Example of image frequency representation

Images of 2D cosines with both horizontal and vertical components: (left) 4 cycles horizontally and 16 cycles vertically; (right) 32 cycles horizontally and 2 cycles vertically
For real images, the FT is symmetrical about the origin, so the 1st and 3rd (2nd and 4th) quadrants are the same
If the image is symmetrical about the x-axis, 4-fold symmetry results.

Discrete Fourier Transform


The DFT is the sampled Fourier Transform and therefore does not contain all frequencies forming an image, but only a set of samples which is large enough to fully describe the spatial domain image.
The number of frequencies corresponds to the number of pixels in the spatial domain image, i.e. the images in the spatial and Fourier domains are of the same size

Two-dimensional DFT
A square image x(n,m) of size N x N has the two-dimensional DFT (2-D DFT):
$F(k,l) = \frac{1}{N^2} \sum_{n=0}^{N-1} \sum_{m=0}^{N-1} x(n,m) \exp\left[-j 2\pi \left(\frac{kn}{N} + \frac{lm}{N}\right)\right]$
F(k,l) is obtained by multiplying the image with the corresponding base function and summing the result.

Two-dimensional DFT
The base functions are sine and cosine waves with increasing frequencies
F(0,0) represents the DC component, which corresponds to the average brightness, and F(N-1,N-1) represents the highest frequency.

Separability of 2-D DFT


A double sum has to be calculated for each image point. However, because the DFT is separable, it can be written as
$F(k,l) = \frac{1}{N} \sum_{m=0}^{N-1} P(k,m) \exp\left(-j 2\pi \frac{lm}{N}\right)$
$P(k,m) = \frac{1}{N} \sum_{n=0}^{N-1} x(n,m) \exp\left(-j 2\pi \frac{kn}{N}\right)$

Separability of 2-D DFT


The spatial domain image is first transformed into an intermediate image using 1-D DFT applied to the rows
This intermediate image is then transformed into the final image, again using 1-D DFT applied to the columns
This procedure decreases the number of required computations
Complexity of 2-D DFT (with fast 1-D transforms): $O(N^2 \log_2 N)$
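The separability property can be checked on a tiny image by comparing the direct double sum with two passes of a 1-D DFT. This is a didactic sketch with plain (slow) sums rather than an FFT; the function names and the 4 x 4 test image are invented for the example, and the 1/N factor is placed in each 1-D pass so that the total normalization is 1/N^2.

```python
import cmath

def dft_1d(v):
    """1-D DFT with a 1/N normalization factor."""
    n = len(v)
    return [sum(v[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)) / n
            for k in range(n)]

def dft_2d_direct(image):
    """2-D DFT computed as a double sum for every (k, l)."""
    size = len(image)
    return [[sum(image[n][m] * cmath.exp(-2j * cmath.pi * (k * n + l * m) / size)
                 for n in range(size) for m in range(size)) / size ** 2
             for l in range(size)] for k in range(size)]

def dft_2d_separable(image):
    """2-D DFT computed as 1-D DFTs along n, then 1-D DFTs along m."""
    size = len(image)
    # First pass: P(k, m), one 1-D DFT over n for each fixed m.
    p = [dft_1d([image[n][m] for n in range(size)]) for m in range(size)]  # p[m][k]
    # Second pass: F(k, l), one 1-D DFT over m for each fixed k.
    return [dft_1d([p[m][k] for m in range(size)]) for k in range(size)]

image = [[1, 2, 3, 4], [0, 1, 0, 1], [5, 5, 5, 5], [2, 0, 2, 0]]
direct = dft_2d_direct(image)
separable = dft_2d_separable(image)
print(abs(direct[0][0]))   # DC component: the average brightness of the image
```

The direct form needs on the order of N^4 operations, the separable form N^3, and replacing each 1-D DFT with an FFT gives the N^2 log2 N figure.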

Properties of 2-D DFT


The DFT produces a complex valued image
It is displayed with two images, typically magnitude and phase. Only the magnitude is usually displayed
The Fourier domain image has a much greater range than the image in the spatial domain. Hence, its values are usually calculated and stored as floating point values and displayed on a logarithmic scale

Magnitude and phase spectra

The images are horizontal cosines of 8 cycles, differing only by a 1/2 cycle lateral shift
Both have the same magnitude spectrum. The phase spectrum would be different, of course.

Inversion of 2-D DFT


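The Fourier image can be re-transformed to the spatial domain:
$x(n,m) = \sum_{k=0}^{N-1} \sum_{l=0}^{N-1} F(k,l) \exp\left[j 2\pi \left(\frac{kn}{N} + \frac{lm}{N}\right)\right]$
(no further normalization is needed here when the factor $1/N^2$ is already included in the forward transform)
Both amplitude and phase information are relevant for the reconstruction of the image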

Effect of phase on reconstruction

[Figure: images (a) and (b), and a third image reconstructed from the frequency domain using amplitude information from (b) and phase information from (a)]

2-D DFT: example 1


[Figure: (a) image; (b) section A-B; (c) 1-D FFT of section A-B; (d) 2-D FFT of image]

2-D DFT : example 2

[Figure: (a) chest radiograph; (b) 2-D Fourier spectrum of (a)]
broad range of spatial frequencies
significant vertical and horizontal features, due to ribs and vertebral column

2-D DFT: example 3


The DFTs tend to have bright lines perpendicular to lines in the original letter. If the letter has circular segments, then so does the FT.


2-D DFT: example 4


The concentric ring structure in the DFT of the white pellets image is due to each individual pellet. If we took the DFT of just one pellet, we would still get this pattern.
The fact that there are many pellets, and the information about exactly where each one is, is contained mostly in the phase
The coffee beans have less symmetry and are more variably colored, so they do not show the same ring structure.
You may be able to detect a faint "halo" in the coffee DFT. What do you think this is from?

2-D DFT: example 5

The girl looks very similar to the ape except for the hat

Effect of edge between hat and hair


2-D DFT: example 6

The first image is all black except for a single pixel wide stripe from the top left to the bottom right The second image is totally random


General transform coding scheme

[Block diagram: pixel values X -> Reversible transform -> Y -> Quantization (bit allocation) -> Entropy coding]
Why do we need to introduce a transform domain?
The objective is to represent the original data X in a new domain Y, more suitable for quantization and coding

General transform coding scheme


[Block diagram: X -> Reversible transform -> Y -> Quantization (bit allocation) -> Entropy coding]
Quantization (lossy coding only) depends on
    desired bit rate
    statistics of the various transformed coefficients
    distortion of the reconstructed signal
Entropy coding
    Any binary encoding technique (run-length, Huffman, arithmetic, ...)

Transform Coding
Transforms are able to decorrelate data
The coefficients in the transformed domain are more suitable for the subsequent quantization operation
    In the transformed domain few coefficients concentrate most of the signal energy
    Coefficients are decorrelated, therefore scalar quantization is nearly optimum

The Karhunen-Loeve Transform


Also called the Hotelling Transform
The KLT is a data dependent transform
Let X denote a random data vector of length N, m be its (vector) mean value and C be its N x N covariance matrix:
$C = E\{(X - m)(X - m)^T\}$

The Karhunen-Loeve Transform


The matrix C is real and symmetric, and hence can be diagonalized using its eigenvectors
The eigenvectors e_i of C are given by
$C e_i = \lambda_i e_i$
where $\lambda_i$ are the corresponding eigenvalues

The Karhunen-Loeve Transform


Let us consider a matrix A whose columns correspond to the eigenvectors of C, arranged in decreasing eigenvalue order
Let us consider the transformation
$Y = A^T (X - m)$
Y is zero mean and has covariance matrix:
$C_y = E\{Y Y^T\} = E\{A^T (X - m)(X - m)^T A\} = A^T C A = \Lambda$
where $\Lambda$ is the diagonal eigenvalue matrix

The Karhunen-Loeve Transform


The elements in the transformed domain are uncorrelated
If only the top K coefficients are kept, corresponding to the K largest eigenvalues, the mean square error between the original vector X and its reconstruction from the truncated Y is theoretically minimum
The KLT sets the bound on compression efficiency among linear transforms, but it is data dependent and computationally very expensive
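For vectors of length 2 the KLT can be written in closed form, which makes the decorrelation property easy to check. The sketch below is only an illustration under that restriction: the eigenvectors of the 2 x 2 covariance matrix are obtained from the rotation angle of the principal axis, and the correlated test data, function names, and random seed are all invented for the example.

```python
import math
import random

def mean_and_covariance(points):
    """Sample mean and 2 x 2 covariance matrix of a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    cxx = sum((x - mx) ** 2 for x, _ in points) / n
    cyy = sum((y - my) ** 2 for _, y in points) / n
    cxy = sum((x - mx) * (y - my) for x, y in points) / n
    return (mx, my), ((cxx, cxy), (cxy, cyy))

def klt_2d(points):
    """Apply Y = A^T (X - m), with the columns of A the eigenvectors of C."""
    (mx, my), ((cxx, cxy), (_, cyy)) = mean_and_covariance(points)
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)   # direction of the principal axis
    c, s = math.cos(theta), math.sin(theta)
    # First output component: projection on the eigenvector with the largest eigenvalue.
    return [((x - mx) * c + (y - my) * s, -(x - mx) * s + (y - my) * c)
            for x, y in points]

random.seed(1)
points = []
for _ in range(2000):
    a = random.gauss(0.0, 1.0)
    points.append((a, 0.9 * a + 0.2 * random.gauss(0.0, 1.0)))  # strongly correlated pair

_, cov_x = mean_and_covariance(points)
_, cov_y = mean_and_covariance(klt_2d(points))
print("covariance before:", cov_x)
print("covariance after: ", cov_y)
```

After the transform the off-diagonal covariance vanishes and most of the energy sits in the first coefficient, which is the property exploited by transform coding.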

Discrete Cosine Transform


The 1-D discrete cosine transform (DCT) is defined as
$C(u) = \alpha(u) \sum_{x=0}^{N-1} f(x) \cos\left[\frac{(2x+1) u \pi}{2N}\right], \quad u = 0, 1, \ldots, N-1$
$\alpha(0) = \sqrt{1/N}, \qquad \alpha(u) = \sqrt{2/N}, \quad u = 1, \ldots, N-1$

Inverse DCT
Similarly, the Inverse DCT (IDCT) is defined as
$f(x) = \sum_{u=0}^{N-1} \alpha(u) C(u) \cos\left[\frac{(2x+1) u \pi}{2N}\right], \quad x = 0, 1, \ldots, N-1$
with $\alpha(u)$ defined as before
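The DCT and IDCT definitions translate almost literally into code. The following sketch uses plain sums (no fast algorithm) and function names chosen for this example.

```python
import math

def alpha(u, n):
    """Normalization factor of the orthonormal DCT."""
    return math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)

def dct(f):
    """1-D DCT of a sequence f(0), ..., f(N-1)."""
    n = len(f)
    return [alpha(u, n) * sum(f[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              for x in range(n))
            for u in range(n)]

def idct(c):
    """Inverse 1-D DCT."""
    n = len(c)
    return [sum(alpha(u, n) * c[u] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                for u in range(n))
            for x in range(n)]

ramp = [10, 12, 14, 16, 18, 20, 22, 24]
flat = [5] * 8
print([round(v, 3) for v in dct(ramp)])   # energy concentrated in the first coefficients
print([round(v, 3) for v in dct(flat)])   # only the DC coefficient is nonzero
```

A constant block maps to a single nonzero (DC) coefficient, which is the simplest case of the energy compaction exploited by JPEG.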

2-D DCT
The two-dimensional DCT is obtained by applying the 1-D transform to the rows and columns independently
The corresponding transform is
$C(u,v) = \alpha(u) \alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y) \cos\left[\frac{(2x+1) u \pi}{2N}\right] \cos\left[\frac{(2y+1) v \pi}{2N}\right], \quad u, v = 0, 1, \ldots, N-1$

Inverse 2-D DCT


Analogously, the inverse 2-D transform is
$f(x,y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \alpha(u) \alpha(v) C(u,v) \cos\left[\frac{(2x+1) u \pi}{2N}\right] \cos\left[\frac{(2y+1) v \pi}{2N}\right], \quad x, y = 0, 1, \ldots, N-1$

DCT basis functions


Basis functions of the 8x8 DCT
When it is applied to an 8x8 image, it yields an 8x8 matrix of weighted values corresponding to how much of each basis function is present in the image
An 8x8 image that just contains one shade of gray will yield only a weighted value for the upper left hand DCT basis function (which has no frequencies in the x or y direction).

2-D DCT


Transform Coding: DCT


For $N \to \infty$, the DCT tends to diagonalize the covariance matrix, i.e. it tends to the KLT
The input data stream must be divided into blocks before applying the transform
The correlation across the block boundaries is not removed

Example of 2-D DCT


[Figure: image and its DCT]

Test image: Lenna

Interpretation of DCT basis functions


The top-left basis function represents zero spatial frequency (DC coefficient)
Along the top row the basis functions have increasing horizontal spatial frequency content.
Down the left column the functions have increasing vertical spatial frequency content.

DFT vs. DCT periodicity


[Figure: DFT periodicity (with discontinuities at the period boundaries) versus DCT periodicity (period 2n)]

Why DCT not FFT?

DCT can approximate lines well with fewer coefficients
Blocking artifacts less pronounced
Better approximation to the KLT
Used in the JPEG standard

DFT (example)

[Figure: image reconstructed from the DFT with 25% of the samples retained, and absolute error (MSE = 5.1345)]

Quantization

Goal of quantization: to represent a real number in $(-\infty, +\infty)$ as an integer number, i.e. an element of a discrete and finite set of $2^N$ possible values (N bit quantizer).
Bit rate: $B = N f_s$, where $f_s$ is the sampling frequency

Uniform Quantization
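[Figure: input-output characteristic of a uniform quantizer: decision thresholds x_i, x_i+1 on the original signal axis, reconstruction levels y_i, y_i+1 on the quantized signal axis]
Errors: granularity and overload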
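The two kinds of error can be seen with a small sketch of an N bit uniform quantizer. The mid-rise layout over the range [-x_max, x_max), the clipping rule, and all names are assumptions made for this example.

```python
import math

def uniform_quantize(x, n_bits, x_max):
    """Quantize x with 2**n_bits uniform levels over [-x_max, x_max).

    Returns (index, reconstruction). Inputs outside the range are clipped
    to the nearest outer level (overload).
    """
    levels = 2 ** n_bits
    step = 2.0 * x_max / levels
    index = math.floor((x + x_max) / step)
    index = min(max(index, 0), levels - 1)          # clipping produces overload error
    reconstruction = -x_max + (index + 0.5) * step  # midpoint of the chosen interval
    return index, reconstruction

n_bits, x_max = 3, 1.0
step = 2.0 * x_max / 2 ** n_bits
for x in (0.1, -0.6, 3.0):
    index, y = uniform_quantize(x, n_bits, x_max)
    kind = "granular" if abs(x - y) <= step / 2 else "overload"
    print(f"x = {x:5.2f} -> index {index}, y = {y:6.3f}, error = {x - y:6.3f} ({kind})")
```

Inside the range the error never exceeds half a step (granular error); outside the range it grows without bound (overload error).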

Quantization techniques
Uniform quantization is (almost) optimal when the input signal is memoryless
Quantization techniques:
    Scalar quantization
    Non-uniform quantization
    Robust quantization
    Pdf-optimized quantization (Lloyd-Max)
    Entropy-constrained quantization
    Vector quantization

Alternative to transforms: linear prediction


Linear prediction:
    estimate the value of the current pixel x[n] as a linear combination of past pixels: x*[n] = a1 x[n-1] + a2 x[n-2] + ...
    instead of x[n], encode the prediction error e[n] = x[n] - x*[n]
    the decoder recovers x[n] = e[n] + x*[n]

Linear prediction (DPCM)


[Block diagram: the encoder computes e[n] = x[n] - P(x[n]); the decoder computes x[n] = e[n] + P(x[n]), where P(.) is the predictor]
e[n] can be quantized more efficiently than x[n]

Example of third order LP

P(x) = aA + bB + cC
E = X - P(x)

Practical DPCM scheme

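[Block diagram: closed-loop DPCM. Encoder: the prediction Pxs, computed by the predictor H(z) from the reconstructed signal xs[n], is subtracted from x[n] to give e[n], which is quantized (Q); an inverse quantizer (Q^-1) inside the encoder yields e[n] + q[n], used to update xs[n]. Decoder: Q^-1 followed by the same prediction loop H(z), producing xs[n]]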
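A minimal sketch of the closed-loop idea, assuming the simplest possible predictor (the previous reconstructed sample) and a uniform quantizer with a given step. Function names and test values are illustrative only. The key point is that the encoder predicts from reconstructed samples, exactly as the decoder does, so quantization errors do not accumulate.

```python
def dpcm_encode(samples, step):
    """Closed-loop DPCM: returns (quantization indices, encoder-side reconstruction)."""
    indices, reconstructed = [], []
    prediction = 0.0                      # same initial state as the decoder
    for x in samples:
        error = x - prediction            # e[n]
        q = round(error / step)           # quantizer Q
        indices.append(q)
        prediction = prediction + q * step   # xs[n], used as the next prediction
        reconstructed.append(prediction)
    return indices, reconstructed

def dpcm_decode(indices, step):
    """Decoder: inverse quantization followed by the same prediction loop."""
    output, prediction = [], 0.0
    for q in indices:
        prediction = prediction + q * step
        output.append(prediction)
    return output

samples = [10.0, 10.4, 11.1, 11.0, 12.3, 15.0, 14.2]
step = 0.5
indices, encoder_view = dpcm_encode(samples, step)
decoded = dpcm_decode(indices, step)
print(indices)
print(decoded)
```

After the first sample the indices are small numbers, which is the sense in which e[n] is cheaper to code than x[n].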

The JPEG coding standard

International standards
Organizations that define standards:
    ISO (JTC 1 SC 29 WG 01/11): JPEG, MPEG, JPEG 2000
    ITU: H.261, H.263, H.264
Why standards?
    Interoperability

International standards (contd)


Who defines standards:
    Companies
    Academia
Advantages of using a standard:
    provides interoperability
Disadvantages:
    technology in the standards is some years old

Carrying out the technical work


A few weekly meetings per year
A few intermediate meetings
    Call for proposals
    Working draft
    Final committee draft
    Final Draft International Standard
    Final Publication Draft

The copyright issue

Some technologies used in JPEG are covered by patents:
    IBM, AT&T, and Mitsubishi for arithmetic coder
    Forgent for Huffman tables ?
Goal:
    baseline algorithm, royalty-free
    advanced algorithm, with license fees
Participants in JPEG are required to agree to provide royalty-free licenses for technology that they bring into the standard, for the baseline version of the algorithm.

International standards (contd)

What is standardized ?

[Block diagram: Source data -> Multimedia encoder -> Multimedia decoder; the syntax of the coded stream is the part defined by the standard]

Roadmap to international image coding standards

JPEG
    baseline lossy compression
    extension (hierarchical, progressive)
    lossless compression
JPEG-LS
    lossless compression
    near-lossless compression
JPEG 2000
    lossy compression
    lossless compression
    extensions

JPEG
This standardized image compression scheme is designed to work on full-color or gray-scale digital images
JPEG defines a baseline algorithm, plus extensions for Progressive and Hierarchical Coding
It foresees a separate lossless mode (Huffman or Arithmetic coding)

JPEG block scheme

Color space decomposition
    RGB -> YUV (subsampled)
Application of the algorithm to each component

JPEG
The coding steps:
    transformation of the image into a suitable color space
    application of the DCT on 8x8 blocks
    quantization
    zig-zag reading
    entropy (lossless) coding

JPEG compression

A weighted scalar quantization is applied to each transformed coefficient in every block
Quantized DC values are coded by DPCM from block to block
Zig-zag reordering
Encoding of zero-runs
Entropy coding

JPEG Quantization Matrices

Divide each entry of the DCT coefficient matrix by the corresponding entry in the quantization matrix:
Fq(u,v) = round[F(u,v)/Q(u,v)]
Quality factor to control quality
The quantization tables are contained in the JPEG file, with image information
Flexibility with quantization tables (?)

[Figure sequence: original block; DCT (rows); DCT (columns); quantized DCT; reconstructed block; absolute error vs. original]

JPEG entropy coding


The zig-zag scanned coefficients are encoded as a sequence of pairs of symbols:
    Symbol 1: (RUNLENGTH, SIZE)
    Symbol 2: (AMPLITUDE)
Runlength: number of zero samples preceding the current sample (0-15 or EOB)
Size: number of bits used to represent the amplitude of the current sample
Amplitude: quantized sample value
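The symbol formation can be sketched as follows. This is a simplified illustration, not a JPEG encoder: it only turns the AC coefficients of one quantized block into (RUNLENGTH, SIZE, AMPLITUDE) triples, leaves the DC coefficient aside, and does not produce Huffman codes. The function names and the test block are invented; runs longer than 15 zeros are split with the (15, 0) symbol, as in baseline JPEG.

```python
def zigzag_order(n=8):
    """Zig-zag scan order of an n x n block as a list of (row, column) pairs."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))

def run_size_symbols(block):
    """Return (DC value, list of AC symbols) for a quantized block."""
    scanned = [block[r][c] for r, c in zigzag_order(len(block))]
    ac = scanned[1:]                 # the DC value is coded separately (DPCM)
    total_ac = len(ac)
    while ac and ac[-1] == 0:        # trailing zeros are replaced by EOB
        ac.pop()
    symbols, run = [], 0
    for value in ac:
        if value == 0:
            run += 1
            if run == 16:            # 16 consecutive zeros: emit the (15, 0) symbol
                symbols.append((15, 0, 0))
                run = 0
            continue
        symbols.append((run, abs(value).bit_length(), value))
        run = 0
    if len(ac) < total_ac:
        symbols.append("EOB")
    return scanned[0], symbols

block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[1][1] = 12, 5, -3, 2
print(zigzag_order()[:6])
print(run_size_symbols(block))
```

Because quantization zeroes most high-frequency coefficients, a typical block collapses to a handful of symbols followed by EOB.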

Codestream syntax
The codestream consists of
    Markers and marker segments (to carry auxiliary information)
    Data
Marker structure:
    Code | Length | Marker data

JPEG syntax
FFD8 (Start Of Image)
FFE0 (JFIF marker)
FFDB (Define Quantization Table)
FFC4 (Define Huffman Table)
FFC0 (Start Of Frame)
FFDA (Start of Scan)
FFD0-FFD7 (Restart Markers)
FFFE (Comment)
FFD9 (End Of Image)
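The Code | Length | Marker data structure can be walked with a few lines of code. The sketch below is a hedged illustration on a tiny synthetic byte string, not on a real image: it lists markers up to Start of Scan and does not parse the entropy-coded data. It relies on two facts of the format: the 2-byte length is big-endian and counts itself, and SOI, EOI and the restart markers carry no length field. The names `list_markers`, `STANDALONE` and `NAMES` are made up for this example.

```python
import struct

# Markers without a length field, and short names for the markers listed above.
STANDALONE = {0xD8: "SOI", 0xD9: "EOI", **{0xD0 + i: f"RST{i}" for i in range(8)}}
NAMES = {0xE0: "APP0", 0xDB: "DQT", 0xC4: "DHT", 0xC0: "SOF0", 0xDA: "SOS", 0xFE: "COM"}

def list_markers(stream):
    """Return [(marker name, payload length)] for the header part of a codestream."""
    markers, pos = [], 0
    while pos + 1 < len(stream):
        if stream[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        code = stream[pos + 1]
        pos += 2
        if code in STANDALONE:
            markers.append((STANDALONE[code], 0))
            if code == 0xD9:          # End Of Image
                break
            continue
        (length,) = struct.unpack(">H", stream[pos:pos + 2])   # includes its own 2 bytes
        markers.append((NAMES.get(code, f"FF{code:02X}"), length - 2))
        pos += length
        if code == 0xDA:              # entropy-coded data follows; not parsed here
            break
    return markers

# Synthetic stream: SOI, a comment segment carrying b"abc", EOI.
stream = b"\xff\xd8" + b"\xff\xfe" + b"\x00\x05" + b"abc" + b"\xff\xd9"
print(list_markers(stream))
```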

JPEG lossless mode


JPEG performance

Quality max - Size: 61k
Quality med - Size: 14k
Quality low - Size: 4k

JPEG performance

Original image: encoded at 24 bits per pixel
Quality                  Bit rate (bpp)   CR
95/100                   3.926            24/3.926 = 6.1
50/100                   1.067            22.5
25/100                   0.705            34.0
5/100 (min. useful)      0.291            82.5

JPEG
Disadvantages:
    blocking effect for non smooth images
    image correlation is not removed across block boundaries
    only possible dynamic range is 8 or 12 bpp
    no unified version for lossless and lossy compression
    Fourier-like basis functions
    poor performance at low bit rate
The use of low bit rate coding algorithms becomes necessary (JPEG 2000)

Video coding

Analog video

Progressive and interlaced scans

Color TV broadcasting and receiving

Why not use RGB directly?

Digitizing a raster video

RGB -> YCbCr

Chrominance subsampling formats

Digital video formats

2D motion estimation

Notation

Motion representation

Block based motion estimation

Block matching algorithm

Exhaustive block matching algorithm

Complexity of integer-pel EBMA

Sample Matlab script for integer-pel EBMA

Fractional accuracy EBMA

Half-pel accuracy EBMA

Bilinear interpolation

Pros and cons with EBMA

Fast algorithms for BMA

Video coding using motion compensation

Characteristics of typical videos

Key ideas in video compression: hybrid video coding

Different coding modes

Temporal prediction

Block matching algorithm for motion estimation

Multiple reference frame temporal prediction

Spatial prediction

Motion compensated video

Macroblocks in 4:2:0 color format

MB coding in I-mode (assuming no intra prediction)

MB coding in P-mode

MB coding in B-mode

Coding mode selection

Rate control

Loop filtering

Video coding standards

Scalable coding

Bitstream scalability

Illustration of scalable coding

Quality (SNR) scalability by multistage quantization

Spatial/temporal scalability through down/upsampling

Scalability in MPEG-2

Fine granularity scalability (FGS) in MPEG-4

Drift problem in scalable codecs

How to solve the drift problem?

Trade-off between coding efficiency and drift

Video coding standards and applications

H.261 video coding standard

DCT coefficient quantization

Motion estimation/compensation

Variable length coding

Parameter selection and rate control

H.263 video coding standard

Improvements over H.261

PB-picture mode

Performance of H.261 and H.263

MPEG-1 overview

MPEG-1 vs. H.261

Group of pictures in MPEG

MPEG-2 overview

MPEG-2 vs. MPEG-1

DCT modes

MPEG-2 scalability

SNR-scalable encoder

Spatially-scalable encoder

Temporally scalable encoder

Profiles and levels in MPEG-2

MPEG-4 overview

Object-based coding

Object description hierarchy in MPEG-4

Example of scene composition

Coding of texture with arbitrary shape

Shape-adaptive DCT

MPEG-4 shape coding

Mesh animation

Body and face animation

MPEG-4 video coding efficiency tools

H.264/AVC

Introduction
Started as ITU recommendation
Now joint ISO and ITU effort (JVT)
    ITU H.264/AVC, MPEG-4 Part 10
Targets bit rate reduction by a factor of 2, at the same quality, with respect to other standards
    at the expense of much higher complexity

Comparison of video coders (QCIF, 30 fps, 100 kbit/s)


[Figure: the same frame shown as original and as decoded by H.263 baseline (33 dB), H.263+ (33.5 dB), MPEG-4 core (33.5 dB) and H.264 (42 dB)]

H.264/AVC applications

Relationship to other standards

H.264/AVC structure

H.264/AVC profiles
Baseline: core compression capabilities, plus error resilience. Suitable for videoconference, mobile video
Main: high compression and quality (e.g., broadcasting)
Extended: added features for efficient streaming

H.264 video coding layer

Partitioning of a frame

Flexible Macroblock Ordering (FMO)

Common elements with other standards

H.264 motion compensation accuracy

Macroblock partitioning

Multiple reference frames

Macroblock type
Each MB can be encoded in one of the following modes:
    INTRA
        Intra 4x4: 9 prediction modes for Y
        Intra 16x16: 4 prediction modes for Y
    INTER
        prediction with square blocks (16x16, 8x8, 4x4)
        prediction with rectangular blocks (8x16, 16x8, 4x8, 8x4)
Mode selection: rate distortion optimization

Intra prediction

MBs to be coded in Intra mode can be predicted from the already coded MBs in the same slice
[Figure: Intra 16x16 prediction]

DCT and inverse transform

H.264 4x4 transform
[Figure: 4x4 DCT]

Deblocking filter
The in-loop filter improves visual quality and PSNR
The filter in H.264/AVC is very elaborate; it adapts at several levels:
    slice level
    edge level (filtering strength depends on the coding residuals)
    sample level (thresholds allow the filter to be turned off for given pixels)
    strong filter for very flat MBs

Deblocking filter

241

Deblocking filter: subjective results (Intra)

242

Deblocking filter: subjective results (Inter)

243

Entropy Coding

244

Entropy Coding

CAVLC (Context-adaptive Variable Length Coding)
  uses exp-Golomb codes for all symbols except transform coefficients
  uses Huffman-like tables for transform coefficients

CABAC (Context-based Adaptive Binary Arithmetic Coding)
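
The exp-Golomb codes mentioned for CAVLC can be illustrated with a minimal sketch. It assumes the order-0 unsigned variant and represents codewords as strings of '0' and '1' for readability; the function names are illustrative and are not taken from the standard.

```python
def exp_golomb_encode(value: int) -> str:
    """Return the order-0 exp-Golomb codeword of a non-negative integer."""
    if value < 0:
        raise ValueError("unsigned exp-Golomb expects a non-negative integer")
    binary = bin(value + 1)[2:]        # binary representation of value + 1
    prefix = "0" * (len(binary) - 1)   # one leading zero per bit after the first
    return prefix + binary


def exp_golomb_decode(bits: str) -> int:
    """Decode a single order-0 exp-Golomb codeword given as a bit string."""
    leading_zeros = len(bits) - len(bits.lstrip("0"))
    codeword = bits[leading_zeros:2 * leading_zeros + 1]
    return int(codeword, 2) - 1


codewords = [exp_golomb_encode(v) for v in range(4)]
```

Small values get short codewords, which suits symbols whose small values are the most likely ones.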

245

CABAC

246

S-pictures

247

Comparison of H.264 to MPEG-4

248

Rate Allocation

How does one select the optimal coding mode for each MB?
Lagrangian optimization.
For each MB and for each coding mode a cost function is computed.
The mode minimizing the cost function is used for that MB.
This yields the maximum PSNR for the given bit-rate, at the expense of very high complexity

249

Lagrangian R-D optimization


Cost function:

$J = D + \lambda R$

where:
D = distortion using the current options (using SAD)
R = bit-rate using the current options
$\lambda$ = Lagrange parameter (used to set the bit-rate)

250

Lagrangian R-D optimization


Given QP (i.e., the bit-rate), for every possible set of coding parameters (coded block pattern, intra and inter coding modes, reference frame, motion vectors), compute
  the distortion D associated with that set of parameters
  the rate R associated with that set of parameters
  the cost $J = D + \lambda R$ associated with that set of parameters
Select the set of parameters that minimizes J
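
The selection rule above can be sketched as follows. The candidate modes and their (D, R) values are made-up numbers for illustration only; a real encoder measures D and R by actually coding the macroblock in each mode.

```python
from typing import Dict, Tuple


def select_mode(candidates: Dict[str, Tuple[float, float]], lam: float) -> str:
    """Return the mode minimizing J = D + lam * R.

    candidates maps a mode name to its (distortion D, rate R) pair.
    """
    return min(candidates, key=lambda m: candidates[m][0] + lam * candidates[m][1])


# Hypothetical (D, R) pairs for one macroblock
candidates = {
    "intra_16x16": (900.0, 40.0),
    "inter_16x16": (400.0, 60.0),
    "inter_8x8": (250.0, 140.0),
}

low_lambda_choice = select_mode(candidates, lam=1.0)
high_lambda_choice = select_mode(candidates, lam=30.0)
```

A small Lagrange parameter favors low distortion (more bits), while a large one favors low rate.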

251

Performance: H.264 vs. MPEG-4

252

Network Adaptation Layer

253

Data partitioning
The symbols contained in a slice are partitioned in different types:

0  TYPE_HEADER    Picture or Slice Headers
1  TYPE_MBHEADER  Macroblock header information
2  TYPE_MVD       Motion Vector Data
3  TYPE_CBP       Coded Block Pattern
4  TYPE_2x2DC     2x2 DC Coefficients
5  TYPE_COEFF_Y   Luma AC Coefficients
6  TYPE_COEFF_C   Chroma AC Coefficients
7  TYPE_EOS       End-of-Stream Symbol

254

NAL for IP networks


1 slice is mapped to 2 (or 3) packets


TYPE_HEADER

First packet (high priority)

TYPE_MBHEADER TYPE_MVD TYPE_EOS TYPE_CBP

Second packet (low priority)

TYPE_2x2DC TYPE_COEFF_Y TYPE_COEFF_C

255

Error concealment
It is not normative
Works on single MBs


256

INTRA Concealment

Pixel value = (15x(16-3) + 21x(16-12) + 32x(16-7) + 7x(16-8)) / (13+4+9+8) =18
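
The weighted average in the example above can be reproduced with a short sketch. It assumes that each neighbouring pixel is weighted by (16 - distance), as the formula suggests; the function name and the argument layout are illustrative.

```python
def conceal_pixel(neighbours, block_size=16):
    """Weighted average of boundary pixels.

    neighbours is a list of (pixel_value, distance) pairs; each pixel is
    weighted by (block_size - distance), so closer pixels count more.
    """
    weights = [block_size - distance for _, distance in neighbours]
    weighted_sum = sum(value * w for (value, _), w in zip(neighbours, weights))
    return weighted_sum / sum(weights)


# Values and distances taken from the slide example
value = conceal_pixel([(15, 3), (21, 12), (32, 7), (7, 8)])
concealed = round(value)
```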

257

INTER Concealment

258

Error control

Steps involved in a communication session

260

End-to-end delay

261

Challenges for video communications

262

Conventional source coding is not good enough

263

Spatial/temporal error propagation

264

Drift

265

Effect of transmission errors

266

QoS requirements of typical video applications

267

Interactive two-way visual communications

268

One-way video streaming

269

Major types of communication networks

270

Characteristics of major video communications applications

271

Error control techniques for video

272

Transport level error control

273

Channel coding basics

274

FEC for video transmission

275

Delay-constrained ARQ

276

Error resilient encoding

277

Reversible variable length coding

278

Coding mode selection based on network conditions

279

Layered coding with unequal error protection

280

Multiple description coding

281

Generic two description coder

282

Challenges for multiple description video coding

283

Video redundancy coding in H.263+

284

Decoder error concealment

285

Error concealment techniques

286

Sample error concealment results

287

Encoder-decoder interactive error control

288

Video transport using path diversity

289

Why using multiple paths

290

Video streaming

A brief history of streaming media

292

Internet media streaming

293

What is streaming video?

294

Outline

295

Time-varying available bandwidth

296

Time-varying delay

297

Effect of packet loss

298

Unicast vs. multicast

299

Heterogeneity for multicast

300

Architecture for video streaming

301

Video compression

302

Application of layered video

303

Application-layer QoS control

304

Source-based rate control

305

Receiver-based rate control

306

Continuous Media Distribution Services

307

Continuous media distribution services

The aim is to provide QoS and achieve efficiency for streaming video/audio over the best-effort Internet. Continuous Media Distribution Services include:
1. network filtering
2. application-level multicast
3. content replication

308

1) Network Filtering
Network filtering aims to maximize video quality during network congestion.
The filter receives the clients' requests and adapts the stream sent by the server accordingly.
on the data plane control plane

309

1) Network Filtering (contd)


Typically, frame-dropping filters are used as network filters.
The receiver can change the bandwidth of the media stream by sending requests to the filter to increase or decrease the frame dropping rate.
The receiver continuously measures the packet loss ratio.
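
A minimal sketch of the receiver-side decision follows. The loss thresholds and the step size are assumptions chosen for illustration; the slides do not specify them.

```python
def next_drop_rate(current_drop_rate, loss_ratio,
                   high_loss=0.05, low_loss=0.01, step=0.1):
    """Return the frame dropping rate the receiver would request next.

    All thresholds are hypothetical: above high_loss the receiver asks the
    filter to drop more frames, below low_loss it asks to drop fewer.
    """
    if loss_ratio > high_loss:
        return round(min(1.0, current_drop_rate + step), 6)
    if loss_ratio < low_loss:
        return round(max(0.0, current_drop_rate - step), 6)
    return current_drop_rate


congested = next_drop_rate(0.2, loss_ratio=0.10)
clear = next_drop_rate(0.2, loss_ratio=0.0)
```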

310

2) Application-Level Multicast
The application-level multicast is aimed at building a multicast service on top of the Internet.
The media multicast networks can be built from an interconnection of content-distribution networks.
The media multicast networks could support peering relationships at the application level or the streaming-media/content layer.
311

3) Content Replication
1) Mirror
  Mirroring places copies of the original multimedia files on other machines scattered around the Internet.
  In this way, clients can retrieve multimedia data from the nearest duplicate server.
  Disadvantages: expensive, ad hoc, and slow.

2) Cache
  Caching makes local copies of contents that the clients retrieve.
  Based on the belief that different clients will load many of the same contents.

312

Receiver-driven layered multicast

313

Streaming Servers

314

Streaming server

315

Streaming Servers
Streaming servers are required to process multimedia data under timing constraints.
A streaming server typically consists of the following three subsystems:
  Communicator
  Operating system
  Storage system

316

Real-Time Operating System


1) Process Management

The operating system must use real-time scheduling techniques. There are two basic algorithms:

Earliest deadline first (EDF)
  each task is assigned a deadline, and
  the tasks are processed in the order of increasing deadlines.
Rate-monotonic scheduling
  each task is assigned a static priority according to its request rate.
  higher rate, higher priority
  the tasks are processed in the order of priorities.
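
The two orderings can be compared with a small sketch. The task set is hypothetical, and the sketch only computes the processing order at a single instant; it is not a full scheduler with preemption.

```python
from typing import List, NamedTuple


class Task(NamedTuple):
    name: str
    deadline: float  # absolute deadline in seconds
    period: float    # request period in seconds (rate = 1 / period)


def edf_order(tasks: List[Task]) -> List[str]:
    """Earliest deadline first: increasing deadlines."""
    return [t.name for t in sorted(tasks, key=lambda t: t.deadline)]


def rate_monotonic_order(tasks: List[Task]) -> List[str]:
    """Rate-monotonic: shortest period (highest rate) gets highest priority."""
    return [t.name for t in sorted(tasks, key=lambda t: t.period)]


# Hypothetical streaming-server tasks
tasks = [
    Task("audio", deadline=0.030, period=0.020),
    Task("video", deadline=0.020, period=0.040),
    Task("control", deadline=0.100, period=0.500),
]

edf = edf_order(tasks)
rm = rate_monotonic_order(tasks)
```

With these numbers EDF serves the video task first because its deadline is closest, while rate-monotonic scheduling always prefers the audio task because it has the highest request rate.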
317

Real-Time Operating System (contd)


2) Resource Management

Resources in a multimedia server include CPUs, memories, and storage devices.
Resource management involves admission control and resource allocation.
  deterministic & statistical

318

Real-Time Operating System (contd)


3) File Management

The file system provides access and control functions for file storage and retrieval. There are two basic approaches:
  A file is not scattered across several disks.
  Files are organized on distributed storage such as disk arrays.

319

Storage System
1) Increase throughput with data striping

Under data striping schemes, a multimedia file is scattered across multiple disks and the disk array can be accessed in parallel.
An important issue is to balance the load of the most heavily loaded disks, to avoid overload situations while keeping latency small.

320

Storage System (contd)


2) Increase capacity with tertiary and hierarchical storage

To keep the storage cost down, tertiary storage must be added.
  tape, CD-ROM
Under the hierarchical storage architecture, only a fraction of the total storage is kept on disks while the major remaining portion is kept on a tertiary tape system.

321

Hierarchical Storage

322

Storage System (contd)


3) Fault tolerance

The goal is to ensure uninterrupted service even in the presence of disk failures. There are two techniques:
  Error-correcting (parity-encoding): adds a small storage overhead
  Mirroring: incurs at least twice as much storage volume

Tradeoff between reliability and complexity.
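
The parity-encoding idea can be sketched with a single XOR parity block, assuming equal-sized data blocks and at most one failed disk. The block contents are placeholders.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks plus one parity disk (hypothetical block contents)
data = [b"VIDEO001", b"VIDEO002", b"VIDEO003"]
parity = xor_blocks(data)

# Disk 1 fails: rebuild its block from the surviving disks and the parity
recovered = xor_blocks([data[0], data[2], parity])
```

Here one extra block protects three data blocks, whereas mirroring would need three extra blocks.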

323

Dynamic stream switching: SureStreams

324

Dynamic stream switching: SP-frames

325

SP-frames (contd)

326

SP-frames: performance gain

327

Media Synchronization

328

Media Synchronization
Media synchronization refers to maintaining the temporal relationships within one data stream and between various media streams.
Each component on the transport path affects the data in a different way.
They all inevitably introduce delays and delay variations.

329

Media Synchronization (contd)


There are three levels of synchronization:
  Intra-stream synchronization: the media layer
  Inter-stream synchronization: the stream layer
  Inter-object synchronization: the object layer

330

Media Synchronization (contd)


The most widely used method to specify the temporal relations is time-stamping:
  At the source, a stream is time-stamped to keep temporal information
  At the destination, the application presents the streams according to their temporal relation.

331

Media Synchronization (contd)


Preventive
  Designed to minimize synchronization errors as data is transported from the server to the user.
  Aim to minimize latency and jitter
Corrective
  Compensate when synchronization errors occur.
  Stream Synchronization Protocol (SSP)

332

Protocols for Streaming Video

333

Protocol stack for Internet streaming media

334

Protocols for Streaming Video

Network-layer protocol
  network addressing
  IP

Transport protocol
  end-to-end network transport functions
  UDP, TCP, real-time transport protocol (RTP), and real-time control protocol (RTCP)

Session control protocol
  defines the messages and procedures to control the delivery of the multimedia data during an established session
  RTSP, and the session initiation protocol (SIP)

335

Protocol Stacks for Media Streaming

336

Transport Protocols
UDP and TCP protocols support such functions as multiplexing, error control, congestion control, or flow control.
Since TCP retransmission introduces unacceptable delays, UDP is typically employed for streaming applications.

337

Transport Protocols (contd)


RTP is a data transfer protocol while RTCP is a control protocol.
In an RTP session, participants periodically send RTCP packets to convey feedback on quality of data delivery and membership information.

338

Transport Protocols (contd)


RTP provides the following functions:
  Time-stamping
  Sequence numbering
  Payload type identification
  Source identification
    SSRC (Synchronization SouRCe identifier)
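
The four functions listed above map onto fields of the 12-byte fixed RTP header. The sketch below parses that fixed header only, ignoring CSRC entries and header extensions; the example packet is synthetic.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Parse the 12-byte fixed RTP header (network byte order)."""
    if len(packet) < 12:
        raise ValueError("an RTP packet needs at least 12 header bytes")
    first, second, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": first >> 6,
        "padding": bool(first & 0x20),
        "extension": bool(first & 0x10),
        "csrc_count": first & 0x0F,
        "marker": bool(second & 0x80),
        "payload_type": second & 0x7F,   # payload type identification
        "sequence_number": seq,          # sequence numbering
        "timestamp": timestamp,          # time-stamping
        "ssrc": ssrc,                    # source identification
    }


# Synthetic header: version 2, payload type 96, made-up sequence/timestamp/SSRC
example = struct.pack("!BBHII", 0x80, 96, 1234, 90000, 0x12345678)
header = parse_rtp_header(example)
```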

339

Transport Protocols (contd)


Basically, RTCP provides the following services:
  QoS feedback
  Participant identification
    RTCP SDES
  Control packets scaling
  Inter-media synchronization
  Minimal session control information

340

Session Control Protocols


Main functions of RTSP are:
  To support VCR-like control operations.
  To provide means for choosing delivery channels and delivery mechanisms.
  To establish and control streams of continuous audio and video media:
    Media retrieval
    Adding media to an existing session

341

Session Control Protocols (contd)


Session Initiation Protocol
  SIP can also create and terminate sessions with one or more participants.
  SIP supports user mobility by proxying and redirecting requests to the user's current location.

342

Peer-to-peer networking

Outline
Introduction and Overview
Popular P2P Applications
P2P Video-on-Demand
Conclusions and Future of P2P

344

P2P Introduction and Overview

P2P Introduction and Overview - Outline


Part I: History, motivation and evolution
  History: Napster and beyond
  What is Peer-to-peer?
  Why Peer-to-peer?

Brief P2P technologies overview
  Unstructured p2p-overlays
  Structured p2p-overlays

346

History, motivation and evolution

P2P represented ~65% of Internet Traffic at end 2006

1999: Napster, first widely used p2p-application


347

Napster, first widely used p2p-application


The application: a P2P application for the distribution of mp3 files
  Each user can contribute its own content

How it works: central index server
  Maintains list of all active peers and their available content

Distributed storage and download
  Client nodes also act as file servers
  All downloaded content is shared


348

History, motivation and evolution - Napster (contd)


Initial join
  Peers connect to Napster server
  Transmit current listing of shared files to server

[Figure: peers join the central index server]


349

History, motivation and evolution - Napster (contd)


Content search
  Peer sends song request to Napster server
  Napster server checks song database and returns list of matched peers

[Figure: 1) query and 2) answer between peers and the central index server]
350

History, motivation and evolution - Napster (contd)


File retrieval
  The requesting peer contacts the peer having the file directly and downloads it

[Figure: 1) request and 2) download between peers, bypassing the central index server]
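
The join and search steps described for Napster can be sketched with an in-memory index. The class and method names are illustrative and do not reproduce the actual Napster protocol; file retrieval is omitted because it happens directly between peers.

```python
from collections import defaultdict


class CentralIndex:
    """Toy central index server: maps file names to the peers sharing them."""

    def __init__(self):
        self._files = defaultdict(set)  # file name -> set of peer addresses

    def join(self, peer, shared_files):
        """Initial join: the peer transmits its listing of shared files."""
        for name in shared_files:
            self._files[name].add(peer)

    def leave(self, peer):
        """Remove a departed peer from every listing."""
        for peers in self._files.values():
            peers.discard(peer)

    def search(self, name):
        """Content search: return the peers that hold the requested file."""
        return sorted(self._files.get(name, ()))


index = CentralIndex()
index.join("10.0.0.1:6699", ["song_a.mp3", "song_b.mp3"])
index.join("10.0.0.2:6699", ["song_b.mp3"])
matches = index.search("song_b.mp3")
```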
351

History, motivation and evolution - File Download

Napster was the first simple but successful P2P-application. Many others followed

P2P File Download Protocols:
  1999: Napster
  2000: Gnutella, eDonkey
  2001: Kazaa
  2002: eMule, BitTorrent

352

Definition of Peer-to-peer (or P2P)

A peer-to-peer (or P2P) computer network is a network that relies primarily on the computing power and bandwidth of the participants in the network rather than concentrating it in a relatively small number of servers. A pure peer-to-peer network does not have the notion of clients or servers, but only equal peer nodes that simultaneously function as both "clients" and "servers" to the other nodes on the network. This model of network arrangement differs from the client-server model where communication is usually to and from a central server.
Taken from the wikipedia free encyclopedia - www.wikipedia.org
353

It is a broad definition with lots of applications


P2P-File download
  Napster, Gnutella, KaZaa, eDonkey, ...
P2P-Communication
  VoIP, Skype, Messaging, ...
P2P-Computation
  seti@home
P2P-Streaming
  PPLive, ESM, ...
P2P-Gaming
P2P-Video-on-Demand

354

History, motivation and evolution Applications

P2P is not restricted to file download!


Application type: File Download, Streaming, Telephony, Video-on-Demand, Gaming

P2P Protocols:
  1999: Napster, End System Multicast (ESM)
  2000: Gnutella, eDonkey
  2001: Kazaa
  2002: eMule, BitTorrent
  2003: Skype
  2004: PPLive
  Today: TVKoo, TVAnts, PPStream, SopCast
  Next: Video-on-Demand, Gaming

355

Why is P2P so successful?


Scalable: it's all about sharing resources
  No need to provision servers or bandwidth
  Each user brings its own resource
  E.g. resistant to flash crowds
    flash crowd = a crowd of users all arriving at the same time
  Resources could be: files to share; upload bandwidth; disk storage; ...


356

Why is P2P so successful? (contd)


Cheap - No infrastructure needed
Everybody can bring its own content (at no cost)
  Homemade content
  Ethnic content
  Illegal content
  But also legal content
High availability
  Content accessible most of the time

357

P2P-Overlay
Build a graph at the application layer, and forward packets at the application layer
It is a virtual graph
  The underlying physical graph is transparent to the user
  Edges are TCP connections or simply an entry holding a neighboring node's IP address
The graph has to be continuously maintained (e.g. check if nodes are still alive)

358

P2P-Overlay (contd)

[Figure: the overlay graph drawn above the underlay (physical network), with the source marked in both]
359

The P2P enabling technologies


Unstructured p2p-overlays
  Generally random overlay
  Used for content download, telephony, streaming

Structured p2p-overlays
  Distributed Hash Tables (DHTs)
  Used for node localization, content download, streaming

360

Unstructured p2p-overlays
Unstructured p2p-overlays do not really care how the overlay is constructed
  Peers are organized in a random graph topology
    E.g., new node randomly chooses three existing nodes as neighbors
    Flat or hierarchical
  Build your p2p-service based on this graph

Several proposals
  Gnutella
  KaZaA/FastTrack
  BitTorrent

361

Unstructured p2p-overlays (contd)

Unstructured p2p-overlays are just a framework; you can build many applications on top of them

Unstructured p2p-overlays pros & cons
  Pros
    Very flexible: copes with node churn
    Supports complex queries (unlike structured overlays)
  Cons
    Content search is difficult: there is a tradeoff between generated traffic (overhead) and the horizon of the partial view

In the following we detail these applications
  Skype
  BitTorrent

362

One Example of usage of unstructured overlays


Typical problem in unstructured overlays: how to do content search and query?

Flooding
  Limited scope: send only to a subset of your neighbors
  Time-To-Live: limit the number of hops per message

Example of flooding (similar to Gnutella):
[Figure: a query "Search Britney Spears" is flooded through the overlay; labels: Found entry!, Notify, Upload]
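
A minimal sketch of TTL-limited flooding follows, assuming the overlay is given as an adjacency dictionary and that every peer forwards the query to all of its neighbors. Real Gnutella messages carry identifiers and are routed back along the reverse path, which is omitted here; the topology and the content are made up.

```python
from collections import deque


def flood_search(graph, content, start, wanted, ttl):
    """Return the peers holding `wanted` within `ttl` overlay hops of `start`."""
    visited = {start}
    hits = set()
    queue = deque([(start, ttl)])
    while queue:
        node, remaining = queue.popleft()
        if wanted in content.get(node, ()):
            hits.add(node)
        if remaining == 0:
            continue  # TTL expired: the query is not forwarded further
        for neighbour in graph.get(node, ()):
            if neighbour not in visited:  # each peer handles a query only once
                visited.add(neighbour)
                queue.append((neighbour, remaining - 1))
    return hits


# Hypothetical overlay and shared content
graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B", "E"], "E": ["D"]}
content = {"C": {"song"}, "E": {"song"}}

near = flood_search(graph, content, "A", "song", ttl=1)
far = flood_search(graph, content, "A", "song", ttl=3)
```

A larger TTL finds more copies but generates more traffic, which is the tradeoff between overhead and search horizon.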

363

Survey of popular P2P applications

BitTorrent - Components
In the initial version of BitTorrent, a torrent is composed of:

A single content
  The content is cut down into pieces
  Pieces are cut down into blocks, which are the transmission units between peers
  The protocol only accounts for transferred pieces: partially received pieces cannot be served by a peer

A single Central Tracker
  The central tracker has
    the list of all peers participating (accessing or serving the file)
    the list of all pieces of the file, and their respective hash values

One or more Seeds
  Seeds have the entire file

Many Leechers
  Leechers download the file

365

BitTorrent Peer-set

Peer-set

The list of neighbors a peer is allowed to communicate with

Peer-set construction

  Each peer (seed or leecher) contacts the tracker and gets a list of peers participating in the same session
  Typically 50 peers are chosen at random by the tracker for each peer
  The peer-set is augmented by peers connecting directly to you
  The peer-set size is limited to 80 peers

366

BitTorrent - Algorithms
Two components in the BitTorrent downloading algorithm:
  Peer Selection: determines from whom to download the piece
  Piece Selection: determines which piece to download

367

Tit for Tat

Based on the English saying meaning "equivalent retaliation" ("tip for tap"), an agent using this strategy will respond in kind to a previous opponent's action. If the opponent previously was cooperative, the agent is cooperative. If not, the agent is not. This strategy is dependent on the following conditions that have allowed it to become the most prevalent strategy for the Prisoner's Dilemma:

1. Unless provoked, the agent will always cooperate
2. If provoked, the agent will retaliate
3. The agent is quick to forgive

Taken from the wikipedia free encyclopedia - www.wikipedia.org


368

BitTorrent - Peer selection


Choke Algorithm
  Choking is a temporary refusal to upload
  Each peer unchokes a fixed number of peers (default = 4)
    3 peers on tit-for-tat basis
    1 peer on optimistic unchoke basis

369

BitTorrent - Peer selection (contd)


Tit-for-tat peer selection
  Select the 3 peers from which you downloaded most and that are interested in your chunks
  Peer selection is done every 10 seconds, based on the download rates of the last 30 seconds.

370

BitTorrent - Peer selection (contd)


Optimistic unchoke peer selection
  Select one peer at random that is interested in your chunks, regardless of the current download rate from it
  Rotates every 30 seconds.
  Reason:
    To discover currently unused connections that are better than the ones being used
    Corresponds to always cooperating on the first move in prisoner's dilemma
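
The peer selection rules above can be sketched in a single function. It assumes the download rates were already averaged over the last 30 seconds and collapses the two timers (10 s for tit-for-tat, 30 s for the optimistic unchoke) into one call; the peer names and rates are made up.

```python
import random


def select_unchoked(download_rates, interested, regular_slots=3, rng=random):
    """Return (tit-for-tat peers, optimistic unchoke peer or None).

    download_rates: peer -> rate at which we downloaded from that peer.
    interested: peers that are interested in our pieces.
    """
    ranked = sorted(interested, key=lambda p: download_rates.get(p, 0.0), reverse=True)
    regular = ranked[:regular_slots]      # best uploaders to us
    remaining = ranked[regular_slots:]
    optimistic = rng.choice(remaining) if remaining else None  # random, rate ignored
    return regular, optimistic


rates = {"a": 50.0, "b": 10.0, "c": 80.0, "d": 0.0, "e": 30.0, "f": 99.0}
interested = {"a", "b", "c", "d", "e"}   # "f" is fast but not interested
regular, optimistic = select_unchoked(rates, interested, rng=random.Random(0))
```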
371

BitTorrent - Peer selection (contd)


Anti-Snubbing
  When a remote peer has uploaded no data in 60 s, the local peer assumes that it has been snubbed
  In that case the local peer refuses to upload to the remote peer, except for the optimistic unchoking

372

BitTorrent - Piece selection


Random first piece
  Only applies if the leecher has downloaded fewer than 4 pieces (chunks)
  Choose randomly the next piece to download
  Allows you to download your first pieces quickly, so as to have pieces to reciprocate for the choke algorithm

373

BitTorrent - Piece selection (contd)


Local rarest first policy
  Determine the pieces that are most rare among your peers and download those first
  Ensures that the most common pieces are left till the end to download
  Rarest first also ensures that a large variety of pieces are downloaded from the seed
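
Both piece selection policies can be sketched together. The sketch assumes each peer in the peer-set advertises the set of piece indices it holds; the threshold of 4 pieces for the random-first phase comes from the slides, while the function name and data layout are illustrative.

```python
import random
from collections import Counter


def pick_piece(have, peer_pieces, rng=random, random_first_threshold=4):
    """Choose the next piece index to request, or None if nothing is needed.

    have: set of piece indices already downloaded.
    peer_pieces: peer -> set of piece indices that peer holds.
    """
    availability = Counter()
    for pieces in peer_pieces.values():
        availability.update(pieces - have)  # count holders of each missing piece
    if not availability:
        return None
    if len(have) < random_first_threshold:
        return rng.choice(sorted(availability))      # random first piece
    rarest = min(availability.values())              # local rarest first
    return rng.choice(sorted(p for p, n in availability.items() if n == rarest))


peer_pieces = {"p1": {0, 1, 4, 5}, "p2": {4, 5, 6}, "p3": {4}}
rarest_choice = pick_piece({0, 1, 2, 3}, peer_pieces)
first_choice = pick_piece({0}, peer_pieces, rng=random.Random(1))
```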

374

BitTorrent - Summary

Efficient file download thanks to simple incentive mechanisms
  Local rarest first
    High piece entropy
  Tit-for-tat
    Avoids free-riding
    Optimizes resource utilization

Space for improvement?
  Steady state very stable and efficient
  Startup-phase still unstable with some inefficiencies
  Is there an advantage of deploying BitTorrent on Set-Top-Boxes?
  Is BitTorrent adapted to mobile terminals/DTN networks?
  Possible usage of network coding?
375

Skype Overlay

Protocol not fully understood today
  Proprietary protocol
  Content and control messages are encrypted

Protocol reuses concepts of the FastTrack overlay used by KaZaA
Builds upon an unstructured overlay
  Combines
    distributed index servers
    a flat unstructured network among index servers

Two tier hierarchy
  Super Nodes (SN)
  Ordinary Nodes (ON)

376

Skype Overlay (contd)

Super Nodes (SN)

Connect to each other, building a flat unstructured overlay (similar to the Gnutella overlay)

Ordinary Nodes (ON)

Connect to Super Nodes that act as a directory server (similar to the index server in Napster)

Skype login server


  Only central component
  Stores and verifies usernames and passwords
  Stores the buddy list
377

Skype Overlay (contd)

[Figure: Skype overlay with the Skype login server, Super Nodes (SN) and Ordinary Nodes (ON); legend: message exchange during login for authentication, neighbor relationship]
378

How is the overlay constructed? - Super Node Lists


Each node keeps a host cache with a list of Super Nodes' IP addresses
  Up to 200 entries

Some Super Nodes' IP addresses are hard-coded
  Super Nodes provided by Skype

These lists are used to locate a node's Super Node at login

379

How is the overlay constructed? Login


Contact login server and authenticate

Advertise your presence to other peers
  Contact a Super Node
  Contact your buddies (through Super Node), and notify your presence

380

Super Nodes Index servers


Super Nodes are index servers
  I.e. index of locally connected Skype users (and their IP addresses)

If buddy is not found in local index of a Super Node
  Spread node search to neighboring Super Nodes
  Not clear how this is implemented
    Possibly flood the request similar to Gnutella

381

Super Nodes Relay nodes


Super nodes also act as relay nodes
  Enables NAT traversals
  Avoid congested or faulty paths

382

Super Nodes Relay nodes


Alice would like to call Bob (or vice versa)

[Figure: Alice and Bob in the overlay]

383

Super Nodes Relay nodes


Alice would like to call Bob (or vice versa)
  Contact Relay Node
  Call

[Figure: Alice and Bob connected through a Skype relay node]

384

Super Node election

When does an ordinary node become a super node?
  High bandwidth, public IP address, but details not clear
  Highly dynamic
    Super Node churn, short Super Node session time

[Figure: Super Node churn and session time]

385

Super Node election


A

world map of Skype Super Nodes

386

Skype - Summary
VoIP has other requirements than file download
  Delay
  Jitter

Skype network seems to handle these constraints in spite of
  High node churn

Protocol not fully understood

387

Conclusion and future of p2p

P2P Attracting Attentions from Commercial World


NBC Universal goes peer-to-peer (worldmedia.com)
BitTorrent raised $8.75 million in venture capital
  Teamed with CacheLogic to work for BT
Startups providing P2P live programs: pplive, coolstreaming
BBC legal download platforms: iMP / Kontiki
  Allow users in UK to download BBC TV and radio programs via a program guide for up to 7 days after broadcast

389

P2P Attracting Attentions from Commercial World

Microsoft is active
  Peer-to-Peer library
  Acquisition of Groove
  Avalanche
  RedCarpet
  P2P Windows update

Google and Apple are not using P2P... Yet (?)
  they face mounting costs with video

Google
  Google video is online
  Bought YouTube
  Bought Chinese p2p-company Xunlei Network Technology

Apple
  iTunes changed the world of music
  Will it change the world of video?
  iTV will be a digital media adapter with HDD

390

Will P2P Go Beyond Desktop?


Current device requirement
  CPU, memory, and disk space requirement
  Platforms supported
  Internet connection requirement

Three categories of p2p application
  File downloading
    BitTorrent already on some Set-Top-Boxes and DSL routers
  Voice
    Skype mobile phones
  Video
    Not yet

391

Will P2P Go Beyond Desktop? (Discussion)


Mobile P2P?
  What benefits does p2p offer over mobile devices?
    ???
  What are potential issues?
    Power
    Connection speed
    ???

P2P on set-top box?
  ???

Other consumer electronic devices?
  ???

392

Future of P2P - Ad-hoc P2P


Opportunistically use all available technologies!
Access knowledge and resources of devices you cross in the street

[Figure: devices connected via GSM]

Local P2P content search
  What is currently the best place to find a cab?
  What are the results of yesterday's soccer match?
393

Future of P2P - Ad-hoc P2P (contd)


Your requests or messages are stored and forwarded
  Enable p2p communication even if there is no direct path between two peers at a given moment in time

394

Conclusions and Future of P2P

More commercial P2P applications


  The fight between legal and illegal content sharing will continue
  More p2p used in commercial environments
    Reduce distribution cost and compete with illegal content

Secure P2P
Better performance
  More intelligent sharing
  More scalable
  Handle churn better
  Competing with other technologies (YouTube)
Supporting diversity: long tail content
Supporting community
Relationship with ISPs
Become a ubiquitous application ??


395

Peer-to-peer media streaming

Growth of Internet traffic


Cisco's global consumer Internet traffic forecast (2007)
[Figure: traffic in PB/month (0 to 8000) for the years 2007 to 2011, broken down into IPTV, Video streaming, P2P, VoIP, Web/Data, Gaming, Other video]

397

What is IPTV?

IPTV (Internet Protocol Television) is a system where a digital television service is delivered by using Internet Protocol over a network infrastructure. IPTV is typically supplied by a service provider using a closed network infrastructure. This closed network approach is in competition with the delivery of TV content over the public Internet, called Internet Television. In businesses, IPTV may be used to deliver television content over corporate LANs.

Taken from the wikipedia free encyclopedia - www.wikipedia.org


398

What is peer-to-peer TV?


The term P2PTV refers to peer-to-peer (P2P) software applications designed to redistribute video streams in real time on a P2P network; the distributed video streams are typically TV channels from all over the world but may also come from other sources.

Taken from the wikipedia free encyclopedia - www.wikipedia.org


399

Joost
Joost is a system for distributing TV shows and other forms of video using P2PTV technology
Created by the founders of Skype and Kazaa.
Has signed up more than a million beta testers and is on track for an end-of-year launch.
Uses H.264 video coding

400

Introduction

The advent of multimedia technology and the broadband surge lead to:
  Heavy usage of P2P applications, including:
    Sharing of large videos over the Internet
  Video-on-Demand (VoD) applications
  P2P media streaming applications

BitTorrent-like P2P models are suitable for bulk file transfer
P2P file sharing has no QoS issues:
  No need to play back the media in real time
  Downloading takes a long time; many users do it overnight

401

Introduction Contd.

P2P media streaming is non trivial:

Need to play back the media in real time
  Quality of Service
Procure future media stream packets
  Needs reliable neighbors and effective management
High churn rate: users join and leave in between
  Needs robust network topology to overcome churn
Internet dynamics and congestion in the interior of the network
  Degrades QoS
Fairness policies like tit-for-tat are extremely difficult to apply
  High bandwidth users have no incentive to contribute

402

P2P Media Streaming

Media streaming is extremely expensive
  1 hour of video encoded at 300 Kbps = 128.7 MB
  Serving 1000 users would require 125.68 GB

Media server cannot serve everybody in the swarm

In P2P Streaming:
  Peers form an overlay of nodes on top of the Internet
  Nodes in the overlay are connected by direct paths (virtual or logical links); in reality, they are connected by many physical links in the underlying network
  Nodes offer their uplink bandwidth while downloading and viewing the media content
  Takes load off the server
  Scalable
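
The two volume figures above can be checked with a short calculation. It assumes that 300 Kbps means 300,000 bit/s and that MB and GB are binary units (2^20 and 2^30 bytes), which is the reading that reproduces the 128.7 MB on the slide.

```python
def stream_volume_mib(bitrate_kbps: float, seconds: float) -> float:
    """Volume of a constant-bit-rate stream in MiB (2**20 bytes)."""
    return bitrate_kbps * 1000 * seconds / 8 / 2**20


per_user_mib = stream_volume_mib(300, 3600)   # one hour at 300 kbit/s
total_gib = per_user_mib * 1000 / 1024        # 1000 users, in GiB
```

Under these assumptions the per-user volume is about 128.75 MiB and the total is about 125.73 GiB; the slide's 125.68 GB follows from using the rounded value 128.7.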

403

P2P Sharing
Content Distribution Tool
File is chopped into pieces

[Figure: a server and file pieces numbered 1 to 5]
404

Major Approaches

Major approaches

Content Distribution Networks like Akamai
  Expensive: only large infrastructure can afford
Client Server Model
  Not scalable
Application Layer Multicast
  Alternative to IP Multicast
Peer-to-Peer Based
  Most viable and simple to use and deploy
  No setup cost
  Scalable

405

Content Distribution Networks (CDNs)

CDN nodes deployed in multiple locations, often over multiple backbones
These nodes cooperate with each other to satisfy an end user's request
User request is sent to the nearest CDN node, which has a cached copy
QoS improves as the end user receives the best possible connection
Yahoo mail uses Akamai

406

Roadmap to Internet media streaming


Media Streaming
  Application Layer Multicast
    Tree Based [NICE, ZigZag, SpreadIT]
    Mesh Based [ESM, Narada]
  Peer-to-Peer [CoolStreaming, PPLive, SOPCast, TV Ants, Feidian]

407

Application Layer Multicast (ALM)

Very sparse deployment of IP Multicast due to technical and administrative reasons

In ALM:
  Multicasting is implemented at end hosts instead of network routers
  Nodes form unicast channels or tunnels between them
  Overlay construction algorithms can be applied more easily at end hosts
  End hosts need a lot of bandwidth

Most ALM approaches form a tree based topology:
  Simple to use
  Ineffective in case of churn and node failures, as it incurs high recovery time


408

ALM Methodologies

Tree Based

  Content flows from server to nodes in a tree like fashion: every node forwards the content to its children, which in turn forward to their children
  One point of failure for a complete subtree
  High recovery time
  Notable tree based approaches: NICE, SpreadIT, Zigzag

Mesh Based

  Overcomes tree based flaws
  Nodes maintain state information of many nodes
  High control overhead
  Notable mesh based approaches include Narada and ESM from CMU.

409

Tree Based ALM

410

Mesh Based ALM

411

Peer-to-Peer Streaming Models

Design flaws in ALM led to current day P2P Streaming models based on chunk driven technology
Media content is broken down into small pieces and disseminated in the swarm
Neighboring nodes use a Gossip protocol to exchange buffer information
Nodes trade unavailable pieces
Robust and Scalable
Most noted approach in recent years: CoolStreaming
  PPLive, SOPCast, Feidian, TV Ants are derivatives of CoolStreaming
  Proprietary; working philosophy not published
  Reverse engineered, and measurement studies released

412

CoolStreaming

File is chopped by the server and disseminated in the swarm
A node, upon arrival, obtains a peerlist of 40 nodes from the server
The node contacts these nodes for media content
In steady state, every node typically has 4-8 neighbors and periodically shares its buffer content map with them
Nodes exchange the unavailable content
Real-world deployed and highly successful system
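
The buffer map exchange can be sketched as a comparison of bitmaps. The sketch assumes a buffer map is a list of 0/1 flags, one per segment in the current window, and only determines which neighbors could supply each missing segment; the actual CoolStreaming scheduling heuristics are not reproduced.

```python
def missing_segments(own_map, neighbour_maps):
    """Return {segment index: [neighbours holding it]} for segments we lack.

    own_map: list of 0/1 flags for the local buffer.
    neighbour_maps: neighbour name -> list of 0/1 flags of the same length.
    """
    suppliers = {}
    for neighbour, bitmap in neighbour_maps.items():
        for index, (mine, theirs) in enumerate(zip(own_map, bitmap)):
            if theirs and not mine:
                suppliers.setdefault(index, []).append(neighbour)
    return suppliers


# Hypothetical buffer maps over a window of six segments
own = [1, 1, 0, 0, 1, 0]
neighbours = {"n1": [1, 1, 1, 0, 0, 0], "n2": [0, 1, 1, 1, 0, 0]}
suppliers = missing_segments(own, neighbours)
```

Segment 5 has no supplier among these neighbors, so the node would have to wait or find other neighbors for it.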

413

Metrics

Quality of Service
  Jitter-less transmission
  Low end-to-end latency
Uplink utilization
  High uplink throughput leads to scalable P2P systems
Robustness and Reliability
  Churn, node failure or departure should not affect QoS
Scalability
Fairness
  Determined in terms of content served (Share Ratio)
  No user should be forced to upload much more than what it has downloaded
Security
  Implicitly affects above metrics


Quality of Service

Most important metric
Jitter: unavailability of stream content at play time causes jitter
Jitter-free transmission ensures good media playback
A continuous supply of stream content prevents jitter
Latency: difference in time between playback at the server and at the user
Lower latency keeps users interested
  A live event such as a soccer match loses its appeal at crucial moments if the transmission is delayed

Reducing hop count reduces latency
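
As defined on this slide, jitter occurs whenever a chunk is not available at its play time. One simple way to quantify this, sketched below in Python, is the fraction of chunks that arrive by their playback deadline. The function name `continuity`, the use of `None` for a chunk that never arrives and the sample timings are assumptions made for illustration; the slides do not define a specific formula.

```python
def continuity(arrival_times, deadlines):
    """Fraction of chunks available at or before their playback deadline.

    arrival_times: arrival time of each chunk in seconds, or None if it never arrived
    deadlines:     playback deadline of each chunk in seconds
    """
    on_time = sum(
        1
        for arrived, due in zip(arrival_times, deadlines)
        if arrived is not None and arrived <= due
    )
    return on_time / len(deadlines)


# Hypothetical trace: the second chunk is late and the third is lost.
arrivals = [0.9, 2.1, None, 3.8]
deadlines = [1.0, 2.0, 3.0, 4.0]

print(continuity(arrivals, deadlines))  # 0.5
```

A value of 1.0 corresponds to jitter-free playback in the sense used above; every late or missing chunk lowers it.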


Uplink Utilization

Uplink is the scarcest and most important resource in the swarm
The sum of the uplinks of all nodes is the load taken off the server
Utilization = Uplink Used / Uplink Available
Effective node organization and topology are needed to maximize uplink utilization
High uplink throughput means more bandwidth in the swarm and hence it leads to scalable P2P systems
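
The utilization formula on this slide can be evaluated over a whole swarm, as in this short Python sketch. The peer figures (in kbit/s) and the function name `uplink_utilization` are hypothetical.

```python
def uplink_utilization(peers):
    """Swarm-wide utilization = total uplink used / total uplink available.

    peers: list of (used, available) uplink pairs, one per node, same unit.
    """
    used = sum(u for u, _ in peers)
    available = sum(a for _, a in peers)
    return used / available if available else 0.0


# Hypothetical swarm: (uplink used, uplink available) in kbit/s.
peers = [(300, 512), (100, 256), (0, 256)]

print(sum(u for u, _ in peers))   # 400 kbit/s of load taken off the server
print(uplink_utilization(peers))  # 0.390625
```

The third peer contributes nothing, so more than half of the swarm's uplink capacity stays idle; a better node organization would aim to raise this ratio.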


Robustness and Reliability

A robust and reliable P2P system should sustain an acceptable level of QoS under the following conditions:

  High churn
  Node failure
  Congestion in the interior of the network

These conditions affect QoS
Efficient peering techniques and node topology ensure robust and reliable P2P networks


Scalability

Serve as many users as possible with an acceptable level of QoS
An increasing number of nodes should not degrade QoS
An effective overlay node topology and high uplink throughput ensure scalable systems


Fairness

Measured in terms of content served to the swarm

Share Ratio = Uploaded Volume / Downloaded Volume

Randomness in the swarm causes severe disparity

  Many nodes upload a huge volume of content
  Many nodes get a free ride with little or no contribution

End users must have an incentive to contribute
P2P file sharing systems such as BitTorrent use a tit-for-tat policy to stop free riding
Tit-for-tat is not easy to use in streaming: nodes procure pieces in real time, and applying it can cause delays
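
The share ratio defined on this slide can be computed per node to expose the disparity between heavy uploaders and free riders. The Python sketch below is illustrative only: the traffic volumes, the peer names and the 0.1 threshold used to flag a free rider are assumptions, not values from the slides.

```python
def share_ratio(uploaded, downloaded):
    """Share Ratio = Uploaded Volume / Downloaded Volume."""
    if downloaded == 0:
        # Nothing downloaded: the ratio is undefined, so report infinity
        # for an uploader and 0.0 for an idle node (assumed convention).
        return float("inf") if uploaded > 0 else 0.0
    return uploaded / downloaded


def free_riders(volumes, threshold=0.1):
    """Names of peers whose share ratio is below `threshold` (hypothetical cut-off)."""
    return sorted(
        name for name, (up, down) in volumes.items()
        if share_ratio(up, down) < threshold
    )


# Hypothetical per-peer volumes in MB: (uploaded, downloaded).
volumes = {"a": (900, 300), "b": (10, 300), "c": (300, 300)}

print(share_ratio(*volumes["a"]))  # 3.0
print(share_ratio(*volumes["c"]))  # 1.0
print(free_riders(volumes))        # ['b']
```

Peer `a` uploads three times what it downloads while peer `b` contributes almost nothing, which is the kind of disparity an incentive mechanism is meant to reduce.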


Security

Implicitly affects the other P2P streaming metrics
Mainly 4 types of attacks:

  Malicious insertion of garbled payload
  Free rider: a selfish user who only downloads, with no uploads
  Whitewasher: after being kicked out, comes back with a new identity; such nodes use IP spoofing
  DDoS attack: one or more nodes collectively launch a DoS attack on the media server to bring the system down

There have been many attacks on P2P file sharing systems, but very few on streaming

  The possibility cannot be ruled out


Current Issues

High buffering time

Half a minute for popular streaming channels and around 2 minutes for less popular ones

Some nodes lag behind their peers by more than 2 minutes in playback time

A better peering strategy is needed

Uneven distribution of uplink bandwidths (unfairness)
Huge volumes of cross-ISP traffic

  ISPs use bandwidth throttling to limit bandwidth usage
  This degrades the QoS perceived at the user end

Suboptimal uplink utilization
