Enrico Magli
Dept. of Electronics - Politecnico di Torino (Italy)
enrico.magli@polito.it http://www1.tlc.polito.it/sas-ipl
Credits
Parts of the material used in this course have been inspired by or taken from the work of the following people:
Yao Wang (Polytechnic University, New York, USA)
Bernd Girod (Stanford University, USA)
Dapeng Wu (Univ. of Florida, USA)
Y. Guo, C. Neumann (Thomson Inc.)
D. Purandare (Univ. of Central Florida, USA)
Source coding
Typical
Coding
to reduce the intrinsic redundancy of the source signal to find a more compact signal representation
Notation
Let $n_1$ and $n_2$ be the number of information-carrying units in two data sets that represent the same information
Compression ratio: $CR = n_1 / n_2$
$CR > 1$ when the second data set is the more compact one
Coding redundancy
Data is not equal to information
Data is the means by which information is conveyed
The same story can be told with a different number of words, depending on whether the teller is long-winded or short and to the point
This is the coding redundancy
Coding redundancy
White is the most likely value in this picture
Encoding each pixel with the same number of bits leads to coding redundancy
Variable length coding is the solution to coding redundancy
Coding redundancy
Let $p(r_k) = n_k / n$ be the probability of occurrence of gray level $r_k$, $k = 0, 1, 2, \ldots, L-1$
Let $r_k$ be represented by $l(r_k)$ bits; the average number of bits to represent each pixel is
$L_{ave} = \sum_{k=0}^{L-1} l(r_k)\, p(r_k)$
If
Coding redundancy
It makes sense that fewer bits are assigned to those $r_k$ for which $p(r_k)$ is larger
This achieves data compression, as $L_{ave}$ is lower
Variable length codes are used
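As a small illustration of the average-length formula, the sketch below compares a fixed-length code with a variable-length one. The gray-level probabilities and the code lengths are made-up values chosen for illustration; they do not come from the course material.

```python
# Hypothetical 4-level image: probabilities p(r_k) and code lengths l(r_k).
probs = [0.6, 0.2, 0.1, 0.1]
fixed_lengths = [2, 2, 2, 2]        # natural binary code, same length for all levels
variable_lengths = [1, 2, 3, 3]     # shorter codewords for the likelier levels


def average_length(lengths, probs):
    """L_ave = sum_k l(r_k) * p(r_k), in bits per pixel."""
    return sum(l * p for l, p in zip(lengths, probs))


l_fixed = average_length(fixed_lengths, probs)
l_var = average_length(variable_lengths, probs)
compression_ratio = l_fixed / l_var   # CR = n1 / n2
print(l_fixed, l_var, compression_ratio)
```

With these assumed numbers the variable-length code needs fewer bits per pixel on average, so the compression ratio exceeds 1.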
10
Inter-pixel redundancy
Large areas of the image are uniform
This means correlation among pixels (adjacent pixels are almost the same)
This is not solved by variable length coding (which works on each single pixel)
11
Psychovisual redundancy
Human perception of the information in an image does not involve quantitative analysis of every pixel
Pixel values can be modified up to a given extent without significant subjective degradation
Modifications must involve psychovisually redundant information (not easy to define)
The image is irreversibly altered
12
Psychovisual redundancy
The brain searches for distinguishing features and mentally combines them into objects (recognizable groups of pixels)
Use of prior knowledge for interpretation (face, wall, poster)
If the wall were slightly different, this could not be perceived
13
Data compression
Lossless coding: the compressed signal is equal to the original (no coding errors)
Lossy coding: a controlled amount of error is tolerated, according to human subjective sensing capability
14
Used for data (e.g. PC files)
Sensitive applications (biomedical images, remote sensing data)
Output = input (no losses)
Represents the signal samples (or data codewords) in a more efficient way
Achievable compression ratios in the case of natural images: 2.5-3
15
A certain degree of distortion is accepted between output and input
Distortion should not be apparent
Lost data cannot be recovered
Much larger achievable compression ratios: 50 and more
16
Introduction
Invented by Claude Shannon in the 1940s
Has set the mathematical framework for digital communications
Information Theory teaches us
how to measure information
how to represent information efficiently
how to reliably transmit information across communication channels
17
Measuring information
A discrete memoryless source generates symbols from a set $X$ of $M$ elements (alphabet); each symbol $x_i$ is characterized by its probability of occurrence $p_i$
$X = \{x_i\}_{i=1}^{M}$, with probabilities $\{p_i\}_{i=1}^{M}$
How
18
Properties of information
The
$I(x_j) > I(x_i)$ if $p_j < p_i$
Statistically independent messages:
$P(x_i, x_j) = P(x_i)\, P(x_j)$
$I(x_i, x_j) = I(x_i) + I(x_j)$
19
Definition of information
$I(x_i) = \log_2 \frac{1}{p_i}$
and is measured in bits
Average amount of information carried by the memoryless source: entropy (bits/symbol)
$H(X) = \sum_{i=1}^{M} p_i I(x_i) = \sum_{i=1}^{M} p_i \log_2 \frac{1}{p_i}$
20
Examples
Source 1: $X = \{x_1, x_2, x_3, x_4\}$ with $p_1 = 1/2$, $p_2 = 1/4$, $p_3 = 1/8$, $p_4 = 1/8$
$H(X) = 1.75$ bits/symbol
Source 2: $X = \{x_1, x_2, x_3, x_4\}$ with $p_1 = p_2 = p_3 = p_4 = 1/4$
$H(X) = 2$ bits/symbol
Equiprobable sources carry more information, and are more difficult to compress
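The entropy of the two four-symbol sources in the example can be checked numerically. This is a minimal sketch; the function name `entropy` is just an illustrative choice.

```python
import math


def entropy(probs):
    """H(X) = sum_i p_i * log2(1 / p_i), in bits/symbol (zero-probability symbols are skipped)."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)


h_skewed = entropy([1 / 2, 1 / 4, 1 / 8, 1 / 8])
h_uniform = entropy([1 / 4, 1 / 4, 1 / 4, 1 / 4])
print(h_skewed, h_uniform)
```

The equiprobable source reaches the upper value $\log_2 4 = 2$ bits/symbol, while the skewed one stays below it.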
21
Bounds on Entropy
Theorem:
$H(X) \le \log_2 M$
with equality if the symbols are equiprobable
Example:
$H(X) \le 8$ bit/symbol for an alphabet of $M = 256$ symbols
22
It is possible to code, without any loss of information, a memoryless source with entropy $H(X)$ bits/symbol using $H(X) + \varepsilon$ bits/symbol
$\varepsilon$ is a quantity that can be made arbitrarily small by considering increasingly larger blocks of symbols to be coded
23
The average codeword length of a lossless coder cannot be less than the entropy
Entropy represents the target average number of bits/symbol of a lossless encoder
Coding efficiency: $\eta = H(X) / \bar{n}$, where $\bar{n}$ is the average codeword length
24
if not memoryless
If the source is correlated, the first order entropy does not represent a bound to the average codeword length
All the previous results still hold, replacing the first order entropy with the entropy rate $H_\infty(X)$, which takes into account correlation amongst symbols
$H_\infty(X) = \lim_{N \to \infty} \frac{1}{N} H_N(X)$
where $H_N(X)$ is the entropy of blocks of $N$ symbols
25
Lossless - Introduction
The goal of lossless compression is to minimize the average length of the compressed symbols, exploiting statistical properties of the data:
probability distribution
correlation (redundancy) of the data
No
Advantages of lossy compression:
High compression ratios
Low distortion of the decoded data
Possibility to shape the error between original and decoded data
Disadvantages:
The
27
Image compression: JPEG, JPEG2000
Video compression: MPEG-2, H.263
28
Lossy compression
Consider an i.i.d. discrete-time random process $X$
Main difference with respect to lossless compression: we accept some distortion
we reconstruct $X^* \ne X$
A single letter distortion measure for a length-$m$ data vector is defined as
$\rho_m(x, x^*) = \frac{1}{m} \sum_{j=0}^{m-1} \rho(x_j, x_j^*)$
29
Hamming metric
$\rho(x_i, x_j^*) = 0$ if $x_i = x_j^*$, and $1$ if $x_i \ne x_j^*$
Euclidean metric
$\rho(x_i, x_j^*) = (x_i - x_j^*)^2$
30
Source code
A
Quantizer
Entropy coder
rate R bit/sample
Dequantizer
Entropy decoder
31
Rate-distortion function
Given a desired expected distortion $E[\rho_m(X, X^*)] \le D$, the rate-distortion function $R(D)$ is the minimum rate at which we can guarantee the existence of a source code that represents $X$ with $X^*$, so that $X^*$ is encoded with a rate of $R$ bit/sample and $E[\rho_m(X, X^*)] \le D$
32
maximum distortion
33
In practice, the R-D function is very difficult to compute for realistic sources.
Usually one employs the operational R-D function, which is the set of practically achievable R-D points for a given sample realization of the source and a specific code
34
D
35
Huffman Coding
Problem statement
Given a source $X$ emitting symbols $a_i$ with probability $P(a_i)$
Find a compact representation $c(a_i) = c_i$
Objective
If li
36
Huffman Coding
$\forall a_i, a_j \mid P(a_i) \ge P(a_j) \Rightarrow l_i \le l_j$
Variable length coding (VLC): the shortest codewords are allocated to the most probable symbols
37
Huffman Coding
We must guarantee that the codewords $c(a_i) = c_i$ are uniquely decodable
Huffman coding is based on the idea of prefix-free coding
No codeword can be a prefix of another codeword
For all $c_i, c_j$ with $i \ne j$ and $l_i \le l_j$, $c_i$ does not coincide with the first $l_i$ bits of $c_j$
38
Huffman Coding
cod. 2
0 0 1 1
Prefix free codes are represented with a binary tree where internal nodes do not represent codewords (the codewords are only the leaves of the tree)
39
Huffman Coding
Huffman code construction was proposed in D.A. Huffman, "A method for the construction of minimum-redundancy codes", Proc. of the IRE, 1952
40
Consider a source $X = \{a, b, c, d, e\}$ with $P(a_i) = \{0.4, 0.2, 0.2, 0.1, 0.1\}$
The two least probable symbols are grouped to form a tree node
The sum of the probabilities of the two symbols is attributed to the tree node: $P(\{d,e\}) = P(d) + P(e) = 0.2$
41
The procedure is iterated, considering both the remaining symbols and the created trees
[Figure: Huffman tree for the source {a, b, c, d, e}, with branches labeled 0 and 1 and node probabilities 0.2, 0.4, 0.6, 1.0]
$\bar{l} = \sum_i l_i P(a_i) = 2.2$ bps
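The construction described above (repeatedly merging the two least probable nodes) can be sketched in a few lines. This is a minimal illustration, not the course's reference implementation; ties between equal probabilities are broken by insertion order, which is an arbitrary choice, so individual codeword lengths may differ from the tree drawn on the slide while the average length stays the same.

```python
import heapq
import itertools


def huffman_code(probs):
    """Return {symbol: codeword} for a dict {symbol: probability}."""
    counter = itertools.count()          # tie-breaker, so the dicts are never compared
    heap = [(p, next(counter), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)  # the two least probable nodes
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(counter), merged))
    return heap[0][2]


probs = {"a": 0.4, "b": 0.2, "c": 0.2, "d": 0.1, "e": 0.1}
code = huffman_code(probs)
avg_len = sum(probs[s] * len(w) for s, w in code.items())
print(code, avg_len)
```

For the example source the average codeword length comes out at 2.2 bits per symbol, in agreement with the value on the slide.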
42
Huffman Coding
Coding efficiency
$H(X) \le \bar{l} < H(X) + 1$
Stronger bound
Given $p_M$, the probability of the most probable symbol:
$\bar{l} < H(X) + p_M$, if $p_M \ge 0.5$
43
Huffman Coding
Coding efficiency can be poor for a small alphabet with unbalanced probabilities ($p_M > 0.5$)
Symbol  Probability  Codeword
a       0.8          0
b       0.18         11
c       0.02         10
44
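$H(X) = 0.816$ bps
Coding single symbols: $\bar{l} = 1.2$ bps
Coding pairs of symbols:
Pair  Probability  Codeword
aa    0.64         0
ab    0.144        11
ac    0.016        10101
ba    0.144        100
bb    0.0324       1011
bc    0.0036       10100100
ca    0.016        101000
cb    0.0036       1010011
cc    0.0004       10100101
$\bar{l} = 1.7228$ bits per pair, i.e. $\bar{l} = 0.8614$ bps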
45
Huffman Coding
Extended
$H(X) \le \bar{l} < H(X) + \frac{1}{n}$, where $n$ is the number of symbols coded jointly
46
Simplified VLC
An easy and sub-optimal VLC technique is known as run-length coding
It is based on the assumption that a given symbol is repeated for long runs
Fax, B/W images
The
6,9,6
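A minimal sketch of run-length coding for a binary scan line follows. The example row (6 white, 9 black, 6 white pixels) is an assumption made here so that the run lengths match the "6,9,6" residue above; the original figure is not available.

```python
from itertools import groupby


def run_lengths(row):
    """Lengths of the successive runs of identical symbols in a row."""
    return [len(list(group)) for _, group in groupby(row)]


def expand(first_symbol, runs):
    """Inverse operation for a binary row: symbols alternate between 0 and 1."""
    row, symbol = [], first_symbol
    for n in runs:
        row.extend([symbol] * n)
        symbol = 1 - symbol
    return row


row = [0] * 6 + [1] * 9 + [0] * 6   # hypothetical black-and-white scan line
runs = run_lengths(row)
print(runs)
```

Only the run lengths (plus the value of the first symbol) need to be transmitted; the run lengths themselves can then be entropy coded.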
47
Basics of images
48
49
50
Human Eye
cones rods
51
Human Eye
The range of intensity that we can perceive is impressive (on the order of $10^{10}$)
The HVS cannot operate over such a range simultaneously
Brightness adaptation is used
Brightness discrimination is poor at low levels of illumination (Weber law)
Sensitive to edges (high contrast zones)
52
Colors
Sensing colors
The 7 million cones in the human eye can be divided into 3 categories, able to sense red (R), green (G), blue (B)
RGB color model
53
54
55
56
57
Outline
Introduction
Fourier Transform
DFT
59
Introduction
An image can be described in space or frequency
Spatial frequency: the rate of change of an image
Representation in space domain: picture = collection of brightness levels
Representation in frequency domain: picture = collection of spatial frequency components
60
61
Fourier Transform
62
Fourier Transform
The Fourier Transform is used to decompose an image into sine and cosine components
Used in a wide range of applications: image analysis, filtering, reconstruction and compression
As we are only concerned with digital images, we will only consider the (2D) Discrete Fourier Transform (DFT)
63
Images that are pure cosines have a particularly simple FT
Pure horizontal cosine of 8 cycles and pure vertical cosine of 32 cycles.
The FT just has a single component, represented by 2 bright spots symmetrically placed about the center of the FT image
The center of the image is the origin of the frequency coordinate system.
64
Images of 2D cosines with both horizontal and vertical components: (left) 4 cycles horizontally and 16 cycles vertically; (right) 32 cycles horizontally and 2 cycles vertically
For real images, the FT is symmetrical about the origin, so the 1st and 3rd (2nd and 4th) quadrants are the same
If the image is symmetrical about the x-axis, 4-fold symmetry results.
65
The DFT is the sampled Fourier Transform and therefore does not contain all frequencies forming an image, but only a set of samples which is large enough to fully describe the spatial domain image.
The number of frequencies corresponds to the number of pixels in the spatial domain image, i.e. the images in the spatial and Fourier domains are of the same size
66
Two-dimensional DFT
A square image $x(n,m)$ of size $N \times N$ has the two-dimensional DFT (2-D DFT):
$F(k,l) = \frac{1}{N^2} \sum_{n=0}^{N-1} \sum_{m=0}^{N-1} x(n,m) \exp\left[-j 2\pi \left(\frac{kn}{N} + \frac{lm}{N}\right)\right]$
$F(k,l)$ is obtained by multiplying the image with the corresponding base function and summing the result.
67
Two-dimensional DFT
The base functions are sine and cosine waves with increasing frequencies
$F(0,0)$ represents the DC-component, which corresponds to the average brightness, and $F(N-1,N-1)$ represents the highest frequency.
68
A double sum has to be calculated for each image point. However, because the DFT is separable, it can be written as
$F(k,l) = \frac{1}{N} \sum_{m=0}^{N-1} P(k,m) \exp\left[-j 2\pi \frac{lm}{N}\right]$
$P(k,m) = \frac{1}{N} \sum_{n=0}^{N-1} x(n,m) \exp\left[-j 2\pi \frac{kn}{N}\right]$
69
The spatial domain image is first transformed into an intermediate image using the 1-D DFT applied to the rows
This intermediate image is then transformed into the final image, again using the 1-D DFT applied to the columns
This procedure decreases the number of required computations
Complexity of the 2-D DFT when each 1-D DFT is computed with the FFT: $O(N^2 \log_2 N)$
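The separability property can be checked numerically on a small image. The sketch below uses plain (non-fast) 1-D DFTs from the standard library, so it illustrates the two-pass structure rather than the FFT speed-up; the 4x4 test image and all function names are illustrative assumptions.

```python
import cmath


def dft_1d(seq):
    """Normalized 1-D DFT: (1/N) * sum_n seq[n] * exp(-j*2*pi*k*n/N)."""
    n_pts = len(seq)
    return [sum(seq[n] * cmath.exp(-2j * cmath.pi * k * n / n_pts)
                for n in range(n_pts)) / n_pts
            for k in range(n_pts)]


def dft_2d_separable(img):
    """img[n][m]: first pass along n (gives P(k, m)), second pass along m (gives F(k, l))."""
    size = len(img)
    along_n = [dft_1d([img[n][m] for n in range(size)]) for m in range(size)]
    p = [[along_n[m][k] for m in range(size)] for k in range(size)]
    return [dft_1d(p[k]) for k in range(size)]


def dft_2d_direct(img):
    """Direct evaluation of the double sum with the 1/N^2 normalization."""
    size = len(img)
    return [[sum(img[n][m] * cmath.exp(-2j * cmath.pi * (k * n + l * m) / size)
                 for n in range(size) for m in range(size)) / size ** 2
             for l in range(size)]
            for k in range(size)]


img = [[1, 2, 3, 4], [5, 6, 7, 8], [4, 3, 2, 1], [0, 1, 0, 1]]
f_sep = dft_2d_separable(img)
f_dir = dft_2d_direct(img)
max_diff = max(abs(f_sep[k][l] - f_dir[k][l]) for k in range(4) for l in range(4))
print(max_diff, f_sep[0][0])
```

Both routes give the same coefficients up to rounding, and $F(0,0)$ equals the average brightness of the image.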
70
The DFT produces a complex-valued image
It is displayed with two images, typically magnitude and phase. Usually only the magnitude is displayed
The Fourier domain image has a much greater range than the image in the spatial domain. Hence, its values are usually calculated and stored as float values and represented in log scale
71
The images are horizontal cosines of 8 cycles, differing only by a 1/2 cycle lateral shift Both have the same magnitude spectrum. The phase spectrum would be different, of course.
72
Both amplitude and phase information are relevant for the reconstruction of the image
73
(a)
(b) This image is reconstructed from the frequency domain using amplitude information from (b) and phase information from (a)
74
75
broad range of spatial frequencies significant vertical and horizontal features, due to ribs and vertebral column
76
77
78
The girl looks very similar to the ape except for the hat
79
The first image is all black except for a single pixel wide stripe from the top left to the bottom right The second image is totally random
80
pixels values X
Reversible transform
Quantization
Y
Why do we need to introduce a transform domain?
The objective is to represent the original data X in a new domain Y, more suitable for quantization and coding
81
Quantization
desired bit rate
statistics of the various transformed coefficients
distortion of the reconstructed signal
Entropy coding
Any binary encoding technique (run-length, Huffman, arithmetic, ...)
82
Transform Coding
Transforms
The coefficients in the transformed domain are more suitable for the subsequent quantization operation
In the transformed domain, few coefficients concentrate most of the signal energy
Coefficients are decorrelated, therefore scalar quantization is nearly optimum
83
The Karhunen-Loève Transform (KLT) is also called the Hotelling Transform
The KLT is a data dependent transform
Let $X$ denote a random data vector of length $N$, $m$ be its (vector) mean value and $C$ be its $N \times N$ covariance matrix:
$C = E\{(X - m)(X - m)^T\}$
84
The matrix $C$ is real and symmetric, and hence can be diagonalized using its eigenvectors
The eigenvectors $e_i$ of $C$ are given by
$C e_i = \lambda_i e_i$
where $\lambda_i$ are the corresponding eigenvalues
85
Let us consider a matrix $A$ whose columns correspond to the eigenvectors of $C$, arranged in increasing eigenvalue order
Let us consider the transformation
$Y = A^T (X - m)$
The covariance matrix of $Y$ is diagonal:
$C_Y = E\{Y Y^T\} = E\{A^T (X - m)(X - m)^T A\} = A^T C A = \Lambda$
86
The elements in the transformed domain are uncorrelated
If only the top $K$ coefficients are kept, corresponding to the $K$ largest eigenvalues, the mean square error between the original vector $X$ and its reconstruction from the truncated $Y$ is theoretically minimum
The KLT sets a bound on compression efficiency, but it is computationally very demanding
87
$\alpha(0) = \sqrt{\frac{1}{N}}$
$\alpha(u) = \sqrt{\frac{2}{N}}, \quad u = 1, \ldots, N-1$
88
Inverse DCT
Similarly, the inverse transform is defined as
$f(x) = \sum_{u=0}^{N-1} \alpha(u)\, C(u) \cos\left[\frac{(2x+1) u \pi}{2N}\right], \quad x = 0, 1, \ldots, N-1$
89
2-D DCT
The two-dimensional DCT is obtained by applying the 1-D transform to the rows and columns independently
The corresponding transform is
$C(u,v) = \alpha(u)\, \alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y) \cos\left[\frac{(2x+1) u \pi}{2N}\right] \cos\left[\frac{(2y+1) v \pi}{2N}\right], \quad u, v = 0, 1, \ldots, N-1$
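A direct, unoptimized sketch of these formulas is given below, using the normalization $\alpha(0) = \sqrt{1/N}$, $\alpha(u) = \sqrt{2/N}$. The function names and the flat test block are illustrative assumptions; practical codecs use fast algorithms rather than this plain evaluation.

```python
import math


def alpha(u, n):
    """Normalization factor of the orthonormal DCT."""
    return math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)


def dct_1d(f):
    n = len(f)
    return [alpha(u, n) * sum(f[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              for x in range(n))
            for u in range(n)]


def idct_1d(c):
    n = len(c)
    return [sum(alpha(u, n) * c[u] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                for u in range(n))
            for x in range(n)]


def dct_2d(block):
    """2-D DCT of block[x][y]: 1-D DCT on every row, then on every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]   # transpose back to C[u][v]


signal = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]
restored = idct_1d(dct_1d(signal))

flat = [[100.0] * 8 for _ in range(8)]     # an 8x8 block with one shade of gray
coeffs = dct_2d(flat)
print(coeffs[0][0])
```

For the flat block only the top-left (DC) coefficient is non-zero, which is the energy compaction behaviour the transform is used for.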
90
91
Basis functions of the 8x8 DCT
When it is applied to an 8x8 image, it yields an 8x8 matrix of weighted values corresponding to how much of each basis function is present in the image
An 8x8 image that just contains one shade of gray will yield only a weighted value for the upper left hand DCT basis function (which has no frequencies in the x or y direction).
92
2-D DCT
93
The input data stream must be divided into blocks before applying the transform
Correlation across the block boundaries is not removed
The
94
95
96
97
The top-left basis function represents zero spatial frequency (DC coefficient)
Along the top row the basis functions have increasing horizontal spatial frequency content.
Down the left column the functions have increasing vertical spatial frequency content.
98
2n
99
The DCT can approximate lines well with fewer coefficients
Blocking artifacts less pronounced
Better approximation to the KLT
Used in the JPEG standard
100
DFT (example)
101
Quantization
Goal of quantization: to represent a real number in $(-\infty, +\infty)$ as an integer number, i.e. an element of a discrete and finite set of $2^N$ possible values ($N$-bit quantizer).
Bit rate: $B = N f_s$, where $f_s$ is the sampling frequency
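A minimal sketch of an N-bit uniform quantizer follows. The clipping range `x_max`, the mid-rise reconstruction rule and the 8 kHz sampling rate are assumptions made for the example; they are not specified in the slides.

```python
def uniform_quantize(x, n_bits, x_max):
    """Map a real x to an integer index in [0, 2**n_bits - 1]; inputs outside [-x_max, x_max) are clipped."""
    levels = 2 ** n_bits
    step = 2.0 * x_max / levels
    index = int((x + x_max) // step)
    return min(max(index, 0), levels - 1)


def dequantize(index, n_bits, x_max):
    """Reconstruct the value at the centre of the quantization interval."""
    step = 2.0 * x_max / 2 ** n_bits
    return -x_max + (index + 0.5) * step


n_bits, x_max = 3, 1.0
step = 2.0 * x_max / 2 ** n_bits
index = uniform_quantize(0.3, n_bits, x_max)
value = dequantize(index, n_bits, x_max)

bit_rate = 8 * 8000   # B = N * f_s for an 8-bit quantizer at an assumed 8 kHz
print(index, value, bit_rate)
```

Inside the clipping range the reconstruction error never exceeds half a quantization step.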
102
Uniform Quantization
Quantized signal yi+1 yi
xi xi+1
Original signal
103
Uniform Quantization
104
Quantization techniques
Uniform quantization is (almost) optimal when the input signal is memoryless
Quantization techniques:
Scalar quantization
Non-uniform quantization
Robust quantization
Pdf-optimized quantization (Lloyd-Max)
Entropy-constrained quantization
Vector quantization
105
prediction:
estimate the value of the current pixel $x[n]$ as a linear combination of past pixels: $x^*[n] = a_1 x[n-1] + a_2 x[n-2] + \ldots$
instead of $x[n]$, encode the prediction error $e[n] = x[n] - x^*[n]$
the decoder recovers $x[n] = e[n] + x^*[n]$
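The prediction loop can be sketched as follows for the lossless case (no quantization of the error). The pixel values and the previous-pixel predictor ($a_1 = 1$) are illustrative assumptions, as is the convention that missing past samples count as zero.

```python
def predict(history, coeffs):
    """x*[n] = a1*x[n-1] + a2*x[n-2] + ...; past samples that do not exist count as 0."""
    n = len(history)
    return sum(a * history[n - k] for k, a in enumerate(coeffs, start=1) if n - k >= 0)


def dpcm_encode(x, coeffs):
    """e[n] = x[n] - x*[n]."""
    return [x[n] - predict(x[:n], coeffs) for n in range(len(x))]


def dpcm_decode(errors, coeffs):
    """x[n] = e[n] + x*[n], with the prediction built from already decoded samples."""
    x = []
    for e in errors:
        x.append(e + predict(x, coeffs))
    return x


pixels = [100, 102, 103, 103, 101, 100]      # hypothetical scan line
errors = dpcm_encode(pixels, coeffs=[1])     # previous-pixel predictor
decoded = dpcm_decode(errors, coeffs=[1])
print(errors)
```

After the first sample the prediction errors are small values clustered around zero, which an entropy coder can represent with fewer bits than the original pixels.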
106
+ P(x[n])
e[n]
e[n] +
x[n]
P(x[n])
107
P(x)=aA + bB + cC E=X-P(x)
108
x[n] + e[n] -
+ + H(z)
xs[n]
Pxs
xs[n] + H(z) +
109
International standards
Organizations
ISO (JTC 1 SC 29 WG 01/11)
ITU
Why standards?
Interoperability
111
defines standards:
Companies Academia
Advantages
provides
of using a standard
Disadvantages:
technology
112
Call for proposals
Working draft
Final committee draft
Final Draft International Standard
Final Publication Draft
113
IBM, AT&T, and Mitsubishi for arithmetic coder Forgent for Huffman tables ? baseline algorithm, royalty-free advanced algorithm, with license fees
Goal:
Participants in JPEG are required to accept to provide royalty-free licenses for technology that they bring into the standard, for the baseline version of the algorithm.
114
What is standardized ?
Source data
Multimedia encoder
Multimedia decoder
Syntax
Defined by standard
115
JPEG
baseline lossy compression
extension (hierarchical, progressive)
lossless compression
JPEG-LS
lossless compression
near-lossless compression
JPEG 2000
lossy compression
lossless compression
extensions
116
JPEG
This standardized image compression scheme is designed to work on full-color or gray-scale digital images
JPEG defines a baseline algorithm, plus extensions for Progressive and Hierarchical Coding
It foresees a separate lossless mode (Huffman or Arithmetic coding)
117
118
JPEG
The coding steps:
transformation of the image into a suitable color space
application of an 8x8 block DCT
quantization
zig-zag reading
entropy (lossless) coding
119
JPEG compression
A weighted scalar quantization is applied to each transformed coefficient in every block
Quantized DC values are coded by DPCM from macroblock to macroblock
Zig-zag reordering
Encoding of zero-runs
Entropy coding
120
Divide each entry of the image matrix by the corresponding entry in the quantization matrix
Quality factor to control quality
Contained in the JPEG file, with image information
Flexibility with
$F_q(u,v) = \mathrm{round}[F(u,v) / Q(u,v)]$
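The quantization rule can be sketched as below. The 2x2 coefficient and step values are made up for illustration (real JPEG blocks are 8x8), and rounding halves away from zero is an assumption of this sketch; Python's built-in `round()` rounds halves to even, hence the small helper.

```python
import math


def round_half_away(x):
    """Round to the nearest integer, with halves rounded away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def quantize(F, Q):
    """F_q(u,v) = round[F(u,v) / Q(u,v)], entry by entry."""
    return [[round_half_away(f / q) for f, q in zip(f_row, q_row)]
            for f_row, q_row in zip(F, Q)]


def dequantize(Fq, Q):
    """Decoder side: multiply back by the quantization step."""
    return [[f * q for f, q in zip(f_row, q_row)]
            for f_row, q_row in zip(Fq, Q)]


F = [[-415.0, 30.0], [12.0, -3.0]]   # hypothetical DCT coefficients
Q = [[16, 11], [12, 14]]             # hypothetical quantization steps
Fq = quantize(F, Q)
F_rec = dequantize(Fq, Q)
print(Fq, F_rec)
```

Small coefficients divided by large steps become zero, which is where both the compression and the irreversible loss come from.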
122
Original Block
DCT (rows)
123
DCT (columns)
Quantized DCT
124
Reconstructed block
125
126
127
The zig-zag scanned coefficients are encoded as a sequence of couples of symbols:
Symbol 1: (RUNLENGTH, SIZE)
Symbol 2: (AMPLITUDE)
Runlength: number of zero samples preceding the current sample (0-15 or EOB)
Size: number of quantization bits for the current sample
Amplitude: quantized sample value
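The zig-zag scan and the (RUNLENGTH, SIZE)/(AMPLITUDE) symbols can be sketched as follows. The code covers AC coefficients only and stops before the Huffman stage; the symbol (15, 0) is used for a run of 16 zeros and (0, 0) for end of block, as in baseline JPEG. Function names and the example coefficients are illustrative assumptions.

```python
def zigzag_order(n):
    """(row, col) visiting order of the zig-zag scan for an n x n block."""
    order = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        order.extend(diag if s % 2 else reversed(diag))
    return order


def ac_symbols(ac):
    """Turn zig-zag ordered AC coefficients into ((RUNLENGTH, SIZE), AMPLITUDE) pairs."""
    last = max((i for i, v in enumerate(ac) if v != 0), default=-1)
    symbols, run = [], 0
    for v in ac[:last + 1]:
        if v == 0:
            run += 1
            continue
        while run > 15:                       # (15, 0) stands for 16 zeros
            symbols.append(((15, 0), None))
            run -= 16
        symbols.append(((run, abs(v).bit_length()), v))
        run = 0
    if last < len(ac) - 1:
        symbols.append(((0, 0), None))        # end of block: only zeros follow
    return symbols


order = zigzag_order(3)
symbols = ac_symbols([5, 0, 0, -3, 0, 0, 0])
print(order)
print(symbols)
```

Because quantization zeroes most high-frequency coefficients, the zig-zag order groups the zeros into long runs near the end of the scan, which these symbols represent compactly.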
128
Codestream syntax
The
codestream consists of
Markers
structure:
data
129
JPEG syntax
FFD8 (Start Of Image)
FFE0 (JFIF marker)
FFDB (Define Quantization Table)
FFC4 (Define Huffman Table)
FFC0 (Start Of Frame)
FFDA (Start Of Scan)
FFD0-FFD7 (Restart Markers)
FFFE (Comment)
FFD9 (End Of Image)
130
131
JPEG performance
132
JPEG performance
133
JPEG performance
134
JPEG performance
135
JPEG performance
136
JPEG performance
137
JPEG
Disadvantages:
blocking effect for non-smooth images
image correlation is not removed across block boundaries
only possible dynamic range is 8 or 12 bpp
non-unified version for lossless and lossy compression
Fourier-like basis functions
Poor performance at low bit rate
The use of low bit rate coding algorithms becomes necessary (JPEG 2000)
138
Video coding
Analog video
140
141
142
143
144
145
RGB YCbCr
146
147
148
2D motion estimation
Notation
150
Motion representation
151
152
153
154
155
156
157
158
Bilinear interpolation
159
160
161
162
164
165
166
Temporal prediction
167
168
169
Spatial prediction
170
171
172
173
MB coding in P-mode
174
MB coding in B-mode
175
176
Rate control
177
Loop filtering
178
Scalable coding
180
Bitstream scalability
181
182
183
184
Scalability in MPEG-2
185
186
187
188
189
190
191
192
Motion estimation/compensation
193
194
195
196
197
PB-picture mode
198
199
MPEG-1 overview
200
201
202
MPEG-2 overview
203
204
DCT modes
205
MPEG-2 scalability
206
SNR-scalable encoder
207
Spatially-scalable encoder
208
209
210
MPEG-4 overview
211
Object-based coding
212
213
214
215
Shape-adaptive DCT
216
217
Mesh animation
218
219
220
H.264/AVC
Introduction
Started as ITU recommendation
Now joint ISO and ITU effort (JVT)
ITU H.264/AVC, MPEG-4 Part 10
Targets
bit rate reduction by a factor of 2, at the same quality, with respect to other standards
at the expense of much higher complexity
222
223
H.264/AVC applications
224
225
H.264/AVC structure
226
H.264/AVC profiles
Baseline: core compression capabilities, plus error resilience. Suitable for videoconference, mobile video, ...
Main: high compression and quality (e.g., broadcasting)
Extended: added features for efficient streaming
227
228
Partitioning of a frame
229
230
231
232
Macroblock partitioning
233
234
Macroblock type
Each
INTER
prediction with square blocks (16x16, 8x8, 4x4)
prediction with rectangular blocks (8x16, 16x8, 4x8, 8x4)
Intra prediction
MBs to be coded in Intra mode can be predicted from the already coded MBs in the same slice
(Intra 16x16)
236
237
238
4x4 DCT
4x4 DCT
4x4 DCT
239
Deblocking filter
In-loop filter improves visual quality and PSNR.
The filter in H.264/AVC is very elaborate
slice level
edge level (filtering strength is dependent on coding residuals)
sample level (thresholds allow the filter to be turned off for given pixels)
strong filter for very flat MBs
240
Deblocking filter
241
242
243
Entropy Coding
244
Entropy Coding
Uses exp-Golomb codes for all symbols except transform coefficients
Uses Huffman-like tables for transform coefficients
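As an illustration of the exp-Golomb codes mentioned above, the sketch below implements the unsigned, order-0 variant on bit strings. Handling bits as Python strings is a simplification for readability; the mapping of signed values and of actual syntax elements onto these code numbers is not covered here.

```python
def exp_golomb_encode(value):
    """Unsigned order-0 exp-Golomb codeword for value >= 0, as a bit string."""
    bits = bin(value + 1)[2:]               # binary form of value + 1
    return "0" * (len(bits) - 1) + bits     # prefix of (length - 1) zeros


def exp_golomb_decode(bitstring):
    """Decode one codeword from the start of bitstring; return (value, bits_used)."""
    zeros = len(bitstring) - len(bitstring.lstrip("0"))
    used = 2 * zeros + 1
    return int(bitstring[zeros:used], 2) - 1, used


codewords = [exp_golomb_encode(v) for v in range(5)]
print(codewords)
```

Small values get short codewords, and no table is needed: the number of leading zeros tells the decoder how many bits follow.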
245
CABAC
246
S-pictures
247
248
Rate Allocation
How does one select the optimal coding mode for each MB?
Lagrangian optimization.
For each MB and for each coding mode a cost function is computed.
The mode minimizing the cost function is used for that MB.
This maximizes PSNR for the available bit-rate, at the expense of very high complexity
249
Cost function:
$J = D + \lambda R$
where:
$D$ = distortion using the current options (using SAD)
$R$ = bit-rate using the current options
$\lambda$ = Lagrange parameter (used to set the bit-rate)
250
Given the QP (i.e., the bit-rate), for every possible set of coding parameters (coded block pattern, intra and inter coding modes, reference frame, motion vectors), compute
the distortion $D$ associated to that set of parameters
the rate $R$ associated to that set of parameters
the cost $J = D + \lambda R$ associated to that set of parameters
Select the set of parameters that minimizes $J$
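The selection rule can be sketched in a few lines. The mode names and the (distortion, rate) pairs below are invented for illustration; in a real encoder they would come from actually coding the macroblock in each mode.

```python
def best_mode(candidates, lam):
    """candidates: {mode: (D, R)}; return the mode minimizing J = D + lam * R."""
    return min(candidates, key=lambda m: candidates[m][0] + lam * candidates[m][1])


# Hypothetical (distortion, rate in bits) pairs for one macroblock.
candidates = {
    "SKIP": (900.0, 1),
    "INTER_16x16": (400.0, 40),
    "INTER_8x8": (250.0, 95),
    "INTRA_4x4": (200.0, 160),
}

choice_low = best_mode(candidates, lam=0.5)    # rate is cheap: favour low distortion
choice_mid = best_mode(candidates, lam=5.0)
choice_high = best_mode(candidates, lam=20.0)  # rate is expensive: favour few bits
print(choice_low, choice_mid, choice_high)
```

Increasing the Lagrange parameter moves the decision towards cheaper modes, which is how the parameter steers the bit-rate.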
251
252
253
Data partitioning
The
254
255
Error concealment
It is not normative
INTRA Concealment
257
INTER Concealment
258
Error control
260
End-to-end delay
261
262
263
264
Drift
265
266
267
268
269
270
271
272
273
274
275
Delay-constrained ARQ
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
Video streaming
292
293
294
Outline
295
296
Time-varying delay
297
298
299
300
301
Video compression
302
303
304
305
306
307
The aim is to provide QoS and achieve efficiency for streaming video/audio over the best-effort Internet.
Continuous Media Distribution Services include:
1. Network filtering
2. Application-level multicast
3. Content replication
308
1) Network Filtering
Network filtering aims to maximize video quality during network congestion.
The filter receives the clients' requests and adapts the stream sent by the server accordingly.
on the data plane control plane
309
Frame-dropping filters are used as network filters.
The receiver can change the bandwidth of the media stream by sending requests to the filter to increase or decrease the frame dropping rate.
The receiver continuously measures the packet loss ratio.
310
2) Application-Level Multicast
Application-level multicast is aimed at building a multicast service on top of the Internet.
The media multicast networks can be built from an interconnection of content-distribution networks.
The media multicast networks could support peering relationships at the application level or the streaming-media/content layer.
311
3) Content Replication
1) Mirroring
Mirroring places copies of the original multimedia files on other machines scattered around the Internet.
In this way, clients can retrieve multimedia data from the nearest duplicate server.
Disadvantages: expensive, ad hoc, and slow.
2) Caching
Caching makes local copies of the contents that the clients retrieve.
It is based on the belief that different clients will load many of the same contents.
312
313
Streaming Servers
314
Streaming server
315
Streaming Servers
Streaming servers are required to process multimedia data under timing constraints.
A streaming server typically consists of the following three subsystems:
Communicator
Operating system
Storage system
316
Process Management
The operating system must use real-time scheduling techniques.
There are two basic algorithms:
Earliest deadline first (EDF)
each task is assigned a deadline, and
the tasks are processed in the order of increasing deadlines.
Rate-monotonic scheduling
each task is assigned a static priority according to its request rate.
the higher the rate, the higher the priority
the tasks are processed in the order of priorities.
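The two ordering rules can be contrasted on a toy task set. The task names, deadlines and request rates are invented for the example, and the sketch only shows the ordering criterion, not preemption or admission control.

```python
def edf_order(tasks):
    """tasks: list of (name, deadline); earliest deadline first."""
    return [name for name, _ in sorted(tasks, key=lambda t: t[1])]


def rate_monotonic_order(tasks):
    """tasks: list of (name, request_rate); a higher rate means a higher static priority."""
    return [name for name, _ in sorted(tasks, key=lambda t: -t[1])]


# Hypothetical streams: deadlines in ms, request rates in requests per second.
deadlines = [("audio", 20), ("video", 40), ("subtitle", 500)]
rates = [("video", 25), ("audio", 50), ("subtitle", 2)]

edf = edf_order(deadlines)
rm = rate_monotonic_order(rates)
print(edf, rm)
```

EDF priorities change as deadlines approach, whereas rate-monotonic priorities are fixed once the request rates are known.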
317
Resource Management
Resources in a multimedia server include CPUs, memories, and storage devices.
Resource management involves admission control and resource allocation.
318
File Management
The file system provides access and control functions for file storage and retrieval.
There are two basic approaches:
A file is not scattered across several disks
Files are organized on distributed storage such as disk arrays.
319
Storage System
1) Under data striping schemes, a multimedia file is scattered across multiple disks and the disk array can be accessed in parallel.
An important issue is to balance the load of the most heavily loaded disks, to avoid overload situations while keeping latency small.
320
tape, CD-ROM
Under the hierarchical storage architecture, only a fraction of the total storage is kept on disks while the major remaining portion is kept on a tertiary tape system.
321
Hierarchical Storage
322
Fault tolerance
Fault tolerance ensures uninterrupted service even in the presence of disk failures.
There are two techniques:
Error-correcting (parity-encoding): adding a small storage overhead
Mirroring: incurring at least twice as much storage volume
323
324
325
SP-frames (contd)
326
327
Media Synchronization
328
Media Synchronization
Media synchronization refers to maintaining the temporal relationships within one data stream and between various media streams.
Each component on the transport path affects the data in a different way.
They
329
Intra-stream synchronization
Inter-stream synchronization
Inter-object synchronization
330
A method that is widely used to specify the temporal relations is time-stamping:
At the source, a stream is time-stamped to keep temporal information
At the destination, the application presents the streams according to their temporal relation.
331
to minimize synchronization errors as data is transported from the server to the user. To minimize latencies and jitters
Corrective
Compensations
332
333
334
Network-layer protocol: network addressing (IP)
Transport protocol: end-to-end network transport functions (UDP, TCP, real-time transport protocol (RTP), and real-time control protocol (RTCP))
Session control protocol: defines the messages and procedures to control the delivery of the multimedia data during an established session (RTSP and the session initiation protocol (SIP))
335
336
Transport Protocols
The UDP and TCP protocols support such functions as multiplexing, error control, congestion control, or flow control.
Since TCP retransmission introduces unacceptable delays, UDP is typically employed for streaming applications.
337
RTP is a data transfer protocol, while RTCP is a control protocol.
In an RTP session, participants periodically send RTCP packets to convey feedback on the quality of data delivery and membership information.
338
Time-stamping Sequence
339
services:
QoS
RTCP SDES
Control
340
RTSP supports VCR-like control operations.
It provides means for choosing delivery channels and delivery mechanisms.
It also establishes and controls streams of continuous audio and video media.
341
Session Initiation Protocol
SIP can also create and terminate sessions with one or more participants.
SIP supports user mobility by proxying and redirecting requests to the user's current location.
342
Peer-to-peer networking
Outline
Introduction and Overview
Popular P2P Applications
P2P Video-on-Demand
Conclusions and Future of P2P
344
Unstructured
346
1999:
content
Distributed
Client
join
Peers
search
sends song request to Napster server
Napster server checks song database and returns list of matched peers
[Figure: 1) query and 2) answer, exchanged between the peers and the central index server]
350
retrieval
The requesting peer contacts the peer having the file directly and downloads it
1) request
2) download
peers
351
Napster was the first simple but successful P2P application. Many others followed
P2P File Download Protocols:
1999: Napster
2000: Gnutella, eDonkey
2001: Kazaa
2002: eMule, BitTorrent
352
A peer-to-peer (or P2P) computer network is a network that relies primarily on the computing power and bandwidth of the participants in the network rather than concentrating it in a relatively small number of servers. A pure peer-to-peer network does not have the notion of clients or servers, but only equal peer nodes that simultaneously function as both "clients" and "servers" to the other nodes on the network. This model of network arrangement differs from the client-server model where communication is usually to and from a central server.
Taken from the wikipedia free encyclopedia - www.wikipedia.org
353
download
P2P-Computation
seti@home
Napster,
P2P-Streaming
PPLive,
P2P-Communication
VoIP,
ESM,
P2P-Gaming
P2P-Video-on-
Demand
354
P2P Protocols:
1999: Napster, End System Multicast (ESM)
2000: Gnutella, eDonkey
2001: Kazaa
2002: eMule, BitTorrent
2003: Skype
2004: PPLive
Today: TVKoo, TVAnts, PPStream, SopCast
Next: Video-on-Demand, Gaming
355
No need to provision servers or bandwidth
Each user brings its own resources
E.g. resistant to flash crowds
flash crowd = a crowd of users all arriving at the same time
Resources could
capacity
Everybody
cost)
Homemade
time
P2P-Overlay
Build a graph at the application layer, and forward packets at the application layer
It is a virtual graph
Underlying
user
Edges are TCP connections, or simply an entry holding a neighboring node's IP address
The graph has to be continuously maintained (e.g. check if nodes are still alive)
358
P2P-Overlay (contd)
Overlay
Source
Underlay
Source
359
p2p-overlays
p2p-overlays
Distributed
Hash Tables (DHTs) Used for node localization, content download, streaming
360
Unstructured p2p-overlays
Unstructured
Peers
9 9
Build
Several
proposals
361
Unstructured p2p-overlays are just a framework; you can build many applications on top of it
Unstructured p2p-overlays pros & cons
Pros
Very flexible: copes with node churn
Supports complex queries (unlike structured overlays)
Cons
Content search is difficult: there is a tradeoff between generated traffic (overhead) and the horizon of the partial view
Skype BitTorrent
362
Upload Notify
Limited scope: send only to a subset of your neighbors
Time-To-Live: limit the number of hops per message
363
BitTorrent - Components
In the initial version of BitTorrent, a torrent is composed of:
A single content
The content is cut down into pieces
Pieces are cut down into blocks, which are the transmission units between peers
The protocol only accounts for transferred pieces: partially received pieces cannot be served by a peer
the list of all peers participating in accessing or serving the file
the list of all pieces of the file, and their respective hash values
Many Leechers
BitTorrent Peer-set
Peer-set
Peer-set construction
Each peer (seed or leecher) contacts the tracker and gets a list of peers participating in the same session
Typically 50 peers are chosen at random by the tracker for each peer
The peer-set is augmented by peers connecting directly to you
The peer-set size is limited to 80 peers
BitTorrent - Algorithms
Two components in BitTorrent downloading algorithm:
Peer Selection: determines from whom to download a piece
Piece Selection: determines which piece to download
Tit for tat: based on the English saying meaning "equivalent retaliation" (derived from "tip for tap"), an agent using this strategy will respond in kind to a previous opponent's action. If the opponent previously was cooperative, the agent is cooperative. If not, the agent is not. This strategy depends on the following conditions, which have allowed it to become the most prevalent strategy for the Prisoner's Dilemma:
1. Unless provoked, the agent will always cooperate
2. If provoked, the agent will retaliate
3. The agent is quick to forgive
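The three conditions can be illustrated with a short simulation of the iterated Prisoner's Dilemma. This is a sketch of the abstract game strategy only, not of BitTorrent's choke algorithm; `always_defect` and `defect_once` are hypothetical opponents added for the demonstration.

```python
COOPERATE, DEFECT = "C", "D"

def tit_for_tat(opponent_history):
    """Cooperate first, then repeat the opponent's previous move."""
    if not opponent_history:
        return COOPERATE          # condition 1: cooperate unless provoked
    return opponent_history[-1]   # conditions 2 and 3: retaliate, then forgive

def always_defect(opponent_history):
    return DEFECT

def defect_once(opponent_history):
    """Hypothetical opponent: defects in the second round only."""
    return DEFECT if len(opponent_history) == 1 else COOPERATE

def play(strategy_a, strategy_b, rounds):
    """Each strategy sees only the other player's past moves."""
    moves_a, moves_b = [], []
    for _ in range(rounds):
        a = strategy_a(moves_b)
        b = strategy_b(moves_a)
        moves_a.append(a)
        moves_b.append(b)
    return moves_a, moves_b

print(play(tit_for_tat, defect_once, 4)[0])  # ['C', 'C', 'D', 'C']
```

Against `defect_once` the agent retaliates exactly once and then returns to cooperation, which is the "quick to forgive" behaviour.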
Discover currently unused connections that are better than the ones being used
Corresponds to always cooperating on the first move in the prisoner's dilemma
first piece
Applies if the leecher has downloaded fewer than 4 pieces (chunks)
Choose the next piece to download at random
Lets you download your first pieces quickly, so that you have pieces to reciprocate with in the choke algorithm
Determine the pieces that are rarest among your peers and download those first
Ensures that the most common pieces are left until the end of the download
Rarest first also ensures that a large variety of pieces is downloaded from the seed
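The random-first and rarest-first policies can be combined in one selection function, sketched below. The threshold of 4 pieces comes from the slides; the data layout (sets of piece indices per peer) and the name `choose_piece` are assumptions made for the example, and real clients add further policies not shown here.

```python
import random
from collections import Counter

RANDOM_FIRST_THRESHOLD = 4  # from the slides: applies below 4 downloaded pieces

def choose_piece(have, peer_pieces, rng=random):
    """Pick the next piece index to request, or None if nothing is useful.

    have        -- set of piece indices already downloaded
    peer_pieces -- {peer: set of piece indices that peer holds}
    """
    availability = Counter()
    for pieces in peer_pieces.values():
        availability.update(pieces - have)  # count only pieces we still need
    if not availability:
        return None
    candidates = sorted(availability)
    if len(have) < RANDOM_FIRST_THRESHOLD:
        return rng.choice(candidates)       # random first piece
    rarest = min(availability.values())
    # rarest first; ties are broken at random
    return rng.choice([p for p in candidates if availability[p] == rarest])

peers = {"A": {0, 4, 5}, "B": {4, 5, 6}, "C": {4}}
print(choose_piece({0, 1, 2, 3}, peers))  # piece 6 is held by one peer only
```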
BitTorrent - Summary
Tit-for-tat
Steady state very stable and efficient
Startup phase still unstable with some inefficiencies
Is there an advantage of deploying BitTorrent on Set-Top Boxes?
Is BitTorrent adapted to mobile terminals/DTN networks?
Possible usage of network coding?
Skype Overlay
Protocol reuses concepts of the FastTrack overlay used by KaZaA
Builds upon an unstructured overlay
Combines
distributed index servers
a flat unstructured network among index servers
Super Nodes (SN)
Ordinary Nodes (ON)
Connect to each other, building a flat unstructured overlay (similar to the Gnutella overlay)
Connect to Super Nodes that act as a directory server (similar to the index server in Napster)
Only central component
Stores and verifies usernames and passwords
Stores the buddy list
[Figure legend: SN, ON, neighbor relationship]
Some
coded
Super
These
a Super Node
Contact your buddies (through the Super Node), and notify them of your presence
If
node search to neighboring Super Nodes Not clear how this is implemented
Enables
Alice
Bob
Alice
High bandwidth, public IP address, but details not clear
Highly dynamic
Churn
Session time
Skype - Summary
VoIP
Skype
node churn
Protocol
NBC Universal goes peer-to-peer (worldmedia.com)
BitTorrent raised $8.75 million in venture capital
Startups providing P2P live programs: pplive, coolstreaming
BBC Legal Download Platforms: iMP / Kontiki
Allow users in UK to download BBC TV and radio programs via a program guide for up to 7 days after broadcast
Microsoft is active
Peer-to-Peer library
Acquisition of Groove
Avalanche
RedCarpet
P2P Windows update
Google
Faces mounting costs with video
Google video is online
Bought YouTube
Bought a stake in Chinese p2p company Xunlei Network Technology
Apple
iTunes changed the world of music
Will it change the world of video?
device requirement
memory, and disk space requirement Platforms supported Internet connection requirement
Three categories of p2p application
File downloading
BitTorrent already on some Set-Top Boxes and DSL routers
Skype mobile phones
Not yet
Voice
Video
P2P?
What benefits does p2p offer over mobile devices?
What are potential issues?
P2P
???
Other
???
Opportunistically use all available technologies!
Access knowledge and resources of devices you cross in the street
GSM
What is currently the best place to find a cab?
What are the results of yesterday's soccer match?
p2p communication even if there is no direct path between two peers at a given moment in time
Fights between legal and illegal content sharing will continue
More p2p used in commercial environments
More intelligent sharing
More scalable
Handle churn better
Competing with other technology
YouTube
[Figure: chart with axis in PB/month]
What is IPTV?
IPTV (Internet Protocol Television) is a system where a digital television service is delivered by using Internet Protocol over a network infrastructure. IPTV is typically supplied by a service provider using a closed network infrastructure. This closed network approach is in competition with the delivery of TV content over the public Internet, called Internet Television. In businesses, IPTV may be used to deliver television content over corporate LANs.
The term P2PTV refers to peer-to-peer (P2P) software applications designed to redistribute video streams in real time on a P2P network. The distributed video streams are typically TV channels from all over the world but may also come from other sources.
Joost
Joost is a system for distributing TV shows and other forms of video using P2PTV technology
Created by the founders of Skype and Kazaa
Has signed up more than a million beta testers and is on track for an end-of-year launch
Uses H.264 video coding
Introduction
BitTorrent-like P2P models are suitable for bulk file transfer
P2P file sharing has no issues like QoS:
No need to play back the media in real time
Downloading takes a long time, many users do it overnight
Introduction Contd.
Quality of Service
Degrades QoS
1 hour of video encoded at 300 Kbps = 128.7 MB
Serving 1000 users would require 125.68 GB
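The two figures above can be checked with a few lines of arithmetic. The sketch assumes that 300 Kbps means 300,000 bit/s and that MB and GB are binary units (2^20 and 2^30 bytes); under those assumptions the slide's numbers are reproduced to within rounding.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

def stream_size_bytes(bitrate_bps, duration_s):
    """Bytes needed to carry a constant-bitrate stream for the given duration."""
    return bitrate_bps * duration_s / 8

one_hour = stream_size_bytes(300_000, 3600)   # 135,000,000 bytes
per_user_mib = one_hour / MIB                 # about 128.7
total_gib = 1000 * one_hour / GIB             # about 125.7
server_uplink_mbps = 1000 * 300_000 / 1e6     # 300 Mbit/s sustained for live delivery

print(round(per_user_mib, 1), round(total_gib, 1), server_uplink_mbps)
```

The last line is the more pressing constraint for live streaming: a pure client-server design must sustain the aggregate rate for all viewers at once.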
Peers form an overlay of nodes on top of the Internet
Nodes in the overlay are connected by direct paths (virtual or logical links); in reality, they are connected by many physical links in the underlying network
Nodes offer their uplink bandwidth while downloading and viewing the media content
Takes load off the server
Scalable
P2P Sharing
Content Distribution Tool
[Figure: a server and peers 1 to 5]
Major Approaches
Major approaches
Expensive
Only large infrastructure can afford
Not scalable
Alternate to IP Multicast
Peer-to-Peer Based
Most viable and simple to use and deploy
No setup cost
Scalable
CDN nodes are deployed in multiple locations, often over multiple backbones
These nodes cooperate with each other to satisfy an end user's request
A user request is sent to the nearest CDN node, which has a cached copy
QoS improves as the end user receives the best possible connection
Yahoo mail uses Akamai
Peer-to-Peer
[ESM, Narada]
Very sparse deployment of IP Multicast due to technical and administrative reasons
In ALM (Application Layer Multicast):
Multicasting is implemented at end hosts instead of network routers
Nodes form unicast channels or tunnels between them
Overlay construction algorithms at end hosts can be more easily applied
End hosts need a lot of bandwidth
Simple to use
Ineffective in case of churn and node failures, as it incurs high recovery time
ALM Methodologies
Tree Based
Content flows from the server to the nodes in a tree-like fashion: every node forwards the content to its children, which in turn forward it to their children
One point of failure for a complete subtree
High recovery time
Notable tree-based approaches: NICE, SpreadIT, ZigZag
Mesh Based
Overcomes tree-based flaws
Nodes maintain state information of many nodes
High control overhead
Notable mesh-based approaches include Narada and ESM from CMU
Design flaws in ALM led to current-day P2P streaming models based on chunk-driven technology
Media content is broken down into small pieces and disseminated in the swarm
Neighboring nodes use a gossip protocol to exchange buffer information
Nodes trade the pieces they are missing
Robust and scalable
Most noted approach in recent years: CoolStreaming
PPLive, SopCast, Feidian, TVAnts are derivatives of CoolStreaming
Proprietary, and working philosophy not published
Reverse-engineered, and measurement studies released
CoolStreaming
The file is chopped by the server and disseminated in the swarm
A node, upon arrival, obtains a peer list of 40 nodes from the server
Nodes contact these nodes for media content
In steady state, every node typically has 4-8 neighbors; it periodically shares its buffer content map with its neighbors
Nodes exchange the content they are missing
Real-world deployed and highly successful system
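The buffer map exchange can be sketched as follows. A buffer map is modelled as a list of 0/1 flags, one per chunk; the scheduling rule used here (chunks with the fewest suppliers first, then the least-loaded supplier) is a simplified heuristic assumed for illustration and is not claimed to be CoolStreaming's actual scheduler. All function names are hypothetical.

```python
def missing_from_neighbor(my_map, neighbor_map):
    """Indices of chunks the neighbor holds and we lack."""
    return [i for i, (mine, theirs) in enumerate(zip(my_map, neighbor_map))
            if theirs and not mine]

def schedule_requests(my_map, neighbor_maps):
    """Return {chunk index: neighbor to request it from}."""
    suppliers = {}
    for name, bmap in neighbor_maps.items():
        for idx in missing_from_neighbor(my_map, bmap):
            suppliers.setdefault(idx, []).append(name)
    load = {name: 0 for name in neighbor_maps}
    plan = {}
    # Chunks with the fewest potential suppliers are the hardest to get: do them first.
    for idx in sorted(suppliers, key=lambda i: (len(suppliers[i]), i)):
        chosen = min(suppliers[idx], key=lambda n: (load[n], n))  # spread the load
        plan[idx] = chosen
        load[chosen] += 1
    return plan

my_map = [1, 1, 0, 0, 0]
neighbor_maps = {"a": [1, 1, 1, 1, 0], "b": [0, 1, 0, 1, 1]}
print(schedule_requests(my_map, neighbor_maps))
```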
Metrics
Quality of Service
Jitter-less transmission
Low end-to-end latency
Uplink utilization
High uplink throughput leads to scalable P2P systems
Churn, node failure or departure should not affect QoS
Scalability
Fairness
Determined in terms of content served (Share Ratio)
No user should be forced to upload much more than it has downloaded
Implicitly affects above metrics
Security
Quality of Service
Most important metric
Jitter: unavailability of stream content at play time causes jitter
Jitter-less transmission ensures good media playback
Continuous supply of stream content ensures no jitter
Latency: difference in time between playback at the server and at the user
Lower latency keeps users interested
A live event, e.g. a soccer match, loses its appeal at crucial moments if the transmission is delayed
Uplink Utilization
Uplink is the scarcest and most important resource in the swarm
The sum of the uplinks of all nodes is the load taken off the server
Utilization = Uplink used / Uplink available
Needs effective node organization and topology to maximize uplink utilization
High uplink throughput means more bandwidth in the swarm and hence leads to scalable P2P systems
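The utilization formula and the "load taken off the server" can be computed directly. The sketch below assumes per-node uplink figures in Kbps and a constant stream rate; the function names and sample numbers are made up for the example.

```python
def uplink_utilization(used_kbps, available_kbps):
    """Utilization = uplink used / uplink available, summed over all nodes."""
    total_available = sum(available_kbps)
    if total_available == 0:
        return 0.0
    return sum(used_kbps) / total_available

def server_load_kbps(num_viewers, stream_kbps, used_kbps):
    """Demand left for the server once peer uploads are subtracted."""
    demand = num_viewers * stream_kbps
    return max(0, demand - sum(used_kbps))

used = [200, 100, 0, 300]        # Kbps each node actually uploads
available = [400, 200, 100, 300] # Kbps each node could upload

print(uplink_utilization(used, available))  # 0.6
print(server_load_kbps(4, 300, used))       # 600
```

In this toy swarm the peers cover half of the 1200 Kbps demand; raising utilization towards 1.0 would shrink the server's share further.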
A robust and reliable P2P system should be able to provide an acceptable level of QoS under the following conditions:
Affects QoS
Efficient peering techniques and node topology ensure robust and reliable P2P networks
Scalability
Serve as many users as possible with an acceptable level of QoS
An increasing number of nodes should not degrade QoS
An effective overlay node topology and high uplink throughput ensure scalable systems
Fairness
Many nodes upload a huge volume of content
Many nodes get a free ride with no or very little contribution
There must be an incentive for an end user to contribute
P2P file sharing systems like BitTorrent use a tit-for-tat policy to stop free riding
Not easy to use in streaming, as nodes procure pieces in real time and applying tit-for-tat can cause delays
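Fairness is described earlier in terms of share ratio. A minimal sketch, assuming share ratio means bytes uploaded divided by bytes downloaded, shows how heavy uploaders and free riders could be told apart. The thresholds `low` and `high` are hypothetical values picked for the example, not figures from the slides.

```python
def share_ratio(uploaded_bytes, downloaded_bytes):
    """Uploaded / downloaded; infinite if a node uploads without downloading."""
    if downloaded_bytes == 0:
        return float("inf") if uploaded_bytes > 0 else 0.0
    return uploaded_bytes / downloaded_bytes

def classify(uploaded_bytes, downloaded_bytes, low=0.1, high=2.0):
    """Label a node by its share ratio (thresholds are illustrative only)."""
    ratio = share_ratio(uploaded_bytes, downloaded_bytes)
    if ratio < low:
        return "free rider"
    if ratio > high:
        return "overloaded"
    return "balanced"

for up, down in [(0, 100), (100, 100), (300, 100)]:
    print(up, down, classify(up, down))
```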
Security
Malicious garbled payload insertion
Free rider: selfish user who only downloads, with no uploads
Whitewasher: after being kicked out, comes back with a new identity. Such nodes use IP spoofing
DDoS attack: one or more nodes collectively launch a DoS attack on the media server to bring the system down
Many attacks on P2P file sharing systems but very few on streaming
Current Issues
Half a minute for popular streaming channels and around 2 minutes for less popular ones
Some nodes lag behind their peers by more than 2 minutes in playback time
Uneven distribution of uplink bandwidths (unfairness)
Huge volumes of cross-ISP traffic
ISPs use bandwidth throttling to limit bandwidth usage
Degrades QoS perceived at the user end