
MPEG-2

VIDEO COMPRESSION
TECHNIQUE

By
PRATEEK RAJ GAUTAM
HBTI KANPUR
Abstract

MPEG-2 is an extension of the MPEG-1 international standard
for digital compression of audio and video signals. MPEG-1
was designed to code progressively scanned video at bit
rates up to about 1.5 Mbit/s for applications such as CD-i
(Compact Disc Interactive). MPEG-2 is directed at broadcast
formats at higher data rates; it provides extra algorithmic
'tools' for efficiently coding interlaced video, supports a
wide range of bit rates and provides for multichannel
surround sound coding. This paper introduces the principles
used for compressing video according to the MPEG-2 standard
and outlines the compression techniques.


Introduction

The MPEG committee began its life in late 1988 by the hand
of Leonardo Chiariglione and Hiroshi Yasuda with the
immediate goal of standardizing video and audio for compact
discs. Over the next few years, participation amassed from
international technical experts in the areas of Video,
Audio, and Systems, reaching over 200 participants by 1992.

By the end of the third year (1990), a syntax emerged which,
when applied to code SIF video and compact disc audio sample
rates at a combined coded bit rate of 1.5 Mbit/s,
approximated the perceptual quality of consumer video tape
(VHS). After demonstrations proved that the syntax was
generic enough to be applied to bit rates and sample rates
far higher than the original primary target application, a
second phase (MPEG-2) was initiated within the committee to
define a syntax for efficient representation of broadcast
video. Efficient representation of interlaced (broadcast)
video signals was more challenging than the progressive
(non-interlaced) signals coded by MPEG-1. Similarly, MPEG-1
audio was capable of directly representing only two channels
of sound. MPEG-2 would introduce a scheme to decorrelate
multichannel discrete surround sound audio.

The need for a third phase (MPEG-3) was anticipated in 1991
for High Definition Television, although it was discovered
during late 1992 and 1993 that the MPEG-2 syntax simply
scaled with the bit rate, obviating the third phase. MPEG-4
was launched in late 1992 to explore the requirements of a
more diverse set of applications, while finding a more
efficient means of coding low bit rate/low sample rate
video.

Today, MPEG (video and systems) is the exclusive syntax of
the United States Grand Alliance HDTV specification, the
European Digital Video Broadcasting Group, and the high
density compact disc (led by rivals Sony/Philips and
Toshiba). MPEG-2, the work of the Moving Picture Experts
Group (MPEG), is dismissed by many as inappropriate for
digital cinema since it is often viewed at high compression
ratios in low bit-rate applications. But MPEG-2 is
fundamentally a rich set of compression tools, with
capabilities that are not made available by the commonly
defined profiles and levels. By looking deeper than the
usual implementations of the standard it is possible to find
enhancements to enable the high picture quality required by
the digital cinema application. Enhancements in constant
quality rate control, color space, and bit depth are
possible while still adhering to the basic MPEG-2 bit-stream
specification. The enhancements work together with the same
silicon devices that are used in larger markets, allowing
digital cinema to take advantage of the beneficial
price/performance ratio in the compression and playback
systems.

Need For Compression: Video is a sequence of pictures, each
consisting of an array of pixels. Uncompressed video is
huge. For example, video with CCIR-601 parameters
(720 pixels x 480 pixels x 30 frames/s) has a data rate of
about 165 Mbit/s. This data rate is too high for user-level
applications and is a big problem for the CPU and for
communication. To deal with this problem, video compression
is used to reduce the size. There are two kinds of
compression method: lossless and lossy. Lossless methods
such as Huffman, arithmetic and LZW coding do not work well
for video, since the pixel values are distributed over a
wide range.
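The 165 Mbit/s figure can be checked with a short calculation. The sketch below assumes 4:2:2 sampling with 8-bit samples, i.e. an average of 16 bits per pixel; that sampling format is an assumption made here, since the text only gives the frame size and frame rate. The function name is illustrative.

```python
def raw_bit_rate(width, height, frames_per_second, bits_per_pixel):
    """Return the uncompressed video bit rate in bits per second."""
    return width * height * frames_per_second * bits_per_pixel


# Assumption: 4:2:2 sampling with 8-bit samples, i.e. 8 bits of luma plus
# an average of 8 bits of chroma per pixel (16 bits per pixel in total).
BITS_PER_PIXEL_422 = 16

rate = raw_bit_rate(720, 480, 30, BITS_PER_PIXEL_422)
print(f"{rate / 1e6:.1f} Mbit/s")  # 165.9 Mbit/s
```

Under these assumptions the result is 165,888,000 bit/s, which agrees with the figure of about 165 Mbit/s quoted above.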

Compression Capabilities of MPEG-2: MPEG-2 provides a way
to compress this digital video signal to a manageable bit
rate. The compression capability of MPEG-2 video
compression is shown in Table 1 below.

Therefore the higher the picture quality for a given

Table 1: Summary of compression capabilities


Because the MPEG-2 standard provides good compression using
standard algorithms, it has become the standard for digital
TV. It has the following features:
• Full-screen interlaced and/or progressive video
(for TV and Computer displays)
• Enhanced audio coding (high quality, mono, stereo, and
other audio features)
• Transport multiplexing (combining different MPEG
streams in a single transmission stream)
• Other services (GUI, interaction, encryption, data
transmission, etc)
The list of systems which now (or will soon) use MPEG-2 is
extensive and continuously growing: digital TV (cable,
satellite and terrestrial broadcast), Video on Demand,
Digital Versatile Disc (DVD), personal computing, card
payment, test and measurement, etc.

The MPEG-2 video compression algorithm achieves very high
rates of compression by exploiting the redundancy in video
information. MPEG-2 removes both the temporal redundancy and
the spatial redundancy which are present in motion video.

Temporal redundancy arises when successive frames of video
display images of the same scene. It is common for the
content of the scene to remain fixed or to change only
slightly between successive frames.

Spatial redundancy occurs because parts of the picture
(called pels) are often replicated (with minor changes)
within a single frame of video.

Clearly, it is not always possible to compress every frame
of a video clip to the same extent - some parts of a clip
may have low spatial redundancy (e.g. complex picture
content), while other parts may have low temporal
redundancy (e.g. fast moving sequences). The compressed
video stream is therefore naturally of variable bit rate,
whereas transmission links frequently require fixed
transmission rates. The key to controlling the transmission
rate is to order the compressed data in a buffer in order
of decreasing detail. Compression may then be performed by
selectively discarding some of the information. The impact
on overall picture quality is minimised by throwing away
the most detailed information while preserving the less
detailed picture content. This limits the overall bit rate
with minimal impairment of picture quality. The basic
operation of the encoder is shown below:
Basic Operation of an MPEG-2 Encoder
MPEG-2 includes a wide range of compression mechanisms. An encoder must
therefore decide which compression mechanisms are best suited to a
particular scene or sequence of scenes. In general, the more
sophisticated the encoder, the better it is at selecting the most
appropriate compression mechanism and transmission bit rate. MPEG-2
decoders also come in various types and have varying capabilities
(including the ability to handle high quality video and the ability to
cope with errors) and connection options.
Block diagram of encoder and decoder:
Most common implementations of MPEG-2 are designed to work with some
fixed bandwidth distribution channel. The 19.4 Mb/s payload of the ATSC
digital television transmission standard is one example. These
implementations apply a constant bit-rate control algorithm to the
compression engine, to make sure that every picture can be delivered
through the channel at the correct time. This type of rate control
necessarily causes the picture quality after compression to vary from
scene to scene.
In digital cinema, the priority is for consistent picture
quality from the first image to the last, before any requirement for
fixed bandwidth transmission. Compression for digital cinema should use a
variable bit-rate, constant quality mechanism for rate control.
In fact, constant quality rate control is inherent to the
basic set of compression tools of MPEG-2, listed in Figure 1. These
operations naturally result in complex pictures being allocated more
bits, and simple pictures less. The common practice to achieve a constant
bit-rate involves adding a layer of control over these tools to monitor
compressed picture sizes and adjust quantization for each picture. In
compression for digital cinema, this control layer is disabled. The
following paragraphs show how the basic MPEG-2 compression tools result
in constant quality encoding.
The DCT transforms the image data from the spatial domain
to the frequency domain. As an example, the block of image data in Figure
2a,b is transformed to the DCT coefficients in Figure 2c.

FIGURE 2: Block discrete cosine transform: (a) image block; (b) image
block with luma represented as height; (c) DCT coefficients. The dc term
is in the front corner.

At this stage no information from the original image data has been lost;
taking the inverse DCT of the coefficients in Figure 2c exactly
reproduces the original source data. The DCT coefficients are all signed
11-bit integers except for the dc term, which is unsigned, up to 11 bits.
The advantage of the DCT transform is that most of the
coefficients are zero, and many of the rest are small values. In the
subsequent variable length coding operation, small values translate to
short codes and zero values are run-length coded. It may seem reasonable
to omit quantizing the DCT coefficients altogether and apply the run-
length/variable-length codes on the DCT coefficients directly. The result
is essentially lossless compression with about 2X compression ratio.
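The two properties described above (the transform is reversible, and most coefficients of a smooth block are zero or small) can be illustrated with a small sketch. It uses a plain, unoptimised orthonormal 8x8 DCT written with the standard library only; the function names and the smooth test block are illustrative assumptions, not taken from the standard or from Figure 2.

```python
import math

N = 8  # block size used by the transform


def _scale(k):
    """Normalisation factor that makes the transform orthonormal."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def _cos(n, k):
    """Cosine basis value for sample position n and frequency k."""
    return math.cos((2 * n + 1) * k * math.pi / (2 * N))


def dct_2d(block):
    """Forward 2-D DCT of an N x N block given as a list of rows."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += block[x][y] * _cos(x, u) * _cos(y, v)
            out[u][v] = _scale(u) * _scale(v) * total
    return out


def idct_2d(coeffs):
    """Inverse 2-D DCT, returning floating-point sample values."""
    out = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            total = 0.0
            for u in range(N):
                for v in range(N):
                    total += (_scale(u) * _scale(v) * coeffs[u][v]
                              * _cos(x, u) * _cos(y, v))
            out[x][y] = total
    return out


# Hypothetical test data: a smooth intensity gradient.
block = [[100 + 2 * x + 3 * y for y in range(N)] for x in range(N)]
coeffs = dct_2d(block)
restored = [[round(v) for v in row] for row in idct_2d(coeffs)]
near_zero = sum(1 for row in coeffs for c in row if abs(c) < 0.5)

print("round trip exact:", restored == block)
print("coefficients that round to zero:", near_zero, "of", N * N)
```

For this smooth block the energy is concentrated in the dc term and a few low-frequency coefficients, which is what makes the subsequent run-length and variable-length coding effective.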
Picture Types:
The MPEG standard specifically defines three types of pictures:
1 Intra Pictures (I-Pictures)
2 Predicted Pictures (P-Pictures)
3 Bidirectional Pictures (B-Pictures)
These three types of pictures are combined to form a group of
pictures.

Intra pictures, or I-pictures, are coded using only information present
in the picture itself, and provide potential random access points into
the compressed video data. They use only transform coding and provide
moderate compression. Typically they use about two bits per coded pixel.
Predicted Pictures
Predicted pictures, or P-pictures, are coded with respect to
the nearest previous I- or P-picture. This technique is called forward
prediction and is illustrated in the figure above.
Like I-pictures, P-pictures can also serve as a prediction reference for
B-pictures and future P-pictures. Moreover, P-pictures use motion
compensation to provide more compression than is possible with
I-pictures.
Bidirectional Pictures
Bidirectional pictures, or B-pictures, are pictures that use both a past
and a future picture as a reference. This technique is called
bidirectional prediction. B-pictures provide the most compression since
they use both the past and the future picture as references; however,
they also require the most computation time.

Method of Encoding Pictures


Intra Pictures
The MPEG transform coding algorithm includes the following steps:
1. Discrete cosine transform (DCT)
2. Quantization
3. Run-length encoding
Both image blocks and prediction-error blocks have high spatial
redundancy. To reduce this redundancy, the MPEG algorithm transforms 8x8
blocks of pixels or 8x8 blocks of error terms from the spatial domain to
the frequency domain with the discrete cosine transform (DCT).
The combination of DCT and quantisation results in many of
the frequency coefficients being zero, especially the coefficients for
high spatial frequencies. To take maximum advantage of this, the
coefficients are organized in a zigzag order to produce long runs of
zeros. The coefficients are then converted to a series of run-amplitude
pairs, each pair indicating a number of zero coefficients and the
amplitude of a non-zero coefficient. These run-amplitude pairs are then
coded with a variable-length code (Huffman encoding), which uses shorter
codes for commonly occurring pairs and longer codes for less common
pairs.
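The zigzag scan and the conversion to run-amplitude pairs can be sketched as follows. This is a minimal illustration under simplifying assumptions: it uses the conventional zigzag order, treats the dc term like any other coefficient, and leaves trailing zeros to an implied end-of-block marker. The function names and the sample block are hypothetical, and the variable-length (Huffman) stage is not shown.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Walk the anti-diagonals in turn; odd diagonals run downwards
        # (row increasing), even diagonals run upwards (column increasing).
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )


def run_amplitude_pairs(block):
    """Convert a quantised block into (zero_run, amplitude) pairs."""
    pairs = []
    run = 0
    for r, c in zigzag_order(len(block)):
        value = block[r][c]
        if value == 0:
            run += 1          # extend the current run of zeros
        else:
            pairs.append((run, value))
            run = 0
    return pairs              # trailing zeros are left to an end-of-block code


# Hypothetical quantised block: a few low-frequency values, zeros elsewhere.
quantised = [[0] * 8 for _ in range(8)]
quantised[0][0] = 12
quantised[0][1] = 5
quantised[2][0] = -3
quantised[1][2] = 1

print(run_amplitude_pairs(quantised))  # [(0, 12), (0, 5), (1, -3), (3, 1)]
```

Only four pairs are needed to describe all 64 coefficients of this block, which shows why long runs of zeros are valuable to the entropy coder.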
Some blocks of pixels need to be coded more accurately
than others; for example, blocks with smooth intensity gradients need
accurate coding to avoid visible block boundaries. To deal with this
inequality between blocks, the MPEG algorithm allows the amount of
quantization to be modified for each macroblock of pixels. This mechanism
can also be used to provide smooth adaptation to a particular bit rate.
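The effect of a per-macroblock quantiser scale can be illustrated with a simplified sketch. The arithmetic below (step size = weight x scale / 16, rounded to nearest) is an assumption chosen for clarity and is not the normative MPEG-2 quantiser; the flat weighting matrix, the function names and the coefficient values are likewise hypothetical.

```python
def quantise(coeffs, weights, scale):
    """Divide each coefficient by weight * scale / 16 and round to nearest."""
    return [[int(round(c * 16 / (w * scale))) for c, w in zip(crow, wrow)]
            for crow, wrow in zip(coeffs, weights)]


def dequantise(levels, weights, scale):
    """Reconstruct approximate coefficient values from quantised levels."""
    return [[level * w * scale / 16 for level, w in zip(lrow, wrow)]
            for lrow, wrow in zip(levels, weights)]


def count_nonzero(levels):
    """Number of levels that still have to be coded."""
    return sum(1 for row in levels for level in row if level != 0)


# Hypothetical data: a 1 x 4 slice of DCT coefficients and flat weights.
coeffs = [[944.0, -54.9, 3.2, -5.6]]
weights = [[16, 16, 16, 16]]

fine = quantise(coeffs, weights, scale=2)     # accurate, more bits
coarse = quantise(coeffs, weights, scale=16)  # less accurate, fewer bits

print(fine, count_nonzero(fine))
print(coarse, count_nonzero(coarse))
```

A small scale keeps more non-zero levels (better accuracy, more bits), while a large scale zeroes the small coefficients; an encoder can therefore lower the scale for smooth macroblocks and raise it when the bit rate must come down.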

Predicted Pictures
A P-picture is coded with reference to a previous image (the reference
image), which is an I- or P-picture. In the figure above, the highlighted
block in the target image (the image to be coded) is similar to a block
in the reference image, except that it is shifted to the upper right.
Since most changes between the target and reference images can be
approximated as translations of small image regions, a key technique
called motion-compensated prediction is used.
Motion-compensated prediction exploits temporal redundancy. Because
successive frames are closely related, it is possible to accurately
represent or "predict" the data of one frame based on the data of a
reference image, provided the translation is estimated. Prediction
greatly reduces the number of bits required. In P-pictures, each 16x16
macroblock is predicted from a macroblock of a previously encoded I- or
P-picture. Since frames are snapshots in time of a moving object, the
macroblocks in the two frames may not be cosited, i.e. correspond to the
same spatial location. Hence, a search is conducted in the reference
frame to find the macroblock which most closely matches the macroblock
under consideration in the P-frame. The difference between the two
macroblocks is the prediction error. This error can be coded in the DCT
domain. The DCT of the error results in few high-frequency coefficients,
which after quantisation require a small number of bits for
representation. The quantisation matrices for the prediction error blocks
are different from those used in intra blocks, due to the distinct nature
of their frequency spectrum. The displacements in the horizontal and
vertical directions of the best-match macroblock from the cosited
macroblock are called motion vectors. Differential coding is used because
it reduces the total bit requirement by transmitting the difference
between successive motion vectors. Finally, run-length encoding and
Huffman coding are used to encode the data.
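The search for the best-matching macroblock can be sketched with an exhaustive (full) search that minimises the sum of absolute differences (SAD). The standard does not prescribe a search method, so the full search, the SAD criterion, the function names and the synthetic frames below are all assumptions made for illustration. A 4x4 block is used in the demonstration to keep the data small; the text above uses 16x16 macroblocks.

```python
def sad(ref, cur, ref_y, ref_x, cur_y, cur_x, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(cur[cur_y + i][cur_x + j] - ref[ref_y + i][ref_x + j])
               for i in range(size) for j in range(size))


def find_motion_vector(ref, cur, y, x, size=16, search=7):
    """Full search; returns ((dy, dx), cost) for the block at (y, x) in cur.

    The vector points from the block position in the current frame to the
    best-matching position in the reference frame.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = y + dy, x + dx
            if ry < 0 or rx < 0 or ry + size > height or rx + size > width:
                continue  # candidate block falls outside the reference frame
            cost = sad(ref, cur, ry, rx, y, x, size)
            if best is None or cost < best[1]:
                best = ((dy, dx), cost)
    return best


def prediction_error(ref, cur, y, x, mv, size=16):
    """Difference between the current block and its motion-shifted match."""
    dy, dx = mv
    return [[cur[y + i][x + j] - ref[y + dy + i][x + dx + j]
             for j in range(size)] for i in range(size)]


def make_frame(top, left, size=16, obj=4):
    """Hypothetical test frame: a textured square on a black background."""
    frame = [[0] * size for _ in range(size)]
    for i in range(obj):
        for j in range(obj):
            frame[top + i][left + j] = 50 + 4 * i + j
    return frame


reference = make_frame(top=6, left=4)
current = make_frame(top=4, left=6)   # object moved up and to the right

mv, cost = find_motion_vector(reference, current, 4, 6, size=4, search=3)
residual = prediction_error(reference, current, 4, 6, mv, size=4)
print(mv, cost)  # (2, -2) 0
```

Because the object only translates between the two frames, the best match has zero cost and the prediction error block is all zeros, so only the motion vector needs to be transmitted for this block.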
Bidirectional Pictures
Example:
As the pictures above show, the picture being coded can contain
information that is not present in the previous reference frame. Hence a
B-picture is coded like a P-picture, except that the motion vectors can
reference the previous reference picture, the next reference picture, or
both. The following is the mechanism of B-picture coding.
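The choice between the previous picture, the next picture, or both can be sketched as a per-block mode decision. The sketch below assumes that the candidate blocks from the past and future reference pictures have already been located (for example by a motion search), that "both" means the rounded average of the two candidates, and that the encoder picks the mode with the smallest sum of absolute differences. The function names and the 2x2 sample blocks are hypothetical.

```python
def block_sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def average(a, b):
    """Rounded average of two blocks, used as the interpolated prediction."""
    return [[(x + y + 1) // 2 for x, y in zip(ra, rb)]
            for ra, rb in zip(a, b)]


def choose_b_prediction(target, past_block, future_block):
    """Return (mode, prediction) with the lowest SAD against the target."""
    candidates = {
        "forward": past_block,        # predicted from the previous reference
        "backward": future_block,     # predicted from the next reference
        "interpolated": average(past_block, future_block),
    }
    mode = min(candidates, key=lambda m: block_sad(target, candidates[m]))
    return mode, candidates[mode]


# Hypothetical 2x2 blocks: brightness rises steadily across three pictures.
past = [[10, 10], [10, 10]]
future = [[30, 30], [30, 30]]
middle = [[20, 20], [20, 20]]

print(choose_b_prediction(middle, past, future))
# ('interpolated', [[20, 20], [20, 20]])
```

When the target block matches newly uncovered content that only exists in the next picture, the same decision selects the backward mode, which is the case the paragraph above describes.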

MPEG-2 in everyday life:


Just about wherever you see video today.
DBS (Direct Broadcast Satellite)
The Hughes/USSB service will use MPEG-2 video and audio.
Thomson has exclusive rights to manufacture the decoding boxes for the
first 18 months of operation. No doubt Thomson's STi-3500 MPEG-2 video
decoder chip will be featured. Hughes/USSB DBS has already begun service
in North America in 1994. Two satellites at 101 degrees West share the
power requirements of 120 Watts per 27 MHz transponder. Multi-source
channel rate control methods are employed to optimally allocate bits
between several programs on one data carrier. An average of 150 channels
is planned.
CATV (Cable Television)
Despite conflicting options, the cable industry has more or
less settled on MPEG-2 video. Audio is less than settled. For example,
General Instrument (the largest U.S. consumer cable set-top box
manufacturer) has announced the planned use of the Dolby AC-3 audio
algorithm.
DigiCipher
The General Instrument DigiCipher I video syntax is similar to
MPEG-2 syntax but uses smaller macroblock predictions and no B-frames.
The DigiCipher II specification includes modes to support both the GI and
full MPEG-2 Video Main Profile syntax. Services such as HBO will upgrade
to DigiCipher II in 1994.
At the European IBC broadcast technology convention in
September 1994, GI demonstrated a prototype DCII encoder which handles
both digital encoding standards. Fully configured, the encoder will be
able to process 16 analogue video inputs, plus 32 stereo audio channels
and 32 data channels, into a single high speed datastream which can be
carried on cable, satellite, microwave or ATM systems.
DCII technology has now been licensed to Scientific Atlanta and Hewlett
Packard (both set-top manufacturers) and to chip manufacturers Motorola,
LSI Logic and C-Cube. All these manufacturers already support MPEG-2 and
plan to incorporate DCII into dual mode digital video decoder chips for
the set-top terminal market.
HDTV
The U.S. Grand Alliance, a consortium of companies that formerly
competed for the U.S. terrestrial HDTV standard, has already agreed to
use the MPEG-2 Video and Systems syntax (including B-pictures). Both
interlaced (1440 x 960 x 30 Hz) and progressive (1280 x 720 x 60 Hz)
modes will be supported. The Alliance must then settle upon a modulation
(QAM, VSB, OFDM), convolution (MS or Viterbi), and error correction
(RSPC, RSFC) specification.
In September 1993, a consortium of 85 European companies
signed an agreement to fund a project known as Digital Video Broadcasting
(DVB), which will develop a standard for cable and terrestrial
transmission by the end of 1994. The scheme will use MPEG-2. This
consortium has put the final nail in the coffin of the D-MAC scheme for
gradual migration towards an all-digital HDTV consumer transmission
standard. The only remaining analog or digital-analog hybrid system left
in the world is NHK's MUSE.
Conclusion:
MPEG-2 has been very successful in defining a specification to
serve a range of applications, bit rates, qualities and services.
Currently, the major interest is in the main profile at main level
(MP@ML) for applications such as digital television broadcasting
(terrestrial, satellite and cable), video-on-demand services and desktop
video systems. Several manufacturers have announced MP@ML single-chip
decoders and multichip encoders. Prototype equipment supporting the SNR
and spatial profiles has also been constructed for use in broadcasting
field trials.
The specification only defines the bitstream syntax and
decoding process. Generally, this means that any decoders which conform
to the specification should produce near identical output pictures.
However, decoders may differ in how they respond to errors introduced in
the transmission channel. For example, an advanced decoder might attempt
to conceal faults in the decoded picture if it detects errors in the
bitstream.
For a coder to conform to the specification, it only has to produce
a valid bitstream. This condition alone has no bearing on the picture
quality through the codec, and there is likely to be a variation in
coding performance between different coder designs. For example, the
coding performance may vary depending on the quality of the motion-vector
measurement, the techniques for controlling the bit rate, the methods
used to choose between the different prediction modes, the degree of
picture preprocessing and the way in which the quantiser is adapted
according to the picture content.
The picture quality through an MPEG-2 codec depends on the
complexity and predictability of the source pictures. Real-time coders
and decoders have demonstrated generally good quality standard-definition
pictures at bit rates around 6 Mbit/s. As experience of MPEG-2 coding
increases, the same picture quality may be achievable at lower bit rates.
REFERENCES:

[1] ISO/IEC 11172: 'Coding of moving pictures and associated audio for
digital storage media at up to about 1.5 Mbit/s'.

[2] ISO/IEC 13818: 'Generic coding of moving pictures and associated
audio' (MPEG-2).

[3] 'Encoding parameters of digital television for studios', CCIR
Recommendation 601-1, XVIth Plenary Assembly, Dubrovnik, 1986, Vol. XI,
Part pp. 319-328.

[4] JAIN, A.K.: 'Fundamentals of digital image processing' (Prentice
Hall, 1989).

[5] WELLS, N.D.: 'Component codec standard for high-quality digital
television', Electronics & Communication Engineering Journal, August
1992, 4, (4), pp. 195-202.