
The MPEG Standard

MPEG-1 (1992): essentially a video player
  plays out audio/video streams
  same type of access as a home VCR

MPEG-2 (1995): introduced for compression and transmission of digital TV signals
  still limited interactivity

MPEG-4 (1999): a completely different approach
  high level of interactivity

MPEG-7 (2002): for the description of metadata only


4/20/2012 1

MPEG-4
MPEG-4 addresses the need for
  mixing of natural and synthetic audiovisual information
  high interactivity in the presentation of multimedia content
  deployment of communication systems for real-time or broadcast delivery of coded data streams

A new approach for describing, coding and presenting a scene
MPEG-4 combines different coding tools for
  audio/video
  synthetic objects and graphics

MPEG-4 Objects
The audio/video components of MPEG-4
  Objects are coded and transmitted separately and composed at the decoder site
  They can exist independently
  Multiple objects can be grouped together to form complex objects
  Video and audio can be easily manipulated
  Permits choosing appropriate coding tools for audio, video and graphics objects

MPEG-4 Object Based Coding


MPEG-4 Coding
The scene is composed and rendered at the sender site
  video frames and audio are coded, multiplexed and transmitted
  tools for coding arbitrarily shaped objects
At the receiver the stream is demultiplexed
  video and audio are decoded, composed, synchronized and presented as defined at the sender's site

Object Coding
Objects are described mathematically (e.g. by their positions)
  similarly for audio and graphics objects
  an object need only be defined once
  the viewer can change the position of objects
  transmit calculations to update the scene at the receiver
  this is a critical feature when the response has to be fast and the bit-rate is limited
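The saving from defining an object once and then sending only position updates can be sketched as follows. This is a minimal Python sketch, not MPEG-4 syntax: the JSON definition, the `position_update` helper and its 6-byte layout are assumptions made for illustration.

```python
import json
import struct

# Hypothetical object definition, sent once when the object first appears.
# The fields and the placeholder texture are illustrative assumptions.
definition = json.dumps({
    "id": 7,
    "shape": "rectangle",
    "width": 64,
    "height": 48,
    "texture": "00" * 512,  # stand-in for coded texture data
}).encode("utf-8")


def position_update(object_id: int, x: int, y: int) -> bytes:
    """Pack one update as three unsigned 16-bit integers (assumed layout)."""
    return struct.pack("<HHH", object_id, x, y)


# Five successive positions of the same object.
updates = [position_update(7, 10 * step, 20) for step in range(5)]

# Cost of re-sending the whole object for every change ...
resend_cost = len(definition) * len(updates)
# ... versus sending the definition once plus the small updates.
update_cost = len(definition) + sum(len(u) for u in updates)

print(resend_cost, update_cost)
```

Each update is a handful of bytes regardless of how large the object definition is, which is why this matters at limited bit-rates.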

Binary Format for Scenes (BIFS)


MPEG-4's language for describing and dynamically changing a scene
Borrows concepts from VRML
  Both define representations of the same data
  VRML defines objects and actions in text
  BIFS code is binary (10-15 times shorter)
Unlike VRML, MPEG-4 uses BIFS for real-time streaming: a scene can be built up and played on the fly
VRML and BIFS evolve consistently
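Why a binary form is shorter than a textual one can be illustrated with a toy encoding. The sketch below is not the BIFS bitstream format: the node-type code, the field order and the use of 32-bit floats are assumptions, and the text string is only VRML-like. Real BIFS coding also uses techniques not shown here, so the ratio printed is not the 10-15 figure quoted above.

```python
import struct

# VRML-like textual description of one node (illustrative only).
text_node = (
    "Transform { translation 1.5 0.0 -2.0 "
    "children [ Shape { geometry Sphere { radius 0.5 } } ] }"
)

NODE_TRANSFORM_SPHERE = 1  # hypothetical one-byte node-type code


def encode_binary(tx: float, ty: float, tz: float, radius: float) -> bytes:
    """Pack the node type and its four numeric fields (assumed layout)."""
    return struct.pack("<Bffff", NODE_TRANSFORM_SPHERE, tx, ty, tz, radius)


binary_node = encode_binary(1.5, 0.0, -2.0, 0.5)

text_size = len(text_node.encode("ascii"))
binary_size = len(binary_node)
print(text_size, binary_size, round(text_size / binary_size, 1))
```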

scene graph


The Scene Graph


Represents a scene as independent or compound objects, e.g.,
  father and child
  the audio track of his voice
  floor and walls (sprites: for backgrounds)
  the web site
  the synthetic image of the furniture
  a synthetic HDTV set playing a movie from the family's DVD library
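The example scene above can be sketched as a small tree in which compound objects are inner nodes and independent objects are leaves. The `SceneNode` class and the grouping chosen here are assumptions for illustration, not structures defined by MPEG-4.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SceneNode:
    """One node of a toy scene graph (hypothetical structure)."""
    name: str
    children: List["SceneNode"] = field(default_factory=list)

    def leaves(self) -> List[str]:
        """Return the names of all independent (leaf) objects, depth first."""
        if not self.children:
            return [self.name]
        names: List[str] = []
        for child in self.children:
            names.extend(child.leaves())
        return names


scene = SceneNode("scene", [
    SceneNode("people", [
        SceneNode("father", [SceneNode("father video"), SceneNode("father voice")]),
        SceneNode("child"),
    ]),
    SceneNode("background sprite"),
    SceneNode("furniture"),
    SceneNode("HDTV set", [SceneNode("movie")]),
])

print(scene.leaves())
```

Grouping the father's video and voice under one node means the pair can be moved or removed as a unit.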

Elementary Streams (ES)


The scheme for preparing content for transmission, storage and decoding
Objects are placed in ESs
  Probably two or more ESs per object
  A sound track or a video may have a single ES
  Scalable objects may have one ES for basic quality information + one or more enhancement layers for improved quality (e.g., finer detail, faster motion)
ESs are split into packets and sent along with timing information for proper synchronization
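Splitting a stream into packets that carry timing information can be sketched as below. The `Packet` fields, the fixed frame interval and the payload limit are assumptions for illustration and do not reproduce the MPEG-4 packet syntax.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Packet:
    stream_id: int
    timestamp_ms: int  # presentation time of the unit this packet belongs to
    payload: bytes


def packetize(stream_id: int, units: List[bytes],
              interval_ms: int, max_payload: int) -> List[Packet]:
    """Split each coded unit into packets of at most max_payload bytes."""
    packets: List[Packet] = []
    for index, unit in enumerate(units):
        timestamp = index * interval_ms
        for start in range(0, len(unit), max_payload):
            packets.append(Packet(stream_id, timestamp, unit[start:start + max_payload]))
    return packets


def reassemble(packets: List[Packet]) -> List[bytes]:
    """Rebuild the units by concatenating payloads that share a timestamp."""
    units: Dict[int, bytes] = {}
    for packet in packets:
        units[packet.timestamp_ms] = units.get(packet.timestamp_ms, b"") + packet.payload
    return [units[t] for t in sorted(units)]


units = [b"a" * 10, b"b" * 3]
packets = packetize(stream_id=1, units=units, interval_ms=40, max_payload=4)
print([(p.timestamp_ms, len(p.payload)) for p in packets])
```

The timestamps are what let a receiver line up packets from different streams, for example a video ES and its audio ES.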

Object Descriptors (OD)


MPEG-4's mechanism that informs the system which ES belongs to a certain object
  ODs contain Elementary Stream Descriptors (ESD) which tell the system which decoders to use
  ODs are sent in their own stream, which allows them to be added or deleted as the scene changes
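The relationship between objects, their streams and the decoders can be sketched as a lookup table that is updated as descriptors arrive. The class names, fields and decoder labels below are hypothetical and only mirror the roles described above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ESDescriptor:
    es_id: int
    decoder: str  # hypothetical label for the decoder this stream needs


@dataclass
class ObjectDescriptor:
    od_id: int
    streams: List[ESDescriptor]


class DescriptorTable:
    """Toy receiver-side table, updated as ODs are added or deleted."""

    def __init__(self) -> None:
        self._objects: Dict[int, ObjectDescriptor] = {}

    def update(self, od: ObjectDescriptor) -> None:
        self._objects[od.od_id] = od

    def remove(self, od_id: int) -> None:
        self._objects.pop(od_id, None)

    def decoders_for(self, od_id: int) -> List[str]:
        return [esd.decoder for esd in self._objects[od_id].streams]

    def owner_of(self, es_id: int) -> Optional[int]:
        for od in self._objects.values():
            if any(esd.es_id == es_id for esd in od.streams):
                return od.od_id
        return None


table = DescriptorTable()
table.update(ObjectDescriptor(1, [ESDescriptor(101, "video-base"),
                                  ESDescriptor(102, "video-enhancement")]))
table.update(ObjectDescriptor(2, [ESDescriptor(201, "audio")]))
print(table.owner_of(102), table.decoders_for(1))
```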

Profiles and Levels


MPEG-4 provides a set of tools for coding multimedia content
  an application may use only subsets of these tools

Profiles: MPEG-4's definitions of these subsets for audio, visual and graphics information
Levels: define the computational complexity of a profile's tool subset
Certain combinations of profiles fit well together

MPEG-4 Profiles


MPEG-4 Visual Objects


Arbitrarily shaped objects are coded apart from their background
Binary shape coding: a pixel is or is not part of an object
  simple, crude technique, suitable for low bit-rates, suffers from aliasing

Alpha shape (gray scale) coding: each pixel is assigned a value for its transparency
  objects can be smoothly blended into a background or with other objects
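The difference between binary and alpha shapes shows up when an object is composed onto a background. The sketch below uses the usual blending rule out = a*fg + (1-a)*bg on single-channel pixels, with alpha stored as 0..255. The function names and the threshold of 128 are assumptions, and this illustrates composition only, not how MPEG-4 codes the shape.

```python
def composite(foreground, background, alpha):
    """Blend two gray-scale images per pixel; alpha is 0 (transparent) to 255 (opaque)."""
    return [
        [(a * f + (255 - a) * b + 127) // 255  # +127 rounds to nearest
         for f, b, a in zip(f_row, b_row, a_row)]
        for f_row, b_row, a_row in zip(foreground, background, alpha)
    ]


def binarize(alpha, threshold=128):
    """Reduce a gray-scale alpha plane to a binary shape (in or out)."""
    return [[255 if a >= threshold else 0 for a in row] for row in alpha]


foreground = [[200, 200, 200]]
background = [[0, 0, 0]]
alpha = [[0, 128, 255]]  # soft edge: transparent, half, opaque

smooth = composite(foreground, background, alpha)
hard = composite(foreground, background, binarize(alpha))
print(smooth, hard)
```

With the gray-scale plane the middle pixel is a mix of both images, while the binary shape forces it fully into the object, which is where the jagged edges come from.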

Visual Objects
Rectangular natural images and scenes are coded using MPEG-1 and MPEG-2
Texture is coded separately by a DCT, block-based coding scheme or wavelets
E.g., weather reports: the weatherman's image appears in front of a map which is actually generated elsewhere

Object Segmentation
MPEG does not specify how objects are extracted
  video object segmentation is difficult
  e.g., record the weatherman's image in front of a color background

MPEG-4 specifies decoding
  implementation of encoding is left to the industry to decide
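Recording in front of a color background makes extraction easy because background pixels can be found by color alone. The sketch below is one simple way to do this and is not part of any MPEG standard: the `chroma_key_mask` name, the sum-of-differences distance and the tolerance value are all assumptions.

```python
def chroma_key_mask(image, key_color, tolerance):
    """Return a binary shape: 0 where a pixel is close to key_color, else 1."""
    key_r, key_g, key_b = key_color
    mask = []
    for row in image:
        mask.append([
            0 if abs(r - key_r) + abs(g - key_g) + abs(b - key_b) <= tolerance else 1
            for (r, g, b) in row
        ])
    return mask


# One row of RGB pixels: pure key color, near key color, skin-like tone.
image = [[(0, 0, 255), (10, 5, 250), (200, 150, 120)]]
mask = chroma_key_mask(image, key_color=(0, 0, 255), tolerance=30)
print(mask)
```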

MPEG-4 Applications
MPEG-4 makes video possible even at very low bit-rates (e.g., 10 kb/s)
  mobile devices, internet

Scalable objects for low bit-rates
  a base layer conveys all the information in some basic quality
  one or more enhancement layers can be sent to get better quality
  send only the most important objects
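The base-plus-enhancement idea can be sketched with coarse quantization as the base layer and the quantization remainder as a single enhancement layer. This is a toy scheme on non-negative integer samples, chosen for clarity; it is an assumption and not the scalability tool MPEG-4 actually defines.

```python
def encode_layers(samples, base_step):
    """Split non-negative integer samples into a coarse base layer and a refinement."""
    base = [s // base_step for s in samples]
    enhancement = [s - b * base_step for s, b in zip(samples, base)]
    return base, enhancement


def decode(base, base_step, enhancement=None):
    """Decode basic quality from the base layer; add the enhancement if it arrived."""
    coarse = [b * base_step for b in base]
    if enhancement is None:
        return coarse
    return [c + e for c, e in zip(coarse, enhancement)]


samples = [17, 130, 255, 64]
base, enhancement = encode_layers(samples, base_step=16)

basic_quality = decode(base, 16)               # receiver with low bit-rate
full_quality = decode(base, 16, enhancement)   # receiver that got both layers
print(basic_quality, full_quality)
```

A receiver on a slow link can stop after the base layer and still show every sample, only less precisely.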

Sprites
For coding unchanged backgrounds
  The background is defined and coded only once
  Must be updated for each change (e.g., when the viewing angle changes)
  The sprite is sent only once
  New views are created by sending the new positions
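Creating new views from a sprite that was sent once can be sketched as cutting a window out of a larger stored background. The sketch assumes that a "position" is the top-left corner of a rectangular window and ignores any warping; `view_from_sprite` is a hypothetical helper.

```python
def view_from_sprite(sprite, left, top, width, height):
    """Cut the visible window out of the stored background sprite."""
    return [row[left:left + width] for row in sprite[top:top + height]]


# A 4x6 background, sent once. Each value encodes its own row and column.
sprite = [[10 * r + c for c in range(6)] for r in range(4)]

# Later, only the window position changes; the pixels are not re-sent.
first_view = view_from_sprite(sprite, left=0, top=0, width=3, height=2)
panned_view = view_from_sprite(sprite, left=2, top=1, width=3, height=2)
print(first_view, panned_view)
```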

Advanced Features
Map images onto computer-generated shapes
  a 2D or 3D mesh may have an image mapped onto it
  a few parameters that deform the mesh generate the impression of a moving picture
  rather than sending new images for each change, send commands and parameters to the viewer
  pre-defined faces are particularly interesting meshes
  the appearance of a face may be left to the decoder (e.g., custom facial models can be downloaded)
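Animating a mesh by sending a few parameters instead of new images can be sketched as below. The vertex list, the per-vertex displacement table and the single `amount` parameter are assumptions for illustration; they are not the animation parameters defined by MPEG-4.

```python
def deform(vertices, displacements, amount):
    """Move selected vertices along their displacement vectors, scaled by amount."""
    moved = list(vertices)
    for index, (dx, dy) in displacements.items():
        x, y = moved[index]
        moved[index] = (x + amount * dx, y + amount * dy)
    return moved


# A tiny 2D "mouth": two fixed corners and one lower-lip vertex.
mouth = [(0.0, 0.0), (1.0, 0.0), (0.5, -0.25)]
# Hypothetical "open jaw" control: only vertex 2 moves, straight down.
open_jaw = {2: (0.0, -1.0)}

# The sender transmits just the number 0.5, not a new image.
half_open = deform(mouth, open_jaw, amount=0.5)
print(half_open)
```

The image mapped onto the mesh stays the same; only the vertex positions change from frame to frame.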

MPEG-4 Faces
Images laid over a wire-frame face
Send wire-frame plus parameters
Image reconstruction at the receiver's site
Speech is generated from text in step with motions of the mouth, eyes and lips

MPEG-7
MPEG-7 (2002) focuses on the description of multimedia content
  modalities: image, speech, video, graphics and their combinations

MPEG-7 complements existing MPEG standards and is applicable even to non-MPEG formats (compressed or uncompressed)
MPEG-7 is driven by trends in technology, market and user needs
Applications: Video on Demand, News on Demand, Interactive TV, multimedia information systems etc.

Scope of the Standard


Provides the means for indexing, searching, filtering and managing audiovisual content
  broadcast media selection (e.g., personalized TV)
  multimedia editing (e.g., personalized news service)
  tools may be designed for specific modalities, aspects or applications

MPEG-7 interoperable interface defines syntax and semantics

Interoperable Services and Applications


MPEG-7 Main Tasks


Multimedia: generate customized program guides or summaries of broadcast audio-visual content
Archive: generate descriptions of audiovisual content (or elements)
Adaptation: filter and transform multimedia streams in low bit-rate environments (e.g., mobile users)

MPEG-7 Specific Tasks


Music/audio: play a few notes and return pieces with similar music/audio
Images/graphics: draw a sketch and return images with similar graphics
Movement: describe movements and return video clips with the specified temporal and spatial relations
Scenario: describe actions and return scenarios where similar actions take place
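The "play a few notes" task can be sketched as matching the up/down contour of a query against stored melodies. MPEG-7 standardizes how features are represented, not how matching is done, so the matching rule below is purely an assumption, as are the tune names and note values (MIDI-style note numbers).

```python
def contour(notes):
    """Reduce a note sequence to +1 (up), -1 (down) or 0 (same) per step."""
    return [(b > a) - (b < a) for a, b in zip(notes, notes[1:])]


def contour_distance(query, candidate):
    """Count mismatching steps, plus a penalty if the candidate is too short."""
    mismatches = sum(q != c for q, c in zip(query, candidate))
    return mismatches + max(0, len(query) - len(candidate))


def best_match(query_notes, library):
    """Return the name of the stored melody whose contour is closest."""
    query = contour(query_notes)
    return min(library, key=lambda name: contour_distance(query, contour(library[name])))


# Hypothetical library of melodies.
library = {
    "tune_a": [67, 69, 71, 69, 67],
    "tune_b": [60, 59, 57, 59],
}

print(best_match([60, 62, 64, 62], library))
```

Because only the direction of each step is compared, the query matches even when it is played in a different key from the stored melody.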

MPEG-7 Elements
1. Descriptors (D): define syntax and semantics of features of audio-visual content
  Application independent
  Low level: shape, motion, color, camera motion, harmonicity, timbre for audio ...
  Semantic level: events, concepts ...

MPEG-7 Elements (cont'd)

2. Description Schemes (DS): specify the structure and semantics of the relationships among the constituent Ds or DSs, e.g.,
  Video DS: specifies syntax and semantics for segment decomposition, attributes and their relationships
  DSs related to creation, production, and access of content (e.g., property rights, parental rating, etc.)

MPEG-7 Elements (cont'd)

3. Description Definition Language (DDL): allows flexible definition of Ds and DSs based on XML Schema
  Ds and DSs are application independent
  DDL is used to define specialized tools
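How descriptors nest inside a description scheme in an XML-based description can be sketched with Python's standard `xml.etree.ElementTree` module. The element and attribute names below (`VideoDescription`, `DominantColor`, `Segment`) are hypothetical and are not taken from the actual MPEG-7 schema.

```python
import xml.etree.ElementTree as ET


def build_description(title, dominant_color, segments):
    """Build a toy description: one scheme element holding simple descriptors."""
    root = ET.Element("VideoDescription")            # hypothetical DS name
    ET.SubElement(root, "Title").text = title
    color = ET.SubElement(root, "DominantColor")     # hypothetical D name
    color.text = " ".join(str(c) for c in dominant_color)
    for start, end in segments:
        # Segment decomposition: start and end times in seconds (assumed unit).
        ET.SubElement(root, "Segment", start=str(start), end=str(end))
    return root


root = build_description("Weather report", (12, 80, 200), [(0, 15), (15, 42)])
xml_text = ET.tostring(root, encoding="unicode")

# A consumer parses the text back without knowing how it was produced.
parsed = ET.fromstring(xml_text)
print(parsed.find("Title").text, len(parsed.findall("Segment")))
```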

MPEG-7 Descriptions
MPEG-7 allows descriptions at different levels of abstraction
  low-level features extracted automatically
  semantic features with human interaction or textual annotation

MPEG-7 does not specify how features are extracted or used (e.g., filtering, retrieval)
  their representation must conform to the MPEG-7 standard

MPEG-7 Parts
Systems: specifies functionality at system level
  preparation of descriptions for efficient transport and storage
  synchronization of content and descriptors
  development of decoders

Description Definition Language (DDL): language for specifying new Ds and DSs
  extension of XML Schema

MPEG-7 Visual
Specifies a set of standardized visual Ds and DSs
  Color descriptors: color space, quantization
  Texture descriptors: homogeneous texture, texture browsing, edge histogram ...
  Shape descriptors: for regions or contours
  Motion descriptors: camera motion, trajectories, motion activity ...
  Face recognition
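What a color descriptor built on a color space and a quantization might look like can be sketched with a coarse RGB histogram. This is a generic technique used for illustration and not one of the standardized MPEG-7 color descriptors; the bin layout, the normalization and the L1 distance are assumptions.

```python
def color_histogram(pixels, levels_per_channel=2):
    """Quantize RGB pixels and return a normalized histogram.

    Assumes levels_per_channel divides 256 and pixels is non-empty.
    """
    step = 256 // levels_per_channel
    bins = [0] * levels_per_channel ** 3
    for r, g, b in pixels:
        index = ((r // step) * levels_per_channel + g // step) * levels_per_channel + b // step
        bins[index] += 1
    return [count / len(pixels) for count in bins]


def histogram_distance(a, b):
    """L1 distance between two histograms: 0.0 means identical color statistics."""
    return sum(abs(x - y) for x, y in zip(a, b))


reddish = color_histogram([(255, 0, 0), (250, 10, 5), (0, 0, 255), (0, 0, 0)])
black = color_histogram([(0, 0, 0)] * 4)
print(reddish, histogram_distance(reddish, black))
```

Two images can then be compared through their short histograms instead of their pixels.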

MPEG-7 Audio
Specifies standardized audio descriptors and description schemes for pure music, pure speech, sound effects, soundtracks
  silence descriptor
  spoken content descriptors
  sound effects descriptors
  melody contour descriptors

Multimedia Description Schemes


Specify a framework that allows generic description of all kinds of multimedia data
  basic elements: data types, structures, Ds
  content management: content from several viewpoints (creation, usage etc.)
  organization of content by collections, classification
  navigation and access
  user interaction

Multimedia Description Schemes


MPEG-7 Reference Software


Reference implementation of the relevant parts of the MPEG-7 standard
  The focus is on creating bit-streams of descriptors and description schemes (DDL parser, DDL validation, multimedia description schemes)
  Some software for extracting descriptors is also included (visual, audio descriptors)
