As with any communication, compressed data communication only works when both
the sender and receiver of the information understand the encoding scheme. For example, this
text makes sense only if the receiver understands that it is intended to be interpreted as characters
representing the English language. Similarly, compressed data can only be understood if the
decoding method is known by the receiver.
Compression is useful because it helps reduce the consumption of expensive resources, such
as hard disk space or transmission bandwidth. On the downside, compressed data must be
decompressed to be used, and this extra processing may be detrimental to some applications. For
instance, a compression scheme for video may require expensive hardware for the video to be
decompressed fast enough to be viewed as it's being decompressed (the option of decompressing
the video in full before watching it may be inconvenient, and requires storage space for the
decompressed video). The design of data compression schemes therefore involves trade-offs
among various factors, including the degree of compression, the amount of distortion introduced
(if using a lossy compression scheme), and the computational resources required to compress and decompress the data.
Lossless compression
Lossless data compression is a class of data compression algorithms that allows the exact
original data to be reconstructed from the compressed data. The term lossless is in contrast
to lossy data compression, which only allows an approximation of the original data to be
reconstructed in exchange for better compression rates.
Lossless data compression is used in many applications. For example, it is used in the
popular ZIP file format and in the Unix tool gzip. It is also often used as a component within
lossy data compression technologies.
Lossless compression is used when it is important that the original and the decompressed data be identical, or when it cannot be assumed that a given deviation from the original is harmless. Typical
examples are executable programs and source code. Some image file formats, like PNG or GIF,
use only lossless compression, while others like TIFF and MNG may use either lossless or lossy
methods.
Most lossless compression programs do two things in sequence: the first step generates
a statistical model for the input data, and the second step uses this model to map input data to bit
sequences in such a way that "probable" (e.g. frequently encountered) data will produce shorter
output than "improbable" data.
The primary encoding algorithms used to produce bit sequences are Huffman coding (also used
by DEFLATE) and arithmetic coding. Arithmetic coding achieves compression rates close to the
best possible for a particular statistical model, which is given by the information entropy,
whereas Huffman compression is simpler and faster but produces poor results for models that
deal with symbol probabilities close to 1.
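As an illustrative sketch of the first of these (a minimal code builder, not the exact algorithm DEFLATE ships), the following Python assigns shorter bit strings to more frequent symbols:

```python
import heapq
from collections import Counter

def huffman_codes(text):
    # Build a Huffman tree from symbol frequencies and return a table
    # mapping each symbol to its bit string.
    freq = Counter(text)
    # Heap entries: (frequency, tiebreak, tree); tree is a symbol or a
    # (left, right) pair. The tiebreak avoids comparing trees directly.
    heap = [(f, i, s) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    if count == 1:  # degenerate single-symbol input
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, (t1, t2)))
        count += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix
    walk(heap[0][2], "")
    return codes

codes = huffman_codes("abracadabra")
# 'a' (most frequent) receives a shorter code than rare symbols like 'c'.
```

Because the codes are read off a tree, no code is a prefix of another, which is what lets the decoder recover symbol boundaries without separators.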
There are two primary ways of constructing statistical models: in a static model, the data are
analyzed and a model is constructed, then this model is stored with the compressed data. This
approach is simple and modular, but has the disadvantage that the model itself can be expensive
to store, and also that it forces a single model to be used for all data being compressed, and so
performs poorly on files containing heterogeneous data. Adaptive models dynamically update the
model as the data are compressed. Both the encoder and decoder begin with a trivial model,
yielding poor compression of initial data, but as they learn more about the data performance
improves. Most popular types of compression used in practice now use adaptive coders.
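The adaptive idea can be sketched with a toy order-0 model: encoder and decoder start from identical uniform counts and update them identically after each symbol, so no model is ever transmitted. This sketch only tallies the ideal code length an arithmetic coder would approach, rather than emitting actual bits:

```python
import math

def adaptive_cost(data):
    # Both sides start with the same trivial model: every byte value
    # has count 1 (uniform). After each symbol is coded, both sides
    # increment its count, so they stay in sync without sending the model.
    counts = [1] * 256
    total = 256
    bits = 0.0
    for b in data:
        p = counts[b] / total
        bits += -math.log2(p)   # ideal code length for this symbol
        counts[b] += 1          # update mirrored exactly by the decoder
        total += 1
    return bits

skewed = b"a" * 900 + b"b" * 100
cost = adaptive_cost(skewed)
# far below the 8,000 bits of the raw data, with no stored model
```

Early symbols are costly (the model is still uniform), but the cost per symbol drops rapidly as the counts learn the true distribution.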
Lossless compression methods may be categorized according to the type of data they are
designed to compress. While, in principle, any general-purpose lossless compression algorithm (general-purpose meaning that it can accept any bitstring) can be used on any type of data, many are unable to achieve significant compression on data that do not have the form they were designed for. Many of the lossless compression techniques used for text also work reasonably well for indexed images.
Lossy compression
A lossy compression method is one where compressing data and then decompressing it retrieves data that is different from the original, but is close enough to be useful in some way. Lossy compression is most commonly used to compress multimedia data (audio, video, still images), especially in applications such as streaming media and internet telephony. By contrast, lossless
especially in applications such as streaming media and internet telephony. By contrast, lossless
compression is required for text and data files, such as bank records, text articles, etc. In many
cases it is advantageous to make a master lossless file which can then be used to produce
compressed files for different purposes; for example a multi-megabyte file can be used at full
size to produce a full-page advertisement in a glossy magazine, and a 10 kilobyte lossy copy
made for a small image on a web page.
(Figure: the same image at low compression, 84% less information than the uncompressed PNG, 9.37 KB; medium compression, 92% less, 4.82 KB; and high compression, 98% less, 1.14 KB.)
It is possible to compress many types of digital data in a way which reduces the amount of
information stored, and consequently the size of a computer file needed to store it or
the bandwidth needed to stream it, with no loss of information. A picture, for example, is
converted to a digital file by considering it to be an array of dots, and specifying the colour and
brightness of each dot. If the picture contains an area of the same colour, it can be compressed
without loss by saying "200 red dots" instead of "red dot, red dot, ...(197 more times)..., red dot".
The original contains a certain amount of information; there is a lower limit to the size of file that
can carry all the information. As an intuitive example, most people know that a compressed ZIP
file is smaller than the original file; but repeatedly compressing the file will not reduce the size to
nothing, and will in fact usually increase the size.
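The "200 red dots" idea is run-length encoding; a minimal illustrative version in Python:

```python
def rle_encode(pixels):
    # "red dot, red dot, ..." becomes ("red", 200): each run of equal
    # values is stored once, together with its length.
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1
        else:
            runs.append([p, 1])
    return [(value, count) for value, count in runs]

def rle_decode(runs):
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

row = ["red"] * 200 + ["blue"] * 3
encoded = rle_encode(row)          # [("red", 200), ("blue", 3)]
assert rle_decode(encoded) == row  # lossless: exact reconstruction
```

Note that this only wins on data with long runs; on data with no repetition the encoding is larger than the input, which is the expansion effect described above.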
In many cases files or data streams contain more information than is needed. For example, a
picture may have more detail than the eye can distinguish when reproduced at the largest size
intended; an audio file does not need a lot of fine detail during a very loud passage. Developing
lossy compression techniques as closely matched to human perception as possible is a complex
task. In some cases the ideal is a file which provides exactly the same perception as the original,
with as much digital information as possible removed; in other cases perceptible loss of quality is
considered a valid trade-off for the reduced data size.
Lossless compression schemes are reversible so that the original data can be reconstructed, while
lossy schemes accept some loss of data in order to achieve higher compression.
However, lossless data compression algorithms will always fail to compress some files; indeed,
any compression algorithm will necessarily fail to compress any data containing no discernible
patterns. Attempts to compress data that has been compressed already will therefore usually
result in an expansion, as will attempts to compress all but the most trivially encrypted data.
In practice, lossy data compression will also come to a point where compressing again does not work, although an extremely lossy algorithm, such as one that always removes the last byte of a file, will keep shrinking a file until it is empty.
25.888888888
This string can be compressed as:
25.[9]8
Interpreted as "twenty-five point nine eights", the original string is perfectly recreated,
just written in a smaller form. In a lossy system, using
26
instead, the exact original data is lost, at the benefit of a smaller file size.
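A hypothetical mini-scheme for the example above (the bracketed-count notation is assumed purely for illustration):

```python
def compress_run(s):
    # Collapse the trailing run of a repeated character into
    # "[count]char" notation, as in the "25.[9]8" example.
    ch = s[-1]
    n = 0
    while n < len(s) and s[-1 - n] == ch:
        n += 1
    return s[:len(s) - n] + f"[{n}]{ch}"

def expand_run(s):
    # Invert compress_run: "[9]8" expands back to "888888888".
    head, _, rest = s.partition("[")
    count, _, ch = rest.partition("]")
    return head + ch * int(count)

original = "25.888888888"
packed = compress_run(original)              # "25.[9]8"
assert expand_run(packed) == original        # lossless: exact round trip
lossy = "26"                                 # lossy: smaller, original gone
assert len(lossy) < len(packed) < len(original)
```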
Audio compression (data)
Audio compression is a form of data compression designed to reduce the transmission
bandwidth requirement of digital audio streams and the storage size of audio files. Audio
compression algorithms are implemented in computer software as audio codecs. Generic data compression algorithms perform poorly with audio data, seldom reducing file sizes much below 87% of the original,[citation needed] and are not designed for use in real-time applications. Consequently, specifically optimized lossless and lossy audio algorithms have been created.
Lossy algorithms provide greater compression rates and are used in mainstream consumer audio
devices.
In both lossy and lossless compression, information redundancy is reduced, using methods such
as coding, pattern recognition and linear prediction to reduce the amount of information used to
represent the uncompressed data.
For most practical audio applications, the slight reduction in audio quality is outweighed by the savings in transmission or storage size, since users often cannot perceive the loss in playback quality. For example, one compact disc (CD) holds approximately
one hour of uncompressed high fidelity music, less than 2 hours of music compressed losslessly,
or 7 hours of music compressed in the MP3 format at medium bit rates.
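A back-of-the-envelope check of such figures (assuming a 650 MB disc, raw 16-bit stereo PCM at 44.1 kHz, and a 192 kbit/s MP3; the exact numbers vary with disc capacity and bit rate):

```python
CD_BYTES = 650 * 1024 * 1024        # assumed disc capacity
PCM_BYTES_PER_SEC = 44_100 * 2 * 2  # 44.1 kHz, 16-bit samples, stereo
MP3_BYTES_PER_SEC = 192_000 // 8    # assumed 192 kbit/s lossy stream

uncompressed_hours = CD_BYTES / PCM_BYTES_PER_SEC / 3600
mp3_hours = CD_BYTES / MP3_BYTES_PER_SEC / 3600
print(round(uncompressed_hours, 1), round(mp3_hours, 1))  # 1.1 7.9
```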
Formats
Shorten was an early lossless format; newer ones include Free Lossless Audio
Codec (FLAC), Apple's Apple Lossless, MPEG-4 ALS, Monkey's Audio, and TTA.
Some audio formats feature a combination of a lossy format and a lossless correction; this allows
stripping the correction to easily obtain a lossy file. Such formats include MPEG-4
SLS (Scalable to Lossless), WavPack, and OptimFROG DualStream.
Evaluation criteria
Lossless audio codecs have no quality issues, so their usability can be estimated by other criteria, such as speed of compression and decompression and the degree of compression.
The innovation of lossy audio compression was to use psychoacoustics to recognize that not all
data in an audio stream can be perceived by the human auditory system. Most lossy compression
reduces perceptual redundancy by first identifying sounds which are considered perceptually
irrelevant, that is, sounds that are very hard to hear. Typical examples include high frequencies,
or sounds that occur at the same time as louder sounds. Those sounds are coded with decreased
accuracy or not coded at all.
While removing or reducing these 'unhearable' sounds may account for a small percentage of bits
saved in lossy compression, the real savings comes from a complementary phenomenon: noise
shaping. Reducing the number of bits used to code a signal increases the amount of noise in that
signal. In psychoacoustics-based lossy compression, the real key is to 'hide' the noise generated
by the bit savings in areas of the audio stream that cannot be perceived. This is done by, for
instance, using very small numbers of bits to code the high frequencies of most signals - not
because the signal has little high-frequency information (though this is often true as well),
but rather because the human ear can only perceive very loud signals in this region, so that softer
sounds 'hidden' there simply aren't heard.
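The bits-versus-noise relationship behind noise shaping can be seen with a toy uniform quantizer (a sketch only; real codecs shape where the noise goes rather than spreading it evenly):

```python
import math

def quantize(x, bits):
    # Uniform quantizer on [-1, 1): fewer bits -> coarser steps -> more noise.
    levels = 2 ** bits
    step = 2.0 / levels
    return max(-1.0, min(1.0 - step, round(x / step) * step))

# One second of a 440 Hz tone sampled at 8 kHz.
signal = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]

def snr_db(bits):
    noise = sum((s - quantize(s, bits)) ** 2 for s in signal)
    power = sum(s * s for s in signal)
    return 10 * math.log10(power / noise)

# Roughly 6 dB of signal-to-noise ratio is lost per bit removed.
```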
If reducing perceptual redundancy does not achieve sufficient compression for a particular
application, it may require further lossy compression. Depending on the audio source, this still
may not produce perceptible differences. Speech for example can be compressed far more than
music. Most lossy compression schemes allow compression parameters to be adjusted to achieve
a target rate of data, usually expressed as a bit rate. Again, the data reduction will be guided by
some model of how important the sound is as perceived by the human ear, with the goal of
efficiency and optimized quality for the target data rate. (There are many different models used
for this perceptual analysis, some better suited to different types of audio than others.) Hence,
depending on the bandwidth and storage requirements, the use of lossy compression may result
in a perceived reduction of the audio quality that ranges from none to severe, but generally an
obviously audible reduction in quality is unacceptable to listeners.
Because data is removed during lossy compression and cannot be recovered by decompression, some people avoid lossy compression for archival storage. Hence, as noted, even those
who use lossy compression (for portable audio applications, for example) may wish to keep a
losslessly compressed archive for other applications. In addition, the technology of compression
continues to advance, and achieving a state-of-the-art lossy compression would require one to
begin again with the lossless, original audio data and compress with the new lossy codec. The
nature of lossy compression (for both audio and images) results in increasing degradation of
quality if data are decompressed, then recompressed using lossy compression.
Coding methods
The masking threshold is calculated using the absolute threshold of hearing and the principles
of simultaneous masking - the phenomenon wherein a signal is masked by another signal
separated by frequency - and, in some cases, temporal masking - where a signal is masked by
another signal separated by time. Equal-loudness contours may also be used to weight the
perceptual importance of different components. Models of the human ear-brain combination
incorporating such effects are often called psychoacoustic models.
Applications
Due to the nature of lossy algorithms, audio quality suffers when a file is decompressed and
recompressed (digital generation loss). This makes lossy compression unsuitable for storing the
intermediate results in professional audio engineering applications, such as sound editing and
multitrack recording. However, lossy formats are very popular with end users (particularly MP3), as a megabyte can store about a minute's worth of music at adequate quality.
Usability
Usability of lossy audio codecs is determined by factors such as their suitability for streaming and their inherent coding latency.
Lossy formats are often used for the distribution of streaming audio, or interactive applications
(such as the coding of speech for digital transmission in cell phone networks). In such
applications, the data must be decompressed as the data flows, rather than after the entire data
stream has been transmitted. Not all audio codecs can be used for streaming applications, and for
such applications a codec designed to stream data effectively will usually be chosen.
Latency results from the methods used to encode and decode the data. Some codecs will analyze
a longer segment of the data to optimize efficiency, and then code it in a manner that requires a
larger segment of data at one time in order to decode. (Often codecs create segments called a
"frame" to create discrete data segments for encoding and decoding.) The inherent latency of the
coding algorithm can be critical; for example, when there is two-way transmission of data, such
as with a telephone conversation, significant delays may seriously degrade the perceived quality.
In contrast to the speed of compression, which is proportional to the number of operations
required by the algorithm, here latency refers to the number of samples which must be analysed
before a block of audio is processed. In the minimum case, latency is zero samples (e.g., if the
coder/decoder simply reduces the number of bits used to quantize the signal). Time domain
algorithms such as LPC also often have low latencies, hence their popularity in speech coding for
telephony. In algorithms such as MP3, however, a large number of samples have to be analyzed
in order to implement a psychoacoustic model in the frequency domain, and latency is on the
order of 23 ms (46 ms for two-way communication).
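The frame-buffering component of that delay is simple arithmetic; an MP3 frame, for instance, holds 1,152 samples:

```python
def frame_latency_ms(frame_samples, sample_rate):
    # Minimum algorithmic delay: a whole frame must be buffered
    # before it can be encoded or decoded.
    return 1000.0 * frame_samples / sample_rate

# 1,152 samples per MP3 frame at 48 kHz: 24 ms of buffering alone,
# in the same ballpark as the ~23 ms figure quoted above.
print(frame_latency_ms(1152, 48_000))  # 24.0
```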
Speech encoding
Speech encoding is an important category of audio data compression. The perceptual models
used to estimate what a human ear can hear are generally somewhat different from those used for
music. The range of frequencies needed to convey the sounds of a human voice is normally far narrower than that needed for music, and the sound is normally less complex. As a result, speech
can be encoded at high quality using relatively low bit rates.
Perhaps the earliest algorithms used in speech encoding (and audio data compression in general)
were the A-law algorithm and the µ-law algorithm.
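The continuous μ-law companding curve can be sketched as follows (note that G.711 itself standardizes a piecewise-linear 8-bit approximation, so this is an idealized illustration):

```python
import math

MU = 255  # standard mu-law parameter used in North American telephony

def mulaw_encode(x):
    # Compress amplitude in [-1, 1] logarithmically: quiet sounds get
    # proportionally more resolution than loud ones.
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mulaw_decode(y):
    # Invert the companding curve.
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

x = 0.01                                  # a quiet sample
y = mulaw_encode(x)
assert abs(mulaw_decode(y) - x) < 1e-9    # companding alone is invertible
# y is about 0.23: quiet sounds are boosted before quantization, so they
# keep more resolution than loud ones once the bit depth is reduced.
```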
History
(Figure: Solidyne 922, the world's first commercial audio bit compression card for PC, 1990.)
A literature compendium for a large variety of audio coding systems was published in the IEEE
Journal on Selected Areas in Communications (JSAC), February 1988. While there were some
papers from before that time, this collection documented an entire variety of finished, working
audio coders, nearly all of them using perceptual (i.e. masking) techniques and some kind of
frequency analysis and back-end noiseless coding.[1] Several of these papers remarked on the
difficulty of obtaining good, clean digital audio for research purposes. Most, if not all, of the
authors in the JSAC edition were also active in the MPEG-1 Audio committee.
The world's first commercial broadcast automation audio compression system was developed by
Oscar Bonello, an Engineering professor at the University of Buenos Aires.[2] In 1983, using the
psychoacoustic principle of the masking of critical bands first published in 1967,[3] he started
developing a practical application based on the recently developed IBM PC computer, and the
broadcast automation system was launched in 1987 under the name Audicom. Twenty years later,
almost all the radio stations in the world were using similar technology, manufactured by a
number of companies.
Video compression
Video compression refers to reducing the quantity of data used to represent digital
video images, and is a combination of spatial image compression and temporal motion
compensation. Video compression is an example of the concept of source coding in Information
theory. This article deals with its applications: compressed video can effectively reduce
the bandwidth required to transmit video via terrestrial broadcast, via cable TV, or via satellite
TV services.
Most video compression is lossy — it operates on the premise that much of the data present
before compression is not necessary for achieving good perceptual quality. For
example, DVDs use a video coding standard called MPEG-2 that can compress around two hours
of video data by 15 to 30 times, while still producing a picture quality that is generally
considered high-quality for standard-definition video. Video compression is a tradeoff between disk space, video quality, and the cost of hardware required to decompress the
video in a reasonable time. However, if the video is overcompressed in a lossy manner, visible
(and sometimes distracting) artifacts can appear.
The programming provider has control over the amount of video compression applied to their
video programming before it is sent to their distribution system. DVDs, Blu-ray discs, and HD
DVDs have video compression applied during their mastering process, though Blu-ray and HD
DVD have enough disc capacity that most compression applied in these formats is light, when
compared to such examples as most video streamed on the internet, or taken on a cellphone.
Software used for storing video on hard drives or various optical disc formats will often have a
lower image quality, although not in all cases. High-bitrate video codecs with little or no
compression exist for video post-production work, but create very large files and are thus almost
never used for the distribution of finished videos. Once excessive lossy video compression
compromises image quality, it is impossible to restore the image to its original quality.
Video is basically a three-dimensional array of color pixels. Two dimensions serve as spatial
(horizontal and vertical) directions of the moving pictures, and one dimension represents the time
domain. A data frame is a set of all pixels that correspond to a single time moment. Basically, a frame is the same as a still picture.
Video data contains spatial and temporal redundancy. Similarities can thus be encoded by merely
registering differences within a frame (spatial), and/or between frames (temporal). Spatial
encoding is performed by taking advantage of the fact that the human eye is unable to distinguish
small differences in color as easily as it can perceive changes in brightness, so that very similar
areas of color can be "averaged out" in a similar way to JPEG images (JPEG image compression FAQ, part 1/2). With temporal compression, only the changes from one frame to the next are encoded, as often a large number of the pixels will be the same on a series of frames.
The most commonly used method works by comparing each frame in the video with the previous
one. If the frame contains areas where nothing has moved, the system simply issues a short
command that copies that part of the previous frame, bit-for-bit, into the next one. If sections of
the frame move in a simple manner, the compressor emits a (slightly longer) command that tells
the decompressor to shift, rotate, lighten, or darken the copy — a longer command, but still much
shorter than intraframe compression. Interframe compression works well for programs that will
simply be played back by the viewer, but can cause problems if the video sequence needs to be
edited.
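A stripped-down sketch of the interframe idea (real codecs use motion-compensated blocks rather than per-pixel deltas, so this is illustration only):

```python
def encode_delta(prev, curr):
    # Interframe sketch: store only the pixels that changed, as
    # (index, new_value) pairs, instead of the whole frame.
    return [(i, c) for i, (p, c) in enumerate(zip(prev, curr)) if p != c]

def decode_delta(prev, delta):
    # Rebuild the current frame by patching a copy of the previous one.
    frame = list(prev)
    for i, value in delta:
        frame[i] = value
    return frame

frame1 = [10] * 100                   # flat gray frame
frame2 = list(frame1)
frame2[40:44] = [200, 200, 200, 200]  # a small object appears

delta = encode_delta(frame1, frame2)  # only 4 pixels need to be sent
assert decode_delta(frame1, delta) == frame2
# But if frame1 is cut out or lost, frame2 cannot be rebuilt from delta alone.
```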
Since interframe compression copies data from one frame to another, if the original frame is
simply cut out (or lost in transmission), the following frames cannot be reconstructed properly.
Some video formats, such as DV, compress each frame independently using intraframe
compression. Making 'cuts' in intraframe-compressed video is almost as easy as editing
uncompressed video — one finds the beginning and ending of each frame, and simply copies bit-
for-bit each frame that one wants to keep, and discards the frames one doesn't want. Another
difference between intraframe and interframe compression is that with intraframe systems, each
frame uses a similar amount of data. In most interframe systems, certain frames (such as "I
frames" in MPEG-2) aren't allowed to copy data from other frames, and so require much more
data than other frames nearby.
It is possible to build a computer-based video editor that spots problems caused when I frames
are edited out while other frames need them. This has allowed newer formats like HDV to be
used for editing. However, this process demands a lot more computing power than editing
intraframe compressed video with the same picture quality.
Current forms
Today, nearly all video compression methods in common use (e.g., those in standards approved
by the ITU-T or ISO) apply a discrete cosine transform (DCT) for spatial redundancy reduction.
Other methods, such as fractal compression, matching pursuit and the use of a discrete wavelet
transform (DWT) have been the subject of some research, but are typically not used in practical
products (except for the use of wavelet coding as still-image coders without motion
compensation). Interest in fractal compression seems to be waning, due to recent theoretical analysis showing a comparative lack of effectiveness of such methods.
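The reason the DCT dominates is energy compaction: for smooth signals, almost all of the energy lands in the first few coefficients, which can be kept while the rest are coarsely quantized or dropped. A minimal unnormalized 1-D DCT-II sketch:

```python
import math

def dct_ii(block):
    # 1-D DCT-II (unnormalized), the transform at the heart of
    # JPEG/MPEG-style coders.
    N = len(block)
    return [sum(x * math.cos(math.pi * (n + 0.5) * k / N)
                for n, x in enumerate(block))
            for k in range(N)]

# A smooth 8-sample block, like a row from a gently shaded image area.
block = [math.cos(math.pi * n / 16) for n in range(8)]
coeffs = dct_ii(block)
low = sum(c * c for c in coeffs[:2])   # energy in the 2 lowest coefficients
high = sum(c * c for c in coeffs[2:])  # energy in the remaining 6
# low dwarfs high: the block is well represented by just two numbers.
```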
Image compression
Image compression is the application of data compression on digital images. In effect, the
objective is to reduce redundancy of the image data in order to be able to store or transmit data in
an efficient form.
(Figure: a chart showing the relative quality of various JPEG settings, comparing a normal JPEG save with a "save for web" export.)
Image compression can be lossy or lossless. Lossless compression is sometimes preferred for
medical imaging, technical drawings, icons or comics. This is because lossy compression
methods, especially when used at low bit rates, introduce compression artifacts. Lossless
compression methods may also be preferred for high value content, such as medical imagery or
image scans made for archival purposes. Lossy methods are especially suitable for natural
images such as photos in applications where minor (sometimes imperceptible) loss of fidelity is
acceptable to achieve a substantial reduction in bit rate. The lossy compression that produces
imperceptible differences can be called visually lossless.
The best image quality at a given bit-rate (or compression rate) is the main goal of image
compression. However, there are other important properties of image compression schemes:
Region of interest coding. Certain parts of the image are encoded with higher quality than
others. This can be combined with scalability (encode these parts first, others later).
Meta information. Compressed data can contain information about the image which can be used
to categorize, search or browse images. Such information can include color and texture statistics,
small preview images and author/copyright information.
The quality of a compression method is often measured by the peak signal-to-noise ratio (PSNR). It measures the amount of noise introduced through a lossy compression of the image. However,
the subjective judgement of the viewer is also regarded as an important, perhaps the most
important, measure.
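Peak signal-to-noise ratio itself is a one-line formula over the mean squared error; a minimal sketch for 8-bit images flattened to lists:

```python
import math

def psnr(original, compressed, max_value=255):
    # Peak signal-to-noise ratio between two equally sized images,
    # here flattened to plain lists of 8-bit pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(original, compressed)) / len(original)
    if mse == 0:
        return float("inf")    # identical images: lossless
    return 10 * math.log10(max_value ** 2 / mse)

img = [100, 120, 130, 90]
light = [101, 119, 131, 90]    # mild lossy artifacts
heavy = [110, 100, 150, 70]    # severe artifacts
assert psnr(img, light) > psnr(img, heavy)  # higher PSNR = less distortion
```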
Methods
Image
Cartesian Perceptual Compression: Also known as CPC
DjVu
Fractal compression
HAM, hardware compression of color information used in Amiga computers
ICER, used by the Mars Rovers: related to JPEG 2000 in its use of wavelets
JPEG
JPEG 2000, JPEG's successor format that uses wavelets, for lossy or lossless compression.
JBIG2
PGF, Progressive Graphics File (lossless or lossy compression)
Wavelet compression
S3TC texture compression for 3D computer graphics hardware
Video
H.261
H.263
H.264
MNG (supports JPEG sprites)
Motion JPEG
MPEG-1 Part 2
MPEG-2 Part 2
MPEG-4 Part 2 and Part 10 (AVC)
Ogg Theora (noted for its lack of patent restrictions)
Dirac
Sorenson video codec
VC-1
Audio
Music
AAC
ADPCM
ATRAC
Dolby AC-3
MP2
MP3
Musepack
Ogg Vorbis (noted for its lack of patent restrictions)
WMA
Speech
CELP
G.711
G.726
Harmonic and Individual Lines and Noise (HILN)
AMR (used by GSM cell carriers, such as T-Mobile)
Speex (noted for its lack of patent restrictions)
Other data
Researchers have (semi-seriously) performed lossy compression on text by either using a
thesaurus to substitute short words for long ones, or generative text techniques [3], although these
sometimes fall into the related category of lossy data conversion.