Académique Documents
Professionnel Documents
Culture Documents
INTRODUCTION
Meridian Lossless Packing (MLP) is a lossless coding
{
24
system for use on high-quality digital audio data n
audio 24 Lossless
originally represented as linear PCM. High quality audio channels encoder buffer
24
core
these days implies high sample rates, large word sizes metadata
24
DVD/CD
and multichannel.
Lossless encoder
This paper describes the MLP system and is an
abbreviated version of [14].
24 { n
audio
1 OVERVIEW Lossless 24
channels
buffer decoder
24 out
MLP performs lossless compression of up to 63 audio core
24
channels including 24-bit material sampled at rates as DVD/CD metadata
metadata
2.2 Integrity
Coder simulation 96kHz 24-bit 6ch chamber music
10Mbps A lossless encoding – decoding system displays an
inherent integrity. Once audio has been ‘wrapped up’ in
8Mbps the MLP stream it will remain intact through any
intermediate storage or transmission process.
6Mbps An MLP decoder can continuously test against checks
Data Rate
Entropy
C shift de-correlator
coder
h
R
n a
e Interleave
channels n Lossless Entropy
m shift de-correlator and add
PCM n matrix coder
a substream
audio e headers for
p encoding
l
s Entropy parameters
shift de-correlator
coder
lsb bypass
S1 +
+ S1' + S1
-
S2 S2
S2'
S3 S3
S3
S4 S4
S4
m_coeff[1,2] m_coeff[1,2]
+ Q + Q
m_coeff[1,3] m_coeff[1,3]
m_coeff[1,4] m_coeff[1,4]
performance. Frequency
-10dB
100dB
Signal -20dB
Spectral level
80dB
Average
60dB
Spectral level
-30dB
40dB
-40dB Input
IIR4 residual
20dB
FIR8 residual
-50dB
0dB
0kHz 10kHz 20kHz 30kHz 40kHz
Frequency -60dB
0Hz 5kHz 10kHz 15kHz 20kHz
Frequency
Figure 5: Spectra of a signal and its average level. Figure 7: Spectra for a signal excerpt and the residuals
using 8th order FIR and 4th order IIR predictors.
4.6 Buffering
We have explained that while normal audio signals can
be well predicted, there will be occasional fragments
like sibilants, synthesised noise or percussive events that
have high entropy.
D samples
MLP uses a particular form of stream buffering that can delay
reduce the variations in transmitted data rate, absorbing
transients that are hard to compress. Lossless FIFO FIFO Lossless
FIFO memory buffers are used in the encoder and Encoder Encode Decode Decoder
Core Buffer Buffer Core
decoder as shown in Figure 9. These buffers are Transmission
medium
cueing, the FIFO management minimises the part of the Figure 9: Outlining the buffering used in MLP.
delay due to the decoder buffer. So, this buffer is
normally empty and fills only ahead of sections with MLP coding of RPO3: Variable Data Rate
high instantaneous data rate. 10Mbps
During these sections, the decoder’s buffer empties and
6Mbps
plotted through a 30-second 96kHz 24-bit 6-channel DVD-A limit
excerpt featuring a close recording of a jazz saxophone. CP2
4Mbps
Figure 11 indicates the underlying compression with the
encoder set to limit above 9.5Mbps. The minimum-rate
2Mbps
encode shown in Figure 12 makes long-term (but low
occupancy) use the decoder buffer.
0bps
It should be obvious that the situation in Figure 12 is 0 200 400 600 800 1000 1200 1400 1600 1800
preferable if the transmission channel (maybe DVD Time (Packets)
disc) has other calls on the bandwidth – for example
bandwidth to transmit associated picture or text. Figure 11: Showing an unlimited data rate encoding.
Figure 13 shows how hard-to-compress signals can be MLP of CP2: Minimum Data Rate
10Mbps
squeezed below a preset format limit. This 30-second
96kHz 24-bit recording features closely recorded
8Mbps
cymbals in 6 channels. At the crescendo this signal is
virtually random and the underlying compressed data
MLP Data Rate
6Mbps
rate is 12.03Mbps. Buffering allows the MLP encoder to DVD-A limit
hold the transmitted data rate below 9.2Mbps by filling CP2
4Mbps
the decoder buffer to a short-term maximum of 86kbyte
(bottom curve).
2Mbps
0bps
0 200 400 600 800 1000 1200 1400 1600 1800
Time (Packets)
8Mbps 160k
96k
separate mix, mastering and authoring processes and
Input data rate (top)
4Mbps
DVD-A limit uses disc capacity.
Entropy data 64k
2Mbps
MLP stream In cases where only one multichannel stream is
Buffer occupancy 32k
0bps available, then there are very few options at replay – one
0 200 400 600 800 1000 1200 1400 1600
0
1800
is to use either a fixed or guided downmix. However, to
Time (Packets) create such a downmix it is first necessary to decode the
full multichannel signal; this contravenes the desirable
Figure 13: Buffering allows a difficult passage to remain principle that decoder complexity should decrease with
below a hard format limit. functionality.
M
decode substream 0.
24
buffer core 0 m1
a A more complex decoder may access both the 2-
De-packetiser t 24
substream 1
m2 r
channel and multi-channel versions losslessly.
The downmix coefficients do not have to be
24
m3 i
FIFO Decoder x 24
buffer core 1 m4 constant for a whole track, but can be varied under
24
m5
1 artistic control.
24
Additional
data
Entropy
re-correlator shift C
decoder
h
R
a n
De-interleave e
Entropy Lossless n channels
and extract decoder
re-correlator
matrix
shift m
n PCM
substream a
decoding e audio
parameters p
l
from headers Entropy s
decoder
re-correlator shift
lsb bypass
{
Multi
channel between 40 and 160 samples.
6
channels Lossless
Lossless Blocks are assembled into packets. The user and/or
encoder
mix-down core the encoder can adjust the length of packets. A
Stereo DVD typical range is between 640 and 2560 samples.
downmix Each packet contains full initialisation and restart
instructions information. Therefore the decoder can recover
from severe transmission errors, or start up
Monitor
losslessly in mid-stream typically within 7ms.
Lossless encoder
6Mbps
DVD-A limit
6ch + 2 ch (upper)
The fixed-rate stream is packetised to provide losslessly
4Mbps 6ch compressed audio at a constant data rate. Encoding for
fixed rate can be a single-pass process if the target data
2Mbps rate is always attainable.
At times when the compressed data rate is less than the
0bps target, the encoder will fold in padding data or transmit
0 200 400 600 800 1000 1200 1400 1600 1800
a pending payload of additional data (see section 11).
Time (Packets)
Table 1 Data-rate reduction: At 44·1 or 48 kHz, the peak data rate can almost always
bits/sample/channel be reduced by at least 4 bits/sample, i.e. 16-bit audio can
Sampling kHz Peak Average be losslessly compressed to fit into a 12-bit channel.
48 4 5-11
At 96kHz, the peak data rate can similarly be reduced by
96 8 9–13
8 bits/sample, i.e. 24-bit audio can be compressed to 16
192 9 9–14
bits and 16-bit 96kHz audio can be losslessly
MLP of OP1: 192kHz 2 channel compressed to fit into an 8-bit channel.
10Mbps The important parameter for transmission applications is
the reduction of the peak rate. In the case of DVD-
8Mbps Audio peak rate is a key parameter because the encoded
stream must always operate below the audio buffer data-
6Mbps rate limit of 9.6Mbps.
The average number in Table 1 indicates the degree of
4Mbps compression that could be obtained when using MLP in
DVD-A limit
OP1
an archive, mastering or editing environment. For
2Mbps example, a peak data rate reduction of 8 bits/sample
means that a 96kHz 24-bit channel can be carried on the
0bps disc with a rate equal to that of a 24–8 = 16-bit LPCM
0 500 1000 1500 2000 2500 3000
channel. However the space used on the disc is
Time (Packets)
estimated by the average saving, in this case the residual
Figure 19: Compressed data rate for a 24-bit 2-channel will be 24–11 = 13bits/channel.
item sampled at 192kHz. Consider that an 11-bit saving represents a compression
ratio of 1.85:1 with 24-bit material, whereas the same
2.0Mbps
MLP of Take5: 44.1kHz 2 channel saving compresses 16-bit audio by 3.2:1!
Figure 19 shows a typical progression through 2-channel
192kHz 24-bit material (original data rate 9.216Mbps).
1.5Mbps Figures 20 and 21 show compression examples at CD
quality. The 2-channel example in Figure 20 shows an
1.0Mbps
average 2:1 compression.
Note that the 3-channel horizontal ambisonic B-format
(WXY) stream in Figure 21 (opening of Rachmaninov’s
500.0kbps
CD rate limit 2nd piano concerto) shows sufficient peak rate
Take5 compression to allow the stream to fit on a CD.
0.0bps
0 1000 2000 3000 4000 5000 6000 7000 8000 9.1 Compression adjustment
Time (Packets)
A producer may wish to save space used by a recording,
Figure 20: Compressed data rate for ‘Take Five’ – a 16- or to reduce data rate. Lossless compression extends the
bit 2-channel 44.1kHz item from CD. number of options.
With MLP, data is automatically saved if the incoming
2.0Mbps MLP of Rach4: Ambisonic WXY precision is reduced. So, reducing (for example) a few
or all channels in a mix from 24 to 22-bit will provide an
automatic data saving. The authors have previously
1.5Mbps
described appropriate quantising strategies.
[9][10][11][12]
1.0Mbps Another option for reducing encoded data is to low-pass
filter some of the incoming channels. Low-pass filtering
500.0kbps
reduces the entropy in the signal and the lossless coder
CD rate limit generally provides a lower data rate.
Rach4 A typical 96kHz 24-bit 6-channel program would
0.0bps encode to an average of 7.2Mbps. Reducing the audio
0 5000 10000 15000 20000
Time (Packets)
bandwidth with simple filtering from 48kHz to 24kHz
will generally reduce the rate to below 5Mbps.
Figure 21: Compressed data rate for a horizontal
ambisonic WXY 16-bit 3-channel 44.1kHz fragment
compressed for delivery on CD.
MLP was conceived as a general-purpose lossless [7] Craven, P.G. and Gerzon, M.A., ‘Lossless
compression system and the system is defined in terms Coding Method for Waveform Data’,
of the bitstream and the required decoder behaviour. International Patent Application no.
As a result, encoder developments may continue (for PCT/GB96/01164 (May 1996)
example, for increased compression) without outdating
the installed base of decoders. Current decoders are [8] Craven, P.G., Law, M.J. and Stuart, J.R.,
required to decode any legal bitstream, so there will be ‘Lossless Compression using IIR Prediction
no question of ‘old’ decoders being unable to decode Filters’, J. Audio Eng. Soc. (Abstracts), vol. 45
‘new’ software. no. 5 p. 404 (Preprint #4415) (March 1997)
The bitstream has been designed to keep open as many
options as possible for future encoder developments, [9] Craven, PG and Gerzon, MA. ‘Optimal Noise
while not impacting decoder complexity and MIPS more Shaping and Dither of Digital Signals.’ presented
than necessary. at AES 87th Convention, New York (Preprint
While the highest compression requires sophisticated #2822) (October 1989).
encoders, near optimal encoding of most music signals
can be obtained with much simpler encoders that have [10] Gerzon, M.A., Craven, P.G., Stuart, J.R. and
modest MIPS requirements and can run in real time with Wilson, R.J., ‘Psychoacoustic Noise Shaped
low latency on cheaply available DSP devices. Thus Improvements in CD and Other Linear Digital
future use in consumer record/playback systems and in Media’, AES 94th Convention, Berlin, (Preprint
radio microphones or other real-time applications is #3501) (March 1993)
entirely feasible.
[11] Stuart, J.R. and Wilson, R.J., ‘Dynamic Range
13 REFERENCES Enhancement Using Noise-shaped Dither
[1] Acoustic Renaissance for Audio, ‘A Proposal for Applied to Signals with and without Pre-
High-Quality Application of High-Density CD emphasis’, AES 96th Convention, Amsterdam,
Carriers’, private publication (February 1995) (Preprint #3871) (1994)
Available for download at: http://www.meridian-
audio.com/ara. [12] Stuart, J.R. and Wilson, R.J., ‘Dynamic Range
Enhancement using Noise-Shaped Dither at 44.1,
[2] Shannon, C.E. ‘A Mathematical Theory of 48 and 96 kHz’, AES 100th Convention,
Communication’, Bell System Technical Journal Copenhagen (May 1996)
vol 27 pp.379-423,623-656 (July, October 1948)
[13] Toshiba et al., ‘DVD Specifications for Read-
[3] Robinson, A. ‘Shorten: Simple lossless and near- Only Disc, Part 4: Audio Specification’ Version
lossless waveform compression.’ Tech Rep 1.0 (March 1999)
CUED/F-INFENG/TR.156, Cambridge
University, Cambridge, UK (Dec 1994). [14] Gerzon, M.A., Craven, P.G., Stuart, J.R., Law,
M. and Wilson, R.J., ‘The MLP Lossless
[4] Bruekers, AAML, Oomen, AWJ and van der Compression System’, to be presented at AES
Vleuten, RJ. ‘Lossless Coding for DVD Audio.’ 17th International Conference on High Quality
AES 101st Convention, Los Angeles (Preprint Audio Coding, Florence (September 1999).
#4358) (November 1996).