
Channel Coding Techniques for Adaptive Multi Rate Speech Transmission

Thomas Hindelang, Joachim Hagenauer, Max Schmautz, Wen Xu

Institute for Communication Engineering (LNT), Munich University of Technology (TUM), Arcisstr. 21, 80290 Munich, Germany. Email: firstname.lastname@ei.tum.de

Dept. of Mobile Phone Development, Siemens AG, Hofmannstr. 51, 81359 Munich, Germany. Email: firstname.lastname@mch.siemens.de

Abstract: A variable channel coding scheme for Adaptive Multi Rate (AMR) speech transmission over mobile radio channels is proposed. Although it was developed for the GSM (Global System for Mobile Communications), the basic concept of variable channel coding can be adapted to other digital radio systems. The new AMR concept allows almost wire-line speech quality even for poor channel conditions by dynamically splitting the gross bit rate between source (speech) and channel coding according to the channel quality. In this study we present some new aspects of AMR and show the advantage of recursive systematic convolutional (RSC) codes for mobile speech transmission. With some modification they improve not only the bit error rate but also the frame erasure rate. It is shown that RSC codes can be decoded with a standard non-systematic Viterbi decoder and a simple transformation. A new, powerful approach is employed for the detection of the currently used mode (in-band signaled information): the mode bits are integrated into a long block, exploiting the fact that codes gain with their length.

I. INTRODUCTION

Due to a strongly varying transmission environment, mobile communication systems such as the current GSM suffer from a non-optimum partition of the bit rate between source coding and channel coding. For example, in the GSM the source and channel coding rates remain fixed, independent of the channel quality. Under bad channel conditions (deep fading) the redundancy inserted by channel coding might be insufficient to correct transmission errors, so that the speech signal cannot be properly reconstructed, which leads to very annoying artifacts. On the other hand, for good channels the overall speech quality can be improved if more bits are spent on source coding. The current digital radio systems are designed as a compromise between channel coding powerful enough to remove most transmission errors and sufficiently good speech quality.

Now, the so-called adaptive multi rate (AMR) speech codec is standardized for GSM in ETSI (European Telecommunications Standards Institute). The AMR concept solves the source/channel rate allocation problem in a more intelligent way. The ratio between source bit rate and error protecting redundancy is adapted according to the channel conditions. When the channel is bad, the source encoder operates at low bit rates with lower speech quality, allowing more bits to be used for powerful forward error correction. The highest rate of the speech encoder is used for good channels, since in this case weak error protection is sufficient.

In this paper we present some new aspects of channel coding for mobile speech transmission and a new approach for transmitting signaling information. First, a brief description of the AMR concept is given in section II. The channel coding with unequal error protection (UEP) and recursive systematic convolutional (RSC) codes is described in section III. It is extended by an approach decreasing the frame erasure rate and by a derivation of decoding RSC codes with a standard non-systematic decoder. In section IV the generation and in-band transmission of the mode bits is outlined and compared to the standardized system with short block codes.

II. THE AMR CONCEPT

The basic concept of an AMR speech transmission system is depicted in Figure 1.

[Figure 1: block diagram of the mobile station and the base station. Each side contains a speech encoder and decoder (SP-Enc, SP-Dec), a channel encoder and decoder (CH-Enc, CH-Dec), a channel estimator (CH-Est), and a control unit (net driven in the base station). The signals shown are the UL and DL rates, the generated and detected mode bits, the DL channel metric, and the required UL rate.]

Fig. 1. Overview of the GSM AMR concept.

Both the base station (BS) and the mobile station (MS) mainly consist of the following functional entities:

• a speech codec with variable bit rate (SP-Enc, SP-Dec),
• a channel codec with variable error protection rate, matched to the bit rate of the speech codec (CH-Enc, CH-Dec),
• a channel estimation entity (CH-Est),
• a control unit for the rate adaptation.

In the studied system, the BS is the master and decides on the modes (rates) in both uplink (UL) and downlink (DL). The MS decodes the modes which are used in UL and DL, and sends the estimated channel metric to the BS. A channel quality parameter is derived from the soft output generated by the equalizer. It is used to control the codec mode (rate) in UL and DL. The AMR concept for both uplink and downlink of the GSM speech transmission is outlined below in more detail.

Uplink: After initialization the mobile starts transmission with the lowest speech bit rate, ensuring a secure transmission. The mode bits (i.e., the information about the used bit rate) and the DL channel metric (information about the DL channel quality) are sent to the channel encoder and transmitted in-band.

Hence, no control channel is necessary for the rate and channel quality indication. At the receiver (i.e., the BS) appropriate channel decoding (including the detection of the in-band information) is done first, followed by speech decoding. In parallel, a measurement of the UL channel quality is carried out by the BS channel estimator. The measured UL channel quality and the detected DL channel quality metric are fed to the BS control unit, which determines the current DL rate (based on channel metric analysis) and the requested UL rate (based on the measured UL channel quality).

Downlink: The current DL mode as well as the requested UL rate are transmitted in-band to the MS. The MS performs channel and speech decoding according to the detected DL rate (mode). Similar to the UL, a DL channel quality measurement is done by the estimator of the MS, and the requested UL rate is decoded from the received bit stream. From the measured DL channel quality a DL channel metric is calculated by the MS control unit and then transmitted in-band to the BS. The speech encoder now operates with the new requested UL rate.

In Figure 1 the dashed-dotted lines indicate the DL signal flow and the dashed lines the UL signal flow, both in the MS and BS.

III. CHANNEL CODING OF THE SOURCE ENCODED BITS

A. Unequal error protection

The coded bits of the speech encoder are of varying importance for the speech decoder to reconstruct the original speech. Thus, the corruption of the speech encoded bits due to transmission errors has different impacts on the quality of the decoded speech. Considering a CELP (code excited linear prediction) speech coder, the bits of the LPC (linear predictive coding) coefficients are usually more important than those of the fixed codebook. Figure 2 shows the bit error sensitivity of a 6.1 kbits/s CELP speech codec [1].

[Figure 2: SegSNR in dB (axis from 1 to 19) versus bit number (1 to 120); the bits are grouped into LPC, LTPGain, LTPIndex, CBGain, and CBIndex.]

Fig. 2. Bit error sensitivity of CELP coded speech.

The SegSNR (segmental signal to noise ratio) [2] is plotted over the bit positions, i.e., the considered bit is chosen randomly and the decoded speech is compared to the error free transmitted speech. Furthermore, the SegSNR is limited to +20 dB per segment of 20 ms. A low SegSNR indicates a greater error sensitivity. In this example, the first 9 bits of the LPC coefficients, denoting the first vector quantized index, are very sensitive to errors, but many other bits (SegSNR > 10 dB) are robust against errors. Therefore, an unequal error protection (UEP) is advantageous. The redundancy inserted by channel coding for the most important bits has to be greater than for the less important ones. That means the information bits should be classified according to their sensitivity and then protected correspondingly. The required rates after channel encoding (e.g., 22.8 kbits/s for GSM full rate) can be obtained by puncturing. More information about UEP and puncturing can be found in [3, 4].

B. Recursive systematic convolutional codes

In this study we used recursive systematic convolutional (RSC) codes with constraint length 5 for convolutional channel coding. The use of RSC instead of non-systematic convolutional (NSC) codes has several advantages. The bit error rate (BER) for typical mobile radio channels is lower or, alternatively, the needed channel signal to noise ratio (SNR) for RSC codes at the same BER (up to ) is lower than for NSC codes. Notice that the block error rate remains unchanged for both RSC and NSC codes. The gain in bit error rate compared to NSC codes increases with the rate of the code. For a convolutional code of rate 2/3 and constraint length 4 we gain approx. 0.8 dB at a bit error rate of . This gain holds only if the systematic bits are not punctured.

Another advantage of RSC codes arises from the fact that the systematic bits are transmitted over the channel and hence all information bits are available in the received bit stream before channel decoding. So, a priori information for each bit can be calculated if there is redundancy left in the source coded bit stream [5-7] and added to the soft input outside the Viterbi decoder. That means any Viterbi decoder can exploit a priori knowledge to improve the decoding result. Fig. 3 shows an example for the realization of an RSC code with shift registers. The RSC code uses the two generator polynomials defined for GSM full rate [8], one of which serves as the feedback polynomial. Higher rates can be achieved by puncturing.

Fig. 3. Realization of an RSC code.
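The shift register structure of paragraph III-B (Fig. 3) can be illustrated with a minimal sketch. It assumes the GSM 05.03 full rate polynomials G0 = 1 + D^3 + D^4 and G1 = 1 + D + D^3 + D^4; treating G0 as the feedback polynomial and G1 as the feedforward polynomial is an assumption made here for illustration, and the function name is hypothetical.

```python
from typing import List, Tuple

# Coefficients of D^0 .. D^4 (memory 4, constraint length 5).
G0 = [1, 0, 0, 1, 1]  # 1 + D^3 + D^4, assumed to be the feedback polynomial
G1 = [1, 1, 0, 1, 1]  # 1 + D + D^3 + D^4, assumed to be the feedforward polynomial


def rsc_encode(bits: List[int], fb: List[int] = G0,
               ff: List[int] = G1) -> List[Tuple[int, int]]:
    """Rate 1/2 RSC encoder: returns (systematic, parity) pairs.

    The register starts in the all-zero state; no tail bits are appended.
    """
    memory = len(fb) - 1
    state = [0] * memory            # state[i] holds a_{k-1-i}
    out = []
    for u in bits:
        a = u                       # recursion: a_k = u_k + sum fb[i] * a_{k-i}
        for i in range(memory):
            a ^= fb[i + 1] & state[i]
        p = ff[0] & a               # parity: p_k = sum ff[i] * a_{k-i}
        for i in range(memory):
            p ^= ff[i + 1] & state[i]
        out.append((u, p))          # the information bit is sent unchanged
        state = [a] + state[:-1]    # shift the register
    return out


impulse = [1, 0, 0, 0, 0, 0, 0, 0]
pairs = rsc_encode(impulse)
systematic = [s for s, _ in pairs]
parity = [p for _, p in pairs]
```

Because of the feedback, a single input 1 keeps producing parity bits after the input has returned to zero, which is the recursive behaviour the text refers to; the systematic stream is simply the input.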

C. Improving the frame erasure rate with RSC codes and UEP

Typically, in speech transmission the bits are assigned to classes depending on their importance for the speech quality. For an important class, error detection techniques such as a cyclic redundancy check or a quality estimation are used, and their result determines whether a frame erasure is declared. In some modes of the GSM speech channel, e.g., TCH/AHS 7.4 kbits/s (Traffic CHannel, Adaptive multi rate Half rate Speech), the class 1 bits are protected with a rate 2/3 convolutional code obtained by puncturing a rate 1/2 code. In the following we show by an example how to modify a codec to improve both the BER and the frame erasure rate (FER).
[Figure 4: upper part, EEP with NSC codes: 140 bits and 4 tail bits at rate 2/3; lower part, UEP with RSC codes: 70 bits at rate 7/11, 4 bits at rate 2/3, and 66 bits plus 4 tail bits at rate 7/10.]

Fig. 4. Equal error protection with NSC codes and unequal error protection with RSC codes.

[Figure 5: FER (upper part) and BER (lower part) on a logarithmic scale versus ES/N0 in dB (0 to 10 dB). FER curves: NSC code with EEP, RSC code with UEP. BER curves: uncoded, NSC code with EEP, RSC code class 1b, RSC code class 1a.]
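The bit budget of the two schemes in Fig. 4 can be checked with a few lines of arithmetic. The sketch below uses only the segment sizes and code rates quoted for the example; the helper name is hypothetical.

```python
from fractions import Fraction


def coded_bits(segments):
    """Total number of coded bits for a list of (input_bits, code_rate) segments."""
    total = Fraction(0)
    for n_bits, rate in segments:
        total += n_bits / rate      # a rate k/n code maps k input bits to n coded bits
    return total


# EEP: 140 information bits plus 4 tail bits, all at rate 2/3.
eep = [(144, Fraction(2, 3))]

# UEP: 70 class 1a bits at 7/11, 4 bits at 2/3, 66 class 1b bits plus 4 tail bits at 7/10.
uep = [(70, Fraction(7, 11)), (4, Fraction(2, 3)), (70, Fraction(7, 10))]

eep_total = coded_bits(eep)
uep_total = coded_bits(uep)
uep_input = sum(n for n, _ in uep)
overall_rate = Fraction(uep_input) / uep_total
print(eep_total, uep_total, overall_rate)
```

Both schemes produce the same number of coded bits, so the UEP split reallocates redundancy between the classes without changing the overall rate of 2/3.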

In the upper part of Fig. 4 an equal error protection (EEP) scheme with termination is shown. EEP is often applied to the whole class 1. If an RSC code with the same EEP is used, the BER can be improved but the block error rate (at least one error within the whole block) stays the same. By using RSC codes we only change the assignment of the information bits to the paths in the trellis but not the coded bits. More information about code properties can be found in [9].

In mobile speech transmission the class 1 is divided into two subclasses 1a and 1b. Only the class 1a is considered in determining the frame erasure rate. For this reason we halve the 140 bits in the example into 70 class 1a bits and 70 class 1b bits. To achieve an overall rate of 2/3, 70 bits are coded with rate 7/11, 4 bits with rate 2/3, and 70 bits (66 + 4 tail bits) with rate 7/10. All rates are achieved by puncturing the non-systematic bits of a rate 1/2 RSC code. A frame erasure is declared if at least one of the 70 class 1a bits is wrong.

The upper part of Fig. 5 shows the frame erasure rate, which is improved by approx. 0.5 dB. In the lower part it can be seen that the bit error rate of the class 1b is lower than the BER of the NSC code up to 8 dB. For better channels it is slightly worse. The class 1a bit error rate is now more than a factor of 2 below the curve of the NSC code. Note that this gain is achieved without increasing complexity.

For the simulation results shown in Fig. 5 and for all further results we used a block fading channel, which means the coded bits are distributed over 8 time slots (exactly as in the GSM full rate [8, section 3.1.3]). Inside one slot the fading amplitude is kept constant, but between the slots it is statistically independent and Rayleigh distributed. This corresponds approximately to typical mobile channels at low speed with ideal frequency hopping, but with an additional gain of 3 to 4 dB because losses in synchronization, channel estimation, and equalization vanish.

Fig. 5. Frame erasure rate (upper part) and bit error rate (lower part) of the NSC code with EEP and the RSC code explained in Fig. 4.

D. Decoding an RSC code with a standard NSC decoder

The RSC decoding can be realized by using the NSC decoder for the equivalent non-systematic code. In Fig. 6 the realization of the RSC encoder is shown. First, the input sequence is transformed by a recursion and then encoded by an NSC code. Note that one of the coded bits is the systematic bit. After the NSC decoding the preceding transformation is canceled by a simple shift register with appropriate taps but without feedback. This leads to exactly the same hard decisions as decoding with an RSC decoder. However, the exact reliability information (MAP probabilities [9]) for each decoded bit can usually be generated only by an RSC decoder. The advantage of implementing RSC decoding with an NSC decoder is that the existing hardware (e.g., the standard Viterbi decoder for NSC codes) can still be used to decode the RSC code, and the complexity of calculating the information bits from the decoded sequence is negligible.

Fig. 6. Realization of the RSC code shown in Fig. 3 by a recursion and an NSC code.

E. Comparison of standard channel coding for GSM full rate and a new UEP scheme with RSC codes

The advantages shown in paragraphs III-A to III-D were the reason to modify the GSM full rate standard with a new UEP channel coding scheme and RSC codes. For the implemented UEP scheme the channel coding rates used are shown in the following table, where 3 bits are additionally reserved for mode indication or a parity check.
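The transformation of paragraph III-D can be checked numerically without a Viterbi decoder. The sketch below assumes the GSM 05.03 full rate polynomials G0 = 1 + D^3 + D^4 (taken here as the feedback polynomial, which is an assumption) and G1 = 1 + D + D^3 + D^4; all function names are hypothetical. It precodes the information bits with the recursion, encodes the result with the plain NSC encoder, and undoes the recursion with a feedforward shift register.

```python
import random

G0 = [1, 0, 0, 1, 1]  # 1 + D^3 + D^4 (assumed feedback polynomial)
G1 = [1, 1, 0, 1, 1]  # 1 + D + D^3 + D^4


def fir(seq, taps):
    """Feedforward shift register over GF(2): out_k = XOR of taps[i] * seq[k-i]."""
    out = []
    for k in range(len(seq)):
        bit = 0
        for i, tap in enumerate(taps):
            if tap and k - i >= 0:
                bit ^= seq[k - i]
        out.append(bit)
    return out


def precode(bits, fb):
    """Recursion a_k = u_k XOR (sum over i >= 1 of fb[i] * a_{k-i})."""
    a = []
    for k, u in enumerate(bits):
        bit = u
        for i in range(1, len(fb)):
            if fb[i] and k - i >= 0:
                bit ^= a[k - i]
        a.append(bit)
    return a


def nsc_encode(seq):
    """Standard non-systematic rate 1/2 encoder with generators G0 and G1."""
    return fir(seq, G0), fir(seq, G1)


rng = random.Random(0)
u = [rng.randint(0, 1) for _ in range(40)]   # information bits

a = precode(u, G0)            # transformed sequence fed to the NSC encoder
c0, c1 = nsc_encode(a)        # c0 turns out to be the systematic bit stream
u_back = fir(a, G0)           # inverse transformation: taps only, no feedback
```

In a receiver, an NSC Viterbi decoder would deliver an estimate of the transformed sequence `a`, and the final `fir` step would map it back to the information bits; only that last step is added to the standard decoder.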
TABLE I. UEP SCHEME FOR THE SPEECH CODEC MODE WITH 13.0 KBITS/S.

[Table I body: only the row labels "bit" and "rate" and the rate values 2/5, 2/3, 1/2, 1/2, 2/3, 2/5, 3/4 survive; the bit ranges and the column order are not recoverable.]

To compare the performance, the new 13.0 kbits/s channel codec with UEP is used in both recursive systematic form and non-systematic form. The coded bits are punctured such that the channel bit rate of 22.8 kbits/s is generated [8]. Figure 7 shows the BER of the UEP/NSC and UEP/RSC schemes and of the GSM full rate standard for a channel SNR of 2 dB and 6 dB, where the channel model used is a block fading channel (see par. III-C). For better comparison with the new UEP scheme, the uncoded class 2 bits of the GSM standard are placed in the middle (bits ) of the frame of information bits. Note that for bad channels the NSC code yields a higher BER than the uncoded bits.

[Figure 7: BER (logarithmic scale) versus bit number (axis marks at 19, 42, 101, 161, 220, 244). Curves: GSM TCH FS at 2 dB, new UEP with NSC at 2 dB, new UEP with RSC at 2 dB, GSM TCH FS at 6 dB, new UEP with RSC at 6 dB.]

Fig. 7. Performance of RSC and UEP for a block fading channel.

As shown in paragraph III-A, only a small number of speech encoded bits have a strong impact on the quality of the reconstructed speech in case of errors (see Fig. 2). In our UEP scheme, these bits are placed in the first part (e.g., bits ) and in the last part (bits ) of the frame, and hence are better protected than in the GSM standard, although some less important bits (e.g., bits ) are less protected. As a result, better subjective quality is obtained, as confirmed by our informal subjective tests. The improvement is clearly audible and objectively measurable in terms of the speech SNR. For the UEP scheme presented, the use of one parity check (with 3 bits as in the GSM full rate) for error concealment might not be a good solution. Instead, one may use several parity checks, each protecting only a small number of bits. Possible solutions without parity checks were proposed in [10, 11]. A new proposal for the error concealment in the AMR codec is described in [1].

IV. IN-BAND TRANSMISSION OF MODE BITS

As mentioned in section II, the rate indicating mode bits have to be protected by a powerful code to avoid wrong rate adaptation, which may lead to annoying artifacts due to wrong speech decoding. With code termination the first and last decoded bits have a lower BER than the bits in the middle because of the known initial and final state of the decoder. Therefore, placing the very important mode bits at the very beginning of the frame of information bits and performing a joint encoding/decoding using UEP together with the speech bits ensures a very secure in-band transmission of the mode bits. In contrast, employing a short block code to protect the mode bits separately is less powerful.

Because of the joint coding/decoding of mode bits together with speech bits, it is necessary to use the same coding scheme for the first part of the frame for all source rates. With such a hierarchical coding scheme the complete trellis of the first part can be built without knowing the actual block size, and the mode bits can then be decoded. The length of the first part, which is common to all rates, is typically chosen in relation to the memory of the convolutional coder. This takes the constraint length of the code into consideration and hence achieves a lower BER than building the trellis only for the first few mode bits. Having determined the mode bits, the block size and the rates used are known, so that the rest of the received bit stream can be properly decoded.

[Figure 8: coding structure for four source bit rates; the segments of each frame are annotated with bit counts and code rates.]

Fig. 8. Hierarchical channel coding structure with UEP for different source bit rates.

Fig. 8 shows the hierarchical coding scheme for 4 different speech rates with the corresponding channel coding rates, where 3 bits are reserved for the rate indication. In this example the length of the common part is 23 bits. Due to the joint channel coding, a plausibility check of the decoded mode bits can also be carried out by comparing the values of the metrics in conjunction with the estimated channel quality. Thus, wrong rate adaptation can normally be avoided.

The detection of mode bits in the new AMR standard and in a hierarchical scheme

The approach for in-band signaling described above is very powerful. To show its advantage we compare the AMR standard at 5.9 kbits/s [8, section 3.9.4.4, TCH/AFS5.9] with its modified hierarchical scheme. This is realized as follows. In the AMR standard, the two mode bits are separately encoded by a rate 1/4 block code. Here, they are integrated into the block of information bits. This results in 126 bits. These are encoded with a rate 1/4 convolutional code [8], resulting in 528 (520 in [8]) coded bits. Instead of puncturing the bits C(0), C(1), C(3) (see [8]), the bits C(501), C(505), C(508) are punctured. This leads to a modest increase of the BER at the end of the block of information bits but a decrease at the beginning. Finally, all puncturing positions are shifted right by 8, leaving the first 8 bits (the coded mode bits) untouched. The receiver decodes the first 23 information bits and decides upon the first two bits (the mode bits). Depending on this decision, decoding continues.
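The coded block lengths quoted above can be reproduced with a short calculation. The sketch assumes 6 tail bits, a value that is not stated in the text and is inferred here from the constraint length 7 of the rate 1/4 code; the function name is hypothetical.

```python
def coded_length(info_bits: int, tail_bits: int, outputs_per_bit: int) -> int:
    """Coded bits of a terminated rate 1/n convolutional code before puncturing."""
    return (info_bits + tail_bits) * outputs_per_bit


MODE_BITS = 2          # mode bits integrated into the information block
TAIL_BITS = 6          # assumption: memory 6 (constraint length 7)
OUTPUTS_PER_BIT = 4    # rate 1/4

hierarchical = coded_length(126, TAIL_BITS, OUTPUTS_PER_BIT)
standard = coded_length(126 - MODE_BITS, TAIL_BITS, OUTPUTS_PER_BIT)
extra = hierarchical - standard   # coded bits contributed by the mode bits
print(hierarchical, standard, extra)
```

Under this assumption the difference of 8 coded bits equals the two mode bits times four outputs per bit, which is consistent with shifting all puncturing positions by 8.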

As shown in Fig. 9, there is a gain of 3 to more than 4 dB if the in-band signaling information is integrated into the whole block of information bits. The free distance between two code words of the short block code standardized for TCH/AFS is 5. For a rate 1/3 convolutional code with constraint length 7 (5) it is 14 (12), and for rate 1/4 it is 20 (15) [9]. The complexity increase is modest: e.g., if a soft output Viterbi algorithm [5] is used, merely the decision feedback has to be done twice for the first 23 bits. This feedback is much less complex than the forward trellis construction, which runs through in one pass. In this example, the hierarchical system was integrated into the standard with small modifications. Remember that this approach needs the same code and the same puncturing for the first part of a block (mode bits + approx. 20 speech bits) for all rates. Furthermore, a priori information can be exploited in the hierarchical scheme to improve the results.

[Figure 9: mode error rate (logarithmic scale, 0.03 down to 1·10^-5) versus ES/N0 in dB (-2 to 7 dB) for the TCH/AFS standard and the hierarchical approach; a gain of 3.9 dB is marked.]

Fig. 9. Mode errors in the TCH/AFS standard and in the hierarchical approach (transmission over a block fading channel).

V. CONCLUSIONS

We described channel coding and adaptation algorithms for GSM AMR speech transmission. The methods presented here can also be used for other mobile transmission systems such as the third generation mobile telecommunication standard. The channel coding scheme was designed to be as hierarchical as possible. The mode bits are coded together with the source (speech) bits to ensure better protection. By using RSC codes instead of conventional NSC codes, better performance can be obtained, especially in mobile communication environments. The channel coding algorithms presented in this study have been combined with the VR-CELP speech coding and the error concealment algorithms described in [1] and with the channel estimation and rate adaptation techniques shown in [12] to build an AMR codec proposal for GSM speech transmission [13]. Subjective tests were carried out to evaluate this AMR codec proposal under the conditions defined at the ETSI SMG11 AMR standardization meetings. The test results showed that it met most of the qualification requirements and constraints [14].

ACKNOWLEDGMENTS

The authors are grateful to Prof. Peter Vary, Stefan Heinen and Marc Adrat, with RWTH Aachen, and Dr. Stefan Oestreich and Dr. Juergen Paulus, with Siemens AG, for many fruitful discussions. Siemens AG, Munich, Germany, has supported this work.

REFERENCES

[1] S. Heinen, M. Adrat, O. Steil, P. Vary, and W. Xu, "A 6.1 to 13.3 kbit/s Variable Rate CELP Codec (VR-CELP) for AMR Speech Coding," in Proc. of ICASSP'99, Phoenix, Arizona, Mar. 1999.
[2] N.S. Jayant and P. Noll, Digital Coding of Waveforms, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1984.
[3] J. Hagenauer, "Rate-Compatible Punctured Convolutional Codes (RCPC Codes) and their Applications," IEEE Transactions on Communications, vol. 36, no. 4, pp. 389-400, Apr. 1988.
[4] J. Hagenauer and T. Stockhammer, "Channel coding and transmission aspects for wireless multimedia," to appear in Proceedings of the IEEE, Special Issue on Video Transmission for Mobile Multimedia Applications, 2000.
[5] J. Hagenauer, "Source-Controlled Channel Decoding," IEEE Transactions on Communications, vol. 43, no. 9, pp. 2449-2457, Sept. 1995.
[6] A. Ruscitto and T. Hindelang, "Channel Decoding Using Residual Intra-Frame Correlation in a GSM System," IEE Electronics Letters, vol. 33, no. 21, pp. 1754-1755, Oct. 1997.
[7] T. Hindelang, T. Fingscheidt, N. Seshadri, and R.V. Cox, "Combined Source/Channel (De)Coding: Can A Priori Information Be Used Twice?," in Proc. of ICC 2000, New Orleans, Louisiana, June 2000.
[8] Recommendation GSM 05.03, Channel Coding, ETSI TC-SMG, Version 8.3.0, Release 1999.
[9] R. Johannesson and K.S. Zigangirov, Fundamentals of Convolutional Codes, IEEE Press, Inc., Piscataway, New Jersey, 1999.
[10] T. Hindelang, C. Erben, and W. Xu, "Quality Enhancement of Coded and Corrupted Speeches in GSM Using Residual Redundancy," in Proc. of ICASSP'97, Munich, Germany, Apr. 1997, vol. 1, pp. 259-262.
[11] T. Fingscheidt and O. Scheufen, "Robust GSM Speech Decoding Using the Channel Decoder's Soft Output," in Proc. of EUROSPEECH'97, Rhodos, Greece, Sept. 1997, pp. 1315-1318.
[12] T. Hindelang, M. Kaindl, J. Hagenauer, M. Schmautz, and W. Xu, "Improved Channel Coding and Estimation for Adaptive Multi Rate (AMR) Speech Transmission," in Proc. of VTC'00 Spring, Tokyo, Japan, May 2000.
[13] ETSI SMG11: Proposal of an Adaptive Multi Rate Codec, AMR #10 Tdoc AMR /98, Siemens, Stockholm, Sweden, June 1998.
[14] W. Xu, S. Heinen, T. Hindelang, et al., "An Adaptive Multirate Speech Codec Proposed for the GSM," in Proc. of 3. ITG Conference Source and Channel Coding, Munich, Germany, Jan. 2000, pp. 51-56, VDE Verlag.
