

4, APRIL 2006

Interleave-Division Multiple-Access
Li Ping, Member, IEEE, Lihai Liu, Student, IEEE, Keying Wu, Student, IEEE, and W. K. Leung

Abstract: This paper provides a comprehensive study of interleave-division multiple-access (IDMA) systems. The IDMA receiver principles for different modulation and channel conditions are outlined. A semi-analytical technique is developed based on the density evolution technique to estimate the bit-error-rate (BER) of the system. It provides a fast and relatively accurate method to predict the performance of the IDMA scheme. With simple convolutional/repetition codes, overall throughputs of 3 bits/chip with one receive antenna and 6 bits/chip with two receive antennas are observed for IDMA systems involving as many as about 100 users.

Index Terms: CDMA, density evolution, iterative decoding, multi-user detection.

Manuscript received May 7, 2004; revised March 23, 2005; accepted April 26, 2005. The associate editor coordinating the review of this letter and approving it for publication was F. Daneshgaran. This work was fully supported by a grant from the Research Grant Council of the Hong Kong Special Administrative Region, China [Project No. CityU 1164/03E]. The material in this paper was presented in part at the IEEE Vehicular Technology Conference 2003 Spring and IEEE Wireless Communications and Networking Conference 2003.
The authors are with the Department of Electronic Engineering, City University of Hong Kong, Hong Kong (e-mail: eeliping@cityu.edu.hk).
Digital Object Identifier 10.1109/TWC.2006.04028.
1536-1276/06$20.00 © 2006 IEEE

I. INTRODUCTION

The performance of code-division multiple-access (CDMA) systems is mainly limited by multiple access interference (MAI) and intersymbol interference (ISI). In the wake of the success of turbo codes [1], turbo-type iterative multi-user detection (MUD) has been extensively studied [2]-[10] to mitigate MAI and ISI, and significant progress has been made.

A conventional random waveform CDMA (RW-CDMA) system (such as IS-95) involves separate coding and spreading operations. Theoretical analysis [11][12] shows that the optimal multiple access channel (MAC) capacity is achievable when the entire bandwidth expansion is devoted to coding. This suggests combining coding and spreading using low-rate codes to maximize the coding gain [11][13]. In this case, interleavers can be employed to distinguish signals from different users. The principle has been studied previously and its potential advantages have been demonstrated [2][14]-[19]. Ref. [2] showed the possibility of employing interleaving for user separation in coded systems. Ref. [14] proposed narrow-band coded-modulation schemes in which trellis code structures are used for user separation and interleaving is considered as an option. For wideband systems, the performance improvement by assigning different interleavers to different users in conventional CDMA has been demonstrated in [15] and [16]. Ref. [17] studied a chip interleaved CDMA scheme and a maximal-ratio-combining (MRC) technique for MACs with ISI. It clearly demonstrated the advantages of introducing chip-level interleavers. An interleaver-based multiple access scheme has also been studied in [18][19] for high spectral efficiency, improved performance and low receiver complexity.

This scheme relies on interleaving as the only means to distinguish the signals from different users, and hence it has been called interleave-division multiple-access (IDMA). IDMA inherits many advantages from CDMA, in particular, diversity against fading and mitigation of the worst-case other-cell user interference problem. Furthermore, it allows a very simple chip-by-chip iterative MUD strategy [18][19]. The normalized MUD cost (per user) is independent of the number of users.

In this paper, we will provide a comprehensive study of the IDMA scheme, incorporating the principles developed in [18][19]. The contributions of this paper are as follows. First, we will derive several low-cost detection algorithms for different channel conditions, namely, real-single-path, real-multi-path and complex-multi-path channels. These algorithms are very simple and efficient, as confirmed by simulation results. Second, we will develop a semi-analytical technique based on the SNR density evolution technique [9], [20]-[22] to estimate the bit-error-rate (BER) performance of these algorithms. This offers a fast and accurate method to predict the performance of the IDMA scheme, and is useful for system analysis and design. Finally, we will present a comprehensive assessment of the IDMA principle using numerical examples. Simulation results are provided to demonstrate the advantages of the IDMA scheme in terms of both bandwidth and power efficiencies. For example, with simple convolutional/repetition codes, overall throughputs of 3 bits/chip with one receive antenna and 6 bits/chip with two receive antennas are observed for IDMA systems with as many as about 100 users. More sophisticated low-rate codes can also be used for further performance enhancement, as illustrated by comparisons between low-rate and high-rate coded IDMA systems.

Although our focus in this paper is on multi-user systems, the principles developed in this paper, such as the iterative detection algorithms and SNR evolution techniques, are directly applicable to a variety of different applications, such as space-time coding for antenna diversity [23] and superposition coding for bandwidth efficient coded modulation [24] and adaptive modulation [25].

II. IDMA TRANSMITTER AND RECEIVER PRINCIPLES

A. IDMA Transmitter and Receiver Structures

The upper part of Fig. 1 shows the transmitter structure of the multiple access scheme under consideration with $K$ simultaneous users. The input data sequence $d_k$ of user-$k$ is encoded based on a low-rate code $C$, generating a coded sequence $c_k \equiv [c_k(1), \ldots, c_k(j), \ldots, c_k(J)]^T$, where $J$ is

the frame length. The elements in $c_k$ are referred to as coded bits. Then $c_k$ is permutated by an interleaver $\pi_k$, producing $x_k \equiv [x_k(1), \ldots, x_k(j), \ldots, x_k(J)]^T$. Following the CDMA convention, we call the elements in $x_k$ "chips". Users are solely distinguished by their interleavers, hence the name interleave-division multiple-access (IDMA).

[Fig. 1 (block diagram): the transmitter for user-$k$ maps $d_k$ through the encoder $C$ to $c_k$ and through the interleaver $\pi_k$ to $x_k$, $k = 1, \ldots, K$, feeding the multiple access channel. The receiver (turbo processor) consists of the elementary signal estimator (ESE) and decoders 1 to $K$, which exchange $\{e_{ESE}(x_k(j))\}$ and $\{e_{DEC}(x_k(j))\}$ through $\pi_k^{-1}$ and $\pi_k$.]

Fig. 1. Transmitter and (iterative) receiver structures of an IDMA scheme with K simultaneous users.

The key principle of IDMA is that the interleavers $\{\pi_k\}$ should be different for different users. We assume that the interleavers are generated independently and randomly. These interleavers disperse the coded sequences so that the adjacent chips are approximately uncorrelated, which facilitates the simple chip-by-chip detection scheme discussed below.

We adopt an iterative sub-optimal receiver structure, as illustrated in Fig. 1, which consists of an elementary signal estimator (ESE) and $K$ single-user a posteriori probability (APP) decoders (DECs). The multiple access and coding constraints are considered separately in the ESE and DECs. The outputs of the ESE and DECs are extrinsic log-likelihood ratios (LLRs) about $\{x_k(j)\}$ defined below [18][19]:

$$ e(x_k(j)) \equiv \log\left( \frac{p(y \mid x_k(j) = +1)}{p(y \mid x_k(j) = -1)} \right), \quad \forall k, j. \tag{1} $$

These LLRs are further distinguished by subscripts, i.e., $e_{ESE}(x_k(j))$ and $e_{DEC}(x_k(j))$, depending on whether they are generated by the ESE or DECs. For the ESE, $y$ in (1) denotes the received channel output. For the DECs, $y$ in (1) is formed by the deinterleaved version of the outputs of the ESE. (See Fig. 1 and the discussions in Section II.F below.) A global turbo-type iterative process is applied to process the LLRs generated by the ESE and DECs [1][18], as detailed below.

B. The Basic ESE Function

We first assume that the channel has no memory. After chip-matched filtering, the received signal from $K$ users can be written as

$$ r(j) = \sum_{k=1}^{K} h_k x_k(j) + n(j), \quad j = 1, 2, \ldots, J \tag{2} $$

where $h_k$ is the channel coefficient for user-$k$ and $\{n(j)\}$ are samples of an AWGN process with variance $\sigma^2 = N_0/2$. We assume that the channel coefficients $\{h_k\}$ are known a priori at the receiver. (For the channel estimation in IDMA systems, please refer to [18].) Due to the use of random interleavers $\{\pi_k\}$, the ESE operation can be carried out in a chip-by-chip manner, with only one sample $r(j)$ used at a time. Rewrite (2) as

$$ r(j) = h_k x_k(j) + \zeta_k(j) \tag{3} $$

where

$$ \zeta_k(j) \equiv r(j) - h_k x_k(j) = \sum_{k' \neq k} h_{k'} x_{k'}(j) + n(j) \tag{4} $$

is the distortion (including interference-plus-noise) in $r(j)$ with respect to user-$k$. From the central limit theorem, $\zeta_k(j)$ can be approximated as a Gaussian variable, and $r(j)$ can be characterized by a conditional Gaussian probability density function

$$ p(r(j) \mid x_k(j) = \pm 1) = \frac{1}{\sqrt{2\pi \mathrm{Var}(\zeta_k(j))}} \exp\left( -\frac{\left( r(j) - (\pm h_k + \mathrm{E}(\zeta_k(j))) \right)^2}{2 \mathrm{Var}(\zeta_k(j))} \right) \tag{5} $$

where $\mathrm{E}(\cdot)$ and $\mathrm{Var}(\cdot)$ are the mean and variance functions, respectively. Note that the central limit theorem applies to the summation of a large number of random variables. This implies the assumption of a large number of simultaneous users, which is reasonable in spread-spectrum cellular systems (both IDMA and CDMA).

The ESE detection algorithm based on (2)-(5) [18] is listed below, assuming that the a priori statistics $\{\mathrm{E}(x_k(j))\}$ and $\{\mathrm{Var}(x_k(j))\}$ are available (see Section II.F).

Algorithm 1. Chip-by-Chip Detection in a Single-Path Channel

Step (i): Estimation of Interference Mean and Variance

$$ \mathrm{E}(r(j)) = \sum_{k} h_k \mathrm{E}(x_k(j)), \tag{6} $$
$$ \mathrm{Var}(r(j)) = \sum_{k} |h_k|^2 \mathrm{Var}(x_k(j)) + \sigma^2, \tag{7} $$
$$ \mathrm{E}(\zeta_k(j)) = \mathrm{E}(r(j)) - h_k \mathrm{E}(x_k(j)), \tag{8} $$
$$ \mathrm{Var}(\zeta_k(j)) = \mathrm{Var}(r(j)) - |h_k|^2 \mathrm{Var}(x_k(j)). \tag{9} $$

Step (ii): LLR Generation

$$ e_{ESE}(x_k(j)) = 2 h_k \cdot \frac{r(j) - \mathrm{E}(\zeta_k(j))}{\mathrm{Var}(\zeta_k(j))}. \tag{10} $$

• Under the assumption that $\{x_k(j)\}$ are independent, (6)-(9) are a straightforward consequence of (2) and (4).
• Step (ii) is obtained by evaluating (1) based on (5).
• Algorithm 1 is an extremely simplified form of that derived in [4] when the spreading sequences are all length-1.
• The operations in (6) and (7), i.e., generating $\mathrm{E}(r(j))$ and $\mathrm{Var}(r(j))$, are shared by all users, costing only
three multiplications and two additions per coded bit per user. Overall, the ESE operations in (6)-(10) cost only seven multiplications and five additions per coded bit per user, which is very modest. Interestingly, the cost per information bit per user is independent of the number of users $K$. This is considerably lower than that of other alternatives. For example, the well-known MMSE algorithm in [4] has a complexity of $O(K^2)$.

C. The ESE Function for Multi-Path Channels

We now consider the ESE function in a quasi-static multi-path fading channel with memory length $L - 1$. Let $\{h_{k,0}, \ldots, h_{k,L-1}\}$ be the fading coefficients related to user-$k$. After chip-matched filtering, the received signal can be represented by

$$ r(j) = \sum_{k=1}^{K} \sum_{l=0}^{L-1} h_{k,l} x_k(j - l) + n(j), \quad j = 1, \ldots, J + L - 1. \tag{11} $$

We write

$$ r(j + l) = h_{k,l} x_k(j) + \zeta_{k,l}(j) \tag{12} $$

where

$$ \zeta_{k,l}(j) \equiv r(j + l) - h_{k,l} x_k(j). \tag{13} $$

The similarity between (12) and (3) is clearly seen. Assume again BPSK signaling and real channel coefficients. Algorithm 2 below is a straightforward extension of Algorithm 1.

Algorithm 2. Chip-by-Chip Detection in a Multi-Path Channel

Step (i): Estimation of Interference Mean and Variance

$$ \mathrm{E}(r(j)) = \sum_{k,l} h_{k,l} \mathrm{E}(x_k(j - l)), \tag{14} $$
$$ \mathrm{Var}(r(j)) = \sum_{k,l} |h_{k,l}|^2 \mathrm{Var}(x_k(j - l)) + \sigma^2, \tag{15} $$
$$ \mathrm{E}(\zeta_{k,l}(j)) = \mathrm{E}(r(j + l)) - h_{k,l} \mathrm{E}(x_k(j)), \tag{16} $$
$$ \mathrm{Var}(\zeta_{k,l}(j)) = \mathrm{Var}(r(j + l)) - |h_{k,l}|^2 \mathrm{Var}(x_k(j)). \tag{17} $$

Step (ii): LLR Generation and Combining

$$ e_{ESE}(x_k(j))_l = 2 h_{k,l} \cdot \frac{r(j + l) - \mathrm{E}(\zeta_{k,l}(j))}{\mathrm{Var}(\zeta_{k,l}(j))}, \tag{18} $$
$$ e_{ESE}(x_k(j)) = \sum_{l=0}^{L-1} e_{ESE}(x_k(j))_l. \tag{19} $$

Comments:
• It is easy to see the connection between (14)-(17) and (6)-(9).
• From (11), each $x_k(j)$ is observed on $L$ successive samples $\{r(j), r(j + 1), \ldots, r(j + L - 1)\}$. Assume that the distortion terms with respect to $x_k(j)$ in these $L$ samples, i.e., $\{\zeta_{k,0}(j), \zeta_{k,1}(j), \ldots, \zeta_{k,L-1}(j)\}$, are uncorrelated. Then the overall a posteriori probabilities for $x_k(j) = \pm 1$ are the products of the individual a posteriori probabilities generated from $\{r(j), r(j + 1), \ldots, r(j + L - 1)\}$. Hence the LLRs for $x_k(j)$ can be directly summed as in (19). This LLR combining (LLRC) technique is similar to the rake operation used in CDMA. The overall complexity is approximately $L$ times that of Algorithm 1.
• From this algorithm, we can see that frame synchronization is not necessary for IDMA, since frame asynchronization has the same effect as multipath delay.

The uncorrelatedness assumption mentioned above is only approximate, but it greatly simplifies the matter. The complexity (per coded bit per user) for Algorithm 2 is $O(L)$. There are other alternative treatments for channels with memory. One is the maximum ratio combining (MRC) technique [8][17], in which $r \equiv \{r(j)\}$ is passed through $K$ MRC filters, each matched to the $L$ tap-coefficients for a particular user. This method involves the calculation of the interference variances after the MRC. The related complexity is quite high ($O(LK)$) if these variances are calculated exactly [17]. The technique used in [8] has a lower cost ($O(L)$) due to the use of an approximation. We have observed that the method in [8] has similar performance to the LLRC method. A common problem of all the techniques discussed above is that they have poor performance when the rate of $C$ is high [19]. The joint Gaussian (JG) technique [19] provides an efficient solution to this problem, which takes into consideration the correlation among $\{\zeta_{k,0}(j), \zeta_{k,1}(j), \ldots, \zeta_{k,L-1}(j)\}$. This technique demonstrates much better performance when the number of users is very large or when the rate of $C$ is high. The related cost is $O(L^2)$.

D. The ESE Function for More Complex Channels

We now extend our discussion to more complex situations. We will use either superscripts "Re" and "Im" or function notations $\mathrm{Re}(\cdot)$ and $\mathrm{Im}(\cdot)$ to indicate real and imaginary parts, respectively. Consider quadrature-phase-shift-keying (QPSK) signaling,

$$ x_k(j) = x_k^{Re}(j) + i\, x_k^{Im}(j) \tag{20} $$

where $i = \sqrt{-1}$, and $x_k^{Re}(j)$ and $x_k^{Im}(j)$ are two coded bits from $c_k$. For convenience, we still call the elements in $x_k$ "chips". Note that in this case, each chip contains two coded bits. We adopt channel model (11) and expand it using complex channel coefficients $\{h_{k,l} = h_{k,l}^{Re} + i\, h_{k,l}^{Im}\}$ as

$$ r(j) = \sum_{k,l} \left( h_{k,l}^{Re} x_k^{Re}(j - l) - h_{k,l}^{Im} x_k^{Im}(j - l) \right) + i \sum_{k,l} \left( h_{k,l}^{Re} x_k^{Im}(j - l) + h_{k,l}^{Im} x_k^{Re}(j - l) \right) + n(j) \tag{21} $$

where $\{n(j)\}$ are samples of a complex AWGN process with variance $\sigma^2$ per dimension. Denote by $\bar{h}_{k,l}$ the conjugate of $h_{k,l}$. Recall (12): $r(j + l) = h_{k,l} x_k(j) + \zeta_{k,l}(j)$. The phase shift due to $h_{k,l}$ is cancelled out in $\bar{h}_{k,l} r(j + l)$, which means that $\mathrm{Im}(\bar{h}_{k,l} r(j + l))$ is not a function of $x_k^{Re}(j)$. Therefore the detection of $x_k^{Re}(j)$ only requires
$$ \mathrm{Re}\left( \bar{h}_{k,l} r(j + l) \right) = |h_{k,l}|^2 x_k^{Re}(j) + \mathrm{Re}\left( \bar{h}_{k,l} \zeta_{k,l}(j) \right). \tag{22} $$

Algorithm 3 below outlines the procedure to estimate $x_k^{Re}(j)$ based on (22).

Algorithm 3. Chip-by-Chip Detection in a Complex Multi-Path Channel

Step (i): Estimation of Interference Mean and Variance

$$ \mathrm{E}\left( r^{Re}(j) \right) = \sum_{k,l} \left( h_{k,l}^{Re} \mathrm{E}(x_k^{Re}(j - l)) - h_{k,l}^{Im} \mathrm{E}(x_k^{Im}(j - l)) \right), \tag{23} $$
$$ \mathrm{E}\left( r^{Im}(j) \right) = \sum_{k,l} \left( h_{k,l}^{Re} \mathrm{E}(x_k^{Im}(j - l)) + h_{k,l}^{Im} \mathrm{E}(x_k^{Re}(j - l)) \right), \tag{24} $$
$$ \mathrm{Var}\left( r^{Re}(j) \right) = \sum_{k,l} \left( (h_{k,l}^{Re})^2 \mathrm{Var}(x_k^{Re}(j - l)) + (h_{k,l}^{Im})^2 \mathrm{Var}(x_k^{Im}(j - l)) \right) + \sigma^2, \tag{25} $$
$$ \mathrm{Var}\left( r^{Im}(j) \right) = \sum_{k,l} \left( (h_{k,l}^{Im})^2 \mathrm{Var}(x_k^{Re}(j - l)) + (h_{k,l}^{Re})^2 \mathrm{Var}(x_k^{Im}(j - l)) \right) + \sigma^2, \tag{26} $$
$$ \Psi(j) = \sum_{k,l} h_{k,l}^{Re} h_{k,l}^{Im} \left( \mathrm{Var}(x_k^{Re}(j - l)) - \mathrm{Var}(x_k^{Im}(j - l)) \right), \tag{27} $$
$$ \mathrm{E}\left( \mathrm{Re}(\bar{h}_{k,l} \zeta_{k,l}(j)) \right) = h_{k,l}^{Re} \mathrm{E}(r^{Re}(j + l)) + h_{k,l}^{Im} \mathrm{E}(r^{Im}(j + l)) - |h_{k,l}|^2 \mathrm{E}(x_k^{Re}(j)), \tag{28} $$
$$ \mathrm{Var}\left( \mathrm{Re}(\bar{h}_{k,l} \zeta_{k,l}(j)) \right) = (h_{k,l}^{Re})^2 \mathrm{Var}(r^{Re}(j + l)) + (h_{k,l}^{Im})^2 \mathrm{Var}(r^{Im}(j + l)) + 2 h_{k,l}^{Re} h_{k,l}^{Im} \Psi(j + l) - |h_{k,l}|^4 \mathrm{Var}(x_k^{Re}(j)). \tag{29} $$

Step (ii): LLR Generation and Combining

$$ e_{ESE}(x_k^{Re}(j))_l = 2 |h_{k,l}|^2 \cdot \frac{\mathrm{Re}(\bar{h}_{k,l} r(j + l)) - \mathrm{E}(\mathrm{Re}(\bar{h}_{k,l} \zeta_{k,l}(j)))}{\mathrm{Var}(\mathrm{Re}(\bar{h}_{k,l} \zeta_{k,l}(j)))}, \tag{30} $$
$$ e_{ESE}(x_k^{Re}(j)) = \sum_{l=0}^{L-1} e_{ESE}(x_k^{Re}(j))_l. \tag{31} $$

Comments:
• We obtain (23)-(26) using (21) and obtain (28) as follows (based on (12) and (22)),

$$ \mathrm{Re}\left( \bar{h}_{k,l} \zeta_{k,l}(j) \right) = h_{k,l}^{Re} r^{Re}(j + l) + h_{k,l}^{Im} r^{Im}(j + l) - |h_{k,l}|^2 x_k^{Re}(j). \tag{32} $$

• It can be verified that $\Psi(j)$ in (27) is the covariance of $r^{Re}(j)$ and $r^{Im}(j)$. It is introduced for cost saving since it is shared by all users, costing $L$ multiplications and $L/2$ additions per coded bit per user. (Recall that there are two coded bits in a chip, one in each dimension.)
• For the derivation of (29), see the Appendix.
• A similar procedure can be used to estimate $x_k^{Im}(j)$ based on $\{\mathrm{Im}(\bar{h}_{k,l} r(j + l)), l = 0, \ldots, L - 1\}$.
• If the cost related to $\Psi(j)$ is ignored, the complexity of Algorithm 3 per coded bit per user is approximately two times that of Algorithm 2. It slightly increases by several additions and multiplications when $\Psi(j)$ is considered, but is still $O(L)$.

E. The ESE Function for Channels with Multiple Receive Antennas

The above principles can be easily generalized to channels with multiple receive antennas. The signals from each receive antenna can be treated as those from a set of independent paths. The LLRC technique discussed in Section II.C can be directly applied.

F. The DEC Function

The DECs in Fig. 1 carry out APP decoding using the output of the ESE as the input. With BPSK signaling, their output is the extrinsic LLRs $\{e_{DEC}(x_k(j))\}$ of $\{x_k(j)\}$ defined in (1), which are used to generate the following statistics

$$ \mathrm{E}(x_k(j)) = \tanh\left( e_{DEC}(x_k(j)) / 2 \right), \tag{33} $$
$$ \mathrm{Var}(x_k(j)) = 1 - \left( \mathrm{E}(x_k(j)) \right)^2. \tag{34} $$

(With QPSK signaling, the DEC outputs are the extrinsic LLRs for $\{x_k^{Re}(j)\}$ and $\{x_k^{Im}(j)\}$.) As discussed above, $\{\mathrm{E}(x_k(j))\}$ and $\{\mathrm{Var}(x_k(j))\}$ will be used in the ESE to update the interference mean and variance in the next iteration. Initially, we set $\mathrm{E}(x_k(j)) = 0$ and $\mathrm{Var}(x_k(j)) = 1$ for $\forall k, j$, implying no information from DECs.

APP decoding is a standard operation [1] and so we will not discuss it in detail. We will only consider a special case of $C$ in Fig. 1 that is formed by serially concatenating a sub-code $C_{FEC}$ (the same for every user) and a length-$S$ repetition code $C_{REP}$. This scheme is not optimized from a performance point of view, as the repetition code is actually a very poor code. However, this structure does have the advantage of flexibility regarding rate.

The input data sequence of each user is first encoded by $C_{FEC}$, generating $\{b_k(i), i = 1, 2, \ldots\}$. Then each $b_k(i)$ is repeated $S$ times by $C_{REP}$, producing $\{c_k(j)\}$. For simplicity, we focus on those replicas related to $b_k(1)$, i.e., $\{c_k(j), j = 1, 2, \ldots, S\}$. The treatment for replicas of $b_k(i)$ with $i > 1$ is similar. The DEC for $C$ carries out the following operations. For simplicity, we assume BPSK modulation.

(i) Obtain the estimate of each $b_k(i)$ based on $\{e_{ESE}(x_k(j))\}$ from the ESE. We assume that $\{e_{ESE}(x_k(j)), \forall j\}$ are uncorrelated (which is approximately true due to interleaving). From Fig. 1, we have $c_k(j) = x_k(\pi_k(j))$. Then the soft estimate of $b_k(1)$ can be computed from $\{e_{ESE}(x_k(j))\}$ as [17]

$$ L(b_k(1)) \equiv \log\left( \prod_{j=1}^{S} \frac{p(r(\pi_k(j)) \mid x_k(\pi_k(j)) = +1)}{p(r(\pi_k(j)) \mid x_k(\pi_k(j)) = -1)} \right) = \sum_{j=1}^{S} e_{ESE}(x_k(\pi_k(j))). \tag{35} $$
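As a rough illustration of how Algorithm 1 and Eqs. (33)-(35) fit together in the real single-path BPSK case, the following Python sketch may help. It is a minimal sketch under stated assumptions and not the authors' implementation: the function names (`ese_single_path`, `dec_statistics`, `combine_repetition`), the list-based data layout and the zero-based indexing are all choices made here for illustration only.

```python
import math
from typing import List, Sequence, Tuple


def ese_single_path(r: Sequence[float], h: Sequence[float],
                    mean_x: Sequence[Sequence[float]],
                    var_x: Sequence[Sequence[float]],
                    sigma2: float) -> List[List[float]]:
    """Chip-by-chip ESE following Eqs. (6)-(10); returns e_ESE[k][j].

    Assumes real channel coefficients h[k] and BPSK chips (hypothetical layout:
    mean_x[k][j] and var_x[k][j] hold E(x_k(j)) and Var(x_k(j))).
    """
    num_users, num_chips = len(h), len(r)
    e_ese = [[0.0] * num_chips for _ in range(num_users)]
    for j in range(num_chips):
        # Eqs. (6) and (7): statistics of r(j), shared by all users.
        mean_r = sum(h[k] * mean_x[k][j] for k in range(num_users))
        var_r = sum(h[k] ** 2 * var_x[k][j] for k in range(num_users)) + sigma2
        for k in range(num_users):
            mean_zeta = mean_r - h[k] * mean_x[k][j]      # Eq. (8)
            var_zeta = var_r - h[k] ** 2 * var_x[k][j]    # Eq. (9)
            e_ese[k][j] = 2.0 * h[k] * (r[j] - mean_zeta) / var_zeta  # Eq. (10)
    return e_ese


def dec_statistics(e_dec: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Map extrinsic DEC LLRs to chip means and variances, Eqs. (33)-(34)."""
    mean = [math.tanh(v / 2.0) for v in e_dec]
    var = [1.0 - m * m for m in mean]
    return mean, var


def combine_repetition(e_ese_k: Sequence[float], interleaver_k: Sequence[int],
                       reps: int) -> List[float]:
    """Sum the ESE LLRs of the `reps` replicas of each b_k(i), as in Eq. (35).

    interleaver_k[j] plays the role of pi_k(j), so that c_k(j) = x_k(pi_k(j)).
    """
    num_bits = len(e_ese_k) // reps
    return [sum(e_ese_k[interleaver_k[j]] for j in range(i * reps, (i + 1) * reps))
            for i in range(num_bits)]
```

With no decoder feedback yet (all means 0 and all variances 1), `ese_single_path` reduces to scaling each received sample by $2 h_k$ over the total interference-plus-noise variance seen by user-$k$.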

(ii) Perform standard APP decoding for $C_{FEC}$ using $\{L(b_k(i))\}$ as the input, and generate the a posteriori LLRs $\{L_{APP}(b_k(i))\}$ for $\{b_k(i)\}$.

(iii) Recall that $c_k(j) = b_k(1)$ for $j = 1, \ldots, S$. We compute [17]

$$ e_{DEC}(x_k(\pi_k(j))) = e_{DEC}(c_k(j)) = L_{APP}(b_k(1)) - e_{ESE}(x_k(\pi_k(j))), \quad j = 1, \ldots, S. \tag{36} $$

The subtraction above ensures that $e_{DEC}(x_k(\pi_k(j)))$ is extrinsic [1].

Alternatively, we can use an approximation of (36),

$$ e_{DEC}(x_k(\pi_k(j))) \approx L_{APP}(b_k(1)), \quad j = 1, \ldots, S. \tag{37} $$

In this way, all the replicas of $b_k(i)$ have the same feedback from the DEC, so the memory usage can be greatly reduced (since we only need to store $\{L_{APP}(b_k(i))\}$ instead of $\{e_{DEC}(x_k(j))\}$). Eqn. (37) may lead to certain performance loss compared with (36). See Fig. 3(a) in Section IV below.

G. The Cost of the Overall Receiver

The DEC cost of a cascade $C_{FEC}/C_{REP}$ structure studied in Section II.F is dominated by the APP decoding cost for $C_{FEC}$, as the additional cost involved in (35) and (36) is usually marginal. In particular, suppose that a turbo type code is used as $C_{FEC}$. Then even a single-user detector would involve iterative processing with APP decoding. In this case, the extra cost for the multi-user detector described above is mainly related to the ESE, which, as we have seen, is very modest. The overall complexity of the multiuser detector can be roughly comparable to that of a single-user one. (The exact ratio depends on the cost ratio between the ESE and APP decoding.)

III. PERFORMANCE ANALYSIS

The performance analysis for a conventional CDMA multi-user detection scheme requires the knowledge of the correlation characteristics among signature sequences. It can be a quite complicated issue and sophisticated large random matrix theory has been used in the past to tackle the problem.

IDMA does not involve signature sequences, which greatly simplifies the problem. In the following, we will derive a simple and efficient performance assessment technique. The method is semi-analytical since some of the functions involved (related to the FEC codes) are pre-calculated by simulation (similar to [14][15]). We will only discuss Algorithms 1 and 3, as Algorithm 2 is a special case of Algorithm 3.

The resultant performance assessment method is useful in many applications. For example, in searching for optimized transmission power levels, repeated system performance evaluation is involved. A fast performance assessment technique is essential for this purpose. See [27][28] for details.

A. Performance Assessment for Algorithm 1

Approximate $\mathrm{Var}(\zeta_k(j))$ in (9) by its sample mean

$$ \mathrm{Var}(\zeta_k(j)) \approx V_{\zeta_k} \equiv \sum_{k' \neq k} |h_{k'}|^2 V_{x_{k'}} + \sigma^2 \tag{38} $$

where

$$ V_{x_k} \equiv \frac{1}{J} \sum_{j=1}^{J} \mathrm{Var}(x_k(j)). \tag{39} $$

(Notes: $\mathrm{Var}(x_k(j))$ is the variance of a particular $x_k(j)$ obtained from a feedback $e_{DEC}(x_k(j))$ using (34). $V_{x_k}$ and $V_{\zeta_k}$ are averages of $\{\mathrm{Var}(x_k(j)), \forall j\}$ and $\{\mathrm{Var}(\zeta_k(j)), \forall j\}$ respectively, which can be different for different $k$ due to the unequal fading coefficients for different users.) Substituting (38) into (10), we have

$$ e_{ESE}(x_k(j)) = \frac{2 h_k}{V_{\zeta_k}} \left( h_k x_k(j) + \zeta_k(j) - \mathrm{E}(\zeta_k(j)) \right). \tag{40} $$

In our study, we observed that (40) leads to slightly poorer performance compared with (10), since $\mathrm{Var}(\zeta_k(j))$ carries more information about $\zeta_k(j)$ (for a particular $j$) than $V_{\zeta_k}$. Thus, replacing (10) by (40) is a pessimistic approximation. However, this replacement greatly simplifies the analysis issue. Similar techniques have been used in [9][26] for CDMA receiver analysis.

In (40), $h_k x_k(j)$ and $\zeta_k(j) - \mathrm{E}(\zeta_k(j))$ represent signal and distortion components, respectively. Since $x_k(j) = \pm 1$, the signal power is $\mathrm{E}(|h_k x_k(j)|^2) = |h_k|^2$. We approximate the average noise power after soft cancellation (for a fixed $k$) by its sample mean,

$$ \mathrm{E}\left( |\zeta_k(j) - \mathrm{E}(\zeta_k(j))|^2 \right) \approx V_{\zeta_k}. \tag{41} $$

The coefficient $2 h_k / V_{\zeta_k}$ in (40) is a constant factor that does not affect the SNR. The average SNR of $e_{ESE}(x_k(j))$ over $j$, denoted by $snr_k$, is thus given by

$$ snr_k = \frac{\mathrm{E}\left( |h_k x_k(j)|^2 \right)}{V_{\zeta_k}} = \frac{|h_k|^2}{\sum_{k'} |h_{k'}|^2 V_{x_{k'}} - |h_k|^2 V_{x_k} + \sigma^2}. \tag{42} $$

We assume that $\{e_{ESE}(x_k(j)), \forall j\}$ can be approximately treated as LLRs of $\{x_k(j), \forall j\}$ generated from the observations of an AWGN channel with SNR equal to $snr_k$. This implies that the distortion components among $\{e_{ESE}(x_k(j)), \forall j\}$ are uncorrelated, which is approximately true when the frame length $J \to \infty$. Recall that $\mathrm{Var}(x_k(j))$ in (34) is calculated based on $e_{DEC}(x_k(j))$, so $V_{x_k}$ in (39) is a function of $snr_k$, i.e.,

$$ V_{x_k} = f(snr_k). \tag{43} $$

In general, there is no closed form expression for $f(\cdot)$, but it can be easily obtained by the Monte Carlo method. This only involves simulating a single-user APP decoder for $C$ in an AWGN channel with specified SNRs. We assume that all users use the same FEC code, so $f(\cdot)$ is the same for all users. Similarly, we can define the BER performance for the $k$th DEC as a function of $snr_k$,

$$ \mathrm{BER} = g(snr_k) \tag{44} $$

which can also be obtained by simulation. Combining (42) and (43), we have

$$ snr_{k\_new} = \frac{|h_k|^2}{\sum_{k'} |h_{k'}|^2 f(snr_{k'\_old}) - |h_k|^2 f(snr_{k\_old}) + \sigma^2} \tag{45} $$

where $snr_{k\_new}$ and $snr_{k\_old}$ are, respectively, $snr_k$ values after and before one iteration. At the start, we initialize $f(snr_{k\_old}) = 1$ for all $k$, implying no feedback from the DECs. Repeating (45), we can track the SNR evolution for the iterative process. During the final iteration, we can estimate the BER performance of all users using (44): $\mathrm{BER} = g(snr_{k\_final})$, $k = 1, 2, \ldots, K$.

[Fig. 2 (plot): Var = f(SNR) and BER = g(SNR) versus SNR, both on logarithmic axes.]

Fig. 2. The variance (solid line) and BER (dashed line) as functions of the SNR of a single-user APP decoder.
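The recursion in (45) can be sketched in a few lines of Python. This is only an illustrative sketch: in the paper $f(\cdot)$ is obtained by Monte Carlo simulation of the single-user APP decoder, whereas `toy_f` below is a made-up, monotonically decreasing placeholder satisfying $f(0) = 1$. The names `snr_evolution` and `toy_f` are hypothetical and do not come from the paper.

```python
from typing import Callable, List, Sequence


def toy_f(snr: float) -> float:
    """Hypothetical stand-in for f(.) in Eq. (43); NOT a real decoder curve."""
    return 1.0 / (1.0 + snr) ** 2


def snr_evolution(h_abs2: Sequence[float], sigma2: float,
                  f: Callable[[float], float],
                  iterations: int) -> List[List[float]]:
    """Track snr_k over the iterations using Eq. (45).

    h_abs2[k] holds |h_k|^2. Returns one list of per-user SNRs per iteration.
    """
    # Initialization f(snr_old) = 1 for all k: no feedback from the DECs yet.
    var_x = [1.0] * len(h_abs2)
    history: List[List[float]] = []
    for _ in range(iterations):
        # Sum over all users, shared by every k in the denominator of (45).
        total = sum(p * v for p, v in zip(h_abs2, var_x))
        snr = [p / (total - p * v + sigma2) for p, v in zip(h_abs2, var_x)]
        history.append(snr)
        # Decoder feedback for the next iteration, V_xk = f(snr_k), Eq. (43).
        var_x = [f(s) for s in snr]
    return history


if __name__ == "__main__":
    # Eight equal-power users; print the SNR of user 1 after each iteration.
    for it, snrs in enumerate(snr_evolution([1.0] * 8, 0.5, toy_f, 10), start=1):
        print(it, round(snrs[0], 4))
```

Replacing `toy_f` with a table interpolated from simulated decoder data, and applying a similarly obtained $g(\cdot)$ to the final SNRs, would give the BER estimate described in the text.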
B. Performance Assessment for Algorithm 3

We now consider Algorithm 3. With QPSK signaling, each $x_k(j)$ contains two coded bits in the real and imaginary parts respectively, and $V_{x_k}$ in (39) is modified as

$$ V_{x_k} \equiv \frac{1}{2J} \sum_{j=1}^{J} \left( \mathrm{Var}(x_k^{Re}(j)) + \mathrm{Var}(x_k^{Im}(j)) \right). \tag{46} $$

Similar to (38), we adopt the following approximation

$$ \mathrm{Var}(x_k^{Re}(j)) \approx \mathrm{Var}(x_k^{Im}(j)) \approx V_{x_k}. \tag{47} $$

Substitute (47) into (23)-(29). Then (30) can be modified as

$$ e_{ESE}(x_k^{Re}(j))_l = 2 \frac{|h_{k,l}|^2}{V_{\zeta_{k,l}}} \left( |h_{k,l}|^2 x_k^{Re}(j) + \mathrm{Re}(\bar{h}_{k,l} \zeta_{k,l}(j)) - \mathrm{E}\left( \mathrm{Re}(\bar{h}_{k,l} \zeta_{k,l}(j)) \right) \right) \tag{48} $$

where (after replacing $\mathrm{Var}(x_k^{Re}(j))$ and $\mathrm{Var}(x_k^{Im}(j))$ by $V_{x_k}$ in (25)-(29)),

$$ V_{\zeta_{k,l}} = |h_{k,l}|^2 \sum_{k',l'} |h_{k',l'}|^2 V_{x_{k'}} - |h_{k,l}|^4 V_{x_k} + |h_{k,l}|^2 \sigma^2. \tag{49} $$

Similar to (41), we approximate the average noise power after soft cancellation by $V_{\zeta_{k,l}}$, i.e.,

$$ \mathrm{E}\left( \left| \mathrm{Re}(\bar{h}_{k,l} \zeta_{k,l}(j)) - \mathrm{E}\left( \mathrm{Re}(\bar{h}_{k,l} \zeta_{k,l}(j)) \right) \right|^2 \right) \approx V_{\zeta_{k,l}}. \tag{50} $$

Then the average SNR for $e_{ESE}(x_k^{Re}(j))_l$, denoted by $snr_{k,l}$, is given by

$$ snr_{k,l} = \frac{\mathrm{E}\left( \left( |h_{k,l}|^2 x_k^{Re}(j) \right)^2 \right)}{V_{\zeta_{k,l}}} = \frac{|h_{k,l}|^2}{\sum_{k',l'} |h_{k',l'}|^2 V_{x_{k'}} - |h_{k,l}|^2 V_{x_k} + \sigma^2}. \tag{51} $$

Substituting (48) into (31), we have

$$ e_{ESE}(x_k^{Re}(j)) = 2 \sum_{l} \frac{|h_{k,l}|^2}{V_{\zeta_{k,l}}} \left( |h_{k,l}|^2 x_k^{Re}(j) + \mathrm{Re}(\bar{h}_{k,l} \zeta_{k,l}(j)) - \mathrm{E}\left( \mathrm{Re}(\bar{h}_{k,l} \zeta_{k,l}(j)) \right) \right). \tag{52} $$

We view $|h_{k,l}|^2 x_k^{Re}(j)$ and $\mathrm{Re}(\bar{h}_{k,l} \zeta_{k,l}(j)) - \mathrm{E}(\mathrm{Re}(\bar{h}_{k,l} \zeta_{k,l}(j)))$ in (52) as signal and distortion components, respectively. Their SNRs are given by $|h_{k,l}|^4 / V_{\zeta_{k,l}}$. Thus, besides a scaling factor of 2, (52) can be regarded as a MRC of $L$ independently distorted signals $\{ |h_{k,l}|^2 x_k^{Re}(j) + \mathrm{Re}(\bar{h}_{k,l} \zeta_{k,l}(j)) - \mathrm{E}(\mathrm{Re}(\bar{h}_{k,l} \zeta_{k,l}(j))), l = 0, \ldots, L - 1 \}$. Following the discussion in [29] on MRC, the average SNR for $e_{ESE}(x_k^{Re}(j))$, denoted by $snr_k$, is simply

$$ snr_k = \sum_{l} snr_{k,l}. \tag{53} $$

Similarly, it can be verified that the average SNR of $e_{ESE}(x_k^{Im}(j))$ over $j$ has the same expression as (52). Combining (52) and (43), we have (for either $e_{ESE}(x_k^{Re}(j))$ or $e_{ESE}(x_k^{Im}(j))$)

$$ snr_{k\_new} = \sum_{l} \frac{|h_{k,l}|^2}{\sum_{k',l'} |h_{k',l'}|^2 f(snr_{k'\_old}) - |h_{k,l}|^2 f(snr_{k\_old}) + \sigma^2}. \tag{54} $$

It is interesting to note the similarity between (45) and (54).

IV. NUMERICAL RESULTS

Let $N_{info}$ be the number of information bits in a frame, $K$ the number of simultaneous users in the system, $L$ the number of taps in an ISI channel, $N_r$ the number of receive antennas, $It$ the number of iterations, $R_C$ the rate of each user, and $K \times R_C$ the system throughput that is a measurement of the overall bandwidth efficiency. QPSK signaling is always assumed.

First we consider constructing $C$ using a common rate-1/2 $(23, 35)_8$ convolutional code followed by (i.e., in serial concatenation with) a length-8 repetition code ($R_C = 1/2 \times 1/8 = 1/16$). The repetition coding can be viewed as a kind of spreading,

except that all of the users use the same sequence. The resultant codeword is then multiplied by a mask sequence with alternant signs, i.e., $[+1, -1, +1, -1, \ldots]$. The purpose of the masking operation is to balance the numbers of $+1$ and $-1$, so as to maximize randomness among the transmitted sequences of different users (see footnote 1). Two independent chip interleavers are employed by each user to produce the in-phase and quadrature parts of the transmitted sequence.

Footnote 1: Consider an extreme example that half of the users send all $+1$ and the other half send all $-1$. Without masking operations, the received symbols will be all zeros and the signals from different users cannot be separated. This situation can be avoided using mask sequences. If each transmitted sequence has a balanced number of $+1$ and $-1$, the probability of the above event is extremely low after random interleaving.

Fig. 2 shows the curves of $f(\cdot)$ in (43) and $g(\cdot)$ in (44) obtained by Monte Carlo simulations for the concatenation of the convolutional code and the repetition code in an AWGN channel.

Fig. 3 compares the SNR evolution and simulation results of the above system in an AWGN channel with different numbers of users (Fig. 3(a)) and different numbers of iterations (Fig. 3(b)). The single-user performance is also included for reference. We compare three methods, namely, the semi-analytical SNR evolution approach discussed in Section III and the simulation methods using either (36) or (37) in DECs. For $K = 24$, equal power levels are assigned to all users, and for $K = 64$, unequal power levels are used. The relative power ratios between different users are as follows (normalized power level × user number): 1×32, 2.4883×16, 4.2999×2, 5.1598×14. These power levels are obtained using the power allocation method developed in [28]. The simulation results (using (36) in DECs) and evolution results are quite close for different $K$ (Fig. 3(a)) and for different numbers of iterations (Fig. 3(b)), which confirms the viability of the semi-analytical method. The low-cost method based on (37) can achieve performance close to that of (36) at $K = 24$, but the performance difference between the two methods becomes apparent when $K$ is large (i.e., $K = 64$). Also note that for $K = 64$, the bandwidth efficiency is four information bits per chip, which is very high compared with the results for CDMA reported in the literature.

Fig. 3. Comparison between the evolution and simulation results of a convolutionally coded IDMA system in AWGN channels. $N_{info}$ = 1024 and $N_r$ = 1. (a) For different numbers of simultaneous users $K$. $It$ = 15 and 50 for $K$ = 24 and 64, respectively. Simulation I and II denote the simulation results based on (36) and (37), respectively. (b) For different numbers of iterations ($It$) and $K$ = 24. Dashed lines represent evolution results and solid lines represent simulation results (using (36) in DECs).

Fig. 4. Convergence property of $V_{x_k}$ in the evolution procedure over AWGN channels at different $E_b/N_0$. $N_{info}$ = 1024, $N_r$ = 1, $K$ = 24, and equal power allocation is adopted.

Fig. 5. Performance of a convolutionally coded IDMA system in quasi-static multipath Rayleigh fading channels (BER versus average $E_b/N_0$ (dB) per receive antenna). The $(L, N_r)$ pair is marked in the figure. $K$ = 48 for one receive antenna and $K$ = 96 for two receive antennas. $N_{info}$ = 128 and $It$ = 10.

Fig. 4 illustrates the convergence speed of the above system during the evolution procedure over AWGN channels. We use $V_{x_k}$ in (39) as a measure of convergence. We consider different $E_b/N_0$ values, $K = 24$ and equal power allocation, so $V_{x_k}$ is the same for all $k$. As we can see, the convergence speed of

1.E+00 1.E+00
Scheme I
Scheme II
1.E-01 1.E-01
8 1 16 16
1.E-02 1.E-02

1.E-03 1.E-03

1.E-04 K=48 1.E-04
1.E-05 1.E-05
6 9 12 15 18 21 -1 -0.5 00 0.5 11 1.5 22 2.5
-1 -0.5 0.5 1.5 2.5
E b /N 0 (dB) E b /N 0 (dB)
Eb/N 0 (dB)

Fig. 6. Performance comparison between IDMA and CDMA systems in Fig. 7. Performance of IDMA systems based on the turbo-Hadamard code
quasi-static Rayleigh fading multipath channels with different numbers of
[31] and turbo code over AWGN channels. Nr = 1, It = 30, Ninfo = 4095
users. L = 2, Nr = 1, Ninfo = 128 and It = 10. The dashed lines are for
for Scheme I and Ninfo = 4096 for Scheme II.
CDMA systems and the solid lines are for IDMA systems.

$V_{x_k}$ increases with $E_b/N_0$. At $E_b/N_0 \ge 5$ dB, convergence can be achieved within 6 iterations. This observation agrees with Fig. 3(b).

Fig. 5 shows the performance of Algorithm 3 applied to the above system in quasi-static Rayleigh fading multipath channels with different numbers of channel taps and receive antennas. The corresponding single-user performance is also included for reference. It is observed that the system can achieve $K R_C = 3$ bits/chip for K = 48 using one receive antenna and $K R_C = 6$ bits/chip for K = 96 using two receive antennas, with performance close to the single-user performance at BER = $10^{-4}$. Such throughputs are rather high, recalling that with TDMA we may require a 128-QAM trellis coded modulation scheme to achieve similar throughput and performance.

It is also interesting to compare the performance of IDMA and CDMA using the same detection algorithm. Fig. 6 shows such comparisons for different numbers of users in a quasi-static Rayleigh fading multipath channel with L = 2 and $N_r = 1$. The main difference between IDMA and CDMA is the chip-level interleaving for the former and bit-level interleaving for the latter. For IDMA, the same parameters as those in Fig. 5 are used. For CDMA, a rate-1/2 $(23, 35)_8$ convolutional code is employed, followed by two independent length-8 spreading sequences in the real and imaginary parts for each user. At the receiver, the detection principle discussed in Section II is used for both systems$^2$. As we can see, the performance advantage of IDMA increases with the number of users.

This observation can be explained intuitively as follows. In multipath channels, adjacent chips from each user interfere with each other, so their ESE outputs are heavily correlated (see (30) and (31)). According to the detection principle discussed in Section II, the ESE outputs are used as the inputs in each DEC following the procedures listed in Steps (i)-(iii) in Section II.F. One basic assumption in Step (i) is that the LLRs involved in (35) are uncorrelated. In CDMA, this approximation is not correct since chips spread from the same $b_k(i)$ are transmitted consecutively, so the corresponding LLRs are heavily correlated. In IDMA, however, this assumption is more valid. After random chip-level interleaving, the replicas of each $b_k(i)$ are dispersed randomly, so the corresponding LLRs become less correlated. Note that the MMSE method proposed in [4] can be used to treat the correlation problem in CDMA, but the complexity involved is quite high.

Next we consider using a more sophisticated low-rate code to improve power efficiency. With higher power efficiency, the transmission power of each user can be reduced, which is beneficial to cellular systems [30]. We adopt a turbo-Hadamard code [31] constructed by concatenating 3 convolutional-Hadamard codes in parallel, each generated from a length-32 Hadamard code and a convolutional code with polynomial G(x) = 1/(1+x). The information bits in all component codes except one are punctured. A random puncturing operation on parity bits is also adopted to make $R_C = 1/16$.

Fig. 7 illustrates the performance of an IDMA system based on the turbo-Hadamard code (Scheme I) in AWGN channels. From Fig. 7, a BER of $10^{-5}$ is observed at $E_b/N_0 \approx 1.4$ dB with K = 16, which corresponds to $K R_C = 1$ bit/chip. This is only about 1.4 dB away from the corresponding Shannon limit, which is $E_b/N_0 = 0$ dB for a throughput of 1 bit/chip, the same as that for a single-user AWGN channel [12].

For comparison, we have also included in Fig. 7 the performance of an IDMA system based on a standard turbo code (Scheme II), in which C is constructed using a rate-1/3 $(1, 35/23)_8$ turbo code followed by a length-6 repetition code. Puncturing is applied to make $R_C = 1/16$. The advantage of using a low-rate code is clearly seen from Fig. 7. With K = 16, Scheme I demonstrates about 1 dB performance advantage over Scheme II, due to the higher coding gain offered by the turbo-Hadamard code. The decoding costs of Schemes I and II are quite similar.

2 For CDMA systems, a spreading sequence over $\{+1, -1\}$ is used in place of the repetition code in IDMA, and (35) and (36) are modified as follows: $L(b_k(1)) = \sum_j \pm\, e_{ESE}(x_k(\pi_k(j)))$ and $e_{DEC}(x_k(\pi_k(j))) = \pm L_{APP}(b_k(1)) - e_{ESE}(x_k(\pi_k(j)))$, where the "$\pm$" signs correspond to the signs of the spreading sequence.

V. CONCLUSIONS
We have presented several simple detection algorithms for various channels and developed a semi-analytical technique
to track the SNR evolution for these algorithms, based on which the performance of IDMA systems can be accurately predicted. The benefits of the IDMA scheme are substantial, as seen from Figs. 3 to 7. These include low-cost MUD for systems with large numbers of users, robustness and diversity in multipath environments, very high spectral efficiency and near-limit performance.

In conclusion, we have explained the feasibility and advantages of the interleaver-based multiple access scheme together with an accurate and effective performance prediction technique. We expect that the basic principles can be extended to other applications, such as space-time codes and ultra wideband (UWB) systems.

VI. APPENDIX. THE DERIVATION OF EQN. (29)
Based on (22), the left hand side (LHS) of (29) can be divided into two parts as

$$
\mathrm{Var}\big(\mathrm{Re}(h_{k,l}^{*}\,\zeta_{k,l}(j))\big)
= \mathrm{Var}\big(\mathrm{Re}(h_{k,l}^{*}\,r(j+l))\big)
- |h_{k,l}|^{4}\,\mathrm{Var}\big(x_k^{\mathrm{Re}}(j)\big). \tag{55}
$$

From (21),

$$
\begin{aligned}
\mathrm{Re}\big(h_{k,l}^{*}\,r(j+l)\big)
&= h_{k,l}^{\mathrm{Re}}\,r^{\mathrm{Re}}(j+l) + h_{k,l}^{\mathrm{Im}}\,r^{\mathrm{Im}}(j+l) \\
&= h_{k,l}^{\mathrm{Re}} \sum_{k',l'} \Big( h_{k',l'}^{\mathrm{Re}}\,x_{k'}^{\mathrm{Re}}(j+l-l') - h_{k',l'}^{\mathrm{Im}}\,x_{k'}^{\mathrm{Im}}(j+l-l') \Big) \\
&\quad + h_{k,l}^{\mathrm{Im}} \sum_{k',l'} \Big( h_{k',l'}^{\mathrm{Im}}\,x_{k'}^{\mathrm{Re}}(j+l-l') + h_{k',l'}^{\mathrm{Re}}\,x_{k'}^{\mathrm{Im}}(j+l-l') \Big)
 + \mathrm{Re}\big(h_{k,l}^{*}\,n(j+l)\big) \\
&= \sum_{k',l'} \Big( h_{k,l}^{\mathrm{Re}} h_{k',l'}^{\mathrm{Re}} + h_{k,l}^{\mathrm{Im}} h_{k',l'}^{\mathrm{Im}} \Big)\, x_{k'}^{\mathrm{Re}}(j+l-l') \\
&\quad + \sum_{k',l'} \Big( -h_{k,l}^{\mathrm{Re}} h_{k',l'}^{\mathrm{Im}} + h_{k,l}^{\mathrm{Im}} h_{k',l'}^{\mathrm{Re}} \Big)\, x_{k'}^{\mathrm{Im}}(j+l-l')
 + \mathrm{Re}\big(h_{k,l}^{*}\,n(j+l)\big).
\end{aligned} \tag{56}
$$

Hence,

$$
\begin{aligned}
\mathrm{Var}\big(\mathrm{Re}(h_{k,l}^{*}\,r(j+l))\big)
&= \sum_{k',l'} \Big( h_{k,l}^{\mathrm{Re}} h_{k',l'}^{\mathrm{Re}} + h_{k,l}^{\mathrm{Im}} h_{k',l'}^{\mathrm{Im}} \Big)^{2}\, \mathrm{Var}\big(x_{k'}^{\mathrm{Re}}(j+l-l')\big) \\
&\quad + \sum_{k',l'} \Big( -h_{k,l}^{\mathrm{Re}} h_{k',l'}^{\mathrm{Im}} + h_{k,l}^{\mathrm{Im}} h_{k',l'}^{\mathrm{Re}} \Big)^{2}\, \mathrm{Var}\big(x_{k'}^{\mathrm{Im}}(j+l-l')\big)
 + |h_{k,l}|^{2}\sigma^{2} \\
&= \big(h_{k,l}^{\mathrm{Re}}\big)^{2} \sum_{k',l'} \Big[ \big(h_{k',l'}^{\mathrm{Re}}\big)^{2}\, \mathrm{Var}\big(x_{k'}^{\mathrm{Re}}(j+l-l')\big) + \big(h_{k',l'}^{\mathrm{Im}}\big)^{2}\, \mathrm{Var}\big(x_{k'}^{\mathrm{Im}}(j+l-l')\big) \Big] \\
&\quad + \big(h_{k,l}^{\mathrm{Im}}\big)^{2} \sum_{k',l'} \Big[ \big(h_{k',l'}^{\mathrm{Im}}\big)^{2}\, \mathrm{Var}\big(x_{k'}^{\mathrm{Re}}(j+l-l')\big) + \big(h_{k',l'}^{\mathrm{Re}}\big)^{2}\, \mathrm{Var}\big(x_{k'}^{\mathrm{Im}}(j+l-l')\big) \Big] \\
&\quad + 2 h_{k,l}^{\mathrm{Re}} h_{k,l}^{\mathrm{Im}} \sum_{k',l'} h_{k',l'}^{\mathrm{Re}} h_{k',l'}^{\mathrm{Im}} \Big[ \mathrm{Var}\big(x_{k'}^{\mathrm{Re}}(j+l-l')\big) - \mathrm{Var}\big(x_{k'}^{\mathrm{Im}}(j+l-l')\big) \Big] \\
&\quad + \Big[ \big(h_{k,l}^{\mathrm{Re}}\big)^{2} + \big(h_{k,l}^{\mathrm{Im}}\big)^{2} \Big] \sigma^{2}.
\end{aligned} \tag{57}
$$

Substituting (25), (26) and (27) into (57) gives

$$
\mathrm{Var}\big(\mathrm{Re}(h_{k,l}^{*}\,r(j+l))\big)
= \big(h_{k,l}^{\mathrm{Re}}\big)^{2}\, \mathrm{Var}\big(r^{\mathrm{Re}}(j+l)\big)
+ \big(h_{k,l}^{\mathrm{Im}}\big)^{2}\, \mathrm{Var}\big(r^{\mathrm{Im}}(j+l)\big)
+ 2 h_{k,l}^{\mathrm{Re}} h_{k,l}^{\mathrm{Im}}\, \Psi(j+l). \tag{58}
$$

Finally, (29) results from substituting (58) into (55).

REFERENCES
[1] C. Berrou and A. Glavieux, "Near optimum error correcting coding and decoding: Turbo-codes," IEEE Trans. Commun., vol. 44, pp. 1261-1271, Oct. 1996.
[2] M. Moher and P. Guinand, "An iterative algorithm for asynchronous coded multi-user detection," IEEE Commun. Lett., vol. 2, pp. 229-231, Aug. 1998.
[3] M. C. Reed, C. B. Schlegel, P. D. Alexander, and J. A. Asenstorfer, "Iterative multi-user detection for CDMA with FEC: Near-single-user performance," IEEE Trans. Commun., vol. 46, pp. 1693-1699, Dec. 1998.
[4] X. Wang and H. V. Poor, "Iterative (turbo) soft interference cancellation and decoding for coded CDMA," IEEE Trans. Commun., vol. 47, pp. 1046-1061, July 1999.
[5] Z. Shi and C. Schlegel, "Joint iterative decoding of serially concatenated error control coded CDMA," IEEE J. Select. Areas Commun., vol. 19, pp. 1646-1653, Aug. 2001.
[6] A. AlRustamani, A. D. Damnjanovic, and B. R. Vojcic, "Turbo greedy multi-user detection," IEEE J. Select. Areas Commun., vol. 19, pp. 1638-1645, Aug. 2001.
[7] R. J. McEliece, "Are turbo-like codes effective on nonstandard channels?" IEEE Inform. Theory Society Newsletter, vol. 51, no. 4, pp. 1-8, Dec. 2001.
[8] M. C. Reed and P. D. Alexander, "Iterative multi-user detection using antenna arrays and FEC on multipath channels," IEEE J. Select. Areas Commun., vol. 17, pp. 2082-2089, Dec. 1999.
[9] J. Boutros and G. Caire, "Iterative multi-user joint decoding: Unified framework and asymptotic analysis," IEEE Trans. Inform. Theory, vol. 48, pp. 1772-1793, July 2002.
[10] M. L. Honig and R. Ratasuk, "Large-system performance of iterative multiuser decision-feedback detection," IEEE Trans. Commun., vol. 51, pp. 1368-1377, Aug. 2003.
[11] A. J. Viterbi, "Very low rate convolutional codes for maximum theoretical performance of spread spectrum multiple-access channels," IEEE J. Select. Areas Commun., vol. 8, pp. 641-649, Aug. 1990.
[12] S. Verdu and S. Shamai, "Spectral efficiency of CDMA with random spreading," IEEE Trans. Inform. Theory, vol. 45, pp. 622-640, Mar. 1999.
[13] J. Y. N. Hui, "Throughput analysis for code division multiple accessing of the spread spectrum channel," IEEE J. Select. Areas Commun., vol. 2, pp. 482-486, July 1984.
[14] F. N. Brannstrom, T. M. Aulin, and L. K. Rasmussen, "Iterative decoders for trellis code multiple-access," IEEE Trans. Commun., vol. 50, pp. 1478-1485, Sept. 2002.
[15] S. Bruck, U. Sorger, S. Gligorevic, and N. Stolte, "Interleaving for outer convolutional codes in DS-CDMA Systems," IEEE Trans. Commun., vol. 48, pp. 1100-1107, July 2000.
[16] A. Tarable, G. Montorsi, and S. Benedetto, "Analysis and design of interleavers for CDMA systems," IEEE Commun. Lett., vol. 5, pp. 420-422, Oct. 2001.
[17] R. H. Mahadevappa and J. G. Proakis, "Mitigating multiple access interference and intersymbol interference in uncoded CDMA systems with chip-level interleaving," IEEE Trans. Wireless Commun., vol. 1, pp. 781-792, Oct. 2002.
[18] Li Ping, L. Liu, K. Y. Wu, and W. K. Leung, "Interleave-division multiple-access (IDMA) communications," in Proc. 3rd International Symposium on Turbo Codes & Related Topics, 2003, pp. 173-180.
[19] L. Liu, W. K. Leung, and Li Ping, "Simple chip-by-chip multi-user detection for CDMA systems," in Proc. IEEE VTC2003-Spring, Jeju, Korea, Apr. 2003, pp. 2157-2161.
[20] T. Richardson and R. Urbanke, "The capacity of low density parity check codes under message passing decoding," IEEE Trans. Inform. Theory, vol. 47, pp. 599-618, Feb. 2001.
[21] S. ten Brink, "Convergence behavior of iteratively decoded parallel concatenated codes," IEEE Trans. Commun., vol. 49, pp. 1727-1737, Oct. 2001.
[22] D. Divsalar, S. Dolinar, and F. Pollara, "Iterative turbo decoder analysis based on density evolution," IEEE J. Select. Areas Commun., vol. 19, pp. 891-907, May 2001.
[23] K. Y. Wu and Li Ping, "Multi-layer turbo space-time codes," IEEE Commun. Lett., vol. 9, pp. 55-57, Jan. 2005.
[24] X. Ma and Li Ping, "Coded modulation using superimposed binary codes," IEEE Trans. Inform. Theory, vol. 50, pp. 3331-3343, Dec. 2004.
[25] H. Schoeneich and P. A. Hoeher, "Adaptive interleave-division multiple access - A potential air interface for 4G bearer services and wireless LANs," in Proc. WOCN2004, Muscat, Oman, June 2004, pp. 179-182.
[26] G. Caire, R. R. Muller, and T. Tanaka, "Iterative multiuser joint decoding: Optimal power allocation and low-complexity implementation," IEEE Trans. Inform. Theory, vol. 50, pp. 1950-1973, Sept. 2004.
[27] Li Ping and L. Liu, "Analysis and design of IDMA systems based on SNR evolution and power allocation," in Proc. VTC2004-Fall, Los Angeles, CA, Sept. 2004.
[28] J. Zhang, E. K. P. Chong, and D. N. C. Tse, "Output MAI distributions of linear MMSE multiuser receivers in DS-CDMA systems," IEEE Trans. Inform. Theory, vol. 47, pp. 1068-1072, Mar. 2001.
[29] T. S. Rappaport, Wireless Communications: Principles and Practice. Prentice-Hall, 1996.
[30] K. S. Gilhousen, I. M. Jacobs, R. Padovani, A. J. Viterbi, L. A. Weaver, and C. E. Wheatley, "On the capacity of a cellular CDMA system," IEEE Trans. Vehicular Technology, vol. 40, pp. 303-312, May 1991.
[31] Li Ping, W. K. Leung, and K. Y. Wu, "Low-rate turbo-Hadamard codes," IEEE Trans. Inform. Theory, vol. 49, pp. 3213-3224, Dec. 2003.

Li Ping (S'87-M'91) received his Ph.D. degree at Glasgow University in 1990. He lectured at the Department of Electronic Engineering, Melbourne University, from 1990 to 1992, and worked as a member of the research staff at Telecom Australia Research Laboratories from 1993 to 1995. He has been with the Department of Electronic Engineering, City University of Hong Kong, since January 1996, where he is now a professor. His research interests are mixed analog/digital circuits, communications systems and coding theory. Dr. Li Ping was awarded a British Telecom - Royal Society Fellowship in 1986, the IEE J J Thomson premium in 1993 and a Croucher Senior Research Fellowship in 2005.

Lihai Liu (S'02) received the B.S. degree in Electronic & Information System and the M.E. degree in Circuit & System from Wuhan University, Wuhan, China in 1997 and 2000, respectively. He is currently working towards the Ph.D. degree at City University of Hong Kong. His research interests are equalization and multiuser detection in communication systems.

Keying Wu (S'02) received the B.E. and M.E. degrees in Communication Engineering from Xidian University, China in 1996 and 1999, respectively. She is currently studying in the Department of Electronic Engineering at City University of Hong Kong for a Ph.D. degree. Her research interests include coding techniques and multiuser detection.

W. K. Leung received his Ph.D. degree from City University of Hong Kong in 2004. His research interests include error-correcting coding and wireless communication systems.