(2.6)
Figure 2.6: Structure of a basic OFDM transmitter
The number of bits in d depends on the chosen $F_{map}$.
Quadrature Phase Shift Keying (QPSK) modulation is used in the thesis because of its popularity in the literature. The signal constellation for QPSK modulation is shown in Figure 2.7. When the QPSK scheme is used, two bits of the vector b result in one modulated symbol of d. The amplitude is the same for all combinations of two bits of b, but the signal phase differs. Gray coding is used to map the transmitted bits into data symbols, so that neighboring constellation points differ in exactly one bit. A symbol error in which a received data symbol is decided to belong to a neighboring constellation point therefore causes only one bit error.
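The Gray-coded mapping can be illustrated with a minimal Python sketch. The bit-to-quadrant assignment in `GRAY_QPSK` and the function names are assumptions made for illustration; the constellation actually used in the thesis is the one shown in Figure 2.7.

```python
import math

# Assumed Gray-coded QPSK constellation (illustrative only):
# points that are neighbors in the complex plane differ in exactly one bit.
GRAY_QPSK = {
    (0, 0): complex(1, 1),
    (0, 1): complex(-1, 1),
    (1, 1): complex(-1, -1),
    (1, 0): complex(1, -1),
}
SCALE = 1 / math.sqrt(2)  # normalizes every symbol to unit amplitude


def qpsk_map(bits):
    """Map an even-length bit sequence b to QPSK symbols d (two bits per symbol)."""
    if len(bits) % 2:
        raise ValueError("QPSK needs an even number of bits")
    return [GRAY_QPSK[(bits[i], bits[i + 1])] * SCALE
            for i in range(0, len(bits), 2)]


def qpsk_demap(symbols):
    """Hard decision: choose the nearest constellation point for each symbol."""
    bits = []
    for s in symbols:
        pair = min(GRAY_QPSK, key=lambda b: abs(GRAY_QPSK[b] * SCALE - s))
        bits.extend(pair)
    return bits


def hamming(a, b):
    """Number of positions in which two bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))
```

All four symbols have the same amplitude and differ only in phase, and a decision error towards a neighboring point changes a single bit of the pair.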
After encoding, the elements in d are placed in parallel by a serial-to-parallel transformation. They may then be multiplied by a weighting matrix $\Lambda$, a diagonal matrix of size $[N_s \times N_s]$ that allows different power levels to be assigned to different elements in an OFDM symbol, as explained in [Sandell, Edfors, 1996].
16 OFDM Analytical Model
Figure 2.7: Quadrature Phase Shift Keying (QPSK)
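$$
\Lambda =
\begin{bmatrix}
\lambda_0 & 0 & \cdots & 0 \\
0 & \lambda_1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_{N_s-1}
\end{bmatrix}
\tag{2.7}
$$

$$
a_m = \Lambda\, d_m^{T} \tag{2.8}
$$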
The symbol vector $a_m$ (size $[N_s \times 1]$) contains the encoded, weighted symbols to be transmitted using $N_s$ sub-carriers:

$$
a_m =
\begin{bmatrix}
a_m(0) \\
a_m(1) \\
\vdots \\
a_m(N_s-1)
\end{bmatrix}
\tag{2.9}
$$
As the encoded symbols are complex, the real part of each encoded symbol is modulated using a cosine harmonic and the imaginary part is modulated using a sine harmonic. Sine and cosine harmonics are used because they are orthogonal to each other, which makes faithful demodulation possible at the receiver.
2.4.1 Cyclic Prefix
In an OFDM system, sub-carriers passing through a time-dispersive channel lose their orthogonality, and the received signal suffers from inter-carrier interference (ICI) and inter-symbol interference (ISI).
ISI, in the context of the thesis, is defined as the crosstalk between signals within the same sub-channel of consecutive FFT frames, which are separated in time by the signalling interval. ICI is the crosstalk between adjacent sub-channels or frequency bands of the same FFT frame.
The problem of ISI is illustrated in Figure 2.8(a). Portions of the transmitted OFDM symbols overlap because of multipath, causing constructive or destructive interference. This changes the spectral content of the received symbol, so ISI is one of the possible causes of ICI. The system performance degrades because of the interference.
To overcome inter-symbol interference, a Cyclic Prefix (CP) is added to the OFDM symbols. It is a copy of the last $N_g$ samples of the OFDM symbol that is prepended to the transmitted symbol and removed at the receiver before demodulation (Figure 2.8(b)). When the cyclic prefix is long enough to accommodate the maximum path delay, the overlapping parts are rejected at the receiver, and the received OFDM symbol contains no ISI during its useful period $N_u$.
The benefit of the cyclic prefix is twofold. First, it avoids ISI by acting as a guard space between symbols. Second, it converts the linear convolution between the transmitted symbol and the channel impulse response into a cyclic convolution. Cyclic convolution in the time domain translates into a scalar multiplication in the frequency domain [Mitra, 1998, pp. 140], so the sub-carriers remain orthogonal and there is no ICI [Engels, 2002], [Sramek, 2003].
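The second property can be checked numerically. The following Python sketch uses an assumed three-tap channel `h` and an arbitrary eight-sample symbol, both chosen only for illustration. It prepends the prefix, applies a linear convolution, removes the prefix, and compares the DFT of the result with the element-wise product of the DFTs of the symbol and the channel.

```python
import cmath


def dft(x):
    """Naive O(N^2) DFT, adequate for a short illustration."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * cmath.pi * k * i / n) for i in range(n))
            for k in range(n)]


def linear_convolve(x, h):
    """Linear convolution of the sequences x and h."""
    y = [0j] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def through_channel(symbol, h, n_g):
    """Prepend an n_g-sample cyclic prefix, convolve with h, drop the prefix."""
    tx = symbol[len(symbol) - n_g:] + symbol
    rx = linear_convolve(tx, h)
    return rx[n_g:n_g + len(symbol)]


def per_carrier_error(symbol, h, n_g):
    """Largest deviation from the per-sub-carrier scalar model Z(k) = X(k) H(k)."""
    received = through_channel(symbol, h, n_g)
    h_padded = list(h) + [0.0] * (len(symbol) - len(h))
    expected = [x * hk for x, hk in zip(dft(symbol), dft(h_padded))]
    return max(abs(a - b) for a, b in zip(dft(received), expected))


symbol = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j, 1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
h = [1.0, 0.5, 0.25]  # assumed channel with a delay spread of two samples

print(per_carrier_error(symbol, h, n_g=2))  # prefix covers the delay spread
print(per_carrier_error(symbol, h, n_g=0))  # no prefix: scalar model breaks
```

With a prefix of two samples the deviation is at the level of rounding error, whereas without a prefix the scalar model no longer holds.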
Although BER performance is preserved, the spectral efficiency decreases because of the enlarged symbol time. As the cyclic prefix does not carry any new information, it should be set to a minimum length. The minimum length of the cyclic prefix that cancels ISI completely is $N_g = N_{ds}$. The multipath delay spread $N_{ds}$ is the relative delay of the last path as compared to the first arriving path in samples, or the length of the channel impulse response (in samples) minus one.
The insertion of the cyclic prefix is explained in Figure 2.9. The OFDM symbol being sent is extended with a cyclic prefix of length $N_g$ by copying the last $N_g$ samples to the beginning of the symbol. Therefore, the total symbol length becomes $N_{gu} = N_u + N_g$ samples.
(a) ISI without cyclic prefix
(b) ISI with cyclic prefix
Figure 2.8: Inter-symbol interference and cyclic prefix
Figure 2.9: Cyclic prefix insertion
The key principle of OFDM is the orthogonality of its sub-carriers. The cyclic prefix is a technique that combats the negative effects of a multipath environment on the transmitted signal. Therefore, further insight into sub-carrier orthogonality and the cyclic prefix is given in the next section.
2.4.2 Sub-carrier Correlation Properties
An orthogonal waveform can be represented [Engels, 2002, pp. 35] as:

$$
\phi_k(t) =
\begin{cases}
\dfrac{1}{\sqrt{T_u}}\, e^{j 2\pi f_k t} & t \in [0, T_u] \\
0 & \text{otherwise}
\end{cases}
\tag{2.10}
$$

$\phi_k(t)$: $k$-th orthogonality function;
Each base function has an integer number of cycles in the useful OFDM symbol time and is therefore orthogonal to the other functions [IEC, 2004]. The cross-correlation between two arbitrary orthogonality functions $\phi_k(t)$ and $\phi_{k'}(t)$, when the channel delay is less than the length of the cyclic prefix, is calculated according to [Engels, 2002] as:
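$$
\int_0^{T_u} \phi_k(t)\, \phi_{k'}^{*}(t)\, dt = \tag{2.11}
$$

$$
= \int_0^{T_u} \left[ \left( \frac{1}{\sqrt{T_u}}\, e^{j 2\pi f_k t} \right) \left( \frac{1}{\sqrt{T_u}}\, e^{-j 2\pi f_{k'} t} \right) \right] dt \tag{2.12}
$$

$$
= \frac{1}{T_u} \int_0^{T_u} e^{j 2\pi (f_k - f_{k'}) t}\, dt \tag{2.13}
$$

When $k = k'$, $f_k = f_{k'}$ and

$$
\int_0^{T_u} \phi_k(t)\, \phi_{k'}^{*}(t)\, dt = \tag{2.14}
$$

$$
= \frac{1}{T_u} \int_0^{T_u} dt \qquad (e^{0} = 1) \tag{2.15}
$$

$$
= \frac{1}{T_u} \left[ 1 \cdot T_u - 1 \cdot 0 \right] \tag{2.16}
$$

$$
= 1 \tag{2.17}
$$

When $k \neq k'$,

$$
\int_0^{T_u} \phi_k(t)\, \phi_{k'}^{*}(t)\, dt = \tag{2.18}
$$

$$
= \frac{1}{T_u} \int_0^{T_u} e^{j 2\pi (f_k - f_{k'}) t}\, dt \tag{2.19}
$$

$$
= \frac{1}{T_u}\, \frac{1}{j 2\pi (f_k - f_{k'})} \left[ e^{j 2\pi (f_k - f_{k'}) T_u} - 1 \right] \tag{2.20}
$$

$$
= 0 \tag{2.21}
$$

The last step follows because each base function has an integer number of cycles in $T_u$, so $(f_k - f_{k'})\,T_u$ is an integer and the exponential in Equation 2.20 equals one.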
(a) Auto-correlation sequence of $\phi_2$
(b) Auto-correlation sequence of $\phi_3$
Figure 2.10: Auto-correlation of orthogonality functions
Upon summarizing,

$$
\int_0^{T_u} \phi_k(t)\, \phi_{k'}^{*}(t)\, dt =
\begin{cases}
0 & (k \neq k') \\
1 & (k = k')
\end{cases}
\tag{2.22}
$$
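The result of Equation 2.22 can be verified numerically with sampled base functions. The sketch below assumes $f_k = k / T_u$ and $N_u$ samples per useful symbol time; the function names are illustrative and not taken from the thesis.

```python
import cmath
import math


def subcarrier(k, n_u):
    """Samples of the k-th base function, assuming f_k = k / T_u."""
    return [cmath.exp(2j * cmath.pi * k * i / n_u) / math.sqrt(n_u)
            for i in range(n_u)]


def correlate(k, k_prime, n_u=64):
    """Discrete counterpart of Equation 2.11: sum of phi_k * conj(phi_k')."""
    a = subcarrier(k, n_u)
    b = subcarrier(k_prime, n_u)
    return sum(x * y.conjugate() for x, y in zip(a, b))


print(abs(correlate(2, 2)))  # equal indices: magnitude close to 1
print(abs(correlate(2, 3)))  # different indices: magnitude close to 0
```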
Figure 2.10 shows how the auto-correlation of the base functions $\phi_2$ and $\phi_3$ varies when the delay varies from $-2N_{gu}$ to $2N_{gu}$ in unit increments. It can be seen from the figure that the correlation value is one while the delay is between 0 and $N_g$, i.e., orthogonality holds when the cyclic prefix length $N_g$ is greater than the maximum excess delay $N_{ds}$, as the cyclic prefix cancels the effects of the delay. When the maximum excess delay exceeds $N_g$, the correlation starts decreasing.
Cross-correlation sequences between different base functions $\phi_k$ and $\phi_{k'}$ are shown in Figure 2.12. It is seen that the cross-correlation sequence does not depend on any particular values of $k$ and $k'$
(a) …, $k' = 3$ (b) $k = 3$, $k' = 4$
(c) $k = 2$, $k' = 4$ (d) $k = 3$, $k' = 5$
Figure 2.12: Cross-correlation of orthogonality functions
Figure 2.13: Three approaches for modulation and demodulation using orthogonal sub-carriers
size $[N_s \times 1]$. This multiplication results in a time-domain signal of size $[N_u \times 1]$. Finally, the cyclic prefix is inserted, which increases the size of the signal to be transmitted to $[N_{gu} \times 1]$. The cyclic prefix insertion can be described in matrix form as multiplication of the transmitted signal with a matrix $G_{cp}$ that has two diagonals, as shown in [Engelhart et al, 1999]:

$$
G_{cp} =
\begin{bmatrix}
0_{N_g \times (N_{FFT} - N_g)} \quad I_{N_g \times N_g} \\
I_{N_{FFT} \times N_{FFT}}
\end{bmatrix}
\tag{2.23}
$$

$$
s_m = G_{cp}\, \Phi\, a_m \tag{2.24}
$$

$0$: zero matrix;
$I$: identity matrix;
2. Instead of generating the orthogonal sub-carriers, an IDFT operation can be used as proposed by [Weinstein, Ebert, 1971]. The advantage of using an IFFT operation instead of an IDFT is described in Section 2.4.6. The cyclic prefix is added to the IDFT result:

$$
s_m = G_{cp}\, \mathrm{IFFT}(a_m) \tag{2.25}
$$
3. By copying the $N_g$ trailing rows of $\Phi$ before its leading rows, an orthogonality matrix that already contains the cyclic prefix ($\Phi_{cp}$, size $[N_{gu} \times N_s]$) is formed. Both modulation and prefix insertion can now be done by one matrix multiplication ($\Phi_{cp}\, a$), resulting in a time-domain signal vector of size $[N_{gu} \times 1]$:

$$
s_m = \Phi_{cp}\, a_m \tag{2.26}
$$

The orthogonality matrices $\Phi$ and $\Phi_{cp}$ are described in detail in Sections 2.4.4 and 2.4.5.
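The equivalence of the three approaches can be illustrated with a small Python sketch. It assumes $\omega_k = 2\pi k / N_u$ and $N_s = N_u$, and it uses an IDFT without the $1/N$ scaling factor so that the result matches the matrix product exactly; all function names are hypothetical.

```python
import cmath


def phi_matrix(n_u, n_s):
    """Phi[i][k] = exp(j * omega_k * i), assuming omega_k = 2*pi*k / n_u."""
    return [[cmath.exp(2j * cmath.pi * k * i / n_u) for k in range(n_s)]
            for i in range(n_u)]


def matvec(m, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def add_cp(x, n_g):
    """Copy the last n_g entries (samples or matrix rows) to the front."""
    return x[len(x) - n_g:] + x


def idft_unscaled(a):
    """IDFT without the 1/N factor (assumed scaling, matches Phi * a)."""
    n = len(a)
    return [sum(a[k] * cmath.exp(2j * cmath.pi * k * i / n) for k in range(n))
            for i in range(n)]


def transmit_three_ways(a, n_g):
    """Return s_m computed as in Equations 2.24, 2.25 and 2.26."""
    n = len(a)
    phi = phi_matrix(n, n)
    s1 = add_cp(matvec(phi, a), n_g)    # approach 1: Phi, then prefix insertion
    s2 = add_cp(idft_unscaled(a), n_g)  # approach 2: IDFT, then prefix insertion
    s3 = matvec(add_cp(phi, n_g), a)    # approach 3: single product with Phi_cp
    return s1, s2, s3


a_m = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]
s1, s2, s3 = transmit_three_ways(a_m, n_g=1)
print(max(abs(x - y) for x, y in zip(s1, s2)))
print(max(abs(x - y) for x, y in zip(s1, s3)))
```

Here `add_cp` plays the role of $G_{cp}$: applied to a vector it prepends the prefix, and applied to the rows of $\Phi$ it forms $\Phi_{cp}$.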
2.4.4 Structure of Orthogonality Matrix
The matrix $\Phi$ is used for mapping a vector of transmitted data onto orthogonal sub-carriers. It is defined as:

$$
\Phi =
\begin{bmatrix}
\phi_{0,0} & \phi_{0,1} & \cdots & \phi_{0,N_s-2} & \phi_{0,N_s-1} \\
\phi_{1,0} & \phi_{1,1} & \cdots & \phi_{1,N_s-2} & \phi_{1,N_s-1} \\
\vdots & \vdots & \phi_{i,k} & \vdots & \vdots \\
\phi_{N_u-2,0} & \phi_{N_u-2,1} & \cdots & \phi_{N_u-2,N_s-2} & \phi_{N_u-2,N_s-1} \\
\phi_{N_u-1,0} & \phi_{N_u-1,1} & \cdots & \phi_{N_u-1,N_s-2} & \phi_{N_u-1,N_s-1}
\end{bmatrix}
\tag{2.27}
$$

The dimensions of $\Phi$ are $[N_u \times N_s]$.
Each column of the matrix contains the values of the $k$-th orthogonality function:

$$
\phi_{i,k} = \cos \omega_k i + j \sin \omega_k i, \qquad i = 0, 1, \ldots, N_u - 1 \tag{2.28}
$$

There is a total of $N_s$ used sub-carriers in the system, i.e.,

$$
k = 0, 1, \ldots, N_s - 1 \tag{2.29}
$$

From Equation 2.4 it is known that $N_s = N_u$.
The first column of $\Phi$, with frequency $f_0$ (zero cycles per $N_u$ samples), contains samples of the first orthogonality function; the second column has one cycle per $N_u$ samples, and so on. There is a total of $N_s$ columns (base functions) in the matrix and each of them has $N_u$ samples. The real part of $\phi_k$ is a cosine harmonic and the imaginary part is a sine harmonic.
2.4.5 Orthogonality Matrix with Cyclic Prefix
The orthogonality matrix $\Phi$ can be extended to include the cyclic prefix. In this case there is no need for separate operations for mapping and cyclic prefix insertion.
When a cyclic prefix of length $N_g$ is added, $N_g$ samples from the end of each base function are copied to the beginning of that base function, so the matrix grows to size $[(N_u + N_g) \times N_s]$. The matrix after the addition of the cyclic prefix is represented as:
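$$
\Phi_{cp} = G_{cp}\, \Phi = \tag{2.30}
$$

$$
=
\begin{bmatrix}
\phi_{-N_g,0} & \phi_{-N_g,1} & \cdots & \phi_{-N_g,N_s-2} & \phi_{-N_g,N_s-1} \\
\phi_{-N_g+1,0} & \phi_{-N_g+1,1} & \cdots & \phi_{-N_g+1,N_s-2} & \phi_{-N_g+1,N_s-1} \\
\vdots & \vdots & & \vdots & \vdots \\
\phi_{-2,0} & \phi_{-2,1} & \cdots & \phi_{-2,N_s-2} & \phi_{-2,N_s-1} \\
\phi_{-1,0} & \phi_{-1,1} & \cdots & \phi_{-1,N_s-2} & \phi_{-1,N_s-1} \\
\hline
\phi_{0,0} & \phi_{0,1} & \cdots & \phi_{0,N_s-2} & \phi_{0,N_s-1} \\
\phi_{1,0} & \phi_{1,1} & \cdots & \phi_{1,N_s-2} & \phi_{1,N_s-1} \\
\vdots & \vdots & \phi_{i,k} & \vdots & \vdots \\
\phi_{N_u-2,0} & \phi_{N_u-2,1} & \cdots & \phi_{N_u-2,N_s-2} & \phi_{N_u-2,N_s-1} \\
\phi_{N_u-1,0} & \phi_{N_u-1,1} & \cdots & \phi_{N_u-1,N_s-2} & \phi_{N_u-1,N_s-1}
\end{bmatrix}
\tag{2.31}
$$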
It is observed from Equation 2.31 that the lower sub-matrix is equal to the matrix $\Phi$ from Equation 2.27. The upper sub-matrix contains the cyclic prefix parts of the orthogonality functions.
The values of the $k$-th column of $\Phi_{cp}$ are calculated as:

$$
\phi^{cp}_{i,k} = \cos \omega_k i + j \sin \omega_k i, \qquad i = -N_g, \ldots, N_u - 1 \tag{2.32}
$$
2.4.6 OFDM Transmitter Using IDFT
As described in Section 2.4.3, the computational complexity of the modulator can be reduced.
The number of complex multiplications required in a DFT operation is $N_{FFT}^2$ and the number of complex additions is $N_{FFT}^2 - N_{FFT}$. The computational complexity can be reduced further by using an FFT operation. The numbers of complex multiplications and complex additions in an FFT operation are of order $\frac{N_{FFT}}{2} \log_2(N_{FFT})$ and $N_{FFT} \log_2(N_{FFT})$, respectively. As $N_{FFT}$ increases, the usefulness of the FFT becomes more apparent. Thus, the modulation of the encoded symbols can be performed efficiently by using an IFFT. A representation of an OFDM transmitter which uses an IFFT operation instead of independent modulators for each encoded signal is depicted in Figure 2.14.
Figure 2.14: IFFT-based OFDM transmitter
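The saving can be made concrete by evaluating the operation counts quoted above for the FFT size $N_{FFT} = 2048$ that is used later in the report. This is only a sketch of the arithmetic; the function names are illustrative and the FFT size is assumed to be a power of two.

```python
import math


def dft_ops(n):
    """Complex multiplications and additions of a direct DFT of size n."""
    return n * n, n * n - n


def fft_ops(n):
    """Order-of-magnitude counts for a radix-2 FFT (n must be a power of two)."""
    stages = int(math.log2(n))
    if 2 ** stages != n:
        raise ValueError("n must be a power of two")
    return (n // 2) * stages, n * stages


n_fft = 2048
print(dft_ops(n_fft))  # (4194304, 4192256)
print(fft_ops(n_fft))  # (11264, 22528)
```

For this size the FFT needs roughly 370 times fewer complex multiplications than the direct DFT.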
At a particular signalling interval $m$, the transmitted signal is given by:

$$
s_m = G_{cp}\, \Phi\, a_m = G_{cp}\, \mathrm{IFFT}(a_m) \tag{2.33}
$$
2.5 Channel Models
The channel models are not discussed in this iteration of the simulator. This approach is followed because the channel is decoupled from the transceiver technology, so the basic OFDM system can be described without considering a channel. Various channel models are developed in the next chapter.
2.6 Basic OFDM Receiver
The block diagram of a basic OFDM receiver, adapted from Figure 2.5, is shown in Figure 2.15. The received discrete signal vector $r_m$ of size $[1 \times N_{gu}]$ contains the transmitted OFDM symbol affected by the channel. Although no particular channel model is considered in this chapter, a placeholder channel that does not distort the transmitted signal is assumed in order to maintain a full view of the generic OFDM system.
Figure 2.15: Basic OFDM receiver
After reception, the cyclic prefix is removed because it is not a part of the useful transmitted symbol. Then, the data symbols originally mapped onto orthogonal sub-carriers in the transmitter (vector a) are demodulated. The received time-domain signal vector r is transformed into a frequency-domain signal vector z. Similar to the techniques described in Section 2.4.3 and shown in Figure 2.13, these operations are performed at the receiver in one of the following ways:
1. Discarding the first $N_g$ OFDM samples of the received signal $r_m$ by multiplying it with a cyclic prefix removal matrix $\tilde{G}_{cp}$. The size of the vector r is decreased from $[1 \times N_{gu}]$ to $[1 \times N_u]$. Then, the signal is converted into parallel form by transposing the vector. Finally, the received signal vector without the cyclic prefix is multiplied with the Hermitian of the orthogonality matrix $\Phi$:

$$
\tilde{G}_{cp} =
\begin{bmatrix}
0_{N_g \times N_{FFT}} \\
I_{N_{FFT} \times N_{FFT}}
\end{bmatrix}
\tag{2.34}
$$

$$
z_m = \Phi^{H} \left( r_m\, \tilde{G}_{cp} \right)^{T} \tag{2.35}
$$

$0$: zero matrix;
$I$: identity matrix;
2. Discarding the first $N_g$ samples of the received signal $r_m$ and then performing an FFT on the result:

$$
z_m = \mathrm{FFT}\left( (r_m\, \tilde{G}_{cp})^{T} \right) \tag{2.36}
$$

This approach has the lowest computational complexity;
3. Discarding the cyclic prefix and performing the FFT at the same time by multiplying the received signal $r_m$ by the Hermitian of the orthogonality matrix $\Phi_{cp}$ that has zeros in its first $N_g$ rows (it shall be called $\tilde{\Phi}^{H}_{cp}$):

$$
\tilde{\Phi}^{H}_{cp} = \Phi^{H}\, \tilde{G}^{T}_{cp} \tag{2.37}
$$

$$
z_m = \tilde{\Phi}^{H}_{cp}\, r_m \tag{2.38}
$$

This approach has a higher computational complexity than the FFT but is written in compact matrix-vector notation. Because of this compact notation, this approach is used in the report.
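A loopback sketch of the third approach is given below in Python. It assumes $\omega_k = 2\pi k / N_u$, $N_s = N_u$, and a distortion-free channel; the division by $N_u$ is an assumed normalization, since the scaling convention of the thesis is not stated in this section. The function names are hypothetical.

```python
import cmath


def phi_cp_matrix(n_u, n_g):
    """Rows i = -n_g .. n_u-1 of exp(j*2*pi*k*i/n_u), as in Equation 2.32."""
    return [[cmath.exp(2j * cmath.pi * k * i / n_u) for k in range(n_u)]
            for i in range(-n_g, n_u)]


def demod_matrix(n_u, n_g):
    """Hermitian of Phi_cp with the cyclic prefix part replaced by zeros."""
    phi_cp = phi_cp_matrix(n_u, n_g)
    return [[0j if i < n_g else phi_cp[i][k].conjugate()
             for i in range(n_u + n_g)]
            for k in range(n_u)]


def matvec(m, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def loopback(a, n_g):
    """Modulate with Phi_cp, demodulate with the zeroed Hermitian, rescale."""
    n_u = len(a)
    s = matvec(phi_cp_matrix(n_u, n_g), a)  # transmitted symbol, length n_u + n_g
    z = matvec(demod_matrix(n_u, n_g), s)   # prefix discarded and demodulated
    return [x / n_u for x in z]             # assumed 1/N_u normalization


a_m = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j, 1 - 1j, 1 + 1j, -1 - 1j, -1 + 1j]
z_m = loopback(a_m, n_g=2)
print(max(abs(x - y) for x, y in zip(a_m, z_m)))
```

The recovered vector matches the transmitted symbols up to rounding error, which shows that one matrix multiplication performs both prefix removal and demodulation.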
After the demodulation, a complex-valued data symbol vector $z_m$ with dimensions $[N_u \times 1]$ is obtained. Depending on the channel model used, the received data symbols $z_m$ may have been affected by the channel. The data symbols have to be restored by using a channel equalization function $F_{eq}$:

$$
y_m = F_{eq}(z_m) \tag{2.39}
$$

This function moves the constellations of the received symbols to their estimated correct positions in the complex plane by canceling the effects of the multipath radio channel. Finally, a hard decision has to be made on the data symbols $y_m$. A demapping scheme corresponding to its counterpart used at the transmitter is applied to extract a serial bit sequence:

$$
x_m = F_{dec}(y_m) \tag{2.40}
$$

$$
b_m = F_{demap}(x_m) \tag{2.41}
$$
2.7 Summary
In this chapter, a basic analytical model of an OFDM transceiver is explained
using matrix vector notation. The model does not take any radio channel into
account.
In the next chapter, a few channel models are presented and analyzed. The
transceiver is placed in an environment represented by the channel and a channel
estimation algorithm is applied at the receiver to recover the corrupted data.
CHAPTER 3
DOWNLINK SIMULATOR
The goal of this chapter is to extend the basic OFDM transceiver from the previous
chapter with radio channel models. Several channel models with an increasing
order of complexity and real environment resemblance are presented.
The channel models are created in order to enhance the knowledge of OFDM
technology that was gained in the previous chapters. The OFDM performance in
several theoretical environments is analyzed through simulations.
A simple Additive White Gaussian Noise (AWGN) channel simulator that does not require any channel estimation is presented first. It is then extended to a multipath slow-fading Rayleigh channel. Finally, the Rayleigh channel is extended to a fast-fading multipath Jakes channel model that includes International Telecommunication Union (ITU) profiles.
A literature survey of various channel estimation algorithms is given. One algorithm is applied in order to explore the effect of the channel on the transmitted signal.
WiMAX is one of the latest applications in which OFDM is applied. The parameters specified in the WiMAX (IEEE 802.16) standard are used for testing the simulator.
WiMAX is a high-throughput broadband wireless technology that provides connections over long distances. It is intended to be used for a number of applications, including "last mile" broadband connections, hotspots and high-speed enterprise connectivity for business [Intel, 2004].
Various parameters of the WiMAX specification are given in [Yaghoobi, 2004, pp. 203]. The parameters that are relevant to this project are shown in Table 3.1.
Parameter                                            Value
Sampling frequency ($F_0$)                           20 MHz
Sample time ($1/F_0$)                                50 ns
FFT size ($N_{FFT} = N_s$)                           2048
Sub-carrier spacing ($\Delta f = F_0 / N_s$)         9.76 kHz
Useful symbol time ($T_u = 1/\Delta f$)              102.4 µs
Guard time ($T_g = T_u / 8$)                         12.8 µs
OFDM symbol length ($T_{gu} = T_u + T_g$)            115.2 µs
Table 3.1: WiMAX parameters
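The derived entries of Table 3.1 follow from the sampling frequency, the FFT size and the guard fraction. The short Python sketch below reproduces them; the function and key names are illustrative, and the number of guard samples and the ratio $T_u / T_{gu}$ are additional derived quantities that are not listed in the table.

```python
def ofdm_parameters(f0_hz=20e6, n_fft=2048, guard_fraction=1 / 8):
    """Derive the timing parameters of Table 3.1 from F_0, N_FFT and T_g/T_u."""
    sample_time = 1 / f0_hz   # 1 / F_0
    spacing = f0_hz / n_fft   # delta f = F_0 / N_s
    t_u = 1 / spacing         # useful symbol time
    t_g = t_u * guard_fraction
    return {
        "sample_time_s": sample_time,
        "subcarrier_spacing_hz": spacing,
        "t_u_s": t_u,
        "t_g_s": t_g,
        "t_gu_s": t_u + t_g,
        "n_g_samples": int(n_fft * guard_fraction),  # not listed in Table 3.1
        "useful_fraction": t_u / (t_u + t_g),        # not listed in Table 3.1
    }


params = ofdm_parameters()
for name, value in params.items():
    print(name, value)
```

With these inputs the sub-carrier spacing evaluates to 9765.625 Hz, and the useful part occupies 8/9 of the transmitted symbol time.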
3.1 Additive White Gaussian Noise Channel Simulator
The characteristics white, additive and Gaussian are most often used to model noise in a communication system. Since zero-mean Gaussian noise is completely characterized by its variance, this model is particularly simple to use in the detection of signals [Sklar, 2001, pp. 33]. For the sake of analytical tractability, the problem of analyzing a complex channel model is simplified by conditioning it with an elementary channel affected only by Additive White Gaussian Noise (AWGN). Unconditioning is performed in order to approach reality in further models.
3.1.1 Definitions
A random signal with known statistical properties of amplitude, distribution, and spectral density, whose frequency spectrum is continuous and uniform over a specified frequency band, is called white noise [Haykin, 2001]. For a specified bandwidth consisting of a continuous frequency spectrum, the total power in the specified bandwidth divided by the specified bandwidth is termed the spectral density [Haykin, 2001]. It is usually expressed in Watt per Hertz [Telecom Glossary, 2000]. The power spectral density of white noise is independent of the operating frequency and is given according to [Haykin, 2001] as:
$$
\Phi_w(f) = \frac{N_0}{2} \tag{3.1}
$$

$\Phi_w(f)$: power spectral density of white noise;
$N_0$ is expressed in Watt per Hertz [W/Hz]. The parameter $N_0$ is usually referenced to the input stage of the receiver of a communication system [Haykin, 2001, pp. 61].
The auto-correlation function of the white noise is

$$
R_w(\tau) = \frac{N_0}{2}\, \delta(\tau) \tag{3.2}
$$

$R_w(\tau)$: auto-correlation function of white noise;
The delta function in Equation 3.2 means that the noise signal $w(t)$ is totally uncorrelated with its time-shifted version for any $\tau > 0$, or that any two different samples of a white noise process are uncorrelated. A random variable $R$ has a Gaussian distribution function if its probability density has the form [Haykin, 2001, pp. 54]:

$$
\mathrm{PDF}(R(r)) = \frac{1}{\sqrt{2\pi}\, \sigma_R} \exp\left( -\frac{(r - \mu_R)^2}{2 \sigma_R^2} \right) \tag{3.3}
$$

$\mu_R$: mean of the random variable $R$;
$\sigma_R^2$: variance of $R$;
3.1.2 Assumptions
• The simulator uses AWGN as the only source of degradation to the transmitted signal;
• All sub-carriers experience the same attenuation on their way to the receiver;
• The transmitter and the receiver are perfectly synchronized.
3.1.3 Channel Model
Since the noise in this iteration of the OFDM simulator is a Gaussian process and the samples are uncorrelated, the noise samples are also independent. Such a channel is called a memoryless channel because the AWGN affects each transmitted symbol independently. The term "additive" means that the noise is simply added to the transmitted signal and that there are no multiplicative mechanisms involved [Sklar, 2001, pp. 33].
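A memoryless AWGN channel of this kind can be sketched in a few lines of Python. The sketch assumes unit-energy symbols, so that $E_b = 1 / \text{bits per symbol}$, and draws independent Gaussian samples with variance $N_0/2$ for the real and imaginary parts; the function name and its parameters are hypothetical and not taken from the simulator.

```python
import math
import random


def awgn(symbols, ebn0_db, bits_per_symbol=2, rng=None):
    """Add independent complex Gaussian noise to unit-energy symbols."""
    rng = rng or random.Random()
    ebn0 = 10 ** (ebn0_db / 10)
    n0 = 1 / (bits_per_symbol * ebn0)  # assumes E_s = 1, E_b = E_s / bits
    sigma = math.sqrt(n0 / 2)          # N_0 / 2 per real dimension
    return [s + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
            for s in symbols]


rng = random.Random(1)
tx = [1 + 0j] * 20000
rx = awgn(tx, ebn0_db=0.0, rng=rng)
noise_power = sum(abs(r - t) ** 2 for r, t in zip(rx, tx)) / len(tx)
print(noise_power)  # close to N_0 = 0.5 for QPSK at 0 dB
```

Each output sample depends only on the corresponding input sample and on its own noise draw, which is what makes the channel memoryless.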
3.1.4 Channel Estimation
No channel estimation is considered for the AWGN channel simulator. The channel estimator is transparent in the sense that it has no effect on the signal it receives.
3.1.5 Theoretical Performance
One of the most important performance metrics of digital communication systems is a plot of the bit-error probability ($P_b$) versus $E_b/N_0$ [Sklar, 2001]. The data arriving at the receiver is corrupted by noise during transmission, which results in errors when the transmitted data is estimated. The smaller the required $E_b/N_0$, the more efficient the detection process is for a given probability of error [Sklar, 2001]. The theoretical BER curves are generated according to the formulae given in [Proakis, 2001, pp. 256]. The practical curves are plotted against the theoretical curves as described in [Proakis, 2001, pp. 260].

$$
P_b = Q\left( \sqrt{\frac{2 E_b}{N_0}} \right) \tag{3.4}
$$

$E_b/N_0$: signal-to-noise ratio per bit;
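Equation 3.4 can be evaluated with the standard library by using the identity $Q(x) = \frac{1}{2}\,\mathrm{erfc}(x / \sqrt{2})$. The following Python sketch is one possible way to generate the theoretical curve; the function names are illustrative.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def qpsk_ber_awgn(ebn0_db):
    """Equation 3.4: P_b = Q(sqrt(2 * E_b / N_0)) with E_b/N_0 given in dB."""
    ebn0 = 10 ** (ebn0_db / 10)
    return q_function(math.sqrt(2 * ebn0))


for ebn0_db in range(0, 11, 2):
    print(ebn0_db, qpsk_ber_awgn(ebn0_db))
```

At $E_b/N_0 = 0$ dB the expression gives $P_b \approx 0.0786$, and the value decreases monotonically as $E_b/N_0$ grows.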
3.1.6 Results
The simulated BER curve matches the theoretical curve, as shown in Figure 3.1. The basic OFDM transceiver simulator therefore produces the theoretically expected results.
Figure 3.1: BER performance in AWGN channel
3.2 Rayleigh Channel Simulator
A slow-fading Rayleigh channel model is chosen as the extension of the simple AWGN channel model of the previous section. This process of extending is what has been referred to as unconditioning [Jeruchim et al, 2000, pp. 16] in Section 3.1. The Rayleigh channel is selected because it is commonly used to describe the statistical time-varying multipath component of a channel [Hara, Prasad, 2003]. This model introduces a multipath channel and a channel estimator (Figure 3.2). The system model is more realistic than the AWGN channel model because it simulates the multipath effects that are inevitable in real environments.
3.2.1 Definitions
According to [Hara, Prasad, 2003, pp. 13-18], multipath fading is due to multipath reflections of a transmitted wave by local scatterers such as houses, buildings and other man-made structures, or natural objects such as forest surrounding a mobile unit. Figure 3.3 shows a typical multipath fading channel with L paths.
Before dealing further with multipath channels, it is necessary to understand the parameters of multipath channels and the various types of fading. The channel can be characterized by its channel impulse response (CIR) $h(t, \tau)$. The variable $t$ represents the time variations occurring because of the movement of the receiver (assuming that the transmitter does not move). The channel multipath delay for a fixed value of $t$ is represented by $\tau$. $h(t, \tau)$ is shown in Figure 3.4 [Rappaport, 1996].
Figure 3.3: Multipath fading channel
Figure 3.4: Impulse response
The delay axis of the impulse response can be divided into equal time delay segments called excess delay bins, where each bin has a time delay width equal to $\Delta\tau = \tau_{i+1} - \tau_i$. The first signal at the receiver arrives at a relative delay $\tau_0 = 0$, as the propagation delay between the transmitter and the receiver is neglected. A single resolvable multipath component with delay $\tau_i$ represents all the multipath signals received within the $i$-th bin. The total number of equally spaced multipath components, $N_{comp}$, corresponds to the normalized maximum excess delay of the channel, $N_{ds} = \tau_{max} / T_s$ [Rappaport, 1996]. The excess delay $\tau_i$ is the relative delay of the $i$-th multipath component as compared to the first arriving component. At some time $t$ and delay $\tau_i$, there might be no multipath components in some excess delay bins, as the reflections of the signal occur randomly. A power delay profile is useful in obtaining the excess delay spread. It is calculated by taking the spatial average of $|h(t, \tau)|^2$, i.e., by averaging the values of $|h(t, \tau)|^2$ at the same $\tau_i$ for different values of $t$.
A channel that passes all spectral components with approximately equal gain and linear phase is a flat channel. The coherence bandwidth, $(\Delta f)_c$, is a statistical measure of the range of frequencies over which the channel can be considered flat. It is also described as the range of frequencies over which two frequency components have a strong potential for correlation in amplitude. Two sub-carriers whose frequency separation is greater than $(\Delta f)_c$ are affected differently by the channel.
Delay spread and coherence bandwidth describe the time-dispersive ($\tau$) nature of the channel. The time-varying ($t$) nature of the channel is described by the Doppler spread and the coherence time. When a pure sinusoid of frequency $f$ is transmitted, the received signal spectrum, called the Doppler spectrum, has components in the range from $(f - f_d)$ to $(f + f_d)$, where $f_d$ is the Doppler shift, which depends on the velocity of the receiver and the angle between the transmitter and the receiver [Rappaport, 1996, pp. 141]. The effects of Doppler spread are negligible at the receiver if the bandwidth of the baseband transmitted signal is much greater than the Doppler spread.
The coherence time, $(\Delta t)_c$, is the time duration over which two received signals have a strong correlation. If the time duration of the baseband signal is greater than $(\Delta t)_c$, the channel changes during the transmission of the baseband message, thus causing distortion at the receiver.
Different transmitted signals undergo different types of fading depending on the relation between the signal parameters (bandwidth, symbol period, etc.) and the channel parameters (delay spread, Doppler spread, etc.). There are four possible effects, which are exhibited depending on the nature of the transmitted signal, the channel, and the velocity of the mobile unit. The multipath delay spread leads to time dispersion and frequency-selective fading. The Doppler spread leads to frequency dispersion and time-selective fading. These two propagation mechanisms are independent of one another.
Time dispersion due to multipath causes either flat fading or frequency-selective fading. The time duration of the transmitted signal in a flat fading channel is larger than the multipath time delay spread of the channel. Flat fading channels are sometimes referred to as amplitude varying channels, frequency non-selective channels or narrowband channels, since the bandwidth of the transmitted signal is narrow compared to the bandwidth of the flat fading channel. The instantaneous amplitude distribution of flat fading channels is commonly considered to be Rayleigh distributed [Rappaport, 1996, pp. 169]. Thus, a Rayleigh flat fading channel model assumes that the channel induces an amplitude which varies in time ($t$) according to the Rayleigh distribution.
The PDFs of the envelope $\rho$ and the phase $\theta$ of the Rayleigh distribution are given by

$$
p(\rho) = \frac{\rho}{\sigma_r^2}\, e^{-\frac{\rho^2}{2 \sigma_r^2}}, \qquad (\rho > 0) \tag{3.5}
$$

$$
p(\theta) = \frac{1}{2\pi}, \qquad (0 \le \theta < 2\pi) \tag{3.6}
$$

$p(\rho)$ and $p(\theta)$ are statistically independent.
Frequency-selective fading occurs in the received signal if the channel possesses a constant gain and linear phase response over a bandwidth that is smaller than the bandwidth of the transmitted signal. In frequency-selective fading, the multipath delay spread of the channel impulse response is greater than the time duration of the transmitted signal, and the received signal contains multiple versions of the transmitted signal. These multiple versions are attenuated and delayed in time and distort the received signal. Certain frequency components in the received signal spectrum have greater gains than others. Frequency-selective fading channels are also called wideband channels [Rappaport, 1996, pp. 169], since the bandwidth of the transmitted signal is wider than the bandwidth of the channel impulse response.
If the channel impulse response changes rapidly within the symbol duration, i.e., the coherence time of the channel is smaller than the time duration of the transmitted symbol, the channel is a fast fading channel. Conversely, if the channel response changes at a slower rate than the transmitted symbol, the channel is a slow fading channel. Whether the channel is fast or slow fading is determined by the bandwidth of the transmitted signal and by the velocity of the mobile unit, which gives the Doppler spread.
The received signal after passing through the multipath fading channel is $r(t)$ and is described by Equation 3.7.
$$
r(t) = \sum_{l=0}^{L-1} p_l(t)\, e^{-j 2\pi f_k \tau_l(t)} \tag{3.7}
$$

$$
= \sum_{l=0}^{L-1} \beta_l(t) \tag{3.8}
$$

$\beta_l(t)$: complex-valued stochastic process;
3.2.2 Assumptions
The following assumptions are made for this system model:
• The channel is slow fading and frequency non-selective;
• There is perfect synchronization between the transmitter and the receiver;
• Perfect channel estimation exists.
3.2.3 Channel Model
The modeled radio channel distorts the transmitted signal for two reasons: multipath effects and white Gaussian noise. As the channel is slow fading, the coherence time is longer than the OFDM symbol time $T_{gu}$; in other words, the channel impulse response can be considered constant during one OFDM symbol duration. An account of the impulse response characteristics has been given in Section 3.2.1. According to [Jeruchim et al, 1992, pp. 374], the effects of a channel on a sent signal may be described as:

$$
r(n) = \sum_{l=0}^{L-1} p_l(n)\, s(n - \tau_l(n)) \tag{3.9}
$$
A simulated radio channel has $L$ paths. The first of them appears at time delay $\tau = 0$, and each path is associated with a complex attenuation coefficient $p_l$ and a time delay $\tau_l$.
Tests are performed with the multipath delay spread being both smaller than the cyclic prefix length $N_g$ (no ISI expected) and longer (to investigate the effects of ISI). According to [Jeruchim et al, 1992, pp. 375], the complex low-pass impulse response of the multipath channel can be described as:

$$
h(\tau; n) = \sum_{l=0}^{L-1} p_l(n)\, \delta(\tau - \tau_l(n)) \tag{3.10}
$$
The effect of the channel on the transmitted signal can be expressed in matrix-vector notation as multiplication of the matrix $H_m$ with the transmitted signal vector. The matrix $H_m$ has $N_{gu}$ columns and each column contains a delayed channel impulse response of length $N_{ds}$. If the channel is approximated as slow-fading, i.e., the channel impulse response $h$ is considered to be constant during one OFDM symbol duration, the channel matrix $H_m$ is written as:

$$
H_m =
\begin{bmatrix}
h_m[0] & 0 & \cdots & 0 \\
h_m[1] & h_m[0] & \cdots & 0 \\
h_m[2] & h_m[1] & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
h_m[N_{ds}-1] & h_m[N_{ds}-2] & \cdots & 0 \\
0 & h_m[N_{ds}-1] & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & h_m[0] \\
\vdots & \vdots & & \vdots \\
0 & \cdots & h_m[N_{ds}-1] & h_m[N_{ds}-2] \\
0 & \cdots & 0 & h_m[N_{ds}-1]
\end{bmatrix}
\tag{3.11}
$$
If the channel is modeled as fast-fading, the impulse response changes from one OFDM sample to another.
The dimensions of $H$ are $[(N_{ds} + N_{gu} - 1) \times N_{gu}]$. Convolution of the transmitted signal with the channel is performed by multiplying the matrix with the signal vector:

$$
r_m = H_m\, s_m + w_m \tag{3.12}
$$
The impulse response matrix $H_m$ can be divided into several sub-matrices $H_m[0], H_m[1], \ldots$ of size $[N_{gu} \times N_{gu}]$. For example, the sub-matrix $H_m[0]$ is shown between the horizontal lines in Equation 3.11:

$$
H_m =
\begin{bmatrix}
H_m[0] \\
H_m[1] \\
\vdots
\end{bmatrix}
\tag{3.13}
$$
The number of sub-matrices in Equation 3.13 depends on the impulse response length $N_{ds}$. $H_m[0]$ corresponds to the multipath channel effects on the current transmitted symbol $s_m$. The ISI term for the $(m+1)$-th received OFDM symbol can be written as $H_m[1]\, s_m$.
The received OFDM symbol $r_m$ contains the current transmitted symbol $s_m$ convolved with $H_m[0]$, possible ISI from the previous transmitted symbols, and AWGN:

$$
r_m = \left( \sum_{i=0}^{\left\lceil \frac{N_{ds} + N_{gu} - 1}{N_{gu}} \right\rceil - 1} H_{m-i}[i]\, s_{m-i} \right) + w_m \tag{3.14}
$$
When the multipath delay spread $N_{ds}$ is shorter than the cyclic prefix length $N_g$, the ISI effect can be neglected and the received signal is written as:

$$
r_m = H_m[0]\, s_m + w_m \tag{3.15}
$$

$$
= H_m[0]\, \Phi_{cp}\, a_m + w_m \tag{3.16}
$$
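The structure of the channel matrix and of its sub-matrices can be illustrated with a short noise-free Python sketch. The three-tap impulse response and the symbol values are assumptions chosen for illustration, and the function names are hypothetical. The sketch builds the convolution matrix of Equation 3.11, splits it as in Equation 3.13, and checks that the samples in the time window of the current symbol equal $H[0]\, s_m + H[1]\, s_{m-1}$.

```python
def channel_matrix(h, n_gu):
    """[(len(h) + n_gu - 1) x n_gu] matrix whose columns are delayed copies of h."""
    rows = len(h) + n_gu - 1
    return [[h[r - c] if 0 <= r - c < len(h) else 0.0 for c in range(n_gu)]
            for r in range(rows)]


def split_blocks(matrix, n_gu):
    """Cut H into [n_gu x n_gu] sub-matrices, zero-padding the last one."""
    blocks = []
    for start in range(0, len(matrix), n_gu):
        block = matrix[start:start + n_gu]
        block = block + [[0.0] * n_gu for _ in range(n_gu - len(block))]
        blocks.append(block)
    return blocks


def matvec(m, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def linear_convolve(x, h):
    """Linear convolution of the sequences x and h."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


h = [1.0, 0.5, 0.25]            # assumed slow-fading impulse response
s_prev = [1.0, 2.0, 3.0, 4.0]   # previous symbol s_{m-1}
s_cur = [5.0, 6.0, 7.0, 8.0]    # current symbol s_m
n_gu = len(s_cur)

H = channel_matrix(h, n_gu)
H0, H1 = split_blocks(H, n_gu)

stream = linear_convolve(s_prev + s_cur, h)
window = stream[n_gu:2 * n_gu]  # samples received during the current symbol
model = [a + b for a, b in zip(matvec(H0, s_cur), matvec(H1, s_prev))]
print(window)
print(model)
```

The block `H1` is non-zero only in its first rows, so the previous symbol disturbs only the first samples of the current one; these are the samples covered by the cyclic prefix when it is long enough.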
3.2.4 Channel Estimation
If the multipath delay spread is longer than the guard period length, but not longer than $2N_{gu}$, the current received OFDM symbol is corrupted by one previous symbol:

$$
r_m = H_m[0]\, s_m + H_{m-1}[1]\, s_{m-1} + w_m \tag{3.17}
$$
At the receiver, the cyclic prefix is removed from the received signal and an FFT is performed. Both the FFT operation and the cyclic prefix removal can be replaced by multiplication of the received signal by the orthogonality matrix $\tilde{\Phi}^{H}_{cp}$, i.e., the Hermitian of $\Phi_{cp}$ with zeros instead of the cyclic prefix elements, as described in Section 2.6.

$$
z_m = \tilde{\Phi}^{H}_{cp}\, r_m \tag{3.18}
$$

$$
= \underbrace{\tilde{\Phi}^{H}_{cp}\, H_m[0]\, \Phi_{cp}}_{C_m[0]}\, a_m + \underbrace{\tilde{\Phi}^{H}_{cp}\, H_{m-1}[1]\, \Phi_{cp}}_{C_m[1]}\, a_{m-1} \tag{3.19}
$$

$$
+\; \tilde{\Phi}^{H}_{cp}\, w_m \tag{3.20}
$$
According to [Kim et al, 1999], the variance of AWGN does not change after the FFT. Therefore, the noise component is written as $w_m$ instead of $\Theta_{cp}^{H} w_m$ in Equation 3.21. The transmitted and received OFDM symbols are related by the diagonal matrix $C_m[0]$ containing the channel effects. $C_m[1]\, a_{m-1}$ is the inter-symbol interference. If the multipath delay spread is shorter than the guard period length, $H_{m-1}[1]$ contains only zeros and the ISI term can be ignored.
Finally, the received OFDM symbols after the FFT operation, as depicted in Figure 2.15, can be expressed as:
\[ z_m = C_m[0]\, a_m + C_m[1]\, a_{m-1} + w_m \tag{3.21} \]
The $C$ matrices contain the channel effects; therefore, successful recovery of the transmitted data symbols relies on correct estimation of $C_m[0]$. The issues of OFDM channel estimation are presented in Section 3.4.
For the purpose of verifying the Rayleigh multipath channel model, perfect channel estimation is assumed, i.e., the channel estimation matrix $\hat{H}_m[0]$ is known. As the normalized channel delay spread is also assumed to be less than the cyclic prefix length, $\hat{H}_m[1]$ contains only zeros and the sent data symbols can be extracted:
\[ \hat{C}_m[0] = \Theta_{cp}^{H}\, \hat{H}_m[0] \tag{3.22} \]
\[ y_m = \operatorname{diag}\!\left( \hat{C}_m[0] \right)^{H} z_m \tag{3.23} \]
3.2.5 Theoretical Performance
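According to [Proakis, 2001, p. 831], the bit error rate for QPSK is:
\[ P_b = \frac{1}{2} \left[ 1 - \frac{\mu}{\sqrt{2 - \mu^2}} \sum_{l=0}^{L-1} \binom{2l}{l} \left( \frac{1 - \mu^2}{4 - 2\mu^2} \right)^{l} \right] \tag{3.24} \]
\[ \mu = \sqrt{\frac{\bar{\gamma}_c}{1 + \bar{\gamma}_c}} \tag{3.25} \]
where $\bar{\gamma}_c$ is the average received SNR per channel.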
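Equations 3.24 and 3.25 can be evaluated numerically with a short sketch such as the one below. The function name is hypothetical, and $L$ is assumed here to denote the number of independent fading paths (the diversity order), which this excerpt does not define explicitly.

```python
import math


def qpsk_rayleigh_ber(gamma_c, L=1):
    """Evaluate Eq. 3.24 for a linear (not dB) average SNR per channel gamma_c."""
    mu = math.sqrt(gamma_c / (1.0 + gamma_c))                  # Eq. 3.25
    ratio = (1.0 - mu ** 2) / (4.0 - 2.0 * mu ** 2)
    series = sum(math.comb(2 * l, l) * ratio ** l for l in range(L))
    return 0.5 * (1.0 - mu / math.sqrt(2.0 - mu ** 2) * series)


snr_db = 10.0                              # example operating point
gamma_c = 10.0 ** (snr_db / 10.0)          # dB to linear
ber_one_path = qpsk_rayleigh_ber(gamma_c, L=1)
ber_two_paths = qpsk_rayleigh_ber(gamma_c, L=2)
```

For $L = 1$ the expression reduces to $\frac{1}{2}\left[1 - \sqrt{\bar{\gamma}_c / (2 + \bar{\gamma}_c)}\right]$, which gives a quick sanity check of the implementation.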
3.2.6 Results
Figure 3.5 shows the bit error performance of the model in comparison to the theoretical performance given by Equation 3.24. The theoretical AWGN BER curve from Figure 3.1 is also included. The simulations are performed with one, two and three Rayleigh fading paths. The maximum excess delay of the channel is less than the cyclic prefix length. The plot shows that the simulated BER curves correspond to the theoretical estimates. However, the Rayleigh BER curve differs from the AWGN curve. The transceiver BER performance in a Rayleigh fading channel is poorer than in an AWGN channel, because the symbol energy becomes scattered in time when passing through a multipath environment. The effect of scattering can be mitigated by methods that gather the dispersed symbol energy back into its original time interval.
Figure 3.5: Slow fading Rayleigh channel BER performance
A second simulation is performed with two paths and the SNR fixed at 10 dB. The arrival delay of the first path is fixed at zero ($\tau_0 = 0$). The delay of the second path, $\tau_1$, is varied from 1 to $2N_g$ OFDM samples. It can be seen in Figure 3.6 that the OFDM performance does not suffer when the maximum excess delay is less than the cyclic prefix length $N_g$.
Figure 3.6: BER performance with fixed SNR and varying second path delay
The performance of different symbol and cyclic prefix lengths is compared. Figure 3.7 shows the BER performance curves of the basic OFDM system with a 100 µs symbol length. Two values, $T_u/8$ and 0, are considered for the cyclic prefix length. The theoretical BER performance is also shown on the plot. The ITU Vehicular B and ITU Pedestrian B channel profiles (Tables 3.8 and 3.9) are used for this simulation (channel profiles are discussed in Section 3.3.6). The channel model used for these simulations differs from the model described in Section 3.2 because the paths have fixed delays and different variances.
The maximum delay spread of the ITU Vehicular B channel profile is 20 µs, which exceeds the cyclic prefix of 12 µs. Figure 3.7 shows that the BER performance is worse than the theoretical limit. The ITU Pedestrian B channel profile has a maximum delay spread of 3.7 µs. It can be seen that, for the same cyclic prefix length, the BER performance deteriorates as the maximum delay spread increases.
Figure 3.8 shows the results of a simulation with a 30 µs OFDM symbol length. As expected, the shorter symbol length results in higher inter-symbol interference and a higher bit error rate.
Figure 3.7: Rayleigh channel BER performance, $T_u = 100$ µs
Figure 3.8: Rayleigh channel BER performance, $T_u = 30$ µs
3.3 Jakes Channel Model
The channel models analyzed in the previous sections are the Additive White Gaussian Noise (AWGN) and Rayleigh multipath channels. The AWGN channel model has the drawback of not considering any multipath effects on the transmitted signal.
The Rayleigh fading channel model is only suitable for simulating slow-fading
channels because the channel impulse response is calculated for each OFDM sym-
bol independently. The individual channel samples are not correlated in time. The
next step is to introduce time correlation to the Rayleigh channel model.
Historically, the Jakes model has been used for modeling a Rayleigh fading channel. The Jakes simulator models the low-pass envelope of a stationary (flat) frequency non-selective (see Section 3.2.1) mobile fading channel under isotropic scattering conditions [Rappaport, 1996]. The condition in which the transmitted energy arrives equally distributed over all possible spatial angles, with uniformly distributed phases, is called the isotropic condition [Øien, 2003]. An approximate analytical model for such a channel is a zero-mean complex Gaussian noise process with uncorrelated inphase and quadrature components. The Jakes model allows an effective approximation of the desired analytical model by using a finite number of low-frequency oscillators [Pätzold, Laue, 1998].
A reference model gives theoretical performance of the Jakes model and al-
lows performance evaluation of the practical Jakes simulator. Next, a practical
approach to Jakes simulator is given together with its statistical properties. In
order to use the Jakes simulator for multipath frequency-selective channel mod-
eling, a single-path case is considered and extended to multipath.
3.3.1 Reference Model
The complex low-pass Rayleigh envelope for the frequency non-selective (single path) Jakes reference model is [Xiao et al, 2002]:
\begin{align}
g(t) &= g_1(t) + j\, g_2(t) \tag{3.26} \\
g_1(t) &= \sqrt{\frac{2}{N_{osc}}} \sum_{n=1}^{N_{osc}} \cos(2\pi f_d t \cos\alpha_n + \phi_n) \tag{3.27} \\
g_2(t) &= \sqrt{\frac{2}{N_{osc}}} \sum_{n=1}^{N_{osc}} \sin(2\pi f_d t \cos\alpha_n + \phi_n) \tag{3.28}
\end{align}
For large $N_{osc}$, the central limit theorem justifies approximating $g_1(t)$ and $g_2(t)$ as Gaussian random processes, assuming that $\alpha_n$ and $\phi_n$ are mutually independent and uniformly distributed over $[-\pi, \pi]$ for each oscillator [Xiao et al, 2002].
The reference model gives the principle of generating a Rayleigh process by
using two banks of oscillators.
Second-order statistics, namely auto-correlation and cross-correlation func-
tions, are useful for analyzing the correlation properties of the inphase and quadra-
ture components. They are given as, [Xiao et al, 2002]:
\begin{align}
R_{g_1 g_1}(\tau) &= E[g_1(t)\, g_1(t + \tau)] = J_0(2\pi f_d \tau) \tag{3.29} \\
R_{g_2 g_2}(\tau) &= E[g_2(t)\, g_2(t + \tau)] = J_0(2\pi f_d \tau) \tag{3.30} \\
R_{g_2 g_1}(\tau) &= E[g_2(t)\, g_1(t + \tau)] = 0 \tag{3.31} \\
R_{g_1 g_2}(\tau) &= E[g_1(t)\, g_2(t + \tau)] = 0 \tag{3.32}
\end{align}
3.3.2 Jakes Model
It is possible to reduce the computational complexity of the reference model. Ac-
cording to [Xiao et al, 2002] and [Pop, Beaulieu, 2002], the number of oscillators
in each bank can be reduced. The complex low-pass envelope g is given by the
Jakes model as:
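\begin{align}
g(t) &= g_1(t) + j\, g_2(t) \tag{3.33} \\
g_1(t) &= \frac{2}{\sqrt{N_{osc}}} \sum_{n=1}^{M+1} u_n \cos(\omega_n t + \phi_n) \tag{3.34} \\
g_2(t) &= \frac{2}{\sqrt{N_{osc}}} \sum_{n=1}^{M+1} v_n \sin(\omega_n t + \phi_n) \tag{3.35}
\end{align}
where
\[ N_{osc} = 4M + 2 \tag{3.36} \]
\[ u_n = \begin{cases} 2\cos\beta_n, & n = 1, 2, \ldots, M \\ \sqrt{2}\cos\beta_n, & n = M + 1 \end{cases} \tag{3.37} \]
\[ v_n = \begin{cases} 2\sin\beta_n, & n = 1, 2, \ldots, M \\ \sqrt{2}\sin\beta_n, & n = M + 1 \end{cases} \tag{3.38} \]
\[ \beta_n = \begin{cases} \dfrac{\pi n}{M}, & n = 1, 2, \ldots, M \\ \dfrac{\pi}{4}, & n = M + 1 \end{cases} \tag{3.39} \]
\[ \omega_n = \begin{cases} \omega_d \cos\dfrac{2\pi n}{N_{osc}}, & n = 1, 2, \ldots, M \\ \omega_d, & n = M + 1 \end{cases} \tag{3.40} \]
\[ \omega_d = 2\pi f_d \tag{3.41} \]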
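Equations 3.33 to 3.41 translate almost directly into code. The following is a minimal floating-point sketch, not the simulator implementation itself; the function names, the choice $M = 8$ and the random seed are assumptions made for illustration.

```python
import math
import random


def jakes_coefficients(M, f_d):
    """Oscillator gains and frequencies of Eqs. 3.36 to 3.41."""
    n_osc = 4 * M + 2                                                   # Eq. 3.36
    w_d = 2.0 * math.pi * f_d                                           # Eq. 3.41
    beta = [math.pi * n / M for n in range(1, M + 1)] + [math.pi / 4]   # Eq. 3.39
    u = [2.0 * math.cos(b) for b in beta[:M]] + [math.sqrt(2.0) * math.cos(beta[M])]
    v = [2.0 * math.sin(b) for b in beta[:M]] + [math.sqrt(2.0) * math.sin(beta[M])]
    omega = [w_d * math.cos(2.0 * math.pi * n / n_osc) for n in range(1, M + 1)] + [w_d]
    return n_osc, u, v, omega


def jakes_sample(t, n_osc, u, v, omega, phi):
    """Complex envelope g(t) = g1(t) + j g2(t) of Eqs. 3.33 to 3.35."""
    scale = 2.0 / math.sqrt(n_osc)
    g1 = scale * sum(un * math.cos(wn * t + pn) for un, wn, pn in zip(u, omega, phi))
    g2 = scale * sum(vn * math.sin(wn * t + pn) for vn, wn, pn in zip(v, omega, phi))
    return complex(g1, g2)


M, f_d, T_s = 8, 400.0, 50e-9              # example values; M = 8 is arbitrary
n_osc, u, v, omega = jakes_coefficients(M, f_d)
rng = random.Random(1)                     # fixed seed for repeatability
phi = [rng.uniform(-math.pi, math.pi) for _ in range(M + 1)]
envelope = [abs(jakes_sample(k * T_s, n_osc, u, v, omega, phi)) for k in range(1000)]
```

A different set of random phases `phi` yields a different, weakly correlated realization of the process.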
The Jakes model generates the $g_1$ and $g_2$ components by two independent banks of cosine wave generators. In the reference model, the means of $g_1$ and $g_2$ are zero and the variance of $g$ is equal to one. In principle, the two banks of oscillators should generate two uncorrelated zero-mean real Gaussian noise processes with identical variances. However, this is not the case, as each bank with a limited number of oscillators generates colored noise. The noise generated by one bank is correlated with the noise generated by the other bank. Second-order statistics for the Jakes model are given by:
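\begin{align}
R_{g_1 g_1}(\tau) &= \frac{4}{N_{osc}} \left[ \sum_{n=1}^{M+1} \frac{u_n^2}{2} \cos(\omega_n \tau) \right] \tag{3.42} \\
R_{g_2 g_2}(\tau) &= \frac{4}{N_{osc}} \left[ \sum_{n=1}^{M+1} \frac{v_n^2}{2} \cos(\omega_n \tau) \right] \tag{3.43} \\
R_{g_1 g_2}(\tau) &= E[g_2(t)\, g_1(t + \tau)] = \frac{4}{N_{osc}} \left[ \sum_{n=1}^{M+1} \frac{u_n v_n}{2} \cos(\omega_n \tau) \right] \tag{3.44} \\
R_{g_2 g_1}(\tau) &= R_{g_1 g_2}(\tau) \tag{3.45}
\end{align}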
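As a numerical illustration, the sums in Equations 3.42 to 3.44 can be evaluated as printed. The sketch below uses hypothetical function names and an example value $M = 20$. At zero lag both auto-correlations equal one, while the cross-correlation equals $2/N_{osc}$, which is small but not zero and matches the remark that the two oscillator banks are correlated.

```python
import math


def jakes_coefficients(M, f_d):
    """Oscillator gains and frequencies of Eqs. 3.36 to 3.41."""
    n_osc = 4 * M + 2
    w_d = 2.0 * math.pi * f_d
    beta = [math.pi * n / M for n in range(1, M + 1)] + [math.pi / 4]
    u = [2.0 * math.cos(b) for b in beta[:M]] + [math.sqrt(2.0) * math.cos(beta[M])]
    v = [2.0 * math.sin(b) for b in beta[:M]] + [math.sqrt(2.0) * math.sin(beta[M])]
    omega = [w_d * math.cos(2.0 * math.pi * n / n_osc) for n in range(1, M + 1)] + [w_d]
    return n_osc, u, v, omega


def theoretical_correlations(tau, M, f_d):
    """Return (R_g1g1, R_g2g2, R_g1g2) at lag tau, following Eqs. 3.42 to 3.44 as printed."""
    n_osc, u, v, omega = jakes_coefficients(M, f_d)
    scale = 4.0 / n_osc
    r11 = scale * sum(un * un / 2.0 * math.cos(wn * tau) for un, wn in zip(u, omega))
    r22 = scale * sum(vn * vn / 2.0 * math.cos(wn * tau) for vn, wn in zip(v, omega))
    r12 = scale * sum(un * vn / 2.0 * math.cos(wn * tau) for un, vn, wn in zip(u, v, omega))
    return r11, r22, r12


r11_0, r22_0, r12_0 = theoretical_correlations(0.0, M=20, f_d=400.0)
```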
3.3.3 Jakes Multipath
The Jakes fading model is suitable for simulating a flat fading, i.e., single-path, channel [Li, Guan, 2000]. For simulating a frequency-selective channel, it has to be extended to multipath. The wide-sense stationary uncorrelated scattering (WSSUS) channel is a commonly employed model for the multipath channel [Sadowsky, Kafedziski 1998]. The WSSUS model for a multipath channel includes the variations in both $t$ and $\tau$ (see Section 3.2.1). The time-varying nature of the channel is modeled as a wide-sense stationary (WSS) process. The attenuation and phase shift associated with different delays are modeled with an uncorrelated scattering assumption [Jeruchim et al, 1992].
The extension to multipath can be achieved in a few different ways. A method of assigning different arrival angles to the paths and applying orthogonal weighting functions is proposed in [Dent et al, 1993]. Another possible approach is to use the theoretical correlation function to find a time offset after which the auto-correlation of the process reaches a negligible value. The different paths could then be produced by the same Jakes model shifted by this time offset.
A Jakes model with random phases of the oscillators is analyzed in
[Xiao et al, 2002] and is used in this project.
In this case, each path is modeled as a Jakes fading process with random
phases assigned to its low-frequency oscillators. All the paths are low-correlated
because of the random phases.
3.3.4 Approaches for Implementation
There are different approaches to implementing the multipath Jakes channel model. One possibility is to pre-generate the sequences of $g$ (one sequence for each multipath component, or multipath can be achieved by using a single delayed process) and reuse them later in the simulations. However, this approach implies that the channel is identical throughout the simulations. Moreover, long sequences of the Jakes process require a lot of storage space. For example, the complex Jakes coefficients for one second of channel data ($T_s = 50$ ns, single path) occupy 320 megabytes.
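The storage figure can be checked with a few lines of arithmetic. The sketch assumes that each complex coefficient is stored as two 8-byte double-precision values, which the text does not state explicitly but which reproduces the quoted 320 megabytes.

```python
T_S_NS = 50                      # sample period in nanoseconds (from the text)
BYTES_PER_COMPLEX_SAMPLE = 16    # assumption: two 8-byte doubles per coefficient

samples_per_second = 10**9 // T_S_NS                    # 20 million samples
bytes_per_second = samples_per_second * BYTES_PER_COMPLEX_SAMPLE
megabytes_per_second = bytes_per_second / 10**6         # per path, per second
```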
It was therefore decided to investigate the possibility of calculating the $g$ function for every transmitted OFDM sample on the fly.
MATLAB code profiling was performed because the initial simulator code was extremely slow. Profiling showed that almost all of the computational complexity is concentrated in only a few lines of the code. After some optimizations, namely replacing for loops with vector operations and using sparse matrices, the execution time of the simulation was reduced by a factor of up to 1500. Nevertheless, a simulation with a large number of transmitted and received bits remains time consuming. For the performance evaluation of the channel estimation algorithm described in Section 3.4, a fast-fading channel simulator is required. A $50 \times 10^6$ bit transmission through a fast-fading channel with 6 paths takes around 1 hour for a single point of a BER curve. The rate of the simulated transmission is 7500 OFDM samples per second. The PC used for this measurement has a 2.4 GHz Pentium IV CPU and 1 GB of RAM, and runs the Linux 2.6 operating system.
3.3.5 Results
Figure 3.9(a) shows the envelope of the Jakes model output waveform. The Rayleigh process is generated using one path of Equation 3.33. It can be observed that the process is correlated in time. The probability density function (PDF) of the waveform envelope is shown in Figure 3.9(b). The Jakes envelope is an approximation of a Rayleigh random process. It can be observed that the PDF approaches the theoretical curve as $M$ increases. Choosing the parameter $M$ is a trade-off between computational complexity and approximation accuracy. It can be seen from Equations 3.33 to 3.40 that each increment of $M$ increases the storage requirements by 5 real numbers and adds a sine and a cosine operation, four multiplications and four additions to one calculation of the complex channel attenuation. We decided to set $M$ to 20 for further simulations, as it closely approximates the theoretical Rayleigh PDF.
Figure 3.9: Jakes envelope ($f_d = 400$ Hz, $f_s = 20$ MHz, carrier frequency 3.5 GHz): (a) Jakes envelope; (b) probability density function of the Jakes envelope
The auto- and cross-correlation plots in Figure 3.10 compare the correlation properties of the Jakes model against the reference model. The simulated curves are obtained using the Jakes single-path simulator given in Equation 3.33; the reference curves correspond to Equations 3.29 to 3.32. The theoretical correlation sequences are given in Equations 3.42 to 3.45. It can be observed that the obtained correlation sequences approximate the desired correlation characteristics.
The auto-correlation function of the inphase or quadrature component decreases in time at a rate that corresponds to the Doppler frequency. As the Doppler frequency increases, the time during which the channel is highly correlated (the channel coherence time $(\Delta t)_c$) decreases. This effect can be observed in Figures 3.10(b) (400 Hz maximum Doppler frequency: 3.5 GHz carrier frequency at 120 km/h) and 3.10(d) (100 Hz maximum Doppler frequency: 3.5 GHz at 30 km/h). The Jakes simulator provides a sequence of complex Rayleigh distributed path gains. These path gains are correlated in time in a controllable fashion.
According to [Rappaport, 1996], the coherence time of a channel (see Section 3.2.1) corresponding to a Doppler shift of 400 Hz (3.5 GHz carrier frequency at a speed of 120 km/h) is equal to:
\[ (\Delta t)_c = \sqrt{\frac{9}{16 \pi f_d^2}} \approx 1\,\text{ms} \tag{3.46} \]
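The numbers in Equation 3.46 can be reproduced with a short sketch. The function names are hypothetical, and the Doppler shift is computed from the standard relation $f_d = v f_c / c$, which this excerpt uses implicitly when it pairs 120 km/h at 3.5 GHz with roughly 400 Hz.

```python
import math


def max_doppler_hz(speed_kmh, carrier_hz, c=299_792_458.0):
    """Maximum Doppler shift f_d = v * f_c / c, with the speed given in km/h."""
    return (speed_kmh / 3.6) * carrier_hz / c


def coherence_time_s(f_d):
    """Coherence time of Eq. 3.46."""
    return math.sqrt(9.0 / (16.0 * math.pi * f_d ** 2))


f_d_exact = max_doppler_hz(120.0, 3.5e9)   # close to 389 Hz, rounded to 400 Hz in the text
t_c = coherence_time_s(400.0)              # close to 1.06 ms
```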
The coherence time $(\Delta t)_c$ is also defined as the time over which the channel correlation decreases by 3 dB [Engels, 2002 (2), pp. 27]. It can be seen from Figure 3.10 that the Jakes path correlation decreases by 3 dB (the correlation coefficient decreases by a factor of 2) in approximately 1 millisecond; thus the correlation corresponds to the theoretical value in Equation 3.46.
As the channel coherence time ($(\Delta t)_c = 1$ ms) is much larger than the OFDM symbol time ($N_{gu} = 100$ µs), the channel can be considered slow fading in a WiMAX system. The model is also capable of providing a fast-fading channel model.
Tables 3.2 and 3.3 give the cross-correlation coefficients of the inphase and quadrature components between 5 different Jakes paths. It can be seen that the different paths are weakly correlated, which approaches the WSSUS condition of zero cross-correlation between paths.
          Path 1    Path 2    Path 3    Path 4    Path 5
Path 1    1.0000   -0.0841   -0.1458   -0.0513   -0.0364
Path 2              1.0000    0.0410   -0.0670    0.0981
Path 3                        1.0000   -0.0237    0.1185
Path 4                                  1.0000    0.1174
Path 5                                            1.0000
Table 3.2: Cross-correlation coefficients of Jakes model path inphase components
          Path 1    Path 2    Path 3    Path 4    Path 5
Path 1    1.0000    0.0295   -0.0410    0.0269    0.0542
Path 2              1.0000    0.0510    0.0409    0.1791
Path 3                        1.0000   -0.0663    0.1793
Path 4                                  1.0000   -0.0393
Path 5                                            1.0000
Table 3.3: Cross-correlation coefficients of Jakes model path quadrature components
Figure 3.10: Jakes output waveform correlation ($f_s = 20$ MHz, carrier frequency 3.5 GHz): (a) inphase component auto-correlation, $f_d = 400$ Hz; (b) quadrature component auto-correlation, $f_d = 400$ Hz; (c) cross-correlation, $f_d = 400$ Hz; (d) quadrature component auto-correlation, $f_d = 100$ Hz
3.3.6 Channel Profiles
In order to bring a channel model even closer to reality, a well-defined channel profile has to be used. A channel profile defines the average signal attenuations at certain path delays. The ITU Vehicular A, Vehicular B and Pedestrian B channel profiles are used to specify the properties of the multipath Jakes model in this project. The path delays and the average powers are given in Tables 3.4 to 3.6.
Relative delay $\tau_i$, ns    Average relative power $A_i$, dB
0                              0.0
310                            -1.0
710                            -9.0
1090                           -10.0
1730                           -15.0
2510                           -20.0
Table 3.4: ITU Vehicular A channel profile
Relative delay $\tau_i$, ns    Average relative power $A_i$, dB
0                              -2.5
300                            0
8900                           -12.0
12900                          -10.0
17100                          -25.2
20000                          -16.0
Table 3.5: ITU Vehicular B channel profile
Relative delay $\tau_i$, ns    Average relative power $A_i$, dB
0                              0.0
200                            -0.9
800                            -4.9
1200                           -8.0
2300                           -7.8
3700                           -23.9
Table 3.6: ITU Pedestrian B channel profile
It can be seen that the delays $\tau_i$ of the ITU Vehicular A profile are not sample-spaced with respect to the OFDM sampling period ($T_s = 50.0$ ns, i.e., $f_s = 20$ MHz) specified by WiMAX in Table 3.1. The channel profile is therefore resampled at this rate in order to fulfill the assumption that the delays $\tau_i$ are sample-spaced. The channel is resampled using the simple approach of assigning each path to the closest OFDM sample:
\[ \hat{\tau}_i = \lfloor \tau_i \rceil, \quad i = 1, 2, \ldots, N \tag{3.47} \]
where $\lfloor \cdot \rceil$ denotes rounding to the nearest sampling instant and $N$ is the number of paths in the channel profile.
Next, the mean attenuation values $A$, given in dB, are converted to path variances $\sigma_p^2$. In order to be able to control the signal power in the simulator, it is desirable to have the sum of the path variances equal to one, but the given profiles do not have this property. Therefore, the average path powers are recalculated for all three channel profiles by using the following equations:
\[ A_i = 10 \log_{10} \left( \frac{\sigma_i^2}{\sigma_1^2} \right), \quad i = 1, 2, \ldots, N \tag{3.48} \]
\[ \sigma_i^2 = \sigma_1^2 \, 10^{A_i / 10} \tag{3.49} \]
\[ \sum_{i=1}^{N} \sigma_i^2 = \sum_{i=1}^{N} \left( \sigma_1^2 \, 10^{A_i / 10} \right) = \sigma_1^2 \sum_{i=1}^{N} 10^{A_i / 10} = 1 \tag{3.50} \]
\[ \sigma_1^2 = \frac{1}{\sum_{i=1}^{N} 10^{A_i / 10}} \tag{3.51} \]
$\sigma_i^2$: variance (energy gain) of the $i$th path;
$N$: number of paths in the channel profile.
The variance of the first path is calculated using Equation 3.51, and the rest are then calculated using Equation 3.49. The resulting channel profile is given in Table 3.7. The path delays are sample-spaced and the sum of the tap attenuations is equal to one, resulting in no amplification in the multipath channel. The same approach is taken to resample the ITU Vehicular B and ITU Pedestrian B channel profiles [Chang et al, 2003]. They are given in Tables 3.8 and 3.9.
Relative delay $\tau_i$, ns    Average relative power $\sigma_i^2$
0                              0.485 (-3.1 dB)
300                            0.385 (-4.1 dB)
700                            0.061 (-12.1 dB)
1100                           0.049 (-13.1 dB)
1750                           0.015 (-18.1 dB)
2500                           0.005 (-23.1 dB)
Table 3.7: Resampled ITU Vehicular A channel profile
Relative delay $\tau_i$, ns    Average relative power $\sigma_i^2$
0                              0.322 (-4.9 dB)
300                            0.574 (-2.4 dB)
8900                           0.030 (-15.2 dB)
12900                          0.057 (-12.4 dB)
17100                          0.002 (-27.6 dB)
20000                          0.014 (-18.4 dB)
Table 3.8: Resampled ITU Vehicular B channel profile
Relative delay $\tau_i$, ns    Average relative power $\sigma_i^2$
0                              0.406 (-3.9 dB)
200                            0.330 (-4.8 dB)
800                            0.131 (-8.8 dB)
1200                           0.064 (-11.9 dB)
2300                           0.067 (-11.7 dB)
3700                           0.002 (-27.8 dB)
Table 3.9: Resampled ITU Pedestrian B channel profile
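The resampling and normalization steps can be sketched as follows for the ITU Vehicular A profile. The function names are hypothetical; the normalization divides each linear power by the sum of all linear powers, which is equivalent to Equations 3.49 and 3.51 when the first path has a relative power of 0 dB.

```python
def resample_delays_ns(delays_ns, t_s_ns):
    """Move every path delay to the nearest multiple of the sample period."""
    return [((d + t_s_ns // 2) // t_s_ns) * t_s_ns for d in delays_ns]


def normalized_path_variances(powers_db):
    """Convert relative powers in dB to linear variances that sum to one."""
    linear = [10.0 ** (a / 10.0) for a in powers_db]
    total = sum(linear)
    return [p / total for p in linear]


VEHICULAR_A_DELAYS_NS = [0, 310, 710, 1090, 1730, 2510]            # Table 3.4
VEHICULAR_A_POWERS_DB = [0.0, -1.0, -9.0, -10.0, -15.0, -20.0]     # Table 3.4

delays = resample_delays_ns(VEHICULAR_A_DELAYS_NS, 50)   # T_s = 50 ns
variances = normalized_path_variances(VEHICULAR_A_POWERS_DB)
```

The resulting delays and variances agree with Table 3.7 to the precision printed there.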
Next, the Doppler effect is investigated. The ITU Vehicular A channel profile defines the path delays and variances of the Jakes simulator. Perfect (known) channel estimation (see Equation 3.23) is applied at the receiver. The properties of the OFDM system are the following: $N_{FFT} = 256$, $F_0 = 250$ kHz, $\Delta f = 1$ kHz, $T_s = 4$ µs, QPSK modulation. These settings are selected in order to reduce the inter-carrier frequency spacing $\Delta f$. Compared with the standard WiMAX parameters, the ICI effect is more substantial because the sub-carriers are closer in frequency, i.e., sub-carriers with closer frequency spacing overlap more under Doppler spread. Simulation results for three Doppler frequencies, corresponding to receiver movement at 3 km/h, 30 km/h and 120 km/h, are shown in Figure 3.11. A theoretical multipath curve for QPSK (Equation 3.24) is also given for reference. It can be observed that the BER performance is close to the theoretical curve in a low-Doppler environment (3 km/h). As expected, the BER performance deteriorates as the speed of movement increases.
Figure 3.11: Jakes fast-fading channel simulation results
3.3.7 Summary
In this section, a Jakes fading channel model is presented and extended to a mul-
tipath model. It enables us to explore the performance of OFDM in a simulated
time-varying multipath radio channel with known and adjustable correlation prop-
erties.
This section starts with an assumption of slow-fading channel. It is shown by
both theory and simulation results with WiMAX parameters that a slow-fading
channel model is suitable to characterize a channel in a WiMAX system.
Several ITU channel proles are adapted to meet the assumption of sample-
spaced path delays. A method to recalculate the path attenuations is formulated
and applied to the proles.
In the next section, an OFDM channel estimation algorithm is presented and
the Jakes multipath channel is used for the performance evaluation of the estima-
tor.
3.4 Channel Estimation for OFDM
In the previous section, the Jakes channel model was presented. The slow-fading radio channel that is considered is frequency selective and time variant. The channel transfer function varies across the sub-carriers of the OFDM system and from one OFDM symbol to the next. In order to recover the transmitted data at the receiver, channel estimation is needed. The purpose of this section is to relax the assumption of ideal channel estimation presented in Section 2.6.
This section begins with an overview of existing channel estimation algo-
rithms for OFDM. Then, a time-domain channel estimation algorithm is analyzed.
Finally, simulation results are presented and discussed.
3.4.1 Overview of OFDM Channel Estimation Algorithms
Various channel estimation algorithms found in the literature are presented and
compared in this section without exhaustive details.
Blind Estimation vs Pilot Symbol Assisted Modulation
There are two essential methods for OFDM channel estimation - blind and pilot
assisted estimation.
Blind channel estimation techniques provide improved spectral efficiency, as no pilot tones are needed, but are effective only when a large amount of data is collected under the same channel conditions. This is a disadvantage in mobile wireless systems because of their time-varying channel [Jeremic et al, 2004]. Blind channel estimation is constrained to a static channel model; therefore, its use is rejected in this project.
Pilot-aided channel estimation is another technique that involves insertion of
pilots (known symbols) to the time-frequency grid of the OFDM system and esti-
mating the channel impulse response at the receiver. This technique is called Pilot
Symbol Assisted Modulation (PSAM).
Channel estimation using PSAM consists of two stages: estimation and in-
terpolation. During estimation, the channel gains are obtained in the OFDM
frequency-time grid points that contain pilots. Then the estimates are interpolated
to cover the whole grid.
Figure 3.12: Pilot schemes: (a) block type pilot scheme; (b) comb type pilot scheme
Pilot Schemes
The training data has to be sent on the selected pilot tones. Various schemes can be used to insert the pilot symbols into the transmitted data stream: block, comb, rectangular, triangular and others. [Negi, Cioffi, 1998] studies the impact of pilot selection on channel estimation. The number of pilot tones is treated as the number of equations in a system describing the impulse response. Therefore, a number of pilots smaller than the channel length results in an under-determined system and a non-unique solution $\hat{H}$. It is also proved that equally spaced pilots are optimally spaced and thereby give a better channel estimate. When the pilot tones are concentrated in the time-frequency grid (not equally spaced), there is a noise enhancement effect.
The comb type pilot scheme (Figure 3.12(b)) has a few pilot tones uniformly distributed within each OFDM symbol. It has a higher pilot re-transmission rate than the block scheme and provides better tracking of dynamic channels. Since only some sub-carriers contain the pilot signal, the channel response of non-pilot sub-carriers is estimated by interpolating neighboring pilot sub-channels. Thus, this type of pilot arrangement is sensitive to frequency selectivity, i.e., the pilot spacing must be much smaller than the coherence bandwidth $(\Delta f)_c$ of the channel [Hsieh, Wei, 1998]. If the pilot spacing is greater than the coherence bandwidth, interpolation can be complicated.
A block type scheme, on the other hand, periodically transmits an OFDM symbol containing only pilot information in the time domain (see Figure 3.12(a)). This type of pilot arrangement is especially suitable for static channels. Since the training block contains only pilots, channel interpolation in the frequency domain is not required. Therefore, this type of pilot arrangement is relatively insensitive to frequency selectivity [Hsieh, Wei, 1998].
The scattered pilot tones can be treated as noisy samples of the stochastic channel frequency response function. They have to be placed close enough to fulfill the Nyquist sampling theorem and avoid aliasing [Sandell, Edfors, 1996].
[Yoon et al, 2002] presents an alternative pilot scheme: a boosted impulse is inserted before a block of OFDM symbols and surrounded by no-transmission periods of maximum excess delay length. However, it is mentioned that this approach gives satisfactory results only in a single-path Rayleigh channel.
A technique called boosted pilots is proposed for Digital Video Broadcast (DVB). The pilot tones are given a higher power than the data symbols. The average SNR of the data symbols is reduced, but the channel estimates are better and the BER can be decreased with a suitable pilot power level [Sandell, Edfors, 1996].
Time and Frequency Domain PSAM
Channel estimation can be performed by either time-domain windowing or frequency-domain interpolation. In time-domain windowing algorithms, the channel impulse response (CIR) is obtained by performing an IFFT of the frequency-domain channel response at the pilot symbols. The number of pilots must be greater than the maximum excess delay expressed in samples [Tsai, Chiueh, 2004]:
\[ N_p > \frac{\tau_{max}}{T_s} \tag{3.52} \]
Different techniques can be applied to the CIR in order to reduce noise and aliasing: cutting off samples below a threshold, keeping only the most significant samples or using minimum mean squared error (MMSE) estimation. Finally, the time-domain CIR is converted back to frequency-domain channel attenuation coefficients at all sub-carriers by using an FFT [Tsai, Chiueh, 2004].
Frequency-domain interpolation, on the other hand, interpolates the channel responses at the pilot symbols to obtain estimates at all sub-carriers. In this case, the pilot sub-carriers must over-sample the channel frequency response by at least a factor of two [Tsai, Chiueh, 2004]:
\[ N_p > \frac{2 \tau_{max}}{T_s} \tag{3.53} \]
Time-domain estimation algorithms improve spectral efficiency at the cost of higher latency and complexity. Frequency-domain estimation is usually less complex but has lower transmission efficiency because of a higher pilot rate [Tsai, Chiueh, 2004].
Estimation Algorithms
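In conventional pilot estimation methods, the least-squares (LS) estimate of the channel at the pilot symbols is given by:
\begin{align}
\hat{h}_p &= \left[ \hat{h}_p(0), \hat{h}_p(1), \ldots, \hat{h}_p(N_p - 1) \right]^T \\
          &= \left[ \frac{z(D_f \cdot 0)}{a(D_f \cdot 0)}, \frac{z(D_f \cdot 1)}{a(D_f \cdot 1)}, \ldots, \frac{z(D_f (N_p - 1))}{a(D_f (N_p - 1))} \right]^T \tag{3.54}
\end{align}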
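Equation 3.54 amounts to an element-wise division at the pilot positions. The sketch below assumes that $D_f$ is the pilot spacing in sub-carriers and that $z$ and $a$ are indexed by sub-carrier; neither is defined in this excerpt, and the function name and test values are hypothetical.

```python
def ls_pilot_estimates(z, a, d_f, n_p):
    """LS channel estimates at the pilot sub-carriers 0, d_f, 2*d_f, ... (Eq. 3.54)."""
    return [z[d_f * k] / a[d_f * k] for k in range(n_p)]


# Hypothetical noiseless example: 8 sub-carriers, a pilot on every 4th one.
n_sub, d_f = 8, 4
n_p = n_sub // d_f
channel = [complex(1.0, 0.5 * k) for k in range(n_sub)]           # made-up channel gains
a = [complex(1, 1) if k % 2 == 0 else complex(1, -1) for k in range(n_sub)]
z = [c * x for c, x in zip(channel, a)]                           # received symbols
h_p = ls_pilot_estimates(z, a, d_f, n_p)
```

Without noise the estimates equal the channel gains at the pilot positions; with noise each estimate is perturbed by the noise sample divided by the pilot symbol.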
\begin{align}
g_1(t) &= \frac{2}{\sqrt{N_{osc}}} \sum_{n=1}^{M+1} u_n \cos(\omega_n t + \phi_n) \tag{4.2} \\
g_2(t) &= \frac{2}{\sqrt{N_{osc}}} \sum_{n=1}^{M+1} v_n \sin(\omega_n t + \phi_n) \tag{4.3}
\end{align}
$g$: normalized low-pass process of the Jakes model;
$g_1$: inphase component of the Jakes model;
$g_2$: quadrature component of the Jakes model;
$M$: reduced number of oscillators;
$\phi$: initial phase associated with a propagation path (random variable, uniformly distributed over $[-\pi, \pi]$).
Different attenuations $A_p$ and random initial phases $\phi$ are assigned to the paths (hence the two-dimensional phase matrix with elements $\phi_{n,l}$), see Section 3.3.3. From this point on we denote the multipath Jakes process as $g$. The path attenuation is included in the inphase and quadrature components of the Jakes simulator:
\begin{align}
g(t, l) &= g_1(t, l) + j\, g_2(t, l) \tag{4.4} \\
g_1(t, l) &= \frac{2}{\sqrt{N_{osc}}} \sqrt{\frac{A_p}{2}} \sum_{n=1}^{M+1} u_n \cos(\omega_n t + \phi_{n,l}) \tag{4.5} \\
g_2(t, l) &= \frac{2}{\sqrt{N_{osc}}} \sqrt{\frac{A_p}{2}} \sum_{n=1}^{M+1} v_n \sin(\omega_n t + \phi_{n,l}) \tag{4.6}
\end{align}
$l$: path index;
$\sqrt{A_p / 2}$: gain adjustment for the $p$th path.
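The sine and cosine functions in Equations 4.2 and 4.3 are expanded using the angle-sum identities:
\begin{align}
\sin(\alpha + \beta) &= \sin\alpha \cos\beta + \cos\alpha \sin\beta \tag{4.8} \\
\cos(\alpha + \beta) &= \cos\alpha \cos\beta - \sin\alpha \sin\beta \tag{4.9}
\end{align}
After moving the path gain adjustment $\sqrt{A_p / 2}$ inside the $g_1(t, l)$ and $g_2(t, l)$ equations, we obtain:
\begin{align}
g_1(t, l) &= \sum_{n=1}^{M+1} \sqrt{\frac{2 A_p}{N_{osc}}}\, u_n \cos(\omega_n t + \phi_{n,l}) \tag{4.10} \\
g_2(t, l) &= \sum_{n=1}^{M+1} \sqrt{\frac{2 A_p}{N_{osc}}}\, v_n \sin(\omega_n t + \phi_{n,l}) \tag{4.11}
\end{align}
After expanding according to Equations 4.8 and 4.9,
\begin{align}
g_1(t, l) &= \sum_{n=1}^{M+1} \Bigg[ \cos(\omega_n t) \underbrace{\sqrt{\frac{2 A_p}{N_{osc}}}\, u_n \cos(\phi_{n,l})}_{P1} - \sin(\omega_n t) \underbrace{\sqrt{\frac{2 A_p}{N_{osc}}}\, u_n \sin(\phi_{n,l})}_{P2} \Bigg] \tag{4.12} \\
g_2(t, l) &= \sum_{n=1}^{M+1} \Bigg[ \sin(\omega_n t) \underbrace{\sqrt{\frac{2 A_p}{N_{osc}}}\, v_n \cos(\phi_{n,l})}_{P3} + \cos(\omega_n t) \underbrace{\sqrt{\frac{2 A_p}{N_{osc}}}\, v_n \sin(\phi_{n,l})}_{P4} \Bigg] \tag{4.13}
\end{align}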
Calculation of the four terms P1, P2, P3 and P4 in the equations above can be replaced by pre-calculated matrices of size $[M \times L]$, where $L$ is the number of paths, since $u_n$, $v_n$, $\phi_{n,l}$ and the path attenuations are known in advance and do not change during the simulation.
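The equivalence between the direct form (Equations 4.10 and 4.11) and the expanded form with pre-calculated terms (Equations 4.12 and 4.13) can be checked numerically. The sketch below handles a single path in floating point; the function names, $M = 4$, the path attenuation and the random seed are assumptions made for illustration.

```python
import math
import random


def direct_gain(t, u, v, omega, phi_l, a_p, n_osc):
    """Path gain evaluated directly from Eqs. 4.10 and 4.11."""
    k = math.sqrt(2.0 * a_p / n_osc)
    g1 = sum(k * un * math.cos(wn * t + p) for un, wn, p in zip(u, omega, phi_l))
    g2 = sum(k * vn * math.sin(wn * t + p) for vn, wn, p in zip(v, omega, phi_l))
    return complex(g1, g2)


def precompute_terms(u, v, phi_l, a_p, n_osc):
    """Time-independent terms P1 to P4 of Eqs. 4.12 and 4.13 for one path."""
    k = math.sqrt(2.0 * a_p / n_osc)
    p1 = [k * un * math.cos(p) for un, p in zip(u, phi_l)]
    p2 = [k * un * math.sin(p) for un, p in zip(u, phi_l)]
    p3 = [k * vn * math.cos(p) for vn, p in zip(v, phi_l)]
    p4 = [k * vn * math.sin(p) for vn, p in zip(v, phi_l)]
    return p1, p2, p3, p4


def expanded_gain(t, omega, p1, p2, p3, p4):
    """Path gain from the pre-calculated terms; one sin and one cos per oscillator."""
    g1 = g2 = 0.0
    for wn, a, b, c, d in zip(omega, p1, p2, p3, p4):
        cos_wt, sin_wt = math.cos(wn * t), math.sin(wn * t)
        g1 += cos_wt * a - sin_wt * b
        g2 += sin_wt * c + cos_wt * d
    return complex(g1, g2)


M, f_d, a_p = 4, 400.0, 0.485            # example values for a single path
n_osc = 4 * M + 2
beta = [math.pi * n / M for n in range(1, M + 1)] + [math.pi / 4]
u = [2.0 * math.cos(b) for b in beta[:M]] + [math.sqrt(2.0) * math.cos(beta[M])]
v = [2.0 * math.sin(b) for b in beta[:M]] + [math.sqrt(2.0) * math.sin(beta[M])]
omega = [2.0 * math.pi * f_d * math.cos(2.0 * math.pi * n / n_osc)
         for n in range(1, M + 1)] + [2.0 * math.pi * f_d]
rng = random.Random(7)
phi_l = [rng.uniform(-math.pi, math.pi) for _ in range(M + 1)]

terms = precompute_terms(u, v, phi_l, a_p, n_osc)
difference = abs(direct_gain(1e-3, u, v, omega, phi_l, a_p, n_osc)
                 - expanded_gain(1e-3, omega, *terms))
```

The benefit of the expanded form is that the random phases no longer appear inside the trigonometric arguments, so only $\omega_n t$ has to be tracked at run time.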
Finally, the product $\omega_n t$ requires a large number of bits because $\omega_n$ may take large values when the Doppler frequency is high. The simulation time $t$ is a small number, requiring a large number of fraction bits. However, the nature of $t$ can be exploited by replacing the multiplication $\omega_n t$ with the addition of fixed-point numbers of smaller bit count. We know that $t = 0$ at the start of the simulation, and time then advances by a fixed sample time. Therefore, another vector that contains the current $\omega_n t$ values is introduced. The vector is initialized with zeros and increased by $\omega_n T_s$ (a constant vector requiring fewer precision bits) at each simulated sample of the transmitted signal. The periodicity of the sine and cosine functions is also exploited by subtracting $2\pi$ from the elements of the vector whenever $2\pi$ is exceeded. This guarantees that the sine/cosine function arguments never exceed $2\pi$. The number of required integer bits for the arguments can now be safely reduced to 3 ($2^3 = 8 > 2\pi$).
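The phase accumulation with wrap-around can be sketched as follows. The example runs in floating point and only illustrates the bookkeeping; the fixed-point word lengths are not modeled, and the function name and the two oscillator frequencies are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi


def advance_phases(phases, increments):
    """Add omega_n * T_s to each phase and wrap it back below 2*pi."""
    wrapped = []
    for p, d in zip(phases, increments):
        p += d
        if p >= TWO_PI:
            p -= TWO_PI
        wrapped.append(p)
    return wrapped


f_d, T_s = 400.0, 50e-9
omega = [TWO_PI * f_d, TWO_PI * f_d * 0.5]     # hypothetical oscillator frequencies
increments = [w * T_s for w in omega]          # constant vector omega_n * T_s
phases = [0.0] * len(omega)                    # t = 0 at the start of the simulation
n_steps = 100_000                              # 5 ms of simulated time
max_phase = 0.0
for _ in range(n_steps):
    phases = advance_phases(phases, increments)
    max_phase = max(max_phase, max(phases))
```

A single conditional subtraction is sufficient here because each increment $\omega_n T_s$ is far smaller than $2\pi$.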
This approach to time requires rewriting the channel simulator so that all the
operations are performed for the current sample only. MATLAB simulations be-
come slower because the fast built-in vector operations are no longer used. For
implementation in C language this is not an issue because all vector operations
are written and optimized manually.
The next step is to perform fixed-point MATLAB simulations of the optimized code. These showed that signed fixed-point numbers with 4 integer and 15 fraction bits provide a precision close to that of floating-point simulations, as depicted in Figure 4.9. Further increasing the number of fraction bits does not change the error between fixed- and floating-point simulations significantly.
To conclude, the rearrangement of the Jakes simulator significantly reduces the required precision and makes an FPGA implementation feasible.
4.5.5 Handel-C Implementation
After the C code is tested to produce output identical to that of MATLAB and the
fixed-point simulations are complete (the required fixed-point precision is
known), Handel-C code development is initiated. The purpose of the C to Handel-C
conversion is not only to introduce fixed-point arithmetic with specific word
widths to the code, but also to specify sections of code that should execute in
parallel, which decreases the execution time.
Figure 4.9: Fixed-point simulation results
Memory Usage Optimization
At the very start of the transformation of the C code to Handel-C, it is found
that the algorithm requires a high gate count. This is because of the buffers
for the transmitted and received signals. If implemented as register arrays,
they alone consume about 50% of all the available gates in the FPGA, as shown
by the DK Suite Logic Estimation Tool.
The buffers can be stored in an external or internal RAM. Turning the buffers
into RAM arrays has the drawback that RAM can be accessed only once per clock
cycle, which creates restrictions for maximum achievable parallelism. The chan-
nel simulator algorithm is optimized in terms of memory usage and tested again.
The area usage is reduced as a result of this optimization.
The buffers are substituted with a "sample in - sample out" approach. Only a
few small buffers are maintained inside the FPGA; therefore, they can be
implemented as fast register arrays. Buffering of the transmitted and received
signals is left to the on-board Ethernet device. The final overall channel
simulator algorithm is given in Figure 4.10.
The algorithm runs in a loop, where a new input (one sample from the
transmitted OFDM symbol) is taken at the start. The simulation of the L Jakes
paths is performed together with the convolution of the channel with the
transmitted signal in lines 4-11 of Figure 4.10. A circular buffer, large
enough to hold $N_{ds}$ OFDM samples, is maintained. In line 7, a call to the
function g is made. The implementation of this function is given in Figure
4.11. It returns a complex attenuation coefficient for a given path l. The
function g is a realization of Equations 4.4, 4.12 and 4.13.
When the OFDM sample transmission through all channel paths is complete,
the result is sent back to the software part of the simulator (line 12). Finally, the
simulation time is increased by one OFDM sample time $T_s$ (lines 14-18). The
explanation for such time management is given in Section 4.5.4.
 1  begin
 2    while ContinueSimulation do
 3      buffer[index[0]] ← sentSample;
 4      l ← 0;
 5      recSample ← 0;
 6      while l < L do
 7        recSample ← recSample + buffer[index[l]] * g(l, wt);
 8        index[l] ← index[l] + 1;
 9        if index[l] >= L then
10          index[l] ← 0;
11        l ← l + 1;
12      returnResult(recSample);
13      n ← 0;
14      while n <= M do
15        wt[n] ← wt[n] + (w[n] * T_s);
16        if wt[n] >= 2π then
17          wt[n] ← wt[n] - 2π;
18        n ← n + 1;
19  end
Figure 4.10: Channel simulator algorithm
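The buffer handling in lines 3-11 of Figure 4.10 can be illustrated with the following Python sketch. It is written under several assumptions that are not part of the figure: the class name `TappedDelayLine`, the `delays` and `size` parameters, a separate write index, read indices that wrap at the buffer size, and path gains that are passed in directly instead of being computed by g.

```python
class TappedDelayLine:
    """Sample-in/sample-out multipath sum over one shared circular buffer."""

    def __init__(self, delays, size):
        # delays[l] is the delay of path l in samples (hypothetical parameter).
        assert max(delays) < size
        self.size = size
        self.buffer = [0j] * size
        self.write = 0
        # Each read index trails the write index by the delay of its path.
        self.index = [(-d) % size for d in delays]

    def step(self, sent_sample, gains):
        """Consume one transmitted sample and return one received sample."""
        self.buffer[self.write] = sent_sample
        rec_sample = 0j
        for l, gain in enumerate(gains):
            rec_sample += self.buffer[self.index[l]] * gain
            self.index[l] = (self.index[l] + 1) % self.size   # circular wrap
        self.write = (self.write + 1) % self.size
        return rec_sample


# Two paths: a direct one and one delayed by two samples with half the gain.
line = TappedDelayLine(delays=[0, 2], size=4)
out = [line.step(x, gains=[1.0, 0.5]) for x in [1.0, 0.0, 0.0, 0.0]]
```

Feeding a unit impulse returns the direct path immediately and the attenuated echo two samples later, so only one small buffer is needed regardless of how many samples are simulated.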
The optimized Jakes algorithm is given in Figure 4.11.
begin
  Real(res) ← 0;
  Imag(res) ← 0;
  n ← 0;
  while n <= M do
    t1 ← wt[n];
    t2 ← cos(t1);
    t3 ← sin(t1);
    t4 ← t2 * P1[n][l];
    t5 ← t3 * P2[n][l];
    t6 ← t3 * P3[n][l];
    t7 ← t2 * P4[n][l];
    Real(res) ← Real(res) + t4 - t5;
    Imag(res) ← Imag(res) + t6 + t7;
    n ← n + 1;
  returnResult(res);
end
Figure 4.11: Optimized Jakes algorithm
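A direct Python transcription of the structure of Figure 4.11 might look as follows. The function name `jakes_gain` is hypothetical, and the contents of the tables P1 to P4 are placeholders chosen for the example; the real coefficients follow from the equations referenced in the text and are not reproduced here.

```python
import math

def jakes_gain(l, wt, P1, P2, P3, P4):
    """Complex attenuation of path l, following the structure of Figure 4.11.

    wt        -- accumulated phases w_n * t, one per oscillator
    P1 .. P4  -- precomputed coefficient tables indexed as P[n][l]
    """
    re = 0.0
    im = 0.0
    for n, phase in enumerate(wt):
        c = math.cos(phase)
        s = math.sin(phase)
        re += c * P1[n][l] - s * P2[n][l]
        im += s * P3[n][l] + c * P4[n][l]
    return complex(re, im)


# Placeholder tables for two oscillators and one path: each term then
# reduces to cos(phase) + j*sin(phase).
ones = [[1.0], [1.0]]
zeros = [[0.0], [0.0]]
gain = jakes_gain(0, [0.0, math.pi / 2], ones, zeros, ones, zeros)
```

Each oscillator costs one cosine, one sine and four multiplications, which is the work that the schedules discussed later try to parallelize.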
Hardware-Software Interface
A wrong assumption about the interface between the RC203 development board
and a PC was made while designing the system. A significant effort is required
to build the interface because the Ethernet controller available on the board
provides only Data Link level communications. The following possible solutions
were considered:
• finding a TCP/IP protocol stack implementation for the FPGA, or
• performing Data Link layer communications from MATLAB.
Considering the scope of the project, it was infeasible to implement either of
these two possibilities. The Celoxica RC200/203 Hardware and PSL Reference
Manual was found to be slightly misleading, because it does not give any
information on the Open Systems Interconnection (OSI) layers. Therefore, the
data packets that are mentioned in the document were mistakenly treated as
TCP/IP packets while designing the system. The lack of experience with such
Ethernet controllers also played a major role. The wrong assumption was noticed
only later, while writing the Handel-C code.
Finally, it was decided not to implement the link between the PC and the
development board. Instead, the focus was placed on an efficient implementation
of the hardware part of the algorithm. The cost of communication
with the PC is no longer considered. This is justified because PCI-card FPGA
co-processor accelerators, such as the Celoxica RC2000, exist. They provide a
fast link between the PC chipset and the FPGA (such as the 528 MB/s (64-bit,
66 MHz) PCI bus in the case of the RC2000). Therefore, our approach to the
hardware accelerator implementation is still valid.
Parallelism and Scheduling
We decided to complete the implementation part by obtaining and analyzing
estimates of the speed improvement in the hardware part when trade-offs between
resource usage (area) and parallelism (speed) are made. These estimates include:
• manual estimates of clock cycles per OFDM sample transmission simulation;
• gate, CLB and memory usage estimates from the DK Logic Estimation Tool;
• simulated number of clock cycles (to verify the correctness of the manual
estimates).
Two extreme cases are considered: no parallelism and full achievable
parallelism. The level of parallelism is increased by adding inline functions
and par{ ... } statements to an initially sequential code. Function inlining in
itself does not introduce parallelism, but it allows more than one instance of
the function to be synthesized and executed in parallel. Inlining and parallel
execution increase the area and decrease the execution time.
In order to explore the trade-offs between inherent parallelism and resource
utilization in the source code, a Data Flow Graph (DFG) of the Jakes simulator is
constructed. The DFG is shown in Figure 4.12. A resource-constrained schedule
(see Section 4.1.3) is derived from the DFG and is shown in Figure 4.13. One
multiplier and one ALU are assumed as the resource constraints.
Resource-constrained scheduling produces a schedule that fits a
certain architecture. However, not all of the inherent parallelism may be
exploited. It can be seen from Figure 4.13 that the execution takes nine
control steps.
As the main goal of the implementation in this thesis is to reduce the execution
time, a time-constrained schedule (see Section 4.1.3) is derived from the initial
DFG and shown in Figure 4.14. It is found that the minimum execution time is
achieved when four multipliers and two ALUs are employed. Compared to the
resource-constrained schedule, the area usage is higher. The execution time is
reduced to six control steps.
Figure 4.12: Data Flow Graph of the Jakes simulator
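The effect of the resource limits on the schedule length can be reproduced with a small list scheduler. The sketch below is illustrative only: the function `list_schedule` and the toy graph are assumptions, with the graph loosely derived from the loop body of Figure 4.11 and cos/sin mapped onto the ALU. It is not the DFG of Figure 4.12, so its step counts differ from the nine and six control steps reported above.

```python
def list_schedule(ops, deps, limits):
    """Greedy list scheduling of single-step operations.

    ops    -- dict: operation name -> functional unit type (priority = dict order)
    deps   -- dict: operation name -> set of predecessor names
    limits -- dict: functional unit type -> number of available units
    Returns a dict: operation name -> control step (starting at 1).
    """
    done_step = {}
    remaining = list(ops)
    step = 0
    while remaining:
        step += 1
        free = dict(limits)
        scheduled = []
        for op in remaining:
            # An operation is ready once all predecessors finished earlier.
            ready = all(done_step.get(p, step) < step for p in deps.get(op, ()))
            if ready and free[ops[op]] > 0:
                free[ops[op]] -= 1
                scheduled.append(op)
        for op in scheduled:
            done_step[op] = step
        remaining = [op for op in remaining if op not in done_step]
    return done_step


# Toy data flow graph (hypothetical): two trigonometric evaluations,
# four products and four accumulations.
ops = {"cos": "alu", "sin": "alu",
       "m1": "mul", "m2": "mul", "m3": "mul", "m4": "mul",
       "a1": "alu", "a2": "alu", "a3": "alu", "a4": "alu"}
deps = {"m1": {"cos"}, "m2": {"sin"}, "m3": {"sin"}, "m4": {"cos"},
        "a1": {"m1"}, "a2": {"a1", "m2"}, "a3": {"m3"}, "a4": {"a3", "m4"}}

constrained = max(list_schedule(ops, deps, {"mul": 1, "alu": 1}).values())
relaxed = max(list_schedule(ops, deps, {"mul": 4, "alu": 2}).values())
```

On this toy graph the single-multiplier, single-ALU schedule needs six steps and the four-multiplier, two-ALU schedule needs four, which shows the same area-for-time trade-off on a smaller scale.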
A parallel version of the Handel-C program is written by including the
parallelism of the time-constrained schedule. The sequential and parallel
Handel-C source codes used during the investigation are given in Appendices C
and D.
Figure 4.13: Resource-constrained schedule
Time and Resource Estimation
First, the number of FPGA clock cycles used for channel simulation per OFDM
sample is counted manually. The following rules are used:
• an assignment operator takes one clock cycle;
• the instructions following a par{ ... } block do not execute until all
branches of the parallel block are complete.
Figure 4.14: Time-constrained schedule
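These two counting rules can be written down as a small recursive function. The nested-tuple representation and the function name `cycles` are assumptions made for this sketch, and the structure chosen for the complex multiplication is hypothetical; the actual code in the appendices may be organized differently.

```python
def cycles(block):
    """Clock cycles of a nested block description.

    An int is a plain sequence of that many assignments (one cycle each),
    ("seq", [...]) runs its children one after another,
    ("par", [...]) runs its children concurrently and waits for the slowest.
    """
    if isinstance(block, int):
        return block
    kind, children = block
    counts = [cycles(child) for child in children]
    return sum(counts) if kind == "seq" else max(counts)


# Hypothetical structure of a complex multiplication: four products, two sums.
cmul_sequential = ("seq", [4, 2])
cmul_parallel = ("seq", [("par", [1, 1, 1, 1]), ("par", [1, 1])])

seq_cycles = cycles(cmul_sequential)
par_cycles = cycles(cmul_parallel)
```

Under these assumptions the sequential form takes six cycles and the parallel form two, which agrees with the CMul() row of Table 4.1.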
Next, the execution of the source code is simulated. The number of clock
cycles used by each function of the programs is counted step by step in the
simulator. The results are shown in Table 4.1 (Clocks Man in the table represents
manual estimates, Clocks Sim corresponds to simulated values). The manual
estimation is performed only once for each version of the code in order to
avoid any possible bias during the analysis. An accurate manual
prediction of the timing of the Handel-C code is found to be possible.
An important result is that by specifying parallelism in the Handel-C code, the
number of clock cycles required to simulate a transmission of one OFDM sample
can be reduced by 60%.
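As a check on this figure, the simulated clock counts for Main() listed in Table 4.1 give:

```latex
\[
\frac{3122 - 1246}{3122} = \frac{1876}{3122} \approx 0.60
\]
```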
In order to acquire estimates of the number of NAND gates, memory
and flip-flop usage, both versions of the Handel-C code are compiled to
Electronic Design Interchange Format (EDIF) with DK Suite. The compiler produces
a report on the resource usage for each line of the source code. The total
resource usage is given in Table 4.1. The number of memory bits and flip-flops
(registers) is the same for the sequential and parallel versions. This is
expected, as the same global and local variables are used in both versions.
However, the number of NAND gates is higher in the case of parallel execution,
as more hardware is synthesized as compared to the sequential version.

Function      Metric        Min. parallelism   Max. parallelism   Difference
Sin()         Clocks Man                   6                  5          17%
              Clocks Sim                   6                  5          17%
Cos()         Clocks Man                   7                  6          14%
              Clocks Sim                   7                  6          14%
CircIndex()   Clocks Man                   1                  1            -
              Clocks Sim                   1                  1            -
InitJakes()   Clocks Man                  12                  1          92%
              Clocks Sim                  12                  1          92%
JSample()     Clocks Man                 465                190          59%
              Clocks Sim                 486                195          60%
CMul()        Clocks Man                   6                  2          67%
              Clocks Sim                   6                  2          67%
CAdd()        Clocks Man                   2                  1          50%
              Clocks Sim                   2                  1          50%
CAssign()     Clocks Man                   2                  1          50%
              Clocks Sim                   2                  1          50%
Main()        Clocks Man               3,089              1,271          57%
(All code)    Clocks Sim               3,122              1,246          60%
              Memory                  12,448             12,448            -
              NAND Gates             545,993            631,770          14%
              Flip Flops               1,176              1,176            -

Table 4.1: Costs of FPGA implementation
The final step towards FPGA implementation is to Place/Route the EDIF
representation of the design. The Xilinx ISE 7.1i toolkit is used. The
Place/Route tool gives a report about the timing, power, area, functional unit
usage, etc. As the goal is to speed up the simulator, effort is spent only on
reducing the clock period while ensuring that the limitations specific to the
FPGA chip are fulfilled.
Place/Route is performed for the EDIF files obtained from DK Suite. Both the
sequential and parallel Handel-C code versions result in a maximum FPGA clock
frequency of 50 MHz (20 ns clock period). The detailed results are given in
Appendices C and D.
The maximum clock frequency depends on the critical path of the design.
Handel-C semantics guarantee that an assignment takes a single clock cycle.
Therefore, assignments of complex expressions are synthesized to equally
complex logic that must be executed in one clock cycle. As the logic depth
increases, the clock period has to be increased in order to accommodate the
complex logic. A single complex operation (the critical path) may decrease the
performance of the whole system because the clock period is determined by the
critical path. By simplifying or pipelining complex operations, the performance
of the system can be improved. In our case, the report of the DK Logic
Estimation Tool showed that the critical path is in the fixed-point library. In
order to reduce the clock period, the Celoxica fixed-point library must be
substituted with its pipelined version.
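The relation between the critical path and the overall run time can be illustrated with a short sketch. All delays and cycle counts below are hypothetical values chosen for the example, and the function name `clock_report` is an assumption; none of them come from the Place/Route reports.

```python
def clock_report(stage_delays_ns, cycles):
    """Return (max clock frequency in MHz, run time in ns) for a design.

    stage_delays_ns -- combinational delay of every single-cycle assignment
    cycles          -- number of clock cycles the computation takes
    """
    period_ns = max(stage_delays_ns)      # the critical path sets the period
    return 1000.0 / period_ns, cycles * period_ns


# Hypothetical delays: one deep fixed-point operation dominates.
f_before, t_before = clock_report([4.0, 6.0, 20.0], cycles=100)
# Same work with the deep operation split into two 10 ns pipeline stages,
# at the price of some extra cycles (assumed to be 10 here).
f_after, t_after = clock_report([4.0, 6.0, 10.0, 10.0], cycles=110)
```

In this made-up case the clock rises from 50 MHz to 100 MHz, and although the cycle count grows slightly, the run time drops from 2000 ns to 1100 ns.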
Finally, the simulation speed is compared between the FPGA/Handel-C
(estimated) and PC/MATLAB (measured) simulators. The comparison is given in
Table 4.2. A speedup factor of 5.5 over a 2.4 GHz CPU (see Section 3.3.4) is
achieved by using an FPGA running at only 50 MHz. If the difference in the clock
frequencies is taken into account, the efficiency of the FPGA implementation is
264 ($\frac{2400\,\mathrm{MHz}}{50\,\mathrm{MHz}} \times 5.5$) times higher than
that of the PC.
                           Sequential                    Parallel
FPGA Clock Rate            50 MHz                        50 MHz
Clocks per OFDM sample     3,122                         1,246
OFDM Samples/s, FPGA       50·10^6 / 3,122 = 16,000      50·10^6 / 1,246 = 41,100
OFDM Samples/s, PC         7,500                         7,500
Speedup                    2.13                          5.48
Table 4.2: Speedup estimation results
It is not completely fair to compare only the raw performance of a general
purpose CPU and an FPGA, because:
• present-day FPGA technology does not achieve clock frequencies comparable
to those of CPUs;
• MATLAB operates on its native 64-bit floating-point numbers, whereas the
FPGA implementation uses 19-bit fixed-point arithmetic in our case;
• the MATLAB simulator is a mixture of translated and interpreted programs;
therefore, the overhead of the interpreter has to be taken into account;
• MATLAB simulations are performed in a multi-tasking operating system
environment and are constantly interrupted by other concurrent processes;
• significant effort is spent on optimizing the Handel-C code as compared to
the MATLAB simulator.
4.6 Design Options and Decisions
An overview of the options and decisions that were made during the design
process is shown in Figure 4.15. The dark path shows the design flow decisions.
The light lines show the options that were considered but not followed.
Figure 4.15: Options and decisions in the HW/SW co-design process
4.7 Summary
This chapter presents the characterization of various steps involved in a traditional
co-design flow according to the Rugby meta-model.
A literature study of the various steps involved in hardware/software co-design
is performed. A methodology is chosen for the Implementation part of the thesis
based on the literature study.
An analysis of the underlying complexity of the OFDM simulator algorithm is
performed. The algorithm is divided into grains and partitioned between hardware
and software. The partitioning decision is supported with an evaluation of the
computational complexity, operation nature and data rates of different grains of
the algorithm.
A brief account of various factors that are considered while choosing an
architecture for the implementation is also given.
The Jakes channel simulator is optimized in order to reduce the high bit count
requirement for fixed-point arithmetic. Initially, 40-bit fixed-point
arithmetic was found to be required. This is reduced to 19 bits by optimizing
the Jakes simulator algorithm, which significantly reduces the chip area and
allows higher clock rates to be achieved.
Finally, the channel simulator is implemented using the Handel-C programming
language. Cycle-accurate simulations and FPGA synthesis show that it is
possible to reduce the channel simulation time by a factor of 5.5 compared to
simulation on an average PC.
CHAPTER 5
CONCLUSIONS
The overall goals of the project were to:
• study and analyze a generic OFDM downlink system;
• build a generic OFDM downlink simulator in MATLAB by following a specific
methodology;
• analyze channel modeling and estimation techniques;
• decrease the execution time of the simulator;
• explore and apply hardware-software co-design methodologies.
To achieve these goals, we decided to break the tasks in the thesis into two
parts, namely 1) the analytical part and 2) the implementation part. The
conclusions for the two parts are presented in this chapter.
5.1 The Analytical Part
The thesis started with a study of various issues related to communication systems
and OFDM. The analytical part provides a mathematical description of a generic
OFDM transceiver. A low complexity MATLAB simulator was written to test the
transceiver.
The low complexity MATLAB simulator does not represent all the issues that
are encountered in a real OFDM system. It is assumed that a perfect channel
between the transmitter and the receiver exists. However, such a channel does
not exist in reality. In order to make the transceiver usable in realistic
environments, various mathematical channel models are considered. Sophisticated
channel estimation and equalization techniques are required at the receiver
when more realistic channel models are used.
We followed an iterative approach while building the simulator as professed
by the XP methodology that was used while pursuing the analytical part. As the
understanding of the issues regarding channel modeling and estimation improved,
several iterations of the simulator were derived.
The first channel model considered only additive white Gaussian noise (page
34). It does not require equalization because multipath effects, present in a
realistic channel, do not occur. The performance of the OFDM system was measured
with simulations. It matched the theoretical performance and therefore gave us
confidence in the model.
Further, a multipath Rayleigh channel was investigated (page 38). The channel
effects are written as a multiplication of the transmitted symbols with channel
matrices, the number of which depends on the maximum excess delay of the
channel. Therefore, the analytical model of the channel is not constrained to a
certain maximum excess delay. Our channel model is more accommodating than many
other channel models proposed in the literature. It allows a particular OFDM
symbol to contain ISI from more than one previous OFDM symbol. In addition,
fast-fading channels can be modeled. The use of the cyclic prefix in combating
the effects of ISI was investigated. The simulation results showed that the
performance of the OFDM transceiver deteriorates when the maximum excess delay
of the channel exceeds the cyclic prefix length.
Then, the Jakes channel simulator (page 50) was included in the project as
it introduces time correlation to the channel. The properties of both the
single- and multi-path Jakes channel models were examined. The effect of Doppler
spread on the performance of the transceiver was observed in the results of the
simulation. Various ITU channel profiles were presented and adapted to a channel
sampling rate. A discussion about the implementation of the Jakes simulator was
given.
The final task of the analytical part of the thesis was to provide a channel
estimation algorithm that would be able to restore the transmitted signal after it is
corrupted by the fast-fading multipath channel. A literature study was performed.
We decided to concentrate on time-domain channel estimation techniques as they
provide better spectral efficiency compared to frequency-domain estimation. The
results of a simple channel estimator showed a high error level. Therefore, we
attached an adaptive process to the channel estimator in order to reduce the error
(page 71). It was found that adaptive estimation improves the performance.
5.2 The Implementation Part
The goal of the implementation part was to decrease the execution time of the
simulator that was built in MATLAB in the analytical part. The simulator con-
tains an OFDM transceiver with an adaptive channel estimator. The channel is a
multipath fast-fading Jakes simulator.
We started by presenting a generic hardware/software co-design flow. The
generic co-design flow was explained in terms of the Rugby meta-model. The
various steps in the co-design flow were explored by a literature study.
The analysis of the OFDM simulator was performed to find the distribution
of computational complexity. After MATLAB code profiling and manual complexity
analysis, we found that 90% of the complexity is concentrated in the
few program lines of the channel model. The operations in the channel model
are computation-oriented. The rate of data passed through the interface between
the transceiver and the channel simulator was found to be much lower than that
inside the channel simulator. From these observations, we decided to offload the
channel model into a hardware accelerator, implemented on an FPGA development
board.
The DK Suite methodology was adapted to the requirements for the imple-
mentation of the channel model.
The Jakes channel model was optimized to reduce the number of precision
bits required. We showed that the optimization allows using 19-bit arithmetics
instead of 40-bit, while keeping the functionality intact. As a consequence, a
higher maximum clock rate of the FPGA can be achieved and the usage of the
FPGA resources is reduced. Such optimization with the obtained results has not
been found elsewhere in the literature.
The MATLAB channel simulator was rewritten in the Handel-C programming
language. Two versions of the optimized fixed-point code, sequential and
parallel, were produced. The speedup was estimated for both versions. It was
found that the parallel code takes around 60% fewer clock cycles to execute than
the sequential version. The overall simulation time obtained by using the
parallel code executing on an FPGA is 5.5 times lower than that of an average PC
system running the MATLAB simulator.
5.3 Future Work
The work presented in this Master's thesis rests on some assumptions. The
following items could be further investigated:
• improving the BER performance in channels with very high Doppler frequency
(high mobility OFDM applications) by using more sophisticated adaptive
channel estimators;
• the transmitter/receiver synchronization issues, modulation and source
coding techniques;
• improving the simulator performance in terms of execution speed by using a
more efficient fixed-point arithmetic library and/or pipelining the simulator;
• development of a working prototype. The current hardware implementation
does not have an interface to the transceiver running on a PC.
BIBLIOGRAPHY
[AccelChip, 2005] AccelChip, 2005: AccelWare IP for DSP Design Targeting
FPGAs and ASICs.
Internet: <http://www.accelchip.com/files/whitepapers/
3_05_AccelWare_IP_0304.pdf> Date Visited: 28-04-2005
[Anniballe, 1993] Anniballe, J.V.D., P. J. Koopman, Jr., 1993: Towards execution
models of distributed systems: a case study of elevator design. ACM/IEEE
International Workshop on Hardware-Software Co-design, October 1993.
[Axelsson, 1997] Axelsson, J., 1997: Analysis and Synthesis of Heterogeneous
Real-Time Systems. PhD Thesis, No. 502, Linköping University, 1997.
[Beek, 1995] Beek, J. J. van de, O. Edfors, M. Sandell, S. K. Wilson, P. O.
Börjesson, 1995: On channel estimation in OFDM systems. Proceedings IEEE
Vehicular Technology Conference, July 1995. pp. 815-819.
[Brewer, 2001] Brewer, J., 2001: Extreme Programming FAQ
Internet: <http://www.jera.com/techinfo/xpfaq.html> Date
Visited: 06-12-2004.
[Brooks, 2004] Brooks, T., 2005: Key questions to consider when using a highly
integrated DSP. GSPx Conference, September 2004.
[Celoxica, 2005 (1)] Celoxica, 2005: RC203.
Internet: <http://www.celoxica.com/products/rc203/
default.asp> Date Visited: 20-04-2005
[Celoxica, 2004 (2)] Celoxica, 2004: Handel-C Language Reference Manual.
[Celoxica, 2005 (3)] Celoxica, 2005: Technology Backgrounder.
Internet: <http://www.celoxica.com/corporate/tech_
backgrounder_01000.pdf> Date Visited: 26-04-2005
[Celoxica, 2005 (4)] Celoxica, 2005: Handel-C Language Overview.
Internet: <http://www.celoxica.com/techlib/files/
CEL-W0307171KDD-47.pdf> Date Visited: 26-04-2005
[Chang, Gibby, 1968] Chang, R. W., R. A. Gibby, 1968: A Theoretical Study
of Performance of an Orthogonal Multiplexing Data Transmission Scheme.
IEEE Transactions on Communication Technology, Vol. 16, No. 4, August
1968. pp. 529-540.
[Chang et al, 2003] Chang, J. W., D. S. Park, J. R. Cleveland, 2003: Summary of
Delay Profiles for MBWA. IEEE 802 Executive Committee Study Group on
Mobile Broadband Wireless Access.
Internet: <http://grouper.ieee.org/groups/802/20/
Contribs/C802.20-03-77.ppt> Date Visited: 26-11-2004
[Chou et al, 1995] Chou, P. H., R. B. Ortega, G. Borriello, 1995: The Chinook
Hardware/Software Co-Synthesis System. Proceedings of the 8th international
symposium on System Synthesis, ACM Press, 1995. pp. 22-27.
[Coleri et al, 2002] Coleri, S., M. Ergen, A. Puri, A. Bahai, 2002: A Study of
Channel Estimation in OFDM Systems. Proceedings of IEEE Vehicular Technology
Conference, Fall 2002. pp. 894-898.
[Cosyma, 1998] Cosyma, 1998: Cosyma Architecture and Input Languages.
Internet: <http://www.ida.ing.tu-bs.de/research/
projects/cosyma/overview/> Date Visited: 28-03-2005
[CoWare, 2005] CoWare, 2005: SPW Product Overview.
Internet: <http://www.coware.com/products/spw4.php> Date
Visited: 30-04-2005
[CSys, 2004] Larsen, L. S., 2004: Cellular Systems Division.
Internet: <http://kom.aau.dk/csys/index.htm> Date Visited: 27-04-2005
[Dent et al, 1993] Dent, P., G.E. Bottomley, T. Croft, 1993: Jakes Fading
Model Revisited. IEEE Electronic Letters, Volume 29, No.13, January 1993.
pp.1162-1163.
[Dick, 2000] Dick, C., 2000: FPGAs: The High-End Alternative for DSP
Applications.
Internet: <http://www.hunteng.co.uk/pdfs/tech/DSP1736FPGA.pdf> Date Visited:
05-05-2005
[Edfors et al, 1998] Edfors, O., M. Sandell, J. J. van de Beek, S. K. Wilson, P. O.
Börjesson, 1998: OFDM Channel Estimation by Singular Value Decomposition.
IEEE Transactions on Communications, Vol. 46, No. 7, July 1998. pp.
931-939.
[Engelhart et al, 1999] Engelhart, A., H. Gryska, C. Sgraja, W. G. Teich, J. Lind-
ner, 1999: The Discrete-Time Channel Matrix Model for General BDFM
Packet Transmission Schemes. Proceedings of International OFDM Work-
shop, 1999.
[Engels, 2002] Engels, M., 2002: Wireless OFDM Systems: How to Make Them
Work? Kluwer Academic Publishers.
[Engels, 2002 (2)] Engels, M, 2002: Wireless OFDM Systems. Kluwer Academic
Publishers.
[Ernst, 1998] Ernst, R., 1998: Codesign of Embedded Systems: Status and
Trends. IEEE Design and Test, Vol. 15, No. 2, April 1998. pp 45-54.
[Eyre, Bier, 2000] Eyre, J., J. Bier, 2000: The Evolution of DSP Processors. IEEE
Signal Processing Magazine, Vol 2, No. 2, March 2000. pp 43-51.
[Frank, 1996] Frank, V., T. D. Le, Y. C. Hsu, 1996: A Comparison of Functional
and Structural Partitioning. Proceedings of IEEE International Symposium
on System Synthesis, November 1996. pp. 121-126.
[Gajski et al, 1992] Gajski, D., N. Dutt, A. Wu, S. Lin, 1992: High-Level
Synthesis. Kluwer Academic Publishers.
[Gajski, Ramachandran, 1994] Gajski, D. D., L. Ramachandran, 1994: Introduc-
tion to High-Level Synthesis. IEEE Design and Test of Computers, Winter
1994. pp 44-54.
[Ge, Yun, 1996] Ge, Y., D.Y.Y. Yun, 1996: A method that Determines Optimal
Grain Size and Inherent Parallelism Concurrently. Proceedings of Interna-
tional Symposium on Parallel Architectures, Algorithms and Networks, June
1996. pp 200-206.
[Groot, 1990] Heemstra de Groot S.M., 1990: Scheduling Techniques for Itera-
tive Data-Flow Graphs. PhD Thesis, University of Twente, The Netherlands.
[Hara, Prasad, 2003] Hara, S., R. Prasad, 2003: Multicarrier Techniques for 4G
Mobile Communications. Artech House.
[Haykin, 2001] Haykin, S., 2001: Communication Systems, Fourth Edition. John
Wiley & Sons.
[Haykin, 2002] Haykin, S., 2002: Adaptive Filter Theory, Fourth Edition. Pren-
tice Hall.
[Hsieh, Wei, 1998] Hsieh, M., C. Wei, 1998: Channel Estimation for OFDM Sys-
tems Based on Comb-type Pilot Arrangement in Frequency Selective Fading
Channels. IEEE Transactions on Consumer Electronics, Vol. 44, No. 1, Febru-
ary 1998. pp 217-225.
[Hunt Engineering, 2004] Hunt Engineering, 2004: Choosing DSP or FPGA for your
Application.
Internet: <http://www.hunteng.co.uk/dsp-fpga.htm> Date
Visited: 1-03-2005
[Hutter et al, 2002] Hutter, A.A., R. Hasholzner, J.S. Hammerschmidt, 1999:
Channel estimation for mobile OFDM systems. IEEE Vehicular Technology
Conference, Vol. 1, Fall 1999. pp. 305-309.
[Hwang et al, 1991] Hwang, C. T., J.H. Lee, Y.C. Hsu, 1991: A formal approach to
scheduling problem in High Level Synthesis. IEEE Transactions on Computer-Aided
Design, Vol. 10, No. 4, April 1991. pp. 464-475.
[IEC, 2004] IEC, 2004: OFDM for Mobile Data Communications.
Internet: <http://www.iec.org/online/tutorials/ofdm/>
Date Visited: 17-12-2004
[I-Logix, 2005] I-Logix, 2005: Embedded System Design Software.
Internet: <http://www.ilogix.com/statemate/statemate.
cfm> Date Visited: 27-04-2005
[Intel, 2004] Intel, 2004: WiMAX - Broadband Wireless Access Technology.
Internet: <http://www.intel.com/netcomms/technologies/
wimax> Date Visited: 26-11-2004
[Jantsch et al, 2000] Jantsch, A., S. Kumar, A. Hemani, 2000: A Metamodel for
Studying Concepts in Electronic System Design. IEEE Design Test of Com-
puters, Vol. 17, No. 3, July/September 2000. pp. 78-85.
[Jeffries, 2001] Jeffries, R., 2001: What is Extreme Programming?
Internet: <http://www.xprogramming.com/xpmag/whatisxp.
htm> Date Visited: 07-12-2004
[Jeremic et al, 2004] Jeremic, A., T. A. Thomas, A. Nehorai, 2004: OFDM Channel
Estimation in the Presence of Interference. IEEE Transactions on Signal
Processing, Vol. 52, No. 12, December 2004. pp. 3429-3439.
[Jeruchim et al, 1992] Jeruchim, M. C., P. Balaban, K. S. Shanmugan, 1992: Sim-
ulation of Communication Systems. Plenum Press, NY.
[Jeruchim et al, 2000] Jeruchim, M. C., P. Balaban, K. S. Shanmugan, 2000: Sim-
ulation of Communication Systems. Plenum Press, NY.
[Jussel, 2005] Jussel, J., 2005: C to FPGA: An Abstract Concept for Concrete
Design Implementation.
Internet: <http://www.rtcmagazine.com/home/printthis.
php?id=100304> Date Visited: 28-04-2005
[Kim et al, 1999] Kim, Y. H., I. Song, H. G. Kim, T. Chang, H. M. Kim, 1999:
Performance Analysis of a Coded OFDM System in Time-Varying Multipath
Rayleigh Fading Channels. IEEE Transactions on Vehicular Technology, Vol.
48, No. 5, September 1999. pp. 1612-1615.
[Koch, 1996] Koch, P., 1996: Strategies for Realistic and Efficient Static
Scheduling of Data Independent Algorithms onto Multiple Digital Signal Processors.
PhD Thesis, Institute of Electronic Systems, Aalborg University, Denmark.
[Krauatrachue, Lewis, 1988] Krauatrachue, B., T. Lewis, 1988: Grain Size De-
termination for Parallel Processing. IEEE Software, Vol. 5, No. 1, January
1988. pp.23-32.
[Langton, 2002] Langton, C., 2002: Orthogonal Frequency Division Multiplexing
Internet: <http://www.complextoreal.com/chapters/ofdm2.
pdf> Date Visited: 10-09-2004
[Li, Guan, 2000] Li, Y., Y. L. Guan, 2000: Modified Jakes Model for Simulating
Multiple Uncorrelated Fading Waveforms. Proceedings of IEEE Vehicular
Technology Conference, Vol. 3, 2000. pp. 1819-1822.
[Litwin, Pugel, 2001] Litwin, L., M. Pugel, 2001: The Principles of OFDM.
Internet: <http://rfdesign.com/images/archive/
0101Puegel30.pdf> Date Visited: 10-09-2004
[Lyrtech, 2005 (1)] Lyrtech, 2005: Virtex-II-based SignalMaster-DSP/FPGA De-
velopment Products - Lyrtech Signal Processing.
Internet: <http://www.lyrtech.com/DSP-development/dsp_
fpga/signalmaster_cpci.php> Date Visited: 16-04-2005
[Lyrtech, 2000 (2)] Lyrtech, 2000: User's Manual and Installation Guide for
SM-C67X
[Madsen et al, 1997] Madsen, J., J. Grode, P. V. Knudsen, M. E. Petersen,
A. Haxthausen, 1997: LYCOS: the Lyngby Co-Synthesis System. Design Automation
for Embedded Systems, Vol. 2, No. 2, Kluwer Academic Publishers,
March 1997. pp. 195-235.
[MathWorks, 2005] The MathWorks, Inc., 2004: Fixed-Point Toolbox.
Internet: <http://www.mathworks.com/access/helpdesk/help/
toolbox/fixedpoint/> Date Visited: 20-03-2005
[McDonough, 1995] McDonough, R. N, 1995: Detection of Signals in Noise.
Academic Press.
[Mentor Graphics, 2003] Mentor Graphics Corporation, 2003: FPGAs: Fast
Track to DSP. Internet: <http://www.mentor.com/techpapers/
fulfillment/upload/mentorpaper_11937.pdf> Date Visited:
04-05-2005
[Micheli, 1994] Giovanni De Micheli, 1994: Synthesis and Optimization of Dig-
ital Circuits. McGraw-Hill.
[Mitra, 1998] Mitra, S. K, 1998: Digital Signal Processing. McGraw-Hill.
[Mitra, Basu, 1997] Mitra, R. S., A. Basu, 1997: Knowledge Representation in
MICKEY: An Expert System for Designing Microprocessor-Based Systems.
IEEE Transactions on Systems, Man, and Cybernetics Part A: Systems and
Humans, Vol. 27, No. 4, July 1997. pp. 467-479.
[Moher, Lodge, 1989] Moher, M. L., J. H. Lodge, 1989: A Modulation and Coding
Strategy for Rician-Fading Channels. IEEE Transactions on Vehicular
Technology, 40(4), pp. 686-693.
[Morris, 2005] Morris, K., 2005: FPGA and Programmable Logic Journal
Internet: <http://www.fpgajournal.com/articles_2005/
20050426_power.htm> Date Visited: 03-05-2005
[Le Moullec et al, 2002] Le Moullec, Y., P. Koch, J. P. Diguet, J. L. Philippe:
Design Trotter: Building and Selecting Architectures for Embedded Multimedia
Applications. IEEE International Symposium on Consumer Electronics,
December 2003.
[Nee, 2005] Nee, R. van, 2005: Basics and History of OFDM.
Internet: <www.ofdm-forum.com> Date Visited: 12-10-2004
[Negi, Cioffi, 1998] Negi, R., J. Cioffi, 1998: Pilot Tone Selection for Channel
Estimation in a Mobile OFDM System. International Conference on Consumer
Electronics, June 1998. pp. 466-467.
[NI, 2005] National Instruments, 2005: NI MATRIXx Design and Development
Tools.
Internet: <http://www.ni.com/matrixx/what_is_matrixx.
htm> Date Visited: 16-04-2005
[Niemann, 1998] Niemann, R., 1998: Hardware/Software Co-design for Data
Flow Dominated Embedded Systems. Kluwer Academic Publishers.
[Nilsson et al, 1997] Nilsson, R., O. Edfors, M. Sandell, P. O. Börjesson, 1997:
An Analysis of Two-Dimensional Pilot-Symbol Assisted Modulation for
OFDM. IEEE International Conference on Personal Wireless Communica-
tions, December 1997. pp. 71-74.
[Orfanidis, 1996] Orfanidis, S. J., 1996: Introduction to Signal Processing.
Prentice Hall.
[Ouaiss et al, 1997] Ouaiss, I., S. Govindarajan, V. Srinivasan, M. Kaul,
R. Vemuri, 1997: An Integrated Partitioning and Synthesis System for Dynamically
Reconfigurable Multi-FPGA Architectures. DDEL, University of Cincinnati.
[Øien, 2003] Øien, G. E., 2003: Modelling and Analysis of Wireless Fading
Channels. Internet: <http://www.inf.fu-berlin.de/inst/ag-tech/resources/material/>
Date Visited: 20-02-2005
[Özer et al, 2003] Özer, E., A. P. Nisbet, D. Gregg, 2003: Classification of
Compiler Optimizations for High Performance, Small Area and Low Power in
FPGAs. Technical Report, Department of Computer Science, Trinity College,
Dublin.
[Page, 2002 (1)] Page, I., 2002: Computing Without Computers
Internet: <http://www.doc.ic.ac.uk/~ipage/cwoc.html>
Date Visited: 26-04-2005
[Page, 2002 (2)] Page, I., 2002: The Handel Methodology
Internet: <http://www.doc.ic.ac.uk/~ipage/handel_
methodology.html> Date Visited: 26-04-2005
[Pop, Beaulieu, 2002] Pop, M. F., N. C. Beaulieu, 2001: Limitations of
Sum-of-Sinusoids Fading Channel Simulators. IEEE Transactions on
Communications, Vol. 49, No. 4, April 2001. pp. 699-708.
[Proakis, 2001] Proakis, J. G., 2001: Digital Communications. McGraw-Hill.
[Proakis, Salehi, 2002] Proakis, J. G., Salehi M., 2002: Communication System
Engineering. Prentice Hall.
[Pätzold, Laue, 1998] Pätzold, M., F. Laue, 1998: Statistical Properties of Jakes'
Fading Channel Simulator. IEEE Vehicular Technology Conference, Vol. 2,
May 1998. pp. 712-718.
[Rappaport, 1996] Rappaport, T. S., 1996: Wireless Communications. Prentice
Hall.
[Reeves, 1995] Reeves, C. R., 1995: Modern Heuristic Techniques for Combina-
torial Problems. McGraw-Hill.
[Sadowsky, Kafedziski, 1998] Sadowsky, J. S., V. Kafedziski, 1998: On the
Correlation and Scattering Functions of the WSSUS Channel for Mobile
Communications. IEEE Transactions on Vehicular Technology, Vol. 7, No. 1,
February 1998. pp. 270-282.
[Sakthivel, 2004] Sakthivel A., 2004: Extreme Programming (XP) - An Overview
Internet: <http://www.c-sharpcorner.com/Code/2004/Sept/
ExtremeProgXP.asp> Date Visited: 10-12-2004
[Sandell, Edfors, 1996] Sandell M., O. Edfors, 1996: A Comparative Study
of Pilot-Based Channel Estimators for Wireless OFDM. Research Report
TULEA 1996:19, Div. of Signal Processing, Luleå University of Technology.
[Sanghamitra, Prith, 2004] Roy, S., P. Banerjee, 2004: An Algorithm for
Converting Floating-Point Computations to Fixed-Point in MATLAB based FPGA
design. Proceedings of 41st Design Automation Conference, June 2004.
pp. 484-487.
[Schmidt, 2005] Schmidt, C. F., 2005: Exhaustive Search Techniques
Internet: <http://www.rci.rutgers.edu/~cfs/305_html/
Computation/ExhaustiveSearch_305.html> Date Visited:
22-04-2005
[Shanmugan, 1988] Shanmugan, K. S., 1988: Random Signals: Detection, Esti-
mation and Data Analysis. John Wiley and Sons Ltd.
[Simeone et al, 2004] Simeone, O., Y. Bar-Ness, U. Spagnolini, 2004: Pilot-
Based Channel Estimation for OFDM Systems by Tracking the Delay-
Subspace. IEEE Transactions on Wireless Communications, Vol. 3, No. 1,
January 2004. pp. 315-325.
[Sklar, 2001] Sklar, B., 2001: Digital Communications, Second Edition. Prentice
Hall.
[Sramek, 2003] Sramek, C., 2003: Cyclic Prefix
Internet: <http://cnx.rice.edu/content/m11762/latest/>
Date Visited: 19-12-2004
[Synopsys, 2005] Synopsys, 2005: COSSAP
Internet: <http://www.synopsys.com/products/success/
hyundai_ss.html> Date Visited: 20-04-2005
[Telecom Glossary, 2000] Telecom Glossary 2K,
Internet: <http://www.atis.org/tg2k/> Date Visited: 05-12-2004
[Traquair, 2005] Traquair, 2005: Estimating FPGA Requirements for DSP
Applications.
Internet: <http://www.traquair.com/technology/fpga.
dspest.html> Date Visited: 06-05-2005
[Tsai, Chiueh, 2004] Tsai, P. Y., T. D. Chiueh, 2004: Frequency-Domain
Interpolation-Based Channel Estimation in Pilot-Aided OFDM Systems.
IEEE 59th Vehicular Technology Conference, Vol. 1, Spring 2004.
pp. 420-424.
[Veiverys et al., 2004] Veiverys, A., E. Jatkonis, M. urnait e, A. Rashid, 2004:
Analysis of Adaptive Car Noise Cancellation Algorithms and Hardware
Architecture Modelling on Ptolemy II. Aalborg University.
[Walker, 1991] Walker, R. A., R. Camposano, 1991: A Survey of High-Level
Synthesis Systems. Kluwer Academic Publishers.
[Weinstein, Ebert, 1971] Weinstein, S.B., P.M. Ebert, 1971: Data Transmission
by Frequency-Division-Multiplexing Using the Discrete Fourier Transform.
IEEE Transactions on Communications, Vol. 19, No. 5, October 1971. pp.
628-634.
[Wells, 2004] Wells, D., 2004: Extreme Programming
Internet: <http://www.extremeprogramming.org> Date Visited:
07-12-2004
[Wikipedia, 2004] Wikipedia, 2004: Extreme Programming
Internet: <http://en.wikipedia.org/> Date Visited: 08-12-2004
[Wolf, 1994] Wolf, W. H., 1994: Hardware-Software Co-Design of Embedded
Systems. Proceedings of the IEEE, Vol. 82, No. 7, July 1994. pp. 967-989.
[Xcell, 2005] Xcell Journal Online,
Internet: <http://www.xilinx.com/publications/
xcellonline/partners/xc_celoxica44.htm> Date Visited:
06-05-2005
[Xiao et al, 2002] Xiao, C., Y. R. Zheng, N. C. Beaulieu, 2002: Second-Order
Statistical Properties of the WSS Jakes' Fading Channel Simulator. IEEE
Transactions on Communications, Vol. 50, No. 6, June 2002. pp. 888-891.
[Xilinx, 2005] Virtex-II Platform FPGA User Guide,
Internet: <http://www.xilinx.com/bvdocs/userguides/
ug002.pdf> Date Visited: 01-05-2005
[Yaghoobi, 2004] Yaghoobi, H., 2004: Scalable OFDMA Physical Layer in IEEE
802.16 WirelessMAN. Intel Technology Journal, Vol. 8, No.3, August 2004.
pp. 201-212.
[Yoon et al, 2002] Yoon, P. K., P. H. Kar-Ming, N. C. Sum, 2002: Channel
Estimation for Mobile OFDM System with different Detectors under Time-Varying
Rayleigh Fading Channel. The 8th International Conference on Communication
Systems, Vol. 1, November 2002. pp. 294-298.
APPENDIX A
LIST OF SYMBOLS
A - path attenuation (pp. 58).
a - transmitted data symbol vector (pp. 16).
a - element of transmitted data symbol vector (pp. 16).
b - serial data input vector (pp. 15).
b - serial data input vector element (pp. 125).
b - bit (pp. 36).
C - estimation matrix for symbol retrieval (pp. 45).
D_f - pilot spacing (pp. 66).
d - encoded data symbol vector (pp. 15).
d - encoded data symbol (pp. 15).
E - expectation (pp. 51).
F_0 - total wideband frequency width or OFDM sampling frequency, Hz (pp. 12).
F_dec - decision function (pp. 30).
F_demap - symbol demapping function (pp. 30).
F_eq - channel estimation function (pp. 30).
F_map - symbol mapping function (pp. 15).
f - sub-carrier frequency, Hz (pp. 12).
f_d - maximum Doppler frequency occurring when = 0, Hz (pp. 40).
f_s - OFDM sampling frequency, Hz (pp. 55).
G_cp - cyclic prefix insertion matrix (pp. 24).
G_cp - cyclic prefix removal matrix (pp. 29).
g - normalized low-pass process of Jakes model (pp. 51).
g - complex envelope of the reference or ideal process (pp. 51).
g_1 - in-phase component of Jakes model (pp. 52).
g_1 - in-phase component of the reference model (pp. 51).
g_2 - quadrature component of Jakes model (pp. 52).
g_2 - quadrature component of the reference model (pp. 51).
H - channel impulse response matrix (pp. 43).
H - transfer function (pp. 126).
τ_max - maximum excess delay (pp. 39).
- initial phase associated with a propagation path (random variable,
uniformly distributed over [, ]) (pp. 51).
- power spectral density (pp. 35).
- orthogonality matrix (pp. 22).
_cp - orthogonality matrix with cyclic prefix (pp. 25).
^H_cp - orthogonality matrix for DFT and cyclic prefix removal (pp. 30).
- element of orthogonality matrix (pp. 19).
- orthogonality vector (pp. 129).
- angular frequency (pp. 26).
APPENDIX B
LIST OF ABBREVIATIONS
A^3 - Application Algorithm Architecture
AAU - Aalborg University
ADSL - Asymmetric Digital Subscriber Line
ALU - Arithmetic Logic Unit
ASIC - Application-Specific Integrated Circuit
ASIP - Application-Specific Instruction-Set Processor
ASPI - Applied Signal Processing and Implementation
AWGN - Additive White Gaussian Noise
BER - Bit Error Rate
BPSK - Binary Phase Shift Keying
CIR - Channel Impulse Response
CISS - Center for Embedded Software Systems
CLB - Configurable Logic Block
CP - Cyclic Prefix
CPU - Central Processing Unit
CSys - Cellular Systems Division, AAU
DAB - Digital Audio Broadcast
DAPG - Directed Acyclic Precedence Graph
DFG - Data Flow Graph
DFT - Discrete Fourier Transform
DK - Celoxica DK Design Suite
DSE - Design Space Exploration
DSM - Data Stream Manager
DSP - Digital Signal Processing(-or)
DT - Design Trotter
DVB - Digital Video Broadcast
EDIF - Electronic Design Interchange Format
ETSI - European Telecommunications Standards Institute
FDM - Frequency Division Multiplexing
FFT - Fast Fourier Transform
FIR - Finite Impulse Response
FPGA - Field-Programmable Gate Array
GCC - GNU Compiler Collection
GPU - Graphics Processing Unit
HCDFG - Hierarchical Control and Data Flow Graph
HW - Hardware
ICI - Inter-Carrier Interference
IDE - Integrated Development Environment
IDFT - Inverse Discrete Fourier Transform
IEEE - Institute of Electrical and Electronics Engineers
IFFT - Inverse Fast Fourier Transform
ILP - Integer Linear Programming
IP - Intellectual Property
ISI - Inter-Symbol Interference
ITU - International Telecommunication Union
LMMSE - Linear Minimum Mean-Squared Error
LMS - Least Mean Square
LS - Least Squares
LUT - Look-Up Table
MAC - Media Access Control
MMSE - Minimum Mean-Squared Error
NAND - Not AND
NP - Nondeterministic Polynomial
NRE - Non-Recurring Engineering
OFDM - Orthogonal Frequency Division Multiplexing
OSI - Open Systems Interconnection
PAL - Platform Abstraction Layer
PCI - Peripheral Component Interconnect
PDF - Probability Density Function
PE - Processing Element
PSAM - Pilot Symbol Assisted Modulation
QAM - Quadrature Amplitude Modulation
QPSK - Quadrature Phase Shift Keying
RAM - Random Access Memory
RC203 - Celoxica RC203 FPGA Development Board
SNR - Signal-to-Noise Ratio
SVD - Singular Value Decomposition
SW - Software
TCP/IP - Transmission Control Protocol and Internet Protocol
VHDL - Very High Speed Integrated Circuit Hardware Description Language
WiMAX - IEEE 802.16 standard
WSS - Wide Sense Stationary
WSSUS - Wide Sense Stationary Uncorrelated Scattering
XP - Extreme Programming
APPENDIX C
SEQUENTIAL HANDEL-C CODE
C.1 Source Code
#include "fixed.hch"
#include <stdlib.hch>
set clock = external "P1";
#define nf 15 // fraction bits
#define ni 4 // integer bits
#define bn1 23 // bit indices for fixed point data reading/writing
#define bn2 20
#define bn3 19
#define bn4 5
#define jakesM 20 // Jakes parameters
#define jakesM1 21
#define jakesN 82
#define numPaths 6
#define max_delay 50
macro expr REAL(a) = a[0];
macro expr IMAG(a) = a[1];
typedef FIXED_SIGNED(ni, nf) TFixed;
typedef TFixed TComplex[2];
typedef unsigned int 4 TPathNumberIndex;
typedef unsigned int 6 TPathIndex;
typedef unsigned int 5 TJakesIndex;
TFixed PI2 = FixedLiteral(FIXED_ISSIGNED, ni, nf, 6.283172607421875);
TFixed PIby2 = FixedLiteral(FIXED_ISSIGNED, ni, nf, 1.57080078125);
TPathIndex delays[numPaths] = {0, 6, 14, 22, 35, 50};
TPathIndex index[numPaths];
TFixed gains[numPaths];
#define sinTableSize 25735
#define sinTableIndexBits 14
typedef FIXED_SIGNED(1, 13) TTableItem;
unsigned 8 out;
interface bus_clock_in(unsigned 8 i) pi() with
{data = {"N1", "N3", "N2", "M4", "M3", "M2", "M1", "L4"}};
interface bus_out() po(out) with
{data = {"H1", "H2", "H3", "H4", "G1", "G2", "G3", "G4"}};
interface bus_clock_in(unsigned 1 P) sin_int() with
{data = {"F5"}};
interface bus_clock_in(unsigned 13 K) sin_frac()with
{data = {"L3", "L2", "L1", "L5", "K5", "K4", "K3",
"K2", "K1", "J4", "J3", "J2", "J1"}};
ram TComplex buffer[max_delay+1];
ram TFixed w[jakesM+1];
ram TFixed dw[jakesM+1];
ram TFixed p1[jakesM+1][numPaths];
ram TFixed p2[jakesM+1][numPaths];
ram TFixed p3[jakesM+1][numPaths];
ram TFixed p4[jakesM+1][numPaths];
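// Sine of a fixed-point argument. The result is read from the sin_int
// (one integer/sign bit) and sin_frac (13 fraction bits) input buses.
TFixed Sin(TFixed inval)
{
    unsigned int (ni+nf) v;
    unsigned int sinTableIndexBits ind;
    TTableItem sinres;
    TFixed res;
    unsigned int 1 t;

    // pack the integer and fraction bits of the argument into one word
    v = (unsigned int)inval.FixedIntBits @ (unsigned int)inval.FixedFracBits;
    ind = v[16:3];  // 14-bit index taken from bits 16..3 of the argument
    t = sin_int.P;
    // replicate the single integer bit to fill the ni-bit integer part
    res.FixedIntBits = (signed int ni) (t[0] @ t[0] @ t[0] @ t);
    // append zero low-order bits to the 13-bit fraction read from the bus
    res.FixedFracBits = (signed int nf) (sin_frac.K @ 0);
    return res;
}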
// Cosine computed from the sine routine: cos(x) = sin(pi/2 - x)
TFixed Cos(TFixed inval)
{
    TFixed in2;

    in2 = FixedSub(PIby2, inval);
    return Sin(in2);
}
// Next index in the circular delay buffer; wraps to 0 after max_delay
TPathIndex CircIndex(TPathIndex a)
{
    if (a == max_delay)
        return 0;
    else
        return a+1;
}
void InitJakes(void)
{
    // initialize gains
    gains[0] = FixedLiteral(FIXED_ISSIGNED, ni, nf, 0.484375);
    gains[1] = FixedLiteral(FIXED_ISSIGNED, ni, nf, 0.38671875);
    gains[2] = FixedLiteral(FIXED_ISSIGNED, ni, nf, 0.0625);
    gains[3] = FixedLiteral(FIXED_ISSIGNED, ni, nf, 0.046875);
    gains[4] = FixedLiteral(FIXED_ISSIGNED, ni, nf, 0.015625);
    gains[5] = FixedLiteral(FIXED_ISSIGNED, ni, nf, 0.00390625);

    // initial buffer index of each path, offset by that path's delay
    // within the (max_delay + 1)-entry circular buffer
    index[0] = 0; //assuming that first delay is 0
    index[1] = max_delay + 1 - delays[1];
    index[2] = max_delay + 1 - delays[2];
    index[3] = max_delay + 1 - delays[3];
    index[4] = max_delay + 1 - delays[4];
    index[5] = max_delay + 1 - delays[5];
}
//==============================================================================
// returns one complex Jakes simulator waveform value for one path.
// Time is contained in vector w.
void JSample(TPathNumberIndex path, TComplex *res)
{
    TJakesIndex n;
    TFixed coef;
    TFixed t1, t3, t4, t5, t6, t7, t8;

    // clear the accumulator
    REAL(*res) = FixedLiteral(FIXED_ISSIGNED, ni, nf, 0);
    IMAG(*res) = FixedLiteral(FIXED_ISSIGNED, ni, nf, 0);
    n = 0;
    // sum the contributions of the jakesM + 1 oscillators
    while (n != jakesM1)
    {
        t1 = w[n];
        t3 = Cos(t1);
        t4 = Sin(t1);
        // path<-3 keeps the 3 least significant bits of the path number
        t5 = FixedMultSigned(t3, p1[n][path<-3]);
        t6 = FixedMultSigned(t4, p2[n][path<-3]);
        t7 = FixedMultSigned(t4, p3[n][path<-3]);
        t8 = FixedMultSigned(t3, p4[n][path<-3]);
        t3 = FixedSub(t5, t6);  // real part: cos*p1 - sin*p2
        t4 = FixedAdd(t7, t8);  // imaginary part: sin*p3 + cos*p4
        REAL(*res) = FixedAdd(REAL(*res), t3);
        IMAG(*res) = FixedAdd(IMAG(*res), t4);
        n++;
    }
}
//==============================================================================
// complex multiplication operation
void Cmul(TComplex *a, TComplex *b, TComplex *res)
{
// (x + yi)(u + vi) = (xu