
ADAPTIVE WIENER FILTERING APPROACH FOR SPEECH ENHANCEMENT

M. A. Abd El-Fattah*, M. I. Dessouky, S. M. Diab and F. E. Abd El-samie#

Department of Electronics and Electrical Communications, Faculty of Electronic Engineering
Menoufia University, Menouf, Egypt
E-mails: * maro_zizo2010@yahoo.com, # fathi_sayed@yahoo.com

ABSTRACT
This paper proposes the application of the Wiener filter in an adaptive manner in
speech enhancement. The proposed adaptive Wiener filter adapts the filter transfer
function from sample to sample based on the speech signal statistics (mean and
variance). The adaptive Wiener filter is implemented in the time domain rather than in
the frequency domain to accommodate the varying nature of the speech signal. The
proposed method is compared to the traditional Wiener filter and spectral subtraction
methods, and the results reveal its superiority.

Keywords: Speech Enhancement, Spectral Subtraction, Adaptive Wiener Filter

1 INTRODUCTION

Speech enhancement is one of the most important topics in speech signal processing. Several techniques have been proposed for this purpose, such as the spectral subtraction approach, the signal subspace approach, adaptive noise canceling and the iterative Wiener filter [1-5]. The performance of these techniques is judged by the quality and intelligibility of the processed speech signal. The improvement of the speech signal-to-noise ratio (SNR) is the target of most techniques.

Spectral subtraction is the earliest method for enhancing speech degraded by additive noise [1]. This technique estimates the spectrum of the clean (noise-free) signal by subtracting the estimated noise magnitude spectrum from the noisy signal magnitude spectrum while keeping the phase spectrum of the noisy signal. The drawback of this technique is the residual noise.

Another technique is the signal subspace approach [3]. It is used for enhancing a speech signal degraded by uncorrelated additive noise or colored noise [6,7]. The idea of this algorithm is based on the fact that the vector space of the noisy signal can be decomposed into a signal plus noise subspace and an orthogonal noise subspace. Processing is performed on the vectors in the signal plus noise subspace only, while the noise subspace is removed first. Decomposition of the vector space of the noisy signal is performed by applying an eigenvalue or singular value decomposition or by applying the Karhunen-Loeve transform (KLT) [8]. Mittal and Phamdo have proposed the signal/noise KLT based approach for colored noise removal [9]. The idea of this approach is that noisy speech frames are classified into speech-dominated frames and noise-dominated frames. In the speech-dominated frames, the signal KLT matrix is used, and in the noise-dominated frames, the noise KLT matrix is used.

In this paper, we present a new technique to improve the signal-to-noise ratio in the enhanced speech signal by using an adaptive implementation of the Wiener filter. This implementation is performed in the time domain to accommodate the varying nature of the signal.

The paper is organized as follows: in Section 2, a review of the spectral subtraction technique is presented. In Section 3, the traditional Wiener filter in the frequency domain is revisited. Section 4 proposes the adaptive Wiener filtering approach for speech enhancement. In Section 5, a comparative study between the proposed adaptive Wiener filter, the Wiener filter in the frequency domain and the spectral subtraction approach is presented.

UbiCC Journal - Volume 3 1


2 SPECTRAL SUBTRACTION

Spectral subtraction can be categorized as a non-parametric approach, which simply needs an estimate of the noise spectrum. It is assumed that an estimate of the noise spectrum is available, typically obtained during periods of speaker silence. Let x(n) be a noisy speech signal:

$$x(n) = s(n) + v(n) \qquad (1)$$

where s(n) is the clean (noise-free) signal and v(n) is white Gaussian noise. Assume that the noise and the clean signal are uncorrelated. The spectral subtraction approach estimates the short-term magnitude spectrum of the noise-free signal, $|S(\omega)|$, by subtracting the estimated noise magnitude spectrum $|\hat{V}(\omega)|$ from the noisy signal magnitude spectrum $|X(\omega)|$. It is sufficient to use the noisy signal phase spectrum as an estimate of the clean speech phase spectrum [10]:

$$\hat{S}(\omega) = \left( |X(\omega)| - |\hat{V}(\omega)| \right) \exp\left( j \angle X(\omega) \right) \qquad (2)$$

The estimated time-domain speech signal is obtained as the inverse Fourier transform of $\hat{S}(\omega)$.

Another way to recover a clean signal s(n) from the noisy signal x(n) using the spectral subtraction approach is to assume that there is an estimate of the power spectrum of the noise, $\hat{P}_v(\omega)$, obtained by averaging over multiple frames of a known noise segment. An estimate of the clean signal short-time squared magnitude spectrum can then be obtained as follows [8]:

$$|\hat{S}(\omega)|^2 = \begin{cases} |X(\omega)|^2 - \hat{P}_v(\omega), & \text{if } |X(\omega)|^2 - \hat{P}_v(\omega) \ge 0 \\ 0, & \text{otherwise} \end{cases} \qquad (3)$$

It is possible to combine this magnitude spectrum estimate with the measured phase and obtain the Short Time Fourier Transform (STFT) estimate as follows:

$$\hat{S}(\omega) = |\hat{S}(\omega)| \, e^{j \angle X(\omega)} \qquad (4)$$

A noise-free signal estimate can then be obtained with the inverse Fourier transform. This noise reduction method is a specific case of the general technique given by Weiss et al. and extended by Berouti et al. [2,12].

The spectral subtraction approach can be viewed as a filtering operation where high SNR regions of the measured spectrum are attenuated less than low SNR regions. This formulation can be given in terms of the SNR defined as:

$$\mathrm{SNR} = \frac{|X(\omega)|^2}{\hat{P}_v(\omega)} \qquad (5)$$

Thus, equation (3) can be rewritten as:

$$|\hat{S}(\omega)|^2 = |X(\omega)|^2 - \hat{P}_v(\omega) \approx |X(\omega)|^2 \left[ 1 + \frac{1}{\mathrm{SNR}} \right]^{-1} \qquad (6)$$

An important property of noise suppression using spectral subtraction is that the attenuation characteristics change with the length of the analysis window. A common problem with spectral subtraction is the musical noise that results from the rapid coming and going of waves over successive frames [13].

3 WIENER FILTER IN FREQUENCY DOMAIN

The Wiener filter is a popular technique that has been used in many signal enhancement methods. The basic principle of the Wiener filter is to obtain a clean signal from one corrupted by additive noise. It is required to estimate an optimal filter for the noisy input speech by minimizing the Mean Square Error (MSE) between the desired signal s(n) and the estimated signal $\hat{s}(n)$. The frequency domain solution to this optimization problem is given by [13]:

$$H(\omega) = \frac{P_s(\omega)}{P_s(\omega) + P_v(\omega)} \qquad (7)$$

where $P_s(\omega)$ and $P_v(\omega)$ are the power spectral densities of the clean and the noise signals, respectively. This formula can be derived considering the signal s and the noise signal v as



uncorrelated and stationary signals. The signal-to-noise ratio is defined by [13]:

$$\mathrm{SNR} = \frac{P_s(\omega)}{\hat{P}_v(\omega)} \qquad (8)$$

This definition can be incorporated into the Wiener filter equation as follows:

$$H(\omega) = \left[ 1 + \frac{1}{\mathrm{SNR}} \right]^{-1} \qquad (9)$$

The drawback of the Wiener filter is the fixed frequency response at all frequencies and the requirement to estimate the power spectral density of the clean signal and noise prior to filtering.

4 THE PROPOSED ADAPTIVE WIENER FILTER

This section presents an adaptive implementation of the Wiener filter which benefits from the varying local statistics of the speech signal. A block diagram of the proposed approach is illustrated in Fig. (1). In this approach, the estimated speech signal mean $m_x$ and variance $\sigma_x^2$ are exploited.

[Block diagram: degraded speech x(n) → space-variant h(n) → enhanced speech signal $\hat{s}(n)$; further blocks labelled "Measure of local speech statistics" and "A priori knowledge".]

Figure 1: Typical adaptive speech enhancement system for additive noise reduction

It is assumed that the additive noise v(n) has zero mean and is white with variance $\sigma_v^2$. Thus, the power spectrum $P_v(\omega)$ can be approximated by:

$$P_v(\omega) = \sigma_v^2 \qquad (10)$$

Consider a small segment of the speech signal in which the signal x(n) is assumed to be stationary. The signal x(n) can be modeled by:

$$x(n) = m_x + \sigma_x w(n) \qquad (11)$$

where $m_x$ and $\sigma_x$ are the local mean and standard deviation of x(n), and w(n) is a unit variance noise.

Within this small segment of speech, the Wiener filter transfer function can be approximated by:

$$H(\omega) = \frac{P_s(\omega)}{P_s(\omega) + P_v(\omega)} = \frac{\sigma_s^2}{\sigma_s^2 + \sigma_v^2} \qquad (12)$$

From Eq. (12), because $H(\omega)$ is constant over the small segment of speech, the impulse response of the Wiener filter can be obtained by:

$$h(n) = \frac{\sigma_s^2}{\sigma_s^2 + \sigma_v^2} \, \delta(n) \qquad (13)$$

From Eq. (13), the enhanced speech $\hat{s}(n)$ within this local segment can be expressed as:

$$\hat{s}(n) = m_x + \left( x(n) - m_x \right) \ast \frac{\sigma_s^2}{\sigma_s^2 + \sigma_v^2} \, \delta(n) = m_x + \frac{\sigma_s^2}{\sigma_s^2 + \sigma_v^2} \left( x(n) - m_x \right) \qquad (14)$$

If it is assumed that $m_x$ and $\sigma_s^2$ are updated at each sample, we can say:

$$\hat{s}(n) = m_x(n) + \frac{\sigma_s^2(n)}{\sigma_s^2(n) + \sigma_v^2} \left( x(n) - m_x(n) \right) \qquad (15)$$

In Eq. (15), the local mean $m_x(n)$ and the deviation $(x(n) - m_x(n))$ are modified separately from segment to segment and then the results are combined. If $\sigma_s^2$ is much larger than $\sigma_v^2$, the output signal $\hat{s}(n)$ is assumed to be primarily due to x(n) and the input signal x(n) is not attenuated. If $\sigma_s^2$ is smaller than $\sigma_v^2$, the filtering effect is performed.
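As a rough numerical illustration of Eq. (15), the following Python sketch evaluates the local gain for a given pair of variances. The function names `local_wiener_gain` and `enhance_sample` are hypothetical and do not come from the original work; returning a zero gain when both variances are zero is likewise an assumption made to avoid a division by zero.

```python
def local_wiener_gain(var_s, var_v):
    """Gain sigma_s^2 / (sigma_s^2 + sigma_v^2) that multiplies (x(n) - m_x(n)) in Eq. (15)."""
    if var_s < 0 or var_v < 0:
        raise ValueError("variances must be non-negative")
    total = var_s + var_v
    # Assumption: with no signal and no noise power, pass only the local mean.
    return 0.0 if total == 0 else var_s / total


def enhance_sample(x_n, mean_x, var_s, var_v):
    """One output sample of Eq. (15): local mean plus the scaled deviation from it."""
    return mean_x + local_wiener_gain(var_s, var_v) * (x_n - mean_x)


# Speech-dominated sample: gain close to 1, so x(n) is almost unchanged.
high_snr = enhance_sample(2.0, 1.0, var_s=100.0, var_v=1.0)
# Noise-dominated sample: gain 0, so the output falls back to the local mean.
low_snr = enhance_sample(2.0, 1.0, var_s=0.0, var_v=1.0)
```

The two calls mirror the limiting cases discussed in the text: a large signal variance leaves the sample nearly untouched, while a signal variance of zero replaces it by the local mean.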



Notice that $m_x$ is identical to $m_s$ when $m_v$ is zero. So, we can estimate $m_x(n)$ in Eq. (15) from x(n) by:

$$\hat{m}_s(n) = \hat{m}_x(n) = \frac{1}{2M+1} \sum_{k=n-M}^{n+M} x(k) \qquad (16)$$

where $(2M+1)$ is the number of samples in the short segment used in the estimation.

To measure the local signal statistics in the system of Figure 1, the algorithm developed uses the signal variance $\sigma_s^2$. The specific method used to design the space-variant h(n) is given by Eq. (17.b). Since $\sigma_x^2 = \sigma_s^2 + \sigma_v^2$, the signal variance may be estimated from x(n) by:

$$\hat{\sigma}_s^2(n) = \begin{cases} \hat{\sigma}_x^2(n) - \hat{\sigma}_v^2, & \text{if } \hat{\sigma}_x^2(n) > \hat{\sigma}_v^2 \\ 0, & \text{otherwise} \end{cases} \qquad (17.a)$$

where

$$\hat{\sigma}_x^2(n) = \frac{1}{2M+1} \sum_{k=n-M}^{n+M} \left( x(k) - \hat{m}_x(n) \right)^2 \qquad (17.b)$$

By this proposed method, we guarantee that the filter transfer function is adapted from sample to sample based on the speech signal statistics.

5 EXPERIMENTAL RESULTS

For evaluation purposes, we use different speech signals such as the Handel, Laughter and Gong signals. White Gaussian noise is added to each speech signal with different SNRs. The different speech enhancement algorithms, namely the spectral subtraction method, the Wiener filter in the frequency domain and the proposed adaptive Wiener filter, are carried out on the noisy speech signals. The peak signal-to-noise ratio (PSNR) results for each enhancement algorithm are compared.

In the first experiment, all the above-mentioned algorithms are carried out on the Handel signal with different SNRs and the output PSNR results are shown in Fig. (2). The same experiment is repeated for the Laughter and Gong signals and the results are shown in Figs. (3) and (4), respectively.

From these figures, it is clear that the proposed adaptive Wiener filter approach has the best performance for different SNRs. The adaptive Wiener filter approach gives about 3-5 dB improvement at different values of SNR. The non-linearity between input SNR and output PSNR is due to the adaptive nature of the filter.

[Plot: output PSNR (dB) versus input SNR (dB) for spectral subtraction, the Wiener filter and the adaptive Wiener filter.]

Figure 2: PSNR results for the white noise case at -10 dB to +35 dB SNR levels for the Handel signal
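The estimator of Eqs. (15) to (17) and a PSNR comparison of the kind described in this section can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the code used for the reported experiments: the names `adaptive_wiener` and `psnr`, the edge padding at the signal borders, the known noise variance, the synthetic test tone and the PSNR definition (peak taken as the largest absolute clean sample) are all assumptions.

```python
import numpy as np


def adaptive_wiener(x, noise_var, M=3):
    """Sample-by-sample Wiener filter following Eqs. (15)-(17)."""
    if noise_var <= 0:
        raise ValueError("noise_var must be positive")
    x = np.asarray(x, dtype=float)
    win = 2 * M + 1
    # Assumption: repeat the border samples so every n has a full window.
    padded = np.pad(x, M, mode="edge")
    kernel = np.ones(win) / win
    # Eq. (16): local mean over the 2M+1 samples centred on n.
    mean_x = np.convolve(padded, kernel, mode="valid")
    # Eq. (17.b): local variance about the local mean, computed as
    # mean(x^2) - mean(x)^2 over the same window (clipped against round-off).
    mean_x2 = np.convolve(padded ** 2, kernel, mode="valid")
    var_x = np.maximum(mean_x2 - mean_x ** 2, 0.0)
    # Eq. (17.a): signal variance, floored at zero.
    var_s = np.maximum(var_x - noise_var, 0.0)
    # Eq. (15): local mean plus the gain-weighted deviation.
    gain = var_s / (var_s + noise_var)
    return mean_x + gain * (x - mean_x)


def psnr(clean, estimate):
    """PSNR in dB; the peak is assumed to be max |clean| (the paper gives no definition)."""
    clean = np.asarray(clean, dtype=float)
    err = clean - np.asarray(estimate, dtype=float)
    peak = np.max(np.abs(clean))
    return 10.0 * np.log10(peak ** 2 / np.mean(err ** 2))


# Synthetic stand-in for a speech signal: a slow tone in white Gaussian noise.
rng = np.random.default_rng(0)
n = np.arange(8000)
clean = np.sin(2 * np.pi * 5 * n / 8000)
noise_std = 0.3
noisy = clean + rng.normal(0.0, noise_std, size=clean.shape)
enhanced = adaptive_wiener(noisy, noise_var=noise_std ** 2, M=3)
print("PSNR noisy: %.2f dB, enhanced: %.2f dB" % (psnr(clean, noisy), psnr(clean, enhanced)))
```

In practice the noise variance would have to be estimated, for example from a silent segment, and the window half-length M trades noise smoothing against tracking of fast changes in the signal.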



[Plot: output PSNR (dB) versus input SNR (dB) for spectral subtraction, the Wiener filter and the adaptive Wiener filter.]

Figure 3: PSNR results for the white noise case at -10 dB to +35 dB SNR levels for the Laughter signal

[Plot: output PSNR (dB) versus input SNR (dB) for spectral subtraction, the Wiener filter and the adaptive Wiener filter.]

Figure 4: PSNR results for the white noise case at -10 dB to +35 dB SNR levels for the Gong signal

The results of the different enhancement algorithms for the Handel signal with SNRs of 5, 10, 15 and 20 dB in both the time and frequency domains are given in Figs. (5) to (12). These results reveal that the best performance is that of the proposed adaptive Wiener filter.

[Plots: amplitude versus time (msec), panels (a) to (e).]

Figure 5: Time domain results of the Handel signal at SNR = +5 dB: (a) original signal, (b) noisy signal, (c) spectral subtraction, (d) Wiener filtering, (e) adaptive Wiener filtering.

[Plots: amplitude (dB) versus frequency (Hz), panels (a) to (e).]

Figure 6: The spectrum of the Handel signal in Fig. (5): (a) original signal, (b) noisy signal, (c) spectral subtraction, (d) Wiener filtering, (e) adaptive Wiener filtering.



[Plots: amplitude versus time (msec), panels (a) to (e).]

Figure 7: Time domain results of the Handel signal at SNR = 10 dB: (a) original signal, (b) noisy signal, (c) spectral subtraction, (d) Wiener filtering, (e) adaptive Wiener filtering.

[Plots: amplitude (dB) versus frequency (Hz), panels (a) to (e).]

Figure 8: The spectrum of the Handel signal in Fig. (7): (a) original signal, (b) noisy signal, (c) spectral subtraction, (d) Wiener filtering, (e) adaptive Wiener filtering.

[Plots: amplitude versus time (msec), panels (a) to (e).]

Figure 9: Time domain results of the Handel signal at SNR = 15 dB: (a) original signal, (b) noisy signal, (c) spectral subtraction, (d) Wiener filtering, (e) adaptive Wiener filtering.

[Plots: amplitude (dB) versus frequency (Hz), panels (a) to (e).]

Figure 10: The spectrum of the Handel signal in Fig. (9): (a) original signal, (b) noisy signal, (c) spectral subtraction, (d) Wiener filtering, (e) adaptive Wiener filtering.



[Plots: amplitude versus time (msec), panels (a) to (e).]

Figure 11: Time domain results of the Handel signal at SNR = 20 dB: (a) original signal, (b) noisy signal, (c) spectral subtraction, (d) Wiener filtering, (e) adaptive Wiener filtering.

[Plots: amplitude (dB) versus frequency (Hz), panels (a) to (e).]

Figure 12: The spectrum of the Handel signal in Fig. (11): (a) original signal, (b) noisy signal, (c) spectral subtraction, (d) Wiener filtering, (e) adaptive Wiener filtering.

6 CONCLUSION

An adaptive Wiener filter approach for speech enhancement is proposed in this paper. This approach depends on the adaptation of the filter transfer function from sample to sample based on the speech signal statistics (mean and variance). The results indicate that the proposed approach provides a better SNR improvement than both the spectral subtraction approach and the traditional Wiener filter approach in the frequency domain. The results also indicate that the proposed approach can treat musical noise better than the spectral subtraction approach and that it can avoid the drawbacks of the Wiener filter in the frequency domain.

REFERENCES

[1] S. F. Boll: Suppression of acoustic noise in speech using spectral subtraction, IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-27, pp. 113-120 (1979).
[2] M. Berouti, R. Schwartz, and J. Makhoul: Enhancement of speech corrupted by acoustic noise, Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, pp. 208-211 (1979).
[3] Y. Ephraim and H. L. Van Trees: A signal subspace approach for speech enhancement, in Proc. International Conference on Acoustic, Speech and Signal Processing, vol. II, Detroit, MI, U.S.A., pp. 355-358, May (1993).
[4] Simon Haykin: Adaptive Filter Theory, Prentice-Hall, ISBN 0-13-322760-X (1996).
[5] J. S. Lim and A. V. Oppenheim: All-pole Modelling of Degraded Speech, IEEE Trans. Acoust., Speech, Signal Processing, ASSP-26, June (1978).
[6] Y. Ephraim and H. L. Van Trees: A spectrally-based signal subspace approach for speech enhancement, in IEEE ICASSP, pp. 804-807 (1995).
[7] Y. Hu and P. Loizou: A subspace approach for enhancing speech corrupted by colored noise, in Proc. International Conference on Acoustics, Speech and Signal Processing, vol. I, Orlando, FL, U.S.A., pp. 573-576, May (2002).
[8] A. Rezayee and S. Gazor: An adaptive KLT approach for speech enhancement, IEEE Trans. Speech Audio Processing, vol. 9, pp. 87-95, Feb. (2001).
[9] U. Mittal and N. Phamdo: Signal/noise KLT based approach for enhancing speech degraded by colored noise, IEEE Trans. Speech Audio Processing, vol. 8, no. 2, pp. 159-167 (2000).
[10] John R. Deller, John G. Proakis, and John H. L. Hansen: Discrete-Time Processing of Speech Signals, Prentice-Hall, ISBN 0-02-328301-7 (1997).
[11] S. F. Boll: Suppression of Acoustic Noise in Speech Using Spectral Subtraction, IEEE Trans. Acoustics, Speech, and Signal Processing, vol. ASSP-27, no. 2, pp. 113-120, April (1979).
[12] M. R. Weiss, E. Aschkenasy, and T. W. Parsons: Processing Speech Signal to Attenuate Interference, in Proc. IEEE Symp. Speech Recognition, pp. 292-293, April (1974).
[13] J. S. Lim and A. V. Oppenheim: Enhancement and bandwidth compression of noisy speech, Proc. of the IEEE, vol. 67, no. 12, pp. 1586-1604, Dec. (1979).



FREQUENCY SELECTIVITY PARAMETERS ON
MULTI-CARRIER WIDEBAND WIRELESS SIGNALS

Víctor M Hinostroza, José Mireles and Humberto Ochoa


Institute of Engineering and Technology, University of Ciudad Juárez
Valle del tigris # 3247,Ciudad Juárez Chihuahua México C. P. 32306
vhinostr@uacj.mx, jmireles@uacj.mx, hochoa@uacj.mx

ABSTRACT.
This work is a study of the effects of frequency selectivity on multi-carrier wideband signals in three different environments: indoors, outdoor to indoor, and outdoors. The investigation was made using measurements carried out with a sounder with a 300 MHz bandwidth. The main part of this work evaluates the contribution of several parameters (frequency selective fading, coherence bandwidth and delay spread) to the frequency selectivity of the channel. A description of the sounder parameters and the sounded environments is given. The 300 MHz bandwidth is divided into segments of 60 kHz to perform the evaluation of frequency selective fading. Sub-channels of 20 MHz for OFDM systems and 5 MHz for WCDMA were evaluated. Figures are provided for a number of bands, parameters and locations in the three environments. The variation of the signal level due to frequency selective fading is also shown. The practical assumptions about the coherence bandwidth and delay spread are reviewed and compared with actual measurements. Statistical analysis was performed on some of the results.

Keywords. Coherence bandwidth, frequency correlation, frequency selective fading, multi-carrier modulation.
I. INTRODUCTION.

To simulate and evaluate the performance of a wireless mobile system, a good channel model is needed. Mobile communication systems are using larger bandwidths and higher frequencies, and these characteristics impose new challenges on channel estimation. The channel models that have been developed for the mobile systems in use may not be applicable anymore. To validate that the old models can be used for future systems, or to design new models, it is necessary to answer the question of how the same parameters perform at higher bandwidths. Also, we have to be able to measure and validate some parameters and compare them to well-known practical assumptions. Measurements for analysis of the fading statistics at common frequencies have been performed before, but only at small bandwidths, so it is necessary to update the models with higher bandwidths.

As the data rate (the bandwidth) increases, the communication limitations come from the Inter Symbol Interference (ISI) due to the dispersive characteristics of the wireless communications channel. The dispersive channel characteristics arise from the different propagation paths, i.e. multipath, between the receiver and the transmitter. This dispersion could be measured if we could measure the channel impulse response (CIR). As a general rule, the effect of ISI on the transmission errors is negligible if the delay spread is significantly shorter than the duration of the transmitted symbol. Due to the expected increase in demand for higher data rates, wideband multi-carrier systems such as OFDM and WCDMA are expected to be the technologies of choice [1], [12] and [14]. This is because these two technologies can provide both high data rates and an acceptable level of quality of service. However, these systems first need to better address the problem of channel prediction or estimation, because this condition is the main boundary for higher data rates. The study of the correlation of the mobile radio channel in the frequency and time domains has helped to understand the problem of channel estimation. One of the objectives of this work is to evaluate frequency selective fading (FSF) in several environments. This work begins with the results of



measurements made with a sounder that uses the chirp technique for sounding.

Multipath fading channels are usually classified into flat fading and frequency selective fading according to their coherence bandwidth relative to the bandwidth of the transmitted signal. Coherence bandwidth is defined as the range of frequencies over which two frequency components remain strongly correlated in amplitude. Physically, it defines the range of frequencies over which the channel can be considered "flat". The analytic issue of coherence bandwidth was first studied by Jakes [1], whose work, assuming homogeneous scattering, revealed that the coherence bandwidth of a wireless channel is inversely proportional to its root-mean-square (rms) delay spread. The same issue was subsequently studied by various authors [4], [8], [9], [10]. Since many practical channel environments can deviate significantly from the homogeneous assumption, various measurements were conducted to determine multipath delay profiles and coherence bandwidths [19], [20], [21], [22], aiming to obtain a more general formula for coherence bandwidth. In this work the variations of this formula are reviewed and compared with actual results.

The rest of this document is structured as follows. In part II the theoretical foundations of the channel impulse response, frequency selective fading and coherence bandwidth are reviewed. Also in this part, the characteristics of the three environments sounded are described. In part III, the frequency selective fading evaluation and analysis are presented. Plots of the dependency between fading depth and the frequency separation of two specific points in the response are studied. In part IV, data about the relationship between delay spread and coherence bandwidth are provided. Finally, in part V, conclusions and future work are mentioned.

II. MATHEMATICAL BACKGROUND

2.1 The wideband channel model.
The radio propagation channel is normally represented in terms of a time-varying linear filter with complex low-pass impulse response $h(t;\tau)$. Its time-varying low-pass transfer function is [4] [6] [8] [10]:

$$H(t,f) = \int_{-\infty}^{\infty} h(t;\tau)\, e^{-j 2\pi f \tau}\, d\tau \qquad (1)$$

where $\tau$ represents delay. Using (1), the frequency correlation function for the channel can be written as:

$$E\{ H(t_1;f_1)\, H^{*}(t_2;f_2) \} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} E\{ h(t_1;\tau_1)\, h^{*}(t_2;\tau_2) \}\, e^{-j 2\pi f_1 \tau_1 + j 2\pi f_2 \tau_2}\, d\tau_1\, d\tau_2 \qquad (2)$$

By considering the channel to have uncorrelated scattering (US) and to be wide sense stationary (WSS), the subscript for $\tau$ is eliminated, $f_1$ and $f_2$ can be replaced by $f + \Delta f$, and $t_1$ and $t_2$ replaced by $t + \Delta t$; then:

$$R_H(\Delta t; \Delta f) = \int_{-\infty}^{\infty} R_h(\Delta t; \tau)\, e^{-j 2\pi \Delta f \tau}\, d\tau \qquad (3)$$

In (3), $R_H$ and $R_h$ represent the correlation of random variations in the channel's transfer function and its impulse response, respectively. If there is US and $\Delta t$ is 0, then:

$$R_h(0;\tau) = E\{ |h(0;\tau)|^2 \} = E\{ |h(\tau)|^2 \} \qquad (4)$$

Substituting into (3) gives:

$$R_H(\Delta f) = \int_{-\infty}^{\infty} E\{ |h(\tau)|^2 \}\, e^{-j 2\pi \Delta f \tau}\, d\tau \qquad (5)$$

where $E\{ |h(\tau)|^2 \}$ is the average Power Delay Profile (PDP) of the channel. So, under the above conditions, $R_H$ is the Fourier transform of the average PDP.

2.2 Coherence bandwidth.
Because of the multipath effect of the channel, in which different signals arrive with different time delays, the statistical properties of two signals of different frequencies become independent if the frequency separation is large enough. The maximum frequency separation for which the signals are still strongly correlated is called the coherence bandwidth (Bc). Besides contributing to the understanding of the channel, the coherence bandwidth is useful in evaluating the performance and limitations of different modulations and diversity models.

The coherence bandwidth of a fading channel is probed by sending two sinusoids, separated in frequency by $\Delta f = f_1 - f_2$ Hz, through the channel. The coherence bandwidth is defined as the $\Delta f$ over which the cross correlation coefficient between $r_1$ and $r_2$ is greater than a preset threshold, say, $\eta_0 = 0.9$. Namely:



$$C_{r_1,r_2} = \frac{\mathrm{Cov}(r_1, r_2)}{\sqrt{\mathrm{var}(r_1)\, \mathrm{var}(r_2)}} = \eta_0 \qquad (6)$$

Then, using (2):

$$R(s,\tau) = \langle r_1 r_2 \rangle = \int_0^{\infty} \int_0^{\infty} r_1 r_2\, p(r_1, r_2)\, dr_1\, dr_2 \qquad (7)$$

where $p(r_1, r_2)$ is

$$p(r_1, r_2) = \int_0^{2\pi} \int_0^{2\pi} p(r_1, r_2, \theta_1, \theta_2)\, d\theta_1\, d\theta_2 = \frac{r_1 r_2}{\mu^2 (1 - \lambda^2)} \exp\left[ -\frac{r_1^2 + r_2^2}{2\mu (1 - \lambda^2)} \right] I_0\left( \frac{r_1 r_2 \lambda}{\mu (1 - \lambda^2)} \right) \qquad (8)$$

where $I_0(x)$ is the modified Bessel function of zero order. Then, substituting (8) in (7) and integrating:

$$R(s,\tau) = \frac{\pi}{2}\, b_0\, F\left( -\tfrac{1}{2}, -\tfrac{1}{2}; 1; \lambda^2 \right) \qquad (9)$$

This may also be expressed as

$$R(s,\tau) = \frac{\pi}{2}\, b_0 \left( 1 + \frac{\lambda^2}{4} \right) \qquad (10)$$

$$\rho(s,\tau) = \frac{R(s,\tau) - \langle r_1 \rangle \langle r_2 \rangle}{\sqrt{\left[ \langle r_1^2 \rangle - \langle r_1 \rangle^2 \right] \left[ \langle r_2^2 \rangle - \langle r_2 \rangle^2 \right]}}$$

$$R(s,\tau) = b_0 (1 + \lambda)\, E\left( \frac{2\sqrt{\lambda}}{1 + \lambda} \right) \qquad (11)$$

where $E(x)$ is the complete elliptic integral of the second kind. The expansion of the hypergeometric function gives a good approximation to (9). After several reductions and considerations, the correlation coefficient becomes

$$\rho(s,\tau) = \frac{(1 + \lambda)\, E\left( \frac{2\sqrt{\lambda}}{1 + \lambda} \right) - \frac{\pi}{2}}{2 - \frac{\pi}{2}} = \lambda^2 = \frac{J_0^2(\omega_m \tau)}{1 + s^2 \sigma^2} \qquad (12)$$

It is possible to see in this expression that the correlation decreases with frequency separation. This formula has been replaced by several practical expressions, some of which are the following [4], [8], [9], [10]:

$$B_{C=0.9} = \frac{1}{50\, \sigma_{rms}} \qquad (13)$$

$$B_{C=0.5} = \frac{1}{5\, \sigma_{rms}} \qquad (14)$$

$$B_{C=0.9} = \frac{1}{8\, \sigma_{mean}} \qquad (15)$$

$$B_C = \frac{1}{2\pi\, \sigma_{rms}} \qquad (16)$$

In general:

$$B_C = \frac{k}{\sigma_{rms}} \qquad (17)$$

It will be shown, by comparison with practical measurements, that none of these expressions is accurate and that it is difficult to obtain a comprehensive expression for all environments.

2.3 Sounder systems characteristics and environment description.

The sounder system used to make the measurements of this work was developed at UMIST in Manchester, UK, and is described in [2] and [3]. This sounder uses the FMCW or chirp technique. The generated chirp consists of a linearly frequency modulated signal with a bandwidth of 300 MHz and a carrier frequency of 2.35 GHz. The chirp repetition frequency is 100 Hz, which allows a 50 Hz Doppler measurement range. The receiver has the same architecture as the transmitter. In the receiver, however, the generated chirp is not transmitted but mixed with the incoming signal from the antenna, which consists of the multipath components of the transmitted chirp. This mixing brings the multipath components down to low frequencies, which can be sampled, digitized and stored in a computer to perform the required analysis.

The three environments where the measurements took place were the following: 1) Indoors, in



different floors of an eight-story building. Each floor forms a rectangle with four long corridors, two 66 m long and two 86 m long; the width of the corridors is 3 m, the total covered area was 912 square meters and the ceiling height is 5 meters. The transmitter was static and the receiver was moved around the corridors and on different floors; measurements were taken at specific distances in each corridor. 2) From one building to a different building. These two buildings are eight stories high, they have about the same height and they are separated by about 200 meters. The transmitter was located at the top of one of the buildings and the receiver was moved around specific locations inside the second building on all eight floors. 3) An urban environment around the city center, a commercial area with several high-rise buildings. Each location in each environment was sampled during one second and 100 impulse responses were stored for that location; that is, an impulse response (IR) was taken at that specific location every 10 milliseconds.

III. FREQUENCY SELECTIVITY PARAMETERS

To carry out the calculation of frequency selective fading, each processed IR was split into samples of 60 kHz, which was the minimum sampled frequency. Each 300 MHz bandwidth was sampled 5000 times every 10 milliseconds, which means that a sample was taken every 2 microseconds, or every 60 kHz. Each IR was averaged over the complete second, meaning that every frequency was averaged 100 times for each IR. The level of the frequency response of each 60 kHz segment was calculated and recorded.

In each environment the channel transfer functions for about 50 different locations were calculated and recorded. Each transfer function has 5000 sampled frequencies, i.e. 5000 segments for each location. Every sample represents the signal level on that segment, relative to the maximum over the complete 300 MHz bandwidth. The outdoors environment has a bandwidth of 120 MHz; the indoors and indoor to outdoor environments have a 300 MHz bandwidth.

The fading characteristics in each of the environments were calculated. For each of the locations, the fading characteristics that were calculated were: average delay spread, RMS delay spread, coherence bandwidth, channel transfer function and frequency correlation. Using the channel transfer function for each IR, specific fading characteristics for sub-channels of 5 and 20 MHz were calculated. Figure 1 shows a typical channel transfer function. Figure 2 shows the fading characteristics for a 20 MHz sub-channel; in this figure there are 15 different lines, each of which corresponds to a 20 MHz sub-channel in a 300 MHz bandwidth. To obtain this figure, the following was done. First, the fading information of the complete 300 MHz bandwidth was divided into 15 segments of 20 MHz each. Then, each point on a segment, which represents a 60 kHz sample, was measured and the result was compared to the next sample, then the next sample was compared, and so on until the complete 20 MHz bandwidth had been compared. After that, the same procedure was applied to a second 20 MHz sub-channel, and so on up to the end of the 300 MHz bandwidth. To form figure 3, the same procedure was followed, but in this case the sub-channel bandwidth was 5 MHz, so this figure has 60 different lines which come from the 300 MHz total bandwidth.

Figure 2 shows that the maximum fading depth within the 20 MHz sub-channel is lower than 14 dB over all the bandwidth. On the other hand, in the 5 MHz bandwidth the maximum fading depth was 18 dB. It is possible to see in figure 2 that most of the lines follow a pattern, which means that the fading in all sub-channels is about the same. In figure 2, most of the lines stay below 5 dB and only a few lines go higher than 6 dB; this could mean that deep fading in this bandwidth is rare.

Figure 1. Typical channel transfer function.

Figure 3 shows the calculations of fading characteristics for the second environment with 15 different 20 MHz sub-channels. This figure shows that there is more dispersion of the lines, which means there is more deep fading in the responses of the IR. Since this environment measures the propagation of the signal penetrating different floors in a building, a higher delay spread (time dispersion) than in the former environment was expected, and therefore more fading.

UbiCC Journal - Volume 3 12
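The sampling figures and the sub channel procedure of Section III can be checked with a short sketch. It is only an illustration under stated assumptions: the transfer function below is synthetic random data, the names `fade_depth_per_subchannel` and `h_db` are hypothetical, and measuring the fade depth of a sub channel as its maximum level minus its minimum level is an assumed definition, not necessarily the authors' processing chain.

```python
import numpy as np

TOTAL_BW_HZ = 300e6      # swept bandwidth quoted in the text
N_SAMPLES = 5000         # frequency samples per sweep
SWEEP_TIME_S = 10e-3     # one impulse response every 10 ms

freq_step_hz = TOTAL_BW_HZ / N_SAMPLES    # spacing between frequency samples
time_step_s = SWEEP_TIME_S / N_SAMPLES    # time between samples within a sweep


def fade_depth_per_subchannel(h_db, sub_bw_hz, step_hz=freq_step_hz):
    """Return max-minus-min level (dB) inside each sub channel.

    h_db is a 1-D array of levels in dB, one per frequency sample.
    Assumed definition of fade depth, for illustration only.
    """
    per_sub = int(round(sub_bw_hz / step_hz))   # samples per sub channel
    n_sub = len(h_db) // per_sub                # whole sub channels only
    blocks = np.reshape(h_db[: n_sub * per_sub], (n_sub, per_sub))
    return blocks.max(axis=1) - blocks.min(axis=1)


# Synthetic stand-in for a measured transfer function (hypothetical data).
rng = np.random.default_rng(0)
h_db = -np.abs(rng.normal(0.0, 3.0, N_SAMPLES))

depth_20 = fade_depth_per_subchannel(h_db, 20e6)   # one value per 20 MHz sub channel
depth_5 = fade_depth_per_subchannel(h_db, 5e6)     # one value per 5 MHz sub channel
```

With these parameters the sample spacing is 60 kHz, and the 300 MHz sweep splits into 15 sub channels of 20 MHz or 60 sub channels of 5 MHz, which agrees with the number of lines quoted for the figures.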


Another way to look at the statistics of the fading is to calculate the CDF of this
parameter. To compute these CDFs, the mean over all locations in the environment concerned
was used. Figure 4 shows the CDF of the building-to-building environment for both the 20 and
5 MHz bandwidths. This figure shows that the fade depth for a 20 MHz sub channel is below
7 dB for 90% of the time. On the other hand, for the 5 MHz sub channel, the fades are below
5 dB for 90% of the time.

Figure 2. Indoor to outdoor fading characteristics for a 20 MHz sub channel

Figure 3. Outdoor to indoor fading characteristics for a 5 MHz sub channel

Figure 4. Fading CDF for indoor to outdoor for a 5 MHz sub channel

IV. COHERENCE BANDWIDTH EVALUATION

Figure 5 shows the frequency correlation of all locations in the indoors environment. To
make this figure, the PDP of all locations was calculated first. Then a Fourier transform
was performed on the PDP, which gave the frequency correlation for all locations. Then the
frequency correlation for each location was plotted in figure 5. In this figure, the thick
dashed line is the line for the maximum coherence bandwidth, obtained when the transmitter
and receiver are connected directly. In figure 5, one can see that at a correlation
coefficient of 0.9 the coherence bandwidth (Bc) is lower than 10 MHz in most of the
locations. This is corroborated in figure 6, which shows the average Bc for all locations in
the indoors environment. Figure 7 shows the RMS delay spread for all locations in the same
environment. Quick calculations comparing figure 11 results and expression (13) show that
few calculated values of the versions of expression (13) match the measured values of
figure 7.

Figure 5. Coherence bandwidth for indoors



Figure 6. Average of coherence bandwidth for
indoors
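The procedure described for the coherence bandwidth figures (compute the PDP, Fourier transform it to obtain the frequency correlation, then read Bc at a chosen correlation level) can be sketched as follows. This is a minimal illustration, not the authors' code: the exponential PDP is synthetic, the function names are hypothetical, and taking Bc as the first frequency separation at which the normalised correlation magnitude falls below the level is an assumption.

```python
import numpy as np


def frequency_correlation(pdp, delay_step_s):
    """Normalised magnitude of the Fourier transform of a PDP.

    Returns (frequency separations in Hz, correlation in [0, 1]).
    """
    n_fft = 1 << 16                      # zero-pad for a fine frequency grid
    spectrum = np.fft.rfft(pdp, n=n_fft)
    corr = np.abs(spectrum) / np.abs(spectrum[0])
    freqs = np.fft.rfftfreq(n_fft, d=delay_step_s)
    return freqs, corr


def coherence_bandwidth(freqs, corr, level):
    """First frequency separation at which corr falls below `level`."""
    below = np.nonzero(corr < level)[0]
    return freqs[below[0]] if below.size else float("nan")


def rms_delay_spread(pdp, delay_step_s):
    """Power-weighted standard deviation of the delays."""
    delays = np.arange(len(pdp)) * delay_step_s
    mean = np.sum(pdp * delays) / np.sum(pdp)
    return np.sqrt(np.sum(pdp * (delays - mean) ** 2) / np.sum(pdp))


# Hypothetical exponential PDP: 100 ns decay constant, 1 ns delay resolution.
step = 1e-9
pdp = np.exp(-np.arange(2000) * step / 100e-9)

freqs, corr = frequency_correlation(pdp, step)
bc_09 = coherence_bandwidth(freqs, corr, 0.9)   # Bc at 0.9 correlation
bc_05 = coherence_bandwidth(freqs, corr, 0.5)   # Bc at 0.5 correlation
sigma = rms_delay_spread(pdp, step)             # RMS delay spread in seconds
```

For this synthetic profile the RMS delay spread is close to the 100 ns decay constant, and Bc at the 0.5 level is several times larger than Bc at the 0.9 level, which is why the two measured rows of a Bc table cannot be compared against a single expression without stating the correlation level.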

Figure 8 shows the frequency correlation for the outdoor to indoor environment. This figure
shows that in this environment the Bc at a frequency correlation of 0.9 is higher than in
the indoor environment, although the delay spread is not different in the two environments.
Figure 9 shows the average Bc for the outdoors to indoors environment. One can see in this
figure that the coherence bandwidth is higher than in the indoor environment, which was an
expected result, but the difference is higher than expected. Indoors, the coherence
bandwidth is not bigger than 20 MHz on average. On the other hand, in the outdoor to indoor
environment the average is about 100 MHz, a ratio of 5 to 1. The RMS delay spreads are
100 ns versus 200 ns, a ratio of 2 to 1.

Figure 8. Coherence bandwidth for outdoor to indoor

Figure 9. Average of coherence bandwidth for indoor to outdoor

Figure 7. RMS delay spread for indoors

Figure 11 shows the frequency correlation for the outdoors environment. Figure 12 shows the
Bc at a frequency correlation of 0.9. In this case the Bc cannot be compared to the Bc for
the other two environments, since in this environment a lower bandwidth is evaluated,
120 MHz instead of 300 MHz. Despite this difference, and observing figures 11 and 12, Bc is
not significantly lower even with longer distances and higher delay spread. Outdoors, the Bc
is not bigger than 2 MHz on average. On the other hand, the RMS delay spread is 1.5 µs on
average.

Table 1 compares the Bc for the three environments obtained from the different versions of
expressions (13)-(16) with the measured results. This table shows that the values of the
expressions are always lower than the measured results, which leads to the conclusion that
the expressions underestimate Bc, at least in these environments. Moreover, it is possible
to conclude that these expressions were deduced from too few measured results. Also, table 1
shows that the relationship between delay spread and coherence bandwidth is not necessarily
a single constant.
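The two ratios quoted for the indoor and outdoor to indoor environments (5 to 1 in coherence bandwidth, 2 to 1 in RMS delay spread) can be turned into a small worked check. The sketch below assumes the usual inverse model Bc = k / sigma, which is an assumption made here for illustration; the function name is hypothetical. Because the text gives the delay spreads only as "100 ns versus 200 ns", both possible orientations of that ratio are evaluated.

```python
def implied_constant_ratio(bc_ratio, delay_ratio):
    """Ratio k_a / k_b implied by Bc = k / sigma in two environments.

    bc_ratio    = Bc_a / Bc_b
    delay_ratio = sigma_a / sigma_b
    A single shared constant k would give exactly 1.0.
    """
    return bc_ratio * delay_ratio


bc_ratio = 100e6 / 20e6          # average Bc: outdoor to indoor vs. indoors
delay_ratio = 200e-9 / 100e-9    # the 2-to-1 RMS delay spread ratio

# Orientation of the delay spread ratio is treated as unknown,
# so both cases are computed.
k_ratio_same_order = implied_constant_ratio(bc_ratio, delay_ratio)
k_ratio_opposite = implied_constant_ratio(bc_ratio, 1.0 / delay_ratio)
```

Neither orientation gives a ratio of 1, so under this model the two environments cannot share one proportionality constant, which is the point made about the relationship between delay spread and coherence bandwidth.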



Figure 10. RMS delay spread for outdoors to indoors

Figure 11. Coherence bandwidth for outdoors

Figure 12. Average of coherence bandwidth for outdoors

V. CONCLUSIONS

In this work the results of an analysis of frequency selective fading in two indoor
environments and one outdoor environment have been presented. The three environments
analyzed demonstrate that the fading is within specific limits; these results could help
designers of adaptive receivers to estimate the channel more accurately. Dividing the
channel bandwidth into segments of 20 and 5 MHz allows the calculation of fading in the
bandwidth of interest for OFDM and WCDMA transmission. Plots of the frequency selective
fading will help with this assessment. The analysis of coherence bandwidth shows that the
expressions accepted in the literature for its calculation are not accurate, and that the
accepted direct relationship between delay spread and coherence bandwidth is not simple.
Also, additional work is required to determine how much the combined effect of Doppler
spread, time variability and frequency offsets affects the transmission of multi-carrier
signals such as those of OFDM and CDMA.

Table 1. Coherence bandwidth calculations

Value from       Indoors    Outdoors to indoors    Outdoors
(13)             400 kHz    200 kHz                30 kHz
(14)             4 MHz      2 MHz                  300 kHz
(15)             3.3 MHz    2.5 MHz                250 kHz
(16)             3.2 MHz    1.6 MHz                212 kHz
Measured (0.9)   5.3 MHz    12 MHz                 300 kHz
Measured (0.5)   19 MHz     72 MHz                 5.6 MHz

Figure 13. RMS delay spread for outdoors
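The comparison in Table 1 can be reproduced numerically. The sketch below only re-uses the tabulated values; the names are hypothetical, and comparing every expression against both measured rows is an assumption made for illustration, since the correlation level each expression targets is not restated here.

```python
# Coherence bandwidth values in Hz, copied from Table 1.
ENVIRONMENTS = ("indoors", "outdoors to indoors", "outdoors")
EXPRESSIONS = {
    "(13)": (400e3, 200e3, 30e3),
    "(14)": (4e6, 2e6, 300e3),
    "(15)": (3.3e6, 2.5e6, 250e3),
    "(16)": (3.2e6, 1.6e6, 212e3),
}
MEASURED = {
    0.9: (5.3e6, 12e6, 300e3),
    0.5: (19e6, 72e6, 5.6e6),
}


def measured_over_expression(level):
    """Measured Bc divided by each expression's value, per environment."""
    return {
        name: tuple(m / e for m, e in zip(MEASURED[level], values))
        for name, values in EXPRESSIONS.items()
    }


def not_below_measurement(level):
    """(expression, environment) pairs whose value is not strictly below the measured Bc."""
    return [
        (name, env)
        for name, values in EXPRESSIONS.items()
        for env, value, measured in zip(ENVIRONMENTS, values, MEASURED[level])
        if value >= measured
    ]


ratios_09 = measured_over_expression(0.9)
exceptions_09 = not_below_measurement(0.9)
exceptions_05 = not_below_measurement(0.5)
```

With the tabulated numbers, the only entry that is not strictly below its measurement is expression (14) for outdoors at the 0.9 level, where both values are 300 kHz. The ratios also differ from one environment to the next for the same expression, which illustrates that no single scaling constant fits all three environments.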

References.

1. Jakes W. C., Microwave mobile


communications, (Wiley, 1974).
2. Aurelian B, Gessler F, Queseth O, Stridh
R, Unbehaun M, Wu J, Zander J, Flament

UbiCC Journal - Volume 3 15


M. “4th-Generation Wireless 16. Namgoong N, and Lehnert J., “
Infrastructures: Scenarios and Research Performance of DS/SSMA Systems in
Challenges”, IEEE Personal Frequency Selective Fading”, IEEE
Communications Magazine, 8(6), 25-31, Transaction on Wireless communication,
Dec 2001. April 2002, Vol. 1, No. 2, pp. 236-244.
3. Salous S, Hinostroza V,” Bi-dynamic 17. TA0 X., et. al., “ Channel Modeling of
indoor measurements with high resolution Layered
sounder”, 5th. International Symposium Space-Time Code Under Frequency
on wireless multimedia Communications, Selective fading Channel”, Proceedings
Honolulu Hawaii USA, October 2002. of ICCT2003, May 2003, Berlin Germany.
4. Golkap H., “ Characterization of UMTS 18. Shayevitz O. and Feder M.,” Universal
FDD channels ”, PhD Thesis, Department Decoding for Frequency Selective
of Electrical Engineering and Electronics, Fading”, IEEE Transactions on
UMIST, UK 2002 Information Theory, August 2005, Vol.
5. Lee W. C. Y., Mobile Communication 51 , N0. 8, pp. 2770-2790.
Engineering, (McGraw-Hill, 1998). 19. Sánchez M and García M, RMS Delay and
6. Bello P.A., “Characterization of randomly Coherence Bandwidth Measurements in
time-variant linear channels”, IEEE Indoor Radio Channels in the UHF Band,
Transactions on Communications Systems, IEEE Transactions on Vehicular
December 1963, pp. 360-393. Technology, vol. 50, no. 2, march 2001
7. Hehn T., Schober R.m and Gerstacker W., 20. Jia-Chin Lin, Frequency Offset
“Optimized Delay Diversity for Frequency Acquisition Based on Subcarrier
Selective Fading Channels”, IEEE Differential Detection for OFDM
Transaction on Wireless communications, Communications on Doubly-Selective
September 2005, Vol. 4, No. 5, pp. 2289- Fading Channels,
2298. 21. Yoo D. and Stark W. E., Characterization
of WSSUS Channels: Normalized Mean
8. Hashemi H., “The indoor radio Square Covariance, IEEE Transactions
propagation channel”, IEEE Proceedings, on Wireless Communications, vol. 4, no.
Vol. 81, No. 81, July 1993, pp. 943-967. 4, july 2005.
9. Lee W. C. Y., Mobile Communication
Engineering, (McGraw-Hill, 1999)
10. Rappaport T. S., Wireless
communications, (Prentice-Hall, 2002, 2nd
ed.)
11. Parsons J. D., The mobile radio
propagation channel, (Wiley, 2000).
12. Morelli M., Sanguinetti L. and Mengali
U., “Channel Estimation for Adaptive
Frequency Domain Equalization ”, IEEE
Transaction on Wireless communication,
September 2005, Vol. 4, No. 5, pp. 2508-
2518.
13. Salous S., and Hinostroza V., “ Bi-
dynamic UHF channel sounder for Indoor
environments “, IEE ICAP 2001, pp. 583-
587
14. Biglieri E., Proakis J. and Shamai S.,
“Fading Channels: Information Theoretic
and Communications Aspects”, IEEE
Transactions on Information Theory,
October 1998, Vol. 44 , No. 6, pp. 2619-
2692.
15. Al-Dhahir N., “Single Carrier Frequency
Domain Equalization for Space-Time
Blok-Coded Transmission over Frequency
Selective Fading Channels”, IEEE
Communications Letters, July 2001, Vol.
5, No. 7, pp. 304-306.



Continuous Reverse Nearest Neighbor Search

Lien-Fa Lin*, Chao-Chun Chen


Department of Computer Science and Information Engineering,
National Cheng-Kung University, Tainan, Taiwan, R.O.C.
Department of Information Communication,
Southern Taiwan University of Technology, Tainan, Taiwan, R.O.C.

lienfa@cc.kyu.edu.tw, chencc@mail.stut.edu.tw

ABSTRACT
Query services based on the location of an object are called Location Based Services
(LBSs), and Reverse Nearest Neighbor (RNN) queries are one of them. RNN queries have
diverse applications, such as decision support systems, market decisions, document
database queries, and biological information. Past studies of RNN, however, focused on
static inquirers and did not consider the continuous demand for RNN queries while moving.
In a wireless network environment, users are often on the move, and sending a query while
moving is natural behavior. Availability of such a service therefore becomes very
important; we refer to this type of issue as Continuous Reverse Nearest Neighbor (CRNN)
queries. Because an inquirer's location changes with time, RNN queries return different
results at different locations. For a CRNN query, executing an RNN search for every point
of time during a continuous query period is extremely costly. In this work, an efficient
algorithm is designed to provide precise results of a CRNN query in just one execution. In
addition, a large number of experiments were conducted to verify the method, and the
results showed a significant improvement in efficiency.

Keywords: Location Based Services, Location-Dependent Query, Continuous Query, Reverse
Nearest Neighbor Query, Continuous Reverse Nearest Neighbor Query

1 INTRODUCTION

As wireless network communications and mobile device technology develop vigorously and
positioning technology matures gradually, LBS is becoming a key development in industrial
as well as academic circles [2, 5, 13, 21, 26, 27]. According to the report "IT Roadmap to
a Geospatial Future" [6], LBSs will embrace pervasive computing and transform mass
advertising media, marketing, and different societal facets in the upcoming decade.
Although LBSs have existed in the traditional computing environment (such as Yahoo!
Local), their greatest development potential lies in the domain of mobile computing, which
provides freedom of mobility and access to information anywhere.

LBSs will become an indispensable application in mobile networks as the required
technology has matured and 3G wireless communication infrastructure is expected to be
deployed everywhere. The query that answers LBSs is referred to as a Location-Dependent
Query (LDQ), whose applications include Range Query, Nearest Neighbor (NN) query,
K-Nearest Neighbor (KNN) query, and Reverse Nearest Neighbor (RNN) query.

There are plenty of studies about NN [14, 22, 26], KNN [4, 9, 14, 23, 25], CNN [17, 3, 12,
20], and CKNN [17, 20] queries, and issues pertaining to the Reverse Nearest Neighbor
(RNN) query [10, 11, 16, 18, 19, 22, 24] have been receiving attention in recent years. An
RNN query finds, within a given collection of objects S, the objects whose nearest
neighbor is the given query object q. Practical examples of RNN queries are provided in
[10]. If a bank is planning to open a new branch, and its clients prefer the nearest
possible branch, then the new branch should be established at a location where the
distance to the majority of its clients is shorter than that of other banks. Taxi cabs
selecting passengers is another good example. If a taxi cab uses wireless devices to find
out the location of its customer, then RNN queries will be far more advantageous than NN
queries from the aspect of



competition. Figure 1 illustrates that Customer c is the nearest neighbor for Taxi a, but
that does not necessarily mean Taxi a can capture Customer c, because Taxi b is even
closer to Customer c. Instead, the best option for Taxi a should be Customer d, because
Taxi a is the nearest neighbor for Customer d. That is, d is the RNN for a, and a may
reach d faster than any other taxi. This is an example of a CRNN query because the query
object, the taxi, changes location with time. Users are mobile in a wireless environment,
and that is why the continuous query is an important issue in the wireless environment.

To the best of the authors' knowledge, no researcher has yet worked on this issue. Because
an inquirer changes location constantly with time, the changes of location cause RNN
queries to return different results. For a CRNN query, executing an RNN search for every
point of time during a continuous query period is extremely costly. The larger the number
of query objects and the shorter the time segment, the longer the calculation time.

In addition, due to the continuous nature of time, defining the appropriate time segment
for RNN search is a concern; if the interval between RNN searches is too short, then more
RNN searches need to be executed to complete the query, and vice versa. If an RNN search
is repeated at longer intervals to reduce the number of executions, the RNN query result
for the whole time segment loses accuracy because of the insufficient sampling frequency.

In this paper, a more efficient algorithm is designed to replace the processing of each
and every point of time for RNN search; just one execution of the CRNN query is all it
takes to divide the query time that a user is interested in into segments, and to find the
segments that share the same answer and the RNN for each of the intervals.

Other than that, an index is also used to filter out unnecessary objects to reduce the
search space and improve CRNN search efficiency. The experimental result suggests that
using the index is 20 times more efficient than not using it when the number of objects is
1000.

This study makes three major contributions:
• It pioneers continuous query processing methods, as opposed to static queries, for RNN
  issues.
• A CRNN search algorithm is proposed; just one execution returns all CRNN results.
• The proposed method allows the index, which was only applicable to finding the RNN for a
  single query point, to support CRNN queries and improve CRNN search efficiency.

The remaining sections of this work are structured as follows. Related works about RNN
search are introduced in Section 2. The issues concerned are defined and the assumptions
made are described in Section 3. The proposed CRNN search algorithm is introduced in
Section 4. The experiment environment and evaluation parameters for experimental efficacy
are described in Section 5. Finally, a conclusion and future study directions are provided
in Section 6.

Figure 1: Example of RNN query.

2 RELATED WORK

An RNN search looks for the objects whose NN is the query q. Related works about RNN
search are introduced and summarized in this section:
• Index methods that support RNN search
  The number of objects can be very large; if one must first find the distance from query
  q to each object to identify the RNN for query q, then the efficiency may be
  unacceptably low due to the overwhelmingly large computation cost. To accelerate
  processing, most studies adopt index methods. Major index methods are introduced in this
  section.
• RNN search of different types
  RNN searches in different scenarios are described and categorized according to the
  static and moving situations of query q and the objects.

2.1 Index Methods for RNN Query

An RNN search looks for the objects whose NN is the query q, and it is necessary to find
the distance between query q and each object, that is, the distance from the coordinate of
query q to the coordinate of an object. For a given q, not every object is its RNN, and
the objects which cannot be an RNN may be left out of consideration to reduce the number
of objects to be taken into consideration and accelerate RNN search. Many studies were
dedicated to designing an effective indexing structure for the coordinates of objects. The
most famous ones are R-Tree proposed by [8] and Rdnn-Tree proposed by



[10]. These two index methods are described below.

2.1.1 R-Tree

R-Tree is an index structure developed in early years for spatial databases and was used
by [10] to accelerate RNN search processing. All objects are grouped and then placed on
leaf nodes according to the closeness of their coordinates. That is, objects at similar
coordinates are put in one group. Next, each group of objects is contained in the smallest
possible rectangle, which is called a Minimum Bounding Rectangle (MBR). Next, MBRs are
grouped in clusters, which are contained inside a larger MBR, until all objects are
contained in the same MBR. What is stored on an internal node of an R-Tree is an MBR in
which all nodes underneath are contained, and the root of the R-Tree contains all objects.
The size and range of an MBR is defined by its lower left coordinate (Ml,Md) and upper
right coordinate (Mr,Mu). Figure 2 is an example of an R-Tree. From a to l, there are 12
objects in total; (a,b,c,d) belong to MBR b1, and (e,f,g) belong to MBR b2. MBR b1 and b2
belong to MBR B1, and MBR R contains all objects.

Figure 2: Example of R-tree Indexing

2.1.2 Rdnn-Tree

Rdnn-tree (R-tree containing Distance of Nearest Neighbors) [22] improves the method of
[10]. The author proposes a single index structure (Rdnn-tree) to provide solutions for NN
queries and RNN queries at the same time. Rdnn-tree differs from the standard R-tree
structure by storing extra information about the nearest neighbor of the points in each
node. Information of the form (ptid,dnn) is stored on the leaf nodes of the Rdnn-tree, as
shown in Figure 3. ptid refers to a d-dimensional point (object) in the data set, and dnn
means the distance from that object to its NN. Information of the form (ptr,Rect,MaxDnn)
is stored on a non-leaf node, where ptr points to the address of a child node, Rect
contains the MBR of all child nodes subordinate to this node, and MaxDnn means the maximum
value of dnn over all objects in the child trees subordinate to this node. The maximum
distance from any object contained in these child trees to its NN will not exceed MaxDnn.

Figure 3. Data structure of Rdnn-tree

2.2 Categories of RNN queries

Depending on the static or moving status of query q and the query objects, related studies
can be summarized into 4 categories.
1. If query q and the query objects are both static, then this category is called static
   query vs. static objects.
2. If query q is moving and the query objects are static, then this category is called
   moving query vs. static objects.
3. If query q is static and the query objects are moving, then this category is called
   static query vs. moving objects.
4. If both query q and the query objects are moving, then this category is called moving
   query vs. moving objects.

2.2.1 Static query vs. static objects

The scenario in which both query q and the query objects are static is discussed first,
because the query and query objects are immobile and are therefore easier to process than
in other scenarios. The method proposed in [10] is now introduced. For a static database,
the author adopts a special R-tree, called RNN-tree, for answering RNN queries. For a
static database that requires frequent updates, the author proposes a combined use of
NN-tree and RNN-tree. The NN of every object is stored in the RNN-tree, and what is stored
in the NN-tree are the objects themselves and their respective collections of NN. The
author draws a circle around every object, with the object as the center and the distance
from the object to its NN as the radius, and then examines every circle that contains
query q to find the answers of RNN queries. Such a method, however, is very inefficient
for a dynamic database because the structures of NN-tree and RNN-tree must be changed
whenever the database is updated. In [22], the method proposed by [10] is therefore
improved. The author proposes a single index structure, Rdnn-tree, for answering NN
queries and RNN queries at the same



time. It differs from a normal R-tree in that it separately stores the NN information of
every object (i.e. the Distance of Nearest Neighbor), and the NN of every object must be
calculated in advance.

2.2.2 Static query vs. moving objects

The studies mentioned above primarily assume a monochromatic situation in which all
objects, including query q and the query objects, are of the same type. In [18], the
researcher addresses this type of issue in a bichromatic situation in which objects are
divided into two different types; one is the inquirer, and the other is the query object.
NN and range query techniques are used in that paper to handle RNN issues.

2.2.3 Moving query

This subsection discusses the situation in which an inquirer is no longer static but
changes his or her location with time, and the query objects can be either static or
moving. That is, two categories of query are involved: moving query with static objects
and moving query with moving objects. Because the inquirer is moving, these two categories
of query return different results for the identical RNN search at different points of
time. This type of issue is obviously more complicated than the issues previously
discussed. To the best of the authors' knowledge, no related study has discussed the
issues of these two categories. In this study, solutions for a moving query with static
objects are pursued.

3 Problem Formulation

A CRNN query concerns a period of continuous time in which adjacent points of time may
have the identical RNN. That is, a period of time may have the same RNN unless the query q
has moved beyond this period of time. Please refer to Figure 4. When a user executes CRNN
query q, the time segment of the continuous query is [Ps,Pe], and the query objects are
{a,b,c,d}. If time point P1 can be identified, and any given point of time in the time
segment from Ps to P1, or [Ps,P1], has the same RNN result, then one execution of RNN
search is all that is needed for the time segment [Ps,P1]. If points of time P2, P3, and
P4 are also identified, and any given point of time in the time segment [P1,P2] has the
identical RNN result, while any given point of time in the time segment [P2,P3] has the
identical result, and any given point of time in the time segment [P3,P4] has the
identical result, then the entire CRNN query needs only one execution of RNN search for
each time segment.

For processing a CRNN query, it is not necessary to execute an RNN search for every point
of time and return the RNN of every point of time back to query q; instead, i segments of
time, such as segment1, segment2, …, segmenti, each of which has a single result, are
first identified within the entire CRNN query time period. The RNN result of each segment
of time, such as RNN1, RNN2, …, is calculated separately, and the result is returned to
the inquirer in the format (q,[segment1])={RNN1 result}, …, (q,[segmenti])={RNNi result}.

Figure 4. Example of a CRNN query

Based on the description above, the CRNN query, the issue that this study concerns, may be
stated as below:
Given:
A collection of static objects S={O1,O2,…,On}
A query point q, its current position (q.x,q.y), and moving velocity (v.x,v.y)
A continuous query time [Ps,Pe], where Ps represents the coordinate of the point of time
when the query begins, and Pe represents the coordinate of the point of time when the
query ends
Find:
The points of time {P1,P2,…,Pi} within [Ps,Pe] such that the RNNs of q remain constant
between any two adjacent points of time.
Such that:
RNN(q,[Ps,P1]) = {RNN1}, RNN(q,[P1,P2]) = {RNN2}, …, RNN(q,[Pi,Pe]) = {RNNi},
where [Pi,Pe] ⊆ [Ps,Pe], {RNN1} ⊆ {O1,O2,…,On}.
Under the assumptions:
1. The moving direction of query q is fixed.
2. All query objects are static.

As described above, two adjacent points of time may share the same RNN, or a segment of
time has only one RNN result unless query q moves to another segment of time that has a
different RNN. The CRNN search algorithm proposed in this study uses exactly this concept.
First, the points of time that produce different RNN results within a query time are
identified. These points of time divide the query time into several segments that have
different RNN results, and then the RNN results are identified for each of the segments.
The detailed algorithm of CRNN Search is explained in the next section.

4 CRNN Search Algorithm

The detailed procedure of the CRNN Search algorithm is introduced in this section. CRNN
Search



algorithm is divided into two steps.

Step 1: Finding segment points of CRNNq
Points of time that produce different RNN results are identified. Based on these points of
time, the CRNN query is divided into several time segments that require execution of RNN
search. The RNN result for any given point of time within one segment remains constant,
and different segments have different RNN results.

Step 2: Calculating RNN result of each segment
Separately calculate the RNN results for each of the segments obtained in the previous
step.

The entire procedure for processing CRNN Search is illustrated in Figure 5. Given the
necessary query objects and the continuous query (query path), it is divided into two
steps: finding segment points of CRNNq and calculating the RNN result of each segment;
each of the steps is described below.

Figure 5: Flow chart of CRNN query processing

4.1 Finding segment points of CRNN

What a CRNN query pursues is a period of continuous time; the query moves only a very
short distance between some adjacent points of time, thus possibly resulting in the same
RNN result. That is, the entire period of the continuous query is divided into several
segments, and the RNN results within each segment are the same. If these points of time
share the same RNN result, then it is not necessary to execute an RNN search for each of
the points of time; one calculation is enough. Therefore, a CRNN query does not require
executing an RNN search for all points of time. Instead, points of time that share the
same RNN result are grouped into time segments, and one RNN search is executed for each of
the segments. The RNN of query q is the collection of the objects whose NN is query q. If
the distance from each object to its NN is known in advance, then the RNN for query q
consists of the objects for which the distance from query q to the object is shorter than
the distance from the object to its NN.

As illustrated in Figure 6, if the NN of object a is b, and a circle is drawn with a as
the center and the distance ab as the radius, then the distance from query q to a must be
shorter than the distance from a to its NN, object b, as long as query q falls within this
circle. Therefore, during the period of time when query q remains within this circle, the
RNNs of query q must include a, unless query q moves out of this circle. Because the
moving direction of query q is assumed to be fixed, the CRNN query forms a query line
(qline) from its beginning to its end. The point at which the CRNN query leaves this
circle is the intersection S of this circle and the query line formed by the CRNN query.
Before intersection S, the result of the RNN query must include object a; beyond
intersection S, the result of the RNN query will not include object a; the RNN results are
different. This intersection is referred to as a segment point.

This explains why the intersection of the query line and the circle whose radius is the NN
distance is a point of time where the RNN query produces a different result. Drawing a
circle for every object, with the object as the center and the distance to its NN as the
radius, lets all of the intersections of these circles and the query line cut the CRNN
query into several time segments that have different RNN query results.

Figure 6. Finding segment point of CRNN search

Figure 7 illustrates the time segmentation process described above. For objects a, b, and
c, their respective NNs are identified first: NN(a)=b, NN(b)=a, and NN(c)=b. Next, use
each object as the center of a circle, and the distance to its respective NN as the
radius, to draw the circles of a, b, and c. Then, the intersections of the circles and the
qline, Ps, P1, P2, P3, P4, and Pe, are sorted according to time, and every two adjacent
intersection points define a time segment. The entire CRNN query is cut into five time
segments, [Ps,P1], [P1,P2], [P2,P3], [P3,P4], and [P4,Pe]. Every segment has a unique RNN
query result.
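The two steps can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the NN distances are computed by brute force instead of an index, positions along the query line are expressed as a parameter t in [0, 1] rather than as time (equivalent for a constant velocity), the RNN set of a segment is evaluated at its midpoint, the start and end points are assumed to differ, and all names and coordinates are hypothetical. The example layout was chosen so that NN(a)=b, NN(b)=a and NN(c)=b, as in the discussion of Figure 7.

```python
import math


def nn_distance(objects):
    """Distance from every object to its nearest neighbour among the others."""
    return {
        name: min(math.dist(p, other) for o, other in objects.items() if o != name)
        for name, p in objects.items()
    }


def segment_points(objects, start, end):
    """Sorted parameters t in [0, 1] where the query line crosses an NN circle."""
    dnn = nn_distance(objects)
    dx, dy = end[0] - start[0], end[1] - start[1]
    a = dx * dx + dy * dy                      # assumes start != end
    cuts = {0.0, 1.0}
    for name, (cx, cy) in objects.items():
        fx, fy = start[0] - cx, start[1] - cy
        b = 2.0 * (fx * dx + fy * dy)
        c = fx * fx + fy * fy - dnn[name] ** 2
        disc = b * b - 4.0 * a * c
        if disc <= 0.0:
            continue                           # line misses or only touches the circle
        root = math.sqrt(disc)
        for t in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
            if 0.0 < t < 1.0:
                cuts.add(t)
    return sorted(cuts), dnn


def crnn(objects, start, end):
    """List of ((t_from, t_to), RNN set) along the query line."""
    cuts, dnn = segment_points(objects, start, end)
    result = []
    for t0, t1 in zip(cuts, cuts[1:]):
        tm = (t0 + t1) / 2.0                   # any interior point of the segment works
        q = (start[0] + tm * (end[0] - start[0]),
             start[1] + tm * (end[1] - start[1]))
        rnn = {n for n, p in objects.items() if math.dist(q, p) < dnn[n]}
        result.append(((t0, t1), rnn))
    return result


# Hypothetical layout with NN(a)=b, NN(b)=a, NN(c)=b.
objects = {"a": (2.0, 1.0), "b": (4.0, 1.0), "c": (5.0, -1.5)}
answer = crnn(objects, start=(1.0, 0.0), end=(7.0, 0.0))
rnn_sets = [rnn for _, rnn in answer]
```

For this layout the query line is cut into five segments whose RNN sets are {a}, {a, b}, {a, b, c}, {b, c} and {c}, in that order. The design choice of testing one interior point per segment relies on the property stated above: the RNN result cannot change between two consecutive segment points.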



Figure 7: Segmenting of the CRNN query

4.2 Calculating RNN result of each segment

In the previous section, the intersections of the qline with the circles whose radii are the distances
between the objects and their respective NNs were defined. With these intersections, the CRNN query is
cut into several time segments. The next step is to find the RNNs for each of the time segments. Because
the distances from the objects to their respective NNs are used as the radii of the circles, and each
circle is labeled with its object's name, if a segment falls within a certain circle, then the object
represented by that circle belongs to the RNN result of this time segment. This is illustrated in
Figure 8. First, the intersections of the qline that represents the CRNN query with the circles of the
objects are sorted by time; every two consecutive intersection points define a time segment, and there
are five segments, [Ps, P1], [P1, P2], [P2, P3], [P3, P4], and [P4, Pe]. Segment [Ps, P1] is contained
only by circle a; therefore, RNN(q, [Ps, P1]) = {a}. Next, examine segment [P1, P2]; this segment is
contained by circle a and circle b. Therefore, RNN(q, [P1, P2]) = {a, b}. If this process is repeated,
the obtained results are RNN(q, [P2, P3]) = {a, b, c}, RNN(q, [P3, P4]) = {b, c}, and
RNN(q, [P4, Pe]) = {c}.

Figure 8: Calculating RNN result of each segment.

1. CRNN Algorithm with Index

Not every object will be an answer in the processing of a CRNN query. To improve query efficiency, it is
preferable to filter out in advance the objects that cannot be answers, which greatly reduces the search
space of the CRNN query, the amount of data that must be examined, and consequently the computation
cost. This filtering step is referred to as the pruning process. Figure 9 illustrates the flowchart of
CRNN query processing with a pruning process added.

Figure 9. Flow chart of CRNN query with index

Steps 2 and 3 are identical to Steps 1 and 2 of the CRNN search algorithm, which have been described in
the previous sections, and they are not repeated here. For Step 1, the pruning process, the Rdnn-tree
index structure is used to execute the pruning effectively. The three steps of the CRNN query with index
are illustrated in Figure 9. The pruning process is described below. For every internal node of the
Rdnn-tree, the distance from query q to the node's rectangle is computed, and this distance is denoted as
D(q,Rect). If D(q,Rect) of a node is larger than the MaxDnn of the node, then none of the objects beneath
it need to be considered, because the distance from query q to the Rect node is equal to or smaller than
the distances from query q to all the objects underneath the Rect node. When the distance from query q
to the Rect node is longer than MaxDnn, query q cannot be closer to any object underneath the Rect node
than that object's own NN, and no object underneath can be an RNN result for query q. Conversely, if
D(q,Rect) is equal to or smaller than the MaxDnn of the node, then some objects underneath the Rect node
may be farther from their respective NNs than from query q. That is, some objects may be RNN results for
query q. The examination continues along



the branch all the way to the leaf node. All entries underneath such a leaf node are recorded as
candidate objects for the RNN query result. The collection of these candidate objects is referred to as
RNNCanSet, which means that the possible results of the RNN query must exist within this collection, and
the objects outside of RNNCanSet cannot possibly be RNN query results. Only the objects inside RNNCanSet
need to be considered when finding the segment points of the CRNN query in the CRNN search algorithm.
This greatly reduces the number of objects to be handled and enhances the efficiency of the CRNN search
algorithm.

Figure 3 explains the pruning process. It begins with root node R. Because D(q,R) ≤ MaxDnn of R, child
nodes B1 and B2 must be examined. Because the MaxDnn of MBR B1 ≤ D(q,B1), all child nodes underneath B1
can be pruned. Next, D(q,B2) ≤ MaxDnn of B2, so child nodes b3 and b4 of B2 must be examined. D(q,b3) is
equal to or smaller than the MaxDnn of b3, which is also a leaf node; therefore, h and i are placed
inside RNNCanSet. Next, b4 is examined. The MaxDnn of b4 is equal to or smaller than D(q,b4); therefore,
b4 can be pruned. The entire pruning process then ends.

However, the CRNN query to be processed is not an RNN query of a single query point; therefore, the
pruning process in [22] cannot be used directly. To ensure that no possible RNN result is deleted, the
pruning criterion is changed from D(q,Rect) > MaxDnn, where D(q,Rect) is the distance from the query
point to Rect, to MinD(qline,Rect) > MaxDnn, where MinD(qline,Rect) represents the minimum distance from
the qline to the Rect node. The minimum distance is selected because if the minimum distance from the
entire qline to the Rect node is larger than MaxDnn, then the distance from any given point of time on
the qline to the Rect node must also be larger than MaxDnn. Therefore, none of the objects underneath
the Rect node can be an RNN for the qline, and they are pruned from consideration. Details of the
pruning algorithm are exhibited in Algorithm 1:

Algorithm 1: Pruning Algorithm.
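The listing of Algorithm 1 is not reproduced here, so the following Python sketch only illustrates the pruning rule MinD(qline,Rect) > MaxDnn described above. It is not the authors' algorithm: the `Node` class, its field names, and the helper functions are hypothetical, rectangles are assumed to be axis-aligned tuples (xmin, ymin, xmax, ymax), and the qline is assumed to be a straight segment.

```python
import math
from dataclasses import dataclass, field

def point_seg_dist(p, a, b):
    """Distance from point p to segment a-b."""
    ax, ay = b[0] - a[0], b[1] - a[1]
    len2 = ax * ax + ay * ay
    t = 0.0 if len2 == 0 else max(0.0, min(1.0, ((p[0] - a[0]) * ax + (p[1] - a[1]) * ay) / len2))
    return math.dist(p, (a[0] + t * ax, a[1] + t * ay))

def cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def seg_seg_dist(a, b, c, d):
    """Distance between segments a-b and c-d (0 if they properly cross)."""
    if cross(a, b, c) * cross(a, b, d) < 0 and cross(c, d, a) * cross(c, d, b) < 0:
        return 0.0
    # Touching or disjoint segments: the minimum is attained at an endpoint.
    return min(point_seg_dist(a, c, d), point_seg_dist(b, c, d),
               point_seg_dist(c, a, b), point_seg_dist(d, a, b))

def min_dist_qline_rect(ps, pe, rect):
    """MinD(qline, Rect) for an axis-aligned rectangle."""
    xmin, ymin, xmax, ymax = rect
    if xmin <= ps[0] <= xmax and ymin <= ps[1] <= ymax:
        return 0.0                                   # qline starts inside the rectangle
    corners = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    return min(seg_seg_dist(ps, pe, corners[i], corners[(i + 1) % 4]) for i in range(4))

@dataclass
class Node:                                          # hypothetical Rdnn-tree-like node
    rect: tuple                                      # MBR of everything underneath
    max_dnn: float                                   # largest NN distance underneath
    children: list = field(default_factory=list)     # internal node: child nodes
    entries: list = field(default_factory=list)      # leaf node: object identifiers

def prune(node, ps, pe, cand=None):
    """Collect RNNCanSet: entries of every leaf that survives the MinD test."""
    cand = [] if cand is None else cand
    if min_dist_qline_rect(ps, pe, node.rect) > node.max_dnn:
        return cand                                  # whole subtree pruned
    if node.children:
        for child in node.children:
            prune(child, ps, pe, cand)
    else:
        cand.extend(node.entries)
    return cand
```

Only the objects returned by `prune` would then be passed on to the segmentation and per-segment RNN steps.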


5. Performance Study

To evaluate the improvement in CRNN query efficiency achieved by the method proposed in this study,
several experiments were designed. This section describes the experiment environments, the experimental
parameters and settings, and a comparison of the experiment results.

5.1 Experiment Settings

The coordinates of the objects are dispersed over a [0, 1]×[0, 1] plane. Because the distribution
density of the objects may influence efficiency, it should be taken into



consideration in the experiment. In the experiment, three different types of distribution are used to
generate the objects' coordinates: Uniform distribution, Gaussian distribution, and Zipf distribution.
In Uniform distribution, the objects are evenly distributed on the plane, as shown in Figure 10(a). In
Gaussian distribution, most of the objects concentrate on the center of the plane, as shown in
Figure 10(b). In Zipf distribution, most of the objects are located at the extreme left and extreme
bottom of the plane. In the experiment, the skew factor is set at 0.8, as shown in Figure 10(c). In
addition, 30 queries are generated randomly in a [0.4, 0.6]×[0.4, 0.6] plane by referring to [15], and
the velocity vector of each query falls within [-0.01, 0.01]. Because this experiment is concerned with
the influence of different types of object distribution on efficiency, the queries are generated as
close to the center of the plane as possible. Having executed 30 queries, the average cost of executing
one CRNN search is used in determining which method is more favorable. As to the program coding of the
Rdnn-tree in the CRNN search algorithm, the R*-tree code of GIST [7] is adapted into an Rdnn-tree to
match the requirements of this experiment.
Figure 10: Data sets of experiment evaluation
((a) Uniform, (b) Gaussian, (c) Zipf; X axis and Y axis both range from 0 to 1)
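For readers who want to reproduce data sets of this kind, the following Python sketch generates the three distributions and the query set. It is an illustration under stated assumptions, not the generator used in the experiment: the function names are hypothetical, the Gaussian standard deviation (0.15) and the number of Zipf cells (1000) are arbitrary choices, and the Zipf skew factor is interpreted as a weight of 1/rank^0.8 over cells ordered from the left (or bottom) edge.

```python
import random

def uniform_points(n, rng):
    """Objects spread evenly over the [0, 1] x [0, 1] plane."""
    return [(rng.random(), rng.random()) for _ in range(n)]

def gaussian_points(n, rng, mean=0.5, std=0.15):
    """Objects concentrated around the center; out-of-range draws are resampled."""
    def coord():
        while True:
            v = rng.gauss(mean, std)
            if 0.0 <= v <= 1.0:
                return v
    return [(coord(), coord()) for _ in range(n)]

def zipf_points(n, rng, skew=0.8, bins=1000):
    """Objects skewed toward the left and bottom edges (assumed Zipf weighting)."""
    weights = [1.0 / (rank ** skew) for rank in range(1, bins + 1)]
    def coords():
        cells = rng.choices(range(bins), weights=weights, k=n)
        return [(c + rng.random()) / bins for c in cells]   # jitter inside the cell
    return list(zip(coords(), coords()))

def make_queries(count, rng):
    """Start points in [0.4, 0.6]^2 with velocity components in [-0.01, 0.01]."""
    return [((rng.uniform(0.4, 0.6), rng.uniform(0.4, 0.6)),
             (rng.uniform(-0.01, 0.01), rng.uniform(-0.01, 0.01)))
            for _ in range(count)]
```

Passing a seeded `random.Random` instance as `rng` makes the generated data sets repeatable across runs.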

In addition to the object distribution described above, the influences that the length of query time
(qline) and the number of objects may have on efficiency are also considered. Three data sets, Uniform,
Gaussian, and Zipf, are considered for object distribution. The length of query time (qline) changes
from query length 1 to query length 10. The number of objects changes from 1K to 100K. Parameters and
settings used in the experiment are listed in Table 1.

Table 1: Parameter settings of experiment

Parameter      Description               Settings
distribution   Data distribution         Uniform, Gaussian, Zipf
interval       Time interval of query    1, 2, 5, 8, 10
object-no      Number of data objects    1, 10, 30, 50, 100 (K)

5.2 Compared Algorithms and Performance Metrics

The most intuitive method for finding RNNs is to look for the NN of every object. If the number of
objects is n, then the time complexity is O(n²). Next, determine which objects' NNs are query points.
If the NNs are the query points, then the objects are the RNNs for the query points. The required time
complexity for the RNN algorithm is O(n³). However, the most intuitive method for a CRNN query is to
execute the RNN algorithm for every point of time; because time is continuous, it is impossible to
calculate the required count of executions. Therefore, the CRNN query time must be segmented before the
total execution time required for the CRNN query can be calculated. The more finely the time is
segmented, the more executions of RNN are required. If a period of time is segmented into m segments,
then the time complexity is O(m×n³), and if time is not adequately segmented, then the RNN result may
be erroneous. These drawbacks make it an inefficient CRNN search algorithm, and it is not compared in
this experiment. The efficiency of two methods is compared in this experiment: one uses the Rdnn-tree as
the index, and the other uses no index. To evaluate these two methods, the time required for one CRNN
search execution is compared, and this is referred to as total cost in this study.

5.4 Performance Results and Discussion

Based on the changes of the parameters (distribution, interval, and object-no), different types of
experiments have been conducted. Results are summarized by object-no and query interval in the following
sections.

5.4.1 The effect of object-no parameter

First, the query interval is fixed at 5. The influence imposed on efficiency by the object-no



parameter, which is the number of objects, under different types of object distribution will be
discussed. The experiment result is shown in Figure 11. The X axis represents the number of objects, and
the Y axis represents the total time cost required for executing one CRNN search. Total cost increases
as the number of objects increases. In addition, when the number of objects is 1K, the total cost of the
CRNN search that uses the Rdnn-tree is about 15 seconds, and that of the CRNN search using no Rdnn-tree
index is about 300 seconds; the indexed search is about 20 times faster. Pruning unnecessary objects by
adopting the Rdnn-tree as the index to reduce the CRNN search space clearly provides much higher
efficiency than not adopting the Rdnn-tree. The comparison of the influence of object distribution on
efficiency suggests that Zipf distribution offers the best efficiency, followed by Uniform distribution,
and Gaussian distribution offers the worst. This result can be explained as follows: because the data in
Zipf distribution are located at the far left and the bottom of the plane, most of the data are pruned,
and the number of objects that remain in RNNCanSet is very small, offering the lowest total cost. In
contrast, data in Gaussian distribution concentrate in the center of the plane and very few data can be
pruned, leaving many objects and a large RNNCanSet, thus resulting in the highest total cost.
Figure 11: Influences on different types of data distribution from changing object-no
(three panels; X axis: object-no, 1K to 100K; Y axis: total time in seconds, logarithmic scale;
curves: CRNN without index, CRNN with index)

5.4.2 The effect of query interval parameter

This section focuses on the influence of the length of the query interval on each method under different
types of object distribution. Results of the experiment are shown in Figure 12. Generally speaking, when
the query interval is lengthened, the distance MinD(qline, Rect) decreases, which reduces pruning
efficiency. In addition, when the interval increases, the number of time segments produced by the CRNN
query increases. Consequently, the number of RNN searches, one per segment, increases, and the total
cost of the CRNN query increases as well.
Figure 12: Effect of query interval parameter for different data distribution
(three panels; X axis: query interval, 1 to 10; Y axis: total time in seconds, logarithmic scale;
curves: CRNN without index, CRNN with index)

6. Conclusions and Future Works

An efficient CRNN search algorithm is proposed in this study. The algorithm requires only one execution
to find the RNN results for an entire continuous RNN query. The experiments in this study also
demonstrate the efficiency of the proposed method. As wireless communication and mobile device
technology mature, more and more users access information from wireless information systems through
mobile devices. To process requests from more and more mobile users, data dissemination



through broadcast is an effective solution for scalability. The future goal of this study is to extend
CRNN search to the wireless broadcasting environment.

7. References

[1] AmitSingh, H.F. and Tosun, A.S. (2003) High dimensional reverse nearest neighbor queries.
    Proceedings of the 20th International Conference on Information and Knowledge Management (CIKM'03),
    New Orleans, LA, USA, pp. 91–98.
[2] Barbara, D. (1999) Mobile computing and databases - a survey. IEEE Transactions on Knowledge and
    Data Engineering, 11, 108–117.
[3] Benetis, R., Jensen, C.S., Karciauskas, G., and Saltenis, S. (2002) Nearest neighbor and reverse
    nearest neighbor queries for moving objects. International Database Engineering and Applications
    Symposium, Canada, July 17-19, pp. 44–53.
[4] Chaudhuri, S. and Gravano, L. (1999) Evaluating top-k selection queries. Proceedings of the 25th
    IEEE International Conference on Very Large Data Bases, pp. 397–410.
[5] Civilis, A., Jensen, C.S., and Pakalnis, S. (2005) Techniques for efficient road-network-based
    tracking of moving objects. IEEE Transactions on Knowledge and Data Engineering, 17, 698–712.
[6] Computer Science and Telecommunication Board. IT Roadmap to a Geospatial Future. The National
    Academies Press, 2003.
[7] http://gist.cs.berkeley.edu/.
[8] Guttman, A. (1984) R-trees: A dynamic index structure for spatial searching. Proceedings of the 1984
    ACM SIGMOD International Conference on Management of Data, pp. 47–57.
[9] Hjaltason, G.R. and Samet, H. (1999) Distance browsing in spatial databases. ACM Transactions on
    Database Systems (TODS), 24, 265–318.
[10] Korn, F. and Muthukrishnan, S. (2000) Influence sets based on reverse nearest neighbor queries.
    Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, Texas,
    USA, May 16-18, pp. 201–212.
[11] Korn, F., Muthukrishnan, S., and Srivastava, D. (2002) Reverse nearest neighbor aggregates over
    data streams. Proceedings of the International Conference on Very Large Data Bases (VLDB'02),
    Hong Kong, China, August, pp. 91–98.
[12] Korn, F., Sidiropoulos, N., Faloutsos, C., Siegel, E., and Protopapas, Z. (1996) Fast nearest
    neighbor search in medical image database. In Proceedings of the 22nd International Conference on
    Very Large Data Bases (VLDB'96), pp. 215–226.
[13] Lee, D.L., Lee, W.-C., Xu, J., and Zheng, B. (2002) Data management in location dependent services.
    IEEE Pervasive Computing, 1, 65–72.
[14] Roussopoulos, N., Kelley, S., and Vincent, F. (1995) Nearest neighbor queries. Proceedings of ACM
    SIGMOD International Conference on Management of Data, Illinois, USA, June, pp. 71–79.
[15] Ross, S. (2000) Introduction to Probability and Statistics for Engineers and Scientists.
[16] Stanoi, I., Agrawal, D., and Abbadi, A.E. (2000) Reverse nearest neighbor queries for dynamic
    databases. ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, pp. 44–53.
[17] Song, Z. and Roussopoulos, N. (2001) K-nearest neighbor search for moving query point. Proceedings
    of 7th International Symposium on Advances in Spatial and Temporal Databases, LNCS 2121,
    Redondo Beach, CA, USA, July 12-15, pp. 79–96.
[18] Stanoi, I., Riedewald, M., Agrawal, D., and Abbadi, A.E. (2001) Discovery of influence sets in
    frequently updated databases. Proceedings of the 27th VLDB Conference, Roma, Italy, pp. 99–108.
[19] Tao, Y., Papadias, D., and Lian, X. (2004) Reverse knn search in arbitrary dimensionality.
    Proceedings of 30th Very Large Data Bases, Toronto, Canada, August 29-September 3, pp. 279–290.
[20] Tao, Y., Papadias, D., and Shen, Q. (2002) Continuous nearest neighbor search. International
    Conference on Very Large Data Bases, Hong Kong, China, August 20-23, pp. 279–290.
[21] Xu, J., Zheng, B., Lee, W.-C., and Lee, D.L. (2003) Energy efficient index for energy query
    location-dependent data in mobile environments. In Proceedings of the 19th IEEE International
    Conference on Data Engineering (ICDE'03), Bangalore, India, March, pp. 239–250.
[22] Yang, C. and Lin, K.-I. (2001) An index structure for efficient reverse nearest neighbor queries.
    Proceedings of the 17th International Conference on Data Engineering, pp. 485–492.
[23] Yiu, M.L., Papadias, D., Mamoulis, N., and Tao, Y. (2005) Reverse nearest neighbors in large graphs.
    Proceedings of 21st IEEE International Conference on Data Engineering (ICDE), Tokyo, Japan,
    April 5-8, pp. 186–187.
[24] Yu, C., Ooi, B.C., Tan, K.-L., and Jagadish, H.V. (2001) Indexing the distance: An efficient method
    to knn processing. Proceedings of the 27th VLDB Conference, Roma, Italy, pp. 421–430.
[25] Zheng, B., Lee, W.-C., and Lee, D.L. (2003) Search k nearest neighbors on air. Proceedings of the
    4th International Conference on Mobile Data Management, Melbourne, Australia, January, pp. 181–195.
[26] Zheng, B., Xu, J., Lee, W.-C., and Lee, D.L. (2004) Energy conserving air indexes for nearest
    neighbor search. Proceedings of the 9th International Conference on Extending Database Technology
    (EDBT'04), Heraklion, Crete, Greece, March, pp. 48–66.
[27] Zhang, J., Zhu, M., Papadias, D., Tao, Y., and Lee, D.L. (2003) Location-based spatial queries. In
    Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, San Diego,
    California, USA, June 9-12, pp. 443–454.


REDUCTION OF INTERCARRIER INTERFERENCE IN OFDM
SYSTEMS
R.Kumar Dr. S.Malarvizhi
* Dept. of Electronics and Comm. Engg., SRM University, Chennai, India-603203
rkumar68@gmail.com

ABSTRACT

Orthogonal Frequency Division Multiplexing (OFDM) is a promising technique for broadband wireless
communication systems. However, a particular problem in OFDM is its vulnerability to frequency offset
errors, which destroy the orthogonality of the subcarriers and result in Intercarrier Interference
(ICI). ICI causes power leakage among subcarriers, thus degrading the system performance. This paper
investigates the effectiveness of Maximum-Likelihood Estimation (MLE), Extended Kalman Filtering (EKF)
and the Self-Cancellation (SC) technique for mitigation of ICI in OFDM systems. Numerical simulations of
the ICI mitigation schemes are performed, and their performance is evaluated and compared in terms of
bit error rate (BER), bandwidth efficiency and computational complexity.

Keywords: Orthogonal Frequency Division Multiplexing (OFDM), Intercarrier Interference (ICI), Carrier
Frequency Offset (CFO), Carrier to Interference Ratio (CIR), Maximum Likelihood (ML), Extended Kalman
Filtering (EKF).

1. Introduction

Orthogonal frequency division multiplexing (OFDM), because of its resistance to multipath fading, has
attracted increasing interest in recent years as a suitable modulation scheme for commercial high-speed
broadband wireless communication systems. OFDM can provide large data rates with sufficient robustness
to radio channel impairments. It is easy to implement with the help of the Fast Fourier Transform and
Inverse Fast Fourier Transform for demodulation and modulation respectively [1].

It is a special case of multi-carrier modulation in which a large number of orthogonal, overlapping,
narrow band sub-channels or subcarriers, transmitted in parallel, divide the available transmission
bandwidth [2]. The separation of the subcarriers is theoretically minimal such that there is a very
compact spectral utilization. These subcarriers have different frequencies and they are orthogonal to
each other [3]. Since the bandwidth is narrower, each sub-channel requires a longer symbol period. Due
to the increased symbol duration, the ISI over each channel is reduced.

However, a major problem in OFDM is its vulnerability to frequency offset errors between the transmitted
and received signals, which may be caused by Doppler shift in the channel or by the difference between
the transmitter and receiver local oscillator frequencies [4]. In such situations, the orthogonality of
the carriers is no longer maintained, which results in Intercarrier Interference (ICI). ICI results from
the other sub-channels in the same data block of the same user. The ICI problem becomes more complicated
when multipath fading is present [5]. If ICI is not properly compensated, it results in power leakage
among the subcarriers, thus degrading the system performance.

In [6], the data-conversion ICI self-cancellation method was proposed to cancel the ICI caused by
frequency offset in the OFDM system. In [7], the data-conjugate ICI self-cancellation method was
proposed to minimize the ICI caused by frequency offset, and it could reduce the peak-to-average power
ratio (PAPR) more than the data-conversion method. In [8], a self ICI cancellation method which maps the
data to be transmitted onto adjacent pairs of subcarriers has been described, but this method is less
bandwidth efficient. In [9], the joint Maximum Likelihood symbol-time and carrier frequency offset (CFO)
estimator in OFDM systems has been developed. In this paper, only the carrier frequency offset (CFO) is
estimated and is cancelled at the receiver. In addition, statistical approaches have also been explored
to estimate and cancel ICI [10].

Organization: This paper is organized as follows. In section 2, the standard OFDM system is described.
In section 3, the ICI mitigation schemes, namely the Self-Cancellation (SC), Maximum Likelihood
Estimation (MLE) and Extended Kalman Filtering (EKF) methods, are described. In section 4, simulations
and results for the three methods are shown and compared in terms of bandwidth efficiency and bit error
rate (BER) performance. Section 5 concludes the paper.

2. System Description

The block diagram of a standard OFDM system is given in figure 1. In an OFDM system, the input data
stream is converted into N parallel data streams, each with symbol period Ts, through a
serial-to-parallel port. When the parallel symbol streams are generated, each stream is modulated and
carried at a different center frequency. The sub-carriers are spaced by 1/NTs in frequency, thus they
are orthogonal over the interval (0, Ts). Then, the N symbols are mapped to bins of an inverse fast
Fourier transform (IFFT). These IFFT [11] bins correspond to the orthogonal sub-carriers in the OFDM
symbol. Therefore, the OFDM symbol can be expressed as

$$x(n) = \frac{1}{N} \sum_{m=0}^{N-1} X_m e^{j 2\pi n m / N} \qquad (1)$$

where the X_m's are the base band symbols on each sub-carrier. The digital-to-analog (D/A) converter
then creates an analog time-domain signal which is transmitted through the channel.

At the receiver, the signal is converted back to a discrete N point sequence y(n), corresponding to each
sub-carrier. This discrete signal is demodulated using an N-point Fast Fourier Transform (FFT) operation
at the receiver.
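As a small illustration of the modulation in equation (1) and the N-point demodulation at the receiver, the following Python sketch uses a direct O(N²) DFT rather than an FFT for clarity. The function names are hypothetical, and `apply_offset` assumes the sample-wise rotation model y(n) = x(n) e^{j2πnε/N} that the paper uses later for the frequency offset; noise is omitted.

```python
import cmath

def ofdm_modulate(X):
    """Equation (1): x(n) = (1/N) * sum_m X[m] * exp(j*2*pi*n*m/N)."""
    N = len(X)
    return [sum(X[m] * cmath.exp(2j * cmath.pi * n * m / N) for m in range(N)) / N
            for n in range(N)]

def ofdm_demodulate(y):
    """N-point DFT: Y(m) = sum_n y[n] * exp(-j*2*pi*n*m/N) (noise term omitted)."""
    N = len(y)
    return [sum(y[n] * cmath.exp(-2j * cmath.pi * n * m / N) for n in range(N))
            for m in range(N)]

def apply_offset(x, eps):
    """Assumed offset model: rotate sample n by exp(j*2*pi*n*eps/N)."""
    N = len(x)
    return [x[n] * cmath.exp(2j * cmath.pi * n * eps / N) for n in range(N)]
```

Demodulating `ofdm_modulate(X)` directly returns X up to rounding error, whereas demodulating `apply_offset(ofdm_modulate(X), eps)` with a nonzero `eps` leaks energy from each subcarrier into its neighbours, which is the ICI discussed in the Introduction.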


Figure 1: OFDM System Model

The demodulated symbol stream is given by:

$$Y(m) = \sum_{n=0}^{N-1} y(n) e^{-j 2\pi n m / N} + w(m) \qquad (2)$$

where w(m) corresponds to the FFT of the samples of w(n), which is the Additive White Gaussian Noise
(AWGN) introduced in the channel.

3. ICI Mitigation Schemes

3.1 Self-Cancellation (SC) Scheme

In this scheme, data is mapped onto groups of subcarriers with predefined coefficients. This results in
cancellation of the component of ICI within that group due to the linear variation in weighting
coefficients, hence the name self-cancellation. The complex ICI coefficients S(l-k) are given by

$$S(l-k) = \frac{\sin(\pi(l+\varepsilon-k))}{N \sin(\pi(l+\varepsilon-k)/N)} \exp\left(j\pi(1-1/N)(l+\varepsilon-k)\right) \qquad (3)$$

3.1.1 ICI Canceling Modulation

The ICI self-cancellation scheme requires that the transmitted signals be constrained such that
X(1) = -X(0), X(3) = -X(2), ..., X(N-1) = -X(N-2). This allows the received signal on subcarriers k and
k+1 to be written as

$$Y'(k) = \sum_{l=0,2,4,6,\ldots}^{N-2} X(l)\,[S(l-k) - S(l+1-k)] + n_k \qquad (4)$$

$$Y'(k+1) = \sum_{l=0,2,4,6,\ldots}^{N-2} X(l)\,[S(l-k-1) - S(l-k)] + n_{k+1} \qquad (5)$$

where n_k and n_{k+1} are the noise added on subcarriers k and k+1. The ICI coefficient S'(l-k) is
denoted as

$$S'(l-k) = S(l-k) - S(l+1-k) \qquad (6)$$

Figure 2: Comparison between |S''(l-k)|, |S'(l-k)| and |S(l-k)|

It is seen from figure 2 that |S'(l-k)| << |S(l-k)| for most of the l-k values. Hence, the ICI
components are much smaller. Also, the total number of interference signals is halved, since only the
even subcarriers are involved in the summation.

3.1.2 ICI Canceling Demodulation

ICI canceling modulation introduces redundancy in the received signal, since each pair of subcarriers
transmits only one data symbol. This redundancy can be exploited to improve the system power
performance, although it decreases the bandwidth efficiency. To take advantage of this redundancy, the
received signal at the (k+1)th subcarrier, where k is even, is subtracted from the kth subcarrier. This
is expressed mathematically as

$$Y''(k) = Y'(k) - Y'(k+1) = \sum_{l=0,2,4,\ldots}^{N-2} X(l)\,[-S(l-k-1) + 2S(l-k) - S(l-k+1)] + n_k - n_{k+1} \qquad (7)$$

Subsequently, the ICI coefficients for this received signal become

$$S''(l-k) = -S(l-k-1) + 2S(l-k) - S(l-k+1) \qquad (8)$$

When compared to the two previous ICI coefficients, |S(l-k)| for the standard OFDM system and |S'(l-k)|
for the ICI canceling modulation, |S''(l-k)| has the smallest ICI coefficients for the majority of l-k
values, followed by |S'(l-k)| and |S(l-k)|. This is shown in Figure 2 for N = 64 and ε = 0.5. The
combined modulation and demodulation method is called the ICI self-cancellation scheme. The reduction of
the ICI signal levels in the ICI self-cancellation scheme leads to a higher CIR. The theoretical CIR is
given by

$$CIR = \frac{\left| -S(-1) + 2S(0) - S(1) \right|^2}{\sum_{l=2,4,6,\ldots}^{N-1} \left| -S(l-1) + 2S(l) - S(l+1) \right|^2} \qquad (9)$$

As mentioned previously, the redundancy in this scheme reduces the bandwidth efficiency by half. There
is a tradeoff between bandwidth and power in the ICI self-cancellation scheme.
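The ICI coefficients in (3), (6) and (8) and the theoretical CIR in (9) can be evaluated numerically with a short Python sketch such as the one below. The function names are hypothetical, and the code assumes a non-integer normalized offset ε (for example 0 < ε < 1) so that the denominator of (3) never vanishes.

```python
import cmath
import math

def ici_coeff(m, eps, N):
    """Equation (3) with m = l - k; assumes m + eps is not a multiple of N."""
    u = m + eps
    return (math.sin(math.pi * u) / (N * math.sin(math.pi * u / N))
            * cmath.exp(1j * math.pi * (1 - 1 / N) * u))

def sc_mod_coeff(m, eps, N):
    """Equation (6): S'(m) = S(m) - S(m + 1)."""
    return ici_coeff(m, eps, N) - ici_coeff(m + 1, eps, N)

def sc_demod_coeff(m, eps, N):
    """Equation (8): S''(m) = -S(m - 1) + 2 S(m) - S(m + 1)."""
    return (-ici_coeff(m - 1, eps, N) + 2 * ici_coeff(m, eps, N)
            - ici_coeff(m + 1, eps, N))

def cir_self_cancellation_db(eps, N):
    """Equation (9), expressed in dB; the sum runs over l = 2, 4, 6, ... below N."""
    signal = abs(sc_demod_coeff(0, eps, N)) ** 2
    interference = sum(abs(sc_demod_coeff(l, eps, N)) ** 2 for l in range(2, N, 2))
    return 10 * math.log10(signal / interference)
```

Evaluating the three coefficient functions over a range of l-k values for a chosen N and ε gives the kind of magnitude comparison that Figure 2 describes.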
3.2 Maximum Likelihood Estimation
The second method for frequency offset correction
in OFDM systems was suggested by Moose in [12]. In this
approach, the frequency offset is first statistically estimated
using a maximum likelihood algorithm and then cancelled at
the receiver. This technique involves the replication of an
OFDM symbol before transmission and comparison of the



phases of each of the subcarriers between the successive z(n) is linearly related to d(n). Hence the normalized
symbols. frequency offset ε (n) can be estimated in a recursive
When an OFDM symbol of sequence length N is procedure similar to the discrete Kalman filter. As linear
replicated, the receiver receives, in the absence of noise, the approximation is involved in the derivation, the filter is called
2N point sequence i.e., {r (n)} given by the extended Kalman filter (EKF). The EKF provides a
1 K trajectory of estimation for ε(n). The error in each update
r ( n) =
N
∑ X ( k ) H ( k )e
k =− K
j 2πn ( k +ε ) / N
(10) decreases and the estimate becomes closer to the ideal value
during iterations.
where {X(k)} are the 2K+1 complex modulation
values used to modulate 2K+1 subcarriers, 4.2 ICI Cancellation
The first set of N symbols are demodulated using an There are two stages in the EKF scheme to mitigate
N-point FFT to yield the sequence R1(k), and the second set is the ICI effect: the offset estimation scheme and the offset
demodulated using another N-point FFT to yield the sequence correction scheme.
R2(k). The frequency offset is the phase difference between R1
(k) and R2 (k), that is 4.2.1 Offset Estimation Scheme
To estimate the quantity ε (n) using an EKF in each
R2 (k) = R1 (k) ej2πε (11) OFDM frame, the state equation is built as
ε(n) = ε (n-1) (23)
Adding the AWGN yields i.e., in this case we are estimating an unknown constant ε. This
Y1 (k) = R1 (k) + W1 (k) (12) constant is distorted by a non-stationary process x(n), an
Y2 (k) = R1 (k) ej2πε + W2 (k) observation of which is the preamble symbols preceding the
k = 0, 1 ...N – 1 data symbols in the frame. The observation equation is
The maximum likelihood estimate of the normalized
frequency offset is given by: y(n) = x(n) e j2 π n ε(n) / N + w(n) (24)
⎧ K ⎫

∑ Im Y (k )Y * (k )
⎪ ⎪
⎪ ⎪

2 1
⎪ where y(n) denotes the received preamble symbols
∧ 1
tan − 1
⎪ ⎪
distorted in the channel, w(n) the AWGN, and x(n) the IFFT
ε=
⎪ ⎪
⎪ k =− K ⎪
⎨ ⎬ (13) of the preambles X(k) that are transmitted, which are known at
2π ⎪ K ⎪

∑ Re Y (k )Y * (k )
⎪ ⎪




the receiver. Assume there are Np preambles preceding the
⎪ 2 1 ⎪
⎪⎩ k =− K ⎪⎭ data symbols in each frame are used as a training sequence
This maximum likelihood estimate is a conditionally and the variance σ2 of the AWGN w(n) is stationary.
unbiased estimate of the frequency offset and was computed
using the received data. Once the frequency offset is known, 4.2.2 Offset Correction Scheme
the ICI distortion in the data symbols is reduced by The ICI distortion in the data symbols x(n) that
multiplying the received symbols with a complex conjugate of follow the training sequence can then be mitigated by
the frequency shift and applying the FFT, multiplying the received data symbols y(n) with a complex
X(n) = FFT{ y(n) e^(-j2πnε/N) }  (14)

3.3 Extended Kalman Filtering
A state-space model of the discrete Kalman filter is defined as
z(n) = a(n) d(n) + v(n)  (15)
In this model, the observation z(n) has a linear relationship with the
desired value d(n). Using the discrete Kalman filter, d(n) can be
estimated recursively from the observations z(n), and the updated
estimate in each recursion is optimum in the minimum mean-square sense.
The received symbols in an OFDM system are
y(n) = x(n) e^(j2πnε(n)/N) + w(n)  (16)
where y(n) is the received symbol and x(n) is the FFT of the transmitted
symbol. The observation y(n) is clearly in a nonlinear relationship with
the desired value ε(n), i.e.
y(n) = f(ε(n)) + w(n)  (17)
where
f(ε(n)) = x(n) e^(j2πnε(n)/N)  (18)
In order to estimate ε(n) with low computational cost, we build an
approximate linear relationship using the first-order Taylor expansion:
y(n) ≈ f(εˆ(n-1)) + f'(εˆ(n-1)) [ε(n) - εˆ(n-1)] + w(n)  (19)
where εˆ(n-1) is the estimate of ε(n-1).
Define
z(n) = y(n) - f(εˆ(n-1))  (20)
d(n) = ε(n) - εˆ(n-1)  (21)
and the following relationship is obtained:
z(n) = f'(εˆ(n-1)) d(n) + w(n)  (22)
which has the same linear form as (15), with a(n) = f'(εˆ(n-1)).

conjugate of the estimated frequency offset and applying FFT, i.e.
xˆ(n) = FFT{ y(n) e^(-j2πnεˆ(n)/N) }  (25)
As the estimation of the frequency offset by the EKF scheme is efficient
and accurate, the performance is expected to be influenced mainly by the
variation of the AWGN.

4.3 Algorithm
1. Initialize the estimate εˆ(0) and the corresponding state error P(0).
2. Compute H(n), the derivative of y(n) with respect to ε(n), evaluated
   at εˆ(n-1), the estimate obtained in the previous iteration.
3. Compute the time-varying Kalman gain K(n) using the error variance
   P(n-1), H(n), and σ^2.
4. Compute the estimate yˆ(n) using x(n) and εˆ(n-1), i.e. based on the
   observations up to time n-1, and compute the error between the true
   observation y(n) and yˆ(n).
5. Update the estimate εˆ(n) by adding the K(n)-weighted error between
   the observation y(n) and yˆ(n) to the previous estimate εˆ(n-1).
6. Compute the state error P(n) from the Kalman gain K(n), H(n), and the
   previous error P(n-1).
7. If n is less than Np, increment n by 1 and go to step 2; otherwise
   stop.
The actual error of the estimate εˆ(n) from the ideal value ε(n) is thus
computed in each step and used to adjust the estimate in the next step.
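
The seven steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' MATLAB implementation. The derivative H(n) = j(2πn/N) x(n) e^(j2πnεˆ(n-1)/N) follows from (18). The gain and error-update expressions are the usual scalar Kalman forms and are an assumption here, as is taking the real part of the weighted error because ε is real. The function name `ekf_offset_estimate` and the default values are hypothetical.

```python
import numpy as np

def ekf_offset_estimate(y, x, N, sigma2, eps0=0.0, P0=1.0):
    """Sketch of the EKF recursion for the normalized frequency offset.

    y      : received preamble symbols, y(n) = x(n) e^{j 2 pi n eps / N} + w(n)
    x      : known preamble symbols x(n)
    sigma2 : assumed variance of the AWGN term w(n) (must be > 0)
    """
    eps_hat, P = eps0, P0                          # step 1: initialise
    for n in range(len(y)):
        # predicted observation y^(n) from x(n) and the previous estimate
        y_pred = x[n] * np.exp(1j * 2 * np.pi * n * eps_hat / N)
        # step 2: derivative of f with respect to eps at eps_hat(n-1)
        H = 1j * 2 * np.pi * n / N * y_pred
        # step 3: Kalman gain (assumed standard scalar form)
        K = P * np.conj(H) / (P * abs(H) ** 2 + sigma2)
        # steps 4-5: add the K(n)-weighted error to the previous estimate
        eps_hat = eps_hat + (K * (y[n] - y_pred)).real
        # step 6: update the state error
        P = (1.0 - (K * H).real) * P
    return eps_hat                                 # step 7: stop after Np symbols
```

With a known BPSK preamble and a small noise variance, the estimate settles close to the true offset after a few tens of symbols, because later symbols carry a larger phase rotation and therefore a larger |H(n)|.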

UbiCC Journal - Volume 3 30


4. SIMULATIONS AND RESULTS
In order to compare the ICI cancellation schemes, BER curves were used
to evaluate the performance of each scheme. The simulations were carried
out in MATLAB over an AWGN channel.

Table 1: Simulation Parameters
PARAMETERS                VALUES
Number of carriers (N)    1705
Modulation (M)            BPSK
Frequency offset ε        [0.25, 0.5, 0.75]
No. of OFDM symbols       100
Bits per OFDM symbol      N*log2(M)
Eb/No                     1:20
IFFT size                 2048

Figure 3: BER performance with ICI Cancellation for ε=0.25

Figure 3 shows that for small frequency offset values, the ML and SC
methods have similar performance. However, the ML method has a lower bit
error rate for increasing values of Eb/No.

Figure 4: BER performance with ICI Cancellation for ε=0.5

Figure 4 illustrates that for a frequency offset of 0.5, BER increases
for both methods, but the ML method maintains a lower bit error rate
than SC. EKF is better than the SC method.

Figure 5: BER performance with ICI Cancellation for ε=0.75

In Figure 5, for a frequency offset of 0.75, the self-cancellation
method has a BER similar to that of a standard OFDM system, since the
self-cancellation technique does not completely cancel the ICI from
adjacent sub-carriers and the effect of this residual ICI increases for
larger offset values. However, the ML method has a better BER
performance and proves to be more efficient than the SC method.

5. CONCLUSION
The figures show that the extended Kalman filter method does not perform
very well for very small frequency offsets, where it hardly improves
BER. For high frequency offsets, however, the Kalman filter performs
extremely well. An important advantage of the EKF method is that it does
not reduce bandwidth efficiency as the self-cancellation method does,
because the frequency offset can be estimated from the preamble of the
data sequence in each OFDM frame.
Self-cancellation does not require very complex hardware or software for
implementation. However, it is not bandwidth efficient, as there is a
redundancy of 2 for each carrier. The ML method also introduces the same
level of redundancy but provides better BER performance, since it
accurately estimates the frequency offset. EKF implementation is more
complex than the ML method but provides better BER performance.
Further work can be done by extending the concept of self-ICI
cancellation and by performing simulations to investigate the
performance of these ICI cancellation schemes in multipath fading
channels.

6. REFERENCES
[1] Ramjee Prasad, “OFDM for wireless communication system”, Artech
House, 2004.
[2] S. Weinstein and P. Ebert, “Data transmission by frequency-division
multiplexing using the discrete Fourier transform,” IEEE Trans.
Commun., vol. 19, pp. 628-634, Oct. 1971.
[3] L.J. Cimini, “Analysis and Simulation of a Digital Mobile Channel
Using Orthogonal Frequency Division Multiplexing”, IEEE Transactions on
Communication, no. 7, July 1985.
[4] M. Russell, G.L. Stuber, “Interchannel interference analysis of
OFDM in a mobile environment”, Vehicular Technology Conference, 1995
IEEE 45th, vol. 2, pp. 820-824, Jul. 1995.
[5] X. Cai, G.B. Giannakis, “Bounding performance and suppressing
intercarrier interference in wireless mobile OFDM”, IEEE Transactions
on Communications, vol. 51, no. 12, pp. 2047-2056, Dec. 2003.



[6] J. Armstrong, “Analysis of new and existing methods of reducing
intercarrier interference due to carrier frequency offset in OFDM,”
IEEE Transactions on Communications, vol. 47, no. 3, pp. 365-369,
March 1999.
[7] Y. Fu, S. G. Kang, and C. C. Ko, “A new scheme for PAPR reduction in
OFDM systems with ICI self-cancellation,” in Proc. VTC 2002-Fall, 2002
IEEE 56th Vehicular Technology Conf., vol. 3, pp. 1418-1421, Sep. 2002.
[8] Y. Zhao and S. Häggman, “Intercarrier interference self-cancellation
scheme for OFDM mobile communication systems,” IEEE Transactions on
Communications, vol. 49, no. 7, pp. 1185-1191, July 2001.
[9] J.-J. van de Beek, M. Sandell, and P.O. Borjesson, “ML estimation of
time and frequency offset in OFDM systems,” IEEE Trans. Signal Process.,
vol. 45, pp. 1800-1805, July 1997.
[10] Tiejun (Ronald) Wang, John G. Proakis, and James R. Zeidler,
“Techniques for suppression of intercarrier interference in OFDM
systems”, Wireless Communications and Networking Conference, 2005 IEEE,
vol. 1, pp. 39-44, 13-17 March 2005.
[11] William H. Tranter, K. Sam Shanmugam, Theodore S. Rappaport, Kurt
L. Kosbar, “Principles of Communication system simulation with wireless
application”, Pearson Education, 2004.
[12] P.H. Moose, “A technique for orthogonal frequency division
multiplexing frequency offset correction,” IEEE Trans. Commun., vol. 42,
pp. 2908-2914, October 1994.



A NEW SIGNALLING PROTOCOL FOR SEAMLESS ROAMING
IN HETEROGENEOUS WIRELESS SYSTEMS

Azita Laily Yusof, Mahamod Ismail, Norbahiah Misran


Dept of Electrical, Electronic & System Engineering,
Universiti Kebangsaan Malaysia,
43600 UKM Bangi, Selangor,
Malaysia.
Tel.: +60389216122, Fax : +60389216146
Email: laily012001@yahoo.com, {mahamod, bahiah}@eng.ukm.

ABSTRACT

The world is undergoing a major telecommunications revolution that will provide
ubiquitous communication access to citizens, wherever they are. Seamless
roaming across different wireless networks, which have different types of
services and quality-of-service guarantees, has become a major research topic
over the past several years. With the integration of different technologies,
the signaling protocol for mobility management must be designed to support
seamless roaming for both intra- and interdomain systems. In this paper, we
design a simplified system architecture, called enhanced system architecture
evolution (eSAE), to support mobility between multiple heterogeneous wireless
systems. eSAE contains fewer network nodes and is reduced to only the enhanced
node B (eNB) and the access gateway (aGW), which comprises the Mobility
Management Entity (MME) and the User Plane Entity (UPE). We design a signaling
protocol for the location registration caused by intersystem roaming in next
generation wireless systems. A performance analysis based on the proposed
architecture shows that this enhancement can reduce the signaling cost and
latency of location registration.

Keywords: Seamless roaming; Handoff latency; Intra and interdomain system;
Heterogeneous wireless system

1 INTRODUCTION

In the next generation wireless systems, the population of mobile users
is expected to grow with the development of various applications that
are available seamlessly and globally. Mobile users can have different
services that suit their needs and can move freely between different
wireless systems. However, different wireless systems will have
different environments, interworking and integration. Supporting intra-
and intersystem mobility, so as to provide continuous wireless services
to mobile users in the next generation heterogeneous wireless networks,
has therefore become a challenge for researchers.
There have been many proposals to integrate different wireless systems.
In [1], the mobility gateway location register (GLR) was developed to
support intersystem roaming. The GLR converts signaling and data formats
from one network to another. This protocol needs to request location
registration after it receives signals from the new system, which causes
a high overhead in signaling cost and processing time. It also causes
the triangular call routing problem, because a call for a roaming mobile
in the same network needs to be routed to the previous network before
being delivered to the new network. The boundary location register (BLR)
[2] was designed to solve this problem. In this protocol, the home
location register (HLR) is not involved in location registration unless
the mobile moves into another system, so the incoming calls of
intersystem roaming mobiles are delivered to them directly. However,
this approach is not scalable, in the sense that one BLR gateway is
needed for each pair of adjacent networks when integrating multiple
networks.
In [3], the authors proposed a distributed gateway foreign agent (GFA),
where each foreign agent (FA) can function dynamically either as an FA
or as a GFA.



There is no fixed regional network boundary, and the mobile decides when
to perform the home location update according to its changing mobility
and packet arrival pattern. However, this scheme increases the
processing capability required of each mobility agent and mobile
terminal. The hierarchical Intersystem Mobility Agent (HIMA) [4] was
proposed to act as an anchor point that forwards data as the user moves
from one network to another. The HIMAs are placed at the gateway routers
or anchor routers for mobile users with high roaming profiles. However,
address administration issues and service level agreements (SLAs) across
different wireless networks and service providers are not analyzed in
that paper.
In [5], the authors introduced an architecture called Architecture for
ubiquitous Mobile Communications (AMC) to integrate multiple
heterogeneous systems. AMC eliminates the need for direct SLAs among
service providers by using a third party, the Network Interoperating
Agent (NIA). That work uses a distributed and hybrid scheme for network
selection. However, because the decision making is implemented in the
mobiles, the system information has to be broadcast to the mobiles
periodically by the handoff management module, resulting in a large
update cost for the system. Moreover, the existing protocol does not
consider how to determine the number of NIAs required for global
integration. A low-complexity, centralized network selection scheme [6]
has been proposed to overcome the shortcomings of NIA. The proposed
scheme eliminates the update cost, since it is only invoked by changes
in end users' service requirements, the beginning of a new application,
or the ending of an existing application.
In this paper, we propose a simplified network architecture, eSAE, to
support a low-latency system. The network is simplified and reduced to
only the base station, called enhanced Node B (eNB), and the access
gateway (aGW), which consists of the Mobility Management Entity (MME)
and the User Plane Entity (UPE). The system uses an all-Internet
Protocol (IP) network where all services are delivered via the packet
switched domain only. For this proposed architecture, we design a
signaling protocol for authentication and authorization.
The rest of this paper is organized as follows. First we describe the
existing system architecture and its signaling protocol, called AMC.
Then we present our proposed simplified architecture, followed by the
authentication and authorization information flow in eSAE. We then
discuss the simulation results and finally conclude.

2 CURRENT AMC PROTOCOL

AMC integrates heterogeneous wireless systems using a third party,
called the Network Interoperating Agent (NIA), which eliminates the need
for SLAs among different network operators. The architecture is shown in
Figure 1, where the NIA functions as a trusted third party for
authentication dialogs between the foreign agent and the home network.
The working principle of this third-party architecture is as follows.
When a mobile user requests services from a foreign network (FN) and the
FN determines that it has no SLA with the user's home network (HN)
provider, it forwards the request to the NIA to authenticate the user.
Then, the NIA talks to the user's HN provider and mediates between the
FN and HN for authentication message exchanges. Once the user is
authenticated, the NIA also creates the security associations/keys
required between different network entities. At the end of the proposed
security procedures, the HN and FN will be mutually authenticated and
will have session keys for secured data transfer. The authentication and
Mobile IP registration processes are integrated as defined in [5].



[Diagram labels: NIA, FN, HN, AAAL, AAAH, HLR, AU, FA, HA, UE]

Figure 1 : The architecture for AMC

3 THE PROPOSED ARCHITECTURE

Figure 2 shows our proposed architecture for the next generation
wireless systems. eSAE will have two types of network elements
supporting the user and control planes.
• The first is the enhanced base station, called enhanced node B (eNB).
  This enhanced base station provides the air interface and performs
  radio resource management for the access system.
• The second is the access gateway (aGW). The aGW provides termination
  of the bearer. It also acts as a mobility anchor point for the user
  plane. It implements key logical functions, including the Mobility
  Management Entity (MME), which manages/stores user equipment (UE)
  context, generates temporary identities, and handles UE authentication
  and authorization and mobility management, and the User Plane Entity
  (UPE), which manages/stores UE context and handles packet
  routing/forwarding and initiation of paging.

Comparing the functional breakdown with the existing architecture:
• Radio network element functions, such as those of the Radio Network
  Controller (RNC), are distributed between the aGW and the eNB.
• Core network element functions, such as those of the SGSN and GGSN or
  PDSN (Packet Data Serving Node) and routers, are distributed mostly
  towards the aGW.



[Diagram labels: aGW (MME/UPE), aGW (MME/UPE), eNB, eNB, eNB, eNB]

Figure 2 : The proposed mobility management architecture for next generation all-IP-based wireless systems

3.1 Authentication and Authorization

The working principle of this architecture is as follows. When a mobile
user requests service from an FN and the FN determines that it has no
SLA with the user's home service subscriber (HSS), it forwards the
request to the aGW to authenticate the user. Then, the aGW talks to the
user's HSS and mediates between the FN and HSS for authentication
message exchanges. Once the user is authenticated, the aGW also creates
the security associations/keys required between different network
entities. Finally, the HSS and FN will be mutually authenticated and
will have session keys for secured data transfer.
The authentication and Mobile IP (MIP) registration processes are
integrated in the proposed architecture using the procedures defined in
[7]. The IEEE 802.1x port access control standard [8] is used for
end-to-end mutual authentication between a UE and its HSS. IEEE 802.1x
uses a special frame format known as Extensible Authentication Protocol
(EAP) over LAN (EAPOL) for transportation of authentication messages
between a UE and an access point (AP). EAP [9] over RADIUS [10] or
Diameter [11] is used for the transportation of authentication messages
between other entities. When the UE roams into an FN, the authentication
and MIP registration are carried out as described below. Here, EAP-SIM
[12] is used to illustrate the authentication process. Note that any
other authentication scheme, e.g. EAP-AKA [13], EAP-SKE [14] or EAP-TLS
[15], can also be used. Figure 3 shows the location registration
procedure.



[Signalling diagram. Entities, left to right: UE, eNB, aGW (MME/UPE),
AAAH, AuC, Inter AS Anchor, HSS. Messages in order, with the
transmission cost of each leg in brackets:]
 1. Network Discovery and Access System Selection
 2. Attach Request [c1 + c2]
 3. Authentication [c1 + c2] [c3 + c4]
 4. Attach Reply [c1 + c2]
 5. Register MME [c3 + c4 + c5 + c6]
 6. Confirm Registration [c3 + c4 + c5 + c6]
 7. Selection of Intersystem Mobility Anchor GW
 8. User Plane Route Configuration [c1 + c2] [c3 + c4 + c5]
 9. Configure IP Bearer QoS [c7]
10. Attach Accept [c1 + c2]

Figure 3 : The authentication and authorization signaling messages
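
The registration procedure described in the remainder of this section derives several keys and message authentication codes from the GSM triplets (equations (1) to (3)). The sketch below illustrates that chain under stated assumptions: the paper only says that h() is a one-way hash and PRF() is a keyed pseudo-random function, so SHA-256 and HMAC-SHA256 are stand-ins chosen here, "n * Kc" is read as the concatenation of the n Kc values, and "│" as byte concatenation. The function name `derive_keys` and all argument names are hypothetical.

```python
import hashlib
import hmac

def h(data: bytes) -> bytes:
    # One-way hash h(); SHA-256 is an assumption, not specified by the paper.
    return hashlib.sha256(data).digest()

def prf(key: bytes, data: bytes) -> bytes:
    # Keyed pseudo-random function PRF(); HMAC-SHA256 is an assumption.
    return hmac.new(key, data, hashlib.sha256).digest()

def derive_keys(kcs, rands, sres_list, nonce_ue, key_lifetime, add_enb, add_hss):
    """Sketch of equations (1) to (3); all inputs are byte strings."""
    # (1) K_UE_AAAH = h(n*Kc | NONCE_UE)
    k_ue_aaah = h(b"".join(kcs) + nonce_ue)
    # (1) MAC_RAND = PRF(K_UE_AAAH, n*RAND | key lifetime)
    mac_rand = prf(k_ue_aaah, b"".join(rands) + key_lifetime)
    # (2) MAC_SRES = PRF(K_UE_AAAH, n*SRES)
    mac_sres = prf(k_ue_aaah, b"".join(sres_list))
    # (3) security association keys for the eNB and the HSS
    k_ue_enb = prf(k_ue_aaah, add_enb)
    k_ue_hss = prf(k_ue_aaah, add_hss)
    return {
        "k_ue_aaah": k_ue_aaah,
        "mac_rand": mac_rand,
        "mac_sres": mac_sres,
        "k_ue_enb": k_ue_enb,
        "k_ue_hss": k_ue_hss,
    }
```

Both sides run the same derivation: the UE recomputes MAC_RAND from the received RANDs and compares it with the received value (for example with `hmac.compare_digest`), and the aGW does the same for MAC_SRES.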

1. The UE discovers a new access system and performs access system and
   network selection.
2. The UE sends an attach request, i.e. a MIP Registration Request
   including the Mobile-AAA Authentication extension (as defined in
   [16]), to the aGW. The UE also includes a SIM Key Request extension
   [19] and a Network Access Identifier (NAI) [18], e.g. UE@realm, in
   its MIP Registration Request. The SIM Key Request extension contains
   a random number (NONCE_UE) picked by the UE, which is used for new
   authentication key generation as discussed later in this section.
3. When the aGW receives the MIP Registration Request and finds the
   Mobile-AAA Authentication extension, it learns that the UE is a
   roaming user. Based on the NAI in the MIP Registration Request, the
   aGW recognizes that the operator does not have a direct SLA with the
   UE's HN and forwards the MIP Registration Request to the home AAA
   server (AAAH). Once the AAAH receives the MIP Registration Request
   containing the SIM Key Request extension, it first verifies the
   Mobile-AAA Authentication extension. If the authentication is
   successful, it contacts the home authentication center (AuC) of the
   UE and obtains n triplets (RAND, SRES, Kc), where RAND denotes a
   random number, SRES denotes the response and Kc is the key used for
   encryption. Then it forwards a copy of these triplets to the aGW.
   When the aGW receives the n triplets, it derives a UE_AAAH key
   (KUE_AAAH) and calculates a message authentication code (MAC) for the
   RANDs (MAC_RAND) using [19]

   KUE_AAAH = h(n * Kc│NONCE_UE) and
   MAC_RAND = PRF(KUE_AAAH, α) (1)



   where α is n*RAND│key lifetime, and h() and PRF() denote a one-way
   hash function and a keyed pseudo-random function, respectively.

   Then, the aGW sends the RANDs, MAC_RAND and the SIM Key Reply
   extension to the UE. The UE derives the corresponding SRES and Kc
   values using its SIM card and the received RANDs. It also calculates
   KUE_AAAH and MAC_RAND using (1). It validates the authenticity of the
   RANDs by comparing the calculated MAC_RAND with the received
   MAC_RAND, thus confirming that the RANDs were generated by its HN. If
   the MAC_RAND is valid, the UE calculates a MAC for its SRES values
   using [19]

   MAC_SRES = PRF(KUE_AAAH, n * SRES) (2)

   The MAC_SRES is used by the aGW to check that the SRES values are
   fresh and authentic. The UE also generates security association keys,
   KUE_eNB for the eNB and KUE_HSS for the HSS, using [19]

   KUE_eNB = PRF(KUE_AAAH, AddeNB) and
   KUE_HSS = PRF(KUE_AAAH, AddHSS) (3)

   where AddeNB and AddHSS are the IP addresses of the eNB and HSS,
   respectively. These keys are used to authenticate subsequent Mobile
   IP registrations until the key lifetime expires.
4. Now, the UE resends the MIP Registration Request message to the eNB,
   containing the SRES extension [19] and the Mobile-AAA Authentication
   extension. When the eNB detects the presence of the Mobile-AAA
   Authentication extension, it forwards the MIP Registration Request
   message to the aGW. The aGW calculates MAC_SRES and compares it with
   the received MAC_SRES. If valid, it forwards the MIP Registration
   Request message to the AAAH. After successful authentication, the
   AAAH forwards the MIP Registration Request containing KUE_HSS
   (calculated using (4)) to the HSS.

   KUE_HSS = PRF(KUE_AAAH, AddeNB, AddHSS) (4)

5. The HSS confirms the registration of the new aGW. Subscription data
   authorising the Default IP Access Bearer are transferred. Information
   for policy and charging control of the Default IP Access Bearer is
   sent to the aGW.
6. An Inter AS Anchor is selected. The IP address configuration is
   determined by user preferences received from the UE, by subscription
   data, or by HPLMN or VPLMN policies.
7. The Inter AS Anchor configures the IP layer with the determined user
   IP address. The user plane is established and the default policy and
   charging rules are applied. The user plane establishment is initiated
   by the aGW.
8. The aGW provides the Evolved RAN with QoS configurations for the
   Default IP Access Bearer, e.g. the upper limits for transmission data
   rates.
9. The aGW accepts the UE's network attachment and allocates a temporary
   identity to the UE. The determined user IP address is also
   transferred. The aGW calculates the UE-eNB security key, KUE_eNB, and
   forwards the MIP Registration Reply (containing KUE_eNB and the Kc
   keys) to the eNB. The eNB extracts KUE_eNB and the Kc keys and sends
   a MIP Registration Reply to the UE. The Kc keys are used for secure
   data transfer between the UE and the eNB, providing confidentiality
   and integrity to the data traffic.

4 PERFORMANCE ANALYSIS of eSAE

In this section, we analyze the signaling cost and latency of location
registration due to intersystem roaming. The costs of location
registration are associated with the traffic of messages between the
entities and the cost of accessing databases. To compare the total
signaling cost of the proposed and existing architectures, we assume the
following parameters:

Table 1 : Simulation parameters

p    transmission cost of messages between the UE and the eNB
α    transmission cost of messages between the eNB and the aGW
β    transmission cost of messages between the aGW and the HSS
c1   transmission cost of messages between the UE and the eNB
c2   transmission cost of messages between the eNB and the aGW
c3   transmission cost of messages between the aGW and the AAAH
c4   transmission cost of messages between the AAAH and the AUC
c5   transmission cost of messages between the AUC and the IASA
c6   transmission cost of messages between the IASA and the HSS
c7   transmission cost of messages between the eNB and the aGW
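
As a rough illustration of how the parameters in Table 1 combine, the sketch below tallies the per-message transmission costs shown in brackets in Figure 3 and adds the database access cost a and the per-access times w and 1/μ that this section introduces. The paper does not print its cost formulas, so the mapping of bracketed terms to messages is read from the figure, and the number of database accesses `n_db` and all function names are hypothetical.

```python
# Transmission-cost legs per message, as read from the brackets in Figure 3.
# Messages 1 and 7 carry no bracketed cost in the figure and are omitted.
FIGURE3_MESSAGES = {
    "Attach Request": ["c1", "c2"],
    "Authentication": ["c1", "c2", "c3", "c4"],
    "Attach Reply": ["c1", "c2"],
    "Register MME": ["c3", "c4", "c5", "c6"],
    "Confirm Registration": ["c3", "c4", "c5", "c6"],
    "User Plane Route Configuration": ["c1", "c2", "c3", "c4", "c5"],
    "Configure IP Bearer QoS": ["c7"],
    "Attach Accept": ["c1", "c2"],
}

def transmission_cost(costs):
    """Sum the bracketed link costs over all messages of one registration."""
    return sum(costs[leg] for legs in FIGURE3_MESSAGES.values() for leg in legs)

def total_signaling_cost(costs, a, n_db):
    """Transmission cost plus database cost (n_db accesses, each of cost a)."""
    return transmission_cost(costs) + a * n_db

def registration_latency(mu, w, n_db):
    """Waiting time w plus processing time 1/mu for each database access."""
    return n_db * (w + 1.0 / mu)
```

With every link cost set to 1, the transmission part is simply the number of bracketed terms in Figure 3, which makes it easy to check the tally by hand before plugging in measured values.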



We assume that a mobile keeps the same mobility pattern when it moves
into another system. Further, we assume that updating, deletion and
retrieval in the database have the same cost, a. We calculate the total
signaling cost of location registration, which is the sum of the
transmission cost and the cost associated with database access. Then we
calculate the latency of location registration, where we assume the
average processing time of each database access is 1/μ and the average
waiting time is w. The latency of location registration is thus the
total time, including the waiting time in the queue and the processing
time.
Figure 4 shows the comparison of total signaling cost as a function of
intersystem roaming probability. As the graph shows, the total signaling
cost increases as the intersystem roaming probability increases. We can
also observe that the total signaling cost of the eSAE protocol is much
lower than that of the NIA protocol. Compared to the NIA protocol, the
eSAE protocol yields a significant improvement because of the simplified
architecture: the NIA protocol has to access more databases than the
eSAE protocol. As with the total signaling cost, the latency of location
registration increases with the intersystem roaming probability. Figure
5 shows the result obtained. The eSAE protocol therefore reduces the
total signaling cost and latency of location registration, so it is more
suitable for an intersystem roaming environment.

[Plot: total signalling cost versus probability of intersystem roaming
(0.1 to 0.5); curves: NIA, eSAE]
Figure 4 : Total cost of location registration

[Plot: latency of location registration versus probability of
intersystem roaming (0.1 to 0.5); curves: NIA, eSAE]
Figure 5: Latency of location registration

5 Conclusion

In this paper, we introduced a new signaling protocol for mobility
management which is based on an enhancement of the SAE architecture. We
proposed the detailed procedure of location registration for the eSAE
protocol. This protocol is specifically developed to decrease the
latency of the NIA protocol. To compare the eSAE and NIA protocols, we
measured the signaling cost of location registration. Moreover, we
evaluated the latency of location registration, which is composed of the
waiting time and processing time at a specific database. The results
show that the eSAE protocol is able to reduce the signaling cost and
latency of location registration for mobiles moving across different
networks.

6 REFERENCES

[1] ETSI TS 129 120 V3.0.0, “Universal mobile
telecommunications systems (UMTS); mobile
application part (MAP) specification for
gateway location register (GLR)”, 3GPP/ETSI
2000, 2000-2003.
[2] I.F. Akyildiz, W. Wang, “A new signaling
protocol for intersystem roaming in next
generation wireless systems”, IEEE Journal on
Selected Area in Communications, vol.19, no.
10, Oct. 2001, pp. 2040-2052.
[3] I.F. Akyildiz, W. Wang, “A novel distributed
dynamic location management scheme for
minimizing signaling costs in mobile IP”, IEEE
Transactions on Mobile Computing, vol. 1, No 3,
July 2002, pp. 163-175.
[4] N. Shenoy, “A framework for seamless roaming
across heterogeneous next generation wireless



networks”, Journal on ACM Wireless Networks.
[5] I.F. Akyildiz, S. Mohanty, J. Xie, “A ubiquitous
mobile communication architecture for next-
generation heterogeneous wireless systems”,
IEEE Communications Magazine, vol. 43, no. 6,
pp. 29-36, June 2005.
[6] H. Jia, Z. Zhang, P. Cheng, H. Chen, A. Li,
“Study on network selection for next generation
heterogeneous wireless networks”, in Proc.
IEEE International Symposium on Personal,
Indoor and Mobile Radio Communications,
2006.
[7] Glass, S., Hiller, T., Jacobs, S., and Perkins, C.,
“Mobile IP authentication, authorization, and
accounting requirements,” RFC 2977, IETF,
2000.
[8] “IEEE Standard for Local and metropolitan area
networks - Port-Based Network Access
Control.” IEEE Std 802.1X-2001.
[9] Blunk, L. and Vollbrecht, J., “PPP Extensible
Authentication Protocol (EAP),” RFC 2284,
IETF, 1998.
[10] Rigney, C. et al., “Remote Authentication
Dial In User Service (RADIUS),” RFC 2865,
IETF, 2000.
[11] Calhoun, P. R., “Diameter Mobile IPv4
application,” Internet Draft,
draft-ietf-aaa-diameter-mobileip-16.txt, work in progress,
2004.
[12] Haverinen, H. and Salowey, J., “EAP SIM
authentication,” Internet Draft,
draft-haverinen-pppext-eap-sim-16.txt, work in progress, 2004.
[13] Arkko, J. and Haverinen, H., “EAP AKA
Authentication,” Internet Draft,
draft-arkko-pppext-eap-aka-09.txt, work in progress, 2003.
[14] Salgarelli, L., “EAP SKE authentication and key
exchange protocol,” Internet Draft, draft-
salgarelli-pppext-eap-ske-03.txt, work in
progress, May 2003.
[15]Aboba, B. and Simon, D., “PPP EAP TLS
Authentication Protocol,” RFC 2716, IETF,
1999
[16]Aboba, B. and Simon, D., “PPP EAP TLS
Authentication Protocol,” RFC 2716, IETF,
1999
[17] Haverinen, H., Asokan, N., and Maattanen, T.,
“Authentication and key generation for Mobile
IP using GSM authentication and roaming,” in
Proc. IEEE ICC (ICC'01), pp. 2453-2457.
[18] Calhoun, P. and Perkins, C., “Mobile IP network
access identifier extension for IPv4,” RFC 2290,
IETF, 2000.
[19] Haverinen, H., Asokan, N., and Maattanen, T.,
“Authentication and key generation for Mobile
IP using GSM authentication and roaming,” in
Proc. IEEE ICC (ICC'01), pp. 2453-2457.
[20] “3GPP System to WLAN Interworking:
Functional and Architectural Definition.” Tech.
rep. 3GPP TR 23.934 v0.3.0. 3GPP.



Artificial Neural Network application in Parameter Optimization of Rectangular Microstrip
Patch Antenna
R. Malmathanraj (1), S. Thamarai Selvi (2)
(1) Lecturer/ECE, National Institute of Technology, Tiruchirapalli, malmathan@gmail.com
(2) Professor and Head/Information Technology, Madras Institute of Technology, Anna University, Chennai,
stselvi@annauniv.edu

Abstract - Printed microstrip antennas and arrays are known to have
limitations in terms of bandwidth and efficiency, all imposed by the
very presence of the dielectric substrate. The paper deals with the
design of probe-fed and edge-fed rectangular microstrip patch antennas
with the basic parameters W, h, L, εr and fo, to achieve better
bandwidth and directivity with an efficient radiation pattern and gain.
The analytical results for various possible dimensions and different
dielectric values were calculated for achieving bandwidth and
directivity without any structural complexities. The analytical results
were tested by simulation with the basic design software PCAAD and
MSTRIP40. To obtain optimum values for the design parameters of the
microstrip antenna, Support Vector Machines (SVM), a Generalised
Regularisation Neural Network (GRNN) and a Back Propagation Network
(BPN) were trained to attain optimized values that yield wide bandwidth
and better directivity with high gain. The application of artificial
neural networks provides an optimum design methodology for microstrip
antenna design, which is revealed when comparing the results with the
analytical methods and the results of the simulation software.

1. Introduction
Microstrip patch antennas have been attractive due to their conformal
properties. Mathematical modeling of the basic microstrip radiator was
initially carried out by applying transmission-line analogies to a
simple rectangular patch fed at the center of a radiating wall. A
microstrip patch antenna is a radiating patch on one side of a
dielectric substrate, which has a ground plane on the underside. The EM
waves fringe off the top patch into the substrate, reflect off the
ground plane and radiate out into the air. Radiation occurs mostly due
to the fringing field between the patch and ground. The radiation
efficiency of the patch antenna depends largely on the substrate
permittivity (εr) of the dielectric [2]. The basic geometry of the
microstrip patch is shown in Figure 1.

Figure 1. Microstrip Patch Antenna Geometry

Ideally, a thick dielectric is preferred for broadband purposes. Small
values of the patch width W result in low antenna efficiencies



while large W values lead to higher order modes. Substrate thickness
should be chosen as large as possible to maximize bandwidth and
efficiency, but not so large as to risk surface-wave excitation. The
patch length is determined by the condition for resonance, which occurs
when the input impedance is purely real. The bandwidth of the patch is
defined as the frequency range over which it is matched to the feed line
within specified limits, in other words, the frequency range over which
the antenna will perform satisfactorily. A wider bandwidth means the
channels have a larger usable frequency range, which results in
increased transmission. The bandwidth of an antenna is usually defined
by the acceptable standing wave ratio (SWR) value over the concerned
frequency range [3,4]. The dimensions of the top patch were calculated
to obtain the required bandwidth and impedance matching.
The advantages of microstrip antennas are that they are low-cost,
conformable, lightweight and low profile, and that both linear and
circular polarization are easily achieved. Their disadvantages include a
narrow bandwidth, a low gain (~6 dB) and the difficulty of achieving
polarization purity. Several methods have been reported in the
literature to improve impedance bandwidth, including wide-band impedance
matching, stacked patches and thicker substrates [9].

2. Design Methodology

2.1 Basic Rectangular Microstrip
The major design task of this paper is the optimization of the
dimensions of the probe-fed rectangular microstrip patch antenna. The
simplest approach is adopted to demonstrate how effectively an
artificial neural network can be trained to optimize the various
parameters involved in the design of a microstrip patch antenna. This
work concentrates only on the basic geometry of the microstrip, ignoring
the various complex structures adopted for the enhancement of bandwidth,
directivity and gain. The size of the probe is selected as 0.2 mm and
various feeding positions were considered for the calculation [5]. The
dimensions of the patch antenna, along with the substrate permittivity
and the probe position, are varied for different operating frequencies,
and the numerical results were obtained using the basic design formulas
of the microstrip patch listed below.

The width of the microstrip patch antenna is given by
$W = \dfrac{c}{2 f_0 \sqrt{\dfrac{\varepsilon_r + 1}{2}}}$  -(1)

Effective dielectric constant is given by
$\varepsilon_{reff} = \dfrac{\varepsilon_r + 1}{2} + \dfrac{\varepsilon_r - 1}{2} \left[ 1 + 12 \dfrac{h}{W} \right]^{-1/2}$  -(2)

Effective length ($L_{eff}$) is given by
$L_{eff} = \dfrac{c}{2 f_0 \sqrt{\varepsilon_{reff}}}$  -(3)

Length extension ($\Delta L$) is given by
$\Delta L = 0.412\, h\, \dfrac{(\varepsilon_{reff} + 0.3)\left(\dfrac{W}{h} + 0.264\right)}{(\varepsilon_{reff} - 0.258)\left(\dfrac{W}{h} + 0.8\right)}$  -(4)

Actual length of patch ($L$) is given by
$L = L_{eff} - 2\Delta L$  -(5)

Ground plane dimensions ($L_g$ and $W_g$) are given by
$L_g = 6h + L$  -(6)

$W_g = 6h + W$ -(7)

Height of the substrate is given by

$h \le \dfrac{0.3\,c}{2\pi f_0 \sqrt{\varepsilon_r}}$ -(8)

The bandwidth of a rectangular patch is given by

$BW = 3.77\left[\dfrac{\varepsilon_r - 1}{\varepsilon_r^{2}}\right]\left(\dfrac{W}{L}\right)\left(\dfrac{h}{\lambda_0}\right)$ -(9)

Where f0 is the resonant frequency, εr is the relative substrate permittivity, c is the speed of light (3 × 10^8 m/s) and λ0 is the free-space wavelength.

2.2 Artificial Neural Network Design for Rectangular Microstrip Patch Antenna
A microstrip rectangular patch antenna (Fig. 1) can be viewed as a matrix equation with variables X and four unknowns b, such that

AX=b

The unknowns are the resonant frequency (RF), bandwidth (BW), gain (G), and polarization (PL). The variables X are the patch length (L1), patch width (L2), substrate height (H1), substrate relative permittivity (εr) and the feeding positions (Xf) and (Yf).

[A][L1, L2, εr, H1, X, Y, Xf, Yf]^T = [G, BW, S11, PL]^T
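The closed-form relations (1)-(9) of Section 2.1 supply the analytical values to which such a mapping is fitted. As a minimal sketch (not the authors' implementation), they can be evaluated in Python as follows. The function names and the example values (2.4 GHz, εr = 4.4, h = 1.6 mm) are illustrative assumptions, and equation (8) is read here as an upper bound on h.

```python
import math

C = 3.0e8  # speed of light in m/s, as stated in the text


def patch_dimensions(f0, eps_r, h):
    """Evaluate equations (1)-(7) and (9) for a rectangular patch.

    f0 is the resonant frequency in Hz, eps_r the relative substrate
    permittivity and h the substrate height in metres.
    """
    w = C / (2.0 * f0 * math.sqrt((eps_r + 1.0) / 2.0))               # (1)
    eps_eff = ((eps_r + 1.0) / 2.0
               + (eps_r - 1.0) / 2.0 * (1.0 + 12.0 * h / w) ** -0.5)   # (2)
    l_eff = C / (2.0 * f0 * math.sqrt(eps_eff))                       # (3)
    d_l = (0.412 * h * (eps_eff + 0.3) * (w / h + 0.264)
           / ((eps_eff - 0.258) * (w / h + 0.8)))                     # (4)
    length = l_eff - 2.0 * d_l                                        # (5)
    lam0 = C / f0                                                     # free-space wavelength
    bw = 3.77 * (eps_r - 1.0) / eps_r ** 2 * (w / length) * (h / lam0)  # (9)
    return {"W": w, "eps_eff": eps_eff, "L_eff": l_eff, "dL": d_l,
            "L": length, "Lg": 6.0 * h + length, "Wg": 6.0 * h + w,    # (6), (7)
            "BW": bw}


def max_substrate_height(f0, eps_r):
    """Equation (8), interpreted as an upper bound on the substrate height."""
    return 0.3 * C / (2.0 * math.pi * f0 * math.sqrt(eps_r))
```

For the assumed example, `patch_dimensions(2.4e9, 4.4, 1.6e-3)` gives a patch roughly 38 mm wide and 29 mm long.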


G=f(L1, L2, εr, H1, X, Y, Xf, Yf);

BW=f(L1, L2, εr, H1, X, Y, Xf, Yf);

S11=f(L1, L2, εr, H1, X, Y, Xf, Yf);

PL=f(L1, L2, εr, H1, X, Y, Xf, Yf);

In the neural network design the inputs are L1, L2, εr, H1, X, Y, Xf, Yf and the outputs are G, BW, S11, PL.
An artificial neural network is an information processing paradigm that is inspired by the way biological nervous systems, such as the brain, process information. The key element of this paradigm is the novel structure of the information processing system. It is composed of a large number of highly interconnected processing elements (neurons), working in unison to solve specific problems. An artificial neural network, like a human, learns by example.[2,6]
A neural network with feedback is an adequate representation of the information processing structure of rectangular microstrip antennas, where the input neuron units are (L, W, εr, h, P, fo) and the output units are (BW, D, G, RP). The learning paradigm on the microstrip is supervised learning, where the mapping function between the inputs and outputs is the matrix A. The inputs are weighted, and the effect that each input has on decision making depends on the weight of the particular input. The weight of an input is a number that, when multiplied with the input, gives the weighted input. Their calculation is based on the method of moments. These weighted inputs then generate the unknowns. Those unknowns are then compared to stored information that gives the desired bandwidth, directivity, radiation pattern and gain. The gain is expected to be greater than 3 dB and the polarization should be linear or circular [1,2].
A good paradigm of supervised learning that is of interest to the microstrip antenna designer is error-correcting learning, that is, minimization of the error between the desired and computed values. In this learning paradigm, the set of weights that minimizes the error between the teaching input and the

weighted inputs is obtained. Neural networks are general function approximators. One important characteristic is that they can learn any Input-Output (IO) mapping by using the information contained in a given input-output data set without needing a structure definition of the IO-mapping. The type of network, parameter settings, number of hidden neurons, and the connectivity of the neural network define the structure of the approximated IO-mapping. The only additional information a neural network needs, besides the input-output data, is the definition of the input and output parameters, the relevant parameters which span the IO-mapping. The ideal use of neural networks is antenna model parameter optimization. Without knowing the IO-mapping structure of the model, the neural network can learn to mimic the IO-mapping [7,8].
In the IO-mapping problem a feed-forward neural network is used because the antenna model is a static mapping (except at the very moment the design failure occurs). The 'tangent sigmoidal' ('tansig') activation function is used in the hidden layer of the network for two reasons. First, in the antenna design, the IO-mapping is a smooth mapping with few or no discontinuities. The 'tansig' is also a smooth function with the capability of approximating a discontinuity (by squashing the shape with respect to the input axis). The second reason for choosing the 'tansig' function is that it is a very 'general' function. After a failure, the antenna design IO-mapping has changed into an unknown form. Therefore an activation function which can be used to mimic almost all shapes is preferable, since in that case all (unknown) IO-mappings can be approximated by the neural network.
Neural networks are based on the human brain and its enormous capability of learning and adapting. Over decades, people have been trying to model the human brain mathematically. The structure of the human brain and the learning process is known, but the main difference between both networks is the efficiency. The human brain is capable of recognizing a familiar face in approximately 100-200 ms, whereas conventional computers can take hours or days to fulfill less complex tasks. The biological neural network is still much faster than the artificial neural network (ANN or NN), but the capabilities of the neural networks are promising. In this section, the general structure of a neural network will be explained. First the concept of neurons is treated, with special attention to the several activation functions. The possible combinations of several neurons in layers are handled in the discussion about the architecture of networks. Some example networks are used to illustrate the effect of the parameters of the networks. The primary element in a neural network is the neuron, an information-processing unit. A mathematical model of an artificial neuron is given in Fig. (2). The structure is similar to that of the human neuron. The elements xi on the left side are the input signals to the neuron k. These inputs are multiplied by the corresponding weights wki and summed together with the bias bk. The result of the summation, vk, is passed through an activation function φ(vk), producing the output yk.
The mathematics of the neuron given in Fig. (2) can start with the weighted inputs

$v_k = \sum_{i=1}^{p} w_{ki}\, x_i$ -(10)

The output can be written as

$y_k = \varphi_k(v_k + b_k)$ -(11)

A weighted bias can be included by adding an extra term in the first equation.

$v_k = \sum_{i=0}^{p} w_{ki}\, x_i$ -(12)

Where the bias is changed to a fixed input of 1 with a weight of wk0. The shape of the output then depends only on the activation function applied to vk.

$y_k = \varphi_k(v_k)$ -(13)

The type of activation function has a large influence on the output of the neuron, as can be seen from equation (13). In a signal flow diagram, a neuron can be represented as shown in Fig. (3).
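Equations (10)-(13) can be checked numerically with a few lines of Python. This is a minimal sketch rather than the software used in the paper: the function names are hypothetical, and 'tansig' is taken to be the hyperbolic tangent.

```python
import math


def tansig(v):
    """Tangent sigmoid activation, taken here as tanh(v)."""
    return math.tanh(v)


def neuron_output(inputs, weights, bias, activation=tansig):
    """Equations (10)-(11): weighted sum of the inputs, plus bias, then activation."""
    v_k = sum(w * x for w, x in zip(weights, inputs))   # (10)
    return activation(v_k + bias)                        # (11)


def neuron_output_folded(inputs, weights_with_bias, activation=tansig):
    """Equations (12)-(13): the bias becomes weight w_k0 on a fixed input x_0 = 1."""
    extended = [1.0] + list(inputs)
    v_k = sum(w * x for w, x in zip(weights_with_bias, extended))  # (12)
    return activation(v_k)                                          # (13)
```

Both forms give the same output when the first entry of `weights_with_bias` equals the bias, which is the point of rewriting (10)-(11) as (12)-(13).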

Figure 2. Mathematical definition of a neural network (inputs x0 = 1, x1, ..., xn with weights wk,0, ..., wk,n; summed value vk; activation φ(·); output yk)

Fig 3. Signal flow diagram of a neuron (inputs X1, X2, ... with weights wk,1, wk,2, ...; bias bk; summing node Σ)

A neural network is a directed graph consisting of nodes with interconnecting synaptic and activation links, and is characterized by four properties:
1. Each neuron is represented by a set of linear synaptic links, an externally applied bias, and a possibly nonlinear activation link. The bias is represented by a synaptic link connected to an input fixed at +1.
2. The synaptic links of a neuron weight their respective input signals.
3. The weighted sum of the input signals defines the induced local field of the neuron in question.
4. The activation link squashes the induced local field of the neuron to produce an output.

3. Architecture Selection
A variety of neural network architectures are available to process the data from the input data set files. A multilayer Backpropagation Network architecture, the Generalized Regularization neural network (GRNN) and Support vector machines (SVM) were used for training because of their ability to generalize well when applied to a wide variety of applications and their ability to give better regression.

3.1 Learning
As the neural network software reads the training set, the network learns the data patterns in the training set. Learning subprograms differ depending on the architecture selected. As training progressed, statistical graphs furnished by the neural net software provided a means to monitor training progress.
Numerical historical data and repetitive examples in which the solution is already known are required to train a neural network. While the relationship between variables may not be known, network results can be improved by the addition of more variables. Data may need a different representation; for example, if data has a very large value range, logarithms or other data transformations or conversions may be necessary.

3.2 Generalised Regularisation Neural Network (GRNN)
The GRNN is based on Nadaraya-Watson kernel regression. GRNNs feature fast training times, can model nonlinear functions and have been shown to perform well in noisy environments given enough data. The primary advantage of the GRNN is the speed at which the network can be trained. Training a GRNN is performed in one pass of the training data through the network; the

training data values are copied to become the weight vectors between layers. The architecture of the GRNN is shown in Figure 4; it has four layers (input, pattern, summation and output), with weighted connections Wij between the input and pattern layer and

Figure 4. Architecture of GRNN Network (layers shown: input units, pattern units, summation layer, output units; S = summation unit, D = division unit)
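Because the GRNN rests on Nadaraya-Watson kernel regression, its forward pass can be sketched in a few lines. The following is an illustrative Python sketch, not the network software used in the paper: it assumes a Gaussian kernel whose width `sigma` is a free smoothing parameter (the text does not specify one), and the function name `grnn_predict` is hypothetical.

```python
import math


def grnn_predict(x_train, y_train, x_query, sigma=0.5):
    """Nadaraya-Watson estimate at x_query from stored training pairs.

    x_train: list of input vectors (one pattern unit per training sample)
    y_train: list of output vectors, same length as x_train
    x_query: a single input vector
    sigma:   kernel width (assumed smoothing parameter)
    """
    # Pattern layer: one Gaussian activation per stored training sample.
    activations = []
    for x in x_train:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_query))
        activations.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    # Summation layer: weighted sums of the targets and the plain sum (S units).
    total = sum(activations)
    n_out = len(y_train[0])
    sums = [sum(a * y[j] for a, y in zip(activations, y_train))
            for j in range(n_out)]
    # Output layer: division units (D).
    return [s / total for s in sums]
```

"Training" amounts to storing `x_train` and `y_train`, which is why a single pass over the data is enough. A small `sigma` reproduces the nearest stored sample, while a large one tends towards the mean of the targets.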

3.3 Support Vector Machines
Traditionally neural networks have been used for classification, which is based on Empirical Risk Minimization (ERM). The SVM was developed by Vapnik and has become a popular tool for data mining. Its formulation embodies Structural Risk Minimization (SRM), which is superior to empirical risk minimization. SRM minimizes an upper bound on the expected risk, as opposed to ERM, which minimizes the error on the training data. So, the SVM generalizes much better. There are many linear classifiers that can separate data, but only the SVM maximizes the margin, i.e. the distance between it and the nearest data point in each class.
We have N training data {(x1,y1), (x2,y2), ..., (xN,yN)} where xi ∈ R^d and yi ∈ {+1,-1}. It needs to be classified using a linear hyper plane classifier

f(x) = sgn (w · x - b) -(14)

This hyper plane will have maximum distance between each class. This hyper plane

H : y = w · x - b = 0

and two hyper planes parallel to it

H1 : y = w · x - b = +1 -(15)

H2 : y = w · x - b = -1 -(16)

are chosen with no data points between H1 and H2, and with the distance between H1 and H2 maximized. Some training points will lie on the hyper planes H1 and H2; they are called support vectors because they define the separating plane, and the other training points can be removed or moved provided they don't cross the planes H1 and H2. The distance between hyper planes H1 and H2 is 2/||w||. To maximize the distance between the two data sets, or between H1 and H2, we need to minimize ||w|| with the condition that there are no data points between H1 and H2

w · xi - b ≥ +1 for yi = +1 -(17)
w · xi - b ≤ -1 for yi = -1 -(18)

Combining the above two equations,

yi ( w · xi - b ) ≥ 1 -(19)

So the problem of maximizing the distance between hyper planes H1 and H2 is formulated as min ½ w^T w subject to

yi ( w · xi - b ) ≥ 1 -(20)

This is a convex quadratic problem in w, b in a convex set. The solution is found using the Lagrangian method by introducing Lagrangian multipliers. It is easier to solve using the Lagrangian dual equation given by

LD = Σi αi - ½ Σi Σj αi αj yi yj xi · xj -(21)

The significance of the above equation is that the training input vectors appear only as dot products. When the data is not linearly separable, it is required to transform the data into a higher dimensional space. This causes complex calculations in neural networks, but in the SVM, as the data appear only as dot products, all calculations can be carried out explicitly in the low-dimensional input space if a kernel function exists for

LD = Σi αi - ½ Σi Σj αi αj yi yj Φ(xi) · Φ(xj) -(22)

as Φ(xi) · Φ(xj) = K(xi, xj), where K is the kernel function. This is equivalent because the dot product in the high-dimensional space is equal to the kernel function in the input space. The common kernel function used is the Gaussian kernel,

K (xi, xj) = exp( -||xi - xj||^2 / σ^2 ) -(23)
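The kernel substitution in equations (22) and (23) can be made concrete with a short Python sketch. It is illustrative only: the function names are hypothetical, the multipliers `alphas` are taken as given rather than obtained from a quadratic-programming solver, and the dual is written with the usual factor ½ on the double sum.

```python
import math


def gaussian_kernel(x_i, x_j, sigma):
    """Equation (23): K(xi, xj) = exp(-||xi - xj||^2 / sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-sq_dist / sigma ** 2)


def dual_objective(alphas, labels, points, sigma):
    """Kernelised dual of equation (22) for given multipliers alphas."""
    n = len(points)
    quad = 0.0
    for i in range(n):
        for j in range(n):
            quad += (alphas[i] * alphas[j] * labels[i] * labels[j]
                     * gaussian_kernel(points[i], points[j], sigma))
    return sum(alphas) - 0.5 * quad


def decision(x, alphas, labels, points, b, sigma):
    """Kernel form of f(x) = sgn(w . x - b): sign of sum_i alpha_i y_i K(xi, x) - b."""
    s = sum(a * y * gaussian_kernel(p, x, sigma)
            for a, y, p in zip(alphas, labels, points))
    return 1 if s - b >= 0 else -1
```

Only kernel evaluations between pairs of input vectors appear, so the mapping Φ never has to be computed explicitly.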

Mercer's condition determines whether a function g(x) can be used as a kernel or not: ∫g(x)^2 dx should be finite.

4.Design Implementation and Result.
The dimensions of the rectangular patch were selected on a trial and error basis, considering the constraints of the design in selecting the values. The different geometrical parameters were designed analytically and the bandwidth given in equation (9) was used to calculate the value for the selected dimensions. The parameters were used to construct the structure using the simulation software. The bandwidth and directivity along with the gain and radiation pattern of the design were obtained. The parameters of the patch from equations (1-9), with the feed position for a resonant frequency, are fed as input to the networks. The impedance bandwidth and directivity were taken as the output of the network. The analytical data values are given as input to train the network to obtain an optimized geometry for the probe fed microstrip antenna. A wide range of parameters was used to provide the optimum result, and the training steps were increased to obtain the accuracy.
The validity of the network was tested by comparing the analytical results obtained from the basic formulas for a given set of input values. The same parameters were used to construct a probe fed rectangular patch using simulation software, shown in figure (7), and the output radiation pattern was obtained as shown in figure (8). The current pattern of the designed antenna is also plotted, which shows the even distribution due to proper impedance matching of the probe feed. The same values were trained using the three networks, Backpropagation Network architecture (BPN), Generalized Regularization neural network (GRNN) and Support vector machines (SVM), whose results were in good agreement with the analytical as well as the designed structure output shown in Table (1). The input-output relations were also checked against the experimental results. The Backpropagation Network architecture achieves the antenna parameter optimization with the maximum time for convergence. The GRNN and the SVM neural network achieve optimization with a quicker learning time, as shown in Fig 11 and 12. In this analysis of antenna parameter optimization, the GRNN neural network produced the accurate result with comparatively minimum time for convergence. The computational time was very small, in terms of seconds, with high accuracy, as shown in Fig 13. The optimized parameters obtained using the trained neural networks achieved a high impedance bandwidth of 7.8%, a directivity of 7.73 dB without side lobes, a high gain of 8.67 dBi and a radiation efficiency of 100%. The results were comparatively better than the results from analytical analysis and simulation analysis for the microstrip patch antenna using PCAAD and MSTRIP40.

To train the SVM parameters

[alpha,b] = trainlssvm({X, Y, type, gam, sig2, kernel, preprocess})

Outputs
alpha       matrix with support values of SVM
b           vector with bias term(s) of SVM

Inputs

Fig 5. Antenna output model using MATLAB Software.

Fig 6. Antenna output model using MATLAB Software.

Model       Trained object oriented representation of the SVM model
Model       Object oriented representation of the SVM model
X           matrix with the inputs of the training data
Y           vector with the outputs of the training data
Type        function estimation
Gam         Regularization parameter
sig2        Kernel parameter (bandwidth in the case of the 'RBF_kernel')
kernel      Kernel type (by default 'RBF_kernel')
preprocess  'preprocess'(*) or 'original'

Simulating the SVM

Yt=simlssvm({X,Y,type,gam,sig2,'RBF_kernel','preprocess'},Xt);

Outputs
Yt          matrix with predicted output of test data

Inputs
X           matrix with the inputs of the training data
Y           vector with the outputs of the training data
Type        function estimation
Gam         Regularization parameter
sig2        Kernel parameter (bandwidth in the case of the 'RBF_kernel')
kernel      Kernel type (default 'RBF_kernel')
Xt          inputs of the test data
preprocess  preprocess

Plotting the graph in SVM

plotlssvm({X,Y,type,gam,sig2,'RBF_kernel','preprocess'},{alpha,b});

Inputs
X           matrix with the inputs of the training data
Y           vector with the outputs of the training data
Type        function estimation
Gam         Regularization parameter
sig2        Kernel parameter (bandwidth in the case of the 'RBF_kernel')
kernel      Kernel type (by default 'RBF_kernel')
preprocess  preprocess
alpha       support values obtained from training
b           Bias term obtained from training

5.Conclusion
The radiation pattern of the designed antenna presented in this paper, figure (7), clearly shows that it is a wideband antenna with high directivity and gain, together with high radiation efficiency. The major attraction of this antenna is its size reduction along with its bandwidth and directivity, which make it most suitable for satellite communication and commercial applications. The size reduction and operating frequency make it suitable for mobile communication. The parameter optimization using the networks is the major attraction of this paper, which highlights the simplicity, accuracy and reduction in computational time for the designers of interest.

Figure 7. Structure of Probe Fed Microstrip Rectangular Patch Antenna

Figure 8. Current Distribution Pattern

Figure 9. Radiation Pattern of the Optimized Patch Antenna

Figure 10. Plot Showing the Learning Trial of Back Propagation Network

Fig 11. Plot to show the time for convergence for SVM neural network.

Fig 12. Plot to show the time for convergence for GRNN neural network.

Fig 13. Plot to show the weight surface of SVM

Fig 14. Plot to show the radiation pattern using PCAAD

Figure 15. Output of the Optimized output of the rectangular patch antenna using MSTRIP

References

1.Dipak K.Neog, Shyam S.Pattnaik, C.Panda, Swapna Devi, Bonomali Khuntia, and Malaya Dutta, “Design of a Wideband Microstrip
Antenna and the use of Artificial Neural Networks in Parameter Calculation”, IEEE Antennas and Propagation Magazine, Vol.47,
No.3, June 2005,pp.60-65.

2. Inder J.Bahl, Prakash Bhartia and Stanislaw S. Stuchly, “Design of Microstrip Antennas Covered with a Dielectric Layer”, IEEE
Transactions on Antennas and Propagation, Vol. AP-30, No. 2, March 1982, pp. 314-318.

3. Kin-Lu Wong and Yi-Fang Lin, “Small broadband rectangular microstrip antenna with chip-resistor loading”, Electronics
Letters, 11 September 1997, Vol. 33, No. 19, pp. 1593-1594.

4. S.Lebbar, Z.Guennoun, M.Drissi, and F.Riouch, “A Compact and Broadband Antenna Design Using a Geometrical- Methodology-
Based Artificial Neural Network”, IEEE Antennas and Propagation Magazine, Vol.48, No.2, April 2006,pp.146-154.

5. C. L. Mak, K. M. Luk, Senior Member, IEEE, K. F. Lee, Fellow, IEEE, and Y. L. Chow, “Experimental Study of a Microstrip Patch
Antenna with an L-Shaped Probe,” IEEE Transactions on Antennas and Propagation, VOL. 48, NO. 5, MAY 2000,pp.777-783.

6. R.K.Mishra and Patnaik, “Designing Rectangular Patch Antenna Using the Neurospectral Method”, IEEE Transactions on Antennas
and Propagation,AP-51,8 August 2003,pp.1914-1921.

7. S.S.Pattnaik, D.C.Panda and S.Devi, “Input Impedance of Rectangular Microstrip Patch Antenna Using Artificial Neural Networks”,
Microwave and Optical Technology Letters,32,5,5 March 2002,pp.381-383.

8. S.S.Pattnaik, D.C.Panda and S.Devi, “Radiation Resistance of Coax-Fed Rectangular Microstrip Patch Antenna Using Artificial
Neural Networks”, Microwave and Optical Technology Letters, 34,1,5 July 2002,pp.51-53.

9. D.M.Pozar, “Microstrip Patch Antennas,” in L.C.Godara (ed), Handbook of Antennas in Wireless Communications, New York, CRC
Press, 2001,Chapter 6.

10.Ye Bin Hu Gu Yu , “The analyze and improve TCP performance using a DSR route protocol based on signal strength”, IEEE
Wireless Communications, Networking and Mobile Computing, pp. 846 – 849, 2005.

11. Dongkyun Kim , Hanseok Bae, Jeomki Song, “Analysis of the interaction between TCP variants and routing protocols in
MANETs”, IEEE Parallel Processing, ICPP 2005 Workshops, pp 380-386, 2005.

12. Prabakaran, M. Mahasenan, A. , “Analysis and enhancement of TCP performance over an IEEE 802.11 multi-hop wireless network:
single session case”, IEEE International Conference on Personal Wireless Communications, pp-29- 33, 2005

13. Caihong Kai Yuzhong Chen Nenghai Yu, “An Improvement Scheme Applied to TCP Protocol in Mobile Ad Hoc Networks”, IEEE
International Conference on Mobile Technology, Applications and Systems, pp.1-6, 2005

PERFORMANCE OF TRANSMISSION SPACED SELECTION
DIVERSITY IN DS-CDMA SYSTEMS

Mona Shokair*, Maher Aziz**, Mohamed Nasr**


*Faculty of Electronic Engineering, Menoufia University, Egypt.
** Faculty of Engineering, Tanta, Tanta University, Egypt.
maher_luka@yahoo.com

ABSTRACT
The performance of transmission spaced selection diversity (SD) placed at the base
station (BS) in a DS-CDMA system remains insufficiently clear. In this paper, this
performance is evaluated by considering the effect of the spacing between antennas
and the maximum Doppler frequency (fd) on bit error rate (BER) performance under
optimum conditions, effects which have not been clarified until now. Moreover, an
analysis of this system under Rayleigh fading is presented.

Keywords: DS-CDMA, transmission spaced selection diversity, Rayleigh fading.

1 INTRODUCTION

The demand for many radio services is increasing. New techniques are required to improve spectrum utilization to satisfy that demand without increasing the radio frequency spectrum that is used. One technique in a digital cellular system is the use of spread spectrum Code Division Multiple Access (CDMA) technology [1]. Another technique is the diversity system. Cooperation between a CDMA system and a diversity system has also been studied in [6]. The main purpose of a diversity system is to mitigate the multipath fading, which has a negative effect on the quality of transmission in mobile radio communication.
Diversity systems can be classified in several ways. One classification is into transmission and reception diversity. Other classifications are frequency diversity, polarization diversity, spaced diversity, time diversity and angle diversity. All these classifications are presented in detail in [2]. To combine diversity branches, many combining techniques are explained in detail in [1] and [2], including space Selection Diversity (SD). In SD, the one of the M antenna branches that provides the highest Signal-to-Noise Ratio (SNR) is selected for data recovery.
The success of diversity techniques depends on the degree to which the signals on the different branches are uncorrelated. This requires that the spacing between the antenna elements in the receiving or transmitting array is greater than a certain minimum distance.
In this paper, the performance of transmission selection spaced diversity under the effects of the spacing distance between antennas and the maximum Doppler frequency is studied under optimum conditions. These effects have not been clarified until now.
The organization of this paper is as follows: Sect. 2 introduces the analysis of the system under Rayleigh fading. Computer simulation conditions are given in Sect. 3. Results are presented in Sect. 4. Conclusions are drawn in Sect. 5.

2 ANALYSIS OF THE SYSTEM OVER RAYLEIGH FADING

To get expressions for both SNR and BER values, we consider a two-branch diversity system at the BS with correlated fading channels. The received signal from each branch of the system can be modeled as [3]

$r_k(t) = R_k\, e^{j\alpha_k}\, e^{j\psi_m(t)} + n_k(t), \quad k = 1, 2$ (1)

Where ψm(t) is the transmitted signal, Rk is a Rayleigh-distributed amplitude factor, αk is a uniformly distributed phase factor, and nk(t) is zero-mean Additive White Gaussian Noise (AWGN). The received signal can be described by:

$r_k(t) = [X_k + jY_k]\, e^{j\psi_m(t)} + n_k(t), \quad k = 1, 2$ (2)

Where X1, X2, Y1, and Y2 are all Gaussian random variables with zero mean and variance σ². The expectation can be expressed as,

$E[X_i Y_k] = 0, \quad i = 1, 2;\ k = 1, 2$ (3)

$E[X_1 X_2] = E[Y_1 Y_2] = \rho\sigma^2$ (4)

Where ρ is the correlation coefficient between the fading channels.
Another assumption is that the channel statistics are independent of the AWGN variables, which are also uncorrelated with each other, therefore

$E[n_1 n_2] = E[n_i X_k] = E[n_i Y_k] = 0, \quad i = 1, 2;\ k = 1, 2$ (5)

Ref. [3] introduces a transformation matrix T to transform the correlated received signals r1(t) and r2(t) into two new uncorrelated signals r3(t) and r4(t), therefore,

$\begin{bmatrix} r_3(t) \\ r_4(t) \end{bmatrix} = T \begin{bmatrix} r_1(t) \\ r_2(t) \end{bmatrix}$ (6)

Where $T = \begin{bmatrix} \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ -\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{bmatrix}$ (7)

The two new received signals can be expressed as,

$r_k(t) = [X_k + jY_k]\, e^{j\psi_m(t)} + n_k(t), \quad k = 3, 4$ (8)

By writing out the expressions of X3, X4, Y3, and Y4, it can be seen that they are functions of Gaussian random variables, therefore they are also Gaussian random variables; in addition they are mutually independent. Thus,

$E[X_3^2] = E[Y_3^2] = (1+\rho)\sigma^2$ (9)

$E[X_4^2] = E[Y_4^2] = (1-\rho)\sigma^2$ (10)

Also, n3 and n4 are functions of AWGN random variables, and they have the same noise power, and are uncorrelated with the new channel statistics.
If the noise power at each receiver for the original correlated signals is the same, then, from Eq. (9) and Eq. (10), a new SNR is defined for each uncorrelated signal:

$\Gamma_3 = (1 + \rho)\,\Gamma$ (11)

$\Gamma_4 = (1 - \rho)\,\Gamma$ (12)

Where Γ is the SNR of the original correlated signals.
Now the BER values for a two-branch selective diversity system can be calculated from the following expression [3],

$BER = \dfrac{1}{2}\left[1 - \sqrt{\dfrac{\Gamma_3}{\Gamma_3 + 1}} - \sqrt{\dfrac{\Gamma_4}{\Gamma_4 + 1}} + \sqrt{\dfrac{\Gamma_3\Gamma_4}{\Gamma_4\Gamma_3 + \Gamma_3 + \Gamma_4}}\right]$ (13)

3 COMPUTER SIMULATION CONDITIONS

A DS-CDMA system with three antennas at the BS and one antenna at the MS is assumed. Fig. 1 shows the propagation model at the BS. Table 1 shows the simulation parameters.

Figure 1: Linear array and propagation model at BS (incident waves with angle spread φ and arrival angle θ; antenna elements #1, #2, #3 with spacing d/λ).

Table 1: Simulation parameters

Modulation        QPSK
Demodulation      Coherent detection
Symbol rate       30 ksps
Spreading code    Walsh code
Spreading factor  128

To model the Rayleigh fading, we consider a set of 8 plane waves that are transmitted in random directions within the range of φ degrees at the BS [4]. The value of φ will be determined in the next section. Each of the plane waves has constant amplitude and a random initial phase distributed from 0 to 2π. The Doppler frequency is uniformly distributed from +fd to -fd (fd is the maximum Doppler frequency). The 8 incident plane waves arrive in random directions from 0 to 2π at the MS. QPSK is assumed with coherent detection. A square root raised cosine filtering with a roll-off

factor α of 0.5 is employed. A symbol rate of 30 ksps is assumed. The spreading code is a Walsh code with a spreading factor of 128. The Rayleigh fading channels were disturbed by AWGN.
The performance of the diversity system depends on the correlation between antenna elements. The correlation is determined by the antenna element spacing, the angle spread of incident waves φ and the direction of arrival θ [5]. Thus, we have to optimize these values to get better BER performance.

Figure 4: Maximum Doppler frequencies (fd) vs. BER for Eb/N0=10 dB (fd from 80 to 200 Hz; curves for M=1 and M=2).
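The fading model of Section 3 (8 plane waves of equal amplitude, random initial phase and Doppler shift uniform in [-fd, +fd]) can be sketched for a single antenna as follows. This is an illustrative Python sketch under stated assumptions, not the simulator used for the results: the function names are hypothetical, the amplitude is normalised so that the mean power is 1, the spatial correlation between antennas is not modelled, and branch selection assumes equal noise power on all branches.

```python
import cmath
import math
import random


def make_fading_process(fd, n_waves=8, rng=None):
    """Return g(t), a complex fading gain built from n_waves plane waves.

    Each wave has the same amplitude, a random initial phase in [0, 2*pi)
    and a Doppler shift drawn uniformly from [-fd, +fd].
    """
    rng = rng or random.Random()
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_waves)]
    dopplers = [rng.uniform(-fd, fd) for _ in range(n_waves)]
    amplitude = 1.0 / math.sqrt(n_waves)  # normalises the mean power to 1

    def gain(t):
        return sum(amplitude * cmath.exp(1j * (2.0 * math.pi * f * t + p))
                   for f, p in zip(dopplers, phases))

    return gain


def select_branch(gains):
    """Selection diversity: index of the branch with the largest instantaneous power."""
    return max(range(len(gains)), key=lambda k: abs(gains[k]) ** 2)
```

A larger fd makes `gain(t)` change faster in time, which is the mechanism behind the degradation with Doppler frequency discussed in the results.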
4 COMPUTER SIMULATION RESULTS

Fig. 2 shows the effect of the arrival angle, θ, of the signal on BER performance at Eb/N0=10dB. From this figure, it can be concluded that changing the value of θ has only a slight effect. Therefore, we use the value θ = 30° in our simulation.

Figure 2: Arrival angle of the signal θ vs. BER (θ from 0 to 135 degrees).

Figure 5: Effect of branch correlation on BER performance of SD with two-branch diversity (theoretical); BER vs. SNR (0 to 20 dB) for ρ = 0, 0.1, 0.5, 0.9 and 1.0.
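Theoretical curves of the kind plotted in Fig. 5 follow from Eq. (13) with Γ3 = (1 + ρ)Γ and Γ4 = (1 - ρ)Γ. The following Python sketch evaluates that expression; it is not the authors' code, the function names are hypothetical, and it assumes the square-root (coherent-detection) reading of Eq. (13). Without the radicals, the same expression would be the differentially coherent form.

```python
import math


def branch_snrs(gamma, rho):
    """Equations (11)-(12): SNRs of the two decorrelated branches."""
    return (1.0 + rho) * gamma, (1.0 - rho) * gamma


def ber_two_branch_sd(gamma_db, rho):
    """Equation (13) for two-branch selection diversity.

    gamma_db is the average SNR per branch in dB and rho the correlation
    coefficient between the fading channels (0 <= rho <= 1).
    """
    gamma = 10.0 ** (gamma_db / 10.0)
    g3, g4 = branch_snrs(gamma, rho)
    term3 = math.sqrt(g3 / (g3 + 1.0))
    term4 = math.sqrt(g4 / (g4 + 1.0))
    cross = math.sqrt(g3 * g4 / (g3 * g4 + g3 + g4))
    return 0.5 * (1.0 - term3 - term4 + cross)
```

At ρ = 1 the second branch vanishes (Γ4 = 0) and the expression reduces to a single Rayleigh branch with SNR 2Γ, which matches the remark that full correlation removes the diversity advantage.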

The effect of the angle spread of incident waves φ is presented in Fig. 3. From this figure, we select the value of 12°, which gives better BER performance.

Figure 3: Angle spread of incident waves φ vs. BER (φ from 6 to 20 degrees).

The effect of fd on BER performance is shown in Fig. 4. As fd increases, due to the increase in the speed of the mobile, BER performance degrades. This degradation is due to rapid changes in channel characteristics. The lowest value of fd that gives better BER is 90 Hz. We use these values in the following paragraphs.

Figure 6: Normalized distance (d/λ) vs. BER (d/λ from 0 to 8).

Fig. 5 shows the results of the theoretical BER in Eq. (13) for a two-branch diversity system with different values of the correlation coefficient ρ. From this figure it can be concluded that as the correlation coefficient increases, the BER performance decreases. Also, as the coefficient approaches 1, one of the diversity branches is effectively removed; this leads to losing the advantage gained from antenna diversity. On the other hand, reducing values of ρ correspond to an increase in the spatial separation between antennas. For this reason we have to look for the optimum antenna separation that yields better BER performance. Simulations were performed where the ratio d/λ was varied between 0.1 and 8. The results are indicated in Fig. 6. It is clear that as the ratio is increased, the BER performance is better. When d/λ is 6, we already have optimal BER

results. Also, increasing d/λ beyond 6 does not have any noticeable benefits.

Figure 7: BER vs. Eb/N0 for fd = 90 Hz, d/λ = 0.5, 5.25 and M = 1, 2, and 3 (curves: M=1, M=2/0.5λ, M=3/0.5λ, M=2/5.25λ, M=3/5.25λ).

Fig. 7 displays the BER of transmit diversity for different numbers of antennas, M=1, 2, and 3, and different values of antenna separation d/λ at the base station. It is clear that increasing M and d/λ has a positive effect on the BER performance. Numerically, the improvement in Eb/N0 for M = 2 and 3 is 4 dB and 6 dB, respectively, with respect to M=1 at d/λ=0.5 and a BER of 10^-4. More improvement is obtained at d/λ=5.25, where it increases to 12 dB and 14 dB for M=2 and 3 with respect to M=1 at a BER of 10^-4.

Figure 8: Comparison of simulated and theoretical results (BER vs. Eb/N0 from 0 to 10 dB).

Figure 8 compares the simulated and theoretical results of BER for M=2. It shows that the simulation results are a close match to the BER in Eq. (13). The small difference is due to optimizing the simulation parameters.

5 CONCLUSIONS

In this paper, the performance of transmission selection spaced diversity in a DS-CDMA system was studied. This performance had not been clarified until now under the effect of changing the spacing between antennas at the BS and the maximum Doppler frequency under optimum conditions. The results show that increasing the spacing between antennas gives better BER performance due to diversity gain. This gain comes from uncorrelated diversity branches. Moreover, increasing the maximum Doppler frequency degrades the BER performance due to the rapid changes of channel characteristics. The analysis of this system under Rayleigh fading was also presented.

6 REFERENCES

[1] M. K. Simon and M. S. Alouini: Digital Communication over Fading Channels: A Unified Approach to Performance Analysis. Wiley Series in Telecommunications and Signal Processing. New York: Wiley-Interscience, 2000.
[2] W. C. Jakes: Microwave Mobile Communications. New York: John Wiley & Sons, Inc., 1974.
[3] L. Fang, G. Bi, and A. C. Kot, "New Method of Performance Analysis for Diversity Reception with Correlated Rayleigh-fading Signals," IEEE Transactions on Vehicular Technology, September 2000, vol. 49, pp. 1807-1812.
[4] C. X. Wang and M. Patzold: Methods of Generating Multiple Uncorrelated Rayleigh Fading Processes. IEEE Veh. Tech. Conf., Spring 2003.
[5] S. Kosono and S. Sakagumi, "Correlation Coefficient on Base Station Diversity for Land Mobile Communication Systems", IEICE Trans. Comm., Vol. J70-B, No. 4, April 1987, pp. 476-482.
[6] Qiang Zhao, New Results on Selection Diversity Over Fading Channels, Thesis, December 27, 2002.



WEB-BASED DECISION SUPPORT SYSTEMS
AS KNOWLEDGE REPOSITORIES FOR
KNOWLEDGE MANAGEMENT SYSTEMS

Yuri Boreisha
Minnesota State University Moorhead, USA
Boreisha@mnstate.edu

Oksana Myronovych
North Dakota State University, USA
Oksana.Myronovych@ndsu.nodak.edu

ABSTRACT
Problem solving and learning processes conducted on the basis of contemporary Web-
based DSS provide for development and enhancement of knowledge management
systems. Knowledge objects form the foundation of the conceptual approach to the
knowledge management based on the contemporary Internet technologies and
knowledge accumulated in DSS.

Keywords: knowledge management systems, decision support systems.

1 INTRODUCTION

Knowledge management (KM) has become an important theme as managers realize that much of their firm's value depends on its ability to create and manage knowledge. To transform information into knowledge a firm must use additional resources to discover patterns, rules, and the context where the knowledge works [1-3].

Knowledge that is not shared and applied to practical problems does not add business value. Today people can share their knowledge in three primary ways. Organizational information systems (IS) that store, manage, and deliver documents are called content management systems (CMS). With the arrival of modern communications technology, people can share their knowledge via collaborative knowledge management systems (KMS). In addition to content management and collaboration, knowledge can be shared via expert systems. A comprehensive discussion of the important dimensions of knowledge, the knowledge management value chain, and the types of KMS can be found in [2, 3].

Web 2.0 companies use the Web as a platform to create collaborative, community-based sites (e.g., social networking sites, blogs, wikis, etc.). The Web has now become an application, development, delivery, and execution platform [4].

Software as a Service (SaaS), application software that runs on a Web server rather than being installed on the client computer, has gained popularity, particularly with businesses. Collaborating on projects with co-workers across the world is easier, since information is stored on a Web server instead of on a single desktop.

Rich Internet Applications (RIAs) are Web applications that offer responsiveness, "rich" features, and functionality approaching that of desktop applications. RIAs are the result of today's more advanced technologies (such as Ajax) that allow greater responsiveness and advanced GUIs.

Web services have emerged and, in the process, have inspired the creation of many Web 2.0 businesses. Web services allow you to incorporate functionality from existing applications and Web sites into your own applications quickly and easily.

Web 2.0 companies use "data mining" to extract as much meaning as they can from XHTML-encoded pages. XHTML-encoded content does not explicitly convey meaning, but XML-encoded content does. So if we can encode in XML (and derivative technologies) much or all of the content on the Web, we will take a great leap forward towards realizing the Semantic Web.

Many people consider the Semantic Web to be the next generation in Web development, one that helps to realize the full potential of the Web, the "Web of meaning". Though Web 2.0 applications are finding meaning in the content, the Semantic Web (heavily dependent on XML and XML-based technologies) will attempt to make that meaning clear to computers as well as humans [5].



These trends in Web Science, the new science of decentralized information systems, provide new opportunities in KM.

In this paper we consider contemporary Decision Support Systems (DSS) as knowledge repositories that can be expanded to KMS using the Web 2.0 software development technologies and tools. This paper is based on a series of the authors' previous publications [6-11].

2 KNOWLEDGE MANAGEMENT AND DECISION SUPPORT SYSTEMS

The AI representation principle states that once a problem is described using an appropriate representation, the problem is almost solved. Well-known knowledge representation techniques include rule-based systems, semantic nets, and frame systems [12].

KM refers to the set of business processes developed in an organization to create, store, transfer, and apply knowledge. KM increases the ability of the organization to learn from its environment and to incorporate knowledge into business processes. There are three major categories of KMS: enterprise-wide KMS, knowledge work systems (KWS), and intelligent techniques [2, 3].

Enterprise-wide KMS are general purpose, integrated, firm-wide efforts to collect, store, disseminate, and use digital content and knowledge. Such systems provide databases and tools for organizing and storing structured and unstructured documents and other knowledge objects, directories and tools for locating employees with experience in a particular area, and, increasingly, Web-based tools for collaboration and communication.

KWS (such as computer-aided design, visualization, and virtual reality systems) are specialized systems built for engineers, scientists, and other knowledge workers charged with discovering and creating new knowledge for a company.

A diverse group of intelligent techniques (such as data mining, neural networks, expert systems, case-based reasoning, fuzzy logic, genetic algorithms, and intelligent agents) have different objectives, from a focus on discovering knowledge (data mining and neural networks), to distilling knowledge in the form of rules for a computer program (expert systems and fuzzy logic), to discovering optimal solutions for problems (genetic algorithms).

It is said that effective KM is 80% managerial and organizational, and 20% technology. One of the first challenges that firms face when building knowledge repositories of any kind is the problem of identifying the correct categories to use when classifying documents. Firms are increasingly using a combination of internally developed taxonomies and search engine techniques.

Organizations acquire knowledge in a number of ways, depending on the type of knowledge they seek. Once the corresponding documents, patterns, and expert rules are discovered they must be stored so they can be retrieved and used. Knowledge storage generally involves databases, document management systems, expert systems, etc. To provide a return on investment, knowledge should become a systematic part of the organizational problem solving process. Ultimately, new knowledge should be built into a firm's business processes and key application systems.

KMS and related knowledge repositories should facilitate the problem solving process (Figure 1). During the process of solving problems managers engage in decision making, the act of selecting from alternative problem solutions.

The different levels in an organization (strategic, management, and operational) have different decision-making requirements. Decisions can be structured, semi-structured, or unstructured. Structured decisions are clustered at the operational level of the organization, and unstructured decisions at the strategic level.

Management information systems (MIS) provide information on firm performance to help managers monitor and control the business, often in the form of fixed, regularly scheduled reports based on data summarized from the firm's transaction processing systems (TPS). MIS support structured decisions and some semi-structured decisions.

DSS combine data, sophisticated analytical models and tools, and user-friendly software into a single powerful system that can support semi-structured and unstructured decision making [3, 13, 14].

The main components of the DSS are the DSS database, the user interface, and the DSS software system (Figure 2). The DSS database is a collection of current data from a number of applications and groups. Alternatively, the DSS database may be a data warehouse that integrates the enterprise data sources and maintains historical data.

The DSS user interface permits easy interaction between users of the system and the DSS software tools. Many DSS today have Web interfaces to take advantage of graphics displays, interactivity, and ease of use.

The DSS software system contains the software tools that are used for data analysis. It may contain various OLAP tools, data mining tools, or a collection of mathematical and analytical models that can easily be made accessible to the DSS users.



[Diagram elements: Problem; Standards (Desired state); Information (Current state); Alternative solutions; DSS; Constraints; Problem solver (Manager); Solution.]

Figure 1: Elements of the problem solving process.

The dialog manager is also in charge of information visualization. Finally, access to the Internet, networks, and other computer-based systems permits the DSS to tie into other powerful systems, including the TPS or function-specific subsystems.

There are many kinds of DSS. The first generic type of DSS is a Data-Driven DSS. These systems include file drawer and management reporting systems, data warehousing and analysis systems, Executive Information Systems, and Spatial DSS. Data-Driven DSS emphasize access to and manipulation of large databases of structured data and especially a time-series of internal company data and sometimes external data. Relational databases accessed by query and retrieval tools provide an elementary level of functionality. Data warehouse systems that allow the manipulation of data by computerized tools tailored to a specific task and setting, or by more general tools and operations, provide additional functionality. Data-Driven DSS with Online Analytical Processing (OLAP) provide the highest level of functionality and decision support that is linked to analysis of large collections of historical data.
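The kind of roll-up analysis that OLAP adds on top of plain query and retrieval can be sketched in a few lines. This is an illustration only: the sales records, the dimension names (region, quarter, product), and the helper `roll_up` are hypothetical and are not taken from any particular DSS product.

```python
from collections import defaultdict

# Hypothetical fact table: one row per (region, quarter, product) with a sales measure.
SALES = [
    {"region": "North", "quarter": "Q1", "product": "A", "sales": 120},
    {"region": "North", "quarter": "Q2", "product": "A", "sales": 150},
    {"region": "South", "quarter": "Q1", "product": "A", "sales": 90},
    {"region": "South", "quarter": "Q1", "product": "B", "sales": 60},
    {"region": "South", "quarter": "Q2", "product": "B", "sales": 80},
]

def roll_up(rows, dimensions, measure="sales"):
    """Sum `measure` over all rows, grouped by the chosen dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)

# The same data viewed at three levels of aggregation.
by_region = roll_up(SALES, ["region"])
by_region_quarter = roll_up(SALES, ["region", "quarter"])
grand_total = roll_up(SALES, [])
```

Choosing a different list of dimensions gives a different view of the same historical data, which is the essence of the analysis described above; a real OLAP engine would precompute and index such aggregates rather than scan the rows each time.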

[Diagram elements: Internal Data; External Data; DSS Database/Data Warehouse; DSS Software System (Models, OLAP Tools, Data Mining Tools); User Interface (Dialog Manager); Users.]

Figure 2: Main components of the DSS.



A second category, Model-Driven DSS, includes systems that use accounting and financial models, representational models, and optimization models. Model-Driven DSS emphasize access to and manipulation of a model. Simple statistical and analytical tools provide an elementary level of functionality. Some OLAP systems that allow complex analysis of data may be classified as hybrid DSS providing modeling, data retrieval, and data summarization functionality. Model-Driven DSS use data and parameters provided by decision-makers to aid them in analyzing a situation, but they are not usually data intensive. Very large databases are usually not needed for Model-Driven DSS.

Knowledge-Driven DSS or Expert Systems can suggest or recommend actions to managers. These DSS are human-computer systems with specialized problem-solving expertise. The expertise consists of knowledge about a particular domain, understanding of problems within that domain, and skills at solving some of these problems (AI algorithms and solutions can be used). A related concept is data mining. It refers to a class of analytical applications that search for hidden patterns in a database. Data mining is the process of sifting through large amounts of data to produce data content relationships. Tools used for building Knowledge-Driven DSS are sometimes called Intelligent Decision Support methods.

Document-Driven DSS are evolving to help managers retrieve and manage unstructured documents and Web pages. A Document-Driven DSS integrates a variety of storage and processing technologies to provide complete document retrieval and analysis. The WWW provides access to large document databases including databases of hypertext documents, images, sounds, and video. Examples of documents that would be accessed by Document-Driven DSS are policies and procedures, product specifications, catalogs, and corporate historical documents, including minutes of meetings, corporate records, and important correspondence. Search engines are powerful decision-aiding tools associated with Document-Driven DSS.

Group DSS (GDSS) came first, but now a broader category of Communications-Driven DSS or groupware can be identified. These DSS include communication, collaboration, and related decision support technologies. These are hybrid DSS that emphasize both the use of communications and decision models to facilitate the solution of problems by decision-makers working together as a group. Groupware supports electronic communication, scheduling, document sharing, and other group productivity and decision support enhancing activities.

A DSS model that incorporates Group Decision Support, OLAP, and AI is shown in Figure 3.

[Diagram elements: Relational Database; Knowledge Database; Multidimensional Database; Relational DBMS; Inference Engine; Multidimensional DBMS; Report Writing Software; Mathematical Models; Groupware. Outputs: periodic and special reports; outputs from mathematical models; outputs from groupware; solutions and explanations; outputs from OLAP.]

Figure 3: A DSS model that incorporates GDS, OLAP, and AI.



DSS facilitate decision making. Decision making is an integral part of the overall problem solving process. KMS should facilitate the problem solving process. In the next section we discuss how Web-enabled DSS can be integrated into contemporary KMS.

3 WEB-ENABLED DECISION SUPPORT SYSTEMS

All types of DSS can be deployed using Web technologies and can become Web-based DSS. Managers increasingly have Web access to data warehouses and analytical tools. To discuss the recent trends in this area, the latest achievements in three-layer design, Rich Internet Applications (RIA), and Web services should be taken into account.

Three-layer design is an effective approach to developing robust and easily maintainable systems. The corresponding architecture is appropriate for systems that need to support multiple user interfaces. Contemporary Web applications are three-layer applications.

The most common set of layers includes the following:
- Data layer that manages stored data, usually in one or more databases.
- Business logic (domain) layer that implements the rules and procedures of the business processing.
- View layer that accepts input and formats and displays processing results.

RIA have two key attributes: performance and rich GUI. RIA performance comes from Ajax (Asynchronous JavaScript and XML), which uses client-side scripting to make Web applications more responsive by separating client-side user interaction and server communication, and running them in parallel. Various ways to develop Ajax applications are discussed in [5].

Web services promote software portability and reusability in applications that operate over the Internet. Web services mark a transition to service-oriented, component-based, distributed applications. Web services are applications implemented as Web-based components with well-defined interfaces, which offer certain functionality to clients via the Internet. Once deployed, Web services can be discovered, used, and reused by consumers (clients, other services, or applications) as building blocks via open industry-standard protocols. Web service architecture is built on open standards and vendor-neutral specifications. Services can be implemented in any programming language, deployed, and then executed on any operating system or software platform.
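The three layers listed above can be illustrated with a minimal sketch. All class names, the in-memory order data, and the two rendering methods are hypothetical; the point is only that one business logic layer can serve several user interfaces (here a text report and a JSON payload of the kind an Ajax client might request).

```python
import json

class DataLayer:
    """Data layer: manages stored data (an in-memory dict stands in for a database)."""
    def __init__(self):
        self._orders = {"Q1": [100, 250, 75], "Q2": [300, 120]}

    def orders_for(self, quarter):
        return list(self._orders.get(quarter, []))

class BusinessLogicLayer:
    """Business logic layer: applies the rules of the business processing."""
    def __init__(self, data):
        self._data = data

    def quarterly_summary(self, quarter):
        orders = self._data.orders_for(quarter)
        total = sum(orders)
        return {
            "quarter": quarter,
            "orders": len(orders),
            "total": total,
            "average": total / len(orders) if orders else 0.0,
        }

class ViewLayer:
    """View layer: accepts input and formats the processing results."""
    def __init__(self, logic):
        self._logic = logic

    def render_text(self, quarter):
        s = self._logic.quarterly_summary(quarter)
        return f"{s['quarter']}: {s['orders']} orders, total {s['total']}"

    def render_json(self, quarter):
        # A second interface over the same logic, e.g. for an Ajax client.
        return json.dumps(self._logic.quarterly_summary(quarter), sort_keys=True)

logic = BusinessLogicLayer(DataLayer())
view = ViewLayer(logic)
text_report = view.render_text("Q1")
json_report = view.render_json("Q2")
```

Because the view never touches the stored data directly, the in-memory dictionary could be swapped for a real database, or a new interface added, without changing the business rules.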

[Diagram elements: Internal Data; External Data; DSS Database/Data Warehouse; Web Services (provide access to the DSS Software System); Ajax-Enabled Applications (implement the Dialog Manager); Internet Users.]

Figure 4: Web-enabled DSS.

The service-oriented architecture (SOA) provides the theoretical model for all Web services. The model behind Web services is a loosely coupled architecture, consisting of different software components working together. Consuming Web services is based on open standards managed by broad consortia (e.g., World Wide Web Consortium, Organization for the Advancement of Structured Information Standards, Web Services Interoperability Organization).

What makes Web services different from ordinary Web sites is the type of interaction that they can provide. Most of the enthusiasm surrounding Web services is based on the promise of interoperability. Every software application in the world can potentially talk to every other software application. This communication can take place across the old boundaries of location, operating system, language, protocol, and so on.

Three-layer architecture maps well onto the structure of the main components of the DSS (see Figure 2). RIA provide for efficient implementation of the Dialog Manager GUI for DSS. Web services allow functionality from existing applications to be incorporated, and thereby provide access to the DSS Software System through the SOA. The components of the Web-enabled DSS are shown in Figure 4.

We can call a group of the following related components a knowledge object (Figure 5). The techniques discussed allow the creation of new Web services (based on existing ones and on contemporary DSS software systems) and of Ajax-enabled applications that interact with these Web services. So we can talk about the creation and modification of knowledge objects.

[Diagram elements: DSS Database/Data Warehouse; Web Service; Ajax-Enabled Application.]

Figure 5: Structure of a knowledge object.

Web-enabled DSS provide for expandable collections of the knowledge objects that constitute the knowledge repository of the corresponding KMS. From this point of view the knowledge objects can be considered a knowledge representation technique.

4 PROBLEM SOLVING AND LEARNING

AI distinguishes two general kinds of learning. The first kind is based on coupling new information to previously acquired knowledge. Typical examples include learning by analyzing differences, by managing multiple models, by explaining experience, and by correcting mistakes. The second kind is based on digging useful regularity out of data, a practice often referred to as data mining. Typical examples include learning by recording cases, by building identification trees, by training neural nets, by training perceptrons, by training approximation nets, and by simulating evolution (e.g., genetic algorithms).

Expert systems primarily capture the tacit knowledge of individual experts, but organizations also have collective knowledge and expertise that they have built up over the years. This organizational knowledge can be captured and stored using case-based reasoning (CBR). In CBR, descriptions of the past experiences of human specialists, represented as cases, are stored in a database for later retrieval when the user encounters a new case with similar parameters. The system searches for stored cases with problem characteristics similar to the new one, finds the closest fit, and applies the solution of the old case to the new case. Successful solutions are tagged to the new case and both are stored together with the other cases in the knowledge base. Unsuccessful solutions are also appended to the case database along with explanations as to why the solutions did not work.

Problem-based learning (PBL) is (along with active learning and cooperative/collaborative learning) one of the most important developments in contemporary higher education. PBL is based on the assumption that human beings evolved as individuals who are motivated to solve problems, and that problem solvers will seek and learn whatever knowledge is needed for successful problem solving. PBL is a typical example of an application of the first type of learning in higher education [11].

Combining the main ideas of CBR and PBL, the following problem solving and learning process can be depicted as shown in Figure 6.

[Diagram elements: User describes the problem; System searches the repository of knowledge objects for the suitable ones; System asks the user additional questions to narrow the search; System finds the closest fit and provides access to knowledge objects; New knowledge object is created to better fit the problem; System stores the problem description and the knowledge object in the repository; User learns about the knowledge objects that facilitate the problem solving; Repository of knowledge objects (based on a Web-enabled DSS).]

Figure 6: Problem solving and learning with knowledge objects.
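The search, closest-fit, and store steps of this process can be sketched as a small case-based retrieval loop. Everything below is an assumption made for illustration: problem descriptions are reduced to keyword sets, similarity is measured with the Jaccard index, and the `KnowledgeObject` class, its fields, and the example.org URLs are hypothetical placeholders rather than part of any system described here.

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeObject:
    """Hypothetical stand-in for a knowledge object (DSS data + Web service + Ajax app)."""
    name: str
    service_url: str              # placeholder endpoint, not a real service
    keywords: set = field(default_factory=set)

@dataclass
class Case:
    description: set              # problem description reduced to keywords
    knowledge_object: KnowledgeObject

def similarity(a, b):
    """Jaccard similarity between two keyword sets (an assumed metric)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

class Repository:
    def __init__(self):
        self.cases = []

    def store(self, description, knowledge_object):
        self.cases.append(Case(set(description), knowledge_object))

    def closest_fit(self, description, threshold=0.3):
        """Return (case, score) for the best match, or (None, 0.0) if none is close enough."""
        query = set(description)
        best, best_score = None, 0.0
        for case in self.cases:
            score = similarity(query, case.description)
            if score > best_score:
                best, best_score = case, score
        if best_score < threshold:
            return None, 0.0
        return best, best_score

repo = Repository()
repo.store({"inventory", "forecast", "seasonal"},
           KnowledgeObject("DemandForecast", "https://example.org/forecast"))
repo.store({"budget", "allocation", "department"},
           KnowledgeObject("BudgetPlanner", "https://example.org/budget"))

# A new problem: the system finds the closest fit and stores the new case with it.
problem = {"inventory", "forecast", "holiday"}
match, score = repo.closest_fit(problem)
if match is not None:
    repo.store(problem, match.knowledge_object)
```

When `closest_fit` returns no match, the process in Figure 6 would either ask the user additional questions to refine the description or create a new knowledge object; both branches are left out of this sketch.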

5 CONCLUSIONS

Knowledge is a complex phenomenon, and there are many aspects to the process of managing knowledge. Knowledge-based core competencies of firms are key organizational assets. Knowing how to do things effectively and efficiently in ways that other organizations cannot duplicate is a primary source of profit and competitive advantage that cannot be purchased easily by competitors in the marketplace.

This paper discusses Web-enabled DSS, related knowledge repositories, and KMS that facilitate problem solving and learning. The knowledge objects approach to knowledge representation allows contemporary DSS to be considered as integrated parts of the corresponding KMS.

6 REFERENCES

[1] V. Supyuenyong, N. Islam: Knowledge Management Architecture: Building Blocks and Their Relationships, Technology Management for the Global Future, Vol. 3, pp. 1210-1219 (2006).
[2] K.C. Laudon, J.P. Laudon: Management Information Systems. Managing the Digital Firm, Prentice Hall, pp. 428-508 (2006).
[3] R. McLeod, G. Schell: Management Information Systems, 10th Edition, Prentice Hall, pp. 250-274 (2006).
[4] P.J. Deitel, H.M. Deitel: Internet and World Wide Web. How to Program, 4th Edition, Prentice Hall, pp. 50-117 (2008).
[5] T. Berners-Lee, et al: A Framework for Web Science, Foundations and Trends in Web Science, Vol. 1, No 1, pp. 1-130 (2006).
[6] Y. Boreisha, O. Myronovych: Web-Based Decision Support Systems in Knowledge Management and Education, Proceedings of the 2007 International Conference on Information and Knowledge Engineering, IKE'07, June 25-28, Las Vegas, USA, pp. 11-17 (2007).
[7] Y. Boreisha, O. Myronovych: Web Services-Based Virtual Data Warehouse as an Integration and ETL Tool, Proceedings of the 2005 International Symposium on Web Services and Applications, ISWS'05, June 27-30, Las Vegas, USA, pp. 52-58 (2005).
[8] Y. Boreisha, O. Myronovych: Data-Driven Web Sites, WSEAS Transactions on Computers, Vol. 2, No 1, pp. 79-83 (2003).
[9] Y. Boreisha: Database Integration Over the Web, Proceedings of the International Conference on Internet Computing, IC'02, June 24-27, Las Vegas, USA, pp. 1088-1093 (2002).
[10] Y. Boreisha: Internet-Based Data Warehousing, Proceedings of SPIE Internet-Based Enterprise Integration and Management, Vol. 4566, pp. 102-108 (2001).
[11] Y. Boreisha, O. Myronovych: Knowledge Navigation and Evolutionary Prototyping in E-Learning Systems, Proceedings of the E-Learn 2005 World Conference on E-Learning in Corporate, Government, Healthcare, and Higher Education, October 24-28, Vancouver, Canada, pp. 552-559 (2005).
[12] P.H. Winston: Artificial Intelligence, Addison-Wesley, pp. 15-228 (1992).
[13] S. French, M. Turoff: Decision Support Systems, Communications of the ACM, Vol. 50, No 3, pp. 39-40 (2007).
[14] Chien-Chih Yu: A Web-Based Consumer-Oriented Intelligent Decision Support System for Personalized E-Services, ACM International Conference Proceeding Series, Vol. 60, pp. 429-437 (2004).



Simulating social interaction scenarios in an office.
Michele Bezzi∗, Robin Groenevelt∗, Frederick Schlereth†
∗ Accenture Technology Labs, 449, route des Cretes, Sophia Antipolis, France
† Chalmers University, Goteborg, Sweden
October 15, 2007

Abstract

Work team coordination is becoming a major challenge in contemporary complex working environments. The coordination process takes place through direct interaction and explicit communication, but it also takes advantage of the informal social network among team members. Consequently, in order to develop a realistic model of team coordination, we need to measure and model such interactions in real world environments. We present an agent-based model for simulating the movement of people in a workspace, which may be used as a tool for developing and testing social relationship models. We demonstrate the model by simulating office life in one of our laboratories and comparing the results to actual measurements obtained with a sensor network.

1 Introduction

Large corporations are often organized in functional teams. The objective of team work is to achieve a common goal by integrating and coordinating individual capabilities. In this framework, social interactions play a major role, and, although many communication media are nowadays available, face-to-face interactions are still highly important [1, 5]. Accordingly, theoretical models of how people interact in a certain environment can be useful to shed some light on the mechanisms underlying the collective behavior of teams and business units. However, finding realistic mathematical descriptions of social interactions is extremely hard even in well structured environments, such as an office. The main issue is the complexity of human social behavior due to its high variability and its dependency on external constraints such as temporal and spatial context (e.g., environment layout) and task context (e.g., personal list of activities and goals). A successful model should therefore incorporate all these aspects, and, to be realistic, its parameters have to be set using experimental data.

On the positive side, recent sensor technologies provide us with an unprecedented recording of information from the physical world. In previous studies, we investigated the social patterns during some typical office activities [2], using data from a sensor network located in one of our laboratories [15, 14]. Collecting long-term and reliable data using this pervasive environment is a long process and may raise privacy issues. Consequently, working with a real life environment does not allow us to efficiently test the impact of changes in the environment (e.g., the impact of some space rearrangements on group dynamics).

The aim of the paper is to introduce an agent-based model for simulating a workspace with movements of people and face-to-face contact between individuals. This model can be used as a tool for investigating the dynamics of social interactions, for which the results can be fed by and/or validated against actual measurements obtained with a sensor network. In particular, this allows assessment of these measures under different conditions, such as assessing the impact of a physical change in the environment, the effects of team building exercises, the arrival of a new employee, or changes in the layout of the teams.

The important reason for being able to simulate social encounters is that it allows us to study the effect



of (changes in) the environment on the social behavior of people. There are many questions for which, to the best of our knowledge, few quantitative studies exist. For example, how well and how quickly does a new employee get integrated into the working society under a variety of scenarios? These scenarios could include: having people work in open space offices instead of in cubicles, having team meetings in various locations, the location of a coffee machine, the effect of being at the far end of the building. Do people get more social connections when teams are mixed so that people are forced to walk around more? We see our simulator as a step towards quantitatively answering these kinds of questions, in the absence of real-world measurements.

The paper is organized as follows: in Section 2 we briefly summarize the main features to model and the actual sensor network. In Section 3 we describe the probabilistic model underlying our simulator. Social behavior, derived through numerical simulations, is presented in Section 4. Finally, conclusions are drawn in the last section.

2 Modeling Office Activities

We chose an office environment as a test setting for two reasons. First of all, quantitative evaluations of various office activities have important practical applications (e.g., assessing the quality of space organization in the office, estimating connections amongst different people/departments, safety and security). Secondly, a video-camera infrastructure which collects data on people's movements and presence was readily available in one of our offices and the data thus collected is accessible to us [15, 14]. This experimental environment consists of an office floor at Accenture Technology Labs in Chicago. The floor is equipped with a network consisting of 30 video cameras, 90 infrared tag readers, and a biometric station for fingerprint reading.

The first step was the fusion of this raw sensor data into a higher-level description of people's movements inside the office. Identification and tracking of the people was performed using a Bayesian network. In short (see [14] for details), the office space was divided into 50 locations, each of them approximately the size of a room. This allowed us to remove the variability of paths inside a room while still maintaining enough information about the movements of people. Each sensor detects signals of people in its sensory field. For each person and location the signals were merged together to build the current probabilistic evidence of finding a certain person in a specific location, after which this information was integrated with the current belief of the system (derived from previous observations). The result was a sequence of matrices, one for each time step, where the probability of finding a person in each location is reported.

In the second step, starting from these matrices, we derived the most likely paths for each tracked individual; these data were then analyzed to find frequent patterns and appropriate statistical quantities to describe long term activities. Extracted recurrent patterns were later identified by exploiting local semantics (e.g., meetings usually take place in the meeting room) as well as context-based knowledge (e.g., matching movement patterns with the information available from electronic calendars). The data acquisition system is currently still under development, so we had too little data available to find meaningful long-term recurrent patterns. Nonetheless, to give a glimpse of the kind of statistical analysis we are interested in, we analyzed a limited data set showing, for example, that functional teams, such as research and development groups, tend to be strongly interconnected inside the group, but loosely connected across different groups. Results of this analysis are reported in Ref. [2].

3 Numerical Simulations

In this section we present a model for simulating movements of people in an office setting analogous to the workspace described above. In fact, data collection in a real environment is a long process and it may generate privacy concerns. Therefore, to freely test our algorithms and hypotheses, we built an agent-based simulator of movements of people in an office. As in the real-life setting, the office map was divided into 50 locations, each of them the size of a room (see



Fig. 1). In total there were 30 people (agents). In its simplest version, each
agent had a set of possible destinations in the office floor, with different
probabilities derived from the collected data and from our knowledge of their
office life. At each time step, each agent decides to stay in the current location
with a certain probability (usually large if it is in his or her own office) or to
move to a destination sampled from a distribution of destinations. In this last
case, the agent starts moving according to a specific path, usually the shortest
one, with possible random fluctuations. An agent also has a personal schedule in
which specific tasks are listed (e.g., meetings, lunch, coffee) with a
corresponding time and probability of performing that action. This schedule was
derived from samples of employees' electronic calendars and then integrated with
context knowledge, such as typical arrival, lunch, and departure times.
Furthermore, in case two or more agents cross paths in the same location, the
probability of staying was increased by a quantity, ∆p, specific for each agent.
This increase mimics the fact that random encounters may result in short
conversations. Its numerical value was derived from real data whenever available
and using context knowledge in the other cases. Fig. 1 shows a snapshot of the
simulation (at time 11 a.m.). The output of the agent-based system consisted of a
temporal sequence of matrices, which report the location of each agent for each
time step, with the same format as for the sensor network. This allowed us to use
the same analysis tools for both the agent-based model and for the real-life data
collected. Despite its simplicity, this model showed a visual agreement with the
trajectories observed in the real environment. We used this model to study the
evolution of social interactions.

Figure 1: Snapshot of the simulation (at time 11 a.m.). Numbers indicate
locations. Note that a meeting is taking place in the North-East corner room.

4 Social Network Analysis

Social network analysis provides a powerful tool for assessing patterns of
relationships in informal networks [5, 3]. The nodes in the network represent the
people and the links represent the interactions between the nodes. Social network
theory has a long history [11], but has only recently been able to take full
advantage of the large use of digital communications; the properties of such
networks have been extensively studied using data from emails [6, 9] and
instant-messaging [16]. In the first case an individual's emailing history is
analyzed and his or her connections are automatically generated and displayed as a
graph. Typical analyses include: the number of connections and frequency of
contacts, the diameter and cliqueness (i.e., degree of local clusters) of the
network, the time evolution of the network, and identifying the most-connected
nodes. The distribution of connections in social networks has often been shown to
follow a power law, i.e., the number of nodes with connectivity k falls as:

$$ n(k) \propto k^{\alpha} $$

where α is a negative constant whose magnitude is usually somewhere between 1 and
4. This leads to a scale-free network in which there are many nodes with few
connections as well as the existence of highly connected hubs, which foster
network cohesion and connections between distant nodes, even in very large
networks. Emails or instant-messaging logfiles provide a large source of data
about social relationships, and they give interesting results and potential
applications [17, 7], but



they do not consider physical interactions and face-to-face communications that
are at the basis of human behaviors. In this study, we focus on this last feature:
we estimate social relationships from patterns of collocation in the workplace.
This approach will be integrated with data collected from electronic
communications in future studies, to better specify the structure of the network
and to investigate the (possibly) different topologies of electronic and physical
social networks.
We inferred the structure of the social network in the office by simulating the
movement of a group of people for long periods and considering a simple proximity
rule: two individuals share a link if they spend enough time in the vicinity of
one another. In addition, we added to the system some context-specific rules,
e.g., we excluded the entrance hall. This simple rule can lead to a number of
false positives, e.g., two individuals may share the same location without
interacting. However, we expect that in the long run and with a large number of
users it provides a gross estimation of the global structure of the network of
interactions and of its evolution in time.
Fig. 3 illustrates the social network amongst two departments (Research and
Development) after one day of simulation; it shows, for example, that people from
different teams are loosely connected. Results vary across different runs but the
two-cluster structure was already present. Similar results were obtained by
analyzing the tracking data from the real-life sensor network (see Fig. 2 in
Ref. [2]).

Figure 3: Social network as extracted from movement data from one day of
simulation. Black and white circles indicate researchers and developers,
respectively.

We simulated one week of activity and measured the properties of the resulting
social network. Fig. 2 shows the degree of connectivity k versus the frequency of
nodes with degree k for one simulation. In general, the observed distributions do
not follow a power law (straight line in the log-log plots). This is probably due
to the limited sampling size: there are few agents and a short duration of the
simulation. In fact, due to the small size of the environment, this frequency
distribution converges to a delta after 6 to 8 weeks of simulation, at which point
every agent is directly connected to everybody else. Further investigations and
more experimental data are clearly required to fully characterize the topology of
this network, and to assess whether the structure of the social network in a
real-world physical space differs from those measured with email or chat log
files, where spatial extension and physical constraints are not taken into
account.

Figure 2: The degree of connectivity k of a node plotted against the frequency of
nodes with degree k, n(k), on a log-log scale. Each point represents data points
from one numerical simulation over a 7-day period.

Extending the period of simulation to 4 weeks, we observed that the network
becomes fully connected after 9 working days (on average), even if the clusters
corresponding to the different teams are still present at



the end of the simulation. This suggests that in small environments people get
connected in rather short amounts of time. To check this hypothesis, we simulated
the arrival of a new employee in the office and measured over time the number of
hops needed to connect him to all the other people in the office (shortest average
path length). Fig. 4 shows the average number of hops (links) needed for this new
joiner to connect to any other employee in the office (triangles indicate the
average over 50 simulations and bars correspond to the standard deviation). After
one month the new employee had directly interacted with all the people in the
office, i.e., Fig. 4 black triangles ≃ 1 at day 30. Excluding formal meetings from
this dataset, we can estimate the contribution of random encounters (square dots
in Fig. 4). Random encounters contribute largely to the increase in connectivity,
stressing the relevance of informal contacts in establishing a personal social
network. Indeed, considering random encounters only, the network becomes connected
after 13 days (on average) and after 30 days the new joiner is almost directly
connected to all the others (1.3 hops on average, Fig. 4).
Our current experimental setup does not permit long recordings, so we were not
able to compare the simulation results to experimental data.

5 Conclusions

Social interactions are highly important in collective activity, such as
goal-oriented work teams. In particular, despite the fact that many communication
media are accessible, face-to-face interactions still constitute one of the
preferred media for information transmission [1] and contribute to increasing the
cohesion within groups. Furthermore, it has been shown [10, 12] that the actual
physical context, such as the design of the environment and physical locations of
agents, can considerably impact human agent coordination.
Accordingly, suitable measures of social interactions in real environments are
needed to develop abstract models of team functioning. We previously developed a
prototype pervasive environment allowing the measuring of face-to-face
interactions inside one of our laboratories during normal office hours. In this
paper we presented an agent-based model for modeling people's movements and social
interactions in the same setting. This simulator uses a set of simple rules which
reproduce a person's trajectories inside the office, and provides a cheap and
flexible tool to develop and test pervasive environment and human interaction
models. In particular, we investigated the social interactions taking place during
normal work days.
The paper does not present a complete model for the dynamics of interactions, as
we did not consider, for example, digital communication media, and (more
importantly) we disregarded the content of the interactions. Still, the results of
this preliminary study show that it is technically possible to analyze the spatial
influence of the environment on the behavior of the people, and that relevant
numbers concerning face-to-face interactions in a real environment can easily be
generated. This provides important input for collective human behavior modeling,
as well as practical implications for evaluating the implementation of certain
measures such as office design, team building efforts, efficient information
transmission, and the correct integration of new joiners. The next step will be to
validate these results against real-life data from our experimental setup, and
possibly to extend the model to larger (and richer) environments. To this end,
privacy is clearly a major concern. Possible solutions include users controlling
the personal data released, limiting the data a single party can access, data
anonymization, and following accepted ethical guidelines. In applications where
real-time operation is not a requirement (as in our case for identifying social
networks), the users could have full control over the data released, e.g.,
receiving a weekly e-mail with the summary of events and deciding which of them to
disclose for the analysis. Even more important is finding a reasonable equilibrium
point in the trade-off between privacy and benefits. In other words, users need to
be provided a clear and tangible return for their privacy investment for the
system to gain acceptance.
Lastly, even an analysis in some specific cases (research laboratories,
conferences, public events [8, 4, 13]) will hopefully increase our (at the moment
very limited) quantitative knowledge on social interac-



tions and their effects on collective behaviors.

Figure 4: Triangles indicate the average number of hops to connect one worker with
all the workforce (average shortest path). In the x-axis the number of days are
reported, starting from day 8. Previous days are not shown because the network is
not fully connected. Vertical bars indicate standard deviations taken over 50
simulations. Square dots indicate the same quantity considering random encounters
only. Standard deviations for random encounters are not shown for clarity.
(Axes: Hops versus Days.)

Acknowledgments

Frederick Schlereth contributed to this study during his internship at Accenture
Technology Labs in Sophia Antipolis. We thank Valery Petrushin and Gang Wei for
providing tracking data obtained from Accenture Technology Labs in Chicago.

References

[1] T. Allen. Architecture and Communication Among Product Development Engineers.
MIT Press, Cambridge, MA, 1997.

[2] M. Bezzi and R. Groenevelt. Towards understanding and modeling office daily
life. In 2nd International Workshop on Exploiting Context Histories in Smart
Environments, also eprint arXiv: 0706.1926, 2006.

[3] K. Chan and J. Liebowitz. The synergy of social network analysis and knowledge
mapping: a case study. International Journal of Management and Decision Making,
7(1):19–35, 2006.

[4] S. Counts and J. Geraci. Incorporating physical co-presence at events into
digital social networking. In CHI '05: CHI '05 extended abstracts on Human factors
in computing systems, pages 1308–1311, New York, NY, USA, 2005. ACM Press.

[5] R. Cross and A. Parker. The Hidden Power of Social Networks: Understanding how
Work Really Gets Done in Organizations. Harvard Business School Press, 2004.

[6] H. Ebel, L.-I. Mielsch, and S. Bornholdt. Scale-free topology of e-mail
networks. Phys. Rev. E, 66(3):035103, Sep 2002.

[7] R. Guimera, B. Uzzi, J. Spiro, and L. A. N. Amaral. Team Assembly Mechanisms
Determine Collaboration Network Structure and Team Performance. Science,
308(5722):697–702, 2005.

[8] Q. Jones and S. A. Grandhi. P3 systems: Putting the place back into social
networks. IEEE Internet Computing, 9(5):38–46, 2005.

[9] H. Kautz, B. Selman, and M. Shah. Referral web: combining social networks and
collaborative filtering. Commun. ACM, 40(3):63–65, 1997.

[10] D. Kirsh. Distributed Cognition, Coordination and Environment Design.
Proceedings of the European conference on Cognitive Science, pages 1–11, 1999.

[11] S. Milgram. The small world problem. Psychology Today, 2(1):60–67, 1967.

[12] A. Omicini, A. Ricci, M. Viroli, C. Castelfranchi, and L. Tummolini.
Coordination Artifacts: Environment-Based Coordination for Intelligent Agents.
Proceedings of the Third International Joint Conference on Autonomous



Agents and Multiagent Systems-Volume 1, pages 286–293, 2004.

[13] A. Pentland, T. Choudhury, N. Eagle, and P. Singh. Human dynamics:
computation for organizations. Pattern Recogn. Lett., 26(4):503–511, 2005.

[14] V. Petrushin, R. Ghani, and A. Gershman. A Bayesian Framework for Robust
Reasoning from Sensor Networks. AAAI Spring Symposium on AI Technologies for
Homeland Security March, pages 21–23, 2005.

[15] V. Petrushin, G. Wei, R. Ghani, and A. Gershman. Multiple sensor integration
for indoor surveillance. Proceedings of the 6th international workshop on
Multimedia data mining: mining integrated media and complex data, pages 53–60,
2005.

[16] R. D. Smith. Instant messaging as a scale-free network. Arxiv preprint
cond-mat/0206378, 2002.

[17] F. Wu, B. Huberman, L. Adamic, and J. Tyler. Information flow in social
groups. Physica A: Statistical Mechanics and its Applications, 337(1-2):327–335,
2004.
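
As a rough illustration of the stay-or-move rule of the agent-based simulator and of the proximity rule used to infer links, the following Python sketch may help. It is not the simulator used in the paper: the names `simulate`, `colocation_links`, `p_stay`, `dp` and `min_steps`, and all numeric values, are hypothetical. Destinations are drawn uniformly and agents jump to them directly, whereas the paper describes data-derived destination distributions, path following and personal schedules.

```python
import random
from collections import defaultdict
from itertools import combinations

def simulate(n_agents=6, n_locations=5, n_steps=200, p_stay=0.8, dp=0.1, seed=0):
    """Return one list of agent locations per time step (simplified stay-or-move rule)."""
    rng = random.Random(seed)
    # Assumption: agent i's "own office" is location i % n_locations.
    home = [i % n_locations for i in range(n_agents)]
    loc = home[:]
    history = []
    for _ in range(n_steps):
        previous = loc[:]
        for a in range(n_agents):
            # Staying is more likely in one's own office (values are illustrative).
            stay = p_stay if previous[a] == home[a] else p_stay / 2
            # An encounter with another agent raises the probability of staying by dp.
            if previous.count(previous[a]) > 1:
                stay = min(1.0, stay + dp)
            if rng.random() >= stay:
                loc[a] = rng.randrange(n_locations)  # uniform destination (assumption)
        history.append(loc[:])
    return history

def colocation_links(history, min_steps):
    """Proximity rule: link two agents that share a location for at least min_steps steps."""
    together = defaultdict(int)
    for locs in history:
        for a, b in combinations(range(len(locs)), 2):
            if locs[a] == locs[b]:
                together[(a, b)] += 1
    return {pair for pair, count in together.items() if count >= min_steps}

history = simulate()
links = colocation_links(history, min_steps=20)
degree = defaultdict(int)
for a, b in links:
    degree[a] += 1
    degree[b] += 1
print(sorted(links))
print(dict(degree))
```

The per-agent degree computed at the end is the quantity k whose frequency n(k) is plotted in Fig. 2; the threshold `min_steps` plays the role of "enough time" in the proximity rule.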



NEW STOP & WAIT ARQ PROTOCOL

Nitin Jain, Rishi Asthana & Manuj Darbari


Uttar Pradesh Technical University, Lucknow
nitinjain_22@rediffmail.com, asthana_rishi@yahoo.com, manujuma@rediffmail.com

ABSTRACT
In all types of data communication systems, errors may occur. Therefore error
control is necessary for reliable data communication. Error control involves both
error detection and error correction. Conventionally, error detection is done with
Cyclic Redundancy Check (CRC) codes and error correction is performed by
retransmitting the corrupted data block, popularly known as Automatic Repeat
Request (ARQ). But CRC codes can only detect errors after the entire block of
data has been received and processed. In this work we use a new and “continuous”
technique for error detection, namely Continuous Error Detection (CED). The
“continuous” nature of error detection comes from using arithmetic coding. This
CED technique improves the overall performance of communication systems
because it can detect errors while the data block is being processed. We focus only
on ARQ based transmission systems. We will show how the proposed CED
technique can improve the throughput of ARQ systems by up to 15%.

Keywords: Cyclic redundancy check codes, arithmetic coding, automatic repeat request.

1 INTRODUCTION

When we talk about any type of data communication system, we are concerned
primarily with its reliability. In all types of data communication systems, errors
may occur. Error control is the way to deal with this problem. It works by
detecting the error in a first step and then correcting it in another step. For
error detection, CRC codes have been used, and for error correction the corrupted
data is retransmitted, which is popularly known as ARQ. Although efficient, CRCs
can detect errors only after an entire block of data has been received and
processed. An error detection scheme that is “continuous” can detect errors while
the block is being processed. Thus, it can enhance the overall performance of the
communication system.
In this paper, we use this type of new and continuous method for detecting errors.
The new method of error detection is based on the arithmetic coder, and allows for
an efficient way to detect errors continuously in the bit-stream by investing a
controlled amount of redundancy in the arithmetic coding operation. During our
research, we became aware that the idea of continuous error detection based on
arithmetic coding had been tackled earlier by Boyd et al. [1], albeit with little
system performance analysis or exposition of its utility in communication systems.
In this paper, we not only undertake a more rigorous analysis of this paradigm,
quantifying the underlying tradeoffs involved in the process, but also establish
impressive gains in system performance attainable through sophisticated
integration of this novel paradigm into popular, powerful transmission scenarios
such as ARQ. Upon applying this method of error detection to stop-and-wait ARQ,
gains in throughput were achieved over conventional ARQ schemes at all bit error
probabilities. Results show that the throughput of the new stop-and-wait ARQ
protocol, i.e. with CED, is approximately 15% higher than the throughput of the
conventional stop-and-wait ARQ protocol, i.e. with CRC.
The rest of the paper is organized as follows. The basic idea behind continuous
error detection is introduced in Section 2. Section 3 presents an application of
CED for ARQ transmission where it provides significant throughput gains over
conventional CRC-based schemes. We conclude in Section 4. References are given in
Section 5.

2 IDEA BEHIND CED

To understand the error detection scheme, an understanding of how the arithmetic
coder works is necessary. We assume that the reader is familiar with arithmetic
coding and refer readers that are unfamiliar with arithmetic coding to [2].
Arithmetic coding is a method of data compression in which data strings are mapped
to code strings which represent the probabilities of the corresponding data
strings. The method in which this mapping is achieved requires a model which
specifies the assumptions that are made about the source data. A simple example of
a model is the memoryless model where the current symbol being



encoded is independent of the previously encoded
symbols. Another simple example is the first-order
Markov model, where the current symbol being
encoded is dependent only on the previously encoded
symbol. For simplicity, we will examine the
memoryless model, keeping in mind that the analysis
generalizes to more sophisticated models. Thus,
encoding and decoding via the arithmetic coder
function by repetitively partitioning subintervals
within the unit interval [0, 1) according to the
probabilities of the data symbols.
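
To make the interval partitioning concrete, here is a minimal Python sketch of arithmetic coding under a memoryless model, using exact fractions. It is only an illustration of the general principle, not the coder used in this work: the three-symbol alphabet, its probabilities, and the function names `build_ranges`, `encode_interval` and `decode` are assumptions made for the example, and a practical coder would emit bits incrementally rather than keep an exact interval.

```python
from fractions import Fraction

def build_ranges(probs):
    """Assign each symbol a sub-interval [low, high) of the unit interval [0, 1)."""
    ranges, low = {}, Fraction(0)
    for sym, p in probs.items():
        ranges[sym] = (low, low + p)
        low += p
    return ranges

def encode_interval(message, probs):
    """Narrow [0, 1) once per symbol; any number in the final interval identifies the message."""
    ranges = build_ranges(probs)
    low, width = Fraction(0), Fraction(1)
    for sym in message:
        s_low, s_high = ranges[sym]
        low, width = low + width * s_low, width * (s_high - s_low)
    return low, low + width

def decode(value, probs, n_symbols):
    """Recover n_symbols symbols by repeatedly locating value and rescaling it."""
    ranges = build_ranges(probs)
    out = []
    for _ in range(n_symbols):
        for sym, (s_low, s_high) in ranges.items():
            if s_low <= value < s_high:
                out.append(sym)
                value = (value - s_low) / (s_high - s_low)
                break
    return "".join(out)

# Hypothetical memoryless source with three symbols.
probs = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
low, high = encode_interval("abac", probs)
print(low, high)               # final interval; its width is the product of the symbol probabilities
print(decode(low, probs, 4))   # recovers the original message
```

The width of the final interval equals the probability of the whole string, which is why more probable strings need fewer bits to identify.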
The basic idea is simple and consists of adding a “forbidden” symbol that is never
encoded by the arithmetic coder, but nonetheless occupies a nonzero measure on the
set [0, 1). Then, upon decoding, if an error occurs, this “forbidden” symbol is
guaranteed to be decoded due to the loss of synchronization. The amount of time it
takes to decode the “forbidden” symbol after the occurrence of an error is
inversely related to the amount of redundancy added through introducing the
“forbidden” symbol. This allows for control of the number of bits we suspect need
to be retransmitted. In fact, we can guarantee, to a specified accuracy, that
errors will be localized to the previous m bits (where m is a function of the
amount of redundancy added) upon decoding the “forbidden” symbol. This is useful
in an ARQ setting, because as soon as the error is detected, we have a statistical
assurance as to how many bits need to be retransmitted to ensure that the bit in
error will be retransmitted.

3 APPLICATION OF CED

Simulations were run using a binary symmetric channel at various bit-error
probabilities. Several ten-kilobit packets were sent at each bit-error
probability, and the resulting throughput was calculated. As a measure of
performance, we compared our method of ARQ, i.e. with CED, to the conventional
methods of ARQ, i.e. with CRC. The conventional methods of ARQ function by
dividing the data into packets and then attaching CRCs [3] to each packet. Upon
detection of an error in the conventional ARQ method, a retransmission of the
entire block is requested. To make a fair comparison of our method with
conventional ARQ methods, we used the optimal packet size for each bit-error
probability tested using conventional ARQ. The optimal packet size was calculated
by differentiating the throughput equation for conventional ARQ (details can be
found in [4]) with respect to the packet size, and solving for the packet size
which maximizes throughput. The resulting throughputs are shown in Fig. 1, and we
see that the new method of ARQ outperforms conventional ARQ methods at all
bit-error probabilities.

Figure 1: Throughput comparison curves for new stop-and-wait ARQ protocol, i.e.
with CED (upper curve in red color), versus conventional stop-and-wait ARQ
protocol, i.e. with CRC (lower curve in blue color).

4 CONCLUSION

In this paper we have introduced a new method of error detection for common ARQ
protocols. We analytically characterized the tradeoff of added redundancy versus
error-detection capability and formulated a method for incorporating this new
error detection “device” into an ARQ type scenario.
We would also like to mention here that CED can be put to good use to improve
throughput performance of transport protocols like TCP over heterogeneous
networks, where early detection of an error can result in a potentially greater
number of retransmits, thereby increasing the probability of successful reception
over a fading channel. This is currently being verified. The goal of this work is
to present the benefits that communication systems can derive from using CED for
throughput enhancement.

5 REFERENCES

[1] C. Boyd, J. Cleary, S. Irvine, I. Rinsma-Melchert, and I. Witten, “Integrating
error detection into arithmetic coding,” IEEE Trans. Commun., vol. 45, pp. 1–3,
Jan. 1997.
[2] G. Langdon, “An introduction to arithmetic coding,” IBM J. Res. Develop.,
vol. 28, pp. 135–149, Mar. 1984.
[3] T. Ramabadran and S. Gaitonde, “A tutorial on CRC computations,” IEEE Micro,
vol. 45, pp. 62–74, Aug. 1988.
[4] M. Schwartz, Telecommunication Networks: Protocols, Modeling and Analysis.
Reading, MA: Addison-Wesley, 1987.
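
The optimal packet size step of Section 3 can be sketched numerically. The Python example below assumes a simplified throughput model that is not taken from this paper or from [4]: an n-bit packet carries n − h useful bits (h being header and CRC overhead), succeeds with probability (1 − p)^n on a binary symmetric channel with bit-error probability p, and propagation and acknowledgment delays are ignored. Under that assumption, setting the derivative of the throughput with respect to n to zero gives a closed form. The names `throughput` and `optimal_packet_size` and the value h = 32 are hypothetical.

```python
import math

def throughput(n, h, p):
    """Assumed model: useful-bit fraction times the probability that all n bits arrive intact."""
    return (n - h) / n * (1.0 - p) ** n

def optimal_packet_size(h, p):
    """Packet size (in bits) that maximizes the assumed throughput model.

    With a = -ln(1 - p), d/dn [(1 - h/n) * exp(-a*n)] = 0 reduces to
    a*n**2 - a*h*n - h = 0, whose positive root is returned here.
    """
    a = -math.log(1.0 - p)
    return h / 2.0 + math.sqrt(h * h / 4.0 + h / a)

h = 32  # hypothetical overhead in bits, e.g. a 32-bit CRC
for p in (1e-5, 1e-4, 1e-3):
    n_opt = optimal_packet_size(h, p)
    print(f"p={p:g}  n_opt={n_opt:.0f} bits  throughput={throughput(n_opt, h, p):.3f}")
```

In this model the optimal packet shrinks as the channel gets noisier, which is why the comparison uses a different packet size at each bit-error probability.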



Cohesive Modeling, Analysis and Simulation for Spoken Urdu Language
Numbers with Fourier Descriptors and Neural Networks

Engr. S K Hasnain, Member IET, Assistant Professor,


Pakistan Navy Engineering College, Karachi
National University of Engineering & Technology, Pakistan
hasnain@pnec.edu.pk

Dr. Azam Beg, Member IEEE, Assistant Professor


College of Information Technology,
UAE University Al-Ain, United Arab Emirate
abeg@uaeu.ac.ae

ABSTRACT
This paper presents an investigative analysis of spoken Urdu numbers from ‘siffar’ (zero) to
‘nau’ (nine), intended as a concrete foundation for recognition of the Urdu language. Sound samples
from multiple speakers were utilized to extract different features using Fourier descriptors and
neural networks. Initial processing of the data, i.e. normalizing and time-slicing, was done using
Simulink in MATLAB. Afterwards, MATLAB toolbox commands were used for
calculation of Fourier descriptors and correlations. The correlation allowed comparison of the
same words spoken by the same and different speakers.

The analysis presented in this paper lays a foundation for developing an Urdu speech
recognition system. Feed-forward neural network models for speech recognition were
developed in MATLAB. The models and algorithm exhibited high
training and testing accuracies. Our future work involves the use of the TI
TMS320C6000 DSK series or linear predictive coding. Such a system can potentially be utilized in
implementing a voice-driven help setup in different systems, such as multimedia and
voice-controlled tele-customer services.

Keywords: Spoken number, Fourier, Correlation, Feature extraction, Feed-forward neural networks, Learning rate

1. INTRODUCTION

Automatic speech recognition has been an active research topic for more than four
decades. With the advent of digital computing and signal processing, the problem
of speech recognition was clearly posed and thoroughly studied. These developments
were complemented with an increased awareness of the advantages of conversational
systems. The range of possible applications is wide and includes: voice-controlled
appliances, fully featured speech-to-text software, automation of
operator-assisted services, and voice recognition aids for the handicapped [1].
The speech recognition problem has sometimes been treated as a speech-to-text
conversion problem.

Many researchers have worked in this regard. Some commercial software is also
available in the market for speech recognition, but mainly in English and other
European languages. While humans are speaking, the formants vary depending on the
position of the jaws, tongue and other parts of the vocal tract. Two related key
factors are: the bandwidth of each formant, and the formant's membership in a
known bandwidth. The vowels for all human beings tend to be similar [2].

Sounds are mainly categorized into these groups: voiced sounds (e.g., vowels and
nasals), unvoiced sounds (e.g., fricatives), and stop-consonants (e.g., plosives).
Speech starts in the lungs but is actually formed when the air passes through the
larynx and vocal tract. Depending on the status of the vocal folds in the larynx,
the sound can be grouped into: voiced sound that is
the sound can be grouped into: voiced sound that is



time-periodic in nature and harmonic in frequency; the unvoiced is more
noise-like. Speech modeling can be divided into two types of coding: waveform and
source. In the beginning, researchers tried to mimic the sound as it is, that is,
waveform coding. This method tries to retain the original waveform using
quantization and redundancy. An alternative approach breaks the sound up into
individual components which are later modeled separately. This method of utilizing
parameters is referred to as source coding. Different characteristics of speech
can be used to identify the spoken words, the gender of the speaker, and/or the
identity of the speaker. Two important features of speech are pitch and formant
frequencies:

(a) Pitch is a significant distinguishing factor among male and female speakers.
The frequency of vibration of the vocal folds determines the pitch; for example,
oscillation of the folds 300 times per second results in a 300 Hz pitch. Harmonics
(integer multiples of the fundamental frequency) are also created while the air
passes through the folds. Age also affects the pitch. Just before puberty, the
pitch is around 250 Hz. For adult males, the average pitch is 60 to 120 Hz, and
for females, it is 120 to 200 Hz.

(b) The vocal tract, consisting of the oral cavity, nasal cavity, velum,
epiglottis, and tongue, modulates the pitch harmonics created by the pitch
generator. The modulations depend on the diameter and length of the cavities.
These reverberations are called formant frequencies (or resonances). The harmonics
closer to the formant frequencies get larger while others are attenuated. While
humans are speaking, the formants vary depending on the positions of the tongue,
jaw, velum and other parts of the vocal tract. Two related key factors are: the
bandwidth of each formant, and the formant's membership in a known bandwidth. The
vowels for all human beings tend to be similar. Each vowel uttered by a person
generates different formants. So we can say that the vocal tract is a variable
filter, whose inputs are (1) the pitch, and (2) the pitch harmonics. The output of
the filter is the gain or the attenuation of the harmonics falling in different
formant frequencies. The filter is called the variable filter model. The transfer
function for the filter is determined by the formant frequencies [3].

Correlation exists between objects, phenomena, or signals and occurs in such a way
that it cannot be by chance alone. Unconsciously, correlation is used in everyday
life. When one looks at a person, car or house, one's brain tries to match the
incoming image with hundreds (or thousands) of images that are already stored in
memory [4]. We based our current work on the premise that the same word spoken by
different speakers is correlated in the frequency domain.

In the speech recognition research literature, no work has been reported on Urdu
speech processing. So we consider our work to be the first such attempt in this
direction. The analysis has been limited to number recognition. The process
involves extraction of some distinct characteristics of individual words by
utilizing discrete (Fourier) transforms and their correlations. The system is
speaker-independent and is moderately tolerant to background noise.

2. REVIEW OF DISCRETE TRANSFORMATION & ITS MATLAB IMPLEMENTATION

The discrete Fourier transform (DFT) is itself a sequence rather than a function
of a continuous variable, and it corresponds to equally spaced frequency samples
of the discrete-time Fourier transform of a signal. The Fourier series
representation of a periodic sequence corresponds to the discrete Fourier
transform of a finite-length sequence. So we can say that the DFT is used for
transforming a discrete time sequence x(n) of finite length into a discrete
frequency sequence X[k] of finite length [4]-[5].

The DFT is a function of complex frequency. Usually the data sequence being
transformed is real. A waveform is sampled at regular time intervals T to produce
the sample sequence of N sample values, where n is the sample number from n=0 to
n=N-1.

$$ \{x(nT)\} = x(0),\; x(T),\; \ldots,\; x[(N-1)T] $$

The data values x(nT) will be real only when representing the values of a time
series such as a voltage waveform. The DFT of x(nT) is then defined as the
sequence

$$ \{X[k\omega]\} = X(0),\; X(\omega),\; \ldots,\; X[(N-1)\omega] $$

in the frequency domain, where ω is the first harmonic frequency given by
ω = 2π / NT.

Thus X[kω] has real and imaginary components in general, so that for the kth
harmonic

$$ X(k) = R(k) + jI(k) $$

$$ |X(k)| = \left[ R^{2}(k) + I^{2}(k) \right]^{1/2} \qquad (2.1) $$

UbiCC Journal - Volume 3 77


and X[k] has the associated phase angle

$\phi(k) = \tan^{-1}[I(k) / R(k)]$ (2.2)

where X[k] is understood to represent X[kω]. These equations are therefore analogous to those for the Fourier transform. Note that N real data values (in the time domain) transform to N complex DFT values (in the frequency domain). The DFT values, X[k], are given by:

$F_D[x(nT)] = \sum_{n=0}^{N-1} x(nT)\, e^{-jk\omega nT}, \quad k = 0, 1, \ldots, N-1$ (2.3)

where ω = 2π / NT and F_D denotes the DFT. Substituting ω gives

$X[k] = \sum_{n=0}^{N-1} x(nT)\, e^{-jk2\pi n/N}$ (2.4)

The Fast Fourier transform (FFT) eliminates most of the repeated complex products in the DFT. The ratio of the computing times is

$\frac{\text{FFT computing time}}{\text{DFT computing time}} = \frac{1}{2N} \log_2 N$ (2.6)

In the C version of the signal processing algorithms, there are several different routines for real and complex versions of the DFT and FFT. When these routines are coded in the MATLAB language, they are very slow compared with the MATLAB fft routine, which is coded much more efficiently. Furthermore, the MATLAB routines are flexible and may be used to transform a real or complex vector of arbitrary length. They meet the requirements of nearly all signal processing applications; consequently, in this paper, the fft routines are preferred for all discrete transform operations. MATLAB's fft routine produces a one-dimensional DFT using the FFT algorithm; that is, when [XK] is a real sequence, fft produces the complex DFT sequence [Xm]. In MATLAB, the length N of the vector [XK] need not be given. Thus both of the following are legal expressions:

X=fft(x) (2.5)
X=fft(x, N)

The first expression in (2.5) produces a DFT with the same number of elements as in [XK], regardless of whether [XK] is real or complex. In the usual case where [XK] is real and of length N, the last N/2 complex elements of the DFT are conjugates of the first N/2 elements in reverse order, in accordance with (2.4). In the unusual case where [XK] is complex, the DFT consists of N independent complex elements. For example, the results of the following commands with N=4 can easily be verified using the definition in (2.4) [5]:

x=[1 1 0 0];
X=fft(x)

In this example, the DFT components [Xm] = [2, 1-j, 0, 1+j] are found from (2.4). The second expression in (2.5) specifies the value of N in (2.4), which effectively overrides any previous specification of the length of the vector x; thus the following commands produce the same result:

x=[1 1 0 0 3];
X=fft(x,4)

The DFT, [Xm], has length 4 and is the same as in the previous example.

x=[1 1];
X=fft(x, 4)

[Xm] = [2, 1-j, 0, 1+j]

The result here is the same because, when N is greater than the length of x, X is the DFT of a vector consisting of x extended with zeros on the right, from the length of x to N. (The length of the vector x itself is not increased in this process.) The MATLAB library also includes a two dimensional fft routine called fft2. The routine computes the two-dimensional FFT of any matrix, whose elements may be, for example, samples (pixel values) of a two dimensional image.

Usually, some recognition occurs when the incoming images bear a strong correlation with an image in memory that "best" corresponds to, fits, or is most similar to it. A similar approach is used in this investigation to measure the similarity between two signals. This process is known as autocorrelation if the two signals are exactly the same and as cross-correlation if the two signals are different. Since correlation measures the similarity between two signals, it is quite useful in identifying a signal by comparing it with a set of known reference signals. The reference signal that results in the highest value of the correlation with the unknown signal is most likely the identity of the unknown object [9].
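Matching an unknown signal against a set of known references can be sketched in a few lines of MATLAB. This is only an illustrative sketch and not the authors' code: the three synthetic reference signals, the small disturbance added to the unknown signal, and the variable names (refs, unknown, scores) are assumptions made for the example, and the correlation is evaluated at zero lag only and normalized by the signal energies.

```matlab
% Illustrative data (assumed, not the paper's recordings)
n = (0:63)';                               % sample index
refs = [sin(2*pi*4*n/64), ...              % reference 1: 4 cycles
        sin(2*pi*9*n/64), ...              % reference 2: 9 cycles
        cos(2*pi*15*n/64)];                % reference 3: 15 cycles
% Unknown signal: a scaled copy of reference 2 plus a small disturbance
unknown = 0.8*refs(:,2) + 0.05*cos(2*pi*3*n/64);

scores = zeros(1, size(refs, 2));
for k = 1:size(refs, 2)
    r = refs(:, k);
    % zero-lag correlation, normalized by the energies of both signals
    scores(k) = sum(unknown .* r) / sqrt(sum(unknown.^2) * sum(r.^2));
end

% The reference with the largest normalized correlation is the best match
[bestScore, bestIdx] = max(scores);
```

The normalization keeps each score between -1 and 1, so a louder or quieter copy of a reference still scores close to 1 against that reference.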



Correlation involves shifting, multiplication and addition (accumulation). The cross-correlation function (CCF) is a measure of the similarities or shared properties between two signals. Applications of the CCF include cross spectral density, detection and recovery of signals buried in noise (for example, the detection of return signals), pattern matching, and delay measurement. The general formula for the cross-correlation r_xy between two data sequences x(n) and y(n), each containing N data points, might therefore be written as:

$r_{xy} = \sum_{n=0}^{N-1} x(n)\, y(n)$ (2.7)

The autocorrelation function (ACF) involves only one signal and provides information about the structure of the signal or its behaviour in the time domain. It is a special form of the CCF and is used in similar applications. It is particularly useful in identifying hidden properties.

$r_{xx} = \sum_{n=0}^{N-1} x(n)\, x(n)$ (2.8)

3. FEEDFORWARD VS RECURRENT NETWORKS

Neural networks have proven to be a powerful tool for solving problems of prediction, classification and pattern recognition [12]-[17].

Neural network architectures can be divided into two principal types: recurrent and non-recurrent networks. An important sub-class of non-recurrent NN consists of architectures in which cells are organized into layers, and only unidirectional connections are permitted between adjacent layers. This is known as a feed-forward multilayer Perceptron (MLP) architecture. This architecture is shown in Fig 1.

Fig.1 Neural network model for feedforward multilayer Perceptron for analysis

4. DATA ACQUISITION AND PROCESSING

The system presented in this model was limited to processing of individual Urdu numerals (0 to 9). The data was acquired by speaking the numerals into a microphone connected to an MS-Windows-XP based PC. We recorded the data for fifteen speakers who uttered the same number set (0 to 9), specifically, siffar, aik, do, teen, chaar, paanch, chay, saat, aath, and nau. Each sound sample was curtailed to 0.9 minute.

Fig.2 Simulink model for data of numbers extraction for analysis

One of the obvious methods of speech data acquisition is to have a person speak into an audio device such as a microphone or telephone. This act of speaking produces a sound pressure wave that forms an acoustic signal. The microphone or telephone receives the acoustic signal and converts it into an analog signal that can be understood by an electronic system. Finally, in order to store the analog signal on a computer, it must be converted to a digital signal [6],[8].

The data in this paper is acquired by speaking Urdu numbers into a microphone connected to an MS-Windows-XP based PC. The data is saved into '.wav' format files. The sound files are processed after passing through a (Simulink) filter, and are saved for further analysis. We recorded the data for fifteen speakers who spoke the same number set, i.e. zero to nine.



In general, the digitized speech waveform has a high dynamic range and can suffer from additive noise. So first, a Simulink model was used to extract and analyze the acquired data as shown in Fig.3. Another Simulink model was developed for performing analysis such as standard deviation, mean, median, autocorrelation, magnitude of FFT and data matrix correlation. We also tried a few other statistical techniques; however, most of them failed to provide any useful insight into the data characteristics. We would also like to mention that we had started our experiments by using Simulink, but found this GUI-based tool to be somewhat limited because we did not find it easy to create multiple models containing variations among them. This iterative and variable nature of the models eventually led us to MATLAB's (text-based) .m files. We created these files semi-automatically by using a PERL-language script; the script was developed specifically for this purpose. Three main data pre-processing steps were required before the data could be used for analysis:
4.1 Pre-Emphasis

By pre-emphasis [7], we mean the application of a normalization technique, which is performed by dividing the speech data vector by its highest magnitude.
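The normalization step can be sketched as follows. This is a minimal illustration under stated assumptions: the sample values and the names x, peak and x_norm are hypothetical, and the guard for an all-zero vector is an addition that the text does not mention.

```matlab
x = [0.02; -0.5; 0.25; 0.1];   % hypothetical speech samples (column vector)
peak = max(abs(x));            % highest magnitude, ignoring the sign
if peak > 0
    x_norm = x / peak;         % samples now lie in the range [-1, 1]
else
    x_norm = x;                % an all-zero vector is left unchanged
end
```

Taking max(abs(x)) rather than max(x) matters when the largest excursion of the waveform is negative, as it is in this example.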
4.2 Data Length Adjustment

FFT execution time depends on the exact number of samples (N) in the data sequence [XK]; the execution time is minimal, and proportional to N*log2(N), when N is a power of two. Therefore, it is often useful to choose the data length equal to a power of two [10].
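One way to obtain a power-of-two length is to let fft zero-pad the data. The sketch below is illustrative only: the 1000-sample vector is a made-up stand-in for a recording, and padding (rather than truncating) is an assumption, since the text does not say which was used.

```matlab
x = (1:1000)';              % hypothetical data vector with N = 1000 samples
N = length(x);
Npad = 2^nextpow2(N);       % smallest power of two that is >= N
X = fft(x, Npad);           % fft pads x with zeros on the right up to Npad points
```

Zero-padding does not change the information in the signal; it only evaluates the same spectrum on a finer frequency grid.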
Fig.3 Extended Simulink model for analysis of Urdu spoken numbers

4.3 Endpoint Detection

The goal of endpoint detection is to isolate the word to be detected from the background noise. It is necessary to trim the word utterance to its tightest limits, in order to avoid errors in the modeling of subsequent utterances of the same word. As we can see from the upper part of Fig. 4, a threshold has been applied at both ends of the waveform. The front threshold is normalized to a value at which all the spoken numbers are trimmed to a maximum value. These values were obtained after observing the behavior of the waveform and noise in a particular environment. We can see the difference in frequency characteristics of the words aik (one) to paanch (five) in Fig 4-8 respectively.
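A simple amplitude-threshold trimmer illustrates the idea of endpoint detection. This is a sketch under assumptions, not the authors' procedure: the synthetic test signal, the threshold value of 0.05 and the names thresh, active, startIdx and endIdx are all hypothetical, and a practical detector would usually work on short-time energy rather than on single samples.

```matlab
fs = 22050;                                   % sampling rate of the recordings (Hz)
n = (0:999)';
% Synthetic "word" (a 300 Hz tone) surrounded by silence
x = [zeros(200,1); 0.6*sin(2*pi*300*n/fs); zeros(300,1)];
x(50) = 0.01;                                 % low-level background noise sample

thresh = 0.05;                                % hypothetical threshold, chosen per environment
active = find(abs(x) > thresh);               % samples that exceed the threshold
if isempty(active)
    word = [];                                % nothing above the threshold
else
    startIdx = active(1);                     % front endpoint
    endIdx = active(end);                     % back endpoint
    word = x(startIdx:endIdx);                % utterance trimmed to its limits
end
```

The noise sample at index 50 stays below the threshold, so the trimmed segment starts just after index 200 and ends at the last tone sample.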

4.4 Windowing

Speech signal analysis also involves application of a window whose duration is shorter than the complete signal. The window starts at the beginning of the signal and is then shifted until it reaches the end. Each application of the window to a part of the speech signal results in a spectral vector.

Fig.4 Correlation of the spoken Urdu number aik (one)

4.5 Frame Blocking

Since the vocal tract moves mechanically slowly, speech can be assumed to be a random process with slowly varying properties. Hence the speech is divided into overlapping frames of 100 ms. The speech signal is assumed to be stationary over each frame, and this property will prove useful in further operations [18].
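Windowing and frame blocking can be sketched together as below. Only the 100 ms frame length comes from the text; the 50% overlap, the Hamming window (written out explicitly so that no toolbox is needed) and the synthetic test tone are assumptions made for illustration.

```matlab
fs = 22050;                               % sampling rate (Hz)
x = sin(2*pi*200*(0:fs-1)'/fs);           % one second of a synthetic 200 Hz tone
frameLen = round(0.100*fs);               % 100 ms frames, as stated in the text
hop = round(frameLen/2);                  % 50% overlap (assumed)
% Hamming window written out from its definition
w = 0.54 - 0.46*cos(2*pi*(0:frameLen-1)'/(frameLen-1));

numFrames = floor((length(x) - frameLen)/hop) + 1;
spec = zeros(frameLen, numFrames);
for m = 1:numFrames
    idx = (m-1)*hop + (1:frameLen)';      % sample indices of frame m
    spec(:, m) = abs(fft(x(idx) .* w));   % one spectral vector per frame
end
```

Each column of spec is the magnitude spectrum of one windowed frame, which is the "spectral vector" referred to in the windowing step.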

Fig.5 Correlation of the spoken Urdu number dau (two)

4.6 Fourier Transform

The MATLAB algorithm for the two dimensional FFT routine is as follows [9]:

fft2(x) = fft(fft(x).').'

Thus the two dimensional FFT is computed by first computing the FFT of x, that is, the FFT of each column of x, and then computing the FFT of each row of the result. Note that as the application of the fft2 command produced even symmetric data, we only show the lower half of the frequency spectrum in our graphs.
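The column-then-row decomposition of the two dimensional FFT can be checked numerically. The matrix A and the vector v below are arbitrary examples chosen for this sketch; the comparison uses the non-conjugate transpose operator (.').

```matlab
A = [1 2 3; 4 5 6; 7 8 10; 0 1 0];    % small example matrix (arbitrary values)
F1 = fft2(A);                         % built-in two dimensional FFT
F2 = fft(fft(A).').';                 % FFT of each column, then FFT of each row
err = max(abs(F1(:) - F2(:)));        % largest difference between the two results

v = (1:8)';                           % a column vector, like a single-channel recording
errVec = max(abs(fft2(v) - fft(v)));  % for a column vector, fft2 reduces to fft
```

The second check is relevant when fft2 is applied to a one-channel speech vector: the row transforms then have length one, so the result equals the ordinary one dimensional FFT.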

Fig.6 Correlation of the spoken Urdu number teen (three)

Fig.7 Correlation of the spoken Urdu number char (four)

4.7 Correlation

Calculations for correlation coefficients of different speakers were performed [2]-[3]. As expected, the correlation of the same speaker for the same word came out to be 1. The correlation matrix of a spoken number was generated in a three-dimensional form for generating different simulations and graphs.

Figures 4 - 8 show the combined correlation extracted for fifteen different speakers for the Urdu numbers aik (one) to paanch (five), which differentiates the number spoken by different speakers for detailed analysis.
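A correlation matrix between speakers can be sketched with MATLAB's corrcoef function. The data here are synthetic: base is a made-up spectral shape standing in for one word, and the three columns of S imitate three speakers with slightly different versions of it. None of these values come from the paper's recordings.

```matlab
n = (0:255)';
% Hypothetical magnitude-spectrum shape for one word (two smooth peaks)
base = exp(-((n-60)/15).^2) + 0.5*exp(-((n-140)/25).^2);
% Three "speakers": scaled copies of the shape with small deviations
S = [base, 0.9*base + 0.02*sin(n/7), 1.1*base + 0.03*cos(n/11)];

R = corrcoef(S);     % R(i,j) is the correlation coefficient of columns i and j
```

The diagonal of R is 1 (each column correlated with itself), and the off-diagonal entries are close to 1 here because the columns share the same underlying shape.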



Fig.8 Correlation of the spoken Urdu number paanch (five)

Figures 9 - 14 show the FFT magnitude spectrum extracted for fifteen different speakers for the Urdu number siffar (zero) to paanch (five), that again differentiate the number spoken by different speaker for detailed analysis.

Fig.9 FFT magnitude spectrum for the spoken Urdu number siffar (zero) (panel title: SPEAKER: s1 w0)

Fig.10 FFT magnitude spectrum for spoken Urdu number aik (one) (panel title: SPEAKER: s1 w1)

Fig.11 FFT magnitude spectrum for spoken Urdu number dau (two) (panel title: SPEAKER: s1 w2)

Fig.12 FFT magnitude spectrum for spoken Urdu number teen (three) (panel title: SPEAKER: s1 w3)

Fig.13 FFT magnitude spectrum for spoken Urdu number char (four) (panel title: SPEAKER: s1 w4)

Fig.14 FFT magnitude spectrum for spoken Urdu number paanch (five) (panel title: SPEAKER: s1 w5)



Figures 15-19 show the combined magnitude spectrum of correlation extracted for fifteen different speakers for the Urdu numbers siffar (zero) to paanch (five), shown here for brevity, which differentiates the number spoken by different speakers for detailed analysis, while Fig 20 shows the surface graph of the number aik (one).

Fig.15 The waveform of the correlation of the spoken Urdu number siffar (zero) by speaker-1 (panel title: SPEAKER ONE: s1 w0)

Fig.16 The waveform of the correlation of the spoken Urdu number siffar (zero) by speaker-2 (panel title: SPEAKER TWO: s2 w0)

Fig.17 Magnitude spectrum of the spoken Urdu number siffar (zero) by speaker-3 (panel title: SPEAKER THREE: s3 w0)

Fig.18 Magnitude spectrum of the correlation of the spoken Urdu number chaar (four) by speaker-1 (panel title: SPEAKER ONE: s1 w4)

Fig.19 Magnitude spectrum of the correlation of the spoken Urdu number paanch (five) by speaker-1 (panel title: SPEAKER ONE: s1 w5)

Fig.20 The surface plot of the correlation of the spoken Urdu number aik (one) by speaker-15 (panel title: SPEAKERS 15 WORD 1 surface)

4.8 Creating a Network

A feedforward neural network (MLP) object can be created interactively by using the command nntool to open the Neural Network Manager. We can then import or define data and train our network in the GUI, or export our network to the command line for training [5].

net = newff(MnMx, [S1 S2 . . . SK], {TF1 TF2 . . . TFK}, BTF, BLF, PF)



%where MnMx (p − 1 × 2) matrix of min and max values for x input vectors.
%Si Size of ith layer, for K layers.
%TFi Transfer function of ith layer, default = tansig.
%BTF Backpropagation network training function, default = traingdx.
%BLF Backpropagation weight/bias learning function, default = learngdm.
%PF Performance function, default = mse.

4.9 Network Structure and Parameters

%Neural Network object:
...
%subobject structures:
inputs: {1x1 cell} of inputs
layers: {2x1 cell} of layers
outputs: {1x2 cell} containing 1 output
targets: {1x2 cell} containing 1 target
biases: {2x1 cell} containing 2 biases
inputWeights: {2x1 cell} containing 1 input weight
layerWeights: {2x2 cell} containing 1 layer weight
...
%weight and bias values:
IW: {2x1 cell} containing 1 input weight matrix
LW: {2x2 cell} containing 1 layer weight matrix
b: {2x1 cell} containing 2 bias vectors

4.10 Network Training / Testing

In incremental training the weights and biases of the network are updated each time an input is presented to the network. The cell array format is most convenient for networks with multiple inputs and outputs, and allows sequences of inputs to be presented. The matrix format can be used if only one time step is to be simulated, so that sequences of inputs cannot be described. There are two basic functions available for incremental training: learngd and learngdm. Refer to the manual for details.

% learning rate
net.trainParam.lr = 0.01;
net.trainParam.epochs = 6000;
% tolerance
net.trainParam.goal = 0.01;
net=train(net,p,t);
% test the network with the training inputs
y_start = sim(net,p);
figure(200),plot(t,'g');

4.11 System Workflow

This section describes the system workflow for creating, initializing, training and running the network on test data. The following sub sections use the Word Network to demonstrate the system workflow of the demo program.

The first step is to create and initialize the Neural Network and set up the network configuration.

net=newff(minmax_of_p,[64, 35, 35, 1], {'tansig','tansig','tansig','purelin'},'traingd');
% input layer:          Layer = 64
% hidden layer no. 1:   Layer = 35
% hidden layer no. 2:   Layer = 35
% output layer:         Layer = 1

4.12 Data Structure Training

The training data is an array list of a structure which is presented to the network. Each of the initial four string values is part of the input vector provided to the network. The last string variable represents the target output for the network in the code fragment below:

net.trainParam.show = 50;
% x-axis of graph is 50 values apart
net.trainParam.lr = 0.01;
% learning rate
net.trainParam.epochs = 6000;
net.trainParam.goal = 0.01;
% tolerance
net=train(net,p,t);

4.13 Setting and Loading Training Data

The following code loads the default training data into the training data list. The default training data includes hard-coded input and output patterns for a number of different homonym word scenarios along with their corresponding suggested output.

% Minimum array size has been kept 1e10
min_arr_size = 1e10;
% Read data from .wav files and calculate magnitudes from FFTs
% 15 speakers have spoken the same word pronounced as "siffar"
s1_w0 = wavread('s1_w0.wav');



% normalize by the highest magnitude, then take the FFT magnitude
s1_w0_norm = s1_w0/max(abs(s1_w0));
s1_w0_fft0 = abs(fft2(s1_w0_norm));
% track the shortest recording length seen so far
min_arr_size = min(min_arr_size, length(s1_w0_norm));
…………..
s15_w9 = wavread('s15_w9.wav');
s15_w9_norm = s15_w9/max(abs(s15_w9));
s15_w9_fft0 = abs(fft2(s15_w9_norm));
min_arr_size = min(min_arr_size, length(s15_w9_norm));

4.14 Network Training

Before using the Neural Networks in the demo program to detect and classify the correct word structure, as well as to find the alternate word for an incorrectly recognized word (by the Speech Recognition engine), we need to train the Word Network with the training data. The train network uses the object to train the word frequency and network 6000 and 1500 times respectively. The following code snippet shows how the word network gets trained.

cutoff = 64;
% use the log of the first 'cutoff' FFT magnitudes of each of the 150 recordings
for i=1:150
    for j=3:cutoff+2
        ptrans(i,j-2) = log(alldat(i,j));
    end
end

p = transpose(ptrans); % inputs
t = transpose(alldat(:,2)); % target array

% per-feature minimum and maximum, as required by newff
for i = 1:cutoff
    minmax_of_p(i,1) = min(p(i,:));
end

for i = 1:cutoff
    minmax_of_p(i,2) = max(p(i,:));
end

In the above code, notice that the first line in the function is the training of the word structure network, which the Word Network uses to first detect the problem in the network and then try to identify the alternate word for the incorrect word in the word engine.

4.15 Pattern Detecting

Once the network is trained, it can be used to detect the correct output for the given input vector. The following functions take the input pattern as an array and use the function to retrieve the output from the network.

% Creating new array 'alldat(:,:)' for NN use ...
% column 1: speaker number, column 2: spoken word, columns 3 to 258: FFT magnitudes

alldat(1,1) = 1;
alldat(1,2) = 1;
alldat(1,3:258) = s1_w1_fft0(1:256);

alldat(2,1) = 1;
alldat(2,2) = 2;
alldat(2,3:258) = s1_w2_fft0(1:256);

alldat(3,1) = 1;
alldat(3,2) = 3;
alldat(3,3:258) = s1_w3_fft0(1:256);

4.16 Setting and Loading Test Data

The demo program is capable of loading the required input for the neural network from the wave file. The wave file that contains the input sentences to be detected by the neural network, however, should be in a specific format so that the program can correctly load and feed the input data to the neural network and generate the correct output.

load 'neural_network.mat'
% The data of 15 speakers spoken in Urdu from siffar (zero) to nau (nine).
% Minimum array size has been kept 1e10

min_arr_size = 1e10;

% Read data from .wav files and calculate magnitudes from FFTs
% 15 speakers have spoken the same word pronounced as "siffar"
s1_w0 = wavread('s1_w0.wav');
s1_w0_norm = s1_w0/max(abs(s1_w0));
s1_w0_fft0 = abs(fft2(s1_w0_norm));
min_arr_size = min(min_arr_size, length(s1_w0_norm));



% Creating new array 'alldat(:,:)' for NN use ...

alldat(1,1) = 1;
alldat(1,2) = 1;
alldat(1,3:258) = s1_w1_fft0(1:256);
alldat(2,1) = 1;
alldat(2,2) = 2;
alldat(2,3:258) = s1_w2_fft0(1:256);

alldat(3,1) = 1;
alldat(3,2) = 3;
alldat(3,3:258) = s1_w3_fft0(1:256);

% min_arr_size = min_arr_size/2;

min_arr_size = 11000;

% 15 speakers have spoken the same word pronounced as "siffar".
% Find the minimum array length so that all arrays are clipped to it.

s1_w0_fft(1:min_arr_size,1)=s1_w0_fft0(1:min_arr_size, 1);
s2_w0_fft(1:min_arr_size,1)=s2_w0_fft0(1:min_arr_size, 1);
s3_w0_fft(1:min_arr_size,1)=s3_w0_fft0(1:min_arr_size, 1);
s4_w0_fft(1:min_arr_size,1)=s4_w0_fft0(1:min_arr_size, 1);

% testing the network
y_start = sim(net,p);
figure(200),plot(t,'g');
hold on;
plot(y_start,'b');
legend('actual','predicted');
hold off

For the above model we can examine the structure of the created network nnet1 and all its initial parameters. Typing nnet1 on the command line gives the structure of the top level neural network object. The extract of the structure is as follows:

5. ANALYSIS & RESULTS

When we compared the frequency content of the same word by different speakers, we found striking similarities among them. This gave us more confidence in our initial hypothesis that a single word uttered by a diverse set of speakers would exhibit similar characteristics. Additionally, Fig.20 shows the surface graph for the correlation of frequency content among different speakers, for the word aik (one).

The effect of learning rate (α) on training is shown in Fig.21-30. In our experiments, a larger value of α took less time to train, although convergence to the steady state value was noisy. This shows that there exists a trade-off between the learning rate (α), the network parameter setting and the number of epochs for which the algorithm was used.

We observed that the Fourier descriptor feature was independent of the spoken numbers. With the combination of the Fourier transform and correlation technique commands used in MATLAB, a high accuracy recognition system can be realized. Recorded data was used in the Simulink model for introductory analysis. The feedforward neural network was trained for different learning rates, goals and epochs. It was found that there is a trade-off between learning rate, goal and epochs. The network was trained and tested as shown in Fig 21-30.

Fig.21 Training on Class 5 with 12 speakers; Layers (64, 35, 35, 1), Learning rate 0.01, Goal 0.01 (panel title: Performance is 0.0089228, Goal is 0.01; 8 Epochs; legend: Training-Blue, Goal-Black)

Fig.22 Training on number 6 with 12 speakers; Layers (64, 35, 35, 1), Learning rate 0.01, Goal 0.01 (panel title: Performance is 0.00942263, Goal is 0.01; 10 Epochs; legend: Training-Blue, Goal-Black)

Fig.23 Training on number 7 with 12 speakers; Layers (64, 35, 35, 1), Learning rate 0.01, Goal 0.01 (panel title: Performance is 0.00979687, Goal is 0.01; 65 Epochs)

Fig.24 Training on number 8 with 12 speakers; Layers (64, 35, 35, 1), Learning rate 0.01, Goal 0.01 (panel title: Performance is 0.00912022, Goal is 0.01; 11 Epochs)

Fig.25 Training on number 9 with 12 speakers; Layers (64, 35, 35, 1), Learning rate 0.01, Goal 0.01 (panel title: Performance is 0.00967774, Goal is 0.01; 12 Epochs)

Fig.26 Training on 15 speakers; Layers (64, 35, 35, 1), Learning rate 0.01, Goal 0.01 (panel title: Performance is 0.00987841, Goal is 0.01; 2459 Epochs)

Fig.27 Training on 12 speakers; Layers (64, 35, 35, 1), Learning rate 0.01, Goal 0.01 (panel title: Performance is 0.00994686, Goal is 0.01; 2285 Epochs)

Fig.28 Training on 10 speakers; Layers (64, 35, 35, 1), Learning rate 0.01, Goal 0.01 (panel title: Performance is 0.00955099, Goal is 0.01; 1569 Epochs)



Fig.29 Training on 7 speakers; Layers (64, 35, 35, 1), Learning rate 0.01, Goal 0.01 (panel title: Performance is 0.0098974, Goal is 0.01; 1246 Epochs)

Fig.30 Training on 5 speakers; Layers (64, 35, 35, 1), Learning rate 0.01, Goal 0.01 (panel title: Performance is 0.00996304, Goal is 0.01; 1243 Epochs)

We created and tested the networks in different configurations, especially with respect to the hidden layer size. The following table shows the learning accuracy for some of the networks. A maximum accuracy of 94% was achieved with a double-hidden-layer network (64, 35, 35, 1). Therefore two hidden layers should suffice for this application, as shown in table 1.

S No    Neuron count       Learning accuracy
1       64, 10, 10         81.48%
2       64, 20, 10         81.48%
3       64, 30, 10         72.59%
4       64, 40, 10         83.70%
5       64, 60, 10         62.96%
6       64, 80, 10         45.93%
7       64, 10, 10, 10     64.44%
8       64, 15, 15, 10     69.63%
9       64, 20, 20, 10     58.52%
10      64, 25, 20, 10     71.85%
11      64, 25, 25, 10     71.85%
12      64, 30, 30, 1      72.59%
13      64, 35, 35, 1      94%

Table 1. Training of Neural Network with different number of speakers

The feedforward neural network was trained for different numbers of speakers with the same layers, learning rate, goal and epochs. It was found that there is a trade-off between neural network, learning rate, goal and epochs, as shown in table-2.

S No    No of training speakers    No of testing speakers    Accuracy rate (%)
1       15                         5                         94
2       12                         4                         95
3       10                         3                         96.7
4       7                          2                         95
5       5                          2                         100

Table 2. Training of Neural Network with different number of speakers showing accuracy rate.

The relationship between network layers and training data can be formulated from the effect of learning rate (α) on training, which is shown in Fig 21-30. In our experiments, a large value of α took less time to train, although convergence to the steady state value was noisy. This shows that there exists a trade-off between the layers of the network (network parameter setting), the learning rate (α), and the number of epochs for which the algorithm was used.

6. CONCLUSION

In this paper, we presented a recognition analysis of the Urdu numbers siffar to nau (zero to nine), which is a unique idea. The data was acquired in a moderately noisy environment from word utterances of 15 different speakers. The FFT algorithm was used in MATLAB to analyze the data. As expected, we found high correlation among the frequency contents of the same word when spoken by many different speakers.

We have investigated the creation of neural network models for automatically recognizing individual Urdu numbers. The feedforward neural network was trained for different learning rates, combined and individually, for different goals and



epochs. It was found that there is a trade off between learning rate, goal and epochs. It means all values to be adjusted at a level of learning rate, parameter goals and epochs. We are still in further development stage of the system using TI TMS 320C6713. This recognition system could be of many potential applications, for example, voice-driven menu selection in a telephone-based customer service in Urdu speaking countries such as Pakistan/India.

7. REFERENCES

[1] S K Hasnain, Azam Beg and Samiullah Awan, "Frequency Analysis of Urdu Spoken Numbers Using MATLAB and Simulink," Journal of Science & Technology PAF KIET, ISSN 1994-862x, Karachi, Dec. 2007.

[2] D. O'Shaughnessy. Speech Communication: Human and Machine. Addison Wesley Publishing Co., 1987.

[3] T. Parsons. Voice and Speech Processing. McGraw-Hill College Div., Inc, 1986.

[4] S K Hasnain, Pervez Akhter, "Digital Signal Processing, Theory and Worked Examples," 2007.

[5] MATLAB User's Guide. Mathworks Inc., 2006.

[6] DSP Blockset (For use with Simulink) User's Guide. Mathworks Inc., 2007.

[7] Samuel D Stearns, Ruth A David, "Signal Processing Algorithms in MATLAB," Prentice Hall, 1996.

[8] "TMS320C6713 DSK User's Guide," Texas Instruments Inc., 2005.

[9] G. C. M. Fant. Acoustic Theory of Speech Production. Mouton, Gravenhage, 1960.

[10] S K Hasnain, Aisha Tahir, "Digital Signal Processing Laboratory Workbook," 2006.

[11] S. Varho. New Linear Predictive Systems for Digital Speech Processing. PhD dissertation, Helsinki University of Technology, Finland, 2001.

[12] A. Beg, W. Ibrahim. An Online Tool for Teaching Design Trade-offs in Computer Architecture. In Proc. International Conference on Engineering Education, Coimbra, Portugal, September 3-7, 2007.

[13] A. Beg. Predicting Processor Performance with a Machine Learnt Model. IEEE International Midwest Symposium on Circuits and Systems, MWSCAS/NEWCAS 2007, Montreal, Canada, August 5-8, 2007, pp. 1098-1101.

[14] A. Beg, P.W.C. Prasad, S.M.N.A. Senanayake. Learning Monte Carlo Data for Circuit Path Length. In Proc. International Conference on Computers, Communications & Control Technologies, CCCT 2007, Orlando, Florida, July 12-14, 2007.

[15] A. Beg, P.W.C. Prasad, M. Arshad, S K Hasnain. Using Recurrent Neural Networks for Circuit Complexity Modeling. In Proc. IEEE INMIC Conference, Islamabad, Pakistan, December 23-24, 2006, pp. 194-197.

[16] S K Hasnain, Nighat Jamil, "Implementation of Digital Signal Processing real time Concepts Using Code Composer Studio 3.1, TI DSK TMS 320C6713 and DSP Simulink Blocksets," IC-4 conference, Pakistan Navy Engineering College, Karachi, Nov. 2007.

[17] M M El Choubassi, H E El Khoury, C E Jabra Alagha, J A Skaf, M A AL Alaoui, "Arabic Speech Recognition Using Recurrent Neural Networks," Symp. Signal Processing & Info. Tech., 2003, ISSPIT 2003, Dec. 2003, pp. 543-547.

[18] M A Al-Alaoui, R Mouci, M M Mansour, R Ferzli, "A Cloning Approach to Classifier Training," IEEE Trans. Systems, Man and Cybernetics – Part A: Systems and Humans, vol. 32, no. 6, pp. 746-752, 2002.
