www.ietdl.org

Published in IET Circuits, Devices & Systems


Received on 11th March 2008
Revised on 8th October 2008
doi: 10.1049/iet-cds:20080072

ISSN 1751-858X

Field programmable gate array-based design and realisation of automatic censored cell averaging constant false alarm rate detector based on ordered data variability
A.M. Alsuwailem¹, S.A. Alshebeili¹, M.H. Alhowaish², S.M. Qasim¹
¹ Department of Electrical Engineering, King Saud University, Riyadh 11421, Saudi Arabia
² Communications and Information Technology Commission, Riyadh, Saudi Arabia
E-mail: smanzoor@ksu.edu.sa

Abstract: The design and field programmable gate array (FPGA)-based realisation of an automatic censored cell averaging (ACCA) constant false alarm rate (CFAR) detector based on ordered data variability (ODV) is discussed here. The ACCA–ODV CFAR algorithm has been recently proposed in the literature for detecting radar targets in non-homogeneous background environments. The ACCA–ODV detector estimates the unknown background level by dynamically selecting a suitable set of ranked cells and doing successive hypothesis tests. The proposed detector does not require any prior information about the background environment. It uses the variability index statistic as a shape parameter to accept or reject the ordered cells under investigation. Recent advances in FPGA technology and the availability of sophisticated design tools have made it possible to realise the computation-intensive ACCA–ODV detector in hardware, in a cost-effective way. The architecture is modular and has been implemented and tested on an Altera Stratix II FPGA using Quartus II software. The post place-and-route results show that the proposed design can operate at 100 MHz, the maximum clock frequency of the prototyping board, and for this frequency the total processing time required to perform a single run is 0.21 µs. This amounts to a speedup for the FPGA-based hardware implementation by a factor of 110 as compared to the software-based implementation, which takes 23 µs to perform the same operation.

1 Introduction

Radar is an acronym for Radio Detection and Ranging. Radar is an electromagnetic system used for the detection, location and sometimes the recognition of objects (or targets). It operates by transmitting electromagnetic energy and then extracting the necessary information about the target from the returned echo signal. Radar system analysis provides quantitative estimates of performance in all the radar's functions, for a variety of target types in a variety of complex environments. Radar has been used, or proposed for use, in a wide range of applications, both in military and civilian systems [1].

The received signal in a radar system is always accompanied by thermal noise and clutter. Clutter is the term applied to any unwanted radar signal from scatterers that are not of interest to the radar user. Examples of clutter in radar signal detection are reflections from terrain, sea, rain, birds, insects, chaff and so on [2]. The performance of the radar receiver is greatly dependent on the presence of such disturbances, and the receiver is desired to achieve a constant false alarm rate (CFAR) and maximum probability of target detection. In a radar system, a target is detected when the output of the receiver crosses a predetermined fixed threshold level set to achieve a specified probability of false alarm (Pfa). Modern radars usually make the detection decision automatic by using an

IET Circuits Devices Syst., 2009, Vol. 3, Iss. 1, pp. 12–21
© The Institution of Engineering and Technology 2009, doi: 10.1049/iet-cds:20080072


adaptive threshold based on a CFAR detector. This detector dynamically determines a detection threshold by estimating the local background noise/clutter power and multiplying this estimate by a scaling constant based on the desired Pfa.

Although software-based implementation is very flexible, the whole CFAR processing may degrade the performance of the processor. Hence, to accelerate the processing, it is proposed to realise the computationally intensive CFAR detector in field programmable gate array (FPGA) hardware. FPGAs are a form of programmable logic. They offer design flexibility like software, but with time performance closer to application-specific integrated circuits. Recent advances in FPGA technology have resulted in enormous possibilities for the implementation of sophisticated algorithms of high complexity, in a variety of important applications, by using low cost, high performance and high speed reconfigurable hardware. FPGAs have become one of the prevailing technologies for fast prototyping and implementation of complex digital systems [3]. In this paper, a novel FPGA-based design and realisation of a complex automatic censored cell averaging (ACCA) CFAR detector based on ordered data variability (ODV), proposed in [4], are presented.

The rest of this paper is organised as follows. Section 2 describes the background information on CFAR theory and the related work done towards the hardware realisation of different CFAR processors. Section 3 describes the ACCA–ODV detection algorithm. The hardware architecture of the proposed CFAR detector is discussed in detail in Section 4. Section 5 provides the FPGA-based realisation and simulation results. FPGA prototyping results are discussed in Section 6. Finally, Section 7 presents concluding remarks and some directions for future research.

2 Background theory and related work

A typical CFAR processor as shown in Fig. 1 consists of a matched filter followed by an envelope detector and a non-coherent integrator. The output samples of the non-coherent integrator are fed sequentially into a shift register. The statistic Z, which is proportional to the estimate of the total noise power, is formed by processing the contents of reference cells surrounding the cell under test (CUT), whose content is Y. To maintain Pfa at the desired value, Z is multiplied by a scaling factor called the threshold multiplier T. The product TZ is the resulting adaptive threshold. The output Y from the CUT is then compared with the threshold in order to make a decision. A target is declared to be present if Y exceeds TZ.

Figure 1 Block diagram of a typical CFAR detector

The processor configuration varies with different CFAR schemes. For example, the cell averaging (CA) CFAR processor sums the contents of the surrounding cells to produce the statistic Z, that is

$$Z = \sum_{i=1}^{N} X_i \qquad (1)$$

For homogeneous environments, the CA–CFAR processor is optimum. However, in the presence of interfering targets, the assumption of a homogeneous environment is no longer valid. The performance of the CA–CFAR processor seriously degrades under such conditions. Various classes of CFAR techniques have been proposed to enhance the robustness against non-homogeneous environments for different applications [5]. In particular, ordered statistics (OS)-based CFAR detectors have been shown to provide good performance in the presence of interference. The clutter power estimate Z, in OS–CFAR detectors, is computed by sorting the observations in the reference window in ascending order and setting

$$Z = X(k) \qquad (2)$$

where X(k) is the kth ordered sample. The rank of the order statistic to be used is determined in advance. It can be any value in the range 1 ≤ k ≤ N, and is typically chosen to maximise detection performance. The OS–CFAR detector has a small additional detection loss over the CA–CFAR detector in homogeneous backgrounds, but can resolve closely spaced interferences. However, it requires a longer processing time than the CA–CFAR detector.

Efficient hardware realisations of different CFAR detectors were considered by many authors. In particular, a configurable hardware architecture for adaptive processing of noisy signals for target detection based on CFAR algorithms has been presented in [6]. The architecture has been designed to deal with parallel/pipeline processing and to be configured for three versions of CFAR algorithms: the cell average, the max and the min CFAR [7]. In [8], hardware realisation of a CA–CFAR processor using discrete components has been reported. The design suffers from several drawbacks as compared to the previous work, such as low resolution in


data elements size, small array averaging, no safeguard cells, non-configurable and complex design.

In [9], real-time design of a more complex CFAR detector, taking into account the great opportunity offered by FPGAs, is developed. The system considered in [9] combines ordering with arithmetic averaging. Such a detector is known as trimmed mean (TM) filtering in the signal processing literature [2, 5]. The TM–CFAR detector reduces to the CA–CFAR and OS–CFAR detectors for specific trimming values.

The CFAR detectors discussed so far have been developed under the assumption of a homogeneous background. In practice, the environment is usually non-homogeneous because of the presence of multiple targets and/or clutter edges in the reference window, which consists of a finite number of range samples of the received radar signal. In such situations, OS detectors have been known to yield good performance as long as the non-homogeneous background and outlying returns are properly discarded [5]. However, most of the work in the literature considers some type of censoring based on a priori knowledge or a judicious guess.

Techniques based on the automatic censoring of unwanted cells have been proposed in the literature [2]. The ACCA–ODV CFAR detector selects dynamically, by doing successive hypothesis tests, a suitable set of ranked reference window cells to estimate the unknown background level and set the adaptive threshold accordingly. The advantage associated with this detector is that it requires neither prior information about the clutter parameters nor the number of interfering targets. The effectiveness of the ACCA–ODV algorithm has been extensively studied in [4] by computing the probability of censoring and the probability of detection in different background environments.

3 ACCA–ODV detection algorithm

The square law detected range samples, {Xi : i = 0, 1, ..., N}, are sent serially into a tapped delay line, as shown in Fig. 2. X0 is the test cell. The remaining N cells surrounding the test cell are the auxiliary cells that are used to construct the CFAR procedure. These auxiliary cells are ranked in ascending order according to their magnitudes to yield

$$X(1) \le X(2) \le \cdots \le X(p) \le \cdots \le X(N) \qquad (3)$$

Figure 2 Block diagram of ACCA–ODV CFAR detector

The test cell X0 is to be compared with the threshold Tk, to decide whether a target is present or not. Selecting

$$T_k = t_k \sum_{i=1}^{j} X(i) \qquad (4)$$

leads to a CFAR processor in Rayleigh clutter. The threshold Tk is parameterised by the variable tk. The subscript j is taken to represent the largest rank possible, since the CFAR loss would increase with the decrease in the value of j. In particular, the numerical results obtained in [5] show that the appropriate value of j, when detection is performed in homogeneous environments, is j = N. However, in the presence of k interfering targets in the reference window, the value of j is best selected such that j = N - k. Therefore the main task of the ACCA–ODV censoring algorithm is to determine the best value of k. Once the number of interfering targets is determined automatically, the output of the test cell X0 is then compared with the adaptive threshold Tk according to

$$X_0 \underset{H_0}{\overset{H_1}{\gtrless}} T_k \qquad (5)$$

where the adaptive threshold Tk (or equivalently the parameter tk) is selected so that the design Pfa is achieved. Hypothesis H1 denotes the presence of a target in the test cell, whereas H0 is the null hypothesis (i.e. no target is present).

To determine the number of interfering targets k, the ODV statistic V0 is first compared with the ODV threshold S0, which is selected so that a low probability of false censoring Pfc is maintained. The statistic V0 is defined as follows

$$V_0 = \frac{m_p + X(N)^2}{(s_p + X(N))^2} \qquad (6)$$


where

$$s_p = \sum_{i=1}^{p} X(i) \qquad (7)$$

and

$$m_p = \sum_{i=1}^{p} X^2(i) \qquad (8)$$

The parameter p has to be carefully selected to yield a robust performance in both homogeneous and non-homogeneous environments. Values of p > N/2 have been found to yield a reasonable performance [4].

If V0 < S0, the algorithm decides that X(N) corresponds to a clutter sample without interference, and it terminates. If, on the other hand, V0 > S0, the algorithm decides that the sample X(N) is a return echo from an interfering target. In this case, X(N) is censored and the algorithm proceeds to compare the statistic V1 with the threshold S1 to determine whether X(N - 1) corresponds to an interfering target or a clutter sample without interference. In this case, we have

$$V_1 = \frac{m_p + X(N-1)^2}{(s_p + X(N-1))^2} \qquad (9)$$

At the (k + 1)th step, the ODV statistic Vk is compared with the threshold Sk and a decision is made according to the test

$$V_k \underset{H_0}{\overset{H_1}{\gtrless}} S_k \qquad (10)$$

where

$$V_k = \frac{m_p + X(N-k)^2}{(s_p + X(N-k))^2}$$

Hypothesis H1 represents the case where X(N - k) and thus the subsequent samples X(N - k + 1), X(N - k + 2), ..., X(N) correspond to clutter samples with interference, whereas H0 denotes the case where X(N - k) is a clutter sample without interference. The successive tests are repeated as long as the hypothesis H1 is declared true. The algorithm stops when the cell under investigation is declared homogeneous (i.e. clutter sample only) or, in the extreme case, when all the N - p highest cells are tested (i.e. k = N - p).

It is quite clear from Fig. 2 that the threshold selection is a key element in the implementation of the ACCA–ODV algorithm. The threshold parameter tk is determined for a design Pfa by [4, 5]

$$P_{fa}(k) = \binom{N}{N-k} \prod_{j=1}^{N-k} \left[ T_k + \frac{N-j+1}{N-k-j+1} \right]^{-1} \qquad (11)$$

As for Sk, these thresholds are selected such that a low probability of hypothesis test error is achieved in a homogeneous environment. For the ACCA–ODV algorithm, this probability is defined, at each value of k, as

$$e_k = \mathrm{Prob}(V_k > S_k \mid \text{homogeneous environment}) \qquad (12)$$

The ODV thresholds Sk are selected such that a low Pfc is maintained at each step [4]. Hence, the values of Sk are determined by setting

$$e_0 = e_1 = \cdots = e_{N-p-1} = \text{design } P_{fc} \qquad (13)$$

Table 1 gives the threshold parameter Sk obtained using the ACCA–ODV algorithm in a Gaussian homogeneous background [4].

Table 1 ODV thresholds Sk in a homogeneous background with exponential probability density function (pdf)

(N, p)     Pfc          S0     S1     S2     S3     S4     S5     S6     S7
(16, 12)   10^-2        0.356  0.246  0.199  0.173  -      -      -      -
           5 × 10^-3    0.389  0.267  0.213  0.183  -      -      -      -
           10^-3        0.456  0.320  0.246  0.206  -      -      -      -
(24, 16)   10^-2        0.332  0.235  0.189  0.162  0.143  0.131  0.122  0.117
           5 × 10^-3    0.362  0.255  0.204  0.173  0.152  0.138  0.129  0.122
           10^-3        0.422  0.305  0.240  0.200  0.174  0.155  0.142  0.133

4 ACCA–ODV CFAR detector architecture

The proposed detector comprises the following main modules: shift register, sorting and censoring module, parallel


adder, multiplier, comparator and timing/control unit as shown in Fig. 3. Fig. 4 represents the flow chart of the ACCA–ODV CFAR algorithm. For illustration, a reference window of length N = 16 is taken into consideration. The number of guard cells and parameter p are fixed at 2 and 12, respectively.
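
For the illustrative window length N = 16, the multipliers implied by (11) can be obtained numerically. The following is a minimal sketch, assuming that the quantity printed as Tk in (11) acts as the dimensionless multiplier applied to the sum of the N - k lowest ordered cells, and assuming a design Pfa of 10^-4 purely for illustration (no design value is stated in this part of the paper). The function names `pfa` and `solve_multiplier` are hypothetical, and bisection is a generic numerical method chosen here, not the authors' procedure.

```python
from math import comb

def pfa(t, n, k):
    """Pfa from (11): C(n, n-k) * prod_{j=1..n-k} 1 / (t + (n-j+1)/(n-k-j+1))."""
    value = float(comb(n, n - k))
    for j in range(1, n - k + 1):
        value /= t + (n - j + 1) / (n - k - j + 1)
    return value

def solve_multiplier(target_pfa, n, k, hi=1e6, iters=200):
    """Bisection on t >= 0; pfa() falls monotonically from 1 at t = 0."""
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if pfa(mid, n, k) > target_pfa:
            lo = mid      # threshold still too low, false alarms too frequent
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Assumed design Pfa of 1e-4 (illustrative only), N = 16, k = 0..4.
for k in range(5):
    print(k, round(solve_multiplier(1e-4, 16, k), 4))
```

For k = 0 the product in (11) collapses to (1 + t)^-N, the familiar CA–CFAR expression, which gives a quick sanity check on the sketch.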

The serial-in parallel-out shift register consists of N reference cells, surrounding the test cell, which is located at the centre tap. Samples of the reference window are divided into N/2 symmetrical leading (right side) and lagging (left side) groups. In addition, there are G safeguard cells, which are divided into two symmetrical groups separating the test cell from the reference cells on both sides. Guard cells are used to avoid any signal energy spilling from the test cell into the adjacent cells. The raw data samples of the signal to be processed are received from the radar system in digital form and fed to the shift register in a serial manner. The length L of the shift register is given by

$$L = N + G + 1 \text{ cells} \qquad (14)$$
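
To make the window layout concrete, the sketch below splits one snapshot of the L-cell shift register into the test cell, the guard cells and the reference cells. It assumes that G/2 guard cells sit immediately on each side of the centre tap, as the text describes; the function name `split_window` and the integer stand-in data are hypothetical.

```python
def split_window(cells, n_ref=16, n_guard=2):
    """Split an L = n_ref + n_guard + 1 snapshot into (test, guards, reference)."""
    length = n_ref + n_guard + 1
    if len(cells) != length:
        raise ValueError(f"expected {length} cells, got {len(cells)}")
    half_ref, half_guard = n_ref // 2, n_guard // 2
    centre = half_ref + half_guard              # index of the centre tap
    lagging = cells[:half_ref]                  # left-side reference group
    leading = cells[centre + half_guard + 1:]   # right-side reference group
    guards = cells[half_ref:centre] + cells[centre + 1:centre + half_guard + 1]
    return cells[centre], guards, lagging + leading

# Cell indices 0..18 stand in for sample values (L = 16 + 2 + 1 = 19).
test_cell, guards, reference = split_window(list(range(19)))
print(test_cell, guards, len(reference))   # 9 [8, 10] 16
```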

The two N/2 cell reference groups are merged into one N cell
output group and sent to the sorting circuit for arranging the
samples in ascending order (the highest values are on the
right side). The sorting circuit is the most sophisticated
part as far as the circuit size and processing time are
concerned in this design. This is because the sorting must
be done sequentially for the N cells. After sorting is done,
the subsequent operations are explained in detail in Fig. 5,
where a certain number of data cells at the array edges are
subjected to an automatic censoring mechanism from one
side (right side in our case). The censoring operation is
very beneficial in minimising the estimation error of the
background, and is performed according to the background
configuration. The basic idea of this circuit is to consider p
of the lowest cells (p = 12 in our case) to represent the
initial estimation of the background level and then use this
estimate to compute the number of interfering targets k.
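
The censoring idea described above, together with the tests (6) to (10) of Section 3, can be sketched in software as follows. This is a behavioural illustration, not the authors' hardware design: it uses floating point rather than fixed point, takes the Table 1 thresholds for (N, p) = (16, 12) at Pfc = 10^-2, and the multipliers passed as `t` are placeholders that would in practice follow from (11). The function names are hypothetical.

```python
S_16_12 = [0.356, 0.246, 0.199, 0.173]  # Table 1, (N, p) = (16, 12), Pfc = 1e-2

def estimate_k(reference, p=12, thresholds=S_16_12):
    """Return (k, ordered cells), where k is the number of top cells to censor."""
    x = sorted(reference)                 # X(1) <= ... <= X(N)
    n = len(x)
    s_p = sum(x[:p])                      # (7): sum of the p lowest cells
    m_p = sum(v * v for v in x[:p])       # (8): sum of their squares
    k = 0
    while k < n - p:
        cell = x[n - 1 - k]               # X(N - k)
        v_k = (m_p + cell * cell) / (s_p + cell) ** 2
        if v_k <= thresholds[k]:          # H0: clutter only, stop censoring
            break
        k += 1                            # H1: censor this cell, test the next
    return k, x

def detect(test_cell, reference, t, p=12, thresholds=S_16_12):
    """Compare X0 with t[k] times the sum of the N - k lowest ordered cells."""
    k, x = estimate_k(reference, p, thresholds)
    z = sum(x[:len(x) - k])
    return test_cell > t[k] * z, k

# Fourteen clutter-level cells plus two strong interferers (made-up values).
window = [1.0] * 14 + [100.0] * 2
print(estimate_k(window)[0])              # 2
print(detect(10.0, window, [0.5] * 5))    # (True, 2)
```

In this toy window both interferers are censored, so the background estimate is formed from the fourteen remaining cells only.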

Figure 3 ACCA–ODV CFAR detector architecture

Figure 4 Flow chart of ACCA–ODV CFAR detector

Background noise is estimated by first computing the values of sp and mp. Once computed, they can be used


with the sorted samples (X13–X16) to determine the ODV statistics (V0–V3). The ODV statistics are fed with the censoring thresholds (S0–S3) to four parallel comparators, the output of which is a binary word of length four bits. This binary code is applied to a mask circuit, which determines the number of cells to be censored. The resulting uncensored cells and the initial population cells (X1–Xp) are assigned to the parallel adder simultaneously, and the result is multiplied by the pre-calculated scale value tk using a binary multiplier. Finally, the multiplier output is compared with the original test cell using a magnitude comparator circuit, which produces (as a one bit flag output) a high state when the test cell is higher than the value of the threshold (tk × Z) and gives a low state otherwise.

Figure 5 Basic blocks of ACCA–ODV CFAR detector

The timing and control module is the heart of the proposed CFAR detector and is responsible for initialising the process, generating the various synchronisation and initialisation clock pulses, and enabling signals. It typically consists of several counters with different configurations and time constants. The circuit is initialised by the external reset signal to resume the processor operations. After the system initialisation, the shift register is filled with L useful data cells. This is achieved by shifting the data right L times. This operation requires L clock cycles and is performed only once and hence it will not be counted to determine the processor speed. However, the total time (Ttotal) that is required to execute a single run for detecting an object is determined by two factors, namely, the number of clock cycles required to perform a single run (Tclk) and the maximum clock frequency (Fmax) of the FPGA chip employed, that is

$$T_{total} = \frac{T_{clk}}{F_{max}} \text{ s} \qquad (15)$$

Fmax entirely depends on the FPGA family selected, whereas the required number of clock cycles Tclk is composed of the following: three clock cycles for shifting the data one cell, two clock cycles for the multiply and compare operations (one clock cycle each) and N clock cycles for the sorting algorithm. Therefore Ttotal changes linearly with the number of reference cells N, and is given by

$$T_{total} = \frac{5 + N}{F_{max}} \text{ s} \qquad (16)$$

4.1 Parallel architecture of censoring module

In order to reduce the processing time for the censoring module, a two level architecture based on a parallel approach is adopted [4]. In Level 1, the arithmetic block consists of computing the ODV statistics. Since there is no dependency between the input/output data of these tasks, the statistics Vk can be processed simultaneously. Each of these tasks needs the corresponding ranked cell X(N - k), the corresponding square X^2(N - k) and the two quantities sp and mp. Each Vk calculation requires two fixed-point additions, one fixed-point multiplication and one fixed-point division operation. sp and mp are provided by a previous task requiring p fixed-point additions for each computation.

In Level 2, since the decisions dk are simultaneously provided, a logic block is necessary to generate the binary code (K_{q-1}, ..., K1, K0) of the estimated number of censored cells, by treating globally the binary inputs dk [4]. The sequence of operations for the execution of the censoring algorithm is illustrated in Fig. 6. The censoring circuit automatically eliminates the interfering targets found in the largest four cells of the sorted cells. For a given (N, p), the binary representation needs q bits such that 2^q - 1 ≥ N - p and the Boolean functions (g_{q-1}, ..., g1, g0) are defined by analysing the input states for each bit of the code (K_{q-1}, ..., K1, K0). To better illustrate the binary code generation of k, consider the case (N = 16, p = 12). In this case, the range of k is [0, 4] and therefore its code can be represented by q = 3 bits. Table 2 represents the input states of the binary decision dk for all the values of k [4].

The architecture of the censoring module as shown in Fig. 6 consists of six different units: mp, sp and Vk circuits, a comparator, a converter unit and a mask unit. These components are used to perform censoring on the remaining/largest cells of the sorted cells (X16, X15, X14 and X13) as shown in Fig. 5. The mp unit computes the formula indicated by (8). This unit consists of a parallel multiplier and an adder to multiply each cell of the lowest cells with


itself, the resulting squared output of each cell is then simultaneously applied to the parallel adder in order to compute the estimation of mp. The sp estimation circuit consists of a parallel adder, which adds simultaneously the lowest cells together to satisfy (7).

Figure 6 Architecture of censoring module

As is evident from (9), the Vk unit is composed of two parallel adders, two multipliers and a divider. The task of this unit is to add mp with the square of X16 first, then the result of this addition operation is divided by the square of sp added to X16 to find V0. This operation is repeated three times to find V1, V2 and V3. Then, V0 is first compared with S0 to determine the binary decision d0. If V0 is greater than S0 the decision is logic 1, else the decision is logic 0. This operation is repeated for all Vk to find the remaining binary decisions. That means four comparators are required to implement this operation in hardware. The result of the four comparators is a binary decision dk (dk = d3d2d1d0), assigned to a converter to determine the specific mask code Mk that points to the number of censored cells, and the related threshold factor tk which is based on the converter lookup table as shown in Table 2.

Table 2 Converter lookup table

Binary decision   Represented code   No. of censored cells   Mask code      Threshold factor
d3 d2 d1 d0       K2 K1 K0           k                       M3 M2 M1 M0    tk
x  x  x  0        0  0  0            0                       1  1  1  1     0101010101 = (341)10
x  x  0  1        0  0  1            1                       1  1  1  0     0110110111 = (439)10
x  0  1  1        0  1  0            2                       1  1  0  0     1000100101 = (549)10
0  1  1  1        0  1  1            3                       1  0  0  0     1010101100 = (684)10
1  1  1  1        1  0  0            4                       0  0  0  0     1101011000 = (856)10

x, do not care

The main task of the mask unit is to determine the uncensored cells. It consists of four parallel AND gates as shown in Fig. 7. It is basically an AND operation between each cell of the remaining/largest cells and its mask code (i.e. X16 AND M0, X15 AND M1 etc.). It censors any one of the cells when its mask code is equal to logic 0 and accepts it when the mask code is equal to logic 1. Table 3 illustrates the accepted and censored cells for different situations. Finally, the output of the censoring module is the rest of the remaining/largest cells, which are not censored, and the threshold factor which is related to the number of censored cells.

Figure 7 Architecture of mask unit
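
The converter and mask logic summarised in Table 2 can be modelled behaviourally as a priority encoder followed by an AND mask. The sketch below is an illustration only: the function names are hypothetical, the threshold codes are the decimal values listed in Table 2, and plain integers stand in for the 16-bit hardware words.

```python
TK_CODES = [341, 439, 549, 684, 856]  # threshold-factor codes from Table 2, k = 0..4

def convert(d):
    """Map decisions d = [d0, d1, d2, d3] to (k, mask [M0..M3], t_k code).

    k counts the leading 1s starting at d0; bits after the first 0 are
    "do not care", as in Table 2.
    """
    k = 0
    for bit in d:
        if bit == 0:
            break
        k += 1
    mask = [0 if i < k else 1 for i in range(4)]  # M0 gates X16, M1 gates X15, ...
    return k, mask, TK_CODES[k]

def apply_mask(top_cells, mask):
    """top_cells = [X16, X15, X14, X13]; a 0 mask bit censors the cell."""
    return [x if m else 0 for x, m in zip(top_cells, mask)]

k, mask, code = convert([1, 1, 0, 1])      # d0 = d1 = 1, d2 = 0, d3 ignored
print(k, mask, code)                       # 2 [0, 0, 1, 1] 549
print(apply_mask([90, 80, 7, 6], mask))    # [0, 0, 7, 6]
```

Note that the mask list here is ordered M0 to M3, the reverse of the column order printed in the tables.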


Table 3 Mask lookup table

No. of censored cells   Masking code   Accepted cells
k                       M3 M2 M1 M0    X13  X14  X15  X16
0                       1  1  1  1     X13  X14  X15  X16
1                       1  1  1  0     X13  X14  X15  0
2                       1  1  0  0     X13  X14  0    0
3                       1  0  0  0     X13  0    0    0
4                       0  0  0  0     0    0    0    0

4.2 Architecture of sorting module

Sorting is the operation that puts elements of a list in a certain order. This operation plays an important role since it consumes long computation time and constitutes a bottleneck in the field of real-time signal processing applications [9]. The sorting algorithm can work in ascending order (data elements are sorted from the smallest to the largest) or in descending order (data elements are sorted from the largest to the smallest). Since the processing time is critical, the decision of choosing the highest speed and most efficient method of data sorting is of great interest. In this work, the bubble sorting algorithm is adopted. It is one of the best sorting algorithms that combine high speed and simplicity for applications that involve a small number of elements [10, 11].

The bubble sort algorithm compares every two elements, and then decides which one is greater. As shown in Fig. 8, each compare–swap switch circuit is simply composed of a comparator circuit, a 2:1 MUX and output flip flops. The main task is to compare the two input elements: if the first element is greater than the second, the two elements are swapped, and no change is performed otherwise. This operation is repeated for each pair of adjacent elements till the end of the entire data array. The process can be performed simultaneously for every adjacent pair, so as to speed up the processing through what is called parallel bubble sorting.

Figure 8 Architecture of compare–swap circuit

The next step in this module is to repeat the operation described above on the array results obtained from the previous sorted data in a serial manner and in synchronised clocked stages till the end. Note that if the number of elements to be sorted is N, the number of stages will be N; hence, the total number of compare–swap circuits will be given by

$$\text{No. of units} = \frac{N(N-1)}{2} \qquad (17)$$

5 FPGA realisation and simulation

The ACCA–ODV CFAR detector has been designed, synthesised and simulated using Altera Quartus II software [12] targeting a Stratix II FPGA. Quartus II provides a complete design environment for designing system-on-a-programmable-chip. It offers a very rich library of parameterised modules that can be utilised to construct the different processing units used in this design. The designed CFAR detector is modular, which enables the designer to test the various modules individually.
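
The parallel bubble sort of Section 4.2 corresponds to what is commonly called odd-even transposition sorting. The sketch below is a sequential software model under that assumption: each of the N stages applies compare-swap units to disjoint adjacent pairs, alternating between even-indexed and odd-indexed pairs, and the units are tallied to check (17). Function and variable names are hypothetical, and the actual wiring of the hardware stages may differ.

```python
def compare_swap(a, b):
    """One compare-swap unit: return the pair in ascending order."""
    return (b, a) if a > b else (a, b)

def odd_even_sort(data):
    """Sort ascending using N stages of disjoint adjacent compare-swaps."""
    x = list(data)
    n = len(x)
    units = 0
    for stage in range(n):
        # Even stages pair (0,1), (2,3), ...; odd stages pair (1,2), (3,4), ...
        for i in range(stage % 2, n - 1, 2):
            x[i], x[i + 1] = compare_swap(x[i], x[i + 1])
            units += 1
    return x, units

values = [9, 3, 14, 1, 7, 12, 0, 5, 11, 2, 15, 8, 4, 13, 6, 10]
ordered, units = odd_even_sort(values)
print(ordered == sorted(values), units)   # True 120
```

For N = 16 the tally is 8 stages of 8 units plus 8 stages of 7 units, that is 120 = N(N - 1)/2, in agreement with (17).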


For illustration, a shift register consisting of 19 cells (each In the absence of other hardware implementations,
cell of 16 bit), 16 reference cells and 2 guard cells are information on the extent of speedup obtained by our
considered. Hence, the total number of the clock cycles hardware implementation has been gathered by implementing
Tclk required is 21, the number of stages for the sorting the ACCA–ODV algorithm in software. The full
circuit is 16 and total number of compare– swap switch implementation of the ACCA–ODV CFAR detector was
circuits is 120. carried out in C language targeted to general purpose PC
(3.4 GHz Pentium 4 processor with on-board RAM of 1 GB
For simulation, two memories and their associated read/ running Microsoft Windows XP Professional). For the same
write control signals and address generation unit are built inside the FPGA device, as shown in the simplified block diagram of Fig. 9. A 256 × 16 ROM is used, which receives the data serially, stores it and is then read by the rest of the hardware. The resulting flags decided by the thresholding modules are stored in a 256 × 1 RAM.

Figure 9 Block diagram of hardware simulation setup

Table 4 summarises the FPGA hardware resource utilisation of the different modules and of the proposed CFAR detector.

The FPGA implementation result shows that the processor can achieve a maximum operating frequency of 109.37 MHz, which is very close to the clock frequency of the prototyping board (100 MHz). This implies that the total processing time $T_{\mathrm{total}}$ (for N = 16) to perform a single run is 0.21 µs:

$T_{\mathrm{total}} = 21/100 = 0.21\ \mu\mathrm{s}$

... (N, p) configuration, the processing time on this platform is 23 µs. The proposed ACCA–ODV CFAR hardware architecture is therefore about 110 times faster than the software implementation of the same algorithm.

6 FPGA prototyping

The proposed ACCA–ODV CFAR detector, including the associated input data ROM and the output target detection RAM, has been implemented on a Stratix II FPGA chip. The EP2S60 digital signal processing (DSP) development kit [13], built around a Stratix II device, was selected for prototyping the proposed CFAR detector because of its low cost, its configurability and the fact that its master clock runs at 100 MHz, which is very close to the maximum frequency determined by the compiler (109.37 MHz). This kit is a development platform for high-performance DSP designs. It is normally employed to design, verify and evaluate systems prior to final stand-alone single-chip implementation.

The in-circuit memory content editor provided with the Quartus II software gives read and write access to in-system FPGA memories. The data are read in hexadecimal (HEX) format, even while the processor is running at maximum speed. The designed CFAR detector has been tested and verified by generating 256 data samples drawn from an exponential distribution. The data set is downloaded to the 256 × 16 ROM, and the output array indicating the target presence result is stored in the 256 × 1 RAM.
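
As an illustration of the test procedure described above, the following Python sketch generates 256 exponentially distributed samples, quantises them to 16-bit unsigned words and formats them as hexadecimal strings of the kind that could be typed or pasted into a memory editor. Only the depth (256), the word width (16 bits) and the exponential distribution come from the text. The mean, the full-scale value, the saturation rule, the seed and the function names are assumptions made for this example and do not describe the authors' actual test vectors.

```python
import random

DEPTH = 256   # number of ROM words (from the text)
WIDTH = 16    # bits per ROM word (from the text)


def make_test_vector(depth=DEPTH, width=WIDTH, mean=1.0, full_scale=16.0, seed=1):
    """Return `depth` unsigned integers quantised from exponential samples.

    `mean`, `full_scale` and `seed` are illustrative assumptions,
    not values taken from the paper.
    """
    rng = random.Random(seed)
    max_code = (1 << width) - 1
    words = []
    for _ in range(depth):
        x = rng.expovariate(1.0 / mean)             # exponential sample with the given mean
        code = int(round(x / full_scale * max_code))
        words.append(min(code, max_code))           # saturate samples above full scale
    return words


def to_hex_lines(words, width=WIDTH):
    """Format each word as a fixed-width upper-case hexadecimal string."""
    digits = (width + 3) // 4
    return [f"{w:0{digits}X}" for w in words]


rom_words = make_test_vector()
hex_lines = to_hex_lines(rom_words)
print(hex_lines[:4])
```

The fixed seed makes the data set reproducible, so the same ROM contents can be fed to a software reference model and to the hardware, and the 256 detection flags read back from the RAM can be compared bit by bit.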

Table 4 FPGA resource utilisation of different modules

Synthesis summary for different modules targeting Altera Stratix II FPGA

| Modules | Shifting and sorting | Censoring | Timing/control | Adder, multiplier and comparator | ACCA–ODV CFAR detector |
|---|---|---|---|---|---|
| Total ALUTs | 4265/48 352 (9%) | 14 032/48 352 (29%) | 167/48 352 (<1%) | 199/48 352 (<1%) | 18 861/48 352 (39%) |
| Total registers | 4144 | 1415 | 123 | 151 | 5943 |
| Total memory bits | 0 | 0 | 4843/2 544 192 (<1%) | 0 | 4843/2 544 192 (<1%) |
| DSP block 9 bit elements | 0 | 40/288 (14%) | 0 | 8/288 (3%) | 48/288 (17%) |
| Maximum frequency | - | - | - | - | 109.37 MHz |
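
The processing-time and speedup figures quoted in the text can be checked with a few lines of arithmetic. The sketch below assumes that the numerator 21 in the expression for the total processing time is the number of clock cycles per run for N = 16 (the text gives the ratio 21/100 without naming the numerator) and that times are in microseconds, which is what 21 cycles at 100 MHz yields. The variable and function names are chosen for this example only.

```python
CYCLES_PER_RUN = 21        # assumed meaning of the numerator in T_total = 21/100
BOARD_CLOCK_MHZ = 100.0    # clock of the prototyping board (from the text)
FMAX_MHZ = 109.37          # maximum frequency reported by the compiler (Table 4)
SOFTWARE_TIME_US = 23.0    # software processing time quoted in the text


def run_time_us(cycles, clock_mhz):
    """Time for one run in microseconds: cycles divided by clock rate in MHz."""
    return cycles / clock_mhz


t_board_us = run_time_us(CYCLES_PER_RUN, BOARD_CLOCK_MHZ)
t_fmax_us = run_time_us(CYCLES_PER_RUN, FMAX_MHZ)
speedup = SOFTWARE_TIME_US / t_board_us

print(f"run time at board clock: {t_board_us:.2f} us")
print(f"run time at fmax:        {t_fmax_us:.3f} us")
print(f"speedup over software:   {speedup:.0f}x")
```

Under these assumptions the quoted speedup of about 110 is simply the ratio of the two run times. Clocking the design at the reported maximum frequency instead of the board clock would shorten a run only slightly, since the two frequencies differ by less than 10%.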


7 Conclusions

In this paper, we have presented the design of an ACCA–ODV CFAR detector and its realisation on an Altera Stratix II FPGA. The FPGA proves to be an efficient hardware target for realising the proposed CFAR detector, and the implementation results demonstrate that the hardware-based ACCA–ODV CFAR detector provides high computational speed. However, the total processing time for executing a single run to detect a target depends on the number of reference cells and on the maximum clock frequency of the FPGA chip.

For N = 16 and p = 12, the implementation result shows that the proposed design is very compact and takes 0.21 µs to process a single run. The proposed FPGA-based ACCA–ODV CFAR detector was thoroughly tested after implementation. Place and route showed that the design can operate at 100 MHz, the maximum clock frequency of the prototyping board. This provides a speedup of 110 times compared with the software-based implementation.

It is worth noting that the ACCA–ODV CFAR detector has been designed under the assumption that the radar clutter has a Gaussian pdf, which results in a Rayleigh-distributed amplitude. In several applications of radar target detection, the clutter amplitude may not be Rayleigh distributed. This is true when working with high-resolution radars, low grazing angles and horizontal polarisation at high frequencies.

Future work will concentrate on the design and realisation of an FPGA-based CFAR detector to regulate the false alarm rate in high-resolution radars. Various tradeoffs between implementing functions in software on an embedded soft processor and implementing them in FPGA hardware resources will be studied further, in order to determine the critical parts and optimise the designed CFAR detector by applying parallel processing and pipelining techniques.

8 Acknowledgments

This work was supported by the Prince Sultan Advanced Technologies Research Institute (PSATRI) at King Saud University, Saudi Arabia. The authors would like to thank the reviewers for many useful comments and suggestions that have helped to improve the quality of this paper.

9 References

[1] BARTON D.K.: ‘Radar system analysis and modeling’ (Artech House, Norwood, MA, 2005)

[2] BARKAT M.: ‘Signal detection and estimation’ (Artech House, Norwood, MA, 2005, 2nd edn.)

[3] TODMAN T.J., CONSTANTINIDES G.A., WILTON S.J.E., MENCER O., LUK W., CHEUNG P.Y.K.: ‘Reconfigurable computing: architectures and design methods’, IEE Proc. Comput. Digit Tech., 2005, 152, (2), pp. 193–207

[4] FARROUKI A., BARKAT M.: ‘Automatic censoring CFAR detector based on ordered data variability for nonhomogeneous environments’, IEE Proc. Radar Sonar Navig., 2005, 152, (1), pp. 43–51

[5] GANDHI P.P., KASSAM S.A.: ‘Analysis of CFAR processors in nonhomogeneous background’, IEEE Trans. Aerosp. Electron. Syst., 1988, 24, (4), pp. 427–445

[6] CUMPLIDO R., TORRES C., LOPEZ S.: ‘A configurable FPGA-based hardware architecture for adaptive processing of noisy signals for target detection based on constant false alarm rate (CFAR) algorithms’. Proc. Int. Signal Processing Conf. and Expo (GSPX’2004), USA, September 2004, CDROM

[7] CUMPLIDO R., TORRES C., LOPEZ S.: ‘On the implementation of an efficient FPGA-based CFAR processor for target detection’. Proc. 1st Int. Conf. Electrical and Electronics Engineering, Acapulco, Mexico, June 2004, pp. 214–218

[8] EL-FARAMAWY N.M., EL-BADAWY E.A., SALEM A.I.: ‘Hardware implementation of CA-CFAR processor’. Proc. Int. Conf. Computer and Communication Engineering, Kuala Lumpur, Malaysia, May 2006, pp. 573–578

[9] ALSUWAILEM A.M., ALSHEBEILI S.A., ALAMMAR M.: ‘Design and implementation of a configurable real-time FPGA-based TM-CFAR processor for radar target detection’, J. Act. Passive Electron. Devices, 2008, 3, (3-4), pp. 241–256

[10] FAHMY S., CHEUNG P., LUK W.: ‘Novel FPGA-based implementation of median and weighted median filters for image processing’. Proc. Int. Conf. Field Programmable Logic and Applications (FPL’2005), August 2005, pp. 142–147

[11] BLAIR G.M.: ‘Low cost sorting circuit for VLSI’, IEEE Trans. Circuits Syst., 1996, 43, (6), pp. 515–516

[12] Quartus II user manual: http://www.altera.com/literature/lit-qts.jsp, accessed June 2008

[13] Stratix II development kit EP2S60 DSP user manual: http://www.altera.com.cn/products/devkits/altera/kit-dsp-2S60.html, accessed June 2008
