Abstract—In this paper, we show that using an FPGA as a coprocessor with floating point arithmetic can enhance DSP system performance through optimized core implementations of critical, compute-intensive digital signal processing algorithms such as the Fast Fourier Transform (FFT). Our approach is based on basic building blocks: we implement optimized, multi-cycle floating point arithmetic cores, which are then used to build more complex layers of logic such as the FFT butterfly, a complex multiplier, and a DFT block. We present performance results showing that a speedup of 10-19X can be achieved by an optimized FFT DSP coprocessor implementation on a low-cost FPGA such as the Cyclone IV.

Index Terms—DSP Coprocessor, FFT, Single Precision Floating Point Numbers, IEEE 754, FPGA.

I. INTRODUCTION

Effective hardware implementation of the FFT algorithm can be achieved in two main ways. The first, among the more modern techniques, is the Field Programmable Gate Array (FPGA) approach; the second is based on an Application Specific Integrated Circuit (ASIC) architecture.

FPGA technology is now quite mature for Digital Signal Processing (DSP) applications [5], owing to rapid progress in Very Large Scale Integration (VLSI) technology. Today's FPGA devices provide fully programmable system-on-chip environments by combining the programmability of logic cells with the architecture of gate arrays. They consist of tens of thousands of configurable logic blocks, which makes them an appropriate platform for specialized digital signal processing applications.

The objective of this work was to obtain an area- and time-efficient architecture that could be used as a coprocessor.
The cost of this solution is in resolution and in accuracy. However, this loss of accuracy is relative, as the resolution of a denormalized number depends on the position of its leading 1.

Fig. 2. Radix-2 Decimation In Time Butterfly

V. FFT COPROCESSOR IMPLEMENTATION
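The remark above about denormalized numbers can be checked numerically. The following Python sketch (illustrative only, independent of the hardware design) shows that in IEEE 754 single precision all denormals share the fixed step 2^-149, so the number of significant bits shrinks as the leading 1 moves down the mantissa field:

```python
import struct

def f32_bits(x):
    """Bit pattern of x after rounding to IEEE 754 single precision."""
    return struct.unpack('<I', struct.pack('<f', x))[0]

# Smallest positive *normal* single: 2**-126 (biased exponent 1, mantissa 0).
assert f32_bits(2.0 ** -126) == 0x00800000

# Denormalized singles have biased exponent 0 and value mantissa * 2**-149.
# For 2**-140 the stored mantissa is 2**9, i.e. the leading 1 sits at bit 9,
# leaving about 10 significant bits instead of the 24 of a normal number.
assert f32_bits(2.0 ** -140) == 1 << 9

# The smallest denormal, 2**-149, has a single significant bit: adjacent
# denormals are always 2**-149 apart, so resolution depends on the leading 1.
assert f32_bits(2.0 ** -149) == 1
```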
We exploit the FPGA's parallelism in the FFT computation in two ways: pipelined floating point arithmetic units (a multiplier and an adder/subtractor) and parallelism within the stages of the FFT.

A. Floating Point Hardware Considerations

In any operation on floating point numbers, the first task is to analyze the exponents and mantissas of the operands in order to determine the number type. Conversely, the last task of the operation is to compose the sign, exponent, and mantissa of the result into a number. Therefore, our floating point units, the multiplier and the adder/subtractor, are composed of three stages: the pre-processing stage, the calculation stage, and the post-processing stage, as illustrated in Figure 3.

B. Floating Point Adder/Subtractor

In the pre-processing stage, the unit determines which input number is bigger by comparing the exponents, and then aligns the mantissa of the smaller number by the exponent difference. In the calculation stage, a fixed-point adder/subtractor, which presents no particular complexity, computes the mantissa of the result. The sign of the result is determined in the pre-processing stage from the input signs, from whether the operation is an addition or a subtraction, and from which operand is bigger. The exponent of the result, which equals the exponent of the bigger operand, is adjusted during post-processing if the mantissa of the result produces a carry (addition) or a cancellation of its most significant bits (subtraction).
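The three-stage behavior just described can be sketched in software. The following Python model is our illustrative reconstruction, not the authors' RTL: it handles normalized single-precision values only, and deliberately omits rounding, denormals, infinities, and NaNs.

```python
def fp32_fields(bits):
    """Split a single-precision bit pattern into (sign, exponent, mantissa)."""
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def fp32_add(a_bits, b_bits):
    """Add two normalized IEEE 754 singles given as 32-bit patterns."""
    sa, ea, ma = fp32_fields(a_bits)
    sb, eb, mb = fp32_fields(b_bits)

    # --- pre-processing: restore the hidden 1, pick the bigger operand,
    # and align the smaller mantissa by the exponent difference.
    ma |= 1 << 23
    mb |= 1 << 23
    if (ea, ma) < (eb, mb):
        (sa, ea, ma), (sb, eb, mb) = (sb, eb, mb), (sa, ea, ma)
    mb >>= min(ea - eb, 24)

    # --- calculation: fixed-point add or subtract of the mantissas;
    # the result takes the sign of the bigger operand.
    m = ma + mb if sa == sb else ma - mb
    s, e = sa, ea
    if m == 0:
        return 0

    # --- post-processing: adjust the exponent for a carry (addition)
    # or for cancelled most significant bits (subtraction).
    while m >= 1 << 24:      # carry out: shift right, bump exponent
        m >>= 1
        e += 1
    while m < 1 << 23:       # leading-bit cancellation: shift left
        m <<= 1
        e -= 1
    return (s << 31) | (e << 23) | (m & 0x7FFFFF)
```

Each commented region corresponds to one of the three pipeline stages, so a hardware version can register the intermediate values at the stage boundaries.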