
SEMINAR REPORT

FLOATING POINT IMPLEMENTATION ON FIXED POINT PROCESSOR


Supervisor: Prof. M. C. Chandorkar
By: P. D. Aneesh (Roll No. 113079017)

Why do we need such an implementation?


Pros:
- Fixed-point processors are cheaper and easily available.
- Their accuracy can be made equivalent to that of a floating-point processor.
- For the same word length, the resolution of a fixed-point processor is better than that of a floating-point processor.

Cons:
- The resolution of floating-point processors increases more rapidly with an increase in dynamic range.
- For the same algorithm, a floating-point processor is 1.1 to 4.3 times faster than a fixed-point processor in terms of clock cycles.
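The resolution claim for equal word lengths can be checked with a little arithmetic. The sketch below assumes a 32-bit word, a Q1.31 fixed-point format, and IEEE 754 single precision; these formats are chosen only for illustration and do not come from the report.

```python
# Assumed formats (not from the report): Q1.31 fixed point vs IEEE 754 binary32.
WORD_BITS = 32

# Q1.31: 1 sign bit and 31 fractional bits, values in [-1, 1).
# The step between adjacent values is constant over the whole range.
fixed_step = 2.0 ** -(WORD_BITS - 1)

# binary32: 24-bit significand (23 stored bits plus 1 implicit bit).
# For values in [0.5, 1) the step between adjacent values is 2^-24.
float_step_near_one = 2.0 ** -24

# Ratio of the two step sizes near full scale.
step_ratio = float_step_near_one / fixed_step
print(fixed_step, float_step_near_one, step_ratio)
```

Near full scale the fixed-point step is finer, because the floating-point word spends 8 of its 32 bits on the exponent. The price is dynamic range: the fixed-point step does not shrink for small values.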

Some Important Terms


Binary Floating Point (BFP): Representing floating-point numbers with a binary significand and a power-of-two exponent.
Decimal Floating Point (DFP): Representing floating-point numbers with a decimal significand and a power-of-ten exponent.
Binary Integer Decimal (BID): A DFP encoding in which the whole significand is stored as an unsigned binary integer (unlike BCD, where each digit is represented by 4 bits).
Densely Packed Decimal (DPD): A DFP encoding in which every 3 decimal digits are represented with 10 bits.
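To make the difference between the encodings concrete, the following sketch counts the bits needed to store a decimal coefficient digit by digit (4 bits per digit), in 10-bit groups of 3 digits, and as one binary integer. The function names are hypothetical, and the sketch only counts coefficient bits; it ignores the sign, exponent, and combination fields of a real DFP format.

```python
def bcd_bits(coefficient):
    """Bits needed when every decimal digit takes 4 bits."""
    return 4 * len(str(coefficient))

def dpd_bits(coefficient):
    """Bits needed when every group of 3 digits takes 10 bits.

    Assumes the digit count is padded up to a multiple of 3.
    """
    groups = -(-len(str(coefficient)) // 3)  # ceiling division
    return 10 * groups

def bid_bits(coefficient):
    """Bits needed when the coefficient is one unsigned binary integer."""
    return coefficient.bit_length()

c = 999_999_999_999_999  # largest 15-digit coefficient
print(bcd_bits(c), dpd_bits(c), bid_bits(c))
```

For a 15-digit coefficient the 3-digits-in-10-bits packing and the binary-integer form need the same number of bits, and both are more compact than 4 bits per digit.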

DFP MULTIPLICATION
BFP introduces rounding errors that can accumulate (for example, 0.1 has no exact binary representation). Thus, DFP is preferred over BFP.
The IEEE standard representation of a decimal floating-point number is value = (-1)^sign x significand x 10^exponent.
- Take two 64-bit operands and extract the corresponding triples (i.e. sign, exponent and significand).
- Multiply the significand values and add the exponent values to get the intermediate results IPc and IPexp.
- Check whether the result fits within the required precision. Otherwise, d digits of IPc are rounded off, where d = max(digits(IPc) - p, 0) and p is the required precision.
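The multiplication steps can be sketched on (sign, exponent, coefficient) triples as follows. This is a minimal illustration, not the hardware design discussed in the seminar: the function names are hypothetical, the operands are assumed to be already decoded, and round-half-even is assumed because the rounding mode is not stated above.

```python
def digits(n):
    """Number of decimal digits in a non-negative integer."""
    return len(str(n))

def dfp_multiply(a, b, p=16):
    """Multiply two (sign, exponent, coefficient) triples.

    p is the required precision in digits (16 for a 64-bit DFP format).
    Rounding mode is an assumption: round half to even.
    """
    a_sign, a_exp, a_c = a
    b_sign, b_exp, b_c = b

    ip_sign = a_sign ^ b_sign          # sign of the product
    ip_exp = a_exp + b_exp             # add the exponents
    ip_c = a_c * b_c                   # multiply the significands

    d = max(digits(ip_c) - p, 0)       # digits to round off
    if d:
        q, r = divmod(ip_c, 10 ** d)
        half = 10 ** d // 2
        if r > half or (r == half and q % 2 == 1):
            q += 1
        ip_c, ip_exp = q, ip_exp + d
        if digits(ip_c) > p:           # rounding carried into a new digit
            ip_c //= 10
            ip_exp += 1
    return ip_sign, ip_exp, ip_c

# 1.25 x (-4) = -5.00
print(dfp_multiply((0, -2, 125), (1, 0, 4)))   # (1, -2, 500)
```

With p=3, multiplying 9999 by 9999 gives the 8-digit product 99980001, so d = 5 digits are rounded off and the result is 100 x 10^6.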

- Calculate the leading-one position (LOP) of each significand using a leading-one detector (LOD).
- The product of Ac and Bc lies between 2^k and 2^(k+2) - 1, where k = Alop + Blop; this decides the number of digits in IPc.
- The rounded intermediate result is produced in carry-save form.
- The final result is generated using a carry-propagate adder.
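The bound on the product follows from 2^Alop <= Ac < 2^(Alop+1) and the same inequality for Bc. A small software check, with hypothetical function names and a bit-length call standing in for the hardware leading-one detector:

```python
def lop(n):
    """Leading-one position of a positive integer (bit 0 is the LSB)."""
    return n.bit_length() - 1

def product_bounds(a_c, b_c):
    """Bounds on a_c * b_c derived only from the leading-one positions."""
    k = lop(a_c) + lop(b_c)
    return 2 ** k, 2 ** (k + 2) - 1

low, high = product_bounds(12345, 6789)
print(low <= 12345 * 6789 <= high)
```

Because the bounds are known before the multiplication finishes, the digit count of IPc can be estimated in parallel with the multiplier.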

DFP DIVISION
This method performs decimal floating-point division using Newton-Raphson iteration.
- A Taylor series expansion is used for the initial approximation, so both Y and X are normalised to the range 0.1 to 1, with Y < X.
- For Q = Y/X, first obtain an initial approximation of the divisor's reciprocal, i.e. R0 ≈ 1/X.
- The n-digit divisor is split into a k-digit more significant part (XM) and an (n-k)-digit less significant part (XL).
- With the new input interval of size k, the approximation of 1/X is obtained as a function of XM and XL.

DFP DIVISION (contd.)


- Now use R(i+1) = Ri * (2 - X * Ri) to find the m-th iterate, such that the approximation error is insignificant (i.e. less than 10^(-n-2)).
- Find Q by multiplying the m-th iterate by Y.
- Truncate the result Q to n+1 digits.
- Determine the rounding scheme from 1) the sign of the remainder N = Y - QX and 2) the n-th and (n+1)-th digits of Q.
- According to the rounding scheme, the final output is Q, Q + 10^-n, or Q - 10^-n.
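The iteration R(i+1) = Ri * (2 - X * Ri) can be tried out in software with Python's decimal module. This is only a sketch under stated assumptions: the Taylor-series initial approximation is replaced by a crude k-digit reciprocal of the leading digits of X, the residual 1 - X*R is used as a proxy for the approximation error, and the final rounding selection is left out because its decision table is not given here. The names nr_divide and unit are hypothetical.

```python
from decimal import Decimal, localcontext, ROUND_DOWN

def unit(digits):
    """Return 10^-digits as a Decimal, e.g. unit(2) is 0.01."""
    return Decimal(1).scaleb(-digits)

def nr_divide(y, x, n=16, k=2):
    """Approximate Q = y/x for 0.1 <= y < x < 1.

    Returns Q truncated to n+1 digits and the remainder N = y - Q*x.
    """
    with localcontext() as ctx:
        ctx.prec = 2 * n + 10                     # generous working precision
        y, x = Decimal(y), Decimal(x)

        # Crude stand-in for the initial approximation (assumption):
        # reciprocal of the k leading digits of x, truncated to k digits.
        x_m = x.quantize(unit(k), rounding=ROUND_DOWN)
        r = (1 / x_m).quantize(unit(k), rounding=ROUND_DOWN)

        # Newton-Raphson refinement of the reciprocal.
        tol = unit(n + 2)
        while abs(1 - x * r) >= tol:
            r = r * (2 - x * r)

        q = (y * r).quantize(unit(n + 1), rounding=ROUND_DOWN)
        remainder = y - q * x                     # its sign guides rounding
    return q, remainder

q, rem = nr_divide("0.3", "0.7")
print(q, rem)
```

The residual is squared on every pass, so the number of correct digits roughly doubles per iteration; starting from two digits, a handful of iterations is enough for n = 16.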

DFP ADDER/SUBTRACTOR
BID-encoding-based 64-bit DFP addition and subtraction is discussed here. It is well suited for hardware implementation.
- First, extract the sign, exponent and significand of each operand.
- If A is less than B, swap the operands.
- Align one operand with respect to the other by shifting it by d digits, where d = Aexp - Bexp.
- The significands are added to get the intermediate result ZIC.
- If OP = operation = { 0 (for add) and 1 (for subtract) }, then the sign and the effective operation are computed in parallel as [Sign = sign(Aexchng) xor ZICSign] and [EOP = OP xor Asign xor Bsign].
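The add/subtract flow can be sketched on decoded (sign, exponent, coefficient) triples. Several points are assumptions made for illustration: the swap is decided by comparing exponents, the subtract operation is folded into the sign of B before the swap, alignment is done with unbounded Python integers instead of a fixed-width datapath, and no rounding to 16 digits is performed. The function name is hypothetical.

```python
def dfp_add_sub(a, b, op):
    """a, b: (sign, exponent, coefficient); op: 0 = add, 1 = subtract."""
    a_sign, a_exp, a_c = a
    b_sign, b_exp, b_c = b

    # Effective operation: EOP = OP xor Asign xor Bsign.
    eop = op ^ a_sign ^ b_sign

    # Fold the operation into B's sign so that a swap stays correct.
    b_sign ^= op

    # Swap so that A has the larger exponent (assumed swap criterion).
    if a_exp < b_exp:
        a_sign, a_exp, a_c, b_sign, b_exp, b_c = (
            b_sign, b_exp, b_c, a_sign, a_exp, a_c)

    # Align A to B's exponent by shifting it d decimal digits.
    d = a_exp - b_exp
    a_c *= 10 ** d

    # Intermediate result ZIC and its sign.
    zic = a_c + b_c if eop == 0 else a_c - b_c
    zic_sign = 1 if zic < 0 else 0

    # Sign = sign(A after exchange) xor ZICSign.
    sign = a_sign ^ zic_sign
    return sign, b_exp, abs(zic)

print(dfp_add_sub((0, -2, 125), (0, -1, 5), 0))  # 1.25 + 0.5 -> (0, -2, 175)
print(dfp_add_sub((0, -1, 5), (0, -2, 125), 1))  # 0.5 - 1.25 -> (1, -2, 75)
```

EOP does not depend on the swap, which is why it can be computed in parallel with the exponent comparison.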

DFP ADDER (contd)


- Using a binary leading-one detector and a LUT, determine the number of digits in ZIC (Qi) and thus the window size for rounding.
- This method has a maximum error of 1 digit, which can be removed by another LUT holding pre-calculated powers of 10.
- Number of digits to round = max(Qi - 16, 0).
- Except for some infrequent cases that need a second rounding, the latency of the method is 7 cycles.
- Performance can be improved by allowing the result to go directly to the output if it fits within the precision range. This can be implemented using a simple multiplexer and comparator combination.
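The two-table digit count can be sketched as follows. The first table maps the leading-one position to a digit estimate and the second holds powers of 10 to correct the possible one-digit error. The table sizes, the 1233/4096 approximation of log10(2) used to fill the first table, and all names are assumptions for illustration.

```python
MAX_BITS = 115  # assumed limit; enough for the sum or product of 16-digit values

# LUT 1: digit estimate indexed by leading-one position.
# (lop + 1) * 1233 >> 12 approximates (lop + 1) * log10(2), rounded down.
DIGIT_LUT = [((lop + 1) * 1233) >> 12 for lop in range(MAX_BITS)]

# LUT 2: pre-calculated powers of 10 for the one-digit correction.
POW10_LUT = [10 ** i for i in range(36)]

def digit_count(zic):
    """Number of decimal digits (Qi) in a positive integer zic."""
    lop = zic.bit_length() - 1          # software stand-in for the LOD
    estimate = DIGIT_LUT[lop]           # may be one digit too low
    if zic >= POW10_LUT[estimate]:      # compare against a power of 10
        estimate += 1
    return estimate

def digits_to_round(zic, precision=16):
    """Number of digits to round = max(Qi - 16, 0)."""
    return max(digit_count(zic) - precision, 0)

print(digit_count(999), digit_count(1000), digits_to_round(10 ** 17))
```

If digits_to_round returns 0, the result already fits the precision and can bypass the rounding stage, which is the multiplexer and comparator shortcut mentioned above.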

FUTURE SCOPE
Fixed-point processors with word lengths of 128 bits or more could make this approach viable for wide-dynamic-range applications such as radar target acquisition and identification. Because it can provide an equivalent level of accuracy, it can also be used for computation-intensive tasks such as FFT computation, ECG analysis, robotics and image recognition.

The low cost and easy availability of fixed-point processors make them a popular choice for general-purpose and commercial applications.

Thank you !!!!!
