
VLSI Arithmetic

Lecture 5
Prof. Vojin G. Oklobdzija University of California
http://www.ece.ucdavis.edu/acsel

Review
Lecture 4

Ling's Adder
Huey Ling, "High-Speed Binary Adder", IBM Journal of Research and Development, Vol. 25, No. 3, 1981.

Used in: IBM 3033, IBM S370/168, Amdahl V6, HP etc.

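Ling's Derivations
define:

$H_{i+1} = C_{i+1} + C_i$

with $g_i = a_i b_i$, $p_i = a_i \oplus b_i$, $t_i = a_i + b_i$:

a_i  b_i  p_i  g_i  t_i
 0    0    0    0    0
 0    1    1    0    1
 1    0    1    0    1
 1    1    0    1    1

$C_{i+1} = g_i + p_i C_i$

$g_i$ implies $C_{i+1}$, which implies $H_{i+1}$, thus: $g_i = g_i H_{i+1}$

$p_i C_i = p_i C_i + p_i g_i + p_i p_i C_i = p_i C_i + p_i C_{i+1} = p_i H_{i+1}$

$C_{i+1} = g_i + p_i C_i = g_i H_{i+1} + p_i C_i = g_i H_{i+1} + p_i H_{i+1} = t_i H_{i+1}$

[Figure: adder cell with inputs a_i, b_i, carry-in c_i, carry-out c_{i+1}, sum s_i]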
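The three identities on this slide involve only $a_i$, $b_i$ and $C_i$, so they can be checked exhaustively. The Python sketch below is not part of the lecture; the function name is an illustrative choice, and $p_i$ is taken in its XOR form, which the step $p_i g_i = 0$ requires.

```python
from itertools import product

def ling_bit_identities_hold() -> bool:
    """Check the single-bit Ling identities for every (a_i, b_i, C_i)."""
    for a, b, c in product((0, 1), repeat=3):
        g = a & b                 # generate
        p = a ^ b                 # propagate (XOR form, assumed)
        t = a | b                 # transfer (OR form)
        c_next = g | (p & c)      # C_{i+1} = g_i + p_i C_i
        h_next = c_next | c       # H_{i+1} = C_{i+1} + C_i
        if g != (g & h_next):             # g_i = g_i H_{i+1}
            return False
        if (p & c) != (p & h_next):       # p_i C_i = p_i H_{i+1}
            return False
        if c_next != (t & h_next):        # C_{i+1} = t_i H_{i+1}
            return False
    return True

print(ling_bit_identities_hold())
```

Eight input combinations cover every case, so a passing check here is a complete proof at the bit level.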
Oklobdzija 2004 Computer Arithmetic


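Ling's Derivations
From:

$H_{i+1} = C_{i+1} + C_i$ and $C_{i+1} = g_i + p_i C_i$

$H_{i+1} = C_{i+1} + C_i = g_i + p_i C_i + C_i = g_i + C_i$

$H_{i+1} = g_i + t_{i-1} H_i$ (fundamental expansion)

because: $C_{i+1} = t_i H_{i+1}$, hence $C_i = t_{i-1} H_i$

Now we need to derive the Sum equation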
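As a sanity check on the fundamental expansion, the sketch below builds ordinary ripple carries for random operands, forms $H_{i+1} = C_{i+1} + C_i$ from them, and compares against $g_i + t_{i-1} H_i$. The function name, word length and trial count are illustrative assumptions. Bit 0 is skipped because $t_{-1}$ and $H_0$ depend on how the carry-in is modelled.

```python
import random

def check_ling_recurrence(n: int = 16, trials: int = 1000, seed: int = 0) -> bool:
    """Verify H_{i+1} = g_i + t_{i-1} H_i for i = 1 .. n-1 on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = [rng.randint(0, 1) for _ in range(n)]
        b = [rng.randint(0, 1) for _ in range(n)]
        c = [rng.randint(0, 1)]                  # c[0] is the carry-in C_0
        for i in range(n):
            g, p = a[i] & b[i], a[i] ^ b[i]
            c.append(g | (p & c[i]))             # C_{i+1} = g_i + p_i C_i
        # h[i + 1] holds H_{i+1} = C_{i+1} + C_i; h[0] is unused.
        h = [None] + [c[i + 1] | c[i] for i in range(n)]
        for i in range(1, n):
            g_i = a[i] & b[i]
            t_prev = a[i - 1] | b[i - 1]
            if h[i + 1] != (g_i | (t_prev & h[i])):
                return False
    return True

print(check_ling_recurrence())
```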


Ling Adder
Variation of CLA:

$p_i = a_i \oplus b_i$, $g_i = a_i b_i$
$C_{i+1} = g_i + p_i C_i$
$S_i = p_i \oplus C_i$

Ling's equations:

$t_i = a_i + b_i$, $g_i = a_i b_i$
$H_{i+1} = g_i + t_{i-1} H_i$
$S_i = (t_i \oplus H_{i+1}) + g_i t_{i-1} H_i$

Ling, IBM J. Res. Dev, 5/81
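To see Ling's equations working together, here is a minimal bit-serial Python sketch of an n-bit adder driven only by $g_i$, $t_i$ and the pseudo-carry $H$. It is an illustration, not Ling's circuit: `ling_add` is a hypothetical name, and the boundary term $t_{-1} H_0$ is assumed to stand for the carry-in.

```python
def ling_add(a: int, b: int, n: int = 8, cin: int = 0) -> int:
    """Add two n-bit integers with Ling's recurrence; returns the (n+1)-bit result."""
    th_prev = cin                 # t_{i-1} H_i, which equals C_i (assumed = cin at i = 0)
    total = 0
    for i in range(n):
        g = (a >> i) & (b >> i) & 1
        t = ((a >> i) | (b >> i)) & 1
        h_next = g | th_prev                      # H_{i+1} = g_i + t_{i-1} H_i
        s = (t ^ h_next) | (g & th_prev)          # S_i = (t_i xor H_{i+1}) + g_i t_{i-1} H_i
        total |= s << i
        th_prev = t & h_next                      # C_{i+1} = t_i H_{i+1}
    return total | (th_prev << n)                 # carry-out becomes bit n

print(ling_add(200, 100))
```

A real Ling adder evaluates $H$ with lookahead rather than serially; the loop only confirms that the equations are mutually consistent.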


Ling Adder
Variation of CLA:

$C_{i+1} = g_i + g_i C_i + p_i C_i = g_i + (g_i + p_i) C_i$
$C_{i+1} = g_i + t_i C_i$

Ling's equation:

$H_{i+1} = g_i + t_{i-1} H_i$

[Figure: two adjacent bit cells. Cell i: inputs a_i, b_i, signals g_i, t_i, carries c_{i+1}, c_i, Ling carries H_{i+1}, H_i, sum s_i. Cell i-1: inputs a_{i-1}, b_{i-1}, signals g_{i-1}, t_{i-1}, carry c_{i-1}, sum s_{i-1}]

Ling uses a different transfer function. Four such functions have the desired properties (Ling's is one of them).

see: Doran, IEEE Trans. on Comp., Vol. 37, No. 9, Sept. 1988.

Ling Adder
Conventional:

$C_4 = g_3 + t_3 g_2 + t_3 t_2 g_1 + t_3 t_2 t_1 g_0 + t_3 t_2 t_1 t_0 C_{in}$

Fan-in of 5

Ling:

$H_4 = g_3 + t_2 g_2 + t_2 t_1 g_1 + t_2 t_1 t_0 g_0 + t_2 t_1 t_0 t_{-1} C_{in}$
$H_4 = g_3 + g_2 + t_2 g_1 + t_2 t_1 g_0 + t_2 t_1 t_0 C_{in}$

Fan-in of 4
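The two flattened expressions can be compared directly against a reference carry-out. The Python sketch below (function name illustrative) enumerates all 4-bit operand pairs and both carry-in values, and checks that the conventional $C_4$ and Ling's $t_3 H_4$ agree with the true carry. The widest product term has five literals in $C_4$ and four in $H_4$.

```python
from itertools import product

def check_four_bit_ling() -> bool:
    """Compare the flattened C4 and H4 expressions with a reference carry-out."""
    for bits in product((0, 1), repeat=9):
        a, b, cin = bits[0:4], bits[4:8], bits[8]   # index 0 is the least significant bit
        g = [x & y for x, y in zip(a, b)]
        t = [x | y for x, y in zip(a, b)]
        # Conventional lookahead carry: widest term has 5 literals.
        c4 = (g[3] | t[3] & g[2] | t[3] & t[2] & g[1]
              | t[3] & t[2] & t[1] & g[0]
              | t[3] & t[2] & t[1] & t[0] & cin)
        # Ling pseudo-carry: widest term has 4 literals.
        h4 = (g[3] | g[2] | t[2] & g[1] | t[2] & t[1] & g[0]
              | t[2] & t[1] & t[0] & cin)
        value = sum((x + y) << i for i, (x, y) in enumerate(zip(a, b))) + cin
        ref = value >> 4                             # true carry out of bit 3
        if c4 != ref or (t[3] & h4) != ref:          # C4 = t3 H4
            return False
    return True

print(check_four_bit_ling())
```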


Advantages of Ling's Adder

Uniform loading in fan-in and fan-out.
H16 contains 8 terms, compared with 15 in G16.
H16 can be implemented with one level of logic (in ECL), while G16 cannot (with 8-way wired-OR).
(Ling's adder takes full advantage of wired-OR, of special importance when ECL technology is used; his IBM limitation was a fan-in of 4 and a wired-OR of 8.)

Ling: Weinberger Notes


Advantage of Ling's Adder

32-bit adder used in: IBM 3033, IBM S370/Model 168, Amdahl V6.
Implements 32-bit addition in 3 levels of logic.
Implements 32-bit AGEN (B + Index + Disp) in 4 levels of logic (rather than 6).
5 levels of logic for the 64-bit adder used in an HP processor.

Implementation of Ling's Adder in CMOS

(S. Naffziger, "A Subnanosecond 64-b Adder", ISSCC '96)

S. Naffziger, ISSCC '96

$H_4 = g_3 + g_2 + t_2 g_1 + t_2 t_1 g_0$

$C_{i+1} = t_i H_{i+1}$

$C_{16} = p_{15} H_{16} = p_{15} (g_{15} + g_{11} + t_{11} g_7 + t_{11} t_7 g_0)$

(here $p_{15}$ plays the role of the transfer term $t_{15}$ in $C_{i+1} = t_i H_{i+1}$)

Ling Adder Critical Path


Ling Adder: Circuits

[Figure: circuit schematics. Labeled signals: CK; A0-A3, B0-B3; G0-G4; P1, P2, P4; G, P, K; LC, LCH, LCL; C0H, C0L, C1H, C1L; SumH, SumL]

LCS4 Critical G Path

[Figure: critical path. Labels: in1; 4b (k,p) or (g,p); P4, G3, G4; 12b; C15; 32b; C31, C47; 16b; S48, S62, S63]

LCS4 Logical Effort Delay

Prefix-4 Ling/Conditional-Sum (Dynamic - Long Carry Path)

Stages          Branch   LE     Parasitic
dg3# (dg3)       4.0     0.98   2.97
g4 (NAND2)       2.0     1.11   1.84
C15# (GG4)       1.0     1.01   1.80
C15 (INV)        1.0     1.00   1.00
C47# (LC)        3.0     1.03   3.32
C47 (INV)        1.0     1.00   1.00
C47#b (INV)      1.0     1.00   1.00
C47b (INV)       1.0     1.00   1.00
S63# (SUM)      16.0     0.86   1.36
S63 (INV)        1.0     1.00   1.00

Total Branch: 3.84E+02
Total LE: 9.73E-01
Total Path Effort: 3.74E+02
fo, opt: 1.81

Effort Delay (ps): 66
Parasitic Delay (ps): 70
Total Delay (ps): 136
Total Delay (FO4): 7.2
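The totals on this slide follow from the per-stage columns by the usual logical effort bookkeeping. The Python sketch below recomputes them from the Branch and LE values listed for the ten stages. It assumes, as the slide's own totals suggest, that the path effort is simply the product of total branching and total logical effort (a path electrical effort of 1), and that the optimal stage effort is the N-th root of the path effort. The names `BRANCH`, `LE` and `path_summary` are illustrative.

```python
import math

# Per-stage values copied from the slide (10 stages on the long carry path).
BRANCH = [4.0, 2.0, 1.0, 1.0, 3.0, 1.0, 1.0, 1.0, 16.0, 1.0]
LE = [0.98, 1.11, 1.01, 1.00, 1.03, 1.00, 1.00, 1.00, 0.86, 1.00]

def path_summary(branch, le):
    """Return (total branch, total LE, path effort, optimal stage effort)."""
    total_branch = math.prod(branch)
    total_le = math.prod(le)
    # Assumption: electrical effort of the path is 1, so F = B * G.
    path_effort = total_branch * total_le
    stage_effort = path_effort ** (1.0 / len(branch))
    return total_branch, total_le, path_effort, stage_effort

b, g, f, fo = path_summary(BRANCH, LE)
print(f"{b:.2E} {g:.2E} {f:.2E} {fo:.2f}")
```

The picosecond delays on the slide additionally depend on a technology time constant that is not given here, so they are not recomputed.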


Results:
0.5 µm technology
Speed: 0.930 ns
Nominal process, 80 °C, V = 3.3 V

See: S. Naffziger, "A Subnanosecond 64-b Adder", ISSCC '96

Prefix Adders and Parallel Prefix Adders

from: Ercegovac-Lang

Prefix Adders
The following recurrence operation is defined:

$(g, p) \circ (g', p') = (g + p g', p p')$

such that:

$(G_i, P_i) = (g_0, p_0)$ for $i = 0$
$(G_i, P_i) = (g_i, p_i) \circ (G_{i-1}, P_{i-1})$ for $1 \le i \le n$

$c_{i+1} = G_i$ for $i = 0, 1, \ldots, n$

$c_1 = g_0 + p_0 c_{in}$

$(g_{-1}, p_{-1}) = (c_{in}, c_{in})$

This operation is associative, but not commutative.
It can also span a range of bits (overlapping and adjacent).
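A minimal Python sketch of the operator and the serial recurrence follows. The names `op` and `prefix_carries` are illustrative, $p_i$ is taken in XOR form, and the carry-in is folded in through the $(g_{-1}, p_{-1}) = (c_{in}, c_{in})$ convention. Associativity is checked exhaustively over all (g, p) pairs.

```python
from itertools import product

def op(left, right):
    """(g, p) o (g', p') = (g + p g', p p'); `left` is the more significant operand."""
    g, p = left
    g2, p2 = right
    return (g | (p & g2), p & p2)

def prefix_carries(a: int, b: int, n: int, cin: int = 0):
    """Return [c_1, ..., c_n] using the serial recurrence c_{i+1} = G_i."""
    acc = (cin, cin)                      # (g_{-1}, p_{-1}) = (c_in, c_in)
    carries = []
    for i in range(n):
        g_i = (a >> i) & (b >> i) & 1
        p_i = ((a >> i) ^ (b >> i)) & 1
        acc = op((g_i, p_i), acc)         # (G_i, P_i) = (g_i, p_i) o (G_{i-1}, P_{i-1})
        carries.append(acc[0])
    return carries

pairs = list(product((0, 1), repeat=2))
associative = all(op(op(x, y), z) == op(x, op(y, z))
                  for x in pairs for y in pairs for z in pairs)
# One counterexample is enough to show the operator is not commutative.
commutes_here = op((1, 0), (0, 0)) == op((0, 0), (1, 0))
print(associative, commutes_here)
```

Associativity is what lets a parallel prefix network regroup the serial chain into a tree; the lack of commutativity is why operand order (more significant on the left) must be preserved.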

Parallel Prefix Adders: variety of possibilities


from: Ercegovac-Lang


Pyramid Adder:
M. Lehman, A Comparative Study of Propagation Speed-up Circuits in Binary Arithmetic Units, IFIP Congress, Munich, Germany, 1962.



Hybrid BK-KS Adder


Parallel Prefix Adders: S. Knowles 1999

operation is associative: h>ijk

operation is idempotent: h>ijk

produces carry: cin=0
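A minimal Python sketch of what idempotency buys, under a stated assumption: it is taken here to mean that two overlapping bit spans, h..j and i..k with h > i >= j >= k, combine to the same (G, P) as the full span h..k. The helper names `op` and `span` are illustrative, and the propagate signal is taken in XOR form.

```python
import random
from functools import reduce

def op(left, right):
    """Prefix operator: (g, p) o (g', p') = (g + p g', p p')."""
    g, p = left
    g2, p2 = right
    return (g | (p & g2), p & p2)

def span(gp, hi, lo):
    """(G, P) over bits hi..lo inclusive, folded from the most significant bit down."""
    return reduce(op, (gp[i] for i in range(hi, lo - 1, -1)))

def overlapping_spans_agree(n: int = 8, trials: int = 200, seed: int = 1) -> bool:
    """Check span(h..j) o span(i..k) == span(h..k) for all h > i >= j >= k."""
    rng = random.Random(seed)
    for _ in range(trials):
        gp = []
        for _bit in range(n):
            a, b = rng.randint(0, 1), rng.randint(0, 1)
            gp.append((a & b, a ^ b))
        for h in range(n):
            for i in range(h):              # i < h
                for j in range(i + 1):      # j <= i, so the spans overlap on j..i
                    for k in range(j + 1):  # k <= j
                        if op(span(gp, h, j), span(gp, i, k)) != span(gp, h, k):
                            return False
    return True

print(overlapping_spans_agree())
```

Because bits j..i may be counted twice without changing the result, a network is free to combine spans that overlap, which is the freedom Kogge-Stone style structures use to reduce fan-out.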


Parallel Prefix Adders: Ladner-Fischer

Exploits associativity, but not idempotency.
Produces minimal logical depth.

Parallel Prefix Adders: Ladner-Fischer

(16,8,4,2,1)

Two wires at each level.
Uniform, fan-in of two.
Large fan-out (of 16; n/2).
Large capacitive loading combined with long wires (in the last stages).
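The fan-out figures can be reproduced with a small model of the minimal-depth structure. The Python sketch below is an assumption-laden illustration: it uses the divide-and-conquer form in which, at level j, the top bit of each lower 2^j-bit half drives every bit of the upper half. Function names are illustrative and carry-in is taken as zero.

```python
def ladner_fischer_levels(n: int):
    """Per level j, map each driving bit to the list of bits it feeds laterally."""
    levels = []
    j = 0
    while (1 << j) < n:
        fanout = {}
        for i in range(n):
            if (i >> j) & 1:                      # bit i lies in an upper half-block
                src = ((i >> j) << j) - 1         # top bit of the lower half-block
                fanout.setdefault(src, []).append(i)
        levels.append(fanout)
        j += 1
    return levels

def max_lateral_fanout(n: int):
    """Largest lateral fan-out at each level, first level first."""
    return [max(len(dsts) for dsts in level.values())
            for level in ladner_fischer_levels(n)]

def lf_carries(a: int, b: int, n: int):
    """Carries c_1..c_n (carry-in 0) computed by walking the network level by level."""
    gp = [((a >> i) & (b >> i) & 1, ((a >> i) ^ (b >> i)) & 1) for i in range(n)]
    for level in ladner_fischer_levels(n):
        nxt = list(gp)
        for src, dsts in level.items():
            g2, p2 = gp[src]
            for d in dsts:
                g, p = gp[d]
                nxt[d] = (g | (p & g2), p & p2)
        gp = nxt
    return [g for g, _ in gp]

print(max_lateral_fanout(32))
```

For 32 bits the per-level maxima double at each level up to n/2 = 16 in the last level, which is the slide's (16,8,4,2,1) if that tuple is read with the last level first.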

Parallel Prefix Adders: Kogge-Stone


Exploits idempotency to limit the fan-out to 1.
Dramatic increase in wires.
The wire span remains the same as in Ladner-Fischer.
Buffers needed in both cases: K-S, L-F.
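A compact way to see the Kogge-Stone pattern is to model every bit position of a level at once with word-wide operations. In the Python sketch below, each level combines bit i with the bit 2^j positions to its right, so every node has exactly one lateral source and one lateral destination. The function name is illustrative and the carry-in is assumed to be zero.

```python
def kogge_stone_add(a: int, b: int, n: int = 32):
    """Return (sum mod 2^n, carry_out) using log2(n) Kogge-Stone levels."""
    mask = (1 << n) - 1
    g = a & b & mask            # bit-level generate
    p = (a ^ b) & mask          # bit-level propagate (XOR form)
    p0 = p                      # keep the original propagate for the sum
    d = 1
    while d < n:
        # Every bit i combines with bit i - d: (g, p) o (g', p') = (g + p g', p p').
        g = g | (p & (g << d))
        p = p & (p << d)
        d <<= 1
    carry_out = (g >> (n - 1)) & 1
    total = (p0 ^ (g << 1)) & mask      # s_i = p_i xor c_i, with c_i = G_{i-1}
    return total, carry_out

print(kogge_stone_add(0xFFFFFFFF, 1))
```

The shifts by 1, 2, 4, ... correspond to the lateral wires of each level; in hardware each of those shifts is a full rank of wires, which is where the wiring cost comes from.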


Kogge-Stone Adder


Parallel Prefix Adders: Brent-Kung


Sets the fan-out to one.
Avoids explosion of wires (as in K-S).
Makes no sense in CMOS:
fan-out = 1 limit is arbitrary and extreme;
much of the capacitive load is due to wire (anyway).

It is more efficient to insert buffers in L-F than to use the B-K scheme.
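To make the depth cost concrete, the Python sketch below models a Brent-Kung style prefix tree as an up-sweep followed by a down-sweep and counts the levels used. It is an illustration under stated assumptions: n is a power of two, carry-in is zero, and the function name and list representation are not from the lecture.

```python
def brent_kung_prefix(g, p):
    """g, p: per-bit generate/propagate lists (LSB first). Returns (G list, levels)."""
    n = len(g)
    G, P = list(g), list(p)
    levels = 0
    d = 1
    while d < n:                                   # up-sweep: build block spans
        for i in range(2 * d - 1, n, 2 * d):
            G[i] |= P[i] & G[i - d]
            P[i] &= P[i - d]
        d *= 2
        levels += 1
    d = n // 4
    while d >= 1:                                  # down-sweep: fill in the rest
        for i in range(3 * d - 1, n, 2 * d):
            G[i] |= P[i] & G[i - d]
            P[i] &= P[i - d]
        d //= 2
        levels += 1
    return G, levels

a, b, n = 0b1011001110001111, 0b0110110001110001, 16
g = [(a >> i) & (b >> i) & 1 for i in range(n)]
p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
G, levels = brent_kung_prefix(g, p)
print(levels)
```

For 16 bits this uses 7 levels (2 log2 n - 1), against 4 for a minimal-depth network, which is the depth penalty paid for the low fan-out and sparse wiring.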


Brent-Kung Adder


Parallel Prefix Adders: Han-Carlson


Is a hybrid synthesis of L-F and K-S.
Trades an increase in logic depth for a reduction in fan-out:
effectively a higher-radix variant of K-S;
others do it similarly by serializing the prefix computation at the higher fan-out nodes.

Others similarly trade logical depth for a reduction of fan-out and wire.

Parallel Prefix Adders: variety of possibilities


from: Knowles

bounded by L-F and K-S at ends


Parallel Prefix Adders: variety of possibilities


Knowles 1999

The following rules are used:
Lateral wires at the jth level span $2^j$ bits.
Lateral fan-out at the jth level is a power of 2, up to $2^j$.
Lateral fan-out at the jth level cannot exceed that at the (j+1)th level.
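The fan-out rules translate directly into a validity check on a fan-out vector. The Python sketch below assumes that a bracketed vector such as [4,4,2,2,1] lists the last level first and level 0 last; that reading, and the function name, are assumptions rather than something stated in the slides.

```python
def is_valid_knowles(fanouts) -> bool:
    """Check a fan-out vector (last level first) against the fan-out rules."""
    by_level = list(reversed(fanouts))        # by_level[j] = lateral fan-out at level j
    for j, f in enumerate(by_level):
        if f < 1 or f & (f - 1):              # must be a power of two
            return False
        if f > 2 ** j:                        # at most 2^j at level j
            return False
        if j + 1 < len(by_level) and f > by_level[j + 1]:
            return False                      # may not exceed the next level's fan-out
    return True

for vec in ([16, 8, 4, 2, 1], [1, 1, 1, 1, 1], [4, 4, 2, 2, 1], [2, 4, 2, 2, 1]):
    print(vec, is_valid_knowles(vec))
```

Under this reading, [16,8,4,2,1] is the L-F end of the family and [1,1,1,1,1] the K-S end.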


Parallel Prefix Adders: variety of possibilities


Knowles 1999

The number of minimal depth graphs of this type is given in:

At 4 bits there are only K-S and L-F; beyond that there are several new possibilities.
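The 4-bit case can be confirmed by brute-force enumeration. The Python sketch below lists every fan-out assignment that satisfies the lateral fan-out rules (a power of two no larger than 2^j at level j, and never larger than at the next level). Tuples are written with level 0 first, and the function name is illustrative.

```python
from itertools import product

def minimal_depth_graphs(n_bits: int):
    """All per-level fan-out tuples (level 0 first) allowed for an n_bits adder."""
    levels = n_bits.bit_length() - 1          # log2(n_bits) for a power of two
    choices = [[2 ** k for k in range(j + 1)] for j in range(levels)]
    return [combo for combo in product(*choices)
            if all(x <= y for x, y in zip(combo, combo[1:]))]

print(minimal_depth_graphs(4))
print(len(minimal_depth_graphs(32)))
```

For 4 bits only (1, 1) and (1, 2) survive, corresponding to K-S and L-F; wider adders admit more combinations.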

Parallel Prefix Adders: variety of possibilities

Knowles 1999

example of a new 32-bit adder [4,4,2,2,1]



Parallel Prefix Adders: variety of possibilities


Knowles 1999

Delay is given in terms of FO4 inverter delay (worst case).

(nominal case is 40-50% faster)

K-S is the fastest.
K-S adders are wire limited (requiring 80% more area).
The difference between the examined schemes is less than 15%.

Parallel Prefix Adders: variety of possibilities


Knowles 1999

Conclusion
Irregular, hybrid schemes are possible.
The speed-up of 15% is achieved at the cost of large wiring, hence area and power.
Circuits close in speed to K-S are available at significantly lower wiring cost.
