
Arithmetic Circuits

© Digital Integrated Circuits, 2nd ed. · EE141 · Arithmetic Circuits
A Generic Digital Processor

[Figure: block diagram of a generic digital processor with four blocks: MEMORY, INPUT-OUTPUT, CONTROL, and DATAPATH.]

Building Blocks for Digital Architectures

Arithmetic unit
- Bit-sliced datapath (adder, multiplier, shifter, comparator, etc.)

Memory
- RAM, ROM, Buffers, Shift registers
Control
- Finite state machine (PLA, random logic)
- Counters
Interconnect
- Switches
- Arbiters
- Bus

An Intel Microprocessor

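[Figure: one integer execution unit, 1000 µm tall. Labeled blocks: 9-1 Mux, 5-1 Mux, 2-1 Mux, CARRYGEN, SUMGEN + LU (LU: logical unit), SUMSEL, and REG (clocked by ck1, output to cache). Labeled signals: a, b, g64, node1, sum, sumb, s0, s1.]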

Itanium has 6 integer execution units like this

Bit-Sliced Design
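[Figure: four identical bit slices (Bit 0 to Bit 3). Each slice contains a register, an adder, a shifter, and a multiplexer between Data-In and Data-Out; the Control signals are shared by all slices.]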

Tile identical processing elements


Bit-Sliced Datapath
[Figure: 64-bit bit-sliced datapath (bit slice 0 to bit slice 63). Data arrives from the register files / cache / bypass and passes through multiplexers, the shifter, adder stages 1 to 3 (separated by wiring channels), and sum select, then leaves to the register files / cache. Loopback buses run between the stages.]

Itanium Integer Datapath

Fetzer, Orton, ISSCC’02

Adders

Full-Adder
[Figure: full-adder block with inputs A, B, and Cin and outputs Sum and Cout.]

The Binary Adder
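[Figure: full-adder block with inputs A, B, and Cin (Ci) and outputs Sum (S) and Cout (Co).]

$S = A \oplus B \oplus C_i = A\bar{B}\bar{C_i} + \bar{A}B\bar{C_i} + \bar{A}\bar{B}C_i + ABC_i$

$C_o = AB + BC_i + AC_i$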
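As a quick check of the two full-adder equations, the following minimal Python sketch evaluates them for all eight input combinations and compares them against integer addition. The function names `full_adder` and `sum_minterms` are illustrative, not part of the slides.

```python
from itertools import product


def full_adder(a, b, ci):
    """Return (S, Co) from S = A xor B xor Ci and Co = AB + BCi + ACi."""
    s = a ^ b ^ ci
    co = (a & b) | (b & ci) | (a & ci)
    return s, co


def sum_minterms(a, b, ci):
    """Sum output written as the four-minterm sum-of-products form."""
    na, nb, nci = 1 - a, 1 - b, 1 - ci
    return (a & nb & nci) | (na & b & nci) | (na & nb & ci) | (a & b & ci)


def check_full_adder():
    """True if both forms agree with arithmetic for every input combination."""
    for a, b, ci in product((0, 1), repeat=3):
        s, co = full_adder(a, b, ci)
        if a + b + ci != 2 * co + s or s != sum_minterms(a, b, ci):
            return False
    return True
```

Calling `check_full_adder()` exercises the XOR form and the sum-of-products form side by side.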

Express Sum and Carry as a function of P, G, D

Define three new variables that depend ONLY on A and B:

Generate (G) = $AB$
Propagate (P) = $A \oplus B$
Delete/kill (D) = $\bar{A}\bar{B}$

Expressions for S and Co can also be derived in terms of D and P.

Note: we will sometimes use an alternate definition,
Propagate (P) = $A + B$
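The slide leaves the resulting expressions unstated. A minimal sketch, assuming the standard forms S = P xor Ci and Co = G + P·Ci (with P = A xor B), plus one possible D/P form of the carry, is shown below. All function names are illustrative.

```python
from itertools import product


def pgd(a, b):
    """Generate, propagate (XOR form), and delete signals of one bit."""
    return a & b, a ^ b, (1 - a) & (1 - b)


def sum_carry_from_pg(a, b, ci):
    """Assumed standard form: S = P xor Ci, Co = G + P*Ci."""
    g, p, _ = pgd(a, b)
    return p ^ ci, g | (p & ci)


def sum_carry_from_pd(a, b, ci):
    """One possible D/P form: the carry is 0 if deleted, Ci if propagated, else 1."""
    _, p, d = pgd(a, b)
    return p ^ ci, (1 - d) & ((1 - p) | ci)


def carry_alt_propagate(a, b, ci):
    """Carry with the alternate definition P = A + B; only the carry stays valid."""
    return (a & b) | ((a | b) & ci)


def check_all():
    for a, b, ci in product((0, 1), repeat=3):
        expected = ((a + b + ci) & 1, (a + b + ci) >> 1)
        if sum_carry_from_pg(a, b, ci) != expected:
            return False
        if sum_carry_from_pd(a, b, ci) != expected:
            return False
        if carry_alt_propagate(a, b, ci) != expected[1]:
            return False
    return True
```

The alternate propagate P = A + B gives the same carry, but it cannot be reused for the sum, which still needs A xor B.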
The Ripple-Carry Adder
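[Figure: 4-bit ripple-carry adder. Four full adders (FA) in a chain with inputs A0 B0 to A3 B3, carry input Ci,0, carries Co,0 (= Ci,1), Co,1, Co,2, Co,3, and sum outputs S0 to S3.]

The worst-case delay is linear in the number of bits:

$t_d = O(N)$

$t_{adder} = (N-1)\,t_{carry} + t_{sum}$

Goal: make the carry path circuit as fast as possible.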
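The bit-serial carry dependence behind the linear delay can be sketched in Python as follows. This is a behavioral model only, with bit lists stored least significant bit first; the helper names `to_bits` and `from_bits` are illustrative.

```python
def to_bits(x, n):
    """Split integer x into n bits, least significant bit first."""
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def ripple_carry_add(a_bits, b_bits, c_in=0):
    """Chain of full adders: each stage waits for the previous carry."""
    carry = c_in
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (b & carry) | (a & carry)
    return sum_bits, carry
```

For example, `ripple_carry_add(to_bits(9, 4), to_bits(7, 4))` yields sum bits for 0 and a carry-out of 1, because 9 + 7 = 16 overflows four bits. The loop makes the worst case visible: the carry may have to pass through every stage before the last sum bit settles.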

Complementary Static CMOS Full Adder
[Figure: transistor-level schematic of the complementary static CMOS full adder, with inputs A, B, and Ci, internal node X, and outputs S and Co.]

28 transistors

Inversion Property
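[Figure: two equivalent full-adder cells; inverting all inputs of a full adder inverts all of its outputs.]

$\bar{S}(A, B, C_i) = S(\bar{A}, \bar{B}, \bar{C_i})$

$\bar{C_o}(A, B, C_i) = C_o(\bar{A}, \bar{B}, \bar{C_i})$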
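The inversion property can be confirmed by brute force over the eight input combinations. The sketch below is a minimal check; the function names are illustrative.

```python
from itertools import product


def full_adder(a, b, ci):
    """Return (S, Co) of a one-bit full adder."""
    return a ^ b ^ ci, (a & b) | (b & ci) | (a & ci)


def inv(x):
    """Logical complement of a single bit."""
    return 1 - x


def inversion_property_holds():
    """True if inverting all inputs inverts both outputs, for every input."""
    for a, b, ci in product((0, 1), repeat=3):
        s, co = full_adder(a, b, ci)
        s_inv, co_inv = full_adder(inv(a), inv(b), inv(ci))
        if (inv(s), inv(co)) != (s_inv, co_inv):
            return False
    return True
```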

Minimize Critical Path by Reducing Inverting Stages

[Figure: 4-bit ripple-carry adder built from alternating even and odd FA cells, with inputs A0 B0 to A3 B3, carries Ci,0 and Co,0 to Co,3, and sum outputs S0 to S3.]

Exploit Inversion Property

A Better Structure: The Mirror Adder
[Figure: transistor-level schematic of the mirror adder with inputs A, B, and Ci and the carry and sum outputs. The carry stage is annotated with its Kill, "0"-Propagate, "1"-Propagate, and Generate paths.]

24 transistors

Mirror Adder
Stick Diagram
[Figure: stick diagram of the mirror adder between the VDD and GND rails, with input columns A, B, and Ci and the Co node.]

The Mirror Adder
• The NMOS and PMOS chains are completely symmetrical. A maximum of two series transistors can be observed in the carry-generation circuitry.
• When laying out the cell, the most critical issue is minimizing the capacitance at node Co. Reducing the diffusion capacitances is particularly important.
• The capacitance at node Co is composed of four diffusion capacitances, two internal gate capacitances, and six gate capacitances in the connecting adder cell.
• The transistors connected to Ci are placed closest to the output.
• Only the transistors in the carry stage have to be optimized for speed. All transistors in the sum stage can be minimum size.

Transmission Gate Full Adder

[Figure: transmission-gate full adder schematic in three labeled parts: Setup (produces P from A and B), Sum Generation (produces S from P and Ci), and Carry Generation (produces Co from P, A, and Ci).]

Manchester Carry Chain
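[Figure: two Manchester carry gates passing Ci to Co. One is controlled by the propagate (Pi), generate (Gi), and delete (Di) signals; the other is a clocked version controlled by Pi and Gi.]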

Manchester Carry Chain
[Figure: 4-bit Manchester carry chain with carry input Ci,0, propagate signals P0 to P3, generate signals G0 to G3, and carry nodes C0 to C3.]

Manchester Carry Chain
Stick Diagram
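[Figure: stick diagram of two Manchester carry-chain bits between the VDD and GND rails, with rows labeled Propagate/Generate Row and Inverter/Sum Row, inputs Pi, Gi, Pi+1, Gi+1, and carry nodes Ci-1, Ci, Ci+1.]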

Carry-Skip Adder
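Also called carry-bypass adder.

[Figure: top, a 4-bit ripple-carry block of four FA cells with per-bit inputs P0 G0 to P3 G3 and carries Ci,0, Co,0, Co,1, Co,2, Co,3. Bottom, the same block with a bypass path: a multiplexer controlled by BP = P0 P1 P2 P3 selects either Ci,0 or the rippled carry as Co,3.]

Idea: if (P0 and P1 and P2 and P3) = 1, then Co,3 = Ci,0; otherwise a "kill" or "generate" has occurred inside the block.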
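The bypass idea can be sketched behaviorally in Python. This is a minimal model of one block, assuming the XOR definition of propagate; the function name and return layout are illustrative.

```python
def carry_skip_block(a_bits, b_bits, c_in):
    """One carry-skip block (bits are LSB first).

    Returns (sum_bits, c_out, bypassed).
    """
    p = [a ^ b for a, b in zip(a_bits, b_bits)]  # propagate, XOR form
    g = [a & b for a, b in zip(a_bits, b_bits)]  # generate
    carry = c_in
    sum_bits = []
    for pi, gi in zip(p, g):
        sum_bits.append(pi ^ carry)
        carry = gi | (pi & carry)  # ripple path inside the block
    bypassed = all(p)  # BP = P0 P1 P2 P3
    # Multiplexer: take the block carry-in directly when every bit propagates.
    c_out = c_in if bypassed else carry
    return sum_bits, c_out, bypassed
```

Logically the multiplexer never changes the result, since the rippled carry equals the carry-in whenever every bit propagates. Its benefit is timing: the carry-out is available without waiting for the ripple path.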

Carry-Skip Adder (cont.)
[Figure: 16-bit carry-skip adder in four blocks (bits 0–3, 4–7, 8–11, 12–15), each M bits wide with setup, carry propagation, and sum stages. The annotated delays are tsetup, tbypass, and tsum.]

$t_{adder} = t_{setup} + M\,t_{carry} + \left(\frac{N}{M} - 1\right) t_{bypass} + (M-1)\,t_{carry} + t_{sum}$
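To get a feel for the formula, the sketch below evaluates it next to the ripple-carry delay $(N-1)\,t_{carry} + t_{sum}$. The unit delay values used as defaults are assumptions chosen for illustration, not measured numbers.

```python
def ripple_delay(n, t_carry=1.0, t_sum=1.0):
    """Worst-case ripple-carry delay: (N-1) t_carry + t_sum."""
    return (n - 1) * t_carry + t_sum


def bypass_delay(n, m, t_setup=1.0, t_carry=1.0, t_bypass=1.0, t_sum=1.0):
    """Carry-bypass delay for N bits in blocks of M bits."""
    if n % m:
        raise ValueError("n must be a multiple of m")
    return (t_setup + m * t_carry + (n // m - 1) * t_bypass
            + (m - 1) * t_carry + t_sum)
```

With these assumed unit delays, `bypass_delay(16, 4)` gives 12.0 against `ripple_delay(16)` at 16.0, while for a single 4-bit block the bypass adder is slower (9.0 versus 4.0). The bypass overhead only pays off beyond some minimum width.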

Carry Ripple versus Carry Bypass

[Figure: propagation delay tp versus number of bits N for the ripple adder and the bypass adder, with a crossover at about N = 4..8.]

Carry-Select Adder
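[Figure: one carry-select block. Setup produces P and G, which feed two carry-propagation chains, one with carry-in "0" and one with carry-in "1". A multiplexer controlled by Co,k-1 selects the carry vector and produces Co,k+3, followed by sum generation.]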

Carry Select Adder: Critical Path
[Figure: 16-bit carry-select adder in four blocks (bits 0–3, 4–7, 8–11, 12–15). Each block has setup, a 0-carry chain, a 1-carry chain, a multiplexer, and sum generation. The carry passes from Ci,0 through the multiplexers as Co,3, Co,7, Co,11, Co,15; the sum outputs are S0–3, S4–7, S8–11, S12–15.]
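A behavioral sketch of carry selection follows: each block computes both candidate results, and the incoming carry only picks one. The function names and the integer interface are illustrative choices.

```python
def ripple(a_bits, b_bits, c_in):
    """Ripple-carry addition of two LSB-first bit lists."""
    carry, out = c_in, []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return out, carry


def carry_select_add(a, b, n=16, block=4, c_in=0):
    """Add two n-bit integers block by block; returns (sum, carry_out)."""
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> i) & 1 for i in range(n)]
    carry, sum_bits = c_in, []
    for lo in range(0, n, block):
        blk_a, blk_b = a_bits[lo:lo + block], b_bits[lo:lo + block]
        s0, c0 = ripple(blk_a, blk_b, 0)  # "0" carry chain
        s1, c1 = ripple(blk_a, blk_b, 1)  # "1" carry chain
        # Multiplexer: the previous block's carry selects one candidate.
        s, carry = (s1, c1) if carry else (s0, c0)
        sum_bits += s
    return sum(bit << i for i, bit in enumerate(sum_bits)), carry
```

In hardware the two chains of all blocks run in parallel, so only the multiplexer chain remains on the block-to-block carry path. The sequential loop here models the selection logic only.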

Linear Carry Select
[Figure: 16-bit linear carry-select adder with four equal blocks (bits 0-3, 4-7, 8-11, 12-15). Annotated arrival times in unit delays: setup (1), "0" and "1" carry chains (5) in every block, multiplexer outputs (6), (7), (8), (9), and final sum S12-15 (10).]

Square Root Carry Select
[Figure: 20-bit square-root carry-select adder with blocks of increasing size (bits 0-1, 2-4, 5-8, 9-13, 14-19). Annotated arrival times in unit delays: setup (1), carry chains (3) to (7), multiplexer outputs (4) to (8), and final sum S14-19 (9).]
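The annotated arrival times of the linear and square-root organizations can be reproduced with a simple unit-delay model. The model below is an assumption (setup, then the first block's carry chain, then one multiplexer delay per block, then sum), chosen because it matches the figure annotations; the function names are illustrative.

```python
def linear_select_delay(n, m, t_setup=1, t_carry=1, t_mux=1, t_sum=1):
    """Equal blocks of m bits: n // m multiplexer stages."""
    stages = n // m
    return t_setup + m * t_carry + stages * t_mux + t_sum


def sqrt_select_blocks(n, first=2):
    """Block sizes first, first+1, ... until at least n bits are covered."""
    sizes, size = [], first
    while sum(sizes) < n:
        sizes.append(size)
        size += 1
    return sizes


def sqrt_select_delay(n, first=2, t_setup=1, t_carry=1, t_mux=1, t_sum=1):
    """Growing blocks: each carry chain finishes just as the select arrives."""
    stages = len(sqrt_select_blocks(n, first))
    return t_setup + first * t_carry + stages * t_mux + t_sum
```

`linear_select_delay(16, 4)` gives 10 and `sqrt_select_delay(20)` gives 9, so under this model the square-root adder handles four more bits in one unit delay less. `sqrt_select_blocks(20)` returns the block sizes 2, 3, 4, 5, 6 shown in the figure.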

Adder Delays - Comparison
[Figure: adder delay tp (in unit delays, axis 0 to 50) versus number of bits N (axis 0 to 60) for three adders: ripple adder (slowest), linear select, and square root select (fastest).]

LookAhead - Basic Idea
[Figure: N-bit adder in which bit k receives (Ak, Bk), its carry input Ci,k, and propagate signal Pk, and produces Sk, for k = 0 to N-1.]

$C_{o,k} = f(A_k, B_k, C_{o,k-1}) = G_k + P_k C_{o,k-1}$

Look-Ahead: Topology
Expanding the lookahead equations:

$C_{o,k} = G_k + P_k (G_{k-1} + P_{k-1} C_{o,k-2})$

All the way:

$C_{o,k} = G_k + P_k (G_{k-1} + P_{k-1} (\cdots + P_1 (G_0 + P_0 C_{i,0})))$

[Figure: transistor-level lookahead gate computing Co,3 from G0 to G3, P0 to P3, and Ci,0.]
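The fully expanded equation can be compared against the recursive one in a few lines of Python. This is a minimal sketch with illustrative function names; `g` and `p` are per-bit lists indexed from bit 0.

```python
from itertools import product


def carry_recursive(g, p, c_in):
    """Carries Co,0 .. Co,n-1 from Co,k = G_k + P_k * Co,k-1."""
    carries, c = [], c_in
    for gk, pk in zip(g, p):
        c = gk | (pk & c)
        carries.append(c)
    return carries


def carry_lookahead(g, p, c_in, k):
    """Co,k from the expanded form G_k + P_k G_k-1 + ... + P_k..P_0 Ci,0."""
    result, prefix = 0, 1  # prefix holds the product P_k ... P_(j+1)
    for j in range(k, -1, -1):
        result |= prefix & g[j]
        prefix &= p[j]
    return result | (prefix & c_in)


def forms_agree(n=4):
    """True if both forms give the same carries for all n-bit G/P patterns."""
    for bits in product((0, 1), repeat=2 * n + 1):
        g, p, c_in = list(bits[:n]), list(bits[n:2 * n]), bits[-1]
        rec = carry_recursive(g, p, c_in)
        if any(carry_lookahead(g, p, c_in, k) != rec[k] for k in range(n)):
            return False
    return True
```

The expanded form removes the serial dependence on the previous carry, at the cost of product terms whose fan-in grows with k.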

Logarithmic Look-Ahead Adder
[Figure: two ways to compute F from inputs A0 to A7: a linear chain with $t_p \sim N$, and a balanced tree with $t_p \sim \log_2(N)$.]

Carry Lookahead Trees

$C_{o,0} = G_0 + P_0 C_{i,0}$
$C_{o,1} = G_1 + P_1 G_0 + P_1 P_0 C_{i,0}$
$C_{o,2} = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_{i,0}$
$\phantom{C_{o,2}} = (G_2 + P_2 G_1) + (P_2 P_1)(G_0 + P_0 C_{i,0}) = G_{2:1} + P_{2:1} C_{o,0}$

Can continue building the tree hierarchically.
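The grouping step in the last line can be written as an operator on (G, P) pairs. The sketch below uses an illustrative name, `combine`, for that operator and checks that $G_{2:1} + P_{2:1} C_{o,0}$ reproduces the bit-by-bit carry.

```python
from itertools import product


def combine(hi, lo):
    """Merge two adjacent groups; hi covers the more significant bits.

    Returns (G, P) of the merged group: G = G_hi + P_hi*G_lo, P = P_hi*P_lo.
    """
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def grouped_carry_matches():
    """True if Co,2 = G2:1 + P2:1 * Co,0 for every G/P/Ci combination."""
    for g0, p0, g1, p1, g2, p2, ci in product((0, 1), repeat=7):
        co0 = g0 | (p0 & ci)
        co1 = g1 | (p1 & co0)
        co2 = g2 | (p2 & co1)  # bit-by-bit reference
        g21, p21 = combine((g2, p2), (g1, p1))
        if co2 != g21 | (p21 & co0):
            return False
    return True
```

Because `combine` is associative, groups can be merged in any bracketing, which is what allows the carries to be computed in a tree.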

Tree Adders

[Figure: 16-bit radix-2 Kogge-Stone tree, with inputs (A0, B0) to (A15, B15) and outputs S0 to S15.]

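A radix-2 Kogge-Stone tree can be modeled in Python as a parallel-prefix computation over (G, P) pairs. The sketch below is behavioral and makes one assumption not taken from the figure: the carry-in is folded into the generate signal of bit 0.

```python
def kogge_stone_add(a, b, n=16, c_in=0):
    """Add two n-bit integers with a radix-2 Kogge-Stone prefix tree.

    Returns (sum, carry_out).
    """
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(n)]
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(n)]
    G, P = g[:], p[:]
    G[0] = g[0] | (p[0] & c_in)  # assumption: fold Ci,0 into bit 0
    dist = 1
    while dist < n:  # log2(n) levels: distances 1, 2, 4, ...
        new_g, new_p = G[:], P[:]
        for i in range(dist, n):
            new_g[i] = G[i] | (P[i] & G[i - dist])
            new_p[i] = P[i] & P[i - dist]
        G, P = new_g, new_p
        dist *= 2
    # G[i] is now the carry out of bit i; the sum uses the carry into bit i.
    carries_in = [c_in] + G[:-1]
    total = sum((p[i] ^ carries_in[i]) << i for i in range(n))
    return total, G[n - 1]
```

For n = 16 the loop runs four times, matching the logarithmic depth of the tree. Every level updates all bit positions that have a partner, which reflects the high wiring and cell count of this topology.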
Tree Adders

[Figure: 16-bit radix-4 Kogge-Stone tree, with inputs (a0, b0) to (a15, b15) and outputs S0 to S15.]

Sparse Trees

[Figure: 16-bit radix-2 sparse tree with sparseness of 2, with inputs (a0, b0) to (a15, b15) and outputs S0 to S15.]

Tree Adders

[Figure: 16-bit Brent-Kung tree, with inputs (A0, B0) to (A15, B15) and outputs S0 to S15.]

Example: Domino Adder
[Figure: domino gates, clocked by Clk, for the bitwise signals: Propagate, $P_i = a_i + b_i$, and Generate, $G_i = a_i b_i$.]

Example: Domino Adder
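[Figure: domino gates, clocked by Clk_k, that merge two adjacent groups. Propagate: $P_{i:i-2k+1}$ from $P_{i:i-k+1}$ and $P_{i-k:i-2k+1}$. Generate: $G_{i:i-2k+1}$ from $G_{i:i-k+1}$, $P_{i:i-k+1}$, and $G_{i-k:i-2k+1}$.]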

Example: Domino Sum

Multipliers

The Binary Multiplication
$Z = X \times Y = \sum_{k=0}^{M+N-1} Z_k 2^k$

$\phantom{Z} = \left(\sum_{i=0}^{M-1} X_i 2^i\right)\left(\sum_{j=0}^{N-1} Y_j 2^j\right) = \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} X_i Y_j 2^{i+j}$

with

$X = \sum_{i=0}^{M-1} X_i 2^i, \qquad Y = \sum_{j=0}^{N-1} Y_j 2^j$

The Binary Multiplication

          1 0 1 0 1 0     Multiplicand
  x           1 0 1 1     Multiplier
          1 0 1 0 1 0
        1 0 1 0 1 0
      0 0 0 0 0 0         Partial products
  + 1 0 1 0 1 0
    1 1 1 0 0 1 1 1 0     Result
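The worked example can be reproduced in Python by generating one shifted partial product per multiplier bit and summing them. This is a minimal sketch; the function name `partial_products` is illustrative.

```python
def partial_products(x, y, n_bits_y):
    """One partial product per multiplier bit j: x shifted left by j, or 0."""
    return [(x << j) if (y >> j) & 1 else 0 for j in range(n_bits_y)]


x, y = 0b101010, 0b1011          # multiplicand 42, multiplier 11
pps = partial_products(x, y, 4)  # rows of the example, top to bottom
z = sum(pps)
print([format(pp, "b") for pp in pps])
print(format(z, "b"))            # prints 111001110
```

The four rows are 42, 84, 0, and 336, and their sum 462 is the 9-bit result shown above.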

The Array Multiplier
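[Figure: 4x4 array multiplier. The bits X3 X2 X1 X0 are combined with each multiplier bit Y0 to Y3 to form the partial products. Three adder rows (HA FA FA HA, then FA FA FA HA twice) accumulate them and produce Z0 to Z7.]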

The MxN Array Multiplier: Critical Path

[Figure: adder array (HA FA FA HA; FA FA FA HA; FA FA FA HA) with two marked paths, Critical Path 1 and Critical Path 2, which share the segments labeled Critical Path 1 & 2.]

Carry-Save Multiplier
[Figure: 4x4 carry-save multiplier with adder rows HA HA HA HA; HA FA FA FA; HA FA FA FA, followed by a final HA FA FA HA row labeled Vector Merging Adder.]
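The carry-save principle can be sketched on whole words: each step compresses three operands into a sum word and a carry word without propagating carries, and only the final vector-merging addition resolves them. The function names below are illustrative, and the word-level model abstracts away the individual HA and FA cells.

```python
def carry_save_add(x, y, z):
    """3:2 compression: returns (s, c) with x + y + z == s + c."""
    s = x ^ y ^ z                              # per-bit sum, no carry ripple
    c = ((x & y) | (y & z) | (x & z)) << 1     # per-bit carry, moved one column
    return s, c


def carry_save_multiply(x, y, n):
    """Multiply x by the n-bit multiplier y using carry-save accumulation."""
    s, c = 0, 0
    for j in range(n):
        pp = (x << j) if (y >> j) & 1 else 0   # partial product j
        s, c = carry_save_add(s, c, pp)
    return s + c                               # vector-merging adder
```

`carry_save_multiply(0b101010, 0b1011, 4)` returns 462. Only the last line performs a carry-propagating addition, which is why a fast adder is worth spending on that stage.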

Multiplier Floorplan
[Figure: multiplier floorplan with inputs X3 to X0 and Y0 to Y3. Rows of HA multiplier cells and FA multiplier cells pass their carry (C) and sum (S) outputs to the next row, ending in a row of vector-merging cells; the outputs are Z0 to Z7.]

X and Y signals are broadcast through the complete array.

Wallace-Tree Multiplier
[Figure: Wallace-tree reduction of the partial products over bit positions 6 to 0, in four panels: (a) partial products, (b) first stage, (c) second stage, (d) final adder. Full adders (FA) and half adders (HA) are marked.]

Wallace-Tree Multiplier

Wallace-Tree Multiplier
[Figure: two arrangements of full adders (FA) that reduce six bits y0 to y5 of one column to a carry (C) and a sum (S), with carries Ci-1 entering from and Ci leaving to the neighboring columns.]

Multipliers: Summary

• Optimization Goals Differ from Those of the Binary Adder

• Once Again: Identify the Critical Path

• Other possible techniques
- Logarithmic versus Linear (Wallace-Tree Multiplier)
- Data encoding (Booth)
- Pipelining

FIRST GLIMPSE AT SYSTEM LEVEL OPTIMIZATION

Shifters

The Binary Shifter
[Figure: bit slice i of a one-bit shifter. The control signals Right, nop, and Left connect inputs Ai and Ai-1 to outputs Bi and Bi-1.]

The Barrel Shifter
[Figure: barrel shifter with inputs A0 to A3 and outputs B0 to B3. Data wires carry the A and B signals; control wires Sh0 to Sh3 select the shift amount.]

Area Dominated by Wiring


4x4 barrel shifter
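[Figure: layout of the 4x4 barrel shifter, with data inputs A0 to A3, control lines Sh0 to Sh3, and an output buffer.]

$\mathrm{Width}_{barrel} \sim 2\,p_m M$, where $p_m$ is the metal pitch and $M$ is the maximum shift width.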

Logarithmic Shifter
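[Figure: logarithmic shifter with three stages controlled by Sh1, Sh2, and Sh4 and their complements, connecting inputs A0 to A3 to outputs B0 to B3.]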
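The difference between the two shifter organizations is mostly in how the shift amount is encoded. The sketch below contrasts them under stated assumptions: a logical right shift, a one-hot select (Sh0 to Sh3) for the barrel shifter, and a binary-encoded amount whose bits drive the Sh1, Sh2, Sh4 stages of the logarithmic shifter. The shift direction and function names are assumptions for illustration.

```python
def barrel_shift_right(value, sh_lines, width=4):
    """Barrel shifter model: exactly one of the lines Sh0..Sh(width-1) is 1."""
    if len(sh_lines) != width or sum(sh_lines) != 1:
        raise ValueError("shift select must be one-hot")
    amount = sh_lines.index(1)
    return (value & ((1 << width) - 1)) >> amount


def log_shift_right(value, shift, width=8):
    """Logarithmic shifter model: stages shift by 1, 2, 4, ... when enabled."""
    value &= (1 << width) - 1
    stage = 1
    while stage < width:
        if shift & stage:      # bit of the binary-encoded shift amount
            value >>= stage    # this stage shifts by its fixed distance
        stage <<= 1
    return value
```

`log_shift_right(0b10110000, 5)` passes through the shift-by-1 and shift-by-4 stages and returns `0b101`. For an 8-bit word the logarithmic version needs three stages, while the barrel version needs one select line per possible shift amount.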

0-7 bit Logarithmic Shifter

[Figure: layout of the 0-7 bit logarithmic shifter, with inputs A0 to A3 and outputs Out0 to Out3.]
