
Algorithmic Techniques in VLSI CAD

Shantanu Dutt University of Illinois at Chicago

Algorithms in VLSI CAD


- Divide & Conquer (D&C) [e.g., merge-sort, partition-driven placement]
- Reduce & Conquer (R&C) [e.g., multilevel techniques such as the hMetis partitioner]
- Dynamic programming [e.g., matrix multiplication, optimal buffer insertion]
- Mathematical programming: linear, quadratic, 0/1 integer programming [e.g., floorplanning, global placement]

Algorithms in VLSI CAD (contd)


Search Methods:
- Depth-first search (DFS): mainly used to find any solution when cost is not an issue [e.g., FPGA detailed routing---cost generally determined at the global routing phase]
- Breadth-first search (BFS): mainly used to find a soln at min. distance from the root of the search tree [e.g., maze routing when cost = dist. from root]
- Best-first search (BeFS): used to find optimal solutions w/ any cost function; can be done when a provable lower bound of the cost can be determined for each branching choice from the current partial-soln node [e.g., TSP, global routing]

Iterative Improvement: deterministic, stochastic

Determine if the problem can be solved in a hierarchical or divide-&-conquer (D&C) manner:


[Figure: D&C tree. Root problem A is divided into subproblems A1 and A2, recursively into A1,1, A1,2, A2,1, A2,2; the solns to A1 and A2 are stitched up to form the complete soln to A. Recurse until the subproblem size is such that TT-based design is doable.]

D&C approach: See if the problem can be broken up into 2 or more smaller subproblems that can be stitched up to give a soln. to the parent prob. Do this recursively for each large subprob until the subprobs are small enough for an easy solution technique (could be exhaustive!). If the subprobs are of a similar kind to the root prob, then the breakup and stitching will also be similar. A minimal sketch follows.
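As a minimal sketch of the D&C template using merge-sort (one of the examples named above): the break-up is the array split, and the stitch-up is the merge.

    def merge_sort(a):
        """D&C: divide into two subproblems, solve recursively, stitch up."""
        if len(a) <= 1:              # subproblem small enough: solve directly
            return a
        mid = len(a) // 2
        left = merge_sort(a[:mid])   # subproblem A1
        right = merge_sort(a[mid:])  # subproblem A2
        out, i, j = [], 0, 0         # stitch-up: merge the two sorted halves
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i]); i += 1
            else:
                out.append(right[j]); j += 1
        return out + left[i:] + right[j:]

    print(merge_sort([5, 2, 8, 1, 9, 3]))   # [1, 2, 3, 5, 8, 9]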

Reduce-&-Conquer
Examples: Multilevel graph/hypergraph partitioning (e.g., hMetis), multilevel routing

Flow: reduce problem size (coarsening) -> solve the small coarsest problem -> uncoarsen and refine the solution level by level
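A hedged skeleton of this coarsen/solve/uncoarsen flow (not hMetis itself): greedy edge matching is one common coarsening choice, and solve_small/refine are hypothetical problem-specific callbacks.

    def coarsen(edges, n):
        """One coarsening level: greedy edge matching; matched vertex
        pairs are merged into a single coarse vertex."""
        matched = [-1] * n
        cmap = [0] * n              # fine vertex -> coarse vertex
        nxt = 0
        for u, v in edges:
            if matched[u] < 0 and matched[v] < 0 and u != v:
                matched[u] = v; matched[v] = u
                cmap[u] = cmap[v] = nxt; nxt += 1
        for u in range(n):
            if matched[u] < 0:
                cmap[u] = nxt; nxt += 1
        cedges = [(cmap[u], cmap[v]) for u, v in edges if cmap[u] != cmap[v]]
        return cedges, cmap, nxt

    def multilevel(edges, n, solve_small, refine, small=8):
        """R&C skeleton: coarsen until small, solve, then project the
        coarse soln back and refine locally at each level."""
        if n <= small:
            return solve_small(edges, n)
        cedges, cmap, cn = coarsen(edges, n)
        csoln = multilevel(cedges, cn, solve_small, refine, small)
        soln = [csoln[cmap[u]] for u in range(n)]   # uncoarsen: project back
        return refine(edges, n, soln)               # local refinement pass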

Dynamic Programming (DP)


Stitch-up function f: optimal soln of root = f(optimal solns of subproblems) = f(opt(A1), opt(A2), opt(A3), opt(A4))

[Figure: root problem A connected via the stitch-up function to subproblems A1, A2, A3, A4.]

The above property means that every time we optimally solve a subproblem, we can store/record the soln and reuse it every time it is part of the formulation of a higher-level problem

Dynamic Programming (contd)



Matrix multiplication example: find the most computationally efficient way to perform the series of matrix mults M = M1 x M2 x ... x Mn, where Mi is of size ri x ci with ri = c_(i-1) for i > 1.

DP formulation: opt_seq(M) = (by defn) opt_seq(M(1,n)), where
opt_seq(M(1,n)) = min over i = 1 to n-1 of { opt_seq(M(1,i)) + opt_seq(M(i+1,n)) + r1*ci*cn }

Correctness rests on the property that the optimal ways of multiplying M1 x ... x Mi and M(i+1) x ... x Mn are exactly what the min stitch-up function combines to determine the optimal soln for M. Thus if the optimal soln involves a cut at Mr, then opt_seq(M(1,r)) and opt_seq(M(r+1,n)) will be part of opt_seq(M). Perform the computation bottom-up (smallest sequences first); see the sketch below.

Complexity: each subsequence M(j,k) appears in the above computation and is solved exactly once (irrespective of how many times it appears). The time to solve M(j,k), not counting the time to solve its subproblems (which is accounted for in their own complexities), is l - 1, where l = k - j + 1 is the length of the sequence (a min over l - 1 cut options is computed). The number of different M(j,k)'s of length l is n - l + 1, for 2 <= l <= n. Total complexity = Sum over l = 2 to n of (l-1)(n-l+1) = O(n^3) (as opposed to, say, O(2^n) using exhaustive search).
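A minimal bottom-up DP for the matrix-chain example above; dims encodes the ri x ci sizes, and cost[j][k] stores opt_seq(M(j,k)), computed once and reused.

    def matrix_chain(dims):
        """dims[i-1] x dims[i] is the size of Mi; returns the min # of
        scalar multiplications. Bottom-up over subsequence length, as
        in the slide's complexity argument."""
        n = len(dims) - 1                      # number of matrices
        cost = [[0] * (n + 1) for _ in range(n + 1)]
        for l in range(2, n + 1):              # subsequence length
            for j in range(1, n - l + 2):      # M(j..k), k = j + l - 1
                k = j + l - 1
                cost[j][k] = min(cost[j][i] + cost[i + 1][k]
                                 + dims[j - 1] * dims[i] * dims[k]
                                 for i in range(j, k))
        return cost[1][n]

    # e.g. M1: 10x30, M2: 30x5, M3: 5x60 -> 4500 via (M1 M2) M3
    print(matrix_chain([10, 30, 5, 60]))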

A DP Example: Simple Buffer Insertion Problem


Given: source and sink locations, sink capacitances and RATs (required arrival times), a buffer type, source delay rules, and unit wire resistance and capacitance

[Figure: a source s0 driving a routing tree to four sinks with required arrival times RAT1..RAT4; a buffer may be inserted at candidate locations along the tree.]
Courtesy: Chuck Alpert, IBM

Simple Buffer Insertion Problem (contd)


Find: Buffer locations and a routing tree such that slack/RAT at the source is maximized

q(s0) = min over 1 <= i <= 4 of { RAT(si) - delay(s0, si) }


[Figure: source s0 and sinks with RAT1..RAT4.]
Courtesy: Chuck Alpert, IBM

Slack/RAT Example

Unbuffered: sink 1 has RAT = 500, delay = 400 (slack +100); sink 2 has RAT = 400, delay = 600 (slack -200); source slack/RAT = min(+100, -200) = -200.
Buffered: sink 1 has RAT = 500, delay = 350 (slack +150); sink 2 has RAT = 400, delay = 300 (slack +100); source slack/RAT = min(+150, +100) = +100.
Courtesy: Chuck Alpert, IBM

Elmore Delay

[Figure: an RC ladder from A to C: resistance R1 to node B with capacitance C1, then resistance R2 to node C with capacitance C2.]

Delay(A -> C) = R1(C1 + C2) + R2*C2


Courtesy: Chuck Alpert, IBM
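A small sketch of Elmore delay over an RC tree (the dict encoding and node names are illustrative assumptions): each resistance on the source-sink path is multiplied by all capacitance downstream of it, matching the slide's two-segment formula.

    def elmore_delay(tree, root, sink):
        """Elmore delay from root to sink in an RC tree.
        tree: node -> list of (child, R_edge, C_child)."""
        def subtree_cap(u):
            return sum(c + subtree_cap(v) for v, _, c in tree.get(u, []))
        def walk(u, path):                      # find root->sink edge list
            if u == sink:
                return path
            for v, r, c in tree.get(u, []):
                p = walk(v, path + [(r, c, v)])
                if p is not None:
                    return p
            return None
        delay = 0.0
        for r, c, v in walk(root, []):
            delay += r * (c + subtree_cap(v))   # R times all downstream cap
        return delay

    # the slide's two-segment case: R1(C1+C2) + R2*C2
    rc = {'A': [('B', 1.0, 2.0)], 'B': [('C', 3.0, 4.0)]}
    print(elmore_delay(rc, 'A', 'C'))   # 1*(2+4) + 3*4 = 18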

DP Example: Van Ginneken Buffer Insertion Algorithm [ISCAS90]

Associate each leaf node/sink with two metrics (Ct, Tt): downstream loading capacitance (Ct) and RAT (Tt). The DP-based alg propagates potential solutions bottom-up [Van Ginneken, '90].

Add a wire:
  Ct = Cn + Cw
  Tt = Tn - Rw*Ln - (1/2)*Rw*Cw   (note: take Ln = Cn)

[Figure: a wire with capacitance Cw and resistance Rw between the new node (Ct, Tt) and the downstream node (Cn, Tn).]

Add a buffer:
  Ct = Cb
  Tt = Tn - Tb - Rb*Ln

[Figure: a buffer with input capacitance Cb, intrinsic delay Tb, and output resistance Rb inserted above the downstream node (Cn, Tn).]

Merge two solutions: for each pair of soln vectors Zn = (Cn, Tn), Zm = (Cm, Tm) in the 2 subtrees, create a soln vector Zt = (Ct, Tt) where
  Ct = Cn + Cm
  Tt = min(Tn, Tm)

[Figure: two subtrees with solns (Cn, Tn) and (Cm, Tm) merging at a node with soln (Ct, Tt).]
Courtesy: UCLA

DP Example (contd)
- Add a wire to each merged solution Zt (same cap. & delay update formulation as before)
- Add a buffer to each Zt
- Delete all dominated solutions Zd: Zd = (Cd, Td) is dominated if there exists a Zr = (Cr, Tr) s.t. Cd >= Cr and Td <= Tr (i.e., both metrics are no better)
- The remaining soln vectors are all optimal solns for this subtree, and one of them will be part of the optimal solution at the root/driver of the net---this is the DP feature of this algorithm (a code sketch follows below)

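A compact sketch of the bottom-up propagation using the slide's three operations plus pruning; the parameter values in the usage lines are illustrative, not the slides' example.

    def prune(solns):
        """Delete dominated (C, T) vectors: keep a soln only if no soln
        with smaller-or-equal cap already achieves a >= RAT."""
        solns.sort(key=lambda s: (s[0], -s[1]))
        kept = []
        for c, t in solns:
            if not kept or t > kept[-1][1]:
                kept.append((c, t))
        return kept

    def add_wire(solns, cw, rw):
        """Slide's wire update: C' = C + Cw, T' = T - Rw*C - Rw*Cw/2."""
        return [(c + cw, t - rw * c - rw * cw / 2) for c, t in solns]

    def add_buffer(solns, cb, tb, rb):
        """Keep the buffered option alongside each unbuffered soln:
        C' = Cb, T' = T - Tb - Rb*C."""
        return solns + [(cb, t - tb - rb * c) for c, t in solns]

    def merge(s1, s2):
        """Branch merge: caps add, RAT is the min of the two branches."""
        return [(c1 + c2, min(t1, t2)) for c1, t1 in s1 for c2, t2 in s2]

    # bottom-up over the net: start at each sink with [(Csink, RAT)],
    # alternate add_wire/add_buffer along branches, merge at branch
    # points, prune after every step; at the driver pick max slack.
    cands = prune(add_buffer(add_wire([(20, 400)], cw=10, rw=1),
                             cb=5, tb=30, rb=1))
    print(cands)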

Van Ginneken Example


[Figure: worked example along one branch. The sink soln is (20, 400). Adding a wire (C=10, d=150) gives (30, 250); optionally adding a buffer (C=5, d=30) gives (5, 220). Adding the next wire (C=15; d=200 on the unbuffered soln, d=120 on the buffered one) gives (45, 50) and (20, 100); optionally buffering again (d=50 resp. d=30) gives (5, 0) and (5, 70). Candidate set so far: {(45, 50), (5, 0), (20, 100), (5, 70)}.]
Courtesy: Chuck Alpert, IBM

Van Ginneken Example (contd)

Candidate set: {(45, 50), (5, 0), (20, 100), (5, 70)}. Prune dominated solns: (5, 0) is inferior to (5, 70), and (45, 50) is inferior to (20, 100); {(20, 100), (5, 70)} remain.

Adding the final wire (C=10) to reach the driver gives (30, 10) and (15, -10). Pick the solution with the largest slack, then follow the arrows (recorded choices) back down the tree to recover the buffer placement.


Courtesy: Chuck Alpert, IBM

Mathematical Programming

- Linear programming (LP): e.g., Obj: Min 2x1 - x2 + x3 w/ constraints x1 + x2 <= a, x1 - x3 <= b -- solvable in polynomial time
- Quadratic programming (QP): e.g., Min x1^2 + x2*x3 w/ linear constraints -- solvable in polynomial (cubic) time w/ equality constraints
- Mixed integer linear prog (ILP) and mixed integer quad. prog (IQP): some vars are integers -- NP-hard
- Mixed 0/1 integer linear prog (0/1 ILP) and mixed 0/1 integer quad. prog (0/1 IQP): some vars are in {0,1} -- NP-hard

0/1 ILP/IQP Examples

- Generally useful for assignment problems, where objects {O1, ..., On} are assigned to bins {B1, ..., Bm}
- 0/1 variable xi,j = 1 iff object Oi is assigned to bin Bj
- Min-cut bi-partitioning for a graph G(V,E) can be modeled as a 0/1 IQP:
  - xi,1 = 1 => ui in V1, else ui in V2
  - Edge (ui, uj) is in the cutset iff xi,1(1 - xj,1) + (1 - xi,1)xj,1 = 1
  - Objective function: Min Sum over (ui, uj) in E of c(i,j)*[xi,1(1 - xj,1) + (1 - xi,1)xj,1]
  - Constraint: Sum over ui of w(ui)*xi,1 <= max-size

A small sketch evaluating this objective follows.
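A tiny illustration of the 0/1 IQP objective and size constraint; the 4-vertex instance and the symmetric bound on both sides are illustrative assumptions, and real instances would go to an IQP solver rather than enumeration.

    from itertools import product

    def cut_cost(edges, x):
        """0/1 IQP objective: edge (i, j, c) contributes c iff its
        endpoints get different 0/1 values, i.e. x_i(1-x_j) + (1-x_i)x_j = 1."""
        return sum(c * (x[i] * (1 - x[j]) + (1 - x[i]) * x[j])
                   for i, j, c in edges)

    def size_ok(weights, x, max_size):
        """Constraint: Sum w(ui)*xi,1 <= max-size (applied to both sides
        here so neither partition is empty -- an added assumption)."""
        s = sum(w * xi for w, xi in zip(weights, x))
        return s <= max_size and sum(weights) - s <= max_size

    edges = [(0, 1, 2), (1, 2, 1), (2, 3, 2), (0, 3, 1)]   # (ui, uj, c(i,j))
    weights = [1, 1, 1, 1]
    best = min((x for x in product((0, 1), repeat=4) if size_ok(weights, x, 2)),
               key=lambda x: cut_cost(edges, x))
    print(best, cut_cost(edges, best))   # e.g. (0, 0, 1, 1) with cut cost 2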

Search Techniques

[Figure: an example weighted graph on vertices {A, B, C, D, E, F, G} together with its DFS and BFS search trees; the numbers give edge costs/visit order.]
dfs(v) /* for basic graph visit or for soln finding when nodes are partial solns */
  v.mark = 1;
  for each (v,u) in E
    if (u.mark != 1) then dfs(u)

Algorithm Depth_First_Search
  for each v in V
    v.mark = 0;
  for each v in V
    if (v.mark = 0) then
      if G has partial soln nodes then dfs(v);
      else soln_dfs(v);

soln_dfs(v) /* used when nodes are basic elts of the problem and not partial soln nodes */
  v.mark = 1;
  if path to v is a soln then return(1);
  for each (v,u) in E
    if (u.mark != 1) then
      soln_found = soln_dfs(u)
      if (soln_found = 1) then return(soln_found)
  end for;
  v.mark = 0; /* can visit v again to form another soln on a different path */
  return(0)

optimal_soln_dfs(v) /* used when nodes are basic elts of the problem and not partial soln nodes */
begin
  v.mark = 1;
  if path to v is a soln then begin
    if cost < best_cost then begin
      best_soln = soln; best_cost = cost;
    endif
    v.mark = 0;
    return;
  endif
  for each (v,u) in E
    if (u.mark != 1) then optimal_soln_dfs(u)
  end for;
  v.mark = 0; /* can visit v again to form another soln on a different path */
end

Search Techniques: Exhaustive DFS

[Figure: the example weighted graph again, with a DFS traversal order marked.]

Algorithm Depth_First_Search
  for each v in V
    v.mark = 0;
  best_cost = infinity;
  optimal_soln_dfs(root);
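A minimal runnable rendering of optimal_soln_dfs above; the graph, edge costs, and the "visit every vertex" solution test are illustrative assumptions.

    import math

    def optimal_soln_dfs(graph, costs, v, path, cost, target_len, best):
        """Exhaustive DFS with backtracking: enumerate all simple paths
        that form a soln (here: visiting target_len vertices) and keep
        the cheapest in best = [best_cost, best_soln]."""
        path.append(v)                              # v.mark = 1
        if len(path) == target_len:                 # "path to v is a soln"
            if cost < best[0]:
                best[0], best[1] = cost, list(path)
        else:
            for u in graph[v]:
                if u not in path:                   # u.mark != 1
                    optimal_soln_dfs(graph, costs, u, path,
                                     cost + costs[v, u], target_len, best)
        path.pop()                                  # v.mark = 0: backtrack

    graph = {'A': ['B', 'C'], 'B': ['A', 'C'], 'C': ['A', 'B']}
    costs = {('A','B'): 2, ('B','A'): 2, ('A','C'): 5, ('C','A'): 5,
             ('B','C'): 1, ('C','B'): 1}
    best = [math.inf, None]
    optimal_soln_dfs(graph, costs, 'A', [], 0, 3, best)
    print(best)   # cheapest path from A visiting all 3 vertices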

Best-First Search

[Figure: a search tree with node costs 10, 12, 15, 16, 17, 18, 18, 19; nodes are expanded in increasing order of cost, the first three expansions marked (1), (2), (3).]

BeFS(root)
begin
  open = {root};  /* open is the list of generated but not expanded nodes---partial solns */
  best_soln_cost = infinity;
  while open != nullset do begin
    curr = first(open);
    if curr is a soln then
      return(curr)  /* curr is an optimal soln */
    else
      children = Expand_&_est_cost(curr);  /* generate all children of curr & estimate their
                                              costs---cost(u) should be a lower bound of the
                                              cost of the best soln reachable from u */
    for each child in children do begin
      if child is a soln then
        delete all nodes w in open s.t. cost(w) >= cost(child);
      endif
      store child in open in increasing order of cost;
    endfor
  endwhile
end /* BeFS */

Expand_&_est_cost(Y)
begin
  children = nullset;
  for each basic elt x of the problem reachable from Y that can be part of the current partial soln Y do begin
    if x not in Y and feasible then begin
      child = Y U {x};
      path_cost(child) = path_cost(Y) + cost(Y, x);  /* cost(Y,x) is cost of reaching x from Y */
      est(child) = lower bound cost of best soln reachable from child;
      cost(child) = path_cost(child) + est(child);
      children = children U {child};
    endif
  endfor
end /* Expand_&_est_cost */

Best-First Search (contd)
Proof of optimality when cost is a LB:
- The current set of nodes in open represents a complete front of generated nodes, i.e., the rest of the nodes in the search space are descendants of open.
- Assuming the basic cost (cost of adding an elt to a partial soln to construct another partial soln that is closer to the soln) is non-negative, the cost is monotonic, i.e., cost of child >= cost of parent.
- If the first node curr in open is a soln, then cost(curr) <= cost(w) for each w in open.
- The cost of any node in the search space not in open and not yet generated is >= the cost of its ancestor in open, and thus >= cost(curr). Thus curr is the optimal (min-cost) soln.
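A generic BeFS sketch with a priority queue; is_soln and expand are hypothetical problem-specific callbacks, and with a valid lower-bound cost the first popped solution is optimal per the proof above.

    import heapq

    def befs(root, is_soln, expand):
        """Best-first search: expand returns (cost, child) pairs where
        cost = path_cost + lower-bound estimate of the best soln
        reachable from the child."""
        open_list = [(0, 0, root)]      # (cost, tie-breaker, node)
        tie = 0
        while open_list:
            cost, _, curr = heapq.heappop(open_list)
            if is_soln(curr):
                return cost, curr       # optimal: all open costs >= this
            for child_cost, child in expand(curr):
                tie += 1                # unique tie-breaker avoids
                heapq.heappush(open_list, (child_cost, tie, child))
        return None                     # no soln in the search space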

Search techs for a TSP example

[Figure: a TSP graph on cities {A, B, C, D, E, F} with edge weights, and the DFS (w/ backtrack) search tree from start city A; complete tours (solution nodes) have costs such as 27, 31, and 33.]

Exhaustive search using DFS (w/ backtrack) for finding an optimal solution

Search techs for a TSP example (contd)

[Figure: BeFS tree for the same TSP. Each node's cost = path cost + MST-based lower bound; e.g., for the partial tour (A, E, F), path cost = 8 and the estimate is cost(MST{F, A, B, C, D}) = 16, giving 8 + 16 = 24. Other nodes carry costs such as 5+15, 11+14, 14+9, 21+6, 22+9, and 23+8.]

Lower-bound cost estimate: cost(MST({unvisited cities} U {current city} U {start city})). This is a LB because the estimating structure (a spanning tree) is a relaxation of the reqd soln structure (a tour/cycle): min_cost(S') <= min_cost(S) if S' is a superset of S.

BeFS for finding an optimal solution (a sketch of the MST bound follows)
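A sketch of the slide's MST lower bound: Prim's algorithm over the unvisited cities plus the current and start cities. It assumes a complete symmetric dist dict keyed by city pairs; the 4-city instance is illustrative.

    def mst_cost(nodes, dist):
        """Prim's MST over nodes: a tour through these cities costs at
        least the spanning-tree cost, so this is a valid BeFS bound."""
        nodes = list(nodes)
        in_tree, total = {nodes[0]}, 0.0
        while len(in_tree) < len(nodes):
            w, v = min((dist[u, x], x) for u in in_tree
                       for x in nodes if x not in in_tree)
            in_tree.add(v); total += w
        return total

    def tsp_lower_bound(path, cities, dist):
        """Slide's estimate for a partial tour: path cost so far plus
        cost(MST({unvisited} U {current city} U {start city}))."""
        path_cost = sum(dist[path[i], path[i+1]] for i in range(len(path)-1))
        rest = (set(cities) - set(path)) | {path[-1], path[0]}
        return path_cost + mst_cost(rest, dist)

    cities = ['A', 'B', 'C', 'D']
    dist = {(a, b): d for a, b, d in
            [('A','B',5), ('A','C',3), ('A','D',4),
             ('B','C',2), ('B','D',6), ('C','D',1)]}
    dist.update({(b, a): d for (a, b), d in list(dist.items())})
    print(tsp_lower_bound(['A', 'B'], cities, dist))  # 5 + MST cost 6 = 11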

BeFS for 0/1 ILP Solution

[Figure: branch-&-bound tree. X = {x1, ..., xm} are 0/1 vars. The root branches on x2 (x2=0 / x2=1), the x2=1 node branches on x4, and the x2=1, x4=1 node branches on x5. At each node, solve the LP relaxation with the fixed vars as constraints: Cost = cost(LP) = C1 for x2=0; C2 for x2=1; C3 for x2=1, x4=0; C4 for x2=1, x4=1; C5 for x2=1, x4=1, x5=0; C6 for x2=1, x4=1, x5=1. Cost relations: C5 < C3 < C1 < C6; C2 < C1; C4 < C3. The node with cost C5 yields the optimal soln.]
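A hedged sketch of this branch-&-bound using LP-relaxation bounds; it assumes SciPy, a minimization form, and a simplistic branching-variable choice (none of which are prescribed by the slides).

    import heapq
    import numpy as np
    from scipy.optimize import linprog

    def bb_01_ilp(c, A_ub, b_ub):
        """Best-first B&B for min c^T x s.t. A_ub x <= b_ub, x in {0,1}^m,
        bounding each node by its LP relaxation (cf. the tree above)."""
        m = len(c)

        def relax(fixed):   # LP relaxation with fixed vars pinned via bounds
            bnds = [(fixed.get(i, 0), fixed.get(i, 1)) for i in range(m)]
            return linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bnds)

        tie = 0
        root = relax({})
        heap = [(root.fun, 0, {}, root.x)] if root.success else []
        while heap:
            cost, _, fixed, x = heapq.heappop(heap)   # smallest LP bound
            frac = [i for i in range(m) if 1e-6 < x[i] < 1 - 1e-6]
            if not frac:                   # integral LP soln at best bound:
                return cost, np.round(x)   # optimal (all other bounds >= it)
            i = frac[0]                    # branch on a fractional var
            for v in (0, 1):
                res = relax({**fixed, i: v})
                if res.success:            # feasible child: bound = LP cost
                    tie += 1
                    heapq.heappush(heap, (res.fun, tie, {**fixed, i: v}, res.x))
        return None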

Iterative Improvement Techniques

Deterministic greedy:
- Locally/immediately greedy: make the move that is immediately (locally) best (e.g., FM); repeat until no further improvement.
- Non-locally greedy: make the move that is best according to some non-immediate (non-local) metric (e.g., probability-based lookahead as in PROP); repeat until no further improvement.

Stochastic (non-greedy):
- Make a combination of deterministic greedy moves and probabilistic moves that cause a deterioration (the latter can help jump out of local minima); repeat until the stopping criteria are satisfied.
- Stopping criteria could be an upper bound on the total # of moves or iterations. A stochastic sketch follows.
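One common instance of the stochastic scheme is simulated annealing; a minimal sketch follows, where cost and neighbor are hypothetical problem-specific callbacks and the cooling schedule is an illustrative choice.

    import math, random

    def anneal(soln, cost, neighbor, T0=1.0, alpha=0.95, max_moves=10000):
        """Stochastic iterative improvement: accept improving moves
        greedily; accept deteriorating moves with probability
        exp(-delta/T), which shrinks as the temperature T cools."""
        best, best_cost = soln, cost(soln)
        cur, cur_cost, T = soln, best_cost, T0
        for _ in range(max_moves):          # stopping criterion: move bound
            nxt = neighbor(cur)
            delta = cost(nxt) - cur_cost
            if delta <= 0 or random.random() < math.exp(-delta / T):
                cur, cur_cost = nxt, cur_cost + delta
                if cur_cost < best_cost:    # track the best soln seen
                    best, best_cost = cur, cur_cost
            T *= alpha                      # cool the temperature
        return best, best_cost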
