
Algorithmic Techniques in VLSI CAD

Shantanu Dutt University of Illinois at Chicago

Algorithms in VLSI CAD


- Divide & Conquer (D&C) [e.g., merge-sort, partition-driven placement]
- Reduce & Conquer (R&C) [e.g., multilevel techniques such as the hMetis partitioner]
- Dynamic programming [e.g., matrix multiplication, optimal buffer insertion]
- Mathematical programming: linear, quadratic, 0/1 integer programming [e.g., floorplanning, global placement]

Algorithms in VLSI CAD (contd)


Search Methods:
- Depth-first search (DFS): mainly used to find any solution when cost is not an issue [e.g., FPGA detailed routing---cost generally determined at the global routing phase]
- Breadth-first search (BFS): mainly used to find a soln at min. distance from the root of the search tree [e.g., maze routing when cost = dist. from root]
- Best-first search (BeFS): used to find optimal solutions w/ any cost function; can be done when a provable lower bound of the cost can be determined for each branching choice from the current partial-soln node [e.g., TSP, global routing]

Iterative Improvement: deterministic, stochastic

Determine if the problem can be solved in a hierarchical or divide-&-conquer (D&C) manner:


[Figure: D&C tree. Root problem A is divided into subproblems A1 and A2, recursively into A1,1, A1,2, A2,1, A2,2; the solns to A1 and A2 are stitched up to form the complete soln to A. Recurse until the subproblem size is such that TT-based design is doable.]

D&C approach: See if the problem can be broken up into 2 or more smaller subproblems that can be stitched up to give a soln. to the parent prob. Do this recursively for each large subprob until the subprobs are small enough for an easy solution technique (could be exhaustive!). If the subprobs are of a similar kind to the root prob, then the breakup and stitching will also be similar. A minimal sketch follows.
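As a minimal sketch of the D&C template using merge-sort (one of the examples named above): the break-up is the array split, and the stitch-up is the merge.

    def merge_sort(a):
        """D&C: divide into two subproblems, solve recursively, stitch up."""
        if len(a) <= 1:              # subproblem small enough: solve directly
            return a
        mid = len(a) // 2
        left = merge_sort(a[:mid])   # subproblem A1
        right = merge_sort(a[mid:])  # subproblem A2
        out, i, j = [], 0, 0         # stitch-up: merge the two sorted halves
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i]); i += 1
            else:
                out.append(right[j]); j += 1
        return out + left[i:] + right[j:]

    print(merge_sort([5, 2, 8, 1, 9, 3]))   # [1, 2, 3, 5, 8, 9]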

Reduce-&-Conquer
Examples: Multilevel graph/hypergraph partitioning (e.g., hMetis), multilevel routing

Flow: reduce problem size (coarsening) -> solve the small coarsest problem -> uncoarsen and refine the solution level by level
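A hedged skeleton of this coarsen/solve/uncoarsen flow (not hMetis itself): greedy edge matching is one common coarsening choice, and solve_small/refine are hypothetical problem-specific callbacks.

    def coarsen(edges, n):
        """One coarsening level: greedy edge matching; matched vertex
        pairs are merged into a single coarse vertex."""
        matched = [-1] * n
        cmap = [0] * n              # fine vertex -> coarse vertex
        nxt = 0
        for u, v in edges:
            if matched[u] < 0 and matched[v] < 0 and u != v:
                matched[u] = v; matched[v] = u
                cmap[u] = cmap[v] = nxt; nxt += 1
        for u in range(n):
            if matched[u] < 0:
                cmap[u] = nxt; nxt += 1
        cedges = [(cmap[u], cmap[v]) for u, v in edges if cmap[u] != cmap[v]]
        return cedges, cmap, nxt

    def multilevel(edges, n, solve_small, refine, small=8):
        """R&C skeleton: coarsen until small, solve, then project the
        coarse soln back and refine locally at each level."""
        if n <= small:
            return solve_small(edges, n)
        cedges, cmap, cn = coarsen(edges, n)
        csoln = multilevel(cedges, cn, solve_small, refine, small)
        soln = [csoln[cmap[u]] for u in range(n)]   # uncoarsen: project back
        return refine(edges, n, soln)               # local refinement pass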

Dynamic Programming (DP)


Stitch-up function f: optimal soln of root = f(optimal solns of subproblems) = f(opt(A1), opt(A2), opt(A3), opt(A4))

[Figure: root problem A connected via the stitch-up function to subproblems A1, A2, A3, A4.]

The above property means that every time we optimally solve a subproblem, we can store/record the soln and reuse it every time it is part of the formulation of a higher-level problem

Dynamic Programming (contd)



Matrix multiplication example: find the most computationally efficient way to perform the series of matrix mults M = M1 x M2 x ... x Mn, where Mi is of size ri x ci with ri = c_(i-1) for i > 1.

DP formulation: opt_seq(M) = (by defn) opt_seq(M(1,n)), where
opt_seq(M(1,n)) = min over i = 1 to n-1 of { opt_seq(M(1,i)) + opt_seq(M(i+1,n)) + r1*ci*cn }

Correctness rests on the property that the optimal ways of multiplying M1 x ... x Mi and M(i+1) x ... x Mn are exactly what the min stitch-up function combines to determine the optimal soln for M. Thus if the optimal soln involves a cut at Mr, then opt_seq(M(1,r)) and opt_seq(M(r+1,n)) will be part of opt_seq(M). Perform the computation bottom-up (smallest sequences first); see the sketch below.

Complexity: each subsequence M(j,k) appears in the above computation and is solved exactly once (irrespective of how many times it appears). The time to solve M(j,k), not counting the time to solve its subproblems (which is accounted for in their own complexities), is l - 1, where l = k - j + 1 is the length of the sequence (a min over l - 1 cut options is computed). The number of different M(j,k)'s of length l is n - l + 1, for 2 <= l <= n. Total complexity = Sum over l = 2 to n of (l-1)(n-l+1) = O(n^3) (as opposed to, say, O(2^n) using exhaustive search).
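A minimal bottom-up DP for the matrix-chain example above; dims encodes the ri x ci sizes, and cost[j][k] stores opt_seq(M(j,k)), computed once and reused.

    def matrix_chain(dims):
        """dims[i-1] x dims[i] is the size of Mi; returns the min # of
        scalar multiplications. Bottom-up over subsequence length, as
        in the slide's complexity argument."""
        n = len(dims) - 1                      # number of matrices
        cost = [[0] * (n + 1) for _ in range(n + 1)]
        for l in range(2, n + 1):              # subsequence length
            for j in range(1, n - l + 2):      # M(j..k), k = j + l - 1
                k = j + l - 1
                cost[j][k] = min(cost[j][i] + cost[i + 1][k]
                                 + dims[j - 1] * dims[i] * dims[k]
                                 for i in range(j, k))
        return cost[1][n]

    # e.g. M1: 10x30, M2: 30x5, M3: 5x60 -> 4500 via (M1 M2) M3
    print(matrix_chain([10, 30, 5, 60]))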

A DP Example: Simple Buffer Insertion Problem


Given: source and sink locations, sink capacitances and RATs (required arrival times), a buffer type, source delay rules, and unit wire resistance and capacitance

[Figure: a source s0 driving a routing tree to four sinks with required arrival times RAT1..RAT4; a buffer may be inserted at candidate locations along the tree.]
Courtesy: Chuck Alpert, IBM

Simple Buffer Insertion Problem (contd)


Find: Buffer locations and a routing tree such that slack/RAT at the source is maximized

q(s0) = min over 1 <= i <= 4 of { RAT(si) - delay(s0, si) }


[Figure: source s0 and sinks with RAT1..RAT4.]
Courtesy: Chuck Alpert, IBM

Slack/RAT Example

Unbuffered: sink 1 has RAT = 500, delay = 400 (slack +100); sink 2 has RAT = 400, delay = 600 (slack -200); source slack/RAT = min(+100, -200) = -200.
Buffered: sink 1 has RAT = 500, delay = 350 (slack +150); sink 2 has RAT = 400, delay = 300 (slack +100); source slack/RAT = min(+150, +100) = +100.
Courtesy: Chuck Alpert, IBM

Elmore Delay

[Figure: an RC ladder from A to C: resistance R1 to node B with capacitance C1, then resistance R2 to node C with capacitance C2.]

Delay(A -> C) = R1(C1 + C2) + R2*C2


Courtesy: Chuck Alpert, IBM
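A small sketch of Elmore delay over an RC tree (the dict encoding and node names are illustrative assumptions): each resistance on the source-sink path is multiplied by all capacitance downstream of it, matching the slide's two-segment formula.

    def elmore_delay(tree, root, sink):
        """Elmore delay from root to sink in an RC tree.
        tree: node -> list of (child, R_edge, C_child)."""
        def subtree_cap(u):
            return sum(c + subtree_cap(v) for v, _, c in tree.get(u, []))
        def walk(u, path):                      # find root->sink edge list
            if u == sink:
                return path
            for v, r, c in tree.get(u, []):
                p = walk(v, path + [(r, c, v)])
                if p is not None:
                    return p
            return None
        delay = 0.0
        for r, c, v in walk(root, []):
            delay += r * (c + subtree_cap(v))   # R times all downstream cap
        return delay

    # the slide's two-segment case: R1(C1+C2) + R2*C2
    rc = {'A': [('B', 1.0, 2.0)], 'B': [('C', 3.0, 4.0)]}
    print(elmore_delay(rc, 'A', 'C'))   # 1*(2+4) + 3*4 = 18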

DP Example: Van Ginneken Buffer Insertion Algorithm [ISCAS90]

Associate each leaf node/sink with two metrics (Ct, Tt): downstream loading capacitance (Ct) and RAT (Tt). The DP-based alg propagates potential solutions bottom-up [Van Ginneken, '90].

Add a wire:
  Ct = Cn + Cw
  Tt = Tn - Rw*Ln - (1/2)*Rw*Cw   (note: take Ln = Cn)

[Figure: a wire with capacitance Cw and resistance Rw between the new node (Ct, Tt) and the downstream node (Cn, Tn).]

Add a buffer:
  Ct = Cb
  Tt = Tn - Tb - Rb*Ln

[Figure: a buffer with input capacitance Cb, intrinsic delay Tb, and output resistance Rb inserted above the downstream node (Cn, Tn).]

Merge two solutions: for each pair of soln vectors Zn = (Cn, Tn), Zm = (Cm, Tm) in the 2 subtrees, create a soln vector Zt = (Ct, Tt) where
  Ct = Cn + Cm
  Tt = min(Tn, Tm)

[Figure: two subtrees with solns (Cn, Tn) and (Cm, Tm) merging at a node with soln (Ct, Tt).]
Courtesy: UCLA

DP Example (contd)
- Add a wire to each merged solution Zt (same cap. & delay update formulation as before)
- Add a buffer to each Zt
- Delete all dominated solutions Zd: Zd = (Cd, Td) is dominated if there exists a Zr = (Cr, Tr) s.t. Cd >= Cr and Td <= Tr (i.e., both metrics are no better)
- The remaining soln vectors are all optimal solns for this subtree, and one of them will be part of the optimal solution at the root/driver of the net---this is the DP feature of this algorithm (a code sketch follows below)

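A compact sketch of the bottom-up propagation using the slide's three operations plus pruning; the parameter values in the usage lines are illustrative, not the slides' example.

    def prune(solns):
        """Delete dominated (C, T) vectors: keep a soln only if no soln
        with smaller-or-equal cap already achieves a >= RAT."""
        solns.sort(key=lambda s: (s[0], -s[1]))
        kept = []
        for c, t in solns:
            if not kept or t > kept[-1][1]:
                kept.append((c, t))
        return kept

    def add_wire(solns, cw, rw):
        """Slide's wire update: C' = C + Cw, T' = T - Rw*C - Rw*Cw/2."""
        return [(c + cw, t - rw * c - rw * cw / 2) for c, t in solns]

    def add_buffer(solns, cb, tb, rb):
        """Keep the buffered option alongside each unbuffered soln:
        C' = Cb, T' = T - Tb - Rb*C."""
        return solns + [(cb, t - tb - rb * c) for c, t in solns]

    def merge(s1, s2):
        """Branch merge: caps add, RAT is the min of the two branches."""
        return [(c1 + c2, min(t1, t2)) for c1, t1 in s1 for c2, t2 in s2]

    # bottom-up over the net: start at each sink with [(Csink, RAT)],
    # alternate add_wire/add_buffer along branches, merge at branch
    # points, prune after every step; at the driver pick max slack.
    cands = prune(add_buffer(add_wire([(20, 400)], cw=10, rw=1),
                             cb=5, tb=30, rb=1))
    print(cands)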

Van Ginneken Example


[Figure: worked example along one branch. The sink soln is (20, 400). Adding a wire (C=10, d=150) gives (30, 250); optionally adding a buffer (C=5, d=30) gives (5, 220). Adding the next wire (C=15; d=200 on the unbuffered soln, d=120 on the buffered one) gives (45, 50) and (20, 100); optionally buffering again (d=50 resp. d=30) gives (5, 0) and (5, 70). Candidate set so far: {(45, 50), (5, 0), (20, 100), (5, 70)}.]
Courtesy: Chuck Alpert, IBM

Van Ginneken Example (contd)

Candidate set: {(45, 50), (5, 0), (20, 100), (5, 70)}. Prune dominated solns: (5, 0) is inferior to (5, 70), and (45, 50) is inferior to (20, 100); {(20, 100), (5, 70)} remain.

Adding the final wire (C=10) to reach the driver gives (30, 10) and (15, -10). Pick the solution with the largest slack, then follow the arrows (recorded choices) back down the tree to recover the buffer placement.


Courtesy: Chuck Alpert, IBM

Mathematical Programming

- Linear programming (LP): e.g., Obj: Min 2x1 - x2 + x3 w/ constraints x1 + x2 <= a, x1 - x3 <= b -- solvable in polynomial time
- Quadratic programming (QP): e.g., Min x1^2 + x2*x3 w/ linear constraints -- solvable in polynomial (cubic) time w/ equality constraints
- Mixed integer linear prog (ILP) and mixed integer quad. prog (IQP): some vars are integers -- NP-hard
- Mixed 0/1 integer linear prog (0/1 ILP) and mixed 0/1 integer quad. prog (0/1 IQP): some vars are in {0,1} -- NP-hard

0/1 ILP/IQP Examples

- Generally useful for assignment problems, where objects {O1, ..., On} are assigned to bins {B1, ..., Bm}
- 0/1 variable xi,j = 1 iff object Oi is assigned to bin Bj
- Min-cut bi-partitioning for a graph G(V,E) can be modeled as a 0/1 IQP:
  - xi,1 = 1 => ui in V1, else ui in V2
  - Edge (ui, uj) is in the cutset iff xi,1(1 - xj,1) + (1 - xi,1)xj,1 = 1
  - Objective function: Min Sum over (ui, uj) in E of c(i,j)*[xi,1(1 - xj,1) + (1 - xi,1)xj,1]
  - Constraint: Sum over ui of w(ui)*xi,1 <= max-size

A small sketch evaluating this objective follows.
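A tiny illustration of the 0/1 IQP objective and size constraint; the 4-vertex instance and the symmetric bound on both sides are illustrative assumptions, and real instances would go to an IQP solver rather than enumeration.

    from itertools import product

    def cut_cost(edges, x):
        """0/1 IQP objective: edge (i, j, c) contributes c iff its
        endpoints get different 0/1 values, i.e. x_i(1-x_j) + (1-x_i)x_j = 1."""
        return sum(c * (x[i] * (1 - x[j]) + (1 - x[i]) * x[j])
                   for i, j, c in edges)

    def size_ok(weights, x, max_size):
        """Constraint: Sum w(ui)*xi,1 <= max-size (applied to both sides
        here so neither partition is empty -- an added assumption)."""
        s = sum(w * xi for w, xi in zip(weights, x))
        return s <= max_size and sum(weights) - s <= max_size

    edges = [(0, 1, 2), (1, 2, 1), (2, 3, 2), (0, 3, 1)]   # (ui, uj, c(i,j))
    weights = [1, 1, 1, 1]
    best = min((x for x in product((0, 1), repeat=4) if size_ok(weights, x, 2)),
               key=lambda x: cut_cost(edges, x))
    print(best, cut_cost(edges, best))   # e.g. (0, 0, 1, 1) with cut cost 2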

Search Techniques

[Figure: an example weighted graph on vertices {A, B, C, D, E, F, G} together with its DFS and BFS search trees; the numbers give edge costs/visit order.]
dfs(v) /* for basic graph visit or for soln finding when nodes are partial solns */
  v.mark = 1;
  for each (v,u) in E
    if (u.mark != 1) then dfs(u)

Algorithm Depth_First_Search
  for each v in V
    v.mark = 0;
  for each v in V
    if (v.mark = 0) then
      if G has partial soln nodes then dfs(v);
      else soln_dfs(v);

soln_dfs(v) /* used when nodes are basic elts of the problem and not partial soln nodes */
  v.mark = 1;
  if path to v is a soln then return(1);
  for each (v,u) in E
    if (u.mark != 1) then
      soln_found = soln_dfs(u)
      if (soln_found = 1) then return(soln_found)
  end for;
  v.mark = 0; /* can visit v again to form another soln on a different path */
  return(0)

optimal_soln_dfs(v) /* used when nodes are basic elts of the problem and not partial soln nodes */
begin
  v.mark = 1;
  if path to v is a soln then begin
    if cost < best_cost then begin
      best_soln = soln; best_cost = cost;
    endif
    v.mark = 0;
    return;
  endif
  for each (v,u) in E
    if (u.mark != 1) then optimal_soln_dfs(u)
  end for;
  v.mark = 0; /* can visit v again to form another soln on a different path */
end

Search Techniques: Exhaustive DFS

[Figure: the example weighted graph again, with a DFS traversal order marked.]

Algorithm Depth_First_Search
  for each v in V
    v.mark = 0;
  best_cost = infinity;
  optimal_soln_dfs(root);
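A minimal runnable rendering of optimal_soln_dfs above; the graph, edge costs, and the "visit every vertex" solution test are illustrative assumptions.

    import math

    def optimal_soln_dfs(graph, costs, v, path, cost, target_len, best):
        """Exhaustive DFS with backtracking: enumerate all simple paths
        that form a soln (here: visiting target_len vertices) and keep
        the cheapest in best = [best_cost, best_soln]."""
        path.append(v)                              # v.mark = 1
        if len(path) == target_len:                 # "path to v is a soln"
            if cost < best[0]:
                best[0], best[1] = cost, list(path)
        else:
            for u in graph[v]:
                if u not in path:                   # u.mark != 1
                    optimal_soln_dfs(graph, costs, u, path,
                                     cost + costs[v, u], target_len, best)
        path.pop()                                  # v.mark = 0: backtrack

    graph = {'A': ['B', 'C'], 'B': ['A', 'C'], 'C': ['A', 'B']}
    costs = {('A','B'): 2, ('B','A'): 2, ('A','C'): 5, ('C','A'): 5,
             ('B','C'): 1, ('C','B'): 1}
    best = [math.inf, None]
    optimal_soln_dfs(graph, costs, 'A', [], 0, 3, best)
    print(best)   # cheapest path from A visiting all 3 vertices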

Best-First Search

[Figure: a search tree with node costs 10, 12, 15, 16, 17, 18, 18, 19; nodes are expanded in increasing order of cost, the first three expansions marked (1), (2), (3).]

BeFS(root)
begin
  open = {root};  /* open is the list of generated but not expanded nodes---partial solns */
  best_soln_cost = infinity;
  while open != nullset do begin
    curr = first(open);
    if curr is a soln then
      return(curr)  /* curr is an optimal soln */
    else
      children = Expand_&_est_cost(curr);  /* generate all children of curr & estimate their
                                              costs---cost(u) should be a lower bound of the
                                              cost of the best soln reachable from u */
    for each child in children do begin
      if child is a soln then
        delete all nodes w in open s.t. cost(w) >= cost(child);
      endif
      store child in open in increasing order of cost;
    endfor
  endwhile
end /* BeFS */

Expand_&_est_cost(Y)
begin
  children = nullset;
  for each basic elt x of the problem reachable from Y that can be part of the current partial soln Y do begin
    if x not in Y and feasible then begin
      child = Y U {x};
      path_cost(child) = path_cost(Y) + cost(Y, x);  /* cost(Y,x) is cost of reaching x from Y */
      est(child) = lower bound cost of best soln reachable from child;
      cost(child) = path_cost(child) + est(child);
      children = children U {child};
    endif
  endfor
end /* Expand_&_est_cost */

Best-First Search (contd)
Proof of optimality when cost is a LB:
- The current set of nodes in open represents a complete front of generated nodes, i.e., the rest of the nodes in the search space are descendants of open.
- Assuming the basic cost (cost of adding an elt to a partial soln to construct another partial soln that is closer to the soln) is non-negative, the cost is monotonic, i.e., cost of child >= cost of parent.
- If the first node curr in open is a soln, then cost(curr) <= cost(w) for each w in open.
- The cost of any node in the search space not in open and not yet generated is >= the cost of its ancestor in open, and thus >= cost(curr). Thus curr is the optimal (min-cost) soln.
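A generic BeFS sketch with a priority queue; is_soln and expand are hypothetical problem-specific callbacks, and with a valid lower-bound cost the first popped solution is optimal per the proof above.

    import heapq

    def befs(root, is_soln, expand):
        """Best-first search: expand returns (cost, child) pairs where
        cost = path_cost + lower-bound estimate of the best soln
        reachable from the child."""
        open_list = [(0, 0, root)]      # (cost, tie-breaker, node)
        tie = 0
        while open_list:
            cost, _, curr = heapq.heappop(open_list)
            if is_soln(curr):
                return cost, curr       # optimal: all open costs >= this
            for child_cost, child in expand(curr):
                tie += 1                # unique tie-breaker avoids
                heapq.heappush(open_list, (child_cost, tie, child))
        return None                     # no soln in the search space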

Search techs for a TSP example

[Figure: a TSP graph on cities {A, B, C, D, E, F} with edge weights, and the DFS (w/ backtrack) search tree from start city A; complete tours (solution nodes) have costs such as 27, 31, and 33.]

Exhaustive search using DFS (w/ backtrack) for finding an optimal solution

Search techs for a TSP example (contd)

[Figure: BeFS tree for the same TSP. Each node's cost = path cost + MST-based lower bound; e.g., for the partial tour (A, E, F), path cost = 8 and the estimate is cost(MST{F, A, B, C, D}) = 16, giving 8 + 16 = 24. Other nodes carry costs such as 5+15, 11+14, 14+9, 21+6, 22+9, and 23+8.]

Lower-bound cost estimate: cost(MST({unvisited cities} U {current city} U {start city})). This is a LB because the estimating structure (a spanning tree) is a relaxation of the reqd soln structure (a tour/cycle): min_cost(S') <= min_cost(S) if S' is a superset of S.

BeFS for finding an optimal solution (a sketch of the MST bound follows)
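A sketch of the slide's MST lower bound: Prim's algorithm over the unvisited cities plus the current and start cities. It assumes a complete symmetric dist dict keyed by city pairs; the 4-city instance is illustrative.

    def mst_cost(nodes, dist):
        """Prim's MST over nodes: a tour through these cities costs at
        least the spanning-tree cost, so this is a valid BeFS bound."""
        nodes = list(nodes)
        in_tree, total = {nodes[0]}, 0.0
        while len(in_tree) < len(nodes):
            w, v = min((dist[u, x], x) for u in in_tree
                       for x in nodes if x not in in_tree)
            in_tree.add(v); total += w
        return total

    def tsp_lower_bound(path, cities, dist):
        """Slide's estimate for a partial tour: path cost so far plus
        cost(MST({unvisited} U {current city} U {start city}))."""
        path_cost = sum(dist[path[i], path[i+1]] for i in range(len(path)-1))
        rest = (set(cities) - set(path)) | {path[-1], path[0]}
        return path_cost + mst_cost(rest, dist)

    cities = ['A', 'B', 'C', 'D']
    dist = {(a, b): d for a, b, d in
            [('A','B',5), ('A','C',3), ('A','D',4),
             ('B','C',2), ('B','D',6), ('C','D',1)]}
    dist.update({(b, a): d for (a, b), d in list(dist.items())})
    print(tsp_lower_bound(['A', 'B'], cities, dist))  # 5 + MST cost 6 = 11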

BeFS for 0/1 ILP Solution

[Figure: branch-&-bound tree. X = {x1, ..., xm} are 0/1 vars. The root branches on x2 (x2=0 / x2=1), the x2=1 node branches on x4, and the x2=1, x4=1 node branches on x5. At each node, solve the LP relaxation with the fixed vars as constraints: Cost = cost(LP) = C1 for x2=0; C2 for x2=1; C3 for x2=1, x4=0; C4 for x2=1, x4=1; C5 for x2=1, x4=1, x5=0; C6 for x2=1, x4=1, x5=1. Cost relations: C5 < C3 < C1 < C6; C2 < C1; C4 < C3. The node with cost C5 yields the optimal soln.]
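A hedged sketch of this branch-&-bound using LP-relaxation bounds; it assumes SciPy, a minimization form, and a simplistic branching-variable choice (none of which are prescribed by the slides).

    import heapq
    import numpy as np
    from scipy.optimize import linprog

    def bb_01_ilp(c, A_ub, b_ub):
        """Best-first B&B for min c^T x s.t. A_ub x <= b_ub, x in {0,1}^m,
        bounding each node by its LP relaxation (cf. the tree above)."""
        m = len(c)

        def relax(fixed):   # LP relaxation with fixed vars pinned via bounds
            bnds = [(fixed.get(i, 0), fixed.get(i, 1)) for i in range(m)]
            return linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bnds)

        tie = 0
        root = relax({})
        heap = [(root.fun, 0, {}, root.x)] if root.success else []
        while heap:
            cost, _, fixed, x = heapq.heappop(heap)   # smallest LP bound
            frac = [i for i in range(m) if 1e-6 < x[i] < 1 - 1e-6]
            if not frac:                   # integral LP soln at best bound:
                return cost, np.round(x)   # optimal (all other bounds >= it)
            i = frac[0]                    # branch on a fractional var
            for v in (0, 1):
                res = relax({**fixed, i: v})
                if res.success:            # feasible child: bound = LP cost
                    tie += 1
                    heapq.heappush(heap, (res.fun, tie, {**fixed, i: v}, res.x))
        return None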

Iterative Improvement Techniques

Deterministic greedy:
- Locally/immediately greedy: make the move that is immediately (locally) best (e.g., FM); repeat until no further improvement.
- Non-locally greedy: make the move that is best according to some non-immediate (non-local) metric (e.g., probability-based lookahead as in PROP); repeat until no further improvement.

Stochastic (non-greedy):
- Make a combination of deterministic greedy moves and probabilistic moves that cause a deterioration (the latter can help jump out of local minima); repeat until the stopping criteria are satisfied.
- Stopping criteria could be an upper bound on the total # of moves or iterations. A stochastic sketch follows.
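One common instance of the stochastic scheme is simulated annealing; a minimal sketch follows, where cost and neighbor are hypothetical problem-specific callbacks and the cooling schedule is an illustrative choice.

    import math, random

    def anneal(soln, cost, neighbor, T0=1.0, alpha=0.95, max_moves=10000):
        """Stochastic iterative improvement: accept improving moves
        greedily; accept deteriorating moves with probability
        exp(-delta/T), which shrinks as the temperature T cools."""
        best, best_cost = soln, cost(soln)
        cur, cur_cost, T = soln, best_cost, T0
        for _ in range(max_moves):          # stopping criterion: move bound
            nxt = neighbor(cur)
            delta = cost(nxt) - cur_cost
            if delta <= 0 or random.random() < math.exp(-delta / T):
                cur, cur_cost = nxt, cur_cost + delta
                if cur_cost < best_cost:    # track the best soln seen
                    best, best_cost = cur, cur_cost
            T *= alpha                      # cool the temperature
        return best, best_cost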
