transistor sizes. We have tested a standard optimizer, a Monte Carlo scheme and a method based on Genetic Algorithms combined with very accurate SPICE simulations to automatically optimize transistor sizes of three different digital CMOS circuits. While the standard optimizer and the Monte Carlo scheme are advantageous for small circuits, the method based on Genetic Algorithms was found to be more stable for larger circuits.
1 Introduction
Transistor size optimization is a traditional task in VLSI (Very Large Scale Integration) design. It is used to improve the performance of a circuit to achieve a design goal in a specific technology. This design goal can be boosting operating speed, lowering power consumption, or lowering area requirements. In this context the netlist of the circuit is already determined; only the width and the length of the MOS (Metal Oxide Semiconductor) transistors can be adjusted. The figures of merit depend in a complex way on the individual sizes of the transistors. Changing transistor sizes in a circuit often leads to surprising results, which are not easily predicted. It is hard to optimize digital CMOS circuits because their operation can only be modeled by distinguishing between different operating regions for each transistor depending on its terminal voltages (large signal behavior). In contrast, analog circuits can be expressed using small signal equivalents of the transistors, which are linearized in the operating point. Even for a two-transistor digital circuit such as an inverter (see Fig. 1) it is very hard to get accurate design equations. For example, a step input to a single inverter results in two operating regions with two differential equations (using level 2 models [Sah64]). If the step input is changed to a more plausible but still not entirely realistic linearly rising ramp, the circuit traverses five regions of operation, each of which is described by its own differential equation. In general, circuit designers manually optimize circuits by trial and error using an accurate circuit simulator such as SPICE. Good intuition and the designer's understanding of the transistors are important. Still, it is tedious work to assign sizes (i.e. width and length of the channel of the MOS) to all
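[Fig. 1: a) CMOS inverter with PMOS transistor MP and NMOS transistor MN (each of size W/L), input IN, output OUT, and load capacitance CLoad; b) IN and OUT waveforms with the propagation delays tpd10 and tpd01.]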
the accuracy declines. To further reduce complexity, only the output transistor pair of each gate of combinational blocks has been optimized. In the next section we will introduce the three digital CMOS circuits with their optimization criteria and objective functions. Then we will look at the optimization methods in Sect. 3. In Sect. 4 the results are presented and discussed before drawing conclusions in the last section.
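[Figures: schematic of the three-input NAND gate (NAND3) with PMOS transistors MPA, MPB, MPC, inputs A, B, C, and NMOS transistor MNA; schematic of a flip-flop circuit with transistors M1 to M9 and M8a, inverters I1 and I2, clock CLK, nodes A, B, C, E, FD, FU, FL, and outputs Q and QB.]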
3 Optimization Methods
3.1 Monte Carlo
The first stochastic optimization method is based on [Wur93], where transistors of Domino CMOS circuits were sized in an incremental way. The procedure is as follows:
1. Assign a set of random sizes, one for each transistor.
2. Evaluate the solution (calculate the fitness function).
3. Increase or decrease each size of the set by a fixed step (minimum feature size), or do not change it at all (each with equal probability).
4. Evaluate the solution; if this set has a better fitness, continue with this set, otherwise use the previous one.
5. If the maximum number of evaluations is reached, stop; otherwise go to 3.
We used this Monte Carlo (MC) method for all three circuits and incorporated an adaptive scheme. The step size is first set to four times the minimum feature size.
Fig. 4. Complementary Pass Gate Full Adder (CPLFA). For simplicity not all wires are drawn; nodes with the same name are connected
After the first third of the optimization it is reduced to twice the minimum feature size and to the minimum for the last third. Duration of the optimization was limited by the number of function evaluations.
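The incremental Monte Carlo procedure with its adaptive step size can be sketched as follows. This is a minimal illustration, not the authors' implementation: `toy_fitness` is a hypothetical stand-in for a SPICE-based objective, sizes are expressed in multiples of the minimum feature size, the bounds `min_size` and `max_size` are assumptions, and a lower fitness value is assumed to be better.

```python
import random

# Hypothetical target sizes; a real run would evaluate the circuit with SPICE.
TARGET = [10, 20, 30, 10, 20, 30]

def toy_fitness(sizes):
    """Stand-in objective (assumption): squared distance to TARGET, lower is better."""
    return sum((s - t) ** 2 for s, t in zip(sizes, TARGET))

def monte_carlo_sizing(fitness, n_transistors, max_evals,
                       min_size=1, max_size=50, seed=None):
    """Incremental Monte Carlo sizing with a three-stage step schedule."""
    rng = random.Random(seed)
    # Step 1: random initial sizes (multiples of the minimum feature size).
    current = [rng.randint(min_size, max_size) for _ in range(n_transistors)]
    # Step 2: evaluate the initial solution.
    best = fitness(current)
    evals = 1
    # Step 5: stop once the evaluation budget is used up.
    while evals < max_evals:
        # Adaptive scheme: 4x, then 2x, then 1x the minimum feature size.
        progress = evals / max_evals
        step = 4 if progress < 1 / 3 else 2 if progress < 2 / 3 else 1
        # Step 3: increase, decrease, or keep each size with equal probability.
        candidate = [
            min(max_size, max(min_size, s + rng.choice((-step, 0, step))))
            for s in current
        ]
        # Step 4: keep the candidate only if it is better.
        value = fitness(candidate)
        evals += 1
        if value < best:
            current, best = candidate, value
    return current, best

sizes, best = monte_carlo_sizing(toy_fitness, n_transistors=6, max_evals=400, seed=1)
print(sizes, best)
```

Clamping the sizes to a fixed range is a design choice of this sketch; the source does not state how size limits were enforced.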
Three different variants of a genetic algorithm have been used and are summarized in Table 2. These parameters have been used to optimize the three circuits. The number of computations per evaluation was set to 400 for the NAND3, to 1000 for the DFF, and to 3000 for the CPLFA. All experiments were performed using the GENEsYs package [Bac92]. The first setup (PT) is the standard setup provided by this package. It uses fitness proportional selection and two-point crossover. The second setup (RU) reflects empirical knowledge of optimal parameter settings. In particular, the following changes have been made:
- ranking selection, as this is known to overcome some serious disadvantages of proportional selection [Whi89, MSV93a],
- uniform crossover, which has shown its superiority in several investigations (see, e.g., [Sys89]),
- the optimal mutation rate of 1/n, where n is the length of the individual in bits [MSV93b].
The third setup (ES) is based on evolution strategies [Sch81] and uses no recombination but the adaptive mutation scheme AMEM (adaptive mutation excluding mutation rates) (see, e.g., [Bac94]).
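The three ingredients of the RU setup can be illustrated on bit-string individuals. The sketch below is not the GENEsYs code; the function names are hypothetical, linear rank weights are one common choice for ranking selection (an assumption here), and a lower fitness value is assumed to be better.

```python
import random

def rank_select(population, fitnesses, rng):
    """Ranking selection: the selection weight depends only on rank, not on raw fitness."""
    # Sort indices from worst to best (lower fitness is better, by assumption).
    order = sorted(range(len(population)), key=lambda i: fitnesses[i], reverse=True)
    weights = [0] * len(population)
    for rank, idx in enumerate(order, start=1):
        weights[idx] = rank  # worst gets weight 1, best gets weight N
    return rng.choices(population, weights=weights, k=1)[0]

def uniform_crossover(a, b, rng):
    """Uniform crossover: each bit is taken from either parent with probability 1/2."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def mutate(bits, rng):
    """Bit-flip mutation with rate 1/n, where n is the individual's length in bits."""
    rate = 1.0 / len(bits)
    return [bit ^ 1 if rng.random() < rate else bit for bit in bits]

def next_generation(population, fitness, rng):
    """Build one new generation of the same size as the old one."""
    fitnesses = [fitness(ind) for ind in population]
    children = []
    for _ in range(len(population)):
        p1 = rank_select(population, fitnesses, rng)
        p2 = rank_select(population, fitnesses, rng)
        children.append(mutate(uniform_crossover(p1, p2, rng), rng))
    return children

rng = random.Random(0)
population = [[rng.randint(0, 1) for _ in range(16)] for _ in range(10)]
# Hypothetical objective for illustration: minimize the number of 1 bits.
population = next_generation(population, fitness=sum, rng=rng)
```

Because the weights depend only on rank, a single individual with an extreme fitness value cannot dominate selection, which is the disadvantage of proportional selection that the text refers to.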
Figures 5a-c show the statistics for the three problems and the four optimizers at the end of each run. The line through the box marks the data's median, while the upper and lower edges of the box mark the upper and lower quartiles of the data. The box encloses the interquartile range (IQR), that is, the range of the half of the data that clusters around the middle. The tails reach to the upper and lower adjacent values, that is, the largest datum not more than 1.5 IQR above the upper quartile and the smallest datum not more than 1.5 IQR below the lower quartile. All other data points are plotted beyond the ends of the tails as dots. For the two small circuits (NAND3 and DFF) the MC outperforms the three genetic algorithms in both performance and accuracy. For the large example, however, the MC has a much larger variance. The most stable optimizer for the large problem is the RU genetic algorithm. For the CPLFA circuit the performance of the three genetic algorithms is compared in Fig. 6. Here, the average of the 10 best individuals in each generation is shown as a function of the generation number. The graph shows the superiority of the RU setup. The performance of the MC is also added to this graph; however, only every 20th solution has been incorporated.
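The box plot quantities described above can be computed as in the following sketch. The quartile convention is an assumption (the source does not say which one its plotting tool used); here the "inclusive" method of Python's `statistics.quantiles` is taken, and the function name `box_plot_stats` is hypothetical.

```python
import statistics

def box_plot_stats(data):
    """Median, quartiles, IQR, adjacent values, and outliers of a sample."""
    values = sorted(data)
    # Three cut points splitting the data into quarters (assumed convention).
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_limit = q1 - 1.5 * iqr
    high_limit = q3 + 1.5 * iqr
    # Adjacent values: the extreme data points still within 1.5 IQR of the box.
    inside = [v for v in values if low_limit <= v <= high_limit]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "lower_adjacent": inside[0],
        "upper_adjacent": inside[-1],
        # Everything beyond the tails is drawn as individual dots.
        "outliers": [v for v in values if v < low_limit or v > high_limit],
    }

stats = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(stats)
```

In this made-up sample the value 100 lies more than 1.5 IQR above the upper quartile, so it would be plotted as a dot, and the upper tail would end at 9.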
[Fig. 5: box plots of the final fitness versus strategy (PT, RU, ES, MC) for a) NAND3, b) DFF, and c) CPLFA.]
Fig. 6. The optimization process as a function of the generation number for the three GA setups (RU, PT, ES) and the MC. The graph shows the average of the best individuals for the CPLFA.
Also, further computation time might have led to better results, as the objective function values continue to decline. Some of the results were surprising, for example the resulting transistor sizes of a 2-input NAND gate depicted in Fig. 7. In this NAND gate, the NMOS transistors are tapered as one might expect, but the PMOS transistors have different sizes, which seems to be wrong at first glance and is unlikely to be assigned even by an experienced VLSI designer. Simulation of this NAND gate results in equal rise and fall times for any input combination. The sizing therefore must be correct. Analyzing the circuit in more detail reveals the reason. If node Q has to be
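[Fig. 7: optimized 2-input NAND gate with PMOS transistors MPB (28/2) and MPA (22/2), input A, and output node Q; a further transistor size of 10/2 is legible.]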
5 Conclusion
In this paper we have shown that the important but tedious work of manually optimizing transistor sizes of VLSI circuits can be accurately performed by stochastic optimization. We have compared four stochastic optimization methods on three CMOS subcircuits of different complexity (6, 16, 34 MOSFETs). On the smaller two subcircuits the Monte Carlo method has been found to yield better results than the three genetic algorithms investigated. On the largest of the three subcircuits, in contrast, the genetic algorithms were found to yield significantly smaller variances, provided they were allowed to run for a sufficient number of generations. This suggests that the Monte Carlo method tends to end up in a suboptimal solution more frequently as problem size increases. Among the genetic algorithms, the most consistent results were obtained by using ranking selection and uniform crossover. All our experiments started from a random seed; no initial guess was submitted. An advantage of using genetic algorithms is the opportunity to include a set of good initial solutions in the first generation. The optimized circuits had a performance comparable to that of circuits optimized by experienced designers. However, results were obtained automatically (e.g. 8 h CPU time for the CPLFA on a Sparc S-10-30), while manual optimization of a circuit can take many hours of expensive engineer's time. The GA has been applied to the optimization of the dynamic logic flip-flops used in the design of an 800 MHz 1 µm CMOS adder [RH96].
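Seeding the first generation with known good solutions, which the experiments reported here did not do, could look like the following sketch. The function name, the integer size encoding, and the bounds `low` and `high` are all assumptions made for illustration.

```python
import random

def seeded_population(good_solutions, pop_size, n_genes, low, high, rng=None):
    """First generation: known good solutions first, random individuals for the rest."""
    rng = rng or random.Random()
    # Copy the designer-supplied solutions so later mutation cannot alter the originals.
    population = [list(s) for s in good_solutions[:pop_size]]
    # Fill the remaining slots with random sizes to keep the population diverse.
    while len(population) < pop_size:
        population.append([rng.randint(low, high) for _ in range(n_genes)])
    return population

# Hypothetical example: one hand-optimized sizing plus four random individuals.
first_generation = seeded_population([[10, 20, 30]], pop_size=5, n_genes=3,
                                     low=1, high=50, rng=random.Random(1))
```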
References
[Bac92] Thomas Bäck. GENEsYs, 1992. Computer Science Department, LSXI, University of Dortmund, Baroper Str. 301, D-4600 Dortmund 50, Germany.
[Bac94] Thomas Bäck. Evolutionary Algorithms in Theory and Practice. PhD thesis, Fachbereich Informatik, Universität Dortmund, 1994.
[Heu90] L. S. Heusler. Transistor Sizing for Timing Optimization of Combinational Digital CMOS Circuits. PhD thesis, ETH Zürich, 1990.
[HK94] A. M. Hill and S-M. Kang. Genetic Algorithm Based Design Optimization of CMOS VLSI Circuits. In Proceedings of the Third International Conference on Parallel Problem Solving from Nature - PPSN III, pages 546-555, October 1994.
[Hsp] HSPICE - Circuit Simulator, Metasoft.
[MSV93a] Heinz Mühlenbein and Dirk Schlierkamp-Voosen. Predictive models for the breeder genetic algorithm. Evolutionary Computation, 1(1), 1993.
[MSV93b] Heinz Mühlenbein and Dirk Schlierkamp-Voosen. The science of breeding and its application to the breeder genetic algorithm. Evolutionary Computation, 1(4), 1993.
[RH95] R. Rogenmoser and Q. Huang. A 375 MHz 1-µm CMOS 8-Bit Multiplier. In Proceedings of the 1995 Symposium on VLSI Circuits, pages 13-14, June 1995.
[RH96] R. Rogenmoser and Q. Huang. An 800-MHz 1-µm CMOS Pipelined 8-bit Adder using True Single-Phase Clocked Logic-Flip-Flops. IEEE Journal of Solid-State Circuits, 31(3):401-409, March 1996.
[Sah64] C. T. Sah. Characteristics of the Metal-Oxide-Semiconductor Transistors. IEEE Transactions on Electron Devices, ED-11:324-345, July 1964.
[Sch81] H.-P. Schwefel. Numerical Optimization of Computer Models. Wiley, Chichester, 1981.
[Sys89] Gilbert Syswerda. Uniform crossover in genetic algorithms. In J. David Schaffer, editor, Proceedings of the Third International Conference on Genetic Algorithms, pages 2-9, San Mateo, CA, 1989. Morgan Kaufmann Publishers.
[Whi89] Darrell Whitley. The GENITOR algorithm and selection pressure: Why rank-based allocation of reproductive trials is best. In J. David Schaffer, editor, Proceedings of the Third International Conference on Genetic Algorithms, pages 116-121, San Mateo, CA, 1989. Morgan Kaufmann Publishers.
[Wur93] L. T. Wurtz. An Efficient Scaling Procedure for Domino CMOS Logic. IEEE Journal of Solid-State Circuits, 28(9):979-982, September 1993.
This article was processed using the LaTeX macro package with LLNCS style