Voltage Scaling for Dynamic Power Reduction

Voltage Scaling
Reducing the supply voltage is an effective technique for reducing dynamic power, but
it comes with a speed penalty. Keeping all other factors constant, propagation delay
increases as the supply voltage is scaled down. This can be compensated by scaling down
the threshold voltage to the same extent as the supply voltage, which allows the circuit
to deliver the same speed at a lower Vdd. However, smaller threshold voltages lead to
smaller noise margins and increased leakage current.
Dynamic Voltage and Frequency Scaling (DVFS)
The supply voltage can be reduced if the frequency of operation is reduced. Since
dynamic power depends quadratically on the supply voltage and linearly on the frequency,
scaling both together gives an approximately cubic reduction in power consumption.
However, it should be noted that frequency reduction slows the operation.
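The cubic relation can be checked with a minimal sketch, assuming the usual switching-power expression P = activity x C x Vdd^2 x f. The function name and the operating point below are made up for illustration and do not come from the article.

```python
def dynamic_power(c_eff, vdd, freq, activity=1.0):
    """Switching power of a CMOS circuit: P = activity * C * Vdd^2 * f."""
    return activity * c_eff * vdd ** 2 * freq

# Hypothetical operating point (assumption): 1 nF switched capacitance, 1.0 V, 1 GHz.
nominal = dynamic_power(1e-9, 1.0, 1e9)

# DVFS step: halve the frequency and, assuming Vdd can track it, halve Vdd as well.
scaled = dynamic_power(1e-9, 0.5, 0.5e9)

power_ratio = scaled / nominal                      # 0.5^2 * 0.5 = 0.125 (cubic)
energy_ratio = (scaled / 0.5e9) / (nominal / 1e9)   # energy per cycle: 0.25 (quadratic)
```

Note that energy per operation only falls quadratically, because the same work takes twice as long at half the frequency.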
The above relation between energy and voltage does not always hold. The authors
in [1] showed that the quadratic relationship between energy and Vdd breaks down as Vdd
is scaled into the sub-threshold region. In sub-threshold operation the on-current takes
the form of sub-threshold current, which depends exponentially on the supply voltage, so
delay increases exponentially as the voltage is scaled down. Dynamic energy still
reduces quadratically at very low voltages, but leakage energy increases as the supply
voltage is reduced, since leakage energy is proportional to the circuit delay. Hence
dynamic and leakage energy become comparable in the sub-threshold region.
According to Bo Zhai et al. [1], dynamic voltage and frequency scaling is a very popular
low-power technique, but a larger voltage range does not always improve energy
efficiency. They showed that for sub-threshold supply voltages leakage energy becomes
dominant, making "just in time completion" energy inefficient. They also showed that
extending the voltage range below half Vdd improves the energy efficiency for most
processor designs, while extending this range to sub-threshold operation is beneficial
only for specific applications. One important point from their study is that operation
deep in the sub-threshold voltage range is never energy efficient.
References
[1] Bo Zhai, David Blaauw, Dennis Sylvester and Krisztian Flautner, "Theoretical and
Practical Limits of Dynamic Voltage Scaling", DAC, San Diego, California, USA,
pp. 868-873, June 7-11, 2004
Setup Time and Hold Time: Story of Poor Flip-Flop!
It is always interesting to talk about setup and hold! Don’t assume that anybody who
asks questions about setup time and hold time doesn’t know about them. He or she may
know everything about setup time and hold time and still find them confusing at times.
“Setup” and “hold” are terms in the VLSI/ASIC design world that keep raising questions
and are hard to explain in words, at least as far as I am concerned! I remember that
during my MTech days my professor always used to say, "the whole VLSI world depends on
two pillars, setup time and hold time". It would be more realistic to say that he used
to scold us!
06 June 2009
Timing paths
Timing Path
A timing path is defined as the path between a start point and an end point, where the
start point and end point are defined as follows:
Start Point:
All input ports and clock pins of sequential elements are considered valid start points.
End Point:
All output ports and D pins of sequential elements are considered valid end points.
16 December 2008
Transition Delay and Propagation Delay
Transition Delay
Transition delay or slew is defined as the time taken by a signal to rise from 10% (or
20%) to 90% (or 80%) of its maximum value. This is known as “rise time”.
Similarly, “fall time” is the time taken by a signal to fall from 90% (or 80%) to 10%
(or 20%) of its maximum value.
In short, transition is the time it takes for a pin to change state.
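To see how the choice of thresholds changes the measured value, here is a minimal sketch that assumes the signal is a simple single-pole RC step response, v(t) = 1 - exp(-t/tau). That waveform, the function names and the 100 ps time constant are assumptions for illustration only; real slews come from library characterization.

```python
import math

def crossing_time(tau, fraction):
    """Time at which v(t) = 1 - exp(-t / tau) reaches `fraction` of its final value."""
    return -tau * math.log(1.0 - fraction)

def rise_time(tau, low=0.1, high=0.9):
    """Rise time measured between the `low` and `high` thresholds."""
    return crossing_time(tau, high) - crossing_time(tau, low)

tau = 100e-12                        # hypothetical 100 ps time constant
t_10_90 = rise_time(tau)             # equals tau * ln(9)
t_20_80 = rise_time(tau, 0.2, 0.8)   # equals tau * ln(4)
```

For this waveform the 10%-90% rise time is about 2.2 tau and the 20%-80% rise time about 1.4 tau, so the thresholds used by a library always need to be known when comparing slew numbers.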
Setting Transition Time Constraints

The theoretical definitions above have to be applied to practical designs. The
transition time of a net is the time required for its driving pin to change logic value
(from 10% (or 20%) to 90% (or 80%) of its maximum value). The transition times used for
delay calculation are based on the timing library (.lib files).
Transition-related constraints can be provided in Design Compiler (the logic synthesis
tool from Synopsys) using the following:
1. max_transition: This attribute is applied to each output of a cell. During
optimization, Design Compiler tries to make the transition time of each net less than
the value of the max_transition attribute.
2. set_max_transition: This command is used to change the maximum transition time
restriction specified in a technology library.
“This command sets a maximum transition time for the nets attached to the identified
ports or to all the nets in a design by setting the max_transition attribute on the named
objects.
For example, to set a maximum transition time of 3.2 on all nets in the design adder, enter
the following command:
set_max_transition 3.2 [get_designs adder]
To undo a set_max_transition command, use the remove_attribute command. For
example, enter the following command:
remove_attribute [get_designs adder] max_transition”
(Directly quoted from Design Compiler user manual)
Setting Capacitance Constraints

The transition time constraints specified above do not provide a direct way to control
the actual capacitance of nets. To control capacitance directly, use the following
command:
set_max_capacitance: This command sets the maximum capacitance constraint on input
ports or designs.
set_max_capacitance can be used in addition to set_max_transition, since the two
constraints work independently.
The command applies a maximum capacitance limit to an output pin or port of the design.
It can also be used to apply a capacitance limit on any net.
Eg:
set_max_capacitance 4 [get_designs decoder]
To remove the constraint set by set_max_capacitance, use the remove_attribute command:
remove_attribute [get_designs decoder] max_capacitance
Propagation Delay
Propagation delay is the time required for a signal to propagate through a gate or a
net. For a cell it is called “Gate Delay” or “Cell Delay”; for a net it is called
“Net Delay”.
The propagation delay of a gate or cell is the time it takes for a signal at the input
pin to affect the signal at the output pin.
For any gate, propagation delay is measured from 50% of the input transition to the
corresponding 50% of the output transition.
There are 4 possibilities:
• Propagation delay from 50% of input rising to 50% of output rising.
• Propagation delay from 50% of input rising to 50% of output falling.
• Propagation delay from 50% of input falling to 50% of output rising.
• Propagation delay from 50% of input falling to 50% of output falling.
Each of these delays has a different value. The maximum and minimum of this set are very
important, since they are the values considered for timing analysis.
For a net, propagation delay is the delay between the time a signal is first applied to
the net and the time it reaches the other devices connected to that net.
Propagation delay is often taken as the average of the high-to-low and low-to-high
propagation delays, i.e. Tpd = (Tphl + Tplh)/2.
Propagation delay depends on the input transition time (slew) and the output load.
Hence two-dimensional look-up tables are used to calculate these delays. For a detailed
explanation of how the propagation delay of a gate and of a net is calculated, refer to
the articles below:
How gate delay is calculated?
How net delay is calculated?
Contamination Delay:
Best-case delay from a valid input to a valid output, i.e. the minimum propagation delay.
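The relationship between the four measured delays, the average Tpd and the contamination delay can be sketched as follows. The delay numbers are hypothetical, and the comments on setup and hold checks describe the usual convention rather than anything stated in the article.

```python
# Hypothetical 50%-to-50% delays (in ns) for the four input/output combinations.
arc_delays_ns = {
    ("rise", "rise"): 0.21,
    ("rise", "fall"): 0.18,
    ("fall", "rise"): 0.24,
    ("fall", "fall"): 0.19,
}

# Worst case: the value normally used for maximum-delay (setup) analysis.
max_delay = max(arc_delays_ns.values())

# Best case: the contamination delay, normally used for minimum-delay (hold) analysis.
contamination_delay = min(arc_delays_ns.values())

def average_tpd(t_phl, t_plh):
    """Tpd = (Tphl + Tplh) / 2."""
    return (t_phl + t_plh) / 2.0

# For an inverting gate: input rising gives output falling (Tphl) and vice versa (Tplh).
tpd = average_tpd(arc_delays_ns[("rise", "fall")], arc_delays_ns[("fall", "rise")])
```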

14 October 2008
Net Delay or Interconnect Delay or Wire Delay or Extrinsic Delay or Flight Time
Net delay is the difference between the time a signal is first applied to the net and the
time it reaches the other devices connected to that net.
It is due to the finite resistance and capacitance of the net. It is also known as wire delay.
Wire delay = function of (Rnet, Cnet+Cpin)
It is measured from the output pin of the driving cell to the input pin of the next cell.
Net delay is calculated using the resistances (Rs) and capacitances (Cs) of the net.
Several factors affect net parasitics:
• Net length
• Net cross-sectional area
• Resistivity of the material used for the metal layers (aluminum vs. copper)
• Number of vias traversed by the net
• Proximity to other nets (crosstalk)
The post-layout design is annotated with RCs extracted from the layout for better
accuracy. Annotated RCs override the information from the wire load model (WLM).
Interconnect introduces capacitive, resistive and inductive parasitics. All three have
multiple effects on circuit behavior:
1. Interconnect parasitics increase propagation delay (i.e. they reduce the operating
speed).
2. Interconnect parasitics increase energy dissipation and affect the power
distribution.
3. Interconnect parasitics introduce extra noise sources, which affect the reliability
of the circuit (signal integrity effects).
Dominant parameters determine the circuit behavior at a given circuit node. Non-dominant
parameters can be neglected for interconnect analysis.
• Inductive effects can be ignored if the resistance of the wire is substantial (this is
the case for long aluminum wires with a small cross section) or if the rise and fall
times of the applied signals are slow.
• When the wires are short, the cross section of the wire is large, or the interconnect
material has a low resistivity, a capacitance-only model can be used.
• When the separation between neighboring wires is large, or when the wires only run
together for a short distance, inter-wire capacitance can be ignored and all the
parasitic capacitance can be modeled as capacitance to ground.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~
Capacitance
Capacitance can be modeled by the parallel-plate capacitor model:
C = (ε / t) × W × L
where
ε --> permittivity of the dielectric material (SiO2)
t --> thickness of the dielectric material (SiO2)
W --> width of the wire
L --> length of the wire
ε = εr × εo, where εr --> relative permittivity of SiO2
εo --> 8.854 x 10^-12 F/m; permittivity of free space
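As a quick numeric check of the parallel-plate formula, here is a minimal sketch. The relative permittivity of 3.9 is the commonly quoted value for SiO2 and the wire dimensions are hypothetical; neither is given in the article.

```python
EPS_0 = 8.854e-12    # permittivity of free space, F/m
EPS_R_SIO2 = 3.9     # relative permittivity of SiO2 (assumed typical value)

def parallel_plate_capacitance(width, length, thickness, eps_r=EPS_R_SIO2):
    """C = (eps / t) * W * L with all dimensions in metres; result in farads."""
    return eps_r * EPS_0 * width * length / thickness

# Hypothetical wire: 1 um wide, 1 mm long, over 1 um of oxide.
c_wire = parallel_plate_capacitance(1e-6, 1e-3, 1e-6)   # about 34.5 fF
```

This covers only the plate-to-substrate term, so it underestimates the real capacitance once fringing and inter-wire components matter.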
As the technology node shrinks (scaling), it is desirable to keep the cross section of
the wire (W x H) as large as possible to minimize wire resistance, but this increases
area. Small values of W lead to denser wiring and less area overhead. In advanced
processes the W/H ratio has dropped below unity. Under such circumstances the
parallel-plate capacitance model becomes inaccurate: the capacitance between the
sidewalls of the wires and the substrate, called fringing capacitance, can no longer be
ignored and contributes to the overall capacitance.
Inter-wire capacitance becomes a dominant factor in multilayer interconnect structures.
These floating capacitors (not connected to substrate or ground) form a source of noise
(crosstalk). This effect is more pronounced for wires in the higher interconnect layers,
as these are farther away from the substrate.
Generally, higher metal layers (i.e. interconnects) have greater thickness (i.e. height)
and higher dielectric layers have higher permittivity. Hence these wires display the
highest inter-wire capacitance. They should therefore be used for global signals that
are not sensitive to interference (e.g. supply rails). Alternatively, it is advisable to
separate wires by more than the minimum spacing.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~
Resistance
Resistance R = (ρ × L) / (H × W) = (ρ × L) / Area
L --> length
W --> width
H --> height (thickness)
ρ --> resistivity (ohm-m)
Since H (height, thickness) is constant for a given technology, we can write
R = Rs × (L/W), where Rs = ρ/H (ohm/square) is called “sheet resistance”.
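The sheet-resistance form is convenient because a wire can be treated as L/W "squares" in series. A minimal sketch with hypothetical numbers (the aluminum resistivity of about 2.7e-8 ohm-m is a commonly quoted value; the wire dimensions are made up):

```python
def sheet_resistance(resistivity, height):
    """Rs = rho / H, in ohm/square (rho in ohm-m, H in m)."""
    return resistivity / height

def wire_resistance(r_sheet, length, width):
    """R = Rs * (L / W); L and W must be in the same unit."""
    return r_sheet * length / width

# Hypothetical aluminum wire: 0.5 um thick, 1 um wide, 1 mm long.
rs = sheet_resistance(2.7e-8, 0.5e-6)       # 0.054 ohm/square
r_wire = wire_resistance(rs, 1e-3, 1e-6)    # 1000 squares, about 54 ohm
```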
At very high frequencies the “skin effect” comes into play, so that the resistance
becomes frequency dependent. High-frequency currents tend to flow primarily on the
surface of a conductor, with the current density falling off exponentially with depth
into the conductor.
Skin effect is only an issue for wider wires. Since clock lines tend to carry the
highest-frequency signals on a chip and are also fairly wide to limit resistance, the
skin effect is likely to have its first impact on these lines.
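How wide is "wide" can be estimated from the skin depth. The sketch below uses the standard expression delta = sqrt(rho / (pi x f x mu)), which is not given in the article, with the commonly quoted resistivity of aluminum; treat it as a rough estimate under those assumptions.

```python
import math

MU_0 = 4 * math.pi * 1e-7   # permeability of free space, H/m

def skin_depth(resistivity, freq, mu=MU_0):
    """Depth at which the current density falls to 1/e of its surface value (metres)."""
    return math.sqrt(resistivity / (math.pi * freq * mu))

delta_1ghz = skin_depth(2.7e-8, 1e9)   # aluminum at 1 GHz, roughly 2.6 um
delta_4ghz = skin_depth(2.7e-8, 4e9)   # four times the frequency halves the depth
```

A wire whose width or height is well below a couple of skin depths sees little change in resistance, which is consistent with the effect showing up first on wide clock lines.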
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~
Inductance
With the adoption of low-resistance interconnect materials and the increase of switching
frequencies to the GHz range, inductance starts to play an important role. Consequences
of on-chip inductance include ringing and overshoot effects, reflection of signals due
to impedance mismatch, inductive coupling between lines, and switching noise due to
L(di/dt) voltage drops.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~
Lumped Capacitor Model

As long as the resistive component of the wire is small, and switching frequencies are
in the low to medium range, it is meaningful to consider only the capacitive
component of the wire, and to lump the distributed capacitance into a single
capacitance.
The only impact on performance is introduced by the loading effect of the capacitor on
the driving gate.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~
Lumped RC Model
If the wire is longer than a few millimeters, the lumped capacitance model is
inadequate and a resistive-capacitive model has to be adopted.
In the lumped RC model the total resistance of each wire segment is lumped into a single
resistance R, and the global capacitance is similarly combined into a single capacitor C.
Analysis of a network with a large number of Rs and Cs becomes complex, because the
network contains many time constants (zeroes and poles). The Elmore delay model
overcomes this problem.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~
Elmore Delay Model

Properties of the network:
• It has a single input node.
• All the capacitors are between a node and ground.
• The network does not contain any resistive loops.
“Path resistance” is the resistance from the source node to any other node.
“Shared path resistance” is the resistance shared among the paths from the source node
to any two other nodes.
For an RC chain, each node capacitance contributes its value multiplied by the path
resistance from the source to that node:
Contribution of node 1: R1C1
Contribution of node 2: (R1+R2)C2
Contribution of node 3: (R1+R2+R3)C3
In general, for a chain that ends at node i:
τdi = R1C1 + (R1+R2)C2 + …… + (R1+R2+R3+…..+Ri)Ci
If
R1 = R2 = R3 = …. = R
C1 = C2 = C3 = ….. = C then, for a chain of n nodes,
τdn = RC + 2RC + …… + nRC = RC × n(n+1)/2
Thus the Elmore delay is equivalent to the first-order time constant of the network.
Assume an interconnect wire of length L is partitioned into N identical segments, each
of length L/N, and let R and C now denote the resistance and capacitance per unit length.
Then,
τd = (L/N)R × (L/N)C + 2 × (L/N)R × (L/N)C + …… + N × (L/N)R × (L/N)C
= (L/N)^2 × (RC + 2RC + …… + NRC)
= (L/N)^2 × RC × N(N+1)/2
= RC × L^2 × (N+1)/(2N)
For large N,
τd = RC × L^2 / 2
=> The delay of a wire is a quadratic function of its length.
=> Doubling the length of the wire quadruples its delay.
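The chain formula and the quadratic dependence on length can be checked numerically. This is a minimal sketch: the function names and the per-unit-length values are made up for illustration, and only the simple RC chain (no branches) is handled.

```python
def elmore_chain_delay(resistances, capacitances):
    """Elmore delay at the last node of an RC chain driven from a single source."""
    delay = 0.0
    path_resistance = 0.0
    for r, c in zip(resistances, capacitances):
        path_resistance += r            # R1 + R2 + ... + Ri
        delay += path_resistance * c    # adds (R1 + ... + Ri) * Ci
    return delay

def wire_delay(r_per_len, c_per_len, length, segments):
    """Elmore delay of a uniform wire split into `segments` identical RC sections."""
    seg = length / segments
    return elmore_chain_delay([r_per_len * seg] * segments,
                              [c_per_len * seg] * segments)

r, c = 0.075, 0.1                       # hypothetical values per unit length
d1 = wire_delay(r, c, 1000.0, 1000)     # wire of length L
d2 = wire_delay(r, c, 2000.0, 1000)     # wire of length 2L
ratio = d2 / d1                         # close to 4: delay grows with L squared
```

With 1000 segments d1 is within about 0.1% of r x c x L^2 / 2, which matches the (N+1)/(2N) factor in the derivation.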
Advantages
• It is simple
• It is always situated between minimum and maximum bounds
Disadvantages
• It is pessimistic and inaccurate for long interconnect wires.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~
Distributed RC model
The lumped RC model is always pessimistic; the distributed RC model provides better
accuracy.
However, the distributed RC model is complex and no closed-form solution exists, so the
distributed RC line model is not suitable for computer-aided design tools.
The behavior of the distributed RC line can be approximated by a lumped RC ladder
network evaluated with, for example, the Elmore delay model; hence such approximations
are used extensively in EDA tools.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~
Transmission Line Model
When the frequency of operation increases to the point where the rise (or fall) time of
the signal becomes comparable to the time of flight of the net, inductive effects start
to dominate over RC effects.
This inductive effect is modeled by transmission line models. The model assumes that
the signal is a "wave" that propagates over the medium, the "net".
There are two types of transmission line models:
Lossless transmission line model: This is good for printed circuit board level design.
Lossy transmission line model: This model is used for IC interconnect.
Transmission line effects should be considered when the rise or fall time of the input
signal is smaller than the time of flight of the transmission line, or when the
resistance of the wire is less than the characteristic impedance.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~
Wire Load Models
Extraction data from already routed designs is used to build a lookup table known as the
wire load model (WLM). A WLM is based on statistical estimates of R and C as a function
of “net fan-out”.
For fanouts greater than those specified in a wire load table, a “slope factor” is
specified for linear extrapolation.
wire_load ("5KGATES") {
resistance : 0.000271; /* R per unit length */
capacitance : 0.00017; /* C per unit length */
slope : 29.4005; /* used for linear extrapolation */
fanout_length (1, 18.38); /* (fanout = 1, length = 18.38) */
fanout_length (2, 47.78);
/* remaining fanout_length entries are not shown */
}
Eg:
Fanout = 7
Net length = 135.98 + 2 x 29.4005 (slope) = 194.78 ----------> length of net with
fanout of 7 (135.98 being the length at the largest fanout in the table, which the
arithmetic implies is 5)
Resistance = 194.78 x 0.000271 = 0.05279 units
Capacitance = 194.78 x 0.00017 = 0.03311 units
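The fanout-of-7 example can be reproduced with a short sketch. The table entry (5, 135.98) is an assumption inferred from the arithmetic of the example (7 is two beyond the largest tabulated fanout); the entries for fanout 3 and 4 are not given in the article and are left out. The function name is made up for illustration.

```python
def wlm_net_length(fanout, fanout_length, slope):
    """Estimated net length for a fanout, extrapolating linearly beyond the table."""
    max_fanout = max(fanout_length)
    if fanout in fanout_length:
        return fanout_length[fanout]
    if fanout > max_fanout:
        return fanout_length[max_fanout] + (fanout - max_fanout) * slope
    raise KeyError(f"no fanout_length entry for fanout {fanout}")

# Values from the "5KGATES" example; the (5, 135.98) entry is inferred, not quoted.
FANOUT_LENGTH = {1: 18.38, 2: 47.78, 5: 135.98}
SLOPE = 29.4005
R_PER_LEN = 0.000271
C_PER_LEN = 0.00017

length = wlm_net_length(7, FANOUT_LENGTH, SLOPE)   # 135.98 + 2 * 29.4005
resistance = length * R_PER_LEN
capacitance = length * C_PER_LEN
```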
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~
Wire load models for synthesis
Wire load modeling allows us to estimate the effect of wire length and fanout on the
resistance, capacitance, and area of nets. The synthesizer uses these physical values to
calculate wire delays and circuit speeds. Semiconductor vendors develop wire load
models based on statistical information specific to the vendor’s process. The models
include coefficients for area, capacitance, and resistance per unit length, and a
fanout-to-length table for estimating net lengths (the number of fanouts determines a
nominal length).
Selection of the wire load model in the initial stage (before physical design) depends
on the following factors:
1. User specification
2. Automatic selection based on design area
3. Default specification in the technology library
Once the final routing step is over in the physical design stage, wire load models are
generated based on the actual routing in the design, and synthesis is redone using those
wire load models.
In hierarchical designs, we have to determine which wire load model to use for nets that
cross hierarchical boundaries. There are three modes for doing so:
Top:
The same wire load model is applied to all nets, as if the design had no hierarchy: the
wire load model specified for the top level of the design hierarchy is used for all nets
in the design and its sub-designs.
Enclosed:
The wire load model of the smallest design that fully encloses the net is applied. If the
design enclosing the net has no wire load model, the tool traverses the design hierarchy
upward until it finds a wire load model. Enclosed mode is more accurate than top mode
when cells in the same design are placed in a contiguous region during layout.
Use enclosed mode if the design has similar logical and physical hierarchies.
Segmented:
The wire load model for each segment of a net is determined by the design encompassing
the segment. Nets crossing hierarchical boundaries are divided into segments, and for
each net segment the wire load model of the design containing the segment is used. If
the design containing a segment has no wire load model, the tool traverses the design
hierarchy upward until it finds one.
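The three modes differ only in which design's model is looked up, which a small sketch makes concrete. The hierarchy, the design names and the model names below are all hypothetical, and this is only an illustration of the selection rule, not of how a synthesis tool stores it.

```python
# Hypothetical hierarchy: design -> parent design (None marks the top level).
PARENT = {"top": None, "core": "top", "alu": "core", "io": "top"}
# Wire load model attached to each design, if any (made-up names).
WLM = {"top": "20KGATES", "core": "10KGATES"}

def resolve_wlm(design):
    """Walk up the hierarchy from `design` until a wire load model is found."""
    while design is not None:
        if design in WLM:
            return WLM[design]
        design = PARENT[design]
    raise LookupError("no wire load model found up to the top level")

def wlm_for_net(mode, enclosing_design, segment_designs):
    """Return the wire load model(s) used for a net under the given mode."""
    if mode == "top":
        return [WLM["top"]]
    if mode == "enclosed":
        return [resolve_wlm(enclosing_design)]
    if mode == "segmented":
        return [resolve_wlm(d) for d in segment_designs]
    raise ValueError(f"unknown mode: {mode}")

# A net running from inside "alu" to "io": smallest enclosing design is "top",
# and its segments lie in alu, core, top and io.
segments = ["alu", "core", "top", "io"]
enclosed = wlm_for_net("enclosed", "top", segments)
segmented = wlm_for_net("segmented", "top", segments)
```

Here "alu" and "io" have no model of their own, so their segments fall back to "core" and "top" respectively.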
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~
Interconnect Delay vs. Deep Sub Micron Issues
The performance of deep sub-micron ICs is limited by the increasing interconnect loading
effect. Long global clock networks account for a large part of the power consumption
in chips. Traditional CAD design methodologies are strongly affected by interconnect
scaling. The capacitance and resistance of interconnects have increased due to smaller
wire cross sections, smaller wire pitch and longer wire lengths, which has resulted in
increased RC delay. As technology advances and interconnect scaling continues, the
increased RC delay is becoming a major bottleneck in improving the performance of
advanced ICs.
The accompanying graph shows gate delay and interconnect delay as functions of
technology node, from 180 nm to 60 nm. The interconnect delay shown assumes a line with
optimally placed repeaters and includes the delay due to the repeaters. The graph shows
that as technology shrinks, gate delay reduces but interconnect delay increases.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~
Limits of Cu/low-k interconnects

At the 250 nm node, copper with low-k dielectric was introduced to reduce the effects of
increasing interconnect delay. Below the 130 nm technology node, however, interconnect
delays keep increasing despite the introduction of low-k dielectrics. As scaling
continues, new physical and technological effects such as resistivity and barrier
thickness start dominating, and interconnect delay increases. Introducing repeaters to
shorten the interconnect length increases total area, and the vias connecting repeaters
to global layers can cause blockage in lower metal layers. Thus, as technology advances,
material limitations will become the dominant factor in interconnect delay. Increasing
the metal width would require more metallization layers, which cannot be a solution
because it increases complexity and cost and raises reliability concerns.
Cu interconnects in low-k dielectric are formed by a special process known as the
Damascene process. The adhesion of Cu to dielectric materials is very poor, and under
electric bias Cu atoms easily drift and cause shorts between metal layers. To avoid this
problem a barrier layer is deposited between the dielectric and the Cu trench. Even
though it decreases the effective cross section of the interconnect compared to the
drawn dimensions, it improves reliability. The barrier thickness becomes significant at
deep sub-micron dimensions, and the effective resistance of the interconnect rises
further. In addition, increased electron scattering and self-heating caused by the
electron flow in the interconnect, together with the comparable rise in internal chip
temperature, also contribute to the increase in interconnect resistance.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~
References
[1] Jan M. Rabaey, Anantha Chandrakasan and Borivoje Nikolic, "Digital Integrated
Circuits - A Design Perspective", Prentice Hall, Second Edition
[2] Design Compiler User Manual
01 September 2008
Delays in ASIC Design
We encounter several types of delays in ASIC design. They are as follows:
• Gate delay or Intrinsic delay
• Net delay or Interconnect delay or Wire delay or Extrinsic delay or Flight time
• Transition or Slew
• Propagation delay
• Contamination delay
Wire delays or extrinsic delays are calculated using output drive strength, input
capacitance and wire load models. The other delays are intrinsic properties of each
gate.
Delays depend on several interrelated electrical properties [Nekoogar]:
• The input capacitance of a logic gate is a function of output state, output loads and
input slew rate.
• Internal timing arcs and output slew rate are functions of the switching input(s).
• The capacitance of a wire depends on frequency.
• Internal timing arcs are a function of input slew rates.
• Output slew rate is a function of the input slew rate on each input.
• Wires exhibit RLC characteristics instead of lumped RC.

Gate Delay
Transistors within a gate take a finite time to switch. This means that a change on the
input of a gate takes a finite time to cause a change on the output. [Magma]
Gate delay = function of (input transition (slew) time, Cnet+Cpin)
or
Gate delay = function of (input transition (slew) time, Cload)
where Cload = Cnet + Cpin
Cnet --> net capacitance
Cpin --> pin capacitance of the driven cell
Cell delay is the same as gate delay.
How gate delay is calculated?
Cell or gate delay is calculated using Non-Linear Delay Models (NLDM). NLDM is
highly accurate as it is derived from SPICE characterization. The delay is a function of
the input transition time (i.e. slew) of the cell, the wire capacitance and the pin
capacitance of the driven cells. A slow input transition slows the rate at which the
cell’s transistors can change state from logic 1 to logic 0 (or logic 0 to logic 1), and
a large output load Cload (Cnet + Cpin) has the same effect, thereby increasing the
delay of the logic gate.
There is another NLDM table in the library to calculate the output transition. The
output transition of a cell becomes the input transition of the next cell down the chain.
• Table models are usually two-dimensional to allow lookups based on the input
slew and the output load (Cload). A sample table is given below.
timing() {
related_pin : "CKN";
timing_type : falling_edge;
timing_sense : non_unate;
cell_rise(delay_template_7x7) {
index_1 ("0.012, 0.032, 0.074, 0.154, 0.318, 0.644, 1.3");
index_2 ("0.001278, 0.0046008, 0.0112464, 0.0245376, 0.05112, 0.10454, 0.212148");
values ( \
"0.225894, 0.249015, 0.285537, 0.352680, 0.484244, 0.748180, 1.279570", \
"0.231295, 0.254415, 0.290938, 0.358081, 0.489646, 0.753585, 1.284980", \
"0.243754, 0.266878, 0.303398, 0.370542, 0.502105, 0.766044, 1.297440", \
"0.267240, 0.290389, 0.326908, 0.394052, 0.525615, 0.789561, 1.320950", \
"0.307080, 0.330200, 0.366721, 0.433861, 0.565425, 0.829373, 1.360760", \
"0.380552, 0.403875, 0.440426, 0.507569, 0.639136, 0.903084, 1.434500", \
"0.497588, 0.521769, 0.558548, 0.625744, 0.757301, 1.021260, 1.552680");
}
rise_transition(delay_template_7x7) {
index_1 ("0.012, 0.032, 0.074, 0.154, 0.318, 0.644, 1.3");
index_2 ("0.001278, 0.0046008, 0.0112464, 0.0245376, 0.05112, 0.10454, 0.212148");

values ( \
"0.040574, 0.068619, 0.125391, 0.246672, 0.497688, 1.005982, 2.030120", \
"0.040570, 0.068618, 0.125390, 0.246672, 0.497688, 1.005940, 2.030240", \
"0.040565, 0.068616, 0.125389, 0.246650, 0.497770, 1.006180, 2.030120", \
"0.040532, 0.068612, 0.125387, 0.246670, 0.497710, 1.006164, 2.030100", \
"0.040578, 0.068621, 0.125392, 0.246636, 0.497688, 1.006182, 2.030040", \
"0.041763, 0.069211, 0.125662, 0.246758, 0.497726, 1.005930, 2.030000", \
"0.045813, 0.071321, 0.126671, 0.247154, 0.497846, 1.005962, 2.030180");
}
}
index_1 --> input transition values
index_2 --> output load capacitance values
values --> delay values (in cell_rise) or output transition values (in rise_transition)
Situation 1:
Input transition and output load values match the table index values
If both the input transition and output load values match table index values, the
corresponding delay is picked directly from the delay “values” table.
Situation 2:
Output load value doesn't match the table index values
• When the actual load capacitance does not fall exactly on one of the load-axis index
points, the delay is determined by interpolation between the closest points. Note that
for this one-dimensional interpolation the input transition must match one of the
table index values.
• Determine the equation of the line segment connecting the two nearest points in
the table. To do this, first find the slope:
Slope m = (y2-y1)/(x2-x1), where (y2-y1) is the delay segment (generally in ns) on the
y-axis and (x2-x1) is the load segment (generally in pF) on the x-axis.
• Solve for the delay at the load point of interest.
The linear equation is:
y = mx+c
where
y --> delay (ns)
m --> slope
x --> load capacitance (pF)
c --> constant fixed by one of the two table points: c = y1 - m*x1
i.e. delay = y1 + slope * (load point of interest - x1)
The load point of interest is the load capacitance value for which the delay has to be
calculated.
Situation 3:
Both input transition and output load values don't match the table index values
• If neither the input transition nor the load capacitance matches the look-up table
index values exactly, bilinear interpolation is used.
• Multiple linear interpolations (~3) are performed on the closest table data
points (~4).
Situation 4:
Output load value doesn't match the table index values and is outside the table
boundary
• When the load point is outside the boundary of the index, the delay is
extrapolated from the closest known points.
• A lookup value too far outside the range of the table can lead to
inaccuracy. [Cadence]
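The four lookup situations above can be sketched in a few lines of Python. This is a minimal sketch, not how any particular STA tool implements it: it uses only the first 3x3 corner of the cell_rise table shown earlier, the function names are made up for illustration, and outside the table it simply extends the nearest end segment linearly.

```python
from bisect import bisect_right

# First 3x3 corner of the cell_rise table: rows follow index_1, columns index_2.
SLEW = [0.012, 0.032, 0.074]                # index_1: input transition
LOAD = [0.001278, 0.0046008, 0.0112464]     # index_2: output load capacitance
DELAY = [
    [0.225894, 0.249015, 0.285537],
    [0.231295, 0.254415, 0.290938],
    [0.243754, 0.266878, 0.303398],
]

def _bracket(axis, x):
    """Indices of the axis segment used for x; end segments are reused outside the range."""
    i = bisect_right(axis, x) - 1
    i = max(0, min(i, len(axis) - 2))
    return i, i + 1

def _lerp(x, x1, y1, x2, y2):
    """Linear interpolation: y = y1 + m * (x - x1) with m = (y2 - y1) / (x2 - x1)."""
    return y1 + (y2 - y1) * (x - x1) / (x2 - x1)

def nldm_lookup(slew, load):
    """Delay from the table using three linear interpolations on four points."""
    i1, i2 = _bracket(SLEW, slew)
    j1, j2 = _bracket(LOAD, load)
    # Interpolate along the load axis on the two bracketing slew rows...
    d_lo = _lerp(load, LOAD[j1], DELAY[i1][j1], LOAD[j2], DELAY[i1][j2])
    d_hi = _lerp(load, LOAD[j1], DELAY[i2][j1], LOAD[j2], DELAY[i2][j2])
    # ...then once along the slew axis.
    return _lerp(slew, SLEW[i1], d_lo, SLEW[i2], d_hi)

exact = nldm_lookup(0.012, 0.001278)         # Situation 1: both values on the index
load_only = nldm_lookup(0.012, 0.0029394)    # Situation 2: load between two points
both = nldm_lookup(0.022, 0.0029394)         # Situation 3: bilinear interpolation
outside = nldm_lookup(0.012, 0.02)           # Situation 4: load beyond the last index
```

When a value sits exactly on an index point the corresponding interpolation weight becomes zero, so the same routine covers Situations 1 and 2 without special cases.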
Intrinsic delay
• Intrinsic delay is the delay internal to the gate, from the input pin of the cell to
the output pin of the cell.
• It is defined as the delay between an input and output pair of a cell when a near-zero
slew is applied to the input pin and the output does not see any load. It is caused by
the internal capacitance associated with the cell's transistors.
• This delay depends largely on the size of the transistors forming the gate, because
increasing the transistor size increases the internal capacitance.
References
[Nekoogar] Farzad Nekoogar, “Timing Verification of Application Specific Integrated
Circuits”, Prentice Hall
[Magma] Magma Blast Fusion User Guides
[Cadence] Cadence SOC Encounter User Guides
12 August 2008
Dynamic vs Static Timing Analysis
Timing analysis is an integral part of the ASIC/VLSI design flow. Anything else can be
compromised, but not timing! Timing analysis can be static or dynamic. Dynamic timing
analysis verifies the functionality of the design by applying input vectors and checking
for correct output vectors, whereas Static Timing Analysis checks the static delay
requirements of the circuit without any input or output vectors.
Dynamic timing analysis has to be completed and the functionality of the design
must be verified before the design is subjected to Static Timing Analysis (STA).
Dynamic Timing Analysis (DTA) and Static Timing Analysis (STA) are not
alternatives to each other. The quality of DTA increases with the number of input
test vectors, but more test vectors also increase simulation time. DTA can be used
for synchronous as well as asynchronous designs. STA can’t be run on asynchronous
designs, and hence DTA is the best way to analyze asynchronous designs. DTA is
also best suited for designs with clocks crossing multiple domains.
Examples of DTA tools are ModelSim (from Mentor Graphics) and VCS (from
Synopsys). DTA is also carried out on the post-layout netlist to verify that the
functionality of the design has not changed. The test vectors remain the same
for both.
SPICE Simulation
Device-level timing analysis is carried out using SPICE simulation. SPICE
simulation is essential for full-custom designs to verify the electrical
properties of the design. These are calculated from mathematical equations
that represent the electrical properties of devices. Material properties and
some of the electrical properties of the devices, represented by either variables
or constants, are stored in model files. Examples are the threshold voltage of a
MOSFET, electron density, etc. SPICE-characterized data is tabulated in
technology libraries and becomes the basic delay information for Static Timing
Analysis. For example, consider an AND gate: several electrical properties
such as input and output transition, propagation delay, output capacitance, etc.
are evaluated by SPICE simulation. SPICE simulation gives the highest accuracy
of any form of simulation. However, SPICE code is written and simulated
manually, so for a larger design SPICE simulation is a cumbersome job. There
are specific tools available for transistor-level Static Timing Analysis (e.g.
PathMill from Synopsys), with SPICE simulation being the backbone of all
these tools.
What is Static Timing Analysis (STA)?
In Static Timing Analysis (STA), static delays such as gate delays and net delays
are considered in each path, and these delays are compared against their
required maximum and minimum values. The circuit to be analyzed is broken into
different timing paths consisting of gates, flip flops and their interconnections.
Each timing path has to process the data within a clock period, which is
determined by the maximum frequency of operation. Cell delays are available in
the corresponding technology libraries. Cell delay values are tabulated against
input transition and fanout load, and are characterized by SPICE simulation.
Net delays are calculated based on Wire Load Models (WLM) or extracted
resistance R and capacitance C. Wire Load Models (WLM) are available in the
Technology File. These values are Table Look Up (TLU) values calculated based
on the net fanout and length.
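The table lookup described above can be sketched as follows. This is a minimal illustration, assuming a two-dimensional delay table indexed by input transition and output load, with bilinear interpolation between the characterized points. The function names, axis values and delay values are hypothetical and are not taken from any real library.

```python
from bisect import bisect_right


def bracket(axis, x):
    """Return (i, t): the index of the lower bracketing point and the
    fractional position of x between axis[i] and axis[i + 1].
    x is clamped to the characterized range of the axis."""
    x = min(max(x, axis[0]), axis[-1])
    i = min(bisect_right(axis, x) - 1, len(axis) - 2)
    t = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, t


def cell_delay(table, transitions, loads, transition, load):
    """Bilinear interpolation in table[transition_index][load_index]."""
    i, ti = bracket(transitions, transition)
    j, tj = bracket(loads, load)
    low = table[i][j] + (table[i][j + 1] - table[i][j]) * tj
    high = table[i + 1][j] + (table[i + 1][j + 1] - table[i + 1][j]) * tj
    return low + (high - low) * ti


# Hypothetical characterization data: transitions in ns, loads in pF, delays in ns.
TRANSITIONS = [0.1, 0.5]
LOADS = [0.01, 0.05]
DELAYS = [[0.10, 0.20],
          [0.14, 0.26]]

if __name__ == "__main__":
    # Delay for a point halfway between the characterized transitions and loads.
    print(round(cell_delay(DELAYS, TRANSITIONS, LOADS, 0.3, 0.03), 4))
```

Real libraries use larger tables and may extrapolate rather than clamp outside the characterized range; clamping is chosen here only to keep the sketch short.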
The static timing analyzer reports the following delays (or can perform the
following analyses):
Register to Register delays
Setup times of all external synchronous inputs
Clock to Output delays
Pin to Pin combinational delays
Different Analysis Modes-Best, Worst, Typical, On Chip Variation (OCV)
Data to Data Checks
Case Analysis
Multiple Clocks per Register
Minimum Pulse Width Checks
Derived Clocks
Clock Gating Checks
Netlist Editing
Report_clock_timing
Clock Reconvergence Pessimism
Worst-Arrival Slew Propagation
Path-Based Analysis
Debugging Delay Calculation
and many more......!!
The widespread use of STA can be attributed to several factors [David]:
• The basic STA algorithm is linear in runtime with circuit size, allowing
analysis of designs in excess of 10 million instances.
• The basic STA analysis is conservative in the sense that it will over-estimate
the delay of long paths in the circuit and under-estimate the delay
of short paths in the circuit. This makes the analysis “safe”, guaranteeing
that the design will function at least as fast as predicted and will not suffer
from hold-time violations.
• The STA algorithms have become fairly mature, addressing critical timing
issues such as interconnect analysis, accurate delay modeling, false or
multi-cycle paths, etc.
• Delay characterization for cell libraries is clearly defined, forms an
effective interface between the foundry and the design team, and is readily
available. In addition to this, Static Timing Analysis (STA) does not
require input vectors and has a runtime that is linear with the size of the
circuit [Agarwal].
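The linear runtime mentioned above comes from visiting every timing arc exactly once in topological order. The sketch below illustrates this for worst-case (latest) arrival times on a small combinational graph. It is a simplified illustration, not the algorithm of any particular tool; the node names, arc delays and helper names are hypothetical.

```python
from collections import defaultdict, deque


def arrival_times(edges, sources):
    """Propagate latest arrival times through a timing graph.

    edges   : list of (from_node, to_node, delay) timing arcs
    sources : start points, assumed to have arrival time 0
    Each arc is processed once, so runtime is linear in graph size.
    """
    successors = defaultdict(list)
    fanin_count = defaultdict(int)
    nodes = set(sources)
    for src, dst, delay in edges:
        successors[src].append((dst, delay))
        fanin_count[dst] += 1
        nodes.update((src, dst))

    arrival = {node: 0.0 for node in sources}
    ready = deque(node for node in nodes if fanin_count[node] == 0)
    while ready:
        node = ready.popleft()
        for nxt, delay in successors[node]:
            candidate = arrival.get(node, 0.0) + delay
            arrival[nxt] = max(arrival.get(nxt, float("-inf")), candidate)
            fanin_count[nxt] -= 1
            if fanin_count[nxt] == 0:  # all fan-in arcs seen, value is final
                ready.append(nxt)
    return arrival


def setup_slack(arrival, endpoint, clock_period, setup_time):
    """Required time minus arrival time at a register data pin."""
    return clock_period - setup_time - arrival[endpoint]


# Hypothetical register-to-register path with a reconvergent branch (delays in ns).
EDGES = [
    ("ff1/Q", "u1", 0.5),
    ("u1", "u2", 1.0),
    ("ff1/Q", "u2", 0.2),
    ("u2", "ff2/D", 0.3),
]

if __name__ == "__main__":
    at = arrival_times(EDGES, ["ff1/Q"])
    print(round(at["ff2/D"], 3), round(setup_slack(at, "ff2/D", 2.0, 0.1), 3))
```

Replacing `max` with `min` gives the earliest arrival times used for hold checks, which is how the same traversal over-estimates long paths and under-estimates short ones.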
Advantages of STA:
• All timing paths are considered for the timing analysis. This is not the case
in simulation.
• Analysis times are relatively short when compared with event and circuit
simulation.
• Timing can be analyzed for worst case, best case simultaneously. This
type of analysis is not possible in dynamic timing analysis.
• Static Timing Analysis (STA) works with timing models. STA has more
pessimism and thus gives maximum delay of the design. DTA performs
full timing simulation. The problem associated with DTA is the
computational complexity involved in finding the input patterns (vectors)
that produce maximum delay at the output and hence it is slow.
Disadvantages of STA:
• Not all paths in the design always run at worst case delay. Hence the analysis
is pessimistic.
• All clock related information has to be fed to the tool in the form of constraints.
• Inconsistent, incorrect or under-constrained constraints may lead to disastrous
timing analysis.
• STA does not check the logical correctness of the design.
• STA is not suitable for asynchronous circuits.
References
[David] David Blaauw, Kaviraj Chopra, Ashish Srivastava and Lou Scheffer,
“Statistical Timing Analysis: From basic principles to state-of-the-art”,
Transactions on Computer-Aided Design of Integrated Circuits and Systems
(T-CAD), IEEE.
[Agarwal] A. Agarwal, D. Blaauw, V. Zolotov, S. Sundareswaran, M. Zhao,
K. Gala and R. Panda, “Statistical delay computation considering spatial
correlations”, Proceedings of the ASP-DAC 2003, pp. 271-276, Jan 2003.
07 July 2008
Companywise ASIC/VLSI Interview Questions
The interview questions below were contributed by ASIC_diehard (Thanks a lot!). They
were asked for a senior position in the Physical Design domain. The questions are also
related to Static Timing Analysis and Synthesis. Answers to some questions are given as
links. The remaining questions will be answered in coming blogs.
Common introductory questions every interviewer asks are:
• Discuss the projects you worked on at your previous company.
• What physical design flows and activities were you involved in?
• What design complexity, capacity, frequency, process technologies and block sizes
have you handled?
Intel
• Why are power stripes routed in the top metal layers?
Answer:
The top metal layers are thicker and wider, so their resistance is lower and less IR drop
is seen in the power distribution network. If power stripes were routed in the lower metal
layers, they would consume a good amount of the lower routing resources and could
therefore create routing congestion.
• Why do you use an alternating routing approach, HVH/VHV (Horizontal-Vertical-
Horizontal / Vertical-Horizontal-Vertical)?
Answer:
This approach improves the routability of the design and makes better use of routing
resources.
• What are the factors that improve the propagation delay of a standard cell?
Answer:
Improve the input transition to the cell under consideration by upsizing the driver.
Reduce the load seen by the cell under consideration, either by placement refinement or
buffering.
If allowed, increase the drive strength or replace the cell with an LVT (low threshold
voltage) cell.
• How do you compute net delay (interconnect delay) / decode RC values present in
tech file?
• What are the various ways of timing optimization in synthesis tools?
Answer:
Logic optimization: buffer sizing, cell sizing, level adjustment, dummy buffering etc.
Fewer logic levels between flip flops speed up the design.
Optimize the drive strength of the cell so that it is capable of driving more load,
thereby reducing the cell delay.
Select better DesignWare components (timing optimized DesignWare components).
Use LVT (low threshold voltage) and SVT (standard threshold voltage) cells if allowed.
• What would you do in order to not use certain cells from the library?
Answer:
Set the don’t use attribute on those library cells.
• How are delays characterized using a WLM (Wire Load Model)?
Answer:
For a given wire load model, the delay is estimated from the fanout of the cell driving
the net.
Fanout vs. net length is tabulated in the WLM.
Values of unit resistance R and unit capacitance C are given in the technology file.
The estimated net length therefore depends on the fanout.
Once the net length is known, the delay can be calculated; sometimes it is itself tabulated.
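The steps in this answer can be sketched in a few lines. This is a minimal illustration that assumes a lumped RC model (wire resistance times the sum of wire and pin capacitance) and a linear extrapolation slope beyond the last table entry. The table entries, unit R and C values and function names are hypothetical, not values from any real technology file.

```python
def wlm_net_length(fanout, fanout_length, slope):
    """Look up the estimated net length (um) for a given fanout.
    Beyond the largest tabulated fanout, extrapolate linearly with `slope`."""
    max_fanout = max(fanout_length)
    if fanout <= max_fanout:
        return fanout_length[fanout]
    return fanout_length[max_fanout] + slope * (fanout - max_fanout)


def wlm_net_delay(fanout, fanout_length, slope, r_per_um, c_per_um, pin_cap):
    """Lumped RC estimate: R_wire * (C_wire + C_pins).
    With R in kOhm and C in pF the result is in ns."""
    length = wlm_net_length(fanout, fanout_length, slope)
    wire_r = r_per_um * length
    wire_c = c_per_um * length
    return wire_r * (wire_c + pin_cap)


# Hypothetical wire load model: fanout -> net length in um.
FANOUT_LENGTH = {1: 10.0, 2: 25.0, 3: 45.0, 4: 70.0}
SLOPE = 30.0          # extra um per fanout beyond the table (assumed)
R_PER_UM = 0.0005     # kOhm per um (assumed)
C_PER_UM = 0.0002     # pF per um (assumed)

if __name__ == "__main__":
    print(wlm_net_length(6, FANOUT_LENGTH, SLOPE))
    print(wlm_net_delay(3, FANOUT_LENGTH, SLOPE, R_PER_UM, C_PER_UM, pin_cap=0.006))
```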
• What are the various techniques to resolve congestion/noise?
Answer:
Routing and placement congestion both depend on the connectivity in the netlist; a
better floor plan can reduce the congestion.
Noise can be reduced by reducing the overlap of nets in the design.
• Let’s say there are enough routing resources available and timing is fine; can you
increase clock buffers in the clock network? If so, will there be any impact on
other parameters?
Answer:
No. You should not increase clock buffers in the clock network. More clock buffers
mean more area and more power. When everything is fine, why would you want to touch
the clock tree?
• How do you optimize skew/insertion delays in CTS (Clock Tree Synthesis)?
Answer:
Provide better skew targets and insertion delay values while building the clocks.
Choose an appropriate tree structure, based on clock buffers, clock inverters or a mix
of both.
For multiple clock domains, group the clocks while building the clock tree so that skew
is balanced across the clocks (inter clock skew analysis).
• What are pros/cons of latch/FF (Flip Flop)?
Answer: Pros and cons of latch and flip flop
• How do you go about fixing timing violations for latch-to-latch paths?
• As an engineer, let’s say your manager comes to you and asks for a die size
estimate/projection for the next project, giving data on RTL size and performance
requirements. How do you go about figuring it out and coming up with a die size
considering physical aspects?
• How will you design a voltage island scheme between macro pins that cross the
core and are at different power wells? What is the optimal resource solution?
• What formal verification issues have you faced and how did you resolve them?
• How do you calculate maximum frequency given setup, hold, clock and clock
skew?
• What are effects of metastability?
Answer: Metastability
• Consider a timing path crossing from fast clock domain to slow clock domain.
How do you design synchronizer circuit without knowing the source clock
frequency?
• How to solve cross clock timing path?
• How to determine the depth of FIFO/ size of the FIFO?
Answer: FIFO Depth
STMicroelectronics
• What are the challenges you faced in place and route, FV (Formal Verification)
and ECO (Engineering Change Order) areas?
• How long is the design cycle for your designs?
• What are your areas of interest in physical design?
• Explain ECO (Engineering Change Order) methodology.
• Explain CTS (Clock Tree Synthesis) flow.
Answer: Clock Tree Synthesis
• What kind of routing issues have you faced?
• How is STA (Static Timing Analysis) done in OCV (On Chip Variation)
conditions? How do you set OCV (On Chip Variation) in IC Compiler?
How is timing correlation done before and after place and route?
Answer: Process-Voltage-Temperature (PVT) Variations and Static Timing Analysis (STA)
• If there are too many pins of the logic cells in one place within core, what kind of
issues would you face and how will you resolve?
• Define hash/ @array in perl.
• Using TCL (Tool Command Language, Tickle) how do you set variables?
• What is ICC (IC Compiler) command for setting derate factor/ command to
perform physical synthesis?
• What are nanoroute options for search and repair?
• What were your design skew/insertion delay targets?
• How is IR drop analysis done? What are various statistics available in reports?
• Explain pin density/ cell density issues, hotspots?
• How will you relate routing grid with manufacturing grid and judge if the routing
grid is set correctly?
• What is the command for setting multi cycle path?
• If hold violation exists in design, is it OK to sign off design? If not, why?
Texas Instruments (TI)

• How are timing constraints developed?
• Explain timing closure flow/methodology/issues/fixes.
• Explain SDF (Standard Delay Format) back annotation/ SPEF (Standard Parasitic
Exchange Format) timing correlation flow.
• Given a timing path in multi-mode multi-corner, how is STA (Static Timing
Analysis) performed in order to meet timing in both modes and corners, how are
PVT (Process-Voltage-Temperature)/derate factors decided and set in the
Primetime flow?
• With respect to clock gate, what are various issues you faced at various stages in
the physical design flow?
• What are synthesis strategies to optimize timing?
• Explain ECO (Engineering Change Order) implementation flow. Given post
routed database and functional fixes, how will you take it to implement ECO
(Engineering Change Order) and what physical and functional checks you need to
perform?
Qualcomm
• In building the timing constraints, do you need to constrain all IO (Input-Output)
ports?
• Can a single port be multi-clocked? How do you set delays for such ports?
• How is scan DEF (Design Exchange Format) generated?
• What is purpose of lockup latch in scan chain?
• Explain short circuit current.
Answer: Short Circuit Power
• What are pros/cons of using low Vt, high Vt cells?
Answer:
Multi Threshold Voltage Technique
Issues With Multi Height Cell Placement in Multi Vt Flow

• How do you set inter clock uncertainty?
Answer:
set_clock_uncertainty <uncertainty_value> -from clock1 -to clock2
• In DC (Design Compiler), how do you constrain clocks, IO (Input-Output) ports,
max cap and max tran?
• What are the differences in clock constraints from pre CTS (Clock Tree
Synthesis) to post CTS (Clock Tree Synthesis)?
Answer:
The clock uncertainty values differ, and clocks are propagated post CTS.
Post CTS, the clock uncertainty is reduced to model only clock jitter, because skew and
latency now come from the propagated clock tree.
• How is clock gating done?
Answer: Clock Gating
• What constraints do you add in CTS (Clock Tree Synthesis) for clock gates?
Answer:
Mark the clock gating cells as through pins.
• What is trade off between dynamic power (current) and leakage power
(current)?
Answer:
Leakage Power Trends
Dynamic Power
• How do you reduce standby (leakage) power?
Answer: Low Power Design Techniques
• Explain top level pin placement flow? What are parameters to decide?
• Given block level netlists, timing constraints, libraries, macro LEFs (Layout
Exchange Format/Library Exchange Format), how will you start floor planning?
• With net length of 1000um how will you compute RC values, using
equations/tech file info?
• What do noise reports represent?
• What do glitch reports contain?
• What are CTS (Clock Tree Synthesis) steps in IC Compiler?
• What does the clock constraints file contain?
• How do you analyze clock tree reports?
• What do IR drop VoltageStorm reports represent?
• Where /when do you use DCAP (Decoupling Capacitor) cells?
• What are various power reduction techniques?
Answer: Low Power Design Techniques
Hughes Networks
• What is setup/hold? What are setup and hold time impacts on timing? How will
you fix setup and hold violations?
• Explain the function of a Muxed FF (Multiplexed Flip Flop) / scan FF (Scan Flip Flop).
• What is tested in DFT (Design for Testability)?
• In equivalence checking, how do you handle the scanen signal?
• In terms of CMOS (Complementary Metal Oxide Semiconductor), explain the
physical parameters that affect the propagation delay.
• What are power dissipation components? How do you reduce them?
Answer:
Short Circuit Power
Leakage Power Trends
Dynamic Power
Low Power Design Techniques
• How is delay affected by PVT (Process-Voltage-Temperature)?
Answer: Process-Voltage-Temperature (PVT) Variations and Static Timing Analysis (STA)
• Why is power signal routed in top metal layers?
Avago Technologies (former HP group)

• How do you minimize clock skew/ balance clock tree?
• Given 11 minterms and asked to derive the logic function.
• Given C1 = 10 pF and C2 = 1 pF connected in series with a switch in between, where
at t = 0 the switch is open, one end is at 5 V and the other end is at zero voltage;
compute the voltage across C2 when the switch is closed.
• Explain the modes of operation of a CMOS (Complementary Metal Oxide
Semiconductor) inverter. Show the IO (Input-Output) characteristics curve.
• Implement a ring oscillator.
• How to slow down ring oscillator?
Hynix Semiconductor
• How do you optimize power at various stages in the physical design flow?
• What timing optimization strategies you employ in pre-layout /post-layout stages?
• What are process technology challenges in physical design?
• Design divide by 2, divide by 3, and divide by 1.5 counters. Draw timing
diagrams.
• What are multi-cycle paths, false paths? How to resolve multi-cycle and false
paths?
• Given a flop to flop path with combo delay in between and output of the second
flop fed back to combo logic. Which path is fastest path to have hold violation
and how will you resolve?
• What RTL (Register Transfer Level) coding styles should be adopted to yield an
optimal backend design?
• Draw timing diagrams to represent the propagation delay, set up, hold, recovery,
removal, minimum pulse width.
Clock Tree Synthesis (CTS)
The goal of CTS is to minimize skew and insertion delay. The clock is not propagated
before CTS, as shown in Figure (1).
Figure (1) Ideal clock before CTS
After CTS, hold slack should improve. The clock tree begins at the clock source defined
in the .sdc file and ends at the stop pins of the flops. There are two types of stop pins,
known as ignore pins and sync pins. ‘Don’t touch’ circuits and pins in the front end
(logic synthesis) are treated as ‘ignore’ circuits or pins at the back end (physical
synthesis). ‘Ignore’ pins are ignored for timing analysis. If the clock is divided, then a
separate skew analysis is necessary.
Global skew balancing targets zero skew between two synchronous pins without
considering their logic relationship.
Local skew balancing targets zero skew between two synchronous pins while considering
their logic relationship.
If the clock is skewed intentionally to improve setup slack, it is known as useful skew.
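The effect of useful skew on a single register-to-register path can be illustrated with the standard setup and hold slack expressions. This is a minimal sketch, assuming skew is defined as capture clock arrival minus launch clock arrival; all delay values are hypothetical and in ns.

```python
def setup_slack(period, t_clk_to_q, t_comb_max, t_setup, skew=0.0):
    """Setup slack: positive skew (late capture clock) gives the data more time."""
    return period + skew - (t_clk_to_q + t_comb_max + t_setup)


def hold_slack(t_clk_to_q, t_comb_min, t_hold, skew=0.0):
    """Hold slack: the same positive skew eats into the hold margin."""
    return t_clk_to_q + t_comb_min - t_hold - skew


if __name__ == "__main__":
    # Hypothetical path that fails setup with a balanced clock.
    print(round(setup_slack(6.0, 0.5, 5.4, 0.5), 3))             # no skew
    print(round(setup_slack(6.0, 0.5, 5.4, 0.5, skew=0.5), 3))   # useful skew
    # The hold margin on the same flop pair shrinks by the same amount.
    print(round(hold_slack(0.5, 0.2, 0.1), 3))
    print(round(hold_slack(0.5, 0.2, 0.1, skew=0.5), 3))
```

The sketch shows why useful skew has to be applied with care: the setup slack gained on one path is paid for with hold slack on the same capture flop, and with setup slack on the paths leaving it.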
Rigidity is the term coined in Astro to indicate the relaxation of constraints. The higher
the rigidity, the tighter the constraints.
In Clock Tree Optimization (CTO) the clock can be shielded so that noise is not coupled to
other signals. But shielding increases area by 12 to 15%. Since the clock signal is global
in nature, the same metal layer used for power routing is also used for the clock. CTO is
achieved by buffer sizing, gate sizing, buffer relocation, level adjustment and HFN
(High Fanout Net) synthesis. We try to improve setup slack in the pre-placement,
placement and post-placement optimization stages before CTS, while neglecting hold
slack. In post-placement optimization after CTS, hold slack is improved. As a result of
CTS, a lot of buffers are added. Generally, for 100k gates around 650 buffers are added.
Global skew report is shown below.
**********************************************************************
*
* Clock Tree Skew Reports
*
* Tool : Astro
* Version : V-2004.06 for IA.32 -- Jul 12, 2004
* Design : sam_cts
* Date : Sat May 19 16:09:20 2007
*
**********************************************************************
======== Clock Global Skew Report =============================
Clock: clock
Pin: clock
Net: clock
Operating Condition = worst

The clock global skew = 2.884
The longest path delay = 4.206
The shortest path delay = 1.322
The longest path delay end pin: \mac21\/mult1\/mult_out_reg[2]/CP

The shortest path delay end pin: \mac22\/adder1\/add_out_reg[3]/CP
The Longest Path:

====================================================================
Pin Cap Fanout Trans Incr Arri Master/Net
--------------------------------------------------------------------
clock 0.275 1 0.000 0.000 r clock
U1118/CCLK 0.000 0.000 0.000 r pc3c01
U1118/CP 3.536 467 1.503 1.124 1.124 r n174
\mac21\/mult1\/mult_out_reg[2]/CP
4.585 3.082 4.206 r sdnrq1
[clock delay] 4.206
====================================================================
The Shortest Path:

====================================================================
Pin Cap Fanout Trans Incr Arri Master/Net
--------------------------------------------------------------------
clock 0.275 1 0.000 0.000 r clock
U1118/CCLK 0.000 0.000 0.000 r pc3c01
U1118/CP 3.536 467 1.503 1.124 1.124 r n174
\mac22\/adder1\/add_out_reg[3]/CP
1.701 0.198 1.322 r sdnrq1
[clock delay] 1.322
====================================================================
Figure (2) Clock after CTS and CTO
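The global skew in the report is simply the difference between the longest and shortest clock path delays (4.206 - 1.322 = 2.884). A minimal sketch of that calculation follows, assuming the per-pin clock arrival times have already been extracted into a dictionary. The helper name and the dictionary layout are hypothetical; the two values are copied from the report above, with the escaped pin names simplified.

```python
def global_skew(clock_arrivals):
    """Return (skew, longest_pin, shortest_pin) for a dict of
    clock sink pin -> clock path delay."""
    longest_pin = max(clock_arrivals, key=clock_arrivals.get)
    shortest_pin = min(clock_arrivals, key=clock_arrivals.get)
    skew = clock_arrivals[longest_pin] - clock_arrivals[shortest_pin]
    return skew, longest_pin, shortest_pin


# Clock path delays of the two end pins named in the report.
ARRIVALS = {
    "mac21/mult1/mult_out_reg[2]/CP": 4.206,
    "mac22/adder1/add_out_reg[3]/CP": 1.322,
}

if __name__ == "__main__":
    skew, longest, shortest = global_skew(ARRIVALS)
    print(round(skew, 3), longest, shortest)
```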
Related Articles
• Physical Design Flow
• Libraries
• Inputs–outputs from physical design process
• Floor Planning
• Power Planning
• Timing Analysis in Physical Design
• Placement
• Routing

• Clock Definitions
• Multi Voltage Designs: Timing Issues
• Companywise ASIC/VLSI Interview Questions
26 September 2007
Multi Voltage Designs: Timing Issues
Clock
Clock Tree Synthesis (CTS) tools should be aware of different power
domains and understand the level shifters to insert them in
appropriate places. Clock tree is routed through level shifters to
reach different power domains. Simultaneous timing analysis and
optimization is necessary for multiple voltage domains. Thus CTS
becomes more complex in multi voltage designs.
Static Timing Analysis (STA)
Timing analysis for a single voltage design is easy. With static
voltage scaling it becomes a somewhat tougher job, as the analysis has
to be carried out for different voltages. This methodology requires
libraries which are characterized for the different voltages used.
Multi level and dynamic voltage scaling pose a greater challenge. For
each supply voltage level or operating point, constraints are
specified. There can be different operating modes for different
voltages. Constraints need not be the same for all modes and voltages.
The performance target for each mode can vary. The EDA tool should be
capable of handling all these situations simultaneously to carry out
timing analysis. The different constraints at different modes and
voltages all have to be satisfied.
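The requirement that every mode and voltage be satisfied can be pictured as taking the worst slack over all scenarios. The sketch below is only an illustration of that bookkeeping, assuming the path delay and required time have already been computed per (mode, voltage) scenario; the scenario names and numbers are hypothetical.

```python
def scenario_slacks(path_delay, required_time):
    """Slack per scenario: required time minus path delay (ns)."""
    return {s: required_time[s] - path_delay[s] for s in path_delay}


def worst_scenario(path_delay, required_time):
    """Return (scenario, slack) for the scenario with the smallest slack.
    The path meets timing only if this slack is non-negative."""
    slacks = scenario_slacks(path_delay, required_time)
    worst = min(slacks, key=slacks.get)
    return worst, slacks[worst]


# Hypothetical scenarios keyed by (mode, supply voltage in V).
PATH_DELAY = {
    ("turbo", 1.2): 1.45,
    ("functional", 1.0): 1.80,
    ("low_power", 0.8): 3.40,
}
REQUIRED_TIME = {
    ("turbo", 1.2): 1.40,
    ("functional", 1.0): 2.00,
    ("low_power", 0.8): 4.00,
}

if __name__ == "__main__":
    scenario, slack = worst_scenario(PATH_DELAY, REQUIRED_TIME)
    print(scenario, round(slack, 3))
```

In this made-up data the slowest voltage is not the failing one: the low voltage mode has the longest delay but also the most relaxed target, which is why each scenario needs its own constraints.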
Related Articles
Multiple Voltage ASIC/SoC Designs: Classification
Multiple Voltage Design Challenges
Multiple Voltage Designs: Power Planning Issues

Dynamic vs Static Timing Analysis
Timing paths
Multi Vdd (Voltage)
Sub Threshold Current
The sub threshold current flows from source to drain even when the gate to source
voltage is less than the threshold voltage of the device. This happens due to carrier
diffusion between the source and drain regions of the CMOS transistor in weak inversion.
When the gate to source voltage is smaller than but very close to the threshold voltage
of the device, the sub threshold current becomes significant.
As observed by [4], sub threshold leakage currently still plays the main part among the
three leakage mechanisms. However, researchers believe that the other two, gate leakage
and reverse-biased junction Band To Band Tunneling (BTBT), will be as important as sub
threshold leakage from the 45 nm process downwards. In addition, with technology scaling,
the gate oxide thickness will be reduced and the substrate doping densities will be
increased. As a result, other factors such as gate-induced drain leakage (GIDL) and
drain-induced barrier lowering (DIBL) will also become more and more evident. Therefore,
future effective low leakage design will need to target several components, since all of
them play an important role in the total leakage consumption. Various techniques at
process and circuit level exist to reduce leakage consumption, including modifying the
doping profile, oxide thickness and channel length. Forward or reverse body biasing is
also one of them; it is a technique resulting in variable threshold CMOS.
Sub threshold current Isub, which occurs when gate voltage is below threshold voltage
Vth, is a main part of leakage current [2]. Isub depends on different effects and voltages,
which are formulated in following equations [1]:
Where
q is the elementary charge,
T is the temperature,
n is the sub threshold swing coefficient,
kB is the Boltzmann constant,
η is the drain induced barrier lowering (DIBL) coefficient,
γ is the body effect coefficient,
μ is the mobility,
Vth0 is the zero-bias threshold voltage,
Vgs is the gate-source voltage,
Vbs is the bulk-source voltage,
Vds is the drain-source voltage,
εox and εSi are the dielectric constants of the gate oxide and silicon,
NSUB is the uniform substrate doping concentration and
NDEP the channel doping concentration,
Tox is the thickness of the oxide layer,
ΦS is the surface potential,
DSUB and ETA0 are technology dependent DIBL coefficients, and
ETAB is a body-bias coefficient of the BSIM4 model.
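The delay Td of a CMOS device can be approximated by equation (5), the alpha-power law
form consistent with the symbols defined below (Vdd is the supply voltage):
$$T_d = k' \, \frac{C_L \, V_{dd}}{(V_{dd} - V_{th})^{\alpha}} \qquad (5)$$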
Where
k’ is a technology constant,
CL is the load, and
α models the short channel effects [3].
Varying Vth is a common technique to reduce leakage because Isub scales exponentially
with Vth (see Equation 1). Thus, a higher Vth results in lower leakage. However,
from equation (5) it follows that a higher Vth also results in a longer delay [2]. Hence,
optimize the design with a balanced application of low Vth (LVT) and high Vth (HVT)
devices.
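The leakage versus delay trade-off can be made concrete with a small numerical sketch. It assumes two commonly used simplified forms that are not quoted from the references: off-state leakage proportional to exp(-Vth / (n·kB·T/q)), and delay proportional to Vdd / (Vdd - Vth)^α. The threshold voltages, n and α below are hypothetical values chosen only for illustration.

```python
import math

BOLTZMANN = 1.380649e-23           # J/K
ELEMENTARY_CHARGE = 1.602176634e-19  # C


def thermal_voltage(temp_k):
    """kB*T/q in volts."""
    return BOLTZMANN * temp_k / ELEMENTARY_CHARGE


def relative_leakage(vth, n, temp_k=300.0):
    """Off-state (Vgs = 0) sub threshold leakage up to a constant factor."""
    return math.exp(-vth / (n * thermal_voltage(temp_k)))


def relative_delay(vdd, vth, alpha):
    """Alpha-power law delay up to the constant k' * CL."""
    return vdd / (vdd - vth) ** alpha


if __name__ == "__main__":
    vdd, n, alpha = 1.0, 1.5, 1.3   # assumed values
    lvt, hvt = 0.25, 0.40           # assumed threshold voltages in V
    leak_ratio = relative_leakage(lvt, n) / relative_leakage(hvt, n)
    delay_ratio = relative_delay(vdd, hvt, alpha) / relative_delay(vdd, lvt, alpha)
    print(f"LVT leaks about {leak_ratio:.0f}x more than HVT")
    print(f"HVT is about {delay_ratio:.2f}x slower than LVT")
```

With these assumed numbers the leakage changes by more than an order of magnitude while the delay changes by a few tens of percent, which is the motivation for using LVT cells only on critical paths.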
Transfer characteristics of MOSFET for VGS near Vth are shown in below figure.
Transfer characteristics of MOSFET VGS near Vth [2]
From the above figure it can be observed that ID increases exponentially with reduction in
Vth.
As noted by [4] key dependencies of the sub threshold slope can be summarized as
follows:
- Tox ↓ => Cox ↑ => n ↓ => sharper sub threshold
- NA ↑ => Csth ↑ => n ↑ => softer sub threshold
- VSB ↑ => Csth ↓ => n ↓ => sharper sub threshold
- T ↑ => softer sub threshold
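These trends can be checked numerically with the usual textbook expressions n = 1 + Csth/Cox and S = n·(kB·T/q)·ln 10 for the sub threshold swing. Both formulas are assumed here as standard forms and are not quoted from [4]; the capacitance values are hypothetical and only their ratio matters.

```python
import math

BOLTZMANN = 1.380649e-23           # J/K
ELEMENTARY_CHARGE = 1.602176634e-19  # C


def slope_factor(c_sth, c_ox):
    """n = 1 + Csth / Cox (capacitances per unit area, any common unit)."""
    return 1.0 + c_sth / c_ox


def subthreshold_swing_mv(n, temp_k=300.0):
    """Gate voltage change (mV) needed for a 10x change in sub threshold current."""
    thermal_voltage = BOLTZMANN * temp_k / ELEMENTARY_CHARGE
    return n * thermal_voltage * math.log(10.0) * 1000.0


if __name__ == "__main__":
    thick_oxide = slope_factor(c_sth=1.0, c_ox=2.0)   # smaller Cox
    thin_oxide = slope_factor(c_sth=1.0, c_ox=4.0)    # larger Cox, smaller n
    print(round(subthreshold_swing_mv(thick_oxide), 1))
    print(round(subthreshold_swing_mv(thin_oxide), 1))
    print(round(subthreshold_swing_mv(thin_oxide, temp_k=400.0), 1))
```

A smaller swing in mV/decade corresponds to the "sharper" sub threshold behaviour in the list, and a larger swing to the "softer" one.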
How to minimize sub threshold leakage?
An increase in the threshold voltage of the device keeps the Vgs of the NMOS transistor
safely below Vt,n. This is the case for a logic zero input. For a logic one input, an
increase in the threshold voltage of the device keeps the |Vgs| of the PMOS transistor
safely below |Vt,p|.
References
[1] Anantha P. Chandrakasan, Samuel Sheng and Robert W. Brodersen, “Low Power
CMOS Digital Design”, IEEE Journal of Solid State Circuits, vol. 27, no. 4, pp. 472-484,
April 1992
[2] Massoud Pedram, “Leakage Power Modeling and Minimization”, University of
Southern California, Dept. of EE-Systems, Los Angeles, CA 90089, ICCAD 2004
Tutorial, www.ceng.usc.edu, 10/10/2007
[3] Frank Sill, Frank Grassert and Dirk Timmermann, “Reducing Leakage with Mixed-
Vth (MVT)”, 18th International Conference on VLSI Design, IEEE, pp. 874-877, January
2005
[4] Wei Liu, “Techniques for Leakage Power Reduction in Nanoscale Circuits: A Survey”,
Department of Informatics and Mathematical Modeling, Technical University of
Denmark, IMM Technical Report 2007
• 1) Chip utilization depends on ___.
a. Only on standard cells
b. Standard cells and macros
c. Only on macros
d. Standard cells macros and IO pads
• 2) In Soft blockages ____ cells are placed.
a. Only sequential cells
b. No cells
c. Only Buffers and Inverters
d. Any cells
• 3) Why do we have to remove scan chains before placement?
a. Because scan chains are a group of flip flops
b. It does not have timing critical path
c. It is a series of flip flops connected as a FIFO
d. None
• 4) Delay between shortest path and longest path in the clock is called ____.
a. Useful skew
b. Local skew
c. Global skew
d. Slack
• 5) Cross talk can be avoided by ___.
a. Decreasing the spacing between the metal layers
b. Shielding the nets
c. Using lower metal layers
d. Using long nets
• 6) Prerouting means routing of _____.
a. Clock nets
b. Signal nets
c. IO nets
d. PG nets
• 7) Which of the following metal layers has the maximum resistance?
a. Metal1
b. Metal2
c. Metal3
d. Metal4
• 8) What is the goal of CTS?
a. Minimum IR Drop
b. Minimum EM
c. Minimum Skew
d. Minimum Slack
• 9) Usually Hold is fixed ___.
a. Before Placement
b. After Placement
c. Before CTS
d. After CTS
• 10) To achieve better timing ____ cells are placed in the critical path.
a. HVT
b. LVT
c. RVT
d. SVT
• 11) Leakage power is inversely proportional to ___.
a. Frequency
b. Load Capacitance
c. Supply voltage
d. Threshold Voltage
• 12) Filler cells are added ___.
a. Before Placement of std cells
b. After Placement of Std Cells
c. Before Floor planning
d. Before Detail Routing
• 13) Search and Repair is used for ___.
a. Reducing IR Drop
b. Reducing DRC
c. Reducing EM violations
d. None
• 14) Maximum current density of a metal is available in ___.
a. .lib
b. .v
c. .tf
d. .sdc
• 15) More IR drop is due to ___.
a. Increase in metal width
b. Increase in metal length
c. Decrease in metal length
d. Lot of metal layers
• 16) The minimum height and width a cell can occupy in the design is called ___.
a. Unit Tile cell
b. Multi-height cell
c. LVT cell
d. HVT cell
• 17) CRPR stands for ___.
a. Cell Reconvergence Pessimism Removal
b. Cell Reconvergence Preset Removal
c. Clock Reconvergence Pessimism Removal
d. Clock Reconvergence Preset Removal
• 18) In OCV timing check, for setup time, ___.
a. Max delay is used for launch path and Min delay for capture path
b. Min delay is used for launch path and Max delay for capture path
c. Both Max delay is used for launch and Capture path
d. Both Min delay is used for both Capture and Launch paths
• 19) "Total metal area and(or) perimeter of conducting layer / gate to gate
area" is called ___.
a. Utilization
b. Aspect Ratio
c. OCV
d. Antenna Ratio
• 20) The solution for the antenna effect is ___.
a. Diode insertion
b. Shielding
c. Buffer insertion
d. Double spacing
• 21) To avoid cross talk, the shielded net is usually connected to ___.
a. VDD
b. VSS
c. Both VDD and VSS
d. Clock
• 22) If the data is faster than the clock in Reg to Reg path ___ violation may
come.
a. Setup
b. Hold
c. Both
d. None
• 23) Hold violations are preferred to fix ___.
a. Before placement
b. After placement
c. Before CTS
d. After CTS
• 24) Which of the following is not present in SDC ___?
a. Max tran
b. Max cap
c. Max fanout
d. Max current density
• 25) Timing sanity check means (with respect to PD) ___.
a. Checking timing of routed design without net delays
b. Checking timing of placed design with net delays
c. Checking timing of unplaced design without net delays
d. Checking timing of routed design with net delays
• 26) Which of the following is having highest priority at final stage (post
routed) of the design ___?
a. Setup violation
b. Hold violation
c. Skew
d. None
• 27) Which of the following is best suited for CTS?
a. CLKBUF
b. BUF
c. INV
d. CLKINV
• 28) Max voltage drop will occur at (without macros) ___.
a. Left and Right sides
b. Bottom and Top sides
c. Middle
d. None
• 29) Which of the following is preferred while placing macros ___?
a. Macros placed at the center of the die
b. Macros placed on the left and right sides of the die
c. Macros placed on the bottom and top sides of the die
d. Macros placed based on connectivity of the I/O
• 30) Routing congestion can be avoided by ___.
a. Placing cells closer
b. Placing cells at corners
c. Distributing cells
d. None
• 31) Pitch of the wire is ___.
a. Min width
b. Min spacing
c. Min width - min spacing
d. Min width + min spacing
• 32) The following step is not part of Physical Design ___.
a. Floorplanning
b. Placement
c. Design Synthesis
d. CTS
• 33) If the technology file has 7 metal layers, which metals will you use for
power?
a. Metal1 and metal2
b. Metal3 and metal4
c. Metal5 and metal6
d. Metal6 and metal7
• 34) If metal6 and metal7 are used for power in a 7 metal layer process
design, which metals will you use for clock?
a. Metal1 and metal2
b. Metal3 and metal4
c. Metal4 and metal5
d. Metal6 and metal7
• 35) In a reg to reg timing path Tclocktoq delay is 0.5ns and TCombo delay is
5ns and Tsetup is 0.5ns then the clock period should be ___.
a. 1ns
b. 3ns
c. 5ns
d. 6ns
• 36) The difference between clock buff/inverters and normal buff/inverters is __.
a. Clock buff/inverters are faster than normal buff/inverters
b. Clock buff/inverters are slower than normal buff/inverters
c. Clock buff/inverters have equal rise and fall times with high drive strengths
compared to normal buff/inverters
d. Normal buff/inverters have equal rise and fall times with high drive strengths
compared to clock buff/inverters.
• 37) Which configuration is preferred during floorplanning?
a. Double back with flipped rows
b. Double back with non flipped rows
c. With channel spacing between rows and no double back
d. With channel spacing between rows and double back
• 38) What is the effect of adding a high drive strength buffer to a long net?
a. Delay on the net increases

b. Capacitance on the net increases
c. Delay on the net decreases
d. Resistance on the net increases.
• 39) Delay of a cell depends on which factors ?
a. Output transition and input load

b. Input transition and Output load
c. Input transition and Output transition
d. Input load and Output Load.
• 40) After the final routing the violations in the design ___.
a. There can be no setup, no hold violations

b. There can be only setup violation but no hold
c. There can be only hold violation not Setup violation
d. There can be both violations.
• 41) Utilisation of the chip after placement optimisation will be ___.
a. Constant
b. Decrease
c. Increase
d. None of the above
• 42) What is routing congestion in the design?
a. Ratio of required routing tracks to available routing tracks

b. Ratio of available routing tracks to required routing tracks
c. Depends on the routing layers available
• 43) What are preroutes in your design?
a. Power routing
b. Signal routing
c. Power and Signal routing
d. None of the above.
• 44) Clock tree doesn't contain following cell ___.
a. Clock buffer
b. Clock Inverter
c. AOI cell
• Answers:
1)b
2)c
3)b
4)c
5)b
6)d
7)a
8)c
9)d
10)b
11)d
12)d
13)b
14)c
15)b
16)a
17)c
18)a
19)d
20)a
21)b
22)b
23)d
24)d
25)c
26)b
27)a
28)c
29)d
30)c
31)d
32)c
33)d
34)c
35)d
36)c
37)a
38)c
39)b
40)d
41)c
42)a
43)a
44)c
saud said...
23) Hold violations are preferred to be fixed ___.

a. Before placement
b. After placement
c. Before CTS
d. After CTS
ANS given: b (I think it's wrong)
According to me, before CTS there is an ideal clock and no real clock is present. If
the real clock is not present we don't know the skew and hence cannot fix hold
accurately.
You are welcome to correct me if I am wrong.
April 9, 2008 9:41 PM
Murali said...
hi saud,
You are right.

It should be d.After CTS.
I have corrected it.
Thanks for your observation.
Enjoy good reading !
rgds
murali
April 10, 2008 9:36 AM
vamsi addagada said...
hi saud
Hold violations are fixed after CTS, in what is called propagated clock mode.
April 10, 2008 10:06 AM
Grigor said...
* 27) Which of the following is best suited for CTS?
a. CLKBUF b. BUF c. INV d. CLKINV

Your answer is 27)a.
But Clock tree QoR such as insertion delay, skew, pulse width are better when
using CLKINV.
I think answers "a" and "d" are both correct.

Which one is better?
April 10, 2008 5:26 PM
Murali said...
problem with inverter is it shifts the logic level.... and hence to get back original
logic you have to use one more inverter which will ultimately increase area.
April 11, 2008 8:54 AM
Grigor said...
Hi murali,
Regarding increased area you are right if we have only one stage clock tree.
Generally an inverter of the same drive strength contains fewer transistors than a buffer. So
if we have 2 logically equivalent clock trees which have more than 3 stages (which
is the case in most designs), the area is smaller with an inverter tree than with a
buffer tree.
Which one is preferable depends on the design.
It is an arguable question, but an INVERTER clock tree has more advantages (less area,
small skew, small insertion delay, small duty cycle distortion) than a buffer tree.
April 11, 2008 12:27 PM
Anonymous said...
Please explain the answer to 18) OCV timing for setup time
April 30, 2008 9:10 AM
muju said...
11. "Leakage power is directly proportional to Vt" is wrong.
In my view it should be inversely proportional to Vt, because if the threshold
voltage is lower we get more leakage power. To reduce the leakage power
we go for high Vt, so the question should be framed as: leakage power is inversely
proportional to threshold voltage.
Correct me if I am wrong.
May 30, 2008 11:20 AM
Anonymous said...
Question: 1) Chip utilization depends on ___.
Given Answer: Standard cells and macros.
I feel the answer is Standard cells, macros and pads as pad area also plays
important role in chip utilisation.
Correct me if i am wrong.
August 7, 2008 7:22 PM
Gk said...
12) Answer d) Before Detail Routing. I think it should be After Routing, because
during routing we may need to add some more buffers to meet timing and DRV
goals.
August 28, 2008 12:21 PM
Anonymous said...
Hi
i need to get answer for the difference between hvt and lvt cells construction
November 12, 2008 2:30 PM
• What parameters (or aspects) differentiate Chip Design & Block level
design??
• How do you place macros in a full chip design?
• Differentiate between a Hierarchical Design and flat design?
• Which is more complicated: a 48 MHz or a 500 MHz clock
design?
• Name a few tools which you have used for physical verification.
• What input files will you give for PrimeTime correlation?
• What are the algorithms used while routing? Will it optimize wire length?
• How will you decide the Pin location in block level design?
• If the routing congestion exists between two macros, then what will you
do?
• How will you place the macros?
• How will you decide the die size?
• If a long metal line is connected to diffusion and poly, which one
will be affected by the antenna problem?
• If the full chip design is routed with 7 metal layers, why are macros designed
using 5LM instead of 7LM?
• In your project what is die size, number of metal layers, technology,
foundry, number of clocks?
• How many macros in your design?
• What is each macro size and no. of standard cell count?
• How did you handle the clock in your design?
• What are the input needs for your design?
• What does the SDC constraint file contain?
• How did you do power planning?
• How to find total chip power?
• How to calculate core ring width, macro ring width and strap or trunk
width?
• How to find number of power pad and IO power pads?
• What are the problems faced related to timing?
• How did you resolve the setup and hold problems?
• If your design reports 10000 or more problems, what
will you do?
• Which layer do you prefer for clock routing and why?
• If your design has a reset pin, will it affect the input pin, the output pin or
both?
• During power analysis, if you face an IR drop problem, how do you
avoid it?
• Define the antenna problem. How did you resolve it?
• How do delays vary with different PVT conditions? Show the graph.
• Explain the flow of physical design and inputs and outputs for each step in
flow.
• What is cell delay and net delay?
• What are delay models and what is the difference between them?
• What is wire load model?
• What do SDC constraints contain?
• Why are higher metal layers preferred for Vdd and Vss?
• What is logic optimization? Give some methods of logic optimization.
• What is the significance of negative slack?
• What is signal integrity? How does it affect timing?
• What is IR drop? How can it be avoided? How does it affect timing?
• What is EM and what are its effects?
• What is floor plan and power plan?
• What are the types of routing?
• What is a grid? Why do we need it, and what are the different types of grids?
• What is the core and how will you decide the w/h ratio for the core?
• What is effective utilization and chip utilization?
• What is latency? Give the types?
• How are the width of metal and number of straps calculated for power and
ground?
• What is negative slack? How does it affect timing?
• What is track assignment?
• What is gridded and gridless routing?
• What is a macro and standard cell?
• What is congestion?
• Whether congestion is related to placement or routing?
• What are clock trees?
• What are clock tree types?
• Which layer is used for clock routing and why?
• What is cloning and buffering?
• What are placement blockages?
• How do slow and fast transitions at inputs affect timing for gates?
• What is antenna effect?
• What are DFM issues?
• What is .lib, LEF, DEF, .tf?
• What is the difference between synthesis and simulation?
• What is metal density, metal slotting rule?
• What is OPC, PSM?
• Why is the clock not synthesized in DC?
• What are high-Vt and low-Vt cells?
• What do corner cells contain?
• What is the difference between core filler cells and metal fillers?
• How to decide number of pads in chip level design?
• What are tie-high and tie-low cells and where are they used?
• What is LEF?
• What is DEF?
• What are the steps involved in designing an optimal pad ring?
• What are the steps that you have done in the design flow?
• What are the issues in floor plan?
• How can you estimate area of block?
• How much aspect ratio should be kept (or have you kept) and what is the
utilization?
• How to calculate core ring and stripe widths?
• What if a hot spot is found in some area of the block? How do you tackle this?
• If you still have a hot spot after adding stripes, what do you do?
• What is threshold voltage? How does it affect timing?
• What are the contents of lib, lef, sdc?
• What is meant by 9 track, 12 track standard cells?
• What is a scan chain? What if the scan chain is not detached and reordered? Is it
compulsory?
• What are setup and hold? Why are they there? What if setup and hold are violated?
• In a circuit, for a reg to reg path, Tclktoq is 50 ps, Tcombo 50 ps, Tsetup 50 ps,
Tskew is 100 ps. Then what is the maximum operating frequency?
• How do R and C values affect timing?
• How are ohm (R) and farad (C) related to second (T)?
• What is transition? What if the transition time is large?
• What is difference between normal buffer and clock buffer?
• What is antenna effect? How it is avoided?
• What is ESD?
• What is cross talk? How can you avoid?
• How double spacing will avoid cross talk?
• What is difference between HFN synthesis and CTS?
• What is hold problem? How can you avoid it?
• For one iteration we have 0.5ns of insertion delay and 0.1 skew, and for another
iteration 0.29ns insertion delay and 0.25 skew for the same circuit. Which one
will you select? Why?
• What is partial floor plan?

2 comments:
Alexander said...
some of the Answers to these questions can be found at the below mentioned
location:
http://www.vlsichipdesign.com/asic_vlsi_faq/faq_page1.html
Visit this blog will try to answer one question daily

http://asicinterview.blogspot.com
November 16, 2007 3:06 PM
Anil said...
Hi,
Does anyone know the purpose of adding endcap cells?

Regards,
Anil
March 24, 2008 12:38 AM
Physical Design Questions and Answers
• I am getting several emails requesting answers to the questions posted in this
blog, but it is very difficult to provide detailed answers to all questions in my
available spare time. Hence I decided to give "short and sweet" one line answers
to the questions so that readers can benefit immediately. Detailed answers will
be posted at a later stage. I have given answers to some of the physical design
questions here. Enjoy!
What parameters (or aspects) differentiate Chip Design and Block level design?
• Chip design has I/O pads; block design has pins.
• Chip design uses all metal layers available; block design may not use all metal
layers.
• Chip is generally rectangular in shape; blocks can be rectangular or rectilinear.
• Chip design requires packaging; block design ends in a macro.
How do you place macros in a full chip design?
• First check the flylines, i.e. check net connections from macro to macro and macro to
standard cells.
• If there are more connections from macro to macro, place those macros near
each other, preferably near the core boundaries.
• If an input pin is connected to a macro, it is better to place the macro near that pin or pad.
• If a macro has more connections to standard cells, spread the macros inside the core.
• Avoid crisscross placement of macros.
• Use soft or hard blockages to guide placement engine.
Differentiate between a Hierarchical Design and flat design?
• Hierarchical design has blocks and subblocks in a hierarchy; flattened design has no
subblocks and has only leaf cells.
• Hierarchical design takes more run time; flattened design takes less run time.
Which is more complicated when you have a 48 MHz and 500 MHz clock design?
• 500 MHz, because it is more constrained (i.e. shorter clock period) than the 48 MHz
design.
Name few tools which you used for physical verification?
• Hercules from Synopsys, Calibre from Mentor Graphics.
What input files will you give for PrimeTime correlation?
• Netlist, Technology library, Constraints, SPEF or SDF file.
If the routing congestion exists between two macros, then what will you do?
• Provide soft or hard blockage
How will you decide the die size?
• By checking the total area of the design you can decide die size.
If a long metal line is connected to diffusion and poly, which one will be affected
by the antenna problem?
• Poly
If the full chip design is routed with 7 metal layers, why are macros designed using
5LM instead of 7LM?
• Because the top two metal layers are required for global routing in chip design. If the top
metal layers are also used at block level, they will create routing blockages.
In your project what is die size, number of metal layers, technology, foundry,
number of clocks?
• Die size: state it in mm, e.g. 1 mm x 1 mm; remember 1 mm = 1000 microns, which is a big
size!
• Metal layers: See your tech file. Generally for 90 nm it is 7 to 9.
• Technology: Again look into the tech files.
• Foundry: Again look into the tech files; e.g. TSMC, IBM, ARTISAN etc.
• Clocks: Look into your design and SDC file !
How many macros in your design?
• You know it well as you have designed it ! A SoC (System On Chip) design may
have 100 macros also !!!!
What is each macro size and number of standard cell count?
• Depends on your design.
What are the input needs for your design?
• For synthesis: RTL, Technology library, Standard cell library, Constraints
• For Physical design: Netlist, Technology library, Constraints, Standard cell library
What does the SDC constraint file contain?
• Clock definitions
• Timing exceptions: multicycle paths, false paths
• Input and Output delays
How did you do power planning?

How to calculate core ring width, macro ring width and strap or trunk width?
How to find number of power pad and IO power pads?
How the width of metal and number of straps calculated for power and ground?
• Get the total core power consumption; get the metal layer current density value
from the tech file; divide the total power by the number of sides of the chip; divide the
obtained value by the current density to get the core power ring width. Then
calculate the number of straps using some more equations. This will be explained in detail
later.
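A minimal sketch of the ring-width estimate described above. The answer divides power by current density directly; this sketch assumes the power is first converted to a current with I = P / Vdd, and that the current density from the tech file is expressed in mA per um of wire width. The function name and all numbers are illustrative, not values from a real tech file.

```python
def core_ring_width_um(total_power_mw, vdd_v, max_curr_density_ma_per_um, sides=4):
    """Estimate the core power ring width in micrometres.

    Assumptions (not from the original post): current is derived as P / Vdd,
    and the current is shared equally between the sides of the core.
    """
    total_current_ma = total_power_mw / vdd_v        # I = P / V
    current_per_side_ma = total_current_ma / sides   # split across chip sides
    return current_per_side_ma / max_curr_density_ma_per_um


# Hypothetical design: 480 mW core power at 1.2 V, 2 mA/um allowed density.
width = core_ring_width_um(total_power_mw=480.0, vdd_v=1.2,
                           max_curr_density_ma_per_um=2.0)
print(f"core ring width ~ {width:.1f} um")
```

The number of straps would then follow from a similar current budget per strap, which the post leaves for a later article.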
How to find total chip power?
• Total chip power = standard cell power consumption + macro power consumption +
pad power consumption.
What are the problems faced related to timing?
• Prelayout: Setup, Max transition, max capacitance
• Post layout: Hold
How did you resolve the setup and hold problem?
• Setup: upsize the cells
• Hold: insert buffers
Which layer do you prefer for clock routing and why?
• The layer just below the top two metal layers (global routing layers), because it has
less resistance and hence less RC delay.
If your design has a reset pin, will it affect the input pin, the output pin or both?
• Output pin.
During power analysis, if you are facing an IR drop problem, how do you avoid it?
• Increase power metal layer width.
• Go for higher metal layer.
• Spread macros or standard cells.
• Provide more straps.
Define the antenna problem. How did you resolve it?
• A long net can accumulate charge during manufacturing of the
device due to the ionisation process. If this net is connected to the gate of a MOSFET, the
charge can damage the gate dielectric and the gate may conduct, damaging
the MOSFET. This is the antenna problem.
• Decrease the length of the net by providing more vias and layer jumping.
• Insert antenna diode.
How do delays vary with different PVT conditions? Show the graph.
• P increase->delay increase
• P decrease->delay decrease
• V increase->delay decrease
• V decrease->delay increase
• T increase->delay increase
• T decrease->delay decrease
Explain the flow of physical design and inputs and outputs for each step in flow.
• Click here to see the flow diagram

What is cell delay and net delay?
• Gate delay
• Transistors within a gate take a finite time to switch. This means that a change on
the input of a gate takes a finite time to cause a change on the output.[Magma]
• Gate delay =function of(i/p transition time, Cnet+Cpin).
• Cell delay is also same as Gate delay.
• Cell delay
• For any gate it is measured between 50% of input transition to the corresponding
50% of output transition.
• Intrinsic delay
• Intrinsic delay is the delay internal to the gate. Input pin of the cell to output pin
of the cell.
• It is defined as the delay between an input and output pair of a cell when a near-zero
slew is applied to the input pin and the output sees no load.
It is predominantly caused by the internal capacitance associated with
the cell's transistors.
• This delay is largely independent of the size of the transistors forming the gate,
because increasing the transistor size also increases the internal capacitance.
• Net Delay (or wire delay)
• The difference between the time a signal is first applied to the net and the time it
reaches other devices connected to that net.
• It is due to the finite resistance and capacitance of the net. It is also known as wire
delay.
• Wire delay = fn(Rnet, Cnet + Cpin)
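As a rough illustration of the wire delay relation above, the sketch below uses the first-order Elmore estimate for a uniformly distributed RC wire driving a single pin load, Rnet * (Cnet / 2 + Cpin). This particular formula is an assumption chosen for illustration; the post only states that wire delay is a function of Rnet and Cnet + Cpin, and real tools use extracted parasitics. The function name and values are hypothetical.

```python
def elmore_wire_delay_ps(r_net_ohm, c_net_ff, c_pin_ff):
    """First-order Elmore delay of a distributed RC wire with a pin load.

    Units: ohm * fF = 1e-15 s, i.e. 1e-3 ps, hence the final scaling.
    """
    return r_net_ohm * (c_net_ff / 2.0 + c_pin_ff) * 1e-3


# Hypothetical net: 200 ohm, 50 fF of wire capacitance, 5 fF receiver pin.
delay = elmore_wire_delay_ps(r_net_ohm=200.0, c_net_ff=50.0, c_pin_ff=5.0)
print(f"estimated wire delay ~ {delay:.2f} ps")
```

Doubling the net length doubles both Rnet and Cnet, so the wire term grows roughly with the square of the length, which is why long nets are buffered.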
What are delay models and what is the difference between them?
• Linear Delay Model (LDM)
• Non Linear Delay Model (NLDM)
What is wire load model?
• A wire load model is a statistical model that estimates the R and C of a net, typically
from its fanout, before the actual routing is known.
Why higher metal layers are preferred for Vdd and Vss?
• Because it has less resistance and hence leads to less IR drop.
What is logic optimization and give some methods of logic optimization.
• Upsizing
• Downsizing
• Buffer insertion
• Buffer relocation
• Dummy buffer placement
What is the significance of negative slack?
• Negative slack ==> there is a setup violation ==> the design can fail
What is signal integrity? How it affects Timing?
• IR drop, Electro Migration (EM), Crosstalk, Ground bounce are signal integrity
issues.
• If IR drop is more ==> delay increases.
• Crosstalk ==> there can be setup as well as hold violations.
What is IR drop? How can it be avoided? How does it affect timing?
• There is a resistance associated with each metal layer. This resistance dissipates
power, causing a voltage drop, i.e. IR drop.
• If IR drop is more ==> delay increases.
What is EM and what are its effects?
• Due to high current flow in the metal, atoms of the metal can be displaced from their
original place. When this happens in large amounts the metal can open, or bulging of the
metal layer can happen. This effect is known as Electromigration.
• Effects: either a short or an open of the signal line or power line.
What are types of routing?
• Global Routing
• Track Assignment
• Detail Routing
What is latency? Give the types?
• Source Latency
• It is defined as "the delay from the clock origin
point to the clock definition point in the design".
• Delay from clock source to beginning of clock tree (i.e. clock definition point).
• The time a clock signal takes to propagate from its ideal waveform origin point to
the clock definition point in the design.
• Network latency
• It is also known as insertion delay. It is defined as "the delay
from the clock definition point to the clock pin of the register".
• The time clock signal (rise or fall) takes to propagate from the clock definition
point to a register clock pin.
What is track assignment?
• Second stage of the routing wherein particular metal tracks (or layers) are
assigned to the signal nets.
What is congestion?
• If the number of routing tracks available for routing is less than the required
tracks then it is known as congestion.
Whether congestion is related to placement or routing?

• Routing
What are clock trees?
• Distribution of clock from the clock source to the sync pin of the registers.
What are clock tree types?
• H tree, Balanced tree, X tree, Clustering tree, Fish bone
What is cloning and buffering?
• Cloning is a method of optimization that decreases the load of a heavily loaded
cell by replicating the cell.
• Buffering is a method of optimization that inserts buffers in high fanout
nets to decrease the delay.

17 comments:
Anil said...
With reference to the question "Which is more complicated when u have a 48
MHz and 500 MHz clock design?"
48 MHz will have a longer time period and 500 MHz will have a shorter time period, so
500 MHz will be more complicated.
How come 48 MHz is more complicated? Can anyone elaborate on this?
February 19, 2008 12:46 PM
Anil said...
Hi, thank you for making a blog with fabulous questions and answers on the back
end.
I have a doubt with reference to the question "calculating the power ring width".
From the tech file how do we get the maximum metal density of a layer?
Where is it available?
Also, where is the max electromigration value stored?
February 19, 2008 1:03 PM
Murali said...
Hi cvn,
Sorry for the typing mistake...You are absolutely right ... 48 and 500 numbers
wrongly exchanged...let me correct that !
Thanks for ur appreciation... participate in discussion and enjoy reading!
In tf check Layer definitions:
Layer "M1" {layerNumber=15

maskName="metal1"
........
........
maxCurrDensity = 6.583
.......
......
rgds
murali
Anil said...
Hi Murali,
Thank you very much for your nice clarification. I have some more doubts with
reference to the question "Define antenna problem and how did you resolve these
problem?". Can we insert a buffer (to divide the lengthy metal into two) to resolve
the antenna problem?
I mean when we insert a buffer we are inserting silicon (along with a little metal).
so it can also resolve the problem.
Are there any disadvantages with this kind of approach?

Thanks and Regards,
Anil
Murali said...
Hi anil,
First preference is metal layer jumping.
If the antenna problem is in a lower layer, jump to a higher layer and come back again.
If it is in a higher layer, well... you can't jump! Hence use a diode.
Last option, as you said, is to insert a buffer. But when you do that, higher metal layers have
to come down to a lower metal layer (M1 or M2) to connect to the pins of the buffer and go
back. Also there may not be enough space for buffer insertion. (Remember, we go for the
antenna check after routing.) This may lead to congestion and DRC
violations.
In the P&R tool you have all these options to fix the antenna problem.
rgds
murali
February 20, 2008 11:43 AM
Anil said...
Hi Murali,
While calculating the power consumption, we add up standar cell power, macro
power and pad power. How do we know power consumption of all these?
rgds,
Anil
February 23, 2008 10:55 AM
Murali said...
Please refer:
http://asic-soc.blogspot.com/2007/10/power-planning.html
rgds
murali
February 26, 2008 12:14 PM
savita said...
hi can any one help me understanding STA with example if you have any material
pls send it to bgsavita@gmail.com it would be great help
thanku
savita
March 20, 2008 4:14 PM
Murali said...
let me try....!
March 22, 2008 4:43 PM
Anonymous said...
Hi, Thanks for this nice material, looking forward for more interesting and deep
analysis of different stages of pnr.
rgds,
Amulya.
April 1, 2008 12:13 PM
padmavathi said...
can u give the details how to find die area if i know total area from dc
compiler.how to estimate die size.can u elaborate on this
May 16, 2008 9:57 AM
Murali said...
Total cell area is obtained from the area report from DC. Take the square root of this.
The obtained value is the approximate height and width of the core area.
The total area report provides the area considering pads also. Hence you can estimate
what extra area is required for the pads.
Thus you can estimate die size. Remember that this is just an estimate. Actual die
size can vary.
rgds
murali
May 16, 2008 12:33 PM
padma.p said...
Thank you for your reply. In DC we don't know how much area is needed for net routing.
You gave an example of a floor plan using SAMM (systolic array matrix multiplier).
Can you explain on what basis you estimated that?
May 22, 2008 11:53 AM
Murali said...
Since over the cell routing is very common in all EDA tools we need not worry
about area required for nets.
Required Inputs:
Technology used eg. 0.18 Micron etc

Total Number of standard cells
One standard cell area
Number of IO pads
Pad height
Core utilization allowed eg.0.7 (i.e.70 %)
Calculations:
Total standard cell area = no. of standard cells * one standard cell area
(Alternatively this can be directly obtained from the DC area report).
Core size = Standard cell area / Utilization (Assuming there are no hard macros; If
there are then add this also )
= X um * Y um.
Die area = [Core width + PG ring width + core offset + 2 * pad height ] *
[Core height + PG ring width + core offset + 2 * pad height ]
= A um * B um
= AB um^2
May 22, 2008 2:05 PM
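The die-size estimate in the reply above can be sketched as follows. The sketch follows the reply's formula as written and additionally assumes a square core (aspect ratio 1), so core width and height are both the square root of the core area. The function name and every input value are hypothetical.

```python
import math


def estimate_die_size_um(num_cells, cell_area_um2, utilization,
                         pg_ring_width_um, core_offset_um, pad_height_um,
                         macro_area_um2=0.0):
    """Return (core_side, die_side, die_area) for a square core.

    Core area = standard cell area / utilization (+ hard macro area, if any).
    Die side  = core side + PG ring width + core offset + 2 * pad height,
    exactly as the formula is stated in the comment above.
    """
    std_cell_area = num_cells * cell_area_um2
    core_area = std_cell_area / utilization + macro_area_um2
    core_side = math.sqrt(core_area)  # assumption: aspect ratio of 1
    die_side = core_side + pg_ring_width_um + core_offset_um + 2 * pad_height_um
    return core_side, die_side, die_side * die_side


# Hypothetical inputs: 49,000 cells of 10 um^2 each at 70% utilization.
core, die, area = estimate_die_size_um(
    num_cells=49_000, cell_area_um2=10.0, utilization=0.7,
    pg_ring_width_um=20.0, core_offset_um=10.0, pad_height_um=100.0)
print(f"core ~ {core:.1f} um per side, die ~ {die:.1f} um per side, "
      f"die area ~ {area:.0f} um^2")
```

As the reply notes, this is only an estimate; the actual die size can differ once the floorplan is refined.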
muju said...
If the full chip design is routed by 7 layer metal, why macros are designed using
5LM instead of using 7LM?
* Because top two metal layers are required for global routing in chip design. If
top metal layers are also used in block level it will create routing blockage.
What is meant by routing blockage here? Can anyone explain this term to me?
Reply me
Mujtaba Ahmed
May 30, 2008 11:35 AM
K.K. said...
Routing blockages are used to prevent metal layers from being routed in a particular chip
area.
June 6, 2008 9:08 PM
Mantu said...
Can someone explain to me what the difference is between set_input_delay and
set_driving_cell in DC?
June 22, 2009 10:18 PM
Placement
Complete placement flow is illustrated in Figure (1).

Figure (1) Placement flow [1]
Before the start of placement optimization all Wire Load Models (WLM) are removed.
Placement uses RC values from Virtual Route (VR) to calculate timing. VR is the
shortest Manhattan distance between two pins. VR RCs are more accurate than WLM
RCs.
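To make the virtual route idea concrete, the sketch below computes the Manhattan distance between two pins and scales it by per-unit-length resistance and capacitance. The per-micron values and function names are hypothetical placeholders; in a real flow they would come from the technology data.

```python
def manhattan_length_um(pin_a, pin_b):
    """Shortest Manhattan (rectilinear) distance between two (x, y) pins."""
    (xa, ya), (xb, yb) = pin_a, pin_b
    return abs(xa - xb) + abs(ya - yb)


def virtual_route_rc(pin_a, pin_b, r_per_um_ohm, c_per_um_ff):
    """Estimate net R (ohm) and C (fF) from the virtual route length."""
    length = manhattan_length_um(pin_a, pin_b)
    return length * r_per_um_ohm, length * c_per_um_ff


# Hypothetical pins at (10, 20) and (110, 70) um: 100 + 50 = 150 um of wire.
length = manhattan_length_um((10.0, 20.0), (110.0, 70.0))
r_net, c_net = virtual_route_rc((10.0, 20.0), (110.0, 70.0),
                                r_per_um_ohm=0.5, c_per_um_ff=0.2)
print(f"length = {length:.0f} um, R ~ {r_net:.1f} ohm, C ~ {c_net:.1f} fF")
```

Because the estimate uses the actual pin locations after placement, it tracks the final wire far better than a fanout-based wire load model.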
Placement is performed in four optimization phases:
1. Pre-placement optimization
2. In placement optimization
3. Post Placement Optimization (PPO) before clock tree synthesis (CTS)
4. PPO after CTS.
Pre-placement optimization optimizes the netlist before placement; HFNs are
collapsed. It can also downsize the cells.
In-placement optimization re-optimizes the logic based on VR. This can perform cell
sizing, cell moving, cell bypassing, net splitting, gate duplication, buffer insertion and area
recovery. Optimization iterates over setup fixing, incremental timing and
congestion-driven placement.
Post placement optimization before CTS performs netlist optimization with ideal
clocks. It can fix setup, hold, max trans/cap violations. It can do placement optimization
based on global routing. It redoes HFN synthesis.
Post placement optimization after CTS optimizes timing with propagated clock. It tries
to preserve clock skew.
Reference
[1] Astro User Guide, Version X-2005.09, September 2005
Timing Analysis in Physical Design
Timing analysis at the back end requires knowledge of all clock related constraints provided
at the front end. When the .sdc file is given to a physical design tool (like Astro), its first objective is to
remove all Wire Load Models (WLM) which are used for front end timing analysis. In
the backend there is no such thing as a wire load model. Actual delays are calculated based on
the RC values of the metal layers. All RC values like sidewall, junction and fringe
capacitances are stored in Table Look Up (TLU) format in the technology file.
In backend design a hold violation has higher priority than a setup violation because
a hold violation depends only on the data path and clock skew, not on the clock period. A setup
violation can be eliminated by slowing down the clock; a hold violation cannot.
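The point about clock period can be seen from the usual reg to reg slack equations. The sketch below is a simplified model, assuming skew is defined as capture clock arrival minus launch clock arrival and ignoring uncertainty and derating; the function names and numbers are illustrative only.

```python
def setup_slack(period, t_clk2q, t_comb_max, t_setup, skew=0.0):
    """Setup slack: data must arrive t_setup before the next capture edge."""
    return period + skew - (t_clk2q + t_comb_max + t_setup)


def hold_slack(t_clk2q, t_comb_min, t_hold, skew=0.0):
    """Hold slack: data must stay stable t_hold after the same clock edge.

    Note that the clock period does not appear in this expression.
    """
    return t_clk2q + t_comb_min - (t_hold + skew)


# Setup: a 5 ns clock fails, but slowing the clock to 6 ns fixes it (all in ns).
print(f"setup slack @5ns: {setup_slack(5.0, 0.5, 5.0, 0.5):+.2f} ns")
print(f"setup slack @6ns: {setup_slack(6.0, 0.5, 5.0, 0.5):+.2f} ns")

# Hold: the result is the same at any clock period, so only the data path
# (e.g. buffer insertion) or the skew can fix a negative value.
print(f"hold slack:       {hold_slack(0.1, 0.05, 0.2):+.2f} ns")
```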
The goal of placement and routing is always to meet the timing constraints provided by the .sdc file.
If latency and uncertainty are not set for the clock at the front end, then doing Clock
Tree Synthesis (CTS) at the backend is not possible.
Cell delay and net delay are stored as look up tables.
Cell delay consists of transition, timing arcs and capacitances, while net delay is
constituted by RCs only. Cell delays are available in libraries. Net delays are specified
in technology files. (In front end it is in WLM). Cell delays are
fixed. Net delays are not fixed and they depend on interconnect length and width. Net
delay parameters Rnet and Cnet are available as Table Look Up (TLU) provided by the
vendor.
There is one more set of files, TLU+, which accounts for Ultra Deep Sub Micron (UDSM)
effects. UDSM effects are not included in the TLU file. A mapping file maps TLU to TLU+.
UDSM effects like Optical Proximity Correction (OPC), Resolution Enhancement
Technology (RET) and Litho Compliance Check (LCC) are not taken care of by Astro.
For the placement stage, the virtual RC (based on Manhattan distance) Layout Parasitic
Extraction (LPE) mode is used. For CTS, real R and virtual C are used, and for routing
real RC is used.
The clock definition given to SAMM in the front end design flow, generated as an .sdc file by
Design Compiler, is given below. It includes clock frequency, rise and fall time, setup and
hold, skew and insertion delay.
#####################################################
# Created by Design Compiler write_sdc on Fri May 11 18:35:45 2007
#####################################################
create_clock -period 4.85 -waveform {0 2.425} [get_ports {clock}]
set_clock_transition -rise 0.04 [get_clocks {clock}]
set_clock_transition -fall 0.04 [get_clocks {clock}]
set_clock_uncertainty 0.485 -setup [get_clocks {clock}]
set_clock_uncertainty 0.27 -hold [get_clocks {clock}]
set_clock_latency 0.45 [get_clocks {clock}]
set_clock_latency -source 0.45 [get_clocks {clock}]
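A small sketch of how the numbers in the constraints above relate to each other, assuming the time unit is nanoseconds (the listing itself does not state the unit). The function name is hypothetical; the values are copied from the SDC listing.

```python
def clock_summary(period_ns, setup_uncertainty_ns, network_latency_ns,
                  source_latency_ns):
    """Derive a few headline figures from basic SDC clock constraints."""
    return {
        "frequency_mhz": 1000.0 / period_ns,
        "high_time_ns": period_ns / 2.0,  # matches -waveform {0 2.425}
        "setup_uncertainty_fraction": setup_uncertainty_ns / period_ns,
        "total_latency_ns": source_latency_ns + network_latency_ns,
        "usable_period_ns": period_ns - setup_uncertainty_ns,
    }


summary = clock_summary(period_ns=4.85, setup_uncertainty_ns=0.485,
                        network_latency_ns=0.45, source_latency_ns=0.45)
print(f"frequency         ~ {summary['frequency_mhz']:.2f} MHz")
print(f"setup uncertainty ~ {summary['setup_uncertainty_fraction']:.0%} of period")
print(f"total latency     ~ {summary['total_latency_ns']:.2f} ns")
```

Under that unit assumption, the 4.85 period corresponds to roughly 206 MHz, and the setup uncertainty is 10% of the period.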
Issues with Multi Height Cell Placement in Multi Vt Flow
Creating the reference libraries

Two reference libraries are required: one is the low Vt cell library and the other is the high
Vt cell library. These libraries have cells of two different heights. Reference libraries are
created as per the standard Synopsys flow. The library creation flow is given in Figure 1.
The Read_lib command is used for this purpose. As TF and LEF files are available, the TF+LEF
option is chosen for library creation. After the completion of the physical library
preparation steps, logical libraries are prepared.
Figure 1 Library preparation command window
Different Unit Tile Creation
The unit tile height of the lvt cells is 2.52 µm and that of the hvt cells is 1.96 µm. Hence two separate unit
tiles have to be created and added to the technology file. The hvt reference library
is created with the unit tile name “unit” and the lvt reference library with the unit tile
name “lvt_unit”. The “unit” tile is defined in the technology file by default, and the second unit tile,
“lvt_unit”, is added to the technology file.
Figure 2. Tile height specifications in library preparation
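Because neither tile height is a multiple of the other, rows of the two types only line up at multiples of a common height, which is one way to see why the two cell types cannot simply share rows. A small check of that arithmetic, using the tile heights quoted above (the variable names are illustrative):

```python
from fractions import Fraction

LVT_TILE_HEIGHT = Fraction(252, 100)  # 2.52 um
HVT_TILE_HEIGHT = Fraction(196, 100)  # 1.96 um

# Exact ratio of the two heights, reduced to lowest terms.
ratio = LVT_TILE_HEIGHT / HVT_TILE_HEIGHT

# The smallest stack of rows where both row types end at the same height.
lvt_rows = ratio.denominator
hvt_rows = ratio.numerator
common_height = lvt_rows * LVT_TILE_HEIGHT

print(f"height ratio lvt/hvt = {ratio}")
print(f"{lvt_rows} lvt rows = {hvt_rows} hvt rows = {float(common_height):.2f} um")
```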
Floor Planning
Core utilization is set to 70% and the aspect ratio is kept at 1. Rows are flipped, double
backed and made channel-less. No Top Design Format (TDF) file is selected, so the default
placement of the IO pins is used. Since the reference
libraries contain multi-height cells, separate placement rows have to be provided for the two different unit tiles. The core
area is divided into two separate unit tile sections, with the larger area given to the Hvt unit tile, as
shown in Figure 3.
Figure 3. Different unit tile placement
First, as per the default floor planning flow, rows are constructed with the “unit” tile. Then the rows
are deleted from part of the core area and new rows are inserted with the
“lvt_unit” tile. Improper allotment of area can give rise to congestion. Several trial-and-error
iterations were carried out to find the most suitable areas for the two different unit
tiles. In the resulting floor plan the tile utilization is 44.36% for “unit” and 65.53% for “lvt_unit”.
The PR summary report of the design after the floor planning stage is provided below.
PR Summary:
Number of Module Cells: 70449
Number of Pins: 368936
Number of IO Pins: 298
Number of Nets: 70858
Average Pins Per Net (Signal): 3.20281
Chip Utilization:
Total Standard Cell Area: 559367.77
Core Size: width 949.76, height 947.80; area 900182.53
Chip Size: width 999.76, height 998.64; area 998400.33
Cell/Core Ratio: 62.1394%
Cell/Chip Ratio: 56.0264%
Number of Cell Rows: 392
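The area and ratio figures in this report can be cross-checked from the raw numbers. The sketch below simply recomputes them from the values printed above; the variable names are illustrative.

```python
# Values copied from the PR summary above (areas in square microns).
std_cell_area = 559367.77
core_w, core_h = 949.76, 947.80
chip_w, chip_h = 999.76, 998.64

core_area = core_w * core_h
chip_area = chip_w * chip_h

# Ratios as percentages, matching the Cell/Core and Cell/Chip lines.
cell_core_ratio = 100.0 * std_cell_area / core_area
cell_chip_ratio = 100.0 * std_cell_area / chip_area

print(f"core area : {core_area:.2f}")
print(f"chip area : {chip_area:.2f}")
print(f"cell/core : {cell_core_ratio:.4f}%")
print(f"cell/chip : {cell_chip_ratio:.4f}%")
```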
Placement Issues with Different Tile Rows

Legal placement of the standard cells is handled automatically by the Astro tool, since two
separate placement areas are defined for the multi-height cells. The corresponding tile utilization
summary is provided below.
PR Summary:
[Tile Utilization]
============================================================
unit 257792 114353 44.36%
lvt_unit 1071872 702425 65.53%
============================================================
However, this method of placement generates unacceptable congestion around the junction
of the two separate unit tile sections. The congestion maps are shown in Figure 4.
Figure 4. Congestion
There are two congestion maps. The first corresponds to the floor plan with aspect ratio 1
and core utilization of 70%. It shows horizontal congestion above the limit of
one all over the core area, meaning that the design cannot be routed at all. Hence the core area has
to be increased by specifying its height and width. The second congestion map is generated
with a floor plan in which the core width and height are set to about 950 µm. Here, although
congestion is reduced over the core area, it is still a concern in the region where the two
different unit tiles meet, as marked by the circle. The design is nevertheless routable and can be
carried to the next stages of the place and route flow, provided timing is met in the subsequent
implementation steps.
Tighter timing constraints and more interrelated connections between standard cells around the
junction of the different unit tiles have led to more congestion. It is observed that
increasing the area is not a solution to this congestion. In addition to the congestion, the situation
worsens with the timing optimization effort by the tool. The timing target cannot be met.
The optimization process inserts several buffers around the junction area, and some of them
are placed illegally due to the lack of placement area.
Corresponding timing summary is provided below:
Timing/Optimization Information:
[TIMING]
          Setup                            Hold               Num     Num
Type      Slack    Num    Total    Target  Slack      Num     Trans   MaxCap  Time
======================================================================================
A.PRE     -3.491   3293   -3353.9  0.100   10000.000  0       8461    426     00:02:26
A.IPO     -0.487   928    -271.5   0.100   10000.000  0       1301    29      00:01:02
A.IPO     -0.454   1383   -312.8   0.100   10000.000  0       1765    36      00:01:57
A.PPO     -1.405   1607   -590.9   0.100   10000.000  0       2325    32      00:00:58
A.SETUP   -1.405   1517   -466.4   0.100   -0.168     6550    2221    31      00:04:10
======================================================================================
Since the timing cannot be met, the design has to be dropped from the subsequent
steps. Hence, in a multi Vt design flow, cell libraries with multiple cell heights are not preferred.
References
[1] Astro, User Guide, Version X-2005.09, September 2005

Backend (Physical Design) Interview Questions and Answers
• Below is the sequence of questions asked of a physical design engineer.
In which field are you interested?
• The answer to this question depends on your interest, your expertise and the requirement
for which you are being interviewed.
• In this case the candidate answered: low power design.
Can you talk about low power techniques?

How are low power and the latest 90nm/65nm technologies related?
• Refer here and browse for different low power techniques.
Do you know about the input vector controlled method of leakage reduction?
• The leakage current of a gate also depends on its inputs. Hence, find the set of
inputs which gives the least leakage. By applying this minimum leakage vector to a
circuit, it is possible to decrease the leakage current of the circuit when it is in
standby mode. This method is known as the input vector controlled method of
leakage reduction.
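A minimal sketch of the idea, assuming a toy two-gate circuit and a made-up leakage table: every primary-input vector is tried and the one with the lowest total leakage is kept. The leakage numbers, the circuit and the function names are all hypothetical; a real flow would use library leakage data and, for large circuits, a heuristic search instead of exhaustive enumeration.

```python
from itertools import product

# Hypothetical leakage (nA) of a 2-input NAND for each input state.
NAND2_LEAKAGE_NA = {(0, 0): 10.0, (0, 1): 35.0, (1, 0): 30.0, (1, 1): 80.0}


def circuit_leakage(a, b, c):
    """Total leakage of a toy circuit: y = NAND(NAND(a, b), c)."""
    n1 = 1 - (a & b)  # logic value at the output of the first NAND
    return NAND2_LEAKAGE_NA[(a, b)] + NAND2_LEAKAGE_NA[(n1, c)]


def minimum_leakage_vector():
    """Exhaustively search all primary-input vectors for the lowest leakage."""
    return min(product((0, 1), repeat=3), key=lambda v: circuit_leakage(*v))


best = minimum_leakage_vector()
print(best, circuit_leakage(*best))
```

The vector found this way would be forced onto the inputs whenever the block enters standby.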
How can you reduce dynamic power?
• -Reduce switching activity by designing good RTL
• -Clock gating
• -Architectural improvements
• -Reduce supply voltage
• -Use multiple voltage domains (multi Vdd)
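The effect of these techniques can be seen from the usual first-order switching power expression, P = alpha * C * Vdd^2 * f. The sketch below uses hypothetical values for the activity factor, switched capacitance, supply voltage and frequency to compare a lower supply voltage with a lower switching activity.

```python
def dynamic_power(alpha, c_load, vdd, freq):
    """First-order switching power in watts: alpha * C * Vdd^2 * f."""
    return alpha * c_load * vdd ** 2 * freq


# Hypothetical operating points, for illustration only.
base = dynamic_power(alpha=0.2, c_load=1e-9, vdd=1.2, freq=200e6)
lower_vdd = dynamic_power(alpha=0.2, c_load=1e-9, vdd=0.9, freq=200e6)
less_activity = dynamic_power(alpha=0.1, c_load=1e-9, vdd=1.2, freq=200e6)

print(f"baseline         : {base * 1e3:.1f} mW")
print(f"Vdd 1.2 -> 0.9 V : {lower_vdd / base:.4f} x baseline")
print(f"alpha 0.2 -> 0.1 : {less_activity / base:.4f} x baseline")
```

Supply voltage enters quadratically, so a 25% voltage reduction saves more power here than halving the frequency would, while clock gating and better RTL act through the activity factor.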
What are the vectors of dynamic power?
• Voltage and Current
How will you do power planning?
• Refer here for power planning.
If you have both IR drop and congestion how will you fix it?
• -Spread macros
• -Spread standard cells
• -Increase strap width
• -Increase number of straps
• -Use proper blockage
Are increasing the power line width and providing more straps the only
solutions to IR drop?
• -Spread macros
• -Spread standard cells
• -Use proper blockage
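The strap-related fixes mentioned in the answers above follow from Ohm's law applied to the power grid. The sketch below models a strap as a sheet-resistance rectangle (R = Rsheet * L / W) and treats several identical straps as resistors in parallel. The current, sheet resistance and dimensions are hypothetical, and a real grid needs a full IR drop analysis rather than this single-segment estimate.

```python
def strap_resistance(sheet_res, length_um, width_um):
    """Resistance (ohm) of one metal strap: R = Rsheet * L / W."""
    return sheet_res * length_um / width_um


def ir_drop(current_a, sheet_res, length_um, width_um, n_straps=1):
    """Voltage drop (V) across n identical straps sharing the current."""
    r_parallel = strap_resistance(sheet_res, length_um, width_um) / n_straps
    return current_a * r_parallel


# Hypothetical numbers: 5 mA through a 1000 um strap, 0.04 ohm/square metal.
base = ir_drop(0.005, sheet_res=0.04, length_um=1000, width_um=2)
wider = ir_drop(0.005, sheet_res=0.04, length_um=1000, width_um=4)
more_straps = ir_drop(0.005, sheet_res=0.04, length_um=1000, width_um=2, n_straps=2)

print(f"base: {base:.3f} V, wider: {wider:.3f} V, two straps: {more_straps:.3f} V")
```

Spreading macros and standard cells attacks the other factor in the product: it lowers the current drawn from any one region of the grid.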
In a reg to reg path, if you have a setup problem, where will you insert a buffer: near the
launching flop or the capture flop? Why?
• (Buffers are inserted to fix fanout violations and hence they reduce setup
violations; otherwise we try to fix setup violations by sizing cells. For now, just
assume that you must insert a buffer!)
• Near the capture flop.
• Because there may be other paths passing through or originating near the
launch flop. Hence buffer insertion there may affect other paths also. It may
improve all those paths or degrade them. If all those paths have violations, then you may
insert the buffer nearer to the launch flop, provided it improves slack.
How will you decide best floorplan?
• Refer here for floor planning.
What is the most challenging task you handled?

What is the most challenging job in P&R flow?
• -It may be power planning- because you found more IR drop

• -It may be low power target-because you had more dynamic and leakage power
• -It may be macro placement-because it had more connection with standard cells or
macros
• -It may be CTS-because you needed to handle multiple clocks and clock domain
crossings
• -It may be timing-because sizing cells in ECO flow is not meeting timing
• -It may be library preparation-because you found some inconsistency in libraries.
• -It may be DRC-because you faced thousands of violations
How will you synthesize clock tree?
• -Single clock-normal synthesis and optimization

• -Multiple clocks-Synthesize each clock separately
• -Multiple clocks with domain crossing-Synthesize each clock separately and
balance the skew
How many clocks were there in this project?
• -It is specific to your project

• -More the clocks more challenging !
How did you handle all those clocks?

• -Multiple clocks-->synthesize separately-->balance the skew-->optimize the clock
tree
Do they come from separate external sources or a PLL?
• -If they come from separate clock sources (i.e. asynchronous; from different pads or pins)
then balancing skew between these clock sources becomes challenging.
• -If they come from a PLL (i.e. synchronous) then skew balancing is comparatively easy.
Why are buffers used in a clock tree?
• To balance skew (i.e. the difference in clock arrival time between flops)
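Skew balancing can be pictured as padding the faster clock branches until every sink sees the clock at the same time. In the sketch below the sink names and arrival times are hypothetical, and the "extra delay" stands in for whatever buffers the CTS tool would actually insert.

```python
def clock_skew(arrival_ns):
    """Skew = latest minus earliest clock arrival among the sinks."""
    return max(arrival_ns.values()) - min(arrival_ns.values())


def balancing_delays(arrival_ns):
    """Extra delay per sink that would align every sink with the latest one."""
    latest = max(arrival_ns.values())
    return {sink: latest - t for sink, t in arrival_ns.items()}


# Hypothetical clock arrival times at three flops, in ns.
arrivals = {"ff1": 0.40, "ff2": 0.52, "ff3": 0.45}

print(f"skew before balancing: {clock_skew(arrivals):.2f} ns")
for sink, extra in balancing_delays(arrivals).items():
    print(f"{sink}: add {extra:.2f} ns of buffer delay")
```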
What is cross talk?
• Switching of the signal in one net can interfere with a neighbouring net due to cross
coupling capacitance. This effect is known as cross talk. Cross talk may lead to setup
or hold violations.
How can you avoid cross talk?
• -Double spacing=>more spacing=>less capacitance=>less cross talk

• -Multiple vias=>less resistance=>less RC delay
• -Shielding=> constant cross coupling capacitance =>known value of crosstalk
• -Buffer insertion=>boost the victim strength
How does shielding avoid the crosstalk problem? What exactly happens there?
• -High frequency noise (or glitch) is coupled to VSS (or VDD) since the shielded layers
are connected to either VDD or VSS.
• Coupling capacitance remains constant with VDD or VSS.
How does spacing help in reducing crosstalk noise?
• The gap is wider=>more spacing between two conductors=>cross coupling
capacitance is less=>less cross talk
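A first-order picture of why spacing helps: treating the facing sidewalls of two parallel wires as a parallel plate capacitor gives a coupling capacitance inversely proportional to the spacing, and a simple capacitive divider gives the glitch seen on a weakly driven victim. The sketch below uses these two simplifications with hypothetical wire dimensions and ground capacitance; real extraction also accounts for fringing fields and the victim driver strength.

```python
EPS0 = 8.854e-12   # F/m, permittivity of free space
EPS_R_SIO2 = 3.9   # relative permittivity of SiO2


def coupling_cap(thickness_m, length_m, spacing_m):
    """Parallel plate estimate of the sidewall coupling capacitance (F)."""
    return EPS0 * EPS_R_SIO2 * thickness_m * length_m / spacing_m


def glitch_peak(vdd, c_couple, c_ground):
    """Capacitive divider estimate of the glitch on a floating victim (V)."""
    return vdd * c_couple / (c_couple + c_ground)


# Hypothetical wire: 0.3 um thick, 500 um parallel run, 50 fF to ground.
c_single = coupling_cap(0.3e-6, 500e-6, spacing_m=0.2e-6)
c_double = coupling_cap(0.3e-6, 500e-6, spacing_m=0.4e-6)

for name, cc in (("single spacing", c_single), ("double spacing", c_double)):
    print(f"{name}: Cc = {cc * 1e15:.1f} fF, "
          f"glitch = {glitch_peak(1.2, cc, 50e-15):.3f} V")
```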
Why are double spacing and multiple vias used for the clock?
• Why the clock? Because it is the one signal which changes its state regularly and more often
than any other signal. If any other signal switches fast, then double spacing can
be used for it as well.
• Double spacing=>the gap is wider=>capacitance is less=>less cross talk
• Multiple vias=>resistances in parallel=>less resistance=>less RC delay
How can a buffer be used on the victim to avoid crosstalk?
• Buffers increase the victim's signal strength; buffers break the net length=>victims are
more tolerant to the coupled signal from the aggressor.
What is the difference between soft macro and hard macro?
• What is the difference between hard macro, firm macro and soft macro?
or
• What are IPs?
• Hard macro, firm macro and soft macro are all known as IP (Intellectual
property). They are optimized for power, area and performance. They can be
purchased and used in your ASIC or FPGA design implementation flow. A soft
macro is flexible for all types of ASIC implementation. A hard macro can be used in
a pure ASIC design flow, not in an FPGA flow. Before buying any IP it is very
important to evaluate the advantages and disadvantages of each type, hardware
compatibility such as I/O standards with your design blocks, and reusability for other
designs.
Soft macros
• Soft macros are in synthesizable RTL.
• Soft macros are more flexible than firm or hard macros.
• Soft macros are not specific to any manufacturing process.
• Soft macros have the disadvantage of being somewhat unpredictable in terms of
performance, timing, area, or power.
• Soft macros carry greater IP protection risks because RTL source code is more
portable and therefore, less easily protected than either a netlist or physical layout
data.
• From the physical design perspective, soft macro is any cell that has been placed
and routed in a placement and routing tool such as Astro. (This is the definition
given in Astro Rail user manual !)
• Soft macros are editable and can contain standard cells, hard macros, or other soft
macros.
Firm macros
• Firm macros are in netlist format.
• Firm macros are optimized for performance/area/power using a specific
fabrication technology.
• Firm macros are more flexible and portable than hard macros.
• Firm macros are more predictive of performance and area than soft macros.
Hard macro
• Hard macros are generally in the form of hardware IPs (or we term them
hardware IPs!).
• Hard macros are targeted for a specific IC manufacturing technology.
• Hard macros are block level designs which are silicon tested and proved.
• Hard macros have been optimized for power or area or timing.

• In physical design you can only access the pins of hard macros, unlike soft macros,
which can be manipulated in different ways.
• You have the freedom to move, rotate and flip hard macros, but you can't touch anything inside
them.
• A very common example of a hard macro is a memory. It can be any design which
carries a dedicated single functionality (in general); for example, it can be an MP4
decoder.
• Be aware of the features and characteristics of a hard macro before you use it in your
design. Other than power, timing and area, you should also know pin properties
like sync pin, I/O standards, etc.
• LEF, GDS2 file format allows easy usage of macros in different tools.
From the physical design (backend) perspective:
• Hard macro is a block that is generated in a methodology other than place and
route (i.e. using full custom design methodology) and is brought into the physical
design database (eg. Milkyway in Synopsys; Volcano in Magma) as a GDS2 file.
Synthesis and placement of macros in modern SoC designs are challenging. EDA tools
employ different algorithms to accomplish this task along with the power and area targets.
There are several research papers available on these subjects. Some of them are
listed below.
• "Hard Macro Placement in Complex SoC Design" (article and white paper, soccentral)
IEEE/University research papers
• "Local Search for Final Placement in VLSI Design"
• "Consistent Placement of Macro-Blocks Using Floorplanning and standard cell
placement"
• "A Timing-Driven Soft-Macro Placement And Resynthesis Method In Interaction
with Chip Floorplanning"


