… controlling leakage in
nanometer CMOS SOC’s
Agenda
Introduction
CLEAN project
Motivation
Leakage Physics
Variation
Leakage Control
Memory Leakage
Estimation
Conclusion
… controlling leakage in
nanometer CMOS SOC’s
<08/08/13> ISLPED Tutorial 2
CLEAN Project: Consortium
ST Microelectronics (I/F)
Infineon Technologies (D)
BullDAST (I)
ChipVision Design Systems (D)
Motivation: A Timeline of Problems
What Moore's law really said: for each technology generation, there is an optimal number of components per chip at which the cost per component is lowest.
Reason: up to a point, the price of a system does not depend on the number of components, so more components mean lower cost per component. Above a certain limit, however, the yield drops and the cost rises again exponentially.
Moore's observation: with each new technology, the optimal number of components and the optimal price per component both develop exponentially.
Motivation: A Timeline of Problems
[Figure: timeline 1960 to 2010 with three panels (transistors, frequency, power density). Technology nodes run from 10µ (1971) to 180n (2000/2002); transistor count rises from 10^4 T to 10^8 T, frequency from 1MHz to 1GHz, power density from 1W/cm² to 100W/cm².]
But the (unwanted) power density (power per unit area) also developed exponentially.
Transistor count and frequency have no limit (the more the better), but for power density there is a bound that we should certainly not pass
(hot plate ≈ 10W/cm², nuclear fuel rod surface ≈ 100W/cm², rocket nozzle ≈ 1000W/cm², sun's surface ≈ 6000W/cm²)
Motivation: A Timeline of Problems
[Figure: scaling timeline from 10µ (1971) to 32n (2010), divided into eras: NMOS, 5V CMOS ("the 5V world"), CFS ("happy scaling"), the era of "mean physics", and the era of "large atoms" with variations and unpredictable degradation]
Scaling has always been done for the sake of transistor count and frequency.
All other major technology changes have been made because of power:
Going from NMOS to CMOS → because of short circuit currents
Going from constant voltage scaling (CVS) to constant field scaling (CFS; field = voltage per size) → because of dynamic power
The end of the 'gigahertz race' → because of subthreshold leakage (CFS would need threshold voltage scaling, but subthreshold leakage does not allow any further scaling)
By now, we have left the era of mean physics (with various leakage effects) and entered the era of 'large atoms':
Today's technology is characterized by variations, and tomorrow's technology will face unpredictable variations and aging (= degradation)
Motivation: Materials used for CMOS
[Figure: periodic table of the elements, with the elements used in CMOS production highlighted by era (before the 90's, since the 90's, today) and the radioactive elements marked]
As motivation: the number of elements that we need for CMOS production.
In the 1980's, only 6 elements were needed to build CMOS systems (Si = semiconductor, O = isolation, B, P = doping, Al = interconnect, H = chemical passivation)
In the 1990's, only 8 new materials were added, all of them improving the system's behavior (Cu = better interconnect, Ge = higher mobility, Ta, W = better conductor → semiconductor interface, N = higher permittivity of the oxide, …)
Nowadays, we use nearly every element (short of the radioactive ones).
Motivation: Power Budget
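$$P_{avg} = \alpha f C_L V_{DD}^2 + I_{sc} V_{DD} + I_{leak} V_{DD}$$
$$I_{leak} = I_{sub} + I_{gate} + I_{jun} + \dots$$
$$E = \tfrac{1}{2} C_L V_{DD}^2$$
[Figure: CMOS inverter (P and N transistor) between VDD and ground driving the load C_L, with the currents I_sc, I_sub, I_gate and I_jun marked; the energy E is spent both when charging and when discharging C_L]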
It is a well-known fact that there are 3 sources of power consumption.
The dominating one, the dynamic power, results from the energy needed to charge and discharge the load capacitance at each transition.
There are two reasons why short circuit currents, which occur during a transition while both transistors are conducting, are usually not regarded:
first, for steep input edges, the load capacitance buffers the short circuit, limiting the power to some 5-10% of the dynamic power;
second, in terms of power estimation, if the input slope is known, the short circuit currents can be modeled by an equivalent additional load capacitance.
The third source, the leakage current I_leak, is the subject of this tutorial.
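The power budget above can be evaluated numerically. The following is a minimal sketch only: the function name and all numeric values (activity factor, load capacitance, currents) are illustrative assumptions and are not taken from the tutorial.

```python
def average_power(alpha, f_hz, c_load_f, vdd_v, i_sc_a, i_leak_a):
    """Return (dynamic, short circuit, leakage, total) power in watts.

    Implements P_avg = alpha*f*C_L*VDD^2 + I_sc*VDD + I_leak*VDD.
    """
    p_dyn = alpha * f_hz * c_load_f * vdd_v ** 2
    p_sc = i_sc_a * vdd_v
    p_leak = i_leak_a * vdd_v
    return p_dyn, p_sc, p_leak, p_dyn + p_sc + p_leak


# Assumed example values for a single gate (hypothetical, for illustration):
# 10% switching activity, 1 GHz clock, 1 fF load, 1 V supply.
p_dyn, p_sc, p_leak, p_total = average_power(
    alpha=0.1, f_hz=1e9, c_load_f=1e-15, vdd_v=1.0,
    i_sc_a=8e-9, i_leak_a=30e-9,
)
print(f"dynamic {p_dyn:.3e} W, short circuit {p_sc:.3e} W, "
      f"leakage {p_leak:.3e} W, total {p_total:.3e} W")
```

With these assumed numbers the short circuit share is 8% of the dynamic power, inside the 5-10% range quoted above.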
[Figure: cross-section of a transistor (gate, n+ source, n+ drain, bulk) with the leakage currents I_HCI, I_gate (I_gs, I_gb, I_gd), I_GISL, I_GIDL, I_subth, I_junction and I_punch marked. Source: Roy03]
If the channel is closed (transistor off): the drain is typically at high potential, all other terminals at low potential.
We observe:
- subthreshold currents through the channel
- GIDL currents from the gate/drain overlap region to the substrate
- tunneling through the gate oxide from the drain to the gate
- punchthrough from drain to source when the depletion regions of both pn-junctions touch each other
- pn-junction leakage as known from diodes
If the channel is open (transistor on), source and drain have the same potential. Typically source, drain and gate are at high and bulk at low potential. Now we see:
- tunneling from gate to bulk
- junction leakage from drain to bulk
When switching, we see hot carrier injection: electrons ballistically traverse the gate oxide to the gate and thus carry a current between gate and channel.
Motivation: ITRS 2006 Prognosis
[Figure: ITRS 2006 prognosis for 2004 to 2020 of gate tunneling current (Igate), subthreshold current (Ioff / Isub) and node capacitance charging current (Idyn) in [nA/µm] (left axis, 0 to 400) for bulk, ultra thin body (UTB) and dual gate (DG) devices, together with the technology node in [nm] (right axis, 0 to 120)]
Prognosis of the development of the 3 major power effects (for a single transistor, disregarding interconnect!).
For bulk CMOS, gate leakage will be skyrocketing. Thus the introduction of high-k devices (which entered the market in 2008) was mandatory. High-k devices will be discussed later. They can remove gate leakage for good.
From 2010 on, ultra thin body devices will be produced, keeping the subthreshold leakage under control.
The introduction of dual gate devices will drastically reduce the leakage at the cost of higher dynamic power (dual gate means dual capacitance to charge).
Motivation: Memory Driver
[Figure: prognosis for 2004 to 2018, log scale from 1,00E-15 to 1,00E-05, with four curves: SRAM Leakage [W], SRAM Dynamic [Ws], DRAM Leakage [W], DRAM Dynamic [Ws]. Annotations on the curves: 24W per MB, 15W per 100MHz, 26W per 100MHz, 80nW per MB]
Prognosis of the leakage and dynamic power development for SRAM and DRAM.
Dynamic power depends on the number of accesses, thus it is given in power per MHz
Static power depends on the number of transistors, thus it is given in power per MByte
Subthreshold Leakage
[Figure: transistor cross-section with leakage paths, and plot of the source-drain current Isd [A/µm] (1,0E-15 to 1,0E-02, log scale) vs. Vgate [V] (-0,5 to 1,5) for VDD = 0.1V and VDD = 2.0V, with Ioff and a slope of 80 mV/decade marked]
Threshold voltage is Vds dependent.
→ increasing Vds decreases Vth
→ drain induced barrier lowering (DIBL)
slope ∝ 1 + Cdm/Cox
1) Characteristic plot of source-drain current vs. gate voltage for 130nm
Below the threshold voltage: subthreshold slope of about 80mV/decade
Resulting: Ioff=10 fA/µm (weak inversion and junction leakage)
2) Trend for smaller nodes: Vth is decreasing, thus Ioff is rising exponentially
Resulting: Ioff=30nA/µm
Factor 100 per generation OR 10 per year
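The exponential dependence of Ioff on Vth follows directly from the subthreshold slope: every reduction of Vth by one slope value raises Ioff by one decade. A minimal sketch, assuming a simple model in which the current at Vgate = Vth is a fixed reference value (`i_at_vth`, a hypothetical parameter, as are all numbers below):

```python
def off_current(i_at_vth, vth_v, slope_v_per_decade):
    """Off current at Vgate = 0 for a given threshold voltage.

    Assumed model: below Vth the current drops by one decade for every
    `slope_v_per_decade` volts of gate voltage.
    """
    return i_at_vth * 10.0 ** (-vth_v / slope_v_per_decade)


# Assumed reference current of 100 nA/µm at Vgate = Vth, slope 80 mV/decade.
i_high_vth = off_current(1e-7, vth_v=0.40, slope_v_per_decade=0.08)
i_low_vth = off_current(1e-7, vth_v=0.24, slope_v_per_decade=0.08)
ratio = i_low_vth / i_high_vth
print(f"Ioff rises by a factor of {ratio:.1f} when Vth drops by 160 mV")
```

In this model, lowering Vth by two slope values (160 mV at 80 mV/decade) raises the off current by two decades.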
[Figure: transistor cross-section with leakage paths, subthreshold leakage highlighted]
$$I_{subth} \approx \alpha \frac{W}{L} T^2 e^{-\beta V_{th}/T}$$
if the gate is blocking and Vds is high
Gate Leakage
[Figure: transistor cross-section with leakage paths, gate leakage highlighted, and energy band diagram of gate, SiO2 and channel showing conduction band, valence band and bandgap, with Vox, Vg-Vd, Ψox and ΨS marked]
Energy diagram of a Poly-Si transistor:
Leakage current can be carried by tunneling electrons or holes.
direct tunneling: from gate to channel
Fowler-Nordheim: from gate to oxide
This diagram shows the textbook example of the tunneling effect at a rectangular potential barrier:
Although this is classically impossible, an electron on the left can pass a barrier higher than its energy with a certain probability.
The resulting current density depends exponentially on the thickness of the barrier, the energy difference between electron and barrier, and the electron's effective mass.
In the overlap region, the tunneling can carry current from the gate to source and drain directly.
The current tunneling to the channel will go to source, drain or bulk.
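The exponential dependence on barrier thickness can be illustrated with the standard WKB estimate for a rectangular barrier, T ≈ exp(-2·κ·d) with κ = sqrt(2·m*·(Φ - E))/ħ. This is a rough sketch, not the model used in the tutorial; the barrier height of 3.1 eV and the relative effective mass of 0.4 are assumed example values.

```python
import math

HBAR = 1.054571817e-34         # reduced Planck constant [J s]
M_E = 9.1093837015e-31         # electron rest mass [kg]
Q = 1.602176634e-19            # elementary charge [C], converts eV to J


def tunneling_probability(thickness_m, barrier_ev, m_eff_rel):
    """WKB estimate of the transmission through a rectangular barrier.

    barrier_ev is the energy difference between barrier top and electron,
    m_eff_rel the effective mass relative to the free electron mass.
    """
    kappa = math.sqrt(2.0 * m_eff_rel * M_E * barrier_ev * Q) / HBAR
    return math.exp(-2.0 * kappa * thickness_m)


# Assumed values: 3.1 eV barrier, effective mass 0.4 * m_e.
p_20 = tunneling_probability(2.0e-9, barrier_ev=3.1, m_eff_rel=0.4)
p_15 = tunneling_probability(1.5e-9, barrier_ev=3.1, m_eff_rel=0.4)
print(f"thinning the oxide from 2.0nm to 1.5nm raises the "
      f"tunneling probability by a factor of {p_15 / p_20:.0f}")
```

Under these assumptions, half a nanometer less oxide raises the tunneling probability by more than two orders of magnitude.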
Junction Leakage
[Figure: transistor cross-section with leakage paths, junction leakage highlighted, and band diagram of a reverse biased pn-junction (p side, n side, conduction band, valence band, barrier ΦBI+Vapp)]
As known from diodes: small currents are carried by
drifting carriers
electron-hole generation in junction
An electron in the p-side's valence band sees a potential as presented in the figure.
Electrons can pass this triangular barrier by tunneling through it. → Band-To-Band-Tunneling (BTBT)
As known from diodes, we have randomly drifting carriers and electron-hole generation at imperfections.
As soon as the built-in potential plus the applied reverse bias is higher than the band gap, electrons can tunnel directly from the p-side's valence band to the n-side's conduction band.
Junction Leakage
[Figure: transistor cross-section with leakage paths (gate at Gnd, drain at VDD, GIDL marked), and plot of Ijunc. [A/µm] (10^-12 to 10^-6) vs. Vd [V] (0 to 2) for Tox = 10Å, 15Å and 20Å at T = 0°C and T = 100°C, L=50nm, W=1µm; regions: drifting, el.-hole generation, GIDL (above VGIDL), BTBT; RBB and FBB marked]
[Figure: correlation diagram between the parameters Temp, L, Tox, VBB and VDD and the effects delay, sub-th leak, gate leak, pn-jun leak, dynamic and short circuit power; edges mark positive or negative correlation]
Variation: Variation Model
Each parameter p of each transistor can be described as
The actual value of one parameter (e.g. the oxide thickness) can vary due to several
reasons:
[Figure: inter-die variation; axis labels T and Vth]
On-current is a measure of the device speed (double on-current means double performance).
Here you can see the high correlation between on-current (speed) and off-current (energy consumption).
Later, we will exploit this correlation (the fact that faster devices also have higher leakage power).
Variation: Local Variation
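Random parameter variations → smaller scale than inter-die variations.
Example: 10^8 transistors, 100 critical paths (CPs), 20 gates on each CP
Power: the law of large numbers helps
$$P = \sum_{i=1}^{10^8} P_{transistor,i} = 10^8 \cdot P_{avg}$$
Delay: 20 is not a large number, and the system delay is the worst of 100 paths
$$D_{system} = \max_{j=1}^{100} D_{CP,j} = \max_{j=1}^{100} \left( \sum_{i=1}^{20} D_{transistor,i} \right)$$
[Figure: distributions of P_transistor and P_avg, and of D_transistor, D_CP and D_system, under intra-die variations]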
As presented in this slide, the variation of the input parameters changes the average value of the result.
If you just compute the average leakage by using the average length (and the averages of the other parameters), you will underestimate the average.
Reason: as the I(L) dependency is non-linear, the leakage of the average length is smaller than the average leakage of all lengths.
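This effect can be reproduced with a small Monte Carlo experiment. The sketch below assumes a purely illustrative exponential dependence of leakage on channel length and a Gaussian length distribution; the decay constant, mean and standard deviation are made-up values in arbitrary units.

```python
import math
import random


def leakage(length_nm, l0_nm=5.0):
    """Assumed convex model: leakage decays exponentially with length."""
    return math.exp(-length_nm / l0_nm)


random.seed(0)  # reproducible sample
lengths = [random.gauss(50.0, 3.0) for _ in range(10000)]

mean_length = sum(lengths) / len(lengths)
leak_of_mean = leakage(mean_length)                              # naive estimate
mean_of_leak = sum(leakage(l) for l in lengths) / len(lengths)   # true average

print(f"leakage at average length: {leak_of_mean:.3e}")
print(f"average leakage:           {mean_of_leak:.3e}")
print(f"underestimation factor:    {mean_of_leak / leak_of_mean:.2f}")
```

Because the assumed I(L) curve is convex, the average leakage is always larger than the leakage at the average length, whatever sample is drawn.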
layer thickness, OPC variations, interface roughness, CMP effects
Caused by: high temperature, high current density in interconnect
Results in: high temperature, high current density in interconnect → runaway problem, sudden death
Caused by: high VDD, very low VBB (strong reverse ABB)
Results in: oxide charge → Vth increase → oxide breakdown
R's, doping, Vbody, history
Poly-Si granularity, interface roughness, high-k morphology
Well proximity effect: Some of the ions scattered out of the edge of the photoresist are
implanted in the silicon surface near the mask edge, altering the threshold voltage of those
devices
Introduction
Leakage Physics
Variation
Leakage Control
Transistor engineering
Power gating
ABB / DVS
Memory Leakage
Estimation
Conclusion
Technology: Well Engineering
Halo doping & pocket implant (lateral NUD)
increase channel doping at the pn-junction
steeper pn-junction profile
less short channel effects
[Figure: doping profile; legend: n-doped, and light / medium / high p-doped]
In addition, further highly p-doped pockets can be implanted directly below the surface to shield S/D → pocket implant
Compare with the definition of the threshold voltage to see the impact of NUD
Technology: High-k
Oxide permittivity
gate capacitance is performance constrained
capacitance is kept high by reducing Tox
but gate oxide may run 'out of atoms'
high gate tunneling
$$C_{ox} \propto k_{SiO_2} \cdot \frac{L W}{T_{ox}}$$
idea: introduce high-k insulators
effective electrical Tox is small
$$T_{ox}^{eff} = T_{ox} \cdot \frac{k_{SiO_2}}{k_{material}}$$
[Figure: cross-section of a high-k transistor with labels depletion region, NiSi, TiN and HfO2; scale bar 20nm. Source: Intel]
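The effective electrical oxide thickness formula can be tried out directly. A minimal sketch: the relative permittivity of SiO2 (3.9) is the usual textbook value, while the permittivity of 20 for the high-k material and the 3nm physical thickness are assumed example numbers.

```python
K_SIO2 = 3.9  # relative permittivity of SiO2


def effective_tox(physical_tox_nm, k_material):
    """Electrically equivalent SiO2 thickness of a high-k layer:
    Tox_eff = Tox * k_SiO2 / k_material."""
    return physical_tox_nm * K_SIO2 / k_material


# Assumed example: a 3nm thick layer of a material with k = 20.
eot_nm = effective_tox(3.0, k_material=20.0)
print(f"a 3nm high-k layer behaves like {eot_nm:.3f}nm of SiO2")
```

The physically thick layer suppresses tunneling, while the gate capacitance corresponds to a much thinner SiO2 layer.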
Fully-depleted Ultra Thin Body SOI
[F.Andrieu et al VLSI 2006]
Ultra thin channel → depletion layer capacitance is lower → steeper subthreshold slope (see subthreshold section)
FinFET: The channel is constructed as a fin. Thus both sides can be controlled by the gate.
As the channel is controlled from both sides, it can be switched on or off more effectively
→ the slope cannot be reduced below the fundamental limit of 63mV/decade, but it
can be brought very close
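The relation between the capacitance ratio and the subthreshold slope can be sketched with the standard textbook expression S = ln(10)·(kT/q)·(1 + Cdm/Cox). The capacitance ratio of 0.35 is an assumed example value chosen to land near the 80mV/decade quoted for 130nm; it is not a number from the tutorial.

```python
import math

K_B_OVER_Q = 8.617333262e-5  # Boltzmann constant over elementary charge [V/K]


def subthreshold_slope_mv(temp_k, cdm_over_cox):
    """Subthreshold slope in mV/decade: ln(10) * kT/q * (1 + Cdm/Cox)."""
    return 1000.0 * math.log(10.0) * K_B_OVER_Q * temp_k * (1.0 + cdm_over_cox)


ideal_mv = subthreshold_slope_mv(300.0, cdm_over_cox=0.0)   # Cdm -> 0
bulk_mv = subthreshold_slope_mv(300.0, cdm_over_cox=0.35)   # assumed ratio
print(f"ideal limit at 300K: {ideal_mv:.1f} mV/decade")
print(f"with Cdm/Cox = 0.35: {bulk_mv:.1f} mV/decade")
```

In this formula the limit scales with temperature: it is about 59.5mV/decade at 300K, and the 63mV/decade quoted above would correspond to roughly 318K (this reading of the quoted value is an assumption).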
Technology: Strained Silicon
Strained silicon:
Electron speed:
70% faster
Circuit performance:
35% higher
http://www.research.ibm.com/resources/press/strainedsilicon/
[Figure: overview of transistor engineering options.
Igate: poly → metal gate
Cox: SiON → high k
Channel electrostatic control (Ion/Isub): Bulk → PDSOI → FDSOI → FinFET
Ion (µ↑): strained Si, strained Si on insulator (SSOI), SiGe on insulator (SGOI)]
Power Gating: Overview
Sleep transistor insertion
low leakage sleep transistors
high performance logic
implementation as PMOS or NMOS or both
decreases leakage in standby mode
BUT:
increases leakage in active mode
increases dynamic power
increases area demand
reduces switching speed
reduces the maximum stack depth
(if not compensated by higher VDD)
[Figure: logic block with sleep transistor; annotations: gate leakage, increased dynamic power, increased area]
To reduce the leakage while the circuit is not in use, we can put a high Vth sleep transistor in series with the circuit.
This results in very high leakage savings (subthreshold and gate leakage) while the power switches are blocking, but while the component is not idle, power gating has the disadvantages listed above.
Power Gating: Application
[Figure: standard cell row between VDD and Gnd. Source: J Frenkil - Sequence]
Power-gating each standard cell creates a huge area overhead (it needs a lot of gating transistors and a lot of control interconnect).
But to apply this, we do not have to modify our design tools (only the library).
Doing power-gating per standard cell row is much more efficient in terms of control and area overhead, but standard tools can no longer handle the design. Even new, fully custom tools will have to change their implementation significantly, as the delay of a cell is no longer a local problem (mapping is typically done before layout, but now timing depends on the layout, and mapping of course depends on the delay). This will result in huge overdesign or in much more complex design tools.
The power ring solution is only meant to add power gating to old IPs which cannot be changed.
Power Gating: Interfacing Problem
[Figure: two low-VTh CMOS gates G and H, each connected to its own virtual GND, which is switched to GND by Sleep_A' and Sleep_B' respectively]
Usually, not the entire system sleeps; some parts are always awake.
Interconnect from a sleeping region to an awake region may cause huge problems, because the voltage on this interconnect will drift very slowly from 0 to 1 (NMOS gating) or from 1 to 0 (PMOS gating) over 1000's of cycles, leading to extreme short circuit currents in the awake region.
Power Gating: State Retention
Problem
How to store states of registers within a gated region?
Solution
Balloon State Retention Flip Flop
[Figure: flip flop (D, Clk) with a state retention extension controlled by store/restore (SR) and sleep (slp) signals]
There is an additional interfacing problem: it is (as described) important not to produce short circuits through floating inputs. But additionally, when we power gate large regions (including registers), we will lose the state of these registers.
Balloon Flip Flops fulfill both tasks: they work as voltage anchors, and they can save the register state during sleep.
Balloon Flip Flops consist of 3 latches: two are fast and work as a Flip Flop when awake. The third one is slow and leakage saving, and stores the state during sleep. The black latch is not power gated.
Power Gating: Ground bounce
What’s a ground bounce and why does it happen?
The power grid is not an ideal conductor; it consists of
Resistive parts
Capacitive parts
Inductive parts
[Figure: circuit connected to VDD and GND through R, L and C elements of the power grid]
ABB / DVS: Background ABB 1/2
Adaptive body biasing (ABB)
also called VTCMOS (variable threshold CMOS)
In standard CMOS:
the body voltage for NMOS devices is at 0V
the body voltage for PMOS devices is at VDD
[Figure: transistor with a body bias voltage source VABB referenced to Gnd]
The idea of ABB is simple. Force the occurrence of the body effect (see subthreshold slides)
by applying a body potential which is not 0 for NMOS and not VDD for PMOS.
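The effect of the body potential on the threshold voltage can be sketched with the standard textbook body effect expression for an NMOS device, Vth = Vth0 + γ·(sqrt(2φF + VSB) - sqrt(2φF)). The parameter values below (Vth0, γ, 2φF) are assumed example numbers, not data from the tutorial, and the source is assumed to be at 0V so that VSB = -VBB.

```python
import math


def threshold_voltage(vth0_v, gamma, two_phi_f_v, vbb_v):
    """NMOS threshold voltage under body bias VBB (source at 0V).

    vbb_v < 0 is reverse body bias, vbb_v > 0 forward body bias.
    """
    vsb = -vbb_v
    return vth0_v + gamma * (math.sqrt(two_phi_f_v + vsb)
                             - math.sqrt(two_phi_f_v))


# Assumed device parameters: Vth0 = 0.3V, gamma = 0.2 sqrt(V), 2*phi_F = 0.8V.
vth_rbb = threshold_voltage(0.3, 0.2, 0.8, vbb_v=-0.5)  # reverse body bias
vth_zero = threshold_voltage(0.3, 0.2, 0.8, vbb_v=0.0)  # standard CMOS
vth_fbb = threshold_voltage(0.3, 0.2, 0.8, vbb_v=0.5)   # forward body bias
print(f"RBB: {vth_rbb:.3f}V, no bias: {vth_zero:.3f}V, FBB: {vth_fbb:.3f}V")
```

Reverse bias raises the threshold voltage (less leakage, slower), forward bias lowers it (faster, more leakage).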
ABB / DVS: Background ABB 2/2
Analysis: Vary VABB and measure VTH
Result for NMOS (and PMOS):
an ABB voltage higher than 0V (lower than VDD) will result in a lower effective |VTH|
Forward BB (FBB): increased component speed, increased subthreshold leakage
Effect weakens with smaller technology
Higher Tox decreases BTBT
Increases GIDL to substrate
[Figure: Spice simulation of the subthreshold current I_sub [nA/µm] (0.001 to 10000, log scale) vs. VBB (-0.5 to 0.5) for 180nm, 90nm, 70nm, 65nm and 32nm; reverse BB (RBB) on the left, forward BB (FBB) on the right]
[Figure: 70nm data. Source: Neau03]
The BTBT (band-to-band tunneling from drain to substrate) limits the applicability of RBB.
The minimum of the total current Itot = Isub + IBTBT is found analytically at dItot/dVBB = 0, i.e. dIsub/dVBB = - dIBTBT/dVBB.
The absolute value of the optimal VBB is technology dependent (here for different doping depths, from 17nm for the red curve to 20nm for the blue curve).
For the 70nm technology the maximum VBB was 0.1V, and for 50nm it can be below 0.05V, because the effect of BTBT rises with each technology generation.
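The optimality condition dIsub/dVBB = - dIBTBT/dVBB can be checked on a toy model. The sketch below assumes that both currents depend exponentially on VBB (Isub grows with VBB, IBTBT grows with reverse bias); the prefactors and exponents are made-up illustrative values, and the function names are hypothetical.

```python
import math


def total_leakage(vbb, i_sub0, a, i_btbt0, b):
    """Assumed model: Isub = i_sub0*exp(a*vbb), IBTBT = i_btbt0*exp(-b*vbb)."""
    return i_sub0 * math.exp(a * vbb) + i_btbt0 * math.exp(-b * vbb)


def optimal_vbb(i_sub0, a, i_btbt0, b):
    """Solve a*Isub(vbb) = b*IBTBT(vbb), i.e. dItot/dVBB = 0."""
    return math.log(b * i_btbt0 / (a * i_sub0)) / (a + b)


params = dict(i_sub0=100.0, a=8.0, i_btbt0=1.0, b=4.0)  # assumed values
v_opt = optimal_vbb(**params)

# Cross-check the closed form against a brute force scan from -1V to 0V.
grid = [-1.0 + 0.001 * k for k in range(1001)]
v_scan = min(grid, key=lambda v: total_leakage(v, **params))
print(f"analytic optimum {v_opt:.4f}V, scanned optimum {v_scan:.3f}V")
```

Making the reverse bias stronger than this optimum increases the total leakage again, because the BTBT term then dominates.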
ABB / DVS: Yield Optimization
DVS vs. ABB
power and delay limits determine yield
ABB / DVS used to move systems closer to the limits
design time → finding the best working point VBB / VDD
run time tuning → updating the working point VBB / VDD
[Figure: distribution of dies over power P[W] and frequency f[s^-1] with min and max limits, for the settings VDD=0.8V / VBB=0.5V and VDD=0.88V / VBB=0.0V]
[Figure: block diagram of the tuning circuit (CLK, REG, XOR). Source: Narendra05]
The body bias can be chosen at three points in time:
Fixed (the ABB is chosen at design time): this is only done for FBB and will be explained later
Once (the optimal VBB is chosen while testing the chip): this reduces variance
Adaptively (at run time): this can also reduce variability due to temperature and VDD noise. A replica of the critical path is measured, and the VBB is always kept high enough to ensure the timing of the critical path.
Power Gating vs. ABB comparison
ABB | Power gating
- Low benefit | + High benefit
+ State preserving | - Losing state
+ Compensates Vth fluctuation | - Does not compensate
+ Additional hardware only once, additional interconnect only | - Additional transistor in series: slower, larger, lower yield
(+) Only needs modified library: conventional design tools | (-) For per row gating: timing is hard to validate
Cell Design: Leakage in SRAM cells
Subthreshold leakage occurs when:
Transistor is off
Vds is non-zero
Cell state (stored value) determines exactly which transistors “leak”
[Figure: SRAM cell; labels: delay, leak @ 0, leak @ 1]
Navid Azizi, Farid N. Najm, and Andreas Moshovos, Low-Leakage Asymmetric-Cell SRAM, IEEE Transactions on VLSI, pp. 701-715, Aug. 03
[Kaxiras et al.]
Counts fixed time
[Powell et al.]
Dynamically estimates and adapts to the required instruction-cache size and turns off a selected set of cache lines.
Coarse-grain: entire portions of the cache are turned on/off
The number of lines in sleep mode is controlled by periodically examining the miss rate:
Miss rate < pre-determined value → put another part of the cache to sleep
Miss rate > acceptable value → activate more lines
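The miss rate controlled resizing can be sketched as a simple periodic control step. This is a hypothetical illustration of the rule described above, not the implementation of Powell et al.; the function name, thresholds, step size and miss rate trace are all assumed.

```python
def resize_cache(active_lines, miss_rate, low_threshold, high_threshold,
                 step, min_lines, max_lines):
    """One periodic control step: return the new number of active lines."""
    if miss_rate < low_threshold:
        # Cache is larger than needed: put another part to sleep.
        return max(min_lines, active_lines - step)
    if miss_rate > high_threshold:
        # Too many misses: activate more lines.
        return min(max_lines, active_lines + step)
    return active_lines


# Assumed example: 1024 lines, resized in steps of 128 lines.
active = 1024
history = []
for observed_miss_rate in [0.001, 0.002, 0.004, 0.030, 0.010]:
    active = resize_cache(active, observed_miss_rate,
                          low_threshold=0.005, high_threshold=0.02,
                          step=128, min_lines=128, max_lines=1024)
    history.append(active)
print(history)
```

The gap between the two thresholds acts as a hysteresis band, so the cache size stays put while the miss rate is acceptable.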
Memory Architectures: Drowsy memories
When accessing a drowsy line, its state is lost
Gating circuitry prevents access to drowsy lines
Replace SRAM cell
-> slight performance reduction
-> high leakage savings
[Figure: memory organization with memory arrays (mem), word line drivers (wl driver) and sense amps]
Estimation: What leakage is modelled?
[Figure: transistor cross-section with 13 numbered leakage paths through oxide (ox) and through p++, p+, n and n+ regions, modelled as functions of Vds, Vgs, Vgd, Vbs, Lch, Tox, T, …]
Estimation: Leakage and delay under PTV variation
[Figure: plots of log(leakage) vs. frequency under PTV variation; recoverable labels: nominal, local variation, deterministic variation; examples: all wafer Tox, dopant distribution, well proximity]
Estimation: Why RTL modelling?
[Figure: RTL modelling flow from a behavioural SystemC description via synthesis to RT Verilog + UPF; simulation data, floorplan, temperature and voltage feed the estimation of Edyn, area, Tcrit and Pleak; parametric yield evaluation against ftarget with inter-die variation; feedback paths for modification and adaptive solution]
Estimation: The leakage model
$$\ln\left(\frac{I_{sub}}{W}\right) = \ln\left(\frac{\chi}{L}\right) - \beta V_{TH}(L, T_{ox}, N_{ch})$$
≈ α0 + α1(…) + α2(…) + α3(…), with terms in L, T and N; coefficients determined by characterization
$$I_{component}(Temp, V_{DD}, V_{BB}, L_{ch}, T_{ox}, N_{dep}) = \beta_0 I_{sub}^{NMOS} + \gamma_0 I_{sub}^{PMOS} + \beta_1 I_{gate}^{NMOS} + \gamma_1 I_{gate}^{PMOS} + \beta_2 I_{stack}^{NMOS} + \gamma_2 I_{stack}^{PMOS}$$
weights β, γ: determined by characterization
[Figure: RT component with inputs a0, a1, a2, … and b0, b1, b2 (example a·b) between VDD and 0 with VBB; annotations: Temp, VDD, VBB dependences; process parameter dependences]
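The component level sum can be sketched as a weighted combination of reference transistor currents. All currents and weights below are invented example numbers; in the flow described above, the weights would come from characterization.

```python
def component_leakage(nmos_currents, pmos_currents, beta, gamma):
    """I_component = sum_k beta_k*I_k(NMOS) + gamma_k*I_k(PMOS).

    Each sequence is ordered as (subthreshold, gate, stack).
    """
    nmos_part = sum(b * i for b, i in zip(beta, nmos_currents))
    pmos_part = sum(g * i for g, i in zip(gamma, pmos_currents))
    return nmos_part + pmos_part


# Assumed reference currents in amperes: (I_sub, I_gate, I_stack).
i_nmos = (30e-9, 5e-9, 3e-9)
i_pmos = (10e-9, 1e-9, 1e-9)
# Assumed weights of one RT component (hypothetical values).
beta = (100.0, 100.0, 50.0)
gamma = (100.0, 100.0, 50.0)

i_component = component_leakage(i_nmos, i_pmos, beta, gamma)
print(f"component leakage: {i_component:.3e} A")
```

The reference currents carry the dependence on temperature, voltages and process parameters, so the weights only have to be determined once per component.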
Estimation: The delay model
[Figure: characterization hierarchy of the delay model: inverter model (edge, inverter, load CL) → gate model (&, ≥1, 1 gates) → RTL model]
Estimation: Process Variation Engine
[Figure: process variation engine. Inter-die variation data (µL, µT, µN) is sampled by Monte Carlo; intra-die variation data is pre-integrated into the model; typical sets are selected. PowerOpt turns the SystemC description into an RTL netlist, an RTL floorplan and dynamic power figures. Steps are numbered (1) to (5).]
The tool reads a SystemC description and produces an RTL netlist, a dynamic power prediction and an RT level floorplan.
This information is used to compute a temperature and VDD map and to decide on an optimization strategy. Now all information except for the variation is available. Variation data comes from the variation engine (see later).
Model strengths:
accurate leakage (std dev<10%) and delay (std dev<5%) model
inter-die variation is accurately handled even for correlated
parameters
resulting models are fast, compact and accurate
Limitations:
only valid for a small die area, where PTV gradients are negligible
(thus each RT component needs its own model)
Under development:
expectation value is good for intra-die effect on leakage
but SSTA is needed for intra-die delay modeling
Recent Problem: Subthreshold Slope
slope is limited
$$S \ge 63\,\mathrm{mV/dec}$$
[K Roy ISLPED '06]
… is channel leakage
$$I_{subth} \approx \alpha \frac{W}{L} T^2 e^{-\beta V_{TH}/(S \cdot T)}, \qquad S = 1 + \frac{C_{dm}}{C_{ox}}$$
… is gate leakage
$$I_{gate} \propto L W V_{DD}^2$$
[Figure: BPTM simulation of the source-drain current (1,0E-15 to 1,0E-02, log scale) vs. Vgate [V] (-0,5 to 1,5) for 45nm and 130nm at VDD = 0.1V and VDD = 2.0V, with DIBL, GIDL and slope marked]
global variation
threshold voltage of all transistors is deviating
influence
Vth↓ = performance↑ & leakage↑
Vth↑ = performance↓ & leakage↓
[Figure: inter-die variation; axis labels T and Vth. Source: CLEAN Workshop]
Global variation: each parameter that can vary in production and that has an influence on the
threshold voltage is contributing to the variation of subthreshold leakage and performance.
Recent Problem: Variation
Different sources of variation occur at different production steps and thus have a different range of influence (entire lot, one wafer, single die, or each transistor individually).
In fact, there are also intermediate effects (for instance, resulting in variation gradients over the die).
Recent Problem: Running out of Atoms
doping variation
45nm: statistical local fluctuation
32nm and below: random discrete dopants
[D Frank]
In 45nm, the random dopant effect will result in high statistical fluctuation of the threshold voltage (only 100 dopants are not always distributed smoothly).
In 32nm, each single doping atom may have significant influence on the transistor.
Recent Problem: Running out of Atoms
[A Asenov]
Each material we use in production shows a certain granularity (larger than the size of a single atom or molecule). The metal clusters, for instance, will lead to a line edge roughness of approximately 5nm for each metal line. For a 90nm interconnect, this effect may only be additional noise, but for a 22nm line, a ±5nm variation is significant.
Recent Problem: Running out of Atoms
[Chandu Visweswariah]
oxide thickness
Atomic dimensions already reached
Tox<2nm in 65nm node
SiO2 has 0.3nm diameter
less than 6 layers
The oxide thickness is already the smallest structure we have to control. In 65nm, the typical thickness is 1.5nm (which is 5 molecule layers). With high-k, this size is becoming larger again (approx 10-15nm)
Recent problems: Degradation
electromigration
Caused by: high temp, high current density
Results in: high temp, high current density
Æ runaway problem, sudden death
Degradation (= aging of the components) is the next problem that we will have to face.
Electromigration: at high temperatures and current densities, the metal ions follow the momentum of the electrons.
Problem: at the thinnest position, the current density and the temperature are both highest. The thinner the connection, the faster its material vanishes.
→ Drastic end in a runaway situation (see Sudden Northwood Death Syndrome)
Hot electron degradation (there are 4 different effects causing hot electron degradation).
Electrons can enter the oxide and are trapped there pre-charging the channel, and thus
influencing the threshold voltage.
Systems can recover from hot electron degradation. The change is not permanent.
oxide breakdown
Caused by: oxide charge, dirt in oxide, radiation, mesh errors
Results in: sudden runaway → oxide failure
Due to charges, dirt, mesh errors or radiation impact in the oxide, the current can punch through the oxide, resulting in a permanent device failure.
The passivation between channel and oxide (hydrogen atoms) can vanish (attracted by the gate).
The effect only occurs in PMOS devices while the transistor is conducting (bulk at VDD and gate at ground), i.e. while the gate voltage is negative relative to the bulk.
Recent Transistor Design
Since 2002, we have had well engineering to control the dependence of the threshold voltage on other parameters (well engineering is still being developed further).
Recent Transistor Design
Since 2004, we can control (increase) the mobility of the majority carriers (electrons or holes)
to increase the performance without threshold voltage reduction
Recent Transistor Design
Since 2008, also high-k technology is available (e.g. Intel Penryn) to control gate leakage
Future Transistor Design
FinFET
multi-gated transistors
[Figure: FinFET with source, drain, fin and surrounding gate. Source: C.Jahan]
Multi-gate devices such as FinFET will combine SOI, high-k and an increased electrical
control over the channel.
With FinFETs the subthreshold slope will be brought very close to the fundamental limit
enabling further threshold voltage and thus also frequency scaling
Future Transistor Design
multi-gated transistors
increasing electrostatic control
approach the 63mV limit
UTB FD SOI
fully depleted ultra thin body SOI
ultra-thin: SOI layer thinner than depletion width
channel is undoped → no wells
[Figure: comparison of a bulk and an FDSOI transistor (gate, gate oxide, depletion region, buried oxide), and cross-section of a UTB FD SOI device with metal gate, high-k dielectrics (HfO2), poly and undoped thin channel; scale bar 20nm. Source: F.Andrieu]
Fully depleted ultra thin body devices will enable bulk CMOS at least down to the 32nm node.
The undoped channels avoid random dopant variation and increase the integration density, as
no wells are needed.
Gate Level Techniques: Power Gating
easy concept
use fast, low Vth transistors for logic
use low leakage, high Vth transistors to cut leakage path on idle
[Figure: SCCMOS power gating circuit with sleep signal; annotations: Vslp<0 if possible, sleep FBB to reduce W, NMOS gating also possible]
Power gating in principle is very easy: the logic is implemented in fast, but high leakage devices. If the component is idle, a slow, but low leakage gating transistor in series with the gate cuts the leakage current paths.
Gate Level Techniques: Power Gating
easy concept
use fast, low Vth transistors for logic
use low leakage, high Vth transistors to cut leakage path on idle
technical problems
gating transistor sizing: supply noise and wakeup delay
floating outputs
row- or cell based gating
state retention
sequentials are losing state at power down
voltage anchors, balloon latches, dual VDD
[Figure: current I over time t during wakeup (Imax, Twakeup) for a large and a small sleep transistor]
[Figure: voltage U[V] between VDD and Gnd over time t; sleep transistor per gate vs. per row]
[Figure: logic with sleep mode connected through a voltage anchor (slp) to logic w/o sleep mode]
[Figure: state retention flip flop with store/restore (SR), slp, Clk, D and Q; high Vth latch not gated, low Vth latches gated]
When using power gating, the interface between gated and non-gated parts has to be stabilized.
When going to sleep, the state of the sequentials has to be stored, and then restored when waking up.
For both problems, design for power gating at the behavioural level can avoid or at least reduce the overhead for interfacing and state retention.
System Level: adaptive techniques
on-chip binning
exploit correlation of leakage and performance
adaptively slow down the system to save leakage
[Figure: energy [nWs] (10 to 1000, log scale) vs. delay [ns] (4 to 6)]
Above the gate level, the most promising technique is adaptive leakage management.
The idea is always to either speed up slow systems at the cost of leakage power or to slow them down, reducing the leakage.
The plot here shows a system's power and performance distribution before… [see next page]
System Level: adaptive techniques
on-chip binning
exploit correlation of leakage and performance
adaptively slow down the system to save leakage
techniques
AVS
ABB
ABB+AVS
[Meijer04]
implementation
design time
test time
boot or run time
[Narendra05]
The available techniques are adaptive voltage scaling, adaptive body biasing, or a combination of both.
[Figure: leakage vs. frequency. Source: Meijer04]
leakage currents
subthreshold leakage will be under control
gate & junction leakage can be avoided
global variations
sensitivity can be reduced
will be controlled by adaptive techniques
local variations
some will be bypassed
some will be made deterministic by better models
LER will remain critical
degradation
will be the next problem to solve
Final conclusions:
The leakage currents will be kept under control, but we have to sacrifice all frequency scaling.
From the technology side, transistors will be made less sensitive to parameter variation, and the remaining global variation can be reduced by adaptive techniques.
Some statistical variations will be bypassed (e.g. the random dopant effect). Some effects are not really statistical (such as the well proximity effect or OPC related disturbances). With better estimation software, they may be made deterministic and can then be avoided by design (if the well proximity effect always increases a certain transistor's threshold voltage, we can reduce the threshold by design, and the final device will have exactly the desired properties).
But the line edge roughness in particular will remain a critical source of statistical variation.