
On Leakage Currents

Sources and Reduction for Transistors,


Gates, Memories and Digital Systems

Wolfgang Nebel, Domenik Helms

… controlling leakage in
nanometer CMOS SOC’s
Agenda

ƒ Introduction
ƒ CLEAN project
ƒ Motivation
ƒ Leakage Physics
ƒ Variation
ƒ Leakage Control
ƒ Memory Leakage
ƒ Estimation
ƒ Conclusion

<08/08/13> ISLPED Tutorial 2
CLEAN Project: Consortium
ƒ ST Microelectronics (I/F)
ƒ Infineon Technologies (D)

ƒ BullDAST (I)
ƒ ChipVision Design Systems (D)

ƒ University of Catalunya (E)


ƒ Politecnico di Torino (I)
ƒ OFFIS (D)
ƒ WUT (P)
ƒ LETI (F)

ƒ EDAC (D), DTU (DK), COREP (I), BME (H)


IDM = integrated device manufacturer


SME = small and medium sized enterprise
CLEAN Project : Challenges

ƒ Power consumption of nanoelectronics is a limiting factor

ƒ Leakage power will soon become the dominating part

ƒ Nanometric processes have large process variations


ƒ leakage & performance variation
ƒ severe impact on yield

ƒ EDA support for low leakage today is extremely poor



CLEAN Project : Goals and Keys

ƒ Attack leakage for 65nm and 45nm CMOS with innovative tools and circuit solutions
  ƒ SoC-level
  ƒ micro-architectural-level
  ƒ RT-level
  ƒ cell-library level
  ƒ back-end

ƒ Develop prototype EDA tools to implement missing parts of today's flows

ƒ Integrate and field-test those tools into industrial flows



CLEAN Project: WP1 Modelling and Design

ƒ Device level leakage models


ƒ analysis of process variability and temperature on 65nm & 45nm
ƒ Circuit & gate level leakage models

ƒ RT block & IP component macro modelling

ƒ Develop circuit structures for


ƒ sleep-transistor cells
ƒ voltage anchor cells
ƒ ABB support
ƒ Develop low-leakage memory/cache architectures



CLEAN Project: WP2 Synthesis and Optimization

ƒ Physical synthesis for distributed sleep transistor insertion

ƒ RTL leakage management based on voltage anchoring

ƒ Tools for architectural level ABB-islands

ƒ Architectural level synthesis for power gating



CLEAN Project: WP3

Methodologies, Flows and Tools Specification and Integration

ƒ Definition of EDA requirements for the leakage optimization

ƒ Definition of the flow requirements

ƒ Integration of the WP2 methodologies into


ƒ PowerOpt (alias ChipVision ORINOCO 2.0)
ƒ PowerChecker (Bulldast EDA Toolsuite)



CLEAN Project: Prototype Tools Integration

Front-End – ChipVision PowerOpt

Proven Seamless FE-BE Integration


[Figure: design flow from the design description (Java, C++, SystemC) down to physical synthesis.
Design steps: HW/SW Partitioning; Architectural Exploration and (IP) Block Selection; Block Design/Synthesis; Block Integration and Communication Design Optimization; RTL Planning/Optimization (DataPath, Memory, Buses, Clock, Power Distribution, Test Structures); Physical Synthesis.
Power tools alongside: System Level Power Estimator; Behavioral Power Optimization; Leakage Modeling and Optimization; RTL Power Optimization; Gate/Transistor-Level Power Optimization.]

BullDAST PowerChecker – Back-End



Agenda

ƒ Introduction
ƒ CLEAN project
ƒ Motivation
ƒ Leakage Physics
ƒ Variation
ƒ Leakage Control
ƒ Memory Leakage
ƒ Estimation
ƒ Conclusion

Motivation: A Timeline of Problems


What Moore's law really said: for each technology generation there is an optimal number of components per chip at which the cost per component is lowest.
Reason: to first order, the price of a chip does not depend on the number of components, so more components mean lower cost per component. Above a certain limit, however, the yield drops and the cost per component rises exponentially again.

Moore's observation: with each new technology, the optimal number of components and the lowest price per component both develop exponentially.
Motivation: A Timeline of Problems

[Figure: Intel processor family from 1971 (10µ) to 2002 (180n): clock frequency (1MHz to 1GHz) plotted against transistor count (10^4 to 10^8 transistors), both on logarithmic axes.]

Moore's law for the Intel processor family:

The clock frequency also grew exponentially (at least until 2002).


Motivation: A Timeline of Problems

[Figure: power density (1W/cm² to 1000W/cm², logarithmic) of the same processors versus year (1960 to 2010), from 10µ (1971) to 180n (2002).]

But the (unwanted) power density (power per unit area) also grew exponentially.

Transistor count and frequency have no upper limit (the more the better), but for power density there is a bound that we should certainly not exceed

(hot plate ≈ 10W/cm², nuclear fuel rod surface ≈ 100W/cm², rocket nozzle ≈ 1000W/cm²,
sun's surface ≈ 6000W/cm²).
Motivation: A Timeline of Problems

[Figure: timeline of technology nodes from 1960 to 2010 (10µ in 1971 to 32n in 2010), divided into the eras NMOS, 5V CMOS ("the 5V world"), CFS ("happy scaling"), variations, and unpredictable variations & degradation; the physical regime changes from "mean physics" to "large atoms".]

Scaling has always been done for the sake of transistor count and frequency.
All other major technology changes were made because of power:
- Going from NMOS to CMOS → because of short-circuit currents
- Going from constant voltage scaling (CVS) to constant field scaling (CFS; field = voltage per size) → because of dynamic power
- The end of the 'gigahertz race' → because of subthreshold leakage (CFS would need threshold voltage scaling, but subthreshold leakage does not allow any further scaling)
By now, we have left the era of mean physics (with various leakage effects) and entered the era of 'large atoms':
today's technology is characterized by variations, and tomorrow's technology will face unpredictable variations and aging (= degradation).
Motivation: Materials used for CMOS

[Figure: periodic table in which the elements are color-coded by the time they entered CMOS production (before the 90's, since the 90's, today); radioactive elements are marked separately.]

As motivation: the number of elements needed for CMOS production.
In the 1980s, only 6 elements were needed to build CMOS systems (Si = semiconductor, O = isolation, B, P = doping, Al = interconnect, H = chemical passivation).
In the 1990s, only 8 new materials were added, all of them improving the system's behavior (Cu = better interconnect, Ge = higher mobility, Ta, W = better conductor-to-semiconductor interface, N = higher permittivity of the oxide, …).
Nowadays, we use nearly every element (short of the radioactive ones).
Motivation: Power Budget

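$P_{avg} = \alpha f C_L V_{DD}^2 + I_{sc} V_{DD} + I_{leak} V_{DD}$
$I_{leak} = I_{sub} + I_{gate} + I_{jun} + \dots$
Energy per charge or discharge of the load: $E = \frac{1}{2} C_L V_{DD}^2$

[Figure: CMOS inverter (PMOS P, NMOS N) between VDD and ground driving the load CL, annotated with the currents Isc, Isub, Igate and Ijun.]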

It is well known that there are 3 sources of power consumption.

The dominating one, the dynamic power, results from the energy needed to charge and discharge the load capacitance during a transition.

There are two reasons why short-circuit currents, which occur during a transition while both transistors are conducting, are usually not considered separately.
First: for steep input edges, the load capacitance buffers the short circuit, limiting its power to some 5-10% of the dynamic power.
Second: for power estimation, if the input slope is known, the short-circuit currents can be modeled by an equivalent additional load capacitance.

Leakage currents, in contrast, behave completely differently:
- they occur even if there is no transition
- they basically flow everywhere
- they do not behave like an equivalent capacitance but like an equivalent resistance
- leakage is not just one effect, but a collection of different effects (collected on the next slide)
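The three-term power budget from the slide can be evaluated directly. The following is a minimal sketch; the function name and all numeric values are hypothetical and chosen only so that the short-circuit share lands in the 5-10% range mentioned above.

```python
def average_power(alpha, f_hz, c_load, vdd, i_sc, i_leak):
    """Return (dynamic, short_circuit, leakage) power in watts.

    Implements P_avg = alpha*f*C_L*VDD^2 + I_sc*VDD + I_leak*VDD.
    """
    p_dyn = alpha * f_hz * c_load * vdd ** 2
    p_sc = i_sc * vdd
    p_leak = i_leak * vdd
    return p_dyn, p_sc, p_leak


# Hypothetical values for a single gate (illustration only).
p_dyn, p_sc, p_leak = average_power(
    alpha=0.1, f_hz=1e9, c_load=1e-15, vdd=1.0, i_sc=5e-9, i_leak=20e-9
)
total = p_dyn + p_sc + p_leak
print(f"dynamic {p_dyn:.2e} W, short circuit {p_sc:.2e} W, leakage {p_leak:.2e} W")
print(f"leakage share of total: {p_leak / total:.1%}")
```

Note that only the first term scales with activity and frequency; the leakage term remains even for alpha = 0.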
Motivation: Leakage Overview

ƒ If channel is blocking:
  ƒ subthreshold current (Isubth) (≤180nm)
  ƒ gate tunnelling to S/D (Igate) (≤90nm)
  ƒ pn-junction leakage (Ijunction) (≤65nm)
  ƒ gate induced drain leakage (IGIDL)
  ƒ depletion punchthrough (Ipunch)
ƒ If channel is conducting:
  ƒ gate tunnelling (Igate)
  ƒ pn-junction leakage (Ijunction)
ƒ If channel is switching:
  ƒ hot carrier injection (IHCI)

[Figure: NMOS cross-section (gate, n+ source, n+ drain, bulk) with the leakage paths IHCI, Igate (Igs, Igb, Igd), IGISL, IGIDL, Isubth, Ijunction and Ipunch. Roy03]

There are 6 different leakage effects (explained for an NMOS transistor):

If the channel is blocking (transistor off): the drain is typically at high potential, all other terminals at low potential.
We observe:
- subthreshold currents through the channel
- GIDL currents from the gate/drain overlap region to the substrate
- tunneling through the gate oxide from the drain to the gate
- punchthrough from drain to source when the depletion regions of both pn-junctions touch each other
- pn-junction leakage as known from diodes

If the channel is conducting (transistor on), source and drain have the same potential. Typically source, drain and gate are at high and bulk at low potential. Now we see:
- tunneling from gate to bulk
- junction leakage from drain to bulk

When switching, we see hot carrier injection: electrons ballistically traverse the gate oxide to the gate and thus carry a current from gate to channel.
Motivation: ITRS 2006 Prognosis
[Figure: ITRS 2006 projection for 2004 to 2020, for bulk, ultra thin body (UTB) and dual gate (DG) devices: gate tunneling current Igate, subthreshold current Ioff and capacitive charging current Idyn in nA/µm (left axis, 0 to 400), together with the technology node in nm (right axis, 0 to 120).]

Prognosis of the development of the 3 major power effects (for a single transistor, disregarding interconnect!).

For bulk CMOS, gate leakage would skyrocket. Thus the introduction of high-k devices (which entered the market in 2008) was mandatory. High-k devices are discussed later; they can remove gate leakage for good.

From 2010 on, ultra thin body devices will be produced, keeping the subthreshold leakage under control.

The introduction of dual gate devices will drastically reduce the leakage at the cost of higher dynamic power (dual gate means dual capacitance to charge).
Motivation: Memory Driver

[Figure: projected SRAM and DRAM power for 2004 to 2018 on a logarithmic axis from 1,00E-15 to 1,00E-05. Curves: SRAM Leakage [W], SRAM Dynamic [Ws], DRAM Leakage [W], DRAM Dynamic [Ws]. Annotated values: 24W per MB, 15W per 100MHz, 26W per 100MHz, 80nW per MB.]

Prognosis of the leakage and dynamic power development for SRAM and DRAM.
Dynamic power depends on the number of accesses, thus it is given as power per MHz.
Static power depends on the number of transistors, thus it is given as power per MByte.
Agenda

ƒ Introduction
ƒ Leakage Physics
ƒ Variation
ƒ Leakage Control
ƒ Memory Leakage
ƒ Estimation
ƒ Conclusion

Subthreshold Leakage

ƒ Classical Isd(Vgate) behaviour for 130nm
ƒ For 45nm the off current rises by 10^6.
  → subthreshold current in off-state Ioff
ƒ Threshold voltage is Vds dependent.
  → increasing Vds decreases Vth
  → drain induced barrier lowering (DIBL)

[Figure: Isd [A/µm] (1,0E-15 to 1,0E-02, logarithmic) versus Vgate [V] (-0,5 to 1,5) for 130nm and 45nm at a low and a high VDD. Marked: Ion, Ioff, Vth, the DIBL shift, and the subthreshold slope of 80 mV/decade with slope ∝ 1 + Cdm/Cox. BSIM4 simulations using PTM models.]

1) Characteristic plot of source-drain current vs. gate voltage for 130nm.
Below the threshold voltage: subthreshold slope of about 80mV/decade.
Resulting: Ioff=10 fA/µm (weak inversion and junction leakage)

2) Trend for smaller nodes: Vth is falling, thus Ioff is rising exponentially.
Resulting: Ioff=30nA/µm
Factor 100 per generation or 10 per year

3) DIBL: the drain voltage lowers the threshold voltage -> higher off-current.
Resulting: Ioff=300nA/µm

4) GIDL: a drain-gate voltage lowers the drain's junction barrier, thus higher Ioff due to junction leakage.
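The relation between subthreshold slope and off-current can be made concrete with a short calculation. This sketch assumes the textbook swing expression ln(10)·kT/q·(1 + Cdm/Cox), which matches the "slope ∝ 1 + Cdm/Cox" annotation on the slide; the ratio Cdm/Cox = 0.34 and the 160 mV threshold shift are hypothetical values picked to reproduce roughly 80 mV/decade and the quoted factor of 100.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant [J/K]
Q_E = 1.602176634e-19   # elementary charge [C]


def subthreshold_swing(temp_k, cdm_over_cox):
    """Gate voltage per decade of current [V]: ln(10) * kT/q * (1 + Cdm/Cox)."""
    return math.log(10) * K_B * temp_k / Q_E * (1.0 + cdm_over_cox)


def ioff_ratio(delta_vth, swing):
    """Factor by which I_off rises when V_th drops by delta_vth volts."""
    return 10 ** (delta_vth / swing)


swing = subthreshold_swing(300.0, 0.34)   # Cdm/Cox = 0.34 is an assumed value
print(f"swing at 300 K: {swing * 1e3:.1f} mV/decade")
print(f"I_off factor for a 160 mV lower V_th at 80 mV/decade: {ioff_ratio(0.16, 0.080):.0f}")
```

With an 80 mV/decade slope, every 80 mV removed from the threshold voltage costs one decade of off-current, which is why threshold scaling had to stop.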
Subthreshold Leakage

$I_{subth} \approx \alpha \frac{W}{L} T^2 e^{-\beta V_{th}/T}$   if gate is blocking and Vds is high

$V_{th} = V_{FB} + \Phi_S(T) + \gamma \left( \sqrt{\Phi_S + V_{bs}} - \sqrt{\Phi_S} \right) - \frac{(V_{bi}(T) - \Phi_S) + V_{ds}/2}{\cosh(L/l_c) - 1}$
$\qquad + \alpha(V_{bs}) \frac{\Phi_S T_{ox}}{W + \Delta W} - k_{retro} V_{bs} + k_{halo}(L) \sqrt{\Phi_S} + \Delta_{DITS}(V_{ds}, T)$

Terms in order: flatband voltage; surface potential; body effect; drain induced barrier lowering; narrow width effect; non-uniform doping ($k_{retro}$); Vth roll-up due to non-uniform lateral doping ($k_{halo}$); drain induced threshold shift.

The off-current depends exponentially on the threshold voltage, and the threshold voltage in turn depends on several parameters.
Thus every effect on the threshold voltage is also an effect on leakage.

Simplified equation from the BSIM4 manual:
- The first two terms give the zero-bias threshold.
- Next comes the body effect (explained later). Note for now: if the bulk-source voltage is 0, the body effect is 0. A positive source voltage results in a positive term, thus the threshold is higher and the leakage lower.
- The non-uniform doping terms are discussed later on (example: the body effect increases Vth with Vbs and the retrograde well reduces it again, thus Vth is stabilized. The same holds for Vds and L).
- The negative DIBL term depends linearly on Vds and approximately exponentially on the channel length.
Diagram: the shorter the channel, the more the drain influences the effective channel length.
- The last parameter describes the increase of the threshold due to a parasitic channel at the sides of the transistor (not very important).
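The statement that the DIBL term is linear in Vds and roughly exponential in the channel length can be checked numerically. The sketch below assumes a BSIM4-style term of the form -((Vbi - ΦS) + Vds/2) / (cosh(L/lc) - 1); the function name, the characteristic length lc = 10 nm and the voltages are hypothetical illustration values, not data from the tutorial.

```python
import math


def dibl_shift(vds, length_nm, l_c_nm, vbi_minus_phis):
    """Threshold shift [V] of the assumed DIBL term.

    Assumed form: -((Vbi - Phi_S) + Vds/2) / (cosh(L/l_c) - 1).
    """
    return -(vbi_minus_phis + vds / 2.0) / (math.cosh(length_nm / l_c_nm) - 1.0)


# Hypothetical values: l_c = 10 nm, Vbi - Phi_S = 0.3 V, Vds = 1.0 V.
for length_nm in (130, 90, 65, 45):
    shift_mv = 1e3 * dibl_shift(1.0, length_nm, 10.0, 0.3)
    print(f"L = {length_nm:3d} nm: DIBL shift = {shift_mv:.4f} mV")
```

For L much larger than lc the denominator grows like exp(L/lc)/2, so the shift vanishes exponentially for long channels and becomes significant only when L approaches a few lc.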
Gate Leakage

ƒ Tunneling: Electrons can pass potential barriers higher than their energy (classically impossible).
ƒ Tunneling current exponentially depends on barrier height V and width T and on carrier's mass meff.
ƒ Energy diagram of a Poly-Si transistor:
  ƒ Leakage current can be carried by tunneling electrons or holes.
  ƒ direct tunneling: from gate to channel
  ƒ Fowler-Nordheim: from gate to oxide
ƒ Carriers leak to source, drain and channel
ƒ Channel leakage is sub-divided into source, drain and bulk leakage

[Figures: rectangular potential barrier of height V and width T; band diagram of gate, SiO2 and channel with conduction band, valence band, bandgap, Vox, Vg-Vd, Ψox and ΨS; transistor symbol with G, S and D.]

This diagram shows the textbook example of the tunneling effect at a rectangular potential barrier: although classically impossible, an electron on the left can pass a barrier higher than its energy with a certain probability.
The resulting current density depends exponentially on the thickness of the barrier, the energy difference between electron and barrier, and the electron's effective mass.

If we look at a band diagram of gate and channel we see a similar picture:
electrons as well as holes can tunnel through the insulating oxide (direct tunneling) or can tunnel into the insulator's conduction band (Fowler-Nordheim tunneling).

In the overlap region, tunneling can carry current from the gate to source and drain directly.
The current tunneling to the channel will go to source, drain or bulk.
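The exponential dependence on barrier width, barrier height and effective mass can be illustrated with the standard WKB estimate for a rectangular barrier, exp(-2κT) with κ = sqrt(2·m_eff·(V - E))/ħ. This is a textbook approximation rather than the model used in the tutorial; the barrier of 3.1 eV and the relative effective mass of 0.4 are assumed values for illustration.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant [J s]
M_E = 9.1093837015e-31   # electron rest mass [kg]
Q_E = 1.602176634e-19    # elementary charge [C]


def tunneling_probability(barrier_ev, width_nm, m_eff_rel):
    """WKB estimate exp(-2*kappa*T) for a rectangular barrier.

    barrier_ev is the energy difference between barrier top and electron.
    """
    kappa = math.sqrt(2.0 * m_eff_rel * M_E * barrier_ev * Q_E) / HBAR  # [1/m]
    return math.exp(-2.0 * kappa * width_nm * 1e-9)


# Assumed values: 3.1 eV barrier, relative effective mass 0.4.
for t_ox_nm in (2.0, 1.5, 1.0):
    p = tunneling_probability(3.1, t_ox_nm, 0.4)
    print(f"T = {t_ox_nm:.1f} nm: tunneling probability ~ {p:.2e}")
```

Halving the barrier width takes the square root of the probability, which is why a few ångström of oxide thickness change the gate current by orders of magnitude.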
Junction Leakage

ƒ Schematic view of a pn-junction in reverse bias
ƒ As known from diodes: small currents are carried by
  ƒ drifting carriers
  ƒ electron-hole generation in junction
ƒ An electron in the p-side's valence band sees a potential as presented below.
ƒ Electrons can pass this triangular barrier by tunneling through it. → Band-To-Band-Tunneling (BTBT)

[Figures: pn-junction next to the gate; band diagram of the reverse-biased junction with conduction band, valence band, band gap Σg and total potential ΦBI+Vapp.]

There are 3 mechanisms transporting electrons from the p to the n region:

As known from diodes, we have randomly drifting carriers and electron-hole generation at imperfections.

As soon as the built-in potential plus the applied reverse bias exceeds the band gap, electrons can tunnel directly from the p-side's valence band to the n-side's conduction band.

This typically results in very low additional junction currents.


Junction Leakage

ƒ Smaller barrier means exponentially higher BTBT:
  ƒ tech scaling: steeper doping profiles
  ƒ Gate Induced Drain Leakage (GIDL)
ƒ GIDL: field from drain to gate reduces junction width
ƒ Body biasing also influences the BTBT current

[Figure: Ijunc. [A/µm] (10^-12 to 10^-3, logarithmic) versus Vd [V] (0 to 2) for Tox = 10Å, 15Å and 20Å at T=0°C and T=100°C, L=50nm, W=1µm. Below VGIDL the current is carried by drifting carriers and el.-hole generation; above VGIDL, BTBT (GIDL) takes over. Inset: transistor with gate at Gnd and drain at VDD showing the GIDL region; arrows labelled RBB and FBB.]

But a special effect can drastically increase this BTBT:
the potential difference between drain and gate makes the pn-junction steeper and thus the tunneling distance smaller, and thus the current exponentially higher.
This effect is called gate induced drain leakage (GIDL).

Diagram: drain-to-bulk current of a 45nm NMOS. The behavior separates into a left and a right part.
Left: thermal dependence. Below the GIDL voltage (of 0.8V) there is no GIDL influence
-> thermal behavior of drifting carriers and electron-hole generation.
Right: gate voltage dependence. Above the GIDL voltage, BTBT is dominating
-> drain voltage and oxide thickness determine the drain-to-bulk current.
Leakage Physics: Conclusion

[Figure: correlation graph connecting the parameters Temp, L, Tox, VBB and VDD with delay, sub-threshold leakage, gate leakage, pn-junction leakage, dynamic power and short-circuit power; each edge is marked as a positive or a negative correlation.]


Agenda

ƒ Introduction
ƒ Leakage Physics
ƒ Variation
ƒ Leakage Control
ƒ Memory Leakage
ƒ Estimation
ƒ Conclusion

Variation: Variation Model
ƒ Each parameter p of each transistor can be described as

$p = p_{nom} + \delta p_{glb} + \delta p_{det}(x, y) + \delta p_{rnd} + \delta p_{degrad}(t, OC)$

ƒ p_nom: nominal parameter value
ƒ δp_glb: global variation (per die)
  ƒ average p on this instance (wafer or die) minus factory average
ƒ δp_det: deterministic variation
  ƒ depends on the position on the instance (wafer or die)
ƒ δp_rnd: random variation
  ƒ unpredictable (only distribution function may be given)
ƒ δp_degrad: degradation (aging)
  ƒ depends on time and operation conditions (OC)

The actual value of one parameter (e.g. the oxide thickness) can vary for several reasons:

p_nom is the nominal parameter value the designer wanted to have.
δp_glb is the difference between p_nom and the de facto average parameter value (combination of fab, lot, wafer and die-to-die variation).
δp_det results from die-to-die variation and leads to a non-uniform, but deterministic variation (e.g. 1nm less in the lower left corner and 2nm more in the upper right corner).
δp_rnd is the random variation within a die (intra-die).
Variation: Global Variation

ƒ Example of impact of process variability on 65nm CMOS

[Figure: spread of off-current versus on-current for 65nm CMOS; annotated directions of variation: inter-die, Vth and T.]

The on-current is a measure of the device speed (double on-current means double performance).

Here you can see the high correlation between on-current (speed) and off-current (energy consumption).

Later, we will exploit this correlation (the fact that faster devices also have higher leakage power).
Variation: Local Variation
ƒ Random parameter variations → smaller scale than inter-die variations.
ƒ Example: 10^8 transistors, 100 critical paths (CPs), 20 gates on each CP

$P = \sum_{i=1}^{10^8} P_{transistor,i} = 10^8 \cdot \overline{P}_{transistor}$   (law of large numbers helps)

$D = \max_{j=1}^{100} \left( \sum_{i=1}^{20} D_{transistor,i,j} \right)$   (20 is not a large number, and the system delay is the worst of 100 paths: intra-die variations matter)

[Figure: distributions of P transistor and P avg, and of D transistor, D CP and D system.]

As presented in this slide, the variation of the input parameters changes the average value of the result.

If you just compute the average leakage from the average length etc., you will underestimate the average.

Reason: as the I(L) dependency is non-linear (convex), the leakage at the average length is smaller than the average leakage over all lengths.

The lower formula gives the correction due to the variation.
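The underestimation can be demonstrated with a small Monte Carlo experiment. This is a sketch under stated assumptions: the leakage model exp(-L/L0) with L0 = 5 nm, the nominal length of 45 nm and the spread of 2 nm are hypothetical, and the closed-form factor exp(σ²/(2·L0²)) holds for this toy model with Gaussian L; it is not necessarily the correction formula shown on the slide.

```python
import math
import random

L0_NM = 5.0                     # assumed decay length of the toy leakage model
MEAN_L, SIGMA_L = 45.0, 2.0     # assumed nominal channel length and spread [nm]


def leakage(length_nm):
    """Toy model: leakage falls exponentially with channel length."""
    return math.exp(-length_nm / L0_NM)


random.seed(0)
lengths = [random.gauss(MEAN_L, SIGMA_L) for _ in range(200_000)]

avg_of_leakage = sum(leakage(l) for l in lengths) / len(lengths)
leakage_of_avg = leakage(sum(lengths) / len(lengths))
ratio = avg_of_leakage / leakage_of_avg

# Exact correction factor for Gaussian L in this exponential model.
analytic = math.exp(SIGMA_L ** 2 / (2.0 * L0_NM ** 2))

print(f"Monte Carlo ratio: {ratio:.3f}")
print(f"analytic factor:   {analytic:.3f}")
```

The ratio is larger than 1 for any spread, so using only nominal parameters always underestimates the mean leakage of a convex model.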


Variation: Sources of Variability
ƒ Random discrete dopants
ƒ Line edge roughness
ƒ Poly-Si granularity
ƒ OPC
ƒ Layer thickness variations
ƒ Interface roughness
ƒ CMP effects


Variation: Reliability Issues
Electromigration
  Caused by: high temperature, high current density in interconnect
  Results in: high temperature, high current density in interconnect → runaway problem, sudden death

Hot electron oxide degradation
  Caused by: high VDD, very low VBB (strong reverse ABB)
  Results in: oxide charge → Vth increase → oxide breakdown

Oxide breakdown
  Caused by: oxide charge, dirt in oxide, radiation, mesh errors
  Results in: sudden runaway → oxide failure

Negative Bias Temperature Instability (NBTI)
  Caused by: hydrogen passivation of PMOS channel lost to the oxide
  Results in: Vth increase


Variation: Conclusion
                     | Process | Environment | Temporal
Global               | L and W; layer thickness; R's; doping; Vbody | Temp range; VDD range | NBTI; hot electron shifts
Local, deterministic | OPC; phase shift; layout mediated strain; well proximity | Self-heating; IR drops | Distribution of NBTI; voltage noise; SOI Vbody history; oxide breakdown history
Local, stochastic    | Random dopants; LER; poly-Si granularity; interface roughness; high-k morphology | |
Across-chip          | Line width due to pattern density effects | Thermal hot spots due to non-uniform power dissipation | Computational load dependent hot spots

Well proximity effect: some of the ions scattered out of the edge of the photoresist are implanted in the silicon surface near the mask edge, altering the threshold voltage of those devices.

Pattern density effects: dishing and erosion during CMP.


Agenda

ƒ Introduction
ƒ Leakage Physics
ƒ Variation
ƒ Leakage Control
ƒ Transistor engineering
ƒ Power gating
ƒ ABB / DVS
ƒ Memory Leakage
ƒ Estimation
ƒ Conclusion
Technology: Well Engineering

ƒ Well engineering by non uniform doping [NUD]
ƒ Retrograde well (vertical NUD)
  ƒ below the surface:
    ƒ reduce doping concentration
    ƒ increase carrier mobility
  ƒ deep in the well:
    ƒ increase doping concentration
    ƒ limit channel depth
ƒ Halo doping & pocket implant (lateral NUD)
  ƒ increase channel doping at the pn-junction
  ƒ steeper pn-junction profile
  ƒ less short channel effects

[Figure: transistor cross-section (source, gate, drain) showing pocket, halo, epilayer, retrograde well and bulk; shading distinguishes n-doped regions from light, medium and high p-doping.]

At the technology level, leakage can be reduced by sophisticated, non-uniformly distributed doping, called well engineering.

In the vertical direction:
- the doping concentration below the surface can be reduced, increasing carrier mobility
- deep below the channel the doping is increased, limiting the channel depth and shielding punchthrough currents
- this is called a retrograde well
In the lateral direction:
- a highly doped halo is introduced around source and drain
- the junction profile gets steeper and the channel length is stabilized, reducing short channel effects (SCE)
- most of the channel has high carrier mobility

In addition, further highly p-doped pockets can be implanted, shielding S/D directly below the surface -> pocket implant.

Compare with the definition of the threshold voltage to see the impact of NUD.
Technology: High-k

ƒ Oxide dielectric constant
  ƒ gate capacitance is performance constrained
  ƒ capacitance is kept high by reducing Tox: $C_{ox} \propto k_{SiO_2} \cdot L W / T_{ox}$
  ƒ but gate oxide may run 'out of atoms'
  ƒ high gate tunneling
ƒ idea: introduce high-k insulators
  ƒ effective electrical Tox is small: $T_{ox}^{eff} = T_{ox} \cdot k_{SiO_2} / k_{material}$
  ƒ physical Tox stays large enough to prevent tunneling
ƒ k(SiO2)=3.9 k(SiNO)=4.1-4.2 k(HfO2)=50
ƒ but: problems with lattice sizes
  ƒ gate insulator is added atomic layer by layer

[Intel]

Today in some processes: SiNO nitrided gate oxide = oxynitride
→ increases k from 3.9 to 4.1-4.2
→ Tox + 5 to 10 %
Zr = zirconium (atomic number 40)
Hf = hafnium (atomic number 72)
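The effective oxide thickness relation Tox_eff = Tox · k(SiO2) / k(material) is easy to evaluate. The sketch below uses the k values quoted on the slide for SiO2 and SiNO; the function names and the 1.2 nm target thickness are hypothetical.

```python
K_SIO2 = 3.9   # relative permittivity of SiO2, as quoted on the slide


def effective_tox(t_phys_nm, k_material):
    """Electrically equivalent SiO2 thickness of a physical layer [nm]."""
    return t_phys_nm * K_SIO2 / k_material


def physical_tox_for(eot_nm, k_material):
    """Physical thickness that gives the requested equivalent thickness [nm]."""
    return eot_nm * k_material / K_SIO2


eot = 1.2   # assumed target equivalent oxide thickness [nm]
for name, k in (("SiO2", 3.9), ("SiNO (low)", 4.1), ("SiNO (high)", 4.2)):
    t_phys = physical_tox_for(eot, k)
    gain = (t_phys / eot - 1.0) * 100.0
    print(f"{name:12s} k = {k:.1f}: physical Tox = {t_phys:.3f} nm (+{gain:.1f} %)")
```

For the nitrided oxide this gives a physical thickness about 5 to 8 % larger at the same gate capacitance, in line with the "Tox + 5 to 10 %" note; a material with a much higher k allows a proportionally thicker layer.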
Technology: FD UTB SOI
[Figure: cross-sections of a bulk transistor (gate, depletion region) and an FDSOI transistor (metal gate, high-k dielectrics, undoped thin channel, buried oxide); micrograph of a gate stack with poly, NiSi, TiN and HfO2, scale bar 20nm. F. Andrieu et al., VLSI 2006]

ƒ Fully-depleted Ultra Thin Body SOI
ƒ Increased electrostatic control due to increased gate-channel coupling
ƒ Advantages of SOI technologies
  ƒ Reduced junction capacitances
  ƒ Immunity to single event upset and to latch-up
  ƒ Increase layout density (no wells)

Idea of FD UTB SOI:

1) Ultra thin channel → depletion layer capacitance is lower → steeper subthreshold slope (see subthreshold section)

2) No doping in channel needed → no doping variation

Technology: Multi Gate 3D Devices
FinFETs
ƒ Create channel as a fin
ƒ Two gates control the channel
ƒ Better channel control
  ƒ higher switching speed
  ƒ lower subthreshold leakage

[Figure: FinFET on SOI with source, drain and gates.]

AMD 15nm FinFET gate
Gate delay: 300fs (3300GHz)
Available: 2009?
TSMC: FinFET already for 65nm for critical devices [EETimes]

FinFET: the channel is constructed as a fin. Thus it can be controlled by the gate from both sides.

As the channel is controlled from both sides, it can be switched on or off more effectively
→ the slope cannot be reduced below the fundamental limit of about 60mV/decade at room temperature, but it can be brought very close to it
Technology: Strained Silicon

Strained silicon:

Electron speed:
70% faster

Circuit performance:
35% higher

http://www.research.ibm.com/resources/press/strainedsilicon/


Idea here: increase the mobility of the carriers
→ faster electrons mean higher performance
Technology: Timeline

[Figure: technology roadmap over the nodes 65nm ('07), 45nm ('09), 32nm ('12) and 22nm ('15) in three tracks:
gate stack (Igate, Cox): poly → metal gate, SiON → high-k;
channel electrostatic control (Ion/Isub): bulk → PDSOI → FDSOI → FinFET;
mobility (Ion, µ↑): strained Si → strained Si on insulator (SSOI) → SiGe on insulator (SGOI).]

Overview over technology development

PDSOI: Partially depleted SOI


Agenda

ƒ Introduction
ƒ Leakage Physics
ƒ Variation
ƒ Leakage Control
ƒ Transistor engineering
ƒ Power gating
ƒ ABB / DVS
ƒ Memory Leakage
ƒ Estimation
ƒ Conclusion
Power Gating: Overview
ƒ Sleep transistor insertion
  ƒ low leakage sleep transistors
  ƒ high performance logic
  ƒ implementation as PMOS or NMOS or both
  ƒ decreases leakage in standby mode

BUT:
  ƒ increases leakage in active mode
  ƒ increases dynamic power
  ƒ increases area demand
  ƒ reduces switching speed
  ƒ reduces the maximum stack depth
If not compensated by higher VDD

[Figure: logic block with a sleep transistor (sleep = 0); annotations: gate leakage, increased dynamic power, increased area.]

To reduce the leakage while the circuit is idle, we can put a high-Vth sleep transistor in series with the circuit.

This results in very high leakage savings (subthreshold and gate leakage) while the power switches are blocking, but when the component is not idle, power gating has the disadvantages listed on the slide.
Power Gating: Application
ƒ Per gate
  ƒ only modified cell library needed
  ƒ fine grained controllable
  ƒ high area overhead
ƒ Per row
  ƒ only additional gating cell needed per row
  ƒ delay penalty hard to predict

[Figure: standard cell rows between VDD and Gnd with a sleep transistor per gate and per row. J Frenkil - Sequence]

Basically there are 3 different implementation styles:

Power gating each standard cell creates a huge area overhead (it needs a lot of gating transistors and a lot of control interconnect).
But to apply it, we do not have to modify our design tools (only the library).

Power gating per standard-cell row is much more efficient in terms of control and area overhead, but standard tools can no longer handle the design. Even new full-custom tools would have to modify their implementation significantly, as the delay of a cell is no longer a local problem (mapping is typically done before layout, but now timing depends on the layout, and mapping of course depends on the delay). This results in a huge overdesign or in much more complex design tools.

The power ring solution is only meant to add power gating to old IPs which cannot be changed.
Power Gating: Interfacing Problem

[Figure: sleep region A with low-VTh CMOS gate G and sleep region B with low-VTh CMOS gate H; each gate sits between Vdd and a virtual GND that is connected to GND through a sleep transistor controlled by Sleep_A' and Sleep_B' respectively.]

• If Sleep_A=0 => G's output may be floating.
• Several solutions:
  • let output of A be 1
  • let input of B be not dominant
  • fix output with voltage anchors

This slide shows a problem of power gating:

Usually, not the entire system sleeps; some parts are always awake.

Interconnect from a sleeping region to an awake region can cause huge problems, because the voltage on this interconnect will drift very slowly from 0 to 1 (NMOS gating) or from 1 to 0 (PMOS gating) over 1000s of cycles, leading to extreme short-circuit currents in the awake region.
Power Gating: State Retention
ƒ Problem
ƒ How to store states of registers within a gated region?
ƒ Solution
ƒ Balloon State Retention Flip Flop

[Figure: balloon state retention flip-flop: a conventional flip-flop (low Vth but gated; input D, output Q, clocked by Clk) with a state retention extension (high Vth, not gated) that is controlled by the store/restore signal SR and the sleep signal slp.]

There is an additional interfacing problem: it is (as described) important not to produce short circuits through floating inputs. But additionally, when we power gate large regions (including registers), we will lose the state of these registers.

Balloon flip-flops solve both: they work as voltage anchors, and they can save the register state during sleep.

Balloon flip-flops consist of 3 latches: two are fast and work as a flip-flop when awake. The third one is slow and low-leakage and stores the state during sleep. The black latch is not power gated.
Power Gating: Ground bounce
What’s a ground bounce and why does it happen?
ƒ Power grid not an ideal conductor, it consists of
  ƒ Resistive parts
  ƒ Capacitive parts
  ƒ Inductive parts
ƒ During wakeup
  ƒ Large but slowly changing current during wakeup
  ƒ Resistive part is dominating
  ƒ Result: Reduced voltage

[Figure: circuit connected to VDD and GND through the R, L and C elements of the power grid; ground bounce during wakeup of a power gated component, at 1.5V and at 0.9V. Kim04]


Agenda

ƒ Introduction
ƒ Leakage Physics
ƒ Variation
ƒ Leakage Control
ƒ Transistor engineering
ƒ Power gating
ƒ ABB / DVS
ƒ Memory Leakage
ƒ Estimation
ƒ Conclusion
ABB / DVS: Background ABB 1/2
ƒ Adaptive body biasing (ABB)
ƒ also called VTCMOS (variable threshold CMOS)

ƒ In standard CMOS:
ƒ the body voltage for NMOS devices is at 0V
ƒ the body voltage for PMOS devices is at VDD

ƒ Idea of ABB:
  ƒ contact the well of NMOS and PMOS
  ƒ vary the well potential
  ƒ modify the effective threshold voltage

[Figure: inverter between VDD and Gnd with separate body bias sources VABB+ and VABB- connected to the wells.]

The idea of ABB is simple. Force the occurrence of the body effect (see subthreshold slides)
by applying a body potential which is not 0 for NMOS and not VDD for PMOS.
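
The body effect that ABB exploits can be sketched with the textbook long-channel expression Vth = Vth0 + γ(√(2φF − VBS) − √(2φF)) for an NMOS device. This is an illustrative assumption, not the model used in the tutorial; the parameter values (vth0, gamma, phi_f) are hypothetical.

```python
import math


def threshold_voltage(v_bs, vth0=0.30, gamma=0.20, phi_f=0.45):
    """Textbook body-effect model for an NMOS transistor.

    v_bs  : body-to-source voltage in V (0 in standard CMOS)
    vth0  : threshold voltage at v_bs = 0 (hypothetical value)
    gamma : body-effect coefficient in sqrt(V) (hypothetical value)
    phi_f : Fermi potential in V (hypothetical value)
    """
    surface_potential = 2.0 * phi_f
    if v_bs >= surface_potential:
        raise ValueError("body-source junction would be strongly forward biased")
    return vth0 + gamma * (math.sqrt(surface_potential - v_bs)
                           - math.sqrt(surface_potential))


for v_bs in (-0.5, 0.0, 0.5):
    print(f"V_BS = {v_bs:+.1f} V -> Vth = {threshold_voltage(v_bs):.3f} V")
```

A positive body voltage (forward bias) lowers the threshold, a negative one (reverse bias) raises it.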
ABB / DVS: Background ABB 2/2
ƒ Analysis: Vary VABB and measure VTH
ƒ Result for NMOS (and PMOS):
ƒ an ABB voltage higher than 0V (lower than VDD) will result in a lower effective |VTH|
ƒ increased component speed
ƒ increased subthreshold leakage
ƒ Effect weakens with smaller technology
ƒ But with high-k ABB will get important again
ƒ Higher Tox decreases BTBT current (GIDL effect)
[Figure: I_sub [nA/µm] (log scale, 0.001 to 10000) versus V_ABB [V] from -0.5 (Reverse BB, RBB) to 0.5 (Forward BB, FBB) for the 180nm, 90nm, 70nm, 65nm and 32nm nodes]

Spice simulation:

Subthreshold current depends on the body voltage.

In the diagram, you can see:


1) The subthreshold current rises with shrinking device sizes
2) The effect of forward VBB is stronger than the effect of reverse VBB
3) The effect of VBB weakens with shrinking device sizes
ABB / DVS: Limits of Reverse ABB
ƒ RBB reduces the subthreshold current
ƒ Increases GIDL to substrate
ƒ The total current has a technology dependent minimum
[Figure: transistor with VDD and VBB applied; leakage current versus VBB for 70nm data [Neau03] and for 50nm data with different doping profiles]

The BTBT (band-to-band tunnelling from drain to substrate) limits the applicability of RBB.

The minimum of the total current Itot = Isub + IBTBT is found analytically at dItot/dVBB = 0,
i.e. dIsub/dVBB = -dIBTBT/dVBB.

The absolute value of the optimal VBB is technology dependent (here for different doping
depths, from 17nm for the red curve to 20nm for the blue curve).

For the 70nm technology the maximum VBB was 0.1V, and for the 50nm technology it can be
below 0.05V, because the effect of BTBT rises with technology scaling.
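
The condition dIsub/dVBB = -dIBTBT/dVBB can be tried out on a toy model. The sketch below assumes, purely for illustration, that both currents depend exponentially on the reverse bias magnitude v: Isub = i_sub0·exp(-a·v) and IBTBT = i_btbt0·exp(b·v). The functional form and all coefficients are hypothetical and not fitted to [Neau03].

```python
import math


def total_leakage(v, i_sub0, a, i_btbt0, b):
    """Total current for reverse bias magnitude v (toy exponential model)."""
    return i_sub0 * math.exp(-a * v) + i_btbt0 * math.exp(b * v)


def optimal_reverse_bias(i_sub0, a, i_btbt0, b):
    """Solve a*i_sub0*exp(-a*v) = b*i_btbt0*exp(b*v) for v (clamped at 0)."""
    v = math.log(a * i_sub0 / (b * i_btbt0)) / (a + b)
    return max(v, 0.0)


# Hypothetical coefficients: currents in nA, exponents in 1/V.
I_SUB0, A, I_BTBT0, B = 100.0, 4.0, 1.0, 12.0
v_opt = optimal_reverse_bias(I_SUB0, A, I_BTBT0, B)
print(f"optimal reverse bias: {v_opt:.3f} V, "
      f"total leakage: {total_leakage(v_opt, I_SUB0, A, I_BTBT0, B):.2f} nA")
```

A steeper BTBT term (larger b or i_btbt0) moves the optimum towards smaller reverse bias, which matches the trend from the 70nm to the 50nm data described above.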
ABB / DVS: Yield Optimization
ƒ DVS vs. ABB
ƒ DVS more potential because of wide application range
ƒ Power / Performance optimum with ABB and DVS
ƒ Values depend on the technology and the system
→ design time tuning: finding best working point VBB / VDD
ƒ ABB / DVS for yield optimization
ƒ varying power and performance in systems due to process variation
ƒ power and delay limits determine yield
ƒ ABB / DVS used to move systems closer to the limits
→ test time tuning: working point depends on measurement
[Figure: 90nm data [Meijer04] for VDD=0.8V, VBB=0.5V and for VDD=0.88V, VBB=0.0V; scatter of frequency f [1/s] versus power P [W] with the min f and max P limits]
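
How body biasing can move dies back inside the power and delay limits may be sketched with a small Monte Carlo experiment. Everything in this sketch is an assumption made for illustration: the linear delay model, the exponential leakage model, the limits, and the Vth spread are hypothetical and do not come from [Meijer04].

```python
import math
import random


def parametric_yield(dvth_samples, max_shift, d_max=1.10, p_max=3.0,
                     k_delay=2.0, v_efold=0.04):
    """Fraction of dies that can meet both the delay and the leakage limit.

    Toy model: normalized delay = 1 + k_delay * dVth and
    normalized leakage = exp(-dVth / v_efold). A body bias may shift
    dVth by at most +/- max_shift volts (0 means no ABB).
    """
    vth_low = -v_efold * math.log(p_max)   # below this the die leaks too much
    vth_high = (d_max - 1.0) / k_delay     # above this the die is too slow
    passing = 0
    for dvth in dvth_samples:
        # The die passes if some reachable dVth lies inside the window.
        if dvth - max_shift <= vth_high and dvth + max_shift >= vth_low:
            passing += 1
    return passing / len(dvth_samples)


rng = random.Random(1)
dies = [rng.gauss(0.0, 0.05) for _ in range(10_000)]   # inter-die dVth in V
yield_plain = parametric_yield(dies, max_shift=0.0)
yield_abb = parametric_yield(dies, max_shift=0.05)
print(f"yield without ABB: {yield_plain:.3f}, with ABB: {yield_abb:.3f}")
```

Fast and leaky dies are shifted towards higher Vth, slow dies towards lower Vth, so more of the distribution ends up between the two limits.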



ABB / DVS: Runtime Adaption
ƒ ABB example of usage [Tschanz02]:
ƒ for a given circuit find the critical path
ƒ duplicate the critical path (with non blocking inputs)
ƒ and check functional correctness

[Figure: block diagram [Narendra05]: replica of the critical path between clocked registers (CLK, REG) with an XOR check of the result]
→ run time tuning: updating working point VBB / VDD

The ABB voltage can be chosen in several ways:

Fixed (the ABB is chosen at design time): this is only done for FBB and will be explained later.
Once (the VBB is chosen optimally while testing the chip): this reduces variance.
Adaptively (at runtime): this can also reduce variability due to temperature and VDD
noise. A replica of the critical path is measured, and the VBB is always kept high enough to
ensure the timing of the critical path.
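
The adaptive scheme can be sketched as a simple control loop around the replica path. This is a minimal illustration and not the circuit of [Tschanz02]; the step size, the bias range, the slack band and the linear replica delay model are all hypothetical.

```python
def update_body_bias(v_bb, replica_delay, clock_period,
                     step=0.05, v_min=-0.5, v_max=0.5, slack=0.05):
    """One control step for the body bias (positive = forward bias).

    Raise the bias when the replica misses timing; lower it again when the
    replica has more than `slack` (relative) timing margin, to save leakage.
    """
    if replica_delay > clock_period:
        return min(v_max, v_bb + step)
    if replica_delay < (1.0 - slack) * clock_period:
        return max(v_min, v_bb - step)
    return v_bb


def replica_delay_ns(v_bb):
    """Hypothetical replica path: forward bias (v_bb > 0) speeds it up."""
    return 1.0 - 0.4 * v_bb


v_bb = 0.0
for _ in range(20):
    v_bb = update_body_bias(v_bb, replica_delay_ns(v_bb), clock_period=0.95)
print(f"settled body bias: {v_bb:.2f} V, replica delay: {replica_delay_ns(v_bb):.3f} ns")
```

The slack band acts as hysteresis, so the loop settles at the lowest bias step that still meets the clock period instead of toggling between two settings.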
Power Gating vs. ABB comparison
ABB                                    Power gating
- Low benefit                          + High benefit
+ State preserving                     - Losing state
+ Compensates Vth fluctuation          - Does not compensate
+ Additional hardware only once,       - Additional transistor in series:
  additional interconnect only           slower, larger, lower yield
(+) Only needs modified library:       (-) For per row gating:
  conventional design tools              timing is hard to validate
+ IP can be reused                     - Flip-flops have to be replaced
- Needs triple well                    + Conventional well structure


Cell Design: Leakage in SRAM cells
ƒ Subthreshold leakage occurs when:
ƒ Transistor is off
ƒ Vds is non-zero
ƒ Cell state (stored value) determines exactly which transistors “leak”

[Figure: SRAM cell schematic with the leakage paths marked: N1, P1 = internal leakage; N4 = bitline leakage; gate leakage; BL2WL gate leakage]


Cell Design: Dual-Vt Asymmetric-Cell

ƒ Replace leaky transistors with high-Vt (HV) transistors

Leakage Reduction      Performance
‘0’      ‘1’           BLB      BL
70X      0X            -12%     -47%

reduced stability

Navid Azizi, Farid N. Najm, and Andreas Moshovos, Low-Leakage Asymmetric-Cell SRAM, IEEE Transactions on VLSI, pp. 701-715, Aug. 03


Cell Design: DVT memory cells

[Figure: four DVT memory cell variants, each annotated with its leakage reduction and delay change]

leak @ 0    leak @ 1    delay
2.5x        7.0x        -3%
58x         7.0x        -1%
2.0x        1.9x        -0%
2.0x        6.7x        -0%

Navid Azizi, Farid N. Najm, and Andreas Moshovos, Low-Leakage Asymmetric-Cell SRAM, IEEE Transactions on VLSI, pp. 701-715, Aug. 03


Memory Architectures: Cache Decay
ƒ Example: Power down each cache line after 1024 idle cycles
ƒ 10 bit counter per line needed → huge area & power penalty
ƒ Solution: Power down after 4 ticks (1 tick = 256 cycles)
ƒ 2 bit counter per line needed
ƒ shutdown after 769-1024 cycles (phase dependent) (769=1024-255)
[Figure: cache decay counter scheme, labelled "counts fixed time" [Kaxiras et al.]]
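
The 769 to 1024 cycle window follows from the phase between the last access and the global tick. The sketch below simulates one idle line; it assumes that an access resets the 2 bit counter and that the line is powered down on the fourth tick after the access, which is one plausible reading of the scheme rather than the exact circuit of [Kaxiras et al.].

```python
TICK_CYCLES = 256     # one global tick every 256 cycles
COUNTER_MAX = 3       # a 2 bit counter saturates at 3


def idle_cycles_until_shutdown(access_cycle):
    """Simulate one cache line that is accessed once and then stays idle."""
    counter = 0                       # reset by the access
    cycle = access_cycle
    while True:
        cycle += 1
        if cycle % TICK_CYCLES == 0:  # global tick reaches every line
            if counter == COUNTER_MAX:
                return cycle - access_cycle   # fourth tick: power down
            counter += 1


# Try every possible phase of the access relative to the global tick.
idle_times = [idle_cycles_until_shutdown(c) for c in range(TICK_CYCLES)]
print(min(idle_times), max(idle_times))
```

An access just before a tick gives the shortest decay time, an access right on a tick the longest.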



Memory Architectures: DRI for I-Cache
ƒ Dynamic Resizable Instruction-Cache

[Powell et al.]

Low-level mechanism: Gated VDD (non preserving)


Effectiveness on average: Leakage -62% Execution time +4%.


DRI: Dynamically Resizable I-Cache

Dynamically estimates and adapts to the required instruction-cache size and turns off a
selected set of cache lines.
Coarse-grain: entire portions of the cache are turned on/off.

The number of lines in sleep mode is controlled by periodically examining the miss rate:
Miss rate < pre-determined value: put another part of the cache to sleep
Miss rate > acceptable value: activate more lines
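
The miss rate driven resizing can be sketched as a periodic controller. The sketch assumes that the cache is resized by halving or doubling the number of active sets and uses hypothetical bounds; the details of the mechanism in [Powell et al.] may differ.

```python
def resize_icache(active_sets, miss_rate, low_bound, high_bound,
                  min_sets, max_sets):
    """Return the number of active sets for the next interval.

    miss_rate is measured over the interval that just ended. Halving and
    doubling are assumptions of this sketch.
    """
    if miss_rate < low_bound and active_sets > min_sets:
        return max(min_sets, active_sets // 2)    # put half to sleep
    if miss_rate > high_bound and active_sets < max_sets:
        return min(max_sets, active_sets * 2)     # wake up more sets
    return active_sets


# Hypothetical trace of per-interval miss rates.
sets = 1024
for rate in (0.001, 0.002, 0.010, 0.050, 0.010):
    sets = resize_icache(sets, rate, low_bound=0.005, high_bound=0.02,
                         min_sets=64, max_sets=1024)
    print(f"miss rate {rate:.3f} -> {sets} active sets")
```

Keeping a gap between the two bounds avoids resizing back and forth when the miss rate hovers around a single threshold.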
Memory Architectures: Drowsy memories
ƒ When accessing a drowsy line, its state is lost
ƒ Gating circuitry prevents access to drowsy lines

Simple policy: Send drowsy signal each N cycles


No-Access policy: Send drowsy signal N cycles after last access

Comparable performance, Leakage reduction > 80%



Memory Architectures: Conclusion
Replace SRAM cell
-> slight performance reduction
-> high leakage savings
Power down lines after long idle
Make memory drowsy and wake up busy lines
[Figure: three memory block diagrams (wl driver, mem, sense amps), one per technique]


Estimation: What leakage is modelled?

[Figure: MOSFET cross-section (regions numbered 1 to 13) with the three modelled leakage paths: subthreshold leakage, gate tunneling and gate induced drain leakage, each annotated with its dependencies (Vds, Vgs, Vbs, Vgd, Lch, Tox, T, …)]
Estimation: Leakage and delay under PTV variation

global (inter die), e.g. all wafer Tox
local random (intra die), e.g. dopant distribution
local deterministic, e.g. well proximity
[Figure: one scatter plot of log(leakage) versus frequency per variation type, each showing the nominal point plus the respective variation]
Estimation: Why RTL modelling?

[Figure: design flow from a behavioural SystemC description via synthesis to RT Verilog + UPF. Simulation data, floorplan, temperature and voltage feed the estimation of Edyn, area, Tcrit and Pleak; a parametric yield evaluation against ftarget (inter-die) drives solution modification and the adaptive solution.]
Estimation: The leakage model

Subthreshold leakage per transistor width:
$\ln(I_{sub}/W) = \ln(\chi/L) - \beta\,V_{TH}(L, T_{ox}, N_{ch})/T = \dots \approx \alpha_0 + \alpha_1(\dots) + \alpha_2(\dots) + \alpha_3(\dots)$
with fitted terms in L, Tox and N.
Coefficients a0, a1, a2, …: determined by characterization (Temp, VDD, VBB dependence)
Process parameter dependences: analytical

Leakage of a component:
$I_{component}(Temp, V_{DD}, V_{BB}, L_{ch}, T_{ox}, N_{dep}) = \beta_0 I_{sub}^{NMOS} + \gamma_0 I_{sub}^{PMOS} + \beta_1 I_{gate}^{NMOS} + \gamma_1 I_{gate}^{PMOS} + \beta_2 I_{stack}^{NMOS} + \gamma_2 I_{stack}^{PMOS}$
Coefficients b0, b1, b2: determined by characterization
[Figure: transistor with VDD and VBB applied]
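
The component level sum can be written down directly. The sketch below is a plain transcription of the weighted sum over the three leakage mechanisms; the dictionary layout, the function name and all numbers are hypothetical and only serve to show how the characterized weights combine with the per-device currents.

```python
def component_leakage(device_currents, weights):
    """Weighted sum of per-device leakage currents for one RT component.

    device_currents[m] = (I_NMOS, I_PMOS) for mechanism m
    weights[m]         = (beta_m, gamma_m), obtained by characterization
    """
    total = 0.0
    for mechanism in ("sub", "gate", "stack"):
        i_nmos, i_pmos = device_currents[mechanism]
        beta, gamma = weights[mechanism]
        total += beta * i_nmos + gamma * i_pmos
    return total


# Hypothetical per-device currents in nA and characterized weights.
currents = {"sub": (10.0, 5.0), "gate": (2.0, 1.0), "stack": (1.0, 0.5)}
weights = {"sub": (100.0, 80.0), "gate": (50.0, 40.0), "stack": (20.0, 10.0)}
print(f"component leakage: {component_leakage(currents, weights):.1f} nA")
```

Because the weights are fixed per component, only the six device currents have to be re-evaluated when temperature, voltages or process parameters change.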

Estimation: The delay model

characterization: inverter model → gate model → RTL model
inverter model: delay of an inverter driving CL after an input edge; depends on slope, Cload, VDD, VBB, temp, Lch, Tox, Ndep
gate model: dgate,input = f(dinv, Cload, temp)
RTL model: dcomponent = f(dinv, temp)
Estimation: Process Variation Engine

[Figure: process variation engine. Inter-die variation data (µL, µT, µN) is sampled by Monte Carlo and typical sets (1) to (5) are selected; intra-die variation data is pre-integrated into the model; each typical set yields an (Ileak, tcrit) pair, and the pairs are binned by tcrit.]
Ileak models have increased expectation value
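
Why pre-integrating local variation increases the expectation value of the leakage can be seen from a small Monte Carlo experiment. The sketch assumes a Gaussian local Vth deviation and an exponential leakage dependence exp(-dVth / n_vt); the model and all values are illustrative assumptions, not the engine's actual models.

```python
import math
import random


def mean_leakage_factor(sigma_vth, n_vt=0.036, samples=100_000, seed=0):
    """Monte Carlo estimate of E[I_leak] / I_leak(nominal)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        total += math.exp(-rng.gauss(0.0, sigma_vth) / n_vt)
    return total / samples


def analytic_factor(sigma_vth, n_vt=0.036):
    """Mean of a lognormal variable: exp(sigma_ln^2 / 2)."""
    return math.exp(0.5 * (sigma_vth / n_vt) ** 2)


mc = mean_leakage_factor(0.02)       # 20 mV local Vth sigma (hypothetical)
print(f"Monte Carlo: {mc:.3f}, analytic: {analytic_factor(0.02):.3f}")
```

Since the exponential is convex, transistors with a lowered Vth gain more leakage than those with a raised Vth lose, so the mean lies above the nominal value.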



Estimation: Model Application

[Figure: tool flow. SystemC description → PowerOpt → RTL netlist, RTL floorplan and dynamic & SC power; thermal & electrical model (with package structure & power grid) → temp map and VDD map; user constraints (f, Tmax, …) → ABB/AVS optimization → leakage management strategy; process variation engine → variation model; leakage & delay model integrated into a tool → leakage results (subthreshold, gate tunnelling, pn-junction) and estimated yield due to inter-die variation]

How the model is integrated into the tool:

The tool reads a SystemC description and produces an RTL netlist, a dynamic power prediction
and an RT level floorplan.
This information is used to compute a temperature and VDD map and to decide on an
optimization strategy. Now all information except for the variation is available. Variation data
comes from the variation engine (see the Process Variation Engine slide).

All data for the models is then available and the leakage is computed.

The computed leakage is fed back into the thermal & electrical mapping, and the computation
is iterated.
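
The feedback between leakage and the thermal map can be sketched as a fixed-point iteration for a single block. The single thermal resistance, the exponential temperature dependence of leakage and all numbers are assumptions made for this illustration; the tool itself works on temperature and VDD maps.

```python
import math


def solve_operating_point(p_dyn, p_leak_ref, t_ref, t_ambient, r_th,
                          k_temp=0.02, tol=1e-6, max_iter=100):
    """Iterate temperature <-> leakage until both are consistent.

    Toy models: P_leak(T) = p_leak_ref * exp(k_temp * (T - t_ref)) and
    T = t_ambient + r_th * (p_dyn + P_leak).
    """
    temp = t_ambient
    for _ in range(max_iter):
        p_leak = p_leak_ref * math.exp(k_temp * (temp - t_ref))
        new_temp = t_ambient + r_th * (p_dyn + p_leak)
        if abs(new_temp - temp) < tol:
            return new_temp, p_leak
        temp = new_temp
    raise RuntimeError("no convergence: possible thermal runaway")


# Hypothetical block: 5 W dynamic power, 1 W leakage at 45 degC, 2 K/W.
temp, p_leak = solve_operating_point(p_dyn=5.0, p_leak_ref=1.0, t_ref=45.0,
                                     t_ambient=45.0, r_th=2.0)
print(f"temperature: {temp:.2f} degC, leakage: {p_leak:.3f} W")
```

If the loop gain (thermal resistance times the leakage increase per kelvin) approaches one, the iteration no longer converges, which corresponds to thermal runaway.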
Estimation: Conclusion

ƒ Model strengths:
ƒ accurate leakage (std dev<10%) and delay (std dev<5%) model
ƒ inter-die variation is accurately handled even for correlated
parameters
ƒ resulting models are fast, compact and accurate

ƒ Limitations:
ƒ only valid for a small die area, where PTV gradients are negligible
(thus each RT component needs its own model)

ƒ Under development:
ƒ expectation value is good for intra-die effect on leakage
but SSTA is needed for intra-die delay modeling

Recent Problem: Subthreshold Slope

slope is limited: S ≥ 63mV / dec. [K Roy ISLPED ‘06]
slope is channel leakage: $I_{subth} \approx \alpha T^2 \frac{W}{L} e^{-\beta V_{TH}/(S \cdot T)}$
slope is oxide thickness: $S = 1 + C_{dm}/C_{ox}$
… is gate leakage: $I_{gate} \propto L W V_{DD}^2 T_{ox}^{-2} \exp(-\beta T_{ox}/V_{ox})$
[Figure: BPTM simulation of Isd [A/µm] (log scale, 1.0E-15 to 1.0E-02) versus Vgate [V] from -0.5 to 1.5 for the 45nm and 130nm nodes at VDD = 0.1V and VDD = 2.0V, with DIBL, GIDL and an 80 mV/decade slope marked]

The subthreshold slope is one of the major design metrics.

The slope defines the ratio between on current (= speed) and off current (= leakage). If we
could reduce the slope, we could easily reduce the threshold voltage to gain performance.
Because of the slope, the oxide thickness always has to be aggressively scaled down, which
boosts the gate leakage.
Thus the subthreshold slope links the system performance to the two most important
leakage sources.
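
The link between slope, threshold voltage and off current can be put into numbers. The sketch below uses the standard thermal expression ln(10)·kT/q times a body factor for the slope, and the rule that the off current changes by one decade per slope worth of threshold shift; the function names are hypothetical.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant in J/K
Q_E = 1.602176634e-19     # elementary charge in C


def slope_mv_per_decade(temp_k, body_factor=1.0):
    """Subthreshold slope ln(10)*kT/q*(1 + Cdm/Cox) in mV/decade.

    body_factor stands for 1 + Cdm/Cox; 1.0 is the ideal limit.
    """
    return 1000.0 * math.log(10.0) * K_B * temp_k / Q_E * body_factor


def off_current_ratio(delta_vth_mv, slope_mv):
    """Factor by which the off current changes when Vth shifts by delta_vth_mv."""
    return 10.0 ** (-delta_vth_mv / slope_mv)


print(f"ideal slope at 300 K: {slope_mv_per_decade(300.0):.1f} mV/decade")
print(f"Vth lowered by 80 mV at 80 mV/decade: "
      f"{off_current_ratio(-80.0, 80.0):.0f}x leakage")
```

A larger body factor, i.e. a smaller Cox relative to the depletion capacitance, makes the slope worse, which is why the oxide has to be scaled together with the channel.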
Recent Problem: Variation

global variation
threshold voltage of all transistors is deviating

influence
Vth↓ = performance↑ & leakage↑
Vth↑ = performance↓ & leakage↓

[Figure: inter-die distribution of Vth [CLEAN Workshop]]

Global variation: each parameter that can vary in production and that has an influence on the
threshold voltage is contributing to the variation of subthreshold leakage and performance.
Vth-correlation for one die

inter-die
within fab: identical Vth
wafer-to-wafer: identical Vth
die-to-die: Vth gradients

intra-die
within die: random Vth

[J Cong]

Different sources of variation occur at different production steps and thus have a different
range of influence (entire lot, one wafer, single die, or each transistor individually).
In fact there are also intermediate effects (for instance resulting in variation gradients over
the die).
Recent Problem: Running out of Atoms

doping variation
45nm: statistical local fluctuation
32nm and below: random discrete dopants

[D Frank]


Our technology will soon run out of atoms.

In 45nm, the random dopant effect will result in high statistical fluctuation of the threshold
voltage (only 100 dopants are not always distributed smoothly).
In 32nm, each single doping atom may have a significant influence on the transistor.
line edge roughness


Molecular sizes (resist, silicon, metal)
introduce LER of several nm

[A Asenov]


Each material we use in production shows a certain granularity (larger than the size of a single
atom/molecule). The metal clusters, for instance, will lead to a line edge roughness of
approximately 5nm for each metal line. For a 90nm interconnect, this effect may only be
additional noise, but for a 22nm line, a ±5nm variation is significant.
[Chandu Visweswariah]
oxide thickness
Atomic dimensions already reached
Tox<2nm in 65nm node
SiO2 has 0.3nm diameter
less than 6 layers


The oxide thickness is already the smallest structure we have to control. In 65nm, the typical
thickness is 1.5nm (which is 5 molecule layers). With high-k, this size is becoming larger
again (approx 10-15nm).
Recent problems: Degradation
electromigration
Caused by: high temp, high current density
Results in: high temp, high current density
→ runaway problem, sudden death

Degradation (= aging of the components) is the next problem that we will have to face.

Electromigration: At high temperatures and current densities, the metal ions follow the
momentum of the electrons.
Problem: At the thinnest position, the current density and the temperature are both highest. The
thinner the connection, the faster its material vanishes.
→ Drastic end in a runaway situation (see sudden Northwood death syndrome)
hot electron degradation


Caused by: high VDD, very low VBB
Results in: oxide charge
→ Vth increase → oxide breakdown

Hot electron degradation (there are 4 different effects causing hot electron degradation).

Electrons can enter the oxide and are trapped there, pre-charging the channel and thus
influencing the threshold voltage.
Systems can recover from hot electron degradation. The change is not permanent.
oxide breakdown
Caused by: oxide charge, dirt in oxide,
radiation, mesh errors
Results in: sudden runaway → oxide failure

Due to charges, dirt, mesh errors or radiation impact in the oxide, the current can punch
through the oxide, resulting in a permanent device failure.
Negative Bias Temperature Instability


Caused by: hydrogen passivation of
PMOS channel lost to the oxide
Results in: Vth increase


The passivation between channel and oxide (hydrogen atoms) can vanish (attracted by the
gate).

The effect only occurs in PMOS devices when the transistor is conducting (bulk at VDD and
gate at ground), so that the gate-to-bulk voltage is negative.
Recent Transistor Design

MOSFET structure
1 poly-silicon gate
2 buffer oxide layer
3 gate sidewall isolation
4 gate oxide
5 metal drain contact
6 elevated source
7 source
8 shallow source extension
9 channel
10 halo doping
11 pocket implant
12 retrograde well
13 field oxide (FOX)

well engineering 2002
stabilize Vth
6,8: reduce Rsd
10: reduce Vth(Lch) dep.
11: reduce Vth(VDD) dep.
12: reduce Vth(Vbs) dep.

[Figure: MOSFET cross-section with the numbered regions 1 to 13]

Which anti-leakage techniques are already available?

Well engineering has been available since 2002 to control the dependence of the threshold
voltage on other parameters (well engineering is still being further developed).
strained silicon 2004


70% faster electrons
35% higher performance


Since 2004, we can control (increase) the mobility of the majority carriers (electrons or holes)
to increase the performance without threshold voltage reduction
high-k 2008
high slope is needed
high slope needs high Cox
Cox sinks with L & W scaling
Tox can no longer be reduced
→ introduce HfO2
[Intel Penryn]

Since 2008, high-k technology is also available (e.g. Intel Penryn) to control gate leakage.
Future Transistor Design
multi-gated transistors
increasing electrostatic control
approach the 63mV limit

FinFET
TriGate [C.Jahan]
Surrounding gate

[Figure: FinFET, TriGate and surrounding gate structures with source, drain, gate and fin labelled [AMD’s 15nm FinFET]]

Which anti-leakage techniques are about to come?

Multi-gate devices such as FinFET will combine SOI, high-k and an increased electrical
control over the channel.
With FinFETs the subthreshold slope will be brought very close to the fundamental limit
enabling further threshold voltage and thus also frequency scaling
UTB FD SOI
fully depleted ultra thin body SOI
ultra-thin: SOI layer thinner than depletion width
channel is undoped
→ no wells
→ higher integration density
→ no doping variation

[Figure: bulk versus FDSOI cross-sections (gate, gate oxide, buried oxide, depletion region); FDSOI device with metal gate, high-k dielectrics and undoped thin channel; micrograph labelled poly, TiN, NiSi, HfO2 with a 20nm scale [F.Andrieu]]

Fully depleted ultra thin body devices will enable bulk CMOS at least down to the 32nm node.
The undoped channels avoid random dopant variation and increase the integration density, as
no wells are needed.
Gate Level Techniques: Power Gating

easy concept
use fast, low Vth transistors for logic
use low leakage, high Vth transistors to cut leakage path on idle
[Figure: SCCMOS power gating with a sleep transistor driven by Vslp (Vslp<0 if possible); FBB to reduce W of the sleep transistor; NMOS gating also possible]

Power gating in principle is very easy: The logic is implemented in fast, but high leakage
devices. If the component is idle, a slow, but low leakage gating transistor in series with the
gate cuts the leakage current paths.
technical problems
gating transistor sizing: supply noise and wakeup delay
floating outputs
row- or cell based gating
[Figure: wakeup current I over t with Imax and Twakeup marked, for a large and for a small sleep transistor; output voltage U[V] between VDD and Gnd over t after sleep; sleep transistor per gate versus per row]

The problems of power gating are mostly technological:

Depending on the gating transistor size, area overhead, remaining leakage, performance
degradation, wakeup time and supply noise can be traded off.
At wakeup, the gates are in an intermediate, high short circuit state for a very long time.
Power gating can be done gate-wise (high area overhead, but with a modified library, all
design tools can be reused)
or row-wise (conceptually better, but the timing of each cell now depends on the
activity of all neighbor cells).
state retention
sequentials are losing state at power down
voltage anchors, balloon latches, dual VDD
[Figure: voltage anchor between logic with sleep mode and logic w/o sleep mode; balloon state retention flip-flop with store/restore (SR), slp and Clk signals: retention latch with high Vth, not gated; flip-flop with low Vth, but gated]

When using power gating, the interface between gated and non-gated parts has to be
stabilized.
When going to sleep, the state of the sequentials has to be stored, and then restored when
waking up.

For both problems, design for power gating at the behavioural level can avoid or at least
reduce the overhead for interfacing and state retention.
System Level: adaptive techniques

on-chip binning
exploit correlation of leakage and performance
adaptively slow down the system to save leakage

[Figure: scatter plot of energy [nWs] (log scale, 10 to 1000) versus delay [ns] (4 to 6)]

Above the gate level, the most promising technique is adaptive leakage management.
The idea is always either to speed up slow systems at the cost of leakage power or to slow
them down, reducing the leakage.

The plot here shows a system’s power and performance distribution before the application of ABB.
[Figure: the same scatter plot of energy [nWs] versus delay [ns] after the application of ABB]

The second plot shows the distribution after the application of ABB.

With AVS, all systems would exactly meet their performance constraint at the least possible
leakage.
techniques
AVS
ABB
ABB+AVS

[Meijer04]



The available techniques are adaptive voltage scaling, adaptive body biasing, or a combination of both.

implementation
design time
test time
boot or run time
[Figure: leakage versus frequency, [Narendra05], [Meijer04]]

Depending on the application time, different effects can be exploited:

At design time, these techniques are not adaptive. Nevertheless, by identifying the best body
bias/supply working point, leakage can be reduced without a performance degradation (thus,
this sweet spot tuning should always be considered as soon as body bias control is available).
At test time, ABB/AVS techniques can exploit the strong leakage-delay correlation of global
process variations. Test-time tuning has a low overhead, as measurements of leakage and
delay can be made off-chip by the testing equipment.
ABB/AVS at boot time can also exploit aging margins in young systems to reduce leakage and
thus slow down the (thermal and current driven) aging effects. With boot-time tuning, on-chip
monitors for delay (and temperature) have to be introduced.
At runtime, margins for thermal and voltage variations can also be used to reduce leakage as
long as the system is not in its worst case condition. But of course, runtime tuning also has the
largest implementation overhead. The system has to be made ready to cope with a runtime
adaptation of the supply voltage.
Final Conclusion

leakage currents
subthreshold leakage will be under control
gate & junction leakage can be avoided

global variations
sensitivity can be reduced
will be controlled by adaptive techniques

local variations
some will be bypassed
some will be made deterministic by better models
LER will remain critical

degradation
will be the next problem to solve


Final conclusions:
The leakage currents will be kept under control, but we have to sacrifice all frequency
scaling.
From the technology side, transistors will be made less sensitive to parameter variation, and
the remaining global variation can be reduced by adaptive techniques.
Some statistical variations will be bypassed (e.g. random dopant effect). Some effects are not
really statistical (such as the well proximity effect or OPC related disturbances). With better
estimation software, they may be made deterministic and can then be avoided by design (if the
well proximity effect always increases a certain transistor’s threshold voltage, we can reduce
the threshold by design, so that the final device has exactly the desired properties).
But especially the line edge roughness will remain a critical source of statistical variation.

Degradation will be the next problem we will have to face.
