
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 19, NO. 5, MAY 2011, p. 883

A Compact Hybrid Current/Voltage Sense Amplifier With Offset Cancellation for High-Speed SRAMs
Mohammad Sharifkhani, Member, IEEE, Ehsan Rahiminejad, Member, IEEE,
Shah M. Jahinuzzaman, Member, IEEE, and Manoj Sachdev, Senior Member, IEEE

Abstract—A hybrid current/voltage sense amplification scheme is proposed for high-speed SRAMs. The scheme includes an offset cancellation technique that makes it robust against current sense amplifier (CSA) mismatch. The offset cancellation allows fast open-loop operation of the differential CSA. A fourfold reduction of the cell access time is achieved compared to the conventional scheme under similar cell current and bitline capacitance. Thanks to its automatic turn-off nature, the proposed CSA incurs zero static power without an auxiliary turn-off circuit. The reduced charge redistribution on the bitlines also yields a low bitline dynamic power consumption. In this work, the proposed scheme is rigorously analyzed, and the analysis is verified using circuit-level simulations. The conventional scheme serves as the reference for both the analytical and the simulation-based comparisons.

Index Terms—Current sense amplification (CSA), high-speed, SRAM.

Manuscript received July 19, 2009; revised October 20, 2009. First published February 17, 2010; current version published April 27, 2011.
M. Sharifkhani and E. Rahiminejad are with the Department of Electrical Engineering, Sharif University of Technology, Tehran 1458889694, Iran (e-mail: msharifk@sharif.edu; msharifk@vlsi.uwaterloo.ca; e.rahiminejad@yahoo.com).
S. M. Jahinuzzaman is with the Department of Electrical and Computer Engineering, University of Concordia, Concordia, QC H3G 1M8, Canada (e-mail: shah@ece.concordia.ca; smjahinu@engmail.uwaterloo.ca).
M. Sachdev is with the Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada (e-mail: msachdev@uwaterloo.ca).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TVLSI.2009.2039949
1063-8210/$26.00 © 2010 IEEE

I. INTRODUCTION

NANOSCALE technologies have unveiled two significant challenges to the design of high-speed and reliable SRAMs. The first challenge is process variation, which threatens reliability by affecting the sensitivity of the sensing circuits. This effect demands larger signal magnitudes, which deteriorate the speed as well as the power consumption. The second challenge is the variation of the cell current, which reduces the worst-case cell current that drives the bitline. In effect, this reduction demands a longer wordline activation time to ensure sufficient bitline voltage swing for correct sensing. The wordline activation time directly influences the cell data stability. A long wordline activation time can affect the noise margin [1]–[3].

Two-phase, latch-type voltage sense amplifiers (VSA) are favorable because of their simple structure and low power consumption. The self shut-off mechanism of the VSA has made it a pervasive choice in today's SRAM design. Under ideal conditions, where the offset of the VSA is zero, the VSA amplifies an infinitely small differential voltage between its two internal nodes when it is enabled. In practice, however, the initial differential voltage at the internal nodes of the VSA must be larger than the offset for correct detection. This translates into a large bitline voltage swing, which adversely affects the speed, power, and wordline activation time. Hence, offset presents the fundamental limit to the sense amplification process in an SRAM. Offset cancellation using double-gate MOS devices and/or complex circuits has been proposed to alleviate this concern [4], [5]. This paper presents a rigorous analysis of the offset effect in the proposed two-stage sense amplification scheme. The method is applicable to any other sensing scheme.

Fig. 1. (a) Concept of current sense amplification and (b) differential current sensing with offset.

Current sense amplifiers (CSA) have long been proposed as a promising approach for high-speed applications since they do not require a large bitline voltage swing for detection [6]. Fig. 1(a) illustrates the concept of the CSA. A CSA is essentially a linear current buffer that ties the bitline to a known voltage and presents a small input resistance to the input signal, which has the form of a current. The output current creates a charge variation at the output node, which results in a sufficiently large output voltage. Linear amplifiers, however, are usually large and impose a dc power consumption. Therefore, at the expense of area and complexity, additional circuitry is required to turn off the CSA when it becomes idle [7]. The proposed scheme offers an automatic shut-off mechanism, which results in reduced area and power.
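The offset limit described above for voltage sensing can be illustrated with a back-of-the-envelope estimate. The sketch below assumes the bitline behaves as an ideal capacitor discharged by a constant cell current, so the time needed to develop a given swing is t = C·ΔV/I. The 67 µA cell current and 1 pF bitline capacitance are the values quoted later in the paper; the 50 mV offset and the function name are hypothetical choices made only for illustration.

```python
def time_to_develop_swing(c_bl, delta_v, i_cell):
    """Time (s) for a constant current i_cell (A) to change the voltage
    on a capacitance c_bl (F) by delta_v (V): t = C * dV / I."""
    if i_cell <= 0:
        raise ValueError("cell current must be positive")
    return c_bl * delta_v / i_cell


I_CELL = 67e-6    # cell current quoted in the paper (A)
C_BL = 1e-12      # bitline capacitance quoted in the paper (F)
V_OFFSET = 50e-3  # hypothetical latch VSA offset (V), illustration only

t_min = time_to_develop_swing(C_BL, V_OFFSET, I_CELL)
print(f"minimum wordline-on time: {t_min * 1e12:.0f} ps")
```

With these numbers the wordline must stay active for roughly 0.75 ns before the bitline swing exceeds the assumed offset, and the time scales linearly with the offset. This ideal-capacitor model ignores leakage and the VSA input capacitance.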
A differential CSA is vulnerable to the bias current offset in the differential legs. Fig. 1(b) shows the conceptual circuit of a typical differential CSA that deals with a differential current signal. The dc offset of the CSA can significantly influence the output node voltage [8]. To mitigate the offset effect, closed-loop offset-insensitive circuits have been proposed that contain a voltage amplifier in a feedback loop [9].

Fig. 2. Closed-loop offset insensitive differential CSA.

Fig. 2 illustrates an example of a closed-loop offset-insensitive CSA. With an infinite gain and zero offset, the OpAmp can mitigate the mismatch between the nMOS transistors. In practice, the OpAmp offset and limited gain tax the performance of the architecture. Moreover, the architecture mitigates the mismatch only after the loop settles at its dc condition, where the bitlines are at their nominal voltages. Given the large bitline capacitance, the CSA has to be activated long before it can actually operate. This adds to the dc power consumption of the CSA. Furthermore, the bandwidth of the amplifier should be considerably wide and proportional to the bitline capacitance to maintain the speed advantage of the CSA [10]. This demands a high dc power for a closed-loop CSA. In this paper, an open-loop offset cancellation method is proposed which removes the OpAmp altogether without sacrificing the performance due to the mismatch offset.

Owing to its large area, a conventional CSA is usually shared among a number of bitlines using resistive multiplexer switches. The resistance of these switches deteriorates the speed [11]. To alleviate this effect, additional switches and amplifiers are placed in a feedback loop at the expense of complexity, area, and power [7]. The proposed scheme combines the operation of multiplexing with the current amplification, which results in a significant reduction in area and complexity.

This paper presents a hybrid offset-cancelled current sense amplifier structure (OCCSA) that offers a reduced bitline voltage swing and a low power consumption by reducing the wordline activation time. The scheme provides a self-shutting mechanism at a negligible area overhead. This paper is divided into the following sections. Section II explains the operation of the scheme. Section III derives analytical expressions to quantitatively describe the benefit of the scheme and its tradeoffs. Section IV presents the simulation results and a comparison with the conventional scheme. Section V describes a design example to shed light on the practical aspects of the proposed scheme. Finally, Section VI draws the conclusion.

II. OPERATION

In this work, we present a hybrid current/voltage SA, the OCCSA, that operates in three phases without extra timing overhead (see Fig. 4). The three phases are referred to as the preamplifier offset cancellation phase (POC), the access phase (ACC), and the evaluation phase (EV). The circuit includes a presense amplifier (PSA), which operates as a CSA at the first stage during the ACC phase as well as a multiplexer switch. In the OCCSA circuit, the PSA is an open-loop CSA. Hence, it offers a wide bandwidth at a low power consumption, but with a higher vulnerability to the differential offset. Therefore, its offset is canceled in the POC phase before it actually operates as a PSA in the ACC phase. The second stage is a latch-type VSA that amplifies the PSA output in the evaluation phase.

Fig. 3 compares the operation of conventional VSA-type sensing with the OCCSA scheme. While a more complicated CSA could be used, a single-stage differential common-gate amplifier comprising M1 and M2 is employed as the PSA of the OCCSA scheme. The bitlines are the input terminals of the PSA, and the drains of M1 and M2 are its output nodes. The output currents of the PSA transistors are functions of the transconductance, g_m, of M1 and M2. g_m itself is a function of the effective voltage, V_eff = V_GS − V_TH, of M1 and M2. A conventional pMOS multiplexer operates in the triode region during the wordline activation since its drain-source voltage is small while its source-gate voltage is about VDD.

Fig. 3. Timing and voltage waveforms in the (a) conventional and (b) proposed OCCSA scheme.

The PSA differential offset is mainly due to the threshold voltage mismatch between M1 and M2 [12]. The threshold voltages of transistors M1 and M2 can be described as V_TH1 and V_TH2, respectively. The difference in the threshold voltages creates a difference in V_eff and, hence, in the g_m of the amplifiers. This results in an offset current. In order to remove the offset current, the V_eff of the devices should be equalized.

The first phase is the POC phase, where the wordline is inactive and the PSA offset is canceled as shown in Fig. 4. In this phase, M1 and M2 are matched by equalizing their V_eff. Proper charge-up of the bitlines (i.e., the source voltages of M1 and M2) ensures an equalized V_eff at the end of the POC phase. In this phase, the bitlines are initially precharged to an equal voltage level. This voltage is chosen sufficiently below VDD − V_TH, where V_TH is the average threshold voltage of M1 and M2. For a VDD of 1.8 V, a precharge voltage around 0.9 V is typical. During the POC phase, the PSA output nodes are tied to VDD. The POC phase starts with the activation of the gate signal of M1 and M2. Hence, M1 and M2 turn on and pull up the bitline voltages from the precharge level towards VDD − V_TH1 and VDD − V_TH2, respectively. Since the initial bitline voltage at the source is the same for M1 and M2, the effective voltages of M1 and M2 are not equal because of the threshold voltage mismatch. Hence, the charging currents of the transistors depend on their respective effective voltages. Thus, at the beginning, the transistor with the lower threshold voltage sources a larger current, which over time creates a higher voltage on the corresponding bitline at its source. That is, the transistor with the lower threshold voltage ends up with the higher source voltage. Hence, over time, the effective voltages of the two transistors are equalized because a higher source voltage accompanies a smaller threshold voltage. In other words, in the POC phase a differential voltage is developed between the bitlines which compensates the threshold voltage difference of M1 and M2 to achieve an equal effective voltage. Since the effective voltages of M1 and M2 are equalized, the PSA can be used as an offset-free differential CSA in the next phase. A longer POC phase, therefore, offers a better offset cancellation at the expense of a smaller V_eff and, hence, a smaller g_m. Clearly, the differential bitline voltage at the end of POC does not cause a detection error in the ACC phase, since the data in the ACC phase is detected through differential current sensing at the source of the PSA. With the offset cancelled, the PSA differential output current is determined by the cell current (see Fig. 1).

Fig. 4. Timing associated with the proposed OCCSA considering Fig. 3.

The circuit enters the ACC phase when the wordline is activated and the precharge of the PSA output nodes is disabled. The cell reveals the data over the bitlines in the form of a differential current. During the entire ACC phase, the PSA operates as a differential unity-gain current buffer on the cell current. The PSA forms a differential voltage variation at its output nodes. The differential voltage created at these nodes (and at the internal nodes of the VSA) is far larger than the differential voltage over the bitlines. This is because the bitline capacitance is larger than the capacitance of the PSA output nodes by orders of magnitude. In other words, when the PSA operates on the cell current, the required differential voltage at the internal nodes of the VSA is created in a much shorter time. This effect results in a reduced wordline activation time, which boosts the cell data stability and the yield [3]. Note that in the ACC phase the PSA output nodes, as well as the internal nodes of the VSA, are floating because M3 and M4 are off. It is evident that without proper offset cancellation in the POC phase, the PSA differential output current is dominated by the PSA offset rather than the cell current.

The evaluation phase is similar to that of the conventional scheme. The VSA amplifies the initial condition to the supply levels when it becomes active. If the initial condition between the two internal nodes of the cross-coupled latch is larger than the offset voltage of the VSA, the correct evaluation of the initial condition takes place. Otherwise, the output of the VSA is determined by the intrinsic offset of the VSA. Hence, in both the conventional and the OCCSA scheme, the necessary condition for correct operation is that the initial differential voltage at the internal nodes of the VSA before the evaluation exceeds the VSA offset.

Fig. 5(a) presents the HSpice simulated waveforms of the OCCSA scheme where the PSA does not have an offset. In this design, the POC time is about 200 ps and the access time is about 300 ps. With a precharge voltage of 0.9 V, the bitlines are charged up to about 1 V through the sources of M1 and M2 during POC. Clearly, since there is no offset, the bitlines do not develop a differential voltage at the end of POC. Moreover, since the drains of M1 and M2 are tied to VDD, the internal nodes of the VSA are preset and equalized to VDD in this phase. The ACC phase begins with the activation of the wordline and the deactivation of the precharge of the PSA output nodes. During this phase, the current that is sunk by the cell is amplified by the PSA. At the end of the ACC phase, the differential voltage of the bitlines is about 30 mV while the differential voltage at the internal nodes of the latch VSA is about 90 mV. This amplification can be attributed to the differential current sensing of the PSA in the ACC phase. Fig. 5(a) illustrates the case where the POC phase is finished before the BL is fully charged up to VDD − V_TH. Hence, the PSA keeps charging up the BLs during the ACC phase as well. It is noteworthy that since the current buffer is only an nMOS transistor, its input impedance is far from perfect (i.e., zero); hence a drop in the bitline differential voltage appears during the ACC time.

Fig. 5(b) illustrates the waveforms of the same nodes when the PSA has a pessimistic offset voltage of 35 mV. During the POC phase, the bitlines are precharged by M1 and M2 from 0.9 V up to 1.1 V. Because of the offset between M1 and M2, the bitlines are precharged differently. In the POC phase, since M1 has a larger threshold voltage than M2, the bitline charged by M1 charges up more slowly than the one charged by M2. At the end of the POC phase, the effective voltage of both transistors is almost the same. Hence, they provide similar current amplification. When the circuit enters the ACC phase, the differential output current of the PSA is dominated by the cell current. Hence, the differential voltage created at the internal nodes of the latch VSA is determined by the cell current rather than by the PSA offset. Therefore, the PSA offers the same amplification benefit as in the previous case, where the PSA does not have an offset.

The stability of the cells is maintained for two reasons. First, the wordline is activated once the precharge voltage of the bitline is around VDD − V_TH, where V_TH is the threshold voltage of the PSA transistors. This precharge voltage is sufficiently high to maintain the cell stability [13]. Second, the current sense amplification allows a significantly shorter than usual wordline activation time. A shorter cell access time translates into a higher cell stability [3].

III. ANALYSIS

In this section, we derive analytical expressions that explain the behavior and effectiveness of the OCCSA scheme. This analysis unveils the efficiency of the scheme and its design tradeoffs, and provides a quantitative comparison against the conventional latch SA. In this analysis, the threshold voltage mismatch between M1 and M2 models the offset of the CSA; this type of modeling is widely used in the context of differential SA schemes [8], [12].

The analysis of the operation of the circuit is based on Fig. 3 and the explanation provided in the previous section. The flow of the derivation is demonstrated in Fig. 6. In each of the three
phases, the voltage of the critical nodes of the circuit is analyzed. Key parameters that are used in the analysis are also described in the figure. The final conditions of the nodes in each phase are the initial conditions for the respective subsequent phase.

Fig. 5. Voltage waveforms in the OCCSA (a) without and (b) with a PSA differential offset of 35 mV.

Fig. 6. Flow of derivations incorporating the offset of PSA and latch VSA as well as design parameters including timing parameters.

The POC phase analysis unveils two key parameters that affect the operation of the PSA transistors in the subsequent ACC phase. The residual offset of the PSA is defined as the effective offset of the PSA transistors after the completion of the POC phase. It is derived in Appendix I as

(1)

The effective transconductance of the PSA transistors, which operate in the ACC phase, is based on the gate and source voltages of the PSA transistors at the end of the POC phase

(2)

The ACC analysis describes the effect of two counteracting signals in the ACC phase: (a) the PSA residual offset and (b) the cell current. Over the ACC phase, both of these signals are amplified by the PSA transistors, and their superposition creates the PSA output voltage. The PSA residual offset carried over from the POC phase creates the effective PSA offset at the end of the ACC phase. This value is a function of the ACC time

(3)

where the capacitance term is the parasitic capacitance at the output of the PSA. The cell wordline is active during ACC, which provides the cell current that carries the data signal. Over the ACC phase, the cell current creates a differential voltage at the PSA output. The cell current has to go through two RC stages to create this voltage at the PSA output (see Fig. 7)

(4)

Fig. 7. Differential voltage at the internal nodes of the latch VSA is the superposition of two components: (a) the PSA residual offset from the POC phase and (b) the cell current.

In this analysis, the differential output of the PSA is essentially the internal nodes of the latch VSA, assuming that the VSA input pMOS gates impose a negligible RC time constant. Hence, at the end of the ACC phase, the differential voltage at the internal nodes of the latch VSA can be expressed as

(5)

assuming that the cell current and the PSA residual offset counteract, which is what takes place in the worst-case scenario (see Fig. 6).

A quick comparison can be made against the conventional scheme, where the same latch VSA is used without a PSA. For a large enough , the differential voltage in (5) is substantially larger than that of the conventional scheme. Moreover, the OCCSA scheme holds a square-law relationship between the cell access time and this differential voltage, while the conventional scheme provides a linear relationship between the two.

A. OCCSA Scheme Under the Limiting Case of Zero Latch VSA Offset

The differential voltage in (5) is amplified to the full logic levels in the evaluation phase. For a correct detection, the sign of this voltage must be in favor of the component created by the cell current even if the latch VSA offset is zero

(6)

From (5) and (3), (6) can be expanded to

(7)

Combining this inequality with (2) and (1), we have

(8)

Practical design values in a 0.18-µm technology portray the design tradeoffs obtained in the analysis. For the typical PSA offset of 10 mV, Fig. 9 demonstrates the tradeoff between the access time and the offset cancellation time for various values of the bitline precharge voltage under a cell current of 67 µA and a bitline capacitance of 1 pF, which is associated with a 512 cell/column configuration.

A longer offset cancellation time offers a smaller required access time. Similarly, a smaller bitline precharge voltage relaxes the timing requirement for a given PSA offset. Also, from (8), a smaller PSA offset allows a shorter access time. Notably, for a sufficiently long access time, the circuit operates properly without any offset cancellation time.
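The statement that a longer offset cancellation time leaves a smaller residual offset can be checked numerically with a simplified model. The sketch below is not the paper's Eq. (1); it assumes each PSA transistor follows the square-law model I = 0.5·k·V_eff², has its gate held at VDD, charges an ideal bitline capacitance, and has no body effect. Under those assumptions V_eff obeys dV_eff/dt = −(k/2C)·V_eff², which has a closed-form solution. The gain factor `K`, the two threshold values, and the function names are hypothetical.

```python
def veff_after_poc(veff0, k, c_bl, t_poc):
    """Effective voltage (V) of a PSA device after charging a bitline
    capacitance c_bl (F) for t_poc (s).

    Assumes the square-law model I = 0.5 * k * veff**2, a fixed gate
    voltage, and no body effect, so d(veff)/dt = -(k / (2*c_bl)) * veff**2,
    whose solution is veff0 / (1 + a * veff0) with a = k * t / (2 * c_bl).
    """
    if veff0 <= 0:
        return veff0  # device is off; the bitline is not charged further
    a = k * t_poc / (2.0 * c_bl)
    return veff0 / (1.0 + a * veff0)


def residual_offset(vdd, v_pre, vth1, vth2, k, c_bl, t_poc):
    """Difference between the two effective voltages at the end of POC."""
    veff1 = veff_after_poc(vdd - v_pre - vth1, k, c_bl, t_poc)
    veff2 = veff_after_poc(vdd - v_pre - vth2, k, c_bl, t_poc)
    return veff2 - veff1


VDD, V_PRE, C_BL = 1.8, 0.9, 1e-12  # values quoted in the paper
VTH1, VTH2 = 0.600, 0.565           # hypothetical thresholds, 35 mV mismatch
K = 1e-2                            # hypothetical device gain factor (A/V^2)

for t_poc in (0.0, 200e-12, 400e-12):
    res = residual_offset(VDD, V_PRE, VTH1, VTH2, K, C_BL, t_poc)
    print(f"T_POC = {t_poc * 1e12:3.0f} ps -> residual offset = {res * 1e3:.1f} mV")
```

In this model the residual offset shrinks monotonically as the POC time grows, but so does V_eff itself, which mirrors the tradeoff described in Section II between offset cancellation and the transconductance left for the ACC phase.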
For a fixed access time, it is instructive to find the minimum offset cancellation time that guarantees the correct detection condition described in (6)

(9)

Fig. 8. Offset cancellation time needed to compensate the PSA offset for different bitline precharge voltages.

Fig. 9. Access time needed for correct detection for different offset cancellation times under different bitline precharge voltage conditions.

Fig. 8 illustrates the effect of the PSA offset on the required offset cancellation time for various bitline precharge voltages with an access time of 300 ps, a cell current of 67 µA, and a bitline capacitance of 1 pF. A larger PSA offset requires a longer offset cancellation time. However, by choosing a lower bitline precharge voltage, the timing constraints are relaxed. It is noteworthy that for a PSA offset of 10 mV, the differential voltage is dominated by the cell current, and the PSA offset does not make a noticeable contribution for the specified access time. Hence, no PSA offset cancellation is needed.

IV. COMPARISON AND SIMULATION RESULTS

Conventional two-step voltage sense amplification is taken as a reference for the comparisons in this section. Most of the previously reported techniques compare themselves with the conventional VSA scheme in their respective technologies. Three sets of comparisons are made in this section. The first set compares the OCCSA scheme against the conventional scheme based on the equations derived in Section III. The second set compares the derivations against the simulation results for different design values for the OCCSA and conventional schemes. The third set compares the two schemes in terms of energy consumption for the same latch VSA offset. In these comparisons, the following values are held for the respective variables: a cell current of 67 µA, a VDD of 1.8 V, a typical PSA offset of 10 mV, and a bitline capacitance of 1 pF, which is associated with a 512 cell/column configuration under worst-case capacitance. In the two latter comparisons, the simulations included parasitic effects such as the leakage current of the non-accessed cells on the selected column (see Fig. 11). The leakage affects the input signal independent of the type of sense amplifier used.

A. Comparison Under Equal Nonzero Latch VSA Offset

A comparison is made against the conventional scheme when both schemes employ an identical latch VSA with the same offset (see Fig. 3). To make a correct detection, the following inequality must hold:

(10)

The voltage created as a result of the cell current must be larger than the sum of the counteracting offsets due to the PSA and the latch VSA.

Fig. 10(a) portrays the minimum access time needed to create a voltage larger than the latch VSA offset at the end of the access time for different offset cancellation times under a typical PSA offset of 10 mV and a bitline precharge voltage of 0.9 V. For typical values of VSA offset, the OCCSA scheme outperforms the conventional scheme by offering a three times smaller cell access time. In an array, the minimum access time is determined by the worst-case offset over the entire array. Hence, the actual access time is a function of the largest offset in an array. Fig. 10(b) shows the same tradeoff for different values of the bitline precharge voltage under an offset cancellation time of 0.5 ns for 512 cells/column.

Fig. 10. Access time needed for correct detection when a latch VSA offset exists under various (a) offset cancellation times and (b) bitline precharge voltages.

To put the comparative analysis in perspective, we compare the voltage that is generated at the VSA internal nodes as a result of the cell current in both schemes. In the OCCSA scheme, this voltage is derived in (5). For the same cell and BL capacitance, the same voltage in the conventional scheme (without PSA) can be expressed as follows:

(11)

The improvement factor is defined as the ratio of the differential voltage created using the OCCSA scheme over the differential voltage created using the conventional scheme for the same cell access time in both schemes

(12)

On the other hand, the substitution of (11), (5) and (2) in (12) unveils a maximum value for the bitline precharge voltage that achieves a given improvement factor

(13)

where is the value of the at the beginning of the POC phase

(14)

This maximum value is defined in the context of the comparison against the conventional scheme. The proposed scheme can lose its performance benefit over the conventional method if the POC time is increased aggressively. If the POC time is allowed to exceed a certain limit, then the BL charges up such that the overdrive voltage of the PSA transistors will be too small at the end of the POC phase. This will slide the PSA towards the shut-off state before it acts as an amplifier in the access time. On the other hand, for given design constraints, (9) and (13) can be solved together to provide the optimum operating point. The design procedure is discussed in the Section IV-B.

Fig. 12 demonstrates the relationship between the improvement factor and the bitline precharge voltage under different PSA offset conditions. For example, to achieve a two-times improvement over the conventional scheme, a precharge voltage of 1 V is needed when the offset of the PSA is around 70 mV. Smaller PSA offsets allow a higher precharge voltage and therefore a smaller energy overhead.

B. Simulation Results: Speed Comparison

Fig. 13 demonstrates the minimum access time needed for proper detection for a given latch VSA offset based on HSpice simulation results and the equations. The simulations and derivations are compared under various values of the bitline precharge voltage and the offset cancellation time. It can be seen that the simulation results are in good agreement with the derivations, with less than 10% error. The error is mainly due to the body effect, which affects the threshold voltage of the PSA devices as they operate during the POC and ACC phases and has not been accounted for in the analysis.

The proposed scheme is compared against the conventional method under various latch VSA offsets for both schemes. Fig. 14 illustrates the HSpice simulation results for both schemes. Up to ten times cell access time improvement can be achieved using the proposed scheme for a similar latch VSA offset, cell current, and bitline capacitance. The simulation results are consistent with the predictions made by the design equations derived in the previous section. It is noteworthy that the offset cancellation time runs in parallel with wordline decoding to ensure that the cell access time improvement enhances the overall access time.

C. Simulation Results: Energy Comparison

Table I shows the power efficiency of the proposed scheme compared to the conventional scheme when both schemes suffer from the same latch VSA offset, the same cell current, and the same bitline capacitance of 512 cells/column. The results are obtained using accurate circuit-level simulations. The simulations are conducted under several values of latch VSA offset for both schemes. The latch VSA offset imposes a minimum access time, and hence a minimum bitline voltage variation, in the conventional scheme. The energy consumption for the conventional scheme is calculated using (15) [14], [15]

(15)

where the voltage term is the minimum bitline voltage variation in the conventional scheme for correct detection. This value is obtained using circuit simulation for a given latch VSA offset.

The bitline energy consumption for the proposed scheme is calculated for the same latch VSA offset used in the conventional scheme under a given offset cancellation time and access time. In these simulations, the design variables of 0.7 ns and
0.45 ns impose the maximum value of the bitline precharge voltage. That is, to correctly detect in the given time (for offset cancellation and for sensing) and for a given offset voltage (the latch VSA offset and the PSA offset), the PSA transistors demand a certain amount of effective voltage to cancel the PSA and latch VSA offsets. The required effective voltage sets the maximum value of the bitline precharge voltage. Consequently, the choice of the precharge voltage affects the power consumption (16)

(16)

where the voltage term is the supply voltage and the charge term is the charge variation over one bitline. The values used in (16) are obtained using circuit simulation where correct detection is achieved under the timing and mismatch constraints described previously. Note that in the proposed scheme, both bitlines undergo a voltage variation in the POC phase; hence a factor of two appears in (16). The bitline voltage variation during the ACC phase is negligible owing to the current-sensing nature of the PSA.

Fig. 11. Schematic of the simulated circuit that includes the leakage current of the idle cells on the selected column.

Fig. 12. Bitline precharge voltage needed to achieve noise margin improvement over the conventional scheme under various PSA offsets.

For the typical PSA offset values of about 10 mV, more than four times improvement in the access time comes at the expense of a mere 50% more energy, if the same latch VSA is used in both schemes. Note that the reduction of the access time not only improves the speed, but also boosts the stability of the cell [16]. The larger energy consumption in the proposed scheme is partly attributed to the fact that in this scheme both bitlines undergo voltage variation.

It is noteworthy that the simulations are consistent with the predictions made in the previous section; a smaller bitline precharge voltage is able
to compensate a larger PSA offset to offer the same cell access time needed to overcome a given latch VSA offset.

TABLE I
ENERGY AND DELAY COMPARISON BETWEEN PROPOSED AND CONVENTIONAL SCHEME FOR IDENTICAL LATCH VSA OFFSET

Fig. 13. Minimum access time that provides correct detection under a given latch VSA offset based on HSpice simulation and derivation results for the proposed scheme under (a) various bitline precharge voltages and (b) a bitline precharge voltage of 0.9 V and an offset cancellation time of 0.4 ns.

Fig. 14. HSpice simulation results for the conventional sense amplification method versus proposed method along with the predictions made by the derivations.

V. DESIGN CONSIDERATION

Fig. 15 describes the design methodology for the proposed scheme. The design depends on the stochastic information of the device parameters, particularly the mismatch of the PSA and the latch VSA offset, as well as circuit-level information. Design requirements such as speed determine the design variables. In order to achieve the highest performance, the POC phase is designed to be concurrent with the row decoding. Hence, no time overhead is added to the overall access time compared to the conventional two-phase scheme. In other words, the scheme is scalable for different frequencies, similar to the conventional scheme. The creation of the POC time window is possible by using dummy decoders.

In the next step, Fig. 10(b) can be obtained for the given design variables using (8). Using Fig. 10(b), the largest allowable bitline precharge voltage is chosen for a targeted access time and the given latch VSA offset. The largest allowable precharge voltage leads to the minimum power consumption (see (16)).

The number of cells in a column determines the bitline capacitance and hence other design variables. The number of cells on a row and the number of words on a row do not influence the sensing procedure except for the parasitic effects due to the interleaving of multiple words. However, for a given address width, the partitioning of the address fields among the decoders influences the decoding time, which indirectly affects other design variables.

We have observed that even if the POC time does not run in parallel with the row decoding time, it is possible to reduce the overall read transaction time below the access time of a conventional method. However, this can only take place if the bitline is precharged at much lower voltages, which affects the power and stability.

The proposed scheme imposes a small overhead on the entire design. According to Fig. 10, the proposed scheme can tolerate
a significant variation over without sacrificing the performance. This property relaxes the design considerations for the reference voltage for . It can be generated on the same die with a minimal power 1 µW and area overhead ( 1%). Compact low-power mid-rail reference voltage generators are common in both DRAM and SRAM designs (see [17, pp. 290–337]). Moreover, according to [17], the active power consumption of the memory unit is mainly due to the bitline voltage variations as well as the wordline and decoders. In the proposed scheme, the only additional time frame is the POC time, which can be created using a dummy decoder. This barely affects the entire power specification.

Fig. 15. Design methodology.

VI. CONCLUSION AND DISCUSSION
A hybrid current/voltage offset canceled sense amplifier scheme is proposed. In this scheme, the cell access time and current sense amplification are concurrent while the voltage sensing comes afterwards. The offset of the CSA is cancelled before the wordline activation to improve the sensitivity of the CSA and avoid the interference of the CSA offset with the cell current. It is shown that the offset cancellation allows fast open-loop operation of the current sense amplifier with minimum area and power compared to [9], [10]. For the first time, a rigorous and methodological approach for the analysis of an SA in the presence of offset in a dynamic environment is presented.

The analysis and HSpice circuit simulations unveiled a fourfold speed improvement over the conventional scheme for the same cell current and bitline capacitance, at the cost of 50% more energy. The cell data stability is ensured by the reduction of the cell access time and bitline precharge voltage. Cell access times as low as a couple of hundred picoseconds are achievable in 0.18-µm CMOS technology for a 1 pF bitline load (more than 512 cells/column) with less than 100 mV bitline voltage swing.

REFERENCES
[1] M. Pilo, J. Barwin, G. Braceras, C. Browning, S. Burns, J. Gabric, S. Lamphier, M. Miller, A. Roberts, and F. Towler, “An SRAM design in 65 nm and 45 nm technology nodes featuring read and write-assist circuits to expand operating voltage,” in IEEE Symp. VLSI Circuits Dig. Tech. Papers, 2006, pp. 15–17.
[2] M. Khellah, Y. Ye, S. K. Nam, D. Somasekhar, G. Pandya, A. Farhang, K. Zhang, C. Webb, and V. De, “Wordline and bitline pulsing schemes for improving SRAM cell stability in low-Vcc 65 nm CMOS designs,” in IEEE Symp. VLSI Circuits Dig. Tech. Papers, 2006, pp. 9–10.
[3] M. Sharifkhani and M. Sachdev, “SRAM cell data stability: A dynamic perspective,” IEEE J. Solid-State Circuits, vol. 44, no. 2, pp. 609–619, Feb. 2009.
[4] A. Choudhary and S. Kundu, “A process variation tolerant self-compensating sense amplifier design,” in Proc. IEEE Comput. Soc. Ann. Symp. VLSI, 2009, pp. 263–267.
[5] S. Mukhopadhyay, H. Mahmoodi, and K. Roy, “A novel high-performance and robust sense amplifier using independent gate control in sub-50-nm double-gate MOSFET,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 14, no. 2, pp. 183–192, Feb. 2006.
[6] E. Seevinck, P. van Beers, and H. Ontrop, “Current-mode techniques for high-speed VLSI circuits with application to current sense amplifier for CMOS SRAM’s,” IEEE J. Solid-State Circuits, vol. 26, no. 4, pp. 525–536, Apr. 1991.
[7] B. Wincht, J. Y. Larguier, and D. S. Landsiedel, “A 1.5 V 1.7 ns 4 k × 32 SRAM with a fully-differential auto-power-down current sense amplifier,” in Proc. ISSCC, Feb. 2003, pp. 462–508.
[8] K. Seno, K. Knorpp, L.-L. Shu, N. Teshima, H. Kihara, H. Sato, F. Miyaji, M. Takeda, M. Sasaki, Y. Tomo, P. T. Chuang, and K. Kobayashi, “A 9-ns 16-Mb CMOS SRAM with offset-compensated current sense amplifier,” IEEE J. Solid-State Circuits, vol. 28, no. 11, pp. 1119–1124, Nov. 1993.
[9] K. Ishibashi, K. Takasugi, K. Komiyaji, H. Toyoshima, T. Yamanaka, A. Fukami, N. Hashimoto, N. Ohki, A. Shimizu, T. Hashimoto, T. Nagano, and T. Nishida, “A 6-ns 4-Mb CMOS SRAM with offset-voltage-insensitive current sense amplifiers,” IEEE J. Solid-State Circuits, vol. 30, no. 4, pp. 480–486, Apr. 1995.
[10] B. Wincht, S. Paul, and D. S. Landsiedel, “Analysis and compensation of the bitline multiplexer in SRAM current sense amplifiers,” IEEE J. Solid-State Circuits, vol. 36, no. 11, pp. 1745–1755, Nov. 2001.
[11] N. Shibata, “Current sense amplifiers for low-voltage memories,” IEICE Trans. Electron, vol. 79, pp. 1120–1130, Aug. 1996.
[12] A. Bhavnagarwala, X. Tang, and J. Meindl, “The impact of intrinsic device fluctuations on CMOS SRAM cell stability,” IEEE J. Solid-State Circuits, vol. 36, no. 4, pp. 657–665, Apr. 2001.
[13] K. Kanda, S. Hattori, and T. Sakurai, “90% write power-saving SRAM using sense-amplifying memory cell,” IEEE J. Solid-State Circuits, vol. 39, no. 6, pp. 929–933, Jun. 2004.
[14] M. Sharifkhani and M. Sachdev, “A low power SRAM architecture based on segmented virtual grounding,” in Proc. IEEE Symp. Low-Power Electron. Des. (ISLPED), Oct. 2006, pp. 256–261.
[15] M. Sharifkhani and M. Sachdev, “Segmented virtual ground architecture for low-power embedded SRAM,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 15, no. 2, pp. 196–205, Feb. 2007.
[16] M. Sharifkhani, S. Jahinuzzaman, and M. Sachdev, “Data stability in low-power SRAM design,” in Proc. IEEE Custom Integr. Circuits Conf. (CICC), 2007, pp. 237–241.
[17] K. Itoh, VLSI Memory Chip Design. New York: Springer-Verlag, 2001.
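As a rough illustration of two points made in Sections V and VI, the following Python sketch (not part of the paper's analysis) computes (i) the read latency when the POC phase overlaps the row decoding versus when the two run back to back, and (ii) the energy-delay product of the proposed scheme relative to the conventional one. The function names and the picosecond timing values are hypothetical placeholders chosen for illustration; only the fourfold speed-up and the 50% energy increase are taken from the text.

```python
def read_latency_ps(t_decode, t_poc, t_access, t_sense, overlap=True):
    """Return the total read latency in picoseconds.

    With overlap=True the POC phase runs concurrently with row decoding,
    so only the longer of the two phases contributes to the latency.
    With overlap=False the two phases are executed one after the other.
    """
    front_end = max(t_decode, t_poc) if overlap else t_decode + t_poc
    return front_end + t_access + t_sense


def relative_edp(speedup, energy_ratio):
    """Energy-delay product of the proposed scheme relative to the reference.

    speedup      = t_reference / t_proposed
    energy_ratio = E_proposed / E_reference
    """
    return energy_ratio / speedup


# Hypothetical timing values in picoseconds (assumptions, not paper data).
timing = dict(t_decode=500, t_poc=400, t_access=200, t_sense=100)
overlapped = read_latency_ps(**timing)
sequential = read_latency_ps(**timing, overlap=False)
print("overlapped:", overlapped, "ps; sequential:", sequential, "ps")

# Ratios quoted in the conclusion: 4x faster cell access for 50% more energy.
print("relative EDP:", relative_edp(speedup=4.0, energy_ratio=1.5))
```

In this sketch the POC phase adds no latency as long as it is no longer than the row decoding, which mirrors the argument of Section V. The relative energy-delay figure should be read as applying to the cell access phase only, since the fourfold improvement is reported for the cell access time.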

Mohammad Sharifkhani (S’98–M’07) received the B.Sc. and M.A.Sc. degrees in electrical and computer engineering from the University of Tehran, Tehran, Iran, in 1998 and 2000, respectively, and the Ph.D. degree from the University of Waterloo, Waterloo, ON, Canada, in 2006.
He was an Analog and Mixed-Signal Design Engineer with Valence Semiconductor, Dubai, UAE, from 2000 to 2002. He was a Postdoctoral Research Fellow with the University of Waterloo in 2007. He also worked with Ferdowsi University of Mashad, Mashad, Iran, in 2007. He joined the Micro Electronics Research and Development Center of Iran (MERDCI) as a Senior Research Associate in 2008, where he worked on crypto processors for SmartCard applications. He is currently an Assistant Professor with the Department of Electrical Engineering, Sharif University of Technology, Tehran, Iran. His current research is on low-power SRAM circuits and architectures, and crypto processors. Since 2009, he has been working on digital video broadcasting (DVB) tuner and demodulator micro-architectures. His research has led to several publications including more than 17 IEEE papers and two U.S. patents. He was a vice-chair in the IEEE student branch at the University of Tehran in 1998.
Prof. Sharifkhani was a recipient of several scholarships during his studies.

Ehsan Rahiminejad (M’09) was born in Gorgan, Iran, in 1985. He received the B.Sc. degree in electrical engineering from the Ferdowsi University of Mashad, Mashad, Iran, in 2004. He is currently pursuing the Ph.D. degree in electronics from the Department of Electronics, Ferdowsi University of Mashad, Mashad, Iran.
He works with the Integrated Systems Lab (ISL), Department of Electronics, University of Mashad. His research interests include the fields of analog-to-digital converters and VLSI circuits. His current research is on very low-power low-speed analog-to-digital converters.

Shah M. Jahinuzzaman (S’03–M’08) received the B.Sc. degree (with honors) in electrical and electronic engineering from Bangladesh University of Engineering and Technology, Dhaka, Bangladesh, in 2002, and the M.A.Sc. and Ph.D. degrees in electrical and computer engineering from the University of Waterloo, Waterloo, ON, Canada, in 2004 and 2008, respectively.
He is currently an Assistant Professor with the Department of Electrical and Computer Engineering, Concordia University, Montreal, QC, Canada. His research interests include variability-aware low-power/high-speed digital circuit design and fault tolerant memory design.
Dr. Jahinuzzaman was a recipient of the Best Paper Award at the 2009 Microsystems and Nanoelectronics Research Conference (MNRC) in Ottawa and a number of government and university scholarships including Ontario Graduate Scholarship, Ontario Graduate Scholarship in Science and Technology, and University of Waterloo President’s Scholarship.

Manoj Sachdev (SM’01) received the B.E. degree (with honors) in electronics and communication engineering from University of Roorkee, Roorkee, India, and the Ph.D. degree from Brunel University, Brunel, U.K.
He was with Semiconductor Complex Limited, Chandigarh, India, from 1984 to 1989, where he designed CMOS Integrated Circuits. From 1989 to 1992, he worked with the ASIC Division, SGS-Thomson, Agrate, Milan. In 1992, he joined Philips Research Laboratories, Eindhoven, where he researched various aspects of VLSI testing and manufacturing. He has been a Professor with the Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON, Canada, since 1998. Currently, he serves as the Department Chair and holds the University of Waterloo research chair. His research interests include low power and high performance digital circuit design, mixed-signal circuit design, and test and manufacturing issues of integrated circuits. He has written five books and two book chapters, and has contributed to over 150 technical articles in conferences and journals. He holds over 30 granted and pending U.S. patents in the broad area of VLSI circuit design and test.
Prof. Sachdev was a recipient of several awards including the 1997 European Design and Test Conference Best Paper Award, the 1998 International Test Conference Honorable Mention Award, and the 2004 VLSI Test Symposium Best Panel Award.
