I. INTRODUCTION
NANOSCALE technologies unveiled two significant challenges to the design of high-speed and reliable SRAMs. The first challenge is process variation, which threatens reliability by affecting the sensitivity of the sensing circuits. This effect demands larger signal magnitudes, which degrade the speed as well as the power consumption. The second challenge is the variation of the cell current, which reduces the worst-case cell current that drives the bitline. In effect, this reduction demands a longer wordline activation time to ensure sufficient bitline voltage swing for correct sensing. The wordline activation time directly influences the cell data stability. A long wordline activation time can affect the noise margin [1]–[3].

Two-phase, latch-type voltage sense amplifiers (VSA) are favorable because of their simple structure and low power consumption. The self shut-off mechanism of the VSA has made it a pervasive choice in today's SRAM design. Under ideal conditions, where the offset of the VSA is zero, the VSA amplifies an infinitely small differential voltage between its two internal nodes when it is enabled. In practice, however, the initial differential voltage at the internal nodes of the VSA must be larger than the offset for correct detection. This turns into a large bitline voltage swing, which adversely affects the speed, power, and wordline activation time. Hence, offset presents the fundamental limit to the sense amplification process in an SRAM. Offset cancellation using double-gate MOS devices and/or complex circuits has been proposed to alleviate this concern [4], [5]. This paper presents a rigorous analysis of the offset effect in the proposed two-stage sense amplification scheme. The method is applicable to any other sensing scheme.

Current sense amplifiers (CSA) have long been proposed as a promising approach for high-speed applications since they do not require a large bitline voltage swing for detection [6]. Fig. 1(a) illustrates the concept of the CSA. A CSA is essentially a linear current buffer that ties the bitline to a known voltage and presents a small input resistance to the input signal, which has the form of a current. The output current creates a charge variation at the output node, which results in a sufficiently large output voltage. The linear amplifiers, however, are usually large and they impose a dc power consumption. Therefore, at the expense of area and complexity, additional circuitry is required to turn off the CSA when it becomes idle [7]. The proposed scheme offers an automatic shut-off mechanism, which results in reduced area and power.

Fig. 1. (a) Concept of current sense amplification and (b) differential current sensing with offset.

Manuscript received July 19, 2009; revised October 20, 2009. First published February 17, 2010; current version published April 27, 2011.
M. Sharifkhani and E. Rahiminejad are with the Department of Electrical Engineering, Sharif University of Technology, Tehran 1458889694, Iran (e-mail: msharifk@sharif.edu; msharifk@vlsi.uwaterloo.ca; e.rahiminejad@yahoo.com).
S. M. Jahinuzzaman is with the Department of Electrical and Computer Engineering, University of Concordia, Concordia, QC H3G 1M8, Canada (e-mail: shah@ece.concordia.ca; smjahinu@engmail.uwaterloo.ca).
M. Sachdev is with the Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada (e-mail: msachdev@uwaterloo.ca).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TVLSI.2009.2039949
1063-8210/$26.00 © 2010 IEEE
884 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 19, NO. 5, MAY 2011
II. OPERATION
Fig. 3. Timing and voltage waveforms in the (a) conventional and (b) proposed OCCSA scheme.
[…], for the transistor with a lower threshold voltage. Hence, over time, the effective voltages of the two transistors are equalized because of a higher source voltage for a smaller threshold voltage. In other words, in the POC phase a differential voltage is developed between the bitlines which compensates the threshold voltage difference of M1 and M2 to achieve an equal effective voltage. Since the effective voltages of M1 and M2 are equalized, the PSA can be used as an offset-less differential CSA in the next phase. A longer POC phase, therefore, offers better offset cancellation at the expense of a smaller […] and, hence, […]. Clearly, the differential bitline voltage at the end of POC does not cause a detection error in the ACC phase, since the data in the ACC phase is detected through differential current sensing at the source of the PSA. With the offset cancelled, the PSA differential output current is determined by the cell current (see Fig. 1).

The circuit enters the ACC phase with the activation of the wordline and the disabling of the signal […]. The cell reveals the data over the bitlines in the form of a differential current. During the entire ACC phase, the PSA operates as a differential unity-gain current buffer on the cell current. The PSA forms a differential voltage variation at its outputs. The differential voltage created at the PSA outputs (and the internal nodes of the VSA) is far larger than the differential voltage over the bitlines. This is because the bitline capacitance is larger than the capacitance of the output nodes by orders of magnitude. In other words, when the PSA operates on the cell current, the required differential voltage at the internal nodes of the VSA is created in a much shorter time. This effect results in a reduced wordline activation time, which boosts the cell data stability and the yield [3]. Note that in the ACC phase […] and […] are floating, as well as […] and […], because M3 and M4 are off. It is evident that without proper offset cancellation in the POC phase, the PSA differential output current is dominated by the PSA offset rather than the cell current.

The evaluation phase is similar to the conventional scheme. The VSA amplifies the initial condition to the supply levels when it becomes active. If the initial condition between the two internal nodes of the cross-coupled latch is larger than the offset voltage of the VSA, the correct evaluation of the initial condition takes place. Otherwise, the output of the VSA is determined by the intrinsic offset of the VSA. Hence, in both the conventional scheme and the OCCSA, the necessary condition for correct operation is that the initial differential voltage at the internal nodes of the VSA before the evaluation exceeds the VSA offset.

Fig. 5(a) presents the HSpice simulated waveforms of the OCCSA scheme where the PSA does not pose an offset. In this design, the POC time is about 200 ps and the access time is 300 ps. With a bitline precharge voltage of 0.9 V, the bitlines are charged up to about 1 V through the sources of M1 and M2 during POC. Clearly, since there is no offset, the bitlines do not create a differential voltage at the end of POC. Moreover, since the drains of M1 and M2 are tied to […], the internal nodes of the VSA are preset and equalized to […] in this phase. The ACC phase begins with the activation of the wordline and the deactivation of […]. During this phase, the current that is sunk by the cell is amplified by the PSA. At the end of the ACC phase, the differential voltage of the bitlines is about 30 mV while the differential voltage at the internal nodes of the latch VSA is about 90 mV. This amplification can be attributed to the differential current sensing of the PSA in the ACC phase. Fig. 5(a) illustrates the case where the POC phase is finished before the BL is fully charged up to […]. Hence, the PSA keeps charging up the BLs during the ACC phase as well. It is noteworthy that since the current buffer is only an nMOS, its input impedance is far from perfect (i.e., zero); hence, a drop in the bitline differential voltage appears during the ACC time.

Fig. 5(b) illustrates the waveforms of the same nodes when the PSA has a pessimistic offset voltage of 35 mV. During the POC phase, the bitlines are precharged by M1 and M2 from 0.9 V up to 1.1 V. Because of the offset between M1 and M2, the bitlines are precharged differently. In the POC phase, since M1 has a larger threshold voltage than M2, the bitline precharged by M1 charges up more slowly than the one precharged by M2. At the end of the POC phase, the effective voltage of both transistors is almost the same. Hence, they provide similar current amplification. When the circuit enters the ACC phase, the differential output current of the PSA is dominated by the cell current. Hence, the differential voltage created at the internal nodes of the latch VSA is determined by the cell current rather than by the PSA offset. Therefore, the PSA offers the same amplification benefit as in the previous case, where the PSA does not have an offset.

The stability of the cells is maintained for two reasons. First, the wordline is active once the precharge voltage of the bitline is around […], where […] is the threshold voltage of the PSA transistors. This precharge voltage is sufficiently high to maintain the cell stability [13]. Second, the current sense amplification allows a significantly shorter than usual wordline activation time. A shorter cell access time translates into a higher cell stability [3].

III. ANALYSIS
In this section, we derive analytical expressions that explain the behavior and effectiveness of the OCCSA scheme. This analysis unveils the efficiency of the scheme and its design tradeoffs, and provides a quantitative comparison against the conventional latch SA. In this analysis, the threshold voltage mismatch between M1 and M2 models the offset of the CSA; this type of modeling is widely used in the context of differential SA schemes [8], [12].

The analysis of the operation of the circuit is based on Fig. 3 and the explanation provided in the previous section. The flow of the derivation is demonstrated in Fig. 6. In each of the three
SHARIFKHANI et al.: COMPACT HYBRID CURRENT/VOLTAGE SENSE AMPLIFIER 887
Fig. 5. Voltage waveforms in the OCCSA (a) without and (b) with the PSA differential offset of ΔV = 35 mV.
Fig. 6. Flow of derivations incorporating the offset of PSA and latch VSA as well as design parameters including timing parameters.
phases, the voltage of the critical nodes of the circuit is analyzed. Key parameters that are used in the analysis are also described in the figure. The final conditions of the nodes in each phase are the initial conditions for the respective subsequent phase.

The POC phase analysis unveils two key parameters that affect the operation of the PSA transistors in the subsequent ACC phase. The residual offset of the PSA is defined as the effective offset of the PSA transistor after the completion of the POC phase. It is derived in Appendix I as

(1)
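The dependence of the residual offset on the POC duration can be illustrated with a minimal numerical sketch. It is not the Appendix I expression: it assumes a square-law nMOS source follower with a fixed gate voltage charging a purely capacitive bitline, for which the overdrive obeys C·dV_ov/dt = −(k/2)·V_ov² and has a closed-form solution. The function names, the gate voltage, the threshold voltages, and the transconductance factor `k` are hypothetical; only the 35 mV mismatch, the 0.9 V starting bitline voltage, and the 1 pF load echo values quoted in the text.

```python
def overdrive_after_poc(v_ov0, k, c_bl, t_poc):
    """Overdrive (V_GS - V_th) of a source follower after charging c_bl for t_poc.

    Closed-form solution of C * dV_ov/dt = -(k / 2) * V_ov**2, i.e. a
    square-law device with a fixed gate voltage (an assumption of this sketch).
    """
    return v_ov0 / (1.0 + v_ov0 * k * t_poc / (2.0 * c_bl))


def residual_offset(v_gate, v_bl0, v_th1, v_th2, k, c_bl, t_poc):
    """Difference between the effective voltages of M2 and M1 after the POC phase."""
    ov1 = overdrive_after_poc(v_gate - v_bl0 - v_th1, k, c_bl, t_poc)
    ov2 = overdrive_after_poc(v_gate - v_bl0 - v_th2, k, c_bl, t_poc)
    return ov2 - ov1


# Hypothetical device values; 35 mV mismatch, 0.9 V start and 1 pF load echo the text.
params = dict(v_gate=1.8, v_bl0=0.9, v_th1=0.500, v_th2=0.465, k=8e-3, c_bl=1e-12)

for t_poc in (0.0, 200e-12, 1e-9):
    r = residual_offset(t_poc=t_poc, **params)
    print(f"T_POC = {t_poc * 1e12:6.0f} ps -> residual offset = {r * 1e3:5.1f} mV")
```

Under these assumptions the residual offset starts at the full threshold mismatch and shrinks monotonically as the POC phase is lengthened, which is the qualitative tradeoff described in Section II.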
Fig. 7. Differential voltage at the internal nodes of the latch VSA is the superposition of two components: (a) the PSA residual offset from the POC phase and (b) the cell current.
In this analysis, the differential output of the PSA is essentially the internal nodes of the latch VSA, assuming that the VSA input pMOS gates impose a negligible RC time constant. Hence, at the end of the ACC phase, the differential voltage at the internal nodes of the latch VSA can be expressed as

(5)

assuming that the cell current and the PSA residual offset counteract each other, which is what takes place in the worst-case scenario (see Fig. 6).

Practical design values in a 0.18-µm technology portray the design tradeoffs obtained in the analysis. For the typical PSA offset of 10 mV, Fig. 9 demonstrates the tradeoff between the access time and the offset cancellation time for various values of […] under 67 µA and 1 pF, which corresponds to a 512 cell/column configuration.

A longer offset cancellation time offers a smaller […]. Similarly, a smaller bitline precharge relaxes […] for a given PSA offset. Also, from (8), a smaller PSA offset allows a shorter access time. Notably, for […] larger than […], the circuit operates properly without offset cancellation time.
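For a sense of scale, the following sketch gives a first-order charge estimate, t = C·ΔV/I, of how long a constant cell current needs to build a given voltage swing directly on the bitline, as a conventional voltage-sensing scheme requires. It reads the 67 µA and 1 pF quoted above as the cell current and the bitline capacitance, which is an interpretation of the text; the required swing values and the function name are hypothetical, and leakage and the finite input resistance of any buffer are ignored.

```python
def swing_time(c_bl, delta_v, i_cell):
    """Time for a constant current i_cell to change the voltage on c_bl by delta_v."""
    return c_bl * delta_v / i_cell


C_BL = 1e-12    # 1 pF, from the text (read here as the bitline capacitance)
I_CELL = 67e-6  # 67 uA, from the text (read here as the cell current)

for swing_mv in (30, 100, 200):  # hypothetical required bitline swings
    t = swing_time(C_BL, swing_mv * 1e-3, I_CELL)
    print(f"{swing_mv:4d} mV swing -> {t * 1e12:7.1f} ps")
```

The estimate grows linearly with the required swing, which is why a sensing scheme that tolerates a smaller bitline swing can shorten the wordline activation time.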
Fig. 8. Offset cancellation time, T, needed to compensate the PSA offset of ΔV for different bitline precharge voltages V.
Fig. 9. Access time needed for correct detection for different T under different V conditions.
For a fixed […], it is instructive to find the minimum […] that guarantees the correct detection condition described in (6)

[…] cells on the selected column (see Fig. 11). The leakage affects the input signal independent of the type of sense amplifier used.
A. Comparison Under Equal Nonzero Latch VSA Offset
Fig. 11. Schematic of the simulated circuit that includes the leakage current of the idle cells on the selected column.
TABLE I
ENERGY AND DELAY COMPARISON BETWEEN PROPOSED AND CONVENTIONAL SCHEME FOR IDENTICAL LATCH VSA OFFSET, V
Fig. 14. HSpice simulation results for the conventional sense amplification method versus the proposed method, along with the predictions made by the derivations.
[…] a significant variation over […] without sacrificing the performance. This property relaxes the design considerations for the reference voltage for […]. It can be generated on the same die with a minimal power (1 µW) and area overhead (1%). Compact low-power mid-rail reference voltage generators are common in both DRAM and SRAM designs (see [17, pp. 290–337]). Moreover, according to [17], the active power consumption of the memory unit is mainly due to the bitline voltage variations as well as the wordline and decoders. In the proposed scheme the only additional time frame is the POC time, which can be created using a dummy decoder. This barely affects the entire power specification.

VI. CONCLUSION AND DISCUSSION
A hybrid current/voltage offset-cancelled sense amplifier scheme is proposed. In this scheme, the cell access and the current sense amplification are concurrent, while the voltage sensing comes afterwards. The offset of the CSA is cancelled before the wordline activation to improve the sensitivity of the CSA and avoid the interference of the CSA offset with the cell current. It is shown that the offset cancellation allows fast open-loop operation of the current sense amplifier with minimum area and power compared to [9], [10]. For the first time, a rigorous and methodological approach for the analysis of an SA in the presence of offset in a dynamic environment is presented.

The analysis and HSpice circuit simulations unveiled a four times speed improvement over the conventional scheme for the same cell current and bitline capacitance, for 50% more energy. The cell data stability is ensured by the reduction of the cell access time and the bitline precharge voltage. Cell access times as low as a couple of hundred picoseconds are achievable in 0.18-µm CMOS technology for a 1 pF bitline load (more than 512 cells/column) with less than 100 mV bitline voltage swing.

REFERENCES
[1] M. Pilo, J. Barwin, G. Braceras, C. Browning, S. Burns, J. Gabric, S. Lamphier, M. Miller, A. Roberts, and F. Towler, “An SRAM design in 65 nm and 45 nm technology nodes featuring read and write-assist circuits to expand operating voltage,” in IEEE Symp. VLSI Circuits Dig. Tech. Papers, 2006, pp. 15–17.
[2] M. Khellah, Y. Ye, S. K. Nam, D. Somasekhar, G. Pandya, A. Farhang, K. Zhang, C. Webb, and V. De, “Wordline and bitline pulsing schemes for improving SRAM cell stability in low-Vcc 65 nm CMOS designs,” in IEEE Symp. VLSI Circuits Dig. Tech. Papers, 2006, pp. 9–10.
[3] M. Sharifkhani and M. Sachdev, “SRAM cell data stability: A dynamic perspective,” IEEE J. Solid-State Circuits, vol. 44, no. 2, pp. 609–619, Feb. 2009.
[4] A. Choudhary and S. Kundu, “A process variation tolerant self-compensating sense amplifier design,” in Proc. IEEE Comput. Soc. Ann. Symp. VLSI, 2009, pp. 263–267.
[5] S. Mukhopadhyay, H. Mahmoodi, and K. Roy, “A novel high-performance and robust sense amplifier using independent gate control in sub-50-nm double-gate MOSFET,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 14, no. 2, pp. 183–192, Feb. 2006.
[6] E. Seevinck, P. van Beers, and H. Ontrop, “Current-mode techniques for high-speed VLSI circuits with application to current sense amplifier for CMOS SRAM’s,” IEEE J. Solid-State Circuits, vol. 26, no. 4, pp. 525–536, Apr. 1991.
[7] B. Wincht, J. Y. Larguier, and D. S. Landsiedel, “A 1.5 V 1.7 ns 4 k × 32 SRAM with a fully-differential auto-power-down current sense amplifier,” in Proc. ISSCC, Feb. 2003, pp. 462–508.
[8] K. Seno, K. Knorpp, L.-L. Shu, N. Teshima, H. Kihara, H. Sato, F. Miyaji, M. Takeda, M. Sasaki, Y. Tomo, P. T. Chuang, and K. Kobayashi, “A 9-ns 16-Mb CMOS SRAM with offset-compensated current sense amplifier,” IEEE J. Solid-State Circuits, vol. 28, no. 11, pp. 1119–1124, Nov. 1993.
[9] K. Ishibashi, K. Takasugi, K. Komiyaji, H. Toyoshima, T. Yamanaka, A. Fukami, N. Hashimoto, N. Ohki, A. Shimizu, T. Hashimoto, T. Nagano, and T. Nishida, “A 6-ns 4-Mb CMOS SRAM with offset-voltage-insensitive current sense amplifiers,” IEEE J. Solid-State Circuits, vol. 30, no. 4, pp. 480–486, Apr. 1995.
[10] B. Wincht, S. Paul, and D. S. Landsiedel, “Analysis and compensation of the bitline multiplexer in SRAM current sense amplifiers,” IEEE J. Solid-State Circuits, vol. 36, no. 11, pp. 1745–1755, Nov. 2001.
[11] N. Shibata, “Current sense amplifiers for low-voltage memories,” IEICE Trans. Electron, vol. 79, pp. 1120–1130, Aug. 1996.
[12] A. Bhavnagarwala, X. Tang, and J. Meindl, “The impact of intrinsic device fluctuations on CMOS SRAM cell stability,” IEEE J. Solid-State Circuits, vol. 36, no. 4, pp. 657–665, Apr. 2001.
[13] K. Kanda, S. Hattori, and T. Sakurai, “90% write power-saving SRAM using sense-amplifying memory cell,” IEEE J. Solid-State Circuits, vol. 39, no. 6, pp. 929–933, Jun. 2004.
[14] M. Sharifkhani and M. Sachdev, “A low power SRAM architecture based on segmented virtual grounding,” in Proc. IEEE Symp. Low-Power Electron. Des. (ISLPED), Oct. 2006, pp. 256–261.
[15] M. Sharifkhani and M. Sachdev, “Segmented virtual ground architecture for low-power embedded SRAM,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 15, no. 2, pp. 196–205, Feb. 2007.
[16] M. Sharifkhani, S. Jahinnuzaman, and M. Sachdev, “Data stability in low-power SRAM design,” in Proc. IEEE Custom Integr. Circuits Conf. (CICC), 2007, pp. 237–241.
[17] K. Itoh, VLSI Memory Chip Design. New York: Springer-Verlag, 2001.