Abstract
A delay-insensitive (DI) circuit is a digital logic circuit that operates correctly regardless of any delays in
the logic gates (modules) or in the wires (interconnection lines). In practice, DI circuits are constructed by
putting together DI primitive modules. These primitives have implementations in CMOS (complementary
metal oxide semiconductor) technology, in RSFQ (rapid single flux quantum) technology, in SET (single
electron tunneling) technology, and on asynchronous cellular automata (ACA).
1 Introduction
An asynchronous circuit is a digital logic circuit that does not use a global clock signal. Delay-insensitive (DI)
circuits are asynchronous circuits that make the least restrictive timing assumptions. DI circuits are made by
connecting DI building blocks (called primitive modules or primitives). As long as the internal timing assumptions
of these blocks are satisfied, the behavior of a circuit composed of these primitives is affected neither by the
operating speed of the modules nor by the delays in the wires connecting them.
Although many delay-insensitive building blocks have been proposed (for example, [3, 6]), we consider only
the blocks created by Patra and Fussell [9, 10]. They present sets of blocks that are universal (can be used to make
any DI circuit) and minimal (no proper subset of them is sufficient for making all such circuits). Software models of
these blocks are described in [13]; the models help visualize the primitives’ operation.
A description of a DI primitive consists of the accepted behavior of the module in response to its environment
and also the accepted behavior of the environment in response to the module. We simplify the discussion here
and describe a primitive by giving examples of accepted and unaccepted behaviors.
We focus on one DI primitive, the Merge.1 A Merge has two input ports a? and b? and one output port c!.
A signal transition on one input port, say, a?, once assimilated by the module, leads to a signal transition on the
output port c!. The Merge is serial, that is, every signal on one of its input ports must be followed by exactly
one signal on an output port of the module before the next input signal can be assimilated by the module [6,
p. 3]. For example, b?c!a?c! and a?c!a?c! are accepted behaviors for the Merge, while a?b?, a?c!c!, and c! are
unaccepted behaviors.
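The accepted and unaccepted behaviors above can be checked mechanically. The following Python sketch is a simplified illustration, not a formal specification of the Merge: the function name merge_accepts and the encoding of a behavior as a string of two-character events are assumptions made here, and a trace that ends while an input is still awaiting its output is treated as an accepted prefix.

```python
def merge_accepts(trace: str) -> bool:
    """Return True if `trace` is accepted by this simplified Merge model.

    Assumption: a trace is a string of two-character events,
    "a?" or "b?" for inputs and "c!" for the output.
    """
    # Split the trace into port events such as "a?" or "c!".
    events = [trace[i:i + 2] for i in range(0, len(trace), 2)]
    pending = False  # True while an assimilated input awaits its output
    for event in events:
        if event in ("a?", "b?"):
            if pending:
                return False  # a second input before the output (not serial)
            pending = True
        elif event == "c!":
            if not pending:
                return False  # an output that no input caused
            pending = False
        else:
            return False  # unknown or malformed event
    return True


for example in ["b?c!a?c!", "a?c!a?c!", "a?b?", "a?c!c!", "c!"]:
    print(example, merge_accepts(example))
```

The checker keeps a single flag because the Merge is serial: at most one input can be outstanding at any time.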
In the remainder of the paper, we show how a Merge is implemented in different technologies.
switch, a MOSFET has three terminals: a drain, a gate, and a source.2 When the gate voltage is high, a PMOS
does not conduct current from its source to its drain (it is off), and an NMOS conducts current from its drain
to its source (it is on). When the gate voltage is low, a PMOS is on, and an NMOS is off. (For details, see, for
example, [1, §§5.5–5.7].)
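The switching rule just described can be written as a pair of Boolean functions. This is a minimal sketch that treats each transistor as an ideal switch; the function names are assumptions, and the inverter is included only as a familiar illustration of a complementary pair, not as a circuit taken from this paper.

```python
def nmos_conducts(gate_high: bool) -> bool:
    """An ideal NMOS is on when its gate voltage is high."""
    return gate_high


def pmos_conducts(gate_high: bool) -> bool:
    """An ideal PMOS is on when its gate voltage is low."""
    return not gate_high


def cmos_inverter(input_high: bool) -> bool:
    """Illustrative complementary pair: the PMOS pulls the output high,
    the NMOS pulls it low, and exactly one of them conducts."""
    pull_up = pmos_conducts(input_high)
    pull_down = nmos_conducts(input_high)
    assert pull_up != pull_down  # never both on or both off in this ideal model
    return pull_up  # output is high exactly when the pull-up path conducts


print(cmos_inverter(True), cmos_inverter(False))
```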
Figure 1: Two-input CMOS xor gate (figure taken from [8, p. 162] and modified)
A Merge can be implemented in CMOS as an xor gate [16, p. 77] (see Figure 1). Note that here a signal
transition is taken to be a change in voltage (either from high to low, or from low to high).
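To see why an xor gate behaves as a Merge under this convention, consider an idealized, zero-delay, logic-level model (an assumption made for this sketch; it says nothing about the transistor-level circuit). Each list below samples a voltage level over time, and the helper names are hypothetical. Whenever exactly one input changes level, the xor output changes level too, so every input transition produces one output transition.

```python
def xor_merge_output(a_levels, b_levels):
    """Output level of an ideal xor gate at each sample time."""
    return [a ^ b for a, b in zip(a_levels, b_levels)]


def count_transitions(levels):
    """Number of level changes (high to low or low to high) in a waveform."""
    return sum(1 for prev, cur in zip(levels, levels[1:]) if prev != cur)


# Input a rises, then input b rises, then input a falls (one change at a time).
a = [0, 1, 1, 0]
b = [0, 0, 1, 1]
c = xor_merge_output(a, b)

print(c)
print(count_transitions(a) + count_transitions(b), count_transitions(c))
```

In this example the three input transitions (two on a, one on b) yield three transitions on c, regardless of whether a given transition is rising or falling.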
CMOS implementations of some of Patra and Fussell’s other building blocks are in [9, 10, 16].
CMOS implementations of DI circuits are complex [11, p. 42], making them inefficient and impractical. “[I]n
CMOS technology asynchronous designs actually perform worse [than synchronous designs] with respect to power
consumption, wiring requirements, and speed” [6, p. 1034].
for the other DI RSFQ primitives they presented, biasing currents were a little less than a tenth of a milliampere, JJ critical currents
were a few tenths of a milliampere, and inductances were a few picohenrys.
Figure 2: RSFQ Merge (figure taken from [11, p. 46])
[T]he two inputs are a and b, and the output is c. Ib1 is the bias current, which is split between
two arms feeding junctions J3, J1 and J4, J2. The critical current thresholds are arranged so
that Ic3 < Ic1 and likewise Ic4 < Ic2 . When a pulse on a arrives, additional current flows through
J1, exceeding its critical current, whereupon J1 goes resistive. J3 is not triggered since the current
induced by the input pulse is in the opposite direction from the bias current. The SFQ pulse developed
consequently across J1 is transferred through J3, across which the potential is still zero, to J5. This
causes J5 to trip and emit an output pulse at c. The pulse generated by J1 also trips J4, whose
critical current is less than that of J2. When J4 becomes resistive, it prevents J2 from tripping. Since
there is no voltage drop across J2, no pulse is emitted back through input b. Inputs on b operate
symmetrically. Thus the junctions J3 and J4 serve to isolate the inputs from each other and provide
signal directionality (from inputs to output). [11, p. 46]
RSFQ implementations of some of Patra and Fussell’s other building blocks are in [11], and some have been
fabricated and tested at low frequencies. These DI RSFQ primitives have been used in designs of self-timed
pipelined parallel adders [2].
RSFQ technology has sub-picosecond junction switching speed (allowing operation at several hundred gigahertz)
and very low power dissipation (below one microwatt per JJ even in its resistive state). However, because RSFQ
circuits use low-temperature superconductors, they must be cooled with liquid helium. There are also “limitations
on RSFQ memory density due to the large physical size of a flux quantum and the difficulty of amplifying output
signals to off-chip power levels at speeds comparable to those attainable on the chip.” [11, p. 45]
C3 = 0.1 aF, V s = 16 mV, and the resistance of each junction is 25.8 kΩ.
Figure 3: SET Merge (figure taken from [15, p. 705] and modified)
a positive charge on n3. This is inverted and so the output becomes low, as it should be. If one
of the input signals undergoes a transition then the voltage of J2 (and J1) becomes lower than the
critical voltage causing the electron to tunnel back leaving no charge on n2 (and n1). This reduces
the voltage over J3 to under the critical voltage and the electron tunnels back leaving no charge on
n3. This value is complemented afterwards by the output inverter thus the output becomes high, as
it should. If the second signal also undergoes a transition then the voltage over J1 becomes higher
than the critical voltage causing an electron to tunnel. This cause[s] an electron to tunnel through
J3 leaving a positive charge on n3 corresponding to a low output, as it should. If subsequently one
of the signals transitions again, going low this time, the circuit goes into the previous state with no
charges on n1, n2, and n3. If after that the other signal also transitions the circuit returns to the
original charge neutral state. [15, p. 705]
Figure 4: SET static inverting buffer (figure taken from [15, p. 705] and modified)
These designs were simulated under ideal conditions (a temperature of zero kelvin, with no co-tunneling or
background charge effects) [15] and have not yet been fabricated.
transitions according to a transition function, which determines the cell’s state based on the states of cells in its
neighborhood.” [5, §2]
Asynchronous cellular automata (ACA) “allow any cell to undergo state transitions at arbitrary times
independent of the timings of the other cells’ transitions. Due to the asynchronicity, however, computation in ACA
may be nondeterministic, i.e., more than one global configuration may evolve from a certain configuration.” [5, §2]
Lee et al. [4, 5] consider two-dimensional ACA in which each cell has a neighborhood composed of its four
orthogonally adjacent cells along with itself (a von Neumann neighborhood). They embedded a universal
set of DI building blocks5 [6] in a 5-state ACA, achieving computational universality [5].
Later, they embedded some of Patra and Fussell’s primitives in a 4-state ACA (see Figure 5). From these
they constructed a universal logic element (a Rotary Element [7]), showing that their model has computational
universality.
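The nondeterminism of asynchronous updating can be illustrated with a small Python sketch. The two-state rule used here (toy_rule) is a hypothetical example chosen only to make the effect visible; it is not the rule set of [4] or [5], and the grid representation and function names are likewise assumptions. Cells are updated one at a time in a given order, each reading its von Neumann neighborhood at the moment it is updated.

```python
from typing import Dict, Sequence, Tuple

Cell = Tuple[int, int]
Grid = Dict[Cell, int]


def von_neumann(grid: Grid, cell: Cell) -> Tuple[int, ...]:
    """States of the cell itself followed by its four orthogonal neighbors.
    Cells outside the grid are read as state 0 (an assumption)."""
    r, c = cell
    coords = [(r, c), (r - 1, c), (r, c + 1), (r + 1, c), (r, c - 1)]
    return tuple(grid.get(p, 0) for p in coords)


def toy_rule(neighborhood: Sequence[int]) -> int:
    """Hypothetical rule: a cell in state 1 resets to 0 if any neighbor is 1."""
    center, *others = neighborhood
    if center == 1 and 1 in others:
        return 0
    return center


def run_async(grid: Grid, order: Sequence[Cell]) -> Grid:
    """Update the listed cells one at a time, in the given order."""
    result = dict(grid)
    for cell in order:
        result[cell] = toy_rule(von_neumann(result, cell))
    return result


start = {(0, 0): 1, (0, 1): 1}
left_first = run_async(start, [(0, 0), (0, 1)])
right_first = run_async(start, [(0, 1), (0, 0)])
print(left_first, right_first)
```

Starting from the same configuration, the two update orders end in different configurations, which is the kind of nondeterminism that delay-insensitive embeddings must tolerate.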
Figure 5: Transition rules in A4 (figure taken from [4, p. 207] and modified)
A Merge Core is shown in Figure 6. If we connect two Entrances (Figure 7), a Turn Core (Figure 8), and
an Exit (Figure 9), we get the Merge module shown in Figure 10.
Although no physical implementations of ACA models like this have yet been created, some possible candidates
are discussed in [12].
References
[1] Jose Araneta. A First Course in Semiconductor Devices and Circuits. National Book Store, Mandaluyong
City, Philippines, 2007.
[2] Y. Kameda, S. Polonsky, M. Maezawa, and T. Nanya. Self-timed parallel adders based on DI RSFQ primitives.
IEEE Transactions on Applied Superconductivity, 9(2):4040–4045, June 1999.
[3] Robert Keller. Towards a theory of universal speed-independent modules. IEEE Transactions on Computers,
23(1):21–33, 1974.
5 Their primitives differ from Patra and Fussell’s [9, 10] by allowing input and output lines of modules to be bi-directional and
Figure 6: (a) ACA Merge Core; (b) Merge Core operating on a signal on its right internal path (figure taken
from [4, p. 212] and modified)
Figure 7: (a) ACA Entrance; (b) Entrance operating on an input signal (figure taken from [4, p. 208] and
modified)
Figure 8: (a) ACA Turn Core; (b) Turn Core operating on a signal arriving on its lower internal path (figure
taken from [4, p. 210] and modified)
Figure 9: (a) ACA Exit; (b) Exit operating on a signal (figure taken from [4, p. 209])
Figure 10: ACA Merge module (figure taken from [4, p. 213] and modified)
[4] Jia Lee, Susumu Adachi, Ferdinand Peper, and Shinro Mashiko. Delay-insensitive computation in
asynchronous cellular automata. Journal of Computer and System Sciences, 70:201–220, 2005.
[5] Jia Lee, Susumu Adachi, Ferdinand Peper, and Kenichi Morita. Embedding universal delay-insensitive
circuits in asynchronous cellular spaces. Fundamenta Informaticae, XX:1–24, 2003.
[6] Jia Lee, Ferdinand Peper, Susumu Adachi, and Kenichi Morita. Universal delay-insensitive circuits with
bi-directional and buffering lines. IEEE Transactions on Computers, 53(8):1034–1046, August 2004.
[7] Kenichi Morita. A simple universal logic element and cellular automata for reversible computing. In Maurice
Margenstern and Yurii Rogozhin, editors, MCU, volume 2055 of Lecture Notes in Computer Science, pages
102–113. Springer, 2001.
[8] Joel Noche. An asynchronous single-precision floating-point arithmetic unit. Master’s thesis, University of
the Philippines, Diliman, College of Engineering, 2003.
[9] Priyadarsan Patra and Donald Fussell. Building-blocks for designing DI circuits. Technical Report TR93-23,
Department of Computer Sciences, University of Texas at Austin, 1993.
[10] Priyadarsan Patra and Donald Fussell. Efficient building blocks for delay insensitive circuits. In Proceedings
of the International Symposium on Advanced Research in Asynchronous Circuits and Systems, pages 196–205.
IEEE Computer Society, November 1994.
[11] Priyadarsan Patra, Stanislav Polonsky, and Donald Fussell. Delay insensitive logic for RSFQ superconductor
technology. In Proceedings of the International Symposium on Advanced Research in Asynchronous Circuits
and Systems, pages 42–53. IEEE Computer Society, April 1997.
[12] Ferdinand Peper, Jia Lee, Susumu Adachi, and Shinro Mashiko. Laying out circuits on asynchronous cellular
arrays: A step towards feasible nanocomputers? Nanotechnology, 14(4):469–485, 2003.
[13] Jesse Sacayanan and Joel Noche. Modeling of delay-insensitive circuit building-blocks using the Hamburg
design system. Philippine Engineering Journal, XXIII(2):11–18, December 2002.
[14] Saleh Safiruddin. Single electron tunneling based building blocks for delay insensitive circuits. Master’s thesis,
Delft University of Technology, Faculty of Electrical Engineering, Mathematics and Computer Science, 2008.
[15] Saleh Safiruddin and Sorin Cotofana. Building blocks for delay-insensitive circuits using single electron
tunneling devices. In Proceedings of the IEEE International Conference on Nanotechnology, pages 704–708.
IEEE, August 2007.
[16] Philip Shirvani, Subhasish Mitra, Jo Ebergen, and Marly Roncken. DUDES: A fault abstraction and
collapsing framework for asynchronous circuits. In Proceedings of the International Symposium on Advanced
Research in Asynchronous Circuits and Systems, pages 73–82. IEEE Computer Society, April 2000.