Vous êtes sur la page 1sur 8

IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 26, NO.

12, DECEMBER 1991

1860

A CMOS Field-Programmable
Analog Array
Edward K. F. Lee a n d P. Glenn Gulak, Member, ZEEE

Abstract -The design details and test results of a field-programmable analog array (FPAA) prototype chip in 1.2-pm
CMOS are presented. The analog array is based on subthreshold circuit techniques and consists of a collection of homogeneous configurable analog blocks (CABS) and an interconnection network. Interconnections between CABs and the analog
functions to be implemented in each block are defined by a set
of configuration bits loaded serially into an on-board shift
register by the user. Macromodels are developed for the analog
functions in order to simulate various neural network applications on the field-programmable analog array.

I. INTRODUCTION

IELD-programmable gate arrays for prototyping digital circuits are now commercially available from several vendors (as in [l]). Conspicuously absent in the
literature is a field-programmable analog array (FPAA)
and perhaps for good reason-many
more challenges
must be addressed such as bandwidth, linearity, signalto-noise ratio, frequency response, etc. Noteworthy, however, are several recent commercial products and publications in the literature that offer fixed topologies and
programmable coefficients. For example, most singlechip analog neural network implementations are designed
for a fixed network topology even though the synaptic
weights are programmable [41, 191, [121, [131. As pointed
out by Sivilotti [14], it would be advantageous to devise an
IC strategy whereby various analog networks can be realized, as determined by the user, in some type of reusable
generic prototyping medium. The approach taken in this
paper is to focus on a small number of specialized analog
functions that are realized using subthreshold techniques.
Furthermore, differential circuitry is used for noise immunity, and current-mode signaling provides addition and
subtraction operations. In addition, memory functions
such as integrators, coefficient storage, and sampled analog delay lines are also supported.
As manufacturability and cost consciousness has grown
among circuit designers, an awareness of the importance
of testability has also increased. But, testing of analog
circuits is a difficult task. The FPAA concept provides an
Manuscript received May 14, 1991; revised July 16, 1991. This work
was supported by an operating grant from NSERC and Micronet.
The authors are with the VLSI Research Group, Department of
Electrical Engineering, University of Toronto, Toronto, Canada M5S
1A4.
IEEE Log Number 9103054.

interesting approach to the testing of analog ICs. Since


the circuits in every programmable block are identical,
and since the input and output of each block are accessible through the routing network, the FPAA provides good
controllability and observability, such that each programmable analog function can be tested exhaustively.
In addressing these challenging requirements, this paper presents a novel FPAA and associated test results. In
Section 11, we outline various functional requirements for
the implementation of neural networks. Section I11 presents the design of a configurable analog block (CAB).
Section IV presents the strategy for interconnecting
CABs. Section V discusses the implementation of the
prototype chip and the procedure for loading the coefficients and for configuration of the array. The experimental results for the prototype chip are also presented in this
section. The simulation of neural network applications,
using macromodels of the circuits discussed in Section 11,
are discussed in Section VI. Finally, the conclusions are
presented in Section VII.
REQUIREMENTS
11. FUNCTIONAL
The functions required to implement most neural networks are addition, threshold operation, coefficient multiplication, and signal multiplication (used in high-order
neural networks). In networks capable of learning, the
coefficient of the coefficient multiplier is often required
to be adjustable. In order to realize these functions in
CMOS technology, a subthreshold circuit technique is
used, offering the advantage of low power dissipation.
Since a neural network usually has many neurons and
synaptic weights, the required circuits must be simple to
permit several functional blocks to be implemented in the
same die. In the following, circuits for realization of the
required functions are presented. These circuits utilize
both differential current-mode and differential voltagemode signals. The swing of the voltage-mode signals is
kept small (except for the case of the threshold operation)
to maximize the speed of operation. The operation of
addition among signals is obtained by representing the
signals in current mode and then adding the currents at
the corresponding nodes.
The threshold operation is similar to the behavior of an
analog comparator. It has two states (low and high states)
and a gain region, which is required for the transition

0018-9200/91/ 1200-1860$01.OO 01991 IEEE

1
I

LEE AND GULAK CMOS FIELD-PROGRAMMABLE ANALOG ARRAY

1861

differential output current Y defined by Z, and Z4 is the


product of the two differential voltage mode inputs X
and 2. Furthermore, the coefficient multiplier can be
obtained by removing the 2 input stage (indicated by
dotted lines) and connecting the input S of the multiplier
core to a voltage difference stored on a pair of capacitors.
The stored voltage can be refreshed as discussed in [61.
Although the multipliers and the comparators can be
cascaded together, direct connection of the multipliers is
not possible. Since this structure is usually desired in
high-order neural networks, a current buffer is designed
for this purpose. The buffer can be obtained by connecting the gate to the drain of M1 in Fig. 1 instead of
connecting it to the gate of M 2 . Since this circuit also
utilizes the translinear technique, the transfer characteristic between the input and output current is very linear.
X

Fig. 1. Schematic diagram of the current comparator.

Fig. 2. Schematic diagram of the signal multiplier.

between the two states. Fig. 1 shows a simple current


comparator. If there is a difference in current at the input
terminals X , node 1will have a large voltage swing due to
the high impedance at this node. The differential pair
( M 3 and M 4 ) will act as a voltage-to-current converter,
which converts the input voltage difference at terminal X
to current outputs at terminal Y . The current mirrors
formed by M 7 and M 8 together with transistors in the
multiplier take the input differential current defined by I,
and Z, and mirror it to the specified multiplier differential input.
The schematic for the signal multiplier is shown in Fig.
2. This circuit is a classic realization of a four-quadrant
Gilbert multiplier in subthreshold CMOS technology. The

111. CONFIGURABLE
ANALOG
BLOCK
A typical [l] field-programmable gate array (FPGA)
consists of configurable logic blocks (CLBs) that perform
the required logic function and an interconnection network to provide connections between CLBs. The FPAA
presented in this paper is modeled on a similar strategy,
although there are many issues unique to analog circuits
that must be accommodated. The analog functions will be
grouped into CABs and the interconnection network will
connect them together. One important observation, however, is that the circuit topologies for different functions
are similar. Therefore, different functions can be conveniently obtained by configuring the circuit primitives
(Table I(a)> with a few transistors that act as analog
switches (Table I(b)).
The design of the CAB is shown in Fig. 3. The configuration of the CAB is determined by a local 3-b shift
register. Each bit of the shift register controls various
switches and/or multiplexors within the CAB itself. For
example, when the shift register contains the bit pattern
001, the CAB is configured as a four-quadrant multiplier
with X and Y as inputs and 2 as output. Table I1 shows
all the configurations inside the CAB.
Though various specialized types of CABs could be
defined in the same manner, controlling the level of
granularity of the CAB is critical to minimizing the number of 1 / 0 lines that ultimately must be accommodated
by the interconnection network.

IV. INTERCONNECTIONNETWORK
Interconnection networks can usually be categorized
into two types: crossbar and multistage (hierarchical). The
crossbar interconnection network provides full interconnection capability between any two connected elements,
which is known as nonblocking. It also provides less delay
on data transfer. However, the crossbar approach exhibits
an area growth rate of O ( N 2 )for connecting N inputs to
N outputs. On the other hand, hierarchical interconnection networks evolve with slower cost growth. Examples of
this type of network include Banyan [3], omega [5], and

1862

IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 26, NO. 12, DECEMBER 1991

TABLE I
D~SCRIPTIONS
OF (a) THE CIRCUIT
PRIMITIVES
AND (b) THE SWITCHES
Prlmitive Element

Ill,

)D V4

I
VSSl

v5

GND

Fig. 3. Schematic diagram for the CAB.

TABLE I1
CONFIGURATIONS
OF THE CAB

Description

VDOL
1

Buffer from X to Y

P
O

X+AY
W

Buffer from X to Y

+-X
*
1.0 +z

Z=W

Z=XW

x*s:

Comparator from x to Y

x ~ - > Y _

Comparator from X to Y

I I

Symbol

Z=XW

Circuit Diagram

Switch Element

1.0 *z

z=w
Signal multiplier
IY-x*z

I-

Signal multiplier (buffered)Y-2x


Y=XZ

Constant multiplier
Y=
K

D*E
Difkrentiml D and E

delta [ll]. These networks have a cost growth of


O(N1ogN) at the expense of longer delay between inputs and outputs. For the design of the array, the idea of
using a hierarchical network will be appropriate because
of the area requirement. The delay of the network is
acceptable since the CABs utilize current-mode and
small-swing voltage-mode signals which minimize the delay. However, those hierarchical networks mentioned
above are designed for data transfer or routing between
two layers of elements. They do not allow connections
between elements in the same layer. Therefore, the ele-

W(Z)

Loading weight W from GI and G2


Loaded at AM as %-3V

and stored as w=+3V

ments are not capable of being fully connected. Furthermore, as the CABs are increased, the dimension of the
array will be proportional to N x log N and the layout of
the array will be rectangular. However, a layout with an
aspect ratio of one is usually desired in a VLSI design.
Consequently, the network mentioned above may not be
suitable for the design of the array.
From a VLSI point of view, the area of the layout is of
most concern. Therefore, the interconnection network
must have the property of area universality [SI. A
network that is area universal is a network that, for a
given area, can efficiently embed any circuit whose size is
only slightly smaller. One such area-universal network is a

1
I

1863

LEE AND GULAK CMOS FIELD-PROGRAMMABLE ANALOG ARRAY

z
Y

W U i

Fig. 4. ~n area-universal fat tree (the dotted box indicated by 9


highlights the motohme
. - _ imdementation).
.
I

fat tree [8] shown in Fig. 4, which is also a hierarchical


network. In fact, Leiserson [7] showed that the fat tree
was near optimum in terms of the number of elements,
required area (or volume), data transfer delay, and growth
rate of routing channel size. Based on this network, the
FPAA is designed with the CABS (as the leaves of the fat
tree) connected by switch blocks (SBs) in different levels
of the fat tree. The switch blocks can be realized by using
crossbar switches. However, the switch blocks can be
simplified by constraining the number of allowable connections. With this in mind, the switch blocks just above
the leaves are designed as shown in Fig. 5. The connections made by this SB will be determined by the contents
of a local 10-b shift register, which is specified by the
user. The SB also allows for polarity changes of the
differential signals. Therefore, addition or subtraction
among current-mode signals is possible. The overall structure of the SBs in different levels is designed by repeatedly embedding different netlists of various types of circuits (e.g., a Hopfield network) to the fat-tree network
and appropriately locating the ON/OFF switches to fit
these systems onto the network.

V. IMPLEMENTATION
AND PROGRAMMING
OF THE
PROTOTYPE
DESIGN
In order to demonstrate the feasibility of the circuits, a
prototype containing the portion inside the dotted box 9
(Fig. 4) was designed, which consists of two CABS and an
SB. The shift registers of the SB and the shift registers of
the CAB are connected together as a chain. The configuration bits are set by shifting the bits into the chip serially.
A control unit is needed to determine the required bit
patterns for the configuration cycles as well as the clock
phases of the shift registers to control the begin and end
of the different cycles.

The coefficients required by the coefficient multiplier


are loaded before the actual configuration of the entire
array. The bit pattern 100 for the CAB is dedicated to
this operation. During the coefficient loading cycle, the
control unit will shift a ONE followed by ZEROS into the
array. When the shift registers of a particular CAB contain the pattern 100, the global write signal
is set low
by the control unit and the required coefficient value will
be loaded to this CAB through the global wires G1 and
G2 with an external differential voltagesource. Before
the next ZERO is shifted into the array, W will be reset
high. Since only the first bit is a logic ONE, exactly one
CAB will be in the coefficient loading mode. Therefore,
the coefficients will be loaded into the corresponding
CABs sequentially.
After the coefficient loading cycle, the control unit will
shift a specified bit pattern into the chip. This bit pattern
will determine the operations of the CABS and the
connections bztween CABS. During this cycle, the global
write signal W will always be high in order to prevent
disturbance of the coefficient values. At the end of this
cycle, the clock phases for the shift registers will be kept
low and the analog array is now configured and ready to
operate.
The experimental prototype is fabricated in 1.2-pm
CMOS technology. Some specifications of the CAB and
the SB are shown in Table 111. The die photo of the
prototype chip is shown in Fig. 6. When testing the chip,
one of the CABs was configured as a signal multiplier
and the other one was configured as a current buffer.
Internal interconnections between the CABs themselves
and the external connections from the chip were through
the SB. The configuration bits were shifted into the chip
by an auxiliary off-chip control circuit. Since inputs to the
signal multiplier require a differential voltage-mode signal, two external variable voltage sources were used for
one input to the multiplier. The other differential input to
the multiplier was obtained from the output of the current buffer. Since the circuits operate in the subthreshold
region, the differential output current is in the nanoampere range. In order to measure current in this range, two
electrometers were used. The dc characteristic of the

1864

IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 26, NO. 12, DECEMBER 1991

TABLE 111
SPECIFICATIONS
OF THE CAE3 AND SB
Item

Size

No. of transistors % of area for shift registers

CAB 192 pm x 176 p m


SB 202 m1X197wm

= 30%

98
148

75%

Fig. 6. Die photo of the prototype chip.

600

P
400

200

Differential
output
current

InAI
-200 .

.,........

-400

-600

-200

-100

100

200

Differential input current [nA]

Fig. 7. DC measurement of the four-quadrant Gilbert multiplier.

output was measured by varying the two current sources


with different fixed voltages at the inputs of the multiplier. The results are shown in Fig. 7. The output differential current was fairly linear with respect to the input
differential current. The worst-case percentage error over
the -100- to 100-nA range of the input differential
current was about 2%. The reduction in the observed
nonlinearity is most likely due to the nonlinearities in the
buffer. Since the MOSFETs in the buffer may not operate inside the subthreshold region, it may violate the

exponential Z, - V,, characteristic required for the


translinear circuit technique. The output offset current is
about 20 nA, which is 3.5% of the maximum output
current. The noise of the array was measured by taking
samples at a rate of 50 samples/min at the output terminals for an extended period of time (20 m i d . The root
mean square value of the noise was less than 1 nA.
Another experiment involved the use of the current
comparator by configuring one of the CABSwith the bit
pattern 011. The connections between the CABSand the

1865

LEE AND GULAK CMOS FIELD-PROGRAMMABLEANALOG ARRAY

..................

- .1021

...................

Differential
output
current
[nAl

-120
-go!

-50

-100

SO

100

Differential input current [nA]

Fig. 8. DC measurement for the comparator in cascade with the Gilbert multiplier.

the models and the actual circuits are well matched.


Nevertheless, the settling time of the actual circuit is
longer than that of the model since the impedance of the
actual circuit is varying as a function of the input differential current while the model has constant impedance over
the entire input range. However, the model still provides
a lower bound for the settling time. The model of the
constant multiplier is the same as the signal multiplier
with one of the inputs set to a constant differential
voltage. Since the loading of the voltages to the analog
memories is not involved in the simulations, the model of
this operation is omitted. The model for the interconnection switches is modeled by an RC network. The estimated wiring capacitance is also included in the model.

Fig. 9. Model of the current buffer. (Note that Ri,= 1 MR/number


of fan-ins.)

instruments were the same as the previous example. Fig. 8


shows the experimental results, which were compared
with the HSPICE simulations. The experimental results
had an offset current of about 30 nA.

B. Neural Network Simulation Using the Macromodels


VI. SIMULATIONS
USING
MACROMODELS
In this section, the analog operations discussed above
are modeled by a set of simple circuit models. These
models are then used in the simulation of the neural
networks to reduce the computational requirements.

A. Macromodels of the Circuits


The macromodels are developed by first simulating the
dc and ac characteristics of the analog operations and
then modeling the operations with sets of resistors, capacitors, and dependent sources according to the simulation
results derived from the extracted layout. As an illustration, the schematic diagram of the model for the current
buffer is shown in Fig. 9. The models of other analog
functions can be obtained using the same procedures.
Different values of the resistors, capacitors, and dependent sources may be obtained if different processes are
used. During the simulation of the circuits, the biasing
current is set to 100 nA. The dc and ac characteristics of

Based on the models developed, the multilayer feedforward network and the Hopfield network are simulated to
illustrate the application of the FPAA in this area. Fig. 10
shows the schematic diagram of a single-layer neural
network realized by the circuits discussed above. Four
switches are assumed in each connection between different functions. The weights of the network were obtained
by training the network with the delta rule [2]. The input
differential voltage values of the multiplier models are set
according to the resulting normalized weight values. The
input and output patterns used during training are shown
in Table IV. The response of the network is correct and
shown in Fig. 11.
The behavior of the networks due to the offset errors of
the circuit is also studied and simulated with the use of a
multilayer network to solve the classic EXCLUSIVE-OR
problem. Table V shows the corresponding input and
output patterns. The simulation results are shown in Fig.
12 with the values of the constant differential current
sources randomly disturbed in the range of -2.5 to 2.5

IEEE JOURNAL OF SOLID-STATECIRCUITS, VOL. 26, NO. 12, DECEMBER 1991

1866

mumn 1

neuron 5
output
current

Input

[nAI

I
0

10

20

30

40

time ( p e c )

Fig. 12. Simulation results for the EXCLUSIVE-OR problem with offset
in the range of - 2.5 to 2.5 nA.

+
ourput

neuron 5

Fig. 10. Schematic diagram of the single-layer network.

output
current

TABLE IV
INPUTAND OUTPUT
PA-ITERNSFOR THE SINGLE-LAYER NETWORK
Input pattern
~

Output pattern

~~~~

( - 1, 1, - 1, 1, - 1)
(+ 1, - 1, 1, - 1, 1)

[nAl

(-1,+l)
(+ 1, - 1)
(+L + 1 )

( + l , -1, - 1, -1, -1)

... ..

10

20

30

40

time (psec]

Fig. 13. Simulations results for the EXCLUSIVE-OR problem with offset
in the range of - 10 to 10 nA.

r r s p o m c 01 neuron 6
rcbPo"lc of neuron 1

For the case of Hopfield network, the patterns are


stored at the minimum points of the following energy
function:

- 5 7

.loo+

--

U...

1.2

. ..

..

I
I

2.4

36

time [psec]

Fig. 11. Simulation results for the single-layer network.


TABLE V
INPUT AND OUTPUTP A ~ E R NFOR
S THE EXCLUSIVE-OR PROBLEM
Input pattern

Output pattern

nA (1.25% of the maximum input current) with the same


biasing current. The network still provides the correct
response. When the offsets are in the range of - 10 to 10
nA (5% of the maximum input current), incorrect results
were produced as shown in Fig. 13. As shown in the
simulation, the offset errors of the circuits may affect the
performance of the neural networks.

where
the output of neuron i,
wij the weight from i to j,
Bi
the constant input to neuron i.

oi

The offset errors of the circuits are incorporated into the


constant input Bi7sand therefore may affect the minimum
points of the above function. A n incorrect pattern will be
recalled unless the minimum points are widely spread. If
a correct pattern is recalled, the offset errors will cause a
variation in the settling time. Fig. 14 shows the schematic
diagram of a four-neuron Hopfield network. The output
of the network with random offset errors in the range of
-15 to 15 nA (3.75% of the maximum input current) at
the input of each neuron is shown in Fig. 15. The weights
of the network are obtained by the outer-product rule
[lo]. The network has two minimum points which are
(+ 1, - 1, 1, - 1) and ( - 1, 1, - 1, 1). Since the two
points are widely spread, the resulting pattern is correct
but various settling times are evident with various random
offset errors.

1
I

1867

LEE AND GULAK: CMOS FIELD-PROGRAMMABLE ANALOG ARRAY

designing the PC interface unit. Fabrication support by


CMC is gratefully acknowledged.
REFERENCES

I4

1-17
G

OUQUl

Fig. 14. Schematic diagram of the Hopfield network.

0 5

1 5

time [psec]

Fig. 15. Simulation results for the Hopfield network.

VII. CONCLUSION
The concept of developing a field-programmable analog array (FPAA) to realize different neural network
topologies is proposed. The array is designed using subthreshold circuit techniques to achieve very low power
dissipation. The interconnection network is based on an
area-universal fat tree. Consequently, the layout of the
array is compact and area efficient. An experimental
prototype design was successfully implemented and tested.
The prototype can be dynamically reconfigured to perform various analog functions. A set of macromodels for
the circuits is developed. These models are then used in
the simulation of various neural networks. Based on the
prototype and simulation results, the FPAA concept appears to be a promising candidate for neural network
applications.
ACKNOWLEDGMENT
The authors thank Prof. S. Zukotynski for generous
access to the Keithley SMUs and to Oswin Hall for

[l] The Programmable Gate Array Data Book, Xilinx, San Jose, CA,
1989.
[2] G. E. Hinton, D. E. Rumelhart, and R. J. Williams, Parallel
Distributed Processing: Explorations in the Microstructure of Cognition, vol. 1, Foundations. Cambridge, M A MIT Press, 1986.
[3] L. R. Coke and G. J. Lipovski, Banyan networks for partitioning
multiprocessing system, in Proc. 1st Annual Computer Architecture
Conf., 1973, pp. 21-28.
[4] H. P. Graf and L. D. Jackel, Analog electronic neural network
circuits, IEEE Circuits Devices Mag., vol. 5 , no. 4, pp. 44-49, July
1989.
[5] D. H. Lawrie, Access and alignment of data in an array processor,
IEEE Trans. Comput., vol. C-24, pp. 1145-1155, 1975.
[6] E. K. F. Lee, Field programmable analog arrays-A CMOS
realization, M.A.Sc. thesis, Univ. Toronto, Toronto, Ont., Canada,
1991.
[7] C. E. Leiserson, Fat-trees: Universal networks for hardware-efficient supercomputing, IEEE Trans. Comput., vol. C-34, no. 10, pp.
892-901, 1985.
181 C. E. Leiserson, VLSI Theory and Parallel Supercomputing. Cambridge, M A MIT Press, 1989.
[9] J. Mann, R. Lippmann, B. Berger, and J. Raffel, A self-organizing
neural net chip, in Proc. IEEE 1988 Custom Integrated Circuits
Conf., 1988, pp. 10.3.1-10.3.5.
[lo] R. J. McEliece, E. C. Posner, E. R. Rodemich, and S. S. Venkatesh,
The capacity of the Hopfield associative memory, IEEE Trans.
Inform. Theory, vol. IT-33, no. 4, pp. 461-482, July 1987.
[ 111 J. H. Patel, Processor-memory interconnections for multiprocessors, in Proc. 6th Annual Symp. Computer Architecture, 1979, pp.
168- 177.
[ 121 J. I. Raffel, Electronic implementation of neuromorphic systems,
in Proc. IEEE I988 Crrstom Integrated Circuits Conf., 1988, pp.
10.3.1- 10.3.5.
1131 D. B. Schwartz, R. E. Howard, and W. E. Hubbard, A programmable analog neural network chip, IEEE. J . Solid-state Circuits, vol. 24, no. 2, pp. 313-319, Apr. 1989.
[14] M. A. Sivilotti, A dynamically configurable architecture for prototyping analog circuits, in Advanced Res. VLSI: Proc. Decennial
Caltech Conf. VLSI. Cambridge, MA: MIT Press, 1988.

Edward K. F. Lee received the B.A.Sc degree


from the University of Windsor, Windsor, Ont.,
Canada, in 1988 and the M.A.Sc degree from
the University of Toronto, Toronto, Ont., in
1991. Currently, he is with the Department of
Electrical Engineering, University of Toronto,
where he is working towards the Ph.D. degree.
His research interests are in the areas of anacessing.
log/digital
circuits, VLSI design, and signal pro-

P. Glenn Gulak (S82-M83) received the Ph.D.


degree in electrical engineering from the University of Manitoba, Winnipeg, Man., Canada.
From February 1985 to January 1988 he was a
Research Associate in the Information Systems
Laboratory and the Computer Systems Laboratory at Stanford University, Stanford, CA.
Presently he is an Assistant Professor in the
Department of Electrical Engineering at the
University of Toronto, Toronto, Ont., Canada.
His research interests include VLSI circuits and
systems for signal proc essing and digital communications.
I

Vous aimerez peut-être aussi