
42nd South Eastern Symposium on System Theory, T2B.4
University of Texas at Tyler
Tyler, TX, USA, March 7-9, 2010

Fault Tolerant Block Based Neural Networks


Sai Sri Krishna Haridass and David H. K. Hoe
Electrical Engineering Department
University of Texas at Tyler
Tyler, TX 75799 USA

Abstract: Block Based Neural Networks (BBNNs) have been shown to be a practical means for implementing evolvable hardware on reconfigurable fabrics, solving a variety of problems that take advantage of the massive parallelism offered by a neural network approach. This paper proposes a method for obtaining a fault tolerant implementation of BBNNs by using a biologically inspired layered design. At the lowest level, each block has its own online detection and correction logic combined with sufficient spare components to ensure recovery from permanent and transient errors. Another layer of hierarchy combines the blocks into clusters, where a redundant column of blocks can be used to replace blocks that cannot be repaired at the lowest level. The hierarchical approach is well-suited to a divide-and-conquer approach to genetic programming whereby complex problems are subdivided into smaller parts. The overall approach can be implemented on a reconfigurable fabric.

Index terms - Reconfigurable logic, Block based network, Fault detection and correction

I. Introduction

Artificial Neural Networks (ANNs), modelled after the biological neural networks of the brain, have shown the ability to solve a variety of complex problems due to their unique features such as massive parallelism with inherent concurrency. This makes them well suited to efficiently solving problems such as image and pattern recognition and data prediction. Software implementations running on sequential machines fail to take advantage of the inherent parallelism of ANNs. Electronic implementations of ANNs face several challenges. First, the trial-and-error-based training algorithms for obtaining an optimal network topology require frequent changes to the network structure and parameters such as synaptic weights and biases. Due to this issue, training algorithms are usually run first in software prior to hardware implementation. A reconfigurable neural network allows topological changes to be programmed into the circuit, allowing an optimal topology to be found completely at the hardware level [1]. Second, implementing ANNs of any practical size requires a large number of synaptic interconnections. Block based neural networks (BBNNs) have been proposed as a means for reducing this interconnect complexity. Together with their regular structure, BBNNs are suitable for implementation on reconfigurable fabrics, such as FPGAs [5].

ANNs implemented with advanced processing technology and operating in harsh environments must be fault tolerant. The continued scaling of device sizes increases the defect density and makes the scaled devices more vulnerable to transient faults due to external influences such as electromagnetic interference. Thus a totally self-checking system is required where faults are automatically detected and corrected. This paper proposes an architecture for implementing a block based reconfigurable fault tolerant neural network. A system based on the proposed architecture is composed of a number of functional cells that are interconnected to perform a desired function. Error detection in the BBNN is performed in every block. Once a faulty cell is identified, a spare block replaces it. A hierarchical approach arranges the blocks into clusters, with a sufficient number of spare blocks located in each cluster to ensure the required level of fault tolerance. This self-healing mechanism is inspired by complex biological systems, which often are able to remain operational while sustaining certain levels of damage [2, 3].

This paper is organized as follows. First, the basic concepts of the block based neural network and its relationship to a genetic system are reviewed. Then the fault tolerant design of the BBNN system is described.

978-1-4244-5692-5/10/$26.00 © IEEE 2010 357
II. Block-Based Neural Network

A Block-Based Neural Network (BBNN) is a two-dimensional array of basic neural network blocks interconnected in nearest-neighbor fashion. Figure 1 depicts the structure of the basic neural network block. Every block in the network has four nodes, with each of these nodes acting as either an input or an output of the block. This block forms the essential processing element of the BBNN [4].

Fig. 1. Basic Unit of BBNN

The structure of the BBNN defines the signal flow between blocks, which in turn automatically determines the internal configuration or input-output connections of the basic blocks. Depending on the flow of data between blocks, four types of internal configurations are needed, as shown in Figure 2. Each individual neuron block computes outputs that are functions of the summation of weighted inputs and a bias, essentially forming a simple feedforward neural network [5].

Fig. 2. Four different internal configurations of a BBNN block: (a) 1/3, (b) 3/1, (c) 2/2, (d) 2/2

The output vj of each block is computed as:

    vj = h( ∑_{i∈I} wij · ui + bj ),  j ∈ J

where
    wij = synaptic weight between input ui and output vj
    ui = input signal of a block
    bj = bias of the jth node

The indices i and j denote the input and output nodes, respectively. The activation function h(·) can be a linear or a nonlinear function. Figure 3 depicts a cluster of neurons. The column to the right is a column of spare neurons. The overall structure of the BBNN defines the signal flow represented by the arrows between the blocks [5].

Fig. 3. A cluster of blocks with spare cells

III. Analogy between BBNN and Biological Systems

In biological systems, the gene is the basic unit of heredity in a living organism [7]. Composed of sequences of DNA, genes contain the information that encodes different biological traits. The totality of an organism's hereditary information is found in its genome, which consists of genes and the non-coding sequences of DNA. A computational metaphor is appropriate since the genome can be viewed as a computer's operating system, while the genes are the "individual subroutines in the overall system that are repetitively called in the process of transcription" [6]. At an intermediate level is the chromosome, which consists of many genes and other regulatory and nucleotide sequences.

The BBNN referred to in this paper is quite similar to a biological system in its structure and functioning. The configuration bits used to specify the synaptic weights, internal structure, and intercellular connections can be considered to encode the basic genetic information of each block of the BBNN.
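As a concrete illustration, the block output computation defined above, vj = h(∑ wij·ui + bj), can be sketched in software (a minimal sketch; the tanh activation and the example weights are illustrative assumptions, not values from the paper):

```python
import math

def block_output(inputs, weights, bias, h=math.tanh):
    """Compute one output node: v_j = h(sum_i w_ij * u_i + b_j)."""
    return h(sum(w * u for w, u in zip(weights, inputs)) + bias)

# A 2/2-configuration block: two inputs feed each of two output nodes.
u = [0.5, -0.25]
v1 = block_output(u, weights=[0.8, 0.1], bias=0.05)
v2 = block_output(u, weights=[-0.3, 0.6], bias=0.0)
```

Each output node applies the same weighted-sum-plus-bias form; only the weight vector and bias differ per node.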

Each such block can be considered to form a cell whose set of genetic information is contained within a chromosome. At the top level of the hierarchy, the complete set of weights and structural information corresponds to the genome. Each cell of the BBNN forms an autonomous system capable of self-healing and adaptive change.

The biological model of the BBNN has two important implications for its operation. First, by encoding all the genetic information for each block within a chromosome, evolutionary algorithms (EAs) can be used to simultaneously find an optimal solution for the internal structure and weights of the BBNN. The use of EAs for solving some basic problems with BBNNs has demonstrated the feasibility of this approach [4, 5]. Second, the biologically-inspired model leads to a viable layered approach to fault tolerant design [2]. At the lowest level of the cell, self-correcting logic is used to detect and correct errors. The next layer consists of a cluster of cells, which can invoke spare cells in the event the self-healing ability at the cellular level fails. Another layer can similarly be formed with self-healing capability, analogous to cells forming tissues and organs in biological systems. This hierarchical arrangement is well-suited to a divide-and-conquer approach to solving complex problems where multiple genetic algorithms (GAs) can be run on one system [9].

IV. Fault Tolerant Mechanism

A fault tolerant system has the ability to detect and correct errors. On-line fault tolerance is of prime importance for mission critical systems such as those operating in aircraft or those operating in harsh environments (e.g., satellites).

Faults can be classified into two types: processing faults and run-time faults. Processing faults occur during production of the system. For example, the defect density of a silicon chip is due to processing faults, which directly impacts the yield of the chip. Run-time errors are those errors which take place because of some internal or external perturbations during dynamic operation of the system. Depending on the region and type of impact, the faults can be either permanent or transient.

In this paper we propose the structure shown in Figure 4, which can tolerate up to one error per block. The main parts of this cellular block are the weight table, central logic unit, self test logic unit, finite state machine control logic unit, input and output nodes, and their interconnections.

The input nodes are connected to the central logic unit via a multiplier whose other input comes from the weight table, which contains the synaptic weight information previously obtained from running a GA. There are three multipliers per block: two main multipliers (M1 and M2) and a spare unit (Mspare). A block must perform at most four multiplications at any given processing step (configuration 2/2), so two clock cycles are required to process the data for each block. The use of the spare multiplier plus a time redundant approach forms the basis of our fault tolerant approach.

Fig. 4. Functional representation of a Basic Unit of the BBNN

A. Error Checking Algorithm

Before the whole system starts processing any data, a power-on self-test takes place which is monitored by the self test logic. This test checks all three multipliers in every block in the network.
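The power-on self-test can be modeled in software as follows (a hedged sketch: the test vectors and the injected-fault lambda are illustrative assumptions, not the paper's actual built-in self-test patterns):

```python
def self_test(multipliers, vectors=((3, 5), (0, 7), (-2, 4))):
    """Exercise each multiplier with known operand pairs and flag any
    unit whose result disagrees with the reference product."""
    faulty = set()
    for name, mul in multipliers.items():
        if any(mul(a, b) != a * b for a, b in vectors):
            faulty.add(name)
    return faulty

# M1 and M2 behave correctly; Mspare carries an injected fault for demonstration.
mults = {
    "M1": lambda a, b: a * b,
    "M2": lambda a, b: a * b,
    "Mspare": lambda a, b: a * b + 1,  # injected error
}
faulty_units = self_test(mults)
```

A block flagged here would be disabled and replaced with a spare before normal operation begins, as described below.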

Hardware (several multipliers working in parallel) and temporal (repeated attempts with varied inputs) redundancy are used to ensure that all the components of the cellular block are functional. The structure of the BBNN allows for efficient implementation of these standard approaches for achieving system fault tolerance. Error checking codes (such as Hamming codes) will be used to ensure the integrity of the configuration bits. If the block is initially found to be faulty, it can immediately be disabled and replaced with a spare block as described below.

Once the self-test is complete, we can have confidence that every block in the BBNN is initially error free. The algorithm shown in Figure 7 is then followed for online fault detection during operation of the BBNN. Following this algorithm, both inputs arriving at the central control logic are evaluated using a comparator. For simplicity, M1 is considered the main multiplier and Mspare is the spare multiplier. (M1 and M2 are checked by Mspare on alternating pairs of clock cycles.)

If both multiplier outputs are equal, then it is assumed that there is no error, the product is passed to the output, and the Error_count value is decreased by 1 (if the Error_count value is greater than zero). The Error_count provides some degree of history to allow for recovery from transient errors and to ensure that a block is not prematurely removed from the system.

If the two multipliers differ in value, then Error_count is incremented by two, the system is stalled with the existing values, and the product is recalculated using the other multiplier in the block (in this case, the one labeled M2). The outputs of the first two multipliers (M1 and Mspare) are compared with the output of the new multiplier M2. If either of the initial two multiplier values agrees with multiplier M2, this value will be assigned to the output node.

In the case that M2 does not agree with either of M1 or Mspare, the multipliers of the neighboring blocks are invoked. This part of the logic is shown in Fig. 5. The outputs of multipliers M2 and M1 of the neighboring blocks are labeled Nb1_m2_out and Nb2_m1_out, respectively. These outputs enter the block evaluator along with M1_out and M2_out of the current block for further comparison. If M1_out of the current block is equal to Nb1_m2_out and M2_out of the current block is equal to Nb2_m1_out, then the fault is a transient one, so we neglect the fault and repeat the previous step (i.e., give M1 and M2 the same input and weight and carry on with their output).

Fig. 5. Evaluation when none of the outputs of the multipliers are the same.

If a match with the neighboring neurons fails, then the fault is a permanent one and needs to be corrected by replacing the faulty block with a spare block. This is achieved by bypassing the column with the faulty block and invoking the spare column in the cluster as shown in Figure 6.

Fig. 6(a) Block b2 is a faulty block in the cluster.

Fig. 6(b) Blocks b2 and c2 are shifted right with their respective columns to correct the fault.
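The column-bypass repair of Figure 6 can be sketched at the data-structure level (a simplified software model; the list-of-columns representation and the cell names are assumptions for illustration, not the paper's hardware mechanism):

```python
def repair_cluster(columns, faulty_col):
    """Bypass the column containing a permanently faulty block.
    Dropping it from the active list models shifting the remaining
    columns toward the spare column, which then takes over."""
    return columns[:faulty_col] + columns[faulty_col + 1:]

# Columns a and b are active; the rightmost column is the redundant spare.
cluster = [["a1", "a2"], ["b1", "b2"], ["spare1", "spare2"]]
repaired = repair_cluster(cluster, faulty_col=1)  # column holding faulty b2
```

After the repair, the spare column participates in the signal flow in place of the bypassed column, so the cluster keeps its full processing width minus the one redundant column it consumed.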

#B_enable is the block enable signal
#M1_out is the output of multiplier M1
#M2_out is the output of multiplier M2
#Mspare_out is the output of multiplier Mspare
#N_input is the input of the output node
#Error_count is the value of the error counter
#Temp_input is the temporary signal which holds the output of the neuron temporarily

Threshold = 15
B_enable = 1

if (M1_out == Mspare_out)
    Temp_input = M1_out;
    if (Error_count > 0)
        Error_count = Error_count - 1;
    end if
else
    Error_count = Error_count + 2;
    Stall the system.
    Recalculate the product using the multiplier M2.
    do
    {
        check_parameter = 1
        if (Mspare_out == M2_out)
            Mask M1_out
            Temp_input = Mspare_out;
        elseif (M1_out == M2_out)
            Mask Mspare_out
            Temp_input = M1_out;
        else
            Error_count = Error_count + 5;
            Invoke the neighboring neurons and use their multipliers
            to evaluate if M1 or M2 or both multipliers of the
            current block are correct.
            if (Nb1_m2_out == M1_out && Nb2_m1_out == M2_out)
                then {fault is a transient fault, so neglect it and
                repeat the previous step of evaluation}
            else
                check_parameter = 0;
                At this point there is more than a single error in
                the neuron block; thus we need to disable the block
                and replace it with a spare block by shifting the
                whole column one unit right.
            end if
        end if
    } while (check_parameter == 0)
end if

N_input = Temp_input;

Fig. 7. Error Detection and Correction Algorithm

V. Summary and Current Work

A fault tolerant BBNN using a biologically inspired hierarchical approach has been described. Efficient online detection and correction of errors at the cellular and cluster levels is achievable, resulting in several levels of redundancy. This should yield robust BBNN designs suitable for implementation on reconfigurable nanoscale fabrics. Currently we are implementing the proposed design on a Xilinx Virtex 5 FPGA as a proof of concept and will present the results at the conference.

REFERENCES

[1] S. Satyanarayana and Y. P. Tsividis, "A Reconfigurable VLSI Neural Network," IEEE J. of Solid-State Circuits, vol. 27, no. 1, pp. 67-81, Jan. 1992.
[2] W. Baker, D. M. Halliday, Y. Thoma, E. Sanchez, G. Tempesti, A. M. Tyrrell, "Fault Tolerance Using Dynamic Reconfiguration on the POEtic Tissue," IEEE Trans. on Evolutionary Computation, vol. 11, no. 5, pp. 666-684, Oct. 2007.
[3] P. K. Lala and B. K. Kumar, "Human immune system inspired architecture for self-healing digital systems," Proceedings, International Symposium on Quality Electronic Design, pp. 292-297, 2002.
[4] S. Merchant, G. D. Peterson, S. Ki Park, S. G. Kong, "FPGA Implementation of Evolvable Block-based Neural Networks," IEEE Congress on Evolutionary Computation, pp. 3129-3136, 2006.
[5] Sang-Woo Moon, Seong-Gon Kong, "Block-based neural networks," IEEE Trans. on Neural Networks, vol. 12, no. 2, pp. 307-317, Mar. 2001.
[6] M. B. Gerstein, C. Bruce, J. S. Rozowsky, D. Zheng, J. Du, et al., "What is a gene, post-ENCODE? History and updated definition," Genome Research, vol. 17, no. 6, pp. 669-681, 2007 (citation on p. 671).
[7] D. Noble, "Genes and causation," Philosophical Trans. of the Royal Society: Series A, vol. 366, pp. 3001-3015, 2008.
[8] Y. Jewajinda and P. Chongstitvatana, "FPGA implementation of a cellular univariate estimation of distribution algorithm and block-based neural network as an evolvable hardware," IEEE Congress on Evolutionary Computation, pp. 3366-3373, June 2008.
[9] E. Cantu-Paz, Efficient and Accurate Parallel Genetic Algorithms, Springer, 2000.
