
Programmable System Design

Testing and Design for Reliability of Digital Systems

Interdisciplinary Project

Fault Tolerant Processor
using Hybrid Hardware Redundancy

Matteo Ainardi
Vittorio Giovara
Alberto Grand
Fabio Margaglia

March 11, 2009


Contents

1 Introduction

2 General Architecture
2.1 Finite State Machine
2.2 the_breaker module

3 Custom Development Tools
3.1 Overview
3.2 Memory Generation
3.3 Structure

A Memory content

B Test Program

C HEX generator

D Compiler patch

List of Figures

2.1 System Architecture Model
2.2 Main Finite State Machine Scheme
2.3 Finite State Machine Scheme for the process the_breaker

Chapter 1

Introduction

The ability of a computing system to survive hardware or software failures is gaining ever-
growing importance in some application fields. Such a property is called fault-tolerance; in
particular, a fault-tolerant system is able to operate properly when one or more faults occur
in some of its components. This characteristic is particularly desirable in high-availability
or life-critical systems.
Two main approaches may be followed to make a system fault-tolerant. The first, most
obvious approach relies on special hardened technology that is reliable with respect to
the faults under consideration.
A second approach instead resorts to ordinary, unreliable technology and is based on
suitable design techniques which exploit redundancy. Several types of redundancy are
possible: information redundancy, which relies on storing more data than is necessary
in order to provide error detection and, possibly, error correction; time redundancy, which
relies on taking more processing time than is needed to perform the computation(s) once;
hardware redundancy, which consists of implementing a system with more hardware than
is needed to provide the required functionality.
The latter option encompasses a number of different techniques, among which is hybrid
hardware redundancy. Following this approach, a critical primary module is replicated
several times. A subset of these primary module instances are active at the same time,
that is, they run in parallel and their outputs are compared for discrepancies; the remaining
primary modules are spare. A voter module is in charge of selecting the actual output. A
configuration module detects the active primary module(s) whose output(s) differ from the
actual one; as soon as a primary module is found faulty, it is replaced with a spare one by
the configuration module and marked as faulty. The faulty module can then be evicted
either permanently, in which case it is powered off and no longer used, or temporarily.
In the latter case, it is not powered off and its outputs are monitored for a given time
interval; if it does not exhibit any faulty behaviour by the end of the monitoring period,
it is promoted to the role of spare module. This technique provides error masking and
correction as long
as the number of correctly-behaving modules is sufficiently high to perform voting; after
that, only error detection is provided.
This interdisciplinary project addresses the design of a fault-tolerant computing system
and its implementation on an FPGA board. The type of redundancy adopted is hybrid
hardware redundancy. The DE2 board and the Nios II processor from Altera were chosen
because they are particularly well suited to this purpose. The behaviour of the system
differs slightly from what is described above. In particular, since the Altera Nios II
VHDL code does not provide a way to power off a processor, the only possible solution
was to hold the spare processor and the faulty one in a reset state instead of powering
them off. Furthermore, when a processor is only affected by a transient fault, it is
reintegrated as a primary module at the end of the evaluation phase, rather than being
used as a spare. This way the spare processor is always the fourth one, which makes the
control unit simpler.

Chapter 2

General Architecture

The system has been designed following a structural description style. Its main
components are the four Nios II processors, the voter, the validator, the finite state
machine (FSM) and a number of multiplexers. Each of these is described below, except
the FSM, which is complex enough to warrant a more detailed explanation in Section 2.1.

PROCESSORS
The four Nios II processors are component instances of the processor that has been created
through the SOPC Builder. It is a closed-source processor (Altera’s IP); the SOPC Builder
encrypts the corresponding VHDL file, so only the processor interface (its input and output
signals) is known. The main components that have been instantiated are:

• the core itself, economy version, chosen for size constraints;

• an on-chip memory, which contains a sample program to be executed by the processors;

• an auxiliary on-chip memory, which contains the procedure to restore the correct
processor status and registers after a failure. It also provides the space to save the
registers themselves. An on-chip RAM was used because simulation was problematic
with other types of memories;

• a PIO controller, needed for driving some LEDs which indicate that the processors
are running.

VOTER
The task of this module is to vote on the signals coming from the three active
processors and to detect possible faults. The voter takes as inputs four sets of seven
signals each (d_address_in, d_byteenable_in, d_read_in, d_write_in, d_writedata_in,
i_address_in, i_read_in), one set per processor. Three control signals (sel0, sel1 and
sel2) specify which processors' outputs are currently being voted: each of them can
assume the values 00 (processor number 0), 01 (processor number 1), 10 (processor
number 2) and 11 (processor number 3), in a mutually exclusive fashion.
Three output signals (fault_0, fault_1 and fault_2) are activated when the corresponding
processor is found faulty. Note that the fault_0 signal does not necessarily refer to
processor 0, but rather to the one specified by sel0; the same holds for fault_1 and
fault_2. The module also provides the correct values, obtained through voting, for each
of the seven processor signals.

Figure 2.1: System Architecture Model

The different copies of each of the seven signals coming from the different processors
are compared with each other to make sure they assume the same values at all times.
This task is carried out by two module types, single_voter and single_voter_one_bit,
which differ only in the width of the signals on which they operate. These
modules take as inputs the four copies of a processor signal (either n bits or 1 bit wide)
and the selection signals. Their outputs are three fault signals (fault_0, fault_1 and
fault_2), which are asserted when a fault is detected in the corresponding processor, and
the correct signal value (correct_output). All fault signals related to the same processor
are OR-ed together to create the global fault signal for that processor.
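
The voting of a single signal can be sketched in C as follows. This is only a behavioural
illustration and not the VHDL of the project: the type vote_result, the function name
single_voter_sketch and the no_majority flag are assumptions introduced here, and the
behaviour when all three voted copies disagree is not specified in the text above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Result of voting one signal among the three active processors. */
typedef struct {
    uint32_t correct_output; /* majority value (arbitrary if no majority) */
    bool     fault[3];       /* fault[k] refers to the processor chosen by sel[k] */
    bool     no_majority;    /* hypothetical flag: all three copies disagree */
} vote_result;

/* in[4]:  the four copies of one processor signal.
 * sel[3]: 2-bit indices (0..3) of the three processors being voted. */
vote_result single_voter_sketch(const uint32_t in[4], const uint8_t sel[3])
{
    uint32_t a = in[sel[0] & 3u];
    uint32_t b = in[sel[1] & 3u];
    uint32_t c = in[sel[2] & 3u];
    vote_result r = { 0, { false, false, false }, false };

    if (a == b || a == c) {
        r.correct_output = a;      /* a agrees with at least one other copy */
    } else if (b == c) {
        r.correct_output = b;      /* a is the odd one out */
    } else {
        r.no_majority = true;      /* assumption: flag every voted processor */
        r.correct_output = a;
    }

    r.fault[0] = r.no_majority || (a != r.correct_output);
    r.fault[1] = r.no_majority || (b != r.correct_output);
    r.fault[2] = r.no_majority || (c != r.correct_output);
    return r;
}
```

With sel = {0, 1, 2} and a corrupted copy coming from processor 2, only fault[2] is raised;
once the selection is changed to {0, 1, 3}, the same inputs produce no fault.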

VALIDATOR
The purpose of this module is to monitor a processor’s behaviour after it has been found
faulty and evicted, that is, while the fourth processor has been activated and is in use.
The validator takes as inputs the set of seven processor signals coming from the voter,
which are therefore correct, and the three sets of signals coming from processors 0, 1
and 2. A 2-bit selection signal specifies which processor is to be evaluated. A further
enable signal can prevent this module from generating an output that could interfere with
the normal evolution of states within the finite state machine. This module compares
the signals of the processor under evaluation with the correct ones on a per-clock-cycle
basis. If the two signal sets are equal, the valid signal is asserted. This output signal
is sent to the finite state machine, which evaluates it over a number of clock cycles.
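
The per-cycle comparison performed by the validator can be sketched as below. The struct
cpu_signals, the field widths and the function names are assumptions made for the sake of
illustration (the actual signal widths are not given in the text); the function is meant
to be evaluated once per clock cycle.

```c
#include <stdbool.h>
#include <stdint.h>

/* The seven processor signals observed by the voter and the validator.
 * Field widths are assumptions. */
typedef struct {
    uint32_t d_address;
    uint8_t  d_byteenable;
    bool     d_read;
    bool     d_write;
    uint32_t d_writedata;
    uint32_t i_address;
    bool     i_read;
} cpu_signals;

/* True when the two signal sets are identical in this clock cycle. */
static bool same_signals(const cpu_signals *x, const cpu_signals *y)
{
    return x->d_address    == y->d_address
        && x->d_byteenable == y->d_byteenable
        && x->d_read       == y->d_read
        && x->d_write      == y->d_write
        && x->d_writedata  == y->d_writedata
        && x->i_address    == y->i_address
        && x->i_read       == y->i_read;
}

/* enable: gates the output; sel: processor under evaluation (0..2);
 * cpu:    signal sets of processors 0, 1 and 2;
 * voted:  correct signal set produced by the voter. */
bool validator_valid(bool enable, unsigned sel,
                     const cpu_signals cpu[3], const cpu_signals *voted)
{
    if (!enable || sel > 2u)
        return false;
    return same_signals(&cpu[sel], voted);
}
```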

BREAK MULTIPLEXER
This is a 4-to-1 32-bit multiplexer which is inserted between the Avalon instruction bus
and the four processors. The selection signal decides which instructions feed the
processors: when its value is 00, the processors are fed with the normal instruction flow
(coming from either on-chip memory 0 or on-chip memory 1); when it is 01, a fixed value
(footnote 1) is sent to all processors to launch the break procedure upon a processor
failure, so that the newly activated processor can be aligned with the correctly working
ones; when it is 10, a different fixed value (footnote 2) is sent to flush the processor
pipeline. This last feature is currently not exploited, but it has been left in place for
possible future use. The fourth input of the multiplexer is not used.
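
As a behavioural illustration, the selection logic of the break multiplexer can be written
as the C function below. The two constants are the bit patterns quoted in the footnotes of
this section, rewritten in hexadecimal; the function name break_mux and the value returned
for the unused fourth input are assumptions.

```c
#include <stdint.h>

#define BREAK_INSTR 0x003DA03Au /* 00000000001111011010000000111010 */
#define FLUSH_INSTR 0x0000203Au /* 00000000000000000010000000111010 */

/* sel: 2-bit selection signal; normal_instr: word coming from the
 * Avalon instruction bus. Returns the word fed to the processors. */
uint32_t break_mux(uint8_t sel, uint32_t normal_instr)
{
    switch (sel & 3u) {
    case 0:  return normal_instr; /* normal instruction flow */
    case 1:  return BREAK_INSTR;  /* launch the break procedure */
    case 2:  return FLUSH_INSTR;  /* pipeline flush (currently unused) */
    default: return 0u;           /* fourth input unused; 0 is an assumption */
    }
}
```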

ERROR MULTIPLEXERS
These are 2-to-1 32-bit multiplexers; there are four of them, each one inserted between the
break multiplexer and a processor. Their purpose is to artificially inject errors in order to
simulate a faulty processor. A selection signal establishes if the normal instruction flow
or a fixed value (all 0s) feeds the processor. The selection signal is connected to buttons
on the DE2 board, so that a fault in a given processor can easily be generated by pressing
the corresponding button.
1. The transmitted value is 00000000001111011010000000111010 (0x003DA03A), corresponding to the break instruction.
2. The transmitted value is 00000000000000000010000000111010 (0x0000203A), corresponding to the flush instruction.

2.1 Finite State Machine
The finite state machine drives the control signals directed to the multiplexers and to the
other parts of the system.
It controls the voter (through the three sel signals), the validator (through its enable
and sel signals), the four processor resets and the mux_break_sel signal.
The FSM receives as inputs the fault signals from the voter, so that when a fault occurs
it can identify the affected processor. The valid signal, driven by the validator, is used
to distinguish a temporarily faulty processor from a broken one.
Furthermore, the i_read and i_readdata signals are provided as inputs to detect the
beginning and the end of the break procedure.
The finite state machine process, the_fsm, evolves through the following states:

RESET_FIRST_PART: the FSM enters this state after a global reset; this state consists
of a delay (realized through a counter) which is needed to correctly reset all
processors;

RESET_SECOND_PART: this state consists of a further delay, needed to let all
processors exit correctly from the reset state before an error can be injected;

NORMAL: the three processors P0, P1 and P2 are working and P3’s reset
signal is high. If a fault is detected on one of the processors, the FSM goes into the
BREAK state of the faulty processor;

BREAK_P0, BREAK_P1, BREAK_P2: when a fault occurs, processor P3 is
activated (its reset is switched to low). The the_breaker process is asked, through the
break_request signal, to re-align all the processors and, when this procedure
has been completed, the break_complete signal is set to ’1’. It is assumed that no
further fault can occur during the re-alignment phase. When the re-alignment
has been completed, the processor evaluation phase starts (VAL state);

VAL_P0, VAL_P1, VAL_P2: in this phase the processor which previously presented a
fault is evaluated: if no fault is detected on the processor for a certain period of time,
it is marked as valid and the state goes back to NORMAL; otherwise, if other
faults are detected on the same processor, the processor is marked as broken and the
FSM goes into the corresponding BROKEN state. This check is implemented by means
of a counter controlled by the counter_start signal; when the counter expires, it sets
counter_finish to 1. If the valid signal is 1 for the whole counter period, the fault
was temporary; otherwise the fault was permanent. If a fault on another processor
is detected during the evaluation phase, the system goes to the OUT_OF_SERVICE state,
since it is no longer able to provide fault tolerance;

BROKEN_P0, BROKEN_P1, BROKEN_P2: a certain processor is broken (i.e.
permanently faulty); the system keeps working until another fault is detected, then
it moves to the OUT_OF_SERVICE state, since voting is no longer possible;

OUT_OF_SERVICE: the system is no longer working, since it cannot provide
any fault tolerance mechanism.
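
The state evolution listed above can be summarized by the following next-state function.
It is a simplified behavioural sketch, not the project's VHDL: the three per-processor
BREAK, VAL and BROKEN states are folded into a state kind plus a processor index, the two
reset delays are abstracted into a single delay_done input, and a processor is assumed to
be declared broken as soon as valid drops during evaluation. All type, field and function
names are assumptions.

```c
#include <stdbool.h>

typedef enum {
    ST_RESET_FIRST_PART, ST_RESET_SECOND_PART, ST_NORMAL,
    ST_BREAK, ST_VAL, ST_BROKEN, ST_OUT_OF_SERVICE
} fsm_kind;

typedef struct {
    fsm_kind kind;
    int      cpu;   /* 0..2 in BREAK/VAL/BROKEN (and after), -1 otherwise */
} fsm_state;

typedef struct {
    bool delay_done;      /* reset delay counter expired (abstraction) */
    bool fault[3];        /* fault_0..fault_2 from the voter */
    bool break_complete;  /* from the_breaker */
    bool valid;           /* from the validator */
    bool counter_finish;  /* evaluation counter expired */
} fsm_inputs;

fsm_state fsm_next(fsm_state s, const fsm_inputs *in)
{
    bool any_fault = in->fault[0] || in->fault[1] || in->fault[2];

    switch (s.kind) {
    case ST_RESET_FIRST_PART:
        if (in->delay_done) s.kind = ST_RESET_SECOND_PART;
        break;
    case ST_RESET_SECOND_PART:
        if (in->delay_done) s.kind = ST_NORMAL;
        break;
    case ST_NORMAL:
        /* sel = (0, 1, 2) here, so fault[k] refers to processor k */
        for (int k = 0; k < 3; k++) {
            if (in->fault[k]) { s.kind = ST_BREAK; s.cpu = k; break; }
        }
        break;
    case ST_BREAK:
        if (in->break_complete) s.kind = ST_VAL;
        break;
    case ST_VAL:
        if (any_fault)               s.kind = ST_OUT_OF_SERVICE;
        else if (!in->valid)         s.kind = ST_BROKEN;  /* permanent fault */
        else if (in->counter_finish) { s.kind = ST_NORMAL; s.cpu = -1; }
        break;
    case ST_BROKEN:
        if (any_fault) s.kind = ST_OUT_OF_SERVICE;
        break;
    case ST_OUT_OF_SERVICE:
        break;  /* terminal state */
    }
    return s;
}
```

For a transient fault on P1 the sketch walks through NORMAL, BREAK (cpu = 1), VAL and back
to NORMAL; if valid is deasserted during evaluation it ends in BROKEN instead, and any
further fault then leads to OUT_OF_SERVICE.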

Figure 2.2: Main Finite State Machine Scheme

Two further processes are used in the finite state machine:

• the counter. It is a simple counter which determines the amount of time
that must be spent in the evaluation (VAL) state (50,000,000 clock cycles).

• the_breaker. It is a sequential process which is in charge of initiating and controlling
the execution of the break procedure. It is implemented as a finite state machine
whose behaviour is driven by the break_request, i_read and i_readdata signals.

2.2 the_breaker module


This block has been developed to provide instructions at the processors’ inputs with the
correct timing; this timing is given by the i_read signal coming out of the processors.
This signal is held at ‘1’ for two clock cycles and then at ‘0’ for a variable number of
clock cycles: in order to be correctly executed, an instruction must be asserted on the
i_readdata port of a processor during the second half of an i_read cycle and during
the first half of the next i_read cycle. Because the low period of the i_read signal has
variable length, a counter-based FSM cannot generate the correct control signals:
the only possible solution is an FSM sensitive to the i_read signal, able to manage an
aperiodic behaviour.
While testing a preliminary version of the system on the DE2 board, it was noticed
that execution sometimes stalled in the BREAK state without executing the break
instruction. It was not possible to determine the cause of this phenomenon, so, as a
workaround, a further sequential block was developed to hold the break instruction at the
processors’ inputs until the break routine is actually executing.
The finite state machine is composed of five states:

• WAITING_FOR_START: it waits until break_request and i_read are both ‘1’,
then it feeds the processors with the break instruction and goes to state BREAK_1;

• BREAK_1: it keeps feeding the processors with the break instruction (by holding the
mux_break on the break instruction) as long as i_read is ‘1’; when i_read becomes ‘0’,
control moves to state BREAK_2;

• BREAK_2: the break instruction is still provided as input to the processors until
i_read becomes ‘1’; then the mux_break is switched to select instructions coming
from the secondary on-chip memory and control moves to the CHECK state;

• CHECK: this state verifies that the break instruction has been properly executed
by checking whether the next instruction corresponds to the first instruction of the
break routine. If this is the case, control moves to the WAITING_FOR_END state;
otherwise it goes back to WAITING_FOR_START and the break instruction is issued
again;

• WAITING_FOR_END: it inspects the i_readdata signal looking
for the BRET instruction. When this instruction is detected, the execution of the
break routine is finished and the break_complete signal is set to ‘1’. Control returns
to WAITING_FOR_START until the next break request.
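
The five states above can be sketched as a step function called once per clock cycle. This
is a behavioural approximation under several assumptions: the exact cycle in which
i_readdata is sampled in CHECK is not modelled, break_complete is treated as a one-cycle
pulse, and the encoding of the first instruction of the break routine is passed in as a
parameter rather than hard-coded. The BRET constant is the bit pattern listed in the
rules_encoding part of the compiler patch (Appendix D); all type and function names are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define BRET_INSTR 0xF000483Au /* 11110000000000000100100000111010 */

typedef enum {
    WAITING_FOR_START, BREAK_1, BREAK_2, CHECK, WAITING_FOR_END
} breaker_state;

typedef struct {
    breaker_state state;
    uint8_t       mux_break_sel;  /* 0 = normal flow, 1 = break instruction */
    bool          break_complete; /* assumed to be a one-cycle pulse */
} breaker;

void breaker_step(breaker *b, bool break_request, bool i_read,
                  uint32_t i_readdata, uint32_t first_routine_instr)
{
    b->break_complete = false;

    switch (b->state) {
    case WAITING_FOR_START:
        if (break_request && i_read) {
            b->mux_break_sel = 1;       /* start feeding the break instruction */
            b->state = BREAK_1;
        }
        break;
    case BREAK_1:
        if (!i_read) b->state = BREAK_2; /* hold break while i_read is high */
        break;
    case BREAK_2:
        if (i_read) {
            b->mux_break_sel = 0;       /* back to instructions from memory */
            b->state = CHECK;
        }
        break;
    case CHECK:
        /* Did the processors really jump to the break routine? */
        b->state = (i_readdata == first_routine_instr)
                 ? WAITING_FOR_END : WAITING_FOR_START;
        break;
    case WAITING_FOR_END:
        if (i_readdata == BRET_INSTR) {
            b->break_complete = true;
            b->state = WAITING_FOR_START;
        }
        break;
    }
}
```

If the check fails, the sketch returns to WAITING_FOR_START and, since break_request is
still asserted, the break instruction is issued again on the next i_read pulse.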

Figure 2.3: Finite State Machine Scheme for the process the_breaker

Chapter 3

Custom Development Tools

3.1 Overview
Several mechanisms offered by the FPGA platform could be used to transfer the state of
the working processors to the newly activated one, such as:

1. using an interrupt

2. implementing a custom-instruction

3. implementing a JTAG module embedded in the processor

4. exploiting a break function

However, most of these features did not completely satisfy the requirements or required
additional extensions: for instance, the boundary scan of the JTAG module would have
involved supplementary control software and a complete implementation of the JTAG
protocol. For these reasons the break mechanism was chosen, since it allows direct raw
access to the whole content of the processor.
When a break is invoked, the current context is suspended and control jumps to a fixed
memory location, set up during processor configuration; execution continues with the
code present in that memory until a return instruction (BRET) resumes the previous
context. In this particular configuration the memory is loaded with code that moves the
32 processor registers and the 32 control registers from a working processor to external
memory and then from memory to the spare processor; the stores to and loads from
memory are done with the STWIO and LDWIO instructions, which bypass the
caches. Note that control registers cannot be accessed directly: they must first be read
into or written from general-purpose registers through the instructions RDCTL and WRCTL.
These registers are not available in user mode, but since the break automatically
activates supervisor mode (whatever mode was set before), they become immediately
accessible.
The memory image is written to a file in the Intel .HEX format, which stores address
information, data and checksum in a single record structure; more precisely, each record
consists of six fields: start code (a colon “:”), byte count (16 or 32 bytes),
address, record type, data and checksum (the two’s complement of the sum of the
preceding fields), all in hexadecimal format.
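
As a small worked example of the checksum rule, the function below computes the last
field of a record from the bytes that precede it (byte count, address, record type and
data; the colon is not included). The function name ihex_checksum is an assumption; the
two records used to check it are the opening and closing lines emitted by the generator
in Appendix C.

```c
#include <stddef.h>
#include <stdint.h>

/* Two's complement of the 8-bit sum of all record bytes that precede
 * the checksum field. */
uint8_t ihex_checksum(const uint8_t *bytes, size_t n)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum = (uint8_t)(sum + bytes[i]);
    return (uint8_t)(0x100u - sum);
}
```

For the record ":020000020000FC" the bytes 02 00 00 02 00 00 add up to 0x04, whose two's
complement is 0xFC; for ":00000001FF" the sum is 0x01 and the checksum is 0xFF.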

3.2 Memory Generation
Since the memory content is large, it seemed convenient to develop some custom tools to
ease the generation of the memory image.
First of all, a custom compiler was obtained by applying the patch 3k-cc to the
open-source compiler previously developed for the GLE-MiPS project. The following
instructions were added:

• STWIO / LDWIO - memory store and load operations

• RDCTL / WRCTL - control register read and write instructions

• RDINT / WRINT - parameters for control register functions

• BRET - restores the previous context.

Note that RDCTL and WRCTL require a 6-bit parameter (0x26 and 0x2E respectively)
in bits 16 - 11 of the encoding, and they have been inserted as pseudo-instructions
in the compiler; moreover, to maintain binary compatibility, immediate
instructions must be written as

STWIO R6, R5, 100

instead of the more commonly found STWIO R6, 100(R5).
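
The bit layout produced by the patched compiler can be illustrated with the two helper
functions below, which mirror the field order printed by the patch in Appendix D (two
5-bit register fields, a 16-bit immediate and a 6-bit opcode for STWIO; ten zero bits, a
5-bit register, the 6-bit RDINT parameter, a 5-bit control register number and the opcode
for RDCTL). The function names are assumptions and the sketch only covers these two
instructions.

```c
#include <stdint.h>

#define OP_STWIO  0x35u /* 110101, from rules_encoding */
#define OP_RTYPE  0x3Au /* 111010, shared by RDCTL and WRCTL */
#define OPX_RDINT 0x26u /* 100110, parameter of RDCTL */

/* STWIO Rb, Ra, imm: stores register Rb at address Ra + imm. */
uint32_t encode_stwio(uint32_t rb, uint32_t ra, uint32_t imm16)
{
    return ((ra & 0x1Fu) << 27) | ((rb & 0x1Fu) << 22)
         | ((imm16 & 0xFFFFu) << 6) | OP_STWIO;
}

/* RDCTL Rc, Rn: copies control register n into general register Rc. */
uint32_t encode_rdctl(uint32_t rc, uint32_t n)
{
    return ((rc & 0x1Fu) << 17) | (OPX_RDINT << 11)
         | ((n & 0x1Fu) << 6) | OP_RTYPE;
}
```

Under these assumptions, STWIO R6, R5, 100 is encoded as 0x29801935 and RDCTL R2, R9 as
0x0005327A.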


The compiler's output file is a stream of ASCII characters representing zeros and ones,
so a converter that transforms this raw format into the Intel .HEX format was written.
Reading the bit stream, it converts the encoded instructions into equivalent .HEX
records, computing the checksums and adding the opening and closing records.

3.3 Structure
The second on-chip memory is a small (1024-byte) memory which is seen (and addressable)
by the processors between addresses 0x4400 and 0x47FF. When generating the processor
with the SOPC Builder, the break vector was set to address 0x4400 so that, when a
break instruction is encountered, the execution flow jumps to this memory.
The break procedure stored in this memory performs the following operations:

• it saves the 31 general purpose registers (R0 need not be saved since it is
permanently set to 0) through the STWIO instruction; [31 instructions]

• it saves the 32 control registers, which requires two distinct instructions: RDCTL first
saves a control register into a general purpose register and STWIO subsequently saves
the latter in memory; [64 instructions]

• it loads the 32 control registers back to the register file, which needs values to be
loaded first from memory to general purpose registers by means of LDWIO and then
from general purpose to control registers with WRCTL; [64 instructions]

• it loads the 31 general purpose registers back to the register file through the LDWIO
instruction; [31 instructions]

• it performs BRET restoring the previous context. [1 instruction]

The size of the memory has been carefully tailored as follows. The procedure consists
of 191 4-byte instructions, which take 764 bytes overall. Registers are saved to the same
memory and require exactly (31 + 32) ∗ 4 = 252 bytes. As a consequence, the total amount
of needed memory is 764 + 252 = 1016 bytes, so it all fits within the 1024 bytes of available
memory. The starting memory location where the registers are saved has been computed
starting from the end of the memory: 18431 (0x47FF) - 251 = 18180 (0x4704).
The memory content, the test program, the .HEX generator and the compiler patch
are available in the appendices. The simple C program which is continuously executed
by the three active processors turns on the eighteen red LEDs one after the other in
a loop, by means of the PIO controller.
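
The memory budget worked out in this section can be double-checked with a few lines of C.
The macro and function names are assumptions; the figures are the ones stated in the text
(31 general purpose registers, 32 control registers, 4-byte instructions and a memory
mapped from 0x4400 to 0x47FF).

```c
#include <stdint.h>

#define MEM_BASE  0x4400u  /* first address of the second on-chip memory */
#define MEM_END   0x47FFu  /* last address of the second on-chip memory */
#define GP_REGS   31u      /* R1..R31 (R0 is not saved) */
#define CTL_REGS  32u

/* Save GP, save CTL (2 instr each), restore CTL (2 each), restore GP, BRET. */
uint32_t routine_instructions(void)
{
    return GP_REGS + 2u * CTL_REGS + 2u * CTL_REGS + GP_REGS + 1u;
}

uint32_t routine_bytes(void)   { return routine_instructions() * 4u; }
uint32_t save_area_bytes(void) { return (GP_REGS + CTL_REGS) * 4u; }
uint32_t memory_bytes(void)    { return MEM_END - MEM_BASE + 1u; }

/* First address of the register save area, counted back from MEM_END. */
uint32_t save_area_start(void)
{
    return MEM_END - (save_area_bytes() - 1u);
}
```

These functions give 191 instructions, 764 + 252 = 1016 bytes out of the 1024 available,
and a save area starting at 0x4704 (18180).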

Appendix A

Memory content

The listing is laid out in four columns and is read column by column: the first column
from top to bottom, then the second, and so on, ending with BRET.

STWIO R1, R0, 18176; RDCTL R2, R9; LDWIO R2, R0, 18404; LDWIO R2, R0, 18308;
STWIO R2, R0, 18180; RDCTL R3, R10; LDWIO R3, R0, 18408; LDWIO R3, R0, 18312;
STWIO R3, R0, 18184; RDCTL R4, R11; LDWIO R4, R0, 18412; LDWIO R4, R0, 18316;
STWIO R4, R0, 18188; RDCTL R5, R12; LDWIO R5, R0, 18416; LDWIO R5, R0, 18320;
STWIO R5, R0, 18192; RDCTL R6, R13; LDWIO R6, R0, 18420; LDWIO R6, R0, 18324;
STWIO R6, R0, 18196; RDCTL R7, R14; LDWIO R7, R0, 18424; LDWIO R7, R0, 18328;
STWIO R7, R0, 18200; RDCTL R8, R15; LDWIO R8, R0, 18428; LDWIO R8, R0, 18332;
STWIO R8, R0, 18204; STWIO R1, R0, 18336; WRCTL R24, R1; WRCTL R0, R1;
STWIO R9, R0, 18208; STWIO R2, R0, 18340; WRCTL R25, R2; WRCTL R1, R2;
STWIO R10, R0, 18212; STWIO R3, R0, 18344; WRCTL R26, R3; WRCTL R2, R3;
STWIO R11, R0, 18216; STWIO R4, R0, 18348; WRCTL R27, R4; WRCTL R3, R4;
STWIO R12, R0, 18220; STWIO R5, R0, 18352; WRCTL R28, R5; WRCTL R4, R5;
STWIO R13, R0, 18224; STWIO R6, R0, 18356; WRCTL R29, R6; WRCTL R5, R6;
STWIO R14, R0, 18228; STWIO R7, R0, 18360; WRCTL R30, R7; WRCTL R6, R7;
STWIO R15, R0, 18232; STWIO R8, R0, 18364; WRCTL R31, R8; WRCTL R7, R8;
STWIO R16, R0, 18236; RDCTL R1, R16; LDWIO R1, R0, 18368; LDWIO R1, R0, 18176;
STWIO R17, R0, 18240; RDCTL R2, R17; LDWIO R2, R0, 18372; LDWIO R2, R0, 18180;
STWIO R18, R0, 18244; RDCTL R3, R18; LDWIO R3, R0, 18376; LDWIO R3, R0, 18184;
STWIO R19, R0, 18248; RDCTL R4, R19; LDWIO R4, R0, 18380; LDWIO R4, R0, 18188;
STWIO R20, R0, 18252; RDCTL R5, R20; LDWIO R5, R0, 18384; LDWIO R5, R0, 18192;
STWIO R21, R0, 18256; RDCTL R6, R21; LDWIO R6, R0, 18388; LDWIO R6, R0, 18196;
STWIO R22, R0, 18260; RDCTL R7, R22; LDWIO R7, R0, 18392; LDWIO R7, R0, 18200;
STWIO R23, R0, 18264; RDCTL R8, R23; LDWIO R8, R0, 18396; LDWIO R8, R0, 18204;
STWIO R24, R0, 18268; STWIO R1, R0, 18368; WRCTL R16, R1; LDWIO R9, R0, 18208;
STWIO R25, R0, 18272; STWIO R2, R0, 18372; WRCTL R17, R2; LDWIO R10, R0, 18212;
STWIO R26, R0, 18280; STWIO R3, R0, 18376; WRCTL R18, R3; LDWIO R11, R0, 18216;
STWIO R27, R0, 18284; STWIO R4, R0, 18380; WRCTL R19, R4; LDWIO R12, R0, 18220;
STWIO R28, R0, 18288; STWIO R5, R0, 18384; WRCTL R20, R5; LDWIO R13, R0, 18224;
STWIO R29, R0, 18292; STWIO R6, R0, 18388; WRCTL R21, R6; LDWIO R14, R0, 18228;
STWIO R30, R0, 18296; STWIO R7, R0, 18392; WRCTL R22, R7; LDWIO R15, R0, 18232;
STWIO R31, R0, 18300; STWIO R8, R0, 18396; WRCTL R23, R8; LDWIO R16, R0, 18236;
RDCTL R1, R0; RDCTL R1, R24; LDWIO R1, R0, 18336; LDWIO R17, R0, 18240;
RDCTL R2, R1; RDCTL R2, R25; LDWIO R2, R0, 18340; LDWIO R18, R0, 18244;
RDCTL R3, R2; RDCTL R3, R26; LDWIO R3, R0, 18344; LDWIO R19, R0, 18248;
RDCTL R4, R3; RDCTL R4, R27; LDWIO R4, R0, 18348; LDWIO R20, R0, 18252;
RDCTL R5, R4; RDCTL R5, R28; LDWIO R5, R0, 18352; LDWIO R21, R0, 18256;
RDCTL R6, R5; RDCTL R6, R29; LDWIO R6, R0, 18356; LDWIO R22, R0, 18260;
RDCTL R7, R6; RDCTL R7, R30; LDWIO R7, R0, 18360; LDWIO R23, R0, 18264;
RDCTL R8, R7; RDCTL R8, R31; LDWIO R8, R0, 18364; LDWIO R24, R0, 18268;
STWIO R1, R0, 18304; STWIO R1, R0, 18400; WRCTL R8, R1; LDWIO R25, R0, 18272;
STWIO R2, R0, 18308; STWIO R2, R0, 18404; WRCTL R9, R2; LDWIO R26, R0, 18280;
STWIO R3, R0, 18312; STWIO R3, R0, 18408; WRCTL R10, R3; LDWIO R27, R0, 18284;
STWIO R4, R0, 18316; STWIO R4, R0, 18412; WRCTL R11, R4; LDWIO R28, R0, 18288;
STWIO R5, R0, 18320; STWIO R5, R0, 18416; WRCTL R12, R5; LDWIO R29, R0, 18292;
STWIO R6, R0, 18324; STWIO R6, R0, 18420; WRCTL R13, R6; LDWIO R30, R0, 18296;
STWIO R7, R0, 18328; STWIO R7, R0, 18424; WRCTL R14, R7; LDWIO R31, R0, 18300;
STWIO R8, R0, 18332; STWIO R8, R0, 18428; WRCTL R15, R8; BRET;
RDCTL R1, R8; LDWIO R1, R0, 18400; LDWIO R1, R0, 18304;

Appendix B

Test Program

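#include <system.h>
#include <altera_avalon_pio_regs.h>

int main () {
    int i;
    int j;
    int k;

    while (1) {
        j = 1;                          /* start from the first LED */
        for (i = 0; i < 18; i++){
            /* light the LED selected by the single bit set in j */
            IOWR_ALTERA_AVALON_PIO_DATA (PIO_0_BASE, j);
            j = j << 1;                 /* move on to the next LED */
            for (k = 0; k < 50000; k++);    /* busy-wait delay */
        }
    }
}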

Appendix C

HEX generator

#include <stdio.h>
#include <string.h>

/* Converts a stream of ASCII '0'/'1' characters (8 per byte) read from
   inputfile into Intel HEX records written to stdout. */
int main (int argc, const char * argv[]) {
    char buffer[32];
    int i, j, e;
    unsigned char num, parity;
    FILE *fin;

    /* check the arguments before using argv[1] */
    if (argc != 2) {
        fprintf (stderr, "Usage: %s inputfile\n", argv[0]);
        return -1;
    }

    fin = fopen (argv[1], "r");
    if (fin == NULL) {
        perror (argv[1]);
        return -1;
    }

    /* first line: extended segment address record */
    fprintf (stdout, ":020000020000FC\n");

    /* 32 records of 32 data bytes each; the address advances by 8 words */
    for (e = 0; e < 256; e += 8) {
        parity = 0;
        fprintf (stdout, ":20%04X00", e);

        for (j = 0; j < 32; j++) {
            num = 0;
            memset (buffer, 0, sizeof (buffer));

            /* read the next 8 characters, i.e. one byte, MSB first */
            fgets (buffer, 9, fin);
            for (i = 0; i < 8; i++) {
                num = num << 1;
                if (buffer[i] == '1')
                    num = num | 1;
            }
            fprintf (stdout, "%02X", num);
            parity += num;
        }
        /* add the byte count (32), both address bytes and the record type (0) */
        parity = parity + (e >> 8) + (e & 0xFF) + 32;
        /* 2's complement */
        parity = ~parity;
        parity += 1;
        fprintf (stdout, "%02X\n", parity);
    }

    /* closing line: end-of-file record */
    fprintf (stdout, ":00000001FF\n");
    fclose (fin);
    fprintf (stderr, "Done!\n");
    return 0;
}

Appendix D

Compiler patch

Index: scanner.jflex
===================================================================
--- scanner.jflex (revision 142)
+++ scanner.jflex (working copy)
@@ -46,6 +46,12 @@

/*symbols*/
+BRET { if (Main.verbosity > 2) System.out.println("-----# Scanner: BRET operation
detected"); return new Symbol(sym.BRET, yyline, yycolumn, new String(yytext()));}
+STWIO { if (Main.verbosity > 2) System.out.println("-----# Scanner: STWIO operation
detected"); return new Symbol(sym.STWIO, yyline, yycolumn, new String(yytext()));}
+LDWIO { if (Main.verbosity > 2) System.out.println("-----# Scanner: LDWIO operation
detected"); return new Symbol(sym.LDWIO, yyline, yycolumn, new String(yytext()));}
+RDCTL { if (Main.verbosity > 2) System.out.println("-----# Scanner: RDCTL operation
detected"); return new Symbol(sym.RDCTL, yyline, yycolumn, new String(yytext()));}
+WRCTL { if (Main.verbosity > 2) System.out.println("-----# Scanner: WRCTL operation
detected"); return new Symbol(sym.WRCTL, yyline, yycolumn, new String(yytext()));}
+
NOP { if (Main.verbosity > 2) System.out.println("-----# Scanner: NOP operation
detected"); return new Symbol(sym.OP_NOP, yyline, yycolumn, new String(yytext()));}
MOV { if (Main.verbosity > 2) System.out.println("-----# Scanner: MOV operation
detected"); return new Symbol(sym.OP_MOV, yyline, yycolumn, new String(yytext())); }
ADD(U|S) { if (Main.verbosity > 2) System.out.println("-----# Scanner: ADD operation
detected");
Index: rules_encoding
===================================================================
--- rules_encoding (revision 142)
+++ rules_encoding (working copy)
@@ -51,4 +51,11 @@
FUNC_RR:001101
FUNC_AND:001110
FUNC_OR:001111
-FUNC_XOR:010000
\ No newline at end of file
+FUNC_XOR:010000
+STWIO:110101
+LDWIO:110111
+RDCTL:111010
+RDINT:100110
+WRCTL:111010
+WRINT:101110
+BRET:11110000000000000100100000111010
Index: parser.cup
===================================================================

--- parser.cup (revision 142)
+++ parser.cup (working copy)
@@ -344,7 +344,7 @@

public void unsignedOperation(String op, int arg1, int arg2, int arg3, int line){
long res = 0, mask, auxlg1 = 0, auxlg2 = 0;
- // int reshigh = 0, reslow = 0;
+
if(flag == 0){
auxlg1 = (long)r[arg2];
auxlg2 = (long)arg3;
@@ -370,9 +370,6 @@
res = auxlg1 - auxlg2;
if(op.equals("/"))
res = auxlg1 - auxlg2;
- // reshigh = (int) res >> 32;
- // reslow = (int)res;
- // System.out.println(reshigh + " " + reslow); //debug, shows the values
if(res >= 4294967296L){
//System.out.println("Overflow Error at line: " + line);
parser.report_error("overflow error at line: " + line,null);
@@ -380,7 +377,7 @@
else{
res = res & mask;
r[arg1] = (int)res;
- System.out.println("Result: R" + arg1 + " = " + res);
+ System.out.println("Result: R" + arg1 + " = " + r[arg1]);
}
}
:}
@@ -392,8 +389,8 @@
action_obj.openReadFile("rules_encoding");
:}

-
-terminal OP_NOP, OP_MOV; //special op
+terminal BRET, STWIO, LDWIO, RDCTL, WRCTL; //break op
+terminal OP_NOP, OP_MOV; //special op
terminal OP_ADDU, OP_ADDS, OP_SUBU, OP_SUBS, OP_MULU, OP_MULS, OP_DIVU, OP_DIVS; //arith op
terminal OP_AND, OP_OR, OP_XOR; //logical op
terminal OP_LD32, OP_LD16, OP_LD8, OP_ST32, OP_ST16, OP_ST8; //memory op
@@ -427,6 +424,36 @@

line ::= op:a arguments:b {:


if (verbosity > 1) System.out.println("---> Parser: arguments parsed");
+ if (a.equals("BRET")){
+ ps.print(hm.get(new String("BRET")));
+ }
+ if (a.equals("STWIO")){
+ printBin(b[1] , 5);
+ printBin(b[0] , 5);
+ printBin(b[2] , 16);
+ ps.print(hm.get(a)); //prints opcode
+ }
+ if (a.equals("LDWIO")){
+ printBin(b[1] , 5);
+ printBin(b[0] , 5);
+ printBin(b[2] , 16);
+ ps.print(hm.get(a)); //prints opcode
+ }
+ if (a.equals("RDCTL")){

+ Nzeros(10); //prints 10 zeros
+ printBin(b[0] , 5);
+ ps.print(hm.get(new String("RDINT")));
+ printBin(b[1] , 5);
+ ps.print(hm.get(a)); //prints opcode
+ }
+ if (a.equals("WRCTL")){
+ Nzeros(10); //prints 10 zeros
+ printBin(b[0] , 5);
+ ps.print(hm.get(new String("WRINT")));
+ printBin(b[1] , 5);
+ ps.print(hm.get(a)); //prints opcode
+ }
+
if (a.equals("NOP")){
if (flag == 6){
if(emulation >= 1){
@@ -1422,23 +1449,6 @@
}
}
if(a.equals("LLS")){
-// if(flag == 0){
-// if(emulation >= 1){
-// System.out.println("Operation: Immediate to Register Logical Left Shift");
-// this.printResult(0, "LLS", b[0].intValue(), b[1].intValue(), b[2].intValue());
-// r[b[0].intValue()] = r[b[1].intValue()] << b[2].intValue();
-// pc = pc +4;
-// this.printResult(1, "LLS", b[0].intValue(), b[1].intValue(), b[2].intValue());
-// }
-// // no alu_func
-// // LLS R R I
-// ps.print(hm.get(a)); // prints opcode
-// printBin(b[1] , 5); // prints Rs
-// printBin(b[0] , 5); // prints Rt
-// printBin(b[2] , 16); // prints Immediate
-// ps.print("\n"); // only debugging
-//
-// }
if(flag == 1){
if(emulation >= 1){
System.out.println("Operation: Register to Register Logical Left Shift");
@@ -1457,28 +1467,11 @@
ps.print(hm.get("FUNC_" + a)); // prints function #
ps.print("\n"); // only debugging
}
-// if(flag != 0 && flag != 1){
if(flag != 1){
parser.report_error("wrong argument format for " + a.toString(),null);
}
}
if(a.equals("LRS")){
-// if(flag == 0){
-// if(emulation >= 1){
-// System.out.println("Operation: Immediate to Register Logical Right Shift");
-// this.printResult(0, "LRS", b[0].intValue(), b[1].intValue(), b[2].intValue());
-// r[b[0].intValue()] = r[b[1].intValue()] >> b[2].intValue();
-// pc = pc +4;
-// this.printResult(1, "LRS", b[0].intValue(), b[1].intValue(), b[2].intValue());
-// }
-// // no alu_func
-// // LRS R R I

-// ps.print(hm.get(a)); // prints opcode
-// printBin(b[1] , 5); // prints Rs
-// printBin(b[0] , 5); // prints Rt
-// printBin(b[2] , 16); // prints Immediate
-// ps.print("\n"); // only debugging
-// }
if(flag == 1){
if(emulation >= 1){
System.out.println("Operation: Register to Register Logical Right Shift");
@@ -1497,28 +1490,11 @@
ps.print(hm.get("FUNC_" + a)); // prints function #
ps.print("\n"); // only debugging
}
-// if(flag != 0 && flag != 1){
if(flag != 1){
parser.report_error("wrong argument format for " + a.toString(),null);
}
}
if(a.equals("ALS")){
-// if(flag == 0){
-// if(emulation >= 1){
-// System.out.println("Operation: Immediate to Register Arithmetic Left Shift");
-// this.printResult(0, "ALS", b[0].intValue(), b[1].intValue(), b[2].intValue());
-// r[b[0].intValue()] = r[b[1].intValue()] << b[2].intValue();
-// pc = pc +4;
-// this.printResult(1, "ALS", b[0].intValue(), b[1].intValue(), b[2].intValue());
-// }
-// // no alu_func
-// // ALS R R I
-// ps.print(hm.get(a)); // prints opcode
-// printBin(b[1] , 5); // prints Rs
-// printBin(b[0] , 5); // prints Rt
-// printBin(b[2] , 16); // prints Immediate
-// ps.print("\n"); // only debugging
-// }
if(flag == 1){
if(emulation >= 1){
System.out.println("Operation: Register to Register Arithmetic Left Shift");
@@ -1537,28 +1513,11 @@
ps.print(hm.get("FUNC_" + a)); // prints function #
ps.print("\n"); // only debugging
}
-// if(flag != 0 && flag != 1){
if(flag != 1){
parser.report_error("wrong argument format for " + a.toString(),null);
}
}
if(a.equals("ARS")){
-// if(flag == 0){
-// if(emulation >= 1){
-// System.out.println("Operation: Immediate to Register Arithmetic Right Shift");
-// this.printResult(0, "ARS", b[0].intValue(), b[1].intValue(), b[2].intValue());
-// r[b[0].intValue()] = r[b[1].intValue()] >> b[2].intValue();
-// pc = pc +4;
-// this.printResult(1, "ARS", b[0].intValue(), b[1].intValue(), b[2].intValue());
-// }
-// // no alu_func
-// // ARS R R I
-// ps.print(hm.get(a)); // prints opcode
-// printBin(b[1] , 5); // prints Rs
-// printBin(b[0] , 5); // prints Rt

-// printBin(b[2] , 16); // prints Immediate
-// ps.print("\n"); // only debugging
-// }
if(flag == 1){
if(emulation >= 1){
System.out.println("Operation: Register to Register Arithmetic Right Shift");
@@ -1577,7 +1536,6 @@
ps.print(hm.get("FUNC_" + a)); // prints function #
ps.print("\n"); // only debugging
}
-// if(flag != 0 && flag != 1){
if(flag != 1){
parser.report_error("wrong argument format for " + a.toString(),null);
}
@@ -1673,8 +1631,18 @@

-op ::= OP_NOP:a {: if (verbosity > 1) System.out.println


("---> Parser: parsing NOP operation");
+op ::= BRET:a {: if (verbosity > 1) System.out.println
("---> Parser: parsing STWIO operation");
RESULT = new String(a.toString()); :}
+ | STWIO:a {: if (verbosity > 1) System.out.println
("---> Parser: parsing STWIO operation");
+ RESULT = new String(a.toString()); :}
+ | LDWIO:a {: if (verbosity > 1) System.out.println
("---> Parser: parsing LDWIO operation");
+ RESULT = new String(a.toString()); :}
+ | RDCTL:a {: if (verbosity > 1) System.out.println
("---> Parser: parsing RDCTL operation");
+ RESULT = new String(a.toString()); :}
+ | WRCTL:a {: if (verbosity > 1) System.out.println
("---> Parser: parsing WRCTL operation");
+ RESULT = new String(a.toString()); :}
+ | OP_NOP:a {: if (verbosity > 1) System.out.println
("---> Parser: parsing NOP operation");
+ RESULT = new String(a.toString()); :}
| OP_MOV:a {: if (verbosity > 1) System.out.println
("---> Parser: parsing MOV operation");
RESULT = new String(a.toString()); :}
| OP_ADDU:a {: if (verbosity > 1) System.out.println
("---> Parser: parsing ADD Unsigned operation");
