
TWO PASS ASSEMBLY in MIPS

The following outline describes the major characteristics of a two-pass assembler. LC stands for
Load Counter, the assembly-time equivalent of the PC (Program Counter) used during
execution.
Pass 0: Expand Macros (if present). This can be done either as a separate pass or on the fly,
during the first pass. A MIPS program can have several kinds of macros. Many actual
instructions are assembled as macros if the programmer uses an immediate that will not fit
in 16 bits, or an operand that the instruction does not accept. Two examples of the latter are an
immediate where none is allowed, and a branch destination too far away for the 16-bit offset to
reach. All pseudo-instructions are converted, as macros, into actual instructions. Finally, the
programmer can define explicit macros and use them in the program. Sadly, SPIM does not
support this last kind.
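As a rough illustration of the first kind of macro, the Python sketch below expands a load-immediate pseudo-instruction whose value may not fit in 16 bits. The function name expand_li is hypothetical, and the lui/ori pair through $at is only one plausible expansion; a real assembler may pick a different sequence.

```python
def expand_li(rd, imm):
    """Expand the pseudo-instruction `li rd, imm` into real instructions.

    Hypothetical helper: returns the expansion as a list of text lines.
    """
    imm &= 0xFFFFFFFF                  # treat the value as a 32-bit pattern
    if imm <= 0xFFFF:
        # Fits in an unsigned 16-bit immediate: one real instruction.
        return [f"ori {rd},$0,{imm:#x}"]
    hi, lo = imm >> 16, imm & 0xFFFF
    if lo == 0:
        # Only the upper half is non-zero: lui alone is enough.
        return [f"lui {rd},{hi:#x}"]
    # General case: build the upper half in $at, then OR in the lower half.
    return [f"lui $at,{hi:#x}", f"ori {rd},$at,{lo:#x}"]


print(expand_li("$t0", 5))            # one instruction,  LC += 4
print(expand_li("$t0", 0x12345678))   # two instructions, LC += 8
```

The point for the assembler is that the length of the returned list, times 4, is the amount by which LC must advance for that source line.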
Pass 1: Build Symbol Table (can detect illegal opcodes or pseudo-ops, multiply defined labels)
1. Set LC to 0 (or some other initial value)
2. Scan the program and:
A. Place all label declarations in the symbol table. Each will be defined with the current
LC value. Process all assembler directives.
B. Place all references to undeclared labels in the symbol table without value definition.
C. Increment LC as described by data allocation directives:
    a:      .word 3            # a entered with current value, then LC += 4
    lf:     .byte 10           # lf entered with current value, then LC += 4
                               # (1 byte, padded so the next .word is aligned)
    array:  .word 3,5,6        # array entered with current value, then LC += 12
    str:    .asciiz "string"   # str entered with current value, then LC += 7

D. Simulate code generation by incrementing LC by the appropriate amount for each
instruction. Each instruction is of course 4 bytes in MIPS, but pseudo-ops can
generate multiple instructions. Some examples of this are shown below.
              la    $t0,a                 # pseudo op
                lui   $at,%hi(a)          # LC += 4
                addiu $t0,$at,%lo(a)      # LC += 4
              la    $t2,array             # pseudo op
                lui   $at,%hi(array)      # LC += 4
                addiu $t2,$at,%lo(array)  # LC += 4
              b     endLoop               # LC += 4. enter endLoop w/o definition
    Loop:     lw    $t1,8($t2)            # Loop defined, then LC += 4
                                          # Loads array[2] in $t1
              add   $t3,$t1,$t0           # LC += 4
              addi  $t4,$t4,-4            # LC += 4
    endLoop:  ble   $t1,$0,Loop           # pseudo op. Define endLoop
                slt   $at,$0,$t1          # LC += 4
                beq   $at,$0,Loop         # LC += 4
3. At the end of the first pass over the program, all labels should be defined. Forward
references are the primary reason why two passes are needed. A forward reference is a
reference to a label not yet defined. The branch to endLoop above is a forward reference.
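The first pass described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: each source line is already split into a hypothetical (label, op, operands) tuple, data and code share one flat LC (a real assembler keeps separate segments), everything except .byte and .asciiz is aligned to a 4-byte boundary, and la and ble are the only pseudo-ops, each assumed to expand to two instructions.

```python
PSEUDO_SIZES = {"la": 8, "ble": 8}   # assumed: each expands to two instructions


def size_of(op, args):
    """Number of bytes by which LC advances for one source line."""
    if op == ".word":
        return 4 * len(args)
    if op == ".byte":
        return len(args)
    if op == ".asciiz":
        return len(args[0]) + 1          # characters plus the terminating NUL
    return PSEUDO_SIZES.get(op, 4)       # ordinary instructions are 4 bytes


def pass1(lines, start=0):
    """Build the symbol table; None marks a label referenced but not yet defined."""
    lc = start
    symtab = {}
    for label, op, args in lines:
        if op not in (".byte", ".asciiz"):
            lc = (lc + 3) & ~3           # assumed alignment rule
        if label:
            if symtab.get(label) is not None:
                raise ValueError(f"multiply defined label: {label}")
            symtab[label] = lc           # define with the current LC value
        if not op.startswith("."):
            for arg in args:
                if arg.isidentifier():   # looks like a label, not $reg or number
                    symtab.setdefault(arg, None)
        lc += size_of(op, args)
    return symtab, lc


program = [
    ("a",       ".word",   ["3"]),
    ("lf",      ".byte",   ["10"]),
    ("array",   ".word",   ["3", "5", "6"]),
    ("str",     ".asciiz", ["string"]),
    (None,      "la",      ["$t0", "a"]),
    (None,      "la",      ["$t2", "array"]),
    (None,      "b",       ["endLoop"]),          # forward reference
    ("Loop",    "lw",      ["$t1", "8($t2)"]),
    (None,      "add",     ["$t3", "$t1", "$t0"]),
    (None,      "addi",    ["$t4", "$t4", "-4"]),
    ("endLoop", "ble",     ["$t1", "$0", "Loop"]),
]

symtab, end_lc = pass1(program)
undefined = [name for name, addr in symtab.items() if addr is None]
print(symtab, end_lc, undefined)
```

With this input, endLoop is entered without a value when the b instruction is scanned and receives its value four lines later, so the list of undefined labels is empty once the pass finishes.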
Pass 2: Generate Code (can detect unresolved references)
1. Set LC back to 0
2. Scan the program again and generate the machine code file that was simulated in the first
pass. As all label references should now be defined, it is now possible to produce the
machine code.
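To make the second pass concrete, here is a small Python sketch that encodes one beq instruction once the symbol table is complete. The names REG, lookup and encode_beq are hypothetical, the register table is a small subset, and the addresses in the example are made up for illustration. The field layout (opcode 0x04, then rs, rt, and a 16-bit word offset measured from the instruction after the branch) is the standard MIPS I-type format.

```python
REG = {"$0": 0, "$at": 1, "$t0": 8, "$t1": 9, "$t2": 10}   # subset only


def lookup(symtab, label):
    """Return the address of a label, or report an unresolved reference."""
    addr = symtab.get(label)
    if addr is None:
        raise KeyError(f"unresolved reference: {label}")
    return addr


def encode_beq(rs, rt, label, lc, symtab):
    """Encode `beq rs, rt, label` for an instruction located at address lc."""
    target = lookup(symtab, label)
    offset = (target - (lc + 4)) >> 2        # word offset from the next instruction
    if not -0x8000 <= offset <= 0x7FFF:
        raise ValueError(f"branch to {label} is out of range")
    return (0x04 << 26) | (REG[rs] << 21) | (REG[rt] << 16) | (offset & 0xFFFF)


# Made-up addresses: the branch sits at 36 and Loop was defined at 20 in pass 1.
word = encode_beq("$at", "$0", "Loop", 36, {"Loop": 20})
print(f"{word:#010x}")
```

A backward branch such as this one yields a negative offset, which is stored in two's complement in the low 16 bits of the instruction word.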
Unresolved External References
In a more complex assembler, unresolved external references may still exist. These
references would be resolved during the LINK phase, producing an executable file. This
link phase allows files to be assembled or compiled separately and linked together later.
It is also possible to link code sections that were originally written in different
languages. Some of the program components could have been written in assembly while
others were written in a high level language (HLL). In such cases, the parts must
generally be compiled/assembled separately, and then linked. Some compilers (most
likely C compilers) have also been known to allow assembly code to be located within
a HLL program.
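The idea of the link phase can be sketched as follows. This is a toy model, not any real object file format: each separately assembled module is assumed to be a dict with a size, a table of symbols it defines (as offsets from its own start), and a list of external names it references. The link function and all module contents are hypothetical.

```python
def link(modules):
    """Place modules one after another and resolve external references.

    Returns the global symbol table and a sorted list of names that
    no module defines.
    """
    base = 0
    global_syms = {}
    for mod in modules:
        for name, offset in mod["defs"].items():
            if name in global_syms:
                raise ValueError(f"multiply defined symbol: {name}")
            global_syms[name] = base + offset    # relocate to the final address
        base += mod["size"]
    wanted = {name for mod in modules for name in mod["refs"]}
    missing = sorted(wanted - set(global_syms))
    return global_syms, missing


main_obj = {"size": 40, "defs": {"main": 0}, "refs": ["helper"]}
helper_obj = {"size": 16, "defs": {"helper": 8}, "refs": []}

print(link([main_obj, helper_obj]))   # helper is resolved
print(link([main_obj]))               # helper stays unresolved
```

Whether a module started life as assembly or as HLL source does not matter at this stage, since the linker only sees the defined and referenced symbols.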
Macro Preprocessor
A macro preprocessor, if present, may be activated before or during assembly.
Regardless, macros are defined and expanded before machine code is generated from
them. The assembly code generated by a macro preprocessor is assembled on the fly
along with the regular code. MIPS has a macro preprocessor, but it is not
implemented in the SPIM simulator. (RATS, I love writing macros!) Interestingly
enough, the C language, generally considered to be a high level language, has a macro
preprocessor. Actually, C has both high and low level characteristics, as it was designed
to replace assembly language as the medium for systems programming. In my opinion, C
was supplied with a macro preprocessor because the assembly languages it was designed to
supplant usually have one.
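A programmer-defined macro amounts to textual substitution before assembly. The Python sketch below shows the idea under stated assumptions: the macro table format, the %r parameter spelling, and the push macro itself are all hypothetical and do not reproduce the syntax of any particular MIPS assembler. Nested macro calls are not handled.

```python
def expand_macros(lines, macros):
    """Replace each macro call with its body, substituting the arguments.

    macros maps a macro name to (parameter names, body lines).
    """
    out = []
    for line in lines:
        parts = line.split(None, 1)
        name = parts[0] if parts else ""
        if name not in macros:
            out.append(line)                 # ordinary line, copied unchanged
            continue
        params, body = macros[name]
        args = [a.strip() for a in parts[1].split(",")] if len(parts) > 1 else []
        if len(args) != len(params):
            raise ValueError(f"wrong number of arguments for macro {name}")
        for text in body:
            for param, arg in zip(params, args):
                text = text.replace(param, arg)
            out.append(text)
    return out


macros = {"push": (["%r"], ["addi $sp,$sp,-4", "sw %r,0($sp)"])}
source = ["push $t0", "add $t1,$t1,$t0"]
print(expand_macros(source, macros))
```

The expanded lines are then handed to the assembler exactly as if the programmer had typed them.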
Absolute Memory Addresses versus Relative Memory Addresses
In MIPS, the memory addresses named by load and store instructions are absolute, but the
assembler must sometimes assemble the 32-bit address using two instructions. Branch and
conditional branch instructions store an offset relative to the current value of the PC in the
instruction if possible. Apparently, the MIPS assembler will generate other instructions if the
branch distance is too far.
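The arithmetic behind both cases can be worked through in Python. The helper names hi_lo and branch_fits are hypothetical. The sketch assumes the lui/addiu pairing: addiu sign-extends its 16-bit immediate, so the upper half is bumped by one whenever the lower half has its top bit set. It also assumes word-aligned addresses and a branch offset counted in words from the instruction after the branch.

```python
def hi_lo(addr):
    """Split a 32-bit address into halves suitable for lui followed by addiu."""
    lo = addr & 0xFFFF
    if lo >= 0x8000:
        lo -= 0x10000                    # value addiu will see after sign extension
    hi = ((addr - lo) >> 16) & 0xFFFF    # compensates for a negative lower half
    return hi, lo


def branch_fits(lc, target):
    """True if a branch at address lc can reach target with a 16-bit offset."""
    offset = (target - (lc + 4)) // 4
    return -0x8000 <= offset <= 0x7FFF


for addr in (0x10010004, 0x1000FFFC):
    hi, lo = hi_lo(addr)
    # Recombining the halves gives back the original address.
    print(hex(addr), hex(hi), lo, hex(((hi << 16) + lo) & 0xFFFFFFFF))

print(branch_fits(0, 4 + 4 * 0x7FFF))    # farthest reachable forward target
print(branch_fits(0, 4 + 4 * 0x8000))    # one word too far
```

When branch_fits is false, the assembler has to fall back on some other instruction sequence, which is the case the paragraph above alludes to.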
Optimization
MIPS programs can run faster if instructions are reordered to take advantage of the
architectural features, in particular the delayed branch, load delay slots, and the efficient
execution of blocks of arithmetic instructions.
