
Advanced Operations

A pointer is an address that is stored in memory.

Pointers are incredibly useful. We can calculate an address, save it in a pointer, and use it later.
Every object in Java, and every dynamically allocated object in C++, is accessed through a pointer.

Addressing mode     General meaning

Register            The value is in a register. On most machines, the register must be
                    specified in the instruction.

Memory (or          The value is in memory. The address of the value is encoded in the
Memory Direct)      instruction.

Immediate           The value is a constant which is encoded into the instruction.

Base                The address of the value is in a register (the "base" register). The
                    instruction specifies which register.

Absolute            The address to jump to is encoded in the instruction.

Basic MIPS Instructions


MIPS, an acronym for Microprocessor without Interlocked Pipelined Stages, was a very successful
microprocessor in the 1990s, thanks to its relatively simple (and regular) instruction set and its
low cost and low power requirements. MIPS was probably the chip that settled the longstanding
competition between RISC chips and the then-dominant CISC chips. Nearly all new microprocessor
designs today are RISC. The main surviving CISC family in wide use is the Intel x86 family.

The MIPS instruction set is easier to program, use, and implement for several reasons:
- every instruction is a standard length: for our family, 32 bits.
- there is a large number (32) of uniform registers. For 31 of these registers, their use is
  restricted only by convention. (In practice, only 26 registers are generally available.)
- there are very few different instruction encodings
- there are no condition codes
- it can function in either big- or little-endian mode
- the procedure calling convention is lightweight, enabling simple leaf procedures to be
  called with very little overhead
- the assembly language is easily optimized, and its regularity (and large number of
  registers) makes "smart" compilers easier to write. (In fact, early MIPS chips relied on
  the compiler to generate code in a special way to avoid the necessity of pipeline
  interlocks, hence the acronym.)

MIPS parameters

All MIPS instructions are 32 bits. The MIPS assembler allows pseudoinstructions. All operations
are performed in registers on 32-bit quantities. The only instructions that access memory are loads
and stores, and these are the only instructions that indicate data size. Loads and stores must be
properly aligned. Thus, a load of a four-byte quantity must be done from an address that is a
multiple of four. Loads of small sizes (halfword or byte) have signed and unsigned versions.

Register classes

As indicated, MIPS has a convention for register usage. Registers are divided into register
classes:
- special-use. These registers are reserved for a special purpose. This includes $0, $at,
  $v0/$v1 (function result), the stack, frame, and global pointer registers, the return address
  register, and the two kernel registers (10 total). Except for the kernel registers, the stack
  pointer, and $0, you may use these registers for other purposes if you obey the rules.
- $a0-$a3 are argument registers.
- $s0-$s7 are "saved" registers. These are usually used to "hold" a value for a long period
  of time. More on these later.
- $t0-$t9 are "temporary" registers. Use these registers for calculations.

Basic instruction encoding


MIPS instructions are divided into fields, just like our Simple Machine. The fields indicate the
operation and the operands:
- the operation is split into two fields, the opcode field and, in some instructions, the
  function field, each 6 bits wide. The opcode field comprises the most-significant bits of
  the instruction. The function field comprises the least-significant bits. This design
  "moves" the indication of the operation between the two fields based on which instruction
  encoding is used, as we will see.
- every instruction has space for two registers, each encoded in 5 bits.
- some instructions have space for a third register, a shift field, and the optional function
  field. Other instructions use this space for a 16-bit constant.

Arithmetic Instructions

Arithmetic instructions are encoded as R-type instructions. They have the following parts:
- an opcode field that basically says the opcode is in the function field instead.
- two source registers (rs and rt)
- a destination register (rd)
- an optional 5-bit shift amount (for shift instructions)

Load/Store Operations

All operations in MIPS code are done in registers. Thus, before an operation can be performed,
data must be loaded into registers. After the operation is complete, the data must be stored from
the register(s) to its permanent location in memory. Load and store operations must deal with the
three different sizes available: bytes, halfwords (shorts) and words (ints). The size is indicated
by the size letter in the load or store operation:
lw - load 32 bits
lh - load 16 bits
lb - load 8 bits
For the store instructions, storing smaller quantities (bytes or halfwords) ignores the
most-significant bits of the register. When loading smaller quantities, something must be done
about the other bits in the register.
If the datum being loaded is a signed quantity, the small datum is sign-extended to fill the entire
register. This is the default for the load instructions; thus, lb is a signed load.
If the datum being loaded is an unsigned quantity, the most-significant bits of the register should
be zeroed. These load instructions have the additional character u for unsigned. Thus, lbu is
load byte unsigned, and the loaded value is always treated as unsigned.
If you load the upper (most-significant) part of the register (with lui, as we shall see), the
lower 16 bits are zeroed.
The load and store instructions have a register (the destination for load instructions, and the
source for store instructions) followed by a memory specification. Without getting hung up on
the memory specifier right now, let's look at how loading a datum affects the register. If the
datum being loaded has the value -2, here is what the register looks like after the load instruction:

instruction          lh reg,mem    lhu reg,mem   lb reg,mem    lbu reg,mem

register contents    0xfffffffe    0x0000fffe    0xfffffffe    0x000000fe
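
The register contents in the table can be reproduced with a small C model. This is only a sketch: the function names are made up for illustration and are not part of MIPS or MARS. Each function takes the raw bits found in memory and returns what a 32-bit register would hold after the corresponding load.

```c
#include <stdint.h>

/* Hypothetical models of the four small loads (names are illustrative only). */

/* lb: copy the byte's sign bit (bit 7) into the upper 24 bits. */
uint32_t model_lb(uint8_t mem) {
    return (mem & 0x80u) ? (0xFFFFFF00u | mem) : (uint32_t)mem;
}

/* lbu: upper 24 bits are zeroed. */
uint32_t model_lbu(uint8_t mem) {
    return (uint32_t)mem;
}

/* lh: copy the halfword's sign bit (bit 15) into the upper 16 bits. */
uint32_t model_lh(uint16_t mem) {
    return (mem & 0x8000u) ? (0xFFFF0000u | mem) : (uint32_t)mem;
}

/* lhu: upper 16 bits are zeroed. */
uint32_t model_lhu(uint16_t mem) {
    return (uint32_t)mem;
}
```

The value -2 is stored as 0xFFFE in a halfword and as 0xFE in a byte, so those are the inputs that correspond to the table.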

The Memory Operand


There is a single way to address memory in instructions, by using the addressing mode base +
displacement. This means a register holds a memory address, and the effective memory address
(the one loaded from or stored to) is that address plus a constant. The constant, which is a 16-bit
signed constant, is specified in the instruction.

For example, if you want to load a 32-bit word from an address contained in register $t0,
placing the word in register $t1, the instruction would be

lw $t1,0($t0)
Thus, 0 is added to the contents of $t0 to get the address of the 32-bit quantity to load. If $t0 is
the base of an array, and the element of the array you want has a constant index like 1, you could
access the element (array[1]) by adjusting the constant. But the constant must be multiplied by
the size of the array element!

Thus, if the array is an integer array, and you want element #1 (array[1]), the instruction
would be

lw $t1,4($t0)

If the array is a character array and you want element #1, the element is one byte, so the load
must be a byte load:

lb $t1,1($t0)

Suppose you need to access array[i], where the array is an integer array, the address of the
start of the array (the base of the array) is in $t0, and the value of i is in $t2. Here is what
you do:

sll $t3,$t2,2 # multiply $t2 by 4 (the size of an element)

# add the base address. This is the effective address of array[i]
add $t3,$t3,$t0

lw $t1,0($t3)

Note that storing to memory takes the same format, and the memory address is still the right-
hand operand. Thus, the equivalent store operation to that above is

sw $t1,0($t3)

Loads and stores must be aligned. This means that you can only load an N-byte quantity
(N=1,2,4,...) from an address that is a multiple of N bytes. Thus, while it is legal to load a byte
from any memory address, it is only legal to load a word from an address that is a multiple of 4.
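
As a rough illustration of the address arithmetic and the alignment rule, here is a C sketch. The helper names and the sample base address are assumptions chosen for the example, not something defined by MIPS.

```c
#include <stdint.h>

/* Effective address of array[index] for a word (4-byte) array:
   the same computation as the sll/add pair above. */
uint32_t word_element_address(uint32_t base, uint32_t index) {
    return base + (index << 2);   /* index * 4, the element size */
}

/* A load or store of `size` bytes (1, 2 or 4) is aligned when the
   address is a multiple of `size`. Returns 1 if aligned, else 0. */
int is_aligned(uint32_t address, uint32_t size) {
    return (address % size) == 0;
}
```

With a base of 0x10010000, element 3 of an integer array sits at 0x1001000C, which is a legal address for lw. The address 0x10010001 is legal for lb but not for lw.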

Loading an initial address

Loading the base address of the array into the $t0 register in the example above is a bit tricky. An
instruction only has room for a 16-bit constant, and addresses are 32 bits! This means we must
initialize $t0 in two steps:

- the upper 16 bits of the register are set with the lui instruction (load upper immediate)

- the lower 16 bits of the register are set using an ori instruction or an addi instruction.

If the desired address is 0x01004008:


lui $t0, 0x100 # $t0 now has 0x01000000

addi $t0,$t0,0x4008 # $t0 now has 0x01004008

Note that the addi instruction has a 16-bit signed constant, which is sign-extended before it is
added. If the address wanted was 0x01008008, we could not use addi this way! This is what would happen:

lui $t0, 0x100 # $t0 now has 0x01000000

addi $t0,$t0,0x8008 # $t0 now has 0x00ff8008

There are two solutions to this. First, we could simply add one to the constant used in the lui
instruction

lui $t0, 0x101 # $t0 now has 0x01010000

addi $t0,$t0,0x8008 # $t0 now has 0x01008008

An alternative solution is to use an ori instruction instead of an addi. (The constant in andi and
ori is unsigned.)

lui $t0, 0x100 # $t0 now has 0x01000000

ori $t0,$t0,0x8008 # $t0 now has 0x01008008

This operation (loading a 32-bit address into a register) is so common that it has a
pseudoinstruction in the MIPS assembler: la (for load address). This means the instruction la can
be used in MIPS assembly, but it is translated to the appropriate lui/ori combination. In our case
this would be

la $t0, 0x1008008
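
The arithmetic behind these sequences can be checked with a small C model. This is a sketch under stated assumptions: the function names are hypothetical, and the addi model ignores the overflow trap that real hardware can raise.

```c
#include <stdint.h>

/* lui: place the 16-bit immediate in the upper half; lower half is zero. */
uint32_t model_lui(uint16_t imm) {
    return (uint32_t)imm << 16;
}

/* addi: the 16-bit immediate is sign-extended to 32 bits before the add.
   (Overflow trapping is not modeled.) */
uint32_t model_addi(uint32_t reg, uint16_t imm) {
    uint32_t extended = (imm & 0x8000u) ? (0xFFFF0000u | imm) : (uint32_t)imm;
    return reg + extended;
}

/* ori: the 16-bit immediate is zero-extended before the OR. */
uint32_t model_ori(uint32_t reg, uint16_t imm) {
    return reg | (uint32_t)imm;
}
```

Because 0x8008 has bit 15 set, model_addi treats it as the negative value 0xFFFF8008, which is why the upper half must be one larger (0x101) when addi is used. model_ori never sign-extends, so 0x100 works as written.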

The global pointer

As you see, loads from 32-bit addresses ("static" data) are cumbersome. To streamline such
loads, MIPS programs are laid out in memory so that "static data" (as opposed to data on the
stack) is in blocks of 64kB. A register is then initialized with an address that is in the middle of
the current block of data. Later, that register can be used as the base register, and loads and stores
can reference memory using a signed offset from it. If there is more than one 64kB block of data
for a program, the global pointer can be initialized each time a function is entered. If a function's
data is larger than 64kB, the global pointer will have to be manipulated more often.

In our programs, data will be very small. We could simply use a global pointer that points to the
beginning of our memory area. That way offsets are always positive.

Pseudoinstructions: see the page.


Bit Instructions and Instruction Encoding

The shift instructions

Consider a number 2^N where 31 > N > 0. This number is represented in binary on our machine
by a word with a single bit set, bit #N. Thus 2^5 has bit 5 set (counting from 0) or 100000 in
binary. In decimal, of course, this is 32.

If we move this bit to the left one position, it becomes 2^6, or 64. Moving left one bit position
multiplies the number by 2. This is a left-shift operation. A left-shift by P positions multiplies the
value by 2^P.

Similarly, moving the bit one position to the right, divides it by 2. Thus, 2^6 (decimal 64)
becomes 2^5 (decimal 32) when right-shifted one position.

This is integer division, so it is not exact: bits shifted off the right side are lost. For example,
the binary number 011 (decimal 3), when right-shifted one position, becomes 001 (decimal 1), and
the previous 2^0 bit is lost.

Right-shifting has another problem: what to do when we right-shift a number that has its
most-significant bit set? If we shift zero bits in from the left, the sign bit is no longer set! This
would be correct if the original number was unsigned. Let's look at an example using four-bit
numbers:

1100 in a four-bit unsigned number is decimal 12. If we right-shift this one position and
set the leftmost position to 0, the result is 0110, or 6 base 10. This is the correct answer.
However, if the original number was interpreted as signed, its original value would have
been -4. When we right-shift -4 by 1 position it shouldn't become 6! Instead we want to
set the [shifted-in] most-significant bit to 1. This would produce 1110, which, when
interpreted as a signed four-bit number is -2.

These two types of right shifts are called logical (when we treat the number as unsigned, and
shift 0 bits in) and arithmetic (when we treat the number as signed, and replicate the sign bit in
the bits shifted in). Since there is only one type of left-shift and it shifts 0 bits in, it is also called
logical.
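
The two right shifts can be sketched in C on 32-bit values. The function names below are illustrative. The arithmetic shift is written out by hand because C leaves the result of right-shifting a negative signed value up to the implementation.

```c
#include <stdint.h>

/* Logical right shift: zero bits are shifted in from the left. */
uint32_t shift_right_logical(uint32_t value, unsigned amount) {
    amount &= 31u;                       /* only shift amounts 0..31 are used */
    return value >> amount;
}

/* Arithmetic right shift: copies of the sign bit are shifted in. */
uint32_t shift_right_arithmetic(uint32_t value, unsigned amount) {
    amount &= 31u;
    uint32_t shifted = value >> amount;
    if (value & 0x80000000u) {
        /* Set the `amount` vacated high bits to 1. */
        shifted |= ~(0xFFFFFFFFu >> amount);
    }
    return shifted;
}
```

The 32-bit pattern for -4 is 0xFFFFFFFC. Shifted logically by one it becomes 0x7FFFFFFE, a large positive number. Shifted arithmetically it becomes 0xFFFFFFFE, which is -2, matching the four-bit example above.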

AND and OR operators

...

Example:

Given two 16-bit unsigned non-zero numbers in $t0 and $t1, set $t2 so that $t2 has the number in
$t1 as its most significant 16 bits and the number in $t0 as its least significant 16 bits.
Which code sequence is correct:

and $t2,$t0,$t1     sll $t2,$t1,16      sll $t2,$t1,16      sll $t2,$t0,16
                    or $t2,$t2,$t0      and $t2,$t2,$t0     sra $t2,$t2,16
                                                            sll $t1,$t1,16
                                                            or $t2,$t1,$t2

(One of the answers is correct all of the time and one is correct some of the time.)

MIPS Instructions

All of and, or, xor and nor have R-type MIPS instructions where three registers are used:

op rd, rs, rt # rd = rs op rt for op=and,or,xor,nor

All except nor also have immediate counterparts where the 16-bit immediate value is treated as
unsigned (not sign-extended) when the operation is performed. These are useful for creating 16-
bit ANDs, ORs and XORs.

Extracting values stored in a bit-field

The first 16 bits of the master data structure stored for every data object on Unix, called an inode,
comprise the object's mode. The mode includes both the filetype and the basic permissions.
Within these 16 bits, bit #8 (counting from 0) indicates whether the owner of the file can read it.
Thus, our data looks like

-------B--------

where B is the bit we want, and - indicates each bit that is 'in the way'.

Problem: if a file's mode is in $t0, set $t1 to 1 if the file's owner can read it, 0 otherwise.

There are several ways to solve this problem. Let's look at this graphically again. Here is the
operation we are interested in:

-------B-------- -> 000000000000000B

You may already see a simple solution to this problem, but we are going to take the long way
around and discuss the generally-useful idea of a mask.

A mask is a special bit pattern that is constructed so that an AND or OR operation can be applied
to a selected sequence of bits to isolate it. In our case, we want to construct a mask so that we
can extract our single bit, i.e., so that the operation

-------B-------- -> 0000000B00000000


can be performed. Of course, if we use an AND operator, the bit pattern is all zeros except in the
position of B, where we have a 1:

-------B-------- AND 0000000100000000 -> 0000000B00000000

Once this operation is performed, we can simply shift B to the correct position:

0000000B00000000 -> 000000000000000B

In our case, where the file's mode is in $t0, the following sequence would be used

andi $t2,$t0,0x100 # here 0x0100 is our 'mask'
srl $t1,$t2,8

We indicated earlier that this may not be the easiest solution, though it is the most general. A
simpler solution would probably be

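sll $t2,$t0,23 # move bit 8 up to bit 31, discarding the bits above it
srl $t1,$t2,31 # move it down to bit 0, discarding the bits below it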

The concept of a mask is very useful and can be used to isolate and extract any data value. We
will see it again later. As one further example, suppose we wanted to isolate all the permissions
bits in the word in $t0, placing the isolated bits in $t2. Since the permissions take up a total of 12
bits, we would use

andi $t2,$t0,0x0fff
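
The same mask-then-shift idea can be written in C. This is a minimal sketch with made-up function names. The sample mode values in the note below are ordinary Unix modes chosen for illustration.

```c
#include <stdint.h>

/* Isolate bit 8 (owner may read) with a mask, then shift it to bit 0.
   Mirrors the andi/srl pair. */
uint32_t owner_can_read(uint32_t mode) {
    uint32_t isolated = mode & 0x0100u;   /* the mask keeps only bit 8 */
    return isolated >> 8;                 /* result is 0 or 1 */
}

/* Keep only the low 12 permission bits, as andi with 0x0fff does. */
uint32_t permission_bits(uint32_t mode) {
    return mode & 0x0FFFu;
}
```

For a regular file with permissions rw-r--r-- the mode is 0x81A4, so owner_can_read gives 1 and permission_bits gives 0x1A4. For a mode of 0x8024 (permissions ---r--r--) owner_can_read gives 0.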

Encoding MIPS instructions

As discussed in an earlier section, R-type instructions must have room in the instruction
encoding for the following parts:

 an opcode field

 two source registers (rs and rt)

 a destination register (rd)

 a shift amount (for shift instructions)

Since there are 32 registers, the register fields must have 5 bits. Similarly, since the maximum
shift amount is 31, the shift amount field must have 5 bits. This leaves 12 bits for the opcode. To
make the encoding for different instruction types more compatible, the opcode field was broken
into two 6-bit fields, called opcode and function. For R-type instructions, the function (funct)
field indicates the instruction and the opcode (op) field (which is 0 or 1 for an R-type
instruction) indicates to look in the funct field for the operation code.

Encoding Instructions using Instructions

Using our example instruction of add $t0, $t1, $t2 we will use our bitwise instructions
to accomplish two tasks:

1. Create the instruction word from its parts (this is, after all, what MARS must do when it
assembles the instruction!)

2. Take an existing R-type instruction and modify the rd field, setting it to the value currently in $t4.
Leave the remainder of the fields intact.

Both of these tasks will allow us to practice with masks and our bitwise operators.

Problem: Encode the add $t0, $t1, $t2 instruction, placing the result in $t0

If you remember from our earlier discussion, we had values for these constants:

 $t0 is 8, so $t1 is 9 and $t2 is 10.

 for add, the op field is 0 and the funct field is 32

Let's assume we have the following register assignments already. (We are doing the general
solution here rather than optimizing for 0-valued fields.)

li $t1,0 # opcode
li $t2,9 # rs
li $t3,10 # rt
li $t4,8 # rd
li $t5,0 # shamt
li $t6,32 # funct

Notice that each of these load immediate instructions (li is a pseudoinstruction) is implemented on
MARS by an add instruction. For example, li $t2,9 becomes addiu $t2,$zero,9. (addiu still
sign-extends the immediate value, but the value is positive and less than 0x8000, so sign extension
does not change it.)

Here is the easiest way.

move $t0,$t1 # opcode
sll $t0,$t0,5 # make room for rs
add $t0,$t0,$t2 # rs
sll $t0,$t0,5 # shift to make room for rt
add $t0,$t0,$t3 # rt
sll $t0,$t0,5 # shift to make room for rd
add $t0,$t0,$t4 # rd
sll $t0,$t0,5 # shift to make room for shamt
add $t0,$t0,$t5 # shamt
sll $t0,$t0,6 # shift to make room for funct
add $t0,$t0,$t6 # funct

The alternate way is to OR in each piece. Let's do this for practice.

sll $t0,$t1,26 # put op in place
sll $t2,$t2,21 # shift rs to align
or $t0,$t0,$t2 # or in rs
sll $t3,$t3,16 # shift rt to align
or $t0,$t0,$t3 # or in rt
sll $t4,$t4,11 # shift rd to align
or $t0,$t0,$t4 # or in rd
sll $t5,$t5,6 # shift shamt to align
or $t0,$t0,$t5 # or in shamt
or $t0,$t0,$t6 # or in funct

Interestingly, this took one less instruction.
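
For comparison, the OR-based approach can be sketched in C. The function name is an assumption made for this example. Each field is shifted to its position in the R-type layout (op at bit 26, rs at 21, rt at 16, rd at 11, shamt at 6, funct at 0) and OR-ed in.

```c
#include <stdint.h>

/* Build an R-type instruction word from its six fields.
   Each field is masked to its width before being shifted into place. */
uint32_t encode_r_type(uint32_t op, uint32_t rs, uint32_t rt,
                       uint32_t rd, uint32_t shamt, uint32_t funct) {
    uint32_t word = (op & 0x3Fu) << 26;   /* 6-bit opcode      */
    word |= (rs & 0x1Fu) << 21;           /* 5-bit source rs   */
    word |= (rt & 0x1Fu) << 16;           /* 5-bit source rt   */
    word |= (rd & 0x1Fu) << 11;           /* 5-bit destination */
    word |= (shamt & 0x1Fu) << 6;         /* 5-bit shift amount */
    word |= (funct & 0x3Fu);              /* 6-bit function    */
    return word;
}
```

Calling encode_r_type(0, 9, 10, 8, 0, 32) with the field values listed above for add $t0, $t1, $t2 yields 0x012A4020.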

Problem: Take an existing R-type instruction and modify the rd field, setting it to $t4.
Leave the remainder of the fields intact.

This involves OR-ing in the new value of rd. But first, the current instruction's rd field must be
zeroed. This is a common use of masks. Here are the steps involved:

1. Create mask that has 1's everywhere EXCEPT the rd field, where there are 0's

2. AND the existing instruction with the mask. This zeroes the rd field, leaving the remainder of the
instruction intact.

3. Align the new value of rd so that it is in the correct position

4. OR the new value of rd into the existing instruction

Assuming our existing instruction is in $t0 and the value of the new rd field is in $t7, here is the
code

sll $t7,$t7,11 # shift rd to align
ori $t8,$zero,0xF800 # see note below
nor $t8,$t8,$zero # complement of $t8 to form correct mask
and $t0,$t0,$t8 # and mask with existing instruction to zero rd
or $t0,$t0,$t7 # or in the new rd field

Note: We create the complement of the mask, then complement it to get our correct mask (this is
the nor instruction). We can use a load immediate instruction: If we use it with a hexadecimal
constant 0xF800, Mars will realize that it cannot use an addi, as this will sign-extend the result,
and, instead it will substitute an ori instruction. We will just insert the native ori instruction to
show what it looks like. $zero comes in very handy here.
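
The four steps can also be sketched in C. The function name is hypothetical. Unlike the MIPS sequence, this sketch also masks the new rd value to 5 bits as a precaution, which is an addition of this example.

```c
#include <stdint.h>

/* Replace the rd field (bits 11..15) of an R-type instruction word. */
uint32_t replace_rd(uint32_t instruction, uint32_t new_rd) {
    uint32_t mask = ~(uint32_t)0xF800u;          /* 1s everywhere except rd */
    uint32_t cleared = instruction & mask;       /* zero the old rd field   */
    uint32_t aligned = (new_rd & 0x1Fu) << 11;   /* align the new rd value  */
    return cleared | aligned;                    /* OR in the new rd field  */
}
```

Starting from 0x012A4020 (add $t0, $t1, $t2) and setting rd to 12 (the register number of $t4) gives 0x012A6020. All other fields are unchanged.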

Decisions

If statements and basic branches

Addressing mode

Branch instructions are I-type. The label address is encoded in the 16-bit field of the I-type
instruction. 16 bits is insufficient for a full address, so the field contains a relative address,
which is part of the branch instructions' new addressing mode: pc-relative.

The pc holds the address of the next instruction to be executed. While an instruction at address N
is executing, the pc is set to the address N+4 (the next instruction is 4 bytes forward). The branch
instruction's 16-bit constant contains a number that indicates how to adjust the pc, but it is not
measured in bytes. To give the branch a longer range, the adjustment is measured in words. Thus,
in the sequence

label: some other instruction
sub $t3,$t0,$t1
slt $t2,$t3,$zero
bne $t2,$zero,label
whatever instruction follows the branch

the bne instruction would hold the constant -4. This number would be multiplied by 4 (read:
left-shifted by 2), then added to the pc. Originally the pc had the address of the instruction
whatever. After the adjustment by -16 bytes, it has the address of label.

This use of word-offsets in a branch instruction enables branches to have a range of 65536
instructions, centered on the current instruction. This is 256kB of code - which, in most cases, is
much larger than the size of a function.
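
The offset calculation can be sketched in C. The function names and the sample addresses in the note are assumptions for illustration. The model follows the description above: the pc already holds the address of the instruction after the branch when the offset is applied.

```c
#include <stdint.h>

/* Word offset stored in a branch at `branch_address` that targets
   `target_address`. The pc is branch_address + 4 when the branch runs. */
int32_t branch_offset_words(uint32_t branch_address, uint32_t target_address) {
    int32_t byte_distance = (int32_t)(target_address - (branch_address + 4));
    return byte_distance / 4;
}

/* Inverse: the address a branch reaches, given its 16-bit word offset. */
uint32_t branch_target(uint32_t branch_address, int16_t offset_words) {
    int32_t byte_adjust = (int32_t)offset_words * 4;   /* left-shift by 2 */
    return branch_address + 4 + (uint32_t)byte_adjust;
}
```

If label is at 0x00400000, the bne in the sequence above is at 0x0040000C, and branch_offset_words gives -4, as stated in the text.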

Other branching instructions

In cases where the branch target is too far away, a jump instruction must be used instead. The
jump instruction is simple

j address

The jump instruction uses the J-type encoding. A J-type instruction is simple: there are 6 bits of
opcode followed by a 26-bit constant. This 26-bit constant is again a larger constant in disguise:
it is a word address, so it stands for a 28-bit byte address whose low-order 2 zero bits are
suppressed. The result is not PC-relative, however. Rather, the 28-bit result replaces the
low-order 28 bits of the current PC. This gives the jump instruction the ability to jump anywhere
within the current 256MB (2^28 byte) segment of code. Let's look at an example. If the current PC
is 0x20012C58:

bits:    31-28   27-24   23-20   19-16   15-12   11-8    7-4     3-0
binary:  0010    0000    0000    0001    0010    1100    0101    1000
hex:     2       0       0       1       2       C       5       8

and we need to jump to the address 0x20053C58, we can't use a branch instruction because the
target PC is 0x41000 bytes (or 0x10400 words) away, which is beyond the range of a 16-bit signed
word offset. Instead, we must alter the PC using a jump instruction. The new PC must look like this

bits:    31-28   27-24   23-20   19-16   15-12   11-8    7-4     3-0
binary:  0010    0000    0000    0101    0011    1100    0101    1000
hex:     2       0       0       5       3       C       5       8

where bits 2-27 of the target PC are the bits encoded in the jump instruction. The jump
instruction would then be (showing the encoded 26-bit constant)

j 0x0014F16

The most general jump instruction, which can jump anywhere, is the jr reg instruction. In this
case, the new PC is placed in a register and the jr reg instruction indicates to jump to the address
in the register. Here, of course, there is no adjustment of the address: it is a real [byte] address.
(Of course the low-order 2 bits of the address must be zero to avoid an alignment fault.)
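
The J-type address arithmetic can be sketched in C. The function names are made up for this example, and the model follows the description in the text, where the upper 4 bits come from the current PC.

```c
#include <stdint.h>

/* The 26-bit constant stored in a j instruction for a given target:
   bits 2..27 of the target address. */
uint32_t jump_field(uint32_t target_address) {
    return (target_address & 0x0FFFFFFFu) >> 2;
}

/* The address reached: the field, shifted left by 2, replaces the
   low-order 28 bits of the PC. */
uint32_t jump_target(uint32_t pc, uint32_t field) {
    return (pc & 0xF0000000u) | ((field & 0x03FFFFFFu) << 2);
}
```

For the example above, jump_field(0x20053C58) is 0x0014F16, and applying that field at PC 0x20012C58 gives back 0x20053C58.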

In the problems section of this module, Problem One compares these three techniques of
branching.

Translating if-statements

simple example:

if (a < b) {
temp = b;
b = a;
a = temp;
}

assuming a is in $t0 and b is in $t1:

bge $t0,$t1,.L4 # pseudoinstruction
move $t2,$t1 # temp = b
move $t1,$t0 # b = a
move $t0,$t2 # a = temp
.L4:

Loop Statements

A simple loop for practice

int A[N], i, element, *elptr;
i=0;
while (i<N) {
elptr = &A[i];
element = *elptr;
if (element < 0) *elptr = -element;
i++;
}

If iptr is a pointer that points to an integer, and we want it to point to an integer i (that is, iptr
holds the address of i), the declaration of iptr is

int *iptr, i;

this means that iptr is a pointer to an int. (Note that the * only applies to iptr, not to i. i is just an
int.) The * in a declaration means 'is a pointer to'.

You initialize iptr to point to an integer using the address-of operator (&) like this

iptr = &i;

To use the address in iptr and get the integer it points to you use * in an assignment statement
like this

int j = *iptr;

In an assignment statement, * means dereference, or go through the pointer and get what it
points to.

Let's go ahead now and change this to our ugly version of code:

i=0;
loop: if (i>=N) goto loopdone;
elptr=&A[i];
element=*elptr;
if (element >= 0) goto loopskip;
*elptr = -element;
loopskip: i++;
goto loop;
loopdone:

MIPS code:

# i=0; // keep i in $t0
move $t0,$zero
# loop: if (i>=N) goto loopdone;
loop:
lw $t1,N
bge $t0,$t1,loopdone
#elptr = &A[i];
la $t2,A
sll $t3,$t0,2
add $t2,$t2,$t3 # $t2 is elptr
#element = *elptr;
lw $t4,0($t2) # $t4 is element
# if (element >= 0) goto loopskip
bge $t4,$zero,loopskip
# *elptr = -element;
sub $t4,$zero,$t4
sw $t4,0($t2)
# loopskip: i++;
loopskip:
addi $t0,$t0,1
# goto loop;
b loop
# loopdone:
loopdone:

A moderately-complex loop

Here is the code we want to translate:

for (i=0; i<N; i++) {
    for (j=i+1; j<N; j++) {
        if (A[j] > A[i]) {
            temp=A[i];
            A[i] = A[j];
            A[j] = temp;
        }
    }
}

A is an integer array of N elements.

Start with the standard translation to gotos

i=0;
nextouter: if (i >= N) goto alldone;

j=i+1;
nextinner: if (j >= N) goto incrouter;

/* if (A[j] > A[i] ) { */


if (A[j] <= A[i]) goto incrinner;
temp=A[i];
A[i] = A[j];
A[j] = temp;
/* } */

incrinner: j=j+1;
goto nextinner;

incrouter: i=i+1;
goto nextouter;

alldone:

MIPS register assignments:

- $t9 will be A (&A[0])

- $t0 will be i (since i and j are 'throw away', we will not save them in memory)

- $t1 will be j

- $t8 will be N

- other t registers will be temporaries

First, initialize a simple integer array of 10 elements:

.data
A: .word 5,3,-3,54,2,-56,7,9,11,11
N: .word 10

Now the .text region with the initialization of our variables (above)

.text
.globl main
main:
# $t8 will be N; $t9 will be &A[0]
la $t9, A
lw $t8, N

# $t0 will be i, $t1 will be j


# i=0;
move $t0,$zero

# nextouter: if (i >= N) goto alldone;


nextouter:
bge $t0,$t8,alldone

# j=i+1;
addi $t1,$t0,1

# nextinner: if (j >= N) goto incrouter;


nextinner:
bge $t1,$t8,incrouter

# if (A[j] <= A[i]) goto incrinner;


sll $t6,$t1,2
add $t6,$t6,$t9 # &A[j]
lw $t4,0($t6) # A[j]
sll $t5,$t0,2
add $t5,$t5,$t9 # &A[i]
lw $t3,0($t5) # A[i]
ble $t4,$t3,incrinner

# temp=A[i];
move $t2,$t3

# A[i] = A[j];
sw $t4,0($t5)

# A[j] = temp;
sw $t2,0($t6)

# incrinner: j=j+1;
incrinner:
addi $t1,$t1,1

# goto nextinner;
b nextinner

# incrouter: i=i+1;
incrouter:
addi $t0,$t0,1

# goto nextouter;
b nextouter

alldone:
jr $ra
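
To check the goto translation before trusting the assembly, it can be compiled as ordinary C. The sketch below wraps the goto version in a function whose name is an assumption of this example. The labels match the ones used above.

```c
/* The goto form of the nested loop, as a compilable C function.
   It swaps whenever a later element is larger, so A ends up in
   descending order. */
void exchange_sort_goto(int *A, int N) {
    int i, j, temp;

    i = 0;
nextouter:
    if (i >= N) goto alldone;

    j = i + 1;
nextinner:
    if (j >= N) goto incrouter;

    if (A[j] <= A[i]) goto incrinner;
    temp = A[i];
    A[i] = A[j];
    A[j] = temp;

incrinner:
    j = j + 1;
    goto nextinner;

incrouter:
    i = i + 1;
    goto nextouter;

alldone:
    return;
}
```

Run on the ten-element array from the .data section, the function leaves 54 in A[0] and -56 in A[9].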

Switch Statements
Decisions - Problems

Data Areas and Introduction to Procedures


Data Areas

The .text and .data directives control whether you are defining code (.text) or data (.data) for the
program. Instructions are placed in memory at significantly different addresses than data.
There are two reasons for this separation:

- we want the addresses for data, and especially for text, to be contiguous, and we don't want to
  'run out' of addresses as our program grows.

- the protections set on these areas are different. Simply, we need to execute code and read
  and write data. Conversely, we don't generally need (or want) to execute data, and we
  want to limit the ability to read and write code.

To ensure that these regions do not overlap, and sufficient room is permitted for each, systems
adopt a convention for the initial addresses for each. These blocks of addresses are typically
called segments. On MARS the .text segment starts at 0x400000 and the .data segment starts at
0x10000000. This gives us a contiguous region of about 250MB of .text space.

The data region: starts at 0x10000000 and ends at 0x80000000 - an area of 1.75GB. But there are
several kinds of data. The first (and most important) division is between static data and stack
data.

Stack data is used for temporary data and arguments to support a chain of procedure calls. The
stack must be allowed to 'grow' as necessary to support longer call chains of possibly more
complex procedures. Data on the stack exists as long as the procedure that defined it is active.
Thus, the address of an item on the stack is determined when the procedure is called, and will
[probably] be different if the same procedure is called a second time.

Static data is used for data that resides at a fixed constant address and is independent of active
procedures. It consists of global variables, string data (such as that used for messages), and
dynamic data such as that allocated by malloc (read: new()). This last type of data is still termed
static because it is valid as long as the programmer wants, and is independent of procedures: in
other words, this data does not go out of scope. This data must be freed either explicitly or by
garbage collection (if all references to it are deleted). Just like stack data, static data must be
allowed to grow.

Support for two types of data that must grow using a single block of memory is implemented by
starting each of the types on opposite ends of the address range. Then, one type of data grows
from lower to higher addresses (the static data) and the other (the stack data) grows from higher
to lower addresses:

0x10000000 ................. 0x7ffffffc


static data grows -----> <------------- stack data grows

The static data area (also called the data area) is subdivided into three smaller segments:
initialized data, uninitialized data, and heap data.

initialized data - this data is defined in program modules, space is allocated explicitly for it, and
it is often given an initial value. Examples of initialized data are an array that is initialized, and a
message that has predefined textual data. When using MARS it also contains variables whose
size is known but which are not initialized (i.e. using a .space directive). This latter category
would normally be part of uninitialized data. The size of the initialized data region is known
when the program is created (compiled and linked, or, in the case of MARS, just assembled).

uninitialized data follows initialized data and usually includes arrays that have a size but that are
not initialized (which are part of initialized data in MARS). This portion of uninitialized data is
called bss. Just like initialized data, the size of bss is known when the program is created. On
MARS, both initialized and uninitialized data are in the .data segment.

The end of the (initialized + uninitialized) data area is constant and is the highest legal absolute
address available to the programmer when the program starts. (Remember, stack addresses are
usually allocated outside of the programmer's control and never use absolute addresses). It is
illegal to reference addresses past the end of the known data area. The addresses that follow this
will be used to allocate our third type of static data - heap data. To make heap addresses legal,
the size of the program must be changed by extending the data area. This is done using a system
call named sbrk() (read: s-break).

In a normal program, the memory allocator package malloc() calls sbrk() to get a block of
memory addresses 'made legal' (i.e., mapped into the address space), then manages the block
internally, doling out chunks of it as requested. We will call sbrk() directly each time we need a
piece of heap memory. We give sbrk() a size, and it returns an address to a piece of memory of
that size, which is allocated by simply extending the data area size by that amount. We will do
this briefly using the MARS sbrk syscall. Later, we will call our version of the function malloc,
which will call sbrk for us and ensure the memory it returns is not zeroed.

Suppose we need room for an array of N integers, named result. We would simply define result
as a pointer to an int, and the C code to allocate the array would be as follows:

int *result = sbrk(N<<2); (for us, this is the same as malloc(N<<2))
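
The idea of allocating by extending the data area can be illustrated with a toy model in C. Everything here is hypothetical: toy_sbrk, the fixed 1024-byte area and the rounding to whole words are assumptions made for this sketch. It does not call the real sbrk.

```c
#include <stddef.h>
#include <stdint.h>

/* A fixed block standing in for the addresses past the end of the data area. */
static uint32_t toy_heap[256];          /* 1024 bytes, word-aligned */
static size_t toy_break_words = 0;      /* current end of the "data area" */

/* Return the old end of the area and move the end up by size_bytes
   (rounded up to whole words). Returns NULL if the area is exhausted. */
void *toy_sbrk(size_t size_bytes) {
    size_t words = (size_bytes + 3) / 4;
    if (words > 256 - toy_break_words) {
        return NULL;
    }
    void *old_break = &toy_heap[toy_break_words];
    toy_break_words += words;
    return old_break;
}
```

Two successive calls return adjacent blocks: after a request for 10 integers (10<<2 bytes), the next block starts 40 bytes further on.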

Byte Order
In its hexadecimal memory displays, MARS shows the characters in strings strangely. This is due to
the byte order of the underlying machine and to the fact that all MARS segment displays are shown
using four-byte quantities.

In short, there are two ways to number bytes in a four-byte quantity (a 32-bit word). In each of
the two drawings below, a word is shown as it would appear in a register. The byte farthest to the
left is the most-significant byte. The byte farthest to the right is the least-significant byte. In the
first drawing, we have numbered the bytes 0-3, where 0 is the most-significant byte.

0 1 2 3

If the bytes in our word were numbered like this, and the address of the word when it is placed in
memory was 0x1000, then the most-significant byte would appear at address 0x1000 and the
least-significant byte would appear at address 0x1003. This is how most people (at least most of
those of us who grew up in a left-to-right reading world) would think of byte-ordering, and, in
fact, when a 32-bit value in the header of an Internet packet is transferred across the Internet, the
bytes that make up the value must be transferred in this order - most-significant-byte first or
MSB-first for short. (The format is also called big-endian because the big part of the value (the
most-significant part) is transferred first.)

The other way to number bytes in our word is to assign 0 to the least-significant byte, like this:

3 2 1 0

Thus, if the address of this word when stored in memory was 0x1000, the byte stored at 0x1000
would be the least-significant byte. This format is called least-significant-byte first (LSB-first or
little-endian). It is the use of LSB-first byte order that causes the confusion in the memory
dumps.
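
The two numberings can be made concrete with a short C sketch that places the bytes of a word into memory in each order. The function names are hypothetical, and the last function is just one common way to ask which order the machine running the code uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* MSB-first (big-endian): the most-significant byte goes to the lowest address. */
void store_msb_first(uint32_t word, unsigned char bytes[4])
{
    assert(bytes != NULL);
    bytes[0] = (unsigned char)(word >> 24);
    bytes[1] = (unsigned char)(word >> 16);
    bytes[2] = (unsigned char)(word >> 8);
    bytes[3] = (unsigned char)word;
}

/* LSB-first (little-endian): the least-significant byte goes to the lowest address. */
void store_lsb_first(uint32_t word, unsigned char bytes[4])
{
    assert(bytes != NULL);
    bytes[0] = (unsigned char)word;
    bytes[1] = (unsigned char)(word >> 8);
    bytes[2] = (unsigned char)(word >> 16);
    bytes[3] = (unsigned char)(word >> 24);
}

/* Returns 1 if the host machine stores words LSB-first, 0 otherwise. */
int host_is_lsb_first(void)
{
    uint32_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}
```

If bytes[0] is taken to be address 0x1000, store_msb_first puts the most-significant byte at 0x1000 and store_lsb_first puts the least-significant byte there.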

Let's look at the bytes comprising a string of data.

prompt: .asciiz "Enter the integer to convert to hex:"

ASCII strings, whose underlying type is one byte long, are stored as you would expect. The first
byte of the string is stored at address 0, the second at address 1, etc. We can see this if the string
is placed in a file and that file is dumped using the Linux od command.

If we dump this file character-by-character, we see that the characters' addresses simply increase
as we would expect:

$ od -A x -tc botest
000000   E   n   t   e   r       t   h   e       i   n   t   e   g   e
000010   r       t   o       c   o   n   v   e   r   t       t   o
000020   h   e   x   :  \n

Let's look at the data character by character where the characters are displayed in hexadecimal:
$ od -A x -tx1 botest
000000 45 6e 74 65 72 20 74 68 65 20 69 6e 74 65 67 65
000010 72 20 74 6f 20 63 6f 6e 76 65 72 74 20 74 6f 20
000020 68 65 78 3a 0a

If we output the file word-by-word, we will see that the words are read in LSB-first order. This
means that the first character ('E', whose value is 0x45) is assumed to be the least-significant
byte of the first word. When the words are redisplayed as 32-bit quantities, the characters appear
swapped:

$ od -A x -tx4 botest
000000 65746e45 68742072 6e692065 65676574
000010 6f742072 6e6f6320 74726576 206f7420
000020 3a786568 0000000a

Let's compare each of these so you can see how they line up:

000000  E  n  t  e  r     t  h  e     i  n  t  e  g  e
000000 45 6e 74 65 72 20 74 68 65 20 69 6e 74 65 67 65
000000    65746e45    68742072    6e692065    65676574

The last line here is the way you will see the data displayed in the memory dump, although the
characters will appear in correct order if four bytes of this data are loaded as a word into a
register.
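
The word values in the dump can be reproduced by hand. The sketch below, under the assumption that bytes are combined LSB-first as described above, builds one 32-bit word from up to four consecutive bytes; the function name word_lsb_first is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine up to four bytes starting at p into one word, LSB-first:
   the byte at the lowest address becomes the least-significant byte.
   Bytes past the end of the data (beyond 'available') count as zero. */
uint32_t word_lsb_first(const unsigned char *p, size_t available)
{
    assert(p != NULL);
    uint32_t word = 0;
    for (size_t i = 0; i < 4 && i < available; i++)
        word |= (uint32_t)p[i] << (8 * i);
    return word;
}
```

Applied to the first four bytes of the prompt string ('E', 'n', 't', 'e'), this gives 0x65746e45, the first word shown in the four-byte dump.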

Characters

MARS limits us to the use of ASCII characters.

There are two common ways to represent a character string consisting of ASCII characters. The
one used by MARS is the same as the one used in C - a character string is a sequence of
characters followed by a null byte (a byte with value 0). The drawback is that you cannot know
the length of a string without counting the characters. In the second encoding of a character
string, the first byte of the string indicates its length.
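
The difference between the two representations shows up when the length is needed. A minimal C sketch, with hypothetical function names, assuming a one-byte count for the second encoding:

```c
#include <assert.h>
#include <stddef.h>

/* Null-terminated string: the length is found by counting up to the zero byte. */
size_t length_null_terminated(const unsigned char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

/* Length-prefixed string: the first byte holds the count, so the length is
   available immediately (but is limited to 255 with a one-byte count). */
size_t length_prefixed(const unsigned char *s)
{
    assert(s != NULL);
    return s[0];
}
```

The first function takes time proportional to the length of the string; the second is a single byte load.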

Besides the use of the null byte (which is also called a 'nul') to end the string, we also need to
encode the line terminator character.

Putting this all together, we can encode our string. Let's look up the values:

M o m \n
0x4d 0x6f 0x6d 0xa

If entered into a MIPS program:


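.data
mom: .byte 0x4d,0x6f,0x6d,0x0a,0x0    # 'M' 'o' 'm' '\n' and the null byte
.align 2
.text
.globl main
main:
li $v0,55        # MARS syscall 55: message dialog
li $a1,1         # message type 1: information message
la $a0,mom       # address of the null-terminated string
syscall
jr $ra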

Look at the memory in the .data region. Assuming your label mom is word-aligned, you will find
a word with this value

0x0a6d6f4d

in which the characters appear in backwards order. This is due to the byte order of the machine
used. Since Intel machines are LSB-first, the data, when output in 4-byte quantities, reverses the
bytes of a character string such as this. Whenever you are examining character data in MARS,
you must remember that the bytes are reversed to avoid getting confused.

International Characters

Characters that are not part of the ASCII subset are represented using Unicode.

Studying Unicode can be a bit confusing, since there is a difference between the characters
themselves and how they are transmitted. The problem is this - the computers of the world use a
byte stream to transmit character data. The base Unicode character set, which encodes all of the
most commonly-used languages of the world, is a 16-bit code. Inside a program, these 16-bit
Unicode characters are referred to as wide characters. This base Unicode character set has the
ASCII characters as a subset. But how can this new 16-bit code be transmitted as a sequence of
bytes so that its users and the large number of ASCII users can easily coexist?

The solution is to encode non-ASCII Unicode characters as a sequence of two to three bytes.
This encoding is called UTF-8. When transmitting UTF-8, ASCII Unicode characters are
transmitted as single-byte ASCII values. When a non-ASCII character must be transmitted, the
otherwise unused most-significant bit of the byte is set to signal a change in encoding, and the
encoded character is sent. The number of leading 1-bits in the first byte indicates how many
bytes are used for the encoded character.

As an example, take French. 'Mom' becomes 'Mère'. In this word, three of four characters are
ASCII. These do not require any special encoding - ASCII simply remains ASCII in UTF-8. The
remaining character, è, has the value 0xe8 in Unicode. Since this is greater than 0x7f it must be
encoded as a sequence of bytes. The encoding for UTF-8 indicates that any value up to 0x7ff can
be encoded in a sequence of two bytes. The most-significant bits of the first byte would be 110
and the most-significant bits of the second byte would be 10. The value of the character is
encoded using the remaining bits (as indicated by the x's below):

first byte: 110xxxxx (the number of leading 1-bits indicates the number of bytes used to encode
the character)

second byte: 10xxxxxx

If we encode our character in binary as an eleven-bit number abcdefghijk (with leading
zeros to maintain the correct value), the bits are transmitted like this

first byte: 110abcde

second byte: 10fghijk

Let's apply this to our special character è. Its value is 0xe8, which is 00011101000
expressed in 11 bits. Thus, encoded using UTF-8 the bit patterns are

11000011 10101000 or 0xc3 0xa8

Let's try it in another language. In another popular language, one character that can be used as
an expression of Mom has the value 0x5988. According to UTF-8, any 16-bit value can be
encoded using three bytes. The bits used to carry the value are the x's in

first byte: 1110xxxx (again, the number of leading 1-bits indicates the number of bytes used to
encode the character)

second and third bytes: 10xxxxxx

Our character has the 16-bit bit pattern 0101100110001000. This is encoded as

11100101 10100110 10001000

or 0xe5 0xa6 0x88
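
The bit-shuffling above can be written as a short C function. This is a sketch that covers only the 16-bit values discussed here (one, two or three bytes); the name utf8_encode16 is made up, and the sketch does not reject values that Unicode reserves (the surrogate range 0xD800-0xDFFF).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode a 16-bit Unicode value as UTF-8.
   Writes 1 to 3 bytes into out and returns the number of bytes written. */
int utf8_encode16(uint16_t value, unsigned char out[3])
{
    assert(out != NULL);
    if (value <= 0x7f) {                       /* ASCII stays ASCII */
        out[0] = (unsigned char)value;
        return 1;
    }
    if (value <= 0x7ff) {                      /* 110abcde 10fghijk */
        out[0] = (unsigned char)(0xc0 | (value >> 6));
        out[1] = (unsigned char)(0x80 | (value & 0x3f));
        return 2;
    }
    /* 1110xxxx 10xxxxxx 10xxxxxx */
    out[0] = (unsigned char)(0xe0 | (value >> 12));
    out[1] = (unsigned char)(0x80 | ((value >> 6) & 0x3f));
    out[2] = (unsigned char)(0x80 | (value & 0x3f));
    return 3;
}
```

Calling it with 0xe8 produces the two bytes 0xc3 0xa8, and calling it with 0x5988 produces 0xe5 0xa6 0x88, matching the hand encodings above.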


Although UTF-8 is often used to transmit and display Unicode characters, it is awkward for
storing them internally as well as manipulating them. Programs often use wide characters
internally, even in 'wide character strings'. A set of library routines is used to convert between the
UTF-8 encoding (called 'multi-byte characters') and wide characters as well as to classify wide
characters (for example digit or alphabetic character) just as is done with ASCII characters.

Unicode also has a 32-bit variant, which easily fits all the characters known in all languages in
the world. Even allowing for the private use area of the Unicode value range, all characters can
fit in 24 bits. It can also be encoded using UTF-8 and requires a maximum of four bytes.

If a program understands Unicode it will convert incoming UTF-8 to wide characters for internal
manipulation. When it must send the data elsewhere, it will be encoded as UTF-8 again for
transmission. Since UTF-8 is an eight-bit encoding, there is never an issue with byte-order.
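
The conversion from incoming UTF-8 to a wide character can be sketched as the reverse of the encoding. This is an illustration only, not the library routine a real program would call: the name utf8_decode16 is made up, only sequences of one to three bytes are handled, and no check is made for values encoded with more bytes than necessary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one UTF-8 sequence starting at s into a 16-bit wide character.
   Returns the number of bytes consumed, or 0 if the sequence is malformed,
   truncated, or longer than three bytes. */
size_t utf8_decode16(const unsigned char *s, size_t available, uint16_t *wide)
{
    assert(s != NULL && wide != NULL);
    if (available >= 1 && s[0] < 0x80) {                   /* 0xxxxxxx */
        *wide = s[0];
        return 1;
    }
    if (available >= 2 && (s[0] & 0xe0) == 0xc0 &&
        (s[1] & 0xc0) == 0x80) {                           /* 110xxxxx 10xxxxxx */
        *wide = (uint16_t)(((s[0] & 0x1f) << 6) | (s[1] & 0x3f));
        return 2;
    }
    if (available >= 3 && (s[0] & 0xf0) == 0xe0 &&
        (s[1] & 0xc0) == 0x80 && (s[2] & 0xc0) == 0x80) {  /* 1110xxxx 10xxxxxx 10xxxxxx */
        *wide = (uint16_t)(((s[0] & 0x0f) << 12) |
                           ((s[1] & 0x3f) << 6) | (s[2] & 0x3f));
        return 3;
    }
    return 0;
}
```

A return value of 0 corresponds to the case of input the program cannot translate.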

If a program does not understand Unicode it will not be able to translate any encoded Unicode
characters in the input stream. These characters may appear as ? or some other funny character in
the output if the data is later displayed.

Other Character Sets

Since most languages can be represented in 128 or 256 characters, many languages (or language
groups) have their own eight-bit code, possibly with ASCII as a subset. There is a standard set of
these codes (character sets) named ISO8859-X where X indicates the specific encoding. Of
course, data written in one character encoding must be read using the same encoding. You cannot
write some data encoded in ISO8859-X and expect to read it as UTF-8.

A word on Strings

We have been using strings for a while but have not specifically talked about them. Let's take a
moment and go over the basics. We will use character strings as our example, since C strings are
easy to understand.

A C string is simply a sequence of characters with a zero byte at the end. That is what is
generated by the .asciiz directive. We are used to the construct

welcome: .asciiz "Welcome to strings"

Here welcome is a label attached to our string. We also know that to output this string we must
get its address in the appropriate register for the syscall like this

la $a0,welcome

Mars keeps a record of where each label is. At the time the .asciiz directive was encountered,
Mars was filling up a data area (the .data section) with data, sequentially adding data as it was
encountered. At any time, a counter indicated where in the data area we currently were (the "end"
of the .data area). When the welcome: label was encountered, the value of the counter was
recorded and attached to the label welcome in an internal table called the symbol table.
Immediately following, bytes in the .data section were initialized with the contents of the
string and the counter was incremented by its length.

Later, when the la instruction was encountered, the symbol table was consulted, and the address
of welcome was substituted for the label.

For example, if the current end of the .data section was 0x10000040 when the .asciiz
directive was encountered, the address 0x10000040 would be entered in the symbol table for
welcome, 19 bytes (the string plus a null byte) would be initialized (0x10000040 -
0x10000052) and the new end of the .data section would be 0x10000053. Later, when the
la instruction was encountered, the address 0x10000040 would be substituted for the label
welcome. This would essentially become a load immediate instruction with a 32-bit constant

li $a0,0x10000040

and would be translated to two MIPS instructions

lui $at,0x1000
addiu $a0,$at,0x40
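
The split of the 32-bit address into the two 16-bit constants can be checked with a few lines of C. This is a sketch with hypothetical function names. It also carries one assumption worth stating loudly: addiu sign-extends its 16-bit constant, so this simple split only reproduces the address when the lower half is below 0x8000, as it is in our example.

```c
#include <assert.h>
#include <stdint.h>

/* Upper 16 bits of the address: the constant used by lui. */
uint16_t upper_half(uint32_t address)
{
    return (uint16_t)(address >> 16);
}

/* Lower 16 bits of the address: the constant used by addiu. */
uint16_t lower_half(uint32_t address)
{
    return (uint16_t)(address & 0xffff);
}

/* Model of lui followed by addiu for a lower half below 0x8000
   (where sign extension of the addiu constant has no effect). */
uint32_t lui_addiu(uint16_t upper, uint16_t lower)
{
    assert(lower < 0x8000);
    uint32_t at = (uint32_t)upper << 16;   /* lui   $at,upper     */
    return at + lower;                     /* addiu $a0,$at,lower */
}
```

For 0x10000040 the halves are 0x1000 and 0x40, and recombining them gives back 0x10000040.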

Once the address of the string is in a register, you can perform the syscall OR write code to
manipulate the string character by character. To retrieve the first character of the string, for
example

lb $t0,0($a0)

To increment the pointer to point to the next character we simply

addi $a0,$a0,1

etc.
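
In C, the same load-and-increment pattern is a pointer walking along the string. A small sketch (the function name count_char is made up) that counts how often a given character occurs, with the corresponding MIPS instructions shown as comments:

```c
#include <assert.h>
#include <stddef.h>

/* Count the occurrences of target in a null-terminated string by
   advancing a pointer one byte at a time. */
size_t count_char(const char *p, char target)
{
    assert(p != NULL);
    size_t count = 0;
    while (*p != '\0') {      /* lb   $t0,0($a0) and test for the null byte */
        if (*p == target)
            count++;
        p++;                  /* addi $a0,$a0,1 */
    }
    return count;
}
```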

Let's take one more step and examine the C statement (using a global variable welcome):

char * welcome = "Welcome to strings";

Although this looks very similar, it is somewhat different. Again, a null-terminated ASCII string
must be initialized and its address recorded. This time, however, the address is used to initialize a
[pointer] variable with type char *. Let's look at the code that would be generated for Mars by
a compiler, then explain it

.Lwelcome: .asciiz "Welcome to strings"


.align 2
welcome: .word .Lwelcome

Just like before, a label is generated (this time by the compiler) to attach to the string. The
[compiler-generated temporary] label (.Lwelcome) is entered into the symbol table with the
address of the string. Then, to keep track of the string in the high level language, a variable is
initialized with this address. The variable is a char *, or a pointer to char. When Mars
encounters the .word directive, it looks up the value of .Lwelcome and substitutes it. In our
example, the .word directive would become

welcome: .word 0x10000040


However, we have a further step to do. There is now a new label, welcome, which has its own
address. This address (which in our case, after the .align directive, would be 0x10000054)
is the address of the pointer. Once we have a memory word (our pointer) initialized with the
correct address (of the string), we just use it like any other data value. Now to output the string
we would

lw $a0,welcome

Note the difference between the la and lw. In the earlier case welcome was the label
attached to the string. In this case it was a data word that had been initialized with the
address of the string.
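
The addresses used in this example (0x10000040, 0x10000053 and 0x10000054) can be checked with a small model of the counter that tracks the end of the .data section. This is a sketch only; the function names are hypothetical and it models just the two directives used here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Advance the counter past a null-terminated string (.asciiz). */
uint32_t after_asciiz(uint32_t counter, const char *text)
{
    assert(text != NULL);
    return counter + (uint32_t)strlen(text) + 1;   /* +1 for the null byte */
}

/* Round the counter up to a multiple of 2^n bytes (.align n). */
uint32_t align_counter(uint32_t counter, unsigned n)
{
    assert(n < 32);
    uint32_t mask = ((uint32_t)1 << n) - 1;
    return (counter + mask) & ~mask;
}
```

Starting at 0x10000040, the string moves the counter to 0x10000053, and .align 2 then moves it to 0x10000054, which is where the pointer word welcome is placed.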

Once you have the address of the string in a register you manipulate it the same way.

Note that the characters in C strings are constant characters. It is illegal to modify the
characters in a string initialized in this way. (The type of the pointer welcome above is actually
const char *, or pointer to constant char.) You may very well get a fault if you modify the
characters.

If you want to modify characters in a string you must use a character array and initialize it in
some way - by reading in a string or by copying a constant string to it.
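
A minimal C sketch of the second option, copying a constant string into a character array and then modifying the copy. The function name and its error convention (returning -1 when the array is too small) are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy a constant string into a writable array, then replace the first
   character of the copy. Returns 0 on success, -1 if the array is too small. */
int copy_and_replace_first(char *array, size_t array_size,
                           const char *constant, char replacement)
{
    assert(array != NULL && constant != NULL);
    size_t needed = strlen(constant) + 1;      /* characters plus the null byte */
    if (needed > array_size)
        return -1;
    memcpy(array, constant, needed);
    if (array[0] != '\0')
        array[0] = replacement;                /* legal: the array is ours to modify */
    return 0;
}
```

The constant string itself is never written; only the copy in the array changes.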

Other pointers

The initialization of integer pointers is exactly the same. The only difference is how you use and
increment the pointer. At that time you must know the underlying size of the data. If the register
$a0 has been initialized with the correct address:

for a string

lb $t0,0($a0)
addi $a0,$a0,1

for an integer array

lw $t0,0($a0)
addi $a0,$a0,4
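
In C the compiler applies the element size for you: incrementing an int pointer moves it by the size of an int. A short sketch with hypothetical function names, assuming nothing beyond standard C:

```c
#include <assert.h>
#include <stddef.h>

/* Sum n integers through a pointer. */
int sum_ints(const int *p, size_t n)
{
    assert(p != NULL || n == 0);
    int total = 0;
    while (n-- > 0) {
        total += *p;    /* lw   $t0,0($a0) */
        p++;            /* addi $a0,$a0,4 when an int is four bytes */
    }
    return total;
}

/* Distance in bytes between two consecutive array elements. */
size_t int_step_bytes(void)
{
    int pair[2] = { 0, 0 };
    return (size_t)((const char *)&pair[1] - (const char *)&pair[0]);
}
```

int_step_bytes() returns sizeof(int), which is the constant the assembly programmer must add by hand.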

Background for Procedures


A function (or procedure) is used to isolate its data from the calling program (the caller). The
function executes and it uses its own variables, which must be separate from those of the caller.
Assuming the function (the callee) does not use any global data, the only data shared between
them is defined by the function parameters and return value. Usually (always, in the case of
Java), sharing of simple data types (scalars) is unidirectional, i.e., the callee cannot alter the data
of the caller. In C and C++, however, this restriction can be relaxed by the use of pointer or
reference arguments. (Array datatypes as parameters are usually writable by both caller and
callee. This is simply because the array is passed as a pointer.) In the remainder of this topic
and continuing with the next, we will discuss how this encapsulation is implemented at the
machine-language level.
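
The three cases described above (a scalar, a pointer argument and an array) can be seen side by side in C. The function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar parameter: the callee works on its own copy,
   so the caller's variable is not altered. */
void increment_copy(int value)
{
    value = value + 1;
    (void)value;            /* the change is lost when the function returns */
}

/* Pointer parameter: the callee can alter the caller's data. */
void increment_through_pointer(int *value)
{
    assert(value != NULL);
    *value = *value + 1;
}

/* Array parameter: passed as a pointer, so the caller's array is written. */
void clear_array(int values[], size_t n)
{
    for (size_t i = 0; i < n; i++)
        values[i] = 0;
}
```

After increment_copy(x) the caller's x is unchanged; after increment_through_pointer(&x) it is one larger.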

To implement encapsulation, procedures must have mechanisms for

 passing parameters

 returning simple values

 allocating local storage. This storage may be deallocated when the procedure returns

In addition to providing encapsulation, procedure implementation at the machine level must
protect the integrity of registers. Some mechanism must be provided so that the use of registers
in the callee does not interfere with their use in the caller.

A very important aspect of this implementation is that both sides of a procedure call are blind.
Other than the published interface (of parameters and return value), the caller and callee do not
know anything about each other. Procedures must be able to be translated separately (in separate
modules) and work when the modules are joined.

All of these capabilities are provided by the platform's procedure calling convention and
implemented using a stack. A stack is a block of memory with a movable pointer (the stack
pointer - $sp on MIPS). At any time, memory on one side of the stack pointer is in-use and
memory on the other side of the stack pointer is not yet allocated. The stack pointer itself points
to the last memory address that is in-use. This is called the top-of-stack.

Stacks are usually word- or double-word aligned for simplicity. This means that the stack pointer
must be moved by multiples of the alignment size - in the case of words, by four-byte
increments. Any data placed on the stack must be padded to keep proper alignment, so every
item placed on the stack occupies a minimum of four bytes. Scalars that are smaller than four
bytes are converted to their four-byte counterparts before they are placed on the stack.
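
The padding rule amounts to rounding each size up to the next multiple of the alignment. A sketch in C, assuming word (four-byte) alignment; the names STACK_ALIGNMENT and padded_size are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_ALIGNMENT 4   /* bytes; word alignment as described above */

/* Number of bytes an item of the given size occupies on the stack. */
size_t padded_size(size_t size)
{
    assert(size <= (size_t)-1 - (STACK_ALIGNMENT - 1));   /* avoid overflow */
    return (size + STACK_ALIGNMENT - 1) & ~(size_t)(STACK_ALIGNMENT - 1);
}
```

A one-byte character therefore takes four bytes on the stack, and a five-byte item takes eight.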

Although stack memory and other (static and dynamic) memory could be implemented as
completely separate memory areas (addresses), they are usually implemented using one large
block of addresses, with each starting on opposite ends. In this case (e.g., MIPS) static and
dynamic memory is allocated using increasing addresses and stack space is allocated using
decreasing addresses (i.e., the stack grows downwards):

memory address                    use

high address --->
(stack pointer starts here)       stack memory
            |
            |
            V
(lowest stack address)
(highest static address)
                                  static memory:
            ...                     dynamically allocated
            ...
low address --->                    pre-allocated (globals)

(Note that on this implementation the top of stack is actually the lowest address in-use.)

Pushing and popping

Traditionally, placing an item on the stack is called pushing the item on the stack. In our
implementation, this involves the following steps

 moving the stack pointer (by subtracting the size of the item)

 storing the item on the stack (using a non-negative offset from the current stack pointer)

The inverse operation is called popping an item off the stack. The sequence is important:

 getting the item off the stack (if it is to be preserved)

 moving the stack pointer (by adding the size of the item)

This inverse sequence is very important to avoid referencing memory that is not in the active part
of the stack. No such reference should ever be made.
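
The order of the steps can be modeled in C with an array standing in for stack memory and an index standing in for $sp. This is a toy model under stated assumptions: every item is one word, the names stack_memory, sp, push and pop are made up, and the stack grows toward lower indexes to match the downward-growing stack described above.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64   /* hypothetical size of the toy stack */

static uint32_t stack_memory[STACK_WORDS];
static int sp = STACK_WORDS;   /* empty stack: sp is just past the highest word */

/* Push: move the stack pointer first, then store the item. */
void push(uint32_t item)
{
    assert(sp > 0);               /* room left on the stack */
    sp -= 1;                      /* subtract the size of the item */
    stack_memory[sp] = item;      /* store at offset 0 from the new sp */
}

/* Pop: get the item first, then move the stack pointer. */
uint32_t pop(void)
{
    assert(sp < STACK_WORDS);     /* something to pop */
    uint32_t item = stack_memory[sp];
    sp += 1;                      /* add the size of the item */
    return item;
}
```

In both functions the memory touched is always at or above sp, so no reference is ever made outside the active part of the stack.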

Simple Procedures
