Assembly Language Programming

• Overview
– Programming languages
High-level language, assembly language, machine code
– Register organisation of MC68000
– Assembly language program
Statement structure, program structure, assembly directives
– Instruction set
– Addressing modes
Direct, immediate, absolute, indirect, relative
• References
A. Clements, “The principles of computer hardware,” Chapters 5, 6.
1. Programming Languages
1.1 Introduction
• All modern computer systems are built upon the von Neumann
model: a general-purpose processor, called the central processing
unit or CPU, is used to perform arithmetic-logical operations, and a
memory is used to store programs and data.
• Data is the object to be manipulated by the computer, and a program
is a collection of instructions, defining how to manipulate the data.

[Figure: Architecture of a computer system. The CPU, the memory and the I/O devices are connected by a bus.]
• The memory in a computer system can only store binary numbers.
Therefore the programs must be encoded into binary numbers –
called machine code or binary code.
• Usually different CPUs (e.g. Pentium or Motorola) use different
binary codes to represent the same operation.
• In earlier years, people wrote programs directly in machine
code. For example, the following program
0011 1010 0011 1000 ($3A38)
0001 0010 0000 0000 ($1200)
1101 1010 0111 1000 ($DA78)
0001 0010 0000 0010 ($1202)
0011 0001 1100 0101 ($31C5)
0001 0010 0000 0100 ($1204)
may encode the operations of fetching two numbers from
memory, adding them, and saving the sum back into memory.
Reminder: 1 hex digit = 4 bits, so 3A = 8 bits = 1 byte and
1200 = 16 bits = 2 bytes.
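The correspondence between the binary and hexadecimal forms of this program can be checked mechanically. The following is a minimal Python sketch, not 68000 code; the names WORDS and to_binary are chosen here purely for illustration.

```python
# The six 16-bit words of the example program, written in hex.
WORDS = [0x3A38, 0x1200, 0xDA78, 0x1202, 0x31C5, 0x1204]

def to_binary(word):
    """Format a 16-bit word as four groups of four bits (one group per hex digit)."""
    bits = format(word, "016b")
    return " ".join(bits[i:i + 4] for i in range(0, 16, 4))

for w in WORDS:
    print(to_binary(w), f"(${w:04X})")

# Each 16-bit word occupies 2 bytes, so the whole program takes 12 bytes.
total_bytes = 2 * len(WORDS)
```

Each hex digit maps to exactly one group of four bits, which is why hexadecimal is the usual shorthand for machine code.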
• The binary instructions are understood by the CPU and are ready for
execution; but human programmers have great difficulty in
understanding them!
• Today, programming has become easier, and programs may be
developed by using programming languages, at either a high level
or a low level.
• Examples of high-level programming languages include Basic,
Fortran, Pascal, COBOL, C, C++, Java, … (some people view C as an
intermediate language).
• Low-level programming languages particularly refer to the assembly
languages.
1.2 High level language vs. assembly language

1.2.1 High level languages (HLLs)


• In an HLL, a program may be written as statements.
Example: A and B are two short (2-byte) integers. Compute their
sum and save the result into another integer S.
In HLL (e.g. C++ or Java):
S = A + B;

• High level languages (HLLs) are designed to improve the program’s
readability, and to protect the user against tiring details of the CPU.
They allow for higher productivity in terms of program development
and maintenance, and for greater portability of the programs (with
less dependency on the hardware).
1.2.2 Assembly language
• Assembly language expresses instructions using mnemonics and
symbols. A mnemonic is a name given to an instruction.
• For example, use ADD instead of binary code $DA78 for an
addition operation, and use MOVE instead of binary code $3A38 for
a data movement operation.
• As we can see, assembly language is simply a symbolic form of
the binary code. It improves the program’s readability in
comparison to the binary code.
• The following shows a comparison between the three languages.
HLL          Assembly          Machine code
S = A + B    MOVE $1200, D5    $3A38 $1200
             ADD $1202, D5     $DA78 $1202
             MOVE D5, $1204    $31C5 $1204
1.2.3 Compiling and assembling
• Do not assume that the CPU can execute HLL statements or
assembly mnemonics. It can only understand the binary code!
• In HLL programming, a program called a compiler is used to convert
an HLL program into machine code. Likewise, in assembly
programming, a program called an assembler is used to convert an
assembly program into machine code.
[Figure: A compiler translates an HLL program (test.c, test.java) into machine code (test.obj, test.bin, test.exe); an assembler translates an assembly program (test.asm) into machine code.]
• Programs in different levels are differentiated by their extensions,
e.g. high-level - .c, .java; assembly - .asm; binary - .obj, .bin, .exe.
“obj” means object, “bin” means binary, “exe” means executable.
• Because an assembly instruction is just a symbolic form of a
machine instruction, it is possible to obtain the equivalent assembly
program from the binary program by using a disassembler.

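[Figure: The assembler converts assembly code into machine code; the disassembler converts machine code back into assembly code.]
Assembly code (suitable for      Machine code (suitable for
humans to read and write)        computers to store and execute)
MOVE $1200, D5                   3A38 1200
ADD $1202, D5                    DA78 1202
MOVE D5, $1204                   31C5 1204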

• It is very difficult, if not impossible, to go from the machine code
to the high-level language statements because there can be many
different ways of implementing the same high-level statements in
machine level instructions. This is called the loss of semantics.
1.2.4 Why program in assembly language?
• While much software can be developed in HLLs, programming
directly in assembly language may still be needed, as it can produce
target code which is more efficient than the target code
generated by compilers.
• Assembly programming has several benefits:
– Speed. Assembly programs are generally the fastest programs
around.
– Space. Assembly programs are often the smallest.
– Capability. You can do things in assembly languages which are
difficult or impossible in HLLs.
– Knowledge. Your knowledge of assembly languages will help
you write better programs, even when using HLLs.
• Speed and space are two critical issues for applications such as
mobile audio/video communication. Good assembly programmers
are capable of speeding up many programs by a factor of five or ten
over their HLL counterparts, and at the same time ending with the
target codes which are often less than one-half the size of
comparable HLL programs.
• Capability is another reason people resort to assembly language.
Anything you can do on the machine you can do in assembly
language. This is definitely not the case with most HLLs.
• In general, as a comparison:
Language Productivity/Portability Performance
HLL High Low
Assembly Low High
• Programming in assembly language is slow and error-prone but is
the only way to squeeze every last bit of performance out of the
hardware.
• Why can assembly programs offer higher efficiency?
This is because assembly language allows the programmer to see
the CPU hardware, so they can optimise the movement and
combination of data between the physical memory addresses,
registers and I/O ports, to reduce the number of instructions and/or
transfers of data required for accomplishing a task.
• Therefore, to be able to write an assembly program, one must have a
good knowledge of the hardware organisation – particularly, the
register organisation – of the microprocessor.
• The assembly programs are thus machine dependent, i.e. a program
designed for one type of CPU will normally not work for a different
type of CPU – with a lower portability than the HLLs.
• We are going to study the principles of assembly language
programming by using the Motorola MC68000 processor as a
vehicle.

• A brief history of the MC68000
– The MC68000 is the first member of Motorola’s family of
16/32-bit microprocessors. It succeeded the MC6809 and was
followed by the MC68010. It represents a reasonably state-of-the-art
architecture.
– The MC68000 was used in many powerful computers, notably
Sun 2 and Sun 3 workstations, and personal computers, notably
Apple Computer’s first Macintoshes and the Amiga.
2. Motorola Microprocessor MC68000
2.1 Register organisation within MC68000
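[Figure: Register organisation of the MC68000. The MAR and MBR connect to main memory through the address bus and the data bus. Inside the CPU are the PC, the address registers A0 to A7, the data registers D0 to D7, the IR (op-code and operand fields), the CU issuing control signals, the ALU and the CCR.]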
CU – control unit; PC – program counter; ALU – arithmetic/logic unit
An/Dn – address/data register; CCR – condition code register;
MAR/MBR – memory address/buffer register; IR – instruction register
• Register is a term particularly used to refer to a memory unit located
within the processor (CPU). A 16-bit register can store 16 bits, see
below. Note the numbering scheme for bits (now fairly universally
agreed).

Bit no: 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 0 0 0 1 1 0 0 0 1 1 0 0 1 0 1

• Two types of memory may be used by the CPU in the execution of
a program:
– The external memory (often called the main memory) holding the
program and data
– The registers located within the CPU
• Because of the speed of the memory making up the registers, and/or
its proximity, data transfers to and from a CPU register are normally
an order of magnitude faster than accesses to the external memory.
• Registers can be divided into two classes:
– Special purpose registers – used exclusively by the CPU for the
control of the execution of programs, not directly addressable by
programmers.
– General purpose registers, which are accessible by the
programmers.
• Because access to registers is much faster than access to memory, one
may move data that needs frequent access into the general-
purpose registers before processing begins. This reduces
the number of accesses to the main memory and hence speeds up the
processing. This is one of the reasons why an assembly program can
be a lot faster than an HLL program: HLLs generally do not give the
programmer direct access to CPU registers.
• General purpose registers can be used to hold data (called data
registers) or address of memory (called address registers).
• MC68000 has 16 general-purpose registers, split into 8 data
registers and 8 address registers, all being 32 bits wide, labeled as
8 data registers: D0, D1, D2, …, D7
8 address registers: A0, A1, A2, …, A7
• MC68000 has a program counter (PC), which is 32 bits wide.
– PC holds the address of the next instruction in memory. Its
contents will be automatically updated after an instruction is
fetched from the memory.
• MC68000 has a status register, which is 16 bits wide, including a
system byte and a user byte; the user byte is called the condition
code register (CCR).
– CCR will be updated after each arithmetic/logical operation, to
hold information about the conditions of the result of the
operation.
• The other registers, e.g. instruction register IR, memory address
register MAR and memory buffer register MBR, are not
programmer-accessible (not programmable).
• The binary instructions in MC68000 take a format:

Op-code Operand(s)

The first field is called op-code (operation code), which defines the
operation to be conducted (e.g. ADD, SUB, AND, OR ...); the
second field, if existing, contains operand(s), which correspond to
the data to be processed by the instruction.
• While the MC68000 is internally 32 bits wide, it has only a 16-bit
data bus and thus fetches/sends only 16 bits at a time from/to the
memory. So, by definition, in the MC68000, a word is 16 bits (2 bytes)
wide, and a long word is 32 bits (4 bytes) wide.
2.2 Data register model
• Data registers are used to hold temporary or intermediate results
during a calculation, so that these can be accessed much faster than
writing to memory then reading back.
• In addition, data registers can be used to hold frequently-used
operands thereby reducing the number of accesses to the main
memory. This can significantly speed up the execution.
• In the MC68000, three types of operation – byte (.B), word (.W)
and long word (.L) – can be applied to a data register, affecting
its lower 8 bits, its lower 16 bits and all 32 bits, respectively; the
remaining bits are not affected.
• The result of operation will set the CCR.
• The MC68000 data register model:
D00 to D31 are used to denote the individual bits in a data register.
[Figure: Data registers D0 to D7, each 32 bits wide (bits D31 to D00).]
Byte (.B): bits D07 to D00
Word (.W): bits D15 to D00
Long word (.L): bits D31 to D00

• Operations on bytes, words and long words are differentiated with
the qualifiers .B, .W and .L, respectively, in the 68000 instructions.
2.3 Address register model
• Primarily, address registers are used to hold the addresses of operands
to be accessed in memory. Accessing an operand via an address
register can be a single-word instruction, e.g.
MOVE (A5), D3 has a machine instruction $3615
in comparison to the two-word instruction – the op-code in one
word followed by the address of the operand in the next, e.g.
MOVE $4000, D3 has a machine instruction $3638 4000
• This reduces the memory accesses for fetching the instructions and
hence speeds up the CPU operation.
• The reason for a shorter instruction is that there are only 8 address
registers in MC68k, which can thus be encoded using only 3 bits.
• Besides, address registers can be used as data registers to hold data.
• The following shows the MC68000’s address register model when
used for addressing memory and for storing data, respectively.
• Used for addressing the memory. When an address register is
used to address memory, only the lower 24 bits A00~A23 take
effect; the higher 8 bits A24~A31 are discarded. Further, the least
significant bit A00 is always assumed to be zero, therefore
addresses are always even and as such words are addressed. This is
implemented by connecting only bits A01~A23 of the address
register to the address bus. The total addressing capacity:
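0 to 2^24 – 2 ($000000 to $FFFFFE) = 16 MB
[Figure: Address registers A0 to A7 used for addressing memory. Bits A31 to A24 are not used; bits A23 to A00 form the 24-bit address, with A00 (implied) = 0.]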
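As a rough illustration of the 24-bit addressing described above, the following Python sketch models only the discarding of bits A24~A31 and the size of the address range. It is not 68000 code, and ADDRESS_MASK and bus_address are hypothetical names introduced here.

```python
# Only bits A00..A23 of a 32-bit address register take effect when addressing memory.
ADDRESS_MASK = 0x00FFFFFF

def bus_address(register_value):
    """Return the 24-bit address obtained from a 32-bit address register value."""
    return register_value & ADDRESS_MASK

# Highest even (word) address in a 24-bit address space: 2^24 - 2.
highest_word_address = (1 << 24) - 2

# The full 24-bit space covers 2^24 bytes = 16 MB.
address_space_mb = (1 << 24) // (1024 * 1024)
```

For example, a register holding $12004000 addresses the same location as one holding $00004000, because the upper 8 bits are ignored.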


• Used as data registers. When used for storing data, there are two
major differences between the address registers and data registers.
– Only word and long word operations are available for address
registers; no instructions operate on the lower order byte of an
address register.
– Operations on address registers will not set the CCR.

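[Figure: Address registers A0 to A7 used for storing data, each 32 bits wide (bits A31 to A00).]
Word (.W): bits A15 to A00
Long word (.L): bits A31 to A00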


2.4 Program counter (PC) model
• PC holds the address in memory of the next instruction to be
executed. As such, it has the characteristics of the address register
– for addressing memory – only the lower 24 bits are used.
• The least significant bit of the PC is always zero, so instructions
always start on a word boundary (with even addresses).
PC (24-bit address): 0000 0000 0000 0100 0000 0000 (the least significant bit is always zero)
Address   Instruction in memory
$400      3A3C
$402      5858
$404      DA78
2.5 Condition code register
• The CCR is an 8-bit register, included as part of the processor’s
status register. The definition of CCR, and its relationship with the
ALU are shown as follows.
• The CCR contains five flag bits, which are set by the ALU, to hold
information about the result of an arithmetic or logical operation
instruction that has just been executed.

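[Figure: The ALU takes the operands, produces the result, and sets the flags in the CCR.]
Status register: bits 15 to 8 = system byte; bits 7 to 0 = user byte (CCR)
CCR flags (bit 4 down to bit 0): X N Z V C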
• Flags in CCR and their meanings:
– Carry C, bit 0. Set to 1 if an add operation produces a carry or
a subtract operation produces a borrow; otherwise cleared to 0.
– Overflow V, bit 1. Useful only during operations on signed
integers. Set to 1 if the addition of two like-signed numbers (or
the subtraction of two opposite-signed numbers) produces a
result that exceeds the 2’s complement range of the operand;
otherwise cleared to 0.
– Zero Z, bit 2. Set to 1 if the result is 0, otherwise cleared to 0.
– Negative N, bit 3. Meaningful only in signed number operations.
Set to 1 if a negative result is produced, otherwise cleared to 0.
The N flag follows the MSB of an 8-, 16- or 32-bit operand.
– Extend X, bit 4. This bit functions as a carry for multiple
precision operations.
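The flag definitions above can be made concrete with a small model of an addition. The following Python sketch is an illustration only, not 68000 code; add_with_flags is a hypothetical helper, and it assumes the usual behaviour for ADD in which X receives the same value as C.

```python
def add_with_flags(a, b, bits):
    """Add two operands of the given width and return (result, flags)."""
    mask = (1 << bits) - 1          # e.g. 0xFF for a byte
    sign = 1 << (bits - 1)          # position of the most significant bit
    a &= mask
    b &= mask
    total = a + b
    result = total & mask
    carry = int(total > mask)       # a carry out of the most significant bit
    # Overflow: both operands have the same sign and the result's sign differs.
    overflow = int(((a ^ result) & (b ^ result) & sign) != 0)
    flags = {
        "X": carry,                 # assumed to mirror C for an addition
        "N": int((result & sign) != 0),
        "Z": int(result == 0),
        "V": overflow,
        "C": carry,
    }
    return result, flags

# $47 + $62 = $A9 as bytes: two positive numbers give a "negative" result, so V = 1.
print(add_with_flags(0x47, 0x62, 8))
```

Note how the same bit pattern is judged twice: C reports the unsigned view of the result, while V and N report the signed (2's complement) view.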
3. Assembly Program – An Introduction
3.1 Types of code
• An assembly language program generally consists of two types of
code: assembly directives and executable statements.
• An executable statement is an instruction, written in mnemonic
form, that the assembler will translate into machine code. E.g.
MOVE.W D5, $4004 * move contents of D5 into memory
* location $4004
• An assembly directive is a statement, like a data declaration, which
tells the assembler something it needs to know when it assembles the
program. Assembly directives are not part of the assembly-language
instructions and will not be translated into executable code. E.g.
DATA DC.W $1234 * set up a constant $1234 in memory
* at a location named DATA
3.2 Qualifiers
• In the MC68000, the qualifiers .B, .W and .L may be used in
association with an assembly instruction/directive to indicate that the
operation or directive is applied to bytes (.B, 8 bits), words (.W, 16
bits) and long words (.L, 32 bits), respectively. In the above, for
example, we see MOVE.W and DC.W.
• Examples: Effects of the qualifiers (all numbers are in hex format)
– ADD.B D0, D1 (D1 ← D1 + D0)
Pre:  D0 = 5555 5582    Post: D0 = 5555 5582
      D1 = 3333 3328          D1 = 3333 33AA
– MOVE.W #0, D1 (D1 ← 0)
Pre:  D1 = FFFF FFFF    Post: D1 = FFFF 0000
• In MC68000, if you do not specify a qualifier, it is assumed to be
.W, i.e. for 16-bit word. So MOVE is equivalent to MOVE.W.
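The effect of the qualifiers on a 32-bit data register can be sketched as a masking operation. The Python below is a model for illustration, not 68000 code; SIZE_MASK, write_sized and add_sized are hypothetical names, and condition codes are deliberately left out.

```python
# Bits of the destination register affected by each qualifier.
SIZE_MASK = {"B": 0xFF, "W": 0xFFFF, "L": 0xFFFFFFFF}

def write_sized(register, value, size):
    """Replace only the low byte, word or long word of a 32-bit register."""
    mask = SIZE_MASK[size]
    return (register & ~mask & 0xFFFFFFFF) | (value & mask)

def add_sized(source, dest, size):
    """Model ADD.<size> source, dest: only the selected part of dest changes."""
    mask = SIZE_MASK[size]
    return write_sized(dest, (dest & mask) + (source & mask), size)

# ADD.B D0, D1 with D0 = $55555582 and D1 = $33333328
d1 = add_sized(0x55555582, 0x33333328, "B")
print(f"{d1:08X}")

# MOVE.W #0, D1 with D1 = $FFFFFFFF
d1 = write_sized(0xFFFFFFFF, 0, "W")
print(f"{d1:08X}")
```

The upper bits of the destination pass through unchanged, which is the behaviour shown in the Pre/Post examples.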
3.3 Program structure
• A good way to organize a program is to follow the pattern of many
high level languages: data declarations first followed by executable
code. For example (in Java):
class Box {
// Data declaration
double width; double height; double depth;
// Executable code: compute and return volume
double volume() { return width* height* depth; }
}
• This organization places all data declarations together and makes it
easy to look up a declaration when necessary. It also allows the
programmer and other readers of the program to review the data that
will be processed by the program before they begin tracing the
algorithm.
• Example: The following shows a program which adds two 16-bit
values $1234, $4321, stored in memory cells named DATA and
NEXT respectively, and then outputs the result to a memory cell
named ANSWER:
* Data declaration
        ORG    $4000        * base address for data
DATA    DC.W   $1234        * declare a word constant
NEXT    DC.W   $4321
ANSWER  DC.W   0
* Executable instructions
        ORG    $400         * base address for instructions
        MOVE.W DATA, D5     * contents of DATA to D5
        ADD.W  NEXT, D5     * add contents of NEXT to D5
        MOVE.W D5, ANSWER   * contents of D5 to ANSWER
        MOVE.B #9, D0       * exit from program
        TRAP   #15
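To see what this program computes, its three data-handling instructions can be traced by hand or with a small model. The Python sketch below is such a trace, not 68000 code; the dictionary memory and the variable d5 are stand-ins chosen here for the word locations and the low word of D5.

```python
# Word contents of the data area set up by the DC.W directives.
DATA, NEXT, ANSWER = 0x4000, 0x4002, 0x4004
memory = {DATA: 0x1234, NEXT: 0x4321, ANSWER: 0x0000}

d5 = memory[DATA]                      # MOVE.W DATA, D5
d5 = (d5 + memory[NEXT]) & 0xFFFF      # ADD.W NEXT, D5 (result kept to 16 bits)
memory[ANSWER] = d5                    # MOVE.W D5, ANSWER

print(f"ANSWER = ${memory[ANSWER]:04X}")
```

After the trace, the word at ANSWER holds $1234 + $4321 = $5555.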
3.4 The assembly process – source code, object code and list file
• The above program is referred to as source code. It is typically
typed into a mainframe or host computer using an editor. The cross
assembler is then run, which takes the source code as input and, if
there are no errors, translates it into object code (in binary format).
The object code file for this source code may look something like
4000123443210000
3A384000DA784002
31C54004103C00094E4F
The object code is not human-readable. It contains the code to be
executed by the machine, along with information about where in
memory the code is to be loaded.
• One task the assembler must carry out is to work out the storage
location for each of the declared data and executable instructions.
This information is provided in the list file, produced by the
assembler after assembling the source code.
• A list file may look like:
Address  Contents (object code)  Source code
                                 ORG $4000
4000     1234                    DATA DC.W $1234
4002     4321                    NEXT DC.W $4321
4004     0000                    ANSWER DC.W 0
                                 ORG $400
400      3A38 4000               MOVE.W DATA, D5
404      DA78 4002               ADD.W NEXT, D5
408      31C5 4004               MOVE.W D5, ANSWER
40C      103C 0009               MOVE.B #9, D0
410      4E4F                    TRAP #15
• The list file is important as it shows us: 1) a readable version of the
object code, 2) the address in memory each instruction is to be
loaded, and 3) the values of all user defined symbols, e.g. DATA is
translated into the value $4000.
3.5 The TRAP instructions
• TRAP instructions are used specifically for transferring control of the
processor from the executing program to the operating system. There
are 16 TRAP instructions, numbered TRAP #0, …, TRAP #15.
• In the MC68K Simulator, the instruction TRAP #15 is used to
handle input from the keyboard and output to the screen. The
function is specified by a task number, which should first be written
to D0.B. For example:
MOVE.B #5, D0 * task number #5 into D0.B
TRAP #15 * read a char from keyboard and store it in D1.B
MOVE.B #6, D0 * task number #6 into D0.B
TRAP #15 * convert D1.B to ASCII and print on screen
MOVE.B #9, D0 * task number #9 into D0.B
TRAP #15 * exit the program and return to the monitor
4. Statement Structure and Assembly Directives
4.1 Introduction
• In general, an assembly statement (i.e. instruction or assembly
directive) can have four components:
Label Operation/Directive-mnemonic Operands Comment
any of these may be missing from a particular statement. When they
occur they must occur in the order listed above.
• As described earlier, an operation/directive-mnemonic may be
associated with a qualifier .B, .W or .L, to indicate the size of data
the operation or directive acts on.
• In the following we describe each of these components in turn, with
an emphasis on the assembly directives. We focus on the
MC68000 assembly-language vocabulary.
Format : Label Operation/Directive-mnemonic Operands Comment
4.2 Label
• A label is a symbolic name given to an address such that a storage
location can be referred to by its label, rather than by its hex address.
In the previous example, we used labels DATA and NEXT for two
word locations:
                       Label  Address  Contents
DATA DC.W $1234        DATA   4000     1234
NEXT DC.W $4321        NEXT   4002     4321
and then addressed these locations by using
MOVE.W DATA, D5 instead of MOVE.W $4000, D5
ADD.W NEXT, D5 instead of ADD.W $4002, D5
• These replacements are important; otherwise, each time you
edit the program, e.g. insert or delete lines, you would have to
recalculate all addresses in hexadecimal. The assembler does this
calculation for you automatically when you use labels.
4.3 Operation mnemonic
• An operation mnemonic is a word or abbreviation given to a
machine code instruction, e.g.
MOVE.W
ADD.B
which describes the operation to be carried out. The operation
mnemonic will be translated into the op-code of an instruction by
the assembler.
• The complete set of operation mnemonics forms the instruction
set of the assembly language, which is a list of all available
machine-level instructions on the machine.
• To program in assembly language, one must know the instruction set
of the particular microprocessor. The instruction set is machine-
dependent, unlike high-level languages.
4.4 Directive mnemonic
• A directive mnemonic is a word or abbreviation used to direct the
assembler to do something as it assembles the program. As described
earlier, it is called an assembler directive.
• As noted, assembler directives are not part of the assembly-
language instructions and will not be translated into executable
code by the assembler. They are used for declaring data/variables
used in the program, and for specifying the base addresses in
memory for storing programs and data.
• Most commonly used assembler directives in MC68000:
ORG
DC
DS
EQU
4.4.1 ORG
• ORG tells the assembler where in memory the next section of
program is to be located. For example:
ORG $400                 Address  Contents
MOVE.W DATA, D5          400      3A38 4000
ADD.W NEXT, D5           404      DA78 4002
…...                     …...
tells the assembler that the program following ORG will be placed in
memory starting from address $400 – the user RAM area.
ORG $4000                Address  Contents
DATA DC.W $1234          4000     1234
NEXT DC.W $4321          4002     4321
…...                     …...
tells the assembler that the data $1234, $4321, … are stored from
$4000.
4.4.2 DC
• DC means define constant. It tells the assembler to set up one or
more constant data values in memory. For example,
DC.B $12 * set up the 8-bit value $12 in memory
DC.W $1234 * set up the 16-bit value $1234 in memory
DC.L $12345678 * set up the 32-bit value $12345678 in memory
• Used with ORG to allocate space in memory for setting up data. The
stored data may be addressed by their individual labels. E.g.
ORG $4000              Label  Address  Contents
DATA DC.W $1234        DATA   4000     1234
NEXT DC.W $4321        NEXT   4002     4321
……
A data area is allocated from address $4000; the address for the
word $4321 is NEXT, or $4002 (each word occupies 2 bytes).
• DC may be used to request the assembler to set up a list of data
values in contiguous addresses, for example:
ORG $4000
ARRAY DC.W $1234, $5678, $9ABC, $DEF3
sets up an array containing four word values, with respective
addresses $4000, $4002, $4004, $4006, i.e.
Address Contents
ARRAY 4000 1234
4002 5678
4004 9ABC
4006 DEF3
Label ARRAY gives the initial address of the array, i.e. $4000. So
the address of the second element in the array is: ARRAY + 2, ….
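The address arithmetic behind ARRAY + 2 generalises to any element. The following Python sketch works it out for the four-word array above; it is an illustration rather than 68000 code, and element_address and layout are hypothetical names.

```python
ARRAY = 0x4000       # value of the label ARRAY (set by ORG $4000)
WORD_SIZE = 2        # each DC.W value occupies 2 bytes

values = [0x1234, 0x5678, 0x9ABC, 0xDEF3]

def element_address(index):
    """Address of the word at the given zero-based index in the array."""
    return ARRAY + WORD_SIZE * index

# Map each address to the word stored there, as the assembler would lay it out.
layout = {element_address(i): v for i, v in enumerate(values)}

for address, value in layout.items():
    print(f"{address:04X}  {value:04X}")
```

With byte values (DC.B) the step would be 1, and with long words (DC.L) it would be 4.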
• DC may also be used to set up the ASCII value of characters, for
example:
DC.B 'A'
this is equivalent to DC.B $41 (ASCII code of A = $41);
DC.B 'HATFIELD'
this is equivalent to DC.B $48, $41, $54, $46, etc.
and also to DC.B 'H', 'A', 'T', 'F', etc.
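The byte values that a character string stands for can be listed with a few lines of Python. This is only a lookup of ASCII codes, not 68000 code, and dc_b_string is a hypothetical helper name.

```python
def dc_b_string(text):
    """Return the list of ASCII byte values that a DC.B string would set up."""
    return list(text.encode("ascii"))

codes = dc_b_string("HATFIELD")
print(", ".join(f"${c:02X}" for c in codes))
```

The string occupies one byte per character, so 'HATFIELD' takes eight consecutive bytes in memory.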
• General syntax for DC:
Label DC.qualifier value [, value, value, ..., value]
where Label – address to access the ‘value(s)’
.qualifier = .B, .W or .L
value = value to be set up in memory
4.4.3 DS
• DS means define space. It tells the assembler to reserve memory
locations to hold data when the program runs, for example, input
data or intermediate results. While DC is used to define space for
constants (e.g. 3.14159, ‘HATFIELD’), DS is used to define space
for variables (e.g. x, y, z, …) used in the program.
• General syntax for DS:
Label DS.qualifier num
Label - address to access the storage
.qualifier = .B, .W or .L; num = amount of storage to be reserved
• For example:
Label DS.B 4 * reserve four bytes
Label DS.W 10 * reserve ten (decimal) words
Label DS.L $10 * reserve sixteen (decimal) long words
• Example: Reserve space for two byte variables, two word variables,
and an array variable of ten words:
ORG $4000
BVAR1 DS.B 1 * reserve the 1st byte
SKIP1 DS.B 1 * Ignore this byte so that the next
* byte starts at an even address
BVAR2 DS.B 1 * reserve the 2nd byte
SKIP2 DS.B 1 * same reason as above
WVAR1 DS.W 1 * reserve the 1st word
WVAR2 DS.W 1 * reserve the 2nd word
ARRAY DS.W 10 * reserve ARRAY[0..9] of words
• The reserved variables may be addressed by their individual labels.
• The initial contents of the reserved locations are unspecified (i.e.
can be any value within the range of representation).
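The addresses that the labels in this example receive can be worked out by keeping a running location counter, which is essentially what the assembler does. The Python sketch below is a simplified model under the assumption that reservations are packed one after another with no automatic padding; allocate and UNIT_BYTES are names invented for the illustration.

```python
UNIT_BYTES = {"B": 1, "W": 2, "L": 4}   # bytes reserved per unit for each qualifier

def allocate(origin, reservations):
    """Assign an address to each label; return the symbol table and the next free address."""
    symbols = {}
    location = origin
    for label, size, count in reservations:
        symbols[label] = location
        location += UNIT_BYTES[size] * count
    return symbols, location

symbols, next_free = allocate(0x4000, [
    ("BVAR1", "B", 1),
    ("SKIP1", "B", 1),
    ("BVAR2", "B", 1),
    ("SKIP2", "B", 1),
    ("WVAR1", "W", 1),
    ("WVAR2", "W", 1),
    ("ARRAY", "W", 10),
])

for label, address in symbols.items():
    print(f"{label:6s} {address:04X}")
```

The two SKIP bytes keep every following label on an even address, so WVAR1 lands at $4004 and ARRAY at $4008.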
4.4.4 EQU
• EQU allows the programmer to assign a symbolic name to a
numerical value. This symbolic name can be used instead of that
value in later program text. For example,
LENGTH EQU $8
MASK EQU $000F
DEVICE EQU $800000
• When the assembler encounters the symbol in the source code, it
replaces it with the actual value, thus
MOVE #LENGTH, D1
can be used instead of
MOVE #$8, D1
• The aim is to use names in the source code to make it easier to read
and to locate in one place those items which may need to be changed
if a variation of the program is required.
4.5 Operands
• Operands are what the operations or directives act on. In the
example:
DATA DC.W $1234
the operand is $1234. This operand provides a value for the
assembler to place in the storage created and named DATA.
• Addition, subtraction, logical AND, OR, etc. are dual-operand
operations requiring two operands, e.g.
ADD.W DATA, D5
where the two operands are the memory address labeled DATA and
the data register D5, respectively.
• In an assembly language, the operands may be specified in three
different ways:
– Giving the actual data (called the immediate data, e.g. a hex
value $1234);
– Giving the register which holds the data, e.g. D5;
– Giving the memory address where the data is stored, e.g. DATA.

• Instructions generally have one, two, or zero operands. There are a
few special instructions that have three operands. Assembler
directives can have many operands, often in the form of a comma
delimited list, as shown previously for DC, for example.
4.6 Comments
• Comment fields start with an asterisk (*). They are often placed at the
end of a line and consist of a brief description of one or more lines of
code. It is possible to have a line that is all comment. For example:
SUB.W D1, D0 * subtract D1 from D0 (16-bit word)
* leaving the result in D0 (lower order word)
……
• A well-commented program makes it easier for both the programmer
and reader to understand, debug, modify, and expand in the future.
5. Instruction Set and Addressing Modes
5.1 Introduction
• As described previously, a machine instruction is usually encoded into
two fields: op-code and operand(s). The op-code specifies the type of
operation (e.g. arithmetic, logical, etc.) as well as the addressing
mode and instruction length, and the operand(s), if required,
correspond to the data to be operated upon.
• The instruction set is a collection of all available machine-level
operations, expressed in appropriate mnemonics, which we use to
compose the assembly program.
• The operands, corresponding to the data to be processed, may be
encoded in different ways, typically as actual data, as registers
which hold data, or as memory addresses where data is stored. Each
of these represents a distinct method, referred to as an addressing
mode, used to specify the operands.
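Example: encoding of the instruction MOVE.W #$5858, D5
Op-code word: 0011 1010 0011 1100 ($3A3C)
  0011      move a word
  101       destination register (D5)
  000       destination mode (data register direct)
  111 100   source mode (immediate)
Operand word: 0101 1000 0101 1000 ($5858)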
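The field breakdown of a MOVE op-code word can be reproduced by shifting and masking. The Python sketch below is an illustrative decoder for MOVE op-code words only, not a general disassembler; decode_move and SIZE_NAMES are hypothetical names, and the field positions assumed are: size in bits 13-12, destination register in bits 11-9, destination mode in bits 8-6, source mode in bits 5-3 and source register in bits 2-0.

```python
# Size field of a MOVE op-code word (bits 13-12).
SIZE_NAMES = {0b01: "B", 0b11: "W", 0b10: "L"}

def decode_move(opcode):
    """Split a 16-bit MOVE op-code word into its bit fields."""
    return {
        "size": SIZE_NAMES[(opcode >> 12) & 0b11],
        "dest_reg": (opcode >> 9) & 0b111,
        "dest_mode": (opcode >> 6) & 0b111,
        "src_mode": (opcode >> 3) & 0b111,
        "src_reg": opcode & 0b111,
    }

# MOVE.W #$5858, D5 has the op-code word $3A3C.
fields = decode_move(0x3A3C)
print(fields)
```

For $3A3C this gives a word-sized move into register 5 with destination mode 000, and the source field 111 100, which is how immediate data is indicated.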

• The aim of using different addressing modes is to make data access
flexible and hence the execution of the program efficient.
• It is impossible to present instructions without using examples of
addressing modes. They are virtually inseparable. We will introduce
instructions and some simple addressing modes interspersed so we
can learn both as we go along.
5.2 Simple addressing modes
• Consider instructions involving two operands, which have a general
format:
Operation <source>, <destination>
where source refers to the source operand and destination refers to
the destination operand.
• We use a term, effective address - ea, as a general expression for
the address of an operand. The ea of an operand may be just the
data to be processed, or a register or a memory location where the
data is held, depending on the specific addressing mode.
• The MC68000 has 12 addressing modes, each providing a means of
accessing the operands. We first take a look at some fundamental
ones.
5.2.1 Register direct addressing
• Including data register direct and address register direct: the ea
of an operand is one of the eight data registers Dn, n = 0 ~ 7, or one
of the eight address registers An, n = 0 ~ 7, respectively, in which
the operand is held.
• Example: Data register direct addressing
MOVE.L D1, D3 * operand ea’s are D1 and D3
Operation: D3 ← D1
Effect: e.g.
Pre:  D1 = 1234 5678    Post: D1 = 1234 5678
      D3 = 0000 0000          D3 = 1234 5678
Note the effect of qualifier .L.


• Example: Address register direct addressing
ADD.W A1, D3 * source ea is A1
Operation: D3 ← D3 + A1
Effect: e.g.
Pre:  A1 = 3622 3456    Post: A1 = 3622 3456
      D3 = 0000 8888          D3 = 0000 BCDE
Note the effect of qualifier .W.


5.2.2 Immediate addressing
• In this mode the operand is an actual value which is encoded into
the instruction itself, thus the operand is “immediately” available,
no need to fetch it from another register or memory.
• Immediate addressing is indicated to the assembler by preceding
the operand with the symbol ‘#’.
• Example:
MOVE.W #$1200, D5 * source ea is a numeric value
or MOVE.W #LABEL, D5 * if e.g. LABEL EQU $1200
Operation: D5  $1200
Instruction: 3A3C 1200 ($1200 is embedded in the instruction)
Effect: e.g.

Pre: D5 F F F F 0 0 0 0 Post: D5 F F F F 1 2 0 0
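• Sketch (Python): how the operation word 3A3C could be assembled from the bit fields of the MOVE instruction (size, destination register and mode, source mode and register). The function name encode_move and the constant names are illustrative assumptions. Extension words such as the immediate data $1200 are not generated.

```python
# MOVE operation word layout: 00 | size | dst reg | dst mode | src mode | src reg
SIZE_BITS = {"B": 0b01, "W": 0b11, "L": 0b10}
MODE_DATA_REG = 0b000     # Dn: the register field holds n
MODE_EXTENDED = 0b111     # the register field selects the sub-mode
REG_ABS_SHORT = 0b000     # a 16-bit absolute address follows
REG_IMMEDIATE = 0b100     # immediate data follows

def encode_move(size, src_mode, src_reg, dst_mode, dst_reg):
    """Build the 16-bit MOVE operation word (extension words not included)."""
    return ((SIZE_BITS[size] << 12) | (dst_reg << 9) | (dst_mode << 6)
            | (src_mode << 3) | src_reg)

# MOVE.W #$1200, D5
opword = encode_move("W", MODE_EXTENDED, REG_IMMEDIATE, MODE_DATA_REG, 5)
print(f"{opword:04X} 1200")   # 3A3C 1200
```

Changing only the source register field from immediate to absolute short gives 3A38, which is the first word of the machine-code program shown in the introduction.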
• Example: Add two numbers using immediate addressing
NUM EQU $47
ORG $400
CLR.W D1 * D1 ← 0: clear the low-order word of D1
MOVE.B #NUM, D1 * put $47 into D1.B
MOVE.B #$62, D2 * put $62 into D2.B
ADD.B D2, D1 * D1.B ← D1.B + D2.B
* so D1.B = $47 + $62 = $A9
MOVE.B #9, D0 * exit from program
TRAP #15
5.2.3 Absolute addressing
• This mode refers to the actual memory location where the operand
is stored. For example,
MOVE.W $4000, D0 * source ea is a location in memory
SUB.W D0, $4002 * dest. ea is a location in memory
• In a real program, the address would normally be referred to by a
label (see previous discussions for label, DC, DS).
• Example:
SUB.W D0, DATA * dest. ea is a memory cell
* at location DATA
Operation: Mem[DATA] ← Mem[DATA] – D0. Effect: e.g.
Memory Memory
Pre: DATA 4 4 A B Post: DATA 3 2 7 7
D0 0 0 0 0 1 2 3 4 D0 0 0 0 0 1 2 3 4
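• Sketch (Python): absolute addressing modelled with a byte-addressed, big-endian memory. The address chosen for DATA ($4000) and the helper names read_word and write_word are assumptions made for illustration.

```python
memory = bytearray(0x10000)            # 64 KB of zeroed memory (assumption)

def read_word(addr):
    """Read a 16-bit word, most significant byte first."""
    return int.from_bytes(memory[addr:addr + 2], "big")

def write_word(addr, value):
    """Write a 16-bit word, discarding any borrow or carry out of bit 15."""
    memory[addr:addr + 2] = (value & 0xFFFF).to_bytes(2, "big")

DATA = 0x4000                          # hypothetical address of the label DATA
d0 = 0x00001234
write_word(DATA, 0x44AB)

# SUB.W D0, DATA : Mem[DATA] <- Mem[DATA] - D0 (low-order word of D0 only)
write_word(DATA, read_word(DATA) - (d0 & 0xFFFF))
print(f"{read_word(DATA):04X}")        # 3277
```

D0 is a source operand here, so it keeps its value; only the memory word at DATA changes.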
5.3 Example summarizing simple addressing modes
• The following shows some possible addressing modes for the
function ADD <ea>, D3.
NUM EQU $1234
ORG $4000
DATA DC.W $1234
ORG $400
ADD.W D0, D3 * data register direct
ADD.W A0, D3 * address register direct
ADD.W DATA, D3 * absolute: refers to a specific memory location
ADD.W $4002, D3 * absolute: refers to a specific memory location
ADD.W #NUM, D3 * immediate (word): refers to an actual value
ADD.L #$12345678, D3 * immediate (long): refers to an actual value
MOVE.B #9, D0 * exit from program
TRAP #15
5.4 Instruction set summary
• The MC68000’s instructions may be classified into four main types
– Data movement instructions
move data between memory locations, general-purpose
registers, e.g. MOVE, LEA
– Arithmetic and logical instructions
perform arithmetic or logical operations on binary numbers,
memory locations and registers, e.g. ADD, SUB, AND, OR
– Program control instructions
perform branches, jumps and subroutine calls to control the
sequence of program execution, e.g. JMP, JSR, BRA, BEQ etc.
– System control instructions
instructions which call system routines for handling exceptions,
privileging interrupt requests, and transferring control, etc., e.g.
TRAP
• The above classification is also applicable to the instruction sets of
many other microprocessors.
• For a full description of the above instructions, see the Practical
Booklet, in particular the table called Addressing Modes.
– Columns Mnemonic (instruction) and Boolean (operation) are
the most relevant.
– Use these two columns to investigate MC68000 instruction set.
– For conditional branch instructions Bcc (branch on condition
cc), see table Conditional Tests, which gives the available
condition-codes cc. Branches will be discussed in the next
chapter: program flow control.
• In the following we investigate some examples, to show how the
instruction set could be studied.
5.5 Examples of instructions
• Data movement instruction
Mnemonic: MOVE
Qualifiers: .B, .W, .L
Operation: destination operand ← source operand
Permissible Addressing modes:
Source      Destination
register    register
register    memory
memory      register
immediate   register
immediate   memory
memory      memory
or in short: Source <ea>, Destination <ea>, where an ea may be a
register, a memory location or immediate data.
• Rather than attempting to memorize a table like this for each
instruction, learn the following two general principles about
addressing.
• Principle 1: The immediate mode cannot be used for the
destination operand. Immediate mode creates a constant value
encoded in one of the operand fields of the instruction. It does not
make sense for this to be the place where a result is stored.
• Principle 2: In dual-operand operations the two operands cannot
both be memory references. This follows from the organization of the
MC68000 CPU. The two data paths to the ALU come from the
bank of data registers and a temporary register (MBR). One
memory operand can be placed in the MBR but the other must
come from a register or be an immediate operand.
• The MOVE instruction looks like an exception to this last principle
(see last line of table) but if you think about it the MOVE
instruction is not really a dual-operand operation. It retrieves one
value from one location and stores it in another. This one value can
be retrieved from memory, placed in the MBR, passed through the
ALU and stored back in a different memory location.
• Example: Moving a byte, word and long word from memory.
Memory (starting at DATA): 1 1 2 2
                           3 3 4 4
Initially:           D0 0 0 0 0 0 0 0 0
MOVE.B DATA, D0      D0 0 0 0 0 0 0 1 1
MOVE.W DATA, D0      D0 0 0 0 0 1 1 2 2
MOVE.L DATA, D0      D0 1 1 2 2 3 3 4 4
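• Sketch (Python): the three moves above, modelled by reading 1, 2 or 4 bytes from the start of DATA in big-endian order. The function name load is an illustrative assumption.

```python
memory = bytes.fromhex("11223344")     # bytes stored from address DATA upward

def load(size_bytes, d0):
    """Model MOVE.<size> DATA, D0: only the low size_bytes of D0 change."""
    value = int.from_bytes(memory[:size_bytes], "big")
    mask = (1 << (8 * size_bytes)) - 1
    return (d0 & ~mask & 0xFFFFFFFF) | value

results = [load(n, 0x00000000) for n in (1, 2, 4)]
print([f"{r:08X}" for r in results])   # ['00000011', '00001122', '11223344']
```

The byte at the lowest address is the most significant one, so a byte move from DATA picks up $11 rather than $22 or $44.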

• Reminder: In the 68000 there are no instructions which can
manipulate the lower-order byte of an address register. Only word
and long word operations are available for address registers.
• Load effective address instruction
Mnemonic: LEA
Operation: a specific An ← effective address of an operand
Permissible Addressing modes:
Source Destination
<ea> address register
• The instruction LEA is used to load the effective address of an
operand into an address register.
• Example: LEA DATA, A1
Memory Memory
Pre: DATA 4000 1 2 3 4 Post: DATA 4000 1 2 3 4

A1 0 0 0 0 0 0 0 0 A1 0 0 0 0 4 0 0 0
• Note that the address, not the contents of that address, is
loaded into the address register, unlike: MOVE.W DATA, A1.
• Arithmetic Add, Subtract Instructions
Mnemonics: ADD, SUB
Qualifiers: .B, .W, .L
Operation:
ADD: destination operand ← destination operand + source operand
SUB: destination operand ← destination operand – source operand
Permissible Addressing modes:
Source Destination
Dn <ea>
<ea> Dn
• The above indicates that: ADD <ea>, <ea> with both ea’s being
memory references is not allowed; (of course the immediate data
#d, d denoting a numeric value, can never be the destination’s ea).
• Several variations, e.g. ADDA <ea>, An; ADDI #d, <ea>;
ADDQ #d, <ea>, make the ea's more specific (A: address
register, I: immediate, Q: quick).
• Example: a sample program to evaluate ([A]+[B])–[C]
A EQU $4000 * assume Mem[$4000] contains $55
B EQU $4002 * assume Mem[$4002] contains $66
C EQU $4004 * assume Mem[$4004] contains $11
ORG $400
MOVE.B A, D1 * D1.B ← $55
ADD.B B, D1 * D1.B ← D1.B + $66 = $BB
MOVE.B C, D0 * D0.B ← $11
SUB.B D0, D1 * D1.B ← D1.B – D0.B = $BB – $11 = $AA
MOVE.B #3, D0 * print value in D1 on screen in decimal form
TRAP #15 * (see Practical Booklet for TRAP instruction)
MOVE.B #9, D0 * exit from program
TRAP #15
• Example: For greater range, use multiple-word arithmetic
– If D0 and D1 contain one 64-bit integer and D2 and D3 contain
another 64-bit integer, they can be added using a pair of
instructions in sequence:
ADD.L D1, D3 * add lower-order 32-bit words
ADDX.L D0, D2 * add higher-order 32-bit words
– The first add operation: D3 ← D3 + D1, and may produce a
carry into the extend flag X in the CCR (Ch3, S4.2.4).
– ADDX (ADD with extend) includes the flag X in the addition,
operation: D2 ← D2 + D0 + X.
          ADDX.L (higher order)      ADD.L (lower order)
          D0 5 5 5 5 5 5 5 5         D1 8 8 8 8 8 8 8 8
        + D2 6 6 6 6 9 9 9 9       + D3 9 0 0 0 0 0 0 0
        + X                1       (carry out of bit 31, so X = 1)
Result:   D2 B B B B E E E F         D3 1 8 8 8 8 8 8 8
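• Sketch (Python): the ADD.L / ADDX.L pair modelled with 32-bit masking, using the register values of the example. The function name add_l is an illustrative assumption, and only the carry into X is modelled, not the other condition codes.

```python
MASK32 = 0xFFFFFFFF

def add_l(dest, src, x_in=0):
    """Return (32-bit result, carry out); x_in is 0 for ADD and X for ADDX."""
    total = dest + src + x_in
    return total & MASK32, total >> 32

d0, d1 = 0x55555555, 0x88888888        # first 64-bit operand: high, low
d2, d3 = 0x66669999, 0x90000000        # second 64-bit operand: high, low

d3, x = add_l(d3, d1)                  # ADD.L  D1, D3  (sets X = 1 here)
d2, x = add_l(d2, d0, x)               # ADDX.L D0, D2  (adds the carry in X)
print(f"{d2:08X} {d3:08X}")            # BBBBEEEF 18888888
```

The order matters: the lower-order words must be added first so that X holds their carry when ADDX.L runs.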
• Logical AND, OR instructions
Mnemonics: AND, OR
Qualifiers: .B, .W, .L
Operation:
AND: dest. operand ← dest. operand AND source operand
OR: dest. operand ← dest. operand OR source operand
Permissible Addressing modes:
Source Destination
Dn <ea>
<ea> Dn
• Variations: ANDI, ORI
Permissible Addressing modes:
Source Destination
#d <ea>
d is a numeric value, may be byte, word or long word, depending
on the specific operation.
• Example (mask): ANDI.W #$000F, D0
Bits:     31 ... 16   15 ... 0
Pre:  D0  anything    1101 0010 1101 1010
The instruction forms the logical AND, bit by bit, with
                      0000 0000 0000 1111
Post: D0  anything    0000 0000 0000 1010

• Used in a program, a symbolic name for the mask might be helpful, e.g.
MASK EQU $000F
.
.
ANDI.W #MASK, D0
.
.
Note that the # sign must be used to indicate that the symbol
MASK is an immediate value, rather than a memory address.
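• Sketch (Python): the masking operation modelled on a 32-bit register value. The function name andi_w is an illustrative assumption, and the upper word $ABCD is an arbitrary choice standing for "anything".

```python
MASK = 0x000F                          # same role as MASK EQU $000F

def andi_w(d0, imm):
    """Model ANDI.W #imm, D0: AND the low-order word, keep the upper word."""
    return (d0 & 0xFFFF0000) | (d0 & imm & 0xFFFF)

d0 = andi_w(0xABCDD2DA, MASK)          # low-order word is 1101 0010 1101 1010
print(f"{d0:08X}")                     # ABCD000A
```

Only the four least significant bits survive in the low-order word; the upper word is untouched because the operation size is .W.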
6. More Addressing Modes
6.1 Introduction
• We have seen that the address registers, A0 ~ A7, can be used for
holding data, the same as normal data registers. We now introduce
the use of address registers for addressing purposes.
• When used for addressing, the address register holds the memory
address of an operand:
– unlike the previously described address register direct
addressing, in which the address register holds the operand itself.
• This difference is indicated to the assembler by enclosing the address
register in round brackets, for example, (A1), means:
– an operand in memory whose address is held in A1.
• The ea of an operand can be loaded into An by using LEA <ea>, An.
• If you know C/C++, you can see that an address register is like a
pointer, pointing to a location in memory where the data is stored.
6.2 Address register indirect addressing
• The simplest mode is: the effective address of an operand is held in
one of the eight address registers.
• Example:
ADD.W (A1), D3 * source ea is (A1)
Operation: D3 ← D3 + Mem[A1]
Effect: e.g.                    Address Memory
Pre:  A1 0 0 0 0 4 0 0 0        4000    2 2 2 2
      D3 0 0 0 0 9 9 9 9
Post: A1 0 0 0 0 4 0 0 0
      D3 0 0 0 0 B B B B        (9999 + 2222 = BBBB)
• Four variations: address register indirect with auto-increment, auto-
decrement; with offset; and with index. These support variable
addresses or addresses decided at run-time.
6.2.1 Address register indirect with post-increment
• The address held in an address register is incremented
automatically after the operand has been accessed, to point to the
next operand. The increment would be 1 (byte), 2 (bytes) and 4
(bytes), respectively, for byte (.B), word (.W) and long word (.L)
operations.
• This is written as (An)+. For example:
MOVE.W (A3)+, D2
Operation: D2 ← Mem[A3], A3 ← A3 + 2
Effect: e.g. Address Memory

Pre: A3 0 0 0 0 4 0 0 0 4000 2 2 2 2
D2 0 0 0 0 0 0 0 0 4002 3 3 3 3

Post: A3 0 0 0 0 4 0 0 2
D2 0 0 0 0 2 2 2 2
• This post-increment facility is similar to that in C/C++/Java, and
is useful when a list of operands is to be accessed in sequence.
• Example: Suppose we have an array holding eight values 1, 2, 3, 4,
5, 6, 7, 8. A Java program which adds all elements of the array
could be written as:
class Array {
    public static void main(String[] args) {
        short[] array = {1, 2, 3, 4, 5, 6, 7, 8};
        short sum = 0;
        short index = 0;
        short count = 8;
        for (;;) {
            sum += array[index++];   // post-increment
            count--;
            if (count > 0) continue;
            else break;
        }
        System.out.println(sum);     // prints 36
    }
}
• The corresponding assembly program:
ORG $4000
ARRAY DC.W 1, 2, 3, 4, 5, 6, 7, 8 * the word array
SUM DS.W 1 * space for the sum
ORG $400
LEA ARRAY, A1 * A1 points to ARRAY
MOVE.B #8, D1 * set up the count
CLR.W D2 * clear D2 for the sum
LOOP ADD.W (A1)+, D2 * add array element to D2
SUB.B #1, D1 * decrement the count
BNE LOOP * back to LOOP if D1>0
MOVE.W D2, SUM * result into memory
MOVE.B #9, D0 * exit from program
TRAP #15
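• Sketch (Python): the loop above traced step by step, with (A1)+ modelled as "read, then advance the address by the operand size". The helper name read_postinc is an illustrative assumption, and condition codes other than the zero test used by BNE are ignored.

```python
ARRAY = 0x4000                         # matches ORG $4000 in the listing
memory = bytearray(0x10000)
for i, v in enumerate([1, 2, 3, 4, 5, 6, 7, 8]):
    memory[ARRAY + 2 * i:ARRAY + 2 * i + 2] = v.to_bytes(2, "big")

def read_postinc(a_reg, size_bytes):
    """Return (operand, updated address register) for an (An)+ access."""
    value = int.from_bytes(memory[a_reg:a_reg + size_bytes], "big")
    return value, a_reg + size_bytes

a1, d1, d2 = ARRAY, 8, 0               # LEA ARRAY,A1 / MOVE.B #8,D1 / CLR.W D2
while True:
    operand, a1 = read_postinc(a1, 2)  # ADD.W (A1)+, D2
    d2 = (d2 + operand) & 0xFFFF
    d1 = (d1 - 1) & 0xFF               # SUB.B #1, D1
    if d1 == 0:                        # BNE LOOP falls through when the result is zero
        break
print(d2, hex(a1))                     # 36 0x4010
```

After eight word accesses A1 has advanced by 16 bytes and points at SUM, the word that follows the array.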
6.2.2 Address register indirect with pre-decrement
• This is identical to the previous post-increment mode, except that
the address held in the address register is decremented before the
operand is accessed. Also, the decrement would be 1, 2 and 4
respectively, for byte, word and long word operations.
• This is written as -(An). For example:
MOVE.L -(A0), D3
Operation: A0 ← A0 - 4, D3 ← Mem[A0]
Effect: e.g. (the long word at 4000 is taken to be 0000 2222)
                                Address Memory
Pre:  A0 0 0 0 0 4 0 0 4        4000    0 0 0 0 2 2 2 2
      D3 0 0 0 0 0 0 0 0        4004    5 5 5 5
Post: A0 0 0 0 0 4 0 0 0
      D3 0 0 0 0 2 2 2 2
• Example: Combine pre-decrement and post-increment to
manipulate a data structure called a stack – last-in-first-out (LIFO).
ORG $4000
L1 DC.B $55, $00 * i.e. Mem[$4000] = $55
L2 DC.B $77, $00 * i.e. Mem[$4002] = $77
L3 DC.B $99 * i.e. Mem[$4004] = $99
ORG $400
LEA $8000, A0 * set up stack at $8000
MOVE.B L1, -(A0) * A0 = $7FFF, Mem[A0] ← $55
MOVE.B L2, -(A0) * A0 = $7FFE, Mem[A0] ← $77
MOVE.B L3, -(A0) * A0 = $7FFD, Mem[A0] ← $99
MOVE.B (A0)+, D3 * D3 ← Mem[A0] = $99, A0 = $7FFE
MOVE.B (A0)+, D4 * D4 ← Mem[A0] = $77, A0 = $7FFF
MOVE.B (A0)+, D5 * D5 ← Mem[A0] = $55, A0 = $8000
MOVE.B #9, D0 * exit from program
TRAP #15
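• Sketch (Python): the stack behaviour of the listing above, with a push modelled as "decrement, then store" and a pop as "load, then increment". The helper names push_byte and pop_byte are illustrative assumptions.

```python
memory = bytearray(0x10000)

def push_byte(a0, value):
    """Model MOVE.B value, -(A0): decrement first, then store."""
    a0 -= 1
    memory[a0] = value & 0xFF
    return a0

def pop_byte(a0):
    """Model MOVE.B (A0)+, Dn: load first, then increment."""
    return memory[a0], a0 + 1

a0 = 0x8000                            # LEA $8000, A0
for value in (0x55, 0x77, 0x99):       # contents of L1, L2, L3
    a0 = push_byte(a0, value)
lowest = a0                            # stack pointer after the three pushes

popped = []
for _ in range(3):
    value, a0 = pop_byte(a0)
    popped.append(value)
print([f"{v:02X}" for v in popped], hex(a0))   # ['99', '77', '55'] 0x8000
```

The values come back in the reverse of the order in which they were stored, which is the last-in-first-out property, and A0 returns to its starting value.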
6.2.3 Address register indirect with 16-bit offset
• The address of an operand is given by the content of the address
register plus a 16-bit signed offset, given as a constant value.
• Thus ea = (An + d16), written as d16(An), where d16 is a 16-bit
two’s complement value (in the range -32768 to +32767).
• Example: MOVE.W 6(A0), D1 * source ea = (A0 + 6)
or MOVE.W LABEL(A0), D1 * if e.g. LABEL EQU 6
Operation: D1 ← Mem[A0 + 6]
Effect: e.g.                    Address Memory
Pre:  A0 0 0 0 0 4 0 0 0        4000    2 2 2 2
      D1 0 0 0 0 0 0 0 0        4006    8 8 8 8   (offset 6 from 4000)
Post: A0 0 0 0 0 4 0 0 0
      D1 0 0 0 0 8 8 8 8
• Example: A contiguous storage of student records; each record has
80 bytes, including: name (30 bytes), DoB (8 bytes), level (2 bytes,
with the value in the higher-order byte) and address (40 bytes). Write
a program accessing the levels of all students (assume 500 students).
RECORD EQU $4000 * base address of records
OFFSET EQU 38 * point to level in a record
ORG $400
LEA RECORD, A0 * A0 points to 1st record
MOVE #500, D5 * set count to 500
LOOP MOVE.B OFFSET(A0), D1 * level into D1
MOVE.B #3, D0 * output on screen
TRAP #15
ADDA #80, A0 * A0 points to next record
SUB #1, D5 * decrement the count
BNE LOOP * back to LOOP if D5>0
MOVE.B #9, D0 * exit from program
TRAP #15
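• Sketch (Python): the addresses touched by OFFSET(A0) on successive passes of the loop above. The function name level_address is an illustrative assumption; the constants are those of the listing.

```python
RECORD, RECORD_SIZE, OFFSET = 0x4000, 80, 38   # OFFSET = name (30) + DoB (8)

def level_address(i):
    """Address accessed by OFFSET(A0) on loop pass i (0-based)."""
    a0 = RECORD + i * RECORD_SIZE      # ADDA #80, A0 applied i times
    return a0 + OFFSET

print([hex(level_address(i)) for i in range(3)])   # ['0x4026', '0x4076', '0x40c6']
```

The offset stays fixed while A0 steps through the records, so the same instruction reaches the same field in every record.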
6.2.4 Address register indirect with index and 8-bit offset*
• The address of an operand is given by the contents of two registers
(one must be an An and the other may be an An or a Dn) plus an
8-bit constant value which is represented in two’s complement
and is within the range -128 to +127.
• The address is written as: d8(An, Rm), where d8 is an 8-bit signed
value, and Rm is either Am or Dm, referred to as the index register.
• This represents an ea = An + Rm + d8.
• Example: MOVE.W $10(A0, A1), D1
or MOVE.W LABEL(A0, A1), D1 * LABEL must fit in 8 bits
Operation: D1 ← Mem[A0 + A1 + $10]
• This can be used to access records with variable lengths: one
register is used to point to the base address of the structure, the
other register is used to select a specific record, and d8 is used for
the offset within this record.
• Example: A typical application of addressing mode d8(An, Rm),
e.g. $10(A0, A1):
– A0 holds the base of the complete structure (Record 1, Record 2,
..., Record N);
– A1 selects a specific record within the structure;
– d8 (here $10) gives the offset within the record.

• Note that d8 is only an 8-bit value, with a small range of -128 to
+127. This restriction on the size of d8 prevents the above usage in
circumstances involving large records.
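• Sketch (Python): the effective address calculation for d8(An, Rm), including the 8-bit range check on the offset. The function name ea_indexed and the register values are illustrative assumptions; the index register is treated as a full 32-bit value for simplicity.

```python
def ea_indexed(an, rm, d8):
    """Return ea = An + Rm + d8, with d8 limited to a signed 8-bit value."""
    if not -128 <= d8 <= 127:
        raise ValueError(f"offset {d8} does not fit in 8 bits")
    return (an + rm + d8) & 0xFFFFFFFF

# $10(A0, A1) with A0 = $4000 (base) and A1 = $0100 (start of the chosen record)
print(hex(ea_indexed(0x4000, 0x0100, 0x10)))   # 0x4110
```

An offset such as $100 is rejected by the range check, which mirrors the assembler error reported for a symbol that does not fit in 8 bits.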
6.3 Program counter (PC) relative addressing
• PC holds the address in memory of the next instruction to be
executed. As such, it has the characteristics of address registers.
Specifically, we have PC relative addressing with offset, and index.
• Notations: d16(PC) and d8(PC, Rn), respectively, where Rn is An or
Dn, and d16, d8 are two’s complement 16-bit, 8-bit numbers.
• Corresponding to ea = PC + d16, ea = PC + Rn + d8, respectively.
• Used for position-independent coding, e.g.
MOVE.W 20(PC), D0
The data to be moved is stored 20 bytes ahead of the address held
in the PC when the instruction executes. Such code will execute
equally well wherever it is loaded in memory.
• Branches Bcc d8 or Bcc d16 cause a branch to the instruction given
by PC + d8 or PC + d16. Programmers merely have to write Bcc
LABEL and the assembler calculates the offset (d8 or d16).
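• Sketch (Python): how an assembler might compute the offset for Bcc LABEL. On the 68000 the displacement is added to the address of the branch instruction plus 2, and a zero in the 8-bit field signals that a 16-bit displacement follows. The function name branch_displacement and the example addresses are illustrative assumptions.

```python
def branch_displacement(branch_addr, target_addr):
    """Return (displacement, size in bits) for a Bcc located at branch_addr."""
    disp = target_addr - (branch_addr + 2)     # PC already points past the opcode word
    if disp != 0 and -128 <= disp <= 127:
        return disp, 8
    if -32768 <= disp <= 32767:
        return disp, 16
    raise ValueError("target out of range for Bcc")

# A backward branch from $040E to a loop label at $0408 (hypothetical addresses)
print(branch_displacement(0x040E, 0x0408))     # (-8, 8)
```

Because the result depends only on the distance between the branch and its target, the same encoding stays valid if the whole program is loaded at a different address.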
6.4 Examples summarizing addressing modes
• Consider different modes for <ea> in operation: ADD <ea>, D3.
LABEL EQU $100
ORG $4000
DATA DC.W $0010
ORG $400
ADD D0, D3 * data register direct
ADD A0, D3 * address register direct
ADD DATA, D3 * absolute
ADD #6, D3 * immediate
ADD #-6, D3 * immediate
ADD.L #$12345678, D3 * immediate
ADD (A0), D3 * address register indirect
ADD (A0)+, D3 * with post-increment
ADD -(A0), D3 * with pre-decrement
ADD $100(A0), D3 * with offset
ADD LABEL(A0), D3 * with offset
ADD 10(A0,D6), D3 * with index and offset
ADD LABEL(A0,D6), D3 * with index and offset:
* error, since LABEL does not fit in 8 bits
MOVE.B #9, D0 * exit from program
TRAP #15
• Consider different modes for <ea> in operation: ADD D3, <ea>
LABEL EQU $100
ORG $4000
DATA DC.W $0010
ORG $400
ADD D3, D0 * data register direct
ADD D3, A7 * address register direct
ADD D3, DATA * absolute
* immediate not applicable
ADD D3, (A0) * address register indirect
ADD D3, (A0)+ * with post-increment
ADD D3, -(A0) * with pre-decrement
ADD D3, $100(A0) * with offset
ADD D3, LABEL(A0) * with offset
ADD D3, 10(A0,D6) * with index and offset
ADD D3, LABEL(A0,D6) * error, since LABEL does not fit in 8 bits
MOVE.B #9, D0 * exit from program
TRAP #15
Summary of MC68000’s addressing modes
Mode                       EA generation               Assembly
Data reg direct            ea = Dn                     Dn
Address reg direct         ea = An                     An
Address reg indirect       ea = (An)                   (An)
  with post-increment      ea = (An), An ← An + N      (An)+
  with pre-decrement       An ← An - N, ea = (An)      -(An)
  with offset              ea = (An + d16)             d16(An)
  with index and offset    ea = (An + Rm + d8)         d8(An, Rm)
Absolute                   ea = next word(s)           LABEL
PC relative
  with offset              ea = (PC + d16)             d16(PC)
  with index and offset    ea = (PC + Rm + d8)         d8(PC, Rm)
Immediate                  operand = next word(s)      #LABEL
N = 1, 2 and 4 for byte, word and long word operations
