
Chapter 2: Assembly Language Fundamentals

References:
• Chapter 3, "Assembly Language Fundamentals", by Kip R. Irvine, 3rd edition. Most
of the examples in this edition are for 16-bit processors.
• Chapter 3, "Assembly Language Fundamentals", by Kip R. Irvine, 4th edition. Most
of the examples in this edition are for 32-bit processors.

Assembly Language statements

Statements generally fall into two classes: instructions and directives.

• Instructions are executable statements that the processor executes at
runtime. The assembler translates them directly into machine code.
Example: mov ax,5

• A directive is a command that is recognized and acted upon by the assembler while
the program's source code is being assembled. Directives are used for defining
logical segments, choosing a memory model, defining variables, defining procedures,
and so on. Directives are part of the assembler's syntax, but are not related to the
Intel instruction set. Table 1 lists some standard directives; see also Table 1 in
Irvine 3rd edition, chapter 3, page 59, or Appendix D of Irvine 4th edition, which
contains a complete reference to all MASM directives and operators.
Example:
.stack 4096

Table 1: Some Standard Assembly Directives

Directive   Description
end         End of program assembly
proc        Begin procedure
endp        End of procedure
title       Title of the listing file
.code       Mark the beginning of the code segment
.data       Mark the beginning of the data segment
.model      Specify the program’s memory model
.stack      Set the size of the stack segment

A statement consists of a name, an instruction mnemonic, operands, and a comment
field.
General format of a statement:
[name] mnemonic [operands] [;comment]

• Statements can be written in any column with any number of spaces between
each operand. Blank lines are permitted between statements. Details for each
part of the statement syntax will be explained later in this chapter.

Chp02_Assembly_language_Fundamentals.docx By F.M. Ishengoma Page 1 of 14


Integer Constants

An integer constant may end with a radix suffix that identifies the numeric base
(H → hexadecimal, B → binary, D → decimal, Q → octal). Without a suffix, the
constant is decimal.
Note: The radix letters (H, B, D, Q) are NOT case sensitive.
When a hexadecimal constant begins with a letter, it must contain a leading zero.
For example: 0F6h
Examples of integer constants: 26, 1Ah, 1101b, 2BH, 0F6H, etc.
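
The radix rules above can be modelled in a few lines of Python. This is only a sketch for checking constants by hand, not how MASM itself is implemented; the helper name parse_masm_int is made up for this example, and negative constants are not handled.

```python
def parse_masm_int(text):
    """Convert a MASM-style integer constant such as '1Ah' or '1101b' to an int."""
    radix_by_suffix = {"h": 16, "b": 2, "d": 10, "q": 8}
    token = text.strip().lower()
    # A constant must start with a digit, which is why 0F6h needs its leading zero.
    if not token or not token[0].isdigit():
        raise ValueError("constant must begin with a digit: " + repr(text))
    if token[-1] in radix_by_suffix:
        digits, base = token[:-1], radix_by_suffix[token[-1]]
    else:
        digits, base = token, 10  # no suffix: decimal by default
    return int(digits, base)


for constant in ["26", "1Ah", "1101b", "2BH", "0F6H"]:
    print(constant, "=", parse_masm_int(constant))
```

Passing F6h (without the leading zero) raises ValueError in this sketch, mirroring the fact that the assembler would read F6h as a name rather than a number.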

Character or String Constants

A string of characters enclosed in either single or double quotation marks:
o 'ABC'
o 'X'
o "Assembly Language Programming"
o '4096'

Note: The number of bytes is determined by the number of characters in the string.

Characters are translated into their ASCII codes by the assembler, so there is no
difference between using 'A' and 41h (ASCII code for 'A' ≡ 41h ≡ 65). Note: an ASCII
table can be found in ASCII codes table.pdf (Irvine 4th book folder). Please print this
file.
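
As a quick check of the character-to-ASCII translation described above, the following Python sketch lists the byte values a string constant would occupy. The function name string_bytes is an assumption made for illustration only.

```python
def string_bytes(text):
    """Return the ASCII code of each character, one byte per character."""
    return list(text.encode("ascii"))


codes = string_bytes("ABC")
print([format(code, "02X") + "h" for code in codes])  # hexadecimal view of each byte
print(ord("A") == 0x41 == 65)                          # 'A', 41h and 65 are the same value
print(len(string_bytes("4096")))                       # '4096' is four characters, not a number
```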

Keywords (Reserved words)

• A keyword has a predefined meaning to the assembler. It can be an instruction
mnemonic, a register name, or an assembler directive.
• Keywords cannot be used out of context or as identifiers. Examples: MOV, PROC,
ADD, AX, ENDP

The assembler will detect most errors caused by the wrong use of keywords.
Example:
add: mov ax,5
The above statement will cause an assembly error, since add is a keyword and here it
is used as a label.

Names
• A name identifies a label or a variable. It may contain any of the following
characters: A…Z, a…z (letters), 0…9 (digits), ? (question mark), _ (underscore),
@ (at sign), $ (dollar sign), . (period)

Names have the following restrictions:
• A maximum of 247 characters (in MASM)
• NOT case sensitive
• The first character can be a letter, @, _, or $. Subsequent characters can be the
same, or they can also be decimal digits. Avoid using @ as the first character,
because many predefined symbol names start with @.



• The assembler translates names into memory addresses (this will be
discussed later).
Note: A programmer-chosen name cannot be the same as an assembler reserved
word (keyword).

Variables
A variable is a location in a program’s data area that has been assigned a name
Example: count db 50
Refer to data allocation directives later in this chapter on how to define variables.

Comments
Comments can be specified in two ways:
• Single-line comments, beginning with a semicolon character (;). All characters
following the semicolon on the same line are ignored by the assembler and may be
used to comment the program.
• Block comments, beginning with the COMMENT directive and a user-specified
symbol. All subsequent lines of text are ignored by the assembler until the same
user-specified symbol appears. For example:

COMMENT !
This line is a comment.
This line is also a comment.
!

We can also use any other symbol:

COMMENT &
This line is a comment.
This line is also a comment.
&

Labels
Labels serve as place markers when a program needs to jump or loop from one
location to another. A label can be on a line by itself, or it can share a line with
an instruction. It must be followed by a colon (:).

Label1: mov ax,0 ; Label1 shares a line with an instruction
mov bx,0
Label2: ; Label2 does not share a line with a statement
jmp Label1

Like a variable name, a label is translated into a memory address.



Mnemonic
The mnemonic field holds the instruction to be performed. An instruction can be
split into two sub-fields: the code (mnemonic) and the operands. Example:

MOV AX,BX ; mov is mnemonic, AX and BX are operands

Important: The code field should be separated from the operands by at least one
space.

Data allocation directives

• Are used to allocate storage, based on several predefined types:
o DB means define byte (1 byte ≡ 8 bits) - supported by 8085
o DW means define word (2 bytes ≡ 16 bits)
o DD means define double word (4 bytes ≡ 32 bits)
• BYTE, WORD or DWORD can also be used instead of DB, DW and DD
respectively.
Note: Lowercase letters (db, dw, dd) can also be used.

Define Byte (DB) - supported by 8085

• Allocates storage for one or more 8-bit (byte) values. Syntax is:
varname DB initval
or
varname BYTE initval

Examples:
var1 db 'A'
var2 db -128
var3 db 255

• A variable’s initial contents may be left undefined by using a question mark for
the initializer. Example:
number db ?

Note: Something must be specified after db (either a known value or ?)

Multiple Initializers (Arrays)

DB can also be used to allocate storage for two or more 8-bit (byte) values. Syntax
is:
varname DB initval [,initval] . . .

• If multiple initializers are supplied, they must be separated by commas.
Example:
nums db 10, 25, 56, 78



If nums is stored at offset x, then 10 is stored at offset x, 25 at offset x+1,
56 at offset x+2, and 78 at offset x+3.
In class demo: Draw a diagram to illustrate this!
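
The offsets above can also be tabulated with a short Python sketch. The base offset 0 is an arbitrary assumption (the real offset depends on where nums lands in the data segment), and layout_bytes is a name invented for this illustration.

```python
def layout_bytes(base_offset, values):
    """Pair each byte value with the offset it would occupy, one byte per item."""
    return [(base_offset + index, value) for index, value in enumerate(values)]


nums = [10, 25, 56, 78]
for offset, value in layout_bytes(0, nums):
    # Show the offset as x+n and the stored byte in hexadecimal.
    print("x+" + str(offset), format(value, "02X") + "h")
```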

• Each initializer can use a different radix (base) when a list of items is defined.
Numeric, character or string constants can be freely mixed.
Examples:
list1 db 10, 32, 41h, 00100010b
list2 db 0Ah, 20h, 'A', 22h

Note: You can continue a statement onto the next line if the last character of the
line is \ (backslash). Example:

longArrayDef db 10h, 23h, 46h, 15h,\
17h, 63h, 77h, 89h

The following form of declaration is also allowed:

var_name db ….., …., ……
db …, …., ……

Example:
longArrayDef db 10h, 23h, 46h, 15h
db 17h, 63h, 77h, 89h

Note:
capital_letters db 'ABC'
is equivalent to
capital_letters db 41h, 42h, 43h ; (ASCII codes for 'A', 'B' and 'C')

Defining Strings using DB directive

• The DB directive is ideal for allocating strings of any length.
• A string can be identified by a variable, which marks the offset of the beginning
of the string. Example:
string db "Good afternoon",0

The above string is a null-terminated string. Another common termination character
is $.
• A string can continue on multiple lines without the necessity of supplying a
label for each line. The following is a long null-terminated string:
LongString db "This is a long string, that "
db "clearly is going to take "
db "several bytes to store",0
Note: It is possible to combine characters and numbers in one definition, as in:
msg db 'Hello World! ', 0Ah, 0Dh, '$'
where 0Ah is linefeed and 0Dh is carriage return.



DUP Operator

• Is used to repeat one or more values when allocating storage
• It is especially useful when allocating space for a string or array
• Appears after a storage allocation directive, such as DB

Examples:
var1 db 20 dup(0) ; 20 bytes, all equal to zero
var2 db 20 dup(?) ; 20 bytes, uninitialized
var3 db 4 dup("ABC") ; 12 bytes: "ABCABCABCABC"
Note: A space is required before the word dup
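
The effect of DUP on initialized data can be mimicked in Python by repeating a byte pattern. This is only a model of the resulting bytes; the function name dup is chosen for the analogy, and dup(?) has no equivalent here because uninitialized storage has no defined contents.

```python
def dup(count, pattern):
    """Repeat a byte pattern count times, as 'count dup(pattern)' would."""
    return pattern * count


var1 = dup(20, b"\x00")   # like: var1 db 20 dup(0)
var3 = dup(4, b"ABC")     # like: var3 db 4 dup("ABC")
print(len(var1), len(var3))
print(var3.decode("ascii"))
```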

Define Word (DW)

• The DW (define word) directive creates storage for one or more 16-bit values.
The syntax is:
name DW initval [,initval] . . .
Examples:
var1 dw 0,65535
var2 dw 456, -32768, 32767
var3 dw 512
var4 dw 1000h, 4096h, 'AB', 0
var5 dw ? ; single word uninitialized
var6 dw 5 dup(1000h) ; 5 words, each equal to 1000h
var7 dw 5 dup(?) ; 5 uninitialized words

Note: dup can also be used with dw as in var6 and var7 above.
Demo: Draw figure for var4 above.

Question: Is varx dw 'A' acceptable? Why?

Note: Intel processors use Little Endian format (and NOT Big Endian). "Little Endian"
means that the low-order byte of a number is stored in memory at the lowest
address, and the high-order byte at the highest address. Motorola processors such as
the 68000 family use Big Endian.

• For example, the value 1234h would be stored in memory as follows using Little
Endian format:
Offset: 00 01
Value:  34 12
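
The byte order shown above can be reproduced with Python's built-in int.to_bytes. The helper name store_word is an assumption for this sketch; it simply returns the two bytes in the order they would appear at increasing offsets.

```python
def store_word(value, byteorder="little"):
    """Return the two bytes of a 16-bit value in memory order (lowest offset first)."""
    return list(value.to_bytes(2, byteorder))


little = store_word(0x1234, "little")   # Intel ordering
big = store_word(0x1234, "big")         # shown only for comparison
print([format(b, "02X") for b in little])
print([format(b, "02X") for b in big])
```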

EQU Directive - supported by 8085

• Assigns a symbolic name to a string or numeric constant
• Syntax: name EQU constant
Examples:
maxint equ 32767



maxuint equ 0FFFFh

• A symbol defined with EQU cannot be redefined later in the program (like a
constant identifier in C/C++)
• No memory is allocated for EQU names

DATA TRANSFER INSTRUCTIONS

MOV Instruction (refer to instruction set page 17)

Syntax: mov destination, source
Purpose: Copy the contents of a source operand to a destination operand
Basic forms are:
mov reg , reg
mov reg , immed
mov reg , mem
mov mem , immed
mov mem , reg

• reg can be any non-segment register (for segment registers see the next
paragraphs)
• The Instruction Pointer (IP) register and immediate values CANNOT be a destination
operand. The sizes of both operands must be the same!

• Where segment registers are involved, the following types of moves are
possible, with the exceptions that CS cannot be a destination operand and an
immediate value cannot be moved to a segment register (segreg):
mov segreg , reg16
mov segreg , mem16
mov reg16 , segreg ; segreg are 16-bit regs
mov mem16 , segreg ; for 16-bit CPUs

• The MOV instruction cannot use two memory operands in one statement
(mov mem,mem is not allowed). In such a case, a register must be used when
copying a byte or word from one memory location to another, as in:
mov ax,var1 ; where var1 and var2 are word variables
mov var2,ax
This is equivalent to var2=var1 in high-level programming languages.



Summary of MOV valid operands

Destination   Source
reg           reg, mem, imm, segreg
mem           reg, imm, segreg
segreg        reg, mem
Note: CS cannot be a destination!
Examples of the MOV instruction using the above operands
• Data segment (details later on defining the data segment)
.data
count db 10
total dw 4126h

• Code segment (details later on defining the code segment)
.code
;8-bit transfer
mov bl,1 ; 8-bit immediate to register
mov count,26 ; 8-bit immediate to memory
mov al,bl ; 8-bit register to register
mov cl,count ; 8-bit memory to register
;16-bit transfer
mov cx,total ; 16-bit memory to register
mov dx,cx ; 16-bit register to register
mov bx,8FE2h ; 16-bit immediate to register
mov total,1000h ; 16-bit immediate to memory

Question: What about mov ah, 'A' and mov AX, 'A' ??? (Note: 'A' ≡ 41h)

Operands with Displacements

• Memory values that do not have their own labels can be addressed by adding a
displacement to the name of a memory operand. Example:
arrayB db 10h,20h

Then (direct-offset addressing mode; details later when we discuss addressing
modes):
mov al,arrayB ; ≡ mov al,arrayB[0]; AL = 10h
mov bl,arrayB+1 ; ≡ mov bl,arrayB[1]; BL = 20h



The notation arrayB+1 refers to the location one byte beyond the beginning of
arrayB.
Note: 10h and 20h are stored in contiguous memory locations.
• When dealing with an array of 16-bit values (words), the offset of each array
member is two bytes beyond the previous one. Example:
arrayW dw 100h,240h,360h,4A0h
Then:
mov ax, arrayW ; ax = 100h ; arrayW[0]
mov bx, arrayW+2 ; bx = 240h ; arrayW[2]
mov cx, arrayW+4 ; cx = 360h ; arrayW[4]
mov dx, arrayW+6 ; dx = 4A0h ; arrayW[6]
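
To see why word displacements advance in steps of two, the array can be modelled in Python as a flat run of bytes in little-endian order. The names words_to_memory and read_word are invented for this sketch; it models only the memory layout, not the assembler.

```python
def words_to_memory(words):
    """Lay out 16-bit values as consecutive bytes, low-order byte first."""
    memory = bytearray()
    for word in words:
        memory += word.to_bytes(2, "little")
    return bytes(memory)


def read_word(memory, displacement):
    """Read the 16-bit value that starts at the given byte displacement."""
    return int.from_bytes(memory[displacement:displacement + 2], "little")


array_w = words_to_memory([0x100, 0x240, 0x360, 0x4A0])
for displacement in (0, 2, 4, 6):
    print("arrayW+" + str(displacement), format(read_word(array_w, displacement), "X") + "h")
```

The same read_word helper can be called with an odd displacement to explore what happens when a word read straddles two array elements.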

Question: What is the value of ax if we use: mov ax,arrayW+1 ?

XCHG (exchange data) Instruction (refer to instruction set page 30)
• The instruction exchanges the contents of two registers, or the contents of a
register and a variable. The syntax is:
xchg reg , reg
xchg reg , mem
xchg mem , reg
• This is the most efficient way to exchange two operands because no storage for a
temporary value is required
• Very useful in sorting (remember in a high-level language: temp=x; x=y; y=temp)

Examples of XCHG
xchg ax,bx ; exchange two 16-bit registers
xchg ah,al ; exchange two 8-bit registers
xchg var1,bx ; exchange 16-bit memory operand (var1) with BX

Note: XCHG mem,mem is not allowed!

So, to exchange two variables, a register must be used as a temporary operand as
follows:

mov al,value1 ; load the AL register
xchg value2,al ; exchange AL and value2
mov value1,al ; store AL back into value1

H/W: Satisfy yourself that the above three statements will do the exchange.
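
One way to trace the three statements is to model AL and the two variables as Python values and follow each step. This is a hand-tracing aid under the assumption that value1 and value2 are byte variables; the function name swap_via_register is made up.

```python
def swap_via_register(value1, value2):
    """Trace the mov / xchg / mov sequence and return the final (value1, value2)."""
    al = value1                 # mov al,value1  : AL now holds the old value1
    value2, al = al, value2     # xchg value2,al : value2 gets old value1, AL gets old value2
    value1 = al                 # mov value1,al  : value1 gets old value2
    return value1, value2


print(swap_via_register(0x11, 0x22))
```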

ARITHMETIC INSTRUCTIONS

• INC (increment) adds one to a single operand (pg 12 inst set)
• DEC (decrement) subtracts one from a single operand (pg 10 inst set)
Syntax is:
INC destination ; destn=destn+1
DEC destination ; destn=destn-1



• Destination can be a register (reg8 or reg16) or a memory operand (mem8 or
mem16) on 16-bit CPUs

Examples:
inc al ; increment 8-bit register
dec bx ; decrement 16-bit register
inc byte_var ; increment memory operand
dec word_var ; decrement memory operand

All status flags are affected except the carry flag.

ADD Instruction (pg 4 instruction set)

Syntax is: ADD destination,source
Meaning: destination=destination + source
• The source is unchanged, and the sum is assigned to the destination
• The two operands must be the same size
• No more than one operand can be a memory operand (i.e. add mem,mem is not
allowed)
• A segment register cannot be the destination
• All status flags are affected
• The following table shows the allowable operands:

Destination   Source
Register      Register, Memory, Immediate
Memory        Register, Immediate
Examples:
add cl,al ; add 8-bit register to register
add var1,ax ; add 16-bit register to memory
add bx,1000h ; add immediate value to 16-bit register
add var2,10 ; add immediate value to memory
add dx,var3 ; add 16-bit memory to register

Note: In the above examples, it is assumed that the variables var1 and var3 have been
defined using the dw directive.
To add memory to memory, use a register as follows:
mov al, byte2
add al, byte1 ; al = byte1 + byte2



SUB Instruction (page 29 instruction set)
Subtracts a source operand from a destination operand. Syntax is:
SUB destination,source
Meaning: destination=destination - source
• The sizes of the two operands must match
• Only one operand can be a memory operand
• Inside the CPU, the source operand is first negated (using the 2's complement
technique) and then added to the destination
• Allowed operands for SUB are the same as for ADD

Examples of SUB
sub cl,al ; subtract 8-bit register from register
sub bx,1000h ; subtract immediate value from 16-bit reg.
sub var1,10 ; subtract immediate value from memory
sub var2,ax ; subtract 16-bit register from memory
sub dx,var3 ; subtract 16-bit memory from register
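
The negate-then-add description of SUB can be checked with a small Python model. This is a sketch of the arithmetic only, assuming operands already fit in the chosen width; the function name sub_via_twos_complement is hypothetical.

```python
def sub_via_twos_complement(dest, src, bits=16):
    """Compute dest - src by adding the 2's complement of src, truncated to 'bits' bits."""
    mask = (1 << bits) - 1
    negated = (~src + 1) & mask       # invert all bits, add one, keep 'bits' bits
    return (dest + negated) & mask    # discard any carry out of the top bit


print(format(sub_via_twos_complement(0x1000, 0x0001), "04X"))
print(format(sub_via_twos_complement(1, 2), "04X"))
```

The result always matches ordinary subtraction modulo 2 to the power of the operand width, which is why the CPU can reuse its adder for SUB.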

Note: If either ADD or SUB generates a result of zero, the Zero flag is set. If the
result is negative, the Sign flag is set.

Examples showing the effect on some flag bits

mov ax,10
sub ax,10 ; AX = 0, ZF = 1 (zero flag set)
mov bx,1
sub bx,2 ; BX = FFFFh, SF = 1 (sign flag set)
--------------------------------
mov bl,4Fh
add bl,0B1h ; BL = 00, ZF = 1, CF = 1, 4Fh+B1h=100h
mov ax,0FFFFh
inc ax ; AX = 0, ZF = 1; CF is not affected by the inc and
; dec instructions

H/W: Satisfy yourself about the above flag bit statuses (after learning CodeView)
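
Until CodeView is available, the Zero, Sign and Carry flags of the ADD and SUB examples can be predicted with a small Python model. It is a simplified sketch: the function names are invented, operands are assumed to fit in the given width, and the special INC/DEC rule (Carry left unchanged) is not modelled.

```python
def add_with_flags(dest, src, bits):
    """Return (result, ZF, SF, CF) for an add of the given operand width."""
    mask = (1 << bits) - 1
    total = dest + src
    result = total & mask
    zero_flag = int(result == 0)
    sign_flag = result >> (bits - 1)   # copy of the top bit of the result
    carry_flag = int(total > mask)     # carry out of the top bit
    return result, zero_flag, sign_flag, carry_flag


def sub_with_flags(dest, src, bits):
    """Return (result, ZF, SF, CF) for a subtract of the given operand width."""
    mask = (1 << bits) - 1
    result = (dest - src) & mask
    zero_flag = int(result == 0)
    sign_flag = result >> (bits - 1)
    carry_flag = int(src > dest)       # a borrow was needed
    return result, zero_flag, sign_flag, carry_flag


print(sub_with_flags(10, 10, 16))      # sub ax,10 with AX = 10
print(sub_with_flags(1, 2, 16))        # sub bx,2 with BX = 1
print(add_with_flags(0x4F, 0xB1, 8))   # add bl,0B1h with BL = 4Fh
```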

Note: The identification of an operand as either signed or unsigned is completely up
to the programmer. The CPU updates the Carry and Overflow flags to cover both
possibilities. Carry is for unsigned and Overflow is for signed arithmetic. Refer to
Irvine, 3rd edition, chapter 3, page 75, section 3.7.4.
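
The difference between Carry and Overflow shows up clearly on 8-bit additions. The Python sketch below uses the common rule that Overflow is set when both operands have the same sign bit and the result's sign bit differs; the function names are assumptions made for this illustration.

```python
def add8_carry_overflow(a, b):
    """Add two 8-bit values and return (result, CF, OF)."""
    total = a + b
    result = total & 0xFF
    carry_flag = int(total > 0xFF)
    # Overflow: the result's sign bit differs from the sign bit of both operands.
    overflow_flag = int(((a ^ result) & (b ^ result) & 0x80) != 0)
    return result, carry_flag, overflow_flag


def as_signed8(value):
    """Interpret an 8-bit pattern as a signed (2's complement) number."""
    return value - 0x100 if value & 0x80 else value


for a, b in [(0x7F, 0x01), (0xFF, 0x01)]:
    result, cf, of = add8_carry_overflow(a, b)
    print(format(a, "02X"), "+", format(b, "02X"), "=", format(result, "02X"),
          "CF =", cf, "OF =", of, "signed result =", as_signed8(result))
```

In the first case the unsigned sum 127+1=128 is fine (CF = 0) but the signed reading wraps to -128 (OF = 1); in the second the unsigned sum 255+1 wraps (CF = 1) while the signed reading -1+1=0 is correct (OF = 0).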

Homework: Write assembly code to calculate the value of z, assuming x and y are
known 16-bit variables and z is a 16-bit variable, using only add and sub instructions:
z=3x-2y+10



OFFSET Operator

The OFFSET operator returns the 16-bit offset of a variable. The assembler
automatically calculates every variable’s offset from the beginning of its segment
(the data segment, addressed through DS) when a program is being assembled.
.data
wordnum dw 1234h
.code
……….. ; some other instructions
mov bx , offset wordnum

If the variable wordnum is located at an offset of 5 bytes in the data segment, then bx=5.

Also: lea bx, wordnum ; Load Effective Address can be used instead (page 15
Instruction set)

Important: An offset is a 16-bit value, so it must be stored in a 16-bit register!

Examples of using OFFSET

.data
arrayB db 10h, 20h
arrayW dw 100h, 200h, 300h
.code
mov bx , offset arrayB ; offset of a byte array
mov al, [bx] ; al=10h ; [bx] means the contents of memory at offset BX
; (register-indirect addressing mode)
mov cl, [bx+1] ; cl=20h
mov bx , offset arrayW ; offset of a word
mov ax, [bx] ; ax=100h
mov cx, [bx+2] ; cx=200h

Remember: Previously we used arrayB+1 and arrayW+2 to get the same values as
using OFFSET. If you have to use [], then either BX, SI or DI needs to be used to
store the offset. Read Irvine 3rd edition section 3.8, Basic operand types. This will
be discussed later in addressing modes (chapter 3D).

PTR operator

• Operands of an instruction must be of the same type (both bytes or both words), for
example in MOV, ADD, SUB instructions
• If one operand is a constant (immediate value), the assembler attempts to
guess the type from the other operand. For example:
mov ax,1 is treated as a word instruction because AX is a 16-bit register, while
mov bh, 1 is a byte instruction
• Often the size of an operand is not clear from the context of an instruction.
Consider the following instructions, which will cause the assembler to generate an
"operand must have size" error message:



………………
mov bx, offset var1
inc [bx] ; illegal
mov [bx],1 ; illegal

The assembler doesn’t know whether bx points to a byte or a word. The PTR
operator makes the operand size clear. PTR must be used in combination with the
standard assembler data types such as BYTE, WORD, etc
……………….
mov bx, offset var1
inc byte PTR [bx]
mov word PTR [bx], 1

Example:
.data
val1 dw 1234h
…….
.code
......
mov bx, offset val1
mov al, byte ptr [bx] ; al=34h
mov cl, byte ptr [bx+1] ; cl=12h
mov al, [bx] ; H/W: al=?
mov cl, [bx+1] ; H/W: cl=?

Code and data segments

• Segments are the building blocks of programs
• The segment registers CS, DS, SS and ES were discussed previously
• The .data, .code and .stack directives exist in MASM (refer to Kip Irvine's book, 3rd
edition, chapter 3, page 59)

Simple examples

Example1: Adding three byte numbers and storing the sum in the variable sum (How
can you implement it in C++?)

.data
var1 db 10h
var2 db 20h
var3 db 30h
sum db 0 ; assume sum of var1, var2 and var3
; can fit into an 8-bit variable sum
.code
…….. ; to be added later
mov al,var1 ; get first number
add al,var2 ; sum of var1 and var2
add al, var3 ; sum of var1, var2 and var3
mov sum,al ; store the sum



Example2: Adding an array of byte numbers and storing the sum in the variable sum

.data
arrayB db 10h,20h,30h
sum db 0 ; or sum db ?
.code
………… ; to be added later
mov bx,offset arrayB
mov al,[bx] ; get first number
add al,[bx+1] ; add 1st and 2nd numbers
add al, [bx+2] ; sum of three numbers
mov sum,al ; store the sum

Homework: Rewrite the above example without using the OFFSET operator

Example3: Adding an array of 16-bit numbers

• The difference between summing 8-bit and 16-bit numbers is the size of the operands
.data
arrayW dw 1000h,2000h,3000h
sum dw 0 ; or sum dw ?
.code
mov bx,offset arrayW ; can use SI, DI or BX
mov ax,[bx] ; first number
add ax,[bx+2] ; add second number
add ax, [bx+4] ; add third number
mov sum,ax

Final sum is 6000h or 24576 dec.
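
The final sum can be double-checked with a short Python sketch that adds the array elements the way a 16-bit register would, wrapping at 16 bits. The function name sum_words is an assumption for this illustration.

```python
def sum_words(words, bits=16):
    """Accumulate values in a register of the given width, wrapping on overflow."""
    mask = (1 << bits) - 1
    total = 0
    for word in words:
        total = (total + word) & mask   # same truncation a 16-bit ADD performs
    return total


total = sum_words([0x1000, 0x2000, 0x3000])
print(format(total, "X") + "h", "=", total)
```

Calling sum_words with bits=8 on the byte array 10h, 20h, 30h gives 60h, which is why an 8-bit sum variable is large enough for the byte-array example.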

== END ==

Next topic: Chapter 03 MASM and CodeView and Utility.lib
