
3. MEMORY ACCESS

3.1 Memory organization


3.1.1 The flat or linear memory model
In general, any memory block can be regarded as a sequence of memory locations. Typically,
each memory location stores an 8-bit number, a byte (this is the content of the memory
location). Each memory location is identified by a unique number called address, more
specifically physical address. The size of the memory is directly linked with the physical
address through the following equation:
$$\mathit{memorySize} = 2^{\mathit{physicalAddressSize}\,[\text{bits}]} \qquad (1)$$

Example 1: using a physical address of 2 bits, one can form 4 different physical addresses: 00,
01, 10, and 11, corresponding to 4 different memory locations. Consequently, a memory with a
physical address of 2 bits will comprise 4 memory locations (4 bytes).
Example 2: using a physical address of 20 bits, one can form $2^{20}$ different physical addresses,
corresponding to $2^{20}$ different memory locations. Consequently, a memory with a physical
address of 20 bits will comprise $2^{20}$ = 1,048,576 memory locations (1 MB).
physical address         contents
(a unique address for    (each memory location
each memory location)    stores an 8-bit value)

FFFFFh                   88h
FFFFEh                   73h
...                      ...
00010h                   09h
0000Fh                   1Bh
0000Eh                   ACh
...                      ...
00001h                   17h
00000h                   24h

Figure 1. The memory is a sequence of memory locations with unique addresses


The memory model in which the memory appears to the program as a single contiguous address
space and the CPU can directly (and linearly) address all of the available memory locations
without having to resort to any sort of memory segmentation or paging schemes is called flat
memory model or linear memory model.

3.1.2 The x86 (segmented) memory model


The x86 CPU studied in this laboratory (the 8086) has a 20-bit address bus, which means it can
access memories with physical addresses of up to 20 bits. However, as explained in Laboratory 1,
this CPU has 16-bit registers, which cannot store a 20-bit physical address. Consequently, the
x86 CPU organizes the memory in smaller segments, using a 16-bit address, called the segment
address (SA), to identify a segment in the memory and another 16-bit address, called the
effective address (EA), to identify the memory location inside the segment. Because these
addresses are 16-bit numbers, the memory is organized in $2^{16}$ = 64k segments, each comprising
$2^{16}$ = 64k memory locations.
The physical address (PA), which is used by the microprocessor to physically access the
memory, is obtained by adding the effective address to the segment address concatenated with
the hexadecimal digit 0 (that is, shifted to the left by 4 bits, or multiplied by 10h):
PA = SA × 10h + EA.
For example, suppose one would like to access the memory location with the effective address
0001h within the segment with the segment address 0001h. The physical address would be
00010h + 0001h = 00011h.
Note that the same memory location (the one with the physical address 00011h) can also be
regarded as the memory location with the effective address 0011h within the first
segment (with segment address 0000h), because 00000h + 0011h = 00011h.
Figure 2 presents an example of how the physical address is formed for various memory
locations with various segment and effective addresses.
The memory model used by the x86 microprocessors working in real-mode is called
segmented memory model.
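
As a further worked instance of the address translation described in this section, the lines below check that two different SA:EA pairs select the same memory location. The values 1234h, 0022h, 1236h and 0002h are arbitrary and were chosen only for illustration.

$$
\begin{aligned}
PA &= SA \times 10h + EA \\
1234h \times 10h + 0022h &= 12340h + 0022h = 12362h \\
1236h \times 10h + 0002h &= 12360h + 0002h = 12362h
\end{aligned}
$$

Consecutive segments start 10h (16) memory locations apart, so increasing SA by 2 is compensated by decreasing EA by 20h.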

3.1.3 Information organization inside the memory


There are three types of information that are usually stored in the memory:
• the current program (the instructions),
• data for the current program (operands, results, variables, data arrays) and
• the stack of the current program.
The x86 architecture organizes this information in four separate segments: the code segment,
the data segment, the extended data segment and the stack segment.
The code segment stores the current program (the instructions). The segment address of the
current code segment is stored in CS (the code segment register). Inside the code segment,
information can be accessed using the effective address stored in a special purpose register,
called IP (instruction pointer register). IP always stores the effective address of the current
instruction.

PA        contents   EA (SA = 0000h)   EA (SA = 0001h)   EA (SA = 0002h)
FFFFFh    9Ah        -                 -                 -
...
10021h    DCh        -                 -                 -
10020h    3Dh        -                 -                 -
1001Fh    BBh        -                 -                 FFFFh
...
10011h    C2h        -                 -                 FFF1h
10010h    90h        -                 -                 FFF0h
1000Fh    78h        -                 FFFFh             FFEFh
...
10001h    2Eh        -                 FFF1h             FFE1h
10000h    FFh        -                 FFF0h             FFE0h
0FFFFh    13h        FFFFh             FFEFh             FFDFh
...
00021h    49h        0021h             0011h             0001h
00020h    A4h        0020h             0010h             0000h
0001Fh    88h        001Fh             000Fh             -
...
00011h    99h        0011h             0001h             -
00010h    22h        0010h             0000h             -
0000Fh    4Ah        000Fh             -                 -
...
00001h    17h        0001h             -                 -
00000h    24h        0000h             -                 -

(Each EA follows from PA = SA × 10h + EA; "-" marks a memory location that lies outside the
segment. The memory locations 00000h to 0000Fh are only part of the first segment, with
SA = 0000h. The memory locations 00010h to 0001Fh are part of both the first segment and the
second segment, with SA = 0001h. The memory locations 00020h, 00021h, ... are part of the first
three segments.)

Figure 2. The x86 memory model


The data segment stores the operands, results, variables, data arrays to be used in the program.
The segment address of the current data segment is stored in DS (the data segment register).
Inside the data segment, information can be accessed using the effective addresses stored in BX
(base register) and SI (source index register). BX usually stores the effective address of
elementary variables or the effective address of the first element in data arrays. SI usually stores
the effective address of the current element in a special data array, called source array.
The extended data segment is an extra data segment, which is used similarly to the data
segment to store operands, results, variables, data arrays. The segment address of the current
extended data segment is stored in ES (the extended data segment register). Inside the extended
data segment, information can be accessed using the effective addresses stored in DI
(destination index register). DI usually stores the effective address of the current element in a
special data array, called destination array.
The stack segment stores the stack used by the program. The segment address of the current
stack segment is stored in SS (the stack segment register). Inside the stack segment, information
can be accessed using the effective addresses stored in SP (stack pointer register) and BP (base
pointer register). SP always stores the effective address of the top element in the primary stack.
BP usually stores the effective address of another element in the primary stack or the effective
address of the top element in the alternative stack.

Although the address registers (BX, SI, DI, SP, BP and IP) are associated by default with specific
segment registers (IP with CS, BX and SI with DS, DI with ES, SP and BP with SS) to form
complete addresses for specific information in the memory, the x86 architecture permits the
usage of some address registers with other than the default segment registers as well (this is
called segment redirection). BX, SI and DI can also be used to access data in the code segment,
data segment, extended data segment, and stack segment. BP can also be used to access data in
the code segment, data segment, and extended data segment.
CS, DS, ES and SS can be configured to store the same segment address. In this case all the
information required by the current program is found in a single memory segment. This is in
general the scenario we are going to use further in the laboratory.
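
The short program below is a sketch of segment redirection combined with the address translation from Section 3.1.2. It reads the byte at physical address 00011h twice: once through DS (SA = 0000h, EA = 0011h) and once through ES (SA = 0001h, EA = 0001h). It is only an illustration and not one of the laboratory programs; the use of CL and CH as destinations and the ES:[BX] override notation (as accepted by emu8086-style assemblers) are assumptions.

```asm
org 100h

        mov AX, 0000h
        mov DS, AX          ; DS = 0000h (the first segment)
        mov BX, 0011h       ; EA = 0011h
        mov CL, [BX]        ; default segment DS: PA = 00000h + 0011h = 00011h

        mov AX, 0001h
        mov ES, AX          ; ES = 0001h (the second segment)
        mov BX, 0001h       ; EA = 0001h
        mov CH, ES:[BX]     ; redirected to ES:   PA = 00010h + 0001h = 00011h

        int 20h             ; end the program
```

After the two loads, CL and CH should hold the same value, because both SA:EA pairs translate to the same physical address.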
Summary for memory organization:
• the memory can be regarded as a sequence of memory locations
• each memory location stores an 8-bit number and has a unique 20-bit address, called
  physical address
• the x86 CPU regards the memory as being composed of 64k segments comprising 64k
  locations each
• the x86 CPU uses a 16-bit segment address (SA) to select a segment and a 16-bit
  effective address (EA) to identify a memory location inside the segment
• segment addresses (SAs) can be stored in one of the following segment registers: CS, DS,
  ES, SS
• effective addresses (EAs) can be stored in one of the following address registers: BX, SI,
  DI, SP, BP and IP
• the translation between the logical organization of the memory in segments and the
  physical address is done as follows: PA = SA × 10h + EA (SA concatenated with the
  hexadecimal digit 0, plus EA)

3.2 x86 addressing modes for 16-bit microprocessors


Addressing modes are the techniques to specify (in the instruction format) the location of the
operands and results. The x86 addressing modes are summarized in Table 1. Note that all the
addressing modes presented in Table 1 refer to the memory. The addressing mode which
specifies that the data is found in a register is called register addressing.

Table 1. x86 addressing modes

Addressing mode | Example | Description
simple addressing, immediate | mov AX, 1234h | the data is found in the memory (in the code segment) immediately after the instruction code
simple addressing, direct | mov AX, [1234h] | the data is found in the memory at the effective address specified in the current instruction
simple addressing, indexed | mov AX, [SI+4h] | the data is found in the memory at the address specified by the content of SI or DI + an offset found in the current instruction
simple addressing, indirect | mov AX, [SI] | the data is found in the memory at the address specified by the content of BX, SI or DI
base relative addressing, direct | mov AX, [BX+13h] | the data is found in the memory at the effective address obtained as a sum between the content of BX and the offset specified in the current instruction
base relative addressing, indexed | mov AX, [BX+DI+13h] | the data is found in the memory at the effective address obtained as a sum between the content of BX, the content of SI or DI and the offset specified in the current instruction
base relative addressing, implicit | mov AX, [BX+SI] | the data is found in the memory at the effective address obtained as a sum between the content of BX and the content of SI or DI
stack relative addressing, direct | mov AX, [BP+13h] | the data is found in the memory at the effective address obtained as a sum between the content of BP and the offset specified in the current instruction
stack relative addressing, indexed | mov AX, [BP+SI+13h] | the data is found in the memory at the effective address obtained as a sum between the content of BP, the content of SI or DI and the offset specified in the current instruction
stack relative addressing, implicit | mov AX, [BP+DI] | the data is found in the memory at the effective address obtained as a sum between the content of BP and the content of SI or DI

Figure 3 summarizes the x86 addressing modes for 16-bit microprocessors.

Figure 3. x86 addressing modes for 16-bit microprocessors
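
To see how the effective address is formed in several of the modes from Table 1, the sketch below reads a small word array through different operand forms. The array name vals and its contents are hypothetical, chosen so that each load is easy to check in the emulator; the program assumes the single-segment setup used in this laboratory (DS equal to CS).

```asm
org 100h

        lea BX, vals          ; BX = effective address of vals
        mov DI, BX            ; DI = the same effective address
        mov SI, 2             ; SI = 2, the byte offset of the second word

        mov AX, 1234h         ; immediate:                                AX = 1234h
        mov AX, [vals]        ; direct:                                   AX = 1111h
        mov AX, [DI]          ; indirect:               EA = DI,          AX = 1111h
        mov AX, [DI+2]        ; indexed:                EA = DI + 2,      AX = 2222h
        mov AX, [BX+2]        ; base relative direct:   EA = BX + 2,      AX = 2222h
        mov AX, [BX+SI]       ; base relative implicit: EA = BX + SI,     AX = 2222h
        mov AX, [BX+SI+2]     ; base relative indexed:  EA = BX + SI + 2, AX = 3333h

        int 20h               ; end the program

vals    dw 1111h, 2222h, 3333h
```

Because every element is a word, consecutive elements are 2 bytes apart, which is why the offsets used above are 2 and 4 rather than 1 and 2.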

3.3 Assembly directives


Assembly directives are operations that are executed by the assembler at assembly time, not by
a CPU at run time. Assembly directives can make the assembly of the program dependent on
parameters input by a programmer, so that one program can be assembled different ways,
perhaps for different applications. Assembly directives can also be used to manipulate
presentation of a program to make it easier to read and maintain. Another common use of
assembly directives is to reserve storage areas for run-time data and optionally initialize their
contents to known values. Assembly directives can also be used to associate arbitrary names
(labels or symbols) with memory locations and various constants. Usually, every constant and
variable is given a name so instructions can reference those locations by name.
The x86 assembly directives used in the Microprocessors Laboratory are the following:
• org (assign location counter)
  o Usage: org address
  o Example: org 100h
  o Effect: The following instruction will be loaded in memory at the specified
    address.
• db (define byte)
  o Usage: [symbol] db [data, [data]]
  o Examples:
      array db 10h, 23h, 29h
      charX db 'a'
      inputString db 'ana are mere'
  o Effect: Allocates one or more bytes and initializes them with the expression
    values (if there are any). The symbol will be associated with the address of the
    first allocated byte.
• dw (define word)
  o Usage: [symbol] dw [data, [data]]
  o Example: numbers dw 8933h, 1240h, 0328h, 99A0h, 0F422h, 0101h
  o Effect: Allocates one or more words and initializes them with the expression
    values (if there are any). The symbol will be associated with the address of the
    first allocated word.
• equ (create symbol)
  o Usage: symbol equ expression
  o Example: number equ 4000h
  o Effect: Throughout the whole program, the symbol will have the value obtained
    by evaluating the expression.
• offset (offset of expression)
  o Usage: offset symbol
  o Example: offset array
  o Effect: Returns the effective address associated with the symbol.
• byte ptr (byte pointer)
  o Usage: byte ptr symbol
  o Example: byte ptr [1234h]
  o Effect: Tells the assembler to treat the operand as a byte (8 bits).
• word ptr (word pointer)
  o Usage: word ptr symbol
  o Example: word ptr [1234h]
  o Effect: Tells the assembler to treat the operand as a word (16 bits).
You can declare static data regions (analogous to global variables) in x86 assembly using the
assembler directives db and dw. Declared locations can be labeled with names for later
reference; this is similar to declaring variables by name. Variables declared in sequence will
be located in memory next to one another.
Example declarations:
var     db 64    ; Declare a byte, referred to as location var, containing the value 64.
var2    db ?     ; Declare an uninitialized byte, referred to as location var2.
        db 10    ; Declare a byte with no label, containing the value 10. Its address is var2 + 1.
var3    dw ?     ; Declare a 2-byte uninitialized value, referred to as location var3.

Unlike in high level languages where arrays can have many dimensions and are accessed by
indices, arrays in x86 assembly language are simply a number of cells located contiguously in
memory. An array can be declared by just listing the values, as in the first example below. Two
other common methods used for declaring arrays of data are the dup directive and the use of
string literals. The dup directive tells the assembler to duplicate an expression a given number
of times. For example, 4 dup (2) is equivalent to 2, 2, 2, 2.
Example declarations:
array1  dw 1, 2, 3       ; Declare an array with three values, initialized to 1, 2, and 3.
bytes   db 10 dup(?)     ; Declare 10 uninitialized bytes starting at location bytes.
array   dw 100 dup(0)    ; Declare 100 words starting at location array, all initialized to 0.
string  db 'hello',0     ; Declare 6 bytes starting at the address string, initialized to the
                         ; ASCII character values for hello and the null (0) byte.
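
The sketch below combines several of the directives above in one small program, mainly to show why byte ptr and word ptr are needed when neither operand of an instruction fixes the operand size. The names nBytes and buffer are hypothetical and used only for illustration; the program assumes the single-segment setup used in this laboratory.

```asm
org 100h

nBytes  equ 4                         ; a constant: it occupies no memory

        mov BX, offset buffer         ; BX = effective address of buffer
        mov byte ptr [BX], 12h        ; writes 1 byte:  buffer[0] = 12h
        mov word ptr [BX+1], 5678h    ; writes 2 bytes: buffer[1] = 78h, buffer[2] = 56h
        mov CX, nBytes                ; assembled as mov CX, 4 (no memory access)

        int 20h                       ; end the program

buffer  db 4 dup(0)                   ; four bytes, all initialized to 0
```

Without byte ptr or word ptr, an instruction such as mov [BX], 12h would be ambiguous, since neither [BX] nor the constant says whether one or two memory locations must be written. Note also that the x86 stores the low byte of a word at the lower address.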

3.4 x86 memory data transfer instructions


Data transfer instructions are those CPU instructions used to:
• set a register or a memory location to a fixed constant value,
• copy data from a memory location to a register, or vice versa,
• read and write data from I/O devices.
Each microprocessor comes with its specific instruction set, which includes particular types and
flavors of data transfer instructions. The most important data transfer instructions, provided by
the x86 instruction set architecture, are listed in Table 2.
The first x86 stack particularity is that all elements stored in the stack are 16-bit wide.
Consequently, instructions that implicitly insert or remove values from the stack, including push
and pop, increment or decrement the stack pointer (SP) by two.
The second stack particularity is that it grows downwards (to lower addresses). Consequently,
instructions that insert values in the stack, such as push, decrement the stack pointer (SP), while
instructions that remove values from the stack, such as pop, increment the stack pointer.
Details and usage examples regarding the data transfer instructions are provided in Section 3.6.
Table 2. x86 data transfer instructions

Instruction | Usage | Description
MOV (Move Data) | MOV dest, src | Copy src to dest
XCHG (Exchange Data) | XCHG dest, src | Exchange src with dest
LEA (Load Effective Address) | LEA dest, src | Load the effective address of src in dest
PUSH (Push in the Stack) | PUSH src | Push src in the stack
POP (Pop out of the Stack) | POP dest | Pop word out of the stack and store in dest
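
The two stack particularities described above (16-bit elements and downward growth) can be observed with the following minimal sketch, which swaps two registers through the stack. The values 1234h and 5678h are arbitrary; watch SP in the emulator after each step.

```asm
org 100h

        mov AX, 1234h
        mov BX, 5678h

        push AX         ; SP = SP - 2, then the word 1234h is stored at SS:SP
        push BX         ; SP = SP - 2, then the word 5678h is stored at SS:SP

        pop AX          ; AX = 5678h (the last word pushed), then SP = SP + 2
        pop BX          ; BX = 1234h, then SP = SP + 2

        ; AX and BX are now swapped and SP has its initial value again
        int 20h         ; end the program
```

The same exchange can be done in a single instruction with xchg AX, BX; the stack version is shown only because it makes the changes of SP visible.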

3.4.1 Special data transfer instructions: string/array instructions


The x86 architecture defines two implicit memory zones which store two arrays of 8-bit or
16-bit numbers, called the source array and the destination array. The source array is implicitly
stored in the data segment (the segment with the address in DS). The current element in the
source array is implicitly at the effective address specified in SI. The destination array is
implicitly stored in the extended data segment (the segment with the address in ES). The
current element in the destination array is implicitly at the effective address specified in DI.
The x86 instruction set architecture includes some special array instructions (see Table 3) to
load, store, move, scan and compare the elements of these two implicit data arrays. All these
instructions have an 8-bit version (lodsb, stosb, movsb, scasb and cmpsb) and a 16-bit version
(lodsw, stosw, movsw, scasw and cmpsw). For the load, store and scan operations, the implicit
target/source/comparison register is AL (for the 8-bit version) and AX (for the 16-bit version).

All the array instructions implicitly use the direction flag (DF), which controls the left-to-right or
right-to-left direction of array processing. If DF is 0, then the index (indexes) of the element(s)
to which the instruction refers is (are) incremented. If DF is 1 then the index (indexes) of the
element(s) to which the instruction refers is (are) decremented. Consequently, all the array
instructions perform two operations: a) process the current element(s) in the array(s) and b)
move to the next element(s) in the array(s).
The array instructions can be preceded by one of the following repeatability prefixes: rep,
repe/repz, repne/repnz. These prefixes instruct the CPU to repeat the array instruction the
number of times specified by the implicit counter (CX), decrementing the counter at each
repetition. In the case of repe/repz and repne/repnz, which are usually used with scas or cmps,
the zero flag (resulting from the comparison) is also checked, and the repetition continues only
if the condition is fulfilled (ZF=1 for repe/repz and ZF=0 for repne/repnz).
Details and usage examples regarding the string/array instructions are provided in Section 3.7.
Table 3. x86 string/array instructions

Instruction | Usage | Description
MOVS (Move String) | MOVSB / MOVSW | Copy the current element in the source string to the current element in the destination string
LODS (Load String) | LODSB / LODSW | Load the current element from the source string in the accumulator
STOS (Store String) | STOSB / STOSW | Store the value of the accumulator in the current element in the destination string
SCAS (Scan String) | SCASB / SCASW | Compare the value in the accumulator with the current element in the destination string
CMPS (Compare String) | CMPSB / CMPSW | Compare the current element in the source string with the current element in the destination string
STD (Set DF) | STD | Set the direction flag: (DF) <- 1
CLD (Clear DF) | CLD | Clear the direction flag: (DF) <- 0
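
The sketch below illustrates the role of DF, CX and the repeatability prefixes: it copies a 5-byte array with rep movsb and then scans the copy with repne scasb. The names src and dst and the string 'hello' are hypothetical, and the program assumes that DS and ES hold the same segment address (the single-segment scenario used in this laboratory).

```asm
org 100h

        cld                 ; DF = 0: SI and DI are incremented after each element
        lea SI, src         ; DS:SI -> current element of the source array
        lea DI, dst         ; ES:DI -> current element of the destination array
        mov CX, 5           ; number of elements to copy
        rep movsb           ; copy 5 bytes; when it finishes, CX = 0

        lea DI, dst         ; scan the copy from its first element
        mov AL, 'l'         ; the value to look for
        mov CX, 5           ; maximum number of elements to scan
        repne scasb         ; repeat while ZF = 0 and CX is not 0

        ; 'l' is the third element, so scasb is executed 3 times:
        ; CX = 2, ZF = 1 and DI = offset dst + 3 (one past the match)
        int 20h             ; end the program

src     db 'hello'
dst     db 5 dup(?)
```

Note that DI is advanced even on the matching element, so after a successful scan the match is found at DI - 1.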

3.5 Exercises
3.5.1 Exercise 1
Objective. Understand the effect of executing the LEA, MOV, CMP, JE, INC, JMP and LOOP
instructions and the org and db assembly directives.
Requirement. Write a program that replaces all occurrences of a character (called charX),
within a character array (called inputString), with another character (called charY). Count the
number of replacements made using the register DX.

Solution.
1. Start the emulator.
2. Use the Source Code window to write the following program:
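org 100h
init:
        lea BX, inputString          ; BX = effective address of inputString
        mov SI, 0h                   ; SI = index of the current element
        mov CX, charX-inputString    ; CX = number of elements in inputString
        mov DX, 0h                   ; DX = number of replacements made
        mov AH, [charY]              ; AH = the replacement character
compare:
        mov AL, [BX+SI]              ; AL = the current element
        cmp AL, [charX]              ; compare it with charX (sets ZF if equal)
        je replace                   ; equal: go and replace it
nextElem:
        inc SI                       ; move to the next element
        loop compare                 ; CX = CX - 1, repeat while CX is not 0
        int 20h                      ; end the program
replace:
        mov [BX+SI], AH              ; overwrite the current element with charY
        inc DX                       ; count the replacement
        jmp nextElem                 ; jump back in the compare loop

inputString db 'alpha beta'
charX       db 'a'
charY       db 'A'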

3. Understand the program!


3.1. The first line of this program (org 100h) is not an instruction. This is an assembly
directive specifying that the next instruction (and consequently the whole program)
will be loaded in the memory (in the code segment) starting with address 100h.
3.2. The last three lines of this program are used to define and initialize three variables: the
array of characters inputString, the replaced character charX and the replacement charY.
The elements of the array are bytes (8-bit numbers) and the two variables charX and
charY are also 8-bit numbers. This is why they are defined using the db (define byte)
assembly directive. The variables are placed in the memory at some effective addresses
which will be revealed later.
3.3. Note that the elements of inputString and the values of the other two variables are
characters only from the point of view of the programmer (who chooses to
interpret them as characters). From the point of view of the processor the memory
contains only numbers (the ASCII codes of these characters, see Appendix 3. ASCII
Table).

3.4. The block of instructions labeled init has the role of initializing the registers. The
instruction lea BX, inputString loads the effective address of the inputString in BX. The
instruction mov SI, 0h initializes the register SI (which will be used to iterate through
the array) with the value 0h. The instruction mov CX, charX-inputString computes the
difference between the address of charX and the address of the first element in
inputString (this difference represents the number of elements in the character array)
and stores the result in the counter register CX. The instruction mov DX, 0h initializes
the register DX (which will be used to count the number of replacements) with the
value 0h. The last instruction in this block, mov AH, [charY], copies the ASCII code of the
charY character from the memory into the register AH.
3.5. The block of instructions labeled compare is used to iterate through the character array
and compare each element with the character charX (making the replacements when
needed). The instruction mov AL, [BX+SI] copies the value stored in the memory at the
effective address BX+SI in the register AL (given the above initializations, this value will
be the first character in inputString). The instruction cmp AL, [charX] compares (through
subtraction) the value in register AL with the value stored in the memory at the address
charX. This instruction does not modify the value stored in AL. Its purpose is to modify
the values of the flags (OF, SF, ZF, AF, PF, CF). The next instruction (je replace) takes a
jump to the label replace if the two numbers compared by the previous instruction were
equal. The instruction je (jump if equal) has access to the result of the previous
instruction through the zero flag (ZF). If ZF is 1 it means the subtraction ended with a
null result and the jump is taken. If ZF is 0 then the jump is not taken and the processor
continues with the following instruction (inc SI). The instruction inc SI increments the
value stored in the register SI. Consequently, the next element of the array will be
processed at the next iteration. The last instruction in this block is loop compare. This
instruction decrements the implicit counter (CX) and, if the resulting value is not 0, it
jumps back to the compare label (to process the next element in inputString). If the value
of CX is 0 then it means that all the elements of the array were processed and the
processor does not jump back to compare: it continues with the following instruction
(int 20h). This last instruction ends the current program.
3.6. The block of instructions labeled replace is executed only when the current element in
the inputString is equal to charX. The first instruction in this block (mov [BX+SI], AH)
overwrites the value stored in the memory at the address BX+SI (the current element)
with the value stored in the register AH (the character charY). The second instruction in
this block (inc DX) increments the number of replacements made and the last
instruction jumps unconditionally back in the compare loop at the label nextElem.
4. Save the program (File menu -> Save As submenu).
5. Compile the program and view the symbol list
5.1. Click the Compile button to compile the program.
5.2. You will be prompted to save the executable file. Save it with the recommended name.

5.3. View the compilation status in the Assembler Status dialog. If the program was edited
correctly the message should be <program name> is assembled successfully into
bytes.
5.4. Click the View Button and then Symbol Table to view the symbol table associated to this
program. The information presented in this list should be interpreted as follows:
• the symbols charX, charY and inputString are byte variables (size = 1) stored in
  the memory at the addresses 012Ch, 012Dh and 0122h, respectively. Note that even
  though inputString defines an array, the symbol inputString represents only the start
  address of this array. These symbols are analogous to C pointers.
• the symbols compare, init, nextElem and replace are labels of some instructions
  in the program and are associated with the addresses of these instructions
  (0110h, 0100h, 0118h, and 011Dh, respectively).
5.5. The list of symbols will help you find the data you are working with (in the memory).
6. Load the executable program in the emulator.
6.1. Click the Run button to load the program in the emulator and execute it.
7. Execute the program step-by-step, watch the status change of the registers, memory
locations, flags, etc. and write down observations.
7.1. Click the Reload button to reload the executed program.
7.2. Inspect the Emulator Window and note that:
7.2.1. The current instruction (mov BX, 0122h) is highlighted. This is the first instruction
in the program and was loaded at the logical address 0700:0100 (segment address
: effective address). The effective address was imposed by the org 100h assembly
directive. This instruction is equivalent to the instruction you wrote in the program
(lea BX, inputString), because the symbol inputString was replaced with the address
it is associated with.
7.2.2. The value in the register IP (the register that stores the effective address of the
current instruction) is 0100h.
7.3. Click on View Menu -> Memory to view the current status of the memory. In the Address
Text Box write the address of the inputString variable: leave the segment address
unchanged (0700) and replace the effective address (0100) with the effective address of
inputString (0122). Click on the Update Button and note that:
7.3.1. The start address is now 0700:0122 and the values stored in the memory are 61h,
6Ch, 70h, .... In the right part of the Memory Window these numbers are interpreted
as ASCII codes and represented with characters ('alpha beta'). Recognize that
these are the characters in inputString. Immediately after them you should see the two
characters representing the variables charX ('a') and charY ('A').

7.4. Click the Single Step button to execute the first instruction (mov BX, 0122h) and note
that register BX was loaded with the start address of inputString (0122h).
7.5. Execute the next three instructions and note that the registers SI, CX and DX were loaded
with the values 0000h, 000Ah and 0000h.
7.6. Inspect the Memory Window and note that the memory location with the address 012Dh
(the address of charY) stores the value 41h (the ASCII code of 'A'). Execute the
instruction mov AH, [012D] and note that register AH (the upper half of AX) was loaded
with the value stored in the memory at address 012Dh (41h).
7.7. Compute BX+SI (at the first iteration the result is 0122h + 0h = 0122h). Inspect the
Memory Window and note that the memory location with the address 0122h (the
address of the first element in inputString) stores the value 61h (the ASCII code of 'a').
Execute the instruction mov AL, [BX+SI] and note that register AL (the lower half of AX)
was loaded with the value stored in the memory at address 0122h (61h).
7.8. Note that the memory location with the address 012Ch (the address of charX) stores the
value 61h (the ASCII code of 'a'). Execute the instruction cmp AL, [012C], click on the
Flags button to view the status of the flags and note that the zero flag (ZF) was set (to 1).
7.9. The next instruction is jz 011Dh. This instruction is equivalent to the instruction you
wrote in the program (je replace), because the symbol replace was replaced with the
address it is associated with. The zero flag is now one, which means the execution of the
instruction should result in a jump to the label replace. Execute the current instruction
and note that the jump was made to the first instruction in the replace block of code.
Also note that the value in the register IP (the register that stores the effective address
of the current instruction) is 011Dh.
7.10. The first element in inputString was equal to 'a' (the same as the variable charX).
Consequently, a jump was taken to the replace block of instructions. Compute BX + SI (at
the first iteration the result is 0122h + 0h = 0122h). Execute the current instruction (mov
[BX+SI], AH) and note that the value in register AH (41h) was copied in the memory
location with the address 0122h. Note the modification in the character representation of
the Memory Window also (inputString should now be 'Alpha beta').
7.11. Execute the next instruction (inc DX) and note that the value in DX was
incremented.
7.12. Execute the next instruction (jmp 0118h) and note that a jump is made to the
instruction with the effective address 0118h (the next instruction which will be
executed will be the one with address 0118h). Remember that this address is in fact
associated with the label nextElem (the instruction you wrote in the source code was
jmp nextElem).

7.13. After the first element in the string was replaced, the processor jumped back in
the compare loop at the nextElem label. Execute the current instruction (inc SI) and note
that the value in the register SI was incremented.
7.14. The current instruction is loop 0110h. Check the symbol list and remember that
the compare label was associated with the 0110h address. Note that the register CX
stores the value 000Ah. Execute the instruction and note that:
7.14.1. The value in CX was decremented.
7.14.2. The processor took a jump to the instruction with the address 0110h (the first
instruction in the compare loop).
7.15. Compute BX+SI (at the second iteration the result is 0122h + 1h = 0123h). Inspect
the Memory Window and note that the memory location with the address 0123h (the
address of the second element in inputString) stores the value 6Ch (the ASCII code of
'l'). Execute the instruction mov AL, [BX+SI] and note that register AL was loaded with
the value stored in the memory at address 0123h (6Ch).
7.16. Note that the memory location with the address 012Ch (the address of charX)
stores the value 61h (the ASCII code of 'a'). Execute the instruction cmp AL, [012C], click
on the Flags button to view the status of the flags and note that the zero flag (ZF) was
reset (to 0). This is because 6Ch - 61h is not zero. Consequently, the next instruction (jz
011Dh) will not result in a jump to replace (the instruction with the address 011Dh).
The processor will continue the execution sequentially (with the following instruction:
inc SI).
7.17. Execute the current instruction (inc SI) and note that the value in the register SI
was incremented.
7.18.
The current instruction is loop 0110h. Not that we reached the end of the
compare loop again. This time the current element (the second one) in the string was
not equal to charX, therefore it was not replaced with charY. Execute the instruction
(loop 0110h) and note that;
7.18.1. The value in CX was decremented.
7.18.2. The processor took a jump to the instruction with the address 0110h (the first
instruction in the compare loop).
7.19. Continue to execute the instructions step by step, noting the modifications in
inputString, in DX (which counts the replacements) and in CX (which counts the number of
unprocessed elements). Stop when CX reaches 1. Execute the instruction (loop 0110h)
and note that:
7.19.1. The value in CX was decremented and now it stores the value 0h.
7.19.2. The processor does not jump back to the instruction with the address 0110h (the
first instruction in the compare loop), but continues with the following instruction
(int 20h).
7.20. The current instruction is int 20h. Click the Single Step button twice and note
that a Message dialog is displayed, saying that the program has returned control to the
operating system. Click Ok.
8. Write down conclusions regarding the effect of the various instructions on the registers,
flags and memory locations.
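
The cmp / conditional jump / loop pattern traced in the steps above can be reduced to a minimal sketch. The program below is not lab3_prog1.asm: it is a hypothetical example (the names msg, key, scan and skip are assumptions made for illustration) that counts in DX how many characters of a string are equal to a given character, written in the emu8086 syntax used in this laboratory.

```asm
org 100h
        lea SI, msg             ; SI = effective address of the first character
        mov CX, key-msg         ; CX = number of characters in msg (5)
        mov DX, 0               ; DX counts the matches
scan:   mov AL, [SI]            ; load the current character
        cmp AL, [key]           ; computes AL - key; ZF = 1 only if they are equal
        jnz skip                ; ZF = 0: not a match, skip the increment
        inc DX                  ; ZF = 1: count the match
skip:   inc SI                  ; advance to the next character
        loop scan               ; CX = CX - 1, jump back while CX is not zero
        int 20h                 ; at this point DX = 2 and CX = 0

msg     db 'hello'
key     db 'l'
```

When stepping through it, the zero flag should be set only on the two iterations where AL holds 6Ch, and the last execution of loop should fall through to int 20h with CX equal to 0.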

3.5.2 Exercise 2
Objective. Understand the effect of executing the string instructions (cld, std, movs, lods, and
stos), some conditional jump instructions (je, jne) and the repeat prefix rep.
Requirement. Write a program that replaces all occurrences of a character (called charX)
within an array of characters (called inputString) with another character (called charY). The
program builds the modified array in an auxiliary memory zone (where the array will be called
outputString) and then copies it back over inputString.
Solution.
1. Start the emulator.
2. Load the program called lab3_prog2.asm using the Open button. The Source Code window
should display the following program:

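org 100h
init:        lea SI, inputString         ; SI -> first element of the source array
             lea DI, outputString        ; DI -> first element of the destination array
             mov CX, charX-inputString   ; CX = number of elements in inputString
             cld                         ; DF = 0: iterate from left to right

compare:     lodsb                       ; AL = current element, SI = SI + 1
             cmp AL, [charX]
             jne copyCurrent             ; different from charX: copy it unchanged
             je copyCharY                ; equal to charX: copy charY instead
nextElem:    loop compare                ; CX = CX - 1, repeat while CX is not zero

copyBack:    dec SI                      ; SI -> last element of inputString
             dec DI                      ; DI -> last copied element of outputString
             xchg SI, DI                 ; outputString becomes the source array
             mov CX, charX-inputString   ; CX = number of elements to copy back
             std                         ; DF = 1: iterate from right to left
             rep movsb                   ; copy CX bytes back into inputString

end:         int 20h                     ; return control to the operating system

copyCurrent: stosb                       ; store AL in outputString, DI = DI + 1
             jmp nextElem

copyCharY:   mov AL, [charY]             ; replace the current element with charY
             stosb
             jmp nextElem

inputString  db 'alpha beta'
charX        db 'a'
charY        db 'A'
outputString db 20 dup('X')
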

3. Understand the program!


3.1. The last four lines of this program are used to define and initialize the variables.
Compared with the previous exercise, this program adds the line outputString db 20 dup('X'),
which allocates an array of 20 memory locations and initializes all the values in the array
with the character X (capital x). This will be the output string.
3.2. Unlike in the previous exercise, inputString is now iterated using a single
pointer (SI), and outputString is iterated using DI. Consequently, in the block of
instructions labeled init these two registers are initialized with the addresses of the first
elements in the two arrays. The direction flag is reset to 0 (cld), meaning that the
arrays will be iterated from left to right (towards higher addresses). The counter CX is
initialized exactly as in the previous exercise (with the number of elements in inputString).
3.3. The block of instructions labeled compare iterates through inputString and compares
the current element with charX. If the current element is charX the program copies
charY into outputString; otherwise it copies the current element from inputString to
outputString. The instruction lodsb copies the current element from the source array
into the accumulator (AL) and increments SI. The instruction cmp AL, [charX] compares the
current element with charX. If the two values are not equal the program jumps to
copyCurrent (jne copyCurrent), otherwise it jumps to copyCharY (je copyCharY). The
instructions at copyCurrent: a) store the value from the accumulator in outputString
(also incrementing DI) and b) jump back into the compare loop, at nextElem. The
instructions at copyCharY do the same after first loading charY into the accumulator.
Finally, the instruction loop decrements the counter CX and jumps back to compare if CX
is not zero.
3.4. After the compare loop is finished the values in outputString must be copied back into
inputString. This operation is performed by the block of instructions called copyBack.
SI and DI are decremented so that they store the effective addresses of the last elements
processed in inputString and outputString, respectively. Then their values are exchanged
(xchg SI, DI) because for this operation inputString is the destination array and
outputString is the source array. CX is again initialized with the number of elements in
inputString (the number of elements to be copied back). The direction flag is set to 1 (std)
because the arrays will be iterated from right to left. The repeat prefix rep placed
in front of movsb (rep movsb) repeats the instruction movsb the number of times specified
in CX. CX is decremented after each execution of movsb. Consequently, the instruction
rep movsb copies all the elements from outputString back to inputString, going from
right to left.
4. Compile the program and view the symbol list.
5. Load the executable program in the emulator.
6. Execute the program step-by-step, watch the status change of the registers, memory
locations, flags, etc. and write down observations. In particular, while executing this
program you should watch the changes in the memory and you should note how the
elements in the two arrays of characters are modified.
7. Write down conclusions regarding the effect of the various instructions on the registers,
flags and memory locations.
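
The right-to-left copy performed by copyBack can be tried in isolation. The sketch below is a hypothetical stand-alone program (the names source and dest are assumptions, not part of lab3_prog2.asm) and it assumes a .COM program in which ES and DS point to the same segment, as in this laboratory.

```asm
org 100h
        lea SI, source          ; SI = address of the first source byte
        add SI, 3               ; SI -> last of the 4 source bytes
        lea DI, dest            ; DI = address of the first destination byte
        add DI, 3               ; DI -> last of the 4 destination bytes
        mov CX, 4               ; number of bytes to copy
        std                     ; DF = 1: movsb decrements SI and DI
        rep movsb               ; copies 'D', 'C', 'B', 'A' in this order
        cld                     ; restore DF = 0 for any later string instruction
        int 20h                 ; dest now holds 'ABCD', CX = 0

source  db 'ABCD'
dest    db 4 dup('?')
```

After rep movsb both pointers end one position below the start of their arrays, which is why a backward copy has to begin at the last element rather than at the first.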

3.5.3 Exercise 3
Objective. Understand the effect of executing the push, pop, lods, stos instructions, the equ and
dw assembly directives and the usage of the stack. Understand the little-endian data storage
convention.
Requirement. Write a program that pushes onto the stack all the values larger than 8000h from
an array of unsigned 16-bit numbers (called numbers), then pops the values from the stack,
copying them to the beginning of the array.

Solution.
1. Start the emulator.
2. Load the program called lab3_prog3.asm using the Open button. The Source Code window
should display the following program:
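org 100h
             jmp init                    ; skip over the data definitions

value        equ 8000h                   ; constant threshold
numbers      dw 8933h, 1240h, 0328h, 99A0h, 0F422h, 0101h

init:        lea SI, numbers             ; SI -> first element of the array
             mov CX, (init-numbers)/2    ; CX = number of 16-bit elements
             mov DX, 0h                  ; DX counts the values pushed onto the stack
             cld                         ; DF = 0: iterate from left to right

compare:     lodsw                       ; AX = current element, SI = SI + 2
             cmp AX, value
             ja insert                   ; AX above 8000h (unsigned): push it
nextElem:    loop compare                ; CX = CX - 1, repeat while CX is not zero

             lea DI, numbers             ; DI -> first element of the array
             mov CX, DX                  ; CX = number of values to pop
extract:     pop AX                      ; AX = value from the top of the stack
             stosw                       ; store AX in the array, DI = DI + 2
             loop extract

end:         int 20h                     ; return control to the operating system

insert:      push AX
             inc DX
             jmp nextElem
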

3. Understand the program!


3.1. The first instruction in the program (jmp init) jumps over the variables definition
section.
3.2. The next two lines of this program define one constant, called value, and one
variable: the array of numbers, called numbers. Remember that the equ directive is used
to define constant values, while the dw directive is used to define word variables.
Consequently, the elements of the array are words (16-bit numbers). The array is placed
in the memory at an effective address which will be revealed later. Note that hexadecimal
numbers that start with a letter (A-F) must be preceded by a leading zero (for example
0F422h); otherwise the assembler would treat them as names.
3.3. The block of instructions labeled init is used to initialize the registers and flags which
will be used in this program. The instruction lea SI, numbers loads the effective address
of the numbers array in SI. The instruction mov CX, (init-numbers)/2 computes the
difference between the address of the init label and the address of the first element in
the numbers array (this difference represents the number of memory locations allocated
for the array), divides it by two (to obtain the number of elements in the array) and
stores the result in the counter register CX. The instruction mov DX, 0h initializes the
register DX (which will be used to count the number of values inserted in the stack) with
the value 0h. The last instruction in this block (cld) resets the value of the direction
flag (DF), meaning the string instructions will iterate through the arrays from left to right.
3.4. The block of instructions labeled compare iterates through the array of numbers and
stores in the stack all the values larger than 8000h. The first instruction in this block
(lodsw) copies the current 16-bit number from the source string (the value in the
memory at the effective address stored in SI) into the accumulator (AX) and advances SI
by 2. Next, this value is compared with 8000h. After the comparison, a conditional jump
(ja insert) is used to jump to label insert if the current element in the array is larger
than 8000h. The loop instruction decrements the counter CX and jumps back to label
compare if CX is nonzero.
3.5. The block of instructions labeled insert pushes the value in AX onto the stack and
increments DX (which counts how many numbers were stored in the stack). In the end
it jumps back inside the compare loop.
3.6. After the compare loop, DI is initialized with the effective address of the first element in
the numbers array and CX is reinitialized with the number of values that need to be
popped out of the stack. The extract loop pops the values from the stack, stores them in
the accumulator and from the accumulator into the destination array (which is the
same numbers array). Because the stack is a last-in, first-out structure, the values are
written to the array in the reverse of the order in which they were pushed.
3.7. Finally, the instruction int 20h ends the current program.
4. Compile the program and view the symbol list.
5. Load the executable program in the emulator.
6. Execute the program step-by-step, watch the status change of the registers, memory
locations, flags, etc. and write down observations. In particular, while executing this
program you should watch the changes in the stack and you should note how 16-bit
numbers are stored in the memory using two memory locations (the least significant byte at
the lower address and the most significant byte at the higher address).
7. Write down conclusions regarding the effect of the various instructions on the registers,
flags and memory locations.
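
The little-endian convention mentioned in step 6 can be observed with a very small program. The sketch below is a hypothetical example (the names sample and start are assumptions) that reads the two bytes of a word separately, first from a dw variable and then from the top of the stack.

```asm
org 100h
        jmp start               ; skip over the data
sample  dw 1234h                ; stored in memory as 34h, 12h

start:  lea BX, sample          ; BX = effective address of the word
        mov AL, [BX]            ; AL = 34h: least significant byte, lower address
        mov AH, [BX+1]          ; AH = 12h: most significant byte, higher address
        push AX                 ; the stack uses the same byte order
        mov BP, SP              ; BP = address of the top of the stack
        mov DL, [BP]            ; DL = 34h
        mov DH, [BP+1]          ; DH = 12h
        pop AX                  ; leave the stack as it was found
        int 20h                 ; AX = DX = 1234h
```

In the Memory Window the bytes of sample should appear in the order 34h, 12h, which is the reverse of the way the number is written in the source code.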

3.5.4 Exercise 4
Objective. Emphasize the importance of how the numbers processed in an assembly program are
interpreted (as signed or as unsigned values).

Requirement. Refer to the previous exercise and modify the source code so that the values in
the numbers array are regarded as signed numbers.
Solution.
1. Redo the previous exercise and note that after the execution of the program the values
8933h, 99A0h and 0F422h were selected as being larger than 8000h, and were copied at
the beginning of the array.
2. Modify the source code by replacing the ja insert instruction with the jg insert instruction.
This is the only required change because:
2.1. The cmp instruction compares the current element in the array with the constant value
without knowing how the data is interpreted: it always sets the flags in the same way.
CF and ZF describe the result of the comparison for unsigned operands, while SF, OF
and ZF describe it for signed operands.
2.2. The values in the numbers array and the constant value can be regarded as signed or
unsigned values; the interpretation is the programmer's choice: after the comparison,
the programmer uses either a conditional jump for unsigned numbers (such as ja) or a
conditional jump for signed numbers (such as jg).
2.3. The rest of the program logic remains unchanged.
3. Compile the program and view the symbol list.
4. Execute the program step-by-step, watch the status change of the registers, memory
locations, flags, etc. and write down observations.
5. Write down conclusions regarding the effect of the various instructions on the registers,
flags and memory locations.
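
The difference between the two kinds of conditional jumps can be seen on a single comparison. The sketch below is a hypothetical example (the labels notAbove and notGreater are assumptions); it compares 8933h with 1240h once and then tests the same flags with an unsigned jump and with a signed jump, recording the outcomes in BL and BH.

```asm
org 100h
        mov BX, 0               ; BL and BH will record the two outcomes
        mov AX, 8933h
        cmp AX, 1240h           ; sets the flags according to AX - 1240h
        jna notAbove            ; unsigned test: 8933h is above 1240h, no jump
        mov BL, 1               ; BL = 1 (mov does not change the flags)
notAbove:
        jng notGreater          ; signed test: 8933h is negative, the jump is taken
        mov BH, 1               ; skipped
notGreater:
        int 20h                 ; BL = 1, BH = 0
```

The subtraction 8933h - 1240h gives 76F3h with CF = 0, ZF = 0, SF = 0 and OF = 1, so the "above" condition (CF = 0 and ZF = 0) holds while the "greater" condition (ZF = 0 and SF = OF) does not.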

3.5.5 Exercise 5
Requirement. Write a program that copies all the values lower than 1000h from an array of
signed 16-bit numbers (called inputArray) into another array (called outputArray). In this
exercise you are required to use the string instructions lods and stos.

3.6 Appendix 1. Data transfer instructions examples


MOV - Move (Copy) Data

Usage: MOV dest, src
Arguments:
dest - general-purpose register, segment register (except CS) or memory location
src - immediate value, general-purpose register, segment register or memory location
Effects: Copies the source to the destination, overwriting the destination's value: (dest) ← (src).
Flags: none
Miscellaneous: The arguments must be the same size (byte, word).
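
A minimal sketch of the main MOV forms follows. The variable names byteVar and wordVar and the label start are assumptions made for illustration.

```asm
org 100h
        jmp start               ; skip over the data
byteVar db 12h
wordVar dw 3456h

start:  mov AX, 0ABCDh          ; immediate -> register:        AX = 0ABCDh
        mov BX, AX              ; register -> register:         BX = 0ABCDh
        mov CL, [byteVar]       ; memory -> register (8 bits):  CL = 12h
        mov [wordVar], AX       ; register -> memory (16 bits): wordVar = 0ABCDh
        mov DX, [wordVar]       ; memory -> register (16 bits): DX = 0ABCDh
        int 20h
```

A form such as mov AX, BL is rejected by the assembler because the two arguments do not have the same size.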

XCHG - Exchange Data

Usage: XCHG dest, src
Arguments:
dest - register or memory location
src - register or memory location
Effects: Exchanges the source with the destination: (dest) ↔ (src).
Flags: none
Miscellaneous: The arguments must be the same size (byte, word). Two memory locations
cannot be used in one instruction.
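
A minimal sketch of XCHG with two registers and with a register and a memory location follows. The variable name byteVar is an assumption made for illustration.

```asm
org 100h
        mov AX, 1111h
        mov BX, 2222h
        xchg AX, BX             ; AX = 2222h, BX = 1111h
        xchg AL, [byteVar]      ; AL = 55h, byteVar = 22h
        int 20h                 ; AX = 2255h at this point

byteVar db 55h
```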

PUSH - Push Operand in the Stack

Usage: PUSH src
Arguments: src - 16-bit immediate value, register or memory location
Effects: Decrements stack pointer with 2 and copies src on top of the stack: (SP) ← (SP) - 2,
((SS):(SP)+1) ← (src_high), ((SS):(SP)) ← (src_low).
Flags: none
Miscellaneous: src must be a 16-bit value.

POP - Pop a Word from the Stack

Usage: POP dest
Arguments: dest - 16-bit register, segment register or memory location
Effects: Copies the element (16-bit) from the top of the stack into dest and increments the stack
pointer with 2: (dest_high) ← ((SS):(SP)+1), (dest_low) ← ((SS):(SP)), (SP) ← (SP) + 2.
Flags: none
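
PUSH and POP are easiest to follow as a pair. The sketch below is an illustration, not one of the laboratory programs; it swaps the contents of two registers by relying on the last-in, first-out order of the stack.

```asm
org 100h
        mov AX, 1111h
        mov BX, 2222h
        push AX                 ; SP = SP - 2, top of the stack = 1111h
        push BX                 ; SP = SP - 2, top of the stack = 2222h
        pop AX                  ; AX = 2222h, SP = SP + 2
        pop BX                  ; BX = 1111h, SP = SP + 2
        int 20h                 ; SP is back at its initial value
```

Every push should be matched by a pop before the program ends, so that SP returns to the value it had at the start.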

3.7 Appendix 2. String/Array instructions examples


MOVS - Move String

Usage: MOVSB / MOVSW
Arguments: none
Effects:
movsb: Copies the current 8-bit element in the source string to the current element in
the destination string and increments (if DF=0) or decrements (if DF=1) the values in SI
and DI by 1: ((ES):(DI)) ← ((DS):(SI)), (SI) ← (SI) ± 1, (DI) ← (DI) ± 1.
movsw: Copies the current 16-bit element in the source string to the current element in
the destination string and increments (if DF=0) or decrements (if DF=1) the values in SI
and DI by 2: ((ES):(DI)) ← ((DS):(SI)), ((ES):(DI)+1) ← ((DS):(SI)+1),
(SI) ← (SI) ± 2, (DI) ← (DI) ± 2.
Flags: none
Miscellaneous: can be prefixed by rep, repe/repz, repne/repnz
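
A minimal sketch of a forward word copy with rep movsw follows. The names source and dest are assumptions, and the program assumes that ES and DS point to the same segment, as they do in a .COM program.

```asm
org 100h
        lea SI, source          ; SI -> first source element
        lea DI, dest            ; DI -> first destination element
        mov CX, 3               ; three 16-bit elements
        cld                     ; DF = 0: SI and DI are incremented
        rep movsw               ; copies 3 words, SI and DI advance by 2 each time
        int 20h                 ; dest = 1111h, 2222h, 3333h and CX = 0

source  dw 1111h, 2222h, 3333h
dest    dw 3 dup(0)
```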

LODS - Load String

Usage: LODSB / LODSW
Arguments: none
Effects:
lodsb: Copies the current 8-bit element from the source string to the accumulator and
increments (if DF=0) or decrements (if DF=1) the value in SI by 1: (AL) ← ((DS):(SI)),
(SI) ← (SI) ± 1.
lodsw: Copies the current 16-bit element from the source string to the accumulator and
increments (if DF=0) or decrements (if DF=1) the value in SI by 2: (AL) ← ((DS):(SI)),
(AH) ← ((DS):(SI)+1), (SI) ← (SI) ± 2.
Flags: none
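
A minimal sketch of lodsb inside a loop follows: it adds up the elements of a small byte array. The names values and addNext are assumptions made for illustration.

```asm
org 100h
        lea SI, values          ; SI -> first element
        mov CX, 4               ; number of elements
        mov BL, 0               ; BL accumulates the sum
        cld                     ; DF = 0: SI is incremented
addNext: lodsb                  ; AL = current element, SI = SI + 1
        add BL, AL
        loop addNext            ; CX = CX - 1, repeat while CX is not zero
        int 20h                 ; BL = 0Ah (1 + 2 + 3 + 4)

values  db 1, 2, 3, 4
```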

STOS - Store String

Usage: STOSB / STOSW
Arguments: none
Effects:
stosb: Copies the value in the accumulator in the current 8-bit element in the destination
string and increments (if DF=0) or decrements (if DF=1) the value in DI by 1:
((ES):(DI)) ← (AL), (DI) ← (DI) ± 1.
stosw: Copies the value in the accumulator in the current 16-bit element in the
destination string and increments (if DF=0) or decrements (if DF=1) the value in DI by 2:
((ES):(DI)) ← (AL), ((ES):(DI)+1) ← (AH), (DI) ← (DI) ± 2.
Flags: none
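
A minimal sketch of stosb used with the rep prefix to fill a memory area follows. The name buffer is an assumption, and the program assumes that ES and DS point to the same segment, as they do in a .COM program.

```asm
org 100h
        lea DI, buffer          ; DI -> first byte to fill
        mov AL, '*'             ; value to store
        mov CX, 8               ; number of bytes to fill
        cld                     ; DF = 0: DI is incremented
        rep stosb               ; stores AL eight times, DI = DI + 1 each time
        int 20h                 ; buffer = '********' and CX = 0

buffer  db 8 dup(' ')
```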

3.8 Appendix 3. ASCII Table
