MEMORY ACCESS
Example 1: using a physical address of 2 bits, one can form 4 different physical addresses: 00,
01, 10, and 11, corresponding to 4 different memory locations. Consequently, a memory with a
physical address of 2 bits will comprise 4 memory locations (4 bytes).
Example 2: using a physical address of 20 bits, one can form 2^20 different physical addresses,
corresponding to 2^20 different memory locations. Consequently, a memory with a physical
address of 20 bits will comprise 2^20 memory locations (1 MB).
Each memory location has a unique physical address and stores an 8-bit value:

Physical address    Contents
FFFFFh              88h
FFFFEh              73h
...                 ...
00010h              09h
0000Fh              1Bh
0000Eh              ACh
...                 ...
00001h              17h
00000h              24h
[Figure: overlapping memory segments. Columns: physical addresses (PAs), the effective addresses (EAs) of the same locations inside the segments with SA = 0000h, SA = 0001h and SA = 0002h, and the contents. For example, the location with physical address 00021h has EA 0021h in segment 0000h, EA 0011h in segment 0001h and EA 0001h in segment 0002h.]
Although the address registers (BX, SI, DI, SP, BP and IP) are associated by default with specific
segment registers (IP with CS; BX, SI and DI with DS, except in string instructions, where DI is
paired with ES; SP and BP with SS) to form complete addresses for specific information in the
memory, the x86 architecture also permits the use of some address registers with segment
registers other than the default ones (this is called segment redirection, also known as a
segment override). BX, SI and DI can also be used to access data in the code segment,
data segment, extended data segment, and stack segment. BP can also be used to access data in
the code segment, data segment, and extended data segment.
CS, DS, ES and SS can be configured to store the same segment address. In this case all the
information required by the current program is found in a single memory segment. This is in
general the scenario we are going to use further in the laboratory.
Summary for memory organization:
- the memory can be regarded as a sequence of memory locations
- each memory location stores an 8-bit number and has a unique 20-bit address, called the
  physical address (PA)
- the x86 CPU regards the memory as being composed of 64k overlapping segments comprising 64k
  locations each
- the x86 CPU uses a 16-bit segment address (SA) to select a segment and a 16-bit
  effective address (EA) to identify a memory location inside the segment
- segment addresses (SAs) can be stored in one of the following segment registers: CS, DS,
  ES, SS
- effective addresses (EAs) can be stored in one of the following address registers: BX, SI,
  DI, SP, BP and IP
- the translation between the logical organization of the memory in segments and the
  physical address is done as follows: PA = SA × 10h + EA (the segment address is shifted left
  by one hexadecimal digit, then the effective address is added)
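The translation rule can be checked with a short experiment. The sketch below is only an illustration in emu8086 syntax: it assumes a .com program (so DS = CS when it starts) and assumes that offset 0200h inside the program's segment is unused. It writes a byte through DS and reads the same physical location through ES, using a segment override.

```asm
org 100h
        mov AX, CS
        inc AX
        mov ES, AX                  ; ES = CS + 1 (DS = CS in a .com program)
        mov BX, 0200h               ; assumed-free offset, chosen for illustration
        mov byte ptr [BX], 5Ah      ; write at PA = DS*10h + 0200h
        mov SI, 01F0h
        mov AL, ES:[SI]             ; read at PA = (DS+1)*10h + 01F0h = DS*10h + 0200h
        int 20h                     ; AL = 5Ah: both logical addresses name one location
```

Two different SA:EA pairs reach the same byte because consecutive segments start only 10h locations apart.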
Addressing mode           Description
simple addressing
  immediate               the data is found in the memory (in the code segment) immediately after the instruction code
  direct                  the data is found in the memory at the effective address specified in the current instruction
  indexed                 the data is found in the memory at the address specified by the content of SI or DI + an offset found in the current instruction
  indirect                the data is found in the memory at the address specified by the content of BX, SI or DI
addressing relative to BX
  direct                  the data is found in the memory at the effective address obtained as a sum between the content of BX and the offset specified in the current instruction
  indexed                 the data is found in the memory at the effective address obtained as a sum between the content of BX, the content of SI or DI and the offset specified in the current instruction
  implicit                the data is found in the memory at the effective address obtained as a sum between the content of BX and the content of SI or DI
addressing relative to BP
  direct                  the data is found in the memory at the effective address obtained as a sum between the content of BP and the offset specified in the current instruction
  indexed                 the data is found in the memory at the effective address obtained as a sum between the content of BP, the content of SI or DI and the offset specified in the current instruction
  implicit                the data is found in the memory at the effective address obtained as a sum between the content of BP and the content of SI or DI
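The addressing modes described above can be illustrated with one instruction each. This is a minimal sketch in emu8086 syntax; the variable name table and its contents are assumptions introduced only for the example.

```asm
org 100h
        jmp start
table   db 10h, 20h, 30h, 40h   ; hypothetical data used by the examples
start:
        mov AL, 25h             ; immediate: the value follows the instruction code
        mov AL, [table]         ; direct: EA is given in the instruction (AL = 10h)
        mov SI, 1
        mov AL, table[SI]       ; SI + offset (AL = 20h)
        lea BX, table
        mov AL, [BX]            ; EA = content of BX (AL = 10h)
        mov AL, [BX+2]          ; BX + offset (AL = 30h)
        mov AL, [BX+SI+2]       ; BX + SI + offset (AL = 40h)
        mov AL, [BX+SI]         ; BX + SI (AL = 20h)
        mov BP, SP
        mov AX, [BP+2]          ; BP + offset (stack segment by default)
        mov AX, [BP+SI+2]       ; BP + SI + offset
        mov AX, [BP+SI]         ; BP + SI
        int 20h
```

The three BP-based instructions read whatever happens to be on the stack; they are included only to show the operand forms.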
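db 64    ; one byte initialized to 64
db ?     ; one uninitialized byte
db 10    ; one byte initialized to 10
dw ?     ; one uninitialized 16-bit word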
Unlike in high level languages where arrays can have many dimensions and are accessed by
indices, arrays in x86 assembly language are simply a number of cells located contiguously in
memory. An array can be declared by just listing the values, as in the first example below. Two
other common methods used for declaring arrays of data are the dup directive and the use of
string literals. The dup directive tells the assembler to duplicate an expression a given number
of times. For example, 4 dup (2) is equivalent to 2, 2, 2, 2.
Example declarations:
array1 dw 1, 2, 3       ; Declare an array with three values, initialized to 1, 2, and 3.
bytes  db 10 dup(?)     ; Declare 10 uninitialized bytes starting at location bytes.
array  dw 100 dup(0)    ; Declare 100 words starting at location array, all initialized to 0.
string db 'hello',0     ; Declare 6 bytes starting at the address string, initialized to the
                        ; ASCII character values for hello and the null (0) byte.
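Because an array is only a run of contiguous cells, iterating over it means advancing an address register by the element size. The following sketch (emu8086 syntax; the labels start and sumLoop are illustrative names) sums the three words of array1.

```asm
org 100h
        jmp start
array1  dw 1, 2, 3
start:
        lea SI, array1      ; SI = effective address of the first element
        mov CX, 3           ; number of elements
        mov AX, 0           ; running sum
sumLoop:
        add AX, [SI]        ; add the current 16-bit element
        add SI, 2           ; each word occupies two memory locations
        loop sumLoop
        int 20h             ; AX = 0006h
```

For an array declared with db, the step would be 1 instead of 2.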
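Instruction   Usage       Description
PUSH          PUSH src    decrements SP by 2 and stores the 16-bit value src at the top of the stack (SS:SP)
POP           POP dest    copies the 16-bit value at the top of the stack (SS:SP) into dest and increments SP by 2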
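A short sketch of the two stack instructions, in emu8086 syntax with arbitrarily chosen values, shows the last-in, first-out behavior: the values come back in the reverse order, so the two registers end up swapped.

```asm
org 100h
        mov AX, 1234h
        mov BX, 5678h
        push AX             ; SP = SP - 2, the word at SS:SP becomes 1234h
        push BX             ; SP = SP - 2, the word at SS:SP becomes 5678h
        pop AX              ; AX = 5678h (the last value pushed), SP = SP + 2
        pop BX              ; BX = 1234h, SP = SP + 2
        int 20h
```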
All the array instructions implicitly use the direction flag (DF), which controls the left-to-right or
right-to-left direction of array processing. If DF is 0, the index register(s) used by the
instruction (SI and/or DI) are incremented after each operation. If DF is 1, they are
decremented. Consequently, all the array instructions perform two operations: a) process the
current element(s) in the array(s) and b) move to the next element(s) in the array(s).
The array instructions can be preceded by one of the following repeatability prefixes: rep,
repe/repz, repne/repnz. These prefixes instruct the CPU to repeat the array instruction the
number of times specified by the implicit counter (CX), decrementing the counter at each
repetition. In the case of repe/repz and repne/repnz, which are usually used with scas or cmps,
the zero flag (resulting from the comparison) is also checked and the repetition continues only
while the condition is fulfilled (ZF=1 for repe/repz and ZF=0 for repne/repnz).
Details and usage examples regarding the string/array instructions are provided in Section 3.7.
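As an illustration of a conditional repeat prefix, the sketch below searches a string for a space with repne scasb. It is written in emu8086 syntax and assumes a .com program, where ES = DS; the label text and the searched character are choices made for the example.

```asm
org 100h
        jmp start
text    db 'alpha beta'
start:
        cld                 ; DF = 0: DI is incremented after each comparison
        lea DI, text        ; ES:DI -> first character
        mov CX, 10          ; number of characters to scan
        mov AL, ' '         ; character to look for
        repne scasb         ; repeat while CX is not 0 and ZF = 0
        ; the space is the 6th character, so the scan stops after 6 comparisons
        ; with ZF = 1, CX = 4 and DI pointing just past the match
        int 20h
```

If the character were absent, the scan would end with CX = 0 and ZF = 0.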
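Table 3. x86 string/array instructions
Instruction            Usage            Description
MOVS Move String       MOVSB / MOVSW    copies the byte/word at DS:SI to ES:DI, then updates SI and DI
LODS Load String       LODSB / LODSW    copies the byte/word at DS:SI into AL/AX, then updates SI
STOS Store String      STOSB / STOSW    copies AL/AX to ES:DI, then updates DI
SCAS Scan String       SCASB / SCASW    compares AL/AX with the byte/word at ES:DI (sets the flags), then updates DI
CMPS Compare String    CMPSB / CMPSW    compares the byte/word at DS:SI with the one at ES:DI (sets the flags), then updates SI and DI
STD Set DF             STD              sets the direction flag (DF = 1)
CLD Clear DF           CLD              clears the direction flag (DF = 0)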
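A minimal sketch of an unconditional repeat, in emu8086 syntax, copies a block of bytes with rep movsb. The labels src and dst are illustrative, and the code assumes a .com program, where DS = ES.

```asm
org 100h
        jmp start
src     db 'hello'
dst     db 5 dup(?)
start:
        cld                 ; DF = 0: SI and DI are incremented
        lea SI, src         ; DS:SI -> source array
        lea DI, dst         ; ES:DI -> destination array
        mov CX, 5           ; number of bytes to copy
        rep movsb           ; copy one byte per repetition until CX = 0
        int 20h             ; dst now holds 'hello'
```

With std instead of cld, SI and DI would have to start at the last byte of each array, because the copy would then run from right to left.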
3.5 Exercises
3.5.1 Exercise 1
Objective. Understand the effect of executing the LEA, MOV, CMP, JE, INC, JMP and LOOP
instructions and the org and db assembly directives.
Requirement. Write a program that replaces all occurrences of a character (called charX),
within a character array (called inputString), with another character (called charY). Count the
number of replacements made using the register DX.
Solution.
1. Start the emulator.
2. Use the Source Code window to write the following program:
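org 100h
init:
        lea BX, inputString
        mov SI, 0h
        mov CX, charX-inputString
        mov DX, 0h
        mov AH, charY
compare:
        mov AL, [BX+SI]
        cmp AL, [charX]
        je replace
nextElem:
        inc SI
        loop compare
        int 20h
replace:
        mov [BX+SI], AH
        inc DX
        jmp nextElem
inputString     db 'alpha beta'
charX           db 'a'
charY           db 'A'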
3.4. The block of instructions labeled init has the role of initializing the registers. The
instruction lea BX, inputString loads the effective address of the inputString in BX. The
instruction mov SI, 0h initializes the register SI (which will be used to iterate through
the array) with the value 0h. The instruction mov CX, charX-inputString computes the
difference between the address of charX and the address of the first element in
inputString (this difference represents the number of elements in the character array)
and stores the result in the counter register CX. The instruction mov DX, 0h initializes
the register DX (which will be used to count the number of replacements) with the
value 0h. The last instruction in this block, mov AH, charY, copies the ASCII code of the
charY character from the memory into the register AH.
3.5. The block of instructions labeled compare is used to iterate through the character array
and compare each element with the character charX (making the replacements when
needed). The instruction mov AL, [BX+SI] copies the value stored in the memory at the
effective address BX+SI in the register AL (given the above initializations, this value will
be the first character in inputString). The instruction cmp AL, [charX] compares (through
subtraction) the value in register AL with the value stored in the memory at the address
charX. This instruction does not modify the value stored in AL. Its purpose is to modify
the values of the flags (OF, SF, ZF, AF, PF, CF). The next instruction (je replace) takes a
jump to the label replace if the two numbers compared by the previous instruction were
equal. The instruction je (jump if equal) has access to the result of the previous
instruction through the zero flag (ZF). If ZF is 1 it means the subtraction ended with a
null result and the jump is taken. If ZF is 0 then the jump is not taken and the processor
continues with the following instruction (inc SI). The instruction inc SI increments the
value stored in the register SI. Consequently, the next element of the array will be
processed at the next iteration. The last instruction in this block is loop compare. This
instruction decrements the implicit counter (CX) and, if the resulting value is not 0, it
jumps back to the compare label (to process the next element in inputString). If the value
of CX is 0 then it means that all the elements of the array were processed and the
processor does not jump back to compare: it continues with the following instruction
(int 20h). This last instruction ends the current program.
3.6. The block of instructions labeled replace is executed only when the current element in
the inputString is equal to charX. The first instruction in this block (mov [BX+SI], AH)
overwrites the value stored in the memory at the address BX+SI (the current element)
with the value stored in the register AH (the character charY). The second instruction in
this block (inc DX) increments the number of replacements made and the last
instruction jumps unconditionally back in the compare loop at the label nextElem.
4. Save the program (File menu -> Save As submenu).
5. Compile the program and view the symbol list
5.1. Click the Compile button to compile the program.
5.2. You will be prompted to save the executable file. Save it with the recommended name.
5.3. View the compilation status in the Assembler Status dialog. If the program was edited
correctly the message should be <program name> is assembled successfully into
bytes.
5.4. Click the View Button and then Symbol Table to view the symbol table associated to this
program. The information presented in this list should be interpreted as follows:
the symbols charX, charY and inputString are byte variables (size = 1) stored in
the memory at the addresses 012Ch, 012Dh, 0122h. Note that even though
inputString defines an array, the symbol inputString represents only the start
address of this array. These symbols can be associated with C pointers.
the symbols compare, init, nextElem and replace are labels of some instructions
in the program and are associated with the addresses of these instructions
(0110h, 0100h, 0118h, and 011Dh).
5.5. The list of symbols will help you find the data you are working with (in the memory).
6. Load the executable program in the emulator.
6.1. Click the Run button to load the program in the emulator and execute it.
7. Execute the program step-by-step, watch the status change of the registers, memory
locations, flags, etc. and write down observations.
7.1. Click the Reload button to reload the executed program.
7.2. Inspect the Emulator Window and note that:
7.2.1. The current instruction (mov BX, 0122h) is highlighted. This is the first instruction
in the program and was loaded at the logical address 0700:0100 (segment address
: effective address). The effective address was imposed by the org 100h assembly
directive. This instruction is equivalent to the instruction you wrote in the program
(lea BX, inputString), because the symbol inputString was replaced with the address
it is associated with.
7.2.2. The value in the register IP (the register that stores the effective address of the
current instruction) is 0100h.
7.3. Click on View Menu -> Memory to view the current status of the memory. In the Address
Text Box write the address of the inputString variable: leave the segment address
unchanged (0700) and replace the effective address (0100) with the effective address of
inputString (0122). Click on the Update Button and note that:
7.3.1. The start address is now 0700:0122 and the values stored in the memory are 61h,
6Ch, 70h, .... In the right part of the Memory Window these numbers are interpreted
as ASCII codes and represented as characters ('alpha beta'). Recognize that
these are the characters in inputString. Right after them you should see the two
characters representing the variables charX ('a') and charY ('A').
7.4. Click the Single Step button to execute the first instruction (mov BX, 0122h) and note
that register BX was loaded with the start address of inputString (0122h).
7.5. Execute the next three instructions and note that the registers SI, CX and DX were loaded
with the values 0000h, 000Ah and 0000h.
7.6. Inspect the Memory Window and note that the memory location with the address 012Dh
(the address of charY) stores the value 41h (the ASCII code of A). Execute the
instruction mov AH, [012D] and note that register AH (the upper half of AX) was loaded
with the value stored in the memory at address 012Dh (41h).
7.7. Compute BX+SI (at the first iteration the result is 0122h + 0h = 0122h). Inspect the
Memory Window and note that the memory location with the address 0122h (the
address of the first element in inputString) stores the value 61h (the ASCII code of a).
Execute the instruction mov AL, [BX+SI] and note that register AL (the lower half of AX)
was loaded with the value stored in the memory at address 0122h (61h).
7.8. Note that the memory location with the address 012Ch (the address of charX) stores the
value 61h (the ASCII code of a). Execute the instruction cmp AL, [012C], click on the
Flags button to view the status of the flags and note that the zero flag (ZF) was set (to 1).
7.9. The next instruction is jz 011Dh. This instruction is equivalent to the instruction you
wrote in the program (je replace), because the symbol replace was replaced with the
address it is associated with. The zero flag is now one, which means the execution of the
instruction should result in a jump to the label replace. Execute the current instruction
and note that the jump was made to the first instruction in the replace block of code.
Also note that the value in the register IP (the register that stores the effective address
of the current instruction) is 011Dh.
7.10. The first element in inputString was equal to 'a' (the same as the variable charX).
Consequently, a jump was taken to the replace block of instructions. Compute BX + SI (at
the first iteration the result is 0122h + 0h = 0122h). Execute the current instruction (mov
[BX+SI], AH) and note that the value in register AH (41h) was copied in the memory
location with the address 0122h. Note the modification in the character representation of
the Memory Window also (inputString should now be 'Alpha beta').
7.11. Execute the next instruction (inc DX) and note that the value in DX was
incremented.
7.12. Execute the next instruction (jmp 0118h) and note that a jump is made to the
instruction with the effective address 0118h (the next instruction which will be
executed will be the one with address 0118h). Remember that this address is in fact
associated with the label nextElem (the instruction you wrote in the source code was
jmp nextElem).
7.13. After the first element in the string was replaced, the processor jumped back in
the compare loop at the nextElem label. Execute the current instruction (inc SI) and note
that the value in the register SI was incremented.
7.14. The current instruction is loop 0110h. Check the symbol list and remember that
the compare label was associated with the 0110h address. Note that the register CX
stores the value 000Ah. Execute the instruction and note that:
7.14.1. The value in CX was decremented.
7.14.2. The processor took a jump to the instruction with the address 0110h (the first
instruction in the compare loop).
7.15. Compute BX+SI (at the second iteration the result is 0122h + 1h = 0123h). Inspect
the Memory Window and note that the memory location with the address 0123h (the
address of the second element in inputString) stores the value 6Ch (the ASCII code of
'l'). Execute the instruction mov AL, [BX+SI] and note that register AL was loaded with
the value stored in the memory at address 0123h (6Ch).
7.16. Note that the memory location with the address 012Ch (the address of charX)
stores the value 61h (the ASCII code of 'a'). Execute the instruction cmp AL, [012C], click
on the Flags button to view the status of the flags and note that the zero flag (ZF) was
reset (to 0). This is because 6Ch - 61h is not zero. Consequently, the next instruction (jz
011Dh) will not result in a jump to replace (the instruction with the address 011Dh).
The processor will continue the execution sequentially (with the following instruction:
inc SI).
7.17. Execute the current instruction (inc SI) and note that the value in the register SI
was incremented.
7.18. The current instruction is loop 0110h. Note that we reached the end of the
compare loop again. This time the current element (the second one) in the string was
not equal to charX, therefore it was not replaced with charY. Execute the instruction
(loop 0110h) and note that:
7.18.1. The value in CX was decremented.
7.18.2. The processor took a jump to the instruction with the address 0110h (the first
instruction in the compare loop).
7.19. Continue to execute the instructions step by step, noting the modifications in the
inputString, in DX (which counts the replacements) and CX (which counts the number of
unprocessed elements). Stop when CX reaches 1. Execute the instruction (loop 0110h)
and note that:
7.19.1. The value in CX was decremented and now it stores the value 0h.
7.19.2. The processor does not jump back to the instruction with the address 0110h (the
first instruction in the compare loop), but continues with the following instruction
(int 20h).
7.20. The current instruction is int 20h. Click the Single Step button twice and note
that a Message dialog, saying that the program has returned control to the operating
system, is displayed. Click Ok.
8. Write down conclusions regarding the effect of the various instructions on the registers,
flags and memory locations.
3.5.2 Exercise 2
Objective. Understand the effect of executing the string instructions (cld, std, movs, lods, and
stos), some conditional jump instructions (je, jne) and the repeatability prefix rep.
Requirement. Write a program that replaces all occurrences of a character (called charX),
within an array of characters (called inputString), with another character (called charY), by
copying the array of characters in an auxiliary memory zone (where the array will be called
outputString).
Solution.
1. Start the emulator.
2. Load the program called lab3_prog2.asm using the Open button. The Source Code window
should display the following program:
org 100h
init:
        ; ...
compare:
        lodsb
        cmp AL, [charX]
        jne copyCurrent
        je copyCharY
nextElem:
        loop compare
copyBack:
        dec SI
        dec DI
        xchg SI, DI
        mov CX, charX-inputString
        std
        repnz movsb
end:
        int 20h
copyCurrent:
        stosb
        jmp nextElem
copyCharY:
        ; ...
inputString     db 'alpha beta'
charX           db 'a'
charY           db 'A'
outputString    db 20 dup('X')
3.5.3 Exercise 3
Objective. Understand the effect of executing the push, pop, lods, stos instructions, the equ and
dw assembly directives and the usage of the stack. Understand the little endian data storing
convention.
Requirement. Write a program that inserts in the stack all the values larger than 8000h from an
array of unsigned 16-bit numbers (called numbers), then extracts the values from the stack
copying them at the beginning of the array.
Solution.
1. Start the emulator.
2. Load the program called lab3_prog3.asm using the Open button. The Source Code window
should display the following program:
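org 100h
        jmp init
value   equ 8000h
numbers dw 8933h, 1240h, 0328h, 99A0h, 0F422h, 0101h
init:
        lea SI, numbers             ; SI = effective address of the numbers array
        mov CX, (init-numbers)/2    ; CX = number of elements in the array
        mov DX, 0h                  ; DX counts the values inserted in the stack
        cld                         ; DF = 0: iterate from left to right
compare:
        lodsw
        cmp AX, value
        ja insert
nextElem:
        loop compare
        lea DI, numbers             ; DI = effective address of the first element
        mov CX, DX                  ; CX = number of values to pop
extract:
        pop AX
        stosw
        loop extract
end:
        int 20h
insert:
        push AX
        inc DX
        jmp nextElem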
of the numbers array in SI. The instruction mov CX, (init-numbers)/2 computes the
difference between the address of the init label and the address of the first element in
numbers array (this difference represents the number of memory locations allocated for
the array), divides it by two (to obtain the number of elements in the array) and stores
the result in the counter register CX. The instruction mov DX, 0h initializes the register
DX (which will be used to count the number of values inserted in the stack) with the value
0h. The last instruction in this block (cld) resets the value of the direction flag (DF),
meaning the string instructions will iterate through the arrays from left to right.
3.4. The block of instructions labeled compare iterates through the array of numbers and
stores in the stack all the values larger than 8000h. The first instruction in this block
(lodsw) copies the current 16-bit number from the source string (the value in the
memory at the effective address stored in SI) into the accumulator (AX). Next, this value
is compared with 8000h. After the comparison, a conditional jump (ja insert) is used to
jump to label insert if the current element in the array is larger than 8000h. The loop
instruction decrements the counter CX and jumps back to label compare if CX is nonzero.
3.5. The block of instructions labeled insert pushes the value in AX into the stack and
increments DX (which counts how many numbers were stored in the stack). In the end
it jumps back inside the compare loop.
3.6. After the compare loop, DI is initialized with the effective address of the first element in
the numbers array and CX is reinitialized with the number of values that need to be
popped out of the stack. The extract loop pops the values from the stack, stores them in
the accumulator and from the accumulator into the destination array (which is the
same numbers array).
3.7. Finally, the instruction int 20h ends the current program.
4. Compile the program and view the symbol list.
5. Load the executable program in the emulator.
6. Execute the program step-by-step, watch the status change of the registers, memory
locations, flags, etc. and write down observations. In particular, while executing this
program you should watch the changes in the stack and you should note how 16-bit
numbers are stored in the memory using two memory locations (the least significant byte at
the lower address and the most significant byte at the higher address).
7. Write down conclusions regarding the effect of the various instructions on the registers,
flags and memory locations.
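The little endian convention mentioned in step 6 can also be observed directly by reading a word one byte at a time. The sketch below uses emu8086 syntax; the variable name num is an assumption, and its value is taken from the numbers array.

```asm
org 100h
        jmp start
num     dw 8933h
start:
        lea BX, num
        mov AL, [BX]        ; AL = 33h: least significant byte, at the lower address
        mov AH, [BX+1]      ; AH = 89h: most significant byte, at the higher address
        int 20h             ; AX = 8933h, rebuilt from its two bytes
```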
3.5.4 Exercise 4
Objective. Emphasize the importance of the interpretation (signed or unsigned) of the numbers
processed in an assembly program.
Requirement. Refer to the previous exercise and modify the source code so that the values in
the numbers array are regarded as signed numbers.
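As background for this requirement, the same bit pattern gives different comparison results depending on whether an unsigned jump (ja) or a signed jump (jg) follows the cmp instruction. The sketch below, in emu8086 syntax, is only an illustration: the labels and the use of BL and BH as markers are assumptions made for the example.

```asm
org 100h
        mov BX, 0
        mov AX, 8933h
        cmp AX, 0101h
        ja  unsignedAbove   ; taken: as unsigned values 8933h = 35123 > 0101h = 257
        jmp signedTest
unsignedAbove:
        inc BL              ; BL = 1 marks "above" in the unsigned sense
signedTest:
        cmp AX, 0101h
        jg  signedGreater   ; not taken: as a signed value 8933h = -30413 < 257
        jmp done
signedGreater:
        inc BH              ; not reached for these values
done:
        int 20h             ; BL = 1, BH = 0
```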
Solution.
1. Redo the previous exercise and note that after the execution of the program the values
8933h, 99A0h and 0F422h were selected as being larger than 8000h, and were copied at
3.5.5 Exercise 5
Requirement. Write a program that copies all the values lower than 1000h from an array of
signed 16-bit numbers (called inputArray) in another array (called outputArray). In this exercise
you are required to use the string instructions lods and stos.