Chapter 2
S. Dandamudi
Outline
Basic components
The processor
  Execution cycle
  System clock
Number of addresses
  3-address machines
  2-address machines
  1-address machines
  0-address machines
  Load/store architecture
Flow control
  Branching
  Procedure calls
Memory
  Basic operations
  Types of memory
  Storing multibyte data
Input/Output
Performance: Data alignment
Basic Components (cont’d)
The Processor
Instruction Execute Cycle
[Figure: instruction execute cycle, an infinite cycle of fetch (obtain instruction from program storage), decode, and execute (ALU, flags, output write)]
The Processor (cont’d)
System clock
Provides timing signal
Clock period = 1 / Clock frequency
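As a small worked example of the period/frequency relation, the sketch below converts a clock frequency to a period. The function name `clock_period_ns` and the choice of nanoseconds are assumptions made for illustration only.

```python
def clock_period_ns(frequency_hz: float) -> float:
    """Return the clock period in nanoseconds for a frequency in hertz."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    # period = 1 / frequency; multiply by 1e9 to express seconds as nanoseconds
    return 1e9 / frequency_hz


# A 1 GHz clock has a 1 ns period; doubling the frequency halves the period.
period_1ghz = clock_period_ns(1e9)
period_2ghz = clock_period_ns(2e9)
```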
Number of Addresses
Four categories
3-address machines
Two for the source operands and one for the result
2-address machines
One address doubles as source and result
1-address machines
Accumulator machines
Accumulator is used for one source and result
0-address machines
Stack machines
Operands are taken from the stack
Result goes onto the stack
Number of Addresses (cont’d)
Three-address machines
Two for the source operands, one for the result
RISC processors use three addresses
Sample instructions
add dest,src1,src2
; M(dest)=[src1]+[src2]
sub dest,src1,src2
; M(dest)=[src1]-[src2]
mult dest,src1,src2
; M(dest)=[src1]*[src2]
Example
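C statement
A = B + C * D - E + F + A
Equivalent code:
mult T,C,D ;T = C*D
add T,T,B ;T = B+C*D
sub T,T,E ;T = B+C*D-E
add T,T,F ;T = B+C*D-E+F
add A,T,A ;A = B+C*D-E+F+A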
Two-address machines
One address doubles as both a source operand and the result
Last example makes a case for it
Address T is used twice
Sample instructions
load dest,src ; M(dest)=[src]
add dest,src ; M(dest)=[dest]+[src]
sub dest,src ; M(dest)=[dest]-[src]
mult dest,src ; M(dest)=[dest]*[src]
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load T,C ;T = C
mult T,D ;T = C*D
add T,B ;T = B+C*D
sub T,E ;T = B+C*D-E
add T,F ;T = B+C*D-E+F
add A,T ;A = B+C*D-E+F+A
Number of Addresses (cont’d)
One-address machines
Use a special set of registers called accumulators
The accumulator supplies one source operand and receives the result
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load C ;load C into accum
mult D ;accum = C*D
add B ;accum = C*D+B
sub E ;accum = B+C*D-E
add F ;accum = B+C*D-E+F
add A ;accum = B+C*D-E+F+A
store A ;store accum contents in A
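To check that the one-address sequence above really computes the C statement, here is a minimal sketch of an accumulator machine. The simulator (`run_accumulator`), the dictionary standing in for memory, and the sample values are all assumptions for illustration, not part of the slides.

```python
def run_accumulator(program, memory):
    """Execute (opcode, address) pairs on a machine with one accumulator."""
    acc = 0
    for op, addr in program:
        if op == "load":        # accum = M(addr)
            acc = memory[addr]
        elif op == "store":     # M(addr) = accum
            memory[addr] = acc
        elif op == "add":       # accum = accum + M(addr)
            acc += memory[addr]
        elif op == "sub":       # accum = accum - M(addr)
            acc -= memory[addr]
        elif op == "mult":      # accum = accum * M(addr)
            acc *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# Hypothetical sample values for the variables in the C statement.
acc_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
acc_expected = (acc_memory["B"] + acc_memory["C"] * acc_memory["D"]
                - acc_memory["E"] + acc_memory["F"] + acc_memory["A"])

acc_program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
               ("add", "F"), ("add", "A"), ("store", "A")]
run_accumulator(acc_program, acc_memory)
```

Each instruction names only one memory operand; the accumulator is the implied second source and the implied destination.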
Zero-address machines
Stack supplies operands and receives the result
Special instructions to load and store use an address
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
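The zero-address sequence above can be traced with a small stack-machine sketch. The simulator (`run_stack`) and the sample values are hypothetical. It also assumes that `sub` computes the top of the stack minus the element below it, which is the ordering this listing needs in order to produce B+C*D-E.

```python
def run_stack(program, memory):
    """Execute zero-address instructions; only push/pop name an address."""
    stack = []
    for instr in program:
        op, *args = instr.split()
        if op == "push":                    # push M(addr) onto the stack
            stack.append(memory[args[0]])
        elif op == "pop":                   # pop the top of stack into M(addr)
            memory[args[0]] = stack.pop()
        else:
            top = stack.pop()
            below = stack.pop()
            if op == "add":
                stack.append(top + below)
            elif op == "sub":               # assumption: top minus next-to-top
                stack.append(top - below)
            elif op == "mult":
                stack.append(top * below)
            else:
                raise ValueError(f"unknown opcode: {op}")
    return memory


# Hypothetical sample values for the variables in the C statement.
stack_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
stack_expected = (stack_memory["B"] + stack_memory["C"] * stack_memory["D"]
                  - stack_memory["E"] + stack_memory["F"] + stack_memory["A"])

stack_program = ["push E", "push C", "push D", "mult", "push B", "add",
                 "sub", "push F", "add", "push A", "add", "pop A"]
run_stack(stack_program, stack_memory)
```

Pushing E first is what places it beneath B+C*D by the time `sub` executes.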
Load/Store Architecture
Instructions expect operands in internal processor registers
Special LOAD and STORE instructions move data between registers and memory
RISC and vector processors use this architecture
Reduces instruction length
Number of Addresses (cont’d)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
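One possible load/store sequence for this statement is sketched below as a small Python simulation. The register names R1 to R6, the operand order (`load Rd,addr`, `store addr,Rs`, `op Rd,Rs1,Rs2`), the simulator `run_load_store`, and the sample values are all assumptions chosen for illustration. The only property taken from the slides is that arithmetic works on registers while LOAD and STORE are the only instructions that touch memory.

```python
import operator

# Arithmetic instructions operate on registers only.
ALU = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_load_store(program, memory):
    """Execute a load/store program given as (opcode, operand, ...) tuples."""
    regs = {}
    for op, *operands in program:
        if op == "load":            # load Rd,addr : Rd = M(addr)
            rd, addr = operands
            regs[rd] = memory[addr]
        elif op == "store":         # store addr,Rs : M(addr) = Rs
            addr, rs = operands
            memory[addr] = regs[rs]
        elif op in ALU:             # op Rd,Rs1,Rs2 : Rd = Rs1 op Rs2
            rd, rs1, rs2 = operands
            regs[rd] = ALU[op](regs[rs1], regs[rs2])
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# Hypothetical sample values for the variables in the C statement.
ls_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
ls_expected = (ls_memory["B"] + ls_memory["C"] * ls_memory["D"]
               - ls_memory["E"] + ls_memory["F"] + ls_memory["A"])

ls_program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"),   # R2 = C*D
    ("add", "R2", "R2", "R1"),    # R2 = B+C*D
    ("sub", "R2", "R2", "R4"),    # R2 = B+C*D-E
    ("add", "R2", "R2", "R5"),    # R2 = B+C*D-E+F
    ("add", "R2", "R2", "R6"),    # R2 = B+C*D-E+F+A
    ("store", "A", "R2"),
]
run_load_store(ls_program, ls_memory)
```

The sequence has more instructions than the memory-operand versions, but each arithmetic instruction names only short register operands.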
Procedure calls
Parameter passing
Register-based
Stack-based
Flow of Control (cont’d)
Branches
Unconditional
branch target
Absolute address
PC-relative
Target address is specified relative to PC contents
Example: MIPS
Absolute address
j target
PC-relative
b target
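The difference between the two ways of specifying a target can be sketched as follows. This is a simplified model: real instruction sets differ in which PC value serves as the base and in whether the encoded offset is scaled, and those details are deliberately left out. The function names and the addresses are hypothetical.

```python
def absolute_target(address: int) -> int:
    """Absolute addressing: the instruction carries the target itself."""
    return address


def pc_relative_target(pc: int, offset: int) -> int:
    """PC-relative addressing: the target is the PC contents plus an offset."""
    return pc + offset


# A backward branch of 16 bytes, e.g. to the top of a loop.
target_at_1000 = pc_relative_target(0x1000, -16)

# If the same code is moved to another address, the same offset still
# reaches the top of the loop, while an absolute target would not move.
target_at_2000 = pc_relative_target(0x2000, -16)
fixed_target = absolute_target(0x0FF0)
```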
Flow of Control (cont’d)
Branches
Conditional
Jump is taken only if the condition is met
Two types
Set-Then-Jump
Condition testing is separated from branching
Condition code registers are used to convey the condition test result
cmp AX,BX
je target
Flow of Control (cont’d)
Test-and-Jump
Single instruction performs condition testing and branching
beq Rsrc1,Rsrc2,target
Jumps to target if Rsrc1 = Rsrc2
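The two styles of conditional branch can be contrasted with a minimal sketch. The `Flags` class, the single zero flag, and the convention that the next sequential instruction is at `pc + 1` are assumptions made to keep the model small; they do not describe any particular processor.

```python
class Flags:
    """Stand-in for a condition code register with only a zero flag."""

    def __init__(self):
        self.zero = False


def cmp(flags, a, b):
    """Set-then-jump, step 1: record the result of comparing a and b."""
    flags.zero = (a - b) == 0


def je(flags, pc, target):
    """Set-then-jump, step 2: branch by looking only at the flags."""
    return target if flags.zero else pc + 1


def beq(rsrc1, rsrc2, pc, target):
    """Test-and-jump: one instruction both compares and branches."""
    return target if rsrc1 == rsrc2 else pc + 1


flags = Flags()

cmp(flags, 7, 7)                 # cmp AX,BX with equal operands
taken = je(flags, 10, 40)        # je target: branch is taken

cmp(flags, 7, 8)                 # operands differ
not_taken = je(flags, 10, 40)    # falls through to the next instruction

taken_beq = beq(7, 7, 10, 40)    # beq Rsrc1,Rsrc2,target in one step
```

In the set-then-jump style the jump instruction never sees the operands; the flags carry the test result between the two instructions.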
Procedure calls
Requires two pieces of information to return
End of procedure
Pentium uses ret instruction
MIPS uses jr instruction
Return address
In a (special) register
MIPS allows any general-purpose register
On the stack
Pentium
Flow of Control (cont’d)
Parameter passing
Register-based
Internal registers are used
Faster
Limits the number of parameters
Due to limited number of available registers
Stack-based
Stack is used
Slower
Requires memory access
General-purpose
Not limited by the number of registers
Memory (cont’d)
[Figure: memory timing diagram with CLK, ADDR (address), RD, and DATA (data) signal lines]
Memory (cont’d)
Non-volatile memory
Retains contents even in the absence of power
Basic types of memory
Read-only memory (ROM)
Read/write memory (RAM)
Memory (cont’d)
Read/write memory
Commonly referred to as random access memory (RAM)
Volatile memories
DRAM types
FPM DRAMs
FPM = Fast Page Mode
EDO DRAMs
EDO = Extended Data Out
Uses pipelining to speed up access
SDRAMs
Use an external clock to synchronize data output
Also called SDR SDRAMs (Single Data Rate)
DDR SDRAMs
DDR = Double Data Rate
Provides data on both falling and rising edges of the clock
RDRAMs
Rambus DRAM
Storing Multibyte Data
Storing Multibyte Data (cont’d)
Little endian
Used by Intel IA-32 processors
Big endian
Used by most processors by default
MIPS supports both byte orderings
Big endian is the default
Not a problem when working with the same type of machine
Need to convert the format if working with a different machine
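The two byte orderings, and the conversion needed when data moves between machines that disagree, can be illustrated with Python's built-in `int.to_bytes` and `int.from_bytes`. The 32-bit sample value is arbitrary. Little endian places the least significant byte at the lowest address; big endian places the most significant byte there.

```python
value = 0x12345678

# Byte sequences in order of increasing memory address.
little = value.to_bytes(4, byteorder="little")   # 78 56 34 12
big = value.to_bytes(4, byteorder="big")         # 12 34 56 78

# Reading the bytes with the ordering that wrote them recovers the value.
same_machine = int.from_bytes(little, byteorder="little")

# Reading little-endian bytes as big endian gives a different number,
# which is why the format must be converted between unlike machines.
misread = int.from_bytes(little, byteorder="big")

# Conversion amounts to reversing the byte order.
converted = int.from_bytes(little[::-1], byteorder="big")
```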
Input/Output (cont’d)
Performance: Data Alignment (cont’d)
[Figure: sort time (seconds) versus array size (5000 to 25000) for unaligned and aligned data]
To be used with S. Dandamudi, “Introduction to Assembly Language Programming,” Second Edition, Springer, 2005.
Performance: Data Alignment (cont’d)
Data alignment
Soft alignment
Data is not required to be aligned
Data alignment is optional
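Whether a data item is aligned can be checked with simple address arithmetic. The sketch below assumes the common convention that an item of `size` bytes is aligned when its address is a multiple of `size`, with `size` a power of two. The function names and sample addresses are hypothetical.

```python
def is_aligned(address: int, size: int) -> bool:
    """True if address is a multiple of size (assumed alignment rule)."""
    return address % size == 0


def align_up(address: int, size: int) -> int:
    """Round address up to the next multiple of size (size a power of two)."""
    return (address + size - 1) & ~(size - 1)


aligned_word = is_aligned(0x1000, 4)      # 4-byte item on a 4-byte boundary
unaligned_word = is_aligned(0x1001, 4)    # same item shifted by one byte
next_boundary = align_up(0x1001, 4)       # where an aligned copy would start
```

Under soft alignment both placements work; the aligned one avoids the extra cost of accessing unaligned data.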