Académique Documents
Professionnel Documents
Culture Documents
Intermediate Concepts
Slides by: Muhamed Mudawar
CS 282 KAUST
Spring 2010
Outline:
MIPS
5 stage pipelining
Structural Hazards
Data Hazards & Forwarding
Branch Hazards
Handling Exceptions
Handling Multicycle Operations
Slide 2
Slide 3
Instruction Formats
Slide 4
registers LO and HI
program counter PC
32 Floating Point Registers (FPRs)
64-bit Floating Point registers in MIPS 64
FIR: Floating-point Implementation Register
FCSR: Floating-point Control & Status Register
GPRs
R0 R31
LO
HI
PC
FPRs
F0 F31
FIR
FCSR
Slide 5
Slide 6
Arithmetic / Logical
Instructions
Slide 7
Slide 8
Slide 9
Slide 10
Slide 11
Slide 12
Next:
stage pipelining
Structural Hazards
Data Hazards & Forwarding
Branch Hazards
Handling Exceptions
Handling Multicycle Operations
Slide 13
5 Steps of Instruction
Execution
Slide 14
Reg. Fetch
Next SEQ PC
Adder
Next SEQ PC
Zero?
RS1
MUX
MEM/WB
Data
Memory
EX/MEM
ALU
MUX MUX
ID/EX
Imm
Reg File
IF/ID
Memory
Address
RS2
Back
MUX
Next PC
Access
WB Data
Fetch
Sign
Extend
RW = Rd or Rt
RW
RW
Slide 15
Stag
e
Any Instruction
IF
ID
EX
MEM
WB
ALU Instruction
Load / Store
Branch
EX/MEM.ALUoutput
ID/EX.A func ID/EX.B,
or
EX/MEM.ALUoutput
ID/EX.A op ID/EX.Imm
EX/MEM.ALUoutput
ID/EX.A + ID/EX.Imm
EX/MEM.ALUoutput
ID/EX.NPC + (ID/EX.Imm
<< 2)
EX/MEM.B ID/EX.B
MEM/WB.LMD
MEM[EX/MEM.ALUoutput
]
or
MEM[EX/MEM.ALUoutput
] EX/MEM.B
Pipelining:
Basic and Intermediate
Concepts
Regs[MEM/WB.Rw]
For load
only:
EX/MEM.cond br
condition
MEM/WB.ALUoutput
EX/MEM.ALUoutput
Slide 16
Pipelined Control
Control
Slide 17
Visualizing Pipelining
Pipeline registers
carry data for a
given instruction
from one stage to
the other
One instruction completes
each cycle
Overlapped Execution of
Instructions
Pipelining: Basic and Intermediate Concepts
Slide 18
Pipeline Performance
Assume
datapath
Instr
Instr
fetch
Register
read
ALU op
Memory
access
Register
write
Total
time
load
200ps
100 ps
200ps
200ps
100 ps
800ps
store
200ps
100 ps
200ps
200ps
R-format 200ps
100 ps
200ps
branch
100 ps
200ps
200ps
700ps
100 ps
600ps
500ps
Slide 19
Pipeline Performance
Single-cycle (Tc= 800ps)
Speedup =
800 / 200 = 4
20
Pipeline Speedup
If
Slide 21
Load/store addressing
Calculate address in 3rd stage, access memory
in 4th
Slide 22
Slide 23
Next:
MIPS An ISA for Pipelining
5 stage pipelining
Structural
Hazards
Slide 24
Structure Hazards
Conflict
Hence,
Slide 25
MEM
Reg
MEM
Instr 4
Pipelining: Basic and Intermediate Concepts
DMem
ALU
Reg
Reg
Reg
Ifetch
Reg
DMem
Reg
DMem
Reg
ALU
Instr 3
Mem
MEM
ALU
O
r
d
e
r
Instr 2
Reg
ALU
I Load MEM
n
s
t Instr 1
r.
ALU
Slide 26
Stall
Reg
Mem
Reg
Reg
Mem
ALU
Mem
Mem
Reg
Mem
Reg
Instr 3
Mem
Reg
ALU
O
r
d
e
r
Instr 2
Reg
ALU
I Load Mem
n
s
t Instr 1
r.
ALU
Slide 27
Resolving Structural
Hazards
Serious
Hazard:
Hazard cannot be ignored
Solution
Slide 28
CPIunpipelined
CPIpipelined
Cycle Timeunpipelined
Cycle Timepipelined
Pipeline Depth
Speedup
1 Pipeline stall cycles per instruction
Pipelining: Basic and Intermediate Concepts
Slide 29
SpeedupA/B
structural hazards
1.33
CPIA Clock rateB
1
1.05
Slide 30
Instructions
r6, 8(r5)
IF
r2, 10(r3)
CC1
Structural Hazard
Two instructions are
attempting to write
the register file
during same cycle
ID
EX MEM WB
IF
ID
EX
WB
IF
ID
EX
WB
IF
ID
EX MEM
Time
Slide 31
r6, 8(r5)
IF
r2, 10(r3)
CC1
ALU instructions
skip the MEM stage
ID
EX MEM WB
IF
ID
EX
WB
IF
ID
EX
IF
ID
WB
EX MEM
Slide 32
Next:
Data
Branch Hazards
Handling Exceptions
Handling Multicycle Operations
Slide 33
Data Hazards
Data
Read
Slide 34
Time (cycles)
value of r2
CC1
CC2
CC3
CC4
CC5
CC6
CC7
CC8
10
10
10
10
10/20
20
20
20
IM
Reg
ALU
DM
Reg
IM
Reg
ALU
DM
Reg
IM
Reg
ALU
DM
Reg
IM
Reg
ALU
DM
Reg
IM
Reg
ALU
DM
sw r8, 10(r2)
Result
Instructions
During
Slide 35
Instruction Order
CC1
CC2
CC3
CC4
CC5
CC6
CC7
CC8
10
10
10
10
10/20
20
20
20
IM
Reg
ALU
DM
Reg
bubble
bubble
Reg
ALU
DM
Reg
IM
Reg
ALU
DM
IM
or r6, r3, r2
The
Slide 36
The
No bubbles are inserted into the pipeline and no cycles are wasted
ALU
CC1
CC2
CC3
CC4
CC5
IM
Reg
ALU
DM
Reg
IM
Reg
ALU
DM
Reg
IM
Reg
ALU
DM
Reg
IM
Reg
ALU
DM
Reg
IM
Reg
ALU
DM
CC6
CC7
CC8
Slide 37
Slide 38
register
Previous instruction is in the EX/MEM register
Second previous is in the MEM/WB register
RAW Data hazards when
1a.
1b.
2a.
2b.
Fwd from
EX/MEM
pipeline reg
EX/MEM.RegisterRd = ID/EX.RegisterRs
EX/MEM.RegisterRd = ID/EX.RegisterRt
MEM/WB.RegisterRd = ID/EX.RegisterRsFwd from
MEM/WB
MEM/WB.RegisterRd = ID/EX.RegisterRtpipeline reg
Slide 39
And
Slide 40
Forwarding Conditions
Detecting
Detecting
Slide 41
the sequence:
add r1,r1,r2
sub r1,r1,r3
and r1,r1,r4
Both
hazards occur
Slide 42
Slide 43
Load Delay
Not
Program Order
In
next instruction
or r6, r3, r2
add r7, r2, r2
IF
Reg
ALU
DM
Reg
IF
Reg
ALU
DM
Reg
Slide 44
Program Order
Time (cycles)
lw
r2, 20(r1)
CC1
CC2
CC3
CC4
CC5
IM
Reg
ALU
DM
Reg
bubble
Reg
IM
IM
CC6
CC7
ALU
DM
Reg
Reg
ALU
DM
CC8
Reg
Slide 45
Compiler Scheduling
r10, (r1)
# r1 = addr b
lw
r11, (r2)
# r2 = addr c
add
# stall
sw
r12, (r3)
# r3 = addr a
lw r13, (r4)
# r4 = addr e
lw
r14, (r5)
# r5 = addr f
sub
# stall
sw
r15, (r6)
# r6 = addr d
lw
r10, 0(r1)
lw
r11, 0(r2)
lw
r13, 0(r4)
lw
r14, 0(r5)
r12, 0(r3)
sw r14,
0(r6)
Compiler optimizes for performance. Hardware
checks
for safety.
Pipelining: Basic and Intermediate Concepts
Slide 46
Load/Store Data
Forwarding
Slide 47
Slide 48
Slide 49
Next:
MIPS An ISA for Pipelining
5 stage pipelining
Structural Hazards
Data Hazards & Forwarding
Branch
Hazards
Handling Exceptions
Handling Multicycle
Operations
Slide 50
Control Hazards
Slide 51
Next2
Next3
Label: target instruction
Next1
Reg
DMem
Ifetch
Reg
Ifetch
Reg
Reg
DMem
Reg
DMem
Ifetch
Reg
ALU
Ifetch
DMem
ALU
Reg
ALU
Next1
Ifetch
ALU
beq r1,r3,label
ALU
Reg
Reg
DMem
Reg
MIPS Solution:
Move branch test to ID stage (second stage)
Adder to calculate new PC in ID stage
Branch delay is reduced from 3 to just 1 clock
cycle
Pipelining: Basic and Intermediate Concepts
Slide 53
Access
Adder
Adder
MUX
Next
SEQ PC
Next PC
Zero?
RS1
MUX
MEM/WB
Data
Memory
EX/MEM
ALU
MUX
ID/EX
Imm
Reg File
IF/ID
Memory
Address
RS2
WB Data
Addr. Calc
Reg. Fetch
Write
Back
Sign
Extend
RD
RD
RD
Slide 54
Slide 55
Slide 56
become
s
if r2=0 then
add
r1,r2,r3
B. From branch
target
sub r4,r5,r6
add r1,r2,r3
if r1=0 then
delay slot
become
ssub r4,r5,r6
add r1,r2,r3
if r1=0 then
sub r4,r5,r6
C. From fall
through
add r1,r2,r3
if r1=0 then
delay slot
or r7,r8,r9
sub r4,r5,r6
become
s
add r1,r2,r3
if r1=0 then
or r7,r8,r9
sub r4,r5,r6
A is the best choice, fills delay slot & reduces instruction count (IC)
Slide 57
Effectiveness of Delayed
Branch
Compiler effectiveness for single branch delay
slot
Fills about 60% of branch delay slots
About 80% of instructions executed in branch delay
slots useful in computation
About 50% (60% x 80%) of slots usefully filled
Delayed
Slide 58
Performance of Branch
Schemes
Pipeline depth
1 Pipeline stall cycles from branches
Pipeline speedup
Pipeline depth
1 Branch Frequency Branch Penalty
Pipeline CPIbranch stalls Ideal CPIno stalls Branch Freq Branch Penalty
Pipelining: Basic and Intermediate Concepts
Slide 59
Evaluating Branch
Alternatives
Branch
Penalty
Penalty
Scheme
Uncondition
Untaken
Penalty
Taken
al
Stall always
Predict taken
Predict not
taken
Delayed
branch
Branch
Scheme
Uncondition
al
Branches
Untaken
Branche
s
Taken
Branches
All
Branche
s
4%
6%
10%
20%
Stall always
0.08
0.18
0.30
CPI+0.56
Predict taken
0.08
0.18
0.20
CPI+0.46
Predict not
taken
0.08
0.30
CPI+0.38
0.20
CPI+0.24
Slide 60
Next:
Handling
Exceptions
Handling Multicycle
Operations
Slide 61
Exception
Interrupt
Slide 62
Types of Exceptions
I/O
Slide 63
Handling Exceptions
In
Save
Save
Slide 64
Handler Actions
Read
Determine
If
action required
Terminate program
Report error using EPC, Cause,
Pipelining: Basic and Intermediate Concepts
Slide 65
Alternative Approach
Vectored
Interrupts
Undefined opcode:
C000 0000
Overflow: C000 0020
:
C000 0040
Instructions
either
Slide 66
Exceptions in a Pipeline
Another
to mispredicted branch
Slide 67
ID
EX
Arithmetic exception
WB
PC
None
Inst.
Mem
PC address
Exception
Decode
Illegal
Opcode
Overflow
Data
Mem
Data address
Exceptions
Asynchronous Interrupts
Pipelining: Basic and Intermediate Concepts
Slide 68
Slide 69
Multiple Exceptions
Pipelining
In
complex pipelines
Slide 70
Imprecise Exceptions
Just
Simplifies
Slide 71
Next:
Handling
Multicycle
Operations
Slide 72
Slide 73