Processor Architecture
Pipelining
Pipeline Hazards
[Figure: datapath with PC, instruction memory, registers, ALU, data memory, and two adders (one with constant input 4)]
[Figure: datapath with control: PC, instruction memory, registers, ALU, and data memory, plus multiplexers and the control signals ALU operation, RegWrite, MemWrite, MemRead, and Zero]
Instruction Fetch
Instruction Decode
Execute
Memory Access
Write-Back
Stage      Duration
IF, MEM    200 ps
EX         200 ps
ID, WB     100 ps
Stage durations in ps, one column per instruction class:

Stage    (1)   (2)   (3)   (4)   (5)
IF       200   200   200   200   200
ID       100   100   100   100     -
EX       200   200   200   200     -
MEM        -   200   200     -     -
WB       100   100     -     -     -
Total    600   800   700   500   200
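Each total is the sum of the durations of the stages that one instruction class actually uses. A minimal sketch in Python, assuming the five columns correspond to the usual MIPS classes (R-type, lw, sw, beq, j); the class names and the `STAGES_USED` mapping are assumptions and do not appear on the slide:

```python
# Stage durations in picoseconds, as given on the slide.
STAGE_PS = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}

# Assumed mapping from instruction class to the stages it uses.
STAGES_USED = {
    "R-type": ["IF", "ID", "EX", "WB"],
    "lw": ["IF", "ID", "EX", "MEM", "WB"],
    "sw": ["IF", "ID", "EX", "MEM"],
    "beq": ["IF", "ID", "EX"],
    "j": ["IF"],
}


def latency_ps(instr_class):
    """Sum the durations of the stages used by one instruction class."""
    return sum(STAGE_PS[stage] for stage in STAGES_USED[instr_class])


totals = {name: latency_ps(name) for name in STAGES_USED}
print(totals)
```

Under these assumptions the slowest class (lw, 800 ps) would set the clock period of a single-cycle design.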
[Figure: laundry analogy in two panels, task order A, B, C, D against a time axis from 6 PM to 2 AM]
Total of 2 × 20 = 40 hours?
If it takes 2 hours to wash, dry, fold and store one set of
clothes, how long will it take for 20 sets?
[Figure: laundry analogy, task order A, B, C, D against a time axis from 6 PM to 2 AM]
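The laundry question can be worked through numerically. A rough sketch in Python, assuming the 2 hours split evenly into four 30-minute steps (wash, dry, fold, store) and that each step can work on a different set at the same time; the even split is an assumption, since the slide gives only the 2-hour total:

```python
def sequential_hours(n_sets, n_steps, step_hours):
    """Each set finishes all steps before the next set starts."""
    return n_sets * n_steps * step_hours


def pipelined_hours(n_sets, n_steps, step_hours):
    """The first set needs n_steps steps; every later set finishes one step after the previous one."""
    return (n_steps + n_sets - 1) * step_hours


seq = sequential_hours(20, 4, 0.5)
pipe = pipelined_hours(20, 4, 0.5)
print(seq, pipe)  # 40.0 11.5
```

The time for a single set is still 2 hours; only the throughput improves.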
Pipeline Overheads
[Figure: program execution order for lw $1, 100($0); lw $2, 200($0); lw $3, 300($0), each passing through instruction fetch, Reg, ALU, data access, Reg. Without pipelining a new instruction starts every 800 ps; with pipelining a new instruction starts every 200 ps. Time axis from 200 to 1400 ps.]
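The two timings for the three loads can be reproduced from the stage durations given earlier. A minimal sketch in Python, assuming the pipelined clock period equals the slowest stage and that no hazards occur; the function names are illustrative:

```python
# Durations of IF, ID, EX, MEM, WB in picoseconds.
STAGE_PS = [200, 100, 200, 200, 100]


def unpipelined_total_ps(n_instr):
    """Each instruction takes the full 800 ps before the next one starts."""
    return n_instr * sum(STAGE_PS)


def pipelined_total_ps(n_instr):
    """One instruction enters per cycle; the cycle is as long as the slowest stage."""
    cycle = max(STAGE_PS)
    return (len(STAGE_PS) + n_instr - 1) * cycle


print(unpipelined_total_ps(3), pipelined_total_ps(3))  # 2400 1400
```

The overhead shows in the latency of one instruction: a pipelined lw takes 5 × 200 = 1000 ps instead of 800 ps, because the 100 ps stages are stretched to the 200 ps cycle.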
Obstacles to Pipelining
Structural Hazards
Structural Hazard: hardware cannot support instruction
Suppose we want to add a new instruction:
xor dst, src0, n(src1)
Fetch second operand during MEM, two cycles after EX!
Requires an additional MEM (read) stage before EX
Requires ALU to calculate n+src1 as well as XOR
But each instruction only has one cycle in EX stage!
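One way to see why the proposed xor does not fit is to list the stages it needs, in order, and check them against a single pass through the pipeline. A small sketch in Python; `fits_pipeline` and the stage lists are hypothetical illustrations, not part of any real tool:

```python
PIPELINE = ["IF", "ID", "EX", "MEM", "WB"]


def fits_pipeline(needs):
    """True if the needed stages occur in pipeline order, each used at most once."""
    pos = 0
    for stage in needs:
        # Advance to the next pipeline stage that matches the need.
        while pos < len(PIPELINE) and PIPELINE[pos] != stage:
            pos += 1
        if pos == len(PIPELINE):
            return False  # stage needed again, or needed out of order
        pos += 1
    return True


lw_needs = ["IF", "ID", "EX", "MEM", "WB"]
# xor dst, src0, n(src1): compute n+src1, read memory, then XOR, then write dst.
xor_needs = ["IF", "ID", "EX", "MEM", "EX", "WB"]

print(fits_pipeline(lw_needs), fits_pipeline(xor_needs))  # True False
```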
Data Hazards
Data Hazards: ALU needs value not yet in register file
Suppose we execute the following dependent instructions:
add $s0, $t0, $t1
sub $t2, $s0, $t3
Result of add not written to $s0 until WB
But sub requires $s0 = $t0+$t1 the very next cycle!
[Figure: add $s0, $t0, $t1 passing through IF, ID, EX, MEM, WB along a time axis from 200 to 1000 ps]
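The dependence between add and sub can be found mechanically by comparing the destination of one instruction with the sources of the next. A minimal sketch in Python, assuming three-register instructions of the form `op rd, rs, rt` only; the helper names are illustrative:

```python
def parse(instr):
    """Split 'op rd, rs, rt' into (op, destination, [sources])."""
    op, rest = instr.split(None, 1)
    regs = [r.strip() for r in rest.split(",")]
    return op, regs[0], regs[1:]


def adjacent_raw_hazards(program):
    """Return (producer index, consumer index, register) for back-to-back read-after-write pairs."""
    hazards = []
    for i in range(len(program) - 1):
        _, dest, _ = parse(program[i])
        _, _, sources = parse(program[i + 1])
        if dest in sources:
            hazards.append((i, i + 1, dest))
    return hazards


program = ["add $s0, $t0, $t1", "sub $t2, $s0, $t3"]
print(adjacent_raw_hazards(program))  # [(0, 1, '$s0')]
```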
EX Forwarding
Stalling wastes cycles waiting for the previous instruction to complete
The compiler could fill the bubbles with independent instructions
Or the hardware could do so with out-of-order execution
But useful instructions are hard to find, and the hazard happens too often!
No stalls required
[Figure: two back-to-back instructions, each passing through IF, ID, EX, MEM, WB, overlapped by one cycle; time axis from 200 to 1000 ps]
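The benefit of EX forwarding can be expressed as a stall count. A rough sketch in Python under two stated assumptions: without forwarding, the register file is written in the first half of a cycle and read in the second half, so the consumer's ID may share a cycle with the producer's WB; with forwarding, the ALU result is passed directly to the next instruction's EX stage. The function name is illustrative:

```python
def stalls_after_alu(distance, forwarding):
    """Bubbles needed before a consumer `distance` instructions after an ALU producer.

    distance == 1 means the consumer immediately follows the producer.
    """
    if forwarding:
        # Result is ready at the end of EX and feeds the consumer's EX one cycle later.
        return max(0, 1 - distance)
    # Consumer's ID must not come before the producer's WB (three stages after ID).
    return max(0, 3 - distance)


print(stalls_after_alu(1, forwarding=False))  # 2
print(stalls_after_alu(1, forwarding=True))   # 0
```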
MEM Forwarding
What about load instructions?
Consider the following instruction sequence:
lw $s0, 20($t1)
sub $t2, $s0, $t3
Result from lw not available until after MEM stage
[Figure: program execution order for lw $s0, 20($t1) followed by the next instruction, with one bubble inserted between them; stages IF, ID, EX, MEM, WB; time axis from 200 to 1400 ps]
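Even with forwarding from MEM, a load followed immediately by a use of the loaded register needs one bubble. A minimal sketch in Python that inserts that bubble, assuming forwarding is available and that the first register written in an instruction is its destination (so it does not model sw); the helper names are illustrative:

```python
import re


def registers(instr):
    """Return the registers in the order they are written in the instruction."""
    return re.findall(r"\$\w+", instr)


def insert_load_use_bubbles(program):
    """Insert one 'bubble' after each lw whose result is read by the very next instruction."""
    schedule = []
    for i, instr in enumerate(program):
        schedule.append(instr)
        if instr.split()[0] == "lw" and i + 1 < len(program):
            loaded = registers(instr)[0]
            next_sources = registers(program[i + 1])[1:]
            if loaded in next_sources:
                schedule.append("bubble")
    return schedule


program = ["lw $s0, 20($t1)", "sub $t2, $s0, $t3"]
schedule = insert_load_use_bubbles(program)
print(schedule)  # ['lw $s0, 20($t1)', 'bubble', 'sub $t2, $s0, $t3']
```

With three slots in the schedule, the last instruction finishes after 5 + 3 - 1 = 7 cycles, or 1400 ps at 200 ps per cycle.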
Reordering Instructions
Branch/Control Hazards
Branch Prediction
Static Branch Prediction
If the target is before the PC, predict taken: likely to be a loop
Otherwise, predict not taken: could be if-then control
Unconditional branches (or jumps) always taken
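The three static rules above can be written as a single decision. A minimal sketch in Python; the function name and the example addresses are illustrative, not taken from the slide:

```python
def predict_taken(pc, target, unconditional=False):
    """Static prediction: jumps always taken, backward branches taken, forward branches not taken."""
    if unconditional:
        return True
    return target < pc  # target before PC: likely the bottom of a loop


print(predict_taken(pc=0x40, target=0x20))                      # True
print(predict_taken(pc=0x40, target=0x60))                      # False
print(predict_taken(pc=0x40, target=0x60, unconditional=True))  # True
```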
Reading Material