Vous êtes sur la page 1sur 3

CS-421 Parallel Processing BE (CIS) Batch 2004-05

Handout_5.1
MIPS Pipeline
1. Instruction Fetch (IF) Stage
a. Instruction Fetch
Instruction’s address in PC is applied to instruction memory that causes the addressed
instruction to become available at the output lines of instruction memory.
b. Updating PC
The address in PC is incremented by 4 but what is written in PC is determined by the
control signal PCSrc. Depending upon the status of control signal PCSrc, PC is either
written by the branch target address (BTA) or the sequential address (PC + 4).
2. Instruction Decode (ID) Stage
a. Instruction is decoded by the control unit that takes 6-bit opcode and generates control
signals.
b. The control signals are buffered in the pipeline registers until they are used in the
concerned stage by the corresponding instruction.
c. Registers are also read in this stage. Note that the first source register’s identifier in every
instruction is at bit positions [25:21] and second source register’s identifier (if any) is at
bit positions [20:16].
d. The destination register’s identifier is either at bit positions [15:11] (for R-type) or at
[20:16] (for lw and addi). The correct destination register’s identifier is selected via
multiplexer controlled by the control signal RegDst. However, this multiplexer is placed
in the EX stage because the instruction decoding is not finished until the second stage is
complete. But this identifier is buffered until the WB stage because an instruction writes
a register in the WB stage.
3. Execution (EX) Stage
a. This stage is marked by the use of ALU that performs the desired operation on registers
(R-type), calculates address (memory reference instructions), or compares registers
(branch).
b. An ALU control accepts 6-bit funct field and 2-bit control signal ALUOp to generate
the required control signal for the ALU.
c. BTA is also calculated in the EX stage by a separate adder.
4. Memory (M) Stage
a. Data memory is read (lw) or written (sw) using the address calculated by the ALU in EX
stage.
b. ZERO output of ALU and BRANCH signal generated by the control unit are ANDed
to determine the fate of branch (taken or not taken).

Page - 1 - of 3
CS-421 Parallel Processing BE (CIS) Batch 2004-05
Handout_5.1
5. Write Back (WB) Stage
a. Result produced by ALU in EX stage (R-type) or data read from data memory in M stage
(lw) is written in destination register. The data to be written in destination register is
selected via multiplexer controlled by the control signal MemToReg.
Harvard Architecture
Separate memory units for instructions and data (Harvard Architecture) are required because in a
given pipeline cycle two instructions may need to use memory (one for instruction fetch and
another for data read/write) as shown below.
WB I1 I 2 I3 I 4 I5 I6
M I1 I2 I3 I4 I5 I6 I7
EX I1 I2 I 3 I 4 I5 I 6 I7 I8
ID I1 I2 I3 I4 I5 I6 I7 I8 I9
IF I1 I2 I3 I4 I5 I6 I7 I8 I9 I10
cycles 1 2 3 4 5 6 7 8 9 10
As indicated, in cycle 4 I1 accesses memory for data read/write and I4 is being fetched
accessing instruction memory. Harvard Architecture averts this problem.

Exercise
1. What are the sizes of pipeline registers IF/ID, ID/EX, EX/M and M/WB?
2. Why there is no pipeline register needed after WB stage?

Graphical Representation of MIPS Pipeline

Consider pipelined execution of following MIPS instructions:


lw $1, 0($2)
add $3, $4, $5
The lw instruction uses all stages in the pipeline but add (like any other R-type instruction)
doesn’t access data memory i.e. it doesn’t use M stage. Thus the progress of above instructions
through the MIPS pipeline is illustrated below:
CC1 CC2 CC3 CC4 CC5
lw IF ID EX M WB
add IF ID EX WB
Did you notice any problem? In CC5 a resource conflict is observed. That is, two different
instructions attempt to use the same hardware in the same cycle. This can be averted by ensuring
uniformity: make all instructions pass through all the stages in the same order.
Page - 2 - of 3
CS-421 Parallel Processing BE (CIS) Batch 2004-05
Handout_5.1
As a consequence, some instructions will do nothing (accomplished through disabling
corresponding control signals) in some stages.
R-Type IF ID EX M WB

sw IF ID EX M WB

beq IF ID EX M WB
Where shaded boxes represent pipeline stages in which given instruction does nothing.
Only lw uses all five stages.
Pipeline Cycle Time
Consider the following processing delays of different pipeline stages:
Pipeline Stage Processing Delay
IF 3ns
ID 2ns
EX 2ns
M 3ns
WB 2ns
Let the delay introduced by each pipeline register be 1ns.
Time(ns) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
I1 IF IF IF ID ID EX EX M M M WB WB
I2 IF IF IF ID ID EX EX M M M
I3 IF IF IF ID ID EX EX
where,
Indicates the time slot during which a pipeline register is being read/written

Indicates the time slot during which a faster stage has to wait for a slower stage to
finish.

Let the pipeline cycle time be denoted by PCT; that is, once the pipeline sets up, one instruction
gets completed every PCT seconds. Here, PCT = 4ns. That is, the sum of the processing delay
of the slowest stage and the delay introduced by the pipeline register.
As can be observed, the performance of pipeline is constrained by the processing delay of the
slowest stage: A chain is as strong as its weakest link.
Exercise
What’s the instruction latency and instruction throughput of pipelined and non-pipelined
execution for a program containing 1000 instructions? Assume the delays given above.

*****

Page - 3 - of 3

Vous aimerez peut-être aussi