
CS 161: Assignment 3

Due at 11:59PM on May 14, 2014



1. Consider the following control and datapath with the datapath latencies in Table 1. Assume registers and data memory
are edge-triggered and all hardware components can work concurrently when there are no data dependencies.



I-Mem   Add    Mux    ALU    Register Read/Write   D-Mem Read/Write   Sign-Extend   Shift-Left-2   ALU Ctrl
200ps   70ps   20ps   90ps   90ps                  250ps              15ps          10ps           30ps
Table 1: Datapath latencies
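
As a rough illustration of how Table 1 is used: the delay of a path through the datapath is the sum of the latencies of the components it passes through in sequence, and when several paths run in parallel the slowest one sets the time. The Python sketch below assumes exactly that model. The dictionary keys and the helpers `path_latency` and `slowest_path` are names introduced here for illustration, and the example path (instruction fetch followed by a register read) is only a fragment, not the critical path of any instruction.

```python
# Latencies from Table 1, in picoseconds. The key names are illustrative.
LATENCY_PS = {
    "I-Mem": 200,
    "Add": 70,
    "Mux": 20,
    "ALU": 90,
    "Register": 90,
    "D-Mem": 250,
    "Sign-Extend": 15,
    "Shift-Left-2": 10,
    "ALU Ctrl": 30,
}


def path_latency(components):
    """Return the total delay (ps) of components traversed one after another."""
    return sum(LATENCY_PS[name] for name in components)


def slowest_path(paths):
    """Return the largest delay (ps) among several paths that run in parallel."""
    return max(path_latency(p) for p in paths)


# Example fragment: fetch an instruction, then read the register file.
print(path_latency(["I-Mem", "Register"]))  # 290
```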

a. To avoid being on the critical path, what is the maximum time to generate each of the following control
signals: MemRead, ALUOp, MemWrite, ALUSrc, RegWrite?








b. Given the control signal generation times in the following Table 2 and the datapath latencies in the above
Table 1, derive the exact time to execute R-type, Load, Store, Branch, and Jump instructions.

RegDst Jump Branch MemRead MemtoReg ALUOp MemWrite ALUSrc RegWrite
500ps 500ps 450ps 200ps 450ps 200ps 500ps 100ps 500ps
Table 2: Control signal generation times





2. Consider executing the following assembly code in the MIPS five-stage (IF, ID, EX, ME, WB) pipeline model:

Loop: lw $t0, 0($s1)
addi $t0, $t0, 1
sw $t0, 0($s1)
addi $s1, $s1, 4
bne $s1, $s2, Loop
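
For orientation, the loop's effect can be sketched in Python. This is a reading of the code rather than part of the assignment: it assumes `$s1` holds the byte address of the current word and `$s2` the address just past the last word, models memory as a dictionary from byte address to word, and ignores 32-bit overflow. The function name `run_loop` is made up for this sketch.

```python
def run_loop(memory, s1, s2):
    """Simulate the loop; memory maps byte addresses to word values."""
    while True:
        t0 = memory[s1]   # lw   $t0, 0($s1)
        t0 = t0 + 1       # addi $t0, $t0, 1
        memory[s1] = t0   # sw   $t0, 0($s1)
        s1 = s1 + 4       # addi $s1, $s1, 4
        if s1 == s2:      # bne  $s1, $s2, Loop (falls through when equal)
            break
    return memory


# Two words at addresses 100 and 104; $s2 = 108 marks the end.
print(run_loop({100: 5, 104: 7}, 100, 108))  # {100: 6, 104: 8}
```

Like the assembly, the sketch tests the exit condition after the body, so the body always runs at least once.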

a. Assume there is only one memory port, data forwarding is not implemented, and the branch instruction stalls until
the end of the WB stage. Complete the following pipeline execution diagram for one iteration. What is the CPI
assuming an infinite number of iterations?

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
lw $t0, 0($s1) IF ID EX ME WB
addi $t0, $t0, 1
sw $t0, 0($s1)
addi $s1, $s1, 4
bne $s1,$s2,Loop
lw $t0, 0($s1)
(next iteration)
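
With an infinite number of iterations, the pipeline fill time is amortized away, so the CPI can be read off the diagram as the number of cycles between the IF of the first `lw` and the IF of the next iteration's `lw`, divided by the number of instructions per iteration. A minimal Python sketch of that arithmetic follows; the helper name `steady_state_cpi` is introduced here, and the numbers in the example are hypothetical, not the answer to any part of this question.

```python
from fractions import Fraction


def steady_state_cpi(cycles_per_iteration, instructions_per_iteration):
    """Cycles between the starts of consecutive iterations, per instruction."""
    return Fraction(cycles_per_iteration, instructions_per_iteration)


# Hypothetical numbers: 10 cycles between iteration starts, 4 instructions.
cpi = steady_state_cpi(10, 4)
print(cpi, float(cpi))  # 5/2 2.5
```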


b. Indicate all data dependences and their types (i.e., RAW, WAR, or WAW).



c. Assume structural hazards are resolved, data forwarding is implemented, and the branch result is available at the
end of the ID stage. Complete the following pipeline execution diagram for one iteration. What is the CPI
assuming an infinite number of iterations?

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
lw $t0, 0($s1) IF ID EX ME WB
addi $t0, $t0, 1
sw $t0, 0($s1)
addi $s1, $s1, 4
bne $s1,$s2,Loop
lw $t0, 0($s1)
(next iteration)


d. Assume structural hazards are resolved, data forwarding is implemented, the branch result is available at the end of
the ID stage, and you may reorder the code to avoid pipeline stalls. Complete the following pipeline execution
diagram for one iteration. What is the CPI assuming an infinite number of iterations?

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
lw $t0, 0($s1) IF ID EX ME WB








e. Assume structural hazards are resolved, data forwarding is implemented, the branch result is available at the end of
the ID stage, the loop is unrolled twice with unnecessary loop overhead eliminated, and you may reorder the
code to avoid pipeline stalls. Complete the following pipeline execution diagram for one unrolled iteration (i.e.,
two original iterations). What is the CPI assuming an infinite number of iterations?

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
lw $t0, 0($s1) IF ID EX ME WB









f. Assume at most two instructions (i.e., one ALU/branch and one load/store) can be issued in each cycle, data
access from/to memory does not interfere with instruction fetch, data forwarding is implemented, branch
prediction is perfect (i.e., branches do not stall), and you may reorder the code within one iteration to avoid
pipeline stalls. Complete the execution diagram for one iteration. What is the CPI assuming an infinite
number of iterations?

Cycle ALU/branch Load/store
1 nop lw $t0, 0($s1)
2
3
4
5

g. Assume at most two instructions (one ALU/branch and one load/store) can be issued in each cycle, data access
from/to memory does not interfere with instruction fetch, data forwarding is implemented, branch prediction is
perfect (i.e., branches do not stall), the loop is unrolled four times with unnecessary loop overhead eliminated,
and you may reorder the instructions to avoid pipeline stalls. Complete the execution diagram for one
unrolled iteration (i.e., 4 original iterations). What is the CPI assuming an infinite number of iterations?

Cycle ALU/branch Load/store
1
2
3
4
5
6
7
8
9
10


3. What optimizations does the gcc compiler perform when you compile your code with the
compilation flags -O0, -O1, -O2, and -O3 respectively? Compile the C code we provided in your
Assignment2-Answer with the compilation flags -O0, -O1, -O2, and -O3 respectively and report
the size (in bytes) of the executables produced by the four compilation approaches. Run the executables
you get and compare their performance. Explain why the performance is different.
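
One way to collect the sizes and run times is a small driver script. The Python sketch below is only a starting point under stated assumptions: `gcc` is on the PATH, the program needs no arguments or input, the output names (`prog_O0`, ...) and the helper names are invented here, and a single timed run is noisy, so repeated runs would give a fairer comparison.

```python
import os
import subprocess
import time

OPT_FLAGS = ["-O0", "-O1", "-O2", "-O3"]


def build_command(source, flag):
    """Return the gcc argument list and output name for one optimization level."""
    output = "prog" + flag.replace("-", "_")  # e.g. prog_O2 (name is arbitrary)
    return ["gcc", flag, source, "-o", output], output


def build_and_measure(source):
    """Compile at each level; return {flag: (size_in_bytes, run_seconds)}."""
    results = {}
    for flag in OPT_FLAGS:
        argv, output = build_command(source, flag)
        subprocess.run(argv, check=True)           # stop if compilation fails
        size = os.path.getsize(output)             # executable size in bytes
        start = time.perf_counter()
        subprocess.run([os.path.join(".", output)])  # run the executable once
        results[flag] = (size, time.perf_counter() - start)
    return results
```

Calling `build_and_measure("assignment2.c")` would then return one (size, time) pair per flag; the file name here is a placeholder for whichever source file was provided.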
