Single Cycle Processor PPT by Svnit

Computer Organization
Micro Architecture Level

The Processor: Datapath and Control
Six-Level Computer
LEVEL 5  Problem Oriented Language Level (High-Level Language Level)
         Translation (Compiler)
LEVEL 4  Assembly Language Level
         Translation (Assembler)
LEVEL 3  Operating System Machine Level
         Partial Interpretation (Operating System)
LEVEL 2  Instruction Set Architecture Level
         Interpretation (Microprogram) / Direct Execution (Hardwired)
LEVEL 1  Micro Architecture Level
         Hardware
LEVEL 0  Digital Logic Level


GOAL: Datapath and control unit
 Design a computer architecture
 To support the defined ISA
 To fetch and execute each instruction
 Study of
 Processor designs
• Single cycle
• Multiple cycle
• Pipelined
GOAL: Datapath and control unit
 Study of
 Datapath
• Basic components
• How is a datapath constructed?
• How is the speed of the CPU determined?
 Control unit
• Why do we need a control unit?
• How does the control unit relate instructions and the datapath?
• Control unit design
Instruction Set Architecture (ISA)
 A programmable system uses
 A sequence of instructions to control its
operation
 A typical instruction specifies:
 Operation to be performed
 Operands to use
 Where to place the result
 Which instruction to execute next
Instruction Set Architecture (ISA) (cont.)
 Instructions are stored in
 RAM or ROM as a program
 The addresses for instructions in a
computer are provided by
A program counter (PC) that can
 Count up
 Load a new address based on an instruction
and, optionally, status information
Instruction Set Architecture (ISA) (cont.)
 The PC and associated control logic are part of the Control Unit
 Executing an instruction:
 Activating the necessary sequence of operations
specified by the instruction
 Execution is controlled by the control unit and performed
 In the datapath
 In the control unit
 In external hardware such as memory or input/output
ISA: Instruction Format
 An instruction consists of a bit vector

 The fields of an instruction are subvectors
representing specific functions and having specific
binary codes defined
 The format of an instruction defines the subvectors
and their function
 An ISA usually contains multiple formats
MIPS Instructions & Instruction Formats
 Simplified to cover only
 Arithmetic-logic instructions : add, sub, and, or, slt
 Memory-reference instructions : lw, sw
 Control-flow instructions : beq, j
R-Format: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
I-Format: op (6 bits) | rs (5 bits) | rt (5 bits) | offset (16 bits)
J-Format: op (6 bits) | address (26 bits)
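The three formats can be illustrated with a small field extractor. This is a minimal sketch, not part of the original slides: the function name decode and the dictionary layout are assumptions made for illustration, and the field positions follow the bit widths listed above (op in the top 6 bits, funct in the bottom 6).

```python
def decode(instr):
    """Split a 32-bit MIPS instruction word into the fields of all three formats.

    Hypothetical helper: every field is extracted regardless of format;
    the caller picks the ones that apply (R, I or J).
    """
    return {
        "op": (instr >> 26) & 0x3F,      # 6 bits, all formats
        "rs": (instr >> 21) & 0x1F,      # 5 bits, R and I formats
        "rt": (instr >> 16) & 0x1F,      # 5 bits, R and I formats
        "rd": (instr >> 11) & 0x1F,      # 5 bits, R format
        "shamt": (instr >> 6) & 0x1F,    # 5 bits, R format
        "funct": instr & 0x3F,           # 6 bits, R format
        "offset": instr & 0xFFFF,        # 16 bits, I format
        "address": instr & 0x3FFFFFF,    # 26 bits, J format
    }


# add $t1, $t2, $t3: op = 0, rs = $t2 (10), rt = $t3 (11), rd = $t1 (9), funct = 100000
word = (0 << 26) | (10 << 21) | (11 << 16) | (9 << 11) | (0 << 6) | 0b100000
fields = decode(word)
```

Extracting every field unconditionally mirrors the hardware, where the wires for each field always carry bits and the control unit decides which ones matter.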
Implementing MIPS:
the Fetch/Execute Cycle
 High-level abstract view of fetch/execute implementation
 use the program counter (PC) to supply the instruction address
 fetch the instruction from memory and increment PC
 use fields of the instruction to select registers to read
 execute depending on the instruction
 repeat…
Processor Implementation Styles
 Single Cycle
 Perform each instruction in 1 clock cycle
 Clock cycle must be long enough for slowest instruction
 Multi-Cycle
 Break fetch/execute cycle into multiple steps
 Perform 1 step in each clock cycle
 Pipelined
Single Vs. Multi-Cycle Processor Machine
              Single cycle    Multi-cycle
Cycle time    20 ns           5 ns
Load          1 cycle         4 cycles
Add           1 cycle         3 cycles
Beq           1 cycle         2 cycles
Time for a load, add, and beq: 60 ns (single cycle) vs. 45 ns (multi-cycle)
• In the Single cycle implementation, every instruction
requires one cycle to complete, so cycle time = time taken for
the slowest instruction
• If the execution was broken into multiple (faster)
cycles, the shorter instructions can finish sooner
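The 60 ns and 45 ns totals can be reproduced with a few lines of arithmetic. This is a minimal sketch using the cycle times and cycle counts from the comparison above; the constant and function names are assumptions chosen for illustration.

```python
SINGLE_CYCLE_NS = 20   # one long cycle per instruction
MULTI_CYCLE_NS = 5     # shorter cycle, several cycles per instruction
MULTI_CYCLES = {"load": 4, "add": 3, "beq": 2}


def single_cycle_time(program):
    """Every instruction takes exactly one 20 ns cycle."""
    return len(program) * SINGLE_CYCLE_NS


def multi_cycle_time(program):
    """Each instruction takes only as many 5 ns cycles as it needs."""
    return sum(MULTI_CYCLES[instr] for instr in program) * MULTI_CYCLE_NS


program = ["load", "add", "beq"]
print(single_cycle_time(program), multi_cycle_time(program))
```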
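Unpipelined: start and finish a job before moving to the next
Pipelined: break the job into smaller stages
[Figure: jobs A, B, C plotted against time, first unpipelined (one after another) and then pipelined (stages of successive jobs overlapped)]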
 Single Cycle
 Disadvantage: only as fast as the slowest instruction
 Multi-Cycle
 Advantage: each instruction uses only as many cycles as it needs
 Pipelined
 Execute each instruction in multiple steps
 Perform 1 step / instruction in each clock cycle
 Process multiple instructions in parallel
Functional Elements
 Two types of functional elements in the hardware:
 Elements that operate on data
 Called combinational elements
 Elements that contain data

 Called state or sequential elements
Combinational Elements
 Works as an input -> output function, e.g., ALU
 Combinational logic reads input data from one register and writes
output data to another, or same, register
 read/write happens in a single cycle: a combinational element cannot
store data from one cycle to a future one
[Figure: combinational logic hardware units: state element 1 feeds combinational logic, which feeds state element 2 within one clock cycle; alternatively the combinational logic writes back to the same state element]
State Elements
 Contain data in internal storage, e.g., registers and memory
 All state elements together define the state of the machine
 Flipflops and latches are 1-bit state elements, equivalently,
they are 1-bit memories
 The output(s) of a flipflop or latch always depends on the bit
value stored, i.e., its state, and can be called 1/0 or high/low or
true/false
 The input to a flipflop or latch can change its state depending
on whether it is clocked or not…
Basic memory unit
Set-Reset (SR-) latch (unclocked)
Made from two cross-coupled nand gates (equivalently, with nor gates)
Think of Sbar as the inverse of set (which sets Q to 1), and Rbar as the inverse of reset.
When both Sbar and Rbar are 1, then either one of the following two states is stable:
a) Q = 1 & Qbar = 0
b) Q = 0 & Qbar = 1
and the latch will continue in the current stable state.
If Sbar changes to 0 (while Rbar remains at 1), then the latch is forced to the exactly one possible stable state (a). If Rbar changes to 0 (while Sbar remains at 1), the latch is forced to the exactly one possible stable state (b).
So, the latch remembers which of Sbar or Rbar was last 0 during the time they are both 1.
When both Sbar and Rbar are 0 the exactly one stable state is Q = Qbar = 1. However, if after that both Sbar and Rbar return to 1, the latch must then jump non-deterministically to one of stable states (a) or (b), which is undesirable behavior.
[Figure: SR-latch built from nand gates n1 (input Sbar, output Q) and n2 (input Rbar, output Qbar); equivalent nor-gate version with inputs R and S]
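The set, hold and reset behavior of the nand-gate latch can be checked with a tiny gate-level model. This is a minimal sketch, not taken from the slides: the function names nand and sr_latch and the settle-by-iteration approach are assumptions, and the fixed gate evaluation order means the model cannot reproduce the non-deterministic case (both inputs returning to 1 from 0), which in real hardware is a race.

```python
def nand(a, b):
    """2-input nand gate on 0/1 values."""
    return 0 if (a and b) else 1


def sr_latch(sbar, rbar, q, qbar, max_iters=10):
    """Settle two cross-coupled nand gates and return the new (Q, Qbar).

    Gate n1 computes Q from Sbar and Qbar; gate n2 computes Qbar from
    Rbar and Q. The loop re-evaluates both gates until nothing changes.
    """
    for _ in range(max_iters):
        new_q = nand(sbar, qbar)
        new_qbar = nand(rbar, new_q)
        if (new_q, new_qbar) == (q, qbar):
            break
        q, qbar = new_q, new_qbar
    return q, qbar


state = (0, 1)                      # start in stable state (b)
state = sr_latch(0, 1, *state)      # Sbar = 0 forces state (a)
after_set = state
state = sr_latch(1, 1, *state)      # both inputs 1: latch holds
after_hold = state
state = sr_latch(1, 0, *state)      # Rbar = 0 forces state (b)
after_reset = state
```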
Synchronous Logic:
Clocked Latches and Flipflops
 Clocks are used in synchronous logic to determine when a state
element is to be updated
 in level-triggered clocking methodology either the state changes
only when the clock is high or only when it is low (technology-
dependent)
[Figure: clock waveform showing the clock period, a rising edge and a falling edge]

 in edge-triggered clocking methodology either the rising edge or
falling edge is active (depending on technology) – i.e., states
change only on rising edges or only on falling edge
 Latches are level-triggered

 Flipflops are edge-triggered
Clocked SR-latch
 State can change only when clock is high
 Potential problem : both inputs Sbar = 0 & Rbar = 0 will
cause non-deterministic behavior
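[Figure: clocked SR-latch: inputs Sbar and Rbar are gated with the clock (clk, clkbar) through gates r1 and r2 to give X and Y, which drive the nand latch n1/n2 with outputs Q and Qbar]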
Clocked D-latch
 State can change only when clock is high
 Only single data input (compare SR-latch)
 No problem with non-deterministic behavior
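[Figure: clocked D-latch: input D and its inverse Dbar are gated with the clock (clk, clkbar) through gates r1 and r2 to give X and Y, which drive the nand latch n1/n2 with outputs Q and Qbar]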
Timing diagram of D-latch

Clocked D-flipflop
 Negative edge-triggered
 Made from three SR-latches
[Figure: negative edge-triggered D-flipflop built from three SR-latches, with inputs d, clk and clear and outputs q and qbar]
Logisim
 All components that we have discussed, and shall discuss, can
be built and simulated using Logisim
State Elements on the Datapath:
Register File
 Registers are implemented with arrays of D-flipflops
[Figure: Register file with two read ports and one write port. Inputs: Read register number 1 (5 bits), Read register number 2 (5 bits), Write register (5 bits), Write data (32 bits), Write control signal, Clock. Outputs: Read data 1 (32 bits), Read data 2 (32 bits)]
State Elements on the Datapath:
Register File
 Port implementation:
 Read ports are implemented with a pair of multiplexors: multiplexors with a 5-bit select input for 32 registers
 Write port is implemented using a decoder: a 5-to-32 decoder for 32 registers. Clock is relevant to write, as register state may change only at the clock edge
[Figure: read side: registers 0 to n feed two multiplexors selected by Read register number 1 and 2, giving Read data 1 and Read data 2; write side: the Register number decoder, combined with Write, drives the clock input C of each register, and Register data drives every D input]
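The port behavior described above can be modeled in a few lines. This is a minimal sketch under stated assumptions: the class name RegisterFile and its method names are invented for illustration, a Python list index stands in for the read multiplexors and the write decoder, and a method call stands in for the clock edge.

```python
class RegisterFile:
    """Hypothetical model of a register file with two read ports and one write port."""

    def __init__(self, count=32):
        self.regs = [0] * count

    def read(self, rn1, rn2):
        """Read ports are combinational: both values are available at once."""
        return self.regs[rn1], self.regs[rn2]

    def clock_edge(self, reg_write, wn, wd):
        """State changes only here, and only when the write control is asserted."""
        if reg_write:
            self.regs[wn] = wd & 0xFFFFFFFF   # registers hold 32 bits


rf = RegisterFile()
rf.clock_edge(reg_write=1, wn=9, wd=42)    # written
rf.clock_edge(reg_write=0, wn=10, wd=99)   # ignored: write control deasserted
values = rf.read(9, 10)
```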
Datapath:
Instruction Store/Fetch & PC Increment
Three elements used to store and fetch instructions and increment the PC
[Figure: a. Instruction memory (input: instruction address; output: instruction)  b. Program counter (PC)  c. Adder (output: sum)]
[Figure: the three elements combined: the PC supplies the read address of the instruction memory and one input of an adder whose other input is 4]
Animating the Datapath
Instruction <- MEM[PC]
PC <- PC + 4
[Figure: the PC drives the ADDR input of the instruction memory, whose RD output is the instruction, and an ADD unit that computes PC + 4]
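The two register transfers of the fetch step can be written out directly. This is a minimal sketch: the function name fetch is an assumption, and the instruction memory is modeled as a dictionary from byte address to 32-bit word.

```python
def fetch(instr_mem, pc):
    """Instruction <- MEM[PC]; PC <- PC + 4. Returns (instruction, new PC)."""
    instruction = instr_mem[pc]
    return instruction, pc + 4


# Two placeholder instruction words at consecutive word addresses.
instr_mem = {0x0: 0x014B4820, 0x4: 0x8D490004}
first, pc = fetch(instr_mem, 0x0)
second, pc = fetch(instr_mem, pc)
```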
Datapath: R-Type Instruction
Two elements used to implement R-type instructions

[Figure: a. Registers (inputs: Read register 1, Read register 2 and Write register, 5 bits each; Write data; RegWrite control. Outputs: Read data 1, Read data 2)  b. ALU (3-bit ALU operation control; outputs: Zero and ALU result)]
add rd, rs, rt
R[rd] <- R[rs] + R[rt];
[Figure: instruction fields op, rs, rt, rd, shamt, funct; rs, rt and rd (5 bits each) drive RN1, RN2 and WN of the register file; RD1 and RD2 feed the ALU (3-bit Operation, Zero output); the ALU result returns to WD and is written when RegWrite is asserted]
Datapath: Load/Store Instruction
[Figure: a. Data memory unit (inputs: Address, Write data, MemWrite, MemRead; output: Read data)  b. Sign-extension unit (16 bits in, 32 bits out)]
Two additional elements used to implement load/stores
[Figure: load/store datapath: the register file and the sign-extended 16-bit offset feed the ALU, whose result is the address for the data memory]
lw rt, offset(rs)
R[rt] <- MEM[R[rs] + sign_extend(offset)];
sw rt, offset(rs)
MEM[R[rs] + sign_extend(offset)] <- R[rt]
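The load and store transfers above can be tried out with a small model. This is a minimal sketch: the function names are assumptions, registers are a Python list, data memory is a dictionary keyed by byte address, and alignment checks are left out.

```python
def sign_extend(value16):
    """Interpret the low 16 bits as a two's-complement number."""
    value16 &= 0xFFFF
    return value16 - 0x10000 if value16 & 0x8000 else value16


def lw(regs, mem, rt, rs, offset):
    """R[rt] <- MEM[R[rs] + sign_extend(offset)]"""
    regs[rt] = mem[regs[rs] + sign_extend(offset)]


def sw(regs, mem, rt, rs, offset):
    """MEM[R[rs] + sign_extend(offset)] <- R[rt]"""
    mem[regs[rs] + sign_extend(offset)] = regs[rt]


regs = [0] * 32
regs[10] = 100                 # base register
mem = {96: 7}
lw(regs, mem, 9, 10, 0xFFFC)   # offset -4: loads MEM[96] into register 9
sw(regs, mem, 9, 10, 8)        # offset +8: stores register 9 to MEM[108]
```

The negative offset case shows why the 16-bit field has to be sign-extended rather than zero-extended before it reaches the ALU.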
Recall: Branch Addressing
 Branch instructions:
 beq rs, rt, L1
 bne rs, rt, L1
 Specify:
 Opcode, two registers, target address
 Format:
op (6 bits) | rs (5 bits) | rt (5 bits) | constant or address (16 bits)
 16-bit address?
Recall: Branch Addressing
 16 bits is too small a reach in a 2^32 address space
 Solution:
 Principle of locality
 Most branch targets are near branch
 Forward or backward Direction
 Use PC (= program counter), called PC-relative addressing
based on Principle of Locality
 PC-relative addressing
 Target address = PC + offset × 4
 PC already incremented by 4 by this time
Datapath: Branch Instruction
[Figure: branch datapath: registers rs and rt are read and compared in the ALU, whose Zero output goes to the branch control logic; the sign-extended 16-bit offset, shifted left 2, is added to PC + 4 (from the instruction datapath) to give the branch target]
No shift hardware required: simply connect wires from input to output, each shifted left 2 bits
beq rs, rt, offset
if (R[rs] == R[rt]) then
PC <- PC + 4 + (sign_extend(offset) << 2)
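The PC-relative target calculation for beq can be sketched as follows. The function names are assumptions made for illustration; the offset is sign-extended first and then shifted left by 2, and the result is wrapped to 32 bits.

```python
def sign_extend(value16):
    """Interpret the low 16 bits as a two's-complement number."""
    value16 &= 0xFFFF
    return value16 - 0x10000 if value16 & 0x8000 else value16


def next_pc_beq(pc, rs_value, rt_value, offset):
    """Return the PC after a beq: branch target if equal, else PC + 4."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    if rs_value == rt_value:
        return (pc_plus_4 + (sign_extend(offset) << 2)) & 0xFFFFFFFF
    return pc_plus_4


taken_forward = next_pc_beq(0x100, 5, 5, 3)         # 0x104 + 3 * 4
taken_backward = next_pc_beq(0x100, 5, 5, 0xFFFF)   # 0x104 - 4
not_taken = next_pc_beq(0x100, 5, 6, 3)
```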
Composing the Elements
Each datapath element can only do one function at a time
 Check possibility
 Need separate instruction and data memories
 Use multiplexers where alternate data sources are used for
different instructions
MIPS Datapath I: Single-Cycle
Second ALU input is either register (R-type) or sign-extended lower half of instruction (load/store)
Data written to the register file is either from ALU (R-type) or memory (load)
Combining the datapaths for R-type instructions and load/stores using two multiplexors
Animating the Datapath:
R-type Instruction
add rd,rs,rt
[Figure: combined datapath executing add: register file (RN1, RN2, WN, RD1, RD2, WD, RegWrite), sign extend (16 to 32 bits), ALUSrc multiplexor, ALU (3-bit Operation), data memory (ADDR, RD, WD, MemRead, MemWrite) and MemtoReg multiplexor]
Load Instruction
lw rt,offset(rs)
[Figure: the same combined datapath executing lw]
Store Instruction
sw rt,offset(rs)
[Figure: the same combined datapath executing sw]
MIPS Datapath II: Single-Cycle
Separate adder, as ALU operations and PC increment occur in the same clock cycle
Separate instruction memory, as instruction and data read occur in the same clock cycle
[Figure: Adding instruction fetch: the PC, instruction memory and PC + 4 adder placed in front of the combined R-type and load/store datapath]
MIPS Datapath III: Single-Cycle
New multiplexor, controlled by PCSrc: instruction address is either PC+4 or branch target address
Extra adder needed as both adders operate in each cycle
[Figure: Adding branch capability and another multiplexor: the sign-extended offset, shifted left 2, is added to PC + 4, and the PCSrc multiplexor selects between PC + 4 and that sum as the next PC]

Important note: in a single-cycle implementation data cannot be stored
during an instruction – it only moves through combinational logic
Datapath Executing add
add rd, rs, rt
[Figure: single-cycle datapath (PC, instruction memory, register file, ALU, data memory, sign extend, and the PCSrc, ALUSrc and MemtoReg multiplexors) executing add]
Datapath Executing lw
lw rt,offset(rs)
[Figure: the same single-cycle datapath executing lw]
Datapath Executing sw
sw rt,offset(rs)
[Figure: the same single-cycle datapath executing sw]
Datapath Executing beq
beq r1,r2,offset
[Figure: the same single-cycle datapath executing beq]
Control
 Control unit takes input from

 The instruction opcode bits
 Control unit generates

 Write enable (possibly, read enable also) signals for each storage
element
 Selector controls for each multiplexor
 ALU control input
Designing the Main Control
R-type: opcode (bits 31-26) | rs (25-21) | rt (20-16) | rd (15-11) | shamt (10-6) | funct (5-0)
Load/store or branch: opcode (bits 31-26) | rs (25-21) | rt (20-16) | address (15-0)
 Observations about MIPS instruction format

 opcode is always in bits 31-26
 two registers to be read are always rs (bits 25-21) and rt (bits 20-16)
 base register for load/stores is always rs (bits 25-21)
 16-bit offset for branch equal and load/store is always bits 15-0
 destination register for loads is in bits 20-16 (rt) while for R-type
instructions it is in bits 15-11 (rd) (will require multiplexor to select)
Datapath with Control I
[Figure: Adding control to the MIPS Datapath III (and a new multiplexor, controlled by RegDst, to select the field that specifies the destination register). Instruction [25-21] and Instruction [20-16] feed Read register 1 and 2; Instruction [20-16] or Instruction [15-11] feeds Write register; Instruction [15-0] feeds the sign extend unit; Instruction [5-0] and ALUOp feed the ALU control. Control signals shown: PCSrc, RegWrite, RegDst, ALUSrc, MemWrite, MemRead, MemtoReg, ALUOp]
Control Signals
Signal name: effect when deasserted / effect when asserted
RegDst
  Deasserted: The register destination number for the Write register comes from the rt field (bits 20-16)
  Asserted: The register destination number for the Write register comes from the rd field (bits 15-11)
RegWrite
  Deasserted: None
  Asserted: The register on the Write register input is written with the value on the Write data input
ALUSrc
  Deasserted: The second ALU operand comes from the second register file output (Read data 2)
  Asserted: The second ALU operand is the sign-extended, lower 16 bits of the instruction
PCSrc
  Deasserted: The PC is replaced by the output of the adder that computes the value of PC + 4
  Asserted: The PC is replaced by the output of the adder that computes the branch target
MemRead
  Deasserted: None
  Asserted: Data memory contents designated by the address input are put on the Read data output
MemWrite
  Deasserted: None
  Asserted: Data memory contents designated by the address input are replaced by the value of the Write data input
MemtoReg
  Deasserted: The value fed to the register Write data input comes from the ALU
  Asserted: The value fed to the register Write data input comes from the data memory
Effects of the seven control signals

Datapath with Control II
[Figure: MIPS datapath with the control unit: input to control is the 6-bit instruction opcode field (Instruction [31-26]), output is seven 1-bit signals (RegDst, Branch, MemRead, MemtoReg, MemWrite, ALUSrc, RegWrite) and the 2-bit ALUOp signal]
Datapath with Control II (cont.)
PCSrc cannot be set directly from the opcode: zero test outcome is required
[Figure: the same datapath with control unit; PCSrc is formed from the Branch control signal together with the ALU Zero output]
Determining control signals for the MIPS datapath based on instruction opcode
Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
R-format     1       0       0         1         0        0         0       1       0
lw           0       1       1         1         1        0         0       0       0
sw           X       1       X         0         0        1         0       0       0
beq          X       0       X         0         0        0         1       0       1
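The control table can be expressed as a lookup keyed by opcode. This is a minimal sketch: the names CONTROL, OPCODES, main_control and pc_src are assumptions, don't-care entries (X) are stored as None, and the opcode values are the standard MIPS encodings for these four instruction classes.

```python
# One row per instruction class; None marks a don't-care (X) entry.
CONTROL = {
    "R-format": dict(RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,
                     MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),
    "lw":       dict(RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,
                     MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),
    "sw":       dict(RegDst=None, ALUSrc=1, MemtoReg=None, RegWrite=0,
                     MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),
    "beq":      dict(RegDst=None, ALUSrc=0, MemtoReg=None, RegWrite=0,
                     MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),
}

# 6-bit opcodes (bits 31-26) for the four classes.
OPCODES = {0b000000: "R-format", 0b100011: "lw", 0b101011: "sw", 0b000100: "beq"}


def main_control(opcode):
    """Return the control signal settings for a 6-bit opcode."""
    return CONTROL[OPCODES[opcode]]


def pc_src(branch, zero):
    """PCSrc needs the zero test outcome, so it is Branch AND Zero."""
    return branch & zero


signals = main_control(0b100011)   # lw
```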
Control Signals:
R-Type Instruction
[Figure: single-cycle datapath annotated with control signal values: RegDst = 1, ALUSrc = 0, MemtoReg = 0, RegWrite = 1, MemRead = 0, MemWrite = 0, PCSrc = 0; the ALU Operation value depends on funct]
Control Signals:
lw Instruction
[Figure: single-cycle datapath annotated with control signal values: RegDst = 0, ALUSrc = 1, MemtoReg = 1, RegWrite = 1, MemRead = 1, MemWrite = 0, PCSrc = 0; ALU Operation = 010]
Control Signals:
sw Instruction
[Figure: single-cycle datapath annotated with control signal values: RegDst = X, ALUSrc = 1, MemtoReg = X, RegWrite = 0, MemRead = 0, MemWrite = 1, PCSrc = 0; ALU Operation = 010]
Control Signals:
beq Instruction
[Figure: single-cycle datapath annotated with control signal values: RegDst = X, ALUSrc = 0, MemtoReg = X, RegWrite = 0, MemRead = 0, MemWrite = 0, PCSrc = 1 if Zero = 1; ALU Operation = 110]
Datapath with Control III
Jump: opcode (bits 31-26) | address (bits 25-0)
Composing jump target address: Instruction [25-0] is shifted left 2 (26 bits to 28 bits) and combined with PC+4 [31-28] to give Jump address [31-0]
New multiplexor with additional control bit Jump
[Figure: MIPS datapath extended to jumps: control unit generates new Jump control bit]
Datapath Executing j
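The jump target composition can be written as a short bit manipulation. This is a minimal sketch; the function name jump_target is an assumption made for illustration.

```python
def jump_target(pc, address26):
    """Combine PC+4 [31-28] with Instruction [25-0] shifted left 2."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    upper4 = pc_plus_4 & 0xF0000000            # PC+4 [31-28]
    lower28 = (address26 & 0x3FFFFFF) << 2     # 26 bits become 28 bits
    return upper4 | lower28


target = jump_target(0x10000000, 0x10)
```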
R-Type Instruction Steps
add $t1, $t2, $t3
1. Fetch instruction and increment PC
2. Read two source registers from the register file
3. ALU operates on the two register operands
4. Write result to register
R-type Instruction: Step 1
add $t1, $t2, $t3 (active = bold)
[Figure: datapath with control; step shown: Fetch instruction and increment PC count]
R-type Instruction: Step 2
[Figure: datapath with control; step shown: Read two source registers from the register file]
R-type Instruction: Step 3
[Figure: datapath with control; step shown: ALU operates on the two register operands]
R-type Instruction: Step 4
[Figure: datapath with control; step shown: Write result to register]
Load Instruction Steps
lw $t1, offset($t2)
1. Fetch instruction and increment PC
2. Read base register from the register file: the base register
($t2) is given by bits 25-21 of the instruction
3. ALU computes sum of value read from the register file and
the sign-extended lower 16 bits (offset) of the instruction
4. The sum from the ALU is used as the address for the data
memory
5. The data from the memory unit is written into the register file:
the destination register ($t1) is given by bits 20-16 of the
instruction
Load Instruction
lw $t1, offset($t2)
[Figure: datapath with control executing lw]
Branch Instruction Steps
beq $t1, $t2, offset
1. Fetch instruction and increment PC
2. Read two registers ($t1 and $t2) from the register file
3. ALU performs a subtract on the data values from the
register file; the value of PC+4 is added to the sign-
extended lower 16 bits (offset) of the instruction shifted
left by two to give the branch target address
4. The Zero result from the ALU is used to decide which
adder result (from step 1 or 3) to store in the PC
Branch Instruction
beq $t1, $t2, offset
[Figure: datapath with control executing beq]
ALU Control
 Plan to control ALU:
 Main control sends a 2-bit ALUOp control field to the ALU control
 Based on ALUOp and the funct field of the instruction, the ALU control generates the 3-bit ALU control field
ALU control field   Function
000                 and
001                 or
010                 add
110                 sub
111                 slt
ALU Control (cont.)
[Figure: Main Control sends the 2-bit ALUOp to the ALU Control, which combines it with the 6-bit instruction funct field to produce the 3-bit ALU control input to the ALU]
 ALU must perform
 add for load/stores (ALUOp 00)
 sub for branches (ALUOp 01)
 one of and, or, add, sub, slt for R-type instructions, depending on the
instruction’s 6-bit funct field (ALUOp 10)
Setting ALU Control Bits
Instruction opcode | ALUOp | Instruction operation | Funct field | Desired ALU action | ALU control input
LW                 | 00    | load word             | xxxxxx      | add                | 010
SW                 | 00    | store word            | xxxxxx      | add                | 010
Branch eq          | 01    | branch eq             | xxxxxx      | subtract           | 110
R-type             | 10    | add                   | 100000      | add                | 010
R-type             | 10    | subtract              | 100010      | subtract           | 110
R-type             | 10    | AND                   | 100100      | and                | 000
R-type             | 10    | OR                    | 100101      | or                 | 001
R-type             | 10    | set on less           | 101010      | set on less        | 111
ALUOp1  ALUOp0  |  F5  F4  F3  F2  F1  F0  |  Operation
0       0       |  X   X   X   X   X   X   |  010
0       1       |  X   X   X   X   X   X   |  110
1       X       |  X   X   0   0   0   0   |  010
1       X       |  X   X   0   0   1   0   |  110
1       X       |  X   X   0   1   0   0   |  000
1       X       |  X   X   0   1   0   1   |  001
1       X       |  X   X   1   0   1   0   |  111
Truth table for ALU control bits
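The truth table can be turned into a small function. This is a minimal sketch: the names FUNCT_TO_OPERATION and alu_control are assumptions, and as in the table only ALUOp1, ALUOp0 and funct bits F3 to F0 are examined (F5 and F4 are don't-cares).

```python
# Operation selected by funct bits F3..F0 when ALUOp1 = 1 (R-type).
FUNCT_TO_OPERATION = {
    0b0000: 0b010,   # add
    0b0010: 0b110,   # subtract
    0b0100: 0b000,   # and
    0b0101: 0b001,   # or
    0b1010: 0b111,   # set on less than
}


def alu_control(alu_op, funct):
    """Return the 3-bit ALU control input for a 2-bit ALUOp and 6-bit funct."""
    if alu_op & 0b10:                 # ALUOp1 = 1: decode the funct field
        return FUNCT_TO_OPERATION[funct & 0b1111]
    if alu_op & 0b01:                 # ALUOp = 01: branch, subtract
        return 0b110
    return 0b010                      # ALUOp = 00: load/store, add


operation = alu_control(0b10, 0b101010)   # R-type slt
```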
Implementation: ALU Control Block
[Figure: ALU control block: ALU control logic with inputs ALUOp1, ALUOp0 and funct bits F3, F2, F1, F0 (from F (5-0)), and outputs Operation2, Operation1, Operation0, implementing the truth table for ALU control bits]

Implementation: Main Control Block
Signal name   R-format  lw  sw  beq
Inputs
Op5           0         1   1   0
Op4           0         0   0   0
Op3           0         0   1   0
Op2           0         0   0   1
Op1           0         1   1   0
Op0           0         1   1   0
Outputs
RegDst        1         0   x   x
ALUSrc        0         1   1   0
MemtoReg      0         1   x   x
RegWrite      1         1   0   0
MemRead       0         1   0   0
MemWrite      0         0   1   0
Branch        0         0   0   1
ALUOp1        1         0   0   0
ALUOp0        0         0   0   1
Truth table for main control signals
[Figure: main control PLA with inputs Op5 to Op0, product terms for R-format, lw, sw and beq, and outputs RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1, ALUOp0]
Main control PLA (programmable logic array): principle underlying PLAs is that any logical expression can be written as a sum-of-products
Single-cycle Implementation Notes
 The steps are not really distinct as each instruction
completes in exactly one clock cycle – they simply
indicate the sequence of data flowing through the
datapath
 The operation of the datapath during a cycle is purely
combinational – nothing is stored during a clock cycle
 Therefore, the machine is stable in a particular state
at the start of a cycle and reaches a new stable state
only at the end of the cycle
Single-Cycle Design Problems
 Assuming a fixed-period clock, every instruction uses exactly one
clock cycle, which implies:
 CPI = 1
 cycle time determined by length of the longest instruction path (load)
 but several instructions could run in a shorter clock cycle: waste of time
 consider the effect of more complicated instructions like floating point!
 resources used more than once in the same cycle need to be
duplicated
 waste of hardware and chip area
Example: Fixed-period clock vs. variable-period clock
in a single-cycle implementation
 Consider a machine with an additional floating point unit. Assume
functional unit delays as follows
 memory: 2 ns, ALU and adders: 2 ns, FPU add: 8 ns, FPU multiply: 16 ns,
register file access (read or write): 1 ns
 multiplexors, control unit, PC accesses, sign extension, wires: no delay
 Assume instruction mix as follows
 all loads take the same time and comprise 31%
 all stores take the same time and comprise 21%
 R-format instructions comprise 27%
 branches comprise 5%
 jumps comprise 2%
 FP adds and subtracts take the same time and together comprise 7%
 FP multiplies and divides take the same time and together comprise 7%
 Compare the performance of (a) a single-cycle implementation using a
fixed-period clock with (b) one using a variable-period clock where each
instruction executes in one clock cycle that is only as long as it needs to be.
Example (Cont.)
Functional unit delays: memory: 2 ns, ALU and adders: 2 ns, FPU add: 8 ns, FPU multiply: 16 ns, register file access (read or write): 1 ns; multiplexors, control unit, PC accesses, sign extension, wires: no delay
Times in ns; "-" means the unit is not used
Instruction class | Instr. mem. | Register read | ALU oper. | Data mem. | Register write | FPU add/sub | FPU mul/div
Load word         | 2           | 1             | 2         | 2         | 1              | -           | -
Store word        | 2           | 1             | 2         | 2         | -              | -           | -
R-format          | 2           | 1             | 2         | 0         | 1              | -           | -
Branch            | 2           | 1             | 2         | -         | -              | -           | -
Jump              | 2           | -             | -         | -         | -              | -           | -
FP mul/div        | 2           | 1             | -         | -         | 1              | -           | 16
FP add/sub        | 2           | 1             | -         | -         | 1              | 8           | -
Solution
Times in ns; "-" means the unit is not used
Instruction class | Instr. mem. | Register read | ALU oper. | Data mem. | Register write | FPU add/sub | FPU mul/div | Total time
Load word         | 2           | 1             | 2         | 2         | 1              | -           | -           | 8
Store word        | 2           | 1             | 2         | 2         | -              | -           | -           | 7
R-format          | 2           | 1             | 2         | 0         | 1              | -           | -           | 6
Branch            | 2           | 1             | 2         | -         | -              | -           | -           | 5
Jump              | 2           | -             | -         | -         | -              | -           | -           | 2
FP mul/div        | 2           | 1             | -         | -         | 1              | -           | 16          | 20
FP add/sub        | 2           | 1             | -         | -         | 1              | 8           | -           | 12
 Clock period for fixed-period clock
= longest instruction time = 20 ns
 Average clock period for variable-period clock
= 8 × 31% + 7 × 21% + 6 × 27% + 5 × 5% + 2 × 2% + 20 × 7% + 12 × 7%
= 8.1 ns
 Therefore, performance(var-period) / performance(fixed-period) = 20/8.1 ≈ 2.5
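The weighted average can be checked with a few lines of arithmetic. This is a minimal sketch using the per-class total times and the instruction mix stated in the example; the dictionary and variable names are assumptions. Evaluated as written, the listed times and mix give an average period of about 8.1 ns and a performance ratio of about 2.5.

```python
# Total time per instruction class in ns, from the solution table.
TIMES_NS = {"load": 8, "store": 7, "r_format": 6, "branch": 5,
            "jump": 2, "fp_mul_div": 20, "fp_add_sub": 12}

# Instruction mix in percent, from the problem statement.
MIX_PERCENT = {"load": 31, "store": 21, "r_format": 27, "branch": 5,
               "jump": 2, "fp_mul_div": 7, "fp_add_sub": 7}

# Fixed-period clock: the cycle must fit the slowest instruction.
fixed_period_ns = max(TIMES_NS.values())

# Variable-period clock: average of the per-class times, weighted by the mix.
variable_period_ns = sum(TIMES_NS[k] * MIX_PERCENT[k] for k in TIMES_NS) / 100

speedup = fixed_period_ns / variable_period_ns
print(fixed_period_ns, variable_period_ns, round(speedup, 2))
```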
Fixing the problem with single-cycle designs
 One solution:
 a variable-period clock with different cycle times for each
instruction class
 infeasible, as implementing a variable-speed clock is technically
difficult
 Another solution:
 use a smaller cycle time…
 …have different instructions take different numbers of cycles
by breaking instructions into steps and fitting each step into one
cycle
 feasible: multicycle approach!
Next
 Multi Cycle
