
DMY

16-bit RISC Microprocessor


Cecilia Florescu, Mojdeh Makabi, Daniel Yee
December 2, 2002

CS M152B


Overview

Purpose: Design a pipelined RISC microprocessor


Design Platform: Xilinx ISE 4.1, ModelSim 5.6, Visual C++ 6.0, Windows 2000 Professional

Pipelining
It acts like an assembly line

[Figure: Ford's auto assembly line with Stations 1 to 4, and two timing diagrams comparing sequential auto production with pipelined auto production over time]

Pipelined RISC
RISC is an acronym for Reduced Instruction Set Computer
  It has a reduced and simple instruction set
  It has a large number of general-purpose registers

In our Pipelined RISC Processor:
  Each instruction takes 1 clock cycle for each stage
  The processor can accept 1 new instruction per clock
  Instructions are processed in stages as they pass down the pipeline
  Multiple instructions are in some phase of execution concurrently
  Pipelining doesn't improve the latency of instructions (each instruction still requires the same amount of time to complete)
  It does improve the overall throughput
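The latency and throughput claims can be checked with a small cycle count. This is a minimal sketch, assuming an ideal five-stage pipeline with no stalls or hazards; the function names and the stage list are illustrative and not taken from the project.

```python
# Stage names follow the five stages named on the slides.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def sequential_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles needed when each instruction finishes before the next starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles needed when one new instruction enters the pipeline per clock.

    Assumes an ideal pipeline: no stalls, flushes, or hazards.
    """
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


def schedule(n_instructions):
    """Map each clock cycle (starting at 1) to the (instruction, stage) pairs active in it."""
    table = {}
    for instr in range(n_instructions):
        for offset, stage in enumerate(STAGES):
            table.setdefault(instr + offset + 1, []).append((instr, stage))
    return table


if __name__ == "__main__":
    n = 4
    print("sequential:", sequential_cycles(n), "cycles")
    print("pipelined: ", pipelined_cycles(n), "cycles")
    for cycle, active in sorted(schedule(n).items()):
        print(cycle, active)
```

Each instruction still spends five cycles in flight, so latency is unchanged. Only the total cycle count for a stream of instructions shrinks.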

Pipelined RISC Design

[Figure: pipelined datapath with PC, adder (+), Instruction Memory, Registers, Sign Extend, ALU, Memory, and Control Unit, separated by the IF/ID, ID/EX, EX/MEM, and MEM/WB pipeline registers]

Instruction Fetch Stage

[Figure: the same pipelined datapath diagram, shown for the instruction fetch stage]

Instruction Decode Stage

[Figure: the same pipelined datapath diagram, shown for the instruction decode stage]

Execution Stage

[Figure: the same pipelined datapath diagram, shown for the execution stage]

Memory Access Stage

[Figure: the same pipelined datapath diagram, shown for the memory access stage]

Write Back Stage

[Figure: the same pipelined datapath diagram, shown for the write back stage]

Modified Pipelined RISC Design

16-bit ISA
16-bit fixed-length instructions, 16 registers
no funct field for R-type, only op field
limited number of operations
  4-bit opcode field => maximum 16 operations

Instruction formats (field widths in bits):
  Suggested R-type   opcode (3)   rs (3)   rt (3)   rd (3)   funct (4)
  R-type             opcode (4)   rs (4)   rt (4)   rd (4)
  I-type             opcode (4)   rs (4)   rt (4)   address (4)
  J-type             opcode (4)   target address (12)
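The 4-bit field layout can be illustrated by packing and unpacking a 16-bit word. This is a minimal sketch, assuming the opcode sits in the most significant bits and the fields follow in the order listed on the slide. The bit positions, the function names, and any opcode values are hypothetical, since the slides do not give them.

```python
def _check_field(name, value, bits):
    """Reject values that do not fit in an unsigned field of the given width."""
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{name}={value} does not fit in {bits} bits")


def encode_r(opcode, rs, rt, rd):
    """Pack an R-type instruction: opcode | rs | rt | rd, 4 bits each (assumed order)."""
    for name, value in (("opcode", opcode), ("rs", rs), ("rt", rt), ("rd", rd)):
        _check_field(name, value, 4)
    return (opcode << 12) | (rs << 8) | (rt << 4) | rd


def decode_r(word):
    """Unpack a 16-bit R-type word into (opcode, rs, rt, rd)."""
    return ((word >> 12) & 0xF, (word >> 8) & 0xF, (word >> 4) & 0xF, word & 0xF)


def encode_j(opcode, target):
    """Pack a J-type instruction: 4-bit opcode followed by a 12-bit target address."""
    _check_field("opcode", opcode, 4)
    _check_field("target", target, 12)
    return (opcode << 12) | target


if __name__ == "__main__":
    word = encode_r(0x1, 2, 3, 4)  # opcode value 0x1 is a placeholder
    print(f"{word:#06x}", decode_r(word))
```

With four bits per register field, every one of the 16 registers is addressable, which is what the wider 4-bit fields buy over the suggested 3-bit layout.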

Multiplier Algorithms

Pencil-and-paper method

      10101
x      1011
-----------
      10101
     101010
    0000000
+  10101000
-----------
   11100111

requires M cycles for one NxM multiplication
implemented with AND, adder, and shift register
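The pencil-and-paper method maps onto a shift-and-add loop that runs once per multiplier bit. The sketch below is illustrative only and is not the project's hardware description. It reuses the slide's multiplicand 10101 and assumes the multiplier is 1011, the value consistent with the product 11100111.

```python
def shift_add_multiply(multiplicand, multiplier, m_bits):
    """Multiply two unsigned integers, one multiplier bit per iteration (cycle).

    m_bits is the width M of the multiplier, so the loop runs M times.
    """
    product = 0
    for cycle in range(m_bits):
        bit = (multiplier >> cycle) & 1
        # AND of the multiplicand with the current multiplier bit, then shift.
        partial_product = (multiplicand << cycle) if bit else 0
        # The adder accumulates the shifted partial product.
        product += partial_product
    return product


if __name__ == "__main__":
    result = shift_add_multiply(0b10101, 0b1011, m_bits=4)
    print(bin(result), result)
```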


Multiplier Algorithms

Array Multiplier

Multiplier Algorithms

Modified Booth Encoding (MBE)
reduces the number of partial products from N to N/2 for MxN multiplication
performs parallel encoding v. serial encoding in original Booth

  Y(2i+1)   Y(2i)   Y(2i-1)   Operation on X
     0        0        0         0 x X
     0        0        1        +1 x X
     0        1        0        +1 x X
     0        1        1        +2 x X
     1        0        0        -2 x X
     1        0        1        -1 x X
     1        1        0        -1 x X
     1        1        1         0 x X
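The encoding table can be exercised in software. The following is a minimal sketch, assuming an even operand width and a two's complement multiplier; the names `BOOTH_TABLE`, `booth_digits`, and `booth_multiply` are illustrative. It computes the digits in a loop, whereas hardware would encode all overlapping triplets in parallel.

```python
# (Y[2i+1], Y[2i], Y[2i-1]) -> multiple of X, as in the MBE table.
BOOTH_TABLE = {
    (0, 0, 0): 0, (0, 0, 1): 1, (0, 1, 0): 1, (0, 1, 1): 2,
    (1, 0, 0): -2, (1, 0, 1): -1, (1, 1, 0): -1, (1, 1, 1): 0,
}


def booth_digits(y, n_bits):
    """Return the n_bits/2 radix-4 Booth digits of y (least significant first).

    y is read as an n_bits-wide two's complement number; n_bits must be even.
    """
    if n_bits % 2:
        raise ValueError("n_bits must be even")
    y &= (1 << n_bits) - 1
    padded = y << 1  # append the implicit Y[-1] = 0 below the LSB
    digits = []
    for i in range(n_bits // 2):
        triplet = (padded >> (2 * i)) & 0b111
        key = ((triplet >> 2) & 1, (triplet >> 1) & 1, triplet & 1)
        digits.append(BOOTH_TABLE[key])
    return digits


def booth_multiply(x, y, n_bits=8):
    """Multiply x by the signed n_bits-wide y using N/2 partial products."""
    return sum(d * x * 4 ** i for i, d in enumerate(booth_digits(y, n_bits)))


if __name__ == "__main__":
    print(booth_digits(11, 8))
    print(booth_multiply(21, 11))
```

An 8-bit multiplier yields four digits, so only four partial products have to be summed instead of eight.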

Multiplier Algorithms

Wallace Tree
increases speed of summing by increased parallelism
all bits of PP in each column are added independently and simultaneously
x-2 compressor composed of CSAs; x := the number of PPs in column

[Figure: 9-2 compressor for column j. Partial-product bits P0j to P8j feed a tree of 3-2 compressors and a 4-2 compressor; carries c1 to c6 pass between columns j-1 and j; the outputs are Sum[j] and Carry[j]]
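The reduction idea can be sketched at word level with carry-save adders (3-2 compressors). This is an illustrative model for unsigned operands, not the gate-level 9-2 compressor on the slide; the function names are assumptions.

```python
def csa(a, b, c):
    """3-2 compressor on whole words: three operands in, sum word and carry word out.

    No carry propagates inside the compressor; a + b + c == s + carry.
    """
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s, carry


def wallace_reduce(rows):
    """Compress a list of non-negative operands down to at most two rows."""
    rows = list(rows)
    while len(rows) > 2:
        next_rows = []
        full = len(rows) - len(rows) % 3
        # Each group of three rows is compressed independently (in parallel in hardware).
        for i in range(0, full, 3):
            next_rows.extend(csa(rows[i], rows[i + 1], rows[i + 2]))
        next_rows.extend(rows[full:])  # leftover rows pass to the next level
        rows = next_rows
    return rows


def wallace_multiply(x, y, n_bits=8):
    """Unsigned multiply: generate partial products, reduce, then one final add."""
    partial_products = [(x << i) if (y >> i) & 1 else 0 for i in range(n_bits)]
    return sum(wallace_reduce(partial_products))


if __name__ == "__main__":
    print(wallace_reduce([21, 42, 0, 168]))
    print(wallace_multiply(21, 11))
```

Only the last step needs a carry-propagating adder, which is where the speed advantage over summing the rows one after another comes from.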

Multiplier Design

Issues and Solutions

limited opcode size
  made NOP instruction ADD $0, $0, $0 => freed one opcode
  ADD instruction doesn't change register $0 (constant zero value)
latency v. simplicity
  multiplier lies in critical path; must calculate product in one cycle
  algorithms trade simplicity of control and/or wiring for faster speed
  multiplier latency not detrimental if n is small enough => 8x8 multiplier
negative and positive integer multiplication
  8 LSB of 16-bit operand taken as a two's complement number
  sign detection unit detects signs of operands and sets product sign
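The signed 8x8 scheme can be modeled in a few lines. This is a minimal sketch, assuming the product is returned as a 16-bit two's complement word; the function names are hypothetical, and the magnitude multiply stands in for whichever multiplier algorithm the hardware uses.

```python
def to_signed8(word16):
    """Take the 8 least significant bits of a 16-bit operand as a two's complement number."""
    low = word16 & 0xFF
    return low - 256 if low & 0x80 else low


def multiply_8x8(a16, b16):
    """Signed 8x8 multiply with separate sign detection.

    The magnitudes are multiplied as unsigned values and the sign is set
    afterwards, mirroring a sign detection unit. Result is a 16-bit word.
    """
    a, b = to_signed8(a16), to_signed8(b16)
    negative = (a < 0) != (b < 0)      # sign detection: signs differ
    magnitude = abs(a) * abs(b)        # unsigned 8x8 multiply
    product = -magnitude if negative else magnitude
    return product & 0xFFFF            # encode as 16-bit two's complement


if __name__ == "__main__":
    print(hex(multiply_8x8(0x0003, 0x00FD)))  # 3 * -3
    print(multiply_8x8(0x1205, 0x0007))       # upper byte of the operand is ignored
```

The largest magnitude, 128 x 128 = 16384, fits in 16 bits, so an 8x8 product cannot overflow the result register.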

Exception Managing Hardware

Pipeline Modifications
EPC register tracks the problematic instruction
EPC_2 register to hold the instruction to return to, if allowed
Expansion of control unit to detect overflow signal and handle exception

[Figure: pipelined datapath (IF/ID, ID/EX, EX/MEM, MEM/WB) extended with the EPC and EPC 2 registers, an Overflow signal, a Subrt Addr input, Clk, and Data Input]


Arithmetic Overflow Handler

Software Support
  Assurance that MEM and WB stages of pipeline continue execution
  Interruption of program
  Request to involve the operating system
  Enhancement of ISA
    MFCO - move from coprocessor
    JR - jump to address stored in reserved register

Overflow handling flow
  ALU performs arithmetic operations
  Is Overflow signal high?
    NO: Instruction continues to MEM stage
    YES: Control Unit has been notified, and takes corrective action
      Instruction in MEM_WB latch will continue
      Instructions in IF_ID and ID_EXE latches will be flushed
      Content of EPC will be stored in register $15
      PC will jump to overflow handling subroutine
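The corrective actions in the YES branch can be written as a small state update. This is a behavioral sketch only: the `Pipeline` class, its field names, the `NOP` marker, and the handler address are hypothetical, and it assumes EPC already holds the address of the instruction that overflowed.

```python
from dataclasses import dataclass, field

NOP = "NOP"            # placeholder for a flushed latch
HANDLER_ADDR = 49152   # hypothetical address of the overflow handling subroutine


@dataclass
class Pipeline:
    pc: int = 0
    epc: int = 0
    regs: list = field(default_factory=lambda: [0] * 16)
    if_id: str = NOP
    id_exe: str = NOP
    exe_mem: str = NOP
    mem_wb: str = NOP


def handle_overflow(p, overflow, handler_addr=HANDLER_ADDR):
    """Apply the control unit's corrective action when the Overflow signal is high."""
    if not overflow:
        return p                 # NO branch: the instruction continues to MEM
    p.if_id = NOP                # flush the younger instructions
    p.id_exe = NOP
    # mem_wb is left untouched so the older instruction completes
    p.regs[15] = p.epc           # save EPC in the reserved register $15
    p.pc = handler_addr          # jump to the overflow handling subroutine
    return p


if __name__ == "__main__":
    p = Pipeline(pc=105, epc=103, if_id="instr@105", id_exe="instr@104", mem_wb="instr@102")
    print(handle_overflow(p, overflow=True))
```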

Overflow Example
Instruction stored at address 103: 32 + 65527 = 65559
Note: 2^16 = 65536 < 65559

[Figure: simulation waveform of Clock, Op A, Op B, ALU Out, Overflow, IF_Flash, ID_Flash, PC, and PC Jump. Op A = 32, Op B = 65527, ALU Out = 23; PC values shown include 103, 104, 105, 49152, and 49153]
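The arithmetic in this example can be reproduced directly. The sketch below treats both operands as unsigned 16-bit values, as the comparison with 2^16 on the slide does; `add16` is an illustrative name, and a signed overflow check would use a different condition.

```python
def add16(a, b):
    """Add two unsigned 16-bit values.

    Returns (result truncated to 16 bits, overflow flag), where the flag is
    set when the true sum does not fit in 16 bits.
    """
    total = a + b
    return total & 0xFFFF, total > 0xFFFF


if __name__ == "__main__":
    result, overflow = add16(32, 65527)
    print(result, overflow)
```

The true sum 65559 exceeds 65535, so only 65559 - 65536 = 23 remains in the 16-bit result, which matches the ALU Out value in the waveform.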

Conclusion
16-bit processor, enhanced with a multiplier and able to detect arithmetic overflow
Harvard Architecture model for memory management
14 multipurpose, 2 reserved registers
Advantages and disadvantages of designed 16-bit ISA


