
EE382

Processor Design
Stanford University
Winter Quarter 1998-1999
Instructor: Michael Flynn
Teaching Assistant: Steve Chou
Administrative Assistant: Susan Gere
Lecture 1 - Introduction

Michael Flynn EE382 Winter/99 Slide 1


Class Objectives
• Learn theoretical analysis and limits
— develop intuition
— project long-term trends and bound the design space more efficiently than simulation
• Learn models for VLSI component cost tradeoffs
— emphasis on the microprocessor
• Learn modeling techniques for computer system performance
— emphasis on queuing
• Put it all together to balance system performance and cost
— emphasis on multiprocessors, memory, and I/O
— practical examples and design targets

Michael Flynn EE382 Winter/99 Slide 2


Course Prerequisites
• Computer Architecture and Organization (EE282)
— Instruction Set Architecture
— Machine Organization
— Basic Pipeline Design
— Cache Organization
— Branch Prediction
— Superscalar Execution
  • In-Order
  • Out-of-Order
• Statistics
— Basic probability
  • distribution functions
  • statistical measures
— Familiarity with stochastic processes and Markov models is helpful, but not required

Michael Flynn EE382 Winter/99 Slide 3


Course Information
• Access to the course web page is necessary
  http://www-leland.stanford.edu/class/ee382/
— Course info, assignments, old exams, design tools, FAQs, ...
• Textbook and reference material
— Computer Architecture: Pipelined and Parallel Processor Design, Michael J. Flynn
• Problem set and design problem philosophy
— Learn by doing: maximize learning/effort
• Exam philosophy
— Extend what you have learned
— Open-book, not a speed or trick contest
• You are expected to give us feedback
— Questions, office hours, email, surveys

Michael Flynn EE382 Winter/99 Slide 4


Grading
• Problem Sets and Design Problems 40%
— 6 problem sets
— 2 design problems
• Midterm 20%
• Final Exam 40%
— Covers the entire course
— Scheduled March 15, 8:30-11:30 AM

Michael Flynn EE382 Winter/99 Slide 5


Key Concepts of Abstraction
• Instruction Set Architecture (ISA)
— Functional interface for the assembly-language programmer
— Examples: SGI MIPS, Sun SPARC, PowerPC, HPPA, DEC Alpha, Intel (x86), IBM System/390, IBM AS/400
• Implementation (Machine Organization)
— Partitioning into units and logic design
— Examples
  • Intel386 CPU, Intel486 CPU, Pentium® Processor, Pentium® Pro Processor
  • Alpha 21064, 21164, 21264
• Realization
— Physical fabrication and assembly
— Examples
  • IBM 709 ('54) built with vacuum tubes and 7090 ('59) built with transistors
  • Pentium Processor in 0.8 µm, 0.6 µm, and 0.35 µm BiCMOS/CMOS

Michael Flynn EE382 Winter/99 Slide 6


Instruction Set Architecture
• “... the attributes of a [computing] system as seen by the programmer, i.e. the conceptual structure and functional behavior, as distinct from the organization of the data flow and controls, the logical design, and the physical implementation.”
  Amdahl, Blaauw, and Brooks, 1964
• Consists of:
— Organization of storage
— Data types
— Encodings and representations (instruction formats)
— Instruction (or Operation Code) Set
— Modes for addressing data items and instructions
— Program-visible exceptional conditions
• Specifies requirements for binary compatibility across implementations

Michael Flynn EE382 Winter/99 Slide 7


Instruction Set Types
• Load/Store (L/S)
— Only load and store instructions refer to memory
  • no memory ALU ops
— Used by several microprocessors
  • PowerPC, HP, DEC Alpha
• Register/Memory (R/M)
— ALU operations can have either a source or the destination in memory
— Used by mainframes and most microprocessors
  • IBM System/370, Intel Architecture (x86), all x86 compatibles
• Register or Memory (R+M)
— ALU operations can have any/all operands in memory
— Not commonly used now
  • DEC VAX

Michael Flynn EE382 Winter/99 Slide 8


L/S ISA General Characteristics
• 32 GPRs x 32b ... more recently 64b
• instruction size: 32b ... more recently 64b
• instruction types
— R1 <- R2 op R3 for ALU ops
— R1 <-> MEM [RB,D] for LD/ST
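
For example, a statement such as A = B + C, with all three variables in memory, might look as follows in the L/S style. This is only a sketch: the register numbers and the displacements DA, DB, DC are illustrative, and RB is assumed to hold a base address.

  R2 <- MEM [RB,DB]    ; load B
  R3 <- MEM [RB,DC]    ; load C
  R1 <- R2 + R3        ; ALU op operates only on registers
  MEM [RB,DA] <- R1    ; store A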

Michael Flynn EE382 Winter/99 Slide 9


R/M ISA General Characteristics
• 16 GPRs x 32b
• instruction sizes: 16b, 32b, 48b
• instruction types
— RR: R1 <- R1 op R2
— RM: R1 <- R1 op MEM [RB,RX,D]
— MM: MEM1 [RB,RX,D] <- MEM1 [RB,RX,D] op MEM2 [RB,RX,D], used for character and decimal ops only
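
The same A = B + C example in the R/M style needs one fewer instruction, because the add can take one operand directly from memory. Again a sketch only, with illustrative registers, index/base registers, and displacements:

  R1 <- MEM [RB,RX,DB]       ; load B
  R1 <- R1 + MEM [RB,RX,DC]  ; RM ALU op: add C from memory
  MEM [RB,RX,DA] <- R1       ; store A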

Michael Flynn EE382 Winter/99 Slide 10


ISA Syntax Terminology
• OP.type destination, source1, source2
— e.g. ADD.F R1,R2,R3 puts the result of a floating-point add in floating register 1
— OP without a type implies integer, unless fp is clear from the context
— the destination is always the first operand, so a store is ST MEM [RB,RX,D], R2
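
As a sketch of how this syntax reads in practice, the L/S sequence for A = B + C from slide 9 can be written with mnemonics. The LD mnemonic and the displacement names DA, DB, DC are assumptions for illustration only:

  LD R2, MEM [RB,DB]    ; R2 <- B
  LD R3, MEM [RB,DC]    ; R3 <- C
  ADD R1, R2, R3        ; integer add: destination, source1, source2
  ST MEM [RB,DA], R1    ; A <- R1; the destination comes first, even for a store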

Michael Flynn EE382 Winter/99 Slide 11


ISA Assumptions
• assume all instruction sets have a PSW and condition codes (CC)
• Branch is BC.CC target, where target is either a register or a memory address
• unconditional branch is BR, even though it's implemented with BC
• other branches: BCT (branch on count), BAL (branch and link)
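
As a small illustration of these assumptions, a conditional branch might be written as follows. The CMP mnemonic (an explicit compare that sets the condition codes) and the label names are assumptions for illustration:

  CMP R1, R2      ; compare R1 and R2, setting CC
  BC.GT bigger    ; branch to 'bigger' if CC indicates R1 > R2
  BR next         ; unconditional branch, implemented with BC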

Michael Flynn EE382 Winter/99 Slide 12


Moore’s Law
[Chart: transistors per die (log scale, 1 to 10^8) vs. year, 1970-2000. The memory curve runs from 1K through 4K, 16K, 64K, 256K, 1M, and 4M to 16M; the microprocessor curve runs from the 4004 through the 8080, 8086, 80286, Intel386 and Intel486 processors to the Pentium processor.]

Moore’s Law: the number of transistors per chip increases 4X every 3 years (CAGR = 60%)
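
As a quick check, the two figures agree: 4X every 3 years corresponds to a compound annual growth rate of 4^(1/3) ≈ 1.59, i.e. roughly 60% per year.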

Michael Flynn EE382 Winter/99 Source: Intel Slide 13


Die Size Growth
[Chart: die size (mm², log scale, 10 to 1000) vs. year, 1975-2000. The logic curve runs from the 8086 and 68000 through the 80286, 68020, 80386, 80486, and 68040 to the Pentium processor; the DRAM curve runs from 64K through 256K, 1M, and 4M to 16M. Die size grows steadily for both.]
Michael Flynn EE382 Winter/99 Source: Intel Slide 14
Finer Lithography
[Chart: resolution (µm, log scale, 0.01 to 10) vs. year, 1983-2001, with curves for resolution, overlay, and CD control. Process generations step down from 1 µm through 0.8, 0.5, and 0.35 to 0.25 µm.]
Michael Flynn EE382 Winter/99 Source: Intel Slide 15
Limits on Scaling
• As device sizes get smaller, it becomes harder to maintain the rate of downsizing of feature sizes
• It currently appears that around 50 nm several factors may limit scaling
— hot-carrier effects
— time-dependent dielectric breakdown
— gate tunneling current
— short-channel effects and their effect on VT

Michael Flynn EE382 Winter/99 Slide 16


Beyond CMOS MOSFETs
• If these “limits” prove real, there are alternative technologies with system-level implications
— low-temperature CMOS
— subthreshold logic
— new gate oxide materials
— SOI (silicon-on-insulator)

Michael Flynn EE382 Winter/99 Slide 17


Fabrication Facility Costs
[Chart: fabrication facility cost (millions of dollars, log scale, 1 to 10,000) vs. year, 1965-2000, rising steadily over the period. Source: VLSI Research, Inc.]

Moore’s Second Law: fab costs grow 40% per year
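
As a quick check on what 40% per year implies: fab cost roughly doubles every two years (1.4^2 ≈ 1.96) and grows by roughly 30X per decade (1.4^10 ≈ 29).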


Michael Flynn EE382 Winter/99 Slide 18
Microprocessor Business Model
• New “generation” of silicon technology every 2.5-3 years
— 30% reduction in linear dimensions => 50% reduction in area (see the arithmetic check after this list)
— 30% reduction in device delay => 50% increase in speed
— Used to reduce cost and improve performance of the previous-generation microprocessor
— Used to enable a new generation of microprocessor with a wider, more parallel, more functional machine organization
— Incremental changes between generations
• Business growth enables investment in new technology
— Driven by performance, new applications, and “dancing bunny people”
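
Checking the scaling arithmetic above: shrinking linear dimensions to 0.7X gives an area of 0.7^2 ≈ 0.49, i.e. roughly half, and if device delay drops to 0.7X, speed rises by a factor of 1/0.7 ≈ 1.43, i.e. on the order of the 50% quoted above.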

Michael Flynn EE382 Winter/99 Slide 19


Performance Growth
[Chart: workstation performance (0 to 1200) vs. year, 1987-1997; Figure 1.20 from P&H. Machines shown include the SUN-4/260, MIPS M/120, MIPS M2000, IBM RS6000, IBM POWER 100, HP 9000/750, DEC AXP/500, DEC Alpha 4/266, DEC Alpha 5/300, DEC Alpha 5/500, and DEC Alpha 21264/600.]

Workstation performance improving 54% per year
That’s almost 1% per week!
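
As a quick check: 54% per year compounds to 1.54^(1/52) ≈ 1.008 per week, i.e. a bit over 0.8% per week.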
Michael Flynn EE382 Winter/99 Slide 20
PC Shipment Growth

Performance Growth and New Applications Drive Volume

Source: Dataquest by A. Yu in IEEE Micro 12/96


Michael Flynn EE382 Winter/99 Slide 21
System Price/Performance

               1965                1977              1998
System         IBM System 360/50   DEC VAX 11/780    Dell Dimension XPS-300
Performance    0.15 MIPS           1 MIPS            725 MIPS
Memory         64 KB               1 MB              64 MB
Price          $1M                 $200K             $2412 (1/4/98)
Price/MIPS     $6.6M per MIPS      $200K per MIPS    $3.33 per MIPS
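
The last row is simply price divided by performance: for example, $2412 / 725 MIPS ≈ $3.33 per MIPS for the 1998 system, compared with $6.6M per MIPS in 1965, roughly a 2,000,000X improvement in price/performance over 33 years.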

Photographs from Virtual Computing History Group


Michael Flynn EE382 Winter/99 Slide 22
Representative System
[Block diagram: one or more CPUs, each with pipelines, registers, an L1 instruction cache, an L1 data cache, and an L2 cache, connected through a chipset to memory and to the I/O bus(es).]
Michael Flynn EE382 Winter/99 Slide 23


Summary
• Current architectures exploit parallelism for performance
— Multiple pipelines and caches
— Multiprocessors
• Technology costs are increasing rapidly
— High volume is critical to recover costs
  • interface standards and evolution are necessary
— Product success depends on cost-effective area allocation and partitioning
• Technology capacity and performance are increasing rapidly
— Critical to evaluate a broad space of design options at each generation
  • Opportunity to learn from the past and to innovate

Theoretical analysis and modeling, combined with design targets, are powerful tools for developing computer systems. This course will help prepare you to apply those tools in your future career, in theory or in practice.

Michael Flynn EE382 Winter/99 Slide 24


This Week
• Check access to the web page
— Make sure you can read and print
— The first problem set will be posted by Friday
• Reading
— Scan Chapter 1
— Sections 2.1, 2.2
• Room Change
— move to Gates B03
— no festival Friday lecture

Michael Flynn EE382 Winter/99 Slide 25
