
EE382

Processor Design
Stanford University
Winter Quarter 1998-1999
Instructor: Michael Flynn
Teaching Assistant: Steve Chou
Administrative Assistant: Susan Gere
Lecture 1 - Introduction

Michael Flynn EE382 Winter/99 Slide 1


Class Objectives
• Learn theoretical analysis and limits
— develop intuition
— project long-term trends and bound the design space more efficiently than simulation
• Learn models for VLSI component cost tradeoffs
— emphasis on the microprocessor
• Learn modeling techniques for computer system performance
— emphasis on queuing
• Put it all together to balance system performance and cost
— emphasis on multiprocessors, memory, and I/O
— practical examples and design targets

Michael Flynn EE382 Winter/99 Slide 2


Course Prerequisites
• Computer Architecture and Organization (EE282)
— Instruction Set Architecture
— Machine Organization
— Basic Pipeline Design
— Cache Organization
— Branch Prediction
— Superscalar Execution
  • In-Order
  • Out-of-Order
• Statistics
— Basic probability
  • distribution functions
  • statistical measures
— Familiarity with stochastic processes and Markov models is helpful, but not required

Michael Flynn EE382 Winter/99 Slide 3


Course Information
• Access to the course web page is necessary
  http://www-leland.stanford.edu/class/ee382/
— Course info, assignments, old exams, design tools, FAQs, ...
• Textbook and reference material
— Computer Architecture: Pipelined and Parallel Processor Design, Michael J. Flynn
• Problem set and design problem philosophy
— Learn by doing: maximize learning/effort
• Exam philosophy
— Extend what you have learned
— Open-book, not a speed or trick contest
• You are expected to give us feedback
— Questions, office hours, email, surveys

Michael Flynn EE382 Winter/99 Slide 4


Grading
• Problem Sets and Design Problems 40%
— 6 problem sets
— 2 design problems
• Midterm 20%
• Final Exam 40%
— Covers the entire course
— Scheduled March 15, 8:30-11:30 AM

Michael Flynn EE382 Winter/99 Slide 5


Key Concepts of Abstraction
• Instruction Set Architecture (ISA)
— Functional interface for the assembly-language programmer
— Examples: SGI MIPS, Sun SPARC, PowerPC, HPPA, DEC Alpha, Intel (x86), IBM System/390, IBM AS/400
• Implementation (Machine Organization)
— Partitioning into units and logic design
— Examples
  • Intel386 CPU, Intel486 CPU, Pentium® Processor, Pentium® Pro Processor
  • Alpha 21064, 21164, 21264
• Realization
— Physical fabrication and assembly
— Examples
  • IBM 709 ('54) built with vacuum tubes and 7090 ('59) built with transistors
  • Pentium Processor in 0.8 µm, 0.6 µm, and 0.35 µm BiCMOS/CMOS

Michael Flynn EE382 Winter/99 Slide 6


Instruction Set Architecture
• “... the attributes of a [computing] system as seen by the programmer, i.e. the conceptual structure and functional behavior, as distinct from the organization of the data flow and controls, the logical design, and the physical implementation.”
  Amdahl, Blaauw, and Brooks, 1964
• Consists of:
— Organization of storage
— Data types
— Encodings and representations (instruction formats)
— Instruction (or Operation Code) Set
— Modes for addressing data items and instructions
— Program-visible exceptional conditions
• Specifies requirements for binary compatibility across implementations

Michael Flynn EE382 Winter/99 Slide 7


Instruction Set Types
• Load/Store (L/S)
— Only load and store instructions refer to memory
  • no memory ALU ops
— Used by several microprocessors
  • PowerPC, HP, DEC Alpha
• Register/Memory (R/M)
— ALU operations can have either a source or the destination in memory
— Used by mainframes and most microprocessors
  • IBM System/370, Intel Architecture (x86), all x86 compatibles
• Register or Memory (R+M)
— ALU operations can have any/all operands in memory
— Not commonly used now
  • DEC VAX

Michael Flynn EE382 Winter/99 Slide 8


L/S ISA General Characteristics
• 32 GPRs x 32b ... more recently 64b
• instruction size: 32b ... more recently 64b
• instruction types
— R1 <- R2 op R3 for ALU ops
— R1 <-> MEM [RB,D] for LD/ST
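
For example, a statement such as A = B + C, with all three variables in memory, might look as follows in the L/S style. This is only a sketch: the register numbers and the displacements DA, DB, DC are illustrative, and RB is assumed to hold a base address.

  R2 <- MEM [RB,DB]    ; load B
  R3 <- MEM [RB,DC]    ; load C
  R1 <- R2 + R3        ; ALU op operates only on registers
  MEM [RB,DA] <- R1    ; store A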

Michael Flynn EE382 Winter/99 Slide 9


R/M ISA General Characteristics
• 16 GPRs x 32b
• instruction sizes: 16b, 32b, 48b
• instruction types
— RR: R1 <- R1 op R2
— RM: R1 <- R1 op MEM [RB,RX,D]
— MM: MEM1 [RB,RX,D] <- MEM1 [RB,RX,D] op MEM2 [RB,RX,D], used for character and decimal ops only
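
The same A = B + C example in the R/M style needs one fewer instruction, because the add can take one operand directly from memory. Again a sketch only, with illustrative registers, index/base registers, and displacements:

  R1 <- MEM [RB,RX,DB]       ; load B
  R1 <- R1 + MEM [RB,RX,DC]  ; RM ALU op: add C from memory
  MEM [RB,RX,DA] <- R1       ; store A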

Michael Flynn EE382 Winter/99 Slide 10


ISA Syntax Terminology
• OP.type destination, source1, source2
— e.g. ADD.F R1,R2,R3 puts the result of a floating-point add in floating register 1
— OP without a type implies integer, unless fp is clear from the context
— the destination is always the first operand, so a store is ST MEM [RB,RX,D], R2
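
As a sketch of how this syntax reads in practice, the L/S sequence for A = B + C from slide 9 can be written with mnemonics. The LD mnemonic and the displacement names DA, DB, DC are assumptions for illustration only:

  LD R2, MEM [RB,DB]    ; R2 <- B
  LD R3, MEM [RB,DC]    ; R3 <- C
  ADD R1, R2, R3        ; integer add: destination, source1, source2
  ST MEM [RB,DA], R1    ; A <- R1; the destination comes first, even for a store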

Michael Flynn EE382 Winter/99 Slide 11


ISA Assumptions
• assume all instruction sets have a PSW and condition codes (CC)
• Branch is BC.CC target, where target is either a register or a memory address
• unconditional branch is BR, even though it's implemented with BC
• other branches: BCT (branch on count), BAL (branch and link)
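
As a small illustration of these assumptions, a conditional branch might be written as follows. The CMP mnemonic (an explicit compare that sets the condition codes) and the label names are assumptions for illustration:

  CMP R1, R2      ; compare R1 and R2, setting CC
  BC.GT bigger    ; branch to 'bigger' if CC indicates R1 > R2
  BR next         ; unconditional branch, implemented with BC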

Michael Flynn EE382 Winter/99 Slide 12


Moore’s Law
[Chart: transistors per die (log scale, 1 to 10^8) vs. year, 1970-2000. The memory curve runs from 1K through 4K, 16K, 64K, 256K, 1M, and 4M to 16M; the microprocessor curve runs from the 4004 through the 8080, 8086, 80286, Intel386 and Intel486 processors to the Pentium processor.]

Moore’s Law: the number of transistors per chip increases 4X every 3 years (CAGR = 60%)
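
As a quick check, the two figures agree: 4X every 3 years corresponds to a compound annual growth rate of 4^(1/3) ≈ 1.59, i.e. roughly 60% per year.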

Michael Flynn EE382 Winter/99 Source: Intel Slide 13


Die Size Growth
[Chart: die size (mm², log scale, 10 to 1000) vs. year, 1975-2000. The logic curve runs from the 8086 and 68000 through the 80286, 68020, 80386, 80486, and 68040 to the Pentium processor; the DRAM curve runs from 64K through 256K, 1M, and 4M to 16M. Die size grows steadily for both.]
Michael Flynn EE382 Winter/99 Source: Intel Slide 14
Finer Lithography
[Chart: resolution (µm, log scale, 0.01 to 10) vs. year, 1983-2001, with curves for resolution, overlay, and CD control. Process generations step down from 1 µm through 0.8, 0.5, and 0.35 to 0.25 µm.]
Michael Flynn EE382 Winter/99 Source: Intel Slide 15
Limits on Scaling
• As device sizes get smaller, it becomes harder to maintain the rate of downsizing of feature sizes
• It currently appears that around 50 nm several factors may limit scaling
— hot-carrier effects
— time-dependent dielectric breakdown
— gate tunneling current
— short-channel effects and their effect on VT

Michael Flynn EE382 Winter/99 Slide 16


Beyond CMOS MOSFETs
• If these “limits” prove real, there are alternative technologies with system-level implications
— low-temperature CMOS
— subthreshold logic
— new gate oxide materials
— SOI (silicon-on-insulator)

Michael Flynn EE382 Winter/99 Slide 17


Fabrication Facility Costs
[Chart: fabrication facility cost (millions of dollars, log scale, 1 to 10,000) vs. year, 1965-2000, rising steadily over the period. Source: VLSI Research, Inc.]

Moore’s Second Law: fab costs grow 40% per year
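
As a quick check on what 40% per year implies: fab cost roughly doubles every two years (1.4^2 ≈ 1.96) and grows by roughly 30X per decade (1.4^10 ≈ 29).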


Michael Flynn EE382 Winter/99 Slide 18
Microprocessor Business Model
• New “generation” of silicon technology every 2.5-3 years
— 30% reduction in linear dimensions => 50% reduction in area (see the arithmetic check after this list)
— 30% reduction in device delay => 50% increase in speed
— Used to reduce cost and improve performance of the previous-generation microprocessor
— Used to enable a new generation of microprocessor with a wider, more parallel, more functional machine organization
— Incremental changes between generations
• Business growth enables investment in new technology
— Driven by performance, new applications, and “dancing bunny people”
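
Checking the scaling arithmetic above: shrinking linear dimensions to 0.7X gives an area of 0.7^2 ≈ 0.49, i.e. roughly half, and if device delay drops to 0.7X, speed rises by a factor of 1/0.7 ≈ 1.43, i.e. on the order of the 50% quoted above.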

Michael Flynn EE382 Winter/99 Slide 19


Performance Growth
[Chart: workstation performance (0 to 1200) vs. year, 1987-1997; Figure 1.20 from P&H. Machines shown include the SUN-4/260, MIPS M/120, MIPS M2000, IBM RS6000, IBM POWER 100, HP 9000/750, DEC AXP/500, DEC Alpha 4/266, DEC Alpha 5/300, DEC Alpha 5/500, and DEC Alpha 21264/600.]

Workstation performance improving 54% per year
That’s almost 1% per week!
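
As a quick check: 54% per year compounds to 1.54^(1/52) ≈ 1.008 per week, i.e. a bit over 0.8% per week.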
Michael Flynn EE382 Winter/99 Slide 20
PC Shipment Growth

Performance Growth and New Applications Drive Volume

Source: Dataquest by A. Yu in IEEE Micro 12/96


Michael Flynn EE382 Winter/99 Slide 21
System Price/Performance

               1965                1977              1998
System         IBM System 360/50   DEC VAX 11/780    Dell Dimension XPS-300
Performance    0.15 MIPS           1 MIPS            725 MIPS
Memory         64 KB               1 MB              64 MB
Price          $1M                 $200K             $2412 (1/4/98)
Price/MIPS     $6.6M per MIPS      $200K per MIPS    $3.33 per MIPS
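
The last row is simply price divided by performance: for example, $2412 / 725 MIPS ≈ $3.33 per MIPS for the 1998 system, compared with $6.6M per MIPS in 1965, roughly a 2,000,000X improvement in price/performance over 33 years.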

Photographs from Virtual Computing History Group


Michael Flynn EE382 Winter/99 Slide 22
Representative System
[Block diagram: one or more CPUs, each with pipelines, registers, an L1 instruction cache, an L1 data cache, and an L2 cache, connected through a chipset to memory and to the I/O bus(es).]
Michael Flynn EE382 Winter/99 Slide 23


Summary
• Current architectures exploit parallelism for performance
— Multiple pipelines and caches
— Multiprocessors
• Technology costs are increasing rapidly
— High volume is critical to recover costs
  • interface standards and evolution are necessary
— Product success depends on cost-effective area allocation and partitioning
• Technology capacity and performance are increasing rapidly
— Critical to evaluate a broad space of design options at each generation
  • Opportunity to learn from the past and to innovate

Theoretical analysis and modeling, combined with design targets, are powerful tools for developing computer systems. This course will help prepare you to apply those tools in your future career, in theory or in practice.

Michael Flynn EE382 Winter/99 Slide 24


This Week
• Check access to the web page
— Make sure you can read and print
— The first problem set will be posted by Friday
• Reading
— Scan Chapter 1
— Sections 2.1, 2.2
• Room Change
— move to Gates B03
— no festival Friday lecture

Michael Flynn EE382 Winter/99 Slide 25
