Introduction
Instruction set architecture (ISA) design, i.e. decisions regarding:
registers, memory addressing, addressing modes, instruction operands, available operations, control flow instructions, instruction encoding
Computer Technology
Performance improvements:
  Improvements in semiconductor technology
    Feature size, clock speed
  Improvements in computer architectures
    Enabled by HLL compilers, UNIX
    Led to RISC architectures

Single Processor Performance
(Figure annotation: move to multi-processor)
Copyright 2012, Elsevier Inc. All rights reserved.
Parallelism
Classes of parallelism in applications:
  Data-Level Parallelism (DLP)
  Task-Level Parallelism (TLP)

Trends in Technology
Integrated circuit technology
  Transistor density: 35%/year
  Die size: 10-20%/year
  Integration overall: 40-55%/year
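
The yearly rates listed under Trends in Technology compound, so modest-looking percentages grow quickly. The sketch below is only an illustration: the function names and the 10-year horizon are assumptions rather than part of the course material, and it treats each rate as constant from year to year.

```python
import math


def growth_factor(rate_per_year: float, years: float) -> float:
    """Cumulative growth factor after compounding a constant yearly rate."""
    return (1.0 + rate_per_year) ** years


def doubling_time_years(rate_per_year: float) -> float:
    """Years needed to double at a constant compound yearly rate."""
    return math.log(2.0) / math.log(1.0 + rate_per_year)


if __name__ == "__main__":
    # Rates come from the slide; the 10-year horizon is an arbitrary choice.
    trends = [
        ("Transistor density", 0.35),
        ("Integration overall (low end)", 0.40),
        ("Integration overall (high end)", 0.55),
    ]
    for label, rate in trends:
        print(
            f"{label}: x{growth_factor(rate, 10):.1f} in 10 years, "
            f"doubles every {doubling_time_years(rate):.2f} years"
        )
```

At 35%/year, for instance, transistor density grows by roughly a factor of 20 over a decade under this constant-rate assumption.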
Energy and Power
Dynamic power
  Transistor switch from 0 -> 1 or 1 -> 0
  1/2 x Capacitive load x Voltage^2 x Frequency switched
  Reducing clock rate reduces power, not energy
Static power
  Current_static x Voltage
  Scales with number of transistors

Power
Intel 80386 consumed ~2 W
3.3 GHz Intel Core i7 consumes 130 W
Heat must be dissipated from a 1.5 x 1.5 cm chip
This is the limit of what can be cooled by air
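
The power relations above can be checked numerically. The following is a minimal sketch, assuming the textbook form of dynamic power with its factor of 1/2 (as in Hennessy and Patterson). The function names, the capacitive load of 1 nF, the 1.0 V supply, and the cycle count are hypothetical values chosen only for illustration. It shows why halving the clock rate halves power but leaves the energy of a fixed task unchanged: the task simply takes twice as long.

```python
def dynamic_energy(cap_load: float, voltage: float) -> float:
    """Energy (J) of one 0 -> 1 or 1 -> 0 transition: 1/2 * C * V^2."""
    return 0.5 * cap_load * voltage ** 2


def dynamic_power(cap_load: float, voltage: float, freq_switched: float) -> float:
    """Dynamic power (W): 1/2 * C * V^2 * frequency switched."""
    return dynamic_energy(cap_load, voltage) * freq_switched


def static_power(static_current: float, voltage: float) -> float:
    """Static (leakage) power (W): static current * voltage."""
    return static_current * voltage


def task_energy(cap_load: float, voltage: float,
                freq_switched: float, cycles: float) -> float:
    """Dynamic energy (J) of a task that needs a fixed number of cycles."""
    run_time = cycles / freq_switched  # a slower clock means a longer run
    return dynamic_power(cap_load, voltage, freq_switched) * run_time


if __name__ == "__main__":
    # Hypothetical operating point, not measured data.
    cap, volts, cycles = 1e-9, 1.0, 1e9
    for freq in (3.3e9, 1.65e9):
        print(
            f"{freq / 1e9:.2f} GHz: "
            f"power = {dynamic_power(cap, volts, freq):.3f} W, "
            f"task energy = {task_energy(cap, volts, freq, cycles):.3f} J"
        )
```

Because voltage enters as a square, lowering the supply voltage reduces both power and energy, which lowering the clock rate alone does not.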
Principles of Computer Design
Take Advantage of Parallelism
  e.g. multiple processors, disks, memory banks, pipelining, multiple functional units
Principle of Locality
  Reuse of data and instructions
Focus on the Common Case
  Amdahl's Law

Textbooks
Computer Architecture: A Quantitative Approach, 5th Edition, John Hennessy and David Patterson, Morgan Kaufmann
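
Amdahl's Law, listed under Focus on the Common Case in the design principles, bounds the overall speedup by the fraction of execution time an enhancement actually affects. The sketch below uses the standard formula, speedup = 1 / ((1 - f) + f / s). The function and parameter names, and the example numbers, are illustrative assumptions.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when `fraction_enhanced` of the original execution
    time is accelerated by a factor of `speedup_enhanced`."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    unaffected = 1.0 - fraction_enhanced
    return 1.0 / (unaffected + fraction_enhanced / speedup_enhanced)


if __name__ == "__main__":
    # Speeding up 40% of the run time by 10x gives well under 2x overall.
    print(amdahl_speedup(0.4, 10.0))
    # Even an enormous speedup of that 40% cannot exceed 1 / (1 - 0.4).
    print(amdahl_speedup(0.4, 1e9))
```

The second call illustrates why the common case matters: the part left unimproved sets a hard ceiling on the overall gain.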
Course Contents
1. Fundamentals (Chapter 1)
  Technology trends
  Performance evaluation
2. Instruction set principles (Appendix A)
3. Instruction-level parallelism (Chapter 3 and Appendix C)
  Pipelining
  Multiple issue and dynamic scheduling
  Support for speculative execution and precise interrupts
  Instruction fetch and branch prediction
  Limits on ILP
4. Architectural simulator: SimpleScalar
5. Memory hierarchy (Chapter 2 and Appendix B)
  Cache designs
  Hardware and software cache optimizations
  Main memory optimization
  Virtual memory
6. Beyond ILP (Chapters 3 and 4)
  SMT and multi-core
  GPU
Prerequisite
ECE/CS 366 or equivalent
  Basic logic design
  Assembly language
  Single-cycle and pipelined processor designs

Grading
Homework 20%
Projects (2) 10%
Midterm 30%
Final 35%
Class participation 5%