
CS5222

Advanced Computer Architecture


Part 3: VLIW Architecture
Fall Term, 2004/2005
Chi Chi Hung (email: chich@comp.nus.edu.sg)
Building S/17, Rm 5-13
Phone: 6874-2832

CS5222 Adv. Comp. Arch. Part 3 Page.1

Chi C.H. Fall 2004 NUS

Basic Working Principles of VLIW


Aim: speed up computation by exploiting instruction-level parallelism.
Same hardware core as superscalar processors: multiple
execution units (EUs) working in parallel.
An instruction consists of multiple operations; typical
word length ranges from 52 bits to 1 Kbit.
All operations in an instruction are executed in lock-step
mode.
One or multiple register files for FX (fixed-point) and FP (floating-point) data.
Relies on the compiler to find parallelism and schedule
dependency-free program code.


Basic VLIW Approach


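Register File Structure for VLIW

What is the challenge for the register file in VLIW? The number of
read/write (R/W) ports, since all execution units access the register
file in the same cycle.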
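
The port pressure can be made concrete with a small calculation. This is only a sketch: it assumes a single shared register file and that every operation reads two source registers and writes one result, which the slides do not state. The function name `register_file_ports` is hypothetical.

```python
def register_file_ports(num_units, reads_per_op=2, writes_per_op=1):
    """Return (read_ports, write_ports) for one shared register file.

    Assumption: every execution unit issues one operation per cycle,
    and each operation needs its own read and write ports.
    """
    return num_units * reads_per_op, num_units * writes_per_op


for n in (1, 4, 8):
    reads, writes = register_file_ports(n)
    print(f"{n} units: {reads} read ports, {writes} write ports")
```

Under these assumptions the port count grows linearly with the number of execution units, which is one motivation for splitting the register file into several smaller files.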



Differences Between VLIW & Superscalar Architecture (I)


Differences Between VLIW & Superscalar Architecture (II)


Instruction formulation:
Superscalar:
- Receives conventional instructions conceived for sequential
  processors.
VLIW:
- Receives (very) long instruction words, each comprising a
  field (or opcode) for each execution unit.
- Instruction word length depends on (a) the number of execution
  units, and (b) the code length needed to control each unit (such as
  opcode length, register names, ...).
- Typical word length is 64 to 1024 bits, much longer than
  conventional machine word length.
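
The two factors (a) and (b) can be combined into a rough word-length estimate. The sketch below is illustrative only: the field widths (8-bit opcode, three register fields, 32 registers, 9-bit immediate) and the function names are assumptions, not figures from any particular machine.

```python
def operation_bits(opcode_bits, register_fields, register_count, immediate_bits=0):
    """Bits needed to control one execution unit (hypothetical field layout)."""
    # Bits required to name one of `register_count` registers.
    reg_bits = (register_count - 1).bit_length()
    return opcode_bits + register_fields * reg_bits + immediate_bits


def word_length(num_units, bits_per_operation):
    """Instruction word length when every unit gets one fixed-size field."""
    return num_units * bits_per_operation


per_op = operation_bits(opcode_bits=8, register_fields=3,
                        register_count=32, immediate_bits=9)
print(per_op)                  # bits per operation field
print(word_length(8, per_op))  # bits for an 8-unit machine
```

With these assumed widths, one operation field takes 32 bits and an 8-unit machine needs a 256-bit word, which falls inside the 64 to 1024 bit range quoted above.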


Differences Between VLIW & Superscalar Architecture (III)


Instruction scheduling:
Superscalar:
- Done dynamically at run-time by the hardware.
- Data dependencies are checked and resolved in hardware.
- Needs a hardware lookahead window for instruction fetch.
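
The run-time dependency check can be modelled in software to show what the hardware has to do every cycle. This is a simplified sketch, not a description of any real issue logic: it assumes in-order issue, one destination register per instruction, and hypothetical names (`Instr`, `depends_on`, `select_issue_group`).

```python
from collections import namedtuple

# Hypothetical instruction record: one destination, a tuple of sources.
Instr = namedtuple("Instr", "dest srcs")


def depends_on(later, earlier):
    """True if `later` has a RAW, WAW or WAR dependency on `earlier`."""
    raw = earlier.dest in later.srcs
    waw = later.dest == earlier.dest
    war = later.dest in earlier.srcs
    return raw or waw or war


def select_issue_group(window, issue_width):
    """Pick the leading instructions of the window that can issue together."""
    group = []
    for instr in window:
        if len(group) == issue_width:
            break
        if any(depends_on(instr, prev) for prev in group):
            break  # in-order issue: stop at the first dependent instruction
        group.append(instr)
    return group


window = [
    Instr("r1", ("r2", "r3")),
    Instr("r4", ("r5", "r6")),
    Instr("r7", ("r1", "r4")),  # reads r1 and r4: must wait
    Instr("r8", ("r9", "r9")),
]
print(len(select_issue_group(window, issue_width=4)))
```

Each candidate is compared against every instruction already in the group, so the number of comparisons grows roughly with the square of the issue width. In a superscalar processor this work is done by hardware comparators on every cycle.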


Differences Between VLIW & Superscalar Architecture (IV)


Instruction scheduling (cont'd):
VLIW:
- Static scheduling done at compile-time by the compiler.
- Advantages:
  Reduced hardware complexity.
  Tasks such as decoding, data dependency detection,
  instruction issue, etc. become simple.
  Potentially higher clock rate.
  Higher degree of parallelism, using global program
  information.
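
To illustrate what static scheduling means, here is a minimal greedy scheduler that packs operations into long instruction words at "compile time". It is a sketch under strong assumptions that are not in the slides: every operation has unit latency, any execution unit can run any operation, and each register is written only once (so only read-after-write dependencies matter). The function name `schedule` is hypothetical.

```python
def schedule(ops, width):
    """Pack operations into bundles of at most `width` operations.

    ops: list of (name, dest, srcs) tuples in program order.
    Returns a list of bundles; each bundle is a list of operation names.
    """
    ready_cycle = {}  # register -> first cycle in which its value is available
    bundles = []
    for name, dest, srcs in ops:
        # Earliest cycle at which all source operands have been produced.
        cycle = max((ready_cycle.get(s, 0) for s in srcs), default=0)
        # Skip bundles that are already full.
        while cycle < len(bundles) and len(bundles[cycle]) >= width:
            cycle += 1
        while len(bundles) <= cycle:
            bundles.append([])
        bundles[cycle].append(name)
        ready_cycle[dest] = cycle + 1  # unit latency (assumption)
    return bundles


ops = [
    ("a", "r1", ("x", "y")),
    ("b", "r2", ("x", "z")),
    ("c", "r3", ("r1", "r2")),
    ("d", "r4", ("y", "z")),
    ("e", "r5", ("r3", "r4")),
]
print(schedule(ops, width=2))
```

Because the compiler sees the whole operation list, it can move `d` up next to `c` even though `d` comes later in program order. The hardware then simply executes each bundle in lock-step without checking dependencies.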


Differences Between VLIW & Superscalar Architecture (V)


Instruction scheduling (cont'd):
VLIW:
- Disadvantages:
  Higher compiler complexity.
  Compiler optimization needs to consider technology-dependent
  parameters such as latencies and the load-use
  time of the cache.
  (Question: What happens to the software if the hardware
  is updated?)
  Cache misses are non-deterministic, so code scheduling must
  assume the worst case.
  Unfilled opcode fields in a (V)LIW waste memory space
  and instruction bandwidth.
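
The cost of unfilled opcode fields can be estimated directly. The sketch below assumes fixed-width instruction words in which every empty field is stored as an explicit no-op. The sample schedule and the name `wasted_fraction` are made up for illustration.

```python
def wasted_fraction(bundles, width):
    """Fraction of operation fields left empty in a list of bundles."""
    total = len(bundles) * width
    used = sum(len(bundle) for bundle in bundles)
    return (total - used) / total if total else 0.0


# Hypothetical schedule: 5 useful operations spread over 3 words of 4 fields.
bundles = [["a", "b"], ["c", "d"], ["e"]]
print(f"{wasted_fraction(bundles, width=4):.0%} of the fields are empty")
```

In this made-up schedule 7 of the 12 fields carry no useful operation, yet they are still fetched and stored.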


Development history of Proposed/Commercial VLIWs


Case Study of VLIW: Trace 200 Family (I)


Case Study of VLIW: Trace 200 Family (II)

Only two branches may be used in the Trace 7/200.



Code Expansion in VLIW
Code for a VLIW is found to expand roughly by a
factor of three.
For longer VLIW words, more opcode fields will be left empty. This
wastes bandwidth and storage space. Can
you propose a solution for it?
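
One possible answer, offered only as a sketch and not as the solution intended by the lecture: store each instruction word in memory as a bit mask plus the non-empty operations, and expand it back to full width when it is fetched. The names `NOP`, `compress` and `expand` are hypothetical.

```python
NOP = None  # marker for an empty operation field (assumption)


def compress(word):
    """Return (mask, ops): bit i of mask is set if field i holds an operation."""
    mask = 0
    ops = []
    for slot, op in enumerate(word):
        if op is not NOP:
            mask |= 1 << slot
            ops.append(op)
    return mask, ops


def expand(mask, ops, width):
    """Rebuild the full-width word from the mask and the packed operations."""
    packed = iter(ops)
    return [next(packed) if mask & (1 << slot) else NOP for slot in range(width)]


word = ["add", NOP, NOP, "ld"]
mask, ops = compress(word)
print(bin(mask), ops)
assert expand(mask, ops, width=4) == word
```

Only the mask and the useful operations are stored, so a mostly empty word costs little memory. The trade-off is extra expansion logic on the instruction fetch path.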


END

