
Ph.D. Research Plan Presentation

Anup Gangwar

Embedded Systems Group


(http://www.cse.iitd.ac.in/esproject)
Department of Computer Science & Engineering
Indian Institute of Technology Delhi

June 11, 2002


Presentation Outline

Introduction and motivation


Specialization opportunities in VLIW processors
Methodology
Validation framework (supporting tools required)
Work plan
Status of work
References

Research Plan Presentation, June 11, 2002 http://www.cse.iitd.ac.in/esproject Slide 2


Introduction

Why customize architectures?


General-purpose computing domain vs. embedded domain
Customization leads to cheaper design solutions
Architectural choices for exploiting ILP
Superscalar processors
Extract ILP at run time, which requires complex hardware
Limited clock speeds and high power dissipation
Not well suited to embedded applications
VLIW processors
Compiler has detailed knowledge of the hardware
Compiler extracts ILP statically, which simplifies the hardware
Higher clock speeds are attainable



Introduction - Problems with VLIW Processors

Complex compiler required for extracting ILP
Adequate hardware support needed for compiler-controlled execution
Code size expands due to explicit NOPs if:
The application does not contain enough parallelism
The compiler is not able to extract parallelism from the application
Need for good instruction encoding and NOP compression schemes



Specialization Opportunities -> FUs



Specialization Opportunities -> FUs (contd...)

Functional Unit Types


MISO: Multiple Input, Single Output
MIMO: Multiple Input, Multiple Output
MIMO with LD/ST: MIMO with memory interaction
Rigid or flexible I/O timeshapes

NAME             Inputs and Sources          Outputs and Dests.          I/O Policy

MISO             Multiple (Regfile)          Single (Regfile)            Flexible or Rigid

MIMO             Multiple (Regfile)          Multiple (Regfile)          Flexible or Rigid

MIMO with LD/ST  Multiple (Regfile or Mem.)  Multiple (Regfile or Mem.)  Flexible or Rigid for Reg. and block LD/ST for mem.



Specialization Opportunities -> Reg. File

Single register file organization does not scale well


Area grows as N^3
Delay grows as N^(3/2)
Power grows as N^3
where N is the number of functional units connected to the register file

Clustered VLIW architectures are the solution


Each FU can read from/write to only a subset of registers
Data copying between clusters may increase execution latency
Powerful application analysis is required to overcome the above-mentioned problems
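
The scaling argument above can be made concrete with a small worked calculation. This is only a rough sketch: it assumes proportionality constants of 1, an even split of FUs across clusters, and it ignores the cost of the inter-cluster interconnect and of copy operations. The function names `relative_cost` and `clustered_cost` are hypothetical and do not come from the presentation.

```python
def relative_cost(num_fus: int) -> dict:
    """Relative cost of one register file shared by num_fus functional units.

    Uses the growth rates quoted on the slide: area ~ N^3, delay ~ N^(3/2),
    power ~ N^3. Constants of proportionality are assumed to be 1.
    """
    return {
        "area": num_fus ** 3,
        "delay": num_fus ** 1.5,
        "power": num_fus ** 3,
    }


def clustered_cost(num_fus: int, num_clusters: int) -> dict:
    """Relative cost when the FUs are split evenly across clusters.

    Assumption: each cluster has its own register file, and interconnect
    and inter-cluster copy costs are ignored.
    """
    if num_clusters <= 0 or num_fus % num_clusters != 0:
        raise ValueError("num_fus must divide evenly across clusters")
    per_cluster = relative_cost(num_fus // num_clusters)
    return {
        "area": num_clusters * per_cluster["area"],    # areas add up
        "delay": per_cluster["delay"],                 # files operate in parallel
        "power": num_clusters * per_cluster["power"],  # power adds up
    }


if __name__ == "__main__":
    print("monolithic, 8 FUs:", relative_cost(8))
    print("2 clusters of 4 FUs:", clustered_cost(8, 2))
```

Under these assumptions, splitting 8 FUs into 2 clusters of 4 reduces relative area and power from 512 to 128. The inter-cluster copies that this sketch ignores are the cost that application analysis has to keep small.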



Specialization Opportunities -> Reg. File (contd...)

A Clustered VLIW Architecture



Specialization Opportunities -> Interconnect

Clustering FUs together requires deciding the interconnection network (ICN)
between different clusters
between clusters and memory

Analysis of data access patterns is required for evaluating cost-performance tradeoffs
Current ASIP vendors do not offer customizable interconnects



Specialization Opportunities -> Encoding

Instruction encoding/decoding scheme affects


Code size
Object code compatibility
Branch misprediction penalty
Hardware cost
Address specification in code size
Each UniOp is equivalent to a RISC/CISC instruction

MultiOp: | UniOp | UniOp | UniOp | UniOp |



Specialization Opportunities -> Encoding (contd...)

IALU.0   IALU.1   FALU.0   BU.0
ADD      NOP      FMUL     NOP

NOPs in a MultiOp

VLIW Processor Pipeline with Instruction Decompressor
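
The role of an instruction decompressor can be illustrated with a small sketch of one possible NOP compression scheme. This is an assumption for illustration only, not necessarily the scheme used in this work: each MultiOp is stored as a presence mask (one bit per issue slot) followed by only the non-NOP UniOps, and the decompressor re-inserts the NOPs before issue. The names `compress` and `decompress` are hypothetical.

```python
NOP = "NOP"


def compress(multiop):
    """Encode a MultiOp as (mask, ops).

    Bit i of mask is set if slot i holds a real operation; ops lists the
    real operations in slot order. NOPs are not stored.
    """
    mask = 0
    ops = []
    for slot, op in enumerate(multiop):
        if op != NOP:
            mask |= 1 << slot
            ops.append(op)
    return mask, ops


def decompress(mask, ops, width):
    """Rebuild the full-width MultiOp, re-inserting NOPs in empty slots."""
    remaining = iter(ops)
    return [next(remaining) if mask & (1 << slot) else NOP
            for slot in range(width)]


if __name__ == "__main__":
    # Slots: IALU.0, IALU.1, FALU.0, BU.0 (the MultiOp shown on the slide)
    multiop = ["ADD", NOP, "FMUL", NOP]
    mask, ops = compress(multiop)
    print(bin(mask), ops)
    print(decompress(mask, ops, len(multiop)))
```

For the MultiOp shown on the slide, only two of the four slots need to be stored, at the cost of a 4-bit mask and extra decode logic in the pipeline.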



Specialization Opportunities -> Summary



Existing Methodologies -> Simulation Driven



VLIW ASIP Synthesis Methodology (flow diagram)

Blocks shown, top to bottom:
Task Set and Constraints; Architecture Description
Application Parameter Extraction; Architecture Design Space Exploration
Retargetable Compiler
Instruction Encoding Specialization
Validation (simulation with encoded instructions)
Architecture Description (output to synthesizer)



Validation Framework -> Trimaran

Trimaran compilation and simulation flow (block diagram)

C Program -> IMPACT -> Bridge Code -> ELCOR -> ELCOR IR -> Simulator Generator -> Generated Simulator (Statistics)

IMPACT
ANSI C parsing
Code profiling
Classical machine-independent optimizations
Block formation

ELCOR
Machine-dependent code optimizations
Code scheduling
Register allocation

Simulator Generator
ELCOR IR to low-level C files
HPL-PD virtual machine
Cache simulation
Performance statistics

Generated Simulator (Statistics)
Compute and stall cycles
Cache stats
Spill code info

HMDES Machine Description



Validation Framework -> Trimaran (contd...)

Simulator generation flow (block diagram)

REBEL + HMDES -> Code Processor -> Low-level C files
Low-level C files + C libraries + Emulation Library -> Native Compiler -> Executable for the host platform



Validation Framework -> Retargetable Assembler

Retargetable assembler flow (block diagram)

Instruction Encoding Description -> Toolkit Generator -> Generated Assembler
Assembly Instructions -> Generated Assembler -> Object Code -> To Simulator (for simulation with encoded instructions)



Work Plan -> Interconnect/RF/FU Specialization

Initially model the interconnect problem as an integer linear program (ILP) and later move to other solutions
Code selection problem in compilers is similar to identifying compute-intensive parts for AFUs
Number and type of FUs have not been properly explored
RF clustering problem has not been dealt with elsewhere
Jacome et al.
Deal with interconnect/RF/FU specialization simultaneously
Operation chaining is not considered


Work Plan -> Encoding/Decoding Specialization

Goal is to generate encoding schemes automatically
Work of Shail Aditya et al.
Basically a parameterized encoding scheme
Techniques are specific to the HPL-PD architecture
Does not address dynamic code size minimization
Encoding template is fixed; exploration is limited to the template design space
Various encoding templates need to be explored; the template itself may also be derived from the application
Dynamic code size minimization needs to be considered



Work Status -> Specialized FUs in Trimaran

Modeling MISOs
Model as external function calls
Locate the call in the Trimaran bridge code and replace it with an AFU op
Model the new AFU in MDES with the required ops
Introduce the semantics in the simulator op definitions file
Modeling MIMOs
Model as external function calls returning void
Locate the call in the Trimaran bridge code and replace it with an AFU op
Explicitly reserve registers in the C code for returning values
Introduce the operation semantics in the simulator op definitions file



Work Status -> Specialized FUs in Trimaran (contd...)

Modeling MIMOs with LD/ST
Model as regular MIMOs
Memory interaction with block LD/ST at the beginning and end of execute cycles
Additionally
Possible to impose register file constraints
Various I/O timeshapes, rigid or flexible
Possible to introduce pipelined functional units



Work Status -> Instruction Enc. in Trimaran



Work Status -> Instruction Enc. in Trimaran (contd...)

New Jersey Machine Code Toolkit (NJMC)


Deals with bits at symbolic level
Can be used to write assemblers/disassemblers
Specification in SLED (Specification Language for Encoding/Decoding)
Model instruction decompressor in HMDES
Instrument ELCOR to generate assembly code
Encoding is done using procedures generated by NJMC
Problems with NJMC
VLIW instructions need to be broken up into 32-bit tokens
Encoded instructions must end on an 8-bit boundary
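
The two constraints listed under "Problems with NJMC" can be illustrated with a short sketch. It does not use NJMC or SLED; it only shows, on a plain bit string, what padding to an 8-bit boundary and splitting into 32-bit tokens mean for a wide VLIW instruction. The helper names and the 21-bit UniOp width in the example are hypothetical.

```python
def pad_to_byte(bits: str) -> str:
    """Right-pad a bit string with zeros so its length is a multiple of 8."""
    remainder = len(bits) % 8
    if remainder == 0:
        return bits
    return bits + "0" * (8 - remainder)


def split_tokens(bits: str, token_bits: int = 32) -> list:
    """Pad to a byte boundary, then cut into tokens of at most token_bits.

    The last token may be shorter than token_bits, but it always ends on
    an 8-bit boundary because of the padding step.
    """
    padded = pad_to_byte(bits)
    return [padded[i:i + token_bits]
            for i in range(0, len(padded), token_bits)]


if __name__ == "__main__":
    # Hypothetical MultiOp: four UniOps of 21 bits each, 84 bits in total.
    multiop_bits = "1" * 84
    tokens = split_tokens(multiop_bits)
    print([len(t) for t in tokens])
```

In this example the 84-bit MultiOp is padded to 88 bits and emitted as tokens of 32, 32 and 24 bits, so 4 bits per instruction are spent on padding alone.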



Work Status -> Code Gen. for Clustered ASIPs

ELCOR
Disadvantages
ELCOR is heavily oriented towards HPL-PD architecture
Does not support clustered VLIW architecture
Advantages
Strong optimizing compiler
Rich library to deal with the IR

IMPACT compiler system offers another choice for building a backend
Feasibility study is being carried out to decide on a direction of work



References

Bhuvan Middha, Varun Raj, Anup Gangwar, M. Balakrishnan, Anshul Kumar and Paolo Ienne, A Trimaran based framework for exploring design space of VLIW ASIPs with coarse grain FUs, ISSS-2002.
Anup Gangwar, M. Balakrishnan and Anshul Kumar, A framework for studying the effect of VLIW processor instruction encoding and decoding schemes, Mini Project, Dept. of CSE.
M. Jacome and G. de Veciana, Design challenges for new application specific processors, IEEE Design and Test of Computers-2000.
B. Ramakrishna Rau and Michael S. Schlansker, Embedded computer architecture and automation, IEEE Computer-2001.
Michael S. Schlansker and B. Ramakrishna Rau, EPIC: An architecture for instruction-level parallel processors, HPCA-2000.
N. G. Busa, A. van der Werf and M. Bekooij, Scheduling coarse grain operations for VLIW processors, ASPDAC-1998.
Shail Aditya, Scott A. Mahlke and B. Ramakrishna Rau, Code size minimization and retargetable assembly for custom EPIC and VLIW processors, ISSS-1999.

