The SHARC
Developed by Analog Devices
Optimized for demanding DSP and imaging
applications.
32-bit floating point, with 40-bit extended
floating-point capability.
Large on-chip memory.
Ideal for scalable multi-processing applications.
Harvard Architecture
Program memory can store data.
Able to simultaneously read or write data at one
location and get instructions from another place in
memory.
2 buses
1 Data memory bus.
2 Program bus.
DSP
Digital Signal Processor.
High speed, low overhead data movement
and rapid computations required.
Usually has a small on-board ROM, RAM
and single cycle multiply.
Designed to run single line, serial in, serial
out, signal processing applications very fast.
DSP Computations
The inner product of two vectors is a common
computation for determining energy or
correlation.
The following C code is an example:
result = 0;  /* clear the accumulator before summing */
for (n = 0; n < length; n++)
    result += x[n] * y[n];
The processor that executes each pass of this loop in
the fewest instruction cycles will have the best performance.
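As a minimal sketch of how the inner product yields energy and correlation, the loop can be wrapped in a function. The names inner_product and energy are illustrative assumptions, not taken from the slides or from any Analog Devices library.

```c
#include <stddef.h>

/* Inner product of two vectors of the same length.
 * Each iteration is one multiply-accumulate, the operation
 * a DSP is built to perform quickly. */
float inner_product(const float *x, const float *y, size_t length)
{
    float result = 0.0f;
    for (size_t n = 0; n < length; n++)
        result += x[n] * y[n];
    return result;
}

/* Hypothetical helper: the energy of a signal is its inner
 * product with itself. */
float energy(const float *x, size_t length)
{
    return inner_product(x, x, length);
}
```

Calling inner_product(x, y, length) on two different signals gives their correlation at zero lag, while energy(x, length) gives the sum of the squared samples.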
SHARC DSP
The SHARC incorporates features aimed at
optimizing such loops.
High-Speed Floating Point Capability
Extended Floating Point
Dual-block, dual-port memory
Optimizes the Harvard Architecture by allowing the fetch
of instructions while performing data memory accesses.
DAG Capabilities
Circular Buffering
Rather than actually moving data in and out of a vector,
circular buffers are used.
By updating the index modulo the buffer length, the oldest
entry can be conveniently replaced by the newest entry.
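The index update described above can be sketched in software. On the SHARC the DAG hardware performs the modulo update, so this C model is only an illustration; BUF_LEN, circ_buf, circ_push and circ_get are hypothetical names chosen for the example.

```c
#include <stddef.h>

#define BUF_LEN 4  /* hypothetical buffer length, for illustration only */

typedef struct {
    float  data[BUF_LEN];
    size_t index;  /* slot holding the oldest entry (next to be overwritten) */
} circ_buf;

/* Overwrite the oldest entry with the newest sample, then advance
 * the index modulo the buffer length. No data is moved. */
void circ_push(circ_buf *b, float sample)
{
    b->data[b->index] = sample;
    b->index = (b->index + 1) % BUF_LEN;
}

/* Read the sample written `age` pushes ago (0 = newest).
 * Assumes age < BUF_LEN. */
float circ_get(const circ_buf *b, size_t age)
{
    return b->data[(b->index + BUF_LEN - 1 - age) % BUF_LEN];
}
```

After six pushes of the values 1 to 6 into an empty four-entry buffer, the buffer holds 3, 4, 5 and 6: the two oldest values were replaced in place rather than shifted out.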
SHARC DSP
What Makes the SHARC unique?
It also has some features not directly related
to optimizing numeric computations.
Pipelining
Handling Branches
SHARC's Pipeline
3 stages
1 Instruction Fetch
2 Decode
3 Execution
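To see why a three-stage pipeline matters, here is a rough cycle-count sketch. It assumes an ideal pipeline with no stalls or branch penalties, which is a simplification and not a statement about the SHARC's actual timing; pipeline_cycles is a hypothetical helper.

```c
/* Cycles needed to complete n instructions on an ideal pipeline
 * with the given number of stages, assuming no stalls and no
 * branch penalties (a simplifying assumption). The first
 * instruction takes `stages` cycles to pass through; every
 * following instruction completes one cycle later. */
unsigned long pipeline_cycles(unsigned long n, unsigned stages)
{
    if (n == 0)
        return 0;
    return n + stages - 1;
}
```

With three stages, 100 instructions take 102 cycles under these assumptions, so throughput approaches one instruction per cycle once the pipeline is full.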
Multi-processing
SHARC is uniquely equipped for multiprocessing.
Link ports provide very powerful multiprocessing capabilities.
Two main program models depending on the
application.
Adapts well to different multi-processing
architectures.
Multi-processing
SHARC Links
SHARC has 6 link ports that can transport
data at rates up to 40 Mbytes/sec.
Links designed for point-to-point
connections.
Data can be transmitted in either direction
but not both simultaneously.
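A back-of-the-envelope sketch of what these figures imply for transfer time. It assumes the peak rate is sustained and that 1 Mbyte means 10^6 bytes; both are assumptions, and the function names are hypothetical.

```c
/* Peak link rate from the slide; assumes 1 Mbyte = 10^6 bytes. */
#define LINK_BYTES_PER_SEC 40.0e6

/* Time to move `bytes` over one link in one direction at the peak rate. */
double link_transfer_seconds(double bytes)
{
    return bytes / LINK_BYTES_PER_SEC;
}

/* A link carries data in only one direction at a time, so a
 * two-way exchange costs the sum of the two one-way transfers. */
double link_exchange_seconds(double bytes_out, double bytes_back)
{
    return link_transfer_seconds(bytes_out) + link_transfer_seconds(bytes_back);
}
```

Under these assumptions, sending 20 Mbytes and receiving 20 Mbytes over the same link takes about one second, the same as a single 40 Mbyte one-way transfer.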
Multi-processing Architectures
Cluster Design
Groups of up to 6 in a cluster
Most common for joining multiple
SHARCs
All processors, global I/O and global
memory connected to a common
Cluster bus.
Each SHARC can drive the bus.
Multi-processing Architectures
Mesh Design
All SHARCs are joined by their link ports and
connected to a common bus.
In SIMD mode one single master SHARC drives
the bus.
In MIMD mode the mesh architecture cannot function
if the data is larger than the available on-chip memory.
Advantageous scalability over a wider range of
applications.
Sources
www.alacron.com/news/tp_mimd_simd.htm
www.analog.com
www.cs.seas.gwu.edu/~cs339/cs339lecture2.pdf
www.ixthos.aa.psiweb.com/technical/notes_articles/articles