Application Specific Processors

1. Application Specific Processors
General purpose processors are designed to execute multiple applications and
perform multiple tasks. They can be quite expensive, especially for small devices
that are designed to perform special tasks. They may also lack the performance
that a particular task requires. Application specific processors therefore emerged
as a solution that offers high performance at low cost. They have become a part of
our lives and can be found in almost every device we use on a daily basis: devices
such as TVs, cell phones, and GPS receivers all contain some form of application
specific processor. An application specific processor combines high performance,
low cost, and low power consumption.
1.1. Application Specific Processors Classification

Application specific processors can be classified into three major categories:
- Digital Signal Processor (DSP): Programmable microprocessor for extensive real-time mathematical computations.
- Application Specific Instruction Set Processor (ASIP): Programmable
microprocessor where hardware and instruction set are designed together for one
special application.
- Application Specific Integrated Circuit (ASIC): Algorithm completely implemented
in hardware.
1.2. Application Specific Systems
Typical approaches to building an application specific system or an embedded
system use one or more of the following implementation methods: a general purpose
processor (GPP), an ASIC, or an ASIP.
- GPP
The functionality of the system is built exclusively at the software level. The
biggest advantage of such a system is its flexibility, but it is not optimal in
terms of performance, power consumption, cost, physical space, and heat dissipation.
- ASIC
Compared to a GPP, ASIC based systems offer better performance and lower power
consumption, but at the cost of flexibility and extensibility. It is difficult to
use an ASIC for tasks other than those it was designed for, but a GPP can be
placed alongside the ASIC in the same system to perform the more general, less
demanding tasks.
- ASIP
In this approach, as defined in [9], [4], an ASIP is a compromise between two
extremes: the ASIC, which is designed to do one very specific job with high
performance but with minimal room for modification, and the general purpose
processor, which costs a lot more than an ASIP but is extremely flexible. Because
it combines flexibility with a low price, the ASIP is well suited to embedded and
system-on-a-chip solutions.
2. Digital Signal Processors

Digital signal processors are specialized microprocessors optimized for the needs
of digital signal processing. They gained importance with the increased demand for
data-intensive applications such as video and internet browsing on mobile devices.
Digital signal processors satisfy the need for a powerful processor while
maintaining low cost and low power consumption. Size limitations previously made it
impractical to include more than four processors on a board, but the recent shrink
in feature size has allowed single-chip DSPs to become multicore, with a decent
amount of memory and I/O interfaces. The need for larger on-chip memory, together
with power constraints, has driven these multicore DSPs to become systems-on-chip.
Using a system-on-chip reduces cost further because it simplifies the board design.
The move toward multicore in embedded systems has allowed for higher performance,
lower cost, and lower power consumption.
2.1. Background and History

The concept of the DSP was introduced in the mid-1970s. The rise in interest in
DSPs resulted from the need to solve real world problems using digital systems. A
few years later, a toy named Speak and Spell was created that used a single
integrated circuit to synthesize speech, opening the door to new possibilities: it
showed that digital signal processing is possible in real time and that DSPs can
be cost effective. DSPs differ from other microprocessors in that they are
intended to do complex math while guaranteeing real-time processing.
Prior to the advent of DSP chips, DSP applications were implemented using the AMD
2901 bit-slice chip. These bit-slice architectures would sometimes include a
peripheral multiplier chip. Later on, Intel released the 2920 as an analog signal
processor that combined an ADC and a DAC with an internal signal processor, but
its lack of a hardware multiplier was the reason for its market failure. In 1980
the first dedicated digital signal processor was released. In 1983 Texas
Instruments released its first successful DSP.
Architectural properties such as dual or multiple data buses, logic to prevent
overflow and underflow, single-cycle complex instructions, a hardware multiplier,
little to no interrupt support, and special instructions for signal processing
gave the DSP the ability to do complex math in real time.
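To make two of these properties concrete, the following is a minimal sketch of a multiply-accumulate (MAC) loop with saturation, written as a Q15 fixed-point FIR filter. It models the behavior in plain Python and is not taken from any particular DSP; the 16-bit word width, the Q15 format, and the names `saturate16` and `fir_q15` are assumptions made for illustration.

```python
# Illustrative sketch only: models DSP-style multiply-accumulate (MAC)
# with saturation. Word width, Q15 format, and names are assumptions.

INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1


def saturate16(value: int) -> int:
    """Clamp to the signed 16-bit range instead of wrapping around."""
    return max(INT16_MIN, min(INT16_MAX, value))


def fir_q15(samples: list[int], coeffs: list[int]) -> list[int]:
    """FIR filter on Q15 data: one multiply-accumulate per tap per sample."""
    history = [0] * len(coeffs)  # delay line, newest sample first
    out = []
    for x in samples:
        history = [x] + history[:-1]
        acc = 0  # wide accumulator, as a hardware MAC unit would provide
        for h, c in zip(history, coeffs):
            acc += h * c  # the operation a hardware multiplier speeds up
        # Scale the Q30 product sum back to Q15, then saturate.
        out.append(saturate16(acc >> 15))
    return out
```

On a DSP with a hardware multiplier, each iteration of the inner loop would typically map to a single-cycle instruction, and the clamp would be done by the overflow logic rather than by a separate comparison.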
As an example, the sample rate for voice is 8 kHz. The first DSPs allowed around
600 instructions per sample period, barely enough for transcoding. As better DSPs
became available, functions such as noise cancellation and echo cancellation could
be implemented. This was the result of using multiprocessor DSPs rather than of
improving the performance of single DSPs. Nowadays, more than 68% of DSPs are used
in the wireless sector, specifically in mobile handsets and base stations.
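The per-sample budget in the voice example follows from simple arithmetic: the sample period is the reciprocal of the sample rate, and the budget is the instruction rate divided by the sample rate. The sketch below shows that calculation; the function names and the 4.8 million instructions per second figure (which is just 600 times 8000) are assumptions introduced here for illustration.

```python
# Illustrative arithmetic only. The instruction rate used in the usage note
# is derived from the text's 600 instructions at 8 kHz, not a quoted spec.

def sample_period_us(sample_rate_hz: float) -> float:
    """Time available per sample, in microseconds."""
    return 1e6 / sample_rate_hz


def instruction_budget(instr_per_second: int, sample_rate_hz: int) -> int:
    """Whole instructions that fit into one sample period."""
    return instr_per_second // sample_rate_hz


def fits_in_budget(task_costs: dict[str, int], budget: int) -> bool:
    """True if the summed per-sample cost of all tasks fits the budget."""
    return sum(task_costs.values()) <= budget
```

For an 8 kHz stream, `sample_period_us(8000)` gives 125 microseconds, and `instruction_budget(4_800_000, 8000)` gives 600 instructions. Any additional per-sample function has to fit into whatever remains of that budget, which is why extra processing only became practical with more capable devices.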
2.2. Multicore DSPs

Multicore DSPs arose as a solution for higher-performance applications such as
third generation wireless communication, mobile TV, and mobile web browsing. The
move from single core to multicore resulted in significant improvements in
performance, power consumption, and space requirements.
Multicore DSPs can be categorized in several ways. By the type of cores on the
chip, they are either homogeneous or heterogeneous: homogeneous multicore DSPs
consist of cores of the same type, whereas heterogeneous DSPs consist of different
cores. By the way the cores are interconnected, they use either a hierarchical or
a mesh topology.
In a hierarchical topology, cores communicate through switches. Cores that need to
communicate with each other faster are placed close to each other. Cores can share
memory or have individual memories. Cores usually have a dedicated level 1 memory
and a dedicated or shared level 2 memory. Communication between cores is done by
transferring data between memories, a process known as direct memory access (DMA).
In a mesh topology, cores are placed on a 2D array and are connected through a
network of buses and multiple simple switches. Memory is usually dedicated (local)
to each processor. A mesh topology allows scaling to a large number of cores
without increasing the complexity.
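One way to picture communication in a mesh is to number the cores row by row and route each message first along the row and then along the column. The sketch below does this for a small 2D mesh. The row-major numbering, the dimension-ordered (XY) routing rule, and the function names are assumptions chosen for illustration; the text does not say how any specific device routes traffic.

```python
# Illustrative sketch only: cores are numbered row by row on a 2D mesh and
# messages use dimension-ordered (XY) routing. Both choices are assumptions.

def mesh_hops(src: int, dst: int, width: int) -> int:
    """Number of switch-to-switch hops between two cores (Manhattan distance)."""
    src_row, src_col = divmod(src, width)
    dst_row, dst_col = divmod(dst, width)
    return abs(src_row - dst_row) + abs(src_col - dst_col)


def xy_route(src: int, dst: int, width: int) -> list[int]:
    """Cores visited when moving along the row first, then along the column."""
    row, col = divmod(src, width)
    dst_row, dst_col = divmod(dst, width)
    path = [src]
    while col != dst_col:
        col += 1 if dst_col > col else -1
        path.append(row * width + col)
    while row != dst_row:
        row += 1 if dst_row > row else -1
        path.append(row * width + col)
    return path
```

On a 4 by 4 mesh, `xy_route(0, 15, 4)` visits cores 0, 1, 2, 3, 7, 11, 15, which is six hops. Each switch only needs to know its own position and the destination, which is one reason the per-switch logic stays simple as the array grows.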
2.3. Examples of Multicore DSPs

Multicore DSPs are produced by several vendors these days. Companies such as TI,
Freescale, and Picochip provide different DSP platforms for applications such as
wireless communication, video, and VoIP. In this section several multicore DSPs
are presented.
2.3.1. TI TNETV3020
This multicore DSP platform contains six TMS320C64x+ DSP cores, each running at
500 MHz. The platform is designed for high performance wireless voice and video
applications and is based on the hierarchical architecture.
2.3.2. Freescale MSC8156
This multicore DSP platform contains six StarCore SC3850 DSP cores, each running
at 1 GHz. The platform is designed for 3G/4G applications.
3. Application-Specific Instruction Set Processors

3.1. Background
An ASIP is typically a programmable architecture that is designed in a specific way
to perform certain tasks more efficiently. This extra efficiency is not exclusively
associated with faster performance. Other factors, such as reduced production
costs, a simplified manufacturing process, and lower power consumption, can all be
considered efficiency qualities of an ASIP. The term "application" in ASIP does not
necessarily refer to software applications; it describes the class of tasks the
ASIP platform was designed to accomplish efficiently. As the name suggests, the
instruction set seems to be the core characteristic of any ASIP based platform, but
this is not entirely true. Considering the whole platform, other very important
attributes such as the interfaces and the micro-architecture contribute a lot to
the overall system performance.
Interfaces outline how the system and the architecture communicate with each
other. An ASIP with an extremely parallelized instruction set in a data-intensive
processing platform does not necessarily make for a more efficient system. For
example, if the load/store unit cannot keep up with the rate at which data is
processed, there will be a performance bottleneck due to the system interfaces.
Similarly, for the micro-architecture, the pipeline of the ASIP must be designed
in a way that optimizes the performance of the whole system. The traditional RISC
stages (fetch, decode, execute, memory, and write-back) might not be the optimal
pipeline for the application. Specifically designing any of these aspects of an
ASIP always involves trade-offs. An extremely parallelized instruction set raises
issues with the size of the instruction set, and it affects the program memory,
the interfaces, and the fetch and decode stages of the pipeline.
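The load/store example can be captured with a very simple steady-state model: the slowest stage sets the throughput of the whole system, and every faster stage sits partly idle. The sketch below is only that model; the stage names, the rates, and the function names are hypothetical and are not measurements of any real ASIP.

```python
# Illustrative model only: stage names and rates are hypothetical.

def find_bottleneck(stage_rates: dict[str, float]) -> tuple[str, float]:
    """Return the slowest stage and the throughput it imposes.

    stage_rates maps a stage name to the number of data words it can
    handle per cycle. In steady state, the slowest stage sets the pace.
    """
    if not stage_rates:
        raise ValueError("at least one stage is required")
    slowest = min(stage_rates, key=stage_rates.get)
    return slowest, stage_rates[slowest]


def utilization(stage_rates: dict[str, float]) -> dict[str, float]:
    """Fraction of each stage's capacity that is actually used."""
    _, limit = find_bottleneck(stage_rates)
    return {name: limit / rate for name, rate in stage_rates.items()}
```

With hypothetical rates of 2 words per cycle for the load/store unit and 8 for a parallel execution unit, the model reports the load/store unit as the bottleneck and the execution unit as only a quarter utilized, which is the situation the paragraph above warns about.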
3.2. Instruction Set

An ASIP instruction is usually different from a normal instruction. It does not
have to be composed of a mnemonic and register/memory operands. The design of the
architecture, which is driven by the application of the ASIP, shapes the
instruction-set format. An ASIP can use either general purpose registers or
configuration registers. In a typical RISC processor, instructions trigger
functional units and carry general purpose register addresses, while an ASIP can
also benefit from configuration registers or use specially designed data flow
mechanisms that are hardwired in the system. Using configuration registers with
constant operands eliminates the need to encode their addresses in the instruction
word, as is required with general purpose registers. In [1], the authors describe
ASIPs as more of a hardwired block than a processor from an architectural and
interface perspective.
The instruction set of an ASIP can be divided into two parts: static logic, which
defines a minimum ISA, and configurable logic, which can be used to design new
instructions. The configurable logic can be programmed in the field, similar to an
FPGA, or during chip synthesis.
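The saving from configuration registers can be shown by counting instruction-word bits. The sketch below packs an opcode and register addresses into a word and compares a three-operand instruction over a 32-entry register file with one whose operands come from implicit configuration registers. The field layout, the register count, and the function names are assumptions for illustration and do not describe a specific ASIP encoding.

```python
# Illustrative sketch only: field layout and register counts are assumptions.

def operand_field_bits(num_registers: int, num_operands: int) -> int:
    """Bits needed to address num_operands registers in the instruction word."""
    bits_per_address = (num_registers - 1).bit_length()
    return num_operands * bits_per_address


def encode(opcode: int, reg_addresses: list[int], reg_bits: int) -> int:
    """Pack an opcode followed by fixed-width register address fields."""
    word = opcode
    for address in reg_addresses:
        if not 0 <= address < (1 << reg_bits):
            raise ValueError("register address does not fit its field")
        word = (word << reg_bits) | address
    return word
```

With 32 general purpose registers, `operand_field_bits(32, 3)` is 15 bits spent on addresses alone. If the same operation reads its operands from configuration registers that are set up once, the address list is empty and `encode(opcode, [], 5)` is just the opcode, so the instruction word can be shorter or the freed bits can encode more parallel operations.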
3.3. Network Processors

3.3.1. Networking Systems Evolution
Generally, network systems architecture evolution can be classified into three main
generations:
- First Generation (1980s): Networking services are executed at the application
layer on a general purpose processor based system. In other words, network
services are software packages that run on top of the operating system of a PC.
- Second Generation (mid 1990s): A few networking functions were assigned to
more dedicated hardware units, and a faster switching fabric replaced the
shared bus.
- Third Generation (late 1990s): ASICs are used for a more distributed design,
with ASIC hardware and an exclusive processor on each network interface to
accommodate the fast data plane requirements.
The complexity of the latest network devices benefits from this distributed
approach to platform design. Activities such as exchanging routing table entries
among all routing interfaces in a network router need to be carried out smoothly,
without interfering with the router's forwarding tasks.

But the move toward ASICs had its trade-offs, among them high production costs, a
long time to market, and the inflexibility of ASIC based platforms. These
trade-offs made it clear that ASIC based systems are not efficient in terms of
scalability or upgradability.
Chip manufacturers began exploring new approaches to designing flexible yet fast
network systems. That is when the programmable network processor emerged: a
processor that combines the advantages of first and third generation devices,
namely the flexibility of general purpose processors and the speed of ASICs.
Software based systems generally do not perform as fast as hardware based ones.
So, with the programmable approach in mind, the core networking tasks must be
identified and assigned to dedicated functional units to guarantee that the
overall performance of these processors is acceptable.
3.3.2. Network Applications

To understand how network processors are special and why they are designed the
way they are, we first need to understand the characteristics of network
applications. Most network applications are based on pattern matching, address
lookup, and queue management. They can be classified into two main categories
according to the type of processing: data plane and control plane processing.
Data plane processing:
The processor moves data from one port to another without much interest in the
payloads. Such tasks vary from less processor-demanding applications such as
routing or forwarding to heavier applications such as firewall filtering,
encryption, or payload content inspection.
Control plane processing:
Tasks such as updating router lookup entries and establishing secured tunnel
connections are examples of control plane tasks. Such tasks are usually more
complex, although they deal with smaller amounts of data. Data plane tasks require
minimal design effort but heavy processing power; control plane tasks, on the
contrary, do not require heavy processing power but take effort to design
properly.
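Address lookup is a good example of how the two planes divide the work: the control plane installs routes occasionally, and the data plane looks up every packet's destination against them. The sketch below implements longest-prefix matching with Python's standard `ipaddress` module. The table contents, port names, and function names are hypothetical, and the linear scan is used only for clarity; hardware typically relies on specialized lookup structures rather than a loop like this.

```python
# Illustrative sketch only: routes, port names, and the linear scan are
# assumptions for clarity, not how a network processor implements lookup.
import ipaddress


def build_table(routes: list[tuple[str, str]]) -> list:
    """Control plane side: parse (prefix, output port) pairs once."""
    return [(ipaddress.ip_network(prefix), port) for prefix, port in routes]


def lookup(table: list, address: str):
    """Data plane side: return the port of the longest matching prefix."""
    addr = ipaddress.ip_address(address)
    best = None
    for network, port in table:
        if addr in network:
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, port)
    return best[1] if best is not None else None


# Hypothetical routing table used in the usage note below.
TABLE = build_table([
    ("0.0.0.0/0", "default"),
    ("10.0.0.0/8", "port1"),
    ("10.1.0.0/16", "port2"),
])
```

Here `lookup(TABLE, "10.1.2.3")` selects `port2` because the /16 prefix is more specific than the /8, while `lookup(TABLE, "10.2.0.1")` falls back to `port1`. The lookup runs once per packet, which is why it is a natural candidate for a dedicated functional unit, whereas `build_table` runs rarely.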
4. Application Specific Integrated Circuits

4.1. History and Background
The term 'ASIC' stands for 'Application-Specific Integrated Circuit'. An ASIC is
basically an integrated circuit designed specifically for a special purpose or
application. An example of an ASIC is an IC designed for a specific line of cellular
phones of a company, whereby no other products can use it except the cell phones
belonging to that product line. The opposite of an ASIC is a standard product or
general purpose IC, such as a logic gate or a general purpose microcontroller, both
of which can be used in any electronic application by anybody. Aside from the
nature of its application, an ASIC differs from a standard product in the nature of its
availability. The intellectual property, design database, and deployment of an ASIC
are usually controlled by just a single entity or company, which is generally the
end user of the ASIC too. Thus, an ASIC is proprietary by nature and not available
to the general public. A standard product, on the other hand, is produced by the
manufacturer for sale to the general public. Standard products are therefore readily
available for use by anybody for a wider range of applications.
The first ASICs were introduced in 1980. They used gate array technology known as
uncommitted logic arrays, or ULAs. They had a few thousand gates and were
customized by varying the mask for the metal interconnections. Thus, the
functionality of such a device can be varied by modifying which nodes in the
circuit are connected and which are not. Later versions became more generalized,
with customization involving variations in both the metal and polysilicon layers.
4.2. Classification
According to the technology used for manufacturing, ASICs are usually classified
into one of four categories: full-custom, semi-custom, structured, and gate array.
4.2.1. Full-custom ASICs

These are ASICs that are entirely tailored to a particular application from the
very start. Since the ultimate design and functionality is pre-specified by the
user, a full-custom ASIC is manufactured with all the photolithographic layers of
the device already fully defined, just like most off-the-shelf general purpose
ICs. The use of predefined masks for manufacturing leaves no option for circuit
modification during fabrication, except perhaps for some minor fine-tuning or
calibration. This means that a full-custom ASIC cannot be modified to suit
different applications, and is generally produced as a single, specific product
for a particular application only.
4.2.2. Semi-custom or Standard Cell ASICs

Semi-custom ASICs, on the other hand, can be partly customized to serve different
functions within their general area of application. Unlike full-custom ASICs,
semi-custom ASICs are designed to allow a certain degree of modification during
the manufacturing process. A semi-custom ASIC is manufactured with the masks for
the diffused layers already fully defined, so the transistors and other active
components of the circuit are already fixed for that semi-custom ASIC design. The
customization of the final ASIC product to the intended application is done by
varying the masks of the interconnection layers, e.g., the metallization layers.
Figure 4.3 shows the layout of a standard cell.
4.2.3. Structured or Platform ASICs
All the relatively new ASIC classifications come under this category. These ASICs
are designed and produced from a tightly defined set of:
- Design methodologies
- Intellectual property (IP) blocks
- Well-characterized silicon
Together, these are aimed at shortening the design cycle and minimizing the
development costs of the ASIC.
A platform ASIC is built from a group of 'platform slices', with a 'platform slice'
being defined as a pre-manufactured device, system, or logic for that platform.
Each slice used by the ASIC may be customized by varying its metal layers. The
re-use of pre-manufactured and pre-characterized platform slices means that
platform ASICs are not built from scratch, thereby minimizing design cycle time
and costs.
4.2.4. Gate Array based ASICs

In a gate-array-based ASIC, the transistors are predefined on the silicon wafer. The
predefined pattern of transistors on a gate array is the base array, and the smallest
element that is replicated to make the base array is the base cell. Only the top few
layers of metal, which define the interconnect between transistors, are defined by
the designer using custom masks. To distinguish this type of gate array from other
types of gate array, it is often called a masked gate array (MGA). The designer
chooses from a gate-array library of predesigned and pre-characterized logic cells.
The logic cells in a gate-array library are often called macros. The reason for this is
that the base-cell layout is the same for each logic cell, and only the interconnection
is customized, so that there is a similarity between gate-array macros and a
software macro.
We can complete the diffusion steps that form the transistors and then stockpile
wafers (sometimes we call a gate array a pre-diffused array for this reason). Since
only the metal interconnections are unique to an MGA, we can use the stockpiled
wafers for different customers as needed. Using wafers prefabricated up to the
metallization steps reduces the time needed to make an MGA, the turnaround time,
to a few days or at most a couple of weeks. The costs of all the initial
fabrication steps for an MGA are shared among customers, and this reduces the cost
of an MGA compared to a full-custom or standard-cell ASIC design.
