Lecture 6

Mekelle Institute of Technology
Embedded Systems (CSE507)

Department of Electronics and Communication Engineering/Computer Science and Engineering
Hardware Acceleration and Embedded Networks
Hardware Acceleration
Hardware acceleration is the use of computer hardware to perform some function faster than is possible in software running on a general-purpose CPU. Examples include the acceleration functionality in graphics processing units (GPUs) and instructions for complex operations in CPUs (from Wikipedia).
A conventional processor is sequential: instructions are executed one by one. Various techniques are used to improve performance, and hardware acceleration is one of them. The main difference between hardware and software is concurrency: hardware can perform many operations in parallel, which allows it to be much faster than software. Hardware accelerators are designed for computationally intensive code. Depending on granularity, hardware acceleration can range from a small functional unit to a large functional block.
When the hardware that performs the acceleration is a separate unit from the CPU, it is referred to as a hardware accelerator, or often more specifically as a graphics accelerator, floating-point accelerator, etc. Those terms, however, are older and have been replaced with less descriptive terms like video card or graphics card.
Many hardware accelerators are built on top of field-programmable gate array chips.
CPUs and accelerators
Accelerated Systems
Use an additional computational unit dedicated to some functions:
- hardwired logic
- extra CPU
- coprocessor
Applications

- graphics and multimedia (streaming data)
- encryption and compression
- communication devices (signal processing)
- supercomputing, numerical computations
Why Accelerators?
Better cost/performance.
Custom logic may be able to perform an operation faster than a CPU of equivalent cost.
CPU cost is a non-linear function of performance.
Better real-time performance.

Put time-critical functions on less-loaded processing elements.
Role of Performance Estimation
First, determine that the system really needs to be accelerated:
- How much faster is the accelerator on the core function (speedup)?
- How much data transfer overhead is there?
- How much overhead is there for synchronization with the CPU?
Performance estimation must be done on all levels of abstraction:
- Simulation-based methods (average case only; need to find test patterns that stress the system)
- Analytic methods (limited accuracy, but quick; used mainly on system level)
Accelerator Execution Time
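Total accelerator execution time:
t_accel = t_in + t_x + t_out
where t_in is the data input time, t_x is the accelerated computation time, and t_out is the data output time.
Input/output times (bus transactions) include:
- flushing register/cache values to main memory;
- time required for the CPU to set up the transaction;
- overhead of data transfers by bus packets, handshaking, etc.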
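As a rough illustration, the total time for one accelerator call can be modeled as input transfer plus accelerated computation plus output transfer. The sketch below assumes this additive model; the function names, the breakdown of the transfer cost, and all numbers (in nanoseconds) are hypothetical and not measurements.

```python
def io_time(setup, flush, words, per_word):
    """Hypothetical cost of one bus transfer.

    setup:    time for the CPU to set up the transaction
    flush:    time to flush register/cache values to main memory
    words:    number of words moved over the bus
    per_word: bus overhead per word (packets, handshaking)
    """
    return setup + flush + words * per_word


def accelerator_time(t_in, t_x, t_out):
    """Assumed additive model: input time + computation time + output time."""
    return t_in + t_x + t_out


# Illustrative values in nanoseconds (assumptions only).
t_in = io_time(setup=2000, flush=1000, words=64, per_word=50)
t_out = io_time(setup=2000, flush=0, words=16, per_word=50)
total = accelerator_time(t_in, t_x=10000, t_out=t_out)
print(t_in, t_out, total)
```

With these made-up numbers the transfers account for almost half of the total time, which is why the data transfer overhead has to be estimated before committing to an accelerator.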
Accelerator Gain
For simplification let us consider:

- an application consists of one task only, repeated n times;
- the task can be executed completely on the accelerator.
But in general:
- not all tasks can be executed on the accelerator;
- the CPU also has other obligations.
Possibilities:
- single-threaded/blocking: the CPU waits for the accelerator;
- multithreaded/non-blocking: the CPU continues to execute along with the accelerator; the CPU must have useful work to do, and the software must support multithreading.
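Under the simplifying assumptions above, one possible way to express the gain is the time saved over n repetitions. The sketch below also contrasts the blocking and non-blocking cases with an idealized model that ignores synchronization overhead. The function names, the model, and the numbers are assumptions for illustration only.

```python
def time_saved(n, t_cpu, t_accel):
    """Time saved when a task repeated n times moves from CPU to accelerator."""
    return n * (t_cpu - t_accel)


def blocking_total(n, t_accel, t_other):
    """Blocking: the CPU idles during each accelerator call, then does its other work."""
    return n * (t_accel + t_other)


def non_blocking_total(n, t_accel, t_other):
    """Non-blocking (idealized): CPU work overlaps the accelerator call,
    so each iteration takes as long as the slower of the two."""
    return n * max(t_accel, t_other)


# Illustrative numbers in microseconds (assumptions only).
n, t_cpu, t_accel, t_other = 100, 50, 19, 10
print(time_saved(n, t_cpu, t_accel))
print(blocking_total(n, t_accel, t_other))
print(non_blocking_total(n, t_accel, t_other))
```

The non-blocking case only pays off when the CPU really has other work (t_other > 0) that can run while the accelerator is busy.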
Accelerator/CPU Interface Issues
Synchronization
- via interrupts
- via special data and control registers at the accelerator
Data transfer to main memory
- assisted by DMA or special logic within the accelerator
- caching problems, as the CPU works on the cache and the accelerator works on main memory (remedies: declare the data area as non-cacheable, invalidate the cache after a transfer, use a write-through cache, ...)
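The caching problem can be illustrated with a toy model in which the CPU reads through a cache while the accelerator writes directly to main memory. This is a minimal sketch, not a model of any real cache; the class, its methods, and the address are hypothetical.

```python
class TinySystem:
    """Toy model: main memory plus a CPU-side cache, both plain dictionaries."""

    def __init__(self):
        self.memory = {}
        self.cache = {}

    def cpu_read(self, addr):
        # A cache hit returns the cached copy without consulting main memory.
        if addr not in self.cache:
            self.cache[addr] = self.memory.get(addr, 0)
        return self.cache[addr]

    def accelerator_write(self, addr, value):
        # The accelerator (or its DMA logic) writes to main memory only.
        self.memory[addr] = value

    def invalidate(self, addr):
        # Drop the cached copy so the next CPU read goes to main memory.
        self.cache.pop(addr, None)


system = TinySystem()
RESULT = 0x100                        # hypothetical address of a result buffer
first = system.cpu_read(RESULT)       # CPU caches the old value
system.accelerator_write(RESULT, 42)  # accelerator stores its result
stale = system.cpu_read(RESULT)       # CPU still sees the cached old value
system.invalidate(RESULT)             # invalidate the cache after the transfer
fresh = system.cpu_read(RESULT)       # CPU now sees the accelerator's result
print(first, stale, fresh)
```

Declaring the buffer non-cacheable would correspond to bypassing `self.cache` for that address altogether.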
System Design Issues
Hardware/software co-design
- meeting system-level objectives by exploiting the synergism of hardware and software through their concurrent design;
- joint design of hardware and software architectures.
Co-design problems have a different flavor according to the application domain, implementation technology, and design methodology.
Design a heterogeneous multiprocessor architecture:
- communication (bus, network, ...)
- memory architecture
- interfaces and I/O
- processing elements (CPU, application-specific integrated circuit (ASIC), field-programmable gate array (FPGA))
Program the system.
Networking for Embedded Systems

Why we use networks. Network abstractions. Example networks.
Overheads for Computers as Components
Network elements
[Figure: distributed computing platform. Processing elements (PEs) are attached through communication links to a network.]
PEs may be CPUs or ASICs.

Networks in embedded systems

[Figure: networked PEs between a sensor and an actuator; labels: initial processing, more processing.]
Why distributed?

- Higher performance at lower cost.
- Physically distributed activities: time constants may not allow transmission to a central site.
- Improved debugging: use one CPU in the network to debug others.
- May buy subsystems that have embedded processors.
Hardware architectures
Many different types of networks, which differ in:
- topology;
- scheduling of communication;
- routing.
Point-to-point networks
One source, one or more destinations, no data switching (serial port):
[Figure: PE 1 is connected to PE 2 by link 1, and PE 2 to PE 3 by link 2.]
Bus networks
Common physical connection:
[Figure: PE 1, PE 2, PE 3, and PE 4 attached to a shared bus.]
Packet format: header | address | data | ECC
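A packet with header, address, data, and check fields could be assembled and checked roughly as follows. This is a sketch under loud assumptions: one-byte header and address fields, a header that carries the payload length, and a simple XOR check byte standing in for the ECC field (it only detects some errors and corrects none). None of these choices come from a particular bus standard.

```python
def xor_check(payload: bytes) -> int:
    """XOR of all bytes: a stand-in for a real error-correcting code."""
    check = 0
    for byte in payload:
        check ^= byte
    return check


def build_packet(address: int, data: bytes) -> bytes:
    """Lay out header | address | data | check (field sizes are hypothetical)."""
    header = len(data)                      # assumption: header holds the data length
    body = bytes([header, address]) + data  # raises ValueError if a field exceeds one byte
    return body + bytes([xor_check(body)])


def parse_packet(packet: bytes):
    """Verify the check byte and return (address, data)."""
    body, check = packet[:-1], packet[-1]
    if xor_check(body) != check:
        raise ValueError("check byte mismatch")
    header, address = body[0], body[1]
    return address, body[2:2 + header]


packet = build_packet(address=3, data=b"\x10\x20")
address, data = parse_packet(packet)
print(packet.hex(), address, data.hex())
```

Every PE on the bus sees the packet; the address field is what lets a PE decide whether the data is meant for it.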
Bus arbitration

Fixed: same order every time.
Fair: every PE has the same access over long periods.
- round-robin: rotate top priority among PEs.

Requests    Fixed      Round-robin
A, B, C     A, B, C    A, B, C
A, B, C     A, B, C    B, C, A
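The difference between fixed and round-robin arbitration can be sketched as a small simulation. The function and class names are hypothetical, and the model simply grants every requesting PE in priority order within a round, which is an assumption about how requests are served.

```python
def fixed_priority(requests, order):
    """Grant requesting PEs in the same fixed order every round."""
    return [pe for pe in order if pe in requests]


class RoundRobinArbiter:
    """Rotate top priority among the PEs after each round."""

    def __init__(self, order):
        self.order = list(order)

    def grant_round(self, requests):
        granted = [pe for pe in self.order if pe in requests]
        # The PE that held top priority moves to the back of the order.
        self.order = self.order[1:] + self.order[:1]
        return granted


order = ["A", "B", "C"]
requests = {"A", "B", "C"}

fixed_rounds = [fixed_priority(requests, order) for _ in range(2)]
arbiter = RoundRobinArbiter(order)
rr_rounds = [arbiter.grant_round(requests) for _ in range(2)]
print(fixed_rounds)
print(rr_rounds)
```

With fixed arbitration A always goes first; with round-robin the second round starts with B, so over many rounds every PE gets the same share of first access.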
Multi-stage networks

- Use several stages of switching elements.
- Often blocking.
- Often smaller than a crossbar.
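To get a feel for "smaller than a crossbar", the sketch below compares crosspoint counts. It assumes one particular multi-stage organization that the slides do not name: N inputs (a power of two), log2(N) stages, N/2 two-by-two switches per stage, and four crosspoints per switch. The function names are hypothetical.

```python
def crossbar_crosspoints(n):
    """A full N x N crossbar needs one crosspoint per input/output pair."""
    return n * n


def multistage_crosspoints(n):
    """Crosspoints in an assumed log2(N)-stage network of 2x2 switches."""
    if n < 2 or n & (n - 1) != 0:
        raise ValueError("n must be a power of two, at least 2")
    stages = n.bit_length() - 1      # log2(n) for a power of two
    switches = stages * (n // 2)     # n/2 two-by-two switches per stage
    return 4 * switches              # four crosspoints per 2x2 switch


for n in (8, 64):
    print(n, crossbar_crosspoints(n), multistage_crosspoints(n))
```

The saving grows with N, but such a network is blocking: two transfers may need the same internal link at the same time.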
I2C bus
Designed for low-cost, medium-data-rate applications.
Characteristics:
- serial;
- multiple-master;
- fixed-priority arbitration.
Several microcontrollers come with built-in I2C controllers.
The CAN Bus
- Originally designed for automotive electronics; now used for other applications as well.
- Bit-serial transmission, 500 kb/s, over twisted pair, up to 40 m.
- Synchronous: nodes synchronize themselves by listening to the bit transitions on the bus.
- Arbitration by Carrier Sense Multiple Access with Arbitration on Message Priority (CSMA/AMP).
- For error handling, a special error frame and an overload frame are used, as well as acknowledgements.
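Arbitration on message priority can be sketched as a bitwise contest on the message identifier. On a CAN bus a dominant 0 overrides a recessive 1, so a node that sends 1 but reads back 0 stops transmitting, and the lowest identifier wins. The sketch assumes 11-bit identifiers that are unique per message; the function name is hypothetical.

```python
def can_arbitrate(identifiers, width=11):
    """Return the identifier that survives bitwise arbitration.

    The bus is modeled as wired-AND: the bus level for each bit is the
    minimum of the bits driven by all nodes still contending.
    """
    contenders = set(identifiers)
    if not contenders:
        raise ValueError("no node is transmitting")
    for bit in range(width - 1, -1, -1):   # most significant bit first
        bus = min((ident >> bit) & 1 for ident in contenders)
        # Nodes whose bit differs from the bus level lose and back off.
        contenders = {i for i in contenders if (i >> bit) & 1 == bus}
    (winner,) = contenders                  # unique identifiers leave one node
    return winner


pending = [0x123, 0x065, 0x064]
winner = can_arbitrate(pending)
print(hex(winner))
```

The losing nodes are not disturbed: they simply retry once the bus is idle again, which is why this scheme wastes no bus time on collisions.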
