Lect 4 5 DSP 14-08-2015

Digital Signal Processing
BITS Pilani
Pilani|Dubai|Goa|Hyderabad
Previous class:
Importance of LTI systems for DSP
Different decompositions of DT signals
Convolution
Computation required in DSP
Today's class:
Evolution of DSP architecture
Numeric Representation used in DSP
Fixed point
Floating point
Analysis of computation required for FIR filter
Can you write the expression for an 8-tap FIR filter?
y[n] = a0 x[n] + a1 x[n-1] + a2 x[n-2] + ... + a7 x[n-7]
The most frequently recurring computation is multiplication followed by accumulation (MAC)
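The 8-tap FIR expression above can be sketched as a loop of MAC operations. This is only an illustrative sketch: the function name fir_mac, the coefficient values and the impulse input are assumptions chosen for the example, and samples before index 0 are assumed to be zero.

```python
def fir_mac(coeffs, samples, n):
    """Return y[n] = sum over k of coeffs[k] * samples[n - k].

    Samples before index 0 are assumed to be zero (illustrative choice).
    """
    acc = 0  # the accumulator
    for k, a_k in enumerate(coeffs):
        if n - k >= 0:
            acc += a_k * samples[n - k]  # one multiply-accumulate (MAC)
    return acc


# Hypothetical 8-tap coefficients a0..a7 and a unit impulse as input.
coeffs = [1, 2, 3, 4, 5, 6, 7, 8]
samples = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

# Feeding an impulse through the filter reproduces the coefficients.
output = [fir_mac(coeffs, samples, n) for n in range(len(samples))]
print(output)
```

Each output sample of an 8-tap filter costs up to eight MACs, which is why the rest of the lecture concentrates on how quickly a processor can execute one MAC.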
DSP processors
On-chip data memory to store the input samples x(n)
On-chip memory to store the filter coefficients h(n) (e.g. the coefficients ai in the FIR filter example)
On-chip memory to hold the program, i.e. the instructions that support DSP operations
DSP processors require multiple operands simultaneously. Hence, they should be able to fetch multiple operands and make multiple memory accesses in a single instruction cycle
Dedicated hardware multipliers and accumulators to carry out multiply-and-accumulate operations
BITS Pilani, Deemed to be University under Section 3 of UGC Act, 1956
DSP vs. GPP (general-purpose processor)
DSP | GPP
Real-time throughput requirement | No real-time throughput needed
Used in embedded applications | Used in desktop computing
Special features are provided to support DSP computations such as FFT and convolution | No special features
DSPs have a MAC unit.
DSP
Ordinary microprocessors need many clock cycles to perform add, shift and multiplication.
Most DSPs have specialized instructions that complete add, shift and save the result in 1 cycle.
DSP Architecture
Computers need instructions to operate
At every clock cycle they must be told what to do
If the instructions are already stored, the computer just has to fetch and execute them
These are called stored-program machines
Our computer has to fetch the instruction, operate on the
data and store the result
Courtesy : website www.elin.ttu.ee/~olev/lect1.pdf
What is the most suitable architecture for DSP?
Architectural evolution:
Von Neumann architecture
Designed by John von Neumann, a Hungarian-American mathematician.
(1) A single memory and a single bus for transferring data into and out of the CPU.
(2) The single memory is shared by both the program instructions and the data.
Most computers today are of the Von Neumann design.
How many cycles are needed for a MAC instruction on two numbers that reside in external memory?
1. Get the opcode of the instruction (the opcode specifies the operation, i.e. it tells the CPU what to do)
2. Get data1
3. Get data2
4. Multiply and accumulate, and store the result
(Assume that CPU computation takes very little time compared with memory access.)
So four cycles are needed.
Single-Cycle MAC unit
Can compute a sum of n products in n cycles
Harvard architecture
Developed at Harvard University (1940s)
Program instructions and data can be fetched at the same time, increasing overall processing speed.
Most present-day DSPs use this dual-bus architecture.
Ex: ADSP-21xx and AT&T's DSP16xx.
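Cycles needed for a MAC instruction in the Harvard architecture?
1. Instruction 1 is fetched from PM (program memory).
2. Instruction 1 is decoded; data1 is read from DM (data memory) and the coefficient from PM.
3. The MAC operation is performed and the result stored in DM; at the same time Instruction 2 is fetched from PM.
4. Instruction 2 is decoded; its data is read from DM and its coefficient from PM.
5. The MAC operation for Instruction 2 is performed and the result stored in DM; at the same time Instruction 3 is fetched from PM.
So a single MAC operation needs 3 cycles. Because the fetch of the next instruction overlaps the MAC step, each further MAC completes 2 cycles after the previous one.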
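The two cycle counts can be compared with a small model. This is a rough sketch under the slides' own assumption that memory access dominates and computation time is negligible; the function names are made up for illustration, and the Harvard figure assumes the overlap shown in the five steps above (Instruction 1 finishes at cycle 3, Instruction 2 at cycle 5).

```python
def von_neumann_cycles(num_macs):
    """Cycles for num_macs MACs on a single shared bus.

    Each MAC needs four sequential steps: opcode fetch, data1 fetch,
    data2 fetch, and multiply-accumulate with store.
    """
    return 4 * num_macs


def harvard_cycles(num_macs):
    """Cycles for num_macs MACs with separate program and data buses.

    The first MAC takes 3 cycles (fetch, decode + operand reads, execute).
    Each later MAC had its fetch overlapped with the previous execute
    step, so it adds only 2 more cycles.
    """
    if num_macs == 0:
        return 0
    return 3 + 2 * (num_macs - 1)


# One output sample of the 8-tap FIR filter needs 8 MACs.
print(von_neumann_cycles(8))  # 32
print(harvard_cycles(8))      # 17
```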
Modified Harvard architecture
Three memory banks
How many memory accesses are possible simultaneously?
It allows three independent memory accesses per instruction cycle.
Processors based on a three-bank modified Harvard architecture include the Zilog Z893xx and the Motorola DSP5600x and DSP563xx.
Multiple-Access Memories
Using fast memories that support multiple, sequential accesses per instruction cycle over a single set of buses
OR
Using multi-ported memories that allow multiple concurrent memory accesses over two or more independent sets of buses.
This arrangement provides one program memory access and two data memory accesses per instruction word.
Ex: Motorola DSP561xx processors.
Super Harvard Architecture (SHARC DSP)
Part of the program memory is used as data memory.
An instruction cache is included in the CPU.
In a program, which part is executed repeatedly?
The first pass through a loop is slower.
Later passes through the loop are faster, because the instructions are then supplied by the cache.
This means that all of the memory-to-CPU information transfers can be accomplished in a single cycle.
Ex: ADSP-2106x and the newer ADSP-211xx
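The effect of the instruction cache on a loop can be illustrated with a toy model. This sketch is an assumption-laden simplification, not a description of any real SHARC cache: the function name, the instruction addresses and the cache size are invented, and the cache simply stops accepting entries when full instead of applying a replacement policy.

```python
def program_memory_fetches(loop_body, passes, cache_size):
    """Count instruction fetches that must go to program memory.

    loop_body: addresses of the instructions inside the loop.
    passes: how many times the loop executes.
    cache_size: number of instructions the cache can hold.
    """
    cache = set()
    fetches = 0
    for _ in range(passes):
        for address in loop_body:
            if address not in cache:
                fetches += 1  # cache miss: use the program memory bus
                if len(cache) < cache_size:
                    cache.add(address)
    return fetches


loop_body = [100, 101, 102, 103]  # hypothetical 4-instruction loop

# Without a cache every pass fetches every instruction from program memory.
print(program_memory_fetches(loop_body, passes=10, cache_size=0))   # 40
# With a cache only the first pass misses; later passes leave the bus free.
print(program_memory_fetches(loop_body, passes=10, cache_size=32))  # 4
```

In this model, once the loop is cached the program memory bus is free on every later pass, so it can carry data (for example the filter coefficients) alongside the data memory access.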
Enhanced DSP architectures:
Very Long Instruction Word (VLIW) architecture:
VLIW CPUs have four to eight execution units.
One VLIW instruction encodes multiple operations.
Ex: if a VLIW device has four execution units, then a VLIW instruction for that device has four operation fields.
VLIW instructions are usually at least 64 bits wide.
VLIW CPUs use software (the compiler) to decide which operations can run in parallel.
Hardware complexity for instruction scheduling is reduced.
Ex: TMS320C6xx
Very Long Instruction Word (VLIW)
A technique for instruction-level parallelism: instructions without dependencies (known at compile time) are executed in parallel.
Example of a single VLIW instruction:
F=a+b; c=e/g; d=x&y; w=z*h;
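How a compiler might group independent operations into one VLIW instruction can be sketched as follows. This is a deliberately simple, in-order greedy packer written for illustration only; real VLIW compilers use far more elaborate scheduling. The function name pack_bundles and the representation of an operation as (destination, sources) are assumptions.

```python
def pack_bundles(ops, width=4):
    """Greedily pack operations into bundles of at most `width` operations.

    Each operation is (dest, sources). An operation starts a new bundle if
    the current bundle is full or if it conflicts with an operation already
    in it: it reads that operation's result, writes the same destination,
    or overwrites one of that operation's sources.
    """
    bundles = []
    current = []
    for dest, sources in ops:
        conflict = any(
            d in sources or d == dest or dest in s
            for d, s in current
        )
        if conflict or len(current) == width:
            bundles.append(current)
            current = []
        current.append((dest, sources))
    if current:
        bundles.append(current)
    return bundles


# The four operations from the slide: no operation uses another's result.
slide_ops = [
    ("F", ("a", "b")),
    ("c", ("e", "g")),
    ("d", ("x", "y")),
    ("w", ("z", "h")),
]
print(len(pack_bundles(slide_ops)))  # 1: all four fit in one VLIW instruction

# A dependent pair: the second operation needs the result t of the first.
dependent_ops = [("t", ("a", "b")), ("u", ("t", "c"))]
print(len(pack_bundles(dependent_ops)))  # 2: must go in separate instructions
```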
Endianness:
Big endian (MSB in first location)
Little endian (LSB in first location)
How will 12345678 be stored in four locations starting from 4000 in each case?
TI DSP: Little endian
Motorola DSP: Big endian
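The storage question can be checked with a short sketch. It assumes that 12345678 is a hexadecimal 32-bit value and that each of the four locations holds one byte; neither assumption is stated on the slide.

```python
value = 0x12345678  # assumed to be hexadecimal
base = 4000         # first of the four byte-wide locations

big = value.to_bytes(4, byteorder="big")        # most significant byte first
little = value.to_bytes(4, byteorder="little")  # least significant byte first

big_layout = {base + i: f"{b:02X}" for i, b in enumerate(big)}
little_layout = {base + i: f"{b:02X}" for i, b in enumerate(little)}

print(big_layout)     # {4000: '12', 4001: '34', 4002: '56', 4003: '78'}
print(little_layout)  # {4000: '78', 4001: '56', 4002: '34', 4003: '12'}
```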
Thanks for your attention