des Etudes
Technologiques en
Communications
Embedded Processors
Bachelor's Degree (Licence) in Telecommunications
Moez BALTI
Outline
Introduction:
Why a processor in embedded systems?
General architecture: Harvard vs. Von Neumann
Processor types
RISC vs. CISC
ASIP: microcontroller, DSP
ARM architecture
ARM assembler
The processor
Classical Architecture
"Von Neumann" (1)
A single memory holds both the instructions and the data.
A central processing unit (CPU) reads the instructions from memory.
The CPU contains registers:
program counter (PC),
instruction register (IR),
general-purpose registers (GPR),
etc.
(1) John von Neumann (1903-1957): Hungarian-born American scientist who (with Charney) developed the structure of the first stored-program computer.
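CPU + memory
Instruction fetch
[Figure: the CPU places the PC value (200) on the address bus; the memory returns the instruction stored at address 200, ADD r5,r1,r3, on the data bus; the CPU loads it into IR.]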
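The fetch step shown on this slide can be sketched as a toy model. This is only an illustration: the `CPU` class, the dictionary used as memory and the 4-byte instruction step are assumptions of the sketch, not part of the slides.

```python
# Toy model of a Von Neumann instruction fetch (illustrative only).
memory = {200: "ADD r5,r1,r3"}   # one shared memory for code and data


class CPU:
    def __init__(self, pc):
        self.pc = pc      # program counter: address of the next instruction
        self.ir = None    # instruction register: instruction just fetched

    def fetch(self, mem, step=4):
        address_bus = self.pc          # the PC drives the address bus
        data_bus = mem[address_bus]    # memory answers on the data bus
        self.ir = data_bus             # the instruction is loaded into IR
        self.pc += step                # assumed: fixed 4-byte instructions
        return self.ir


cpu = CPU(pc=200)
fetched = cpu.fetch(memory)
```

After the call, `cpu.ir` holds the instruction that was stored at address 200 and `cpu.pc` points to the next instruction slot.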
Harvard Architecture
Two data buses and two address buses
[Figure: block diagram of a Harvard machine: CPU (with PC), data memory, and separate address and data buses.]
Von Neumann vs. Harvard
Microprocessor architectures
RISC vs. CISC
Complex instruction set computer (CISC):
Architecture with a complex instruction set
Several memory addressing modes are available:
Base + displacement: data = M[100 + reg]
Indirect: data = M[[address]]
Pre-incremented, ...
Large number of instructions (complex semantics)
Variable-length instruction encoding: the number of bytes needed to encode an instruction varies
Code size is generally smaller than with RISC (code density)
Example: x86 (Intel, AMD)
RISC vs. CISC (continued)
Reduced instruction set computer (RISC):
Architecture with a reduced instruction set and a simple structure
Only two instructions access memory: load and store
Regular instruction format
Large number of registers
Some registers are visible to the (assembly) programmer; others are managed by the hardware
The simplified hardware frees silicon area, which can be used for a cache (fewer accesses to external memory)
Examples: ARM, UltraSPARC, ...
Pipelining
ISA: Instruction Set Architecture
ISA: the logical architecture visible to the low-level programmer or to the OS
Crusoe
ISA: x86
The code actually executed by the processor is VLIW, not x86
x86 code is translated into Crusoe code on the fly by software (Code Morphing)
Better efficiency, more compact code
Flexible solution
A very long instruction word (VLIW, 128-bit) processor that is x86-compatible
Other Solutions
ASIP: application-specific instruction set processor
A programmable processor optimized for a group of applications.
The instruction set is extended with application-specific instructions, for example:
mac R1, R2, R3 : multiply-accumulate, R1 += R2*R3
ASIP: a good trade-off between performance and flexibility.
High design cost. A good middle ground between a GPP and an ASIC.
Research topic:
Automatic (or semi-automatic) generation of the architecture, the associated compiler and the development environment
Two kinds of ASIP: microcontrollers and DSPs
Microcontroller
Microcontroller: a microprocessor able to read and control a single I/O bit (serial mode).
I/O interfaces are generally integrated into the microcontroller (serial communication interface, timers, counters, analog-to-digital converter).
Example of a microcontroller
Intel 8051: 12 MHz clock, 4 KB ROM and 128 words of RAM, UART (1 parallel port and several serial ports), timers; 8-bit bus width, performance 1 MIPS, power 0.2 W
DSP
DSP (Digital Signal Processor): a processor specialized in signal processing, which involves large amounts of data
A signal is data received continuously, representing for example images from a camera, voice, music, ...
A DSP contains several register banks and multiplier circuits
The instruction set is extended with instructions frequently used in signal processing, for example SIMD instructions:
The same operation is applied in parallel to several data items, for example
addMultiple8 R1, R2, R3
Registers R1, R2 and R3 are treated as arrays of several 8-bit registers, for example 4*8 or 16*8; the additions are performed in parallel
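A minimal sketch of what an instruction like addMultiple8 computes, assuming 32-bit registers split into four 8-bit lanes and wrap-around (non-saturating) addition in each lane. The function name `add_multiple8` and both of these choices are assumptions made for illustration; the slide does not specify them.

```python
def add_multiple8(r2, r3, lanes=4):
    """Add two packed registers lane by lane; each lane is 8 bits wide."""
    result = 0
    for lane in range(lanes):
        shift = 8 * lane
        a = (r2 >> shift) & 0xFF
        b = (r3 >> shift) & 0xFF
        # Masking keeps the carry from spilling into the next lane.
        result |= ((a + b) & 0xFF) << shift
    return result


r1 = add_multiple8(0x01FF7F10, 0x01010101)
```

The loop here is sequential only because this is a software model; in hardware the lanes are added at the same time. Note how the lane holding 0xFF wraps to 0x00 without disturbing its neighbour.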
DSP (continued)
An example: filtering a signal
A filter is a processing step applied to a signal (analog or digital) to improve its quality (noise removal); see the example that follows.
Digital solution: robust to external conditions, easy to modify and to reproduce
Finite Impulse Response (FIR) Filter
[Figure: FIR filter structure with coefficients C(0), C(1), ..., C(n-2), C(n-1).]
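An FIR filter computes each output sample as a weighted sum of the most recent inputs, y[k] = C(0)*x[k] + C(1)*x[k-1] + ... + C(n-1)*x[k-n+1], which is exactly a chain of multiply-accumulate (MAC) steps. The sketch below assumes this standard direct form and treats samples before the start of the signal as zero; the function name `fir_filter` is hypothetical.

```python
def fir_filter(samples, coeffs):
    """Direct-form FIR: y[k] = sum_i C(i) * x[k - i], with x[j] = 0 for j < 0."""
    output = []
    for k in range(len(samples)):
        acc = 0
        for i, c in enumerate(coeffs):
            if k - i >= 0:
                acc += c * samples[k - i]   # one multiply-accumulate step
        output.append(acc)
    return output


# Two-tap moving average applied to a noisy alternating signal.
smoothed = fir_filter([4, 8, 4, 8, 4], [0.5, 0.5])
```

With the two equal coefficients the alternation between 4 and 8 is flattened to a constant 6 after the first sample, which is the noise-removal effect the previous slide describes.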
Application-Specific Circuit
ASIC: Application Specific Integrated Circuit
Integrated circuits for specific applications
May contain a modifiable (soft) CPU core
Peripherals
Generally designed to carry out a single application (one program)
Characteristics
High development cost
Justified for large volumes
For large companies
Diagram of an ASIC
This is a system-on-chip (SoC) made of several IP blocks:
Processor, cache, ...
Analog unit (for I/O)
DSP
+ 1 standard bus (for example AMBA)
Coming soon: MPSoC (MultiProcessor SoC), several processors in one IC together with other units
[Figure labels: processor core, Icache, Dcache, internal buses, LCD and keyboard interfaces, Bluetooth interface, antenna.]
ARM technology presentation
ARM Presentation
ARM Ltd, founded in November 1990:
Designs the ARM range of RISC processors
Licenses ARM core designs to semiconductor partners, who fabricate and sell them
ARM: Advanced RISC Machine
Developed at Acorn Computers Ltd, Cambridge, England (1983-1985)
First RISC microprocessor for commercial use
Architectural simplicity: low code size, low silicon area
Very small implementation
Very low power consumption
Competitive, easy to develop for, and cheap
Advantages of using ARM
ARM core-based microprocessors are commonly used in industry because of:
Tools of choice: ARM has the widest range of hardware and software tool support of any 32-bit architecture
Ease of access: more than 10 leading microcontroller suppliers provide ARM-based MCUs
Flexibility in system design: a wide range of functionality and power, with parts running from 1 MHz to 1 GHz and architectural performance enhancements for media and Java
Low cost of silicon: processors and other products make efficient use of silicon in any kind of chip design
ARM Core characteristics
32-bit RISC processor core
32-bit ARM instruction set
16-bit Thumb instruction set
8/16/32-bit data types
37 32-bit integer registers (16 available at any one time)
Processor cores: ARM6, ARM7, ARM9, ARM10, ARM11
Extensions: Thumb, El Segundo, Jazelle (executes Java bytecode)
IP blocks: UART, GPIO, memory controllers, etc.
Von Neumann-type bus structure (ARM7)
Harvard structure (ARM9)
7 modes of operation
ARM Architecture variants
T  Thumb instruction set
M  Long multiply support (32x32 -> 64 bits)
E  Enhanced DSP instructions
J  Jazelle: Java support
Pipeline:
Core             Stages   Clock frequency
ARM7             3        < 150 MHz
ARM9-StrongARM   5        < 233 MHz
ARM10            6        266-325 MHz
XScale           7        100 MHz-1 GHz
ARM11            8        350 MHz-1 GHz
ARM Architecture versions
ARMv4: 32-bit ISA operating in a 32-bit address space.
ARMv4T: added the 16-bit Thumb instruction set, which enabled compilers to generate more compact code while retaining all the benefits of a 32-bit system.
ARMv5TE: Thumb architecture, along with ARM "Enhanced" DSP instruction set extensions to the ARM ISA.
ARMv5TEJ: the Jazelle extension to support Java acceleration technology.
ARMv6: includes media instructions to support Single Instruction Multiple Data (SIMD) software execution (video and audio codec applications).
NEON(TM) Media Acceleration Technology: 64/128-bit hybrid SIMD for high-performance, media-intensive, low-power mobile handheld devices.
Vector Floating Point (VFP): coprocessor support is an architecture option. The VFP architecture supports single- and double-precision floating-point arithmetic.
ARM Technologies
ARM organization
ARM7 processor core
ARM7 core interface signals
Processor modes
ARM has seven basic operating modes:
User: unprivileged mode for normal program execution
FIQ: entered when a high-priority (fast) interrupt is raised
IRQ: entered when a low-priority (normal) interrupt is raised
Supervisor: protected mode for operating system support (entered on reset and when a Software Interrupt (SWI) instruction is executed)
Abort: for implementing virtual memory or memory protection
Undefined: supports software emulation of hardware coprocessors
System: for running privileged operating system tasks
Mode changes may be made under software control or may be caused by external interrupts or exception processing.
Most application programs execute in User mode. The other, privileged modes are entered to service interrupts or exceptions or to access protected resources.
ARM 7 Registers
ARM 7 register set
Registers are arranged into several banks; the processor mode determines which bank is accessible.
Each mode can access
a particular set of r0-r12 registers
a particular r13 (the stack pointer) and r14 (the link register)
r15 (the program counter)
cpsr (the current program status register)
Privileged modes (except System) can also access
a particular SPSR (saved program status register)
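The banking scheme can be checked by enumerating what each mode sees. The sketch below assumes the usual ARM7 arrangement, which the text above does not spell out: FIQ has private copies of r8-r14, the other exception modes have private copies of r13 and r14 only, and System shares the User registers. The lowercase names such as `r13_irq` and `spsr_irq` are labels chosen for this sketch.

```python
SHARED = [f"r{i}" for i in range(8)] + ["r15", "cpsr"]   # same in every mode
BANK_SUFFIX = {              # mode -> suffix of its private registers
    "user": None, "system": None,    # System shares the User registers
    "fiq": "fiq", "irq": "irq", "svc": "svc", "abt": "abt", "und": "und",
}


def visible_registers(mode):
    suffix = BANK_SUFFIX[mode]
    regs = list(SHARED)
    for i in range(8, 15):
        # FIQ banks r8-r14; the other exception modes bank only r13 and r14.
        banked = suffix is not None and (suffix == "fiq" or i >= 13)
        regs.append(f"r{i}_{suffix}" if banked else f"r{i}")
    if suffix is not None:
        regs.append(f"spsr_{suffix}")    # User and System have no SPSR
    return regs


all_registers = {r for mode in BANK_SUFFIX for r in visible_registers(mode)}
```

Counting the distinct names gives the 37 registers quoted earlier (31 general-purpose plus 6 status registers), while any single mode sees r0-r15, the CPSR and at most one SPSR.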
ARM 7 registers in user mode
LR & PC registers
Register 14 is used as the subroutine Link Register (LR). It receives a copy of R15 when a Branch and Link (BL) instruction is executed. At all other times it may be treated as a general-purpose register. The corresponding banked registers R14_svc, R14_irq, R14_fiq, R14_abt and R14_und are similarly used to hold the return values of R15 when interrupts and exceptions arise, or when Branch and Link instructions are executed within interrupt or exception routines.
Register 15 holds the Program Counter (PC). In ARM state, bits [1:0] of R15 are zero and bits [31:2] contain the PC. In THUMB state, bit [0] is zero and bits [31:1] contain the PC.
Program status registers
Processor has:
1 Current Program Status Register (CPSR)
5 Saved Program Status Registers (SPSRs) for
exception handlers to use
Program Status Registers functions:
Hold information about the most recently performed ALU
operation
Control the enabling and disabling of interrupts
Set the processor operating mode
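The three functions listed above map onto bit fields of the CPSR. The sketch below decodes them using the ARM7TDMI layout (flags N, Z, C, V in bits 31-28, interrupt disable bits I and F in bits 7 and 6, Thumb state bit T in bit 5, mode in bits 4-0). These bit positions and mode encodings are not given in the text above, so treat them as an assumption of the sketch; `decode_cpsr` is a hypothetical helper.

```python
MODES = {
    0b10000: "User", 0b10001: "FIQ", 0b10010: "IRQ", 0b10011: "Supervisor",
    0b10111: "Abort", 0b11011: "Undefined", 0b11111: "System",
}


def decode_cpsr(cpsr):
    return {
        "N": (cpsr >> 31) & 1,                   # result was negative
        "Z": (cpsr >> 30) & 1,                   # result was zero
        "C": (cpsr >> 29) & 1,                   # carry out / no borrow
        "V": (cpsr >> 28) & 1,                   # signed overflow
        "irq_disabled": bool((cpsr >> 7) & 1),   # I bit
        "fiq_disabled": bool((cpsr >> 6) & 1),   # F bit
        "thumb": bool((cpsr >> 5) & 1),          # T bit
        "mode": MODES.get(cpsr & 0x1F, "reserved"),
    }


state = decode_cpsr(0x600000D3)
```

The example value has Z and C set, both interrupt types disabled, and the mode field selecting Supervisor.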
ARM 7 Data types and memory format
Data types
ARM7 processor supports the following data types:
32-bit words
16-bit halfwords
8-bit bytes
Data alignment must be as follows:
Align words to four-byte boundaries
Align halfwords to two-byte boundaries
Align bytes to byte boundaries
Memory transfers (Load & Store): bytes, halfwords and words
Signed operands are assumed to be in two's complement format
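The alignment rules amount to requiring that an address be a multiple of the access size. A minimal sketch, with hypothetical helper names `is_aligned` and `align_down`:

```python
ACCESS_SIZE = {"word": 4, "halfword": 2, "byte": 1}   # size in bytes


def is_aligned(address, kind):
    """True if `address` is a multiple of the access size."""
    return address % ACCESS_SIZE[kind] == 0


def align_down(address, kind):
    """Round `address` down to the nearest correctly aligned address."""
    return address & ~(ACCESS_SIZE[kind] - 1)
```

For example, address 0x1002 is a valid halfword address but not a valid word address; the nearest word boundary below it is 0x1000.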
Memory format
Memory: linear collection of bytes numbered in
ascending order from 0:
bytes 0-3 hold the first stored word
bytes 4-7 hold the second stored word
Little-Endian data format
Byte with the lowest address in a word is the least-significant byte of the word
Byte with the highest address in a word is the most significant byte of the word
Byte at address 0 of the memory system connects to data lines 7-0
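The byte order can be observed with Python's standard `struct` module, whose `"<I"` format packs an unsigned 32-bit integer in little-endian order. The value 0x11223344 is an arbitrary example.

```python
import struct

word = 0x11223344
stored = struct.pack("<I", word)       # bytes in ascending address order
lowest_address_byte = stored[0]        # least significant byte of the word
highest_address_byte = stored[3]       # most significant byte of the word
reloaded = struct.unpack("<I", stored)[0]
```

The byte at the lowest address is 0x44, the least significant byte, and reading the four bytes back as a little-endian word restores the original value.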
Exceptions
Definition
Exceptions are usually used to handle unexpected events which arise during the execution of a program.
These events are all grouped under the "exception" heading because they all use the same basic mechanism within the ARM processor.
Exception groups
Exceptions generated as the direct effect of executing an instruction: software interrupts, undefined instructions (including coprocessor instructions where the requested coprocessor is absent) and prefetch aborts (instructions that are invalid due to a memory fault occurring during fetch).
Exceptions generated as a side effect of an instruction: data aborts (a memory fault during a load or store data access).
Exceptions generated externally, unrelated to the instruction flow: Reset, IRQ and FIQ fall into this category.
ARM programming modes & Assembler syntax
Assembler definition
Machine-level language: a set of binary instructions defined to be understood and executed by a given microprocessor.
Programming model: depends on the registers, flags and interrupt definitions of the target microprocessor.
Assembler language: defines instructions in mnemonic form, i.e. operation codes that each represent a function the processor will perform (examples: JUMP, ADD, MOV, ...).
Assembler structure: a code line has two parts; the first is the name of the instruction to be executed, the second holds the command parameters (example: add ah, bh).
Assembling: using a specific software tool to convert symbolic assembler instructions into executable machine code.
AREA directive
An AREA is a chunk of data or code manipulated by the linker.
A complete application consists of one or more AREAs.
ENTRY directive
It marks the first instruction within an application.
An application can contain only a single ENTRY point. In an application with multiple assembler modules, only one module contains an ENTRY directive.
General form of lines in an assembler module
label <white space> instruction <white space> ; comment
The label, instruction, and comment must be separated by at least one whitespace character.
The label must start in the first column.
An instruction never starts in the first column, even if there is no label.
All three sections of the line are optional, and the assembler also accepts blank lines to improve the clarity of the code.
Assembly program example
Data definition and alignment directives
Data processing instructions
Addition and subtraction
Multiplication
Shifts
Single data transfer
Compares and tests
Logical operations
Load and store operations
Addressing modes of single-register loads and stores
Loading constants into registers
Loading addresses into registers
Conditional execution and loops
Subroutines
Memory-mapped peripherals
Floating-point computation
ARM Instruction Code Format
Instruction: general coding
ARM instruction mnemonics
ARM condition codes
Data processing instructions
Data processing instruction binary encoding
Addition and subtraction
The arithmetic instructions in the ARM instruction set perform addition, subtraction, and reverse subtraction, each with and without carry (ADD, ADC, SUB, SBC, RSB, RSC).
Multiply instructions
The ARM7 processor has dedicated logic for performing multiplication.
Multiplication by a constant can be done with a shift-and-add instruction or a shift-and-reverse-subtract instruction.
MUL r4,r2,r1 ; r4 = r2 x r1
MULS r4,r2,r1 ; r4 = r2 x r1, then set the N and Z flags
MLA r7,r8,r9,r3 ; r7 = r8 x r9 + r3
MUL and MLA instructions
MUL and MLA are multiply and multiply-and-accumulate instructions that produce 32-bit results.
MUL multiplies the values in two registers, truncates the result to 32 bits, and stores the product in a third register.
MLA multiplies two registers, adds the value of a third register to the product, truncates the result to 32 bits, and stores it in a fourth register.
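The truncation to 32 bits and the shift-based multiplication by a constant can be modelled as follows. This is a sketch of the arithmetic only; the function names are hypothetical, and the two constant multipliers mirror the forms ADD rd, r, r, LSL #2 (times 5) and RSB rd, r, r, LSL #3 (times 7), chosen here as examples.

```python
MASK32 = 0xFFFFFFFF


def mul(rm, rs):
    """MUL: product truncated to the low 32 bits."""
    return (rm * rs) & MASK32


def mla(rm, rs, rn):
    """MLA: product plus accumulator, truncated to 32 bits."""
    return (rm * rs + rn) & MASK32


def times_five(r):
    """Shift and add: r + (r << 2)."""
    return (r + (r << 2)) & MASK32


def times_seven(r):
    """Shift and reverse subtract: (r << 3) - r."""
    return ((r << 3) - r) & MASK32
```

Multiplying 0x10000 by itself gives 2^32, whose low 32 bits are all zero, so `mul(0x10000, 0x10000)` returns 0: the upper half of the product is simply lost.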
Multiply long instructions
Multiply long instructions produce 64-bit results. They multiply the values of two registers and store the 64-bit result in a third and a fourth register.
SMULL and UMULL are the signed and unsigned multiply long instructions.
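The difference between the signed and unsigned forms shows up in the upper half of the result. The sketch below returns the two destination registers as a (low word, high word) pair; the helper names are hypothetical and the model ignores flag updates.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Interpret a 32-bit pattern as a two's complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def umull(rm, rs):
    """Unsigned 32x32 -> 64 multiply, returned as (low word, high word)."""
    product = (rm & MASK32) * (rs & MASK32)
    return product & MASK32, (product >> 32) & MASK32


def smull(rm, rs):
    """Signed 32x32 -> 64 multiply, returned as (low word, high word)."""
    product = to_signed32(rm) * to_signed32(rs)
    product &= (1 << 64) - 1          # 64-bit two's complement pattern
    return product & MASK32, product >> 32
```

The bit pattern 0xFFFFFFFF is 4294967295 for UMULL but -1 for SMULL, so multiplying it by 2 yields the same low word and a different high word.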
Logical operations
ARM supports Boolean logic operations using two register operands.
Logical Shift Left
Shifter rotate operations
MOV R0, R1, RRX
R0 receives the value of R1 rotated right through the Carry flag by one bit. The MSB of the result is the current Carry flag, and the Carry flag takes the LSB of R1. The value of R1 is not changed.
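RRX behaves like a 33-bit rotation in which the Carry flag is the extra bit. A minimal sketch of the shifter, returning the result together with the carry-out (on ARM the carry-out is written to the C flag only when the instruction carries the S suffix, as in MOVS); the function name `rrx` is hypothetical.

```python
def rrx(value, carry_in):
    """Rotate right with extend: returns (result, carry_out)."""
    carry_out = value & 1                          # old bit 0 leaves the register
    result = ((value & 0xFFFFFFFF) >> 1) | (carry_in << 31)
    return result, carry_out
```

For example, with R1 = 0x00000003 and the Carry flag set, the result is 0x80000001 and the carry-out is 1.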
Data processing examples
Load and Store Register
Classes of instructions:
Load and Store Register
Load and Store Multiple registers
Swap register and memory contents
Load Register instructions can load a 32-bit word, a 16-bit halfword or an 8-bit byte from memory into a register. Byte and halfword loads can be automatically zero-extended or sign-extended as they are loaded.
Store Register instructions can store a 32-bit word, a 16-bit halfword or an 8-bit byte from a register to memory.
Load and Store Register addressing modes
Offset addressing: the memory address is formed by adding an offset to, or subtracting it from, the base register value.
Pre-indexed addressing: the memory address is formed in the same way as for offset addressing. As a side effect, the memory address is also written back to the base register.
Post-indexed addressing: the memory address is the base register value. As a side effect, an offset is added to or subtracted from the base register value and the result is written back to the base register.
In each case, the offset can be either an immediate or the value of an index register. Register-based offsets can also be scaled with shift operations.
Data transfer instruction binary encoding
Offset addressing
LDR R0, [R1]
Loads register R0 with the 32-bit word at the memory address held in register R1. An offset of zero is assumed.
LDR R0, [R1, #4]
Loads register R0 with the word at the memory address calculated by adding the constant 4 to the address contained in R1.
LDR R0, [R1, R2]
Loads register R0 with the value at the memory address calculated by adding the value in R1 to the value held in R2.
LDR R0, [R1, R2, LSL #2]
Loads register R0 with the 32-bit value at the memory address calculated by adding the value in R1 to the value obtained by shifting R2 left by 2 bits.
Indexed addressing
Pre-indexed addressing
Post-indexed addressing
The address of the data transfer is calculated from the unmodified value
in the base register, Rn. Then the offset is added to the value in Rn and
written back to Rn.
The T flag is used for operating systems in memory management
environments and is not used here.
Pre-Index addressing
LDR R0, [R1, #4]!
Loads register R0 with the word at the memory address calculated by adding the constant 4 to the address contained in R1. The new memory address is written back into the base register R1.
LDR R0, [R1, R2, LSL #2]!
First calculates the new address by adding the value in the base register, R1, to the value obtained by shifting the offset register, R2, left by 2 bits. It then loads the 32-bit value at this address into the destination register, R0. The new address is also written back into the base register, R1. The offset register, R2, is not affected by this operation.
Post-Index addressing
LDR R0, [R1], #4
Loads register R0 with the word at the memory address contained in the base register, R1. It then calculates the new value of R1 by adding the constant 4 to the current value of R1.
LDR R0, [R1], R2, LSL #2
First loads the 32-bit value at the memory address contained in the base register, R1, into the destination register, R0. It then calculates the new value of the base register by adding its current value to the value obtained by shifting the offset register, R2, left by 2 bits. The offset register, R2, is not affected by this operation.
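The three single-register addressing modes differ only in which address is used for the transfer and whether the base register is updated. The sketch below models them with a dictionary as memory; the `ldr` helper, its `mode` argument and the sample memory contents are assumptions made for illustration, and the case where the destination and base are the same register is ignored.

```python
def ldr(memory, regs, rd, rn, offset=0, mode="offset"):
    """Model of LDR rd, [rn ...]; mode is 'offset', 'pre' or 'post'."""
    base = regs[rn]
    address = base if mode == "post" else base + offset
    regs[rd] = memory[address]
    if mode in ("pre", "post"):
        regs[rn] = base + offset        # write-back to the base register
    return address


memory = {0x100: 11, 0x104: 22, 0x108: 33}
regs = {"R0": 0, "R1": 0x100, "R2": 1}
trace = []

ldr(memory, regs, "R0", "R1", 4)                        # LDR R0, [R1, #4]
trace.append((regs["R0"], regs["R1"]))
ldr(memory, regs, "R0", "R1", 4, "post")                # LDR R0, [R1], #4
trace.append((regs["R0"], regs["R1"]))
ldr(memory, regs, "R0", "R1", regs["R2"] << 2, "pre")   # LDR R0, [R1, R2, LSL #2]!
trace.append((regs["R0"], regs["R1"]))
```

Each entry of `trace` records R0 and R1 after one instruction: the plain offset form leaves R1 at 0x100, the post-indexed form loads from the old R1 and then advances it, and the pre-indexed form advances R1 first and loads from the new address.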
Load and Store Multiple Registers
Load Multiple (LDM) and Store Multiple (STM)
instructions perform a block transfer of any number of
the general-purpose registers to or from memory.
Addressing modes
• pre-increment
• post-increment
• pre-decrement
• post-decrement
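The four modes determine which block of memory the registers occupy relative to the base address. The sketch below uses the standard ARM suffixes IB (increment before, i.e. pre-increment), IA (increment after), DB (decrement before) and DA (decrement after); the helper `block_addresses` is hypothetical and assumes 4-byte registers and base write-back.

```python
def block_addresses(base, count, mode):
    """Return (word addresses in ascending order, written-back base)."""
    size = 4 * count
    if mode == "IA":        # post-increment
        start = base
    elif mode == "IB":      # pre-increment
        start = base + 4
    elif mode == "DA":      # post-decrement
        start = base - size + 4
    elif mode == "DB":      # pre-decrement
        start = base - size
    else:
        raise ValueError(f"unknown mode: {mode}")
    new_base = base + size if mode[0] == "I" else base - size
    return [start + 4 * i for i in range(count)], new_base
```

For three registers and a base of 0x1000, IA uses 0x1000-0x1008 while DB uses 0xFF4-0xFFC. In every mode the lowest-numbered register is transferred at the lowest address.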
Multiple transfer instruction binary coding
Multiple transfer instruction
Swap memory and register instruction binary encoding & assembler syntax
Loops and subroutines
Compare and test instructions
Condition code flags update
For a data processing instruction to update the condition code flags, the instruction must carry the S suffix.
The exceptions are CMP, CMN, TST, and TEQ, which always update the flags, because updating the flags is their only real function.
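As an illustration of what "updating the flags" means, the sketch below computes the four condition flags that CMP produces. CMP performs the subtraction Rn - Operand2 and discards the result; on ARM the C flag after a subtraction means "no borrow". The helper name `cmp_flags` is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def cmp_flags(rn, operand2):
    """Flags produced by CMP rn, operand2 (the subtraction rn - operand2)."""
    rn &= MASK32
    operand2 &= MASK32
    result = (rn - operand2) & MASK32
    return {
        "N": result >> 31,                              # bit 31 of the result
        "Z": int(result == 0),                          # operands were equal
        "C": int(rn >= operand2),                       # set when no borrow
        "V": ((rn ^ operand2) & (rn ^ result)) >> 31,   # signed overflow
    }
```

Comparing equal values sets Z and C; comparing 3 with 5 sets N and clears C, which is what conditions such as EQ, NE, LT or HS then test.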
Loop structures
Three basic types of loops
for loops
while loops
do {…..} while loops
While and do Loops
While loop
Because the number of iterations of a while loop is not a constant, these structures tend to be somewhat simpler.
There is only one branch in the loop itself. The first branch takes you directly into the loop code.
Do ... while loops
The loop body is executed before the expression is evaluated. The structure is the same as the while loop but without the initial branch.
Subroutine structure
The branch and link instruction (BL) performs a branch in the same manner as the branch instruction. It uses the Link Register, r14, to save the address of the instruction that follows the branch.
Branch conditions