
Institut Supérieur des Etudes Technologiques en Communications

Embedded Processors

Licence (bachelor's degree) in Telecommunications

Moez BALTI

Outline
• Introduction:
  - Why put a processor in an embedded system (ES)?
  - General architecture: Harvard vs. Von Neumann
• Types of processors
  - RISC vs. CISC
  - ASIP: microcontroller, DSP
• ARM architecture
• ARM assembly language

The processor

• Why use a programmable unit?
• Why not use a fully hardware (hardwired) solution?
  - Software makes it faster and easier to specialise an embedded system (ES).
  - The system is quick and easy to debug.
  - It is a low-cost solution.
• In most embedded systems, software (SW) dominates the cost of the system, and the share of software in embedded systems keeps growing.

Classical architecture: "Von Neumann" (1)
• A single memory holds both the instructions and the data.
• A central processing unit (CPU) reads the instructions from memory.
• The CPU contains registers:
  - program counter (PC),
  - instruction register (IR),
  - general-purpose registers (GPR),
  - etc.

(1) John Von Neumann (1903-1957): Hungarian-born American scientist who (with Charney) developed the structure of the first stored-program computer.

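CPU + memory

Instruction fetch

[Figure: the CPU places the PC value (200) on the address bus; the memory returns the instruction stored at address 200, ADD r5,r1,r3, on the data bus, and the CPU latches it into the IR.]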

Harvard architecture
Two data buses and two address buses

[Figure: the CPU (with its PC) is connected to a data memory and to a program memory, each through its own address bus and data bus.]

Von Neumann vs. Harvard

• The Harvard architecture allows two simultaneous reads (one instruction and one data item).
• Most DSPs (digital signal processors) use the Harvard architecture.
  - Note: a DSP is a processor whose architecture is dedicated to signal processing (matrix operations).
• It is attractive for applications that process data on the fly (streaming data):
  - It provides a higher data bandwidth (more data read or written per cycle).
  - Since data and instructions are read over separate buses, there is less interference and the available bandwidth is more predictable.

Von Neumann vs. Harvard

Microprocessor architectures

RISC vs. CISC
• Complex instruction set computer (CISC): architecture with a complex instruction set
  - Several memory addressing modes are available:
    - Base + displacement: data = M[100 + reg]
    - Indirect: data = M[[address]]
    - Pre-incremented, ...
  - Large number of instructions (complex semantics)
  - Variable-length instruction encoding: the number of bytes needed to encode an instruction varies
  - Code size is generally smaller than on RISC machines (code density)
  - Example: x86 (Intel, AMD)

RISC vs. CISC (continued)
• Reduced instruction set computer (RISC): architecture with a reduced instruction set and a simple structure
  - Only two instructions access memory: load and store
  - Regular instruction format
  - Large number of registers
  - Some registers are visible to the (assembly) programmer; others are managed by the hardware
  - Simplified hardware structure = freed-up silicon area
  - Cache: fewer accesses to external memory
  - Examples: ARM, UltraSPARC, ...

Pipelining

ISA: Instruction Set Architecture
• ISA: the logical architecture visible to the low-level programmer or to the OS.
• The ISA is not necessarily the real architecture.
• Example: there may be registers and/or functional units that the user does not see; the program is interpreted so that it can run on the real architecture, and the ISA is then a virtual architecture.
  - Dynamic binary translation
• Example: Transmeta Crusoe

Crusoe
• ISA: x86
• The code executed by the processor is VLIW, not x86.
• x86 code is translated into Crusoe code on the fly, in software (Code Morphing).
• Better efficiency, more compact code, flexible solution.
• A very long instruction word (VLIW, 128-bit) processor that is x86-compatible.

Other solutions
• ASIP (application-specific instruction set processor): a programmable processor optimised for a group of applications.
• The instruction set is extended with application-specific instructions, for example:
  mac R1, R2, R3 : multiply and accumulate, R1 += R2*R3
• ASIP: a good trade-off between performance and flexibility, at a high design cost. It sits between a general-purpose processor (GPP) and an ASIC.
• Research topics:
  - Automatic (or semi-automatic) generation
  - Architecture, associated compiler, development environment
• Two types of ASIP: microcontrollers and DSPs

Microcontroller
• Microcontroller: a microprocessor equipped with means of reading and controlling a single I/O bit (serial mode).
• I/O interfaces are generally integrated into the microcontroller (serial communication interface, timers, counters, analog-to-digital converter).
• Example of a microcontroller: the Intel 8051: 12 MHz clock, 4 KB of ROM and 128 bytes of RAM, a UART (one serial port) and four 8-bit parallel I/O ports, timers; 8-bit bus width; performance 1 MIPS; power 0.2 W.

Example of a microcontroller

DSP
• DSP (digital signal processor): a processor specialised in signal processing, which involves large amounts of data.
  - A signal is data received continuously, representing for example images from a camera, voice, music, ...
  - A DSP contains several groups of registers and multiplier circuits.
  - The instruction set is extended with instructions frequently used in signal processing, for example SIMD instructions:
    - Several operations performed in parallel on several data items, for example
      addMultiple8 R1, R2, R3
      Registers R1, R2 and R3 are treated as arrays of several 8-bit registers, for example 4×8 or 16×8 bits, and the additions are performed in parallel.
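As a rough illustration of the packed addition described above, the following Python sketch models `addMultiple8` for 32-bit registers holding four 8-bit lanes. The function name `add_multiple8` and the wrap-around (non-saturating) behaviour of each lane are assumptions; the slide does not say how a lane overflow is handled.

```python
def add_multiple8(r2: int, r3: int) -> int:
    """Add four unsigned 8-bit lanes packed into 32-bit values, lane by lane."""
    result = 0
    for lane in range(4):
        shift = 8 * lane
        a = (r2 >> shift) & 0xFF
        b = (r3 >> shift) & 0xFF
        # Assumption: each lane wraps modulo 256; no carry leaks into the next lane.
        result |= ((a + b) & 0xFF) << shift
    return result


r1 = add_multiple8(0x01FF1020, 0x01010101)
print(hex(r1))  # 0x2001121
```

The lane holding 0xFF wraps to 0x00 without disturbing its neighbour, which is the property that distinguishes a packed add from an ordinary 32-bit add.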

DSP (continued)
• An example: filtering a signal
  - A filter is a processing step applied to a signal (analog or digital) to improve its quality (for example by removing noise); see the example that follows.
  - Digital solution: robust to external conditions, easy to modify and reproducible.

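FINITE IMPULSE RESPONSE FILTER

FIR (finite impulse response): a filter widely used on DSPs. Like the FFT, it is built from repeated multiply-accumulate operations.
for (i = 0, y(n) = 0; i < n; i++) y(n) += C(i) * x(n-i);

[Figure: tapped delay line. The samples x(n-1), x(n-2), ..., x(1), x(0) are separated by delay elements and weighted by the coefficients C(0), C(1), ..., C(n-2), C(n-1).]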
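The multiply-accumulate loop of the FIR filter can be sketched in Python as follows. This is a minimal model under stated assumptions: the helper name `fir` is hypothetical, the number of taps is taken from the length of the coefficient list, and samples before the start of the signal are treated as zero.

```python
def fir(coeffs, samples):
    """Return y where y[n] = sum over i of coeffs[i] * samples[n - i]."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for i, c in enumerate(coeffs):
            if n - i >= 0:                   # assumption: x(k) = 0 for k < 0
                acc += c * samples[n - i]    # one multiply-accumulate per tap
        out.append(acc)
    return out


print(fir([1, 1], [1, 2, 3]))           # [1, 3, 5]
print(fir([3, 2, 1], [1, 0, 0, 0]))     # [3, 2, 1, 0]
```

Feeding a single 1 followed by zeros returns the coefficients themselves and then zeros, which is what "finite impulse response" means.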

Application-specific circuits
• ASIC: Application-Specific Integrated Circuit
  - Integrated circuits for specific applications
  - May contain a modifiable (software) CPU core
  - Peripherals
  - Generally designed to implement a single application (one program)
• Characteristics
  - High development cost
  - Justified for large volumes
  - For large companies

Application-specific circuits (continued)

• An ASIC is generally built by assembling several blocks called IP (intellectual property) blocks.
• An IP block = a piece of hardware
  - Examples: processor core, memory, bus interface, network interface, graphics accelerator, ...
• Hardware IP: the company buys a mask (the process used to manufacture the IC).
• Software IP: the company buys a software description of the unit in the form of a program (written in a C/C++-like language such as Verilog or SystemC, or an Ada-like language such as VHDL, ...).

Diagram of an ASIC
This is a system-on-chip (SoC) made up of several IP blocks: processor, cache, ..., an analog unit (for I/O), a DSP, plus one standard bus (for example AMBA).
Coming soon: MPSoC (multiprocessor SoC), several processors in one IC together with other units.

[Figure labels: internal busses, Dcache, Icache, processor core, antenna, LCD & keyboard interfaces, Bluetooth interface.]

ARM technology presentation

ARM microprocessor core architecture & hardware organization

ARM Presentation
 ARM Ltd Founded in November 1990:
 Designs the ARM range of RISC processor
 Licenses ARM core designs to semiconductor
partners who fabricate and sell
 ARM: Advanced RISC Machine
 Developed at Acorn Computers Ltd, Cambridge, England
(1983-1985)
 First RISC microprocessor for commercial use
 Architectural simplicity: low code size, low silicon area
 Very small implementation
 Very low power consumption
 Competitive, easy to develop and cheap

Advantages of using ARM
• ARM core-based microprocessors are commonly used in industry because of:
  - Tools of choice: ARM has the widest range of hardware and software tool support of any 32-bit architecture.
  - Ease of access: more than 10 leading microcontroller suppliers provide ARM-based MCUs.
  - Flexibility in system design: a wide range of functionality and power, with parts running from 1 MHz to 1 GHz and architectural performance enhancements for media and Java.
  - Low cost of silicon: processors and other products make efficient use of silicon, whatever kind of chip is being designed.

ARM core characteristics
• 32-bit RISC processor core
• 32-bit ARM instruction set
• 16-bit Thumb instruction set
• 8/16/32-bit data types
• 37 32-bit registers in total (16 available at any one time)
• Processor cores: ARM6, ARM7, ARM9, ARM10, ARM11
• Extensions: Thumb, El Segundo, Jazelle (executes Java bytecode)
• IP blocks: UART, GPIO, memory controllers, etc.
• Von Neumann-type bus structure (ARM7)
• Harvard structure (ARM9)
• 7 modes of operation

ARM architecture variants
• T: Thumb instruction set
• M: long multiply support (32 x 32 -> 64 bits)
• E: enhanced DSP instructions
• J: Jazelle (Java support)
• Pipeline depth and clock frequency:
  Core               Stages   Clock frequency
  ARM7               3        < 150 MHz
  ARM9, StrongARM    5        < 233 MHz
  ARM10              6        266-325 MHz
  XScale             7        100 MHz-1 GHz
  ARM11              8        350 MHz-1 GHz

ARM architecture versions
• ARMv4: 32-bit ISA operating in a 32-bit address space.
• ARMv4T: added the 16-bit Thumb instruction set, which enables compilers to generate more compact code while retaining all the benefits of a 32-bit system.
• ARMv5TE: Thumb architecture, along with ARM "Enhanced" DSP instruction set extensions to the ARM ISA.
• ARMv5TEJ: adds the Jazelle extension to support Java acceleration technology.
• ARMv6: includes media instructions to support Single Instruction Multiple Data (SIMD) software execution (video and audio codec applications).
• NEON(TM) Media Acceleration Technology: 64/128-bit hybrid SIMD for high-performance, media-intensive, low-power mobile handheld devices.
• Vector Floating Point (VFP): coprocessor support is an architecture option. The VFP architecture supports single- and double-precision floating-point arithmetic.

ARM Technologies

ARM organization

ARM7 processor core

ARM single-cycle instruction pipeline operation

ARM7 core interface signals

Processor modes
• ARM has seven basic operating modes:
  - User: unprivileged mode for normal program execution
  - FIQ: entered when a high-priority (fast) interrupt is raised
  - IRQ: entered when a low-priority (normal) interrupt is raised
  - Supervisor: protected mode for operating system support (entered on reset and when a Software Interrupt (SWI) instruction is executed)
  - Abort: for implementing virtual memory or memory protection
  - Undefined: supports software emulation of hardware coprocessors
  - System: for running privileged operating system tasks
• Mode changes may be made under software control or may be caused by external interrupts or exception processing.
• Most application programs execute in User mode. The other, privileged modes are entered to service interrupts or exceptions or to access protected resources.

ARM 7 registers

ARM 7 register set
• Registers are arranged into several banks; the bank that is accessible is governed by the processor mode.
• Each mode can access:
  - a particular set of r0-r12 registers
  - a particular r13 (the stack pointer) and r14 (the link register)
  - r15 (the program counter)
  - cpsr (the current program status register)
• Privileged modes (except System) can also access:
  - a particular SPSR (saved program status register)

ARM 7 registers

ARM 7 registers in user mode

• 16 32-bit integer registers, r0-r15 (user mode):
  - r0-r12: general-purpose registers
  - r13: Stack Pointer (SP)
  - r14: subroutine Link Register (LR)
  - r15: Program Counter (PC)
• CPSR: Current Program Status Register (a separate status register, not one of r0-r15)
• SPSR: Saved Program Status Register (exists only in the privileged exception modes; not accessible in User mode)

LR & PC registers
• Register 14 is used as the subroutine Link Register (LR). It receives the return address (derived from R15) when a Branch and Link (BL) instruction is executed.
• At all other times it may be treated as a general-purpose register. The corresponding banked registers R14_svc, R14_irq, R14_fiq, R14_abt and R14_und are similarly used to hold the return values of R15 when interrupts and exceptions arise, or when Branch and Link instructions are executed within interrupt or exception routines.
• Register 15 holds the Program Counter (PC). In ARM state, bits [1:0] of R15 are zero and bits [31:2] contain the PC. In Thumb state, bit [0] is zero and bits [31:1] contain the PC.

Program status registers
 Processor has:
 1 Current Program Status Register (CPSR)
 5 Saved Program Status Registers (SPSRs) for
exception handlers to use
 Program Status Registers functions:
 Hold information about the most recently performed ALU
operation
 Control the enabling and disabling of interrupts
 Set the processor operating mode

Program status registers
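The register layout figure for this slide did not survive extraction. As a hedged stand-in, the Python sketch below decodes a CPSR value using the bit assignment commonly documented for the ARM7TDMI (condition flags N, Z, C, V in bits 31-28; interrupt disable bits I and F in bits 7 and 6; Thumb state bit T in bit 5; mode in bits 4-0). The names `MODES` and `decode_cpsr` are hypothetical, and the layout is not taken from the slide itself.

```python
# Mode field encodings (bits 4-0), as documented for ARMv4T cores.
MODES = {0x10: "User", 0x11: "FIQ", 0x12: "IRQ", 0x13: "Supervisor",
         0x17: "Abort", 0x1B: "Undefined", 0x1F: "System"}


def decode_cpsr(cpsr: int) -> dict:
    """Split a 32-bit CPSR value into its flag, control and mode fields."""
    return {
        "N": (cpsr >> 31) & 1,                    # negative result
        "Z": (cpsr >> 30) & 1,                    # zero result
        "C": (cpsr >> 29) & 1,                    # carry
        "V": (cpsr >> 28) & 1,                    # signed overflow
        "irq_disabled": bool((cpsr >> 7) & 1),    # I bit
        "fiq_disabled": bool((cpsr >> 6) & 1),    # F bit
        "thumb": bool((cpsr >> 5) & 1),           # T bit
        "mode": MODES.get(cpsr & 0x1F, "reserved"),
    }


print(decode_cpsr(0x600000D3))
```

The value 0x600000D3 decodes to Supervisor mode with both interrupts disabled, ARM state, and the Z and C flags set.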

ARM 7 data types and memory format

Data types
 ARM7 processor supports the following data types:
 32-bit words
 16-bit halfwords
 8-bit bytes
 Data alignment must be as follows:
 Align words to four-byte boundaries
 Align halfwords to two-byte boundaries
 Align bytes to byte boundaries
 Memory transfer (Load & Store): bytes, halfwords
and words
 Signed operands are assumed to be in two’s
complement format

Memory format
• Memory: a linear collection of bytes numbered in ascending order from 0:
  - bytes 0-3 hold the first stored word
  - bytes 4-7 hold the second stored word
• The processor can treat words in memory as being stored in:
  - little-endian format (the default memory format)
  - big-endian format

Little-Endian data format

Byte with the lowest address in a word is the least-significant byte of the word
Byte with the highest address in a word is the most significant byte of the word
Byte at address 0 of the memory system connects to data lines 7-0
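The byte ordering described above can be checked with Python's standard `struct` module. This is only an illustration of the two layouts for an arbitrary example word (0x11223344); it does not model the ARM bus.

```python
import struct

word = 0x11223344
little = struct.pack("<I", word)   # little-endian: least-significant byte first
big = struct.pack(">I", word)      # big-endian: most-significant byte first

print(little.hex())  # 44332211
print(big.hex())     # 11223344
print(hex(little[0]))  # 0x44, the byte at the lowest address
```

In the little-endian layout the byte at the lowest address (index 0) is 0x44, the least-significant byte of the word, matching the rule stated on the slide.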

Exceptions
• Definition
  - Exceptions are usually used to handle unexpected events that arise during the execution of a program.
  - These events are all grouped under the "exception" heading because they all use the same basic mechanism within the ARM processor.
• Exception groups
  - Exceptions generated as the direct effect of executing an instruction: software interrupts, undefined instructions (including coprocessor instructions where the requested coprocessor is absent) and prefetch aborts (instructions that are invalid due to a memory fault occurring during fetch).
  - Exceptions generated as a side effect of an instruction: data aborts (a memory fault during a load or store data access).
  - Exceptions generated externally, unrelated to the instruction flow: Reset, IRQ and FIQ fall into this category.

ARM programming modes & assembler syntax

Assembler definition
• Machine-level language: a set of binary instructions defined to be understood and executed by a given microprocessor.
• Programming model: depends on the registers, flags and interrupts defined by the target microprocessor.
• Assembly language: defines instructions in mnemonic form, that is, operation codes that represent a function the processor will perform (examples: JUMP, ADD, MOV, ...).
• Assembler structure: each code line has two parts. The first is the name of the instruction to be executed; the second holds the parameters of the command (example: add ah, bh).
• Assembling: using a specific software tool to convert symbolic assembler instructions into executable machine code.

AREA directive
• An AREA is a chunk of data or code manipulated by the linker.
• A complete application consists of one or more AREAs.
ENTRY directive
• It marks the first instruction to be executed within an application.
• An application can contain only a single ENTRY point. In an application with multiple assembler modules, only one module contains an ENTRY directive.
General form of lines in an assembler module
label <white space> instruction <white space> ; comment
• The label, instruction, and comment must be separated by at least one whitespace character.
• The label must start in the first column.
• An instruction never starts in the first column, even if there is no label.
• All three sections of the line are optional, and the assembler also accepts blank lines to improve the clarity of the code.

Assembly program example

Data definition and alignment directives

• The DCB directive allocates one or more bytes of memory, and defines the initial runtime contents of the memory.
• The DCQ directive allocates one or more eight-byte blocks of memory, aligned on four-byte boundaries, and defines the initial runtime contents of the memory.
• The DCD directive allocates one or more words of memory, aligned on four-byte boundaries, and defines the initial runtime contents of the memory.
• The DCW directive allocates one or more halfwords of memory, aligned on two-byte boundaries, and defines the initial runtime contents of the memory.
• SETEND BE or SETEND LE chooses big- or little-endian operation (little-endian is the default).
• The ALIGN directive aligns the current location to a specified boundary by padding with zeros.

 Data processing instructions
 Addition and subtraction
 Multiplication
 Shifts
 Single data transfer
 Compares and tests
 Logical operations
 Loads and Stores operations
 Addressing modes of single-register loads and stores
 loading constants into registers
 Loading addresses into registers
 Conditional execution and loops
 Subroutines
 Memory mapped peripherals
 Floating point computation

ARM instruction code format

Instruction: general coding

ARM instruction mnemonics

ARM condition codes

Data processing instructions

Data processing instruction binary encoding

Addition and subtraction
• The arithmetic instructions in the ARM instruction set perform addition (ADD), subtraction (SUB) and reverse subtraction (RSB), each with a with-carry variant (ADC, SBC, RSC).

Multiply instructions
 ARM7 processor has dedicated logic for performing multiplication.
Multiplication by a constant can be done with a shift and add instruction
or a shift and reverse subtract instruction.
MUL r4,r2,r1 ; r4 = r2 x r1
MULS r4,r2,r1 ; r4 = r2 x r1 then set the N and Z flag
MLA r7,r8,r9,r3 ; r7 = r8 x r9 + r3
 MUL and MLA instructions
 MUL and MLA are multiply and multiply-and-accumulate
instructions that produce 32-bit results.
 MUL multiplies the values in two registers, truncates the
result to 32 bits, and stores the product in a third register.
 MLA multiplies two registers, adds the value of a third
register to the product, truncates the results to 32 bits,
and stores the result in a fourth register.
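The truncation to 32 bits described above can be sketched in Python, whose integers are unbounded and therefore need an explicit mask. The helper names `mul`, `mla` and `nz_flags` are hypothetical; `nz_flags` shows what the S suffix (as in MULS) would record in the N and Z flags.

```python
MASK32 = 0xFFFFFFFF


def mul(rm: int, rs: int) -> int:
    """MUL Rd, Rm, Rs: product truncated to the low 32 bits."""
    return (rm * rs) & MASK32


def mla(rm: int, rs: int, rn: int) -> int:
    """MLA Rd, Rm, Rs, Rn: product plus accumulator, truncated to 32 bits."""
    return (rm * rs + rn) & MASK32


def nz_flags(result: int) -> tuple:
    """Return (N, Z) as an S-suffixed multiply would set them."""
    return (result >> 31) & 1, int(result == 0)


r4 = mul(0x00010000, 0x00010000)   # true product is 2**32, which does not fit
print(hex(r4), nz_flags(r4))       # 0x0 (0, 1)
print(mla(3, 4, 5))                # 17
```

The first call shows why the 64-bit "long" multiplies exist: the upper half of the product is simply lost by MUL.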

Multiply long instructions
• Multiply long instructions produce 64-bit results. They multiply the values of two registers and store the 64-bit result in a third and fourth register.
SMULL and UMULL are the signed and unsigned multiply long instructions:

SMULL r4,r8,r2,r1 ; r4 = bits 31-0 of r2 x r1
                  ; r8 = bits 63-32 of r2 x r1
UMULL r6,r8,r0,r1 ; {r8,r6} = r0 x r1 (r6 = low word, r8 = high word)

SMLAL and UMLAL are the signed and unsigned multiply-long-and-accumulate instructions. They multiply the values of two registers, add the 64-bit value held in a third and fourth register, and store the 64-bit result back in the third and fourth registers.
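A minimal Python model of the long multiplies, assuming the operand order used in the examples above (low result word first, then high word). The helper names are hypothetical, and only UMLAL is shown for the accumulate form.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(x: int) -> int:
    """Interpret a 32-bit pattern as a two's complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def umull(rm: int, rs: int) -> tuple:
    """Unsigned 32 x 32 -> 64 multiply; returns (RdLo, RdHi)."""
    p = (rm & MASK32) * (rs & MASK32)
    return p & MASK32, p >> 32


def smull(rm: int, rs: int) -> tuple:
    """Signed 32 x 32 -> 64 multiply; returns (RdLo, RdHi)."""
    p = (to_signed32(rm) * to_signed32(rs)) & MASK64
    return p & MASK32, p >> 32


def umlal(rdlo: int, rdhi: int, rm: int, rs: int) -> tuple:
    """Unsigned multiply-accumulate into the 64-bit value {RdHi, RdLo}."""
    acc = (((rdhi << 32) | rdlo) + (rm & MASK32) * (rs & MASK32)) & MASK64
    return acc & MASK32, acc >> 32


print([hex(v) for v in umull(0xFFFFFFFF, 0xFFFFFFFF)])  # ['0x1', '0xfffffffe']
print([hex(v) for v in smull(0xFFFFFFFF, 0xFFFFFFFF)])  # ['0x1', '0x0']
```

The same bit pattern 0xFFFFFFFF gives different high words because UMULL reads it as 4294967295 while SMULL reads it as -1.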

Logical operations
 ARM supports Boolean logic operations using
two register operands.

Logical Shift Left

Shifts the value of a register up, towards the MSB, by n bits.
The number of bits to shift is specified by either a constant value or another register.
The vacated lower bits are filled with zeros, so the shift is a multiplication by a power of 2 (×2^n).
MOV R0, R1, LSL #2    R0 becomes the value of R1 shifted left by 2 bits. The value of R1 is not changed.
MOV R0, R1, LSL R2    R0 becomes the value of R1 shifted left by the number of bits specified in the R2 register. R0 is the only register to change; R1 and R2 are not affected by this operation.
If the instruction is to set the status register (S suffix), the carry flag (C) is the last bit that was shifted out of the value.
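A small Python sketch of the left shift and its carry-out, under the assumption of 32-bit registers. The helper `lsl` is hypothetical; it returns `None` for the carry when the shift amount is zero, to signal that the carry flag would be left unchanged.

```python
MASK32 = 0xFFFFFFFF


def lsl(value: int, n: int) -> tuple:
    """Return (result, carry_out) for a logical shift left by n bits."""
    value &= MASK32
    if n == 0:
        return value, None                     # carry flag is not modified
    # The last bit shifted out is bit (32 - n) of the original value.
    carry = (value >> (32 - n)) & 1 if n <= 32 else 0
    return (value << n) & MASK32, carry


print(lsl(3, 2))                # (12, 0): 3 x 2**2
print(lsl(0x80000001, 1))       # (2, 1): the MSB falls into the carry
```

The second call shows the limit of the "multiplication by 2^n" reading: once set bits are shifted out of the top, the result is only correct modulo 2^32.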

Logical and arithmetic shifts

Shifter rotate operations

MOV R0, R1, RRX    R0 becomes the value of R1 rotated right through the carry flag by one bit. The MSB of the result is the current carry flag; if the instruction sets the status register (S suffix), the new carry flag is the LSB of R1. The value of R1 is not changed.
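The 33-bit rotation through carry can be modelled in a few lines of Python. The helper `rrx` is hypothetical and returns both the result and the carry-out that an S-suffixed instruction would write to the C flag.

```python
MASK32 = 0xFFFFFFFF


def rrx(value: int, carry_in: int) -> tuple:
    """Rotate right by one bit through the carry flag; returns (result, carry_out)."""
    value &= MASK32
    result = ((carry_in & 1) << 31) | (value >> 1)   # old carry enters at the MSB
    carry_out = value & 1                            # old LSB leaves into the carry
    return result, carry_out


print([hex(v) for v in rrx(0x00000003, 1)])  # ['0x80000001', '0x1']
```

Applying `rrx` 33 times in a row, feeding each carry-out back in, returns the original value and carry, since the register and the flag together form a 33-bit ring.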

Data processing examples

• This instruction adds the contents of register r1 to the value 0x31400000, then stores the result in register r0. The barrel shifter creates this operand by rotating 0xC5 ten bits to the right. The number of bits by which the 8-bit value is rotated must be even.

• This instruction shifts the contents of r2 10 bits to the right, subtracts the shifted result from the value in r1, and stores the result in register r0.
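The rule for immediate operands (an 8-bit value rotated right by an even amount) can be checked with the Python sketch below. The helpers `ror32` and `encode_immediate` are hypothetical names; the search returns the first encoding found, or `None` when the constant cannot be expressed this way.

```python
MASK32 = 0xFFFFFFFF


def ror32(value: int, n: int) -> int:
    """Rotate a 32-bit value right by n bits."""
    n %= 32
    return ((value >> n) | (value << (32 - n))) & MASK32


def encode_immediate(const: int):
    """Return (imm8, rotate) with ror32(imm8, rotate) == const, or None."""
    for rotate in range(0, 32, 2):            # only even rotations are allowed
        imm8 = ror32(const, 32 - rotate)      # rotating left undoes the ROR
        if imm8 <= 0xFF:
            return imm8, rotate
    return None


print(hex(ror32(0xC5, 10)))             # 0x31400000
print(encode_immediate(0x31400000))     # (197, 10), i.e. 0xC5 rotated by 10
print(encode_immediate(0x101))          # None: the set bits span 9 bits
```

Constants that return `None` have to be built in several steps or loaded from memory, which is the motivation for the "loading constants into registers" topic listed in the outline.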

Load and Store Register
Classes of instructions:
• Load and Store Register
• Load and Store Multiple registers
• Swap register and memory contents

Load Register instructions can load a 32-bit word, a 16-bit halfword or an 8-bit byte from memory into a register. Byte and halfword loads can be automatically zero-extended or sign-extended as they are loaded.
Store Register instructions can store a 32-bit word, a 16-bit halfword or an 8-bit byte from a register to memory.

Load and Store Register addressing modes
• Offset addressing: the memory address is formed by adding an offset to, or subtracting it from, the base register value.
• Pre-indexed addressing: the memory address is formed in the same way as for offset addressing. As a side effect, the memory address is also written back to the base register.
• Post-indexed addressing: the memory address is the base register value. As a side effect, an offset is added to or subtracted from the base register value and the result is written back to the base register.
In each case, the offset can be either an immediate or the value of an index register. Register-based offsets can also be scaled with shift operations.

Data transfer instruction binary encoding

Offset addressing

 LDR R0, [R1] load the register R0 with the 32-bit word at
the memory address held in the register R1. An offset of zero is
assumed
 LDR R0, [R1, #4] load the register R0 with the word at the
memory address calculated by adding the constant value 4 to the
memory address contained in the R1 register
 LDR R0, [R1, R2] Loads the register R0 with the value at the
memory address calculated by adding the value in the register R1
to the value held in the register R2
 LDR R0, [R1,R2, LSL #2] load the register R0 with the 32-bit
value at the memory address calculated by adding the value in
the R1 register to the value obtained by shifting the value in R2
left by 2 bits.

Indexed addressing
• Pre-indexed addressing
  The address of the data transfer is calculated by adding the offset to the value in the base register, Rn.
  The optional ! specifies writing the new address back into Rn at the end of the instruction.
  The optional B selects an unsigned byte transfer, but the default is word, so you do not have to add anything in most cases.

• Post-indexed addressing
  The address of the data transfer is calculated from the unmodified value in the base register, Rn. Then the offset is added to the value in Rn and written back to Rn.
  The T flag is used for operating systems in memory management environments and is not used here.

Pre-index addressing

LDR R0, [R1, #4]!
    Loads the register R0 with the word at the memory address calculated by adding the constant value 4 to the memory address contained in the R1 register. The new memory address is placed back into the base register R1.

LDR R0, [R1, R2, LSL #2]!
    First calculates the new address by adding the value in the base address register, R1, to the value obtained by shifting the value in the offset register, R2, left by 2 bits. It then loads the 32-bit word at this address into the destination register, R0. The new address is also written back into the base register, R1. The offset register, R2, is not affected by this operation.

Post-index addressing

LDR R0, [R1], #4
    Loads the register R0 with the word at the memory address contained in the base register, R1. It then calculates the new value of R1 by adding the constant value 4 to the current value of R1.

LDR R0, [R1], R2, LSL #2
    First loads the 32-bit value at the memory address contained in the base register, R1, into the destination register, R0. It then calculates the new value of the base register by adding the current value to the value obtained by shifting the value in the offset register, R2, left by 2 bits. The offset register, R2, is not affected by this operation.
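The three addressing modes differ only in which address is used for the access and whether the base register is updated. The Python sketch below models that with a dictionary of registers and a dictionary standing in for word-addressed memory; the function `ldr` and its `mode` parameter are hypothetical, and a scaled register offset is passed in already shifted (for example `regs["r2"] << 2`).

```python
MASK32 = 0xFFFFFFFF


def ldr(regs: dict, mem: dict, rd: str, rn: str, offset: int = 0, mode: str = "offset"):
    """Model LDR with offset, pre-indexed ("pre") or post-indexed ("post") addressing."""
    base = regs[rn]
    updated = (base + offset) & MASK32
    address = base if mode == "post" else updated   # post-index uses the old base
    regs[rd] = mem[address]
    if mode in ("pre", "post"):
        regs[rn] = updated                          # write-back to the base register


mem = {0x100: 11, 0x104: 22}

regs = {"r0": 0, "r1": 0x100}
ldr(regs, mem, "r0", "r1", 4)               # LDR R0, [R1, #4]
print(regs)                                 # {'r0': 22, 'r1': 256}

regs = {"r0": 0, "r1": 0x100}
ldr(regs, mem, "r0", "r1", 4, "pre")        # LDR R0, [R1, #4]!
print(regs)                                 # {'r0': 22, 'r1': 260}

regs = {"r0": 0, "r1": 0x100}
ldr(regs, mem, "r0", "r1", 4, "post")       # LDR R0, [R1], #4
print(regs)                                 # {'r0': 11, 'r1': 260}
```

Post-indexing is the natural form for walking an array: each load returns the current element and leaves the base pointing at the next one.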

Load and Store Multiple Registers
• Load Multiple (LDM) and Store Multiple (STM) instructions perform a block transfer of any number of the general-purpose registers to or from memory.
• Addressing modes:
  - pre-increment
  - post-increment
  - pre-decrement
  - post-decrement
• The base address is specified by a register value, which can optionally be updated after the transfer.
• LDM and STM instructions also allow very efficient code for block copies and similar data movement algorithms.
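The four addressing modes can be sketched in Python as follows, using the usual suffixes IA/IB/DA/DB (increment or decrement, after or before). The helpers `block_addresses`, `stm` and `ldm` are hypothetical names; registers are a 16-entry list, memory is a dictionary keyed by byte address, and the lowest-numbered register always goes to the lowest address. The case where the base register itself appears in the register list is not modelled.

```python
def block_addresses(base: int, count: int, mode: str) -> list:
    """Word addresses touched by a block transfer, in ascending order."""
    start = {"IA": base, "IB": base + 4,
             "DA": base - 4 * count + 4, "DB": base - 4 * count}[mode]
    return [start + 4 * i for i in range(count)]


def stm(regs, mem, rn, reglist, mode="IA", writeback=False):
    """Store the listed registers to memory."""
    for addr, r in zip(block_addresses(regs[rn], len(reglist), mode), sorted(reglist)):
        mem[addr] = regs[r]
    if writeback:
        regs[rn] += 4 * len(reglist) if mode[0] == "I" else -4 * len(reglist)


def ldm(regs, mem, rn, reglist, mode="IA", writeback=False):
    """Load the listed registers from memory."""
    for addr, r in zip(block_addresses(regs[rn], len(reglist), mode), sorted(reglist)):
        regs[r] = mem[addr]
    if writeback:
        regs[rn] += 4 * len(reglist) if mode[0] == "I" else -4 * len(reglist)


regs, mem = [0] * 16, {}
regs[0], regs[1], regs[13] = 1, 2, 0x1000
stm(regs, mem, 13, [0, 1], "DB", writeback=True)   # like STMDB r13!, {r0, r1}
regs[0] = regs[1] = 0
ldm(regs, mem, 13, [0, 1], "IA", writeback=True)   # like LDMIA r13!, {r0, r1}
print(regs[0], regs[1], hex(regs[13]))             # 1 2 0x1000
```

The store-decrement-before followed by load-increment-after pair is the push and pop of a full descending stack, which is why the stack pointer ends up back where it started.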

Multiple transfer instruction binary coding

Multiple transfer instruction

Rn        An expression evaluating to a valid register number
<Rlist>   A list of registers and register ranges enclosed in {} (e.g. {R0,R2-R7,R10})
{!}       If present, requests write-back (W=1); otherwise W=0
{^}       If present, sets the S bit to load the CPSR along with the PC, or forces transfer of the user bank when in a privileged mode

Multiple transfer instruction

Stack-oriented suffixes:
F: Full                        E: Empty
D: Descending stack address    A: Ascending stack address
Block-transfer suffixes:
I: Increment                   D: Decrement
B: Before                      A: After

Swap memory and register instruction binary encoding & assembler syntax

Loops and subroutines

Compare and test instructions

Condition code flags update
 For a data processing instruction to update the condition code flags, the instruction must be
postfixed with an S.
 The exceptions to this are CMP, CMN, TST, and TEQ, which always update the flags,
because updating flags is their only real function.
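To make the flag update concrete, the Python sketch below computes the N, Z, C and V flags that a flag-setting subtraction (SUBS, or CMP, which discards the result) would produce for 32-bit operands. The helper `subs` is a hypothetical name; note the ARM convention that C is set when no borrow occurs.

```python
MASK32 = 0xFFFFFFFF


def subs(a: int, b: int) -> tuple:
    """Return (result, flags) for a 32-bit a - b that updates the condition flags."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    flags = {
        "N": result >> 31,                            # sign bit of the result
        "Z": int(result == 0),                        # result is zero
        "C": int(a >= b),                             # ARM: carry = NOT borrow
        "V": (((a ^ b) & (a ^ result)) >> 31) & 1,    # signed overflow
    }
    return result, flags


print(subs(5, 5)[1])              # {'N': 0, 'Z': 1, 'C': 1, 'V': 0}
print(subs(3, 5)[1])              # {'N': 1, 'Z': 0, 'C': 0, 'V': 0}
print(subs(0x80000000, 1)[1])     # {'N': 0, 'Z': 0, 'C': 1, 'V': 1}
```

A conditional branch after CMP simply tests these bits: for example, "equal" is Z = 1 and "unsigned lower" is C = 0.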

Loop structures
Three basic types of loops
 for loops
 while loops
 do {…..} while loops

 For Loop example

While and do loops
While loop
• Because the number of iterations of a while loop is not a constant, these structures tend to be somewhat simpler.
• There is only one branch in the loop itself. The first branch, placed before the loop, jumps straight to the test at the end of the loop body.

Do ... while loops
The loop body is executed before the expression is evaluated. The structure is the same as the while loop but without the initial branch.

Subroutine structure
The branch and link instruction performs a branch in the same manner as the branch instruction. It uses the Link Register, r14, to save the address of the next instruction after the branch.

A subroutine call overwrites the previous return address stored in r14. In nested subroutines, you must save r14. Typically, r14 is pushed onto a stack in memory.
A leaf subroutine is one that does not call another subroutine and, as a result, does not have to save r14.
Because subroutines often require the use of multiple registers, original register values can be saved using a store multiple instruction.

Branch conditions
