Where are we and where are we going? Focus on desktop/server microprocessors vs. embedded/DSP microprocessors
Microprocessor Generations
First generation: 1971-78
Behind the power curve (16-bit, <50k transistors)
Fourth Generation: 1990 Architectural and performance leadership (64-bit, > 1M transistors, Intel/AMD translate into RISC internally)
Implemented in 1985
125,000 transistors
5-8 MIPS (Million Instructions per Second)
Early restart after cache miss (1992)
Nonblocking caches: continue during a cache miss (1994)
Introduced 1971
2250 transistors
108 kHz, 60,000 ops/sec
16 pins
10-micron process
As powerful as the ENIAC, which had 18000 tubes and occupied a large room
Targeted use: calculators
Cost: less than $100
2 ALUs, each working at 4.4 GHz
128-bit FPU
0.13-micron process
Targeted use: PCs and low-end workstations
Cost: around $600
Moore's Law
In 1965, Gordon Moore, one of the founders of Intel, predicted that the number of transistors on an IC (and therefore the capability of microprocessors) would double every year. Later the period was modified to 18 months.
The prediction still holds true as of 2002. In fact, the time required for doubling is contracting toward the original prediction and is now closer to a year.
[Figure: Moore's Law plot; horizontal axis in years from 1970 to 2005, vertical axis starting at 1,000]
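The doubling rule can be turned into a rough projection. The sketch below is only an illustration of the arithmetic: the function name `projected_transistors` is made up here, the starting point (2250 transistors in 1971) is the figure quoted earlier in these slides, and a fixed 18-month doubling period is an assumption rather than a measured trend.

```python
def projected_transistors(start_count, start_year, year, doubling_years=1.5):
    """Project a transistor count, assuming one doubling every `doubling_years`."""
    elapsed = year - start_year
    return start_count * 2 ** (elapsed / doubling_years)

# Starting point from the slides: 2250 transistors in 1971.
for year in (1971, 1980, 1990, 2002):
    print(year, round(projected_transistors(2250, 1971, year)))
```

Changing `doubling_years` to 1.0 or 2.0 shows how sensitive the projection is to the assumed period.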
kHz, MHz, GHz (Clock Frequency)
The 4004 worked at a clock frequency of 108 kHz.
The latest processors have clock frequencies in the GHz range.
Of two uPs with similar designs, the one with the higher clock frequency will be more powerful.
The same is not true for two uPs of dissimilar designs. Example: of a PowerPC and a Pentium 4 working at the same frequency, the former performs better due to its design. The same holds for the Athlon when compared with a Pentium.
CPU
Memory I/O
Microcontrollers
Memory CPU
ROM RAM
A single chip
A microprocessor system?
uPs are powerful pieces of hardware, but not very useful on their own.
Just as the human brain needs hands, feet, eyes, ears, and a mouth to be useful, so does the uP.
A uP system is a uP plus all the components it requires to do a certain task.
A microcomputer is one example of a uP system.
RAM I/O
Instruction Cache
General-purpose microprocessor
CPU for computers
No RAM, ROM, or I/O on the CPU chip itself
Examples: Intel's x86, Motorola's 680x0
Many chips on the motherboard
Data Bus
RAM
ROM
I/O Port
Timer
Microcontroller:
A smaller computer
On-chip RAM, ROM, I/O ports...
Examples: Motorola's 6811, Intel's 8051, Zilog's Z8, and PIC 16X
A single chip
Microcontroller
Block Diagram
External interrupts Interrupt Control On-chip ROM for program code
Timer/Counter
On-chip RAM
Timer 1 Timer 0
Counter Inputs
OSC
Bus Control
4 I/O Ports
P0 P1 P2 P3
TxD RxD
Address/Data
8086 microprocessor
Address bus: 20 lines (A19-A0)
Data bus: 16 lines (D15-D0)
The 8086 is a 16-bit microprocessor with a 16-bit data bus
Control signals
Highest memory address: FFFFFH
Two units: 1. BIU 2. EU
BIU and EU
The BIU (bus interface unit) sends out addresses, fetches instructions from memory, reads data from ports and memory, and writes data to ports and memory. In other words, the BIU handles all transfers of data and addresses on the buses for the execution unit.
The EU (execution unit) of the 8086 tells the BIU where to fetch instructions or data from, decodes instructions, and executes instructions.
Registers
Both the ALU and the FPU have a very small amount of super-fast private memory placed right next to them for their exclusive use. These are called registers.
The ALU and FPU store intermediate and final results from their calculations in these registers.
Processed data goes back from these registers to the data cache and then to main memory.
Control Unit
The brain of the uP
Manages the whole uP
Tasks include fetching instructions and data, storing data, and managing input/output devices
Overview
Intel 8088 facts
A 20-bit address bus allows access to 1M memory locations
Byte addressable and byte-swapping
Example: the word 5A2F stored at memory location 18000
18001  5A  High byte of word
18000  2F  Low byte of word
[Figure: 8088 connected to memory by the 20-bit address and control signals; CLK and GND pins shown]
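The byte ordering in the word example can be checked with a few lines of code. This is a minimal sketch, not a model of the 8088 bus: the helper names `store_word` and `load_word` and the dictionary used as memory are assumptions made for illustration.

```python
def store_word(memory, address, word):
    """Store a 16-bit word low byte first, as in the slide's example."""
    memory[address] = word & 0xFF             # low byte at the lower address
    memory[address + 1] = (word >> 8) & 0xFF  # high byte at the next address

def load_word(memory, address):
    """Rebuild the 16-bit word from its two bytes."""
    return memory[address] | (memory[address + 1] << 8)

memory = {}  # sparse stand-in for the 1M byte-addressable memory
store_word(memory, 0x18000, 0x5A2F)
print(f"18000: {memory[0x18000]:02X}  18001: {memory[0x18001]:02X}")
print("locations reachable with 20 address bits:", 2 ** 20)
```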
Organization of 8088/8086
[Figure: block diagram of the 8088/8086: 20-bit address bus, general-purpose registers, segment registers (CS, DS, SS, ES), IP, instruction queue, bus control, external bus]
Data Group
AX = AH:AL  Accumulator
BX = BH:BL  Base
CX = CH:CL  Counter
DX = DH:DL  Data
SP  Stack Pointer
BP  Base Pointer
SI  Source Index
DI  Destination Index
[Figure: ALU with n-bit inputs A and B, function select F, and outputs Y, Carry, Y = 0?, and A > B?]
F      Operation
0 0 0  A + B
0 0 1  A - B
0 1 0  A - 1
0 1 1  A and B
1 0 0  A or B
1 0 1  not A
Signal F is generated according to the current instruction.
Basic arithmetic operations: addition, subtraction, etc.
Basic logic operations: and, or, xor, shifting, etc.
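The function table can be exercised with a small behavioral model. The sketch below assumes an 8-bit ALU by default and treats a borrow in the subtract and decrement rows as a carry; both choices, and the function name `alu`, are assumptions for illustration and are not stated on the slide.

```python
def alu(f, a, b, n_bits=8):
    """Behavioral model of the slide's ALU; returns (y, carry, y_is_zero, a_gt_b)."""
    mask = (1 << n_bits) - 1
    a &= mask
    b &= mask
    if f == 0b000:
        raw = a + b
    elif f == 0b001:
        raw = a - b
    elif f == 0b010:
        raw = a - 1
    elif f == 0b011:
        raw = a & b
    elif f == 0b100:
        raw = a | b
    elif f == 0b101:
        raw = ~a
    else:
        raise ValueError("function code not in the slide's table")
    y = raw & mask
    # Assumption: carry is only meaningful for the arithmetic rows,
    # and a borrow (negative raw result) is reported as a carry.
    carry = f in (0b000, 0b001, 0b010) and (raw < 0 or raw > mask)
    return y, carry, y == 0, a > b

print(alu(0b000, 200, 100))  # addition that overflows 8 bits
print(alu(0b001, 5, 5))      # subtraction giving zero
print(alu(0b101, 0x0F, 0))   # bitwise not of A
```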
Flag Register
Flag register contains information reflecting the current status of a microprocessor. It also contains information which controls the operation of the microprocessor.
Bit 15 (left) to bit 0 (right): NT IOPL OF DF IF TF SF ZF AF PF CF
Status Flags: CF, PF, AF, ZF, SF, OF
Control Flags: TF, IF, DF
CF: Carry flag
PF: Parity flag
AF: Auxiliary carry flag
ZF: Zero flag
SF: Sign flag
OF: Overflow flag
NT: Nested task flag
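To make the status flags concrete, the sketch below computes them for an 8-bit addition. The slides only name the flags; the definitions used here (parity over the low byte, auxiliary carry out of bit 3, overflow as a signed out-of-range result) are the standard x86 ones, and the function name `add8_flags` is an illustrative assumption.

```python
def add8_flags(a, b):
    """Add two 8-bit values and return (result, status flags)."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": total > 0xFF,                             # carry out of bit 7
        "PF": bin(result).count("1") % 2 == 0,          # even number of 1 bits
        "AF": (a & 0x0F) + (b & 0x0F) > 0x0F,           # carry out of bit 3
        "ZF": result == 0,                              # result is zero
        "SF": bool(result & 0x80),                      # copy of the sign bit
        "OF": bool(~(a ^ b) & (a ^ result) & 0x80),     # signed overflow
    }
    return result, flags

print(add8_flags(0x7F, 0x01))
print(add8_flags(0xFF, 0x01))
```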
MOV AL, BL
Opcode tells what operation is to be performed.
Mode indicates the type of an instruction: register type or memory type.
Operands tell what data should be used in the operation. Operands can be addresses telling where to get data (or where to store results).
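As an illustration of the opcode, mode, and operand fields, the sketch below decodes the two register-to-register byte encodings of MOV. The slides do not give the bit layout; the layout assumed here is the standard 8086 one (opcode byte 88h or 8Ah followed by a byte holding a 2-bit mode, a 3-bit register field, and a 3-bit register/memory field), and the function name `decode_mov_reg8` is made up for this example.

```python
# 8-bit register numbering used by the 8086 register fields.
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]

def decode_mov_reg8(code):
    """Decode a two-byte register-to-register MOV between 8-bit registers."""
    opcode, modrm = code[0], code[1]
    if opcode not in (0x88, 0x8A):
        raise ValueError("only the byte-sized MOV opcodes 88h and 8Ah are handled")
    mode = modrm >> 6          # 11b means both operands are registers
    reg = (modrm >> 3) & 0b111
    rm = modrm & 0b111
    if mode != 0b11:
        raise ValueError("memory-type instructions are not handled in this sketch")
    if opcode == 0x88:         # 88h: the register field is the source
        dest, src = REG8[rm], REG8[reg]
    else:                      # 8Ah: the register field is the destination
        dest, src = REG8[reg], REG8[rm]
    return f"MOV {dest}, {src}"

print(decode_mov_reg8(bytes([0x88, 0xD8])))
print(decode_mov_reg8(bytes([0x8A, 0xC3])))
```

Both byte sequences describe the same operation, which is why the mode and direction information has to be decoded together with the opcode.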
EU Operation
1. Fetch an instruction from instruction queue
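[Figure: EU block diagram: general-purpose registers (AH/AL, BH/BL, CH/CL, DH/DL, SP, BP, SI, DI), ALU, flag register, 16-bit data bus]
2. Decode the instruction into control signals (this process is also referred to as instruction decoding)
3. Perform one of the following operations:
ALU: an arithmetic operation or a logic operation
Storing a datum into a register
Moving a datum from a register
Changing the flag register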
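How can a 16-bit microprocessor generate 20-bit memory addresses?
Shift a 16-bit segment register left by 4 bits (appending 0000), then add a 16-bit offset:
Memory address = (segment register x 16) + offset
[Figure: 1M memory space from 00000 to FFFFF; a 64K segment starts at segment address Addr1 and ends at Addr1 + 0FFFF]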
Memory Segmentation
A segment is a 64KB block of memory starting from any 16-byte boundary.
For example: 00000, 00010, 00020, 20000, 8CE90, and E0840 are all valid segment addresses.
The requirement of starting from a 16-byte boundary is due to the 4-bit left shifting.
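The 16-byte boundary rule is easy to check mechanically. In this sketch the helper names `is_valid_segment_start` and `segment_register_value` are assumptions; the test addresses are the ones listed above, plus one deliberately misaligned value added for contrast.

```python
def is_valid_segment_start(address):
    """A segment may start at any 20-bit address that is a multiple of 16."""
    return 0 <= address <= 0xFFFFF and address % 16 == 0

def segment_register_value(address):
    """Return the 16-bit value a segment register must hold for this start address."""
    if not is_valid_segment_start(address):
        raise ValueError("segments must start on a 16-byte boundary")
    return address >> 4  # undo the 4-bit left shift

for start in (0x00000, 0x00010, 0x00020, 0x20000, 0x8CE90, 0xE0840):
    print(f"{start:05X} -> segment register {segment_register_value(start):04X}")

print(is_valid_segment_start(0x00018))  # not a multiple of 16
```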
Segment registers (16 bits, bits 15 to 0): CS, DS, SS, ES
Segment addresses must be stored in segment registers.
The offset is derived from the combination of pointer registers, the Instruction Pointer (IP), and immediate values.
The memory address is the segment register followed by 0000, plus the offset.
Examples:
Instruction address: CS = 348A, IP = 4214: 348A0 + 4214 = 38AB4
Data address: DS = 1234, DI = 0022: 12340 + 0022 = 12362
Stack address: SS = 5000, SP = FFE0: 50000 + FFE0 = 5FFE0
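The three worked examples can be reproduced with a one-line address calculation. This is a minimal sketch: the function name `physical_address` is an assumption, and masking the result to 20 bits reflects the fact that only 20 address lines exist.

```python
def physical_address(segment, offset):
    """Shift the 16-bit segment left by 4 bits and add the 16-bit offset."""
    return (((segment & 0xFFFF) << 4) + (offset & 0xFFFF)) & 0xFFFFF

examples = [
    ("instruction", 0x348A, 0x4214),  # CS:IP
    ("data", 0x1234, 0x0022),         # DS:DI
    ("stack", 0x5000, 0xFFE0),        # SS:SP
]
for name, segment, offset in examples:
    print(f"{name}: {segment:04X}:{offset:04X} -> {physical_address(segment, offset):05X}")
```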
Fetching Instructions
Where to fetch the next instruction?
Memory
12352 MOV AL, 0
After an instruction is fetched, register IP is updated as follows: IP = IP + length of the fetched instruction
For example: the length of MOV AL, 0 is 2 bytes. After fetching this instruction, the IP is updated to 0014.
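The IP update can be sketched as a tiny fetch step. Several things here are assumptions for illustration: the values CS = 1234 and IP = 0012 are inferred from the address 12352 and the updated IP of 0014 in the example, the two bytes B0 00 are the standard encoding of MOV AL, 0, and the names `INSTRUCTION_LENGTH` and `fetch_and_advance` are made up.

```python
# Length table for the single opcode used in this sketch.
# B0h is MOV AL, imm8: one opcode byte plus one immediate byte.
INSTRUCTION_LENGTH = {0xB0: 2}

def fetch_and_advance(memory, cs, ip):
    """Fetch the instruction at CS:IP and return (instruction bytes, updated IP)."""
    address = (cs << 4) + ip
    length = INSTRUCTION_LENGTH[memory[address]]
    instruction = bytes(memory[address + i] for i in range(length))
    return instruction, (ip + length) & 0xFFFF  # IP = IP + length

memory = {0x12352: 0xB0, 0x12353: 0x00}  # MOV AL, 0 stored at 12352
instruction, ip = fetch_and_advance(memory, cs=0x1234, ip=0x0012)
print(instruction.hex(), f"next IP = {ip:04X}")
```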
There are a number of methods to generate the memory address when accessing data memory. These methods are referred to as Addressing Modes.
Examples:
Direct addressing: MOV AL, [0300H] (assume DS=1234H)
12340 + 0300 = 12640 (memory address)
Register indirect addressing: MOV AL, [SI] (assume DS=1234H, SI=0310H)
12340 + 0310 = 12650 (memory address)
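The two addressing-mode examples differ only in where the offset comes from. The sketch below makes that explicit; the function names and the dictionary of register values are illustrative assumptions, while DS = 1234H and SI = 0310H are the values assumed in the examples.

```python
def direct_address(ds, displacement):
    """Direct addressing: the offset is a constant inside the instruction."""
    return ((ds << 4) + displacement) & 0xFFFFF

def register_indirect_address(ds, registers, name):
    """Register indirect addressing: the offset is read from a register."""
    return ((ds << 4) + registers[name]) & 0xFFFFF

registers = {"SI": 0x0310}
print(f"MOV AL, [0300H] -> {direct_address(0x1234, 0x0300):05X}")
print(f"MOV AL, [SI]    -> {register_indirect_address(0x1234, registers, 'SI'):05X}")
```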
[Figure: memory map from 00000 to FFFFF, with FFFF0 marked near the top and the interrupt pointer table at the bottom]
Locations from 00000H to 003FFH are used for the interrupt pointer table.
It has 256 table entries.
Each table entry is 4 bytes: 256 x 4 = 1024 bytes, which is the memory addressing space from 00000H to 003FFH.
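The table arithmetic can be sketched as follows. The slides give only the entry count and size; that each entry holds a 16-bit offset followed by a 16-bit segment is the standard 8086 layout, and the function names and the example vector contents (segment 1234, offset 5678 for entry 21H) are hypothetical.

```python
def vector_entry_address(n):
    """Address of interrupt pointer table entry n (each entry is 4 bytes)."""
    if not 0 <= n <= 255:
        raise ValueError("the table has 256 entries, numbered 0 to 255")
    return n * 4

def read_vector(memory, n):
    """Return (segment, offset) stored in entry n: offset word first, then segment."""
    base = vector_entry_address(n)
    offset = memory[base] | (memory[base + 1] << 8)
    segment = memory[base + 2] | (memory[base + 3] << 8)
    return segment, offset

memory = bytearray(1024)  # 00000H to 003FFH
# Hypothetical contents for entry 21H, chosen only for illustration.
memory[0x84:0x88] = bytes([0x78, 0x56, 0x34, 0x12])

print(f"entry 21H at {vector_entry_address(0x21):05X}")
print("last byte of the table:", f"{vector_entry_address(255) + 3:05X}")
print("segment:offset =", ":".join(f"{v:04X}" for v in read_vector(memory, 0x21)))
```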
Interrupts
An interrupt temporarily suspends execution of the program and switches the processor to executing a special routine (the interrupt service routine).
When execution of the interrupt service routine is complete, the processor resumes execution of the original program.
Interrupt classification
Hardware Interrupts
Caused by activating the processor's interrupt control signals (NMI, INTR)
Software Interrupts
Caused by the execution of an INT instruction
Caused by an event generated by the execution of a program, such as division by zero
Minimum Mode
The 8088 generates control signals for memory and I/O operations
Some functions are not available in minimum mode
Compatible with 8085-based systems
Maximum Mode
It needs an 8288 bus controller to generate control signals for memory and I/O operations
It allows the use of the 8087 coprocessor; it also provides other functions
Execution Unit (EU)
Bus Interface Unit (BIU): Fetches Opcodes, Reads Operands, Writes Data
8086/8088 MPU
Temporary Registers, Instruction Queue, ALU
EU Control
Flags
IP: 16-bit Offset Address
20-bit Physical Address
i8086 Circuit - Maximum Mode
[Figure: 8086 CPU (MN/MX#, Vcc, INTR, BHE#, AD15:AD0, A19:A16) with an 8284A Clock Generator (RDY in; CLK, READY, RESET out); status lines S0#, S1#, S2# and CLK produce MRDC#, MWTC#, AMWC#, IORC#, IOWC#, AIOWC#, INTA#; 74LS373 x3 latches (LE, OE#) turn ADDR/DATA into A19:A0 and BHE#; 74LS245 x2 transceivers (DIR, EN#) turn ADDR/DATA into D15:D0]
RESET# Signal
The active-low RESET# signal puts the 8086/8088 into a defined state
Clears the flags register, segment registers (except CS), etc.
Sets the effective program address to 0FFFF0h (CS=0FFFFh, IP=0000h)
8086/8088 programs always start at 0FFFF0h after Reset has been asserted and removed
This behavior continues into the latest generation of CPUs
Use of BHE#/A0(BLE#)
Byte-wide addressing (8088): FFFFF, FFFFE, FFFFD, FFFFC, ...
Odd addresses (8086), selected by A19..A1 and BHE#, data on D15:D8: FFFFF, FFFFD, FFFFB, FFFF9, ...
Even addresses (8086), selected by A19..A1 and A0/BLE#, data on D7:D0: FFFFE, FFFFC, FFFFA, FFFF8, ...
Use of BHE#/BLE#
BHE#  A0/BLE#  Selection
0     0        Whole word (16 bits)
0     1        High byte to/from odd address
1     0        Low byte to/from even address
1     1        No selection
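The selection table can be read in the other direction: given an address and a transfer width, which signal levels appear on the bus. The sketch below assumes a single bus cycle per transfer, so a word at an odd address (which the 8086 splits into two cycles) is rejected; the names `SELECTION` and `bank_signals` are illustrative assumptions.

```python
SELECTION = {
    (0, 0): "whole word (16 bits)",
    (0, 1): "high byte to/from odd address",
    (1, 0): "low byte to/from even address",
    (1, 1): "no selection",
}

def bank_signals(address, width_bytes):
    """Return (BHE#, A0/BLE#) for a transfer that fits in one bus cycle."""
    a0 = address & 1
    if width_bytes == 2:
        if a0:
            raise ValueError("a word at an odd address needs two bus cycles")
        return 0, 0                       # both byte lanes active
    if width_bytes == 1:
        return (0, 1) if a0 else (1, 0)   # odd byte uses D15:D8, even byte uses D7:D0
    raise ValueError("width must be 1 or 2 bytes")

for address, width in ((0xFFFFE, 2), (0xFFFFF, 1), (0xFFFFE, 1)):
    signals = bank_signals(address, width)
    print(f"{address:05X} width {width}: BHE#={signals[0]} A0={signals[1]} -> {SELECTION[signals]}")
```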
[Figure: address/data demultiplexing with a 74HC373 or equivalent latch]
Address/Data Bus: carries the address during Address Time and data during Data Time
ALE marks Address Time; the output of the 74HC373 is the Microcomputer Address Bus
Latch connections: Address/Data Bus to In0:In7, Q0:Q7 to System Address Bus, ALE to LE
Tri-state control signal OE# is shown connected to GND for simplicity
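The role of ALE can be sketched with a behavioral model of a transparent latch. This is an illustration under stated assumptions, not a timing-accurate model: the class and function names are made up, OE# is treated as permanently enabled (as in the figure), and a single 16-bit value stands in for several 8-bit latch chips side by side.

```python
class TransparentLatch:
    """Level-sensitive latch: Q follows D while LE is high, then holds."""

    def __init__(self):
        self.q = 0

    def step(self, d, le):
        if le:
            self.q = d
        return self.q

def demultiplex(bus_cycle):
    """Split samples of (address/data bus value, ALE) into (address, data)."""
    latch = TransparentLatch()
    address = None
    data = None
    for ad_bus, ale in bus_cycle:
        address = latch.step(ad_bus, ale)  # held steady once ALE drops
        if not ale:
            data = ad_bus                  # bus is free to carry data
    return address, data

# Address Time (ALE high), then Data Time (ALE low) on the same lines.
cycle = [(0x1234, 1), (0x1234, 0), (0xBEEF, 0)]
address, data = demultiplex(cycle)
print(f"address {address:04X}, data {data:04X}")
```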
(1 Wait State)
8086/8088 Summary
First Generation (introduced June 1978)
One of the first 16b processors on the market
16b internal registers
16/8b external data bus
20b address bus (1MB addressable)
Used in 1st generation IBM PCs (1981)
80186/80188
Evolution of 8086/8088 to 80186/80188
Increased instruction set
On-chip system components (clock generator, DMA, interrupt, timers)
Unsuccessful in PCs
Popular in embedded systems
Multitasking support: what happens in one area of memory doesn't affect other programs.
Protected mode supported by Windows 3.0.
16MB addressable physical memory
On-chip MMU (1GB virtual memory)
Non-multiplexed address bus and data bus
Clock speeds: 16-33 MHz
P3 processors were far ahead of their time
First 386 PCs: early 1987 (COMPAQ)
Protected mode of the 386 is fully compatible with the 286
New virtual real mode
Protected mode is the native mode of operation. The chips are designed for advanced operating systems such as Windows NT.
The processor can run with hardware memory protection while simulating the 8086's real-mode operation.
Multiple copies of, for example, DOS can run simultaneously, each in a protected area of memory. If a program in one memory area crashes, the rest of the system is protected.
Bus Unit (BU), Prefetch Queue
Execution Unit (EU), ALU
Control Unit (CU)
Data
Instruction Unit (IU)
Registers
The 80386 includes a Bus Interface Unit for reading and providing data and instructions, with a Prefetch Queue, an IU for controlling the EU with its registers, as well as an AU for generating memory and I/O addresses.
80386 Features
32b general and offset registers
16B prefetch queue
Memory management unit with segmentation unit and paging unit
32b address and data bus
4GB physical address space
64TB virtual address space
i387 numerical coprocessor
Implementation of real, protected, and virtual 8086 modes
[Figure: 80386 block diagram: registers (AL, BL, CL, DL, SI, DI, BP, SP), 16-bit Segment Registers (CS, SS, DS, ES, FS, GS), Execution Unit, 16-byte deep Instruction Queue, Bus Interface Unit, 32-bit Data Bus]
Coprocessor: i387
The hardware implementation of floating-point processing in the i387 means floating-point operations run at much higher speed. Without an i387, the i386 can still execute all mathematical expressions using software emulation of the i387.
Microcode ROM
Microcode Queue
Prefetch Queue
Bus Interface
Decoding Unit
Control Unit
In a microprogrammed CISC the processor fetches the instructions via the bus interface into a prefetch queue, which transfers them to a decoding unit. The decoding unit breaks the machine instruction into many elementary micro-instructions and applies them to a microcode queue. The micro-instructions are transferred from the microcode queue to the control and execution unit, which drives the ALU and the registers.
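The flow from prefetch queue to microcode queue can be sketched as a toy pipeline. Everything specific in this example is hypothetical: the micro-instruction names and the contents of `MICROCODE_ROM` are invented for illustration and do not correspond to real 80386 microcode.

```python
from collections import deque

# Hypothetical microcode ROM: machine instruction -> elementary micro-instructions.
MICROCODE_ROM = {
    "MOV AL, BL": ["read_register", "write_register"],
    "ADD AX, [BX]": ["compute_address", "read_memory", "alu_add", "write_register"],
}

def run(program):
    """Push instructions through prefetch queue, decoding unit, and microcode queue."""
    prefetch_queue = deque(program)   # filled via the bus interface
    microcode_queue = deque()
    executed = []
    while prefetch_queue:
        instruction = prefetch_queue.popleft()
        # Decoding unit: break the machine instruction into micro-instructions.
        microcode_queue.extend(MICROCODE_ROM[instruction])
        # Control and execution unit: consume micro-instructions in order.
        while microcode_queue:
            executed.append(microcode_queue.popleft())
    return executed

print(run(["MOV AL, BL", "ADD AX, [BX]"]))
```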