
Microprocessors 10EC62

UNIT-8: 80386, 80486 & PENTIUM PROCESSORS

THE 80386 MICROPROCESSOR

Features of 80386

 The 80386 is a 32-bit microprocessor, introduced by Intel in 1985.
 Its ALU and all the internal registers are 32 bits wide
 It has non-multiplexed 32-bit address bus and 32-bit data bus
 It can access a maximum of 4 GB of physical memory, and 64 TB of virtual memory
 It allows maximum segment size of 4 GB
 It organizes physical memory in terms of 4 KB size pages
 It supports virtual memory, paging, and 4 levels of protection
 It is compatible with the software written for 8086 and 80286 microprocessors
 It can operate in clock frequencies up to 33 MHz
 It has 64-bit barrel shifter, which increases the speed of all rotate and shift instructions
 It has multiply & divide logic, using the bit-shift-rotate algorithm
 It has 16-byte instruction queue and decoded instruction queue
 It can operate in 3 modes: Real mode, Protected mode and Virtual 8086 mode
o Real mode is the mode of the processor immediately after Reset. In this mode the
processor operates like a faster 8086, with some new instructions
o Protected mode is the natural mode of 80386. In this mode, all the instructions and
features of 80386 are available
o Virtual 8086 or V86 mode is a special mode, in which the processor can execute 8086
programs, and then enter into protected mode to continue executing the native 80386
programs
 In 80386, each bus cycle consists of two clock periods or T-states: T1 & T2
 The memory system of 80386 is organized into four 8-bit wide banks, each holding a maximum of
1 GB of memory

80386/80486 Programming Model

The programming model consists of a set of “program-visible” registers that are used during
application programming. Figure illustrates the programming model of i386 and i486
microprocessors. It consists of all the registers of 8086 and 80286 microprocessors, plus a 32-bit
extension to each register. The 32-bit extended registers are EAX, EBX, ECX, EDX, EBP, ESP, ESI,
EDI, EIP and EFLAGS. The segment registers are still 16-bit, but two additional segment registers FS
and GS are present in 80386. In addition to the 16-bit “visible” portion, each segment register has a
64-bit (8-byte) program-invisible part (not shown in figure), used to store the segment descriptor
corresponding to each segment register.
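
The way a 32-bit extended register contains the older 16-bit and 8-bit registers can be sketched in
Python as below. This is only an illustration using bit masks; the function name sub_registers is a
hypothetical helper chosen for this example, not part of any Intel tool.

```python
def sub_registers(eax: int) -> dict:
    """Split a 32-bit EAX value into its AX, AH and AL views."""
    eax &= 0xFFFFFFFF            # keep only 32 bits
    ax = eax & 0xFFFF            # lower 16 bits: the 8086/80286 register AX
    return {
        "EAX": eax,
        "AX": ax,
        "AH": (ax >> 8) & 0xFF,  # upper byte of AX
        "AL": ax & 0xFF,         # lower byte of AX
    }


regs = sub_registers(0x12345678)
print({name: hex(value) for name, value in regs.items()})
```

The same masking applies to EBX, ECX and EDX. Writing to AX, AH or AL changes only the
corresponding bits of the extended register.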

Dept. of ECE, SMVITM, Bantakal Page 1



Flag Register (EFLAGS)



Special Registers of 80386

Other than the registers in the programming model, 80386 has many special registers such as control
registers, debug registers and test registers. All these are 32-bit registers.

Control Registers
There are four control registers: CR0 – CR3.
CR0 contains a number of special control bits that are used in memory paging, math coprocessor
selection, protected mode selection, etc. Memory paging is one useful feature which allows any linear
address to be assigned to any physical memory location in the system.
CR1 is not used in 80386; it is reserved.
CR2 holds the 32-bit linear address that caused the most recent page-fault interrupt.
CR3 holds the base address of the page directory.

Debug Registers
There are eight debug registers: DR0 – DR7.
The first four debug registers (DR0–DR3) contain the 32-bit linear breakpoint addresses. These
breakpoint addresses point to instruction or data. These addresses are always compared with the
address generated by the program. If there is a match between the breakpoint address and the address
generated by the program, then the microprocessor causes a type-1 interrupt. This feature is an
extension of the basic tracing or single-step mechanism used in the earlier microprocessors.
DR4 and DR5 are reserved and are not used in 80386.
DR6 and DR7 contain a number of control bits which are used to control the way in which the
microprocessor responds to a breakpoint (debug or TRAP) interrupt.

Test Registers
There are two test registers: TR6 & TR7. They are used to test the translation lookaside buffer (TLB)
that is used with the paging unit in 80386. The TLB holds the 32 most commonly used page table address
translations, thereby reducing the number of memory read accesses needed for address translation.
TR6 holds the tag field (linear address) of the TLB, and TR7 holds the physical address of the TLB.
Both these registers contain several bits, which control the way in which the TLB is accessed.

Memory management (physical address generation) in 80386

In the protected mode operation of 80386, the memory address consists of 16-bit segment selector and
32-bit offset address. The selector points to a descriptor for the segment, in the descriptor table. The
offset address specifies the location of the desired code or data in the segment. Since the offset is
32-bit, the maximum size of a segment in 80386 can be 2^32 bytes = 4 gigabytes (GB).

Figure shows how the 80386 uses a selector to access a descriptor from the descriptor table, and how
it computes the physical address. The 16-bit selector which is contained in the segment register
consists of 13-bit index, 1-bit table indicator (TI) and 2-bit requested privilege level (RPL). If TI=0,
then the 13-bit index selects a segment descriptor from the global descriptor table. If TI=1, then the
13-bit index selects a segment descriptor from the local descriptor table. The 2-bit RPL is part of the
80386 processor's built-in protection features, which we shall not discuss here.

The 13-bit index part of the selector is multiplied by 8 and used as a pointer to a descriptor in the
descriptor table. Multiplication by 8 is done because each descriptor is 8 bytes long. The
descriptor, among other things, contains the 32-bit base address of the segment, and the limit
or maximum size of the segment. The memory management unit (MMU) of 80386 adds the 32-bit
base address from the descriptor to the 32-bit offset or effective address, to generate the 32-bit
linear address of the code or data byte. When paging is disabled, this linear address is used directly
as the physical address.
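
The selector decoding and address computation described above can be sketched in Python as follows.
This is a minimal sketch, not a model of the real MMU: the function names are hypothetical, the limit
is assumed to be the highest valid offset in the segment, and privilege checks and paging are left out.

```python
DESCRIPTOR_SIZE = 8  # each descriptor is 8 bytes long


def decode_selector(selector: int) -> dict:
    """Split a 16-bit selector into its index, TI and RPL fields."""
    selector &= 0xFFFF
    return {
        "index": selector >> 3,     # upper 13 bits
        "ti": (selector >> 2) & 1,  # 0 = global table, 1 = local table
        "rpl": selector & 0b11,     # requested privilege level
    }


def descriptor_offset(selector: int) -> int:
    """Byte offset of the descriptor inside its table (index x 8)."""
    return decode_selector(selector)["index"] * DESCRIPTOR_SIZE


def translate(base: int, limit: int, offset: int) -> int:
    """Add the segment base to the offset (assumption: limit = last valid offset)."""
    if offset > limit:
        raise ValueError("offset exceeds segment limit")
    return (base + offset) & 0xFFFFFFFF


fields = decode_selector(0x002B)
address = translate(base=0x00100000, limit=0xFFFF, offset=0x1234)
print(fields, descriptor_offset(0x002B), hex(address))

# 13-bit index plus TI gives 2^14 selectable segments of up to 2^32 bytes each,
# which is where the 64 TB virtual address space comes from.
virtual_space = (2 ** 14) * (2 ** 32)
```

For selector 002Bh the index is 5, TI is 0 (global descriptor table) and RPL is 3, so the descriptor
is found 40 bytes into the table.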

Figure shows the segment descriptor of 80386:


Memory Bank System in 80386

The size of physical memory in 80386 is 4 GB. If virtual addressing is used, then 64 terabytes of
virtual memory can be mapped onto the 4 GB physical memory space (by swapping memory with the hard
disk). Figure shows the organization of 80386 physical memory system.

The memory system is organized into four banks. Each bank is 8 bits wide, and consists of 1 GB of
memory. To store a 16-bit number, the lower byte is stored in one bank, and the upper byte is stored in
another bank. To store a 32-bit number, one byte is stored in each bank (8-bit × 4 = 32-bit). This
arrangement allows a byte, word or double-word to be accessed in a single memory cycle.

Each memory byte is numbered in hexadecimal across the four banks, starting from 00000000h to
FFFFFFFFh, as shown in the figure. In 8086 and 80286, there are two banks, and they are enabled
using A0 and BHE# signals. Whereas in 80386, the four banks are individually enabled using four
bank enable signals BE0# – BE3#.
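
As an illustration of how the four bank enable signals follow from the two lowest address bits and the
size of the transfer, consider the sketch below. The function name bank_enables is hypothetical, and
the sketch assumes the access fits inside one 32-bit double-word; an access that crosses a double-word
boundary would need a second bus cycle, which is not modelled here.

```python
def bank_enables(address: int, size: int) -> list:
    """Return [BE0#, BE1#, BE2#, BE3#] as active-low levels (0 = bank enabled)
    for an access of `size` bytes (1, 2 or 4) starting at `address`."""
    start = address & 0b11  # A1 and A0 select the starting bank
    if size not in (1, 2, 4) or start + size > 4:
        raise ValueError("access does not fit in one 32-bit bus cycle")
    return [0 if start <= bank < start + size else 1 for bank in range(4)]


print(bank_enables(0x1000, 4))  # double-word: all four banks enabled
print(bank_enables(0x1002, 2))  # word in the upper two banks
print(bank_enables(0x1001, 1))  # single byte in bank 1
```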

Pipelines and Cache Memory

A pipeline is a special way of handling memory access, so that the memory gets additional time to
access data. Pipelining extends the memory access time from 50 ns (without pipeline) to 80 ns (with
pipeline), assuming 16 MHz clock frequency.

The pipeline works as follows: When an instruction is fetched from memory, the microprocessor often
has extra time before the next instruction needs to be fetched. During this extra time, the address of
the next instruction is sent out through the address bus, ahead of time. This extra time gives more
access time to slower memory devices.

Not all memory references can take advantage of the pipeline, because sometimes the microprocessor
needs to fetch code or data from a memory location which does not immediately follow the previous
memory location. In that case, the memory cycle will be non-pipelined. Overall, pipelining is a
cost-saving feature: it relaxes the access time required of the memory devices, so slower memory can
be used.

In systems with higher clock frequencies, another technique, called cache memory is used to increase
the speed of memory cycle. A cache memory is a high speed static-memory (SRAM) that is placed
between the microprocessor and DRAM memory. SRAMs have access times less than 10 ns, and
thus speed up the memory cycle.

The size of the cache memory is decided by the type of application program that is running on the
microprocessor. If the program is small and works on small amounts of data, then a small cache is
beneficial. If the program is large, and works on large blocks of data, then a large cache is
recommended. In 80386-based computer systems, a 256 KB cache memory is typically used.

Interleaved Memory System

An interleaved memory system is a technique used for improving the speed of the system. It requires
two or more complete sets of address buses, and a controller that provides addresses for each bus.
Depending on the number of buses present, we have two-way interleave and four-way interleave.

In this technique, the memory system is divided into 2 or 4 parts. Suppose that there are two parts.
One part contains the 32-bit (4-byte) locations 00000000h–00000003h, 00000008h–0000000Bh, and so
on. The other part contains addresses 00000004h–00000007h, 0000000Ch–0000000Fh, and so on.
While the microprocessor is processing the data from one part of the memory, the interleave
control logic generates the address for the next data, which is in the other part of memory. In this way,
the memory devices get more access time, without the need for inserting wait-states in the memory
cycle. This speeds up the system.
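
The address split used in the two-way example above can be expressed as a short Python sketch. The
function name interleave_section is hypothetical. For the four-way case, the sketch assumes that the
same 4-byte pattern simply continues across four parts, which the text does not spell out.

```python
def interleave_section(address: int, ways: int = 2) -> int:
    """Return which memory part (0 .. ways-1) holds `address`,
    with consecutive 4-byte (32-bit) locations alternating between parts."""
    if ways not in (2, 4):
        raise ValueError("only two-way and four-way interleave are sketched")
    return (address >> 2) % ways  # drop the 2 byte-select bits, then alternate


for addr in (0x00, 0x03, 0x04, 0x07, 0x08, 0x0C):
    print(f"{addr:08X}h -> part {interleave_section(addr)}")
```

Sequential double-word accesses therefore alternate between the two parts, which is what gives each
part extra time to respond.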

Input/Output System

The I/O system in 80386 is similar to that of an 8086 microprocessor based system. In isolated I/O, each
I/O location is given a 16-bit address, so altogether there are 64 K (65,536) I/O locations. The I/O map
of 80386 is similar to the memory map described under Memory Bank System, except that, in I/O, the
address ranges from 0000h to FFFFh.

As in the case of memory system, the I/O system is also organized into four banks. Even though most
of the I/O data transfers are 8-bit wide, 16-bit and 32-bit data transfers are used in the recent disk
drives and video display interfaces. The wider (32-bit) I/O data path increases the data transfer rate
between the microprocessor and the I/O devices.


Description of 80386 Signals

A31–A2: Address bus. These pins carry the upper 30 bits of the 32-bit address. Address bits A1 and A0
are not available as pins. Instead, they are encoded into the bank enable signals (BE0#–BE3#).

D31–D0: Data bus consists of 32 lines for data transfer between microprocessor & memory and
between microprocessor and I/O devices.

BE3#–BE0#: Bank enable signals select either one bank (for byte access), or two banks (for word
access) or all four banks (for double-word access).

M/IO#: This signal selects the memory device when it is at logic high, and selects the I/O device
when it is at logic low.

W/R#: This signal indicates a write operation when at logic high, and indicates a read operation
when at logic low.

ADS#: The address strobe signal becomes active whenever the microprocessor issues a valid
memory or I/O address. It is similar to the ALE signal of 8086.

RESET: The reset signal initializes the microprocessor, causing it to begin executing the
software at memory location FFFFFFF0h.

CLK2: Clock times-2 is connected to a clock signal that is twice the operating frequency of
the microprocessor. For example, if the microprocessor operates at 16 MHz, then a 32
MHz clock signal is connected to this pin.

BS16#: Bus size 16 selects a 16-bit bus when the signal is logic-0, and selects a 32-bit bus when
the signal is logic-1.

NA#: Next address signal causes the 80386 to output the address of the next instruction or
data in the current bus cycle. This is useful in pipelining.

Note: Other pins are similar to those of 8086 microprocessor.

Salient Features of 80486

 Intel’s 80486 is a 32-bit microprocessor. It has 32-bit data bus and 32-bit address bus
 It has all the features of 80386 microprocessor
 It is made up of over 1.2 million transistors
 It has in-built numeric coprocessor similar to 80387
 It has 8 KB level-1 cache memory; it also supports external (level-2) cache memory
 It can operate at clock frequencies 25 MHz, 33 MHz, 50 MHz, 66 MHz and 100 MHz.
 It has parity generator and checker unit, which provides/checks parity for each byte of memory
 It has built-in self-test (BIST) that tests the microprocessor and other parts during start-up
 It is packaged in 168-pin pin grid array (PGA) package.

Salient features of Pentium microprocessor

 Pentium has all the features of 80486 microprocessor
 It has dual cache memory; 8 KB for data and 8 KB for instruction
 It has 64-bit data bus
 It is made up of over 3 million transistors
 The numeric coprocessor works five times faster than that of 80486
 It has dual integer unit, so two instructions can be executed in a single clock cycle
 It has branch prediction logic for more efficient branching
 It is packaged in a 273-pin PGA
 It can be operated at clock frequencies 66 MHz and above
 It uses 3.3 V power supply, with reduced current consumption
 The memory system of Pentium is organized into 8 banks

Block diagram of Pentium

Pentium processor uses superscalar architecture. It executes instructions in five stages, allowing the
processor to overlap multiple instructions, so that it takes less time to execute two instructions in a
row. It has two separate 8 KB caches on chip; one for the instructions, and the other for the data. This
allows the processor to fetch data and instructions simultaneously from the cache. When the data is
modified, only the data in the cache is changed; data in the memory is changed only when the
processor copies the cache back to the DRAM.


Branch Prediction Logic

The Pentium microprocessor uses branch prediction logic to reduce the time required for a branch
operation. The branch predictor is a digital logic circuit that guesses which way a branch will go
before this is known for sure.

Two-way branching is usually implemented with a conditional jump instruction. A conditional jump
can either be “not taken” and continue execution with the code which immediately follows the jump
instruction, or the jump can be “taken”, and the jump takes place to a different program memory
location. It is not known for sure whether the conditional jump will be taken or not taken, until the
condition has been calculated and the jump instruction has passed the execution stage in the
instruction pipeline.

Without branch prediction, the processor would have to wait until the conditional jump instruction has
passed the execution stage, before the next instruction can enter the fetch stage in the pipeline. The
branch predictor attempts to avoid this waste of time by trying to guess whether the conditional jump
is most likely to be taken or not taken. The branch that is guessed to be the most likely is then fetched
and speculatively executed. If it is later detected that the guess was wrong then the speculatively
executed or partially executed instructions are discarded and the pipeline starts over with the correct
branch, incurring a delay.

There are two types of branch predictions: Static and Dynamic. Static prediction is the simplest
branch prediction technique because it does not rely on information about the history of code
executing. Instead it predicts the outcome of a branch based solely on the branch instruction. Dynamic
branch prediction uses information about taken or not taken branches gathered at run-time to predict
the outcome of a branch. Pentium processor uses dynamic branch prediction.
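
To make the idea of dynamic prediction concrete, the sketch below uses a 2-bit saturating counter per
branch, a common textbook scheme. It is an illustration under stated assumptions and not a description
of the actual Pentium circuit: the class name, the starting state and the use of a Python dictionary
keyed by branch address are all choices made for this example.

```python
class TwoBitPredictor:
    """One 2-bit saturating counter per branch: states 0-1 predict
    'not taken', states 2-3 predict 'taken'."""

    def __init__(self):
        self.counters = {}  # branch address -> state 0..3

    def predict(self, branch_address: int) -> bool:
        # Assumed starting state: 1 (weakly not taken)
        return self.counters.get(branch_address, 1) >= 2

    def update(self, branch_address: int, taken: bool) -> None:
        state = self.counters.get(branch_address, 1)
        state = min(state + 1, 3) if taken else max(state - 1, 0)
        self.counters[branch_address] = state


def count_mispredictions(predictor, branch_address, outcomes):
    """Run the predictor over a recorded history of branch outcomes."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(branch_address) != taken:
            misses += 1
        predictor.update(branch_address, taken)  # learn from the real outcome
    return misses


# A loop branch that is taken 9 times and then falls through once
loop_branch = [True] * 9 + [False]
misses = count_mispredictions(TwoBitPredictor(), 0x401000, loop_branch)
print(misses)
```

With this history the predictor is wrong only on the first iteration and on the loop exit. A static
scheme has no such history to learn from.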

Cache structure in Pentium

The Pentium processor has separate 8 KB L1 cache for data, and 8 KB L1 cache for code, and they
are located inside the chip.

L1 cache on the Pentium processor is 2-way set-associative in structure. In a set-associative structure,
the cache is divided into equal sections called cache ways. Each cache way is treated like a small
direct mapped cache. In a 2-way scheme, two lines of memory that map to the same cache location may be
stored at the same time, one in each way.
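
How an address is mapped to a location in an 8 KB, 2-way set-associative cache can be sketched as
follows. The line size of 32 bytes is an assumption made for this example, since the text above does
not state it; the function name split_address is also hypothetical.

```python
CACHE_SIZE = 8 * 1024  # 8 KB, as stated for each Pentium L1 cache
WAYS = 2               # 2-way set-associative
LINE_SIZE = 32         # bytes per cache line (assumption for this sketch)
SETS = CACHE_SIZE // (WAYS * LINE_SIZE)  # number of lines in each way


def split_address(address: int) -> tuple:
    """Split an address into (tag, set index, byte offset within the line)."""
    offset = address % LINE_SIZE
    set_index = (address // LINE_SIZE) % SETS
    tag = address // (LINE_SIZE * SETS)
    return tag, set_index, offset


a = split_address(0x00001040)
b = split_address(0x00002040)
print(SETS, a, b)
```

Under these assumptions there are 128 sets. The two addresses above fall into the same set with
different tags, so a direct mapped cache could hold only one of them, while the 2-way cache can hold
both at once.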

The data cache is configurable as a write-back or write-through on a line-by-line basis. When the
cache is configured as write-back, the cache acts like a buffer by receiving data from the processor
and writing data back to main memory whenever the system bus is available. The advantage of the
write-back process is that the processor is freed up to continue with other tasks while main memory is
updated at a later time. However, the disadvantage of this approach is that, by having the cache handle
writes back to memory, the cost and complexity of the cache increase. The second
alternative is to configure the Pentium cache as write-through. In a write-through cache scheme, the
processor handles writes to main memory instead of the cache. The cache may update its contents as
the data comes through from the processor; however, the write operation does not end until the
processor has written the data back to main memory. The advantage of this approach is that the cache
does not have to be as complex, which makes it less expensive to implement. The disadvantage
is that the processor must wait until the main memory accepts the data before moving on to
its next task.
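
The difference in main-memory traffic between the two policies can be illustrated with a very small
Python sketch. It is deliberately simplified: every line written is assumed to be already in the cache
and to stay there, and under write-back each modified line is assumed to be copied to memory exactly
once at the end. The function name memory_writes is hypothetical.

```python
def memory_writes(policy: str, stores: list) -> int:
    """Count main-memory writes caused by a sequence of stores.
    `stores` lists the cache line number touched by each store."""
    if policy == "write-through":
        return len(stores)       # every store also goes to main memory
    if policy == "write-back":
        return len(set(stores))  # each modified line is written back once
    raise ValueError("unknown policy: " + policy)


stores = [3, 3, 3, 7, 3, 7]  # six stores that touch only two lines
print(memory_writes("write-through", stores))
print(memory_writes("write-back", stores))
```

For this sequence write-through performs six memory writes and write-back performs two, which is why
write-back frees the processor more often at the price of a more complex cache.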

Cache consistency on the Pentium processor is maintained using the MESI (Modified, Exclusive, Shared,
Invalid) protocol. The protocol is used to decide if a cache entry should be updated or invalidated.

