Advanced Computer Architecture
Pages 1-7 Summary
➔ How many units does the 80386 have?
◆ It has 6 main units:
● The Bus Interface Unit.
● The Code Prefetch Unit.
● The Instruction Decoding Unit.
● The Execution Unit, which consists of 3 subunits:
○ The Control Unit.
○ The Data Unit.
○ The Protection Test Unit.
● The Segmentation Unit.
● The Paging Unit.
The Code Prefetch Unit
➔ What is its function?
◆ It performs the program look-ahead function. When the bus interface unit is
not busy, the code prefetch unit uses it to fetch sequentially along the
instruction byte stream.
➔ Where does the code prefetch unit store the instructions it fetches?
◆ It stores them in the code queue.
➔ How long is the code queue?
◆ It is 16 bytes long.
➔ Which has the higher priority, code prefetches or data transfers?
◆ Data transfers have the higher priority.
➔ How does the CPU benefit from the code prefetch unit?
◆ It reduces the idle time (the time in which the CPU waits for instructions to
arrive) to practically zero.
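The prefetch behaviour described above can be sketched as a toy model. This is only an illustration under simplifying assumptions: one bus access per cycle and one byte per access (the real bus moves wider units), and the names simulate_prefetch and data_transfer_cycles are made up for this sketch.

```python
from collections import deque

QUEUE_CAPACITY = 16  # code queue length in bytes, as stated in the notes


def simulate_prefetch(code_bytes, data_transfer_cycles, total_cycles):
    """Toy model of the code prefetch unit (assumption: 1 byte per bus access).

    data_transfer_cycles is a set of cycle numbers in which the bus is needed
    for a data transfer; prefetching never uses those cycles.
    Returns the contents of the code queue after total_cycles cycles.
    """
    code_queue = deque()
    next_byte = 0
    for cycle in range(total_cycles):
        if cycle in data_transfer_cycles:
            continue  # data transfers have priority, so the bus is busy
        if len(code_queue) < QUEUE_CAPACITY and next_byte < len(code_bytes):
            code_queue.append(code_bytes[next_byte])  # sequential fetch
            next_byte += 1
    return list(code_queue)
```

With data transfers in cycles 0 and 2, only cycles 1, 3 and 4 of a five-cycle run are available for prefetching, so three bytes reach the queue. With an idle bus the queue simply fills up to its 16-byte capacity and stops.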
The Execution Unit
➔ What is its function?
◆ It is the unit responsible for executing the instructions taken from the instruction
queue, and it therefore needs to communicate with all the other units required to
complete an instruction.
➔ What does the control unit contain?
◆ It contains microcode and special parallel hardware.
➔ What does the control unit do?
◆ It speeds up multiply, divide and effective address calculation.
➔ What does the data unit contain?
◆ It contains the ALU, a file of eight 32-bit general purpose registers and a 64-bit
barrel shifter (which performs multiple bit shifts in one clock cycle).
➔ What does the data unit do?
◆ It performs data operations requested by the control unit.
➔ What does the protection test unit do?
◆ It checks for segmentation violations under the control of the microcode.
➔ How does the execution unit speed up the execution of memory reference instructions?
◆ It does so by partially overlapping the execution of any memory reference
instruction with the previous instruction, and since memory reference
instructions are frequent, a performance gain of approximately 9% is achieved.
The Segmentation Unit
➔ What does it do?
◆ It translates logical addresses into linear addresses at the request of the execution
unit.
➔ What is the logical address?
◆ It is the address generated by the program: a segment selector plus an offset within that segment.
➔ What is the linear address? And when does the linear address = the physical address?
◆ It is the address that results from adding the segment base (selected by the logical address) to the
offset. The linear address is the same as the physical address if the paging unit is
NOT enabled.
➔ What is the purpose of segment descriptor cache?
◆ It is used to store the currently used segment descriptors to speed up translation.
➔ Where is the translated linear address sent?
◆ It is sent to the paging unit.
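The translation steps above can be sketched in a few lines. This is a minimal sketch, not the hardware algorithm: the function names are invented for illustration, and the single-level dictionary page table is a simplification standing in for the real page tables (4 KB pages are assumed).

```python
PAGE_SIZE = 4096  # assumption for this sketch: 4 KB pages


def logical_to_linear(segment_base, offset):
    """Segmentation unit: linear address = segment base + offset (32 bits)."""
    return (segment_base + offset) & 0xFFFFFFFF


def make_page_translator(page_table):
    """Build a hypothetical paging function from {page number: frame number}."""
    def translate(linear):
        frame = page_table[linear // PAGE_SIZE]
        return frame * PAGE_SIZE + linear % PAGE_SIZE
    return translate


def to_physical(linear, paging_enabled, translate_page=None):
    """With paging disabled, the physical address equals the linear address."""
    if not paging_enabled:
        return linear
    return translate_page(linear)
```

For example, a segment base of 0x10000 and an offset of 0x234 give the linear address 0x10234. With paging disabled that is also the physical address; with paging enabled and page 0x10 mapped to frame 0x80, the physical address becomes 0x80234.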
Advanced Computer
Architecture
Intel 80486 MP
A/
Q/What is the difference between the math coprocessor of 80386 and the math
coprocessor of the 80486?
A/
The math coprocessor of the 80486 (known as the 80487) is built in and integrated with the
80486, which allows math instructions to execute 3 times faster than on the 386/387
combination.
A/
We can obtain the part number of the math coprocessor by adding 1 to the part number of the microprocessor;
for example, the math coprocessor for the 8086 microprocessor is the 8087, and for the 80386 it is the
80387.
Q/Why does it have an 8 KByte code and data cache?
A/
A/
We can disable caching for any section of the translated memory pages, while on the 80386 we cannot.
FPU
Q/What is it, what does it provide, what data types does it support, and what are its
features?
A/
A/
Intel Pentium
Superscalar Factor 2 3
Memory Subsystem
● What is a transaction oriented bus?
○ A bus that handles each bus access as a separate request and response.
● How many access operations on the caches can the BIU handle?
○ 4 concurrent access operations
● How is the L1 instruction cache organized?
○ It is organized as a 4-way set-associative cache.
● How is the L1 data cache organized?
○ It is organized as a 2-way set-associative cache (it can support one load and one store
operation per clock cycle).
● How is coherency maintained between the caches and the memory subsystem?
○ By using the MESI cache protocol.
● What does MESI stand for?
○ It stands for Modified, Exclusive, Shared and Invalid.
● What is the function of the memory reorder buffer?
○ It works as a scheduling and dispatch station, and it is able to reorder memory accesses.
● Why does the memory reorder buffer reorder memory accesses?
○ To prevent blocking and improve throughput.
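The four MESI states mentioned above can be illustrated with a small state-transition function for a single cache line. This is a sketch of the textbook protocol, not of the processor's actual bus behaviour; the event names and the others_have_copy flag are assumptions made for the example.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def next_state(state, event, others_have_copy=False):
    """Textbook MESI transition for one cache line.

    event is one of: 'local_read', 'local_write', 'remote_read',
    'remote_write' (names chosen for this sketch).
    """
    if event == "local_read":
        if state is State.INVALID:
            # A read miss loads the line: shared if another cache holds it.
            return State.SHARED if others_have_copy else State.EXCLUSIVE
        return state
    if event == "local_write":
        return State.MODIFIED  # this cache now holds the only valid copy
    if event == "remote_read":
        if state in (State.MODIFIED, State.EXCLUSIVE):
            return State.SHARED  # another cache now shares the line
        return state
    if event == "remote_write":
        return State.INVALID  # another cache took ownership of the line
    raise ValueError(f"unknown event: {event}")
```

A line that is read while no other cache holds it goes from Invalid to Exclusive, a later local write makes it Modified, and a write by another cache invalidates it again.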
Fetch/Decode Unit
● Where does it read the instructions from?
○ It reads them from the L1 instruction cache and converts them into a series of micro-ops.
● How many bytes can it fetch per clock cycle?
○ 32 bytes.
● How many bytes does it transfer to the decoder?
○ 16 aligned bytes.
● How does it calculate the instruction pointer?
○ Based on inputs from the branch target buffer, the interrupt status and
branch prediction indications.
● What is branch prediction? And which buffer performs it?
○ The process in which the processor tries to predict whether a branch
instruction will be taken or not, based on the past history of that branch. The
branch target buffer performs branch prediction.
● How many entries does the branch target buffer allow?
○ 512 entries.
● How is branch prediction achieved?
○ By looking ahead of the retirement program counter using Yeh’s algorithm.
● How many decoders does the instruction decoder have?
○ 3 parallel decoders (2 simple and 1 complex).
● What does each decoder do?
○ It converts each instruction into triadic micro-ops (2 logical sources and 1 logical
destination).
● How many micro-ops can each instruction be decoded to?
○ From 1 to 4 micro-ops for each instruction.
● How many micro-ops can the instruction decoder generate per clock cycle?
○ 6 per clock cycle.
● How many general purpose registers are there?
○ 40 internal, which can handle both integer and floating point.
● What is the purpose of the Register Alias Table Unit?
○ Converts the logical register references into physical register references.
● What does the allocator in the Register Alias Table Unit do?
○ It adds status bits and flags to the micro-ops to allow out-of-order execution
and sends the micro-ops to the instruction pool
● How many execution units does the processor have?
○ 6 parallel units
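Yeh's algorithm, named in the branch prediction answer above, is a two-level adaptive scheme: the first level records the recent outcomes of a branch, and the second level keeps a 2-bit saturating counter for each history pattern. The sketch below illustrates the idea only. The 4-bit history length, the dictionary-based tables (instead of a 512-entry buffer) and the class name are assumptions for this example.

```python
class TwoLevelPredictor:
    """Simplified two-level adaptive branch predictor (illustrative only)."""

    def __init__(self, history_bits=4):
        self.history_bits = history_bits
        self.history = {}   # branch address -> recent outcomes as a bit pattern
        self.counters = {}  # (address, history) -> 2-bit counter in 0..3

    def predict(self, address):
        """Return True if the branch is predicted taken."""
        pattern = self.history.get(address, 0)
        # Counters start at 1 (weakly not taken); 2 or 3 means "taken".
        return self.counters.get((address, pattern), 1) >= 2

    def update(self, address, taken):
        """Record the real outcome once the branch has been resolved."""
        pattern = self.history.get(address, 0)
        counter = self.counters.get((address, pattern), 1)
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
        self.counters[(address, pattern)] = counter
        mask = (1 << self.history_bits) - 1
        self.history[address] = ((pattern << 1) | int(taken)) & mask
```

Because the counter is selected by the history pattern, a branch that strictly alternates between taken and not taken is predicted correctly after a short warm-up, which a single 2-bit counter per branch cannot do.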
Instruction Pool
● What is it?
○ An array of content addressable memory (set associative cache) organized into 40
micro-op registers.
● What does it contain?
○ It contains micro-ops that are waiting to be executed, in addition to those
that have been executed but are yet to be committed to the machine state.
Dispatch/Execute Unit
● What is its function?
○ It schedules and executes micro-ops stored in the reorder buffer (instruction pool)
depending on data dependencies and resource availability.
● What is the function of the reservation station?
○ It is responsible for scheduling and dispatching of micro-ops from the reorder
buffer.
● What if 2 micro-ops of the same type are available at the same time?
○ The reorder buffer follows a FIFO algorithm to execute them.
● How many instructions can we schedule per clock cycle? And why?
○ We can schedule 5 because we have 2 integer units, 2 floating point units and 1
memory interface unit.
● How is branch misprediction detected?
○ One of the integer units detects it and sends a signal to the branch target buffer
to restart the pipeline.
● What does the memory interface unit do?
○ It executes one load and one store operation per clock cycle.
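The scheduling rules above (dispatch only ready micro-ops, oldest first within a type, at most 5 per clock) can be sketched as follows. This is a rough illustration: the dictionary layout of a micro-op and the function name dispatch are made up, and the unit counts are taken from the answer about scheduling 5 instructions per clock cycle.

```python
# Unit counts as given in the notes: 2 integer, 2 floating point, 1 memory.
UNITS_PER_CYCLE = {"int": 2, "fp": 2, "mem": 1}


def dispatch(pool):
    """Pick the micro-ops to start in one clock cycle.

    pool is a list of dicts with the keys 'name', 'unit' and 'ready',
    ordered oldest first. A micro-op is dispatched if its operands are
    ready and a unit of its type is still free; scanning oldest first
    gives FIFO order among micro-ops of the same type.
    """
    free = dict(UNITS_PER_CYCLE)
    chosen = []
    for uop in pool:
        if uop["ready"] and free[uop["unit"]] > 0:
            free[uop["unit"]] -= 1
            chosen.append(uop["name"])
    return chosen
```

With three ready integer micro-ops in the pool, only the two oldest are dispatched in the current cycle; the third waits for the next cycle, and a micro-op whose operands are not ready is skipped no matter how old it is.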
Retirement Unit
● What is its function?
○ It commits the results of previously executed micro-ops to the permanent
machine state and removes them from the reorder buffer.
● How many micro-ops can it retire per clock cycle?
○ 3 micro-ops per clock cycle.
● Where does it write the results?
○ It writes the results to the retirement register file and/or memory.
● How many registers does the retirement register file contain?
○ It contains 8 general purpose registers and 8 floating point data registers
● On what data types do the MMX technology instructions operate?
○ They operate on the following data types:
■ Packed byte
■ Packed word
■ Packed doubleword
■ Quadword
● How are the MMX technology instructions grouped?
○ They are grouped into the following subgroups:
■ MMX conversion instructions
■ MMX packed arithmetic instructions
■ MMX comparison instructions
■ MMX logic instructions
■ MMX shift and rotate instructions
■ MMX state management
● What unit do the floating point instructions execute on?
○ They execute on the processor's FPU (Floating Point Unit).
● What data types do the floating point instructions operate on?
○ They operate on the following data types:
■ Floating Point (Real)
■ Extended integer
■ BCD (Binary Coded Decimal)
● What are the types of floating point instructions?
○ They include many different types, such as:
■ Data transfer (FLD, FST)
■ Basic arithmetic (FADD, FADDP)
■ Comparison (FCOM, FCOMP)
● What are the system instructions?
○ They are used to control the functions of the processor that are provided to support
operating systems and executives.
Intel Pentium III Processor
MMX | SSE
MMX instructions are SIMD for integers | SSE instructions are SIMD for single precision floating point numbers
MMX instructions operate on two 32-bit integers simultaneously (at the same time) | SSE instructions operate on four 32-bit floats simultaneously
No new registers were defined for MMX | Eight new registers were defined for SSE
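The lane-wise (SIMD) behaviour in this comparison can be imitated in plain Python. The sketch below only models the arithmetic: it is not how the instructions are invoked, and both function names are invented for illustration. The integer version treats a 64-bit value as two independent 32-bit lanes with wraparound, and the float version rounds each of four lanes to single precision.

```python
import struct

LANE_MASK = 0xFFFFFFFF  # one 32-bit lane


def packed_add_dwords(a, b):
    """MMX-style add: two 32-bit integer lanes inside one 64-bit value.

    Each lane wraps around on overflow and never carries into its neighbour.
    """
    low = ((a & LANE_MASK) + (b & LANE_MASK)) & LANE_MASK
    high = (((a >> 32) & LANE_MASK) + ((b >> 32) & LANE_MASK)) & LANE_MASK
    return (high << 32) | low


def packed_add_floats(a, b):
    """SSE-style add: four single precision lanes added independently."""
    if len(a) != 4 or len(b) != 4:
        raise ValueError("expected four lanes per operand")
    # Round each sum to 32-bit float precision via pack/unpack.
    return [struct.unpack("f", struct.pack("f", x + y))[0] for x, y in zip(a, b)]
```

In the integer case, adding 1 to a low lane that holds 0xFFFFFFFF wraps that lane to 0 and leaves the high lane untouched, which is what distinguishes a packed add from an ordinary 64-bit add.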
Multicore Architecture
● What are the inter-core communication methods?
○ There are 2 methods:
■ Message Passing
■ Shared Memory
● How are the cores connected with each other?
○ There are many ways to connect cores:
■ Bus
■ Ring
■ Two-dimensional mesh
■ Crossbar
● What are homogeneous and heterogeneous multicore systems?
○ Homogeneous refers to systems that have only identical cores.
○ Heterogeneous refers to systems that have cores that are NOT identical.
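The two communication methods listed above, shared memory and message passing, can be contrasted with a small example. This is only an analogy under a stated assumption: Python threads stand in for cores, and the function names are made up for the sketch.

```python
import queue
import threading


def _run_all(worker, chunks):
    """Start one thread per chunk (a stand-in for one core each) and wait."""
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def sum_with_shared_memory(chunks):
    """Workers update one shared total, protected by a lock."""
    total = [0]
    lock = threading.Lock()

    def worker(chunk):
        partial = sum(chunk)
        with lock:  # only one worker may touch the shared total at a time
            total[0] += partial

    _run_all(worker, chunks)
    return total[0]


def sum_with_message_passing(chunks):
    """Workers share nothing; each sends its partial sum as a message."""
    mailbox = queue.Queue()

    def worker(chunk):
        mailbox.put(sum(chunk))

    _run_all(worker, chunks)
    return sum(mailbox.get() for _ in chunks)
```

Both functions compute the same result. The difference is where the coordination happens: in the shared memory version the workers synchronise on a common variable, while in the message passing version the only contact between workers and the collector is the queue.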