Académique Documents
Professionnel Documents
Culture Documents
Pentium
Pentium Pro
Pentium II
Pentium III
The Pentium pipelined Integer Unit supports 5 stages: 1) Pre-fetch 2) Decode 3) Address generate 4) EX Execute - ALU and Cache Access 5) WB Writeback Although different later processors like the MMX tampered with the 5 execution steps(by adding intermediate LIFO structures to hold bulks of instructions), the steps remain the core foundation of the pipelining.
(Conclusion)
For two instructions to be paired together in the Decode stage, they have to lack dependencies. The two paired instructions would also have to be basic, in the sense that they contain no displacements or immediate addressing. As it can be deduced, pipelines will sometimes execute an instruction at the time, despite the Superscalar ability.
If two instructions are executing concurrently in the pipeline (given they satisfy the proper conditions, and are independent) and one of them stalls as a result of hazard control, the other one will also stall.
Branch Prediction
Other than the Superscalar ability of the Pentium processor, the branch prediction mechanism is a much-debated improvement. Predicting the behaviors of branches can have a very strong impact on the performance of a machine. Since a wrong prediction would result in a flush of the pipes and wasted cycles. The branch prediction mechanism is done through a branch target buffer. The branch target buffer contains the information about all branches. The prediction of whether a jump will occur or no, is based on the branchs previous behavior. There are four possible states that depict a branchs disposition to jump: Stage 0: Very unlikely a jump will occur Stage 1: Unlikely a jump will occur Stage 2: Likely a jump will occur Stage 3: Very likely a jump will occur
Branch Prediction
When a branch has its address in the branch target buffer, its behavior is tracked. This diagram portrays the four stages associated branch prediction.
Branch Prediction
It is actually believed that Pentiums algorithm for branch prediction is incorrect.
As it can be seen in the diagram to the right, State 0 will jump directly to State 3, instead of following the usual path which would include State 1, and State 2.
This abnormality might be attributed to the way in which the branch target buffer operates:
- If a branch is not found in the branch target buffer, then it predicted that it wont jump. - A branch wont get an actual entry in the branch target buffer, until the first time it jumps, and when it does, it goes straight into State 3. - Because the branch wont get an entry into the branch target buffer until the first time it jumps, this will cause an alteration into the actual state diagram, as it can be clearly seen.
Branch Prediction
The Intel Pentium branch prediction algorithm is indeed better than a 50% guess, but it has limitations. In a need to increase the accuracy of branch predictions, the processors following the Pentium adopted a different branch prediction algorithm. Some loops have repetitive patterns and they need to be recognized. With a two bit binary counter, it is impossible to attain any complexity. Later generation processors, such as the Pentium MMX, Pentium Pro, Pentium II, use another mechanism for branch prediction.
A 4 bit register is used to record the previous behavior of the branch. If the 4 bit register would be 0001, it would mean that the branch only jumped the last time out of 4.
A 4 bit register would not be of much use without any additional logic. In addition to the 4 bit register, there are 16, 2-bit counters like the ones that were previously shown.
Branch Prediction
A 4 bit register that records the behavior of the branch along with 16 2-bit counters, the mechanism is able to give more accurate branching predictions. Since the register has 4 bits, it has 16 possible values, so the current value of the 4 bit register can always be associated with one of the 16 bit counters, like it is shown in the diagram to the right.