This document describes techniques for speculative execution in superscalar processors. It discusses how branch prediction and out-of-order execution allow instructions to be fetched and executed before control dependences are resolved. A key requirement for speculation is the ability to undo incorrectly speculated instructions; this is done with a re-order buffer, and an instruction commits only when the predictions it depends on turn out to be correct. Tomasulo's algorithm is extended with a re-order buffer to support speculative execution, allowing instructions to complete out-of-order but commit in-order.
2. Handling Control Dependence
• Simple pipeline
– Branch prediction reduces stalls due to control
dependence
• Wide issue processor
– Mere branch prediction is not sufficient
– Instructions in the predicted path need to be
fetched and EXECUTED (speculative
execution)
slide 2
Anshul Kumar, CSE IITD
3. What is required for speculation?
• Branch prediction to choose which
instructions to execute
• Execution of instructions before control
dependences are resolved
• Ability to undo the effects of incorrectly
speculated sequence
• Preserving of correct behaviour under
exceptions
4. Types of speculation
• Hardware based speculation
– done with dynamic branch prediction and
dynamic scheduling
– used in Superscalar processors
• Compiler based speculation
– done with static branch prediction and static
scheduling
– used in VLIW processors
5. Extending Tomasulo’s scheme for speculative execution
• Introduce re-order buffer (ROB)
• Add another stage – “commit”

Normal execution        Speculative execution
• Issue                 • Issue
• Execute               • Execute
• Write result          • Write result
                        • Commit
6. Extending Tomasulo’s scheme for speculative execution – contd.
• Write results into ROB in the “write result” stage
• Write results into register file or memory in the
“commit” stage
• Dependent instructions can read operands from
ROB
• A speculative instruction commits only if the
prediction is determined to be correct
• Instructions may complete execution out-of-order,
but they commit in-order
8. Issue
• Get next instruction from instruction queue
• Check if there is a matching RS which is
empty
– no: structural hazard, instruction stalls
– yes: issue the instruction to that RS
• For each operand, check if it is available in
RF
– yes: put the operand in the RS
– no: keep track of FU that will produce it
9. Execute
• If one or more operands not available, wait
and monitor CDB
• When an operand becomes available, it is
placed in RS
• When all operands are available, start
execution
• Choice may need to be made if multiple
instructions become ready at the same time
10. Write result
• When result is available
– write it on CDB and
– from there into RF and relevant RSs
• Mark RS as available
12. RS and RF fields
RS: op busy Qj Vj Qk Vk
RF: val Qi
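The field lists above can be written down as two small record types. This is only a sketch: the class names, the use of `None` for the empty tag φ, and the table sizes are assumptions made for illustration, not something the slides specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReservationStation:
    op: Optional[str] = None   # operation waiting in this station
    busy: bool = False         # station currently holds an instruction
    Qj: Optional[int] = None   # tag of the RS producing operand j (None = φ)
    Vj: Optional[int] = None   # value of operand j once it is available
    Qk: Optional[int] = None   # tag of the RS producing operand k (None = φ)
    Vk: Optional[int] = None   # value of operand k once it is available


@dataclass
class Register:
    val: int = 0               # architectural value
    Qi: Optional[int] = None   # tag of the RS that will write this register


# Sizes are arbitrary choices for the sketch.
RS = [ReservationStation() for _ in range(3)]
RF = [Register() for _ in range(8)]
```

A register with `Qi` set is waiting for a result; a reservation station with both `Qj` and `Qk` empty has all of its operands.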
13. Issue
• Get instruction <op, rd, rs, rt> from instruction
queue
• Wait until ∃ r | RS[r].busy = no and RF[rd].Qi = φ
• if (RF[rs].Qi ≠ φ)
{RS[r].Qj ← RF[rs].Qi}
else {RS[r].Vj ← RF[rs].val; RS[r].Qj ← φ}
• similarly for rt
• RS[r].op ← op; RS[r].busy ← yes; RF[rd].Qi ← r
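The issue rules above can be sketched in a few lines of Python. The helper names (`make_rs`, `make_rf`, `issue`), the dictionary layout, and the choice of the lowest-numbered free station are assumptions for illustration; returning `None` stands in for "the instruction stalls".

```python
PHI = None  # stands in for the empty tag φ


def make_rs(n):
    return [dict(op=None, busy=False, Qj=PHI, Vj=None, Qk=PHI, Vk=None)
            for _ in range(n)]


def make_rf(n):
    return [dict(val=0, Qi=PHI) for _ in range(n)]


def issue(RS, RF, op, rd, rs, rt):
    """Try to issue <op, rd, rs, rt>; return the RS index, or None on a stall."""
    if RF[rd]["Qi"] is not PHI:          # destination still has a pending writer
        return None
    free = [r for r, e in enumerate(RS) if not e["busy"]]
    if not free:                         # structural hazard: no free RS
        return None
    r = free[0]
    for src, q, v in ((rs, "Qj", "Vj"), (rt, "Qk", "Vk")):
        if RF[src]["Qi"] is not PHI:     # operand pending: remember producer tag
            RS[r][q] = RF[src]["Qi"]
        else:                            # operand available: copy the value
            RS[r][v] = RF[src]["val"]
            RS[r][q] = PHI
    RS[r]["op"] = op
    RS[r]["busy"] = True
    RF[rd]["Qi"] = r                     # rd will be written by station r
    return r


RS = make_rs(2)
RF = make_rf(4)
RF[1]["val"], RF[2]["val"] = 5, 7
first = issue(RS, RF, "add", 3, 1, 2)    # both operands read from RF
second = issue(RS, RF, "mul", 0, 3, 1)   # r3 is pending, so Qj gets first's tag
third = issue(RS, RF, "sub", 2, 0, 1)    # no free RS left: stall
```

Operands are read before `RF[rd].Qi` is updated, so an instruction whose destination is also one of its sources still sees the old register state.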
14. Execute
• Wait until RS[r].Qj = φ and RS[r].Qk = φ
• Compute result: operation is RS[r].op,
operands are RS[r].Vj and RS[r].Vk
15. Write result
• Wait until execution complete at r and CDB
available
• ∀ x if (RF[x].Qi = r)
{RF[x].val ← result; RF[x].Qi ← φ}
• ∀ x if (RS[x].Qj = r)
{RS[x].Vj ← result; RS[x].Qj ← φ}
• similarly for Qk / Vk
• RS[r].busy ← no
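As a rough illustration of the write-result step above, the broadcast on the CDB can be modelled as a scan over the register file and the reservation stations. The function name `write_result` and the dictionary layout are assumptions; CDB arbitration is not modelled.

```python
PHI = None  # stands in for the empty tag φ


def write_result(RS, RF, r, result):
    """Broadcast the result of station r to every waiting register and station."""
    for reg in RF:
        if reg["Qi"] == r:               # register waiting on station r
            reg["val"] = result
            reg["Qi"] = PHI
    for e in RS:
        if e["Qj"] == r:                 # operand j was waiting on station r
            e["Vj"], e["Qj"] = result, PHI
        if e["Qk"] == r:                 # operand k was waiting on station r
            e["Vk"], e["Qk"] = result, PHI
    RS[r]["busy"] = False                # station r is free again


# Station 0 holds add r3 <- 5 + 7; station 1 holds mul r0 <- r3 * 5.
RS = [dict(op="add", busy=True, Qj=PHI, Vj=5, Qk=PHI, Vk=7),
      dict(op="mul", busy=True, Qj=0, Vj=None, Qk=PHI, Vk=5)]
RF = [dict(val=0, Qi=1), dict(val=5, Qi=PHI),
      dict(val=7, Qi=PHI), dict(val=0, Qi=0)]

write_result(RS, RF, 0, 5 + 7)
```

After the call, station 1 has both operands and can start executing, while r0 still waits on station 1.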
17. Issue
• Get next instruction from instruction queue
• Check if there is a matching RS which is empty
and an empty slot in ROB
– no: structural hazard, instruction stalls
– yes: issue the instruction to that RS and mark the ROB
slot, also put ROB slot number in RS
• For each operand, check if it is available in RF or
ROB
– yes: put the operand in the RS
– no: keep track of FU that will produce it
18. Execute (no change)
• If one or more operands not available, wait
and monitor CDB
• When an operand becomes available, it is
placed in RS
• When all operands are available, start
execution
• Choice may need to be made if multiple
instructions become ready at the same time
19. Write result
• When result is available
– write it on CDB with ROB tag and
– from there into ROB and relevant RSs
• Mark RS as available
20. Commit (non-branch instruction)
• Wait until instruction reaches head of ROB
• Update RF
• Remove instruction from ROB
21. Commit (branch instruction)
• Wait until instruction reaches head of ROB
• If branch is mispredicted,
– flush ROB
– Restart execution at correct successor of the
branch instruction
• else
– Remove instruction from ROB
25. ROB fields
inst busy rdy val dst
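One common way to hold these fields is a circular buffer with a head (oldest instruction) and a tail (next free slot). The sketch below assumes that organisation; the class and method names (`ROBEntry`, `ReorderBuffer`, `allocate`, `retire`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ROBEntry:
    inst: Optional[str] = None  # instruction (here just its opcode)
    busy: bool = False          # slot is occupied
    rdy: bool = False           # result has been written into the slot
    val: Optional[int] = None   # result value
    dst: Optional[int] = None   # destination register number


class ReorderBuffer:
    def __init__(self, size):
        self.entries = [ROBEntry() for _ in range(size)]
        self.head = 0           # oldest instruction, next to commit
        self.tail = 0           # next slot to allocate

    def allocate(self, inst, dst):
        """Claim the tail slot; return its index, or None if the ROB is full."""
        b = self.tail
        if self.entries[b].busy:
            return None
        self.entries[b] = ROBEntry(inst=inst, busy=True, dst=dst)
        self.tail = (b + 1) % len(self.entries)
        return b

    def retire(self):
        """Remove and return the head entry if it is ready, else None."""
        h = self.head
        e = self.entries[h]
        if not (e.busy and e.rdy):
            return None
        self.entries[h] = ROBEntry()
        self.head = (h + 1) % len(self.entries)
        return e


rob = ReorderBuffer(2)
a = rob.allocate("add", 3)
b = rob.allocate("mul", 0)
c = rob.allocate("sub", 2)          # ROB full: None
rob.entries[b].rdy, rob.entries[b].val = True, 60
blocked = rob.retire()              # head (add) not ready yet: None
rob.entries[a].rdy, rob.entries[a].val = True, 12
first, second = rob.retire(), rob.retire()
```

Even though `mul` finishes first, it cannot leave the buffer until `add` at the head has retired, which is what gives in-order commit.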
26. Issue
• Get instruction <op, rd, rs, rt> from instruction queue
• Wait until ∃ r | RS[r].busy = no and ROB[b].busy = no, where b = ROB tail
• if (RF[rs].busy) {h ← RF[rs].Qi;
if (ROB[h].rdy) {RS[r].Vj ← ROB[h].val; RS[r].Qj ← φ}
else {RS[r].Qj ← h}
} else {RS[r].Vj ← RF[rs].val; RS[r].Qj ← φ}
• similarly for rt
• RS[r].op ← op; RS[r].busy ← yes; RS[r].Qi ← b
• RF[rd].Qi ← b; RF[rd].busy ← yes; ROB[b].busy ← yes
• ROB[b].inst ← op; ROB[b].dst ← rd; ROB[b].rdy ← no
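The speculative issue step can be sketched as follows. Tags are now ROB slot numbers rather than RS numbers. The function name `issue_spec`, the dictionary layout, and passing the ROB tail in and out explicitly are assumptions made to keep the example small.

```python
PHI = None  # stands in for the empty tag φ


def issue_spec(RS, RF, ROB, tail, op, rd, rs, rt):
    """Return (r, b, new_tail) on success, or None if the instruction stalls."""
    b = tail
    free = [r for r, e in enumerate(RS) if not e["busy"]]
    if not free or ROB[b]["busy"]:       # need a free RS and a free ROB slot
        return None
    r = free[0]
    for src, q, v in ((rs, "Qj", "Vj"), (rt, "Qk", "Vk")):
        if RF[src]["busy"]:              # a pending writer exists in the ROB
            h = RF[src]["Qi"]
            if ROB[h]["rdy"]:            # result already sitting in the ROB
                RS[r][v], RS[r][q] = ROB[h]["val"], PHI
            else:                        # wait for ROB slot h on the CDB
                RS[r][q] = h
        else:                            # take the committed value from RF
            RS[r][v], RS[r][q] = RF[src]["val"], PHI
    RS[r].update(op=op, busy=True, Qi=b)
    RF[rd].update(Qi=b, busy=True)
    ROB[b].update(inst=op, dst=rd, rdy=False, busy=True)
    return r, b, (b + 1) % len(ROB)


RS = [dict(op=None, busy=False, Qj=PHI, Vj=None, Qk=PHI, Vk=None, Qi=PHI)
      for _ in range(2)]
RF = [dict(val=0, Qi=PHI, busy=False) for _ in range(4)]
ROB = [dict(inst=None, busy=False, rdy=False, val=None, dst=None)
       for _ in range(4)]
RF[1]["val"], RF[2]["val"] = 5, 7

r0, b0, tail = issue_spec(RS, RF, ROB, 0, "add", 3, 1, 2)
r1, b1, tail = issue_spec(RS, RF, ROB, tail, "mul", 3, 3, 1)   # rewrites r3
stalled = issue_spec(RS, RF, ROB, tail, "sub", 2, 0, 1)        # no free RS
```

In this sketch the second instruction issues even though it writes the same register as the first: `RF[3].Qi` simply moves to the newer ROB slot, and the ROB keeps the two results apart until commit.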
27. Execute (no change)
• Wait until RS[r].Qj = φ and RS[r].Qk = φ
• Compute result: operation is RS[r].op,
operands are RS[r].Vj and RS[r].Vk
28. Write result
• Wait until execution complete at r and CDB
available
• b ← RS[r].Qi
• ∀ x if (RS[x].Qj = b)
{RS[x].Vj ← result; RS[x].Qj ← φ}
• similarly for Qk / Vk
• RS[r].busy ← no
• ROB[b].rdy ← yes; ROB[b].val ← result
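A minimal sketch of the speculative write-result step, assuming the same dictionary layout as a plain Python model would use (the name `write_result_spec` is hypothetical). The result goes to the ROB slot and to waiting reservation stations; the register file is deliberately left alone until commit.

```python
PHI = None  # stands in for the empty tag φ


def write_result_spec(RS, ROB, r, result):
    """Broadcast the result of station r under its ROB tag; return that tag."""
    b = RS[r]["Qi"]                      # ROB slot assigned at issue
    for e in RS:
        if e["Qj"] == b:                 # operand j was waiting on slot b
            e["Vj"], e["Qj"] = result, PHI
        if e["Qk"] == b:                 # operand k was waiting on slot b
            e["Vk"], e["Qk"] = result, PHI
    RS[r]["busy"] = False
    ROB[b]["rdy"] = True
    ROB[b]["val"] = result
    return b


# Station 0: add r3 <- 5 + 7 (ROB slot 0); station 1: mul r3 <- r3 * 5 (slot 1).
RS = [dict(op="add", busy=True, Qj=PHI, Vj=5, Qk=PHI, Vk=7, Qi=0),
      dict(op="mul", busy=True, Qj=0, Vj=None, Qk=PHI, Vk=5, Qi=1)]
ROB = [dict(inst="add", busy=True, rdy=False, val=None, dst=3),
       dict(inst="mul", busy=True, rdy=False, val=None, dst=3)]
RF = [dict(val=0, Qi=PHI, busy=False) for _ in range(4)]
RF[3].update(Qi=1, busy=True)

tag = write_result_spec(RS, ROB, 0, 5 + 7)
```

After the call, `ROB[0]` holds the value 12 and station 1 can execute, but `RF[3]` is unchanged because nothing has committed yet.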
29. Commit (non-branch instruction)
• Wait until instruction reaches head of ROB
(entry h) and ROB[h].rdy = yes
• d ← ROB[h].dst
• RF[d].val ← ROB[h].val
• ROB[h].busy ← no
• if (RF[d].Qi = h) {RF[d].busy ← no}
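The commit rule for a non-branch instruction can be sketched as below. The function name `commit` and the convention of returning the new head index (or `None` while waiting) are assumptions for illustration.

```python
def commit(RF, ROB, head):
    """Commit the ROB head if it is ready; return the new head, else None."""
    h = head
    if not (ROB[h]["busy"] and ROB[h]["rdy"]):
        return None                      # head has not produced its result yet
    d = ROB[h]["dst"]
    RF[d]["val"] = ROB[h]["val"]         # architectural state is updated here
    ROB[h]["busy"] = False
    if RF[d]["Qi"] == h:                 # no younger writer of d is in flight
        RF[d]["busy"] = False
    return (h + 1) % len(ROB)


# Two in-flight writers of r3: add in slot 0 (done), mul in slot 1 (not done).
RF = [dict(val=0, Qi=None, busy=False) for _ in range(4)]
RF[3].update(Qi=1, busy=True)
ROB = [dict(inst="add", busy=True, rdy=True, val=12, dst=3),
       dict(inst="mul", busy=True, rdy=False, val=None, dst=3)]

head = commit(RF, ROB, 0)
after_first = (RF[3]["val"], RF[3]["busy"])
waiting = commit(RF, ROB, head)          # mul not ready: nothing happens
ROB[1].update(rdy=True, val=60)
head2 = commit(RF, ROB, head)
```

The check `RF[d].Qi = h` matters in this example: after the first commit r3 holds 12 but stays busy, because the younger `mul` in slot 1 is still going to write it.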
30. Commit (branch instruction)
• Wait until instruction reaches head of ROB
(entry h) and ROB[h].rdy = yes
• If branch is mispredicted,
– clear ROB, and RF[ ].Qi and RF[ ].busy for all registers
– fetch from the correct branch destination
• else
– ROB[h].busy ← no
– if (RF[d].Qi = h) {RF[d].busy ← no}
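The branch case can be sketched in the same style. The function name `commit_branch`, the `(head, tail, fetch_pc)` return value, and resetting head and tail to slot 0 after a flush are assumptions for illustration; how the misprediction is detected is outside the sketch and is passed in as a flag.

```python
PHI = None  # stands in for the empty tag φ


def commit_branch(RF, ROB, head, tail, mispredicted, correct_pc):
    """Return (new_head, new_tail, fetch_pc), or None while the branch waits."""
    h = head
    if not (ROB[h]["busy"] and ROB[h]["rdy"]):
        return None
    if mispredicted:
        for e in ROB:                    # squash every speculative instruction
            e.update(inst=None, busy=False, rdy=False, val=None, dst=None)
        for reg in RF:                   # no register has a pending writer now
            reg.update(Qi=PHI, busy=False)
        return 0, 0, correct_pc          # restart fetch at the correct target
    ROB[h]["busy"] = False               # prediction was right: just retire
    return (h + 1) % len(ROB), tail, None


# A ready branch at the head, followed by two speculative instructions.
RF = [dict(val=v, Qi=PHI, busy=False) for v in (0, 5, 7, 0)]
RF[1].update(Qi=1, busy=True)
RF[2].update(Qi=2, busy=True)
ROB = [dict(inst="beq", busy=True, rdy=True, val=None, dst=None),
       dict(inst="add", busy=True, rdy=False, val=None, dst=1),
       dict(inst="mul", busy=True, rdy=True, val=35, dst=2),
       dict(inst=None, busy=False, rdy=False, val=None, dst=None)]
flushed = commit_branch(RF, ROB, 0, 3, True, 64)

# Same kind of state, but the prediction was correct.
RF2 = [dict(val=0, Qi=PHI, busy=False), dict(val=5, Qi=1, busy=True)]
ROB2 = [dict(inst="beq", busy=True, rdy=True, val=None, dst=None),
        dict(inst="add", busy=True, rdy=False, val=None, dst=1)]
kept = commit_branch(RF2, ROB2, 0, 0, False, None)
```

In the flushed case the register values 5 and 7 are untouched, since the squashed `add` and `mul` never reached commit. A fuller model would also have to free the reservation stations that hold squashed instructions.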