Lec Feb09 2009

CSL718 : Superscalar
Processors

Dynamic Scheduling and
Speculative Execution
9th Feb, 2009

Anshul Kumar, CSE IITD

Handling Control Dependence
• Simple pipeline
– Branch prediction reduces stalls due to control
dependence
• Wide issue processor
– Mere branch prediction is not sufficient
– Instructions in the predicted path need to be
fetched and EXECUTED (speculative
execution)

slide 2

What is required for speculation?
• Branch prediction to choose which
instructions to execute
• Execution of instructions before control
dependences are resolved
• Ability to undo the effects of incorrectly
speculated sequence
• Preservation of correct behaviour under
exceptions

slide 3

Types of speculation
• Hardware based speculation
– done with dynamic branch prediction and
dynamic scheduling
– used in Superscalar processors
• Compiler based speculation
– done with static branch prediction and static
scheduling
– used in VLIW processors

slide 4

Extending Tomasulo’s scheme for
speculative execution
• Introduce re-order buffer (ROB)
• Add another stage – “commit”

Normal execution     Speculative execution
• Issue              • Issue
• Execute            • Execute
• Write result       • Write result
                     • Commit

slide 5

speculative execution – contd.
• Write results into ROB in the “write result” stage
• Write results into register file or memory in the
“commit” stage
• Dependent instructions can read operands from
ROB
• A speculative instruction commits only if the
prediction is determined to be correct
• Instructions may complete execution out-of-order,
but they commit in-order

slide 6
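
The last point above (out-of-order completion, in-order commit) can be illustrated with a small sketch. It assumes the ROB is modelled as a Python deque whose left end is the head; the function name commit_order and the instruction labels are made up for illustration and are not from the lecture.

```python
from collections import deque


def commit_order(program, completion_order):
    """Return the order in which instructions commit.

    program: instruction names in program (issue) order.
    completion_order: names in the order they finish executing.
    """
    # One ROB entry per issued instruction; the head (left end) is the oldest.
    rob = deque({"inst": name, "rdy": False} for name in program)
    committed = []
    for finished in completion_order:
        # "Write result": mark the finished entry ready, wherever it sits.
        for entry in rob:
            if entry["inst"] == finished:
                entry["rdy"] = True
        # "Commit": retire only from the head, and only while the head is ready.
        while rob and rob[0]["rdy"]:
            committed.append(rob.popleft()["inst"])
    return committed


# i3 finishes first, but nothing commits until i1 (the head) is ready.
order = commit_order(["i1", "i2", "i3"], ["i3", "i1", "i2"])
print(order)  # ['i1', 'i2', 'i3']
```

Whatever the completion order, the commit order equals the program order, which is what allows a mispredicted path to be discarded cleanly.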

Recall Tomasulo’s scheme ......

slide 7

Issue
• Get next instruction from instruction queue
• Check if there is a matching RS which is
empty
– no: structural hazard, instruction stalls
– yes: issue the instruction to that RS
• For each operand, check if it is available in
RF
– yes: put the operand in the RS
– no: keep track of FU that will produce it

slide 8

Execute
• If one or more operands not available, wait
and monitor CDB
• When an operand becomes available, it is
placed in RS
• When all operands are available, start
execution
• Choice may need to be made if multiple
instructions become ready at the same time

slide 9

Write result
• When result is available
– write it on CDB and
– from there into RF and relevant RSs
• Mark RS as available

slide 10

More formal description ......

slide 11

RS and RF fields
RS: op busy Qj Vj Qk Vk
RF: val Qi

slide 12

Issue
• Get instruction <op, rd, rs, rt> from instruction
queue
• Wait until ∃ r | RS[r].busy = no and RF[rd].Qi = φ
• if (RF[rs].Qi ≠ φ)
{RS[r].Qj ← RF[rs].Qi}
else {RS[r].Vj ← RF[rs].val; RS[r].Qj ← φ}
• similarly for rt
• RS[r].op ← op; RS[r].busy ← yes; RF[rd].Qi ← r

slide 13
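
The issue rules above can be traced in a few lines of Python. This is only a sketch under assumptions that are not in the slides: RS and RF are plain dictionaries, φ is represented by None, any free RS is treated as "matching", and the helper names issue and empty_rs are invented for illustration.

```python
def issue(inst, RS, RF):
    """Try to issue <op, rd, rs, rt>; return the chosen RS name, or None on a stall."""
    op, rd, rs, rt = inst
    free = [r for r, e in RS.items() if not e["busy"]]
    # Stall unless some RS is free and no pending write targets rd (RF[rd].Qi = φ).
    if not free or RF[rd]["Qi"] is not None:
        return None
    r = free[0]
    # Source operands: copy the value if ready, else record the producing RS.
    for src, q, v in ((rs, "Qj", "Vj"), (rt, "Qk", "Vk")):
        if RF[src]["Qi"] is not None:
            RS[r][q] = RF[src]["Qi"]
        else:
            RS[r][v], RS[r][q] = RF[src]["val"], None
    RS[r]["op"], RS[r]["busy"] = op, True
    RF[rd]["Qi"] = r
    return r


def empty_rs():
    # Assumed layout: one dictionary per reservation station.
    return {"op": None, "busy": False, "Qj": None, "Vj": None, "Qk": None, "Vk": None}


RS = {"add1": empty_rs(), "add2": empty_rs()}
RF = {name: {"val": v, "Qi": None}
      for name, v in (("r1", 10), ("r2", 20), ("r3", 0), ("r4", 0))}

first = issue(("add", "r3", "r1", "r2"), RS, RF)   # both operands are ready
second = issue(("add", "r4", "r3", "r1"), RS, RF)  # r3 is still being produced by `first`
```

After these two calls, the second RS holds the tag of the first in Qj and the value of r1 in Vk; a third call would stall because no RS is free.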

Execute
• Wait until RS[r].Qj = φ and RS[r].Qk = φ
• Compute result: operation is RS[r].op,
operands are RS[r].Vj and RS[r].Vk

slide 14

Write result
• Wait until execution complete at r and CDB
available
• ∀ x if (RF[x].Qi = r)
{RF[x].val ← result; RF[x].Qi ← φ}
• ∀ x if (RS[x].Qj = r)
{RS[x].Vj ← result; RS[x].Qj ← φ}
• similarly for Qk / Vk
• RS[r].busy ← no
slide 15
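
The execute condition and the write-result broadcast can be sketched the same way. The dictionary layout, the use of None for φ, and the function names ready and write_result are assumptions made for this illustration only.

```python
def ready(r, RS):
    """Execute may start once both operand tags of r are cleared."""
    return RS[r]["Qj"] is None and RS[r]["Qk"] is None


def write_result(r, result, RS, RF):
    """Broadcast `result` produced by reservation station r, then free r."""
    # Registers waiting on r take the value and drop the tag.
    for reg in RF.values():
        if reg["Qi"] == r:
            reg["val"], reg["Qi"] = result, None
    # Reservation stations waiting on r capture the operand.
    for e in RS.values():
        if e["Qj"] == r:
            e["Vj"], e["Qj"] = result, None
        if e["Qk"] == r:
            e["Vk"], e["Qk"] = result, None
    RS[r]["busy"] = False


RS = {
    "add1": {"op": "add", "busy": True, "Qj": None, "Vj": 10, "Qk": None, "Vk": 20},
    "add2": {"op": "add", "busy": True, "Qj": "add1", "Vj": None, "Qk": None, "Vk": 10},
}
RF = {"r3": {"val": 0, "Qi": "add1"}, "r4": {"val": 0, "Qi": "add2"}}

ready_before = ready("add2", RS)   # Qj still names add1
write_result("add1", RS["add1"]["Vj"] + RS["add1"]["Vk"], RS, RF)
ready_after = ready("add2", RS)    # operand captured from the broadcast
```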

Tomasulo’s scheme plus ROB......

slide 16

Issue
• Get next instruction from instruction queue
• Check if there is a matching RS which is empty
and an empty slot in ROB
– no: structural hazard, instruction stalls
– yes: issue the instruction to that RS and mark the ROB
slot, also put ROB slot number in RS
• For each operand, check if it is available in RF or
ROB
– yes: put the operand in the RS
– no: keep track of the ROB slot that will produce it

slide 17

Execute (no change)
• If one or more operands not available, wait
and monitor CDB
• When an operand becomes available, it is
placed in RS
• When all operands are available, start
execution
• Choice may need to be made if multiple
instructions become ready at the same time

slide 18

Write result
• When result is available
– write it on CDB with ROB tag and
– from there into ROB and relevant RSs
• Mark RS as available

slide 19

Commit (non-branch instruction)
• Wait until instruction reaches head of ROB
• Update RF
• Remove instruction from ROB

slide 20

Commit (branch instruction)
• If branch is mispredicted,
– flush ROB
– Restart execution at correct successor of the
branch instruction
• else
– Remove instruction from ROB

slide 21

More formal description ......

slide 22

RS fields
op busy Qi Qj Vj Qk Vk

slide 23

RF fields
val Qi busy

slide 24

ROB fields
inst busy rdy val dst

slide 25

Issue
• Get instruction <op, rd, rs, rt> from instruction queue
• Wait until ∃r | RS[r].busy=no and
ROB[b].busy=no, where b = ROB tail
• if (RF[rs].busy) {h ← RF[rs].Qi;
if (ROB[h].rdy) {RS[r].Vj ← ROB[h].val; RS[r].Qj ← φ}
else {RS[r].Qj ← h}
} else {RS[r].Vj ← RF[rs].val; RS[r].Qj ← φ}
• similarly for rt
• RS[r].op ← op; RS[r].busy ← yes; RS[r].Qi ← b
• RF[rd].Qi ← b; RF[rd].busy ← yes; ROB[b].busy ← yes
• ROB[b].inst ← op; ROB[b].dst ← rd; ROB[b].rdy ← no

slide 26
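
A possible Python rendering of issue with a ROB is sketched below. It assumes the ROB is a fixed-size list used as a circular buffer with integer tags, that φ is None, and that any free RS counts as "matching"; the helper names issue and empty_rs and the way the tail index is passed in and returned are choices made for this sketch, not part of the lecture.

```python
def issue(inst, RS, RF, ROB, tail):
    """Try to issue <op, rd, rs, rt>; return (r, b, new_tail), or None on a stall."""
    op, rd, rs, rt = inst
    b = tail
    free = [r for r, e in RS.items() if not e["busy"]]
    if not free or ROB[b]["busy"]:
        return None                      # structural hazard: no RS or no ROB slot
    r = free[0]
    for src, q, v in ((rs, "Qj", "Vj"), (rt, "Qk", "Vk")):
        if RF[src]["busy"]:
            h = RF[src]["Qi"]            # ROB entry that will produce the operand
            if ROB[h]["rdy"]:
                RS[r][v], RS[r][q] = ROB[h]["val"], None
            else:
                RS[r][q] = h
        else:
            RS[r][v], RS[r][q] = RF[src]["val"], None
    RS[r].update(op=op, busy=True, Qi=b)
    RF[rd].update(Qi=b, busy=True)
    ROB[b].update(inst=op, dst=rd, rdy=False, busy=True)
    return r, b, (b + 1) % len(ROB)


def empty_rs():
    # Assumed layout: one dictionary per reservation station.
    return {"op": None, "busy": False, "Qi": None,
            "Qj": None, "Vj": None, "Qk": None, "Vk": None}


RS = {"add1": empty_rs(), "add2": empty_rs()}
RF = {
    "r1": {"val": 10, "Qi": 0, "busy": True},      # r1 is renamed to ROB entry 0
    "r2": {"val": 20, "Qi": None, "busy": False},
    "r3": {"val": 0, "Qi": None, "busy": False},
    "r4": {"val": 0, "Qi": None, "busy": False},
}
ROB = [{"inst": None, "busy": False, "rdy": False, "val": None, "dst": None}
       for _ in range(4)]
ROB[0].update(inst="add", busy=True, rdy=True, val=99, dst="r1")  # done, not committed

first = issue(("add", "r3", "r1", "r2"), RS, RF, ROB, tail=1)       # r1 read from ROB[0]
second = issue(("add", "r4", "r3", "r2"), RS, RF, ROB, tail=first[2])  # waits on ROB[1]
```

The first instruction reads r1 from the ROB because the value is ready there but not yet committed; the second records ROB tag 1 in Qj because that entry is not ready.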

Execute (no change)
• Wait until RS[r].Qj = φ and RS[r].Qk = φ
• Compute result: operation is RS[r].op,
operands are RS[r].Vj and RS[r].Vk

slide 27

Write result
• Wait until execution complete at r and CDB
available
• b ← RS[r].Qi
• ∀ x if (RS[x].Qj = b)
{RS[x].Vj ← result; RS[x].Qj ← φ}
• similarly for Qk / Vk
• RS[r].busy ← no
• ROB[b].rdy ← yes; ROB[b].val ← result
slide 28
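
With a ROB, the write-result step broadcasts the ROB tag rather than the RS name and leaves the register file alone. The sketch below assumes dictionary-based RS entries, a list-based ROB with integer tags, and None for φ; the function name write_result is invented for illustration.

```python
def write_result(r, result, RS, ROB):
    """Broadcast `result` of reservation station r under its ROB tag, then free r."""
    b = RS[r]["Qi"]                      # ROB entry allocated to this instruction
    for e in RS.values():
        if e["Qj"] == b:
            e["Vj"], e["Qj"] = result, None
        if e["Qk"] == b:
            e["Vk"], e["Qk"] = result, None
    RS[r]["busy"] = False
    # The value is parked in the ROB; the register file changes only at commit.
    ROB[b]["rdy"], ROB[b]["val"] = True, result


RS = {
    "add1": {"op": "add", "busy": True, "Qi": 1,
             "Qj": None, "Vj": 10, "Qk": None, "Vk": 20},
    "add2": {"op": "add", "busy": True, "Qi": 2,
             "Qj": 1, "Vj": None, "Qk": None, "Vk": 5},
}
ROB = [{"inst": "add", "busy": True, "rdy": False, "val": None, "dst": d}
       for d in ("r1", "r3", "r4")]

write_result("add1", 30, RS, ROB)
```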

Commit (non-branch instruction)
• Wait until instruction reaches head of ROB
(entry h) and ROB[h].rdy = yes
• d ← ROB[h].dst
• RF[d].val ← ROB[h].val
• ROB[h].busy ← no
• if (RF[d].Qi = h) {RF[d].busy ← no}

slide 29
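
The commit rule for a non-branch instruction can be sketched as follows. The list-based circular ROB, the explicit head index, and the function name commit are assumptions of this sketch. The example also shows why RF[d].busy is cleared only when RF[d].Qi = h: a younger instruction may have renamed the same register.

```python
def commit(ROB, RF, head):
    """Retire the entry at the ROB head if it is ready; return the new head index."""
    h = head
    if not (ROB[h]["busy"] and ROB[h]["rdy"]):
        return head                      # oldest instruction not finished: wait
    d = ROB[h]["dst"]
    RF[d]["val"] = ROB[h]["val"]
    ROB[h]["busy"] = False
    if RF[d]["Qi"] == h:                 # no younger instruction has renamed d
        RF[d]["busy"] = False
    return (h + 1) % len(ROB)


ROB = [
    {"inst": "add", "busy": True, "rdy": True, "val": 30, "dst": "r3"},    # older write
    {"inst": "sub", "busy": True, "rdy": False, "val": None, "dst": "r3"},  # younger write
]
RF = {"r3": {"val": 0, "Qi": 1, "busy": True}}

head = commit(ROB, RF, head=0)  # r3 gets 30 but stays busy: entry 1 still owns it
```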

Commit (branch instruction)
• Wait until instruction reaches head of ROB
(entry h) and ROB[h].rdy = yes
• If branch is mispredicted,
– clear ROB, RF[ ].Qi
– fetch from the correct successor of the branch
• else
– ROB[h].busy ← no
– if (RF[d].Qi = h) {RF[d].busy ← no}

slide 30
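
Branch commit with recovery can be sketched in the same style. Several things here are assumptions rather than lecture content: the function name commit_branch, passing the misprediction flag and correct PC as arguments, also clearing RF busy bits on a flush, and treating a branch as having no destination register (so the RF[d] check is omitted). Flushing the reservation stations is not modelled.

```python
def commit_branch(ROB, RF, h, mispredicted, correct_pc):
    """Commit the branch at ROB head h; return the PC to refetch from, or None."""
    if not ROB[h]["rdy"]:
        raise ValueError("branch at ROB head has not finished executing")
    if mispredicted:
        for entry in ROB:                # clear ROB: squash all in-flight work
            entry["busy"] = entry["rdy"] = False
        for reg in RF.values():          # clear RF[ ].Qi: RF holds committed values
            reg["Qi"], reg["busy"] = None, False
        return correct_pc
    ROB[h]["busy"] = False               # correct prediction: just retire the branch
    return None


ROB = [
    {"inst": "beq", "busy": True, "rdy": True, "val": None, "dst": None},
    {"inst": "add", "busy": True, "rdy": True, "val": 5, "dst": "r3"},     # speculative
    {"inst": "add", "busy": True, "rdy": False, "val": None, "dst": "r4"},  # speculative
]
RF = {"r3": {"val": 0, "Qi": 1, "busy": True},
      "r4": {"val": 0, "Qi": 2, "busy": True}}

next_pc = commit_branch(ROB, RF, h=0, mispredicted=True, correct_pc=0x40)
```

Because speculative results were only ever written to the ROB, discarding them leaves the register file holding exactly the committed state.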
