CSL718: Superscalar Processors
Dynamic Scheduling and Speculative Execution
9th Feb, 2009
Anshul Kumar, CSE IITD

Handling Control Dependence
• Simple pipeline
   – Branch prediction reduces stalls due to control
     dependence
• Wide issue processor
   – Mere branch prediction is not sufficient
   – Instructions on the predicted path need to be
     fetched and EXECUTED (speculative execution)


What is required for speculation?
• Branch prediction to choose which
  instructions to execute
• Execution of instructions before control
  dependences are resolved
• Ability to undo the effects of incorrectly
  speculated sequence
• Preservation of correct behaviour under
  exceptions

Types of speculation
• Hardware based speculation
   – done with dynamic branch prediction and
     dynamic scheduling
   – used in Superscalar processors
• Compiler based speculation
   – done with static branch prediction and static
     scheduling
   – used in VLIW processors


Extending Tomasulo’s scheme for speculative execution
• Introduce re-order buffer (ROB)
• Add another stage – “commit”

Normal execution         Speculative execution
• Issue                  • Issue
• Execute                • Execute
• Write result           • Write result
                         • Commit
Extending Tomasulo’s scheme for speculative execution – contd.
• Write results into ROB in the “write result” stage
• Write results into register file or memory in the
  “commit” stage
• Dependent instructions can read operands from
  ROB
• A speculative instruction commits only if the
  prediction is determined to be correct
• Instructions may complete execution out-of-order,
  but they commit in-order
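The last two points can be illustrated with a minimal Python sketch. It models the ROB as a simple FIFO; the names `RobEntry` and `commit_ready` are hypothetical and chosen only for this illustration, and branches and memory are ignored.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional

@dataclass
class RobEntry:
    dst: str                      # destination register name
    val: Optional[int] = None     # result, filled in at "write result"
    rdy: bool = False             # True once the result has been written

def commit_ready(rob: deque, rf: dict) -> list:
    """Retire finished entries from the head of the ROB, in program order."""
    committed = []
    while rob and rob[0].rdy:
        entry = rob.popleft()
        rf[entry.dst] = entry.val   # the register file changes only here
        committed.append(entry.dst)
    return committed

rf = {"r1": 0, "r2": 0}
rob = deque([RobEntry("r1"), RobEntry("r2")])   # program order: r1, then r2

# The younger instruction finishes first (out-of-order completion) ...
rob[1].val, rob[1].rdy = 7, True
first = commit_ready(rob, rf)     # ... but it cannot commit past the older one

# Once the older instruction finishes, both retire in program order.
rob[0].val, rob[0].rdy = 3, True
second = commit_ready(rob, rf)
```

Here `first` is empty because the head of the ROB is not ready yet, while `second` lists both registers in program order.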

Recall Tomasulo’s scheme ......




Issue
• Get next instruction from instruction queue
• Check if there is a matching RS which is
  empty
   – no: structural hazard, instruction stalls
   – yes: issue the instruction to that RS
• For each operand, check if it is available in
  RF
   – yes: put the operand in the RS
   – no: keep track of FU that will produce it

Execute
• If one or more operands not available, wait
  and monitor CDB
• When an operand becomes available, it is
  placed in RS
• When all operands are available, start
  execution
• Choice may need to be made if multiple
  instructions become ready at the same time

Write result
• When result is available
   – write it on CDB and
   – from there into RF and relevant RSs
• Mark RS as available




More formal description ......




RS and RF fields
• RS: op, busy, Qj, Vj, Qk, Vk
• RF: val, Qi




Issue
• Get instruction <op, rd, rs, rt> from instruction
  queue
• Wait until ∃ r | RS[r].busy = no and RF[rd].Qi = φ
• if (RF[rs].Qi ≠ φ)
       {RS[r].Qj ← RF[rs].Qi}
  else {RS[r].Vj ← RF[rs].val; RS[r].Qj ← φ}
• similarly for rt
• RS[r].op ← op; RS[r].busy ← yes; RF[rd].Qi ← r
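The issue rules above can be sketched in Python roughly as follows. This is an illustrative model, not the lecture's own code: `RSEntry`, `Reg`, `read_operand` and `issue` are hypothetical names, `None` stands in for φ, and a single pool of reservation stations is assumed (no matching of RS to functional unit type).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RSEntry:
    busy: bool = False
    op: Optional[str] = None
    Vj: Optional[int] = None
    Vk: Optional[int] = None
    Qj: Optional[int] = None   # None stands for φ (no pending producer)
    Qk: Optional[int] = None

@dataclass
class Reg:
    val: int = 0
    Qi: Optional[int] = None   # index of the RS that will write this register

def read_operand(rf, src):
    """Return (V, Q) for one source register."""
    reg = rf[src]
    if reg.Qi is not None:
        return None, reg.Qi    # value still pending: remember the producer
    return reg.val, None       # value available in the register file

def issue(stations, rf, op, rd, rs_reg, rt_reg):
    """Try to issue <op, rd, rs, rt>; return the RS index, or None on a stall."""
    if rf[rd].Qi is not None:              # the slide's condition RF[rd].Qi = φ
        return None
    for r, entry in enumerate(stations):
        if not entry.busy:
            # Read the sources before renaming rd, in case rd is also a source.
            entry.Vj, entry.Qj = read_operand(rf, rs_reg)
            entry.Vk, entry.Qk = read_operand(rf, rt_reg)
            entry.op, entry.busy = op, True
            rf[rd].Qi = r
            return r
    return None                            # structural hazard: no free RS

rf = {name: Reg(val=i) for i, name in enumerate(["r0", "r1", "r2", "r3"])}
stations = [RSEntry(), RSEntry()]
a = issue(stations, rf, "add", "r1", "r2", "r3")   # both operands available
b = issue(stations, rf, "mul", "r0", "r1", "r2")   # r1 is pending on RS 0
c = issue(stations, rf, "sub", "r3", "r2", "r2")   # no free RS left: stall
```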


Execute
• Wait until RS[r].Qj = φ and RS[r].Qk = φ
• Compute result: operation is RS[r].op,
  operands are RS[r].Vj and RS[r].Vk
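As a small sketch of the execute condition, the Python below computes a result only when both tags are empty. The `OPS` table and the function name `try_execute` are assumptions made for the example; the slides do not fix an instruction set.

```python
import operator
from dataclasses import dataclass
from typing import Optional

# Hypothetical operation table for the illustration.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

@dataclass
class RSEntry:
    op: str
    Vj: Optional[int] = None
    Vk: Optional[int] = None
    Qj: Optional[int] = None   # None plays the role of φ
    Qk: Optional[int] = None

def try_execute(entry: RSEntry) -> Optional[int]:
    """Return the result if both operands are available, else None (keep waiting)."""
    if entry.Qj is not None or entry.Qk is not None:
        return None
    return OPS[entry.op](entry.Vj, entry.Vk)

waiting = RSEntry("add", Vj=2, Qk=3)     # second operand still pending on tag 3
ready = RSEntry("mul", Vj=2, Vk=5)
waiting_result = try_execute(waiting)
ready_result = try_execute(ready)
```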




Write result
• Wait until execution complete at r and CDB
  available
• ∀ x if (RF[x].Qi = r)
     {RF[x].val ← result; RF[x].Qi ← φ}
• ∀ x if (RS[x].Qj = r)
     {RS[x].Vj ← result; RS[x].Qj ← φ}
• similarly for Qk / Vk
• RS[r].busy ← no
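The broadcast on the CDB can be sketched as a loop over the register file and the reservation stations. The Python below is an illustrative model under the same assumptions as the formal description (tags are RS indices, `None` stands for φ); the names `RSEntry`, `Reg` and `write_result` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RSEntry:
    busy: bool = False
    Vj: Optional[int] = None
    Vk: Optional[int] = None
    Qj: Optional[int] = None
    Qk: Optional[int] = None

@dataclass
class Reg:
    val: int = 0
    Qi: Optional[int] = None

def write_result(stations, rf, r, result):
    """Broadcast `result`, produced by reservation station `r`, on the CDB."""
    for reg in rf.values():
        if reg.Qi == r:                       # register was waiting on r
            reg.val, reg.Qi = result, None
    for entry in stations:
        if entry.Qj == r:                     # first operand was waiting on r
            entry.Vj, entry.Qj = result, None
        if entry.Qk == r:                     # second operand was waiting on r
            entry.Vk, entry.Qk = result, None
    stations[r].busy = False                  # free the producing RS

stations = [RSEntry(busy=True), RSEntry(busy=True, Qj=0, Vk=4)]
rf = {"r1": Reg(Qi=0), "r2": Reg(val=9)}
write_result(stations, rf, 0, 5)
```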
Tomasulo’s scheme plus ROB......




Issue
• Get next instruction from instruction queue
• Check if there is a matching RS which is empty
  and an empty slot in ROB
   – no: structural hazard, instruction stalls
   – yes: issue the instruction to that RS and mark the ROB
     slot, also put ROB slot number in RS
• For each operand, check if it is available in RF or
  ROB
   – yes: put the operand in the RS
   – no: keep track of FU that will produce it

Execute (no change)
• If one or more operands not available, wait
  and monitor CDB
• When an operand becomes available, it is
  placed in RS
• When all operands are available, start
  execution
• Choice may need to be made if multiple
  instructions become ready at the same time

Write result
• When result is available
   – write it on CDB with ROB tag and
   – from there into ROB and relevant RSs
• Mark RS as available




Commit (non-branch instruction)
• Wait until instruction reaches head of ROB
• Update RF
• Remove instruction from ROB




Commit (branch instruction)
• Wait until instruction reaches head of ROB
• If branch is mispredicted,
   – flush ROB
   – Restart execution at correct successor of the
     branch instruction
• else
   – Remove instruction from ROB



More formal description ......




RS fields
• op, busy, Qi, Qj, Vj, Qk, Vk




RF fields
• val, Qi, busy




ROB fields
• inst, busy, rdy, val, dst
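One possible way to hold these fields is sketched below in Python. The field names follow the slides; treating the ROB as a circular buffer with explicit `head` and `tail` indices, and the method names `allocate` and `retire`, are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RSEntry:
    op: Optional[str] = None
    busy: bool = False
    Qi: Optional[int] = None   # ROB slot that will receive this result
    Qj: Optional[int] = None
    Vj: Optional[int] = None
    Qk: Optional[int] = None
    Vk: Optional[int] = None

@dataclass
class RFEntry:
    val: int = 0
    Qi: Optional[int] = None   # ROB slot of the latest pending writer
    busy: bool = False

@dataclass
class ROBEntry:
    inst: Optional[str] = None
    busy: bool = False
    rdy: bool = False
    val: Optional[int] = None
    dst: Optional[str] = None

class ROB:
    """Circular buffer: allocate at the tail, retire from the head."""
    def __init__(self, size: int):
        self.entries = [ROBEntry() for _ in range(size)]
        self.head = 0
        self.tail = 0

    def allocate(self, inst: str, dst: str) -> Optional[int]:
        b = self.tail
        if self.entries[b].busy:          # ROB full: structural hazard
            return None
        self.entries[b] = ROBEntry(inst=inst, busy=True, dst=dst)
        self.tail = (b + 1) % len(self.entries)
        return b

    def retire(self) -> int:
        h = self.head
        self.entries[h].busy = False
        self.head = (h + 1) % len(self.entries)
        return h

rob = ROB(size=2)
a = rob.allocate("add", "r1")
b = rob.allocate("mul", "r2")
c = rob.allocate("sub", "r3")   # both slots busy, so allocation fails
freed = rob.retire()            # simplification: readiness is not checked here
d = rob.allocate("sub", "r3")   # the freed slot is reused
```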




Issue
• Get instruction <op, rd, rs, rt> from instruction queue
• Wait until ∃ r | RS[r].busy = no and ROB[b].busy = no, where b = ROB tail
• if (RF[rs].busy) {h ← RF[rs].Qi;
       if (ROB[h].rdy) {RS[r].Vj ← ROB[h].val; RS[r].Qj ← φ}
       else {RS[r].Qj ← h}
  } else {RS[r].Vj ← RF[rs].val; RS[r].Qj ← φ}
• similarly for rt
• RS[r].op ← op; RS[r].busy ← yes; RS[r].Qi ← b
• RF[rd].Qi ← b; RF[rd].busy ← yes; ROB[b].busy ← yes
• ROB[b].inst ← op; ROB[b].dst ← rd; ROB[b].rdy ← no

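A possible Python rendering of issue with a ROB is sketched below. It assumes that operand tags are ROB slot numbers, that a source register with a pending writer is looked up through RF[ ].busy and RF[ ].Qi, and that the ROB tail advances by one slot per issued instruction (the slides do not show the tail update). The class name `Machine` and its methods are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RSEntry:
    busy: bool = False
    op: Optional[str] = None
    Qi: Optional[int] = None
    Qj: Optional[int] = None
    Vj: Optional[int] = None
    Qk: Optional[int] = None
    Vk: Optional[int] = None

@dataclass
class RFEntry:
    val: int = 0
    Qi: Optional[int] = None
    busy: bool = False

@dataclass
class ROBEntry:
    inst: Optional[str] = None
    busy: bool = False
    rdy: bool = False
    val: Optional[int] = None
    dst: Optional[str] = None

class Machine:
    def __init__(self, n_rs, n_rob, regs):
        self.rs = [RSEntry() for _ in range(n_rs)]
        self.rob = [ROBEntry() for _ in range(n_rob)]
        self.rf = {name: RFEntry(val=v) for name, v in regs.items()}
        self.tail = 0

    def _operand(self, src):
        """Return (V, Q): take the value from RF or ROB, else wait on a ROB tag."""
        reg = self.rf[src]
        if reg.busy:
            h = reg.Qi
            if self.rob[h].rdy:
                return self.rob[h].val, None   # result already sits in the ROB
            return None, h                     # wait for ROB slot h on the CDB
        return reg.val, None

    def issue(self, op, rd, rs_reg, rt_reg):
        """Return (RS index, ROB slot), or None if the instruction must stall."""
        b = self.tail
        r = next((i for i, e in enumerate(self.rs) if not e.busy), None)
        if r is None or self.rob[b].busy:
            return None
        vj, qj = self._operand(rs_reg)
        vk, qk = self._operand(rt_reg)
        self.rs[r] = RSEntry(busy=True, op=op, Qi=b, Qj=qj, Vj=vj, Qk=qk, Vk=vk)
        self.rf[rd].Qi, self.rf[rd].busy = b, True
        self.rob[b] = ROBEntry(inst=op, busy=True, rdy=False, dst=rd)
        self.tail = (b + 1) % len(self.rob)    # assumption: tail advances here
        return r, b

m = Machine(3, 4, {"r1": 0, "r2": 2, "r3": 3})
first = m.issue("add", "r1", "r2", "r3")     # operands come from the RF
second = m.issue("mul", "r3", "r1", "r2")    # r1 pending in ROB slot 0
m.rob[0].val, m.rob[0].rdy = 5, True         # pretend the add wrote its result
third = m.issue("sub", "r2", "r1", "r1")     # r1 now read from the ROB
```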
Execute (no change)
• Wait until RS[r].Qj = φ and RS[r].Qk = φ
• Compute result: operation is RS[r].op,
  operands are RS[r].Vj and RS[r].Vk




Write result
• Wait until execution complete at r and CDB
  available
• b ← RS[r].Qi
• ∀ x if (RS[x].Qj = b)
     {RS[x].Vj ← result; RS[x].Qj ← φ}
• similarly for Qk / Vk
• RS[r].busy ← no
• ROB[b].rdy ← yes; ROB[b].val ← result
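The write result step with a ROB might look as follows in Python. The sketch assumes that waiting reservation stations match on the ROB tag b, and that the register file is left untouched until commit, in line with the commit slides. The names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RSEntry:
    busy: bool = False
    Qi: Optional[int] = None   # ROB slot assigned to this instruction
    Qj: Optional[int] = None
    Vj: Optional[int] = None
    Qk: Optional[int] = None
    Vk: Optional[int] = None

@dataclass
class ROBEntry:
    busy: bool = False
    rdy: bool = False
    val: Optional[int] = None

def write_result(stations, rob, r, result):
    """Broadcast the result of reservation station r, tagged with its ROB slot."""
    b = stations[r].Qi
    for entry in stations:
        if entry.Qj == b:
            entry.Vj, entry.Qj = result, None
        if entry.Qk == b:
            entry.Vk, entry.Qk = result, None
    stations[r].busy = False
    rob[b].rdy, rob[b].val = True, result    # RF is updated later, at commit

stations = [RSEntry(busy=True, Qi=2), RSEntry(busy=True, Qi=3, Qj=2, Vk=1)]
rob = [ROBEntry() for _ in range(4)]
rob[2].busy = rob[3].busy = True
write_result(stations, rob, 0, 42)
```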
Commit (non-branch instruction)
• Wait until instruction reaches head of ROB
  (entry h) and ROB[h].rdy = yes
• d ← ROB[h].dst
• RF[d].val ← ROB[h].val
• ROB[h].busy ← no
• if (RF[d].Qi = h) {RF[d].busy ← no}
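The check on RF[d].Qi matters when a younger instruction also writes d. The Python sketch below, with hypothetical names and the head index h passed in by the caller, shows the register staying busy until its latest writer commits.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RFEntry:
    val: int = 0
    Qi: Optional[int] = None
    busy: bool = False

@dataclass
class ROBEntry:
    busy: bool = False
    rdy: bool = False
    val: Optional[int] = None
    dst: Optional[str] = None

def commit(rob, rf, h):
    """Commit the entry at the ROB head h; return False if it is not ready yet."""
    entry = rob[h]
    if not (entry.busy and entry.rdy):
        return False
    d = entry.dst
    rf[d].val = entry.val
    entry.busy = False
    if rf[d].Qi == h:          # only the latest writer releases the register
        rf[d].busy = False
    return True

# Two in-flight instructions both write r1; ROB slot 1 is the younger one.
rob = [ROBEntry(busy=True, rdy=True, val=5, dst="r1"),
       ROBEntry(busy=True, rdy=False, dst="r1")]
rf = {"r1": RFEntry(val=0, Qi=1, busy=True)}

ok_first = commit(rob, rf, 0)
val_after_first = rf["r1"].val
still_busy = rf["r1"].busy     # r1 still has a pending writer in slot 1

rob[1].val, rob[1].rdy = 9, True
ok_second = commit(rob, rf, 1)
```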



Commit (branch instruction)
• Wait until instruction reaches head of ROB
  (entry h) and ROB[h].rdy = yes
• If branch is mispredicted,
   – clear ROB, RF[ ].Qi
   – fetch branch dest
• else
   – ROB[h].busy ← no
   – if (RF[d].Qi = h) {RF[d].busy ← no}
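A rough Python sketch of the branch commit rule follows. It assumes that a branch has no destination register (so the RF[d] check is skipped when `dst` is empty) and that the caller supplies the misprediction flag and the correct target. In a full design the reservation stations would presumably be cleared on a flush as well, which is not modelled here. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RFEntry:
    val: int = 0
    Qi: Optional[int] = None
    busy: bool = False

@dataclass
class ROBEntry:
    inst: Optional[str] = None
    busy: bool = False
    rdy: bool = False
    val: Optional[int] = None
    dst: Optional[str] = None

def commit_branch(rob, rf, h, mispredicted, correct_pc):
    """Return the PC to refetch from after a misprediction, otherwise None."""
    if mispredicted:
        for i in range(len(rob)):            # clear ROB
            rob[i] = ROBEntry()
        for reg in rf.values():              # clear RF[ ].Qi (and busy)
            reg.Qi, reg.busy = None, False
        return correct_pc                    # fetch restarts at the correct target
    rob[h].busy = False
    d = rob[h].dst
    if d is not None and rf[d].Qi == h:
        rf[d].busy = False
    return None

# Mispredicted branch at the head, followed by a speculative add to r1.
rob = [ROBEntry(inst="beq", busy=True, rdy=True),
       ROBEntry(inst="add", busy=True, rdy=True, val=1, dst="r1")]
rf = {"r1": RFEntry(val=7, Qi=1, busy=True)}
refetch = commit_branch(rob, rf, 0, mispredicted=True, correct_pc=0x40)

# Correctly predicted branch: only the branch entry itself is released.
rob2 = [ROBEntry(inst="beq", busy=True, rdy=True),
        ROBEntry(inst="add", busy=True, rdy=False, dst="r1")]
rf2 = {"r1": RFEntry(val=7, Qi=1, busy=True)}
no_refetch = commit_branch(rob2, rf2, 0, mispredicted=False, correct_pc=0x40)
```

After the flush, r1 keeps its committed value because the speculative add never reached the register file.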
