From Superscalar OO to Multicore SST
Checkpoint and Transactional memory support for SST
© dave+stratusdesign@gmail.com
stratusdesign.squarespace.com
The OO Superscalar legacy

Year    Inflight Instructions    Clock Speed
1998    90                       600 MHz
2008    200                      3200 MHz
Speculative execution evolution
Evolution to SST
Hazards
OO & SST Differences
Data hazards
SST handling of Data Hazards
SST handling of Control Hazards
SST Memory Consistency Protocol
Checkpoints
SST new circuit structures
SST logic
[Flowchart; node labels in extraction order]
  Wakeup Behind Thread
  DQ Full?
  DQ Empty for current & spec ckpt?
  L1 Miss
  Set 'S' bit in Cache
  Start Behind thread in wait mode to handle Defers
  Start Executing Main thread Speculatively ahead
  Behind Thread Runs Thru DQ for Active Checkpoint
  Done
  Ahead Thread: Normal Mode / Behind Thread: Pause
  L1 Resolved
  Ahead Thread: Scout Mode / Behind Thread: Pause
  High Level SW initiates a Memory Transaction
  Restore Checkpoint
  Tx Fail 'S' bit
  Detect Mem Order Violation
  Br Mispredict
  Exception
  WAIT
  Begin SST Episode
  Arch Checkpoint (Active: Architectural, Inactive: Speculative)
  Instr has Data Dependencies?
  Execute Instr and Retire OO
  Enqueue DQ with Instr & All Resolved Opr
  Instr has no Data Dependencies?
  WAIT more data expected
  Speculation Successful
  Program Execution resumes where speculation finished
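The flowchart labels suggest that an SST episode begins with an architectural checkpoint, and that four events (Tx Fail on the 'S' bit, memory order violation, branch mispredict, exception) lead to Restore Checkpoint. A minimal Python sketch of that checkpoint-and-restore idea is given here. The names Core, run_episode and Outcome are hypothetical, and modeling the register file as a dictionary is an assumption made for illustration, not a description of the actual hardware.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Outcome(Enum):
    """How a speculative episode ends (labels taken from the flowchart)."""
    SUCCESS = auto()
    TX_FAIL = auto()              # 'S' bit conflict during a memory transaction
    MEM_ORDER_VIOLATION = auto()
    BRANCH_MISPREDICT = auto()
    EXCEPTION = auto()


@dataclass
class Core:
    # Hypothetical model: architectural registers as a dict, plus a PC.
    regs: dict = field(default_factory=dict)
    pc: int = 0

    def run_episode(self, speculative_writes, next_pc, outcome):
        """Run one episode; return True if speculation succeeded."""
        # Begin SST episode: snapshot the architectural state.
        checkpoint = (dict(self.regs), self.pc)

        # Speculate ahead: apply register writes and advance the PC.
        self.regs.update(speculative_writes)
        self.pc = next_pc

        if outcome is not Outcome.SUCCESS:
            # Any failure event discards speculative state.
            self.regs, self.pc = checkpoint
            return False

        # Success: execution resumes where speculation finished.
        return True
```

On success the speculative writes simply become the new architectural state; on any failure the core is back at the checkpoint as if the episode had never run.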
SST scheduling

Program Order
    LDX   addr1, %r1
    ADD   %r1, 0x04, %r2
    STX   %r2, addr2
    SETHI 0x01, %r2
    STX   %r2, addr3
    etc.

; Ahead-Thread
1   LDX   addr1, %r1        ; Load miss on addr1: defer and set %r1 [NT], to Defer Q
                            ; Checkpoint, start Ahead-Thread; Behind-Thread waits for data read
2   ADD   %r1, 0x04, %r2    ; Source operand has NT bit set: defer and set %r2 [NT], to Defer Q
3   STX   %r2, addr2        ; Source operand has NT bit set: defer, to Defer Q
4   SETHI 0x01, %r2         ; Ahead-Thread executes independently
5   STX   %r2, addr3        ; Ahead-Thread executes independently and continues
                            ; speculative execution of more program instructions

; Load miss resolves, start Behind-Thread
6   ADD   %r1, 0x04, %r2 [NT=0, SNT=1]   ; NT was reset at 4, set WAW bit
7   STX   %r2, addr2

SST Order
    LDX   addr1, %r1
    ADD   %r1, 0x04, %r2
    STX   %r2, addr2
    SETHI 0x01, %r2
    STX   %r2, addr3
    etc.

; Deferred Queue
    LDX   addr1, %r1 [NT]
    ADD   %r1 [NT], 0x04, %r2 [NT]
    STX   %r2 [NT], addr2

RAW: Deferring data-dependent instructions prevents RAW hazards. Here %r2 is read at 3 but written before that at 2.
WAR: Saving operands in the DQ prevents WAR hazards, because any valid data in a register at that time is captured and saved for the Behind-Thread to use later, regardless of future writes by the Ahead-Thread.
WAW: Registers with the WAW bit set are not committed to architectural state. Here %r2 was written at 4 and 6.
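The Ahead-Thread deferral rule in the listing can be sketched in a few lines of Python. This is an illustrative model only: Instr, ahead_pass and the misses flag are hypothetical names, memory addresses are not tracked as operands, and operand capture in the DQ and the WAW/SNT bits are left out. It shows only how the NT ("not there") bit propagates from a missing load to its dependents, and how an independent write clears it.

```python
from typing import NamedTuple, Optional


class Instr(NamedTuple):
    text: str               # assembly text, for display only
    srcs: tuple             # source registers read by the instruction
    dst: Optional[str]      # destination register, or None for a store
    misses: bool = False    # assumption: True if this load misses in L1


def ahead_pass(program):
    """Split a program into deferred and executed instructions.

    Returns (deferred, executed, nt), where nt is the set of registers
    whose NT bit is still set after the Ahead-Thread pass.
    """
    nt = set()
    deferred, executed = [], []
    for ins in program:
        if ins.misses or any(src in nt for src in ins.srcs):
            # Missing load or NT source operand: send to the Deferred Queue
            # and mark the destination as not there.
            deferred.append(ins.text)
            if ins.dst is not None:
                nt.add(ins.dst)
        else:
            # Independent instruction: execute now; a fresh write makes
            # the destination valid again, so its NT bit is cleared.
            executed.append(ins.text)
            if ins.dst is not None:
                nt.discard(ins.dst)
    return deferred, executed, nt


PROGRAM = [
    Instr("LDX addr1, %r1", (), "%r1", misses=True),
    Instr("ADD %r1, 0x04, %r2", ("%r1",), "%r2"),
    Instr("STX %r2, addr2", ("%r2",), None),
    Instr("SETHI 0x01, %r2", (), "%r2"),
    Instr("STX %r2, addr3", ("%r2",), None),
]

deferred, executed, nt = ahead_pass(PROGRAM)
```

With the five-instruction program from the slide, the first three instructions end up in the deferred list and the last two are executed by the Ahead-Thread, matching steps 1 to 5 of the listing.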
