Hybrid STM/HTM for Nested Transactions on OpenJDK
Keith Chapman
Purdue U
Tony Hosking
ANU/Data61, Purdue U
Eliot Moss
UMass
Motivation
STM has been around for ages
• But STM is slow
Commodity hardware for transactions available now
• But HTM approaches are only best effort
Our goal: Accelerate STM with HTM when possible
Nested Transactions
Allow composition of transactions
Two flavors
• Closed Nested Transactions
• HTM inside STM not useful; must do all the same work
• Open Nested Transactions
• HTM inside STM avoids most TM work – HTM well-suited!
XJ: Transactional Java
XJ Language
• Supports flat and closed/open/boosted nested transactions
The implementation supports hybrid transactions
• HTM support via Intel TSX
• HTM and STM can co-exist
Uses an Optimistic-reads / Pessimistic-writes protocol
TM Metadata
[Figure: a transaction log (Txn Log) next to an object's metadata word, which has a Version # field and a Txn ID field]
One metadata word for every object
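As a rough illustration of the per-object metadata word, the sketch below packs a version number, an owning transaction ID, and a lock flag into one 64-bit value. The slide only names the Version # and Txn ID fields; the bit widths, the lock flag position, and the `MetaWord` class are assumptions made for this example, not the XJ layout.

```java
// Hypothetical layout (an assumption, not taken from the slides):
// bit 0 = lock flag, bits 1..31 = owner transaction ID, bits 32..63 = version number.
final class MetaWord {
    private MetaWord() {}

    // Pack the three fields into a single 64-bit metadata word.
    static long pack(int version, int txnId, boolean locked) {
        return ((long) version << 32)
             | ((long) (txnId & 0x7FFFFFFF) << 1)
             | (locked ? 1L : 0L);
    }

    // Extract the version number from the high 32 bits.
    static int version(long word) {
        return (int) (word >>> 32);
    }

    // Extract the 31-bit owner transaction ID.
    static int txnId(long word) {
        return (int) ((word >>> 1) & 0x7FFFFFFF);
    }

    // True when the lock flag (bit 0) is set.
    static boolean isLocked(long word) {
        return (word & 1L) != 0L;
    }
}
```

Keeping everything in one word means a single compare-and-swap can both check the version and install an owner, which is the usual reason for this kind of packing.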
STM Transaction Protocol (Read)
[Animated figure: transaction T1, its log, and an object's metadata word]
• Check Mode
• T1 Read
• T1 Validate
• T1 Commit
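The read steps (Check Mode, Read, Validate, Commit) could be sketched in Java roughly as follows. This is a minimal single-threaded illustration of optimistic reads, not the XJ run-time: `TxObject`, `ReadTxn`, and the metadata layout (high 32 bits version, low 32 bits owner ID) are assumptions, and a real implementation would also need to handle races with concurrent writers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical transactional object: one metadata word plus a payload field.
// Assumed layout: high 32 bits = version number, low 32 bits = owner txn ID (0 = unowned).
final class TxObject {
    final AtomicLong meta = new AtomicLong(0L);
    volatile int value;
}

final class ReadTxn {
    // One read-log entry: the object and the metadata word observed at read time.
    private static final class Entry {
        final TxObject obj;
        final long seenMeta;
        Entry(TxObject obj, long seenMeta) {
            this.obj = obj;
            this.seenMeta = seenMeta;
        }
    }

    private final List<Entry> readLog = new ArrayList<>();

    // Optimistic read: check the mode, read without locking, log what was seen.
    int read(TxObject o) {
        long m = o.meta.get();                        // check mode
        if ((m & 0xFFFFFFFFL) != 0L) {
            throw new IllegalStateException("object is owned by a writer");
        }
        int v = o.value;                              // read the payload
        readLog.add(new Entry(o, m));                 // log the observed metadata
        return v;
    }

    // Validation: every logged metadata word must be unchanged.
    boolean validate() {
        for (Entry e : readLog) {
            if (e.obj.meta.get() != e.seenMeta) {
                return false;
            }
        }
        return true;
    }

    // Commit succeeds only if validation passes; the log is discarded either way.
    boolean commit() {
        boolean ok = validate();
        readLog.clear();
        return ok;
    }
}
```

In this sketch a failed `commit()` simply returns `false`; the caller would retry the transaction.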
STM Transaction Protocol (Write)
[Animated figure: transaction T1, its log, and an object's metadata word, which shows "T1 1" during the write steps]
• Check Mode
• T1 Write
• T1 Commit
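A pessimistic write along these lines might be sketched as below: the writer installs its ID in the metadata word before touching the object, logs the old value, and releases ownership on commit or abort. The names `Cell` and `WriteTxn`, the metadata layout, and the choice to bump the version on abort as well as on commit are assumptions of this sketch, not details confirmed by the slides.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical object; assumed metadata layout: high 32 bits = version, low 32 bits = owner ID.
final class Cell {
    final AtomicLong meta = new AtomicLong(0L);
    volatile int value;
}

final class WriteTxn {
    private static final long OWNER_MASK = 0xFFFFFFFFL;

    private final int id;
    private final Deque<Runnable> undoLog = new ArrayDeque<>();
    private final Deque<Cell> owned = new ArrayDeque<>();

    WriteTxn(int id) {
        if (id <= 0) throw new IllegalArgumentException("id must be positive");
        this.id = id;
    }

    // Returns false on conflict (the cell is owned by another transaction).
    boolean write(Cell c, int newValue) {
        long m = c.meta.get();                        // check mode
        long owner = m & OWNER_MASK;
        if (owner != id) {
            // Acquire ownership by installing our ID with a compare-and-swap.
            if (owner != 0L || !c.meta.compareAndSet(m, m | id)) {
                return false;
            }
            owned.push(c);
        }
        final int old = c.value;
        undoLog.push(() -> c.value = old);            // log the old value for rollback
        c.value = newValue;                           // update in place
        return true;
    }

    void commit() {
        undoLog.clear();
        releaseAll();
    }

    void abort() {
        while (!undoLog.isEmpty()) {
            undoLog.pop().run();                      // undo in reverse order
        }
        releaseAll();
    }

    // Clear the owner and bump the version. Bumping on abort too is a conservative
    // choice in this sketch, so readers that raced with the writer fail validation.
    private void releaseAll() {
        while (!owned.isEmpty()) {
            Cell c = owned.pop();
            long nextVersion = (c.meta.get() >>> 32) + 1L;
            c.meta.set(nextVersion << 32);
        }
    }
}
```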
STM Conflict Detection
[Animated figure: transactions T1 and T2, their logs, and a shared object's metadata word]
• T1 Read
• T2 Write (metadata word shows "T2 1")
• T1 Validate
• T1 Abort
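The conflict scenario (T1 reads, T2 writes, T1 fails validation and aborts) can be traced with a single metadata word. The sketch below is a simplified, single-threaded walk-through; `ConflictDemo`, its helper methods, and the metadata layout (low 32 bits hold the owner ID) are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

final class ConflictDemo {
    // Assumed layout: low 32 bits of the metadata word hold the owner txn ID (0 = unowned).
    private static final long OWNER_MASK = 0xFFFFFFFFL;

    private ConflictDemo() {}

    // Writer side: install txnId as owner if the object is currently unowned.
    static boolean tryLock(AtomicLong meta, int txnId) {
        long m = meta.get();
        return (m & OWNER_MASK) == 0L && meta.compareAndSet(m, m | txnId);
    }

    // Reader side: the read is still valid only if the metadata word is unchanged.
    static boolean stillValid(AtomicLong meta, long seen) {
        return meta.get() == seen;
    }

    static String run() {
        AtomicLong meta = new AtomicLong(0L);
        long seenByT1 = meta.get();                    // T1 Read: log the metadata word
        boolean t2Owns = tryLock(meta, 2);             // T2 Write: T2 becomes the owner
        boolean t1Valid = stillValid(meta, seenByT1);  // T1 Validate
        return (t2Owns && !t1Valid) ? "T1 abort" : "T1 commit";
    }
}
```

Because T2's write changes the word that T1 logged, T1's validation fails without T1 ever having taken a lock.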
Hybrid Transaction Protocol
STM – HTM conflicts detected by lock word accesses
• Explicit XABORT if locked by another transaction
• HTM reads – Read the metadata word
• STM writes modify the metadata word
• Causes HTM to abort
• HTM writes – Increment version number
• Causes STM read invalidation / HTM abort
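The HTM-side barriers described here could look roughly like the following sketch. Java has no portable way to start a hardware transaction, so this only models the instrumentation: `HtmAbort` stands in for an explicit XABORT, and the hardware's own conflict detection (which would abort the transaction if an STM writer modified a metadata word already read) is not modeled. All names and the metadata layout are assumptions.

```java
// Stand-in for XABORT: real hardware would roll back and jump to a fallback path.
final class HtmAbort extends RuntimeException {
    HtmAbort(String reason) {
        super(reason);
    }
}

// Hypothetical object; assumed layout: high 32 bits = version, low 32 bits = owner txn ID.
final class HObj {
    long meta;
    int value;
}

final class HtmBarriers {
    private static final long OWNER_MASK = 0xFFFFFFFFL;

    private HtmBarriers() {}

    // HTM read: read the metadata word (on real hardware this adds it to the
    // transaction's read set) and abort explicitly if an STM transaction owns the object.
    static int read(HObj o) {
        if ((o.meta & OWNER_MASK) != 0L) {
            throw new HtmAbort("locked by another transaction");
        }
        return o.value;
    }

    // HTM write: same ownership check, then bump the version so that STM readers
    // fail validation (and other hardware transactions conflict on the word).
    static void write(HObj o, int newValue) {
        if ((o.meta & OWNER_MASK) != 0L) {
            throw new HtmAbort("locked by another transaction");
        }
        o.meta += 1L << 32;
        o.value = newValue;
    }
}
```

Note that the HTM path keeps no read or write log; it relies on the hardware to buffer and roll back its accesses.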
Abstract Locking & Undo Operations
Abstract Locks for STM
[Figure: two variants of an open / boosted atomic method]
If top level transaction
• Acquire abstract locks
• Body
• Release abstract locks
If nested transaction
• Acquire abstract locks
• Log abstract locks
• Body
• Log undo operations
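To make the two STM cases concrete, here is a minimal sketch of a boosted integer set guarded by per-key abstract locks. In the nested case the locks and undo operations are logged in the parent transaction; in the top-level case the lock is released as soon as the body finishes. `ParentTxn`, `BoostedIntSet`, and the per-key lock table are hypothetical choices for this example and are not taken from XJ.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical parent STM transaction: holds the logs a nested open/boosted call appends to.
final class ParentTxn {
    final int id;
    final List<Integer> lockLog = new ArrayList<>();
    final Deque<Runnable> undoLog = new ArrayDeque<>();
    ParentTxn(int id) { this.id = id; }
}

final class BoostedIntSet {
    private final Set<Integer> elements = ConcurrentHashMap.newKeySet();
    // Abstract lock table: key -> ID of the transaction holding that key's lock.
    private final ConcurrentHashMap<Integer, Integer> lockOwner = new ConcurrentHashMap<>();

    private boolean acquire(int key, int txnId) {
        Integer prev = lockOwner.putIfAbsent(key, txnId);
        return prev == null || prev == txnId;
    }

    // Nested case: acquire and log the lock, run the body, log the undo operation.
    boolean add(ParentTxn tx, int key) {
        if (!acquire(key, tx.id)) {
            throw new IllegalStateException("abstract lock conflict on key " + key);
        }
        tx.lockLog.add(key);
        boolean added = elements.add(key);                        // body
        if (added) {
            tx.undoLog.push(() -> elements.remove(key));          // inverse operation
        }
        return added;
    }

    // Top-level case: acquire, run the body, release immediately.
    boolean addTopLevel(int txnId, int key) {
        if (!acquire(key, txnId)) {
            throw new IllegalStateException("abstract lock conflict on key " + key);
        }
        try {
            return elements.add(key);
        } finally {
            lockOwner.remove(key, txnId);
        }
    }

    void commit(ParentTxn tx) {
        tx.undoLog.clear();
        releaseAll(tx);
    }

    void abort(ParentTxn tx) {
        while (!tx.undoLog.isEmpty()) {
            tx.undoLog.pop().run();                               // undo in reverse order
        }
        releaseAll(tx);
    }

    private void releaseAll(ParentTxn tx) {
        for (Integer k : tx.lockLog) {
            lockOwner.remove(k, tx.id);
        }
        tx.lockLog.clear();
    }

    boolean contains(int key) {
        return elements.contains(key);
    }
}
```

The abstract lock is held until the parent commits or aborts, which is what keeps other transactions from observing an `add` that may later be undone.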
Abstract Locks for Hybrid TM
[Figure: two variants of an open / boosted atomic method, with the caption "If top level is HTM"]
• One variant: Validate abstract locks, Body
• Other variant: Acquire abstract locks, Log abstract locks, Body, Log undo operations
Why Validation Works
• HTM vs STM
• If abstract locks conflict, they must touch some of the same physical words in the abstract locking data structure; otherwise they could not detect the conflict
• HTM vs HTM
• No conflict in the locking data structure, because all accesses to it are reads
• Any real conflicts that exist will occur on the actual data structure
[Figure: open / boosted atomic method with "Validate abstract locks" and "Body"]
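The difference between acquiring and validating an abstract lock can be sketched as follows. The STM path writes to the lock table, while the HTM path only reads it and aborts if the key is held. This is an illustration under assumptions: `HybridBoostedSet`, `HwAbort`, and the per-key table are hypothetical, and the exception merely stands in for a hardware abort. On real hardware the read of the table entry would also join the transaction's read set, so a later STM acquisition of the same entry would abort the hardware transaction.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for a hardware transaction abort.
final class HwAbort extends RuntimeException {
    HwAbort(String reason) {
        super(reason);
    }
}

final class HybridBoostedSet {
    private final Set<Integer> elements = ConcurrentHashMap.newKeySet();
    // Abstract lock table: key -> ID of the STM transaction holding that key's lock.
    private final ConcurrentHashMap<Integer, Integer> lockOwner = new ConcurrentHashMap<>();

    // STM path: acquiring a lock writes to the lock table.
    boolean stmAcquire(int key, int txnId) {
        Integer prev = lockOwner.putIfAbsent(key, txnId);
        return prev == null || prev == txnId;
    }

    void stmRelease(int key, int txnId) {
        lockOwner.remove(key, txnId);
    }

    // HTM path: validation only reads the lock table, so two hardware
    // transactions never conflict with each other on the table itself.
    boolean htmAdd(int key) {
        if (lockOwner.containsKey(key)) {
            throw new HwAbort("abstract lock held by an STM transaction");
        }
        return elements.add(key);                     // body
    }
}
```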
STM and HTM Methods
STM needs logging; HTM doesn't
Different actions during read/write
Different actions for abstract locks
HTM should fall back to STM
Maintain separate HTM and STM versions of methods
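One way to picture the fallback between the two method versions is a small dispatcher that tries the HTM variant a bounded number of times and then runs the STM variant. This is only a sketch: `TxnRouter`, `HtmAbortedException`, and the retry limit are assumptions introduced here, and the slides do not say how many HTM attempts are made before falling back.

```java
import java.util.function.Supplier;

// Stand-in for a hardware transaction abort reported to the caller.
final class HtmAbortedException extends RuntimeException {
    HtmAbortedException(String reason) {
        super(reason);
    }
}

final class TxnRouter {
    private TxnRouter() {}

    // Try the HTM variant up to maxHtmAttempts times, then fall back to the STM variant.
    static <T> T run(Supplier<T> htmVariant, Supplier<T> stmVariant, int maxHtmAttempts) {
        for (int attempt = 0; attempt < maxHtmAttempts; attempt++) {
            try {
                return htmVariant.get();
            } catch (HtmAbortedException e) {
                // Best-effort hardware: an abort is expected now and then, so retry.
            }
        }
        return stmVariant.get();
    }
}
```

Keeping the two variants as separate code paths means the HTM version carries no logging at all, while the STM version remains available as the path that can always make progress.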
STM Method Variants
[Animated figure: method variants and which variant can call which (legend: "Can call", A to B)]
• Original (Non-txnal)
• Top-level Txn STM
• STM Transactionalized
• New nested Txn STM
Hybrid TM Method Variants
[Animated figure: method variants and which variant can call which (legends: "Can call", "Parent is closed", "Parent is open", each A to B)]
• Original (Non-txnal)
• Router
• Nested Router
• Top-level Txn STM
• Top-level Txn HTM
• STM Transactionalized
• HTM Transactionalized
• New nested Txn STM
• New nested Txn HTM
• Optimized New nested Txn HTM
XJ System Architecture
[Animated figure: toolchain pipeline]
• Compile: XJ source code → XJ Compiler → standard Java bytecode
• Load: standard Java bytecode → XJ Rewriter → bytecode + run-time calls
• Run: bytecode + run-time calls, with the XJ run-time library, on an HTM-enabled JVM
HTM 4-5 times faster than STM
VM Modifications
Kept to a minimum
Modifications done on OpenJDK:
• Native methods to begin, end, and abort a HW transaction
• Made them intrinsic to the HotSpot C1/C2 compilers
Had to jump through several hoops to get HTM to work with HotSpot’s optimizing compilers
Results
Synchrobench
Micro-benchmarks to evaluate synchronization performance on various
data structures
Added ability to run multiple operations within a single transaction
(group size)
Included XJ versions of the benchmarks
• TransactionalFriendlyTreeSet
48-way, Intel Xeon E5-2690 v3 machine with 2 sockets of 12
hyperthreaded cores
[Animated figure: normalized throughput (0 to 1) vs. threads (1 to 48), 5% updates. Panels: Open nested, Closed nested. Series: group sizes 1, 2, 4, 8, 16, 32]
[Figure: committed ops and aborted txns (×10^6, 0 to 200) vs. threads (1 to 48). Panels: group size 1, group size 2, group size 4. Series: open htm commits, closed htm commits, open stm commits, closed stm commits, htm aborts]
Conclusions
STM and HTM can co-exist for nested transactions in Java
• Closed nesting — Similar to previous schemes
• Open nesting — Novel validation mechanism
• Implemented in OpenJDK on Intel TSX — Artifact evaluated
When it works, HTM is ~4-5× faster than STM
Open nesting increases the envelope of effectiveness for HTM
Production VM would need deeper modification

More Related Content

Viewers also liked

Hybrid Swtiching dyk04
Hybrid Swtiching dyk04Hybrid Swtiching dyk04
Hybrid Swtiching dyk04
DaeYoung (DY) KIM
 
HTM Theory
HTM TheoryHTM Theory
HTM Theory
Asfak Mahamud
 
Perovskite Solar Cells - an Introduction
Perovskite Solar Cells - an IntroductionPerovskite Solar Cells - an Introduction
Perovskite Solar Cells - an Introduction
Dawn John Mullassery
 
Fabrication of perovskite solar cell
Fabrication of perovskite solar cellFabrication of perovskite solar cell
Fabrication of perovskite solar cell
akash arya
 
OLED technology Seminar Ppt
OLED technology Seminar PptOLED technology Seminar Ppt
OLED technology Seminar Ppt
Ashly Liza
 
oled ppt
oled pptoled ppt

Viewers also liked (7)

Hybrid Swtiching dyk04
Hybrid Swtiching dyk04Hybrid Swtiching dyk04
Hybrid Swtiching dyk04
 
MRS poster (5)
MRS poster (5)MRS poster (5)
MRS poster (5)
 
HTM Theory
HTM TheoryHTM Theory
HTM Theory
 
Perovskite Solar Cells - an Introduction
Perovskite Solar Cells - an IntroductionPerovskite Solar Cells - an Introduction
Perovskite Solar Cells - an Introduction
 
Fabrication of perovskite solar cell
Fabrication of perovskite solar cellFabrication of perovskite solar cell
Fabrication of perovskite solar cell
 
OLED technology Seminar Ppt
OLED technology Seminar PptOLED technology Seminar Ppt
OLED technology Seminar Ppt
 
oled ppt
oled pptoled ppt
oled ppt
 

Similar to Hybrid STM/HTM for Nested Transactions on OpenJDK

What Scalable Programs Need from Transactional Memory
What Scalable Programs Need from Transactional MemoryWhat Scalable Programs Need from Transactional Memory
What Scalable Programs Need from Transactional Memory
Donald Nguyen
 
Prelim Slides
Prelim SlidesPrelim Slides
Prelim Slides
smpant
 
Coding style for good synthesis
Coding style for good synthesisCoding style for good synthesis
Coding style for good synthesis
Vinchipsytm Vlsitraining
 
Automatic Loop Parallelization using STM
Automatic Loop Parallelization using STMAutomatic Loop Parallelization using STM
Automatic Loop Parallelization using STMAmr Abed
 
Presentation systemc
Presentation systemcPresentation systemc
Presentation systemc
SUBRAHMANYA S
 
You name it, we analyze it
You name it, we analyze itYou name it, we analyze it
You name it, we analyze it
Jim Gilsinn
 
S4x14 Session: You Name It; We Analyze It
S4x14 Session: You Name It; We Analyze ItS4x14 Session: You Name It; We Analyze It
S4x14 Session: You Name It; We Analyze It
Digital Bond
 
PDA and Turing Machine (1).ppt
PDA and Turing Machine (1).pptPDA and Turing Machine (1).ppt
PDA and Turing Machine (1).ppt
AayushSingh233965
 
What is machine translation
What is machine translationWhat is machine translation
What is machine translation
Stephen Peacock
 
CPAN Gems From The Far East
CPAN Gems From The Far EastCPAN Gems From The Far East
CPAN Gems From The Far Eastlestrrat
 
EKON27-FrameworksTuning.pdf
EKON27-FrameworksTuning.pdfEKON27-FrameworksTuning.pdf
EKON27-FrameworksTuning.pdf
Arnaud Bouchez
 
XML / JSON Data Exchange with PLC
XML / JSON Data Exchange with PLCXML / JSON Data Exchange with PLC
XML / JSON Data Exchange with PLC
Feri Handoyo
 
Enterprise Messaging with RabbitMQ.pdf
Enterprise Messaging with RabbitMQ.pdfEnterprise Messaging with RabbitMQ.pdf
Enterprise Messaging with RabbitMQ.pdf
Ortus Solutions, Corp
 
High performance network programming on the jvm oscon 2012
High performance network programming on the jvm   oscon 2012 High performance network programming on the jvm   oscon 2012
High performance network programming on the jvm oscon 2012
Erik Onnen
 
LMAX Disruptor - High Performance Inter-Thread Messaging Library
LMAX Disruptor - High Performance Inter-Thread Messaging LibraryLMAX Disruptor - High Performance Inter-Thread Messaging Library
LMAX Disruptor - High Performance Inter-Thread Messaging Library
Sebastian Andrasoni
 
Tracing versus Partial Evaluation: Which Meta-Compilation Approach is Better ...
Tracing versus Partial Evaluation: Which Meta-Compilation Approach is Better ...Tracing versus Partial Evaluation: Which Meta-Compilation Approach is Better ...
Tracing versus Partial Evaluation: Which Meta-Compilation Approach is Better ...
Stefan Marr
 
sCode optimization
sCode optimizationsCode optimization
sCode optimization
Satyamevjayte Haxor
 
Cache Optimization with Akamai
Cache Optimization with AkamaiCache Optimization with Akamai
Cache Optimization with AkamaiBlake Crosby
 
Tsl version 1.1_review
Tsl version 1.1_reviewTsl version 1.1_review
Tsl version 1.1_review
Ball Sutta
 

Similar to Hybrid STM/HTM for Nested Transactions on OpenJDK (20)

What Scalable Programs Need from Transactional Memory
What Scalable Programs Need from Transactional MemoryWhat Scalable Programs Need from Transactional Memory
What Scalable Programs Need from Transactional Memory
 
Prelim Slides
Prelim SlidesPrelim Slides
Prelim Slides
 
Coding style for good synthesis
Coding style for good synthesisCoding style for good synthesis
Coding style for good synthesis
 
Automatic Loop Parallelization using STM
Automatic Loop Parallelization using STMAutomatic Loop Parallelization using STM
Automatic Loop Parallelization using STM
 
Presentation systemc
Presentation systemcPresentation systemc
Presentation systemc
 
Build your own statistical engines
Build your own statistical enginesBuild your own statistical engines
Build your own statistical engines
 
You name it, we analyze it
You name it, we analyze itYou name it, we analyze it
You name it, we analyze it
 
S4x14 Session: You Name It; We Analyze It
S4x14 Session: You Name It; We Analyze ItS4x14 Session: You Name It; We Analyze It
S4x14 Session: You Name It; We Analyze It
 
PDA and Turing Machine (1).ppt
PDA and Turing Machine (1).pptPDA and Turing Machine (1).ppt
PDA and Turing Machine (1).ppt
 
What is machine translation
What is machine translationWhat is machine translation
What is machine translation
 
CPAN Gems From The Far East
CPAN Gems From The Far EastCPAN Gems From The Far East
CPAN Gems From The Far East
 
EKON27-FrameworksTuning.pdf
EKON27-FrameworksTuning.pdfEKON27-FrameworksTuning.pdf
EKON27-FrameworksTuning.pdf
 
XML / JSON Data Exchange with PLC
XML / JSON Data Exchange with PLCXML / JSON Data Exchange with PLC
XML / JSON Data Exchange with PLC
 
Enterprise Messaging with RabbitMQ.pdf
Enterprise Messaging with RabbitMQ.pdfEnterprise Messaging with RabbitMQ.pdf
Enterprise Messaging with RabbitMQ.pdf
 
High performance network programming on the jvm oscon 2012
High performance network programming on the jvm   oscon 2012 High performance network programming on the jvm   oscon 2012
High performance network programming on the jvm oscon 2012
 
LMAX Disruptor - High Performance Inter-Thread Messaging Library
LMAX Disruptor - High Performance Inter-Thread Messaging LibraryLMAX Disruptor - High Performance Inter-Thread Messaging Library
LMAX Disruptor - High Performance Inter-Thread Messaging Library
 
Tracing versus Partial Evaluation: Which Meta-Compilation Approach is Better ...
Tracing versus Partial Evaluation: Which Meta-Compilation Approach is Better ...Tracing versus Partial Evaluation: Which Meta-Compilation Approach is Better ...
Tracing versus Partial Evaluation: Which Meta-Compilation Approach is Better ...
 
sCode optimization
sCode optimizationsCode optimization
sCode optimization
 
Cache Optimization with Akamai
Cache Optimization with AkamaiCache Optimization with Akamai
Cache Optimization with Akamai
 
Tsl version 1.1_review
Tsl version 1.1_reviewTsl version 1.1_review
Tsl version 1.1_review
 

Recently uploaded

2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
Georgi Kodinov
 
Into the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdfInto the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdf
Ortus Solutions, Corp
 
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Shahin Sheidaei
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
Globus
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
Philip Schwarz
 
Graphic Design Crash Course for beginners
Graphic Design Crash Course for beginnersGraphic Design Crash Course for beginners
Graphic Design Crash Course for beginners
e20449
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
Juraj Vysvader
 
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
Tier1 app
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Globus
 
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdfEnhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Jay Das
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
Globus
 
GlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote sessionGlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote session
Globus
 
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Globus
 
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Globus
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
Matt Welsh
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
AMB-Review
 
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Anthony Dahanne
 
RISE with SAP and Journey to the Intelligent Enterprise
RISE with SAP and Journey to the Intelligent EnterpriseRISE with SAP and Journey to the Intelligent Enterprise
RISE with SAP and Journey to the Intelligent Enterprise
Srikant77
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
Adele Miller
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
Ortus Solutions, Corp
 

Recently uploaded (20)

2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
 
Into the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdfInto the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdf
 
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
 
Graphic Design Crash Course for beginners
Graphic Design Crash Course for beginnersGraphic Design Crash Course for beginners
Graphic Design Crash Course for beginners
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
 
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
 
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdfEnhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
 
GlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote sessionGlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote session
 
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
 
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
 
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
 
RISE with SAP and Journey to the Intelligent Enterprise
RISE with SAP and Journey to the Intelligent EnterpriseRISE with SAP and Journey to the Intelligent Enterprise
RISE with SAP and Journey to the Intelligent Enterprise
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
 

Hybrid STM/HTM for Nested Transactions on OpenJDK

  • 1. Hybrid STM/HTM for Nested Transactions on OpenJDK Keith Chapman Purdue U Tony Hosking ANU/Data61, Purdue U Eliot Moss UMass
  • 2. Motivation STM has been around for ages • But STM is slow Commodity hardware for transactions available now • But HTM approaches are only best effort Out goal: Accelerate STM with HTM when possible
  • 3. Nested Transactions Allow composition of transactions Two flavors • Closed Nested Transactions • HTM inside STM not useful; must do all the same work • Open Nested Transactions • HTM inside STM avoids most TM work – HTM well-suited!
  • 4. XJ: Transactional Java XJ Language • Supports flat and closed/open/boosted nested transactions The implementation supports hybrid transactions • HTM support via Intel TSX • HTM and STM can co-exist Uses an Optimistic-reads / Pessimistic-writes protocol
  • 5. TM Metadata Txn Log 0 0Version # 0 0 0Txn ID 1 One metadata word for every object
  • 6. STM Transaction Protocol (Read) T1 Log 0 00 0
  • 7. STM Transaction Protocol (Read) T1 Log 0 00 0 Check Mode
  • 8. STM Transaction Protocol (Read) T1 Log 0 0 0 0 T1 Read
  • 9. STM Transaction Protocol (Read) T1 Log 0 0 0 0 T1 Validate
  • 10. STM Transaction Protocol (Read) T1 Log 0 0 T1 Commit
  • 11. 0 0 STM Transaction Protocol (Write) T1 Log
  • 12. 0 0 STM Transaction Protocol (Write) T1 Log Check Mode
  • 13. T1 1 0 0 STM Transaction Protocol (Write) T1 Log T1 Write
  • 14. T1 1 0 0 STM Transaction Protocol (Write) T1 Log T1 Write
  • 15. T1 1 0 0 STM Transaction Protocol (Write) T1 Log T1 Write
  • 16. T1 11 0 STM Transaction Protocol (Write) T1 Log T1 Commit
  • 17. STM Conflict Detection T1 Log 0 00 0 T2 Log 0 0
  • 18. STM Conflict Detection T1 Log 0 0 0 0 T2 Log 0 0 T1 Read
  • 19. STM Conflict Detection T1 Log 0 0 0 0 T2 Log 0 0 T2 Write T2 1
  • 20. STM Conflict Detection T1 Log 0 0 0 0 T2 Log 0 0 T1 Validate T2 1
  • 21. STM Conflict Detection T1 Log 0 0 T2 Log 0 0 T1 Abort T2 1
  • 22. Hybrid Transaction Protocol STM – HTM conflicts detected by lock word accesses • Explicit XABORT if locked by another transaction • HTM reads – Read the metadata word • STM writes modify the metadata word • Causes HTM to abort • HTM writes – Increment version number • Causes STM read invalidation / HTM abort
  • 24. Abstract Locks for STM [Flow diagram] • If top level transaction, steps shown: Acquire abstract locks, Open / Boosted Atomic Method Body, Release abstract locks • If nested transaction, steps shown: Acquire abstract locks, Open / Boosted Atomic Method Body, Log abstract locks, Log undo operations
  • 25. Abstract Locks for Hybrid TM [Flow diagram] • If top level is HTM, steps shown: Validate abstract locks, Open / Boosted Atomic Method Body • Otherwise, steps shown: Acquire abstract locks, Open / Boosted Atomic Method Body, Log abstract locks, Log undo operations
  • 26-27. Why Validation Works • HTM vs STM • If abstract locks conflict, they must touch some of the same physical words in the abstract locking data structure; otherwise they could not detect the conflict • HTM vs HTM • No conflict in the locking data structure, because all accesses to it are reads • Any real conflicts that exist will occur on the actual data structure [Diagram: Open / Boosted Atomic Method Body with Validate abstract locks]
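The difference between acquiring and validating an abstract lock can be sketched as follows. The table layout, the class name `AbstractLockTable`, and the exclusive-only lock mode are simplifying assumptions for illustration; the slides do not describe the actual XJ locking data structure.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

final class AbstractLockTable {
    // One owner word per slot: 0 = free, otherwise the ID (non-zero) of the owning STM txn.
    private final AtomicIntegerArray owners;

    AbstractLockTable(int slots) { owners = new AtomicIntegerArray(slots); }

    private int slot(Object key) { return Math.floorMod(key.hashCode(), owners.length()); }

    // STM path: acquire by WRITING the owner word (re-entrant for the same txn).
    boolean acquire(Object key, int txnId) {
        int s = slot(key);
        return owners.get(s) == txnId || owners.compareAndSet(s, 0, txnId);
    }

    // STM path: release only if this txn still owns the slot.
    void release(Object key, int txnId) { owners.compareAndSet(slot(key), txnId, 0); }

    // HTM path: validate by only READING the owner word. Inside a hardware
    // transaction this read would also be tracked, so a later STM acquire on
    // the same word would abort the HTM transaction.
    boolean validate(Object key) { return owners.get(slot(key)) == 0; }
}
```

Because `validate` never writes, two HTM transactions validating the same abstract lock do not conflict on the table; any real conflict between them shows up on the underlying data structure instead.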
  • 28-29. STM and HTM Methods STM needs logging; HTM doesn't. Different actions during read/write. Different actions for abstract locks. HTM should fall back to STM. Maintain separate HTM and STM versions of methods.
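A minimal sketch of "HTM should fall back to STM" with separate method versions: try the HTM version a bounded number of times, then run the STM version. The retry limit, the class `FallbackRouter`, and the `HtmAborted` exception (standing in for a hardware abort) are assumptions for illustration, not the policy used by XJ.

```java
import java.util.function.IntSupplier;

// Stands in for a hardware transaction abort in this software model.
final class HtmAborted extends RuntimeException {}

final class FallbackRouter {
    static final int MAX_HTM_ATTEMPTS = 3;   // assumed retry policy

    // Runs the HTM version of a method; after repeated aborts, runs the STM version.
    static int run(IntSupplier htmVersion, IntSupplier stmVersion) {
        for (int attempt = 0; attempt < MAX_HTM_ATTEMPTS; attempt++) {
            try {
                return htmVersion.getAsInt();   // no logging needed on this path
            } catch (HtmAborted aborted) {
                // HTM is best effort only: retry, then fall through to STM.
            }
        }
        return stmVersion.getAsInt();           // logging path, always able to make progress
    }
}
```

For example, `FallbackRouter.run(() -> fastPath(), () -> loggedPath())` would call a hypothetical `fastPath` up to three times before switching to `loggedPath`.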
  • 36-39. STM Method Variants [Call-graph figure; an arrow from A to B means A can call B] Variants: Original (Non-txnal), STM Transactionalized, Top-level Txn STM, New nested Txn STM
  • 40-60. Hybrid TM Method Variants [Call-graph figure, built up one variant at a time] Variants: Original (Non-txnal), Router, Nested Router, Top-level Txn STM, Top-level Txn HTM, STM Transactionalized, HTM Transactionalized, New nested Txn STM, New nested Txn HTM, Optimized New nested Txn HTM. Edge types (from A to B): Can call, Parent is closed, Parent is open
  • 63-66. XJ System Architecture [Pipeline figure] Compile: XJ source code goes through the XJ Compiler to standard Java bytecode. Load: the XJ Rewriter produces bytecode + run-time calls. Run: the HTM-enabled JVM executes it with the XJ run-time library. HTM 4-5 times faster than STM
  • 67. VM Modifications Kept to a minimum Modifications done on OpenJDK: • Native methods to begin, end, and abort a HW transaction • Made them intrinsic to the HotSpot C1/C2 compilers Had to jump through several hoops to get HTM to work with HotSpot’s optimizing compilers
  • 69. Synchrobench Micro-benchmarks to evaluate synchronization performance on various data structures Added ability to run multiple operations within a single transaction (group size) Included XJ versions of the benchmarks • TransactionalFriendlyTreeSet 48-way, Intel Xeon E5-2690 v3 machine with 2 sockets of 12 hyperthreaded cores
  • 70-75. [Figure: normalized throughput (0 to 1) vs. threads (1 to 48), 5% updates; two panels, Closed nested and Open nested; one series added per slide for group sizes 1, 2, 4, 8, 16, and 32]
  • 76. [Figure: committed ops and aborted txns (10^6, axis 0 to 200) vs. threads (1 to 48), for group sizes 1, 2, and 4; series: open htm commits, closed htm commits, open stm commits, closed stm commits, htm aborts]
  • 77-81. Conclusions STM and HTM can co-exist for nested transactions in Java • Closed nesting: similar to previous schemes • Open nesting: novel validation mechanism • Implemented in OpenJDK on Intel TSX (artifact evaluated) When it works, HTM is ~4-5× faster than STM. Open nesting increases the envelope of effectiveness for HTM. A production VM would need deeper modification.