Objects with adaptive accessors
to avoid STM barriers
F. Miguel Carvalho and João Cachopo
1
WTM-2012
Software Engineering Group
Bern, Switzerland, April 10, 2012
General Goal
2
Shared data
3
atomic
4
Overheads?
5
STM Barriers
6
Reduce Runtime Overheads
• Redo-log <vs> undo-log
• Eager <vs> lazy ownership acquisition
• Transactional versioning
• No ownership records
• Metadata in place
• Multi-versioning (e.g. JVSTM)
• …
7
Good for read-only
8
Memory overheads
9
Winding path
10
Can we suppress these overheads?
11
AOM
Adaptive Object Metadata
…implemented in the JVSTM
12
box
13
:VBox
body:
:Counter
current:
versions’ history
14
:VBoxBody
next:
value: 2
version: 19
:VBoxBody
next:
value: 1
version: 17
:VBoxBody
next: null
value: 0
version: 13
:VBox
body:
:Counter
current:
The most recent
committed value.
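The versions’ history pictured on this slide could be modeled roughly as follows. This is a minimal sketch, not the JVSTM source: the class and field names follow the diagram, while the `commit` method, the `historyLength` helper, and the demo class are assumptions made for illustration.

```java
// Hypothetical, simplified model of a versioned box; not the JVSTM implementation.
final class VBoxBody<T> {
    final T value;
    final int version;        // number of the transaction that committed this value
    final VBoxBody<T> next;   // older version, or null at the end of the history

    VBoxBody(T value, int version, VBoxBody<T> next) {
        this.value = value;
        this.version = version;
        this.next = next;
    }
}

final class VBox<T> {
    volatile VBoxBody<T> body;   // head of the history: the most recent committed value

    VBox(T initial, int version) {
        this.body = new VBoxBody<>(initial, version, null);
    }

    // Assumed helper: a committing transaction prepends its value to the history.
    // Not thread-safe; a real STM serializes or validates commits.
    void commit(T value, int version) {
        body = new VBoxBody<>(value, version, body);
    }

    // Assumed helper: number of versions currently kept.
    int historyLength() {
        int n = 0;
        for (VBoxBody<T> b = body; b != null; b = b.next) {
            n++;
        }
        return n;
    }
}

final class VBoxHistoryDemo {
    public static void main(String[] args) {
        // Rebuild the history from the slide: versions 13, 17, 19 with values 0, 1, 2.
        VBox<Integer> current = new VBox<>(0, 13);
        current.commit(1, 17);
        current.commit(2, 19);
        System.out.println(current.body.value + " at version " + current.body.version);
        System.out.println(current.historyLength() + " versions kept");
    }
}
```

The sketch makes the memory cost visible: every committed write allocates one more body, and older bodies stay reachable through `next`.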
Transaction
15
:VBoxBody
next:
value: 2
version: 19
:VBoxBody
next:
value: 1
version: 17
:VBoxBody
next: null
value: 0
version: 13
:VBox
body:
:Counter
current:
:Transaction
version: 18
…
lastCommitted 23
Transaction
16
Transaction 18 reads
the version 17
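The read shown on this slide (transaction 18 reads version 17) can be sketched as a walk over the history from newest to oldest. The types below are simplified and hypothetical: an `int` value instead of a generic one, and a static `read` helper that is not part of any JVSTM API.

```java
// Hypothetical, simplified types; the names follow the slide's diagram.
final class VBoxBody {
    final int value;
    final int version;
    final VBoxBody next;

    VBoxBody(int value, int version, VBoxBody next) {
        this.value = value;
        this.version = version;
        this.next = next;
    }
}

final class SnapshotRead {
    // Returns the value of the newest body whose version is <= txVersion.
    static int read(VBoxBody head, int txVersion) {
        for (VBoxBody b = head; b != null; b = b.next) {
            if (b.version <= txVersion) {
                return b.value;
            }
        }
        throw new IllegalStateException("no version visible to transaction " + txVersion);
    }

    public static void main(String[] args) {
        // History from the slide: 19 -> 17 -> 13, holding the values 2, 1, 0.
        VBoxBody history = new VBoxBody(2, 19, new VBoxBody(1, 17, new VBoxBody(0, 13, null)));
        System.out.println(read(history, 18)); // prints 1 (version 19 is too recent)
        System.out.println(read(history, 23)); // prints 2
    }
}
```

Each read pays for the loop even when the location has a single version, which is the indirection the AOM tries to avoid.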
Shared Data
17
No contention
18
AOM
19
Compact / Extended
[Figure: an object of type SomeType in the compact layout, with field x = 32767, field y = 2147483647, and field z = 34.7.]
[Figure: the same object with an additional Header slot, which is null in the compact layout.]
[Figure: the object in the extended layout. The Header points to a :VBoxBody with version 23, whose values hold the values of x, y, and z, and whose next points to a :VBoxBody with version 19 and next: null.]
22
[Figure: the same extended layout, annotated with x of version 23, y of version 23, and z of version 23.]
23
Compact / Extended
1. Extending
2. Reverting
24
1. Extending
[Figures, slides 25 to 27: the object starts in the compact layout (header: null, x: 3, y: 7). With replicate(), a :VBoxBody with version 0 and next: null is created, holding the values 3 and 7. A second :VBoxBody with version 17 is then created, holding the values 11 and 13, with next pointing to the version 0 body. With casHeader(), the object’s header is set to the version 17 body.]
2. Reverting
[Figure, slide 28: the header points to a :VBoxBody with version 17 and next: null, holding the values 11 and 13, while the object’s own fields still hold x: 3 and y: 7. Three steps are marked: read, put, cas.]
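<T extends AdaptiveObject<T>> boolean tryRevert(AdaptiveObject<T> o, VBoxBody<T> body) {
    if (o.readHeader() == body) {            // 1: read the header and check that it still points to body
        o.toCompactLayout(body.value);       // 2: put the values of body back into the object's fields
        return o.casHeaderWithNull(body);    // 3: CAS the header from body to null
    }
    return false;                            // the header changed in the meantime, so do not revert
}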
AdaptiveObject
29
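abstract class AdaptiveObject<T extends AdaptiveObject<T>> {
    private VBoxBody<T> header;   // null while the object is in the compact layout
    // Copies the field values of from into this object's own fields.
    public abstract void toCompactLayout(T from);
    public VBoxBody<T> readHeader() {
        return header;
    }
    // Atomically resets the header to null if it still points to expected.
    // header__ADDRESS__ is the offset of the header field; its declaration is omitted on this slide.
    public boolean casHeaderWithNull(VBoxBody<T> expected) {
        return UtilUnsafe.UNSAFE.compareAndSwapObject(this, header__ADDRESS__, expected, null);
    }
}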
AdaptiveObject
30
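abstract class AdaptiveObject<T extends AdaptiveObject<T>> {
    // Offset of the header field, used by the Unsafe CAS operations below.
    // The slide leaves it uninitialized; the static block is one possible way to compute it (an assumption).
    private static final long header__ADDRESS__;
    static {
        try {
            header__ADDRESS__ = UtilUnsafe.UNSAFE.objectFieldOffset(
                    AdaptiveObject.class.getDeclaredField("header"));
        } catch (NoSuchFieldException e) {
            throw new ExceptionInInitializerError(e);
        }
    }
    private VBoxBody<T> header;   // null while the object is in the compact layout
    public abstract T replicate();                  // returns a copy of this object
    public abstract void toCompactLayout(T from);   // copies the fields of from into this object
    public VBoxBody<T> readHeader() {
        return header;
    }
    public boolean casHeaderWithNull(VBoxBody<T> expected) {
        return UtilUnsafe.UNSAFE.compareAndSwapObject(this, header__ADDRESS__, expected, null);
    }
    public boolean casHeader(VBoxBody<T> expected, VBoxBody<T> newBody) {
        return UtilUnsafe.UNSAFE.compareAndSwapObject(this, header__ADDRESS__, expected, newBody);
    }
}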
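To see the class above in use, here is a minimal, self-contained sketch of one extension followed by one reversion. Several parts are assumptions rather than the authors' code: `AtomicReferenceFieldUpdater` is swapped in for the `Unsafe`-based CAS so that the example runs on a stock JDK, and `Point`, `Layouts`, and `extend` are hypothetical names. The values (3, 7), (11, 13) and version 17 are taken from the slides.

```java
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

final class VBoxBody<T> {
    final T value;
    final int version;
    final VBoxBody<T> next;

    VBoxBody(T value, int version, VBoxBody<T> next) {
        this.value = value;
        this.version = version;
        this.next = next;
    }
}

abstract class AdaptiveObject<T extends AdaptiveObject<T>> {
    // Assumption: a field updater replaces the slide's Unsafe-based CAS.
    @SuppressWarnings("rawtypes")
    private static final AtomicReferenceFieldUpdater<AdaptiveObject, VBoxBody> HEADER =
            AtomicReferenceFieldUpdater.newUpdater(AdaptiveObject.class, VBoxBody.class, "header");

    private volatile VBoxBody<T> header;   // null while the object is in the compact layout

    abstract T replicate();
    abstract void toCompactLayout(T from);

    VBoxBody<T> readHeader() { return header; }

    boolean casHeader(VBoxBody<T> expected, VBoxBody<T> newBody) {
        return HEADER.compareAndSet(this, expected, newBody);
    }

    boolean casHeaderWithNull(VBoxBody<T> expected) {
        return casHeader(expected, null);
    }
}

// Hypothetical transactional class with two fields, as in the slides' example.
final class Point extends AdaptiveObject<Point> {
    int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }
    @Override Point replicate() { return new Point(x, y); }
    @Override void toCompactLayout(Point from) { x = from.x; y = from.y; }
}

final class Layouts {
    // Extension: keep the current fields as version 0 and install the new value on top.
    static <T extends AdaptiveObject<T>> boolean extend(T o, T newValue, int version) {
        VBoxBody<T> original = new VBoxBody<>(o.replicate(), 0, null);
        VBoxBody<T> written = new VBoxBody<>(newValue, version, original);
        return o.casHeader(null, written);   // succeeds only if o is still compact
    }

    // Reversion, following the three steps of tryRevert on the slide.
    static <T extends AdaptiveObject<T>> boolean tryRevert(T o, VBoxBody<T> body) {
        if (o.readHeader() == body) {
            o.toCompactLayout(body.value);
            return o.casHeaderWithNull(body);
        }
        return false;
    }
}

final class AdaptiveObjectDemo {
    public static void main(String[] args) {
        Point p = new Point(3, 7);
        boolean extended = Layouts.extend(p, new Point(11, 13), 17);
        System.out.println(extended + ", header version " + p.readHeader().version);
        boolean reverted = Layouts.tryRevert(p, p.readHeader());
        System.out.println(reverted + ", x=" + p.x + ", y=" + p.y);
    }
}
```

The sketch deliberately ignores when a reversion is safe (for example, whether older versions may still be needed by running transactions); it only shows how the header CAS moves an object between the two layouts.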
hierarchy
31
[Figure: a class hierarchy rooted at Object.]
hierarchy
32
[Figure: the same class hierarchy with AdaptiveObject inserted between Object and the other classes.]
AOM
• 1st release (Multiprog 12)
– implemented on the lock-based JVSTM
– reversion and extension operations specified by
an AdaptiveObject interface
• 2nd release:
– implemented on the lock-free JVSTM
– AdaptiveObject as the root base class
– provides a transparent API (like Deuce STM)
33
• increases the speedup between 13% and 35%
(* Multiprog12)
AOM with JVSTM lock based
34
[Charts: LeeTM speedup versus number of threads (1 to 16), one panel for Circuit Main and one for Circuit Mem.]
• increases the speedup between 5% and 36%
new AOM with JVSTM lock free
35
[Charts: LeeTM speedup versus number of threads (1 to 48) for JVSTM and AOM, one panel for Circuit Main and one for Circuit Mem.]
STAMP Vacation, low++ & long trxs & RO
• Low contention
• ++, large data sets
• -n = 256 (longer transactions), instead of the
recommended 2 or 4
• 3 kinds of transactions:
– Delete and create items: car, flight or room
– Remove defaulter clients (bill > 0)
– Query and reserve an item: car, flight or room
36
Split into 2 transactions: RO + RW
STAMP Vacation, low++ & long trxs & RO
• increases the speedup between 18% and 37%
• Maximum speedup = 4,32
[Chart: speedup versus number of threads (1 to 48) for JVSTM and AOM.]
Comparing with the Deuce STM…
and enhancing the AOM with a
transparent API
38
Turning all objects
transactional
STAMP Vacation, low++ & long trxs & RO
• Maximum speedup = 3,83 (< 4,32 with a non-transparent API)
• Still better than the Deuce STM with TL2
[Chart: speedup versus number of threads (1 to 48) for JVSTM, AOM, and Deuce TL2.]
STAMP Vacation, low++ & long trxs & RO
• Maximum speedup = 1,92
• Still better than JVSTM and the Deuce TL2
[Chart: speedup versus number of threads (1 to 48) for JVSTM, AOM, and Deuce TL2.]
Future Work
41
Future Work
• An improved reversion algorithm
• New design for AOM that keeps the contention-free execution
path without any barrier or validation
• Integrate the AOM compiler in the implementation of the
Deuce STM
42
43/42


Editor's Notes

  1. My name is Miguel. I come from Portugal and I work at the Software Engineering Group of INESC-ID, part of the Technical University of Lisbon. I’m here to present my work on “Objects with adaptive accessors to avoid STM barriers”, developed by me and Professor João Cachopo.
  2. The general goal of my work is to increase application performance. How? By parallelizing sequential programs.
  3. But parallelization raises other problems, such as concurrent access to shared data.
  4. To that end, we use an STM to synchronize access to shared data.
  5. But in many cases, STMs introduce overheads that are larger than the gains from parallelization, which cancels its benefits. In fact, many STMs show good performance results in micro-benchmarks. But these micro-benchmarks are typically very simple applications that manipulate a single kind of data structure, such as a skip list, a hash table, or a red-black tree, and perform operations over it, such as deleting, moving, updating, and inserting elements. In more realistic benchmarks that use more complex operations and more complicated data structures, as happens with STMBench7, the performance results are not so good.
  6. STM barriers are one of the reasons that prevent an STM from achieving better performance. Instead of just reading or updating a memory location, these barriers need to consult the metadata associated with the memory location and keep track of the read-set and the write-set.
  7. Many different approaches have been tried to mitigate the overheads incurred by STM barriers. My work is based on the multi-versioning approach. From the available implementations of a multi-versioning STM, we chose the JVSTM, which was the seminal STM using this approach. Others include LSA and SMV (Selective Multi-Versioning).
  8. Multi-versioning is good for read-intensive workloads. Its big advantage is that read-only transactions never abort. So in read-dominated scenarios, a multi-versioning STM can increase overall performance.
  9. Yet a multi-versioning STM has a big drawback: to store the multiple versions of each transactional location, it may incur large memory overheads.
  10. Furthermore, even when shared data is not under contention, we may need to traverse several versions to reach the desired value. So, although a multi-versioning STM is good for read-dominated workloads, these problems mean that the approach is not universally accepted.
  11. Our goal is to keep the good performance of multi-versioning in read-dominated scenarios and, at the same time, reduce the memory and runtime overheads when the transactional locations are not under contention.
  12. Before introducing the AOM, I will briefly describe the JVSTM. We have already seen two presentations related to the JVSTM, so I will go through this description very quickly. In the JVSTM, a transactional location is known as a versioned box. Here we have a counter object with a transactional location called current.
  13. Instead of storing a single value, a versioned box keeps a history of values. Each element of the versions’ history is a box body. The version associated with each value is the number of the transaction that committed that value. A versioned box points to the head of the versions’ history, which holds the most recently committed value.
  14. When a transaction starts, it gets its transaction number from a global counter, lastCommitted. This counter is updated by every read-write transaction that commits successfully.
  15. Then, a transaction reads the most recent body whose version is equal to or lower than the transaction’s version. With this approach, a read-only transaction always sees a valid snapshot of memory, corresponding to the version captured at the moment it began. In other words, read-only transactions are serialized at the instant they begin. Although this model can improve performance for read-dominated workloads, it also adds some overheads: it largely increases the total memory managed by an application, and it adds extra indirections to all memory accesses, because instead of accessing a memory location directly, a transaction must traverse the versions’ history to get the correct version.
  16. The key insight of our solution --- the AOM --- is that in most realistic scenarios a large part of the locations is not under contention. Unlike what is happening in this roundabout…
  17. Usually, as on this highway where the cars drive freely and without contention, most transactional locations are uncontended. In these cases we need neither metadata nor multiple versions. Multiple versions are required only when several transactions contend for the same transactional object and at least one of those transactions writes to it. So if we can avoid the metadata for the vast majority of objects, then we can: - Greatly reduce the required memory space; - Avoid extra indirections when reading those locations. Our final goal is that, in scenarios without contention, reading a transactional location has the same overhead as reading any ordinary location. We just want to get the value from a transactional location without consulting metadata or tracking the read-set.
  18. To achieve this, we propose that transactional locations have two different layouts: - Compact layout – identical to the layout dictated by the object model of the runtime environment. - Extended layout – used when the object may be under contention.
  19. Here we have an example of an object in the compact layout. This object has three fields, one of which requires two slots.
  20. To switch between layouts, we need one additional slot, denoted in this picture by the Header. When the object is in the compact layout, this slot points to null. This is the layout of an object at the beginning of its life cycle. The idea is that the cost of this additional slot may be further reduced by using some unused bits of the object’s header.
  21. Later, when an object is written by a transaction, it must be extended, and this header will point to the versions’ history. Note that in the AOM we have removed the VBox. We don’t need it, because the object itself represents the identity of the versions’ history.
  22. Another particularity of the JVSTM is its garbage collection mechanism for old versions. This garbage collection algorithm removes versions when no running transaction may still need to access them. So eventually, if this object is no longer written by any transaction, it will be left with just one box body. In this case we can revert it back to the compact layout, discarding all the additional metadata. By swinging back and forth between these two layouts, we expect to greatly reduce the runtime overheads. Considering that the vast majority of objects are seldom written, the number of objects that need more than one version should be residual compared to the total number of objects in the application, which substantially reduces the runtime overheads. Now let’s review the extension and reversion processes in more detail.
  23. If an object is about to be extended, its header is pointing to null. The extension consists of three tasks.
  24. First, we create a new body corresponding to version zero and copy the values of the object’s fields into it.
  25. Then we create a second body that points to version zero and contains the values written by the transaction. In this case, for example, the transaction writes only the value 11 to the field x.
  26. Finally, we update the object’s header. Note that the object’s fields have not been changed: only the reversion process can update them. So a reading transaction may see the header pointing to null or to the versions’ history, but in both cases version 0 and the object’s fields contain the same values. If a running transaction intercepts the extension of this object and already sees it in the extended layout, that transaction will get version 0. So, no matter which layout a transaction sees, it always gets consistent values.
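The three extension tasks can be sketched as follows. This is a simplified illustration under stated assumptions, not the AOM code: the classes ExtBody and ExtendableObject, the methods extendWritingX and readX, the int-array representation of the field values and the initial values used in the usage note are all hypothetical, and commitVersion is assumed to be the number of the committing transaction.

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch only: names and the int-array representation are illustrative.
final class ExtBody {
    final int[] values;  // values of x, y and z in this version
    final int version;
    final ExtBody next;  // next older version, or null for version zero

    ExtBody(int[] values, int version, ExtBody next) {
        this.values = values;
        this.version = version;
        this.next = next;
    }
}

final class ExtendableObject {
    // In-place fields, read directly while the object is in the compact layout.
    int x, y, z;
    // Null in the compact layout; head of the versions' history when extended.
    final AtomicReference<ExtBody> header = new AtomicReference<>(null);

    ExtendableObject(int x, int y, int z) {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    // Extend the object on behalf of a transaction that writes newX to x.
    boolean extendWritingX(int newX, int commitVersion) {
        // Task 1: version zero holds a copy of the current field values.
        ExtBody versionZero = new ExtBody(new int[] {x, y, z}, 0, null);
        // Task 2: a second body holds the written values and points to version zero.
        ExtBody written = new ExtBody(new int[] {newX, y, z}, commitVersion, versionZero);
        // Task 3: publish the history; this fails if the object was already extended.
        return header.compareAndSet(null, written);
    }

    int readX(int txVersion) {
        ExtBody b = header.get();
        if (b == null) {
            return x; // compact layout: plain field read, no metadata consulted
        }
        // Extended layout: walk back until a visible version (version zero at worst).
        while (b.next != null && b.version > txVersion) {
            b = b.next;
        }
        return b.values[0];
    }
}
```

For an object created with x = 7 (an arbitrary value chosen for this example), a transaction that commits the value 11 with version 24 leaves the field x untouched. A reader with version 23 still gets 7, whether it sees the null header or reaches version zero through the history, while a reader with version 24 gets 11.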
  27. The reversion also consists of three tasks: - First, it reads the object’s header and checks whether the versions’ history has just one body. - In that case, it copies the values from that body into the object’s fields. - Finally, it performs a compare-and-swap operation that nullifies the header.
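A minimal sketch of these three reversion tasks is shown below. The names RevBody, RevertibleObject and tryRevert are assumptions made for this example and are not the methods of the AdaptiveObject class. The sketch also ignores the interaction with the garbage collection of old versions that decides when a history has shrunk to one body.

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch only: names and the int-array representation are illustrative.
final class RevBody {
    final int[] values;  // values of x, y and z in this version
    final int version;
    final RevBody next;  // next older version, or null if this is the only body

    RevBody(int[] values, int version, RevBody next) {
        this.values = values;
        this.version = version;
        this.next = next;
    }
}

final class RevertibleObject {
    int x, y, z;
    // Null in the compact layout; head of the versions' history when extended.
    final AtomicReference<RevBody> header = new AtomicReference<>(null);

    // Try to bring the object back to the compact layout.
    boolean tryRevert() {
        // Task 1: read the header and check that the history has exactly one body.
        RevBody head = header.get();
        if (head == null || head.next != null) {
            return false; // already compact, or older versions may still be needed
        }
        // Task 2: copy the values of that body into the object's fields.
        x = head.values[0];
        y = head.values[1];
        z = head.values[2];
        // Task 3: nullify the header, but only if no new version was added meanwhile.
        return header.compareAndSet(head, null);
    }
}
```

If the compare-and-swap fails because another transaction added a version in the meantime, the header is still non-null, so readers keep using the versions’ history and never look at the fields that were just overwritten.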
  28. These three tasks are defined by three methods in the AdaptiveObject class.
  29. The AdaptiveObject class also defines two more methods for the extension process: - replicate, which returns a clone of this object; - casHeader, which performs a compare-and-swap operation on the object’s header.
  30. Finally, our instrumentation engine replaces the root of the class hierarchy, so that AdaptiveObject becomes the base class of every transactional class.
  31. This is the new design corresponding to the new release of the AOM.
  32. Lee-TM is a realistic and non-trivial benchmark that uses Lee’s routing algorithm to automatically produce interconnections between electronic components. We get no scalability beyond 8 threads. These results confirmed our expectations, but we had no scalability and were confined to a restricted set of benchmarks. For instance, the results in STAMP were not so promising.
  33. With the new release we get almost the same speedup, but we scale up to almost 44 threads in the Main circuit and 28 threads in the MEM circuit.
  34. We used one of the configurations recommended in STAMP, identified as low++ in its paper, and we added long transactions and support for read-only transactions.
  35. The speedup decreased a little, but the AOM is still better than the JVSTM and much better than TL2.
  36. When we remove the read-only transactions, the speed drops to almost half of its value.