Applying Concurrency Cookbook Recipes to SPEC JBB
Monica Beckwith
Arm64, SPECJBB, Java
1.
Applying Concurrency Cookbook Recipes to SPEC JBB
Jade Alglave – Architecture and Technology
Monica Beckwith – Infrastructure LOB; Advanced Server Team
2.
© 2019 Arm Limited
About Us
Dr. Jade Alglave
• Memory model architect at Arm
• Co-developer and maintainer of the herd+diy toolsuite with Luc Maranget (INRIA, France)
• Co-developer and maintainer of the Linux kernel memory model
Monica Beckwith
• Managed runtime performance architect at Arm
• Experience with OpenJDK HotSpot JIT, GC
• Experience with JMM and with strongly and weakly ordered architectures such as x86-64, SPARC, Arm64 and (very briefly) PPC64
3.
What We Will Cover Today
Introduction to –
• Memory Models (Java Relaxed)
• Performance Methodology using Litmus Tests and Tools
• Performance Analysis and Measurement using Java Microbenchmark Harness (JMH)
• Performance Study on Scaling CPU Cores and Simultaneous Multithreading (SMT)
4.
What We Will NOT Cover Today
Details of –
• Java Relaxed memory model
• JMH benchmarking
• SPEC JBB 2015 benchmarking
• CPU cores and simultaneous multithreading (SMT)
5.
Memory Models - What value can a load read?
6.
The Ideal Concurrent World of Hardware and Software
Processor Threads and Software Threads
[Figure, hardware side: multi-threaded hardware with shared memory structure; Thread 1 … Thread n each write (W) and read (R) Shared Memory. This drawing is heavily inspired by 'A Tutorial Introduction to the ARM and POWER Relaxed Memory Models' by Sewell et al.]
[Figure, software side: a multi-threaded, concurrency-aware program; labels Thread 1, Thread 2, Object 1, Object 2, Lock, help. This drawing is heavily inspired by the "timethreads" concept in Doug Lea's 'Concurrent Programming in Java: Design Principles and Patterns, Second Edition'.]
7.
Sequentially Consistent Shared Memory
Execution Order == Program Order == Sequential Order
[Figure: the software view (Thread 1, Thread 2, Object 1, Object 2, Lock, help; Timeline (Program Order)) plus the hardware view (Thread 1 … Thread n writing (W) and reading (R) Shared Memory) equals a Timeline (Single Global Execution Order)]
A Sequentially Consistent Machine
• No local reordering
• Writes become visible simultaneously to all threads
8.
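Sequential Consistency in Practice
Store Buffering Example
Initially, X and Y are 0 in memory; foo and bar are local (register) variables:
    p0              p1
    a: X = 1;       c: Y = 1;
    b: foo = Y;     d: bar = X;
What are the permissible values for foo and bar?
Under Sequential Consistency, they are the values reachable by interleavings that preserve each thread's program order, for example:
{a,b,c,d} {c,d,a,b} {a,c,b,d}
foo = 0 requires b to run before c, and bar = 0 requires d to run before a. Together with program order (a before b, c before d) that would form a cycle, so no interleaving produces both.
Therefore we cannot have foo and bar both equal to 0.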
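The interleaving argument for the store-buffering example can be checked mechanically. The following Java sketch (class and method names are illustrative, not from the slides) enumerates every merge of p0 = a;b and p1 = c;d that preserves program order, runs each one against a fresh X = Y = 0, and collects the resulting (foo, bar) pairs. It models an idealized sequentially consistent machine only and says nothing about what real hardware does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Enumerates every sequentially consistent interleaving of the store-buffering test. */
final class StoreBufferingSC {
    // Shared memory (x, y) and the two registers (foo, bar) for one simulated run.
    private int x, y, foo, bar;

    /** Executes one labelled operation from the slide. */
    private void step(char op) {
        switch (op) {
            case 'a': x = 1; break;    // p0: X = 1
            case 'b': foo = y; break;  // p0: foo = Y
            case 'c': y = 1; break;    // p1: Y = 1
            case 'd': bar = x; break;  // p1: bar = X
            default: throw new IllegalArgumentException("unknown op " + op);
        }
    }

    /** Recursively merges p0 and p1 while keeping each thread's program order. */
    private static void interleave(String p0, String p1, String prefix, List<String> out) {
        if (p0.isEmpty() && p1.isEmpty()) {
            out.add(prefix);
            return;
        }
        if (!p0.isEmpty()) interleave(p0.substring(1), p1, prefix + p0.charAt(0), out);
        if (!p1.isEmpty()) interleave(p0, p1.substring(1), prefix + p1.charAt(0), out);
    }

    /** Returns the set of "foo=?,bar=?" outcomes reachable under sequential consistency. */
    static Set<String> outcomes() {
        List<String> schedules = new ArrayList<>();
        interleave("ab", "cd", "", schedules);
        Set<String> results = new TreeSet<>();
        for (String schedule : schedules) {
            StoreBufferingSC run = new StoreBufferingSC();
            for (char op : schedule.toCharArray()) run.step(op);
            results.add("foo=" + run.foo + ",bar=" + run.bar);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(outcomes());
    }
}
```

Of the six program-order-preserving interleavings, none yields foo=0,bar=0; only the other three combinations are reachable.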
9.
The Real Concurrent World of Hardware
Multi Processor Threads/Cores with Tiered Memory Structure
[Figure: CPU0, CPU1 … CPUn, each with L1 I$, L1 D$ and L2; an LLC; Memory Controller; DDR Banks; IO]
Figure annotations:
• Cores: could be multi-threaded (SMT); can have out of order execution; can have Load and Store buffers
• Per-core caches: usually private
• LLC: usually shared between all cores
10.
Life In The Real World Without Sequential Consistency
Relaxed vs Strong Memory Model
[Figure: CPU0, CPU1 … CPUn, each with L1 I$, L1 D$ and L2; LLC; Memory Controller; IO]
Strong Memory Models (X86, SPARC): strong models based on Total Store Ordering (TSO)
• A thread can see its own write before other threads
• All other threads see the write simultaneously: Multiple Copy Atomic Model
Weaker Memory Models (POWER, Arm v7)
• Local reordering is allowed
• All threads are not guaranteed to see the write simultaneously: Not Multiple Copy Atomic Model
11.
Life In The Real World Without Sequential Consistency
Relaxed vs Strong Memory Model
[Figure: CPU0, CPU1 … CPUn, each with L1 I$, L1 D$ and L2; LLC; Memory Controller; IO]
Strong Memory Models (X86, SPARC): strong models based on Total Store Ordering (TSO)
• A thread can see its own write before other threads
• All other threads see the write simultaneously: Multiple Copy Atomic Model
Weaker Memory Models (Arm v8)
• Local reordering is allowed
• All other threads are guaranteed to see the write simultaneously: Multiple Copy Atomic Model
12.
• Can we reason about our concurrent programs following Sequential Consistency?
• Probably, if we had formal, preferably executable, memory models to ensure that we understand the guarantees given by architectures and programming languages.
• Here’s where Jade would come in, talking about her cool tools that allow programmers to explore the consequences of a given memory model or generate vast families of litmus tests to run against hardware.
Going Back To Our Store Buffer Example
[Figure: tool flow, diy.inria.fr]
diy: litmus test configuration file (~cat model) -> litmus tests
herd: litmus test + cat model -> Is this behavior allowed by the cat model? Yes/No
litmus: litmus test on HW -> Is this behavior observed on HW? Yes/No
13.
Store Buffer Litmus Test on a TSO Hardware
    X86 SB
    {x=0; y=0;}
     P0          | P1          ;
     MOV [x],$1  | MOV [y],$1  ;
     MOV EAX,[y] | MOV EAX,[x] ;
    exists (0:EAX=0 /\ 1:EAX=0)
Annotations:
• X86 SB: hardware architecture and test name
• {x=0; y=0;}: initial state (x and y are shared memory locations)
• P0 | P1: thread names
• Instruction rows: sequence of instructions displayed as columns
• exists clause: the question, can we observe this final state given that x=0; y=0?
14.
Armed With Knowledge
On TSO hardware, can we observe the final state foo=0 and bar=0, given that initially X=0 and Y=0?
... ...
Yes! All production architectures allow the outcome where both foo and bar equal 0.
So, what do we do? … Use mfence as needed.
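One way to see why TSO admits foo = 0 and bar = 0 is to give each thread a private store buffer in a small model. The Java sketch below is a toy, under stated assumptions: each thread buffers exactly one store, the buffer may drain to memory at any time, and a fence simply blocks the following load until the buffer has drained (roughly the job mfence does on x86). The class name and state encoding are invented for illustration; this is not the herd cat model.

```java
import java.util.Set;
import java.util.TreeSet;

/** Toy TSO model: each thread owns a store buffer that drains to memory at arbitrary times. */
final class StoreBufferingTSO {
    /**
     * Explores every schedule of the store-buffering test.
     * mem[0] = X, mem[1] = Y; thread t stores 1 to mem[t], then loads mem[1 - t] into reg[t].
     * pc[t]: 0 = before the store, 1 = store issued, 2 = load done.
     * pending[t]: thread t's store is still sitting in its private buffer.
     */
    private static void explore(int[] mem, int[] pc, boolean[] pending, int[] reg,
                                boolean fence, Set<String> out) {
        if (pc[0] == 2 && pc[1] == 2) {
            out.add("foo=" + reg[0] + ",bar=" + reg[1]);
            return;
        }
        for (int t = 0; t < 2; t++) {
            if (pending[t]) {
                // Drain the buffered store to shared memory, then undo for backtracking.
                pending[t] = false;
                mem[t] = 1;
                explore(mem, pc, pending, reg, fence, out);
                pending[t] = true;
                mem[t] = 0;
            }
            if (pc[t] == 0) {
                // Issue the store into the private buffer; shared memory is unchanged.
                pc[t] = 1;
                pending[t] = true;
                explore(mem, pc, pending, reg, fence, out);
                pc[t] = 0;
                pending[t] = false;
            } else if (pc[t] == 1 && (!fence || !pending[t])) {
                // Load the other thread's location; with a fence the buffer must be empty first.
                int saved = reg[t];
                reg[t] = mem[1 - t];
                pc[t] = 2;
                explore(mem, pc, pending, reg, fence, out);
                reg[t] = saved;
                pc[t] = 1;
            }
        }
    }

    /** Returns all reachable "foo=?,bar=?" outcomes, with or without a fence between store and load. */
    static Set<String> outcomes(boolean fence) {
        Set<String> out = new TreeSet<>();
        explore(new int[2], new int[2], new boolean[2], new int[2], fence, out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println("no fence:   " + outcomes(false));
        System.out.println("with fence: " + outcomes(true));
    }
}
```

Without the fence the model reaches all four outcomes, including foo=0,bar=0; with the fence it reaches only the three sequentially consistent ones.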
15.
Performance Methodology - Using Litmus Tests and Tools To Avoid Barriers Wherever Possible
16.
What & The Why Of Barriers / Fences?
Barriers ensure ordering properties
Barriers enforce strong order
Barriers (when inserted correctly) restore sequential consistency
Barriers can be potentially expensive
Data Memory Barriers on Arm:
DMB SY (full system)
DMB ST (wait for stores to complete)
DMB LD (wait for only loads to complete)
17.
Normal Load-Stores No Barriers
Litmus test
    AArch64 MP
    { 0:X1=x; 0:X3=y; 1:X1=y; 1:X3=x; }
     P0          | P1          ;
     MOV W0,#1   | LDR W0,[X1] ;
     STR W0,[X1] | LDR W2,[X3] ;
     MOV W2,#1   |             ;
     STR W2,[X3] |             ;
    exists (1:X0=1 /\ 1:X2=0)
Check for any reorder
Check if X0 = 1 and X2 = 0 can exist on P1.
18.
Normal Stores Load Barrier
Litmus test
    AArch64 MP+DMB.LD
    { 0:X1=x; 0:X3=y; 1:X1=y; 1:X3=x; }
     P0          | P1          ;
     MOV W0,#1   | LDR W0,[X1] ;
     STR W0,[X1] | DMB LD      ;
     MOV W2,#1   | LDR W2,[X3] ;
     STR W2,[X3] |             ;
    exists (1:X0=1 /\ 1:X2=0)
Check for Store reorder
19.
Load & Store Barriers
Litmus test
    AArch64 MP+DMB.LD+DMB.ST
    { 0:X1=x; 0:X3=y; 1:X1=y; 1:X3=x; }
     P0          | P1          ;
     MOV W0,#1   | LDR W0,[X1] ;
     STR W0,[X1] | DMB LD      ;
     DMB ST      | LDR W2,[X3] ;
     MOV W2,#1   |             ;
     STR W2,[X3] |             ;
    exists (1:X0=1 /\ 1:X2=0)
Generated assembler
    #START _litmus_P1
        ldr w4,[x1]
        dmb ld
        ldr w5,[x2]
    #START _litmus_P0
        mov w7,#1
        str w7,[x0]
        dmb st
        mov w6,#1
        str w6,[x2]
Results
    Test MP+DMB.ST+DMB.LD Allowed
    Histogram (3 states)
    499999:>1:X0=0; 1:X2=0;
    20    :>1:X0=0; 1:X2=1;
    499981:>1:X0=1; 1:X2=1;
    No
    Witnesses
    Positive: 0, Negative: 1000000
    Condition exists (1:X0=1 /\ 1:X2=0) is NOT validated
    Hash=4d15dccdb1da0ce51fac17dea068d047
    Observation MP+DMB.ST+DMB.LD Never 0 1000000
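At the Java level, the closest analogue of this message-passing test with explicit barriers is a pair of plain fields ordered by the static fence methods on java.lang.invoke.VarHandle (Java 9 and later). The sketch below is an illustration under assumptions: the class, field, and method names are invented, and how a given JVM lowers storeStoreFence and loadLoadFence to AArch64 barrier instructions is not something the slides state. It counts how often the forbidden outcome (flag seen as 1, data seen as 0) shows up.

```java
import java.lang.invoke.VarHandle;

/** Message passing over plain fields, ordered by explicit fences (requires Java 9+). */
final class MessagePassingFences {
    static int data;  // payload, plain field (plays the role of x in the litmus test)
    static int flag;  // ready flag, plain field (plays the role of y)

    /** Runs one writer/reader round and returns {flagSeen, dataSeen}. */
    static int[] runOnce() {
        data = 0;
        flag = 0;
        int[] seen = new int[2];
        Thread writer = new Thread(() -> {
            data = 1;
            VarHandle.storeStoreFence();  // keep the data store ahead of the flag store
            flag = 1;
        });
        Thread reader = new Thread(() -> {
            int f = flag;
            VarHandle.loadLoadFence();    // keep the flag load ahead of the data load
            int d = data;
            seen[0] = f;
            seen[1] = d;
        });
        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for test threads", e);
        }
        return seen;
    }

    /** Counts rounds that show the forbidden outcome flag = 1, data = 0. */
    static int forbiddenOutcomes(int rounds) {
        int bad = 0;
        for (int i = 0; i < rounds; i++) {
            int[] s = runOnce();
            if (s[0] == 1 && s[1] == 0) bad++;
        }
        return bad;
    }

    public static void main(String[] args) {
        System.out.println("forbidden outcomes: " + forbiddenOutcomes(1000));
    }
}
```

A run like this can only fail to observe the forbidden outcome; unlike herd, it cannot show that the outcome is disallowed by the model.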
20.
But Aren’t Barriers Expensive?
Acquire – Release (implicit barrier) semantic – One way barriers
LDAR - All loads and stores that are after an LDAR in program order, … must be observed after the LDAR
STLR - All loads and stores preceding an STLR …, must be observed before the STLR
Thinking about lock-free?
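The one-way acquire and release semantics described for LDAR and STLR have a Java-level counterpart in the getAcquire and setRelease access modes (available on AtomicInteger since Java 9). The sketch below is a minimal illustration with invented class and field names; whether a particular JIT compiles these calls to LDAR and STLR on AArch64 is an assumption here, not a claim from the slides.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Message passing with a release store and an acquire load instead of full fences. */
final class MessagePassingAcquireRelease {
    static int data;                                        // plain payload
    static final AtomicInteger flag = new AtomicInteger();  // ready flag

    /** Runs one writer/reader round and returns {flagSeen, dataSeen}. */
    static int[] runOnce() {
        data = 0;
        flag.set(0);
        int[] seen = new int[2];
        Thread writer = new Thread(() -> {
            data = 1;
            flag.setRelease(1);       // earlier accesses may not move below this store
        });
        Thread reader = new Thread(() -> {
            int f = flag.getAcquire();  // later accesses may not move above this load
            int d = data;
            seen[0] = f;
            seen[1] = d;
        });
        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for test threads", e);
        }
        return seen;
    }

    /** Counts rounds in which the reader saw the flag but not the payload. */
    static int forbiddenOutcomes(int rounds) {
        int bad = 0;
        for (int i = 0; i < rounds; i++) {
            int[] s = runOnce();
            if (s[0] == 1 && s[1] == 0) bad++;
        }
        return bad;
    }

    public static void main(String[] args) {
        System.out.println("forbidden outcomes: " + forbiddenOutcomes(1000));
    }
}
```

The release store only constrains accesses before it and the acquire load only constrains accesses after it, which is the "one way" property named on the slide.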
21.
Performance Analysis and Measurement - Using Java Microbenchmark Harness (JMH)
22.
JMM Rule 1 for Volatile Stores
Order 1: Normal/Volatile Load or Normal/Volatile Store
Order 2: Volatile Store
Result: Can’t Reorder
Any load/store (normal or volatile) followed by a ‘volatile store’ can’t be reordered.
23.
Volatile Stores - Barriers With DMBs
JMM rule: Any load/store followed by Volatile Store can’t be re-ordered
Litmus test
    { 0:X1=x; 0:X3=y; 1:X1=y; 1:X3=x; }
     P0          | P1          ;
     MOV W0,#1   | LDR W0,[X1] ;
     STR W0,[X1] | DMB LD      ;
     LDR W2,[X1] | LDR W2,[X3] ;
     DMB SY      |             ;
     STR W2,[X3] |             ;
    exists (1:X0=1 /\ 1:X2=0)
Results
    Test MP Allowed
    States 3
    1:X0=0; 1:X2=0;
    1:X0=0; 1:X2=1;
    1:X0=1; 1:X2=1;
    No
    Witnesses
    Positive: 0 Negative: 3
    Condition exists (1:X0=1 /\ 1:X2=0)
    Observation MP Never 0 3
24.
Volatile Stores - Can Barriers Be Replaced By STLR?
JMM rule: Any load/store followed by Volatile Store can’t be re-ordered
Litmus test
    { 0:X1=x; 0:X3=y; 1:X1=y; 1:X3=x; }
     P0           | P1          ;
     MOV W0,#1    | LDR W0,[X1] ;
     STR W0,[X1]  | DMB LD      ;
     MOV W2,#1    | LDR W2,[X3] ;
     STLR W2,[X3] |             ;
    exists (1:X0=1 /\ 1:X2=0)
Results
    Test MP Allowed
    States 3
    1:X0=0; 1:X2=0;
    1:X0=0; 1:X2=1;
    1:X0=1; 1:X2=1;
    No
    Witnesses
    Positive: 0 Negative: 3
    Condition exists (1:X0=1 /\ 1:X2=0)
    Observation MP Never 0 3
25.
JMM Rule 2 for Volatile Stores
• A ‘volatile store’ followed by any normal load/store CAN be reordered.
• A ‘volatile store’ followed by any volatile load/store CANNOT be reordered.
Order 1: Volatile Store
Order 2: Normal Load or Normal Store. Result: Can Reorder
Order 2: Volatile Load or Volatile Store. Result: Can’t Reorder
26.
Volatile Stores – Can Barriers Be Replaced By STLR?
STLR doesn’t guarantee that a subsequent volatile load/store will not be reordered
JMH Test Code
    static class TestNormalLoadPostVolatileStores {
        volatile int intField1;
        int intNorm1;

        public TestNormalLoadPostVolatileStores() {
            // Volatile store to intField1.
            intField1 = 32;
            // Volatile load of intField1, followed by a normal store to intNorm1.
            intNorm1 = intField1;
        }
    }
Litmus Test
    { 0:X1=x; 0:X3=y; 1:X1=y; 1:X3=x; }
     P0           | P1          ;
     LDR W0,[X1]  | LDR W0,[X1] ;
     MOV W2,#1    | MOV W2,#1   ;
     STLR W2,[X3] | STR W2,[X3] ;
    exists (0:X0=1 /\ 1:X0=1)
[Figure: Positive Event Structure]
27.
Volatile Stores – Can Barriers Be Replaced By STLR?
Success! STLR + LDAR of volatiles provide that guarantee
Litmus test
    { 0:X1=x; 0:X3=y; 1:X1=y; 1:X3=x; }
     P0           | P1           ;
     LDR W0,[X1]  | LDAR W0,[X1] ;
     MOV W2,#1    | MOV W2,#1    ;
     STLR W2,[X3] | STR W2,[X3]  ;
    exists (0:X0=1 /\ 1:X0=1)
Results
    Test LB Allowed
    States 3
    0:X0=0; 1:X0=0;
    0:X0=0; 1:X0=1;
    0:X0=1; 1:X0=0;
    No
    Witnesses
    Positive: 0 Negative: 3
    Condition exists (0:X0=1 /\ 1:X0=1)
    Observation LB Never 0 3
28.
Volatile Stores – JMH Profiles
Success! STLR + LDAR of volatiles provide that guarantee.

Load Acquire – Store Release Pair:
    stlr w11, [x10]      ;*putfield intField1
                         ;
    add  x10, x2, #0xc
    ldar w11, [x10]      ;*getfield intField1

Data Memory Barrier (inner shareable domain):
    str  w10, [x2,#12]
    dmb  ish             ;*putfield intField1
    ldr  w11, [x2,#12]
    dmb  ishld           ;*getfield intField1

36% faster on max SMT count!!
29.
JMM Rule 1 for Volatile Loads
A ‘volatile load’ followed by any load/store (normal or volatile) can’t be reordered.

Order 1         Order 2                 Reordering
Volatile Load   Normal/Volatile Load    Can’t Reorder
Volatile Load   Normal/Volatile Store   Can’t Reorder
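The reordering rules stated on the slides so far can be collected into one small lookup, sketched below. The class, enum, and method names are hypothetical. The slides state three rules: nothing may be reordered after a volatile load, a volatile store may only be reordered with a following normal access, and no access may be reordered with a following volatile store. The remaining case (a normal access followed by a volatile load may be reordered) is taken from the JSR-133 cookbook table listed under Resources, not from the slides themselves.

```java
final class CookbookRules {
    enum Access { NORMAL_LOAD, NORMAL_STORE, VOLATILE_LOAD, VOLATILE_STORE }

    /** Returns true if 'first' followed by 'second' (in program order) may be reordered. */
    static boolean canReorder(Access first, Access second) {
        switch (first) {
            case VOLATILE_LOAD:
                // Rule 1 for volatile loads: nothing that follows may move above it.
                return false;
            case VOLATILE_STORE:
                // Rule 2 for volatile stores: only a following normal access may be reordered.
                return second == Access.NORMAL_LOAD || second == Access.NORMAL_STORE;
            default:
                // Normal load or store first: it may not move below a volatile store.
                // Assumption from the JSR-133 cookbook: a following volatile load is allowed to reorder.
                return second != Access.VOLATILE_STORE;
        }
    }

    public static void main(String[] args) {
        for (Access first : Access.values()) {
            for (Access second : Access.values()) {
                System.out.println(first + " -> " + second + ": "
                        + (canReorder(first, second) ? "can reorder" : "can't reorder"));
            }
        }
    }
}
```

Every "can't reorder" entry is a place where a JIT compiler targeting Arm needs either a DMB or an acquire/release instruction, which is the trade-off the litmus tests on the surrounding slides explore.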
30.
Volatile Load - Barriers With DMBs
JMM rule: A volatile load followed by any load/store can’t be reordered.

Litmus test:
    { 0:X1=x; 0:X3=y; 1:X1=y; 1:X3=x; }
     P0          | P1          ;
     LDR W0,[X1] | LDR W0,[X1] ;
     DMB SY      | MOV W2,#1   ;
     MOV W2,#1   | DMB SY      ;
     STR W2,[X3] | STR W2,[X3] ;
    exists (0:X0=1 /\ 1:X0=1)

Results:
    Test LB Allowed
    States 3
    0:X0=0; 1:X0=0;
    0:X0=0; 1:X0=1;
    0:X0=1; 1:X0=0;
    No
    Witnesses
    Positive: 0 Negative: 3
    Condition exists (0:X0=1 /\ 1:X0=1)
    Observation LB Never 0 3
31.
Volatile Loads – Can Barriers Be Replaced By LDAR?
Success! LDAR provides the right guarantee.

Litmus test:
    { 0:X1=x; 0:X3=y; 1:X1=y; 1:X3=x; }
     P0           | P1          ;
     LDAR W0,[X1] | LDR W0,[X1] ;
                  | MOV W2,#1   ;
     MOV W2,#1    | DMB SY      ;
     STR W2,[X3]  | STR W2,[X3] ;
    exists (0:X0=1 /\ 1:X0=1)

Results:
    Test LB Allowed
    States 3
    0:X0=0; 1:X0=0;
    0:X0=0; 1:X0=1;
    0:X0=1; 1:X0=0;
    No
    Witnesses
    Positive: 0 Negative: 3
    Condition exists (0:X0=1 /\ 1:X0=1)
    Observation LB Never 0 3
32.
Performance Study - Scaling CPU Cores / Simultaneous Multithreading (SMT)
33.
Applying Cook Book Recipes to SPECJBB
[Chart: relative SPECJBB score by core or thread count; bigger is better]

Core or Thread Count   With LSE; With LDAR (baseline)   Without LSE; With LDAR   With LSE; With DMB
1                      1.00                             0.97                     0.92
4                      1.00                             1.00                     0.95
Max                    1.00                             1.01                     1.00

• Fences/Barriers (e.g. DMB ST, DMB LD, DMB SY)
• Atomics/LSE (e.g. LDREX/STREX or CAS)
34.
Single Core Performance
The Quest and Guarantee of Sequential Consistency

Hardware improvements measured on Java micro-benchmarks (OpenJDK JDK11):
• Object/memory allocations up to 2.4x faster
• Object/array initializations up to 5x faster
  – Smart issuing and cost reduction of SW barriers (i.e. DMB) required by Arm’s relaxed memory model
• Copy chars up to 1.6x faster
• New atomic instructions improve locking throughput and contention latency by up to 2x

Cortex-A72            Code                                  Neoverse N1
0.21%                 dmb ishst ;*new                       (0 cycles) 0.00%
7.73% (~3.5 cycles)   ldr x11, [sp,#8]                      1.75%
                      ldr w17, [x11,#12] ;*getfield         0.06%
                      mov x2, x0                            0.51%
                      ldp w0, w18, [x11,#16] ;*getfield     0.11%
0.42%                 ldp w3, w1, [x11,#24] ;*getfield

org.openjdk.bench.vm.compiler.generated.StoreAfterStore_testAllocAndZeroStore_jmhTest::testAllocAndZeroStore_avgt_jmhStub
35.
Ares Single Core Performance
Hardware improvements measured on SPECJBB (OpenJDK JDK11):
• Neoverse N1 CPU improves performance over Cortex-A72 by 1.7x
Software improvements measured on SPECJBB:
• JDK11 improves performance vs JDK8 on Arm by up to 14%
36.
Resources
http://g.oswego.edu/dl/jmm/cookbook.html
https://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf
http://hg.openjdk.java.net/code-tools/jmh-jdk-microbenchmarks/file/92c55597888e/README.md
http://infocenter.arm.com/help/topic/com.arm.doc.genc007826/Barrier_Litmus_Tests_and_Cookbook_A08.pdf
37.
Appendix
38.
The herd+diy Toolsuite
The tool suite supports and provides a formal consistency model for all of the following:
- Arm, IBM Power, Intel x86
- Nvidia GPUs
- C/C++
- Linux C
This means that a user can experiment with the concurrency implemented at all these levels and generate systematic families of tests to probe implementations.
diy.inria.fr
39.
A Store Barrier Litmus Test
[Figure: litmus test source and event-structure diagrams with edges labelled PodWW, Rfe, PodRR, Fre, PodWR]

A litmus test source has three main sections:
• The initial state defines the initial values of registers and memory locations. Initialisation to zero may be omitted.
• The code section defines the code to be run concurrently; above there are two threads. Yes we know, our X86 assembler syntax is a mistake.
• The final condition applies to the final values of registers and memory locations.
40.
Executing the model: herd
[Diagram: litmus test + cat model → herd → Is this behavior allowed by the cat model? Yes/No]

The herd tool allows a user to execute a formal model written in the cat language. Given a litmus test and a cat model, herd runs the litmus test against the cat model: it determines whether the model allows the final state given in the test to be reached.
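To make the "is this final state allowed?" question concrete, here is a toy checker for the load-buffering (LB) shape used on the earlier slides. It is a sketch under loud assumptions: it is not herd and does not read cat models. It simply enumerates every interleaving of two threads under sequential consistency, which is a stronger model than Arm's, so it forbids the LB outcome even without barriers. The class name and the register names r0 and r1 are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

final class ToyScChecker {
    /** One instruction; the state array is laid out as [x, y, r0, r1]. */
    interface Op { void apply(int[] s); }

    // P0: r0 = x; y = 1      P1: r1 = y; x = 1
    static final Op[] P0 = { s -> s[2] = s[0], s -> s[1] = 1 };
    static final Op[] P1 = { s -> s[3] = s[1], s -> s[0] = 1 };

    /** Collects the final register values over all interleavings of P0 and P1. */
    static Set<String> finalStates() {
        Set<String> out = new TreeSet<>();
        explore(new int[4], 0, 0, out);
        return out;
    }

    // Depth-first search: at each step, run the next instruction of either thread.
    private static void explore(int[] s, int i, int j, Set<String> out) {
        if (i == P0.length && j == P1.length) {
            out.add("0:r0=" + s[2] + "; 1:r1=" + s[3] + ";");
            return;
        }
        if (i < P0.length) {
            int[] next = s.clone();
            P0[i].apply(next);
            explore(next, i + 1, j, out);
        }
        if (j < P1.length) {
            int[] next = s.clone();
            P1[j].apply(next);
            explore(next, i, j + 1, out);
        }
    }

    /** The "exists" clause of the LB test: both loads observe the other thread's store. */
    static boolean lbOutcomeReachable() {
        return finalStates().contains("0:r0=1; 1:r1=1;");
    }

    public static void main(String[] args) {
        finalStates().forEach(System.out::println);
        System.out.println("LB outcome reachable: " + lbOutcomeReachable());
    }
}
```

herd answers the same question axiomatically, checking candidate executions against the relations defined in the cat model rather than interleaving instructions, which is what lets it describe relaxed models where an outcome such as LB can become reachable.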
41.
Running tests on hardware: litmus
[Diagram: litmus test → litmus on HW → Is this behavior observed on HW? Yes/No]

The litmus tool allows a user to run a litmus test against hardware. The tool gathers all the final states that were observed on hardware during multiple runs of the test. We can then compare the output of herd and litmus, to check whether they are in accord.
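One way to read "in accord" is that every final state observed on hardware must also be allowed by the model, while the hardware is free to show fewer states than the model allows. That reading is an assumption, as are the class and method names in the sketch below; the state strings are copied from the LB results on the earlier slides.

```java
import java.util.Set;
import java.util.TreeSet;

final class AccordCheck {
    /** Returns the states seen on hardware that the model does not allow (empty means in accord). */
    static Set<String> unexpected(Set<String> allowedByModel, Set<String> observedOnHw) {
        Set<String> extra = new TreeSet<>(observedOnHw);
        extra.removeAll(allowedByModel);
        return extra;
    }

    public static void main(String[] args) {
        // Allowed states as reported for test LB on the earlier slides.
        Set<String> allowed = Set.of("0:X0=0; 1:X0=0;", "0:X0=0; 1:X0=1;", "0:X0=1; 1:X0=0;");
        // Hypothetical hardware observation, made up for this example.
        Set<String> observed = Set.of("0:X0=0; 1:X0=0;", "0:X0=1; 1:X0=0;");
        System.out.println("unexpected states: " + unexpected(allowed, observed));
    }
}
```

A non-empty result would point to either a hardware bug or a model that is too strict, which is the kind of discrepancy a validation campaign looks for.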
42.
Generating tests: diy
[Diagram: configuration file (~cat model) → diy → litmus test]

The diy tool allows a user to generate interesting families of litmus tests. It takes as input a configuration file, where a user should list the features of interest to them. We can use families of diy-generated tests to run validation campaigns, comparing the cat model and prototypes.
43.
The Cloud to Edge Infrastructure Foundation for a World of 1T Intelligent Devices
Thank You!