Studying Concurrency
2017.1.22
<ajblane0612@gmail.com>
AJMachine
From Getting Lost to Converging
Outline
• Why is writing concurrent code hard?
• Programmer-observable behavior
• Some examples of concurrency performance techniques
• Some examples of concurrency security
Why Is Writing Concurrent Code Hard?
• Hardware optimizations
• Compiler optimizations
Unpredictable behavior
Hardware Optimizations
- Write Buffer
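• On a write, a processor simply inserts the write operation into the
write buffer and proceeds without waiting for the write to complete,
which effectively hides the latency of write operations
• A later read can bypass the buffered write, so in the tutorial's
Dekker-style example each processor can still read the other's flag as 0
• Therefore, P1 and P2 can both be in the critical section at the same time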
Sarita V. Adve, Kourosh Gharachorloo, “Shared Memory Consistency Models: A Tutorial”
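The write-buffer problem above can be sketched in C++11. This is a minimal illustration, not the tutorial's code: the names `flag1`, `flag2` and `count_both_entered` are hypothetical, and the sketch assumes the Dekker-style pattern where each thread sets its own flag and then reads the other's. With the default `std::memory_order_seq_cst`, the outcome "both threads read 0" is forbidden, so the count is always 0; with weaker orders it would be allowed.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Counts how often both threads would enter the critical section at once.
// flag1/flag2 are hypothetical names for the two intent flags.
// All atomic operations use the default memory_order_seq_cst, which
// forbids the outcome r1 == 0 && r2 == 0.
int count_both_entered(int trials) {
    assert(trials >= 0);
    int both = 0;
    for (int i = 0; i < trials; ++i) {
        std::atomic<int> flag1{0}, flag2{0};
        int r1 = -1, r2 = -1;
        std::thread p1([&] {
            flag1.store(1);     // announce intent
            r1 = flag2.load();  // then check the other side
        });
        std::thread p2([&] {
            flag2.store(1);
            r2 = flag1.load();
        });
        p1.join();
        p2.join();
        if (r1 == 0 && r2 == 0) ++both;  // both saw "free": both would enter
    }
    return both;
}
```

Calling `count_both_entered(1000)` should return 0 on any conforming implementation; on x86 the compiler typically pays for this with a fence or a locked instruction after each store.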
Hardware Optimizations
- Overlapped Writes
• Assume the Data and Head variables reside in different memory modules
• The write to Head may be injected into the network before the write to Data
has reached its memory module
• Therefore, it is possible for another processor to observe the new value of Head
and yet obtain the old value of Data
• The net effect is a reordering of write operations
Sarita V. Adve, Kourosh Gharachorloo, “Shared Memory Consistency Models: A Tutorial”
(coalesced write)
Hardware Optimizations
- Non-blocking Reads
• If P2 is allowed to issue its read operations in an overlapped
fashion, the read of Data may arrive at its memory module
before the write from P1, while the read of Head reaches its
memory module after the write from P1
• Result: P2 reads Head = 1 (new) but Data = 0 (old) instead of Data = 2000
Sarita V. Adve, Kourosh Gharachorloo, “Shared Memory Consistency Models: A Tutorial”
(coalesced read)
For a closer look at how this works, see
Memory Barriers: a Hardware View for
Software Hackers
So What Can We Do? Ideally
• Sequential Consistency (per-core operation order =
multi-core operation order)
– The result of any execution is the same as if the
operations of all the processors were executed in
some sequential order, and the operations of each
individual processor appear in this sequence in
the order specified by its program
• There is no local reordering
• Each write becomes visible to all threads
Sarita V. Adve, Kourosh Gharachorloo, “Shared Memory Consistency Models: A Tutorial”
Luc Maranget et al., “A Tutorial Introduction to the ARM and POWER Relaxed Memory Models”
In Practice, SC Is Not Guaranteed
Memory model            Architecture   Local ordering   Multiple-copy atomic
Total store ordering    Intel x86      X                O
Relaxed memory model    ARM            X                X
Luc Maranget et al., “A Tutorial Introduction to the ARM and POWER Relaxed Memory Models”
Developers must write code themselves to manage the ordering of memory operations
With So Many Hardware Optimizations, How Do I
Know How My Program Behaves
(Programmer-observable Behavior)?
• Mathematically rigorous architecture
definitions
– Luc Maranget et al., “A Tutorial Introduction to the
ARM and POWER Relaxed Memory Models”
• Hardware semantics
– Shaked Flur et al., “Modelling the ARMv8
Architecture, Operationally: Concurrency and ISA”
• C/C++11 memory model
• …?
Mathematically Rigorous Architecture
Definitions – For Example
• Message Passing (MP)
Luc Maranget et al., “A Tutorial Introduction to the ARM and POWER Relaxed Memory Models”
Thread 0: x=1; y=1
Thread 1: r1=y; r2=x
Final state: r1=1 ∧ r2=0
x86-TSO: forbidden
ARM: allowed
Partial-order Propagation
?
Does Partial-order Propagation Always Affect
Program Behavior? Not Necessarily
• MP test harness
• m is the number of times that the final outcome
of r1=1 ∧ r2=0 was observed in n trials
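A harness of this kind can be sketched in C++11. This is only an illustration under assumptions: `mp_observed` and its template parameters are hypothetical names, and a C++ harness measures the language-level model (compiler plus hardware), not the bare ISA that the cited tools test. The sketch runs the MP test n times and returns m. With relaxed accesses, m may be anything from 0 to n; with a release store to y and an acquire load of y, the C++11 model guarantees m = 0.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs the MP litmus test n times and returns m, the number of trials
// that ended with r1 == 1 && r2 == 0.
// StoreY / LoadY (hypothetical parameter names) select the memory order
// used for the flag variable y; x is always accessed relaxed.
template <std::memory_order StoreY, std::memory_order LoadY>
int mp_observed(int n) {
    assert(n >= 0);
    int m = 0;
    for (int i = 0; i < n; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = 0, r2 = 0;
        std::thread t0([&] {
            x.store(1, std::memory_order_relaxed);  // data
            y.store(1, StoreY);                     // flag
        });
        std::thread t1([&] {
            r1 = y.load(LoadY);                      // flag
            r2 = x.load(std::memory_order_relaxed);  // data
        });
        t0.join();
        t1.join();
        if (r1 == 1 && r2 == 0) ++m;  // saw the flag but stale data
    }
    return m;
}
```

Because each trial starts fresh threads, the racy window is tiny and the relaxed variant will often report m = 0 even on ARM; real litmus harnesses keep threads running and synchronize the start of each trial to make the outcome more likely.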
Hardware Semantics
Shaked Flur et al., “Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA”
Web Site of Hardware Semantics
http://www.cl.cam.ac.uk/~sf502/popl16/help.html
Result of Hardware Semantics
http://www.cl.cam.ac.uk/~sf502/popl16/help.html
If a location is accessed concurrently (for example, because a lock is written incorrectly), the result output lets you spot it early.
C/C++11 Memory Model
• At the language level, keywords are defined so that every
hardware target must conform to this language memory model.
– https://www.youtube.com/watch?v=S-x-23lrRnc
• The video mentions that, to satisfy the C11 memory model on ARM,
the compiler can end up emitting double barriers
• Reinoud Elhorst, “Lowering C11 Atomics for ARM in
LLVM”
– Torvald Riegel, “Modern C/C++ concurrency”
• Semantics
– Mark Batty et al., “Mathematizing C++ concurrency”
Mathematizing C++ Concurrency
• Uses Isabelle/HOL to write the semantics of the C++ memory
model
• For example: the definition of a release sequence
Some Examples of Concurrency Performance
Techniques
• LMAX
• RCU
• Concurrent malloc(3)
• An Analysis of Linux Scalability to Many Cores
LMAX: New Financial Trading Platform
https://martinfowler.com/articles/lmax.html
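LMAX Lock-free Techniques
http://mechanitis.blogspot.tw/2011/06/dissecting-disruptor-how-do-i-read-from.html
• Applying barriers means replacing the original lock with a lock-free design; lock-free can be thought of as
a lock managed by the hardware. The implementation idea is basically the same as with a lock: queueing.
• RingBuffer: improves response time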
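The ring-buffer idea can be sketched as a single-producer, single-consumer queue in C++11. This is not the Disruptor's implementation (which is Java and supports multiple consumers); `SpscRing`, `try_push` and `try_pop` are hypothetical names. It only shows how two sequence counters plus release/acquire ordering replace a lock: the producer publishes a slot by advancing `head_`, and the consumer frees it by advancing `tail_`.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// Single-producer / single-consumer ring buffer (hypothetical sketch).
// Only one thread may call try_push and only one thread may call try_pop.
template <typename T, std::size_t N>
class SpscRing {
    static_assert(N > 0 && (N & (N - 1)) == 0, "N must be a power of two");

public:
    bool try_push(const T& value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        assert(head - tail <= N);            // invariant: never over-full
        if (head - tail == N) return false;  // full
        slots_[head & (N - 1)] = value;
        head_.store(head + 1, std::memory_order_release);  // publish the slot
        return true;
    }

    bool try_pop(T& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (head == tail) return false;      // empty
        out = slots_[tail & (N - 1)];
        tail_.store(tail + 1, std::memory_order_release);  // free the slot
        return true;
    }

private:
    std::array<T, N> slots_{};
    std::atomic<std::size_t> head_{0};  // next sequence to write (producer)
    std::atomic<std::size_t> tail_{0};  // next sequence to read (consumer)
};
```

A producer thread would loop on `try_push` and a consumer thread on `try_pop`; neither ever blocks on a mutex, which is where the response-time benefit comes from.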
Read Copy Update (RCU)
• Read-mostly situations
• Typical RCU: split an update into removal and reclamation phases (reclamation could disrupt readers, so it is deferred)
– Removing and replacing references to data items can run concurrently with readers
– Remove pointers to a data structure, so that subsequent readers cannot gain a
reference to it
– RCU provides implicit low-overhead communication between readers and reclaimers
(synchronize_rcu())
https://www.kernel.org/doc/Documentation/RCU/whatisRCU.txt
https://lwn.net/Articles/262464/
What If the Grace Period Is Too Long?
https://lwn.net/Articles/253651/
There are many RCU variants; I will read up on them when I get the chance
https://lwn.net/Articles/264090/
Concurrent malloc(3)
• How to avoid false cache sharing
– Modern multi-processor systems preserve a coherent
view of memory on a per-cache-line basis
• How to reduce lock contention
Jason Evans, “a scalable concurrent malloc implementation for freebsd”
jemalloc
• phkmalloc was specially optimized to minimize the working set of pages, whereas jemalloc
must be more concerned with cache locality
• jemalloc first tries to minimize memory usage, and tries to allocate contiguously
(weaker security)
• One way of fixing false cache sharing is to pad allocations, but padding is in direct opposition
to the goal of packing objects as tightly as possible; it can cause severe internal
fragmentation. jemalloc instead relies on multiple allocation arenas to reduce the
problem
• One of the main goals for this allocator was to reduce lock contention for
multi-threaded applications. An earlier approach: rather than using a single
allocator lock, each free list had its own lock
• The solution was to use multiple
arenas for allocation, and assign threads
to arenas via hashing of the thread identifiers
Jason Evans, “a scalable concurrent malloc implementation for freebsd”
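The two ideas on this slide (one lock per arena, threads spread over arenas by hashing their identifiers) can be sketched as follows. This is an accounting-only illustration, not jemalloc's code and not a real allocator: `Arena`, `ArenaSet`, `arena_index`, `kNumArenas` and the 64-byte cache-line size are all assumptions made for the example. Aligning each arena to a cache line also keeps two arenas' locks from sharing a line.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>

constexpr std::size_t kCacheLine = 64;  // assumed cache-line size
constexpr std::size_t kNumArenas = 8;   // hypothetical arena count

// Each arena has its own lock; alignas keeps arenas on separate cache lines.
struct alignas(kCacheLine) Arena {
    std::mutex lock;
    std::size_t bytes_allocated = 0;
};

// Maps a thread identifier to an arena by hashing it.
inline std::size_t arena_index(std::thread::id tid) {
    return std::hash<std::thread::id>{}(tid) % kNumArenas;
}

class ArenaSet {
public:
    // Charges `size` bytes to the calling thread's arena; returns its index.
    std::size_t account_alloc(std::size_t size) {
        const std::size_t i = arena_index(std::this_thread::get_id());
        std::lock_guard<std::mutex> guard(arenas_[i].lock);
        arenas_[i].bytes_allocated += size;
        return i;
    }

    std::size_t bytes_in(std::size_t i) {
        assert(i < kNumArenas);
        std::lock_guard<std::mutex> guard(arenas_[i].lock);
        return arenas_[i].bytes_allocated;
    }

private:
    Arena arenas_[kNumArenas];
};
```

Threads that hash to different arenas never contend on the same lock. Hashing can still place several busy threads in one arena, so a production allocator may prefer a different assignment policy.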
Scalability Collapse Caused by Non-scalable Locks
Linux Scalability to Many Cores -
Per-core Mount Caches
Silas Boyd-Wickizer et al., “An Analysis of Linux Scalability to Many Cores”
• Observation: mount table is
rarely modified
• Common case: cores access
per-core tables
• Modify mount table: invalidate
per-core tables
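The per-core cache idea can be sketched in user space with a per-thread cache standing in for the per-core table. This is an illustration only, not the kernel patch: `MountTable`, `set`, `lookup`, `slow_lookups` and the generation-counter invalidation scheme are assumptions. Lookups normally hit the thread-local cache without touching the shared lock; any modification bumps a generation number, which makes every thread drop its cache on its next lookup.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <string>
#include <unordered_map>

// Shared table with per-thread caches (hypothetical sketch).
class MountTable {
public:
    // Rare path: modify the shared table, then invalidate all caches.
    void set(const std::string& path, int fs_id) {
        assert(fs_id >= 0);  // -1 is reserved for "not mounted"
        std::lock_guard<std::mutex> guard(lock_);
        table_[path] = fs_id;
        generation_.fetch_add(1, std::memory_order_release);
    }

    // Common path: answer from the calling thread's cache when possible.
    // Returns -1 if `path` is not in the table.
    int lookup(const std::string& path) {
        thread_local Cache cache;
        const unsigned long gen = generation_.load(std::memory_order_acquire);
        if (cache.table_id != id_ || cache.generation != gen) {
            cache.entries.clear();  // stale or belongs to another table
            cache.table_id = id_;
            cache.generation = gen;
        }
        const auto hit = cache.entries.find(path);
        if (hit != cache.entries.end()) return hit->second;

        slow_lookups_.fetch_add(1, std::memory_order_relaxed);
        std::lock_guard<std::mutex> guard(lock_);
        const auto it = table_.find(path);
        if (it == table_.end()) return -1;
        cache.entries[path] = it->second;
        return it->second;
    }

    // Number of lookups that had to take the shared lock.
    unsigned long slow_lookups() const {
        return slow_lookups_.load(std::memory_order_relaxed);
    }

private:
    struct Cache {
        unsigned long table_id = 0;  // 0 never matches a real table
        unsigned long generation = 0;
        std::unordered_map<std::string, int> entries;
    };
    static unsigned long next_id() {
        static std::atomic<unsigned long> counter{1};
        return counter.fetch_add(1, std::memory_order_relaxed);
    }

    const unsigned long id_ = next_id();
    std::mutex lock_;
    std::unordered_map<std::string, int> table_;
    std::atomic<unsigned long> generation_{0};
    std::atomic<unsigned long> slow_lookups_{0};
};
```

The trade-off matches the slide's observation: reads scale because they stay local, while each (rare) modification costs every thread a cache refill.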
Linux Scalability to Many Cores -
Sloppy Counters
• Because reading reference count is slow
Silas Boyd-Wickizer et al., “An Analysis of Linux Scalability to Many Cores”
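A sloppy counter can be sketched as one central count plus per-core counts of spare references. The sketch below is a simplified reading of the idea, not the kernel code: `SloppyCounter`, `kCores`, `kThreshold` and the caller-supplied `core` index are assumptions, and each slot is assumed to be used by only one thread at a time. Taking a reference first tries a local spare; dropping one keeps it as a local spare and returns spares to the central count only past a threshold.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

constexpr std::size_t kCores = 4;  // hypothetical core count
constexpr long kThreshold = 2;     // hypothetical spare limit per core

// One slot per core, padded to an assumed 64-byte cache line.
struct alignas(64) PerCore {
    std::atomic<long> spare{0};
};

class SloppyCounter {
public:
    // Take a reference on behalf of `core`.
    void get(std::size_t core) {
        assert(core < kCores);
        PerCore& pc = local_[core];
        const long s = pc.spare.load(std::memory_order_relaxed);
        if (s > 0) {  // use a local spare: no shared write
            pc.spare.store(s - 1, std::memory_order_relaxed);
            return;
        }
        central_.fetch_add(1, std::memory_order_relaxed);
    }

    // Drop a reference on behalf of `core`, keeping it as a local spare.
    void put(std::size_t core) {
        assert(core < kCores);
        PerCore& pc = local_[core];
        const long s = pc.spare.load(std::memory_order_relaxed) + 1;
        if (s > kThreshold) {  // too many spares: hand them all back
            central_.fetch_sub(s, std::memory_order_relaxed);
            pc.spare.store(0, std::memory_order_relaxed);
        } else {
            pc.spare.store(s, std::memory_order_relaxed);
        }
    }

    long central() const { return central_.load(std::memory_order_relaxed); }

    // True number of references in use; exact only while no get/put runs.
    long in_use() const {
        long spares = 0;
        for (const PerCore& pc : local_)
            spares += pc.spare.load(std::memory_order_relaxed);
        return central() - spares;
    }

private:
    std::atomic<long> central_{0};
    PerCore local_[kCores];
};
```

The central value overestimates the true count by the number of spares, which is why reading the exact value has to sum every per-core slot.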
Some Examples of Concurrency Security
• Concurrency fuzzer
– Sebastian Burckhardt et al., “A Randomized
Scheduler with Probabilistic Guarantees of Finding
Bugs”
• Timing side channel attack
– Yeongjin Jang et al., “Breaking Kernel Address
Space Layout Randomization with Intel TSX”
Concurrency Fuzzer -
Randomized Scheduler
Sebastian Burckhardt et al., “A Randomized Scheduler with Probabilistic
Guarantees of Finding Bugs”
Randomized Scheduler
Note that read/write reordering in hardware is basically not simulated
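The scheduling idea can be sketched as a small simulation in the spirit of the paper's priority-based scheduler. It is a simplification under assumptions: threads are modeled only as step counts that never block, and `pct_schedule`, its parameters and the exact priority values used at change points are illustrative choices rather than the paper's code. Each thread gets a distinct random initial priority, d-1 change points are drawn at random among the k total steps, the highest-priority runnable thread always runs, and the thread that executes a change-point step is demoted below all initial priorities.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Returns the order in which thread ids are scheduled.
// steps[t] = number of steps thread t executes (threads never block here).
// depth    = bug depth d; depth - 1 priority change points are inserted.
std::vector<int> pct_schedule(const std::vector<int>& steps, int depth,
                              unsigned seed) {
    assert(depth >= 1);
    const int n = static_cast<int>(steps.size());
    const int k = std::accumulate(steps.begin(), steps.end(), 0);
    std::mt19937 rng(seed);

    // Initial priorities d, d+1, ..., d+n-1 in random order.
    std::vector<int> priority(static_cast<std::size_t>(n));
    std::iota(priority.begin(), priority.end(), depth);
    std::shuffle(priority.begin(), priority.end(), rng);

    // d-1 change points drawn uniformly from the step numbers 1..k.
    std::vector<int> change_point(static_cast<std::size_t>(depth - 1), 0);
    if (k > 0) {
        std::uniform_int_distribution<int> pick(1, k);
        for (int& c : change_point) c = pick(rng);
    }

    std::vector<int> remaining = steps;
    std::vector<int> schedule;
    schedule.reserve(static_cast<std::size_t>(k));
    for (int step = 1; step <= k; ++step) {
        // Run the highest-priority thread that still has work.
        int best = -1;
        for (int t = 0; t < n; ++t) {
            if (remaining[t] > 0 &&
                (best < 0 || priority[t] > priority[best])) {
                best = t;
            }
        }
        schedule.push_back(best);
        --remaining[best];
        // At the i-th change point, demote the running thread to priority i,
        // which is lower than every initial priority.
        for (int i = 0; i < depth - 1; ++i) {
            if (change_point[i] == step) priority[best] = i + 1;
        }
    }
    return schedule;
}
```

With `depth` equal to 1 there are no change points, so each thread simply runs to completion in priority order; larger depths introduce the few preemptions needed to reach deeper ordering bugs. A real tool applies the same decisions to live threads at synchronization points instead of to step counts.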
Find Violation (Order/ Atomicity)
The following slides summarize several fuzzers:
“Concurrency: A problem and
opportunity in the exploitation of
memory corruptions”
Intel Transactional Synchronization
Extensions
• The assembly instruction xbegin can return various
results that represent the hardware's suggestions for
how to proceed and reasons for failure: success, a
suggestion to retry, a potential cause for the abort
• To effectively use TSX it's imperative to understand its
implementation and limitations. TSX is implemented
using the cache coherence protocol, which x86
machines already implement. When a transaction
begins, the processor starts tracking read and write
sets of cache lines which have been brought into the L1
cache. If at any point during a logical core's execution
of a transaction another core modifies a cache line in
the read or write set then the transaction is aborted.
Nick Stanley, “Hardware Transactional Memory with Intel’s TSX”
Intel Transactional Synchronization
Extensions - Suppressing exceptions
• A transaction aborts when a hardware exception occurs during the
execution of the transaction. However, unlike normal situations where the
OS intervenes and handles these exceptions gracefully, TSX instead
invokes a user-specified abort handler, without informing the underlying
OS. More precisely, TSX treats these exceptions in a synchronous
manner—immediately executing an abort handler while suppressing the
exception itself. In other words, the exception inside the transaction will
not be communicated to the underlying OS. This allows us to engage in
abnormal behavior (e.g., attempting to access privileged, i.e., kernel,
memory regions) without worrying about crashing the program. In DrK,
we break KASLR by turning this surprising behavior into a timing channel
that leaks the status (e.g., mapped or unmapped) of all kernel pages.
Timing Side Channel Attack
• TSX instead invokes a user-specified
abort handler, without
informing the underlying OS
• In other words, from user space I can
learn kernel addresses despite their
randomization (!!!)
Yeongjin Jang et al., “Breaking Kernel Address Space Layout
Randomization with Intel TSX”
Reference
• Sarita V. Adve, Kourosh Gharachorloo, “Shared Memory Consistency
Models: A Tutorial”
• Luc Maranget et al., “A Tutorial Introduction to the ARM and POWER
Relaxed Memory Models”
• Shaked Flur et al., “Modelling the ARMv8 Architecture, Operationally:
Concurrency and ISA”
• https://www.youtube.com/watch?v=6QU37TwRO4w
• http://www.cl.cam.ac.uk/~sf502/popl16/help.html
• Jade Alglave et al., “The Semantics of Power and ARM Multiprocessor
Machine Code”
• Paul E. McKenney, “Memory Barriers: a Hardware View for Software
Hackers”
Reference
C/C++ 11 memory model
• https://www.youtube.com/watch?v=S-x-23lrRnc
• Reinoud Elhorst, “Lowering C11 Atomics for ARM in LLVM”
• Torvald Riegel, “Modern C/C++ concurrency”
• Mark Batty et al., “Mathematizing C++ concurrency”
LMAX
• https://github.com/LMAX-Exchange/disruptor
• https://martinfowler.com/articles/lmax.html
• http://mechanitis.blogspot.tw/2011/06/dissecting-disruptor-how-do-i-read-
from.html
RCU
• https://www.kernel.org/doc/Documentation/RCU/whatisRCU.txt
• https://lwn.net/Articles/262464/
• https://lwn.net/Articles/253651/
• https://lwn.net/Articles/264090/
Reference
Concurrent malloc(3)
• Jason Evans, “a scalable concurrent malloc implementation
for freebsd”
Concurrency security
• Sebastian Burckhardt et al., “A Randomized Scheduler with
Probabilistic Guarantees of Finding Bugs”
• Ralf-Philipp Weinmann et al., “Concurrency: A problem and
opportunity in the exploitation of memory corruptions”
• Yeongjin Jang et al., “Breaking Kernel Address Space Layout
Randomization with Intel TSX”
• Nick Stanley, “Hardware Transactional Memory with Intel’s
TSX” (includes recommended ways to write concurrent code on Intel)

More Related Content

What's hot

KVM tools and enterprise usage
KVM tools and enterprise usageKVM tools and enterprise usage
KVM tools and enterprise usage
vincentvdk
 
Sheepdog Status Report
Sheepdog Status ReportSheepdog Status Report
Sheepdog Status Report
Liu Yuan
 
Laying OpenStack Cinder Block Services
Laying OpenStack Cinder Block ServicesLaying OpenStack Cinder Block Services
Laying OpenStack Cinder Block Services
Kenneth Hui
 
RHEVM - Live Storage Migration
RHEVM - Live Storage MigrationRHEVM - Live Storage Migration
RHEVM - Live Storage Migration
Raz Tamir
 
Openstack HA
Openstack HAOpenstack HA
Openstack HA
Yong Luo
 
Building your own NSQL store
Building your own NSQL storeBuilding your own NSQL store
Building your own NSQL store
Edward Capriolo
 
OpenStack Cinder Overview - Havana Release
OpenStack Cinder Overview - Havana ReleaseOpenStack Cinder Overview - Havana Release
OpenStack Cinder Overview - Havana Release
Avishay Traeger
 
Cinder Live Migration and Replication - OpenStack Summit Austin
Cinder Live Migration and Replication - OpenStack Summit AustinCinder Live Migration and Replication - OpenStack Summit Austin
Cinder Live Migration and Replication - OpenStack Summit Austin
Ed Balduf
 
Libvirt/KVM Driver Update (Kilo)
Libvirt/KVM Driver Update (Kilo)Libvirt/KVM Driver Update (Kilo)
Libvirt/KVM Driver Update (Kilo)
Stephen Gordon
 
Kubernetes for HCL Connections Component Pack - Build or Buy?
Kubernetes for HCL Connections Component Pack - Build or Buy?Kubernetes for HCL Connections Component Pack - Build or Buy?
Kubernetes for HCL Connections Component Pack - Build or Buy?
Martin Schmidt
 
Virtualization Architecture & KVM
Virtualization Architecture & KVMVirtualization Architecture & KVM
Virtualization Architecture & KVM
Pradeep Kumar
 
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
OpenStack Korea Community
 
Monitor PowerKVM using Ganglia, Nagios
Monitor PowerKVM using Ganglia, NagiosMonitor PowerKVM using Ganglia, Nagios
Monitor PowerKVM using Ganglia, Nagios
Pradeep Kumar
 
Play With Android
Play With AndroidPlay With Android
Play With AndroidChamp Yen
 
Linux Integrity Mechanisms - Protecting Container Runtime as an example
Linux Integrity Mechanisms - Protecting Container Runtime as an exampleLinux Integrity Mechanisms - Protecting Container Runtime as an example
Linux Integrity Mechanisms - Protecting Container Runtime as an example
Clay (Chih-Hao) Chang
 
Deep Dive into Openstack Storage, Sean Cohen, Red Hat
Deep Dive into Openstack Storage, Sean Cohen, Red HatDeep Dive into Openstack Storage, Sean Cohen, Red Hat
Deep Dive into Openstack Storage, Sean Cohen, Red Hat
Cloud Native Day Tel Aviv
 
OSS Presentation VMWorld 2011 by Andy Bennett & Craig Morgan
OSS Presentation VMWorld 2011 by Andy Bennett & Craig MorganOSS Presentation VMWorld 2011 by Andy Bennett & Craig Morgan
OSS Presentation VMWorld 2011 by Andy Bennett & Craig MorganOpenStorageSummit
 
Cinder - status of replication
Cinder - status of replicationCinder - status of replication
Cinder - status of replication
Ed Balduf
 
Symmetric Crypto for DPDK - Declan Doherty
Symmetric Crypto for DPDK - Declan DohertySymmetric Crypto for DPDK - Declan Doherty
Symmetric Crypto for DPDK - Declan Doherty
harryvanhaaren
 
One-click Hadoop Cluster Deployment on OpenPOWER Systems
One-click Hadoop Cluster Deployment on OpenPOWER SystemsOne-click Hadoop Cluster Deployment on OpenPOWER Systems
One-click Hadoop Cluster Deployment on OpenPOWER Systems
Pradeep Kumar
 

What's hot (20)

KVM tools and enterprise usage
KVM tools and enterprise usageKVM tools and enterprise usage
KVM tools and enterprise usage
 
Sheepdog Status Report
Sheepdog Status ReportSheepdog Status Report
Sheepdog Status Report
 
Laying OpenStack Cinder Block Services
Laying OpenStack Cinder Block ServicesLaying OpenStack Cinder Block Services
Laying OpenStack Cinder Block Services
 
RHEVM - Live Storage Migration
RHEVM - Live Storage MigrationRHEVM - Live Storage Migration
RHEVM - Live Storage Migration
 
Openstack HA
Openstack HAOpenstack HA
Openstack HA
 
Building your own NSQL store
Building your own NSQL storeBuilding your own NSQL store
Building your own NSQL store
 
OpenStack Cinder Overview - Havana Release
OpenStack Cinder Overview - Havana ReleaseOpenStack Cinder Overview - Havana Release
OpenStack Cinder Overview - Havana Release
 
Cinder Live Migration and Replication - OpenStack Summit Austin
Cinder Live Migration and Replication - OpenStack Summit AustinCinder Live Migration and Replication - OpenStack Summit Austin
Cinder Live Migration and Replication - OpenStack Summit Austin
 
Libvirt/KVM Driver Update (Kilo)
Libvirt/KVM Driver Update (Kilo)Libvirt/KVM Driver Update (Kilo)
Libvirt/KVM Driver Update (Kilo)
 
Kubernetes for HCL Connections Component Pack - Build or Buy?
Kubernetes for HCL Connections Component Pack - Build or Buy?Kubernetes for HCL Connections Component Pack - Build or Buy?
Kubernetes for HCL Connections Component Pack - Build or Buy?
 
Virtualization Architecture & KVM
Virtualization Architecture & KVMVirtualization Architecture & KVM
Virtualization Architecture & KVM
 
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
 
Monitor PowerKVM using Ganglia, Nagios
Monitor PowerKVM using Ganglia, NagiosMonitor PowerKVM using Ganglia, Nagios
Monitor PowerKVM using Ganglia, Nagios
 
Play With Android
Play With AndroidPlay With Android
Play With Android
 
Linux Integrity Mechanisms - Protecting Container Runtime as an example
Linux Integrity Mechanisms - Protecting Container Runtime as an exampleLinux Integrity Mechanisms - Protecting Container Runtime as an example
Linux Integrity Mechanisms - Protecting Container Runtime as an example
 
Deep Dive into Openstack Storage, Sean Cohen, Red Hat
Deep Dive into Openstack Storage, Sean Cohen, Red HatDeep Dive into Openstack Storage, Sean Cohen, Red Hat
Deep Dive into Openstack Storage, Sean Cohen, Red Hat
 
OSS Presentation VMWorld 2011 by Andy Bennett & Craig Morgan
OSS Presentation VMWorld 2011 by Andy Bennett & Craig MorganOSS Presentation VMWorld 2011 by Andy Bennett & Craig Morgan
OSS Presentation VMWorld 2011 by Andy Bennett & Craig Morgan
 
Cinder - status of replication
Cinder - status of replicationCinder - status of replication
Cinder - status of replication
 
Symmetric Crypto for DPDK - Declan Doherty
Symmetric Crypto for DPDK - Declan DohertySymmetric Crypto for DPDK - Declan Doherty
Symmetric Crypto for DPDK - Declan Doherty
 
One-click Hadoop Cluster Deployment on OpenPOWER Systems
One-click Hadoop Cluster Deployment on OpenPOWER SystemsOne-click Hadoop Cluster Deployment on OpenPOWER Systems
One-click Hadoop Cluster Deployment on OpenPOWER Systems
 

Similar to [若渴計畫] Studying Concurrency

Memory model
Memory modelMemory model
Memory model
Yi-Hsiu Hsu
 
CPU Caches
CPU CachesCPU Caches
CPU Caches
shinolajla
 
Exploiting Modern Microarchitectures: Meltdown, Spectre, and other Attacks
Exploiting Modern Microarchitectures: Meltdown, Spectre, and other AttacksExploiting Modern Microarchitectures: Meltdown, Spectre, and other Attacks
Exploiting Modern Microarchitectures: Meltdown, Spectre, and other Attacks
inside-BigData.com
 
Cassandra and drivers
Cassandra and driversCassandra and drivers
Cassandra and drivers
Ben Bromhead
 
Cpu Cache and Memory Ordering——并发程序设计入门
Cpu Cache and Memory Ordering——并发程序设计入门Cpu Cache and Memory Ordering——并发程序设计入门
Cpu Cache and Memory Ordering——并发程序设计入门
frogd
 
Beneath the Linux Interrupt handling
Beneath the Linux Interrupt handlingBeneath the Linux Interrupt handling
Beneath the Linux Interrupt handling
Bhoomil Chavda
 
POWER ISA introduction and what’s new in ISA V3.1 (Overview)
POWER ISA introduction and what’s new in ISA V3.1 (Overview)POWER ISA introduction and what’s new in ISA V3.1 (Overview)
POWER ISA introduction and what’s new in ISA V3.1 (Overview)
Ganesan Narayanasamy
 
Windows Internals for Linux Kernel Developers
Windows Internals for Linux Kernel DevelopersWindows Internals for Linux Kernel Developers
Windows Internals for Linux Kernel Developers
Kernel TLV
 
Andy Parsons Pivotal June 2011
Andy Parsons Pivotal June 2011Andy Parsons Pivotal June 2011
Andy Parsons Pivotal June 2011Andy Parsons
 
Automating the Hunt for Non-Obvious Sources of Latency Spreads
Automating the Hunt for Non-Obvious Sources of Latency SpreadsAutomating the Hunt for Non-Obvious Sources of Latency Spreads
Automating the Hunt for Non-Obvious Sources of Latency Spreads
ScyllaDB
 
Client Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right WayClient Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right Way
DataStax Academy
 
Pune-Cocoa: Blocks and GCD
Pune-Cocoa: Blocks and GCDPune-Cocoa: Blocks and GCD
Pune-Cocoa: Blocks and GCD
Prashant Rane
 
Virtualization Basics
Virtualization BasicsVirtualization Basics
Virtualization Basics
SrikantMishra12
 
Solaris vs Linux
Solaris vs LinuxSolaris vs Linux
Solaris vs Linux
Grigale LTD
 
Large Scale Computing Infrastructure - Nautilus
Large Scale Computing Infrastructure - NautilusLarge Scale Computing Infrastructure - Nautilus
Large Scale Computing Infrastructure - Nautilus
Gabriele Di Bernardo
 
Application Profiling for Memory and Performance
Application Profiling for Memory and PerformanceApplication Profiling for Memory and Performance
Application Profiling for Memory and Performance
pradeepfn
 
F9: A Secure and Efficient Microkernel Built for Deeply Embedded Systems
F9: A Secure and Efficient Microkernel Built for Deeply Embedded SystemsF9: A Secure and Efficient Microkernel Built for Deeply Embedded Systems
F9: A Secure and Efficient Microkernel Built for Deeply Embedded Systems
National Cheng Kung University
 
LMAX Disruptor - High Performance Inter-Thread Messaging Library
LMAX Disruptor - High Performance Inter-Thread Messaging LibraryLMAX Disruptor - High Performance Inter-Thread Messaging Library
LMAX Disruptor - High Performance Inter-Thread Messaging Library
Sebastian Andrasoni
 
Using the big guns: Advanced OS performance tools for troubleshooting databas...
Using the big guns: Advanced OS performance tools for troubleshooting databas...Using the big guns: Advanced OS performance tools for troubleshooting databas...
Using the big guns: Advanced OS performance tools for troubleshooting databas...
Nikolay Savvinov
 
Synchronization linux
Synchronization linuxSynchronization linux
Synchronization linuxSusant Sahani
 

Similar to [若渴計畫] Studying Concurrency (20)

Memory model
Memory modelMemory model
Memory model
 
CPU Caches
CPU CachesCPU Caches
CPU Caches
 
Exploiting Modern Microarchitectures: Meltdown, Spectre, and other Attacks
Exploiting Modern Microarchitectures: Meltdown, Spectre, and other AttacksExploiting Modern Microarchitectures: Meltdown, Spectre, and other Attacks
Exploiting Modern Microarchitectures: Meltdown, Spectre, and other Attacks
 
Cassandra and drivers
Cassandra and driversCassandra and drivers
Cassandra and drivers
 
Cpu Cache and Memory Ordering——并发程序设计入门
Cpu Cache and Memory Ordering——并发程序设计入门Cpu Cache and Memory Ordering——并发程序设计入门
Cpu Cache and Memory Ordering——并发程序设计入门
 
Beneath the Linux Interrupt handling
Beneath the Linux Interrupt handlingBeneath the Linux Interrupt handling
Beneath the Linux Interrupt handling
 
POWER ISA introduction and what’s new in ISA V3.1 (Overview)
POWER ISA introduction and what’s new in ISA V3.1 (Overview)POWER ISA introduction and what’s new in ISA V3.1 (Overview)
POWER ISA introduction and what’s new in ISA V3.1 (Overview)
 
Windows Internals for Linux Kernel Developers
Windows Internals for Linux Kernel DevelopersWindows Internals for Linux Kernel Developers
Windows Internals for Linux Kernel Developers
 
Andy Parsons Pivotal June 2011
Andy Parsons Pivotal June 2011Andy Parsons Pivotal June 2011
Andy Parsons Pivotal June 2011
 
Automating the Hunt for Non-Obvious Sources of Latency Spreads
Automating the Hunt for Non-Obvious Sources of Latency SpreadsAutomating the Hunt for Non-Obvious Sources of Latency Spreads
Automating the Hunt for Non-Obvious Sources of Latency Spreads
 
Client Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right WayClient Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right Way
 
Pune-Cocoa: Blocks and GCD
Pune-Cocoa: Blocks and GCDPune-Cocoa: Blocks and GCD
Pune-Cocoa: Blocks and GCD
 
Virtualization Basics
Virtualization BasicsVirtualization Basics
Virtualization Basics
 
Solaris vs Linux
Solaris vs LinuxSolaris vs Linux
Solaris vs Linux
 
Large Scale Computing Infrastructure - Nautilus
Large Scale Computing Infrastructure - NautilusLarge Scale Computing Infrastructure - Nautilus
Large Scale Computing Infrastructure - Nautilus
 
Application Profiling for Memory and Performance
Application Profiling for Memory and PerformanceApplication Profiling for Memory and Performance
Application Profiling for Memory and Performance
 
F9: A Secure and Efficient Microkernel Built for Deeply Embedded Systems
F9: A Secure and Efficient Microkernel Built for Deeply Embedded SystemsF9: A Secure and Efficient Microkernel Built for Deeply Embedded Systems
F9: A Secure and Efficient Microkernel Built for Deeply Embedded Systems
 
LMAX Disruptor - High Performance Inter-Thread Messaging Library
LMAX Disruptor - High Performance Inter-Thread Messaging LibraryLMAX Disruptor - High Performance Inter-Thread Messaging Library
LMAX Disruptor - High Performance Inter-Thread Messaging Library
 
Using the big guns: Advanced OS performance tools for troubleshooting databas...
Using the big guns: Advanced OS performance tools for troubleshooting databas...Using the big guns: Advanced OS performance tools for troubleshooting databas...
Using the big guns: Advanced OS performance tools for troubleshooting databas...
 
Synchronization linux
Synchronization linuxSynchronization linux
Synchronization linux
 

More from Aj MaChInE

An Intro on Data-oriented Attacks
An Intro on Data-oriented AttacksAn Intro on Data-oriented Attacks
An Intro on Data-oriented Attacks
Aj MaChInE
 
A Study on .NET Framework for Red Team - Part I
A Study on .NET Framework for Red Team - Part IA Study on .NET Framework for Red Team - Part I
A Study on .NET Framework for Red Team - Part I
Aj MaChInE
 
A study on NetSpectre
A study on NetSpectreA study on NetSpectre
A study on NetSpectre
Aj MaChInE
 
Introduction to Adversary Evaluation Tools
Introduction to Adversary Evaluation ToolsIntroduction to Adversary Evaluation Tools
Introduction to Adversary Evaluation Tools
Aj MaChInE
 
[若渴] A preliminary study on attacks against consensus in bitcoin
[若渴] A preliminary study on attacks against consensus in bitcoin[若渴] A preliminary study on attacks against consensus in bitcoin
[若渴] A preliminary study on attacks against consensus in bitcoin
Aj MaChInE
 
[RAT資安小聚] Study on Automatically Evading Malware Detection
[RAT資安小聚] Study on Automatically Evading Malware Detection[RAT資安小聚] Study on Automatically Evading Malware Detection
[RAT資安小聚] Study on Automatically Evading Malware Detection
Aj MaChInE
 
[若渴] Preliminary Study on Design and Exploitation of Trustzone
[若渴] Preliminary Study on Design and Exploitation of Trustzone[若渴] Preliminary Study on Design and Exploitation of Trustzone
[若渴] Preliminary Study on Design and Exploitation of Trustzone
Aj MaChInE
 
[若渴]Study on Side Channel Attacks and Countermeasures
[若渴]Study on Side Channel Attacks and Countermeasures [若渴]Study on Side Channel Attacks and Countermeasures
[若渴]Study on Side Channel Attacks and Countermeasures
Aj MaChInE
 
[若渴計畫] Challenges and Solutions of Window Remote Shellcode
[若渴計畫] Challenges and Solutions of Window Remote Shellcode[若渴計畫] Challenges and Solutions of Window Remote Shellcode
[若渴計畫] Challenges and Solutions of Window Remote Shellcode
Aj MaChInE
 
[若渴計畫] Introduction: Formal Verification for Code
[若渴計畫] Introduction: Formal Verification for Code[若渴計畫] Introduction: Formal Verification for Code
[若渴計畫] Introduction: Formal Verification for Code
Aj MaChInE
 
[若渴計畫] Studying ASLR^cache
[若渴計畫] Studying ASLR^cache[若渴計畫] Studying ASLR^cache
[若渴計畫] Studying ASLR^cache
Aj MaChInE
 
[若渴計畫] Black Hat 2017之過去閱讀相關整理
[若渴計畫] Black Hat 2017之過去閱讀相關整理[若渴計畫] Black Hat 2017之過去閱讀相關整理
[若渴計畫] Black Hat 2017之過去閱讀相關整理
Aj MaChInE
 
閱讀文章分享@若渴 2016.1.24
閱讀文章分享@若渴 2016.1.24閱讀文章分享@若渴 2016.1.24
閱讀文章分享@若渴 2016.1.24
Aj MaChInE
 
[若渴計畫2015.8.18] SMACK
[若渴計畫2015.8.18] SMACK[若渴計畫2015.8.18] SMACK
[若渴計畫2015.8.18] SMACK
Aj MaChInE
 
[SITCON2015] 自己的異質多核心平台自己幹
[SITCON2015] 自己的異質多核心平台自己幹[SITCON2015] 自己的異質多核心平台自己幹
[SITCON2015] 自己的異質多核心平台自己幹
Aj MaChInE
 
[MOSUT20150131] Linux Runs on SoCKit Board with the GPGPU
[MOSUT20150131] Linux Runs on SoCKit Board with the GPGPU[MOSUT20150131] Linux Runs on SoCKit Board with the GPGPU
[MOSUT20150131] Linux Runs on SoCKit Board with the GPGPU
Aj MaChInE
 
[若渴計畫]由GPU硬體概念到coding CUDA
[若渴計畫]由GPU硬體概念到coding CUDA[若渴計畫]由GPU硬體概念到coding CUDA
[若渴計畫]由GPU硬體概念到coding CUDA
Aj MaChInE
 
[若渴計畫]64-bit Linux Return-Oriented Programming
[若渴計畫]64-bit Linux Return-Oriented Programming[若渴計畫]64-bit Linux Return-Oriented Programming
[若渴計畫]64-bit Linux Return-Oriented Programming
Aj MaChInE
 
[MOSUT] Format String Attacks
[MOSUT] Format String Attacks[MOSUT] Format String Attacks
[MOSUT] Format String Attacks
Aj MaChInE
 

More from Aj MaChInE (19)

An Intro on Data-oriented Attacks
An Intro on Data-oriented AttacksAn Intro on Data-oriented Attacks
An Intro on Data-oriented Attacks
 
A Study on .NET Framework for Red Team - Part I
A Study on .NET Framework for Red Team - Part IA Study on .NET Framework for Red Team - Part I
A Study on .NET Framework for Red Team - Part I
 
A study on NetSpectre
A study on NetSpectreA study on NetSpectre
A study on NetSpectre
 
Introduction to Adversary Evaluation Tools
Introduction to Adversary Evaluation ToolsIntroduction to Adversary Evaluation Tools
Introduction to Adversary Evaluation Tools
 
[若渴] A preliminary study on attacks against consensus in bitcoin
[若渴] A preliminary study on attacks against consensus in bitcoin[若渴] A preliminary study on attacks against consensus in bitcoin
[若渴] A preliminary study on attacks against consensus in bitcoin
 
[RAT資安小聚] Study on Automatically Evading Malware Detection
[RAT資安小聚] Study on Automatically Evading Malware Detection[RAT資安小聚] Study on Automatically Evading Malware Detection
[RAT資安小聚] Study on Automatically Evading Malware Detection
 
[若渴] Preliminary Study on Design and Exploitation of Trustzone
[若渴] Preliminary Study on Design and Exploitation of Trustzone[若渴] Preliminary Study on Design and Exploitation of Trustzone
[若渴] Preliminary Study on Design and Exploitation of Trustzone
 
[若渴]Study on Side Channel Attacks and Countermeasures
[若渴]Study on Side Channel Attacks and Countermeasures [若渴]Study on Side Channel Attacks and Countermeasures
[若渴]Study on Side Channel Attacks and Countermeasures
 
[若渴計畫] Challenges and Solutions of Window Remote Shellcode
[若渴計畫] Challenges and Solutions of Window Remote Shellcode[若渴計畫] Challenges and Solutions of Window Remote Shellcode
[若渴計畫] Challenges and Solutions of Window Remote Shellcode
 
[若渴計畫] Introduction: Formal Verification for Code
[若渴計畫] Introduction: Formal Verification for Code[若渴計畫] Introduction: Formal Verification for Code
[若渴計畫] Introduction: Formal Verification for Code
 
[若渴計畫] Studying ASLR^cache
[若渴計畫] Studying ASLR^cache[若渴計畫] Studying ASLR^cache
[若渴計畫] Studying ASLR^cache
 
[若渴計畫] Black Hat 2017之過去閱讀相關整理
[若渴計畫] Black Hat 2017之過去閱讀相關整理[若渴計畫] Black Hat 2017之過去閱讀相關整理
[若渴計畫] Black Hat 2017之過去閱讀相關整理
 
閱讀文章分享@若渴 2016.1.24
閱讀文章分享@若渴 2016.1.24閱讀文章分享@若渴 2016.1.24
閱讀文章分享@若渴 2016.1.24
 
[若渴計畫2015.8.18] SMACK
[若渴計畫2015.8.18] SMACK[若渴計畫2015.8.18] SMACK
[若渴計畫2015.8.18] SMACK
 
[SITCON2015] 自己的異質多核心平台自己幹
[SITCON2015] 自己的異質多核心平台自己幹[SITCON2015] 自己的異質多核心平台自己幹
[SITCON2015] 自己的異質多核心平台自己幹
 
[MOSUT20150131] Linux Runs on SoCKit Board with the GPGPU
[MOSUT20150131] Linux Runs on SoCKit Board with the GPGPU[MOSUT20150131] Linux Runs on SoCKit Board with the GPGPU
[MOSUT20150131] Linux Runs on SoCKit Board with the GPGPU
 
[若渴計畫]由GPU硬體概念到coding CUDA
[若渴計畫]由GPU硬體概念到coding CUDA[若渴計畫]由GPU硬體概念到coding CUDA
[若渴計畫]由GPU硬體概念到coding CUDA
 
[若渴計畫]64-bit Linux Return-Oriented Programming
[若渴計畫]64-bit Linux Return-Oriented Programming[若渴計畫]64-bit Linux Return-Oriented Programming
[若渴計畫]64-bit Linux Return-Oriented Programming
 
[MOSUT] Format String Attacks
[MOSUT] Format String Attacks[MOSUT] Format String Attacks
[MOSUT] Format String Attacks
 

[若渴計畫] Studying Concurrency

  • 3. Outline • Why is writing concurrent code hard? • Programmer-observable behavior • Examples of concurrency performance techniques • Examples of concurrency security
  • 4. Why Is Writing Concurrent Code Hard? • Hardware optimizations • Compiler optimizations • Both lead to behavior that is hard to predict
  • 5. Hardware Optimizations - Write Buffer • On a write, a processor simply inserts the write operation into the write buffer and proceeds without waiting for the write to complete • The goal is to hide the latency of write operations effectively • As a result, P1 and P2 can both be in their critical sections at the same time Sarita V. Adve, Kourosh Gharachorloo, “Shared Memory Consistency Models: A Tutorial”
  • 6. Hardware Optimizations - Overlapped Writes • Assume the Data and Head variables reside in different memory modules • The write to Head may be injected into the network before the write to Data has reached its memory module • Therefore, it is possible for another processor to observe the new value of Head and yet obtain the old value of Data • This amounts to a reordering of write operations (coalesced write) Sarita V. Adve, Kourosh Gharachorloo, “Shared Memory Consistency Models: A Tutorial”
  • 7. Hardware Optimizations - Non-blocking Reads • If P2 is allowed to issue its read operations in an overlapped fashion, the read of Data may arrive at its memory module before the write from P1, while the read of Head reaches its memory module after the write from P1 • As a result, P2 can observe the new value of Head together with the old value of Data (coalesced read) Sarita V. Adve, Kourosh Gharachorloo, “Shared Memory Consistency Models: A Tutorial”
  • 9. So What Can We Do? Ideally • Sequential Consistency (the order of operations on a single core = the order of operations across cores) – The result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program • There is no local reordering • Each write becomes visible to all threads Sarita V. Adve, Kourosh Gharachorloo, “Shared Memory Consistency Models: A Tutorial” Luc Maranget et al., “A Tutorial Introduction to the ARM and POWER Relaxed Memory Models”
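The write-buffer slide describes P1 and P2 both entering their critical sections; in litmus-test terms this is the "store buffering" shape. A minimal C++11 sketch of that shape follows, assuming the default `memory_order_seq_cst` on `std::atomic`. The function name `count_both_zero` and the variables `flag1`/`flag2` are hypothetical names chosen for illustration. Under sequentially consistent atomics the outcome `r1 == 0 && r2 == 0` is forbidden by the language, so the count must be zero.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs the store-buffering test n times and returns how often both
// threads read 0, i.e. how often both would enter the critical section.
int count_both_zero(int n) {
    assert(n >= 0);
    int both_zero = 0;
    for (int i = 0; i < n; ++i) {
        std::atomic<int> flag1{0}, flag2{0};
        int r1 = -1, r2 = -1;
        // Each thread raises its own flag, then checks the other one.
        // store()/load() default to memory_order_seq_cst.
        std::thread p1([&] { flag1.store(1); r1 = flag2.load(); });
        std::thread p2([&] { flag2.store(1); r2 = flag1.load(); });
        p1.join();
        p2.join();
        if (r1 == 0 && r2 == 0) ++both_zero;
    }
    return both_zero;
}
```

With `memory_order_relaxed` (or plain non-atomic variables, which would be a data race) the same program may observe both reads returning 0 on hardware with write buffers, including x86.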
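  • 10. In Practice, SC Is Not Guaranteed • Memory model (architecture): local ordering / multiple-copy atomic • Total store ordering (Intel x86): X / O • Relaxed memory model (ARM): X / X • (O = guaranteed, X = not guaranteed) • Developers must write code themselves to manage the ordering of memory operations Luc Maranget et al., “A Tutorial Introduction to the ARM and POWER Relaxed Memory Models”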
  • 11. With So Many Hardware Optimizations, How Do I Know How My Program Behaves (Programmer-observable Behavior)? • Mathematically rigorous architecture definitions – Luc Maranget et al., “A Tutorial Introduction to the ARM and POWER Relaxed Memory Models” • Hardware semantics – Shaked Flur et al., “Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA” • C/C++11 memory model • …?
  • 12. Mathematically Rigorous Architecture Definitions – For Example • Message Passing (MP) • Thread 0: x=1; y=1 • Thread 1: r1=y; r2=x • Outcome r1=1 ∧ r2=0: forbidden on x86-TSO, allowed on ARM • Partial-order propagation? Luc Maranget et al., “A Tutorial Introduction to the ARM and POWER Relaxed Memory Models”
  • 13. Does Partial-order Propagation Always Affect Program Behavior? Not Necessarily • MP test harness • m is the number of times that the final outcome r1=1 ∧ r2=0 was observed in n trials
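The MP test and its m-out-of-n count can be approximated in portable C++11. The sketch below is only an illustration, not the harness used in the cited tutorial: `count_mp_violations` is a hypothetical name, and starting a fresh thread pair per trial is far less aggressive than a real litmus tool, so a count of zero with relaxed ordering does not prove the outcome is forbidden. What the language does guarantee is that with a release store to `y` and an acquire load of `y`, the outcome r1=1 ∧ r2=0 cannot occur.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs the message-passing test n times and returns m, the number of
// trials that ended with r1 == 1 && r2 == 0.
// store_order must be relaxed, release, or seq_cst;
// load_order must be relaxed, acquire, or seq_cst.
int count_mp_violations(int n, std::memory_order store_order,
                        std::memory_order load_order) {
    assert(n >= 0);
    int m = 0;
    for (int i = 0; i < n; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = 0, r2 = 0;
        std::thread writer([&] {
            x.store(1, std::memory_order_relaxed);  // the data
            y.store(1, store_order);                // the flag
        });
        std::thread reader([&] {
            r1 = y.load(load_order);                // read the flag
            r2 = x.load(std::memory_order_relaxed); // then the data
        });
        writer.join();
        reader.join();
        if (r1 == 1 && r2 == 0) ++m;
    }
    return m;
}
```

Calling it with `std::memory_order_release` and `std::memory_order_acquire` must return 0 on every conforming platform; calling it with `std::memory_order_relaxed` for both may return a nonzero m on ARM or POWER.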
  • 14. Hardware Semantics Shaked Flur et al., “Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA”
  • 15. Web Site of Hardware Semantics http://www.cl.cam.ac.uk/~sf502/popl16/help.html
  • 16. Result of Hardware Semantics http://www.cl.cam.ac.uk/~sf502/popl16/help.html If a location is accessed concurrently (for example, because the locking was written incorrectly), the result output makes it possible to spot the problem early.
  • 17. C/C++11 Memory Model • At the language level, keywords are defined so that every hardware target must conform to the language memory model. – https://www.youtube.com/watch?v=S-x-23lrRnc • The video mentions that, to satisfy the C11 memory model on ARM, the compiler can end up emitting double barriers • Reinoud Elhorst, “Lowering C11 Atomics for ARM in LLVM” – Torvald Riegel, “Modern C/C++ concurrency” • Semantics – Mark Batty et al., “Mathematizing C++ concurrency”
  • 18. Mathematizing C++ Concurrency • Uses Isabelle/HOL to write the semantics of the C++ memory model • For example: the definition of a release sequence
  • 19. Examples of Concurrency Performance Techniques • LMAX • RCU • Concurrent malloc(3) • An Analysis of Linux Scalability to Many Cores
  • 20. LMAX: New Financial Trading Platform https://martinfowler.com/articles/lmax.html
  • 22. Read Copy Update (RCU) • Suited to read-mostly situations • Typical RCU splits an update into removal and reclamation phases (disrupt) – Removing and replacing references to data items can run concurrently with readers – Pointers to a data structure are removed so that subsequent readers cannot gain a reference to it – RCU provides implicit low-overhead communication between readers and reclaimers (synchronize_rcu()) https://www.kernel.org/doc/Documentation/RCU/whatisRCU.txt https://lwn.net/Articles/262464/
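The removal/reclamation split can be illustrated in user space. The sketch below is not kernel RCU and does not use `synchronize_rcu()`: it substitutes `std::shared_ptr` reference counting for grace periods, so readers pay for an atomic reference-count update that real RCU avoids. `Config` and `RcuLikeCell` are hypothetical names, and a single updater thread is assumed. The atomic free functions for `std::shared_ptr` exist since C++11 and are deprecated since C++20 in favor of `std::atomic<std::shared_ptr<T>>`.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Config {
    int version;
    std::string name;
};

class RcuLikeCell {
public:
    explicit RcuLikeCell(Config initial)
        : current_(std::make_shared<Config>(std::move(initial))) {}

    // Reader: take a snapshot. The snapshot stays valid even if an
    // updater publishes a newer version afterwards.
    std::shared_ptr<const Config> read() const {
        return std::atomic_load_explicit(&current_, std::memory_order_acquire);
    }

    // Updater (single updater assumed): copy, modify the copy, publish.
    // Publishing is the "removal" step: new readers can no longer reach
    // the old version. "Reclamation" happens when the last reader drops
    // its snapshot and the reference count reaches zero.
    void update(const std::string& new_name) {
        std::shared_ptr<const Config> old = read();
        std::shared_ptr<const Config> next =
            std::make_shared<Config>(Config{old->version + 1, new_name});
        std::atomic_store_explicit(&current_, next, std::memory_order_release);
    }

private:
    std::shared_ptr<const Config> current_;
};
```

A reader that called `read()` before `update()` keeps seeing the old version until it releases its snapshot, which mirrors how pre-existing RCU readers are allowed to finish before the old data is freed.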
  • 25. Concurrent malloc(3) • How to avoid false cache sharing – Modern multi-processor systems preserve a coherent view of memory on a per-cache-line basis • How to reduce lock contention Jason Evans, “A Scalable Concurrent malloc(3) Implementation for FreeBSD”
  • 26. jemalloc • Whereas phkmalloc was specially optimized to minimize the working set of pages, jemalloc must be more concerned with cache locality • jemalloc first tries to minimize memory usage, and tries to allocate contiguously (weaker security) • One way of fixing false sharing is to pad allocations, but padding is in direct opposition to the goal of packing objects as tightly as possible; it can cause severe internal fragmentation. jemalloc instead relies on multiple allocation arenas to reduce the problem • One of the main goals for this allocator was to reduce lock contention for multi-threaded applications. Earlier work replaced the single allocator lock with one lock per free list • The solution was to use multiple arenas for allocation, and assign threads to arenas via hashing of the thread identifiers Jason Evans, “A Scalable Concurrent malloc(3) Implementation for FreeBSD”
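The arena idea (several independent arenas, each with its own lock, and threads mapped to arenas by hashing their identifiers) can be sketched as follows. This is an illustration only and not jemalloc's code: `Arena`, `ArenaSet`, `index_for`, and `allocations_in` are hypothetical names, and each arena simply forwards to `std::malloc` and counts allocations instead of managing its own memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

struct Arena {
    std::mutex lock;              // one lock per arena, not one global lock
    std::size_t allocations = 0;  // stand-in for real per-arena state
};

class ArenaSet {
public:
    explicit ArenaSet(std::size_t arena_count) : arenas_(arena_count) {
        assert(arena_count > 0);
    }

    // Map a thread to an arena by hashing its thread identifier.
    std::size_t index_for(std::thread::id id) const {
        return std::hash<std::thread::id>{}(id) % arenas_.size();
    }

    // Threads that hash to different arenas never contend on a lock here.
    void* allocate(std::size_t bytes) {
        Arena& arena = arenas_[index_for(std::this_thread::get_id())];
        std::lock_guard<std::mutex> guard(arena.lock);
        ++arena.allocations;
        return std::malloc(bytes);
    }

    std::size_t allocations_in(std::size_t index) {
        std::lock_guard<std::mutex> guard(arenas_[index].lock);
        return arenas_[index].allocations;
    }

private:
    std::vector<Arena> arenas_;
};
```

Hashing gives no guarantee that threads spread evenly: two busy threads can land in the same arena and contend there, which is a known weakness of a pure hashing scheme.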
  • 27. Scalability Collapse Caused by Non-scalable Locks
  • 28. Linux Scalability to Many Cores - Per-core Mount Caches • Observation: mount table is rarely modified • Common case: cores access per-core tables • Modify mount table: invalidate per-core tables Silas Boyd-Wickizer et al., “An Analysis of Linux Scalability to Many Cores”
  • 29. Linux Scalability to Many Cores - Sloppy Counters • Because reading reference count is slow Silas Boyd-Wickizer et al., “An Analysis of Linux Scalability to Many Cores”
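The per-core idea behind sloppy counters can be sketched with a sharded counter: each core (here, each slot) updates its own cache line, and only a reader that wants the total has to touch all of them. This is a simplified illustration, not the kernel implementation from the paper, which also keeps a central counter and per-core spare references. `ShardedCounter`, `Slot`, `kSlots`, and the 64-byte cache-line size are assumptions made for the example.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

constexpr std::size_t kSlots = 8;       // assumed number of cores
constexpr std::size_t kCacheLine = 64;  // assumed cache-line size in bytes

// Each slot occupies its own cache line so that updates from different
// cores do not invalidate each other's lines (no false sharing).
struct alignas(kCacheLine) Slot {
    std::atomic<long> n{0};
};

class ShardedCounter {
public:
    // Cheap path: touch only the caller's slot.
    void add(std::size_t slot, long delta) {
        slots_[slot % kSlots].n.fetch_add(delta, std::memory_order_relaxed);
    }

    // Expensive path: visit every slot to obtain the total.
    long read() const {
        long total = 0;
        for (const Slot& s : slots_) {
            total += s.n.load(std::memory_order_relaxed);
        }
        return total;
    }

private:
    std::array<Slot, kSlots> slots_;
};
```

The trade-off is deliberate: increments and decrements become local, while an exact read costs one cache-line access per slot and may miss updates that are in flight.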
  • 30. Examples of Concurrency Security • Concurrency fuzzer – Sebastian Burckhardt et al., “A Randomized Scheduler with Probabilistic Guarantees of Finding Bugs” • Timing side channel attack – Yeongjin Jang et al., “Breaking Kernel Address Space Layout Randomization with Intel TSX”
  • 31. Concurrency Fuzzer - Randomized Scheduler • Randomized scheduler • Essentially, read/write reordering in hardware is not simulated • Finds violations (order/atomicity) Sebastian Burckhardt et al., “A Randomized Scheduler with Probabilistic Guarantees of Finding Bugs”
  • 32. This slide deck surveys several fuzzers: “Concurrency: A problem and opportunity in the exploitation of memory corruptions”
  • 33. Intel Transactional Synchronization Extensions • The assembly instruction xbegin can return various results that represent the hardware's suggestions for how to proceed and reasons for failure: success, a suggestion to retry, a potential cause for the abort • To use TSX effectively, it is imperative to understand its implementation and limitations. TSX is implemented using the cache coherence protocol, which x86 machines already implement. When a transaction begins, the processor starts tracking read and write sets of cache lines which have been brought into the L1 cache. If at any point during a logical core's execution of a transaction another core modifies a cache line in the read or write set, then the transaction is aborted. Nick Stanley, “Hardware Transactional Memory with Intel’s TSX”
  • 34. Intel Transactional Synchronization Extensions - Suppressing exceptions • A transaction aborts when a hardware exception occurs during the execution of the transaction. However, unlike normal situations where the OS intervenes and handles these exceptions gracefully, TSX instead invokes a user-specified abort handler, without informing the underlying OS. More precisely, TSX treats these exceptions in a synchronous manner, immediately executing an abort handler while suppressing the exception itself. In other words, the exception inside the transaction will not be communicated to the underlying OS. This allows us to engage in abnormal behavior (e.g., attempting to access privileged, i.e., kernel, memory regions) without worrying about crashing the program. In DrK, we break KASLR by turning this surprising behavior into a timing channel that leaks the status (e.g., mapped or unmapped) of all kernel pages.
  • 35. Timing Side Channel Attack • TSX instead invokes a user-specified abort handler, without informing the underlying OS • In other words, from user space I can learn the randomized kernel addresses (!!!) Yeongjin Jang et al., “Breaking Kernel Address Space Layout Randomization with Intel TSX”
  • 36. Reference • Sarita V. Adve, Kourosh Gharachorloo, “Shared Memory Consistency Models: A Tutorial” • Luc Maranget et al., “A Tutorial Introduction to the ARM and POWER Relaxed Memory Models” • Shaked Flur et al., “Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA” • https://www.youtube.com/watch?v=6QU37TwRO4w • http://www.cl.cam.ac.uk/~sf502/popl16/help.html • Jade Alglave et al., “The Semantics of Power and ARM Multiprocessor Machine Code” • Paul E. McKenney, “Memory Barriers: a Hardware View for Software Hackers”
  • 37. Reference C/C++11 memory model • https://www.youtube.com/watch?v=S-x-23lrRnc • Reinoud Elhorst, “Lowering C11 Atomics for ARM in LLVM” • Torvald Riegel, “Modern C/C++ concurrency” • Mark Batty et al., “Mathematizing C++ concurrency” LMAX • https://github.com/LMAX-Exchange/disruptor • https://martinfowler.com/articles/lmax.html • http://mechanitis.blogspot.tw/2011/06/dissecting-disruptor-how-do-i-read-from.html RCU • https://www.kernel.org/doc/Documentation/RCU/whatisRCU.txt • https://lwn.net/Articles/262464/ • https://lwn.net/Articles/253651/ • https://lwn.net/Articles/264090/
  • 38. Reference Concurrent malloc(3) • Jason Evans, “A Scalable Concurrent malloc(3) Implementation for FreeBSD” Concurrency security • Sebastian Burckhardt et al., “A Randomized Scheduler with Probabilistic Guarantees of Finding Bugs” • Ralf-Philipp Weinmann et al., “Concurrency: A problem and opportunity in the exploitation of memory corruptions” • Yeongjin Jang et al., “Breaking Kernel Address Space Layout Randomization with Intel TSX” • Nick Stanley, “Hardware Transactional Memory with Intel’s TSX” (includes recommended ways to write concurrent code on Intel)