This document discusses challenges in writing concurrent programs and provides examples of concurrency techniques. It explains that hardware and compiler optimizations can result in unexpected program behaviors. It then describes memory model definitions, performance patterns like LMAX and RCU, and security issues such as timing side-channel attacks using Intel TSX. The goal is to understand how to write correct and efficient concurrent code despite relaxed memory consistency models.
5. Hardware Optimizations - Write Buffer
• On a write, the processor simply inserts the write operation into its write buffer and proceeds without waiting for the write to complete
• This effectively hides the latency of write operations
• Consequence: in the classic Dekker-style example, each processor can read the other's flag before the buffered flag write becomes visible, so P1 and P2 can both end up inside the critical section
Sarita V. Adve, Kourosh Gharachorloo, "Shared Memory Consistency Models: A Tutorial"
6. Hardware Optimizations - Overlapped Writes
• Assume the Data and Head variables reside in different memory modules
• The write to Head may be injected into the network before the write to Data has reached its memory module
• It is therefore possible for another processor to observe the new value of Head and yet obtain the old value of Data
• In other words, write operations are reordered (coalesced write)
Sarita V. Adve, Kourosh Gharachorloo, "Shared Memory Consistency Models: A Tutorial"
7. Hardware Optimizations - Non-blocking Reads
• If P2 is allowed to issue its read operations in an overlapped fashion, the read of Data may arrive at its memory module before P1's write, while the read of Head reaches its memory module after P1's write
• P2 then observes the new value of Head but the old value of Data (missing the Data = 2000 update): reads are reordered (coalesced read)
Sarita V. Adve, Kourosh Gharachorloo, "Shared Memory Consistency Models: A Tutorial"
9. So what can we do? Ideally:
• Sequential Consistency (the order of operations on one core = the order observed across all cores)
– "The result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program"
• There is no local reordering
• Each write becomes visible to all threads at the same time
Sarita V. Adve, Kourosh Gharachorloo, "Shared Memory Consistency Models: A Tutorial"
Luc Maranget et al., "A Tutorial Introduction to the ARM and POWER Relaxed Memory Models"
10. In practice, SC is not guaranteed

Memory model         | Example   | Local ordering preserved | Multiple-copy atomic
Total store ordering | Intel x86 | No                       | Yes
Relaxed memory model | ARM       | No                       | No

Luc Maranget et al., "A Tutorial Introduction to the ARM and POWER Relaxed Memory Models"
Developers must therefore write code that manages the ordering of memory operations themselves
11. With so many hardware optimizations, how do I know how my program actually behaves (programmer-observable behavior)?
• Mathematically rigorous architecture definitions
– Luc Maranget et al., "A Tutorial Introduction to the ARM and POWER Relaxed Memory Models"
• Hardware semantics
– Shaked Flur et al., "Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA"
• C/C++11 memory model
• …?
12. Mathematically Rigorous Architecture Definitions – For Example
• Message Passing (MP)
  P1: x=1; y=1        P2: r1=y; r2=x
  Question: can the outcome r1=1 ∧ r2=0 be observed?
  x86-TSO: forbidden
  ARM: allowed (reasoning about it involves the partial-order propagation of writes)
Luc Maranget et al., "A Tutorial Introduction to the ARM and POWER Relaxed Memory Models"
22. Read Copy Update (RCU)
• Intended for read-mostly situations
• A typical RCU update splits into removal and reclamation phases, so that updates do not disrupt readers
– Removal (replacing references to data items) can run concurrently with readers
– Remove pointers to a data structure, so that subsequent readers cannot gain a reference to it
– Reclamation is deferred until pre-existing readers have finished; RCU provides implicit low-overhead communication between readers and reclaimers (synchronize_rcu())
https://www.kernel.org/doc/Documentation/RCU/whatisRCU.txt
https://lwn.net/Articles/262464/
25. Concurrent malloc(3)
• How to avoid false cache sharing
– Modern multi-processor systems preserve a coherent view of memory on a per-cache-line basis
• How to reduce lock contention
Jason Evans, "A Scalable Concurrent malloc(3) Implementation for FreeBSD"
26. jemalloc
• Where phkmalloc was specially optimized to minimize the working set of pages, jemalloc must be more concerned with cache locality
• jemalloc first tries to minimize memory usage, and tries to allocate contiguously (at the cost of weaker security)
• One way of reducing false cache sharing is to pad allocations, but padding is in direct opposition to the goal of packing objects as tightly as possible and can cause severe internal fragmentation; jemalloc instead relies on multiple allocation arenas to reduce the problem
• One of the main goals for this allocator was to reduce lock contention for multi-threaded applications: rather than using a single allocator lock, each free list had its own lock
• The eventual solution was to use multiple arenas for allocation, and to assign threads to arenas via hashing of the thread identifiers
Jason Evans, "A Scalable Concurrent malloc(3) Implementation for FreeBSD"
28. Linux Scalability to Many Cores - Per-core Mount Caches
• Observation: the mount table is rarely modified
• Common case: cores access per-core tables
• Modifying the mount table invalidates the per-core tables
Silas Boyd-Wickizer et al., "An Analysis of Linux Scalability to Many Cores"
29. Linux Scalability to Many Cores - Sloppy Counters
• Motivation: maintaining a shared reference count from many cores is slow, because the counter's cache line bounces between cores; sloppy counters keep per-core counts and only reconcile them when the true value is needed
Silas Boyd-Wickizer et al., "An Analysis of Linux Scalability to Many Cores"
30. Some Concurrency Security Examples
• Concurrency fuzzer
– Sebastian Burckhardt et al., "A Randomized Scheduler with Probabilistic Guarantees of Finding Bugs"
• Timing side-channel attack
– Yeongjin Jang et al., "Breaking Kernel Address Space Layout Randomization with Intel TSX"
31. Concurrency Fuzzer - Randomized Scheduler
• A randomized scheduler explores thread interleavings to find violations (ordering / atomicity)
• Note that read/write reordering performed by the hardware is essentially not modeled by this approach
Sebastian Burckhardt et al., "A Randomized Scheduler with Probabilistic Guarantees of Finding Bugs"
33. Intel Transactional Synchronization Extensions
• The assembly instruction xbegin can return various results that represent the hardware's suggestions for how to proceed and the reasons for failure: success, a suggestion to retry, or a potential cause of the abort
• To use TSX effectively it is imperative to understand its implementation and limitations. TSX is implemented on top of the cache coherence protocol, which x86 machines already implement. When a transaction begins, the processor starts tracking the read and write sets of cache lines which have been brought into the L1 cache. If, at any point during a logical core's execution of a transaction, another core modifies a cache line in the read or write set, the transaction is aborted.
Nick Stanley, "Hardware Transactional Memory with Intel's TSX"
34. Intel Transactional Synchronization Extensions - Suppressing Exceptions
• A transaction aborts when a hardware exception occurs during the execution of the transaction. However, unlike normal situations where the OS intervenes and handles these exceptions gracefully, TSX instead invokes a user-specified abort handler, without informing the underlying OS. More precisely, TSX treats these exceptions in a synchronous manner, immediately executing the abort handler while suppressing the exception itself. In other words, an exception inside the transaction is never communicated to the underlying OS. This allows us to engage in abnormal behavior (e.g., attempting to access privileged, i.e., kernel, memory regions) without worrying about crashing the program. In DrK, we break KASLR by turning this surprising behavior into a timing channel that leaks the status (e.g., mapped or unmapped) of all kernel pages.
35. Timing Side Channel Attack
• TSX invokes a user-specified abort handler without informing the underlying OS
• In other words, from user space we can probe the randomized kernel address layout (!!!)
Yeongjin Jang et al., "Breaking Kernel Address Space Layout Randomization with Intel TSX"
36. Reference
• Sarita V. Adve, Kourosh Gharachorloo, "Shared Memory Consistency Models: A Tutorial"
• Luc Maranget et al., "A Tutorial Introduction to the ARM and POWER Relaxed Memory Models"
• Shaked Flur et al., "Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA"
• https://www.youtube.com/watch?v=6QU37TwRO4w
• http://www.cl.cam.ac.uk/~sf502/popl16/help.html
• Jade Alglave et al., "The Semantics of Power and ARM Multiprocessor Machine Code"
• Paul E. McKenney, "Memory Barriers: a Hardware View for Software Hackers"
37. Reference
C/C++11 memory model
• https://www.youtube.com/watch?v=S-x-23lrRnc
• Reinoud Elhorst, "Lowering C11 Atomics for ARM in LLVM"
• Torvald Riegel, "Modern C/C++ concurrency"
• Mark Batty, "Mathematizing C++ concurrency"
LMAX
• https://github.com/LMAX-Exchange/disruptor
• https://martinfowler.com/articles/lmax.html
• http://mechanitis.blogspot.tw/2011/06/dissecting-disruptor-how-do-i-read-from.html
RCU
• https://www.kernel.org/doc/Documentation/RCU/whatisRCU.txt
• https://lwn.net/Articles/262464/
• https://lwn.net/Articles/253651/
• https://lwn.net/Articles/264090/
38. Reference
Concurrent malloc(3)
• Jason Evans, "A Scalable Concurrent malloc(3) Implementation for FreeBSD"
Concurrency security
• Sebastian Burckhardt et al., "A Randomized Scheduler with Probabilistic Guarantees of Finding Bugs"
• Ralf-Philipp Weinmann et al., "Concurrency: A Problem and Opportunity in the Exploitation of Memory Corruptions"
• Yeongjin Jang et al., "Breaking Kernel Address Space Layout Randomization with Intel TSX"
• Nick Stanley, "Hardware Transactional Memory with Intel's TSX" (includes recommended patterns for writing concurrent code with Intel TSX)