SlideShare a Scribd company logo
Linux Kernel Memory Model
SeongJae Park <sj38.park@gmail.com>
These slides were presented during
4th Korea Linux Kernel Community Meetup
(https://kernel-dev-ko.github.io/4th/)
This work by SeongJae Park is licensed under the
Creative Commons Attribution-ShareAlike 3.0 Unported
License. To view a copy of this license, visit
http://creativecommons.org/licenses/by-sa/3.0/.
I, SeongJae Park
● SeongJae Park <sj38.park@gmail.com>
● PhD candidate researcher at DCSLAB, SNU
● Part time linux kernel programmer at KOSSLAB
● Interested in memory management and parallel programming for OS kernels
It’s Multi-processor Era, Now
● Processor vendors changed their mind to increase number of cores instead of
clock speed more than a decade ago
○ Now, multi-core systems are prevalent
○ Even octa-core system in your pocket, maybe?
● As a result, the free lunch is over;
parallel programming is essential for high performance and scalability
http://www.gotw.ca/images/CPU.png
GIVE UP :’(
Writing Correct Parallel Program is Hard
● Nature of parallelism is counter-intuitive to human
○ Time is relative, order is ambiguous, and only observations exist
● Compilers and processors do reorderings for performance
● C language developed with Uni-Processor assumption
○ Standard for the kernel (C99) is oblivious of multi-processor systems
CPU 0 CPU 1
X = 1;
Y = 1;
X = 2;
Y = 2;
if (Y == 2 && X == 1)
OOPS();
CPU 1 can call OOPS() in Linux Kernel
Hardware Reorderings
Why and How?
Instruction Reordering Helps Performance
● By reordering dependent instructions far away, total time consumptions can
be reduced
● If the reordering is guaranteed to not change the result of the instruction
sequence, it would be helpful for better performance
● Such reordering is legal and recommended
fetch decode execute
fetch decode execute
fetch decode execute
Instruction 1
Instruction 3
Instruction 2
instruction 2 depends on result of instruction 1
(e.g., first instruction modifies opcode of next instruction)
By reordering instruction 2 and 3, total execution time can be shorten
6 cycles for 3 instructions: 2 cycles per instruction
Time and Order is Relative
● Each CPU concurrently generates their events and
observes effects of events from others
● It is impossible to define absolute order of two concurrent events;
Only relative observation order is possible
CPU 1 CPU 2 CPU 3 CPU 4
Generated
event 1
Generated
event 2
Observed
event 1
followed by
event 2
I observed
event 2
followed by
event 1
Event Bus
Cache Coherency is Per-CacheLine
● It is well known that cache coherency protocol helps system memory
consistency
● In actual, it guarantees only per-cacheline sequential consistency
● Every CPU will eventually agree about global order of each cacheline,
but some CPU can aware the change faster than others,
order of changes between different variables is not guaranteed
CPU 1 CPU 2 CPU 3 CPU 4
A = 1; (1)
B = 1; (2)
A = 2; (3)
B = 2; (4)
The order is:
(1), (2), (3), (4)
Coherent cache protocol
The order is:
(1), (3), (2), (4)
The order is:
(2), (1), (4), (3)
Every CPU is honest;
The system is legally
cache-coherent
An Example with A Mythological Architecture
● Many systems equip hierarchical memory for better performance and space
● Propagation speed of an event to a given core can be influenced by specific
sub-layer of memory
If CPU 0 Message Queue is busy, CPU 2 can observe an event from
CPU 0 (event A) after an event of CPU 1 (event B)
though CPU 1 observed event A before generating event B
CPU 0 CPU 1
Cache
CPU 0
Message
Queue
CPU 1
Message
Queue
Memory
CPU 2 CPU 3
Cache
CPU 2
Message
Queue
CPU 3
Message
Queue
Bus
An Example with A Mythological Architecture
● Many systems equip hierarchical memory for better performance and space
● Propagation speed of an event to a given core can be influenced by specific
sub-layer of memory
If CPU 0 Message Queue is busy, CPU 2 can observe an event from
CPU 0 (event A) after an event of CPU 1 (event B)
though CPU 1 observed event A before generating event B
CPU 0 CPU 1
Cache
CPU 0
Message
Queue
CPU 1
Message
Queue
Memory
CPU 2 CPU 3
Cache
CPU 2
Message
Queue
CPU 3
Message
Queue
Bus
Generate
Event A;
Event A
An Example with A Mythological Architecture
● Many systems equip hierarchical memory for better performance and space
● Propagation speed of an event to a given core can be influenced by specific
sub-layer of memory
If CPU 0 Message Queue is busy, CPU 2 can observe an event from
CPU 0 (event A) after an event of CPU 1 (event B)
though CPU 1 observed event A before generating event B
CPU 0 CPU 1
Cache
CPU 0
Message
Queue
CPU 1
Message
Queue
Memory
CPU 2 CPU 3
Cache
CPU 2
Message
Queue
CPU 3
Message
Queue
Bus
Generate
Event A;
Seen Event A;
Generate
Event B;
Event BEvent A
An Example with A Mythological Architecture
● Many systems equip hierarchical memory for better performance and space
● Propagation speed of an event to a given core can be influenced by specific
sub-layer of memory
If CPU 0 Message Queue is busy, CPU 2 can observe an event from
CPU 0 (event A) after an event of CPU 1 (event B)
though CPU 1 observed event A before generating event B
CPU 0 CPU 1
Cache
CPU 0
Message
Queue
CPU 1
Message
Queue
Memory
CPU 2 CPU 3
Cache
CPU 2
Message
Queue
CPU 3
Message
Queue
Bus
Generate
Event A;
Seen Event A;
Generate
Event B;
Seen Event B;
Event A
Event B
Busy… ;;;
An Example with A Mythological Architecture
● Many systems equip hierarchical memory for better performance and space
● Propagation speed of an event to a given core can be influenced by specific
sub-layer of memory
If CPU 0 Message Queue is busy, CPU 2 can observe an event from
CPU 0 (event A) after an event of CPU 1 (event B)
though CPU 1 observed event A before generating event B
CPU 0 CPU 1
Cache
CPU 0
Message
Queue
CPU 1
Message
Queue
Memory
CPU 2 CPU 3
Cache
CPU 2
Message
Queue
CPU 3
Message
Queue
Bus
Generate
Event A;
Seen Event A;
Generate
Event B;
Seen Event B;
Seen Event A;
Event B
Event A
Yet Another Mythological Architecture
● Store Buffer and Invalidation Queue can reorder observation of stores and
loads
CPU 0
Cache
Store
Buffer
Invalidation
Queue
Memory
CPU 1
Cache
Store
Buffer
Invalidation
Queue
Bus
Compiler Reorderings
C-language Doesn’t Know Multi-Processor
● By the time of initial C-language development, multi-processor was rare
● As a result, C-language has only few guarantees about memory operations
on multi-processor
● Undefined behavior is allowed for undefined case
https://upload.wikimedia.org/wikipedia/commons/thumb/9/95/The_C_Programming
_Language,_First_Edition_Cover_(2).svg/2000px-The_C_Programming_Languag
e,_First_Edition_Cover_(2).svg.png
Compiler Optimizes (Reorders) Code
● Clever compilers try hard (really hard) to optimize code for high IPC
(not for human perspective goals)
○ Converts small, private functions to inline code
○ Reorder memory access code to minimize dependency
○ Simplify unnecessarily complex loops, ...
● Optimization uses term `Undefined behavior` as they want
○ It’s legal, but sometimes do insane things in programmer’s perspective
● Compilers reorder memory accesses based on the C-standard, which is
oblivious of multi-processor systems; this can result in unintended programs
● C11 standard aware the multi-processor systems, though
Synchronization Primitives
Synchronization Primitives
● Because reordering and asynchronous effect propagation is legal,
synchronization primitives are necessary to write human intuitive program
● Most environments including the Linux kernel provide sufficiently enough
synchronization primitives for human intuitive programs
https://s-media-cache-ak0.pinimg.com/236x/42/bc/55/42bc55a6d7e5affe2d0dbe9c872a3df9.jpg
Atomic Operations
● Atomic operations are configured with multiple sub operations
○ E.g., compare1)
-and-exchange2)
, fetch1)
-and-add2)
, and test1)
-and-set2)
● Atomic operations have mutual exclusiveness
○ Middle state of atomic operation execution cannot be seen by others
○ It can be thought of as small critical section that protected by a global lock
● Linux also provides plenty of atomic operations
http://www.scienceclarified.com/photos/atomic-mass-3024.jpg
Memory Barriers
● In general, memory barriers guarantee
effects of memory operations issued before it
to be propagated to other components in the system (e.g., processor)
before memory operations issued after the barrier
● Linux provides various types of memory barriers including compiler memory
barriers, hardware memory barriers, load barriers, store barriers
CPU 1 CPU 2 CPU 3
READ A;
WRITE B;
<BARRIER>;
READ C;
READ A,
WRITE B,
then READ C
occurred
WRITE B,
READ A,
then READ C
occurred
READ A and WRITE B can be reordered but READ C is guaranteed to
be ordered after {READ A, WRITE B}
Partial Ordering Primitives
● Partial ordering primitives include read / write memory barriers, acquire /
release semantics and data / address / control dependencies
● Semantics of partial ordering primitives are somewhat confusing
● Cost of partial ordering can be much cheaper than fully ordering primitives
● Linux also provides partial ordering primitives
https://www.ternvets.co.uk/wp-content/uploads/2017/02/cat-with-cone.jpg
Cost of Synchronization Primitives
● Generally, synchronization primitives are expensive, unscalable
● Each primitive costs differently; the gap is significant
● Correct selection and efficient use of synchronization primitives are important
for the performance and scalability
Performance of various synchronization primitives
LKMM: Linux Kernel
Memory Model
Each Environment Provides Own Memory Model
● Memory model, a.k.a Memory consistency model
● Defines what values can be obtained by load instructions of given code
● Programming environments including Instruction Set Architectures,
Programming languages, Operating systems, etc define their memory model
○ Modern language memory models (e.g., Golang, Rust, Java, C11, …) aware multi-processor,
while C99, which is for the Linux kernel, doesn’t
http://www.sciencemag.org/sites/default/files/styles/article_main_large/public/Memory.jpg?itok=4FmHo7M5
Each ISA Provides Specific Memory Model
● Some architectures have stricter ordering enforcement rule than others
● PA-RISC CPUs are strictest, Alpha is weakest
● Because Linux kernel supports multiple architectures, it defines its memory
model based on weakest one, the Alpha
https://kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.2015.01.31a.pdf
Linux Kernel Memory Model (LKMM)
● The memory model for Linux kernel programming environment
○ Defines what values can be obtained, given a piece of Linux kernel code, for specific load
instructions in the code
● Linux kernel original memory model is necessary because
○ It uses C99 but C99 memory model is oblivious of multi-processor environment;
C11 memory model aware of multi-processor, but Torvalds doesn’t want it
○ It supports multiple architectures with different ISA-level memory model
https://github.com/sjp38/perfbook
LKMM Providing Ordering Primitives
● Designed for weakest memory model architecture, Alpha
○ Almost every combination of reordering is possible, doesn’t provide address dependency
● Atomic instructions
○ atomix_xchg(), atomic_inc_return(), atomic_dec_return(), …
○ Most of those just use mapping instruction, but provide general guarantees at Kernel level
● Memory barriers
○ Compiler barriers: WRITE_ONCE(), READ_ONCE(), barrier(), ...
○ CPU barriers: mb(), wmb(), rmb(), smp_mb(), smp_wmb(), smp_rmb(), …
○ Semantic barriers: ACQUIRE operations, RELEASE operations, dependencies, …
○ For detail, refer to https://www.kernel.org/doc/Documentation/memory-barriers.txt
● Because different memory ordering primitive has different cost, only
necessary ordering primitives should be used in necessary case for high
performance and scalability
Informal LKMM
● Originally, LKMM was just an informal text, ‘memory-barriers.txt’
● It explains about the Linux Kernel Memory Model in English
(There is Korean translation, too: https://www.kernel.org/doc/Documentation/translations/ko_KR/memory-barriers.txt)
● To use the LKMM to prove your code, you should use Feynman Algorithm
○ Write down your code
○ Think real hard with the ‘memory-barriers.txt’
○ Write down your provement
○ Hard and unreliable, of course!
https://www.kernel.org/doc/Documentation/memory-barriers.txt
Formal and Executable LKMM: Help Arrives at Last
● Jade Alglave, Paul E. McKenney, and some guys made formal LKMM
● It is formal, executable memory model
○ It receives C-like simple code as input
○ The code containing parallel code snippets and a question: can this result happen?
● Provides model for each ordering primitive of LKMM based on concepts
including orders and cycles
● Can be executed with herd7 and klitmus7
○ LKMM extends herd7 and klitmus7 to support LKMM ordering primitives in code
○ herd7 based execution does proof in user mode
○ klitmus7 based execution generates test running kernel module and
wrapper scripts for execution of the module
Steps for LKMM Execution
● Installation
○ LKMM is merged in Linux source tree at tools/memory-model;
Just get the linux source code
○ Install herdtools7 (https://github.com/herd/herdtools7)
● Usage
○ Using herd7 based user mode proof
$ herd7 -conf linux-kernel.cfg <your-litmus-test file>
○ Using klitmus7 based real kernel mode execution
$ mkdir mymodules
$ klitmus7 -o mymodules <your-litmus-test file>
$ cd mymodules ; make
$ sudo sh run.sh
● That’s it! Now you can prove your parallel code for all Linux environments!
Summary
● Nature of parallel programming is counter-intuitive
○ Cannot define order of events without interaction
○ Ordering rule is different for different environment
○ Memory model defines their ordering rule
● For human-intuitive and correct program, interaction is necessary
○ Almost every environment provides memory ordering primitives including atomic instructions
and memory barriers, which is expensive in common
○ Memory model defines what result can occur and cannot with given code snippet
● Formal Linux kernel memory model is available
○ Linux kernel provides its memory model based on weakest memory ordering rule architecture
it supports, the Alpha, C99, and its original ordering primitives including RCU
○ Formal and executable LKMM is in the kernel tree; now you can prove your parallel code!
Thanks
Demonstration
`herdtools` Install
● LKMM (in Linux v4.19) depends on herdtools version 7.49
○ Depending version may change
● `herdtools` depends on OPAM, the package manager for OCaml
● Ubuntu package manager supports OPAM
$ sudo apt install opam
$ opam init && sudo opam update && sudo opam upgrade
$ git clone https://github.com/herd/herdtools7 && cd herdtools7
$ git checkout 7.49
$ make all && make install
$ herd7 -version
7.49, Rev: 93dcbdd89086d5f3e981b280d437309fdeb8b427
Herd7 Based Userspace Litmus Tests Execution
$ herd7 -conf linux-kernel.cfg 
litmus-tests/SB+fencembonceonces.litmus
Test SB+fencembonceonces Allowed
States 3
0:r0=0; 1:r0=1;
0:r0=1; 1:r0=0;
0:r0=1; 1:r0=1;
No
Witnesses
Positive: 0 Negative: 3
Condition exists (0:r0=0 / 1:r0=0)
Observation SB+fencembonceonces Never 0 3
Time SB+fencembonceonces 0.01
Hash=d66d99523e2cac6b06e66f4c995ebb48
Klitmus7 Based Litmus Tests Execution
$ mkdir my; klitmus7 -o my/ 
litmus-tests/SB+fencembonceonces.litmus
$ cd klitmus_test; make; sudo sh ./run.sh
...
Test SB+fencembonceonces Allowed
Histogram (3 states)
16580117:>0:r0=1; 1:r0=0;
16402936:>0:r0=0; 1:r0=1;
3016947 :>0:r0=1; 1:r0=1;
...
Positive: 0, Negative: 36000000
Condition exists (0:r0=0 / 1:r0=0) is NOT validated
Hash=d66d99523e2cac6b06e66f4c995ebb48
Observation SB+fencembonceonces Never 0 36000000
...

More Related Content

What's hot

The TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux KernelThe TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux Kernel
Divye Kapoor
 
Linux Internals - Part I
Linux Internals - Part ILinux Internals - Part I
Understanding of linux kernel memory model
Understanding of linux kernel memory modelUnderstanding of linux kernel memory model
Understanding of linux kernel memory model
SeongJae Park
 
OSA Con 2022 - Arrow in Flight_ New Developments in Data Connectivity - David...
OSA Con 2022 - Arrow in Flight_ New Developments in Data Connectivity - David...OSA Con 2022 - Arrow in Flight_ New Developments in Data Connectivity - David...
OSA Con 2022 - Arrow in Flight_ New Developments in Data Connectivity - David...
Altinity Ltd
 
Page cache in Linux kernel
Page cache in Linux kernelPage cache in Linux kernel
Page cache in Linux kernel
Adrian Huang
 
Capturing NIC and Kernel TX and RX Timestamps for Packets in Go
Capturing NIC and Kernel TX and RX Timestamps for Packets in GoCapturing NIC and Kernel TX and RX Timestamps for Packets in Go
Capturing NIC and Kernel TX and RX Timestamps for Packets in Go
ScyllaDB
 
Pcie drivers basics
Pcie drivers basicsPcie drivers basics
Pcie drivers basics
Venkatesh Malla
 
Linux Internals - Part II
Linux Internals - Part IILinux Internals - Part II
Linux Internals - Part II
Emertxe Information Technologies Pvt Ltd
 
OS Process Synchronization, semaphore and Monitors
OS Process Synchronization, semaphore and MonitorsOS Process Synchronization, semaphore and Monitors
OS Process Synchronization, semaphore and Monitors
sgpraju
 
Introduction to System Calls
Introduction to System CallsIntroduction to System Calls
Introduction to System Calls
Vandana Salve
 
Kernel Recipes 2015: Linux Kernel IO subsystem - How it works and how can I s...
Kernel Recipes 2015: Linux Kernel IO subsystem - How it works and how can I s...Kernel Recipes 2015: Linux Kernel IO subsystem - How it works and how can I s...
Kernel Recipes 2015: Linux Kernel IO subsystem - How it works and how can I s...
Anne Nicolas
 
16 control unit
16 control unit16 control unit
16 control unit
dilip kumar
 
Linux Programming
Linux ProgrammingLinux Programming
U Boot or Universal Bootloader
U Boot or Universal BootloaderU Boot or Universal Bootloader
U Boot or Universal Bootloader
Satpal Parmar
 
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtKernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Anne Nicolas
 
Linux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF SuperpowersLinux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF Superpowers
Brendan Gregg
 
Interrupts
InterruptsInterrupts
Interrupts
Anil Kumar Pugalia
 
Linux Kernel Booting Process (2) - For NLKB
Linux Kernel Booting Process (2) - For NLKBLinux Kernel Booting Process (2) - For NLKB
Linux Kernel Booting Process (2) - For NLKB
shimosawa
 
eBPF Workshop
eBPF WorkshopeBPF Workshop
eBPF Workshop
Michael Kehoe
 
CPU Scheduling algorithms
CPU Scheduling algorithmsCPU Scheduling algorithms
CPU Scheduling algorithms
Shanu Kumar
 

What's hot (20)

The TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux KernelThe TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux Kernel
 
Linux Internals - Part I
Linux Internals - Part ILinux Internals - Part I
Linux Internals - Part I
 
Understanding of linux kernel memory model
Understanding of linux kernel memory modelUnderstanding of linux kernel memory model
Understanding of linux kernel memory model
 
OSA Con 2022 - Arrow in Flight_ New Developments in Data Connectivity - David...
OSA Con 2022 - Arrow in Flight_ New Developments in Data Connectivity - David...OSA Con 2022 - Arrow in Flight_ New Developments in Data Connectivity - David...
OSA Con 2022 - Arrow in Flight_ New Developments in Data Connectivity - David...
 
Page cache in Linux kernel
Page cache in Linux kernelPage cache in Linux kernel
Page cache in Linux kernel
 
Capturing NIC and Kernel TX and RX Timestamps for Packets in Go
Capturing NIC and Kernel TX and RX Timestamps for Packets in GoCapturing NIC and Kernel TX and RX Timestamps for Packets in Go
Capturing NIC and Kernel TX and RX Timestamps for Packets in Go
 
Pcie drivers basics
Pcie drivers basicsPcie drivers basics
Pcie drivers basics
 
Linux Internals - Part II
Linux Internals - Part IILinux Internals - Part II
Linux Internals - Part II
 
OS Process Synchronization, semaphore and Monitors
OS Process Synchronization, semaphore and MonitorsOS Process Synchronization, semaphore and Monitors
OS Process Synchronization, semaphore and Monitors
 
Introduction to System Calls
Introduction to System CallsIntroduction to System Calls
Introduction to System Calls
 
Kernel Recipes 2015: Linux Kernel IO subsystem - How it works and how can I s...
Kernel Recipes 2015: Linux Kernel IO subsystem - How it works and how can I s...Kernel Recipes 2015: Linux Kernel IO subsystem - How it works and how can I s...
Kernel Recipes 2015: Linux Kernel IO subsystem - How it works and how can I s...
 
16 control unit
16 control unit16 control unit
16 control unit
 
Linux Programming
Linux ProgrammingLinux Programming
Linux Programming
 
U Boot or Universal Bootloader
U Boot or Universal BootloaderU Boot or Universal Bootloader
U Boot or Universal Bootloader
 
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtKernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
 
Linux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF SuperpowersLinux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF Superpowers
 
Interrupts
InterruptsInterrupts
Interrupts
 
Linux Kernel Booting Process (2) - For NLKB
Linux Kernel Booting Process (2) - For NLKBLinux Kernel Booting Process (2) - For NLKB
Linux Kernel Booting Process (2) - For NLKB
 
eBPF Workshop
eBPF WorkshopeBPF Workshop
eBPF Workshop
 
CPU Scheduling algorithms
CPU Scheduling algorithmsCPU Scheduling algorithms
CPU Scheduling algorithms
 

Similar to Linux Kernel Memory Model

An Introduction to the Formalised Memory Model for Linux Kernel
An Introduction to the Formalised Memory Model for Linux KernelAn Introduction to the Formalised Memory Model for Linux Kernel
An Introduction to the Formalised Memory Model for Linux Kernel
SeongJae Park
 
Containers > VMs
Containers > VMsContainers > VMs
Containers > VMs
David Timothy Strauss
 
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kevin Lynch
 
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
Andrey Vagin
 
Андрей Вагин. Все что вы хотели знать о Criu, но стеснялись спросить...
Андрей Вагин. Все что вы хотели знать о Criu, но стеснялись спросить...Андрей Вагин. Все что вы хотели знать о Criu, но стеснялись спросить...
Андрей Вагин. Все что вы хотели знать о Criu, но стеснялись спросить...
WG_ Events
 
Checkpoint and Restore In Userspace
Checkpoint and Restore In UserspaceCheckpoint and Restore In Userspace
Checkpoint and Restore In Userspace
OpenVZ
 
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
OpenVZ
 
Fast boot
Fast bootFast boot
Fast boot
SZ Lin
 
Introduction to OS LEVEL Virtualization & Containers
Introduction to OS LEVEL Virtualization & ContainersIntroduction to OS LEVEL Virtualization & Containers
Introduction to OS LEVEL Virtualization & Containers
Vaibhav Sharma
 
Exploring Docker Security
Exploring Docker SecurityExploring Docker Security
Exploring Docker Security
Patrick Kleindienst
 
Docker 原理與實作
Docker 原理與實作Docker 原理與實作
Docker 原理與實作
kao kuo-tung
 
Securing Applications and Pipelines on a Container Platform
Securing Applications and Pipelines on a Container PlatformSecuring Applications and Pipelines on a Container Platform
Securing Applications and Pipelines on a Container Platform
All Things Open
 
Optimizing Linux Servers
Optimizing Linux ServersOptimizing Linux Servers
Optimizing Linux Servers
Davor Guttierrez
 
Talk 160920 @ Cat System Workshop
Talk 160920 @ Cat System WorkshopTalk 160920 @ Cat System Workshop
Talk 160920 @ Cat System Workshop
Quey-Liang Kao
 
Chorus - Distributed Operating System [ case study ]
Chorus - Distributed Operating System [ case study ]Chorus - Distributed Operating System [ case study ]
Chorus - Distributed Operating System [ case study ]
Akhil Nadh PC
 
Realtime
RealtimeRealtime
Realtime
Mark Veltzer
 
Cache coherence ppt
Cache coherence pptCache coherence ppt
Cache coherence ppt
ArendraSingh2
 
LXC on Ganeti
LXC on GanetiLXC on Ganeti
LXC on Ganeti
kawamuray
 
General Purpose GPU Computing
General Purpose GPU ComputingGeneral Purpose GPU Computing
General Purpose GPU Computing
GlobalLogic Ukraine
 
Securing Applications and Pipelines on a Container Platform
Securing Applications and Pipelines on a Container PlatformSecuring Applications and Pipelines on a Container Platform
Securing Applications and Pipelines on a Container Platform
All Things Open
 

Similar to Linux Kernel Memory Model (20)

An Introduction to the Formalised Memory Model for Linux Kernel
An Introduction to the Formalised Memory Model for Linux KernelAn Introduction to the Formalised Memory Model for Linux Kernel
An Introduction to the Formalised Memory Model for Linux Kernel
 
Containers > VMs
Containers > VMsContainers > VMs
Containers > VMs
 
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
 
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
 
Андрей Вагин. Все что вы хотели знать о Criu, но стеснялись спросить...
Андрей Вагин. Все что вы хотели знать о Criu, но стеснялись спросить...Андрей Вагин. Все что вы хотели знать о Criu, но стеснялись спросить...
Андрей Вагин. Все что вы хотели знать о Criu, но стеснялись спросить...
 
Checkpoint and Restore In Userspace
Checkpoint and Restore In UserspaceCheckpoint and Restore In Userspace
Checkpoint and Restore In Userspace
 
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
 
Fast boot
Fast bootFast boot
Fast boot
 
Introduction to OS LEVEL Virtualization & Containers
Introduction to OS LEVEL Virtualization & ContainersIntroduction to OS LEVEL Virtualization & Containers
Introduction to OS LEVEL Virtualization & Containers
 
Exploring Docker Security
Exploring Docker SecurityExploring Docker Security
Exploring Docker Security
 
Docker 原理與實作
Docker 原理與實作Docker 原理與實作
Docker 原理與實作
 
Securing Applications and Pipelines on a Container Platform
Securing Applications and Pipelines on a Container PlatformSecuring Applications and Pipelines on a Container Platform
Securing Applications and Pipelines on a Container Platform
 
Optimizing Linux Servers
Optimizing Linux ServersOptimizing Linux Servers
Optimizing Linux Servers
 
Talk 160920 @ Cat System Workshop
Talk 160920 @ Cat System WorkshopTalk 160920 @ Cat System Workshop
Talk 160920 @ Cat System Workshop
 
Chorus - Distributed Operating System [ case study ]
Chorus - Distributed Operating System [ case study ]Chorus - Distributed Operating System [ case study ]
Chorus - Distributed Operating System [ case study ]
 
Realtime
RealtimeRealtime
Realtime
 
Cache coherence ppt
Cache coherence pptCache coherence ppt
Cache coherence ppt
 
LXC on Ganeti
LXC on GanetiLXC on Ganeti
LXC on Ganeti
 
General Purpose GPU Computing
General Purpose GPU ComputingGeneral Purpose GPU Computing
General Purpose GPU Computing
 
Securing Applications and Pipelines on a Container Platform
Securing Applications and Pipelines on a Container PlatformSecuring Applications and Pipelines on a Container Platform
Securing Applications and Pipelines on a Container Platform
 

More from SeongJae Park

Biscuit: an operating system written in go
Biscuit:  an operating system written in goBiscuit:  an operating system written in go
Biscuit: an operating system written in go
SeongJae Park
 
GCMA: Guaranteed Contiguous Memory Allocator
GCMA: Guaranteed Contiguous Memory AllocatorGCMA: Guaranteed Contiguous Memory Allocator
GCMA: Guaranteed Contiguous Memory Allocator
SeongJae Park
 
Design choices of golang for high scalability
Design choices of golang for high scalabilityDesign choices of golang for high scalability
Design choices of golang for high scalability
SeongJae Park
 
Brief introduction to kselftest
Brief introduction to kselftestBrief introduction to kselftest
Brief introduction to kselftest
SeongJae Park
 
Let the contribution begin (EST futures)
Let the contribution begin  (EST futures)Let the contribution begin  (EST futures)
Let the contribution begin (EST futures)
SeongJae Park
 
Porting golang development environment developed with golang
Porting golang development environment developed with golangPorting golang development environment developed with golang
Porting golang development environment developed with golang
SeongJae Park
 
gcma: guaranteed contiguous memory allocator
gcma:  guaranteed contiguous memory allocatorgcma:  guaranteed contiguous memory allocator
gcma: guaranteed contiguous memory allocator
SeongJae Park
 
An introduction to_golang.avi
An introduction to_golang.aviAn introduction to_golang.avi
An introduction to_golang.avi
SeongJae Park
 
Develop Android/iOS app using golang
Develop Android/iOS app using golangDevelop Android/iOS app using golang
Develop Android/iOS app using golang
SeongJae Park
 
Develop Android app using Golang
Develop Android app using GolangDevelop Android app using Golang
Develop Android app using Golang
SeongJae Park
 
Sw install with_without_docker
Sw install with_without_dockerSw install with_without_docker
Sw install with_without_docker
SeongJae Park
 
Git inter-snapshot public
Git  inter-snapshot publicGit  inter-snapshot public
Git inter-snapshot public
SeongJae Park
 
(Live) build and run golang web server on android.avi
(Live) build and run golang web server on android.avi(Live) build and run golang web server on android.avi
(Live) build and run golang web server on android.avi
SeongJae Park
 
Deep dark-side of git: How git works internally
Deep dark-side of git: How git works internallyDeep dark-side of git: How git works internally
Deep dark-side of git: How git works internally
SeongJae Park
 
Deep dark side of git - prologue
Deep dark side of git - prologueDeep dark side of git - prologue
Deep dark side of git - prologue
SeongJae Park
 
DO YOU WANT TO USE A VCS
DO YOU WANT TO USE A VCSDO YOU WANT TO USE A VCS
DO YOU WANT TO USE A VCS
SeongJae Park
 
Experimental android hacking using reflection
Experimental android hacking using reflectionExperimental android hacking using reflection
Experimental android hacking using reflection
SeongJae Park
 
ash
ashash
Hacktime for adk
Hacktime for adkHacktime for adk
Hacktime for adk
SeongJae Park
 
Let the contribution begin
Let the contribution beginLet the contribution begin
Let the contribution begin
SeongJae Park
 

More from SeongJae Park (20)

Biscuit: an operating system written in go
Biscuit:  an operating system written in goBiscuit:  an operating system written in go
Biscuit: an operating system written in go
 
GCMA: Guaranteed Contiguous Memory Allocator
GCMA: Guaranteed Contiguous Memory AllocatorGCMA: Guaranteed Contiguous Memory Allocator
GCMA: Guaranteed Contiguous Memory Allocator
 
Design choices of golang for high scalability
Design choices of golang for high scalabilityDesign choices of golang for high scalability
Design choices of golang for high scalability
 
Brief introduction to kselftest
Brief introduction to kselftestBrief introduction to kselftest
Brief introduction to kselftest
 
Let the contribution begin (EST futures)
Let the contribution begin  (EST futures)Let the contribution begin  (EST futures)
Let the contribution begin (EST futures)
 
Porting golang development environment developed with golang
Porting golang development environment developed with golangPorting golang development environment developed with golang
Porting golang development environment developed with golang
 
gcma: guaranteed contiguous memory allocator
gcma:  guaranteed contiguous memory allocatorgcma:  guaranteed contiguous memory allocator
gcma: guaranteed contiguous memory allocator
 
An introduction to_golang.avi
An introduction to_golang.aviAn introduction to_golang.avi
An introduction to_golang.avi
 
Develop Android/iOS app using golang
Develop Android/iOS app using golangDevelop Android/iOS app using golang
Develop Android/iOS app using golang
 
Develop Android app using Golang
Develop Android app using GolangDevelop Android app using Golang
Develop Android app using Golang
 
Sw install with_without_docker
Sw install with_without_dockerSw install with_without_docker
Sw install with_without_docker
 
Git inter-snapshot public
Git  inter-snapshot publicGit  inter-snapshot public
Git inter-snapshot public
 
(Live) build and run golang web server on android.avi
(Live) build and run golang web server on android.avi(Live) build and run golang web server on android.avi
(Live) build and run golang web server on android.avi
 
Deep dark-side of git: How git works internally
Deep dark-side of git: How git works internallyDeep dark-side of git: How git works internally
Deep dark-side of git: How git works internally
 
Deep dark side of git - prologue
Deep dark side of git - prologueDeep dark side of git - prologue
Deep dark side of git - prologue
 
DO YOU WANT TO USE A VCS
DO YOU WANT TO USE A VCSDO YOU WANT TO USE A VCS
DO YOU WANT TO USE A VCS
 
Experimental android hacking using reflection
Experimental android hacking using reflectionExperimental android hacking using reflection
Experimental android hacking using reflection
 
ash
ashash
ash
 
Hacktime for adk
Hacktime for adkHacktime for adk
Hacktime for adk
 
Let the contribution begin
Let the contribution beginLet the contribution begin
Let the contribution begin
 

Recently uploaded

GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
saastr
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
IndexBug
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
Jakub Marek
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
SitimaJohn
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
Zilliz
 
Mariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceXMariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceX
Mariano Tinti
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
tolgahangng
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
DanBrown980551
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
Recommendation System using RAG Architecture
Recommendation System using RAG ArchitectureRecommendation System using RAG Architecture
Recommendation System using RAG Architecture
fredae14
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
MichaelKnudsen27
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
akankshawande
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
ssuserfac0301
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Speck&Tech
 

Recently uploaded (20)

GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
 
Mariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceXMariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceX
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
Recommendation System using RAG Architecture
Recommendation System using RAG ArchitectureRecommendation System using RAG Architecture
Recommendation System using RAG Architecture
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
 

Linux Kernel Memory Model

  • 1. Linux Kernel Memory Model SeongJae Park <sj38.park@gmail.com>
  • 2. These slides were presented during 4th Korea Linux Kernel Community Meetup (https://kernel-dev-ko.github.io/4th/)
  • 3. This work by SeongJae Park is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/.
  • 4. I, SeongJae Park ● SeongJae Park <sj38.park@gmail.com> ● PhD candidate researcher at DCSLAB, SNU ● Part time linux kernel programmer at KOSSLAB ● Interested in memory management and parallel programming for OS kernels
  • 5. It’s Multi-processor Era, Now ● Processor vendors changed their mind to increase number of cores instead of clock speed more than a decade ago ○ Now, multi-core systems are prevalent ○ Even octa-core system in your pocket, maybe? ● As a result, the free lunch is over; parallel programming is essential for high performance and scalability http://www.gotw.ca/images/CPU.png GIVE UP :’(
  • 6. Writing Correct Parallel Program is Hard ● Nature of parallelism is counter-intuitive to human ○ Time is relative, order is ambiguous, and only observations exist ● Compilers and processors do reorderings for performance ● C language developed with Uni-Processor assumption ○ Standard for the kernel (C99) is oblivious of multi-processor systems CPU 0 CPU 1 X = 1; Y = 1; X = 2; Y = 2; if (Y == 2 && X == 1) OOPS(); CPU 1 can call OOPS() in Linux Kernel
  • 8. Instruction Reordering Helps Performance ● By reordering dependent instructions far away, total time consumptions can be reduced ● If the reordering is guaranteed to not change the result of the instruction sequence, it would be helpful for better performance ● Such reordering is legal and recommended fetch decode execute fetch decode execute fetch decode execute Instruction 1 Instruction 3 Instruction 2 instruction 2 depends on result of instruction 1 (e.g., first instruction modifies opcode of next instruction) By reordering instruction 2 and 3, total execution time can be shorten 6 cycles for 3 instructions: 2 cycles per instruction
  • 9. Time and Order is Relative ● Each CPU concurrently generates their events and observes effects of events from others ● It is impossible to define absolute order of two concurrent events; Only relative observation order is possible CPU 1 CPU 2 CPU 3 CPU 4 Generated event 1 Generated event 2 Observed event 1 followed by event 2 I observed event 2 followed by event 1 Event Bus
  • 10. Cache Coherency is Per-CacheLine ● It is well known that cache coherency protocol helps system memory consistency ● In actual, it guarantees only per-cacheline sequential consistency ● Every CPU will eventually agree about global order of each cacheline, but some CPU can aware the change faster than others, order of changes between different variables is not guaranteed CPU 1 CPU 2 CPU 3 CPU 4 A = 1; (1) B = 1; (2) A = 2; (3) B = 2; (4) The order is: (1), (2), (3), (4) Coherent cache protocol The order is: (1), (3), (2), (4) The order is: (2), (1), (4), (3) Every CPU is honest; The system is legally cache-coherent
  • 11. An Example with A Mythological Architecture ● Many systems equip hierarchical memory for better performance and space ● Propagation speed of an event to a given core can be influenced by specific sub-layer of memory If CPU 0 Message Queue is busy, CPU 2 can observe an event from CPU 0 (event A) after an event of CPU 1 (event B) though CPU 1 observed event A before generating event B CPU 0 CPU 1 Cache CPU 0 Message Queue CPU 1 Message Queue Memory CPU 2 CPU 3 Cache CPU 2 Message Queue CPU 3 Message Queue Bus
  • 12. An Example with A Mythological Architecture ● Many systems equip hierarchical memory for better performance and space ● Propagation speed of an event to a given core can be influenced by specific sub-layer of memory If CPU 0 Message Queue is busy, CPU 2 can observe an event from CPU 0 (event A) after an event of CPU 1 (event B) though CPU 1 observed event A before generating event B CPU 0 CPU 1 Cache CPU 0 Message Queue CPU 1 Message Queue Memory CPU 2 CPU 3 Cache CPU 2 Message Queue CPU 3 Message Queue Bus Generate Event A; Event A
  • 13. An Example with A Mythological Architecture ● Many systems equip hierarchical memory for better performance and space ● Propagation speed of an event to a given core can be influenced by specific sub-layer of memory If CPU 0 Message Queue is busy, CPU 2 can observe an event from CPU 0 (event A) after an event of CPU 1 (event B) though CPU 1 observed event A before generating event B CPU 0 CPU 1 Cache CPU 0 Message Queue CPU 1 Message Queue Memory CPU 2 CPU 3 Cache CPU 2 Message Queue CPU 3 Message Queue Bus Generate Event A; Seen Event A; Generate Event B; Event BEvent A
  • 14. An Example with A Mythological Architecture ● Many systems equip hierarchical memory for better performance and space ● Propagation speed of an event to a given core can be influenced by specific sub-layer of memory If CPU 0 Message Queue is busy, CPU 2 can observe an event from CPU 0 (event A) after an event of CPU 1 (event B) though CPU 1 observed event A before generating event B CPU 0 CPU 1 Cache CPU 0 Message Queue CPU 1 Message Queue Memory CPU 2 CPU 3 Cache CPU 2 Message Queue CPU 3 Message Queue Bus Generate Event A; Seen Event A; Generate Event B; Seen Event B; Event A Event B Busy… ;;;
  • 15. An Example with A Mythological Architecture ● Many systems equip hierarchical memory for better performance and space ● Propagation speed of an event to a given core can be influenced by specific sub-layer of memory If CPU 0 Message Queue is busy, CPU 2 can observe an event from CPU 0 (event A) after an event of CPU 1 (event B) though CPU 1 observed event A before generating event B CPU 0 CPU 1 Cache CPU 0 Message Queue CPU 1 Message Queue Memory CPU 2 CPU 3 Cache CPU 2 Message Queue CPU 3 Message Queue Bus Generate Event A; Seen Event A; Generate Event B; Seen Event B; Seen Event A; Event B Event A
  • 16. Yet Another Mythological Architecture ● Store Buffer and Invalidation Queue can reorder observation of stores and loads CPU 0 Cache Store Buffer Invalidation Queue Memory CPU 1 Cache Store Buffer Invalidation Queue Bus
  • 18. C-language Doesn’t Know Multi-Processor ● By the time of initial C-language development, multi-processor was rare ● As a result, C-language has only few guarantees about memory operations on multi-processor ● Undefined behavior is allowed for undefined case https://upload.wikimedia.org/wikipedia/commons/thumb/9/95/The_C_Programming _Language,_First_Edition_Cover_(2).svg/2000px-The_C_Programming_Languag e,_First_Edition_Cover_(2).svg.png
  • 19. Compiler Optimizes (Reorders) Code ● Clever compilers try hard (really hard) to optimize code for high IPC (not for human perspective goals) ○ Converts small, private functions to inline code ○ Reorder memory access code to minimize dependency ○ Simplify unnecessarily complex loops, ... ● Optimization uses term `Undefined behavior` as they want ○ It’s legal, but sometimes do insane things in programmer’s perspective ● Compilers reorder memory accesses based on the C-standard, which is oblivious of multi-processor systems; this can result in unintended programs ● C11 standard aware the multi-processor systems, though
  • 21. Synchronization Primitives ● Because reordering and asynchronous effect propagation is legal, synchronization primitives are necessary to write human intuitive program ● Most environments including the Linux kernel provide sufficiently enough synchronization primitives for human intuitive programs https://s-media-cache-ak0.pinimg.com/236x/42/bc/55/42bc55a6d7e5affe2d0dbe9c872a3df9.jpg
  • 22. Atomic Operations ● Atomic operations are configured with multiple sub operations ○ E.g., compare1) -and-exchange2) , fetch1) -and-add2) , and test1) -and-set2) ● Atomic operations have mutual exclusiveness ○ Middle state of atomic operation execution cannot be seen by others ○ It can be thought of as small critical section that protected by a global lock ● Linux also provides plenty of atomic operations http://www.scienceclarified.com/photos/atomic-mass-3024.jpg
  • 23. Memory Barriers ● In general, memory barriers guarantee effects of memory operations issued before it to be propagated to other components in the system (e.g., processor) before memory operations issued after the barrier ● Linux provides various types of memory barriers including compiler memory barriers, hardware memory barriers, load barriers, store barriers CPU 1 CPU 2 CPU 3 READ A; WRITE B; <BARRIER>; READ C; READ A, WRITE B, then READ C occurred WRITE B, READ A, then READ C occurred READ A and WRITE B can be reordered but READ C is guaranteed to be ordered after {READ A, WRITE B}
  • 24. Partial Ordering Primitives ● Partial ordering primitives include read / write memory barriers, acquire / release semantics and data / address / control dependencies ● Semantics of partial ordering primitives are somewhat confusing ● Cost of partial ordering can be much cheaper than fully ordering primitives ● Linux also provides partial ordering primitives https://www.ternvets.co.uk/wp-content/uploads/2017/02/cat-with-cone.jpg
  • 25. Cost of Synchronization Primitives ● Generally, synchronization primitives are expensive, unscalable ● Each primitive costs differently; the gap is significant ● Correct selection and efficient use of synchronization primitives are important for the performance and scalability Performance of various synchronization primitives
  • 27. Each Environment Provides Own Memory Model ● Memory model, a.k.a Memory consistency model ● Defines what values can be obtained by load instructions of given code ● Programming environments including Instruction Set Architectures, Programming languages, Operating systems, etc define their memory model ○ Modern language memory models (e.g., Golang, Rust, Java, C11, …) aware multi-processor, while C99, which is for the Linux kernel, doesn’t http://www.sciencemag.org/sites/default/files/styles/article_main_large/public/Memory.jpg?itok=4FmHo7M5
  • 28. Each ISA Provides Specific Memory Model ● Some architectures have stricter ordering enforcement rule than others ● PA-RISC CPUs are strictest, Alpha is weakest ● Because Linux kernel supports multiple architectures, it defines its memory model based on weakest one, the Alpha https://kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.2015.01.31a.pdf
  • 29. Linux Kernel Memory Model (LKMM) ● The memory model for Linux kernel programming environment ○ Defines what values can be obtained, given a piece of Linux kernel code, for specific load instructions in the code ● Linux kernel original memory model is necessary because ○ It uses C99 but C99 memory model is oblivious of multi-processor environment; C11 memory model aware of multi-processor, but Torvalds doesn’t want it ○ It supports multiple architectures with different ISA-level memory model https://github.com/sjp38/perfbook
  • 30. LKMM Providing Ordering Primitives ● Designed for weakest memory model architecture, Alpha ○ Almost every combination of reordering is possible, doesn’t provide address dependency ● Atomic instructions ○ atomix_xchg(), atomic_inc_return(), atomic_dec_return(), … ○ Most of those just use mapping instruction, but provide general guarantees at Kernel level ● Memory barriers ○ Compiler barriers: WRITE_ONCE(), READ_ONCE(), barrier(), ... ○ CPU barriers: mb(), wmb(), rmb(), smp_mb(), smp_wmb(), smp_rmb(), … ○ Semantic barriers: ACQUIRE operations, RELEASE operations, dependencies, … ○ For detail, refer to https://www.kernel.org/doc/Documentation/memory-barriers.txt ● Because different memory ordering primitive has different cost, only necessary ordering primitives should be used in necessary case for high performance and scalability
  • 31. Informal LKMM ● Originally, LKMM was just an informal text, ‘memory-barriers.txt’ ● It explains about the Linux Kernel Memory Model in English (There is Korean translation, too: https://www.kernel.org/doc/Documentation/translations/ko_KR/memory-barriers.txt) ● To use the LKMM to prove your code, you should use Feynman Algorithm ○ Write down your code ○ Think real hard with the ‘memory-barriers.txt’ ○ Write down your provement ○ Hard and unreliable, of course! https://www.kernel.org/doc/Documentation/memory-barriers.txt
  • 32. Formal and Executable LKMM: Help Arrives at Last ● Jade Alglave, Paul E. McKenney, and some guys made formal LKMM ● It is formal, executable memory model ○ It receives C-like simple code as input ○ The code containing parallel code snippets and a question: can this result happen? ● Provides model for each ordering primitive of LKMM based on concepts including orders and cycles ● Can be executed with herd7 and klitmus7 ○ LKMM extends herd7 and klitmus7 to support LKMM ordering primitives in code ○ herd7 based execution does proof in user mode ○ klitmus7 based execution generates test running kernel module and wrapper scripts for execution of the module
  • 33. Steps for LKMM Execution ● Installation ○ LKMM is merged in Linux source tree at tools/memory-model; Just get the linux source code ○ Install herdtools7 (https://github.com/herd/herdtools7) ● Usage ○ Using herd7 based user mode proof $ herd7 -conf linux-kernel.cfg <your-litmus-test file> ○ Using klitmus7 based real kernel mode execution $ mkdir mymodules $ klitmus7 -o mymodules <your-litmus-test file> $ cd mymodules ; make $ sudo sh run.sh ● That’s it! Now you can prove your parallel code for all Linux environments!
  • 34. Summary ● Nature of parallel programming is counter-intuitive ○ Cannot define order of events without interaction ○ Ordering rule is different for different environment ○ Memory model defines their ordering rule ● For human-intuitive and correct program, interaction is necessary ○ Almost every environment provides memory ordering primitives including atomic instructions and memory barriers, which is expensive in common ○ Memory model defines what result can occur and cannot with given code snippet ● Formal Linux kernel memory model is available ○ Linux kernel provides its memory model based on weakest memory ordering rule architecture it supports, the Alpha, C99, and its original ordering primitives including RCU ○ Formal and executable LKMM is in the kernel tree; now you can prove your parallel code!
  • 37. `herdtools` Install ● LKMM (in Linux v4.19) depends on herdtools version 7.49 ○ Depending version may change ● `herdtools` depends on OPAM, the package manager for OCaml ● Ubuntu package manager supports OPAM $ sudo apt install opam $ opam init && sudo opam update && sudo opam upgrade $ git clone https://github.com/herd/herdtools7 && cd herdtools7 $ git checkout 7.49 $ make all && make install $ herd7 -version 7.49, Rev: 93dcbdd89086d5f3e981b280d437309fdeb8b427
  • 38. Herd7 Based Userspace Litmus Tests Execution $ herd7 -conf linux-kernel.cfg litmus-tests/SB+fencembonceonces.litmus Test SB+fencembonceonces Allowed States 3 0:r0=0; 1:r0=1; 0:r0=1; 1:r0=0; 0:r0=1; 1:r0=1; No Witnesses Positive: 0 Negative: 3 Condition exists (0:r0=0 / 1:r0=0) Observation SB+fencembonceonces Never 0 3 Time SB+fencembonceonces 0.01 Hash=d66d99523e2cac6b06e66f4c995ebb48
  • 39. Klitmus7 Based Litmus Tests Execution $ mkdir my; klitmus7 -o my/ litmus-tests/SB+fencembonceonces.litmus $ cd klitmus_test; make; sudo sh ./run.sh ... Test SB+fencembonceonces Allowed Histogram (3 states) 16580117:>0:r0=1; 1:r0=0; 16402936:>0:r0=0; 1:r0=1; 3016947 :>0:r0=1; 1:r0=1; ... Positive: 0, Negative: 36000000 Condition exists (0:r0=0 / 1:r0=0) is NOT validated Hash=d66d99523e2cac6b06e66f4c995ebb48 Observation SB+fencembonceonces Never 0 36000000 ...