SlideShare a Scribd company logo
1 of 39
Download to read offline
Linux Kernel Memory Model
SeongJae Park <sj38.park@gmail.com>
These slides were presented during
4th Korea Linux Kernel Community Meetup
(https://kernel-dev-ko.github.io/4th/)
This work by SeongJae Park is licensed under the
Creative Commons Attribution-ShareAlike 3.0 Unported
License. To view a copy of this license, visit
http://creativecommons.org/licenses/by-sa/3.0/.
I, SeongJae Park
● SeongJae Park <sj38.park@gmail.com>
● PhD candidate researcher at DCSLAB, SNU
● Part time linux kernel programmer at KOSSLAB
● Interested in memory management and parallel programming for OS kernels
It’s Multi-processor Era, Now
● Processor vendors changed their mind to increase number of cores instead of
clock speed more than a decade ago
○ Now, multi-core systems are prevalent
○ Even octa-core system in your pocket, maybe?
● As a result, the free lunch is over;
parallel programming is essential for high performance and scalability
http://www.gotw.ca/images/CPU.png
GIVE UP :’(
Writing Correct Parallel Program is Hard
● Nature of parallelism is counter-intuitive to human
○ Time is relative, order is ambiguous, and only observations exist
● Compilers and processors do reorderings for performance
● C language developed with Uni-Processor assumption
○ Standard for the kernel (C99) is oblivious of multi-processor systems
CPU 0 CPU 1
X = 1;
Y = 1;
X = 2;
Y = 2;
if (Y == 2 && X == 1)
OOPS();
CPU 1 can call OOPS() in Linux Kernel
Hardware Reorderings
Why and How?
Instruction Reordering Helps Performance
● By reordering dependent instructions far away, total time consumptions can
be reduced
● If the reordering is guaranteed to not change the result of the instruction
sequence, it would be helpful for better performance
● Such reordering is legal and recommended
fetch decode execute
fetch decode execute
fetch decode execute
Instruction 1
Instruction 3
Instruction 2
instruction 2 depends on result of instruction 1
(e.g., first instruction modifies opcode of next instruction)
By reordering instruction 2 and 3, total execution time can be shorten
6 cycles for 3 instructions: 2 cycles per instruction
Time and Order is Relative
● Each CPU concurrently generates their events and
observes effects of events from others
● It is impossible to define absolute order of two concurrent events;
Only relative observation order is possible
CPU 1 CPU 2 CPU 3 CPU 4
Generated
event 1
Generated
event 2
Observed
event 1
followed by
event 2
I observed
event 2
followed by
event 1
Event Bus
Cache Coherency is Per-CacheLine
● It is well known that cache coherency protocol helps system memory
consistency
● In actual, it guarantees only per-cacheline sequential consistency
● Every CPU will eventually agree about global order of each cacheline,
but some CPU can aware the change faster than others,
order of changes between different variables is not guaranteed
CPU 1 CPU 2 CPU 3 CPU 4
A = 1; (1)
B = 1; (2)
A = 2; (3)
B = 2; (4)
The order is:
(1), (2), (3), (4)
Coherent cache protocol
The order is:
(1), (3), (2), (4)
The order is:
(2), (1), (4), (3)
Every CPU is honest;
The system is legally
cache-coherent
An Example with A Mythological Architecture
● Many systems equip hierarchical memory for better performance and space
● Propagation speed of an event to a given core can be influenced by specific
sub-layer of memory
If CPU 0 Message Queue is busy, CPU 2 can observe an event from
CPU 0 (event A) after an event of CPU 1 (event B)
though CPU 1 observed event A before generating event B
CPU 0 CPU 1
Cache
CPU 0
Message
Queue
CPU 1
Message
Queue
Memory
CPU 2 CPU 3
Cache
CPU 2
Message
Queue
CPU 3
Message
Queue
Bus
An Example with A Mythological Architecture
● Many systems equip hierarchical memory for better performance and space
● Propagation speed of an event to a given core can be influenced by specific
sub-layer of memory
If CPU 0 Message Queue is busy, CPU 2 can observe an event from
CPU 0 (event A) after an event of CPU 1 (event B)
though CPU 1 observed event A before generating event B
CPU 0 CPU 1
Cache
CPU 0
Message
Queue
CPU 1
Message
Queue
Memory
CPU 2 CPU 3
Cache
CPU 2
Message
Queue
CPU 3
Message
Queue
Bus
Generate
Event A;
Event A
An Example with A Mythological Architecture
● Many systems equip hierarchical memory for better performance and space
● Propagation speed of an event to a given core can be influenced by specific
sub-layer of memory
If CPU 0 Message Queue is busy, CPU 2 can observe an event from
CPU 0 (event A) after an event of CPU 1 (event B)
though CPU 1 observed event A before generating event B
CPU 0 CPU 1
Cache
CPU 0
Message
Queue
CPU 1
Message
Queue
Memory
CPU 2 CPU 3
Cache
CPU 2
Message
Queue
CPU 3
Message
Queue
Bus
Generate
Event A;
Seen Event A;
Generate
Event B;
Event BEvent A
An Example with A Mythological Architecture
● Many systems equip hierarchical memory for better performance and space
● Propagation speed of an event to a given core can be influenced by specific
sub-layer of memory
If CPU 0 Message Queue is busy, CPU 2 can observe an event from
CPU 0 (event A) after an event of CPU 1 (event B)
though CPU 1 observed event A before generating event B
CPU 0 CPU 1
Cache
CPU 0
Message
Queue
CPU 1
Message
Queue
Memory
CPU 2 CPU 3
Cache
CPU 2
Message
Queue
CPU 3
Message
Queue
Bus
Generate
Event A;
Seen Event A;
Generate
Event B;
Seen Event B;
Event A
Event B
Busy… ;;;
An Example with A Mythological Architecture
● Many systems equip hierarchical memory for better performance and space
● Propagation speed of an event to a given core can be influenced by specific
sub-layer of memory
If CPU 0 Message Queue is busy, CPU 2 can observe an event from
CPU 0 (event A) after an event of CPU 1 (event B)
though CPU 1 observed event A before generating event B
CPU 0 CPU 1
Cache
CPU 0
Message
Queue
CPU 1
Message
Queue
Memory
CPU 2 CPU 3
Cache
CPU 2
Message
Queue
CPU 3
Message
Queue
Bus
Generate
Event A;
Seen Event A;
Generate
Event B;
Seen Event B;
Seen Event A;
Event B
Event A
Yet Another Mythological Architecture
● Store Buffer and Invalidation Queue can reorder observation of stores and
loads
CPU 0
Cache
Store
Buffer
Invalidation
Queue
Memory
CPU 1
Cache
Store
Buffer
Invalidation
Queue
Bus
Compiler Reorderings
C-language Doesn’t Know Multi-Processor
● By the time of initial C-language development, multi-processor was rare
● As a result, C-language has only few guarantees about memory operations
on multi-processor
● Undefined behavior is allowed for undefined case
https://upload.wikimedia.org/wikipedia/commons/thumb/9/95/The_C_Programming
_Language,_First_Edition_Cover_(2).svg/2000px-The_C_Programming_Languag
e,_First_Edition_Cover_(2).svg.png
Compiler Optimizes (Reorders) Code
● Clever compilers try hard (really hard) to optimize code for high IPC
(not for human perspective goals)
○ Converts small, private functions to inline code
○ Reorder memory access code to minimize dependency
○ Simplify unnecessarily complex loops, ...
● Optimization uses term `Undefined behavior` as they want
○ It’s legal, but sometimes do insane things in programmer’s perspective
● Compilers reorder memory accesses based on the C-standard, which is
oblivious of multi-processor systems; this can result in unintended programs
● C11 standard aware the multi-processor systems, though
Synchronization Primitives
Synchronization Primitives
● Because reordering and asynchronous effect propagation is legal,
synchronization primitives are necessary to write human intuitive program
● Most environments including the Linux kernel provide sufficiently enough
synchronization primitives for human intuitive programs
https://s-media-cache-ak0.pinimg.com/236x/42/bc/55/42bc55a6d7e5affe2d0dbe9c872a3df9.jpg
Atomic Operations
● Atomic operations are configured with multiple sub operations
○ E.g., compare1)
-and-exchange2)
, fetch1)
-and-add2)
, and test1)
-and-set2)
● Atomic operations have mutual exclusiveness
○ Middle state of atomic operation execution cannot be seen by others
○ It can be thought of as small critical section that protected by a global lock
● Linux also provides plenty of atomic operations
http://www.scienceclarified.com/photos/atomic-mass-3024.jpg
Memory Barriers
● In general, memory barriers guarantee
effects of memory operations issued before it
to be propagated to other components in the system (e.g., processor)
before memory operations issued after the barrier
● Linux provides various types of memory barriers including compiler memory
barriers, hardware memory barriers, load barriers, store barriers
CPU 1 CPU 2 CPU 3
READ A;
WRITE B;
<BARRIER>;
READ C;
READ A,
WRITE B,
then READ C
occurred
WRITE B,
READ A,
then READ C
occurred
READ A and WRITE B can be reordered but READ C is guaranteed to
be ordered after {READ A, WRITE B}
Partial Ordering Primitives
● Partial ordering primitives include read / write memory barriers, acquire /
release semantics and data / address / control dependencies
● Semantics of partial ordering primitives are somewhat confusing
● Cost of partial ordering can be much cheaper than fully ordering primitives
● Linux also provides partial ordering primitives
https://www.ternvets.co.uk/wp-content/uploads/2017/02/cat-with-cone.jpg
Cost of Synchronization Primitives
● Generally, synchronization primitives are expensive, unscalable
● Each primitive costs differently; the gap is significant
● Correct selection and efficient use of synchronization primitives are important
for the performance and scalability
Performance of various synchronization primitives
LKMM: Linux Kernel
Memory Model
Each Environment Provides Own Memory Model
● Memory model, a.k.a Memory consistency model
● Defines what values can be obtained by load instructions of given code
● Programming environments including Instruction Set Architectures,
Programming languages, Operating systems, etc define their memory model
○ Modern language memory models (e.g., Golang, Rust, Java, C11, …) aware multi-processor,
while C99, which is for the Linux kernel, doesn’t
http://www.sciencemag.org/sites/default/files/styles/article_main_large/public/Memory.jpg?itok=4FmHo7M5
Each ISA Provides Specific Memory Model
● Some architectures have stricter ordering enforcement rule than others
● PA-RISC CPUs are strictest, Alpha is weakest
● Because Linux kernel supports multiple architectures, it defines its memory
model based on weakest one, the Alpha
https://kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.2015.01.31a.pdf
Linux Kernel Memory Model (LKMM)
● The memory model for Linux kernel programming environment
○ Defines what values can be obtained, given a piece of Linux kernel code, for specific load
instructions in the code
● Linux kernel original memory model is necessary because
○ It uses C99 but C99 memory model is oblivious of multi-processor environment;
C11 memory model aware of multi-processor, but Torvalds doesn’t want it
○ It supports multiple architectures with different ISA-level memory model
https://github.com/sjp38/perfbook
LKMM Providing Ordering Primitives
● Designed for weakest memory model architecture, Alpha
○ Almost every combination of reordering is possible, doesn’t provide address dependency
● Atomic instructions
○ atomix_xchg(), atomic_inc_return(), atomic_dec_return(), …
○ Most of those just use mapping instruction, but provide general guarantees at Kernel level
● Memory barriers
○ Compiler barriers: WRITE_ONCE(), READ_ONCE(), barrier(), ...
○ CPU barriers: mb(), wmb(), rmb(), smp_mb(), smp_wmb(), smp_rmb(), …
○ Semantic barriers: ACQUIRE operations, RELEASE operations, dependencies, …
○ For detail, refer to https://www.kernel.org/doc/Documentation/memory-barriers.txt
● Because different memory ordering primitive has different cost, only
necessary ordering primitives should be used in necessary case for high
performance and scalability
Informal LKMM
● Originally, LKMM was just an informal text, ‘memory-barriers.txt’
● It explains about the Linux Kernel Memory Model in English
(There is Korean translation, too: https://www.kernel.org/doc/Documentation/translations/ko_KR/memory-barriers.txt)
● To use the LKMM to prove your code, you should use Feynman Algorithm
○ Write down your code
○ Think real hard with the ‘memory-barriers.txt’
○ Write down your provement
○ Hard and unreliable, of course!
https://www.kernel.org/doc/Documentation/memory-barriers.txt
Formal and Executable LKMM: Help Arrives at Last
● Jade Alglave, Paul E. McKenney, and some guys made formal LKMM
● It is formal, executable memory model
○ It receives C-like simple code as input
○ The code containing parallel code snippets and a question: can this result happen?
● Provides model for each ordering primitive of LKMM based on concepts
including orders and cycles
● Can be executed with herd7 and klitmus7
○ LKMM extends herd7 and klitmus7 to support LKMM ordering primitives in code
○ herd7 based execution does proof in user mode
○ klitmus7 based execution generates test running kernel module and
wrapper scripts for execution of the module
Steps for LKMM Execution
● Installation
○ LKMM is merged in Linux source tree at tools/memory-model;
Just get the linux source code
○ Install herdtools7 (https://github.com/herd/herdtools7)
● Usage
○ Using herd7 based user mode proof
$ herd7 -conf linux-kernel.cfg <your-litmus-test file>
○ Using klitmus7 based real kernel mode execution
$ mkdir mymodules
$ klitmus7 -o mymodules <your-litmus-test file>
$ cd mymodules ; make
$ sudo sh run.sh
● That’s it! Now you can prove your parallel code for all Linux environments!
Summary
● Nature of parallel programming is counter-intuitive
○ Cannot define order of events without interaction
○ Ordering rule is different for different environment
○ Memory model defines their ordering rule
● For human-intuitive and correct program, interaction is necessary
○ Almost every environment provides memory ordering primitives including atomic instructions
and memory barriers, which is expensive in common
○ Memory model defines what result can occur and cannot with given code snippet
● Formal Linux kernel memory model is available
○ Linux kernel provides its memory model based on weakest memory ordering rule architecture
it supports, the Alpha, C99, and its original ordering primitives including RCU
○ Formal and executable LKMM is in the kernel tree; now you can prove your parallel code!
Thanks
Demonstration
`herdtools` Install
● LKMM (in Linux v4.19) depends on herdtools version 7.49
○ Depending version may change
● `herdtools` depends on OPAM, the package manager for OCaml
● Ubuntu package manager supports OPAM
$ sudo apt install opam
$ opam init && sudo opam update && sudo opam upgrade
$ git clone https://github.com/herd/herdtools7 && cd herdtools7
$ git checkout 7.49
$ make all && make install
$ herd7 -version
7.49, Rev: 93dcbdd89086d5f3e981b280d437309fdeb8b427
Herd7 Based Userspace Litmus Tests Execution
$ herd7 -conf linux-kernel.cfg 
litmus-tests/SB+fencembonceonces.litmus
Test SB+fencembonceonces Allowed
States 3
0:r0=0; 1:r0=1;
0:r0=1; 1:r0=0;
0:r0=1; 1:r0=1;
No
Witnesses
Positive: 0 Negative: 3
Condition exists (0:r0=0 / 1:r0=0)
Observation SB+fencembonceonces Never 0 3
Time SB+fencembonceonces 0.01
Hash=d66d99523e2cac6b06e66f4c995ebb48
Klitmus7 Based Litmus Tests Execution
$ mkdir my; klitmus7 -o my/ 
litmus-tests/SB+fencembonceonces.litmus
$ cd klitmus_test; make; sudo sh ./run.sh
...
Test SB+fencembonceonces Allowed
Histogram (3 states)
16580117:>0:r0=1; 1:r0=0;
16402936:>0:r0=0; 1:r0=1;
3016947 :>0:r0=1; 1:r0=1;
...
Positive: 0, Negative: 36000000
Condition exists (0:r0=0 / 1:r0=0) is NOT validated
Hash=d66d99523e2cac6b06e66f4c995ebb48
Observation SB+fencembonceonces Never 0 36000000
...

More Related Content

What's hot

Fun with Network Interfaces
Fun with Network InterfacesFun with Network Interfaces
Fun with Network InterfacesKernel TLV
 
LinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughLinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughThomas Graf
 
The Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageThe Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageKernel TLV
 
Kubernetes DNS Horror Stories
Kubernetes DNS Horror StoriesKubernetes DNS Horror Stories
Kubernetes DNS Horror StoriesLaurent Bernaille
 
Windows Internals: fuzzing, hijacking and weaponizing kernel objects
Windows Internals: fuzzing, hijacking and weaponizing kernel objectsWindows Internals: fuzzing, hijacking and weaponizing kernel objects
Windows Internals: fuzzing, hijacking and weaponizing kernel objectsNullbyte Security Conference
 
Linux and Java - Understanding and Troubleshooting
Linux and Java - Understanding and TroubleshootingLinux and Java - Understanding and Troubleshooting
Linux and Java - Understanding and TroubleshootingJérôme Kehrli
 
Course 102: Lecture 25: Devices and Device Drivers
Course 102: Lecture 25: Devices and Device Drivers Course 102: Lecture 25: Devices and Device Drivers
Course 102: Lecture 25: Devices and Device Drivers Ahmed El-Arabawy
 
The linux networking architecture
The linux networking architectureThe linux networking architecture
The linux networking architecturehugo lu
 
Building Network Functions with eBPF & BCC
Building Network Functions with eBPF & BCCBuilding Network Functions with eBPF & BCC
Building Network Functions with eBPF & BCCKernel TLV
 
Linux process management
Linux process managementLinux process management
Linux process managementRaghu nath
 
Conan a C/C++ Package Manager
Conan a C/C++ Package ManagerConan a C/C++ Package Manager
Conan a C/C++ Package ManagerUilian Ries
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance AnalysisBrendan Gregg
 
Linux Internals - Kernel/Core
Linux Internals - Kernel/CoreLinux Internals - Kernel/Core
Linux Internals - Kernel/CoreShay Cohen
 
Solaris Linux Performance, Tools and Tuning
Solaris Linux Performance, Tools and TuningSolaris Linux Performance, Tools and Tuning
Solaris Linux Performance, Tools and TuningAdrian Cockcroft
 
Linux Networking Explained
Linux Networking ExplainedLinux Networking Explained
Linux Networking ExplainedThomas Graf
 
Kernel Recipes 2019 - Faster IO through io_uring
Kernel Recipes 2019 - Faster IO through io_uringKernel Recipes 2019 - Faster IO through io_uring
Kernel Recipes 2019 - Faster IO through io_uringAnne Nicolas
 
Namespaces and cgroups - the basis of Linux containers
Namespaces and cgroups - the basis of Linux containersNamespaces and cgroups - the basis of Linux containers
Namespaces and cgroups - the basis of Linux containersKernel TLV
 

What's hot (20)

Fun with Network Interfaces
Fun with Network InterfacesFun with Network Interfaces
Fun with Network Interfaces
 
LinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughLinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking Walkthrough
 
Concurrency With Go
Concurrency With GoConcurrency With Go
Concurrency With Go
 
The Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageThe Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast Storage
 
Kubernetes DNS Horror Stories
Kubernetes DNS Horror StoriesKubernetes DNS Horror Stories
Kubernetes DNS Horror Stories
 
Windows Internals: fuzzing, hijacking and weaponizing kernel objects
Windows Internals: fuzzing, hijacking and weaponizing kernel objectsWindows Internals: fuzzing, hijacking and weaponizing kernel objects
Windows Internals: fuzzing, hijacking and weaponizing kernel objects
 
Linux and Java - Understanding and Troubleshooting
Linux and Java - Understanding and TroubleshootingLinux and Java - Understanding and Troubleshooting
Linux and Java - Understanding and Troubleshooting
 
Understanding DPDK
Understanding DPDKUnderstanding DPDK
Understanding DPDK
 
Course 102: Lecture 25: Devices and Device Drivers
Course 102: Lecture 25: Devices and Device Drivers Course 102: Lecture 25: Devices and Device Drivers
Course 102: Lecture 25: Devices and Device Drivers
 
The linux networking architecture
The linux networking architectureThe linux networking architecture
The linux networking architecture
 
Building Network Functions with eBPF & BCC
Building Network Functions with eBPF & BCCBuilding Network Functions with eBPF & BCC
Building Network Functions with eBPF & BCC
 
Linux process management
Linux process managementLinux process management
Linux process management
 
Conan a C/C++ Package Manager
Conan a C/C++ Package ManagerConan a C/C++ Package Manager
Conan a C/C++ Package Manager
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance Analysis
 
Linux Internals - Kernel/Core
Linux Internals - Kernel/CoreLinux Internals - Kernel/Core
Linux Internals - Kernel/Core
 
Solaris Linux Performance, Tools and Tuning
Solaris Linux Performance, Tools and TuningSolaris Linux Performance, Tools and Tuning
Solaris Linux Performance, Tools and Tuning
 
Linux Networking Explained
Linux Networking ExplainedLinux Networking Explained
Linux Networking Explained
 
QNX Sales Engineering Presentation
QNX Sales Engineering PresentationQNX Sales Engineering Presentation
QNX Sales Engineering Presentation
 
Kernel Recipes 2019 - Faster IO through io_uring
Kernel Recipes 2019 - Faster IO through io_uringKernel Recipes 2019 - Faster IO through io_uring
Kernel Recipes 2019 - Faster IO through io_uring
 
Namespaces and cgroups - the basis of Linux containers
Namespaces and cgroups - the basis of Linux containersNamespaces and cgroups - the basis of Linux containers
Namespaces and cgroups - the basis of Linux containers
 

Similar to Linux Kernel Memory Model

An Introduction to the Formalised Memory Model for Linux Kernel
An Introduction to the Formalised Memory Model for Linux KernelAn Introduction to the Formalised Memory Model for Linux Kernel
An Introduction to the Formalised Memory Model for Linux KernelSeongJae Park
 
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kevin Lynch
 
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...OpenVZ
 
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...Andrey Vagin
 
Андрей Вагин. Все что вы хотели знать о Criu, но стеснялись спросить...
Андрей Вагин. Все что вы хотели знать о Criu, но стеснялись спросить...Андрей Вагин. Все что вы хотели знать о Criu, но стеснялись спросить...
Андрей Вагин. Все что вы хотели знать о Criu, но стеснялись спросить...WG_ Events
 
Checkpoint and Restore In Userspace
Checkpoint and Restore In UserspaceCheckpoint and Restore In Userspace
Checkpoint and Restore In UserspaceOpenVZ
 
Fast boot
Fast bootFast boot
Fast bootSZ Lin
 
Introduction to OS LEVEL Virtualization & Containers
Introduction to OS LEVEL Virtualization & ContainersIntroduction to OS LEVEL Virtualization & Containers
Introduction to OS LEVEL Virtualization & ContainersVaibhav Sharma
 
Docker 原理與實作
Docker 原理與實作Docker 原理與實作
Docker 原理與實作kao kuo-tung
 
Securing Applications and Pipelines on a Container Platform
Securing Applications and Pipelines on a Container PlatformSecuring Applications and Pipelines on a Container Platform
Securing Applications and Pipelines on a Container PlatformAll Things Open
 
Talk 160920 @ Cat System Workshop
Talk 160920 @ Cat System WorkshopTalk 160920 @ Cat System Workshop
Talk 160920 @ Cat System WorkshopQuey-Liang Kao
 
Chorus - Distributed Operating System [ case study ]
Chorus - Distributed Operating System [ case study ]Chorus - Distributed Operating System [ case study ]
Chorus - Distributed Operating System [ case study ]Akhil Nadh PC
 
LXC on Ganeti
LXC on GanetiLXC on Ganeti
LXC on Ganetikawamuray
 

Similar to Linux Kernel Memory Model (20)

An Introduction to the Formalised Memory Model for Linux Kernel
An Introduction to the Formalised Memory Model for Linux KernelAn Introduction to the Formalised Memory Model for Linux Kernel
An Introduction to the Formalised Memory Model for Linux Kernel
 
Containers > VMs
Containers > VMsContainers > VMs
Containers > VMs
 
Linux Internals - Part II
Linux Internals - Part IILinux Internals - Part II
Linux Internals - Part II
 
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
 
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
 
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
Open WG Talk #2 Everything you wanted to know about CRIU (but were afraid to ...
 
Андрей Вагин. Все что вы хотели знать о Criu, но стеснялись спросить...
Андрей Вагин. Все что вы хотели знать о Criu, но стеснялись спросить...Андрей Вагин. Все что вы хотели знать о Criu, но стеснялись спросить...
Андрей Вагин. Все что вы хотели знать о Criu, но стеснялись спросить...
 
Checkpoint and Restore In Userspace
Checkpoint and Restore In UserspaceCheckpoint and Restore In Userspace
Checkpoint and Restore In Userspace
 
Fast boot
Fast bootFast boot
Fast boot
 
Introduction to OS LEVEL Virtualization & Containers
Introduction to OS LEVEL Virtualization & ContainersIntroduction to OS LEVEL Virtualization & Containers
Introduction to OS LEVEL Virtualization & Containers
 
Exploring Docker Security
Exploring Docker SecurityExploring Docker Security
Exploring Docker Security
 
Linux-Internals-and-Networking
Linux-Internals-and-NetworkingLinux-Internals-and-Networking
Linux-Internals-and-Networking
 
Docker 原理與實作
Docker 原理與實作Docker 原理與實作
Docker 原理與實作
 
Securing Applications and Pipelines on a Container Platform
Securing Applications and Pipelines on a Container PlatformSecuring Applications and Pipelines on a Container Platform
Securing Applications and Pipelines on a Container Platform
 
Optimizing Linux Servers
Optimizing Linux ServersOptimizing Linux Servers
Optimizing Linux Servers
 
Talk 160920 @ Cat System Workshop
Talk 160920 @ Cat System WorkshopTalk 160920 @ Cat System Workshop
Talk 160920 @ Cat System Workshop
 
Chorus - Distributed Operating System [ case study ]
Chorus - Distributed Operating System [ case study ]Chorus - Distributed Operating System [ case study ]
Chorus - Distributed Operating System [ case study ]
 
Realtime
RealtimeRealtime
Realtime
 
Cache coherence ppt
Cache coherence pptCache coherence ppt
Cache coherence ppt
 
LXC on Ganeti
LXC on GanetiLXC on Ganeti
LXC on Ganeti
 

More from SeongJae Park

Biscuit: an operating system written in go
Biscuit:  an operating system written in goBiscuit:  an operating system written in go
Biscuit: an operating system written in goSeongJae Park
 
GCMA: Guaranteed Contiguous Memory Allocator
GCMA: Guaranteed Contiguous Memory AllocatorGCMA: Guaranteed Contiguous Memory Allocator
GCMA: Guaranteed Contiguous Memory AllocatorSeongJae Park
 
Design choices of golang for high scalability
Design choices of golang for high scalabilityDesign choices of golang for high scalability
Design choices of golang for high scalabilitySeongJae Park
 
Brief introduction to kselftest
Brief introduction to kselftestBrief introduction to kselftest
Brief introduction to kselftestSeongJae Park
 
Let the contribution begin (EST futures)
Let the contribution begin  (EST futures)Let the contribution begin  (EST futures)
Let the contribution begin (EST futures)SeongJae Park
 
Porting golang development environment developed with golang
Porting golang development environment developed with golangPorting golang development environment developed with golang
Porting golang development environment developed with golangSeongJae Park
 
gcma: guaranteed contiguous memory allocator
gcma:  guaranteed contiguous memory allocatorgcma:  guaranteed contiguous memory allocator
gcma: guaranteed contiguous memory allocatorSeongJae Park
 
An introduction to_golang.avi
An introduction to_golang.aviAn introduction to_golang.avi
An introduction to_golang.aviSeongJae Park
 
Develop Android/iOS app using golang
Develop Android/iOS app using golangDevelop Android/iOS app using golang
Develop Android/iOS app using golangSeongJae Park
 
Develop Android app using Golang
Develop Android app using GolangDevelop Android app using Golang
Develop Android app using GolangSeongJae Park
 
Sw install with_without_docker
Sw install with_without_dockerSw install with_without_docker
Sw install with_without_dockerSeongJae Park
 
Git inter-snapshot public
Git  inter-snapshot publicGit  inter-snapshot public
Git inter-snapshot publicSeongJae Park
 
(Live) build and run golang web server on android.avi
(Live) build and run golang web server on android.avi(Live) build and run golang web server on android.avi
(Live) build and run golang web server on android.aviSeongJae Park
 
Deep dark-side of git: How git works internally
Deep dark-side of git: How git works internallyDeep dark-side of git: How git works internally
Deep dark-side of git: How git works internallySeongJae Park
 
Deep dark side of git - prologue
Deep dark side of git - prologueDeep dark side of git - prologue
Deep dark side of git - prologueSeongJae Park
 
DO YOU WANT TO USE A VCS
DO YOU WANT TO USE A VCSDO YOU WANT TO USE A VCS
DO YOU WANT TO USE A VCSSeongJae Park
 
Experimental android hacking using reflection
Experimental android hacking using reflectionExperimental android hacking using reflection
Experimental android hacking using reflectionSeongJae Park
 
Let the contribution begin
Let the contribution beginLet the contribution begin
Let the contribution beginSeongJae Park
 

More from SeongJae Park (20)

Biscuit: an operating system written in go
Biscuit:  an operating system written in goBiscuit:  an operating system written in go
Biscuit: an operating system written in go
 
GCMA: Guaranteed Contiguous Memory Allocator
GCMA: Guaranteed Contiguous Memory AllocatorGCMA: Guaranteed Contiguous Memory Allocator
GCMA: Guaranteed Contiguous Memory Allocator
 
Design choices of golang for high scalability
Design choices of golang for high scalabilityDesign choices of golang for high scalability
Design choices of golang for high scalability
 
Brief introduction to kselftest
Brief introduction to kselftestBrief introduction to kselftest
Brief introduction to kselftest
 
Let the contribution begin (EST futures)
Let the contribution begin  (EST futures)Let the contribution begin  (EST futures)
Let the contribution begin (EST futures)
 
Porting golang development environment developed with golang
Porting golang development environment developed with golangPorting golang development environment developed with golang
Porting golang development environment developed with golang
 
gcma: guaranteed contiguous memory allocator
gcma:  guaranteed contiguous memory allocatorgcma:  guaranteed contiguous memory allocator
gcma: guaranteed contiguous memory allocator
 
An introduction to_golang.avi
An introduction to_golang.aviAn introduction to_golang.avi
An introduction to_golang.avi
 
Develop Android/iOS app using golang
Develop Android/iOS app using golangDevelop Android/iOS app using golang
Develop Android/iOS app using golang
 
Develop Android app using Golang
Develop Android app using GolangDevelop Android app using Golang
Develop Android app using Golang
 
Sw install with_without_docker
Sw install with_without_dockerSw install with_without_docker
Sw install with_without_docker
 
Git inter-snapshot public
Git  inter-snapshot publicGit  inter-snapshot public
Git inter-snapshot public
 
(Live) build and run golang web server on android.avi
(Live) build and run golang web server on android.avi(Live) build and run golang web server on android.avi
(Live) build and run golang web server on android.avi
 
Deep dark-side of git: How git works internally
Deep dark-side of git: How git works internallyDeep dark-side of git: How git works internally
Deep dark-side of git: How git works internally
 
Deep dark side of git - prologue
Deep dark side of git - prologueDeep dark side of git - prologue
Deep dark side of git - prologue
 
DO YOU WANT TO USE A VCS
DO YOU WANT TO USE A VCSDO YOU WANT TO USE A VCS
DO YOU WANT TO USE A VCS
 
Experimental android hacking using reflection
Experimental android hacking using reflectionExperimental android hacking using reflection
Experimental android hacking using reflection
 
ash
ashash
ash
 
Hacktime for adk
Hacktime for adkHacktime for adk
Hacktime for adk
 
Let the contribution begin
Let the contribution beginLet the contribution begin
Let the contribution begin
 

Recently uploaded

Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Laying the Data Foundations for Artificial Intelligence!
Laying the Data Foundations for Artificial Intelligence!Laying the Data Foundations for Artificial Intelligence!
Laying the Data Foundations for Artificial Intelligence!Memoori
 
Software Security in the Real World w/Kelsey Hightower
Software Security in the Real World w/Kelsey HightowerSoftware Security in the Real World w/Kelsey Hightower
Software Security in the Real World w/Kelsey HightowerAnchore
 
full stack practical assignment msc cs.pdf
full stack practical assignment msc cs.pdffull stack practical assignment msc cs.pdf
full stack practical assignment msc cs.pdfHulkTheDevil
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxAna-Maria Mihalceanu
 
A PowerPoint Presentation on Vikram Lander pptx
A PowerPoint Presentation on Vikram Lander pptxA PowerPoint Presentation on Vikram Lander pptx
A PowerPoint Presentation on Vikram Lander pptxatharvdev2010
 
QMMS Lesson 2 - Using MS Excel Formula.pdf
QMMS Lesson 2 - Using MS Excel Formula.pdfQMMS Lesson 2 - Using MS Excel Formula.pdf
QMMS Lesson 2 - Using MS Excel Formula.pdfROWELL MARQUINA
 
Automation Ops Series: Session 3 - Solutions management
Automation Ops Series: Session 3 - Solutions managementAutomation Ops Series: Session 3 - Solutions management
Automation Ops Series: Session 3 - Solutions managementDianaGray10
 
Women in Automation 2024: Career session - explore career paths in automation
Women in Automation 2024: Career session - explore career paths in automationWomen in Automation 2024: Career session - explore career paths in automation
Women in Automation 2024: Career session - explore career paths in automationDianaGray10
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentMahmoud Rabie
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
HCI Lesson 1 - Introduction to Human-Computer Interaction.pdf
HCI Lesson 1 - Introduction to Human-Computer Interaction.pdfHCI Lesson 1 - Introduction to Human-Computer Interaction.pdf
HCI Lesson 1 - Introduction to Human-Computer Interaction.pdfROWELL MARQUINA
 
Arti Languages Pre Seed Pitchdeck 2024.pdf
Arti Languages Pre Seed Pitchdeck 2024.pdfArti Languages Pre Seed Pitchdeck 2024.pdf
Arti Languages Pre Seed Pitchdeck 2024.pdfwill854175
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
Deliver Latency Free Customer Experience
Deliver Latency Free Customer ExperienceDeliver Latency Free Customer Experience
Deliver Latency Free Customer ExperienceOpsTree solutions
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...BookNet Canada
 
Tetracrom printing process for packaging with CMYK+
Tetracrom printing process for packaging with CMYK+Tetracrom printing process for packaging with CMYK+
Tetracrom printing process for packaging with CMYK+Antonio de Llamas
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...Karmanjay Verma
 

Recently uploaded (20)

Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Laying the Data Foundations for Artificial Intelligence!
Laying the Data Foundations for Artificial Intelligence!Laying the Data Foundations for Artificial Intelligence!
Laying the Data Foundations for Artificial Intelligence!
 
Software Security in the Real World w/Kelsey Hightower
Software Security in the Real World w/Kelsey HightowerSoftware Security in the Real World w/Kelsey Hightower
Software Security in the Real World w/Kelsey Hightower
 
full stack practical assignment msc cs.pdf
full stack practical assignment msc cs.pdffull stack practical assignment msc cs.pdf
full stack practical assignment msc cs.pdf
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance Toolbox
 
A PowerPoint Presentation on Vikram Lander pptx
A PowerPoint Presentation on Vikram Lander pptxA PowerPoint Presentation on Vikram Lander pptx
A PowerPoint Presentation on Vikram Lander pptx
 
QMMS Lesson 2 - Using MS Excel Formula.pdf
QMMS Lesson 2 - Using MS Excel Formula.pdfQMMS Lesson 2 - Using MS Excel Formula.pdf
QMMS Lesson 2 - Using MS Excel Formula.pdf
 
BoSEU24 | Bill Thompson | Talk From Another Century
BoSEU24 | Bill Thompson | Talk From Another CenturyBoSEU24 | Bill Thompson | Talk From Another Century
BoSEU24 | Bill Thompson | Talk From Another Century
 
Automation Ops Series: Session 3 - Solutions management
Automation Ops Series: Session 3 - Solutions managementAutomation Ops Series: Session 3 - Solutions management
Automation Ops Series: Session 3 - Solutions management
 
Women in Automation 2024: Career session - explore career paths in automation
Women in Automation 2024: Career session - explore career paths in automationWomen in Automation 2024: Career session - explore career paths in automation
Women in Automation 2024: Career session - explore career paths in automation
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career Development
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
HCI Lesson 1 - Introduction to Human-Computer Interaction.pdf
HCI Lesson 1 - Introduction to Human-Computer Interaction.pdfHCI Lesson 1 - Introduction to Human-Computer Interaction.pdf
HCI Lesson 1 - Introduction to Human-Computer Interaction.pdf
 
Arti Languages Pre Seed Pitchdeck 2024.pdf
Arti Languages Pre Seed Pitchdeck 2024.pdfArti Languages Pre Seed Pitchdeck 2024.pdf
Arti Languages Pre Seed Pitchdeck 2024.pdf
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
Deliver Latency Free Customer Experience
Deliver Latency Free Customer ExperienceDeliver Latency Free Customer Experience
Deliver Latency Free Customer Experience
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
 
Tetracrom printing process for packaging with CMYK+
Tetracrom printing process for packaging with CMYK+Tetracrom printing process for packaging with CMYK+
Tetracrom printing process for packaging with CMYK+
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
 

Linux Kernel Memory Model

  • 1. Linux Kernel Memory Model SeongJae Park <sj38.park@gmail.com>
  • 2. These slides were presented during 4th Korea Linux Kernel Community Meetup (https://kernel-dev-ko.github.io/4th/)
  • 3. This work by SeongJae Park is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/.
  • 4. I, SeongJae Park ● SeongJae Park <sj38.park@gmail.com> ● PhD candidate researcher at DCSLAB, SNU ● Part time linux kernel programmer at KOSSLAB ● Interested in memory management and parallel programming for OS kernels
  • 5. It’s Multi-processor Era, Now ● Processor vendors changed their mind to increase number of cores instead of clock speed more than a decade ago ○ Now, multi-core systems are prevalent ○ Even octa-core system in your pocket, maybe? ● As a result, the free lunch is over; parallel programming is essential for high performance and scalability http://www.gotw.ca/images/CPU.png GIVE UP :’(
  • 6. Writing Correct Parallel Program is Hard ● Nature of parallelism is counter-intuitive to human ○ Time is relative, order is ambiguous, and only observations exist ● Compilers and processors do reorderings for performance ● C language developed with Uni-Processor assumption ○ Standard for the kernel (C99) is oblivious of multi-processor systems CPU 0 CPU 1 X = 1; Y = 1; X = 2; Y = 2; if (Y == 2 && X == 1) OOPS(); CPU 1 can call OOPS() in Linux Kernel
  • 8. Instruction Reordering Helps Performance ● By reordering dependent instructions far away, total time consumptions can be reduced ● If the reordering is guaranteed to not change the result of the instruction sequence, it would be helpful for better performance ● Such reordering is legal and recommended fetch decode execute fetch decode execute fetch decode execute Instruction 1 Instruction 3 Instruction 2 instruction 2 depends on result of instruction 1 (e.g., first instruction modifies opcode of next instruction) By reordering instruction 2 and 3, total execution time can be shorten 6 cycles for 3 instructions: 2 cycles per instruction
  • 9. Time and Order is Relative ● Each CPU concurrently generates their events and observes effects of events from others ● It is impossible to define absolute order of two concurrent events; Only relative observation order is possible CPU 1 CPU 2 CPU 3 CPU 4 Generated event 1 Generated event 2 Observed event 1 followed by event 2 I observed event 2 followed by event 1 Event Bus
  • 10. Cache Coherency is Per-CacheLine ● It is well known that cache coherency protocol helps system memory consistency ● In actual, it guarantees only per-cacheline sequential consistency ● Every CPU will eventually agree about global order of each cacheline, but some CPU can aware the change faster than others, order of changes between different variables is not guaranteed CPU 1 CPU 2 CPU 3 CPU 4 A = 1; (1) B = 1; (2) A = 2; (3) B = 2; (4) The order is: (1), (2), (3), (4) Coherent cache protocol The order is: (1), (3), (2), (4) The order is: (2), (1), (4), (3) Every CPU is honest; The system is legally cache-coherent
  • 11. An Example with A Mythological Architecture ● Many systems equip hierarchical memory for better performance and space ● Propagation speed of an event to a given core can be influenced by specific sub-layer of memory If CPU 0 Message Queue is busy, CPU 2 can observe an event from CPU 0 (event A) after an event of CPU 1 (event B) though CPU 1 observed event A before generating event B CPU 0 CPU 1 Cache CPU 0 Message Queue CPU 1 Message Queue Memory CPU 2 CPU 3 Cache CPU 2 Message Queue CPU 3 Message Queue Bus
  • 12. An Example with A Mythological Architecture ● Many systems equip hierarchical memory for better performance and space ● Propagation speed of an event to a given core can be influenced by specific sub-layer of memory If CPU 0 Message Queue is busy, CPU 2 can observe an event from CPU 0 (event A) after an event of CPU 1 (event B) though CPU 1 observed event A before generating event B CPU 0 CPU 1 Cache CPU 0 Message Queue CPU 1 Message Queue Memory CPU 2 CPU 3 Cache CPU 2 Message Queue CPU 3 Message Queue Bus Generate Event A; Event A
  • 13. An Example with A Mythological Architecture ● Many systems equip hierarchical memory for better performance and space ● Propagation speed of an event to a given core can be influenced by specific sub-layer of memory If CPU 0 Message Queue is busy, CPU 2 can observe an event from CPU 0 (event A) after an event of CPU 1 (event B) though CPU 1 observed event A before generating event B CPU 0 CPU 1 Cache CPU 0 Message Queue CPU 1 Message Queue Memory CPU 2 CPU 3 Cache CPU 2 Message Queue CPU 3 Message Queue Bus Generate Event A; Seen Event A; Generate Event B; Event BEvent A
  • 14. An Example with A Mythological Architecture ● Many systems equip hierarchical memory for better performance and space ● Propagation speed of an event to a given core can be influenced by specific sub-layer of memory If CPU 0 Message Queue is busy, CPU 2 can observe an event from CPU 0 (event A) after an event of CPU 1 (event B) though CPU 1 observed event A before generating event B CPU 0 CPU 1 Cache CPU 0 Message Queue CPU 1 Message Queue Memory CPU 2 CPU 3 Cache CPU 2 Message Queue CPU 3 Message Queue Bus Generate Event A; Seen Event A; Generate Event B; Seen Event B; Event A Event B Busy… ;;;
  • 15. An Example with A Mythological Architecture ● Many systems equip hierarchical memory for better performance and space ● Propagation speed of an event to a given core can be influenced by specific sub-layer of memory If CPU 0 Message Queue is busy, CPU 2 can observe an event from CPU 0 (event A) after an event of CPU 1 (event B) though CPU 1 observed event A before generating event B CPU 0 CPU 1 Cache CPU 0 Message Queue CPU 1 Message Queue Memory CPU 2 CPU 3 Cache CPU 2 Message Queue CPU 3 Message Queue Bus Generate Event A; Seen Event A; Generate Event B; Seen Event B; Seen Event A; Event B Event A
  • 16. Yet Another Mythological Architecture ● Store Buffer and Invalidation Queue can reorder observation of stores and loads CPU 0 Cache Store Buffer Invalidation Queue Memory CPU 1 Cache Store Buffer Invalidation Queue Bus
  • 18. C-language Doesn’t Know Multi-Processor ● By the time of initial C-language development, multi-processor was rare ● As a result, C-language has only few guarantees about memory operations on multi-processor ● Undefined behavior is allowed for undefined case https://upload.wikimedia.org/wikipedia/commons/thumb/9/95/The_C_Programming _Language,_First_Edition_Cover_(2).svg/2000px-The_C_Programming_Languag e,_First_Edition_Cover_(2).svg.png
  • 19. Compiler Optimizes (Reorders) Code ● Clever compilers try hard (really hard) to optimize code for high IPC (not for human perspective goals) ○ Converts small, private functions to inline code ○ Reorder memory access code to minimize dependency ○ Simplify unnecessarily complex loops, ... ● Optimization uses term `Undefined behavior` as they want ○ It’s legal, but sometimes do insane things in programmer’s perspective ● Compilers reorder memory accesses based on the C-standard, which is oblivious of multi-processor systems; this can result in unintended programs ● C11 standard aware the multi-processor systems, though
  • 21. Synchronization Primitives ● Because reordering and asynchronous effect propagation is legal, synchronization primitives are necessary to write human intuitive program ● Most environments including the Linux kernel provide sufficiently enough synchronization primitives for human intuitive programs https://s-media-cache-ak0.pinimg.com/236x/42/bc/55/42bc55a6d7e5affe2d0dbe9c872a3df9.jpg
  • 22. Atomic Operations ● Atomic operations are configured with multiple sub operations ○ E.g., compare1) -and-exchange2) , fetch1) -and-add2) , and test1) -and-set2) ● Atomic operations have mutual exclusiveness ○ Middle state of atomic operation execution cannot be seen by others ○ It can be thought of as small critical section that protected by a global lock ● Linux also provides plenty of atomic operations http://www.scienceclarified.com/photos/atomic-mass-3024.jpg
  • 23. Memory Barriers ● In general, memory barriers guarantee effects of memory operations issued before it to be propagated to other components in the system (e.g., processor) before memory operations issued after the barrier ● Linux provides various types of memory barriers including compiler memory barriers, hardware memory barriers, load barriers, store barriers CPU 1 CPU 2 CPU 3 READ A; WRITE B; <BARRIER>; READ C; READ A, WRITE B, then READ C occurred WRITE B, READ A, then READ C occurred READ A and WRITE B can be reordered but READ C is guaranteed to be ordered after {READ A, WRITE B}
  • 24. Partial Ordering Primitives ● Partial ordering primitives include read / write memory barriers, acquire / release semantics and data / address / control dependencies ● Semantics of partial ordering primitives are somewhat confusing ● Cost of partial ordering can be much cheaper than fully ordering primitives ● Linux also provides partial ordering primitives https://www.ternvets.co.uk/wp-content/uploads/2017/02/cat-with-cone.jpg
  • 25. Cost of Synchronization Primitives ● Generally, synchronization primitives are expensive, unscalable ● Each primitive costs differently; the gap is significant ● Correct selection and efficient use of synchronization primitives are important for the performance and scalability Performance of various synchronization primitives
  • 27. Each Environment Provides Own Memory Model ● Memory model, a.k.a Memory consistency model ● Defines what values can be obtained by load instructions of given code ● Programming environments including Instruction Set Architectures, Programming languages, Operating systems, etc define their memory model ○ Modern language memory models (e.g., Golang, Rust, Java, C11, …) aware multi-processor, while C99, which is for the Linux kernel, doesn’t http://www.sciencemag.org/sites/default/files/styles/article_main_large/public/Memory.jpg?itok=4FmHo7M5
  • 28. Each ISA Provides Specific Memory Model ● Some architectures have stricter ordering enforcement rule than others ● PA-RISC CPUs are strictest, Alpha is weakest ● Because Linux kernel supports multiple architectures, it defines its memory model based on weakest one, the Alpha https://kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.2015.01.31a.pdf
  • 29. Linux Kernel Memory Model (LKMM) ● The memory model for Linux kernel programming environment ○ Defines what values can be obtained, given a piece of Linux kernel code, for specific load instructions in the code ● Linux kernel original memory model is necessary because ○ It uses C99 but C99 memory model is oblivious of multi-processor environment; C11 memory model aware of multi-processor, but Torvalds doesn’t want it ○ It supports multiple architectures with different ISA-level memory model https://github.com/sjp38/perfbook
  • 30. LKMM Providing Ordering Primitives ● Designed for weakest memory model architecture, Alpha ○ Almost every combination of reordering is possible, doesn’t provide address dependency ● Atomic instructions ○ atomix_xchg(), atomic_inc_return(), atomic_dec_return(), … ○ Most of those just use mapping instruction, but provide general guarantees at Kernel level ● Memory barriers ○ Compiler barriers: WRITE_ONCE(), READ_ONCE(), barrier(), ... ○ CPU barriers: mb(), wmb(), rmb(), smp_mb(), smp_wmb(), smp_rmb(), … ○ Semantic barriers: ACQUIRE operations, RELEASE operations, dependencies, … ○ For detail, refer to https://www.kernel.org/doc/Documentation/memory-barriers.txt ● Because different memory ordering primitive has different cost, only necessary ordering primitives should be used in necessary case for high performance and scalability
  • 31. Informal LKMM ● Originally, LKMM was just an informal text, ‘memory-barriers.txt’ ● It explains about the Linux Kernel Memory Model in English (There is Korean translation, too: https://www.kernel.org/doc/Documentation/translations/ko_KR/memory-barriers.txt) ● To use the LKMM to prove your code, you should use Feynman Algorithm ○ Write down your code ○ Think real hard with the ‘memory-barriers.txt’ ○ Write down your provement ○ Hard and unreliable, of course! https://www.kernel.org/doc/Documentation/memory-barriers.txt
  • 32. Formal and Executable LKMM: Help Arrives at Last ● Jade Alglave, Paul E. McKenney, and some guys made formal LKMM ● It is formal, executable memory model ○ It receives C-like simple code as input ○ The code containing parallel code snippets and a question: can this result happen? ● Provides model for each ordering primitive of LKMM based on concepts including orders and cycles ● Can be executed with herd7 and klitmus7 ○ LKMM extends herd7 and klitmus7 to support LKMM ordering primitives in code ○ herd7 based execution does proof in user mode ○ klitmus7 based execution generates test running kernel module and wrapper scripts for execution of the module
  • 33. Steps for LKMM Execution ● Installation ○ LKMM is merged in Linux source tree at tools/memory-model; Just get the linux source code ○ Install herdtools7 (https://github.com/herd/herdtools7) ● Usage ○ Using herd7 based user mode proof $ herd7 -conf linux-kernel.cfg <your-litmus-test file> ○ Using klitmus7 based real kernel mode execution $ mkdir mymodules $ klitmus7 -o mymodules <your-litmus-test file> $ cd mymodules ; make $ sudo sh run.sh ● That’s it! Now you can prove your parallel code for all Linux environments!
  • 34. Summary ● Nature of parallel programming is counter-intuitive ○ Cannot define order of events without interaction ○ Ordering rule is different for different environment ○ Memory model defines their ordering rule ● For human-intuitive and correct program, interaction is necessary ○ Almost every environment provides memory ordering primitives including atomic instructions and memory barriers, which is expensive in common ○ Memory model defines what result can occur and cannot with given code snippet ● Formal Linux kernel memory model is available ○ Linux kernel provides its memory model based on weakest memory ordering rule architecture it supports, the Alpha, C99, and its original ordering primitives including RCU ○ Formal and executable LKMM is in the kernel tree; now you can prove your parallel code!
  • 37. `herdtools` Install ● LKMM (in Linux v4.19) depends on herdtools version 7.49 ○ Depending version may change ● `herdtools` depends on OPAM, the package manager for OCaml ● Ubuntu package manager supports OPAM $ sudo apt install opam $ opam init && sudo opam update && sudo opam upgrade $ git clone https://github.com/herd/herdtools7 && cd herdtools7 $ git checkout 7.49 $ make all && make install $ herd7 -version 7.49, Rev: 93dcbdd89086d5f3e981b280d437309fdeb8b427
  • 38. Herd7 Based Userspace Litmus Tests Execution $ herd7 -conf linux-kernel.cfg litmus-tests/SB+fencembonceonces.litmus Test SB+fencembonceonces Allowed States 3 0:r0=0; 1:r0=1; 0:r0=1; 1:r0=0; 0:r0=1; 1:r0=1; No Witnesses Positive: 0 Negative: 3 Condition exists (0:r0=0 / 1:r0=0) Observation SB+fencembonceonces Never 0 3 Time SB+fencembonceonces 0.01 Hash=d66d99523e2cac6b06e66f4c995ebb48
  • 39. Klitmus7 Based Litmus Tests Execution $ mkdir my; klitmus7 -o my/ litmus-tests/SB+fencembonceonces.litmus $ cd klitmus_test; make; sudo sh ./run.sh ... Test SB+fencembonceonces Allowed Histogram (3 states) 16580117:>0:r0=1; 1:r0=0; 16402936:>0:r0=0; 1:r0=1; 3016947 :>0:r0=1; 1:r0=1; ... Positive: 0, Negative: 36000000 Condition exists (0:r0=0 / 1:r0=0) is NOT validated Hash=d66d99523e2cac6b06e66f4c995ebb48 Observation SB+fencembonceonces Never 0 36000000 ...