Linux Kernel Development
Ch9. An Introduction to Kernel Synchronization
Hewitt
• Shared resources require protection from concurrent
access because if multiple threads of execution access
and manipulate the data at the same time, the threads
may overwrite each other’s changes or access data
while it is in an inconsistent state.
• What makes synchronization so complicated?
– Linux 2.0 - Symmetrical multiprocessing (SMP) support was introduced.
– Linux 2.6 - The kernel became preemptive.
Critical Regions and Race Conditions
• Critical regions (Critical sections) - code paths that
access and manipulate shared data.
• Atomic operations - complete without interruption as if
the entire critical region were one indivisible instruction.
• Race condition - occurs when two threads of execution are
simultaneously executing within the same critical region.
• Synchronization - ensuring that unsafe concurrency is
prevented and that race conditions do not occur.
Why Do We Need Protection?
• ATM
• The Single Variable, i++
– atomic instruction
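The single-variable case can be sketched in userspace C. The sketch below replays, in one thread, the interleaving that loses an update when two threads each run i++, and then repeats the increments with a C11 atomic read-modify-write. The names sim_thread, racy_double_increment and atomic_double_increment are hypothetical, and the starting value 7 in the test is only illustrative. C11 <stdatomic.h> stands in for the kernel's own atomic API here.

```c
#include <stdatomic.h>

/* One simulated thread's private register, standing in for a CPU register. */
struct sim_thread { int reg; };

/* The three machine-level steps hidden inside a plain i++. */
static void step_load(struct sim_thread *t, const int *i)  { t->reg = *i; }
static void step_add(struct sim_thread *t)                 { t->reg += 1; }
static void step_store(const struct sim_thread *t, int *i) { *i = t->reg; }

/* Replay the bad interleaving: both threads load before either stores. */
int racy_double_increment(int start)
{
    int i = start;
    struct sim_thread a, b;

    step_load(&a, &i);   /* A reads the original value */
    step_load(&b, &i);   /* B reads the same original value */
    step_add(&a);
    step_add(&b);
    step_store(&a, &i);  /* A writes start + 1 */
    step_store(&b, &i);  /* B also writes start + 1, so A's update is lost */
    return i;
}

/* With an atomic read-modify-write the three steps cannot be split apart. */
int atomic_double_increment(int start)
{
    atomic_int i;

    atomic_init(&i, start);
    atomic_fetch_add(&i, 1);   /* first "thread" */
    atomic_fetch_add(&i, 1);   /* second "thread" */
    return atomic_load(&i);
}
```

Calling racy_double_increment(7) yields 8 rather than the intended 9, while atomic_double_increment(7) yields 9. The race is simulated deterministically rather than provoked with real threads, so the result does not depend on scheduling.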
Locking
• A mechanism for preventing access to a resource while
another thread of execution is in the marked region.
• It works much like a lock on a door.
• The most significant difference between the various
mechanisms is the behavior when the lock is unavailable
because another thread already holds it – busy wait or sleep.
• Locks are implemented using atomic operations that ensure
no race exists.
• Causes of Concurrency
– Pseudo concurrency - Two things do not actually happen at the
same time but interleave with each other, which may be caused
by preemption or signal.
– True concurrency - On a symmetrical multiprocessing machine, two
processes can actually execute in a critical region at the
exact same time.
– Interrupts - An interrupt can occur asynchronously at almost any
time, interrupting the currently executing code.
– Softirqs and tasklets - The kernel can raise or schedule a softirq
or tasklet at almost any time, interrupting the currently executing
code.
– Kernel preemption - Because the kernel is preemptive, one task
in the kernel can preempt another.
– Sleeping and synchronization with user-space - A task in the
kernel can sleep and thus invoke the scheduler, resulting in the
running of a new process.
– Symmetrical multiprocessing - Two or more processors can
execute kernel code at exactly the same time.
• Knowing What to Protect
– Most global kernel data structures need protection. A good rule of thumb is
that if another thread of execution can access the data, the data
needs some sort of locking; if anyone else can see it, lock it.
– Remember to lock data, not code.
– Provide appropriate protection for the most pessimistic case,
SMP with kernel preemption, and all scenarios will be covered.
Deadlocks
• A deadlock occurs when each thread waits for a resource held by
another, so that none of them ever makes progress.
• Simple rules of using lock
– Implement lock ordering. Nested locks must always be obtained
in the same order.
– Prevent starvation.
– Do not double acquire the same lock.
– Design for simplicity. Complexity in your locking scheme invites
deadlocks.
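The lock-ordering rule can be sketched as follows. This is a minimal userspace illustration, not kernel code: tiny_lock, lock_pair and unlock_pair are hypothetical names, and ordering by address is just one possible convention (an assumption of this sketch) for making every caller take nested locks in the same order.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical minimal busy-wait lock built on a C11 atomic flag. */
typedef struct { atomic_flag held; } tiny_lock;
#define TINY_LOCK_INIT { ATOMIC_FLAG_INIT }

static void tiny_lock_acquire(tiny_lock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;  /* spin until the holder clears the flag */
}

static void tiny_lock_release(tiny_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * Acquire two locks in one global order (lowest address first), whatever
 * order the caller names them in. Two threads calling lock_pair(&x, &y)
 * and lock_pair(&y, &x) therefore cannot each hold one lock while
 * waiting for the other.
 */
void lock_pair(tiny_lock *a, tiny_lock *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        tiny_lock *tmp = a;
        a = b;
        b = tmp;
    }
    tiny_lock_acquire(a);
    if (b != a)                /* never double acquire the same lock */
        tiny_lock_acquire(b);
}

void unlock_pair(tiny_lock *a, tiny_lock *b)
{
    tiny_lock_release(a);
    if (b != a)
        tiny_lock_release(b);
}
```

Release order does not matter for deadlock avoidance; only the acquisition order has to be consistent.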
Contention and Scalability
• Consider a linked list
– Lock for the entire list
– Lock for each node
– Lock for each element in each node
• Locking that is too coarse results in poor scalability if
there is high lock contention, whereas locking that is too
fine results in wasteful overhead if there is little lock
contention.
• Start simple and grow in complexity only as needed.
Simplicity is key.
Conclusion
• Making your code SMP-safe is not something that can
be added as an afterthought.
• Proper synchronization - locking that is free of deadlocks,
scalable, and clean - requires design decisions from start
through finish.
Linux Kernel Development
Ch10. Kernel Synchronization Methods
Hewitt
Atomic Operations
• Provide instructions that execute atomically - without
interruption.
• The foundation on which other synchronization methods
are built.
• Some architectures, lacking direct atomic operations,
provide an operation to lock the memory bus for a single
operation, thus guaranteeing that another memory-
affecting operation cannot occur simultaneously.
• Atomic Integer Operations
– The atomic integer methods operate on a special data type,
atomic_t.
– Using a special type ensures that the data is not passed to any
nonatomic functions.
– It also ensures the compiler does not (erroneously but cleverly)
optimize access to the value.
– It can hide any architecture-specific differences in its
implementation.
• 64-Bit Atomic Operations
– Functions are prefixed with atomic64 in lieu of atomic.
– For portability between all Linux’s supported architectures,
developers should use the 32-bit atomic_t type. The 64-bit
atomic64_t is reserved for code that is both architecture-specific
and requires 64 bits.
• Atomic Bitwise Operations
– The nonatomic functions are prefixed by double underscores.
– Real atomicity requires that all intermediate states be correctly
realized.
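As a rough userspace analogue of the atomic integer and bitwise operations above, the sketch below uses C11 <stdatomic.h> rather than the kernel's atomic_t interface. The helper names counter_dec_and_test and word_test_and_set_bit are hypothetical; they only mirror the behavior of the kernel's atomic_dec_and_test() and test_and_set_bit().

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Decrement the counter and report whether it reached zero.
 * atomic_fetch_sub() returns the value held before the subtraction,
 * so a previous value of 1 means the counter is now 0.
 */
bool counter_dec_and_test(atomic_int *v)
{
    return atomic_fetch_sub(v, 1) == 1;
}

/*
 * Set bit nr in the word and return the bit's previous state.
 * The read, the OR and the write happen as one indivisible step.
 */
bool word_test_and_set_bit(unsigned nr, atomic_ulong *word)
{
    unsigned long mask = 1UL << nr;

    return (atomic_fetch_or(word, mask) & mask) != 0;
}
```

A typical use of the first helper is reference counting: the caller that sees true is the one that dropped the last reference and may free the object.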
Spin Locks
• If a thread of execution attempts to acquire a spin lock
while it is already held (the lock is then called contended),
the thread busy loops (spins) waiting for the lock to
become available. If the lock is not contended, the
thread can immediately acquire the lock and continue.
• The spinning prevents more than one thread of
execution from entering the critical region.
• It is wise to hold spin locks for less than the duration of
two context switches.
• Spin Lock Methods
– Linux kernel’s spin locks are not recursive.
– Spin locks can be used in interrupt handlers, whereas
semaphores cannot be used because they sleep.
– If a lock is used in an interrupt handler, you must also disable
local interrupts (interrupt requests on the current processor)
before obtaining the lock.
– [Annotations from the slide's code listings, which are not reproduced
here: "disable kernel preemption & disable interrupts", "disable kernel
preemption", "not recommended".]
• Spin Locks and Bottom Halves
– Because a bottom half might preempt process context code, if
data is shared between a bottom half and process context, you must
protect the data in process context with both a lock and the
disabling of bottom halves.
– Two tasklets of the same type never run simultaneously.
Thus, there is no need to protect data used only within a single
type of tasklet.
– …
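The busy-wait behavior described in this section can be sketched in userspace with a C11 atomic flag. This is an illustration only: sketch_spinlock and its functions are hypothetical names, and the sketch does not disable kernel preemption or interrupts the way the kernel's spin lock functions do.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_flag locked; } sketch_spinlock;
#define SKETCH_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once: returns true if the lock was free and is now held. */
bool sketch_spin_trylock(sketch_spinlock *l)
{
    /* test_and_set returns the old flag value; false means it was free. */
    return !atomic_flag_test_and_set_explicit(&l->locked,
                                              memory_order_acquire);
}

/* Contended case: keep retrying (spinning) until the holder releases. */
void sketch_spin_lock(sketch_spinlock *l)
{
    while (!sketch_spin_trylock(l))
        ;  /* busy wait */
}

void sketch_spin_unlock(sketch_spinlock *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

Like the kernel's spin locks, this lock is not recursive: a second sketch_spin_lock() by the current holder would spin forever, which is why the trylock variant is the safe way to probe it.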
Reader-Writer Spin Locks
• One or more readers can concurrently hold the reader
lock. The writer lock, conversely, can be held by at most
one writer with no concurrent readers.
• An important consideration with Linux reader-writer spin locks
is that they favor readers over writers.
Semaphore
• Semaphores in Linux are sleeping locks.
• When a task attempts to acquire a semaphore that is
unavailable, the semaphore places the task onto a wait
queue and puts the task to sleep. The processor is then
free to execute other code.
• Unlike spin locks, semaphores do not disable kernel
preemption and, consequently, code holding a
semaphore can be preempted. This means semaphores
do not adversely affect scheduling latency.
• Counting and Binary Semaphores
– Counting semaphore - the number of permissible simultaneous
holders of semaphores can be set at declaration time. This value
is called the usage count or simply the count.
– Binary semaphore – one lock holder at a time. Also called a
mutex, because it enforces mutual exclusion.
– A semaphore supports two atomic operations, P() and V(). Later
systems called these methods down() and up(), respectively, and
so does Linux.
– Counting semaphores are not used to enforce mutual exclusion.
They enforce limits in certain code.
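The usage count can be sketched with a compare-and-swap loop. This userspace illustration covers only the counting logic in a non-blocking form: sketch_sem, sketch_down_trylock and sketch_up are hypothetical names, and the wait queue and sleeping behavior of a real semaphore are deliberately left out. Unlike the kernel's down_trylock(), the helper here returns true on success.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_int count; } sketch_sem;

/* count is the number of holders allowed at the same time. */
void sketch_sem_init(sketch_sem *s, int count)
{
    atomic_init(&s->count, count);
}

/*
 * Non-blocking P()/down(): take one unit if any is available.
 * A real semaphore would put the task to sleep instead of returning false.
 */
bool sketch_down_trylock(sketch_sem *s)
{
    int c = atomic_load(&s->count);

    while (c > 0) {
        /* On failure the CAS reloads c with the current count. */
        if (atomic_compare_exchange_weak(&s->count, &c, c - 1))
            return true;
    }
    return false;
}

/* V()/up(): give one unit back. */
void sketch_up(sketch_sem *s)
{
    atomic_fetch_add(&s->count, 1);
}
```

With a count of one the sketch behaves as a binary semaphore; with a larger count it caps the number of simultaneous holders.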
Reader-Writer Semaphores
• All reader-writer semaphores are mutexes - that is, their
usage count is one - although they enforce mutual
exclusion only for writers, not readers.
• Reader-writer semaphores have a unique method that
their reader-writer spin lock cousins do not have:
downgrade_write(). This function atomically converts an
acquired write lock to a read lock.
• It is worthwhile using only if your code naturally splits
along a reader/writer boundary.
Mutexes
• A mutex behaves similarly to a semaphore with a count of one,
but it has a simpler interface, more efficient performance,
and additional constraints on its use.
– Only one task can hold the mutex at a time. That is, the usage
count on a mutex is always one.
– Whoever locked a mutex must unlock it.
– Recursive locks and unlocks are not allowed. That is, you cannot
recursively acquire the same mutex, and you cannot unlock an
unlocked mutex.
– A process cannot exit while holding a mutex.
– A mutex cannot be acquired by an interrupt handler or bottom
half, even with mutex_trylock().
– A mutex can be managed only via the official API: It must be
initialized via the methods described in this section and cannot
be copied, hand initialized, or reinitialized.
• Semaphores Versus Mutexes
– Unless one of the mutex’s additional constraints prevents you from
using it, prefer the new mutex type to semaphores.
• Spin Locks Versus Mutexes
– Only a spin lock can be used in interrupt context, whereas only a
mutex can be held while a task sleeps.
Completion Variables
• A completion variable is an easy way to synchronize between two tasks in the
kernel when one task needs to signal to the other that an
event has occurred.
BKL: The Big Kernel Lock
• A global spin lock that was created to ease the transition
from Linux’s original SMP implementation to fine-grained
locking.
• BKL properties:
– Sleep is allowed while holding the BKL. The lock is automatically
dropped when the task is unscheduled and reacquired when the
task is rescheduled.
– The BKL is a recursive lock.
– BKL only can be used in process context.
– New users of the BKL are forbidden.
Sequential Locks
• A sequential lock, generally shortened to seq lock, is a newer
type of lock introduced in the 2.6 kernel.
• It works by maintaining a sequence counter.
– Whenever the data in question is written to, a lock is obtained
and a sequence number is incremented. Prior to and after
reading the data, the sequence number is read. If the values are
the same, a write did not begin in the middle of the read. Further,
if the values are even, a write is not underway.
• Seq locks are useful to provide a lightweight and
scalable lock for use with many readers and a few
writers. Seq locks, however, favor writers over readers.
• A prominent user of the seq lock is jiffies, the variable
that stores a Linux machine’s uptime.
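The sequence-counter protocol can be sketched in userspace C11. This is an assumption-laden illustration, not the kernel's implementation: all sketch_ names are hypothetical, the protected data is a pair of integers chosen for the example, and writers are assumed to be serialized by some other lock, which is omitted here.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_pair { int a, b; };

typedef struct {
    atomic_uint seq;   /* even: no write underway; odd: write in progress */
    atomic_int a, b;   /* the protected data */
} sketch_seqlock;

void sketch_seq_init(sketch_seqlock *s)
{
    atomic_init(&s->seq, 0);
    atomic_init(&s->a, 0);
    atomic_init(&s->b, 0);
}

/* Writer: make the counter odd, update the data, make it even again. */
void sketch_write(sketch_seqlock *s, int a, int b)
{
    unsigned start = atomic_load_explicit(&s->seq, memory_order_relaxed);

    atomic_store_explicit(&s->seq, start + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&s->a, a, memory_order_relaxed);
    atomic_store_explicit(&s->b, b, memory_order_relaxed);
    atomic_store_explicit(&s->seq, start + 2, memory_order_release);
}

/* Reader, step 1: sample the counter before reading the data. */
unsigned sketch_read_begin(sketch_seqlock *s)
{
    return atomic_load_explicit(&s->seq, memory_order_acquire);
}

/* Reader, step 2: retry if a write was underway or happened meanwhile. */
bool sketch_read_retry(sketch_seqlock *s, unsigned start)
{
    atomic_thread_fence(memory_order_acquire);
    return (start & 1u) ||
           atomic_load_explicit(&s->seq, memory_order_relaxed) != start;
}

/* Full read loop: repeat until a consistent snapshot is obtained. */
struct sketch_pair sketch_read(sketch_seqlock *s)
{
    struct sketch_pair p;
    unsigned start;

    do {
        start = sketch_read_begin(s);
        p.a = atomic_load_explicit(&s->a, memory_order_relaxed);
        p.b = atomic_load_explicit(&s->b, memory_order_relaxed);
    } while (sketch_read_retry(s, start));
    return p;
}
```

Readers never block the writer; they only repeat their read, which is why this scheme favors writers over readers.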
Preemption Disabling
• Because the kernel is preemptive, a process in the
kernel can stop running at any instant to enable a
process of higher priority to run.
• The kernel preemption code uses spin locks as markers
of nonpreemptive regions.
• Per-processor data may not require a spin lock, but it does
need kernel preemption disabled.
• A cleaner solution to per-processor data issues.
Ordering and Barriers
• When dealing with synchronization between multiple
processors or with hardware devices, it is sometimes a
requirement that memory-reads (loads) and memory-
writes (stores) issue in the order specified in your
program code.
• All processors that do reorder reads or writes provide
machine instructions to enforce ordering requirements.
• It is also possible to instruct the compiler not to reorder
instructions around a given point. These instructions are
called barriers.
• Barrier functions ensure that no loads or stores are reordered
across them.
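A userspace sketch of why ordering matters: one side writes a payload and then sets a flag, the other side checks the flag and then reads the payload. The C11 fences below play roughly the role that write and read barriers play in kernel code. The names payload, ready, publish and try_consume are hypothetical, and the fences are C11 library calls, not the kernel's barrier functions.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* ordinary data being handed over */
static atomic_bool ready;  /* static storage, so it starts out false */

/* Producer: the payload store must not be reordered after the flag store. */
void publish(int value)
{
    payload = value;
    atomic_thread_fence(memory_order_release);  /* write-side barrier */
    atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Consumer: the payload load must not be reordered before the flag load. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return false;
    atomic_thread_fence(memory_order_acquire);  /* read-side barrier */
    *out = payload;
    return true;
}
```

Without the two fences, a consumer on another processor could observe the flag as set and still read a stale payload.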

Linux kernel development chapter 10

  • 1. Linux Kernel Development Ch9. An Introduction to Kernel Synchronization Hewitt
  • 2. • Shared resources require protection from concurrent access because if multiple threads of execution access and manipulate the data at the same time, the threads may overwrite each other’s changes or access data while it is in an inconsistent state. • What makes synchronization so complicated? – Linux 2.0 - Symmetrical multiprocessing. – Linux 2.6 - Kernel preemption.
  • 3. Critical Regions and Race Conditions • Critical regions (critical sections) - code paths that access and manipulate shared data. • Atomic operations - operations that complete without interruption, as if the entire critical region were one indivisible instruction. • Race condition - a situation in which two threads of execution are simultaneously executing within the same critical region. • Synchronization - ensuring that unsafe concurrency is prevented and that race conditions do not occur.
  • 4. Why Do We Need Protection? • ATM • The Single Variable, i++ – atomic instruction
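The i++ case can be tried out in user space. The following is a minimal sketch, assuming C11 `<stdatomic.h>` and POSIX threads rather than the kernel's own atomic API; the names `worker` and `run_increments` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define NUM_THREADS 4
#define INCREMENTS_PER_THREAD 100000

/* Shared counter; atomic_int makes each increment one indivisible step. */
static atomic_int shared_counter;

static void *worker(void *unused)
{
    (void)unused;
    for (int i = 0; i < INCREMENTS_PER_THREAD; i++)
        atomic_fetch_add(&shared_counter, 1); /* atomic read-modify-write */
    return NULL;
}

/* Hypothetical helper: start the threads, wait for them, return the total. */
int run_increments(void)
{
    pthread_t threads[NUM_THREADS];

    atomic_store(&shared_counter, 0);
    for (int t = 0; t < NUM_THREADS; t++) {
        int rc = pthread_create(&threads[t], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int t = 0; t < NUM_THREADS; t++)
        pthread_join(threads[t], NULL);
    return atomic_load(&shared_counter);
}
```

With a plain `int` and `i++`, the load, add, and store of different threads could interleave and updates could be lost; the atomic version always ends at `NUM_THREADS * INCREMENTS_PER_THREAD`.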
  • 5. Locking • A mechanism for preventing access to a resource while another thread of execution is in the marked region. • It works much like a lock on a door. • The most significant difference between the various mechanisms is the behavior when the lock is unavailable because another thread already holds it – busy wait or sleep. • Locks are implemented using atomic operations that ensure no race exists.
  • 6. • Causes of Concurrency – Pseudo concurrency - Two things do not actually happen at the same time but interleave with each other, which may be caused by preemption or signals. – True concurrency - On a symmetrical multiprocessing machine, two processes can actually execute in a critical region at the exact same time.
  • 7. – Interrupts - An interrupt can occur asynchronously at almost any time, interrupting the currently executing code. – Softirqs and tasklets - The kernel can raise or schedule a softirq or tasklet at almost any time, interrupting the currently executing code. – Kernel preemption - Because the kernel is preemptive, one task in the kernel can preempt another. – Sleeping and synchronization with user-space - A task in the kernel can sleep and thus invoke the scheduler, resulting in the running of a new process. – Symmetrical multiprocessing - Two or more processors can execute kernel code at exactly the same time.
  • 8. • Knowing What to Protect – Most global kernel data structures need protection. A good rule of thumb is that if another thread of execution can access the data, the data needs some sort of locking; if anyone else can see it, lock it. – Remember to lock data, not code. – Provide appropriate protection for the most pessimistic case, SMP with kernel preemption, and all scenarios will be covered.
  • 9. Deadlocks • Each thread waits for a resource held by another, so none of them ever makes progress. • Simple rules for using locks – Implement lock ordering. Nested locks must always be obtained in the same order. – Prevent starvation. – Do not double acquire the same lock. – Design for simplicity. Complexity in your locking scheme invites deadlocks.
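The lock-ordering rule can be sketched in user space. This is only an illustration, assuming POSIX mutexes and an address-based ordering convention; `lock_pair` and `unlock_pair` are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/*
 * Acquire two distinct mutexes in one fixed global order (lowest address
 * first), so two callers passing the same pair in opposite argument order
 * cannot deadlock against each other.
 */
void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    assert(a != b); /* never double acquire the same lock */
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

/* Release order does not affect deadlock freedom. */
void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}
```

Callers may pass the two locks in either order; the helper always takes them in the same order internally, which is the property the rule asks for.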
  • 10. Contention and Scalability • Consider a linked list – Lock for the entire list – Lock for each node – Lock for each element in each node • Locking that is too coarse results in poor scalability if there is high lock contention, whereas locking that is too fine results in wasteful overhead if there is little lock contention. • Start simple and grow in complexity only as needed. Simplicity is key.
  • 11. Conclusion • Making your code SMP-safe is not something that can be added as an afterthought. • Proper synchronization - locking that is free of deadlocks, scalable, and clean - requires design decisions from start through finish.
  • 12. Linux Kernel Development Ch10. Kernel Synchronization Methods Hewitt
  • 13. Atomic Operations • Provide instructions that execute atomically - without interruption. • The foundation on which other synchronization methods are built. • Some architectures, lacking direct atomic operations, provide an operation to lock the memory bus for a single operation, thus guaranteeing that another memory-affecting operation cannot occur simultaneously.
  • 14. • Atomic Integer Operations – The atomic integer methods operate on a special data type, atomic_t. – Using a special type ensures that atomic_t values are not passed to any nonatomic functions. – It also ensures that the compiler does not (erroneously but cleverly) optimize access to the value. – It can hide any architecture-specific differences in its implementation.
  • 15.
  • 16. • 64-Bit Atomic Operations – Functions are prefixed with atomic64 in lieu of atomic. – For portability between all of Linux’s supported architectures, developers should use the 32-bit atomic_t type. The 64-bit atomic64_t is reserved for code that is both architecture-specific and requires 64 bits.
  • 17. • Atomic Bitwise Operations – The nonatomic functions are prefixed by double underscores. – Real atomicity requires that all intermediate states be correctly realized.
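An atomic test-and-set on a single bit can be sketched in user space. The sketch assumes C11 `<stdatomic.h>`; the `_sketch` function names are hypothetical and are not the kernel's bit operations, which are implemented per architecture.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Atomically set bit `nr` in *word and report whether it was already set. */
bool test_and_set_bit_sketch(unsigned int nr, atomic_ulong *word)
{
    assert(nr < sizeof(unsigned long) * CHAR_BIT);
    unsigned long mask = 1UL << nr;
    unsigned long old = atomic_fetch_or(word, mask); /* one indivisible OR */
    return (old & mask) != 0;
}

/* Atomically clear bit `nr` in *word and report whether it was set. */
bool test_and_clear_bit_sketch(unsigned int nr, atomic_ulong *word)
{
    assert(nr < sizeof(unsigned long) * CHAR_BIT);
    unsigned long mask = 1UL << nr;
    unsigned long old = atomic_fetch_and(word, ~mask); /* one indivisible AND */
    return (old & mask) != 0;
}
```

Because the old value comes back from the same indivisible operation that changes the bit, two threads racing to set the same bit cannot both see it as previously clear.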
  • 18. Spin Locks • If a thread of execution attempts to acquire a spin lock while it is already held, which is called contended, the thread busy loops – spins - waiting for the lock to become available. If the lock is not contended, the thread can immediately acquire the lock and continue. • The spinning prevents more than one thread of execution from entering the critical region. • It is wise to hold spin locks for less than the duration of two context switches.
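The busy-loop behavior can be sketched with a test-and-set flag. This is a user-space illustration assuming C11 `atomic_flag`; the `spin_sketch_*` names are hypothetical, and unlike the kernel's spin locks this sketch does not disable preemption or interrupts.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_flag locked; /* set while some thread holds the lock */
} spin_sketch_t;

#define SPIN_SKETCH_INIT { ATOMIC_FLAG_INIT }

/* Spin until the flag was observed clear and has been set by this caller. */
void spin_sketch_lock(spin_sketch_t *l)
{
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ; /* contended: busy-wait */
}

/* Single attempt; returns true if the lock was acquired. */
bool spin_sketch_trylock(spin_sketch_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

void spin_sketch_unlock(spin_sketch_t *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

The sketch is not recursive: a holder that calls `spin_sketch_lock()` again would spin forever, which is why the try variant reports failure even to the thread that already holds the lock.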
  • 19. • Spin Lock Methods – Linux kernel’s spin locks are not recursive. – Spin locks can be used in interrupt handlers, whereas semaphores cannot be used because they sleep. – If a lock is used in an interrupt handler, you must also disable local interrupts (interrupt requests on the current processor) before obtaining the lock. (Figure labels: "disable kernel preemption & disable interrupts"; "disable kernel preemption"; "not recommended".)
  • 20.
  • 21. • Spin Locks and Bottom Halves – Because a bottom half might preempt process context code, if data is shared between a bottom half and process context, you must protect the data in process context with both a lock and the disabling of bottom halves. – Two tasklets of the same type do not ever run simultaneously. Thus, there is no need to protect data used only within a single type of tasklet. – …
  • 22. Reader-Writer Spin Locks • One or more readers can concurrently hold the reader lock. The writer lock, conversely, can be held by at most one writer with no concurrent readers. • Linux reader-writer spin locks favor readers over writers.
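The sharing rule (many readers, or exactly one writer) can be observed with a POSIX reader-writer lock. This is a user-space analogue only: `pthread_rwlock_t` sleeps rather than spins, and `probe_rwlock` and `struct rw_probe` are hypothetical names for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct rw_probe {
    bool second_reader_ok;       /* a second reader joined the first */
    bool writer_ok_with_readers; /* a writer got in while readers held it */
    bool writer_ok_when_free;    /* a writer got in once the lock was free */
};

/* Probe the lock with non-blocking attempts and record what was admitted. */
struct rw_probe probe_rwlock(void)
{
    pthread_rwlock_t rw;
    struct rw_probe r;

    pthread_rwlock_init(&rw, NULL);

    pthread_rwlock_rdlock(&rw); /* first reader */
    r.second_reader_ok = (pthread_rwlock_tryrdlock(&rw) == 0);
    r.writer_ok_with_readers = (pthread_rwlock_trywrlock(&rw) == 0);
    if (r.writer_ok_with_readers)
        pthread_rwlock_unlock(&rw);
    if (r.second_reader_ok)
        pthread_rwlock_unlock(&rw); /* release the second read hold */
    pthread_rwlock_unlock(&rw);     /* release the first read hold */

    r.writer_ok_when_free = (pthread_rwlock_trywrlock(&rw) == 0);
    if (r.writer_ok_when_free)
        pthread_rwlock_unlock(&rw);

    pthread_rwlock_destroy(&rw);
    return r;
}
```

The write attempt is refused while any read hold exists and succeeds only once every reader has released the lock.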
  • 23.
  • 24. Semaphore • Semaphores in Linux are sleeping locks. • When a task attempts to acquire a semaphore that is unavailable, the semaphore places the task onto a wait queue and puts the task to sleep. The processor is then free to execute other code. • Unlike spin locks, semaphores do not disable kernel preemption and, consequently, code holding a semaphore can be preempted. This means semaphores do not adversely affect scheduling latency.
  • 25. • Counting and Binary Semaphores – Counting semaphore - the number of permissible simultaneous holders of the semaphore can be set at declaration time. This value is called the usage count or simply the count. – Binary semaphore – one lock holder at a time. Also called a mutex, because it enforces mutual exclusion. – A semaphore supports two atomic operations, P() and V(). Later systems called these methods down() and up(), respectively, and so does Linux. – Counting semaphores are not used to enforce mutual exclusion. They enforce limits in certain code.
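The usage count can be illustrated with a POSIX semaphore. This is a user-space sketch, assuming `<semaphore.h>` on Linux rather than the kernel's semaphore implementation; `take_slots` is a hypothetical helper that uses the non-blocking `sem_trywait()` so the example never sleeps.

```c
#include <assert.h>
#include <semaphore.h>

/*
 * Try to take `attempts` slots without blocking and return how many
 * succeeded. Each successful sem_trywait() is one "down" on the count.
 */
int take_slots(sem_t *sem, int attempts)
{
    int taken = 0;

    assert(attempts >= 0);
    for (int i = 0; i < attempts; i++) {
        if (sem_trywait(sem) == 0) /* fails once the count reaches zero */
            taken++;
    }
    return taken;
}
```

A semaphore initialized with a count of 2 admits two holders and refuses the third; each `sem_post()` (the "up" operation) frees one slot again.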
  • 26.
  • 27. Reader-Writer Semaphores • All reader-writer semaphores are mutexes - that is, their usage count is one - although they enforce mutual exclusion only for writers, not readers. • Reader-writer semaphores have a unique method that their reader-writer spin lock cousins do not have: downgrade_write(). This function atomically converts an acquired write lock to a read lock. • A reader-writer semaphore is worthwhile only if your code naturally splits along a reader/writer boundary.
  • 28. Mutexes • A mutex behaves similarly to a semaphore with a count of one, but it has a simpler interface, more efficient performance, and additional constraints on its use. – Only one task can hold the mutex at a time. That is, the usage count on a mutex is always one. – Whoever locked a mutex must unlock it. – Recursive locks and unlocks are not allowed. That is, you cannot recursively acquire the same mutex, and you cannot unlock an unlocked mutex. – A process cannot exit while holding a mutex. – A mutex cannot be acquired by an interrupt handler or bottom half, even with mutex_trylock(). – A mutex can be managed only via the official API: It must be initialized via the methods described in this section and cannot be copied, hand initialized, or reinitialized.
  • 29. • Semaphores Versus Mutexes – Unless one of the mutex’s additional constraints prevents you from using it, prefer the new mutex type to semaphores. • Spin Locks Versus Mutexes – Only a spin lock can be used in interrupt context, whereas only a mutex can be held while a task sleeps.
  • 30. Completion Variables • A completion variable is an easy way to synchronize between two tasks in the kernel when one task needs to signal to the other that an event has occurred.
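The signal-and-wait pattern can be sketched in user space with a mutex and a condition variable. This is an analogue under stated assumptions, not the kernel's implementation; all `*_sketch` names, `struct demo_state`, and `run_completion_demo` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct completion_sketch {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool done; /* true once the event has occurred */
};

static void completion_sketch_init(struct completion_sketch *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->done = false;
}

/* Sleep until another task signals that the event has occurred. */
static void wait_for_completion_sketch(struct completion_sketch *c)
{
    pthread_mutex_lock(&c->lock);
    while (!c->done) /* loop guards against spurious wakeups */
        pthread_cond_wait(&c->cond, &c->lock);
    pthread_mutex_unlock(&c->lock);
}

/* Mark the event as done and wake a waiter. */
static void complete_sketch(struct completion_sketch *c)
{
    pthread_mutex_lock(&c->lock);
    c->done = true;
    pthread_cond_signal(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

struct demo_state {
    struct completion_sketch ready;
    int result;
};

static void *initializer_task(void *arg)
{
    struct demo_state *s = arg;

    s->result = 42;             /* do the work first */
    complete_sketch(&s->ready); /* then announce it */
    return NULL;
}

/* One task initializes a value; the caller waits for the signal. */
int run_completion_demo(void)
{
    struct demo_state s;
    pthread_t tid;
    int rc;

    s.result = 0;
    completion_sketch_init(&s.ready);
    rc = pthread_create(&tid, NULL, initializer_task, &s);
    assert(rc == 0);
    (void)rc;
    wait_for_completion_sketch(&s.ready);
    int result = s.result; /* written before the completion was signaled */
    pthread_join(tid, NULL);
    return result;
}
```

The waiter never polls: it sleeps until the signaling task marks the event as done, which is the same division of roles the slide describes.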
  • 31. BKL: The Big Kernel Lock • A global spin lock that was created to ease the transition from Linux’s original SMP implementation to fine-grained locking. • BKL properties: – Sleep is allowed while holding the BKL. The lock is automatically dropped when the task is unscheduled and reacquired when the task is rescheduled. – The BKL is a recursive lock. – The BKL can be used only in process context. – New users of the BKL are forbidden.
  • 32. Sequential Locks • The sequential lock, generally shortened to seq lock, is a newer type of lock introduced in the 2.6 kernel. • It works by maintaining a sequence counter. – Whenever the data in question is written to, a lock is obtained and a sequence number is incremented. Prior to and after reading the data, the sequence number is read. If the values are the same, a write did not begin in the middle of the read. Further, if the values are even, a write is not underway. • Seq locks provide a lightweight and scalable lock for use with many readers and a few writers. Seq locks, however, favor writers over readers.
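The sequence-counter protocol can be sketched with C11 atomics. This is a simplified user-space illustration that assumes a single writer (so the writer-side lock is omitted); `struct seq_pair` and its functions are hypothetical names, not the kernel's seq lock API.

```c
#include <assert.h>
#include <stdatomic.h>

struct seq_pair {
    atomic_uint seq; /* even: data stable, odd: write in progress */
    atomic_int x;
    atomic_int y;
};

/* Single writer assumed: bump to odd, update the data, bump to even. */
void seq_pair_write(struct seq_pair *p, int x, int y)
{
    unsigned int s = atomic_load_explicit(&p->seq, memory_order_relaxed);

    atomic_store_explicit(&p->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release); /* odd count before the data */
    atomic_store_explicit(&p->x, x, memory_order_relaxed);
    atomic_store_explicit(&p->y, y, memory_order_relaxed);
    atomic_store_explicit(&p->seq, s + 2, memory_order_release);
}

/* Readers take no lock; they retry if a write overlapped the read. */
void seq_pair_read(struct seq_pair *p, int *x, int *y)
{
    unsigned int before, after;

    do {
        before = atomic_load_explicit(&p->seq, memory_order_acquire);
        *x = atomic_load_explicit(&p->x, memory_order_relaxed);
        *y = atomic_load_explicit(&p->y, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire); /* data before re-check */
        after = atomic_load_explicit(&p->seq, memory_order_relaxed);
    } while (before != after || (before & 1u)); /* changed or odd: retry */
}
```

Readers never block the writer, which is why this scheme favors writers: a reader that overlaps a write simply discards what it read and tries again.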
  • 33. • A prominent user of the seq lock is jiffies, the variable that stores a Linux machine’s uptime.
  • 34. Preemption Disabling • Because the kernel is preemptive, a process in the kernel can stop running at any instant to enable a process of higher priority to run. • The kernel preemption code uses spin locks as markers of nonpreemptive regions. • Per-processor data may not require a spin lock, but it does need kernel preemption disabled.
  • 35. • A cleaner solution to per-processor data issues.
  • 36. Ordering and Barriers • When dealing with synchronization between multiple processors or with hardware devices, it is sometimes a requirement that memory-reads (loads) and memory-writes (stores) issue in the order specified in your program code. • All processors that do reorder reads or writes provide machine instructions to enforce ordering requirements. • It is also possible to instruct the compiler not to reorder instructions around a given point. These instructions are called barriers.
  • 37. • Barrier functions ensure that no loads or stores are reordered across them.
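The ordering requirement can be sketched with C11 fences. This is a user-space analogue, assuming `atomic_thread_fence()` in place of the kernel's own barrier functions; `publish`, `try_consume`, `payload`, and `ready` are hypothetical names for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static int payload;       /* plain data, written before the flag */
static atomic_bool ready; /* flag that announces the payload */

/* Writer side: the store to payload must not move after the flag store. */
void publish(int value)
{
    payload = value;
    atomic_thread_fence(memory_order_release); /* write barrier analogue */
    atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Reader side: the load of payload must not move before the flag load. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return false;
    atomic_thread_fence(memory_order_acquire); /* read barrier analogue */
    *out = payload;
    return true;
}
```

Without the two fences, the compiler or processor would be free to reorder the flag and payload accesses, and a reader on another processor could see the flag set while still reading a stale payload.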