Process Scheduling
Darren Huang#791
Multitasking
• Multitasking operating systems come in two flavors: cooperative
multitasking and preemptive multitasking
• Linux implements preemptive multitasking
• In preemptive multitasking, the scheduler decides when a process is to cease
running and a new process is to begin running
Process Scheduler
• The Linux kernel has used the Completely Fair Scheduler (CFS) since version 2.6.23
• CFS was modified further in 2.6.24
• Comparison
• Linux pre-2.6: multilevel feedback queue
• Linux 2.6-2.6.23: O(1) scheduler
• Linux post-2.6.23: Completely Fair Scheduler
• FreeBSD: multilevel feedback queue
• Mac OS X: multilevel feedback queue
• Windows NT: multilevel feedback queue
• Brain Fuck Scheduler
Policy
• I/O-bound processes
• Processor-bound processes
• The scheduler tends to run processor-bound processes less frequently but for longer durations
• Policy in Unix systems tends to explicitly favor I/O-bound processes,
thus providing good process response time
• Linux favors I/O-bound processes over processor-bound
processes
Process Priority
• The Linux kernel implements two separate priority ranges
• Nice value
• Real-time priority
• Nice value
• A number from -20 to +19 with a default of 0
• Real-time priority
• Default range from 0 to 99
• Real-time priority and nice value are in disjoint value spaces
Timeslice
• Timeslice is the numeric value that represents how long a task can
run until it is preempted
• Linux’s CFS scheduler does NOT directly assign timeslices to processes
• CFS assigns processes a proportion of the processor
The Scheduling Policy in Action
• A text editor vs. a video encoder
Scheduling Algorithm
• How traditional Unix systems schedule processes.
• Mapping nice values directly onto timeslices causes several drawbacks.
• Process A: nice value = 0, timeslice of 100 milliseconds
Process B: nice value = 19, timeslice of 5 milliseconds
(A receives 100 of every 105 milliseconds of processor time)
• Process A: nice value = 19, timeslice of 5 milliseconds
Process B: nice value = 19, timeslice of 5 milliseconds
(each receives half the processor, but a context switch occurs every 5 milliseconds)
• Process A: nice value = 0, timeslice of 100 milliseconds
Process B: nice value = 0, timeslice of 100 milliseconds
(each receives half the processor, switching only every 100 milliseconds)
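The arithmetic behind these three cases can be checked with a short sketch. It assumes two always-runnable processes that simply alternate, each running for its full fixed timeslice; the function names are illustrative and are not kernel APIs.

```python
def cpu_share(timeslices_ms):
    """Fraction of processor time each process gets when the processes
    take turns and each one runs for its whole fixed timeslice."""
    total = sum(timeslices_ms)
    return [t / total for t in timeslices_ms]


def switches_per_second(timeslices_ms):
    """Context switches per second, assuming one switch at the end of
    every timeslice and no idle time."""
    total = sum(timeslices_ms)
    return len(timeslices_ms) * 1000 / total


cases = {
    "100 ms vs. 5 ms": [100, 5],
    "5 ms vs. 5 ms": [5, 5],
    "100 ms vs. 100 ms": [100, 100],
}
for name, slices in cases.items():
    shares = ", ".join(f"{share:.1%}" for share in cpu_share(slices))
    rate = switches_per_second(slices)
    print(f"{name}: shares = {shares}; about {rate:.0f} switches per second")
```

The second and third cases give identical shares but very different switching rates, which is the drawback the slide points at.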
Scheduling Algorithm
• Process A: nice value = 0, timeslice of 100 milliseconds
Process B: nice value = 1, timeslice of 95 milliseconds
(nearly identical timeslices)
• Process A: nice value = 18, timeslice of 10 milliseconds
Process B: nice value = 19, timeslice of 5 milliseconds
(A receives twice the timeslice of B for the same one-step difference in nice value)
• A nice-value-to-timeslice mapping requires assigning an absolute timeslice,
which must be an integer multiple of the timer tick, so timeslices change
with the timer tick rate.
• Optimizing for interactive tasks (for example, by boosting freshly woken tasks)
can let one process gain an unfair amount of processor time.
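The uneven effect of a one-step nice difference can be made concrete with a small sketch. The linear mapping below is a hypothetical formula fitted to the four figures on this slide (100, 95, 10 and 5 milliseconds); it is not taken from any kernel source.

```python
def linear_timeslice_ms(nice):
    """Hypothetical linear mapping fitted to the slide's figures:
    nice 0 -> 100 ms, nice 1 -> 95 ms, nice 18 -> 10 ms, nice 19 -> 5 ms."""
    if not 0 <= nice <= 19:
        raise ValueError("this sketch only covers nice values 0..19")
    return 100 - 5 * nice


def adjacent_ratio(nice):
    """Timeslice at `nice` divided by the timeslice at `nice + 1`."""
    return linear_timeslice_ms(nice) / linear_timeslice_ms(nice + 1)


for nice in (0, 18):
    print(f"nice {nice} vs. nice {nice + 1}: ratio = {adjacent_ratio(nice):.2f}")
```

Under this mapping the same one-step change is worth about 5% at one end of the range and 100% at the other.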
Scheduling Algorithm
• The Linux scheduler is modular; its modules are called scheduler
classes
• The base scheduler code is defined in kernel/sched.c
• CFS is defined in kernel/sched_fair.c
• CFS basically models an “ideal, precise multi-tasking CPU” on real
hardware
• CFS does away with timeslices completely and assigns each process a
PROPORTION of the processor
Ideal, Precise, Multitasking CPU
Actual Hardware CPU
Fair Scheduling
• CFS is called a fair scheduler because it gives each process a fair
share—a proportion—of the processor’s time
• The timeslice allotted to any nice value is NOT an absolute
number, but a given proportion of the processor
• CFS is NOT perfectly fair, because it only approximates perfect
multitasking
• But it can place a lower bound on latency of n for n runnable
processes on the unfairness
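A minimal sketch of proportional allocation follows. The constants and the weight formula are assumptions made for illustration and do not come from the slides: a 20 ms scheduling period, a 1 ms floor on any slice, and a weight that shrinks by a factor of roughly 1.25 per nice step.

```python
TARGET_LATENCY_MS = 20.0   # assumed scheduling period (not from the slides)
MIN_GRANULARITY_MS = 1.0   # assumed floor on any slice (not from the slides)


def weight(nice):
    """Assumed geometric weight: each nice step divides the weight by 1.25."""
    return 1024 / (1.25 ** nice)


def slices_ms(nice_values):
    """Split the period among runnable processes in proportion to weight,
    never handing out less than the minimum granularity."""
    total = sum(weight(n) for n in nice_values)
    return [max(TARGET_LATENCY_MS * weight(n) / total, MIN_GRANULARITY_MS)
            for n in nice_values]


for pair in ([0, 5], [10, 15]):
    formatted = ", ".join(f"{ms:.2f} ms" for ms in slices_ms(pair))
    print(f"nice values {pair}: {formatted}")
```

Both pairs differ by five nice steps, so both receive the same split: only the relative difference in nice values matters, not their absolute position in the range.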
The Linux Scheduling Implementation
• We discuss four components of CFS
• Time Accounting
• Process Selection
• The Scheduler Entry Point
• Sleeping and Waking Up
Time Accounting
• CFS does NOT have the notion of a timeslice, but it must still
account for the time that each process runs
• CFS uses the scheduler entity structure, struct sched_entity,
defined in <linux/sched.h>, to keep track of process accounting
• The scheduler entity structure is embedded in the process descriptor,
struct task_struct, as a member variable named se
Time Accounting: Virtual Runtime
• The virtual runtime is used to help us approximate the “ideal
multitasking processor” that CFS is modeling
• CFS uses vruntime to account for how long a process has run and
thus how much longer it ought to run
• The vruntime variable stores the virtual runtime of a process, which is
the actual runtime normalized by the number of runnable processes
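One way to picture virtual runtime is as real runtime scaled by a per-task weight. The sketch below makes that assumption explicit: it weights by a nice-derived value rather than by the number of runnable processes, and both the weight formula and the function names are illustrative, not kernel code.

```python
NICE_0_WEIGHT = 1024  # assumed reference weight for a nice-0 task


def weight(nice):
    """Assumed geometric weight: each nice step divides the weight by 1.25."""
    return NICE_0_WEIGHT / (1.25 ** nice)


def advance_vruntime(vruntime_ns, delta_exec_ns, nice):
    """Charge delta_exec_ns of real runtime to a task. Heavier
    (higher-priority) tasks accumulate virtual runtime more slowly."""
    return vruntime_ns + delta_exec_ns * NICE_0_WEIGHT / weight(nice)


for nice in (-5, 0, 5):
    charged = advance_vruntime(0, 1_000_000, nice)
    print(f"nice {nice:+d}: 1 ms of real time adds {charged / 1e6:.2f} ms of vruntime")
```

Because the scheduler favors the smallest vruntime, a task whose vruntime grows slowly is picked more often, which is how the proportion is enforced without timeslices.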
Process Selection
• CFS uses a red-black tree to manage the list of runnable processes
and efficiently find the process with the smallest vruntime
• Picking the next task
• run the process represented by the leftmost node in the rbtree
• __pick_next_entity()
• Adding processes to the tree
• enqueue_entity()
• Removing processes from the tree
• dequeue_entity()
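The selection rule (always run the task with the smallest vruntime) can be sketched in a few lines. Python's standard library has no red-black tree, so this sketch swaps in a binary heap from heapq; the class and method names are hypothetical and only loosely mirror the kernel functions named above.

```python
import heapq
import itertools


class RunQueue:
    """Runnable tasks ordered by vruntime. A binary heap stands in for the
    kernel's red-black tree; both keep the smallest key cheap to find."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal vruntimes

    def enqueue(self, name, vruntime):
        """Add a task that has become runnable."""
        heapq.heappush(self._heap, (vruntime, next(self._counter), name))

    def dequeue(self, name):
        """Remove a task that blocked or exited. This is O(n) here; a
        balanced tree would do it in O(log n)."""
        self._heap = [entry for entry in self._heap if entry[2] != name]
        heapq.heapify(self._heap)

    def pick_next(self):
        """Return the task with the smallest vruntime (the 'leftmost node')."""
        return self._heap[0][2] if self._heap else None


rq = RunQueue()
rq.enqueue("video encoder", 40)
rq.enqueue("text editor", 5)
rq.enqueue("shell", 12)
print(rq.pick_next())
rq.dequeue("text editor")
print(rq.pick_next())
```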
The Scheduler Entry Point
• The main entry point into the process scheduler is the function
schedule(), defined in kernel/sched.c
Sleeping and Waking Up
• Tasks that are sleeping (blocked) are in a special non-runnable state
• Without this special state, the scheduler would select tasks that did
not want to run
• Sleeping is handled via wait queues
• A wait queue is a simple list of processes waiting for an event to occur
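As a rough model of the idea, a wait queue can be pictured as a list that tasks move onto when they sleep and off of when the event occurs. The sketch below is a user-space simulation with hypothetical names; it does not reproduce the kernel's wait queue API.

```python
from collections import deque


class WaitQueue:
    """A simple list of tasks waiting for one event."""

    def __init__(self):
        self.waiters = deque()

    def sleep_on(self, task, runnable):
        """Mark the task non-runnable and park it on this queue."""
        runnable.discard(task)
        self.waiters.append(task)

    def wake_up(self, runnable):
        """The event occurred: make every waiter runnable again."""
        while self.waiters:
            runnable.add(self.waiters.popleft())


runnable = {"editor", "encoder"}
keyboard_input = WaitQueue()

keyboard_input.sleep_on("editor", runnable)
print(sorted(runnable))   # the scheduler can no longer select the editor
keyboard_input.wake_up(runnable)
print(sorted(runnable))   # the editor is selectable again
```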
Preemption and Context Switching
• Context switching is handled by the context_switch() function
defined in kernel/sched.c
• schedule() calls it when a new process has been selected to
run; it does two basic jobs
• Calls switch_mm() to switch the virtual memory mapping from the previous
process’s to that of the new process
• Calls switch_to() to switch the processor state from the previous process’s to
that of the new process
• The kernel provides the need_resched flag to signify whether a
reschedule should be performed
Preemption and Context Switching (cont.)
• Upon returning to user-space or returning from an interrupt, the
need_resched flag is checked
• If it is set, the kernel invokes the scheduler before continuing
• In 2.6, the need_resched flag was moved into a single bit of a special
flag variable inside the thread_info structure
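The flag check can be sketched as a single bit in a per-thread flags word. Everything here is a simplified model: the bit position is a hypothetical value (the real one is architecture-specific), and the helper names are illustrative rather than the kernel's own.

```python
NEED_RESCHED_BIT = 3  # hypothetical bit position; architecture-specific in reality


class ThreadInfo:
    """Stand-in for the per-thread structure holding a flags word."""

    def __init__(self):
        self.flags = 0


def set_need_resched(ti):
    ti.flags |= 1 << NEED_RESCHED_BIT


def clear_need_resched(ti):
    ti.flags &= ~(1 << NEED_RESCHED_BIT)


def need_resched(ti):
    return bool(ti.flags & (1 << NEED_RESCHED_BIT))


def return_to_user(ti, schedule):
    """On the way back to user-space, invoke the scheduler if the flag is set."""
    if need_resched(ti):
        clear_need_resched(ti)
        schedule()


ti = ThreadInfo()
return_to_user(ti, lambda: print("schedule() called"))   # flag clear: nothing happens
set_need_resched(ti)
return_to_user(ti, lambda: print("schedule() called"))   # flag set: scheduler runs
```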
User Preemption
• User preemption can occur
• When returning to user-space from a system call
• When returning to user-space from an interrupt handler
Kernel Preemption
• Kernel preemption can occur
• When an interrupt handler exits, before returning to kernel-space
• When kernel code becomes preemptible again
• If a task in the kernel explicitly calls schedule()
• If a task in the kernel blocks (which results in a call to schedule())
Real-Time Scheduling Policies
• Linux provides two real-time scheduling policies, SCHED_FIFO and
SCHED_RR
• The normal, non-real-time scheduling policy is SCHED_NORMAL
• Real-time policies are managed not by CFS, but by a special real-time
scheduler, defined in kernel/sched_rt.c
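The difference between the two real-time policies can be illustrated with a toy simulation. It relies on behavior that is not stated on the slide but is generally documented: a SCHED_FIFO task runs until it finishes, blocks or yields, while SCHED_RR tasks of equal priority rotate after a timeslice. The function name, the work units and the timeslice length are all hypothetical.

```python
from collections import deque


def run_realtime(tasks, rr_timeslice=2):
    """Simulate tasks that all share one real-time priority.
    tasks: list of (name, policy, work_units). Returns the order in
    which units of work run."""
    queue = deque(tasks)
    trace = []
    while queue:
        name, policy, work = queue.popleft()
        if policy == "SCHED_FIFO":
            # No timeslice: the task keeps the processor until it is done.
            trace.extend([name] * work)
        else:
            # SCHED_RR: run for at most one timeslice, then go to the back.
            ran = min(work, rr_timeslice)
            trace.extend([name] * ran)
            if work > ran:
                queue.append((name, policy, work - ran))
    return trace


print(run_realtime([("a", "SCHED_FIFO", 3), ("b", "SCHED_FIFO", 2)]))
print(run_realtime([("a", "SCHED_RR", 3), ("b", "SCHED_RR", 3)]))
```

The simulation ignores priorities and preemption by higher-priority tasks; it only shows how equal-priority tasks share the processor under each policy.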
Scheduler-Related System Calls
Group Scheduling Enhancements in 2.6.24
References
• http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/tree/Documentation/scheduler/sched-design-CFS.txt
• http://dl.acm.org/citation.cfm?id=1400102
• http://dl.acm.org/citation.cfm?id=1594375
• http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4631872
• http://www.ibm.com/developerworks/linux/library/l-scheduler/
• http://www.ibm.com/developerworks/linux/library/l-cfs/
• http://en.wikipedia.org/wiki/Completely_Fair_Scheduler
• http://blog.xuite.net/ian11832/blogg/23745751

Linux kernel development ch4
