Linux kernel development ch4

  • 1. Process Scheduling Darren Huang#791
  • 2. Multitasking • Multitasking operating systems come in two flavors: cooperative multitasking and preemptive multitasking • Linux implements preemptive multitasking • In preemptive multitasking, the scheduler decides when a process is to cease running and a new process is to begin running
  • 3. Process Scheduler • The Linux kernel introduced the Completely Fair Scheduler (CFS) in version 2.6.23 • CFS was modified further in 2.6.24 • Comparison • Linux pre-2.6: multilevel feedback queue • Linux 2.6.0 to 2.6.22: O(1) scheduler • Linux 2.6.23 and later: Completely Fair Scheduler • FreeBSD: multilevel feedback queue • Mac OS X: multilevel feedback queue • Windows NT: multilevel feedback queue • Brain Fuck Scheduler
  • 4. Policy • I/O-bound processes • Processor-bound processes • The scheduler tends to run processor-bound processes less frequently but for longer durations • Policy in Unix systems tends to explicitly favor I/O-bound processes, thus providing good process response time • Linux favors I/O-bound processes over processor-bound processes
  • 5. Process Priority • The Linux kernel implements two separate priority ranges • Nice value: a number from -20 to +19 with a default of 0 • Real-time priority: default range from 0 to 99 • Real-time priority and nice value are in disjoint value spaces
  • 6. Timeslice • Timeslice is the numeric value that represents how long a task can run until it is preempted • Linux’s CFS scheduler does NOT directly assign timeslices to processes • CFS assigns processes a proportion of the processor
  • 7. The Scheduling Policy in Action • A text editor vs. a video encoder
  • 8. Scheduling Algorithm • How traditional Unix systems schedule processes • Mapping nice values onto timeslices, so that each nice value is allotted an absolute timeslice, causes several drawbacks • Process A: nice value = 0, timeslice of 100 milliseconds; Process B: nice value = 19, timeslice of 5 milliseconds • Process A: nice value = 19, timeslice of 5 milliseconds; Process B: nice value = 19, timeslice of 5 milliseconds • Process A: nice value = 0, timeslice of 100 milliseconds; Process B: nice value = 0, timeslice of 100 milliseconds • In both equal-priority cases each process receives half of the processor, but the low-priority pair switches context every 5 milliseconds while the normal-priority pair switches every 100 milliseconds
  • 9. Scheduling Algorithm • Process A: nice value = 0, timeslice of 100 milliseconds; Process B: nice value = 1, timeslice of 95 milliseconds • Process A: nice value = 18, timeslice of 10 milliseconds; Process B: nice value = 19, timeslice of 5 milliseconds • The same one-step difference in nice value yields nearly equal shares in the first case and a two-to-one split in the second • A nice-value-to-timeslice mapping needs the ability to assign an absolute timeslice (e.g., an integer multiple of the timer tick), so timeslices change when the timer tick changes • Optimizing wake-ups for interactive tasks can let one process gain an unfair amount of processor time
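The relative-nice-value drawback on slides 8 and 9 can be checked with a few lines of arithmetic. The sketch below assumes a linear mapping that reproduces the slide's numbers (nice 0 gives 100 ms, each further nice step removes 5 ms, nice 19 gives 5 ms); the function names `timeslice_ms` and `share_of_first` are hypothetical and are not taken from any kernel source.

```python
def timeslice_ms(nice: int) -> int:
    """Hypothetical linear mapping: nice 0 -> 100 ms, each nice step costs 5 ms."""
    if not 0 <= nice <= 19:
        raise ValueError("this sketch only covers nice values 0..19")
    return 100 - 5 * nice


def share_of_first(nice_a: int, nice_b: int) -> float:
    """Fraction of processor time A receives when A and B alternate timeslices."""
    slice_a = timeslice_ms(nice_a)
    slice_b = timeslice_ms(nice_b)
    return slice_a / (slice_a + slice_b)


# The same one-step nice difference produces very different splits.
for nice_a, nice_b in [(0, 1), (18, 19)]:
    pct = 100 * share_of_first(nice_a, nice_b)
    print(f"nice {nice_a} vs nice {nice_b}: first process gets {pct:.1f}% of the processor")
```

Under this assumed mapping, the effect of changing a nice value by one depends entirely on the starting value, which is the inconsistency the slides point out.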
  • 10. Scheduling Algorithm • The Linux scheduler is modular; its modules are called scheduler classes • The base scheduler code is defined in kernel/sched.c • CFS is defined in kernel/sched_fair.c • CFS basically models an “ideal, precise multi-tasking CPU” on real hardware • CFS does away with timeslices completely and assigns each process a PROPORTION of the processor
  • 11. Ideal, Precise, Multitasking CPU
  • 12. Actual Hardware CPU
  • 13. Fair Scheduling • CFS is called a fair scheduler because it gives each process a fair share (a proportion) of the processor’s time • The timeslice allotted to any nice value is NOT an absolute number, but a given proportion of the processor • CFS is NOT perfectly fair, because it only approximates perfect multitasking • But it can bound the unfairness: it places a lower bound on latency of n for n runnable processes
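A minimal sketch of "a proportion of the processor" follows. It assumes each task has a weight of roughly 1024 / 1.25^nice, which approximates the weight table the kernel uses for nice values (the real kernel uses a precomputed lookup table, not this formula). The names `weight` and `processor_shares` are hypothetical.

```python
NICE_0_WEIGHT = 1024  # assumed weight of a nice-0 task


def weight(nice: int) -> float:
    """Approximate load weight: each nice step changes the weight by about 1.25x."""
    return NICE_0_WEIGHT / (1.25 ** nice)


def processor_shares(nice_values):
    """Return each task's proportion of the processor, in the same order."""
    weights = [weight(n) for n in nice_values]
    total = sum(weights)
    return [w / total for w in weights]


# Because weights are geometric, only the *difference* in nice values matters.
for pair in [(0, 5), (10, 15)]:
    first, second = processor_shares(pair)
    print(f"nice {pair[0]} vs nice {pair[1]}: {first:.3f} / {second:.3f}")
```

With geometric weights, a pair at nice 0 and 5 splits the processor in the same ratio as a pair at nice 10 and 15, which avoids the relative-nice-value problem of fixed timeslices.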
  • 14. The Linux Scheduling Implementation • We discuss four components of CFS • Time Accounting • Process Selection • The Scheduler Entry Point • Sleeping and Waking Up
  • 15. Time Accounting • CFS does NOT have the notion of a timeslice, but it must still account for the time that each process runs • CFS uses the scheduler entity structure, struct sched_entity, defined in <linux/sched.h>, to keep track of process accounting • The scheduler entity structure is embedded in the process descriptor, struct task_struct, as a member variable named se
  • 16. Time Accounting: Virtual Runtime • The virtual runtime is used to help us approximate the “ideal multitasking processor” that CFS is modeling • CFS uses vruntime to account for how long a process has run and thus how much longer it ought to run • The vruntime variable stores the virtual runtime of a process, which is the actual runtime normalized (weighted) by the process’s load weight, which is derived from its nice value
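The vruntime bookkeeping can be sketched as follows. This is an illustration only: it assumes the weighting used by the kernel's CFS code, where the elapsed time is scaled by the nice-0 weight divided by the task's weight, and it approximates the weight as 1024 / 1.25^nice. The names `weight` and `charge_vruntime` are hypothetical.

```python
NICE_0_WEIGHT = 1024  # assumed weight of a nice-0 task


def weight(nice: int) -> float:
    """Approximate load weight for a nice value (assumption, see lead-in)."""
    return NICE_0_WEIGHT / (1.25 ** nice)


def charge_vruntime(vruntime_ns: float, delta_exec_ns: float, nice: int) -> float:
    """Advance vruntime by the time just run, scaled by the task's weight."""
    return vruntime_ns + delta_exec_ns * NICE_0_WEIGHT / weight(nice)


# Three tasks each run for 1 ms of real time.
ran_ns = 1_000_000
for nice in (-5, 0, 5):
    print(f"nice {nice:+d}: vruntime advances by {charge_vruntime(0.0, ran_ns, nice):.0f} ns")
```

A nice-0 task's vruntime advances at wall-clock rate, a lower-priority task's advances faster, and a higher-priority task's advances more slowly, so picking the smallest vruntime naturally gives higher-priority tasks more processor time.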
  • 17. Process Selection • CFS uses a red-black tree to manage the list of runnable processes and efficiently find the process with the smallest vruntime • Picking the next task • run the process represented by the leftmost node in the rbtree • __pick_next_entity() • Adding processes to the tree • enqueue_entity() • Removing processes from the tree • dequeue_entity()
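The selection rule (always run the task with the smallest vruntime) can be sketched in a few lines. Python's standard library has no red-black tree, so this sketch swaps in a binary heap from `heapq`, which also yields the minimum key efficiently but, unlike the kernel's rbtree, does not support removing an arbitrary task. The `RunQueue` class and its methods are hypothetical stand-ins for enqueue_entity() and __pick_next_entity(), not kernel APIs.

```python
import heapq
import itertools


class RunQueue:
    """Runnable tasks ordered by vruntime (heap used in place of an rbtree)."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker: earlier enqueue wins

    def enqueue(self, name: str, vruntime: int) -> None:
        heapq.heappush(self._heap, (vruntime, next(self._seq), name))

    def pick_next(self):
        """Remove and return the (name, vruntime) pair with the smallest vruntime."""
        vruntime, _, name = heapq.heappop(self._heap)
        return name, vruntime


def simulate(names, rounds: int, slice_ns: int = 10):
    """Run equal-weight tasks and return the order in which they were picked."""
    rq = RunQueue()
    for name in names:
        rq.enqueue(name, 0)
    order = []
    for _ in range(rounds):
        name, vruntime = rq.pick_next()
        order.append(name)
        rq.enqueue(name, vruntime + slice_ns)  # charge the time it just ran
    return order


print(simulate(["A", "B", "C"], rounds=6))
```

With equal weights the tasks simply take turns, because each run pushes the running task's vruntime past those of the tasks still waiting.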
  • 18. The Scheduler Entry Point • The main entry point into the process scheduler is the function schedule(), defined in kernel/sched.c
  • 19. Sleeping and Waking Up • Tasks that are sleeping (blocked) are in a special non-runnable state • Without this special state, the scheduler would select tasks that did not want to run • Sleeping is handled via wait queues • A wait queue is a simple list of processes waiting for an event to occur
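As a rough illustration of the wait-queue idea, the sketch below models tasks as plain objects with a state field. The state names TASK_RUNNING and TASK_INTERRUPTIBLE are the kernel's names for the runnable and sleeping states and do not appear on the slide; the `Task` and `WaitQueue` classes and the `runnable` helper are hypothetical and greatly simplified (no locking, no signals, no real blocking).

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    state: str = "TASK_RUNNING"  # runnable by default


class WaitQueue:
    """A simple list of tasks waiting for one event to occur."""

    def __init__(self):
        self._waiters = deque()

    def sleep_on(self, task: Task) -> None:
        """Mark the task non-runnable and park it on the queue."""
        task.state = "TASK_INTERRUPTIBLE"
        self._waiters.append(task)

    def wake_up(self):
        """The event occurred: make every waiter runnable again."""
        woken = []
        while self._waiters:
            task = self._waiters.popleft()
            task.state = "TASK_RUNNING"
            woken.append(task)
        return woken


def runnable(tasks):
    """Names of the tasks the scheduler would be allowed to select."""
    return [t.name for t in tasks if t.state == "TASK_RUNNING"]


tasks = [Task("editor"), Task("encoder")]
keyboard = WaitQueue()
keyboard.sleep_on(tasks[0])   # the editor blocks waiting for a key press
print(runnable(tasks))        # only the encoder can be selected
keyboard.wake_up()            # a key press arrives
print(runnable(tasks))        # both tasks are runnable again
```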
  • 20. Preemption and Context Switching • Context switching is handled by the context_switch() function defined in kernel/sched.c • schedule() calls it when a new process has been selected to run; it does two basic jobs • Calls switch_mm() to switch the virtual memory mapping from the previous process’s to that of the new process • Calls switch_to() to switch the processor state from the previous process’s to that of the new process • The kernel provides the need_resched flag to signify whether a reschedule should be performed
  • 21. Preemption and Context Switching (cont.) • Upon returning to user-space or returning from an interrupt, the need_resched flag is checked • If it is set, the kernel invokes the scheduler before continuing • In 2.6, the need_resched flag was moved into a single bit of a special flag variable inside the thread_info structure
  • 22. User Preemption • User preemption can occur • When returning to user-space from a system call • When returning to user-space from an interrupt handler
  • 23. Kernel Preemption • Kernel preemption can occur • When an interrupt handler exits, before returning to kernel-space • When kernel code becomes preemptible again • If a task in the kernel explicitly calls schedule() • If a task in the kernel blocks (which results in a call to schedule())
  • 24. Real-Time Scheduling Policies • Linux provides two real-time scheduling policies, SCHED_FIFO and SCHED_RR • The normal, non-real-time scheduling policy is SCHED_NORMAL • Real-time policies are managed not by CFS, but by a special real-time scheduler, defined in kernel/sched_rt.c
  • 25. Scheduler-Related System Calls
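The slide's table of system calls is not preserved in this transcript, so the following is only a small sketch of how a few scheduler-related calls can be reached from user space through Python's `os` module, which wraps getpriority(), sched_getscheduler(), sched_get_priority_min() and sched_get_priority_max(). The `describe_scheduling` helper is hypothetical. The `sched_*` functions are available only on some Unix platforms, so the sketch guards for them, and user space calls the normal policy SCHED_OTHER rather than SCHED_NORMAL.

```python
import os


def describe_scheduling(pid: int = 0) -> dict:
    """Collect the nice value and, where supported, the scheduling policy of a process.

    pid 0 means the calling process.
    """
    info = {"nice": os.getpriority(os.PRIO_PROCESS, pid)}
    if hasattr(os, "sched_getscheduler"):
        policy_names = {
            os.SCHED_OTHER: "SCHED_OTHER",  # the normal, non-real-time policy
            os.SCHED_FIFO: "SCHED_FIFO",
            os.SCHED_RR: "SCHED_RR",
        }
        policy = os.sched_getscheduler(pid)
        info["policy"] = policy_names.get(policy, str(policy))
        info["fifo_priority_range"] = (
            os.sched_get_priority_min(os.SCHED_FIFO),
            os.sched_get_priority_max(os.SCHED_FIFO),
        )
    return info


print(describe_scheduling())
```

Changing the policy to SCHED_FIFO or SCHED_RR with os.sched_setscheduler() normally requires elevated privileges, so this sketch only reads the current settings.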
  • 26. Group Scheduling Enhancements in 2.6.24