Linux kernel development ch4


  1. 1. Process Scheduling Darren Huang#791
  2. 2. Multitasking • Multitasking operating systems come in two flavors: cooperative multitasking and preemptive multitasking • Linux implements preemptive multitasking • In preemptive multitasking, the scheduler decides when a process is to cease running and a new process is to begin running
  3. 3. Process Scheduler • The Linux kernel introduced the Completely Fair Scheduler (CFS) in version 2.6.23 • CFS was modified further in 2.6.24 • Comparison • Linux pre-2.6: Multilevel feedback queue • Linux 2.6.0 to 2.6.22: O(1) scheduler • Linux 2.6.23 and later: Completely Fair Scheduler • FreeBSD: Multilevel feedback queue • Mac OS X: Multilevel feedback queue • Windows NT: Multilevel feedback queue • Brain Fuck Scheduler
  4. 4. Policy • I/O-bound processes • Processor-bound processes • The scheduler tends to run processor-bound processes less frequently but for longer durations • Policy in Unix systems tends to explicitly favor I/O-bound processes, thus providing good process response time • Linux favors I/O-bound processes over processor-bound processes
  5. 5. Process Priority • The Linux kernel implements two separate priority ranges • Nice value • Real-time priority • Nice value • A number from -20 to +19 with a default of 0 • Real-time priority • Default range from 0 to 99 • Real-time priority and nice value are in disjoint value spaces
  6. 6. Timeslice • Timeslice is the numeric value that represents how long a task can run until it is preempted • Linux’s CFS scheduler does NOT directly assign timeslices to processes • CFS assigns processes a proportion of the processor
  7. 7. The Scheduling Policy in Action • A text editor vs. a video encoder
  8. 8. Scheduling Algorithm • How traditional Unix systems schedule processes • Mapping nice values directly onto timeslices causes some drawbacks • Process A: nice value = 0, timeslice of 100 milliseconds; Process B: nice value = 19, timeslice of 5 milliseconds (A receives 20/21 of the processor) • Process A: nice value = 19, timeslice of 5 milliseconds; Process B: nice value = 19, timeslice of 5 milliseconds (each receives half of the processor, switching every 5 milliseconds) • Process A: nice value = 0, timeslice of 100 milliseconds; Process B: nice value = 0, timeslice of 100 milliseconds (each receives half of the processor, switching every 100 milliseconds)
  9. 9. Scheduling Algorithm • Process A: nice value = 0, timeslice of 100 milliseconds; Process B: nice value = 1, timeslice of 95 milliseconds (nearly equal shares) • Process A: nice value = 18, timeslice of 10 milliseconds; Process B: nice value = 19, timeslice of 5 milliseconds (A receives twice the processor time of B, although the nice values again differ by only one) • A nice-value-to-timeslice mapping requires assigning an absolute timeslice (e.g. an integer multiple of the timer tick), so timeslices change with the timer tick rate • Optimizing wake-ups for interactive tasks can give one process an unfair amount of processor time
  10. 10. Scheduling Algorithm • The Linux scheduler is modular; its modules are called scheduler classes • The base scheduler code is defined in kernel/sched.c • CFS is defined in kernel/sched_fair.c • CFS basically models an “ideal, precise multi-tasking CPU” on real hardware • CFS does away with timeslices completely and assigns each process a PROPORTION of the processor
  11. 11. Ideal, Precise, Multitasking CPU
  12. 12. Actual Hardware CPU
  13. 13. Fair Scheduling • CFS is called a fair scheduler because it gives each process a fair share (a proportion) of the processor’s time • The timeslice allotted to any nice value is NOT an absolute number, but a given proportion of the processor • CFS is NOT perfectly fair, because it only approximates perfect multitasking • But it can bound the unfairness by placing a lower bound on latency of n for n runnable processes
  14. 14. The Linux Scheduling Implementation • We discuss four components of CFS • Time Accounting • Process Selection • The Scheduler Entry Point • Sleeping and Waking Up
  15. 15. Time Accounting • CFS does NOT have the notion of a timeslice, but it must still account for the time that each process runs • CFS uses the scheduler entity structure, struct sched_entity, defined in <linux/sched.h>, to keep track of process accounting • The scheduler entity structure is embedded in the process descriptor, struct task_struct, as a member variable named se
  16. 16. Time Accounting: Virtual Runtime • The virtual runtime is used to help us approximate the “ideal multitasking processor” that CFS is modeling • CFS uses vruntime to account for how long a process has run and thus how much longer it ought to run • The vruntime variable stores the virtual runtime of a process, which is the actual runtime normalized by the number of runnable processes
  17. 17. Process Selection • CFS uses a red-black tree to manage the list of runnable processes and efficiently find the process with the smallest vruntime • Picking the next task • run the process represented by the leftmost node in the rbtree • __pick_next_entity() • Adding processes to the tree • enqueue_entity() • Removing processes from the tree • dequeue_entity()
  18. 18. The Scheduler Entry Point • The main entry point into the process scheduler is the function schedule(), defined in kernel/sched.c
  19. 19. Sleeping and Waking Up • Tasks that are sleeping (blocked) are in a special non-runnable state • Without this special state, the scheduler would select tasks that did not want to run • Sleeping is handled via wait queues • A wait queue is a simple list of processes waiting for an event to occur
  20. 20. Preemption and Context Switching • Context switching is handled by the context_switch() function defined in kernel/sched.c • It is called by schedule() when a new process has been selected to run, and it does two basic jobs • Calls switch_mm() to switch the virtual memory mapping from the previous process’s to that of the new process • Calls switch_to() to switch the processor state from the previous process’s to the current’s • The kernel provides the need_resched flag to signify whether a reschedule should be performed
  21. 21. Preemption and Context Switching (cont.) • Upon returning to user-space or returning from an interrupt, the need_resched flag is checked • If it is set, the kernel invokes the scheduler before continuing • In 2.6, the need_resched flag was moved into a single bit of a special flag variable inside the thread_info structure
  22. 22. User Preemption • User preemption can occur • When returning to user-space from a system call • When returning to user-space from an interrupt handler
  23. 23. Kernel Preemption • Kernel preemption can occur • When an interrupt handler exits, before returning to kernel-space • When kernel code becomes preemptible again • If a task in the kernel explicitly calls schedule() • If a task in the kernel blocks (which results in a call to schedule())
  24. 24. Real-Time Scheduling Policies • Linux provides two real-time scheduling policies, SCHED_FIFO and SCHED_RR • The normal, non-real-time scheduling policy is SCHED_NORMAL • Real-time policies are managed not by CFS, but by a special real-time scheduler, defined in kernel/sched_rt.c
  25. 25. Scheduler-Related System Calls
  26. 26. Group Scheduling Enhancements in 2.6.24
  27. 27. References • next.git/tree/Documentation/scheduler/sched-design-CFS.txt
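
The time accounting and process selection slides (16 and 17) can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions, not kernel code: a min-heap stands in for the red-black tree, the names Task, nice_to_weight and simulate are hypothetical, the weight formula (about 1.25x per nice step around a nice-0 weight of 1024) approximates the kernel's lookup table, and vruntime is advanced by the actual runtime scaled by NICE_0_WEIGHT / weight, which is my understanding of how nice values feed into the accounting.

```python
import heapq

NICE_0_WEIGHT = 1024  # assumed weight of a nice-0 task


def nice_to_weight(nice):
    """Approximate load weight: each nice step scales the weight by ~1.25x."""
    return NICE_0_WEIGHT / (1.25 ** nice)


class Task:
    """Hypothetical stand-in for a task and its scheduler entity."""

    def __init__(self, name, nice=0):
        self.name = name
        self.weight = nice_to_weight(nice)
        self.vruntime = 0.0  # weighted (virtual) runtime in ms
        self.runtime = 0.0   # actual runtime in ms


def simulate(tasks, total_ms, tick_ms=1.0):
    """Always run the task with the smallest vruntime for one tick."""
    # Min-heap keyed on vruntime; the index breaks ties so that Task
    # objects themselves are never compared.
    heap = [(t.vruntime, i, t) for i, t in enumerate(tasks)]
    heapq.heapify(heap)
    elapsed = 0.0
    while elapsed < total_ms:
        _, i, task = heapq.heappop(heap)  # the "leftmost node"
        task.runtime += tick_ms
        # Heavier (lower nice) tasks accumulate vruntime more slowly,
        # so they are picked more often.
        task.vruntime += tick_ms * NICE_0_WEIGHT / task.weight
        heapq.heappush(heap, (task.vruntime, i, task))
        elapsed += tick_ms
    return {t.name: t.runtime for t in tasks}


if __name__ == "__main__":
    print(simulate([Task("A"), Task("B")], total_ms=1000))
    print(simulate([Task("A", nice=0), Task("B", nice=5)], total_ms=10000))
```

With two nice-0 tasks the sketch splits the processor evenly. With nice 0 against nice 5, the nice-0 task ends up with roughly three quarters of the processor: a proportion that follows from the weights rather than from any fixed timeslice.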