Linux O(1) Scheduling
Transcript

  • 1. Linux Scheduling (Kernel 2.6). Roy Lee, 21 Sep 2005, NCTU Computer Operating System Lab.
  • 2. Linux Scheduling [include/linux/sched.h]. Task states: TASK_RUNNING, TASK_INTERRUPTIBLE, TASK_UNINTERRUPTIBLE, TASK_STOPPED, EXIT_ZOMBIE, EXIT_DEAD. The state is changed with set_task_state(task, state), which is equivalent to task->state = state; set_current_state(state) does the same for the current task. Source: Robert Love, “Linux Kernel Development,” 2nd Edition.
  • 3. Runnable & Running
      struct runqueue {
          spinlock_t lock;
          unsigned long nr_running;
          unsigned long long nr_switches;
          unsigned long nr_uninterruptible;
          unsigned long expired_timestamp;
          unsigned long long timestamp_last_tick;
          task_t *curr, *idle;
          struct mm_struct *prev_mm;
          prio_array_t *active, *expired, arrays[2];
          int best_expired_prio;
          atomic_t nr_iowait;
          ...
      };
  • 4. Runnable & Running (cont.)
      #define BITMAP_SIZE ((((MAX_PRIO+1+7)/8)+sizeof(long)-1)/sizeof(long))
      struct prio_array {
          unsigned int nr_active;
          unsigned long bitmap[BITMAP_SIZE];
          struct list_head queue[MAX_PRIO];
      };
      idx = sched_find_first_bit(array->bitmap);
      queue = array->queue + idx;
      next = list_entry(queue->next, task_t, run_list);
    [Figure: a struct prio_array, showing the bitmap (bits 0 to 139) and the per-priority queue of list heads holding processes p1, p2, and p4.]
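The lookup on this slide is what makes the scheduler O(1): the cost of finding the highest-priority runnable task depends on the fixed number of priority levels, not on the number of tasks. The following is a minimal user-space sketch of that idea, not the kernel code. All `toy_` names are hypothetical, the queues are reduced to counters, and the bit scan is a portable loop standing in for `sched_find_first_bit()`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MAX_PRIO      140
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_LONGS  ((MAX_PRIO + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Hypothetical stand-in for struct prio_array: each queue is just a counter. */
struct toy_prio_array {
    unsigned int nr_active;
    unsigned long bitmap[BITMAP_LONGS];
    int queue_len[MAX_PRIO];
};

/* Add one task at the given priority and mark that priority as non-empty. */
static void toy_enqueue(struct toy_prio_array *a, int prio)
{
    assert(prio >= 0 && prio < MAX_PRIO);
    a->queue_len[prio]++;
    a->bitmap[prio / BITS_PER_LONG] |= 1UL << (prio % BITS_PER_LONG);
    a->nr_active++;
}

/* Remove one task; clear the bit only when the queue becomes empty. */
static void toy_dequeue(struct toy_prio_array *a, int prio)
{
    assert(prio >= 0 && prio < MAX_PRIO && a->queue_len[prio] > 0);
    if (--a->queue_len[prio] == 0)
        a->bitmap[prio / BITS_PER_LONG] &= ~(1UL << (prio % BITS_PER_LONG));
    a->nr_active--;
}

/*
 * Return the lowest set bit (numerically lowest = highest priority),
 * or MAX_PRIO if the array is empty. The work is bounded by the fixed
 * bitmap size, regardless of how many tasks are queued.
 */
static int toy_find_first_bit(const struct toy_prio_array *a)
{
    for (size_t w = 0; w < BITMAP_LONGS; w++) {
        unsigned long word = a->bitmap[w];
        if (word) {
            int bit = 0;
            while (!(word & 1UL)) {
                word >>= 1;
                bit++;
            }
            return (int)(w * BITS_PER_LONG) + bit;
        }
    }
    return MAX_PRIO;
}
```

With tasks queued at priorities 139, 120, and 100, `toy_find_first_bit()` returns 100; after the priority-100 task is dequeued it returns 120.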
  • 5. Runnable & Running (cont.) [Figure: active and expired prio_array bitmaps and queues.] P4 has the highest priority and is selected for execution.
  • 6. Runnable & Running (cont.) [Figure: active and expired prio_array bitmaps and queues.] Later on, P4 runs out of its timeslice and is moved to the expired array.
  • 7. Runnable & Running (cont.) [Figure: active and expired prio_array bitmaps and queues.] The bitmaps are also updated.
  • 8. Runnable & Running (cont.) [Figure: active and expired prio_array bitmaps and queues.] Now P2 has the highest priority and is selected for execution.
  • 9. Runnable & Running (cont.) [Figure: active and expired prio_array bitmaps and queues.] Later on, P2 runs out of its timeslice and is moved to the expired array.
  • 10. Runnable & Running (cont.) [Figure: active and expired prio_array bitmaps and queues.] The bitmaps are also updated.
  • 11. Runnable & Running (cont.) [Figure: active and expired prio_array bitmaps and queues.] Now P1 has the highest priority and is selected for execution.
  • 12. Runnable & Running (cont.) [Figure: active and expired prio_array bitmaps and queues.] During its execution, P1 forks a child process, P5.
  • 13. Runnable & Running (cont.) [Figure: active and expired prio_array bitmaps and queues.] To avoid COW overhead, P1 yields the CPU to P5.
  • 14. Runnable & Running (cont.) [Figure: active and expired prio_array bitmaps and queues.] Later on, P5 runs out of its timeslice and is moved to the expired array.
  • 15. Runnable & Running (cont.) [Figure: active and expired prio_array bitmaps and queues.] The bitmaps are also updated. You may notice that P5's priority has changed here; this is explained later.
  • 16. Runnable & Running (cont.) [Figure: active and expired prio_array bitmaps and queues.] P1 resumes execution, finishes its job, and then exits. This is a typical fork() and exec() scenario.
  • 17. Runnable & Running (cont.) [Figure: active and expired prio_array bitmaps and queues.] Now P3 has the highest priority and is selected for execution.
  • 18. Runnable & Running (cont.) [Figure: active and expired prio_array bitmaps and queues.] Later on, P3 runs out of its timeslice and is moved to the expired array.
  • 19. Runnable & Running (cont.) [Figure: active and expired prio_array bitmaps and queues.] The bitmaps are also updated.
  • 20. Runnable & Running (cont.) [Figure: active and expired prio_array bitmaps and queues, labeled "Exchange!".] Now that the active array is empty, the scheduler exchanges it with the expired one.
  • 21. Runnable & Running (cont.) [Figure: the same two arrays with the *active and *expired pointers swapped.] Another round begins!
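The exchange at the end of the walkthrough above is only a pointer swap, which is why starting a new round costs the same regardless of how many tasks are waiting. A minimal sketch under simplifying assumptions: the `toy_` types and functions are hypothetical, and each array is reduced to its `nr_active` counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduction of prio_array to the one field the swap inspects. */
struct toy_array {
    unsigned int nr_active;
};

/* Hypothetical reduction of struct runqueue: two arrays and two pointers. */
struct toy_rq {
    struct toy_array *active, *expired;
    struct toy_array arrays[2];
};

static void toy_rq_init(struct toy_rq *rq)
{
    assert(rq != NULL);
    rq->arrays[0].nr_active = 0;
    rq->arrays[1].nr_active = 0;
    rq->active = &rq->arrays[0];
    rq->expired = &rq->arrays[1];
}

/*
 * If the active array has drained, exchange the two pointers so the
 * expired tasks become runnable again. No task is moved or copied.
 */
static struct toy_array *toy_pick_array(struct toy_rq *rq)
{
    if (rq->active->nr_active == 0) {
        struct toy_array *tmp = rq->active;
        rq->active = rq->expired;
        rq->expired = tmp;
    }
    return rq->active;
}
```

In the state shown on slide 20 (active empty, three tasks expired), a call to `toy_pick_array()` returns the array that was previously `expired`.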
  • 22. What Policies Do We Have? [Figure: priority axis marked 0, MAX_RT_PRIO, MAX_PRIO.] SCHED_NORMAL: priorities range from MAX_RT_PRIO to MAX_PRIO - 1 (100 ~ 139). SCHED_FIFO and SCHED_RR: priorities range from 0 to MAX_RT_PRIO - 1 (0 ~ 99). Both are soft real-time policies. A SCHED_FIFO process has no timeslice. A SCHED_RR process round-robins only with processes of equal priority. Real-time processes never expire and work with static priority.
  • 23. Realtime Scheduling [Figure: active and expired prio_array bitmaps and queues; legend: RR process, normal process, FIFO process.] P4 has the highest priority and is selected for execution.
  • 24. Realtime Scheduling [Figure: as above.] P4 runs out of its timeslice, but since it is an RR process, it does not expire. The scheduler reinserts it at the tail of its priority list.
  • 25. Realtime Scheduling [Figure: as above.] Now P2 has the highest priority and is selected for execution.
  • 26. Realtime Scheduling [Figure: as above.] P2 finishes its job and exits. Now P4 has the highest priority and is selected for execution.
  • 27. Realtime Scheduling [Figure: as above.] In this case, unless P4 exits or voluntarily relinquishes the CPU, or a higher-priority process is created or woken up, P4 monopolizes the CPU.
  • 28. Realtime Scheduling [Figure: as above.] Later on, P4 finishes its job and exits. P1 is selected for execution.
  • 29. Realtime Scheduling [Figure: as above.] P1 runs out of its timeslice and is reinserted at the tail of its list.
  • 30. Realtime Scheduling [Figure: as above.] P5 is a FIFO real-time process, so it has no timeslice. Unless a higher-priority process is created or woken up, it monopolizes the CPU.
  • 31. The Priority of Processes [Figure: priority axis marked 0, MAX_RT_PRIO, MAX_PRIO.] Static priority mapping: the nice value ranges from -20 to 19 and is specified by the user. Dynamic priority: a bonus or penalty in the range -5 to +5, based on the interactivity of the task.
      #define MAX_USER_RT_PRIO 100
      #define MAX_RT_PRIO MAX_USER_RT_PRIO
      #define MAX_PRIO (MAX_RT_PRIO + 40)
      #define NICE_TO_PRIO(nice) (MAX_RT_PRIO + (nice) + 20)
      #define PRIO_TO_NICE(prio) ((prio) - MAX_RT_PRIO - 20)
      #define TASK_NICE(p) PRIO_TO_NICE((p)->static_prio)
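The mapping above can be checked with a few concrete values. The sketch below reuses the slide's macros unchanged and adds two hypothetical helpers (`toy_nice_to_prio` and `toy_effective_prio`). The clamping of the dynamic priority to the SCHED_NORMAL range is an assumption about how the bonus is applied, since the slide only gives the -5 to +5 range.

```c
#include <assert.h>

#define MAX_USER_RT_PRIO   100
#define MAX_RT_PRIO        MAX_USER_RT_PRIO
#define MAX_PRIO           (MAX_RT_PRIO + 40)
#define NICE_TO_PRIO(nice) (MAX_RT_PRIO + (nice) + 20)
#define PRIO_TO_NICE(prio) ((prio) - MAX_RT_PRIO - 20)

/* Static priority for a nice value in [-20, 19]. */
static int toy_nice_to_prio(int nice)
{
    assert(nice >= -20 && nice <= 19);
    return NICE_TO_PRIO(nice);
}

/*
 * Dynamic priority: subtract the interactivity bonus (a negative bonus is
 * a penalty), then keep the result inside [MAX_RT_PRIO, MAX_PRIO - 1].
 * The clamp is an assumption of this sketch.
 */
static int toy_effective_prio(int static_prio, int bonus)
{
    int prio = static_prio - bonus;

    assert(bonus >= -5 && bonus <= 5);
    if (prio < MAX_RT_PRIO)
        prio = MAX_RT_PRIO;
    if (prio > MAX_PRIO - 1)
        prio = MAX_PRIO - 1;
    return prio;
}
```

Nice -20, 0, and 19 map to static priorities 100, 120, and 139. A nice-0 task with the full +5 bonus runs at dynamic priority 115.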
  • 32. The Priority of Processes (cont.)
      struct task_struct {
          int state;
          ...
          int prio, static_prio;        /* prio: dynamic priority; static_prio: specified by the user (nice) */
          ...
          prio_array_t *array;
          unsigned long sleep_avg;      /* ranges from 0 to MAX_SLEEP_AVG */
          unsigned long long timestamp, last_ran;
          unsigned long long sched_time;
          int activated;
          unsigned long policy;
          cpumask_t cpus_allowed;
          unsigned int time_slice, first_time_slice;
          ...
    Call path shown on the slide: timer_interrupt(), update_process_times(), scheduler_tick(); then, when if (!--p->time_slice) is true, recalc_task_prio() and effective_prio().
  • 33. When A Process is Interactive Enough… [Figure: active and expired prio_array bitmaps and queues.] Case 1: if P4 is interactive enough, then after it runs out of its timeslice the scheduler reinserts it at the end of its list.
  • 34. When A Process is Interactive Enough… [Figure: active and expired prio_array bitmaps and queues.] Case 1: the scheduler reinserts it at the end of its list in the active array instead of moving it to the expired array.
  • 35. When A Process is Interactive Enough… [Figure: active and expired prio_array bitmaps and queues.] Case 2: however, if any processes in the expired array have been starved, P4 still has to be expired to prevent further starvation.
  • 36. Interactivity of Process
      if (!TASK_INTERACTIVE(p) || EXPIRED_STARVING(rq)) {
          enqueue_task(p, rq->expired);
          if (p->static_prio < rq->best_expired_prio)
              rq->best_expired_prio = p->static_prio;
      } else
          enqueue_task(p, rq->active);

      #define TASK_INTERACTIVE(p) ((p)->prio <= (p)->static_prio - DELTA(p))
      #define DELTA(p) (SCALE(TASK_NICE(p), 40, MAX_BONUS) + INTERACTIVE_DELTA)
      #define EXPIRED_STARVING(rq) \
          ((STARVATION_LIMIT && ((rq)->expired_timestamp && \
              (jiffies - (rq)->expired_timestamp >= \
                  STARVATION_LIMIT * ((rq)->nr_running) + 1))) || \
              ((rq)->curr->static_prio > (rq)->best_expired_prio))
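To see what the TASK_INTERACTIVE test means in numbers, the sketch below evaluates DELTA for a few nice values. The slide does not show SCALE, MAX_BONUS, or INTERACTIVE_DELTA, so the definitions used here (a linear scale, MAX_BONUS of 10, INTERACTIVE_DELTA of 2) are assumptions, and the `toy_` functions are hypothetical. The starvation check is passed in as a flag rather than computed.

```c
#include <assert.h>

#define MAX_RT_PRIO        100
#define MAX_PRIO           (MAX_RT_PRIO + 40)
#define PRIO_TO_NICE(prio) ((prio) - MAX_RT_PRIO - 20)

/* Assumed values and definition: none of these three appear on the slide. */
#define MAX_BONUS          10
#define INTERACTIVE_DELTA  2
#define SCALE(v1, v1_max, v2_max) ((v1) * (v2_max) / (v1_max))

/* DELTA(p) from the slide, taking the static priority directly. */
static int toy_delta(int static_prio)
{
    assert(static_prio >= MAX_RT_PRIO && static_prio < MAX_PRIO);
    return SCALE(PRIO_TO_NICE(static_prio), 40, MAX_BONUS) + INTERACTIVE_DELTA;
}

/* TASK_INTERACTIVE(p): the dynamic priority must beat static_prio by DELTA. */
static int toy_task_interactive(int prio, int static_prio)
{
    return prio <= static_prio - toy_delta(static_prio);
}

/* The branch condition from the slide: nonzero means "enqueue on expired". */
static int toy_goes_to_expired(int prio, int static_prio, int expired_starving)
{
    return !toy_task_interactive(prio, static_prio) || expired_starving;
}
```

Under these assumptions a nice-0 task (static priority 120) has a DELTA of 2, so it counts as interactive only once its dynamic priority reaches 118 or better. A nice-19 task has a DELTA of 6, more than the largest bonus of 5, so it never qualifies.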
  • 37. Timeslice. The calculation is a simple scaling of the static priority into a range of timeslices (5 ~ 800 ms). The default (nice value of zero) is 100 ms.
      #define MIN_TIMESLICE max(5 * HZ / 1000, 1)
      #define DEF_TIMESLICE (100 * HZ / 1000)
      #define SCALE_PRIO(x, prio) \
          max(x * (MAX_PRIO - prio) / (MAX_USER_PRIO/2), MIN_TIMESLICE)

      static inline unsigned int task_timeslice(task_t *p)
      {
          if (p->static_prio < NICE_TO_PRIO(0))
              return SCALE_PRIO(DEF_TIMESLICE*4, p->static_prio);
          else
              return SCALE_PRIO(DEF_TIMESLICE, p->static_prio);
      }
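The 5 ~ 800 ms range quoted above follows directly from the macros. Here is a small sketch that evaluates them in user space. It assumes HZ is 1000 (so one tick is one millisecond) and MAX_USER_PRIO is 40, neither of which is stated on the slide. `TOY_MAX` and `toy_task_timeslice` are hypothetical names, and the function takes the static priority directly instead of a task pointer.

```c
#include <assert.h>

#define HZ                 1000  /* assumption: 1000 ticks/s, so 1 tick = 1 ms */
#define MAX_RT_PRIO        100
#define MAX_PRIO           (MAX_RT_PRIO + 40)
#define MAX_USER_PRIO      40    /* assumption: number of SCHED_NORMAL levels */
#define NICE_TO_PRIO(nice) (MAX_RT_PRIO + (nice) + 20)

#define TOY_MAX(a, b)      ((a) > (b) ? (a) : (b))
#define MIN_TIMESLICE      TOY_MAX(5 * HZ / 1000, 1)
#define DEF_TIMESLICE      (100 * HZ / 1000)
#define SCALE_PRIO(x, prio) \
    TOY_MAX((x) * (MAX_PRIO - (prio)) / (MAX_USER_PRIO / 2), MIN_TIMESLICE)

/* Timeslice in ticks for a SCHED_NORMAL static priority (100..139). */
static unsigned int toy_task_timeslice(int static_prio)
{
    assert(static_prio >= MAX_RT_PRIO && static_prio < MAX_PRIO);
    if (static_prio < NICE_TO_PRIO(0))
        return SCALE_PRIO(DEF_TIMESLICE * 4, static_prio);
    return SCALE_PRIO(DEF_TIMESLICE, static_prio);
}
```

With these assumptions, nice -20 (priority 100) gets 800 ticks, nice 0 (priority 120) gets 100, nice 10 (priority 130) gets 50, and nice 19 (priority 139) gets the 5-tick minimum. Tasks with negative nice values use the steeper 4x scale.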
  • 38. Scheduling with Process Creation. Call path: do_fork(), copy_process(), sched_fork(), then wake_up_new().
    sched_fork():
      p->state = TASK_RUNNING;
      INIT_LIST_HEAD(&p->run_list);
      p->array = NULL;
      spin_lock_init(&p->switch_lock);
      ...
      local_irq_disable();
      p->time_slice = (current->time_slice + 1) >> 1;
      p->first_time_slice = 1;
      current->time_slice >>= 1;
      p->timestamp = sched_clock();
      if (unlikely(!current->time_slice)) {
          current->time_slice = 1;
          preempt_disable();
          scheduler_tick();
          local_irq_enable();
          preempt_enable();
      } else
          local_irq_enable();
    wake_up_new():
      p->prio = current->prio;
      list_add_tail(...);
      p->array = current->array;
      p->array->nr_active++;
      rq->nr_running++;
    To avoid the COW overhead, we let the child go first.
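The timeslice arithmetic in sched_fork() splits the parent's remaining slice between parent and child, so forking cannot be used to gain extra CPU time. A minimal sketch of just that arithmetic follows. The `toy_` names are hypothetical, and where the kernel code calls scheduler_tick() to expire an exhausted parent, this sketch only reports the condition through its return value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduction of task_struct to the two fields involved. */
struct toy_task {
    unsigned int time_slice;
    unsigned int first_time_slice;
};

/*
 * Split the parent's remaining timeslice with the child. The child gets
 * the rounded-up half, the parent keeps the rounded-down half, so the
 * total is conserved. Returns 1 if the parent was left with nothing
 * (the case where the slide's code forces a scheduler_tick()), else 0.
 */
static int toy_sched_fork(struct toy_task *parent, struct toy_task *child)
{
    assert(parent != NULL && child != NULL);
    child->time_slice = (parent->time_slice + 1) >> 1;
    child->first_time_slice = 1;
    parent->time_slice >>= 1;
    if (parent->time_slice == 0) {
        parent->time_slice = 1;  /* as on the slide, before the forced tick */
        return 1;
    }
    return 0;
}
```

A parent with 7 ticks left ends up with 3 and the child with 4. A parent with a single tick left hands it to the child and is flagged as exhausted.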
  • 39. Scheduling with Process Termination. Call path: sys_exit(), do_exit(), exit_notify(), release_task(), sched_exit(); do_exit() then calls schedule() and, should that ever return, BUG().
    sched_exit():
      rq = task_rq_lock(p->parent, &flags);
      if (p->first_time_slice) {
          p->parent->time_slice += p->time_slice;
          if (unlikely(p->parent->time_slice > task_timeslice(p)))
              p->parent->time_slice = task_timeslice(p);
      }
      if (p->sleep_avg < p->parent->sleep_avg)
          p->parent->sleep_avg = p->parent->sleep_avg / (EXIT_WEIGHT + 1) * EXIT_WEIGHT +
              p->sleep_avg / (EXIT_WEIGHT + 1);
      task_rq_unlock(rq, &flags);
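sched_exit() does two things: it returns a short-lived child's unused timeslice to the parent, and it pulls the parent's sleep_avg toward that of a less interactive child. The sketch below works through both with small numbers. EXIT_WEIGHT is not defined on the slide, so the value 3 is an assumption. The `toy_` names are hypothetical, and the cap (task_timeslice(p) on the slide) is passed in as a plain parameter.

```c
#include <assert.h>
#include <stddef.h>

#define EXIT_WEIGHT 3  /* assumption: value not shown on the slide */

/* Hypothetical reduction of task_struct to the fields sched_exit() touches. */
struct toy_exit_task {
    unsigned int time_slice;
    unsigned int first_time_slice;
    unsigned long sleep_avg;
};

/*
 * Mirror of the slide's sched_exit() arithmetic, without locking.
 * max_slice stands in for task_timeslice(p).
 */
static void toy_sched_exit(const struct toy_exit_task *child,
                           struct toy_exit_task *parent,
                           unsigned int max_slice)
{
    assert(child != NULL && parent != NULL);

    /* A child still on its first timeslice gives the remainder back. */
    if (child->first_time_slice) {
        parent->time_slice += child->time_slice;
        if (parent->time_slice > max_slice)
            parent->time_slice = max_slice;
    }

    /* Weighted average: EXIT_WEIGHT parts parent, one part child. */
    if (child->sleep_avg < parent->sleep_avg)
        parent->sleep_avg =
            parent->sleep_avg / (EXIT_WEIGHT + 1) * EXIT_WEIGHT +
            child->sleep_avg / (EXIT_WEIGHT + 1);
}
```

With a parent at 50 ticks and sleep_avg 800, and a first-slice child at 30 ticks and sleep_avg 400, the parent ends with 80 ticks and a sleep_avg of 700 (600 from its own share plus 100 from the child).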
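  • 40. Control Flow of scheduler_tick(). [Flowchart.] Realtime task? If yes: round robin? If no (FIFO), continue execution. If round robin: timeslice remaining? If yes, continue execution; if no, set the need_reschedule flag and requeue the task. If not a realtime task: timeslice remaining? If yes, continue execution; if no, remove the task from the active array, set the need_reschedule flag, and recalculate its priority and timeslice. Then: interactive enough? If no, reinsert into expired. If yes: is any task in the expired array starving? If yes, reinsert into expired; if no, reinsert into active.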
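The decision tree in this flowchart can be written as one small function. This is a sketch of the control flow as the slides describe it, not the kernel's scheduler_tick(). The enums, the struct, and the function are hypothetical, and the interactivity and starvation tests are supplied as precomputed flags.

```c
#include <assert.h>
#include <stddef.h>

enum toy_policy { TOY_NORMAL, TOY_FIFO, TOY_RR };

enum toy_action {
    TOY_CONTINUE,            /* keep running, no reschedule requested  */
    TOY_REQUEUE_ACTIVE_TAIL, /* RR: back to the tail of its own list   */
    TOY_REINSERT_ACTIVE,     /* interactive normal task stays active   */
    TOY_MOVE_EXPIRED         /* normal task moves to the expired array */
};

struct toy_tick_task {
    enum toy_policy policy;
    unsigned int time_slice;
    int interactive;         /* result of the interactivity test */
};

/*
 * One timer tick for the running task. Every result other than
 * TOY_CONTINUE corresponds to setting the need_reschedule flag.
 * fresh_slice is the timeslice handed out when the old one runs out.
 */
static enum toy_action toy_scheduler_tick(struct toy_tick_task *p,
                                          int expired_starving,
                                          unsigned int fresh_slice)
{
    assert(p != NULL);

    if (p->policy == TOY_FIFO)
        return TOY_CONTINUE;              /* FIFO tasks have no timeslice */

    if (p->time_slice > 0 && --p->time_slice > 0)
        return TOY_CONTINUE;              /* timeslice remains */

    p->time_slice = fresh_slice;          /* refill for the next turn */

    if (p->policy == TOY_RR)
        return TOY_REQUEUE_ACTIVE_TAIL;   /* RR tasks never expire */

    if (!p->interactive || expired_starving)
        return TOY_MOVE_EXPIRED;
    return TOY_REINSERT_ACTIVE;
}
```

A normal task with one tick left and `interactive` set returns `TOY_REINSERT_ACTIVE` when nothing is starving, and `TOY_MOVE_EXPIRED` when `expired_starving` is nonzero.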
  • 41. Charge Ticks to the Current Process. timer_interrupt() calls update_process_times(user_mode(regs)), where
      #define user_mode(regs) (!!((regs)->cs & 3))
    One tick is converted with jiffies_to_cputime(1). User mode? If yes, account_user_time(): p->utime = cputime_add(p->utime, cputime); if no, account_system_time(): p->stime = cputime_add(p->stime, cputime);