CPU Scheduling (Galvin Notes, 9th Ed.)
Chapter 6: CPU Scheduling
Contents
• BASIC CONCEPTS
o CPU-I/O Burst Cycle
o CPU Scheduler
o Preemptive Scheduling
o Dispatcher
• SCHEDULING CRITERIA
• SCHEDULING ALGORITHMS
o First-Come First-Serve Scheduling, FCFS
o Shortest-Job-First Scheduling, SJF
o Priority Scheduling
o Round Robin Scheduling
o Multilevel Queue Scheduling
o Multilevel Feedback-Queue Scheduling
• THREAD SCHEDULING
o Contention Scope
o Pthread Scheduling
• ALGORITHM EVALUATION
o Deterministic Modeling
o Queuing Models
o Simulations
o Implementation
SKIPPED CONTENT
• MULTIPLE-PROCESSOR SCHEDULING
o Approaches to Multiple-Processor Scheduling
o Processor Affinity
o Load Balancing
o Multicore Processors
o Virtualization and Scheduling (Optional, Omitted from 9th edition)
• REAL-TIME CPU SCHEDULING
o Minimizing Latency
o Priority-Based Scheduling
o Rate-Monotonic Scheduling
o Earliest-Deadline-First Scheduling
o Proportional Share Scheduling
o POSIX Real-Time Scheduling
• OPERATING SYSTEM EXAMPLES (OPTIONAL)
o Example: Linux Scheduling
o Example: Windows XP Scheduling
BASIC CONCEPTS
Almost all programs have some alternating cycle of CPU number crunching and waiting for I/O of some kind. (Even a simple fetch from memory takes a long time relative to CPU speeds.) In a simple system running a single process, the time spent waiting for I/O is wasted, and those CPU cycles are lost forever. A scheduling system allows one process to use the CPU while another is waiting for I/O, thereby making full use of otherwise lost CPU cycles. The challenge is to make the overall system as "efficient" and "fair" as possible, subject to varying and often dynamic conditions, and where "efficient" and "fair" are somewhat subjective terms, often subject to shifting priority policies.
• CPU-I/O Burst Cycle: Almost all processes alternate between two states in a continuing cycle, as shown in Figure 6.1: (a) a CPU burst of performing calculations, and (b) an I/O burst, waiting for data transfer in or out of the system. CPU bursts vary from process to process and from program to program.
• CPU Scheduler: Whenever the CPU becomes idle, it is the job of the CPU scheduler (a.k.a. the short-term scheduler) to select another process from the ready queue to run next. The ready queue is not necessarily a first-in, first-out (FIFO) queue; as we shall see when we consider the various scheduling algorithms, it can be implemented as a FIFO queue, a priority queue, a tree, or simply an unordered linked list. There are several alternatives to choose from, as well as numerous adjustable parameters for each algorithm, which is the basic subject of this entire chapter. Conceptually, however, all the processes in the ready queue are lined up waiting for a chance to run on the CPU. The records in the queues are generally the process control blocks (PCBs) of the processes.
• Preemptive Scheduling: CPU scheduling decisions take place under one of four conditions:
o When a process switches from the running state to the waiting state, such as for an I/O request or invocation of the wait() system call.
o When a process switches from the running state to the ready state, for example in response to an interrupt.
o When a process switches from the waiting state to the ready state, say at completion of I/O or a return from wait().
o When a process terminates.
For conditions 1 and 4 there is no choice - a new process must be selected. For conditions 2 and 3 there is a choice - either continue running the current process, or select a different one. If scheduling takes place only under conditions 1 and 4, the system is said to be non-preemptive, or cooperative. Under these conditions, once a process starts running it keeps running until it either voluntarily blocks or finishes. Otherwise the system is said to be preemptive. Windows used non-preemptive scheduling up to Windows 3.x, and started using preemptive scheduling with Win95. Macs used non-preemptive scheduling prior to OSX, and preemptive scheduling since then. Note that preemptive scheduling is only possible on hardware that supports a timer interrupt.
Note that preemptive scheduling can cause problems when two processes share data, because one process may get interrupted in the middle of updating shared data structures (Chapter 5 examined this issue in greater detail). Preemption can also be a problem if the kernel is busy implementing a system call (e.g. updating critical kernel data structures) when the preemption occurs. Most modern UNIXes deal with this problem by making the process wait until the system call has either completed or blocked before allowing the preemption. Unfortunately this solution is problematic for real-time systems, as real-time response can no longer be guaranteed. Some critical sections of code protect themselves from concurrency problems by disabling interrupts before entering the critical section and re-enabling interrupts on exiting it. Needless to say, this should only be done in rare situations, and only on very short pieces of code that will finish quickly (usually just a few machine instructions).
• Dispatcher: The dispatcher is the module that gives control of the CPU to the process selected by the scheduler. This function involves: (a) switching context, (b) switching to user mode, and (c) jumping to the proper location in the newly loaded program. The dispatcher needs to be as fast as possible, as it is run on every context switch. The time consumed by the dispatcher is known as dispatch latency.
SCHEDULING CRITERIA
• There are several different criteria to consider when trying to select the "best" scheduling algorithm for a particular situation and environment, including:
o CPU utilization - Ideally the CPU would be busy 100% of the time, so as to waste 0 CPU cycles. On a real system CPU usage should range from 40% (lightly loaded) to 90% (heavily loaded).
o Throughput - Number of processes completed per unit time. May range from 10/second to 1/hour depending on the specific processes.
o Turnaround time - Time required for a particular process to complete, from submission time to completion (wall-clock time).
o Waiting time - The time processes spend in the ready queue waiting their turn to get on the CPU. The CPU-scheduling algorithm does not affect the amount of time during which a process executes or does I/O; it affects only the amount of time a process spends waiting in the ready queue. Waiting time is the sum of the periods spent waiting in the ready queue. (Load average - The average number of processes sitting in the ready queue waiting their turn to get onto the CPU. Reported in 1-minute, 5-minute, and 15-minute averages.)
o Response time - The time taken in an interactive program from the issuance of a command to the commencement of a response to that command.
In general one wants to optimize the average value of a criterion (maximize CPU utilization and throughput, and minimize all the others). However, sometimes one wants to do something different, such as minimize the maximum response time. Sometimes it is more desirable to minimize the variance of a criterion than its actual value, i.e. users are more accepting of a consistent, predictable system than an inconsistent one, even if it is a little bit slower.
SCHEDULING ALGORITHMS
The following subsections will explain several common scheduling strategies, looking at only a single CPU burst (in milliseconds) for each of a small number of processes. Obviously real systems have to deal with a lot more simultaneous processes executing their CPU-I/O burst cycles.
First-Come First-Serve Scheduling, FCFS
• FCFS can yield some very long average wait times, particularly if the first process to get there takes a long time. For example, consider three processes with CPU bursts of 24, 3, and 3 ms for P1, P2, and P3 respectively. If P1 arrives first, the average waiting time for the three processes is ( 0 + 24 + 27 ) / 3 = 17.0 ms. If P2 and P3 arrive ahead of P1 instead, the same three processes have an average wait time of ( 0 + 3 + 6 ) / 3 = 3.0 ms. The total run time for the three bursts is the same, but in the second case two of the three finish much quicker, and the other process is only delayed by a short amount.
• FCFS can also block the system in a busy dynamic system in another way, known as the convoy effect. When one CPU-intensive process blocks the CPU, a number of I/O-intensive processes can get backed up behind it, leaving the I/O devices idle. When the CPU hog finally relinquishes the CPU, the I/O processes pass through the CPU quickly, leaving the CPU idle while everyone queues up for I/O, and then the cycle repeats itself when the CPU-intensive process gets back to the ready queue.
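The FCFS arithmetic above is simple enough to sketch in a few lines of C. This is a minimal illustration, not anything from the text: each process waits for the sum of the bursts queued ahead of it, so the two orderings of the same 24/3/3 ms bursts give the 17.0 ms and 3.0 ms averages quoted above.

```c
/* Average waiting time under FCFS, all processes arriving at time 0:
   process i waits for the sum of the bursts of everything ahead of it. */
double fcfs_avg_wait(const int burst[], int n) {
    double total_wait = 0.0;
    int clock = 0;
    for (int i = 0; i < n; i++) {
        total_wait += clock;   /* process i has waited until `clock` */
        clock += burst[i];     /* then occupies the CPU for its burst */
    }
    return total_wait / n;
}
```

Feeding it {24, 3, 3} versus {3, 3, 24} reproduces the two averages from the example, making the cost of a long job at the head of the queue concrete.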
Shortest-Job-First Scheduling, SJF
• The idea behind the SJF algorithm is to pick the quickest little job that needs to be done, get it out of the way first, and then pick the next smallest job to do next. (Technically this algorithm picks a process based on the next shortest CPU burst, not the overall process time.) For example, with four processes whose next bursts are 6, 8, 7, and 3 ms (P1 through P4), and the assumption that all jobs arrive at the same time, SJF runs them in the order P4, P1, P3, P2, and the wait time is ( 0 + 3 + 9 + 16 ) / 4 = 7.0 ms (as opposed to 10.25 ms for FCFS for the same processes).
• For long-term batch jobs this can be done based upon the limits that users set for their jobs when they submit them. Another option would be to statistically measure the run-time characteristics of jobs, particularly if the same tasks are run repeatedly and predictably (but once again that really isn't a viable option for short-term CPU scheduling in the real world). A more practical approach is to predict the length of the next burst, based on some historical measurement of recent burst times for this process.
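The usual historical predictor here is an exponential average, tau_next = alpha * t + (1 - alpha) * tau, where t is the most recent measured burst and tau the previous prediction. A one-line sketch (the alpha and tau values below are illustrative, not from the text):

```c
/* Exponential average of measured CPU-burst lengths:
   tau_next = alpha * t_observed + (1 - alpha) * tau_prev.
   alpha = 0 ignores recent history entirely; alpha = 1 trusts
   only the most recent burst. alpha = 0.5 weights them equally. */
double predict_next_burst(double alpha, double tau_prev, double t_observed) {
    return alpha * t_observed + (1.0 - alpha) * tau_prev;
}
```

With alpha = 0.5, a previous prediction of 10 ms, and an observed burst of 6 ms, the next prediction is 8 ms - each new measurement pulls the estimate halfway toward itself while older history decays geometrically.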
• SJF can be either preemptive or non-preemptive. Preemption occurs when a new process arrives in the ready queue with a predicted burst time shorter than the time remaining in the process whose burst is currently on the CPU. Preemptive SJF is sometimes referred to as shortest-remaining-time-first scheduling. For example, take four processes arriving at times 0, 1, 2, and 3 ms with burst times of 8, 4, 9, and 5 ms respectively. P2 never waits, and the average wait time is ( ( 10 - 1 ) + 0 + ( 17 - 2 ) + ( 5 - 3 ) ) / 4 = 26 / 4 = 6.5 ms (as opposed to 7.75 ms for non-preemptive SJF, or 8.75 ms for FCFS).
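The preemptive-SJF figure can be checked with a tick-at-a-time simulation - a sketch written for this note, not code from the text. At every millisecond it runs whichever arrived process has the least work remaining, which is exactly the shortest-remaining-time-first rule:

```c
typedef struct { int arrival, burst; } Proc;

/* One-millisecond-granularity simulation of shortest-remaining-time-first.
   Supports up to 16 processes; burst times and arrivals are in ms. */
double srtf_avg_wait(const Proc p[], int n) {
    int rem[16], finish[16], done = 0, clock = 0;
    for (int i = 0; i < n; i++) rem[i] = p[i].burst;
    while (done < n) {
        int best = -1;
        for (int i = 0; i < n; i++)     /* shortest remaining among arrived */
            if (p[i].arrival <= clock && rem[i] > 0 &&
                (best < 0 || rem[i] < rem[best]))
                best = i;
        if (best < 0) { clock++; continue; }   /* nothing has arrived: idle */
        rem[best]--;                    /* run the chosen process for 1 ms */
        clock++;
        if (rem[best] == 0) { finish[best] = clock; done++; }
    }
    double total = 0.0;                 /* wait = finish - arrival - burst */
    for (int i = 0; i < n; i++)
        total += finish[i] - p[i].arrival - p[i].burst;
    return total / n;
}
```

Running it on arrivals 0/1/2/3 with bursts 8/4/9/5 reproduces the 6.5 ms average from the worked example.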
Priority Scheduling
• Priority scheduling is a more general case of SJF, in which each job is assigned a priority and the job with the highest priority gets scheduled first. (SJF uses the inverse of the next expected burst time as its priority - the smaller the expected burst, the higher the priority.) This book uses low numbers for high priorities, with 0 being the highest possible priority. For example, five processes with burst times of 10, 1, 2, 1, and 5 ms and priorities of 3, 1, 4, 5, and 2 respectively run in the order P2, P5, P1, P3, P4 and yield an average waiting time of ( 6 + 0 + 16 + 18 + 1 ) / 5 = 8.2 ms.
• Priorities can be assigned either internally or externally. Internal priorities are assigned by the OS using criteria such as average burst time, ratio of CPU to I/O activity, system resource use, and other factors available to the kernel. External priorities are assigned by users, based on the importance of the job, fees paid, etc.
• Priority scheduling can be either preemptive or non-preemptive.
• Priority scheduling can suffer from a major problem known as indefinite blocking, or starvation, in which a low-priority task can wait forever because there are always other jobs around that have higher priority. One common solution to this problem is aging, in which the priorities of jobs increase the longer they wait. Under this scheme a low-priority job will eventually get its priority raised high enough that it gets run.
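Aging can be sketched as a periodic sweep over the ready queue - this is an illustrative fragment of my own, not a specific algorithm from the text, and the 30-tick interval is an arbitrary choice:

```c
/* Aging sketch: every `interval` ticks a process spends waiting, raise
   its priority one step. Smaller number = higher priority, matching the
   convention in the text; priority 0 is the ceiling. */
void age_priorities(int prio[], const int waited[], int n, int interval) {
    for (int i = 0; i < n; i++)
        if (waited[i] > 0 && waited[i] % interval == 0 && prio[i] > 0)
            prio[i]--;   /* one step closer to the front of the line */
}
```

However the interval is chosen, the key property is monotonic: a waiting job's priority only ever improves, so no job can be starved forever.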
Round Robin Scheduling
• Round robin scheduling is similar to FCFS scheduling, except that CPU bursts are assigned with limits called time quanta. When a process is given the CPU, a timer is set for whatever value has been set for the time quantum. If the process finishes its burst before the time-quantum timer expires, then it is swapped out of the CPU just as in the normal FCFS algorithm. If the timer goes off first, then the process is swapped out of the CPU and moved to the back end of the ready queue. The ready queue is maintained as a circular queue, so when all processes have had a turn, the scheduler gives the first process another turn, and so on. RR scheduling can give the effect of all processes sharing the CPU equally, although the average wait time can be longer than with other scheduling algorithms. For the three processes from the FCFS example (bursts of 24, 3, and 3 ms) with a quantum of 4 ms, the average wait time is 5.66 ms.
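A round-robin pass can also be simulated directly - again a sketch written for this note, assuming all processes arrive at time 0 (as in the textbook example with bursts of 24, 3, and 3 ms and a 4 ms quantum):

```c
/* Round-robin simulation for processes that all arrive at time 0.
   Each pass of the outer loop hands every unfinished process up to
   one quantum of CPU time. Supports up to 16 processes. */
double rr_avg_wait(const int burst[], int n, int quantum) {
    int rem[16], finish[16], done = 0, clock = 0;
    for (int i = 0; i < n; i++) rem[i] = burst[i];
    while (done < n) {
        for (int i = 0; i < n; i++) {
            if (rem[i] == 0) continue;
            int slice = rem[i] < quantum ? rem[i] : quantum;
            clock += slice;
            rem[i] -= slice;
            if (rem[i] == 0) { finish[i] = clock; done++; }
        }
    }
    double total = 0.0;        /* wait = finish - burst, since arrival = 0 */
    for (int i = 0; i < n; i++) total += finish[i] - burst[i];
    return total / n;
}
```

With bursts {24, 3, 3} and a quantum of 4, this gives 17/3 = 5.67 ms (the 5.66 in the notes is the same value truncated). Raising the quantum above the longest burst makes the answer collapse to the 17.0 ms FCFS average, illustrating the degeneration described below.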
• The performance of RR is sensitive to the time quantum selected. If the quantum is large enough, then RR reduces to the FCFS algorithm; if it is very small, then each process gets 1/nth of the processor time and they share the CPU equally. BUT, a real system incurs overhead for every context switch, and the smaller the time quantum the more context switches there are. (See Figure 6.4.) Most modern systems use time quanta between 10 and 100 milliseconds and context-switch times on the order of 10 microseconds, so the overhead is small relative to the time quantum.
• Turnaround time also varies with quantum time, in a non-obvious manner. Consider, for example, the processes shown in Figure 6.5. In general, turnaround time is minimized if most processes finish their next CPU burst within one time quantum. For example, with three processes of 10 ms bursts each, the average turnaround time for a 1 ms quantum is 29, and for a 10 ms quantum it reduces to 20. However, if the quantum is made too large, then RR just degenerates to FCFS. A rule of thumb is that 80% of CPU bursts should be smaller than the time quantum.
Multilevel Queue Scheduling
• When processes can be readily categorized, then multiple separate queues can be established, each implementing whatever scheduling algorithm is most appropriate for that type of job, and/or with different parametric adjustments. Scheduling must also be done between queues, that is, scheduling one queue to get time relative to other queues. Two common options are strict priority (no job in a lower-priority queue runs until all higher-priority queues are empty) and round-robin (each queue gets a time slice in turn, possibly of different sizes). Note that under this algorithm jobs cannot switch from queue to queue - once they are assigned a queue, that is their queue until they finish.
Multilevel Feedback-Queue Scheduling
• Multilevel feedback queue scheduling is similar to the ordinary multilevel queue scheduling described above, except jobs may be moved from one queue to another for a variety of reasons: (A) If the characteristics of a job change between CPU-intensive and I/O-intensive, then it may be appropriate to switch the job from one queue to another. (B) Aging can also be incorporated, so that a job that has waited for a long time can get bumped up into a higher-priority queue for a while.
• Multilevel feedback queue scheduling is the most flexible, because it can be tuned for any situation. But it is also the most complex to implement, because of all the adjustable parameters. Some of the parameters which define one of these systems include: (A) The number of queues. (B) The scheduling algorithm for each queue. (C) The methods used to upgrade or demote processes from one queue to another (which may be different). (D) The method used to determine which queue a process enters initially.
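The four tunables (A)-(D) can be gathered into a single configuration record - a hypothetical struct of my own devising, not any real scheduler's interface, just to make the parameter space concrete:

```c
#define MLFQ_MAX_QUEUES 4

/* Hypothetical parameter block for a multilevel feedback queue,
   mirroring the four adjustable parameters listed above. */
struct mlfq_config {
    int num_queues;                    /* (A) how many queues            */
    int quantum_ms[MLFQ_MAX_QUEUES];   /* (B) per-queue RR quantum;
                                              0 means the queue is FCFS */
    int demote_after_expiries;         /* (C) quantum expiries before a
                                              job drops a level         */
    int entry_queue;                   /* (D) queue a new process enters */
};
```

A classic three-level setup - 8 ms quantum on top, 16 ms below it, FCFS at the bottom, new jobs entering at the top - is then just one initializer, which shows how compactly such a scheduler can be parameterized.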
THREAD SCHEDULING
• The process scheduler schedules only the kernel threads. User threads are mapped to kernel threads by the thread library - the OS (and in particular the scheduler) is unaware of them.
o Contention Scope:
▪ Contention scope refers to the scope in which threads compete for the use of physical CPUs. On systems implementing many-to-one and many-to-many threads, Process Contention Scope (PCS) occurs, because competition occurs between threads that are part of the same process. (This is the management/scheduling of multiple user threads on a single kernel thread, and is managed by the thread library.) System Contention Scope (SCS) involves the system scheduler scheduling kernel threads to run on one or more CPUs. Systems implementing one-to-one threads (XP, Solaris 9, Linux) use only SCS. PCS scheduling is typically done with priority, where the programmer can set and/or change the priority of threads created by his or her programs.
o Pthread Scheduling: The Pthread library provides for specifying contention scope:
▪ PTHREAD_SCOPE_PROCESS schedules threads using PCS, by scheduling user threads onto available LWPs using the many-to-many model.
▪ PTHREAD_SCOPE_SYSTEM schedules threads using SCS, by binding user threads to particular LWPs, effectively implementing a one-to-one model.
The pthread_attr_getscope() and pthread_attr_setscope() functions provide for getting and setting the contention scope, respectively.
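The two attribute calls are real POSIX API; a minimal sketch of querying and requesting a scope looks like this (note that an implementation supporting only one model - Linux's one-to-one threads support only PTHREAD_SCOPE_SYSTEM - may return ENOTSUP for the other):

```c
#include <pthread.h>

/* Query the default contention scope of a thread-attributes object,
   then (optionally) request system scope. Returns the default scope:
   PTHREAD_SCOPE_SYSTEM or PTHREAD_SCOPE_PROCESS. */
int default_contention_scope(void) {
    pthread_attr_t attr;
    int scope = -1;
    pthread_attr_init(&attr);
    pthread_attr_getscope(&attr, &scope);       /* read current scope */
    /* May fail with ENOTSUP where only one model exists. */
    pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    pthread_attr_destroy(&attr);
    return scope;
}
```

The attribute would then be passed to pthread_create() to launch a thread with the chosen scope; the point here is only that contention scope is a per-thread creation attribute, not a global setting.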
ALGORITHM EVALUATION
The first step in determining which algorithm (and what parameter settings within that algorithm) is optimal for a particular operating environment is to determine what criteria are to be used, what goals are to be targeted, and what constraints, if any, must be applied. For example, one might want to "maximize CPU utilization, subject to a maximum response time of 1 second". Once criteria have been established, then different algorithms can be analyzed and a "best choice" determined. The following sections outline some different methods for determining the "best choice".
• Deterministic Modeling: If a specific workload is known, then the exact values for major criteria can be fairly easily calculated and the "best" choice determined. For example, consider the following workload (with all processes arriving at time 0) and the resulting schedules determined by three different algorithms. The average waiting times for FCFS, SJF, and RR are 28 ms, 13 ms, and 23 ms respectively. Deterministic modeling is fast and easy, but it requires specific known input, and the results only apply for that particular set of input. However, by examining multiple similar cases, certain trends can be observed (like the fact that for processes arriving at the same time, SJF will always yield the shortest average wait time).
• Queuing Models: Specific process data is often not available, particularly for future times. However, a study of historical performance can often produce statistical descriptions of certain important parameters, such as the rate at which new processes arrive, the ratio of CPU bursts to I/O times, the distribution of CPU burst times and I/O burst times, etc. Armed with those probability distributions and some mathematical formulas, it is possible to calculate certain performance characteristics of individual waiting queues (e.g., Little's formula). Queuing models treat the computer as a network of interconnected queues, each of which is described by its probability distribution statistics and formulas such as Little's formula. Unfortunately real systems and modern scheduling algorithms are so complex as to make the mathematics intractable in many cases with real systems.
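Little's formula itself is one line: in steady state, n = lambda x W, where n is the average queue length, lambda the average arrival rate, and W the average time a process spends waiting in the queue. The example numbers below are illustrative only:

```c
/* Little's formula for a queue in steady state:
   n = lambda * W
   n      - average number of processes in the queue
   lambda - average arrival rate (processes per second)
   W      - average time a process spends in the queue (seconds) */
double littles_queue_length(double lambda, double avg_wait) {
    return lambda * avg_wait;
}
```

For instance, if processes arrive at 7 per second and each waits 2 seconds on average, the ready queue holds 14 processes on average. Because the relation has three variables, knowing any two of them yields the third - which is what makes it useful for inferring unmeasured quantities from logged ones.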
• Simulations: Another approach is to run computer simulations of the different proposed algorithms (and adjustment parameters) under different load conditions, and to analyze the results to determine the "best" choice of operation for a particular load pattern. Operating conditions for simulations are often randomly generated using distribution functions similar to those described above. A better alternative, when possible, is to generate trace tapes by monitoring and logging the performance of a real system under typical expected workloads. These are better because they provide a more accurate picture of system loads, and also because they allow multiple simulations to be run with the identical process load, and not just statistically equivalent loads. A compromise is to randomly determine system loads and then save the results into a file, so that all simulations can be run against identical randomly determined system loads. Although trace tapes provide more accurate input information, they can be difficult and expensive to collect and store, and their use increases the complexity of the simulations significantly. There is also some question as to whether the future performance of the new system will really match the past performance of the old system.
Implementation
The only real way to determine how a proposed scheduling algorithm is going to operate is to implement it on a real system. Even in this case, the measured results may not be definitive, for at least two major reasons: (A) System workloads are not static, but change over time as new programs are installed, new users are added to the system, new hardware becomes available, new work projects get started, and even societal changes occur. (For example, the explosion of the Internet has drastically changed the amount of network traffic that a system sees and the importance of handling it with rapid response times.) (B) As mentioned above, changing the scheduling system may have an impact on the workload and the ways in which users use the system. (The book gives an example of a programmer who modified his code to write an arbitrary character to the screen at regular intervals, just so his job would be classified as interactive and placed into a higher-priority queue.) Most modern systems provide some capability for the system administrator to adjust scheduling parameters, either on the fly or as the result of a reboot or a kernel rebuild.
Summary
• CPU scheduling is the task of selecting a waiting process from the ready queue and allocating the CPU to it. The CPU is allocated to the selected process by the dispatcher.
• First-come, first-served (FCFS) scheduling is the simplest scheduling algorithm, but it can cause short processes to wait for very long processes. Shortest-job-first (SJF) scheduling is provably optimal, providing the shortest average waiting time. Implementing SJF scheduling is difficult, however, because predicting the length of the next CPU burst is difficult. The SJF algorithm is a special case of the general priority scheduling algorithm, which simply allocates the CPU to the highest-priority process. Both priority and SJF scheduling may suffer from starvation. Aging is a technique to prevent starvation.
• Round-robin (RR) scheduling is more appropriate for a time-shared (interactive) system. RR scheduling allocates the CPU to the first process in the ready queue for q time units, where q is the time quantum. After q time units, if the process has not relinquished the CPU, it is preempted, and the process is put at the tail of the ready queue. The major problem is the selection of the time quantum. If the quantum is too large, RR scheduling degenerates to FCFS scheduling. If the quantum is too small, scheduling overhead in the form of context-switch time becomes excessive.
• The FCFS algorithm is nonpreemptive; the RR algorithm is preemptive. The SJF and priority algorithms may be either preemptive or nonpreemptive.
• Multilevel queue algorithms allow different algorithms to be used for different classes of processes. The most common model includes a foreground interactive queue that uses RR scheduling and a background batch queue that uses FCFS scheduling. Multilevel feedback queues allow processes to move from one queue to another.
• Many contemporary computer systems support multiple processors and allow each processor to schedule itself independently. Typically, each processor maintains its own private queue of processes (or threads), all of which are available to run. Additional issues related to multiprocessor scheduling include processor affinity, load balancing, and multicore processing.
• A real-time computer system requires that results arrive within a deadline period; results arriving after the deadline has passed are useless. Hard real-time systems must guarantee that real-time tasks are serviced within their deadline periods. Soft real-time systems are less restrictive, assigning real-time tasks higher scheduling priority than other tasks.
• Real-time scheduling algorithms include rate-monotonic and earliest-deadline-first scheduling. Rate-monotonic scheduling assigns tasks that require the CPU more often a higher priority than tasks that require the CPU less often. Earliest-deadline-first scheduling assigns priority according to upcoming deadlines - the earlier the deadline, the higher the priority. Proportional share scheduling divides processor time into shares and assigns each process a number of shares, thus guaranteeing each process a proportional share of CPU time. The POSIX Pthread API provides various features for scheduling real-time threads as well.
• Operating systems supporting threads at the kernel level must schedule threads - not processes - for execution. This is the case with Solaris and Windows. Both of these systems schedule threads using preemptive, priority-based scheduling algorithms, including support for real-time threads. The Linux process scheduler uses a priority-based algorithm with real-time support as well. The scheduling algorithms for these three operating systems typically favor interactive over CPU-bound processes.
• The wide variety of scheduling algorithms demands that we have methods to select among them. Analytic methods use mathematical analysis to determine the performance of an algorithm. Simulation methods determine performance by imitating the scheduling algorithm on a "representative" sample of processes and computing the resulting performance. However, simulation can at best provide an approximation of actual system performance. The only reliable technique for evaluating a scheduling algorithm is to implement the algorithm on an actual system and monitor its performance in a "real-world" environment.
Read Later
• A thread-scheduling programming sneak peek using PTHREAD_SCOPE_PROCESS and the other functions mentioned in the notes above - a brief and simple idea of how the concepts work out in the programming world. Skipping it in the first run for pragmatic reasons only; worth reading to consolidate what you learnt.
• Multiple-Processor Scheduling (includes processor affinity, load balancing with push and pull migration, and multicore processors): not directly relevant from a pragmatic view even in the second run, but worth a read.