CPU Scheduling
Hi!
We are Lino Sarto and Giorgio Vitiello
Our Telegram usernames are
@linuxarto, @iGiorgio
Contents
▸ Scheduling Overview
▸ Multi-Processor Scheduling
▸ Fair-Share Scheduling
▸ Lottery Scheduling
▸ Scheduling in our time
Basic Concepts
Introduction to CPU Scheduling
▸ CPU scheduling is the basis of
multiprogrammed operating systems
▸ There are evaluation criteria for selecting
a CPU-scheduling algorithm for a
particular system
Introduction to CPU Scheduling
▸ CPU–I/O Burst Cycle – Process
execution consists of a cycle of CPU
execution and I/O wait
▸ CPU burst followed by I/O burst
▸ CPU burst distribution is of main concern
Introduction to CPU Scheduling
Short-term Scheduler
▸ The short-term scheduler selects from
among the processes in the ready queue
and allocates the CPU to one of them
Preemptive Vs Non-Preemptive Scheduling
▸ Non-Preemptive Scheduling
▹ The running process keeps the CPU until it voluntarily gives up
the CPU
[Process state diagram: New, Ready, Running, Waiting, Terminated; the process leaves Running only of its own accord]
Preemptive Vs Non-Preemptive Scheduling
▸ Preemptive Scheduling
▹ The running process can be interrupted and must release the
CPU
[Process state diagram: New, Ready, Running, Waiting, Terminated; the scheduler can force a Running → Ready transition]
Preemptive Scheduling Issues
▸ Consider access to shared data
▸ Consider preemption while in kernel
mode
▸ Consider interrupts occurring during
crucial OS activities
Scheduling Criteria
Different CPU-scheduling
algorithms have different
properties, and the choice of a
particular algorithm may favor
one class of processes over
another
Scheduling Criteria
▸ CPU utilization – keep the CPU as busy as possible
▸ Throughput – # of processes that complete their execution
per time unit
▸ Turnaround time – amount of time to execute a particular
process
▸ Waiting time – amount of time a process has been waiting in
the ready queue
▸ Response time – amount of time it takes from when a
request was submitted until the first response is produced
Scheduling Criteria
▸ Max CPU utilization
▸ Max throughput
▸ Min turnaround time
▸ Min waiting time
▸ Min response time
Scheduling Criteria
▸ Main goals of scheduling
▹ Maximize resource use
▹ Fairness
▸ Switching is expensive
▹ Trade-off between maximizing usage and
fairness
Planned vs On-demand
▸ Planned
▹ Schedule in advance
▹ Supervisor/developer plans schedule given a priori knowledge
about all tasks and deadlines
▸ On-demand
▹ Schedule “on-the-fly”
▹ Supervisor runs periodically and makes decision based on
current information
Where is planned scheduling used?
Real Time Systems
▸ Hard real-time
▹ Missing a deadline is a total system failure
▸ Soft real-time
▹ Undesirable to miss too many deadlines too
badly
Scheduling Strategies - FCFS
First Come, First Served (FCFS)
▸ Non-preemptive: processes run to completion in arrival order
[Gantt chart: P1, then P2, then P3, each running to completion]
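The FCFS behavior above can be sketched numerically. This is a minimal sketch with made-up burst times (not from the slides): waiting time under FCFS is just a running clock.

```python
def fcfs_wait_times(bursts):
    """Waiting time of each process under FCFS, assuming all
    processes arrive at t=0 in the listed order."""
    waits, clock = {}, 0
    for name, burst in bursts:
        waits[name] = clock   # waits until everything before it finishes
        clock += burst
    return waits

# A long job arriving first delays every job behind it
print(fcfs_wait_times([("P1", 24), ("P2", 3), ("P3", 3)]))
# {'P1': 0, 'P2': 24, 'P3': 27}
```

With these bursts the average wait is (0 + 24 + 27) / 3 = 17 time units; running the short jobs first would cut it sharply (the convoy effect).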
Scheduling Strategies - Round Robin
Round Robin
▸ Each process gets to run for a set time slice, or
until it finishes or gets blocked
[Timeline: P1, P2, P3 each run for one time slice; when P3 blocks on I/O before its slice expires, the CPU switches among the remaining ready processes: P1, P2, P1, …]
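The round-robin policy above can be sketched with a ready queue. Time slice and burst values are illustrative:

```python
from collections import deque

def round_robin(bursts, time_slice=2):
    """Order in which processes finish under round robin.
    bursts: dict name -> CPU time needed (all ready at t=0)."""
    ready = deque(bursts.items())
    finished = []
    while ready:
        name, remaining = ready.popleft()
        remaining -= min(time_slice, remaining)  # run one slice (or less)
        if remaining == 0:
            finished.append(name)            # done: leaves the system
        else:
            ready.append((name, remaining))  # back of the ready queue
    return finished

print(round_robin({"P1": 5, "P2": 2, "P3": 3}))
# ['P2', 'P3', 'P1'] -- short jobs finish early, long jobs keep cycling
```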
Scheduling Strategies
▸ FCFS
▹ Maximizes resource use
▸ Round Robin
▹ Fairness
Which one is my laptop using?
Scheduling Strategies
Scheduling Strategies - Priorities
▸ More important processes get higher priority
▸ Linux inverts priority values
▹ Highest priority = 0
▹ Lowest priority = 99 (highest number varies by Linux flavor)
ps -w -e -o pri,pid,pcpu,vsz,cputime,command | sort -n -r --key=5 | sort --stable -n --key=1
Scheduling Strategies - Preemptive Priority
▸ Always runs the highest
priority process that is
ready to run
▸ Round-robin schedule
among equally high, ready
to run, highest-priority
processes
▸ Multi-level Feedback
Queue (MLFQ)
Fairness?
Priority Inversion
Mars Pathfinder is the first
completed mission in NASA’s
Discovery program for planetary
missions with highly focused
scientific goals. The project
started in October 1993 and
incurred a total cost of 265
million dollars. Launched on
December 4, 1996, it landed on
Mars on July 4, 1997. The last
data transmission was recorded
on September 27, 1997
Priority Inversion - Priority Inheritance
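Priority inheritance can be illustrated with a toy tick-based simulation. This is a hypothetical H/M/L scenario, not the actual Pathfinder code: L holds a lock for its entire job, H needs that lock, and M just burns CPU.

```python
# Lower number = higher priority. Without inheritance, M preempts L,
# so high-priority H ends up waiting behind *medium*-priority M.
def simulate(inherit):
    prio  = {"H": 0, "M": 1, "L": 2}
    ready = {"H": 2, "M": 3, "L": 0}   # arrival times (ticks)
    work  = {"H": 2, "M": 4, "L": 4}   # CPU ticks each task needs
    timeline = []
    for t in range(20):
        lock_held = work["L"] > 0      # L is inside its critical section
        def eff(p):
            # with inheritance, L borrows H's priority while H waits
            if inherit and p == "L" and lock_held and ready["H"] <= t:
                return prio["H"]
            return prio[p]
        runnable = [p for p in work
                    if work[p] > 0 and ready[p] <= t
                    and not (p == "H" and lock_held)]  # H blocks on lock
        if not runnable:
            break
        current = min(runnable, key=eff)
        work[current] -= 1
        timeline.append(current)
    return "".join(timeline)

print(simulate(inherit=False))  # LLLMMMMLHH -- H waits behind M
print(simulate(inherit=True))   # LLLLHHMMMM -- H only waits for L
```

The Pathfinder fix was essentially the second run: priority inheritance was enabled on the shared mutex, so the low-priority holder finished its critical section quickly instead of being starved by medium-priority work.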
Multi-Processor Scheduling
Thread Scheduling
Hyper-Threading
[Diagram: Threads 1–4 interleaved across hyper-threaded logical processors]
Processor affinity
Fair Share Scheduling
A Fair Share Scheduler
Judy Kay
Piers Lauder
Introduction
➢ One of the critical requirements of a scheduler is that it be
fair
➢ Traditionally, this has been interpreted in the context of
processes, and it has meant that schedulers were designed
to share resources fairly between processes
➢ More recently, it has become clear that schedulers need to be
fair to users rather than just processes
A Fair Scheduler
➢The context for developing Share was that of the main teaching
machine in a computer science department
➢1000 undergraduates in several classes as well as a staff of about
100
➢Our load was almost exclusively interactive and had frequent,
extreme peaks when a major assignment was due
➢60-85 active terminals, mainly engaged in editing, compiling and
running small to medium Pascal programs
DEC VAX 11/780
A Fair Scheduler
➢Unix scheduler was inadequate for a number of
reasons
○ It gave more of the machine to users with more processes
○ It didn’t take into account the long-term history of a user's activity
○ When one class had an assignment due, and all the students in that class wanted
to work very hard, everyone suffered poor response
○ If someone needed good response for activities like practical examinations or
project demonstrations, it was difficult to ensure that they could get that response
without denying all other users access to the machine
A Fair Scheduler
➢Fixed-budget model (charging)
○ Each user has a fixed size budget allotted to them
○ As they use resources, the budget is reduced and when it is empty, they cannot
use the machine at all
○ A process can get a better machine share if the user who owns it is prepared to
pay more for the resources used by that process
○ The fixed-budget model can be refined so that each user has several budgets,
each associated with different resources
A Fair Scheduler
➢We use this approach in these cases
○ For resources like disc space, each user has a limit
○ Resources like printer pages are allocated to each user as a
budget that has a tight upper bound and is updated each day
○ Daily connect limits are available to prevent individuals from
hogging the machine within a single day
A Fair Scheduler
➢ All these measures helped control consumption of
resources but did not deal with the problems of
CPU allocation we described earlier
➢ It was for these that we developed the Share
scheduler
Objective of Share
➢ Seem fair
➢ Be understandable
➢ Be predictable
➢ Accommodate special needs
➢ Deal well with peak loads
➢ Encourage load spreading
➢ Give interactive users reasonable response
➢ Give some resources to each process so that no process is
postponed indefinitely
➢ Be very cheap to run
User’s View of Share
➢Essentially Share is based on the principle that
○ Everyone should be scheduled for some resources
○ According to their entitlement
○ As defined by their shares
➢A user’s shares indicate their entitlement to do work on the machine
➢Every user has a usage, which is a decayed measure of the work that
the user has done
➢The decay rate determines how long a user’s history of resource
consumption affects their response
User’s View of Share
Looked at from the
user’s point of view,
Share gives worse
response to users who
have had more than their
fair share so that they
cannot work as hard as
users who have not had
their fair share
Differences between Share and previous schedulers
➢ Does not have a fixed-budget
➢ Maintains Unix nice, a per-process value that scales the
contribution of the process's costs to its priority growth
➢ Share’s charges are associated with different resources such
as memory occupancy, system calls, CPU usage
➢ We set charges at different levels at different times of the
day
Share Implementation
➢ Share has two main components
○ User-Level Scheduling
○ Process-Level Scheduling
Share Implementation
➢ User-level scheduling
○ Update usage for each user by adding charges incurred
by all their processes since the last update and decaying
by the appropriate constant
○ Update accumulated records of resource consumption
for planning, monitoring, policy decisions
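The user-level update described above can be sketched in a few lines. The decay constant here is illustrative, not the paper's actual value:

```python
# One user-level update per period: decay the usage history,
# then add the charges the user's processes accrued since last time.
def update_usage(usage, charges, decay=0.9):
    return decay * usage + charges

# A user who worked hard and then stopped: their usage (and hence the
# penalty on their response) fades away over successive periods.
u = 100.0
history = []
for _ in range(5):
    u = update_usage(u, charges=0.0)
    history.append(round(u, 1))
print(history)
# [90.0, 81.0, 72.9, 65.6, 59.0]
```

The decay rate is exactly the knob the slide mentions: it determines how long a user's history of resource consumption keeps affecting their response.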
Share Implementation
➢ The Process-Level Scheduling has two types of process
○ Active Process, describes any process that is ready to run
○ Current Process, describes the active process that actually has
control of the CPU
Share Implementation
➢ There are three types of activity at the process level
○ Activation of a new process
○ Adjustment of the priority of the current process
○ Decaying of the priorities of all processes
Share Implementation
➢ Multiple processor machines don’t affect the
implementation provided that the kernel still uses a
single priority queue for processes
➢ The only difference is that processes are selected
to run from the front of the queue more often, and
incur charges more frequently, than if only one
processor were present
Share Implementation
➢It’s possible for a user to achieve a very large usage during a
relatively idle period
➢Then, if new users become active, the original user’s share
becomes so small that they can do no effective work
➢This user’s processes are effectively marooned with
insufficient CPU allocation even to exit
➢Marooning is avoided by a combination of bounds on the
normalised form of priority and the process priority decay rate
Hierarchical Share
➢ The simple version of Share was inadequate for a machine
that is shared between organisations or independent groups
of users
➢ Share is fine if we can make the following assumptions
○ The total allocation of shares for each organisation is strictly maintained in the
proportions that the machine split is made
○ The users in each organization are equally active
○ K1 is acceptable at the organisational level and is constant for all users
○ Costs for resources are consistent for all users, and the other parameters of
Share, including K2, t1, t2 and t3, are accepted for all users
Hierarchical Share
➢Users alter the m_share value
whenever they log in or log out, so
m_share values must usually be
recalculated at each log in or log out.
This poses a small but acceptable
overhead
➢Share acts fairly under full load, but a
light load can distort it, as the figure
shows
Design Goals
➢ That it be fair
○ With the simple, non-hierarchical Share, users deemed the
scheduler to be treating them fairly
○ We have observed a number of situations in which Share has
dealt with potentially disastrous situations to the satisfaction of
most users
○ In one case with the hierarchical Share system, a user initiated
a long-running CPU-bound process; Share ensured that users in
other groups were unaffected by the problem
Design Goals
➢ That it be understandable
○ Our users interpret relatively poor response as an indication that
they have exceeded their machine share
Design Goals
➢ That it be predictable
○ Each user’s personal profile lists their effective machine share
and they quickly learn to interpret this in a meaningful way
○ Users speak of a certain machine share as being adequate to
do one task but not another
Design Goals
➢ The scheduler should accommodate special needs
○ For situations where one needs to guarantee a user excellent
response for a brief period, Share allocates a relatively large
number of shares to the relevant user’s account for the duration
of the special needs
○ Typically, other users suffer only small periods of reduced
response
Design Goals
➢ That it should deal well with peak loads
○ In our design environment, one of the classic causes of a peak
load is the deadline for an assignment
○ With Share, the individuals in the class that is working to the
deadline are penalised as their usage grows; meanwhile, other
students get good response and are often unaware of the other
class’s deadline
○ Under heavy load, heavy users suffer most
Design Goals
➢That it should encourage load spreading
○ Users who spread their workload over time get better response
➢That it should give interactive users reasonable response
○ We can ensure this goal by combining Share with a check at login time that only
allows users to log in if they can get reasonable response
○ Those who have high usage do get poor response and if the machine is heavily
loaded, the poor response may well be intolerable for interactive tasks
Design Goals
➢ No process should be postponed indefinitely
○ Share allocates some resources to every process
➢ That it should be very cheap to run
○ Since most of the costly calculations are performed relatively infrequently, Share
creates only a small overhead relative to the conventional scheduler
Conclusion
➢ Users perceive the scheduler as fair
➢ The strengths of Share are that it
○ Is fair to users and to groups
○ Users cannot cheat the system
○ Groups of users are protected from each other
○ Gives a good prediction of the response that a user might expect
○ Gives meaningful feedback to users on the cost of various services
○ Helps spread load
Lottery Scheduling
Lottery Scheduling - Overview
Carl A. Waldspurger
William E. Weihl
Lottery Scheduling - Overview
▸ Each user/process gets a share of the “tickets”
▹ e.g. 1000 total tickets, 20 processes each get 50 tickets
▸ User/process can distribute tickets however it
wants
▹ Among its own threads, can “loan” to other processes’ threads
▸ Scheduler: randomly picks a ticket
▹ The associated thread gets to run for that time slice
▸ The paper introduces currencies
▹ For allocating CPU among multiple threads of a single user
Lottery Scheduling - Example
Lottery Scheduling vs Traditional Schedulers
▸ Priority schemes
▹ Priority assignments often ad hoc
▹ Schemes using decay-usage scheduling are poorly understood
▸ Fair share schemes adjust priorities with a
feedback loop to achieve fairness
▹ Only achieve long-term fairness
Lottery Scheduling - Ticket properties
▸ Abstract
▹ Operate independently of machine details
▸ Relative
▹ Chances of winning the lottery depend on the contention level
▸ Uniform
▹ Can be used for many different resources
Lottery Scheduling - Another advantage
▸ Lottery scheduling is starvation-free
▹ Every ticket holder will eventually get the resource
Fairness Analysis
Lottery Scheduling - Fairness analysis
▸ Lottery scheduling is probabilistically fair
▸ If a thread has t tickets out of T
▹ Its probability of winning a lottery is p = t/T
▹ Its expected number of wins over n drawings is
np
▹ Binomial distribution
▹ Variance σ² = np(1 - p)
Lottery Scheduling - Standard Deviation Overview
Expected value μ
Standard deviation σ
Coefficient of variation σ* = σ / |μ|
Lottery Scheduling - Fairness analysis
▸ Coefficient of variation of number of wins
σ/np = √((1 - p) / np)
▹ Decreases with √n
▸ Number of tries before winning the lottery follows
a geometric distribution
▹ The probability Pr(X=k) that the kth trial is
the first success is Pr(X=k) = (1 - p)^(k-1) p
▸ As time passes, each thread ends up receiving its
share of the resource
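A quick Monte-Carlo sketch of this convergence (ticket counts and seed are arbitrary): the observed win fraction approaches p = t/T as the number of drawings grows, with relative spread shrinking as 1/√n.

```python
import random

def lottery_fraction(tickets_a, tickets_b, draws, seed=1):
    """Fraction of draws won by A when A holds tickets_a of the total."""
    rng = random.Random(seed)
    total = tickets_a + tickets_b
    wins = sum(1 for _ in range(draws)
               if rng.randrange(total) < tickets_a)
    return wins / draws

# With p = 75/100, many draws put the observed fraction close to 0.75
frac = lottery_fraction(75, 25, draws=20_000)
```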
Lottery Scheduling - Geometric Distribution
Modular Resource Management
Lottery Scheduling - Ticket transfer
▸ Explicit transfer of tickets from one client to
another
▸ They can be used whenever a client blocks due to
some dependency
▹ When a client waits for a reply from a server, it
can temporarily transfer its tickets to the server
▸ They eliminate priority inversions
Lottery Scheduling - Ticket inflation
▸ Lets users create new tickets
▹ Like printing their own money
▹ Counterpart is ticket deflation
▸ Normally disallowed except among mutually
trusting clients
▹ Lets them adjust their priorities dynamically
without explicit communication
Lottery Scheduling - Ticket currencies
▸ Consider the case of a user/process managing
multiple threads
▹ Want to let him favor some threads over others
▹ Without impacting the threads of other processes
▸ Will let him create new tickets but will debase the
individual values of all the tickets he owns
▹ His tickets will be expressed in a new currency that will have a
variable exchange rate with the base currency
Lottery Scheduling - Ticket currencies - Analogy
Lottery Scheduling - Ticket currencies
▸ Computing values
▹ Currency: sum value of
backing tickets
▹ Ticket: compute share
of currency value
Example:
▸ task1 funding in base units?
▸ 100 / (300/1000)
▸ 333 base units
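The conversion in the example can be written out; the helper below is hypothetical, but it mirrors the slide's arithmetic: a ticket's base value is its share of the currency's total backing.

```python
def base_value(amount, currency_issued, backing_in_base):
    """Base-unit value of `amount` tickets of a currency that has
    `currency_issued` tickets outstanding, backed by
    `backing_in_base` units of the base currency."""
    return amount * backing_in_base / currency_issued

# Slide example: 100 tickets of a currency with 300 issued,
# backed by 1000 base units -> 100 / (300/1000)
print(round(base_value(100, 300, 1000)))  # 333
```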
Lottery Scheduling - Compensation tickets
▸ I/O-bound threads are likely to get less than their fair
share of the CPU because they often block before
their CPU quantum expires
▸ Compensation tickets address this imbalance
Lottery Scheduling - Compensation tickets
▸ A client that consumes only a fraction f of its CPU
quantum can be granted a compensation ticket
▹ The ticket inflates the value of all the client’s tickets by
1/f until the client next gets the CPU
▸ For example, if the CPU quantum is 100 ms
▹ Client1 releases the CPU after 20 ms, so f = 0.2
▹ Value of all tickets owned by Client1 will be
multiplied by 1/f = 1/0.2 = 5 until Client1 gets
the CPU
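The 1/f factor from the example, written out (a hypothetical helper, not from the paper):

```python
def compensation_factor(used_ms, quantum_ms=100):
    """Inflation factor 1/f for a client that used only `used_ms`
    of a `quantum_ms` slice before blocking."""
    f = used_ms / quantum_ms
    return 1 / f

# Client1 from the slide: 20 ms of a 100 ms quantum -> factor 5
print(compensation_factor(20))  # 5.0
```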
Lottery Scheduling - Compensation tickets
▸ Compensation tickets
▹ Favor I/O-bound and interactive threads
▹ Helps them get their fair share of the CPU
Implementation
Lottery Scheduling - Implementation
▸ On a MIPS-based DECstation running Mach3
microkernel
▹ Time slice is 100 ms, fairly large, as the scheme
does not allow preemption
▸ Requires a fast pseudo-random number generator
and a fast way to pick lottery winner
Lottery Scheduling - DECstation Overview
▸ The MIPS-based workstations
ran Ultrix, a DEC-proprietary
version of UNIX
▸ The first line of computer
systems given the DECstation
name was word processing
systems based on the PDP-8
(a 12-bit minicomputer
produced by DEC)
▸ These systems, built into a
VT52 terminal, were also
known as the VT78
Lottery Scheduling - DECstation Overview
▸ The DECstation 5000 Model 100 Series are mid-
range workstations
▸ The Model 125, used in the paper, have two
external caches, a 64 KB instruction cache and a
64 KB data cache
▸ The system may contain 32 to 480 MB of memory.
▸ The memory subsystem operates at 25 MHz
Lottery Scheduling - DECstation Overview
Picking the winner
Lottery Scheduling - Picking the winner
▸ Assign to each client a range of consecutive lottery
ticket numbers
▸ Create a list of clients
▸ Use a Pseudo-Random Generator to generate a
winning ticket number
▸ Search client list until we find the winner
▸ Search time is O(n), where n is list length
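A minimal sketch of the O(n) scan (the client list is made up): walk the list summing ticket counts until the running sum passes the winning number.

```python
def draw_winner(clients, winning_number):
    """Linear scan over (name, tickets) pairs: the winner is the
    client whose ticket range contains winning_number."""
    running_sum = 0
    for name, tickets in clients:
        running_sum += tickets
        if winning_number < running_sum:
            return name
    raise ValueError("winning number out of ticket range")

# A owns tickets 0-9, B owns 10-14, C owns 15-39
clients = [("A", 10), ("B", 5), ("C", 25)]
print(draw_winner(clients, 12))  # B
```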
Lottery Scheduling - Picking the winner
▸ For large n use a tree of partial ticket
sums, with clients at leaves
Lottery Scheduling - Picking the winner
[Diagram: binary tree with partial ticket sums (10, 12, 17, 18) at internal nodes; the search compares the winning number at each node and descends left (≤) or right (>)]
▸ O(log(n)) operations
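As a sketch of the O(log n) lookup, a flat prefix-sum array with binary search stands in for the tree of partial sums (an assumption of this sketch: a real scheduler would want the tree so ticket counts can also be *updated* in O(log n)).

```python
from bisect import bisect_right
from itertools import accumulate

def build(clients):
    """Precompute cumulative ticket sums for (name, tickets) pairs."""
    names = [name for name, _ in clients]
    partial = list(accumulate(t for _, t in clients))
    return names, partial

def draw_winner(names, partial, winning_number):
    # binary search over the cumulative sums: O(log n)
    return names[bisect_right(partial, winning_number)]

names, partial = build([("A", 10), ("B", 5), ("C", 25)])
print(draw_winner(names, partial, 12))  # B
```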
Experiments
Lottery Scheduling - Long term fairness
▸ Dhrystone benchmark
▸ Two tasks
▸ Three 60-second runs
for each ratio
▸ Gets better over longer
intervals
Lottery Scheduling - Short term fluctuations
▸ Dhrystone benchmark
▸ Two tasks
▸ 2 : 1 allocation
▸ 8-second averages
Lottery Scheduling - Monte-Carlo Rates
▸ Many trials for accurate results
▸ Three tasks
▸ Ticket inflation
▸ Funding based on relative error
Lottery Scheduling - Query Processing Rates
▸ Multithreaded “database” server
▸ Three clients
▸ 8 : 3 : 1 allocation
▸ Ticket transfers
▸ Currencies A, B
2 : 1 funding
▸ Task A funding 100.A
▸ Task B1 funding 100.B
▸ Task B2 joins with
funding 100.B
Lottery Scheduling - Currencies Insulate Loads
What The Paper Does Not Say
Matching the Specializations in Traditional Schedulers
▸ D. Petrou, J. W. Milford, and G. A. Gibson, “Implementing Lottery Scheduling: Matching the Specializations in Traditional Schedulers,” Proc. USENIX ’99, Monterey, CA, June 9-11, 1999
“Straightforward implementation of lottery
scheduling does not provide the
responsiveness for a mixed interactive and
CPU-bound workload offered by the decay
usage priority scheduler of the FreeBSD
operating system”
“Moreover, standard lottery scheduling
ignores kernel priorities used in the
FreeBSD scheduler to reduce kernel lock
contention”
How To Assign The Tickets?
How to assign the tickets?
▸ Assuming that the users know best
▹ Each user is handed some number of tickets,
and a user can allocate tickets to any jobs they
run as desired
▸ “Ticket-assignment problem” remains
open
Why Not Deterministic?
▸ As we have seen, lottery scheduling does not
deliver exactly the right proportions, especially
over short time scales
▸ Waldspurger invented stride scheduling,
a deterministic fair-share scheduler
Stride Scheduling
▸ Each job in the system has a stride,
which is inversely proportional to the
number of tickets it has
▸ For example consider three jobs A, B,
and C, with 100, 50, and 250 tickets
Stride Scheduling
▸ We can compute the stride of each by
dividing some large number by the
number of tickets each process has
been assigned
▹ In our example we use 10,000 as the large number
▸ Every time a process runs, we will
increment a pass value for it by its stride
Stride Scheduling - Example
Who Runs?   Pass A (stride 100)   Pass B (stride 200)   Pass C (stride 40)
A           0                     0                     0
B           100                   0                     0
C           100                   200                   0
C           100                   200                   40
C           100                   200                   80
A           100                   200                   120
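The stride example can be reproduced in a few lines; tie-breaking in favor of the first-listed job is an assumption of this sketch. The job with the lowest pass always runs, so once the passes reach 100/200/120 it is A's turn again.

```python
def stride_schedule(tickets, steps, big_number=10_000):
    """Deterministic stride scheduling: run the job with the lowest
    pass value, then advance its pass by stride = big_number/tickets."""
    stride = {name: big_number // t for name, t in tickets.items()}
    passes = {name: 0 for name in tickets}
    order = []
    for _ in range(steps):
        current = min(passes, key=passes.get)  # lowest pass runs next
        order.append(current)
        passes[current] += stride[current]
    return order

# A: 100 tickets (stride 100), B: 50 (stride 200), C: 250 (stride 40)
print(stride_schedule({"A": 100, "B": 50, "C": 250}, steps=6))
# ['A', 'B', 'C', 'C', 'C', 'A']
```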
Why use
Lottery at all?
Given the precision of Stride Scheduling
Lottery vs Stride Scheduling
▸ Lottery scheduling has one nice property
that stride scheduling does not: no
global state
▸ Imagine a new job enters in the middle of
our stride scheduling: what should its
pass value be? Should it be set to 0? If
so, it will monopolize the CPU
Scheduling In Our Time
Scheduling In Our Time - Examples
▸ Linux scheduler, entitled the Completely Fair
Scheduler (CFS), implements fair-share scheduling,
but does so in a highly efficient and scalable
manner
▸ Windows schedules threads using a priority-based,
preemptive scheduling algorithm
▹ The portion of the Windows kernel that handles scheduling is
called the dispatcher
▹ Windows scheduler ensures that the highest-priority thread will
always run
Completely Fair Scheduler
▸ Scheduling in the Linux system is based
on scheduling classes
▹ Each class is assigned a specific priority
▸ The CFS scheduler assigns a proportion
of CPU processing time to each task
▹ This proportion is calculated based on the nice
value (range from -20 to +19) assigned to each
task
Completely Fair Scheduler
▸ CFS records how long each task has run
by maintaining the virtual run time of
each task using the per-task variable
vruntime
▸ The virtual run time is associated with a
decay factor based on the priority of a
task
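A toy model of the vruntime mechanism described above. Illustrative only: real CFS keeps runnable tasks in a red-black tree and maps nice values to weights via a fixed kernel table; the weights below are made up.

```python
import heapq

def run_steps(tasks, steps, slice_ns=1_000_000):
    """tasks: list of (vruntime, weight, name). Always run the task
    with the smallest vruntime; higher weight makes vruntime grow
    more slowly, so heavier tasks run more often."""
    heapq.heapify(tasks)
    counts = {}
    for _ in range(steps):
        vruntime, weight, name = heapq.heappop(tasks)
        counts[name] = counts.get(name, 0) + 1
        heapq.heappush(tasks, (vruntime + slice_ns / weight, weight, name))
    return counts

# A task with twice the weight receives twice the CPU share
print(run_steps([(0, 1.0, "lo"), (0, 2.0, "hi")], steps=30))
# {'lo': 10, 'hi': 20}
```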
CFS Performance
[Diagram: runnable tasks ordered by value of vruntime, smaller to larger; the task with the smallest value of vruntime is selected to run next]
Thanks for your attention!
Questions?
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 

Recently uploaded (20)

Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 

CPU Scheduling

  • 2. HI! We are Lino Sarto and Giorgio Vitiello. Our Telegram usernames are @linuxarto and @iGiorgio
  • 3. Contents ▸ Scheduling Overview ▸ Multi-Processor Scheduling ▸ Fair-Share Scheduling ▸ Lottery Scheduling ▸ Scheduling in our time
  • 5. Introduce CPU scheduling ▸ CPU scheduling is the basis for multiprogrammed OS ▸ There are evaluation criteria for selecting a CPU-scheduling algorithm for a particular system
  • 6. Introduce CPU scheduling ▸ CPU–I/O Burst Cycle – Process execution consists of a cycle of CPU execution and I/O wait ▸ CPU burst followed by I/O burst ▸ CPU burst distribution is of main concern
  • 9. Short-term Scheduler ▸ Short-term scheduler selects from among the processes in ready queue, and allocates the CPU to one of them
  • 10. Preemptive Vs Non-Preemptive Scheduling ▸ Non-Preemptive Scheduling ▹ The running process keeps the CPU until it voluntarily gives up the CPU (state diagram: New, Ready, Running, Waiting, Terminated)
  • 11. Preemptive Vs Non-Preemptive Scheduling ▸ Preemptive Scheduling ▹ The running process can be interrupted and must release the CPU (state diagram: New, Ready, Running, Waiting, Terminated)
  • 12. Preemptive Scheduling Issues ▸ Consider access to shared data ▸ Consider preemption while in kernel mode ▸ Consider interrupts occurring during crucial OS activities
  • 13. Scheduling Criteria Different CPU-scheduling algorithms have different properties, and the choice of a particular algorithm may favor one class of processes over another
  • 14. Scheduling Criteria ▸ CPU utilization – keep the CPU as busy as possible ▸ Throughput – # of processes that complete their execution per time unit ▸ Turnaround time – amount of time to execute a particular process ▸ Waiting time – amount of time a process has been waiting in the ready queue ▸ Response time – amount of time it takes from when a request was submitted until the first response is produced
  • 15. Scheduling Criteria ▸ Max CPU utilization ▸ Max throughput ▸ Min turnaround time ▸ Min waiting time ▸ Min response time
  • 16. Scheduling Criteria ▸ Main goals of scheduling ▹ Maximize resource use ▹ Fairness ▸ Switching is expensive ▹ Trade-off between maximizing usage and fairness
  • 17. Planned vs On-demand ▸ Planned ▹ Schedule in advance ▹ A supervisor/developer plans the schedule given a priori knowledge of all tasks and deadlines ▸ On-demand ▹ Schedule “on-the-fly” ▹ A supervisor runs periodically and makes decisions based on current information Where is planned scheduling used?
  • 18. Real Time Systems ▸ Hard real-time ▹ Missing a deadline is a total system failure ▸ Soft real-time ▹ Undesirable to miss too many deadlines too badly
  • 19. Scheduling Strategies - FCFS First Come, First Served (FCFS) ▸ Non-preemptive: processes run to completion in arrival order (diagram: P1, P2, P3)
  • 20. Scheduling Strategies - Round Robin Round Robin ▸ Each process runs for a set time slice, or until it finishes or blocks (diagram: P1, P2, P3 alternate in time slices; P3 blocks on I/O)
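The round-robin idea above can be sketched in a few lines of Python; the process names and burst lengths below are illustrative, not from the slides:

```python
from collections import deque

def round_robin(bursts, quantum):
    """Simulate round-robin over CPU bursts.

    bursts: list of (pid, remaining_time); quantum: length of one time slice.
    Returns the order in which processes receive time slices.
    """
    queue = deque(bursts)
    order = []
    while queue:
        pid, remaining = queue.popleft()
        order.append(pid)
        if remaining > quantum:
            # Preempted at the end of its slice: go to the back of the ready queue.
            queue.append((pid, remaining - quantum))
    return order

# P3's short burst finishes in one slice; P1 and P2 keep cycling.
print(round_robin([("P1", 3), ("P2", 5), ("P3", 2)], quantum=2))
# → ['P1', 'P2', 'P3', 'P1', 'P2', 'P2']
```

A process that blocks on I/O before its slice expires would simply leave the queue early; this sketch only models preemption at the end of a slice.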
  • 21. Scheduling Strategies ▸ FCFS ▹ Maximizes resource use ▸ Round Robin ▹ Fairness Which one is my laptop using?
  • 23. Scheduling Strategies - Priorities ▸ More important processes get higher priority ▸ Linux inverts priority values ▹ Highest priority = 0 ▹ Lowest priority = 99 (the highest number varies by Linux flavor) ps -w -e -o pri,pid,pcpu,vsz,cputime,command | sort -n -r --key=5 | sort --stable -n --key=1
  • 24. Scheduling Strategies - Preemptive Priority ▸ Always runs the highest priority process that is ready to run ▸ Round-robin schedule among equally high, ready to run, highest-priority processes ▸ Multi-level Feedback Queue (MLFQ) Fairness?
  • 25. Priority Inversion Mars Pathfinder was the first completed mission in NASA’s Discovery program for planetary missions with highly focused scientific goals. The project started in October 1993 and incurred a total cost of 265 million dollars. Launched on December 4, 1996, it landed on Mars on July 4, 1997. The last data transmission was recorded on September 27, 1997
  • 28. Priority Inversion - Priority Inheritance
  • 34. Hyper-Threading (diagram: hardware threads Thread 1–Thread 4 mapped onto physical cores)
  • 37. A Fair Share Scheduler Judy Kay Piers Lauder
  • 38. Introduction ➢ One of the critical requirements of a scheduler is that it be fair ➢ Traditionally, this has been interpreted in the context of processes, and it has meant that schedulers were designed to share resources fairly between processes ➢ More recently, it has become clear that schedulers need to be fair to users rather than just processes
  • 39. A Fair Scheduler ➢The context for developing Share was that of the main teaching machine in a computer science department ➢1000 undergraduates in several classes as well as a staff of about 100 ➢Our load was almost exclusively interactive and had frequent, extreme peaks when a major assignment was due ➢60-85 active terminals, mainly engaged in editing, compiling and running small to medium Pascal programs
  • 41. A Fair Scheduler ➢The Unix scheduler was inadequate for a number of reasons ○ It gave more of the machine to users with more processes ○ It didn’t take into account the long-term history of a user's activity ○ When one class had an assignment due, and all the students in that class wanted to work very hard, everyone suffered with poor response ○ If someone needed good response for activities like practical examinations or project demonstrations, it was difficult to ensure that they could get that response without denying all other users access to the machine
  • 42. A Fair Scheduler ➢Fixed-budget model (charging) ○ Each user has a fixed size budget allotted to them ○ As they use resources, the budget is reduced and when it is empty, they cannot use the machine at all ○ A process can get a better machine share if the user who owns it is prepared to pay more for the resources used by that process ○ The fixed-budget model can be refined so that each user has several budgets, each associated with different resources
  • 43. A Fair Scheduler ➢We use this approach in these cases ○ For resources like disc space, each user has a limit ○ Resources like printer pages are allocated to each user as a budget that has a tight upper bound and is updated each day ○ Daily connect limits are available to prevent individuals from hogging the machine within a single day
  • 44. A Fair Scheduler ➢ All these measures helped control consumption of resources but did not deal with the problems of CPU allocation we described earlier ➢ It was for these that we developed the Share scheduler
  • 45. Objective of Share ➢ Seem fair ➢ Be understandable ➢ Be predictable ➢ Accommodate special needs ➢ Deal well with peak loads ➢ Encourage load spreading ➢ Give interactive users reasonable response ➢ Give some resources to each process so that no process is postponed indefinitely ➢ Be very cheap to run
  • 46. User’s View of Share ➢Essentially Share is based on the principle that ○ Everyone should be scheduled for some resources ○ According to their entitlement ○ As defined by their shares ➢A user’s shares indicate their entitlement to do work on the machine ➢Every user has a usage, which is a decayed measure of the work that the user has done ➢The decay rate determines how long a user’s history of resource consumption affects their response
  • 47. User’s View of Share Looked at from the user’s point of view, Share gives worse response to users who have had more than their fair share so that they cannot work as hard as users who have not had their fair share
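The usage bookkeeping described above (slide 50 spells it out: decay the old usage, add the new charges) can be sketched as a one-line update; the decay constant and charge values here are made-up, not the paper's:

```python
def update_usage(usage, charges, decay):
    """One user-level update of Share's usage metric:
    decay the old usage, then add the charges incurred since the last update.

    decay is in (0, 1); a smaller decay forgets a user's history faster.
    """
    return usage * decay + charges

# Under a steady workload the usage converges to charges / (1 - decay),
# so history is bounded and recent consumption dominates the user's response.
u = 0.0
for _ in range(50):
    u = update_usage(u, charges=10.0, decay=0.5)
# u is now very close to 10 / (1 - 0.5) = 20
```

This is why a user who worked hard recently sees worse response than one who did not: their decayed usage is higher until enough update periods pass.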
  • 48. Differences between Share and previous schedulers ➢ Does not use a fixed budget ➢ Maintains Unix nice, a per-process value that reduces the contribution of that process’s costs to its owner’s usage growth ➢ Share’s charges are associated with different resources such as memory occupancy, system calls, CPU usage ➢ We set charges at different levels at different times of the day
  • 49. Share Implementation ➢ Share has two main components ○ User-Level Scheduling ○ Process-Level Scheduling
  • 50. Share Implementation ➢ User-level scheduling ○ Update usage for each user by adding charges incurred by all their processes since the last update and decaying by the appropriate constant ○ Update accumulated records of resource consumption for planning, monitoring, policy decisions
  • 51. Share Implementation ➢ The Process-Level Scheduling has two types of process ○ Active Process, describes any process that is ready to run ○ Current Process, describes the active process that actually has control of the CPU
  • 52. Share Implementation ➢ There are three types of activity at the process level ○ Activation of a new process ○ Adjustment of the priority of the current process ○ Decaying of the priorities of all processes
  • 58. Share Implementation ➢ Multiple processor machines don’t affect the implementation provided that the kernel still uses a single priority queue for processes ➢ The only difference is that processes are selected to run from the front of the queue more often, and incur charges more frequently, than if only one processor were present
  • 60. Share Implementation ➢It’s possible for a user to achieve a very large usage during a relatively idle period ➢Then, if new users become active, the original user’s share becomes so small that they can do no effective work ➢This user’s processes are effectively marooned with insufficient CPU allocation even to exit ➢Marooning is avoided by a combination of bounds on the normalised form of priority and the process priority decay rate
  • 61. Hierarchical Share ➢ The simple version of Share was inadequate for a machine that is shared between organisations or independent groups of users ➢ Share is fine if we can make the following assumptions ○ The total allocation of shares for each organisation is strictly maintained in the proportions that the machine split is made ○ The users in each organization are equally active ○ K1 is acceptable at the organisational level and is constant for all users ○ Costs for resources are consistent for all users, and the other parameters of Share, including K2, t1, t2 and t3, are accepted for all users
  • 62. Hierarchical Share ➢Users alter the m_share value whenever they log in or log out. For this reason m_share values will usually be recalculated at each log in or log out. This poses a small but acceptable overhead ➢Share acts fairly under full load, but a light load can distort it, as the figure depicts
  • 63. Design Goals ➢ That it be fair ○ With the simple, non-hierarchical Share, users deemed the scheduler to be treating them fairly ○ We have observed a number of situations in which Share has dealt with potentially disastrous situations to the satisfaction of most users ○ In one case with the hierarchical Share system, where a user initiated a long-running CPU-bound process, Share ensured that users in other groups were unaffected by the problem
  • 64. Design Goals ➢ That it be understandable ○ Our users interpret relatively poor response as an indication that they have exceeded their machine share
  • 65. Design Goals ➢ That it be predictable ○ Each user’s personal profile lists their effective machine share and they quickly learn to interpret this in a meaningful way ○ Users speak of a certain machine share as being adequate to do one task but not another
  • 66. Design Goals ➢ The scheduler should accommodate special needs ○ For the situation where one needs to guarantee a user excellent response for a brief period, Share allocates a relatively large number of shares to the relevant user’s account for the duration of the special needs ○ Typically, other users suffer only small periods of reduced response
  • 67. Design Goals ➢ That it should deal well with peak loads ○ In our design environment, one of the classic causes of a peak load is the deadline for an assignment ○ With Share, the individuals in the class that is working to the deadline are penalised as their usage grows; meanwhile, other students get good response and are often unaware of the other class’s deadline ○ Under heavy load, heavy users suffer most
  • 68. Design Goals ➢That it should encourage load spreading ○ Users can distribute their workload to get good responsiveness if they work constantly ➢That it should give interactive users reasonable response ○ We can ensure this goal by combining Share with a check at login time that only allows users to log in if they can get reasonable response ○ Those who have high usage do get poor response, and if the machine is heavily loaded, the poor response may well be intolerable for interactive tasks
  • 69. Design Goals ➢ No process should be postponed indefinitely ○ Share allocates some resources to every process ➢ That it should be very cheap to run ○ Since most of the costly calculations are performed relatively infrequently, Share creates only a small overhead relative to the conventional scheduler
  • 70. Conclusion ➢ Users perceive the scheduler as fair ➢ The strengths of Share are that it ○ It’s fair to users and to groups ○ Users cannot cheat the system ○ Groups of users are protected from each other ○ Gives a good prediction of the response that a user might expect ○ Gives meaningful feedback to users on the cost of various services ○ Helps spread load
  • 72. Lottery Scheduling - Overview Carl A. Waldspurger William E. Weihl
  • 73. Lottery Scheduling - Overview ▸ Each user/process gets a share of the “tickets” ▹ e.g. 1000 total tickets, 20 processes each get 50 tickets ▸ A user/process can distribute its tickets however it wants ▹ Among its own threads, or “loaned” to other processes’ threads ▸ Scheduler: randomly picks a ticket ▹ The associated thread gets to run for that time slice ▸ The paper introduces currencies ▹ For allocating CPU among multiple threads of a single user
  • 75. Lottery Scheduling vs Traditional Schedulers ▸ Priority schemes ▹ Priority assignments are often ad hoc ▹ Schemes using decay-usage scheduling are poorly understood ▸ Fair-share schemes adjust priorities with a feedback loop to achieve fairness ▹ Only achieve long-term fairness
  • 76. Lottery Scheduling - Ticket properties ▸ Abstract ▹ Operate independently of machine details ▸ Relative ▹ Chances of winning the lottery depends on contention level ▸ Uniform ▹ Can be used for many different resources
  • 77. Lottery Scheduling - Another advantage ▸ Lottery scheduling is starvation-free ▹ Every ticket holder will eventually get the resource
  • 79. Lottery Scheduling - Fairness analysis ▸ Lottery scheduling is probabilistically fair ▸ If a thread holds t tickets out of T ▹ Its probability of winning a lottery is p = t/T ▹ Its expected number of wins over n drawings is np ▹ Binomial distribution ▹ Variance σ² = np(1 − p)
  • 80. Lottery Scheduling - Standard Deviation Overview Expected value μ Standard deviation σ Coefficient of variation σ* = σ / |μ|
  • 81. Lottery Scheduling - Fairness analysis ▸ Coefficient of variation of the number of wins: σ/np = √((1 − p) / np) ▹ Decreases with √n ▸ The number of tries before winning the lottery follows a geometric distribution ▹ The probability that the kth trial is the first success is Pr(X = k) = (1 − p)^(k−1) p ▸ As time passes, each thread ends up receiving its share of the resource
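These binomial facts can be checked directly; the ticket counts and drawing counts below are illustrative:

```python
import math

def lottery_stats(t, T, n):
    """Expected wins, standard deviation, and coefficient of variation
    for a client holding t of T tickets over n independent drawings."""
    p = t / T
    mean = n * p                        # np
    sigma = math.sqrt(n * p * (1 - p))  # sqrt(np(1 - p))
    return mean, sigma, sigma / mean    # cv = sqrt((1 - p) / np)

# With half the tickets, relative accuracy improves as sqrt(n):
mean, sigma, cv = lottery_stats(1, 2, 100)        # mean 50, sigma 5, cv 0.1
mean2, sigma2, cv2 = lottery_stats(1, 2, 10000)   # cv drops to 0.01
```

This is the quantitative form of the slide's claim: short runs fluctuate, but the coefficient of variation shrinks as √n, so long-run shares converge to the ticket ratios.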
  • 82. Lottery Scheduling - Geometric Distribution
  • 84. Lottery Scheduling - Ticket transfer ▸ Explicit transfer of tickets from one client to another ▸ They can be used whenever a client blocks due to some dependency ▹ When a client waits for a reply from a server, it can temporarily transfer its tickets to the server ▸ They eliminate priority inversions
  • 85. Lottery Scheduling - Ticket inflation ▸ Lets users create new tickets ▹ Like printing their own money ▹ The counterpart is ticket deflation ▸ Normally disallowed except among mutually trusting clients ▹ Lets them adjust their priorities dynamically without explicit communication
  • 86. Lottery Scheduling - Ticket currencies ▸ Consider the case of a user/process managing multiple threads ▹ Want to let him favor some threads over others ▹ Without impacting the threads of other processes ▸ Will let him create new tickets but will debase the individual values of all the tickets he owns ▹ His tickets will be expressed in a new currency that will have a variable exchange rate with the base currency
  • 87. Lottery Scheduling - Ticket currencies - Analogy
  • 88. Lottery Scheduling - Ticket currencies ▸ Computing values ▹ Currency: sum value of backing tickets ▹ Ticket: compute share of currency value Example: ▸ task1 funding in base units? ▸ 100 / (300/1000) ▸ 333 base units
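The slide's arithmetic can be written out directly. Reading the example backwards, it assumes a currency that has issued 300 tickets and is backed by 1000 base units (those two numbers are inferred from the 100 / (300/1000) calculation, not stated elsewhere):

```python
def base_value(tickets, issued, backing):
    """Value in base units of `tickets` of a currency that has `issued`
    tickets outstanding and is backed by `backing` base units.

    Issuing more tickets in the same currency debases each one."""
    return tickets * backing / issued

# task1's 100 tickets, in a 300-ticket currency backed by 1000 base units:
print(round(base_value(100, issued=300, backing=1000)))  # → 333
```

Note the insulation property: if this currency issues more tickets, only the relative values inside it change; the currency's total backing, and hence other users' shares, are untouched.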
  • 89. Lottery Scheduling - Compensation tickets ▸ I/O-bound threads are likely to get less than their fair share of the CPU because they often block before their CPU quantum expires ▸ Compensation tickets address this imbalance
  • 90. Lottery Scheduling - Compensation tickets ▸ A client that consumes only a fraction f of its CPU quantum can be granted a compensation ticket ▹ The ticket inflates the value of all the client’s tickets by 1/f until the client next gets the CPU ▸ For example, if the CPU quantum is 100 ms ▹ Client1 releases the CPU after 20 ms, so f = 0.2 ▹ The value of all tickets owned by Client1 will be multiplied by 1/f = 1/0.2 = 5 until Client1 gets the CPU
  • 91. Lottery Scheduling - Compensation tickets ▸ Compensation tickets ▹ Favor I/O-bound and interactive threads ▹ Help them get their fair share of the CPU
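The 1/f inflation from the slides, as a sketch (the ticket count is illustrative):

```python
def compensated_tickets(tickets, used, quantum):
    """Effective ticket value for a client that used only a fraction
    f = used/quantum of its CPU quantum: inflate by 1/f until it next
    gets the CPU."""
    f = used / quantum
    return tickets / f

# Client1 blocks after 20 ms of a 100 ms quantum (f = 0.2):
# its 100 tickets count as 500 until it runs again.
print(compensated_tickets(100, used=20, quantum=100))
```

A thread that uses its full quantum has f = 1 and gets no boost; the shorter the burst, the larger the temporary boost, which is what keeps I/O-bound threads at their entitled share.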
  • 93. Lottery Scheduling - Implementation ▸ On a MIPS-based DECstation running the Mach 3 microkernel ▹ The time slice is 100 ms, fairly large as the scheme does not allow preemption ▸ Requires a fast pseudo-random number generator and a fast way to pick the lottery winner
  • 94. A Fair Scheduler - DECstation Overview ▸ The MIPS-based workstations ran Ultrix, a DEC-proprietary version of UNIX ▸ The first line of computer systems given the DECstation name were word processing systems based on the PDP-8 (a 12-bit minicomputer produced by DEC) ▸ These systems, built into a VT52 terminal, were also known as the VT78
  • 95. A Fair Scheduler - DECstation Overview ▸ The DECstation 5000 Model 100 Series are mid-range workstations ▸ The Model 125, used in the paper, has two external caches, a 64 KB instruction cache and a 64 KB data cache ▸ The system may contain 32 to 480 MB of memory ▸ The memory subsystem operates at 25 MHz
  • 96. A Fair Scheduler - DECstation Overview
  • 98. Lottery Scheduling - Picking the winner ▸ Assign to each client a range of consecutive lottery ticket numbers ▸ Create a list of clients ▸ Use a Pseudo-Random Generator to generate a winning ticket number ▸ Search client list until we find the winner ▸ Search time is O(n), where n is list length
  • 99. Lottery Scheduling - Picking the winner
  • 100. Lottery Scheduling - Picking the winner ▸ For large n use a tree of partial ticket sums, with clients at leaves
  • 101. Lottery Scheduling - Picking the winner (diagram: binary tree of partial ticket sums; comparing the winning number against each node’s partial sum steers the search left (≤) or right (>)) O(log(n)) operations
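The O(n) list walk from slide 98 can be sketched directly; client names and ticket counts are illustrative, and the RNG is passed in so the draw can be seeded:

```python
import random

def draw_winner(clients, rng):
    """clients: list of (name, tickets). Draw a winning ticket number in
    [0, total) and walk the list until the running partial sum covers it."""
    total = sum(t for _, t in clients)
    winning = rng.randrange(total)
    running = 0
    for name, tickets in clients:
        running += tickets
        if winning < running:
            return name

# Over many draws, wins track ticket shares (here 100 : 50 : 250 of 400).
rng = random.Random(42)  # seeded for reproducibility
clients = [("A", 100), ("B", 50), ("C", 250)]
wins = {"A": 0, "B": 0, "C": 0}
for _ in range(10000):
    wins[draw_winner(clients, rng)] += 1
# wins["C"] / 10000 is close to 250/400 = 0.625
```

The tree variant replaces the linear walk with a descent over partial sums, cutting the per-draw cost from O(n) to O(log n) for large client counts.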
  • 103. Lottery Scheduling - Long term fairness ▸ Dhrystone benchmark ▸ Two tasks ▸ Three 60-second runs for each ratio ▸ Gets better over longer intervals
  • 104. Lottery Scheduling - Short term fluctuations ▸ Dhrystone benchmark ▸ Two tasks ▸ 2 : 1 allocation ▸ 8-second averages
  • 105. Lottery Scheduling - Monte-Carlo Rates ▸ Many trials for accurate results ▸ Three tasks ▸ Ticket inflation ▸ Funding based on relative error
  • 106. Lottery Scheduling - Query Processing Rates ▸ Multithreaded “database” server ▸ Three clients ▸ 8 : 3 : 1 allocation ▸ Ticket transfers
  • 107. Lottery Scheduling - Currencies Insulate Loads ▸ Currencies A, B with 2 : 1 funding ▸ Task A funding 100.A ▸ Task B1 funding 100.B ▸ Task B2 joins with funding 100.B
  • 108. What The Paper Does Not Say
  • 109. Matching the Specialization in Traditional Scheduler ▸ D. Petrou, J. W. Milford, and G. A. Gibson, Implementing Lottery Scheduling: Matching the Specializations in Traditional Schedulers, Proc. USENIX '99, Monterey CA, June 9-11, 1999
  • 110. “Straightforward implementation of lottery scheduling does not provide the responsiveness for a mixed interactive and CPU-bound workload offered by the decay usage priority scheduler of the FreeBSD operating system”
  • 111. “Moreover, standard lottery scheduling ignores kernel priorities used in the FreeBSD scheduler to reduce kernel lock contention”
  • 112. How To Assign The Tickets?
  • 113. How to assign the tickets? ▸ Assuming that the users know best ▹ Each user is handed some number of tickets, and a user can allocate tickets to any jobs they run as desired ▸ “Ticket-assignment problem” remains open
  • 115. Why Not Deterministic? ▸ As we have seen, Lottery does not deliver exactly the right proportions, especially over short time scales ▸ Waldspurger invented stride scheduling, a deterministic fair-share scheduler
  • 116. Stride Scheduling ▸ Each job in the system has a stride, which is inversely proportional to the number of tickets it holds ▸ For example, consider three jobs A, B, and C, with 100, 50, and 250 tickets
  • 117. Stride Scheduling ▸ We can compute the stride of each by dividing some large number by the number of tickets each process has been assigned ▹ In our example we use 10,000 as the large number ▸ Every time a process runs, we increment its pass value by its stride
  • 118. Stride Scheduling - Example
  Who Runs | Pass A (stride 100) | Pass B (stride 200) | Pass C (stride 40)
  A        | 0   | 0   | 0
  B        | 100 | 0   | 0
  C        | 100 | 200 | 0
  C        | 100 | 200 | 40
  C        | 100 | 200 | 80
  A        | 100 | 200 | 120
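A sketch of the stride loop reproduces the table; the large number 10,000 comes from the text, and breaking pass-value ties by name is an assumption (any deterministic tie-break works):

```python
def stride_schedule(tickets, big=10000, slices=6):
    """tickets: dict name -> ticket count. Each slice, run the job with
    the lowest pass value and advance its pass by stride = big / tickets."""
    stride = {name: big / t for name, t in tickets.items()}
    passes = {name: 0.0 for name in tickets}
    order = []
    for _ in range(slices):
        # Lowest pass wins; ties broken alphabetically for determinism.
        job = min(passes, key=lambda name: (passes[name], name))
        order.append(job)
        passes[job] += stride[job]
    return order

print(stride_schedule({"A": 100, "B": 50, "C": 250}))
# → ['A', 'B', 'C', 'C', 'C', 'A']
```

Unlike the lottery, this delivers exact proportions over any window of `big / min_stride` slices, which is why short-term fluctuations disappear.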
  • 119. Why use Lottery at all, given the precision of Stride Scheduling?
  • 120. Stride Scheduling - Example ▸ Lottery scheduling has one nice property that stride scheduling does not: no global state ▸ Imagine a new job enters in the middle of our stride scheduling: what should its pass value be? Should it be set to 0? If so, it will monopolize the CPU
  • 122. Scheduling In Our Time - Examples ▸ The Linux scheduler, called the Completely Fair Scheduler (CFS), implements fair-share scheduling, but does so in a highly efficient and scalable manner ▸ Windows schedules threads using a priority-based, preemptive scheduling algorithm ▹ The portion of the Windows kernel that handles scheduling is called the dispatcher ▹ The Windows scheduler ensures that the highest-priority thread will always run
  • 123. Completely Fair Scheduler ▸ Scheduling in the Linux system is based on scheduling classes ▹ Each class is assigned a specific priority ▸ The CFS scheduler assigns a proportion of CPU processing time to each task ▹ This proportion is calculated based on the nice value (range from -20 to +19) assigned to each task
  • 124. Completely Fair Scheduler ▸ CFS records how long each task has run by maintaining the virtual run time of each task using the per-task variable vruntime ▸ The virtual run time is associated with a decay factor based on the priority of a task
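A highly simplified sketch of this accounting (illustrative names throughout; a geometric weight formula stands in for the kernel's precomputed weight table, and real CFS keeps tasks in a red-black tree with nanosecond accounting):

```python
# Illustrative sketch of CFS-style virtual runtime accounting.
NICE_0_WEIGHT = 1024

def weight(nice):
    # each nice level changes a task's weight by roughly 1.25x
    return NICE_0_WEIGHT / (1.25 ** nice)

def charge(task, runtime_ms):
    # higher-nice (lower-priority) tasks accumulate vruntime faster,
    # so they are selected less often
    task["vruntime"] += runtime_ms * NICE_0_WEIGHT / weight(task["nice"])

tasks = [{"name": "editor", "nice": -5, "vruntime": 0.0},
         {"name": "batch",  "nice": 10, "vruntime": 0.0}]

for _ in range(12):                              # 12 scheduling decisions
    t = min(tasks, key=lambda t: t["vruntime"])  # smallest vruntime runs
    charge(t, 10)                                # run it for a 10 ms slice
for t in tasks:
    print(t["name"], round(t["vruntime"], 1))
```

Because its vruntime grows much more slowly, the high-priority "editor" task wins almost every decision, while both tasks' virtual run times stay roughly balanced over time.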
  • 125. CFS Performance ▸ CFS selects the task with the smallest value of vruntime [figure: run queue ordered by value of vruntime, from smaller to larger]

Editor's Notes

  1. To design scheduling algorithms you have to take into account the number of CPUs available; in our example we can see in the resource monitor how many CPUs we have.
  2. These are the processes we have, with their PIDs, and each process has a certain number of threads.
  3. Hyper-threading: two cores, 4 logical processors (hardware threads); it is as if the operating system saw more cores than are physically present.
  4. You assign multiple hardware threads to a single core. The i7 supports 2 threads per core. Instead of mapping a single core onto a single hardware thread, you expose multiple hardware threads per core.
  5. Data locality: a scheduler must take into account the memory closest to the processor. With NUMA you have to account for local access: the scheduler must be NUMA-aware, i.e., whoever writes the scheduler has to take the underlying architecture into account. NUMA = Non-Uniform Memory Access.
  6. The DECstation was a brand of computers sold by DEC (first released in 1978, with the MIPS-based line following in 1989). These comprised a range of computer workstations based on the MIPS architecture.
  7. All these measures helped control resource consumption, but they did not address the CPU-allocation problems described earlier. For those, the Share scheduler was developed.
  8. 1. appear fair; 2. be understandable; 3. be predictable; 4. accommodate special needs: where a user needs excellent response for a short period, it should allow this with minimal disruption to the other users; 5. cope well with load peaks; 6. encourage load spreading; 7. give interactive users reasonable response; 8. give some resources to each process so that no process is postponed indefinitely; 9. be very cheap to run.
  9. Essentially, Share is based on the principle that everyone must be scheduled for some resources, according to their entitlement, as defined by their "shares" and their history of resource usage. A user's shares indicate their entitlement to work on the machine. The more shares a user has, the greater their entitlement. This should work linearly, so that if user A has twice as many shares as user B, then in the long term user A should be able to do twice as much work as user B. Each user has a usage, which is a decayed measure of the work the user has done. The decay rate is set by an administrative decision that determines how long a user's resource-consumption history affects their response.
  10. Indeed, Unix offers a mechanism rather like this in nice, an attribute of a process that a user can adjust to alter its scheduling priority. In a non-bureau environment, this approach is adequate. Share’s charges are associated with different resources such as memory occupancy, systems calls, CPU usage
  11. Share computes the charges due to a user for the resources they have consumed during the last cycle of the user-level scheduler. The charges are lower at off-peak periods. Each user can get an estimate of their share of the machine by comparing their usage against that of all active users.
  12. Update the costs incurred by the current process; select the process with the lowest priority and set it running; increase the priority of the current process in proportion to the user's usage, shares, and number of active processes; decay all process priorities, with slower decay for processes with non-zero nice values.
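Those steps can be sketched as follows (hypothetical field names and decay constants, not the paper's actual code):

```python
def share_tick(processes, users, k2=0.90, k2_nice=0.95):
    """One step of a Share-style process-level scheduler (sketch).
    processes: list of dicts with 'user', 'priority', 'nice'.
    users: dict mapping user -> {'usage', 'shares', 'active'}."""
    # select the process with the lowest (best) priority and "run" it
    current = min(processes, key=lambda p: p["priority"])
    # push it down the queue: proportional to the owner's usage and
    # active-process count, inversely proportional to shares squared
    u = users[current["user"]]
    current["priority"] += u["usage"] * u["active"] / (u["shares"] ** 2)
    # decay all priorities; non-zero nice decays more slowly,
    # so those processes linger lower in the queue (worse response)
    for p in processes:
        p["priority"] *= k2 if p["nice"] == 0 else k2_nice
    return current
```

The priority increment mirrors the usage × active-processes / shares² relationship described later in these notes: heavy users with few shares are pushed further down the queue each time they run.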
  13. The user-level scheduler is invoked every t1 seconds. The value of t1 defines the granularity of changes in a user’s usage as they use the machine. Since usage is generally very large compared to the resources consumed in a second, t1 can be of the order of a few seconds without compromising the fairness of the scheduler. The merit in making t1 reasonably large is that we can afford relatively costly computations at this level without prejudicing the time-efficiency of Share. Our Vax implementation makes t1 four seconds, which is 240 times the scheduler’s resolution. On the Cray, we have found that four seconds (400 times Share’s CPU charging resolution) is also acceptable. The first component of the user-level scheduler decays each user’s usage. This ensures that usages remain bounded, and the value of the constant K1 in combination with t1 defines the rate of decay. We generally consider the effect of K1 in terms of the half-life of usage. In a student environment, we have used a half-life of three days. In other contexts, it has been much shorter but generally of the order of several hours. At a conceptual level, this step is performed for all users. In fact, the effect of the calculation is computed as each user logs in, and so the actual calculation need only be performed for active users. The next part of the user-level scheduling involves updating the usage of active users by the charges they have incurred in the last t1 seconds and resetting the charges tally.
  15. The decay of process priorities ages processes so that those which have not had the CPU achieve better and better (smaller) priority values. The value of t2 combined with the value of K2 defines the rate at which processes age. We need to make t2 small, compared to t1, because priority values change very quickly. In our Vax implementation, t2 is set at 1 second, which is sixty times the resolution of the scheduler. (On the Cray, t2 is also 1 second.) The rate at which processes age is affected by their nice value. We note that Share preserves the approach of the Unix scheduler to nice: it assumes that users normally want the best response possible (which corresponds to a nice value of zero) but there are also times when a user is happy to accept lesser response, which they indicate in terms of a nice value which is a small integer. (Its range is from zero, the default, to nineteen, which gives the worst response.) We define the value of K2 as K2 = K2′ / (K3 + max_nice), where max_nice is the largest nice value (19). This ensures that the priority of processes with nice set to max_nice is decayed by K2′ every t2 seconds and the priority of processes with nice set to zero is decayed somewhat faster. The values of K2′ and K3 must be sufficiently large to ensure that priorities are well spread and remembered long enough to prevent large numbers of processes from having zero priority.
  16. This means that a process belonging to a user with high usage takes longer to drift back up to the front of the queue. (The priority needs longer to decay to the point that it is the lowest.) We also want users to be able to work at a rate proportional to their shares. This means that the charges they incur must be allowed to increase in proportion to the square of the shares (which gives a derivative, or rate of work done, proportional to the shares). The formula also takes account of the number of active processes (processes on the priority queue) for the user who owns the current process. This is necessary since a priority increment that involved just usage and shares would push a single process down the queue far enough to ensure that the user gets no more than their fair share. If the user has more than one active process, we need to penalise each of them to ensure that the user’s share is spread between them, and we do this by multiplying the priority increment by the active process count. This is the crux of the Share mechanism for making long-term usage, over all resources that attract charges, affect the user’s response and rate of work. Share pushes the current process down the queue by an amount proportional to the usage and number of active processes of the process’s owner, and inversely proportional to the square of that user’s shares, so that processes belonging to higher-usage (more active) users are pushed further down the queue than processes belonging to lower-usage (less active) users. (t3 is 1/6 on the Vax version and 1/100 on the Cray.)
  17. The current user is assigned the cost of the event and the process with the lowest priority is executed.
  18. If a user enters the system with zero usage they could effectively consume 100% of the resources, at least for one cycle of the user-level scheduler. We define the relative proportion of the machine due to a user by virtue of their allocation of shares.
  19. This shows a case in which there are two organizations, A and B, with an equal share of the resources, where organization A has one active user, A1, while organization B has two users: B1 with a large share doing nothing, and B2 with a small share running a CPU-bound process. The effective shares of the two active users, A1 and B2, differ by a factor of ten, and yet the scheduler must divide the resources fairly between the two groups, A and B.
  20. We can guarantee this goal by combining Share with a login-time check that lets users log in only if they can obtain a reasonable response; those with high usage get poor response, and if the machine is heavily loaded, the poor response may be intolerable for interactive tasks.
  21. By using different scheduling classes, the kernel can accommodate different scheduling algorithms based on the needs of the system and its processes