1
Scheduling in
Android
AnDevCon Washington D.C. 2017
Karim Yaghmour
@karimyaghmour
karim.yaghmour@opersys.com
2
These slides are made available to you under a Creative Commons Share-
Alike 3.0 license. The full terms of this license are here:
https://creativecommons.org/licenses/by-sa/3.0/
Attribution requirements and misc., PLEASE READ:
● This slide must remain as-is in this specific location (slide #2), everything
else you are free to change; including the logo :-)
● Use of figures in other documents must feature the below “Originals at”
URL immediately under that figure and the below copyright notice where
appropriate.
● You are free to fill in the “Delivered and/or customized by” space on the
right as you see fit.
● You are FORBIDEN from using the default “About” slide as-is or any of its
contents.
● You are FORBIDEN from using any content provided by 3rd parties without
the EXPLICIT consent from those parties.
(C) Copyright 2016-2017, Opersys inc.
These slides created by: Karim Yaghmour
Originals at: www.opersys.com/community/docs
Delivered and/or customized by
3
About
● Author of:
● Introduced Linux Trace Toolkit in 1999
● Originated Adeos and relayfs (kernel/relay.c)
● Training, Custom Dev, Consulting, ...
4
Agenda
1. Architecture
2. Linux scheduler history
3. Completely Fair Scheduler (CFS)
4. Sched Classes
5. CPU power management
6. Load tracking
7. Android problems w/ Linux scheduler
8. User-space vs. Linux scheduler
9. Framework
10. Summing up
5
1. Architecture
● Hardware on which Android is based
● Android stack
● Startup
● System services
● Binder driver
6
7
8
9
10
11
12
2. Linux scheduler history
● 1995 - Linux 1.2
● Circular queue of tasks w/ round-robin
● 1999 - Linux 2.2
● Scheduling classes: real-time, non-preemptable and non-real-time
● SMP support
● 2001 - Linux 2.4 O(N) scheduler
● Each task gets a slice (epoch)
● 2003 - Linux 2.6 O(1) scheduler
● Two queues: active vs. expired
● Heuristics-based / no “formal” algorithm
● 2007 - ~ Linux 2.6.21
● Con Kolivas' Rotating Staircase Deadline Scheduler (RSDL)
● 2007 - Linux 2.6.23
● Ingo Molnar's Completely Fair Scheduler (CFS)
13
3. Completely Fair Scheduler (CFS)
●
Tasks put in red-black tree
●
Self-balancing – i.e. no path more than twice other path
● O (log n) operations
●
task_struct->sched_entity->rb_node
●
kernel/core/sched.c:schedule()
●
put_prev_task()
●
pick_next_task()
●
Priorities provide decay factors
●
Higher priority = lower decay factor
● i.e. lower priority uses up their time more quickly
●
Group scheduling – control group mechanism (cgroups)
●
Grouped processes share a single process' virtual time
●
Use of cgroupfs – /acct in Android
14
4. Sched Classes
● See kernel/sched/sched.h
● struct sched_class
● Classes
● CFS - SCHED_OTHER (i.e. SCHED_NORMAL in the sources)
● RT - SCHED_FIFO / SCHED_RR
● Deadline - SCHED_DEADLINE(since 3.14)
● Idle - Not!!! SCHED_IDLE
● Stop – CPU stopping
● sched_setscheduler()
15
5. CPU power management
●
Need to integrate CPU power management and scheduler more closely
●
Existing Linux support for power management (governors)
●
cpuidle
– What happens when the CPU is idle
● cpufreq drivers
– HW counters- / FW-based decision (intel_pstate / longrun)
– Hard-coded values based on ranges within driver DVFS – “Dynamic Voltage and Frequency Scaling”
●
Trivial
– Performance - highest frequency
– Powersave - lowest frequency
– Userspace - User-space app makes decision
●
Non-trivial
– Based on system load
●
Stats for non-trivial governors come from scheduler stats (/proc/stat)
●
On-demand - Immediate jump on load increase
● Conservative - Gradual scale-up on increase
16
5.1. Making frequency scaling choices
● SCHED_DEADLINE
● We have precise info, because of how this works
● SCHED_FIFO / SCHED_RR
● Put CPU at max right away
● SCHED_NORMAL
● Use stats
17
6. Load tracking
● How to track how much “load” a process is putting on the
system?
● Not just CPU time consumed
● Process waiting for CPU is contributing to load
● Since 3.8
● PELT - Per-Entity Load Tracking
– Each load instance is y times the previous one:
● L0 + L1*y + L2*y2 + L3*y3 + ...
● New proposal (used in Pixel)
● WALT - Window-Assisted Load Tracking
18
7. Android problems w/ Linux scheduler
● Even “on-demand” isn't good enough
● It takes too many milliseconds for timer to tick
and stats to be updated
● Since decision is based on stats, delay is user-
noticeable
19
7.1. Android Initial Solution
● Android devs wrote their own governor:
● “interactive”
● Detects transition out of idle
● Shortens timeout to scale up to much shorter
time (“1-2 ticks”)
● Out of tree
20
7.2. Intermediate solutions discussed
● Trigger cpufreq when scheduler updates stats
● Instead of trigger using a timeout
● Introduce new governor: schedutil (merged in
4.7)
● Use the info straight from the scheduler stats
update
21
7.3. Recent Work
●
Problems:
● bigLITTLE
● EAS – Energy Aware Scheduling
●
SchedTune used in Pixel phone
● Provide user-space knobs to control schedutil governor
● Android doesn't need to make suppositions, it knows what's happening
●
Implemented as control-group controller
● Each cgroup has tunable “schedtune.boost”
●
sched-freq
● Use CFS runqueue info to scale CPU freq
●
WALT – Window-Assisted Load Tracking
22
8. User-space vs. Linux scheduler
● Control knobs
● /dev/cpuset/ - which tasks run on which CPUs
● /dev/cpuctl/ - restrict CPU time for bg tasks
● /dev/stune/ - EAS stune boosting
● init.rc
● Power HAL:
● Power “hints”
● Set interactive
23
9. Framework
● System Services:
●
Activity Manager
– Causes apps to be started through Zygote
– Feed lifecycle events
– Manages foreground vs. background, etc.
●
Scheduling Policy:
– Modifies process scheduler based on request
● Defined thread groups:
● Default (for internal use)
● BG non-interactive
● Foreground
● System
●
Audio App -- SCHED_FIFO
●
Audio Sys -- SCHED_FIFO
●
Top App
● See
●
frameworks/base/core/java/android/os/Process.java
●
frameworks/base/core/jni/android_util_Process.cpp
●
system/core/include/cutils/sched_policy.h
24
10. Summing Up
● Quite a few moving parts
● Linux does bulk of work
● Android gives hints to Linux
● Still ongoing work to get best performance with
least battery usage
25
Thank you ...
karim.yaghmour@opersys.com

Scheduling in Android

  • 1.
    1 Scheduling in Android AnDevCon WashingtonD.C. 2017 Karim Yaghmour @karimyaghmour karim.yaghmour@opersys.com
  • 2.
    2 These slides aremade available to you under a Creative Commons Share- Alike 3.0 license. The full terms of this license are here: https://creativecommons.org/licenses/by-sa/3.0/ Attribution requirements and misc., PLEASE READ: ● This slide must remain as-is in this specific location (slide #2), everything else you are free to change; including the logo :-) ● Use of figures in other documents must feature the below “Originals at” URL immediately under that figure and the below copyright notice where appropriate. ● You are free to fill in the “Delivered and/or customized by” space on the right as you see fit. ● You are FORBIDEN from using the default “About” slide as-is or any of its contents. ● You are FORBIDEN from using any content provided by 3rd parties without the EXPLICIT consent from those parties. (C) Copyright 2016-2017, Opersys inc. These slides created by: Karim Yaghmour Originals at: www.opersys.com/community/docs Delivered and/or customized by
  • 3.
    3 About ● Author of: ●Introduced Linux Trace Toolkit in 1999 ● Originated Adeos and relayfs (kernel/relay.c) ● Training, Custom Dev, Consulting, ...
  • 4.
    4 Agenda 1. Architecture 2. Linuxscheduler history 3. Completely Fair Scheduler (CFS) 4. Sched Classes 5. CPU power management 6. Load tracking 7. Android problems w/ Linux scheduler 8. User-space vs. Linux scheduler 9. Framework 10. Summing up
  • 5.
    5 1. Architecture ● Hardwareon which Android is based ● Android stack ● Startup ● System services ● Binder driver
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
    12 2. Linux schedulerhistory ● 1995 - Linux 1.2 ● Circular queue of tasks w/ round-robin ● 1999 - Linux 2.2 ● Scheduling classes: real-time, non-preemptable and non-real-time ● SMP support ● 2001 - Linux 2.4 O(N) scheduler ● Each task gets a slice (epoch) ● 2003 - Linux 2.6 O(1) scheduler ● Two queues: active vs. expired ● Heuristics-based / no “formal” algorithm ● 2007 - ~ Linux 2.6.21 ● Con Kolivas' Rotating Staircase Deadline Scheduler (RSDL) ● 2007 - Linux 2.6.23 ● Ingo Molnar's Completely Fair Scheduler (CFS)
  • 13.
    13 3. Completely FairScheduler (CFS) ● Tasks put in red-black tree ● Self-balancing – i.e. no path more than twice other path ● O (log n) operations ● task_struct->sched_entity->rb_node ● kernel/core/sched.c:schedule() ● put_prev_task() ● pick_next_task() ● Priorities provide decay factors ● Higher priority = lower decay factor ● i.e. lower priority uses up their time more quickly ● Group scheduling – control group mechanism (cgroups) ● Grouped processes share a single process' virtual time ● Use of cgroupfs – /acct in Android
  • 14.
    14 4. Sched Classes ●See kernel/sched/sched.h ● struct sched_class ● Classes ● CFS - SCHED_OTHER (i.e. SCHED_NORMAL in the sources) ● RT - SCHED_FIFO / SCHED_RR ● Deadline - SCHED_DEADLINE(since 3.14) ● Idle - Not!!! SCHED_IDLE ● Stop – CPU stopping ● sched_setscheduler()
  • 15.
    15 5. CPU powermanagement ● Need to integrate CPU power management and scheduler more closely ● Existing Linux support for power management (governors) ● cpuidle – What happens when the CPU is idle ● cpufreq drivers – HW counters- / FW-based decision (intel_pstate / longrun) – Hard-coded values based on ranges within driver DVFS – “Dynamic Voltage and Frequency Scaling” ● Trivial – Performance - highest frequency – Powersave - lowest frequency – Userspace - User-space app makes decision ● Non-trivial – Based on system load ● Stats for non-trivial governors come from scheduler stats (/proc/stat) ● On-demand - Immediate jump on load increase ● Conservative - Gradual scale-up on increase
  • 16.
    16 5.1. Making frequencyscaling choices ● SCHED_DEADLINE ● We have precise info, because of how this works ● SCHED_FIFO / SCHED_RR ● Put CPU at max right away ● SCHED_NORMAL ● Use stats
  • 17.
    17 6. Load tracking ●How to track how much “load” a process is putting on the system? ● Not just CPU time consumed ● Process waiting for CPU is contributing to load ● Since 3.8 ● PELT - Per-Entity Load Tracking – Each load instance is y times the previous one: ● L0 + L1*y + L2*y2 + L3*y3 + ... ● New proposal (used in Pixel) ● WALT - Window-Assisted Load Tracking
  • 18.
    18 7. Android problemsw/ Linux scheduler ● Even “on-demand” isn't good enough ● It takes too many milliseconds for timer to tick and stats to be updated ● Since decision is based on stats, delay is user- noticeable
  • 19.
    19 7.1. Android InitialSolution ● Android devs wrote their own governor: ● “interactive” ● Detects transition out of idle ● Shortens timeout to scale up to much shorter time (“1-2 ticks”) ● Out of tree
  • 20.
    20 7.2. Intermediate solutionsdiscussed ● Trigger cpufreq when scheduler updates stats ● Instead of trigger using a timeout ● Introduce new governor: schedutil (merged in 4.7) ● Use the info straight from the scheduler stats update
  • 21.
    21 7.3. Recent Work ● Problems: ●bigLITTLE ● EAS – Energy Aware Scheduling ● SchedTune used in Pixel phone ● Provide user-space knobs to control schedutil governor ● Android doesn't need to make suppositions, it knows what's happening ● Implemented as control-group controller ● Each cgroup has tunable “schedtune.boost” ● sched-freq ● Use CFS runqueue info to scale CPU freq ● WALT – Window-Assisted Load Tracking
  • 22.
    22 8. User-space vs.Linux scheduler ● Control knobs ● /dev/cpuset/ - which tasks run on which CPUs ● /dev/cpuctl/ - restrict CPU time for bg tasks ● /dev/stune/ - EAS stune boosting ● init.rc ● Power HAL: ● Power “hints” ● Set interactive
  • 23.
    23 9. Framework ● SystemServices: ● Activity Manager – Causes apps to be started through Zygote – Feed lifecycle events – Manages foreground vs. background, etc. ● Scheduling Policy: – Modifies process scheduler based on request ● Defined thread groups: ● Default (for internal use) ● BG non-interactive ● Foreground ● System ● Audio App -- SCHED_FIFO ● Audio Sys -- SCHED_FIFO ● Top App ● See ● frameworks/base/core/java/android/os/Process.java ● frameworks/base/core/jni/android_util_Process.cpp ● system/core/include/cutils/sched_policy.h
  • 24.
    24 10. Summing Up ●Quite a few moving parts ● Linux does bulk of work ● Android gives hints to Linux ● Still ongoing work to get best performance with least battery usage
  • 25.