SlideShare a Scribd company logo
1 of 62
Download to read offline
Presented by
Date
Event
sched-freq
integrating the scheduler
and cpufreq
Steve Muckle
BKK16-104 March 7, 2016
Linaro Connect BKK16
outline
- how things (don’t) work today
- schedfreq design
- latest test results
- an upstream surprise
- next steps
clarifications/quick questions here
proposals and debate in hacking session
Tuesday 8th March, 15:00-15:50
HACKING-2 (lobby lounge - 23rd floor)
outline
- how things (don’t) work today
- schedfreq design
- latest test results
- an upstream surprise
- next steps
how does cpufreq currently work?
- plugin architecture (governors)
- popular governors are sampling-based
- let's assume:
- fmin = 100MHz, fmax = 1000Mhz
- a policy which goes to util*fmax + 100Mhz
sampling = 80%
freq = 900 MHz
sampling = 30%
freq = 400 Mhz
sampling = 20%
freq = 300 MHz
sampling = 100%
freq = 1000 MHz
sampling = 100%
freq = 1000 MHz
sampling = 70%
freq = 800 Mhz
sampling = 10%
freq = 200 Mhz
sampling = 10%
freq = 200 Mhz
cpu busy time
cpufreq governor sampling
sampling = 80%
freq = 900 MHz
sampling = 30%
freq = 400 Mhz
sampling = 20%
freq = 300 MHz
sampling = 100%
freq = 1000 MHz
sampling = 100%
freq = 1000 MHz
sampling = 70%
freq = 800 Mhz
sampling = 10%
freq = 200 Mhz
sampling = 10%
freq = 200 Mhz
cpu busy time
oops
cpufreq governor sampling
sampling = 0%
freq = 100 MHz
cpu busy time
sampling = 60%
freq = 700 MHz
sampling = 100%
freq = 1000 MHz
sampling = 100%
freq = 1000 MHz
sampling = 100%
freq = 1000 MHz
sampling = 100%
freq = 1000 MHz
sampling = 100%
freq = 1000 MHz
sampling = 100%
freq = 1000 MHz
more problems: new tasks
oops
sampling = 0%
freq = 100 MHz
cpu busy time
sampling = 60%
freq = 700 MHz
sampling = 100%
freq = 1000 MHz
sampling = 100%
freq = 1000 MHz
sampling = 100%
freq = 1000 MHz
sampling = 100%
freq = 1000 MHz
sampling = 100%
freq = 1000 MHz
sampling = 100%
freq = 1000 MHz
more problems: new tasks
sampling = 100%
freq = 1000 MHz
cpu busy time
sampling = 100%
freq = 1000 MHz
sampling = 100%
freq = 1000 MHz
sampling = 100%
freq = 1000 MHz
sampling = 0%
freq = 100 MHz
sampling = 100%
freq = 1000 MHz
sampling = 90%
freq = 1000 MHz
sampling = 0%
freq = 100 MHz
more problems: exiting tasks
oops
sampling = 100%
freq = 1000 MHz
cpu busy time
sampling = 100%
freq = 1000 MHz
sampling = 100%
freq = 1000 MHz
sampling = 100%
freq = 1000 MHz
sampling = 0%
freq = 100 MHz
sampling = 100%
freq = 1000 MHz
sampling = 90%
freq = 1000 MHz
sampling = 0%
freq = 100 MHz
more problems: exiting tasks
cpu busy time
more problems: task migration
oops
cpu busy time
oops
more problems: task migration
cpu busy time
more problems: task migration
oops
cpu busy time
oops
more problems: task migration
more problems: tuning
- ondemand has 7 knobs
- interactive has 11 knobs
- ...and more get added by OEMs
- along with hacks
the line in the sand
Ingo Molnar, May 31 2013
Note that I still disagree with the whole design notion of
having an "idle back-end" (and a 'cpufreq back end')
separate from scheduler power saving policy…
This is a "line in the sand", a 'must have' design property
for any scheduler power saving patches to be acceptable…
https://lwn.net/Articles/552889/
outline
- how things (don’t) work today
- schedfreq design
- latest test results
- an upstream surprise
- next steps
CFS
RT
DL
schedfreq drivercpufreq
CPU N
schedfreq design
- schedfreq is a cpufreq governor
- can cpufreq be removed?
estimating CFS capacity
- per-entity load tracking (PELT)
- introduced in 3.8
- exponential moving average
- sum = is_running() + sum*y
- frequency invariance required
- partially merged (core bits in, ARM support not)
- microarch invariance required
- partially merged (core bits in, ARM support not)
estimating CFS capacity - PELT
estimating CFS capacity - PELT
- initial task load
- was 0 when task is fork-balanced to different CPU
due to bug
- fix is on mainline/4.5
- http://thread.gmane.org/gmane.linux.
kernel/2106780/
- blocked load is included in util_avg
estimating DL capacity
- runtime utilization tracking not strictly required
- DL tasks have runtime, deadline, and period parameters
- this describes the task’s bandwidth reservation
- util = runtime/period
- track DL bandwidth admitted into the system
estimating DL capacity
- util = runtime/period has drawbacks
- it’s worst case
- it’s always there
- better solution - track active utilization
- related to bandwidth reclaiming
- both solutions under discussion
estimating RT capacity
- task priority but no constraints
- monitor RT utilization
- use rt_avg
- no way to react to short latency constraints
- focus on long term constraints and soft real time
- may not be optimal but it already exists
- be sure to budget capacity for RT
- do not steal from other class’s cap requests
aggregation of sched classes
- sched class capacities are summed
- headroom added to CFS and RT
- (CFS + RT) * 1.25
- no headroom for DL tasks
- DL tasks have precise capacity parameters
- total capacity converted to frequency
- scale using policy->max
cpu0 cpu1 cpu2 cpu3
1.6 GHz
1.3 GHz
1.0 GHz
300 MHz
aggregation of CPUs
- CPU with max request in freq domain drives
frequency
setting the frequency
- tricky to do from hot scheduler paths
- locking
- performance implications
- varying CPU frequency driver behavior
- does driver sleep?
- is driver slow?
setting the frequency - fast path
- target freq can always be calculated
- if…
- driver isn’t slow or sleeps AND
- schedfreq isn’t throttled AND
- a freq transition isn’t underway AND
- the slow path isn’t active
then we can set freq in the fast path
setting the frequency - slow path
- kthread spawned by schedfreq
- safe to sleep
- safe to do more work
setting the frequency - slow path
- kthread spawned by schedfreq
- safe to sleep
- safe to do more work
but
setting the frequency - slow path
- kthread spawned by schedfreq
- safe to sleep
- safe to do more work
but
- task wake overhead ($$$)
locking
- lots of cleanup going on in cpufreq
- ongoing work from several people
- no blocking locking issues seen
...for now
locking
- sched hooks hold rq lock
- protect per-CPU data
- avoid accessing policy->rwsem
- freq_table
- min/max
- transition_ongoing
- not required to initiate freq transitions
locking
- schedfreq has 3 internal locks
- gov_enable_lock (mutex)
- GOV_START/STOP for static key control
- fastpath_lock (spinlock)
- solve race to re-evaluate frequency domain
- slowpath_lock (mutex)
- solve race between slow path, fast path,
GOV_START/STOP
scheduler hooks
- enqueue_task_fair, dequeue_task_fair*
- set CFS capacity
- CFS load balance paths
- set CFS capacity at src, dest
- scheduler_tick()
- jump to fmax if headroom is impacted
- pick_next_task_rt, task_tick_rt
- set RT capacity
scheduler hooks - todo
- DL
- migration paths in kernel/sched/core.c
- changing task affinity
- hotplug
- balance on exec()
- NUMA balancing
policy summary
- re-evaluate and set freq when tasks
- wake
- block (except when CPU goes idle)
- migrate
- event driven - too many events?
- go to fmax at tick if headroom is impacted
policy summary
- when CPU goes idle, clear vote but don’t re-
evaluate/set domain freq
- don’t initiate more work when going idle
- right thing to do?
policy summary
- PELT is very important, will need work
- Patrick Bellasi’s util_est
- buffer utilization value to yield more stable estimate
- Vincent Guittot’s invariance improvements
- same amount of work over same period at different
freqs => different utilizations
- tuning via schedtune
outline
- how things (don’t) work today
- schedfreq design
- latest test results
- an upstream surprise
- next steps
rt-app cpufreq test case
- simple periodic workload
- each test case uses a different duty cycle
- 16 different test cases/duty cycles
- either 10, 100 or 1000 loops
- varies from ~1% to ~43% busy
rt-app cpufreq test case
test case # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
busy (ms) 1 10 1 10 100 6 66 4 40 400 5 50 500 9 90 900
idle (ms) 100 1000 10 100 1000 33 333 10 100 1000 9 90 900 12 120 1200
loops 100 10 1000 100 10 300 30 1000 100 10 1000 100 10 1000 100 10
duration (s) 10.1 10.1 11 11 11 11.7 11.97 14 14 14 14 14 14 21 21 21
busy% 0.99% 0.99% 9.09% 9.09% 9.09% 15.38% 16.54% 28.57% 28.57% 28.57% 35.71% 35.71% 35.71% 42.86% 42.86% 42.86%
rt-app cpufreq test case
- for each loop, record
- time to execute busy work
- whether busy work overran period
- for each test case, report
- average time to complete busy work
- number of overruns
rt-app cpufreq test case
- define overhead as ...
(avg_time_test_gov - avg_time_perf_gov) /
(avg_time_pwrsv_gov - avg_time_perf_gov)
- 0% = completes as fast as perf gov
- 100% = completes as fast as powersave gov
Samsung Chromebook 2
- “Peach Pi”
- Exynos 5800
- CPUs 0-3: 200 MHz - 2000 MHz A15
- CPUs 4-7: 200 MHz - 1300 MHz A7
- A15 fmax 1800 MHz with most recent clock
support
- no power numbers yet
Samsung Chromebook 2
SCHED_OTHER (CFS) Perf ondemand interactive sched
run (ms) idle (ms) loops OR OH OR OH OR OH
1 100 100 0 62.07% 0 100.02% 0 78.49%
10 1000 10 0 21.80% 0 22.74% 0 72.56%
1 10 1000 0 21.72% 0 63.08% 0 52.40%
10 100 100 0 8.09% 0 15.53% 0 17.33%
100 1000 10 0 1.83% 0 1.77% 0 0.29%
6 33 300 0 15.32% 0 8.60% 0 17.34%
66 333 30 0 0.79% 0 3.18% 0 12.26%
4 10 1000 0 5.87% 0 10.21% 0 6.15%
40 100 100 0 0.41% 0 0.04% 0 2.68%
400 1000 10 0 0.42% 0 0.50% 0 1.22%
5 9 1000 2 3.82% 1 6.10% 0 2.51%
50 90 100 0 0.19% 0 0.05% 0 1.71%
500 900 10 0 0.37% 0 0.38% 0 1.82%
9 12 1000 6 1.79% 1 0.77% 0 0.26%
90 120 100 0 0.16% 1 0.05% 0 0.49%
900 1200 10 0 0.09% 0 0.26% 0 0.62%
Looks mostly good…
Samsung Chromebook 2
SCHED_FIFO (RT) Perf ondemand interactive sched
run (ms) idle (ms) loops OR OH OR OH OR OH
1 100 100 0 39.61% 0 100.49% 0 99.57%
10 1000 10 0 73.51% 0 21.09% 0 96.66%
1 10 1000 0 18.01% 0 61.46% 0 67.68%
10 100 100 0 31.31% 0 18.62% 0 77.01%
100 1000 10 0 58.80% 0 1.90% 0 15.40%
6 33 300 251 85.99% 0 9.20% 1 30.09%
66 333 30 24 84.03% 0 3.38% 0 33.23%
4 10 1000 0 6.23% 0 12.21% 10 11.54%
40 100 100 100 62.08% 0 0.11% 1 11.85%
400 1000 10 10 62.09% 0 0.51% 0 7.00%
5 9 1000 999 12.29% 1 6.03% 0 0.04%
50 90 100 99 61.47% 0 0.05% 2 6.53%
500 900 10 10 43.37% 0 0.39% 0 6.30%
9 12 1000 999 9.83% 0 0.01% 14 1.69%
90 120 100 99 61.47% 0 0.01% 28 2.29%
900 1200 10 10 43.31% 0 0.22% 0 2.15%
RTavg mechanism not reacting fast enough.
sched_time_avg_ms = 50
MediaTek 8173 EVB
- CPUs 0-1: 507 MHz - 1573 MHz A53
- CPUs 2-3: 507 MHz - 1989 MHz A72
- power measured via onboard TI INA219s
- thanks to Freedom Tan for this data
MediaTek 8173 EVB
SCHED_OTHER (CFS) Perf ondemand interactive sched
run (ms) idle (ms) loops OR OH OR OH OR OH
1 100 100 0 98.04% 0 100.41% 0 98.04%
10 1000 10 0 34.00% 0 67.68% 0 99.95%
1 10 1000 0 56.32% 0 101.11% 0 100.69%
10 100 100 0 18.31% 0 31.57% 0 100.02%
100 1000 10 0 2.77% 0 6.79% 0 7.72%
6 33 300 0 41.29% 0 100.27% 0 100.28%
66 333 30 0 1.27% 0 10.38% 0 39.31%
4 10 1000 0 21.45% 2 18.90% 6 65.66%
40 100 100 0 1.35% 0 8.16% 0 24.30%
400 1000 10 0 1.02% 0 1.74% 0 6.55%
5 9 1000 0 13.43% 2 14.14% 5 52.51%
50 90 100 0 1.31% 0 1.39% 1 12.62%
500 900 10 0 0.54% 0 1.32% 0 5.21%
9 12 1000 1 7.19% 1 8.59% 3 27.47%
90 120 100 0 0.88% 0 0.75% 1 3.80%
900 1200 10 0 0.16% 0 0.79% 0 2.83%
Trouble with most workloads with run < 100ms.
MediaTek 8173 EVB
SCHED_OTHER (CFS) Power/Perf power delta perf (from last slide)
run (ms) idle (ms) loops delta w/ondemand delta w/interactive sched overhead
1 100 100 -7.56% 0.01% 98.04%
10 1000 10 -0.41% 1.21% 99.95%
1 10 1000 12.97% 3.92% 100.69%
10 100 100 -3.19% -4.90% 100.02%
100 1000 10 -7.00% 1.37% 7.72%
6 33 300 -11.84% -9.95% 100.28%
66 333 30 -8.15% -2.09% 39.31%
4 10 1000 -0.93% -8.59% 65.66%
40 100 100 -3.18% -9.88% 24.30%
400 1000 10 -2.99% 0.07% 6.55%
5 9 1000 1.67% -9.33% 52.51%
50 90 100 -5.97% -10.89% 12.62%
500 900 10 -2.29% 1.73% 5.21%
9 12 1000 -5.90% -9.75% 27.47%
90 120 100 -6.88% -5.12% 3.80%
900 1200 10 5.23% 6.57% 2.83%
avg delta w/interactive: -3.48%
avg delta w/ondemand: -2.9%
Power savings seen, but not meaningful with
observed perf losses.
outline
- how things (don’t) work today
- schedfreq design
- latest test results
- an upstream surprise
- next steps
an upstream surprise
- scheduler - cpufreq hooks
posted by Rafael Wysocki
- Jan 29th 2016
- now in linux-next
- sched utilization driven gov
also posted by Rafael
- Feb 21st 2016
important differences
- ondemand-like freq algorithm
- possible to get stuck due to freq invariance
- weird semantics w.r.t. headroom
- no aggregation of sched class capacities
- currently goes to fmax for RT, DL
- uses workqueue rather than kthread
what’s it mean?
- more engagement from upstream
- what’s the value of schedfreq?
words of encouragement
“What I'd like to see from a scheduler metrics
usage POV is a single central place,
kernel/sched/cpufreq.c, where all the high level
('governor') decisions are made.
This is the approach Steve's series takes.”
Ingo Molnar, 03/03/2016
Note: Mike Turquette and Juri Lelli conceived and
authored much of the schedfreq series.
outline
- how things (don’t) work today
- schedfreq design
- latest test results
- an upstream surprise
- next steps
next steps
- address shortcomings in schedutil
- freq algorithm
- scheduler hooks
- better RT, DL response
- more in-depth testing and analysis
- experiments with real-world usecases
- Android UI, games, benchmarks, etc.
- merging with EAS
- integration and testing with schedtune
- window-based load tracking
the end
backup
ondemand
- sample every sampling_rate usec
- cpu usage = busy% of last sample_rate usec
- if busy% > up_threshold, go to fmax
- otherwise scale with load
- freq_next = fmin + busy% * (fmax - fmin)
- stay at fmax longer with sampling_down_factor
interactive
- sample every timer_rate usec
- cpu usage = busy% of last timer_rate usec
- if busy% > go_hispeed_load go to hispeed_freq
- otherwise scale CPU according to target_loads
- 85 1000000:90 1700000:95
- XXX put a note in here to explain this
- can prevent slowdown with min_sample_time
- delay speedups past hispeed_freq with
above_hispeed_delay

More Related Content

What's hot

SREcon 2016 Performance Checklists for SREs
SREcon 2016 Performance Checklists for SREsSREcon 2016 Performance Checklists for SREs
SREcon 2016 Performance Checklists for SREsBrendan Gregg
 
Broken Linux Performance Tools 2016
Broken Linux Performance Tools 2016Broken Linux Performance Tools 2016
Broken Linux Performance Tools 2016Brendan Gregg
 
Netflix SRE perf meetup_slides
Netflix SRE perf meetup_slidesNetflix SRE perf meetup_slides
Netflix SRE perf meetup_slidesEd Hunter
 
Velocity 2015 linux perf tools
Velocity 2015 linux perf toolsVelocity 2015 linux perf tools
Velocity 2015 linux perf toolsBrendan Gregg
 
Linux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF SuperpowersLinux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF SuperpowersBrendan Gregg
 
USENIX ATC 2017: Visualizing Performance with Flame Graphs
USENIX ATC 2017: Visualizing Performance with Flame GraphsUSENIX ATC 2017: Visualizing Performance with Flame Graphs
USENIX ATC 2017: Visualizing Performance with Flame GraphsBrendan Gregg
 
Troubleshooting common oslo.messaging and RabbitMQ issues
Troubleshooting common oslo.messaging and RabbitMQ issuesTroubleshooting common oslo.messaging and RabbitMQ issues
Troubleshooting common oslo.messaging and RabbitMQ issuesMichael Klishin
 
New Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using TracingNew Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using TracingScyllaDB
 
QCon 2015 Broken Performance Tools
QCon 2015 Broken Performance ToolsQCon 2015 Broken Performance Tools
QCon 2015 Broken Performance ToolsBrendan Gregg
 
Performance Lessons learned in vRouter - Stephen Hemminger
Performance Lessons learned in vRouter - Stephen HemmingerPerformance Lessons learned in vRouter - Stephen Hemminger
Performance Lessons learned in vRouter - Stephen Hemmingerharryvanhaaren
 
Linux Performance Profiling and Monitoring
Linux Performance Profiling and MonitoringLinux Performance Profiling and Monitoring
Linux Performance Profiling and MonitoringGeorg Schönberger
 
Stop the Guessing: Performance Methodologies for Production Systems
Stop the Guessing: Performance Methodologies for Production SystemsStop the Guessing: Performance Methodologies for Production Systems
Stop the Guessing: Performance Methodologies for Production SystemsBrendan Gregg
 
Q2.12: Scheduler Inputs
Q2.12: Scheduler InputsQ2.12: Scheduler Inputs
Q2.12: Scheduler InputsLinaro
 
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPFOSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPFBrendan Gregg
 
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtKernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtAnne Nicolas
 
Linux Profiling at Netflix
Linux Profiling at NetflixLinux Profiling at Netflix
Linux Profiling at NetflixBrendan Gregg
 
Netflix: From Clouds to Roots
Netflix: From Clouds to RootsNetflix: From Clouds to Roots
Netflix: From Clouds to RootsBrendan Gregg
 
Kernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at NetflixKernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at NetflixBrendan Gregg
 
Performance: Observe and Tune
Performance: Observe and TunePerformance: Observe and Tune
Performance: Observe and TunePaul V. Novarese
 

What's hot (20)

SREcon 2016 Performance Checklists for SREs
SREcon 2016 Performance Checklists for SREsSREcon 2016 Performance Checklists for SREs
SREcon 2016 Performance Checklists for SREs
 
Broken Linux Performance Tools 2016
Broken Linux Performance Tools 2016Broken Linux Performance Tools 2016
Broken Linux Performance Tools 2016
 
Netflix SRE perf meetup_slides
Netflix SRE perf meetup_slidesNetflix SRE perf meetup_slides
Netflix SRE perf meetup_slides
 
Velocity 2015 linux perf tools
Velocity 2015 linux perf toolsVelocity 2015 linux perf tools
Velocity 2015 linux perf tools
 
Linux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF SuperpowersLinux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF Superpowers
 
USENIX ATC 2017: Visualizing Performance with Flame Graphs
USENIX ATC 2017: Visualizing Performance with Flame GraphsUSENIX ATC 2017: Visualizing Performance with Flame Graphs
USENIX ATC 2017: Visualizing Performance with Flame Graphs
 
Troubleshooting common oslo.messaging and RabbitMQ issues
Troubleshooting common oslo.messaging and RabbitMQ issuesTroubleshooting common oslo.messaging and RabbitMQ issues
Troubleshooting common oslo.messaging and RabbitMQ issues
 
New Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using TracingNew Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using Tracing
 
QCon 2015 Broken Performance Tools
QCon 2015 Broken Performance ToolsQCon 2015 Broken Performance Tools
QCon 2015 Broken Performance Tools
 
Performance Lessons learned in vRouter - Stephen Hemminger
Performance Lessons learned in vRouter - Stephen HemmingerPerformance Lessons learned in vRouter - Stephen Hemminger
Performance Lessons learned in vRouter - Stephen Hemminger
 
Linux Performance Profiling and Monitoring
Linux Performance Profiling and MonitoringLinux Performance Profiling and Monitoring
Linux Performance Profiling and Monitoring
 
Stop the Guessing: Performance Methodologies for Production Systems
Stop the Guessing: Performance Methodologies for Production SystemsStop the Guessing: Performance Methodologies for Production Systems
Stop the Guessing: Performance Methodologies for Production Systems
 
Q2.12: Scheduler Inputs
Q2.12: Scheduler InputsQ2.12: Scheduler Inputs
Q2.12: Scheduler Inputs
 
Introduction to Perf
Introduction to PerfIntroduction to Perf
Introduction to Perf
 
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPFOSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
 
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtKernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
 
Linux Profiling at Netflix
Linux Profiling at NetflixLinux Profiling at Netflix
Linux Profiling at Netflix
 
Netflix: From Clouds to Roots
Netflix: From Clouds to RootsNetflix: From Clouds to Roots
Netflix: From Clouds to Roots
 
Kernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at NetflixKernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at Netflix
 
Performance: Observe and Tune
Performance: Observe and TunePerformance: Observe and Tune
Performance: Observe and Tune
 

Viewers also liked

BKK16-107 Budget Fair Queueing heuristics in the block layer
BKK16-107 Budget Fair Queueing heuristics in the block layerBKK16-107 Budget Fair Queueing heuristics in the block layer
BKK16-107 Budget Fair Queueing heuristics in the block layerLinaro
 
BKK16-102 Creating new workload for Workload Automation & using WA with LAVA
BKK16-102 Creating new workload for Workload Automation & using WA with LAVABKK16-102 Creating new workload for Workload Automation & using WA with LAVA
BKK16-102 Creating new workload for Workload Automation & using WA with LAVALinaro
 
Q2.12: Idling ARMs in a busy world: Linux Power Management for ARM Multiclust...
Q2.12: Idling ARMs in a busy world: Linux Power Management for ARM Multiclust...Q2.12: Idling ARMs in a busy world: Linux Power Management for ARM Multiclust...
Q2.12: Idling ARMs in a busy world: Linux Power Management for ARM Multiclust...Linaro
 
Reverse engineering for_beginners-en
Reverse engineering for_beginners-enReverse engineering for_beginners-en
Reverse engineering for_beginners-enAndri Yabu
 
Kernel Recipes 2016 - Kernel documentation: what we have and where it’s going
Kernel Recipes 2016 - Kernel documentation: what we have and where it’s goingKernel Recipes 2016 - Kernel documentation: what we have and where it’s going
Kernel Recipes 2016 - Kernel documentation: what we have and where it’s goingAnne Nicolas
 
Specification-Based Test Program Generation for ARM VMSAv8-64 MMUs
Specification-Based Test Program Generation for ARM VMSAv8-64 MMUsSpecification-Based Test Program Generation for ARM VMSAv8-64 MMUs
Specification-Based Test Program Generation for ARM VMSAv8-64 MMUsAlexander Kamkin
 
BKK16-404A PCI Development Meeting
BKK16-404A PCI Development MeetingBKK16-404A PCI Development Meeting
BKK16-404A PCI Development MeetingLinaro
 
Tackling the Management Challenges of Server Consolidation on Multi-core Systems
Tackling the Management Challenges of Server Consolidation on Multi-core SystemsTackling the Management Challenges of Server Consolidation on Multi-core Systems
Tackling the Management Challenges of Server Consolidation on Multi-core SystemsThe Linux Foundation
 
Virtualization overheads
Virtualization overheadsVirtualization overheads
Virtualization overheadsSandeep Joshi
 
Docker and friends at Linux Days 2014 in Prague
Docker and friends at Linux Days 2014 in PragueDocker and friends at Linux Days 2014 in Prague
Docker and friends at Linux Days 2014 in Praguetomasbart
 
Linux numa evolution
Linux numa evolutionLinux numa evolution
Linux numa evolutionLukas Pirl
 
Cgroup resource mgmt_v1
Cgroup resource mgmt_v1Cgroup resource mgmt_v1
Cgroup resource mgmt_v1sprdd
 
Gc and-pagescan-attacks-by-linux
Gc and-pagescan-attacks-by-linuxGc and-pagescan-attacks-by-linux
Gc and-pagescan-attacks-by-linuxCuong Tran
 
Known basic of NFV Features
Known basic of NFV FeaturesKnown basic of NFV Features
Known basic of NFV FeaturesRaul Leite
 
Non-Uniform Memory Access ( NUMA)
Non-Uniform Memory Access ( NUMA)Non-Uniform Memory Access ( NUMA)
Non-Uniform Memory Access ( NUMA)Nakul Manchanda
 
LAS16-TR02: Upstreaming 101
LAS16-TR02: Upstreaming 101LAS16-TR02: Upstreaming 101
LAS16-TR02: Upstreaming 101Linaro
 
SFO15-TR9: PSCI, ACPI (and UEFI to boot)
SFO15-TR9: PSCI, ACPI (and UEFI to boot)SFO15-TR9: PSCI, ACPI (and UEFI to boot)
SFO15-TR9: PSCI, ACPI (and UEFI to boot)Linaro
 
Linux NUMA & Databases: Perils and Opportunities
Linux NUMA & Databases: Perils and OpportunitiesLinux NUMA & Databases: Perils and Opportunities
Linux NUMA & Databases: Perils and OpportunitiesRaghavendra Prabhu
 
P4, EPBF, and Linux TC Offload
P4, EPBF, and Linux TC OffloadP4, EPBF, and Linux TC Offload
P4, EPBF, and Linux TC OffloadOpen-NFP
 

Viewers also liked (20)

BKK16-107 Budget Fair Queueing heuristics in the block layer
BKK16-107 Budget Fair Queueing heuristics in the block layerBKK16-107 Budget Fair Queueing heuristics in the block layer
BKK16-107 Budget Fair Queueing heuristics in the block layer
 
BKK16-102 Creating new workload for Workload Automation & using WA with LAVA
BKK16-102 Creating new workload for Workload Automation & using WA with LAVABKK16-102 Creating new workload for Workload Automation & using WA with LAVA
BKK16-102 Creating new workload for Workload Automation & using WA with LAVA
 
Q2.12: Idling ARMs in a busy world: Linux Power Management for ARM Multiclust...
Q2.12: Idling ARMs in a busy world: Linux Power Management for ARM Multiclust...Q2.12: Idling ARMs in a busy world: Linux Power Management for ARM Multiclust...
Q2.12: Idling ARMs in a busy world: Linux Power Management for ARM Multiclust...
 
Dulloor xen-summit
Dulloor xen-summitDulloor xen-summit
Dulloor xen-summit
 
Reverse engineering for_beginners-en
Reverse engineering for_beginners-enReverse engineering for_beginners-en
Reverse engineering for_beginners-en
 
Kernel Recipes 2016 - Kernel documentation: what we have and where it’s going
Kernel Recipes 2016 - Kernel documentation: what we have and where it’s goingKernel Recipes 2016 - Kernel documentation: what we have and where it’s going
Kernel Recipes 2016 - Kernel documentation: what we have and where it’s going
 
Specification-Based Test Program Generation for ARM VMSAv8-64 MMUs
Specification-Based Test Program Generation for ARM VMSAv8-64 MMUsSpecification-Based Test Program Generation for ARM VMSAv8-64 MMUs
Specification-Based Test Program Generation for ARM VMSAv8-64 MMUs
 
BKK16-404A PCI Development Meeting
BKK16-404A PCI Development MeetingBKK16-404A PCI Development Meeting
BKK16-404A PCI Development Meeting
 
Tackling the Management Challenges of Server Consolidation on Multi-core Systems
Tackling the Management Challenges of Server Consolidation on Multi-core SystemsTackling the Management Challenges of Server Consolidation on Multi-core Systems
Tackling the Management Challenges of Server Consolidation on Multi-core Systems
 
Virtualization overheads
Virtualization overheadsVirtualization overheads
Virtualization overheads
 
Docker and friends at Linux Days 2014 in Prague
Docker and friends at Linux Days 2014 in PragueDocker and friends at Linux Days 2014 in Prague
Docker and friends at Linux Days 2014 in Prague
 
Linux numa evolution
Linux numa evolutionLinux numa evolution
Linux numa evolution
 
Cgroup resource mgmt_v1
Cgroup resource mgmt_v1Cgroup resource mgmt_v1
Cgroup resource mgmt_v1
 
Gc and-pagescan-attacks-by-linux
Gc and-pagescan-attacks-by-linuxGc and-pagescan-attacks-by-linux
Gc and-pagescan-attacks-by-linux
 
Known basic of NFV Features
Known basic of NFV FeaturesKnown basic of NFV Features
Known basic of NFV Features
 
Non-Uniform Memory Access ( NUMA)
Non-Uniform Memory Access ( NUMA)Non-Uniform Memory Access ( NUMA)
Non-Uniform Memory Access ( NUMA)
 
LAS16-TR02: Upstreaming 101
LAS16-TR02: Upstreaming 101LAS16-TR02: Upstreaming 101
LAS16-TR02: Upstreaming 101
 
SFO15-TR9: PSCI, ACPI (and UEFI to boot)
SFO15-TR9: PSCI, ACPI (and UEFI to boot)SFO15-TR9: PSCI, ACPI (and UEFI to boot)
SFO15-TR9: PSCI, ACPI (and UEFI to boot)
 
Linux NUMA & Databases: Perils and Opportunities
Linux NUMA & Databases: Perils and OpportunitiesLinux NUMA & Databases: Perils and Opportunities
Linux NUMA & Databases: Perils and Opportunities
 
P4, EPBF, and Linux TC Offload
P4, EPBF, and Linux TC OffloadP4, EPBF, and Linux TC Offload
P4, EPBF, and Linux TC Offload
 

Similar to BKK16-104 sched-freq

Debugging Ruby
Debugging RubyDebugging Ruby
Debugging RubyAman Gupta
 
Performance tweaks and tools for Linux (Joe Damato)
Performance tweaks and tools for Linux (Joe Damato)Performance tweaks and tools for Linux (Joe Damato)
Performance tweaks and tools for Linux (Joe Damato)Ontico
 
Galvin-operating System(Ch6)
Galvin-operating System(Ch6)Galvin-operating System(Ch6)
Galvin-operating System(Ch6)dsuyal1
 
Debugging Ruby Systems
Debugging Ruby SystemsDebugging Ruby Systems
Debugging Ruby SystemsEngine Yard
 
Puppet Availability and Performance at 100K Nodes - PuppetConf 2014
Puppet Availability and Performance at 100K Nodes - PuppetConf 2014Puppet Availability and Performance at 100K Nodes - PuppetConf 2014
Puppet Availability and Performance at 100K Nodes - PuppetConf 2014Puppet
 
Cpu scheduling (1)
Cpu scheduling (1)Cpu scheduling (1)
Cpu scheduling (1)Ayush Oberai
 
WALT vs PELT : Redux - SFO17-307
WALT vs PELT : Redux  - SFO17-307WALT vs PELT : Redux  - SFO17-307
WALT vs PELT : Redux - SFO17-307Linaro
 
Deep review of LMS process
Deep review of LMS processDeep review of LMS process
Deep review of LMS processRiyaj Shamsudeen
 
20150918 klug el performance tuning-v1.4
20150918 klug el performance tuning-v1.420150918 klug el performance tuning-v1.4
20150918 klug el performance tuning-v1.4Jinkoo Han
 
Docker tips-for-java-developers
Docker tips-for-java-developersDocker tips-for-java-developers
Docker tips-for-java-developersAparna Chaudhary
 
Guide to alfresco monitoring
Guide to alfresco monitoringGuide to alfresco monitoring
Guide to alfresco monitoringMiguel Rodriguez
 
YOW2020 Linux Systems Performance
YOW2020 Linux Systems PerformanceYOW2020 Linux Systems Performance
YOW2020 Linux Systems PerformanceBrendan Gregg
 
Joker 2015 - Валеев Тагир - Что же мы измеряем?
Joker 2015 - Валеев Тагир - Что же мы измеряем?Joker 2015 - Валеев Тагир - Что же мы измеряем?
Joker 2015 - Валеев Тагир - Что же мы измеряем?tvaleev
 
Dynamic Shift Frequency Scaling Of ATPG Patterns
Dynamic Shift Frequency Scaling Of ATPG PatternsDynamic Shift Frequency Scaling Of ATPG Patterns
Dynamic Shift Frequency Scaling Of ATPG PatternsAditya Ramachandran
 
cloud computing chapter one in computer science
cloud computing chapter one in computer sciencecloud computing chapter one in computer science
cloud computing chapter one in computer scienceTSha7
 
Ovs perf
Ovs perfOvs perf
Ovs perfMadhu c
 

Similar to BKK16-104 sched-freq (20)

Debugging Ruby
Debugging RubyDebugging Ruby
Debugging Ruby
 
Performance tweaks and tools for Linux (Joe Damato)
Performance tweaks and tools for Linux (Joe Damato)Performance tweaks and tools for Linux (Joe Damato)
Performance tweaks and tools for Linux (Joe Damato)
 
Galvin-operating System(Ch6)
Galvin-operating System(Ch6)Galvin-operating System(Ch6)
Galvin-operating System(Ch6)
 
Debugging Ruby Systems
Debugging Ruby SystemsDebugging Ruby Systems
Debugging Ruby Systems
 
Puppet Availability and Performance at 100K Nodes - PuppetConf 2014
Puppet Availability and Performance at 100K Nodes - PuppetConf 2014Puppet Availability and Performance at 100K Nodes - PuppetConf 2014
Puppet Availability and Performance at 100K Nodes - PuppetConf 2014
 
Sge
SgeSge
Sge
 
CPU Scheduling
CPU SchedulingCPU Scheduling
CPU Scheduling
 
Cpu scheduling (1)
Cpu scheduling (1)Cpu scheduling (1)
Cpu scheduling (1)
 
WALT vs PELT : Redux - SFO17-307
WALT vs PELT : Redux  - SFO17-307WALT vs PELT : Redux  - SFO17-307
WALT vs PELT : Redux - SFO17-307
 
Deep review of LMS process
Deep review of LMS processDeep review of LMS process
Deep review of LMS process
 
20150918 klug el performance tuning-v1.4
20150918 klug el performance tuning-v1.420150918 klug el performance tuning-v1.4
20150918 klug el performance tuning-v1.4
 
Getput suite
Getput suiteGetput suite
Getput suite
 
Docker tips-for-java-developers
Docker tips-for-java-developersDocker tips-for-java-developers
Docker tips-for-java-developers
 
Guide to alfresco monitoring
Guide to alfresco monitoringGuide to alfresco monitoring
Guide to alfresco monitoring
 
YOW2020 Linux Systems Performance
YOW2020 Linux Systems PerformanceYOW2020 Linux Systems Performance
YOW2020 Linux Systems Performance
 
Joker 2015 - Валеев Тагир - Что же мы измеряем?
Joker 2015 - Валеев Тагир - Что же мы измеряем?Joker 2015 - Валеев Тагир - Что же мы измеряем?
Joker 2015 - Валеев Тагир - Что же мы измеряем?
 
Dynamic Shift Frequency Scaling Of ATPG Patterns
Dynamic Shift Frequency Scaling Of ATPG PatternsDynamic Shift Frequency Scaling Of ATPG Patterns
Dynamic Shift Frequency Scaling Of ATPG Patterns
 
Load testing with Blitz
Load testing with BlitzLoad testing with Blitz
Load testing with Blitz
 
cloud computing chapter one in computer science
cloud computing chapter one in computer sciencecloud computing chapter one in computer science
cloud computing chapter one in computer science
 
Ovs perf
Ovs perfOvs perf
Ovs perf
 

More from Linaro

Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloDeep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloLinaro
 
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaArm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaLinaro
 
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraHuawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraLinaro
 
Bud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaBud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaLinaro
 
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018Linaro
 
HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018Linaro
 
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...Linaro
 
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Linaro
 
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Linaro
 
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Linaro
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineLinaro
 
HKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening KeynoteHKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening KeynoteLinaro
 
HKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP WorkshopHKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP WorkshopLinaro
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineLinaro
 
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allHKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allLinaro
 
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorHKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorLinaro
 
HKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMUHKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMULinaro
 
HKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MHKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MLinaro
 
HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation Linaro
 
HKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootHKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootLinaro
 

More from Linaro (20)

Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloDeep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
 
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaArm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
 
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraHuawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
 
Bud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaBud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qa
 
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
 
HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018
 
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
 
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
 
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
 
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
 
HKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening KeynoteHKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening Keynote
 
HKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP WorkshopHKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP Workshop
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
 
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allHKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
 
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorHKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
 
HKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMUHKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMU
 
HKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MHKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8M
 
HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation
 
HKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootHKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted boot
 

Recently uploaded

"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 

Recently uploaded (20)

"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 

BKK16-104 sched-freq

  • 1. Presented by Date Event sched-freq integrating the scheduler and cpufreq Steve Muckle BKK16-104 March 7, 2016 Linaro Connect BKK16
  • 2. outline - how things (don’t) work today - schedfreq design - latest test results - an upstream surprise - next steps
  • 3. clarifications/quick questions here proposals and debate in hacking session Tuesday 8th March, 15:00-15:50 HACKING-2 (lobby lounge - 23rd floor)
  • 4. outline - how things (don’t) work today - schedfreq design - latest test results - an upstream surprise - next steps
  • 5. how does cpufreq currently work? - plugin architecture (governors) - popular governors are sampling-based - let's assume: - fmin = 100MHz, fmax = 1000Mhz - a policy which goes to util*fmax + 100Mhz
  • 6. sampling = 80% freq = 900 MHz sampling = 30% freq = 400 Mhz sampling = 20% freq = 300 MHz sampling = 100% freq = 1000 MHz sampling = 100% freq = 1000 MHz sampling = 70% freq = 800 Mhz sampling = 10% freq = 200 Mhz sampling = 10% freq = 200 Mhz cpu busy time cpufreq governor sampling
  • 7. sampling = 80% freq = 900 MHz sampling = 30% freq = 400 Mhz sampling = 20% freq = 300 MHz sampling = 100% freq = 1000 MHz sampling = 100% freq = 1000 MHz sampling = 70% freq = 800 Mhz sampling = 10% freq = 200 Mhz sampling = 10% freq = 200 Mhz cpu busy time oops cpufreq governor sampling
  • 8. sampling = 0% freq = 100 MHz cpu busy time sampling = 60% freq = 700 MHz sampling = 100% freq = 1000 MHz sampling = 100% freq = 1000 MHz sampling = 100% freq = 1000 MHz sampling = 100% freq = 1000 MHz sampling = 100% freq = 1000 MHz sampling = 100% freq = 1000 MHz more problems: new tasks
  • 9. oops sampling = 0% freq = 100 MHz cpu busy time sampling = 60% freq = 700 MHz sampling = 100% freq = 1000 MHz sampling = 100% freq = 1000 MHz sampling = 100% freq = 1000 MHz sampling = 100% freq = 1000 MHz sampling = 100% freq = 1000 MHz sampling = 100% freq = 1000 MHz more problems: new tasks
  • 10. sampling = 100% freq = 1000 MHz cpu busy time sampling = 100% freq = 1000 MHz sampling = 100% freq = 1000 MHz sampling = 100% freq = 1000 MHz sampling = 0% freq = 100 MHz sampling = 100% freq = 1000 MHz sampling = 90% freq = 1000 MHz sampling = 0% freq = 100 MHz more problems: exiting tasks
  • 11. oops sampling = 100% freq = 1000 MHz cpu busy time sampling = 100% freq = 1000 MHz sampling = 100% freq = 1000 MHz sampling = 100% freq = 1000 MHz sampling = 0% freq = 100 MHz sampling = 100% freq = 1000 MHz sampling = 90% freq = 1000 MHz sampling = 0% freq = 100 MHz more problems: exiting tasks
  • 12. cpu busy time more problems: task migration
  • 13. oops cpu busy time oops more problems: task migration
  • 14. cpu busy time more problems: task migration
  • 15. oops cpu busy time oops more problems: task migration
  • 16. more problems: tuning - ondemand has 7 knobs - interactive has 11 knobs - ...and more get added by OEMs - along with hacks
  • 17. the line in the sand Ingo Molnar, May 31 2013 Note that I still disagree with the whole design notion of having an "idle back-end" (and a 'cpufreq back end') separate from scheduler power saving policy… This is a "line in the sand", a 'must have' design property for any scheduler power saving patches to be acceptable… https://lwn.net/Articles/552889/
  • 18. outline - how things (don’t) work today - schedfreq design - latest test results - an upstream surprise - next steps
  • 19. CFS RT DL schedfreq drivercpufreq CPU N schedfreq design - schedfreq is a cpufreq governor - can cpufreq be removed?
  • 20. estimating CFS capacity - per-entity load tracking (PELT) - introduced in 3.8 - exponential moving average - sum = is_running() + sum*y - frequency invariance required - partially merged (core bits in, ARM support not) - microarch invariance required - partially merged (core bits in, ARM support not)
  • 22. estimating CFS capacity - PELT - initial task load - was 0 when task is fork-balanced to different CPU due to bug - fix is on mainline/4.5 - http://thread.gmane.org/gmane.linux. kernel/2106780/ - blocked load is included in util_avg
  • 23. estimating DL capacity - runtime utilization tracking not strictly required - DL tasks have runtime, deadline, and period parameters - this describes the task’s bandwidth reservation - util = runtime/period - track DL bandwidth admitted into the system
  • 24. estimating DL capacity - util = runtime/period has drawbacks - it’s worst case - it’s always there - better solution - track active utilization - related to bandwidth reclaiming - both solutions under discussion
  • 25. estimating RT capacity - task priority but no constraints - monitor RT utilization - use rt_avg - no way to react to short latency constraints - focus on long term constraints and soft real time - may not be optimal but it already exists - be sure to budget capacity for RT - do not steal from other class’s cap requests
  • 26. aggregation of sched classes - sched class capacities are summed - headroom added to CFS and RT - (CFS + RT) * 1.25 - no headroom for DL tasks - DL tasks have precise capacity parameters - total capacity converted to frequency - scale using policy->max
  • 27. cpu0 cpu1 cpu2 cpu3 1.6 GHz 1.3 GHz 1.0 GHz 300 MHz aggregation of CPUs - CPU with max request in freq domain drives frequency
  • 28. setting the frequency - tricky to do from hot scheduler paths - locking - performance implications - varying CPU frequency driver behavior - does driver sleep? - is driver slow?
  • 29. setting the frequency - fast path - target freq can always be calculated - if… - driver isn’t slow or sleeps AND - schedfreq isn’t throttled AND - a freq transition isn’t underway AND - the slow path isn’t active then we can set freq in the fast path
  • 30. setting the frequency - slow path - kthread spawned by schedfreq - safe to sleep - safe to do more work
  • 31. setting the frequency - slow path - kthread spawned by schedfreq - safe to sleep - safe to do more work but
  • 32. setting the frequency - slow path - kthread spawned by schedfreq - safe to sleep - safe to do more work but - task wake overhead ($$$)
  • 33. locking - lots of cleanup going on in cpufreq - ongoing work from several people - no blocking locking issues seen ...for now
  • 34. locking - sched hooks hold rq lock - protect per-CPU data - avoid accessing policy->rwsem - freq_table - min/max - transition_ongoing - not required to initiate freq transitions
  • 35. locking - schedfreq has 3 internal locks - gov_enable_lock (mutex) - GOV_START/STOP for static key control - fastpath_lock (spinlock) - solve race to re-evaluate frequency domain - slowpath_lock (mutex) - solve race between slow path, fast path, GOV_START/STOP
  • 36. scheduler hooks - enqueue_task_fair, dequeue_task_fair* - set CFS capacity - CFS load balance paths - set CFS capacity at src, dest - scheduler_tick() - jump to fmax if headroom is impacted - pick_next_task_rt, task_tick_rt - set RT capacity
  • 37. scheduler hooks - todo - DL - migration paths in kernel/sched/core.c - changing task affinity - hotplug - balance on exec() - NUMA balancing
  • 38. policy summary - re-evaluate and set freq when tasks - wake - block (except when CPU goes idle) - migrate - event driven - too many events? - go to fmax at tick if headroom is impacted
  • 39. policy summary - when CPU goes idle, clear vote but don’t re- evaluate/set domain freq - don’t initiate more work when going idle - right thing to do?
  • 40. policy summary - PELT is very important, will need work - Patrick Bellasi’s util_est - buffer utilization value to yield more stable estimate - Vincent Guittot’s invariance improvements - same amount of work over same period at different freqs => different utilizations - tuning via schedtune
  • 41. outline - how things (don’t) work today - schedfreq design - latest test results - an upstream surprise - next steps
  • 42. rt-app cpufreq test case - simple periodic workload - each test case uses a different duty cycle - 16 different test cases/duty cycles - either 10, 100 or 1000 loops - varies from ~1% to ~43% busy
  • 43. rt-app cpufreq test case test case # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 busy (ms) 1 10 1 10 100 6 66 4 40 400 5 50 500 9 90 900 idle (ms) 100 1000 10 100 1000 33 333 10 100 1000 9 90 900 12 120 1200 loops 100 10 1000 100 10 300 30 1000 100 10 1000 100 10 1000 100 10 duration (s) 10.1 10.1 11 11 11 11.7 11.97 14 14 14 14 14 14 21 21 21 busy% 0.99% 0.99% 9.09% 9.09% 9.09% 15.38% 16.54% 28.57% 28.57% 28.57% 35.71% 35.71% 35.71% 42.86% 42.86% 42.86%
  • 44. rt-app cpufreq test case - for each loop, record - time to execute busy work - whether busy work overran period - for each test case, report - average time to complete busy work - number of overruns
  • 45. rt-app cpufreq test case - define overhead as ... (avg_time_test_gov - avg_time_perf_gov) / (avg_time_pwrsv_gov - avg_time_perf_gov) - 0% = completes as fast as perf gov - 100% = completes as fast as powersave gov
  • 46. Samsung Chromebook 2 - “Peach Pi” - Exynos 5800 - CPUs 0-3: 200 MHz - 2000 MHz A15 - CPUs 4-7: 200 MHz - 1300 MHz A7 - A15 fmax 1800 MHz with most recent clock support - no power numbers yet
  • 47. Samsung Chromebook 2 SCHED_OTHER (CFS) Perf ondemand interactive sched run (ms) idle (ms) loops OR OH OR OH OR OH 1 100 100 0 62.07% 0 100.02% 0 78.49% 10 1000 10 0 21.80% 0 22.74% 0 72.56% 1 10 1000 0 21.72% 0 63.08% 0 52.40% 10 100 100 0 8.09% 0 15.53% 0 17.33% 100 1000 10 0 1.83% 0 1.77% 0 0.29% 6 33 300 0 15.32% 0 8.60% 0 17.34% 66 333 30 0 0.79% 0 3.18% 0 12.26% 4 10 1000 0 5.87% 0 10.21% 0 6.15% 40 100 100 0 0.41% 0 0.04% 0 2.68% 400 1000 10 0 0.42% 0 0.50% 0 1.22% 5 9 1000 2 3.82% 1 6.10% 0 2.51% 50 90 100 0 0.19% 0 0.05% 0 1.71% 500 900 10 0 0.37% 0 0.38% 0 1.82% 9 12 1000 6 1.79% 1 0.77% 0 0.26% 90 120 100 0 0.16% 1 0.05% 0 0.49% 900 1200 10 0 0.09% 0 0.26% 0 0.62% Looks mostly good…
  • 48. Samsung Chromebook 2 SCHED_FIFO (RT) Perf ondemand interactive sched run (ms) idle (ms) loops OR OH OR OH OR OH 1 100 100 0 39.61% 0 100.49% 0 99.57% 10 1000 10 0 73.51% 0 21.09% 0 96.66% 1 10 1000 0 18.01% 0 61.46% 0 67.68% 10 100 100 0 31.31% 0 18.62% 0 77.01% 100 1000 10 0 58.80% 0 1.90% 0 15.40% 6 33 300 251 85.99% 0 9.20% 1 30.09% 66 333 30 24 84.03% 0 3.38% 0 33.23% 4 10 1000 0 6.23% 0 12.21% 10 11.54% 40 100 100 100 62.08% 0 0.11% 1 11.85% 400 1000 10 10 62.09% 0 0.51% 0 7.00% 5 9 1000 999 12.29% 1 6.03% 0 0.04% 50 90 100 99 61.47% 0 0.05% 2 6.53% 500 900 10 10 43.37% 0 0.39% 0 6.30% 9 12 1000 999 9.83% 0 0.01% 14 1.69% 90 120 100 99 61.47% 0 0.01% 28 2.29% 900 1200 10 10 43.31% 0 0.22% 0 2.15% RTavg mechanism not reacting fast enough. sched_time_avg_ms = 50
  • 49. MediaTek 8173 EVB - CPUs 0-1: 507 MHz - 1573 MHz A53 - CPUs 2-3: 507 MHz - 1989 MHz A72 - power measured via onboard TI INA219s - thanks to Freedom Tan for this data
  • 50. MediaTek 8173 EVB SCHED_OTHER (CFS) Perf ondemand interactive sched run (ms) idle (ms) loops OR OH OR OH OR OH 1 100 100 0 98.04% 0 100.41% 0 98.04% 10 1000 10 0 34.00% 0 67.68% 0 99.95% 1 10 1000 0 56.32% 0 101.11% 0 100.69% 10 100 100 0 18.31% 0 31.57% 0 100.02% 100 1000 10 0 2.77% 0 6.79% 0 7.72% 6 33 300 0 41.29% 0 100.27% 0 100.28% 66 333 30 0 1.27% 0 10.38% 0 39.31% 4 10 1000 0 21.45% 2 18.90% 6 65.66% 40 100 100 0 1.35% 0 8.16% 0 24.30% 400 1000 10 0 1.02% 0 1.74% 0 6.55% 5 9 1000 0 13.43% 2 14.14% 5 52.51% 50 90 100 0 1.31% 0 1.39% 1 12.62% 500 900 10 0 0.54% 0 1.32% 0 5.21% 9 12 1000 1 7.19% 1 8.59% 3 27.47% 90 120 100 0 0.88% 0 0.75% 1 3.80% 900 1200 10 0 0.16% 0 0.79% 0 2.83% Trouble with most workloads with run < 100ms.
  • 51. MediaTek 8173 EVB SCHED_OTHER (CFS) Power/Perf power delta perf (from last slide) run (ms) idle (ms) loops delta w/ondemand delta w/interactive sched overhead 1 100 100 -7.56% 0.01% 98.04% 10 1000 10 -0.41% 1.21% 99.95% 1 10 1000 12.97% 3.92% 100.69% 10 100 100 -3.19% -4.90% 100.02% 100 1000 10 -7.00% 1.37% 7.72% 6 33 300 -11.84% -9.95% 100.28% 66 333 30 -8.15% -2.09% 39.31% 4 10 1000 -0.93% -8.59% 65.66% 40 100 100 -3.18% -9.88% 24.30% 400 1000 10 -2.99% 0.07% 6.55% 5 9 1000 1.67% -9.33% 52.51% 50 90 100 -5.97% -10.89% 12.62% 500 900 10 -2.29% 1.73% 5.21% 9 12 1000 -5.90% -9.75% 27.47% 90 120 100 -6.88% -5.12% 3.80% 900 1200 10 5.23% 6.57% 2.83% avg delta w/interactive: -3.48% avg delta w/ondemand: -2.9% Power savings seen, but not meaningful with observed perf losses.
  • 52. outline - how things (don’t) work today - schedfreq design - latest test results - an upstream surprise - next steps
  • 53. an upstream surprise - scheduler - cpufreq hooks posted by Rafael Wysocki - Jan 29th 2016 - now in linux-next - sched utilization driven gov also posted by Rafael - Feb 21st 2016
  • 54. important differences - ondemand-like freq algorithm - possible to get stuck due to freq invariance - weird semantics w.r.t. headroom - no aggregation of sched class capacities - currently goes to fmax for RT, DL - uses workqueue rather than kthread
  • 55. what’s it mean? - more engagement from upstream - what’s the value of schedfreq?
  • 56. words of encouragement “What I'd like to see from a scheduler metrics usage POV is a single central place, kernel/sched/cpufreq.c, where all the high level ('governor') decisions are made. This is the approach Steve's series takes.” Ingo Molnar, 03/03/2016 Note: Mike Turquette and Juri Lelli conceived and authored much of the schedfreq series.
  • 57. outline - how things (don’t) work today - schedfreq design - latest test results - an upstream surprise - next steps
  • 58. next steps - address shortcomings in schedutil - freq algorithm - scheduler hooks - better RT, DL response - more in-depth testing and analysis - experiments with real-world usecases - Android UI, games, benchmarks, etc. - merging with EAS - integration and testing with schedtune - window-based load tracking
  • 61. ondemand - sample every sampling_rate usec - cpu usage = busy% of last sample_rate usec - if busy% > up_threshold, go to fmax - otherwise scale with load - freq_next = fmin + busy% * (fmax - fmin) - stay at fmax longer with sampling_down_factor
  • 62. interactive - sample every timer_rate usec - cpu usage = busy% of last timer_rate usec - if busy% > go_hispeed_load go to hispeed_freq - otherwise scale CPU according to target_loads - 85 1000000:90 1700000:95 - XXX put a note in here to explain this - can prevent slowdown with min_sample_time - delay speedups past hispeed_freq with above_hispeed_delay