Ruby3x3: How are we going to measure 3x?
Matthew Gaudet
Developer on the Eclipse OMR project
Cross-platform components for building reliable, high performance language runtimes
github.com/eclipse/omr
@eclipseOMR
Ruby 3x3: The Goal
[Chart: Performance of Ruby 2.0 vs. a target Ruby 3.0 at 3x.]
Agenda
Let’s talk about benchmarking!
• Some definitions
• Some philosophy
• Some pitfalls
Ruby 3x3
• Some thoughts from me.
Benchmarking.
Art + Science
Definition
Benchmark:
A piece of computer code run in
order to gather measurements
for comparison.
Definition
Benchmark, by example:
• Comparing the execution time of different interpreters, or options.
• Comparing the execution time of algorithms.
• Comparing the accuracy of different machine learning algorithms.
The Art of Benchmarking: What do you run?
The Benchmark Continuum: Microbenchmark → Application Kernel → Full Application
Microbenchmarks
A very small program written to explore the performance of one aspect of the system under test.
Pros
• Often easy to set up and run.
• Targeted to a particular aspect.
• Fast acquisition of data.
Cons
• Exaggerates effects.
• Not typically generalizable.
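For instance, a minimal microbenchmark sketch in Ruby using the standard benchmark library. The workload (two ways of turning an Integer into a String) is invented for illustration:

require 'benchmark'

# A microbenchmark targets one tiny aspect of the system:
# here, Integer#to_s vs. String#% formatting.
N = 1_000_000

Benchmark.bm(12) do |bm|
  bm.report('Integer#to_s') { N.times { |i| i.to_s } }
  bm.report('format')       { N.times { |i| '%d' % i } }
end

Note how easy this is to set up and how narrowly it targets one aspect: exactly the pros (and cons) listed above.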
Full Applications
Benchmarking a whole application.
Pros
• Immediate and obvious real-world impact!
Cons
• Small effects can be swamped in natural application variance.
• Can be complicated to set up, or slow to run!
Application Kernel
A particular part of an application extracted for the express purpose of constructing a benchmark.
Pros
• Tight connection to real-world code.
• Typically more generalizable.
Cons
• Difficult to know how much of an application should be included vs. mocked.
Pitfalls in benchmark design
Un-Ruby-like code: code that looks like another language.
“You can write FORTRAN in any language.”
• Code that never produces garbage.
• Code without exceptions.
Pitfalls in benchmark design
Input data is a key part of many benchmarks: watch out for weird input data!
• Imagine an MP3 compressor benchmark whose inputs are:
1. Silence: weird, because most MP3s are not silence.
2. White noise: weird, because most MP3s have some structure.
• Either input reduces the generalizability of the results!
The Art of Benchmarking: What do you run?
What do you measure?
Time?
Throughput?
Latency?
Definition
Wall-clock time:
The measurement of elapsed time relative to a clock independent of the process being timed.
$ time sleep 1
real 0m1.003s
user 0m0.000s
sys 0m0.000s
Definition
CPU time:
The measurement of how much CPU the process actually used (user + sys).
$ time sleep 1
real 0m1.003s
user 0m0.000s
sys 0m0.000s
Definition
Throughput:
A count of operations that occur
per unit of time.
Definition
Latency:
The time it takes for a response to occur after a stimulus.
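As a hedged sketch of both metrics, assuming a hypothetical work method standing in for the operation under test:

require 'benchmark'

# work() is an invented stand-in for the operation being measured.
def work
  (1..100).reduce(:+)
end

ops = 100_000
elapsed = Benchmark.realtime { ops.times { work } }

puts format('throughput:   %d ops/s', ops / elapsed)
puts format('mean latency: %.2f µs/op', elapsed / ops * 1_000_000)

(For a web server, latency would be how long a request takes to be processed after it is received.)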
The Art of Benchmarking: What do you run?
What do you measure?
What do you report?
Raw Measurements?
Speedup?
Definition
Speedup:
A ratio computed between a
baseline and experimental time
measurement.
Speedup = T_baseline / T_experimental
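As a worked example (numbers invented for illustration): a baseline time of 12 s against an experimental time of 4 s gives a speedup of 12 / 4 = 3x; a ratio below 1.0 is a slowdown.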
The Science of Benchmarking
An aside on misleading with speedup.
“He who controls the baseline
controls the speedup”
An aside on misleading with speedup.
“Our parallelization system shows
linear speedup as the number of
threads increases”
An aside on misleading with speedup.
[Chart: reported speedup rising roughly linearly: 1x at 1 thread, 2x at 2 threads, 4x at 4 threads, 8x at 8 threads.]
An aside on misleading with speedup.
Measurement                    Time (s)
Original Sequential Program    10.0
Parallelized, 1 thread         100.0
Parallelized, 2 threads        50.0
Parallelized, 4 threads        25.0
Parallelized, 8 threads        12.5
This is the distinction between relative speedup (measured against the 1-thread parallel run) and absolute speedup (measured against the best sequential version).
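Working the numbers from the table: relative speedup at 8 threads is 100.0 / 12.5 = 8x, which looks linear; absolute speedup is 10.0 / 12.5 = 0.8x, meaning the “8x faster” parallel version is still slower than the original sequential program.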
How you measure affects what
you measure.
Both of these are valid benchmarks!

$ cat test.rb
...
puts Benchmark.measure {
  1_000_000.times {
    compute_foo()
  }
}
$ for i in `seq 1 10`; do ruby test.rb; done

vs.

...
10.times {
  puts Benchmark.measure {
    1_000_000.times {
      compute_foo()
    }
  }
}

But they’re going to measure (and may encourage the optimization of) two different things! The first includes process startup in every sample, and so rewards fast startup; the second measures ten in-process iterations, and so rewards warmed-up peak performance.
Definition
Warmup:
The time from application start until it hits peak performance.
[Chart: Time per Iteration (s) over 11 iterations: 100, 64, 69, 36, 25, 30, 25, 26, 25, 26, 25. Early iterations are slow; times settle near 25 s.]
When has warmup finished? It is hard to say precisely. Despite this, even knowing warmup exists is important: it allows us to choose methodologies that can accommodate the possibility!
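A minimal sketch of a harness that exposes warmup, assuming a hypothetical compute_foo workload: print a score per iteration, and watch for when the numbers stabilize.

require 'benchmark'

# compute_foo is an invented stand-in for the code under test.
def compute_foo
  Array.new(1_000) { |i| i * i }.sum
end

# Report every iteration separately: warmup shows up as the first
# few scores being slower than the eventual steady state.
20.times do |i|
  t = Benchmark.realtime { 10_000.times { compute_foo } }
  puts format('iteration %2d: %.3f s', i + 1, t)
end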
Definition
Run-to-run variance:
The observed effect that identical runs do not have identical times.
$ for i in `seq 1 5`; do ruby -I../../lib/ string-equal.rb --loopn 1 1000; done
1.347334558
1.348350632
1.30690478
1.314764977
1.323862345
Methodology:
An incomplete list of decisions that need to be made when
developing benchmarking methodology:
1. Does your methodology account for warmup?
2. How are you accounting for run-to-run variance?
3. How are you accounting for the effects of the garbage
collector?
Pitfalls in benchmark design
Accounting for warmup often means producing
intermediate scores, so you can see when they stabilize.
If you aren’t accounting for warmup, you may find
that you miss out on peak performance.
Pitfalls in benchmark design
Account for run to run variance by running multiple times,
and presenting confidence intervals!
Be sure your methodology doesn’t encourage wild variations in performance, though!
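One hedged way to do that, using the timings from the run-to-run variance slide above (the normal-approximation 95% interval is a common choice, not the only defensible one):

times = [1.347334558, 1.348350632, 1.30690478, 1.314764977, 1.323862345]

n    = times.size
mean = times.sum / n
var  = times.map { |t| (t - mean)**2 }.sum / (n - 1)  # sample variance
ci   = 1.96 * Math.sqrt(var / n)                      # 95% CI half-width

puts format('%.3f s ± %.3f s (95%% CI, n=%d)', mean, ci, n)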
Be aware: benchmarks can act weird!
Garbage Collector Impact

$ ruby -J-Xmx330m -J-Xms330m -I../../lib/ connected.rb --loopn 10 1
0.426412300002994
0.35442964400863275
0.3484781830047723
0.36281039800087456
0.3565745719970437
0.36179181998886634
0.31713732800562866
0.3365019329939969
0.305397536008968
0.3006619710067753

$ ruby -J-Xmx33m -J-Xms33m -I../../lib/ connected.rb --loopn 10 1
0.5431441880064085
0.8410410610085819
0.7975159170018742
0.8458756269974401
0.9974212259985507
1.0887025539996102
1.067053010003292
1.057003531997907
1.0708161939983256
1.0480617069988512

(Roughly a 3x degradation from shrinking the heap 10x.)
Garbage Collector Impact
Garbage collector impact can make benchmarks incredibly difficult to compare:
• The Ruby+OMR Preview uses the OMR GC technology, including a change to move off-heap data on-heap.
• A side effect is that it’s crazy difficult to compare against the default ruby: there’s an entirely different set of data on the heap!
If heap size adapts to machine memory, you’ll need to figure out how to lock it down to give good comparisons across machines.
[Diagram: a String with a malloc’d buffer vs. a String with an on-heap OMRBuffer.]
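On CRuby, one way to lock the heap down is via the GC’s environment variables; a sketch (the slot count is invented for illustration, and the -J flags above are not CRuby flags):

# Pin the initial heap so differently sized machines start from a
# comparable GC state. (600000 is illustrative, not a recommendation.)
$ RUBY_GC_HEAP_INIT_SLOTS=600000 ruby -I../../lib/ connected.rb --loopn 10 1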
Benchmarking: User Error
$ time ruby their_implementation.rb 100000
real 0m10.003s
user 0m08.001s
sys 0m02.007s
$ time ruby my_implementation.rb 10000
real 0m1.003s
user 0m0.801s
sys 0m0.206s
10x speedup!
(Look closely: the two runs used different input sizes, 100000 vs. 10000. The “10x speedup” is user error.)
Pro Tip: Use a harness that
keeps you out of the
benchmarking process.
Aim for reproducibility!
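One example of such a harness is the benchmark-ips gem (assuming it is installed, e.g. gem install benchmark-ips); the method bodies below are invented stand-ins for the two scripts above:

require 'benchmark/ips'

# Invented stand-ins for the two implementations on the slide.
def their_implementation(n)
  n.times { |i| i.to_s }
end

def my_implementation(n)
  n.times { |i| i.to_s(16) }
end

# The harness owns warmup and the measurement windows, keeping the
# human out of the benchmarking loop, and it reports error margins.
Benchmark.ips do |x|
  x.report('theirs') { their_implementation(100_000) }
  x.report('mine')   { my_implementation(100_000) }
  x.compare!  # relative comparison, with deviations
end

Crucially, both reports run the same input size, so the user error above cannot happen.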
[Chart: Time (s) per iteration. Times spike when the laptop is unplugged and power-saving mode kicks in, then recover when power returns.]
Other Hardware Effects to watch for!
TurboBoost (and similar): frequency scaling based on… the season… the location… the rack… the CPU temperature.
Even in the cloud! [1]
[1]: http://www.brendangregg.com/blog/2014-09-15/the-msrs-of-ec2.html
Software Pitfalls
What about your backup service?
Long sequence of benchmarks… do you have
automatic software updates installed?
Do your system administrators know you are
benchmarking?
What about your screensaver?
Paranoia is a matter of effect sizes.
• Hardware changes:
– Disable turbo boost.
– Disable hyperthreading.
• Krun tool:
– Set ulimit for heap and stack.
– Reboot the machine before execution.
– Monitor dmesg for unexpected output.
– Monitor the temperature of the machine.
– Disable p-states.
– Set the CPU governor to performance mode.
– Control the perf sample rate.
– Disable ASLR.
– Create a new user account for each run.
http://arxiv.org/pdf/1602.00602v1.pdf
Performance improvements compound!
3x is 10 increases of 11%,
or 25 increases of 4.5%,
or 100 increases of 1.1%.
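Checking the arithmetic: the per-step gain needed to compound to 3x over n steps is 3^(1/n) − 1 (so the “11%” is really about 11.6%, rounded on the slide):

# Per-step gain needed to compound to an overall 3x over n steps.
[10, 25, 100].each do |n|
  gain = (3**(1.0 / n) - 1) * 100
  puts format('%3d steps of %.1f%% each compound to 3x', n, gain)
end
# => 10 steps of 11.6%, 25 of 4.5%, 100 of 1.1%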
Ruby 3x3: The Process
[Chart: Performance rising release by release from Ruby 2.0 through the 2.x series to Ruby 3.0. Made-up data, for illustration only.]
Philosophizing
Philosophy
Benchmarks drive change.
– What you measure is what people try to change.
– What you don’t measure may not change how you want.
Squeezing a Water Balloon
Be sure to measure associated metrics to have a clear-headed view of tradeoffs.
For example, JIT compilation:
– Trades startup speed for peak speed.
– Trades footprint for speed.
Benchmarks age!
Benchmarks can be wrung of all their possible
performance at some point.
 Using the same benchmarks for too long can lead to
shortsighted decisions driven by old benchmarks.
Idiomatic code evolves in a language.
 Benchmark use of language features can help drive
adoption!
–Be sure to benchmark desirable new language features!
Benchmarking 3x3
https://twitter.com/tenderlove/status/765288219931881472
The Ruby community has some great starting points!
Recall: Benchmarks drive change
Thought: Choose 9 application kernels that
represent what we want from a future CRuby!
• Why 9?
• Too many benchmarks can diffuse effort.
• Also! 3x3 = 9!
¯\_(ツ)_/¯
Brainstorming on the nine?
1. Some CPU-intensive applications:
• OptCarrot, neural nets, Monte Carlo tree search, a PSD filter pipeline?
2. Some memory-intensive application:
• A large tree mutation benchmark?
3. A startup benchmark:
• time ruby -e "def foo; '100'; end; puts foo"?
4. Some web application framework benchmarks.
Choose a methodology that drives the change we want in CRuby.
Want great performance, but not huge warmup times?
– Only run 5 iterations, and score the last one? (A sketch of this follows below.)
Don’t want to deal with warmup?
– Don’t run iterations: score the first run!
I ♥ Error Bars
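A minimal sketch of the “score the last of 5 iterations” option, with an invented run_benchmark standing in for the real workload:

require 'benchmark'

ITERATIONS = 5

# run_benchmark is an invented stand-in for the workload under test.
def run_benchmark
  Array.new(50_000) { |i| i.to_s }.join.length
end

# Warmup is tolerated, but only up to four iterations of it:
# the score is the fifth iteration alone.
times = ITERATIONS.times.map { Benchmark.realtime { run_benchmark } }
puts format('score: %.3f s (last of %d iterations)', times.last, ITERATIONS)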
One last idea…
What about a more ambitious
choice?
Use the ecosystem!
Add a standard performance harness to RubyGems.
 Would allow VM developers to sample popular gems, and
run a perf suite written by gem authors.
 With effort, time and $$$, we could make broad statements
about performance impact on the gem ecosystem.
Use the ecosystem!
This doesn’t just help VM developers. Gem authors get:
1. Performance tracking, set up for them!
2. Easier performance reporting with VM developers.
Credits
Headache: https://en.wikipedia.org/wiki/Headache#/media/File:Cruikshank_-_The_Head_Ache.png
@MattStudies
magaudet@ca.ibm.com
For more on software systems evaluation, be sure to visit
The Evaluate Collaboratory @
http://evaluate.inf.usi.ch/
Editor's Notes
  1. OMR is a project trying to create reusable components for building or augmenting language runtimes. Should be some news soon, so follow us on twitter. Please, come talk to me about OMR! But, I’m not here to talk about OMR right now.
  2. That purple circle hides a big concept! Let’s dig into it.
  3. Benchmarking is this weird combination of art and science, that drives me mad. The problem is that benchmarks seem so objective and scientific, but are filled with judgement calls, and the science is hard!
  4. The art of benchmarking ends up being a long list of questions and decisions you have to ask yourself, filled with judgement calls. First off, what do you run?
  5. Sometimes this involves mocking up parts of the normal application flow in such a way as to keep the code isolated.
  6. Imagine how this perturbs the code paths that your interpreter is going to take.
  7. The art of benchmarking ends up being a long list of questions and decisions you have to ask yourself, filled with judgement calls. First off, what do you run?
  8. Lots of questions have to be asked when you are benchmarking. This is equally true of both application developers and those who are developing language runtimes!
  9. Often when we’
  10. CPU time can be pretty misleading in a lot of circumstances: Notice that sleep used almost no CPU time, because it didn’t do anything! But it spent a long time running! Can be important though if you’re on a platform that charges by CPU usage!
  12. For example, in a web server, latency would be how long it takes a request to be processed after the request is received.
  13. The art of benchmarking ends up being a long list of questions and decisions you have to ask yourself, filled with judgement calls. First off, what do you run?
  14. Lots of questions have to be asked when you are benchmarking. This is equally true of both application developers and those who are developing language runtimes!
  15. Typically, speedup is talking about a measurement on the same machine with a software change of some kind, though one can also compute speedups by changing hardware.
  16. Typically, speedup is talking about a measurement on the same machine with a software change of some kind.
  17. I used to be an academic, and I learned while I was there that it’s terribly easy to lie with speedup.
  18. Typically, speedup is talking about a measurement on the same machine with a software change of some kind, though one can also compute speedups by changing hardware.
  19. To abuse a quote from Dune,
  22. You’ll note even at 8 threads, the parallel program is slower than the original. Relative: Relative to 1 thread Absolute: Relative to the fastest sequential version!
  23. This point isn’t obvious to everyone.
  24. The first will try to encourage faster startup – if compute foo runs quickly, startup costs will dominate the run on the left side.
  25. Warmup can occur as code loading is happening, caches are warmed up, JIT compilation occurs, etc. Warmup is a really awkward term: many people understand what you mean, but it’s not got a great scientific definition.
  26. Warmup can occur as code loading is happening, caches are warmed up, JIT compilation occurs, operating system thread scheduling
  27. Reporting the minimum time for example.
  28. When trying to measure performance be aware that benchmarks can act weird! You’ll have to report with a methodology that can handle it!
  29. 3x degradation of performance by having too small a heap.
  30. Imagine you do your benchmark baseline on your couch at home, but then you get to work and find your change has made everything 3x faster!
  31. You benchmark 10 rubies…
  32. But we would like to be able to measure small changes….
  33. Faster code can come at the cost of increased warmup time, increased footprint, etc.
  34. Just because you’re the fastest C89 compiler today doesn’t matter if people are writing C11 code that looks different!
  35. At this point, we go to the wise tenderlove, who reminds us!
  36. Please… whatever you do though, account for some variance.
  37. I wanted to leave having some brainstorming