∼ Best Practice for Better Performance ∼
Scala Days 2015 San Francisco Un-conference
2015-03-19 @mogproject
Ad Tech × Scala × Performance Tuning
About Demand Side Science
Introduction to Performance Tuning
Best Practice in Development
Japanese language version here:
Yosuke Mizutani (@mogproject)

Joined Demand Side Science in April 2013

(thanks to Scala Conference in Japan 2013)
Full-stack engineer (want to be…)
Background: 9-year infrastructure engineer
About Me
Nov 2012

Established Demand Side Science Inc.
Brief History of DSS

Developed private DSP package fractale
Brief History of DSS
Advertiser’s side of realtime ads bidding (RTB)
What is DSP
Supply Side Platform
Dec 2013

Moved into the group of Opt, the e-marketing agency
Oct 2014

Released dynamic creative tool unis
Brief History of DSS
unis is a third-party ad server that creates
dynamic and/or personalized ads according to rules.
items on sale, most popular items,
fixed items, re-targeting
With venture mind + advantage of Opt group …
Future of DSS
We will create various products based on Science!
Future of DSS
For everyone’s …
Future of DSS
Marketer × Publisher × Consumer
Future of DSS
Win × Win × Win
DSS and Scala
Adopt Scala
for all products
from the day of establishment
System Architecture Example
log storage
Log Aggregation
Machine Learning
Cache Making
Today, I will not talk about JavaScript tuning.
System Architecture Example
About Demand Side Science
Introduction to Performance Tuning
Best Practice in Development
Resolve an issue
Reduce infrastructure cost

(e.g. Amazon Web Services)
Application misbehaves under high load

Bad latency under specific conditions
Batch execution slower than expected
Slow development tools
Resolve an Issue
Very important, especially in the ad tech industry
Cost tends to grow bigger and bigger
High traffic
Need to respond within a few milliseconds
Big database, big log data
Business requires

Benefit from mass delivery > Infra Investment

Reduce Infrastructure Cost
Performance tuning itself has a
cost (≒ engineers’ time) and a
risk (the possibility of causing new trouble);
you need to account for both.
Don’t lose sight of your goal
Naively scaling the infrastructure up/out can be
the best solution
Don’t aim for perfection
We iterate
Basics of Performance Tuning
Measure metrics × Find bottleneck × Try with hypothesis
Don't take erratic steps.
“80% of a program’s processing time
comes from 20% of the code”
— Pareto Principle
※CAUTION: This is my own impression
Bottlenecks in My Experience
Database (RDBMS/NOSQL) 50%
Async / Thread 15%
Scala 10%
OS 10%
Library 5%
JVM parameter 5%
Network 4%
Others 1%
What is I/O
Memory × Disk × Network
Approximate timing for various operations
execute typical instruction              1 nanosec (1/1,000,000,000 sec)
fetch from L1 cache memory               0.5 nanosec
branch misprediction                     5 nanosec
fetch from L2 cache memory               7 nanosec
Mutex lock/unlock                        25 nanosec
fetch from main memory                   100 nanosec
send 2K bytes over 1Gbps network         20,000 nanosec
read 1MB sequentially from memory        250,000 nanosec
fetch from new disk location (seek)      8,000,000 nanosec
read 1MB sequentially from disk          20,000,000 nanosec
send packet US to Europe and back        150,000,000 nanosec (150 milliseconds)
If Typical Instruction Takes 1 second…
execute typical instruction              1 second
fetch from L1 cache memory               0.5 seconds
branch misprediction                     5 seconds
fetch from L2 cache memory               7 seconds
Mutex lock/unlock                        ½ minute
fetch from main memory                   1½ minutes
send 2K bytes over 1Gbps network         5½ hours
read 1MB sequentially from memory        3 days
fetch from new disk location (seek)      13 weeks
read 1MB sequentially from disk          6½ months
send packet US to Europe and back        5 years
A batch
reads 1,000,000 files of 10KB each
from disk
on every run.
Data size:
10KB × 1,000,000 ≒ 10GB
Horrible and True Story
Assuming 1,000,000 seeks are needed,

Estimated time:

8ms × 10^6 + 20ms × 10,000 ≒ 8,200 sec ≒ 2.3 h
If there is one file of 10GB and only one seek is needed,
Estimated time:
8ms × 1 + 20ms × 10,000 ≒ 200 sec ≒ 3.3 min
Horrible and True Story
Have Respect for the Disk Head
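The estimate above can be reproduced with a few lines of Scala. This is only a sketch: it plugs in the approximate costs from the timing table (8 ms per seek, 20 ms per 1 MB sequential read), and the names `DiskEstimate` and `estimateSeconds` are made up for illustration.

```scala
object DiskEstimate {
  // Approximate costs taken from the timing table (assumed, not measured).
  val SeekMs: Double = 8.0       // fetch from new disk location (seek)
  val ReadPerMbMs: Double = 20.0 // read 1 MB sequentially from disk

  /** Estimated seconds to read `totalMb` megabytes using `seeks` disk seeks. */
  def estimateSeconds(seeks: Long, totalMb: Long): Double =
    (seeks * SeekMs + totalMb * ReadPerMbMs) / 1000.0

  def main(args: Array[String]): Unit = {
    // 1,000,000 files of 10 KB each: one seek per file, about 10,000 MB in total.
    val manyFiles = estimateSeconds(seeks = 1000000L, totalMb = 10000L)
    // One 10 GB file: a single seek, same amount of data.
    val oneFile = estimateSeconds(seeks = 1L, totalMb = 10000L)
    println(f"1,000,000 small files: $manyFiles%.0f sec")
    println(f"one 10 GB file:        $oneFile%.0f sec")
  }
}
```

The seek term dominates the first case and all but disappears in the second, which is the whole point of the story.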
JVM Trade-offs
JVM Performance Triangle
Memory Footprint ↓
Throughput ↑ Latency ↓
longest pause
time for Full GC
In other words…
JVM Performance Triangle
Compactness × Throughput × Responsiveness
C × T × R = a
JVM Performance Triangle
Tuning: vary C, T, R for fixed a
Optimization: increase a
Everything I ever learned about JVM performance tuning
@twitter by Attila Szegedi
About Demand Side Science
Introduction to Performance Tuning
Best Practice in Development
1. Requirement Definition / Feasibility
2. Basic Design
3. Detailed Design
4. Building Infrastructure / Coding
5. System Testing
6. System Operation / Maintenance
Development Process
Only topics related to performance will be covered.
Reach agreement with stakeholders
on performance requirements
Requirement Definition / Feasibility
How many user IDs
internet users in Japan: 100 million
unique browsers: 200 ~ x00 million
will increase?
data expiration cycle?
type of devices / browsers?
opt-out rate?
Requirement Definition / Feasibility
Number of ad delivery requests
Number of impressions per month
In case of 1 billion / month

=> mean: about 400 QPS (Queries Per Second)

=> if peak rate = 250%, then 1,000 QPS
For RTB, bid rate? win rate?
Goal response time? Content size?
Plans for increasing?
How about Cookie Sync?
Requirement Definition / Feasibility
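The QPS figures on this slide come from simple arithmetic, sketched below. The 30-day month and the names `QpsEstimate`, `meanQps` and `peakQps` are assumptions for illustration; the slide rounds the results up to 400 and 1,000 QPS.

```scala
object QpsEstimate {
  // Assumption: a month is 30 days long.
  val SecondsPerMonth: Long = 30L * 24 * 60 * 60

  /** Mean queries per second for a given monthly request volume. */
  def meanQps(requestsPerMonth: Long): Double =
    requestsPerMonth.toDouble / SecondsPerMonth

  /** Peak QPS, where peakRatio = 2.5 means the peak is 250% of the mean. */
  def peakQps(requestsPerMonth: Long, peakRatio: Double): Double =
    meanQps(requestsPerMonth) * peakRatio

  def main(args: Array[String]): Unit = {
    val monthly = 1000000000L // 1 billion requests per month
    println(f"mean: ${meanQps(monthly)}%.1f QPS")
    println(f"peak: ${peakQps(monthly, 2.5)}%.1f QPS")
  }
}
```

Keeping the calculation in code makes it easy to redo the estimate whenever the sales plan or the peak ratio changes.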
Number of trackers received
Timing of tracker firing
Click rate?
Conversion(*) rate?

* A conversion occurs when the user performs the
specific action that the advertiser has defined as the
campaign goal.

e.g. buying a product in an online store
Requirement Definition / Feasibility
Requirement for aggregation
Indicators to be aggregated
Is unique counting needed?
Any exception rules?
Who and when
secondary processing by ad agency?
Update interval
Storage period
Requirement Definition / Feasibility
Hard limit by business side
Sales plan
Christmas selling?
Annual sales target?
Total budget
The most important thing is to provide numbers,
although it is extremely difficult to approximate
precisely in the turbulent world of ad tech.
Requirement Definition / Feasibility
Architecture design needs assumed values
Performance testing needs numeric goals
Architecture design
Choose framework
Web framework
Choose database
Basic Design
Threading model design
Reduce blocking
Future based
Callback & function composition
Actor based
Message passing
Thread pool design
We can’t know the appropriate thread pool
size unless we complete performance
testing in production.
Basic Design
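As a rough illustration of the Future-based style with a tunable thread pool, here is a minimal sketch using only the Scala standard library. The two lookup functions are hypothetical stand-ins for real non-blocking I/O calls, and `poolSize` is a parameter precisely because the right value is only known after performance testing.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object FutureSketch {
  // Hypothetical lookups standing in for real asynchronous calls.
  def lookupUser(id: Int)(implicit ec: ExecutionContext): Future[String] =
    Future(s"user-$id")

  def lookupSegment(user: String)(implicit ec: ExecutionContext): Future[String] =
    Future(s"$user:segment")

  /** Composes the two lookups on a fixed pool whose size is a tunable parameter. */
  def run(poolSize: Int, id: Int): String = {
    val pool = Executors.newFixedThreadPool(poolSize)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      // Function composition: the second lookup starts when the first completes.
      val result = for {
        user    <- lookupUser(id)
        segment <- lookupSegment(user)
      } yield segment
      // Blocking here only to keep the demo self-contained.
      Await.result(result, 5.seconds)
    } finally pool.shutdown()
  }

  def main(args: Array[String]): Unit =
    println(run(poolSize = 4, id = 42))
}
```

In a real service the pool size would normally come from configuration, so it can be changed between load-test runs without a rebuild.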
Database design
Access pattern / Number of lookups
Data size per record
Model the size distribution when the size
is not constant
Number of records
Rate of growth / retention period
Memory usage
First, measure the performance of the
database itself
Detailed Design
Log design
Consider compression ratio for disk usage
Cache design
Some software needs double the capacity
while taking backups (e.g. Redis)
Detailed Design
Simplicity and clarity come first
“It is far, far easier to make a correct
program fast than it is to make a fast
program correct”

— C++ Coding Standards: 101 Rules, Guidelines, and Best Practices
(C++ In-Depth Series)
Building Infrastructure / Coding
“Premature optimization
is the root of all evil.”
— Donald Knuth
“On the other hand,
we cannot ignore efficiency”
— Jon Bentley
Avoid algorithms that are worse than linear
wherever possible
Measure, don’t guess
Building Infrastructure / Coding
SBT Plugin for running OpenJDK JMH 

(Java Microbenchmark Harness: Benchmark tool for Java)
Micro Benchmark: sbt-jmh
addSbtPlugin("pl.project13.scala" % "sbt-jmh" % "0.1.6")
Micro Benchmark: sbt-jmh
import org.openjdk.jmh.annotations.Benchmark

class YourBench {
  @Benchmark
  def yourFunc(): Unit = ??? // write code to measure
}
Just put an annotation
> run -i 3 -wi 3 -f 1 -t 1
Micro Benchmark: sbt-jmh
Run benchmark in the sbt console
-i: Number of iterations to do
-wi: Number of warmup iterations to do
-f: How many times to fork a single benchmark
-t: Number of worker threads to run with
[info] Benchmark Mode Samples Score Score error Units
[info] c.g.m.u.ContainsBench.listContains thrpt 3 41.033 25.573 ops/s
[info] c.g.m.u.ContainsBench.setContains thrpt 3 6.810 1.569 ops/s
Micro Benchmark: sbt-jmh
Result (excerpted)
By default, throughput score
will be displayed.
(larger is better)
Scala Optimization Example
Use Scala collection correctly
Prefer recursion to function call

by Prof. Martin Odersky in Scala Matsuri 2014
Try optimization libraries
def f(xs: List[Int], acc: List[Int] = Nil): List[Int] = {
  if (xs.length < 4) {
    (xs.sum :: acc).reverse
  } else {
    val (y, ys) = xs.splitAt(4)
    f(ys, y.sum :: acc)
  }
}
Horrible and True Story pt.2
Split a List[Int] into groups of 4 elements, then

calculate the sum of each group
scala> f((1 to 10).toList)
res1: List[Int] = List(10, 26, 19)
Horrible and True Story pt.2
List#length takes time proportional to the
length of the sequence
When the length of the parameter xs is n,

time complexity of List#length is O(n)
Implemented in LinearSeqOptimized#length
Horrible and True Story pt.2
In function f,

xs.length is evaluated n / 4 + 1 times,

because the number of calls to f is
proportional to n, and each evaluation costs O(n)

so the time complexity of function f is O(n^2)
It becomes too slow with big n
Horrible and True Story pt.2
For your information, the following one-liner does
the same work using a built-in method
scala> (1 to 10).grouped(4).map(_.sum).toList
res2: List[Int] = List(10, 26, 19)
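If a hand-written recursive version is still wanted, the quadratic cost can be avoided by never calling `length`. The sketch below is one possible rewrite, not the code actually used in production; `GroupSum` and `sumGroups` are illustrative names. It relies on `isEmpty` being O(1) on `List` and on `splitAt(size)` touching only `size` elements.

```scala
import scala.annotation.tailrec

object GroupSum {
  /** Sums each consecutive group of `size` elements in O(n) total time. */
  @tailrec
  def sumGroups(xs: List[Int], size: Int = 4, acc: List[Int] = Nil): List[Int] = {
    require(size > 0, "size must be positive")
    if (xs.isEmpty) acc.reverse // O(1) check instead of xs.length
    else {
      val (group, rest) = xs.splitAt(size) // walks at most `size` elements
      sumGroups(rest, size, group.sum :: acc)
    }
  }

  def main(args: Array[String]): Unit =
    println(sumGroups((1 to 10).toList))
}
```

Unlike the original `f`, this version returns an empty list for empty input and adds no trailing 0 when the length is a multiple of 4, which matches the behaviour of `grouped`.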
Library for optimising Scala collection

(by using macro)
Presentation in Scala Days 2014
System feature testing
Interface testing
Performance testing
Reliability testing
Security testing
Operation testing
System Testing
Simple load testing
Scenario load testing
mixed load with typical user operations
Aging test (continuously running test)
Performance Testing
Bundled with Apache
Simple benchmark tool

Adequate for naive requirements
Latest version recommended

(a bug in the version pre-installed on Amazon Linux gave me trouble)
ab - Apache Bench
ab -C <CookieName=Value> -n <NumberOfRequests> -c <Concurrency> "<URL>"
Result example (excerpted)
ab - Apache Bench
Benchmarking (be patient)
Completed 1200 requests
Completed 2400 requests
Completed 10800 requests
Completed 12000 requests
Finished 12000 requests
Concurrency Level: 200
Time taken for tests: 7.365 seconds
Complete requests: 12000
Failed requests: 0
Write errors: 0
Total transferred: 166583579 bytes
HTML transferred: 160331058 bytes
Requests per second: 1629.31 [#/sec] (mean)
Time per request: 122.751 [ms] (mean)
Time per request: 0.614 [ms] (mean, across all concurrent requests)
Transfer rate: 22087.90 [Kbytes/sec] received
Percentage of the requests served within a certain time (ms)
50% 116
66% 138
75% 146
80% 150
90% 161
95% 170
98% 185
99% 208
100% 308 (longest request)
Requests per second
Load testing tool written in Scala
The era of Apache JMeter is over
Say goodbye to building scenarios in a GUI
With Gatling,
you write load scenarios in a Scala DSL
Watch the resources on the load-generator side
Resources of the server (or PC)
Network router (CPU) can be a bottleneck
Don’t tune two or more parameters at once
Keep a change log and the log files
Days for Testing and Tuning
System Operation / Maintenance
Logging × Anomaly Detection × Trends Visualization
Day-to-day logging and monitoring
Application log
GC log
Anomaly detection from several metrics
Server resource (CPU, memory, disk, etc.)
abnormal response code
Trends visualization from several metrics
System Operation / Maintenance
GC log

Add JVM options as follows
JVM Settings
“If someone doesn’t enable
GC logging in production,
I shoot them!”
— Real customer (p55)
JMX (Java Management eXtensions)

Add JVM options as follows
stdout / stderr
Should redirect to file
Should NOT throw away to /dev/null
Result of thread dump

(kill -3 <PROCESS_ID>) will be written
JVM Settings
SLF4J + Profiler
Coding example
import org.slf4j.profiler.Profiler
val profiler: Profiler = new Profiler(this.getClass.getSimpleName)
SLF4J + Profiler
Output example

Log the result of the profiler when
timeout occurs
+ Profiler [BASIC]
|-- elapsed time [A] 220.487 milliseconds.
|-- elapsed time [B] 2499.866 milliseconds.
|-- elapsed time [OTHER] 3300.745 milliseconds.
|-- Total [BASIC] 6022.568 milliseconds.
For catching trends, not for anomaly detection
Operations must also take care not to overlook
signs of change
Not only for infrastructure / application metrics, but
also for business indicators
Who uses the console?
System user
System administrator
Application developer
Business manager
Trends Visualization
Grafana (+Graphite)
Graphite -
Manage and visualize numeric time-series
Grafana -
Visualize Graphite data more stylishly

(Kibana-like)
Grafana (+Graphite)
Thank you very much
Adtech scala-performance-tuning-150323223738-conversion-gate01

  • 1. × ∼ Best Practice for Better Performance ∼ Scala Days 2015 San Francisco Un-conference 2015-03-19 @mogproject Ad Tech Performance Tuning Scala ×
  • 2. Agenda About Demand Side Science Introduction to Performance Tuning Best Practice in Development Japanese language version here:
  • 3. Yosuke Mizutani (@mogproject)
 Joined Demand Side Science in April 2013
 (thanks to Scala Conference in Japan 2013) Full-stack engineer (want to be…) Background: 9-year infrastructure engineer About Me
  • 8. Nov 2012
 Established Demand Side Science Inc. Brief History of DSS Demand × Side Science×
  • 9. 2013
 Developed private DSP package fractale Brief History of DSS Demand × Side Platform×
  • 10. Advertiser’s side of realtime ads bidding (RTB) What is DSP Supply Side Platform
  • 11. Dec 2013
 Moved into the group of Opt, the e-marketing agency Oct 2014
 Released dynamic creative tool unis Brief History of DSS × ×
  • 12. unis is a third-party ad server which creates dynamic and/or personalized ads under the rules. unis items on sale most popular items fixed items re-targeting
  • 13. With venture mind + advantage of Opt group … Future of DSS Demand × Side Science×
  • 14. We will create various products based on Science! Future of DSS ??? × ??? Science×
  • 15. For everyone’s … Future of DSS Marketer × Publisher Consumer×
  • 17. We DSS and Scala Adopt Scala × for all products from the day of establishment ×
  • 18. System Architecture Example RDBMS NOSQL log storage cache Log Aggregation Machine Learning Cache Making etc.
  • 19. Today, I will not talk about JavaScript tuning. System Architecture Example
  • 20. Agenda About Demand Side Science Introduction to Performance Tuning Best Practice in Development
  • 21. Resolve an issue Reduce infrastructure cost
 (e.g. Amazon Web Services) Motivations
  • 22. Resolve an Issue: Application goes wrong under high load
 Bad latency under specific conditions. Batch execution slower than expected. Slow development tools.
  • 23. Reduce Infrastructure Cost: Very important, especially in the ad tech industry. Cost tends to grow bigger and bigger: high traffic, need to respond within a few milliseconds, big database, big log data. Business requires
 Benefit from mass delivery > Infra Investment
  • 24. Don’t lose sight of your goal: You need to care about the cost (≒ engineer’s time) and risk (possibility of causing new trouble) of performance tuning itself. Simply scaling the infrastructure up/out can be the best solution. Don’t try to be perfect.
  • 25. Basics of Performance Tuning: We iterate: Measure metrics × Find bottleneck × Try with hypothesis. Don't take erratic steps.
  • 26. “80% of a program’s processing time comes from 20% of the code” (Pareto Principle)
  • 27. Bottleneck in My Experience (※CAUTION: This is my own impression)
 Database (RDBMS/NOSQL): 50%
 Async / Thread: 15%
 Scala: 10%
 OS: 10%
 Library: 5%
 JVM parameter: 5%
 Network: 4%
 others: 1%
  • 28. What is I/O? Memory × Disk × Network
  • 29. Approximate timing for various operations
 execute typical instruction: 1/1,000,000,000 sec = 1 nanosec
 fetch from L1 cache memory: 0.5 nanosec
 branch misprediction: 5 nanosec
 fetch from L2 cache memory: 7 nanosec
 Mutex lock/unlock: 25 nanosec
 fetch from main memory: 100 nanosec
 send 2K bytes over 1Gbps network: 20,000 nanosec
 read 1MB sequentially from memory: 250,000 nanosec
 fetch from new disk location (seek): 8,000,000 nanosec
 read 1MB sequentially from disk: 20,000,000 nanosec
 send packet US to Europe and back: 150 milliseconds = 150,000,000 nanosec
  • 30. If Typical Instruction Takes 1 second…
 execute typical instruction: 1 second
 fetch from L1 cache memory: 0.5 seconds
 branch misprediction: 5 seconds
 fetch from L2 cache memory: 7 seconds
 Mutex lock/unlock: ½ minute
 fetch from main memory: 1½ minutes
 send 2K bytes over 1Gbps network: 5½ hours
 read 1MB sequentially from memory: 3 days
 fetch from new disk location (seek): 13 weeks
 read 1MB sequentially from disk: 6½ months
 send packet US to Europe and back: 5 years
  • 31. Horrible and True Story: A batch reads 1,000,000 files of 10KB each from disk on every run. Data size: 10KB × 1,000,000 ≒ 10GB
  • 32. Horrible and True Story: Assuming 1,000,000 seeks are needed,
 estimated time:
 8ms × 10^6 + 20ms × 10,000 ≒ 8,200 sec ≒ 2.3 h. If there is one file of 10GB and only one seek is needed, estimated time: 8ms × 1 + 20ms × 10,000 ≒ 200 sec ≒ 3.3 min
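The estimate above can be reproduced with a few lines of Scala. This is only a sketch: the object and method names are hypothetical, and the two constants are the seek time and sequential read time quoted in the timing table (8 ms per seek, 20 ms per MB).

```scala
object DiskReadEstimate {
  // Figures taken from the timing table in the slides.
  val SeekMillis: Long = 8L        // fetch from new disk location (seek)
  val ReadMillisPerMb: Long = 20L  // read 1MB sequentially from disk

  /** Estimated time in milliseconds for `seeks` seeks plus `totalMb` MB of sequential reads. */
  def estimateMillis(seeks: Long, totalMb: Long): Long =
    seeks * SeekMillis + totalMb * ReadMillisPerMb

  def main(args: Array[String]): Unit = {
    // 1,000,000 small files: one seek per file, about 10,000 MB in total
    val manyFiles = estimateMillis(seeks = 1000000L, totalMb = 10000L)
    // One 10GB file: a single seek, same amount of data
    val oneFile = estimateMillis(seeks = 1L, totalMb = 10000L)
    println(f"many files: ${manyFiles / 1000.0}%.0f sec, one file: ${oneFile / 1000.0}%.0f sec")
  }
}
```

The seek term dominates the first case completely, which is the point of the story: the amount of data is identical, only the number of seeks differs.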
  • 33. Have Respect for the Disk Head
  • 34. JVM Trade-offs JVM Performance Triangle Memory Footprint ↓ Throughput ↑ Latency ↓ longest pause time for Full GC
  • 35. In other words… JVM Performance Triangle Compactness Throughput Responsiveness
  • 36. JVM Performance Triangle: C × T × R = a. Tuning: vary C, T, R for fixed a. Optimization: increase a. Reference: Everything I ever learned about JVM performance tuning @twitter by Attila Szegedi
  • 37. Agenda About Demand Side Science Introduction to Performance Tuning Best Practice in Development
  • 38. 1. Requirement Definition / Feasibility 2. Basic Design 3. Detailed Design 4. Building Infrastructure / Coding 5. System Testing 6. System Operation / Maintenance Development Process Only topics related to performance will be covered.
  • 39. Requirement Definition / Feasibility: Make an agreement with stakeholders about performance requirements. How many user IDs? (Internet users in Japan: 100 million; unique browsers: 200 ~ x00 million.) Will the number increase? Data expiration cycle? Types of devices / browsers? Opt-out rate?
  • 40. Requirement Definition / Feasibility: Number of ad delivery requests. Number of impressions per month. In the case of 1 billion / month
 => mean: 400 QPS (Queries Per Second)
 => if peak rate = 250%, then 1,000 QPS. For RTB, bid rate? Win rate? Goal response time? Content size? Plans for growth? How about Cookie Sync?
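The conversion from a monthly volume to QPS is simple arithmetic. The sketch below assumes a 30-day month (an assumption, not stated in the slides) and uses hypothetical names; it shows that 1 billion requests per month is roughly 386 QPS, which the slide rounds up to 400 before applying the 250% peak rate.

```scala
object QpsEstimate {
  // Assumption: a 30-day month.
  val SecondsPerMonth: Long = 30L * 24 * 60 * 60

  /** Mean queries per second for a given monthly request volume. */
  def meanQps(requestsPerMonth: Long): Double =
    requestsPerMonth.toDouble / SecondsPerMonth

  /** Peak QPS, where peakRate = 2.5 means the peak is 250% of the mean. */
  def peakQps(mean: Double, peakRate: Double): Double = mean * peakRate

  def main(args: Array[String]): Unit = {
    val mean = meanQps(1000000000L)
    println(f"mean: $mean%.1f QPS")
    println(f"peak at 250%% of the rounded mean: ${peakQps(400.0, 2.5)}%.0f QPS")
  }
}
```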
  • 41. Requirement Definition / Feasibility Number of receiving trackers Timing of firing tracker Click rate? Conversion(*) rate?
 * A conversion occurs when the user performs the specific action that the advertiser has defined as the campaign goal.
 e.g. buying a product in an online store
  • 42. Requirement Definition / Feasibility: Requirements for aggregation. Indicators to be aggregated. Is unique counting needed? Any exception rules? Secondary processing by the ad agency: who does it, and when? Update interval. Storage period.
  • 43. Requirement Definition / Feasibility Hard limit by business side Sales plan Christmas selling? Annual sales target? Total budget
  • 44. Requirement Definition / Feasibility: The most important thing is to provide numbers, although it is extremely difficult to approximate precisely in the turbulent world of ad tech. Architecture design needs assumed values. Performance testing needs numeric goals.
  • 45. Architecture design Choose framework Web framework Choose database RDBMS NOSQL Basic Design
  • 46. Threading model design Reduce blocking Future based Callback & function composition Actor based Message passing Thread pool design We can’t know the appropriate thread pool size unless we complete performance testing in production. Basic Design
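As an illustration of the future-based style with an explicit thread pool, here is a minimal sketch using only the Scala standard library. The two lookup functions and the pool size of 4 are hypothetical placeholders; as noted above, the right pool size can only be found through performance testing.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object FutureSketch {
  // Hypothetical pool size; the real value has to come from performance testing.
  private val pool = Executors.newFixedThreadPool(4)
  implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)

  // Hypothetical stand-ins for non-blocking lookups (e.g. cache or database calls).
  def lookupUser(id: Int): Future[String] = Future { s"user-$id" }
  def lookupBid(user: String): Future[Int] = Future { user.length * 10 }

  // Function composition: the second lookup starts when the first completes,
  // and no thread is blocked while waiting.
  def bidFor(id: Int): Future[Int] =
    for {
      user <- lookupUser(id)
      bid  <- lookupBid(user)
    } yield bid

  def shutdown(): Unit = pool.shutdown()

  def main(args: Array[String]): Unit =
    try {
      // Blocking with Await is done only at the edge of this demo.
      println(Await.result(bidFor(42), 5.seconds))
    } finally shutdown()
}
```

Keeping the pool in one place makes it easy to change its size between test runs without touching the composed logic.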
  • 47. Database design Access pattern / Number of lookup Data size per one record Create model of distribution when the size is not constant Number of records Rate of growth / retention period Memory usage At first, measure the performance of the database itself Detailed Design
  • 48. Detailed Design: Log design: consider the compression ratio for disk usage. Cache design: some software needs double the capacity while processing a backup (e.g. Redis).
  • 49. Simplicity and clarity come first “It is far, far easier to make a correct program fast than it is to make a fast program correct”
 — C++ Coding Standards: 101 Rules, Guidelines, and Best Practices (C ++ in-depth series) Building Infrastructure / Coding
  • 50. — Donald Knuth “Premature optimization is the root of all evil.”
  • 51. — Jon Bentley “On the other hand, we cannot ignore efficiency”
  • 52. Building Infrastructure / Coding: Avoid algorithms that are worse than linear wherever possible. Measure, don’t guess.
  • 53. SBT Plugin for running OpenJDK JMH 
 (Java Microbenchmark Harness: Benchmark tool for Java) Micro Benchmark: sbt-jmh
  • 54. Micro Benchmark: sbt-jmh. Just put an annotation.
 plugins.sbt:
     addSbtPlugin("pl.project13.scala" % "sbt-jmh" % "0.1.6")
 build.sbt:
     jmhSettings
 YourBench.scala:
     import org.openjdk.jmh.annotations.Benchmark
     class YourBench {
       @Benchmark
       def yourFunc(): Unit = ??? // write code to measure
     }
  • 55. Micro Benchmark: sbt-jmh. Run benchmark in the sbt console:
     > run -i 3 -wi 3 -f 1 -t 1
 -i: number of measurement iterations to do
 -wi: number of warmup iterations to do
 -f: how many times to fork a single benchmark
 -t: number of worker threads to run with
  • 56. Micro Benchmark: sbt-jmh. Result (excerpted):
 [info] Benchmark                             Mode  Samples   Score  Score error  Units
 [info] c.g.m.u.ContainsBench.listContains   thrpt        3  41.033       25.573  ops/s
 [info] c.g.m.u.ContainsBench.setContains    thrpt        3   6.810        1.569  ops/s
 By default, the throughput score is displayed (larger is better).
  • 57. Scala Optimization Example: Use Scala collections correctly. Prefer recursion to function call
 (by Prof. Martin Odersky in Scala Matsuri 2014). Try optimization libraries.
  • 58. Horrible and True Story pt.2: Group a List[Int] into chunks of 4 elements, then
 calculate the sum of each chunk.
     def f(xs: List[Int], acc: List[Int] = Nil): List[Int] = {
       if (xs.length < 4) {
         (xs.sum :: acc).reverse
       } else {
         val (y, ys) = xs.splitAt(4)
         f(ys, y.sum :: acc)
       }
     }
 Example:
     scala> f((1 to 10).toList)
     res1: List[Int] = List(10, 26, 19)
  • 59. Horrible and True Story pt.2 List#length takes time proportional to the length of the sequence When the length of the parameter xs is n,
 time complexity of List#length is O(n) Implemented in LinearSeqOptimized#length LinearSeqOptimized.scala#L35-43
  • 60. Horrible and True Story pt.2: In function f,
 xs.length is evaluated n / 4 + 1 times, because f is executed a number of times proportional to n,
 and each evaluation of xs.length is itself O(n). Therefore,
 the time complexity of function f is O(n^2). It becomes too slow with big n.
  • 61. Horrible and True Story pt.2: For your information, the following one-liner does the same work using built-in methods: scala> (1 to 10).grouped(4).map(_.sum).toList res2: List[Int] = List(10, 26, 19)
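If the recursive shape of f has to be kept, one possible fix (a sketch, not taken from the slides) is to replace `xs.length < 4` with `xs.lengthCompare(4) < 0`. On a List, `lengthCompare` walks at most a few elements instead of the whole list, so each step costs constant time and the function as a whole becomes O(n). The object and method names below are hypothetical.

```scala
import scala.annotation.tailrec

object GroupSum {
  // Same logic as f in the slides, but the length check inspects at most
  // a handful of elements instead of traversing the whole remaining list.
  @tailrec
  def sumBy4(xs: List[Int], acc: List[Int] = Nil): List[Int] =
    if (xs.lengthCompare(4) < 0) {
      (xs.sum :: acc).reverse
    } else {
      val (y, ys) = xs.splitAt(4) // splitAt(4) touches only the first 4 elements
      sumBy4(ys, y.sum :: acc)
    }

  def main(args: Array[String]): Unit =
    println(sumBy4((1 to 10).toList))
}
```

The `@tailrec` annotation makes the compiler verify that the recursion is compiled into a loop, so large inputs do not overflow the stack.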
  • 63. Library for optimising Scala collection
 (by using macro) Presentation in Scala Days 2014 53a7d2c6e4b0543940d9e549/chapter0/ about ScalaBlitz
  • 64. System feature testing Interface testing Performance testing Reliability testing Security testing Operation testing System Testing
  • 65. Simple load testing Scenario load testing mixed load with typical user operations Aging test (continuously running test) Performance Testing
  • 66. ab - Apache Bench: Simple benchmark tool bundled with Apache.
 Adequate for naive requirements. Latest version recommended
 (a bug in the version pre-installed on Amazon Linux made me sick). Example: ab -C <CookieName=Value> -n <NumberOfRequests> -c <Concurrency> "<URL>"
  • 67. ab - Apache Bench. Result example (excerpted):
 Benchmarking (be patient)
 Completed 1200 requests
 Completed 2400 requests
 (omitted)
 Completed 10800 requests
 Completed 12000 requests
 Finished 12000 requests
 (omitted)
 Concurrency Level:      200
 Time taken for tests:   7.365 seconds
 Complete requests:      12000
 Failed requests:        0
 Write errors:           0
 Total transferred:      166583579 bytes
 HTML transferred:       160331058 bytes
 Requests per second:    1629.31 [#/sec] (mean)
 Time per request:       122.751 [ms] (mean)
 Time per request:       0.614 [ms] (mean, across all concurrent requests)
 Transfer rate:          22087.90 [Kbytes/sec] received
 (omitted)
 Percentage of the requests served within a certain time (ms)
   50%    116
   66%    138
   75%    146
   80%    150
   90%    161
   95%    170
   98%    185
   99%    208
  100%    308 (longest request)
 Requests per second = QPS
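The three headline figures in the ab output are related by simple formulas, which makes it easy to sanity-check a run. The sketch below (hypothetical names, plain arithmetic) recomputes them from the concurrency level, the number of completed requests and the elapsed time in the excerpt; small differences in the last digit come from ab using an unrounded elapsed time.

```scala
object AbMetrics {
  /** Requests per second (mean) = completed requests / elapsed seconds. */
  def requestsPerSecond(completed: Long, seconds: Double): Double =
    completed / seconds

  /** Time per request (mean), in ms, as seen by one of the concurrent clients. */
  def timePerRequestMs(concurrency: Int, completed: Long, seconds: Double): Double =
    concurrency * seconds * 1000.0 / completed

  /** Time per request (mean, across all concurrent requests), in ms. */
  def timePerRequestAcrossMs(completed: Long, seconds: Double): Double =
    seconds * 1000.0 / completed

  def main(args: Array[String]): Unit = {
    println(f"QPS: ${requestsPerSecond(12000L, 7.365)}%.2f")
    println(f"per request: ${timePerRequestMs(200, 12000L, 7.365)}%.3f ms")
    println(f"across all: ${timePerRequestAcrossMs(12000L, 7.365)}%.3f ms")
  }
}
```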
  • 68. Load testing tool written in Scala Gatling
  • 69. Gatling: The era of Apache JMeter is over. Say goodbye to building scenarios in a GUI. With Gatling, you write load scenarios in a Scala DSL.
  • 70. Days for Testing and Tuning: Watch the resources on the stressor (load-generating) side: resources of the server (or PC); the network router (CPU) can be a bottleneck. Don’t tune two or more parameters at a time. Keep a change log and the log files.
  • 71. System Operation / Maintenance Logging × Anomaly Detection × Trends Visualization
  • 72. Day-to-day logging and monitoring Application log GC log Profiler Anomaly detection from several metrics Server resource (CPU, memory, disk, etc.) abnormal response code Latency Trends visualization from several metrics System Operation / Maintenance
  • 73. JVM Settings: GC log
 Add JVM options as follows:
     -verbose:gc
     -Xloggc:<PathToTheLog>
     -XX:+PrintGCDetails
     -XX:+PrintGCDateStamps
     -XX:+UseGCLogFileRotation
     -XX:NumberOfGCLogFiles=10
     -XX:GCLogFileSize=10M
  • 74. “If someone doesn’t enable GC logging in production, I shoot them!” Real customer (p55)
  • 75. JMX (Java Management eXtensions)
 Add JVM options as follows JVM Settings<PORT NUMBER>
  • 76. JVM Settings: stdout / stderr should be redirected to a file and should NOT be thrown away to /dev/null. The result of a thread dump
 (kill -3 <PROCESS_ID>) will be written there.
  • 77. Profiler: SLF4J + Profiler. Coding example:
     import org.slf4j.profiler.Profiler
     val profiler: Profiler = new Profiler(this.getClass.getSimpleName)
     profiler.start("A")
     doA()
     profiler.start("B")
     doB()
     profiler.stop()
     logger.warn(profiler.toString)
  • 78. Profiler: SLF4J + Profiler. Output example:
     + Profiler [BASIC]
     |-- elapsed time [A] 220.487 milliseconds.
     |-- elapsed time [B] 2499.866 milliseconds.
     |-- elapsed time [OTHER] 3300.745 milliseconds.
     |-- Total [BASIC] 6022.568 milliseconds.
 Example:
 Log the result of the profiler when a timeout occurs.
  • 79. Trends Visualization: For catching trends, not for anomaly detection. Operational care is also needed so that signs of change are not overlooked. Covers not only infrastructure / application metrics but also business indicators. Who uses the console? System user, system administrator, application developer, business manager.
  • 81. Grafana (+Graphite): Graphite - Manage and visualize numeric time-series data. Grafana - Visualize Graphite data more stylishly
 (or Kibana-like).
  • 82. Ad Tech × Scala Performance Tuning ∼ Best Practice for Better Performance ∼ Scala Days 2015 San Francisco Un-conference 2015-03-19 @mogproject. Thank you very much!
  • 83. References
 "Yosuke Mizutani - Kanagawa, Japan |"
 "mog project"
 "DSS Tech Blog - Demand Side Science Inc. tech blog"
 "FunctionalNews - functional programming language news site"
 "'The Ad Technology': From the Basics of Data Marketing to the Concept of Attribution / Shoeisha new releases"
 "Opt launches the dynamic creative tool 'unis': automatically generating personalized ads to maximize advertising effectiveness | Internet advertising agency Opt"
 "The Scala Programming Language"
 "Finagle"
 "Play Framework - Build Modern & Scalable Web Apps with Java and Scala"
 "nginx"
 "Fluentd | Open Source Data Collector"
 "Java Performance Tuning (1): Rules of Java Performance Tuning (1/2) - @IT"
 "Pareto principle - Wikipedia (Japanese)"
 "Teach Yourself Programming in Ten Years"
 "The Front Line of International Networks Built by Enterprises - [4] Basics of international networks you were afraid to ask about: ITpro"
 "Coursera"
 "Earth Marathon - Wikipedia (Japanese)"
 "Hard disk drive - Wikipedia, the free encyclopedia"
 "Everything I ever learned about JVM performance tuning @twitter(Attila Szegedi).pdf"
 "C++ Coding Standards: 101 Rules, Guidelines, and Best Practices (C++ in-depth series), Japanese edition: Herb Sutter, Andrei Alexandrescu, 浜田 光之, 浜田 真理"
 "Unix philosophy - Wikipedia (Japanese)"
 "ktoso/sbt-jmh"
 "ScalaBlitz | ScalaBlitz"
 "Lightning-Fast Standard Collections With ScalaBlitz by Dmitry Petrashko"
 "mog project: Micro Benchmark in Scala - Using sbt-jmh"
 "Gatling Project, Stress Tool"
 "WEB+DB PRESS Vol.83 | 技術評論社"
 "I wrote an article about Gatling for 'Java no Koumyaku' - さにあらず"
 "Garbage Collection Tuning in the Java HotSpot™ Virtual Machine"
 "SLF4J extensions"
 "Graphite Documentation - Graphite 0.10.0 documentation"
 "Grafana - Graphite and InfluxDB Dashboard and graph composer"
 "Grafana - Grafana Play Home"
 "List of free images usable for real-estate topics"
 "Free Illustrator materials in AI / EPS format: 無料イラスト素材.com"
 "I made 'Azusa', a Keynote template that turns out looking pretty good - MEMOGRAPHIX"