HADOOP CLUSTER
PERFORMANCE
PROFILING
Ihor Bobak
Lead Software Engineer, EPAM Systems
AUGUST 27, 2015
CONTENTS
Covered topics:
• What is profiling? How do profilers work?
• What problems can affect performance?
• How to profile a distributed application?
• Gathering, storing and analysis of stack traces
• Memory analysis
• Use Case
• Alternative approaches to profiling
3
WHAT IS A PROFILER?
A profiler is a tool that shows which parts of your app run slowly.
Examples: VisualVM, YourKit
4
HOW DO PROFILERS WORK?
• Instrumenting: adding extra bytecode to your
methods for recording when they’re called and
how long they execute.
• Sampling: taking dumps of all the threads
periodically in order to understand how much
CPU time each method takes.
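To make the instrumenting approach concrete, here is a hand-written sketch of what the injected bytecode effectively does: it records how many times a piece of code was called and how long it ran in total. This is only an illustration, not how VisualVM or YourKit are implemented; the class `ManualInstrumentation` and its `timed` method are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical illustration of instrumentation done by hand.
final class ManualInstrumentation {
    // method name -> { number of calls, total elapsed nanoseconds }
    static final Map<String, long[]> STATS = new ConcurrentHashMap<>();

    // Runs the body and records one call plus its duration under the given name.
    static <T> T timed(String name, Supplier<T> body) {
        long start = System.nanoTime();
        try {
            return body.get();
        } finally {
            long elapsed = System.nanoTime() - start;
            STATS.compute(name, (key, old) -> {
                long[] stats = (old == null) ? new long[2] : old;
                stats[0]++;
                stats[1] += elapsed;
                return stats;
            });
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            timed("buildString", () -> String.valueOf(System.nanoTime()));
        }
        long[] stats = STATS.get("buildString");
        System.out.println("calls=" + stats[0] + ", total ns=" + stats[1]);
    }
}
```

A real instrumenting profiler adds the equivalent of `timed` to every method automatically, which is why its overhead grows with the number of calls, while a sampling profiler pays a fixed cost per sample.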
5
DIFFICULTIES WITH A CLUSTER
This is a typical MapReduce application running on a Hadoop cluster.
All blue boxes are separate JVM processes running on different machines.
Question: how can we profile a distributed Java app?
1. How to attach to a process running on another host?
2. How to track the appearance of new processes?
3. How to gather profiling data?
4. How to analyze this vast amount of data?
6
WHY DO WE NEED A CLUSTER PROFILER?
Answer: we need a profiler to get more performance.
The Hadoop principle is:
“If you want more performance, add more hardware”.
This is true, but it is not the whole truth.
There are also problems that affect ALL applications (both distributed and local).
7
PROBLEM 1: SUBOPTIMAL CODE
// Quicksort: sorts a[x..y] in place (both bounds inclusive).
public static void QuickSort(int[] a, int x, int y) {
    int pivot = x + (y - x) / 2;  // midpoint, written to avoid int overflow
    int apivot = a[pivot];
    int i = x;
    int j = y;
    while (i <= j) {
        while (a[i] < apivot) i++;
        while (a[j] > apivot) j--;
        if (i <= j) {
            int temp = a[i];
            a[i] = a[j];
            a[j] = temp;
            i++;
            j--;
        }
    }
    if (x < j)
        QuickSort(a, x, j);
    if (i < y)
        QuickSort(a, i, y);
}
// Brute-force sort: compares every pair of elements.
public static void StupidSort(int[] a) {
    for (int i = 0; i < a.length - 1; ++i)
        for (int j = i + 1; j < a.length; ++j)
            if (a[i] > a[j]) {
                int temp = a[i];
                a[i] = a[j];
                a[j] = temp;
            }
}
The two methods above are a simple example of different algorithms solving the same task:
brute-force (“Tupo-v-lob”) sort: O(N^2); Quicksort: O(N*log(N)) on average.
8
PROBLEM 2: BAD CODE/DATA
• Repeatedly doing the same unnecessary actions
Example: re-reading a configuration file or a database table again and again during every operation (although we could cache it in memory).
• Wrong usage of someone’s code/libraries/binaries
Example: Sqoop can import from MySQL in two modes: direct mode (using mysqldump and mysqlimport) and JDBC mode. The first one is faster.
• Usage of the wrong libraries
Example: https://powercollections.codeplex.com/workitem/16950
I found that Wintellect’s well-known OrderedSet is 3 times slower than Microsoft’s native SortedSet.
• Absence of indexes in a database
Example: “select * from fact join dim on fact.productid = dim.productid” is slow because the developers forgot to create keys/indexes.
• Bugs in well-known libraries/frameworks
Example: http://ihorbobak.com/index.php/2015/06/03/spark-sql-bad-performance/ describes a problem with an A->B->C table join when the tables are enumerated in the order A, C, B. This is handled fine by all database servers, but NOT by Spark SQL.
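The first item in the list above (re-reading the same configuration again and again) is usually the cheapest to fix. A minimal sketch, assuming the slow read can be expressed as a function from a key to a value; `CachedLookup` and the path used below are hypothetical and only stand in for a real file or table read.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical helper: keeps the result of a slow lookup in memory.
final class CachedLookup<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> slowLoader;

    CachedLookup(Function<K, V> slowLoader) {
        this.slowLoader = slowLoader;
    }

    // Runs the slow loader only the first time a key is requested.
    V get(K key) {
        return cache.computeIfAbsent(key, slowLoader);
    }

    public static void main(String[] args) {
        AtomicInteger reads = new AtomicInteger();
        CachedLookup<String, String> config = new CachedLookup<>(path -> {
            reads.incrementAndGet();      // stands in for reading a file or a table
            return "contents of " + path;
        });
        for (int i = 0; i < 1000; i++) {
            config.get("/etc/app.conf");  // hypothetical path
        }
        System.out.println("slow reads performed: " + reads.get());
    }
}
```

The trade-off is staleness: a cache like this never notices that the underlying file or table has changed, so it suits data that is fixed for the lifetime of a job.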
9
PROBLEM 3: HARDWARE TROUBLES
The two most important problems are:
• Disk problems (slow I/O speed)
• Network problems (low bandwidth, packet loss)
10
CLUSTER PROFILER ARCHITECTURE
[Diagram, left to right]
• Java process: “injected” code which does stack trace sampling
• Passing stack traces every 10 seconds through HTTP
• A set of Python/Perl scripts to get visualizations
• Visualization in the form of flame graphs
This is applicable to any Java process (mapper, reducer, etc.), and not only to Hadoop: it can be Spark RDD code, Java web app code, etc.
11
HOW DOES A JAVA AGENT WORK?
• An agent is bound to a Java process by specifying the -javaagent parameter, e.g.
java -javaagent:/path/agent.jar=parameters MainClass
or by overriding _JAVA_OPTIONS like this:
_JAVA_OPTIONS='-javaagent:/path/agent.jar=parameters'
• The agent’s jar has a manifest with
Premain-Class: namespace.TheAgentClass
• “TheAgentClass” has a premain() method that executes before your main() and does the following:
– Reads the parameters of the agent
– Constructs the profiler instances (based on the parameters)
– Creates a ScheduledExecutorService (see java.util.concurrent) that calls
scheduleAtFixedRate(worker, 0, 10, TimeUnit.SECONDS)
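The steps above can be sketched as a small agent class. This is an illustration under assumptions, not the source of the actual profiler: the `parseArgs` helper, the thread name and the worker body are hypothetical, and the parser only handles plain `key=value` pairs separated by commas.

```java
import java.lang.instrument.Instrumentation;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical agent class. A real agent would be a public class packaged in a
// jar whose manifest names it in the Premain-Class attribute.
final class TheAgentClass {

    // Simplified parser for "key1=value1,key2=value2" (assumption: values that
    // themselves contain commas are not handled).
    static Map<String, String> parseArgs(String agentArgs) {
        Map<String, String> params = new LinkedHashMap<>();
        if (agentArgs == null || agentArgs.isEmpty()) return params;
        for (String pair : agentArgs.split(",")) {
            String[] kv = pair.split("=", 2);
            params.put(kv[0], kv.length > 1 ? kv[1] : "");
        }
        return params;
    }

    // Called by the JVM before main() when the process is started with -javaagent.
    public static void premain(String agentArgs, Instrumentation inst) {
        Map<String, String> params = parseArgs(agentArgs);
        ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "profiler-worker");
                t.setDaemon(true);  // do not keep the JVM alive after main() ends
                return t;
            });
        // Placeholder worker; a real profiler would collect and report data here.
        Runnable worker = () -> System.err.println("profiler tick " + params);
        scheduler.scheduleAtFixedRate(worker, 0, 10, TimeUnit.SECONDS);
    }
}
```

Running the scheduler on a daemon thread matters for short-lived processes such as mappers and reducers: a non-daemon thread would prevent the JVM from exiting when the task finishes.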
12
HOW DOES A JAVA AGENT WORK? (CONTINUED)
The profiler thread collects stack traces 100 times per second using ThreadMXBean (a part of JMX, a technology for monitoring and managing the JVM).
public void profile() {
    profileCount++;
    try {
        for (ThreadInfo thread : getAllRunnableThreads()) {
            if (thread.getStackTrace().length > 0) {
                String traceKey = StackTraceFormatter.formatStackTrace(thread.getStackTrace());
                if (filter.includeStackTrace(traceKey))
                    traces.increment(traceKey, 1);
            }
        }
    }
    catch (OutOfMemoryError ex) {
        // ... skipping code for handling OOM (just for safety)
    }
    if (profileCount == reportingFrequency) {
        profileCount = 0;
        recordMethodCounts();
    }
}
For more information about JMX read here:
https://docs.oracle.com/javase/tutorial/jmx/index.html
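The slide's code relies on helpers it does not show. The following is a sketch of what such helpers could look like with the standard `java.lang.management` API; it is not the profiler's actual source. The class `RunnableThreads` and both method bodies are assumptions, including the choice to join frames root-first with `;`.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the helpers used by profile().
final class RunnableThreads {

    // Dumps all live threads through JMX and keeps only the RUNNABLE ones.
    static List<ThreadInfo> getAllRunnableThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        List<ThreadInfo> result = new ArrayList<>();
        // (false, false): skip locked-monitor and synchronizer details
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            if (info != null && info.getThreadState() == Thread.State.RUNNABLE) {
                result.add(info);
            }
        }
        return result;
    }

    // Formats a stack root-first, e.g. "A.run;B.work" (element 0 is the top frame).
    static String formatStackTrace(StackTraceElement[] stack) {
        StringBuilder sb = new StringBuilder();
        for (int i = stack.length - 1; i >= 0; i--) {
            if (sb.length() > 0) sb.append(';');
            sb.append(stack[i].getClassName()).append('.').append(stack[i].getMethodName());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (ThreadInfo t : getAllRunnableThreads()) {
            System.out.println(t.getThreadName() + ": " + formatStackTrace(t.getStackTrace()));
        }
    }
}
```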
13
STATSD + MY CHANGES
I modified the well-known StatsD JVM profiler: https://github.com/etsy/statsd-jvm-profiler
List of my changes:
• Added the jvmName and host tags to each stack trace;
• Optimized performance of the stack trace collection code;
• Improved stability by catching OutOfMemoryError;
• Added statistics to show how many lines and characters we pass to the backend;
• Substantially reworked influxdb_dump.py: it now extracts data into a set of distinct files, one file for each JVM, one for each host, and a total;
• Added extraction of memory information and rendering it as charts in R;
• Added call_tree.py: a script for analysis of method call trees;
• Added some helper scripts.
14
INFLUXDB
What is InfluxDB?
It is a time series, metrics, and analytics database.
Targeted at:
gathering metrics (like response times, CPU load), sensor
data, events (like exceptions) and real-time analytics.
Key Features:
• SQL-like query language;
• HTTP(S) API for data ingestion and queries;
• Built-in support for other data protocols such as
collectd;
• Has a CLI and web interface;
• Tag data for fast and efficient queries.
15
[Screenshot of the InfluxDB web interface, annotated with:]
• Measurements (analog of tables)
• Tag keys:values
• SQL-like query language
• Timestamps
• Series: measurement name + tag key-values + data values
16
Schema exploration examples:
• SHOW MEASUREMENTS
shows the list of measurements
• SHOW SERIES FROM /.*cpu.*/
shows the list of series for each measurement whose name matches the pattern /.*cpu.*/
• SHOW TAG KEYS FROM /.*heap.*/
shows the distinct tag keys of the measurements that match the pattern
• SHOW TAG VALUES FROM /.*cpu.*/ WITH KEY = jvmName
shows the distinct values of the tag key jvmName in the measurements that match the pattern
Data exploration examples:
• SELECT * FROM cpu WHERE host = 'A'
selects series for the “cpu” measurement with tag host='A'
• SELECT percentile(value, 95) FROM response_times
WHERE time > now() - 1d
GROUP BY time(1m)
shows the 95th percentile of response times over the last day in 1-minute intervals
17
FLAME GRAPHS
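[Figure: four stack samples taken at the 0th, 10th, 20th and 30th ms, drawn as columns of frames A (bottom) to D (top), and then merged into a flame graph where A and B are 4 samples wide, C is 3 wide and D is 2 wide]
Gathered stack traces:
A->B->C
A->B->C->D
A->B->C->D
A->B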
THE WIDTH OF A BAR MATTERS.
Color doesn’t matter and is selected just to distinguish bars.
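The four gathered stack traces on this slide can be turned into bar widths with very little code. The sketch below first "folds" identical stacks into `frame;frame;... count` lines, which is the input format accepted by Brendan Gregg's flamegraph.pl, and then derives the width of every bar. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper that folds sampled stacks and computes flame graph bar widths.
final class StackFolder {

    // Counts identical stacks; "A->B->C" becomes the folded key "A;B;C".
    static Map<String, Integer> fold(List<String> traces) {
        Map<String, Integer> folded = new TreeMap<>();
        for (String trace : traces) {
            folded.merge(trace.replace("->", ";"), 1, Integer::sum);
        }
        return folded;
    }

    // Width of a bar = number of samples whose stack starts with that call path.
    static Map<String, Integer> barWidths(Map<String, Integer> folded) {
        Map<String, Integer> widths = new TreeMap<>();
        for (Map.Entry<String, Integer> e : folded.entrySet()) {
            StringBuilder path = new StringBuilder();
            for (String frame : e.getKey().split(";")) {
                if (path.length() > 0) path.append(';');
                path.append(frame);
                widths.merge(path.toString(), e.getValue(), Integer::sum);
            }
        }
        return widths;
    }

    public static void main(String[] args) {
        List<String> samples = Arrays.asList("A->B->C", "A->B->C->D", "A->B->C->D", "A->B");
        Map<String, Integer> folded = fold(samples);
        folded.forEach((stack, count) -> System.out.println(stack + " " + count));
        System.out.println(barWidths(folded));
    }
}
```

For the slide's samples the widths come out as A=4, A;B=4, A;B;C=3 and A;B;C;D=2, which matches the merged picture: a wide bar means the method was on the stack in many samples.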
18
FLAME GRAPHS
Flame graphs are a visualization of profiled software, allowing the
most frequent code-paths to be identified quickly and accurately.
Invented by Brendan Gregg: http://www.brendangregg.com
19
SEQUENCE OF ACTIONS
Steps to profile a cluster:
1. Install InfluxDB on a separate machine visible to all machines of the cluster. Create a database and a user.
2. Get the agent’s jar file from my blog (or build it from sources) and put it into /var/lib on every worker node.
3. Change the configuration of the cluster: make _JAVA_OPTIONS='-javaagent…' available to all JVM processes.
4. Run your application and get the stack traces into InfluxDB. You may “switch off” the _JAVA_OPTIONS setting after this.
5. Get the SVG files (flame graphs) from InfluxDB with the help of influxdb_dump.py and flamegraph_files.sh and do the analysis.
These steps are described in detail on my blog: http://ihorbobak.com
20
LOCATION FOR _JAVA_OPTIONS
_JAVA_OPTIONS='-javaagent:/var/lib/statsd-jvm-profiler-0.8.3-SNAPSHOT.jar=server=serveraddress,port=8086,reporter=InfluxDBReporter,database=profiler,username=profiler,password=profiler,prefix=value1.value2.valueN,tagMapping=tag1,tag2,tagN'
21
USE CASE WITH A REAL CUSTOMER
The App/Inventory/Environment:
• Our customer has an app that crawls data from a set of sites, parses it and puts it into a Hadoop cluster (20 machines with 8 cores, 32GB RAM and 1TB HDD each).
• The app leverages Apache Nutch, Cloudera Hadoop distribution version 5.3, HBase, MongoDB and other technologies.
• There is a central Java web app (Java/Tomcat) that uses Nutch, which runs the MapReduce jobs.
The problem:
• The cluster crawls just 100 sites per day; the customer is asking us
“how to make it crawl 10 times more on the same hardware?”
22
FIRST FINDINGS
The first question that came to my mind: what exactly is slow?
Early on I found the answer: the slow parts are the I/O-intensive ones.
23
DISK I/O
Then I monitored I/O and ran a series of disk speed tests on the nodes.
These are the results of the IOPS benchmark https://github.com/cxcv/iops/blob/master/iops:
Cluster node:
512 B blocks: 80.9 IO/s, 40.4 KiB/s (331.3 kbit/s)
1 KiB blocks: 97.9 IO/s, 97.9 KiB/s (802.1 kbit/s)
2 KiB blocks: 83.8 IO/s, 167.5 KiB/s ( 1.4 Mbit/s)
4 KiB blocks: 72.3 IO/s, 289.2 KiB/s ( 2.4 Mbit/s)
8 KiB blocks: 69.8 IO/s, 558.7 KiB/s ( 4.6 Mbit/s)
16 KiB blocks: 69.4 IO/s, 1.1 MiB/s ( 9.1 Mbit/s)
32 KiB blocks: 58.2 IO/s, 1.8 MiB/s ( 15.3 Mbit/s)
64 KiB blocks: 54.3 IO/s, 3.4 MiB/s ( 28.5 Mbit/s)
128 KiB blocks: 45.9 IO/s, 5.7 MiB/s ( 48.1 Mbit/s)
256 KiB blocks: 38.7 IO/s, 9.7 MiB/s ( 81.1 Mbit/s)
512 KiB blocks: 29.0 IO/s, 14.5 MiB/s (121.8 Mbit/s)
1 MiB blocks: 18.3 IO/s, 18.3 MiB/s (153.2 Mbit/s)
2 MiB blocks: 10.3 IO/s, 20.7 MiB/s (173.6 Mbit/s)
4 MiB blocks: 5.7 IO/s, 22.8 MiB/s (191.7 Mbit/s)
8 MiB blocks: 4.8 IO/s, 38.8 MiB/s (325.2 Mbit/s)
16 MiB blocks: 2.0 IO/s, 32.6 MiB/s (273.8 Mbit/s)
32 MiB blocks: 0.8 IO/s, 27.0 MiB/s (226.1 Mbit/s)
My local VM:
512 B blocks: 861.1 IO/s, 430.5 KiB/s ( 3.5 Mbit/s)
1 KiB blocks: 1084.7 IO/s, 1.1 MiB/s ( 8.9 Mbit/s)
2 KiB blocks: 836.6 IO/s, 1.6 MiB/s ( 13.7 Mbit/s)
4 KiB blocks: 698.4 IO/s, 2.7 MiB/s ( 22.9 Mbit/s)
8 KiB blocks: 755.7 IO/s, 5.9 MiB/s ( 49.5 Mbit/s)
16 KiB blocks: 909.1 IO/s, 14.2 MiB/s (119.2 Mbit/s)
32 KiB blocks: 784.9 IO/s, 24.5 MiB/s (205.7 Mbit/s)
64 KiB blocks: 747.9 IO/s, 46.7 MiB/s (392.1 Mbit/s)
128 KiB blocks: 593.2 IO/s, 74.2 MiB/s (622.0 Mbit/s)
256 KiB blocks: 441.4 IO/s, 110.4 MiB/s (925.8 Mbit/s)
512 KiB blocks: 423.3 IO/s, 211.6 MiB/s ( 1.8 Gbit/s)
1 MiB blocks: 295.1 IO/s, 295.1 MiB/s ( 2.5 Gbit/s)
2 MiB blocks: 159.1 IO/s, 318.3 MiB/s ( 2.7 Gbit/s)
4 MiB blocks: 103.2 IO/s, 412.6 MiB/s ( 3.5 Gbit/s)
8 MiB blocks: 46.6 IO/s, 372.8 MiB/s ( 3.1 Gbit/s)
16 MiB blocks: 23.4 IO/s, 374.0 MiB/s ( 3.1 Gbit/s)
32 MiB blocks: 11.9 IO/s, 381.9 MiB/s ( 3.2 Gbit/s)
A cluster node is about 10 times slower than a VM running on my development workstation (the host is Core i7/32GB/1TB, the guest is a 3-core VM with 16GB RAM).
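The benchmark columns are related by simple arithmetic: throughput is IO/s multiplied by the block size, and the "10 times slower" claim is the ratio of IO/s between the two machines. A small sketch that recomputes both from the 512 B rows above; the class and method names are hypothetical.

```java
// Hypothetical helper for checking the benchmark arithmetic.
final class IopsMath {

    // Throughput in KiB/s for a given number of I/O operations per second.
    static double throughputKiB(double iops, long blockSizeBytes) {
        return iops * blockSizeBytes / 1024.0;
    }

    // How many times faster the first machine is than the second.
    static double speedRatio(double fastIops, double slowIops) {
        return fastIops / slowIops;
    }

    public static void main(String[] args) {
        // 512 B rows: 80.9 IO/s on the cluster node, 861.1 IO/s on the local VM
        System.out.printf("cluster node: %.2f KiB/s%n", throughputKiB(80.9, 512));
        System.out.printf("local VM:     %.2f KiB/s%n", throughputKiB(861.1, 512));
        System.out.printf("ratio:        %.1fx%n", speedRatio(861.1, 80.9));
    }
}
```

Repeating the ratio for other rows shows that the gap is not constant across block sizes, so "10 times" is best read as an order-of-magnitude estimate.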
24
FETCHER MAPREDUCE JOB
% of CPU time:
• 15% - HTML parsing
• 15% - Hadoop framework initialization code
• 7% - HDFS initialization code
• 22% - reducer code (BAD NEWS HERE)
• 18% - reading Hadoop XML config files
• 23% - real job
25
DRILL DOWN INTO THE REDUCER
[Flame graph of FetcherReducer.run(), annotated with:]
• org.apache.hadoop.hbase.catalog.MetaReader.fullScan()
• org.apache.avro.Schema$Parser.parse(), ending with ZipFile.read, ZipFile.getEntry(), etc.
• org.apache.hadoop.hbase.client.HConnectionManager.createConnection()
• Creating a record writer
• Parsing the Avro schema
26
DRILL DOWN INTO THE RECORD WRITER
This is Gora library code.
The most visible function calls on top are:
• java.util.zip.*
• FileInputStream*
• FileOutputStream*
27
REPEATING SLOW PARTS IN ALL JOBS
28
INEFFICIENT MEMORY MANAGEMENT
Most of the Java processes used significantly less memory than they were initially assigned.
Legend:
• init - the initial amount of memory that the JVM requests from the OS during startup;
• used - the amount of memory currently used;
• committed - the amount of memory that is guaranteed to be available for use by the Java virtual machine;
• max - the maximum amount of memory (in bytes) that can be used for memory management.
A memory allocation may fail if it attempts to increase the used memory such that used > committed, even if used <= max would still be true.
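The four values in the legend are exposed by the standard `java.lang.management.MemoryUsage` class, so they can be read from inside any JVM. A minimal sketch; the `HeapReport` class and its methods are hypothetical names, and the profiler itself may gather these numbers differently.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper that reads the heap figures described in the legend.
final class HeapReport {

    // Current heap usage of this JVM as reported by the MemoryMXBean.
    static MemoryUsage heap() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    // One-line summary; init and max are -1 when they are undefined.
    static String describe(MemoryUsage u) {
        return String.format("init=%d used=%d committed=%d max=%d (bytes)",
                u.getInit(), u.getUsed(), u.getCommitted(), u.getMax());
    }

    // Fraction of the committed memory that is actually in use.
    static double usedFraction(MemoryUsage u) {
        return u.getCommitted() == 0 ? 0.0 : (double) u.getUsed() / u.getCommitted();
    }

    public static void main(String[] args) {
        MemoryUsage u = heap();
        System.out.println(describe(u));
        System.out.printf("used/committed = %.2f%n", usedFraction(u));
    }
}
```

A consistently low used/committed fraction across mappers and reducers is the signal described on this slide: the processes were given more memory than they need.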
29
PROBLEMS AND NEXT STEPS
1) Gora + HBase
Reasons: bad code in Gora (too many full scans of the metadata table)
Actions:
• Check Gora’s configuration; dive into the code to find out why it does full scans
• Try Cassandra instead of HBase
2) Hadoop framework parts, in particular:
• HDFS initialization in MapReduce jobs (slow communication with the NameNode)
• Reading configuration files (done with the Xerces library)
Possible reasons:
• Bad I/O speed and bad network speed
• There may be XML parsing parameters for config files that we’re not aware of
Actions:
• Fix the hardware issues
• Find out why Hadoop XML config parsing may be so slow
• Check NameNode memory usage
30
ANOTHER METHOD OF GETTING STACK TRACES
Another method to get stack traces is Linux’s perf_events:
perf record -F 99 -g -p PID
perf record -e L1-dcache-load-misses -c 10000 -ag -- sleep 5
Perf monitors:
• Hardware events (e.g. level 2 cache misses);
• Software events (e.g. CPU migrations);
• Tracepoint events (e.g. filesystem I/O, TCP events).
Perf can also do:
• Sampling: collection of snapshots at some frequency (by timer);
• Dynamic tracing: instrumenting code to create events in any location (using the kprobes or uprobes frameworks).
For more details see: http://www.brendangregg.com/perf.html
31
PERF vs. JAVA AGENT
Advantages of perf over the Java agent:
• Low overhead when getting stack traces;
• Combines user calls (Java) and kernel calls in one flame graph;
• Catches all Java methods, even hot methods from which the JVM may have removed safepoint checks (a good explanation of safepoints: http://chriskirk.blogspot.com/2013/09/what-is-java-safepoint.html).
Disadvantages of perf:
• Cannot get Java stack traces out of the box (frame pointer-based stack walking has to be fixed in OpenJDK; this was done by Netflix and Twitter);
• Doesn’t see Java symbols (shows hex numbers instead; a special agent is needed to add symbols: https://github.com/jrudolph/perf-map-agent);
• Permissions on the symbol files must be configured;
• It is necessary to develop a service which will launch perf, get the stack traces and pass them to a server.
32
PERF vs. JAVA AGENT
And… it turns out that Netflix’s product is open source.
33
CREDITS
Andrew Johnson
Software Engineer at Etsy
Previously: Explorys, Inc.
https://www.linkedin.com/in/ajsquared
Brendan Gregg
Senior Performance Architect at Netflix
Previously: Joyent, Oracle, Sun Microsystems
http://www.brendangregg.com/index.html
34
BLOGS/ARTICLES
Blogs:
• My blog article
http://ihorbobak.com/index.php/2015/08/05/cluster-profiling/
• Etsy’s blog about JVM Profiler
https://codeascraft.com/2015/01/14/introducing-statsd-jvm-profiler-a-jvm-profiler-for-hadoop/
https://codeascraft.com/2015/05/12/four-months-of-statsd-jvm-profiler-a-retrospective/
• Brendan Gregg’s blog
http://www.brendangregg.com/blog/index.html
Source code:
• My modification of StatsD JVM Profiler
https://github.com/ibobak/statsd-jvm-profiler
• Original Etsy’s StatsD JVM Profiler
https://github.com/etsy/statsd-jvm-profiler
• Brendan Gregg’s FlameGraph
https://github.com/brendangregg/FlameGraph
Manuals:
• InfluxDB Docs
https://influxdb.com/docs/v0.9/introduction/overview.html
• Overview of the JMX Technology
https://docs.oracle.com/javase/tutorial/jmx/overview/index.html
• JVM Tool Interface
http://docs.oracle.com/javase/7/docs/platform/jvmti/jvmti.html#starting
35
BOOKS / VIDEOS
• Systems Performance: Enterprise and the Cloud
by Brendan Gregg
http://www.amazon.com/Systems-Performance-Enterprise-Brendan-Gregg/dp/0133390098
• Blazing Performance with Flame Graphs
by Brendan Gregg
https://www.youtube.com/watch?v=nZfNehCzGdw
• Linux profiling at Netflix
by Brendan Gregg
https://www.youtube.com/watch?v=_Ik8oiQvWgo
• Profiling Java in Production
by Kaushik Srenevasan, Twitter University
https://www.youtube.com/watch?v=Yg6_ulhwLw0
36
Contacts:
Ihor Bobak
E-mail: ibobak@gmail.com
Skype: ibobak

Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 

Hadoop cluster performance profiler

  • 1. HADOOP CLUSTER PERFORMANCE PROFILING Ihor Bobak Lead Software Engineer, EPAM Systems AUGUST 27, 2015
  • 2. CONTENTS Covered topics: • What is profiling? How do profilers work? • What problems can affect performance? • How to profile a distributed application? • Gathering, storing and analysis of stack traces • Memory analysis • Use Case • Alternative approaches to profiling
  • 3. 3 WHAT IS A PROFILER? A profiler is a tool that shows which parts of your app run slowly. Examples: VisualVM, YourKit.
  • 4. 4 HOW DO PROFILERS WORK? • Instrumenting: adding extra bytecode to your methods for recording when they’re called and how long they execute. • Sampling: taking dumps of all the threads periodically in order to understand how much CPU time each method takes.
  • 5. 5 DIFFICULTIES WITH A CLUSTER This is a typical MapReduce application running on a Hadoop cluster. All blue boxes are separate JVM processes running on different machines. Question: how can we profile a distributed Java app? 1. How to attach to a process running on another host? 2. How to track the appearance of new processes? 3. How to gather profiling data? 4. How to analyze this vast amount of data?
  • 6. 6 WHY DO WE NEED A CLUSTER PROFILER? Answer: we need a profiler to get more performance. The Hadoop principle is: “If you want more performance, add more hardware”. This is true, but it is not the whole truth. There are also problems that affect ALL applications (both distributed and local).
  • 7. 7 PROBLEM 1: NOT OPTIMAL CODE This is a simple example of different algorithms solving the same task. Quicksort, O(N*log(N)) on average:
      public static void QuickSort(int[] a, int x, int y){
          int pivot = (x+y)/2;
          int apivot = a[pivot];
          int i = x;
          int j = y;
          while (i <= j){
              while (a[i] < apivot) i++;
              while (a[j] > apivot) j--;
              if (i <= j){
                  int temp = a[i];
                  a[i] = a[j];
                  a[j] = temp;
                  i++;
                  j--;
              }
          }
          if (x < j)
              QuickSort(a, x, j);
          if (i < y)
              QuickSort(a, i, y);
      }
    “Tupo-v-lob” (brute-force) sort, O(N^2):
      public static void StupidSort(int[] a){
          for (int i = 0; i < a.length - 1; ++i)
              for (int j = i + 1; j < a.length; ++j)
                  if (a[i] > a[j]){
                      int temp = a[i];
                      a[i] = a[j];
                      a[j] = temp;
                  }
      }
  • 8. 8 PROBLEM 2: BAD CODE/DATA • Repeatedly doing the same unnecessary actions. Example: re-reading the configuration file or a database table again and again during every operation (although we could cache it in memory). • Wrong usage of someone’s code/libraries/binaries. Example: sqoop can import from MySQL in two modes: direct mode (using mysqldump and mysqlimport) and JDBC mode. The first one is faster. • Usage of the wrong libraries. Example: https://powercollections.codeplex.com/workitem/16950 I found that Wintellect’s well-known OrderedSet is 3 times slower than Microsoft’s native SortedSet. • Absence of indexes in a database. Example: “select * from fact join dim on fact.productid = dim.productid” is slow because the developers forgot to create keys/indexes. • Bugs in well-known libraries/frameworks. Example: http://ihorbobak.com/index.php/2015/06/03/spark-sql-bad-performance/ describes a problem with an A->B->C table join when the tables are enumerated in the order A, C, B. This is handled fine by all database servers, but NOT by Spark SQL.
  • 9. 9 PROBLEM 3: HARDWARE TROUBLES The two most important problems are: • Disk problems (slow I/O speed) • Network problems (low bandwidth, packet loss)
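The first bullet of the slide above (re-reading the same configuration on every operation) can be illustrated with a small in-memory cache. This is only a sketch: the class name `CachedValue` and the sample setting string are made up for illustration and do not come from the talk.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Runs an expensive loader (e.g. parsing a config file) once and reuses the result. */
final class CachedValue<T> implements Supplier<T> {
    private final Supplier<T> loader;
    private volatile T value; // stays null until the first successful load

    CachedValue(Supplier<T> loader) {
        this.loader = loader;
    }

    @Override
    public T get() {
        T v = value;
        if (v == null) {
            synchronized (this) {
                v = value;
                if (v == null) {          // double-checked: only one thread loads
                    v = loader.get();
                    value = v;
                }
            }
        }
        return v;
    }

    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        // Hypothetical "expensive" load; a real one would read a file or a table.
        CachedValue<String> config = new CachedValue<>(() -> {
            loads.incrementAndGet();
            return "batch.size=500";
        });
        for (int i = 0; i < 1000; i++) {
            config.get();
        }
        System.out.println("operations: 1000, loads: " + loads.get());
    }
}
```

In a profile, the uncached variant would show the file or table read under every operation's stack; with the cache it appears only under the first call.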
  • 10. 10 CLUSTER PROFILER ARCHITECTURE Components: a Java process with “injected” code that does stacktrace sampling; stacktraces passed every 10 seconds through HTTP; a set of Python/Perl scripts to get visualizations; visualization in the form of flame graphs. This is applicable to any Java process (mapper, reducer, etc.), and it is applicable not only to Hadoop: it can be Spark RDD code, Java web app code, etc.
  • 11. 11 HOW DOES A JAVA AGENT WORK? • The agent is bound to a Java process by specifying the -javaagent parameter, e.g. java -javaagent:/path/agent.jar=parameters MainClass or by overriding _JAVA_OPTIONS like this: _JAVA_OPTIONS='-javaagent:/path/agent.jar=parameters' • The agent’s jar has a manifest with Premain-Class: namespace.TheAgentClass • “TheAgentClass” has a premain() method that executes before your main() and does the following: – Reads the parameters of the agent – Constructs the profiler instances (based on the parameters) – Creates a ScheduledExecutorService (see java.util.concurrent) that does scheduleAtFixedRate(worker, 0, 10, TimeUnit.SECONDS)
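The premain() flow described on this slide might look roughly like the sketch below. It is not the profiler's actual agent class: the name `SketchAgent`, the `key=value,key=value` argument format, the `period` parameter and the worker that only prints a line are all assumptions made for illustration. A real agent class would normally be public, live in its own file, and be named in the jar manifest's Premain-Class attribute.

```java
import java.lang.instrument.Instrumentation;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class SketchAgent {

    /** Parses "key=value,key=value" agent arguments (hypothetical format). */
    static Map<String, String> parseArgs(String args) {
        Map<String, String> out = new LinkedHashMap<>();
        if (args == null || args.isEmpty()) {
            return out;
        }
        for (String pair : args.split(",")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                out.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return out;
    }

    /** Runs the worker at a fixed rate on a daemon thread so it never blocks JVM exit. */
    static ScheduledExecutorService schedule(Runnable worker, long periodSeconds) {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "sketch-profiler");
            t.setDaemon(true);
            return t;
        });
        pool.scheduleAtFixedRate(worker, 0, periodSeconds, TimeUnit.SECONDS);
        return pool;
    }

    /** Called by the JVM before main() when the process is started with -javaagent. */
    public static void premain(String agentArgs, Instrumentation inst) {
        Map<String, String> params = parseArgs(agentArgs);
        long period = Long.parseLong(params.getOrDefault("period", "10"));
        // Placeholder worker: a real one would flush collected stack traces to the backend.
        schedule(() -> System.err.println("profiler tick, params=" + params), period);
    }
}
```

The daemon thread is a deliberate choice here: a short-lived mapper or reducer JVM should be able to exit even while the reporting task is still scheduled.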
  • 12. 12 HOW JAVA AGENT WORKS? The profiler thread collects stacktraces 100 times per second using ThreadMXBean (a part of JMX – a technology for monitoring and managing the JVM):
      public void profile() {
          profileCount++;
          try {
              for (ThreadInfo thread : getAllRunnableThreads()) {
                  if (thread.getStackTrace().length > 0) {
                      String traceKey = StackTraceFormatter.formatStackTrace(thread.getStackTrace());
                      if (filter.includeStackTrace(traceKey))
                          traces.increment(traceKey, 1);
                  }
              }
          } catch (OutOfMemoryError ex) {
              // ... skipping code for handling OOM (just for safety)
          }
          if (profileCount == reportingFrequency) {
              profileCount = 0;
              recordMethodCounts();
          }
      }
    For more information about JMX read here: https://docs.oracle.com/javase/tutorial/jmx/index.html
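The slide's snippet relies on helpers it does not show (getAllRunnableThreads, StackTraceFormatter, the traces counter). A self-contained approximation of one sampling pass is sketched below, using only the standard ThreadMXBean API. The class name `StackSampler` and the root-first, `;`-separated key format are assumptions for illustration, not the profiler's real implementation.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

final class StackSampler {
    private final ThreadMXBean threads = ManagementFactory.getThreadMXBean();
    private final Map<String, Long> counts = new HashMap<>();

    /** Builds a root-first key; the JVM reports frames top-first, so iterate in reverse. */
    static String formatKey(StackTraceElement[] frames) {
        StringBuilder sb = new StringBuilder();
        for (int i = frames.length - 1; i >= 0; i--) {
            sb.append(frames[i].getClassName()).append('.').append(frames[i].getMethodName());
            if (i > 0) {
                sb.append(';');
            }
        }
        return sb.toString();
    }

    /** Takes one sample: counts the current stack of every RUNNABLE thread. */
    void sample() {
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            if (info == null || info.getThreadState() != Thread.State.RUNNABLE) {
                continue;
            }
            StackTraceElement[] frames = info.getStackTrace();
            if (frames.length > 0) {
                counts.merge(formatKey(frames), 1L, Long::sum);
            }
        }
    }

    Map<String, Long> counts() {
        return counts;
    }

    public static void main(String[] args) {
        StackSampler sampler = new StackSampler();
        for (int i = 0; i < 100; i++) {
            sampler.sample();
        }
        sampler.counts().forEach((stack, n) -> System.out.println(n + " " + stack));
    }
}
```

Calling sample() from a scheduled task and periodically flushing counts() to a backend gives the overall shape the slides describe.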
  • 13. 13 STATSD + MY CHANGES I modified the well-known StatsD JVM profiler https://github.com/etsy/statsd-jvm-profiler List of my changes: • Added the jvmName and host tags to each stacktrace; • Optimized performance in the stacktrace collection code; • Improved stability: added catching of OutOfMemoryError; • Added statistics to show how many lines and characters we pass to the backend; • Substantially modified influxdb_dump.py: now it extracts data into a set of distinct files, one file for each JVM, one for each host and one for the total; • Added extraction of memory information and rendering it as charts in R; • Added call_tree.py: a script for analysis of the method call trees; • Added some helper scripts.
  • 14. 14 INFLUXDB What is InfluxDB? It is a time series, metrics, and analytics database. Targeted at: gathering metrics (like response times, CPU load), sensor data, events (like exceptions) and real-time analytics. Key Features: • SQL-like query language; • HTTP(S) API for data ingestion and queries; • Built-in support for other data protocols such as collectd; • Has a CLI and web interface; • Tag data for fast and efficient queries.
  • 15. 15 [Annotated screenshot of an InfluxDB query] Measurements (analog of tables); tag keys:values; SQL-like query language; timestamps. Series: measurement name + tag key-values + data values
  • 16. 16 Schema exploration examples: • SHOW MEASUREMENTS shows the list of measurements • SHOW SERIES FROM /.*cpu.*/ shows the list of series for each measurement whose name matches the pattern /.*cpu.*/ • SHOW TAG KEYS FROM /.*heap.*/ shows the distinct tag keys of the measurements that match the pattern • SHOW TAG VALUES FROM /.*cpu.*/ WITH KEY = jvmName shows the distinct values of the jvmName tag in the measurements that match the pattern Data exploration examples: • SELECT * FROM cpu WHERE host = 'A' selects the series of the “cpu” measurement with tag host='A' • SELECT percentile(value, 95) FROM response_times WHERE time > now() - 1d GROUP BY time(1m) shows the 95th percentile of response times over the last day in 1-minute intervals
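Since InfluxDB exposes queries over HTTP, statements like the ones on this slide can be sent as a GET request to the /query endpoint with the db and q parameters. The sketch below only builds such a URL; the host name `influx-host`, port 8086 and the database name `profiler` are placeholders, not values from the talk.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class InfluxQueryUrl {

    /** Builds a /query URL with the database and statement URL-encoded. */
    static String build(String baseUrl, String database, String query) {
        try {
            return baseUrl + "/query?db=" + URLEncoder.encode(database, "UTF-8")
                    + "&q=" + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder host and database names.
        String base = "http://influx-host:8086";
        System.out.println(build(base, "profiler", "SHOW MEASUREMENTS"));
        System.out.println(build(base, "profiler",
                "SHOW TAG VALUES FROM /.*cpu.*/ WITH KEY = jvmName"));
    }
}
```

Encoding matters here because the regular-expression patterns and quotes in these statements contain characters that are not valid in a raw URL.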
  • 17. 17 FLAME GRAPHS Gathered stack traces, sampled at the 0th, 10th, 20th and 30th ms: A->B->C, A->B->C->D, A->B->C->D, A->B. [Figure: the four sampled stacks drawn as columns over time, and the same stacks merged into a flame graph in which A and B span 4 samples, C spans 3 and D spans 2.] THE WIDTH OF A BAR MATTERS. Color doesn’t matter and is selected just to distinguish bars.
  • 18. 18 FLAME GRAPHS Flame graphs are a visualization of profiled software, allowing the most frequent code-paths to be identified quickly and accurately. Invented by Brendan Gregg: http://www.brendangregg.com
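To make the "width matters" point concrete, the four sampled stacks from this slide can be folded into per-frame sample counts, which are exactly the bar widths of the flame graph. This is a minimal sketch with a hypothetical class name, not the FlameGraph scripts used in the talk.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FlameWidths {

    /** For every call-path prefix, counts how many samples pass through it. */
    static Map<String, Integer> widths(List<String> samples) {
        Map<String, Integer> out = new LinkedHashMap<>();
        for (String sample : samples) {
            StringBuilder prefix = new StringBuilder();
            for (String frame : sample.split("->")) {
                if (prefix.length() > 0) {
                    prefix.append("->");
                }
                prefix.append(frame);
                out.merge(prefix.toString(), 1, Integer::sum);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The four stacks gathered at the 0th, 10th, 20th and 30th ms.
        List<String> samples = Arrays.asList("A->B->C", "A->B->C->D", "A->B->C->D", "A->B");
        widths(samples).forEach((path, n) ->
                System.out.println(path + ": " + n + " of " + samples.size() + " samples"));
    }
}
```

For these samples, A and A->B each cover 4 samples, A->B->C covers 3 and A->B->C->D covers 2, so D's bar is half as wide as A's.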
  • 19. 19 SEQUENCE OF ACTIONS Steps to Profile a Cluster: 1. Install InfluxDB on a separate machine visible to all machines of the cluster. Create a database and a user. 2. Get the agent’s jar file from my blog (or build it from sources) and put it into /var/lib on every worker node. 3. Change the configuration of the cluster: make _JAVA_OPTIONS='-javaagent…' available to all JVM processes. 4. Run your application and get the stacktraces into InfluxDB. You may “switch off” the _JAVA_OPTIONS setting after this. 5. Get the SVG files (flame graphs) from InfluxDB with the help of influxdb_dump.py and flamegraph_files.sh and do the analysis. These steps are described in detail on my blog http://ihorbobak.com
  • 21. 21 USE CASE WITH A REAL CUSTOMER The App/Inventory/Environment: • Our customer has an app that crawls data from a set of sites, parses it and puts it into a Hadoop cluster (20 machines with 8 cores, 32GB RAM and 1TB HDD each). • The app leverages Apache Nutch, the Cloudera Hadoop distribution version 5.3, HBase, MongoDB and other technologies. • There is a central Java web app (Java/Tomcat) that uses Nutch, which runs the MapReduce jobs. The problem: • The cluster crawls just 100 sites per day; the customer is asking us “how can we make it crawl 10 times more on the same hardware?”
  • 22. 22 FIRST FINDINGS The first question that came to mind: what exactly works slowly? At the beginning I quickly found this: the slow parts are the I/O-intensive ones.
  • 23. 23 DISK I/O Then I ran I/O monitoring procedures and a series of disk speed tests on the nodes. This is the result of the IOPS benchmark https://github.com/cxcv/iops/blob/master/iops:
      Cluster node:
        512   B blocks:   80.9 IO/s,  40.4 KiB/s (331.3 kbit/s)
          1 KiB blocks:   97.9 IO/s,  97.9 KiB/s (802.1 kbit/s)
          2 KiB blocks:   83.8 IO/s, 167.5 KiB/s (1.4 Mbit/s)
          4 KiB blocks:   72.3 IO/s, 289.2 KiB/s (2.4 Mbit/s)
          8 KiB blocks:   69.8 IO/s, 558.7 KiB/s (4.6 Mbit/s)
         16 KiB blocks:   69.4 IO/s,   1.1 MiB/s (9.1 Mbit/s)
         32 KiB blocks:   58.2 IO/s,   1.8 MiB/s (15.3 Mbit/s)
         64 KiB blocks:   54.3 IO/s,   3.4 MiB/s (28.5 Mbit/s)
        128 KiB blocks:   45.9 IO/s,   5.7 MiB/s (48.1 Mbit/s)
        256 KiB blocks:   38.7 IO/s,   9.7 MiB/s (81.1 Mbit/s)
        512 KiB blocks:   29.0 IO/s,  14.5 MiB/s (121.8 Mbit/s)
          1 MiB blocks:   18.3 IO/s,  18.3 MiB/s (153.2 Mbit/s)
          2 MiB blocks:   10.3 IO/s,  20.7 MiB/s (173.6 Mbit/s)
          4 MiB blocks:    5.7 IO/s,  22.8 MiB/s (191.7 Mbit/s)
          8 MiB blocks:    4.8 IO/s,  38.8 MiB/s (325.2 Mbit/s)
         16 MiB blocks:    2.0 IO/s,  32.6 MiB/s (273.8 Mbit/s)
         32 MiB blocks:    0.8 IO/s,  27.0 MiB/s (226.1 Mbit/s)
      My local VM:
        512   B blocks:  861.1 IO/s, 430.5 KiB/s (3.5 Mbit/s)
          1 KiB blocks: 1084.7 IO/s,   1.1 MiB/s (8.9 Mbit/s)
          2 KiB blocks:  836.6 IO/s,   1.6 MiB/s (13.7 Mbit/s)
          4 KiB blocks:  698.4 IO/s,   2.7 MiB/s (22.9 Mbit/s)
          8 KiB blocks:  755.7 IO/s,   5.9 MiB/s (49.5 Mbit/s)
         16 KiB blocks:  909.1 IO/s,  14.2 MiB/s (119.2 Mbit/s)
         32 KiB blocks:  784.9 IO/s,  24.5 MiB/s (205.7 Mbit/s)
         64 KiB blocks:  747.9 IO/s,  46.7 MiB/s (392.1 Mbit/s)
        128 KiB blocks:  593.2 IO/s,  74.2 MiB/s (622.0 Mbit/s)
        256 KiB blocks:  441.4 IO/s, 110.4 MiB/s (925.8 Mbit/s)
        512 KiB blocks:  423.3 IO/s, 211.6 MiB/s (1.8 Gbit/s)
          1 MiB blocks:  295.1 IO/s, 295.1 MiB/s (2.5 Gbit/s)
          2 MiB blocks:  159.1 IO/s, 318.3 MiB/s (2.7 Gbit/s)
          4 MiB blocks:  103.2 IO/s, 412.6 MiB/s (3.5 Gbit/s)
          8 MiB blocks:   46.6 IO/s, 372.8 MiB/s (3.1 Gbit/s)
         16 MiB blocks:   23.4 IO/s, 374.0 MiB/s (3.1 Gbit/s)
         32 MiB blocks:   11.9 IO/s, 381.9 MiB/s (3.2 Gbit/s)
    The cluster node is about 10 times slower than a VM running on my development workstation (the host is Core i7/32GB/1TB, the guest is a 3-core VM with 16GB RAM).
  • 24. 24 FETCHER MAPREDUCE JOB % of CPU time: 15% - HTML parsing; 15% - Hadoop framework initialization code; 7% - HDFS initialization code; 22% - reducer code (BAD NEWS HERE); 18% - reading Hadoop XML config files; 23% - real job
  • 25. 25 DRILL DOWN INTO THE REDUCER Flame graph callouts: org.apache.hadoop.hbase.catalog.MetaReader.fullScan(); org.apache.avro.Schema$Parser.parse(), ending with ZipFile.read, ZipFile.getEntry(), etc.; org.apache.hadoop.hbase.client.HConnectionManager.createConnection(); creating a record writer; parsing Avro schema; FetcherReducer.run()
  • 26. 26 DRILL DOWN INTO THE RECORD WRITER This is Gora library code. The most visible function calls at the top are: java.util.zip.*, FileInputStream*, FileOutputStream*
  • 27. 27 REPEATING SLOW PARTS IN ALL JOBS
  • 28. 28 INEFFECTIVE MEMORY MANAGEMENT Most of the Java processes used significantly less memory than they were initially assigned. Legend: • init - the initial amount of memory that the JVM requests from the OS during startup; • used - the amount of memory currently used; • committed - the amount of memory that is guaranteed to be available for use by the JVM; • max - the maximum amount of memory (in bytes) that can be used for memory management. A memory allocation may fail if it attempts to increase the used memory such that used > committed, even if used <= max would still be true.
  • 29. 29 PROBLEMS AND NEXT STEPS 1) Gora + HBase. Reasons: bad code in Gora (too many full table scans of metadata). Actions: • check Gora’s configuration and dive into the code to find out why it does full scans; • try Cassandra instead of HBase. 2) Hadoop framework parts, in particular: • HDFS initialization in MapReduce jobs (slow communication with the NameNode); • reading configuration files (done with the Xerces library). Possible reasons: • bad I/O speed and bad network speed; • there may be some parameterization of the XML parsing of config files that we’re not aware of. Actions: • fix the hardware issues; • find out why Hadoop XML config parsing may be so slow; • check NameNode memory usage.
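The four values in the legend are the fields of java.lang.management.MemoryUsage, which any JVM can report about itself through the standard MemoryMXBean. A minimal sketch follows; the class name `HeapReport` and the `usedFraction` helper are illustrative assumptions, not the profiler's code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

final class HeapReport {

    /** Snapshot of the current JVM's heap: init, used, committed and max, in bytes. */
    static MemoryUsage heap() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    /** Fraction of the committed heap that is actually in use, between 0 and 1. */
    static double usedFraction(MemoryUsage usage) {
        return usage.getCommitted() == 0 ? 0.0 : (double) usage.getUsed() / usage.getCommitted();
    }

    public static void main(String[] args) {
        MemoryUsage u = heap();
        System.out.println("init=" + u.getInit() + " used=" + u.getUsed()
                + " committed=" + u.getCommitted() + " max=" + u.getMax());
        System.out.println("used/committed=" + usedFraction(u));
    }
}
```

A consistently low used-to-committed ratio across many task JVMs is the pattern this slide describes: memory assigned to processes that never need it.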
  • 30. 30 ANOTHER METHOD OF GETTING STACK TRACES Another method to get stack traces is Linux’s perf_events:
      perf record -F 99 -g -p PID
      perf record -e L1-dcache-load-misses -c 10000 -ag -- sleep 5
    Perf monitors: • Hardware events (e.g. level 2 cache misses); • Software events (e.g. CPU migrations); • Tracepoint events (e.g. filesystem I/O, TCP events). Perf can also do: • Sampling: collection of snapshots at some frequency (by timer); • Dynamic tracing: instrumenting code to create events in any location (using the kprobes or uprobes frameworks). For more details see: http://www.brendangregg.com/perf.html
  • 31. 31 PERF vs. JAVA AGENT Advantages of perf over the Java agent: • Low overhead when getting stack traces; • Combines user calls (Java) and kernel calls in one flame graph; • Will catch 100% of Java methods (no matter that the JVM may exclude safepoint checks from hot methods); see http://chriskirk.blogspot.com/2013/09/what-is-java-safepoint.html for a good explanation of safepoints. Disadvantages of perf: • Cannot get Java stack traces out of the box (it is necessary to fix frame pointer-based stack walking in OpenJDK, which was done by Netflix and Twitter); • Doesn’t see Java symbols (it shows hex numbers instead; a special agent is needed to add symbols: https://github.com/jrudolph/perf-map-agent ); • Permissions on the symbol files must be configured; • It is necessary to develop a service which will launch perf, get the stack traces and pass them to a server.
  • 32. 32 PERF vs. JAVA AGENT And…. it happens that Netflix’s product is open sourced…
  • 33. 33 CREDITS Andrew Johnson, Software Engineer at Etsy. Previously: Explorys, Inc. https://www.linkedin.com/in/ajsquared Brendan Gregg, Senior Performance Architect at Netflix. Previously: Joyent, Oracle, Sun Microsystems. http://www.brendangregg.com/index.html
  • 34. 34 BLOGS/ARTICLES Blogs: • My blog article http://ihorbobak.com/index.php/2015/08/05/cluster-profiling/ • Etsy’s blog about JVM Profiler https://codeascraft.com/2015/01/14/introducing-statsd-jvm-profiler-a-jvm-profiler-for-hadoop/ https://codeascraft.com/2015/05/12/four-months-of-statsd-jvm-profiler-a-retrospective/ • Brendan Gregg’s blog http://www.brendangregg.com/blog/index.html Source code: • My modification of StatsD JVM Profiler https://github.com/ibobak/statsd-jvm-profiler • Original Etsy’s StatsD JVM Profiler https://github.com/etsy/statsd-jvm-profiler • Brendan Gregg’s FlameGraph https://github.com/brendangregg/FlameGraph Manuals: • InfluxDB Docs https://influxdb.com/docs/v0.9/introduction/overview.html • Overview of the JMX Technology https://docs.oracle.com/javase/tutorial/jmx/overview/index.html • JVM Tool Interface http://docs.oracle.com/javase/7/docs/platform/jvmti/jvmti.html#starting
  • 35. 35 BOOKS / VIDEOS • Systems Performance: Enterprise and the Cloud by Brendan Gregg http://www.amazon.com/Systems-Performance-Enterprise-Brendan-Gregg/dp/0133390098 • Blazing Performance with Flame Graphs by Brendan Gregg https://www.youtube.com/watch?v=nZfNehCzGdw • Linux profiling at Netflix by Brendan Gregg https://www.youtube.com/watch?v=_Ik8oiQvWgo • Profiling Java in Production by Kaushik Srenevasan, Twitter University https://www.youtube.com/watch?v=Yg6_ulhwLw0

Editor's Notes

  1. P37 – divide into 2 slides