SlideShare a Scribd company logo
1 of 33
Download to read offline
EXTENDING SPARK WITH
JAVA AGENTS
Jaroslav Bachorik
Adrian Popescu
Meet The Speakers
● Backend engineer
● Hands-on experience with JVM
○ OpenJDK Reviewer
● Strong focus on performance
○ VisualVM Contributor
○ BTrace Maintainer and Lead
Developer
Jaroslav Adrian
● Backend engineer
● 5+ years of experience in
performance monitoring &
modeling
● Focusing on tuning and
optimization of Big Data apps
Unravel
Performance Intelligence for Big Data Stack
Caching in Spark
…
rdd.cache()
rdd.count()
…
rdd.reduceBy
Key((x,y) =>
x+y).count()
1. Evaluate
part #1
Executor
part #2
Executor
part #3
Executor
Block
Manager
Block
Manager
Block
Manager
2. Caching
part #1 part #2 part #3
3. Fetch or
re-evaluate part #1 part #2 part #3
Avoiding re-evaluation for cached RDD blocks
Spark Caching Problem
• RDD Caching
- Benefits: speedup, saved resources
- Too much caching may waste resources
• Spot the opportunity
- Which RDDs to cache (for most benefit)?
- How to prioritize when memory is scarce?
- Where to cache?
- Benefit depends on memory state
• http://unraveldata.com/to-cache-or-not-to-cache/
Challenging to decide when, what, and where to cache
Algorithm
1. Find stages that share RDDs
2. Cached RDDs: Measure benefit
Block hitRate & time saved for every Stage
3. RDDs not cached: Potential benefit
Approximate time saved & RDD storage size
4. Suggest which RDDs to cache
Need to collect additional metrics
Insufficient visibility
into Spark structures
e.g., only block info in
BlockStatusListener
Meet Java Agent
Meet Java Agent
Premain-Class: com.myorg.agents.FancyAgent
Can-Redefine-Classes: true
Can-Retransform-Classes: true
META-INF/manifest.mf
public static void premain(String args, Instrumentation inst) {
// do the magic stuff here
}
FancyAgent.java
Launcher
java -javaagent:myfancyagent.jar=<args>
● Runs JVM with an agent
● Agent jar
○ agent class
○ required manifest entries
● Arguments
○ any string understood by agent class
● Multiple agents can be specified
Meet Java Agent
Premain-Class: com.myorg.agents.FancyAgent
Can-Redefine-Classes: true
Can-Retransform-Classes: true
META-INF/manifest.mf
public static void premain(String args, Instrumentation inst) {
// do the magic stuff here
}
FancyAgent.java
Launcher
java -javaagent:myfancyagent.jar=<args>
● Executed at JVM startup
● Instrumentation provides access to JVM internals
Meet Java Agent
Premain-Class: com.myorg.agents.FancyAgent
Can-Redefine-Classes: true
Can-Retransform-Classes: true
META-INF/manifest.mf
public static void premain(String args, Instrumentation inst) {
// do the magic stuff here
}
FancyAgent.java
Launcher
java -javaagent:myfancyagent.jar=<args>
● Agent class
● Must be present in the agent jar
Meet Java Agent
Premain-Class: com.myorg.agents.FancyAgent
Can-Redefine-Classes: true
Can-Retransform-Classes: true
META-INF/manifest.mf
public static void premain(String args, Instrumentation inst) {
// do the magic stuff here
}
FancyAgent.java
Launcher
java -javaagent:myfancyagent.jar=<args>
Transform classes upon load
Meet Java Agent
Premain-Class: com.myorg.agents.FancyAgent
Can-Redefine-Classes: true
Can-Retransform-Classes: true
META-INF/manifest.mf
public static void premain(String args, Instrumentation inst) {
// do the magic stuff here
}
FancyAgent.java
Launcher
java -javaagent:myfancyagent.jar=<args>
● Agent capabilities and settings
○ retransform classes
○ redefine classes
○ native method prefix
https://docs.oracle.com/javase/7/docs/api/java/lang/instrument/package-summary.html
Class Transformers
class SpecialLibraryClass {
public void doStuff() {
System.out.println(
"Doing stuff #" + id(System.currentTimeMillis())
);
}
private int id(int input) {
return input % 42;
}
}
Private method in 3rd
party package
private method.
Patch and recompile? Reflection?
http://pixabay.com
Class Transformers
class SpecialLibraryClass {
public void doStuff() {
System.out.println("Doing stuff #" + id(System.currentTimeMillis()));
}
private int id(int input) {
AgentCallbacks.onEnter("SLC", "id", "input = " + input);
int tmp = input % 42;
AgentCallbacks.onReturn("SLC", "id", tmp);
return tmp;
}
}
http://pixabay.com
Class Transformers
java.lang.instrument.ClassTransformer
byte[] transform(ClassLoader l, String name, Class<?> cls,
ProtectionDomain pd, byte[] classfileBuffer)
● Inspect and modify class data
○ without recompilation
○ even on-the-fly
● Powerful
○ introduce callbacks
○ patch problematic code
● Not a simple task
○ class data structure
○ constant pool
○ stack frame maps
http://wikipedia.org
BTrace
● Open source dynamic instrumentation
framework for JVM
● Java annotations based DSL for class
transformations
● Optimized for high performance
● Used eg. in VisualVM
(http://visualvm.github.io)
https://github.com/btraceio/btrace
Spark Config Tuning
● Intercept SparkContext instantiation at Spark Driver
● Modify the associated SparkConf object
○ before SparkContext is fully instantiated
● Any Spark property can be changed
○ eg. “spark.executor.cores” based on system
load
● Modified SparkConf will be applied to the current job
○ and distributed to to all executors
Config Tuning Transformation
@OnMethod(clazz = "org.apache.spark.SparkContext",
method = "<init>",
location = @Location(Kind.RETURN)
)
public static void sparkContextConstructor(@Self SparkContext scObj) {
SparkConf conf = scObj.conf();
if (conf == null)
return;
int recommendedCores = AdviceSystem.getRecommendedCores(scObj);
conf.set("spark.executor.cores", recommendedCores);
}
Agents on Spark?
http://apache.org
http://pixabay.com
Agents on Spark!
● Spark is a JVM application
● Any Java agent can be used with Spark
http://apache.org
http://pixabay.com
Register JVM Agent for Spark
java -javaagent:myfancyagent.jar=<args>
spark.executor.extraJavaOptions=
”-javaagent:/opt/myagent/myagent.jar=opt.load:true”
spark.driver.extraJavaOptions=
”-javaagent:/opt/myagent/myagent.jar=track.conf:true”
Store the options in spark-defaults.conf to apply system-wide.
Spark configuration parameters
http://arthurandsage.blogspot.cz/
Let’s Put It All Together!
Identify Required Data
Block
Manager
Block
Manager
Block
Manager
1. Evaluation
2. Caching
part #1 part #2 part #3
part #1
Executor
part #2
Executor
part #3
Executor
• Access to Spark internal structures
• Collecting: RDD block usage, cache hit, potential benefit
• Run algorithm and suggest which RDDs to cache
…
rdd.cache()
rdd.count()
…
rdd.reduceBy
Key((x,y) =>
x+y).count()
…
Agents
allow:
Transform Classes
def get(blockId: BlockId): Option[BlockResult] = {
val local = getLocal(blockId)
if (local.isDefined) {
logInfo(s"Found block $blockId locally")
return local
}
val remote = getRemote(blockId)
if (remote.isDefined) {
logInfo(s"Found block $blockId remotely")
return remote
}
None
}
def get(blockId: BlockId): Option[BlockResult] = {
AgentRuntime.useBlock(blockId, currentStage)
val local = getLocal(blockId)
if (local.isDefined) {
logInfo(s"Found block $blockId locally")
return local
}
val remote = getRemote(blockId)
...
...
}
Unravel Spark Cache Events
Identify the costs of not-caching
Give very precise advice
Put It On Cluster
Agents In Cluster
http://http://spark.apache.org/
Distributing Agents for Spark
• Manually copy agents to all nodes
Make sure all nodes have the latest and same copy of agent binaries
https://www.quickmeme.com
• Put that script in CRON!
• Script that copy process
Yarn Localization Service
● Built-in resource distribution system
● Cluster resources
○ linked into executor working
directories
○ kept up-to-date
● Supported resource types
○ files
○ archives
● https://spark.apache.org/docs/1.5.2/running-on-yarn.html
Takeaways
● Sometimes even rich set of Spark metrics, logs and events is not
sufficient
● Java agents are powerful tool
○ extend application beyond the original intent
○ hot-patch problems or enhancements
○ may be removed when not needed
● Try Unravel to see the power of Java agents unleashed
Resources
● To simplify application and operations management with Spark, try out
Unravel
○ http://unraveldata.com/free-trial/
● Start with Java agents and try out BTrace
○ https://github.com/btraceio/btrace
○ contributors most sincerely welcome!
● Spark Configuration
○ http://spark.apache.org/docs/latest/configuration.html
○ https://spark.apache.org/docs/1.5.2/running-on-yarn.html
● Java Agents
○ https://docs.oracle.com/javase/7/docs/api/java/lang/instrument/pac
kage-summary.html
○ http://zeroturnaround.com/rebellabs/how-to-inspect-classes-in-your
-jvm/
Q&A
Unravel Free Trial: http://unraveldata.com/free-trial/
THANK YOU.
Jaroslav Bachorik, jaroslav@unraveldata.com
Adrian Popescu, adrian@unraveldata.com

More Related Content

What's hot

Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...Databricks
 
Memory Management in Apache Spark
Memory Management in Apache SparkMemory Management in Apache Spark
Memory Management in Apache SparkDatabricks
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeFlink Forward
 
Enabling Vectorized Engine in Apache Spark
Enabling Vectorized Engine in Apache SparkEnabling Vectorized Engine in Apache Spark
Enabling Vectorized Engine in Apache SparkKazuaki Ishizaki
 
Apache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native EraApache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native EraFlink Forward
 
A Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQLA Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQLDatabricks
 
Deep Dive into GPU Support in Apache Spark 3.x
Deep Dive into GPU Support in Apache Spark 3.xDeep Dive into GPU Support in Apache Spark 3.x
Deep Dive into GPU Support in Apache Spark 3.xDatabricks
 
The top 3 challenges running multi-tenant Flink at scale
The top 3 challenges running multi-tenant Flink at scaleThe top 3 challenges running multi-tenant Flink at scale
The top 3 challenges running multi-tenant Flink at scaleFlink Forward
 
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...Databricks
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudNoritaka Sekiyama
 
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/AvroThe Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/AvroDatabricks
 
Optimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloadsOptimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloadsdatamantra
 
Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
 Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
Spark Operator—Deploy, Manage and Monitor Spark clusters on KubernetesDatabricks
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergFlink Forward
 
Introduction to Apache Spark Developer Training
Introduction to Apache Spark Developer TrainingIntroduction to Apache Spark Developer Training
Introduction to Apache Spark Developer TrainingCloudera, Inc.
 
Building a SIMD Supported Vectorized Native Engine for Spark SQL
Building a SIMD Supported Vectorized Native Engine for Spark SQLBuilding a SIMD Supported Vectorized Native Engine for Spark SQL
Building a SIMD Supported Vectorized Native Engine for Spark SQLDatabricks
 
Running Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and PitfallsRunning Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and PitfallsDatabricks
 
Apache Spark Streaming in K8s with ArgoCD & Spark Operator
Apache Spark Streaming in K8s with ArgoCD & Spark OperatorApache Spark Streaming in K8s with ArgoCD & Spark Operator
Apache Spark Streaming in K8s with ArgoCD & Spark OperatorDatabricks
 
Using Queryable State for Fun and Profit
Using Queryable State for Fun and ProfitUsing Queryable State for Fun and Profit
Using Queryable State for Fun and ProfitFlink Forward
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkFlink Forward
 

What's hot (20)

Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
 
Memory Management in Apache Spark
Memory Management in Apache SparkMemory Management in Apache Spark
Memory Management in Apache Spark
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
 
Enabling Vectorized Engine in Apache Spark
Enabling Vectorized Engine in Apache SparkEnabling Vectorized Engine in Apache Spark
Enabling Vectorized Engine in Apache Spark
 
Apache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native EraApache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native Era
 
A Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQLA Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQL
 
Deep Dive into GPU Support in Apache Spark 3.x
Deep Dive into GPU Support in Apache Spark 3.xDeep Dive into GPU Support in Apache Spark 3.x
Deep Dive into GPU Support in Apache Spark 3.x
 
The top 3 challenges running multi-tenant Flink at scale
The top 3 challenges running multi-tenant Flink at scaleThe top 3 challenges running multi-tenant Flink at scale
The top 3 challenges running multi-tenant Flink at scale
 
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
 
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/AvroThe Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
 
Optimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloadsOptimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloads
 
Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
 Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & Iceberg
 
Introduction to Apache Spark Developer Training
Introduction to Apache Spark Developer TrainingIntroduction to Apache Spark Developer Training
Introduction to Apache Spark Developer Training
 
Building a SIMD Supported Vectorized Native Engine for Spark SQL
Building a SIMD Supported Vectorized Native Engine for Spark SQLBuilding a SIMD Supported Vectorized Native Engine for Spark SQL
Building a SIMD Supported Vectorized Native Engine for Spark SQL
 
Running Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and PitfallsRunning Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and Pitfalls
 
Apache Spark Streaming in K8s with ArgoCD & Spark Operator
Apache Spark Streaming in K8s with ArgoCD & Spark OperatorApache Spark Streaming in K8s with ArgoCD & Spark Operator
Apache Spark Streaming in K8s with ArgoCD & Spark Operator
 
Using Queryable State for Fun and Profit
Using Queryable State for Fun and ProfitUsing Queryable State for Fun and Profit
Using Queryable State for Fun and Profit
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in Flink
 

Viewers also liked

Ten tools for ten big data areas 03_Apache Spark
Ten tools for ten big data areas 03_Apache SparkTen tools for ten big data areas 03_Apache Spark
Ten tools for ten big data areas 03_Apache SparkWill Du
 
Software evolution evangelisation
Software evolution evangelisationSoftware evolution evangelisation
Software evolution evangelisationNicolas Anquetil
 
презентация брв для клиента
презентация брв для клиентапрезентация брв для клиента
презентация брв для клиентаhafiller
 
Baby shop in dubai
Baby shop in dubaiBaby shop in dubai
Baby shop in dubaiyoumahuae
 
Spor o univerzálie, anzelm z canterbury
Spor o univerzálie, anzelm z canterburySpor o univerzálie, anzelm z canterbury
Spor o univerzálie, anzelm z canterburyMartin Cigánik
 
Laurie West Resume
Laurie West ResumeLaurie West Resume
Laurie West ResumeLaurie West
 
Baby shop dubai
Baby shop dubaiBaby shop dubai
Baby shop dubaiyoumahuae
 
Suomen Asiakastieto: AREX ja laskurahoituksen uudet muodot
Suomen Asiakastieto: AREX ja laskurahoituksen uudet muodotSuomen Asiakastieto: AREX ja laskurahoituksen uudet muodot
Suomen Asiakastieto: AREX ja laskurahoituksen uudet muodotAREX
 
Web 2.0 2015
Web 2.0 2015Web 2.0 2015
Web 2.0 20158brrus
 
Relationship between Tobacco Crop Evapotranspiration and the Normalized Diffe...
Relationship between Tobacco Crop Evapotranspiration and the Normalized Diffe...Relationship between Tobacco Crop Evapotranspiration and the Normalized Diffe...
Relationship between Tobacco Crop Evapotranspiration and the Normalized Diffe...Journal of Agriculture and Crops
 
Taste and smell sci story
Taste and smell sci storyTaste and smell sci story
Taste and smell sci storydhazzard
 
Evaluation of Heterosis in Pearl Millet (Pennisetum Glaucum (L.) R. Br) for A...
Evaluation of Heterosis in Pearl Millet (Pennisetum Glaucum (L.) R. Br) for A...Evaluation of Heterosis in Pearl Millet (Pennisetum Glaucum (L.) R. Br) for A...
Evaluation of Heterosis in Pearl Millet (Pennisetum Glaucum (L.) R. Br) for A...Journal of Agriculture and Crops
 
Baby boutique dubai
Baby boutique dubaiBaby boutique dubai
Baby boutique dubaiyoumahuae
 

Viewers also liked (19)

Software Evolution
Software EvolutionSoftware Evolution
Software Evolution
 
Ten tools for ten big data areas 03_Apache Spark
Ten tools for ten big data areas 03_Apache SparkTen tools for ten big data areas 03_Apache Spark
Ten tools for ten big data areas 03_Apache Spark
 
Software Evolution
Software EvolutionSoftware Evolution
Software Evolution
 
Software evolution evangelisation
Software evolution evangelisationSoftware evolution evangelisation
Software evolution evangelisation
 
Software Evolution
Software EvolutionSoftware Evolution
Software Evolution
 
презентация брв для клиента
презентация брв для клиентапрезентация брв для клиента
презентация брв для клиента
 
Baby shop in dubai
Baby shop in dubaiBaby shop in dubai
Baby shop in dubai
 
Apresentação lie
Apresentação lieApresentação lie
Apresentação lie
 
Spor o univerzálie, anzelm z canterbury
Spor o univerzálie, anzelm z canterburySpor o univerzálie, anzelm z canterbury
Spor o univerzálie, anzelm z canterbury
 
Laurie West Resume
Laurie West ResumeLaurie West Resume
Laurie West Resume
 
Spectre
SpectreSpectre
Spectre
 
Baby shop dubai
Baby shop dubaiBaby shop dubai
Baby shop dubai
 
Suomen Asiakastieto: AREX ja laskurahoituksen uudet muodot
Suomen Asiakastieto: AREX ja laskurahoituksen uudet muodotSuomen Asiakastieto: AREX ja laskurahoituksen uudet muodot
Suomen Asiakastieto: AREX ja laskurahoituksen uudet muodot
 
Web 2.0 2015
Web 2.0 2015Web 2.0 2015
Web 2.0 2015
 
Relationship between Tobacco Crop Evapotranspiration and the Normalized Diffe...
Relationship between Tobacco Crop Evapotranspiration and the Normalized Diffe...Relationship between Tobacco Crop Evapotranspiration and the Normalized Diffe...
Relationship between Tobacco Crop Evapotranspiration and the Normalized Diffe...
 
Taste and smell sci story
Taste and smell sci storyTaste and smell sci story
Taste and smell sci story
 
Consells familia saludable
Consells familia saludableConsells familia saludable
Consells familia saludable
 
Evaluation of Heterosis in Pearl Millet (Pennisetum Glaucum (L.) R. Br) for A...
Evaluation of Heterosis in Pearl Millet (Pennisetum Glaucum (L.) R. Br) for A...Evaluation of Heterosis in Pearl Millet (Pennisetum Glaucum (L.) R. Br) for A...
Evaluation of Heterosis in Pearl Millet (Pennisetum Glaucum (L.) R. Br) for A...
 
Baby boutique dubai
Baby boutique dubaiBaby boutique dubai
Baby boutique dubai
 

Similar to EXTENDING SPARK WITH JAVA AGENTS

Audit your reactive applications
Audit your reactive applicationsAudit your reactive applications
Audit your reactive applicationsOCTO Technology
 
Spark 2.x Troubleshooting Guide
Spark 2.x Troubleshooting GuideSpark 2.x Troubleshooting Guide
Spark 2.x Troubleshooting GuideIBM
 
Java Training | Java Tutorial for Beginners | Java Programming | Java Certifi...
Java Training | Java Tutorial for Beginners | Java Programming | Java Certifi...Java Training | Java Tutorial for Beginners | Java Programming | Java Certifi...
Java Training | Java Tutorial for Beginners | Java Programming | Java Certifi...Edureka!
 
Java Performance and Using Java Flight Recorder
Java Performance and Using Java Flight RecorderJava Performance and Using Java Flight Recorder
Java Performance and Using Java Flight RecorderIsuru Perera
 
Web Sphere Problem Determination Ext
Web Sphere Problem Determination ExtWeb Sphere Problem Determination Ext
Web Sphere Problem Determination ExtRohit Kelapure
 
Java Performance and Profiling
Java Performance and ProfilingJava Performance and Profiling
Java Performance and ProfilingWSO2
 
SF JUG - GWT Can Help You Create Amazing Apps - 2009-10-13
SF JUG - GWT Can Help You Create Amazing Apps - 2009-10-13SF JUG - GWT Can Help You Create Amazing Apps - 2009-10-13
SF JUG - GWT Can Help You Create Amazing Apps - 2009-10-13Fred Sauer
 
DCSF19 Docker Containers & Java: What I Wish I Had Been Told
DCSF19 Docker Containers & Java: What I Wish I Had Been ToldDCSF19 Docker Containers & Java: What I Wish I Had Been Told
DCSF19 Docker Containers & Java: What I Wish I Had Been ToldDocker, Inc.
 
JavaOne 2016: Life after Modularity
JavaOne 2016: Life after ModularityJavaOne 2016: Life after Modularity
JavaOne 2016: Life after ModularityDanHeidinga
 
DevoxxUK: Optimizating Application Performance on Kubernetes
DevoxxUK: Optimizating Application Performance on KubernetesDevoxxUK: Optimizating Application Performance on Kubernetes
DevoxxUK: Optimizating Application Performance on KubernetesDinakar Guniguntala
 
Gradle - time for a new build
Gradle - time for a new buildGradle - time for a new build
Gradle - time for a new buildIgor Khotin
 
Gradle - time for another build
Gradle - time for another buildGradle - time for another build
Gradle - time for another buildIgor Khotin
 
What Is Java | Java Tutorial | Java Programming | Learn Java | Edureka
What Is Java | Java Tutorial | Java Programming | Learn Java | EdurekaWhat Is Java | Java Tutorial | Java Programming | Learn Java | Edureka
What Is Java | Java Tutorial | Java Programming | Learn Java | EdurekaEdureka!
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationDatabricks
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxAna-Maria Mihalceanu
 
Automated Performance Testing With J Meter And Maven
Automated  Performance  Testing With  J Meter And  MavenAutomated  Performance  Testing With  J Meter And  Maven
Automated Performance Testing With J Meter And MavenPerconaPerformance
 

Similar to EXTENDING SPARK WITH JAVA AGENTS (20)

Audit your reactive applications
Audit your reactive applicationsAudit your reactive applications
Audit your reactive applications
 
Java in flames
Java in flamesJava in flames
Java in flames
 
Spark 2.x Troubleshooting Guide
Spark 2.x Troubleshooting GuideSpark 2.x Troubleshooting Guide
Spark 2.x Troubleshooting Guide
 
Java Training | Java Tutorial for Beginners | Java Programming | Java Certifi...
Java Training | Java Tutorial for Beginners | Java Programming | Java Certifi...Java Training | Java Tutorial for Beginners | Java Programming | Java Certifi...
Java Training | Java Tutorial for Beginners | Java Programming | Java Certifi...
 
Java Performance and Using Java Flight Recorder
Java Performance and Using Java Flight RecorderJava Performance and Using Java Flight Recorder
Java Performance and Using Java Flight Recorder
 
Apache Flink Hands On
Apache Flink Hands OnApache Flink Hands On
Apache Flink Hands On
 
Web Sphere Problem Determination Ext
Web Sphere Problem Determination ExtWeb Sphere Problem Determination Ext
Web Sphere Problem Determination Ext
 
Java Performance and Profiling
Java Performance and ProfilingJava Performance and Profiling
Java Performance and Profiling
 
SF JUG - GWT Can Help You Create Amazing Apps - 2009-10-13
SF JUG - GWT Can Help You Create Amazing Apps - 2009-10-13SF JUG - GWT Can Help You Create Amazing Apps - 2009-10-13
SF JUG - GWT Can Help You Create Amazing Apps - 2009-10-13
 
JavaCro'14 - Unit testing in AngularJS – Slaven Tomac
JavaCro'14 - Unit testing in AngularJS – Slaven TomacJavaCro'14 - Unit testing in AngularJS – Slaven Tomac
JavaCro'14 - Unit testing in AngularJS – Slaven Tomac
 
DCSF19 Docker Containers & Java: What I Wish I Had Been Told
DCSF19 Docker Containers & Java: What I Wish I Had Been ToldDCSF19 Docker Containers & Java: What I Wish I Had Been Told
DCSF19 Docker Containers & Java: What I Wish I Had Been Told
 
JavaOne 2016: Life after Modularity
JavaOne 2016: Life after ModularityJavaOne 2016: Life after Modularity
JavaOne 2016: Life after Modularity
 
DevoxxUK: Optimizating Application Performance on Kubernetes
DevoxxUK: Optimizating Application Performance on KubernetesDevoxxUK: Optimizating Application Performance on Kubernetes
DevoxxUK: Optimizating Application Performance on Kubernetes
 
Gradle - time for a new build
Gradle - time for a new buildGradle - time for a new build
Gradle - time for a new build
 
Gradle - time for another build
Gradle - time for another buildGradle - time for another build
Gradle - time for another build
 
Using Maven2
Using Maven2Using Maven2
Using Maven2
 
What Is Java | Java Tutorial | Java Programming | Learn Java | Edureka
What Is Java | Java Tutorial | Java Programming | Learn Java | EdurekaWhat Is Java | Java Tutorial | Java Programming | Learn Java | Edureka
What Is Java | Java Tutorial | Java Programming | Learn Java | Edureka
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance Toolbox
 
Automated Performance Testing With J Meter And Maven
Automated  Performance  Testing With  J Meter And  MavenAutomated  Performance  Testing With  J Meter And  Maven
Automated Performance Testing With J Meter And Maven
 

EXTENDING SPARK WITH JAVA AGENTS

  • 1. EXTENDING SPARK WITH JAVA AGENTS Jaroslav Bachorik Adrian Popescu
  • 2. Meet The Speakers ● Backend engineer ● Hands-on experience with JVM ○ OpenJDK Reviewer ● Strong focus on performance ○ VisualVM Contributor ○ BTrace Maintainer and Lead Developer Jaroslav Adrian ● Backend engineer ● 5+ years of experience in performance monitoring & modeling ● Focusing on tuning and optimization of Big Data apps
  • 4. Caching in Spark … rdd.cache() rdd.count() … rdd.reduceBy Key((x,y) => x+y).count() 1. Evaluate part #1 Executor part #2 Executor part #3 Executor Block Manager Block Manager Block Manager 2. Caching part #1 part #2 part #3 3. Fetch or re-evaluate part #1 part #2 part #3 Avoiding re-evaluation for cached RDD blocks
  • 5. Spark Caching Problem • RDD Caching - Benefits: speedup, saved resources - Too much caching may waste resources • Spot the opportunity - Which RDDs to cache (for most benefit)? - How to prioritize when memory is scarce? - Where to cache? - Benefit depends on memory state • http://unraveldata.com/to-cache-or-not-to-cache/ Challenging to decide when, what, and where to cache
  • 6. Algorithm 1. Find stages that share RDDs 2. Cached RDDs: Measure benefit Block hitRate & time saved for every Stage 3. RDDs not cached: Potential benefit Approximate time saved & RDD storage size 4. Suggest which RDDs to cache Need to collect additional metrics Insufficient visibility into Spark structures e.g., only block info in BlockStatusListener
  • 8. Meet Java Agent Premain-Class: com.myorg.agents.FancyAgent Can-Redefine-Classes: true Can-Retransform-Classes: true META-INF/manifest.mf public static void premain(String args, Instrumentation inst) { // do the magic stuff here } FancyAgent.java Launcher java -javaagent:myfancyagent.jar=<args> ● Runs JVM with an agent ● Agent jar ○ agent class ○ required manifest entries ● Arguments ○ any string understood by agent class ● Multiple agents can be specified
  • 9. Meet Java Agent Premain-Class: com.myorg.agents.FancyAgent Can-Redefine-Classes: true Can-Retransform-Classes: true META-INF/manifest.mf public static void premain(String args, Instrumentation inst) { // do the magic stuff here } FancyAgent.java Launcher java -javaagent:myfancyagent.jar=<args> ● Executed at JVM startup ● Instrumentation provides access to JVM internals
  • 10. Meet Java Agent Premain-Class: com.myorg.agents.FancyAgent Can-Redefine-Classes: true Can-Retransform-Classes: true META-INF/manifest.mf public static void premain(String args, Instrumentation inst) { // do the magic stuff here } FancyAgent.java Launcher java -javaagent:myfancyagent.jar=<args> ● Agent class ● Must be present in the agent jar
  • 11. Meet Java Agent Premain-Class: com.myorg.agents.FancyAgent Can-Redefine-Classes: true Can-Retransform-Classes: true META-INF/manifest.mf public static void premain(String args, Instrumentation inst) { // do the magic stuff here } FancyAgent.java Launcher java -javaagent:myfancyagent.jar=<args> Transform classes upon load
  • 12. Meet Java Agent Premain-Class: com.myorg.agents.FancyAgent Can-Redefine-Classes: true Can-Retransform-Classes: true META-INF/manifest.mf public static void premain(String args, Instrumentation inst) { // do the magic stuff here } FancyAgent.java Launcher java -javaagent:myfancyagent.jar=<args> ● Agent capabilities and settings ○ retransform classes ○ redefine classes ○ native method prefix https://docs.oracle.com/javase/7/docs/api/java/lang/instrument/package-summary.html
  • 13. Class Transformers class SpecialLibraryClass { public void doStuff() { System.out.println( "Doing stuff #" + id(System.currentTimeMillis()) ); } private int id(int input) { return input % 42; } } Private method in 3rd party package private method. Patch and recompile? Reflection? http://pixabay.com
  • 14. Class Transformers class SpecialLibraryClass { public void doStuff() { System.out.println("Doing stuff #" + id(System.currentTimeMillis())); } private int id(int input) { AgentCallbacks.onEnter("SLC", "id", "input = " + input); int tmp = input % 42; AgentCallbacks.onReturn("SLC", "id", tmp); return tmp; } } http://pixabay.com
  • 15. Class Transformers java.lang.instrument.ClassTransformer byte[] transform(ClassLoader l, String name, Class<?> cls, ProtectionDomain pd, byte[] classfileBuffer) ● Inspect and modify class data ○ without recompilation ○ even on-the-fly ● Powerful ○ introduce callbacks ○ patch problematic code ● Not a simple task ○ class data structure ○ constant pool ○ stack frame maps http://wikipedia.org
  • 16. BTrace ● Open source dynamic instrumentation framework for JVM ● Java annotations based DSL for class transformations ● Optimized for high performance ● Used eg. in VisualVM (http://visualvm.github.io) https://github.com/btraceio/btrace
  • 17. Spark Config Tuning ● Intercept SparkContext instantiation at Spark Driver ● Modify the associated SparkConf object ○ before SparkContext is fully instantiated ● Any Spark property can be changed ○ eg. “spark.executor.cores” based on system load ● Modified SparkConf will be applied to the current job ○ and distributed to to all executors
  • 18. Config Tuning Transformation @OnMethod(clazz = "org.apache.spark.SparkContext", method = "<init>", location = @Location(Kind.RETURN) ) public static void sparkContextConstructor(@Self SparkContext scObj) { SparkConf conf = scObj.conf(); if (conf == null) return; int recommendedCores = AdviceSystem.getRecommendedCores(scObj); conf.set("spark.executor.cores", recommendedCores); }
  • 20. Agents on Spark! ● Spark is a JVM application ● Any Java agent can be used with Spark http://apache.org http://pixabay.com
  • 21. Register JVM Agent for Spark java -javaagent:myfancyagent.jar=<args> spark.executor.extraJavaOptions= ”-javaagent:/opt/myagent/myagent.jar=opt.load:true” spark.driver.extraJavaOptions= ”-javaagent:/opt/myagent/myagent.jar=track.conf:true” Store the options in spark-defaults.conf to apply system-wide. Spark configuration parameters http://arthurandsage.blogspot.cz/
  • 22. Let’s Put It All Together!
  • 23. Identify Required Data Block Manager Block Manager Block Manager 1. Evaluation 2. Caching part #1 part #2 part #3 part #1 Executor part #2 Executor part #3 Executor • Access to Spark internal structures • Collecting: RDD block usage, cache hit, potential benefit • Run algorithm and suggest which RDDs to cache … rdd.cache() rdd.count() … rdd.reduceBy Key((x,y) => x+y).count() … Agents allow:
  • 24. Transform Classes def get(blockId: BlockId): Option[BlockResult] = { val local = getLocal(blockId) if (local.isDefined) { logInfo(s"Found block $blockId locally") return local } val remote = getRemote(blockId) if (remote.isDefined) { logInfo(s"Found block $blockId remotely") return remote } None } def get(blockId: BlockId): Option[BlockResult] = { AgentRuntime.useBlock(blockId, currentStage) val local = getLocal(blockId) if (local.isDefined) { logInfo(s"Found block $blockId locally") return local } val remote = getRemote(blockId) ... ... }
  • 25. Unravel Spark Cache Events Identify the costs of not-caching Give very precise advice
  • 26. Put It On Cluster
  • 28. Distributing Agents for Spark • Manually copy agents to all nodes Make sure all nodes have the latest and same copy of agent binaries https://www.quickmeme.com • Put that script in CRON! • Script that copy process
  • 29. Yarn Localization Service ● Built-in resource distribution system ● Cluster resources ○ linked into executor working directories ○ kept up-to-date ● Supported resource types ○ files ○ archives ● https://spark.apache.org/docs/1.5.2/running-on-yarn.html
  • 30. Takeaways ● Sometimes even rich set of Spark metrics, logs and events is not sufficient ● Java agents are powerful tool ○ extend application beyond the original intent ○ hot-patch problems or enhancements ○ may be removed when not needed ● Try Unravel to see the power of Java agents unleashed
  • 31. Resources ● To simplify application and operations management with Spark, try out Unravel ○ http://unraveldata.com/free-trial/ ● Start with Java agents and try out BTrace ○ https://github.com/btraceio/btrace ○ contributors most sincerely welcome! ● Spark Configuration ○ http://spark.apache.org/docs/latest/configuration.html ○ https://spark.apache.org/docs/1.5.2/running-on-yarn.html ● Java Agents ○ https://docs.oracle.com/javase/7/docs/api/java/lang/instrument/pac kage-summary.html ○ http://zeroturnaround.com/rebellabs/how-to-inspect-classes-in-your -jvm/
  • 32. Q&A Unravel Free Trial: http://unraveldata.com/free-trial/
  • 33. THANK YOU. Jaroslav Bachorik, jaroslav@unraveldata.com Adrian Popescu, adrian@unraveldata.com