SlideShare a Scribd company logo
1 of 24
Download to read offline
Taming GC Pauses for Humongous Java
Heaps in Spark Graph Computing
Eric Kaczmarek – Senior Java Performance Architect, Intel Corporation
Liqi Yi – Senior Performance Engineer, Intel Corporation
Legal Disclaimer
• Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence
or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall
have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject
to change without notice. Do not finalize a design with this information.
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate
from published specifications. Current characterized errata are available on request.
Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration
will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase.
• For more complete information about performance and benchmark results, visit http://www.intel.com/performance.
• Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system
hardware or software design or configuration may affect actual performance.
• Results have been simulated and are provided for informational purposes only. Results were derived using simulations run on an
architecture simulator or model. Any difference in system hardware or software design or configuration may affect actual performance.
• Intel does not control or audit the design or implementation of third party benchmark data or Web sites referenced in this document.
Intel encourages all of its customers to visit the referenced Web sites or others where similar performance benchmark data are reported
and confirm whether the referenced benchmark data are accurate and reflect performance of systems available for purchase.
• Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and other countries.
• *Other names and brands may be claimed as the property of others.
• Copyright © 2015 Intel Corporation. All rights reserved.
Motivation
• Spark uses memory aggressively
• Growing memory capacity (200+GB not uncommon)
• Lengthy stop the world garbage collection burdens
applications
• Need for a more efficient GC algorithm to tame 100GB+
Java Heap
Garbage Collectors in JDK 8
• Parallel Compacting Collector
– -XX:+UseParallelOldGC
– Throughput friendly collector
• Concurrent Mark Sweep (CMS) Collector
– -XX:+UseConcMarkSweepGC
– Low latency collector for heap < 32GB
• Garbage First (G1) Collector
– -XX:+UseG1GC
– Low latency collector
G1 Collector Promises
• Concurrent marking
• Nature compaction
• Simplified tuning
• Low and predictable latency
Garbage First Collector Overview
• Heap is divided to ~2K non-contiguous regions of eden, survivor,
and old spaces, region size can be 1MB, 2MB, 4MB, 8MB, 16MB,
32MB.
• Humongous regions are old regions for large objects that are
larger than ½ of region size. Humongous objects are stored in
contiguous regions in heap.
• Number of regions in eden and survivor can be changed between
GC’s.
• Young GC: multi-threaded, low stop-the-world pauses
• Concurrent marking and clean up: mostly parallel, multi-threaded
• Mixed GC: multi-threaded, incremental GC, collects and compacts
heap partially, more expensive stop-the-world pauses
• Full GC: Serial, collects and compacts entire heap
* Picture from: http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/G1GettingStarted/index.html
Garbage First Collector Parameters
G1GC Diagnostic Flags
-XX:+PrintFlagsFinal
Prints all JVM runtime flags when JVM starts (no overhead)
-XX:+PrintGCDetails
Prints GC status (Must have for GC tuning, low overhead)
-XX:+PrintGCDateStamps
-XX:+PrintGCTimeStamps
Tracks time stamps for each GC activity (low overhead)
-XX:+PrintAdaptiveSizePolicy
Prints information every time when GC decides to change
any setting or hits certain conditions
-XX:+PrintReferenceGC
Prints GC reference processing for each GC
G1GC Performance Flags
-XX:+UseG1GC -Xms100g –Xmx100g
-XX:+ParallelRefProcEnabled
Uses Multi-threads in parallel to process references
-XX:MaxGCPauseMillis=100
Sets desired GC pause target, the default is 200ms
-XX:ParallelGCThreads=33
Sets number of Parallel GC threads, recommended
8+(#of_logical_processors-8)(5/8)
Experiment Environment
• Hardware • Software
Spark 1.0 (standalone)
CentOS 6.4 (Final)
Kernel 3.15.1
Oracle JDK 8 update 60
Scala 2.10.4
4 Worker nodes
Dual sockets Xeon E5
2697 v2 @2.7 GHz
192GB DDR3 @ 1333MHz
10Gbps Ethernet
Experiment Configuration
• Graph computing workload using Bagel
– Input: 1-degree weight table of vertices
– Output: n-degree weight table of vertices (ranked and
truncated to top-m)
• Spark job configuration
– Java serializer
– spark.storage.memoryFraction=0.4
– 128 partitions
G1 Baseline
0.01
0.1
1
10
100
1000
16:30:43 16:32:10 16:33:36 16:35:02 16:36:29 16:37:55 16:39:22 16:40:48
GCpausetime(sec)
Time stamp
G1 GC Pause time
Two FULL GCs (107s, 91s), Total execution time 470s, Total GC pause time 298.54s, 63.5% of Total
execution time spent in GC!!!
107 seconds 91 seconds
ONLY 1 GC thread running
How to Improve GC Behavior
1. Enable GC logging
2. Look for outliers (long pauses) in the logs
3. Understand the root of long GC pauses
4. Tune GC command line to avoid or alleviate the symptoms
5. Examine logs and repeat at step #2 (multiple iterations)
Log Snippet
175.556: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: allocation request failed,
allocation request: 32 bytes]
175.556: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 33554432
bytes, attempted expansion amount: 33554432 bytes]
175.556: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap already fully expanded]
175.557: [Full GC (Allocation Failure), 107.8433808 secs]
…
[Eden: 0.0B(5120.0M)->0.0B(5120.0M) Survivors: 0.0B->0.0B Heap: 97.3G(100.0G)->53.9G(100.0G)],
[Metaspace: 36255K->36255K(38912K)]
[Times: user=157.48 sys=0.16, real=107.84 secs]
Heap is over 95% full
45% is garbage
Log Snippet
175.556: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: allocation request failed,
allocation request: 32 bytes]
175.556: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 33554432
bytes, attempted expansion amount: 33554432 bytes]
175.556: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap already fully expanded]
175.557: [Full GC (Allocation Failure), 107.8433808 secs]
…
[Eden: 0.0B(5120.0M)->0.0B(5120.0M) Survivors: 0.0B->0.0B Heap: 97.3G(100.0G)->53.9G(100.0G)],
[Metaspace: 36255K->36255K(38912K)]
[Times: user=157.48 sys=0.16, real=107.84 secs]
Heap is over 95% full
45% is garbage
Garbage piled up
Why the pile up?
• 35 seconds before Full GC:
139.825: [GC concurrent-root-region-scan-start]
140.187: [GC concurrent-root-region-scan-end, 0.3621302 secs]
140.187: [GC concurrent-mark-start]
140.354: [GC pause (G1 Evacuation Pause) (young) …]
[Eden: 4480.0M(4480.0M)->0.0B(4480.0M) Survivors: 640.0M->640.0M Heap: 77.4G(100.0G)->74.4G(100.0G)]
• 10 minor GC has passed:
170.602: [GC pause (G1 Evacuation Pause) (young) 170.602: [G1Ergonomics (CSet Construction) start choosing CSet,
_pending_cards: 1321991, predicted base time: 431.40 ms, remaining time: 0.00 ms, target pause time: 100.00 ms]
…
[Eden: 0.0B(4480.0M)->0.0B(5120.0M) Survivors: 640.0M->0.0B Heap: 97.3G(100.0G)->97.3G(100.0G)]
[Times: user=157.48 sys=0.16, real=107.84 secs]
283.401: [GC concurrent-mark-abort]
Heap 77% full when
concurrent mark started
Heap filled up, but concurrent
mark not finished yet
Why the pile up?
72
74
77
79
82
85
87
89
90
92
97 97
54 54
50
55
60
65
70
75
80
85
90
95
100
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Heaputilization(GB)
GC event number
Heap Utilization after each GC event
Full GC
Concurrent mark abort
10 minor GCs (old, humongous regions not collected)
Minor GC
Concurrent mark start
Root Causes and Solutions
• Concurrent marking phase needs speed up
– Concurrent phase did not finish before Full GC happens
– Concurrent marking phase must be completed before mixed GC
could happen
• More aggressive concurrent phase
– Increase concurrent threads from 8(default) to 20
-XX:ConcGCThreads=20
G1 after Tuning
0
0.5
1
1.5
2
2.5
16:09:07 16:09:50 16:10:34 16:11:17 16:12:00 16:12:43 16:13:26
GCpausetime(sec)
Time stamp
G1 GC Pause time (sec)
No FULL GC, Total execution time 217s, Total GC pause time 58.77s, 27% of Total execution time spent in GC.
G1 after Tuning
76
78
79
80
81
82
83
85 85 85
81
80 80
81
82
60
65
70
75
80
85
90
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Heaputilization(GB)
GC event number
Heap Utilization
Concurrent mark end
remark
Cleanup
Initial mark
Concurrent mark start
7 minor GCs
Mixed GCs
Comparison Before and After
0.01
0.1
1
10
100
1000
0:00:00 0:01:26 0:02:53 0:04:19 0:05:46 0:07:12 0:08:38
GCpausetime(seconds)
relative time stamp since JVM started
G1 GC pauses before and after tuning
Pause time before (sec)
Pause time after (sec)
1,681
62,663
293
409
199,096
375,800
9,302
45,645
448
641
127,870
1
10
100
1000
10000
100000
1000000
Mixed GC Young GC Cleanup Remark Full GC Total STW Pauses
GCpausetime(seconds)
Different GC types comparison
before tuning after tuning
5.5X
73%
1.5X
1.5X
34%
Detailed comparison
Conclusion
– JDK8u60 G1 collector works well for large Java Heaps (100+
GB)
– Default G1 settings still do not offer best performance
• Requires tuning
– Tuned G1 provides low GC pauses for Spark running with large
heaps (100+GB)
Acknowledging
• Yu (Jenny) Zhang – Oracle Corp.
• Yanping Wang – Intel Corp.
• Liye Zhang - Intel Corp.
• Jie (Grace) Huang - Intel Corp.
Contacts
Eric Kaczmarek
eric.kaczmarek@intel.com
Liqi Yi
liqi.yi@intel.com
Additional resources
• https://blogs.oracle.com/g1gc/
• http://www.infoq.com/articles/G1-One-Garbage-Collector-To-Rule-Them-All
• http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/G1GettingStarted/index.html#Clear
CT
• https://blogs.oracle.com/g1gc/entry/g1gc_logs_how_to_print

More Related Content

What's hot

しばちょう先生が語る!オラクルデータベースの進化の歴史と最新技術動向#1
しばちょう先生が語る!オラクルデータベースの進化の歴史と最新技術動向#1しばちょう先生が語る!オラクルデータベースの進化の歴史と最新技術動向#1
しばちょう先生が語る!オラクルデータベースの進化の歴史と最新技術動向#1オラクルエンジニア通信
 
Oracle GoldenGate 21c New Features and Best Practices
Oracle GoldenGate 21c New Features and Best PracticesOracle GoldenGate 21c New Features and Best Practices
Oracle GoldenGate 21c New Features and Best PracticesBobby Curtis
 
High Scale Relational Storage at Salesforce Built with Apache HBase and Apach...
High Scale Relational Storage at Salesforce Built with Apache HBase and Apach...High Scale Relational Storage at Salesforce Built with Apache HBase and Apach...
High Scale Relational Storage at Salesforce Built with Apache HBase and Apach...Salesforce Engineering
 
Running Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and PitfallsRunning Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and PitfallsDatabricks
 
Lessons from managing a Pulsar cluster (Nutanix)
Lessons from managing a Pulsar cluster (Nutanix)Lessons from managing a Pulsar cluster (Nutanix)
Lessons from managing a Pulsar cluster (Nutanix)StreamNative
 
Oracle Database Availability & Scalability Across Versions & Editions
Oracle Database Availability & Scalability Across Versions & EditionsOracle Database Availability & Scalability Across Versions & Editions
Oracle Database Availability & Scalability Across Versions & EditionsMarkus Michalewicz
 
大規模データ活用向けストレージレイヤソフトのこれまでとこれから(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)
大規模データ活用向けストレージレイヤソフトのこれまでとこれから(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)大規模データ活用向けストレージレイヤソフトのこれまでとこれから(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)
大規模データ活用向けストレージレイヤソフトのこれまでとこれから(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)NTT DATA Technology & Innovation
 
Exadata X3 in action: Measuring Smart Scan efficiency with AWR
Exadata X3 in action:  Measuring Smart Scan efficiency with AWRExadata X3 in action:  Measuring Smart Scan efficiency with AWR
Exadata X3 in action: Measuring Smart Scan efficiency with AWRFranck Pachot
 
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergData Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergAnant Corporation
 
Physical Plans in Spark SQL
Physical Plans in Spark SQLPhysical Plans in Spark SQL
Physical Plans in Spark SQLDatabricks
 
Best Practices for the Most Impactful Oracle Database 18c and 19c Features
Best Practices for the Most Impactful Oracle Database 18c and 19c FeaturesBest Practices for the Most Impactful Oracle Database 18c and 19c Features
Best Practices for the Most Impactful Oracle Database 18c and 19c FeaturesMarkus Michalewicz
 
Oracle Goldengate for Big Data - LendingClub Implementation
Oracle Goldengate for Big Data - LendingClub ImplementationOracle Goldengate for Big Data - LendingClub Implementation
Oracle Goldengate for Big Data - LendingClub ImplementationVengata Guruswamy
 
Achieving a 50% Reduction in Cross-AZ Network Costs from Kafka (Uday Sagar Si...
Achieving a 50% Reduction in Cross-AZ Network Costs from Kafka (Uday Sagar Si...Achieving a 50% Reduction in Cross-AZ Network Costs from Kafka (Uday Sagar Si...
Achieving a 50% Reduction in Cross-AZ Network Costs from Kafka (Uday Sagar Si...confluent
 
Simplifying Change Data Capture using Databricks Delta
Simplifying Change Data Capture using Databricks DeltaSimplifying Change Data Capture using Databricks Delta
Simplifying Change Data Capture using Databricks DeltaDatabricks
 
Spark SQL Beyond Official Documentation
Spark SQL Beyond Official DocumentationSpark SQL Beyond Official Documentation
Spark SQL Beyond Official DocumentationDatabricks
 
Greenplum versus redshift and actian vectorwise comparison
Greenplum versus redshift and actian vectorwise comparisonGreenplum versus redshift and actian vectorwise comparison
Greenplum versus redshift and actian vectorwise comparisonDr. Syed Hassan Amin
 

What's hot (20)

しばちょう先生が語る!オラクルデータベースの進化の歴史と最新技術動向#1
しばちょう先生が語る!オラクルデータベースの進化の歴史と最新技術動向#1しばちょう先生が語る!オラクルデータベースの進化の歴史と最新技術動向#1
しばちょう先生が語る!オラクルデータベースの進化の歴史と最新技術動向#1
 
Oracle GoldenGate 21c New Features and Best Practices
Oracle GoldenGate 21c New Features and Best PracticesOracle GoldenGate 21c New Features and Best Practices
Oracle GoldenGate 21c New Features and Best Practices
 
High Scale Relational Storage at Salesforce Built with Apache HBase and Apach...
High Scale Relational Storage at Salesforce Built with Apache HBase and Apach...High Scale Relational Storage at Salesforce Built with Apache HBase and Apach...
High Scale Relational Storage at Salesforce Built with Apache HBase and Apach...
 
Running Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and PitfallsRunning Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and Pitfalls
 
OCIコンテナサービス関連の技術詳細
OCIコンテナサービス関連の技術詳細OCIコンテナサービス関連の技術詳細
OCIコンテナサービス関連の技術詳細
 
Lessons from managing a Pulsar cluster (Nutanix)
Lessons from managing a Pulsar cluster (Nutanix)Lessons from managing a Pulsar cluster (Nutanix)
Lessons from managing a Pulsar cluster (Nutanix)
 
Oracle Database Availability & Scalability Across Versions & Editions
Oracle Database Availability & Scalability Across Versions & EditionsOracle Database Availability & Scalability Across Versions & Editions
Oracle Database Availability & Scalability Across Versions & Editions
 
大規模データ活用向けストレージレイヤソフトのこれまでとこれから(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)
大規模データ活用向けストレージレイヤソフトのこれまでとこれから(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)大規模データ活用向けストレージレイヤソフトのこれまでとこれから(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)
大規模データ活用向けストレージレイヤソフトのこれまでとこれから(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)
 
Exadata X3 in action: Measuring Smart Scan efficiency with AWR
Exadata X3 in action:  Measuring Smart Scan efficiency with AWRExadata X3 in action:  Measuring Smart Scan efficiency with AWR
Exadata X3 in action: Measuring Smart Scan efficiency with AWR
 
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergData Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
 
Physical Plans in Spark SQL
Physical Plans in Spark SQLPhysical Plans in Spark SQL
Physical Plans in Spark SQL
 
Best Practices for the Most Impactful Oracle Database 18c and 19c Features
Best Practices for the Most Impactful Oracle Database 18c and 19c FeaturesBest Practices for the Most Impactful Oracle Database 18c and 19c Features
Best Practices for the Most Impactful Oracle Database 18c and 19c Features
 
Oracle Goldengate for Big Data - LendingClub Implementation
Oracle Goldengate for Big Data - LendingClub ImplementationOracle Goldengate for Big Data - LendingClub Implementation
Oracle Goldengate for Big Data - LendingClub Implementation
 
Achieving a 50% Reduction in Cross-AZ Network Costs from Kafka (Uday Sagar Si...
Achieving a 50% Reduction in Cross-AZ Network Costs from Kafka (Uday Sagar Si...Achieving a 50% Reduction in Cross-AZ Network Costs from Kafka (Uday Sagar Si...
Achieving a 50% Reduction in Cross-AZ Network Costs from Kafka (Uday Sagar Si...
 
Simplifying Change Data Capture using Databricks Delta
Simplifying Change Data Capture using Databricks DeltaSimplifying Change Data Capture using Databricks Delta
Simplifying Change Data Capture using Databricks Delta
 
Oracle GoldenGate入門
Oracle GoldenGate入門Oracle GoldenGate入門
Oracle GoldenGate入門
 
Oracle GoldenGate導入Tips
Oracle GoldenGate導入TipsOracle GoldenGate導入Tips
Oracle GoldenGate導入Tips
 
Spark SQL Beyond Official Documentation
Spark SQL Beyond Official DocumentationSpark SQL Beyond Official Documentation
Spark SQL Beyond Official Documentation
 
Greenplum versus redshift and actian vectorwise comparison
Greenplum versus redshift and actian vectorwise comparisonGreenplum versus redshift and actian vectorwise comparison
Greenplum versus redshift and actian vectorwise comparison
 
Oracle GoldenGate アーキテクチャと基本機能
Oracle GoldenGate アーキテクチャと基本機能Oracle GoldenGate アーキテクチャと基本機能
Oracle GoldenGate アーキテクチャと基本機能
 

Viewers also liked

Spark 2.x Troubleshooting Guide
Spark 2.x Troubleshooting GuideSpark 2.x Troubleshooting Guide
Spark 2.x Troubleshooting GuideIBM
 
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBaseHBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBaseHBaseCon
 
G1 Garbage Collector: Details and Tuning
G1 Garbage Collector: Details and TuningG1 Garbage Collector: Details and Tuning
G1 Garbage Collector: Details and TuningSimone Bordet
 
HBase Sizing Guide
HBase Sizing GuideHBase Sizing Guide
HBase Sizing Guidelarsgeorge
 
楽して JVM を学びたい #jjug
楽して JVM を学びたい #jjug楽して JVM を学びたい #jjug
楽して JVM を学びたい #jjugYuji Kubota
 
JVM のいろはにほ #javajo
JVM のいろはにほ #javajoJVM のいろはにほ #javajo
JVM のいろはにほ #javajoYuji Kubota
 
java.lang.OutOfMemoryError #渋谷java
java.lang.OutOfMemoryError #渋谷javajava.lang.OutOfMemoryError #渋谷java
java.lang.OutOfMemoryError #渋谷javaYuji Kubota
 
Handling Data Skew Adaptively In Spark Using Dynamic Repartitioning
Handling Data Skew Adaptively In Spark Using Dynamic RepartitioningHandling Data Skew Adaptively In Spark Using Dynamic Repartitioning
Handling Data Skew Adaptively In Spark Using Dynamic RepartitioningSpark Summit
 
Concurrent Mark-Sweep Garbage Collection #jjug_ccc
Concurrent Mark-Sweep Garbage Collection #jjug_cccConcurrent Mark-Sweep Garbage Collection #jjug_ccc
Concurrent Mark-Sweep Garbage Collection #jjug_cccYuji Kubota
 
Garbage First Garbage Collection (G1 GC) #jjug_ccc #ccc_cd6
Garbage First Garbage Collection (G1 GC) #jjug_ccc #ccc_cd6Garbage First Garbage Collection (G1 GC) #jjug_ccc #ccc_cd6
Garbage First Garbage Collection (G1 GC) #jjug_ccc #ccc_cd6Yuji Kubota
 
Realtime Analytical Query Processing and Predictive Model Building on High Di...
Realtime Analytical Query Processing and Predictive Model Building on High Di...Realtime Analytical Query Processing and Predictive Model Building on High Di...
Realtime Analytical Query Processing and Predictive Model Building on High Di...Spark Summit
 
Understanding Memory Management In Spark For Fun And Profit
Understanding Memory Management In Spark For Fun And ProfitUnderstanding Memory Management In Spark For Fun And Profit
Understanding Memory Management In Spark For Fun And ProfitSpark Summit
 
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsTop 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsCloudera, Inc.
 
Scaling Instagram
Scaling InstagramScaling Instagram
Scaling Instagramiammutex
 

Viewers also liked (14)

Spark 2.x Troubleshooting Guide
Spark 2.x Troubleshooting GuideSpark 2.x Troubleshooting Guide
Spark 2.x Troubleshooting Guide
 
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBaseHBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
 
G1 Garbage Collector: Details and Tuning
G1 Garbage Collector: Details and TuningG1 Garbage Collector: Details and Tuning
G1 Garbage Collector: Details and Tuning
 
HBase Sizing Guide
HBase Sizing GuideHBase Sizing Guide
HBase Sizing Guide
 
楽して JVM を学びたい #jjug
楽して JVM を学びたい #jjug楽して JVM を学びたい #jjug
楽して JVM を学びたい #jjug
 
JVM のいろはにほ #javajo
JVM のいろはにほ #javajoJVM のいろはにほ #javajo
JVM のいろはにほ #javajo
 
java.lang.OutOfMemoryError #渋谷java
java.lang.OutOfMemoryError #渋谷javajava.lang.OutOfMemoryError #渋谷java
java.lang.OutOfMemoryError #渋谷java
 
Handling Data Skew Adaptively In Spark Using Dynamic Repartitioning
Handling Data Skew Adaptively In Spark Using Dynamic RepartitioningHandling Data Skew Adaptively In Spark Using Dynamic Repartitioning
Handling Data Skew Adaptively In Spark Using Dynamic Repartitioning
 
Concurrent Mark-Sweep Garbage Collection #jjug_ccc
Concurrent Mark-Sweep Garbage Collection #jjug_cccConcurrent Mark-Sweep Garbage Collection #jjug_ccc
Concurrent Mark-Sweep Garbage Collection #jjug_ccc
 
Garbage First Garbage Collection (G1 GC) #jjug_ccc #ccc_cd6
Garbage First Garbage Collection (G1 GC) #jjug_ccc #ccc_cd6Garbage First Garbage Collection (G1 GC) #jjug_ccc #ccc_cd6
Garbage First Garbage Collection (G1 GC) #jjug_ccc #ccc_cd6
 
Realtime Analytical Query Processing and Predictive Model Building on High Di...
Realtime Analytical Query Processing and Predictive Model Building on High Di...Realtime Analytical Query Processing and Predictive Model Building on High Di...
Realtime Analytical Query Processing and Predictive Model Building on High Di...
 
Understanding Memory Management In Spark For Fun And Profit
Understanding Memory Management In Spark For Fun And ProfitUnderstanding Memory Management In Spark For Fun And Profit
Understanding Memory Management In Spark For Fun And Profit
 
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsTop 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
 
Scaling Instagram
Scaling InstagramScaling Instagram
Scaling Instagram
 

Similar to Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kaczmarek and Liqi Yi, Intel)

this-is-garbage-talk-2022.pptx
this-is-garbage-talk-2022.pptxthis-is-garbage-talk-2022.pptx
this-is-garbage-talk-2022.pptxTier1 app
 
Hadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GC
Hadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GCHadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GC
Hadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GCErik Krogen
 
Performance tuning jvm
Performance tuning jvmPerformance tuning jvm
Performance tuning jvmPrem Kuppumani
 
An Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai Yu
An Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai YuAn Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai Yu
An Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai YuDatabricks
 
millions-gc-jax-2022.pptx
millions-gc-jax-2022.pptxmillions-gc-jax-2022.pptx
millions-gc-jax-2022.pptxTier1 app
 
Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Adva...
Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Adva...Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Adva...
Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Adva...Monica Beckwith
 
Adaptive Query Execution: Speeding Up Spark SQL at Runtime
Adaptive Query Execution: Speeding Up Spark SQL at RuntimeAdaptive Query Execution: Speeding Up Spark SQL at Runtime
Adaptive Query Execution: Speeding Up Spark SQL at RuntimeDatabricks
 
淺談 Java GC 原理、調教和 新發展
淺談 Java GC 原理、調教和新發展淺談 Java GC 原理、調教和新發展
淺談 Java GC 原理、調教和 新發展Leon Chen
 
GC Tuning Confessions Of A Performance Engineer
GC Tuning Confessions Of A Performance EngineerGC Tuning Confessions Of A Performance Engineer
GC Tuning Confessions Of A Performance EngineerMonica Beckwith
 
Am I reading GC logs Correctly?
Am I reading GC logs Correctly?Am I reading GC logs Correctly?
Am I reading GC logs Correctly?Tier1 App
 
Troubleshooting Memory Problems in Java Applications
Troubleshooting Memory Problems in Java ApplicationsTroubleshooting Memory Problems in Java Applications
Troubleshooting Memory Problems in Java ApplicationsPoonam Bajaj Parhar
 
Gopher in performance_tales_ms_go_cracow
Gopher in performance_tales_ms_go_cracowGopher in performance_tales_ms_go_cracow
Gopher in performance_tales_ms_go_cracowMateuszSzczyrzyca
 
GC Tuning Confessions Of A Performance Engineer - Improved :)
GC Tuning Confessions Of A Performance Engineer - Improved :)GC Tuning Confessions Of A Performance Engineer - Improved :)
GC Tuning Confessions Of A Performance Engineer - Improved :)Monica Beckwith
 
Secrets of Performance Tuning Java on Kubernetes
Secrets of Performance Tuning Java on KubernetesSecrets of Performance Tuning Java on Kubernetes
Secrets of Performance Tuning Java on KubernetesBruno Borges
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scalethelabdude
 
Webcast - Making kubernetes production ready
Webcast - Making kubernetes production readyWebcast - Making kubernetes production ready
Webcast - Making kubernetes production readyApplatix
 
Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...
Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...
Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...DVClub
 
Taming Java Garbage Collector
Taming Java Garbage CollectorTaming Java Garbage Collector
Taming Java Garbage CollectorDaya Atapattu
 

Similar to Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kaczmarek and Liqi Yi, Intel) (20)

ZGC-SnowOne.pdf
ZGC-SnowOne.pdfZGC-SnowOne.pdf
ZGC-SnowOne.pdf
 
this-is-garbage-talk-2022.pptx
this-is-garbage-talk-2022.pptxthis-is-garbage-talk-2022.pptx
this-is-garbage-talk-2022.pptx
 
Hadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GC
Hadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GCHadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GC
Hadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GC
 
Performance tuning jvm
Performance tuning jvmPerformance tuning jvm
Performance tuning jvm
 
An Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai Yu
An Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai YuAn Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai Yu
An Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai Yu
 
millions-gc-jax-2022.pptx
millions-gc-jax-2022.pptxmillions-gc-jax-2022.pptx
millions-gc-jax-2022.pptx
 
Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Adva...
Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Adva...Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Adva...
Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Adva...
 
Adaptive Query Execution: Speeding Up Spark SQL at Runtime
Adaptive Query Execution: Speeding Up Spark SQL at RuntimeAdaptive Query Execution: Speeding Up Spark SQL at Runtime
Adaptive Query Execution: Speeding Up Spark SQL at Runtime
 
淺談 Java GC 原理、調教和 新發展
淺談 Java GC 原理、調教和新發展淺談 Java GC 原理、調教和新發展
淺談 Java GC 原理、調教和 新發展
 
GC Tuning Confessions Of A Performance Engineer
GC Tuning Confessions Of A Performance EngineerGC Tuning Confessions Of A Performance Engineer
GC Tuning Confessions Of A Performance Engineer
 
Am I reading GC logs Correctly?
Am I reading GC logs Correctly?Am I reading GC logs Correctly?
Am I reading GC logs Correctly?
 
Troubleshooting Memory Problems in Java Applications
Troubleshooting Memory Problems in Java ApplicationsTroubleshooting Memory Problems in Java Applications
Troubleshooting Memory Problems in Java Applications
 
Gopher in performance_tales_ms_go_cracow
Gopher in performance_tales_ms_go_cracowGopher in performance_tales_ms_go_cracow
Gopher in performance_tales_ms_go_cracow
 
Java Performance Tuning
Java Performance TuningJava Performance Tuning
Java Performance Tuning
 
GC Tuning Confessions Of A Performance Engineer - Improved :)
GC Tuning Confessions Of A Performance Engineer - Improved :)GC Tuning Confessions Of A Performance Engineer - Improved :)
GC Tuning Confessions Of A Performance Engineer - Improved :)
 
Secrets of Performance Tuning Java on Kubernetes
Secrets of Performance Tuning Java on KubernetesSecrets of Performance Tuning Java on Kubernetes
Secrets of Performance Tuning Java on Kubernetes
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scale
 
Webcast - Making kubernetes production ready
Webcast - Making kubernetes production readyWebcast - Making kubernetes production ready
Webcast - Making kubernetes production ready
 
Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...
Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...
Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...
 
Taming Java Garbage Collector
Taming Java Garbage CollectorTaming Java Garbage Collector
Taming Java Garbage Collector
 

More from Spark Summit

FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang Spark Summit
 
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...Spark Summit
 
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang WuApache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang WuSpark Summit
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya RaghavendraSpark Summit
 
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...Spark Summit
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...Spark Summit
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingSpark Summit
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingSpark Summit
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...Spark Summit
 
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakNext CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakSpark Summit
 
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimPowering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimSpark Summit
 
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraImproving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraSpark Summit
 
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Spark Summit
 
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...Spark Summit
 
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spark Summit
 
Goal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovGoal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovSpark Summit
 
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Spark Summit
 
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkGetting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkSpark Summit
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Spark Summit
 
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...Spark Summit
 

More from Spark Summit (20)

FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
 
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
 
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang WuApache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
 
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
 
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakNext CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub Wozniak
 
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimPowering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin Kim
 
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraImproving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
 
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
 
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
 
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
 
Goal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovGoal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim Simeonov
 
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
 
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkGetting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir Volk
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
 
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
 

Recently uploaded

Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data VisualizationKianJazayeri1
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaManalVerma4
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxHimangsuNath
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelBoston Institute of Analytics
 
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdfWorld Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdfsimulationsindia
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataTecnoIncentive
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxTasha Penwell
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBoston Institute of Analytics
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...Jack Cole
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 

Recently uploaded (20)

Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in India
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptx
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
 
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdfWorld Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded data
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 

Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kaczmarek and Liqi Yi, Intel)

  • 1. Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing Eric Kaczmarek – Senior Java Performance Architect, Intel Corporation Liqi Yi – Senior Performance Engineer, Intel Corporation
  • 2. Legal Disclaimer • Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information. The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. • For more complete information about performance and benchmark results, visit http://www.intel.com/performance. • Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance. • Results have been simulated and are provided for informational purposes only. Results were derived using simulations run on an architecture simulator or model. Any difference in system hardware or software design or configuration may affect actual performance. • Intel does not control or audit the design or implementation of third party benchmark data or Web sites referenced in this document. Intel encourages all of its customers to visit the referenced Web sites or others where similar performance benchmark data are reported and confirm whether the referenced benchmark data are accurate and reflect performance of systems available for purchase. • Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and other countries. • *Other names and brands may be claimed as the property of others. • Copyright © 2015 Intel Corporation. All rights reserved.
  • 3. Motivation • Spark uses memory aggressively • Growing memory capacity (200+GB not uncommon) • Lengthy stop the world garbage collection burdens applications • Need for a more efficient GC algorithm to tame 100GB+ Java Heap
  • 4. Garbage Collectors in JDK 8 • Parallel Compacting Collector – -XX:+UseParallelOldGC – Throughput friendly collector • Concurrent Mark Sweep (CMS) Collector – -XX:+UseConcMarkSweepGC – Low latency collector for heap < 32GB • Garbage First (G1) Collector – -XX:+UseG1GC – Low latency collector
  • 5. G1 Collector Promises • Concurrent marking • Nature compaction • Simplified tuning • Low and predictable latency
  • 6. Garbage First Collector Overview • Heap is divided to ~2K non-contiguous regions of eden, survivor, and old spaces, region size can be 1MB, 2MB, 4MB, 8MB, 16MB, 32MB. • Humongous regions are old regions for large objects that are larger than ½ of region size. Humongous objects are stored in contiguous regions in heap. • Number of regions in eden and survivor can be changed between GC’s. • Young GC: multi-threaded, low stop-the-world pauses • Concurrent marking and clean up: mostly parallel, multi-threaded • Mixed GC: multi-threaded, incremental GC, collects and compacts heap partially, more expensive stop-the-world pauses • Full GC: Serial, collects and compacts entire heap * Picture from: http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/G1GettingStarted/index.html
  • 7. Garbage First Collector Parameters G1GC Diagnostic Flags -XX:+PrintFlagsFinal Prints all JVM runtime flags when JVM starts (no overhead) -XX:+PrintGCDetails Prints GC status (Must have for GC tuning, low overhead) -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps Tracks time stamps for each GC activity (low overhead) -XX:+PrintAdaptiveSizePolicy Prints information every time when GC decides to change any setting or hits certain conditions -XX:+PrintReferenceGC Prints GC reference processing for each GC G1GC Performance Flags -XX:+UseG1GC -Xms100g –Xmx100g -XX:+ParallelRefProcEnabled Uses Multi-threads in parallel to process references -XX:MaxGCPauseMillis=100 Sets desired GC pause target, the default is 200ms -XX:ParallelGCThreads=33 Sets number of Parallel GC threads, recommended 8+(#of_logical_processors-8)(5/8)
  • 8. Experiment Environment • Hardware • Software Spark 1.0 (standalone) CentOS 6.4 (Final) Kernel 3.15.1 Oracle JDK 8 update 60 Scala 2.10.4 4 Worker nodes Dual sockets Xeon E5 2697 v2 @2.7 GHz 192GB DDR3 @ 1333MHz 10Gbps Ethernet
  • 9. Experiment Configuration • Graph computing workload using Bagel – Input: 1-degree weight table of vertices – Output: n-degree weight table of vertices (ranked and truncated to top-m) • Spark job configuration – Java serializer – spark.storage.memoryFraction=0.4 – 128 partitions
  • 10. G1 Baseline 0.01 0.1 1 10 100 1000 16:30:43 16:32:10 16:33:36 16:35:02 16:36:29 16:37:55 16:39:22 16:40:48 GCpausetime(sec) Time stamp G1 GC Pause time Two FULL GCs (107s, 91s), Total execution time 470s, Total GC pause time 298.54s, 63.5% of Total execution time spent in GC!!! 107 seconds 91 seconds ONLY 1 GC thread running
  • 11. How to Improve GC Behavior 1. Enable GC logging 2. Look for outliers (long pauses) in the logs 3. Understand the root of long GC pauses 4. Tune GC command line to avoid or alleviate the symptoms 5. Examine logs and repeat at step #2 (multiple iterations)
  • 12. Log Snippet 175.556: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: allocation request failed, allocation request: 32 bytes] 175.556: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 33554432 bytes, attempted expansion amount: 33554432 bytes] 175.556: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap already fully expanded] 175.557: [Full GC (Allocation Failure), 107.8433808 secs] … [Eden: 0.0B(5120.0M)->0.0B(5120.0M) Survivors: 0.0B->0.0B Heap: 97.3G(100.0G)->53.9G(100.0G)], [Metaspace: 36255K->36255K(38912K)] [Times: user=157.48 sys=0.16, real=107.84 secs] Heap is over 95% full 45% is garbage
  • 13. Log Snippet 175.556: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: allocation request failed, allocation request: 32 bytes] 175.556: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 33554432 bytes, attempted expansion amount: 33554432 bytes] 175.556: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap already fully expanded] 175.557: [Full GC (Allocation Failure), 107.8433808 secs] … [Eden: 0.0B(5120.0M)->0.0B(5120.0M) Survivors: 0.0B->0.0B Heap: 97.3G(100.0G)->53.9G(100.0G)], [Metaspace: 36255K->36255K(38912K)] [Times: user=157.48 sys=0.16, real=107.84 secs] Heap is over 95% full 45% is garbage Garbage piled up
  • 14. Why the pile up? • 35 seconds before Full GC: 139.825: [GC concurrent-root-region-scan-start] 140.187: [GC concurrent-root-region-scan-end, 0.3621302 secs] 140.187: [GC concurrent-mark-start] 140.354: [GC pause (G1 Evacuation Pause) (young) …] [Eden: 4480.0M(4480.0M)->0.0B(4480.0M) Survivors: 640.0M->640.0M Heap: 77.4G(100.0G)->74.4G(100.0G)] • 10 minor GC has passed: 170.602: [GC pause (G1 Evacuation Pause) (young) 170.602: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 1321991, predicted base time: 431.40 ms, remaining time: 0.00 ms, target pause time: 100.00 ms] … [Eden: 0.0B(4480.0M)->0.0B(5120.0M) Survivors: 640.0M->0.0B Heap: 97.3G(100.0G)->97.3G(100.0G)] [Times: user=157.48 sys=0.16, real=107.84 secs] 283.401: [GC concurrent-mark-abort] Heap 77% full when concurrent mark started Heap filled up, but concurrent mark not finished yet
  • 15. Why the pile up? 72 74 77 79 82 85 87 89 90 92 97 97 54 54 50 55 60 65 70 75 80 85 90 95 100 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Heaputilization(GB) GC event number Heap Utilization after each GC event Full GC Concurrent mark abort 10 minor GCs (old, humongous regions not collected) Minor GC Concurrent mark start
  • 16. Root Causes and Solutions • Concurrent marking phase needs speed up – Concurrent phase did not finish before Full GC happens – Concurrent marking phase must be completed before mixed GC could happen • More aggressive concurrent phase – Increase concurrent threads from 8(default) to 20 -XX:ConcGCThreads=20
  • 17. G1 after Tuning 0 0.5 1 1.5 2 2.5 16:09:07 16:09:50 16:10:34 16:11:17 16:12:00 16:12:43 16:13:26 GCpausetime(sec) Time stamp G1 GC Pause time (sec) No FULL GC, Total execution time 217s, Total GC pause time 58.77s, 27% of Total execution time spent in GC.
  • 18. G1 after Tuning 76 78 79 80 81 82 83 85 85 85 81 80 80 81 82 60 65 70 75 80 85 90 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Heaputilization(GB) GC event number Heap Utilization Concurrent mark end remark Cleanup Initial mark Concurrent mark start 7 minor GCs Mixed GCs
  • 19. Comparison Before and After 0.01 0.1 1 10 100 1000 0:00:00 0:01:26 0:02:53 0:04:19 0:05:46 0:07:12 0:08:38 GCpausetime(seconds) relative time stamp since JVM started G1 GC pauses before and after tuning Pause time before (sec) Pause time after (sec)
  • 20. 1,681 62,663 293 409 199,096 375,800 9,302 45,645 448 641 127,870 1 10 100 1000 10000 100000 1000000 Mixed GC Young GC Cleanup Remark Full GC Total STW Pauses GCpausetime(seconds) Different GC types comparison before tuning after tuning 5.5X 73% 1.5X 1.5X 34% Detailed comparison
  • 21. Conclusion – JDK8u60 G1 collector works well for large Java Heaps (100+ GB) – Default G1 settings still do not offer best performance • Requires tuning – Tuned G1 provides low GC pauses for Spark running with large heaps (100+GB)
  • 22. Acknowledging • Yu (Jenny) Zhang – Oracle Corp. • Yanping Wang – Intel Corp. • Liye Zhang - Intel Corp. • Jie (Grace) Huang - Intel Corp.
  • 24. Additional resources • https://blogs.oracle.com/g1gc/ • http://www.infoq.com/articles/G1-One-Garbage-Collector-To-Rule-Them-All • http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/G1GettingStarted/index.html#Clear CT • https://blogs.oracle.com/g1gc/entry/g1gc_logs_how_to_print