WIFI SSID:SparkAISummit | Password: UnifiedAnalytics
Cheng Xu, Intel
Piotr Balcer, Intel
Accelerate your Spark
with Intel Optane DC
Persistent Memory
#UnifiedAnalytics #SparkAISummit
Notices and Disclaimers
© 2018 Intel Corporation. Intel, the Intel logo, 3D XPoint, Optane, Xeon, Xeon logos, and Intel Optane logo are trademarks of Intel Corporation in the U.S. and/or other countries.
All products, computer systems, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.
No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.
The cost reduction scenarios described are intended to enable you to get a better understanding of how the purchase of a given Intel based product, combined with a number of situation-specific
variables, might affect future costs and savings. Circumstances will vary and there may be unaccounted-for costs related to the use and deployment of a given product. Nothing in this document
should be interpreted as either a promise of or contract for a given level of costs or cost reduction.
The benchmark results reported above may need to be revised as additional testing is conducted. The results depend on the specific platform configurations and workloads utilized in the testing, and
may not be applicable to any particular user’s components, computer system or workloads. The results are not necessarily representative of other benchmarks and other benchmark results may show
greater or lesser impact from mitigations.
Results have been estimated based on tests conducted on pre-production systems, and provided to you for informational purposes. Any differences in your system hardware, software or
configuration may affect your actual performance. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors.
Performance results are based on testing as of 03-14-2019 and may not reflect all publicly available security updates. See configuration disclosure for details. No product can be absolutely secure.
Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause
the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when
combined with other products. For more information go to www.intel.com/benchmarks.
Intel processors of the same SKU may vary in frequency or power as a result of natural variability in the production process.
Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations
include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not
manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are
reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice
Revision #20110804.
Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of
information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.
*Other names and brands may be claimed as the property of others.
About Us
Intel SSP/DAT – Data Analytics Technology
● Active open source contributions: about 30 committers on our team, with
significant contributions to key big data projects including Spark, Hadoop,
HBase, Hive...
● A global engineering team (China, US, India); the majority of the team is based in Shanghai
● Mission: to fully optimize analytics solutions on IA and make Intel the best and
easiest platform for big data and analytics customers
Intel NVMS – Non-Volatile Memory Solutions Group
• Tasked with preparing the ground for Intel® Optane™ DC Persistent Memory
• Enabling new and existing software through Persistent Memory Development Kit
(PMDK)
• pmem.io
Some Challenges in Spark: Memory
● Out-of-memory errors
● Large spills
Re-architecting the Memory/Storage Hierarchy
Intel® Optane™ DC Persistent Memory - Product Overview
(Optane™-based memory module for the data center)
Attaches to the integrated memory controller (IMC) of a Cascade Lake server, alongside DRAM DIMMs.
● DIMM capacity: 128, 256, 512GB
● Speed: 2666 MT/s
● Capacity per CPU: 3TB (not including DRAM)
● DDR4 electrical & physical interface
● Close to DRAM latency
● Cache-line size access
* DIMM population shown as an example only.
Persistent Memory Operating Modes
Memory Mode & App Direct
Intel® Optane™ DC Persistent Memory - Operational Modes
App Direct Mode:
● Persistent
● Significantly faster storage
● High availability / less downtime
Memory Mode:
● High capacity
● Affordable
● Ease of adoption†
† Note that a BIOS update will be required before using Intel persistent memory.
Intel® Optane™ DC Persistent Memory - Support for a Breadth of Applications
● Memory Mode: affordable memory capacity for many applications. The application sees a volatile memory pool built from Optane persistent memory, with DRAM acting as a cache.
● App Direct Mode: persistent performance & maximum capacity. The application works directly against Optane persistent memory, alongside DRAM.
App Direct Mode Options
Persistent memory can be consumed through three paths, from legacy block storage to direct load/store access:

Legacy storage APIs (block atomicity, via BTT):
• No code changes required
• Operates in blocks like SSD/HDD: traditional read/write through the standard file API, a file system, and the generic NVDIMM driver
• Works with existing file systems
• Atomicity at block level; block size configurable (4K, 512B*)
• NVDIMM driver required (supported starting with kernel 4.2); can be configured as a boot device
• Higher endurance than enterprise SSDs
• High-performance block storage: low latency, higher bandwidth, high IOPS
* 512B requires Linux

Storage APIs with DAX (App Direct):
• Code changes may be required*
• Standard file API on a pmem-aware file system ("DAX"): mmap hands out MMU mappings, and loads/stores bypass the file system page cache
• Requires a DAX-enabled file system (XFS, EXT4, NTFS)
• No kernel code or interrupts in the data path: the fastest I/O path possible
* Code changes are required for load/store direct access if the application does not already support this.

Raw device access (DevDAX) is also available, typically consumed through PMDK.
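The DAX path can be exercised end to end with nothing more than POSIX calls; a minimal sketch (the file name and size are arbitrary, and on a non-DAX mount the same code simply goes through the page cache; PMDK would replace msync() with user-space cache-flush instructions):

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map a file from a (pmem-aware, DAX-mounted) file system so that the
 * application can write to it with ordinary CPU store instructions.
 * On a real DAX mount the stores go straight to the persistent media
 * (after a flush); on a normal file system the identical code works
 * through the page cache. */
static char *map_pmem_file(const char *path, size_t len, int *fd_out)
{
    int fd = open(path, O_CREAT | O_RDWR, 0644);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return NULL;
    }
    char *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) {
        close(fd);
        return NULL;
    }
    *fd_out = fd;
    return base;
}
```

On an ext4/XFS file system mounted with `-o dax`, the returned pointer gives direct load/store access to the media; flushing (msync here, CLWB + fence in PMDK) is what makes a store durable.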
Spark DCPMM Optimization Overview
Spark's compute layer sits on an I/O layer: the OAP cache and RDD cache live in DRAM and DCPMM as tiered storage in front of low-bandwidth I/O storage (e.g. HDD, S3). Input data is served from the cache on a hit and fetched from storage on a miss.
RDD cache DCPMM optimization (against DRAM):
● Reduced DRAM footprint
● Higher capacity to cache more data
● Status: to be added
OAP cache DCPMM optimization (against DRAM):
● High capacity, fewer cache misses
● Avoids costly disk reads
● Status: to be added
Two workloads are used for evaluation: 9 I/O-intensive workloads (from decision support queries) and a K-means workload.
Use Case 1: Spark SQL
OAP (Optimized Analytics Package) I/O Cache
Goal
• An I/O cache is critical for I/O-intensive workloads, especially in low-bandwidth environments (e.g. cloud, on-prem HDD-based systems)
• Make full use of DCPMM's advantages to speed up Spark SQL:
• High capacity and high throughput
• Low latency and reduced DRAM footprint
• Better performance per TCO
Feature
• Fine-grained columnar cache (e.g. per column chunk for Parquet)
• Cache-aware scheduler (data source V2 API via preferred locations)
• Self-managed DCPMM pool (no extra GC overhead)
• Easy to use (easily turned on/off), transparent to users (no changes to their queries)
https://github.com/Intel-bigdata/OAP
Spark DCPMM Full Software Stack
● Spark SQL: unmodified SQL, unchanged Spark
● OAP: provides the scheduler and fine-grained cache based on the Data Source API
● vmemcache: native library to access DCPMM; abstracts away hardware details and the cache implementation
● Pmem-aware file system: exposes persistent memory as memory-mapped files (DAX)
● Intel® Optane™ DC Persistent Memory Module
Deployment Overview
SQL enters through a Spark gateway (e.g. ThriftServer, Spark shell). The cache-aware scheduler assigns tasks to Spark executors on each server. The cached data source (v1/v2) serves cache hits from Intel Optane DC Persistent Memory through the native library (vmemcache), and falls back to local storage (HDD) on a cache miss.
Cache Design - Problem statement
• Local LRU cache
• Support for the large capacities available with persistent memory (many terabytes per server)
• Lightweight, efficient and embeddable
• In-memory
• Scalable
Cache Design - Existing solutions
• In-memory databases tend to rely on malloc() in some form for allocating memory for entries
– Which means allocating anonymous memory
• Persistent memory is exposed by the operating system through normal file system operations
– Which means allocating byte-addressable PMEM needs to use file memory mapping (fsdax)
• We could modify the allocator of an existing in-memory database and be done with it, right? :)
Cache Design - Fragmentation
• Manual dynamic memory management à la dlmalloc/jemalloc/tcmalloc/palloc causes fragmentation
• Applications with substantial expected runtime durations need a way to combat this problem:
– Compacting GC (Java, .NET)
– Defragmentation (Redis, Apache Ignite)
– Slab allocation (memcached)
• Especially so if there's substantial expected variety in allocated sizes
Cache Design - Extent allocation
• If fragmentation is unavoidable, and defragmentation/compaction is CPU- and memory-bandwidth-intensive, let's embrace it!
• Extent allocation is usually only done in relatively large blocks, in file systems.
• But on PMEM, we are no longer restricted by large transfer units (sectors, pages, etc.)
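To make the "embrace fragmentation" idea concrete, here is a toy fixed-extent allocator: every value occupies a whole number of equal-sized extents, so freeing any value returns extents that any future value can reuse, and external fragmentation cannot accumulate the way it does with variable-size malloc blocks. Sizes and names are illustrative, not libvmemcache's actual allocator.

```c
#include <stddef.h>

#define EXTENT_SIZE 256   /* bytes per extent (illustrative) */
#define POOL_EXTENTS 64   /* extents in the whole pool */

/* A pool is just a stack of free extent indices. */
struct extent_pool {
    int free_list[POOL_EXTENTS];
    int free_top;
};

static void pool_init(struct extent_pool *p)
{
    for (int i = 0; i < POOL_EXTENTS; i++)
        p->free_list[i] = i;
    p->free_top = POOL_EXTENTS;
}

/* Number of extents needed to hold a value of `size` bytes. */
static int extents_needed(size_t size)
{
    return (int)((size + EXTENT_SIZE - 1) / EXTENT_SIZE);
}

/* Claim `n` extents into `out`; returns n on success, 0 if the pool
 * cannot satisfy the request (a real cache would evict and retry). */
static int pool_alloc(struct extent_pool *p, int n, int *out)
{
    if (n > p->free_top)
        return 0;
    for (int i = 0; i < n; i++)
        out[i] = p->free_list[--p->free_top];
    return n;
}

/* Return `n` extents to the pool; any future value can reuse them. */
static void pool_free(struct extent_pool *p, const int *ext, int n)
{
    for (int i = 0; i < n; i++)
        p->free_list[p->free_top++] = ext[i];
}
```

The price is internal fragmentation (up to one extent of waste per value), which is why a configurable, small extent size matters on PMEM where tiny transfer units are cheap.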
Cache Design - Scalable replacement policy
• The performance of libvmemcache was bottlenecked by a naïve implementation of LRU based on a doubly-linked list.
• With hundreds of threads, most of the time of any request was spent waiting on a list lock…
• Locking per-node doesn't solve the problem…
Cache Design - Buffered LRU
• Our solution was quite simple: we added a wait-free ring buffer which buffers the list-move operations.
• This way, the list only needs to get locked during eviction or when the ring buffer is full.
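A sketch of the buffered-LRU idea: readers enqueue "move node to front" requests into a shared ring instead of taking the list lock, and the list is locked only to drain the ring (on eviction or when it fills). Names and structure are illustrative, not libvmemcache's internals, and the publish/consume handshake for a claimed-but-not-yet-written slot is elided for brevity.

```c
#include <stdatomic.h>
#include <stddef.h>

#define RING_CAP 1024

struct lru_ring {
    _Atomic size_t head;      /* next slot to claim (lock-free) */
    size_t tail;              /* consumed only under the list lock */
    void *slots[RING_CAP];    /* nodes awaiting a move-to-front */
};

/* Claim a slot with a CAS loop and record the node. Returns 0 on
 * success, -1 when the ring is full (the caller then takes the list
 * lock and drains). A production version also needs a per-slot
 * "ready" flag so a drain cannot observe a claimed, unwritten slot. */
static int ring_offer(struct lru_ring *r, void *node)
{
    size_t h = atomic_load(&r->head);
    do {
        if (h - r->tail >= RING_CAP)
            return -1;
    } while (!atomic_compare_exchange_weak(&r->head, &h, h + 1));
    r->slots[h % RING_CAP] = node;
    return 0;
}

/* Called with the list lock held: apply the buffered moves in order.
 * `move_front` may be NULL in tests; returns how many were drained. */
static size_t ring_drain(struct lru_ring *r, void (*move_front)(void *))
{
    size_t n = 0;
    size_t h = atomic_load(&r->head);
    while (r->tail < h) {
        if (move_front)
            move_front(r->slots[r->tail % RING_CAP]);
        r->tail++;
        n++;
    }
    return n;
}
```

The win is that the hot path (a cache hit) costs one CAS instead of a contended list lock; the lock is amortized over up to RING_CAP moves.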
Cache Design - Lightweight, embeddable, in-memory caching
https://github.com/pmem/vmemcache

#include <string.h>
#include <libvmemcache.h>

/* Create a cache backed by a file on a DAX-mounted file system
   (API as presented in this talk). */
VMEMcache *cache = vmemcache_new("/tmp", VMEMCACHE_MIN_POOL,
                                 VMEMCACHE_MIN_EXTENT,
                                 VMEMCACHE_REPLACEMENT_LRU);

const char *key = "foo";
vmemcache_put(cache, key, strlen(key), "bar", sizeof("bar"));

char buf[128];
ssize_t len = vmemcache_get(cache, key, strlen(key),
                            buf, sizeof(buf), 0, NULL);

vmemcache_delete(cache);

libvmemcache has normal get/put APIs, an optional replacement policy, and a configurable extent size.
It works with terabyte-sized in-memory workloads without breaking a sweat, with very high space utilization.
It also works on regular DRAM.
Cache Design – Status and Fine Grain
A Parquet file consists of a footer and row groups (RowGroup #1 … #N), each made up of column chunks (Column Chunk #1 … #N). OAP caches at column-chunk granularity (fine-grained cache) in vmemcache, with a Guava-based cache layer for cache status tracking and reporting.
Experiments and Configurations
Hardware:
● DRAM: 192GB (12x 16GB DDR4) on the DCPMM system; 768GB (24x 32GB DDR4) on the DRAM system
● Intel Optane DC Persistent Memory: 1TB (QS: 8x 128GB); N/A on the DRAM system
● DCPMM mode: App Direct (memkind)
● SSD: N/A
● CPU: 2x Cascade Lake 8280M (2 threads/core, 28 cores/socket, 2 sockets; CPU max 4000 MHz, min 1000 MHz; L1d 32K, L1i 32K, L2 1024K, L3 39424K)
● OS: 4.20.4-200.fc29.x86_64 (BKC: WW06'19, BIOS: SE5C620.86B.0D.01.0134.100420181737)
Software:
● OAP: 1TB DCPMM-based OAP cache vs 610GB DRAM-based OAP cache
● Hadoop: 8x HDD (ST1000NX0313); 1-replica uncompressed & plain-encoded data on Hadoop
● Spark: 1 driver (5GB) + 2 executors (62 cores, 74GB), spark.sql.oap.rowgroup.size=1MB
● JDK: Oracle JDK 1.8.0_161
Workload:
● Data scale: 2TB, 3TB, 4TB
● Decision-making queries: 9 I/O-intensive queries
● Multi-tenancy: 9 threads (fair scheduled)
Test Scenario 1: Both Fit in DRAM and DCPMM - 2TB
With a 2TB data scale, both the DCPMM-based and the DRAM-based OAP cache can hold the full dataset (613GB). DCPMM is 24.6% (= 1 - 100.1/132.81) lower in performance than DRAM as measured on the 9 I/O-intensive decision-making queries.
* The gap can change if I/O is improved (e.g. compression & encoding enabled) or some workaround code is fixed (e.g. NUMA scheduler).
On-premise performance. Benchmark disclaimer: see Notices and Disclaimers. Configurations: see the Experiments and Configurations slide. Test by Intel on 24/02/2019.
Test Scenario 2: Fits in DCPMM but Not in DRAM - 3TB
With a 3TB data scale, only the DCPMM-based OAP cache can hold the full dataset (920GB). DCPMM shows an 8X* performance gain over DRAM as measured on the 9 I/O-intensive decision-making queries.
* The performance gain can be much lower if I/O is improved (e.g. compression & encoding enabled) or some workaround code is fixed (e.g. NUMA scheduler).
On-premise performance. Benchmark disclaimer: see Notices and Disclaimers. Configurations: see the Experiments and Configurations slide. Test by Intel on 24/02/2019.
Performance Analysis - System Metrics
● Input data is fully cached in DCPMM, but only partially in DRAM
● DCPMM reaches up to 18GB/s bandwidth, while the DRAM case is bounded by disk I/O at only about 250MB/s ~ 450MB/s
● The OAP cache does not apply to shuffle (intermediate) data, so that extra I/O pressure falls on the DRAM case
● DCPMM reads about 27.5GB from disk while DRAM reads about 395.6GB
In the DCPMM case the OAP cache avoids disk reads entirely (disk writes occur only for shuffle); in the DRAM case disk reads from shuffle data and cache misses happen from time to time.
Test Scenario 3: Neither DRAM nor DCPMM Fits - 4TB
With a 4TB data scale, neither the DCPMM-based nor the DRAM-based OAP cache can hold the full dataset (1226.7GB). DCPMM shows a 1.66X* (= 2252.80/1353.67) performance gain over DRAM as measured on the 9 I/O-intensive decision-making queries.
* The performance gain can be much lower if I/O is improved (e.g. compression & encoding enabled) or some workaround code is fixed (e.g. NUMA scheduler).
On-premise performance. Benchmark disclaimer: see Notices and Disclaimers. Configurations: see the Experiments and Configurations slide. Test by Intel on 24/02/2019.
Use Case 2: Machine Learning
Spark K-means
• Spark storage levels
– A few storage levels serve different purposes, spanning memory and disk
– Off-heap memory is supported to avoid GC overhead in Java
– The large-capacity storage level (disk) is widely used for iterative computation workloads (e.g. K-means, GraphX) by caching hot data in the storage level
• DCPMM storage level
– Extends the memory layer: objects stay on the DRAM heap, byte buffers live off-heap and in DCPMM
– Uses a pmem library to access DCPMM, avoiding the decompression and deserialization overhead of reading spilled data back from disk
– The large capacity and high I/O performance of DCPMM give better performance than the tiered solution (the original DRAM + disk solution)
K-means in Spark
Spark executor processes load the input partitions M0 … Mn from HDFS, normalize the records, and cache them (cache #1 … #N, DRAM + disk) on each compute node.
● Load: the data is loaded from HDFS into DRAM (and DCPMM / SSD if DRAM cannot hold all of it); after that, the data is never changed and is iterated over repeatedly in the initialization and train stages.
● Initialization: 2 iterations over all of the cached data pick the K initial centroids C1, C2, C3, … CK in a random mode.
● Train (N iterations, with a sync between them):
– Step 1: for each record in the cache, sum the vectors with the same closest centroid.
– Step 2: sum the vectors according to the centroids, find the new centroids globally, and update the centroids.
K-means Basic Data Flow
● Load: load data from HDFS into memory on each compute node; spill over to local storage if needed.
● Initialization: compute using the initial centroids, based on data in memory or local storage.
● Train: compute iterations based on local data, producing the K centroids C1, C2, C3, … CK.
(Each compute node has its own CPU, memory and local storage; HDFS is the shared storage.)
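The two training steps above amount to one Lloyd iteration; a minimal single-node sketch in C (flat arrays, illustrative names; Spark distributes step 1 across executors and reduces the per-partition sums in step 2):

```c
#include <float.h>
#include <stddef.h>
#include <string.h>

/* Index of the centroid closest (squared Euclidean) to point p. */
static size_t closest(const double *p, const double *centroids,
                      size_t k, size_t dim)
{
    size_t best = 0;
    double bestd = DBL_MAX;
    for (size_t j = 0; j < k; j++) {
        double d = 0;
        for (size_t t = 0; t < dim; t++) {
            double diff = p[t] - centroids[j * dim + t];
            d += diff * diff;
        }
        if (d < bestd) { bestd = d; best = j; }
    }
    return best;
}

/* One iteration over `n` cached records of dimension `dim`:
 * step 1 sums each record into its closest centroid's accumulator,
 * step 2 replaces the centroids with the per-cluster means. */
static void kmeans_step(const double *pts, size_t n, size_t dim,
                        double *centroids, size_t k)
{
    double sums[k * dim];
    size_t counts[k];
    memset(sums, 0, sizeof sums);
    memset(counts, 0, sizeof counts);

    for (size_t i = 0; i < n; i++) {          /* step 1: local sums */
        size_t j = closest(&pts[i * dim], centroids, k, dim);
        counts[j]++;
        for (size_t t = 0; t < dim; t++)
            sums[j * dim + t] += pts[i * dim + t];
    }
    for (size_t j = 0; j < k; j++)            /* step 2: new centroids */
        if (counts[j])
            for (size_t t = 0; t < dim; t++)
                centroids[j * dim + t] = sums[j * dim + t] / counts[j];
}
```

Because the cached points are read-only across iterations, the access pattern is exactly the repeated sequential read that a DCPMM-backed storage level serves well.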
Experiments and Configurations
Hardware:
● DRAM: #1 AD 192GB (12x 16GB DDR4 2666); #2 DRAM 768GB (24x 32GB DDR4 2666)
● Intel Optane DC Persistent Memory: 1TB (8x 128GB QS); N/A for #2
● DCPMM mode: App Direct
● CPU: Intel(R) Xeon(R) Platinum 8280L @ 2.70GHz
● Disk drives: 8x Intel DC S4510 SSD (1.8T) + 2x P4500 (1.8T)
Software:
● CPU/memory allocation: driver (5GB + 10 cores) + 2 executors (80GB + 32 cores)
● Stack: Spark (2.2.2-snapshot: bb3e6ed9216f98f3a3b96c8c52f20042d65e2181, modified to support DCPMM) + Hadoop (2.7.5)
● OS & kernel & BIOS: Linux-4.18.8-100.fc27.x86_64 (Fedora 27); BIOS: SE5C620.86B.0D.01.0299.122420180146
● Mitigation variants: 1, 2, 3, 3a, 4, L1TF
● Cache spill disk: N/A for #1; 8x SSD for #2
● NUMA bind: 2 NUMA nodes bound to 2 Spark executors respectively
● Data cache size (off-heap): #1 DCPMM (no spill); #2 680GB DRAM (partial spill)
Workload:
● Record count / data size / memory footprint: 6.1 billion records / 1.1TB data / 954GB cached
● Cache spill size: 0 for #1; 235GB for #2
● K-means: 10 iterations, K = 5
Performance Comparison (DCPMM App Direct vs. DRAM)
• DCPMM provides larger "memory" than DRAM and avoids the data spill in this case, yielding up to 4.98X performance in Train (read) and 2.85X end to end
• DCPMM App Direct mode is about 1.19X (~19%) faster than DRAM in Load (writing the cache into DCPMM/DRAM)
• The Train (read) gap in the DRAM case is caused by the storage media, as well as data decompression and deserialization from the local spill file
Benchmark disclaimer: see Notices and Disclaimers. Configurations: see the Experiments and Configurations slide. Test by Intel on 14/03/2019.
System Analysis
● DRAM system: SSD read bandwidth is 2.86GB/s during load and train; the workload is I/O bound, leading to low CPU utilization
● DCPMM system: much higher CPU utilization than the DRAM-only system
Summary
• Intel Optane DC Persistent Memory brings high capacity, low latency and high throughput
• Two operational modes for DCPMM: 1) App Direct 2) 2LM (Memory Mode)
• Good for Spark in: 1) memory-bound workloads 2) I/O-intensive workloads
Want to know more?
See you at Intel Booth 100
Download source code from GitHub:
https://github.com/intel-bigdata/OAP
https://github.com/pmem/vmemcache
DON’T FORGET TO RATE
AND REVIEW THE SESSIONS
SEARCH SPARK + AI SUMMIT
 
DAOS - Scale-Out Software-Defined Storage for HPC/Big Data/AI Convergence
DAOS - Scale-Out Software-Defined Storage for HPC/Big Data/AI ConvergenceDAOS - Scale-Out Software-Defined Storage for HPC/Big Data/AI Convergence
DAOS - Scale-Out Software-Defined Storage for HPC/Big Data/AI Convergence
 
hbaseconasia2019 HBase Bucket Cache on Persistent Memory
hbaseconasia2019 HBase Bucket Cache on Persistent Memoryhbaseconasia2019 HBase Bucket Cache on Persistent Memory
hbaseconasia2019 HBase Bucket Cache on Persistent Memory
 
QATCodec: past, present and future
QATCodec: past, present and futureQATCodec: past, present and future
QATCodec: past, present and future
 
Deep Learning Training at Scale: Spring Crest Deep Learning Accelerator
Deep Learning Training at Scale: Spring Crest Deep Learning AcceleratorDeep Learning Training at Scale: Spring Crest Deep Learning Accelerator
Deep Learning Training at Scale: Spring Crest Deep Learning Accelerator
 
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
 Best Practice of Compression/Decompression Codes in Apache Spark with Sophia... Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
 
Performance out of the box developers
Performance   out of the box developersPerformance   out of the box developers
Performance out of the box developers
 
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
 
Overcoming Scaling Challenges in MongoDB Deployments with SSD
Overcoming Scaling Challenges in MongoDB Deployments with SSDOvercoming Scaling Challenges in MongoDB Deployments with SSD
Overcoming Scaling Challenges in MongoDB Deployments with SSD
 
Accelerating Virtual Machine Access with the Storage Performance Development ...
Accelerating Virtual Machine Access with the Storage Performance Development ...Accelerating Virtual Machine Access with the Storage Performance Development ...
Accelerating Virtual Machine Access with the Storage Performance Development ...
 
Streamline End-to-End AI Pipelines with Intel, Databricks, and OmniSci
Streamline End-to-End AI Pipelines with Intel, Databricks, and OmniSciStreamline End-to-End AI Pipelines with Intel, Databricks, and OmniSci
Streamline End-to-End AI Pipelines with Intel, Databricks, and OmniSci
 
Новые технологии Intel в центрах обработки данных
Новые технологии Intel в центрах обработки данныхНовые технологии Intel в центрах обработки данных
Новые технологии Intel в центрах обработки данных
 
Konsolidace Oracle DB na systémech s procesory M7
Konsolidace Oracle DB na systémech s procesory M7Konsolidace Oracle DB na systémech s procesory M7
Konsolidace Oracle DB na systémech s procesory M7
 
DUG'20: 11 - Platform Performance Evolution from bring-up to reaching link sa...
DUG'20: 11 - Platform Performance Evolution from bring-up to reaching link sa...DUG'20: 11 - Platform Performance Evolution from bring-up to reaching link sa...
DUG'20: 11 - Platform Performance Evolution from bring-up to reaching link sa...
 
Intel Technologies for High Performance Computing
Intel Technologies for High Performance ComputingIntel Technologies for High Performance Computing
Intel Technologies for High Performance Computing
 
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
 
Ceph Day Taipei - Accelerate Ceph via SPDK
Ceph Day Taipei - Accelerate Ceph via SPDK Ceph Day Taipei - Accelerate Ceph via SPDK
Ceph Day Taipei - Accelerate Ceph via SPDK
 

More from Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
Databricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
Databricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
Databricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
Databricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
Databricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
Databricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Databricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
Databricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Databricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
Databricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Databricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
Databricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
Databricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
Databricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
Databricks
 

More from Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Recently uploaded

做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
correoyaya
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
StarCompliance.io
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
enxupq
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
ukgaet
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
Opendatabay
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
ewymefz
 
Jpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization SampleJpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization Sample
James Polillo
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Subhajit Sahu
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
Oppotus
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
AlejandraGmez176757
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
ewymefz
 
tapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive datatapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive data
theahmadsaood
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
nscud
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
ocavb
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
NABLAS株式会社
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
John Andrews
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
MaleehaSheikh2
 

Recently uploaded (20)

做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
 
Jpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization SampleJpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization Sample
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
 
tapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive datatapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive data
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
 

Accelerate Your Apache Spark with Intel Optane DC Persistent Memory

  • 1. WIFI SSID:SparkAISummit | Password: UnifiedAnalytics
  • 2. Cheng Xu, Intel Balcer Piotr, Intel Accelerate your Spark with Intel Optane DC Persistent Memory #UnifiedAnalytics #SparkAISummit
  • 3. Notices and Disclaimers 3 © 2018 Intel Corporation. Intel, the Intel logo, 3D XPoint, Optane, Xeon, Xeon logos, and Intel Optane logo are trademarks of Intel Corporation in the U.S. and/or other countries. All products, computer systems, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com. The cost reduction scenarios described are intended to enable you to get a better understanding of how the purchase of a given Intel based product, combined with a number of situation-specific variables, might affect future costs and savings. Circumstances will vary and there may be unaccounted-for costs related to the use and deployment of a given product. Nothing in this document should be interpreted as either a promise of or contract for a given level of costs or cost reduction. The benchmark results reported above may need to be revised as additional testing is conducted. The results depend on the specific platform configurations and workloads utilized in the testing, and may not be applicable to any particular user’s components, computer system or workloads. The results are not necessarily representative of other benchmarks and other benchmark results may show greater or lesser impact from mitigations. Results have been estimated based on tests conducted on pre-production systems, and provided to you for informational purposes. Any differences in your system hardware, software or configuration may affect your actual performance. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance results are based on testing as of 03-14-2019 and may not reflect all publicly available security updates. See configuration disclosure for details. No product can be absolutely secure. 
Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to www.intel.com/benchmarks. Intel processors of the same SKU may vary in frequency or power as a result of natural variability in the production process. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice Revision #20110804. Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks. *Other names and brands may be claimed as the property of others.
  • 4. About Us — Intel SSP/DAT – Data Analytics Technology ● Active open-source contributions: about 30 committers on our team, with significant contributions to many key big data projects including Spark, Hadoop, HBase, Hive... ● A global engineering team (China, US, India); the majority is based in Shanghai ● Mission: to fully optimize analytics solutions on IA and make Intel the best and easiest platform for big data and analytics customers. Intel NVMS – Non-Volatile Memory Solutions Group • Tasked with preparing the ground for Intel® Optane™ DC Persistent Memory • Enabling new and existing software through the Persistent Memory Development Kit (PMDK) • pmem.io 4
  • 5. Some Challenges in Spark: Memory??? – Out of memory – Large spill
  • 7. Intel® Optane™ DC Persistent Memory - Product Overview (Optane™ based Memory Module for the Data Center) IMC Cascade Lake Server IMC • 128, 256, 512GB DIMM capacity • 2666 MT/sec speed • 3TB capacity per CPU (not including DRAM) • DDR4 electrical & physical • Close to DRAM latency • Cache-line-size access * DIMM population shown as an example only.
  • 9. Intel® Optane™ DC Persistent Memory - Operational Modes: App Direct Mode / Memory Mode — persistent, high availability / less downtime, high capacity, affordable, significantly faster storage, ease of adoption†. Note that a BIOS update will be required before using Intel persistent memory.
  • 10. Intel® Optane™ DC Persistent Memory - Support for breadth of applications. Persistent performance & maximum capacity: Application / volatile memory pool / Optane persistent memory / DRAM as cache. Affordable memory capacity for many applications: Application / Optane persistent memory / DRAM.
  • 11. App Direct Mode Options: Legacy Storage APIs / Block Atomicity / Storage APIs with DAX (AppDirect). The stack spans user space and kernel space: Standard File API, Application, File System, Standard Raw Device Access, mmap, Load/Store, pmem-aware File System, Generic NVDIMM Driver, MMU mappings "DAX", BTT, DevDAX, PMDK, down to the persistent-memory hardware. • No code changes required • Operates in blocks like SSD/HDD • Traditional read/write • Works with existing file systems • Atomicity at block level • Block size configurable (4K, 512B*) • NVDIMM driver required • Support starting with kernel 4.2 • Can be configured as a boot device • Higher endurance than enterprise SSDs • High-performance block storage • Low latency, higher BW, high IOPs (*requires Linux) • Code changes may be required* • Bypasses the file-system page cache • Requires a DAX-enabled file system (XFS, EXT4, NTFS) • No kernel code or interrupts • Fastest IO path possible. * Code changes required for load/store direct access if the application does not already support this.
  • 12. Spark DCPMM Optimization Overview — low-bandwidth IO storage (e.g. HDD, S3), RDD cache, OAP cache, DRAM, IO layer, compute layer, Spark input data, cache hit / cache miss, DRAM-tiered storage. RDD cache DCPMM optimization (against DRAM): ● reduced DRAM footprint ● higher capacity to cache more data ● status: to be added. OAP DCPMM optimization (against DRAM): ● high capacity, fewer cache misses ● avoids heavy disk-read cost ● status: to be added. Workloads: 9 I/O-intensive queries (from decision-support queries) and K-means.
  • 13. Use Case 1: Spark SQL OAP I/O Cache
  • 14. OAP (Optimized Analytics Package) Goal • IO cache is critical for I/O-intensive workloads, especially in low-bandwidth environments (e.g. cloud, on-prem HDD-based systems) • Make full use of DCPMM's advantages to speed up Spark SQL • High capacity and high throughput • Low latency and reduced DRAM footprint • Better performance per TCO. Feature • Fine-grained, columnar cache (e.g. a column chunk for Parquet) • Cache-aware scheduler (V2 API via preferred locations) • Self-managed DCPMM pool (no extra GC overhead) • Easy to use (easily turned on/off) and transparent to users (no changes to their queries) 14 https://github.com/Intel-bigdata/OAP
  • 15. OAP 15 Spark DCPMM Full Software Stack: SQL workload → Spark SQL → VMEMCACHE → PMEM-aware file system → Intel® Optane™ DC Persistent Memory Module. Unmodified SQL, unchanged Spark. Provides a scheduler and fine-grained cache based on the Data Source API; abstracts away hardware details and the cache implementation; exposes persistent memory as memory-mapped files (DAX); native library to access DCPMM.
  • 16. Deployment Overview — Server 1: local storage (HDD), Spark Executor, Spark gateway (e.g. ThriftServer, Spark shell), SQL, cached data source (v1/v2), tasks scheduled, Intel Optane DC Persistent Memory, cache hit / cache miss; Server 2: native library (vmemcache), cache-aware scheduler.
  • 17. Cache Design - Problem statement • Local LRU cache • Support for large capacities available with persistent memory (many terabytes per server) • Lightweight, efficient and embeddable • In-memory • Scalable
  • 18. Cache Design - Existing solutions • In-memory databases tend to rely on malloc() in some form for allocating memory for entries – which means allocating anonymous memory • Persistent memory is exposed by the operating system through normal file-system operations – which means allocating byte-addressable PMEM needs to use file memory mapping (fsdax) • We could modify the allocator of an existing in-memory database and be done with it, right? :)
  • 19. Cache Design - Fragmentation • Manual dynamic memory management à la dlmalloc/jemalloc/tcmalloc/palloc causes fragmentation • Applications with substantial expected runtime durations need a way to combat this problem – Compacting GC (Java, .NET) – Defragmentation (Redis, Apache Ignite) – Slab allocation (memcached) • Especially so if there’s substantial expected variety in allocated sizes
  • 20. Cache Design - Extent allocation • If fragmentation is unavoidable, and defragmentation/compacting is CPU and memory-bandwidth intensive, let’s embrace it! • Extent allocation is usually done only in relatively large blocks in file systems. • But on PMEM, we are no longer restricted by large transfer units (sectors, pages etc.)
  • 21. Cache Design - Scalable replacement policy • Performance of libvmemcache was bottlenecked by a naïve implementation of LRU based on a doubly-linked list • With hundreds of threads, most of the time of any request was spent waiting on a list lock… • Locking per-node doesn't solve the problem…
  • 22. Cache Design - Buffered LRU • Our solution was quite simple. • We’ve added a wait-free ringbuffer which buffers the list-move operations • This way, the list only needs to get locked during eviction or when the ringbuffer is full.
  • 23. Cache Design - Lightweight, embeddable, in-memory caching https://github.com/pmem/vmemcache

    VMEMcache *cache = vmemcache_new("/tmp", VMEMCACHE_MIN_POOL,
            VMEMCACHE_MIN_EXTENT, VMEMCACHE_REPLACEMENT_LRU);
    const char *key = "foo";
    vmemcache_put(cache, key, strlen(key), "bar", sizeof("bar"));
    char buf[128];
    ssize_t len = vmemcache_get(cache, key, strlen(key),
            buf, sizeof(buf), 0, NULL);
    vmemcache_delete(cache);

  libvmemcache has normal get/put APIs, an optional replacement policy, and a configurable extent size. It handles terabyte-sized in-memory workloads without breaking a sweat, with very high space utilization. It also works on regular DRAM.
  • 24. Cache Design – Status and Fine Grain: a Parquet file consists of a footer and row groups (RowGroup #1 … #N), each holding column chunks (Column Chunk #1, #2, … #N). The fine-grain cache caches individual column chunks, backed by either a Guava-based cache or vmemcache, and tracks cache status for reporting.
  • 25. Experiments and Configurations (DCPMM system vs. DRAM system)
    Hardware
    – DRAM: 192GB (12x 16GB DDR4) on the DCPMM system; 768GB (24x 32GB DDR4) on the DRAM system
    – Intel Optane DC Persistent Memory: 1TB (QS: 8x 128GB) on the DCPMM system; N/A on the DRAM system
    – DCPMM mode: App Direct (Memkind); SSD: N/A
    – CPU: 2x Cascade Lake 8280M (2 threads/core, 28 cores/socket, 2 sockets; max 4000 MHz, min 1000 MHz; L1d 32K, L1i 32K, L2 1024K, L3 39424K)
    – OS: 4.20.4-200.fc29.x86_64 (BKC: WW06'19, BIOS: SE5C620.86B.0D.01.0134.100420181737)
    Software
    – OAP: 1TB DCPMM-based OAP cache vs. 610GB DRAM-based OAP cache
    – Hadoop: 8x HDD (ST1000NX0313), 1-replica uncompressed & plain-encoded data
    – Spark: 1x driver (5GB) + 2x executor (62 cores, 74GB), spark.sql.oap.rowgroup.size=1MB
    – JDK: Oracle JDK 1.8.0_161
    Workload
    – Data scale: 2TB, 3TB, 4TB
    – Decision-making queries: 9 I/O-intensive queries
    – Multi-tenants: 9 threads (fair scheduled)
  • 26. Test Scenario 1: Both Fit In DRAM And DCPMM - 2TB *The performance gain can be much lower if IO is improved (e.g. compression & encoding enabled) or some hacky code paths are fixed (e.g. the NUMA scheduler). With 2TB data scale, both the DCPMM- and the DRAM-based OAP cache can hold the full dataset (613GB). DCPMM performs 24.6% (= 1 - 100.1/132.81) lower than DRAM as measured on 9 I/O-intensive decision-making queries. On-Premise Performance. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit www.intel.com/benchmarks. Configurations: see page 20. Test by Intel on 24/02/2019.
  • 27. Test Scenario 2: Fits In DCPMM But Not In DRAM - 3TB *The performance gain can be much lower if IO is improved (e.g. compression & encoding enabled) or some hacky code paths are fixed (e.g. the NUMA scheduler). With 3TB data scale, only the DCPMM-based OAP cache can hold the full dataset (920GB). DCPMM shows an 8X* performance gain over DRAM as measured on 9 I/O-intensive decision-making queries. On-Premise Performance. Configurations: see page 20. Test by Intel on 24/02/2019.
  • 28. Performance Analysis - System Metrics ● Input data is fully cached in DCPMM, but only partially in DRAM ● DCPMM reaches up to 18 GB/s bandwidth, while the DRAM case is bounded by disk IO of only about 250 MB/s ~ 450 MB/s ● The OAP cache doesn't apply to shuffle data (intermediate data), so that extra IO pressure falls on the DRAM case ● DCPMM reads about 27.5 GB while DRAM reads about 395.6 GB. Chart notes: with the OAP cache (DCPMM), disk reads (blue) are avoided and only shuffle writes (red) hit disk; in the DRAM case, disk reads (blue) come from shuffle data and occur from time to time.
  • 29. Test Scenario 3: Neither DRAM Nor DCPMM Fits - 4TB *The performance gain can be much lower if IO is improved (e.g. compression & encoding enabled) or some hacky code paths are fixed (e.g. the NUMA scheduler). With 4TB data scale, neither the DCPMM- nor the DRAM-based OAP cache can hold the full dataset (1226.7GB). DCPMM shows a 1.66X* (= 2252.80/1353.67) performance gain over DRAM as measured on 9 I/O-intensive decision-making queries. On-Premise Performance. Configurations: see page 20. Test by Intel on 24/02/2019.
  • 30. Use Case 2: Machine Learning Spark K-means
  • 31. DCPMM Storage Level (Spark Core compute/storage memory tiers: DRAM on-heap objects, DRAM off-heap byte buffers, disk byte buffers; used by K-means and GraphX workloads; reading from disk requires decompression and deserialization) • Spark Storage Level – Several storage levels serve different purposes, spanning memory and disk – Off-heap memory is supported to avoid GC overhead in Java – The large-capacity storage level (disk) is widely used for iterative workloads (e.g. K-means, GraphX) by caching hot data • DCPMM Storage Level – Extends the memory layer – Uses a PMEM library to access DCPMM, avoiding the decompression overhead of reading from disk – The large capacity and high I/O performance of DCPMM yield better performance than the tiered solution (the original DRAM + disk solution)
  • 32. K-means in Spark: data blocks M0–Mn are loaded from HDFS into caches #1–#N (DRAM + disk) across the Spark executor processes. Load: data is read from HDFS into DRAM (and DCPMM / SSD if DRAM cannot hold all of it); after that the data is not changed and is iterated repeatedly in the Initialization and Train stages. Initialization: 2 iterations over all of the cached data pick the K initial centroids (C1, C2, C3, … CK) in a random mode. Train, for N iterations: Step 1: for each record in the cache, sum the vectors that share the same closest centroid; Step 2: sum the vectors according to the centroids, find the new centroids globally, and sync the updated centroids.
  • 33. K-means Basic Data Flow (each compute node has CPU, memory, and local/shared storage; input comes from HDFS; output is the K centroids C1, C2, C3, … CK) • Load: load data from HDFS into memory, spilling over to local storage • Initialization: compute using the initial centroids, based on data in memory or local storage • Train: compute iterations based on local data
  • 34. Experiments and Configurations (#1 AD vs. #2 DRAM)
    Hardware
    – DRAM: 192GB (12x 16GB DDR4 2666) on the AD system; 768GB (24x 32GB DDR4 2666) on the DRAM system
    – Intel Optane DC Persistent Memory: 1TB (8x 128GB QS) on the AD system; N/A on the DRAM system
    – DCPMM mode: App Direct
    – CPU: Intel(R) Xeon(R) Platinum 8280L CPU @ 2.70GHz
    – Disk drives: 8x Intel DC S4510 SSD (1.8T) + 2x P4500 (1.8T)
    Software
    – CPU / memory allocation: driver (5GB + 10 cores) + 2x executor (80GB + 32 cores)
    – Software stack: Spark (2.2.2-snapshot: bb3e6ed9216f98f3a3b96c8c52f20042d65e2181, modified to support DCPMM) + Hadoop (2.7.5)
    – OS & kernel & BIOS: Linux-4.18.8-100.fc27.x86_64-x86_64-with-fedora-27-Twenty_Seven, BIOS: SE5C620.86B.0D.01.0299.122420180146
    – Mitigation variants: 1, 2, 3, 3a, 4, L1TF
    – Cache spill disk: N/A on AD; 8x SSD on DRAM
    – NUMA bind: 2 NUMA nodes bound to the 2 Spark executors respectively
    – Data cache size (off-heap): DCPMM (no spill) vs. 680GB DRAM (partial spill)
    Workload
    – Record count / data size / memory footprint: 6.1 billion records / 1.1TB data / 954GB cached
    – Cache spill size: 0 (AD) vs. 235GB (DRAM)
    – K-means iterations: 10; K value: 5
  • 35. Performance Comparison (DCPMM AD vs. DRAM) • DCPMM provides larger "memory" than DRAM and avoids the data spill in this case, giving up to 4.98X performance in Train (READ) and 2.85X end to end • DCPMM App Direct mode is about 19% (1.19X) faster than DRAM in Load (WRITE of the cache into DCPMM / DRAM) • The Train (READ) execution gap in the DRAM case is caused by the storage media, as well as data decompression and deserialization from the local spill file. Configurations: see page 11. Test by Intel on 14/03/2019.
  • 36. System Analysis (load and train phases, DRAM vs. DCPMM) ● DRAM: SSD read bandwidth is 2.86 GB/s; the run is I/O bound, leading to low CPU utilization ● DCPMM: much higher CPU utilization than the DRAM-only system
  • 37. Summary • Intel Optane DC Persistent Memory brings high capacity, low latency and high throughput • DCPMM has two operating modes: 1) App Direct 2) 2LM (Memory Mode) • Good for Spark in: 1) memory-bound workloads 2) I/O-intensive workloads
  • 38. Want to know more? See you at Intel Booth 100 Download source code from Github: https://github.com/intel-bigdata/OAP https://github.com/pmem/vmemcache
  • 39. DON’T FORGET TO RATE AND REVIEW THE SESSIONS SEARCH SPARK + AI SUMMIT