Costin Iancu, Khaled Ibrahim – LBNL
Nicholas Chaimov – U. Oregon
Spark on Supercomputers:
A Tale of the Storage Hierarchy
Apache Spark
• Developed for cloud environments
• Specialized runtime provides
– Performance, elastic parallelism, resilience
• Programming productivity through
– HLL front-ends (Scala, R, SQL), multiple domain-specific libraries:
Streaming, SparkSQL, SparkR, GraphX, Splash, MLLib, Velox
• HPC has huge datasets, yet Spark has little penetration there
Apache Spark
• In-memory Map-Reduce framework
• Central abstraction is the Resilient Distributed Dataset (RDD)
• Data movement is important
– Lazy, on-demand
– Horizontal (node-to-node) – shuffle/Reduce
– Vertical (node-to-storage) - Map/Reduce
[Figure: JOB 0 over partitions p1–p3. textFile → flatMap → map → reduceByKey (local) form STAGE 0; the shuffle into reduceByKey (global) forms STAGE 1. A Scala sketch of this pipeline follows.]
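For concreteness, a minimal Scala sketch of the job pictured above, a classic word count; the application name and input path are illustrative:

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("WordCount"))
val counts = sc.textFile("input.txt")     // partitions p1..p3
  .flatMap(line => line.split(" "))       // narrow dependency: stays in STAGE 0
  .map(word => (word, 1))                 // narrow dependency: stays in STAGE 0
  .reduceByKey(_ + _)                     // local combine, then shuffle into STAGE 1
counts.collect()                          // action: triggers the lazy evaluation of JOB 0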
Data Centers/Clouds
Node-local storage; assumes all disk operations are equal
Disk I/O optimized for latency
Network optimized for bandwidth
HPC
Global file system; asymmetry expected
Disk I/O optimized for bandwidth
Network optimized for latency
[Figure: three node architectures.
Cloud: commodity CPU, memory, HDD/SSD, NIC.
Data appliance: server CPU, large fast memory, fast SSD.
HPC: server CPU, fast memory, a combination of fast intermediate storage and slower backend storage.]
[Figure: the two test systems mapped onto the HPC storage hierarchy (intermediate and backend storage).
Comet (DELL): 2.5 GHz Intel Haswell, 24 cores; 128GB/1.5TB DDR4; 320 GB of local SSD; 56 Gbps FDR InfiniBand.
Cori (Cray XC40): 2.3 GHz Intel Haswell, 32 cores; 128GB DDR4; Cray Aries network; Cray DataWarp intermediate storage, 1.8PB at 1.7TB/s; Sonexion Lustre backend storage, 30PB.]
Scaling Spark on Cray XC40
(It’s all about file system metadata)
Not ALL I/O is Created Equal
[Chart: GroupByTest, I/O components on Cori. Time per operation (microseconds) vs. nodes (1 to 16) for open, read, and write on Lustre, striped Burst Buffer, and private Burst Buffer. Data labels: 9,216; 36,864; 147,456; 589,824; 2,359,296 opens.]
# shuffle opens = # shuffle reads = O(cores²)
Time per open increases with scale, unlike read/write
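A back-of-the-envelope count behind the O(cores²) claim, assuming each reduce task opens one intermediate shuffle file per map task; the per-node task count of 96 is inferred from the data labels:

\[ \#\text{opens} = M \times R \approx (96 \cdot \text{nodes})^2 \]

For example, \(96^2 = 9{,}216\) opens on one node and \((16 \cdot 96)^2 = 1536^2 = 2{,}359{,}296\) on sixteen nodes, matching the labels above.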
I/O Variability is HIGH
[Chart: distributions of READ and fopen times.]
fopen is a problem:
• Mean time is 23X larger than on SSD
• Variability is 14,000X higher
Improving I/O Performance
Eliminate file metadata operations:
1. Keep files open (cache fopen); see the sketch after this list
• Surprising 10%-20% improvement even on the data appliance
• Argues for user-level file systems; eliminates serialized system calls
2. Back the shuffle with a file system stored in a single Lustre file
• This should also help on systems with local SSDs
3. Use containers
• Speeds up startup; up to 20% end-to-end performance improvement
• The solutions need to be used in conjunction
– e.g., fopen calls issued from the Parquet reader
Plenty of details in "Scaling Spark on HPC Systems", HPDC 2016
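A minimal sketch of the idea behind optimization 1, assuming a thread-safe cache keyed by path; this is illustrative, not the authors' actual patch:

import java.io.{File, RandomAccessFile}
import java.nio.channels.FileChannel
import scala.collection.concurrent.TrieMap

// Keep shuffle files open: repeated reads of the same block file reuse a
// cached FileChannel instead of paying Lustre's metadata cost on every open.
object OpenFileCache {
  private val channels = new TrieMap[String, FileChannel]()

  // Open the file only on first access; return the cached channel afterwards.
  def channelFor(path: String): FileChannel =
    channels.getOrElseUpdate(path,
      new RandomAccessFile(new File(path), "r").getChannel)

  // Release all handles, e.g. at executor shutdown.
  def closeAll(): Unit = {
    channels.values.foreach(_.close())
    channels.clear()
  }
}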
Scalability
[Chart: Cori, GroupBy, weak scaling, time to job completion (s) vs. cores, from 32 to 10,240, for Ramdisk, Mounted File, and Lustre; annotated slowdowns of 6x, 12x, 14x, 19x, 33x, and 61x.]
At 10,240 cores, the mounted file is only 1.6x slower than RAMdisk (in-memory execution)
We scaled Spark from O(100) up to O(10,000) cores
File-Backed Filesystems
• NERSC Shifter (container infrastructure for HPC)
– Compatible with Docker images
– Integrated with Slurm scheduler
– Can control mounting of filesystems within container
• Per-Node Cache
– File-backed filesystem mounted within each node's container instance at a common path (/mnt)
– --volume=$SCRATCH/backingFile:/mnt:perNodeCache=size=100G
– The backing file for each node is created on the backend Lustre filesystem
– Single file open on Lustre; intermediate data file opens stay node-local
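One way to combine Spark with the per-node cache, a sketch assuming a standalone deployment: spark.local.dir is a standard Spark property controlling where shuffle and scratch files go (some cluster managers override it through environment variables such as SPARK_LOCAL_DIRS).

import org.apache.spark.{SparkConf, SparkContext}

// Point Spark's scratch/shuffle space at the file-backed filesystem that
// Shifter mounts at /mnt on every node (see the --volume flag above), so
// intermediate file opens stay node-local instead of hitting Lustre metadata.
val conf = new SparkConf()
  .setAppName("ShuffleOnPerNodeCache")
  .set("spark.local.dir", "/mnt")
val sc = new SparkContext(conf)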
Now the fun part
Architectural Performance Considerations
The Supercomputer (Cori) vs The Data Appliance (Comet)
[Figure repeated from earlier: Comet (DELL) and Cori (Cray XC40) node architectures and storage hierarchy; see the system specifications above.]
CPU, Memory, Network, Disk?
• Multiple extensions to Blocked Time Analysis (Ousterhout, 2015)
• BTA indicated that CPU dominates
– Network 2%, disk 19%
• We concentrate on scaling out, with weak scaling studies
– Spark-perf, BigDataBenchmark, TPC-DS, TeraSort
• Interested in determining the right ratio (machine balance) of
– CPU, memory, network, disk, …
• Spark 2.0.2 & Spark-RDMA 0.9.4 from Ohio State University, Hadoop 2.6
Storage hierarchy and performance
Global Storage Matches Local Storage
[Chart: Cray XC40, TeraSort (100GB/node). Time (ms) on 1, 5, and 20 nodes (32 cores each) for Lustre, Mount+Pool, and SSD+IB, broken down into App, JVM, RW Input, RW Shuffle, Open Input, and Open Shuffle; the breakdown separates disk+network latency/bandwidth from metadata overhead.]
• Variability matters more than advertised latency and bandwidth numbers
• Storage performance is obscured/mitigated by the network, due to the client/server design of the BlockManager
• At small scale, local storage is slightly faster
• At large scale, global storage is faster
Global Storage Matches Local Storage
[Chart: average across MLLib benchmarks, normalized time on 1 and 16 nodes for Comet (RDMA, Singularity, 24 cores) and Cori (Shifter, 24 cores), broken down into App, Fetch, and JVM. Fetch accounts for 11.8% and 12.5% of time on the two systems.]
Intermediate Storage Hurts Performance
[Chart: TPC-DS, weak scaling, time (s) on 1 to 64 nodes for Cori Shifter on Lustre, Burst Buffer striped, and Burst Buffer private, broken down into App, Fetch, and JVM.]
The Burst Buffer configurations are 19.4% (striped) and 86.8% (private) slower on average than Lustre
(Without our optimizations, intermediate storage scaled better)
Networking performance
[Chart: Singular Value Decomposition, time (s) on 1 to 64 nodes for Comet Singularity, Comet RDMA Singularity, and Cori Shifter (24 cores), broken down into App, Fetch, and JVM.]
Latency or Bandwidth?
Despite 10X differences in bandwidth, latency differences matter; 2X differences can be hidden
Average message size for spark-perf is 43B
Network Matters at Scale
[Chart: average across benchmarks, time (s) on 1 to 64 nodes for Cori Shifter with 24 vs. 32 cores per node, broken down into App, Fetch, and JVM; Fetch grows to 44% at the largest scale.]
CPU
More cores or better memory?
• Need more cores to hide disk and network latency at scale
• Preliminary experiences with Intel KNL are bad
– Too much concurrency
– Not enough integer throughput
• Execution does not seem to be memory-bandwidth limited
[Chart repeated from the previous slide: average across benchmarks, Cori Shifter, 24 vs. 32 cores per node.]
Summary/Conclusions
• Latency and bandwidth are important, but not dominant
– Variability is more important than marketing numbers
• Network time dominates at scale
– Network and disk time are mis-attributed as CPU
• Comet matches Cori up to 512 cores; Cori is twice as fast at 2048 cores
– Spark can run well on global storage
• Global storage opens the possibility of a global name space, with no more client-server
Acknowledgement
Work partially supported by
Intel Parallel Computing Center: Big Data Support
for HPC
Thank You.
Questions, collaborations, free software
cciancu@lbl.gov
kzibrahim@lbl.gov
nchaimov@uoregon.edu
Burst Buffer Setup
• Cray XC30 at NERSC (Edison): 2.4 GHz IvyBridge - global storage
• Cray XC40 at NERSC (Cori): 2.3 GHz Haswell + Cray DataWarp
• Comet at SDSC: 2.5 GHz Haswell, InfiniBand FDR, 320 GB SSD, 1.5TB memory - local storage