Jayjeet Chakraborty
Towards an Arrow-Native Storage System
SkyhookDM
Mentored by: Carlos Maltzahn, Ivo Jimenez, Jeff LeFevre
Who am I?
• Incoming Grad Student at UC Santa Cruz

• CS Graduate from NIT Durgapur, India

• IRIS-HEP Fellow Summer 2020

• Twitter: @heyjc25

• GitHub: JayjeetAtGithub

• LinkedIn: https://www.linkedin.com/in/jayjeet-chakraborty-077579162/

• E-Mail: jchakra1@ucsc.edu
Problem
• The CPU is the new bottleneck now that networks and storage devices are fast.

• Client-side processing of data stored in highly efficient formats such as Parquet and ORC exhausts the client CPUs.

• Scalability is severely hampered as a result.
Our Solution
• Offload computation from the client to the storage layer.

• Take advantage of the idle CPUs of storage systems for increased processing rates and faster queries.

• This results in less data movement and network traffic.
Introduction to Ceph
1. Provides three types of storage interface: File, Object, and Block.

2. No central point of failure. Ceph uses CRUSH maps that determine the object-to-OSD mapping. Each client holds a CRUSH map, so clients talk directly to the OSDs.

3. Highly extensible object storage layer via the Ceph Object Classes SDK.
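The point about clients locating OSDs without a central lookup can be illustrated with a small sketch. This is not the CRUSH algorithm: it swaps in rendezvous (highest-random-weight) hashing as a simpler stand-in, and the function name, object name, and OSD ids are hypothetical. It only shows that any client holding the same map computes the same placement on its own.

```python
import hashlib


def place_object(object_name, osd_ids, replicas=3):
    """Rank OSDs by a hash of (object, OSD) and keep the top `replicas`.

    Stand-in for CRUSH: placement is computed, not looked up, so every
    client with the same OSD list arrives at the same answer.
    """
    def score(osd_id):
        digest = hashlib.sha256(f"{object_name}:{osd_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    ranked = sorted(osd_ids, key=score, reverse=True)
    return ranked[:replicas]


# Two "clients" hold the same map (here just a list of OSD ids, in
# different order) and never consult a central server.
cluster_map = [0, 1, 2, 3, 4, 5]
placement_a = place_object("events.parquet.0", cluster_map)
placement_b = place_object("events.parquet.0", list(reversed(cluster_map)))
```

Both calls return the same three OSD ids, which is the property that lets a client contact the right OSD directly.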
Apache Arrow
• Language-independent columnar memory format for flat and hierarchical data, organised for efficient analytic operations on modern hardware.

• Share data between processes without serialization overhead.
Figure: Before Arrow / After Arrow
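To make the columnar and zero-copy ideas concrete, here is a minimal sketch that uses only the Python standard library rather than Arrow itself. The column names and values are made up, and the sharing happens inside one process, as a stand-in for the cross-process sharing the slide describes.

```python
from array import array

# Row-oriented records, as an application might hold them.
rows = [(1, 10.5), (2, 20.0), (3, 7.25)]

# Columnar layout: one contiguous, typed buffer per column.
ids = array("q", (r[0] for r in rows))     # 64-bit signed integers
values = array("d", (r[1] for r in rows))  # 64-bit floats

# A memoryview exposes the column's memory without copying it,
# so a consumer sees updates made by the producer.
view = memoryview(values)
values[0] = 99.0
seen_through_view = view[0]

# bytes() copies the buffer, which is what a serialization step does;
# later changes to the column are not reflected in the copy.
snapshot = bytes(view)
values[1] = -1.0
snapshot_second_value = memoryview(snapshot).cast("d")[1]
```

The view reflects the write to `values[0]`, while the copied snapshot still holds the old second value.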
Components of Arrow
Figure: Arrow components used by Skyhook
Design Paradigm
• Extend the client and storage layers of programmable storage systems with data access libraries.

• Embed a FS shim inside storage nodes to provide a file-like view over objects.

• Allow direct interaction with objects in an object store, bypassing the filesystem layer by utilising FS metadata.
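The idea of a file-like view over objects can be sketched as follows. Everything here is hypothetical: `InMemoryObjectStore` stands in for an object store, `ObjectFile` is an illustrative shim and not SkyhookDM's actual class, and the payload is dummy bytes framed by the Parquet magic string `PAR1`, not a valid Parquet file.

```python
import io


class InMemoryObjectStore:
    """Hypothetical object store: named blobs with ranged reads."""

    def __init__(self):
        self._objects = {}

    def put(self, name, data):
        self._objects[name] = bytes(data)

    def stat(self, name):
        return len(self._objects[name])

    def read(self, name, offset, length):
        return self._objects[name][offset:offset + length]


class ObjectFile(io.RawIOBase):
    """Seekable, read-only file view over a single object."""

    def __init__(self, store, name):
        super().__init__()
        self._store = store
        self._name = name
        self._pos = 0
        self._size = store.stat(name)

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            new_pos = offset
        elif whence == io.SEEK_CUR:
            new_pos = self._pos + offset
        elif whence == io.SEEK_END:
            new_pos = self._size + offset
        else:
            raise ValueError("unsupported whence value")
        if new_pos < 0:
            raise ValueError("negative seek position")
        self._pos = new_pos
        return self._pos

    def readinto(self, buffer):
        # Translate the file read into a ranged object read.
        chunk = self._store.read(self._name, self._pos, len(buffer))
        buffer[:len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)


store = InMemoryObjectStore()
store.put("events.parquet.0", b"PAR1" + b"\x00" * 16 + b"PAR1")

f = ObjectFile(store, "events.parquet.0")
f.seek(-4, io.SEEK_END)   # a Parquet reader starts from the file tail
footer_magic = f.read(4)
f.seek(0)
header_magic = f.read(4)
```

A reader that expects `seek` and `read` can then operate on an object without a filesystem in between, which is the role the shim plays in the design.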
Architecture
• Arrow data access libraries are embedded inside Ceph OSDs to allow file fragment scanning inside the storage layer.

• The functionality is exposed through the Arrow Dataset API by creating a new file format abstraction, “RadosParquetFileFormat”.
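What scanning a fragment inside the storage layer means can be sketched in plain Python. This is not the RadosParquetFileFormat API or the Arrow Dataset API: `ScanRequest`, `scan_on_storage_node`, and the column names are hypothetical, and the fragment is a small in-memory dictionary of columns. The sketch only shows that projection and filtering run where the data lives, so only the requested result travels back to the client.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Fragment = Dict[str, List[float]]  # column name -> column values


@dataclass
class ScanRequest:
    """What the client ships to the storage node (hypothetical shape)."""
    columns: List[str]                            # projection
    predicate: Callable[[Dict[str, float]], bool]  # row filter


def scan_on_storage_node(fragment: Fragment, request: ScanRequest) -> Fragment:
    """Apply the filter and projection next to the data."""
    n_rows = len(next(iter(fragment.values())))
    keep = [
        i for i in range(n_rows)
        if request.predicate({name: col[i] for name, col in fragment.items()})
    ]
    return {name: [fragment[name][i] for i in keep] for name in request.columns}


# Data held by one storage node.
fragment = {
    "pt": [12.0, 48.5, 3.2, 77.1],
    "eta": [0.1, -1.3, 2.2, 0.7],
    "charge": [1.0, -1.0, 1.0, -1.0],
}

# The client asks for two columns and only the rows with pt > 40.
request = ScanRequest(columns=["pt", "eta"], predicate=lambda row: row["pt"] > 40.0)
result = scan_on_storage_node(fragment, request)
```

Here `result` holds two rows of two columns instead of four rows of three columns.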
File Layout Design
• Large multi-gigabyte Parquet files are split into smaller ~128 MB Parquet files.
• Each Parquet file is stored in a single RADOS object for SkyhookDM to access.
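As a rough illustration of the layout arithmetic, the sketch below plans how many objects a large file would map to. It assumes 128 MiB per object and a hypothetical `<file>.<index>` naming scheme, and it ignores the fact that a real split has to fall on row boundaries so that each piece is itself a valid Parquet file.

```python
import math

TARGET_OBJECT_BYTES = 128 * 1024 * 1024  # assumed target size per object


def plan_split(file_name, file_bytes, target=TARGET_OBJECT_BYTES):
    """Return hypothetical object names for a file of `file_bytes` bytes."""
    count = max(1, math.ceil(file_bytes / target))
    return [f"{file_name}.{i}" for i in range(count)]


# A 4 GiB file maps to 32 objects of at most 128 MiB each.
names = plan_split("events.parquet", 4 * 1024**3)
```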
Experiments: Latency
• Offloading makes more selective queries (those returning a smaller fraction of the data) faster, because less data is moved around the system. Also, less time goes into data (de)serialization and more into processing.

• With LZ4-compressed Arrow IPC files (bottom), SkyhookDM performs better than with Parquet files (top), since they are faster to read and write.
Figure: Parquet on disk (top), LZ4 IPC on disk (bottom)
Experiments: CPU Usage
• SkyhookDM offloads CPU usage from the client layer to the storage layer. For example, with 4 OSDs and 100% selectivity,
Figure: CPU usage without Skyhook and with Skyhook
Experiments: Network Traffic
• SkyhookDM saves network bandwidth by transferring only the data that the client requests.

• In the 100% case, slightly more data is transferred, because LZ4-compressed Arrow is larger than Parquet binary data.
Figure: panels for 1%, 10%, and 100%
Experiments: Crash Recovery
• In SkyhookDM, since processing is colocated with storage nodes, the crash recovery
and consistency semantics of the storage layer apply naturally to query processing.
Figure label: Crash Point
Coffea + SkyhookDM
• Implemented a run_parquet_job executor method in Coffea to read from Parquet files using the Arrow Dataset API. This in turn allowed Coffea to be integrated with SkyhookDM seamlessly.
Ongoing Work
• Arrow’s memory layout requires internal memory copies to serialize it to a contiguous on-the-wire format, and this has a very high overhead.
Sending uncompressed IPC
[6] Serialize Result Table: 41.5%
[5] Scan Parquet Data: 30.5%
[7] Result Transfer: 24.6%
[4] Disk I/O: 3.34%
[3] Deserialize Scan Request: 0.103%
[1] Stat Fragment: 0.0324%
[8] Deserialize Result Table: 0.00855%
[2] Serialize Scan Request: 0.00511%
Sending LZ4 compressed IPC
[5] Scan Parquet Data: 48.3%
[6] Serialize Result Table: 29.5%
[7] Result Transfer: 11.7%
[8] Deserialize Result Table: 5.37%
[4] Disk I/O: 5.11%
[3] Deserialize Scan Request: 0.0513%
[1] Stat Fragment: 0.0304%
[2] Serialize Scan Request: 0.00771%
• Collaborating with the ServiceX and Coffea teams to integrate SkyhookDM into the larger analysis facility ecosystem.
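The serialization overhead noted under Ongoing Work comes from gathering separately allocated buffers into one contiguous message. The sketch below shows that pattern with a made-up length-prefixed framing. It is not the Arrow IPC format, and the function names are hypothetical. Writing copies every buffer, while reading can hand out zero-copy slices of the received message.

```python
import struct
from array import array


def serialize(buffers):
    """Copy separate buffers into one contiguous, length-prefixed message."""
    parts = [struct.pack("<I", len(buffers))]      # number of buffers
    for buf in buffers:
        raw = bytes(buf)                            # first copy of the data
        parts.append(struct.pack("<Q", len(raw)))   # buffer length
        parts.append(raw)
    return b"".join(parts)                          # second copy, now contiguous


def deserialize(message):
    """Return zero-copy views onto each buffer inside the message."""
    view = memoryview(message)
    (count,) = struct.unpack_from("<I", view, 0)
    offset = 4
    buffers = []
    for _ in range(count):
        (length,) = struct.unpack_from("<Q", view, offset)
        offset += 8
        buffers.append(view[offset:offset + length])  # slice, no copy
        offset += length
    return buffers


# Two columns that live in separate allocations.
columns = [array("q", [1, 2, 3]), array("d", [0.5, 1.5, 2.5])]
message = serialize(columns)
restored = deserialize(message)
restored_ids = restored[0].cast("q").tolist()
restored_values = restored[1].cast("d").tolist()
```

In this toy framing the write path touches every byte twice and the read path touches none, which matches the direction of the imbalance between the serialize and deserialize stages in the uncompressed case.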
Check out our work
• GitHub Repository: https://github.com/uccross/skyhookdm-arrow

• Docker containers: https://github.com/uccross/skyhookdm-arrow-docker

• arXiv Paper: https://arxiv.org/pdf/2105.09894.pdf

• Coffea Skyhook Plugin: https://github.com/CoffeaTeam/coffea/tree/master/docker/coffea_rados_parquet

• Several bugs found and reported in Apache Arrow: ARROW-13161, ARROW-13126, ARROW-13088.
Thank You

Questions?

SkyhookDM - Towards an Arrow-Native Storage System

  • 1. Jayjeet Chakraborty Towards an Arrow-Native Storage System SkyhookDM Mentored by: Carlos Maltzahn, Ivo Jimenez, Jeff LeFevre
  • 2. Who am I? • Incoming Grad Student at UC Santa Cruz • CS Graduate from NIT Durgapur, India • IRIS-HEP Fellow Summer 2020 • Twitter: @heyjc25 • Github: JayjeetAtGithub • LinkedIn: https://www.linkedin.com/in/jayjeet-chakraborty-077579162/ • E-Mail: jchakra1@ucsc.edu
  • 3. Problem • CPU is the new bottleneck with high-speed network and storage devices. • Client-side processing of data from highly efficient storage formats like Parquet and ORC exhausts the CPUs. • Severely hampered scalability. Our Solution • Offload computation from client to the storage layer. • Take advantage of the idle CPUs of storage systems for increased processing rates and faster queries. • Results in less data movement and network traffic.
  • 4. Introduction to Ceph 1. Provides three types of storage interface: File, Object, and Block.
 2. No central point of failure. Uses CRUSH maps that contain the object-to-OSD mapping. Each client holds a CRUSH map and talks directly to the OSDs.
 3. Highly extensible object storage layer via the Ceph Object Classes SDK.
  • 5. • Language-independent columnar memory format for flat and hierarchical data, organised for efficient analytic operations on modern hardware. • Share data between processes without serialization overhead. [Figure panels: Before Arrow, After Arrow]
  • 6. Components of Arrow [Figure: Arrow components used by Skyhook]
  • 7. Design Paradigm • Extend client and storage layers of programmable storage systems with data access libraries. • Embed a FS shim inside storage nodes to provide a file-like view over objects. • Allow direct interaction with objects in an object store, bypassing the filesystem layer by utilising FS metadata.
  • 8. Architecture • Arrow data access libraries embedded inside Ceph OSDs to allow file fragment scanning inside the storage layer. • Expose the functionality through the Arrow Dataset API by creating a new file format abstraction “RadosParquetFileFormat”.
  • 9. File Layout Design • Large multi-gigabyte Parquet files are split into smaller ~128 MB Parquet files. • Each Parquet file is stored in a single RADOS object for SkyhookDM to access.
  • 10. Experiments: Latency • Offloading makes queries with higher selectivity faster, as less data is moved around the system. Also, less time goes into data (de)serialization and more into processing. • LZ4-compressed Arrow IPC files (bottom) make SkyhookDM perform better than Parquet files (top), since they are faster to read and write. [Figure panels: Parquet on Disk (top), LZ4 IPC on Disk (bottom)]
  • 11. Experiments: CPU Usage • SkyhookDM offloads CPU usage from the client layer to the storage layer. For example with 4 OSDs and 100% selectivity, [Figure panels: Without Skyhook, With Skyhook]
  • 12. Experiments: Network Traffic • SkyhookDM saves network bandwidth by transferring only the data that is requested by the client. • We end up transferring a little more data in the 100% selectivity case, as LZ4-compressed Arrow is larger than Parquet binary data. [Figure panels: 1%, 10%, 100%]
  • 13. Experiments: Crash Recovery • In SkyhookDM, since processing is colocated with storage nodes, the crash recovery and consistency semantics of the storage layer apply naturally to query processing. [Figure annotation: Crash Point]
  • 14. Coffea + SkyhookDM • Implemented a run_parquet_job executor method in Coffea to be able to read from Parquet files using the Arrow Dataset API. This in turn allowed integrating Coffea with SkyhookDM seamlessly.
  • 15. Ongoing Work • Arrow’s memory layout requires internal memory copies to serialize it to a contiguous on-the-wire format, and this has a very high overhead. • Collaborating with the ServiceX and Coffea teams to integrate SkyhookDM into the larger analysis facility ecosystem. [Chart: Sending uncompressed IPC: [6] Serialize Result Table 41.5%, [5] Scan Parquet Data 30.5%, [7] Result Transfer 24.6%, [4] Disk I/O 3.34%, [3] Deserialize Scan Request 0.103%, [1] Stat Fragment 0.0324%, [8] Deserialize Result Table 0.00855%, [2] Serialize Scan Request 0.00511%] [Chart: Sending LZ4 compressed IPC: [5] Scan Parquet Data 48.3%, [6] Serialize Result Table 29.5%, [7] Result Transfer 11.7%, [8] Deserialize Result Table 5.37%, [4] Disk I/O 5.11%, [3] Deserialize Scan Request 0.0513%, [1] Stat Fragment 0.0304%, [2] Serialize Scan Request 0.00771%]
  • 16. Check out our work • Github Repository: https://github.com/uccross/skyhookdm-arrow • Docker containers: https://github.com/uccross/skyhookdm-arrow-docker • ArXiv Paper: https://arxiv.org/pdf/2105.09894.pdf • Coffea Skyhook Plugin: https://github.com/CoffeaTeam/coffea/tree/master/docker/coffea_rados_parquet • Several bugs found and reported in Apache Arrow: ARROW-13161, ARROW-13126, ARROW-13088.
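
The offloading idea from the Problem / Our Solution and Network Traffic slides can be illustrated with a small toy model. This is only a sketch under stated assumptions: it does not use Ceph or Arrow, and `StorageNode`, `read_all`, `scan`, and `cells` are hypothetical names invented here. It counts how many values cross the "network" when the client filters the data itself versus when each storage node applies the filter and column projection before replying.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, int]


@dataclass
class StorageNode:
    """Hypothetical stand-in for a storage node holding one data fragment."""
    rows: List[Row]

    def read_all(self) -> List[Row]:
        # No offloading: every row and every column is shipped to the client.
        return list(self.rows)

    def scan(self, columns: List[str], predicate: Callable[[Row], bool]) -> List[Row]:
        # Offloading: filter and project on the node, ship only the result.
        return [{c: row[c] for c in columns} for row in self.rows if predicate(row)]


def cells(rows: List[Row]) -> int:
    """Count transferred values as a crude proxy for bytes on the wire."""
    return sum(len(row) for row in rows)


def wanted(row: Row) -> bool:
    # Keeps the rows with x < 10, which is 10% of this toy data.
    return row["x"] < 10


# Assumed toy data: 4 nodes, 1,000 rows each, 3 columns per row.
nodes = [
    StorageNode([{"id": n * 1000 + i, "x": i % 100, "y": i} for i in range(1000)])
    for n in range(4)
]

# Client-side processing: move everything, then filter and project on the client.
client_side_moved = sum(cells(node.read_all()) for node in nodes)

# Storage-side processing: each node returns only matching rows and columns.
offloaded = [node.scan(["id", "x"], wanted) for node in nodes]
offloaded_moved = sum(cells(part) for part in offloaded)

print("values moved without offloading:", client_side_moved)
print("values moved with offloading:   ", offloaded_moved)
```

In this toy setup the offloaded scan moves 800 values instead of 12,000. The model ignores compression and serialization cost, which the slides identify as significant in the real system.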