© 2017 Dremio Corporation @DremioHQ
Apache Arrow: In Theory, In Practice
Apache Arrow Meetup @ Enigma
November 1, 2017
Jacques Nadeau
Who?
Jacques Nadeau
@intjesus
• CTO & Co-founder of Dremio
• Apache member
• VP Apache Arrow
• PMCs: Arrow, Calcite, Incubator, Heron (incubating)
Arrow In Theory
The Apache Arrow Project
• Started Feb 17, 2016 (Apache TLP)
• Focused on Columnar In-Memory Analytics
1. 10-100x speedup on many workloads
2. Common data layer enables companies to choose best-of-breed systems
3. Designed to work with any programming language
4. Support for both relational and complex data as-is
Committers & Contributors from: Calcite, Cassandra, Deeplearning4j, Drill, Hadoop, HBase, Ibis, Impala, Kudu, Pandas, Parquet, Phoenix, Spark, Storm, R
Arrow goals
• Well-documented and cross-language compatible
• Designed to take advantage of modern CPU characteristics
• Embeddable in execution engines, storage layers, etc.
• Interoperable
Arrow In Memory Columnar Format
• Shredded Nested Data Structures
• Randomly Accessible
• Maximize CPU throughput
– Pipelining
– SIMD
– cache locality
• Scatter/gather I/O
High Performance Sharing & Interchange
Before
• Each system has its own internal memory format
• 70-80% CPU wasted on serialization and deserialization
• Functionality duplication and unnecessary conversions
With Arrow
• All systems utilize the same memory format
• No overhead for cross-system communication
• Projects can share functionality (e.g., Parquet-to-Arrow reader)
Common Processing Libraries (soon)
• High-performance canonical processing for Arrow data structures
– Sort
– Hash Table
– Dictionary encoding
– Predicate application & masking
• Multiple media and processing paradigms
– Memory, NVMe, 3D XPoint
– x86, GPU, Many Core (Phi), etc.
Arrow Data Types
• Scalars
– Boolean
– [u]int[8,16,32,64], Decimal, Float, Double
– Date, Time, Timestamp
– UTF8 String, Binary
• Complex
– Struct, Map, List
• Advanced
– Union (sparse & dense)
Common Message Pattern
• Schema Negotiation
– Logical Description of structure
– Identification of dictionary-encoded nodes
• Dictionary Batch
– Dictionary ID, Values
• Record Batch
– Batches of records up to 64K
– Leaf nodes up to 2B values
[Diagram: Schema Negotiation, then Dictionary Batch (0..N batches), then Record Batch (1..N batches)]
Columnar data
persons = [{
  name: 'Joe',
  age: 18,
  phones: [
    '555-111-1111',
    '555-222-2222'
  ]
}, {
  name: 'Jack',
  age: 37,
  phones: [ '555-333-3333' ]
}]
Record Batch Construction
[Diagram: the message stream (Schema Negotiation, Dictionary Batch, Record Batches), with one Record Batch expanded into its buffers]
Record batch buffers:
• data header (describes offsets into data)
• name (bitmap), name (offset), name (data)
• age (bitmap), age (data)
• phones (bitmap), phones (list offset), phones (offset), phones (data)
{
  name: 'Joe',
  age: 18,
  phones: [
    '555-111-1111',
    '555-222-2222'
  ]
}
Each box (vector) is contiguous memory
The entire record batch is contiguous on wire
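To make the buffer list concrete, here is a minimal sketch of how the Joe/Jack records could be laid out as offsets, data, and validity buffers. Plain Java arrays stand in for Arrow buffers, and the class and method names (ColumnarLayoutSketch, encode, listOffsets, validity) are hypothetical, not part of any Arrow library.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Illustrative only: plain arrays stand in for Arrow buffers. */
final class ColumnarLayoutSketch {

    /** A variable-width column: value i lives in data[offsets[i] .. offsets[i + 1]). */
    static final class VarWidthColumn {
        final int[] offsets;
        final byte[] data;

        VarWidthColumn(int[] offsets, byte[] data) {
            this.offsets = offsets;
            this.data = data;
        }

        String get(int i) {
            return new String(data, offsets[i], offsets[i + 1] - offsets[i], StandardCharsets.UTF_8);
        }
    }

    /** Packs non-null strings back to back and records where each one ends. */
    static VarWidthColumn encode(List<String> values) {
        int[] offsets = new int[values.size() + 1];
        ByteArrayOutputStream data = new ByteArrayOutputStream();
        for (int i = 0; i < values.size(); i++) {
            byte[] bytes = values.get(i).getBytes(StandardCharsets.UTF_8);
            data.write(bytes, 0, bytes.length);
            offsets[i + 1] = data.size();
        }
        return new VarWidthColumn(offsets, data.toByteArray());
    }

    /** List offsets: row i owns child entries [offsets[i], offsets[i + 1]). */
    static int[] listOffsets(List<? extends List<?>> rows) {
        int[] offsets = new int[rows.size() + 1];
        for (int i = 0; i < rows.size(); i++) {
            offsets[i + 1] = offsets[i] + rows.get(i).size();
        }
        return offsets;
    }

    /** Validity bitmap: bit i (least-significant bit first) is 1 when value i is non-null. */
    static byte[] validity(List<?> values) {
        byte[] bits = new byte[(values.size() + 7) / 8];
        for (int i = 0; i < values.size(); i++) {
            if (values.get(i) != null) {
                bits[i >> 3] |= (byte) (1 << (i & 7));
            }
        }
        return bits;
    }
}
```

Encoding the names "Joe" and "Jack" gives one data buffer of seven bytes plus an offsets array; the phones column needs a list-offsets array on top of the string offsets because each person owns a range of phone entries.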
Arrow Components
• Core Libraries
• Within Project Integrations
• Extended Integrations
Arrow: Core Components
• Java Library
• C++ Library
• C Library
• Ruby Library
• Python Library
• JavaScript Library
In-Project Arrow Building Blocks/Applications
• Plasma:
– Shared memory caching layer, originally created in Ray
• Feather:
– Fast ephemeral format for movement of data between
R/Python
• ArrowRest (soon):
– RPC/IPC interchange library (active development)
• ArrowRoutines (soon):
– Common data manipulation components
Arrow Integrations
• Pandas
– Move seamlessly to and from Arrow as a means for communication, serialization, and fast processing
• GOAI (GPU Open Analytics Initiative), libgdf and the GPU dataframe
– Leverages Arrow as internal representation
• Parquet
– Read and write Parquet quickly to/from Arrow. The C++ library builds directly on Arrow.
• Spark
– Supports conversion to Pandas via Arrow construction using Arrow Java Library
• Dremio
– OSS project, Sabot Engine executes entirely on Arrow memory
Arrow In Practice
Real World Arrow: Sabot
• Dremio is an OSS data fabric product
• The core engine is “Sabot”
– Built entirely on top of Arrow libraries, runs in the JVM
Sabot: Arrow in Practice
• Memory Management
• Vector sizing
• RPC Communication
• Filtering/Sorting
• Row-wise Algorithms: Hash Tables
• Vector-wise Algorithms
– Aggregation
– Unnesting
Practice: Memory Management
• Arrow includes a chunk-based managed allocator
– Built on top of Netty’s jemalloc implementation
• Create a tree of allocators
– Support both reservation and local limits
– Include leak detection, debug ownership logs and location accounting
• Size allocators (reservation and maximum) based on workload management, when to trigger spilling, etc.
• All Arrow Vectors hold one or more off-heap buffers
• Everything is manually reference managed
– Makes some code more complex
– Provides a strong understanding of memory availability
[Diagram: allocator tree. Root (res: 0, max: 20g) has children Job 1 and Job 2 (each res: 10m, max: 1g). Each job has Task 1 (res: 1m, max: -1) and Task 2 (res: 5m, max: 20m). An IntVector holds a Validity buffer and a Data buffer.]
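The reservation and local-limit idea can be sketched as pure accounting, using the numbers from the allocator tree above. This is not Arrow's allocator: AllocatorTreeSketch and its methods are hypothetical, no real memory is touched, and the rule that a child only charges its parent for bytes beyond its reservation is an assumption made for illustration.

```java
/** Accounting-only sketch of a hierarchical allocator; no real memory is touched. */
final class AllocatorTreeSketch {
    private final String name;
    private final AllocatorTreeSketch parent;
    private final long reservation; // bytes set aside in the parent when this node is created
    private final long limit;       // local maximum in bytes, or -1 for no local limit
    private long used;              // bytes charged to this node by itself and its children

    AllocatorTreeSketch(String name, AllocatorTreeSketch parent, long reservation, long limit) {
        this.name = name;
        this.parent = parent;
        this.reservation = reservation;
        this.limit = limit;
        if (parent != null && reservation > 0) {
            parent.allocate(reservation); // reserve up front so these bytes are guaranteed later
        }
    }

    AllocatorTreeSketch child(String childName, long childReservation, long childLimit) {
        return new AllocatorTreeSketch(childName, this, childReservation, childLimit);
    }

    /** Charges bytes here; the parent is charged only for the part beyond the reservation. */
    void allocate(long bytes) {
        if (limit >= 0 && used + bytes > limit) {
            throw new IllegalStateException(name + ": local limit exceeded");
        }
        long fromParent = excess(used + bytes) - excess(used);
        if (parent != null && fromParent > 0) {
            parent.allocate(fromParent); // throws before our own state changes
        }
        used += bytes;
    }

    /** Returns bytes, handing back to the parent whatever had been borrowed beyond the reservation. */
    void release(long bytes) {
        if (bytes > used) {
            throw new IllegalStateException(name + ": releasing more than allocated");
        }
        long toParent = excess(used) - excess(used - bytes);
        used -= bytes;
        if (parent != null && toParent > 0) {
            parent.release(toParent);
        }
    }

    private long excess(long amount) {
        return Math.max(0L, amount - reservation);
    }

    long used() {
        return used;
    }
}
```

With a root of max 20g, a job of res 10m / max 1g, and a task of res 5m / max 20m, a 14m task allocation raises the root's usage by only 4m, because the first 10m were already reserved by the job.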
Practice: Memory Management Cont’d
• Data moves through data pipelines
• Ownership needs to be clear (to plan/control execution)
– Allocated memory can be referenced by many consumers
– One allocator ‘owns’ the accounted memory
– Consumers can use a Vector’s transfer capability to leverage transfer semantics and hand off data ownership
https://goo.gl/HN9nCH
[Diagram: Scan (res: 10m, max: 1g) transfers ownership to Aggregate (res: 10m, max: 1g)]
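A hedged sketch of what handing off ownership means for accounting: the bytes stay where they are and only the charge moves from one operator's account to another. Account and Buffer are hypothetical stand-ins, not Arrow classes, and charging the target before crediting the source is a design choice assumed here so that the bytes are never unaccounted.

```java
/** Accounting-only sketch of ownership transfer between two operators. */
final class OwnershipTransferSketch {

    /** A per-operator memory account with a hard limit. */
    static final class Account {
        final String name;
        final long limit;
        long used;

        Account(String name, long limit) {
            this.name = name;
            this.limit = limit;
        }

        void charge(long bytes) {
            if (used + bytes > limit) {
                throw new IllegalStateException(name + ": limit exceeded");
            }
            used += bytes;
        }

        void credit(long bytes) {
            used -= bytes;
        }
    }

    /** A buffer that is accounted against exactly one owner at a time. */
    static final class Buffer {
        final long size;
        private Account owner;

        Buffer(long size, Account owner) {
            owner.charge(size);
            this.size = size;
            this.owner = owner;
        }

        Account owner() {
            return owner;
        }

        /** Moves the accounting to the target; the underlying bytes are not copied. */
        void transferTo(Account target) {
            if (target == owner) {
                return;
            }
            target.charge(size); // may throw; in that case the current owner keeps the buffer
            owner.credit(size);
            owner = target;
        }
    }
}
```

If the receiving operator has no room, the transfer fails and the producer still owns (and still pays for) the buffer, which keeps the accounting consistent.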
Practice: Vector Sizing
• Batches are the smallest work unit
• Batches can hold 1 to 64k records
• Optimization problem
– Larger batches improve processing performance
– Larger batches cause pipeline problems
– Smaller batches cause more heap overhead
• Execution-level adaptive resizing for wide records (100s to 1000s of fields)
[Diagram: a narrow batch holds 4095 records; a wide batch holds 127 records]
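One way to picture adaptive resizing is a rule that picks the record count from an estimated row width. The sketch below is an assumption, not Sabot's actual policy: the byte budget, the 2^n - 1 rounding, and the name recordsPerBatch are all hypothetical, and the inputs in the note after the code were chosen only for illustration.

```java
/** Hypothetical sizing rule: fit a byte budget, then round down to 2^n - 1 records. */
final class BatchSizingSketch {

    static int recordsPerBatch(long targetBatchBytes, int estimatedRowBytes, int maxRecords) {
        if (targetBatchBytes <= 0 || estimatedRowBytes <= 0 || maxRecords <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        // How many rows fit in the budget, capped at the configured maximum and never below 1.
        long fit = Math.min(maxRecords, targetBatchBytes / estimatedRowBytes);
        fit = Math.max(1L, fit);
        // Largest power of two that is <= fit + 1; subtracting 1 gives a count of the form 2^n - 1.
        long powerOfTwo = Long.highestOneBit(fit + 1);
        return (int) Math.max(1L, powerOfTwo - 1);
    }
}
```

Under these assumptions, a 1 MiB budget with 256-byte rows and a cap of 4095 yields 4095 records, while 8000-byte rows yield 127 records.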
Practice: RPC Communication
• Goals
– Leverage Gathering Writes
– Ensure connection resilience despite memory pressure
• Custom Netty-based RPC protocol
– All messages include a structured (proto) message and a sidecar memory message
– Out-of-memory is handled at message consumption time, producing a fail-ack as opposed to a connection disconnect
Send: Listener listener, Proto structuredMessage, ArrowBuf... dataBodies
https://goo.gl/XWyrc1
[Diagram: the structured message and the data bodies are sent together with a gathering write]
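A gathering write can be illustrated with the JDK's own GatheringByteChannel, without Netty. The frame layout below (two int lengths, then the structured bytes, then the bodies) is an assumption for the sketch and not Dremio's wire format; GatheringWriteSketch, send, and frameBytes are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.GatheringByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class GatheringWriteSketch {

    /** Writes [int structuredLength][int bodyLength][structured][bodies...] without merging buffers. */
    static long send(GatheringByteChannel channel, ByteBuffer structured, ByteBuffer... bodies)
            throws IOException {
        int bodyLength = 0;
        for (ByteBuffer body : bodies) {
            bodyLength += body.remaining();
        }
        ByteBuffer prefix = ByteBuffer.allocate(8);
        prefix.putInt(structured.remaining());
        prefix.putInt(bodyLength);
        prefix.flip();

        ByteBuffer[] parts = new ByteBuffer[bodies.length + 2];
        parts[0] = prefix;
        parts[1] = structured;
        System.arraycopy(bodies, 0, parts, 2, bodies.length);

        long total = 8L + structured.remaining() + bodyLength;
        long written = 0;
        while (written < total) {
            // Each call hands the whole buffer array to the channel; the data bodies are
            // never copied into one combined buffer first.
            written += channel.write(parts);
        }
        return written;
    }

    /** Demo helper: sends one frame to a temporary file and returns the bytes that landed there. */
    static byte[] frameBytes(ByteBuffer structured, ByteBuffer... bodies) {
        try {
            Path file = Files.createTempFile("frame", ".bin");
            try {
                try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
                    send(channel, structured, bodies);
                }
                return Files.readAllBytes(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The same send method works against a socket channel; a file channel is used in the helper only so the sketch can be run without a network peer.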
Filtering & Sorting
• For filtering and sorting, create a selection vector
– Describes valid values and ordering without reorganizing underlying data.
– Two bytes for filter purposes (single-batch horizon)
– Four bytes for sort purposes (multi-batch horizon)
• 4-byte selection vector pattern is frequently used by other operations
• 6-byte selection vector used in some cases (to manage wide batches)
• Defer copy/compacting
[Diagram: sv2 holds record indexes (2, 14, 35, 99); sv4 holds batch-record pairs (1-2, 2-14, 1-35, 2-99)]
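A small sketch of both selection vector shapes, using plain arrays. The filter builds a two-byte vector of surviving record indexes and leaves the data untouched. The sv4 packing (batch index in the upper 16 bits, record index in the lower 16) is an assumption made for illustration; the names here are hypothetical.

```java
import java.util.Arrays;
import java.util.function.LongPredicate;

final class SelectionVectorSketch {

    /** sv2: two-byte indexes of the rows that pass the predicate; the data is not moved. */
    static char[] filter(long[] values, LongPredicate keep) {
        if (values.length > 65_536) {
            throw new IllegalArgumentException("a two-byte index addresses at most 65536 rows");
        }
        char[] sv2 = new char[values.length];
        int selected = 0;
        for (int i = 0; i < values.length; i++) {
            if (keep.test(values[i])) {
                sv2[selected++] = (char) i;
            }
        }
        return Arrays.copyOf(sv2, selected);
    }

    /** sv4 (assumed packing): batch index in the upper 16 bits, record index in the lower 16. */
    static int sv4(int batch, int record) {
        return (batch << 16) | (record & 0xFFFF);
    }

    static int batchOf(int sv4Entry) {
        return sv4Entry >>> 16;
    }

    static int recordOf(int sv4Entry) {
        return sv4Entry & 0xFFFF;
    }
}
```

Sorting across batches then amounts to reordering an int array of sv4 entries, with the copy into compact output deferred until an operator actually needs it.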
Row-wise Algorithms: Hash Table + Aggregation
When generating a hash table, maintaining a columnar structure for keys slows hash insertion and lookup
• Break data into fixed and variable values
• Use consistent fixed value insertion
• Use dynamic variable output
• Pivot data
– Vector at a time for fixed values
– All variable-width vectors at the same time for variable values
• Hash and equality as bucket of bytes
• Avoids excessive indirection
• Maintain aggregation tables in columnar format
[Diagram: key columns are pivoted ("pivot fixed", "pivot variable") into a Fixed Block Vector, whose rows are validity|fixed1|fixed2|varlen|varoffset, and a Variable Block Vector of repeating len|data entries, and are later unpivoted. Aggregation Tables (Partial-agg1 through Partial-agg6) are written by direct projection.]
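The fixed-width half of the pivot can be sketched with a heap ByteBuffer. The row layout used here (a 4-byte validity word followed by 8 bytes per key column) is an assumption for illustration, as are the names; the point is that each row becomes one contiguous run of bytes, so equality is a byte comparison.

```java
import java.nio.ByteBuffer;

/** Sketch: pivot fixed-width key columns into row-wise blocks, one vector at a time. */
final class PivotSketch {

    /** Assumed row layout: [4-byte validity bits][8 bytes per key column]. */
    static int blockLength(int columnCount) {
        return 4 + 8 * columnCount;
    }

    static ByteBuffer pivot(long[][] columns, boolean[][] valid, int rowCount) {
        if (columns.length > 32) {
            throw new IllegalArgumentException("this sketch keeps validity in a single int");
        }
        int blockLength = blockLength(columns.length);
        ByteBuffer block = ByteBuffer.allocate(blockLength * rowCount); // zero-filled
        for (int c = 0; c < columns.length; c++) {          // vector at a time
            for (int r = 0; r < rowCount; r++) {
                if (!valid[c][r]) {
                    continue; // null: leave the zeros so equal keys have identical bytes
                }
                int base = r * blockLength;
                block.putInt(base, block.getInt(base) | (1 << c));
                block.putLong(base + 4 + 8 * c, columns[c][r]);
            }
        }
        return block;
    }

    /** Equality as a bucket of bytes: two keys match when their row blocks are identical. */
    static boolean rowsEqual(ByteBuffer block, int blockLength, int a, int b) {
        for (int i = 0; i < blockLength; i++) {
            if (block.get(a * blockLength + i) != block.get(b * blockLength + i)) {
                return false;
            }
        }
        return true;
    }
}
```

Leaving null slots zeroed is what makes the byte-wise comparison safe: two rows with the same nulls compare equal regardless of what garbage sat in the source vector.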
Example Pivot Code
• Takes advantage of runs of nullable values, working a word at a time
– ALL_SET, NONE_SET, SOME_SET
• Ensure canonicalization of values based on validity
– Typically validity data is zeroed on allocation; other vectors are not.
– Vector data has to be cleared when pivoting nulled values
• Conditions are avoided
static void pivot8Bytes(
    VectorPivotDef def,
    FixedBlockVector fixedBlock,
    final int count
) {
  ...
  // decode word at a time.
  while (srcDataAddr < finalWordAddr) {
    final long bitValues = PlatformDependent.getLong(srcBitsAddr);
    if (bitValues == NONE_SET) {
      // noop (all nulls).
      bitTargetAddr += (WORD_BITS * blockLength);
      valueTargetAddr += (WORD_BITS * blockLength);
      srcDataAddr += (WORD_BITS * EIGHT_BYTE);
    } else if (bitValues == ALL_SET) {
      // all set, set the bit values using a constant OR. Independently set the data values without transformation.
      final int bitVal = 1 << bitOffset;
      for (int i = 0; i < WORD_BITS; i++, bitTargetAddr += blockLength) {
        PlatformDependent.putInt(bitTargetAddr, PlatformDependent.getInt(bitTargetAddr) | bitVal);
      }
      for (int i = 0; i < WORD_BITS; i++, valueTargetAddr += blockLength, srcDataAddr += EIGHT_BYTE) {
        PlatformDependent.putLong(valueTargetAddr, PlatformDependent.getLong(srcDataAddr));
      }
    } else {
      // some nulls, some not, update each value to zero or the value, depending on the null bit.
      for (int i = 0; i < WORD_BITS; i++, bitTargetAddr += blockLength, valueTargetAddr += blockLength, srcDataAddr += E
        final int bitVal = ((int) (bitValues >>> i)) & 1;
        PlatformDependent.putInt(bitTargetAddr, PlatformDependent.getInt(bitTargetAddr) | (bitVal << bitOffset));
        PlatformDependent.putLong(valueTargetAddr, PlatformDependent.getLong(srcDataAddr) * bitVal);
      }
    }
    srcBitsAddr += WORD_BYTES;
  }
https://goo.gl/EgLy9r
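The word-at-a-time idea can also be shown with ordinary arrays instead of raw addresses. This is a simplified analogue and not the code behind the link: it copies values into a zeroed output, treating each 64-bit validity word as NONE_SET, ALL_SET, or mixed, and it uses a multiply by the validity bit to avoid a per-value branch. The class and method names are hypothetical.

```java
/** Safe-array analogue of word-at-a-time validity handling. */
final class WordAtATimeSketch {
    static final long NONE_SET = 0L;
    static final long ALL_SET = -1L;

    /** Returns a copy of values in which every null slot (validity bit 0) holds zero. */
    static long[] canonicalize(long[] validityWords, long[] values) {
        long[] out = new long[values.length]; // zeroed on allocation
        for (int w = 0; w < validityWords.length; w++) {
            int start = w * 64;
            int len = Math.min(64, values.length - start);
            if (len <= 0) {
                break;
            }
            long bits = validityWords[w];
            if (bits == NONE_SET) {
                continue; // all nulls: the zeros are already in place
            } else if (bits == ALL_SET && len == 64) {
                System.arraycopy(values, start, out, start, 64); // no per-value checks
            } else {
                for (int i = 0; i < len; i++) {
                    long bit = (bits >>> i) & 1L;
                    out[start + i] = values[start + i] * bit; // branch-free: value or zero
                }
            }
        }
        return out;
    }
}
```

The two fast paths matter because real columns tend to have long runs that are entirely non-null or entirely null, so most words never reach the per-bit loop.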
Practice: Parallel Columnar Shuffle
• Partition data based on a hashed key
• Avoid excessive batch buffering cost
• Steps
1. Consolidate node-local streams
• Allow reduction in buffering memory in large clusters (k*n instead of n*n)
2. Hash the key(s) to determine bucket offset
• Generate bucket vector
3. Pre-allocate output buffers at target output size
• Sized depending on narrow/wide batches
4. Do columnar copies per vector
• Written in a C-like, low-overhead pattern with no abstraction
[Diagram: on Node 1 and Node 2, Thread 1 and Thread 2 each generate a bucket vector and do bucket-level copies; each node's output is mux'd and sent with a gathering write]
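Steps 2 to 4 can be sketched on plain arrays: hash the key column into a bucket vector once, then reuse that vector to copy each column into pre-sized per-bucket outputs. The hash function and the names below are assumptions for illustration, not the engine's implementation.

```java
/** Sketch of hash partitioning done column by column. */
final class ColumnarShuffleSketch {

    /** Step 2: one bucket id per row, computed once from the key column. */
    static int[] bucketVector(long[] keys, int buckets) {
        int[] bucketOf = new int[keys.length];
        for (int i = 0; i < keys.length; i++) {
            bucketOf[i] = Math.floorMod(Long.hashCode(keys[i]), buckets);
        }
        return bucketOf;
    }

    /** Steps 3 and 4: size each output exactly, then copy one column using the bucket vector. */
    static long[][] partition(long[] column, int[] bucketOf, int buckets) {
        int[] counts = new int[buckets];
        for (int b : bucketOf) {
            counts[b]++;
        }
        long[][] out = new long[buckets][];
        for (int b = 0; b < buckets; b++) {
            out[b] = new long[counts[b]]; // pre-allocated at its final size
        }
        int[] cursor = new int[buckets];
        for (int i = 0; i < column.length; i++) {
            int b = bucketOf[i];
            out[b][cursor[b]++] = column[i];
        }
        return out;
    }
}
```

The bucket vector is computed from the key once and then applied to every column in turn, so the inner copy loop touches a single column's memory at a time.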
Example Copier Code
• Two-byte offset addresses (sv2)
• Tight loop focused on
• Far more efficient than runtime-generated row-wise code
– Also has faster startup time
public void copy(long offsetAddr, int count) {
  final List<ArrowBuf> sourceBuffers = source.getFieldBuffers();
  targetAlt.allocateNew(count);
  final List<ArrowBuf> targetBuffers = target.getFieldBuffers();
  // offsetAddr points at the first two-byte sv2 entry; max is one past the last entry.
  final long max = offsetAddr + count * STEP_SIZE;
  final long srcAddr = sourceBuffers.get(VALUE_BUFFER_ORDINAL).memoryAddress();
  long dstAddr = targetBuffers.get(VALUE_BUFFER_ORDINAL).memoryAddress();
  for (long addr = offsetAddr; addr < max; addr += STEP_SIZE, dstAddr += SIZE) {
    // The (char) cast reads the two-byte entry as an unsigned source index.
    PlatformDependent.putLong(dstAddr,
        PlatformDependent.getLong(srcAddr + ((char) PlatformDependent.getShort(addr)) * SIZE));
  }
}
https://goo.gl/fZEsfy
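For readers who want to run the same loop without off-heap addresses, here is a bounds-checked analogue using ByteBuffer. It is a sketch under assumptions: STEP_SIZE and SIZE mirror the 2-byte sv2 entry and 8-byte value widths from the listing, while Sv2CopierSketch and its copy signature are hypothetical.

```java
import java.nio.ByteBuffer;

/** Bounds-checked analogue of an sv2-driven copy of 8-byte values. */
final class Sv2CopierSketch {
    static final int STEP_SIZE = 2; // bytes per sv2 entry
    static final int SIZE = 8;      // bytes per value

    /** Copies the values named by the first count sv2 entries into a new, compact buffer. */
    static ByteBuffer copy(ByteBuffer source, ByteBuffer sv2, int count) {
        ByteBuffer target = ByteBuffer.allocate(count * SIZE).order(source.order());
        for (int i = 0; i < count; i++) {
            int index = sv2.getChar(i * STEP_SIZE); // char is unsigned, so indexes reach 65535
            target.putLong(i * SIZE, source.getLong(index * SIZE));
        }
        return target;
    }
}
```

The loop body has no type dispatch and no per-row object, which is the property the bullets above attribute to the hand-written copier.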
Unnesting List Vectors
• Common pattern: a list of objects that needs to be unrolled into separate records.
• Arrow’s representation allows a direct unroll (no inner data copies required)
• Since leaf vectors can be larger (up to 2B values), inner vectors may need to be split apart
– Makes use of SplitAndTransfer necessary
– Keep SplitAndTransfer as cheap as possible
• Noop for fixed data
• Offset rewrite for variable-width vectors, noop for variable data
• Bit rewrite & shifting for Validity vectors
[Diagram: a List Vector holds an OffsetVector and a Struct Vector; the Struct Vector holds the Inner Vectors]
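Two pieces of the unnest can be sketched with offset arrays alone. The first maps each child element to the parent row that owns it, so the inner data is used as-is. The second shows the offset rewrite needed when a variable-width slice is split off: the offsets are rebased to start at zero while the data bytes stay put. The names are hypothetical, and parentIndexes assumes the offsets start at 0.

```java
/** Sketch of unnesting driven purely by offsets. */
final class UnnestSketch {

    /** For each child element, the parent row that owns it (assumes listOffsets[0] == 0). */
    static int[] parentIndexes(int[] listOffsets) {
        int rows = listOffsets.length - 1;
        int[] parent = new int[listOffsets[rows]];
        for (int r = 0; r < rows; r++) {
            for (int c = listOffsets[r]; c < listOffsets[r + 1]; c++) {
                parent[c] = r;
            }
        }
        return parent;
    }

    /** Offset rewrite for a split: offsets for values [start, start + length), rebased to zero. */
    static int[] splitOffsets(int[] offsets, int start, int length) {
        int[] out = new int[length + 1];
        for (int i = 0; i <= length; i++) {
            out[i] = offsets[start + i] - offsets[start];
        }
        return out;
    }
}
```

For the phones column of the Joe/Jack example, list offsets of 0, 2, 3 map the three phone entries to parent rows 0, 0, 1.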
What’s Coming
• Arrow RPC/REST
– Generic way to retrieve data in Arrow format
– Generic way to serve data in Arrow format
– Simplify integrations across the ecosystem
• Arrow Routines
– GPU and LLVM
Get Involved
• Join the community
– dev@arrow.apache.org
– Slack: https://apachearrowslackin.herokuapp.com/
– http://arrow.apache.org
– Follow @ApacheArrow, @DremioHQ, @intjesus

Apache Arrow -- Cross-language development platform for in-memory dataApache Arrow -- Cross-language development platform for in-memory data
Apache Arrow -- Cross-language development platform for in-memory data
 
Stream Processing and Real-Time Data Pipelines
Stream Processing and Real-Time Data PipelinesStream Processing and Real-Time Data Pipelines
Stream Processing and Real-Time Data Pipelines
 
Drill at the Chicago Hug
Drill at the Chicago HugDrill at the Chicago Hug
Drill at the Chicago Hug
 

Recently uploaded

Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
NYGGS Automation Suite
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Crescat
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
Aftab Hussain
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
Max Andersen
 
Graspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code AnalysisGraspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code Analysis
Aftab Hussain
 
E-commerce Application Development Company.pdf
E-commerce Application Development Company.pdfE-commerce Application Development Company.pdf
E-commerce Application Development Company.pdf
Hornet Dynamics
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
Google
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
Philip Schwarz
 
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Mind IT Systems
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
Fermin Galan
 
Empowering Growth with Best Software Development Company in Noida - Deuglo
Empowering Growth with Best Software  Development Company in Noida - DeugloEmpowering Growth with Best Software  Development Company in Noida - Deuglo
Empowering Growth with Best Software Development Company in Noida - Deuglo
Deuglo Infosystem Pvt Ltd
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOMLORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
lorraineandreiamcidl
 
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Łukasz Chruściel
 
Launch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in MinutesLaunch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in Minutes
Roshan Dwivedi
 
Transform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR SolutionsTransform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR Solutions
TheSMSPoint
 
Artificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension FunctionsArtificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension Functions
Octavian Nadolu
 
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
Alina Yurenko
 
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
Łukasz Chruściel
 
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
timtebeek1
 
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI AppAI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
Google
 

Recently uploaded (20)

Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
 
Graspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code AnalysisGraspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code Analysis
 
E-commerce Application Development Company.pdf
E-commerce Application Development Company.pdfE-commerce Application Development Company.pdf
E-commerce Application Development Company.pdf
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
 
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
 
Empowering Growth with Best Software Development Company in Noida - Deuglo
Empowering Growth with Best Software  Development Company in Noida - DeugloEmpowering Growth with Best Software  Development Company in Noida - Deuglo
Empowering Growth with Best Software Development Company in Noida - Deuglo
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOMLORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
 
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
 
Launch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in MinutesLaunch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in Minutes
 
Transform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR SolutionsTransform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR Solutions
 
Artificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension FunctionsArtificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension Functions
 
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
 
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
 
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
 
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI AppAI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
 

Apache Arrow: In Theory, In Practice

  • 1. © 2017 Dremio Corporation @DremioHQ Apache Arrow: In Theory, In Practice Apache Arrow Meetup @ Enigma November 1, 2017 Jacques Nadeau
  • 2. Who?
      Jacques Nadeau (@intjesus)
      • CTO & Co-founder of Dremio
      • Apache member
      • VP Apache Arrow
      • PMCs: Arrow, Calcite, Incubator, Heron (incubating)
  • 3. Arrow In Theory
  • 4. The Apache Arrow Project
      • Started Feb 17, 2016 (Apache TLP)
      • Focused on Columnar In-Memory Analytics
        1. 10-100x speedup on many workloads
        2. Common data layer enables companies to choose best of breed systems
        3. Designed to work with any programming language
        4. Support for both relational and complex data as-is
      • Committers & Contributors from: Calcite, Cassandra, Deeplearning4j, Drill, Hadoop, HBase, Ibis, Impala, Kudu, Pandas, Parquet, Phoenix, Spark, Storm, R
  • 5. Arrow goals
      • Well-documented and cross-language compatible
      • Designed to take advantage of modern CPU characteristics
      • Embeddable in execution engines, storage layers, etc.
      • Interoperable
  • 6. Arrow In Memory Columnar Format
      • Shredded Nested Data Structures
      • Randomly Accessible
      • Maximize CPU throughput
        – Pipelining
        – SIMD
        – cache locality
      • Scatter/gather I/O
  • 7. High Performance Sharing & Interchange
      Before:
      • Each system has its own internal memory format
      • 70-80% CPU wasted on serialization and deserialization
      • Functionality duplication and unnecessary conversions
      With Arrow:
      • All systems utilize the same memory format
      • No overhead for cross-system communication
      • Projects can share functionality (e.g. Parquet-to-Arrow reader)
  • 8. Common Processing Libraries (soon)
      • High Performance Canonical processing for Arrow Data Structures
        – Sort
        – Hash Table
        – Dictionary encoding
        – Predicate application & masking
      • Multiple Medium and Processing Paradigms
        – Memory, NVMe, 3D XPoint
        – x86, GPU, Many Core (Phi), etc.
  • 9. Arrow Data Types
      • Scalars
        – Boolean
        – [u]int[8,16,32,64], Decimal, Float, Double
        – Date, Time, Timestamp
        – UTF8 String, Binary
      • Complex
        – Struct, Map, List
      • Advanced
        – Union (sparse & dense)
  • 10. Common Message Pattern
      • Schema Negotiation
        – Logical Description of structure
        – Identification of dictionary encoded Nodes
      • Dictionary Batch
        – Dictionary ID, Values
      • Record Batch
        – Batches of records up to 64K
        – Leaf nodes up to 2B values
      [Diagram: Schema Negotiation, Dictionary Batch, Record Batch, Record Batch, Record Batch; labels: 1..N Batches, 0..N Batches]
  • 11. Columnar data
      persons = [{
        name: 'Joe',
        age: 18,
        phones: [ '555-111-1111', '555-222-2222' ]
      }, {
        name: 'Jack',
        age: 37,
        phones: [ '555-333-3333' ]
      }]
  • 12. Record Batch Construction
      [Diagram: Schema Negotiation, Dictionary Batch, Record Batch, Record Batch, Record Batch]
      Record batch contents for { name: 'Joe', age: 18, phones: [ '555-111-1111', '555-222-2222' ] }:
      • data header (describes offsets into data)
      • name (bitmap), name (offset), name (data)
      • age (bitmap), age (data)
      • phones (bitmap), phones (list offset), phones (offset), phones (data)
      Each box (vector) is contiguous memory.
      The entire record batch is contiguous on wire.
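As a rough illustration of the bitmap, offset and data buffers named on this slide, the sketch below builds them for the `name` column using plain Java arrays. `ColumnarLayoutSketch` and its methods are hypothetical names for this example and are not part of the Arrow Java library; real Arrow vectors hold these buffers off-heap.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not an Arrow API.
final class ColumnarLayoutSketch {

    // Validity bitmap: one bit per value, least-significant bit first; 1 = value present.
    static byte[] validity(String[] values) {
        byte[] bits = new byte[(values.length + 7) / 8];
        for (int i = 0; i < values.length; i++) {
            if (values[i] != null) {
                bits[i / 8] |= (byte) (1 << (i % 8));
            }
        }
        return bits;
    }

    // Offset buffer: offsets[i]..offsets[i + 1] is the byte range of value i in the data buffer.
    static int[] offsets(String[] values) {
        int[] offsets = new int[values.length + 1];
        for (int i = 0; i < values.length; i++) {
            int len = values[i] == null ? 0 : values[i].getBytes(StandardCharsets.UTF_8).length;
            offsets[i + 1] = offsets[i] + len;
        }
        return offsets;
    }

    // Data buffer: all non-null values concatenated as UTF-8 bytes.
    static byte[] data(String[] values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String v : values) {
            if (v != null) {
                byte[] bytes = v.getBytes(StandardCharsets.UTF_8);
                out.write(bytes, 0, bytes.length);
            }
        }
        return out.toByteArray();
    }
}
```

For the two names from the example records, the offsets are 0, 3, 7 and the data buffer holds the bytes of "JoeJack"; reading value i is a slice between two adjacent offsets, which is what makes the column randomly accessible.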
  • 13. Arrow Components
      • Core Libraries
      • Within Project Integrations
      • Extended Integrations
  • 14. Arrow: Core Components
      • Java Library
      • C++ Library
      • C Library
      • Ruby Library
      • Python Library
      • JavaScript Library
  • 15. In-Project Arrow Building Blocks/Applications
      • Plasma:
        – Shared memory caching layer, originally created in Ray
      • Feather:
        – Fast ephemeral format for movement of data between R/Python
      • ArrowRest (soon):
        – RPC/IPC interchange library (active development)
      • ArrowRoutines (soon):
        – Common data manipulation components
  • 16. Arrow Integrations
      • Pandas
        – Move seamlessly to/from Arrow as a means for communication, serialization, fast processing
      • GOAI (GPU Open Analytics Initiative), libgdf and the GPU dataframe
        – Leverages Arrow as internal representation
      • Parquet
        – Read and write Parquet quickly to/from Arrow. C++ library builds directly on Arrow.
      • Spark
        – Supports conversion to Pandas via Arrow construction using Arrow Java Library
      • Dremio
        – OSS project, Sabot Engine executes entirely on Arrow memory
  • 17. Arrow In Practice
  • 18. Real World Arrow: Sabot
      • Dremio is an OSS data fabric product
      • The core engine is “Sabot”
        – Built entirely on top of Arrow libraries, runs in JVM
  • 19. Sabot: Arrow in Practice
      • Memory Management
      • Vector sizing
      • RPC Communication
      • Filtering/Sorting
      • Row-wise Algorithms: Hash Tables
      • Vector-wise Algorithms
        – Aggregation
        – Unnesting
  • 20. Practice: Memory Management
      • Arrow includes chunk-based managed allocator
        – Built on top of Netty’s JEMalloc implementation
      • Create a tree of allocators
        – Support both reservation and local limits
        – Include leak detection, debug ownership logs and location accounting
      • Size allocators (reservation and maximum) based on workload management, when to trigger spilling, etc.
      • All Arrow Vectors hold one or more off-heap buffers
      • Everything is manually reference managed
        – Some code more complex
        – Provides strong memory availability understanding
      [Diagram: allocator tree]
        Root (res: 0, max: 20g)
          Job 1 (res: 10m, max: 1g)
            Task 1 (res: 1m, max: -1)
            Task 2 (res: 5m, max: 20m)
          Job 2 (res: 10m, max: 1g)
            Task 1 (res: 1m, max: -1)
            Task 2 (res: 5m, max: 20m)
      [Diagram: IntVector with Validity and Data buffers]
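The reservation and limit behaviour of an allocator tree can be sketched in a few lines. This is a minimal model under stated assumptions, not Arrow's allocator: `AllocatorSketch` is a hypothetical class, it only counts bytes rather than handing out buffers, and it treats a maximum of -1 as "no local limit", matching the `max: -1` label in the diagram.

```java
// Hypothetical accounting-only model of a parent/child allocator tree; not an Arrow API.
final class AllocatorSketch {
    private final AllocatorSketch parent;
    private final long reservation; // bytes taken from the parent up front
    private final long max;         // local limit, -1 means no local limit
    private long allocated;         // bytes currently accounted to this node and its children

    AllocatorSketch(AllocatorSketch parent, long reservation, long max) {
        this.parent = parent;
        this.reservation = reservation;
        this.max = max;
        // The reservation is charged to the parent immediately, so it is guaranteed later.
        if (parent != null && reservation > 0 && !parent.allocate(reservation)) {
            throw new IllegalStateException("parent cannot satisfy reservation");
        }
    }

    // Returns false (and changes nothing) if any limit on the path to the root would be exceeded.
    boolean allocate(long bytes) {
        if (max >= 0 && allocated + bytes > max) {
            return false;
        }
        // Only the portion beyond the reservation has to be requested from the parent.
        long excessBefore = Math.max(0, allocated - reservation);
        long excessAfter = Math.max(0, allocated + bytes - reservation);
        long fromParent = excessAfter - excessBefore;
        if (parent != null && fromParent > 0 && !parent.allocate(fromParent)) {
            return false;
        }
        allocated += bytes;
        return true;
    }

    // Returns bytes to this node and hands any amount above the reservation back to the parent.
    void release(long bytes) {
        long excessBefore = Math.max(0, allocated - reservation);
        allocated -= bytes;
        long excessAfter = Math.max(0, allocated - reservation);
        if (parent != null && excessBefore > excessAfter) {
            parent.release(excessBefore - excessAfter);
        }
    }

    long allocated() {
        return allocated;
    }
}
```

A failed `allocate` here is the signal a task could use to decide to spill, which is one way to read the slide's point about sizing allocators around workload management.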
  • 21. Practice: Memory Management Cont’d
      • Data moves through data pipelines
      • Ownership needs to be clear (to plan/control execution)
        – Allocated memory can be referenced by many consumers
        – One allocator ‘owns’ the accounted memory
        – Consumers can use Vector’s transfer capability to leverage transfer semantics and hand off data ownership
      https://goo.gl/HN9nCH
      [Diagram: Scan operator (allocator res: 10m, max: 1g) transfers ownership to Aggregate operator (allocator res: 10m, max: 1g)]
  • 22. Practice: Vector Sizing
      • Batches are the smallest work unit
      • Batches of records can be 1..64k records in size.
      • Optimization Problem
        – Larger batches improve processing performance
        – Larger batches cause pipeline problems
        – Smaller batches cause more heap overhead
      • Execution-Level Adaptive Resizing for wide records (100-1000s fields)
      [Diagram: Narrow Batch, 4095 records; Wide Batch, 127 records]
  • 23. Practice: RPC Communication
      • Goals
        – Leverage Gathering Writes
        – Ensure connection resilience despite memory pressure
      • Custom Netty-based RPC protocol
        – All messages include structured (proto) and sidecar memory message
        – Out of memory is handled at message consumption time, ensuring fail-ack as opposed to connection disconnect
      Send: Listener listener, Proto structuredMessage, ArrowBuf... dataBodies
      https://goo.gl/XWyrc1
      [Diagram: Structured message and data bodies combined in a Gathering write]
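A gathering write hands several buffers to the channel in one call, so the structured message and the data bodies never have to be copied into a single contiguous buffer first. The sketch below shows the idea with the JDK's `GatheringByteChannel`; it is not Dremio's Netty-based protocol. `GatheringSendSketch`, the 4-byte length prefix and the temp-file target are assumptions made only so the example is self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.GatheringByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration of a gathering write; the framing is an assumption, not the Sabot wire format.
final class GatheringSendSketch {

    // Writes [length prefix][structured message][data bodies...] without concatenating them first.
    static long send(GatheringByteChannel channel, byte[] structuredMessage, ByteBuffer... dataBodies)
            throws IOException {
        ByteBuffer[] parts = new ByteBuffer[dataBodies.length + 2];
        parts[0] = ByteBuffer.allocate(4).putInt(0, structuredMessage.length);
        parts[1] = ByteBuffer.wrap(structuredMessage);
        long total = 4 + structuredMessage.length;
        for (int i = 0; i < dataBodies.length; i++) {
            parts[i + 2] = dataBodies[i].duplicate(); // leave the caller's position untouched
            total += parts[i + 2].remaining();
        }
        long written = 0;
        while (written < total) {
            written += channel.write(parts); // one call may cover many buffers
        }
        return written;
    }

    // Convenience wrapper for trying the sketch: sends to a temp file and returns the bytes on disk.
    static long sendToTempFile(byte[] structuredMessage, ByteBuffer... dataBodies) {
        try {
            Path file = Files.createTempFile("gathering-send", ".bin");
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
                send(channel, structuredMessage, dataBodies);
            }
            long size = Files.size(file);
            Files.delete(file);
            return size;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```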
  • 24. Filtering & Sorting
      • For filtering and sorting, create a selection vector
        – Describes valid values and ordering without reorganizing underlying data.
        – Two bytes for filter purposes (single batch horizon)
        – Four bytes for sort purposes (multi-batch horizon)
      • 4-Byte selection vector pattern frequently used by other operations
      • 6-Byte selection vector used in some cases (to manage wide batches)
      • Defer copy/compacting
      [Diagram: sv2 = 2, 14, 35, 99; sv4 = 1-2, 2-14, 1-35, 2-99]
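A minimal sketch of the selection-vector idea: the filter records which positions survive instead of moving any data. `SelectionVectorSketch` is a hypothetical class. The four-byte packing shown (batch index in the upper 16 bits, record index in the lower 16) is an assumption chosen to match the `batch-record` pairs in the diagram, not something the slide specifies.

```java
import java.util.Arrays;
import java.util.function.LongPredicate;

// Hypothetical illustration; not the Sabot or Arrow selection vector classes.
final class SelectionVectorSketch {

    // sv2: two-byte indices into a single batch (so the batch must hold at most 65536 records).
    // The underlying values array is left untouched.
    static char[] filter(long[] values, LongPredicate keep) {
        char[] sv2 = new char[values.length];
        int selected = 0;
        for (int i = 0; i < values.length; i++) {
            if (keep.test(values[i])) {
                sv2[selected++] = (char) i;
            }
        }
        return Arrays.copyOf(sv2, selected);
    }

    // sv4 entry (assumed encoding): batch index in the upper 16 bits, record index in the lower 16.
    static int sv4Entry(int batchIndex, int recordIndex) {
        return (batchIndex << 16) | (recordIndex & 0xFFFF);
    }

    static int batchOf(int sv4Entry) {
        return sv4Entry >>> 16;
    }

    static int recordOf(int sv4Entry) {
        return sv4Entry & 0xFFFF;
    }
}
```

Sorting across batches then amounts to reordering the four-byte entries; the copy or compaction of the actual values can be deferred until a downstream operator needs contiguous data.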
  • 25. Row-wise Algorithms: Hash Table + Aggregation
      For generating a hash table, maintaining a columnar structure for keys slows hashing insertion and lookup
      • Break data into fixed and variable values
      • Use consistent fixed value insertion
      • Use dynamic variable output
      • Pivot data
        – Vector at a time for fixed values
        – All variable at same time for variable vectors
      • Hash and equality as bucket of bytes
      • Avoids excessive indirection
      • Maintain Aggregation tables in columnar format
      [Diagram: input vectors are pivoted ("pivot fixed", "pivot variable") into a Fixed Block Vector with rows of validity|fixed1|fixed2|varlen|varoffset and a Variable Block Vector of len|data|len|data|...; Aggregation Tables hold Partial-agg1 through Partial-agg6; outputs are produced by "unpivot" and "direct projection"]
  • 26. Example Pivot Code
      • Takes advantage of runs of nullable values, working a word at a time
        – ALL_SET, NONE_SET, SOME_SET
      • Ensure canonicalization of values based on validity
        – Typically validity data is zeroed on allocation, other vectors are not.
        – Vector data has to be cleared when pivoting nulled values
      • Conditions are avoided
      https://goo.gl/EgLy9r

      static void pivot8Bytes(
          VectorPivotDef def,
          FixedBlockVector fixedBlock,
          final int count) {
        ...
        // decode word at a time.
        while (srcDataAddr < finalWordAddr) {
          final long bitValues = PlatformDependent.getLong(srcBitsAddr);
          if (bitValues == NONE_SET) {
            // noop (all nulls): only advance the target and source addresses.
            bitTargetAddr += (WORD_BITS * blockLength);
            valueTargetAddr += (WORD_BITS * blockLength);
            srcDataAddr += (WORD_BITS * EIGHT_BYTE);
          } else if (bitValues == ALL_SET) {
            // all set, set the bit values using a constant OR.
            // Independently set the data values without transformation.
            final int bitVal = 1 << bitOffset;
            for (int i = 0; i < WORD_BITS; i++, bitTargetAddr += blockLength) {
              PlatformDependent.putInt(bitTargetAddr, PlatformDependent.getInt(bitTargetAddr) | bitVal);
            }
            for (int i = 0; i < WORD_BITS; i++, valueTargetAddr += blockLength, srcDataAddr += EIGHT_BYTE) {
              PlatformDependent.putLong(valueTargetAddr, PlatformDependent.getLong(srcDataAddr));
            }
          } else {
            // some nulls, some not, update each value to zero or the value, depending on the null bit.
            for (int i = 0; i < WORD_BITS; i++, bitTargetAddr += blockLength, valueTargetAddr += blockLength, srcDataAddr += E
              final int bitVal = ((int) (bitValues >>> i)) & 1;
              PlatformDependent.putInt(bitTargetAddr, PlatformDependent.getInt(bitTargetAddr) | (bitVal << bitOffset));
              // multiplying by the 0/1 validity bit zeroes nulled values without a branch
              PlatformDependent.putLong(valueTargetAddr, PlatformDependent.getLong(srcDataAddr) * bitVal);
            }
          }
          srcBitsAddr += WORD_BYTES;
        }
      }
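The pivot can also be read without the off-heap address arithmetic. The sketch below is a simplified, heap-based variant under stated assumptions: `PivotSketch` is a hypothetical class, each row block is a run of `long` slots with slot 0 reserved for validity bits, and it processes one value at a time rather than a 64-value word at a time. It keeps the branch-free trick of multiplying the value by its 0/1 validity bit so nulled values are written as zero.

```java
// Hypothetical heap-based illustration of pivoting one nullable 8-byte column into row-wise blocks.
final class PivotSketch {

    /**
     * @param values        column values, one long per record (contents of null slots are arbitrary)
     * @param validityWords validity bitmap packed 64 records per long, least-significant bit first
     * @param count         number of records to pivot
     * @param blocks        row-wise output; record i occupies blocks[i * blockLongs .. + blockLongs)
     * @param blockLongs    width of one row block, in longs
     * @param slot          position of this column's value within the row block (1 or greater)
     * @param bitOffset     position of this column's validity bit within slot 0 of the row block
     */
    static void pivot(long[] values, long[] validityWords, int count,
                      long[] blocks, int blockLongs, int slot, int bitOffset) {
        for (int i = 0; i < count; i++) {
            long bit = (validityWords[i >>> 6] >>> (i & 63)) & 1L;
            int row = i * blockLongs;
            blocks[row] |= bit << bitOffset;       // record validity in the row block
            blocks[row + slot] = values[i] * bit;  // branch-free: nulls are canonicalized to zero
        }
    }
}
```

Canonicalizing nulls to zero is what allows the row block to be hashed and compared as a plain bucket of bytes, as the previous slide describes.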
  • 27. Practice: Parallel Columnar Shuffle
      • Partition data based on a hashed key
      • Avoid excessive batch buffering cost
      • Steps
        1. Consolidate node-local streams
           • Allow reduction in buffering memory in large clusters (k*n instead of n*n)
        2. Hash the key(s) to determine bucket offset
           • Generate bucket vector
        3. Pre-allocate output buffers at target output size
           • Sized depending on narrow/wide batches
        4. Do columnar copies per vector
           • Written in C-like low overhead pattern with no abstraction
      [Diagram: Node 1 (Thread 1, Thread 2, Mux'd) generates the bucket vector, does bucket-level copies, then a Gathering Write to Node 2 (Thread 1, Thread 2)]
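Steps 2 to 4 can be sketched on heap arrays as follows. `ShuffleSketch` is a hypothetical class, the hash function is simply `Long.hashCode`, and output buffers are sized by an exact count per bucket, which is a simplification of the slide's "pre-allocate at target output size".

```java
// Hypothetical illustration of hash partitioning with a bucket vector and per-column copies.
final class ShuffleSketch {

    // Step 2: one bucket id per record, computed once and reused for every column.
    static int[] bucketVector(long[] keys, int buckets) {
        int[] bucketOf = new int[keys.length];
        for (int i = 0; i < keys.length; i++) {
            bucketOf[i] = Math.floorMod(Long.hashCode(keys[i]), buckets);
        }
        return bucketOf;
    }

    // Steps 3 and 4: allocate each output once, then copy one column in a tight loop.
    static long[][] copyToBuckets(long[] column, int[] bucketOf, int buckets) {
        int[] counts = new int[buckets];
        for (int b : bucketOf) {
            counts[b]++;
        }
        long[][] out = new long[buckets][];
        for (int b = 0; b < buckets; b++) {
            out[b] = new long[counts[b]];
        }
        int[] next = new int[buckets]; // next free position in each bucket
        for (int i = 0; i < column.length; i++) {
            int b = bucketOf[i];
            out[b][next[b]++] = column[i];
        }
        return out;
    }
}
```

Because the bucket vector is computed once per batch, every additional column costs only one pass of indexed copies, with no per-row dispatch.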
  • 28. Example Copier Code
      • Two byte offset addresses (sv2)
      • Tight loop focused on
      • Far more efficient than runtime-generated row-wise code
        – Also has faster startup time
      https://goo.gl/fZEsfy

      public void copy(long offsetAddr, int count) {
        final List<ArrowBuf> sourceBuffers = source.getFieldBuffers();
        targetAlt.allocateNew(count);
        final List<ArrowBuf> targetBuffers = target.getFieldBuffers();
        // offsetAddr points at the sv2: one two-byte record index per selected record
        final long max = offsetAddr + count * STEP_SIZE;
        final long srcAddr = sourceBuffers.get(VALUE_BUFFER_ORDINAL).memoryAddress();
        long dstAddr = targetBuffers.get(VALUE_BUFFER_ORDINAL).memoryAddress();
        for (long addr = offsetAddr; addr < max; addr += STEP_SIZE, dstAddr += SIZE) {
          // the (char) cast reads the two-byte index as unsigned
          PlatformDependent.putLong(dstAddr,
              PlatformDependent.getLong(srcAddr + ((char) PlatformDependent.getShort(addr)) * SIZE));
        }
      }
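The same loop can be tried without off-heap memory. This sketch swaps the raw addresses for `java.nio.ByteBuffer` offsets; `CopierSketch` is a hypothetical class, and the constants mirror the `STEP_SIZE` and `SIZE` names in the slide under the assumption that they are 2 and 8 bytes for an 8-byte column.

```java
import java.nio.ByteBuffer;

// Hypothetical ByteBuffer-based variant of the sv2 copier; not the Sabot implementation.
final class CopierSketch {
    static final int STEP_SIZE = 2; // bytes per sv2 entry (assumed)
    static final int SIZE = 8;      // bytes per value (assumed 8-byte column)

    // Copies the records selected by sv2 from source into a new, compacted buffer.
    static ByteBuffer copy(ByteBuffer source, ByteBuffer sv2, int count) {
        ByteBuffer target = ByteBuffer.allocate(count * SIZE);
        for (int i = 0; i < count; i++) {
            // mask to read the two-byte index as unsigned, like the (char) cast in the slide
            int index = sv2.getShort(i * STEP_SIZE) & 0xFFFF;
            target.putLong(i * SIZE, source.getLong(index * SIZE));
        }
        return target;
    }
}
```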
  • 29. Unnesting List Vectors
      • Common Pattern: List of objects that want to be unrolled to separate records.
      • Arrow’s representation allows a direct unroll (no inner data copies required)
      • Since leaf vectors can be larger (up to 2B), may need to split apart inner vectors
        – Makes use of SplitAndTransfer necessary
        – SplitAndTransfer as cheap as possible
          • Noop for fixed data
          • Offset rewrite for variable width vectors, noop for variable data
          • Bit rewrite & shifting for Validity vectors
      [Diagram: List Vector containing an OffsetVector and a Struct Vector with Inner Vectors]
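Two pieces of this slide lend themselves to a small sketch on plain arrays: the direct unroll, where the inner vector is used as-is and only a mapping back to the parent row is computed, and the offset rewrite that a split of a variable-width vector needs. `UnnestSketch` and its method names are hypothetical, and the sketch assumes the list offsets start at zero.

```java
// Hypothetical illustration of list unnesting and offset rebasing; not Arrow's SplitAndTransfer.
final class UnnestSketch {

    // Direct unroll: child row c of the inner vector belongs to parent row parents[c].
    // The inner data itself is not copied; listOffsets is assumed to start at 0.
    static int[] parentIndexes(int[] listOffsets) {
        int rows = listOffsets.length - 1;
        int[] parents = new int[listOffsets[rows]];
        for (int row = 0; row < rows; row++) {
            for (int child = listOffsets[row]; child < listOffsets[row + 1]; child++) {
                parents[child] = row;
            }
        }
        return parents;
    }

    // Offset rewrite for a split: offsets for rows [start, start + count), rebased to begin at 0.
    // The data bytes those offsets point into can be shared rather than copied.
    static int[] splitOffsets(int[] offsets, int start, int count) {
        int[] out = new int[count + 1];
        int base = offsets[start];
        for (int i = 0; i <= count; i++) {
            out[i] = offsets[start + i] - base;
        }
        return out;
    }
}
```

With the `phones` lists from the earlier example (offsets 0, 2, 3), the three phone numbers map to parent rows 0, 0, 1, so unrolling yields two records for Joe and one for Jack.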
  • 30. What’s Coming
      • Arrow RPC/REST
        – Generic way to retrieve data in Arrow format
        – Generic way to serve data in Arrow format
        – Simplify integrations across the ecosystem
      • Arrow Routines
        – GPU and LLVM
  • 31. Get Involved
      • Join the community
        – dev@arrow.apache.org
        – Slack: https://apachearrowslackin.herokuapp.com/
        – http://arrow.apache.org
        – Follow @ApacheArrow, @DremioHQ, @intjesus