SlideShare a Scribd company logo
1 of 25
Download to read offline
| © Copyright 2023, InfluxData
How InfluxDB and
Dremio Leverage the
Apache Ecosystem
July 2023
| © Copyright 2023, InfluxData
2
Anais Dotis-Georgiou
Developer Advocate
LinkedIn
| © Copyright 2023, InfluxData
3
Alex Merced
Developer Advocate
LinkedIn
| © Copyright 2023, InfluxData
InfluxDB & Apache
4
| © Copyright 2023, InfluxData
InfluxData Reference Architecture
| © Copyright 2023, InfluxData
| © Copyright 2023, InfluxData
InfluxDB’s new storage engine is built on
● Rust
● Apache Arrow
● Apache Parquet
● Arrow Flight
● DataFusion
| © Copyright 2023, InfluxData
Arrow and InfluxDB
7
| © Copyright 2023, InfluxData
| © Copyright 2023, InfluxData
Arrow and InfluxDB
● Overcomes memory challenges. Fine grain memory control.
● Efficient data exchange provides for data analytics
performance benefits
● Arrow provides the in-memory columnar storage while
Parquet will provide the column-oriented data file format on
disk.
● Broad ecosystem compatibility
8
| © Copyright 2023, InfluxData
| © Copyright 2023, InfluxData
Sidebar–Advantages of Columnar Data Storage
9
measurement1,tag1=tagvalue1 field1=1i timestamp1
measurement1,tag1=tagvalue2 field1=2i timestamp2
measurement1,tag2=tagvalue3 field1=3i timestamp3
measurement1,tag1=tagvalue1,tag2=tagvalue3 field1=4i,field2=true timestamp4
measurement1, field1=1i timestamp5
| © Copyright 2023, InfluxData
| © Copyright 2023, InfluxData
Sidebar–Advantages of Columnar Data Storage
10
Name: measurement1
field1 field2 tag1 tag2 tag3 time
1i null tagvalue1 null null timestamp1
2i null tagvalue2 null null timestamp2
3i null null tagvalue3 null timestamp3
4i true tagvalue1 tagvalue3 tagvalue4 timestamp4
1i null null null null timestamp5
| © Copyright 2023, InfluxData
| © Copyright 2023, InfluxData
Sidebar–Advantages of Columnar Data Storage
11
1i 2i 3i 4i 1i
null null null true null
tagvalue1 tagvalue2 null tagvalue1 null
null null tagvalue3 tagvalue3 null
null null null tagvalue4 null
timestamp1 timestamp2 timestamp3 timestamp4 timestamp5
1i, 2i, 3i, 4i, 1i;
null, null, null, true, null;
tagvalue1, tagvalue2, null, tagvalue1, null;
null, null, null, tagvalue3, tagvalue3, null;
null, null, null, tagvalue4, null;
timestamp1, timestamp2, timestamp3, timestamp4, timestamp5.
| © Copyright 2023, InfluxData
DataFusion and InfluxDB
12
| © Copyright 2023, InfluxData
| © Copyright 2023, InfluxData
DataFusion and InfluxDB
● DataFusion provides the query, processing, and
transformation of the data.
● Enables fast query against data stored on cheaper object
store.
● DataFusion supports both a postgres compatible SQL and
DataFrame API.
13
| © Copyright 2023, InfluxData
Parquet and InfluxDB
14
| © Copyright 2023, InfluxData
| © Copyright 2023, InfluxData
Parquet and InfluxDB
● Efficient compression and interoperability with ML and
analytics tooling.
● Take up little disk space and are fast to scan.
● Enable bulk data export and import.
● Offer interoperability with modern ML and analytics tools.
15
| © Copyright 2023, InfluxData
Dremio & Apache
16
| © Copyright 2023, InfluxData
| © Copyright 2023, InfluxData
What is Dremio
17
Test-Drive Dremio
| © Copyright 2023, InfluxData
| © Copyright 2023, InfluxData
Dremio and
18
Apache Calcite is an open-source framework for building databases and data management systems. It allows developers to
design systems that can parse, optimize, and execute SQL queries. Apache Calcite provides components to implement
SQL-like query processing capabilities in various systems.
- SQL Parser
- Query Optimizer
- Adapter Framework
- Pluggable Query Execution
- Standards Support
Enables Dremio’s
SQL Interface
| © Copyright 2023, InfluxData
| © Copyright 2023, InfluxData
Dremio and
19
- Standard In-memory format
- Standard Transport Protocol
(Arrow Flight)
- Pre-Compiled Query Logic on
Arrow Buffers (Gandiva)
Enables Dremio’s
Data Processing
| © Copyright 2023, InfluxData
| © Copyright 2023, InfluxData
Dremio and Apache
20
- Metadata Layer for Data Lake
Datasets
- Enables Time Travel
- Enables Schema Evolution
- Enables Partition Evolution
Enables Dremio Data
Reflections and More
| © Copyright 2023, InfluxData
| © Copyright 2023, InfluxData
Dremio and Apache
21
- Binary
- Columnar
- Open Standard
- Predicate Pushdown
Enables Dremio’s
Caching Layer
| © Copyright 2023, InfluxData
Get Help + Resources!
22
Forums: community.influxdata.com
Slack: influxcommunity.slack.com
GH: github.com/InfluxCommunity
Book: awesome.influxdata.com
Docs: docs.influxdata.com
Blogs: influxdata.com/blog
InfluxDB University: influxdata.com/university
| © Copyright 2023, InfluxData
23
R E S O U R C E S
Webinar: InfluxDB 3.0 Basics
Learn the basics of InfluxDB, covering core time series
concepts, an overview of the platform, and practical
guidance for installation and usage.
Save 96% on Data Storage Costs:
bit.ly/3NJEcGZ
Run a Proof of Concept:
InfluxDB Resources
Watch now
Learn more
Learn more bit.ly/3puRsal
bit.ly/445mRy4
| © Copyright 2023, InfluxData
Questions?
24
| © Copyright 2023, InfluxData
T H A N K Y O U

More Related Content

Similar to Best Practices for Leveraging the Apache Arrow Ecosystem

2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...Timothy Spann
 
002 Introducing Neo4j 5 for Administrators - NODES2022 AMERICAS Beginner 2 - ...
002 Introducing Neo4j 5 for Administrators - NODES2022 AMERICAS Beginner 2 - ...002 Introducing Neo4j 5 for Administrators - NODES2022 AMERICAS Beginner 2 - ...
002 Introducing Neo4j 5 for Administrators - NODES2022 AMERICAS Beginner 2 - ...Neo4j
 
Session 8 - Creating Data Processing Services | Train the Trainers Program
Session 8 - Creating Data Processing Services | Train the Trainers ProgramSession 8 - Creating Data Processing Services | Train the Trainers Program
Session 8 - Creating Data Processing Services | Train the Trainers ProgramFIWARE
 
WP VERITAS InfoScale Storage and Dockers Intro - v8
WP VERITAS InfoScale Storage and Dockers Intro - v8WP VERITAS InfoScale Storage and Dockers Intro - v8
WP VERITAS InfoScale Storage and Dockers Intro - v8Rajagopal Vaideeswaran
 
Node.js and the MySQL Document Store
Node.js and the MySQL Document StoreNode.js and the MySQL Document Store
Node.js and the MySQL Document StoreRui Quelhas
 
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data AnalyticsSupersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analyticsmason_s
 
Start Automating InfluxDB Deployments at the Edge with balena
Start Automating InfluxDB Deployments at the Edge with balena Start Automating InfluxDB Deployments at the Edge with balena
Start Automating InfluxDB Deployments at the Edge with balena InfluxData
 
The Path To Success With Graph Database and Analytics
The Path To Success With Graph Database and AnalyticsThe Path To Success With Graph Database and Analytics
The Path To Success With Graph Database and AnalyticsNeo4j
 
Accelerating Spark with Kubernetes
Accelerating Spark with KubernetesAccelerating Spark with Kubernetes
Accelerating Spark with KubernetesAlluxio, Inc.
 
IRJET- An Approach for Implemented Secure Proxy Server for Multi-User Searcha...
IRJET- An Approach for Implemented Secure Proxy Server for Multi-User Searcha...IRJET- An Approach for Implemented Secure Proxy Server for Multi-User Searcha...
IRJET- An Approach for Implemented Secure Proxy Server for Multi-User Searcha...IRJET Journal
 
01282016 Aerospike-Docker webinar
01282016 Aerospike-Docker webinar01282016 Aerospike-Docker webinar
01282016 Aerospike-Docker webinarAerospike, Inc.
 
Solving Enterprise Data Challenges with Apache Arrow
Solving Enterprise Data Challenges with Apache ArrowSolving Enterprise Data Challenges with Apache Arrow
Solving Enterprise Data Challenges with Apache ArrowWes McKinney
 
Interactive Analytics with the Starburst Presto + Alluxio stack for the Cloud
Interactive Analytics with the Starburst Presto + Alluxio stack for the CloudInteractive Analytics with the Starburst Presto + Alluxio stack for the Cloud
Interactive Analytics with the Starburst Presto + Alluxio stack for the CloudAlluxio, Inc.
 
Predictable Big Data Performance in Real-time
Predictable Big Data Performance in Real-timePredictable Big Data Performance in Real-time
Predictable Big Data Performance in Real-timeAerospike, Inc.
 
Meetup Streaming Data Pipeline Development
Meetup Streaming Data Pipeline DevelopmentMeetup Streaming Data Pipeline Development
Meetup Streaming Data Pipeline DevelopmentTimothy Spann
 
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023ssuser73434e
 
Cleversafe august 2016
Cleversafe august 2016Cleversafe august 2016
Cleversafe august 2016Joe Krotz
 
GSJUG: Mastering Data Streaming Pipelines 09May2023
GSJUG: Mastering Data Streaming Pipelines 09May2023GSJUG: Mastering Data Streaming Pipelines 09May2023
GSJUG: Mastering Data Streaming Pipelines 09May2023Timothy Spann
 
Presto talk @ Global AI conference 2018 Boston
Presto talk @ Global AI conference 2018 BostonPresto talk @ Global AI conference 2018 Boston
Presto talk @ Global AI conference 2018 Bostonkbajda
 
OSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdf
OSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdfOSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdf
OSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdfTimothy Spann
 

Similar to Best Practices for Leveraging the Apache Arrow Ecosystem (20)

2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
 
002 Introducing Neo4j 5 for Administrators - NODES2022 AMERICAS Beginner 2 - ...
002 Introducing Neo4j 5 for Administrators - NODES2022 AMERICAS Beginner 2 - ...002 Introducing Neo4j 5 for Administrators - NODES2022 AMERICAS Beginner 2 - ...
002 Introducing Neo4j 5 for Administrators - NODES2022 AMERICAS Beginner 2 - ...
 
Session 8 - Creating Data Processing Services | Train the Trainers Program
Session 8 - Creating Data Processing Services | Train the Trainers ProgramSession 8 - Creating Data Processing Services | Train the Trainers Program
Session 8 - Creating Data Processing Services | Train the Trainers Program
 
WP VERITAS InfoScale Storage and Dockers Intro - v8
WP VERITAS InfoScale Storage and Dockers Intro - v8WP VERITAS InfoScale Storage and Dockers Intro - v8
WP VERITAS InfoScale Storage and Dockers Intro - v8
 
Node.js and the MySQL Document Store
Node.js and the MySQL Document StoreNode.js and the MySQL Document Store
Node.js and the MySQL Document Store
 
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data AnalyticsSupersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics
 
Start Automating InfluxDB Deployments at the Edge with balena
Start Automating InfluxDB Deployments at the Edge with balena Start Automating InfluxDB Deployments at the Edge with balena
Start Automating InfluxDB Deployments at the Edge with balena
 
The Path To Success With Graph Database and Analytics
The Path To Success With Graph Database and AnalyticsThe Path To Success With Graph Database and Analytics
The Path To Success With Graph Database and Analytics
 
Accelerating Spark with Kubernetes
Accelerating Spark with KubernetesAccelerating Spark with Kubernetes
Accelerating Spark with Kubernetes
 
IRJET- An Approach for Implemented Secure Proxy Server for Multi-User Searcha...
IRJET- An Approach for Implemented Secure Proxy Server for Multi-User Searcha...IRJET- An Approach for Implemented Secure Proxy Server for Multi-User Searcha...
IRJET- An Approach for Implemented Secure Proxy Server for Multi-User Searcha...
 
01282016 Aerospike-Docker webinar
01282016 Aerospike-Docker webinar01282016 Aerospike-Docker webinar
01282016 Aerospike-Docker webinar
 
Solving Enterprise Data Challenges with Apache Arrow
Solving Enterprise Data Challenges with Apache ArrowSolving Enterprise Data Challenges with Apache Arrow
Solving Enterprise Data Challenges with Apache Arrow
 
Interactive Analytics with the Starburst Presto + Alluxio stack for the Cloud
Interactive Analytics with the Starburst Presto + Alluxio stack for the CloudInteractive Analytics with the Starburst Presto + Alluxio stack for the Cloud
Interactive Analytics with the Starburst Presto + Alluxio stack for the Cloud
 
Predictable Big Data Performance in Real-time
Predictable Big Data Performance in Real-timePredictable Big Data Performance in Real-time
Predictable Big Data Performance in Real-time
 
Meetup Streaming Data Pipeline Development
Meetup Streaming Data Pipeline DevelopmentMeetup Streaming Data Pipeline Development
Meetup Streaming Data Pipeline Development
 
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
 
Cleversafe august 2016
Cleversafe august 2016Cleversafe august 2016
Cleversafe august 2016
 
GSJUG: Mastering Data Streaming Pipelines 09May2023
GSJUG: Mastering Data Streaming Pipelines 09May2023GSJUG: Mastering Data Streaming Pipelines 09May2023
GSJUG: Mastering Data Streaming Pipelines 09May2023
 
Presto talk @ Global AI conference 2018 Boston
Presto talk @ Global AI conference 2018 BostonPresto talk @ Global AI conference 2018 Boston
Presto talk @ Global AI conference 2018 Boston
 
OSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdf
OSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdfOSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdf
OSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdf
 

More from InfluxData

Announcing InfluxDB Clustered
Announcing InfluxDB ClusteredAnnouncing InfluxDB Clustered
Announcing InfluxDB ClusteredInfluxData
 
How Bevi Uses InfluxDB and Grafana to Improve Predictive Maintenance and Redu...
How Bevi Uses InfluxDB and Grafana to Improve Predictive Maintenance and Redu...How Bevi Uses InfluxDB and Grafana to Improve Predictive Maintenance and Redu...
How Bevi Uses InfluxDB and Grafana to Improve Predictive Maintenance and Redu...InfluxData
 
Power Your Predictive Analytics with InfluxDB
Power Your Predictive Analytics with InfluxDBPower Your Predictive Analytics with InfluxDB
Power Your Predictive Analytics with InfluxDBInfluxData
 
How Teréga Replaces Legacy Data Historians with InfluxDB, AWS and IO-Base
How Teréga Replaces Legacy Data Historians with InfluxDB, AWS and IO-Base How Teréga Replaces Legacy Data Historians with InfluxDB, AWS and IO-Base
How Teréga Replaces Legacy Data Historians with InfluxDB, AWS and IO-Base InfluxData
 
Build an Edge-to-Cloud Solution with the MING Stack
Build an Edge-to-Cloud Solution with the MING StackBuild an Edge-to-Cloud Solution with the MING Stack
Build an Edge-to-Cloud Solution with the MING StackInfluxData
 
Meet the Founders: An Open Discussion About Rewriting Using Rust
Meet the Founders: An Open Discussion About Rewriting Using RustMeet the Founders: An Open Discussion About Rewriting Using Rust
Meet the Founders: An Open Discussion About Rewriting Using RustInfluxData
 
How a Heat Treating Plant Ensures Tight Process Control and Exceptional Quali...
How a Heat Treating Plant Ensures Tight Process Control and Exceptional Quali...How a Heat Treating Plant Ensures Tight Process Control and Exceptional Quali...
How a Heat Treating Plant Ensures Tight Process Control and Exceptional Quali...InfluxData
 
How Delft University's Engineering Students Make Their EV Formula-Style Race ...
How Delft University's Engineering Students Make Their EV Formula-Style Race ...How Delft University's Engineering Students Make Their EV Formula-Style Race ...
How Delft University's Engineering Students Make Their EV Formula-Style Race ...InfluxData
 
Introducing InfluxDB’s New Time Series Database Storage Engine
Introducing InfluxDB’s New Time Series Database Storage EngineIntroducing InfluxDB’s New Time Series Database Storage Engine
Introducing InfluxDB’s New Time Series Database Storage EngineInfluxData
 
Streamline and Scale Out Data Pipelines with Kubernetes, Telegraf, and InfluxDB
Streamline and Scale Out Data Pipelines with Kubernetes, Telegraf, and InfluxDBStreamline and Scale Out Data Pipelines with Kubernetes, Telegraf, and InfluxDB
Streamline and Scale Out Data Pipelines with Kubernetes, Telegraf, and InfluxDBInfluxData
 
Ward Bowman [PTC] | ThingWorx Long-Term Data Storage with InfluxDB | InfluxDa...
Ward Bowman [PTC] | ThingWorx Long-Term Data Storage with InfluxDB | InfluxDa...Ward Bowman [PTC] | ThingWorx Long-Term Data Storage with InfluxDB | InfluxDa...
Ward Bowman [PTC] | ThingWorx Long-Term Data Storage with InfluxDB | InfluxDa...InfluxData
 
Scott Anderson [InfluxData] | New & Upcoming Flux Features | InfluxDays 2022
Scott Anderson [InfluxData] | New & Upcoming Flux Features | InfluxDays 2022Scott Anderson [InfluxData] | New & Upcoming Flux Features | InfluxDays 2022
Scott Anderson [InfluxData] | New & Upcoming Flux Features | InfluxDays 2022InfluxData
 
Steinkamp, Clifford [InfluxData] | Closing Thoughts | InfluxDays 2022
Steinkamp, Clifford [InfluxData] | Closing Thoughts | InfluxDays 2022Steinkamp, Clifford [InfluxData] | Closing Thoughts | InfluxDays 2022
Steinkamp, Clifford [InfluxData] | Closing Thoughts | InfluxDays 2022InfluxData
 
Steinkamp, Clifford [InfluxData] | Welcome to InfluxDays 2022 - Day 2 | Influ...
Steinkamp, Clifford [InfluxData] | Welcome to InfluxDays 2022 - Day 2 | Influ...Steinkamp, Clifford [InfluxData] | Welcome to InfluxDays 2022 - Day 2 | Influ...
Steinkamp, Clifford [InfluxData] | Welcome to InfluxDays 2022 - Day 2 | Influ...InfluxData
 
Steinkamp, Clifford [InfluxData] | Closing Thoughts Day 1 | InfluxDays 2022
Steinkamp, Clifford [InfluxData] | Closing Thoughts Day 1 | InfluxDays 2022Steinkamp, Clifford [InfluxData] | Closing Thoughts Day 1 | InfluxDays 2022
Steinkamp, Clifford [InfluxData] | Closing Thoughts Day 1 | InfluxDays 2022InfluxData
 
Paul Dix [InfluxData] The Journey of InfluxDB | InfluxDays 2022
Paul Dix [InfluxData] The Journey of InfluxDB | InfluxDays 2022Paul Dix [InfluxData] The Journey of InfluxDB | InfluxDays 2022
Paul Dix [InfluxData] The Journey of InfluxDB | InfluxDays 2022InfluxData
 
Jay Clifford [InfluxData] | Tips & Tricks for Analyzing IIoT in Real-Time | I...
Jay Clifford [InfluxData] | Tips & Tricks for Analyzing IIoT in Real-Time | I...Jay Clifford [InfluxData] | Tips & Tricks for Analyzing IIoT in Real-Time | I...
Jay Clifford [InfluxData] | Tips & Tricks for Analyzing IIoT in Real-Time | I...InfluxData
 
Brian Gilmore [InfluxData] | Use Case: IIoT Overview | InfluxDays 2022
Brian Gilmore [InfluxData] | Use Case: IIoT Overview | InfluxDays 2022Brian Gilmore [InfluxData] | Use Case: IIoT Overview | InfluxDays 2022
Brian Gilmore [InfluxData] | Use Case: IIoT Overview | InfluxDays 2022InfluxData
 
Gilmore, Palani [InfluxData] | Use Case: Monitoring / Observability | InfluxD...
Gilmore, Palani [InfluxData] | Use Case: Monitoring / Observability | InfluxD...Gilmore, Palani [InfluxData] | Use Case: Monitoring / Observability | InfluxD...
Gilmore, Palani [InfluxData] | Use Case: Monitoring / Observability | InfluxD...InfluxData
 
Gilmore, Palani [InfluxData] | Use Case: Crypto & Fintech | InfluxDays 2022
Gilmore, Palani [InfluxData] | Use Case: Crypto & Fintech | InfluxDays 2022Gilmore, Palani [InfluxData] | Use Case: Crypto & Fintech | InfluxDays 2022
Gilmore, Palani [InfluxData] | Use Case: Crypto & Fintech | InfluxDays 2022InfluxData
 

More from InfluxData (20)

Announcing InfluxDB Clustered
Announcing InfluxDB ClusteredAnnouncing InfluxDB Clustered
Announcing InfluxDB Clustered
 
How Bevi Uses InfluxDB and Grafana to Improve Predictive Maintenance and Redu...
How Bevi Uses InfluxDB and Grafana to Improve Predictive Maintenance and Redu...How Bevi Uses InfluxDB and Grafana to Improve Predictive Maintenance and Redu...
How Bevi Uses InfluxDB and Grafana to Improve Predictive Maintenance and Redu...
 
Power Your Predictive Analytics with InfluxDB
Power Your Predictive Analytics with InfluxDBPower Your Predictive Analytics with InfluxDB
Power Your Predictive Analytics with InfluxDB
 
How Teréga Replaces Legacy Data Historians with InfluxDB, AWS and IO-Base
How Teréga Replaces Legacy Data Historians with InfluxDB, AWS and IO-Base How Teréga Replaces Legacy Data Historians with InfluxDB, AWS and IO-Base
How Teréga Replaces Legacy Data Historians with InfluxDB, AWS and IO-Base
 
Build an Edge-to-Cloud Solution with the MING Stack
Build an Edge-to-Cloud Solution with the MING StackBuild an Edge-to-Cloud Solution with the MING Stack
Build an Edge-to-Cloud Solution with the MING Stack
 
Meet the Founders: An Open Discussion About Rewriting Using Rust
Meet the Founders: An Open Discussion About Rewriting Using RustMeet the Founders: An Open Discussion About Rewriting Using Rust
Meet the Founders: An Open Discussion About Rewriting Using Rust
 
How a Heat Treating Plant Ensures Tight Process Control and Exceptional Quali...
How a Heat Treating Plant Ensures Tight Process Control and Exceptional Quali...How a Heat Treating Plant Ensures Tight Process Control and Exceptional Quali...
How a Heat Treating Plant Ensures Tight Process Control and Exceptional Quali...
 
How Delft University's Engineering Students Make Their EV Formula-Style Race ...
How Delft University's Engineering Students Make Their EV Formula-Style Race ...How Delft University's Engineering Students Make Their EV Formula-Style Race ...
How Delft University's Engineering Students Make Their EV Formula-Style Race ...
 
Introducing InfluxDB’s New Time Series Database Storage Engine
Introducing InfluxDB’s New Time Series Database Storage EngineIntroducing InfluxDB’s New Time Series Database Storage Engine
Introducing InfluxDB’s New Time Series Database Storage Engine
 
Streamline and Scale Out Data Pipelines with Kubernetes, Telegraf, and InfluxDB
Streamline and Scale Out Data Pipelines with Kubernetes, Telegraf, and InfluxDBStreamline and Scale Out Data Pipelines with Kubernetes, Telegraf, and InfluxDB
Streamline and Scale Out Data Pipelines with Kubernetes, Telegraf, and InfluxDB
 
Ward Bowman [PTC] | ThingWorx Long-Term Data Storage with InfluxDB | InfluxDa...
Ward Bowman [PTC] | ThingWorx Long-Term Data Storage with InfluxDB | InfluxDa...Ward Bowman [PTC] | ThingWorx Long-Term Data Storage with InfluxDB | InfluxDa...
Ward Bowman [PTC] | ThingWorx Long-Term Data Storage with InfluxDB | InfluxDa...
 
Scott Anderson [InfluxData] | New & Upcoming Flux Features | InfluxDays 2022
Scott Anderson [InfluxData] | New & Upcoming Flux Features | InfluxDays 2022Scott Anderson [InfluxData] | New & Upcoming Flux Features | InfluxDays 2022
Scott Anderson [InfluxData] | New & Upcoming Flux Features | InfluxDays 2022
 
Steinkamp, Clifford [InfluxData] | Closing Thoughts | InfluxDays 2022
Steinkamp, Clifford [InfluxData] | Closing Thoughts | InfluxDays 2022Steinkamp, Clifford [InfluxData] | Closing Thoughts | InfluxDays 2022
Steinkamp, Clifford [InfluxData] | Closing Thoughts | InfluxDays 2022
 
Steinkamp, Clifford [InfluxData] | Welcome to InfluxDays 2022 - Day 2 | Influ...
Steinkamp, Clifford [InfluxData] | Welcome to InfluxDays 2022 - Day 2 | Influ...Steinkamp, Clifford [InfluxData] | Welcome to InfluxDays 2022 - Day 2 | Influ...
Steinkamp, Clifford [InfluxData] | Welcome to InfluxDays 2022 - Day 2 | Influ...
 
Steinkamp, Clifford [InfluxData] | Closing Thoughts Day 1 | InfluxDays 2022
Steinkamp, Clifford [InfluxData] | Closing Thoughts Day 1 | InfluxDays 2022Steinkamp, Clifford [InfluxData] | Closing Thoughts Day 1 | InfluxDays 2022
Steinkamp, Clifford [InfluxData] | Closing Thoughts Day 1 | InfluxDays 2022
 
Paul Dix [InfluxData] The Journey of InfluxDB | InfluxDays 2022
Paul Dix [InfluxData] The Journey of InfluxDB | InfluxDays 2022Paul Dix [InfluxData] The Journey of InfluxDB | InfluxDays 2022
Paul Dix [InfluxData] The Journey of InfluxDB | InfluxDays 2022
 
Jay Clifford [InfluxData] | Tips & Tricks for Analyzing IIoT in Real-Time | I...
Jay Clifford [InfluxData] | Tips & Tricks for Analyzing IIoT in Real-Time | I...Jay Clifford [InfluxData] | Tips & Tricks for Analyzing IIoT in Real-Time | I...
Jay Clifford [InfluxData] | Tips & Tricks for Analyzing IIoT in Real-Time | I...
 
Brian Gilmore [InfluxData] | Use Case: IIoT Overview | InfluxDays 2022
Brian Gilmore [InfluxData] | Use Case: IIoT Overview | InfluxDays 2022Brian Gilmore [InfluxData] | Use Case: IIoT Overview | InfluxDays 2022
Brian Gilmore [InfluxData] | Use Case: IIoT Overview | InfluxDays 2022
 
Gilmore, Palani [InfluxData] | Use Case: Monitoring / Observability | InfluxD...
Gilmore, Palani [InfluxData] | Use Case: Monitoring / Observability | InfluxD...Gilmore, Palani [InfluxData] | Use Case: Monitoring / Observability | InfluxD...
Gilmore, Palani [InfluxData] | Use Case: Monitoring / Observability | InfluxD...
 
Gilmore, Palani [InfluxData] | Use Case: Crypto & Fintech | InfluxDays 2022
Gilmore, Palani [InfluxData] | Use Case: Crypto & Fintech | InfluxDays 2022Gilmore, Palani [InfluxData] | Use Case: Crypto & Fintech | InfluxDays 2022
Gilmore, Palani [InfluxData] | Use Case: Crypto & Fintech | InfluxDays 2022
 

Recently uploaded

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 

Recently uploaded (20)

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 

Best Practices for Leveraging the Apache Arrow Ecosystem

  • 1. | © Copyright 2023, InfluxData How InfluxDB and Dremio Leverage the Apache Ecosystem July 2023
  • 2. | © Copyright 2023, InfluxData 2 Anais Dotis-Georgiou Developer Advocate LinkedIn
  • 3. | © Copyright 2023, InfluxData 3 Alex Merced Developer Advocate LinkedIn
  • 4. | © Copyright 2023, InfluxData InfluxDB & Apache 4
  • 5. | © Copyright 2023, InfluxData InfluxData Reference Architecture
  • 6. | © Copyright 2023, InfluxData | © Copyright 2023, InfluxData InfluxDB’s new storage engine is built on ● Rust ● Apache Arrow ● Apache Parquet ● Arrow Flight ● DataFusion
  • 7. | © Copyright 2023, InfluxData Arrow and InfluxDB 7
  • 8. | © Copyright 2023, InfluxData | © Copyright 2023, InfluxData Arrow and InfluxDB ● Overcomes memory challenges. Fine grain memory control. ● Efficient data exchange provides for data analytics performance benefits ● Arrow provides the in-memory columnar storage while Parquet will provide the column-oriented data file format on disk. ● Broad ecosystem compatibility 8
  • 9. | © Copyright 2023, InfluxData | © Copyright 2023, InfluxData Sidebar–Advantages of Columnar Data Storage 9 measurement1,tag1=tagvalue1 field1=1i timestamp1 measurement1,tag1=tagvalue2 field1=2i timestamp2 measurement1,tag2=tagvalue3 field1=3i timestamp3 measurement1,tag1=tagvalue1,tag2=tagvalue3 field1=4i,field2=true timestamp4 measurement1, field1=1i timestamp5
  • 10. | © Copyright 2023, InfluxData | © Copyright 2023, InfluxData Sidebar–Advantages of Columnar Data Storage 10 Name: measurement1 field1 field2 tag1 tag2 tag3 time 1i null tagvalue1 null null timestamp1 2i null tagvalue2 null null timestamp2 3i null null tagvalue3 null timestamp3 4i true tagvalue1 tagvalue3 tagvalue4 timestamp4 1i null null null null timestamp5
  • 11. | © Copyright 2023, InfluxData | © Copyright 2023, InfluxData Sidebar–Advantages of Columnar Data Storage 11 1i 2i 3i 4i 1i null null null true null tagvalue1 tagvalue2 null tagvalue1 null null null tagvalue3 tagvalue3 null null null null tagvalue4 null timestamp1 timestamp2 timestamp3 timestamp4 timestamp5 1i, 2i, 3i, 4i, 1i; null, null, null, true, null; tagvalue1, tagvalue2, null, tagvalue1, null; null, null, null, tagvalue3, tagvalue3, null; null, null, null, tagvalue4, null; timestamp1, timestamp2, timestamp3, timestamp4, timestamp5.
  • 12. | © Copyright 2023, InfluxData DataFusion and InfluxDB 12
  • 13. | © Copyright 2023, InfluxData | © Copyright 2023, InfluxData DataFusion and InfluxDB ● DataFusion provides the query, processing, and transformation of the data. ● Enables fast query against data stored on cheaper object store. ● DataFusion supports both a postgres compatible SQL and DataFrame API. 13
  • 14. | © Copyright 2023, InfluxData Parquet and InfluxDB 14
  • 15. | © Copyright 2023, InfluxData | © Copyright 2023, InfluxData Parquet and InfluxDB ● Efficient compression and interoperability with ML and analytics tooling. ● Take up little disk space and are fast to scan. ● Enable bulk data export and import. ● Offer interoperability with modern ML and analytics tools. 15
  • 16. | © Copyright 2023, InfluxData Dremio & Apache 16
  • 17. | © Copyright 2023, InfluxData | © Copyright 2023, InfluxData What is Dremio 17 Test-Drive Dremio
  • 18. | © Copyright 2023, InfluxData | © Copyright 2023, InfluxData Dremio and 18 Apache Calcite is an open-source framework for building databases and data management systems. It allows developers to design systems that can parse, optimize, and execute SQL queries. Apache Calcite provides components to implement SQL-like query processing capabilities in various systems. - SQL Parser - Query Optimizer - Adapter Framework - Pluggable Query Execution - Standards Support Enables Dremio’s SQL Interface
  • 19. | © Copyright 2023, InfluxData | © Copyright 2023, InfluxData Dremio and 19 - Standard In-memory format - Standard Transport Protocol (Arrow Flight) - Pre-Compiled Query Logic on Arrow Buffers (Gandiva) Enables Dremio’s Data Processing
  • 20. | © Copyright 2023, InfluxData | © Copyright 2023, InfluxData Dremio and Apache 20 - Metadata Layer for Data Lake Datasets - Enables Time Travel - Enables Schema Evolution - Enables Partition Evolution Enables Dremio Data Reflections and More
  • 21. | © Copyright 2023, InfluxData | © Copyright 2023, InfluxData Dremio and Apache 21 - Binary - Columnar - Open Standard - Predicate Pushdown Enables Dremio’s Caching Layer
  • 22. | © Copyright 2023, InfluxData Get Help + Resources! 22 Forums: community.influxdata.com Slack: influxcommunity.slack.com GH: github.com/InfluxCommunity Book: awesome.influxdata.com Docs: docs.influxdata.com Blogs: influxdata.com/blog InfluxDB University: influxdata.com/university
  • 23. | © Copyright 2023, InfluxData 23 R E S O U R C E S Webinar: InfluxDB 3.0 Basics Learn the basics of InfluxDB, covering core time series concepts, an overview of the platform, and practical guidance for installation and usage. Save 96% on Data Storage Costs: bit.ly/3NJEcGZ Run a Proof of Concept: InfluxDB Resources Watch now Learn more Learn more bit.ly/3puRsal bit.ly/445mRy4
  • 24. | © Copyright 2023, InfluxData Questions? 24
  • 25. | © Copyright 2023, InfluxData T H A N K Y O U