Marco Neumann
@crepererum
mneumann@influxdata.com
Catalogs
Turning a set of Parquet files
into a data set
© 2021 InfluxData. All rights reserved.
Agenda
1. Requirements
2. Prior Art
3. Solution
Requirements
Parquet Files
2815898179/
  my_db/
    data/
      2020-01/
        0/
          sensors.parquet
          stocks.parquet
        1/
          sensors.parquet
      2020-02/
        0/
          sensors.parquet
        1/
          sensors.parquet
  other_db/
    data/
      2020-01/
        0/
          health.parquet
3837527170/
  my_db/
    data/
      2021-01/
        0/
          stocks.parquet
Without catalog:
● No transactions
● Large scan times
● No easy schema / statistics lookup
Operations
• ✍ Upsert
• 👓 Read
• ⤫ (Soft) Delete
• 🗑 Garbage Collection
• ⏲ Time Travel
• 🛈 Upgrade
⚠ These are specific to InfluxData IOx
🛈 Sets of operations form atomic catalog-level transactions
might create table on-demand
Properties
Easy to implement
Run on AWS, Azure, GCP, in-memory, local FS
Stores:
• Transaction state (= pointers to files)
• “Arrow Cache” (e.g. schemas, statistics)
Can be rebuilt from files
⚠ These are specific to InfluxData IOx
⚠ NO easy atomic “compare+swap” or “create if not exist” everywhere
Writer Federation
Data Producers
Router / Writer / Reader
separate stores / namespaces
Prior Art
Apache Hive
sensors/
  _common_metadata
  year=2020/month=01/
    0.parquet
    1.parquet
  year=2020/month=02/
    0.parquet
stocks/
  _common_metadata
  year=2020/month=01/
    0.parquet
health/
  _common_metadata
  year=2020/month=01/
    0.parquet
● file exists ⇒ part of the dataset
● _common_metadata contains schema (= parquet file w/ 0 rows)
● technically no difference between “table” and “database”
➔ no time travel
➔ LISTing object store expensive for large data sets
➔ no soft delete
➔ no atomic commits
➔ no real multi-writer semantics
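In the Hive layout above, partition values live in the directory names, so a reader can recover them from an object key alone (after paying for the LIST call that finds the key). A minimal sketch, assuming keys shaped like the ones on this slide; `parse_hive_key` and `is_data_file` are hypothetical helpers for illustration, not part of Hive:

```python
from typing import Dict, Tuple


def parse_hive_key(key: str) -> Tuple[str, Dict[str, str], str]:
    """Split a Hive-style object key into (table, partition values, file name).

    Assumed layout (taken from the slide):
    <table>/<column>=<value>/.../<file>
    """
    parts = key.strip("/").split("/")
    table, file_name = parts[0], parts[-1]
    partitions: Dict[str, str] = {}
    for segment in parts[1:-1]:
        column, sep, value = segment.partition("=")
        if not sep:
            raise ValueError(f"not a partition directory: {segment!r}")
        partitions[column] = value
    return table, partitions, file_name


def is_data_file(key: str) -> bool:
    """File exists => part of the data set: there is no other membership test."""
    return key.endswith(".parquet")


table, partitions, file_name = parse_hive_key("sensors/year=2020/month=01/0.parquet")
```

Because membership is decided by existence alone, there is no place to record a soft delete or an atomic multi-file commit, which is the gap the later slides address.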
Apache Iceberg
sensors/
  data/
    2020-01/
      0.parquet
      1.parquet
      2.parquet
    2020-02/
      0.parquet
    2020-03/
      0.parquet
➔ heavily work-in-progress
➔ more focused on:
◆ single data set / table
◆ multiple concurrent writers
➔ complexity
(might be a future candidate)
[Diagram: a chain of snapshots; each snapshot points to a manifest list, which points to one or more manifests. Steps shown: add initial files, add more files, delete file.]
Delta Lake
sensors/
  data/
    2020-01/
      0.parquet
      1.parquet
      2.parquet
    2020-02/
      0.parquet
    2020-03/
      0.parquet
➔ more focused on:
◆ single data set / table
◆ multiple concurrent writers
➔ no non-Java implementation until recently
➔ Rust implementation not feature-complete
000000.json
  Add
  Add
000001.json
  Add
  Add
  Add
000002.json
  Remove
000001.checkpoint.parquet
Solution
Writer-DB-local Multi-Table Transaction Log
2815898179/
  my_db/
    data/
      2020-01/
        0/
          sensors.parquet
          stocks.parquet
        1/
          sensors.parquet
      2020-02/
        0/
          sensors.parquet
        1/
          sensors.parquet
  other_db/
    data/
      2020-01/
        0/
          health.parquet
3837527170/
  my_db/
    data/
      2021-01/
        0/
          stocks.parquet
000000.txn
  Add
  Add
000001.txn
  Add
  Add
  Add
000002.txn
  Remove
000001.ckpt
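The log above can be read as a sequence of Add and Remove actions that is replayed in order; stopping the replay early gives the state at an older revision (time travel). A minimal sketch, assuming actions are plain `(kind, path)` tuples. Which file is added in which transaction is made up for illustration, since the slide only shows the action kinds:

```python
from typing import List, Optional, Set, Tuple

Action = Tuple[str, str]  # ("add" | "remove", path)


def files_at(transactions: List[List[Action]], revision: Optional[int] = None) -> Set[str]:
    """Replay transactions 0..revision (inclusive) and return the live file set.

    With revision=None the whole log is replayed.
    """
    if revision is None:
        revision = len(transactions) - 1
    live: Set[str] = set()
    for actions in transactions[: revision + 1]:
        for kind, path in actions:
            if kind == "add":
                live.add(path)
            elif kind == "remove":
                live.discard(path)
            else:
                raise ValueError(f"unknown action: {kind!r}")
    return live


# Hypothetical contents for 000000.txn, 000001.txn and 000002.txn.
log: List[List[Action]] = [
    [("add", "2020-01/0/sensors.parquet"), ("add", "2020-01/0/stocks.parquet")],
    [
        ("add", "2020-01/1/sensors.parquet"),
        ("add", "2020-02/0/sensors.parquet"),
        ("add", "2020-02/1/sensors.parquet"),
    ],
    [("remove", "2020-01/0/stocks.parquet")],
]

current = files_at(log)         # state after all three transactions
as_of_first = files_at(log, 0)  # time travel to revision 0
```

The removed file stays on the object store until garbage collection, so older revisions remain readable.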
Transaction
List of possible actions:
• Add: path, checksum, Parquet metadata
• Remove: path
• Tombstone
• Upgrade: new format
• … (might be extended)
Serialization done via Protocol Buffers.
Checkpoints aggregate transactions.
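One way to picture "checkpoints aggregate transactions": a checkpoint stores only the Add actions that are still live after a run of transactions, so a reader can start from it and replay just the transactions that follow. The sketch below is an assumption about the shape of the data, not the IOx implementation; it models only Add and Remove, leaves out Tombstone and Upgrade, and uses plain dataclasses in place of the Protocol Buffers messages.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Add:
    path: str
    checksum: str
    parquet_metadata: bytes  # treated as opaque bytes in this sketch


@dataclass(frozen=True)
class Remove:
    path: str


Action = Union[Add, Remove]


def apply_actions(state: Dict[str, Add], actions: List[Action]) -> None:
    """Apply one transaction's actions to the path -> Add mapping in place."""
    for action in actions:
        if isinstance(action, Add):
            state[action.path] = action
        else:
            state.pop(action.path, None)


def make_checkpoint(transactions: List[List[Action]]) -> List[Add]:
    """Fold a run of transactions into the Add actions that are still live."""
    state: Dict[str, Add] = {}
    for actions in transactions:
        apply_actions(state, actions)
    return sorted(state.values(), key=lambda add: add.path)


def load(checkpoint: List[Add], later: List[List[Action]]) -> Dict[str, Add]:
    """Start from a checkpoint, then replay only the transactions after it."""
    state = {add.path: add for add in checkpoint}
    for actions in later:
        apply_actions(state, actions)
    return state
```

Loading from a checkpoint plus the later transactions should give the same state as replaying the full log from the start, which is a useful invariant to test.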
Statistics + Schema
Add action contains Parquet metadata (= schema + statistics)
Apache Thrift Compact Protocol → bytes → Protocol Buffers
➔ Same expressiveness as Parquet
➔ No additional format conversion
Writer Conflicts
Assumptions:
• Conductor provides cluster-wide unique ServerID
• No inter-writer (=global) catalog
Robustness Measures:
• Transaction filename handling:
<transaction counter>/<uuid>.txn
• Transaction contains UUID of previous transaction
• Writers detect “fork” scenario
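The robustness measures above can be sketched as a check over the listed transaction files: two files under the same counter, or a transaction whose previous-UUID link does not match its predecessor, both indicate a fork. This is an illustrative sketch only; `Txn`, `ForkDetected` and `check_chain` are hypothetical names, and the six-digit zero padding of the counter is an assumption.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Txn:
    counter: int
    txn_id: str
    previous_id: Optional[str]  # None for the first transaction

    @property
    def key(self) -> str:
        # Mirrors the <transaction counter>/<uuid>.txn pattern from the slide.
        return f"{self.counter:06d}/{self.txn_id}.txn"


class ForkDetected(Exception):
    """Raised when the transaction history is not a single chain."""


def check_chain(txns: List[Txn]) -> List[Txn]:
    """Return transactions in counter order, or raise if the history forked."""
    by_counter: Dict[int, List[Txn]] = {}
    for txn in txns:
        by_counter.setdefault(txn.counter, []).append(txn)

    chain: List[Txn] = []
    previous: Optional[str] = None
    for counter in sorted(by_counter):
        candidates = by_counter[counter]
        if len(candidates) > 1:
            raise ForkDetected(f"{len(candidates)} transactions share counter {counter}")
        txn = candidates[0]
        if txn.previous_id != previous:
            raise ForkDetected(f"{txn.key} does not link to its predecessor")
        chain.append(txn)
        previous = txn.txn_id
    return chain


first = Txn(0, str(uuid.uuid4()), None)
second = Txn(1, str(uuid.uuid4()), first.txn_id)
# A second writer that wrongly shares the ServerID commits at the same counter.
rival = Txn(1, str(uuid.uuid4()), first.txn_id)
```

Putting the UUID in the file name means two conflicting writers never overwrite each other's transaction, so the conflict stays visible even on stores without an atomic "create if not exist".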
References
• Apache Projects Logos
https://apache.org/logos/
• Apache Iceberg Table Spec
https://iceberg.apache.org/spec/
• Delta Lake
https://databricks.com/blog/2019/08/21/diving-into-delta-lake-unpacking-the-transaction-log.html
https://github.com/delta-io/delta/blob/master/PROTOCOL.md
https://cs.stanford.edu/people/matei/papers/2020/vldb_delta_lake.pdf
• Apache Parquet
https://github.com/apache/parquet-format
• IOx Design
https://github.com/influxdata/influxdb_iox/blob/main/docs/catalog_persistence.md
Thank You