
Catalogs - Turning a Set of Parquet Files into a Data Set


InfluxDB IOx Tech Talks

Placing a Parquet file in an object store is a simple way to persist data. However, storing data across multiple files while supporting upserts, deletions, format upgrades, metadata management, and consistency checks at scale requires a catalog that manages these files. This talk explores the requirements for a catalog for InfluxDB IOx, prior art from the Parquet ecosystem, and the proposed solution.


  1. Catalogs: Turning a set of Parquet files into a data set
     Marco Neumann, @crepererum, mneumann@influxdata.com
  2. Agenda: (1) Requirements, (2) Prior Art, (3) Solution
     © 2021 InfluxData. All rights reserved.
  3. Requirements
  4. Parquet Files
     2815898179/
       my_db/
         data/
           2020-01/
             0/
               sensors.parquet
               stocks.parquet
             1/
               sensors.parquet
           2020-02/
             0/
               sensors.parquet
             1/
               sensors.parquet
       other_db/
         data/
           2020-01/
             0/
               health.parquet
     3837527170/
       my_db/
         data/
           2021-01/
             0/
               stocks.parquet
     Without catalog:
     ● No transactions
     ● Large scan times
     ● No easy schema / statistics lookup
  5. Operations
     • ✍ Upsert
     • 👓 Read
     • ⤫ (Soft) Delete
     • 🗑 Garbage Collection
     • ⏲ Time Travel
     • 🛈 Upgrade
     ⚠ These are specific to InfluxData IOx
     🛈 Sets of operations form atomic catalog-level transactions
     Callout: might create table on demand
  6. Properties
     Easy to implement
     Runs on AWS, Azure, GCP, in-memory, local FS
     Stores:
     • Transaction state (= pointers to files)
     • “Arrow Cache” (e.g. schemas, statistics)
     Can be rebuilt from files
     ⚠ These are specific to InfluxData IOx
     ⚠ NO easy atomic “compare+swap” or “create if not exists” everywhere
  7. Writer Federation
     [Diagram labels: Data Producers; Router / Writer / Reader; separate stores / namespaces]
  8. Prior Art
  9. Apache Hive
     sensors/
       _common_metadata
       year=2020/month=01/
         0.parquet
         1.parquet
       year=2020/month=02/
         0.parquet
     stocks/
       _common_metadata
       year=2020/month=01/
         0.parquet
     health/
       _common_metadata
       year=2020/month=01/
         0.parquet
     ● file exists ⇒ part of the dataset
     ● _common_metadata contains schema (= parquet file w/ 0 rows)
     ● technically no difference between “table” and “database”
     ➔ no time travel
     ➔ LISTing object store expensive for large data sets
     ➔ no soft delete
     ➔ no atomic commits
     ➔ no real multi-writer semantics
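The Hive layout on slide 9 encodes partition values in `key=value` path segments, and membership in the data set is decided purely by listing the store. A minimal Python sketch of that convention follows; the function names `parse_hive_path` and `list_dataset` are hypothetical and the listing is an in-memory stand-in for an object store LIST call.

```python
from pathlib import PurePosixPath


def parse_hive_path(key):
    """Split a Hive-style object key into table, partition values, and file name."""
    parts = PurePosixPath(key).parts
    if len(parts) < 2:
        raise ValueError(f"expected at least <table>/<file>: {key!r}")
    table, *segments, filename = parts
    partitions = {}
    for segment in segments:
        name, sep, value = segment.partition("=")
        if not sep:
            raise ValueError(f"not a key=value partition segment: {segment!r}")
        partitions[name] = value
    return {"table": table, "partitions": partitions, "file": filename}


def list_dataset(keys, table):
    """'File exists => part of the data set': filter a full listing by prefix."""
    prefix = table + "/"
    return sorted(k for k in keys if k.startswith(prefix) and k.endswith(".parquet"))


# Hypothetical listing that mirrors part of the slide.
listing = [
    "sensors/_common_metadata",
    "sensors/year=2020/month=01/0.parquet",
    "sensors/year=2020/month=01/1.parquet",
    "sensors/year=2020/month=02/0.parquet",
    "stocks/year=2020/month=01/0.parquet",
]
```

Because `list_dataset` has to look at every key, the cost grows with the size of the store, which is the "LISTing is expensive" drawback the slide names.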
  10. Apache Iceberg
      sensors/
        data/
          2020-01/
            0.parquet
            1.parquet
            2.parquet
          2020-02/
            0.parquet
          2020-03/
            0.parquet
      ➔ heavily work in progress
      ➔ more focused on:
        ◆ single data set / table
        ◆ multiple concurrent writers
      ➔ complexity (might be a future candidate)
      [Diagram: snapshots, each pointing to a manifest list that points to manifests; steps shown: add initial files, add more files, delete file]
  11. Delta Lake
      sensors/
        data/
          2020-01/
            0.parquet
            1.parquet
            2.parquet
          2020-02/
            0.parquet
          2020-03/
            0.parquet
      ➔ more focused on:
        ◆ single data set / table
        ◆ multiple concurrent writers
      ➔ no non-Java implementation until recently
      ➔ Rust implementation not feature-complete
      Transaction log:
        000000.json: Add, Add
        000001.json: Add, Add, Add
        000002.json: Remove
        000001.checkpoint.parquet
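The Delta Lake log on slide 11 can be read as a sequence of commits that are replayed in order to find the live files. The Python sketch below assumes each commit is newline-delimited JSON with `add` and `remove` actions carrying a `path`, as in the Delta protocol; the short file names follow the slide, and the choice of which file gets removed is an assumption made for illustration. Checkpoints are ignored here.

```python
import json


def replay_delta_log(commits):
    """Return the set of live file paths after applying all JSON commits in order.

    `commits` maps a log file name to its text content (hypothetical in-memory
    stand-in for reading the log directory from an object store).
    """
    live = set()
    for name in sorted(n for n in commits if n.endswith(".json")):
        for line in commits[name].splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
    return live


# Commits mirroring the slide: two adds, three adds, one remove.
example_log = {
    "000000.json": "\n".join([
        json.dumps({"add": {"path": "data/2020-01/0.parquet"}}),
        json.dumps({"add": {"path": "data/2020-01/1.parquet"}}),
    ]),
    "000001.json": "\n".join([
        json.dumps({"add": {"path": "data/2020-01/2.parquet"}}),
        json.dumps({"add": {"path": "data/2020-02/0.parquet"}}),
        json.dumps({"add": {"path": "data/2020-03/0.parquet"}}),
    ]),
    # Assumption: the slide does not say which file is removed.
    "000002.json": json.dumps({"remove": {"path": "data/2020-01/1.parquet"}}),
}
```

Replaying the log instead of listing the store is what gives readers a consistent view: a file only counts once a commit that adds it is visible.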
  12. Solution
  13. Writer-DB-local Multi-Table Transaction Log
      2815898179/
        my_db/
          data/
            2020-01/
              0/
                sensors.parquet
                stocks.parquet
              1/
                sensors.parquet
            2020-02/
              0/
                sensors.parquet
              1/
                sensors.parquet
        other_db/
          data/
            2020-01/
              0/
                health.parquet
      3837527170/
        my_db/
          data/
            2021-01/
              0/
                stocks.parquet
      Transaction log:
        000000.txn: Add, Add
        000001.txn: Add, Add, Add
        000002.txn: Remove
        000001.ckpt
  14. Transaction
      List of possible actions:
      • Add: path, checksum, Parquet metadata
      • Remove: path
      • Tombstone
      • Upgrade: new format
      • … (might be extended)
      Serialization done via Protocol Buffers.
      Checkpoints aggregate transactions.
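The action list on slide 14 and the idea that checkpoints aggregate transactions can be sketched in a few lines of Python. This is an illustration under assumptions, not the IOx implementation: the class and function names are hypothetical, only Add and Remove are modelled, the Parquet metadata is left out, and plain Python objects stand in for the Protocol Buffers encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Add:
    path: str
    checksum: str  # Parquet metadata omitted in this sketch


@dataclass(frozen=True)
class Remove:
    path: str


def apply_transaction(state, txn):
    """Apply all actions of one transaction, or none of them.

    `state` maps path -> checksum. Working on a copy keeps the caller's
    state untouched if any action is invalid.
    """
    new_state = dict(state)
    for action in txn:
        if isinstance(action, Add):
            if action.path in new_state:
                raise ValueError(f"file already in catalog: {action.path}")
            new_state[action.path] = action.checksum
        elif isinstance(action, Remove):
            if action.path not in new_state:
                raise ValueError(f"file not in catalog: {action.path}")
            del new_state[action.path]
        else:
            raise TypeError(f"unsupported action: {action!r}")
    return new_state


def replay(transactions, start=None):
    """Fold a list of transactions into a catalog state."""
    state = dict(start or {})
    for txn in transactions:
        state = apply_transaction(state, txn)
    return state


def checkpoint(state):
    """Aggregate the current state into a single transaction of Add actions."""
    return [Add(path, checksum) for path, checksum in sorted(state.items())]


# Three transactions with the same shape as the log on slide 13.
txns = [
    [Add("a.parquet", "c1"), Add("b.parquet", "c2")],
    [Add("c.parquet", "c3"), Add("d.parquet", "c4"), Add("e.parquet", "c5")],
    [Remove("b.parquet")],
]
```

A reader that starts from a checkpoint taken after the second transaction only needs to replay the third, and ends up with the same state as a full replay.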
  15. Statistics + Schema
      Add action contains Parquet metadata (= schema + statistics)
      [Diagram labels: Apache Thrift Compact Protocol; bytes; Protocol Buffers]
      ➔ Same expressiveness as Parquet
      ➔ No additional format conversion
  16. Writer Conflicts
      Assumptions:
      • Conductor provides cluster-wide unique ServerID
      • No inter-writer (= global) catalog
      Robustness Measures:
      • Transaction filename handling: <transaction counter>/<uuid>.txn
      • Transaction contains UUID of previous transaction
      • Writers detect “fork” scenario
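The robustness measures on slide 16 suggest how a writer could notice a fork: two transaction files under the same counter, or a transaction whose previous-UUID link does not match the transaction before it. The Python sketch below works under those assumptions; `TxnHeader`, `txn_path`, and `detect_fork` are hypothetical names, and the six-digit zero padding of the counter is an assumption borrowed from the file names on slide 13.

```python
from typing import List, NamedTuple, Optional
from uuid import UUID, uuid4


class TxnHeader(NamedTuple):
    counter: int
    txn_uuid: UUID
    previous_uuid: Optional[UUID]  # None for the first transaction


def txn_path(header):
    """Build the <transaction counter>/<uuid>.txn object key."""
    return f"{header.counter:06d}/{header.txn_uuid}.txn"


def detect_fork(headers: List[TxnHeader]) -> Optional[int]:
    """Return the first counter where the history forks or the link is broken.

    Returns None if every counter has exactly one transaction and each
    transaction names the UUID of the one before it.
    """
    by_counter = {}
    for header in headers:
        by_counter.setdefault(header.counter, []).append(header)
    previous = None
    for counter in sorted(by_counter):
        candidates = by_counter[counter]
        if len(candidates) > 1:
            return counter  # two writers produced the same counter
        (header,) = candidates
        if header.previous_uuid != previous:
            return counter  # chain does not continue from the known history
        previous = header.txn_uuid
    return None


# A linear history of three transactions, then a rival for counter 2.
first = TxnHeader(0, uuid4(), None)
second = TxnHeader(1, uuid4(), first.txn_uuid)
third = TxnHeader(2, uuid4(), second.txn_uuid)
rival = TxnHeader(2, uuid4(), second.txn_uuid)
```

Putting the UUID in the file name means two writers never overwrite each other's file even without an atomic "create if not exists", so the conflict stays visible and can be detected afterwards.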
  17. References
      • Apache Projects Logos: https://apache.org/logos/
      • Apache Iceberg Table Spec: https://iceberg.apache.org/spec/
      • Delta Lake:
        https://databricks.com/blog/2019/08/21/diving-into-delta-lake-unpacking-the-transaction-log.html
        https://github.com/delta-io/delta/blob/master/PROTOCOL.md
        https://cs.stanford.edu/people/matei/papers/2020/vldb_delta_lake.pdf
      • Apache Parquet: https://github.com/apache/parquet-format
      • IOx Design: https://github.com/influxdata/influxdb_iox/blob/main/docs/catalog_persistence.md
  18. Thank You
