Alluxio+Presto: An Architecture for Fast SQL in the Cloud

Alluxio Bay Area Meetup - 12/04/18

Published in: Technology

  1. 1. Alluxio + Presto: An Architecture for Fast SQL in the Cloud • Alluxio Meetup • Follow us | @alluxio • Download Alluxio | www.alluxio.org • Questions? | info@alluxio.com
  2. 2. Alluxio Overview 12/4/2018 Meetup Andrew Audibert – Core Maintainer @ Alluxio Inc.
  3. 3. About Me • Andrew Audibert • 3+ Years on Alluxio • Core Maintainer • Email: andrew@alluxio.com • GitHub: aaudiber
  4. 4. Company Overview • Founded Feb. 2015 – Haoyuan Li • PhD research at UC Berkeley AMPLab • Initially Tachyon Nexus • Venture Backed • Andreessen Horowitz etc. • Open Source Business Model • Tachyon Open Sourced in Dec. 2012 • Open source v1.0 released Feb. 2016 • Commercial product released Oct. 2016 • Office in San Mateo, CA • Team: Google, Palantir, VMware, AMD, Cisco…
  5. 5. Agenda: (1) Technology Trends (2) Data Access Layer (3) Alluxio Architecture
  6. 6. Agenda: (1) Technology Trends (2) Data Access Layer (3) Alluxio Architecture
  7. 7. Data Transformation • Pressure in all industries to be “data driven” • Majority of companies are still figuring out the transformation • Increased collection of large volumes of low-value data • Challenge of overcoming data silos to convert data into business value • Limited success of Data Warehouses, Marts, and Lakes: the cost of copying/moving data is substantial • Single Data Plane for Business value
  8. 8. Migration to Cloud • Decoupling of compute and storage • Enterprises move from turnkey solutions to self-managed data platforms on IaaS • Lacking agility at the Data Storage level • Requires Storage Abstraction
  9. 9. Rise of Artificial Intelligence • New workload targeting the same data used for Hadoop OLAP • Potentially as valuable as, or more valuable than, existing OLAP workloads • Challenge of adapting existing architectures to this new workload
  10. 10. Agenda: (1) Technology Trends (2) Data Access Layer (3) Alluxio Architecture
  11. 11. The Data Access Layer • Abstraction layer between applications and storage systems • Presents a stable storage interface to applications, including semantics, security, and performance • Eliminates the weaknesses of data silos instead of the data silos themselves • Enables transparent migration of underlying storage systems • Enables application API to storage API translation in a single layer
  12. 12. Data Access Layer: Security • Standard APIs • High Performance • Compatibility • Decoupling • Transparent Migration
  13. 13. Alluxio • Our implementation of the data access layer – a virtual distributed file system • Open source project with over 900 contributors from hundreds of organizations worldwide • Deployed in many top internet and financial companies
  14. 14. 100+ Known Production Deployments AND MORE! (as of 11/16/18)
  15. 15. Agenda: (1) Technology Trends (2) Data Access Layer (3) Alluxio Architecture
  16. 16. Data Ecosystem with Alluxio • Apps only talk to Alluxio • Simple Add/Remove • No App Changes • Highest performance in Memory • No Lock-in • Diagram: Alluxio, a Virtual Distributed File System (VDFS); interfaces: Java File API, HDFS Interface, S3 Interface, REST API, FUSE Interface; drivers: HDFS Driver, S3 Driver, Swift Driver, NFS Driver
  17. 17. Alluxio Architecture • Diagram: Alluxio Master, ZooKeeper, Standby Master, two Alluxio Workers (each with RAM / SSD / HDD), Under Store; legend: Control Path, Data Path
  18. 18. Read Data not Cached in Alluxio + Caching • Diagram: Application, Alluxio Client, Alluxio Worker (RAM / SSD / HDD), Under Store; steps numbered 1 to 4
  19. 19. Read Cached Data in Alluxio • Diagram: Application, Alluxio Client, Alluxio Worker (RAM / SSD / HDD); steps numbered 1 to 3
  20. 20. Write data only to Alluxio • Diagram: Application, Alluxio Client, Alluxio Worker (RAM / SSD / HDD); steps numbered 1 to 3
  21. 21. Write to Alluxio and Under Store Synchronously • Diagram: Application, Alluxio Client, Alluxio Worker (RAM / SSD / HDD), Under Store; steps numbered 1 to 3
  22. 22. Alluxio, Presto and Cloud 12/4/2018 Meetup Bin Fan – Founding Engineer @ Alluxio Inc.
  23. 23. About Me • Bin Fan • PhD CS@CMU • Founding Engineer@Alluxio • Email: binfan@alluxio.com • GitHub: apc999 • Twitter: @binfan • WeChat: apc999_fb
  24. 24. Analogy of Alluxio in the Stack on Cloud
  25. 25. A Common File System Abstraction • Common interface across apps • HDFS-compatible interface: change hdfs://foo/bar to alluxio://foo/bar • Other interfaces: Native Alluxio Java FS, POSIX and S3. • Cloud storage becomes “hidden” to apps • Less vendor lock-in! • Diagram: Compute Zone (standalone or managed with Mesos or YARN) running TensorFlow, Presto, and MR over the HDFS API and POSIX API; Storage in a Different Availability Zone (either on-prem or cloud)
  26. 26. Data Path: Improved I/O Performance • A New Tier Above Cloud Storage for Compute • Distributed buffer cache • Restore locality to compute • Read: • Cache-hit read: served by Alluxio workers (local worker preferred) • Cache-miss read: served by cloud storage, then cached on an Alluxio worker • Write: • Burst buffer, then asynchronous propagation to S3 (Alluxio 2.0) • Challenges: • Locality: expose location information to applications; serve local apps through ramdisk (rather than network)
  27. 27. Data Path: Async Persist to S3 (Alluxio 2.0) • Async Writes • Step 1: App writes to Alluxio • Step 2: Alluxio writes to UFS (under file system) • Benefits • Apps write at Alluxio speed • Data gets persisted • Challenges • File rename/delete before persist: 2PC • Fault tolerance: journal async requests • Diagram: Application, Alluxio Client, Alluxio Master, Alluxio Worker (RAM / SSD / HDD), Under Store
  28. 28. Metadata Path: Familiar Semantics • Listing / renaming on an object store can be expensive • Common operations for batch or SQL analytics • Overwriting Put is eventually consistent • Alluxio loads and manages metadata in the master • Apps can continue assuming HDFS-like semantics and performance implications • Challenges • Data modification bypassing Alluxio: when and how to re-sync • Slow lists in object store: batch operations • Too many objects: off-heap metadata (Alluxio 2.0)
  29. 29. Metadata Path: Efficient Renames • Renaming files on S3 can be expensive • Common operation for MR in the commit phase • Write results to tmp paths • Rename tmp files to final paths (another copy, slow) • Rename with Alluxio async writes • t0: write to tmp paths in Alluxio: near-compute, fast writes • t1: rename tmp paths to final paths in Alluxio: cheap renames • t2: persist files in final paths in Alluxio to S3: 2PC to avoid partial data • Speculative execution allowed
  30. 30. Performance Benchmark: Presto + Alluxio + S3 • Setup • 5x EC2 r4.4xlarge, Scale factor = 100 • Alluxio: 1 master, 4 workers, 50G per Alluxio node • Presto: 1 coordinator, 4 workers, 50G per Presto node • Initial results of TPC-DS 2.4 • Up to 5x improvement across queries • Avg 1.6x improvement (most TPC-DS queries are CPU bound) • Potential improvements • Closer integration (e.g., a native Presto connector, better split calculation, better locality hints)
  31. 31. Case Study: - Digital marketing SaaS platform - Hive metastore: Input files from S3 - ~100 TB data on S3 - Pain Point: - Slow to list files - Limited EC2<->S3 bandwidth - No compute-side caching - No data locality - Source: https://www.slideshare.net/ThaiBui7/hybrid-collaborative-tiered-storage-with-alluxio
  32. 32. Solution: Hot & Warm Data on Alluxio
  33. 33. Result - 5-10x read improvement - Enables easier debugging with a feedback loop for data analysts
  34. 34. Case Study: - Leading Online Retailer (NASDAQ: JD) - Building Ad-hoc SQL Query Engine - Pain Point: - Presto workers may read remotely from HDFS datanodes - Large query variance - Source: https://www.slideshare.net/Alluxio/alluxio-in-jd
  35. 35. Solution: Colocate Alluxio with Presto
  36. 36. Query Time
  37. 37. Query Time
  38. 38. Case Study: - Leading Online Gaming Service Company (NASDAQ: NTES) - Partners with Blizzard to operate “WoW” and “Hearthstone” - Upcoming: “Diablo Immortal” - Building Ad-hoc SQL Query Engine - Large data volume: ~30 TB raw data daily - A separate satellite compute cluster - Pain Point: - Response time requirement: < 15s - Large startup latency when submitting SQL jobs as YARN apps
  39. 39. Solution: Presto + Alluxio
  40. 40. Result: Smoother Response During Peak Time - Chart: response time (ms), Presto w/ Alluxio vs. Presto w/o Alluxio
  41. 41. Conclusion - Alluxio: A New Data Access Layer - Between compute and storage - Transparent to big data analytics (HDFS-compatible, POSIX) - Improves data and metadata performance on cloud storage - Architecture and Data Flow - Master, Worker, Under Storage - Cache-{hit, miss} reads, Sync/Async writes - Use Cases on Presto + Alluxio
  42. 42. Thank you • zhuanlan.zhihu.com/alluxio • www.alluxio.com • info@alluxio.com • twitter.com/alluxio • linkedIn.com/alluxio • binfan@alluxio.com • andrew@alluxio.com
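
The read and write paths on slides 18 to 21 can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Alluxio's API: `UnderStore` and `Worker` are hypothetical names, storage is an in-memory dict, and the `through` flag stands in for the synchronous write to the under store on slide 21.

```python
class UnderStore:
    """Hypothetical stand-in for S3/HDFS: a dict plus a read counter."""

    def __init__(self):
        self.objects = {}
        self.reads = 0

    def read(self, path):
        self.reads += 1
        return self.objects[path]

    def write(self, path, data):
        self.objects[path] = data


class Worker:
    """Hypothetical worker that caches data in front of an under store."""

    def __init__(self, under_store):
        self.under_store = under_store
        self.cache = {}  # stands in for the RAM / SSD / HDD tiers

    def read(self, path):
        # Cache hit: serve from the worker without touching the under store.
        if path in self.cache:
            return self.cache[path]
        # Cache miss: fetch from the under store, then keep a cached copy.
        data = self.under_store.read(path)
        self.cache[path] = data
        return data

    def write(self, path, data, through=False):
        # Always write to the worker; optionally also write synchronously
        # to the under store before returning.
        self.cache[path] = data
        if through:
            self.under_store.write(path, data)


if __name__ == "__main__":
    ufs = UnderStore()
    ufs.write("/table/part-0", b"rows")
    worker = Worker(ufs)
    worker.read("/table/part-0")  # miss: goes to the under store
    worker.read("/table/part-0")  # hit: served from the worker
    print("under store reads:", ufs.reads)
```

In this model the second read never reaches the under store, which is the effect the cache-hit slide describes.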

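The async-persist rename flow on slides 27 and 29 (t0: write to a tmp path, t1: rename inside Alluxio, t2: persist the final path) might look like the sketch below. All names here (`AsyncPersistFs`, `persist_pending`) are hypothetical, the pending list only stands in for the journal, and the two-phase commit mentioned on the slides is not implemented. Tracking pending work by file id rather than by path is one way, assumed here, to let a rename land before the persist without copying data.

```python
import itertools


class AsyncPersistFs:
    """Hypothetical model of async writes with cheap in-memory renames."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.paths = {}            # path -> file id (namespace metadata)
        self.data = {}             # file id -> bytes (cached file content)
        self.pending = []          # file ids awaiting persist (journal stand-in)
        self.under_store = {}      # path -> bytes (stand-in for S3)
        self.under_store_puts = 0  # number of objects written to the under store

    def write(self, path, payload):
        # t0: the write completes once the data is in the cache layer.
        fid = next(self._ids)
        self.paths[path] = fid
        self.data[fid] = payload
        self.pending.append(fid)

    def rename(self, src, dst):
        # t1: a metadata-only change; no data is copied.
        self.paths[dst] = self.paths.pop(src)

    def persist_pending(self):
        # t2: write each pending file to the under store under its
        # current (final) path. Files removed from the namespace are skipped.
        current_path = {fid: path for path, fid in self.paths.items()}
        for fid in self.pending:
            path = current_path.get(fid)
            if path is not None:
                self.under_store[path] = self.data[fid]
                self.under_store_puts += 1
        self.pending.clear()


if __name__ == "__main__":
    fs = AsyncPersistFs()
    fs.write("/out/_tmp/part-0", b"rows")
    fs.rename("/out/_tmp/part-0", "/out/part-0")
    fs.persist_pending()
    print(sorted(fs.under_store), fs.under_store_puts)
```

The under store sees a single put to the final path, in contrast to the write-then-copy rename the slide describes for committing directly on S3.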