Hybrid collaborative tiered storage with Alluxio

Systems that read from AWS S3 often pay a performance penalty. Compute and data are not co-located, so the data has to move over slower, often congested networks. Alluxio can provide a caching layer for the data, but that still leaves the question of which data to move, and how and when. Should all data be cached by default, or only when it is used? In this talk, I will explore the gray area in between, where users and dataset publishers collaborate to decide what data is cached, and how, in a tiered-storage architecture to maximize performance and minimize operating costs.

Published in: Data & Analytics

  1. Hybrid collaborative tiered storage with Alluxio
     Thai Bui, Data Engineer @ Bazaarvoice
  2. Bazaarvoice
     ● Founded in 2005 in Austin, TX
     ● Digital marketing SaaS platforms for ratings and reviews
       ○ Display & syndicate reviews from brands to retailer websites
       ○ Reporting & analytics on consumers, reviews, products, etc.
     ● 2,600 client websites
     ● 5.4 billion product page views each month
     ● 900 million unique shoppers each month
  3. Reporting & analytics on S3
     When you have 100s of TB of data on S3
     ● Just listing the files is slow
     ● Download speed in EC2 is limited (50-150Mb/s per node)
     ● No concept of cache
     ● No concept of data locality
  4. AWS S3: The Need For Speed
     ● Add tiered storage to S3
       ○ Hot, warm, cold storage (fastest, fast, and not so fast)
       ○ Metadata cache
       ○ Data cache
     ● Keep data local
       ○ In the same machine, not via the Ethernet cable
     ● Compatible with existing services
       ○ Hadoop, Spark, Hive, Presto, etc.
     ● Adaptive & highly configurable
       ○ Symlink for S3
  5. Overview
     [Diagram labels: App1, App2, Spark, Metastore, Alluxio, ZFS, S3; tiers: Hot & Warm, Cold]
     ● Alluxio
       ○ Distributed data storage
       ○ Hadoop compatible
       ○ By AMPLab
     ● ZFS
       ○ OS-level file system
       ○ Volume manager
       ○ By Sun Microsystems
     ● Both are open-source
  6. Alluxio: The tiered-storage layer
     ● Support for native filesystem and Hadoop filesystem
     ● Distributed and can be installed on every node
       ○ Provides data locality
     ● Mount S3, HDFS, etc. to Alluxio
       ○ Think symlink. No data movement.
     ● Use the Hive metastore to partition data into hot/warm and cold regions
       ○ Acts as a remote tiered-storage layer
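The "think symlink" idea can be illustrated with a toy mount table: mounting only records a mapping from a namespace prefix to an under-storage URI, and a path is translated when it is accessed. This is a minimal sketch of the concept, not Alluxio's implementation; the `MountTable` class, the bucket name, and the paths are hypothetical.

```python
from typing import Dict, Optional


class MountTable:
    """Toy mount table mapping namespace prefixes to under-storage URIs.

    Illustrative only; this is not how Alluxio is implemented.
    """

    def __init__(self) -> None:
        self._mounts: Dict[str, str] = {}

    def mount(self, namespace_path: str, ufs_uri: str) -> None:
        # Only the mapping is recorded. No data is copied at mount time.
        key = namespace_path.rstrip("/") or "/"
        self._mounts[key] = ufs_uri.rstrip("/")

    def resolve(self, path: str) -> Optional[str]:
        # The longest matching prefix wins, so nested mounts resolve correctly.
        best: Optional[str] = None
        for prefix in self._mounts:
            if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
                if best is None or len(prefix) > len(best):
                    best = prefix
        if best is None:
            return None
        suffix = path if best == "/" else path[len(best):]
        return self._mounts[best] + suffix


# Hypothetical bucket and paths, for illustration only.
table = MountTable()
table.mount("/reviews", "s3://example-bucket/reviews")
resolved = table.resolve("/reviews/2019/01/part-0.orc")
```

Because only the mapping is stored, mounting a bucket with hundreds of terabytes costs the same as mounting an empty one.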
  7. ZFS: The acceleration layer
     ● Both a filesystem & a volume manager
       ○ Mirror write to 2 SSDs -> 2x read speed
     ● Works in Linux kernel space
       ○ Works with RAM to accelerate read/write
       ○ Auto promote/demote blocks from RAM to other storage
       ○ Used with local NVMe SSD if data is not in RAM
       ○ Acts as a local tiered-storage layer
     ● Extremely reliable
       ○ Automatic block checksum & repair
  8. ZFS + NVMe: Micro benchmark
     i3.4xlarge, up to 10Gbit network, 2 x 1.9 TB NVMe SSD
     ● Baseline w/ EBS
       ○ 135 MB/s write (dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=dsync)
       ○ 157 MB/s read (dd if=/tmp/test1.img of=/dev/zero bs=8k)
     ● ZFS + 2 mirrored NVMe SSD
       ○ 820 MB/s write (dd if=/dev/zero of=/alluxio/fs/test1.img bs=1G count=1)
       ○ 1.7 GB/s read (dd if=/alluxio/fs/test1.img of=/dev/zero bs=1G count=1)
     ● 4x write, 10x read compared to EBS
     ● 10-15x compared to S3
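A sequential read measurement in the spirit of the `dd` runs above can be sketched in Python. This is only an illustration under stated assumptions: the helper names and the temporary file are hypothetical, and a file that was just written is likely served from the OS page cache, so the result reflects RAM rather than disk unless caches are dropped first.

```python
import os
import tempfile
import time


def make_test_file(size_mb: int) -> str:
    """Write size_mb MiB of zeros to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".img")
    block = b"\0" * (1024 * 1024)
    with os.fdopen(fd, "wb") as f:
        for _ in range(size_mb):
            f.write(block)
    return path


def measure_read_mb_per_s(path: str, block_size: int = 1024 * 1024) -> float:
    """Read the file sequentially and return throughput in MiB/s."""
    size = os.path.getsize(path)
    start = time.perf_counter()
    # buffering=0 avoids Python-level buffering; the OS page cache still applies.
    with open(path, "rb", buffering=0) as f:
        while f.read(block_size):
            pass
    elapsed = time.perf_counter() - start
    return size / (1024 * 1024) / max(elapsed, 1e-9)


if __name__ == "__main__":
    test_path = make_test_file(8)
    try:
        print(f"{measure_read_mb_per_s(test_path):.0f} MiB/s")
    finally:
        os.remove(test_path)
```

The block size matters: the slide's EBS read used `bs=8k` while the ZFS read used `bs=1G`, so comparing both storage targets with the same `block_size` gives a cleaner comparison.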
  9. With ZFS
     [Diagram labels: Native/Hadoop Filesystem API, Alluxio, User-space, Kernel-space, ZFS, Hot, Warm, RAM, NVMe SSD, promote, demote]
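The promote/demote behavior between RAM and NVMe SSD can be illustrated with a toy two-tier cache. This sketch uses plain LRU in both tiers, which is a simplifying assumption; ZFS's own caching is more sophisticated, and the `TwoTierCache` class is hypothetical.

```python
from collections import OrderedDict
from typing import Optional


class TwoTierCache:
    """Toy hot (RAM) / warm (SSD) cache with LRU promotion and demotion."""

    def __init__(self, ram_blocks: int, ssd_blocks: int) -> None:
        self.ram_blocks = ram_blocks
        self.ssd_blocks = ssd_blocks
        self.ram: "OrderedDict[str, bytes]" = OrderedDict()
        self.ssd: "OrderedDict[str, bytes]" = OrderedDict()

    def put(self, key: str, data: bytes) -> None:
        # New or promoted blocks always land in the hot tier.
        self.ssd.pop(key, None)
        self.ram[key] = data
        self.ram.move_to_end(key)
        self._demote_overflow()

    def get(self, key: str) -> Optional[bytes]:
        if key in self.ram:
            self.ram.move_to_end(key)  # refresh recency in the hot tier
            return self.ram[key]
        if key in self.ssd:
            data = self.ssd.pop(key)
            self.put(key, data)  # promote from warm to hot
            return data
        return None  # miss: the caller would read from cold storage (S3)

    def _demote_overflow(self) -> None:
        # Demote least recently used hot blocks to the warm tier.
        while len(self.ram) > self.ram_blocks:
            old_key, old_data = self.ram.popitem(last=False)
            self.ssd[old_key] = old_data
        # Evict least recently demoted warm blocks entirely.
        while len(self.ssd) > self.ssd_blocks:
            self.ssd.popitem(last=False)


cache = TwoTierCache(ram_blocks=2, ssd_blocks=2)
for name in ("a", "b", "c"):
    cache.put(name, name.upper().encode())
# "a" was demoted to the warm tier; reading it promotes it back to hot.
value = cache.get("a")
```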
  10. With Hive
      [Diagram labels: Hive Metastore; Last 30 days, Alluxio, Hot & Warm; > 30 days, S3, Cold]
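The 30-day split between Alluxio and S3 can be sketched as a function that picks a storage location per partition and emits the matching Hive `ALTER TABLE ... SET LOCATION` statement. The table name, the `dt` partition column, and both base URIs are hypothetical; the talk does not show how the partitions are actually assigned.

```python
from datetime import date

HOT_DAYS = 30  # from the slide: the last 30 days are served from Alluxio


def partition_location(partition_day: date, today: date,
                       alluxio_base: str, s3_base: str) -> str:
    """Return the storage URI for a daily partition based on its age."""
    age_days = (today - partition_day).days
    base = alluxio_base if age_days <= HOT_DAYS else s3_base
    return f"{base.rstrip('/')}/dt={partition_day.isoformat()}"


def set_location_ddl(table: str, partition_day: date, location: str) -> str:
    """Build the HiveQL statement that repoints one partition."""
    return (f"ALTER TABLE {table} "
            f"PARTITION (dt='{partition_day.isoformat()}') "
            f"SET LOCATION '{location}'")


# Hypothetical names and URIs, for illustration only.
ALLUXIO_BASE = "alluxio://alluxio-master:19998/reviews"
S3_BASE = "s3://example-bucket/reviews"

today = date(2019, 3, 31)
recent = partition_location(date(2019, 3, 1), today, ALLUXIO_BASE, S3_BASE)
old = partition_location(date(2019, 2, 28), today, ALLUXIO_BASE, S3_BASE)
ddl = set_location_ddl("reviews", date(2019, 2, 28), old)
```

Running such a job daily would move the boundary forward, so a partition that ages past 30 days is repointed from the Alluxio path to the S3 path without any change to the queries that read the table.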
  11. CPU/IO Monitoring
  12. Tiered storage Monitoring
  13. Alluxio Monitoring
  14. Hive Monitoring & Performance
      Scanning 200G of data in tiered storage, 500M rows, select *
      Scanning 5G of data in tiered storage, 350M rows, fewer projections
  15. Scanning 35G of data in S3, 1.6B rows, count distinct
      Metadata/split calculation ops 60s, majority of the time spent on scanning S3
  16. Result
      ● 5-10X read improvement in Hive
        ○ Worker can short-circuit and read directly from ZFS instead of S3
        ○ Move compute to the data
      ● Easy to debug, with feedback loop, collaborative
        ○ Data publishers + data analysts/scientists
      ● Good for iterating over the same data set multiple times
        ○ Machine learning
        ○ Exploratory analysis
      ● Gives us control over S3
        ○ More recent data should be faster to access
  17. Questions?
