Hybrid collaborative tiered storage with Alluxio
Thai Bui
Data Engineer @ Bazaarvoice
Bazaarvoice
● Founded in 2005 in Austin, TX
● Digital marketing SaaS platforms for ratings and reviews
○ Display & syndicate reviews from brands to retailer websites
○ Reporting & analytics on consumers, reviews, products, etc.
● 2,600 client websites
● 5.4 billion product page views each month
● 900 million unique shoppers each month
Reporting & analytics on S3
When you have 100s of TB of data on S3
● Just listing the files is slow
● Download speed in EC2 is limited (50-150Mb/s per node)
● No concept of cache
● No concept of data locality
AWS S3 : The Need For Speed
● Add tiered storage to S3
○ Hot, warm, cold storage (fastest, fast, and not so fast)
○ Metadata cache
○ Data cache
● Keep data local
○ In the same machine, not via the Ethernet cable
● Compatible with existing services
○ Hadoop, Spark, Hive, Presto, etc.
● Adaptive & highly configurable
○ Symlink for S3
Overview
[Architecture diagram: App1 (Spark) and App2 read through Alluxio; hot & warm data sits on the ZFS-backed local tier, cold data stays in S3; the Hive Metastore tracks which partitions live where]
● Alluxio
○ Distributed data storage
○ Hadoop compatible
○ By AMPLab
● ZFS
○ OS-level file system
○ Volume manager
○ By Sun Microsystems
● Both are open-source
Alluxio : The tiered-storage layer
● Support for native filesystem and Hadoop filesystem
● Distributed and can be installed on every node
○ Provides data locality
● Mount S3, HDFS, etc. to Alluxio
○ Think symlink. No data movement (see the mount sketch below).
● Use the Hive metastore to partition data into hot/warm and cold regions
○ Acts as a remote tiered-storage layer
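A minimal sketch of that "symlink"-style mount, assuming the standard alluxio fs mount CLI; the bucket, path, scheme, and credentials below are placeholders, not our real setup:

# Mount an S3 prefix into the Alluxio namespace; nothing is copied at mount time
alluxio fs mount \
  --option aws.accessKeyId=<ACCESS_KEY> \
  --option aws.secretKey=<SECRET_KEY> \
  /warehouse/reviews s3a://example-bucket/warehouse/reviews

# Blocks are pulled into the Alluxio tiers lazily, on first read
alluxio fs ls /warehouse/reviews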
ZFS : The acceleration layer
● Both a filesystem & a volume manager
○ Mirrored writes to 2 SSDs -> 2x read speed (pool sketch below)
● Works at the Linux kernel-space
○ Works with RAM to accelerate read/write
○ Auto promote/demote blocks from RAM to other storage
○ Used with local NVMe SSD if data is not in RAM
○ Acts as a local tiered-storage layer
● Extremely reliable
○ Automatic block checksum & repair
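A rough sketch of the kind of mirrored pool behind the numbers on the next slide; the NVMe device names and tunables are illustrative, not our exact build:

# Mirror the two local NVMe SSDs; ZFS can read from both sides of the mirror
zpool create -f alluxio mirror /dev/nvme0n1 /dev/nvme1n1

# Mount the pool where the Alluxio worker keeps its storage directory
zfs set mountpoint=/alluxio/fs alluxio

# Typical tuning for large sequential scans (illustrative, adjust to taste)
zfs set compression=lz4 alluxio
zfs set atime=off alluxio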
ZFS + NVMe: Micro benchmark
i3.4xlarge, up to 10 Gbit network, 2 x 1.9 TB NVMe SSD
● Baseline w/ EBS
○ 135 MB/s write (dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=dsync)
○ 157 MB/s read (dd if=/tmp/test1.img of=/dev/zero bs=8k)
● ZFS + 2 mirrored NVMe SSD
○ 820 MB/s write (dd if=/dev/zero of=/alluxio/fs/test1.img bs=1G count=1)
○ 1.7 GB/s read (dd if=/alluxio/fs/test1.img of=/dev/zero bs=1G count=1)
● 4x write, 10x read compared to EBS
● 10-15x compared to S3
With ZFS
[Diagram: Alluxio runs in user space behind the native/Hadoop filesystem API; ZFS runs in kernel space and promotes/demotes blocks between RAM (hot) and the NVMe SSDs (warm); worker config sketch below]
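One way to wire the two layers together is to give the Alluxio worker a single on-disk tier on the ZFS mount and let the ZFS ARC handle the RAM caching; a sketch assuming the standard alluxio-site.properties keys, with a placeholder quota:

# Point the worker's only tier at the ZFS mirror; ZFS promotes hot blocks to RAM
cat >> conf/alluxio-site.properties <<'EOF'
alluxio.worker.tieredstore.levels=1
alluxio.worker.tieredstore.level0.alias=SSD
alluxio.worker.tieredstore.level0.dirs.path=/alluxio/fs
alluxio.worker.tieredstore.level0.dirs.quota=1TB
EOF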
With Hive
[Diagram: the Hive Metastore points partitions from the last 30 days at Alluxio (hot & warm) and partitions older than 30 days at S3 (cold); partition DDL sketch below]
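A sketch of what the hot/warm vs. cold split looks like at the metastore level; the table, partition, bucket, and host names are made up for illustration:

hive -e "
  -- Recent partitions point at Alluxio (hot & warm), so scans stay local
  ALTER TABLE reviews PARTITION (ds='2017-06-30')
    SET LOCATION 'alluxio://alluxio-master:19998/warehouse/reviews/ds=2017-06-30';

  -- Partitions older than 30 days are flipped back to plain S3 (cold)
  ALTER TABLE reviews PARTITION (ds='2017-05-01')
    SET LOCATION 's3a://example-bucket/warehouse/reviews/ds=2017-05-01';
"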
CPU/IO Monitoring
Tiered storage Monitoring
Alluxio Monitoring
Hive Monitoring & Performance
● Scanning 200 GB of data in tiered storage, 500M rows, select *
● Scanning 5 GB of data in tiered storage, 350M rows, fewer projections
● Scanning 35 GB of data in S3, 1.6B rows, count distinct
● Metadata/split calculation ops took ~60s, with the majority of the time spent scanning S3
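The three scans above roughly correspond to queries of the following shape; the table, columns, and date ranges are hypothetical stand-ins for the real workload:

hive -e "
  -- ~200 GB / 500M rows from the tiered store, every column projected
  SELECT * FROM reviews WHERE ds BETWEEN '2017-06-01' AND '2017-06-30';

  -- ~5 GB / 350M rows from the tiered store, only a few columns projected
  SELECT product_id, rating FROM reviews WHERE ds BETWEEN '2017-06-01' AND '2017-06-30';

  -- ~35 GB / 1.6B rows straight from S3, count distinct
  SELECT COUNT(DISTINCT shopper_id) FROM pageviews WHERE ds < '2017-06-01';
"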
Result
● 5-10X read improvement in Hive
○ Workers can short-circuit and read directly from ZFS instead of S3
○ Move compute to the data
● Easy to debug, with a feedback loop; collaborative
○ Data publishers + data analysts/scientists
● Good for iterating over the same data set multiple times
○ Machine learning
○ Exploratory analysis
● Gives us control over S3
○ More recent data should be faster to access
Questions?
