Memory-Speed Virtual Distributed Storage: Latest Features and Use Cases
August 2016 @ Strata+Hadoop World Beijing
Bin Fan, Yupeng Fu
1
About Us
•  Bin Fan 
•  Software Engineer @ Alluxio, Inc.
•  Alluxio PMC
•  Worked at Google, MSR
•  CMU, CUHK, USTC
•  Wechat @ apc999_fb
•  Yupeng Fu
•  Software Engineer @ Alluxio, Inc.
•  Alluxio PMC
•  Worked at Palantir, Google
•  UCSD, Tsinghua 
•  Wechat @richbird9
2
•  Team consists of Alluxio creators and top committers
•  Invested by

•  Committed to Alluxio Open Source

•  http://www.alluxio.com
Alluxio Inc.
3
Outline
•  What is Alluxio
•  Example use cases
1.  Off-heap memory to alleviate resource pressure
2.  Fast data sharing between jobs
3.  Accelerate access to remote storage
4.  Unified namespace
•  Summary

4
Non-persistent
data-storage
software.
ATTRIBUTES
Memory-Speed
 Virtual
 Distributed
 Storage
Scale out
architecture.
Virtualized across different
storage types under a
unified namespace.
Memory-speed
access to data.
5
OPEN SOURCE ALLUXIO: History
• Started in the AMPLab at UC Berkeley
–  Summer of 2012
• Open Sourced as Tachyon (renamed to Alluxio)
–  Apache 2.0 License
–  Latest Release: version 1.2.0 (July, 2016)
6
OPEN SOURCE ALLUXIO
v0.2
 v0.3
v0.4
v0.5
v0.6
v0.7
v0.8
v1.0
v1.1
7
OPEN SOURCE ALLUXIO
The fastest growing
open-source project
in the big data
ecosystem.

Currently over 300
contributors from
over 100
organizations.
8
Outline
•  What is Alluxio
•  Example use cases
1.  Off-heap memory to alleviate resource pressure
2.  Fast data sharing between jobs
3.  Accelerate access to remote storage
4.  Unified namespace
•  Summary

9
Application 1
Alluxio
CASE 1:
Off-heap Memory Storage to
Alleviate Resource Pressure
10
HDFS / Amazon S3
Consolidating Memory
Spark Job1
Spark
Memory
block 1
block 3
Spark Job2
Spark
Memory
block 3
block 1
block 1
block 3
block 2
block 4
storage engine & 
execution engine
same process
Data duplicated at memory-level
11
Spark Job1
Spark mem
Spark Job2
Spark mem
HDFS / Amazon S3
block 1
block 3
block 2
block 4
HDFS
disk
block 1
block 3
block 2
block 4
Alluxio
In-Memory
block 1
block 3
 block 4
Data not duplicated at memory-level
Consolidating Memory
storage engine & 
execution engine
same process
12
Data Resilience during Crashes
Spark Task
Spark Memory
block manager
block 1
block 3
HDFS / Amazon S3
block 1
block 3
block 2
block 4
Process crash requires network and/or disk I/O to re-read the
data
storage engine & 
execution engine
same process
13
Data Resilience during Crashes
Crash
Spark Memory
block manager
block 1
block 3
HDFS / Amazon S3
block 1
block 3
block 2
block 4
Process crash requires network and/or disk I/O to re-read the
data
storage engine & 
execution engine
same process
14
HDFS / Amazon S3
Data Resilience during Crashes
block 1
block 3
block 2
block 4
Crash
storage engine & 
execution engine
same process
Process crash requires network and/or disk I/O to re-read the
data
15
Spark Task
Spark Memory
block manager
HDFS
disk
block 1
block 3
block 2
block 4
Alluxio
In-Memory
block 1
block 3
 block 4
Process crash only needs memory I/O to re-read the
data
Data Resilience during Crashes
storage engine & 
execution engine
same process
16
Crash
Process crash only needs memory I/O to re-read the data
HDFS
disk
block 1
block 3
block 2
block 4
Alluxio
In-Memory
block 1
block 3
 block 4
Data Resilience during Crashes
storage engine & 
execution engine
same process
17
CASE STUDY
Thanks to Alluxio, we now have the raw
data immediately available at every
iteration and we can skip the costs of
loading in terms of time waiting,
network traffic, and RDBMS activity.

- Henry Powell, Barclays
RESULTS
•  Barclays workflow iteration time decreased
from hours to seconds

•  Alluxio enabled workflows that were
impossible before

•  By keeping data only in memory, the I/O cost of
loading and storing in Alluxio is now on the
order of seconds
Relational Database
Share Data Across Jobs at Memory
Speed
•  Reducing time from hours to
seconds
•  Solving issues of Teradata as
under storage
18
Barclays Previous Architecture
Spark
Teradata
Slow ETL,
30+ minutes
Each context restart
requires ETL process
again
h"ps://dzone.com/ar1cles/Accelerate-­‐In-­‐Memory-­‐Processing-­‐with-­‐Spark-­‐from-­‐Hours-­‐to-­‐Seconds-­‐With-­‐Tachyon	
  
19
Barclays Architecture with Alluxio
Spark
Teradata
Alluxio
ETL only once
Data still available in Alluxio
memory after Job crash
20
CASE 2:
Fast Data sharing between apps
Application 1
Alluxio
Application2
HDFS
21
Spark Job
Spark 
Memory
block 1
block 3
Hadoop MR Job
YARN
HDFS / Amazon S3
block 1
block 3
block 2
block 4
storage engine & 
execution engine
same process
Data Sharing Between Frameworks
Inter-process sharing slowed down by network and/or disk I/O
22
Data Sharing Between Frameworks
Spark Job
Spark Memory
Hadoop MR Job
YARN
HDFS / Amazon S3
block 1
block 3
block 2
block 4
HDFS
disk
block 1
block 3
block 2
block 4
Alluxio
In-Memory
block 1
block 3
 block 4
storage engine & 
execution engine
same process
Inter-process sharing can happen at memory
speed
23
CASE STUDY
We’ve been running Alluxio in
production for over 9 months, resulting
in 15x speedup on average, and 300x
speedup at peak service times.

- Xueyan Li, Qunar
RESULTS

•  Multiple computing frameworks can share data
at memory speed with Alluxio

•  Improved the performance of their system with
15x – 300x speedups

•  Alluxio provides various user-friendly APIs,
which simplifies the transition to Alluxio.

Sharing Data Across Different
Compute Frameworks
•  6 billion logs (4.5 TB) daily
•  Mix of Memory + HDD
24
Qunar Previous Architecture
Flink
Streaming
Spark
Spark
Streaming
Flink
HDFS Cluster 
Some jobs cannot finish
Slow access to the storage
25
Qunar Architecture with Alluxio
Flink
Streaming
HDFS Cluster 
Spark
Spark
Streaming
Flink
Alluxio
Share in-memory data
across frameworks
26
CASE 3:
Accelerate access to remote storage
Application 
Alluxio
Remote
Storage
27
•  Different compute and storage hardware requirements
•  Scale compute and storage resources independently
•  Cost effective object stores (Amazon S3, Google GCS, Microsoft Azure
Blob Store) are inherently decoupled
Decoupling Compute from Storage
Accessing data
requires remote I/O!
28
Remote I/O problems
Application
Remote
Storage
every data operation requires
data transfer, sometimes over
the WAN
high latency, network
throughput
29
Visualizing the Stack with Alluxio
FAST 
104 - 105 MB/s
Application
Remote Storage
MODERATE 
103 - 104 MB/s
SLOW
102 - 103 MB/s
Alluxio MEM
Often
Only When Necessary

Alluxio SSD/HDD
Limited

30
What if memory capacity is still
not enough?
31
Alluxio Manages All Local Storage
MEM
SSD
HDD
Faster
Higher Capacity
32
Configurable Storage Tiers
MEM only
MEM + HDD
SSD only
33
Pluggable Tier Management Policies
Evict stale data to
slower tier
Promote hot data
to faster tier
34
CASE STUDY
Baidu File System
The performance was amazing. With
Spark SQL alone, it took 100-150 seconds
to finish a query; using Alluxio, where
data may hit local or remote Alluxio
nodes, it took 10-15 seconds.

- Shaoshan Liu, Baidu
RESULTS
•  Data queries are now 30x faster with Alluxio

•  1000-worker Alluxio cluster run stably,
providing over 50TB of RAM space

•  By using Alluxio, batch queries usually lasting
over 15 minutes were transformed into an
interactive query taking less than 30 seconds
Accelerate Access to Remote
Storage
•  1000+ nodes deployment
•  2+ petabytes of storage
•  Mix of memory + HDD
•  1000-worker cluster
35
CASE 4:
Unified Name Space
36	
  
Application 
Alluxio
AWS S3
 HDFS
Common Scenarios
•  At large organizations, data spans many storage
systems

•  Legacy storage systems
•  Application logic needs to integrate with different
storage systems

37
Unified Name Space
An abstraction for applications to access different storage
systems through the same interface

•  Benefits
–  Application written once with Allluxio API
–  Transparently add new storage systems
–  Seamlessly integrate with existing systems

38
Transparent Naming
Operations over persisted Alluxio objects mapped
transparently to underlying storage

alluxio://Data/Sales <-> hdfs://Data/Sales
Alluxio
 Storage System (HDFS, S3, …)
alluxio://host:port/
Data
 Users
Reports
 Sales
 Alice
 Bob
hdfs://host:port/
Data
 Users
Reports
 Sales
 Alice
 Bob
39
mount
Multiple Storage Systems
•  Unified namespace for multiple data sources
alluxio://Data/Sales -> s3://bucket/Sales
alluxio://Users/Alice -> hdfs://Users/Alice
•  API for on-the-fly mounting / unmounting

Alluxio
 Storage System A

alluxio://host:port/
Data
 Users
Alice
 Bob
hdfs://host:port/
Users
Alice
 Bob
Storage System B

s3://host/bucket
Reports
 Sales
Reports
 Sales
40
mount
mount
A Proliferation of Under Storage Systems
MORE…
41
CASE STUDY
RESULTS
•  Alluxio’s unified namespace enables different
applications and frameworks to easily interact
with their data from different storage systems

•  Global data centers across North America and
Asia

•  Tiered storage feature manages various
storage resources including memory, SSD and
disk
Transparently Manage Data
Across Different Storage Systems
•  Storage medium: MEM
•  Clusters in different regions
An Internet Company
Machine Learning Pipelines
42
42
Outline
•  What is Alluxio
•  Example use cases
1.  Off-heap memory to alleviate resource pressure
2.  Fast data sharing between jobs
3.  Accelerate access to remote storage
4.  Unified namespace
•  Summary

43
Takeaway: When to use Alluxio
•  Two or more jobs access the same dataset
•  Job(s) may not always succeed
•  Spark with memory/GC pressure
•  Jobs/Apps are pipelined
•  Resulting data does not need to be immediately
persisted
•  Interact with remote data
•  Data stored across different storage systems


44
Try out Alluxio 1.2.0
http://www.alluxio.org/releases
45
First China Meetups!
•  Nanjing (Saturday, 7/30)
•  Shanghai (Sunday, 7/31)


•  Beijing (Sunday, 8/7)
46
Resources
•  Alluxio Project: http://www.alluxio.org
•  Development: https://github.com/Alluxio/alluxio
•  Meet Friends: http://www.meetup.com/Alluxio
•  Alluxio, Inc.: http://www.alluxio.com
•  Training: http://www.alluxio.com/alluxio-training/
•  Contact us: info@alluxio.com
•  Alluxio Wechat: Alluxio
47
Alluxio Use Cases at Strata+Hadoop World Beijing 2016

Alluxio Use Cases at Strata+Hadoop World Beijing 2016

  • 1.
    Memory-Speed Virtual DistributedStorage: Latest Features and Use Cases August 2016 @ Strata+Hadoop World Beijing Bin Fan, Yupeng Fu 1
  • 2.
    About Us •  BinFan •  Software Engineer @ Alluxio, Inc. •  Alluxio PMC •  Worked at Google, MSR •  CMU, CUHK, USTC •  Wechat @ apc999_fb •  Yupeng Fu •  Software Engineer @ Alluxio, Inc. •  Alluxio PMC •  Worked at Palantir, Google •  UCSD, Tsinghua •  Wechat @richbird9 2
  • 3.
    •  Team consistsof Alluxio creators and top committers •  Invested by •  Committed to Alluxio Open Source •  http://www.alluxio.com Alluxio Inc. 3
  • 4.
    Outline •  What isAlluxio •  Example use cases 1.  Off-heap memory to alleviate resource pressure 2.  Fast data sharing between jobs 3.  Accelerate access to remote storage 4.  Unified namespace •  Summary 4
  • 5.
    Non-persistent data-storage software. ATTRIBUTES Memory-Speed Virtual Distributed Storage Scale out architecture. Virtualized across different storage types under a unified namespace. Memory-speed access to data. 5
  • 6.
    OPEN SOURCE ALLUXIO:History • Started in the AMPLab at UC Berkeley –  Summer of 2012 • Open Sourced as Tachyon (renamed to Alluxio) –  Apache 2.0 License –  Latest Release: version 1.2.0 (July, 2016) 6
  • 7.
    OPEN SOURCE ALLUXIO v0.2 v0.3 v0.4 v0.5 v0.6 v0.7 v0.8 v1.0 v1.1 7
  • 8.
    OPEN SOURCE ALLUXIO Thefastest growing open-source project in the big data ecosystem. Currently over 300 contributors from over 100 organizations. 8
  • 9.
    Outline •  What isAlluxio •  Example use cases 1.  Off-heap memory to alleviate resource pressure 2.  Fast data sharing between jobs 3.  Accelerate access to remote storage 4.  Unified namespace •  Summary 9
  • 10.
    Application 1 Alluxio CASE 1: Off-heapMemory Storage to Alleviate Resource Pressure 10
  • 11.
    HDFS / AmazonS3 Consolidating Memory Spark Job1 Spark Memory block 1 block 3 Spark Job2 Spark Memory block 3 block 1 block 1 block 3 block 2 block 4 storage engine & execution engine same process Data duplicated at memory-level 11
  • 12.
    Spark Job1 Spark mem SparkJob2 Spark mem HDFS / Amazon S3 block 1 block 3 block 2 block 4 HDFS disk block 1 block 3 block 2 block 4 Alluxio In-Memory block 1 block 3 block 4 Data not duplicated at memory-level Consolidating Memory storage engine & execution engine same process 12
  • 13.
    Data Resilience duringCrashes Spark Task Spark Memory block manager block 1 block 3 HDFS / Amazon S3 block 1 block 3 block 2 block 4 Process crash requires network and/or disk I/O to re-read the data storage engine & execution engine same process 13
  • 14.
    Data Resilience duringCrashes Crash Spark Memory block manager block 1 block 3 HDFS / Amazon S3 block 1 block 3 block 2 block 4 Process crash requires network and/or disk I/O to re-read the data storage engine & execution engine same process 14
  • 15.
    HDFS / AmazonS3 Data Resilience during Crashes block 1 block 3 block 2 block 4 Crash storage engine & execution engine same process Process crash requires network and/or disk I/O to re-read the data 15
  • 16.
    Spark Task Spark Memory blockmanager HDFS disk block 1 block 3 block 2 block 4 Alluxio In-Memory block 1 block 3 block 4 Process crash only needs memory I/O to re-read the data Data Resilience during Crashes storage engine & execution engine same process 16
  • 17.
    Crash Process crash onlyneeds memory I/O to re-read the data HDFS disk block 1 block 3 block 2 block 4 Alluxio In-Memory block 1 block 3 block 4 Data Resilience during Crashes storage engine & execution engine same process 17
  • 18.
    CASE STUDY Thanks toAlluxio, we now have the raw data immediately available at every iteration and we can skip the costs of loading in terms of time waiting, network traffic, and RDBMS activity. - Henry Powell, Barclays RESULTS •  Barclays workflow iteration time decreased from hours to seconds •  Alluxio enabled workflows that were impossible before •  By keeping data only in memory, the I/O cost of loading and storing in Alluxio is now on the order of seconds Relational Database Share Data Across Jobs at Memory Speed •  Reducing time from hours to seconds •  Solving issues of Teradata as under storage 18
  • 19.
    Barclays Previous Architecture Spark Teradata SlowETL, 30+ minutes Each context restart requires ETL process again h"ps://dzone.com/ar1cles/Accelerate-­‐In-­‐Memory-­‐Processing-­‐with-­‐Spark-­‐from-­‐Hours-­‐to-­‐Seconds-­‐With-­‐Tachyon   19
  • 20.
    Barclays Architecture withAlluxio Spark Teradata Alluxio ETL only once Data still available in Alluxio memory after Job crash 20
  • 21.
    CASE 2: Fast Datasharing between apps Application 1 Alluxio Application2 HDFS 21
  • 22.
    Spark Job Spark Memory block1 block 3 Hadoop MR Job YARN HDFS / Amazon S3 block 1 block 3 block 2 block 4 storage engine & execution engine same process Data Sharing Between Frameworks Inter-process sharing slowed down by network and/or disk I/O 22
  • 23.
    Data Sharing BetweenFrameworks Spark Job Spark Memory Hadoop MR Job YARN HDFS / Amazon S3 block 1 block 3 block 2 block 4 HDFS disk block 1 block 3 block 2 block 4 Alluxio In-Memory block 1 block 3 block 4 storage engine & execution engine same process Inter-process sharing can happen at memory speed 23
  • 24.
    CASE STUDY We’ve beenrunning Alluxio in production for over 9 months, resulting in 15x speedup on average, and 300x speedup at peak service times. - Xueyan Li, Qunar RESULTS •  Multiple computing frameworks can share data at memory speed with Alluxio •  Improved the performance of their system with 15x – 300x speedups •  Alluxio provides various user-friendly APIs, which simplifies the transition to Alluxio. Sharing Data Across Different Compute Frameworks •  6 billion logs (4.5 TB) daily •  Mix of Memory + HDD 24
  • 25.
    Qunar Previous Architecture Flink Streaming Spark Spark Streaming Flink HDFSCluster Some jobs cannot finish Slow access to the storage 25
  • 26.
    Qunar Architecture withAlluxio Flink Streaming HDFS Cluster Spark Spark Streaming Flink Alluxio Share in-memory data across frameworks 26
  • 27.
    CASE 3: Accelerate accessto remote storage Application Alluxio Remote Storage 27
  • 28.
    •  Different computeand storage hardware requirements •  Scale compute and storage resources independently •  Cost effective object stores (Amazon S3, Google GCS, Microsoft Azure Blob Store) are inherently decoupled Decoupling Compute from Storage Accessing data requires remote I/O! 28
  • 29.
    Remote I/O problems Application Remote Storage everydata operation requires data transfer, sometimes over the WAN high latency, network throughput 29
  • 30.
    Visualizing the Stackwith Alluxio FAST 104 - 105 MB/s Application Remote Storage MODERATE 103 - 104 MB/s SLOW 102 - 103 MB/s Alluxio MEM Often Only When Necessary Alluxio SSD/HDD Limited 30
  • 31.
    What if memorycapacity is still not enough? 31
  • 32.
    Alluxio Manages AllLocal Storage MEM SSD HDD Faster Higher Capacity 32
  • 33.
    Configurable Storage Tiers MEMonly MEM + HDD SSD only 33
  • 34.
    Pluggable Tier ManagementPolicies Evict stale data to slower tier Promote hot data to faster tier 34
  • 35.
    CASE STUDY Baidu FileSystem The performance was amazing. With Spark SQL alone, it took 100-150 seconds to finish a query; using Alluxio, where data may hit local or remote Alluxio nodes, it took 10-15 seconds. - Shaoshan Liu, Baidu RESULTS •  Data queries are now 30x faster with Alluxio •  1000-worker Alluxio cluster run stably, providing over 50TB of RAM space •  By using Alluxio, batch queries usually lasting over 15 minutes were transformed into an interactive query taking less than 30 seconds Accelerate Access to Remote Storage •  1000+ nodes deployment •  2+ petabytes of storage •  Mix of memory + HDD •  1000-worker cluster 35
  • 36.
    CASE 4: Unified NameSpace 36   Application Alluxio AWS S3 HDFS
  • 37.
    Common Scenarios •  Atlarge organizations, data spans many storage systems •  Legacy storage systems •  Application logic needs to integrate with different storage systems 37
  • 38.
    Unified Name Space Anabstraction for applications to access different storage systems through the same interface •  Benefits –  Application written once with Allluxio API –  Transparently add new storage systems –  Seamlessly integrate with existing systems 38
  • 39.
    Transparent Naming Operations overpersisted Alluxio objects mapped transparently to underlying storage alluxio://Data/Sales <-> hdfs://Data/Sales Alluxio Storage System (HDFS, S3, …) alluxio://host:port/ Data Users Reports Sales Alice Bob hdfs://host:port/ Data Users Reports Sales Alice Bob 39 mount
  • 40.
    Multiple Storage Systems • Unified namespace for multiple data sources alluxio://Data/Sales -> s3://bucket/Sales alluxio://Users/Alice -> hdfs://Users/Alice •  API for on-the-fly mounting / unmounting Alluxio Storage System A alluxio://host:port/ Data Users Alice Bob hdfs://host:port/ Users Alice Bob Storage System B s3://host/bucket Reports Sales Reports Sales 40 mount mount
  • 41.
    A Proliferation ofUnder Storage Systems MORE… 41
  • 42.
    CASE STUDY RESULTS •  Alluxio’sunified namespace enables different applications and frameworks to easily interact with their data from different storage systems •  Global data centers across North America and Asia •  Tiered storage feature manages various storage resources including memory, SSD and disk Transparently Manage Data Across Different Storage Systems •  Storage medium: MEM •  Clusters in different regions An Internet Company Machine Learning Pipelines 42 42
  • 43.
    Outline •  What isAlluxio •  Example use cases 1.  Off-heap memory to alleviate resource pressure 2.  Fast data sharing between jobs 3.  Accelerate access to remote storage 4.  Unified namespace •  Summary 43
  • 44.
    Takeaway: When touse Alluxio •  Two or more jobs access the same dataset •  Job(s) may not always succeed •  Spark with memory/GC pressure •  Jobs/Apps are pipelined •  Resulting data does not need to be immediately persisted •  Interact with remote data •  Data stored across different storage systems 44
  • 45.
    Try out Alluxio1.2.0 http://www.alluxio.org/releases 45
  • 46.
    First China Meetups! • Nanjing (Saturday, 7/30) •  Shanghai (Sunday, 7/31) •  Beijing (Sunday, 8/7) 46
  • 47.
    Resources •  Alluxio Project:http://www.alluxio.org •  Development: https://github.com/Alluxio/alluxio •  Meet Friends: http://www.meetup.com/Alluxio •  Alluxio, Inc.: http://www.alluxio.com •  Training: http://www.alluxio.com/alluxio-training/ •  Contact us: info@alluxio.com •  Alluxio Wechat: Alluxio 47