Data Orchestration for AI, Big Data, and Cloud

Haoyuan (HY) Li | Founder, Chairman, CTO | Alluxio
haoyuan@alluxio.com | @haoyuan
2019-06-21 @ O’Reilly AI Beijing

The journey to a fragmented data world
More people & teams need access to this data
More data generated every day
New compute & storage technologies created every 3-8 years

Single Data Lake
Limit Actively Managed Data
Abstract & Orchestrate Data

Data silos cross data centers, regions, clouds
[Figure: data in disparate storage systems (HDFS, NFS, object store, S3, Azure) and compute spread across many different frameworks (Hive, Spark, TensorFlow, Presto), with sites connected over WAN.]

Abstract & orchestrate data across data silos
[Figure: a data orchestration layer sits between compute spread across many different frameworks (Hive, Spark, TensorFlow, Presto, any data app) and data in disparate storage systems (HDFS, NFS, S3).]

Data Orchestration for the cloud
Data Locality, Accessibility & Elasticity for AI & Big Data
- Accelerate time to insight by making hot data local to compute
- Burst data elastically with compute, anytime, in any cloud environment
- Reduce costs by eliminating time-consuming ETL and multiple persisted copies
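The data locality point above can be illustrated with a small read-through cache. This is only a toy sketch, not Alluxio's implementation: the class name `ReadThroughCache`, the `fetch_remote` callback, and the LRU eviction policy are all assumptions made for illustration, and the "remote store" is just an in-memory dictionary standing in for S3 or HDFS.

```python
from collections import OrderedDict
from typing import Callable


class ReadThroughCache:
    """Toy read-through LRU cache that keeps hot objects near compute.

    Hypothetical illustration only; it does not model any real system's API.
    """

    def __init__(self, fetch_remote: Callable[[str], bytes], capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._fetch_remote = fetch_remote  # assumed callback into remote storage
        self._capacity = capacity          # max number of objects held locally
        self._local: "OrderedDict[str, bytes]" = OrderedDict()
        self.remote_reads = 0              # counts trips to remote storage

    def read(self, path: str) -> bytes:
        # Hot path: serve from the local copy and mark it most recently used.
        if path in self._local:
            self._local.move_to_end(path)
            return self._local[path]
        # Cold path: fetch once from remote storage, then keep a local copy.
        data = self._fetch_remote(path)
        self.remote_reads += 1
        self._local[path] = data
        # Evict the least recently used object when over capacity.
        if len(self._local) > self._capacity:
            self._local.popitem(last=False)
        return data


# Usage: a dictionary stands in for a remote object store.
remote_store = {"s3://bucket/train.csv": b"a,b\n1,2\n"}
cache = ReadThroughCache(remote_store.__getitem__, capacity=2)
first = cache.read("s3://bucket/train.csv")   # goes to the remote store
second = cache.read("s3://bucket/train.csv")  # served from the local copy
```

In this sketch, repeated reads of the same path touch remote storage only once, which is the effect the slide attributes to making hot data local to compute.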

Data Orchestration for AI, Big Data, and Cloud

Application interfaces: Java File API, HDFS Interface, S3 Interface, REST API, FUSE Interface
Storage drivers: HDFS Driver, Swift Driver, S3 Driver, NFS Driver
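One way to picture the layering on this slide is a mount table that maps a single logical namespace onto several underlying storage systems. The sketch below is a hypothetical model, not Alluxio's API: `UnifiedNamespace`, `mount`, and `resolve` are made-up names, and the host names and bucket names are placeholders.

```python
from typing import Dict


class UnifiedNamespace:
    """Toy mount table mapping logical paths to storage URIs.

    Hypothetical illustration only; names and behavior are assumptions.
    """

    def __init__(self) -> None:
        self._mounts: Dict[str, str] = {}

    def mount(self, mount_point: str, storage_uri: str) -> None:
        # Normalize trailing slashes so prefix matching is unambiguous.
        self._mounts[mount_point.rstrip("/") or "/"] = storage_uri.rstrip("/")

    def resolve(self, path: str) -> str:
        # Pick the longest mount point that is a path-component prefix.
        best = None
        for mount_point in self._mounts:
            prefix = mount_point if mount_point == "/" else mount_point + "/"
            if path == mount_point or path.startswith(prefix):
                if best is None or len(mount_point) > len(best):
                    best = mount_point
        if best is None:
            raise FileNotFoundError(f"no mount covers {path}")
        relative = path[len(best):].lstrip("/")
        base = self._mounts[best]
        return f"{base}/{relative}" if relative else base


# Usage: applications see one namespace; storage stays where it is.
ns = UnifiedNamespace()
ns.mount("/warehouse", "hdfs://namenode:8020/warehouse")  # placeholder host
ns.mount("/models", "s3://ml-bucket/models")              # placeholder bucket
resolved = ns.resolve("/warehouse/t1/part-0")
```

In this model an application addresses `/warehouse/t1/part-0` without knowing which storage system holds it, which is the separation between application interfaces and storage drivers that the slide lists.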

Data Orchestration for Agility

Data Orchestration for Compute Bursting
Leading Hedge Fund

An Open Source Implementation of Data Orchestration
Started From UC Berkeley AMPLab
1000+ contributors & growing
4000+ Git Stars
Apache 2.0 Licensed
GitHub’s Top 100 Most Valuable Repositories Out of 96 Million
Join the conversation on Slack: slackin.alluxio.io

Companies Moving Towards Data Orchestration
(Including 8 of the Top 10 Internet Companies in China)

Embracing Data Silos – the Data Orchestration Approach
Welcome to join the Alluxio Open Source Community!
www.alluxio.io | @alluxio | slackin.alluxio.io
