Data Orchestration for AI, Big Data, and Cloud

IFA+ Summit 2019
Sept 9, 2019
Keynote by Haoyuan Li, Alluxio
Founder, Chairman, CTO

For more events: https://www.alluxio.io/events/

Published in: Software

  1. Data Orchestration for AI, Big Data, and Cloud. Haoyuan (HY) Li, Founder, Chairman, CTO, Alluxio. haoyuan@alluxio.com | @haoyuan. 2019-09-09 @ IFA+ Summit 2019.
  2. Realities: A Fragmented Data World. Data silos are inevitable; more data is generated every day; data scientists and analysts need access to this data; new compute technologies in the cloud.
  3. Data Silos are Inevitable
  4. Single Data Lake; Limit Actively Managed Data; Abstract & Orchestrate Data
  5. Data silos cross data centers, regions, clouds. Data in disparate storage systems (diagram labels: HDFS, NFS, object store, S3, Azure, WAN). Compute spread across many different frameworks (Hive, Spark, TensorFlow, Presto).
  6. Abstract & orchestrate data across data silos. A data orchestration layer sits between the compute frameworks (Hive, Spark, TensorFlow, Presto, any data app) and the data in disparate storage systems (HDFS, NFS, S3).
  7. Data Orchestration for the Cloud: data locality, accessibility & elasticity for AI & data analytics. Faster data to insights to innovation; elastic compute resources in the cloud; cost savings on both machines and people.
  8. Data Orchestration for AI, Big Data, and Cloud
  9. Data Orchestration for AI, Big Data, and Cloud. Application-facing interfaces: Java File API, HDFS Interface, S3 Interface, REST API, FUSE Interface. Storage drivers: HDFS Driver, Swift Driver, S3 Driver, NFS Driver.
  10. Data Orchestration for Agility
  11. Data Orchestration for Compute Bursting
  12. An Open Source Implementation of Data Orchestration. Started from UC Berkeley AMPLab; 1000+ contributors & growing; 4000+ GitHub stars; Apache 2.0 licensed; one of GitHub's Top 100 Most Valuable Repositories out of 96 million. Join the conversation on Slack: slackin.alluxio.io
  13. Companies Moving Towards Data Orchestration
  14. Embracing Data Silos: the Data Orchestration Approach. Join the open source community! www.alluxio.io | @alluxio | slackin.alluxio.io
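
Slides 6 and 9 describe one access layer in front of several storage drivers. The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Alluxio's API: `StorageDriver`, `InMemoryDriver`, `UnifiedNamespace`, and the mount paths are all hypothetical names, and the in-memory driver stands in for real HDFS, S3, or NFS clients. Paths are routed to a driver by longest matching mount prefix.

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol, Tuple


class StorageDriver(Protocol):
    """Hypothetical driver interface (an assumption, not a real library API)."""

    def read(self, path: str) -> bytes: ...


@dataclass
class InMemoryDriver:
    """Stand-in for a real storage driver; serves bytes from a dict."""

    name: str
    objects: Dict[str, bytes] = field(default_factory=dict)

    def read(self, path: str) -> bytes:
        return self.objects[path]


class UnifiedNamespace:
    """Maps one logical path tree onto several storage systems."""

    def __init__(self) -> None:
        self._mounts: Dict[str, StorageDriver] = {}

    def mount(self, prefix: str, driver: StorageDriver) -> None:
        # Store prefixes without a trailing slash so matching is uniform.
        self._mounts[prefix.rstrip("/")] = driver

    def resolve(self, path: str) -> Tuple[StorageDriver, str]:
        # Pick the longest mount prefix that covers the requested path.
        best = None
        for prefix in self._mounts:
            if path == prefix or path.startswith(prefix + "/"):
                if best is None or len(prefix) > len(best):
                    best = prefix
        if best is None:
            raise FileNotFoundError(f"no mount covers {path}")
        # Return the driver and the path relative to its mount point.
        return self._mounts[best], path[len(best):] or "/"

    def read(self, path: str) -> bytes:
        driver, relative = self.resolve(path)
        return driver.read(relative)


# Two silos exposed under one logical tree (hypothetical data).
ns = UnifiedNamespace()
ns.mount("/warehouse", InMemoryDriver("hdfs", {"/events.csv": b"a,b\n"}))
ns.mount("/warehouse/archive", InMemoryDriver("s3", {"/2018.csv": b"old\n"}))

print(ns.read("/warehouse/events.csv"))        # served by the "hdfs" stand-in
print(ns.read("/warehouse/archive/2018.csv"))  # served by the "s3" stand-in
```

An application reads `/warehouse/...` without knowing which silo holds the bytes; swapping a mount to a different driver leaves application code unchanged.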
