Alluxio Presentation at AMPLab Summer Retreat 2016

A high-level overview of Alluxio, a memory-speed virtual distributed storage system.


  1. Alluxio (formerly Tachyon): Memory Speed Virtual Distributed Storage. Haoyuan Li, June 1st, 2016 @ AMPLab 2016 Summer Retreat
  2. What is Alluxio?
       •  Memory Speed Virtual Distributed Storage
       •  Enables Virtualized Data Across Multiple Types of Storage
  3. Alluxio Open Source Contributor Growth
       •  Over 250 contributors from over 100 organizations
       •  3x growth over the last year!
  4. Introducing Alluxio Open Source Governance
       •  Alibaba
       •  Alluxio
       •  Baidu
       •  Fosun International
       •  Google
       •  Huawei
       •  IBM
       •  Intel
       •  Nanjing University
       •  UC Berkeley
  5. Performance Trend: Memory is Fast
       •  RAM throughput increasing exponentially
       •  Disk throughput increasing slowly
       •  Memory-locality key to interactive response times
  6. Price Trend: Memory is Cheaper (Source: jcmit.com)
  7. The Big Data Ecosystem Today
  8. The Big Data Ecosystem Today
  9. Alluxio Approach
  10. Alluxio Benefits
       •  Flexibility
            –  Enable new workloads across any storage system
            –  Unified namespace enables applications to access data in any storage system
            –  Future-proof architecture
       •  Technology of your choice
            –  Work with the framework of your choice
            –  Work with the storage of your choice
       •  Performance
            –  High-performance data access
            –  Efficient data sharing among different computation frameworks and applications
       •  Cost Saving
            –  Scale storage and compute independently
       Alluxio: Any application accesses any data from any storage at memory speed.
  11. New Features
       •  Tiered Storage
       •  Transparent Naming
       •  Unified Namespace
       •  Native Amazon S3, Google Cloud Storage, OpenStack Swift, Alibaba OSS integrations
       •  FUSE Connector, K/V Interface
       •  One Command Cluster Deployment
       •  Metrics Reporting
  12. The Storage Tier Hierarchy: MEM, SSD, HDD
  13. Automatic Data Migration
       •  Data can be evicted to lower layers if it is “cooling down”
       •  Data can be promoted to upper layers if it is “warming up”
       Diagram labels: evict stale data to lower tier; promote hot data to upper tier
  14. Transparent Naming
       •  Applications can transparently and efficiently interact with remote storage through Alluxio.
       •  Applications do not need to use different APIs for interacting with different storage systems.
       Diagram: alluxio://host:port/ (Alluxio) and s3n://bucket/directory (Storage System) show the same tree: data, users, reports, sales, alice, bob
  15. Unified Namespace
       •  Applications can read from and write to different storage systems
       •  Decouples data location from application
       Diagram: alluxio://host:port/ (Alluxio) contains data, users, reports, sales, alice, bob; hdfs://host:port/ (Storage System A) contains users, alice, bob; s3n://bucket/directory (Storage System B) contains reports, sales
  16. •  Framework: Spark
       •  Under Storage: Baidu’s File System
       •  Storage Media: MEM + HDD
       •  200+ node deployment
       •  2PB+ managed space
  17. •  Framework: Spark
       •  Storage Media: MEM
       •  Improvement from Hours to Seconds
  18. •  Framework: Spark Streaming & Flink
       •  Under Storage: HDFS & Ceph
       •  Storage Media: MEM + HDD
       •  200 node deployment
       •  Alluxio enables previously impossible jobs to finish
       •  10x Performance Improvement on average
       •  300x Performance Improvement during peak time
  19. Contacts
       •  Alluxio Open Source Project: www.alluxio.org
       •  Alluxio, Inc: www.alluxio.com
       •  Development: www.github.com/Alluxio/alluxio
       •  Meet Friends: www.meetup.com/Alluxio
       •  Contact: info@alluxio.com ; haoyuan@alluxio.com
  20. Thank You
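
The tier hierarchy and automatic data migration slides describe data moving down as it cools and up as it warms. The Python sketch below is one way to picture that behavior and is not Alluxio's implementation: the `TieredStore` class, the per-tier block capacities, and the least-recently-used eviction rule are all assumptions made for illustration. Only the tier names MEM, SSD, and HDD come from the slides.

```python
from collections import OrderedDict

TIERS = ["MEM", "SSD", "HDD"]  # fastest to slowest, as named on the slide


class TieredStore:
    """Toy block store (hypothetical): each tier is an LRU-ordered dict."""

    def __init__(self, capacities):
        # capacities: max blocks per tier; the numbers are assumed, not from the talk
        self.capacities = capacities
        self.tiers = {name: OrderedDict() for name in TIERS}

    def _insert(self, level, block_id, data):
        """Place a block in a tier; if the tier overflows, evict its stalest block downward."""
        name = TIERS[level]
        tier = self.tiers[name]
        tier[block_id] = data  # newest entries sit at the end
        if len(tier) > self.capacities[name]:
            stale_id, stale_data = tier.popitem(last=False)  # least recently used
            if level + 1 < len(TIERS):
                self._insert(level + 1, stale_id, stale_data)
            # Otherwise the block leaves the lowest tier and is dropped in this toy model.

    def write(self, block_id, data):
        """New or rewritten blocks land in the top tier."""
        for tier in self.tiers.values():
            tier.pop(block_id, None)  # remove any older copy first
        self._insert(0, block_id, data)

    def read(self, block_id):
        """Reading a block marks it hot and promotes it to the top tier."""
        for name in TIERS:
            tier = self.tiers[name]
            if block_id in tier:
                data = tier.pop(block_id)
                self._insert(0, block_id, data)
                return data
        raise KeyError(block_id)

    def tier_of(self, block_id):
        """Return the name of the tier holding the block, or None."""
        for name in TIERS:
            if block_id in self.tiers[name]:
                return name
        return None
```

With `TieredStore({"MEM": 2, "SSD": 2, "HDD": 4})`, writing three blocks pushes the oldest one from MEM to SSD, and reading that block again promotes it back to MEM while the next-stalest block moves down.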

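
The unified namespace slide shows a single alluxio://host:port/ tree backed by two storage systems. A mount table with longest-prefix matching is one plausible way to model that mapping. In the sketch below, the `MOUNTS` table, the `resolve` function, and the exact mount points (`/data/users`, `/data/reports`) are assumptions inferred from the flattened diagram labels, not Alluxio's actual API or configuration.

```python
# Hypothetical mount table: Alluxio path prefix -> under-storage URI.
# The mount points are guesses based on the slide's diagram labels.
MOUNTS = {
    "/data/users": "hdfs://host:port/users",
    "/data/reports": "s3n://bucket/directory/reports",
}


def resolve(alluxio_path, mounts=MOUNTS):
    """Translate an Alluxio path to an under-storage URI via longest-prefix match."""
    best = None
    for mount_point in mounts:
        covers = (alluxio_path == mount_point
                  or alluxio_path.startswith(mount_point + "/"))
        if covers and (best is None or len(mount_point) > len(best)):
            best = mount_point
    if best is None:
        raise KeyError(f"no mount covers {alluxio_path}")
    # Keep the part of the path below the mount point unchanged.
    return mounts[best] + alluxio_path[len(best):]


if __name__ == "__main__":
    print(resolve("/data/users/alice"))
    print(resolve("/data/reports/sales"))
```

The application only ever sees paths such as `/data/users/alice`; which storage system holds the bytes is decided by the table, which is the decoupling of data location from application that the slide describes.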