Open Source Memory Speed Virtual Distributed Storage

Bay Area Meetup presentation (6/15/16)

Published in: Technology

  1. 1. Alluxio (formerly Tachyon): Open Source Memory Speed Virtual Distributed Storage. Haoyuan Li, CEO, Alluxio, Inc.
  2. 2. Rebranded from Tachyon to Alluxio!
  3. 3. Rebranded from Tachyon to Alluxio! http://www.alluxio.com/blog/
  4. 4. About Alluxio • Team – Alluxio Creators and Top Developers/Committers (all top 8 committers). • Investors
  5. 5. Performance Trend: Memory is Fast • RAM throughput increasing exponentially • Disk throughput increasing slowly • Memory-locality key to interactive response times
  6. 6. Price Trend: Memory is Cheaper (Source: jcmit.com)
  7. 7. The Big Data Ecosystem Today
  8. 8. What is Alluxio? • Alluxio: Memory Speed Virtual Distributed Storage • Enables Virtualized Data Across Multiple Types of Storage
  9. 9. Open Source Community Growth • Chart: number of contributors (git commit history) by release, v0.2 through v0.8; axis range 0 to 350
  10. 10. Open Source Community Growth • Chart: number of contributors (git commit history) by release, v0.2 through v1.1; axis range 0 to 350
  11. 11. Open Source Alluxio System • The fastest growing open source project in big data • Over 250 contributors from over 100 organizations
  12. 12. Alluxio Benefits • Flexibility: enable new workloads across any storage system; the Unified Name Space enables applications to access data in any storage system • Agility: work with the framework and the storage of your choice • Performance: high-performance data access • Cost: grow storage and compute independently • Any application can access any data from any storage at memory speed.
  13. 13. New Features and Improvements in Alluxio 1.0 and 1.1 Gene Pang @ Alluxio, Inc. June 15, 2016 @ Alluxio Meetup (hosted by Intel)
  14. 14. About Me • Gene Pang - Software Engineer @ Alluxio, Inc. • One of the core maintainers of Alluxio Open Source Project • Ph.D. @ AMPLab, UC Berkeley • Worked at Google before UC Berkeley • Twitter: @unityxx
  15. 15. Outline • Alluxio Architecture Overview • New Developments in Alluxio • Performance Improvement Results in Alluxio 1.1
  16. 16. Alluxio Architecture Overview
  17. 17. Architecture Overview • Alluxio Master (with Journal): manages metadata • Alluxio Workers: serve data blocks • Under File Systems: mount multiple storage systems
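
To make "mount multiple storage systems" concrete, here is a minimal sketch of how a single namespace could map paths onto several under file systems with longest-prefix matching. It is an illustration only and is not Alluxio's implementation: the `MountTable` class, its method names, and the example URIs are all hypothetical.

```python
from typing import Dict, Optional


class MountTable:
    """Toy mount table (hypothetical): namespace prefix -> under-storage URI."""

    def __init__(self) -> None:
        self._mounts: Dict[str, str] = {}

    def mount(self, mount_point: str, ufs_uri: str) -> None:
        # Normalise so "/s3/" and "/s3" name the same mount point.
        self._mounts[mount_point.rstrip("/") or "/"] = ufs_uri.rstrip("/")

    def resolve(self, path: str) -> str:
        """Translate a namespace path into an under-storage URI."""
        best: Optional[str] = None
        for mount_point in self._mounts:
            prefix = "" if mount_point == "/" else mount_point
            # Match whole path components only, so "/s3x" is not under "/s3".
            if path == mount_point or path.startswith(prefix + "/"):
                if best is None or len(mount_point) > len(best):
                    best = mount_point
        if best is None:
            raise KeyError(f"no mount covers {path}")
        suffix = path if best == "/" else path[len(best):]
        suffix = suffix.strip("/")
        return self._mounts[best] + ("/" + suffix if suffix else "")


# Example URIs below are made up for illustration.
table = MountTable()
table.mount("/", "hdfs://namenode:9000/alluxio")
table.mount("/s3", "s3a://bucket/data")
```

With these two mounts, `table.resolve("/s3/logs/a.txt")` returns `s3a://bucket/data/logs/a.txt`, while `table.resolve("/tmp/x")` falls through to the root mount and returns `hdfs://namenode:9000/alluxio/tmp/x`.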
  18. 18. Alluxio New Developments
  19. 19. Releases • Tachyon 0.8 (Oct 22, 2015) • Alluxio 1.0 (Feb 23, 2016) • Alluxio 1.1 (Jun 7, 2016)
  20. 20. New Developments • New Integrations • Usability Improvements • Performance Improvements • Access Control (Alpha)
  21. 21. New Integrations • Native OpenStack Swift Driver: improve performance, reduce complexity • Alluxio to FUSE Connector: mount Alluxio to local file system • Google Cloud Storage: manage data on Google Cloud Platform • Aliyun Object Storage Service: manage data on Alibaba Cloud • Google Compute Engine: deploy Alluxio on Google Cloud Platform
  22. 22. Access Control (Alpha) • User/Group Support: similar to POSIX permission model • File System Permissions: similar to POSIX permission model • Command-line Permission Tools: chown, chgrp, chmod • Configuration Parameter: alluxio.security.authorization.permission.enabled
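
The slide says permissions are "similar to POSIX permission model". As a rough sketch of what a POSIX-style check looks like in general (not Alluxio's code), the example below picks the owner, group, or other bits of a mode and tests them against the requested access. The `Inode` type and `is_allowed` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Standard POSIX permission bit values.
READ, WRITE, EXECUTE = 4, 2, 1


@dataclass(frozen=True)
class Inode:
    """Hypothetical file metadata record."""
    owner: str
    group: str
    mode: int  # e.g. 0o640 means rw- for owner, r-- for group, --- for other


def is_allowed(inode: Inode, user: str, groups: FrozenSet[str], wanted: int) -> bool:
    """Return True if `user` holds every permission bit in `wanted`."""
    if user == inode.owner:
        bits = (inode.mode >> 6) & 0o7   # owner triplet
    elif inode.group in groups:
        bits = (inode.mode >> 3) & 0o7   # group triplet
    else:
        bits = inode.mode & 0o7          # other triplet
    return (bits & wanted) == wanted


# Made-up users and groups for illustration.
report = Inode(owner="alice", group="analytics", mode=0o640)
```

Under mode `0o640`, `alice` can read and write, a member of `analytics` can only read, and everyone else is denied. A `chmod` in this model amounts to replacing `mode`, and `chown`/`chgrp` to replacing `owner`/`group`.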
  23. 23. Usability Improvements • Write Location Policies: configure how to write data to Alluxio • Simplified Configuration: customize with properties • Automatic Metadata Loading: load metadata automatically
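
A write location policy decides which worker receives a new block. The sketch below shows two plausible strategies, choosing the worker with the most free space and rotating through workers in turn. The slide does not list the policies Alluxio ships, so the `Worker` type, `most_available_first`, and `RoundRobin` are assumptions made for illustration and are not the Alluxio API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Worker:
    """Hypothetical view of a worker: host name and free capacity."""
    host: str
    free_bytes: int


def most_available_first(workers: List[Worker], block_size: int) -> Worker:
    """Pick the worker with the most free space that can hold the block."""
    candidates = [w for w in workers if w.free_bytes >= block_size]
    if not candidates:
        raise RuntimeError("no worker has room for the block")
    return max(candidates, key=lambda w: w.free_bytes)


class RoundRobin:
    """Rotate through workers, skipping any that cannot hold the block."""

    def __init__(self, workers: List[Worker]) -> None:
        self._workers = list(workers)
        self._next = 0

    def pick(self, block_size: int) -> Worker:
        # Try each worker at most once per call.
        for _ in range(len(self._workers)):
            worker = self._workers[self._next]
            self._next = (self._next + 1) % len(self._workers)
            if worker.free_bytes >= block_size:
                return worker
        raise RuntimeError("no worker has room for the block")


# Made-up cluster state for illustration.
cluster = [Worker("a", 100), Worker("b", 500), Worker("c", 50)]
```

For a 64-byte block, `most_available_first(cluster, 64)` selects worker `b`, and a `RoundRobin(cluster)` policy yields `a`, then `b`, then `a` again because `c` is too full. Keeping the policy behind a single "pick a worker" call is what makes it swappable through configuration.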
  24. 24. Performance Improvements • Improved Alluxio Master Scalability: fine-grained locking, efficient journaling • Improved Alluxio Worker Scalability: improved data structures, improved locking • Better Support for Random I/O Workloads: cache blocks during random I/O (e.g., parquet files)
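
The idea behind "cache blocks during random I/O" can be sketched as follows: a read at an arbitrary offset pulls in the whole enclosing block and keeps it in memory, so later reads that land in the same block avoid another trip to the under storage. This is a simplified illustration with an LRU eviction rule chosen for brevity; `BlockCachingReader` and its parameters are hypothetical and do not describe Alluxio's internals.

```python
from collections import OrderedDict
from typing import Callable


class BlockCachingReader:
    """Serve random reads from whole blocks cached in memory (LRU eviction)."""

    def __init__(self, fetch_block: Callable[[int], bytes],
                 block_size: int, capacity: int) -> None:
        self._fetch = fetch_block          # loads block i from slow storage
        self._block_size = block_size
        self._capacity = capacity          # max number of cached blocks
        self._cache: "OrderedDict[int, bytes]" = OrderedDict()
        self.fetches = 0                   # count of slow-storage reads

    def _block(self, index: int) -> bytes:
        if index in self._cache:
            self._cache.move_to_end(index)  # mark as most recently used
            return self._cache[index]
        data = self._fetch(index)
        self.fetches += 1
        self._cache[index] = data
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict least recently used
        return data

    def read(self, offset: int, length: int) -> bytes:
        out = bytearray()
        while length > 0:
            index, start = divmod(offset, self._block_size)
            chunk = self._block(index)[start:start + length]
            if not chunk:
                break  # read ran past the end of the data
            out += chunk
            offset += len(chunk)
            length -= len(chunk)
        return bytes(out)


# Stand-in for slow storage: 1024 bytes split into 256-byte blocks.
DATA = bytes(range(256)) * 4
reader = BlockCachingReader(lambda i: DATA[i * 256:(i + 1) * 256],
                            block_size=256, capacity=2)
```

Reading `reader.read(300, 10)` and then `reader.read(260, 5)` touches slow storage only once, since both ranges fall inside block 1. Columnar formats such as Parquet produce this access pattern, with many small reads clustered around footers and column chunks.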
  25. 25. Alluxio 1.1 Performance Improvement Results
  26. 26. Create File Throughput (Local Journal) • Chart: throughput vs. test duration • Series: 1.0.1
  27. 27. Create File Throughput (Local Journal) • Chart: throughput vs. test duration • Series: 1.0.1, 1.1.0 • 1.8x improvement
  28. 28. Create File Throughput (Remote Journal) • Chart: throughput vs. test duration • Series: 1.0.1
  29. 29. Create File Throughput (Remote Journal) • Chart: throughput vs. test duration • Series: 1.0.1, 1.1.0 • 23x improvement
  30. 30. List Directory Throughput • Chart: throughput vs. test duration • Series: 1.0.1
  31. 31. List Directory Throughput • Chart: throughput vs. test duration • Series: 1.0.1, 1.1.0 • 7x improvement
  32. 32. Worker Scalability • Chart: write latency vs. number of blocks on worker • Series: 1.0.1
  33. 33. Worker Scalability • Chart: write latency vs. number of blocks on worker • Series: 1.0.1, 1.1.0
  34. 34. Try out Alluxio 1.1.0: http://www.alluxio.org/releases
