Introduction to Alluxio (formerly Tachyon) and how it brings up to 300x performance improvement to Qunar's streaming processing

Published in: Technology

  1. Introduction to Alluxio (formerly Tachyon) and how it brings up to 300x performance improvement to Qunar’s streaming processing. Yupeng Fu, Alluxio Inc.; Xueyan Li, Qunar Inc. May 2017
  2. ABOUT US • Yupeng Fu (yupeng9@github) • Software Engineer @ Alluxio, Inc. • Alluxio PMC member • Worked at Palantir, Google • Xueyan Li (astralidea@github) • Software Engineer @ Qunar, Inc. • Alluxio contributor
  3. HISTORY • Started at UC Berkeley AMPLab in summer 2012 • Originally named Tachyon • Rebranded to Alluxio in early 2016 • Open sourced in 2013 • Apache License 2.0 • Latest stable release: Alluxio 1.4.0 (Jan 2017) • Alluxio 1.5.0-RC1 just cut
  4. FASTEST-GROWING BIG DATA PROJECT • Fastest-growing open-source project in the big data ecosystem • 500+ contributors from 100+ organizations • Running the world’s largest production clusters
  5. DATA ECOSYSTEM YESTERDAY • One compute framework • Single storage system • Co-located
  6. DATA ECOSYSTEM TODAY • Many compute frameworks • Multiple storage systems • Most not co-located
  7. DATA ECOSYSTEM ISSUES • Each application manages multiple data sources • Adding or removing data sources requires application changes • Storage optimizations require application changes • Lower performance due to lack of locality
  8. DATA ECOSYSTEM WITH ALLUXIO • Apps only talk to Alluxio • Simple add/remove • No app changes • Highest performance in memory • No lock-in. Diagram labels (application side): Native File System, Hadoop Compatible File System, Native Key-Value Interface, FUSE Compatible File System. Diagram labels (storage side): HDFS Interface, Amazon S3 Interface, Swift Interface, GlusterFS Interface
  9. WHY ALLUXIO • Co-located compute and data with memory-speed access to data • Virtualized across different storage systems under a unified namespace • Scale-out architecture • File system API, software only
  10. ALLUXIO BENEFITS • Unification: new workflows across any data in any storage system • Performance: orders-of-magnitude improvement in run time • Flexibility: choice in compute and storage; grow each independently, buy only what is needed
  11. ALLUXIO DEPLOYMENTS
  12. ALLUXIO USE CASES • On-demand analytics and accelerating I/O to and from remote storage • Managing data across disparate storage systems • Sharing data across workloads at memory speed • Unifying data analytics on data from geo-distributed stores (one of the top 3 IT vendors)
  13. ON-DEMAND ANALYTICS & ACCELERATE I/O TO/FROM REMOTE STORAGE “The performance was amazing. With Spark SQL alone, it took 100-150 seconds to finish a query; using Alluxio, where data may hit local or remote Alluxio nodes, it took 10-15 seconds.” RESULTS • Data queries are now 30x faster with Alluxio • The Alluxio cluster runs stably, providing over 50TB of RAM space • With Alluxio, batch queries that usually took over 15 minutes became interactive queries taking less than 30 seconds. PMs run interactive queries to gain insights into their products & business • 200+ node deployment • 2+ petabytes of storage • Mix of memory + HDD. Diagram labels: ALLUXIO, Baidu File System
  14. SHARE DATA ACROSS JOBS @ MEMORY SPEED “Thanks to Alluxio, we now have the raw data immediately available at every iteration & can skip the costs of loading in terms of time waiting, network traffic, and RDBMS activity.” RESULTS • Barclays workflow iteration time decreased from hours to seconds • Alluxio enabled workflows that were impossible before • By keeping data only in memory, the I/O cost of loading and storing in Alluxio is now on the order of seconds. Barclays uses query & machine learning to train models for risk management • 6-node deployment • 1TB of storage • Memory only. Diagram labels: ALLUXIO, Relational Database: Teradata
  15. ONE OF TOP 3 IT VENDORS: UNIFY DATA ANALYTICS ON DATA FROM GEO-DISTRIBUTED STORES “Alluxio has enabled us to get valuable insights into all our data as opposed to just a subset.” (VP of Analytics) RESULTS • Alluxio Unified Global Namespace enabled access to data from stores in different data centers without the need for ETL • Enables insights into the business that were otherwise not possible due to ETL restrictions on data. Analysts at a major global IT company run analytics on worldwide data • 10 data centers across different geo-regions: North America, Europe, and Asia. Diagram labels: ALLUXIO, Europe, Asia
  16. MANAGE DATA ACROSS STORAGE SYSTEMS “We’ve been running in production for over 1 year. Alluxio has enabled different applications & frameworks to easily interact with data from different storage systems.” RESULTS • Efficient data sharing among Spark Streaming, Spark batch and Flink jobs • Improved the performance of their system with 15x to 300x speedups • Tiered storage feature manages storage resources including memory, SSD and disk. Qunar uses real-time machine learning for their website ads • 200+ node deployment • 6 billion logs (4.5 TB) daily • Mix of memory + HDD. Diagram label: ALLUXIO
  17. About Qunar • Leading travel search engine and information provider in China • 75 million monthly visitors and 34 million activated mobile app users • Real-time data processing platform • Alluxio in production over a year. Figure labels: 4000 QPS, Price Data, 4T, 500G, Raw message, Sensitive data, Daily data volume, After compression
  18. Hotel Quotation Pricing System. Figure labels (numbered stages): Raw Data Collection; Data Extraction and Cleaning; Data Compression and Conversion; Pricing Info. Other figure labels: Analyst/PM/Operations; Business Products; Direct queries; Price center; Monitor; Real-time / off-line model training
  19. Platform Architecture - Before. Issues • Slow remote HDFS • Repetitive disk reads • Spark executor restarts • Spark GC and OOM
  20. Improving the Architecture with Alluxio
  21. Platform Architecture in a nutshell. Diagram labels: Compute, Storage, Resource Manager, HDFS, HDFS, Ceph
  22. Benefits with Alluxio • Unified namespace: unifies the HDFS clusters and other storage systems. • Data sharing among compute frameworks: Zeppelin, Flink, Spark and MapReduce can share data at memory speed. • Tiered storage: manages the local storage, including memory, SSD and disk, which together constitute a hierarchical storage layer. • Simple API and easy integration: write an app once and work with multiple storage systems. • Spark off-heap storage: reduces GC overhead; when a Spark executor fails or exits, the computed data is not lost due to the "drifting" of the executor.
  23. Benefits with Alluxio • On average 15x faster! • 300x faster at peak time!
  24. Alluxio Inc. • Team consists of Alluxio creators and top committers • Invested by • Committed to Alluxio Open Source • http://www.alluxio.com • We are hiring!
  25. Thank you! • Contact: yupeng@alluxio.com • Twitter: @Alluxio • Websites: www.alluxio.com and www.alluxio.org • Demo: Spark + Alluxio + S3, https://youtu.be/QVtxDpA-jis • Alluxio Unified Namespace, https://youtu.be/lIXpNK4VxqE
  26. Tiered storage separates cold and hot data (MEM, SSD, HDD). Most of the data in a hotspot will only be used for the day's results. We deployed an Alluxio worker on each compute node and managed the local storage media, including memory, SSDs and disks, to form a hierarchical storage tier. Data related to each node's upstream computation is stored locally as much as possible, to avoid consuming network resources. At the same time, Alluxio itself provides LRU, LFU and other efficient replacement strategies to ensure that hot data stays in the faster memory layer, improving the data access rate; even cold data is stored on the local disk, avoiding having to access …
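
The hot/cold separation described on the tiered storage slide can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not Alluxio's implementation or API: the class name `TieredStore`, its methods, and the capacities counted in blocks are all hypothetical. It shows only the general idea of per-tier LRU eviction, where a block that is read moves to the fastest tier and the least recently used block is demoted one tier down.

```python
from collections import OrderedDict


class TieredStore:
    """Toy tiered block store with per-tier LRU eviction (illustrative only)."""

    def __init__(self, tiers):
        # tiers: list of (name, max_blocks) pairs, fastest tier first.
        # Each tier keeps its blocks in an OrderedDict, least recently used first.
        self.tiers = [(name, cap, OrderedDict()) for name, cap in tiers]

    def _insert(self, level, block_id, data):
        if level >= len(self.tiers):
            return  # evicted from the slowest tier: dropped from local storage
        _, cap, blocks = self.tiers[level]
        blocks[block_id] = data  # a new key lands at the most-recently-used end
        while len(blocks) > cap:
            victim_id, victim = blocks.popitem(last=False)  # least recently used
            self._insert(level + 1, victim_id, victim)  # demote one tier down

    def put(self, block_id, data):
        # Remove any older copy so a block lives in at most one tier.
        for _, _, blocks in self.tiers:
            blocks.pop(block_id, None)
        self._insert(0, block_id, data)

    def get(self, block_id):
        # A hit promotes the block to the fastest tier; a miss returns None.
        for _, _, blocks in self.tiers:
            if block_id in blocks:
                data = blocks.pop(block_id)
                self._insert(0, block_id, data)
                return data
        return None

    def tier_of(self, block_id):
        # Name of the tier currently holding the block, or None if absent.
        for name, _, blocks in self.tiers:
            if block_id in blocks:
                return name
        return None
```

With tiers of two blocks each (MEM, SSD, HDD), writing blocks a, b and c in order pushes a down to SSD. Reading a again promotes it back to MEM and demotes b, the least recently used block in memory, to SSD. A block evicted from the last tier is simply dropped in this sketch; in a real deployment it would have to be read again from the underlying remote storage.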
