Using Alluxio as a Fault Tolerant Pluggable Optimization Component to Compute Frameworks of JD System

Strata Data Conference - May 2018

Published in: Technology

  1. 1. Using Alluxio (formerly Tachyon) as a fault-tolerant pluggable optimization component to compute frameworks of JD system. 2018-5-23. Yupeng Fu, Alluxio Inc; Baolong Mao, JD.com; Yiran Wu, JD.com
  2. 2. Contents • An introduction to Alluxio • An introduction to JD.com and BDP's architecture and business • The JD use case of Alluxio • Alluxio on YARN, shuffle service, and storage-computing separation
  3. 3. An open-source, memory-based, distributed file system
  4. 4. Data Ecosystem Develops • One Compute Framework • Single Storage System • Co-located [Diagram label: ETL]
  5. 5. Data Ecosystem Explodes • Many Compute Frameworks • Many Storage Systems • Most not co-located
  6. 6. Data Ecosystem Issues • Each app manages multiple data sources • Data source changes require global updates • Storage optimizations require app changes • Poor performance due to lack of locality
  7. 7. Data Ecosystem Challenges. 1 Speed & Complexity • Integration and interoperability issues (on prem, hybrid, cloud) • Many departments & groups. 2 Data Freshness • Cross-network movement is slow • Copies create lag • Data quality suffers with copies. 3 Cost • Many-to-many integrations are expensive • Data duplication. 4 Security & Governance • Data security & governance is increasingly complex. Heavy integrations create painful organizational drag
  8. 8. Data Ecosystem with Alluxio • Apps only talk to Alluxio • Simple Add/Remove • No App Changes • Highest performance in Memory. [Diagram labels: Java File API, HDFS Interface, Amazon S3 Interface, REST Web Service; HDFS Interface, Amazon S3 Interface, Swift Interface, NFS Interface]
  9. 9. Alluxio Design Principles. 1 Big Data & Machine Learning • Interoperability with leading projects • Large scale data sets • High IO. 2 Data Sharing • Don't own the data • Multiple apps sharing common data • Data stored in multiple, hybrid systems. 3 High Speed Data Access • Remote data • Hot/warm/cold data • Temporary data • Read/write support. 4 Enterprise Class • Distributed architecture • Commodity hardware • Service-oriented • High availability • Security
  10. 10. Alluxio Innovation: Server-side API Translation. Convert from Client-side Interface to Native Storage Interface. Client-side: HDFS Interface / S3 Interface / FUSE. Native storage: HDFS Interface, S3A Interface, Swift Interface, Google Cloud Interface
  11. 11. Alluxio Innovation: Server-side API Translation. Convert between different versions of HDFS: HDFS 2.7 Interface, HDP 2.4 Interface, CDH 5.6 Interface, MAPR 5.2 Interface
  12. 12. Alluxio Innovation: Unified Namespace • Enables effective data management across different Under Stores • Uses Mounting with Transparent Naming
  13. 13. Alluxio Innovation: Unified Namespace. Create a catalog of available data sources for Data Scientists under alluxio://: /finance/customer.transactions/, /finance/vendor.transactions/, /operations/device.logs/, /operations/phone.call.recordings/, /operations/check.images/, /research/us.economic.data/, /research/intl.economic.data/, /marketing/advertising.dataset/, /marketing/marketing.funnel.dataset/
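
The mounting idea behind the unified namespace can be illustrated with a small longest-prefix mount table. This is only a sketch: the `MountTable` class and the under-store URIs (`hdfs://nn1/...`, `s3a://research-bucket/...`) are hypothetical and are not taken from the slides or from Alluxio's own API.

```python
class MountTable:
    """Toy mount table mapping namespace paths to under-store URIs (hypothetical)."""

    def __init__(self):
        self._mounts = {}

    def mount(self, alluxio_path, ufs_uri):
        # Normalize trailing slashes so prefix matching is unambiguous.
        self._mounts[alluxio_path.rstrip("/") or "/"] = ufs_uri.rstrip("/")

    def resolve(self, alluxio_path):
        # Pick the longest mount point that is a path prefix of the request.
        best = None
        for mount_point in self._mounts:
            prefix = mount_point if mount_point == "/" else mount_point + "/"
            if alluxio_path == mount_point or alluxio_path.startswith(prefix):
                if best is None or len(mount_point) > len(best):
                    best = mount_point
        if best is None:
            raise KeyError("no mount point covers " + alluxio_path)
        relative = alluxio_path[len(best):].lstrip("/")
        base = self._mounts[best]
        return base + "/" + relative if relative else base


table = MountTable()
# Hypothetical under stores behind two of the catalog prefixes.
table.mount("/finance", "hdfs://nn1/warehouse/finance")
table.mount("/research", "s3a://research-bucket/data")
print(table.resolve("/finance/customer.transactions/part-0"))
```

Applications keep using the same namespace path even if the store behind `/finance` is later remounted elsewhere, which is the "transparent naming" property the slide refers to.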
  14. 14. Alluxio Innovation: Intelligent Cache. Local performance from remote data using native multi-tier storage. Tiers: RAM (Hot), SSD (Warm), HDD (Cold)
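
A rough way to picture hot/warm/cold tiering is a chain of least-recently-used stores in which a block evicted from one tier is demoted to the next, and a block that is read is promoted back to the top. The `TieredCache` class below is a hypothetical toy with capacities counted in blocks; it is not Alluxio's actual eviction or promotion policy.

```python
from collections import OrderedDict


class TieredCache:
    """Toy multi-tier block store: tier 0 is fastest (RAM), then SSD, then HDD."""

    def __init__(self, capacities):
        # capacities: maximum number of blocks per tier, fastest tier first.
        self.capacities = list(capacities)
        self.tiers = [OrderedDict() for _ in self.capacities]

    def _insert(self, block_id, data, tier):
        if tier >= len(self.tiers):
            return  # fell off the slowest tier: the block is evicted entirely
        store = self.tiers[tier]
        store[block_id] = data
        if len(store) > self.capacities[tier]:
            victim_id, victim = store.popitem(last=False)  # least recently used
            self._insert(victim_id, victim, tier + 1)      # demote one tier down

    def put(self, block_id, data):
        for store in self.tiers:
            store.pop(block_id, None)  # drop any stale copy first
        self._insert(block_id, data, 0)

    def get(self, block_id):
        for store in self.tiers:
            if block_id in store:
                data = store.pop(block_id)
                self._insert(block_id, data, 0)  # promote the hot block
                return data
        return None  # miss: a caller would read from the under store instead

    def tier_of(self, block_id):
        for index, store in enumerate(self.tiers):
            if block_id in store:
                return index
        return None


cache = TieredCache([1, 1, 1])  # one block each in "RAM", "SSD", "HDD"
for name in ("a", "b", "c"):
    cache.put(name, name.upper())
print(cache.tier_of("a"), cache.tier_of("b"), cache.tier_of("c"))
```

With one slot per tier, the oldest block ends up in the coldest tier, and reading it moves it back to the top while the others shift down.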
  15. 15. Where to use Alluxio: Finding high-fit Alluxio use-cases. Compute Zone: standalone or managed with Mesos or Yarn. Storage in Different Availability Zone: either on-prem or cloud. Alluxio is installed with or near compute to unify data stores, stage remote data, and improve system performance. [Diagram labels: Spark, Tensorflow, Presto, HDFS] Guidelines • Cloud deployment • Compute separated from storage • I/O or network latency exists • Unification of many storage systems • Applications sharing long lived data. More checks result in higher fit applications
  16. 16. 100+ known production deployments AND MORE!
  17. 17. [Section divider; slide text not recoverable]
  18. 18. JD Introduction • China's largest retailer, online or offline • First Chinese internet company to make the Fortune Global 500 list • Strict "zero-tolerance" policy toward counterfeit goods. Customers trust JD because the brand is a guarantee of authenticity. [Chart: Rapid Growth in GMV in Last Six Years* (USD); labels: 144.5 billion, 93.3 billion, 13.4 billion, 23.5 billion, 46.8 billion] Sustained, Rapid Growth
  19. 19. JD Introduction • Able to reach more than 1 billion Chinese consumers by leveraging the strategic cooperation with Tencent • Helps brands leverage the most comprehensive social + commerce target marketing program based on big data • About 80% of orders placed through mobile
  20. 20. JD BDP Platform [Architecture diagram; labels not recoverable]
  21. 21. JD BDP
  22. 22. Pluggable, Fault-tolerant, Locality. When we used Alluxio for JDPresto, we made some changes and added some useful features. Pluggable: Alluxio can be brought online or updated at any time, and the business only notices queries running a little slower. Fault-tolerant: when Alluxio cannot be accessed, JDPresto can access HDFS directly. Locality: reduces remote reads. • Alluxio led to a 10x performance improvement
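
The fallback behaviour described on this slide (read through Alluxio when it is reachable, go straight to HDFS when it is not) can be sketched as a thin wrapper. The function names and the two reader callables here are hypothetical stand-ins; the slides do not show how JDPresto implements this.

```python
def read_with_fallback(path, read_from_alluxio, read_from_hdfs):
    """Try the cache layer first; fall back to the under store if it is unreachable.

    Both readers are hypothetical callables taking a path and returning bytes.
    Returns (data, source) so callers can see which path served the read.
    """
    try:
        return read_from_alluxio(path), "alluxio"
    except (ConnectionError, TimeoutError):
        # Cache layer is down or being upgraded: slower, but the query still succeeds.
        return read_from_hdfs(path), "hdfs"


# Stand-in readers for illustration only.
def healthy_alluxio(path):
    return b"cached:" + path.encode()


def unavailable_alluxio(path):
    raise ConnectionError("alluxio master unreachable")


def hdfs(path):
    return b"hdfs:" + path.encode()


print(read_with_fallback("/ads/day=1/part-0", healthy_alluxio, hdfs))
print(read_with_fallback("/ads/day=1/part-0", unavailable_alluxio, hdfs))
```

Because the wrapper never requires the cache to be present, the cache layer can be added, removed, or upgraded without changing the query path, which is what makes it pluggable.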
  23. 23. JDPresto on Alluxio: load once, use every time. [Diagram: Before / After]
  24. 24. JDPresto on Alluxio
  25. 25. JDPresto on Alluxio
  26. 26. JDPresto on Alluxio
  27. 27. JDPresto on Alluxio: Speed Contrast. Hadoop cluster: X. DataSource: Ad cluster, 1 day. SQL, Worker. JDPresto on Alluxio: 22; JDPresto: 40
  28. 28. JDPresto on Alluxio
  29. 29. [Section divider; slide text not recoverable]
  30. 30. Alluxio on YARN [Diagram labels: Resource Manager, NodeManager, Alluxio AppMaster, Alluxio Master, Alluxio Worker, Client, Spark, Presto]
  31. 31. Shuffle Service on Alluxio [Slide text not recoverable]
  32. 32. Shuffle Service on Alluxio [Diagram labels: Map, Alluxio Node, Alluxio Cluster, Reduce]
  33. 33. Shuffle Service on Alluxio. spark-defaults.conf: spark.shuffle.service.enabled=true; spark.shuffle.store.type=DistributedCache; spark.shuffle.store.distributed.cache.url=alluxio://bdp.jd.com • Implement a DistributedCache implementation for shuffle • Re-implement org.apache.spark.shuffle.sort.SortShuffleWriter • Re-implement org.apache.spark.shuffle.sort.HashShuffleReader
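
The idea of routing shuffle data through a shared store rather than local disks can be sketched as follows: map tasks partition their output by reducer and write each partition under a (shuffle, map, reduce) key, and each reduce task collects its partition from every map output. The `InMemoryDistributedCache` class and both functions are hypothetical stand-ins; they do not reproduce JD's re-implemented writer and reader, and a plain dict takes the place of Alluxio.

```python
from collections import defaultdict


class InMemoryDistributedCache:
    """Dict-backed stand-in for a shared store such as Alluxio (hypothetical interface)."""

    def __init__(self):
        self._blocks = {}

    def put(self, key, records):
        self._blocks[key] = list(records)

    def get(self, key):
        return self._blocks.get(key, [])


def write_map_output(cache, shuffle_id, map_id, records, num_reducers):
    """Map side: partition (key, value) records by reducer and store each partition."""
    partitions = defaultdict(list)
    for key, value in records:
        partitions[hash(key) % num_reducers].append((key, value))
    for reduce_id in range(num_reducers):
        cache.put((shuffle_id, map_id, reduce_id), partitions[reduce_id])


def read_reduce_input(cache, shuffle_id, num_maps, reduce_id):
    """Reduce side: gather this reducer's partition from every map output."""
    records = []
    for map_id in range(num_maps):
        records.extend(cache.get((shuffle_id, map_id, reduce_id)))
    return records


cache = InMemoryDistributedCache()
write_map_output(cache, shuffle_id=0, map_id=0, records=[(1, "a"), (2, "b")], num_reducers=2)
write_map_output(cache, shuffle_id=0, map_id=1, records=[(3, "c"), (4, "d")], num_reducers=2)
print(read_reduce_input(cache, shuffle_id=0, num_maps=2, reduce_id=0))
```

Since reducers read from the shared store, map output does not have to stay on the node that produced it, which fits the storage-computing separation theme of the talk.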
  34. 34. Shuffle Service on Alluxio
  35. 35. Shuffle Service on Alluxio [Two charts: CPU Usage (Percent) over Time]
  36. 36. Shuffle Service on Alluxio: Using Alluxio FUSE; Using Alluxio API
  37. 37. Separate computing and storage [Diagram labels: Cluster1, Cluster2, Resource Manager, NodeManager, NameNode, DataNode, Alluxio]
  38. 38. JD Contribution to Alluxio [Statistics slide; labels lost in extraction. Surviving figures as extracted: 76, 8 / 78, 18210, 18210, 755, 00, 76, +141, 76]
  39. 39. JD Contribution to Alluxio [Slide text not recoverable]
  40. 40. yupeng@alluxio.co, maobaolong@jd.com, wuyiran@jd.com. Website: www.alluxio.org. E-mail: info@alluxio.com
