Using Alluxio as a Fault-tolerant Pluggable Optimization Component of JD.com's Computation Frameworks
1. Using Alluxio as a fault-tolerant pluggable optimization component of JD.com's computation frameworks
2018-09-13
Bing Bai, JD.com
Tao Huang, JD.com
2. Contents
01 JD and BDP: introduction to JD.com and BDP’s architecture and business
02 JDPresto on Alluxio: the JD use case of Alluxio
03 Ongoing Exploration: Alluxio on YARN, shuffle service, and storage-computing separation
4. JD Introduction
• China’s largest retailer, online or offline
• First Chinese internet company to make the Fortune Global 500 list
• Strict “zero-tolerance” policy toward counterfeit goods. Customers trust JD because the brand is a guarantee of authenticity
Rapid Growth in GMV in Last Six Years* (Sustained, Rapid Growth)
2012: 13.4 billion
2013: 23.5 billion
2014: 46.8 billion
2015: 93.3 billion
2016: 144.5 billion
2017: 199.1 billion
5. JD BDP Platform
• Cluster scale: 30k+ nodes (offline cluster 18k+), 6,000+ users
• Computing ability: 40 PB+ of offline data daily, 1 million+ jobs daily
• Storage capacity: 450 PB+, daily increase of 500 TB+
• Business capability: 40+ businesses, 450+ data models
8. JDPresto on Alluxio
When we adopted Alluxio for JDPresto, we made some changes that bring some good features.
JDPresto on Alluxio advantages:
• Pluggable: Alluxio can be brought online or updated at any time; the business only notices that things are slightly slower.
• Fault-tolerant: when Alluxio cannot be accessed, JDPresto accesses HDFS directly.
• Locality: reduces remote reads.
Results:
• Alluxio led to a 10x performance improvement
• 100+ nodes
• More than 1 year.
17. Shuffle Service on Alluxio
• Disk I/O performance bottleneck
• Not enough space on the local disk
• When an executor fails, shuffle data does not need to be recalculated
• A uniform data TTL ensures that temporary files are deleted.
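The uniform-TTL idea on this slide can be sketched as follows. This is an illustrative in-memory model only, not Alluxio's implementation or the shuffle service's code: `TtlStore`, `write`, `sweep`, and `paths` are hypothetical names, and the injectable `clock` parameter is an assumption added to make the behaviour easy to test.

```python
import time
from typing import Callable, Dict, List, Tuple


class TtlStore:
    """Toy store where every file shares one TTL and expired files are swept away."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._files: Dict[str, Tuple[float, bytes]] = {}  # path -> (created_at, data)

    def write(self, path: str, data: bytes) -> None:
        # Record the creation time so the sweeper can judge expiry later.
        self._files[path] = (self._clock(), data)

    def sweep(self) -> List[str]:
        """Delete every file whose age has reached the TTL; return the deleted paths."""
        now = self._clock()
        expired = [p for p, (created, _) in self._files.items() if now - created >= self._ttl]
        for p in expired:
            del self._files[p]
        return sorted(expired)

    def paths(self) -> List[str]:
        return sorted(self._files)
```

With a single TTL applied to all temporary shuffle files, cleanup does not depend on the job that wrote them exiting cleanly: a periodic sweep removes anything older than the TTL.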