RAPIDS ACCELERATOR FOR SPEEDING UP SPARK
DISTRIBUTED SCALE-OUT SPARK APPLICATIONS
APACHE SPARK CORE
●Spark SQL API
●DataFrame API
●Spark Shuffle
RAPIDS Accelerator for Spark
Spark SQL API / DataFrame API path:
if gpu_enabled(operation, data_type)
    call-out to RAPIDS
else
    execute standard Spark operation
●JNI bindings: Mapping From Java/Scala to C++
●RAPIDS libcudf (C++ Libraries)
●CUDA
Spark Shuffle path:
●Custom Implementation of Spark Shuffle
●Optimized to use RDMA and GPU-to-GPU direct communication
●JNI bindings: Mapping From Java/Scala to C++
●UCX Libraries
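The per-operation dispatch shown in the diagram can be sketched in plain Python. This is only an illustration of the fallback idea, not the plugin's actual code: the support table and the helper names `plan_operation` and `plan_query` are hypothetical, and only `gpu_enabled(operation, data_type)` comes from the slide.

```python
from typing import List, Set, Tuple

# Hypothetical support table: (operation, data_type) pairs that are
# assumed, for this sketch only, to have a GPU implementation.
GPU_SUPPORTED: Set[Tuple[str, str]] = {
    ("filter", "int"),
    ("filter", "double"),
    ("sum", "int"),
    ("sum", "double"),
}


def gpu_enabled(operation: str, data_type: str) -> bool:
    """Return True when this operation/type pair can run on the GPU."""
    return (operation, data_type) in GPU_SUPPORTED


def plan_operation(operation: str, data_type: str) -> str:
    """Choose the execution target for one operation, as on the slide."""
    if gpu_enabled(operation, data_type):
        return "gpu"  # call-out to RAPIDS
    return "cpu"      # execute standard Spark operation


def plan_query(ops: List[Tuple[str, str]]) -> List[str]:
    """Decide per operation, so one query can mix GPU and CPU steps."""
    return [plan_operation(op, dt) for op, dt in ops]


if __name__ == "__main__":
    query = [("filter", "int"), ("sum", "double"), ("regexp_replace", "string")]
    for (op, dt), target in zip(query, plan_query(query)):
        print(f"{op}({dt}) -> {target}")
```

Because the decision is made per operation, an unsupported step falls back to the standard Spark implementation without preventing the rest of the query from using the GPU.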
RAPIDS: open-source software for GPU-accelerating data analytics and machine learning
https://developer.nvidia.com/rapids
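As a rough sketch of how an existing Spark job might opt in to the accelerator, the helper below assembles `spark-submit` settings. The deck itself does not list any configuration keys: the ones used here are taken from the RAPIDS Accelerator documentation as I understand it and should be checked against the release in use. The shuffle manager class name depends on the Spark version, so it is left as a caller-supplied value rather than guessed. The function names `rapids_conf` and `to_submit_args` are hypothetical.

```python
from typing import Dict, List, Optional


def rapids_conf(shuffle_manager: Optional[str] = None,
                explain: str = "NOT_ON_GPU") -> Dict[str, str]:
    """Build Spark settings that enable the RAPIDS Accelerator plugin.

    shuffle_manager: fully qualified class name of the accelerated shuffle
    manager. It is version-specific, so no default is assumed here.
    """
    conf = {
        # Load the accelerator as a Spark plugin.
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        # Allow SQL/DataFrame operations to be placed on the GPU.
        "spark.rapids.sql.enabled": "true",
        # Log the operations that could not be placed on the GPU.
        "spark.rapids.sql.explain": explain,
    }
    if shuffle_manager is not None:
        # Replace the default shuffle with the accelerated implementation.
        conf["spark.shuffle.manager"] = shuffle_manager
    return conf


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render the settings as spark-submit command-line arguments."""
    args: List[str] = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(["spark-submit"] + to_submit_args(rapids_conf())))
```

The application code stays unchanged in this sketch, which matches the diagram: Spark SQL and DataFrame calls are intercepted below the API layer.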
9.
GTC21: Deep-Learning Data-Pipeline Optimization for Network Data Analysis in
SK Telecom by Employing Spark Rapids for Custom Data Source
https://www.nvidia.com/en-us/on-demand/session/gtcspring21-s31400/
Use case
10.
GTC21: Deep-Learning Data-Pipeline Optimization for Network Data Analysis in
SK Telecom by Employing Spark Rapids for Custom Data Source
https://www.nvidia.com/en-us/on-demand/session/gtcspring21-s31400/
Use case