Slides for a reading-group presentation.
S-Caffe: Co-designing MPI Runtimes and Caffe for Scalable Deep Learning on Modern GPU Clusters
Ammar Ahmad Awan, Khaled Hamidouche, Jahanzeb Maqbool Hashmi, Dhabaleswar K. Panda
PPoPP '17: Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 193-205
2. Title
S-Caffe: Co-designing MPI Runtimes and Caffe
for Scalable Deep Learning on Modern GPU Clusters
Authors:
Ammar Ahmad Awan, Khaled Hamidouche, Jahanzeb Maqbool Hashmi, Dhabaleswar K. Panda
Dept. of Computer Science and Engineering, The Ohio State University
Published in:
PPoPP '17: Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Pages 193-205
Austin, Texas, USA — February 04 - 08, 2017
16. DL-Aware MPI Reduce
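Number of chunks: n (each chunk has size b/n)  Number of processes: P  Buffer size: b
Let t(b) be the time taken by one step on a buffer of size b. Then:
T(Bin) = log2(P) * t(b)
T(CC) = (n + P - 2) * t(b/n)
For small P and large b, T(CC) << T(Bin)
For large P and small b, T(CC) >> T(Bin)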
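
The trade-off between the two formulas can be checked numerically. The sketch below is a minimal model and is not taken from the slides. It assumes that t(b) follows a simple latency-plus-bandwidth cost, t(b) = alpha + beta * b, with made-up values for alpha and beta. It also reads Bin as a binomial-tree reduce and CC as a chunked, pipelined chain reduce. The function names and the chunk count n = 64 are illustrative choices.

```python
import math

# Assumed cost of one step on a b-byte buffer: fixed latency plus a
# per-byte transfer cost. ALPHA and BETA are illustrative, not measured.
ALPHA = 1e-5   # seconds of latency per step (assumption)
BETA = 1e-9    # seconds per byte, roughly 1 GB/s (assumption)


def t(b):
    """Time for one step on a buffer of b bytes under the assumed model."""
    return ALPHA + BETA * b


def t_bin(P, b):
    """T(Bin) = log2(P) * t(b): every step moves the full buffer."""
    return math.ceil(math.log2(P)) * t(b)


def t_cc(P, b, n):
    """T(CC) = (n + P - 2) * t(b/n): pipelined steps of one chunk each."""
    return (n + P - 2) * t(b / n)


if __name__ == "__main__":
    n = 64  # number of chunks (assumption)
    cases = [
        ("small P, large b", 16, 256 * 10**6),
        ("large P, small b", 1024, 1024),
    ]
    for label, P, b in cases:
        print(f"{label}: T(Bin)={t_bin(P, b):.6f}s  T(CC)={t_cc(P, b, n):.6f}s")
```

Under this model the chain wins when the per-byte term dominates, because its steps overlap and each moves only b/n bytes. The binomial tree wins when latency dominates, because it needs only log2(P) steps instead of n + P - 2.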