Midpoint-Based Parallel Sparse Matrix-Matrix Multiplication Algorithm

V. Weber¹, A. Pozdneev², T. Laino¹
¹ IBM Zurich Research Laboratory, Rüschlikon, Switzerland
² IBM Science and Technology Center, Moscow, Russia
March 3, 2016, GraphHPC-2016, Moscow
© 2016 IBM Corporation

Matrix graph duality
Graph algorithms in the language of linear algebra
∙ Adjacency matrix A is dual with the corresponding graph
∙ Vector matrix multiply is dual with breadth-first search
Image credit: [Kepner2011]
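The duality between vector-matrix multiplication and breadth-first search can be illustrated with a minimal sketch. It assumes a dense NumPy adjacency matrix with A[i, j] = 1 for an edge from i to j; the function name `bfs_levels` and the toy graph are illustrative and not taken from the talk.

```python
import numpy as np

def bfs_levels(A, source):
    """Breadth-first search where each step is one vector-matrix product."""
    n = A.shape[0]
    levels = np.full(n, -1)              # -1 marks vertices not reached yet
    frontier = np.zeros(n, dtype=bool)
    frontier[source] = True
    level = 0
    while frontier.any():
        levels[frontier] = level
        # reached[j] is True when some frontier vertex i has an edge i -> j
        reached = (frontier.astype(int) @ A) > 0
        frontier = reached & (levels < 0)  # keep only unvisited vertices
        level += 1
    return levels

# Toy directed graph: 0->1, 0->2, 1->3, 2->3, 3->4; vertex 5 is isolated
A = np.zeros((6, 6), dtype=int)
for i, j in [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]:
    A[i, j] = 1
print(bfs_levels(A, 0))
```

Each pass of the loop expands the frontier by one hop, so the number of vector-matrix products equals the depth of the search.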

Sparse matrix-matrix multiplication motivation
Electronic structure theory
∙ Density matrix $D$ at zero electronic temperature:
$D = \theta[\mu I - F]$
$\theta(x)$: Heaviside step function
$\mu$: chemical potential
$I$: identity matrix
$F$: Fockian
∙ Second-order spectral projection (SP2) method:
$D = \lim_{i \to \infty} f_i[f_{i-1}[\ldots f_0[X_0]]]$
∙ Recursive polynomial:
$f_i[X_i] = \begin{cases} X_i^2, & \mathrm{Tr}[X_i] > N_s \\ 2X_i - X_i^2, & \text{otherwise} \end{cases}$
$N_s$: number of occupied states
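The SP2 recursion can be sketched with dense NumPy matrices to show why matrix-matrix multiplication dominates the cost: every iteration needs one matrix square. The starting matrix X0 is not given on the slide; the sketch assumes the common choice of rescaling the spectrum of F into [0, 1] with the lowest eigenvalue mapped to 1, and it obtains the spectral bounds from an exact eigenvalue solve purely for convenience. The function name, tolerance, and test matrix are illustrative.

```python
import numpy as np

def sp2_density(F, n_occ, max_iter=100, tol=1e-10):
    """Density matrix by second-order spectral projection (dense sketch)."""
    n = F.shape[0]
    # Assumption: exact spectral bounds; a production code would estimate them
    eigs = np.linalg.eigvalsh(F)
    e_min, e_max = eigs[0], eigs[-1]
    # Assumption: X0 maps the spectrum of F into [0, 1], lowest state to 1
    X = (e_max * np.eye(n) - F) / (e_max - e_min)
    for _ in range(max_iter):
        X2 = X @ X                        # the one matrix product per step
        if np.linalg.norm(X2 - X) < tol:  # X is idempotent: converged
            break
        if np.trace(X) > n_occ:
            X = X2                        # lowers the trace
        else:
            X = 2 * X - X2                # raises the trace
    return X

# Small symmetric test matrix with a clear gap between states 2 and 3
F = np.diag([-2.0, -1.0, 1.0, 2.0])
F[0, 1] = F[1, 0] = 0.1
F[1, 2] = F[2, 1] = 0.05
F[2, 3] = F[3, 2] = 0.1
D = sp2_density(F, n_occ=2)
```

The result can be compared against the projector built from the two lowest eigenvectors of F, which is what $\theta[\mu I - F]$ yields when $\mu$ lies in the gap.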

Scalability issues of Cannon's and SUMMA algorithms
∙ Weak scaling experiment
∙ ≈ 200 atoms per MPI task
∙ From 32 928 atoms on 144 cores
∙ To 1 022 208 atoms on 5184 cores
∙ Image legend:
Total CPU time per atom
Block matrix multiplications
Book-keeping overhead
Communication between processes
Dashed line: fit to the data using $f(N) = a + b\sqrt{N}$
Image credit: [VandeVondele2012]
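A fit of the form $f(N) = a + b\sqrt{N}$ is linear in the coefficients, so it reduces to ordinary least squares. The sketch below shows this with synthetic values generated from known coefficients; the numbers are made up for illustration and are not the measurements from [VandeVondele2012].

```python
import numpy as np

def fit_sqrt_model(n, t):
    """Least-squares fit of t ≈ a + b * sqrt(n); returns (a, b)."""
    n = np.asarray(n, dtype=float)
    t = np.asarray(t, dtype=float)
    # Design matrix with columns [1, sqrt(n)]
    design = np.column_stack([np.ones_like(n), np.sqrt(n)])
    (a, b), *_ = np.linalg.lstsq(design, t, rcond=None)
    return a, b

# Synthetic data (assumption): generated from a = 1.0, b = 0.01
n_values = np.array([100, 400, 900, 1600, 2500])
t_values = 1.0 + 0.01 * np.sqrt(n_values)
a, b = fit_sqrt_model(n_values, t_values)
```

A nonzero b in such a fit indicates a per-atom cost that keeps growing with N, which is the scalability issue the slide points to.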

Midpoint method
∙ Particle interactions in molecular
dynamics [Bowers2006]
∙ The computational domain is
divided between processors
∙ The interaction between a pair of particles is computed by the
processor whose subdomain contains the pair's midpoint
Image credit: [Bowers2006]
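The assignment rule of the midpoint method can be sketched as follows, assuming a non-periodic rectangular box split into equal cells on a Cartesian processor grid. The helper names `owner` and `pair_owner` are hypothetical, and periodic boundaries, which a real molecular dynamics code would need, are left out.

```python
import numpy as np

def owner(point, box, grid):
    """Grid coordinates of the processor whose subdomain contains `point`."""
    grid = np.asarray(grid)
    cell = np.asarray(box, dtype=float) / grid
    idx = np.floor(np.asarray(point, dtype=float) / cell).astype(int)
    idx = np.minimum(idx, grid - 1)   # points on the upper face stay inside
    return tuple(int(c) for c in idx)

def pair_owner(x_i, x_j, box, grid):
    """Processor that computes the i-j interaction: the owner of the midpoint."""
    mid = (np.asarray(x_i, dtype=float) + np.asarray(x_j, dtype=float)) / 2
    return owner(mid, box, grid)

box, grid = (4.0, 4.0, 4.0), (2, 2, 2)
x_a = (1.5, 3.0, 1.0)
x_b = (2.5, 1.0, 1.0)
print(owner(x_a, box, grid), owner(x_b, box, grid), pair_owner(x_a, x_b, box, grid))
```

In this example the pair is handled by a processor that holds neither particle, which is a characteristic feature of the method.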

Midpoint distribution of a matrix

Midpoint distribution of a matrix (contd.)

Sparse matrix-matrix multiplication
∙ Exploit the locality
∙ 3D Cartesian topology
∙ Each matrix element lies at the process that owns a midpoint:
midpoint$(i, k)$ ⇒ element $A_{ik}$
midpoint$(k, j)$ ⇒ element $B_{kj}$
midpoint$(i, j)$ ⇒ element $C_{ij}$
∙ Send $A_{ik}$ and $B_{kj}$
∙ $C_{ij} \leftarrow A_{ik} \cdot B_{kj} + C_{ij}$
[Figure: atoms $i$, $j$, $k$ and elements $A_{ik}$, $B_{kj}$, $C_{ij}$ on process-grid cells (-1,0), (0,0), (0,-1)]
Image credit: [Weber2015]
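The ownership rule and the update $C_{ij} \leftarrow A_{ik} \cdot B_{kj} + C_{ij}$ can be simulated serially. The sketch below assumes, for illustration only, that each product is accumulated on the process that owns $C_{ij}$; the slides do not spell out where the multiplication runs. It stores matrices as dictionaries keyed by (row, column), uses unit cubic cells without periodic boundaries, and records which elements would have to be sent between processes. All names are hypothetical.

```python
from collections import defaultdict
import numpy as np

def midpoint_owner(pos, a, b, cell_size):
    """Cell (process coordinates) containing the midpoint of atoms a and b."""
    mid = (pos[a] + pos[b]) / 2
    return tuple(int(c) for c in np.floor(mid / cell_size))

def midpoint_spgemm(A, B, pos, cell_size):
    """Serial simulation of C = A B; returns C and the set of element transfers."""
    B_rows = defaultdict(list)
    for (k, j), value in B.items():
        B_rows[k].append((j, value))
    C = defaultdict(float)
    transfers = set()                 # (matrix name, element, source, destination)
    for (i, k), a_ik in A.items():
        for j, b_kj in B_rows[k]:
            dst = midpoint_owner(pos, i, j, cell_size)   # owner of C_ij
            src_a = midpoint_owner(pos, i, k, cell_size) # owner of A_ik
            src_b = midpoint_owner(pos, k, j, cell_size) # owner of B_kj
            if src_a != dst:
                transfers.add(("A", (i, k), src_a, dst))
            if src_b != dst:
                transfers.add(("B", (k, j), src_b, dst))
            C[(i, j)] += a_ik * b_kj
    return dict(C), transfers

pos = np.array([[0.2, 0.5, 0.5], [0.9, 0.5, 0.5], [1.6, 0.5, 0.5]])
A = {(0, 1): 2.0, (1, 1): 1.0}
B = {(1, 2): 3.0, (1, 0): 0.5}
C, transfers = midpoint_spgemm(A, B, pos, cell_size=1.0)
```

Because the elements of matrices with decay connect nearby atoms, the source and destination cells recorded in `transfers` are close to each other on the grid.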

Why does it work?
As a reminder...
Midpoint distribution of a matrix
$A = A^{(1,1)} + \cdots + A^{(p_x,p_y)}$
$AB = (A^{(1,1)} + \cdots + A^{(p_x,p_y)}) \times (B^{(1,1)} + \cdots + B^{(p_x,p_y)})$
$A^{(r,s)} B^{(t,u)}$ can be nonzero only if $|r - t| \le 2$, $|s - u| \le 2$
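The locality of the block products can be checked numerically. The sketch below assumes a two-dimensional, non-periodic process grid with unit cells and random matrices whose elements vanish beyond a cutoff of two cell widths; that cutoff is chosen here so that the bound of 2 on the grid offset holds, and it is an assumption of this example rather than a parameter quoted from [Weber2015].

```python
import numpy as np

def decaying_matrix(pos, cutoff, rng):
    """Random matrix with element (i, k) zero when atoms i, k are beyond `cutoff`."""
    dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    M = rng.uniform(0.5, 1.0, size=dist.shape)  # positive: no accidental cancellation
    M[dist > cutoff] = 0.0
    return M

def midpoint_blocks(M, pos, cell, p):
    """blocks[r, s] keeps the elements of M whose midpoint lies in grid cell (r, s)."""
    n = M.shape[0]
    blocks = np.zeros((p, p, n, n))
    for i, k in zip(*np.nonzero(M)):
        r, s = np.floor((pos[i] + pos[k]) / (2 * cell)).astype(int)
        blocks[r, s, i, k] = M[i, k]
    return blocks

def max_coupled_offset(A_blocks, B_blocks):
    """Largest max(|r-t|, |s-u|) over block pairs whose product is nonzero."""
    p = A_blocks.shape[0]
    worst = 0
    for r in range(p):
        for s in range(p):
            for t in range(p):
                for u in range(p):
                    if np.any(A_blocks[r, s] @ B_blocks[t, u]):
                        worst = max(worst, abs(r - t), abs(s - u))
    return worst

rng = np.random.default_rng(0)
p, cell = 5, 1.0
pos = rng.uniform(0.0, p * cell, size=(40, 2))
A = decaying_matrix(pos, 2 * cell, rng)
B = decaying_matrix(pos, 2 * cell, rng)
A_blocks = midpoint_blocks(A, pos, cell, p)
B_blocks = midpoint_blocks(B, pos, cell, p)
print(max_coupled_offset(A_blocks, B_blocks))
```

The argument behind the check: the midpoints of (i, k) and (k, j) each lie within half a cutoff of atom k, so they are at most one cutoff apart, which limits how far apart their owning cells can be.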

Weak scaling experiments
∙ Two occupations of the
density matrix (O1, O2)
∙ ≈ 19 water molecules
per MPI task
∙ 216 to 59 319 MPI tasks
∙ MPSM3 vs. SUMMA
crossover point:
O1: ≈ 19 683 MPI
tasks (373 248 water
molecules)
O2: ≈ 9 261 MPI
tasks (175 616 water
molecules)
Image credit: [Weber2015]

Strong scaling experiments
∙ Water molecules:
S1: 110 592
S2: 373 248
S3: 1 124 864
∙ MPI tasks:
S1: 216 to 59 319
S2: 1 728 to 110 592
S3: 9 261 to 185 193
∙ MPSM3 vs. SUMMA crossover point: slightly smaller than S2
Image credit: [Weber2015]

Summary
∙ Linear algebra is the language of graph algorithms
∙ Sparse matrix-matrix multiplication is a critical building block
∙ General-purpose SMMM algorithms do not scale well
∙ We distribute the sparse matrix according to the midpoint principle
∙ Linear scalability with the number of “edges” given proportional resources
∙ Advantages over SUMMA:
reduced communication volume
more effective load balancing
lower communication latency (nearby processes only)
∙ The scalability comes from the reduced volume of interprocess
communications

References
[Weber2015] V. Weber, T. Laino, A. Pozdneev, I. Fedulova, A. Curioni
Semiempirical Molecular Dynamics (SEMD) I: Midpoint-Based Parallel
Sparse Matrix–Matrix Multiplication Algorithm for Matrices with Decay
J. Chem. Theory Comput., 2015, 11 (7), pp 3145–3152
[Bowers2006] K.J. Bowers, R.O. Dror, D.E. Shaw
The midpoint method for parallelization of particle simulations
J. Chem. Phys., 124, 184109 (2006)
[VandeVondele2012] J. VandeVondele, U. Borštnik, J. Hutter
Linear Scaling Self-Consistent Field Calculations with Millions of Atoms
in the Condensed Phase
J. Chem. Theory Comput., 2012, 8 (10), pp 3565–3573
[Kepner2011] J. Kepner, J. Gilbert (eds.)
Graph Algorithms in the Language of Linear Algebra
SIAM, Philadelphia, 2011
