1/20
University of Tokyo
Huge-Scale Molecular Dynamics Simulation of Multi-bubble Nuclei
H. Watanabe (ISSP, The University of Tokyo)
M. Suzuki (Kyushu University)
H. Inaoka (RIKEN AICS)
N. Ito (The University of Tokyo, RIKEN AICS)
May 15th, 2014 @ NCSA Blue Waters Symposium
Outline
1. Introduction
2. Benchmark results
3. Cavitation
4. Future of HPC
5. Summary and Discussion
2/20
Introduction (1/2)
Gas-Liquid Multi-Phase Flow
Multi-Scale and Multi-Physics Problem
[Figure: Hierarchical Modeling vs. Direct Calculation. Labels: Phase Transition, Bubble Interaction, Flow, Particle Interaction]
Numerical Difficulties
Moving, Annihilation and Creation of Boundary
Hierarchical Modeling
Divide a problem into sub-problems
Governing Equations for each scale
Artificial Hierarchy
Validity of Governing Equation
Direct Simulation
[Movie]
3/20
Introduction (2/2)
Classical Nucleation Theory
Classical nucleation theory (CNT) predicts the nucleation rate of clusters.
It works well for droplet nuclei but poorly for bubble nuclei.
Interaction between clusters
Droplets interact via gas (diffusive)
Bubbles interact via liquid (ballistic)
[Figure: Droplets and Bubbles. Labels: Additional work to create a bubble; Same as Droplet; Work by Bubble]
It is difficult to investigate microscopic behavior directly.
We use huge-scale molecular dynamics simulation to observe the interaction between bubbles.
4/20
Simulation Method
Molecular Dynamics (MD) method with Short-range Interaction
Parallelization (1/3)
General Approach
MPI: Domain Decomposition
OpenMP: Loop Decomposition
DO I=1,N              ! loop decomposition here (outer loop)
  DO J in Pair(I)     ! or here (inner loop)
    CalcForce(I,J)
  ENDDO
ENDDO
Problems
Conflicts on write-back of momenta
→ Prepare a temporary buffer to avoid conflicts
→ Do not use Newton’s third law (calculate every force twice)
We have to thread-parallelize other parts as well, such as pair-list construction.
We have to handle SIMDization and thread parallelization simultaneously.
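The write-back conflict and the "calculate every force twice" remedy can be illustrated with a small sketch. It is not taken from the authors' code: the 1D positions, the function names, and the brute-force pair search are assumptions made for brevity. With Newton's third law, iteration `i` also writes to entry `j`, so the outer loop cannot be split across threads safely; without it, each iteration writes only to its own entry.

```python
def lj_force(dx):
    """Lennard-Jones force on particle i from particle j, dx = x_i - x_j (LJ units, 1D)."""
    r2 = dx * dx
    r6 = r2 * r2 * r2
    return (48.0 / (r6 * r6) - 24.0 / r6) / r2 * dx


def forces_half_list(x, cutoff):
    """Uses Newton's third law: each pair is visited once and updates BOTH particles."""
    n = len(x)
    f = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            if abs(dx) < cutoff:
                fij = lj_force(dx)
                f[i] += fij
                f[j] -= fij  # write to another particle's entry: a conflict if the i-loop is threaded
    return f


def forces_full_list(x, cutoff):
    """Does not use Newton's third law: every pair is computed twice, writes stay local to i."""
    n = len(x)
    f = [0.0] * n
    for i in range(n):  # iterations are independent, so this loop can be decomposed
        fi = 0.0
        for j in range(n):
            if j != i:
                dx = x[i] - x[j]
                if abs(dx) < cutoff:
                    fi += lj_force(dx)
        f[i] = fi
    return f
```

Both functions return the same forces up to rounding; the second does twice the arithmetic but needs neither a temporary buffer nor synchronization.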
5/20
Parallelization (2/3)
Pseudo-flat MPI
Full domain decomposition at both the intra-node and inter-node levels.
MDUnit class:
Represents an OpenMP thread
Responsible for calculation
MDManager class:
Represents an MPI process
Responsible for communication
DO I=1,THREAD_NUM              ! loop parallelization
  CALL MDUnit[I]->Calculate()
ENDDO
[Figure: each Process (MDManager) holds several Threads (MDUnit)]
Merits
It is straightforward to implement pseudo-flat MPI codes from flat MPI codes.
Memory locality is automatically satisfied. (not related to the K computer)
We have to take care of SIMDization only at the hot spot (force calculation).
→ We do not have to take care of two kinds of load balance simultaneously.
6/20
Parallelization (3/3)
Demerits
Two-level communication is required.
→ MPI calls from within threads are limited.
→ Data should be packed before being sent by a master thread.
It is troublesome to compute physical quantities such as pressure.
[Figure: Communication between two MDManagers, each holding four MDUnits]
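The division of labor between MDUnit and MDManager might look roughly like the sketch below. Only the two class names and the `Calculate` call come from the slides. The method bodies, the free-streaming placeholder for the force calculation, the `pack_for_send` helper, and the use of a Python thread pool in place of OpenMP are all assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


class MDUnit:
    """Stands in for one thread: owns a sub-domain and does the calculation."""

    def __init__(self, positions, momenta):
        self.q = list(positions)
        self.p = list(momenta)

    def calculate(self, dt):
        # Placeholder for the real force calculation: free streaming only.
        self.q = [q + p * dt for q, p in zip(self.q, self.p)]

    def pack(self):
        return list(self.q)


class MDManager:
    """Stands in for one MPI process: owns the units and handles communication."""

    def __init__(self, units):
        self.units = units

    def step(self, dt):
        # Loop parallelization over units, one worker per unit.
        with ThreadPoolExecutor(max_workers=len(self.units)) as pool:
            list(pool.map(lambda unit: unit.calculate(dt), self.units))

    def pack_for_send(self):
        # Two-level communication: gather data from all units into one buffer
        # so that a single (master) thread can send it.
        buffer = []
        for unit in self.units:
            buffer.extend(unit.pack())
        return buffer
```

Because every unit owns its own arrays, no unit writes into another unit's data during `step`, which is what makes the loop over units trivially parallel.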
7/20
Algorithms for MD
Pair-list Construction
Grid Search
O(N) method to find interacting particle pairs
Bookkeeping method
Reuse the pair list by registering pairs within a registration length longer than the truncation length.
Sorting of Pair-list
Sort by i-particle in order to reduce memory access
[Figure labels: Truncation, Registration]
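A minimal sketch of grid search combined with bookkeeping follows. It assumes a non-periodic box and a cell edge equal to the registration length. The function names and the rebuild criterion (rebuild once twice the largest displacement exceeds the margin, a common choice for such lists) are assumptions, not details given in the slides.

```python
from collections import defaultdict
from itertools import product


def dist2(a, b):
    """Squared distance between two 3D points."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def build_pair_list(pos, cutoff, margin):
    """Return pairs (i, j), i < j, closer than cutoff + margin, sorted by i."""
    reg = cutoff + margin  # registration length, longer than the truncation length
    cells = defaultdict(list)
    for idx, (x, y, z) in enumerate(pos):
        cells[(int(x // reg), int(y // reg), int(z // reg))].append(idx)
    pairs = []
    for (cx, cy, cz), members in cells.items():
        # Only the 27 surrounding cells can contain partners: O(N) overall.
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            neighbors = cells.get((cx + dx, cy + dy, cz + dz), ())
            for i in members:
                for j in neighbors:
                    if i < j and dist2(pos[i], pos[j]) < reg * reg:
                        pairs.append((i, j))
    pairs.sort()  # sort by i-particle so that particle i is loaded once per run of pairs
    return pairs


def needs_rebuild(max_displacement, margin):
    """The list stays valid while no pair can have closed the margin unnoticed."""
    return 2.0 * max_displacement > margin
```

Between rebuilds the force loop still has to check each registered pair against the truncation length, since the list deliberately contains pairs that are slightly too far apart.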
8/20
SIMDization
Non-use of Newton’s third law
We find that store operations involving indirect addressing are quite expensive on the K computer.
We avoid indirect stores by not using Newton’s third law.
The amount of computation doubles, but the speedup is more than twofold.
The value of FLOPS is meaningless.
Hand-SIMDization
The compiler cannot generate SIMD operations efficiently.
We use SIMD intrinsics explicitly. (emmintrin.h)
The divide instruction (fdivd) cannot be specified as a SIMD instruction.
We use a reciprocal approximation (frcpd), which is fast but accurate to only 8 bits.
We improve the precision by additional calculations.
The loop is unrolled four times to enhance software pipelining.
The source code is available at: http://mdacp.sourceforge.net/
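The slides do not say which additional calculations restore the precision. One standard option is Newton-Raphson iteration, which roughly doubles the number of correct bits per step. The sketch below assumes that method and emulates the 8-bit estimate in software; `approx_reciprocal` is a hypothetical stand-in for the hardware instruction, not a model of it.

```python
import math


def approx_reciprocal(a, bits=8):
    """Emulate a low-precision reciprocal estimate by keeping `bits` mantissa bits."""
    m, e = math.frexp(1.0 / a)  # 1/a = m * 2**e with 0.5 <= |m| < 1
    scale = 1 << bits
    return math.ldexp(math.floor(m * scale) / scale, e)


def refine(a, x, steps):
    """Newton-Raphson for f(x) = 1/x - a, i.e. x <- x * (2 - a * x), without any division."""
    for _ in range(steps):
        x = x * (2.0 - a * x)
    return x
```

If the relative error of the estimate is e, one step leaves an error of e squared, so three steps take an estimate good to about 8 bits down to the rounding level of double precision, using only multiplications and subtractions.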
9/20
Benchmark (1/3) - Conditions -
Conditions
Truncated Lennard-Jones potential with cutoff length 2.5 (in LJ units)
Initial condition is FCC with density 0.5
Integration scheme: 2nd-order symplectic with dt = 0.001
0.5 million particles per CPU core (4 million particles per node)
After 150 steps, measure the time required for 1000 steps.
Parallelization
Flat MPI: 8 processes/node, up to 32768 nodes (4.2 PF)
Hybrid: 8 threads/node, up to 82944 nodes (10.6 PF)
Computations assigned to each core are perfectly identical for both schemes.
[Figure: 1 CPU core = 0.5 million particles; 1 node = 4 million particles; 8 nodes = 32 million particles]
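As a sketch of the initial condition, an FCC lattice at a given number density can be generated as follows. The function name and the cubic box are assumptions. Since an FCC unit cell holds four atoms, density 0.5 corresponds to a lattice constant of 2 in LJ units, and a block of 50 × 50 × 50 unit cells would contain 500,000 particles, which matches the stated 0.5 million per core (the slides do not confirm that this block shape was used).

```python
def fcc_lattice(n_cells, density):
    """Return (positions, box_length) for n_cells**3 FCC unit cells at the given density."""
    a = (4.0 / density) ** (1.0 / 3.0)  # four atoms per cubic unit cell
    basis = ((0.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.5, 0.0, 0.5), (0.0, 0.5, 0.5))
    positions = [
        ((ix + bx) * a, (iy + by) * a, (iz + bz) * a)
        for ix in range(n_cells)
        for iy in range(n_cells)
        for iz in range(n_cells)
        for bx, by, bz in basis
    ]
    return positions, n_cells * a
```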
10/20
Benchmark (2/3) - Elapsed Time -
[Figure: Elapsed Time [sec] (0 to 500) vs. Nodes (1 to 82944) for Flat-MPI and Hybrid]
Flat-MPI: 131 billion particles, 92.0% efficiency
Hybrid: 332 billion particles, 72.3% efficiency
Nice scaling for the flat MPI (92% efficiency compared with a single node)
Performance of the largest run: 1.76 PFLOPS (16.6%)
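The headline numbers follow from simple arithmetic, sketched below. The helper names are illustrative, and the efficiency definition (single-node time divided by N-node time at fixed work per node) is the usual weak-scaling convention, assumed here because the slide compares against a single node.

```python
PARTICLES_PER_NODE = 4_000_000  # 8 cores x 0.5 million particles


def total_particles(nodes):
    """System size in a weak-scaling run with fixed work per node."""
    return nodes * PARTICLES_PER_NODE


def weak_scaling_efficiency(t_single_node, t_n_nodes):
    """Ratio of single-node elapsed time to N-node elapsed time."""
    return t_single_node / t_n_nodes


def fraction_of_peak(achieved_pflops, peak_pflops):
    """Sustained performance as a fraction of theoretical peak."""
    return achieved_pflops / peak_pflops
```

With these, `total_particles(32768)` is about 131 billion, `total_particles(82944)` is about 332 billion, and `fraction_of_peak(1.76, 10.6)` is about 0.166, in line with the figures on the slide.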
11/20
Benchmark (3/3) - Memory Usage -
[Figure: Memory Usage [GiB] (0 to 9) vs. Nodes (1 to 82944) for Flat-MPI and Hybrid]
Flat-MPI: 8.52 GiB, 39% increase
Hybrid: 5.66 GiB, 2.8% increase
Memory usage is almost flat for the hybrid scheme.
Extra 1 GB/node for Flat-MPI
We performed production runs on 4096 nodes using flat MPI.
12/20
Time Evolution of Multi-bubble Nuclei
Equilibrium Liquid
→ Adiabatic expansion (2.5% on a side)
→ Multi-bubble nuclei
→ Ostwald ripening
→ Converges to a single bubble
Larger bubbles become larger, smaller bubbles become smaller
13/20
Details of Simulation
[Figure: phase diagram, Density vs. Temperature. Labels: G, Coexistence, Liquid]
Condition
Initial Condition: Pure Liquid
Truncation Length: 3.0
Periodic Boundary Condition:
Time Step: 0.005
Integration: 2nd-order Symplectic (NVE), 2nd-order RESPA (NVT)
Computational Scale
Resource: 4096 nodes of the K computer (0.5 PF)
Flat-MPI: 32768 processes
Parallelization: Simple Domain Decomposition
Particles: 400 to 700 million particles
System Size: L = 960 (in LJ units)
Thermalization: 10,000 steps + Observation: 1,000,000 steps
24 hours / run
1 run ≈ 0.1 million node hours
10 samples ≈ 1 million node hours
[Movie]
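The cost and density figures can be cross-checked with a few lines of arithmetic. The helper names are illustrative, and a cubic box of side L is assumed.

```python
def node_hours(nodes, hours):
    """Cost of one run in node hours."""
    return nodes * hours


def number_density(n_particles, box_length):
    """Number density of a cubic box of side box_length (LJ units)."""
    return n_particles / box_length ** 3
```

One 24-hour run on 4096 nodes costs 98,304 node hours, so ten samples come to roughly one million. In a cubic box with L = 960, 400 to 700 million particles correspond to densities of roughly 0.45 to 0.79.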
14/20
Multi-bubble Nuclei and Ostwald-like Ripening
[Figure: snapshots of runs with 23 million particles and 1.5 billion particles]
Initial stage: Multi nuclei
Intermediate stage: Ostwald-like ripening
Final stage: One bubble survives
Bubbles appear as the result of particle interactions.
Ostwald-like ripening is observed as the result of bubble interactions.
→ Direct simulations of multi-scale and multi-physics phenomena.
15/20
Identification of Bubbles
Definition of Bubble
Divide a system into small cells.
A cell with density smaller than some threshold is defined to be gas phase. (A type of binarization)
Identify bubbles on the basis of the site-percolation criterion (clustering).
[Figure: cells labeled Liquid and Gas; a Bubble spanning Process A and Process B]
Data Format
The number of particles in a cell is stored as an unsigned char (1 byte).
We do not have to take care of byte order!
A system is divided into 32.7 million cells → 32.7 MB per snapshot.
1000 frames/run → 32.7 GB/run.
Clustering is performed in post-processing.
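The binarization and clustering steps can be sketched as a flood fill over gas cells. This is an illustrative post-processing routine, not the authors' code: the nested-list grid, the six-neighbor connectivity, and the periodic wrap-around are assumptions. For scale, 320³ = 32,768,000 cells would match the quoted 32.7 million, though the slides do not state the grid dimensions.

```python
from collections import deque

NEIGHBORS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def find_bubbles(counts, threshold):
    """Return bubble sizes (in cells), largest first.

    counts[x][y][z] is the number of particles in a cell; a cell with fewer
    than `threshold` particles is treated as gas. Face-adjacent gas cells
    belong to the same bubble, with periodic boundaries.
    """
    nx, ny, nz = len(counts), len(counts[0]), len(counts[0][0])
    seen = set()
    sizes = []
    for x in range(nx):
        for y in range(ny):
            for z in range(nz):
                if counts[x][y][z] >= threshold or (x, y, z) in seen:
                    continue
                seen.add((x, y, z))
                queue = deque([(x, y, z)])
                size = 0
                while queue:
                    cx, cy, cz = queue.popleft()
                    size += 1
                    for dx, dy, dz in NEIGHBORS:
                        nb = ((cx + dx) % nx, (cy + dy) % ny, (cz + dz) % nz)
                        if nb not in seen and counts[nb[0]][nb[1]][nb[2]] < threshold:
                            seen.add(nb)
                            queue.append(nb)
                sizes.append(size)
    return sorted(sizes, reverse=True)


def pack_counts(counts):
    """Flatten the grid to one byte per cell (each count must be in 0..255)."""
    return bytes(c for plane in counts for row in plane for c in row)
```

Storing one byte per cell is what makes the byte order irrelevant: a single-byte value reads the same on any machine.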
16/20
Bubble-size distribution
[Figure: bubble-size distributions for 1.5 billion particles and 23 million particles]
Bubble-size distribution at the 10,000th step after the expansion.
Accuracy of the smaller run is insufficient for further analysis.
→ We need 1 billion particles to study the population dynamics of bubbles, which is impossible without peta-scale computers.
17/20
Lifshitz-Slyozov-Wagner (LSW) Theory
Definitions
$f(v, t)$: the number of bubbles which have volume $v$ at time $t$.
$\dot{v}(v, t)$: Kinetic Term (Microscopic Dynamics)
Governing Equation
$\frac{\partial f}{\partial t} = -\frac{\partial}{\partial v}\left(\dot{v} f\right)$
Assumptions
- Mean field approximation
- Mechanical equilibrium
- Conservation of the total volume of gas phase
- Self-similarity of the distribution function: $f(v, t) \sim t^{y} \tilde{f}(v t^{-x})$
[Annotation on the assumptions: shared with the classical nucleation theory]
Predictions
Power-law behaviors
The total number of bubbles: $n(t) \sim t^{-x}$
The average volume of bubbles: $\bar{v}(t) \sim t^{x}$
The scaling index $x$ depends on microscopic dynamics.
18/20
Simulation Results (1/2)
[Figure: Number of Bubbles and Average Volume of Bubbles vs. time, each with a marked Scaling Region. Slope: x = 1.5]
Power-law behavior in the late stage of time evolution.
Both quantities share the value of the exponent, as predicted by the theory.
The LSW theory works fine.
Assumptions are valid.
We cannot obtain the scaling region using small systems.
Huge-scale simulation is necessary.
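A scaling exponent of this kind is typically read off as the slope of a straight line in a log-log plot. The sketch below fits that slope by ordinary least squares. The function name is an assumption, the fitting method used by the authors is not stated in the slides, and any data passed in should be restricted to the scaling region.

```python
import math


def fit_power_law(times, values):
    """Fit values ~ C * times**slope by least squares on log-log data.

    Returns (slope, C). All inputs must be positive.
    """
    lx = [math.log(t) for t in times]
    ly = [math.log(v) for v in values]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(lx, ly)) / sum((a - mx) ** 2 for a in lx)
    return slope, math.exp(my - slope * mx)
```

Applied to the number of bubbles, the fitted slope is negative (the count decays) and its magnitude is the exponent; applied to the average volume, the slope itself is the exponent. Agreement between the two is the consistency check described on this slide.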
19/20
Simulation Results (2/2)
Scaling exponent changed from 1.5 to 1
[Figure panels: High T, Low T]
Reaction-Limit: x = 3/2
Diffusion-Limit: x = 1
Microscopic dynamics can be investigated from macroscopic behavior.
20/20
Summary and Discussion
Summary
Simulations involving billions of particles allow us to investigate multi-scale and multi-physics problems directly.
We did physics using a peta-scale computer.
Toward Exa-scale Computing
Please give us a decent programming language!
MPI (library), OpenMP (directive), SIMD (intrinsic functions)
References
MDACP (Molecular Dynamics code for Avogadro Challenge Project)
Source code is available online: http://mdacp.sourceforge.net/
Acknowledgements
Computational Resources:
“Large-scale HPC Challenge” Project, ITC, Univ. of Tokyo
Institute of Solid State Physics, Univ. of Tokyo
HPCI project (hp130047) on the K computer, AICS, RIKEN
Financial Supports:
CMSI
KAUST GRP (KUK-I1-005-04)
KAKENHI (23740287)

More Related Content

What's hot

Multi threading
Multi threadingMulti threading
Multi threadinggndu
 
Multi-threaded Programming in JAVA
Multi-threaded Programming in JAVAMulti-threaded Programming in JAVA
Multi-threaded Programming in JAVAVikram Kalyani
 
Learning Java 3 – Threads and Synchronization
Learning Java 3 – Threads and SynchronizationLearning Java 3 – Threads and Synchronization
Learning Java 3 – Threads and Synchronizationcaswenson
 
Java And Multithreading
Java And MultithreadingJava And Multithreading
Java And MultithreadingShraddha
 
Life cycle-of-a-thread
Life cycle-of-a-threadLife cycle-of-a-thread
Life cycle-of-a-threadjavaicon
 
Jeff Johnson, Research Engineer, Facebook at MLconf NYC
Jeff Johnson, Research Engineer, Facebook at MLconf NYCJeff Johnson, Research Engineer, Facebook at MLconf NYC
Jeff Johnson, Research Engineer, Facebook at MLconf NYCMLconf
 
Java Multithreading
Java MultithreadingJava Multithreading
Java MultithreadingRajkattamuri
 
Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC
Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYCTed Willke, Senior Principal Engineer, Intel Labs at MLconf NYC
Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYCMLconf
 
Python Training in Bangalore | Multi threading | Learnbay.in
Python Training in Bangalore | Multi threading | Learnbay.inPython Training in Bangalore | Multi threading | Learnbay.in
Python Training in Bangalore | Multi threading | Learnbay.inLearnbayin
 
Developing Multithreaded Applications
Developing Multithreaded ApplicationsDeveloping Multithreaded Applications
Developing Multithreaded ApplicationsBharat17485
 
[DL輪読会]Pay Attention to MLPs (gMLP)
[DL輪読会]Pay Attention to MLPs	(gMLP)[DL輪読会]Pay Attention to MLPs	(gMLP)
[DL輪読会]Pay Attention to MLPs (gMLP)Deep Learning JP
 
L22 multi-threading-introduction
L22 multi-threading-introductionL22 multi-threading-introduction
L22 multi-threading-introductionteach4uin
 
Shogun 2.0 @ PyData NYC 2012
Shogun 2.0 @ PyData NYC 2012Shogun 2.0 @ PyData NYC 2012
Shogun 2.0 @ PyData NYC 2012Christian Widmer
 
Multithreading In Java
Multithreading In JavaMultithreading In Java
Multithreading In Javaparag
 

What's hot (18)

Multi threading
Multi threadingMulti threading
Multi threading
 
Multi-threaded Programming in JAVA
Multi-threaded Programming in JAVAMulti-threaded Programming in JAVA
Multi-threaded Programming in JAVA
 
Learning Java 3 – Threads and Synchronization
Learning Java 3 – Threads and SynchronizationLearning Java 3 – Threads and Synchronization
Learning Java 3 – Threads and Synchronization
 
Java And Multithreading
Java And MultithreadingJava And Multithreading
Java And Multithreading
 
Multithreading in Java
Multithreading in JavaMultithreading in Java
Multithreading in Java
 
Life cycle-of-a-thread
Life cycle-of-a-threadLife cycle-of-a-thread
Life cycle-of-a-thread
 
Multithreading in java
Multithreading in javaMultithreading in java
Multithreading in java
 
Jeff Johnson, Research Engineer, Facebook at MLconf NYC
Jeff Johnson, Research Engineer, Facebook at MLconf NYCJeff Johnson, Research Engineer, Facebook at MLconf NYC
Jeff Johnson, Research Engineer, Facebook at MLconf NYC
 
Multithreading Concepts
Multithreading ConceptsMultithreading Concepts
Multithreading Concepts
 
Java Multithreading
Java MultithreadingJava Multithreading
Java Multithreading
 
Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC
Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYCTed Willke, Senior Principal Engineer, Intel Labs at MLconf NYC
Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC
 
Python Training in Bangalore | Multi threading | Learnbay.in
Python Training in Bangalore | Multi threading | Learnbay.inPython Training in Bangalore | Multi threading | Learnbay.in
Python Training in Bangalore | Multi threading | Learnbay.in
 
Developing Multithreaded Applications
Developing Multithreaded ApplicationsDeveloping Multithreaded Applications
Developing Multithreaded Applications
 
[DL輪読会]Pay Attention to MLPs (gMLP)
[DL輪読会]Pay Attention to MLPs	(gMLP)[DL輪読会]Pay Attention to MLPs	(gMLP)
[DL輪読会]Pay Attention to MLPs (gMLP)
 
L22 multi-threading-introduction
L22 multi-threading-introductionL22 multi-threading-introduction
L22 multi-threading-introduction
 
Shogun 2.0 @ PyData NYC 2012
Shogun 2.0 @ PyData NYC 2012Shogun 2.0 @ PyData NYC 2012
Shogun 2.0 @ PyData NYC 2012
 
Multithreading In Java
Multithreading In JavaMultithreading In Java
Multithreading In Java
 
Java threads
Java threadsJava threads
Java threads
 

Similar to Huge-Scale Molecular Dynamics Simulation of Multi-bubble Nuclei

El text.tokuron a(2019).jung190711
El text.tokuron a(2019).jung190711El text.tokuron a(2019).jung190711
El text.tokuron a(2019).jung190711RCCSRENKEI
 
第13回 配信講義 計算科学技術特論A(2021)
第13回 配信講義 計算科学技術特論A(2021)第13回 配信講義 計算科学技術特論A(2021)
第13回 配信講義 計算科学技術特論A(2021)RCCSRENKEI
 
What can we learn from molecular dynamics simulations of carbon nanotube and ...
What can we learn from molecular dynamics simulations of carbon nanotube and ...What can we learn from molecular dynamics simulations of carbon nanotube and ...
What can we learn from molecular dynamics simulations of carbon nanotube and ...Stephan Irle
 
CMSI計算科学技術特論A (2015) 第13回 Parallelization of Molecular Dynamics
CMSI計算科学技術特論A (2015) 第13回 Parallelization of Molecular Dynamics CMSI計算科学技術特論A (2015) 第13回 Parallelization of Molecular Dynamics
CMSI計算科学技術特論A (2015) 第13回 Parallelization of Molecular Dynamics Computational Materials Science Initiative
 
Comparing ICH-leach and Leach Descendents on Image Transfer using DCT
Comparing ICH-leach and Leach Descendents on Image Transfer using DCT Comparing ICH-leach and Leach Descendents on Image Transfer using DCT
Comparing ICH-leach and Leach Descendents on Image Transfer using DCT IJECEIAES
 
Entropy 12-02268-v2
Entropy 12-02268-v2Entropy 12-02268-v2
Entropy 12-02268-v2CAA Sudan
 
A new Leach protocol based on ICH-Leach for adaptive image transferring using...
A new Leach protocol based on ICH-Leach for adaptive image transferring using...A new Leach protocol based on ICH-Leach for adaptive image transferring using...
A new Leach protocol based on ICH-Leach for adaptive image transferring using...TELKOMNIKA JOURNAL
 
Quantum Computers
Quantum ComputersQuantum Computers
Quantum Computerskathan
 
Efficient methods for accurately calculating thermoelectric properties – elec...
Efficient methods for accurately calculating thermoelectric properties – elec...Efficient methods for accurately calculating thermoelectric properties – elec...
Efficient methods for accurately calculating thermoelectric properties – elec...Anubhav Jain
 
A NOVEL FULL ADDER CELL BASED ON CARBON NANOTUBE FIELD EFFECT TRANSISTORS
A NOVEL FULL ADDER CELL BASED ON CARBON NANOTUBE FIELD EFFECT TRANSISTORSA NOVEL FULL ADDER CELL BASED ON CARBON NANOTUBE FIELD EFFECT TRANSISTORS
A NOVEL FULL ADDER CELL BASED ON CARBON NANOTUBE FIELD EFFECT TRANSISTORSVLSICS Design
 
Energy Efficient Wireless Internet Access
Energy Efficient Wireless Internet AccessEnergy Efficient Wireless Internet Access
Energy Efficient Wireless Internet AccessScienzainrete
 
The Dielectric Relaxation Properties And Dipole Ordering...
The Dielectric Relaxation Properties And Dipole Ordering...The Dielectric Relaxation Properties And Dipole Ordering...
The Dielectric Relaxation Properties And Dipole Ordering...Sarah Gordon
 
Design Novel CMOL Architecture in Terabit-scale Nano Electronic Memories Tech...
Design Novel CMOL Architecture in Terabit-scale Nano Electronic Memories Tech...Design Novel CMOL Architecture in Terabit-scale Nano Electronic Memories Tech...
Design Novel CMOL Architecture in Terabit-scale Nano Electronic Memories Tech...IJMTST Journal
 
Analytical instrumentation introduction
Analytical instrumentation   introductionAnalytical instrumentation   introduction
Analytical instrumentation introductionBurdwan University
 
Self-Balancing Multimemetic Algorithms in Dynamic Scale-Free Networks
Self-Balancing Multimemetic Algorithms in Dynamic Scale-Free NetworksSelf-Balancing Multimemetic Algorithms in Dynamic Scale-Free Networks
Self-Balancing Multimemetic Algorithms in Dynamic Scale-Free NetworksRafael Nogueras
 

Similar to Huge-Scale Molecular Dynamics Simulation of Multi-bubble Nuclei (20)

El text.tokuron a(2019).jung190711
El text.tokuron a(2019).jung190711El text.tokuron a(2019).jung190711
El text.tokuron a(2019).jung190711
 
第13回 配信講義 計算科学技術特論A(2021)
第13回 配信講義 計算科学技術特論A(2021)第13回 配信講義 計算科学技術特論A(2021)
第13回 配信講義 計算科学技術特論A(2021)
 
What can we learn from molecular dynamics simulations of carbon nanotube and ...
What can we learn from molecular dynamics simulations of carbon nanotube and ...What can we learn from molecular dynamics simulations of carbon nanotube and ...
What can we learn from molecular dynamics simulations of carbon nanotube and ...
 
CMSI計算科学技術特論A (2015) 第13回 Parallelization of Molecular Dynamics
CMSI計算科学技術特論A (2015) 第13回 Parallelization of Molecular Dynamics CMSI計算科学技術特論A (2015) 第13回 Parallelization of Molecular Dynamics
CMSI計算科学技術特論A (2015) 第13回 Parallelization of Molecular Dynamics
 
Comparing ICH-leach and Leach Descendents on Image Transfer using DCT
Comparing ICH-leach and Leach Descendents on Image Transfer using DCT Comparing ICH-leach and Leach Descendents on Image Transfer using DCT
Comparing ICH-leach and Leach Descendents on Image Transfer using DCT
 
Entropy 12-02268-v2
Entropy 12-02268-v2Entropy 12-02268-v2
Entropy 12-02268-v2
 
Testo
TestoTesto
Testo
 
Nano mos25
Nano mos25Nano mos25
Nano mos25
 
Quantum computing1
Quantum computing1Quantum computing1
Quantum computing1
 
Quantum comput ing
Quantum comput ingQuantum comput ing
Quantum comput ing
 
A new Leach protocol based on ICH-Leach for adaptive image transferring using...
A new Leach protocol based on ICH-Leach for adaptive image transferring using...A new Leach protocol based on ICH-Leach for adaptive image transferring using...
A new Leach protocol based on ICH-Leach for adaptive image transferring using...
 
Physical Modeling and Design for Phase Change Memories
Physical Modeling and Design for Phase Change MemoriesPhysical Modeling and Design for Phase Change Memories
Physical Modeling and Design for Phase Change Memories
 
Quantum Computers
Quantum ComputersQuantum Computers
Quantum Computers
 
Efficient methods for accurately calculating thermoelectric properties – elec...
Efficient methods for accurately calculating thermoelectric properties – elec...Efficient methods for accurately calculating thermoelectric properties – elec...
Efficient methods for accurately calculating thermoelectric properties – elec...
 
A NOVEL FULL ADDER CELL BASED ON CARBON NANOTUBE FIELD EFFECT TRANSISTORS
A NOVEL FULL ADDER CELL BASED ON CARBON NANOTUBE FIELD EFFECT TRANSISTORSA NOVEL FULL ADDER CELL BASED ON CARBON NANOTUBE FIELD EFFECT TRANSISTORS
A NOVEL FULL ADDER CELL BASED ON CARBON NANOTUBE FIELD EFFECT TRANSISTORS
 
Energy Efficient Wireless Internet Access
Energy Efficient Wireless Internet AccessEnergy Efficient Wireless Internet Access
Energy Efficient Wireless Internet Access
 
The Dielectric Relaxation Properties And Dipole Ordering...
The Dielectric Relaxation Properties And Dipole Ordering...The Dielectric Relaxation Properties And Dipole Ordering...
The Dielectric Relaxation Properties And Dipole Ordering...
 
Design Novel CMOL Architecture in Terabit-scale Nano Electronic Memories Tech...
Design Novel CMOL Architecture in Terabit-scale Nano Electronic Memories Tech...Design Novel CMOL Architecture in Terabit-scale Nano Electronic Memories Tech...
Design Novel CMOL Architecture in Terabit-scale Nano Electronic Memories Tech...
 
Analytical instrumentation introduction
Analytical instrumentation   introductionAnalytical instrumentation   introduction
Analytical instrumentation introduction
 
Self-Balancing Multimemetic Algorithms in Dynamic Scale-Free Networks
Self-Balancing Multimemetic Algorithms in Dynamic Scale-Free NetworksSelf-Balancing Multimemetic Algorithms in Dynamic Scale-Free Networks
Self-Balancing Multimemetic Algorithms in Dynamic Scale-Free Networks
 

More from Hiroshi Watanabe

CMSI計算科学技術特論A(8) 高速化チューニングとその関連技術2
CMSI計算科学技術特論A(8) 高速化チューニングとその関連技術2CMSI計算科学技術特論A(8) 高速化チューニングとその関連技術2
CMSI計算科学技術特論A(8) 高速化チューニングとその関連技術2Hiroshi Watanabe
 
CMSI計算科学技術特論A(8) 高速化チューニングとその関連技術1
CMSI計算科学技術特論A(8) 高速化チューニングとその関連技術1CMSI計算科学技術特論A(8) 高速化チューニングとその関連技術1
CMSI計算科学技術特論A(8) 高速化チューニングとその関連技術1Hiroshi Watanabe
 
AVX命令を用いたLJの力計算のSIMD化
AVX命令を用いたLJの力計算のSIMD化AVX命令を用いたLJの力計算のSIMD化
AVX命令を用いたLJの力計算のSIMD化Hiroshi Watanabe
 
短距離ハイブリッド並列分子動力学コードの設計思想と説明のようなもの 〜並列編〜
短距離ハイブリッド並列分子動力学コードの設計思想と説明のようなもの 〜並列編〜短距離ハイブリッド並列分子動力学コードの設計思想と説明のようなもの 〜並列編〜
短距離ハイブリッド並列分子動力学コードの設計思想と説明のようなもの 〜並列編〜Hiroshi Watanabe
 
短距離ハイブリッド並列分子動力学コードの設計思想と説明のようなもの
短距離ハイブリッド並列分子動力学コードの設計思想と説明のようなもの短距離ハイブリッド並列分子動力学コードの設計思想と説明のようなもの
短距離ハイブリッド並列分子動力学コードの設計思想と説明のようなものHiroshi Watanabe
 
ハイパースレッディングの並列化への影響
ハイパースレッディングの並列化への影響ハイパースレッディングの並列化への影響
ハイパースレッディングの並列化への影響Hiroshi Watanabe
 
短距離古典分子動力学計算の 高速化と大規模並列化
短距離古典分子動力学計算の 高速化と大規模並列化短距離古典分子動力学計算の 高速化と大規模並列化
短距離古典分子動力学計算の 高速化と大規模並列化Hiroshi Watanabe
 

More from Hiroshi Watanabe (11)

Enjoy supercomputing
Enjoy supercomputingEnjoy supercomputing
Enjoy supercomputing
 
CMSI計算科学技術特論A(8) 高速化チューニングとその関連技術2
CMSI計算科学技術特論A(8) 高速化チューニングとその関連技術2CMSI計算科学技術特論A(8) 高速化チューニングとその関連技術2
CMSI計算科学技術特論A(8) 高速化チューニングとその関連技術2
 
CMSI計算科学技術特論A(8) 高速化チューニングとその関連技術1
CMSI計算科学技術特論A(8) 高速化チューニングとその関連技術1CMSI計算科学技術特論A(8) 高速化チューニングとその関連技術1
CMSI計算科学技術特論A(8) 高速化チューニングとその関連技術1
 
AVX命令を用いたLJの力計算のSIMD化
AVX命令を用いたLJの力計算のSIMD化AVX命令を用いたLJの力計算のSIMD化
AVX命令を用いたLJの力計算のSIMD化
 
MDとはなにか
MDとはなにかMDとはなにか
MDとはなにか
 
短距離ハイブリッド並列分子動力学コードの設計思想と説明のようなもの 〜並列編〜
短距離ハイブリッド並列分子動力学コードの設計思想と説明のようなもの 〜並列編〜短距離ハイブリッド並列分子動力学コードの設計思想と説明のようなもの 〜並列編〜
短距離ハイブリッド並列分子動力学コードの設計思想と説明のようなもの 〜並列編〜
 
短距離ハイブリッド並列分子動力学コードの設計思想と説明のようなもの
短距離ハイブリッド並列分子動力学コードの設計思想と説明のようなもの短距離ハイブリッド並列分子動力学コードの設計思想と説明のようなもの
短距離ハイブリッド並列分子動力学コードの設計思想と説明のようなもの
 
130613-debug
130613-debug130613-debug
130613-debug
 
ハイパースレッディングの並列化への影響
ハイパースレッディングの並列化への影響ハイパースレッディングの並列化への影響
ハイパースレッディングの並列化への影響
 
短距離古典分子動力学計算の 高速化と大規模並列化
短距離古典分子動力学計算の 高速化と大規模並列化短距離古典分子動力学計算の 高速化と大規模並列化
短距離古典分子動力学計算の 高速化と大規模並列化
 
Tuning, etc.
Tuning, etc.Tuning, etc.
Tuning, etc.
 

Recently uploaded

Types of different blotting techniques.pptx
Types of different blotting techniques.pptxTypes of different blotting techniques.pptx
Types of different blotting techniques.pptxkhadijarafiq2012
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Patrick Diehl
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhousejana861314
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfSumit Kumar yadav
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfSwapnil Therkar
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxAleenaTreesaSaji
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptMAESTRELLAMesa2
 

Recently uploaded (20)

Types of different blotting techniques.pptx
Types of different blotting techniques.pptxTypes of different blotting techniques.pptx
Types of different blotting techniques.pptx
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Huge-Scale Molecular Dynamics Simulation of Multi-bubble Nuclei

  • 1. University of Tokyo. Huge-Scale Molecular Dynamics Simulation of Multi-bubble Nuclei. H. Watanabe (ISSP, The University of Tokyo), M. Suzuki (Kyushu University), H. Inaoka (RIKEN AICS), N. Ito (The University of Tokyo, RIKEN AICS). Outline: 1. Introduction; 2. Benchmark results; 3. Cavitation; 4. Future of HPC; 5. Summary and Discussion. May 15th, 2014 @ NCSA Blue Waters Symposium
  • 2. Introduction (1/2). Gas-Liquid Multi-Phase Flow: a Multi-Scale and Multi-Physics Problem. Diagram labels: Particle Interaction, Phase Transition, Bubble Interaction, Flow; Hierarchical Modeling; Direct Calculation. Numerical Difficulties: moving, annihilation, and creation of boundaries. Hierarchical Modeling: divide a problem into sub-problems; governing equations for each scale; artificial hierarchy; validity of the governing equations. Direct Simulation. [Movie]
  • 3. Introduction (2/2). Classical Nucleation Theory: Classical nucleation theory (CNT) predicts the nucleation rate of clusters. It works for droplet nuclei but fails for bubble nuclei. Interaction between clusters: droplets interact via gas (diffusive); bubbles interact via liquid (ballistic). Droplets and Bubbles: additional work is required to create a bubble (equation terms labeled "Same as Droplet" and "Work by Bubble"). It is difficult to investigate microscopic behavior directly. → We use huge-scale molecular dynamics simulations to observe interactions between bubbles.
  • 4. Parallelization (1/3). Simulation Method: Molecular Dynamics (MD) method with short-range interaction. General Approach: MPI: domain decomposition; OpenMP: loop decomposition. Pseudocode: DO I=1,N; DO J in Pair(I); CalcForce(I,J); ENDDO; ENDDO (loop decomposition can be applied at either the I loop or the J loop). Problems: Conflicts on write-back of momenta. → Prepare temporary buffers to avoid conflicts, → or do not use Newton's third law (calculate every force twice). We have to thread-parallelize other parts, such as pair-list construction. We have to treat SIMDization and thread parallelization simultaneously.
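A minimal Python sketch of the write-back conflict on slide 4 and of the "do not use Newton's third law" remedy. The function names, the Lennard-Jones form with epsilon = sigma = 1, and the tiny all-pairs lists are assumptions for illustration; none of this is taken from the MDACP source.

```python
from itertools import combinations

def lj_force(ri, rj):
    """Lennard-Jones force on particle i due to particle j (epsilon = sigma = 1)."""
    dx, dy, dz = (a - b for a, b in zip(ri, rj))
    r2 = dx * dx + dy * dy + dz * dz
    inv_r6 = 1.0 / r2 ** 3
    # F_i = 24 * (2 / r^12 - 1 / r^6) / r^2 * (r_i - r_j)
    coef = 24.0 * (2.0 * inv_r6 * inv_r6 - inv_r6) / r2
    return coef * dx, coef * dy, coef * dz

def forces_third_law(pos, half_pairs):
    """Each pair is visited once, but the result is stored to BOTH i and j.
    If the loop were split over threads, two threads could update f[j] at once."""
    f = [[0.0, 0.0, 0.0] for _ in pos]
    for i, j in half_pairs:
        fij = lj_force(pos[i], pos[j])
        for k in range(3):
            f[i][k] += fij[k]
            f[j][k] -= fij[k]  # scattered store to another particle
    return f

def forces_no_third_law(pos, neighbors):
    """Each pair is computed twice; iteration i only ever writes f[i],
    so the outer loop can be split over threads without conflicts."""
    f = []
    for i, js in enumerate(neighbors):
        acc = [0.0, 0.0, 0.0]
        for j in js:
            fij = lj_force(pos[i], pos[j])
            for k in range(3):
                acc[k] += fij[k]
        f.append(acc)
    return f

# Four particles, every pair interacting (hypothetical test configuration).
pos = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.0, 1.2, 0.0), (1.0, 1.0, 0.9)]
half_pairs = list(combinations(range(len(pos)), 2))
neighbors = [[j for j in range(len(pos)) if j != i] for i in range(len(pos))]
```

Both functions return the same forces up to rounding; the second does twice the arithmetic in exchange for conflict-free stores.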
  • 5. Parallelization (2/3). Pseudo-flat MPI: full domain decomposition for both intra- and inter-node parallelism. MDUnit class: represents an OpenMP thread; responsible for calculation. MDManager class: represents an MPI process; responsible for communication. Pseudocode (loop parallelization): DO I=1,THREAD_NUM; CALL MDUnit[I]->Calculate(); ENDDO. Diagram: each process holds several MDUnits, one per thread. Merits: It is straightforward to implement pseudo-flat MPI codes from flat MPI codes. Memory locality is automatically satisfied (not relevant to the K computer). We have to take care of SIMDization only at the hot spot (force calculation). → We do not have to take care of two kinds of load balance simultaneously.
  • 6. Parallelization (3/3). Demerits: Two-level communication is required. → MPI calls from within threads are limited. → Data should be packed before being sent by a master thread. It is troublesome to compute physical quantities such as pressure. Diagram: two MDManagers, each holding four MDUnits, linked by communication.
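A rough Python sketch of the pseudo-flat MPI structure described on slides 5 and 6. Only the class names MDUnit and MDManager come from the slides; the method names, the drift "calculation", the choice of boundary particle, and the use of Python threads and `struct` in place of OpenMP and MPI are all assumptions for illustration.

```python
import struct
from concurrent.futures import ThreadPoolExecutor

class MDUnit:
    """One sub-domain, handled by one thread; responsible for calculation only."""
    def __init__(self, unit_id, positions):
        self.unit_id = unit_id
        self.positions = list(positions)

    def calculate(self, dt=0.001, velocity=1.0):
        # Stand-in for the real force and integration step: a uniform drift.
        self.positions = [x + velocity * dt for x in self.positions]

    def boundary_data(self):
        # Hypothetical: treat the last particle as the one a neighbor needs.
        return self.positions[-1:]

class MDManager:
    """One process; owns several MDUnits and performs all communication."""
    def __init__(self, units):
        self.units = units

    def step(self):
        # Loop parallelization over units: one thread per MDUnit.
        with ThreadPoolExecutor(max_workers=len(self.units)) as pool:
            list(pool.map(lambda unit: unit.calculate(), self.units))

    def pack_for_send(self):
        # The master thread gathers every unit's outgoing data into ONE buffer,
        # so that only the master needs to call the communication library.
        payload = [x for unit in self.units for x in unit.boundary_data()]
        return struct.pack(f"<{len(payload)}d", *payload)
```

In a real code the packed buffer would be handed to an MPI send issued by the master thread; here it is simply returned.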
  • 7. Algorithms for MD: Pair-list Construction. Grid Search: an O(N) method to find interacting particle pairs. Bookkeeping method: reuse the pair list by registering pairs within a registration length that is longer than the truncation length. Sorting of the pair list: sort by i-particle in order to reduce memory access. (Figure labels: Truncation, Registration.)
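A small Python sketch of the three ideas on slide 7: grid search, bookkeeping with a registration length, and sorting by i-particle. The function names, the non-periodic box, and the reuse test (no particle has moved more than half the margin since the list was built, a common criterion) are assumptions, not details given on the slide.

```python
from collections import defaultdict
from itertools import product

def build_pair_list(pos, cutoff, margin):
    """Grid search: register every pair closer than cutoff + margin."""
    reg = cutoff + margin  # registration length (> truncation length)
    cells = defaultdict(list)
    for idx, p in enumerate(pos):
        cells[tuple(int(c // reg) for c in p)].append(idx)
    pairs = []
    for cell, members in cells.items():
        # Only the 27 surrounding cells can contain partners within `reg`.
        for offset in product((-1, 0, 1), repeat=3):
            other = tuple(c + o for c, o in zip(cell, offset))
            for i in members:
                for j in cells.get(other, ()):
                    if i < j:
                        d2 = sum((a - b) ** 2 for a, b in zip(pos[i], pos[j]))
                        if d2 < reg * reg:
                            pairs.append((i, j))
    pairs.sort()  # sorted by i-particle, then by j
    return pairs

def list_is_valid(pos, pos_at_build, margin):
    """Bookkeeping: reuse the list while no particle has moved more than margin / 2."""
    limit2 = (0.5 * margin) ** 2
    return all(
        sum((a - b) ** 2 for a, b in zip(p, q)) <= limit2
        for p, q in zip(pos, pos_at_build)
    )
```

The force loop then skips registered pairs that are currently beyond the truncation length, and the list is rebuilt only when `list_is_valid` returns False.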
  • 8. SIMDization. Non-use of Newton's third law: We find that store operations involving indirect addressing are quite expensive on the K computer. We avoid indirect stores by not using Newton's third law. The amount of computation doubles, but the speedup is more than twofold. The FLOPS value is therefore meaningless. Hand-SIMDization: The compiler cannot generate SIMD operations efficiently. We use intrinsic SIMD instructions explicitly (emmintrin.h). The divide instruction (fdivd) cannot be specified as a SIMD instruction. We use a reciprocal approximation (frcpd), which is fast but accurate to only 8 bits. We improve the precision by additional calculations. The loop is unrolled four times to enhance software pipelining. The source code is available at: http://mdacp.sourceforge.net/
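Slide 8 does not say which "additional calculations" restore the precision of the 8-bit reciprocal. One standard choice is Newton-Raphson iteration, x ← x(2 − ax), which roughly doubles the number of correct bits per step. The Python sketch below assumes that technique and emulates the low-precision estimate in software; both function names are hypothetical.

```python
import math

def rough_reciprocal(a, bits=8):
    """Emulate a low-precision estimate of 1/a by keeping `bits` mantissa bits."""
    m, e = math.frexp(1.0 / a)  # 1/a = m * 2**e with 0.5 <= m < 1 for a > 0
    scale = 1 << bits
    return math.ldexp(math.floor(m * scale) / scale, e)

def refine(a, x, iterations):
    """Newton-Raphson for 1/a: if x = (1 - err) / a, one step gives (1 - err**2) / a."""
    for _ in range(iterations):
        x = x * (2.0 - a * x)
    return x

a = 3.0
x0 = rough_reciprocal(a)  # about 8 correct bits
x3 = refine(a, x0, 3)     # 8 -> 16 -> 32 -> 64 bits, limited by double precision
```

Each refinement step uses only multiplies and a subtract, which is why it maps well onto SIMD units that lack a vector divide.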
  • 9. Benchmark (1/3) - Conditions -. Conditions: Truncated Lennard-Jones potential with cutoff length 2.5 (in LJ units). The initial condition is an FCC lattice with density 0.5. Integration scheme: 2nd-order symplectic with dt = 0.001. 0.5 million particles per CPU core (4 million particles per node). After 150 steps, measure the time required for 1000 steps. Parallelization: Flat MPI: 8 processes/node, up to 32768 nodes (4.2 PF). Hybrid: 8 threads/node, up to 82944 nodes (10.6 PF). Computations assigned to each core are perfectly identical for both schemes. Figure: 1 CPU core: 0.5 million particles; 1 node: 4 million particles; 8 nodes: 32 million particles.
  • 10. Benchmark (2/3) - Elapsed Time -. [Figure: elapsed time [sec] vs. number of nodes (1 to 82944) for Flat-MPI and Hybrid.] Flat-MPI: 131 billion particles, 92.0% efficiency. Hybrid: 332 billion particles, 72.3% efficiency. Nice scaling for the flat MPI (92% efficiency compared with a single node). Performance of the largest run: 1.76 PFLOPS (16.6%).
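The figures quoted on slides 9 and 10 can be cross-checked with a few lines of Python. The function names are hypothetical. Reading "efficiency" as single-node time divided by N-node time, and reading 16.6% as a fraction of the 10.6 PF quoted for 82944 nodes, are both assumptions about how the slide's numbers were obtained.

```python
PARTICLES_PER_CORE = 500_000  # from the benchmark conditions
CORES_PER_NODE = 8            # 8 processes (or threads) per node

def total_particles(nodes):
    """Weak scaling: the per-core load is fixed, so the total grows with the node count."""
    return PARTICLES_PER_CORE * CORES_PER_NODE * nodes

def weak_scaling_efficiency(t_single_node, t_n_nodes):
    """Assumed definition: elapsed time on 1 node divided by elapsed time on N nodes."""
    return t_single_node / t_n_nodes

def peak_fraction(achieved_pflops, peak_pflops):
    return achieved_pflops / peak_pflops

flat_mpi_particles = total_particles(32768)  # 131,072,000,000 (about 131 billion)
hybrid_particles = total_particles(82944)    # 331,776,000,000 (about 332 billion)
largest_run_fraction = peak_fraction(1.76, 10.6)  # about 0.166
```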
  • 11. Benchmark (3/3) - Memory Usage -. [Figure: memory usage [GiB] vs. number of nodes (1 to 82944) for Flat-MPI and Hybrid.] Flat-MPI: 8.52 GiB, 39% increase. Hybrid: 5.66 GiB, 2.8% increase. Memory usage is almost flat for hybrid; Flat-MPI needs an extra 1 GB/node. We performed production runs using 4096 nodes with flat MPI.
  • 12. Time Evolution of Multi-bubble Nuclei. Equilibrium liquid → adiabatic expansion (2.5% on a side) → multi-bubble nuclei → Ostwald ripening → convergence to a single bubble. Larger bubbles become larger; smaller bubbles become smaller.
  • 13. Details of Simulation. [Phase diagram: density vs. temperature, with regions labeled G, Coexistence, and Liquid.] Condition: Initial condition: pure liquid. Truncation length: 3.0. Boundary condition: periodic. Time step: 0.005. Integration: 2nd-order symplectic (NVE), 2nd-order RESPA (NVT). Computational Scale: Resource: 4096 nodes of the K computer (0.5 PF). Flat-MPI: 32768 processes. Parallelization: simple domain decomposition. Particles: 400 to 700 million particles. System size: L = 960 (in LJ units). Thermalization: 10,000 steps + observation: 1,000,000 steps. 24 hours/run; 1 run ≈ 0.1 million node-hours; 10 samples = 1 million node-hours. [Movie]
  • 14. Multi-bubble Nuclei and Ostwald-like Ripening. Snapshots (23 million particles and 1.5 billion particles): initial stage: multi nuclei; intermediate stage: Ostwald-like ripening; final stage: one bubble survives. Bubbles appear as the result of particle interactions. Ostwald-like ripening is observed as the result of bubble interactions. → Direct simulations of multi-scale and multi-physics phenomena.
  • 15. Identification of Bubbles. Definition of Bubble: Divide a system into small cells. A cell with density smaller than some threshold is defined to be gas phase (a type of binarization). Identify bubbles on the basis of the site-percolation criterion (clustering). [Figure labels: Process A, Process B, Liquid, Gas, Bubble.] Data Format: The number of particles in a cell is stored as an unsigned char (1 byte), so we do not have to take care of byte order. A system is divided into 32.7 million cells → 32.7 MB per snapshot. 1000 frames/run → 32.7 GB/run. Clustering is performed in post-processing.
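A compact Python sketch of the bubble identification on slide 15: binarize the per-cell particle counts, then group neighboring gas cells into clusters. The function name, the 6-neighbor connectivity, the periodic wrap-around, and the threshold are assumptions. The cell edge of 3 is also an assumption; it is chosen because it reproduces the quoted 32.7 million cells for L = 960.

```python
from collections import deque

# Assumed cell edge of 3 (in LJ units): (960 / 3)**3 = 32,768,000 cells,
# i.e. about 32.7 MB per snapshot at one byte per cell.
n_cells = (960 // 3) ** 3

def find_bubbles(counts, n, threshold):
    """counts: bytes-like of length n**3 holding particles per cell (one unsigned byte each).
    Returns the sizes, in cells, of connected gas clusters, largest first."""
    gas = [c < threshold for c in counts]  # binarization
    seen = [False] * len(gas)
    steps = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))
    sizes = []
    for start in range(len(gas)):
        if not gas[start] or seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        size = 0
        while queue:  # breadth-first search over neighboring gas cells
            cur = queue.popleft()
            size += 1
            x, rem = divmod(cur, n * n)
            y, z = divmod(rem, n)
            for dx, dy, dz in steps:
                nxt = ((x + dx) % n) * n * n + ((y + dy) % n) * n + (z + dz) % n
                if gas[nxt] and not seen[nxt]:
                    seen[nxt] = True
                    queue.append(nxt)
        sizes.append(size)
    return sorted(sizes, reverse=True)
```

Because each cell is a single byte, a snapshot written on one machine can be read on another without any byte-order conversion, which is the point made on the slide.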
  • 16. Bubble-size Distribution. Bubble-size distribution at the 10,000th step after the expansion (1.5 billion particles vs. 23 million particles). The accuracy of the smaller run is insufficient for further analysis. → We need 1 billion particles to study the population dynamics of bubbles, which is impossible without peta-scale computers.
  • 17. Lifshitz-Slyozov-Wagner (LSW) Theory. Definitions: $f(v,t)$ is the number of bubbles that have volume $v$ at time $t$. Governing Equation: $\partial f/\partial t = -\partial(\dot{v} f)/\partial v$, where $\dot{v}(v,t)$ is the kinetic term (microscopic dynamics). Assumptions: mean-field approximation; mechanical equilibrium; conservation of the total volume of the gas phase; self-similarity of the distribution function, $f(v,t) \sim t^{y}\tilde{f}(v t^{-x})$. Note on slide: shared with the classical nucleation theory. Predictions: power-law behaviors. The total number of bubbles: $n(t) \sim t^{-x}$. The average volume of bubbles: $\bar{v}(t) \sim t^{x}$. The scaling index $x$ depends on microscopic dynamics.
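The two power laws on slide 17 share one exponent because of the volume-conservation assumption. The short derivation below is a sketch, assuming the self-similar form $f(v,t) \sim t^{y}\tilde{f}(v t^{-x})$; the signs of the exponents are inferred from the requirement that the number of bubbles decreases while the average volume grows.

```latex
\begin{align}
V &= \int_0^\infty v\, f(v,t)\, dv
   = t^{y} \int_0^\infty v\, \tilde{f}(v t^{-x})\, dv
   = t^{y+2x} \int_0^\infty u\, \tilde{f}(u)\, du
   \qquad (u = v t^{-x}), \\
\frac{dV}{dt} &= 0 \;\Longrightarrow\; y = -2x, \\
n(t) &= \int_0^\infty f(v,t)\, dv
   = t^{y+x} \int_0^\infty \tilde{f}(u)\, du \sim t^{-x}, \\
\bar{v}(t) &= \frac{V}{n(t)} \sim t^{x}.
\end{align}
```

So a single measured exponent $x$ fixes both the decay of the bubble count and the growth of the mean bubble volume.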
  • 18. Simulation Results (1/2). [Figures: number of bubbles and average volume of bubbles vs. time, each with a marked scaling region; slope: x = 1.5.] Power-law behavior appears in the late stage of time evolution. Both quantities share the value of the exponent, as predicted by the theory. The LSW theory works well, so its assumptions are valid. We cannot obtain the scaling region using small systems. Huge-scale simulation is necessary.
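A slope such as x = 1.5 is typically read off a log-log plot. The Python sketch below shows one way to estimate it, by least-squares fitting of log(value) against log(time) inside the scaling region. The function name is hypothetical, and the data in the usage lines are synthetic, generated from an exact power law; they are not results from the simulation.

```python
import math

def fit_power_law_exponent(times, values):
    """Least-squares slope of log(values) versus log(times)."""
    lx = [math.log(t) for t in times]
    ly = [math.log(v) for v in values]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    sxx = sum((a - mean_x) ** 2 for a in lx)
    return sxy / sxx

# Synthetic check only: n(t) = 5 t^(-1.5) and v(t) = 2 t^(1.5).
times = [10.0 * 2 ** k for k in range(8)]
slope_n = fit_power_law_exponent(times, [5.0 * t ** -1.5 for t in times])
slope_v = fit_power_law_exponent(times, [2.0 * t ** 1.5 for t in times])
```

With real data, only points inside the scaling region should be passed in, since the early stage does not follow the power law.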
  • 19. Simulation Results (2/2). The scaling exponent changes from 1.5 to 1. Figure labels: High T, Low T; Reaction-Limit: x = 3/2; Diffusion-Limit: x = 1. Microscopic dynamics can be investigated from macroscopic behavior.
  • 20. Summary and Discussion. Summary: Simulations involving billions of particles allow us to investigate multi-scale and multi-physics problems directly. We did physics using a peta-scale computer. Toward Exa-scale Computing: Please give us a decent programming language! MPI (library), OpenMP (directive), SIMD (intrinsic functions). References: MDACP (Molecular Dynamics code for Avogadro Challenge Project); source code is available online: http://mdacp.sourceforge.net/ Acknowledgements: Computational Resources: "Large-scale HPC Challenge" Project, ITC, Univ. of Tokyo; Institute of Solid State Physics, Univ. of Tokyo; HPCI project (hp130047) on the K computer, AICS, RIKEN. Financial Support: CMSI; KAUST GRP (KUK-I1-005-04); KAKENHI (23740287).