Building an ALCF Data Service:
Interactive, scalable, reproducible data science
Ian Foster
Rick Wagner, Nick Saint, Eric Blau
Kyle Chard, Yadu Nand Babuji
Logan Ward, Ben Blaiszik
Mike Papka
with
André Schleife and Cheng-Wei Lee
Aiichiro Nakano (USC - ALCF INCITE 2017), Maria Chan (ANL - ALCF INCITE 2016), André Schleife (UIUC - ALCF INCITE 2016)
Overview
• Leadership simulations produce data of great scientific value
• We demonstrate how to:
 Make data more accessible and useful by associating them with rich data lifecycle and analysis services
 Leverage advanced data science and machine learning (ML) methods to reduce simulation costs and increase data quality and value
Find ♦ Analyze ♦ Publish
Interactive ♦ Scalable ♦ Reproducible
Collect ♦ Process ♦ Represent ♦ Learn
Interactive, scalable, reproducible data science
PUBLISH
 Automate capture, publication, and indexing of results from ALCF projects
 Enable creation of workspaces and reusable data objects to accelerate data analysis and promote replicability
ANALYZE
 Combine ML approaches with ALCF HPC resources to extract more information from existing datasets and to guide future simulation campaigns
FIND
 Unify search, discovery, and consumption of datasets, workspaces, and analysis results
Interactive, scalable, reproducible data science
[Architecture diagram, shown with and without the Parsl logo: ALCF Data Service capabilities (data movement, data discovery, data publication, data access, data interactivity, automation, machine learning, HPC) layered over ALCF and other services]
Materials science as an initial testbed
• Advanced materials are critical to economic security and competitiveness, national security, and human welfare (MGI, a 2011 interagency effort of DoD, DOE, NASA, NIST, and NSF)
• Finding and understanding new materials is complex, expensive, and time consuming: often taking > 20 years from research to application
• Materials scientists are key users of leadership-class computing (20-30% at ALCF)
• Community data tools and services to advance materials science are emerging
Nicholas Brawand, University of Chicago; Larry Curtiss, Argonne National Laboratory
Modeling material stopping power
Stopping Power: a “drag” force experienced by high-speed protons, electrons, or positrons in a material
Areas of Application
• Nuclear reactor safety
• Magnetic confinement / inertial confinement for nuclear fusion
• Solar cell surface adsorption
• Medicine (e.g., proton therapy cancer treatment)
• Critical to understanding material radiation damage
André Schleife and Cheng-Wei Lee (UIUC)
2016 ALCF INCITE Project “Electronic Response to Particle Radiation in Condensed Matter”
André Schleife, Yosuke Kanai, and Alfredo A. Correa, 2015 -- 10.1103/PhysRevB.91.014306
Computing stopping power with TD-DFT
Stopping power (SP) can be accurately calculated by time-dependent density functional theory (TD-DFT)
 Excellent agreement with experiment
 Can vary orientation, projectile, material
 Highly parallelizable
But we need many results
 Direction dependence
 Effect of defects
 Many more materials
TD-DFT alone may not be sufficient
[Figure: stopping power vs. projectile velocity, TD-DFT compared with experiment; a single TD-DFT point costs roughly 16k CPU-hours]
André Schleife, Yosuke Kanai, and Alfredo A. Correa, 2015 -- 10.1103/PhysRevB.91.014306
Potential Solution:
Machine Learning!
What is machine learning?
What? Algorithms that generate computer programs
Why? Create software too complex to write manually
General Task: Given inputs, predict output 𝑦 = 𝑓(𝑥) (a minimal illustration follows below)
Common Algorithms: linear regression (𝒇(𝒙) = 𝒎𝒙 + 𝒃), decision trees (e.g., 𝒙 < 𝟒 ⇒ 𝒚 = 𝟐, else 𝒚 = 𝟔), neural networks (figure source: nature.com)
Advantages:
 Fast: 10^4–10^7 evaluations/CPU/sec
 Adaptable: limited need to know underlying physics
 Self-correcting: improves with more data
 Parallelizable: can use large-scale resources
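As a concrete illustration of the "given inputs, predict output 𝑦 = 𝑓(𝑥)" task above, here is a minimal sketch, not from the original deck, that fits two of the listed algorithm families with scikit-learn on synthetic data:

# Minimal illustration of "given inputs, predict output y = f(x)".
# The data are synthetic; nothing here comes from the stopping-power study.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=(200, 1))                  # inputs
y = 2.0 * x.ravel() + 1.0 + rng.normal(0, 0.5, 200)    # noisy linear target

linear = LinearRegression().fit(x, y)                  # f(x) = m*x + b
net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000).fit(x, y)

print("recovered slope:", linear.coef_[0])             # close to 2.0
print("neural net prediction at x = 5:", net.predict([[5.0]])[0])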
Computing stopping power with TD-DFT+ML
We propose to use ML to create surrogate models for TD-DFT
How do we replace TD-DFT? First, consider what it does
Inputs:
 Atomic-scale structure (atom types, positions)
 Electronic structure of system
Outputs:
 Energy of entire system
 Forces on each atom
 Time-derivatives of electronic structure
(the forces and time-derivatives allow prediction of the future state)
If successful, we can use the ML model – not TD-DFT – to compute SP
Materials science and machine learning
Collect ♦ Process ♦ Represent ♦ Learn
[Toy pipeline example: two compounds with formation enthalpies ΔHf = −1.0 and −0.5 become rows (3, 4) and (3, 5) of a feature matrix X with targets y, from which a model ΔHf = f(Z_A, Z_B) is learned]
Step 1: Data collection
Collect ♦ Process ♦ Represent ♦ Learn
[Pipeline diagram, highlighting the Collect step; annotations: Cooley and Parsl for compute and orchestration, representation features (AGNI fingerprints, ion-ion force, local charge density), and learning algorithms (linear models, ANNs, RNNs)]
Stopping Power prediction: Our data
We have simulation results for H in face-centered cubic Al on a random trajectory.
For each of multiple velocities, we have:
1) A simulated SP (one red point on the SP-vs-velocity plot)
2) The trajectory for that point
3) A ground-state calculation for that trajectory’s starting point
(About 6 GB in total, mostly Qbox output files)
André Schleife, Yosuke Kanai, and Alfredo A. Correa, 2015 -- 10.1103/PhysRevB.91.014306
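To make the shape of this dataset concrete, here is a small, hypothetical sketch of how one velocity's worth of records could be described in code; the field names and file layout are illustrative, not the actual Qbox output structure.

# Hypothetical description of one velocity's records in the dataset;
# paths and field names are illustrative only.
from dataclasses import dataclass
from pathlib import Path

@dataclass
class StoppingPowerRun:
    velocity: float          # projectile velocity (one red point on the plot)
    stopping_power: float    # simulated SP at that velocity
    trajectory_output: Path  # Qbox output: structure and energy vs. time
    ground_state_dir: Path   # ground-state calculation for the trajectory's starting point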
Steps 2-3: Data processing / Representation
Collect ♦ Process ♦ Represent ♦ Learn
[Same pipeline diagram, now highlighting the Process and Represent steps]
Designing a training set
Key Question: What are the inputs and outputs to our model?
Consider those for TD-DFT:
Inputs:
 Atomic-scale structure (atom types, positions)
 Electronic structure of system: requires TD-DFT to compute, so unavailable as a model input
Outputs:
 Energy of entire system: reliant on entire history (hard to predict)
 Time-derivatives of electronic structure: not needed to compute stopping power
 Forces on each atom: reduced to the force on the projectile
Input: Atomic Structure, Output: Force on Particle (a sketch of how this yields SP follows below)
Collect ♦ Process ♦ Represent ♦ Learn
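Here is a minimal sketch, not from the deck, of how that output feeds the quantity we want: per the speaker notes, stopping power is the average force acting on the projectile along its trajectory. `surrogate` stands for a trained ML model and `featurize` for the representation step described on the next slide; both are placeholders.

import numpy as np

def stopping_power(surrogate, featurize, trajectory_frames):
    """Average the predicted retarding force on the projectile over a trajectory."""
    forces = [surrogate.predict(featurize(frame).reshape(1, -1))[0]
              for frame in trajectory_frames]
    return float(np.mean(forces))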
Selecting a representation
Key Questions: What determines the force on the projectile? How do we quantify it?
Types of Features
 Ion-ion repulsion: can be computed directly
 Electronic interactions: approximated with two feature types
   Local charge density: density of electrons at the projectile position
   AGNI fingerprints*: describe the atom positions around the projectile
Another need: history dependence
 Approach: use the charge density at fixed points ahead of and behind the projectile (see the sketch below)
Collect ♦ Process ♦ Represent ♦ Learn
* Botu, et al. J. Phys. Chem. C 121, 511–522 (2017).
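A minimal, hypothetical sketch of assembling one feature vector from these ingredients; `ion_ion_force`, `charge_density`, and `agni_fingerprints` are stand-ins for project-specific routines that are not shown in the deck, and the sampling offsets are illustrative.

import numpy as np

def make_features(proj_pos, proj_vel, lattice_positions,
                  ion_ion_force, charge_density, agni_fingerprints,
                  offsets=(-2.0, -1.0, 0.0, 1.0)):
    """Feature vector: ion-ion force, charge density sampled behind/ahead of the
    projectile (history dependence), and AGNI fingerprints of the neighborhood."""
    direction = proj_vel / np.linalg.norm(proj_vel)
    densities = [charge_density(proj_pos + d * direction) for d in offsets]
    return np.concatenate([
        [ion_ion_force(proj_pos, lattice_positions)],
        densities,
        agni_fingerprints(proj_pos, lattice_positions),
    ])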
Step 4: Machine learning
Collect ♦ Process ♦ Represent ♦ Learn
[Same pipeline diagram, now highlighting the Learn step: linear models, ANNs, RNNs]
Selecting a machine learning algorithm
Key Criterion: Prediction accuracy
Beyond accuracy, the algorithm should…
 be feasible to train with >10^4 entries
 be quick to evaluate
 produce a differentiable model
Standard Procedure (sketched below):
1. Identify suitable algorithms (linear models, neural networks)
2. Evaluate performance using cross-validation
3. Validate the model against unseen data
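A minimal, hypothetical sketch of that procedure with scikit-learn; X and y here are random placeholders standing in for the featurized training set, not the project's data.

# Step 1: pick candidate algorithms; Step 2: compare by cross-validation;
# Step 3: validate the chosen model against held-out data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(0)
X = rng.random((2000, 8))                                   # placeholder feature matrix
y = X @ rng.random(8) + 0.1 * rng.standard_normal(2000)     # placeholder force target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

candidates = {
    "linear model": LinearRegression(),
    "neural network": MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000),
}
for name, model in candidates.items():
    print(name, "CV R^2:", cross_val_score(model, X_train, y_train, cv=5).mean())

best = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000).fit(X_train, y_train)
print("held-out R^2:", best.score(X_test, y_test))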
Live Demo
You won’t see the live demo here. But it was cool. We located Schleife simulation data previously published to MDF; assembled a workspace comprising the aluminum data plus four Jupyter notebooks covering data processing, ML training, and SP modeling; deployed the workspace to ALCF; and ran the notebooks to process data, train a model, and predict SP values for many directions.
Summary of analysis results
 We compared a variety of ML algorithms
 We computed SP for other trajectories
 We evaluated the data needed for training
[Figure: stopping power calculated for many trajectories]
Materials science and machine learning
Collect ♦ Process ♦ Represent ♦ Learn
[Recap of the full pipeline diagram: Cooley and Parsl; AGNI fingerprints, ion-ion force, and local charge density as representations; linear models, ANNs, and RNNs as learners]
Materials Data Facility to discover data
[Architecture diagram: distributed data storage (databases, datasets, APIs, LIMS, etc.) exposed through endpoints (EP); a data publication service (Web UI or API) to mint DOIs and associate metadata; deep indexing into a data discovery service that supports query, browse, and aggregate via Web UI, Forge, or REST API]
116 data sources ♦ 3.4M records ♦ 300 TB
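A minimal sketch of querying this index with the Forge Python client mentioned above; this assumes the mdf_forge package, and the query string and record fields shown are illustrative rather than taken from the actual dataset records.

# Hypothetical MDF query; adjust the query string to the dataset of interest.
from mdf_forge.forge import Forge

mdf = Forge()  # may prompt for a Globus login on first use
results = mdf.search("stopping power aluminum", limit=10)
for record in results:
    source = record.get("mdf", {}).get("source_name")
    titles = record.get("dc", {}).get("titles", [{}])
    print(source, titles[0].get("title"))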
Data ingest flow
1. Data are created at ALCF
2. Data are staged, published, and assigned a permanent identifier (DOI)
3. Results are indexed for easy discovery
4. Interactive analysis, modeling, and interrogation
[Diagram, built up over four slides: ALCF compute feeds data storage and the data publication service; results are indexed; queries then fetch data into a Parsl-driven analysis environment]
Data collection and staging
1. Find data through search index
2. Create BDBags for data reusability, staging, and sharing (see the sketch below)
3. Stage data and launch interactive environment on ALCF computers
4. Analyze data!
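For step 2, here is a minimal sketch of packaging a workspace directory as a BDBag, assuming the bdbag Python package; the directory name is illustrative.

# Hypothetical BDBag creation for a staged workspace directory.
from bdbag import bdbag_api

bag_dir = "al_stopping_power_workspace"   # fetched data plus analysis notebooks
bdbag_api.make_bag(bag_dir)               # convert the directory into a BagIt bag in place
bdbag_api.archive_bag(bag_dir, "zip")     # archive it for sharing or staging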
Interactive, scalable, reproducible data analysis
Data science and learning applications require:
- Interactivity
- Scalability (you can’t run this on a desktop)
- Reproducibility (publish code and documentation)
Our solution: JupyterHub + Parsl (jupyter.org, parsl-project.org)
 Interactive computing environment
 Notebooks for publication
 Can run on dedicated hardware
Parsl:
• Python-based parallel scripting library
• Tasks exposed as functions (Python or bash)
• Python code used to glue functions together
• Leverages Globus for auth and data movement
@App('python', dfk)   # legacy Parsl decorator: run as a parallel Python app via the DataFlowKernel dfk
def compute_features(chunk):
    # apply each featurizer to the 'atoms' column of this chunk of the dataset
    for f in featurizers:
        chunk = f.featurize_dataframe(chunk, 'atoms')
    return chunk

# dfk, featurizers, data, and n_chunks are assumed to be defined earlier;
# each call below returns a future for one chunk, featurized in parallel
feature_futures = [compute_features(chunk)
                   for chunk in np.array_split(data, n_chunks)]
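The calls above return Parsl futures rather than DataFrames. A minimal sketch of gathering the results, assuming the legacy @App API shown on the slide (current Parsl releases use parsl.load(...) with @python_app instead):

import pandas as pd

# .result() blocks until each parallel task finishes, so this line also acts as the barrier
features = pd.concat(future.result() for future in feature_futures)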
Interactive, scalable, reproducible data science
[Summary diagram: existing TD-DFT calculation data, fed through the ALCF Data Facility and machine learning, yields a new capability: direction-dependent stopping power]
Results so far
• Indexed data from an ALCF INCITE project
• Interactively built surrogate model using ALCF Data Service capabilities
• Extended results to model SP direction dependence in aluminum
Next Steps
1. Model multiple velocities
2. Model more materials
3. Model direction dependence
4. Transfer learning
Thanks to our sponsors!
U.S. Department of Energy ♦ ALCF DF ♦ Parsl ♦ Globus ♦ IMaD

Editor's Notes

  • #8 An index for heterogeneous distributed data, coupled with APIs to facilitate data access, discovery, and addition, layered with capabilities to support simplified deep learning against these data. Goals: simplify interfaces for data publication regardless of data size, type, and location; provide automation capabilities to capture data from pipelines; deploy APIs to foster community development and integration; encourage data re-use; incentivize data sharing; and support Open Science in materials research.
  • #11 On-channel Proton in Gold lattice
  • #12 TD-DFT offers a great way for computing the stopping power of materials. As shown in the figure on the right, it can accurately reproduce experimentally measured stopping powers. Given that TD-DFT is parameter-free, we can easily model the effect of changing the direction of the projectile, the type of projectile, and the host material. Additionally, TD-DFT relies on advanced parallel codes that enable the use of leadership-class computing facilities, which is good because it is resource intensive. Just one of the points on this plot required ~16k computing hours on Sierra at LLNL. That single point is only the stopping power for a single direction in the crystal, for a single projectile type, at a single velocity, for a single host material. In the future, we want to be able to easily access the stopping power for many different types of materials in all possible directions, and even be able to ascertain the effects of defects on the stopping power. TD-DFT, while quite powerful, might not be sufficient on its own to do this. To compensate, we propose to use machine learning to extend the capability of TD-DFT.
  • #14  Self-correcting: If the model’s wrong, add more data and it will automatically correct itself [theorists do this on a slower timescale]
  • #15 Our first question is: How do we approach creating a surrogate for TD-DFT? The first step in that process is recognizing what the inputs and outputs to TD-DFT are. Its inputs are the atomic-scale structure (positions and types of atoms) and the current electronic structure. In essence, what is the current state of the material at the electronic level? Its outputs are the energy of the system and quantities that allow you to predict its future state: the forces acting on each atom, and the rate of change of the electronic structure (i.e., the wavefunctions for each atom). If we can successfully emulate the function that maps these inputs and outputs, we can use our ML surrogate to compute stopping power rather than use TD-DFT directly. OK, this outlines what we need to replace in simple language. Now, our next step is to build this model.
  • #16 We break down building a machine learning model into 4 distinct steps. First, we need to collect a resource of raw data for training the model. Next, we need to process that raw data to define a training set: what are the inputs (in broad terms) and the desired outputs? Then, we translate our materials data into a form compatible with machine learning, i.e., a list of finite-length vectors that each have the same length; in other words, we select a representation. Lastly, we employ machine learning to find a function that maps the representation to the outputs: the classic machine learning problem. At this point, we will break into a live demo to show you how this process applies to modeling TD-DFT and how the ALCF Data Facility makes this work easier.
  • #17 As I just mentioned, the first step in our process is gathering a set of training data.
  • #18 For our application, the data is the data supporting a previous publication of André Schleife; specifically, we have the data backing this figure. What this figure describes is the stopping power as a function of velocity for a proton traveling through aluminum along a random trajectory. For each red point in this figure, we have the TD-DFT data that was used to calculate it. What this actually means is we have about 6 GB of Qbox output files that contain the structure and energy of the system as a function of time during the simulation. We also have the starting point for these simulations: a ground-state DFT calculation of the electronic structure of Al. [At this point, jump to showing off the ALCF portal and creating the environment. Then, go to the notebook and show what we have.]
  • #23 Now, back to the science. Our second step in creating a model is processing the data from its raw form to create a training set with clear inputs and outputs. To decide what those should be for our model, we go back to the inputs and outputs. As inputs, TD-DFT takes the atomic-scale structure and the electronic structure. When building a model, we cannot use the time-dependent electronic structure because this requires TD-DFT to compute. On the other hand, we can know the atomic-scale structure. For outputs: the stopping power of a material is the average force acting on the projectile over time. We don't need the electronic structure to compute the stopping power, so let's eliminate that. We can get the average force on the particle from the energy, but the energy at any timestep is dependent on all of the previous timesteps, which makes it difficult to predict. So, we should just predict the force acting on the particle.
  • #24 Our next step is to determine what 'atomic structure' actually means in terms of inputs to the model. We choose to use two types of inputs. First, we know part of the force deals with the 'ion-ion' repulsion between the projectile and the surrounding nuclei; this we can just compute directly. Secondly, we know the particle interacts with the electrons in the material. To approximate this effect, we use two kinds of features: (1) the electron density (taken from the starting condition for our simulation), and (2) the AGNI fingerprints, which capture the local arrangement of atoms and provide the basis for an ML model to capture electronic effects. Another thing we need for these features is history dependence. The force acting on a particle is not just dependent on its current environment, but also on what just happened to it. As the particle travels at a constant velocity, we know its history and represent that history by computing the charge density at positions several timesteps in the past and one in the future. Now, let's jump back to the notebooks to show what these processing and representation calculations look like.
  • #25 Our last step of the model building process is training a machine learning algorithm.
  • #26 The main question when selecting a machine learning algorithm is: which algorithm produces a model with the highest prediction accuracy? However, there are also other important factors to consider with this application. The model should [explain the list]. To identify the best model we follow a simple and very common procedure: (1) we first identify suitable algorithms (for our case, linear models and neural networks are our top choices); (2) then, we test them using cross-validation; (3) finally, we validate the model using data outside of our original training set. We'll show you this process in our notebooks.