Graphs, Environments,
and Machine Learning
for Materials Science
Shyue Ping Ong, Chi Chen, Xiangguo Li, Yunxing Zuo, Zhi Deng, Weike Ye, Zhenbin Wang
Aug 1 2019
NIST Workshop
High-throughput computation is not enough: A statistical history of the Materials Project

[Chart annotations: "Reasonable ML"; "Deep learning"]
An (AA')0.5(BB')0.5O3 perovskite in a 2 × 2 × 2 supercell, with 10 candidate A and 10 candidate B species:
(10C2 × 8C4)² ≈ 10⁷ configurations
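A quick sanity check of this count; a minimal sketch, assuming the 8 A sites and 8 B sites of the 2 × 2 × 2 supercell are each split 4 + 4 between two species:

```python
from math import comb

# (AA')0.5(BB')0.5 O3 in a 2x2x2 supercell: 8 A sites and 8 B sites.
# Pick 2 of 10 species for each sublattice, then place them 4 + 4
# on the 8 sites of that sublattice.
per_sublattice = comb(10, 2) * comb(8, 4)   # 45 * 70 = 3150
total = per_sublattice ** 2                 # A and B sublattices
print(f"{total:,}")                         # 9,922,500 ≈ 10^7
```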
[Figure (Jiang et al.): atomic-resolution STEM ABF and HAADF images of a representative high-entropy perovskite oxide, Sr(Zr0.2Sn0.2Ti0.2Hf0.2Mn0.2)O3, at low and high magnifications, showing nanoscale compositional homogeneity and atomic structure along the [001] zone axis.]

Jiang et al. A New Class of High-Entropy Perovskite Oxides. Scripta Materialia 2018, 142, 116–120.
Materials design is combinatorial.

Combinatorial generalization, i.e., making infinite use of finite means, is how humans learn….
Graph Networks as a Universal Machine Learning Framework for
Molecules and Crystals
Chi Chen, Weike Ye, Yunxing Zuo, Chen Zheng, and Shyue Ping Ong*
Department of NanoEngineering, University of California San Diego, 9500 Gilman Dr, Mail Code 0448, La Jolla, California
92093-0448, United States
ABSTRACT: Graph networks are a new machine learning (ML)
paradigm that supports both relational reasoning and combinatorial
generalization. Here, we develop universal MatErials Graph Network
(MEGNet) models for accurate property prediction in both
molecules and crystals. We demonstrate that the MEGNet models
outperform prior ML models such as the SchNet in 11 out of 13
properties of the QM9 molecule data set. Similarly, we show that
MEGNet models trained on ∼60 000 crystals in the Materials Project
substantially outperform prior ML models in the prediction of the
formation energies, band gaps, and elastic moduli of crystals,
achieving better than density functional theory accuracy over a
much larger data set. We present two new strategies to address data
limitations common in materials science and chemistry. First, we
demonstrate a physically intuitive approach to unify four separate
molecular MEGNet models for the internal energy at 0 K and room temperature, enthalpy, and Gibbs free energy into a single
free energy MEGNet model by incorporating the temperature, pressure, and entropy as global state inputs. Second, we show
that the learned element embeddings in MEGNet models encode periodic chemical trends and can be transfer-learned from a
property model trained on a larger data set (formation energies) to improve property models with smaller amounts of data
(band gaps and elastic moduli).
■ INTRODUCTION

Machine learning (ML)¹,² has emerged as a powerful new tool in materials science,³⁻¹⁴ driven in part by the advent of large materials data sets from high-throughput electronic structure calculations¹⁵⁻¹⁸ and/or combinatorial experiments.¹⁹,²⁰ Among its many applications, the development of fast, surrogate ML models for property prediction has arguably received the most interest for its potential in accelerating materials design²¹,²² as well as accessing larger length/time scales at near-quantum accuracy.¹¹,²³⁻²⁸
The key input to any ML model is a description of the material, which must satisfy the necessary rotational, translational […] neural network model. Gilmer et al.³⁷ later proposed the message passing neural network (MPNN) framework that includes the existing graph models with differences only in their update functions.

Unlike molecules, descriptions of crystals must account for lattice periodicity and additional space group symmetries. In the crystal graph convolutional neural networks (CGCNNs) proposed by Xie and Grossman,⁹ each crystal is represented by a crystal graph, and invariance with respect to permutation of atomic indices and unit cell choice are achieved through convolution and pooling layers. They demonstrated excellent prediction performance on a broad array of properties […]
Chen, C.; Ye, W.; Zuo, Y.; Zheng, C.; Ong, S. P. Graph Networks as a Universal
Machine Learning Framework for Molecules and Crystals. Chem. Mater. 2019, 31 (9),
3564–3572. https://doi.org/10.1021/acs.chemmater.9b01294.
Shall I compare thee
to a summer’s day?
Thou art more lovely
and more temperate.
- William Shakespeare
But learning also takes place in the context of a hierarchy of structured knowledge…
Thermodynamics
• Extensive (additive) versus intensive properties

"Locality"
• Interactions between nearby atoms are stronger than with atoms far away (~ 1/rⁿ)

Symmetry
• Translation
• Rotation
• Reflection
• Permutation of identical atoms
Representations for a collection of atoms
Graphs
Local
env.
Graphs as a natural representation for materials, i.e., molecules and crystals

[Diagram: a graph with global state u, bond attribute e_k1, and sender/receiver atom attributes v_sk, v_rk. Atom features: rows Z_r, Z_s of an element feature table. Global state: (T, p, S, …). Bond attributes: interatomic distances expanded in a Gaussian basis, $e^{-(r - r_0)^2/\sigma^2}$.]
Chen, C.; Ye, W.; Zuo, Y.; Zheng, C.; Ong, S. P. Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals. Chem.
Mater. 2019, 31 (9), 3564–3572. doi: 10.1021/acs.chemmater.9b01294.
Information flow between elements in a graph network

[Diagram: global state u; bonds e_k1, e_k2, e_k3; sender/receiver atoms v_sk, v_rk; global state (T, p, S, …)]

Bond update: $\mathbf{e}'_{k1} = \phi_e(\mathbf{v}_{rk} \oplus \mathbf{v}_{sk} \oplus \mathbf{e}_{k1} \oplus \mathbf{u})$
Atom update: $\mathbf{v}'_{rk} = \phi_v(\mathbf{v}_{rk} \oplus \mathbf{e}_{kr} \oplus \mathbf{u})$
State update: $\mathbf{u}' = \phi_u(\mathbf{V} \oplus \mathbf{E} \oplus \mathbf{u})$

The update functions φ are approximated using neural networks (universal approximation theorem; Cybenko, Math. Control Signals Systems 1989, 2 (4), 303–314).

Chen, C.; Ye, W.; Zuo, Y.; Zheng, C.; Ong, S. P. Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals. Chem. Mater. 2019, 31 (9), 3564–3572. doi: 10.1021/acs.chemmater.9b01294.
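A minimal numpy sketch of one such graph-network block, with two-layer perceptrons standing in for the φ's and mean pooling as the aggregation. This illustrates the update equations above, not the actual MEGNet implementation; weight shapes in `params` are assumed to match the concatenated inputs.

```python
import numpy as np

def mlp(x, W1, b1, W2, b2):
    # Two-layer perceptron standing in for a phi update function.
    return np.maximum(x @ W1 + b1, 0.0) @ W2 + b2

def graph_block(V, E, u, senders, receivers, params):
    """One bond -> atom -> state update pass of a MEGNet-style block.

    V: (n_atoms, dv) atom features; E: (n_bonds, de) bond features;
    u: (du,) global state; senders/receivers: (n_bonds,) int atom indices;
    params: dict of (W1, b1, W2, b2) tuples for "phi_e", "phi_v", "phi_u".
    """
    n_atoms, n_bonds = len(V), len(E)

    # Bond update: e'_k = phi_e(v_rk (+) v_sk (+) e_k (+) u)
    U_e = np.tile(u, (n_bonds, 1))
    E_new = mlp(np.concatenate([V[receivers], V[senders], E, U_e], axis=1),
                *params["phi_e"])

    # Atom update: v'_i = phi_v(v_i (+) mean of updated bonds at i (+) u)
    E_bar = np.zeros((n_atoms, E_new.shape[1]))
    counts = np.zeros(n_atoms)
    np.add.at(E_bar, receivers, E_new)
    np.add.at(counts, receivers, 1.0)
    E_bar /= np.maximum(counts, 1.0)[:, None]
    U_v = np.tile(u, (n_atoms, 1))
    V_new = mlp(np.concatenate([V, E_bar, U_v], axis=1), *params["phi_v"])

    # State update: u' = phi_u(mean(V') (+) mean(E') (+) u)
    u_new = mlp(np.concatenate([V_new.mean(axis=0), E_new.mean(axis=0), u]),
                *params["phi_u"])
    return V_new, E_new, u_new
```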
MatErials Graph Networks (MEGNet)

Z mapping to vector (remember this!)

Modular blocks can be stacked to generate models of arbitrary complexity and "locality" of interactions (see the stacking sketch below).

Implementation is open source at https://github.com/materialsvirtuallab/megnet.
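Continuing the hypothetical numpy sketch above: stacking is just repeated application of the block, and each pass propagates information one bond further through the graph.

```python
# Stack n blocks: each pass extends the "locality" by one hop.
for block_params in model_params:   # hypothetical list, one entry per block
    V, E, u = graph_block(V, E, u, senders, receivers, block_params)
# Readout: pool V, E, u into a property prediction (e.g., a final MLP).
```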
Performance on 130,462 QM9 molecules

80%-10%-10% train-validation-test split. MEGNet-Simple uses only Z as the atomic feature, i.e., feature selection helps the model learn but is not critical!

Property           MEGNet¹   MEGNet-Simple¹   SchNet²   "Chemical accuracy"
U0 (meV)           9         12               14        43
G (meV)            10        12               14        43
εHOMO (eV)         0.038     0.043            0.041     0.043
εLUMO (eV)         0.031     0.044            0.034     0.043
Cv (cal/(mol·K))   0.030     0.029            0.033     0.05

¹ Chen et al. Chem. Mater. 2019, 31 (9), 3564–3572. doi: 10.1021/acs.chemmater.9b01294.
² Schütt et al. J. Chem. Phys. 2018, 148, 241722.

State-of-the-art performance, surpassing chemical accuracy in 11 of 13 properties!
MEGNet Unified Free Energy Model

Training data: U, H, G at 298 K; U at 0 K (U0). Single model: Φ = f(E, T, P, S).

Failures on H and G arise from the lack of pressure and entropy variation in the training data.
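The unification amounts to swapping the global state u per target; a sketch of one plausible encoding (T in K, with the pressure and entropy terms as 0/1 flags; the exact scaling in the paper may differ):

```python
import numpy as np

# Global-state vectors selecting which thermodynamic potential to predict.
STATES = {
    "U0": np.array([0.0,   0.0, 0.0]),   # internal energy at 0 K
    "U":  np.array([298.0, 0.0, 0.0]),   # internal energy at 298 K
    "H":  np.array([298.0, 1.0, 0.0]),   # enthalpy H = U + PV
    "G":  np.array([298.0, 1.0, 1.0]),   # Gibbs free energy G = H - TS
}

def predict_free_energy(model, graph, target="G"):
    # One model serves all four targets by swapping the global state u.
    return model({**graph, "u": STATES[target]})   # model is hypothetical
```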
Performance on Materials Project Crystals

Property                         MEGNet (N train)   SchNet¹   CGCNN² (N train)
Formation energy Ef (meV/atom)   28 (60,000)        35        39 (28,046)
Band gap Eg (eV)                 0.330 (36,720)     -         0.388 (16,485)
log₁₀ KVRH (GPa)                 0.050 (4,664)      -         0.054 (2,041)
log₁₀ GVRH (GPa)                 0.079 (4,664)      -         0.087 (2,041)
Metal classifier                 78.9% (55,391)     -         80% (28,046)
Non-metal classifier             90.6% (55,391)     -         95% (28,046)

Remaining errors are attributed to "noisy" labels and data sets that are too small.

¹ Schütt et al. J. Chem. Phys. 2018, 148, 241722.
² Xie and Grossman, Phys. Rev. Lett. 2018, 120, 145301.
Transfer learning for improved convergence and speed

Ef model (60,000 data points) → Eg model (36,000 data points):
✓ MAE decreases from 0.38 eV to 0.32 eV
✓ 2× faster convergence
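In Keras terms, the transfer is a weight copy of the element-embedding layer; a sketch in which the models and the layer name are hypothetical, while get_layer/set_weights/get_weights/trainable/compile are standard Keras calls:

```python
def transfer_embedding(src_model, dst_model, layer="atom_embedding", freeze=True):
    # src_model: trained Ef Keras model; dst_model: fresh Eg Keras model.
    # Both are assumed to share an embedding layer with this (hypothetical) name.
    dst = dst_model.get_layer(layer)
    dst.set_weights(src_model.get_layer(layer).get_weights())
    dst.trainable = not freeze          # freeze to transfer, or fine-tune
    dst_model.compile(optimizer="adam", loss="mae")  # recompile after the change
```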
CGCNN vs MEGNet

• Message passing: CGCNN, node to node only; MEGNet, node, edge, and global.
• Input atomic features: CGCNN uses group number, period number, electronegativity, covalent radius, valence electrons, first ionization energy, electron affinity, block, and atomic volume; MEGNet uses only the atomic number.
• Global state: CGCNN, no; MEGNet, yes.
• Transferable components: CGCNN, possible but not demonstrated; MEGNet, yes.
• Composability: CGCNN, new models require network optimization; MEGNet, new models can be formed by stacking modular blocks.
Extracting chemistry from machine-learned models

[Learned element embeddings: Pearson correlation matrix sorted by Mendeleev number; t-SNE projection]
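Both analyses are straightforward once the (n_elements × d) embedding matrix is pulled from a trained model; a sketch with numpy and scikit-learn, where W and the Mendeleev-number ordering are assumed given:

```python
import numpy as np
from sklearn.manifold import TSNE

# W: (n_elements, d) learned element embeddings; mendeleev_order: permutation
# of row indices sorted by Mendeleev number (both assumed available).
W_sorted = W[mendeleev_order]

# Pearson correlation between element embeddings (rows as variables).
corr = np.corrcoef(W_sorted)                 # (n_elements, n_elements)

# 2-D t-SNE projection of the embeddings for visualization.
xy = TSNE(n_components=2).fit_transform(W_sorted)
```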
http://crystals.ai
Practical considerations
• Training of deep learning models is fairly expensive; dedicated GPU resources are recommended (Pythia @ Materials Virtual Lab).
• But prediction is cheap: https://megnet.crystals.ai runs on a single dyno on Heroku!
Representations for a collection of atoms
Graphs
Local
env.
The scale problem in computational materials science

Many real-world materials problems are not about bulk crystals: electrode-electrolyte interfaces, catalysis, and microstructure and segregation. We need linear scaling with ab initio accuracy.

Huang et al. ACS Energy Lett. 2018, 3 (12), 2983–2988. Tang et al. Chem. Mater. 2018, 30 (1), 163–173.
Machine learning the potential energy surface

Local environment descriptors: atom-centered symmetry functions (ACSF), moment tensors (MT), smooth overlap of atomic positions (SOAP), SO(4) bispectrum. ACSF/MT encode distances and angles; SOAP/bispectrum encode the neighbor density.

ML approaches: polynomial/linear regression, kernel regression, neural networks.

Interatomic potentials: Behler-Parrinello Neural Network Potential (NNP)¹, Moment Tensor Potential (MTP)², Gaussian Approximation Potential (GAP)³, Spectral Neighbor Analysis Potential (SNAP)⁴.

1. Neural Network Potential (NNP). A separate neural network is used for each atom. The neural network is defined by the number of hidden layers and the nodes in each layer, while the descriptor space is given by the following symmetry functions:
$$G_i^{\mathrm{atom,rad}} = \sum_{j \neq i}^{N_{\mathrm{atom}}} e^{-\eta (R_{ij} - R_s)^2} \cdot f_c(R_{ij}),$$

$$G_i^{\mathrm{atom,ang}} = 2^{1-\zeta} \sum_{j,k \neq i}^{N_{\mathrm{atom}}} (1 + \cos\theta_{ijk})^{\zeta} \cdot e^{-\eta'(R_{ij}^2 + R_{ik}^2 + R_{jk}^2)} \cdot f_c(R_{ij}) \cdot f_c(R_{ik}) \cdot f_c(R_{jk}),$$

where $R_{ij}$ is the distance between atom $i$ and neighbor atom $j$, $\eta$ is the width of the Gaussian and $R_s$ is the position shift over all neighboring atoms within the cutoff radius $R_c$, $\eta'$ is the width of the Gaussian basis, and $\zeta$ controls the angular resolution. $f_c(R_{ij})$ is a cutoff function, defined as follows:

$$f_c(R_{ij}) = \begin{cases} 0.5\,[\cos(\pi R_{ij}/R_c) + 1] & \text{for } R_{ij} \leq R_c \\ 0 & \text{for } R_{ij} > R_c. \end{cases}$$
These hyperparameters were optimized to minimize the mean absolute errors of energies and forces for each chemistry. The NNP model has shown great performance for Si,¹¹ TiO2,⁴⁰ water⁴¹ and solid-liquid interfaces,⁴² and metal-organic frameworks,⁴³ and has been extended to incorporate long-range electrostatics for ionic systems such as ZnO⁴⁴ and Li3PO4.⁴⁵
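The cutoff and radial symmetry functions above translate directly into a few lines of numpy; a minimal sketch, assuming the neighbor distances $R_{ij}$ for one atom are precomputed:

```python
import numpy as np

def fc(R, Rc):
    # Cosine cutoff: decays smoothly to zero at the cutoff radius Rc.
    return np.where(R <= Rc, 0.5 * (np.cos(np.pi * R / Rc) + 1.0), 0.0)

def g_radial(Rij, eta, Rs, Rc):
    # Radial ACSF for one atom; Rij is the 1-D array of neighbor distances.
    return np.sum(np.exp(-eta * (Rij - Rs) ** 2) * fc(Rij, Rc))
```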
2. Gaussian Approximation Potential (GAP). The GAP calculates the similarity between atomic configurations based on a smooth overlap of atomic positions (SOAP)¹⁰,⁴⁶ kernel, which is then used in a Gaussian process model. In SOAP, the Gaussian-smeared atomic neighbor densities $\rho_i(\mathbf{R})$ are expanded in spherical harmonics as follows:

$$\rho_i(\mathbf{R}) = \sum_j f_c(R_{ij}) \cdot \exp\!\left(-\frac{|\mathbf{R} - \mathbf{R}_{ij}|^2}{2\sigma_{\mathrm{atom}}^2}\right) = \sum_{nlm} c_{nlm}\, g_n(R)\, Y_{lm}(\hat{\mathbf{R}}).$$

The spherical power spectrum vector, which is in turn the square of the expansion coefficients,

$$p_{n_1 n_2 l}(\mathbf{R}_i) = \sum_{m=-l}^{l} c_{n_1 lm}^{*}\, c_{n_2 lm},$$

can be used to construct the SOAP kernel, which is raised to a positive integer power $\zeta$ (4 in the present case) to accentuate its sensitivity:¹⁰

$$K(\mathbf{R}, \mathbf{R}') = \sum_{n_1 n_2 l} \left(p_{n_1 n_2 l}(\mathbf{R})\, p_{n_1 n_2 l}(\mathbf{R}')\right)^{\zeta}.$$

In the above equations, $\sigma_{\mathrm{atom}}$ is a smoothness parameter controlling the Gaussian smearing, and $n_{\max}$ and $l_{\max}$ determine the maximum powers for the radial and angular components in the spherical harmonics expansion, respectively.¹⁰ These hyperparameters, as well as the number of reference atomic configurations used in the Gaussian process, are […]
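The power-spectrum contraction and kernel are similarly compact. A sketch assuming the complex expansion coefficients $c_{nlm}$ are already computed; note that the widely used normalized SOAP kernel instead raises the full dot product to $\zeta$, whereas this follows the equation as transcribed above:

```python
import numpy as np

def power_spectrum(c):
    # c: complex array of shape (n_max, l_max + 1, 2*l_max + 1) holding c_{nlm},
    # with m stored as 0..2l for each l. Returns p_{n1 n2 l} flattened.
    # Real part taken as a simplification; a full treatment symmetrizes n1, n2.
    n_max, l_plus_1, _ = c.shape
    p = [
        np.sum(np.conj(c[n1, l, : 2 * l + 1]) * c[n2, l, : 2 * l + 1]).real
        for n1 in range(n_max)
        for n2 in range(n_max)
        for l in range(l_plus_1)
    ]
    return np.array(p)

def soap_kernel(p, p_prime, zeta=4):
    # K(R, R') = sum over (n1, n2, l) of (p * p')^zeta, as in the excerpt.
    return np.sum((p * p_prime) ** zeta)
```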
¹ Behler and Parrinello, Phys. Rev. Lett. 2007, 98, 146401.
² Shapeev, Multiscale Model. Simul. 2016, 14, 1153–1173.
³ Bartók et al. Phys. Rev. Lett. 2010, 104, 136403.
⁴ Thompson et al. J. Comput. Phys. 2015, 285, 316–330.
Machine Learning Interatomic Potentials (ML-IAPs)

Evaluation criteria:
• Accuracy (energies, forces, and properties)
• Computational cost
• Training data requirements
• Extrapolability
Standardized workflow for ML-IAP construction
and evaluation
[Workflow diagram: Crystal structure → DFT data generation with Pymatgen and FireWorks + VASP (DFT static calculations; elastic deformation → distorted structures; surface generation → surface structures; vacancy + AIMD and low-/high-T AIMD → trajectory snapshots) → Dataset. Local environments of sites S1 … Sn within a cutoff radius rc → atomic descriptors X1(r1j … r1n), …, Xn(rnj … rnm) → machine learning Y = f(X; ω), with Y = (energy, force, stress). Hyperparameters (energy weights, degrees of freedom, cutoff radius, expansion width) are optimized against DFT properties (e.g., elastic, phonon) via grid search or evolutionary algorithms, followed by property fitting.]
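For the linear ML-IAPs (SNAP/qSNAP), the fit Y = f(X; ω) in this workflow reduces to weighted linear least squares over energy, force, and stress rows; a schematic sketch, with the feature matrix, targets, and weights assumed precomputed:

```python
import numpy as np

def fit_linear_iap(X, y, w):
    # X: (n_obs, n_coeff) descriptor features stacked over energy/force/stress
    # observations; y: (n_obs,) DFT reference values; w: (n_obs,) weights
    # (e.g., energy rows weighted more heavily than force rows).
    sw = np.sqrt(w)
    coeffs, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coeffs
```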
Available open source on GitHub: https://github.com/materialsvirtuallab/mlearn

Test systems:
• fcc Ni
• fcc Cu
• bcc Li
• bcc Mo
• diamond Ge
• diamond Si
Zuo, Y.; Chen, C.; Li, X.; Deng, Z.; Chen, Y.; Behler, J.; Csányi, G.; Shapeev, A. V.; Thompson, A. P.; Wood, M. A.; et al. A Performance and Cost Assessment
of Machine Learning Interatomic Potentials. arXiv:1906.08888 2019.
ML-IAP: Accuracy vs Cost

[Plots a, b (Mo data set): test error (meV/atom) vs computational cost (s/(MD step · atom)); model annotations: Jmax = 3, 2000 kernels, 20 polynomial powers, hidden layers [16, 16]]

GAP reaches the best accuracy, but is the most expensive by O(10²-10³). MTP, NNP, and qSNAP all lie quite close to the Pareto frontier.
ML-IAP: Training Data Requirements

[Plots a, b: convergence of energies and forces with the number of training structures]

• Data quality is more important than data quantity: ~O(10²) structures are sufficient to converge energies and forces for most ML-IAPs.
• NNP and qSNAP require much more training data.
ML-IAP: Extrapolability

• The greater the ML complexity (e.g., NNP and GAP), the greater the issues with extrapolation.
• Linear SNAP performs surprisingly well on EOS and polymorph energy differences.

[Equation-of-state and polymorph plots for Ni, Cu, Li, Mo, Si, and Ge, comparing DFT with GAP, NNP, MTP, SNAP, and qSNAP for bcc Ni, bcc Cu, fcc Li, fcc Mo, wurtzite Si, and wurtzite Ge. GAP performs poorly!]
Applications: Ni-Mo phase diagram and mechanical behavior

Solid-liquid equilibrium | Hall-Petch strengthening

[1] Hu et al. Science 2017, 355, 1292.
Conclusions

• Graph / local environment descriptors + machine learning → "instant", linear-scaling property predictions.
• Transfer learning → property models for smaller data sets.
• ML-IAPs: reproducible, near-DFT accuracy + linear scaling → new science.
• http://crystals.ai & mlearn: open-source software and standardized datasets for materials ML.
Acknowledgements
MAVRL
Creating It from Bit
Contract #N000141612621
GRO Program
Chi Chen
(MEGNet)
Yunxing Zuo
(ML-IAP)
Xiangguo Li
(Ni-Mo and MPE)
