Brian DeCost (1), Heshan Yu (2), Xiaohang Zhang (2), Seunghun Lee (2), Yangang Liang (2), Ichiro Takeuchi (2), Jason Hattrick-Simpers (1), A. Gilad Kusne (1)
(1) National Institute of Standards and Technology, (2) University of Maryland, College Park
Autonomous experimental phase diagram acquisition
2018.08.07 -- AIMS 2018 Meeting -- NIST, Gaithersburg MD
Brian DeCost
2
High-Throughput Experimental Materials Collaboratory
https://mgi.nist.gov/htemc
Distributed experimental materials science platform, built on a network of member institutes.
Realization of the HPC paradigm for experimental materials science?
Contact: martin.green@nist.gov
Look for a forthcoming white paper
3
Active clustering for phase diagram acquisition
4
Autonomous science systems
Tabor, Daniel P., et al. "Accelerating the discovery of materials for clean energy in the era of smart automation." Nat. Rev. Mater. 3 (2018): 5-20
https://doi.org/10.1038/s41578-018-0005-z
5
Parallel synthesis, serial characterization
[Figure: co-sputtering scheme (Ni, Mn, Al targets) producing a 3” spread wafer, mapped onto the Ni-Al-Mn phase diagram]
Gregoire, J. M., et al. "High-throughput synchrotron X-ray diffraction for combinatorial phase mapping." Journal of synchrotron radiation 21.6 (2014): 1262-1268.
[Figure: XRD map, Bi-Fe-V system]
6
Unsupervised phase diagram estimation is hard
Hattrick-Simpers, Jason R., John M. Gregoire, and A. Gilad Kusne.
"Perspective: Composition–structure–property mapping in high-throughput experiments: Turning data into knowledge." APL Materials 4.5 (2016): 053211.
https://doi.org/10.1063/1.4950995
What you really want:
- multi-phase: linear unmixing
- single-phase: invariance to peak shift
- infer the number of regions...
- respect thermodynamics
- leverage archival data
- deal with missing reflections
- fast!
Compromises we can live with
How to discover e.g. line compounds with this approach?
7
Simultaneous phase and property mapping
Kusne, Aaron Gilad, et al. "On-the-fly machine-learning for high-throughput experiments: search for rare-earth-free permanent magnets." Scientific reports 4 (2014): 6367.
https://doi.org/10.1038/srep06367
Finding novel rare-earth-free
permanent magnets
8
GRENDEL: Iterative piecewise matrix factorization
Alternate between:
- clustering
- matrix factorization
Kusne, Aaron G., et al. "High-throughput determination of structural phase diagram and constituent phases using GRENDEL." Nanotechnology 26.44 (2015): 444002.
Include archival data from
- ICSD
- AFLOW
9
Amdahl's law in materials science
Speedup (innovation) is limited by the serial portion of the process!
parallel synthesis
- MnNiGe: 535 'samples'
serial characterization
- Lab diffractometer: 30 min per composition, 2 weeks per ternary!
fast (serial) characterization
- Synchrotron (SLAC): 30 s per composition, 4.5 hours per ternary
Exploit the structure of materials data to scale up
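As a rough check of the numbers above, the sketch below states Amdahl's law in its usual form, S = 1 / ((1 - p) + p / n), and recomputes the serial characterization times for 535 compositions. The function names are illustrative and not from the talk; the only inputs taken from the slide are the sample count and the per-composition times.

```python
def amdahl_speedup(parallel_fraction: float, n_workers: float) -> float:
    """Amdahl's law: overall speedup when only a fraction of the work is parallelized."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_workers)


def characterization_hours(n_samples: int, seconds_per_sample: float) -> float:
    """Total time for strictly serial characterization, in hours."""
    return n_samples * seconds_per_sample / 3600.0


N_SAMPLES = 535  # MnNiGe composition spread, from the slide

lab_hours = characterization_hours(N_SAMPLES, 30 * 60)   # 30 min per composition
synchrotron_hours = characterization_hours(N_SAMPLES, 30)  # 30 s per composition

print(f"lab diffractometer: {lab_hours:.1f} h ({lab_hours / 24:.1f} days of continuous measurement)")
print(f"synchrotron:        {synchrotron_hours:.1f} h")

# Even with unlimited parallel synthesis, a process that is half serial
# cannot be sped up by more than a factor of two.
print(f"speedup bound, 50% serial: {amdahl_speedup(0.5, 1e12):.2f}x")
```

About eleven days of uninterrupted lab measurement is consistent with the "2 weeks per ternary" figure once instrument downtime is allowed for.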
10
Autonomous run: cluster, extrapolate, select
[Figure: ternary composition diagrams with vertices Fe, Fe0.4Pd0.6, and Fe0.4Ga0.6]
VO2
11
Metal-insulator transition: VNbO2, VWO2, etc.
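[Figure: resistance R (Ω, log scale from 10^2 to 10^5) vs. temperature T (K, 100 to 400) for W contents of 0.33, 0.96, 1.36, 1.68, 2.27, 2.61, 2.86, and 3.44 W%]
[Figure: composition spread film on c-Al2O3 substrate; regions labeled Mixed and Tetragonal]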
We'd like to efficiently determine metal-insulator transition temperatures experimentally in a variety of systems
9mm composition spread chip
Metal-insulator transition temperature
decreases with doping
Monoclinic
By Original PNGs by Daniel Mayer, traced in Inkscape by User:Stannered - Crystal stucture
CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=1735636
12
X-ray diffraction for phase diagram determination
- Dense collection
- Laboriously hand labeled by experts
- clustering can help
- scaling to more complex systems is a challenge
- higher temperatures: avoid annealing and diffusion effects
[Figure axes: temperature, W content]
13
Active clustering for autonomous XRD phase mapping
Think carefully about modeling to remove researcher degrees of freedom
14
Spectral clustering loosely: like kernel k-means clustering
[Figure: two XRD patterns Y, intensity (arb. units) vs. 2θ from 27.00 to 28.50]
1. Form the cosine similarity covariance matrix
   $K_{i,j} = \exp\left(-d_{\cos}(y_i, y_j) / (2 \sigma_i \sigma_j)\right)$
   set $\sigma_i$ to the k'th nearest neighbor distance for $y_i$
2. compute the eigendecomposition of the kernel matrix
   solve $N \lambda a = K a$
   project data onto k principal eigenvectors
3. perform k-means clustering in the latent space
   find cluster centers $\mu_c$: $\min_C \sum_{c \in C} \sum_{x \in c} \lVert x - \mu_c \rVert^2$
von Luxburg, Tutorial on Spectral Clustering, arXiv:0711.0189
Zelnik-Manor and Perona, Self-Tuning Spectral Clustering, NIPS 2005
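The three steps of this slide can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the talk: the kernel uses the squared-distance local scaling exp(-d^2 / (sigma_i sigma_j)) from the cited Self-Tuning Spectral Clustering paper, which may differ in detail from the exponent shown on the slide; the k-means routine, the function names, and the synthetic Gaussian "diffraction peaks" are all made up for the example.

```python
import numpy as np


def cosine_distance(Y):
    """Pairwise cosine distance between rows of Y (one diffraction pattern per row)."""
    unit = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    return np.clip(1.0 - unit @ unit.T, 0.0, None)


def local_scaling_kernel(D, k_neighbor=2):
    """Assumed kernel: exp(-d_ij^2 / (sigma_i sigma_j)), sigma_i = k-th neighbor distance."""
    sigma = np.sort(D, axis=1)[:, k_neighbor]  # column 0 is the point itself
    sigma = np.maximum(sigma, 1e-12)           # guard against duplicate patterns
    return np.exp(-(D ** 2) / np.outer(sigma, sigma))


def kmeans(X, n_clusters, n_iter=20):
    """Plain Lloyd iterations with a deterministic farthest-point start."""
    centers = [X[0]]
    for _ in range(1, n_clusters):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(n_clusters):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def spectral_cluster(Y, n_clusters, k_neighbor=2):
    K = local_scaling_kernel(cosine_distance(Y), k_neighbor)  # step 1
    _, eigvecs = np.linalg.eigh(K)                            # step 2 (ascending eigenvalues)
    latent = eigvecs[:, -n_clusters:]                         # k principal eigenvectors
    return kmeans(latent, n_clusters)                         # step 3


# Synthetic data: two groups of single-peak patterns with small peak shifts.
two_theta = np.linspace(27.0, 28.5, 151)
shifts = np.array([-0.04, -0.02, 0.0, 0.02, 0.04])


def peaks(center, width=0.1):
    return np.exp(-0.5 * ((two_theta[None, :] - (center + shifts)[:, None]) / width) ** 2)


Y = np.vstack([peaks(27.4), peaks(28.1)])
labels = spectral_cluster(Y, n_clusters=2)
print(labels)
```

The local scale sigma_i lets the kernel adapt to how densely each region of pattern space is sampled, so no global bandwidth has to be tuned by hand.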
15
Gaussian process classification
$p(y = 1 \mid x) = \sigma(f(x))$, with $\sigma(z) = 1 / (1 + \exp(-z))$
A Bayesian non-parametric generalization of logistic regression
[Figure: six panels of y (-1 to 1) vs. X (0.0 to 1.0)]
GP prior on f:
$K_{i,j} = \exp\left(-|x_j - x_i|^2 / \ell^2\right)$
$f_* = k(x_*)^\top (K + \sigma^2 I)^{-1} y$
$\mathbb{V}[f_*] = k(x_*, x_*) - k_*^\top (K + \sigma^2 I)^{-1} k_*$
Bayesian model selection:
gradient-based optimization of the marginal likelihood $p(y \mid X, H_i)$
Multi-class: one-vs-all strategy
http://gpflow.readthedocs.io
Why you should consider Bayesian non-parametric models:
principled hyperparameter tuning (without CV)
This model knows what it doesn't know!
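The predictive mean and variance on this slide can be evaluated directly with NumPy. The sketch below treats the ±1 class labels as regression targets and uses the slide's squared-exponential kernel; the lengthscale, noise variance, and toy data are arbitrary assumptions, and no hyperparameter optimization is performed (a library such as GPflow would handle that, along with a proper classification likelihood).

```python
import numpy as np


def rbf_kernel(xa, xb, lengthscale):
    """K_ij = exp(-|xa_i - xb_j|^2 / lengthscale^2) for 1-D inputs."""
    return np.exp(-((xa[:, None] - xb[None, :]) ** 2) / lengthscale ** 2)


def gp_predict(x_train, y_train, x_test, lengthscale=0.2, noise_var=0.01):
    """Posterior mean and variance of the latent function at x_test."""
    K = rbf_kernel(x_train, x_train, lengthscale) + noise_var * np.eye(len(x_train))
    k_star = rbf_kernel(x_train, x_test, lengthscale)  # shape (n_train, n_test)
    mean = k_star.T @ np.linalg.solve(K, y_train)
    # k(x*, x*) = 1 for this kernel
    var = 1.0 - np.sum(k_star * np.linalg.solve(K, k_star), axis=0)
    return mean, var


# Toy 1-D problem: class -1 on the left, class +1 on the right, a gap in between.
x_train = np.array([0.1, 0.2, 0.3, 0.7, 0.8, 0.9])
y_train = np.array([-1.0, -1.0, -1.0, 1.0, 1.0, 1.0])
x_test = np.linspace(0.0, 1.0, 11)

mean, var = gp_predict(x_train, y_train, x_test)
for x, m, v in zip(x_test, mean, var):
    print(f"x={x:.1f}  mean={m:+.2f}  var={v:.3f}")
```

The variance is largest in the unsampled gap around x = 0.5, which is the sense in which the model "knows what it doesn't know".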
16
Active Gaussian process classification
Kapoor et al. Gaussian Processes for Object Categorization (2010)
DOI: 10.1007/s11263-009-0268-3
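Classification uncertainty: $\arg\min_{x_u \in X_u} |f_u| / \sqrt{\Sigma_u + \sigma^2}$
Margin: $\arg\min_{x_u \in X_u} |y_u - 0.5|$
Variance: $\arg\max_{x_u \in X_u} \Sigma_u$
Multi-class: one-vs-all strategy
[Figure: phase maps with regions labeled monoclinic, tetragonal, and two phase]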
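The three selection rules on this slide differ only in how they score each unlabeled candidate. A minimal sketch follows, assuming the predictive mean and variance at each candidate are already available; the function names and the example numbers are invented, and the margin rule uses the plug-in probability sigmoid(mean) as a stand-in for the predicted class probability.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def argmin(values):
    return min(range(len(values)), key=values.__getitem__)


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


def select_uncertainty(mean, var, noise_var):
    """Classification uncertainty: argmin |f_u| / sqrt(Sigma_u + sigma^2)."""
    return argmin([abs(m) / math.sqrt(v + noise_var) for m, v in zip(mean, var)])


def select_margin(mean):
    """Margin: argmin |y_u - 0.5|, with y_u approximated by sigmoid(mean)."""
    return argmin([abs(sigmoid(m) - 0.5) for m in mean])


def select_variance(var):
    """Variance: argmax Sigma_u."""
    return argmax(list(var))


# Four hypothetical unlabeled candidates.
mean = [-1.2, -0.1, 0.05, 0.9]
var = [0.05, 0.40, 0.02, 0.60]

picks = (
    select_uncertainty(mean, var, noise_var=0.01),
    select_margin(mean),
    select_variance(var),
)
print(picks)
```

On these numbers each rule selects a different candidate: the margin rule looks only at the mean, the variance rule only at the variance, and the uncertainty rule trades the two off.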
17
Infrastructure for autonomous experiments
Components: Analytics client, XRD client, Diffractometer, Control server
1. GET available data
2. (re)train and predict
   $f_* = k_*^\top (K + \sigma_n^2 I)^{-1} y$
   $\mathbb{V}[f_*] = k(x_*, x_*) - k_*^\top (K + \sigma_n^2 I)^{-1} k_*$
3. POST proposed experiment
4. GET command
5. POST new data
[Figure: two maps of temperature (30 to 60) vs. composition (0.6 to 1.0)]
[Figure: XRD pattern, intensity (arb. units) vs. 2θ from 26.0 to 30.0]
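The five GET/POST steps can be mimicked with an in-memory stand-in for the control server. Everything in this sketch is hypothetical: the class and method names, the space-filling rule standing in for "(re)train and predict", and the fake diffractometer with an arbitrary phase boundary at 45 are placeholders, and no real HTTP traffic is involved. The assignment of steps 1 to 3 to the analytics client and steps 4 and 5 to the XRD client is likewise an assumption.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class ControlServer:
    """In-memory stand-in for the control server (hypothetical interface)."""
    data: dict = field(default_factory=dict)   # measured point -> phase label
    queue: list = field(default_factory=list)  # proposed experiments

    def get_data(self):                # 1. GET available data
        return dict(self.data)

    def post_proposal(self, point):    # 3. POST proposed experiment
        self.queue.append(point)

    def get_command(self):             # 4. GET command
        return self.queue.pop(0) if self.queue else None

    def post_data(self, point, label):  # 5. POST new data
        self.data[point] = label


def distance(p, q):
    """Squared distance with composition and temperature rescaled to [0, 1]."""
    return ((p[0] - q[0]) / 0.4) ** 2 + ((p[1] - q[1]) / 30.0) ** 2


def analytics_step(server, candidates):
    measured = server.get_data()
    unmeasured = [p for p in candidates if p not in measured]
    if not unmeasured:
        return
    # 2. (re)train and predict: placeholder that picks the candidate farthest
    # from all measured points instead of a GP-based acquisition rule.
    def novelty(p):
        return min((distance(p, q) for q in measured), default=0.0)
    server.post_proposal(max(unmeasured, key=novelty))


def fake_diffractometer(point):
    """Made-up measurement: an arbitrary phase boundary at temperature 45."""
    _composition, temperature = point
    return "tetragonal" if temperature >= 45 else "monoclinic"


def xrd_step(server, measure):
    command = server.get_command()
    if command is not None:
        server.post_data(command, measure(command))


candidates = list(product([0.6, 0.7, 0.8, 0.9, 1.0], [30, 40, 50, 60]))
server = ControlServer()
for _ in range(6):
    analytics_step(server, candidates)
    xrd_step(server, fake_diffractometer)

for point, label in server.data.items():
    print(point, label)
```

Keeping the two clients decoupled through the server means the analytics model can be swapped without touching the instrument-side code.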
18
VWO2
monoclinic tetragonal
19
VWO2 clustering performance
monoclinic tetragonal
20
Acknowledgements
Funding sources
NIST
ONR
NRC Postdoctoral Research Associate Program
XRD setup and VNbO2 data collection by Yangang Liang
VWO2 film growth by Xiaohang Zhang
VWO2 setup by Heshan Yu
Ground truth phase labeling by Jason Hattrick-Simpers
