Machine Learning and Surrogate Optimization on Heterogeneous Catalysts

Ichigaku Takigawa
2019 PRESTO International Symposium on Materials Informatics 
Feb 9-11, 2019 @ Tokyo
Graduate School of Information Science and Technology, Hokkaido University
Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University

Heterogeneous catalysts and surface reactions
“God made the bulk; the surface was invented by the devil.” (Wolfgang Pauli)
Surface processes: adsorption, diffusion, desorption, dissociation, recombination
Surface sites: kinks, terraces, adatom, vacancy, steps
Many hard-to-quantify factors complicate their atomic-level characterization by modelling and experiments.
• reaction conditions
• composition
• support
• surface termination
• particle size & morphology
• atomic coordination environment
• disordered/amorphous structures in their active state
⋮
GAS
(Reactants)
SOLID
(Catalysts)
High-fidelity simulations are too time-consuming...

Then how can we characterize the catalytic activity?
K. Shimizu et al, ACS Catal. 2, 1904 (2012)
Hammer–Nørskov d-band model: reaction rates vs. d-band center (εd − EF) / eV. Volcano trends!
Brønsted–Evans–Polanyi relation: activation energy / eV vs. adsorption energy / eV. Linear trends!
The d-electrons of transition metals govern...
Several DFT-calculated indexes capture the trend to some extent...

Outline: Our ML-based studies
1. Can we predict the d-band center?
   predicting DFT-calculated values by machine learning (Takigawa et al, RSC Advances. 2016)
2. Can we predict the adsorption energy?
   predicting DFT-calculated values by machine learning (Toyao et al, JPC C 2018)
3. Can we predict the catalytic activity?
   predicting values from experiments reported in the literature by machine learning (Suzuki et al, in preparation)

Case 1. Predicting the d-band centers
Figure labels: Host, Guest
Ruban, Hammer, Stoltze, Skriver, Nørskov, J Mol Catal A, 115:421-429 (1997)
J. K. Nørskov, et al., Advances in Catalysis, 2000
Two types of models
• 1% doped
• overlayer
[1% doped]
The d-bands of transition metals play central roles.

The beauty of the periodic table worked!
Fe Co Ni Cu Ru Rh Pd Ag Ir Pt Au
Fe -0.92 -0.96 -0.97 -1.65 -1.64 -2.24 -1.87 -2.4 -3.11
Co -1.37 -1.23 -2.12 -2.82 -2.53 -2.26 -3.56
Ni -0.33 -1.18 -1.92 -2.03 -2.43 -2.15 -2.82 -3.39
Cu -2.42 -2.49 -2.67 -2.89 -2.94 -3.82 -4.63
Ru -1.11 -1.04 -1.12 -1.41 -1.88 -1.81 -1.54 -2.27
Rh -1.42 -1.32 -1.51 -1.7 -1.73 -2.12 -1.81 -1.7 -2.18 -2.3
Pd -1.47 -1.29 -1.29 -1.03 -1.58 -1.83 -1.68 -1.52 -1.79
Ag -3.75 -3.56 -3.62 -3.8 -4.03 -3.5 -3.93 -4.51
Ir -1.78 -1.71 -1.78 -1.55 -2.14 -2.53 -2.2 -2.11 -2.6 -2.7
Pt -1.71 -1.47 -2.13 -2.01 -2.23 -2.06 -1.96 -2.33
Au -3.03 -2.82 -2.85 -2.89 -3.44 -3.56
Fe -0.78 -1.65 -1.64 -1.87
Co -1.18 -1.17 -1.37 -1.87 -2.12 -2.82 -2.26
Ni -0.33 -1.18 -1.17 -2.61 -2.43 -2.15 -2.82
Cu -2.42 -2.89 -2.94 -3.88 -4.63
Ru -1.11 -1.04 -1.12 -1.11 -1.41 -1.81 -2.27
Rh -1.42 -1.51 -2.12 -1.81 -1.7
Pd -1.29 -1.29 -1.03 -1.58 -1.83 -1.52 -1.79
Ag -3.68 -3.8 -3.63 -4.51
Ir -2.14 -2.11 -2.7
Pt -1.71 -1.47 -2.13 -2.01 -2.23 -2.06
Au -2.86 -3.09 -2.89 -3.44 -3.56
Fe -2.17 -3.11
Co -1.17 -1.37 -2.12
Ni -0.33 -1.18 -2.61 -2.43
Cu -2.42 -2.29 -2.49 -3.71 -4.63
Ru -2.02
Rh -1.32 -1.73 -2.12
Pd -1.94 -1.83 -1.97
Ag -3.75 -3.68 -4.51
Ir -1.78 -1.71 -2.7
Pt -2.13
Au -3.09 -2.89
Gradient boosting w/ 6 descriptors, mean RMSE over 100 runs:
• training sets (75%) / test sets (25%): 0.153 eV
• training sets (50%) / test sets (50%): 0.235 eV
• training sets (25%) / test sets (75%): 0.402 eV
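The "100 times mean RMSE" protocol on this slide can be sketched as repeated random hold-out evaluation. The snippet below is only an illustration under stated assumptions: the data are synthetic (not the d-band centers above), the model is a one-variable least-squares line standing in for gradient boosting, and all function names are hypothetical.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (stand-in model, not GBR)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def repeated_holdout_rmse(xs, ys, train_fraction, n_repeats=100, seed=0):
    """Mean test RMSE over n_repeats random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    n_train = max(2, int(round(train_fraction * len(xs))))
    scores = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        train, test = indices[:n_train], indices[n_train:]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [a * xs[i] + b for i in test]
        scores.append(rmse([ys[i] for i in test], preds))
    return sum(scores) / len(scores)

# Synthetic data (hypothetical, NOT the values from the slides).
data_rng = random.Random(42)
xs = [data_rng.uniform(0.0, 10.0) for _ in range(110)]
ys = [-0.3 * x - 1.0 + data_rng.gauss(0.0, 0.2) for x in xs]

results = {frac: repeated_holdout_rmse(xs, ys, frac) for frac in (0.75, 0.50, 0.25)}
for frac, score in results.items():
    print(f"training {frac:.0%}: mean RMSE = {score:.3f}")
```

Averaging over many random splits reduces the dependence of the reported error on any single lucky or unlucky partition, which matters for data sets of this small size.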

The ML model
• Group (G)
• Bulk Wigner–Seitz radius (R) in Å
• Atomic number (AN)
• Atomic mass (AM) in g mol⁻¹
• Period (P)
• Electronegativity (EN)
• Ionization energy (IE) in eV
• Enthalpy of fusion (∆fusH) in J g⁻¹
• Density at 25 ℃ (ρ) in g cm⁻³
9 readily available descriptors pretested for host & guest (18 in total)
Gradient Boosted Tree Regression (GBR) with only 6 descriptors
(1) Group in the periodic table (host)
(2) Density at 25 ℃ (host)
(3) Enthalpy of fusion (guest)
(4) Ionization energy (guest)
(5) Enthalpy of fusion (host)
(6) Ionization energy (host)
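As a rough illustration of how host and guest descriptors combine into one input vector per host/guest pair, the sketch below concatenates the host's descriptors with the guest's. It is an assumption-laden example: only three of the nine descriptors are included (group, atomic number, period, which are standard periodic-table facts), and the names `ELEMENT_DESCRIPTORS`, `pair_features`, and `feature_names` are hypothetical.

```python
# Three of the nine descriptors per element: group (G), atomic number (AN),
# period (P). The remaining six (R, AM, EN, IE, ∆fusH, ρ) would be added to
# each dictionary in the same way from a reference table.
ELEMENT_DESCRIPTORS = {
    "Fe": {"G": 8, "AN": 26, "P": 4},
    "Co": {"G": 9, "AN": 27, "P": 4},
    "Ni": {"G": 10, "AN": 28, "P": 4},
    "Cu": {"G": 11, "AN": 29, "P": 4},
    "Ru": {"G": 8, "AN": 44, "P": 5},
    "Rh": {"G": 9, "AN": 45, "P": 5},
    "Pd": {"G": 10, "AN": 46, "P": 5},
    "Ag": {"G": 11, "AN": 47, "P": 5},
    "Ir": {"G": 9, "AN": 77, "P": 6},
    "Pt": {"G": 10, "AN": 78, "P": 6},
    "Au": {"G": 11, "AN": 79, "P": 6},
}
DESCRIPTOR_ORDER = ("G", "AN", "P")

def feature_names():
    """Column names: host descriptors first, then guest descriptors."""
    return ([f"{name} (host)" for name in DESCRIPTOR_ORDER]
            + [f"{name} (guest)" for name in DESCRIPTOR_ORDER])

def pair_features(host, guest):
    """Feature vector for one host/guest pair."""
    h, g = ELEMENT_DESCRIPTORS[host], ELEMENT_DESCRIPTORS[guest]
    return [h[k] for k in DESCRIPTOR_ORDER] + [g[k] for k in DESCRIPTOR_ORDER]

print(feature_names())
print(pair_features("Cu", "Pt"))
```

With all nine descriptors per element the same construction yields the 18 columns mentioned above, from which a subset such as the six listed can be selected.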

11 ML methods pretested
[3 Tree Ensembles (Nonlinear Regression Models)]
GBR (Gradient Boosted Tree Regression); ETR (Extra-Trees Regression); RFR (Random Forest Regression)
[5 Linear Regression Models]
OLS (Ordinary Least Squares Regression); PLS (Partial Least Squares Regression); LASSO (Lasso Regression); RIDGE (Ridge Regression); RANSAC (Random Sample Consensus Regression)
[3 Kernel Methods (Nonlinear Regression Models)]
GPR (Gaussian Process Regression); KRR (Kernel Ridge Regression); SVR (Support Vector Regression)
Figure (x-axis order): GBR, ETR, GPR, RFR, KRR, OLS, RIDGE, PLS, RANSAC, SVR, LASSO
training sets (75%)
test sets (25%)

Tree ensemble regressors (GBR, ETR, RFR)
Decision Tree (Regression Tree): $y = \sin\left(\sqrt{x_1^2 + x_2^2}\right) \approx \hat{y}$, where $\hat{y}$ takes constant values $c_1, c_2, c_3$ on regions of the $(x_1, x_2)$ plane
Tree Ensemble: $y = \sin\left(\sqrt{x_1^2 + x_2^2}\right) \approx \hat{y} = (\text{tree}) + (\text{tree}) + (\text{tree}) + \dots$
• Region-wise constant prediction
• The regions are given by recursive axis-parallel partitioning of the data space
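The two properties of a regression tree named on this slide (region-wise constant prediction, recursive axis-parallel partitioning) can be illustrated with a small from-scratch tree fitted to the slide's example function y = sin(sqrt(x1² + x2²)). This is a minimal sketch with hypothetical names and arbitrary settings (`max_depth=4`, `min_leaf=5`, 200 random points); it is not the implementation used in the study.

```python
import math
import random

def mean(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared deviations from the mean (error of a constant fit)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(X, y, depth=0, max_depth=4, min_leaf=5):
    """Recursively split the data along one axis at a time."""
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return {"value": mean(y)}  # leaf: constant prediction for this region
    best = None  # (cost, feature index, threshold)
    for j in range(len(X[0])):
        order = sorted(range(len(y)), key=lambda i: X[i][j])
        for k in range(min_leaf, len(y) - min_leaf + 1):
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue
            cost = sse([y[i] for i in order[:k]]) + sse([y[i] for i in order[k:]])
            if best is None or cost < best[0]:
                best = (cost, j, (lo + hi) / 2.0)
    if best is None:
        return {"value": mean(y)}
    _, j, threshold = best
    left = [i for i in range(len(y)) if X[i][j] <= threshold]
    right = [i for i in range(len(y)) if X[i][j] > threshold]
    return {
        "feature": j,
        "threshold": threshold,
        "left": build_tree([X[i] for i in left], [y[i] for i in left],
                           depth + 1, max_depth, min_leaf),
        "right": build_tree([X[i] for i in right], [y[i] for i in right],
                            depth + 1, max_depth, min_leaf),
    }

def predict(tree, x):
    """Descend to the leaf whose region contains x and return its constant."""
    while "value" not in tree:
        tree = tree["left"] if x[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["value"]

def count_leaves(tree):
    if "value" in tree:
        return 1
    return count_leaves(tree["left"]) + count_leaves(tree["right"])

def rmse(y_true, y_pred):
    return math.sqrt(mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)]))

rng = random.Random(0)
X = [[rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0)] for _ in range(200)]
y = [math.sin(math.sqrt(x1 ** 2 + x2 ** 2)) for x1, x2 in X]

tree = build_tree(X, y)
tree_rmse = rmse(y, [predict(tree, x) for x in X])
baseline_rmse = rmse(y, [mean(y)] * len(y))
print(f"leaves: {count_leaves(tree)}, tree RMSE: {tree_rmse:.3f}, "
      f"constant baseline RMSE: {baseline_rmse:.3f}")
```

Each leaf corresponds to one axis-aligned rectangle of the (x1, x2) plane, so a single tree gives a blocky approximation; summing or averaging many trees smooths it.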

Tree ensemble regressors (GBR, ETR, RFR)
Advantages
• quick, nonlinear, parallelizable
• highly accurate (widely used in many winning solutions for data prediction competitions)
• usually less hyperparameter dependent (compared to kernel methods and neural networks)
• conservative extrapolation
• "variable importance" provided
• popular implementations
  • Scikit-learn
  • XGBoost (by DMLC)
  • LightGBM (by Microsoft)
How to generate multiple decision trees?
• RFR / ETR: random patches (random subsampling of instances and variables) or random splits
• GBR (can also be mixed with the above strategy): sequentially add a new tree to compensate for the weak points of the current ensemble.
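The two strategies for generating multiple trees can be contrasted in a few lines. The sketch below uses depth-1 trees (stumps) on a one-dimensional toy problem, y = sin(x): the bagging variant fits each stump to a bootstrap resample and averages them, while the boosting variant fits each new stump to the residuals of the current ensemble. The data, the learning rate of 0.3, and all function names are assumptions chosen for illustration only.

```python
import math
import random

def fit_stump(xs, targets):
    """Best single-threshold split on a 1-D input (a depth-1 regression tree)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best = None  # (cost, threshold, left mean, right mean)
    for k in range(1, len(xs)):
        lo, hi = xs[order[k - 1]], xs[order[k]]
        if lo == hi:
            continue
        left = [targets[i] for i in order[:k]]
        right = [targets[i] for i in order[k:]]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        cost = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if best is None or cost < best[0]:
            best = (cost, (lo + hi) / 2.0, lm, rm)
    if best is None:  # all inputs identical: fall back to a constant
        m = sum(targets) / len(targets)
        return xs[0], m, m
    return best[1], best[2], best[3]

def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def fit_bagging(xs, ys, n_trees=50, seed=0):
    """RFR-like strategy: each stump sees a random bootstrap resample."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in xs]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps

def bagging_predict(stumps, x):
    return sum(stump_predict(s, x) for s in stumps) / len(stumps)

def fit_boosting(xs, ys, n_trees=50, learning_rate=0.3):
    """GBR-like strategy: each stump is fitted to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x) for p, x in zip(preds, xs)]
    return base, learning_rate, stumps

def boosting_predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)

def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

xs = [2.0 * math.pi * i / 79 for i in range(80)]
ys = [math.sin(x) for x in xs]

single = fit_stump(xs, ys)
bagged = fit_bagging(xs, ys)
boosted = fit_boosting(xs, ys)

baseline_rmse = rmse(ys, [sum(ys) / len(ys)] * len(ys))
stump_rmse = rmse(ys, [stump_predict(single, x) for x in xs])
bagging_rmse = rmse(ys, [bagging_predict(bagged, x) for x in xs])
boosting_rmse = rmse(ys, [boosting_predict(boosted, x) for x in xs])
print(f"constant: {baseline_rmse:.3f}  single stump: {stump_rmse:.3f}  "
      f"bagging: {bagging_rmse:.3f}  boosting: {boosting_rmse:.3f}")
```

These are training errors on noise-free toy data, so they only show how the ensembles are constructed; they say nothing about which strategy generalizes better on real catalyst data.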

Descriptor analysis and evaluation
Mean RMSE over 100 runs (training sets 75%, test sets 25%):
• GBR with 18 descriptors: 0.204 ± 0.047 eV
• GBR with 6 descriptors: 0.212 ± 0.047 eV
• GBR with 4 descriptors: 0.214 ± 0.046 eV
Descriptor importances → descriptor selection (top-k)

Case 2. Predicting the adsorption energy
Predicting the adsorption energy of CH3
DFT calculation of adsorption energy
• 10 hours on our 32-core workstation (CH3 on the Cu monometallic surface)
• even longer (about 34 hours) for a system containing another metal such as Pb
ML prediction
• < 1 sec on our 1-core laptop
• depends not on the target system but on the method we choose
training sets (75%)
test sets (25%)

But what do these mean for catalyst design and discovery?

Standard procedure for optimizing the activity
All your
available
data
• Experiments
• Simulations
Hypothesis
generation
(abduction)
Check results
Feedback

Replace the time-consuming and costly part by ML
All your
available
data
Check the best
predicted ones
Feed them as
training data
Make predictions 
for many possible
candidates
Machine Learning
(any "data-driven"  
predictions)
The "Surrogate (or proxy)"
model for
• Demanding experiments
• Time-consuming high-fidelity simulations

This simple procedure
won't work in most
practical cases!

ML itself is not for "discovery"
x: Descriptors (high dimensional)
Activity
Training samples
Best known value

Activity
Training samples
Best known value
ML prediction
Fitted to minimize
the average error.

Nicely predicted for the average (but mediocre)

A nice "discovery" can deviate largely from the average of the knowns (an outlier).

ML captures the average trend of available knowns
"discovery" corresponds to something not in knowns
Mismatch

An ML model is just representative of the training data
Highly Inaccurate Model Predictions from
Extrapolation (Lohninger 1999)
"Beware of the perils of extrapolation,
and understand that ML algorithms
build models that are representative of
the available training samples."
"exploration": to obtain new knowledge/data (we also need this)
"exploitation": to use the knowledge/data to improve the performance (ML is basically for this)

Surrogate optimization (model-based optimization)
Activity
ML prediction
Use ML to guide the balance between "exploitation" and "exploration"!

"Uncertainty" of
prediction
e.g. prediction variance

"Uncertainty" of
prediction
e.g. 
"expected improvement"
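For reference, expected improvement has a well-known closed form when the prediction at a candidate is treated as Gaussian with mean μ and standard deviation σ: EI = (μ − f*) Φ(z) + σ φ(z) with z = (μ − f*) / σ, where f* is the best value observed so far (maximization). The sketch below implements that formula. The Gaussian assumption is introduced here for illustration; the slides do not state which distributional form is used, and the function names are hypothetical.

```python
import math

def normal_pdf(z):
    """Standard normal density φ(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal cumulative distribution Φ(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(mu, sigma, best):
    """Expected improvement over `best` for a Gaussian prediction (maximization).

    mu:    predicted mean at the candidate
    sigma: predictive standard deviation at the candidate
    best:  best value observed so far
    """
    if sigma <= 0.0:
        # No uncertainty: the improvement is deterministic.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    return (mu - best) * normal_cdf(z) + sigma * normal_pdf(z)

# A candidate predicted below the best known value can still be worth trying
# if its prediction is uncertain enough.
confident = expected_improvement(mu=0.5, sigma=0.1, best=1.0)
uncertain = expected_improvement(mu=0.5, sigma=1.0, best=1.0)
print(f"EI (sigma=0.1): {confident:.6f}")
print(f"EI (sigma=1.0): {uncertain:.6f}")
```

The first term rewards candidates predicted to be good (exploitation) and the second rewards candidates whose prediction is uncertain (exploration), which is how a single score balances the two.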

1. Initial Sampling (DoE)
2. Loop:
   1. Construct a Surrogate Model.
   2. Search the Infill Criterion.
   3. Add new samples. (intervention)
• Reinforcement learning
• Blackbox optimization
• Bayesian optimization
• Sequential design of experiments
• Multi-armed bandit
• Evolutionary computation
• Game-theoretic approaches
⋮
An Open Research Topic in ML
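The loop on this slide (initial sampling, then repeatedly constructing a surrogate, searching the infill criterion, and adding the new sample) can be sketched on a one-dimensional toy problem. Everything in this example is a stand-in chosen for brevity: `objective` is a hypothetical cheap function playing the role of an expensive experiment, the surrogate is a crude nearest-neighbor average whose uncertainty grows with distance to the nearest observation, and the infill criterion is Gaussian expected improvement.

```python
import math
import random

def objective(x):
    """Hypothetical 'expensive' experiment or simulation (toy stand-in)."""
    return math.sin(3.0 * x) * (1.0 - x) + 0.5 * x

def surrogate(x, X, y, k=3, scale=1.0):
    """Toy surrogate: mean of the k nearest observations, with an
    uncertainty that grows with the distance to the nearest observation."""
    nearest = sorted(zip(X, y), key=lambda pair: abs(pair[0] - x))[:k]
    mu = sum(value for _, value in nearest) / len(nearest)
    sigma = scale * abs(nearest[0][0] - x)
    return mu, sigma

def expected_improvement(mu, sigma, best):
    """Gaussian expected improvement over `best` (maximization)."""
    if sigma <= 0.0:
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf

def surrogate_optimization(candidates, n_init=4, n_iter=10, seed=0):
    rng = random.Random(seed)
    # 1. Initial sampling (here: uniformly at random from the candidates).
    X = rng.sample(candidates, n_init)
    y = [objective(x) for x in X]
    # 2. Loop.
    for _ in range(n_iter):
        best = max(y)
        pool = [c for c in candidates if c not in X]
        # 2.1 Construct the surrogate and 2.2 search the infill criterion.
        scores = [expected_improvement(*surrogate(c, X, y), best) for c in pool]
        x_new = pool[scores.index(max(scores))]
        # 2.3 Add the new sample (run the "experiment").
        X.append(x_new)
        y.append(objective(x_new))
    return X, y

candidates = [i / 100.0 for i in range(101)]
X, y = surrogate_optimization(candidates)
print(f"best after initial sampling: {max(y[:4]):.3f}")
print(f"best after the loop:         {max(y):.3f} at x = {X[y.index(max(y))]:.2f}")
```

In a real setting the surrogate would be a proper regression model with a calibrated uncertainty estimate, and each call to the objective would be an actual experiment or high-fidelity simulation.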

Structure-activity landscapes are nonsmooth...
J. Med. Chem. 2012, 55, 2932−2942
The structure-activity landscape can be often
nonsmooth. Small changes in descriptors can
largely affect the activity/selectivity.
Activity cliffs Selectivity cliffs

Batched + restricted surrogate optimization
Our optimization model for heterogeneous catalyst data:

• Nonsmooth interpolation (SMAC-like by tree ensemble regressors)
with multistart local search with L-BFGS + random samples

• Use EI as the infill criterion, estimated by OOB or quantile regression

• Get multiple samples in each loop
by batched optimization with "diversified" samples by clustering
(need some margin; in reality, it is not easy to realize the suggested catalysts)

• Pose several restrictions on new samples to be tested

• the sum of compositions equals 1 (compositional restriction)
• restrict the number of elements in a catalyst (bounded nonzeros)
(because a catalyst with 60 elements would not be realistic...)
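The two restrictions on new samples (compositions sum to 1, and only a bounded number of elements are nonzero) can be illustrated with a small candidate generator and a feasibility check. This is a sketch under assumptions: the element list is borrowed from the d-band center slides purely as an example, the limit of three elements is arbitrary, and `random_composition` and `is_feasible` are hypothetical names, not the authors' implementation.

```python
import random

ELEMENTS = ["Fe", "Co", "Ni", "Cu", "Ru", "Rh", "Pd", "Ag", "Ir", "Pt", "Au"]

def random_composition(elements, max_nonzero, rng):
    """Draw a composition with at most max_nonzero elements that sums to 1."""
    k = rng.randint(1, max_nonzero)
    chosen = rng.sample(elements, k)
    # Normalized exponential draws are uniform on the simplex.
    weights = [rng.expovariate(1.0) for _ in chosen]
    total = sum(weights)
    composition = {element: 0.0 for element in elements}
    for element, weight in zip(chosen, weights):
        composition[element] = weight / total
    return composition

def is_feasible(composition, max_nonzero, tol=1e-9):
    """Check the compositional and bounded-nonzeros restrictions."""
    values = list(composition.values())
    nonzero = sum(1 for v in values if v > 0.0)
    return (all(v >= 0.0 for v in values)
            and abs(sum(values) - 1.0) <= tol
            and 1 <= nonzero <= max_nonzero)

rng = random.Random(0)
samples = [random_composition(ELEMENTS, 3, rng) for _ in range(100)]
example = {e: round(v, 3) for e, v in samples[0].items() if v > 0.0}
print("example candidate:", example)
print("all feasible:", all(is_feasible(s, 3) for s in samples))
```

Generating only feasible candidates (instead of optimizing freely and repairing afterwards) keeps every suggested catalyst within the restricted search space by construction.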

Case 3. Predicting the catalytic activity (in prep)
• Oxidative coupling of methane (OCM)  
[Zavyalova+ 2011]
• Water gas shift reaction (WGS)  
[Odabaşi+ 2014]
• CO oxidation [Günay+ 2013]
Test on 3 Datasets
Our model
GPR-based BO
Random

ICReDD, Hokkaido University
Check the website for any collaborations and postdoc positions!
Our mission: 
To rationally design and discover
new chemical reactions  
by seamlessly fusing
• experimental sciences (realization)
• computational sciences (theory-driven)
• information sciences (data-driven)
started Oct 2018, funded $ 6.4 million per year by government (for 10 years)
Sapporo
Tokyo
HOKKAIDO
• 2 million population 
(5th largest city in Japan)
• 6.3m / 248 inches 
avg. annual snowfall
Institute for Chemical Reaction Design and Discovery
(WPI-ICReDD), Hokkaido University

Summary
• Predicting the d-band centers by ML 
(Takigawa et al, RSC Advances. 2016)
• Predicting the adsorption energy by ML 
(Toyao et al, JPC C 2018)
• Predicting the experimentally-reported catalytic
activity by ML 
(Suzuki et al, in preparation)
Acknowledgements
Ken-ichi
SHIMIZU 
(ICAT)
Satoru
TAKAKUSAGI 
(ICAT)
Takashi
TOYAO 
(ICAT)
Keisuke 
SUZUKI 
(DENSO)
