Machine Learning and Surrogate Optimization
on Heterogeneous Catalysts
Ichigaku Takigawa
2019 PRESTO International Symposium on Materials Informatics

Feb 9-11, 2019 @ Tokyo
Graduate School of Information Science and Technology, Hokkaido University
Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University
Heterogeneous catalysts and surface reactions
Wolfgang Pauli
“God made the bulk; the surface was invented by the devil.”
adsorption
diffusion
desorption
dissociation
recombination
kinks
terraces
adatom
vacancy
steps
Many hard-to-quantify factors complicate their atomic-level characterization by modelling and experiments.
• reaction conditions
• composition
• support
• surface termination
• particle size & morphology
• atomic coordination environment
• disordered/amorphous structures

in their active state

:
GAS
(Reactants)
SOLID
(Catalysts)
Hi-fidelity simulations are too time-consuming...
Then how can we characterize the catalytic activity?
K. Shimizu et al, ACS Catal. 2, 1904 (2012)
[Figure: reaction rates vs. d-band center (εd − EF) / eV, two panels]
Hammer–Nørskov d-band model: Volcano trends!
[Figure: activation energy / eV vs. adsorption energy / eV]
Brønsted–Evans–Polanyi relation: Linear trends!
The d-electrons of transition metals govern...
Several DFT-calculated indexes capture the trend to some extent...
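For reference, the d-band center is commonly taken as the first moment of the d-projected density of states, measured from the Fermi level. The sketch below computes it by the trapezoidal rule. The function name, the integration over the whole supplied grid, and the triangular toy DOS are assumptions made for illustration, not values or code from the cited studies.

```python
def d_band_center(energies, dos, fermi_energy=0.0):
    """First moment of a d-projected DOS, relative to the Fermi level (eV).

    energies: sorted energy grid in eV; dos: d-projected DOS on that grid.
    """
    def trapezoid(values):
        # Trapezoidal rule on the (possibly non-uniform) energy grid.
        total = 0.0
        for i in range(len(energies) - 1):
            width = energies[i + 1] - energies[i]
            total += 0.5 * (values[i] + values[i + 1]) * width
        return total

    weighted = [e * rho for e, rho in zip(energies, dos)]
    return trapezoid(weighted) / trapezoid(dos) - fermi_energy


# Made-up example: a symmetric triangular d-band centred at -2 eV.
energies = [-4.0, -3.0, -2.0, -1.0, 0.0]
dos = [0.0, 1.0, 2.0, 1.0, 0.0]
center = d_band_center(energies, dos, fermi_energy=0.0)
print(center)
```

In practice the DOS would come from a DFT calculation, which is the expensive step that the ML models in this talk try to bypass.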
Outline: Our ML-based studies
1. Can we predict the d-band center?
   predicting DFT-calculated values by machine learning (Takigawa et al, RSC Advances 2016)
2. Can we predict the adsorption energy?
   predicting DFT-calculated values by machine learning (Toyao et al, JPC C 2018)
3. Can we predict the catalytic activity?
   predicting values from experiments reported in the literature by machine learning (Suzuki et al, in preparation)
Case 1. Predicting the d-band centers
Ruban, Hammer, Stoltze, Skriver, Nørskov, J Mol Catal A, 115:421-429 (1997)
J. K. Nørskov, et al., Advances in Catalysis, 2000
[Figure labels: Host, Guest]
Two types of models
• 1% doped
• overlayer
[1% doped]
The d-bands of
transition metals
play central roles.
The beauty of the periodic table worked!
Fe Co Ni Cu Ru Rh Pd Ag Ir Pt Au
Fe -0.92 -0.96 -0.97 -1.65 -1.64 -2.24 -1.87 -2.4 -3.11
Co -1.37 -1.23 -2.12 -2.82 -2.53 -2.26 -3.56
Ni -0.33 -1.18 -1.92 -2.03 -2.43 -2.15 -2.82 -3.39
Cu -2.42 -2.49 -2.67 -2.89 -2.94 -3.82 -4.63
Ru -1.11 -1.04 -1.12 -1.41 -1.88 -1.81 -1.54 -2.27
Rh -1.42 -1.32 -1.51 -1.7 -1.73 -2.12 -1.81 -1.7 -2.18 -2.3
Pd -1.47 -1.29 -1.29 -1.03 -1.58 -1.83 -1.68 -1.52 -1.79
Ag -3.75 -3.56 -3.62 -3.8 -4.03 -3.5 -3.93 -4.51
Ir -1.78 -1.71 -1.78 -1.55 -2.14 -2.53 -2.2 -2.11 -2.6 -2.7
Pt -1.71 -1.47 -2.13 -2.01 -2.23 -2.06 -1.96 -2.33
Au -3.03 -2.82 -2.85 -2.89 -3.44 -3.56
Fe Co Ni Cu Ru Rh Pd Ag Ir Pt Au
Fe -0.78 -1.65 -1.64 -1.87
Co -1.18 -1.17 -1.37 -1.87 -2.12 -2.82 -2.26
Ni -0.33 -1.18 -1.17 -2.61 -2.43 -2.15 -2.82
Cu -2.42 -2.89 -2.94 -3.88 -4.63
Ru -1.11 -1.04 -1.12 -1.11 -1.41 -1.81 -2.27
Rh -1.42 -1.51 -2.12 -1.81 -1.7
Pd -1.29 -1.29 -1.03 -1.58 -1.83 -1.52 -1.79
Ag -3.68 -3.8 -3.63 -4.51
Ir -2.14 -2.11 -2.7
Pt -1.71 -1.47 -2.13 -2.01 -2.23 -2.06
Au -2.86 -3.09 -2.89 -3.44 -3.56
Fe Co Ni Cu Ru Rh Pd Ag Ir Pt Au
Fe -2.17 -3.11
Co -1.17 -1.37 -2.12
Ni -0.33 -1.18 -2.61 -2.43
Cu -2.42 -2.29 -2.49 -3.71 -4.63
Ru -2.02
Rh -1.32 -1.73 -2.12
Pd -1.94 -1.83 -1.97
Ag -3.75 -3.68 -4.51
Ir -1.78 -1.71 -2.7
Pt -2.13
Au -3.09 -2.89
training sets (75%) / test sets (25%): gradient boosting w/ 6 descriptors, 100 times, mean RMSE: 0.153 eV
training sets (50%) / test sets (50%): gradient boosting w/ 6 descriptors, 100 times, mean RMSE: 0.235 eV
training sets (25%) / test sets (75%): gradient boosting w/ 6 descriptors, 100 times, mean RMSE: 0.402 eV
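The evaluation protocol above (random train/test splits repeated 100 times, reporting the mean RMSE) can be sketched as follows. This is a minimal illustration, not the authors' code: `repeated_split_rmse`, the placeholder `fit_mean` model, and the toy data are all assumptions, and any regressor with the same `fit` interface could be plugged in.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def repeated_split_rmse(X, y, fit, test_fraction, n_repeats=100, seed=0):
    """Mean test RMSE over n_repeats random train/test splits.

    fit(X_train, y_train) must return a function mapping one x to a prediction.
    """
    rng = random.Random(seed)
    n_test = max(1, round(test_fraction * len(X)))
    scores = []
    for _ in range(n_repeats):
        idx = list(range(len(X)))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        scores.append(rmse([y[i] for i in test], [predict(X[i]) for i in test]))
    return sum(scores) / len(scores)


def fit_mean(X_train, y_train):
    """Placeholder model (assumption): always predicts the training mean."""
    mean = sum(y_train) / len(y_train)
    return lambda x: mean


# Toy data standing in for (descriptors, d-band center) pairs.
X = [[float(i)] for i in range(40)]
y = [0.1 * i for i in range(40)]
for frac in (0.25, 0.50, 0.75):
    print(frac, repeated_split_rmse(X, y, fit_mean, test_fraction=frac))
```

Averaging over many random splits reduces the dependence of the reported error on one lucky or unlucky partition, which matters for a data set this small.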
The ML model
• Group (G)
• Bulk Wigner–Seitz radius (R) in Å
• Atomic number (AN)
• Atomic mass (AM) in g mol⁻¹
• Period (P)
• Electronegativity (EN)
• Ionization energy (IE) in eV
• Enthalpy of fusion (∆fusH) in J g⁻¹
• Density at 25 ℃ (ρ) in g cm⁻³
Readily available
9 descriptors pretested for host & guest (18 in total)
Gradient Boosted Tree Regression (GBR) with only 6 descriptors
(1) Group in the periodic table (host)
(2) Density at 25 ℃ (host)
(3) Enthalpy of fusion (guest)
(4) Ionization energy (guest)
(5) Enthalpy of fusion (host)
(6) Ionization energy (host)
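Building a feature vector from host and guest descriptors can be sketched as below. To avoid quoting property values that are not in the slides, this illustration uses only three of the nine descriptors (group, period, atomic number), which follow directly from the periodic table; the names `ELEMENTS` and `featurize` are hypothetical, and the study's selected six descriptors differ from this subset.

```python
# (group, period, atomic number) for the 11 metals in the tables above.
ELEMENTS = {
    "Fe": (8, 4, 26), "Co": (9, 4, 27), "Ni": (10, 4, 28), "Cu": (11, 4, 29),
    "Ru": (8, 5, 44), "Rh": (9, 5, 45), "Pd": (10, 5, 46), "Ag": (11, 5, 47),
    "Ir": (9, 6, 77), "Pt": (10, 6, 78), "Au": (11, 6, 79),
}
DESCRIPTOR_NAMES = ("G", "P", "AN")


def featurize(host, guest):
    """Concatenate host descriptors and guest descriptors into one vector."""
    return list(ELEMENTS[host]) + list(ELEMENTS[guest])


def feature_names():
    """Column labels matching the layout produced by featurize()."""
    return ([f"{name} (host)" for name in DESCRIPTOR_NAMES]
            + [f"{name} (guest)" for name in DESCRIPTOR_NAMES])


# One row per host/guest pair: 11 x 11 = 121 candidate systems.
pairs = [(h, g) for h in ELEMENTS for g in ELEMENTS]
features = [featurize(h, g) for h, g in pairs]
print(feature_names())
print(featurize("Pt", "Cu"))
```

With all nine descriptors the same concatenation would give the 18 features mentioned above.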
11 ML methods pretested
[3 Tree Ensembles (Nonlinear Regression Models)]
GBR (Gradient Boosted Tree Regression); ETR (Extra-Trees Regression); RFR (Random Forest Regression)
[3 Kernel Methods (Nonlinear Regression Models)]
GPR (Gaussian Process Regression); KRR (Kernel Ridge Regression); SVR (Support Vector Regression)
[5 Linear Regression Models]
OLS (Ordinary Least Squares Regression); PLS (Partial Least Squares Regression); LASSO (Lasso Regression); RIDGE (Ridge Regression); RANSAC (Random Sample Consensus Regression)
[Figure: comparison of the 11 methods; axis labels: GBR, ETR, GPR, RFR, KRR, OLS, RIDGE, PLS, RANSAC, SVR, LASSO]
training sets (75%)
test sets (25%)
Tree ensemble regressors (GBR, ETR, RFR)
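[Figure: the surface $y = \sin\left(\sqrt{x_1^2 + x_2^2}\right)$ over $(x_1, x_2)$ and its approximations $\hat{y}$]
Decision Tree (Regression Tree): $\hat{y} \approx y$, a piecewise-constant surface with region values $c_1, c_2, c_3, \dots$
Tree Ensemble: $\hat{y} \approx y$, a sum of trees (= + + + ...)
• Region-wise constant prediction
• The regions are given by recursive axis-parallel partitioning of the data space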
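The two properties above (region-wise constant prediction, recursive axis-parallel partitioning) can be made concrete with a small regression tree written from scratch. This is a teaching sketch under simple assumptions (exhaustive midpoint thresholds, squared-error criterion, fixed depth), not how production libraries implement trees; all function names are hypothetical.

```python
import math
import random


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def fit_tree(X, y, depth):
    """Recursive axis-parallel partitioning; each leaf stores one constant."""
    mean = sum(y) / len(y)
    if depth == 0 or len(y) < 2:
        return ("leaf", mean)
    best = None  # (error, feature index, threshold)
    for j in range(len(X[0])):
        values = sorted(set(x[j] for x in X))
        for lo, hi in zip(values, values[1:]):
            t = 0.5 * (lo + hi)
            left = [yi for x, yi in zip(X, y) if x[j] <= t]
            right = [yi for x, yi in zip(X, y) if x[j] > t]
            if not left or not right:
                continue
            err = sse(left) + sse(right)
            if best is None or err < best[0]:
                best = (err, j, t)
    if best is None:  # no valid split (e.g. all inputs identical)
        return ("leaf", mean)
    _, j, t = best
    lo_part = [(x, yi) for x, yi in zip(X, y) if x[j] <= t]
    hi_part = [(x, yi) for x, yi in zip(X, y) if x[j] > t]
    return ("split", j, t,
            fit_tree([x for x, _ in lo_part], [yi for _, yi in lo_part], depth - 1),
            fit_tree([x for x, _ in hi_part], [yi for _, yi in hi_part], depth - 1))


def predict(tree, x):
    """Follow the splits down to a leaf and return its constant."""
    while tree[0] == "split":
        _, j, t, left, right = tree
        tree = left if x[j] <= t else right
    return tree[1]


def train_mse(tree, X, y):
    """Mean squared error of the tree on the given data."""
    return sum((predict(tree, x) - yi) ** 2 for x, yi in zip(X, y)) / len(y)


# Samples of the surface y = sin(sqrt(x1^2 + x2^2)) shown in the figure.
rng = random.Random(0)
X = [[rng.uniform(-3, 3), rng.uniform(-3, 3)] for _ in range(200)]
y = [math.sin(math.sqrt(x1 ** 2 + x2 ** 2)) for x1, x2 in X]
shallow, deeper = fit_tree(X, y, depth=1), fit_tree(X, y, depth=4)
print(train_mse(shallow, X, y), train_mse(deeper, X, y))
```

A tree of depth d can output at most 2^d distinct values, which is why a single tree gives a coarse staircase and ensembles of many trees are used to smooth it.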
Tree ensemble regressors (GBR, ETR, RFR)
Advantages
• quick, nonlinear, parallelizable
• highly accurate (widely used in many winning
solutions for data prediction competitions)
• usually less hyperparameter dependent 

(compared to kernel methods and neural networks)
• conservative extrapolation
• "variable importance" provided
• popular implementations
• Scikit-learn
• XGBoost (by DMLC)
• LightGBM (by Microsoft)
How to generate multiple decision trees?
• RFR / ETR: random patches (random subsampling of instances and variables) or random splits
• GBR (can also be mixed with the above strategy): sequentially add a new tree to compensate for the weak points of the current ensemble.
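The sequential strategy of GBR can be sketched for squared loss, where "compensating for the weak points" means fitting each new tree to the current residuals. This minimal version uses one-split trees (stumps) on a scalar input and a fixed learning rate; these choices and all names are assumptions for illustration, and libraries such as scikit-learn, XGBoost, or LightGBM add many refinements.

```python
import math


def fit_stump(xs, residuals):
    """Best single-threshold split of 1-D inputs under squared error."""
    best = None  # (error, threshold, left constant, right constant)
    values = sorted(set(xs))  # assumes at least two distinct inputs
    for lo, hi in zip(values, values[1:]):
        t = 0.5 * (lo + hi)
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        cl, cr = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - cl) ** 2 for r in left) + sum((r - cr) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, cl, cr)
    _, t, cl, cr = best
    return lambda x: cl if x <= t else cr


def fit_gbr(xs, ys, n_trees, learning_rate=0.1):
    """Gradient boosting for squared loss: each stump fits the residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


def train_mse(model, xs, ys):
    """Mean squared error of the model on the training data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Toy 1-D regression problem (made up for illustration).
xs = [i / 20 for i in range(61)]
ys = [math.sin(2 * x) for x in xs]
mse_by_size = [train_mse(fit_gbr(xs, ys, n), xs, ys) for n in (1, 10, 50)]
print(mse_by_size)
```

The training error shrinks as trees are added; in practice the number of trees and the learning rate are tuned on held-out data to avoid overfitting.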
Descriptor analysis and evaluation
GBR with 18 descriptors: 100 times mean RMSE: 0.204±0.047 eV
→ Descriptor Importances → Descriptor Selection (top-k)
GBR with 6 descriptors: 100 times mean RMSE: 0.212±0.047 eV
GBR with 4 descriptors: 100 times mean RMSE: 0.214±0.046 eV
training sets (75%) / test sets (25%)
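Ranking descriptors and keeping the top k can be sketched as follows. Note the substitution: tree ensembles usually report an impurity-based "variable importance", whereas this illustration uses permutation importance (how much the error grows when one column is shuffled) because it works with any fitted predictor. The toy model, descriptor names, and data are made up.

```python
import random


def mse(predict, X, y):
    """Mean squared error of a predictor on (X, y)."""
    return sum((predict(x) - yi) ** 2 for x, yi in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, seed=0):
    """Increase in MSE when each column is shuffled independently."""
    rng = random.Random(seed)
    baseline = mse(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        column = [x[j] for x in X]
        rng.shuffle(column)
        X_perm = [x[:j] + [v] + x[j + 1:] for x, v in zip(X, column)]
        importances.append(mse(predict, X_perm, y) - baseline)
    return importances


def top_k(names, importances, k):
    """Names of the k descriptors with the largest importance."""
    ranked = sorted(zip(importances, names), reverse=True)
    return [name for _, name in ranked[:k]]


def toy_model(x):
    """Hypothetical fitted model: strong use of d0, weak use of d1, none of d2."""
    return 3.0 * x[0] + 0.5 * x[1]


rng = random.Random(1)
X = [[rng.random(), rng.random(), rng.random()] for _ in range(50)]
y = [toy_model(x) for x in X]
names = ["d0", "d1", "d2"]
importances = permutation_importance(toy_model, X, y)
print(top_k(names, importances, k=2))
```

After selection, the model is refitted on the reduced descriptor set and re-evaluated, as in the 18 / 6 / 4 descriptor comparison above.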
Case 2. Predicting the adsorption energy
Predicting adsorption energy of CH3
DFT calculation of adsorption energy
• about 10 hours on our 32-core workstation (CH3 on the Cu monometallic surface)
• even longer (about 34 hours) for a system containing another metal such as Pb
ML prediction
• < 1 sec on our 1-core laptop
• the time depends not on the target system but on the method we choose
training sets (75%)
test sets (25%)
But what do these mean for catalyst design and discovery?
Standard procedure for optimizing the activity
All your
available
data
• Experiments
• Simulations
Hypothesis
generation
(abduction)
Check results
Feedback
Replace the time-consuming and costly part with ML
All your available data → Machine Learning (any "data-driven" predictions) → Make predictions for many possible candidates → Check the best predicted ones → Feed them as training data
The "Surrogate (or proxy)" model for
• Demanding experiments
• Time-consuming hi-fidelity simulations
This simple procedure
won't work in most
practical cases!
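The simple loop in question (predict for all candidates, test the best-predicted ones, feed the results back) can be written in a few lines. In this sketch the surrogate is a nearest-neighbour predictor and `hidden_activity` stands in for an experiment or high-fidelity simulation; both are assumptions chosen only to keep the example self-contained.

```python
def fit_nearest(xs, ys):
    """Stand-in surrogate: predict the value of the nearest known point."""
    def predict(x):
        nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x))
        return ys[nearest]
    return predict


def greedy_loop(candidates, expensive_eval, initial, n_rounds, batch=1):
    """Refit, test the top-predicted untested candidates, feed them back."""
    known = {x: expensive_eval(x) for x in initial}
    for _ in range(n_rounds):
        predict = fit_nearest(list(known), list(known.values()))
        untested = [x for x in candidates if x not in known]
        if not untested:
            break
        untested.sort(key=predict, reverse=True)  # best prediction first
        for x in untested[:batch]:
            known[x] = expensive_eval(x)
    return known


def hidden_activity(x):
    """Made-up stand-in for a costly experiment or simulation."""
    return -(x - 14) ** 2


candidates = list(range(21))
known = greedy_loop(candidates, hidden_activity, initial=[2, 5], n_rounds=6)
print(sorted(known))
```

The loop only ever follows the current prediction, so what it tests next is decided entirely by what the model has already seen.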
ML itself is not for "discovery"
[Figure: Activity vs. descriptors x (high dimensional), showing the training samples, the best known value, the ML prediction, and an outlier]
ML prediction: fitted to minimize the average error.
Nicely predicted for the average (but mediocre)
A nice "discovery" can deviate largely from the average of knowns (outlier)
ML captures the average trend of available knowns
"discovery" corresponds to something not in knowns
Mismatch
An ML model is just representative of the training data
Highly Inaccurate Model Predictions from
Extrapolation (Lohninger 1999)
"Beware of the perils of extrapolation,
and understand that ML algorithms
build models that are representative of
the available training samples."
"exploration": to obtain new knowledge/data (we also need this)
"exploitation": to use the knowledge/data to improve the performance (ML is basically for this)
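The extrapolation warning, and hence the need for exploration, can be illustrated with a deliberately simple case: a straight line fitted to samples of a curved function looks fine inside the training range and fails badly outside it. The quadratic ground truth and the evaluation point are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def truth(x):
    """Made-up ground truth, unknown to the model."""
    return x ** 2


train_x = [i / 10 for i in range(11)]  # training range: 0 <= x <= 1
slope, intercept = fit_line(train_x, [truth(x) for x in train_x])


def model(x):
    """The fitted line."""
    return slope * x + intercept


worst_inside = max(abs(model(x) - truth(x)) for x in train_x)
error_outside = abs(model(3.0) - truth(3.0))  # far outside the training range
print(worst_inside, error_outside)
```

Nothing in the training error reveals the problem; only new data from the unexplored region can.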
Surrogate optimization (model-based optimization)
[Figure: Activity vs. descriptors x (high dimensional), showing the ML prediction and the "uncertainty" of prediction (e.g. prediction variance)]
Use ML to guide the balance between "exploitation" and "exploration"!
Surrogate optimization (model-based optimization)
x<latexit sha1_base64="BLB8K/n7QYAsE73zsDEUiBvCSV8=">AAAB/XicbVDLSgMxFL2pr1pfVZdugkVwVWZE0GXRjcsK9gHtUDJppo1NMkOSEctQ/AW3uncnbv0Wt36JaTsLbT1w4XDOvZzLCRPBjfW8L1RYWV1b3yhulra2d3b3yvsHTROnmrIGjUWs2yExTHDFGpZbwdqJZkSGgrXC0fXUbz0wbXis7uw4YYEkA8UjTol1UrMbyuxx0itXvKo3A14mfk4qkKPeK393+zFNJVOWCmJMx/cSG2REW04Fm5S6qWEJoSMyYB1HFZHMBNns2wk+cUofR7F2oyyeqb8vMiKNGcvQbUpih2bRm4r/eqFcSLbRZZBxlaSWKToPjlKBbYynVeA+14xaMXaEUM3d75gOiSbUusJKrhR/sYJl0jyr+l7Vvz2v1K7yeopwBMdwCj5cQA1uoA4NoHAPz/ACr+gJvaF39DFfLaD85hD+AH3+ADzJlfc=</latexit><latexit sha1_base64="BLB8K/n7QYAsE73zsDEUiBvCSV8=">AAAB/XicbVDLSgMxFL2pr1pfVZdugkVwVWZE0GXRjcsK9gHtUDJppo1NMkOSEctQ/AW3uncnbv0Wt36JaTsLbT1w4XDOvZzLCRPBjfW8L1RYWV1b3yhulra2d3b3yvsHTROnmrIGjUWs2yExTHDFGpZbwdqJZkSGgrXC0fXUbz0wbXis7uw4YYEkA8UjTol1UrMbyuxx0itXvKo3A14mfk4qkKPeK393+zFNJVOWCmJMx/cSG2REW04Fm5S6qWEJoSMyYB1HFZHMBNns2wk+cUofR7F2oyyeqb8vMiKNGcvQbUpih2bRm4r/eqFcSLbRZZBxlaSWKToPjlKBbYynVeA+14xaMXaEUM3d75gOiSbUusJKrhR/sYJl0jyr+l7Vvz2v1K7yeopwBMdwCj5cQA1uoA4NoHAPz/ACr+gJvaF39DFfLaD85hD+AH3+ADzJlfc=</latexit><latexit sha1_base64="BLB8K/n7QYAsE73zsDEUiBvCSV8=">AAAB/XicbVDLSgMxFL2pr1pfVZdugkVwVWZE0GXRjcsK9gHtUDJppo1NMkOSEctQ/AW3uncnbv0Wt36JaTsLbT1w4XDOvZzLCRPBjfW8L1RYWV1b3yhulra2d3b3yvsHTROnmrIGjUWs2yExTHDFGpZbwdqJZkSGgrXC0fXUbz0wbXis7uw4YYEkA8UjTol1UrMbyuxx0itXvKo3A14mfk4qkKPeK393+zFNJVOWCmJMx/cSG2REW04Fm5S6qWEJoSMyYB1HFZHMBNns2wk+cUofR7F2oyyeqb8vMiKNGcvQbUpih2bRm4r/eqFcSLbRZZBxlaSWKToPjlKBbYynVeA+14xaMXaEUM3d75gOiSbUusJKrhR/sYJl0jyr+l7Vvz2v1K7yeopwBMdwCj5cQA1uoA4NoHAPz/ACr+gJvaF39DFfLaD85hD+AH3+ADzJlfc=</latexit><latexit 
sha1_base64="BLB8K/n7QYAsE73zsDEUiBvCSV8=">AAAB/XicbVDLSgMxFL2pr1pfVZdugkVwVWZE0GXRjcsK9gHtUDJppo1NMkOSEctQ/AW3uncnbv0Wt36JaTsLbT1w4XDOvZzLCRPBjfW8L1RYWV1b3yhulra2d3b3yvsHTROnmrIGjUWs2yExTHDFGpZbwdqJZkSGgrXC0fXUbz0wbXis7uw4YYEkA8UjTol1UrMbyuxx0itXvKo3A14mfk4qkKPeK393+zFNJVOWCmJMx/cSG2REW04Fm5S6qWEJoSMyYB1HFZHMBNns2wk+cUofR7F2oyyeqb8vMiKNGcvQbUpih2bRm4r/eqFcSLbRZZBxlaSWKToPjlKBbYynVeA+14xaMXaEUM3d75gOiSbUusJKrhR/sYJl0jyr+l7Vvz2v1K7yeopwBMdwCj5cQA1uoA4NoHAPz/ACr+gJvaF39DFfLaD85hD+AH3+ADzJlfc=</latexit>
Descriptors (High dimensional)
Activity ML prediction <latexit sha1_base64="0VEGB1BS2t8KmbZWf3FuR1QlwM8=">AAACrnichVFLLwNRFP6MV72LjcSm0RA2zZmiWithY+nVIjTNzLitiXllZtqg8QdsLSywILEQP8PGH7DwE8SSxMbCmemIWJRzc+899zvnO/e796iOoXs+0XOL1NrW3tEZ6+ru6e3rH4gPDhU8u+pqIq/Zhu1uqYonDN0SeV/3DbHluEIxVUNsqgdLQXyzJlxPt60N/8gRRVOpWHpZ1xSfoe3y5K5q1g9PpkrxJKVy2QzNpBOUIsqmKcPOLMk5OZeQGQksichW7PgjdrEHGxqqMCFgwWffgAKPxw5kEBzGiqgz5rKnh3GBE3Qzt8pZgjMURg94rfBpJ0ItPgc1vZCt8S0GT5eZCYzTE93RGz3SPb3QZ9Na9bBGoOWId7XBFU5p4HRk/eNflsm7j/0f1p+afZSRDbXqrN0JkeAVWoNfOz5/W59fG69P0A29sv5reqYHfoFVe9duV8XaxR96VNbS/MeCeJTBLfzuU6K5U0in5OkUrc4kFxajZsYwijFMcsfmsIBlrCDPN5g4wyWuJJIKUlEqNVKllogzjF8m7X8BK8iaxA==</latexit>
"Uncertainty" of
prediction
e.g. prediction variance
e.g.

"expected improvement"
Use ML to guide the balance between "exploitation" and "exploration"!
Surrogate optimization (model-based optimization)
x: Descriptors (high-dimensional)
Activity: ML prediction
"Uncertainty" of prediction, e.g. prediction variance
Infill criterion, e.g. "expected improvement"
1. Initial sampling (DoE)
2. Loop:
   2.1 Construct a surrogate model.
   2.2 Search the infill criterion.
   2.3 Add new samples (intervention).
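For reference, the "expected improvement" (EI) infill criterion has a standard closed form when the surrogate's prediction at x is approximated as Gaussian. The Gaussian approximation, the maximization setting, and the symbols μ(x) (predicted activity), σ(x) (prediction uncertainty), and f_best (best activity observed so far) are assumptions made for this illustration; the slides do not spell them out.

```latex
\mathrm{EI}(x)
  = \mathbb{E}\bigl[\max\{f(x) - f_{\mathrm{best}},\, 0\}\bigr]
  = \bigl(\mu(x) - f_{\mathrm{best}}\bigr)\,\Phi(z) + \sigma(x)\,\phi(z),
\qquad
z = \frac{\mu(x) - f_{\mathrm{best}}}{\sigma(x)}
```

Here Φ and φ are the standard normal CDF and PDF. The first term rewards exploitation (a high predicted activity) and the second rewards exploration (a high uncertainty). When σ(x) = 0 the criterion reduces to max{μ(x) − f_best, 0}.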
An open research topic in ML:
• Reinforcement learning
• Blackbox optimization
• Bayesian optimization
• Sequential design of experiments
• Multi-armed bandit
• Evolutionary computation
• Game-theoretic approaches
• etc.
Use ML to guide the balance between "exploitation" and "exploration"!
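The loop on this slide (initial sampling, then repeatedly fit a surrogate, search the infill criterion, and add a new sample) can be sketched in a few lines of Python. This is only a toy sketch under stated assumptions, not the model used in the talk: `activity` is a hypothetical one-dimensional objective with a cliff, a bagged nearest-neighbour ensemble stands in for a tree ensemble, uniform random sampling replaces a designed experiment, and EI is computed under a Gaussian approximation.

```python
import math
import random


def activity(x):
    """Hypothetical nonsmooth 'activity' with a cliff at x = 0.6 (toy stand-in)."""
    return (1.0 if x > 0.6 else 0.2) - (x - 0.8) ** 2


def fit_ensemble(xs, ys, n_members=30, rng=random):
    """Bagging: each member is a bootstrap resample used as a 1-nearest-neighbour predictor."""
    n = len(xs)
    members = []
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]
        members.append([(xs[i], ys[i]) for i in idx])
    return members


def predict(members, x):
    """Return the mean and standard deviation of the member predictions at x."""
    preds = [min(m, key=lambda p: abs(p[0] - x))[1] for m in members]
    mu = sum(preds) / len(preds)
    var = sum((p - mu) ** 2 for p in preds) / len(preds)
    return mu, math.sqrt(var)


def expected_improvement(mu, sigma, best):
    """EI for maximization under a Gaussian approximation of the prediction."""
    if sigma <= 1e-12:
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mu - best) * cdf + sigma * pdf


def surrogate_optimize(n_init=5, n_iter=15, n_candidates=100, seed=0):
    rng = random.Random(seed)
    # Initial sampling (uniform random here, instead of a designed experiment).
    xs = [rng.random() for _ in range(n_init)]
    ys = [activity(x) for x in xs]
    for _ in range(n_iter):
        # Construct a surrogate model from all data collected so far.
        members = fit_ensemble(xs, ys, rng=rng)
        best = max(ys)
        # Search the infill criterion over random candidate points.
        candidates = [rng.random() for _ in range(n_candidates)]
        x_new = max(
            candidates,
            key=lambda x: expected_improvement(*predict(members, x), best),
        )
        # Add the new sample (the "intervention": run the experiment).
        xs.append(x_new)
        ys.append(activity(x_new))
    i_best = max(range(len(ys)), key=ys.__getitem__)
    return xs[i_best], ys[i_best], xs, ys


best_x, best_y, xs, ys = surrogate_optimize()
print(f"best x = {best_x:.3f}, activity = {best_y:.3f}")
```

Candidates where the ensemble members disagree get a large σ and can win on EI even with a modest mean, which is how the criterion balances exploration against exploitation.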
Structure-activity landscapes are nonsmooth...
J. Med. Chem. 2012, 55, 2932−2942
The structure-activity landscape is often nonsmooth: small changes in descriptors can strongly affect the activity or selectivity.
Figure panels: activity cliffs; selectivity cliffs
Batched + restricted surrogate optimization
Our optimization model for heterogeneous catalyst data:
• Nonsmooth interpolation (SMAC-like, by tree ensemble regressors), with multistart local search using L-BFGS + random samples
• Use EI as the infill criterion, estimated by OOB or quantile regression
• Get multiple samples in each loop by batched optimization, with samples "diversified" by clustering (some margin is needed; in reality, it is not easy to realize the suggested catalysts)
• Pose several restrictions on new samples to be tested:
  • the sum of compositions equals 1 (compositional restriction)
  • restrict the number of elements in a catalyst (bounded nonzeros), because a catalyst with 60 elements would not be realistic...
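The two restrictions on new samples can be illustrated with a small Python sketch. The slides do not say how the restrictions are enforced, so the approach here (keep the largest weights, zero the rest, renormalize) is only one simple possibility. The function names, the `max_elements` parameter, and the example elements are hypothetical.

```python
def restrict_composition(raw, max_elements=3):
    """Map nonnegative raw weights to a composition with at most
    `max_elements` nonzero fractions that sum to 1."""
    if max_elements < 1:
        raise ValueError("max_elements must be at least 1")
    clipped = {el: max(w, 0.0) for el, w in raw.items()}
    # Bounded nonzeros: keep only the largest `max_elements` weights.
    kept = sorted(clipped, key=clipped.get, reverse=True)[:max_elements]
    total = sum(clipped[el] for el in kept)
    if total <= 0.0:
        raise ValueError("at least one weight must be positive")
    # Compositional restriction: renormalize so the fractions sum to 1.
    return {el: (clipped[el] / total if el in kept else 0.0) for el in clipped}


def is_feasible(comp, max_elements=3, tol=1e-9):
    """Check both restrictions: fractions sum to 1 and few enough nonzeros."""
    values = list(comp.values())
    nonzero = sum(1 for v in values if v > 0.0)
    return (
        all(v >= 0.0 for v in values)
        and abs(sum(values) - 1.0) <= tol
        and nonzero <= max_elements
    )


# Hypothetical five-element suggestion from the optimizer.
candidate = {"Mn": 0.5, "Na": 0.3, "W": 0.15, "La": 0.04, "Li": 0.01}
comp = restrict_composition(candidate, max_elements=3)
print(comp, is_feasible(comp))
```

Projecting after the search is the simplest option; restricting the candidate generator itself so that it only proposes feasible compositions would be an alternative design.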
Case 3. Predicting the catalytic activity (in prep)
Test on 3 datasets:
• Oxidative coupling of methane (OCM) [Zavyalova+ 2011]
• Water gas shift reaction (WGS) [Odabaşi+ 2014]
• CO oxidation [Günay+ 2013]
Methods compared (figure legend): our model, GPR-based BO, random
ICReDD, Hokkaido University
Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University
Check the website for collaborations and postdoc positions!
Our mission: to rationally design and discover new chemical reactions by seamlessly fusing
• experimental sciences (realization)
• computational sciences (theory-driven)
• information sciences (data-driven)
Started Oct 2018; funded at $6.4 million per year by the government (for 10 years).
Map: Sapporo (Hokkaido) and Tokyo
• 2 million population (5th largest city in Japan)
• 6.3 m (248 inches) average annual snowfall
Summary
• Predicting the d-band centers by ML (Takigawa et al, RSC Advances 2016)
• Predicting the adsorption energy by ML (Toyao et al, JPC C 2018)
• Predicting the experimentally reported catalytic activity by ML (Suzuki et al, in preparation)
Acknowledgements
• Ken-ichi SHIMIZU (ICAT)
• Satoru TAKAKUSAGI (ICAT)
• Takashi TOYAO (ICAT)
• Keisuke SUZUKI (DENSO)

Machine Learning and Surrogate Optimization on Heterogeneous Catalysts

  • 1. Machine Learning and Surrogate Optimization on Heterogeneous Catalysts
       Ichigaku Takigawa
       2019 PRESTO International Symposium on Materials Informatics, Feb 9-11, 2019 @ Tokyo
       Graduate School of Information Science and Technology, Hokkaido University
       Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University
  • 2. Heterogeneous catalysts and surface reactions
       Wolfgang Pauli: “God made the bulk; the surface was invented by the devil.”
       [Figure: GAS (Reactants) over SOLID (Catalysts). Surface processes: adsorption, diffusion, desorption, dissociation, recombination. Surface sites: kinks, terraces, adatom, vacancy, steps]
       Many hard-to-quantify factors complicate their atomic-level characterization by modelling and experiments.
         • reaction conditions
         • composition
         • support
         • surface termination
         • particle size & morphology
         • atomic coordination environment
         • disordered/amorphous structures in their active state
         ⋮
       High-fidelity simulations are too time-consuming...
  • 3. Then how can we characterize the catalytic activity?
       The d-electrons of transition metals govern...
       Several DFT-calculated indexes capture the trend to some extent...
       [Figure: Hammer–Nørskov d-band model. Reaction rates vs. d-band center (εd − EF) / eV. Volcano trends!]
       [Figure: Brønsted-Evans-Polanyi relation. Activation energy / eV vs. adsorption energy / eV. Linear trends!]
       K. Shimizu et al, ACS Catal. 2, 1904 (2012)
  • 4. Outline: Our ML-based studies
       1. Can we predict the d-band center?
          predicting DFT-calculated values by machine learning (Takigawa et al, RSC Advances. 2016)
       2. Can we predict the adsorption energy?
          predicting DFT-calculated values by machine learning (Toyao et al, JPC C 2018)
       3. Can we predict the catalytic activity?
          predicting values from experiments reported in the literature by machine learning (Suzuki et al, in preparation)
  • 5. Case 1. Predicting the d-band centers
       The d-bands of transition metals play central roles.
       Two types of models
         • 1% doped
         • overlayer
       [Figure: host and guest metals; table shown for the 1% doped model]
       Ruban, Hammer, Stoltze, Skriver, Nørskov, J Mol Catal A, 115:421-429 (1997)
       J. K. Nørskov, et al., Advances in Catalysis, 2000
  • 6. The beauty of the periodic table worked!
       [Figure: three host × guest matrices of d-band centers (eV) over Fe, Co, Ni, Cu, Ru, Rh, Pd, Ag, Ir, Pt, Au, one for each train/test split]
       Gradient boosting w/ 6 descriptors, each split repeated 100 times:
         • training sets (75%), test sets (25%): mean RMSE 0.153 eV
         • training sets (50%), test sets (50%): mean RMSE 0.235 eV
         • training sets (25%), test sets (75%): mean RMSE 0.402 eV
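The evaluation protocol on this slide (random train/test splits repeated 100 times, reporting the mean RMSE) can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names `rmse`, `repeated_split_rmse`, and `fit_mean` are hypothetical, the data are a toy series rather than d-band centers, and the placeholder model simply predicts the training mean where the slides use gradient boosting.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def repeated_split_rmse(X, y, fit, test_fraction=0.25, n_repeats=100, seed=0):
    """Mean and standard deviation of the test RMSE over random train/test splits.

    `fit(X_train, y_train)` must return a function mapping one x to a prediction.
    """
    rng = random.Random(seed)
    n_test = max(1, round(test_fraction * len(X)))
    scores = []
    for _ in range(n_repeats):
        idx = list(range(len(X)))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        scores.append(rmse([y[i] for i in test], [predict(X[i]) for i in test]))
    mean = sum(scores) / len(scores)
    std = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
    return mean, std


def fit_mean(X_train, y_train):
    """Placeholder model (hypothetical): always predicts the training mean."""
    mu = sum(y_train) / len(y_train)
    return lambda x: mu


# Toy data, not the d-band centers from the slides.
X = [[float(i)] for i in range(40)]
y = [0.1 * i for i in range(40)]
for frac in (0.25, 0.50, 0.75):
    mean, std = repeated_split_rmse(X, y, fit_mean, test_fraction=frac)
    print(f"test fraction {frac:.2f}: mean RMSE {mean:.3f} +/- {std:.3f}")
```

Any regressor can be plugged in through `fit`, as long as it returns a one-argument prediction function.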
  • 7. The ML model
       9 readily available descriptors pretested for host & guest (18 in total)
         • Group (G)
         • Bulk Wigner–Seitz radius (R) in Å
         • Atomic number (AN)
         • Atomic mass (AM) in g mol⁻¹
         • Period (P)
         • Electronegativity (EN)
         • Ionization energy (IE) in eV
         • Enthalpy of fusion (∆fusH) in J g⁻¹
         • Density at 25 ℃ (ρ) in g cm⁻³
       Gradient Boosted Tree Regression (GBR) with only 6 descriptors
         (1) Group in the periodic table (host)
         (2) Density at 25 ℃ (host)
         (3) Enthalpy of fusion (guest)
         (4) Ionization energy (guest)
         (5) Enthalpy of fusion (host)
         (6) Ionization energy (host)
  • 8. 11 ML methods pretested
       [3 Tree Ensembles (Nonlinear Regression Models)]
         GBR (Gradient Boosted Tree Regression); ETR (Extra-Trees Regression); RFR (Random Forest Regression)
       [5 Linear Regression Models]
         OLS (Ordinary Least Squares Regression); PLS (Partial Least Squares Regression); LASSO (Lasso Regression); RIDGE (Ridge Regression); RANSAC (Random Sample Consensus Regression)
       [3 Kernel Methods (Nonlinear Regression Models)]
         GPR (Gaussian Process Regression); KRR (Kernel Ridge Regression); SVR (Support Vector Regression)
       [Figure: comparison of the methods, in axis order GBR, ETR, GPR, RFR, KRR, OLS, RIDGE, PLS, RANSAC, SVR, LASSO; training sets (75%), test sets (25%)]
  • 9. Tree ensemble regressors (GBR, ETR, RFR)
       [Figure: a Decision Tree (Regression Tree) and a Tree Ensemble, each approximating $y = \sin\left(\sqrt{x_1^2 + x_2^2}\right)$ by a prediction $\hat{y}$ over $(x_1, x_2)$. The ensemble prediction is a sum of trees, and each tree assigns constants $c_1, c_2, c_3$ to its regions]
         • Region-wise constant prediction
         • The regions are given by recursive axis-parallel partitioning of the data space
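The two properties listed on this slide (region-wise constant prediction, regions obtained by recursive axis-parallel partitioning) can be illustrated with a small regression tree written from scratch. This is a sketch under stated assumptions: the functions `build_tree` and `predict`, the depth and leaf-size limits, and the random sample are all illustrative choices, and only the target function $y = \sin(\sqrt{x_1^2 + x_2^2})$ is taken from the slide.

```python
import math
import random


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values)


def build_tree(X, y, depth=0, max_depth=4, min_leaf=5):
    """Recursively split the data with axis-parallel cuts; leaves store a constant."""
    leaf = {"value": sum(y) / len(y)}
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return leaf
    best = None
    for j in range(len(X[0])):                    # try every descriptor (axis)
        for t in sorted({x[j] for x in X}):       # and every observed threshold
            left = [i for i, x in enumerate(X) if x[j] <= t]
            right = [i for i, x in enumerate(X) if x[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or cost < best[0]:
                best = (cost, j, t, left, right)
    if best is None:
        return leaf
    _, j, t, left, right = best
    return {
        "axis": j,
        "threshold": t,
        "left": build_tree([X[i] for i in left], [y[i] for i in left],
                           depth + 1, max_depth, min_leaf),
        "right": build_tree([X[i] for i in right], [y[i] for i in right],
                            depth + 1, max_depth, min_leaf),
    }


def predict(tree, x):
    """Follow the cuts down to a leaf and return its constant."""
    while "value" not in tree:
        tree = tree["left"] if x[tree["axis"]] <= tree["threshold"] else tree["right"]
    return tree["value"]


rng = random.Random(0)
X = [[rng.uniform(-3, 3), rng.uniform(-3, 3)] for _ in range(200)]
y = [math.sin(math.sqrt(x1 ** 2 + x2 ** 2)) for x1, x2 in X]
tree = build_tree(X, y)
print(predict(tree, [0.5, 0.5]), math.sin(math.sqrt(0.5)))
```

Every query point that falls in the same rectangle receives the same value, which is why a single tree gives a staircase-like surface and why ensembles of many trees are used to smooth it.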
  • 10. Tree ensemble regressors (GBR, ETR, RFR)
        Advantages
          • quick, nonlinear, parallelizable
          • highly accurate (widely used in many winning solutions for data prediction competitions)
          • usually less hyperparameter dependent (compared to kernel methods and neural networks)
          • conservative extrapolation
          • "variable importance" provided
          • popular implementations: Scikit-learn, XGBoost (by DMLC), LightGBM (by Microsoft), …
        How to generate multiple decision trees?
          • RFR / ETR: random patches (random subsampling of instances and variables) or random splits
          • GBR (can also be mixed with the above strategy): sequentially add a new tree to compensate for the weak points of the current ensemble.
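The GBR idea of sequentially adding a tree that compensates for the weak points of the current ensemble can be sketched for squared error, where the "weak points" are simply the current residuals. The sketch below is an assumption-laden toy: it uses one-split stumps on 1-D inputs instead of full trees, and the names `fit_stump` and `gradient_boost`, the learning rate, and the sine data are illustrative, not the settings used in the talk.

```python
import math


def fit_stump(x, r):
    """Best single-threshold piecewise-constant fit to residuals r (1-D inputs)."""
    best = None
    for t in sorted(set(x))[:-1]:     # the largest value would leave the right side empty
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        cl, cr = sum(left) / len(left), sum(right) / len(right)
        cost = sum((ri - cl) ** 2 for ri in left) + sum((ri - cr) ** 2 for ri in right)
        if best is None or cost < best[0]:
            best = (cost, t, cl, cr)
    _, t, cl, cr = best
    return lambda xi: cl if xi <= t else cr


def gradient_boost(x, y, n_trees=50, learning_rate=0.1):
    """Squared-error boosting: each new stump is fitted to the current residuals."""
    base = sum(y) / len(y)
    stumps = []

    def predict(xi):
        return base + learning_rate * sum(s(xi) for s in stumps)

    for _ in range(n_trees):
        residuals = [yi - predict(xi) for xi, yi in zip(x, y)]
        stumps.append(fit_stump(x, residuals))
    return predict


x = [i / 10 for i in range(63)]
y = [math.sin(xi) for xi in x]
model = gradient_boost(x, y)
mse = sum((yi - model(xi)) ** 2 for xi, yi in zip(x, y)) / len(x)
print(f"training MSE after boosting: {mse:.4f}")
```

RFR and ETR differ in that their trees are built independently on randomized views of the data and then averaged, rather than fitted one after another to residuals.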
  • 11. Descriptor analysis and evaluation
        Descriptor importances; descriptor selection (top-k)
        Training sets (75%), test sets (25%), repeated 100 times:
          • GBR with 18 descriptors: mean RMSE 0.204 ± 0.047 eV
          • GBR with 6 descriptors: mean RMSE 0.212 ± 0.047 eV
          • GBR with 4 descriptors: mean RMSE 0.214 ± 0.046 eV
  • 12. Case 2. Predicting the adsorption energy
        Predicting the adsorption energy of CH3
        DFT calculation of adsorption energy
          • 10 hours on our 32-core workstation (CH3 on the Cu monometallic surface)
          • even longer (about 34 hours) for a system containing another metal such as Pb
        ML prediction
          • < 1 sec on our single-core laptop
          • the cost depends on the method we choose, not on the target system
        training sets (75%), test sets (25%)
  • 13. But what do these mean for catalyst design and discovery!?
        [Figure: the three host × guest d-band center matrices from slide 6. Gradient boosting w/ 6 descriptors, each split repeated 100 times: mean RMSE 0.153 eV (training 75%, test 25%), 0.235 eV (training 50%, test 50%), 0.402 eV (training 25%, test 75%)]
  • 14. Standard procedure for optimizing the activity
        [Figure: a loop linking all your available data, hypothesis generation (abduction), experiments and simulations, checking results, and feedback]
  • 15. Replace the time-consuming and costly part by ML
        [Figure: a loop] All your available data; Machine Learning (any "data-driven" predictions); Make predictions for many possible candidates; Check the best predicted ones; Feed them as training data
        The "Surrogate (or proxy)" model for
          • Demanding experiments
          • Time-consuming high-fidelity simulations
  • 16. Replace the time-consuming and costly part by ML
        [Figure: a loop] All your available data; Machine Learning (any "data-driven" predictions); Make predictions for many possible candidates; Check the best predicted ones; Feed them as training data
        The "Surrogate (or proxy)" model for
          • Demanding experiments
          • Time-consuming high-fidelity simulations
        This simple procedure won't work in most practical cases!
  • 17. ML itself is not for "discovery"
        [Figure: Activity vs. Descriptors x (High dimensional), showing the training samples and the best known value]
  • 18. ML itself is not for "discovery"
        [Figure: Activity vs. Descriptors x (High dimensional), showing the training samples, the best known value, and the ML prediction]
        ML prediction: fitted to minimize the average error.
  • 19. ML itself is not for "discovery"
        [Figure: Activity vs. Descriptors x (High dimensional), showing the training samples, the best known value, and the ML prediction]
        ML prediction: fitted to minimize the average error.
        Nicely predicted for the average (but mediocre)
  • 20. ML itself is not for "discovery"
        [Figure: Activity vs. Descriptors x (High dimensional), showing the training samples, the best known value, the ML prediction, and an outlier]
        ML prediction: fitted to minimize the average error.
        Nicely predicted for the average (but mediocre)
        A nice "discovery" can deviate largely from the average of the knowns (an outlier)
  • 21. ML itself is not for "discovery"
        [Figure: Activity vs. Descriptors x (High dimensional), showing the training samples, the best known value, the ML prediction, and an outlier]
        ML prediction: fitted to minimize the average error.
        Nicely predicted for the average (but mediocre)
        A nice "discovery" can deviate largely from the average of the knowns (an outlier)
        Mismatch:
          • ML captures the average trend of available knowns
          • "discovery" corresponds to something not in the knowns
  • 22. An ML model is just representative of the training data
        [Figure: Highly Inaccurate Model Predictions from Extrapolation (Lohninger 1999)]
        "Beware of the perils of extrapolation, and understand that ML algorithms build models that are representative of the available training samples."
          • "exploration": to obtain new knowledge/data (we also need this)
          • "exploitation": to use the knowledge/data to improve the performance (ML is basically for this)
  • 23. Surrogate optimization (model-based optimization)
        [Figure: Activity vs. Descriptors x (High dimensional), showing the ML prediction]
        Use ML to guide the balance between "exploitation" and "exploration"!
  • 24. Surrogate optimization (model-based optimization)
        [Figure: Activity vs. Descriptors x (High dimensional), showing the ML prediction and its uncertainty]
        "Uncertainty" of prediction, e.g. prediction variance
        Use ML to guide the balance between "exploitation" and "exploration"!
  • 25. Surrogate optimization (model-based optimization) ML prediction of activity from descriptors x (high dimensional), together with the "uncertainty" of the prediction, e.g. prediction variance, and a score computed from both, e.g. "expected improvement". Use ML to guide the balance between "exploitation" and "exploration"!
  • 26. Surrogate optimization (model-based optimization) ML prediction of activity from descriptors x (high dimensional), together with the "uncertainty" of the prediction, e.g. prediction variance, and a score computed from both, e.g. "expected improvement".
 1. Initial Sampling (DoE) 2. Loop: 2.1. Construct a Surrogate Model. 2.2. Search the Infill Criterion. 2.3. Add new samples. (intervention)
 • Reinforcement learning • Blackbox optimization • Bayesian optimization • Sequential design of experiments • Multi-armed bandit • Evolutionary computation • Game-theoretic approaches, etc.
 An Open Research Topic in ML. Use ML to guide the balance between "exploitation" and "exploration"!
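The loop on this slide (initial sampling, surrogate model, infill criterion, new samples) can be sketched in a few lines. This is only a toy illustration under stated assumptions: `objective` is a hypothetical stand-in for an experiment, the surrogate is a bootstrap ensemble of nearest-neighbour predictors rather than the model used in the talk, and the infill criterion is a simple "mean + kappa * std" score used in place of expected improvement.

```python
import random
import statistics


def objective(x):
    # Hypothetical stand-in for an expensive experiment or simulation.
    return -(x - 0.7) ** 2


def fit_surrogate(xs, ys, rng, n_models=20):
    # Toy surrogate: each member is a bootstrap resample of the observed data.
    data = list(zip(xs, ys))
    return [[rng.choice(data) for _ in data] for _ in range(n_models)]


def predict(ensemble, x):
    # Each member predicts with its nearest resampled point; the spread across
    # members serves as a rough uncertainty estimate.
    preds = [min(member, key=lambda p: abs(p[0] - x))[1] for member in ensemble]
    return statistics.fmean(preds), statistics.pstdev(preds)


def infill(ensemble, x, kappa=1.0):
    # Assumed criterion: predicted value (exploitation) plus a bonus for
    # uncertainty (exploration).
    mean, std = predict(ensemble, x)
    return mean + kappa * std


def surrogate_optimize(n_init=4, n_iter=15, n_candidates=200, seed=0):
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n_init)]        # 1. initial sampling
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):                           # 2. loop
        ensemble = fit_surrogate(xs, ys, rng)         # 2.1 surrogate model
        candidates = [rng.random() for _ in range(n_candidates)]
        x_new = max(candidates, key=lambda x: infill(ensemble, x))  # 2.2 infill
        xs.append(x_new)                              # 2.3 add new sample
        ys.append(objective(x_new))
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best], len(xs)
```

Calling `surrogate_optimize()` returns the best input found, its objective value, and the number of evaluations spent; raising `kappa` shifts the balance toward exploration.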
  • 27. Structure-activity landscapes are nonsmooth... J. Med. Chem. 2012, 55, 2932−2942 The structure-activity landscape can often be nonsmooth. Small changes in descriptors can greatly affect the activity/selectivity. Activity cliffs; selectivity cliffs.
  • 28. Batched + restricted surrogate optimization Our optimization model for heterogeneous catalyst data:
  • 29. Batched + restricted surrogate optimization Our optimization model for heterogeneous catalyst data: • Nonsmooth interpolation (SMAC-like, by tree ensemble regressors)
 with multistart local search with L-BFGS + random samples
  • 30. Batched + restricted surrogate optimization Our optimization model for heterogeneous catalyst data: • Nonsmooth interpolation (SMAC-like, by tree ensemble regressors)
 with multistart local search with L-BFGS + random samples • Use EI (expected improvement) as the infill criterion, estimated by OOB (out-of-bag) or quantile regression
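As a rough sketch of how EI could be estimated from a tree ensemble, the snippet below treats the per-tree predictions at a candidate as samples, summarizes them by mean and standard deviation, and applies the closed-form EI for maximization. The normal approximation and the function names are assumptions made for illustration; the slides only state that EI is estimated by OOB or quantile regression and do not give the details.

```python
import math
import statistics


def expected_improvement(mean, std, best_so_far):
    # Closed-form EI for maximization, assuming a normal predictive distribution.
    if std <= 0.0:
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best_so_far) * cdf + std * pdf


def ei_from_ensemble(member_predictions, best_so_far):
    # member_predictions: one predicted activity per ensemble member (e.g. per tree).
    mean = statistics.fmean(member_predictions)
    std = statistics.pstdev(member_predictions)
    return expected_improvement(mean, std, best_so_far)
```

A candidate whose trees disagree can score higher than one with the same mean and no spread, which is how the criterion rewards exploration.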
  • 31. Batched + restricted surrogate optimization Our optimization model for heterogeneous catalyst data: • Nonsmooth interpolation (SMAC-like, by tree ensemble regressors)
 with multistart local search with L-BFGS + random samples • Use EI (expected improvement) as the infill criterion, estimated by OOB (out-of-bag) or quantile regression • Get multiple samples in each loop
 by batched optimization, with samples "diversified" by clustering
 (some margin is needed; in reality, suggested catalysts are not easy to realize)
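One way to picture batched selection with diversification is to cluster the scored candidates and take the best-scored candidate from each cluster. The sketch below uses a minimal k-means written for this illustration; the clustering method, the function names, and the use of one pick per cluster are all assumptions, since the slides only say that samples are diversified by clustering.

```python
import math
import random


def kmeans(points, k, n_iter=20, seed=0):
    # Minimal k-means on tuples of floats; returns one cluster label per point.
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels


def select_batch(candidates, scores, batch_size, seed=0):
    # Keep the highest-scoring candidate of each cluster, best first.
    # The batch can be smaller than batch_size if a cluster ends up empty.
    labels = kmeans(candidates, batch_size, seed=seed)
    best = {}
    for i, label in enumerate(labels):
        if label not in best or scores[i] > scores[best[label]]:
            best[label] = i
    return sorted(best.values(), key=lambda i: -scores[i])
```

Here `scores` would be the infill criterion (for example EI) evaluated at each candidate; the returned indices point to candidates that are both promising and spread over different regions of descriptor space.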
  • 32. Batched + restricted surrogate optimization Our optimization model for heterogeneous catalyst data: • Nonsmooth interpolation (SMAC-like, by tree ensemble regressors)
 with multistart local search with L-BFGS + random samples • Use EI (expected improvement) as the infill criterion, estimated by OOB (out-of-bag) or quantile regression • Get multiple samples in each loop
 by batched optimization, with samples "diversified" by clustering
 (some margin is needed; in reality, suggested catalysts are not easy to realize) • Impose several restrictions on new samples to be tested
  • 33. Batched + restricted surrogate optimization Our optimization model for heterogeneous catalyst data: • Nonsmooth interpolation (SMAC-like, by tree ensemble regressors)
 with multistart local search with L-BFGS + random samples • Use EI (expected improvement) as the infill criterion, estimated by OOB (out-of-bag) or quantile regression • Get multiple samples in each loop
 by batched optimization, with samples "diversified" by clustering
 (some margin is needed; in reality, suggested catalysts are not easy to realize) • Impose several restrictions on new samples to be tested • the sum of compositions equals 1 (compositional restriction)
  • 34. Batched + restricted surrogate optimization Our optimization model for heterogeneous catalyst data: • Nonsmooth interpolation (SMAC-like, by tree ensemble regressors)
 with multistart local search with L-BFGS + random samples • Use EI (expected improvement) as the infill criterion, estimated by OOB (out-of-bag) or quantile regression • Get multiple samples in each loop
 by batched optimization, with samples "diversified" by clustering
 (some margin is needed; in reality, suggested catalysts are not easy to realize) • Impose several restrictions on new samples to be tested • the sum of compositions equals 1 (compositional restriction) • restrict the number of elements in a catalyst (bounded nonzeros)
 (because a catalyst with 60 elements would not be realistic...)
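The two restrictions on this slide (compositions sum to 1, and a bounded number of nonzero elements) can be illustrated with a small repair step applied to a proposed candidate. This is a heuristic sketch under assumptions: the function name and the "keep the largest fractions, then renormalize" rule are chosen for illustration and are not necessarily how the restrictions are enforced in the talk.

```python
def project_composition(weights, max_elements):
    # Clip negative fractions, keep only the max_elements largest components,
    # and rescale so that the remaining fractions sum to 1.
    clipped = [max(w, 0.0) for w in weights]
    order = sorted(range(len(clipped)), key=lambda i: clipped[i], reverse=True)
    keep = set(order[:max_elements])
    kept = [w if i in keep else 0.0 for i, w in enumerate(clipped)]
    total = sum(kept)
    if total == 0.0:
        raise ValueError("composition has no positive component")
    return [w / total for w in kept]
```

For example, `project_composition([0.5, 0.3, 0.1, 0.1], 2)` drops the two smallest elements and rescales the remaining two so that they again sum to 1.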
  • 35. Case 3. Predicting the catalytic activity (in prep) Test on 3 datasets: • Oxidative coupling of methane (OCM)
 [Zavyalova+ 2011] • Water gas shift reaction (WGS)
 [Odabaşi+ 2014] • CO oxidation [Günay+ 2013] Methods compared (figure legend): Our model, GPR-based BO, Random
  • 36. ICReDD, Hokkaido University Check the website for any collaborations and postdoc positions! Our mission:
 To rationally design and discover new chemical reactions
 by seamlessly fusing • experimental sciences (realization) • computational sciences (theory-driven) • information sciences (data-driven) Started Oct 2018; funded by the government at $6.4 million per year (for 10 years). Map: Sapporo, Tokyo, HOKKAIDO. Sapporo: • 2 million population
 (5th largest city in Japan) • 6.3 m / 248 inches
 avg. annual snowfall Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University
  • 37.
  • 38. Summary • Predicting the d-band centers by ML
 (Takigawa et al, RSC Advances 2016) • Predicting the adsorption energy by ML
 (Toyao et al, JPC C 2018) • Predicting the experimentally-reported catalytic activity by ML
 (Suzuki et al, in preparation) Acknowledgements • Ken-ichi SHIMIZU (ICAT) • Satoru TAKAKUSAGI (ICAT) • Takashi TOYAO (ICAT) • Keisuke SUZUKI (DENSO)