This document summarizes the work of Ichigaku Takigawa on using machine learning and surrogate optimization to characterize heterogeneous catalysts. Takigawa describes how density functional theory (DFT) calculations are too computationally expensive to fully characterize catalysts. Their work uses machine learning models to predict DFT-calculated values like d-band centers and adsorption energies in a fraction of the time. This allows for more efficient catalyst discovery and design by screening many more potential materials than DFT alone. The models are trained on existing DFT data and achieve prediction errors below 0.2 eV. This surrogate approach could provide insights into catalyst activity trends and guide further experimental studies.
Optical properties such as UV/vis spectra and polarizability can be predicted with new features in Materials Studio DMol3 5.5. This presentation provides some background on the implementation as well as case studies.
1 Packing of spheres: Unit cell and description of crystal structure, close packing of spheres, holes in close-packed structures.
2 Structure of Metals: Polytypism, structures that are not close packed, polymorphism of metals, atomic radii of metals, alloys.
3 Ionic solids: Characteristic structures of ionic solids, the rationalization of structures, the energetics of ionic bonding, consequences of lattice enthalpy.
CrSi2 material is outstanding because of its thermoelectric properties and also because of its many optimization routes. Indeed, its thermal conductivity at room temperature is about 9 W m⁻¹ K⁻¹ with a ZT of 0.25. In this paper we propose to decrease the thermal conductivity by nanostructuration and compensate the electron scattering by increasing the charge carrier concentration with Ti. The process which permitted to get nanocrystallite of about 14 nm is presented. After cold pressing and sintering the average crystallite size reaches 50 nm with a porosity of 70%. Nanostructuring and porosity to a lesser extent lead to a strong decrease of the thermal conductivity up to 0.97 ± 0.15 W m⁻¹ K⁻¹ for pure CrSi2. A significant enhancement of the power factor from 1.25 mW cm⁻¹ K⁻² for pure nano-CrSi2 to 2.5 mW cm⁻¹ K⁻² for nano-Cr0.90Ti0.10Si2 was obtained. The stability of the different phases is also evaluated by comparing experiments with ab initio calculations.
CONVERSIONS TO USE
DO NOT WRITE ON THIS SHEET
METRIC PREFIXES
This table uses liters (L) as the base unit, but you can use this table for ANY base unit. For example, 1 s = 1×10⁶ µs.
Symbol      Meaning       Base Unit / Prefix      OR    Base Unit / Prefix
giga, G     billion       1 L = 1×10⁻⁹ GL         OR    1×10⁹ L = 1 GL
mega, M     million       1 L = 1×10⁻⁶ ML         OR    1×10⁶ L = 1 ML
kilo, k     thousand      1 L = 0.001 kL          OR    1000 L = 1 kL
deci, d     tenth         1 L = 10 dL             OR    0.1 L = 1 dL
centi, c    hundredth     1 L = 100 cL            OR    0.01 L = 1 cL
milli, m    thousandth    1 L = 1000 mL           OR    0.001 L = 1 mL
micro, µ    millionth     1 L = 1×10⁶ µL          OR    1×10⁻⁶ L = 1 µL
nano, n     billionth     1 L = 1×10⁹ nL          OR    1×10⁻⁹ L = 1 nL
pico, p     trillionth    1 L = 1×10¹² pL         OR    1×10⁻¹² L = 1 pL
NOTE: Two equivalence statements are written for each prefix. Either is equally correct (they are exactly the same). Use whichever makes more sense to you.
OTHER CONVERSIONS
All of these are exact numbers except those marked with *
LENGTH
  Metric to metric: 1 cm = 1×10⁸ Å (Å is the symbol for angstroms)
  English to English: 12 in. = 1 ft.; 3 ft. = 1 yd.; 5280 ft. = 1 mi.
  English to metric: 1 in. = 2.54 cm; 1 mile = 1.609 km*
MASS / WEIGHT
  Metric to metric: 1000 kg = 1 metric ton
  English to English: 2000 lb. = 1 ton; 16 oz. = 1 lb.
  English to metric: 1 lb. = 453.6 g*
VOLUME
  Metric to metric: 1 L = 1 dm³; 1 mL = 1 cm³; 1000 L = 1 m³
  English to English: 3 tsp. = 1 Tbsp.; 16 Tbsp. = 1 cup; 2 cups = 1 pint; 2 pints = 1 quart; 4 quarts = 1 gal.; 8 fluid oz. = 1 cup
  English to metric: 1 qt. = 0.9464 L*; 1 fluid oz. = 29.57 mL*; 1 ft³ = 28.32 L*
TEMPERATURE
  Metric to metric: T_K = T_C + 273.15
  English to metric: T_F = 1.8(T_C) + 32; T_C = (T_F − 32) / 1.8
ENERGY
  Metric to metric: 1 cal = 4.184 J
NOTE: The ounces that measure mass are completely different from and unrelated to the fluid ounces that measure volume.
SOME CONSTANTS AND EQUATIONS
density = mass / volume
mass % element in a compound = (g element / g compound) × 100
c = speed of light = 3.00×10⁸ m/s
E_photon = hν (ν = frequency), h = Planck’s constant = 6.626×10⁻³⁴ J·s
E = hc / λ
Avogadro’s Number = 6.022×10²³
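As a worked illustration of how the constants and prefix conversions on this sheet combine, the short Python sketch below computes the energy of a photon from its wavelength with E = hc / λ. The 500 nm wavelength and the function names are hypothetical choices made for this example; they do not appear on the original sheet.

```python
# Rounded constants as listed on the sheet.
PLANCK_J_S = 6.626e-34      # Planck's constant, J*s
LIGHT_SPEED_M_S = 3.00e8    # speed of light, m/s
AVOGADRO = 6.022e23         # Avogadro's number, per mole
NM_PER_M = 1e9              # nano prefix: 1 m = 1e9 nm


def photon_energy_joules(wavelength_nm: float) -> float:
    """Energy of a single photon in joules, from E = h*c / wavelength."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    wavelength_m = wavelength_nm / NM_PER_M  # convert nm to the base unit (m)
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m


def photon_energy_kj_per_mol(wavelength_nm: float) -> float:
    """Energy of one mole of photons in kJ/mol."""
    return photon_energy_joules(wavelength_nm) * AVOGADRO / 1000.0


if __name__ == "__main__":
    wavelength = 500.0  # nm, hypothetical example value (green light)
    print(f"{photon_energy_joules(wavelength):.3e} J per photon")
    print(f"{photon_energy_kj_per_mol(wavelength):.1f} kJ/mol")
```

The wavelength is converted to meters first so that the units of c and λ cancel correctly, which is the step most often missed when applying the formula by hand.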
ELECTROMAGNETIC SPECTRUM
[Figure: electromagnetic spectrum]
ELECTRONEGATIVITIES FOR SOME OF THE ELEMENTS
H 2.1
Li 1.0, Be 1.5, B 2.0, C 2.5, N 3.0, O 3.5, F 4.0
Na 0.9, Mg 1.2, Al 1.5, Si 1.8, P 2.1, S 2.5, Cl 3.0
K 0.8, Ca 1.0, Sc 1.3, Ti 1.5, V 1.6, Cr 1.6, Mn 1.5, Fe 1.8, Co 1.9, Ni 1.9, Cu 1.9, Zn 1.6, Ga 1.6, Ge 1.8, As 2.0, Se 2.4, Br 2.8
Rb 0.8, Sr 1.0, Y 1.2, Zr 1.4, Nb 1.6, Mo 1.8, Tc 1.9, Ru 2.2, Rh 2.2, Pd 2.2, Ag 1.9, Cd 1.7, In 1.7, Sn 1.8, Sb 1.9, Te 2.1, I 2.5
Cs 0.7, Ba 0.9, La 1.0, Hf 1.3, Ta 1.5, W 1.7, Re 1.9, Os 2.2, Ir 2.2, Pt 2.2, Au 2.4, Hg 1.9, Tl 1.8, Pb 1.9, Bi 1.9, Po 2.0, At 2.2
Hp 0.7, Hm 0.8, Ws 1.0, Ss 1.2, Lp 1.3, Bl 1.5, Ad 1.7, Nv 1.9, Le 2.0, Mc 2.1, Rs 2.3, Gh 1.8, An 1.8, Fd 1.9, Sw 1.9, Gm 2.0, Gf 2.1
Preparation characterization and conductivity studies of Nasicon systems Ag3-...
iosrjce
Materials belonging to the NASICON family of compositions Ag3-2xTaxIn2-x(PO4)3 (x = 0.6, 0.8 and 1.1) are prepared by the sol-gel method. Ethylene glycol is used as a gelling agent. All the compositions are characterized by powder X-ray diffraction and Fourier transform infrared spectroscopy. All these phosphates are crystallized in a rhombohedral lattice with space group R3c. These compounds exhibit characteristic PO4 vibrational modes in their FT-IR spectra. The dc conductivity of Ag3-2xTaxIn2-x(PO4)3 (x = 0.6, 0.8 and 1.1) was also investigated.
A facile method to prepare CdO-Mn3O4 nanocomposite
IOSR Journals
CdO-Mn3O4 nanocomposite has been prepared by a simple solvothermal method using a domestic microwave oven. Cadmium acetate, manganese acetate and urea were used as the precursors and ethylene glycol as the solvent. The as-prepared sample was annealed for 1 hour in each case at different temperatures, viz. 100, 200 and 300°C. The as-prepared and annealed samples were characterized by X-ray diffraction and scanning electron microscopic analyses. Results indicate that annealing at 300°C is required to get the sample with high phase purity and homogeneity. The present study indicates that the method adopted can be considered as an economical and scalable one to prepare the proposed nanocomposite with reduced size, phase purity and homogeneity.
Slides of my first invited talk at a conference, the ALD 2005 conference in San Jose 2005, about ALD modelling. ALD is fantastic, but fantastic is not perfect :)
---
R. L. Puurunen, Atomic-scale modelling of atomic layer deposition processes, American Vacuum Society Topical Conference on Atomic Layer Deposition (ALD 2005), San Jose, California, August 8-10, 2005. Invited talk.
Paleo environmental bio-diversity macro-evolutionary data mining and deep lea...
Abdullah Khan Zehady
How are the environmental variables and marine evolution connected? Does astronomical forcing influence climate variation? Can we apply deep learning to classify index fossils?
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering & Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
Theoretical work submitted to the Journal should be original in its motivation or modeling structure. Empirical analysis should be based on a theoretical framework and should be capable of replication. It is expected that all materials required for replication (including computer programs and data sets) should be available upon request to the authors.
Block 3 training exercises.
Draw the Lewis structures for each of the following molecules and determine the electron pair
geometry and the molecular geometry as predicted by the VSEPR.
1) F2
2) N2
3) ICl
4) CO2
5) NH3
6) CF4
7) C2H6
8) C2H4
9) C2H2
10) HCN
11) SO2
12) HNO3 (the hydrogen is bonded to one of the oxygens).
13) CH4O
For each of the following polyatomic ions, draw all resonance structures. Based on the formal
charges identify which is the best resonance structure. (if all resonance structures are equivalent
indicate so next to the structure). Indicate the bond angles as predicted by VSEPR for the best
structure .
14) OH-
15) CN-
16) ClO2 –
17) ClO3 –
18) CO3 2-
19) SO3 2-
20) SCN-
21) HCO2-
22) NO2+
State the hybridization of the central atom in each of the following.
23) ClO2 –
24) ClO3 –
25) CO3 2-
26) SO3 2-
27) SCN-
28) HCO2-
29) NO2 -
30) Hydrogen peroxide (H2O2) is a reactive molecule, often used as an antiseptic and sometimes used for bleaching. Draw a Lewis structure for hydrogen peroxide (peroxide is a polyatomic ion). What is the oxidation state of the oxygen in hydrogen peroxide? Suggest a reason for its reactivity.
31) What is the hybridization of the central atom in Acetone CH3COCH3?
32) How many sigma bonds and how many pi bonds are in Acetone?
33) Ozone (O3) is needed in the stratosphere to absorb (and filter out) potentially damaging ultraviolet light. However, in the lower atmosphere it is a dangerous pollutant, as it is a very reactive form of oxygen and, as a result, very toxic and destructive. Draw the two reasonable structures, including formal charges. (Hint: it is not a ring.) Suggest a reason for its high reactivity.
34) Which of the following reactions is associated with the lattice energy of Li2O (ΔH°latt)?
A) Li2O(s) → 2 Li⁺(g) + O²⁻(g)
B) 2 Li⁺(aq) + O²⁻(aq) → Li2O(s)
C) 2 Li⁺(g) + O²⁻(g) → Li2O(s)
D) Li2O(s) → 2 Li⁺(aq) + O²⁻(aq)
35) Which of the following reactions is associated with the lattice energy of CaS (ΔH°latt)?
A) Ca(s) + S(s) → CaS(s)
B) CaS(s) → Ca(s) + S(s)
C) Ca²⁺(aq) + S²⁻(aq) → CaS(s)
D) Ca²⁺(g) + S²⁻(g) → CaS(s)
E) CaS(s) → Ca²⁺(aq) + S²⁻(aq)
36) Which of the following reactions is associated with the lattice energy of RbI (ΔH°latt)?
A) Rb(s) + 1/2 I2(g) → RbI(s)
B) RbI(s) → Rb⁺(g) + I⁻(g)
C) RbI(s) → Rb(s) + 1/2 I2(g)
D) RbI(s) → Rb⁺(aq) + I⁻(aq)
E) Rb⁺(g) + I⁻(g) → RbI(s)
37) Which of the following has the highest magnitude of lattice energy: NaCl, KCl, LiCl, or CsCl?
38) Identify the compound with the lowest magnitude of lattice energy among the following: KCl, KBr, SrO, CaO.
39). Identify the shortest bond.
A) single covalent bond
B) double covalent bond
C) triple covalent bond
D) all of the above bonds are the same length
40 ...
Exploring Practices in Machine Learning and Machine Discovery for Heterogeneo...
Ichigaku Takigawa
Video https://youtu.be/P4QogT8bdqY
ACS Spring 2023 Symposium on AI-Accelerated Scientific Workflow
https://acs.digitellinc.com/acs/sessions/526630/view
ACS SPRING 2023 ———— Crossroads of Chemistry
Indianapolis, IN & Hybrid, March 26-30
https://www.acs.org/meetings/acs-meetings/spring-2023.html
Slide PDF
https://itakigawa.page.link/acs2023spring
Our Paper
Accelerated discovery of multi-elemental reverse water-gas shift catalysts using extrapolative machine learning approach (2022, ChemRxiv)
https://doi.org/10.26434/chemrxiv-2022-695rj
Ichi Takigawa
https://itakigawa.github.io/
Machine Learning for Molecules: Lessons and Challenges of Data-Centric Chemistry
Ichigaku Takigawa
Perspectives on Artificial Intelligence and Machine Learning in Materials Science
February 4-6, 2022.
https://joint.imi.kyushu-u.ac.jp/post-2698/
Machine Learning for Molecular Graph Representations and Geometries
Ichigaku Takigawa
Dec 1, 2021, Pacifico Yokohama, Japan.
Symposium 1AS-17 "Data science and machine learning: Tackling the Noise and Heterogeneity of the Real World"
The 44th Annual Meeting of the Molecular Biology Society of Japan
https://www2.aeplan.co.jp/mbsj2021/english/designation/index.html
ESR spectroscopy in liquid food and beverages.pptx
PRIYANKA PATEL
With an increasing population, people need to rely on packaged foodstuffs, and packaging requires that the food be preserved. There are various preservation treatments, and irradiation is one of them. It is the most common and the most harmless method of food preservation, as it does not alter the necessary micronutrients of food materials. Although irradiated food does not harm human health, quality assessment is still required to give consumers the necessary information about the food. ESR spectroscopy is the most sophisticated way to investigate the quality of the food and the free radicals induced during its processing. The ESR spin trapping technique is useful for detecting highly unstable radicals in food. The antioxidant capability of liquid food and beverages is mainly assessed by the spin trapping technique.
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Ana Luísa Pinho
Functional Magnetic Resonance Imaging (fMRI) provides means to characterize brain activations in response to behavior. However, cognitive neuroscience has been limited to group-level effects referring to the performance of specific tasks. To obtain the functional profile of elementary cognitive mechanisms, the combination of brain responses to many tasks is required. Yet, to date, both structural atlases and parcellation-based activations do not fully account for cognitive function and still present several limitations. Further, they do not adapt overall to individual characteristics. In this talk, I will give an account of deep-behavioral phenotyping strategies, namely data-driven methods in large task-fMRI datasets, to optimize functional brain-data collection and improve inference of effects-of-interest related to mental processes. Key to this approach is the employment of fast multi-functional paradigms rich on features that can be well parametrized and, consequently, facilitate the creation of psycho-physiological constructs to be modelled with imaging data. Particular emphasis will be given to music stimuli when studying high-order cognitive mechanisms, due to their ecological nature and quality to enable complex behavior compounded by discrete entities. I will also discuss how deep-behavioral phenotyping and individualized models applied to neuroimaging data can better account for the subject-specific organization of domain-general cognitive systems in the human brain. Finally, the accumulation of functional brain signatures brings the possibility to clarify relationships among tasks and create a univocal link between brain systems and mental functions through: (1) the development of ontologies proposing an organization of cognitive processes; and (2) brain-network taxonomies describing functional specialization. 
To this end, tools to improve commensurability in cognitive science are necessary, such as public repositories, ontology-based platforms and automated meta-analysis tools. I will thus discuss some brain-atlasing resources currently under development, and their applicability in cognitive as well as clinical neuroscience.
This presentation explores a brief idea about the structural and functional attributes of nucleotides, the structure and function of genetic materials along with the impact of UV rays and pH upon them.
The thematic appreciation test is a psychological assessment tool used to measure an individual's appreciation and understanding of specific themes or topics. This test helps to evaluate an individual's ability to connect different ideas and concepts within a given theme, as well as their overall comprehension and interpretation skills. The results of the test can provide valuable insights into an individual's cognitive abilities, creativity, and critical thinking skills.
Professional air quality monitoring systems provide immediate, on-site data for analysis, compliance, and decision-making.
Monitor common gases, weather parameters, particulates.
BREEDING METHODS FOR DISEASE RESISTANCE.pptx
RASHMI M G
Plant breeding for disease resistance is a strategy to reduce crop losses caused by disease. Plants have an innate immune system that allows them to recognize pathogens and provide resistance. However, breeding for long-lasting resistance often involves combining multiple resistance genes.
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Sérgio Sacani
Since volcanic activity was first discovered on Io from Voyager images in 1979, changes on Io’s surface have been monitored from both spacecraft and ground-based telescopes. Here, we present the highest spatial resolution images of Io ever obtained from a ground-based telescope. These images, acquired by the SHARK-VIS instrument on the Large Binocular Telescope, show evidence of a major resurfacing event on Io’s trailing hemisphere. When compared to the most recent spacecraft images, the SHARK-VIS images show that a plume deposit from a powerful eruption at Pillan Patera has covered part of the long-lived Pele plume deposit. Although this type of resurfacing event may be common on Io, few have been detected due to the rarity of spacecraft visits and the previously low spatial resolution available from Earth-based telescopes. The SHARK-VIS instrument ushers in a new era of high resolution imaging of Io’s surface using adaptive optics at visible wavelengths.
Phenomics assisted breeding in crop improvement
IshaGoswami9
The global population is increasing and is expected to reach about 9 billion by 2050, and climate change makes it difficult to meet the food requirements of such a large population. Facing the challenges presented by resource shortages, climate change, and increasing global population, crop yield and quality need to be improved in a sustainable way over the coming decades. Genetic improvement by breeding is the best way to increase crop productivity. With the rapid progression of functional genomics, an increasing number of crop genomes have been sequenced and dozens of genes influencing key agronomic traits have been identified. However, current genome sequence information has not been adequately exploited for understanding the complex characteristics of multiple genes, owing to a lack of crop phenotypic data. Efficient, automatic, and accurate technologies and platforms that can capture phenotypic data that can be linked to genomics information for crop improvement at all growth stages have become as important as genotyping. Thus, high-throughput phenotyping has become the major bottleneck restricting crop breeding. Plant phenomics has been defined as the high-throughput, accurate acquisition and analysis of multi-dimensional phenotypes during crop growing stages at the organism level, including the cell, tissue, organ, individual plant, plot, and field levels. With the rapid development of novel sensors, imaging technology, and analysis methods, numerous infrastructure platforms have been developed for phenotyping.
Machine Learning and Surrogate Optimization on Heterogeneous Catalysts
1. Machine Learning and Surrogate Optimization on Heterogeneous Catalysts
Ichigaku Takigawa
2019 PRESTO International Symposium on Materials Informatics
Feb 9-11, 2019 @ Tokyo
Graduate School of Information Science and Technology, Hokkaido University
Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University
2. Heterogeneous catalysts and surface reactions
“God made the bulk; the surface was invented by the devil.” (Wolfgang Pauli)
[Figure: GAS (Reactants) on SOLID (Catalysts). Surface processes: adsorption, diffusion, desorption, dissociation, recombination. Surface sites: kinks, terraces, adatom, vacancy, steps.]
Many hard-to-quantify factors complicate their atomic-level characterization by modelling and experiments.
• reaction conditions
• composition
• support
• surface termination
• particle size & morphology
• atomic coordination environment
• disordered/amorphous structures in their active state
High-fidelity simulations are too time-consuming...
3. Then how can we characterize the catalytic activity?
The d-electrons of transition metals govern...
[Figure: Hammer–Nørskov d-band model. Axes: reaction rates vs. d-band center (εd − EF) / eV. Volcano trends!]
[Figure: Brønsted-Evans-Polanyi relation. Axes: activation energy / eV vs. adsorption energy / eV. Linear trends!]
K. Shimizu et al, ACS Catal. 2, 1904 (2012)
Several DFT-calculated indexes capture the trend to some extent...
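The d-band center (εd − EF) on this slide is commonly defined as the first moment of the d-projected density of states, with energies measured relative to the Fermi level. The sketch below shows that calculation on a sampled DOS. It is only an illustration: the Gaussian DOS is synthetic rather than data from the talk, and the function names are hypothetical.

```python
import math


def trapezoid(ys, xs):
    """Integrate sampled values ys over the grid xs with the trapezoidal rule."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))


def d_band_center(energies, dos):
    """First moment of the DOS: integral(E * rho) / integral(rho)."""
    norm = trapezoid(dos, energies)
    if norm == 0:
        raise ValueError("DOS integrates to zero")
    weighted = [e * rho for e, rho in zip(energies, dos)]
    return trapezoid(weighted, energies) / norm


# Synthetic d-projected DOS (assumption for illustration only):
# a Gaussian of width 1 eV centered 2 eV below the Fermi level (E_F = 0).
energies = [-12.0 + 0.01 * i for i in range(2001)]  # eV, from -12 to 8
dos = [math.exp(-((e + 2.0) ** 2) / 2.0) for e in energies]

center = d_band_center(energies, dos)
print(f"d-band center: {center:.3f} eV relative to E_F")
```

In practice the DOS would come from a DFT calculation, which is the expensive step the talk proposes to bypass with a learned surrogate.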
4. Outline: Our ML-based studies
1. Can we predict the d-band center?
predicting DFT-calculated values by machine learning (Takigawa et al, RSC Advances. 2016)
2. Can we predict the adsorption energy?
predicting DFT-calculated values by machine learning (Toyao et al, JPC C 2018)
3. Can we predict the catalytic activity?
predicting values from experiments reported in the literature by machine learning (Suzuki et al, in preparation)
5. Case 1. Predicting the d-band centers
The d-bands of transition metals play central roles.
Two types of models
• 1% doped
• overlayer
[Figure: host and guest metals in the two models; panel shown: 1% doped]
Ruban, Hammer, Stoltze, Skriver, Nørskov, J Mol Catal A, 115:421-429 (1997)
J. K. Nørskov, et al., Advances in Catalysis, 2000
6. The beauty of the periodic table worked!
Fe Co Ni Cu Ru Rh Pd Ag Ir Pt Au
Fe -0.92 -0.96 -0.97 -1.65 -1.64 -2.24 -1.87 -2.4 -3.11
Co -1.37 -1.23 -2.12 -2.82 -2.53 -2.26 -3.56
Ni -0.33 -1.18 -1.92 -2.03 -2.43 -2.15 -2.82 -3.39
Cu -2.42 -2.49 -2.67 -2.89 -2.94 -3.82 -4.63
Ru -1.11 -1.04 -1.12 -1.41 -1.88 -1.81 -1.54 -2.27
Rh -1.42 -1.32 -1.51 -1.7 -1.73 -2.12 -1.81 -1.7 -2.18 -2.3
Pd -1.47 -1.29 -1.29 -1.03 -1.58 -1.83 -1.68 -1.52 -1.79
Ag -3.75 -3.56 -3.62 -3.8 -4.03 -3.5 -3.93 -4.51
Ir -1.78 -1.71 -1.78 -1.55 -2.14 -2.53 -2.2 -2.11 -2.6 -2.7
Pt -1.71 -1.47 -2.13 -2.01 -2.23 -2.06 -1.96 -2.33
Au -3.03 -2.82 -2.85 -2.89 -3.44 -3.56
Fe Co Ni Cu Ru Rh Pd Ag Ir Pt Au
Fe -0.78 -1.65 -1.64 -1.87
Co -1.18 -1.17 -1.37 -1.87 -2.12 -2.82 -2.26
Ni -0.33 -1.18 -1.17 -2.61 -2.43 -2.15 -2.82
Cu -2.42 -2.89 -2.94 -3.88 -4.63
Ru -1.11 -1.04 -1.12 -1.11 -1.41 -1.81 -2.27
Rh -1.42 -1.51 -2.12 -1.81 -1.7
Pd -1.29 -1.29 -1.03 -1.58 -1.83 -1.52 -1.79
Ag -3.68 -3.8 -3.63 -4.51
Ir -2.14 -2.11 -2.7
Pt -1.71 -1.47 -2.13 -2.01 -2.23 -2.06
Au -2.86 -3.09 -2.89 -3.44 -3.56
Fe Co Ni Cu Ru Rh Pd Ag Ir Pt Au
Fe -2.17 -3.11
Co -1.17 -1.37 -2.12
Ni -0.33 -1.18 -2.61 -2.43
Cu -2.42 -2.29 -2.49 -3.71 -4.63
Ru -2.02
Rh -1.32 -1.73 -2.12
Pd -1.94 -1.83 -1.97
Ag -3.75 -3.68 -4.51
Ir -1.78 -1.71 -2.7
Pt -2.13
Au -3.09 -2.89
training sets (75%), test sets (25%): gradient boosting w/ 6 descriptors, 100 times, mean RMSE: 0.153 eV
training sets (50%), test sets (50%): gradient boosting w/ 6 descriptors, 100 times, mean RMSE: 0.235 eV
training sets (25%), test sets (75%): gradient boosting w/ 6 descriptors, 100 times, mean RMSE: 0.402 eV
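The evaluation protocol on this slide (repeat a random training/test split 100 times and report the mean test RMSE) can be sketched as follows. This is only an illustration under stated assumptions: the helper names (`rmse`, `mean_rmse_over_splits`, `fit_mean_baseline`) are hypothetical, the data are synthetic rather than the d-band centers above, and a trivial mean predictor stands in for the gradient boosting model.

```python
import math
import random

def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mean_rmse_over_splits(X, y, fit, train_fraction=0.75, n_repeats=100, seed=0):
    """Average test RMSE over repeated random training/test splits.

    `fit(X_train, y_train)` must return a function mapping one x to a prediction.
    """
    rng = random.Random(seed)
    n_train = int(round(train_fraction * len(X)))
    scores = []
    for _ in range(n_repeats):
        idx = list(range(len(X)))
        rng.shuffle(idx)  # a fresh random split on every repeat
        train, test = idx[:n_train], idx[n_train:]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        scores.append(rmse([y[i] for i in test], [predict(X[i]) for i in test]))
    return sum(scores) / len(scores)

def fit_mean_baseline(X_train, y_train):
    """Placeholder model (an assumption): always predicts the training mean."""
    mean = sum(y_train) / len(y_train)
    return lambda x: mean

# Synthetic toy data, NOT the d-band centers from the slides.
X = [[float(i)] for i in range(40)]
y = [0.1 * i for i in range(40)]
for fraction in (0.75, 0.50, 0.25):
    print(fraction, round(mean_rmse_over_splits(X, y, fit_mean_baseline, fraction), 3))
```

Any regressor can be passed as `fit`, so the same loop would serve for comparing models or descriptor sets under identical splits.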
7. The ML model
9 descriptors pretested for host & guest (18 in total), all readily available:
• Group (G)
• Bulk Wigner–Seitz radius (R) in Å
• Atomic number (AN)
• Atomic mass (AM) in g mol⁻¹
• Period (P)
• Electronegativity (EN)
• Ionization energy (IE) in eV
• Enthalpy of fusion (∆fusH) in J g⁻¹
• Density at 25 ℃ (ρ) in g cm⁻³
Gradient Boosted Tree Regression (GBR) with only 6 descriptors:
(1) Group in the periodic table (host)
(2) Density at 25 ℃ (host)
(3) Enthalpy of fusion (guest)
(4) Ionization energy (guest)
(5) Enthalpy of fusion (host)
(6) Ionization energy (host)
8. 11 ML methods pretested
[3 Tree Ensembles (Nonlinear Regression Models)]
GBR (Gradient Boosted Tree Regression); ETR (Extra-Trees Regression); RFR (Random Forest Regression);
[5 Linear Regression Models]
OLS (Ordinary Least Squares Regression); PLS (Partial Least Squares Regression); LASSO (Lasso Regression); RIDGE (Ridge Regression); RANSAC (Random Sample Consensus Regression);
[3 Kernel Methods (Nonlinear Regression Models)]
GPR (Gaussian Process Regression); KRR (Kernel Ridge Regression); SVR (Support Vector Regression);
[Figure: comparison of the 11 methods; axis labels in order: GBR, ETR, GPR, RFR, KRR, OLS, RIDGE, PLS, RANSAC, SVR, LASSO]
training sets (75%), test sets (25%)
9. Tree ensemble regressors (GBR, ETR, RFR)
[Figure: the surface $y = \sin\left(\sqrt{x_1^2 + x_2^2}\right)$ over (x1, x2), approximated (≈) by a decision tree and by a tree ensemble]
Decision Tree (Regression Tree): ŷ takes constant values c1, c2, c3, ... on regions of the (x1, x2) plane
Tree Ensemble: ŷ = tree + tree + tree + ...
• Region-wise constant prediction
• The regions are given by recursive axis-parallel partitioning of the data space
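The two properties above (region-wise constant prediction, recursive axis-parallel partitioning) can be illustrated with a small regression tree written from scratch. This is a minimal sketch, not the implementation used in the study: the function names, the least-squares split criterion, and the `max_depth` / `min_leaf` settings are assumptions chosen for illustration, and the data are random samples of the surface y = sin(√(x1² + x2²)) shown on the slide.

```python
import math
import random

def _sse(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def fit_tree(X, y, max_depth=3, min_leaf=5, depth=0):
    """Recursively split on one axis at a time; each leaf stores a constant."""
    mean = sum(y) / len(y)
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return {"value": mean}
    best = None  # (sse, feature index, threshold)
    for j in range(len(X[0])):
        order = sorted(range(len(X)), key=lambda i: X[i][j])
        for k in range(min_leaf, len(order) - min_leaf + 1):
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot separate identical coordinates
            sse = _sse([y[i] for i in order[:k]]) + _sse([y[i] for i in order[k:]])
            if best is None or sse < best[0]:
                best = (sse, j, (lo + hi) / 2.0)
    if best is None:
        return {"value": mean}
    _, j, thr = best
    left = [i for i in range(len(X)) if X[i][j] <= thr]
    right = [i for i in range(len(X)) if X[i][j] > thr]
    return {
        "feature": j,
        "threshold": thr,
        "left": fit_tree([X[i] for i in left], [y[i] for i in left], max_depth, min_leaf, depth + 1),
        "right": fit_tree([X[i] for i in right], [y[i] for i in right], max_depth, min_leaf, depth + 1),
    }

def predict_tree(tree, x):
    """Walk down to the leaf whose axis-parallel region contains x."""
    while "value" not in tree:
        tree = tree["left"] if x[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["value"]

def train_rmse(tree, X, y):
    return math.sqrt(sum((predict_tree(tree, x) - t) ** 2 for x, t in zip(X, y)) / len(y))

rng = random.Random(0)
X = [[rng.uniform(-5, 5), rng.uniform(-5, 5)] for _ in range(200)]
y = [math.sin(math.sqrt(x1 * x1 + x2 * x2)) for x1, x2 in X]

shallow = fit_tree(X, y, max_depth=2)
deep = fit_tree(X, y, max_depth=5)
grid = [[a / 2.0, b / 2.0] for a in range(-10, 11) for b in range(-10, 11)]
distinct = {predict_tree(shallow, g) for g in grid}
print(len(distinct), train_rmse(shallow, X, y), train_rmse(deep, X, y))
```

A depth-2 tree has at most four leaves, so it can output at most four distinct values over the whole plane; deeper trees refine the same partition and lower the training error.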
10. Tree ensemble regressors (GBR, ETR, RFR)
Advantages
• quick, nonlinear, parallelizable
• highly accurate (widely used in many winning
solutions for data prediction competitions)
• usually less hyperparameter dependent
(compared to kernel methods and neural networks)
• conservative extrapolation
• "variable importance" provided
• popular implementations
• Scikit-learn
• XGBoost (by DMLC)
• LightGBM (by Microsoft)
How to generate multiple decision trees from the data?
• RFR / ETR
random patches (random subsampling of instances and variables) or random splits
• GBR (can also be mixed with the above strategy)
sequentially add a new tree to compensate for the weak points of the current ensemble.
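The GBR bullet above can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the configuration used in the study: it assumes squared-error loss (so the "weak point" each new tree targets is simply the current residual), uses one-split trees (stumps) on a 1-D input, and the names `fit_stump`, `fit_gbr`, the learning rate, and the sine toy data are all hypothetical.

```python
import math

def fit_stump(x, r):
    """Least-squares one-split tree on 1-D inputs x for targets r."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best = None  # (sse, threshold, left mean, right mean)
    for k in range(1, len(order)):
        lo, hi = x[order[k - 1]], x[order[k]]
        if lo == hi:
            continue
        left = [r[i] for i in order[:k]]
        right = [r[i] for i in order[k:]]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
        if best is None or sse < best[0]:
            best = (sse, (lo + hi) / 2.0, lm, rm)
    _, thr, lm, rm = best  # assumes x holds at least two distinct values
    return lambda v: lm if v <= thr else rm

def fit_gbr(x, y, n_trees, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(y) / len(y)
    stumps = []
    pred = [base] * len(y)
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]  # where the ensemble is still wrong
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * stump(xi) for xi, pi in zip(x, pred)]
    return lambda v: base + learning_rate * sum(s(v) for s in stumps)

def mse(predict, x, y):
    return sum((predict(xi) - yi) ** 2 for xi, yi in zip(x, y)) / len(y)

# Toy 1-D data (an assumption for illustration only).
x = [i / 5.0 for i in range(32)]
y = [math.sin(v) for v in x]
mse_1 = mse(fit_gbr(x, y, n_trees=1), x, y)
mse_50 = mse(fit_gbr(x, y, n_trees=50), x, y)
print(mse_1, mse_50)
```

The RFR / ETR strategies differ in that their trees are built independently on randomized views of the data and then averaged, rather than added sequentially.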
11. Descriptor analysis and evaluation
Descriptor importances and descriptor selection (top-k)
GBR with 18 descriptors: 100 times mean RMSE: 0.204±0.047 eV
GBR with 6 descriptors: 100 times mean RMSE: 0.212±0.047 eV
GBR with 4 descriptors: 100 times mean RMSE: 0.214±0.046 eV
training sets (75%), test sets (25%)
12. Case 2. Predicting the adsorption energy
Predicting Adsorption energy of CH3
DFT calculation of adsorption energy
• 10 hours with our 32-core workstation (CH3 on the Cu monometallic surface)
• even longer (about 34 hours) for a system containing another metal such as Pb
ML prediction
• < 1 sec with our 1-core laptop
• depends on the method we choose, not on the target system
training sets (75%), test sets (25%)
13. But what do these mean for catalyst design and discovery!?
[Figure repeated from slide 6: d-band center tables for the 75%/25%, 50%/50%, and 25%/75% training/test splits; gradient boosting w/ 6 descriptors; mean RMSE over 100 runs: 0.153 eV, 0.235 eV, and 0.402 eV, respectively]
14. Standard procedure for optimizing the activity
All your available data
• Experiments
• Simulations
Hypothesis generation (abduction)
Check results
Feedback
15. Replace the time-consuming and costly part by ML
All your available data
Machine Learning (any "data-driven" predictions)
Make predictions for many possible candidates
Check the best predicted ones
Feed them as training data
The "Surrogate (or proxy)" model for
• Demanding experiments
• Time-consuming hi-fidelity simulations
16. Replace the time-consuming and costly part by ML
This simple procedure won't work in most practical cases!
17. ML itself is not for "discovery"
[Figure: Activity vs. descriptors x (high dimensional), showing the training samples and the best known value]
18. ML itself is not for "discovery"
[Figure: Activity vs. descriptors x (high dimensional), showing the training samples, the best known value, and the ML prediction]
ML prediction: fitted to minimize the average error.
19. ML itself is not for "discovery"
[Figure: Activity vs. descriptors x (high dimensional), showing the training samples, the best known value, and the ML prediction]
ML prediction: fitted to minimize the average error.
Nicely predicted for the average (but mediocre).
20. ML itself is not for "discovery"
[Figure: Activity vs. descriptors x (high dimensional), showing the training samples, the best known value, the ML prediction, and an outlier]
ML prediction: fitted to minimize the average error.
Nicely predicted for the average (but mediocre).
A nice "discovery" can deviate largely from the average of the knowns (an outlier).
21. ML itself is not for "discovery"
[Figure: Activity vs. descriptors x (high dimensional), showing the training samples, the best known value, the ML prediction, and an outlier]
ML prediction: fitted to minimize the average error.
Nicely predicted for the average (but mediocre).
A nice "discovery" can deviate largely from the average of the knowns (an outlier).
ML captures the average trend of the available knowns.
"Discovery" corresponds to something not in the knowns.
Mismatch!
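One way to make "fitted to minimize the average error" concrete is the following, assuming squared error as the error measure (the slides do not say which error is used) and writing f for the model, (x_i, y_i) for the n training samples, and R for a region in which the model predicts a single constant c:

```latex
\hat{f} = \arg\min_{f} \frac{1}{n} \sum_{i=1}^{n} \bigl(y_i - f(x_i)\bigr)^2,
\qquad
\arg\min_{c} \sum_{i \in R} (y_i - c)^2 = \frac{1}{|R|} \sum_{i \in R} y_i .
```

Under this assumption the best constant in any region is the mean of the known activities there, which is why a single exceptional sample is pulled toward its mediocre neighbours rather than reproduced.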
22. An ML model is just representative of the training data
Highly Inaccurate Model Predictions from
Extrapolation (Lohninger 1999)
"Beware of the perils of extrapolation,
and understand that ML algorithms
build models that are representative of
the available training samples."
"exploration": to obtain new knowledge/data. We also need this.
"exploitation": to use the knowledge/data to improve the performance. ML is basically for this.
23. Surrogate optimization (model-based optimization)
[Figure: Activity vs. descriptors x (high dimensional), with the ML prediction]
Use ML to guide the balance between "exploitation" and "exploration"!
24. Surrogate optimization (model-based optimization)
[Figure: Activity vs. descriptors x (high dimensional), with the ML prediction and its uncertainty]
"Uncertainty" of prediction, e.g. prediction variance
Use ML to guide the balance between "exploitation" and "exploration"!
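The slide names prediction variance as one example of an uncertainty measure. A simple way to obtain such a variance, sketched here as an assumption rather than as the method used in the talk, is to train an ensemble on bootstrap resamples and take the spread of its predictions. The straight-line model, the helper names, and the noisy toy data are all hypothetical.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns a predictor."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def bootstrap_ensemble(xs, ys, n_models=200, seed=0):
    """Fit one model per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def predict_with_uncertainty(models, x):
    """Ensemble mean as the prediction, ensemble variance as the uncertainty."""
    preds = [m(x) for m in models]
    return statistics.mean(preds), statistics.pvariance(preds)

# Noisy toy data on [0, 1] (an assumption for illustration only).
rng = random.Random(1)
xs = [i / 19 for i in range(20)]
ys = [2.0 * x + rng.gauss(0.0, 0.1) for x in xs]

models = bootstrap_ensemble(xs, ys)
_, var_inside = predict_with_uncertainty(models, 0.5)   # within the training range
_, var_outside = predict_with_uncertainty(models, 3.0)  # far outside it
print(var_inside, var_outside)
```

With this model the variance grows away from the training data, which marks the regions where "exploration" would add the most new information. Other model families can behave differently far from the data.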
25. Surrogate optimization (model-based optimization)
[Figure: Activity vs. descriptors x (high dimensional), with the ML prediction and its uncertainty]
"Uncertainty" of prediction, e.g. prediction variance
e.g. "expected improvement"
Use ML to guide the balance between "exploitation" and "exploration"!
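To make the "expected improvement" idea concrete, here is a minimal sketch. It assumes the model's prediction at a candidate can be summarized as a Gaussian with mean `mu` and standard deviation `sigma`, and that activity is to be maximized; the function name and the example numbers are illustrative and do not come from the slides.

```python
import math

def expected_improvement(mu, sigma, best_so_far):
    """EI for maximization, assuming a Gaussian prediction N(mu, sigma^2)."""
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(mu - best_so_far, 0.0)
    z = (mu - best_so_far) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return (mu - best_so_far) * cdf + sigma * pdf

# Two hypothetical candidates with the same predicted activity (0.8),
# both below the best observed value (1.0).
ei_confident = expected_improvement(0.8, 0.05, 1.0)
ei_uncertain = expected_improvement(0.8, 0.30, 1.0)
print(ei_confident, ei_uncertain)
```

With equal predicted means, the candidate with the larger predictive uncertainty receives the larger EI, which is how the criterion trades exploitation against exploration.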
26. Surrogate optimization (model-based optimization)
1. Initial sampling (DoE)
2. Loop:
   1. Construct a surrogate model.
   2. Search the infill criterion.
   3. Add new samples (intervention).
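The loop above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the method used in the talk: the objective is a hypothetical one-dimensional function standing in for an expensive experiment, the surrogate is a simple k-nearest-neighbour average, and the infill criterion is an upper-confidence-bound style score rather than EI.

```python
import math
import random

def objective(x):
    # Hypothetical stand-in for an expensive experiment or DFT calculation.
    return math.sin(3.0 * x) + 0.5 * x

def surrogate(x, xs, ys, k=3):
    """Toy k-nearest-neighbour surrogate returning (mean, uncertainty).
    Uncertainty = spread of the neighbours + distance to the nearest sample."""
    nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x))[:k]
    vals = [y for _, y in nearest]
    mean = sum(vals) / len(vals)
    spread = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
    return mean, spread + abs(nearest[0][0] - x)

def infill(x, xs, ys, kappa=1.0):
    # Exploitation (predicted mean) plus kappa times exploration (uncertainty).
    mean, unc = surrogate(x, xs, ys)
    return mean + kappa * unc

def surrogate_optimize(n_init=4, n_iter=15, seed=0):
    rng = random.Random(seed)
    # 1. Initial sampling (random here; a space-filling DoE is also common).
    xs = [rng.uniform(0.0, 2.0) for _ in range(n_init)]
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        # 2.1 The surrogate is recomputed from all (xs, ys) on every call.
        # 2.2 Search the infill criterion over random candidates.
        candidates = [rng.uniform(0.0, 2.0) for _ in range(200)]
        x_new = max(candidates, key=lambda c: infill(c, xs, ys))
        # 2.3 Add the new sample (the "intervention").
        xs.append(x_new)
        ys.append(objective(x_new))
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best], len(xs)

best_x, best_y, n_evals = surrogate_optimize()
print(best_x, best_y, n_evals)
```

Only `n_init + n_iter` evaluations of the expensive objective are spent; all other work happens on the cheap surrogate.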
• Reinforcement learning
• Blackbox optimization
• Bayesian optimization
• Sequential design of experiments
• Multi-armed bandit
• Evolutionary computation
• Game-theoretic approaches
⋮
An open research topic in ML
27. Structure-activity landscapes are nonsmooth...
J. Med. Chem. 2012, 55, 2932−2942
The structure-activity landscape is often nonsmooth: small changes in
descriptors can strongly affect the activity or selectivity.
[Figure panels: activity cliffs; selectivity cliffs]
34. Batched + restricted surrogate optimization
Our optimization model for heterogeneous catalyst data:
• Nonsmooth interpolation (SMAC-like, using tree ensemble regressors)
  with multistart local search by L-BFGS + random samples
• Use EI as the infill criterion, estimated by OOB or quantile regression
• Get multiple samples in each loop
  by batched optimization, with samples "diversified" by clustering
  (some margin is needed; in reality, suggested catalysts are not easy to realize)
• Pose several restrictions on new samples to be tested:
  • the compositions must sum to 1 (compositional restriction)
  • the number of elements in a catalyst is restricted (bounded nonzeros)
    (because a catalyst with 60 elements would not be realistic...)
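The two restrictions can be illustrated with a simple truncate-and-renormalize step. This is only a sketch of one way to enforce them on a candidate; the slides do not say how the restrictions are implemented, and the function name, the element symbols, and the limit of three elements are hypothetical.

```python
def restrict_composition(weights, max_elements=3):
    """Keep the max_elements largest non-negative weights and rescale to sum to 1."""
    # Negative weights are not physical compositions; clip them to zero.
    clipped = {el: max(w, 0.0) for el, w in weights.items()}
    # Bounded nonzeros: keep only the largest components.
    kept = sorted(clipped, key=clipped.get, reverse=True)[:max_elements]
    total = sum(clipped[el] for el in kept)
    if total <= 0.0:
        raise ValueError("no positive component to normalize")
    # Compositional restriction: rescale so the fractions sum to 1.
    return {el: clipped[el] / total for el in kept if clipped[el] > 0.0}

# Hypothetical raw suggestion from an optimizer (element symbols are illustrative).
raw = {"Mn": 0.40, "Na": 0.30, "W": 0.20, "La": 0.06, "Sr": 0.04}
restricted = restrict_composition(raw, max_elements=3)
print(restricted)
```

Applying such a step to every suggested sample keeps the candidates inside the feasible region before they are passed on for testing.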
35. Case 3. Predicting the catalytic activity (in prep)
Test on 3 datasets:
• Oxidative coupling of methane (OCM) [Zavyalova+ 2011]
• Water gas shift reaction (WGS) [Odabaşi+ 2014]
• CO oxidation [Günay+ 2013]
[Figure legend: Our model, GPR-based BO, Random]
36. ICReDD, Hokkaido University
Check the website for any collaborations and postdoc positions!
Our mission:
To rationally design and discover new chemical reactions
by seamlessly fusing
• experimental sciences (realization)
• computational sciences (theory-driven)
• information sciences (data-driven)
Started Oct 2018; funded $6.4 million per year by the government (for 10 years)
[Map: Sapporo (Hokkaido) and Tokyo]
Sapporo:
• 2 million population (5th largest city in Japan)
• 6.3 m / 248 inches avg. annual snowfall
Institute for Chemical Reaction Design and Discovery
(WPI-ICReDD), Hokkaido University
38. Summary
• Predicting the d-band centers by ML
  (Takigawa et al., RSC Advances, 2016)
• Predicting the adsorption energy by ML
  (Toyao et al., JPC C, 2018)
• Predicting the experimentally reported catalytic activity by ML
  (Suzuki et al., in preparation)
Acknowledgements
Ken-ichi SHIMIZU (ICAT)
Satoru TAKAKUSAGI (ICAT)
Takashi TOYAO (ICAT)
Keisuke SUZUKI (DENSO)