Video https://youtu.be/P4QogT8bdqY
ACS Spring 2023 Symposium on AI-Accelerated Scientific Workflow
https://acs.digitellinc.com/acs/sessions/526630/view
ACS SPRING 2023 ———— Crossroads of Chemistry
Indianapolis, IN & Hybrid, March 26-30
https://www.acs.org/meetings/acs-meetings/spring-2023.html
Slide PDF
https://itakigawa.page.link/acs2023spring
Our Paper
Accelerated discovery of multi-elemental reverse water-gas shift catalysts using extrapolative machine learning approach (2022, ChemRxiv)
https://doi.org/10.26434/chemrxiv-2022-695rj
Ichi Takigawa
https://itakigawa.github.io/
Use of spark for proteomic scoring seattle presentationlordjoe
Slides presented to the Seattle Spark Meetup on August 12 2015 - Note the work on Accumulators is a separate GitHub project https://github.com/lordjoe/SparkAccumulators
PubChem QC project. In this project we calculate all molecules in the PubChem Project. Currently 1,100,000 molecules are available at http://pubchemqc.riken.jp/ . Results are in public domain.
How to use data to design and optimize reaction? A quick introduction to work...Ichigaku Takigawa
(Journal Club) ICReDD Seminar, Apr 27 2020
Institute for Chemical Reaction Design and Discovery (ICReDD)
Hokkaido University
Sapporo, JAPAN
https://www.icredd.hokudai.ac.jp
ARChem Route Designer is a new retrosynthetic analysis package (1) that generates complete synthetic routes for target molecules from readily available starting materials. Rule generation from reaction databases is fully automated to insure that the system can keep abreast with the latest reaction literature and available starting materials. After these rules are used to carry out an exhaustive retrosynthetic analysis of the target molecules, special heuristics are used to mitigate a combinatorial explosion.
Proposed routes are then prioritized by a merit ranking algorithm to present to the viewer the most diverse solution profile. Users then have the option to view the details of the overall reaction tree and to scroll through the reaction details. The program runs on a server with a web-based user interface. An overview of some of the challenges, solutions, and examples that illustrate ARChem's utility will be discussed.
Use of spark for proteomic scoring seattle presentationlordjoe
Slides presented to the Seattle Spark Meetup on August 12 2015 - Note the work on Accumulators is a separate GitHub project https://github.com/lordjoe/SparkAccumulators
PubChem QC project. In this project we calculate all molecules in the PubChem Project. Currently 1,100,000 molecules are available at http://pubchemqc.riken.jp/ . Results are in public domain.
How to use data to design and optimize reaction? A quick introduction to work...Ichigaku Takigawa
(Journal Club) ICReDD Seminar, Apr 27 2020
Institute for Chemical Reaction Design and Discovery (ICReDD)
Hokkaido University
Sapporo, JAPAN
https://www.icredd.hokudai.ac.jp
ARChem Route Designer is a new retrosynthetic analysis package (1) that generates complete synthetic routes for target molecules from readily available starting materials. Rule generation from reaction databases is fully automated to insure that the system can keep abreast with the latest reaction literature and available starting materials. After these rules are used to carry out an exhaustive retrosynthetic analysis of the target molecules, special heuristics are used to mitigate a combinatorial explosion.
Proposed routes are then prioritized by a merit ranking algorithm to present to the viewer the most diverse solution profile. Users then have the option to view the details of the overall reaction tree and to scroll through the reaction details. The program runs on a server with a web-based user interface. An overview of some of the challenges, solutions, and examples that illustrate ARChem's utility will be discussed.
RMG at the Flame Chemistry Workshop 2014Richard West
Presentation to the 2nd International Workshop on Flame Chemistry, preceding the 35th International Symposium on Combustion, in San Francisco, CA, in August 2014.
Describes recent progress in two projects related to our Reaction Mechanism Generator software.
An introductory workshop about machine learning in chemistry. This workshop is a set of slides and jupyter notebooks intended to give an overview of machine learning in chemistry to graduate students in chemical sciences, which was originally presented during a research trip to Ben Gurion University and the Hebrew University in Jerusalem in February 2019. Part 1 of 2.
The workshop lives at https://github.com/jpjanet/ML-chem-workshop where it is maintained in an up-to-date fashion. Notebook examples can be obtained from the GitHub page.
SWCNT Growth from Chiral and Achiral Carbon Nanorings: Prediction of Chiralit...Stephan Irle
Catalyst-free, chirality-controlled growth of chiral and achiral single-walled carbon nanotubes (SWCNTs) from organic precursors is demonstrated using quantum chemical simulations [1]. Growth of (4,3), (6,5), (6,1), (10,1), (6,6) and (8,0) SWCNTs was induced by ethynyl radical (C2H) addition to organic precursors. These simulations show a strong dependence of the SWCNT growth rate on the chiral angle, θ. The SWCNT diameter however does not influence the SWCNT growth rate under these conditions. This agreement with a previously proposed screw-dislocation-like model of transition metal-catalyzed SWCNT growth rates [2] indicates that the SWCNT growth rate is an intrinsic property of the SWCNT edge itself. Conversely, we predict that the rate of local SWCNT growth via Diels-Alder cycloaddition of C2H2 is strongly influenced by the diameter of the SWCNT. We therefore predict the existence of a maximum local growth rate for an optimum diameter/chirality combination at a given C2H/C2H2 ratio. We also find that the ability of a SWCNT to avoid defect formation during growth is an intrinsic quality of the SWCNT edge.
References:
[1] Li, H.-B.; Page, A. J.; Irle, S.; Morokuma, K. J. Am. Chem. Soc. 2012, 134, 15887-15896.
[2] Ding, F.; Harutyunyan, A. R.; Yakobson, B. I. Proc. Natl. Acad. Sci. 2009, 106, 2506-2509.
Presentation on machine learning and materials science at Computing in Engineering Forum 2018, Machine Ground Interaction Consortium (MaGIC) 2018, Wisconsin, Madison, December 4, 2018
IC-SDV 2019: The RISE of Machine Learning; Predicting Future of Process Devel...Dr. Haxel Consult
Process research and development of active pharmaceutical ingredients (API’s) plays an important role in the pharmaceutical industry. In this phase, all aspects of an API’s synthetic route are tested and optimized.
We envision that artificial intelligence and machine learning techniques could help in the optimization process of chemical transformations. In this presentation, the latest advances in the prediction of reaction conditions using machine learning algorithms will be discussed.
This presentation was given on October 15th 2014 at the University of Connecticut to a class hosted by Mark Peczuh. The intention was to provide an overview of data upload to ChemSpider and ChemSpider SyntheticPages as well as the new direction with the data repository.
Molecular modelling for in silico drug discoveryLee Larcombe
A slide set based on the small molecule section of "Introduction to in silico drug discovery" with more detail on molecular modelling and simulation aspects. Including a bit more on protein structure prediction
Many of us nowadays invest significant amounts of time in sharing our activities and opinions with friends and family via social networking tools. However, despite the availability of many platforms for scientists to connect and share with their peers in the scientific community the majority do not make use of these tools, despite their promise and potential impact and influence on our future careers. We are being indexed and exposed on the internet via our publications, presentations and data. We also have many more ways to contribute to science, to annotate and curate data, to “publish” in new ways, and many of these activities are as part of a growing crowdsourcing network. This presentation will provide an overview of the various types of networking and collaborative sites available to scientists and ways to expose your scientific activities online. Many of these can ultimately contribute to the developing measures of you as a scientist as identified in the new world of alternative metrics. Participating offers a great opportunity to develop a scientific profile within the community and may ultimately be very beneficial, especially to scientists early in their career.
Metabolomic Data Analysis Workshop and Tutorials (2014)Dmitry Grapov
Get more information:
http://imdevsoftware.wordpress.com/2014/10/11/2014-metabolomic-data-analysis-and-visualization-workshop-and-tutorials/
Recently I had the pleasure of teaching statistical and multivariate data analysis and visualization at the annual Summer Sessions in Metabolomics 2014, organized by the NIH West Coast Metabolomics Center.
Similar to last year, I’ve posted all the content (lectures, labs and software) for any one to follow along with at their own pace. I also plan to release videos for all the lectures and labs.
RMG at the Flame Chemistry Workshop 2014Richard West
Presentation to the 2nd International Workshop on Flame Chemistry, preceding the 35th International Symposium on Combustion, in San Francisco, CA, in August 2014.
Describes recent progress in two projects related to our Reaction Mechanism Generator software.
An introductory workshop about machine learning in chemistry. This workshop is a set of slides and jupyter notebooks intended to give an overview of machine learning in chemistry to graduate students in chemical sciences, which was originally presented during a research trip to Ben Gurion University and the Hebrew University in Jerusalem in February 2019. Part 1 of 2.
The workshop lives at https://github.com/jpjanet/ML-chem-workshop where it is maintained in an up-to-date fashion. Notebook examples can be obtained from the GitHub page.
SWCNT Growth from Chiral and Achiral Carbon Nanorings: Prediction of Chiralit...Stephan Irle
Catalyst-free, chirality-controlled growth of chiral and achiral single-walled carbon nanotubes (SWCNTs) from organic precursors is demonstrated using quantum chemical simulations [1]. Growth of (4,3), (6,5), (6,1), (10,1), (6,6) and (8,0) SWCNTs was induced by ethynyl radical (C2H) addition to organic precursors. These simulations show a strong dependence of the SWCNT growth rate on the chiral angle, θ. The SWCNT diameter however does not influence the SWCNT growth rate under these conditions. This agreement with a previously proposed screw-dislocation-like model of transition metal-catalyzed SWCNT growth rates [2] indicates that the SWCNT growth rate is an intrinsic property of the SWCNT edge itself. Conversely, we predict that the rate of local SWCNT growth via Diels-Alder cycloaddition of C2H2 is strongly influenced by the diameter of the SWCNT. We therefore predict the existence of a maximum local growth rate for an optimum diameter/chirality combination at a given C2H/C2H2 ratio. We also find that the ability of a SWCNT to avoid defect formation during growth is an intrinsic quality of the SWCNT edge.
References:
[1] Li, H.-B.; Page, A. J.; Irle, S.; Morokuma, K. J. Am. Chem. Soc. 2012, 134, 15887-15896.
[2] Ding, F.; Harutyunyan, A. R.; Yakobson, B. I. Proc. Natl. Acad. Sci. 2009, 106, 2506-2509.
Presentation on machine learning and materials science at Computing in Engineering Forum 2018, Machine Ground Interaction Consortium (MaGIC) 2018, Wisconsin, Madison, December 4, 2018
IC-SDV 2019: The RISE of Machine Learning; Predicting Future of Process Devel...Dr. Haxel Consult
Process research and development of active pharmaceutical ingredients (API’s) plays an important role in the pharmaceutical industry. In this phase, all aspects of an API’s synthetic route are tested and optimized.
We envision that artificial intelligence and machine learning techniques could help in the optimization process of chemical transformations. In this presentation, the latest advances in the prediction of reaction conditions using machine learning algorithms will be discussed.
This presentation was given on October 15th 2014 at the University of Connecticut to a class hosted by Mark Peczuh. The intention was to provide an overview of data upload to ChemSpider and ChemSpider SyntheticPages as well as the new direction with the data repository.
Molecular modelling for in silico drug discoveryLee Larcombe
A slide set based on the small molecule section of "Introduction to in silico drug discovery" with more detail on molecular modelling and simulation aspects. Including a bit more on protein structure prediction
Many of us nowadays invest significant amounts of time in sharing our activities and opinions with friends and family via social networking tools. However, despite the availability of many platforms for scientists to connect and share with their peers in the scientific community the majority do not make use of these tools, despite their promise and potential impact and influence on our future careers. We are being indexed and exposed on the internet via our publications, presentations and data. We also have many more ways to contribute to science, to annotate and curate data, to “publish” in new ways, and many of these activities are as part of a growing crowdsourcing network. This presentation will provide an overview of the various types of networking and collaborative sites available to scientists and ways to expose your scientific activities online. Many of these can ultimately contribute to the developing measures of you as a scientist as identified in the new world of alternative metrics. Participating offers a great opportunity to develop a scientific profile within the community and may ultimately be very beneficial, especially to scientists early in their career.
Metabolomic Data Analysis Workshop and Tutorials (2014)Dmitry Grapov
Get more information:
http://imdevsoftware.wordpress.com/2014/10/11/2014-metabolomic-data-analysis-and-visualization-workshop-and-tutorials/
Recently I had the pleasure of teaching statistical and multivariate data analysis and visualization at the annual Summer Sessions in Metabolomics 2014, organized by the NIH West Coast Metabolomics Center.
Similar to last year, I’ve posted all the content (lectures, labs and software) for any one to follow along with at their own pace. I also plan to release videos for all the lectures and labs.
Machine Learning for Molecules: Lessons and Challenges of Data-Centric ChemistryIchigaku Takigawa
Perspectives on Artificial Intelligence and Machine Learning in Materials Science
February 4, 2022. – February 6, 2022.
https://joint.imi.kyushu-u.ac.jp/post-2698/
Machine Learning for Molecular Graph Representations and GeometriesIchigaku Takigawa
Dec 1, 2021, Pacifico Yokohama, Japan.
Symposium 1AS-17 "Data science and machine learning: Tackling the Noise and Heterogeneity of the Real World"
The 44th Annual Meetingn of the Molecular Biology Society of Japan
https://www2.aeplan.co.jp/mbsj2021/english/designation/index.html
Friday, October 15th, 2021, Sapporo, Hokkaido, Japan.
Hokkaido University ICReDD - Faculty of Medicine Joint Symposium
https://www.icredd.hokudai.ac.jp/event/5993
ICReDD (Institute for Chemical Reaction Design and Discovery)
https://www.icredd.hokudai.ac.jp
Slide 1: Title Slide
Extrachromosomal Inheritance
Slide 2: Introduction to Extrachromosomal Inheritance
Definition: Extrachromosomal inheritance refers to the transmission of genetic material that is not found within the nucleus.
Key Components: Involves genes located in mitochondria, chloroplasts, and plasmids.
Slide 3: Mitochondrial Inheritance
Mitochondria: Organelles responsible for energy production.
Mitochondrial DNA (mtDNA): Circular DNA molecule found in mitochondria.
Inheritance Pattern: Maternally inherited, meaning it is passed from mothers to all their offspring.
Diseases: Examples include Leber’s hereditary optic neuropathy (LHON) and mitochondrial myopathy.
Slide 4: Chloroplast Inheritance
Chloroplasts: Organelles responsible for photosynthesis in plants.
Chloroplast DNA (cpDNA): Circular DNA molecule found in chloroplasts.
Inheritance Pattern: Often maternally inherited in most plants, but can vary in some species.
Examples: Variegation in plants, where leaf color patterns are determined by chloroplast DNA.
Slide 5: Plasmid Inheritance
Plasmids: Small, circular DNA molecules found in bacteria and some eukaryotes.
Features: Can carry antibiotic resistance genes and can be transferred between cells through processes like conjugation.
Significance: Important in biotechnology for gene cloning and genetic engineering.
Slide 6: Mechanisms of Extrachromosomal Inheritance
Non-Mendelian Patterns: Do not follow Mendel’s laws of inheritance.
Cytoplasmic Segregation: During cell division, organelles like mitochondria and chloroplasts are randomly distributed to daughter cells.
Heteroplasmy: Presence of more than one type of organellar genome within a cell, leading to variation in expression.
Slide 7: Examples of Extrachromosomal Inheritance
Four O’clock Plant (Mirabilis jalapa): Shows variegated leaves due to different cpDNA in leaf cells.
Petite Mutants in Yeast: Result from mutations in mitochondrial DNA affecting respiration.
Slide 8: Importance of Extrachromosomal Inheritance
Evolution: Provides insight into the evolution of eukaryotic cells.
Medicine: Understanding mitochondrial inheritance helps in diagnosing and treating mitochondrial diseases.
Agriculture: Chloroplast inheritance can be used in plant breeding and genetic modification.
Slide 9: Recent Research and Advances
Gene Editing: Techniques like CRISPR-Cas9 are being used to edit mitochondrial and chloroplast DNA.
Therapies: Development of mitochondrial replacement therapy (MRT) for preventing mitochondrial diseases.
Slide 10: Conclusion
Summary: Extrachromosomal inheritance involves the transmission of genetic material outside the nucleus and plays a crucial role in genetics, medicine, and biotechnology.
Future Directions: Continued research and technological advancements hold promise for new treatments and applications.
Slide 11: Questions and Discussion
Invite Audience: Open the floor for any questions or further discussion on the topic.
Professional air quality monitoring systems provide immediate, on-site data for analysis, compliance, and decision-making.
Monitor common gases, weather parameters, particulates.
Seminar of U.V. Spectroscopy by SAMIR PANDASAMIR PANDA
Spectroscopy is a branch of science dealing the study of interaction of electromagnetic radiation with matter.
Ultraviolet-visible spectroscopy refers to absorption spectroscopy or reflect spectroscopy in the UV-VIS spectral region.
Ultraviolet-visible spectroscopy is an analytical method that can measure the amount of light received by the analyte.
Richard's aventures in two entangled wonderlandsRichard Gill
Since the loophole-free Bell experiments of 2020 and the Nobel prizes in physics of 2022, critics of Bell's work have retreated to the fortress of super-determinism. Now, super-determinism is a derogatory word - it just means "determinism". Palmer, Hance and Hossenfelder argue that quantum mechanics and determinism are not incompatible, using a sophisticated mathematical construction based on a subtle thinning of allowed states and measurements in quantum mechanics, such that what is left appears to make Bell's argument fail, without altering the empirical predictions of quantum mechanics. I think however that it is a smoke screen, and the slogan "lost in math" comes to my mind. I will discuss some other recent disproofs of Bell's theorem using the language of causality based on causal graphs. Causal thinking is also central to law and justice. I will mention surprising connections to my work on serial killer nurse cases, in particular the Dutch case of Lucia de Berk and the current UK case of Lucy Letby.
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...University of Maribor
Slides from:
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Track: Artificial Intelligence
https://www.etran.rs/2024/en/home-english/
Nutraceutical market, scope and growth: Herbal drug technologyLokesh Patil
As consumer awareness of health and wellness rises, the nutraceutical market—which includes goods like functional meals, drinks, and dietary supplements that provide health advantages beyond basic nutrition—is growing significantly. As healthcare expenses rise, the population ages, and people want natural and preventative health solutions more and more, this industry is increasing quickly. Further driving market expansion are product formulation innovations and the use of cutting-edge technology for customized nutrition. With its worldwide reach, the nutraceutical industry is expected to keep growing and provide significant chances for research and investment in a number of categories, including vitamins, minerals, probiotics, and herbal supplements.
Cancer cell metabolism: special Reference to Lactate PathwayAADYARAJPANDEY1
Normal Cell Metabolism:
Cellular respiration describes the series of steps that cells use to break down sugar and other chemicals to get the energy we need to function.
Energy is stored in the bonds of glucose and when glucose is broken down, much of that energy is released.
Cell utilize energy in the form of ATP.
The first step of respiration is called glycolysis. In a series of steps, glycolysis breaks glucose into two smaller molecules - a chemical called pyruvate. A small amount of ATP is formed during this process.
Most healthy cells continue the breakdown in a second process, called the Kreb's cycle. The Kreb's cycle allows cells to “burn” the pyruvates made in glycolysis to get more ATP.
The last step in the breakdown of glucose is called oxidative phosphorylation (Ox-Phos).
It takes place in specialized cell structures called mitochondria. This process produces a large amount of ATP. Importantly, cells need oxygen to complete oxidative phosphorylation.
If a cell completes only glycolysis, only 2 molecules of ATP are made per glucose. However, if the cell completes the entire respiration process (glycolysis - Kreb's - oxidative phosphorylation), about 36 molecules of ATP are created, giving it much more energy to use.
IN CANCER CELL:
Unlike healthy cells that "burn" the entire molecule of sugar to capture a large amount of energy as ATP, cancer cells are wasteful.
Cancer cells only partially break down sugar molecules. They overuse the first step of respiration, glycolysis. They frequently do not complete the second step, oxidative phosphorylation.
This results in only 2 molecules of ATP per each glucose molecule instead of the 36 or so ATPs healthy cells gain. As a result, cancer cells need to use a lot more sugar molecules to get enough energy to survive.
Unlike healthy cells that "burn" the entire molecule of sugar to capture a large amount of energy as ATP, cancer cells are wasteful.
Cancer cells only partially break down sugar molecules. They overuse the first step of respiration, glycolysis. They frequently do not complete the second step, oxidative phosphorylation.
This results in only 2 molecules of ATP per each glucose molecule instead of the 36 or so ATPs healthy cells gain. As a result, cancer cells need to use a lot more sugar molecules to get enough energy to survive.
introduction to WARBERG PHENOMENA:
WARBURG EFFECT Usually, cancer cells are highly glycolytic (glucose addiction) and take up more glucose than do normal cells from outside.
Otto Heinrich Warburg (; 8 October 1883 – 1 August 1970) In 1931 was awarded the Nobel Prize in Physiology for his "discovery of the nature and mode of action of the respiratory enzyme.
WARNBURG EFFECT : cancer cells under aerobic (well-oxygenated) conditions to metabolize glucose to lactate (aerobic glycolysis) is known as the Warburg effect. Warburg made the observation that tumor slices consume glucose and secrete lactate at a higher rate than normal tissues.
A brief information about the SCOP protein database used in bioinformatics.
The Structural Classification of Proteins (SCOP) database is a comprehensive and authoritative resource for the structural and evolutionary relationships of proteins. It provides a detailed and curated classification of protein structures, grouping them into families, superfamilies, and folds based on their structural and sequence similarities.
Exploring Practices in Machine Learning and Machine Discovery for Heterogeneous Catalysis
1. https://itakigawa.github.io/
Exploring practices in
machine learning and machine discovery
for heterogeneous catalysis
Ichi Takigawa
Institute for Liberal Arts and Sciences, Kyoto University
Institute for Chemical Reaction Design and Discovery, Hokkaido University
RIKEN Center for Advanced Intelligence Project
2. Share a viewpoint from the ML side (as I am an ML researcher, not a chemist)
after >7 years struggling in heterogeneous catalyst design and discovery
With great people in chemistry!
Prof. Ken-ichi
SHIMIZU
Prof. Takashi
TOYAO
Prof. Satoru Takakusagi
Prof. Zen Maeno
Prof. Takashi Kamachi
Keisuke Suzuki
Shoma Kikuchi
Shinya Mine
Takumi Mukaiyama
Motoshi Takao
Yuan Jing
Gang Wang
Duotian Chen
Kah Wei Ting
Taichi Yamaguchi
Koichi Matsushita
S.M.A.H. Siddiki
Prof. Koji Tsuda (U Tokyo)
This talk
3. Gas-phase reactions on solid-phase catalyst surface (Heterogeneous catalysis)
Industrial Synthesis (e.g. Haber-Bosch), Automobile Exhaust Gas Purification, Methane Conversion, etc.
https://en.wikipedia.org/wiki/Heterogeneous_catalysis
Reactants
(Gas)
Catalysts
(Solid)
Nano-particle
surface
High Temperature, High Pressure
Adsorption
Diffusion
Dissociation
Recombination
Desorption
Heterogeneous catalysis
4. Gas-phase reactions on solid-phase catalyst surface (Heterogeneous catalysis)
Industrial Synthesis (e.g. Haber-Bosch), Automobile Exhaust Gas Purification, Methane Conversion, etc.
https://en.wikipedia.org/wiki/Heterogeneous_catalysis
Reactants
(Gas)
Catalysts
(Solid)
Nano-particle
surface
High Temperature, High Pressure
Adsorption
Diffusion
Dissociation
Recombination
Desorption
Involves devilishly complex too-many-factor processes.
A solid surface shares its border with the external world.
God made the bulk; the surface was invented by the devil —— Wolfgang Pauli
Heterogeneous catalysis
6. Accelerated discovery of multi-elemental reverse water-gas shift catalysts using extrapolative machine learning
approach. https://doi.org/10.26434/chemrxiv-2022-695rj
• Discovered more than 100 catalysts better
than the previously reported best catalyst.
Our recent research: Results
Our Target:
Pt(3)/X1-X2-X3-X4-X5/TiO2 RWGS Calalyst
7. Accelerated discovery of multi-elemental reverse water-gas shift catalysts using extrapolative machine learning
approach. https://doi.org/10.26434/chemrxiv-2022-695rj
• Discovered more than 100 catalysts better
than the previously reported best catalyst.
• 300 catalysts tested in total by 44 cycles of
ML prediction + experiment
Our recent research: Results
Our Target:
Pt(3)/X1-X2-X3-X4-X5/TiO2 RWGS Calalyst
8. Accelerated discovery of multi-elemental reverse water-gas shift catalysts using extrapolative machine learning
approach. https://doi.org/10.26434/chemrxiv-2022-695rj
• Discovered more than 100 catalysts better
than the previously reported best catalyst.
• 300 catalysts tested in total by 44 cycles of
ML prediction + experiment
• The optimal catalyst Pt(3)/Rb(1)-Ba(1)-
Mo(0.6)-Nb(0.2)/TiO2 was hardly predictable
by human experts
Our recent research: Results
Our Target:
Pt(3)/X1-X2-X3-X4-X5/TiO2 RWGS Calalyst
9. Accelerated discovery of multi-elemental reverse water-gas shift catalysts using extrapolative machine learning
approach. https://doi.org/10.26434/chemrxiv-2022-695rj
• Discovered more than 100 catalysts better
than the previously reported best catalyst.
• 300 catalysts tested in total by 44 cycles of
ML prediction + experiment
• The optimal catalyst Pt(3)/Rb(1)-Ba(1)-
Mo(0.6)-Nb(0.2)/TiO2 was hardly predictable
by human experts
• Notably, Nb was never used in training.
Our recent research: Results
Our Target:
Pt(3)/X1-X2-X3-X4-X5/TiO2 RWGS Calalyst
10. Decision tree ensembles (with UQ)
i.e. histogram on data-dependent partitions
• ExtraTrees regressor
• Gradient Boosted Trees regressor
+ Abstracted (coarse grained) featurization of
chemical compositions
Input representations by elemental features
e.g. “composition-based feature vector (CBFV)”
Pt(3)/Ba(2)-Mo(1)-Tm(1)-
Eu(0.5)-Dy(0.5)/TiO2
Pt(3)/Mo(1)-Ba(1)-Tb(1)-
Ho(1)-Cs(0.5)/TiO2
Pt(3)/Rb(1)-Ba(1)-
Mo(0.6)-Nb(0.2)/TiO2
Our recent research: Method
11. Decision tree ensembles (with UQ)
i.e. histogram on data-dependent partitions
• ExtraTrees regressor
• Gradient Boosted Trees regressor
+ Abstracted (coarse grained) featurization of
chemical compositions
Input representations by elemental features
e.g. “composition-based feature vector (CBFV)”
Pt(3)/Ba(2)-Mo(1)-Tm(1)-
Eu(0.5)-Dy(0.5)/TiO2
Pt(3)/Mo(1)-Ba(1)-Tb(1)-
Ho(1)-Cs(0.5)/TiO2
Pt(3)/Rb(1)-Ba(1)-
Mo(0.6)-Nb(0.2)/TiO2
Very Conservative Prediction
(Histogram)
Very Radical Representation
(Discard specific details)
Our recent research: Method
12. Decision tree ensembles (with UQ)
i.e. histogram on data-dependent partitions
• ExtraTrees regressor
• Gradient Boosted Trees regressor
+ Abstracted (coarse grained) featurization of
chemical compositions
Input representations by elemental features
e.g. “composition-based feature vector (CBFV)”
Pt(3)/Ba(2)-Mo(1)-Tm(1)-
Eu(0.5)-Dy(0.5)/TiO2
Pt(3)/Mo(1)-Ba(1)-Tb(1)-
Ho(1)-Cs(0.5)/TiO2
Pt(3)/Rb(1)-Ba(1)-
Mo(0.6)-Nb(0.2)/TiO2
This talk will hopefully explain why we go for such a standard method choice
(even though I’m a ML researcher doing research also in GNNs and Transformers)
Very Conservative Prediction
(Histogram)
Very Radical Representation
(Discard specific details)
Our recent research: Method
13. At first I had an optimistic image of the unfamiliar field of "Materials Informatics"...
(after I worked in machine learning for bioinformatics for 10 years)
Step 1 Step 2 Step 3
We give all possible types of
available data into ML
ML becomes smarter
than standard experts
ML suggests more and more
promising materials
My prologue: Materials informatics?
15. Three lessons learned as I experienced this illusion being shattered…
1. The goals of ML and ‘materials/chemical science’ are fundamentally different.
What we need here is not ML but a much harder problem of ‘machine discovery.’
Takeaways
16. Three lessons learned as I experienced this illusion being shattered…
1. The goals of ML and ‘materials/chemical science’ are fundamentally different.
What we need here is not ML but a much harder problem of ‘machine discovery.’
2. If we go for a hypothesis-free + off-the-shelf solution, exploration by decision tree
ensembles, combined with UQ and abstracted (coarse grained) feature
representations, will give a very strong baseline.
Takeaways
17. Three lessons learned as I experienced this illusion being shattered…
1. The goals of ML and ‘materials/chemical science’ are fundamentally different.
What we need here is not ML but a much harder problem of ‘machine discovery.’
2. If we go for a hypothesis-free + off-the-shelf solution, exploration by decision tree
ensembles, combined with UQ and abstracted (coarse grained) feature
representations, will give a very strong baseline.
3. If we want more than that, we can’t be hypothesis free. Any strategies to narrow
down the scope as well as domain expertise really matters.
Takeaways
18. Get weight (g) & height (cm)
Apple
Orange
Machine Learning converts data into predictions
19. 5 6.25 7.5 8.75 10
90
112.5
135
157.5
180
●
●
●
●
●
●
●
●
●
●
Get weight (g) & height (cm)
Weight (g)
Height (cm)
Apple
Orange
Apple
Orange
Machine Learning converts data into predictions