Friday, October 15th, 2021, Sapporo, Hokkaido, Japan.
Hokkaido University ICReDD - Faculty of Medicine Joint Symposium
https://www.icredd.hokudai.ac.jp/event/5993
ICReDD (Institute for Chemical Reaction Design and Discovery)
https://www.icredd.hokudai.ac.jp
Machine Learning for Molecules: Lessons and Challenges of Data-Centric Chemistry
Ichigaku Takigawa
Perspectives on Artificial Intelligence and Machine Learning in Materials Science
February 4-6, 2022.
https://joint.imi.kyushu-u.ac.jp/post-2698/
Hokkaido University (HU) - Seoul National University (SNU) Joint Symposium
2018 International Workshop on
New Frontiers in Convergence Science and Technology
Paper introduction: Grad-CAM: Visual explanations from deep networks via gradient-based loca...
Kazuki Adachi
Selvaraju, Ramprasaath R., et al. "Grad-cam: Visual explanations from deep networks via gradient-based localization." The IEEE International Conference on Computer Vision (ICCV), 2017, pp. 618-626
Alphafold2 - Protein Structural Bioinformatics After CASP14
Purdue University
This presentation discusses the AlphaFold2 algorithm in comparison with other methods, and the timeline of the impact and developments that followed the release of the paper and software. These slides were presented at IIBMP2021 (https://iibmp2021.hamadalab.com/program/), a virtual conference in Tokyo, Japan, on September 27-28, 2021.
Frontiers of data-driven property prediction: molecular machine learning
Ichigaku Takigawa
Innovation Camp 2018 for Computational Materials Science(ICCMS2018)
January 23rd(Tue.)-25th(Thu.), 2018
The Jozankei View Hotel, Sapporo, Hokkaido, Japan.
http://ccms.issp.u-tokyo.ac.jp/events/eventsfolder/ICCMS2018
In materials science, data-centric science is becoming one of the major approaches alongside theoretical, experimental, and computational science. The main purpose of this camp is to learn the basics of machine learning as data-centric science and to use it, through group work, to solve problems in our own research. There will also be lectures on advanced research in computational and data-centric science, with discussion of future perspectives. In addition, invited lecturers working at the forefront across the industry-government-academia framework will help us learn an innovation mindset.
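Innovation Camp 2018 for Computational Materials Science
In tackling problems in materials science, data science is increasingly used alongside theoretical, experimental, and computational science. The main aim of this camp is to learn machine learning methods as that data science, to acquire them through hands-on team exercises, and to apply them to problems in each participant's own research and projects. Invited lecturers will also give lectures on cutting-edge results in computational and data science and discuss prospects for future development. In addition, participants will learn an innovation mindset through lectures by, and exchanges of views with, people active across industry, government, and academia and across academic disciplines.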
MMTF-Spark: Interactive, Scalable, and Reproducible Datamining of 3D Macromo...
Peter Rose
Presented at the NIH/NCI Informatics Technology for Cancer Research (ITCR) 2018 meeting (https://itcr.cancer.gov/).
Advances in Structural Bioinformatics are driven by the fast growth in experimental 3D structures and integration with even larger sets of sequence and protein function data. At the same time, the field of Data Science has created new technologies for re-engineering legacy software pipelines to make them scalable, easy to use, reproducible, reusable, and sharable.
Here, we describe the MMTF-Spark/PySpark [1] project that combines three key components to create such an infrastructure: 1. Interactive Jupyter notebooks to run ad hoc analyses, data mining, machine learning, and visualization of 3D structure and sequence datasets, 2. A scalable compute infrastructure to run these analyses interactively across large datasets, e.g., the entire PDB, using previously developed efficient data representations [2, 3] and the Apache Spark framework for distributed parallel computing, 3. A library of methods for data mining and analysis of 3D structure and sequence data, capitalizing on the rich data analytics, visualization, and machine/deep learning tools available in the Python ecosystem.
Scientists face a number of complex and time-consuming barriers when applying structural bioinformatics analysis, including complex software setups, non-interoperable data formats and software applications, and a lack of documentation, simple examples, and tutorials. Given the large datasets, biologists routinely apply computational tools and automation pipelines in their research. However, there is a long tail of ad hoc, one-off questions that biologists ask that cannot be answered using available web resources or workflow systems that focus on common tasks. In this project, we provide a self-contained programming environment that caters to scientists with varying computational skills and needs, ranging from biologists with basic programming skills, to structural and computational biologists who want to share their work, to data scientists who seek access to bioinformatics datasets to benchmark new machine learning methods. A key advantage of this environment is interactivity, which enables iterative exploration. By combining documentation, data sets, analysis code, results, and interactive visualizations in Jupyter notebooks, the steps of an interactive session can be captured, reproduced, and shared.
Acknowledgements
This project was supported by the NCI of the NIH under award number U01 CA198942.
References
1. https://github.com/sbl-sdsc/mmtf-spark, https://github.com/sbl-sdsc/mmtf-pyspark
2. Bradley AR, et al. (2017) MMTF - an efficient file format for the transmission, visualization, and analysis of macromolecular structures. PLOS Computational Biology 13(6): e1005575.
3. Valasatava Y, et al. (2017) Towards an efficient compression of 3D coordinates of macromolecular structures. PLOS ONE 12(3): e0174846.
Demonstration of the applicability of the Linked Data Modeling Language and CHEMROF ( https://chemkg.github.io/chemrof/) for semantic chemical sciences. Presented at MADICES 2022. https://github.com/MADICES/MADICES-2022
Machine Learning for Chemistry: Representing and Intervening
Ichigaku Takigawa
Joint Symposium of Engineering & Information Science & WPI-ICReDD in Hokkaido University
Apr. 26 (Mon), 2021
https://www.icredd.hokudai.ac.jp/event/5430
Molecular modelling for in silico drug discovery
Lee Larcombe
A slide set based on the small molecule section of "Introduction to in silico drug discovery" with more detail on molecular modelling and simulation aspects. Including a bit more on protein structure prediction
Genomic Cytometry: Using Multi-Omic Approaches to Increase Dimensionality in ...
Robert (Rob) Salomon
"Genomic Cytometry: Using Multi-Omic Approaches to Increase Dimensionality in Cytometry" was an Invited Tutorial given at the 2019 CYTO conference of the International Society for the Advancement of Cytometry on 22 May 2019. This tutorial was recorded and we expect that it will be converted to a CYTOU webinar in the near future.
This tutorial will begin by explaining why the emerging field of Genomic Cytometry, i.e. the measurement of cells using genomic techniques (e.g. sequencing), in conjunction with more traditional cytometry techniques such as fluorescence, mass and imaging cytometry is becoming a standard tool for biologists looking to unravel complex cellular processes and to develop a deeper understanding of heterogeneity.
We will give a detailed overview of the various technologies that have allowed the emergence of Genomic Cytometry as well as those that continue to push the boundaries of cellular characterisation.
We will then provide a basic overview of the sequencing process so that both research cytometrists and cytometry SRL staff are better equipped to understand the downstream genomic component of Genomic Cytometry.
Finally, we will wrap up the session with case studies that illustrate the power of the genomic cytometry approach and will give a brief outline of where we feel the field needs to go as it matures. We expect attendees will gain a better understanding of 1) the rapidly maturing field of Genomic Cytometry and 2) how Genomic Cytometry should be leveraged into more traditional cytometry workflows.
How to Scale from Workstation through Cloud to HPC in Cryo-EM Processing
inside-BigData.com
In this video from the GPU Technology Conference, Lance Wilson from Monash University presents: How to Scale from Workstation through Cloud to HPC in Cryo-EM Processing.
"Learn how high-resolution imaging is revolutionizing science and dramatically changing how we process, analyze, and visualize at this new scale. We will show the journey a researcher can take to produce images capable of winning a Nobel prize. We'll review the last two years of development in single-particle cryo-electron microscopy processing, with a focus on accelerated software, and discuss benchmarks and best practices for common software packages in this domain. Our talk will include videos and images of atomic resolution molecules and viruses that demonstrate our success in high-resolution imaging."
Watch the video: https://wp.me/p3RLHQ-kcW
Learn more: https://www.monash.edu/researchinfrastructure/cryo-em
and
https://www.nvidia.com/en-us/gtc/home/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck from the 2014 HPC User Forum in Seattle, Jack Collins from the National Cancer Institute presents: Genomes to Structures to Function: The Role of HPC.
Watch the video presentation: http://wp.me/p3RLHQ-d28
Beyond Broken Stick Modeling: R Tutorial for interpretable multivariate analysis
PetteriTeikariPhD
“R Tutorial” for Interpretable multivariate analysis with t-SNE and Random Forests mainly for ophthalmic data modeling.
Bust through the fetish for indices and easy scalar human-readable interpretations of data.
Alternative download link:
https://www.dropbox.com/s/wyg5k0k35qxdcyx/beyond_brokenStick.pdf?dl=0
Link Mining for Kernel-based Compound-Protein Interaction Predictions Using a...
Masahito Ohue
Thirteenth International Conference on Intelligent Computing (ICIC2017)
R13: Protein and Gene Bioinformatics: Analysis, Algorithms and Applications, Aug 9, 2017.
Masahito Ohue, Takuro Yamazaki, Tomohiro Ban, Yutaka Akiyama.
In Proceedings of the Thirteenth International Conference on Intelligent Computing (ICIC2017) (Lecture Notes in Computer Science), 10362, 549-558, Liverpool, UK, August 7-10, 2017.
https://link.springer.com/chapter/10.1007/978-3-319-63312-1_48
Open Source Pharma: Crowd computing: A new approach to predictive modeling
Open Source Pharma
Presentation about "Predictive in silico models," given by Joerg Bentzien at the Open Source Pharma Conference. The event took place at Rockefeller Foundation Bellagio Center in July 2014.
Joerg Bentzien Bio:
http://www.opensourcepharma.net/participants/jorg-bentzien
Conference Agenda (see Day 1, Session 2):
http://www.opensourcepharma.net/agenda.html
Exploring Practices in Machine Learning and Machine Discovery for Heterogeneo...
Ichigaku Takigawa
Video https://youtu.be/P4QogT8bdqY
ACS Spring 2023 Symposium on AI-Accelerated Scientific Workflow
https://acs.digitellinc.com/acs/sessions/526630/view
ACS SPRING 2023: Crossroads of Chemistry
Indianapolis, IN & Hybrid, March 26-30
https://www.acs.org/meetings/acs-meetings/spring-2023.html
Slide PDF
https://itakigawa.page.link/acs2023spring
Our Paper
Accelerated discovery of multi-elemental reverse water-gas shift catalysts using extrapolative machine learning approach (2022, ChemRxiv)
https://doi.org/10.26434/chemrxiv-2022-695rj
Ichi Takigawa
https://itakigawa.github.io/
Machine Learning for Molecular Graph Representations and Geometries
Ichigaku Takigawa
Dec 1, 2021, Pacifico Yokohama, Japan.
Symposium 1AS-17 "Data science and machine learning: Tackling the Noise and Heterogeneity of the Real World"
The 44th Annual Meeting of the Molecular Biology Society of Japan
https://www2.aeplan.co.jp/mbsj2021/english/designation/index.html
A machine-learning view on heterogeneous catalyst design and discovery
Ichigaku Takigawa
Telluride Workshop on Computational Materials Chemistry, Telluride, Colorado, USA, July 1, 2021.
https://research.chem.ucr.edu/groups/jiang/Telluride_Workshop.html
https://www.telluridescience.org/meetings/workshop-details?wid=901
https://www.telluridescience.org/meetings/workshop-details?wid=945
Brief information about the SCOP protein database used in bioinformatics.
The Structural Classification of Proteins (SCOP) database is a comprehensive and authoritative resource for the structural and evolutionary relationships of proteins. It provides a detailed and curated classification of protein structures, grouping them into families, superfamilies, and folds based on their structural and sequence similarities.
Slide 1: Title Slide
Extrachromosomal Inheritance
Slide 2: Introduction to Extrachromosomal Inheritance
Definition: Extrachromosomal inheritance refers to the transmission of genetic material that is not found within the nucleus.
Key Components: Involves genes located in mitochondria, chloroplasts, and plasmids.
Slide 3: Mitochondrial Inheritance
Mitochondria: Organelles responsible for energy production.
Mitochondrial DNA (mtDNA): Circular DNA molecule found in mitochondria.
Inheritance Pattern: Maternally inherited, meaning it is passed from mothers to all their offspring.
Diseases: Examples include Leber’s hereditary optic neuropathy (LHON) and mitochondrial myopathy.
Slide 4: Chloroplast Inheritance
Chloroplasts: Organelles responsible for photosynthesis in plants.
Chloroplast DNA (cpDNA): Circular DNA molecule found in chloroplasts.
Inheritance Pattern: Often maternally inherited in most plants, but can vary in some species.
Examples: Variegation in plants, where leaf color patterns are determined by chloroplast DNA.
Slide 5: Plasmid Inheritance
Plasmids: Small, circular DNA molecules found in bacteria and some eukaryotes.
Features: Can carry antibiotic resistance genes and can be transferred between cells through processes like conjugation.
Significance: Important in biotechnology for gene cloning and genetic engineering.
Slide 6: Mechanisms of Extrachromosomal Inheritance
Non-Mendelian Patterns: Do not follow Mendel’s laws of inheritance.
Cytoplasmic Segregation: During cell division, organelles like mitochondria and chloroplasts are randomly distributed to daughter cells.
Heteroplasmy: Presence of more than one type of organellar genome within a cell, leading to variation in expression.
Slide 7: Examples of Extrachromosomal Inheritance
Four O’clock Plant (Mirabilis jalapa): Shows variegated leaves due to different cpDNA in leaf cells.
Petite Mutants in Yeast: Result from mutations in mitochondrial DNA affecting respiration.
Slide 8: Importance of Extrachromosomal Inheritance
Evolution: Provides insight into the evolution of eukaryotic cells.
Medicine: Understanding mitochondrial inheritance helps in diagnosing and treating mitochondrial diseases.
Agriculture: Chloroplast inheritance can be used in plant breeding and genetic modification.
Slide 9: Recent Research and Advances
Gene Editing: Techniques like CRISPR-Cas9 are being used to edit mitochondrial and chloroplast DNA.
Therapies: Development of mitochondrial replacement therapy (MRT) for preventing mitochondrial diseases.
Slide 10: Conclusion
Summary: Extrachromosomal inheritance involves the transmission of genetic material outside the nucleus and plays a crucial role in genetics, medicine, and biotechnology.
Future Directions: Continued research and technological advancements hold promise for new treatments and applications.
Slide 11: Questions and Discussion
Invite Audience: Open the floor for any questions or further discussion on the topic.
Introduction:
RNA interference (RNAi) or Post-Transcriptional Gene Silencing (PTGS) is an important biological process for modulating eukaryotic gene expression.
It is a highly conserved process of post-transcriptional gene silencing in which double-stranded RNA (dsRNA) causes sequence-specific degradation of mRNA sequences.
dsRNA-induced gene silencing (RNAi) has been reported in a wide range of eukaryotes, including worms, insects, mammals, and plants.
This process mediates resistance to both endogenous parasitic and exogenous pathogenic nucleic acids, and regulates the expression of protein-coding genes.
What are small ncRNAs?
micro RNA (miRNA)
short interfering RNA (siRNA)
Properties of small non-coding RNA:
Involved in silencing mRNA transcripts.
Called “small” because they are usually only about 21-24 nucleotides long.
Synthesized by first cutting up longer precursor sequences (like the 61nt one that Lee discovered).
Silence an mRNA by base pairing with some sequence on the mRNA.
Discovery of siRNA?
The first small RNA:
In 1993 Rosalind Lee (Victor Ambros lab) was studying a non-coding gene in C. elegans, lin-4, that was involved in silencing another gene, lin-14, at the appropriate time in the development of the worm.
Two small transcripts of lin-4 (22nt and 61nt) were found to be complementary to a sequence in the 3' UTR of lin-14.
Because lin-4 encoded no protein, she deduced that it must be these transcripts that are causing the silencing by RNA-RNA interactions.
Types of RNAi (non-coding RNA)
miRNA
Length: 23-25 nt
Trans-acting
Binds to target mRNA with mismatches
Translation inhibition
siRNA
Length: 21 nt
Cis-acting
Binds to target mRNA with perfect sequence complementarity
Piwi-RNA (piRNA)
Length: 25 to 36 nt
Expressed in germ cells
Regulates transposon activity
MECHANISM OF RNAI:
First the double-stranded RNA teams up with a protein complex named Dicer, which cuts the long RNA into short pieces.
Then another protein complex called RISC (RNA-induced silencing complex) discards one of the two RNA strands.
The RISC-docked, single-stranded RNA then pairs with the homologous mRNA and destroys it.
THE RISC COMPLEX:
RISC is a large (>500 kDa) RNA-multiprotein binding complex that triggers mRNA degradation in response to mRNA.
Unwinding of double-stranded siRNA by an ATP-independent helicase.
The active components of RISC are Ago proteins (endonucleases), which cleave the target mRNA.
DICER: endonuclease (RNase III family)
Argonaute: Central Component of the RNA-Induced Silencing Complex (RISC)
One strand of the dsRNA produced by Dicer is retained in the RISC complex in association with Argonaute
ARGONAUTE PROTEIN:
1. PAZ (PIWI/Argonaute/Zwille): recognition of target mRNA
2. PIWI (P-element induced wimpy testis): breaks the phosphodiester bond of mRNA (RNase H activity).
miRNA:
Double-stranded RNAs are naturally produced in eukaryotic cells during development, and they have a key role in regulating gene expression.
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Ana Luísa Pinho
Functional Magnetic Resonance Imaging (fMRI) provides means to characterize brain activations in response to behavior. However, cognitive neuroscience has been limited to group-level effects referring to the performance of specific tasks. To obtain the functional profile of elementary cognitive mechanisms, the combination of brain responses to many tasks is required. Yet, to date, both structural atlases and parcellation-based activations do not fully account for cognitive function and still present several limitations. Further, they do not adapt overall to individual characteristics. In this talk, I will give an account of deep-behavioral phenotyping strategies, namely data-driven methods in large task-fMRI datasets, to optimize functional brain-data collection and improve inference of effects-of-interest related to mental processes. Key to this approach is the employment of fast multi-functional paradigms rich on features that can be well parametrized and, consequently, facilitate the creation of psycho-physiological constructs to be modelled with imaging data. Particular emphasis will be given to music stimuli when studying high-order cognitive mechanisms, due to their ecological nature and quality to enable complex behavior compounded by discrete entities. I will also discuss how deep-behavioral phenotyping and individualized models applied to neuroimaging data can better account for the subject-specific organization of domain-general cognitive systems in the human brain. Finally, the accumulation of functional brain signatures brings the possibility to clarify relationships among tasks and create a univocal link between brain systems and mental functions through: (1) the development of ontologies proposing an organization of cognitive processes; and (2) brain-network taxonomies describing functional specialization. 
To this end, tools to improve commensurability in cognitive science are necessary, such as public repositories, ontology-based platforms and automated meta-analysis tools. I will thus discuss some brain-atlasing resources currently under development, and their applicability in cognitive as well as clinical neuroscience.
1. Machine Learning for Molecules
Ichigaku Takigawa
takigawa@icredd.hokudai.ac.jp
15 October 2021 @ Hokkaido University
Hokkaido Univ ICReDD-Faculty of Medicine Joint Symposium
2. / 23
2
RIKEN Center for AI Project @ Kyoto
Medical-risk Avoidance based on iPS Cells Team
(A joint lab with Kyoto Univ CiRA)
Inst. Chemical Reaction Design & Discovery
Hokkaido Univ
A machine learning (ML) researcher working for
ML for Stem Cell Biology ML for Chemistry
3. / 23
Interests: Machine Learning and Machine Discovery
An intersection of ML with combinatorial structures + ML for natural sciences
4. / 23
Joint project with HU Med Dept:
• Prof. Shinya Tanaka: Cancer diagnosis with fluorescent markers, Enzyme-catalyzed reaction design
• Prof. Shigetsugu Hatakeyama: Mediator complex of transcription regulations (Nat Commun 2020, 2015)
• Prof. Ichiro Yabe: Video-based predictions of motor symptom severity
• Prof. Yasuyuki Fujita: Cell competition (Cell Reports 2018; Sci Rep 2015)
• Prof. Hidenao Sasaki: Copy number variations for neurodegenerative diseases (Mol Brain 2017)
5. / 23
3
X-informatics: Bio- and Chemo-informatics
• Biochemical reaction networks (Metabolic pathways)
Bioinformatics 2007, 2008a, 2008b, 2009, 2010
Nucleic Acids Res 2011, PLoS One 2012, 2013, KDD’07
• Drug-target interactions (Polypharmacology)
PLoS One 2011, Drug Discov Today 2013, Brief Bioinform 2014
• Modulatory proteolysis
Mol Cell Proteom 2016, Genome Informatics 2009
(We also developed a database http://calpain.org)
• Genomic repeats
Discrete Appl Math 2013, 2016, AAAI 2020
• Genetic variations in cancer cells
Brief Bioinform 2014
• Mediator complex and transcription regulation
Nat Commun 2015, 2020 (w/ Prof. Hatakeyama)
• Genomic copy number variations for neurodegenerative diseases
Mol Brain 2017 (w/ Prof. Sasaki)
• Cell competitions and cancer cells
Cell Reports 2018, Sci Rep 2015 (w/ Prof. Fujita)
In addition to pure ML research, I’ve worked for bio/chemo-informatics
6. / 23
4
Molecules have a combinatorial aspect
https://cen.acs.org/physical-chemistry/computational-chemistry/Exploring-chemical-space-AI-take/98/i13
7. / 23
5
simulation
ML: A new way for (lazy) programming
computer program
input output
full specification in
every detail required
deductive (rationalism)
8. / 23
input output
computer program
ML
give up explicit model
instead, grab a tunable
model, and show it many
input-output instances
inductive (empiricism)
airhead with a god-like
learning capability
Random Forest, Neural Networks, SVR, Kernel Ridge
All about fitting a very flexible function to finite points in high-dimensional space.
deductive (rationalism)
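As a rough illustration of "fitting a very flexible function to finite points", the sketch below fits a kernel ridge model (one of the model families named on this slide) to a handful of samples of a sine curve. The Gaussian kernel, the values of `gamma` and `lam`, and the sine target are all assumptions chosen for the example, not taken from the talk.

```python
import math

def rbf(a, b, gamma=10.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-gamma * (a - b) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_kernel_ridge(xs, ys, lam=1e-4, gamma=10.0):
    """Return a predictor f(x) = sum_i alpha_i k(x, x_i), with (K + lam I) alpha = y."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j], gamma) + (lam if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    alpha = solve(K, ys)

    def predict(x):
        return sum(a * rbf(x, xi, gamma) for a, xi in zip(alpha, xs))

    return predict

# Eleven input-output instances of an "unknown" function (here a sine, for illustration).
xs = [i / 10 for i in range(11)]
ys = [math.sin(2 * math.pi * x) for x in xs]
predict = fit_kernel_ridge(xs, ys)
train_error = max(abs(predict(x) - y) for x, y in zip(xs, ys))
```

The model is never told that the target is a sine; it only sees the eleven points, which is the inductive side of the contrast drawn on the slide.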
9. / 23
6
ML: A new way for (lazy) programming
[Figure: data points with inputs x1, x2 and output y; surface model with parameters p1, p2, p3, p4, p5]
All about statistical and algorithmic techniques for
surface-model fitting to data points by adjusting model parameters.
[Figure: ML maps inputs x1 (Variable 1) and x2 (Variable 2) to output y]
ML = Tweak these parameter values to
best fit a surface model to the given points.
5 params
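As a minimal sketch of this parameter tweaking, the following fits a surface with five parameters to sample points by gradient descent. The quadratic form of `surface`, the learning rate, and the sample points are all assumptions made for illustration; the slide does not specify which model is behind its 5 parameters.

```python
import itertools

def surface(w, x1, x2):
    """Hypothetical 5-parameter surface model (an assumption for illustration)."""
    return w[0] + w[1] * x1 + w[2] * x2 + w[3] * x1 ** 2 + w[4] * x2 ** 2

def fit(points, lr=0.3, steps=2000):
    """Tweak the 5 parameters by gradient descent on the mean squared error."""
    w = [0.0] * 5
    n = len(points)
    for _ in range(steps):
        grad = [0.0] * 5
        for x1, x2, y in points:
            err = surface(w, x1, x2) - y
            feats = (1.0, x1, x2, x1 ** 2, x2 ** 2)
            for k in range(5):
                grad[k] += 2.0 * err * feats[k] / n
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w

# Synthetic "given points": sampled from a known surface on a 5 x 5 grid.
true_w = [1.0, -2.0, 0.5, 3.0, -1.0]
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
points = [(a, b, surface(true_w, a, b)) for a, b in itertools.product(grid, grid)]

fitted_w = fit(points)
print([round(wk, 3) for wk in fitted_w])
```

On noiseless points like these, the fitted parameters should come back very close to `true_w`; with noisy data they would only approximate it.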
10. / 23
7
A modern aspect of ML
ResNet50: 26 million params
ResNet101: 45 million params
EfficientNet-B7: 66 million params
VGG19: 144 million params
12-layer, 12-head BERT: 110 million params
24-layer, 16-head BERT: 336 million params
GPT-2 XL: 1558 million params
GPT-3: 175 billion params
Modern ML: Can we imagine what would happen if we try to fit a function with 175 billion parameters to 100 million data points in 10 thousand dimensions?
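To put that question in numbers, here is a quick back-of-the-envelope calculation. It is only a sketch using the three figures quoted in the question; the variable names are hypothetical.

```python
n_params = 175 * 10**9   # parameters of the function being fitted
n_points = 100 * 10**6   # data points
n_dims = 10 * 10**3      # dimensions per data point

params_per_point = n_params / n_points          # parameters per data point
scalars_in_data = n_points * n_dims             # scalar values in the whole data set
params_per_scalar = n_params / scalars_in_data  # parameters per observed scalar

print(params_per_point, scalars_in_data, params_per_scalar)
```

So there are far more parameters than data points, yet fewer parameters than individual scalar values in the data.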
simulation: input → computer program → output
• full specification in every detail required
• deductive (rationalism)
ML: input → computer program → output
• give up explicit model; instead, grab a tunable model, and show it many input-output instances
• inductive (empiricism)
• airhead with a god-like learning capability
ML is all about fitting a very flexible function to finitely many points in a high-dimensional space.
11. / 23
8
This Talk: ML for Molecular Graphs
Molecules, Materials, Reactions → Graphs (of different size) → Graph Neural Networks (GNNs) → Latent variables → Classifier or Regressor → Diverse Downstream Tasks
• Graphs: node features and edge features
• GNNs: representation learning
• Example inputs: CC1CCNO1; NCc1ccoc1.S=(Cl)Cl>>[RX_5]S=C=NCc1ccoc1; …
Combinatorial aspects
• Modular Hierarchy: Amide, Proline, Oxazoline
• Compositionality (Substituents): Phenyl, Carboxyl, Methyl, Ethyl, Tert-butyl, Isopropyl, Trifluoromethyl, Benzyl
• Graph Coarsening
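As an illustration of the molecule-to-graph step, the SMILES string CC1CCNO1 from this slide can be written out by hand as a node list and an edge list. This is a hand-derived sketch covering heavy atoms only; the variable and function names are hypothetical, and a real pipeline would use a cheminformatics toolkit rather than manual parsing.

```python
smiles = "CC1CCNO1"

# Heavy atoms in the order they appear in the SMILES string.
atoms = ["C", "C", "C", "C", "N", "O"]

# Bonds as index pairs; the last one closes ring "1" (between O and the second C).
bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]

def adjacency(n_atoms, bonds):
    """Build an undirected adjacency list from the bond list."""
    adj = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        adj[a].append(b)
        adj[b].append(a)
    return adj

adj = adjacency(len(atoms), bonds)
degrees = [len(adj[i]) for i in range(len(atoms))]

for i, symbol in enumerate(atoms):
    print(i, symbol, "neighbors:", adj[i])
```

Different molecules give graphs with different numbers of nodes and edges, which is why the downstream model has to accept graphs of different size.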
12. / 23
9
Representation Learning
Use some inductive biases to constrain/regularize the model space.
Input variables (x1, x2, x3, …) → Function model (Classifier or Regressor) → Prediction
19. / 23
10
Representation Learning
Use some inductive biases to constrain/regularize the model space.
Input variables (x1, x2, x3, …) → Learnable variable transformation (Representation learning) → Latent variables → Function model (Classifier or Regressor: Linear) → Prediction
Simple model is enough when we have good features.
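A toy sketch of why a simple model is enough once the features are good: a linear classifier cannot solve XOR on the raw inputs, but it can after a variable transformation. Here the transformation (appending the product x1*x2) is fixed by hand as an assumption; in representation learning it would be learned from data. The function names are hypothetical.

```python
def train_perceptron(X, y, epochs=1000):
    """Fit a linear classifier sign(w.x + b) with the classic perceptron rule."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            score = sum(wk * xk for wk, xk in zip(w, xi)) + b
            if yi * score <= 0:  # misclassified or on the boundary: nudge the plane
                w = [wk + yi * xk for wk, xk in zip(w, xi)]
                b += yi
                mistakes += 1
        if mistakes == 0:  # every point is on the correct side: stop early
            break
    return w, b

def accuracy(X, y, w, b):
    """Fraction of points whose predicted sign matches the label."""
    correct = 0
    for xi, yi in zip(X, y):
        score = sum(wk * xk for wk, xk in zip(w, xi)) + b
        correct += (1 if score > 0 else -1) == yi
    return correct / len(y)

raw = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, 1, 1, -1]  # XOR, encoded as -1 / +1

# Hand-picked "latent variables": the raw inputs plus their product.
latent = [(x1, x2, x1 * x2) for x1, x2 in raw]

acc_raw = accuracy(raw, labels, *train_perceptron(raw, labels))
acc_latent = accuracy(latent, labels, *train_perceptron(latent, labels))
print("raw inputs:", acc_raw, "latent variables:", acc_latent)
```

The same linear model fails on the raw inputs and separates the classes perfectly on the transformed ones.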
20. / 23
11
Use Case 1: Virtual Screening (QSAR/QSPR)
• Mutagenic potency
• Carcinogenic potency
• Endocrine disruption
• Growth inhibition
• Aqueous solubility
[Figure: structural formulas of example compounds]
22. / 23
13
Use Case 1: Virtual Screening (QSAR/QSPR)
input (CID 11978790) → ML → output (activity: “Active”, LogGI50: -7.8811)
GI50: concentration required for 50% inhibition of growth
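To read the regression target, the logarithm can be converted back to a concentration. This sketch assumes that LogGI50 is the base-10 logarithm of the GI50 concentration in mol/L; the slide does not state the unit, so treat that as an assumption.

```python
log_gi50 = -7.8811  # value shown on the slide

# Assumption: log10 of a molar concentration (mol/L).
gi50_molar = 10 ** log_gi50
gi50_nanomolar = gi50_molar * 1e9

print(f"GI50 = {gi50_molar:.3e} mol/L = {gi50_nanomolar:.1f} nM")
```

Under that assumption the compound inhibits growth by 50% at a concentration in the low nanomolar range.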
25. / 23
14
Input representation (molecular graph)
Use Case 1: Virtual Screening (QSAR/QSPR)
[Figure: molecular graph of CID 204 with explicit Hs; atoms numbered 1 to 17]
❗ Any permutation of this numbering should not change the results.
1. permutation equivariance
2. permutation invariance
atoms → nodes
bonds → edges
Molecular Graph: atom features, bond features, topology
node(atom) features: 133 features per atom (17 × 133 for this molecule)
• atomic_num (one-hot, 101)
• total_degree (one-hot, 7)
• formal_charge (one-hot, 6)
• chiral_tag (one-hot, 5)
• num_Hs (one-hot, 6)
• hybridization (one-hot, 6)
• is_aromatic (binary, 1)
• atomic_mass (real, 1)
edge(bond) features: 14 features per atom pair (17 × 17 × 14 for this molecule)
• no_bond (binary, 1)
• is_single (binary, 1)
• is_double (binary, 1)
• is_triple (binary, 1)
• is_aromatic (binary, 1)
• is_conjugated (binary, 1)
• is_in_ring (binary, 1)
• stereo (one-hot, 7)
e.g. Features for ChemProp (Yang et al, 2019)
read out → graph-level output
• sum, mean or max
• attentive pooling
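The feature widths listed on this slide can be checked, and a feature vector assembled, with a few lines of code. This is a sketch, not ChemProp's own implementation: the `one_hot` and `encode` helpers, the catch-all last slot, and the category indices used for the example atom and bond are all hypothetical.

```python
ATOM_FEATURE_SIZES = {
    "atomic_num": 101, "total_degree": 7, "formal_charge": 6, "chiral_tag": 5,
    "num_Hs": 6, "hybridization": 6, "is_aromatic": 1, "atomic_mass": 1,
}
BOND_FEATURE_SIZES = {
    "no_bond": 1, "is_single": 1, "is_double": 1, "is_triple": 1,
    "is_aromatic": 1, "is_conjugated": 1, "is_in_ring": 1, "stereo": 7,
}

def one_hot(index, size):
    """One-hot list of length `size`; out-of-range indices fall into the last slot
    (an assumption made for this sketch)."""
    vec = [0.0] * size
    vec[index if 0 <= index < size - 1 else size - 1] = 1.0
    return vec

def encode(values, sizes):
    """Concatenate one-hot blocks (size > 1) and scalar features (size == 1)."""
    vec = []
    for name, size in sizes.items():
        if size == 1:
            vec.append(float(values[name]))
        else:
            vec.extend(one_hot(values[name], size))
    return vec

# Placeholder category indices for one atom and one bond (not real chemistry output).
example_atom = {"atomic_num": 5, "total_degree": 4, "formal_charge": 0, "chiral_tag": 0,
                "num_Hs": 3, "hybridization": 0, "is_aromatic": 0, "atomic_mass": 12.011}
example_bond = {"no_bond": 0, "is_single": 1, "is_double": 0, "is_triple": 0,
                "is_aromatic": 0, "is_conjugated": 0, "is_in_ring": 0, "stereo": 0}

atom_vec = encode(example_atom, ATOM_FEATURE_SIZES)
bond_vec = encode(example_bond, BOND_FEATURE_SIZES)
print(len(atom_vec), len(bond_vec))
```

The two lengths match the 133 node features and 14 edge features quoted above.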
31. / 23
15
Graph Neural Networks (GNNs)
[Figure: a molecular graph (atoms N, O, C, H) before and after a GNN layer; the GNN layer updates the features]
atom features: $h_i$
bond features: $e_{ij}$
Update by “Message Passing”
$$h_i \leftarrow \mathrm{Update}\Big(h_i,\ \bigoplus_{j \in N_i} \mathrm{Message}(h_i, h_j, e_{ij})\Big)$$
Message: permutation equivariant operations
• nn.Linear
Bond features can be used (typically in the message function).
Aggregate ($\bigoplus$): permutation invariant operations
• sum, mean or max
• attentive pooling
32. / 23
15
Graph Neural Networks (GNNs)
[Figure: a molecular graph (atoms N, O, C, H) before and after a GNN layer]
GNN Layer: GNN updates features
Update by “Message Passing”:
$$h_i \leftarrow \phi\Big(h_i, \bigoplus_{j \in N_i} \psi(h_i, h_j, e_{ij})\Big)$$
$h_i$: atom features
$e_{ij}$: bond features
Update $\phi$: any
• nn.Linear
Aggregate $\bigoplus$: permutation-invariant operations
• sum, mean or max
• attentive pooling
Message $\psi$: permutation-equivariant operations
• nn.Linear
Bond features can be used (typically in $\psi$)
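The “Message Passing” update on this slide (compute messages from neighbors, aggregate them with a permutation-invariant operation, then update the atom feature) can be sketched in plain Python. This is only an illustrative sketch: the function names `message`, `aggregate`, `update`, and `gnn_layer` are hypothetical, and the fixed arithmetic inside them stands in for the learned layers (such as nn.Linear) that a real GNN would use.

```python
# Minimal message-passing sketch in plain Python (no deep-learning library).
# All function names and the toy arithmetic are illustrative assumptions.

def message(h_i, h_j, e_ij):
    # Message function: here simply the neighbor feature scaled by the bond feature.
    return [e_ij * v for v in h_j]

def aggregate(messages, dim):
    # Permutation-invariant aggregation: element-wise sum over all neighbors.
    total = [0.0] * dim
    for m in messages:
        total = [t + v for t, v in zip(total, m)]
    return total

def update(h_i, agg):
    # Update function: here the average of the old feature and the aggregate.
    return [0.5 * (a + b) for a, b in zip(h_i, agg)]

def gnn_layer(h, edges):
    # h: {atom index: feature list}; edges: {(i, j): bond feature}, undirected.
    dim = len(next(iter(h.values())))
    new_h = {}
    for i in h:
        msgs = []
        for (a, b), e in edges.items():
            if a == i:
                msgs.append(message(h[i], h[b], e))
            elif b == i:
                msgs.append(message(h[i], h[a], e))
        new_h[i] = update(h[i], aggregate(msgs, dim))
    return new_h

# Toy graph: three atoms in a chain 0 - 1 - 2 with 2-dimensional features.
h = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
edges = {(0, 1): 1.0, (1, 2): 2.0}
out = gnn_layer(h, edges)

# Listing the edges in a different order must not change the result.
out_reordered = gnn_layer(h, {(1, 2): 2.0, (0, 1): 1.0})
```

Because the aggregation is a sum, `out` and `out_reordered` agree, which is the practical meaning of “permutation invariant” here.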
Use Case 1: Virtual Screening (QSAR/QSPR)
Stokes et al, Cell (2020) https://doi.org/10.1016/j.cell.2020.01.021
Marchant, Nature (2020) https://doi.org/10.1038/d41586-020-00018-3
Tasks: Active/Inactive (Classification), LogGI50 (Regression)
Standard ML: ExtraTrees w/ ECFP6(1024)
GNN: ChemProp (Directed MPNN)
ChemProp (Yang et al, 2019) from MIT MLPDS (Machine Learning for Pharmaceutical Discovery and Synthesis) Consortium
Performance for unseen (test) data:
• Classification accuracy (Active/Inactive): 95.079% (Standard ML), 95.604% (GNN)
• Regression for LogGI50: RMSE 0.6076, RMSE 0.7970
Disclaimer: This is just a toy demo. The task should be treated as classification of ACTIVITY_OUTCOME (Active or Inactive).
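The two metrics reported on this slide, classification accuracy for Active/Inactive and RMSE for LogGI50, can be computed as in the short sketch below. The helper names `accuracy` and `rmse` are hypothetical, and the labels and values are made-up toy inputs, not the data behind the reported numbers.

```python
import math

def accuracy(y_true, y_pred):
    # Fraction of predictions that match the true class label.
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def rmse(y_true, y_pred):
    # Root mean squared error between true and predicted values.
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

# Toy inputs (illustrative only).
labels_true = ["Active", "Inactive", "Inactive", "Active"]
labels_pred = ["Active", "Inactive", "Active", "Active"]
loggi50_true = [-5.0, -6.0]
loggi50_pred = [-5.5, -6.5]

acc = accuracy(labels_true, labels_pred)
err = rmse(loggi50_true, loggi50_pred)
```

Higher accuracy is better, while lower RMSE is better, so the two rows of the comparison read in opposite directions.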
Use Case 2: Quantum chemistry
input (gdb_21014) → output
Quantum chemical calculations: 1000 sec
Density Functional Theory (DFT), B3LYP/6-31G(2df, p)
$$\hat{H}\psi = E\psi$$
by solving a one-electron Schrödinger equation (Kohn–Sham equation)
ML: 0.01 sec ≈ 100,000 times faster!
Use Case 2: Quantum chemistry
input molecule H2O (gdb_3), atoms 0, 1, 2
$Z_0 = 8$, $Z_1 = 1$, $Z_2 = 1$
$r_0 = [-0.0344, 0.9775, 0.0076]$
$r_1 = [0.0648, 0.0206, 0.0015]$
$r_2 = [0.8718, 1.3008, 0.0007]$
graph (SchNet)
• atom features: $x_0$, $x_1$, $x_2$
• edges w/ cutoff (10Å), edge_index: (0, 1), (0, 2), (1, 2)
• bond features: $w_{01}$, $w_{02}$, $w_{12}$
• distances $r_{ij} := \lVert r_i - r_j \rVert$: $r_{01} = 0.9620$, $r_{02} = 0.9622$, $r_{12} = 1.5133$
SchNet (Schütt et al, 2017)
Message Passing with residual connections:
$$x_i \leftarrow x_i + \phi\Big(\sum_{j \in N_i} \psi(x_j) \odot \omega_{ij}\Big)$$
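The graph construction for H2O on this slide (pairwise distances, edges within a cutoff, then a residual message-passing step) can be reproduced with a few lines of plain Python. This is a sketch under stated assumptions: the first coordinate of r_0 is taken as -0.0344, which reproduces the listed distances; the Gaussian edge weight and the identity maps in `residual_update` are hypothetical stand-ins for SchNet's learned filter and layers; and all function names are illustrative.

```python
import math

# Atomic numbers and positions for H2O (gdb_3) as listed on the slide.
# Assumption: the first coordinate of atom 0 is negative.
Z = [8, 1, 1]
pos = [
    [-0.0344, 0.9775, 0.0076],
    [0.0648, 0.0206, 0.0015],
    [0.8718, 1.3008, 0.0007],
]
CUTOFF = 10.0  # cutoff radius in angstroms

def distance(a, b):
    # Euclidean distance r_ij = ||r_i - r_j||.
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def build_edges(pos, cutoff):
    # Undirected edges (i < j) whose interatomic distance is within the cutoff.
    edge_index, dist = [], []
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            d = distance(pos[i], pos[j])
            if d <= cutoff:
                edge_index.append((i, j))
                dist.append(d)
    return edge_index, dist

def gaussian_weight(d, width=1.0):
    # Hypothetical distance-based edge weight (SchNet learns this filter instead).
    return math.exp(-((d / width) ** 2))

def residual_update(x, edge_index, weights):
    # x_i <- x_i + sum_j x_j * w_ij, with identity maps in place of learned layers.
    new_x = list(x)
    for (i, j), w in zip(edge_index, weights):
        new_x[i] += x[j] * w
        new_x[j] += x[i] * w
    return new_x

edge_index, dist = build_edges(pos, CUTOFF)
weights = [gaussian_weight(d) for d in dist]
x = [float(z) for z in Z]  # toy scalar atom features: the atomic numbers
x_new = residual_update(x, edge_index, weights)
```

With a 10 Å cutoff every atom pair in a molecule this small is connected, so the graph is complete and the three distances are the only geometric inputs.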
Use Case 2: Quantum chemistry
[Figure: pred vs true scatter plots for SchNet (Schütt et al, 2017) and DimeNet (Klicpera et al, 2020); panels for each model: Dipole Moment, Energy U, HOMO, LUMO, Heat Capacity, Enthalpy H]
Use Case 2: Quantum chemistry
[Figure: y_pred vs y_true scatter plots for Free Energy, one panel per model: SchNet (Schütt et al, 2017); DimeNet (Klicpera et al, 2020); ExtraTrees w/ ECFP6, LightGBM w/ ECFP6, and 3-Layer MLP w/ ECFP6 (each without 3D geometry)]
Chemical Reaction Design and Discovery
[Figure: Potential Energy Surface with axes $E$, $r_i$, $r_j$; a Chemical Reaction runs from EQ1 through TS to EQ2]
ML for Energy $E$ and Forces:
• Acceleration by ML potential?
• Artificial force learning?
• Scope/network expansion?
• Fill any gap between theory and experiments (reality) by data?
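The relationship between the energy surface, the forces, and the EQ/TS points can be illustrated with a toy one-dimensional potential. This is a sketch only: the double-well function `energy` is a hypothetical stand-in for a real potential energy surface (which would come from a quantum chemical calculation or an ML potential), and in one dimension the transition state is a simple maximum, whereas on a real surface it is a first-order saddle point.

```python
def energy(x):
    # Toy 1D double-well potential energy surface: E(x) = (x^2 - 1)^2.
    return (x * x - 1.0) ** 2

def force(x):
    # Force is the negative gradient of the energy: F = -dE/dx = -4x(x^2 - 1).
    return -4.0 * x * (x * x - 1.0)

def numerical_force(x, h=1e-5):
    # Central finite difference, used only to check the analytic force.
    return -(energy(x + h) - energy(x - h)) / (2.0 * h)

def relax(x, step=0.05, n_steps=500):
    # Follow the force downhill to a nearby equilibrium (EQ) structure.
    for _ in range(n_steps):
        x += step * force(x)
    return x

eq1 = relax(-0.5)  # relaxes to the minimum at x = -1
eq2 = relax(0.5)   # relaxes to the minimum at x = +1
ts = 0.0           # stationary point between the two minima
barrier = energy(ts) - energy(eq1)
```

An ML potential replaces `energy` (and hence `force`) with a learned model, which is where the acceleration asked about on the slide would come from.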
Machine Learning and Machine Discovery
• Machine Learning: many fascinating technical topics of my long-standing interest in “ML with combinatorial structures” (such as GNNs)
• Machine Discovery: many long-standing important open problems towards “AI for automating discovery”
[Diagram: Discovery]
Prior Info:
• Observational data
• Reported facts
• Textbook knowledge
Representation
Model (Belief)
Hypothesis
Intervention
New Info
• Identify relevant variables
• Set design choices
• Set experiments
• Interpret results
An exciting “real-world” test bench for ML researchers!
This Slide: https://itakigawa.github.io/data/icred_med_202110.pdf