SlideShare a Scribd company logo
1 of 44
Download to read offline
via Learning to represent, predict,
generate and explain
A/Prof Truyen Tran
Head of AI, Health and Science
AI for automated
materials discovery
26/05/2023 2
Solves
Inspires
AI/ML
Memory
Learning
Reasoning
Computer
vision
Human-AI
Teaming
Optimisation
Industry
Health
Software
Drug
discovery
Materials
science Manufact
uring
Business
processes
Energy
What we do @A2I2
26/05/2023 3
Agrawal, A., & Choudhary, A. (2016). Perspective: Materials informatics and big data: Realization of the “fourth paradigm” of
science in materials science. Apl Materials, 4(5), 053208.
The 5th paradigm
(2020-present)
• Advanced deep learning
• Massive data simulation
• Powerful Foundation
Models
https://www.microsoft.com/en-us/research/blog/ai4science-to-empower-the-fifth-paradigm-of-scientific-discovery/
Challenges
Materials science: Materials discovery is
very slow and extremely costly.
Automated chemist: Chemical interaction
and reaction prediction is key for advancing
chemistry, but extremely challenging.
26/05/2023 4
Materials discovery as smart
search over in exponential
space
5
#REF: Gómez-Bombarelli, Rafael, et al. "Automatic chemical design using a data-driven
continuous representation of molecules." ACS Central Science (2016).
Photo credit: wustl.edu
Molecular search space: 1023 to 1060
| Knowledge-driven
| AI-driven
Space of innovation
• Molecular space exploration
• Small, medium, large, supra
• Molecular interaction
• Network, docking
• Chemical reaction, retrosynthesis
• Catalyst, yield, free-energy
• Crystal space exploration
• Alloy space exploration
• Microstructures
• Knowledge extraction, coding, expression,
manipulation
27/05/2023 6
• Representation
• Graphs, geometry, periodicity, token
• Materials manifold
• Learning, attention and memory
• Self-supervised, supervised, reinforcement
• Transfer, zero-shot, few-shot, adaptation learning
• Learning to reason
• Reasoning
• Optimisation
• Extrapolation, generation
• Abductive, inductive, deductive reasoning
Materials AI/ML
Image: Shutterstock
AI/ML Topics
26/05/2023 7
REPRESENTATION PREDICTION OPTIMIZATION &
GENERALIZATION
EXPLANATION
Molecule → fingerprints
26/05/2023 8
#REF: Duvenaud, David K., et al.
"Convolutional networks on graphs for
learning molecular fingerprints." Advances
in neural information processing systems.
2015.
• Graph → vector. Mostly discrete. Substructures
coded.
• Vectors are easy to manipulate. Not easy to
reconstruct the graphs from fingerprints.
Kadurin, Artur, et al. "The cornucopia of meaningful leads: Applying deep adversarial
autoencoders for new molecule development in oncology." Oncotarget 8.7 (2017): 10883.
Source: wikipedia.org
Molecule → string
• SMILES = Simplified Molecular-Input Line-
Entry System
• Ready for encoding/decoding with
sequential models (seq2seq, RL,
Transformer).
• BUT …
• String → graphs is not unique!
• Lots of string are invalid
• Precise 3D information is lost
• Short range in graph may become long range in
string
26/05/2023 9
#REF: Gómez-Bombarelli, Rafael, et al. "Automatic chemical design using a
data-driven continuous representation of molecules." arXiv preprint
arXiv:1610.02415 (2016).
26/05/2023 10
Molecule → graphs
• No regular, fixed-size structures
• Graphs are permutation invariant:
• #permutations are exponential function of #nodes
• The probability of a generated graph G need to be
marginalized over all possible permutations
#REF: Pham, T., Tran, T., & Venkatesh, S. (2018).
Relational dynamic memory networks. arXiv
preprint arXiv:1808.04247.
Input
process
Memory
process
Output
process
Controller
process
Message
passing
• Multiple objectives:
• Diversity of generated graphs
• Smoothness of latent space
• Agreement with or optimization
of multiple “drug-like” objectives
Representing proteins
• 1D sequence (vocab of size 20) –
hundreds to thousands in length
• 2D contact map – requires
prediction
• 3D structure – requires folding
information, either observed or
predicted. Now available thanks
to AlphaFold 2.
• NLP-inspired embedding
(word2vec, doc2vec, glove,
seq2vec, ELMo, BERT, GPT).
26/05/2023 11
#REF: Yang, K. K., Wu, Z., Bedbrook, C. N., & Arnold, F.
H. (2018). Learned protein embeddings for machine
learning. Bioinformatics, 34(15), 2642-2648.
26/05/2023 12
Crystal structure
• Definition:
• Crystal structure is the repeating arrangement in the 3D space of atoms
throughout the crystal.
• Crystal structure is presented by the arrangement of atoms within the unit cell.
• The atom interacts with atoms within unit cell and adjacent unit cells.
Crystal structure Ac₂AgIr Unit cell of crystal structure Ac₂AgIr
Slide credit: Tri Nguyen
26/05/2023 13
Crystal structure representation
• Crystal structure input:
• Atom type
• Atom coordinates
• Periodic lattice
• Multi-graph representation to model the periodic interaction
Slide credit: Tri Nguyen
Representing microstructures of crystal
mixture
26/05/2023 14
Generate prior 𝛽 grains
Add transformation phases
Generate dual phase models Feature information
Phase %
Orientation
Grain size
Distance to
triple point
• Input information for each
microstructure saved per voxel
• Saved data considers local
environment
Volume domain: 106
voxels
Slide credit: Sterjovski and Agius
26/05/2023 15
Topics
REPRESENTATION PREDICTION OPTIMIZATION &
GENERALIZATION
EXPLANATION
Molecular properties prediction
• Traditional techniques:
• Graph kernels (ML)
• Molecular fingerprints
(Chemistry)
• Modern techniques
• Molecule as graph: atoms as
nodes, chemical bonds as
edges
26/05/2023 16
#REF: Penmatsa, Aravind, Kevin H. Wang, and Eric Gouaux. "X-ray
structure of dopamine transporter elucidates antidepressant
mechanism." Nature 503.7474 (2013): 85-90.
A graph processing machine for molecular
property prediction
26/05/2023 17
#REF: Pham, T., Tran, T., & Venkatesh, S. (2018).
Relational dynamic memory networks. arXiv
preprint arXiv:1808.04247.
Input
process
Memory
process
Output
process
Controller
process
Message
passing
Unrolling
Controller
Memory
Graph
Query Output
Read Write
Multi-target prediction
26/05/2023 18
Possible
targets
Molecular
graph
#REF: Do, Kien, et al. "Attentional Multilabel
Learning over Graphs-A message passing
approach." Machine Learning, 2019.
Predict multiple properties
26/05/2023 19
#REF: Do, Kien, et al. "Attentional Multilabel Learning over Graphs-A message passing
approach." Machine Learning, 2019.
Chemical-chemical interaction via
Relational Dynamic Memory Networks
26/05/2023 20
𝑴1 … 𝑴𝐶
𝒓𝑡
1
…
𝒓𝑡
𝐾
𝒓𝑡
∗
Controller
Write
𝒉𝑡
Memo
ry
Graph
Query Output
Read
heads
#REF: Pham, Trang, Truyen Tran, and Svetha Venkatesh. "Relational
dynamic memory networks." arXiv preprint arXiv:1808.04247(2018).
Drug and protein binding
26/05/2023 21
Drug molecule
- Binds to protein
binding site
- Changes its target
activity
- Binding strength is
the binding affinity
Protein
- May change its
conformation due to
interaction with drug
molecule
- Its function is altered due
to the present of drug
molecule at its binding site
Image credit: Lancet
Slide credit: Tri Nguyen
26/05/2023 22
GEFA: Drug-protein binding as graph-in-
graph interaction
Protein graph
Drug graph
A
K
L
A
T
A
Drug
Graph-in-Graph
interaction
Nguyen, T. M., Nguyen, T., Le, T. M., & Tran, T. (2021). “GEFA: Early Fusion Approach in Drug-Target
Affinity Prediction”. IEEE/ACM Transactions on Computational Biology and Bioinformatics
Slide credit: Tri Nguyen
26/05/2023 23
GEFA (cont.)
Nguyen, T. M., Nguyen, T., Le, T. M., & Tran, T. (2021). “GEFA: Early Fusion Approach in Drug-Target Affinity
Prediction”. IEEE/ACM Transactions on Computational Biology and Bioinformatics
Slide credit: Tri Nguyen
Predicting stress-strain curve from
crystal mixture
• Transformer to leverage
long-range
dependencies between
voxels
• Input: Feature vectors
per voxel.
• Output: Strain curve per
voxel.
26/05/2023 24
26/05/2023 25
Topics
REPRESENTATION PREDICTION OPTIMIZATION &
GENERALIZATION
EXPLANATION
Molecular generation
• The molecular space is estimated to
be 1e+23 to 1e+60
• Only 1e+8 substances synthesized thus
far.
• It is impossible to model this space
fully.
• The current technologies are not
mature for graph generations.
• But approximate techniques do
exist.
26/05/2023 26
Source: pharmafactz.com
Combinatorial chemistry
• Generate variations on a template
• Returns a list of molecules from this template that
• Bind to the pocket with good pharmacodynamics?
• Have good pharmacokinetics?
• Are synthetically accessible?
26/05/2023 27
#REF: Talk by Chloé-Agathe Azencott titled “Machine learning for therapeutic
research”, 12/10/2017
26/05/2023 28
Retrosynthesis
prediction
• Once a molecular structure is
designed, how do we synthesize it?
• Retrosynthesis planning/prediction
• Identify a set of reactants to synthesize a
target molecule
• This is reverse of chemical reaction
prediction
Picture source: Tim Soderberg, “Retrosynthetic analysis and metabolic pathway prediction”, Organic Chemistry With a Biological Emphasis,
2016. URL: https://chem.libretexts.org/Courses/Oregon_Institute_of_Technology/OIT%3A_CHE_333_-
_Organic_Chemistry_III_(Lund)/2%3A_Retrosynthetic_analysis_and_metabolic_pathway_prediction
GTPN: Synthesis via reaction
prediction as neural graph morphism
• Input: A set of graphs = a
single big graph with
disconnected components
• Output: A new set of
graphs. Same nodes,
different edges.
• Model: Graph morphism
• Method: Graph
transformation policy
network (GTPN)
26/05/2023 29
Kien Do, Truyen Tran, and Svetha Venkatesh. "Graph Transformation Policy Network for Chemical Reaction
Prediction." KDD’19.
26/05/2023 30
Alloy design generation
• Scientific innovations are expensive
• One search per specific target
• Availability of growing data
Nguyen, P., Tran, T., Gupta, S., Rana, S., Barnett, M. and Venkatesh, S., 2019, May. Incomplete conditional density estimation for fast materials discovery.
In Proceedings of the 2019 SIAM International Conference on Data Mining (pp. 549-557). Society for Industrial and Applied Mathematics.
26/05/2023 31
Inverse design
• Leverage the existing data
and query the simulators in
an offline mode
• Avoid the global
optimization by learning the
inverse design function f -1(y)
• Predict design variables in a
single step
26/05/2023 32
Incomplete conditional density estimation
• Multimodal density estimation given
incomplete conditions
• However, integrating over h is still intractable, we
approximate the expectation by a function evaluation at
the mode
26/05/2023 33
Generated alloys
example
• Known-alloy dataset:
15,000 variations from 30
known series of
Aluminum alloys
• BO-search dataset:
15,000 variations from
1000 found alloys by
Bayesian optimization
• Input: phase diagram |
Output: element
composition
26/05/2023 34
Crystal structure generation
• Application in structure discovery: battery, aerospace
materials, etc.
• The stability of a solid-state crystal structure is connected
to its formation energy
• Target:
• Generate crystal-like structure
• Has low formation energy
• Diversity set of crystal structure candidates for active learning
Slide credit: Tri Nguyen
26/05/2023 35
GFlownet
• GFlownet learns to generate the composition object:
• From the starting state, policy network output the
probability distribution over building blocks
• Select building blocks randomly based on the output
probability distribution and create a new state →
calculate the new probability distribution
• Repeat until reaching the terminal state
• Getting the reward from the environment (sparse
reward)
• The complete set of actions from starting state to
terminal state is a trajectory
• Flow is a non-negative function defined on the set of
complete trajectory
• GFlownet is trained by matching the flow going
through state: in-flow = out-flow
Slide credit: Tri Nguyen
26/05/2023 36
GFlownet
• Advantage of GFlownet
• Diverse set of candidates → avoid
getting stuck in multi-modal
distribution (e.g stability/energy
landscape of crystal structure)
• Can sample in proportion to a given
reward function (crystal structure
generation: formation energy)
Slide credit: Tri Nguyen
26/05/2023 37
Crystal structure generation with
GFlownet
• State:
• Multi-graph representation for structure:
• Node: atoms
• Node feature: element type, fraction coordinate
• Edges: built using near-neighbor-based method CrystalNN with search cut-off starting
from 13 and increasing to 20
• Edge feature: cell-direction vector ‘to_image’, bond distance
• 3D grid space: currently occupied and available position to insert new atom
• Action:
• Available fraction coordinate on a 3D grid.
• The chosen element
Slide credit: Tri Nguyen
26/05/2023 38
GFlownet -
Forward policy
• Policy network:
calculate the probability
distribution over actions
Examples of
generated crystal
structure
Slide credit: Tri Nguyen
26/05/2023 39
Topics
REPRESENTATION PREDICTION OPTIMIZATION &
GENERALIZATION
EXPLANATION
Explaining DTA deep learning model:
feature attribution
26/05/2023
Explainer
Protein
Drug
Deep learning model
Affinity = 6.8
Show the contribution of
each part of input to the
model decision
Does not show the
causal relationship
between the input and
the output of model
40
Slide credit: Tri Minh Nguyen
26/05/2023 41
Drug agent
Protein agent
Actions:
Removing atoms,
bonds
Add atoms, bonds
Actions:
- Substituting one
residue of other
types with Alanine
DTA model
Drug-Target
Environment
Reward
+ Drug representation similarity
+ Protein representation similarity
+ Δ Affinity
Communicate to form a
common action
MACDA: MultiAgent Counterfactual Drug-target Affinity framework
Nguyen, T.M., Quinn,
T.P., Nguyen, T. and
Tran, T., 2022. Explaining
Black Box Drug Target
Prediction through
Model Agnostic
Counterfactual
Samples. IEEE/ACM
Transactions on
Computational Biology
and Bioinformatics.
Slide credit: Tri Minh Nguyen
26/05/2023 42
The road ahead
Image source: pobble365
Grand framework
• Two-step paradigm:
• Step 1: Compress ALL materials knowledge into a giant model.
• Data, context as episodic memory | Model weights as semantic memory.
• Step 2: Decompress knowledge into something new.
• This requires learning to reason – learn how to manipulate existing knowledge.
• Search, plan are reasoning. Both aims to minimize an objective (e.g., matching or energy).
26/05/2023 43
Picture taken from (Bommasani et al, 2021)
• Learning to reason (zero-shot) all.
• Eying few-shot capability (e.g., materials
prompting).
• Leverage LLMs capability.
Prediction versus understanding
• We can predict well without understanding (e.g.,
planet/star motion Newton).
• Guessing the God’s many complex behaviours versus
knowing his few universal laws.
• → Automated laws discovery!
• → Abductive reasoning.
26/05/2023 44

More Related Content

What's hot

Composite material
Composite materialComposite material
Composite materialsangeetkhule
 
Composite materials
Composite materialsComposite materials
Composite materialstej kumar
 
Fracture mechanics
Fracture mechanicsFracture mechanics
Fracture mechanicsDeepak Samal
 
Technological trends in hot forging practice
Technological trends in hot forging practiceTechnological trends in hot forging practice
Technological trends in hot forging practiceManohar M Hegde
 
Semi Solid Metal Casting
Semi Solid Metal CastingSemi Solid Metal Casting
Semi Solid Metal CastingAmruta Rane
 
Lec 2 compression test
Lec 2 compression  testLec 2 compression  test
Lec 2 compression testRania Atia
 
specimen preparation for microscopic observation
specimen preparation for microscopic observationspecimen preparation for microscopic observation
specimen preparation for microscopic observationdvj_gajjar
 
Polymer Matrix Composites (PMC) Manufacturing and application
Polymer Matrix Composites (PMC) Manufacturing and applicationPolymer Matrix Composites (PMC) Manufacturing and application
Polymer Matrix Composites (PMC) Manufacturing and applicationSazzad Hossain
 
Stereolithography - Presentation By Hitesh Sharma
Stereolithography - Presentation By Hitesh SharmaStereolithography - Presentation By Hitesh Sharma
Stereolithography - Presentation By Hitesh SharmaHitesh Sharma
 
material technology previous question paper
 material technology previous question paper material technology previous question paper
material technology previous question paperbalajirao mahendrakar
 
Fracture toughness measurement testing
Fracture toughness measurement testingFracture toughness measurement testing
Fracture toughness measurement testingkhileshkrbhandari
 

What's hot (20)

Composite materials lecture
Composite materials lectureComposite materials lecture
Composite materials lecture
 
Composite material
Composite materialComposite material
Composite material
 
Composite materials
Composite materialsComposite materials
Composite materials
 
Metal fatigue ppt
Metal fatigue pptMetal fatigue ppt
Metal fatigue ppt
 
ME6403 - ENGINEERING MATERIALS AND METALLURGY BY Mr.ARIVUMANI RAVANAN /AP/MEC...
ME6403 - ENGINEERING MATERIALS AND METALLURGY BY Mr.ARIVUMANI RAVANAN /AP/MEC...ME6403 - ENGINEERING MATERIALS AND METALLURGY BY Mr.ARIVUMANI RAVANAN /AP/MEC...
ME6403 - ENGINEERING MATERIALS AND METALLURGY BY Mr.ARIVUMANI RAVANAN /AP/MEC...
 
Fatigue Failure Slides
Fatigue Failure SlidesFatigue Failure Slides
Fatigue Failure Slides
 
Fracture mechanics
Fracture mechanicsFracture mechanics
Fracture mechanics
 
Fracture mechanics
Fracture mechanicsFracture mechanics
Fracture mechanics
 
Technological trends in hot forging practice
Technological trends in hot forging practiceTechnological trends in hot forging practice
Technological trends in hot forging practice
 
Semi Solid Metal Casting
Semi Solid Metal CastingSemi Solid Metal Casting
Semi Solid Metal Casting
 
Hardness testing
Hardness testingHardness testing
Hardness testing
 
Lec 2 compression test
Lec 2 compression  testLec 2 compression  test
Lec 2 compression test
 
specimen preparation for microscopic observation
specimen preparation for microscopic observationspecimen preparation for microscopic observation
specimen preparation for microscopic observation
 
Polymer Matrix Composites (PMC) Manufacturing and application
Polymer Matrix Composites (PMC) Manufacturing and applicationPolymer Matrix Composites (PMC) Manufacturing and application
Polymer Matrix Composites (PMC) Manufacturing and application
 
Stereolithography - Presentation By Hitesh Sharma
Stereolithography - Presentation By Hitesh SharmaStereolithography - Presentation By Hitesh Sharma
Stereolithography - Presentation By Hitesh Sharma
 
Textile composite testing
Textile composite testingTextile composite testing
Textile composite testing
 
Fracture Mechanics & Failure Analysis:Lecture Toughness and fracture toughness
Fracture Mechanics & Failure Analysis:Lecture Toughness and fracture toughnessFracture Mechanics & Failure Analysis:Lecture Toughness and fracture toughness
Fracture Mechanics & Failure Analysis:Lecture Toughness and fracture toughness
 
material technology previous question paper
 material technology previous question paper material technology previous question paper
material technology previous question paper
 
FRACTURE BEHAVIOUR
FRACTURE BEHAVIOURFRACTURE BEHAVIOUR
FRACTURE BEHAVIOUR
 
Fracture toughness measurement testing
Fracture toughness measurement testingFracture toughness measurement testing
Fracture toughness measurement testing
 

Similar to AI for automated materials discovery via learning to represent, predict, generate and explain

Physics inspired artificial intelligence/machine learning
Physics inspired artificial intelligence/machine learningPhysics inspired artificial intelligence/machine learning
Physics inspired artificial intelligence/machine learningKAMAL CHOUDHARY
 
Making effective use of graphics processing units (GPUs) in computations
Making effective use of graphics processing units (GPUs) in computationsMaking effective use of graphics processing units (GPUs) in computations
Making effective use of graphics processing units (GPUs) in computationsOregon State University
 
Perspectives on chemical composition and crystal structure representations fr...
Perspectives on chemical composition and crystal structure representations fr...Perspectives on chemical composition and crystal structure representations fr...
Perspectives on chemical composition and crystal structure representations fr...Anubhav Jain
 
Automated Machine Learning Applied to Diverse Materials Design Problems
Automated Machine Learning Applied to Diverse Materials Design ProblemsAutomated Machine Learning Applied to Diverse Materials Design Problems
Automated Machine Learning Applied to Diverse Materials Design ProblemsAnubhav Jain
 
The Status of ML Algorithms for Structure-property Relationships Using Matb...
The Status of ML Algorithms for Structure-property Relationships Using Matb...The Status of ML Algorithms for Structure-property Relationships Using Matb...
The Status of ML Algorithms for Structure-property Relationships Using Matb...Anubhav Jain
 
2D/3D Materials screening and genetic algorithm with ML model
2D/3D Materials screening and genetic algorithm with ML model2D/3D Materials screening and genetic algorithm with ML model
2D/3D Materials screening and genetic algorithm with ML modelaimsnist
 
IEEE Datamining 2016 Title and Abstract
IEEE  Datamining 2016 Title and AbstractIEEE  Datamining 2016 Title and Abstract
IEEE Datamining 2016 Title and Abstracttsysglobalsolutions
 
Automating Machine Learning - Is it feasible?
Automating Machine Learning - Is it feasible?Automating Machine Learning - Is it feasible?
Automating Machine Learning - Is it feasible?Manuel Martín
 
Software tools, crystal descriptors, and machine learning applied to material...
Software tools, crystal descriptors, and machine learning applied to material...Software tools, crystal descriptors, and machine learning applied to material...
Software tools, crystal descriptors, and machine learning applied to material...Anubhav Jain
 
[20240318_LabSeminar_Huy]GSTNet: Global Spatial-Temporal Network for Traffic ...
[20240318_LabSeminar_Huy]GSTNet: Global Spatial-Temporal Network for Traffic ...[20240318_LabSeminar_Huy]GSTNet: Global Spatial-Temporal Network for Traffic ...
[20240318_LabSeminar_Huy]GSTNet: Global Spatial-Temporal Network for Traffic ...thanhdowork
 
Machine Learning for Chemistry: Representing and Intervening
Machine Learning for Chemistry: Representing and InterveningMachine Learning for Chemistry: Representing and Intervening
Machine Learning for Chemistry: Representing and InterveningIchigaku Takigawa
 
Applications of Machine Learning for Materials Discovery at NREL
Applications of Machine Learning for Materials Discovery at NRELApplications of Machine Learning for Materials Discovery at NREL
Applications of Machine Learning for Materials Discovery at NRELaimsnist
 
2014 11-13-sbsm032-reproducible research
2014 11-13-sbsm032-reproducible research2014 11-13-sbsm032-reproducible research
2014 11-13-sbsm032-reproducible researchYannick Wurm
 
A Novel Feature Selection with Annealing For Computer Vision And Big Data Lea...
A Novel Feature Selection with Annealing For Computer Vision And Big Data Lea...A Novel Feature Selection with Annealing For Computer Vision And Big Data Lea...
A Novel Feature Selection with Annealing For Computer Vision And Big Data Lea...theijes
 

Similar to AI for automated materials discovery via learning to represent, predict, generate and explain (20)

Physics inspired artificial intelligence/machine learning
Physics inspired artificial intelligence/machine learningPhysics inspired artificial intelligence/machine learning
Physics inspired artificial intelligence/machine learning
 
Making effective use of graphics processing units (GPUs) in computations
Making effective use of graphics processing units (GPUs) in computationsMaking effective use of graphics processing units (GPUs) in computations
Making effective use of graphics processing units (GPUs) in computations
 
AI that/for matters
AI that/for mattersAI that/for matters
AI that/for matters
 
Perspectives on chemical composition and crystal structure representations fr...
Perspectives on chemical composition and crystal structure representations fr...Perspectives on chemical composition and crystal structure representations fr...
Perspectives on chemical composition and crystal structure representations fr...
 
Automated Machine Learning Applied to Diverse Materials Design Problems
Automated Machine Learning Applied to Diverse Materials Design ProblemsAutomated Machine Learning Applied to Diverse Materials Design Problems
Automated Machine Learning Applied to Diverse Materials Design Problems
 
The Status of ML Algorithms for Structure-property Relationships Using Matb...
The Status of ML Algorithms for Structure-property Relationships Using Matb...The Status of ML Algorithms for Structure-property Relationships Using Matb...
The Status of ML Algorithms for Structure-property Relationships Using Matb...
 
MUSEPosterCoGAPS
MUSEPosterCoGAPSMUSEPosterCoGAPS
MUSEPosterCoGAPS
 
2D/3D Materials screening and genetic algorithm with ML model
2D/3D Materials screening and genetic algorithm with ML model2D/3D Materials screening and genetic algorithm with ML model
2D/3D Materials screening and genetic algorithm with ML model
 
AI for Science
AI for ScienceAI for Science
AI for Science
 
IEEE Datamining 2016 Title and Abstract
IEEE  Datamining 2016 Title and AbstractIEEE  Datamining 2016 Title and Abstract
IEEE Datamining 2016 Title and Abstract
 
Automating Machine Learning - Is it feasible?
Automating Machine Learning - Is it feasible?Automating Machine Learning - Is it feasible?
Automating Machine Learning - Is it feasible?
 
Research Proposal
Research ProposalResearch Proposal
Research Proposal
 
Software tools, crystal descriptors, and machine learning applied to material...
Software tools, crystal descriptors, and machine learning applied to material...Software tools, crystal descriptors, and machine learning applied to material...
Software tools, crystal descriptors, and machine learning applied to material...
 
[20240318_LabSeminar_Huy]GSTNet: Global Spatial-Temporal Network for Traffic ...
[20240318_LabSeminar_Huy]GSTNet: Global Spatial-Temporal Network for Traffic ...[20240318_LabSeminar_Huy]GSTNet: Global Spatial-Temporal Network for Traffic ...
[20240318_LabSeminar_Huy]GSTNet: Global Spatial-Temporal Network for Traffic ...
 
Machine Learning for Chemistry: Representing and Intervening
Machine Learning for Chemistry: Representing and InterveningMachine Learning for Chemistry: Representing and Intervening
Machine Learning for Chemistry: Representing and Intervening
 
Collins seattle-2014-final
Collins seattle-2014-finalCollins seattle-2014-final
Collins seattle-2014-final
 
Applications of Machine Learning for Materials Discovery at NREL
Applications of Machine Learning for Materials Discovery at NRELApplications of Machine Learning for Materials Discovery at NREL
Applications of Machine Learning for Materials Discovery at NREL
 
2014 11-13-sbsm032-reproducible research
2014 11-13-sbsm032-reproducible research2014 11-13-sbsm032-reproducible research
2014 11-13-sbsm032-reproducible research
 
ME Synopsis
ME SynopsisME Synopsis
ME Synopsis
 
A Novel Feature Selection with Annealing For Computer Vision And Big Data Lea...
A Novel Feature Selection with Annealing For Computer Vision And Big Data Lea...A Novel Feature Selection with Annealing For Computer Vision And Big Data Lea...
A Novel Feature Selection with Annealing For Computer Vision And Big Data Lea...
 

More from Deakin University

Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Deep learning and reasoning: Recent advances
Deep learning and reasoning: Recent advancesDeep learning and reasoning: Recent advances
Deep learning and reasoning: Recent advancesDeakin University
 
Deep analytics via learning to reason
Deep analytics via learning to reasonDeep analytics via learning to reason
Deep analytics via learning to reasonDeakin University
 
Generative AI to Accelerate Discovery of Materials
Generative AI to Accelerate Discovery of MaterialsGenerative AI to Accelerate Discovery of Materials
Generative AI to Accelerate Discovery of MaterialsDeakin University
 
Generative AI: Shifting the AI Landscape
Generative AI: Shifting the AI LandscapeGenerative AI: Shifting the AI Landscape
Generative AI: Shifting the AI LandscapeDeakin University
 
From deep learning to deep reasoning
From deep learning to deep reasoningFrom deep learning to deep reasoning
From deep learning to deep reasoningDeakin University
 
Machine Learning and Reasoning for Drug Discovery
Machine Learning and Reasoning for Drug DiscoveryMachine Learning and Reasoning for Drug Discovery
Machine Learning and Reasoning for Drug DiscoveryDeakin University
 
Deep learning 1.0 and Beyond, Part 2
Deep learning 1.0 and Beyond, Part 2Deep learning 1.0 and Beyond, Part 2
Deep learning 1.0 and Beyond, Part 2Deakin University
 
Deep learning 1.0 and Beyond, Part 1
Deep learning 1.0 and Beyond, Part 1Deep learning 1.0 and Beyond, Part 1
Deep learning 1.0 and Beyond, Part 1Deakin University
 
AI/ML as an empirical science
AI/ML as an empirical scienceAI/ML as an empirical science
AI/ML as an empirical scienceDeakin University
 
Machine Reasoning at A2I2, Deakin University
Machine Reasoning at A2I2, Deakin UniversityMachine Reasoning at A2I2, Deakin University
Machine Reasoning at A2I2, Deakin UniversityDeakin University
 
AI for tackling climate change
AI for tackling climate changeAI for tackling climate change
AI for tackling climate changeDeakin University
 
Deep learning and applications in non-cognitive domains I
Deep learning and applications in non-cognitive domains IDeep learning and applications in non-cognitive domains I
Deep learning and applications in non-cognitive domains IDeakin University
 
Deep learning and applications in non-cognitive domains II
Deep learning and applications in non-cognitive domains IIDeep learning and applications in non-cognitive domains II
Deep learning and applications in non-cognitive domains IIDeakin University
 
Deep learning and applications in non-cognitive domains III
Deep learning and applications in non-cognitive domains IIIDeep learning and applications in non-cognitive domains III
Deep learning and applications in non-cognitive domains IIIDeakin University
 

More from Deakin University (20)

Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Deep learning and reasoning: Recent advances
Deep learning and reasoning: Recent advancesDeep learning and reasoning: Recent advances
Deep learning and reasoning: Recent advances
 
Deep analytics via learning to reason
Deep analytics via learning to reasonDeep analytics via learning to reason
Deep analytics via learning to reason
 
Generative AI to Accelerate Discovery of Materials
Generative AI to Accelerate Discovery of MaterialsGenerative AI to Accelerate Discovery of Materials
Generative AI to Accelerate Discovery of Materials
 
Generative AI: Shifting the AI Landscape
Generative AI: Shifting the AI LandscapeGenerative AI: Shifting the AI Landscape
Generative AI: Shifting the AI Landscape
 
From deep learning to deep reasoning
From deep learning to deep reasoningFrom deep learning to deep reasoning
From deep learning to deep reasoning
 
Machine Learning and Reasoning for Drug Discovery
Machine Learning and Reasoning for Drug DiscoveryMachine Learning and Reasoning for Drug Discovery
Machine Learning and Reasoning for Drug Discovery
 
Deep Learning 2.0
Deep Learning 2.0Deep Learning 2.0
Deep Learning 2.0
 
Deep learning 1.0 and Beyond, Part 2
Deep learning 1.0 and Beyond, Part 2Deep learning 1.0 and Beyond, Part 2
Deep learning 1.0 and Beyond, Part 2
 
Deep learning 1.0 and Beyond, Part 1
Deep learning 1.0 and Beyond, Part 1Deep learning 1.0 and Beyond, Part 1
Deep learning 1.0 and Beyond, Part 1
 
Machine reasoning
Machine reasoningMachine reasoning
Machine reasoning
 
AI/ML as an empirical science
AI/ML as an empirical scienceAI/ML as an empirical science
AI/ML as an empirical science
 
Machine Reasoning at A2I2, Deakin University
Machine Reasoning at A2I2, Deakin UniversityMachine Reasoning at A2I2, Deakin University
Machine Reasoning at A2I2, Deakin University
 
AI in the Covid-19 pandemic
AI in the Covid-19 pandemicAI in the Covid-19 pandemic
AI in the Covid-19 pandemic
 
Visual reasoning
Visual reasoningVisual reasoning
Visual reasoning
 
AI for tackling climate change
AI for tackling climate changeAI for tackling climate change
AI for tackling climate change
 
AI for drug discovery
AI for drug discoveryAI for drug discovery
AI for drug discovery
 
Deep learning and applications in non-cognitive domains I
Deep learning and applications in non-cognitive domains IDeep learning and applications in non-cognitive domains I
Deep learning and applications in non-cognitive domains I
 
Deep learning and applications in non-cognitive domains II
Deep learning and applications in non-cognitive domains IIDeep learning and applications in non-cognitive domains II
Deep learning and applications in non-cognitive domains II
 
Deep learning and applications in non-cognitive domains III
Deep learning and applications in non-cognitive domains IIIDeep learning and applications in non-cognitive domains III
Deep learning and applications in non-cognitive domains III
 

Recently uploaded

Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |aasikanpl
 
Heredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of TraitsHeredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of TraitsCharlene Llagas
 
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Patrick Diehl
 
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRCall Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRlizamodels9
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxFarihaAbdulRasheed
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...lizamodels9
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real timeSatoshi NAKAHIRA
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trssuser06f238
 
Manassas R - Parkside Middle School 🌎🏫
Manassas R - Parkside Middle School 🌎🏫Manassas R - Parkside Middle School 🌎🏫
Manassas R - Parkside Middle School 🌎🏫qfactory1
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.PraveenaKalaiselvan1
 
Gas_Laws_powerpoint_notes.ppt for grade 10
Gas_Laws_powerpoint_notes.ppt for grade 10Gas_Laws_powerpoint_notes.ppt for grade 10
Gas_Laws_powerpoint_notes.ppt for grade 10ROLANARIBATO3
 
Evidences of Evolution General Biology 2
Evidences of Evolution General Biology 2Evidences of Evolution General Biology 2
Evidences of Evolution General Biology 2John Carlo Rollon
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Nistarini College, Purulia (W.B) India
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentationtahreemzahra82
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfSwapnil Therkar
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxNandakishor Bhaurao Deshmukh
 
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 

Recently uploaded (20)

Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
 
Heredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of TraitsHeredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of Traits
 
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRCall Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 tr
 
Manassas R - Parkside Middle School 🌎🏫
Manassas R - Parkside Middle School 🌎🏫Manassas R - Parkside Middle School 🌎🏫
Manassas R - Parkside Middle School 🌎🏫
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
 
Gas_Laws_powerpoint_notes.ppt for grade 10
Gas_Laws_powerpoint_notes.ppt for grade 10Gas_Laws_powerpoint_notes.ppt for grade 10
Gas_Laws_powerpoint_notes.ppt for grade 10
 
Evidences of Evolution General Biology 2
Evidences of Evolution General Biology 2Evidences of Evolution General Biology 2
Evidences of Evolution General Biology 2
 
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort ServiceHot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentation
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
 
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 

AI for automated materials discovery via learning to represent, predict, generate and explain

  • 1. via Learning to represent, predict, generate and explain A/Prof Truyen Tran Head of AI, Health and Science AI for automated materials discovery
  • 3. 26/05/2023 3 Agrawal, A., & Choudhary, A. (2016). Perspective: Materials informatics and big data: Realization of the “fourth paradigm” of science in materials science. Apl Materials, 4(5), 053208. The 5th paradigm (2020-present) • Advanced deep learning • Massive data simulation • Powerful Foundation Models https://www.microsoft.com/en-us/research/blog/ai4science-to-empower-the-fifth-paradigm-of-scientific-discovery/
  • 4. Challenges Materials science: Materials discovery is very slow and extremely costly. Automated chemist: Chemical interaction and reaction prediction is key for advancing chemistry, but extremely challenging. 26/05/2023 4
  • 5. Materials discovery as smart search over in exponential space 5 #REF: Gómez-Bombarelli, Rafael, et al. "Automatic chemical design using a data-driven continuous representation of molecules." ACS Central Science (2016). Photo credit: wustl.edu Molecular search space: 1023 to 1060 | Knowledge-driven | AI-driven
  • 6. Space of innovation • Molecular space exploration • Small, medium, large, supra • Molecular interaction • Network, docking • Chemical reaction, retrosynthesis • Catalyst, yield, free-energy • Crystal space exploration • Alloy space exploration • Microstructures • Knowledge extraction, coding, expression, manipulation 27/05/2023 6 • Representation • Graphs, geometry, periodicity, token • Materials manifold • Learning, attention and memory • Self-supervised, supervised, reinforcement • Transfer, zero-shot, few-shot, adaptation learning • Learning to reason • Reasoning • Optimisation • Extrapolation, generation • Abductive, inductive, deductive reasoning Materials AI/ML Image: Shutterstock
  • 7. AI/ML Topics 26/05/2023 7 REPRESENTATION PREDICTION OPTIMIZATION & GENERALIZATION EXPLANATION
  • 8. Molecule → fingerprints 26/05/2023 8 #REF: Duvenaud, David K., et al. "Convolutional networks on graphs for learning molecular fingerprints." Advances in neural information processing systems. 2015. • Graph → vector. Mostly discrete. Substructures coded. • Vectors are easy to manipulate. Not easy to reconstruct the graphs from fingerprints. Kadurin, Artur, et al. "The cornucopia of meaningful leads: Applying deep adversarial autoencoders for new molecule development in oncology." Oncotarget 8.7 (2017): 10883.
  • 9. Source: wikipedia.org Molecule → string • SMILES = Simplified Molecular-Input Line- Entry System • Ready for encoding/decoding with sequential models (seq2seq, RL, Transformer). • BUT … • String → graphs is not unique! • Lots of string are invalid • Precise 3D information is lost • Short range in graph may become long range in string 26/05/2023 9 #REF: Gómez-Bombarelli, Rafael, et al. "Automatic chemical design using a data-driven continuous representation of molecules." arXiv preprint arXiv:1610.02415 (2016).
  • 10. 26/05/2023 10 Molecule → graphs • No regular, fixed-size structures • Graphs are permutation invariant: • #permutations are exponential function of #nodes • The probability of a generated graph G need to be marginalized over all possible permutations #REF: Pham, T., Tran, T., & Venkatesh, S. (2018). Relational dynamic memory networks. arXiv preprint arXiv:1808.04247. Input process Memory process Output process Controller process Message passing • Multiple objectives: • Diversity of generated graphs • Smoothness of latent space • Agreement with or optimization of multiple “drug-like” objectives
  • 11. Representing proteins • 1D sequence (vocab of size 20) – hundreds to thousands in length • 2D contact map – requires prediction • 3D structure – requires folding information, either observed or predicted. Now available thanks to AlphaFold 2. • NLP-inspired embedding (word2vec, doc2vec, glove, seq2vec, ELMo, BERT, GPT). 26/05/2023 11 #REF: Yang, K. K., Wu, Z., Bedbrook, C. N., & Arnold, F. H. (2018). Learned protein embeddings for machine learning. Bioinformatics, 34(15), 2642-2648.
  • 12. 26/05/2023 12 Crystal structure • Definition: • Crystal structure is the repeating arrangement in the 3D space of atoms throughout the crystal. • Crystal structure is presented by the arrangement of atoms within the unit cell. • The atom interacts with atoms within unit cell and adjacent unit cells. Crystal structure Ac₂AgIr Unit cell of crystal structure Ac₂AgIr Slide credit: Tri Nguyen
  • 13. 26/05/2023 13 Crystal structure representation • Crystal structure input: • Atom type • Atom coordinates • Periodic lattice • Multi-graph representation to model the periodic interaction Slide credit: Tri Nguyen
  • 14. Representing microstructures of crystal mixture 26/05/2023 14 Generate prior 𝛽 grains Add transformation phases Generate dual phase models Feature information Phase % Orientation Grain size Distance to triple point • Input information for each microstructure saved per voxel • Saved data considers local environment Volume domain: 106 voxels Slide credit: Sterjovski and Agius
  • 15. 26/05/2023 15 Topics REPRESENTATION PREDICTION OPTIMIZATION & GENERALIZATION EXPLANATION
  • 16. Molecular properties prediction • Traditional techniques: • Graph kernels (ML) • Molecular fingerprints (Chemistry) • Modern techniques • Molecule as graph: atoms as nodes, chemical bonds as edges 26/05/2023 16 #REF: Penmatsa, Aravind, Kevin H. Wang, and Eric Gouaux. "X-ray structure of dopamine transporter elucidates antidepressant mechanism." Nature 503.7474 (2013): 85-90.
  • 17. A graph processing machine for molecular property prediction 26/05/2023 17 #REF: Pham, T., Tran, T., & Venkatesh, S. (2018). Relational dynamic memory networks. arXiv preprint arXiv:1808.04247. Input process Memory process Output process Controller process Message passing Unrolling Controller Memory Graph Query Output Read Write
  • 18. Multi-target prediction 26/05/2023 18 Possible targets Molecular graph #REF: Do, Kien, et al. "Attentional Multilabel Learning over Graphs-A message passing approach." Machine Learning, 2019.
  • 19. Predict multiple properties 26/05/2023 19 #REF: Do, Kien, et al. "Attentional Multilabel Learning over Graphs-A message passing approach." Machine Learning, 2019.
  • 20. Chemical-chemical interaction via Relational Dynamic Memory Networks 26/05/2023 20 𝑴1 … 𝑴𝐶 𝒓𝑡 1 … 𝒓𝑡 𝐾 𝒓𝑡 ∗ Controller Write 𝒉𝑡 Memo ry Graph Query Output Read heads #REF: Pham, Trang, Truyen Tran, and Svetha Venkatesh. "Relational dynamic memory networks." arXiv preprint arXiv:1808.04247(2018).
  • 21. Drug and protein binding 26/05/2023 21 Drug molecule - Binds to protein binding site - Changes its target activity - Binding strength is the binding affinity Protein - May change its conformation due to interaction with drug molecule - Its function is altered due to the present of drug molecule at its binding site Image credit: Lancet Slide credit: Tri Nguyen
  • 22. 26/05/2023 22 GEFA: Drug-protein binding as graph-in- graph interaction Protein graph Drug graph A K L A T A Drug Graph-in-Graph interaction Nguyen, T. M., Nguyen, T., Le, T. M., & Tran, T. (2021). “GEFA: Early Fusion Approach in Drug-Target Affinity Prediction”. IEEE/ACM Transactions on Computational Biology and Bioinformatics Slide credit: Tri Nguyen
  • 23. 26/05/2023 23 GEFA (cont.) Nguyen, T. M., Nguyen, T., Le, T. M., & Tran, T. (2021). “GEFA: Early Fusion Approach in Drug-Target Affinity Prediction”. IEEE/ACM Transactions on Computational Biology and Bioinformatics Slide credit: Tri Nguyen
  • 24. Predicting stress-strain curve from crystal mixture • Transformer to leverage long-range dependencies between voxels • Input: Feature vectors per voxel. • Output: Strain curve per voxel. 26/05/2023 24
  • 25. 26/05/2023 25 Topics REPRESENTATION PREDICTION OPTIMIZATION & GENERALIZATION EXPLANATION
  • 26. Molecular generation • The molecular space is estimated to be 1e+23 to 1e+60 • Only 1e+8 substances synthesized thus far. • It is impossible to model this space fully. • The current technologies are not mature for graph generations. • But approximate techniques do exist. 26/05/2023 26 Source: pharmafactz.com
  • 27. Combinatorial chemistry • Generate variations on a template • Returns a list of molecules from this template that • Bind to the pocket with good pharmacodynamics? • Have good pharmacokinetics? • Are synthetically accessible? 26/05/2023 27 #REF: Talk by Chloé-Agathe Azencott titled “Machine learning for therapeutic research”, 12/10/2017
  • 28. 26/05/2023 28 Retrosynthesis prediction • Once a molecular structure is designed, how do we synthesize it? • Retrosynthesis planning/prediction • Identify a set of reactants to synthesize a target molecule • This is reverse of chemical reaction prediction Picture source: Tim Soderberg, “Retrosynthetic analysis and metabolic pathway prediction”, Organic Chemistry With a Biological Emphasis, 2016. URL: https://chem.libretexts.org/Courses/Oregon_Institute_of_Technology/OIT%3A_CHE_333_- _Organic_Chemistry_III_(Lund)/2%3A_Retrosynthetic_analysis_and_metabolic_pathway_prediction
  • 29. GTPN: Synthesis via reaction prediction as neural graph morphism • Input: A set of graphs = a single big graph with disconnected components • Output: A new set of graphs. Same nodes, different edges. • Model: Graph morphism • Method: Graph transformation policy network (GTPN) 26/05/2023 29 Kien Do, Truyen Tran, and Svetha Venkatesh. "Graph Transformation Policy Network for Chemical Reaction Prediction." KDD’19.
  • 30. 26/05/2023 30 Alloy design generation • Scientific innovations are expensive • One search per specific target • Availability of growing data Nguyen, P., Tran, T., Gupta, S., Rana, S., Barnett, M. and Venkatesh, S., 2019, May. Incomplete conditional density estimation for fast materials discovery. In Proceedings of the 2019 SIAM International Conference on Data Mining (pp. 549-557). Society for Industrial and Applied Mathematics.
  • 31. 26/05/2023 31 Inverse design • Leverage the existing data and query the simulators in an offline mode • Avoid the global optimization by learning the inverse design function f -1(y) • Predict design variables in a single step
  • 32. 26/05/2023 32 Incomplete conditional density estimation • Multimodal density estimation given incomplete conditions • However, integrating over h is still intractable, we approximate the expectation by a function evaluation at the mode
  • 33. 26/05/2023 33 Generated alloys example • Known-alloy dataset: 15,000 variations from 30 known series of Aluminum alloys • BO-search dataset: 15,000 variations from 1000 found alloys by Bayesian optimization • Input: phase diagram | Output: element composition
  • 34. 26/05/2023 34 Crystal structure generation • Application in structure discovery: battery, aerospace materials, etc. • The stability of a solid-state crystal structure is connected to its formation energy • Target: • Generate crystal-like structure • Has low formation energy • Diversity set of crystal structure candidates for active learning Slide credit: Tri Nguyen
  • 35. 26/05/2023 35 GFlownet • GFlownet learns to generate the composition object: • From the starting state, policy network output the probability distribution over building blocks • Select building blocks randomly based on the output probability distribution and create a new state → calculate the new probability distribution • Repeat until reaching the terminal state • Getting the reward from the environment (sparse reward) • The complete set of actions from starting state to terminal state is a trajectory • Flow is a non-negative function defined on the set of complete trajectory • GFlownet is trained by matching the flow going through state: in-flow = out-flow Slide credit: Tri Nguyen
  • 36. 26/05/2023 36 GFlownet • Advantage of GFlownet • Diverse set of candidates → avoid getting stuck in multi-modal distribution (e.g stability/energy landscape of crystal structure) • Can sample in proportion to a given reward function (crystal structure generation: formation energy) Slide credit: Tri Nguyen
  • 37. 26/05/2023 37 Crystal structure generation with GFlownet • State: • Multi-graph representation for structure: • Node: atoms • Node feature: element type, fraction coordinate • Edges: built using near-neighbor-based method CrystalNN with search cut-off starting from 13 and increasing to 20 • Edge feature: cell-direction vector ‘to_image’, bond distance • 3D grid space: currently occupied and available position to insert new atom • Action: • Available fraction coordinate on a 3D grid. • The chosen element Slide credit: Tri Nguyen
  • 38. 26/05/2023 38 GFlownet - Forward policy • Policy network: calculate the probability distribution over actions Examples of generated crystal structure Slide credit: Tri Nguyen
  • 39. 26/05/2023 39 Topics REPRESENTATION PREDICTION OPTIMIZATION & GENERALIZATION EXPLANATION
  • 40. Explaining DTA deep learning model: feature attribution 26/05/2023 Explainer Protein Drug Deep learning model Affinity = 6.8 Show the contribution of each part of input to the model decision Does not show the causal relationship between the input and the output of model 40 Slide credit: Tri Minh Nguyen
  • 41. 26/05/2023 41 Drug agent Protein agent Actions: Removing atoms, bonds Add atoms, bonds Actions: - Substituting one residue of other types with Alanine DTA model Drug-Target Environment Reward + Drug representation similarity + Protein representation similarity + Δ Affinity Communicate to form a common action MACDA: MultiAgent Counterfactual Drug-target Affinity framework Nguyen, T.M., Quinn, T.P., Nguyen, T. and Tran, T., 2022. Explaining Black Box Drug Target Prediction through Model Agnostic Counterfactual Samples. IEEE/ACM Transactions on Computational Biology and Bioinformatics. Slide credit: Tri Minh Nguyen
  • 42. 26/05/2023 42 The road ahead Image source: pobble365
  • 43. Grand framework • Two-step paradigm: • Step 1: Compress ALL materials knowledge into a giant model. • Data, context as episodic memory | Model weights as semantic memory. • Step 2: Decompress knowledge into something new. • This requires learning to reason – learn how to manipulate existing knowledge. • Search, plan are reasoning. Both aims to minimize an objective (e.g., matching or energy). 26/05/2023 43 Picture taken from (Bommasani et al, 2021) • Learning to reason (zero-shot) all. • Eying few-shot capability (e.g., materials prompting). • Leverage LLMs capability.
  • 44. Prediction versus understanding • We can predict well without understanding (e.g., planet/star motion Newton). • Guessing the God’s many complex behaviours versus knowing his few universal laws. • → Automated laws discovery! • → Abductive reasoning. 26/05/2023 44