1. TITLE OF PRESENTATION |
Presented By: CLARK Matthew
Date: August 2018
Exploring the use of conditional generative
adversarial networks (cGAN) to analyze chemical
reactions via electron density fields
Matthew CLARK
Frederik van den BROEK
ACS 256th Annual Meeting 2018 Boston MA
CINF 149
PAPER ID: 2967997
2. TITLE OF PRESENTATION |
Chemistry Deep Learning
• Published chemistry -> descriptors -> predictive system
• Other workers use a variety of descriptors
• SMILES strings
• Connectivity matrix/graphs
• These have strengths and weaknesses
• Chemistry is based on movement of electrons
• Bonds are subjective and not limited to those used in sketching programs
3. TITLE OF PRESENTATION | 3
Examples of problems with “normal” stick chemical structure drawing
Charge is a molecular attribute, not atomic
Representation of bond order is approximate
How will you draw this?
These are challenging
Solid state – catalysis/polymerization
Challenges
4. TITLE OF PRESENTATION |
Let's look at electrons for studying reactivity
• Electrons define chemistry
• Independent of the lines drawn for bond order
• Better representation of what is happening
Woodward, Fukui, and Hoffmann were pretty successful with this kind of analysis ☺
5. TITLE OF PRESENTATION | 5
Electron Density Computed with Extended Hückel Theory
Electron density represented by density values in a 3D regular field
These allow point-by-point comparisons among molecules
FORTICON8 – QCPE571 – ’77
PSI1EHT – Severance ’86
Visualized with PyMOL™, colored by density written as B-factors in a PDB file
Essentially the same FORTRAN program Hoffmann used
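To make the grid idea concrete, here is a minimal sketch of sampling a density on a regular 3D grid and writing each grid point as a PDB record with the density in the B-factor column. The `toy_density` function is a sum of Gaussians used purely as a stand-in; it is not Extended Hückel theory, and the atom name `X`, residue name `GRD`, and chain `A` are hypothetical choices rather than anything stated in the talk.

```python
import math

def toy_density(point, atoms):
    """Stand-in density: sum of spherical Gaussians (NOT an Extended Hueckel result)."""
    x, y, z = point
    total = 0.0
    for ax, ay, az, weight in atoms:
        r2 = (x - ax) ** 2 + (y - ay) ** 2 + (z - az) ** 2
        total += weight * math.exp(-r2)
    return total

def regular_grid(origin, spacing, n):
    """Return the points of an n x n x n regular grid starting at origin."""
    ox, oy, oz = origin
    return [
        (ox + i * spacing, oy + j * spacing, oz + k * spacing)
        for i in range(n) for j in range(n) for k in range(n)
    ]

def grid_to_pdb_lines(points, densities):
    """Format grid points as fixed-column HETATM records, density in columns 61-66."""
    template = "HETATM{:5d} {:<4s} {:>3s} {:1s}{:4d}    {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}"
    lines = []
    for serial, ((x, y, z), rho) in enumerate(zip(points, densities), start=1):
        # Hypothetical labels: atom "X", residue "GRD", chain "A", occupancy 1.0
        lines.append(template.format(serial, "X", "GRD", "A", 1, x, y, z, 1.0, rho))
    return lines

# One toy "atom" at the origin with weight 1.0
atoms = [(0.0, 0.0, 0.0, 1.0)]
points = regular_grid((-1.0, -1.0, -1.0), 1.0, 3)
densities = [toy_density(p, atoms) for p in points]
pdb_lines = grid_to_pdb_lines(points, densities)
```

Because every molecule is sampled on the same grid, two fields can be compared index by index, which is the point-by-point comparison the slide refers to. A viewer that colors by B-factor then shows the density directly.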
6. TITLE OF PRESENTATION | 6
Generative Adversarial Network – short explanation
Network generates proposed output
Training process
Discriminator network trains to distinguish good/bad output
“Real” output used for training
Final output
Adversarial feedback
Train on these density changes
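The adversarial feedback in the diagram can be sketched as two competing loss functions. This is a minimal illustration, assuming the discriminator emits a probability that an output is "real" and that the generator also carries an L1 term pulling its output toward the target field, as in the image-to-image translation work cited in this talk. The function names and the default `l1_weight` are illustrative assumptions; the talk does not state the settings actually used.

```python
import math

def bce(prob, label):
    """Binary cross-entropy for a single discriminator score in (0, 1)."""
    eps = 1e-12  # guards against log(0)
    return -(label * math.log(prob + eps) + (1 - label) * math.log(1 - prob + eps))

def discriminator_loss(score_real, score_fake):
    """The discriminator wants real outputs scored near 1 and generated ones near 0."""
    return bce(score_real, 1.0) + bce(score_fake, 0.0)

def generator_loss(score_fake, generated, target, l1_weight=100.0):
    """The generator wants its output scored as real and close to the target field."""
    l1 = sum(abs(g - t) for g, t in zip(generated, target)) / len(target)
    return bce(score_fake, 1.0) + l1_weight * l1

# A discriminator that separates real from generated output has a lower loss
confident = discriminator_loss(score_real=0.9, score_fake=0.1)
undecided = discriminator_loss(score_real=0.5, score_fake=0.5)
```

Training alternates between lowering `discriminator_loss` and lowering `generator_loss`, so each network's progress raises the bar for the other.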
7. TITLE OF PRESENTATION | 7
Training and output
Predicted output
Training transformation
Input for prediction
8. TITLE OF PRESENTATION | 8
Other applications of generative adversarial networks
Image-to-Image Translation with Conditional Adversarial Networks. Isola, Zhu, Zhou, Efros. arXiv, Nov 2017. https://arxiv.org/pdf/1611.07004v2.pdf
• Used to colorize photos and movies
• Good for ‘high dimensionality’ problems
10. TITLE OF PRESENTATION | 10
Subsampling finds local electron density patterns
Sampling subregion over whole region
• In movie colorization this is how it correctly renders a hand regardless of position in the frame
• In chemistry it can create models regardless of the orientation or exact location of a functional group in the molecule
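The windowed sampling described above can be illustrated with a short sketch that slides a small cube over a larger density grid. This is a simplified picture, assuming a cubic field stored as nested lists; convolutional networks perform the equivalent sampling implicitly, and the `size` and `stride` values here are arbitrary.

```python
def iter_subregions(field, size, stride):
    """Yield (origin, block) for each size^3 sub-block of a cubic nested-list field."""
    n = len(field)
    for i in range(0, n - size + 1, stride):
        for j in range(0, n - size + 1, stride):
            for k in range(0, n - size + 1, stride):
                block = [
                    [row[k:k + size] for row in plane[j:j + size]]
                    for plane in field[i:i + size]
                ]
                yield (i, j, k), block

# Empty 4 x 4 x 4 field with the same 2 x 2 x 2 motif placed at two locations
field = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
for base in (0, 2):
    for di in range(2):
        for dj in range(2):
            for dk in range(2):
                field[base + di][base + dj][base + dk] = 1.0 + di + dj + dk

blocks = dict(iter_subregions(field, size=2, stride=1))
```

The blocks at origins `(0, 0, 0)` and `(2, 2, 2)` are identical, so a model that learns from sub-blocks sees the same local pattern wherever the motif sits in the grid.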
11. TITLE OF PRESENTATION | 11
Processing reactions to produce training set
A -> B transformations
3,278 reactions that met molecular weight (MW) and yield criteria
KNIME workflow used to process RD files into reactants and products for training
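The selection step might look like the sketch below. The talk used a KNIME workflow and does not give the actual cutoffs, so the `Reaction` record, its field names, and the `mw_limit` and `min_yield` defaults are all hypothetical placeholders.

```python
from dataclasses import dataclass

@dataclass
class Reaction:
    reactants: list        # structure identifiers (hypothetical representation)
    products: list
    max_mol_weight: float  # heaviest species in the reaction
    yield_percent: float

def select_reactions(reactions, mw_limit=500.0, min_yield=50.0):
    """Keep A -> B reactions (one reactant, one product) within the given limits."""
    return [
        r for r in reactions
        if len(r.reactants) == 1 and len(r.products) == 1
        and r.max_mol_weight <= mw_limit
        and r.yield_percent >= min_yield
    ]

candidates = [
    Reaction(["A1"], ["B1"], 320.0, 85.0),        # kept
    Reaction(["A2"], ["B2"], 910.0, 90.0),        # too heavy
    Reaction(["A3"], ["B3"], 250.0, 12.0),        # yield too low
    Reaction(["A4", "A5"], ["B4"], 200.0, 95.0),  # A + B -> C, not A -> B
]
selected = select_reactions(candidates)
```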
12. TITLE OF PRESENTATION | 12
High level process description
Training: Select reactions -> Compute EHMO (reactant/product) -> Convert to 3D ED field -> Train deep learning
Prediction: Select compound -> Compute EHMO -> Convert to 3D ED field -> Predict using DL model
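The two flows share the same featurization step, which the following sketch makes explicit. It assumes a placeholder `stub_density` in place of the EHMO calculation and a caller-supplied `model_fn` in place of the trained network; both names are hypothetical and only the shape of the pipeline is meant to carry over.

```python
def stub_density(point, atoms):
    """Placeholder: count atoms within 1.0 length unit of the point (NOT an EHMO result)."""
    return float(sum(
        1 for atom in atoms
        if sum((p - q) ** 2 for p, q in zip(point, atom)) <= 1.0
    ))

def to_field(atoms, grid_points, density_fn=stub_density):
    """Convert a structure to a flat list of density values on a shared grid."""
    return [density_fn(point, atoms) for point in grid_points]

def build_training_pairs(reactions, grid_points, density_fn=stub_density):
    """Training flow: one (reactant field, product field) pair per A -> B reaction."""
    return [
        (to_field(reactant, grid_points, density_fn),
         to_field(product, grid_points, density_fn))
        for reactant, product in reactions
    ]

def predict(model_fn, atoms, grid_points, density_fn=stub_density):
    """Prediction flow: featurize the compound the same way, then apply the model."""
    return model_fn(to_field(atoms, grid_points, density_fn))

grid = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
reactions = [([(0.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)])]  # toy "atom moves" reaction
pairs = build_training_pairs(reactions, grid)
# Stand-in "model" that simply reverses the field
predicted = predict(lambda field: field[::-1], [(0.0, 0.0, 0.0)], grid)
```

Keeping `to_field` common to both flows means a compound submitted for prediction is represented exactly as the training reactants were.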
13. TITLE OF PRESENTATION |
Experiments with basic cheminformatics
• If trained on rotations, can produce a ‘canonical form’
• Insensitive to rotations
• Can be trained to deconvolute conformations
• This should be explored more; it is a useful capability
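One way to read the rotation experiment is that rotated copies of a structure serve as inputs while a single reference orientation serves as the target. The sketch below builds such pairs, assuming rotation about the z axis only and atom coordinates rather than density fields for brevity; the talk does not describe how its rotation training set was constructed.

```python
import math

def rotate_z(coords, angle_deg):
    """Rotate (x, y, z) coordinates about the z axis by angle_deg degrees."""
    angle = math.radians(angle_deg)
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]

def rotation_training_pairs(coords, angles_deg):
    """Pair each rotated copy (input) with the unrotated reference (target)."""
    return [(rotate_z(coords, angle), coords) for angle in angles_deg]

reference = [(1.0, 0.0, 0.5), (0.0, 0.0, 0.0)]
pairs = rotation_training_pairs(reference, angles_deg=[0, 90, 180, 270])
```

Every pair has the same target, so a model trained on them is pushed to map any orientation back to the reference one.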
14. TITLE OF PRESENTATION | 14
Training Examples
Tungsten complex chemistry – not addressable by most methods
16. TITLE OF PRESENTATION | 16
Reactants to Products Training
• Training to answer: what can this molecule become?
• Look at metabolites
• Transformations that could happen to a given starting material
17. TITLE OF PRESENTATION | 17
Products to Reactants Training
• Training to answer: what molecule could become this molecule?
• Useful for retrosynthesis
18. TITLE OF PRESENTATION |
Learnings
(Slide section labels: Good, Challenges, Next Step)
• The method appears to have promise
• Must interpret electron density to understand compounds
• Similar to crystallography!
• Output can be indistinct when several possibilities exist
• Koes group (Pitt) has some interesting solutions
• Gives insight into transformations
• Where transformations are likely to take place in a molecule
• Scoring results – how?
• Many to one relationship is not represented well
• System predicts only one output from an input structure
• Should include the nuclei!
• Difficult to distinguish among electronegative elements.
• Select a better test/reaction selection to demonstrate efficacy
• Autoencoding input
• Reproduce Woodward-Hoffmann rules with AI?
• Work with A + B -> C in addition to A -> B
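The scoring question is left open in the talk. As one possible starting point, and only as an assumption, predicted and reference density fields sampled on the same grid could be compared with simple point-by-point measures such as the two below.

```python
import math

def rmsd(field_a, field_b):
    """Root-mean-square difference between two fields on the same grid."""
    if len(field_a) != len(field_b):
        raise ValueError("fields must be sampled on the same grid")
    total = sum((a - b) ** 2 for a, b in zip(field_a, field_b))
    return math.sqrt(total / len(field_a))

def cosine_similarity(field_a, field_b):
    """Cosine of the angle between two fields treated as vectors (0.0 if either is all zero)."""
    dot = sum(a * b for a, b in zip(field_a, field_b))
    norm_a = math.sqrt(sum(a * a for a in field_a))
    norm_b = math.sqrt(sum(b * b for b in field_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

predicted = [0.0, 0.0]
reference = [3.0, 4.0]
error = rmsd(predicted, reference)
```

Neither measure addresses the case where several products are plausible, since each compares a prediction against a single reference field.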
19. TITLE OF PRESENTATION |
• Elsevier Machine Learning Services Team
• Frederik “Flying Dutchman” van den Broek
• Markus “Blockchain” Bussen
• Sally “Renzo” Makady
• Jabe “The Machine” Wilson
• Anton “Pathway” Yuryev
• Maria “Animal” Shkrob
• Tom “Running Man” Woodcock
• Tim “Hardcore” Hoctor
• You?
• Recruiting now in Boston/Cambridge area!