MOLECULAR
DYNAMICS
MACHINE
LEARNING
QUANTUM
MECHANICS
Does Scientific Research Need
Machine Learning?
U. Deva Priyakumar
Center for Computational Natural Sciences and Bioinformatics
International Institute of Information Technology, Hyderabad
devalab.org
Figure 10: (a) Radial distribution function obtained for Au/Pd-O(water) for different
concentrations of aqueous EP (given in percentage in the insets) for Au10Pd10. (b) High
energy water molecules along with EP present at the surface of Au10Pd10.
The above sections illustrate that irrespective of the high affinity between NPs
and EP compared to that between NPs and water, a few water molecules are found in the
first adsorption layer. The nature of Au/Pd-water interactions was further examined by
calculating the radial distribution functions corresponding to the Au/Pd atoms with the
oxygen atoms of water (Figures 10a and S29). At lower concentrations of aqueous EP
and in pure water, the distribution functions exhibit clear peaks corresponding to the first
layer of adsorbed water and the second solvation shell. The presence of distinct peaks for
the second solvation shell in 0.0 and 0.87 % aqueous EP solutions is demonstrative of
Page 22 of 37, The Journal of Physical Chemistry
Research Areas
Metal NP Growth/Dynamics
Machine Learning
(D)RNA Dynamics
Protein Folding
Membrane Proteins
THEORY
EXPERIMENT
COMPUTATION
Observe
Understand
Predict
REAL LIFE
APPLICATIONS
Computation: Program + Input -> Computer -> Output
Machine Learning: Output + Input -> Computer -> Program
Computation: Program (Gaussian 16) + Input (Cartesian coordinates) -> Computer -> Output (Property)
Machine Learning: Output (Property) + Input (Cartesian coordinates) -> Computer -> Program (ML model)
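The contrast on this slide can be sketched in a few lines of Python. This is only a toy illustration: the linear rule in `program` is a made-up stand-in for an expensive calculation such as a Gaussian 16 run, and `learn_program` is a hypothetical name for an ordinary least-squares line fit, not anything from the talk.

```python
def program(x):
    """'Computation': a hand-written rule maps the input to the output."""
    return 2.0 * x + 1.0  # made-up rule standing in for a real calculation


def learn_program(inputs, outputs):
    """'Machine learning': recover a rule from input/output pairs.

    Uses the closed-form least-squares fit of a straight line.
    """
    n = len(inputs)
    mean_x = sum(inputs) / n
    mean_y = sum(outputs) / n
    sxx = sum((x - mean_x) ** 2 for x in inputs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(inputs, outputs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


# Computation: program + input -> output
inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
outputs = [program(x) for x in inputs]

# Machine learning: input + output -> program
learned = learn_program(inputs, outputs)
print(learned(10.0), program(10.0))
```

The learned function is only as good as the data it saw; here the data are noiseless and the model family matches the rule, which real property prediction cannot assume.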
Machine Learning: Supervised, Semi-supervised, Unsupervised
THEORY
EXPERIMENT
COMPUTATION
REAL LIFE
APPLICATIONS
COMPUTERS AND BIOMEDICAL RESEARCH 6, 411-421 (1973)
Cybernetic Methods of Drug Design.
I. Statement of the Problem - The Perceptron Approach
S. A. HILLER, V. E. GOLENDER, AND A. B. ROSENBLIT
The Institute of Organic Synthesis of the Academy of Sciences of the Latvian SSR,
Riga 6, Aizkraukles 21, U.S.S.R.
AND
L. A. RASTRIGIN AND A. B. GLAZ
The Institute of Electronics and Computing Technology of the Academy of Sciences
of the Latvian SSR, Riga, 6 Akademiyas 14, U.S.S.R.
Received October 12, 1972
It is revealed that the problem of drug design which is at present coped with on a
semi-intuitive basis may be interpreted in terms of modern pattern recognition theory as a
problem of discriminating two classes of objects: the active and the inactive chemical
compounds.
In the meantime two questions are essentially important: (1) the presentation of
information on the structure of a chemical compound, i.e., the elaboration of terms for
adequately describing the structure and (2) the selection of a recognition algorithm.
This paper deals with the perceptron approach to the resolution of the problem. The
structure is, therefore, presented as a sequence of certain coded functional groups and is
projected onto the perceptron retina. The error correction procedure with adaptation of
S-A connections is employed for classification.
The perceptron approach limitations are examined.
INTRODUCTION
The process of drug design is accompanied by a significant work on the synthesis
and pharmacological examination of a great number of compounds before a
substance can be obtained which is found to possess all the necessary physiological
properties. This is caused by the fact that at the present stage of development of
pharmacological chemistry there exists no general theory which ties the structure of
substances with their physiological activity. Nonetheless a number of general aspects
uch a high error probability the recognition system appears to be suffi-
ctive.
studied a number of approaches to resolving the problem of predicting
l activity of chemical compounds. These approaches are peculiar in the
structure representation and by the pattern recognition algorithms.
deals with the perceptron approach (7).
THE PERCEPTRON APPROACH
nition system we employed a three-layer perceptron-network, which
S-, A-, and R-units illustrated in Fig. 1 (8).
nits which receive information from the environment may be either
ut signal equal to 1) or inactive (output signal equal to 0).
solving prediction problems of pharmacological activity of chemical
the S-units of the perceptron form a receptor field it x n onto which is
FIG. 1. A three-layer perceptron.
algorithm (error correction procedure). After that the training set is presented for
testing, and a quality function (number of incorrect answers of the perceptron) is
determined for the given configuration of S-A connections. Besides this, the number
of correct answers is determined for each A-unit. If the value of the quality function
happens to be greater than the predetermined value, the S-A connections for
A-units whose number of correct answers is lower than a certain threshold are
readjusted at random. Then the error correction procedure is repeated, and in case of
necessity another random search step is made.
The search is continued until the value of the quality function becomes equal to or
lower than the preset one or until the number of correct answers of all the A-units
exceeds the threshold. After this a test is made according to a testing set.
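The training loop described above (error correction on the output weights, then random re-drawing of weak S-A connections) might be sketched as follows. This is a minimal reconstruction under stated assumptions, not the authors' program: the function names, the number of A-units, the three excitatory and two inhibitory connections per A-unit, and in particular the way a single A-unit's "correct answers" are counted are all hypothetical choices made for illustration. Only the A-unit threshold of 1 comes from the excerpt.

```python
import random


def random_unit(n_inputs, n_exc, n_inh, rng):
    """Draw random S-A connections for one A-unit (+1 excitatory, -1 inhibitory)."""
    chosen = rng.sample(range(n_inputs), n_exc + n_inh)
    return [(i, 1) for i in chosen[:n_exc]] + [(i, -1) for i in chosen[n_exc:]]


def a_layer(x, units, threshold=1):
    """An A-unit fires (1) when its signed input sum reaches the threshold."""
    return [1 if sum(s * x[i] for i, s in u) >= threshold else 0 for u in units]


def predict(x, units, w, b):
    """R-unit: sign of the weighted sum of A-unit outputs."""
    a = a_layer(x, units)
    return 1 if sum(wi * ai for wi, ai in zip(w, a)) + b > 0 else -1


def error_correction(X, y, units, epochs=50):
    """Perceptron rule on the A-R weights; S-A connections stay fixed."""
    w, b = [0.0] * len(units), 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, t in zip(X, y):
            if predict(x, units, w, b) != t:
                a = a_layer(x, units)
                w = [wi + t * ai for wi, ai in zip(w, a)]
                b += t
                mistakes += 1
        if mistakes == 0:
            break
    return w, b


def fit(X, y, n_units=12, n_exc=3, n_inh=2, max_errors=0,
        min_unit_correct=None, max_rounds=30, seed=0):
    """Alternate error correction with random search over S-A connections."""
    rng = random.Random(seed)
    n_inputs = len(X[0])
    if min_unit_correct is None:
        min_unit_correct = len(X) // 2  # assumed per-unit threshold
    units = [random_unit(n_inputs, n_exc, n_inh, rng) for _ in range(n_units)]
    for round_no in range(max_rounds):
        w, b = error_correction(X, y, units)
        # Quality function: number of incorrect answers on the training set.
        errors = sum(predict(x, units, w, b) != t for x, t in zip(X, y))
        if errors <= max_errors or round_no == max_rounds - 1:
            break
        # Assumption: a unit "answers" with the sign of its output weight
        # when it fires, and with the opposite sign when it does not.
        weak = []
        for k, u in enumerate(units):
            vote = 1 if w[k] >= 0 else -1
            correct = sum((vote if a_layer(x, [u])[0] else -vote) == t
                          for x, t in zip(X, y))
            if correct < min_unit_correct:
                weak.append(k)
        if not weak:  # every A-unit passes the threshold: stop searching
            break
        for k in weak:
            units[k] = random_unit(n_inputs, n_exc, n_inh, rng)
    return units, w, b, errors
```

A call such as `fit(X, y)` with binary input vectors `X` and labels `y` in {+1, -1} returns the final connections, the output weights, and the remaining training error count; a held-out testing set would then be scored with `predict`.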
EXPERIMENT
The possibility of recognizing pharmacological activity of a substance by the
molecular structure was investigated on a series of alkyl- and alkoxyalkyl-substituted
1,3-dioxanes (9) which are presented in Table 1. These chemical compounds are
represented by structural formula
R1\   /O-CH2\   /R3
    C         C
R2/   \O-CH2/   \H
and may exist as cis- and trans-isomers.
TABLE 1
ANTICONVULSION ACTIVITY OF 1,3-DIOXANES
Columns: No., R1, R2, R3, Isomer, Activity (antagonism to corasol)
[Rows 1-11 are visible on the slide; the substituent entries (R1, R2, R3) are not recoverable from the extracted text.]
Isomer, in extracted order: trans, cis, trans, cis, trans, cis, trans, cis, trans, cis, trans
Activity, in extracted order: +1, +1, -1, -1, +1, -1, +1, +1, +1, +1, +1
retina and two for inhibitory. The initial configuration of S-A connections was
selected at random. The threshold of each A-unit was assumed to be equal to 1.
The perceptron was adapted according to the algorithm described earlier.
One part of the 46 compounds listed in Table 1 was selected for the training set
and included representatives of all the four previously mentioned groups of
compounds. The rest were used only for testing.
Three various learning sets containing n1 = 22, n2 = 24, and n3 = 26 objects were
selected. For each of these we selected one threshold value, q, and conducted ten
independent experiments. The average results and confidence intervals which
correspond to 0.95 confidence probability are illustrated in Table 3. The results obtained
lead us to believe that the cybernetic approach to the drug design problem is quite
perspective.
TABLE 3
RESULTS OF EXPERIMENT
               Learning                        Test
Learning set   Reliability of   Confidence    Reliability of   Confidence
               recognition      interval      recognition      interval
n1 = 22        86               7             68               10
n2 = 24        89               6             71               13
n3 = 26        85               5             76               9
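The excerpt reports averages and 0.95 confidence intervals over ten independent experiments but does not say how the intervals were computed. One common way, shown here purely as an assumption, is the Student t interval for the mean. The ten scores below are invented for illustration and are not data from the paper; `mean_and_half_width` is a hypothetical helper name.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t for 9 degrees of freedom
# (ten experiments). Valid only for samples of exactly ten values.
T_CRIT_95_DF9 = 2.262


def mean_and_half_width(values):
    """Return the sample mean and the half-width of its 95% t interval."""
    if len(values) != 10:
        raise ValueError("the hard-coded critical value assumes ten experiments")
    mean = statistics.mean(values)
    std_err = statistics.stdev(values) / math.sqrt(len(values))
    return mean, T_CRIT_95_DF9 * std_err


# Hypothetical recognition scores from ten runs (not from the paper).
scores = [62, 70, 75, 66, 71, 68, 73, 64, 69, 72]
mean, half_width = mean_and_half_width(scores)
print(f"{mean:.0f} +/- {half_width:.1f}")
```

With only ten runs per setting, the interval widths in a table like this are themselves rough, which is worth keeping in mind when comparing the three learning-set sizes.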
At the same time the perceptron approach is not completely free from a number of
drawbacks and limitations. The most significant obstacle in the way of wide-spread
employment of the perceptron approach is the difficulty of invariant structure
presentation on the retina of the perceptron. Furthermore, the process of adapting
S-A connections by method of random search demands much computer time.
It should be noted, nevertheless, that the efficiency of this approach depends
significantly on the adequacy of the terms employed for structure description (in
terms of the perceptron approach, the method of structure presentation on the
perceptron retina).
A number of later papers will be dedicated to the discussion of a range of
algorithms in which an attempt is made to overcome the drawbacks of the perceptron
approach.
[Figure: number of publications per two-year period, 1991-92 through 2017-18 (vertical axis 0 to 1200). ACS Journals; search: “machine learning” anywhere in the article.]
Protein Folding
Fold a Protein
Won CASP competition - predicted 25/43
Runner up - predicted 3/43
Disclaimer: accuracy overestimated in favor of DeepMind
DeepMind (Google)
BAND NN
Bonds, Angles, Nonbonds, Dihedrals
Artificial Neural Network
ML-FF!!!
(a) Atom identifier
H: 1 0 0 0
C: 0 1 0 0
O: 0 0 1 0
N: 0 0 0 1
Atoms: H1, C2, O3, N4, H5, H6
(b) Atom identifier and atom typing
      Atom name   Atom type
H1    1 0 0 0     0 1 0 0
N4    0 0 0 1     2 1 0 0
(c) Feature vectors of bonds, angles, nonbonds and dihedrals
Bond: atom p, atom q, bpq (17 dimensions)
Angle: atom p, atom q, atom r, apqr, bpq, bqr (27 dimensions)
Dihedral: atom p, atom q, atom r, atom s, dpqrs, apqr, aqrs, bpq, bqr, brs (38 dimensions)
Nonbond: atom p, atom q, npq (17 dimensions)
Atoms: H1, C2, O3, N4, H5, H6
Bonds: H1-C2, C2-O3, C2-N4, N4-H5, N4-H6
Angles: H1-C2-O3, H1-C2-N4, N4-C2-O3, H5-N4-C2, H6-N4-C2, H5-N4-H6
Dihedrals: H1-C2-N4-H5, H1-C2-N4-H6, H5-N4-C2-O3, H6-N4-C2-O3
Nonbonds: H1…O3, H1…N4, O3…N4, C2…H6, C2…H5, H5…H6, H1…H6, H1…H5, H5…O3, H6…O3
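The decomposition on this slide can be reproduced from the bond list alone. The sketch below assumes H1 is bonded to C2 (as the angle and nonbond lists imply) and reads the "atom type" block as a count of bonded neighbours per element, which matches the two examples shown for H1 and N4 but is an inference rather than something the slide states. All function and variable names are hypothetical.

```python
from itertools import combinations

ELEMENTS = ["H", "C", "O", "N"]
atoms = {1: "H", 2: "C", 3: "O", 4: "N", 5: "H", 6: "H"}
bonds = [(1, 2), (2, 3), (2, 4), (4, 5), (4, 6)]

# Adjacency list built from the bond list.
neighbors = {i: [] for i in atoms}
for p, q in bonds:
    neighbors[p].append(q)
    neighbors[q].append(p)


def atom_feature(i):
    """One-hot element identifier followed by neighbour counts per element."""
    one_hot = [1 if atoms[i] == el else 0 for el in ELEMENTS]
    typing = [sum(1 for j in neighbors[i] if atoms[j] == el) for el in ELEMENTS]
    return one_hot + typing  # 8 numbers per atom


def bond_feature(p, q, length):
    """Two atom descriptors plus the bond length: 8 + 8 + 1 = 17 dimensions."""
    return atom_feature(p) + atom_feature(q) + [length]


# Angles: every pair of neighbours around a central atom.
angles = [(p, c, r) for c in atoms
          for p, r in combinations(sorted(neighbors[c]), 2)]

# Dihedrals: one extra neighbour on each side of a central bond.
dihedrals = [(p, q, r, s) for q, r in bonds
             for p in neighbors[q] if p != r
             for s in neighbors[r] if s != q and s != p]

# Nonbonds: every atom pair that is not directly bonded.
bonded = {tuple(sorted(b)) for b in bonds}
nonbonds = [pair for pair in combinations(sorted(atoms), 2) if pair not in bonded]

print(len(bonds), len(angles), len(dihedrals), len(nonbonds))
print(atom_feature(4))
```

The bond length passed to `bond_feature` would come from the Cartesian coordinates; any value used when trying the sketch is a placeholder.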
Cartesian coords.: r1, r2, …, rN; Z1, Z2, …, ZN
Eb, Ea, Ed, En
E = Eb + Ea + Ed + En
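The additive form E = Eb + Ea + Ed + En suggests one model per term type, each applied to every term of that type, with the results summed. The following is a sketch of that bookkeeping only: the `models` entries are constant stand-ins for trained networks, and `total_energy` is a hypothetical name, so the numbers it produces carry no chemical meaning.

```python
KINDS = ("bond", "angle", "dihedral", "nonbond")


def total_energy(terms, models):
    """Sum per-term contributions: E = Eb + Ea + Ed + En.

    terms  -- list of (kind, feature_vector) pairs
    models -- dict mapping each kind to a callable returning an energy
    """
    parts = {kind: 0.0 for kind in KINDS}
    for kind, features in terms:
        parts[kind] += models[kind](features)
    return sum(parts.values()), parts


# Stand-in models: each term contributes 1.0 regardless of its features.
models = {kind: (lambda features: 1.0) for kind in KINDS}

# Feature vectors of the sizes listed on the slide (contents are placeholders).
terms = [
    ("bond", [0.0] * 17),
    ("bond", [0.0] * 17),
    ("angle", [0.0] * 27),
    ("dihedral", [0.0] * 38),
    ("nonbond", [0.0] * 17),
]
energy, parts = total_energy(terms, models)
print(energy, parts)
```

Keeping the four partial sums separate makes it easy to inspect which term type dominates a given prediction.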
Dataset
• Subset of ANI-1 dataset 

• 57,462 molecules (all possible molecules with up to 8
C/N/O/H atoms)

• Normal mode sampling for higher energy states
(~22,000,000)

• DFT (ωB97X/6-31G(d)) energies

• This study: ~7.6 million points (< 30 kcal/mol) -
80-10-10% for training-validation-test sets.
Smith et al. Chem. Sci., 2017, 8, 3192
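An 80-10-10 split like the one quoted above can be sketched as a shuffled partition of indices. How the split was actually drawn in the study (for example, by conformer or by molecule) is not stated on the slide, so the plain random shuffle here is an assumption, and `split_indices` is a hypothetical helper.

```python
import random


def split_indices(n, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle indices 0..n-1 and cut them into train/validation/test lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(fractions[0] * n)
    n_val = round(fractions[1] * n)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


train, val, test = split_indices(1000)
print(len(train), len(val), len(test))
```

When a dataset contains many conformers of the same molecule, splitting by molecule rather than by data point is a stricter test of generalization.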
Accuracy of ML-BAND
[Figure: histograms of frequency (%) against absolute error (AE, kcal/mol, in 1 kcal/mol bins). (a) Test set (~700k molecules); (b) GDB10 (~1.5k molecules).]
Structural/Geometric Isomers
[Figure: relative energy (kcal/mol) of C11H22 isomers, comparing DFT, ML-BAND, and AM1.]
C–C Bond Stretching
[Figure: two panels of energy (kcal/mol) against C–C length (Å), comparing DFT and ML-BAND.]
C–N Bond Stretching
[Figure: two panels of energy (kcal/mol) against C–N length (Å), comparing DFT and ML-BAND.]
C–C–C Bond Bending
[Figure: two panels of energy (kcal/mol) against C–C–C angle (degrees), comparing DFT and ML-BAND.]
Fentanyl
Methamphetamine
Reaction Energies
[Figure: reaction energy (kcal/mol) from DFT, ML-BAND, and AM1 for six reactions: intramolecular H-bond, hydrogenation, Diels-Alder, aldol condensation, esterification, and rearrangement.]
Geometry Optimization with ML-BAND
[Figure: atomization energy (kcal/mol) against optimization step number for octane and methylpropenol.]
Molecular Design
Direct (Expt/Comput.): Molecule -> Property
Inverse (Generative ML): Property -> Molecule
Sanchez-Lengeling & Aspuru-Guzik, Science, 2018, 361, 360.
Deep INorganic Material Generator
Dataset
npj Computational Materials 1, 15010 (2015)
ICSD (FIZ Karlsruhe) 161k
– duplicates, incomplete,… 44k
< 35 atoms in unit cell 30k
+ elemental, binary, ternary,… 563k
+ > 10 atoms of same element
– incomplete data
– all but one of the structures with the same composition
272k
Feature Vector
Example: Al2O3
      0  1  2  3  4  5  6  7  8  9  10
H     1  0  0  0  0  0  0  0  0  0  0
He    1  0  0  0  0  0  0  0  0  0  0
Li    1  0  0  0  0  0  0  0  0  0  0
Be    1  0  0  0  0  0  0  0  0  0  0
…
O     0  0  0  1  0  0  0  0  0  0  0
…
Al    0  0  1  0  0  0  0  0  0  0  0
…
Pu    1  0  0  0  0  0  0  0  0  0  0
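The table above reads as one row per element with a one-hot encoding of how many atoms of that element appear in the formula (column 0 meaning the element is absent). A small sketch under that reading follows; the short `ELEMENTS` list is an illustrative subset rather than the full H-to-Pu range on the slide, and `composition_matrix` is a hypothetical name.

```python
# Illustrative subset of elements; the slide's table runs from H to Pu.
ELEMENTS = ["H", "He", "Li", "Be", "O", "Al", "Pu"]
MAX_COUNT = 10  # columns 0..10


def composition_matrix(composition):
    """One row per element, one-hot over the atom count in the formula."""
    rows = []
    for element in ELEMENTS:
        count = composition.get(element, 0)
        if not 0 <= count <= MAX_COUNT:
            raise ValueError(f"count for {element} outside 0..{MAX_COUNT}")
        row = [0] * (MAX_COUNT + 1)
        row[count] = 1
        rows.append(row)
    return rows


matrix = composition_matrix({"Al": 2, "O": 3})
for element, row in zip(ELEMENTS, matrix):
    print(f"{element:<3}", *row)
```

Elements absent from the formula get a 1 in column 0, which is why most rows in the slide's table look identical.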
Predictor Models
Performance
• Deep neural network with 7 hidden
layers.
• ReLU activation function for all
layers except the last.
• Linear activation for the last layer.
MAE = 0.051 eV/atom MAE = 0.068 eV/atom
MAE = 0.38 Å³
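The architecture listed on this slide (seven hidden layers with ReLU, linear output) can be written as a plain forward pass. This is a shape-only sketch with random, untrained weights: the hidden width of 16, the initialization, and the function names are assumptions, since the slide gives none of them.

```python
import random


def relu(values):
    return [max(0.0, v) for v in values]


def dense(values, weights, biases):
    """Affine layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, biases)]


def init_network(n_in, hidden_width=16, n_hidden=7, n_out=1, seed=0):
    """Random weights for n_hidden hidden layers plus one output layer."""
    rng = random.Random(seed)
    sizes = [n_in] + [hidden_width] * n_hidden + [n_out]
    layers = []
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        scale = (2.0 / fan_in) ** 0.5
        weights = [[rng.gauss(0.0, scale) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        layers.append((weights, [0.0] * fan_out))
    return layers


def forward(layers, x):
    """ReLU after every layer except the last, which stays linear."""
    hidden = list(x)
    for weights, biases in layers[:-1]:
        hidden = relu(dense(hidden, weights, biases))
    weights, biases = layers[-1]
    return dense(hidden, weights, biases)


layers = init_network(n_in=5)
prediction = forward(layers, [1.0, 0.0, 0.0, 1.0, 0.0])
print(len(layers), prediction)
```

The linear last layer is what lets the network output unbounded regression targets such as energies per atom or volumes.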
DING
Conditional Variational Autoencoder
Input -> encoder -> Code -> decoder -> Output
Input = Output
DING Architecture
Latent Space Continuity
DING
New Material Generation
Predictions Within the Dataset
Unbiased
Biased
Predictions Outside the Dataset
Unreasonable Requests!
THEORY
EXPERIMENT
COMPUTATION
Observe
Understand
Predict
REAL LIFE
APPLICATIONS

Event 32

  • 2. Does Scientific Research Need Machine Learning? U. Deva Priyakumar Center for Computational Natural Sciences and Bioinformatics International Institute of Information Technology, Hyderabad devalab.org
  • 3. Figure 10: (a) Radial distribution function obtained for Au/Pd-Owater for different concentrations of aqueous EP (given in percentage in the insets) for Au10Pd10. (b) High energy water molecules along with EP present at the surface of Au10Pd10. The above sections illustrate that irrespective of the high affinity between NPs and EP compared to that between NPs and water, few water molecules are found in the first adsorption layer. The nature of Au/Pd-water interactions were further examined by calculating the radial distribution functions corresponding to the Au/Pd atoms with the oxygen atoms of water (Figures 10a and S29). At lower concentrations of aqueous EP and in pure water, the distribution functions exhibit clear peaks corresponding to first layer of adsorbed water and second solvation shell. The presence of distinct peaks for the second solvation shell in 0.0 and 0.87 % aqueous EP solutions is demonstrative of Page 22 of 37The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 Metal NP Growth/ Dynamics Research AreasMachine Learning (D)RNA Dynamics Protein Folding Membrane Proteins devalab.org
  • 8. COMPUTERS AND BIOMEDICAL RESEARCH 6, 411-421 (1973) Cybernetic Methods of Drug Design. I. Statement of the Problem-The Perceptron Approach S. A. HILLER, V. E. GOLENDER, AND A. B. ROSENBLIT The Institute of Organic Synthesis of the Academy of Sciences of the Latvian SSR, Riga 6, Aizkraukles 21, U.S.S.R. AND L. A. RASTRIGIN AND A. B. GLAZ The Institute of Electronics and Computing Technology of the Academy of Sciences of the Latvian SSR, Riga 6, Akademiyas 14, U.S.S.R. Received October 12, 1972 It is revealed that the problem of drug design which is at present coped with on a semi-intuitive basis may be interpreted in terms of modern pattern recognition theory as a problem of discriminating two classes of objects: the active and the inactive chemical compounds. In the meantime two questions are essentially important: (1) the presentation of information on the structure of a chemical compound, i.e., the elaboration of terms for adequately describing the structure and (2) the selection of a recognition algorithm. This paper deals with the perceptron approach to the resolution of the problem. The structure is, therefore, presented as a sequence of certain coded functional groups and is projected onto the perceptron retina. The error correction procedure with adaptation of S-A connections is employed for classification. The perceptron approach limitations are examined. INTRODUCTION The process of drug design is accompanied by a significant work on the synthesis and pharmacological examination of a great number of compounds before a substance can be obtained which is found to possess all the necessary physiological properties. This is caused by the fact that at the present stage of development of pharmacological chemistry there exists no general theory which ties the structure of substances with their physiological activity. Nonetheless a number of general aspects [...] such a high error probability the recognition system appears to be suffi[...]ctive.
[...] studied a number of approaches to resolving the problem of predicting [...]l activity of chemical compounds. These approaches are peculiar in the structure representation and by the pattern recognition algorithms. [...] deals with the perceptron approach (7). THE PERCEPTRON APPROACH [...]nition system we employed a three-layer perceptron-network, which [...] S-, A-, and R-units illustrated in Fig. 1 (8). [...]nits which receive information from the environment may be either [...]ut signal equal to 1) or inactive (output signal equal to 0). [...] solving prediction problems of pharmacological activity of chemical [...] the S-units of the perceptron form a receptor field it x n onto which is [...] FIG. 1. A three-layer perceptron. [...] algorithm (error correction procedure). After that the training set is presented for testing, and a quality function (number of incorrect answers of the perceptron) is determined for the given configuration of S-A connections. Besides this, the number of correct answers is determined for each A-unit. If the value of the quality function happens to be greater than the predetermined value, the S-A connections for A-units whose number of correct answers is lower than a certain threshold are readjusted at random. Then the error correction procedure is repeated, and in case of necessity another random search step is made. The search is continued until the value of the quality function becomes equal to or lower than the preset one or until the number of correct answers of all the A-units exceeds the threshold. After this a test is made according to a testing set. EXPERIMENT The possibility of recognizing pharmacological activity of a substance by the molecular structure was investigated on a series of alkyl- and alcoxialkyl-substituted 1,3-dioxanes (9) which are presented in Table 1. These chemical compounds are represented by structural formula [structural formula of a substituted 1,3-dioxane with substituents R1, R2, R3] and may exist as cis- and trans-isomers.
TABLE 1. ANTICONVULSION ACTIVITY OF 1,3-DIOXANES [Table with columns No., R1, R2, R3, Isomer (cis/trans), and Activity (antagonism to corasol, +1 or -1); the entries are garbled in the scan.]
[...] retina and two for inhibitory. The initial configuration of S-A connections was selected at random. The threshold of each A-unit was assumed to be equal to 1. The perceptron was adapted according to the algorithm described earlier. One part of the 46 compounds listed in Table 1 was selected for the training set and included representatives of all the four previously mentioned groups of compounds. The rest were used only for testing. Three various learning sets containing n1 = 22, n2 = 24, n3 = 26 objects were selected. For each of these we selected one threshold value q and conducted ten independent experiments. The average results and confidence intervals which correspond to 0.95 confidence probability are illustrated in Table 3. The results obtained lead us to believe that the cybernetic approach to the drug design problem is quite perspective.
TABLE 3. RESULTS OF EXPERIMENT
Learning set | Learning: reliability of recognition | Learning: confidence interval | Test: reliability of recognition | Test: confidence interval
n1 = 22 | 86 | 7 | 68 | 10
n2 = 24 | 89 | 6 | 71 | 13
n3 = 26 | 85 | 5 | 76 | 9
At the same time the perceptron approach is not completely free from a number of drawbacks and limitations. The most significant obstacle in the way of wide-spread employment of the perceptron approach is the difficulty of invariant structure presentation on the retina of the perceptron. Furthermore, the process of adapting S-A connections by method of random search demands much computer time. It should be noted, nevertheless, that the efficiency of this approach depends significantly on the adequacy of the terms employed for structure description (in terms of the perceptron approach, the method of structure presentation on the perceptron retina).
A number of later papers will be dedicated to the discussion of a range of algorithms in which an attempt is made to overcome the drawbacks of the perceptron approach. [Chart: number of publications (axis 0 to 1200) per two-year period, 1991-92 to 2017-18. ACS Journals search: “machine learning” anywhere in the article.]
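The error correction procedure described in the 1973 excerpt can be sketched roughly as follows. This is a minimal illustration, not the authors' program: the function names, the number of excitatory connections per A-unit, and the toy inputs are assumptions, and the random-search step that rewires poorly performing A-units is left out. Only the A-unit threshold of 1 and the random initial S-A wiring come from the text.

```python
import random

def make_connections(n_s, n_a, n_exc, n_inh, rng):
    """Randomly wire each A-unit to n_exc excitatory and n_inh inhibitory S-units."""
    conns = []
    for _ in range(n_a):
        picked = rng.sample(range(n_s), n_exc + n_inh)
        conns.append((picked[:n_exc], picked[n_exc:]))
    return conns

def a_layer(s, conns, theta=1):
    """An A-unit outputs 1 when excitatory minus inhibitory input reaches theta."""
    return [1 if sum(s[i] for i in exc) - sum(s[i] for i in inh) >= theta else 0
            for exc, inh in conns]

def predict(a, w):
    """R-unit: +1 if the weighted sum of A-unit outputs is positive, else -1."""
    return 1 if sum(wi * ai for wi, ai in zip(w, a)) > 0 else -1

def train(data, conns, epochs=50):
    """Error correction: A-R weights change only on misclassified examples."""
    w = [0.0] * len(conns)
    for _ in range(epochs):
        mistakes = 0
        for s, y in data:
            a = a_layer(s, conns)
            if predict(a, w) != y:
                w = [wi + y * ai for wi, ai in zip(w, a)]
                mistakes += 1
        if mistakes == 0:
            break
    return w

def count_errors(data, conns, w):
    """Quality function: number of incorrect answers on the given set."""
    return sum(predict(a_layer(s, conns), w) != y for s, y in data)

# Toy usage with made-up binary "retina" patterns (+1 active, -1 inactive).
rng = random.Random(0)
conns = make_connections(n_s=8, n_a=20, n_exc=3, n_inh=2, rng=rng)
toy = [([1, 1, 0, 0, 1, 0, 0, 0], 1), ([0, 0, 1, 1, 0, 0, 1, 0], -1)]
weights = train(toy, conns)
print(count_errors(toy, conns, weights))
```

In the paper's scheme, a high value of the quality function would trigger a random rewiring of selected A-units before training is repeated; that outer loop could be wrapped around `train` and `count_errors`.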
  • 9.
  • 11.
  • 12. Fold a Protein. Won CASP competition: predicted 25/43. Runner-up: predicted 3/43. Disclaimer: accuracy overestimated in favor of DeepMind.
  • 14.
  • 16. (a) Atom identifier (one-hot): H = 1 0 0 0; C = 0 1 0 0; O = 0 0 1 0; N = 0 0 0 1. (b) Atom identifier and atom typing, for atoms H1 C2 O3 N4 H5 H6: H1 has atom name 1 0 0 0 and atom type 0 1 0 0; N4 has atom name 0 0 0 1 and atom type 2 1 0 0. (c) Feature vectors of bonds, angles, nonbonds and dihedrals: Bond (atom p, atom q, bpq), 17 dimensions; Angle (atom p, atom q, atom r, apqr, bpq, bqr), 27 dimensions; Dihedral (atom p, atom q, atom r, atom s, dpqrs, apqr, aqrs, bpq, bqr, brs), 38 dimensions; Nonbond (atom p, atom q, npq), 17 dimensions.
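The dimension counts on this slide (17, 27, 38, 17) are consistent with each atom being described by 8 numbers plus the geometric terms. The sketch below assumes the 8 numbers are a 4-element one-hot atom name followed by 4 neighbour counts in the order H, C, O, N; that reading matches the slide's example values for H1 and N4 but is an inference, and all function names and the geometry values are hypothetical.

```python
ELEMENTS = ["H", "C", "O", "N"]

def atom_vector(symbol, neighbours):
    """8 numbers: one-hot atom name, then counts of bonded H, C, O, N atoms (assumed)."""
    name = [1 if symbol == e else 0 for e in ELEMENTS]
    atom_type = [neighbours.count(e) for e in ELEMENTS]
    return name + atom_type

def bond_vector(p, q, b_pq):
    return p + q + [b_pq]

def angle_vector(p, q, r, a_pqr, b_pq, b_qr):
    return p + q + r + [a_pqr, b_pq, b_qr]

def dihedral_vector(p, q, r, s, d_pqrs, a_pqr, a_qrs, b_pq, b_qr, b_rs):
    return p + q + r + s + [d_pqrs, a_pqr, a_qrs, b_pq, b_qr, b_rs]

def nonbond_vector(p, q, n_pq):
    return p + q + [n_pq]

# Atoms of the six-atom example; neighbour lists assume H1-C2, C2-O3, C2-N4, N4-H5, N4-H6.
h1 = atom_vector("H", ["C"])
c2 = atom_vector("C", ["H", "O", "N"])
n4 = atom_vector("N", ["C", "H", "H"])
h5 = atom_vector("H", ["N"])

# Geometry values below are placeholders, not data from the talk.
bond = bond_vector(h1, c2, 1.10)
angle = angle_vector(h1, c2, n4, 112.0, 1.10, 1.36)
dihedral = dihedral_vector(h1, c2, n4, h5, 180.0, 112.0, 119.0, 1.10, 1.36, 1.01)
nonbond = nonbond_vector(h1, n4, 2.05)
print(len(bond), len(angle), len(dihedral), len(nonbond))
```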
  • 17. Atoms: H1 C2 O3 N4 H5 H6. Bonds: H1-C2, C2-O3, C2-N4, N4-H5, N4-H6. Angles: H1-C2-O3, H1-C2-N4, N4-C2-O3, H5-N4-C2, H6-N4-C2, H5-N4-H6. Dihedrals: H1-C2-N4-H5, H1-C2-N4-H6, H5-N4-C2-O3, H6-N4-C2-O3. Nonbonds: H1…O3, H1…N4, O3…N4, C2…H6, C2…H5, H5…H6, H1…H6, H1…H5, H5…O3, H6…O3. Input: Cartesian coords. r1, r2,… rN; Z1, Z2,… ZN. Energy contributions: Eb (bonds), Ea (angles), Ed (dihedrals), En (nonbonds). E = Eb + Ea + Ed + En
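The decomposition on this slide can be reproduced from connectivity alone. The sketch below assumes the bond list H1-C2, C2-O3, C2-N4, N4-H5, N4-H6 (the connectivity implied by the slide's angle list) and treats every unbonded atom pair as a nonbond, as the slide's list does. The function names are hypothetical, and summing per-term contributions into Eb, Ea, Ed and En is an assumption about how the totals are formed.

```python
from itertools import combinations

# Atom numbering follows the slide: H1 C2 O3 N4 H5 H6.
BONDS = [(1, 2), (2, 3), (2, 4), (4, 5), (4, 6)]

def neighbours(bonds):
    """Map each atom to the set of atoms bonded to it."""
    nb = {}
    for p, q in bonds:
        nb.setdefault(p, set()).add(q)
        nb.setdefault(q, set()).add(p)
    return nb

def angles(bonds):
    """All p-q-r triples where p and r are both bonded to the central atom q."""
    nb = neighbours(bonds)
    return [(p, q, r) for q in sorted(nb) for p, r in combinations(sorted(nb[q]), 2)]

def dihedrals(bonds):
    """All p-q-r-s chains built around each central bond q-r."""
    nb = neighbours(bonds)
    out = []
    for q, r in bonds:
        for p in sorted(nb[q] - {r}):
            for s in sorted(nb[r] - {q}):
                if p != s:
                    out.append((p, q, r, s))
    return out

def nonbonds(bonds, n_atoms):
    """Every atom pair that is not directly bonded."""
    bonded = {frozenset(b) for b in bonds}
    return [(p, q) for p, q in combinations(range(1, n_atoms + 1), 2)
            if frozenset((p, q)) not in bonded]

def total_energy(e_b, e_a, e_d, e_n):
    """E = Eb + Ea + Ed + En, each taken as a sum of per-term contributions (assumed)."""
    return sum(e_b) + sum(e_a) + sum(e_d) + sum(e_n)

print(len(BONDS), len(angles(BONDS)), len(dihedrals(BONDS)), len(nonbonds(BONDS, 6)))
```

The counts agree with the slide: 5 bonds, 6 angles, 4 dihedrals and 10 nonbonds for the six-atom example.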
  • 18. Dataset • Subset of ANI-1 dataset • 57,462 molecules (all possible molecules with up to 8 heavy atoms (C/N/O), plus H) • Normal mode sampling for higher energy states (~22,000,000) • DFT (ωB97X/6-31G(d)) energies • This study: ~7.6 million points (< 30 kcal/mol), split 80-10-10% into training, validation and test sets. Smith et al., Chem. Sci., 2017, 8, 3192
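An 80-10-10 split like the one quoted on this slide can be sketched as below. The shuffling, the seed and the function name are assumptions; the talk does not say how the points were assigned to each set.

```python
import random

def split_indices(n, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle indices 0..n-1 and cut them into training, validation and test parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(fractions[0] * n)
    n_val = round(fractions[1] * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```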
  • 19. Accuracy of ML-BAND. [Histograms: frequency (%) vs. AE (kcal/mol) in 1 kcal/mol bins.] (a) Test set (~700k molecules) (b) GDB10 (~1.5k molecules)
  • 21. C–C Bond Stretching. [Two plots: energy (kcal/mol) vs. C–C length (Å), comparing DFT and ML-BAND.]
  • 22. C–N Bond Stretching. [Two plots: energy (kcal/mol) vs. C–N length (Å), comparing DFT and ML-BAND.]
  • 23. C–C–C Bond Bending. [Two plots: energy (kcal/mol) vs. C–C–C angle (degree, 80 to 150), comparing DFT and ML-BAND, for fentanyl and methamphetamine.]
  • 24. Reaction Energies. [Bar chart: reaction energy (kcal/mol) from DFT, ML-BAND and AM1 for six reactions: intramolecular H-bond, hydrogenation, Diels-Alder, aldol condensation, esterification, rearrangement.]
  • 25. Geometry Optimization with ML-BAND. [Two plots: atomization energy (kcal/mol) vs. step # for octane and methylpropenol.]
  • 26. Molecular Design. Direct (Expt/Comput.): Molecule → Property. Inverse (Generative ML): Property → Molecule. Sanchez-Lengeling & Aspuru-Guzik, Science, 2018, 361, 360.
  • 28. Dataset: npj Computational Materials 1, 15010 (2015). ICSD (FIZ Karlsruhe): 161k; – duplicates, incomplete,…: 44k; < 35 atoms in unit cell: 30k; + elemental, binary, ternary..,: 563k; + > 10 atoms of same element, – incomplete data, – all but one structure with the same composition: 272k
  • 29. Feature Vector (example: Al2O3). One row per element; columns 0 to 10:
        0 1 2 3 4 5 6 7 8 9 10
    H   1 0 0 0 0 0 0 0 0 0 0
    He  1 0 0 0 0 0 0 0 0 0 0
    Li  1 0 0 0 0 0 0 0 0 0 0
    Be  1 0 0 0 0 0 0 0 0 0 0
    …
    O   0 0 0 1 0 0 0 0 0 0 0
    …
    Al  0 0 1 0 0 0 0 0 0 0 0
    …
    Pu  1 0 0 0 0 0 0 0 0 0 0
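The table on this slide reads as one one-hot row per element, with the hot column giving how many atoms of that element appear in the formula. A minimal sketch of that encoding follows; the function name, the short element list and the simple formula parser (no parentheses or fractional counts) are assumptions made for illustration.

```python
import re

def composition_matrix(formula, elements, max_count=10):
    """One row per element; the column index of the 1 is that element's atom count."""
    counts = {el: 0 for el in elements}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    matrix = []
    for el in elements:
        row = [0] * (max_count + 1)
        row[counts[el]] = 1
        matrix.append(row)
    return matrix

# A shortened element list; the slide's rows run from H to Pu.
elements = ["H", "He", "Li", "Be", "O", "Al", "Pu"]
matrix = composition_matrix("Al2O3", elements)
for el, row in zip(elements, matrix):
    print(el, row)
```

An element with more than `max_count` atoms would fall outside the row, which fits the dataset slide's note about structures with more than 10 atoms of the same element.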
  • 31. Performance • Deep neural network with 7 hidden layers. • ReLU activation function for all layers except the last. • Linear activation for the last layer. MAE = 0.051 eV/atom; MAE = 0.068 eV/atom; MAE = 0.38 Å³
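The architecture listed on this slide (7 hidden layers with ReLU, linear output) could look like the forward pass below. The layer widths, input size, weight initialisation and function names are assumptions; the slide gives only the depth and the activations.

```python
import random

def init_mlp(sizes, seed=0):
    """Create (weights, biases) for each consecutive pair of layer sizes."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = (2.0 / n_in) ** 0.5
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers

def forward(layers, x):
    """ReLU after every layer except the last, which stays linear."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x

# Hypothetical sizes: 12 inputs, 7 hidden layers of width 16, 1 output.
net = init_mlp([12] + [16] * 7 + [1])
print(forward(net, [1.0] * 12))
```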
  • 32. DING
  • 33. Conditional Variational Autoencoder. Input → encoder → Code → decoder → Output. Input = Output
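A single forward pass of a conditional variational autoencoder can be sketched as follows. This is a generic textbook formulation with linear encoder and decoder maps, not the model from the talk; the condition vector `c`, the weight shapes and all names are hypothetical.

```python
import math
import random

def linear(weights, x):
    """Matrix-vector product with weights given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def cvae_forward(x, c, enc_mu, enc_logvar, dec, rng):
    h = x + c                                   # encoder sees input and condition
    mu = linear(enc_mu, h)
    logvar = linear(enc_logvar, h)
    eps = [rng.gauss(0.0, 1.0) for _ in mu]
    # Reparameterisation: sample the code as mu + sigma * eps.
    z = [m + math.exp(0.5 * lv) * e for m, lv, e in zip(mu, logvar, eps)]
    x_hat = linear(dec, z + c)                  # decoder sees code and condition
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    # KL divergence between N(mu, sigma^2) and the standard normal prior.
    kl = -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))
    return x_hat, recon, kl

rng = random.Random(0)
x, c = [1.0, 2.0], [1.0]
enc_mu = [[rng.gauss(0.0, 0.1) for _ in range(3)] for _ in range(2)]
enc_logvar = [[rng.gauss(0.0, 0.1) for _ in range(3)] for _ in range(2)]
dec = [[rng.gauss(0.0, 0.1) for _ in range(3)] for _ in range(2)]
x_hat, recon, kl = cvae_forward(x, c, enc_mu, enc_logvar, dec, rng)
print(len(x_hat), recon >= 0.0, kl >= 0.0)
```

Training would minimise `recon + kl` so that the output reproduces the input, which is the "Input = Output" target on the slide.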
  • 36. DING
  • 38. Predictions Within the Dataset Unbiased Biased