Objectives
Acknowledgements
• Svetlana Gelpí-Domínguez acknowledges support from NSF Northeast LSAMP Bridge to Doctorate Award
#1400382. She would also like to thank Dr. Felice C. Lightstone, Dr. Brian J. Bennion, Dr. Sergio Wong,
and Dr. Eric Schwegler for the opportunity to work in the 2017 CCMS cohort, and Dr. Miguel Morales-Silva
and Tony Baylis for their constant mentoring at LLNL. Prepared by LLNL
under Contract DE-AC52-07NA27344.
Molecular Descriptors: Comparing Structural Complexity and Software
Svetlana Gelpí-Domínguez1,2, Sergio Wong2, Brian J. Bennion2, Felice C. Lightstone2
1) Department of Chemistry, University of Connecticut, Storrs, CT 06269
2) Lawrence Livermore National Laboratory, Livermore, CA 94550
Methodology
Results and Discussion Conclusions
Figures A, B, C and D. A total of 94 molecules were used as input, in both 1-D and 3-D formats, to calculate 188
molecular descriptors for each molecule. Figs. A, B, and C are correlation plots for the descriptors molecular weight,
hydrophobicity, and accessible surface area. Fig. D shows that, of the 188 molecular descriptors, 130 (shaded in blue)
have an R² over 0.5; these do not depend on 3-D structural input. The other 58 descriptors (shown in orange) must
be calculated from 3-D structure files, which explains their low R² values.
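The per-descriptor R² values summarized in Fig. D come from correlating, for each descriptor, the values obtained from 1-D input with those obtained from 3-D input. The following is a minimal sketch that assumes R² means the squared Pearson correlation coefficient (the poster does not state which definition was used). The function name `r_squared` and the sample numbers are hypothetical illustrations, not data from the study.

```python
from math import fsum

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = fsum(x) / len(x)
    mean_y = fsum(y) / len(y)
    sxy = fsum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = fsum((a - mean_x) ** 2 for a in x)
    syy = fsum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # Assumption: a constant descriptor is reported as uncorrelated
        return 0.0
    return (sxy * sxy) / (sxx * syy)

# Hypothetical values of one descriptor for the same four molecules
from_1d = [78.1, 46.1, 180.2, 151.2]
from_3d = [78.1, 46.1, 180.2, 151.2]
print(r_squared(from_1d, from_3d))
```

A descriptor that ignores geometry, such as molecular weight, should give a value close to 1 in this comparison, while a descriptor that needs coordinates would not.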
Figures E, F, G and H. Comparison of descriptors calculated using MOE and RDKit. Here we used 1-D
structures (SMILES strings) as input to compare the 34 descriptors that both programs have in common. We observe a
strong correlation in molecular weight and hydrophobicity (Fig. E). Fig. G shows a low correlation between the MOE
SlogP_VSA descriptor and the RDKit ‘MOE-like’ SlogP_VSA descriptor. Fig. H shows the frequency of R² for the
34 descriptors used in this comparison. The majority of these descriptors have an R² under 0.6, showing a
low correlation between the descriptors of the two programs.
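A frequency plot like the one described for Fig. H can be built by binning the per-descriptor R² values. The sketch below assumes ten equal-width bins on [0, 1]; the binning used on the poster is not stated. The helper names and the sample values are hypothetical, not the poster's data.

```python
def r2_histogram(r2_values, n_bins=10):
    """Count R2 values in n_bins equal-width bins covering [0, 1]."""
    counts = [0] * n_bins
    for r2 in r2_values:
        if not 0.0 <= r2 <= 1.0:
            raise ValueError(f"R2 out of range: {r2}")
        # R2 == 1.0 is placed in the last bin rather than overflowing
        index = min(int(r2 * n_bins), n_bins - 1)
        counts[index] += 1
    return counts

def fraction_below(r2_values, threshold):
    """Fraction of descriptors whose R2 is strictly below threshold."""
    return sum(r2 < threshold for r2 in r2_values) / len(r2_values)

# Hypothetical R2 values, one per descriptor
sample = [0.05, 0.12, 0.31, 0.58, 0.97, 1.0]
print(r2_histogram(sample))
print(fraction_below(sample, 0.6))
```

The second helper corresponds to the statement that a majority of the compared descriptors fall under 0.6.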
Future work
• Produce a reliable Quantitative Structure–Activity Relationship (QSAR) model
that predicts the bioactivity of these molecules against an important receptor, such
as estrogen receptor alpha.
• Are 3-D descriptors necessary to build accurate QSAR models?
• There is a strong correlation between descriptors calculated from 1-D
SMILES strings and descriptors calculated from 3-D quantum mechanical structures.
• The average R² between descriptors calculated in MOE from 1-D SMILES strings
and from 3-D structures, for 94 molecules, was 0.72. Fig. D shows that most of
the highly correlated molecular descriptors are 0-D, 1-D, and 2-D descriptors.
This means that 3-D structures are not essential as input when only 0-D, 1-D,
and 2-D descriptors are needed.
• In the MOE calculations, 3-D descriptor values do depend on the dimensionality
of the input structure.
• There is a low correlation between MOE descriptors and the ‘MOE-like’
descriptors found in RDKit (average R² of 0.12).
• If your QSAR model does not depend on 3-D descriptors, your pipeline can
become more efficient by using 1-D SMILES strings for your descriptor
calculations.
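The last conclusion suggests a simple decision rule for a pipeline. As a rough sketch, assume that per-descriptor R² values between 1-D and 3-D input are already available, and that the 0.5 cutoff used in Fig. D separates input-independent descriptors from 3-D-dependent ones. Using that cutoff as a decision rule is an assumption of this example, and the descriptor names and R² values are made up for illustration.

```python
R2_CUTOFF = 0.5  # cutoff from Fig. D; treating it as a decision rule is an assumption

def needs_3d_input(model_descriptors, r2_by_descriptor, cutoff=R2_CUTOFF):
    """Return the model descriptors whose 1-D vs 3-D R2 is at or below cutoff."""
    missing = [d for d in model_descriptors if d not in r2_by_descriptor]
    if missing:
        raise KeyError(f"no R2 available for: {missing}")
    return [d for d in model_descriptors if r2_by_descriptor[d] <= cutoff]

# Hypothetical values, not results from the poster
r2_by_descriptor = {"weight": 1.00, "logP": 0.98, "dipole": 0.10, "pmi1": 0.22}

model = ["weight", "logP"]
dependent = needs_3d_input(model, r2_by_descriptor)
print("use 3-D structures" if dependent else "1-D SMILES input is sufficient")
```

An empty result means every descriptor in the model is reproduced well from SMILES alone, so the 3-D structure generation step can be skipped for that model.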
Abstract
What are descriptors, and how are they used? In a large effort to predict the
activity of over 1.7 million compounds in various in vitro assays, the time it
takes to extract molecules from a database and process them for virtual screening is
crucial. Applications need to take advantage of LLNL’s high-performance computing.
Workflow (figure): 1.7 million compounds, in 1-D format (Simplified Molecular-Input Line-Entry System, SMILES, e.g. c1ccccc1) or 3-D format → molecular descriptor calculation → predict activity.
4 types of Molecular Descriptors:
• Topological: Atom count
• Geometrical: Principal Moment of Inertia
• Electronic: Dipole moment
• 3D Descriptors
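To make the simplest category concrete, the sketch below counts heavy (non-hydrogen) atoms directly from a SMILES string such as the benzene example c1ccccc1. It is a toy tokenizer written for this illustration, not how MOE or RDKit compute atom counts. It assumes a valid SMILES string and recognises only bracket atoms and the organic-subset symbols (B, C, N, O, P, S, F, Cl, Br, I and the aromatic lowercase forms); the function name is hypothetical.

```python
import re

# Bracket atoms first, then two-letter halogens, then single-letter symbols
_ATOM_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]")
# Element symbol inside a bracket atom, after an optional isotope number
_BRACKET_SYMBOL = re.compile(r"\[\d*([A-Z][a-z]?|[a-z]{1,2})")

def heavy_atom_count(smiles):
    """Count non-hydrogen atoms in a valid, simple SMILES string."""
    count = 0
    for token in _ATOM_TOKEN.findall(smiles):
        if token.startswith("["):
            match = _BRACKET_SYMBOL.match(token)
            if match is None or match.group(1) == "H":
                continue  # skip explicit hydrogens such as [H] or [2H]
        count += 1
    return count

print(heavy_atom_count("c1ccccc1"))  # benzene
```

Because the count depends only on the atoms written in the string, a descriptor of this kind needs no 3-D coordinates, unlike a principal moment of inertia or a dipole moment.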
Software used to perform calculations: MOE (commercial) and RDKit (open source).
MOAD (clip.llnl.gov:5507)
• Are 3-D structures determined by ab initio methods better to use than 1-D
SMILES strings for the calculation of molecular descriptors? Figures A-D
address this question.
• Are the MOE-like descriptors in RDKit the same as those calculated by MOE?
Figures E-H address this question.
• Upload the 1.7 million compounds with their calculated descriptors to the MOAD database.
[Figure panels A-H, images not reproduced. Panel D: R² Frequency, Calculated Descriptors. Panel H: R² Frequency, RDKit vs. MOE. Two panels are labeled "Positive Control".]
