SlideShare a Scribd company logo
1 of 1
Download to read offline
ChemAxon
Graphisoft Park, Hx Building
H-1037 Budapest, Hungary
Phone: +36 1 453 2660
Fax: +36 1 453 2659
http://www.chemaxon.com
Making The Most of
Approximate Maximum Common Substructure Search
Introduction
Maximum common substructure (MCS) is a computationally
hard problem that is of great importance in multiple aspects of
chemical informatics. Its many applications include:
● Similarity search
● Clustering
● Reaction mapping
● Molecule alignment
● Matched molecular pair (MMP) search
Péter Englert12
, Péter Kovács12
1
Department of Algorithms and Applications, Faculty of Informatics, Eötvös Loránd University
2
ChemAxon Ltd.
Aim
The aim of our research was to develop highly efficient heuristic
MCS algorithms. Our implementations are optimized for both
running time and accuracy. We also consider the chemical
relevance of the result, which includes customizable matching
rules, keeping rings together, and providing appropriate
mappings for reaction mapping and molecule alignment.
Heuristics
Apart from comparing existing algorithms and heuristics, for
example, the connectivity heuristic, which we have found very
powerful; new heuristics and methods have also been
developed.
One of the most important ones is based on ECFP hash codes.
We assign hash values to atom neighborhoods, and prefer
matching atoms which have large neighborhoods whose hash
values match.
Mapping optimization
Another improvement is that the one-to-one atom
correspondence defined by the found common substructure is
also taken into account. This is especially important in
applications like molecule alignment or reaction mapping.
Examples:
Non-preferred mapping
The developed solution handles these simple cases, as well as
many other more complex ones.Customizability
Methods have been developed to provide features like keeping
rings together, enumerating multiple common substructures, or
providing fully customizable matching options. The possible
non-local and non-transitive nature of these matching rules
require special considerations. Examples:
Conclusion
Effective heuristics have been developed to improve both the
performance and the accuracy of MCS algorithms, as well as to
provide chemically more relevant results for several
applications.
The resulting implementations are included in the 6.0 version of
ChemAxon's JChem software suite.
Representation
A new representation for the compatiblity graph in max-clique
based methods has been developed, which greatly improves
both memory usage and running time.
The memory usage of the new representation was measured
on graphene structures of varying size. The empirical results
nicely support the theoretical improvement from O(m4
) to O(m3
)
where m is the bond count of the input molecules.
Preferred mapping
Non-preferred mapping Preferred mapping
Default MCS
MCS keeping rings
Results
The effectiveness of the mentioned heuristics was thoroughly
measured and compared.
Typical results when comparing the size of the found MCS
with and without using heuristics.
Running times for the same test set.
MCS where bonds in rings
can only match bonds in
rings of similar size
References
John W. Raymond and Peter Willett. Maximum common subgraph
isomorphism algorithms for the matching of chemical structures. Journal of
Computer-Aided Molecular Design, 2002, 16:521-33.
Andrea Grosso, Marco Locatelli, and Wayne Pullan. Simple ingredients
leading to very efficient heuristics for the maximum clique problem. Journal of
Heuristics, 2008, 14(6):587-612.
Takeshi Kawabata. Build-up algorithm for atomic correspondence between
chemical structures. Journal of Chemical Information and Modeling, 2011
51(8):1775-87.

More Related Content

Similar to Making the most of maximum common substructure search

Innovations in t-way test creation based on a hybrid hill climbing-greedy alg...
Innovations in t-way test creation based on a hybrid hill climbing-greedy alg...Innovations in t-way test creation based on a hybrid hill climbing-greedy alg...
Innovations in t-way test creation based on a hybrid hill climbing-greedy alg...IAESIJAI
 
Efficient evaluation of flatness error from Coordinate Measurement Data using...
Efficient evaluation of flatness error from Coordinate Measurement Data using...Efficient evaluation of flatness error from Coordinate Measurement Data using...
Efficient evaluation of flatness error from Coordinate Measurement Data using...Ali Shahed
 
An Integrated Solver For Optimization Problems
An Integrated Solver For Optimization ProblemsAn Integrated Solver For Optimization Problems
An Integrated Solver For Optimization ProblemsMonica Waters
 
A HYBRID CLUSTERING ALGORITHM FOR DATA MINING
A HYBRID CLUSTERING ALGORITHM FOR DATA MININGA HYBRID CLUSTERING ALGORITHM FOR DATA MINING
A HYBRID CLUSTERING ALGORITHM FOR DATA MININGcscpconf
 
A SIMPLE PROCESS TO SPEED UP MACHINE LEARNING METHODS: APPLICATION TO HIDDEN ...
A SIMPLE PROCESS TO SPEED UP MACHINE LEARNING METHODS: APPLICATION TO HIDDEN ...A SIMPLE PROCESS TO SPEED UP MACHINE LEARNING METHODS: APPLICATION TO HIDDEN ...
A SIMPLE PROCESS TO SPEED UP MACHINE LEARNING METHODS: APPLICATION TO HIDDEN ...cscpconf
 
Computational Optimization, Modelling and Simulation: Recent Trends and Chall...
Computational Optimization, Modelling and Simulation: Recent Trends and Chall...Computational Optimization, Modelling and Simulation: Recent Trends and Chall...
Computational Optimization, Modelling and Simulation: Recent Trends and Chall...Xin-She Yang
 
A simulation based multi-objective design optimization of electronic packages...
A simulation based multi-objective design optimization of electronic packages...A simulation based multi-objective design optimization of electronic packages...
A simulation based multi-objective design optimization of electronic packages...Phuong Dx
 
Manager’s Preferences Modeling within Multi-Criteria Flowshop Scheduling Prob...
Manager’s Preferences Modeling within Multi-Criteria Flowshop Scheduling Prob...Manager’s Preferences Modeling within Multi-Criteria Flowshop Scheduling Prob...
Manager’s Preferences Modeling within Multi-Criteria Flowshop Scheduling Prob...Waqas Tariq
 
Using a Detailed Chemical-Kinetics Mechanism to Ensure Accurate Combustion Si...
Using a Detailed Chemical-Kinetics Mechanism to Ensure Accurate Combustion Si...Using a Detailed Chemical-Kinetics Mechanism to Ensure Accurate Combustion Si...
Using a Detailed Chemical-Kinetics Mechanism to Ensure Accurate Combustion Si...Reaction Design
 
COMPARISON BETWEEN THE GENETIC ALGORITHMS OPTIMIZATION AND PARTICLE SWARM OPT...
COMPARISON BETWEEN THE GENETIC ALGORITHMS OPTIMIZATION AND PARTICLE SWARM OPT...COMPARISON BETWEEN THE GENETIC ALGORITHMS OPTIMIZATION AND PARTICLE SWARM OPT...
COMPARISON BETWEEN THE GENETIC ALGORITHMS OPTIMIZATION AND PARTICLE SWARM OPT...IAEME Publication
 
Comparison between the genetic algorithms optimization and particle swarm opt...
Comparison between the genetic algorithms optimization and particle swarm opt...Comparison between the genetic algorithms optimization and particle swarm opt...
Comparison between the genetic algorithms optimization and particle swarm opt...IAEME Publication
 
A Review of Constraint Programming
A Review of Constraint ProgrammingA Review of Constraint Programming
A Review of Constraint ProgrammingEditor IJCATR
 
Lecture 2 Basic Concepts of Optimal Design and Optimization Techniques final1...
Lecture 2 Basic Concepts of Optimal Design and Optimization Techniques final1...Lecture 2 Basic Concepts of Optimal Design and Optimization Techniques final1...
Lecture 2 Basic Concepts of Optimal Design and Optimization Techniques final1...Khalil Alhatab
 
Multi objective game theoretic scheduling of bag-of-tasks workflows on hybri...
Multi objective game theoretic  scheduling of bag-of-tasks workflows on hybri...Multi objective game theoretic  scheduling of bag-of-tasks workflows on hybri...
Multi objective game theoretic scheduling of bag-of-tasks workflows on hybri...Nexgen Technology
 
Development of an Explainable Model for a Gluconic Acid Bioreactor and Profit...
Development of an Explainable Model for a Gluconic Acid Bioreactor and Profit...Development of an Explainable Model for a Gluconic Acid Bioreactor and Profit...
Development of an Explainable Model for a Gluconic Acid Bioreactor and Profit...IRJET Journal
 
Sequential estimation of_discrete_choice_models__copy_-4
Sequential estimation of_discrete_choice_models__copy_-4Sequential estimation of_discrete_choice_models__copy_-4
Sequential estimation of_discrete_choice_models__copy_-4YoussefKitane
 
Unit Commitment with Lagrange Relaxation Using Particle Swarm Optimization
Unit Commitment with Lagrange Relaxation Using Particle Swarm OptimizationUnit Commitment with Lagrange Relaxation Using Particle Swarm Optimization
Unit Commitment with Lagrange Relaxation Using Particle Swarm OptimizationShanmuga Priyan Thiagarajan
 
Ying hua, c. (2010): adopting co-evolution and constraint-satisfaction concep...
Ying hua, c. (2010): adopting co-evolution and constraint-satisfaction concep...Ying hua, c. (2010): adopting co-evolution and constraint-satisfaction concep...
Ying hua, c. (2010): adopting co-evolution and constraint-satisfaction concep...ArchiLab 7
 

Similar to Making the most of maximum common substructure search (20)

Empirical and quantum mechanical methods of 13 c chemical shifts prediction c...
Empirical and quantum mechanical methods of 13 c chemical shifts prediction c...Empirical and quantum mechanical methods of 13 c chemical shifts prediction c...
Empirical and quantum mechanical methods of 13 c chemical shifts prediction c...
 
Innovations in t-way test creation based on a hybrid hill climbing-greedy alg...
Innovations in t-way test creation based on a hybrid hill climbing-greedy alg...Innovations in t-way test creation based on a hybrid hill climbing-greedy alg...
Innovations in t-way test creation based on a hybrid hill climbing-greedy alg...
 
Efficient evaluation of flatness error from Coordinate Measurement Data using...
Efficient evaluation of flatness error from Coordinate Measurement Data using...Efficient evaluation of flatness error from Coordinate Measurement Data using...
Efficient evaluation of flatness error from Coordinate Measurement Data using...
 
An Integrated Solver For Optimization Problems
An Integrated Solver For Optimization ProblemsAn Integrated Solver For Optimization Problems
An Integrated Solver For Optimization Problems
 
A systematic approach for the generation and verification of structural hypot...
A systematic approach for the generation and verification of structural hypot...A systematic approach for the generation and verification of structural hypot...
A systematic approach for the generation and verification of structural hypot...
 
A HYBRID CLUSTERING ALGORITHM FOR DATA MINING
A HYBRID CLUSTERING ALGORITHM FOR DATA MININGA HYBRID CLUSTERING ALGORITHM FOR DATA MINING
A HYBRID CLUSTERING ALGORITHM FOR DATA MINING
 
A SIMPLE PROCESS TO SPEED UP MACHINE LEARNING METHODS: APPLICATION TO HIDDEN ...
A SIMPLE PROCESS TO SPEED UP MACHINE LEARNING METHODS: APPLICATION TO HIDDEN ...A SIMPLE PROCESS TO SPEED UP MACHINE LEARNING METHODS: APPLICATION TO HIDDEN ...
A SIMPLE PROCESS TO SPEED UP MACHINE LEARNING METHODS: APPLICATION TO HIDDEN ...
 
Computational Optimization, Modelling and Simulation: Recent Trends and Chall...
Computational Optimization, Modelling and Simulation: Recent Trends and Chall...Computational Optimization, Modelling and Simulation: Recent Trends and Chall...
Computational Optimization, Modelling and Simulation: Recent Trends and Chall...
 
A simulation based multi-objective design optimization of electronic packages...
A simulation based multi-objective design optimization of electronic packages...A simulation based multi-objective design optimization of electronic packages...
A simulation based multi-objective design optimization of electronic packages...
 
Manager’s Preferences Modeling within Multi-Criteria Flowshop Scheduling Prob...
Manager’s Preferences Modeling within Multi-Criteria Flowshop Scheduling Prob...Manager’s Preferences Modeling within Multi-Criteria Flowshop Scheduling Prob...
Manager’s Preferences Modeling within Multi-Criteria Flowshop Scheduling Prob...
 
Using a Detailed Chemical-Kinetics Mechanism to Ensure Accurate Combustion Si...
Using a Detailed Chemical-Kinetics Mechanism to Ensure Accurate Combustion Si...Using a Detailed Chemical-Kinetics Mechanism to Ensure Accurate Combustion Si...
Using a Detailed Chemical-Kinetics Mechanism to Ensure Accurate Combustion Si...
 
COMPARISON BETWEEN THE GENETIC ALGORITHMS OPTIMIZATION AND PARTICLE SWARM OPT...
COMPARISON BETWEEN THE GENETIC ALGORITHMS OPTIMIZATION AND PARTICLE SWARM OPT...COMPARISON BETWEEN THE GENETIC ALGORITHMS OPTIMIZATION AND PARTICLE SWARM OPT...
COMPARISON BETWEEN THE GENETIC ALGORITHMS OPTIMIZATION AND PARTICLE SWARM OPT...
 
Comparison between the genetic algorithms optimization and particle swarm opt...
Comparison between the genetic algorithms optimization and particle swarm opt...Comparison between the genetic algorithms optimization and particle swarm opt...
Comparison between the genetic algorithms optimization and particle swarm opt...
 
A Review of Constraint Programming
A Review of Constraint ProgrammingA Review of Constraint Programming
A Review of Constraint Programming
 
Lecture 2 Basic Concepts of Optimal Design and Optimization Techniques final1...
Lecture 2 Basic Concepts of Optimal Design and Optimization Techniques final1...Lecture 2 Basic Concepts of Optimal Design and Optimization Techniques final1...
Lecture 2 Basic Concepts of Optimal Design and Optimization Techniques final1...
 
Multi objective game theoretic scheduling of bag-of-tasks workflows on hybri...
Multi objective game theoretic  scheduling of bag-of-tasks workflows on hybri...Multi objective game theoretic  scheduling of bag-of-tasks workflows on hybri...
Multi objective game theoretic scheduling of bag-of-tasks workflows on hybri...
 
Development of an Explainable Model for a Gluconic Acid Bioreactor and Profit...
Development of an Explainable Model for a Gluconic Acid Bioreactor and Profit...Development of an Explainable Model for a Gluconic Acid Bioreactor and Profit...
Development of an Explainable Model for a Gluconic Acid Bioreactor and Profit...
 
Sequential estimation of_discrete_choice_models__copy_-4
Sequential estimation of_discrete_choice_models__copy_-4Sequential estimation of_discrete_choice_models__copy_-4
Sequential estimation of_discrete_choice_models__copy_-4
 
Unit Commitment with Lagrange Relaxation Using Particle Swarm Optimization
Unit Commitment with Lagrange Relaxation Using Particle Swarm OptimizationUnit Commitment with Lagrange Relaxation Using Particle Swarm Optimization
Unit Commitment with Lagrange Relaxation Using Particle Swarm Optimization
 
Ying hua, c. (2010): adopting co-evolution and constraint-satisfaction concep...
Ying hua, c. (2010): adopting co-evolution and constraint-satisfaction concep...Ying hua, c. (2010): adopting co-evolution and constraint-satisfaction concep...
Ying hua, c. (2010): adopting co-evolution and constraint-satisfaction concep...
 

Recently uploaded

Temporomandibular joint Muscles of Mastication
Temporomandibular joint Muscles of MasticationTemporomandibular joint Muscles of Mastication
Temporomandibular joint Muscles of Masticationvidulajaib
 
Evidences of Evolution General Biology 2
Evidences of Evolution General Biology 2Evidences of Evolution General Biology 2
Evidences of Evolution General Biology 2John Carlo Rollon
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Nistarini College, Purulia (W.B) India
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real timeSatoshi NAKAHIRA
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxpriyankatabhane
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxyaramohamed343013
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxSwapnil Therkar
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Patrick Diehl
 
Twin's paradox experiment is a meassurement of the extra dimensions.pptx
Twin's paradox experiment is a meassurement of the extra dimensions.pptxTwin's paradox experiment is a meassurement of the extra dimensions.pptx
Twin's paradox experiment is a meassurement of the extra dimensions.pptxEran Akiva Sinbar
 
Welcome to GFDL for Take Your Child To Work Day
Welcome to GFDL for Take Your Child To Work DayWelcome to GFDL for Take Your Child To Work Day
Welcome to GFDL for Take Your Child To Work DayZachary Labe
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...lizamodels9
 
Manassas R - Parkside Middle School 🌎🏫
Manassas R - Parkside Middle School 🌎🏫Manassas R - Parkside Middle School 🌎🏫
Manassas R - Parkside Middle School 🌎🏫qfactory1
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxmalonesandreagweneth
 
Recombinant DNA technology( Transgenic plant and animal)
Recombinant DNA technology( Transgenic plant and animal)Recombinant DNA technology( Transgenic plant and animal)
Recombinant DNA technology( Transgenic plant and animal)DHURKADEVIBASKAR
 
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |aasikanpl
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxpriyankatabhane
 

Recently uploaded (20)

Temporomandibular joint Muscles of Mastication
Temporomandibular joint Muscles of MasticationTemporomandibular joint Muscles of Mastication
Temporomandibular joint Muscles of Mastication
 
Evidences of Evolution General Biology 2
Evidences of Evolution General Biology 2Evidences of Evolution General Biology 2
Evidences of Evolution General Biology 2
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptx
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docx
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
Twin's paradox experiment is a meassurement of the extra dimensions.pptx
Twin's paradox experiment is a meassurement of the extra dimensions.pptxTwin's paradox experiment is a meassurement of the extra dimensions.pptx
Twin's paradox experiment is a meassurement of the extra dimensions.pptx
 
Welcome to GFDL for Take Your Child To Work Day
Welcome to GFDL for Take Your Child To Work DayWelcome to GFDL for Take Your Child To Work Day
Welcome to GFDL for Take Your Child To Work Day
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
 
Manassas R - Parkside Middle School 🌎🏫
Manassas R - Parkside Middle School 🌎🏫Manassas R - Parkside Middle School 🌎🏫
Manassas R - Parkside Middle School 🌎🏫
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
 
Recombinant DNA technology( Transgenic plant and animal)
Recombinant DNA technology( Transgenic plant and animal)Recombinant DNA technology( Transgenic plant and animal)
Recombinant DNA technology( Transgenic plant and animal)
 
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
 

Making the most of maximum common substructure search

  • 1. ChemAxon Graphisoft Park, Hx Building H-1037 Budapest, Hungary Phone: +36 1 453 2660 Fax: +36 1 453 2659 http://www.chemaxon.com Making The Most of Approximate Maximum Common Substructure Search Introduction Maximum common substructure (MCS) is a computationally hard problem that is of great importance in multiple aspects of chemical informatics. Its many applications include: ● Similarity search ● Clustering ● Reaction mapping ● Molecule alignment ● Matched molecular pair (MMP) search Péter Englert12 , Péter Kovács12 1 Department of Algorithms and Applications, Faculty of Informatics, Eötvös Loránd University 2 ChemAxon Ltd. Aim The aim of our research was to develop highly efficient heuristic MCS algorithms. Our implementations are optimized for both running time and accuracy. We also consider the chemical relevance of the result, which includes customizable matching rules, keeping rings together, and providing appropriate mappings for reaction mapping and molecule alignment. Heuristics Apart from comparing existing algorithms and heuristics, for example, the connectivity heuristic, which we have found very powerful; new heuristics and methods have also been developed. One of the most important ones is based on ECFP hash codes. We assign hash values to atom neighborhoods, and prefer matching atoms which have large neighborhoods whose hash values match. Mapping optimization Another improvement is that the one-to-one atom correspondence defined by the found common substructure is also taken into account. This is especially important in applications like molecule alignment or reaction mapping. Examples: Non-preferred mapping The developed solution handles these simple cases, as well as many other more complex ones.Customizability Methods have been developed to provide features like keeping rings together, enumerating multiple common substructures, or providing fully customizable matching options. The possible non-local and non-transitive nature of these matching rules require special considerations. Examples: Conclusion Effective heuristics have been developed to improve both the performance and the accuracy of MCS algorithms, as well as to provide chemically more relevant results for several applications. The resulting implementations are included in the 6.0 version of ChemAxon's JChem software suite. Representation A new representation for the compatiblity graph in max-clique based methods has been developed, which greatly improves both memory usage and running time. The memory usage of the new representation was measured on graphene structures of varying size. The empirical results nicely support the theoretical improvement from O(m4 ) to O(m3 ) where m is the bond count of the input molecules. Preferred mapping Non-preferred mapping Preferred mapping Default MCS MCS keeping rings Results The effectiveness of the mentioned heuristics was thoroughly measured and compared. Typical results when comparing the size of the found MCS with and without using heuristics. Running times for the same test set. MCS where bonds in rings can only match bonds in rings of similar size References John W. Raymond and Peter Willett. Maximum common subgraph isomorphism algorithms for the matching of chemical structures. Journal of Computer-Aided Molecular Design, 2002, 16:521-33. Andrea Grosso, Marco Locatelli, and Wayne Pullan. Simple ingredients leading to very efficient heuristics for the maximum clique problem. Journal of Heuristics, 2008, 14(6):587-612. Takeshi Kawabata. Build-up algorithm for atomic correspondence between chemical structures. Journal of Chemical Information and Modeling, 2011 51(8):1775-87.