ChemAxon UGM, San Diego, USA 25th September 2013
Recent improvements in Marvin v6:
Reaction Atom Mapping and its Application to
Reaction Validation in Pharmaceutical ELNs
Daniel Lowe and Roger Sayle
NextMove Software
Cambridge, UK
What is Atom-Mapping?
[Figure: mapping algorithm]
Why Perform Atom-Mapping?
• Assigning roles to reagents
• Normalization of reactions for registration
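As a rough illustration of role assignment, the sketch below labels each left-hand component of an atom-mapped reaction SMILES as a reactant if it contributes at least one mapped atom to the product, and as a reagent otherwise. The function names and the example reaction are hypothetical and are not taken from Marvin or the slides; splitting on "." also treats every fragment of a salt as a separate component, which a real implementation would need to handle.

```python
from __future__ import annotations

import re

# Matches the atom-map number inside a bracket atom, e.g. the 1 in [CH3:1].
MAP_RE = re.compile(r":(\d+)\]")


def map_numbers(smiles: str) -> set[int]:
    """Return the non-zero atom-map numbers found in a SMILES string."""
    return {int(n) for n in MAP_RE.findall(smiles)} - {0}


def assign_roles(rxn: str) -> dict[str, str]:
    """Label each left-hand component as 'reactant' or 'reagent'.

    Assumption: a component is a reactant when at least one of its
    mapped atoms also appears in the product side of the reaction.
    """
    left, _agents, right = rxn.split(">")
    product_maps = map_numbers(right)
    roles = {}
    for component in left.split("."):
        shared = map_numbers(component) & product_maps
        roles[component] = "reactant" if shared else "reagent"
    return roles


# Hypothetical example: amide formation drawn with triethylamine on the left.
rxn = (
    "[CH3:1][C:2](=[O:3])Cl.[NH2:4][CH3:5].CCN(CC)CC"
    ">>[CH3:1][C:2](=[O:3])[NH:4][CH3:5]"
)
for component, role in assign_roles(rxn).items():
    print(role, component)
```

Here the acid chloride and the amine share mapped atoms with the product, while the unmapped triethylamine is labelled a reagent.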
Why Perform Atom-Mapping?
• More precise database searches
– Solvents/catalysts can be distinguished from
reactants
– Allows the relationship between the reactant
atoms and product atoms to be made explicit
Example
• I want to find reactions converting an alkene
to a cyclopropane so I search for C=C>>C1CC1
Why Perform Atom-Mapping?
• Identifying suspect reactions:
ChemAxon atom mapping
Atom mapping modes
• Complete
• Changing
• Matching
Methodology
Test set                                                          Reactions
Pharmaceutical ELN subset                                            18,244
ChemReact68 database                                                 67,926
SPRESI database subset                                                5,230
Reactions extracted from 2008-2011 USPTO patent applications*       562,872
* Lowe, D. M. Automated Extraction of Reactions from the Patent Literature. 243rd ACS National Meeting & Exposition, San Diego, CA, March 27, 2012.
Metrics used
• Were all product atoms mapped?
– Measures recall
• How many C-C bonds were broken?
– Measures precision
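The two metrics can be sketched on a toy representation in which atoms are identified by their atom-map numbers and bonds are unordered pairs of those numbers. This is only one reading of the metrics: the slides do not define exactly when a C-C bond counts as broken, so the rule used here (a reactant C-C bond with at least one end in the product that is not a product bond) is an assumption, as are all function names and the hand-built ethyl acetate hydrolysis example.

```python
from __future__ import annotations


def bonds(*pairs: tuple[int, int]) -> set[frozenset[int]]:
    """Build a set of unordered bonds from pairs of atom-map numbers."""
    return {frozenset(p) for p in pairs}


def all_product_atoms_mapped(product_atom_maps: list[int]) -> bool:
    """Recall-style check: every product atom has a non-zero map number."""
    return all(m > 0 for m in product_atom_maps)


def cc_bonds_broken(elements, reactant_bonds, product_bonds) -> int:
    """Precision-style check: count C-C bonds the mapping implies were broken.

    Assumed rule: a reactant C-C bond is broken if at least one of its
    atoms appears in the product but the bond itself does not.
    """
    product_atoms = set().union(*product_bonds)
    broken = 0
    for bond in reactant_bonds:
        is_cc = all(elements[a] == "C" for a in bond)
        survives = bool(bond & product_atoms)
        if is_cc and survives and bond not in product_bonds:
            broken += 1
    return broken


# Ethyl acetate (atoms 1-6) + water (atom 7) -> acetic acid.
elements = {1: "C", 2: "C", 3: "O", 4: "O", 5: "C", 6: "C", 7: "O"}
reactant = bonds((1, 2), (2, 3), (2, 4), (4, 5), (5, 6))

# Sensible mapping: the acetyl C-C bond (1-2) is kept.
good_product = bonds((1, 2), (2, 3), (2, 7))
# Poor mapping: the product methyl is mapped to the ethyl CH3 (atom 6).
poor_product = bonds((6, 2), (2, 3), (2, 4))

print(cc_bonds_broken(elements, reactant, good_product))
print(cc_bonds_broken(elements, reactant, poor_product))
```

Both mappings map every product atom, so the first metric cannot tell them apart; the second metric penalises the poor mapping because it implies two needless C-C cleavages.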
Ability to map all product atoms
[Figure: bar chart of the percentage of reactions with all product atoms mapped (y-axis 0 to 80) for the PharmaELN, ChemReact68, SPRESI and USPTO test sets; series: Marvin 5.10, Marvin 6.0, ChemDraw 12]
C-C bonds broken
[Figure: bar chart of the average number of C-C bonds broken per mapping (lower is better; y-axis 0.0 to 1.2) for the PharmaELN, ChemReact68, SPRESI and USPTO test sets; series: Marvin 5.10, Marvin 6.0, ChemDraw 12]
[Figure: panels labelled Marvin 5.10, ChemDraw 12 and Marvin 6.0]
Speed Comparison
[Figure: bar chart of reactions mapped per second (y-axis 0 to 350); bars: Marvin 5.12, Marvin 6.0, Marvin 6.0 (multithreaded)]
* Comparison performed on the PharmaELN dataset on an i7-2600
Difficult cases
ΔT
Areas for improvement:
Implicit stoichiometry
Areas for improvement:
Many choices for reactant atom mapping
Consensus Methods
[Figure: bar chart of the percentage of reactions with all product atoms mapped (y-axis 0 to 100) on the PharmaELN test set; series: Marvin 6.0, ChemDraw 12, Marvin6 + ChemDraw12, Consensus Result*]
* Marvin 6.0 + ChemDraw12 + 2 variants of GGA’s Indigo toolkit + InfoChem ICMap + Pipeline Pilot + MDL Cheshire
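The slides do not say how the consensus result was computed. One simple possibility, shown here purely as an assumption, is a vote: each tool proposes a mapping, and a mapping is accepted only when enough tools propose the same one. The sketch assumes every tool's output has already been converted to a common form, a frozenset of (reactant atom, product atom) index pairs in a shared atom numbering; that conversion, and the name consensus_mapping, are hypothetical.

```python
from __future__ import annotations

from collections import Counter


def consensus_mapping(mappings, min_votes):
    """Return the mapping proposed by at least min_votes tools, else None.

    mappings: one entry per tool, either a frozenset of
    (reactant_atom, product_atom) pairs or None if the tool failed.
    """
    votes = Counter(m for m in mappings if m is not None)
    if not votes:
        return None
    best, count = votes.most_common(1)[0]
    return best if count >= min_votes else None


# Three hypothetical tools: two agree, one proposes a swapped mapping.
tool_a = frozenset({(1, 1), (2, 2)})
tool_b = frozenset({(1, 1), (2, 2)})
tool_c = frozenset({(1, 2), (2, 1)})

print(consensus_mapping([tool_a, tool_b, tool_c], min_votes=2))
```

Exact set comparison treats symmetry-equivalent mappings (for example, the two oxygens of a carboxylate swapped) as disagreements, so a practical version would need to canonicalise mappings before voting.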
Beyond atom mapping
• Missing reactants (often for routine reactions)
Beyond atom mapping
• Change of stereoisomer or chiral resolution
(E)-3-{8-[2-(4-Isopropyl-1,3-thiazol-2-yl)ethyl]-2-methoxy-4-oxo-4H-pyrido[1,2-a]pyrimidin-3-yl}-2-propenoic acid (1 mg) was dissolved in CDCl3 (0.5 ml) and irradiated with light from a fluorescent lamp for 19 hours. The solvent was evaporated to obtain the title compound (1 mg).
Atom mapping + classification
[Figure: bar chart of the percentage of reactions with all product atoms mapped (y-axis 0 to 100), comparing "Atom mapping algorithms alone" with "Combined with NameRXN"; series: Marvin 6.0, ChemDraw 12, Consensus Result, Verified / Recognised by NameRXN (71%)]
Conclusions
• Marvin v6’s atom mapping algorithm provides
large improvements in recall, precision and speed
over v5
• Atom mapping in some cases isn’t as simple as
finding a maximum common subgraph mapping
• Classification algorithms can be useful for the
validation of some reactions
Acknowledgements
• Zsolt Mohacsi and Istvan Rabel, ChemAxon
• Ed Griffen and Nick Tomkinson, AstraZeneca
• Andrew Wooster, GSK
• Hans Kraut, InfoChem
• Thank you for your time.