(ATS6-PLAT01) Chemistry Harmonization: Bringing together the Direct 9 and Pipeline Pilot Chemistry Data Models

Pipeline Pilot Chemistry 9.0 is inheriting many new chemical representations from the Accelrys Direct data model. These include the support of the Self Contained Sequence Representation (SCSR) biologics, enhanced Markush structure representations, Markush homology groups, and Non Specific Structures (NONS). Also significantly enhanced is the support for Sgroups, in particular for polymers, mixtures, and formulations. Further, Pipeline Pilot depiction has been upgraded to support these enhancements and the stereochemical perception and ring perception capabilities were improved based on Direct.
The major benefit of these changes is that Direct and Pipeline Pilot now use the same data model. Searches carried out in Direct or in Pipeline Pilot will return identical results and both products will deliver identical structural perceptions. This session will give guidance on how these changes will impact your calculators and models and how you can plan for a smooth upgrade.

Transcript

  • 1. (ATS6-PLAT01) Chemistry Harmonization: Bringing together the Direct 9 and Pipeline Pilot Chemistry Data Models
      Ton van Daelen, Ph.D., Product Director, Platform, Product Management, ton.vandaelen@accelrys.com
      Keith Taylor, Ph.D., Product Manager, Chemistry, Product Management, keith.taylor@accelrys.com
  • 2. The information on the roadmap and future software development efforts is intended to outline general product direction and should not be relied on in making a purchasing decision.
  • 3. Content
      • We are harmonizing the chemical representations in Pipeline Pilot 9.0 and Direct 9.0
      • Pipeline Pilot, Direct and Draw to adopt best-of-breed features
      • What will you learn?
          – What your scientists need to be aware of
          – How to manage this change as an administrator
  • 4. Direct 9.0 – Changes
      • Note: Direct 9 will return different search results in some cases, consistent with Pipeline Pilot
          – Aromaticity perception now based on Hückel rule (4n+2)
          – Tautomer perception based on Sayle et al. paper
      • Consistency between Pipeline Pilot, Accelrys Direct, and Accelrys Draw
          – Same chemistry, same results everywhere
      * Canonicalization and Enumeration of Tautomers, Sayle and Delany, EuroMUG99, 28-29 October 1999, Cambridge, UK
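The Hückel (4n+2) criterion named on this slide can be illustrated with a minimal Python sketch. It only tests the electron count; the function name is hypothetical, and real aromaticity perception in Direct or Pipeline Pilot also has to decide which atoms contribute electrons and whether the ring is conjugated, none of which is modelled here.

```python
def satisfies_huckel(pi_electrons: int) -> bool:
    """Return True if the count equals 4n + 2 for some integer n >= 0."""
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


if __name__ == "__main__":
    # Pi-electron counts for a few textbook ring systems.
    examples = {
        "benzene": 6,
        "cyclobutadiene": 4,
        "cyclooctatetraene": 8,
        "cyclopentadienyl anion": 6,
    }
    for name, count in examples.items():
        print(f"{name}: {count} pi electrons, Hückel = {satisfies_huckel(count)}")
```

A template-based scheme, by contrast, matches rings against a fixed list of known aromatic patterns, which is why edge cases can differ between the two approaches.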
  • 5. Pipeline Pilot 9.0 – New Capabilities
      Consistency between Pipeline Pilot Chemistry, Direct, and Draw
      • Enhanced representation – ‘What you see is what you have’
      • Depiction engine from Direct and Draw
      • Mappers supporting new representations
      • Calculators upgraded to interpret new representations
      • Enhanced perceptions of stereochemistry, aromaticity, and rings
      Note
      • Changes to perception mean that models and calculators must be relearned and re-baselined
          – Significant effect from new ring perception option
          – Stereochemistry and aromaticity have smaller, but important effect
  • 6. Pipeline Pilot 9.0 – Improved Chemical Representations
      • Single/double/triple bonds supported in NONS
      • Coordination/Dative bond
      • Haptic bonds
      • Markush Homology Groups
      • Hydrogen bonds
  • 7. Pipeline Pilot 9.0 – Depiction
      • Rendering between Accelrys Draw and Pipeline Pilot 9.0 now consistent
      • Pipeline Pilot now supports:
          – PNG
          – JPEG
          – GIF
          – SVG
          – EMF – Linux and Windows!
      • SVG and EMF generation fast
          – ~10,000 structures per second
      [Figure panels: Draw, Pipeline Pilot]
  • 8. Pipeline Pilot 9.0 – Depiction
      • Abbreviated groups are frequently used to simplify structures
      • Attachment points are now correct
          – The Pipeline Pilot 8.x depictions are incorrect on the left of the phenyl group
          – The labels depicted imply different chemical entities
              • Visual corruption
              • Nitrile (CN) and isonitrile (NC) are chemically different
              • NCS and SCN are also different entities
      • Rich text markup renders correctly
      • Whitespace around labels is consistent
          – Affects perceived bond length
      [Figure panels: Draw, Pipeline Pilot]
  • 9. Pipeline Pilot 9.0 – Depiction
      • Markush/Rgroup depiction is complete in Pipeline Pilot rendering
      • Now renders
          – Rgroup definitions (e.g. R1 …)
          – Rgroup logic (R1 = 1; R2 >= 0)
          – Directionality indicated for fragments with multiple attachment points (e.g.“ on R2)
      [Figure panels: Draw, Pipeline Pilot]
  • 10. Pipeline Pilot 9.0 – Depiction
      • Nonspecific (NONS) representations are equivalent with Direct 8.0 and Draw 4.1
          – Pipeline Pilot version does not lose information
      • Examples from mass spectrometry and industrial chemicals
      [Figure panels: Draw, Pipeline Pilot]
  • 11. Biologics
      • Increased focus on biological therapeutics
      • Representation exposed in Pipeline Pilot 8.5
      • Completed in 9.0
          – Much more functional and sophisticated
      [Figure panels: Pipeline Pilot, Draw]
  • 12. Depiction Performance
      [Bar chart: speed in mols/sec for PP 8.5 and PP 9.0, for PNG, JPEG, and SVG output with solid and transparent backgrounds]
      • Creating images for 10K molecules from Maybridge dataset (image size = 128)
      • Higher bars are better
      • PP 9.0 faster than PP 8.5 (except for transparent PNG images)
  • 13. Depiction Parameters
      Added many new parameters and rearranged them to mirror the settings in Accelrys Draw
  • 14. Example: Antibody-Drug Conjugates – New in 9.0
      • Accelrys Direct and Accelrys Draw understand Antibody-Drug Conjugates
  • 15. What does this mean to my scientists? (1)
      • Higher quality reports
          – Supports perception of quality research
      • Enhanced depiction of biologics and Markush generics
          – These look different, and minor adjustments to depiction protocols may be needed
      • New chemical representations
          – No change to existing protocols
          – New opportunities opened up
      • Expect marginal differences in hit sets between Direct 8 and 9 due to different aromaticity and tautomer perceptions
  • 16. Mapping: Non Specific Structures - New
      • Enhanced mapping – New in 9.0, e.g. Imipramine Metabolites
  • 17. Mapping: Homology group screening
      • Screen MDDR data set
          – 129,237 structures screened in ~30s
          – No pre-processing
      [Figure: example queries with hit counts of 470, 108, 45, 16, and 10]
  • 18. Data Model Changes from PP 8.x to PP 9
      • Changes to stereochemical and aromaticity perception drive changes in the behavior of:
          – Learned models
          – Calculators
          – Structure Matchers
      • Need to relearn and re-baseline calculators and models
      • Change is discontinuous (!)
      • There will be no legacy mode
          – Because this will cause incompatibilities and drive confusion
  • 19. Compatibility: Pipeline Pilot and Accelrys Direct
      • PP 9.0 and Direct 9.0 (2013)
          – 100% compatible
      • PP 9.0 and Direct 8.0
          – Only difference is aromaticity perception edge-cases
          – Direct 8.0 uses its current aromaticity perception
              • Template based
          – Differs from that in Pipeline Pilot 9.0
              • Hückel (4n+2) rule based
          – Minor differences will be observed
  • 20. Observed Differences in Calculated Values
      Table shows the number of structures in the datasets that had different values in 9.0 compared with 8.5

      Dataset   | Number of Structures | Canonical SMILES | AlogP | Number of Rings | Number of Aromatic Rings | Number of Stereo Atoms | ECFP4
      ----------|----------------------|------------------|-------|-----------------|--------------------------|------------------------|------
      ACD       | 239,996              | 251              | 105   | 2,455           | 65                       | 0                      | 214
      Asinex    | 137,799              | 26               | 24    | 1,070           | 22                       | 0                      | 43
      Maybridge | 51,058               | 2                | 0     | 438             | 0                        | 0                      | 1
      MDDR 2010 | 201,748              | 62               | 24    | 3,271           | 29                       | 4                      | 46
      WDI       | 53,517               | 37               | 14    | 612             | 10                       | 0                      | 42

      • Difference generally very small
      • Ring perception leads to more prominent differences, especially in drug-like datasets
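To put the ring-count column of this table in proportion, the following Python sketch converts the counts into percentages of each dataset. The numbers are copied from the slide; the variable and function names are only illustrative.

```python
# (number of structures, structures whose ring count changed) per dataset,
# taken from the "Number of Structures" and "Number of Rings" columns.
RING_DIFFERENCES = {
    "ACD": (239_996, 2_455),
    "Asinex": (137_799, 1_070),
    "Maybridge": (51_058, 438),
    "MDDR 2010": (201_748, 3_271),
    "WDI": (53_517, 612),
}


def percent_changed(total: int, changed: int) -> float:
    """Percentage of structures in a dataset with a changed value."""
    return 100.0 * changed / total


ring_pct = {
    name: percent_changed(total, changed)
    for name, (total, changed) in RING_DIFFERENCES.items()
}

if __name__ == "__main__":
    for name, pct in sorted(ring_pct.items(), key=lambda kv: -kv[1]):
        print(f"{name:10s} {pct:.2f}% of structures have a different ring count")
```

Even the ring-count column, the largest in the table, stays below 2% of each dataset, which is consistent with the slide's remark that differences are generally very small.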
  • 21. Effect of Perception Changes
      • Descriptors such as EC Fingerprints, Canonical SMILES, Ring Counts, AlogP could be different from Pipeline Pilot 8.5
      • Results from learned models that use such descriptors could be a little different
          – Retraining the models is recommended
      • Canonical SMILES and feature keys could be different
          – Recalculating database indices is recommended
      • Similarity and substructure searching could also produce different results
  • 22. Effect of Perception Changes
      [Figure: comparison of DrugLike models learned in Pipeline Pilot 8.5 and retrained in Pipeline Pilot 9.0, applied to molecules in the Asinex data set]
      • The results are very similar for most molecules, with larger deviations for a few
  • 23. What do I need to do as an admin?
      • When to upgrade?
          – Use Direct and AEP/PP independently:
              • Upgrade to get new capabilities
          – Use Direct and PP in a mixed environment:
              • As soon as possible in order to benefit from harmonized chemistry
      • If you are using ChemReg
          – Wait until AEP 9.1 is released and do one AEP upgrade
          – AEP 9.1 contains chemistry updates for Direct 9 capabilities
      • What instructions do I give my users?
          – Rebuild learned models and calculators under PP 9.0
      • What testing do I need to do?
          – Run your standard test set and confirm that any differences from baseline are the ones expected from the changes in chemical perception
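The baseline comparison suggested on this slide could be organised along the following lines. This is a minimal Python sketch that assumes the calculated properties have already been exported from both versions into dictionaries keyed by structure ID; the function name, the property names, and the sample values are all hypothetical, and no Pipeline Pilot API is used.

```python
def diff_against_baseline(baseline, current, tolerance=1e-6):
    """Return {property: [structure IDs whose value changed]}.

    baseline and current map structure ID -> {property name: value}.
    Floats are compared within a tolerance, everything else exactly.
    """
    changes = {}
    for struct_id, old_props in baseline.items():
        new_props = current.get(struct_id, {})
        for prop, old in old_props.items():
            new = new_props.get(prop)
            if isinstance(old, float) and isinstance(new, float):
                same = abs(old - new) <= tolerance
            else:
                same = old == new
            if not same:
                changes.setdefault(prop, []).append(struct_id)
    return changes


if __name__ == "__main__":
    # Made-up values, for illustration only.
    baseline_85 = {"mol-1": {"Num_Rings": 2, "ALogP": 1.52},
                   "mol-2": {"Num_Rings": 5, "ALogP": 3.10}}
    current_90 = {"mol-1": {"Num_Rings": 2, "ALogP": 1.52},
                  "mol-2": {"Num_Rings": 6, "ALogP": 3.10}}
    for prop, ids in diff_against_baseline(baseline_85, current_90).items():
        print(f"{prop}: {len(ids)} structure(s) differ -> {ids}")
```

Grouping the differences by property makes it easier to check them against the documented perception changes (rings, aromaticity, stereochemistry) before accepting the new values as the baseline.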
  • 24. Implications for Other Products
      • Direct 9 retains historic APIs and search type
          – Maintenance and interfacing are unchanged
      • All supported versions of Draw are compatible with Direct 9
      • ChemReg 3.2 will be supported on Direct 8 and 9
      • AELN will support Direct 9 in a future release
      • Should I be running Direct 8 and 9 simultaneously for a while?
          – This is possible but not recommended: different search results will confuse users
          – Recommendation: verify your enterprise systems with Direct 9 and then move Direct 9 to production
  • 25. Summary
      • Chemistry harmonization project:
          – PP 9.0 inherits many new chemical representations
          – Existing representations enhanced
          – Aromaticity, stereochemistry and ring perceptions enhanced
          – Significant improvement to depiction aesthetics
      • Accelrys Enterprise Platform, Pipeline Pilot 9 and Direct 9 deliver the same results
  • 26. Where do I go for more information?
      • Resources
          – Admin guides
              • AEP/PP 9
              • Direct 9
          – Chemical representation changes documents
              • AEP/PP 9
              • Direct 9
      • Community / download
          – Log into Accelrys community forums
              • E.g.: https://community.accelrys.com/community/accelrys_direct__draw__and_jdraw
      • Accelrys is there to help
          – Customer support – upgrade strategies
          – Professional services – upgrade service
  • 27. Appendix
      • Additional Slides
  • 28. Harmonization delivers
      • Single chemistry foundation with single data model implemented in a single code stream
          – Adopted by Tools and Platform
              • Direct, Pipeline Pilot and Accelrys Enterprise Platform
          – Application Stack inherits all of the chemistry capabilities
      • Simplifies development and application environment
      • Enhances our ability to deliver new functionality more quickly across the products
  • 29. Other New Features in PP 9.0
      • Component for reaction-based tautomer enumeration
          – Based on a set of twenty-one SMIRKS described in "Tautomerism in Large Databases", Sitzmann, M.; Ihlenfeldt, W.D. & Nicklaus, M. C., J. Comput. Aided Mol. Des., 2010, 24, 521-551
      • Components to do Data Fusion and to Rank Similarities
          – Based on “Combination of Similarity Rankings Using Data Fusion”, Peter Willett, J. Chem. Inf. Model., 2013, 53, 1−10
      • Bad Isotope Filter now flags radioactive isotopes
      • Components to check structures for querying or registration
      • Customizable external elements table (PTable)
      • Alternative method to calculate atom-atom mappings in reactions
  • 30. Harmonization of Mapping Functionality
      • Ported CHRP mapper (FSMapper) to Pipeline Pilot source base
      • New mapping components decide automatically (user doesn’t know or care) which mapper to use (PP SGMapper or new FSMapper), depending on the molecular features present in queries and targets
      • FSMapper is used for
          – Reactions
          – Rgroups with two attachments
          – Polymers and link nodes
          – Variable-attachment bonds (Markush bonds)
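The automatic mapper choice described on this slide amounts to a feature-based dispatch. The Python sketch below shows one way such a rule could look; the feature labels and the function are hypothetical, and the slide does not say how the actual components detect these features or break ties.

```python
# Hypothetical labels for the features that the slide lists as FSMapper cases.
FSMAPPER_FEATURES = frozenset({
    "reaction",
    "rgroup_two_attachments",
    "polymer",
    "link_node",
    "variable_attachment_bond",
})


def choose_mapper(query_features, target_features):
    """Pick a mapper name from the features seen in the query and the target."""
    present = set(query_features) | set(target_features)
    return "FSMapper" if present & FSMAPPER_FEATURES else "SGMapper"


if __name__ == "__main__":
    print(choose_mapper({"aromatic_ring"}, {"stereo_center"}))
    print(choose_mapper({"link_node"}, set()))
    print(choose_mapper(set(), {"polymer", "stereo_center"}))
```

Checking both the query and the target matters because either side can carry a feature that only one of the mappers handles.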
  • 31. Harmonization of Mapping Functionality
      • New mapping components
          – Work with queries from Tag and from File
      • Old mapping components are in a deprecated folder
          – Use only PP SGMapper (don’t handle all the new features)
          – Can be used to reproduce previous mapping behavior if needed
  • 32. Valence and Implicit Hydrogens
      • Charged non-metals are now treated as their “isoelectronic” equivalent:
          – B- ~ C ~ N+ ~ O+2 ~ F+3
          – Si- ~ P ~ S+ ~ Cl+2
      • The bad valence filter is improved and now catches more bad anions.
      • Metal anions no longer have implicit hydrogens
          – Aluminum anions are an exception (for support of aluminum hydride anion)
      • Nitrogen (V) is still allowed as a drawing alternative for nitro- and diazo- groups, amine oxides, and related substructures. However, the application is now less likely to perceive uncharged quaternary nitrogens as implicit hydrogens.
      • Atoms with illegal valence are now better distinguished from atoms with maximum valence in ECFP fingerprint bits. For example, the Oxygen in N=O and N#O is now typed differently. This can affect the Canonical SMILES atom order for structures containing atoms with illegal valence.
      • The changes in valence result in changes to ECFP fingerprint bits and Canonical SMILES.
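The two isoelectronic series on this slide can be written down as a small lookup, together with a check that every member of a series has the same number of valence electrons. This Python sketch only restates the slide's two series; the function names are hypothetical and nothing here reflects how the product implements the rule.

```python
# Valence electrons of the neutral atoms that appear in the two series.
VALENCE_ELECTRONS = {
    "B": 3, "C": 4, "N": 5, "O": 6, "F": 7,
    "Si": 4, "P": 5, "S": 6, "Cl": 7,
}

# Series from the slide, keyed by the neutral member: (element, formal charge).
ISOELECTRONIC_SERIES = {
    "C": [("B", -1), ("C", 0), ("N", 1), ("O", 2), ("F", 3)],
    "P": [("Si", -1), ("P", 0), ("S", 1), ("Cl", 2)],
}


def neutral_equivalent(element, charge):
    """Return the neutral element of the series, or None if not listed."""
    for neutral, members in ISOELECTRONIC_SERIES.items():
        if (element, charge) in members:
            return neutral
    return None


def series_is_isoelectronic(members):
    """True if every (element, charge) pair has the same valence electron count."""
    counts = {VALENCE_ELECTRONS[el] - q for el, q in members}
    return len(counts) == 1


if __name__ == "__main__":
    print(neutral_equivalent("N", 1))   # ammonium-type nitrogen
    print(neutral_equivalent("B", -1))  # borate-type boron
    print(all(series_is_isoelectronic(m) for m in ISOELECTRONIC_SERIES.values()))
```

Each member of the first series has four valence electrons after accounting for charge, and each member of the second has five, which is why they can share the valence rules of carbon and phosphorus respectively.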
  • 33. Rings
      Ring perception is improved. Previously, the SSSR ring perception algorithm was used, which is not unique and often misses rings in complex non-planar assemblies, when they are atom-order and bond-order dependent. The unique “K-rings” perception algorithm is now used, which is the union of all possible SSSR sets. These changes result in changes to Canonical SMILES and improved aromaticity perception.
      Examples
      • Now perceived as 3 rings: [structure image not in transcript]
      • Now perceived as 4 rings: [structure image not in transcript]
      • Now perceived as 6 rings: [structure image not in transcript]
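Why an SSSR is not unique can be seen on a cube-shaped carbon skeleton such as cubane, used here purely as an assumed example (the slide's own example structures are not in the transcript). The size of any SSSR equals the cyclomatic number, bonds minus atoms plus one, which is 5 for a cube, yet the cube has 6 equivalent four-membered faces, so one face is always left out. The Python sketch below counts both; it is not the K-rings algorithm itself.

```python
from itertools import combinations


def cube_graph():
    """Edges of a cube: vertices 0..7, joined when labels differ in one bit."""
    edges = set()
    for v in range(8):
        for bit in (1, 2, 4):
            edges.add(frozenset((v, v ^ bit)))
    return edges


def cyclomatic_number(n_atoms, edges, n_components=1):
    """Number of rings in any SSSR: bonds - atoms + connected components."""
    return len(edges) - n_atoms + n_components


def four_cycles(n_atoms, edges):
    """All four-membered cycles, each returned as a frozenset of its atoms."""
    adj = {v: set() for v in range(n_atoms)}
    for e in edges:
        a, b = tuple(e)
        adj[a].add(b)
        adj[b].add(a)
    cycles = set()
    for a, c in combinations(range(n_atoms), 2):
        # a and c are opposite corners of a 4-cycle when they share two neighbours.
        for b, d in combinations(sorted(adj[a] & adj[c]), 2):
            cycles.add(frozenset((a, b, c, d)))
    return cycles


if __name__ == "__main__":
    edges = cube_graph()
    print("SSSR size:", cyclomatic_number(8, edges))
    print("four-membered faces:", len(four_cycles(8, edges)))
```

Any five of the six faces form a valid SSSR, so which face is dropped depends on atom and bond order; taking the union of all such sets, as the slide describes for K-rings, keeps all six.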
  • 34. Aromaticity Perception
      • The isoelectronic equivalence enhancement described in Valence and Implicit Hydrogens improves the perception of ring systems containing charged non-metals. Improved detection of bad valence for anions also contributes to improved perception of aromaticity.
      • The set of atoms that can contribute a lone pair to an aromatic ring is extended (from N, O, P, S) to include As, Se, and Te.
      • These changes result in changes to ECFP fingerprint bits and Canonical SMILES.
      Examples
      • Now perceived as aromatic: [structure image not in transcript]
      • No longer perceived as aromatic: [structure image not in transcript]
  • 35. Stereochemical Perception
      • The isoelectronic equivalence enhancement described in Valence and Implicit Hydrogens improves the perception of stereogenic centers that include charged non-metals.
      • The symmetric equivalence of O-/OH/=O groups attached to P and S atoms has been extended to include As, Se, and Te centers.
      • Stereo validation logic of reader code is synchronized with perception code. This allows for more consistent application of rules prohibiting S(IV) centers, P(V) centers, symmetric equivalence of O-/OH/=O, etc.
      • “Double-symmetric” ring atom perception is improved. Several symmetric spiro cases are now correctly not marked as pseudo-stereo.
      Examples
      • Now perceived as stereo: [structure image not in transcript]
      • More consistently perceived as not stereo: [structure image not in transcript]
  • 36. OpenEye Molecule To Name <> Components
      • Multi-language
      • Options to use HTML tags and special characters
          – 2,3,4,5-tetrahydro-1&lambda;<sup>6</sup>,4-benzothiazepine 1,1-dioxide
          – 2,3,4,5-tetrahydro-1λ6,4-benzothiazepine 1,1-dioxide
      • IUPAC, trivial, and commercial names
          – 2-[4-[(3,5-dichloro-4-pyridyl) oxy]phenyl] acetonitrile
          – leucine
          – tylenol
      • Important?
  • 37. OpenEye Molecule To Name Component
  • 38. OpenEye Molecule To Name Component
      • Options to use HTML tags and special characters
          – 2,3,4,5-tetrahydro-1λ6,4-benzothiazepine 1,1-dioxide
          – 2,3,4,5-tetrahydro-1&lambda;<sup>6</sup>,4-benzothiazepine 1,1-dioxide
  • 39. OpenEye Molecule From Name Component
      • 2-[4-[(3,5-dichloro-4-pyridyl) oxy]phenyl] acetonitrile
      • leucine
      • tylenol
  • 40. New Science
      • Scaffold Tree
          – Based on "The Scaffold Tree, Visualization of the Scaffold Universe by Hierarchical Scaffold Classification", Schuffenhauer, A., Ertl, P., Roggo, S., Wetzel, S., Koch, M. A., Waldmann, H., J. Chem. Inf. Model. 2007, 47, 47-58
      • Quantitative Estimate of Drug-Likeness (QED)
          – Based on “Quantifying the Chemical Beauty of Drugs”, G. Richard Bickerton, Gaia V. Paolini, Jérémy Besnard, Sorel Muresan, Andrew L. Hopkins, Nature Chemistry 4, 90–98 (2012)
      • Synthetic Accessibility (SAscore)
          – Based on “Estimation of Synthetic Accessibility Score of Drug-like Molecules Based on Molecular Complexity and Fragment Contributions”, Peter Ertl and Ansgar Schuffenhauer, Journal of Cheminformatics, 2009, 1:8