Integrating Patents with Research Data

SureChem ACS 2012. Presented by Nico on behalf of all three authors. The data is searchable at https://open.surechem.com/login. Related information includes recent posts at http://cdsouthan.blogspot.se/

Transcript

  • 1. Integrating patent chemistry with public and private non-patent research resources. Nicko Goncharoff; Andrew Hinton, PhD; Christopher Southan, PhD. ACS Fall 2012, 19 August.
  • 2. SureChem Data Collection. Database of automatically mined structure data from text and images:
    • 20M annotated US, EP, WO full-text records and Japan patent abstracts
    • 12M unique chemical structures
    • MEDLINE: 19M abstracts (coming Q4)
  • 3. Free resource for researchers: enables linking to public and proprietary content. Professional search needs: data export, alerts, patent family search, chemical relevance filters… API or Data Feed access to chemistry & full text: integrate with internal databases & workflows.
  • 4. Chemistry Mining Workflow
  • 5. Public Patent Chemistry Landscape
  • 6. Current Patent Sources in PubChem. [Bar chart: numbers of SIDs by source (EPO (Sling), Chemicalize.org, IBM, Thomson Pharma); bar labels 3.7 M, 2.3 M, 280 K and 10 K.]
  • 7. Patent & Literature Sources in PubChem: The Big Three. [Venn diagram]
    • Thomson Pharma, patents and literature: 3,756,283; 41% lead-like
    • ChEMBL + PubMed + Journals: 918,077; 45% lead-like
    • IBM, pre-2000 patents: 2,369,481; 32% lead-like
    • Region counts shown in the diagram: 3,291,940; 281,920; 515,745; 52,975; 129,448; 67,437; 2,113,169
  • 8. SureChem to Deposit All Structures* into PubChem, 2012
    • 1976 to present
    • Deposition of structures only
    • View related patents in SureChemOpen
    • *Some filtering of common chemistry likely
  • 9. SureChem and IBM in PubChem (2 Example Patents)
    • US583593, Inhibitors of squalene synthetase and protein farnesyltransferase, Abbott. SureChem total: 776; IBM total: 527. Venn counts: 478 SureChem only, 298 shared, 229 IBM only.
    • WO-1994018188-A1, 4-hydroxy-benzopyran-2-ones and 4-hydroxy-cycloalkyl[b]pyran-2-ones HIV protease inhibitors, Upjohn. SureChem total: 832; IBM total: 239. Venn counts: 686 SureChem only, 146 shared, 93 IBM only.
  • 10. Identifying Relevant Chemistry: IC50. US-20120035195-A1, BACE2, Hoffmann-La Roche
  • 11. Structures with IC50 Values. US-20120035195-A1: PDF, SureChemOpen, Excel
  • 12. Search IC50 Structures in PubChem
  • 13. SureChem Unique Contribution. [Venn diagram: SureChem 79; PubChem 96 (Thomson Pharma, Chemicalize)]
    Stage                                   No. of Structures
    Available from SureChem (SC)            1848
    Pre-exist in PubChem                    669
    Pre-exist (not from IC50 table)         573
    Pre-exist (from IC50 table)             96 (12 from TP + 84 via chemicalize.org)
    Unique-SC with IC50                     79
    Unique-SC (beyond IC50 table)           1100
  • 14. Identifying Relevant Chemistry. Patent US-20120035195-A1. http://opentox.informatik.uni-freiburg.de/ches-mapper/
  • 15. SureChem Chemical Relevance Filtering
    • Frequency counts of chemicals within patents
    • Additional molecular property filtering, i.e. Lipinski descriptors
    • Natural Language Processing-based indexing of Exemplified Compounds: automated indexing of Exemplified Compounds in text
  • 16. Conclusion. SureChem deposition into PubChem will:
    • Significantly expand public patent chemistry scope
    • Contribute unique and timely MedChem-relevant data
    • Enable open drug discovery and chemical biology
    • Advance progress toward a more open, federated chemical information network
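
The breakdown on slide 13 is a simple set accounting: structures SureChem extracted from US-20120035195-A1 are split into those already in PubChem and those unique to SureChem, each with and without an IC50 value. A minimal Python sketch of that bookkeeping follows. It uses only the counts quoted on the slide; the function and key names are illustrative assumptions, not part of any SureChem or PubChem API, and the IC50 total assumes the two IC50 rows are disjoint.

```python
def account_for_structures(available, preexist_other, preexist_ic50,
                           unique_ic50, unique_other):
    """Check that the slide's categories add up to the stated total."""
    preexisting = preexist_other + preexist_ic50   # already in PubChem
    unique = unique_ic50 + unique_other            # new with SureChem
    return {
        "preexisting": preexisting,
        "unique_to_surechem": unique,
        # Assumption: IC50 structures = pre-existing IC50 + unique IC50.
        "ic50_total": preexist_ic50 + unique_ic50,
        "consistent": preexisting + unique == available,
    }


# Counts as quoted on slide 13.
slide13 = account_for_structures(
    available=1848,
    preexist_other=573,
    preexist_ic50=96,
    unique_ic50=79,
    unique_other=1100,
)
print(slide13)
```

With the slide's figures, the pre-existing rows sum to 669 and the two unique rows sum to 1179, which together give the 1848 structures available from SureChem.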
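
The first two filters on slide 15 (frequency counts and Lipinski-style property filtering) can be sketched in a few lines of Python. This is not SureChem's implementation, which the slides do not describe. It assumes that a chemical appearing in a large share of patents is common chemistry to drop, that descriptors are precomputed elsewhere, and that one rule-of-five violation is tolerated. The thresholds, names and toy values below are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, NamedTuple


class Compound(NamedTuple):
    name: str
    mol_weight: float   # daltons, precomputed
    logp: float         # precomputed
    h_donors: int
    h_acceptors: int


def lipinski_violations(c: Compound) -> int:
    """Count rule-of-five violations from precomputed descriptors."""
    return sum((c.mol_weight > 500, c.logp > 5,
                c.h_donors > 5, c.h_acceptors > 10))


def filter_relevant(patents: Dict[str, List[Compound]],
                    max_patent_share: float = 0.5,
                    max_violations: int = 1) -> Dict[str, List[str]]:
    """Keep compounds that are neither ubiquitous nor far from lead-like."""
    # In how many patents does each compound occur (once per patent)?
    patent_freq = Counter(
        name for cs in patents.values() for name in {c.name for c in cs}
    )
    limit = max_patent_share * len(patents)
    return {
        pid: sorted({
            c.name for c in cs
            if patent_freq[c.name] <= limit
            and lipinski_violations(c) <= max_violations
        })
        for pid, cs in patents.items()
    }


# Toy data with made-up descriptor values, for illustration only.
solvent = Compound("solvent_X", 46.0, -0.3, 1, 1)
cmpd_a = Compound("cmpd_A", 420.0, 3.2, 2, 6)
cmpd_b = Compound("cmpd_B", 780.0, 6.5, 6, 12)
cmpd_c = Compound("cmpd_C", 510.0, 4.0, 2, 7)
toy_patents = {
    "P1": [solvent, cmpd_a, cmpd_c],
    "P2": [solvent, cmpd_b, cmpd_c],
    "P3": [solvent],
    "P4": [solvent],
}
print(filter_relevant(toy_patents))
```

In this toy run the solvent is dropped because it occurs in every patent, and cmpd_B is dropped for violating all four rules, leaving cmpd_A and cmpd_C. The third filter on the slide, NLP-based indexing of exemplified compounds, is not covered by this sketch.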