Integrating patent chemistry with public and private non-patent research resources

Nicko Goncharoff
Andrew Hinton, PhD
Christopher Southan, PhD

ACS Fall 2012, 19 August

SureChem Data Collection

Database of automatically mined structure data from text and images

• 20M annotated US, EP, WO full text records and Japan patent abstracts
• 12M unique chemical structures
• MEDLINE – 19M abstracts (coming Q4)

• Free resource for researchers
• Enables linking to public and proprietary content

• Professional search needs
• Data export, alerts, patent family search, chemical relevance filters…

• API or Data Feed access to chemistry & full text
• Integrate with internal databases & workflows

Chemistry Mining Workflow

Public Patent Chemistry Landscape

Current Patent Sources in PubChem

[Bar chart: numbers of SIDs per patent source]
  EPO (Sling)        10 K
  Chemicalize.org    280 K
  IBM                2.3 M
  Thomson Pharma     3.7 M

Patent & Literature Sources in PubChem

The Big Three

[Venn diagram of three PubChem sources]
  Thomson Pharma, patents and literature    3,756,283    41% lead-like
  ChEMBL + PubMed + Journals                  918,077    45% lead-like
  IBM, pre-2000 patents                     2,369,481    32% lead-like

Region counts shown in the diagram: 3,291,940; 281,920; 515,745; 52,975; 129,448; 67,437; 2,113,169

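The extracted slide does not say which Venn region each count belongs to. The sketch below tests one possible reading; the region assignment is an assumption, not something the slide states. Under this reading the Thomson Pharma and ChEMBL totals are reproduced exactly, while the IBM regions fall 6,452 short of the stated IBM total, so the assignment should be treated as a plausible reading rather than a confirmed one.

```python
# Totals per source as printed on the slide.
totals = {"TP": 3_756_283, "ChEMBL": 918_077, "IBM": 2_369_481}

# ASSUMED mapping of the seven region counts to Venn regions.
# The slide text does not label the regions; this is one reading.
regions = {
    frozenset({"TP"}): 3_291_940,
    frozenset({"ChEMBL"}): 515_745,
    frozenset({"IBM"}): 2_113_169,
    frozenset({"TP", "ChEMBL"}): 281_920,
    frozenset({"TP", "IBM"}): 129_448,
    frozenset({"ChEMBL", "IBM"}): 67_437,
    frozenset({"TP", "ChEMBL", "IBM"}): 52_975,
}


def source_sum(source: str) -> int:
    """Sum every region that includes the given source."""
    return sum(n for members, n in regions.items() if source in members)


# Difference between the stated total and the summed regions, per source.
gaps = {source: totals[source] - source_sum(source) for source in totals}

for source, gap in gaps.items():
    print(f"{source}: stated {totals[source]:,}, regions sum to {source_sum(source):,}, gap {gap:,}")
```

A gap of zero means the assumed regions account for the whole source; a non-zero gap flags a source whose regions should be checked against the original figure.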
SureChem to Deposit All Structures* into PubChem - 2012

• 1976 to present
• Deposition of structures only
• View related patents in SureChemOpen
• *Some filtering of common chemistry likely

SureChem and IBM in PubChem (2 Example Patents)

[Venn diagrams: structures per patent from each source]

US583593, Inhibitors of squalene synthetase and protein farnesyltransferase, Abbott
  SureChem total: 776    IBM total: 527
  SureChem only: 478     Shared: 298     IBM only: 229

WO-1994018188-A1, 4-hydroxy-benzopyran-2-ones and 4-hydroxy-cycloalkyl[b]pyran-2-ones HIV protease inhibitors, Upjohn
  SureChem total: 832    IBM total: 239
  SureChem only: 686     Shared: 146     IBM only: 93

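The per-patent overlap figures can be derived from the two source totals and the shared count. The following is a minimal sketch using only the numbers on the slide; the class and field names are illustrative and do not come from SureChem or PubChem.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatentOverlap:
    """Structure counts for one patent from two sources (names are illustrative)."""
    patent: str
    surechem_total: int
    ibm_total: int
    shared: int

    @property
    def surechem_only(self) -> int:
        return self.surechem_total - self.shared

    @property
    def ibm_only(self) -> int:
        return self.ibm_total - self.shared

    @property
    def union(self) -> int:
        # Distinct structures across both sources (inclusion-exclusion).
        return self.surechem_total + self.ibm_total - self.shared


examples = [
    PatentOverlap("US583593", surechem_total=776, ibm_total=527, shared=298),
    PatentOverlap("WO-1994018188-A1", surechem_total=832, ibm_total=239, shared=146),
]

for ex in examples:
    print(
        f"{ex.patent}: SureChem only {ex.surechem_only}, shared {ex.shared}, "
        f"IBM only {ex.ibm_only}, distinct structures {ex.union}"
    )
```

In practice the shared count would come from matching structure identifiers between the two depositors rather than being supplied by hand.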
Identifying Relevant Chemistry - IC50

US-20120035195-A1, BACE2, Hoffmann-La Roche

Structures with IC50 Values

US-20120035195-A1
[Screenshots: PDF, SureChemOpen, Excel]

Search IC50 Structures in PubChem

SureChem Unique Contribution

[Venn diagram: SureChem 79; PubChem (Thomson Pharma, Chemicalize) 96]

Stage                              No. of Structures
Available from SureChem (SC)       1848
Pre-exist in PubChem               669
Pre-exist – not from IC50 table    573
Pre-exist – from IC50 table        96 (12 from TP + 84 via chemicalize.org)
Unique-SC with IC50                79
Unique-SC – beyond IC50 table      1100

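The rows of the table are related by simple subtraction, which the sketch below makes explicit. It uses only the counts on the slide. The dictionary keys are illustrative, and it assumes that the "pre-exist from IC50 table" and "Unique-SC with IC50" rows together make up all the IC50-table structures.

```python
# Counts copied from the slide; key names are illustrative.
funnel = {
    "available_from_surechem": 1848,
    "pre_exist_in_pubchem": 669,
    "pre_exist_not_ic50": 573,
    "pre_exist_ic50": 96,
    "unique_sc_ic50": 79,
    "unique_sc_beyond_ic50": 1100,
}
ic50_pre_exist_sources = {"TP": 12, "chemicalize.org": 84}

# Structures SureChem has that were not already in PubChem.
unique_to_surechem = funnel["available_from_surechem"] - funnel["pre_exist_in_pubchem"]

# ASSUMPTION: these two rows together are all structures from the IC50 table.
ic50_total = funnel["pre_exist_ic50"] + funnel["unique_sc_ic50"]
unique_ic50_share = funnel["unique_sc_ic50"] / ic50_total

print(f"Unique to SureChem: {unique_to_surechem}")
print(f"IC50-table structures: {ic50_total}")
print(f"Share of IC50-table structures unique to SureChem: {unique_ic50_share:.1%}")
```

Under that assumption, a little under half of the structures with IC50 values in this patent were not yet in PubChem from any other source.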
Identifying Relevant Chemistry

Patent US-20120035195-A1

http://opentox.informatik.uni-freiburg.de/ches-mapper/

SureChem Chemical Relevance Filtering

• Frequency counts of chemicals within patents
• Additional molecular property filtering, i.e. Lipinski descriptors
• Natural Language Processing-based indexing of Exemplified Compounds

[Figure: Automated indexing of Exemplified Compounds in text]

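The first two filters above can be sketched as follows. This is not SureChem's implementation. It assumes that "frequency counts" means the number of patents a compound appears in, so that widely occurring chemicals such as solvents are dropped, and it applies the standard Lipinski rule-of-five thresholds, tolerating at most one violation. The compound names, descriptor values and thresholds are hypothetical; in practice the descriptors would come from a cheminformatics toolkit.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    mol_weight: float
    logp: float
    h_donors: int
    h_acceptors: int


def lipinski_violations(d: Descriptors) -> int:
    """Count rule-of-five violations (MW > 500, logP > 5, HBD > 5, HBA > 10)."""
    return sum([d.mol_weight > 500, d.logp > 5, d.h_donors > 5, d.h_acceptors > 10])


def patent_frequency(patents: dict[str, list[str]]) -> Counter:
    """Count how many patents mention each compound (once per patent)."""
    counts: Counter = Counter()
    for names in patents.values():
        counts.update(set(names))
    return counts


def filter_relevant(
    patents: dict[str, list[str]],
    descriptors: dict[str, Descriptors],
    max_patent_count: int = 2,   # hypothetical cut-off for "common chemistry"
    max_violations: int = 1,     # hypothetical tolerance for rule-of-five violations
) -> list[str]:
    """Keep compounds that are not too common and are roughly drug-like."""
    counts = patent_frequency(patents)
    return sorted(
        name
        for name, n in counts.items()
        if n <= max_patent_count
        and name in descriptors
        and lipinski_violations(descriptors[name]) <= max_violations
    )


# Hypothetical toy data: three patents and made-up descriptor values.
patents = {
    "P1": ["ethanol", "cmpd-A", "cmpd-B"],
    "P2": ["ethanol", "cmpd-A"],
    "P3": ["ethanol", "cmpd-C"],
}
descriptors = {
    "ethanol": Descriptors(46.07, -0.3, 1, 1),
    "cmpd-A": Descriptors(420.5, 3.2, 2, 6),
    "cmpd-B": Descriptors(812.9, 6.4, 6, 12),
    "cmpd-C": Descriptors(530.0, 4.1, 1, 7),
}

print(filter_relevant(patents, descriptors))
```

Here ethanol is removed because it appears in all three patents, and cmpd-B because it breaks all four rule-of-five limits. The third filter, NLP-based indexing of exemplified compounds, is not covered by this sketch.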
Conclusion

SureChem deposition into PubChem will:

  –  Significantly expand public patent chemistry scope
  –  Contribute unique and timely MedChem-relevant data
  –  Enable open drug discovery and chemical biology
  –  Advance progress toward a more open, federated
     chemical information network
