Searching for
Patent Information in PubChem
Sunghwan Kim (sunghwan.kim@nih.gov),
Paul Thiessen, Asta Gindulyte, Evan Bolton
National Center for Biotechnology Information
National Library of Medicine
National Institutes of Health
ACS Fall 2018 National Meeting in Boston, MA
Sunday, August 19, 2018
2
 NIH’s chemical information resource.
 Collects public-domain chemical data from >620 data sources.
 Disseminates it back to the public free of charge.
What is PubChem?
The Public
Data
Collection
Data
Dissemination
(free of charge)
Government agencies
University labs
Publishers
Pharma Companies
Chemical vendors
Public databases
3
 Data organization in PubChem
Unique chemical
structure extraction
through Standardization
Depositor-provided
substance descriptions
Unique chemical structures
Data Contributors
Substance
deposition
Depositor-provided
Bioactivity test results
Activity of tested
“substances”
Activity of “compounds” derived
from associated “substances”
Assay
deposition
4
Unique chemical
structure extraction
through Standardization
Activity of tested
“substances”
Activity of “compounds” derived
from associated “substances”
Data Contributors
Substance
deposition
Assay
deposition
 Data organization in PubChem
Substance ID (SID)
Compound ID (CID)
Assay ID (AID)
5
 PubChem (https://pubchem.ncbi.nlm.nih.gov)
 PubChem contains:
• >247.2 million substance descriptions,
• >96.4 million unique chemical structures,
• >236.7 million biological activity test results
• >1.25 million biological assays, covering >10,000
unique protein sequence targets.
The largest collection of
publicly available chemical information
from >620 data sources.
(as of August 15, 2018)
6
Patent Information
in PubChem
7
 Patent Information Sources
 SureChEMBL (formerly SureChem)
(https://www.surechembl.org/)
 IBM Almaden Research Center
(https://www.research.ibm.com/labs/almaden/)
 SCRIPDB
(http://dcv.uhnres.utoronto.ca/SCRIPDB/search/)
 NextMove Software
(https://www.nextmovesoftware.com/)
 BindingDB
(https://www.bindingdb.org/)
8
 Patent Information Sources
[Figure: bar chart on a logarithmic axis (1 to 100,000,000) comparing, for each data source (SureChEMBL, IBM, SCRIPDB, NextMove, BindingDB), the number of SIDs with patents, CIDs with patents, and patent IDs]
9
 Patent Information Sources
# Patent IDs 6,858,886
# SIDs with patent links 40,149,647
# CIDs with patent links 21,211,221
# SID-patent pairs 405,234,094
# CID-patent pairs 350,995,421
21 million compounds associated with
6.9 million patent documents.
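As a rough illustration of what these counts imply, the pair counts can be divided by the record counts to get average link densities. This is only a back-of-the-envelope sketch; the variable names are hypothetical and the averages say nothing about how unevenly the links are distributed.

```python
# Counts copied from the slide above.
patent_ids = 6_858_886
sids_with_patents = 40_149_647
cids_with_patents = 21_211_221
sid_patent_pairs = 405_234_094
cid_patent_pairs = 350_995_421

# Average number of patent links per record, and compounds per patent.
patents_per_sid = sid_patent_pairs / sids_with_patents
patents_per_cid = cid_patent_pairs / cids_with_patents
cids_per_patent = cid_patent_pairs / patent_ids

print(f"Average patents per linked SID: {patents_per_sid:.1f}")
print(f"Average patents per linked CID: {patents_per_cid:.1f}")
print(f"Average CIDs per patent ID:     {cids_per_patent:.1f}")
```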
10
How to Access
PubChem Patent Information
11
How to access PubChem patent information
1. How to find patent information for a given chemical.
2. How to find chemicals mentioned in a given patent document.
3. How to retrieve all chemicals with patent information.
4. How to search for chemicals with patent information through:
• Identity/similarity search
• Substructure/superstructure search
5. How to retrieve chemicals associated with a patent classification.
6. How to access patent information programmatically.
12
How to access PubChem patent information
1. How to find patent information for a given chemical.
13
 Compound Summary page
 Provides an aggregated view of all information available in PubChem
for a given chemical.
 Can be accessed:
• from various search/analysis tools
• via a simple URL ending with the CID or common chemical name
(ex) aspirin (CID 2244)
https://pubchem.ncbi.nlm.nih.gov/compound/2244
https://pubchem.ncbi.nlm.nih.gov/compound/aspirin
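The URL pattern above is simple enough to build in code. A minimal sketch, assuming only the pattern shown on the slide; the function name `compound_summary_url` is hypothetical:

```python
from urllib.parse import quote

BASE = "https://pubchem.ncbi.nlm.nih.gov"


def compound_summary_url(identifier):
    """Return the Compound Summary URL for a CID (int) or a chemical name (str)."""
    # Percent-encode the identifier so names with spaces or slashes stay valid.
    return f"{BASE}/compound/{quote(str(identifier), safe='')}"


print(compound_summary_url(2244))
print(compound_summary_url("aspirin"))
```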
14
 Compound Summary page
 Includes patent information on a given chemical.
• Drug patents from FDA Orange Book and DrugBank
• Depositor-provided patents that mention the chemical
• WIPO International Patent Classification
• Related records with patent information
15
Type “rosuvastatin”
16
Select one of the hits
17
Jump to “Patents”
18
Link to
the “Patent View” page
(to be discussed later)
19
Link to the USPTO
page
Link to
the “FDA Orange
Book” page
20
WIPO
International Patent
Classification
(IPC)
24
How to access PubChem patent information
2. How to find chemicals mentioned in a given patent document.
25
 Patent View
 PubChem generates the Patent View page for a patent document
available in PubChem.
 The Patent View provides:
• Patent title and abstract
• Inventor and applicant
• Application and publication dates
• List of chemicals mentioned
• Patent classification information
based on the WIPO International Patent Classification (IPC).
26
 Patent View
 Accessible via a simple web address containing the patent number at
the end.
(ex) The Patent View page for EP0521471:
https://pubchem.ncbi.nlm.nih.gov/patent/EP0521471
 It can also be accessed through several PubChem tools and services
such as:
• Compound Summary
• PubChem Search
• Classification Browser
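The Patent View address can be generated the same way. A minimal sketch based on the EP0521471 example above; the helper name `patent_view_url` is hypothetical, and the page only exists for patent documents that PubChem actually holds:

```python
from urllib.parse import quote


def patent_view_url(patent_id):
    """Return the PubChem Patent View URL for a patent number such as 'EP0521471'."""
    cleaned = patent_id.strip()
    if not cleaned:
        raise ValueError("patent_id must not be empty")
    # The patent number is simply appended to the /patent/ path.
    return f"https://pubchem.ncbi.nlm.nih.gov/patent/{quote(cleaned, safe='')}"


print(patent_view_url("EP0521471"))
```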
27
Go to “PubChem Search” for
structure search!
31
Jump to “Compounds”
36
How to access PubChem patent information
3. How to retrieve all chemicals with patent information.
37
Type “has_patent”[filter]
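The slide shows this filter typed into the search box. As a sketch of how the same query string might be sent programmatically, the code below builds a request URL for the NCBI E-utilities `esearch` endpoint against the `pccompound` database. That the filter behaves identically through E-utilities is an assumption, not something stated in the slides, and the function name is hypothetical. The code only constructs the URL; it does not send the request.

```python
from urllib.parse import urlencode

EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def has_patent_query_url(extra_term=None, retmax=20):
    """Build an esearch URL for compounds carrying the has_patent filter.

    extra_term optionally narrows the search (for example, a chemical name).
    """
    term = '"has_patent"[filter]'
    if extra_term:
        term = f"({extra_term}) AND {term}"
    params = {
        "db": "pccompound",   # Entrez database name for PubChem Compound
        "term": term,
        "retmax": retmax,     # maximum number of IDs returned
        "retmode": "json",
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


print(has_patent_query_url())
print(has_patent_query_url("rosuvastatin", retmax=5))
```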
39
How to access PubChem patent information
4. How to search for chemicals with patent information through:
• Identity/similarity search
• Substructure/superstructure search
40
Go to “PubChem Search” for
structure search!
42
Open the PubChem
Sketcher.
57
How to access PubChem patent information
5. How to retrieve chemicals associated with a patent classification.
58
 Classification Browser
(https://pubchem.ncbi.nlm.nih.gov/classification)
 Browse PubChem data using a classification of interest.
 Search for records annotated with the desired classification/term.
 Available ontologies/classifications:
• MeSH
• ChEBI
• FDA Pharm Classes
• KEGG
• LIPID MAPS classification system for lipids
• PubChem Compound Table of Contents
• PubChem BioAssay Classification
• WHO ATC Code (Anatomical Therapeutic Chemical Classification
System)
• WIPO International Patent Classification
• ……
61
Select “WIPO: International Patent
Classification”
63
Click to retrieve
the 7,597 chemicals
65
 Classification Browser
(https://pubchem.ncbi.nlm.nih.gov/classification)
 Useful for retrieving compounds from a small node (~10^3 compounds)
 Not well suited to retrieving compounds from a very large node (~10^6
compounds)
 This issue will be addressed in the future.
66
How to access PubChem patent information
6. How to access patent information programmatically.
67
PUG-REST
 Representational State Transfer (REST)-style
interface.
 Simplified access route
without the overhead of XML or SOAP envelopes
 Access to data that are not accessible
through other PUG Services.
 Intended to handle short, synchronous requests (<30
seconds).
68
https://pubchem.ncbi.nlm.nih.gov/rest/pug/<INPUT>/<OPERATION>/<OUTPUT>[?OPTIONS]
Prolog (common to all PUG-REST requests): https://pubchem.ncbi.nlm.nih.gov/rest/pug
[?OPTIONS]: options specific to some operations
<INPUT>: specifies the identifiers of interest
• by identifiers
• by chemical name
• by chemical structure search
• by cross reference
• by listkey, ......
<OPERATION>: specifies what to do with the input
• get full records
• get molecular properties
• get synonyms or images
• get cross references
• many other operations
<OUTPUT>: specifies the desired output format
• XML, JSON, JSONP, ASNB, ASNT, PNG, SDF, CSV, TXT
 URL construction for a PUG-REST request
 The three parts are (mostly) independent of each other.
 Many different requests can be built by combining them.
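The three-part URL layout described on this slide can be sketched as a small builder function. This is an illustration of the pattern only, not an official client; the function name `pug_rest_url` and its parameters are hypothetical.

```python
from urllib.parse import urlencode

PROLOG = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pug_rest_url(input_spec, operation, output, options=None):
    """Assemble <PROLOG>/<INPUT>/<OPERATION>/<OUTPUT>[?OPTIONS].

    input_spec -- e.g. "compound/cid/2244"
    operation  -- e.g. "xrefs/PatentID"; may be "" when no operation is given
    output     -- e.g. "TXT" or "JSON"
    options    -- optional dict of operation-specific query parameters
    """
    parts = [PROLOG, input_spec.strip("/"), operation.strip("/"), output.strip("/")]
    url = "/".join(part for part in parts if part)  # skip an empty operation
    if options:
        url += "?" + urlencode(options)
    return url


# Patent IDs cross-referenced to aspirin (CID 2244), as plain text.
print(pug_rest_url("compound/cid/2244", "xrefs/PatentID", "TXT"))
```

Because the three parts are mostly independent, swapping only the output argument (for example, "JSON" instead of "TXT") changes the response format without touching the rest of the request.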
69
 URL construction for a PUG-REST request
 Retrieve all Patent IDs associated with CID 2244.
https://......../rest/pug/compound/cid/2244/xrefs/PatentID/TXT
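A minimal sketch of issuing this request from Python with the standard library. It assumes the elided host is the PUG-REST prolog shown earlier in the deck and that the TXT output lists one patent ID per line; the helper names are hypothetical. The 30-second timeout mirrors the stated limit for synchronous requests.

```python
from urllib.request import urlopen

PROLOG = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def patent_ids_url(cid):
    """URL that asks PUG-REST for the patent IDs cross-referenced to a CID."""
    return f"{PROLOG}/compound/cid/{int(cid)}/xrefs/PatentID/TXT"


def parse_txt_list(text):
    """Split a TXT response into non-empty, stripped lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def fetch_patent_ids(cid, timeout=30):
    """Send the request and return the patent IDs as a list of strings."""
    with urlopen(patent_ids_url(cid), timeout=timeout) as response:
        return parse_txt_list(response.read().decode("utf-8"))


# Example (requires network access):
# patents = fetch_patent_ids(2244)
# print(len(patents), patents[:5])
```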
70
 URL construction for a PUG-REST request
 Retrieve all compounds associated with Patent US20050159403A1.
https://....../rest/pug/compound/xref/PatentID/US20050159403A1/cids/TXT
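The reverse lookup, from a patent to its compounds, can be sketched the same way. Again this assumes the elided host is the PUG-REST prolog shown earlier and that the TXT output contains one CID per line; the function names are hypothetical.

```python
from urllib.parse import quote
from urllib.request import urlopen

PROLOG = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def cids_for_patent_url(patent_id):
    """URL that asks PUG-REST for the CIDs cross-referenced to a patent ID."""
    return f"{PROLOG}/compound/xref/PatentID/{quote(patent_id, safe='')}/cids/TXT"


def parse_cids(text):
    """Convert a whitespace-separated TXT response into a list of integer CIDs."""
    return [int(token) for token in text.split()]


def fetch_cids_for_patent(patent_id, timeout=30):
    """Send the request and return the CIDs mentioned in the patent."""
    with urlopen(cids_for_patent_url(patent_id), timeout=timeout) as response:
        return parse_cids(response.read().decode("utf-8"))


# Example (requires network access):
# cids = fetch_cids_for_patent("US20050159403A1")
# print(len(cids), cids[:10])
```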
71
Limitations of
PubChem Patent Information
72
 Limitations
1. PubChem does not directly extract information from
patents. Instead, it relies on voluntary contributions from
data sources.
• Lag time between PubChem and original data sources.
• If the data sources are wrong, so is PubChem.
• PubChem does not cover all patent documents.
2. Not all patents worldwide are considered.
• Primary focus on USPTO
• EPO, WIPO, JPO
73
 Limitations
3. Multiple patent documents about a single invention (e.g.,
with different kind codes) are aggregated into a single
patent view page.
• It is not possible to tell which of the aggregated documents
mentions a given chemical.
4. Only WIPO IPC is available.
• Cooperative Patent Classification (CPC) information is
not available at this time.
74
Summary
75
 21 M unique compounds associated with 6.9 M
patents from five data sources, including:
• SureChEMBL
• IBM
• SCRIPDB
• NextMove
• BindingDB
 On the Summary page for each compound
• Patent IDs
• Patent Classifications
• FDA Orange Book patents
• Structurally similar compounds with patent
information
76
 Various search types for chemicals with patent
information are supported.
• Text (chemical name) search
• Substructure/superstructure search
• Identity/similarity search
 Classification browser to retrieve compounds with a
given patent classification
 Programmatic access to patent information through
PUG-REST
77
Acknowledgements
Evan Bolton
Jie Chen
Tiejun Cheng
Asta Gindulyte
Jia He
Siqian He
Qingliang Li
Benjamin Shoemaker
Paul Thiessen
Bo Yu
Leonid Zaslavsky
Jian Zhang
 The PubChem Team
 PubChem depositors, users, and collaborators
 Funded by the National Library of Medicine

More Related Content

What's hot

Introduction to Patent Law
Introduction to Patent LawIntroduction to Patent Law
Introduction to Patent Law
Michael E. Dukes
 
Patent search
Patent searchPatent search
Patent search
Pratik Vora
 
Freshersfinals2 150921102829-lva1-app6891
Freshersfinals2 150921102829-lva1-app6891Freshersfinals2 150921102829-lva1-app6891
Freshersfinals2 150921102829-lva1-app6891
Quiz Club IIT Kanpur
 
Introduction to Intellectual Property and Patents
Introduction to Intellectual Property and PatentsIntroduction to Intellectual Property and Patents
Introduction to Intellectual Property and Patents
TT Consultants
 
Patent agent exam 2016 paper I answers
Patent agent exam 2016 paper I answers Patent agent exam 2016 paper I answers
Patent agent exam 2016 paper I answers
mskumar86
 
Authorities under the Patent Act.ppt
Authorities under the Patent Act.pptAuthorities under the Patent Act.ppt
Authorities under the Patent Act.ppt
Sheeraz Ahmed
 
Indian patent act 1970
Indian patent act 1970Indian patent act 1970
Indian patent act 1970
Sagar Savale
 
Introduction to Intellectual Property Rights
Introduction to Intellectual Property Rights Introduction to Intellectual Property Rights
Introduction to Intellectual Property Rights
PharmaTatva
 
Patent search analysis and report
Patent search analysis and reportPatent search analysis and report
Patent search analysis and report
Yash Patel
 
Compulsory liscencing
Compulsory liscencingCompulsory liscencing
Compulsory liscencing
vishnugm
 
non-obviousness and the patenting process
non-obviousness and the patenting processnon-obviousness and the patenting process
non-obviousness and the patenting processwelcometofacebook
 
Ethical and legal issues related to human-derived tissues (II)
Ethical and legal issues related to human-derived tissues (II)Ethical and legal issues related to human-derived tissues (II)
Ethical and legal issues related to human-derived tissues (II)
tbrc
 
Doctrine of equivalants
Doctrine of equivalantsDoctrine of equivalants
Doctrine of equivalantsAltacit Global
 
Prior Art Search - An Overview
Prior Art Search - An OverviewPrior Art Search - An Overview
Prior Art Search - An Overview
Manoj Prajapati
 
Trade-Related Aspects of Intellectual Property Rights (TRIPS)
Trade-Related Aspects of Intellectual Property Rights (TRIPS)Trade-Related Aspects of Intellectual Property Rights (TRIPS)
Trade-Related Aspects of Intellectual Property Rights (TRIPS)
Dr. Prashant Vats
 
Rebound- The General Quiz Prelims
Rebound- The General Quiz PrelimsRebound- The General Quiz Prelims
Rebound- The General Quiz Prelims
Akash Verma
 
IPR in Life Sciences :Unlock & Harness Your Innovative Potential
IPR in Life Sciences :Unlock & Harness Your Innovative PotentialIPR in Life Sciences :Unlock & Harness Your Innovative Potential
IPR in Life Sciences :Unlock & Harness Your Innovative Potential
sabuj kumar chaudhuri
 
3D Bioprinting Presentation.pptx
3D Bioprinting Presentation.pptx3D Bioprinting Presentation.pptx
3D Bioprinting Presentation.pptx
SwapnilUgle
 

What's hot (20)

Introduction to Patent Law
Introduction to Patent LawIntroduction to Patent Law
Introduction to Patent Law
 
Patent search
Patent searchPatent search
Patent search
 
Career in IPR
Career in IPRCareer in IPR
Career in IPR
 
Freshersfinals2 150921102829-lva1-app6891
Freshersfinals2 150921102829-lva1-app6891Freshersfinals2 150921102829-lva1-app6891
Freshersfinals2 150921102829-lva1-app6891
 
Introduction to Intellectual Property and Patents
Introduction to Intellectual Property and PatentsIntroduction to Intellectual Property and Patents
Introduction to Intellectual Property and Patents
 
Patent agent exam 2016 paper I answers
Patent agent exam 2016 paper I answers Patent agent exam 2016 paper I answers
Patent agent exam 2016 paper I answers
 
Authorities under the Patent Act.ppt
Authorities under the Patent Act.pptAuthorities under the Patent Act.ppt
Authorities under the Patent Act.ppt
 
Indian patent act 1970
Indian patent act 1970Indian patent act 1970
Indian patent act 1970
 
Introduction to Intellectual Property Rights
Introduction to Intellectual Property Rights Introduction to Intellectual Property Rights
Introduction to Intellectual Property Rights
 
Patent drafting
Patent draftingPatent drafting
Patent drafting
 
Patent search analysis and report
Patent search analysis and reportPatent search analysis and report
Patent search analysis and report
 
Compulsory liscencing
Compulsory liscencingCompulsory liscencing
Compulsory liscencing
 
non-obviousness and the patenting process
non-obviousness and the patenting processnon-obviousness and the patenting process
non-obviousness and the patenting process
 
Ethical and legal issues related to human-derived tissues (II)
Ethical and legal issues related to human-derived tissues (II)Ethical and legal issues related to human-derived tissues (II)
Ethical and legal issues related to human-derived tissues (II)
 
Doctrine of equivalants
Doctrine of equivalantsDoctrine of equivalants
Doctrine of equivalants
 
Prior Art Search - An Overview
Prior Art Search - An OverviewPrior Art Search - An Overview
Prior Art Search - An Overview
 
Trade-Related Aspects of Intellectual Property Rights (TRIPS)
Trade-Related Aspects of Intellectual Property Rights (TRIPS)Trade-Related Aspects of Intellectual Property Rights (TRIPS)
Trade-Related Aspects of Intellectual Property Rights (TRIPS)
 
Rebound- The General Quiz Prelims
Rebound- The General Quiz PrelimsRebound- The General Quiz Prelims
Rebound- The General Quiz Prelims
 
IPR in Life Sciences :Unlock & Harness Your Innovative Potential
IPR in Life Sciences :Unlock & Harness Your Innovative PotentialIPR in Life Sciences :Unlock & Harness Your Innovative Potential
IPR in Life Sciences :Unlock & Harness Your Innovative Potential
 
3D Bioprinting Presentation.pptx
3D Bioprinting Presentation.pptx3D Bioprinting Presentation.pptx
3D Bioprinting Presentation.pptx
 

Similar to Searching for patent information in PubChem

Cheminformatics Education with PubChem
Cheminformatics Education with PubChemCheminformatics Education with PubChem
Cheminformatics Education with PubChem
Sunghwan Kim
 
Exploiting PubChem for drug discovery based on natural products
Exploiting PubChem for drug discovery based on natural productsExploiting PubChem for drug discovery based on natural products
Exploiting PubChem for drug discovery based on natural products
Sunghwan Kim
 
Exploiting PubChem for Drug Discovery
Exploiting PubChem for Drug DiscoveryExploiting PubChem for Drug Discovery
Exploiting PubChem for Drug Discovery
Sunghwan Kim
 
PubChem as a resource for chemical information education
PubChem as a resource for chemical information educationPubChem as a resource for chemical information education
PubChem as a resource for chemical information education
Sunghwan Kim
 
PubChem: a public chemical information resource for big data chemistry
PubChem: a public chemical information resource for big data chemistryPubChem: a public chemical information resource for big data chemistry
PubChem: a public chemical information resource for big data chemistry
Sunghwan Kim
 
PubChem and Big Data Chemistry
PubChem and Big Data ChemistryPubChem and Big Data Chemistry
PubChem and Big Data Chemistry
Sunghwan Kim
 
PubChem: A Public Chemical Information Resource for Big Data Chemistry
PubChem: A Public Chemical Information Resource for Big Data ChemistryPubChem: A Public Chemical Information Resource for Big Data Chemistry
PubChem: A Public Chemical Information Resource for Big Data Chemistry
Sunghwan Kim
 
PubChem for chemical information literacy training
PubChem for chemical information literacy trainingPubChem for chemical information literacy training
PubChem for chemical information literacy training
Sunghwan Kim
 
PubChem as a resource for chemical information training
PubChem as a resource for chemical information trainingPubChem as a resource for chemical information training
PubChem as a resource for chemical information training
Sunghwan Kim
 
Pubchem
PubchemPubchem
Pubchem
samantlalit
 
Searching for chemical information using PubChem
Searching for chemical information using PubChemSearching for chemical information using PubChem
Searching for chemical information using PubChem
Sunghwan Kim
 
Patents in PubChem
Patents in PubChemPatents in PubChem
Patents in PubChem
Chris Southan
 
Patent chemisty big bang: utilities for SMEs
Patent chemisty big bang: utilities for SMEsPatent chemisty big bang: utilities for SMEs
Patent chemisty big bang: utilities for SMEs
Chris Southan
 
Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
New developments in delivering public access to data from the National Center...
New developments in delivering public access to data from the National Center...New developments in delivering public access to data from the National Center...
New developments in delivering public access to data from the National Center...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Accessing Environmental Chemistry Data via Data Dashboards
Accessing Environmental Chemistry Data via Data DashboardsAccessing Environmental Chemistry Data via Data Dashboards
ChemSpider – A Community Platform for Chemistry and Resources Supporting the ...
ChemSpider – A Community Platform for Chemistry and Resources Supporting the ...ChemSpider – A Community Platform for Chemistry and Resources Supporting the ...
ChemSpider – A Community Platform for Chemistry and Resources Supporting the ...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
US-EPA Chemicals Dashboard – an integrated data hub for environmental science
US-EPA Chemicals Dashboard – an integrated data hub for environmental scienceUS-EPA Chemicals Dashboard – an integrated data hub for environmental science
US-EPA Chemicals Dashboard – an integrated data hub for environmental science
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Environmental chemical information in PubChem
Environmental chemical information in PubChem Environmental chemical information in PubChem
Environmental chemical information in PubChem
Jian Zhang
 

Similar to Searching for patent information in PubChem (20)

Cheminformatics Education with PubChem
Cheminformatics Education with PubChemCheminformatics Education with PubChem
Cheminformatics Education with PubChem
 
Exploiting PubChem for drug discovery based on natural products
Exploiting PubChem for drug discovery based on natural productsExploiting PubChem for drug discovery based on natural products
Exploiting PubChem for drug discovery based on natural products
 
Exploiting PubChem for Drug Discovery
Exploiting PubChem for Drug DiscoveryExploiting PubChem for Drug Discovery
Exploiting PubChem for Drug Discovery
 
PubChem as a resource for chemical information education
PubChem as a resource for chemical information educationPubChem as a resource for chemical information education
PubChem as a resource for chemical information education
 
PubChem: a public chemical information resource for big data chemistry
PubChem: a public chemical information resource for big data chemistryPubChem: a public chemical information resource for big data chemistry
PubChem: a public chemical information resource for big data chemistry
 
PubChem and Big Data Chemistry
PubChem and Big Data ChemistryPubChem and Big Data Chemistry
PubChem and Big Data Chemistry
 
PubChem: A Public Chemical Information Resource for Big Data Chemistry
PubChem: A Public Chemical Information Resource for Big Data ChemistryPubChem: A Public Chemical Information Resource for Big Data Chemistry
PubChem: A Public Chemical Information Resource for Big Data Chemistry
 
PubChem for chemical information literacy training
PubChem for chemical information literacy trainingPubChem for chemical information literacy training
PubChem for chemical information literacy training
 
PubChem as a resource for chemical information training
PubChem as a resource for chemical information trainingPubChem as a resource for chemical information training
PubChem as a resource for chemical information training
 
Pubchem
PubchemPubchem
Pubchem
 
Searching for chemical information using PubChem
Searching for chemical information using PubChemSearching for chemical information using PubChem
Searching for chemical information using PubChem
 
Patents in PubChem
Patents in PubChemPatents in PubChem
Patents in PubChem
 
Patent chemisty big bang: utilities for SMEs
Patent chemisty big bang: utilities for SMEsPatent chemisty big bang: utilities for SMEs
Patent chemisty big bang: utilities for SMEs
 
Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...
 
New developments in delivering public access to data from the National Center...
New developments in delivering public access to data from the National Center...New developments in delivering public access to data from the National Center...
New developments in delivering public access to data from the National Center...
 
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
 
Accessing Environmental Chemistry Data via Data Dashboards
Accessing Environmental Chemistry Data via Data DashboardsAccessing Environmental Chemistry Data via Data Dashboards
Accessing Environmental Chemistry Data via Data Dashboards
 
ChemSpider – A Community Platform for Chemistry and Resources Supporting the ...
ChemSpider – A Community Platform for Chemistry and Resources Supporting the ...ChemSpider – A Community Platform for Chemistry and Resources Supporting the ...
ChemSpider – A Community Platform for Chemistry and Resources Supporting the ...
 
US-EPA Chemicals Dashboard – an integrated data hub for environmental science
US-EPA Chemicals Dashboard – an integrated data hub for environmental scienceUS-EPA Chemicals Dashboard – an integrated data hub for environmental science
US-EPA Chemicals Dashboard – an integrated data hub for environmental science
 
Environmental chemical information in PubChem
Environmental chemical information in PubChem Environmental chemical information in PubChem
Environmental chemical information in PubChem
 

More from Sunghwan Kim

PubChem for drug discovery in the age of big data and artificial intelligence
PubChem for drug discovery in the age of big data and artificial intelligencePubChem for drug discovery in the age of big data and artificial intelligence
PubChem for drug discovery in the age of big data and artificial intelligence
Sunghwan Kim
 
Cheminformatics Online Chemistry Course (OLCC): A Community Effort to Introdu...
Cheminformatics Online Chemistry Course (OLCC): A Community Effort to Introdu...Cheminformatics Online Chemistry Course (OLCC): A Community Effort to Introdu...
Cheminformatics Online Chemistry Course (OLCC): A Community Effort to Introdu...
Sunghwan Kim
 
PubChem as an Emerging Toxicological Information Resource
PubChem as an Emerging Toxicological Information ResourcePubChem as an Emerging Toxicological Information Resource
PubChem as an Emerging Toxicological Information Resource
Sunghwan Kim
 
Toxicological information in PubChem
Toxicological information in PubChemToxicological information in PubChem
Toxicological information in PubChem
Sunghwan Kim
 
Chemical Health and Safety Information in PubChem
Chemical Health and Safety Information in PubChemChemical Health and Safety Information in PubChem
Chemical Health and Safety Information in PubChem
Sunghwan Kim
 
Chemical Structure Standardization and Synonym Filtering in PubChem
Chemical Structure Standardization and Synonym Filtering in PubChemChemical Structure Standardization and Synonym Filtering in PubChem
Chemical Structure Standardization and Synonym Filtering in PubChem
Sunghwan Kim
 
A Brief Overview of Cheminformatics
A Brief Overview of CheminformaticsA Brief Overview of Cheminformatics
A Brief Overview of Cheminformatics
Sunghwan Kim
 
Development of machine learning-based prediction models for chemical modulato...
Development of machine learning-based prediction models for chemical modulato...Development of machine learning-based prediction models for chemical modulato...
Development of machine learning-based prediction models for chemical modulato...
Sunghwan Kim
 
Using open bioactivity data for developing machine-learning prediction models...
Using open bioactivity data for developing machine-learning prediction models...Using open bioactivity data for developing machine-learning prediction models...
Using open bioactivity data for developing machine-learning prediction models...
Sunghwan Kim
 
NCBI Minute: Integrating PubChem into Your Chemistry Teaching
NCBI Minute: Integrating PubChem into Your Chemistry TeachingNCBI Minute: Integrating PubChem into Your Chemistry Teaching
NCBI Minute: Integrating PubChem into Your Chemistry Teaching
Sunghwan Kim
 
How can you access PubChem programmatically?
How can you access PubChem programmatically?How can you access PubChem programmatically?
How can you access PubChem programmatically?
Sunghwan Kim
 

Searching for patent information in PubChem

  • 1. Searching for Patent Information in PubChem Sunghwan Kim (sunghwan.kim@nih.gov), Paul Thiessen, Asta Gindulyte, Evan Bolton National Center for Biotechnology Information National Library of Medicine National Institutes of Health ACS Fall 2018 National Meeting in Boston, MA Sunday, August 19, 2018
  • 2. What is PubChem?
      • NIH’s chemical information resource.
      • Collects public-domain chemical data from >620 data sources.
      • Disseminates it back to the public free of charge.
      • Data contributors: government agencies, university labs, publishers, pharma companies, chemical vendors, public databases.
  • 3. Data organization in PubChem
      • Substance deposition: data contributors provide substance descriptions, from which unique chemical structures are extracted through standardization.
      • Assay deposition: data contributors provide bioactivity test results, recorded as the activity of the tested “substances”. The activity of “compounds” is derived from the associated “substances”.
  • 4. Data organization in PubChem (identifiers)
      • Substance ID (SID): depositor-provided substance descriptions.
      • Compound ID (CID): unique chemical structures extracted through standardization.
      • Assay ID (AID): depositor-provided bioactivity test results.
  • 5. PubChem (https://pubchem.ncbi.nlm.nih.gov)
      The largest collection of publicly available chemical information from >620 data sources. PubChem contains (as of August 15, 2018):
      • >247.2 million substance descriptions,
      • >96.4 million unique chemical structures,
      • >236.7 million biological activity test results,
      • >1.25 million biological assays, covering >10,000 unique protein sequence targets.
  • 7. Patent Information Sources
      • SureChEMBL (formerly SureChem) (https://www.surechembl.org/)
      • IBM Almaden Research Center (https://www.research.ibm.com/labs/almaden/)
      • SCRIPDB (http://dcv.uhnres.utoronto.ca/SCRIPDB/search/)
      • NextMove Software (https://www.nextmovesoftware.com/)
      • BindingDB (https://www.bindingdb.org/)
  • 8. Patent Information Sources [bar chart, axis from 1 to 100,000,000: # SID w/ patent, # CID w/ patent, and # patent IDs for SureChEMBL, IBM, SCRIPDB, NextMove, and BindingDB]
  • 9. Patent Information Sources: 21 million compounds associated with 6.9 million patent documents.
      • # Patent IDs: 6,858,886
      • # SIDs with patent links: 40,149,647
      • # CIDs with patent links: 21,211,221
      • # SID-patent pairs: 405,234,094
      • # CID-patent pairs: 350,995,421
  • 10. How to Access PubChem Patent Information
  • 11. How to access PubChem patent information
      1. How to find patent information for a given chemical.
      2. How to find chemicals mentioned in a given patent document.
      3. How to retrieve all chemicals with patent information.
      4. How to search for chemicals with patent information through:
         • Identity/similarity search
         • Substructure/superstructure search
      5. How to retrieve chemicals associated with a patent classification.
      6. How to access patent information programmatically.
  • 13. Compound Summary page
      • Provides an aggregated view of all information available in PubChem for a given chemical.
      • Can be accessed from various search/analysis tools, or via a simple URL ending with the CID or common chemical name.
      • Example: aspirin (CID 2244)
        https://pubchem.ncbi.nlm.nih.gov/compound/2244
        https://pubchem.ncbi.nlm.nih.gov/compound/aspirin
  • 14. Compound Summary page: includes patent information on a given chemical.
      • Drug patents from FDA Orange Book and DrugBank
      • Depositor-provided patents that mention the chemical
      • WIPO International Patent Classification
      • Related records with patent information
  • 16. Select one of the hits
  • 18. Link to the “Patent View” page (to be discussed later)
  • 19. Link to the USPTO page; link to the “FDA Orange Book” page
  • 25. Patent View
      PubChem generates a Patent View page for each patent document available in PubChem. The Patent View provides:
      • Patent title and abstract
      • Inventor and applicant
      • Application and publication dates
      • List of chemicals mentioned
      • Patent classification information based on the WIPO International Patent Classification (IPC)
  • 26. Patent View
      • Accessible via a simple web address ending with the patent number.
        Example: the Patent View page for EP0521471: https://pubchem.ncbi.nlm.nih.gov/patent/EP0521471
      • It can also be accessed through several PubChem tools and services, such as Compound Summary, PubChem Search, and Classification Browser.
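The two page types described so far (Compound Summary and Patent View) are both reachable by appending an identifier to a fixed web address. The following is a minimal sketch of that URL pattern, based only on the example addresses given on the slides; the function names are hypothetical, and percent-encoding of names is an assumption added here to keep the URLs well-formed.

```python
from urllib.parse import quote

# Base address taken from the example URLs on the slides.
PUBCHEM_BASE = "https://pubchem.ncbi.nlm.nih.gov"


def compound_summary_url(cid_or_name):
    """Build a Compound Summary URL from a CID (int) or a chemical name (str).

    Hypothetical helper: it only mirrors the URL shape shown on the slide.
    """
    # Percent-encode so that names containing spaces or slashes stay in one path segment.
    return f"{PUBCHEM_BASE}/compound/{quote(str(cid_or_name), safe='')}"


def patent_view_url(patent_number):
    """Build a Patent View URL from a patent number such as 'EP0521471'."""
    return f"{PUBCHEM_BASE}/patent/{quote(patent_number.strip(), safe='')}"


if __name__ == "__main__":
    print(compound_summary_url(2244))
    print(compound_summary_url("aspirin"))
    print(patent_view_url("EP0521471"))
```

Run as a script, this prints the three example addresses quoted on slides 13 and 26.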
  • 27. Go to “PubChem Search” for structure search!
  • 40. Go to “PubChem Search” for structure search!
  • 58. Classification Browser (https://pubchem.ncbi.nlm.nih.gov/classification)
      • Browse PubChem data using a classification of interest.
      • Search for records annotated with the desired classification/term.
      • Available ontologies/classifications:
         • MeSH
         • ChEBI
         • FDA Pharm Classes
         • KEGG
         • LIPID MAPS classification system for lipids
         • PubChem Compound Table of Contents
         • PubChem BioAssay Classification
         • WHO ATC Code (Anatomical Therapeutic Chemical Classification System)
         • WIPO International Patent Classification
         • ……
  • 61. Select “WIPO: International Patent Classification”
  • 63. Click to retrieve the 7,597 chemicals
  • 65. Classification Browser (https://pubchem.ncbi.nlm.nih.gov/classification)
      • Useful for retrieving compounds from a small node (~10^3 compounds).
      • Not good for retrieving compounds from a very large node (~10^6 compounds).
      • This issue will be addressed in the future.
  • 67. PUG-REST
      • Representational State Transfer (REST)-style interface.
      • Simplified access route without the overhead of XML or SOAP envelopes.
      • Access to data that are not accessible through other PUG services.
      • Intended to handle short, synchronous requests (<30 seconds).
  • 68. URL construction for a PUG-REST request
      https://pubchem.ncbi.nlm.nih.gov/rest/pug/<INPUT>/<OPERATION>/<OUTPUT>[?OPTIONS]
      • Prolog (https://pubchem.ncbi.nlm.nih.gov/rest/pug/): common to all PUG-REST requests.
      • <INPUT>: specifies the identifiers of interest (by identifier, by chemical name, by chemical structure search, by cross reference, by listkey, ......).
      • <OPERATION>: specifies what to do with the input (get full records, get molecular properties, get synonyms or images, get cross references, and many other operations).
      • <OUTPUT>: specifies the desired output format (XML, JSON, JSONP, ASNB, ASNT, PNG, SDF, CSV, TXT).
      • [?OPTIONS]: options specific to some operations.
      • The three parts are (mostly) independent of each other, so many different requests can be constructed.
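The URL template on slide 68 can be sketched as a small helper that joins the three parts onto the prolog. This is an illustrative sketch, not part of PubChem: `build_pug_rest_url` and its parameter names are hypothetical, and percent-encoding each path segment is an assumption made here so that names with spaces remain valid.

```python
from urllib.parse import quote, urlencode

# Prolog common to all PUG-REST requests, as given on the slide.
PUG_REST_PROLOG = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def build_pug_rest_url(input_parts, operation_parts, output, options=None):
    """Assemble <PROLOG>/<INPUT>/<OPERATION>/<OUTPUT>[?OPTIONS].

    input_parts and operation_parts are sequences of path segments,
    e.g. ["compound", "name", "aspirin"] and ["synonyms"].
    options is an optional dict of query parameters.
    """
    segments = [*input_parts, *operation_parts, output]
    # Encode every segment separately so a "/" inside a value cannot split the path.
    path = "/".join(quote(str(segment), safe="") for segment in segments)
    url = f"{PUG_REST_PROLOG}/{path}"
    if options:
        url += "?" + urlencode(options)
    return url


if __name__ == "__main__":
    # Input by chemical name, operation "get synonyms", output as plain text.
    print(build_pug_rest_url(["compound", "name", "aspirin"], ["synonyms"], "TXT"))
```

Because the three parts are passed separately, swapping the output format or the operation does not require touching the input specification, which mirrors the independence noted on the slide.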
  • 69. URL construction for a PUG-REST request (same URL template as slide 68)
      • Example: retrieve all Patent IDs associated with CID 2244.
        https://......../rest/pug/compound/cid/2244/xrefs/PatentID/TXT
  • 70. URL construction for a PUG-REST request (same URL template as slide 68)
      • Example: retrieve all compounds associated with Patent US20050159403A1.
        https://....../rest/pug/compound/xref/PatentID/US20050159403A1/cids/TXT
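The two example requests on slides 69 and 70 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the full address is assumed to be the prolog from slide 68 followed by the path shown on each slide, the TXT output is assumed to contain one identifier per line, and the 30-second timeout simply mirrors the "short, synchronous requests" note on slide 67.

```python
import urllib.request
from urllib.parse import quote

# Prolog from slide 68; the example slides abbreviate it with dots.
PUG_REST_PROLOG = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def patent_ids_url(cid):
    """URL that lists all Patent IDs linked to one compound (slide 69)."""
    return f"{PUG_REST_PROLOG}/compound/cid/{int(cid)}/xrefs/PatentID/TXT"


def patent_cids_url(patent_id):
    """URL that lists all CIDs linked to one patent document (slide 70)."""
    return f"{PUG_REST_PROLOG}/compound/xref/PatentID/{quote(patent_id, safe='')}/cids/TXT"


def parse_txt_list(body):
    """Split a TXT response into non-empty lines (assumes one identifier per line)."""
    return [line.strip() for line in body.splitlines() if line.strip()]


def fetch_txt_list(url, timeout=30):
    """Send one synchronous GET request and return the identifiers as strings.

    Requires network access; raises urllib.error.HTTPError on a failed request.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_txt_list(response.read().decode("utf-8"))
```

A call such as `fetch_txt_list(patent_ids_url(2244))` would then return the patent identifiers for aspirin as a list of strings, and `fetch_txt_list(patent_cids_url("US20050159403A1"))` the CIDs for that patent; the results depend on the live PubChem data at the time of the request.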
  • 72. Limitations
      1. PubChem does not directly extract information from patents. Instead, it relies on voluntary contributions from data sources.
         • Lag time between PubChem and original data sources.
         • If the data sources are wrong, so is PubChem.
         • PubChem does not cover all patent documents.
      2. Not all patents worldwide are considered.
         • Primary focus on USPTO
         • EPO, WIPO, JPO
  • 73. Limitations
      3. Multiple patent documents about a single invention (e.g., with different kind codes) are aggregated into a single Patent View page.
         • It is not possible to tell which of the aggregated documents mentions a given chemical.
      4. Only WIPO IPC is available.
         • Cooperative Patent Classification (CPC) information is not available at this time.
  • 75. Summary
      • 21 M unique compounds associated with 6.9 M patents from five data sources: SureChEMBL, IBM, SCRIPDB, NextMove, and BindingDB.
      • On the Summary page for each compound:
         • Patent IDs
         • Patent classifications
         • FDA Orange Book patents
         • Structurally similar compounds with patent information
  • 76. Summary
      • Various search types for chemicals with patent information are supported:
         • Text (chemical name) search
         • Substructure/superstructure search
         • Identity/similarity search
      • Classification Browser to retrieve compounds with a given patent classification.
      • Programmatic access to patent information through PUG-REST.
  • 77. Acknowledgements
      • Evan Bolton, Jie Chen, Tiejun Cheng, Asta Gindulyte, Jia He, Siqian He, Qingliang Li, Benjamin Shoemaker, Paul Thiessen, Bo Yu, Leonid Zaslavsky, Jian Zhang
      • The PubChem Team
      • PubChem depositors, users, and collaborators
      • Funded by the National Library of Medicine