ICIC 2014 Chemical Patent Curation and Management – New Tools and Capabilities

Understanding competitors’ patent portfolios and protecting their own intellectual property are key concerns for pharmaceutical companies. Extracting and analyzing the chemical space covered by these patents is extremely complex and time-consuming, and it requires many rounds of communication between IP experts and members of drug discovery teams. ChemAxon has been working with industry researchers to develop tools that help in this area by building and analyzing project-specific databases based on high-quality, computer-assisted extraction of chemical information from patent documents. These databases can be useful across the full drug discovery process, from idea generation to lead candidate selection, drug design and the creation of new patents. This way we can eliminate these rounds of communication, because IP experts can precisely translate the content of patent documents into the language of chemistry, which is more comprehensible to the other participants. This presentation discusses the results of this development and the technologies developed or used, namely: English and Chinese Name to Structure, which dramatically speeds up the extraction process; Markush Editor, which helps draw complex Markush structures more easily; and Structure Checker and Markush Validation, which confirm the quality of the extracted information. We will also introduce our search, enumeration and hit visualization, and our latest improvements, which allow overlap analysis between Markush structures.
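
The enumeration and overlap analysis mentioned at the end of the abstract can be illustrated with a minimal Python sketch. This is not ChemAxon's implementation: the scaffold template, the R-group lists and the function names `enumerate_markush` and `overlap` are hypothetical, chosen only for illustration. Structures are compared as plain SMILES strings, which works here only because both Markush definitions share one template; a real tool would presumably canonicalize structures and avoid full enumeration for large Markush spaces.

```python
from itertools import product

# Hypothetical scaffold: an ortho-disubstituted benzene written as a SMILES
# template with two R-group placeholders (illustrative only).
SCAFFOLD = "{R1}c1ccccc1{R2}"


def enumerate_markush(scaffold, r_groups):
    """Return the set of structures obtained by combining every R-group choice."""
    names = sorted(r_groups)
    return {
        scaffold.format(**dict(zip(names, choice)))
        for choice in product(*(r_groups[name] for name in names))
    }


def overlap(markush_a, markush_b):
    """Return the structures covered by both enumerated Markush definitions."""
    return markush_a & markush_b


# Two made-up claims over the same scaffold (not taken from any real patent).
claim_a = enumerate_markush(SCAFFOLD, {"R1": ["C", "CC"], "R2": ["O", "N"]})
claim_b = enumerate_markush(SCAFFOLD, {"R1": ["CC", "CCC"], "R2": ["N", "Cl"]})

shared = overlap(claim_a, claim_b)
print(len(claim_a), len(claim_b), sorted(shared))
```

Here each claim enumerates to four structures, and the only structure both claims cover is the one with an ethyl group as R1 and an amino group as R2.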



  1. Chemical Patent Curation and Management
     new tools and capabilities
     Árpád Figyelmesi
  2. Motivation
     Knowing the chemical space covered by competitors’ patents is essential for successful drug discovery.
     ● Idea generation
     ● Lead candidate selection
     ● Drug design
     ● Patent claims construction
  3. Challenges
     ● Existing databases concept and quality
     ● Manual processing time
     ● Automatic processing quality
     ● Visualization and analysis
  4. Computer-assisted data extraction and analysis
     ● English, Chinese and Japanese N2S
     ● Markush Editor
     ● Structure Checker
     ● Markush Validation
     ● Search and representation
  5. Name to Structure
     ● Support for many nomenclatures (common, drug names, Comp ID …)
     ● IUPAC names used for exemplified structures and R-group fragments
     ● Essential to extract chemical information from patents
     ● English (2008, Marvin 5.1)
     ● Chinese (2013, Marvin 5.12)
     ● Japanese (2014, Marvin 6.3)
  6. Why other languages?
  7. Markush representation
     ● R-groups
     ● Atom lists
     ● Bond lists
     ● Position variations
     ● Link nodes
     ● Repeating units
     ● Homology groups
  8. R-group Bridging
     “R1, and R2 each independently represents alkyl of 1 to 4 carbon atoms…, or R1 and R2 together form a six membered heterocyclic ring.”
  9. Markush Editor
     R-group definitions
     Tree view
     Scaffold
     Structure checker
     Nesting view & Preview
  10. Markush Editor Video (video, 1-1.5 min)
  11. Workflow
      Collect
      ● Search
      ● Analyze
      Curate
      ● Extract
      ● Validate
      Store & Share
      ● Markushes
      ● Compounds
      ● Documents
      Use
      ● IJC
      ● Plexus
      ● Chemical space representation
      ● Structured chemical information
      ● High quality project specific database
      ● New opportunities, less risk, faster communication
  12. Compound Extraction View
      Compound list
      Project explorer
      Annotated document
      Selected structures
  13. Markush Extraction View
      Markush editor
      Example structures
      Annotated document
      Project explorer
      Selected structures
      Structure checker
  14. ChemCurator (video, 1.5-2 min)
  15. General Document Curation
      Extract Markush Structures from patents
      Extract specific structures
      ● Journal articles
      ● Company reports
      ● Patent examples
      Structure extraction wizard
      ● Exclude fragments, chemical elements, etc.
  16. Input formats
      ● Files (XML, PDF, HTML)
      ● Google Patents
      ● IFI CLAIMS
      ● Images (CLiDE & OSRA)
  17. Integration & Information Sharing
      Other ChemAxon products:
      • Direct IJC schema connection
      • Project sharing function
      • Accessible from Plexus, IJC, etc.
      Third party tools:
      • Standard file formats
      • Export functions
      • Easily processable projects
  18. Future plans
      Naming:
      ● Improving accuracy
      ● New languages
      Markush
      ● Markush overlap
      ● Chemical space visualization
      ChemCurator
      ● Non-hit visualization
      ● Markush extraction wizard
  19. Acknowledgment
      Daniel Bonniot
      Árpád Figyelmesi
      Gábor Botka
      David Deng
      Péter Kovács
      János Kendi
      markush-support@chemaxon.com
      http://www.chemaxon.com/products/chemistry-text-mining-suite/chemcurator/
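
The R-group bridging claim quoted on slide 8 can be read as two alternative enumeration branches: either R1 and R2 are chosen independently, or they are replaced together by a ring. The Python sketch below illustrates that reading under loud assumptions that are not in the slides: R1 and R2 are taken to sit on one nitrogen atom of a hypothetical phenyl scaffold, the three six-membered rings are arbitrary examples, and `enumerate_bridged_markush` is an illustrative name, not a ChemAxon API.

```python
from itertools import combinations_with_replacement

# Alkyl groups with 1 to 4 carbon atoms as SMILES fragments; the first atom
# is the attachment point (methyl, ethyl, propyl, isopropyl, butyl,
# isobutyl, sec-butyl, tert-butyl).
ALKYL_C1_C4 = ["C", "CC", "CCC", "C(C)C", "CCCC", "CC(C)C", "C(C)CC", "C(C)(C)C"]

# Assumed examples of six-membered heterocycles closed through the shared
# nitrogen: piperidine, morpholine, piperazine. The ring-closure digit 1 can
# be reused because the scaffold's benzene ring is already closed.
BRIDGED_RINGS = ["N1CCCCC1", "N1CCOCC1", "N1CCNCC1"]

# Hypothetical scaffold; R1 and R2 are assumed to sit on one nitrogen atom.
SCAFFOLD = "c1ccccc1"


def enumerate_bridged_markush():
    """Enumerate both branches of the bridged R-group definition."""
    structures = set()
    # Branch 1: R1 and R2 are independent alkyl groups. Both sit on the same
    # atom, so unordered pairs are enough to avoid duplicate compounds.
    for r1, r2 in combinations_with_replacement(ALKYL_C1_C4, 2):
        structures.add(f"{SCAFFOLD}N({r1}){r2}")
    # Branch 2: R1 and R2 together form a ring that includes the nitrogen.
    for ring in BRIDGED_RINGS:
        structures.add(SCAFFOLD + ring)
    return structures


all_structures = enumerate_bridged_markush()
print(len(all_structures))
```

Keeping the bridged case as a separate branch, rather than as extra entries in the R1 and R2 lists, reflects that a bridge constrains both R-groups at once and cannot be combined with an independent choice for the other one.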
