
ICIC 2016: Mind the Gap: The novel benefits of human-curated substance locations for chemical patent analysis


Identifying and locating chemical substances, which can be disclosed in patents by names, structures, variable tables, etc., presents a time-intensive challenge to chemical patent analysis. Though emerging technology can help, recently published research demonstrates that algorithmic methods alone identify only ~60% of the disclosed compounds found by intellectual (human) compound identification. PatentPak™ addresses this gap by extending the efforts of CAS scientists, who have intellectually analyzed the global patent literature for claimed and exemplified compounds for more than 100 years, to also elucidate the location of the substances in the patent text. This presentation will explore a number of examples, including a case study on vitamin D metabolites, to demonstrate the significant time savings and enhanced comprehensiveness of this approach.

Published in: Internet


  1. Mind the Gap: The novel benefits of human-curated substance locations for chemical patent analysis. Aalt van de Kuilen, Patent Information Services BV, NL; Paul Peters, CAS/ACS International, DE. ICIC 2016, October 18, 2016, Heidelberg, Germany. CAS is a division of the American Chemical Society. Copyright 2016 American Chemical Society. All rights reserved.
  2. Finding the relevant section(s) within the full text of chemical patents is often a time-consuming challenge
     • They are not always as easy to track down as we might expect
     • They can be long and “artfully” written
     • The chemistry is often obscured within complex names, tables, text, graphics, etc.
     Sometimes it seems like the search may be complete, but the hunt is just beginning!
  3. Even with a precise chemical patent search, reviewing the results can quickly become overwhelming
     => FILE CAPLUS
     => S L3
     L4 1014 L3
     => S L4 AND (BET OR BROMODOMAIN) AND P/DT
     L5 35 L4 AND (BET OR BROMODOMAIN) AND P/DT
     A query combining structure and text terms yields 35 patent publications. That shouldn’t be too bad, right? Only 5,498 pages to review.
     Page counts shown: 479, 428, 321, 277, 263, 261, 240, 229
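To put the review burden in numbers, the figures on this slide can be tallied in a few lines of Python. This is only a rough sketch: it assumes the eight page counts each belong to one of the 35 publications, which the slide does not state explicitly, and the variable names are illustrative.

```python
# Figures copied from the slide; variable names are illustrative.
total_pages = 5498          # pages across all 35 publications
publication_count = 35
shown_page_counts = [479, 428, 321, 277, 263, 261, 240, 229]

# Assumption: each shown count is the length of one publication in the answer set.
shown_total = sum(shown_page_counts)
shown_share = shown_total / total_pages
average_pages = total_pages / publication_count

print(f"{len(shown_page_counts)} documents shown: {shown_total} pages "
      f"({shown_share:.0%} of all pages)")
print(f"Average length: {average_pages:.0f} pages per publication")
```

Under that assumption, a handful of very long documents would account for a large part of the reading effort, which is where pointers to substance locations would save the most time.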
  4. Technology can help, but algorithmic extraction of chemistry in patents has significant limitations. Conclusion: Based on a limited sample, algorithmic extraction found only 50-60% of the chemical structures in patents, and those were often the least interesting ones.
  5. Algorithms miss key substances for a myriad of reasons
     • Ambiguous naming
     • Markush representations
     • No name
        – Substances described by explanatory text or images, rather than by chemical names or structures
     • Stereochemistry issues
     • Multi-component substances
     Normally PSS is used for poly(styrenesulfonic acid), but here it represents the aqueous dispersion, which CAS previously identified as poly(1-vinyl-2-pyrrolidone)
  6. PatentPak™ addresses this gap by combining human curation with new technology to expedite chemical patent analysis
     • Rapidly track down the specific location of hard-to-find chemical information in patents with interactive links to key substances
        – Benefit from the indexing efforts of hundreds of CAS scientists
     • Instantly and securely access patent PDFs from major patent offices
        – No more wasting time navigating multiple web sites
     • Locate patents in languages you know with CAplus℠ global patent family coverage
        – Save time and translation costs
     • Conveniently share these benefits with other IP stakeholders
        – Even if they do not use STN® or SciFinder®
  7. PatentPak is built on the indexing effort of the scientific analysts who create CAS REGISTRY℠
     • Scientists review each patent and identify new substances for CAS REGISTRY inclusion
     • They mark the specific location of substances in the text during analysis
     • Algorithmic processing with human intervention allows previously registered substances to be located and annotated in backfile documents
     “I analyzed the chemistry in this entire patent to save you time.” Keiko Sugimoto, Sr. Scientific Information Analyst, CAS
  8. PatentPak supplements CAplus records with direct pointers to the chemistry of interest
     • Bibliographic information (partially shown)
     • Hit substance indexing including roles
     • Hit structure display from CAS REGISTRY
     • PatentPak links for each hit compound
     • PatentPak links for document
     CAS is a division of the American Chemical Society. Copyright 2015 American Chemical Society. All rights reserved.
  9. It is possible to access the original PDF…
  10. … the annotated PDF (PDF+) …
  11. … or review the patent using the interactive viewer
  12. PatentPak links are available in transcripts, tables, and reports and are accessible without an STN login ID to support workflow
  13. PatentPak is also available in SciFinder
  14. New CAplus records from 31 countries are annotated as part of the normal workflow, and the backfile is growing rapidly
  15. The current backfile project will extend historical PatentPak coverage of key offices by more than a decade by year end. ACS / Proprietary and Confidential / Do Not Distribute
  16. PatentPak example US5739376: Backfile operation for one of the first patents on Fullerene derivatives (Hoechst AG)
     • Originally a German basic patent from 1994, but substance locations have been added to the US equivalent from 1998
     • Fullerene structures were symbolized by simple rings
  17. PatentPak example WO2016087417: Substances identified in a Markush table (Bayer CropScience AG)
     • Only a few selected substances in this patent are fully identified by name or structure
     • The vast majority of substances are indexed by assembling Markush tables
  18. PatentPak example WO9851681: Substance identified as “oily product” (Sanofi)
     • This particular substance is only identified as “oily product”
     • The CAS analyst indexed it from the chemistry
  19. PatentPak example WO2016120821: Find substances that cannot be identified by algorithm or structure extraction (Novartis AG)
     • Substances in formula VII are claimed by Markush: LG = “leaving group”
     • Analyst marked four specific compounds which are defined later in the claims - only a human can process claims like this!
  20. PatentPak example DE2013016487: Multiple location markings (University of Heidelberg)
     • Analyst has marked multiple locations - claims and synthetic example
  21. PatentPak example WO2016001362: Find substances inferred by their starting material after enzymatic conversion (BASF)
     • Starting materials (substrates) identified by structure on page 51
     • Products not listed but inferred in a table on page 27
  22. PatentPak example WO2015018558: Inorganic chemistry can be equally challenging (PI Ceramic GmbH)
  23. PatentPak example WO2014184355: Find assembled Markush tables (Dr. August Wolff GmbH & Co Arzneimittel)
     • 9.5 pages of “table Markush” structures - a core structure shown at the top, with fragments
     • The complete structure is assembled in a table at the back of the PDF+ document, including page numbers, CAS RN, chemical name, and structures
  24. Case study on new Vitamin D metabolites: How many patent families have been filed since 2013 on new Vitamin D metabolites? Find the answer by with
  25. Stepwise approach
     1. Structure search in Registry
     2. Remove old compounds
     3. Keep compounds with low reference count in CAplus
     4. Transfer to Chemical Abstracts
     5. Limit to new compounds and published in patents
     6. Display records which have a PatentPak record (PatentPak PDF | PatentPak PDF+ | PatentPak Interactive)
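The logic of the stepwise approach can be sketched as a chain of filters over in-memory records. This is a hypothetical illustration, not STN syntax: the `Substance` and `Reference` classes, their field names, and the sample data are all invented for the example, and the cut-offs (registered after 2000, fewer than 5 references, priority year after 2012) are taken from the search slides that follow.

```python
from dataclasses import dataclass

# Hypothetical, simplified records; real STN answers carry many more fields.
@dataclass(frozen=True)
class Substance:
    rn: str            # placeholder identifier, not a real CAS RN
    entry_year: int    # year the substance entered the registry
    caplus_refs: int   # number of CAplus references

@dataclass(frozen=True)
class Reference:
    accession: str         # placeholder accession number
    doc_type: str          # "Patent" or "Journal"
    priority_year: int
    substances: frozenset  # identifiers of the indexed substances
    has_patentpak: bool

def select_references(substances, references):
    # Steps 2-3: drop old compounds, keep rarely cited ones.
    new_rare = {
        s.rn for s in substances
        if s.entry_year > 2000 and s.caplus_refs < 5
    }
    # Steps 4-6: cross over to the references, then keep recent
    # patents that carry a PatentPak record.
    return [
        r for r in references
        if r.substances & new_rare
        and r.doc_type == "Patent"
        and r.priority_year > 2012
        and r.has_patentpak
    ]

# Step 1 (the structure search) is assumed to have produced this list.
substances = [
    Substance("S1", 2014, 1),
    Substance("S2", 1995, 2),   # too old
    Substance("S3", 2010, 40),  # too widely cited
]
references = [
    Reference("A1", "Patent", 2013, frozenset({"S1"}), True),
    Reference("A2", "Journal", 2015, frozenset({"S1"}), False),
    Reference("A3", "Patent", 2014, frozenset({"S2", "S3"}), True),
    Reference("A4", "Patent", 2009, frozenset({"S1"}), True),
]
selected = [r.accession for r in select_references(substances, references)]
print(selected)
```

Only the first sample reference survives every filter; the others fail on document type, substance novelty, or priority year.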
  26. Structure: Broad definition of Vitamin D skeleton. All rings are isolated and double bonds are mandatory. [Structure query figure; node labels: Q, CH2, CH3, Ak]
  27. CAS REGISTRY search
     FILE 'REGISTRY' ENTERED ON 22 SEP 2016
     STRUCTURE UPLOADED => L3 has 6806 unique substances in Registry
     Refine to compounds registered since 2001 (ED>2000) => L4 has 2394 unique substances
     Refine to substances with fewer than 5 references (REF.CAPLUS<5) => L5 has 2159 unique substances
  28. CAplus search strategy
     Cross-over of L4 with 2159 unique substances => L5 has 503 references from all years
     Restrict the answer to patent records only (P/DT) => L6 has 234 patent references from all years
     Restrict to patents with a stronger chemistry focus using C07C as IPC or CPC codes => L7 has 136 patent references from all years
     Restrict to patents with a priority year after 2012 => L8 has 18 patent references
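How much each restriction narrows the answer set can be read off the counts on the two search slides. The sketch below keeps substances and references in separate lists because they are different units; the step labels are shorthand of my own, and only the counts come from the slides.

```python
# Counts copied from the REGISTRY and CAplus search slides; labels are shorthand.
substance_steps = [
    ("structure search", 6806),
    ("registered since 2001", 2394),
    ("fewer than 5 CAplus references", 2159),
]
reference_steps = [
    ("cross-over to CAplus", 503),
    ("patent records only", 234),
    ("C07C classification", 136),
    ("priority year after 2012", 18),
]

def retention(steps):
    """Return (label, percent kept relative to the previous step)."""
    return [
        (label, 100 * count / previous)
        for (_, previous), (label, count) in zip(steps, steps[1:])
    ]

substance_retention = retention(substance_steps)
reference_retention = retention(reference_steps)

for label, percent in substance_retention + reference_retention:
    print(f"{label}: {percent:.1f}% kept")
```

On these figures the priority-year limit is by far the most selective step among the reference restrictions.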
  29. Findings of the 18 patent family records retrieved
     Answer  Country  Language  Pub.year  Pages  All subst  Vitam-D  PPAK
     1       CN       Chinese   2016      27     37         4        Yes
     2       CN       Chinese   2016      25     47         13       Yes
     3       WO       English   2016      106    202        55       PDF+
     4       CN       Chinese   2016      13     4          1        Yes
     5       CN       Chinese   2016      5      3          1        Yes
     6       CN       Chinese   2015      21     9          2        Yes
     7       CN       Chinese   2015      14     9          4        Yes
     8       CN       Chinese   2015      9      4          1        Yes
     9       WO       German    2015      45     14         3        Yes
     10      DE       German    2015      22     14         3        Yes
     11      CN       Chinese   2014      14     16         1        Yes
     12      CN       Chinese   2015      12     7          1        Yes
     13      US       English   2015      18     5          3        Yes
     14      WO       Spanish   2015      75     141        29       Yes
     15      US       English   2015      21     18         3        Yes
     16      WO       English   2015      61     18         2        Yes
     17      ES       Spanish   2013      55     141        29       Yes
     18      WO       English   2013      50     30         3        Yes
     The result set includes three “double basic” pairs: 9+10, 14+17, 15+16
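The table can be tallied with a short script. The sketch below assumes that each “double basic” pair should count once when answering the case-study question; the slides do not spell that out, so treat the merged count as an interpretation. Only the answer number, country, language, publication year, and page count columns are used.

```python
from collections import Counter

# (answer, country, language, publication year, pages), copied from the table.
records = [
    (1, "CN", "Chinese", 2016, 27), (2, "CN", "Chinese", 2016, 25),
    (3, "WO", "English", 2016, 106), (4, "CN", "Chinese", 2016, 13),
    (5, "CN", "Chinese", 2016, 5), (6, "CN", "Chinese", 2015, 21),
    (7, "CN", "Chinese", 2015, 14), (8, "CN", "Chinese", 2015, 9),
    (9, "WO", "German", 2015, 45), (10, "DE", "German", 2015, 22),
    (11, "CN", "Chinese", 2014, 14), (12, "CN", "Chinese", 2015, 12),
    (13, "US", "English", 2015, 18), (14, "WO", "Spanish", 2015, 75),
    (15, "US", "English", 2015, 21), (16, "WO", "English", 2015, 61),
    (17, "ES", "Spanish", 2013, 55), (18, "WO", "English", 2013, 50),
]
double_basic_pairs = [(9, 10), (14, 17), (15, 16)]

# Assumption: both members of a "double basic" pair describe the same
# invention, so the second member is mapped onto the first.
canonical = {second: first for first, second in double_basic_pairs}
distinct_answers = {canonical.get(answer, answer) for answer, *_ in records}

distinct_count = len(distinct_answers)
total_pages = sum(pages for *_, pages in records)
by_language = Counter(language for _, _, language, _, _ in records)

print(f"{len(records)} records, {distinct_count} after merging double basics")
print(f"{total_pages} pages in total")
print(dict(by_language))
```

Merging the pairs this way leaves fewer distinct inventions than retrieved records, and the language tally shows that half of the records are in Chinese, which is where direct links to substance locations help most for readers who do not know the language.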
  30. Display Original Full-text PDF
     L16 ANSWER 7 OF 18 CAPLUS COPYRIGHT 2016 ACS on STN
     PatentPak PDF | PatentPak PDF+ | PatentPak Interactive
     AN   2015:979679 CAPLUS Full-text
     DN   163:118806
     TI   24,28-Olefine-1-hydroxy-vitamin D derivatives and preparation method
     IN   Fang, Zhijie; Guo, Wei; Liu, Yanan; Li, Hongliang
     PA   Nanjing University of Science and Technology, Peop. Rep. China
     SO   Faming Zhuanli Shenqing, 14pp. CODEN: CNXXEV
     DT   Patent
     LA   Chinese
     FAN.CNT 1
     PPPI PATENT NO.    KIND  DATE      LANGUAGE  PatentPak
          CN 104693087  A     20150610  Chinese   PDF | PDF+ | Interactive
     PI   PATENT NO.    KIND  DATE      APPLICATION NO.   DATE
          CN 104693087  A     20150610  CN 2013-10664076  20131210 <--
     PRAI CN 2013-10664076  20131210 <--
  31. L16 ANSWER 7 OF 18 (AN 2015:979679, CN 104693087 A): Original Full-text PDF + compound table
  34. L16 ANSWER 7 OF 18 (AN 2015:979679, CN 104693087 A): Interactive Viewer for substance locations
  35. Interactive link to location of compound
  37. Answer #3 has >600 substance locations, which can only be seen in the PDF+; still very useful
  38. Case study conclusions
     1. Fast identification of relevant patents containing new compounds
     2. Easy access to the patent document
     3. Time savings when finding the compounds in a specific patent (PatentPak PDF+ compound table)
     4. Quickly and easily locate a specific compound in a patent with links in the PatentPak Interactive Viewer
  39. Overall conclusions
     • Semantic technology has made great advances in classifying, mining and extracting chemical content from text; however, it has significant limitations
     • Human analysis is still necessary to find many of the key compound locations
     • PatentPak in STN provides convenient links for patent attorneys and outside counsel to facilitate their analysis work
     • PatentPak in SciFinder is designed to provide a direct interactive session for scientists to find relevant compounds and search them in SciFinder
     • PatentPak provides significant time savings when analyzing novel vitamin D metabolites disclosed in patents
