IC-SDV 2018: Martin Kracker (EPO) Linked Open EP data – a new Product from the EPO


The EPO has recently launched a new product: Linked open EP data. This open and free data set contains bibliographic information on EP publications and the Cooperative Patent Classification (CPC) hierarchy. Linked data, also known as the Semantic Web, makes it easier to combine a particular data set with other linked data sets in any domain, including patents. Given its URI, data about a resource, e.g. a patent publication, can be retrieved in a variety of formats over the web. For occasional use there is a simple data browser, an API and a SPARQL query interface. For heavier use, bulk data is available for download.
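Retrieving the same resource in different formats is usually done through HTTP content negotiation. The following is a minimal sketch of that idea, not the EPO's documented interface: the resource URI is a hypothetical placeholder, and which media types the server actually honours is an assumption. The request is only prepared, not sent.

```python
from urllib.request import Request

# Hypothetical resource URI; substitute a real publication URI from the data set.
RESOURCE_URI = "https://example.org/publication/EP/1000000"

# Common RDF media types. Whether a given server supports each one is an assumption.
FORMATS = {
    "turtle": "text/turtle",
    "rdfxml": "application/rdf+xml",
    "jsonld": "application/ld+json",
    "ntriples": "application/n-triples",
}


def build_request(uri: str, fmt: str) -> Request:
    """Prepare (but do not send) a GET request asking for one RDF serialization."""
    return Request(uri, headers={"Accept": FORMATS[fmt]})


req = build_request(RESOURCE_URI, "turtle")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual retrieval. Under the fair use policy this kind of per-resource access suits exploration rather than production workloads.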

In this presentation we will introduce this new EPO product and illustrate the different ways this data can be inspected and retrieved. We will explore the content and point out potential use scenarios.
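As one illustration of the SPARQL route, the sketch below builds a query and encodes it for the standard SPARQL protocol GET binding (`?query=...`). The endpoint URL and the `patent:` namespace URI are hypothetical placeholders; the property name `patent:publicationDate` is taken from the sample data on slide 19. Nothing is sent over the network.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint and namespace; take the real values from the EPO documentation.
ENDPOINT = "https://example.org/sparql"
PATENT_NS = "https://example.org/def/patent/"


def publications_since(date_iso: str, limit: int = 10) -> str:
    """Return SPARQL text listing publications dated on or after date_iso."""
    return f"""PREFIX patent: <{PATENT_NS}>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?publication ?date WHERE {{
  ?publication patent:publicationDate ?date .
  FILTER (?date >= "{date_iso}"^^xsd:date)
}}
ORDER BY ?date
LIMIT {limit}"""


def query_url(query: str) -> str:
    """Encode a query for the SPARQL protocol's GET binding."""
    return f"{ENDPOINT}?{urlencode({'query': query})}"


query = publications_since("2008-11-26")
url = query_url(query)
print(url)

# Round-trip check: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == query)
```

The `LIMIT` clause keeps result sets small, which fits the occasional-use role of the query interface.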

Published in: Internet

  1. 1. Linked open EP Data – A new product from the EPO Martin Kracker IC-SDV, Nice, 21.04.2018 EPO, Vienna
  2. 2. European Patent Office Agenda ▪ Motivation ▪ Linked Data ▪ Linked open EP data ▪ Potential of linked patent data 2
  3. 3. European Patent Office EPO Vienna: Patent publication Munich The Hague Berlin Vienna Liaison office with the EU Headquarters Brussels 3
  4. 4. European Patent Office EPO's PI quality criteria mantra ▪ Completeness ▪ Accuracy ▪ Timeliness ▪ Usability by as many persons as possible 4
  5. 5. European Patent Office 5 The past of EPO’s PI distribution ▪ Microfiche ▪ Punch cards ▪ Floppy disks ▪ Laser disk ▪ Tapes ▪ Tape cartridges ▪ CDs & DVDs
  6. 6. European Patent Office The present of EPO’s PI distribution 6 Human access European Patent Register European Publication Server Espacenet EP full-text search EP Bulletin search Global Patent Index PATSTAT Online Global Dossier Common Citation Document Computer access Web services Open Patent Services European Publication Server Data products EP (EBD, XML, PDF/A) worldwide (DOCDB, INPADOC) PATSTAT data
  7. 7. European Patent Office The future of EPO’s PI distribution ? 7 Source: Tim Berners-Lee 5 star Open Data plan
  8. 8. European Patent Office From Web of Documents to Web of Data (Semantic Web, Linked Data) If HTML and the Web made all the online documents look like one huge book, RDF, schema and inference languages will make all the data in the world look like one huge database. Tim Berners-Lee, Weaving the Web, Orion Publishing Group, UK, 1999 8
  9. 9. European Patent Office HTTP names as unique identifiers All business objects (called resources) will get an HTTP name (URI) as a globally unique identifier. In any Internet browser, each HTTP name will return some useful data in a standard format about that resource. It can also return relationships to other resources using their HTTP names. 9 Application identifier Publication identifier http://
  10. 10. European Patent Office Linked data: Just a (huge) collection of very simple facts 10 Our patent world inventor name: W. Kosman living in: NL nr: 1000000 office: EPO Linked Data model is a publication. publicationNumber "1000000". publicationAuthority "EP". http://data.../vc/C9B6819....6B is a person. http://data.../vc/C9B6819....6B fn "Kosman, W.". http://data.../vc/C9B6819....6B countryCode "NL". has inventor http://data.../vc/C9B6819....6B
  11. 11. European Patent Office Content: EP patents All EP patents and all other related applications / publications ▪ “related” includes − international applications − priorities − applications in same DOCDB family − cited documents ▪ bibliographic data of EPs; basic data for non-EPs ▪ references to full text in EPO’s official Publication Server ▪ weekly update 11 Figure labels: Internat. Appln, Appln x, EP Appln, Priority Appln, Citn, cited Patent, Simple family
  12. 12. European Patent Office Content: CPC Cooperative Patent Classification CPC hierarchy ▪ Most interesting data elements ▪ Linked to EP data set ▪ Recent CPC version 12 EP Appln CPC symbol Broader CPC Narrower CPC Narrower CPC Narrower CPC
  13. 13. European Patent Office 13 Data model: High level overview EP application EP publication Simple Family Citation CPC Applicant Inventor Agent Non-EP applications As published IPC KR patent data As updated/corrected External data
  14. 14. European Patent Office Open data license CC BY 4.0 ▪ Standard license, not handcrafted ▪ No costs, no registration ▪ May be shared, copied, redistributed in any medium or format ▪ May be adapted, remixed, transformed ▪ For any purpose, even commercially ▪ Attribution required 14
  15. 15. European Patent Office Access: Fair use policy ▪ GUI gives access to a reference data service ▪ Data browser and SPARQL endpoint are for occasional use: exploration, trying out new ideas, ... ▪ For production use: data must be downloaded 15
  16. 16. European Patent Office 16 API – Interactive features Simple browser for data exploration ▪ Nice presentation of resources ▪ Click to change focus
  17. 17. European Patent Office 17 API – Parameterized URIs Linked data API (LDA) ▪ Retrieve one or list of resources ▪ Filter ▪ Sort ▪ Define return format ▪ Custom views
  18. 18. European Patent Office 18 SPARQL queries Powerful query language ▪ for RDF graphs ▪ for heterogeneous data sets ▪ to explore data ▪ to explore structure (meta-data) ▪ Solr text index
  19. 19. European Patent Office Nevertheless: it is pure data 19 <> rdfs:label "EP 1676702 B1" ; patent:application <> ; patent:publicationAuthority <> ; patent:publicationDate "2008-11-26"^^xsd:date ; patent:publicationKind patent:publicationKind_B1 . patent:publicationKind_B1 rdfs:label "B1"@en . <> patent:applicationNumber "01945281" . patent:publicationKind_A1 rdfs:label "A1"@en .
  20. 20. European Patent Office 20 Download ▪ about 650 million triples ▪ about 60 GB (N-Triples format) ▪ Updated weekly
  21. 21. European Patent Office Documentation 21
  22. 22. European Patent Office Benefits of linked data for data consumers ▪ Very simple data format: “triples” ▪ Re-use of established ontologies (classes, properties) ▪ Infrastructure and standards already exist: The Web and various W3C recommendations 22 Less "data friction" when combining different data sets Target group: Data scientists, web developers, ...
  23. 23. European Patent Office 23 2008 2009 2010 2011 2014 2017 Linked Open Data cloud Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch.
  24. 24. European Patent Office 24 Patent information can contribute to other domains Academic journals Image collections Geographical records Telephone directories Court decisions Standards Trade mark data Company registers Library of Congress National patent data Dictionaries and encyclopaedias Economic data National statistics University libraries Annual reports Technical magazines Classification data Government subsidies Patent data
  25. 25. European Patent Office Thank you for your attention! Questions? 25 Martin Kracker European Patent Office Directorate Publication
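
The slides describe the bulk download as about 650 million triples in about 60 GB, which works out to roughly 90 bytes per triple, so streaming the file line by line is more practical than loading it whole. The sketch below shows one way to do that with only the standard library. It is a simplified splitter, not a full N-Triples parser (it ignores trailing comments and does not unescape literals), and the sample URIs are hypothetical placeholders rather than real EPO identifiers.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

Triple = Tuple[str, str, str]


def split_triple(line: str) -> Optional[Triple]:
    """Split one N-Triples line into (subject, predicate, object) strings.

    Returns None for blank lines and comment lines. Subjects and predicates
    contain no whitespace, so two splits suffice; the object (possibly a
    literal with spaces) is what remains before the terminating dot.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    subject, predicate, rest = line.split(None, 2)
    obj = rest.rstrip()
    if obj.endswith("."):
        obj = obj[:-1].rstrip()
    return subject, predicate, obj


def count_predicates(lines: Iterable[str]) -> Counter:
    """Tally how often each predicate occurs, reading one line at a time."""
    counts: Counter = Counter()
    for line in lines:
        triple = split_triple(line)
        if triple is not None:
            counts[triple[1]] += 1
    return counts


# Hypothetical sample mirroring the facts on slide 10.
SAMPLE = [
    '<https://example.org/pub/1> <https://example.org/def/publicationNumber> "1000000" .',
    '<https://example.org/pub/1> <https://example.org/def/inventor> <https://example.org/person/1> .',
    '<https://example.org/person/1> <https://example.org/def/fn> "Kosman, W." .',
    '<https://example.org/person/1> <https://example.org/def/countryCode> "NL" .',
]

counts = count_predicates(SAMPLE)
for predicate, n in counts.most_common():
    print(n, predicate)
```

Because `count_predicates` accepts any iterable of lines, an open file handle for the downloaded dump can be passed in directly, keeping memory use independent of file size.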