Why SKOS should be a Focal Point of your Linked Data Strategy

See how taxonomies and thesauri serve as a core element of a linked data strategy and how large knowledge graphs can be built around them. Based on semantic web standards like SKOS, OWL, and SPARQL, enterprises can develop highly agile data integration platforms.

Published in: Technology, Education

  1. Why SKOS should be a focal point of your linked data strategy
  2. Welcome to this webinar! Agenda (including several live demos): 1. Intro: Knowledge graphs & linked data 2. Various perspectives on SKOS 3. Linked data based information architecture 4. Proudly presenting … PoolParty 4.2 5. Discussion. Andreas Blumauer, MSc IT: CEO of Semantic Web Company, Vienna; “product owner” of PoolParty Semantic Suite; working in the fields of text mining, semantic web & linked data for over 13 years
  3. About Semantic Web Company (SWC): ● SWC was founded in 2001 in Vienna, Austria ● Over 20 experts in linked data technologies ● Product: PoolParty Suite (launched in 2009) ● Serving customers from various industries ● EU- & US-based partner network
  4. Our network: Customers & Partners. Customers: ● Credit Suisse ● Daimler ● Roche ● Wolters Kluwer ● World Bank Group ● The Pokémon Company ● Healthdirect Australia ● Ministry of Finance (A) ● Wood Mackenzie ● Council of the E.U. ● TC Media ● American Physical Society ● Education Services Australia ● Pearson ● Techtarget ● Norwegian Directorate of Immigration ● REEEP ● GBPN ● City of Vienna ● ... Industries: Finance / Automotive / Publisher / Health Care / Public Administration / Energy / Education. Partners: ● Cognizant ● EPAM Systems ● iQuest ● DTI AG ● Tenforce ● OpenLink Software ● Ontotext ● brox ● bridgingIT ● Wolters Kluwer ● Term Management ● Taxonomy Strategies ● Search explained ● WAND ● Digirati ● KMSolutions ● Linked Data Factory ● Taxonic ● semweb
  5. 1. Intro: Knowledge graphs & linked data
  6. Graphs everywhere... Microsoft, Facebook, Google
  7. Graphs are the key to ‘smart data’. It’s all about things, not strings!
  8. Why is Google’s Knowledge Graph about to transform ‘search’? Facts and context information around an entity, incl. dynamic API calls. Imagine your company already had its own specific knowledge graph(s): ● complex queries (‘questions’) can be answered ● integrated views on ‘things’ ● networked nature of entities used for research ● basis for personalised services
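  9. How does it work? <http://rdf.freebase.com/m/07d5b> rdfs:label “Tim Berners-Lee” ● <http://rdf.freebase.com/m/07d5b> freebase:place_of_birth <http://rdf.freebase.com/m/04jpl> ● <http://rdf.freebase.com/m/04jpl> rdfs:label “London” ● <http://rdf.freebase.com/m/04jpl> freebase:tourist_attractions <http://rdf.freebase.com/m/07gyc> ● <http://rdf.freebase.com/m/07gyc> rdfs:label “London Eye”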
  10. Why should I use graphs instead of relational databases? <http://rdf.freebase.com/m/07d5b> rdfs:label “Tim Berners-Lee” ● <http://rdf.freebase.com/m/07d5b> freebase:place_of_birth <http://rdf.freebase.com/m/04jpl> ● <http://rdf.freebase.com/m/04jpl> rdfs:label “London” ● <http://rdf.freebase.com/m/04jpl> freebase:tourist_attractions <http://rdf.freebase.com/m/07gyc> ● <http://rdf.freebase.com/m/07gyc> rdfs:label “London Eye”. Further linked resources and properties: http://dbpedia.org/resource/Tim_Berners-Lee ● http://www.w3.org/People/Berners-Lee/ (foaf:homepage) ● http://dbpedia.org/resource/World_Wide_Web_Consortium (dbpedia:leaderName) ● http://sws.geonames.org/4943351/ (dbpedia:location) ● “Massachusetts Institute of Technology” ● http://sws.geonames.org/2643743/ ● wgs84_pos:lat 42.35954, wgs84_pos:long -71.09172
  11. Why should I use semantic web standards based graphs? URIs, HTTP, and the reuse of standards-based vocabularies enable collaborative efforts to create links, links, links, … between entities, and not only between documents, to build the basis for standards-based Q&A machines
  12. Pitstop: Linked Data is a data model based on graphs ● Linked Data is a graph-based data model in which a wide range of information can be represented & processed → perfectly suited for data integration & dynamic semantic publishing (DSP) in distributed environments (“semantic web”)
  13. Linked data platforms: How many data sources are used here?
  14. 2. Various perspectives on SKOS
  15. SKOS makes taxonomies/thesauri accessible, linkable & reusable: http://www.w3.org/2004/02/skos/
  16. SKOS as a basis to visualize and browse semantic knowledge graphs
  17. SKOS: You are not alone... ● Eurovoc (EU) ● ESCO (EU) ● Jurivoc (SUI) ● ScoT (AUS) ● Agrovoc (UN) ● MeSH (US) ● Getty Vocabularies (US) ● GEMET (EEA) ● GeoThesaurus (AT) ● STW Economy (DE) ● Polythematic SH (CZ) ● Canadian Subject Headings (Can) ● LCSH (US) ● Worldbank Taxonomy (WBG) ● Labor Law Germany Thesaurus (DE) ● Reegle Thesaurus (REEEP) ● Austrian Tax Law Thesaurus (AT) ● UNESCO Thesaurus (UN) ● New York Times SH (US) ● RAMEAU subject headings (FR) ● TheSoz (DE) ● The General Finnish Thesaurus (FIN) ● NAL Thesaurus (US) ● Social Semantic Web Thesaurus (AT) ● Courts thesaurus (DE) ● SITC-V4 (UN) ● Google Product Taxonomy (US) ● NAICS 2012 (US) ● Common Procurement Vocabulary (ES) ● UKAT UK Archival Thesaurus (UK) ● NASA taxonomy (US) ● IVOA astronomy vocabularies (UK) ● IPTC News Codes (UK) ● WAND taxonomies (US)
  18. SKOS is a ‘semantic interface’ to retrieve and link distributed content: ● Eurovoc ● WKD German labor law thesaurus ● STW Thesaurus ● DBpedia
  19. SKOS is at the intersection of three disciplines and their paradigms: ● librarians & taxonomists: taxonomies & classification systems ● data engineers & artificial intelligence: schemas & ontologies ● computational linguists & information managers: text mining & data analytics
  20. Provide integrated & interlinked views on all kinds of information. The SKOS/Linked Data based approach for information integration: ● Transforming documents into SKOS based graphs: annotating & categorising documents ● SKOS based graph of concepts: tree of categories & terms ● Standards based ontologies linked to SKOS based concept graphs: schemas, classes, properties, restrictions & rules
  21. Access SKOS via URIs and HTTP; based on standards, it’s machine-readable! SKOS based graph of concepts: tree of categories & terms. <http://vocabulary.semantic-web.at/semweb/367> skos:prefLabel “Tim Berners-Lee” ● <http://vocabulary.semantic-web.at/semweb/367> skos:altLabel “TimBL” ● (Poly-)hierarchical relations ● Mappings ● Non-hierarchical relations
  22. Let your documents become part of something bigger & make them smart! Transforming documents into SKOS based graphs: annotating & categorising documents. <http://vocabulary.semantic-web.at/semweb/367> skos:prefLabel “Tim Berners-Lee” ● <http://vocabulary.semantic-web.at/semweb/367> skos:altLabel “TimBL”. Example question: “Show me biographies of all computer scientists working for an organization located near Boston.”
  23. Make the conceptual model & the semantics of your data explicitly available. Standards based ontologies linked to SKOS based concept graphs: schemas, classes, properties, restrictions & rules
  24. SKOS is not expressive enough? Apply ontologies to your SKOS thesauri!
  25. 3. Linked data based information architecture
  26. Four-layered information architecture: ● enterprise knowledge model ● domain specific knowledge model ● annotation & categorization ● legacy data & documents
  27. Linked Enterprise Data: Relevant information to answer specific questions is all over the place. It’s often time-consuming to find and link it. 1. Use taxonomies/ontologies for information integration 2. Use documents as if they were a knowledge graph 3. Use relational databases as virtual RDF graphs
  28. SPARQL is close to the way non-technicians formulate questions. Question: “I want to explore medical research trends in relation to regional prosperity.” Query: PREFIX skos: <http://www.w3.org/2004/02/skos/core#> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dbpedia: <http://dbpedia.org/ontology/> SELECT DISTINCT ?personname ?picture ?countryname ?hdi WHERE { ?person skos:prefLabel ?personname . ?country skos:prefLabel ?countryname . ?person a dbpedia:Person . ?country a dbpedia:Country . ?person skos:related ?country . ?country <http://dbpedia.org/property/hdi> ?hdi . FILTER ( ?hdi < 0.6 ) OPTIONAL { ?person foaf:depiction ?picture . } } ORDER BY DESC(?hdi)
  29. The traditional approach for data integration. Person 4711: Name: Jeff Bezos, Affiliation: Amazon, Born in: Albuquerque. Country 4812: Name: USA, GDP: $15,684 billion, HDI: 0.937. Question: “Show me the ‘most influential people in the world’ who were born in countries with an HDI less than 0.5.” Solution: An application is developed to integrate the two databases.
  30. Ontology graph: Person, Organization, Place, with the relations ‘affiliated with’ and ‘born in’. Instance data in two knowledge graphs (Knowledge Graph 1, Knowledge Graph 2): Jeff Bezos, Amazon, Albuquerque; United States, GDP $15,684 billion, HDI 0.937. Thesaurus/taxonomy graph: Continents, America, South America, U.S., New Mexico, Albuquerque. Question: “Show me the ‘most influential people in the world’ who were born in countries with an HDI less than 0.5.” Solution: Taxonomies are used to link/map graphs
  31. 4. Proudly presenting … PoolParty 4.2
  32. The Hitchhiker’s Guide to Ontology Management: The answer to taxonomy, ontology management and everything is...
  33. PoolParty at a glance ● user-friendly: create & maintain knowledge graphs ● standards-based: based on W3C standards ● graph-based: natively built on graph databases ● embedded in ecosystem: use of linked (open) data ● best-of-breed: text mining, taxonomies & ontologies ● enterprise-ready: secure, simple to install ● integrable: connectors for SharePoint, Drupal, Confluence, WordPress, … your own CMS?
  34. Creation and maintenance of SKOS graphs: drag & drop, autocomplete, browser-based
  35. Import and creation of ontologies
  36. Apply ontologies to the SKOS model: Classification of concepts
  37. Apply ontologies to the SKOS model: Relations and Attributes
  38. Apply ontologies to the SKOS model: Domain and Range Restrictions. For example: ‘Product’ and ‘Standard’ may be related by ‘uses’. The PoolParty rule engine takes care of domain and range restrictions.
  39. Make use of linked (open) data
  40. ...and ask tricky questions. Step 1: Link your thesaurus to knowledge graphs (e.g. DBpedia) / ontologies (e.g. schema.org). Step 2: Ask tricky questions, e.g. “Which of the ‘most influential people in the world’ are Princeton University alumni?” Query: PREFIX skos: <http://www.w3.org/2004/02/skos/core#> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dbpedia: <http://dbpedia.org/ontology/> SELECT DISTINCT ?personname ?timelink WHERE { ?person skos:prefLabel ?personname . ?person a dbpedia:Person . ?person <http://purl.org/dc/terms/subject> <http://dbpedia.org/resource/Category:Princeton_University_alumni> . OPTIONAL { ?person <http://mercury.poolparty.biz/PoolParty/schema/Time#time100page> ?timelink . } } Results: ● Carl Cahn http://time.com/70813/ ● Jeff Bezos http://time.com/70917/
  41. Reuse ontologies from LOD sources
  42. Link between several thesauri/taxonomies
  43. Make use of text corpus analysis
  44. Use your knowledge graph for entity extraction. Step 1: Create a thesaurus. Step 2: Deploy Extraction Service
  45. Use PoolParty PowerTagging to integrate with enterprise content systems
  46. Use PoolParty for Semantic Search
  47. Make use of PoolParty Semantic Integrator
  48. Start SKOS, grow big. So Long, and Thanks for All the Links
  49. Announcements: Webinar: Semantic SharePoint, Wed, June 25, 2014, 4:30 PM - 5:30 PM CEST. Register here.
  50. Contact points & further information: Andreas Blumauer, MSc IT ● a.blumauer@semantic-web.at ● http://at.linkedin.com/in/andreasblumauer/ ● https://plus.google.com/115842492297705285184/ ● Semantic Web Company GmbH, Mariahilfer Strasse 70/8, A-1070 Vienna ● +43-1-4021235 ● http://www.semantic-web.at ● http://www.poolparty-software.com ● http://slideshare.net/semwebcompany ● http://youtube.com/semwebcompany
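
Slides 29 and 30 contrast a purpose-built integration application with graphs that are linked through a taxonomy. The idea can be illustrated with a minimal Python sketch that holds each graph as a set of (subject, predicate, object) triples. Everything named here is an assumption made for illustration: the `ex:` identifiers, the `ex:bornIn` and `ex:hdi` properties, the use of `skos:broader` for the taxonomy links, and the helper functions. It is not how PoolParty or any triple store implements this.

```python
# Person data, as in the "Person 4711" record on slide 29 (identifiers are hypothetical).
PERSON_GRAPH = {
    ("ex:JeffBezos", "ex:bornIn", "ex:Albuquerque"),
    ("ex:JeffBezos", "ex:affiliatedWith", "ex:Amazon"),
}

# Country data, as in the "Country 4812" record on slide 29.
COUNTRY_GRAPH = {
    ("ex:UnitedStates", "ex:hdi", 0.937),
}

# Taxonomy that connects places to the regions containing them.
# Assumption: the hierarchy is expressed with skos:broader.
TAXONOMY = {
    ("ex:Albuquerque", "skos:broader", "ex:NewMexico"),
    ("ex:NewMexico", "skos:broader", "ex:UnitedStates"),
    ("ex:UnitedStates", "skos:broader", "ex:America"),
}


def objects(graph, subject, predicate):
    """Return all objects of triples matching the given subject and predicate."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def broader_transitive(concept):
    """Return the concept itself plus every concept reachable via skos:broader."""
    seen, stack = set(), [concept]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(objects(TAXONOMY, node, "skos:broader"))
    return seen


def people_born_where_hdi_below(threshold):
    """Find people whose birthplace lies in a region with an HDI below the threshold."""
    result = set()
    for person, predicate, place in PERSON_GRAPH:
        if predicate != "ex:bornIn":
            continue
        # The taxonomy bridges the gap between a city and the country carrying the HDI.
        for region in broader_transitive(place):
            for hdi in objects(COUNTRY_GRAPH, region, "ex:hdi"):
                if hdi < threshold:
                    result.add(person)
    return result
```

Neither the person graph nor the country graph mentions the other's identifiers. Only the taxonomy path from Albuquerque up to the United States lets the question from the slides be answered, and with the sample data and a threshold of 0.5 the answer is empty.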

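Slide 38 states that ‘Product’ and ‘Standard’ may be related by ‘uses’ and that a rule engine enforces domain and range restrictions. The following Python sketch shows what such a check could look like in principle. It is an illustrative assumption, not the PoolParty rule engine: the concept names, the `CONCEPT_TYPES` and `RELATION_RULES` tables, and the `check_relation` function are all hypothetical.

```python
# Hypothetical classification of concepts (slide 36: "Classification of concepts").
CONCEPT_TYPES = {
    "ex:ProductA": "ex:Product",
    "ex:StandardX": "ex:Standard",
}

# Hypothetical relation definitions: relation -> (required domain, required range).
RELATION_RULES = {
    "ex:uses": ("ex:Product", "ex:Standard"),
}


def check_relation(subject, relation, obj):
    """Return a list of violations; an empty list means the statement is allowed."""
    if relation not in RELATION_RULES:
        return [f"unknown relation {relation}"]
    domain, range_ = RELATION_RULES[relation]
    problems = []
    # Domain restriction: the subject must be classified with the relation's domain class.
    if CONCEPT_TYPES.get(subject) != domain:
        problems.append(f"{subject} is not a {domain}")
    # Range restriction: the object must be classified with the relation's range class.
    if CONCEPT_TYPES.get(obj) != range_:
        problems.append(f"{obj} is not a {range_}")
    return problems
```

In this sketch, `check_relation("ex:ProductA", "ex:uses", "ex:StandardX")` passes, while swapping subject and object violates both the domain and the range restriction.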