Semantic web-and-public-data - en
Linked (Open) Data in e-Government and Commercial Publishing 

  • The internet as it is familiar now: text, photo, video, ...; hyperlinks (URL format: http://{domain}/{path}); hyperlinked delivery over the HTTP protocol, with an immense infrastructure (servers for DNS, proxy, cache management, DHCP, ...), supporting HTTP parameters and content negotiation (format/MIME type, language, ...).

Semantic web-and-public-data - en: Presentation Transcript

  • Linked (Open) Data in e-Government and Commercial Publishing. EU FP7 project LOD2, partner TenForce (BE). Johan De Smedt. 2014-01-17 TenForce – project: LOD2
  • Introduction
  • Internet and HTTP - example (1/2): http://www.gfii.fr/fr/
  • Internet and HTTP - example (2/2)
    • The internet as it is familiar now:
        – text, photo, video, ...
        – hyperlinks
    • URL format: http://{domain}/{path}
    • Hyperlinked delivery over the HTTP protocol
        – With an immense infrastructure (servers for DNS, proxy, cache management, DHCP, ...)
        – Supporting HTTP parameters and content negotiation (format/MIME type, language, ...)
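Content negotiation, mentioned on the slide above, can be illustrated with a minimal Python sketch of how a server might pick a representation from an `Accept` header. The function names `parse_accept` and `choose` are hypothetical, and the logic is deliberately simplified: it honours `q` values and `*/*` only, and ignores partial wildcards such as `text/*` and the specificity rules of the HTTP specification.

```python
def parse_accept(header):
    """Return (media_type, q) pairs, most preferred first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        prefs.append((media_type, q))
    # sorted() is stable, so equal q values keep their header order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def choose(header, available):
    """Pick the first acceptable media type the server can deliver."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue  # q=0 means "not acceptable"
        if media_type == "*/*":
            return available[0]
        if media_type in available:
            return media_type
    return None


if __name__ == "__main__":
    offered = ["text/html", "application/rdf+xml"]
    # An application asking for RDF first, HTML as a fallback
    print(choose("text/html;q=0.8, application/rdf+xml", offered))
```

The same URL can thus serve HTML to a browser and RDF to an application, which is the mechanism the later slides rely on for publishing content and metadata together.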
  • Categories of Internet Users (1/3)
    • Categories of users
        – Humans
        – Applications (software)
    • Information handling
        – Consumers
        – Publishers
        – Aggregators
  • Categories of Internet Users (2/3)
    • Examples of non-human users ...
        – Index and search robots
        – Mobile applications
        – Browsers
        – Information aggregators and suppliers: portals of scientific publishers (and others), weather forecast, traffic, news, e-Government, hotel and travel booking, ...
  • Categories of Internet Users (3/3)
    • ... at the service of humans
        – economic activities
        – curiosity
        – control (processing procedures, security, ...)
        – implementation of policies and directives
        – traffic control and guidance
        – ...
  • The objective of web semantics
    • Provide the tools (semantic language) to enable communication between Internet users (especially between applications)
        – Manipulating raw data to produce value-added information is a key element of the knowledge service industry
    • Establish
        – "Common understanding"
        – "Interoperability"
        – "Collaboration"
  • Key elements for building a "common understanding"
    • Publish knowledge models for specific domains
        – Taxonomy, classification, thesaurus, subject register, named authority lists, ...
        – About general publications, the labor market, legislation, geolocation, sports, politics, ...
    • Publish vocabularies to express relationships, dependencies, data values - knowledge base schema (ontology)
        – Works of art, rights, licenses, trade, ...
        – Establish a framework to build and publish (update and maintain) the above publications
        – Help make the Internet a growing collection of related databases
        – Use standard or reference ontologies and taxonomies
    • Publishing in a semantic format:
        – content (HTML/human) AND metadata (RDF/application)
    • Reliable publishers of quality data add value
  • eGovernment
  • The Demo Application: CELLAR - LOD2
    • What is CELLAR
        – Owner: the Publications Office of the European Union
        – On-line publications: EU legislation (content and metadata); coming soon: EU and national jurisprudence and case law
    • What is LOD2
        – LOD: Linked Open Data; links = hypertext links (HTTP)
        – A research project of the 7th EU Framework Programme
        – Participants: industry, publishers, universities, ICT enterprises
    • The demo application
        – Use CELLAR as the original source of content within privately published content (example publisher: Wolters Kluwer – Germany [WKD])
  • Demo Use Case (1/5)
    • Legislation-related products or tools used by:
        – editorial staff of commercial publishers,
        – their customers,
        – their customers' customers and
        – the general public
    • ... get direct access to linked EU primary source content and metadata to:
        – improve information quality
        – reduce editorial work
        – broaden the content and metadata product offering
  • Products - without LOD (2/5)
    [Diagram: cloud products with 1 source, the unique source of content and metadata in the product]
  • Products - without LOD (3/5)
    • Without LOD
        – access is via Eur-Lex, which is not the primary information source but a publication in its own right (delay, availability, not the raw content or metadata)
        – scraped information is reviewed and stored locally (a task for WKD editorial staff)
        – WKD products need to be complete and self-contained, with limited linking to the original source available online
  • Products - with LOD (4/5)
    [Diagram: cloud products with 3 sources]
    1) original source of raw content and metadata, accessed through a REST API
    2) content and metadata sources with a human interface
    3) enriched content and enriched metadata sources
  • Products - with LOD (5/5)
    • With LOD there is:
        – direct access to the primary information source (content and metadata)
        – application assistance for linking with and reusing content and metadata from the original source
        – a WKD product offering that is completed with the original source available online, by exposing the origins
  • The Demo
    • Advanced search (SPARQL) in web databases
        – uses the DCAT vocabulary (schema of the catalog of datasets)
    • License information is added to datasets using linked data (LD)
    • Retrieve content and metadata stored in CELLAR via LD
    • Integrate with EUROVOC using LD
    • Reuse CELLAR metadata in WKD content and add provenance (PROV) referring to the original source
    • Go to the public URL
        – http://212.71.25.157:8080/wp9IntAppEx-1.0/
  • Demo (1/.)
    • Demo in @en and @de, could be in 20+ languages
    • Combined search on CELLAR WP7 LOD DCAT
        – Full text = “Agrarstruktur Griechenland”
        – Title = “Kommission”
        – Issue date = “[ 1986-07-05 , 2000-01-15 [“
        – Theme = “Besteuerung”
  • Demo (1.1/.): full text = Agrarstruktur Griechenland (results shown with score/rank)
  • Demo (1.2/.): full text = Agrarstruktur Griechenland; title = Kommission
  • Demo (1.3/.): full text = Agrarstruktur Griechenland; title = Kommission; publication date [ 1986-07-05 , 2000-01-15 [
  • Demo (1.4/.): full text = Agrarstruktur Griechenland; title = Kommission; publication date [ 1986-07-05 , 2000-01-15 [; theme = Besteuerung
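The combined search shown in the demo slides above could be expressed as a SPARQL query over a DCAT catalog. The sketch below assembles such a query in Python. It is an illustration, not the demo's actual query: the function `build_query`, the use of `dcterms:title`, `dcterms:issued` and `dcat:theme` with `skos:prefLabel`, and the typing of issue dates as `xsd:date` are all assumptions about how the catalog is modelled. The full-text criterion is left out because full-text search in SPARQL is specific to the triple store.

```python
PREFIXES = """PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
"""


def literal(text):
    """Quote a string as a SPARQL literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_query(title=None, issued_from=None, issued_before=None,
                theme=None, lang="de"):
    """Build a SELECT query; every criterion is optional."""
    patterns = ["?dataset a dcat:Dataset ."]
    if title:
        patterns.append("?dataset dcterms:title ?title .")
        patterns.append(
            f"FILTER(CONTAINS(LCASE(STR(?title)), {literal(title.lower())}))")
    if issued_from or issued_before:
        patterns.append("?dataset dcterms:issued ?issued .")
    if issued_from:  # closed lower bound
        patterns.append(f"FILTER(?issued >= {literal(issued_from)}^^xsd:date)")
    if issued_before:  # open upper bound, as in [ from , before [
        patterns.append(f"FILTER(?issued < {literal(issued_before)}^^xsd:date)")
    if theme:
        patterns.append("?dataset dcat:theme ?theme .")
        patterns.append(f"?theme skos:prefLabel {literal(theme)}@{lang} .")
    body = "\n  ".join(patterns)
    return f"{PREFIXES}SELECT DISTINCT ?dataset WHERE {{\n  {body}\n}}"


if __name__ == "__main__":
    print(build_query(title="Kommission", issued_from="1986-07-05",
                      issued_before="2000-01-15", theme="Besteuerung"))
```

The resulting string can be sent to any SPARQL endpoint; the half-open date interval mirrors the `[ 1986-07-05 , 2000-01-15 [` notation used in the demo.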
  • Demo (2/.)
    • License information
        – Should be available in the original source
        – Can be merged into the source by a download service, addressed via DCAT distribution information
        – License reference provides: work title, Publication Office publisher, license statement, primary source content
  • Demo (2.1/.): License reference with primary source title (from the DCAT register)
  • Demo (2.2/.): Publisher found in DCAT as linked data in the license reference
  • Demo (2.3/.): License statement as linked data from the license reference
  • Demo (2.4/.): Primary source document as linked data from the license reference
  • Demo (3/.)
    • Retrieve document from CELLAR
        – any available format
        – Demo uses: html, xhtml, pdf, pdfa1a, pdfa1b
    • Retrieve metadata from CELLAR
        – ELI metadata (RDF/XML format)
        – CELLAR metadata (RDF/XML format)
        – "Notice" metadata (proprietary XML format)
    • ELI
        – “European Legislation Identifier”@en
        – http://publications.europa.eu/resource/oj/JOC_2012_325_R_0003_01.FRA.xhtml
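From the client side, asking for one of these formats amounts to setting an `Accept` header on the HTTP request. The following Python sketch only builds the request object and sends nothing. The helper `build_request`, the resource URL and the format-to-MIME-type table are assumptions for illustration; which media types CELLAR actually honours, and how the `pdfa1a`/`pdfa1b` variants and the proprietary notice format are requested, is not stated in the slides.

```python
from urllib.request import Request

# Standard MIME types for the generic formats named on the slide.
# The CELLAR-specific variants (pdfa1a, pdfa1b, notice XML) are omitted
# because their request parameters are not given in the source.
MIME_TYPES = {
    "html": "text/html",
    "xhtml": "application/xhtml+xml",
    "pdf": "application/pdf",
    "rdfxml": "application/rdf+xml",
}


def build_request(resource_url, fmt, language=None):
    """Prepare (but do not send) a GET request for one representation."""
    headers = {"Accept": MIME_TYPES[fmt]}
    if language:
        headers["Accept-Language"] = language
    return Request(resource_url, headers=headers)


if __name__ == "__main__":
    # Hypothetical resource URI, used only to show the shape of the call
    url = "http://publications.europa.eu/resource/example"
    req = build_request(url, "rdfxml", language="fr")
    print(req.full_url, req.header_items())
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual retrieval.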
  • Demo (3.1/.): Primary source document retrieval options
  • Demo (3.2/.): Retrieval of primary source documents
  • Demo (3.3/.)
    • Primary source metadata retrieval options
        – ELI (RDF/XML)
        – raw RDF (RDF/XML)
        – proprietary “notice” XML
  • Demo (3.4/.): Retrieve primary source metadata. Note: requires proper browser XML and RDF viewing options
  • Demo (4/.): EUROVOC integration
  • Demo (5/.): Establish reuse - drag and drop the CELLAR item over the WK item
  • Demo (5.1/.): Add primary source reference as linked data
  • Demo (5.2/5): Access primary source reference as linked data
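Adding a primary source reference as linked data could look like the following minimal sketch, which emits a Turtle snippet using the W3C PROV-O properties `prov:wasDerivedFrom` and `prov:hadPrimarySource`. The helper `provenance_turtle` and both URIs are hypothetical; the slides do not show which PROV properties the demo actually writes.

```python
PROV_PREFIX = "@prefix prov: <http://www.w3.org/ns/prov#> .\n\n"


def provenance_turtle(derived_uri, source_uri):
    """Return Turtle stating that derived_uri was derived from source_uri."""
    return (
        PROV_PREFIX
        + f"<{derived_uri}> prov:wasDerivedFrom <{source_uri}> ;\n"
        + f"    prov:hadPrimarySource <{source_uri}> .\n"
    )


if __name__ == "__main__":
    # Both URIs are placeholders for a publisher item and a CELLAR item
    print(provenance_turtle(
        "http://example.org/wkd/document/123",
        "http://publications.europa.eu/resource/example",
    ))
```

Because the reference is a plain HTTP URI, a reader or application can follow it back to the original source, which is what "exposing the origins" amounts to.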
  • Examples of related use cases
  • Scenario 1 – Employment
    • Use case: An SME in the Aachen area has a job vacancy for a Java programmer.
    • Background: It is getting harder to find good software developers, especially beyond urban centres. Applicants in areas close to national borders need very practical information about mobility, which is currently hardly available.
    • Eurovoc topics covered: Labour, Labour Market, Job Mobility, Job Vacancy
    • Sources involved: European legislation, Eurostat, destat, ESCO, Open Street Map, public transport Aachen, European Agency for Safety and Health at Work
    • Solution: EC contributes core ingredients for a central hub for transnational job mobility challenges
  • Scenario 2 – Environment
    • Use case: A German supermarket chain wants to start an image campaign on seafood that is not at risk of overfishing in the coming years.
    • Background: In Germany, the market for organic food is growing rapidly, as is the support for sustainability. Unfortunately, the information on sustainability is so scattered that there is no way, e.g. for the advertising industry, to react properly and seriously to this consumer trend.
    • Eurovoc topics covered: Nature reserve, environmental politics, management of resources, fishing industry, fresh fish, catch quota
    • Sources involved: European legislation, Eurostat, destat, FAO, World Bank, European Environment Agency
    • Solution: EC contributes core ingredients for a central hub for environmental protection
  • Scenario 3 – Energy
    • Use case: A house owner in the Netherlands wants to install solar cells on the roof.
    • Background: Due to the „Energiewende“ in Germany, a lot of knowledge on renewable energy, its impact, technologies and vendors has been created at a national level. This information is also relevant for other EU member states and their citizens.
    • Eurovoc topics covered: Energy industry, solar energy, photovoltaic cell
    • Sources involved: European legislation, Eurostat, destat, Joint Research Centre, Agency for the Cooperation of Energy Regulators, International Energy Agency, Stiftung Warentest
    • Solution: EC contributes core ingredients for transnational energy challenges
  • Next for CELLAR (2014)
    • Transform all published CELLAR legislation according to the ELI directive
    • Publish case law according to the ECLI directive
    • Publish the catalog of available legislation and case law (possibly using the W3C DCAT recommendation)
    • Publish all taxonomies used by the EU following the LOD best practices
  • ESCO
  • The ESCO Project
    • ESCO
        – Project owner: DG-EMPL
        – ESCO: https://ec.europa.eu/esco/home (version 0)
    • European Skills, Competences, Qualifications and Occupations
    • The knowledge base details concepts in three pillars (taxonomies) and provides semantically rich relations between the concepts.
    • Re-uses several other taxonomies (Eurostat, Unesco, DG-EAC, PO of the EU)
  • ESCO Data Model: Occupation Pillar
    • Organized by economic activity sectors (Agriculture, Education, ...), with a subject correspondence to NACE
    • Occupations are mapped (exactMatch, broaderMatch) to
        – ISCO xx (standard of ILO/UNO): ISCO08, ISCO88
        – ROME (French labor market standard)
        – ...
  • ESCO Data Model: Occupation Pillar
    • Relation: description
    [Diagram: an Occupation is linked by aboutOccupation to a text document (unstructured or semi-structured) with the sections Occupation Description, Skills and Qualifications]
  • ESCO Data Model: Occupation Pillar
    • Skills are
        – transversal (across activity sectors)
        – specific to an activity sector
    • Types of skills
        – knowledge, skill, competence, ability
    • Group of skills
        – Leaf group of skills
    • Skill (member of a skill group)
    • Relation occupation - skill: essential skill, desired skill
    [Diagram: an Occupation is linked by aboutOccupation to a text document (unstructured or semi-structured) with the sections Occupation Description, Skills and Qualifications, and to skills marked as essential or desired]
  • ESCO Data Model: Skill and skill facet (example: foreign language expertise)
    1. Define the different aspects/dimensions of a concept: main facet (0..1), sub facets (0..n)
    2. Define/specify the standard to use, or give a good description of the concepts contained in each facet
    3. For each list of values from step 2, a collection of concepts (facet group) is created
    4. Manage the members of the facet group
    [Diagram: "Foreign language expertise" with a Language facet (members such as English, German, Dutch; skos:exactMatch to the OASIS, LoC and EU-PO language lists) and a Language usage facet (understanding, narrower: listening, reading; speaking, narrower: spoken interaction, spoken production; writing)]
  • ESCO Data Model: Qualification Pillar
    • EQF, FoET, Awarding Body
    [Diagram: the ESCO Q-pillar with Q-groups and Q-members; relations shown: exactMatch, tagging, hasAwardingBodyDescription; targets shown: FoET, EQF, Awarding Body description]
  • ESCO Data Model: Occupation Pillar (recap)
    • Relation: description
    [Diagram: an Occupation is linked by aboutOccupation to a text document (unstructured or semi-structured) with the sections Occupation Description, Skills and Qualifications]
  • ESCO Data Model: Occupation Pillar (recap)
    • Relationship: Occupation - Qualification
    [Diagram: the same Occupation and text document, with the Occupation additionally linked to a qualification]
  • ESCO Data Model: Qualification Pillar
    • Qualifications are maintained (direct) or included (indirect)
    • Direct qualifications are maintained by DG-EMPL/ESCO. Inclusion is on an “as needed” basis
        – International qualification schemes (outside of the EU): USA, China, ...
        – Qualifications awarded by enterprises: ORACLE, CISCO, Microsoft, ...
    • Qualifications subject to indirect inclusion
        – Are maintained by national (EU member) organizations
        – Registered and structured by DG EAC (Education and Culture)
        – Transferred to DG EMPL using the XML schema of DG-EAC
        – Uploaded in ESCO by DG-EMPL/ESCO
  • ESCO Data Model: Qualification Pillar
    • Relationship: description
    [Diagram: a Qualification is linked by aboutQualification to an XML document plus an optional description (sections: Qualification Description, Skills), by hasAwardingBody to an awarding body, and to skills/competences]
  • ESCO Data Model - summary
    • ESCO consists of three pillars (a pillar is a class of concepts)
        – occupation
        – competence
        – qualification
    • ESCO concepts are mapped to concepts of similar taxonomies. The mapping is expressed using SKOS mapping properties.
        – The correspondence between ESCO and ISCO (an ESCO occupation has an ISCO occupation group as its broader match)
        – Planned: mapping ESCO to ROME (French occupation taxonomy); other mappings may be established as needed (O*NET)
    • The ESCO semantics are expressed using standard supporting taxonomies
        – To tag ESCO pillar concepts (using the DCMI property dcterms:subject)
        – To structure recurring specializations in the ESCO model (using facets, collections or groups of concepts)
        – Examples
            • Location (Eurostat: NUTS; ISO 3166)
            • Economic activity sectors (Eurostat: NACE)
            • European Qualifications Framework (EQF)
            • CEFR (Common European Framework of Reference for Languages)
            • UNESCO (UIS): FoET, ISCED
            • Languages (Publications Office of the EU, Library of Congress, OASIS-psi, ISO 639)
            • ...
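The SKOS mapping described in the summary above can be sketched with a tiny in-memory triple list in Python. The property URIs are the real SKOS ones; the concept identifiers (`esco:...`, `isco08:...`, `rome:...`) and the helper `targets` are hypothetical placeholders, not actual ESCO, ISCO or ROME identifiers.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
EXACT_MATCH = SKOS + "exactMatch"
BROADER_MATCH = SKOS + "broaderMatch"

# (subject, predicate, object) triples; all identifiers are illustrative
MAPPINGS = [
    ("esco:occupation/java-developer", BROADER_MATCH, "isco08:group/2512"),
    ("esco:occupation/java-developer", EXACT_MATCH, "rome:example-code"),
    ("esco:occupation/teacher", BROADER_MATCH, "isco08:group/23"),
]


def targets(triples, concept, relation):
    """Return the concepts that `concept` is mapped to via `relation`."""
    return [o for s, p, o in triples if s == concept and p == relation]


if __name__ == "__main__":
    # Which ISCO group is the broader match of this ESCO occupation?
    print(targets(MAPPINGS, "esco:occupation/java-developer", BROADER_MATCH))
```

A `broaderMatch` link says the target concept is more general than the source, which fits the slide's statement that an ESCO occupation maps to an ISCO occupation group rather than to an equivalent occupation.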
  • Tools for Linked Open Data
  • A small list of tools for LOD
    • SPARQL endpoint: NoSQL database (RDF graph, column store)
        – Virtuoso, Oracle, AllegroGraph
    • Frameworks integrating semantic libraries
        – Jena, Sesame
    • Analyzers
        – TopBraid, Protégé
    • Alignment of knowledge bases
        – SILK:
            • http://lod2.eu/Project/Silk.html
            • http://wifo5-03.informatik.uni-mannheim.de/bizer/silk/
    • LOD best practices
        – https://dvcs.w3.org/hg/gld/raw-file/default/bp/index.html
  • TenForce References
    • Semantic web projects
        – Eurovoc
        – Cellar
        – ESCO
        – LOD2 (R&D)
        – Wolters Kluwer
        – ODP (Open Data Portal)
        – ODS (Open Data Support)
    • ISO 25964 (Thesaurus standardization)
    • TenForce.com
    • johan.de-smedt@tenforce.com