LOD2: State of Play WP3A - Knowledge Base Creation, Enrichment and Repair

LOD2 plenary meeting in Paris: presentation of WP3A: State of Play (Knowledge Base Creation, Enrichment and Repair) by Jens Lehmann (ULEI).

  1. Creating Knowledge out of Interlinked Data
     LOD2 Paris Meeting: WP3 Overview
     Knowledge Base Creation, Enrichment and Repair
     Jens Lehmann, AKSW, Universität Leipzig
     LOD2 Presentation . 02.09.2010 . http://lod2.eu
  2. Outline
     • General WP3 Overview (Jens Lehmann)
       • WP structure
       • Deliverables
       • Progress
     • Task 3.2 Report: NLP2RDF + NIF (Sebastian Hellmann)
     LOD2 Event . 06.09.2010 . http://lod2.eu
  3. WP 3 Task Overview
     • Research WP, 76 PMs: InfAI (37), NUIG (10), FUB (17), OpenLink (5), Exalead (7)
     • 3.1: Provenance-Aware Extraction of Linked Data from Existing Structured Formats
     • 3.2: Provenance-Aware Extraction of Linked Data from Unstructured and Semi-Structured Sources
     • 3.3: Knowledge Base Schema Enrichment
     • 3.4: Knowledge Base Repair
     • 3.5: Web Linkage Validator
  4. WP 3 Goals
     • General goal: creation, improvement and repair of knowledge bases
     • Focus: very large knowledge bases, diverse knowledge, web/linked data
     • Refine existing triplification approaches (Virtuoso Sponger, RDF Views, Triplify, D2R)
     • Improve the knowledge base schema based on the data
     • Fix problems in knowledge bases, e.g. inconsistencies
     • Techniques: semi-automatic machine learning, ontology debugging, NLP, shallow parsing etc.
  5. WP 3 Task 3.1
     • Provenance-Aware Extraction of Linked Data from Existing Structured Formats (spreadsheets, relational databases, CMS, logs, XML documents)
     • Partners: FUB, InfAI, OpenLink, Exalead
     • Provide: process description + tools
     • Standardisation of RDB2RDF mapping
     • Draws on existing tools/frameworks:
       • D2R (FUB)
       • Triplify (InfAI)
       • Virtuoso Sponger (OpenLink)
     • Deliverables: State-of-the-Art Report (M6), D2R release (M20), Triplify release (M20)
  6. WP 3 Task 3.1 - Progress
     • D2R Server MetaData Extension (allows adding licensing and provenance output to D2R Server)
     • Deliverable 3.1.1 completed: state-of-the-art report on knowledge extraction from structured sources
     • 200+ tools collected at http://data.lod2.eu/2011/tools/
     • http://en.wikipedia.org/wiki/Knowledge_extraction created
     • Addition of RDF2Triggers to RDF Views in Virtuoso: enables materialisation and synchronisation of RDF views as physical triples
     • Virtuoso Sponger cartridges extended
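The idea behind Task 3.1, turning relational rows into triples while recording where each resource came from, can be pictured with a small sketch. This is not how D2R, Triplify or Virtuoso RDF Views are implemented; the base URI `http://example.org/`, the `city` table and the choice of the Dublin Core `source` property for provenance are all assumptions made for illustration.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical base URI, not an LOD2 namespace
PROV_SOURCE = "http://purl.org/dc/terms/source"  # Dublin Core 'source' property


def literal(value):
    """Serialise a value as a plain N-Triples literal (minimal escaping only)."""
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{text}"'


def table_to_triples(conn, table, key, source_uri):
    """Yield N-Triples lines for each row, followed by one provenance triple per row.

    `table` is interpolated into the SQL text, so it must be a trusted name.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cursor.description]
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"<{BASE}{table}/{record[key]}>"
        for column, value in record.items():
            if column == key or value is None:
                continue  # the key is already in the subject URI; NULLs yield no triple
            yield f"{subject} <{BASE}{table}#{column}> {literal(value)} ."
        # Provenance: record which source the resource was extracted from.
        yield f"{subject} <{PROV_SOURCE}> <{source_uri}> ."


# Toy in-memory database standing in for an existing relational source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT, country TEXT)")
conn.execute("INSERT INTO city VALUES (1, 'Leipzig', 'Germany')")

triples = list(table_to_triples(conn, "city", "id", "http://example.org/db/cities"))
for line in triples:
    print(line)
```

Real mapping tools additionally handle datatypes, foreign keys as links between resources, and configurable vocabularies, none of which this sketch attempts.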
  7. WP 3 Task 3.2
     • Provenance-Aware Extraction of Linked Data from Unstructured and Semi-Structured Sources (HTML, PDF + Office documents with metadata)
     • Partners: FUB, InfAI, OpenLink, Exalead
     • NLP techniques / text understanding (combine approaches, not invent them)
     • Draws on existing tools:
       • NLP2RDF (InfAI)
       • Stanford Parser, ASV toolkit, Zemanta, Ontos API (all external)
       • DBpedia (FUB, InfAI, OpenLink)
     • Deliverables: NLP2RDF release (M8), DBpedia Live (M8), DBpedia Framework Extension (M20)
  8. WP 3 Task 3.2 - Progress
     • NLP2RDF + NIF: presented by Sebastian
     • DBpedia Live:
       • New server acquired
       • Running at http://live.dbpedia.org/sparql/ (beta version)
     • DBpedia I18N committee founded and multi-language support extended
     • DBpedia Spotlight released (http://dbpedia.org/spotlight): tool for annotating mentions of DBpedia resources in text
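A SPARQL endpoint such as the DBpedia Live beta mentioned above is queried over HTTP by passing the query text in a `query` parameter. The sketch below only builds such a request with the Python standard library; the example query and the resource `http://dbpedia.org/resource/Leipzig` are illustrative assumptions, and the endpoint URL is simply the one given on the slide, which may no longer be reachable.

```python
from urllib.parse import urlencode, urlparse, parse_qs
from urllib.request import Request

ENDPOINT = "http://live.dbpedia.org/sparql/"  # URL as given on the slide (beta at the time)

# Illustrative query: fetch a few labels of one DBpedia resource.
QUERY = """
SELECT ?label WHERE {
  <http://dbpedia.org/resource/Leipzig> <http://www.w3.org/2000/01/rdf-schema#label> ?label .
} LIMIT 5
"""


def build_request(endpoint, query):
    """Build a SPARQL-protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, QUERY)
print(request.full_url)

# Sending the request needs network access and a reachable endpoint:
#   import json
#   from urllib.request import urlopen
#   results = json.load(urlopen(request))
```

Building the request separately from sending it keeps the example usable offline and makes the encoded URL easy to inspect.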
  9. WP 3 Task 3.3
     • Knowledge Base Schema Enrichment
     • Partners: InfAI
     • Suggests OWL schema axioms to knowledge base maintainers (definitions, super classes, disjointness)
     • Tightly coupled to Task 3.4
     • Adapts existing approaches to work with very large Linked Data knowledge bases
     • Uses DL-Learner (InfAI) and external ontology learning approaches
     • Deliverables: Enrichment Method Report (M12), User Interface (M24), Evaluation (M36)
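To give a feel for what "suggesting schema axioms from data" can mean, here is a deliberately naive sketch that proposes super class and disjointness axioms purely from instance sets. It is not the DL-Learner algorithm; the function name `suggest_axioms`, the `min_support` threshold and the toy data are assumptions for illustration only.

```python
from itertools import combinations


def suggest_axioms(instances_by_class, min_support=2):
    """Return (kind, class_a, class_b) suggestions derived only from instance sets."""
    suggestions = []
    for a, b in combinations(sorted(instances_by_class), 2):
        set_a, set_b = instances_by_class[a], instances_by_class[b]
        if len(set_a) < min_support or len(set_b) < min_support:
            continue  # too little evidence to suggest anything
        if set_a < set_b:
            # Every known instance of a is also an instance of b.
            suggestions.append(("SubClassOf", a, b))
        elif set_b < set_a:
            suggestions.append(("SubClassOf", b, a))
        elif not (set_a & set_b):
            # No shared instances: candidate disjointness axiom.
            suggestions.append(("DisjointClasses", a, b))
    return suggestions


# Toy instance data standing in for rdf:type assertions in a knowledge base.
data = {
    "City": {"Leipzig", "Paris", "Berlin"},
    "Capital": {"Paris", "Berlin"},
    "Person": {"Ada", "Alan"},
}

suggestions = suggest_axioms(data)
for kind, a, b in suggestions:
    print(kind, a, b)
```

Such suggestions are only candidates: missing data can make two classes look disjoint when they are not, which is why the slide speaks of suggesting axioms to maintainers rather than adding them automatically.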
  10. WP 3 Task 3.4
      • Knowledge Base Repair
      • Partners: InfAI, NUIG
      • Fix inconsistent knowledge bases, unsatisfiable classes, (some) modelling errors, (some) reasoning performance problems
      • Draws on a lot of existing work in ontology debugging and extends it to knowledge bases in the LOD cloud
      • Related to quality measures in WP4
      • Result: ORE tool (together with Task 3.3)
      • Deliverables: Report on Modelling Errors/Problems (M6), 1st ORE Release (M28), 2nd ORE Release (M40)
  11. WP 3 Task 3.4 - Progress
      • Google Code project for ORE (Ontology Repair and Enrichment) tool started: http://code.google.com/p/ore/
      • Domain http://ore-tool.net/ with basic instructions
      • ORE 0.2 released (desktop version; web version in development at http://web.ore-tool.net)
      • ORE paper accepted at ISWC
      • Deliverable 3.4.1 completed (state-of-the-art report on detectable errors in knowledge bases)
      • Preliminary work on algorithms for supporting debugging SPARQL endpoints and Linked Data
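One kind of error named in Task 3.4, the unsatisfiable class, can be illustrated with a toy check: a class is flagged when its super classes include two classes declared disjoint. This is a sketch under strong simplifying assumptions (named classes, subclass and disjointness axioms only); it is not how ORE or a description logic reasoner works, and all class names are invented.

```python
def ancestors(cls, parents):
    """Return cls plus every class reachable through subclass-of edges."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def unsatisfiable_classes(parents, disjoint_pairs):
    """List classes whose super classes contain both members of a disjoint pair.

    Only classes that appear as keys of `parents` are checked.
    """
    result = []
    for cls in sorted(parents):
        supers = ancestors(cls, parents)
        if any(a in supers and b in supers for a, b in disjoint_pairs):
            result.append(cls)
    return result


# Toy schema: subclass-of edges (child -> parents) and one disjointness axiom.
parents = {
    "Amphibian": ["Animal"],
    "Car": ["Vehicle"],
    "AmphibiousCar": ["Car", "Amphibian"],
}
disjoint_pairs = [("Animal", "Vehicle")]

broken = unsatisfiable_classes(parents, disjoint_pairs)
print(broken)
```

A repair step would then offer the maintainer a choice, for example dropping one of the subclass axioms or the disjointness axiom; the sketch stops at detection.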
  12. WP 3 Task 3.5
      • Web Linkage Validator
      • Partners: NUIG
      • Tightly coupled to Task 4.2 (Unsupervised Interlinking)
      • Creates linkage reports for knowledge base maintainers
      • Could suggest adding further properties, using more specific property values, and better specifying the classes/properties of knowledge base entities
      • Deliverables: Initial Release (M18), LOD2 Stack Component Release (M28)
  13. Thanks for your attention!
