DM2E Project meeting Bergen: WP2: Contextualization, Dominique Ritze (University of Mannheim)

Published in: Technology, Education

  1. Contextualisation
      Dominique Ritze
      co-funded by the European Union
  2. Motivation (18.06.2014)
      • Who is George Grote?
      • Which resources share the same subjects?
  3. Example
      Work: Der zerbrochene Krug
  4. Example
      Author: Ludwig Wittgenstein
  5. Example
      Owner: Prinz Eugen von Savoyen
  6. Example
      Subject: Administration
  7. Example
      Place: Berlin
  8. Overview
      • Silk as Contextualisation Tool
      • System Integration
      • Contextualisation Progress and Results
      • Challenges
      • Applicability and Reusability
      • Future Plans
  9. Contextualisation with Silk
      • Silk: Link Discovery Framework (UMA)
      • Definition of linkage rules to create links between Linked Data resources
      • http://context.dm2e.eu
  10. Integration of Silk
      • Silk is integrated into OmNom as a web service
      • [Figure: workflow with the steps "use generated configuration" and "show links"]
  11. Access to Contextualisation Results
      • Contextualisation results (linksets) are kept separate from the ingested data
      • Linksets are further described and versioned
      • Additional linkset properties (tbd):
        – Automatically created
        – Manually created
        – Recall-oriented (exploratory, but with wrong links)
        – Precision-oriented (incomplete, but high quality)
  12. Used Linked Data Resources
      • Resources: Geonames, GND, LCSH, DBPedia, Freebase, DDC, Linked Geodata
      • Categories covered: Places, Subjects, Agents
  13. Example Process
      • Manual creation of linkage rules, e.g. compare skos:prefLabel with rdfs:label using the Levenshtein distance and create a link if the distance is < 2
      • Let Silk run to find the links
  14. Results
      • All datasets that are currently ingested have been contextualised (no qualitative analysis so far)
      • The number of existing links was increased by 20% (performance requirement)
      • Different numbers of links were found:
        – Dingler (UBER): 134 unique links
        – Deutsches Textarchiv (BBAW): 9946 unique links
      • Potential to find more links
  15. Links in Pubby
  16. Links to DBPedia
  17. Links to GeoNames
  18. Links in Pubby
  19. Challenges
      • In most cases, only a preferred label is available
        – Nancy, France vs. Nancy, Kentucky
      • Very specific rules are required for different spellings and abbreviations
        – Frankfurt am Main vs. Frankfurt a.M. vs. Frankfurt a/M
      • Unstructured data is not captured
  20. Unstructured Data
      • Place: Wren Library, Trinity College Cambridge
      • Agent: Georg Tanner, Maximilian II.
  21. Results unstructured data
      • Codices provenance
      • WAB description
  22. Applicability and Reusability
      • Created linkage rules can be reused, but an adaptation might be necessary
      • Knowledge about the Silk framework and the similarity functions is required
      • Access to the datasets is required (as a dump or in a triplestore)
      • Quality of the links is not ensured
  23. Future Work
      • Evaluation of the detected links
        – Iterative process to improve the links
      • Can we use existing information, e.g. already known connections, to strengthen or weaken links?
      • Which questions can be answered based on the links?
        – Where have the resources been published?
        – MarineLives
        – Map of the ship routes
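
The linkage rule from the "Example Process" slide (compare two labels with the Levenshtein distance, link if the distance is below 2) and the spelling problem from the "Challenges" slide can be illustrated with a small Python sketch. This is not Silk and not the project's actual configuration: the function names, the abbreviation table, and the example.org URIs are assumptions made up for illustration. Only the threshold and the example labels come from the slides.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


# Hypothetical abbreviation table; real data would need many more entries.
ABBREVIATIONS = {"a.m.": "am main", "a/m": "am main"}


def normalize(label: str) -> str:
    """Lower-case a label, expand known abbreviations, collapse whitespace."""
    text = label.lower().strip()
    for short, full in ABBREVIATIONS.items():
        text = text.replace(short, full)
    return re.sub(r"\s+", " ", text)


def find_links(source: dict, target: dict, threshold: int = 2) -> list:
    """Return (source_uri, target_uri, distance) where distance < threshold.

    Both arguments map a resource URI to its single label, standing in for
    skos:prefLabel on one side and rdfs:label on the other.
    """
    links = []
    for s_uri, s_label in source.items():
        for t_uri, t_label in target.items():
            distance = levenshtein(normalize(s_label), normalize(t_label))
            if distance < threshold:
                links.append((s_uri, t_uri, distance))
    return links


# Made-up sample data using the labels mentioned on the slides.
SOURCE = {
    "http://example.org/place/1": "Frankfurt a/M",
    "http://example.org/place/2": "Nancy",
}
TARGET = {
    "http://example.org/geo/frankfurt": "Frankfurt am Main",
    "http://example.org/geo/nancy-france": "Nancy",
    "http://example.org/geo/nancy-kentucky": "Nancy",
}

for link in find_links(SOURCE, TARGET):
    print(link)
```

In this sketch the normalisation step lets "Frankfurt a/M" match "Frankfurt am Main", while the source "Nancy" is linked to both target entries. That is the ambiguity named on the "Challenges" slide: with only a preferred label available, a label-based rule cannot tell Nancy, France from Nancy, Kentucky.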
