DM2E Project meeting Bergen: WP2: Contextualization, Dominique Ritze (University of Mannheim)

Published in: Technology, Education

  1. Contextualisation, Dominique Ritze (co-funded by the European Union)
  2. Motivation: Who is George Grote? Which resources share the same subjects? (slide date: 18.06.2014)
  3. Example (work): Der zerbrochene Krug
  4. Example (author): Ludwig Wittgenstein
  5. Example (owner): Prinz Eugen von Savoyen
  6. Example (subject): Administration
  7. Example (place): Berlin
  8. Overview • Silk as Contextualisation Tool • System Integration • Contextualisation Progress and Results • Challenges • Applicability and Reusability • Future Plans
  9. Contextualisation with Silk • Silk: Link Discovery Framework (UMA) • Definition of linkage rules to create links between Linked Data resources • http://context.dm2e.eu
  10. Integration of Silk • Silk is integrated into OmNom as a web service (diagram labels: use generated configuration, show links)
  11. Access to Contextualisation Results • Contextualisation results (linksets) are kept separate from the ingested data • Linksets are further described and versioned • Additional linkset properties (tbd): – Automatically created – Manually created – Recall-oriented (exploratory, but with wrong links) – Precision-oriented (incomplete, but high quality)
  12. Linked Data Resources Used • Datasets: GeoNames, GND, LCSH, DBpedia, Freebase, DDC, LinkedGeoData • Entity types linked: places, subjects, agents
  13. Example Process • Manual creation of linkage rules, e.g. compare skos:prefLabel with rdfs:label using Levenshtein distance and link if the distance is < 2 • Let Silk run to find the links
  14. Results • All currently ingested datasets have been contextualised (no qualitative analysis so far) • The number of existing links increased by 20% (performance requirement) • The number of links found varies by dataset: – Dingler (UBER): 134 unique links – Deutsches Textarchiv (BBAW): 9946 unique links • Potential to find more links
  15. Links in Pubby
  16. Links to DBpedia
  17. Links to GeoNames
  18. Links in Pubby
  19. Challenges • In most cases, only a preferred label is available – Nancy, France vs. Nancy, Kentucky • Very specific rules are required for different spellings/abbreviations – Frankfurt am Main vs. Frankfurt a.M. vs. Frankfurt a/M • Unstructured data is not captured
  20. Unstructured Data • Place: Wren Library, Trinity College Cambridge • Agent: Georg Tanner, Maximilian II.
  21. Results for Unstructured Data • Codices provenance • WAB description
  22. Applicability and Reusability • Created linkage rules can be reused, but adaptation might be necessary • Knowledge of the Silk framework and its similarity functions is required • Access to the datasets is required (as a dump or in a triplestore) • Quality of the links is not ensured
  23. Future Work • Evaluation of the detected links – Iterative process to improve the links • Can we use existing information, e.g. already known connections, to strengthen/weaken links? • Which questions can be answered based on the links? – Where have the resources been published? – MarineLives – Map of the ship routes
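
The example rule on slide 13 (compare two labels with Levenshtein distance, link if the distance is below 2) can be sketched in a few lines of Python. This is only an illustration of the idea, not Silk's implementation or its rule syntax: the function names, the example URIs under example.org, and the tiny label dictionaries are all hypothetical, and the labels stand in for values that would come from skos:prefLabel and rdfs:label.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def find_links(source_labels, target_labels, threshold=2):
    """Pair every source and target URI whose labels are closer than threshold."""
    links = []
    for source_uri, source_label in source_labels.items():
        for target_uri, target_label in target_labels.items():
            if levenshtein(source_label, target_label) < threshold:
                links.append((source_uri, target_uri))
    return links


# Hypothetical data: URI -> label (not taken from the DM2E datasets).
source_labels = {
    "http://example.org/dm2e/place/berlin": "Berlin",
    "http://example.org/dm2e/place/nancy": "Nancy",
}
target_labels = {
    "http://example.org/geo/berlin": "Berlin",
    "http://example.org/geo/bern": "Bern",
    "http://example.org/geo/nancy-france": "Nancy",
    "http://example.org/geo/nancy-kentucky": "Nancy",
}

links = find_links(source_labels, target_labels)
for source_uri, target_uri in links:
    print(source_uri, "->", target_uri)
```

With this toy data the rule links "Berlin" only to the target labelled "Berlin" (the distance to "Bern" is 2, so it is rejected), but it links "Nancy" to both the French and the Kentucky entry. That is the ambiguity described on slide 19: a label alone cannot tell the two places apart. The same slide's spelling problem also shows up here, since "Frankfurt am Main" and "Frankfurt a.M." differ by more than one edit and would not be linked without an additional normalisation rule.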
