Pascal Christoph: Catalog enrichment à la Linked Open Data. SWIB12, Cologne, 2012-12-26. Workshop: Introduction to Linked Open Data
Swib12 workshop lod_beginners

  1. Pascal Christoph: Catalog enrichment à la Linked Open Data. SWIB12, Cologne, 2012-12-26. Workshop: Introduction to Linked Open Data
  2. License: This presentation, including the graphics made by the author, is licensed CC0: https://creativecommons.org/about/cc0. Pictures from http://www.istockphoto.com/ on slides 5, 7, 8 and 41 are licensed CC-BY-ND: http://creativecommons.org/licenses/by-nd/3.0/de/. Linking Open Data cloud diagram by Richard Cyganiak and Anja Jentzsch: http://lod-cloud.net/
  3. Overview
     - Catalog enrichment
       - Definition
       - Technique
       - Matching
       - Linking
     - Implementation demo
     - Conclusion
  5. Catalog enrichment?
  6. Catalog enrichment: definition
     - Any addition to the records:
       - links to full texts, web pages, ...
       - subjects, tags, reviews
       - covers
       - ...
     - The source of the addition does not matter (users, libraries, companies, ...)
     - New features: only indirectly
  7. "INSTANT GRATIFICATION"
  9. Catalog enrichment: methods. Database vs. mashup. Source of the pictures: http://findicons.com/about
  10. Methods
      - Local DB:
        - (+) elaborate combination of the data
        - (+) data can be used for search, browsing and other features
        - (-) continuously high effort to integrate the data
      - Dynamic mashup:
        - (+) data always up to date
        - (+) relatively easy to integrate the data
        - (-) needs a (performant) API
        - (-) no search etc.
  11. Infrastructure: RDF-based storage with a SPARQL endpoint
      - Easy to add data
      - Open to be used by customers
      - Self-describing data
      - SPARQL is a (too?) powerful API
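
To make the "SPARQL as an API" point on the infrastructure slide concrete, here is a minimal sketch of how a catalog client might build a query and the matching GET request. The endpoint URL and the resource URI are placeholders, not addresses taken from the slides; only the general SPARQL protocol convention of passing the query in a `query` parameter is assumed.

```python
from urllib.parse import urlencode

# Hypothetical endpoint URL; the slides do not give the real address.
ENDPOINT = "http://example.org/sparql/"


def build_query(resource_uri: str, limit: int = 10) -> str:
    """Return a SPARQL SELECT listing predicates and objects of one resource."""
    return (
        "SELECT ?p ?o WHERE { "
        f"<{resource_uri}> ?p ?o "
        f"}} LIMIT {limit}"
    )


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query as a GET request URL (query passed as 'query' parameter)."""
    return endpoint + "?" + urlencode({"query": query})


# Placeholder resource URI, for illustration only.
query = build_query("http://example.org/resource/1")
url = build_request_url(ENDPOINT, query)
print(url)
```

Because the data are self-describing, a client needs no schema beyond the URIs themselves; the same generic query works for any resource in the store.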
  13. Source of the picture: http://www.flickr.com/photos/jhsum-commons/4419490136/
  14. lobid.org
      - Triple store with SPARQL endpoint: 4store
      - Open data from the hbz union catalog
      - 16 M records <=> 1 B triples
      - Links to:
        - 5,500 Project Gutenberg
        - 1,250,000 Open Library
        - 12,000 DBpedia
        - 700,000 ZDB
        - 70,000 b3kat
        - 800,000 LOC ISO 639-2
        - 200,000 Dewey Decimal Class.
        - 22,000,000 GND authority file
        - 270,000 DNB Nationalbiografie
        - 32,000,000 lobid-organisations
        - 420,000 OCLC
  15. Software: Silk, Culturegraph, Google Refine, Hadoop, ...
  16. Matching algorithms
      - Depend on the data
      - Interesting data reside "elsewhere" => other cataloging rules
      - DBpedia example:
        - Creator, ISBN etc. are often missing => only the title can be matched
        - Constraints: German DBpedia; category:Literarisches_Werk, category:Lexikon,_Enzyklopädie
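
The title-only matching described for the DBpedia example can be sketched roughly as follows. This is not Silk or any other tool named in the slides, just a plain-Python illustration; the record layout (`id`, `title`), the function names and the example URIs are all assumptions made for the sketch.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    """Case-fold, replace punctuation with spaces and collapse whitespace."""
    text = unicodedata.normalize("NFKC", title).casefold()
    text = re.sub(r"[^\w\s]", " ", text)
    return " ".join(text.split())


def match_by_title(records, dbpedia_labels):
    """Return (record_id, dbpedia_uri) pairs whose normalized titles are equal."""
    index = {}
    for uri, label in dbpedia_labels.items():
        index.setdefault(normalize_title(label), []).append(uri)
    links = []
    for rec in records:
        for uri in index.get(normalize_title(rec["title"]), []):
            links.append((rec["id"], uri))
    return links


# Hypothetical sample data, for illustration only.
sample_records = [
    {"id": "rec1", "title": "Die heilige Johanna der Schlachthöfe."},
    {"id": "rec2", "title": "An unrelated title"},
]
sample_labels = {
    "http://example.org/dbpedia/Johanna": "Die Heilige Johanna der Schlachthöfe",
}
print(match_by_title(sample_records, sample_labels))
```

Matching on the title alone links every record that shares a title, whoever wrote it, which is why a post-processing step is needed.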
  17. Problem: disambiguation
      - Matching is too blurry
      - Post-processing: allow only bundles with the same creator
  18. Bundle having the same creator
  19. Bundle having different creators
  20. LOW-HANGING FRUIT. Kai Schreiber, "Reiche Ernte", 7 August 2005, via Flickr, CC BY-SA 2.0
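
The post-processing rule "allow only bundles with the same creator" might look like the sketch below. It assumes that a bundle is the set of catalog records linked to the same target resource, and that each record carries a single creator string; both are simplifications chosen for illustration, and all names and data are hypothetical.

```python
from collections import defaultdict


def bundle_links(links):
    """Group (record_id, target_uri) links into bundles keyed by target."""
    bundles = defaultdict(list)
    for record_id, target in links:
        bundles[target].append(record_id)
    return dict(bundles)


def keep_same_creator(bundles, creator_of):
    """Keep only bundles whose records all share one known creator."""
    kept = {}
    for target, record_ids in bundles.items():
        creators = {creator_of.get(rid) for rid in record_ids}
        # Exactly one distinct creator, and no record with a missing creator.
        if len(creators) == 1 and None not in creators:
            kept[target] = record_ids
    return kept


# Hypothetical sample data, for illustration only.
sample_links = [("r1", "A"), ("r2", "A"), ("r3", "B"), ("r4", "B")]
sample_creators = {
    "r1": "Brecht, Bertolt",
    "r2": "Brecht, Bertolt",
    "r3": "Author One",
    "r4": "Author Two",
}
print(keep_same_creator(bundle_links(sample_links), sample_creators))
```

Discarding a mixed bundle as a whole trades recall for precision, in line with the "sure" links reported later in the slides.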
  22. Triplification
      - Find predicates or mint them yourself, e.g. rdrel:workManifested
      - => Triple: <lobid-resource> <rdrel:workManifested> <dbpedia-resource>
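
As an illustration of the triplification step, the following sketch serializes one link as an N-Triples line. The namespace bound to `rdrel:` and both resource URIs are assumptions made for the example; the slides only give the prefixed name.

```python
# Assumed expansion of the rdrel: prefix; the slides do not spell it out.
RDREL = "http://rdvocab.info/RDARelationshipsWEMI/"


def to_ntriple(subject: str, predicate: str, obj: str) -> str:
    """Serialize three URIs as one N-Triples statement."""
    for uri in (subject, predicate, obj):
        # Reject characters that would break the <...> URI syntax.
        if any(ch in uri for ch in '<>"') or any(ch.isspace() for ch in uri):
            raise ValueError(f"not a usable URI: {uri!r}")
    return f"<{subject}> <{predicate}> <{obj}> ."


# Hypothetical resource URIs, for illustration only.
line = to_ntriple(
    "http://example.org/lobid/resource/1",
    RDREL + "workManifested",
    "http://example.org/dbpedia/resource/1",
)
print(line)
```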
  23. Indexing
      - What is the license?
      - Import the triples into the SPARQL endpoint
      - A separate "named graph" has advantages:
        - Easily removable/changeable
        - Provenance is stored
        - Specific named graphs can be queried
  24. Named Graphs
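
The three advantages of a separate named graph map onto three SPARQL operations. The sketch below only builds the request strings in SPARQL 1.1 syntax; the graph URI is a placeholder, and whether a given store accepts these updates through its endpoint is an assumption to check against that store's documentation.

```python
def insert_into_graph(graph_uri: str, ntriples: list) -> str:
    """Build an INSERT DATA request that loads triples into one named graph."""
    body = "\n".join(ntriples)
    return f"INSERT DATA {{ GRAPH <{graph_uri}> {{\n{body}\n}} }}"


def drop_graph(graph_uri: str) -> str:
    """Build a request that removes the whole graph, e.g. before a reload."""
    return f"DROP GRAPH <{graph_uri}>"


def select_from_graph(graph_uri: str, limit: int = 10) -> str:
    """Build a query restricted to one named graph (its provenance)."""
    return (
        f"SELECT ?s ?p ?o WHERE {{ GRAPH <{graph_uri}> {{ ?s ?p ?o }} }} "
        f"LIMIT {limit}"
    )


# Hypothetical graph URI and triple, for illustration only.
GRAPH = "http://example.org/graph/dbpedia-links"
triples = ["<http://example.org/s> <http://example.org/p> <http://example.org/o> ."]
print(insert_into_graph(GRAPH, triples))
print(select_from_graph(GRAPH))
print(drop_graph(GRAPH))
```

Keeping each link set in its own graph means a faulty matching run can be dropped and reloaded without touching the catalog data.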
  25. What we achieved
      - 12,000 "sure" links to 4,000 DBpedia resources => 4,000 new "Work" levels (21,000 discarded links)
      - Average size of a bundle: 3
      - Links to Freebase: 3,000
      - 0.1 % enrichment
  26. What we achieved
      - 5,500 links to 400 Project Gutenberg resources (full texts in different formats) => 0.05 % enrichment
      - 1,200,000 links to the work level of the Open Library => 12.5 % enrichment
  27. What we achieved. Sir Tim Berners-Lee. Source of picture: http://www.w3.org/DesignIssues/LinkedData.html
  28. LOW-HANGING FRUIT. Kai Schreiber, "Reiche Ernte", 7 August 2005, via Flickr, CC BY-SA 2.0
  29. What we achieved. DBpedia example: "Die Heilige Johanna der Schlachthöfe"
  30. What we achieved. Open Library example: "With reference to reference"
  31. Linking example: LODUM
  32. Integration into the catalog
      - What is allowed?
      - What should be integrated, and what not?
      - Human-readable presentation of the links/URIs
      - (Some) data should be indexed locally (e.g. to be able to search)
      - ...
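
One way to approach the "human-readable presentation of the links/URIs" is to derive a display label from the URI itself. The sketch below is only an illustration: the host-to-name table, the label format and the example URIs are assumptions, and a real catalog would more likely fetch a proper label from the linked data.

```python
from urllib.parse import unquote, urlparse

# Hypothetical mapping from host names to display names.
SOURCE_NAMES = {
    "de.dbpedia.org": "DBpedia (German)",
    "openlibrary.org": "Open Library",
    "www.gutenberg.org": "Project Gutenberg",
}


def link_label(uri: str) -> str:
    """Turn a link URI into a short label such as 'Source: Local name'."""
    parsed = urlparse(uri)
    source = SOURCE_NAMES.get(parsed.netloc, parsed.netloc)
    # Use the last path segment, decoded, with underscores shown as spaces.
    last = unquote(parsed.path.rstrip("/").rsplit("/", 1)[-1]).replace("_", " ")
    return f"{source}: {last}" if last else source


print(link_label("http://de.dbpedia.org/resource/Some_Work"))
```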
  34. Implementation demo
  35. Implementation demo
  37. Picture source: http://www.flickr.com/photos/library_of_congress/4037490394/
  38. Conclusion: Everything that's possible with LOD could also be achieved without LOD. It's just easier with LOD.
  39. LOD - Definition "linked": Ad astra? Ad data! To boldly go where no data has gone before. Source of the picture: http://hubblesite.org/gallery/album/star/pr2006050d
  40. Open source
      - Culturegraph: http://sourceforge.net/projects/culturegraph/
      - 4store: http://4store.org/
      - lobid: https://github.com/lobid/
      - Silk: https://www.assembla.com/spaces/silk
  41. Thank you! Pascal Christoph, christoph@hbz-nrw.de, semweb@hbz-nrw.de
  42. List of references
      - KiM: Empfehlungen zur Öffnung bibliothekarischer Daten. https://wiki.d-nb.de/pages/viewpage.action?pageId=45419980
      - Till Kreutzer (2010): Open Data – Freigabe von Daten aus Bibliothekskatalogen. http://www.hbz-nrw.de/dokumentencenter/veroeffentlichungen/open-data-leitfaden.pdf
      - Adrian Pohl (2010): Open Data im hbz-Verbund. Published in: ProLibris 3. Preprint: http://www.hbz-nrw.de/dokumentencenter/produkte/lod/aktuell/pohl_2010_open-data.pdf
      - Tim Berners-Lee's talk on Open Data (2010): http://www.youtube.com/watch?v=3YcZ3Zqk0a8
      - Jansen / Christoph: Dynamische Kataloganreicherung auf Basis von Linked Open Data. http://de.slideshare.net/h_jansen/dynamische-kataloganreicherung-auf-basis-von-linked-open-data
      - Blog post: First results using SILK to link to DBpedia. https://wiki1.hbz-nrw.de/display/SEM/2012/05/03/First+results+using+SILK+to+link+to+DBpedia
      - Blog post: 1.2 M links to Open Library. https://wiki1.hbz-nrw.de/display/SEM/2012/05/23/1.2+M+links+to+Open+Library
      - Oliver Flimm (2010): LOD und die Open Library. http://de.slideshare.net/flimm/lod-openlibrary20100512
      - Data directory "thedatahub" (aka CKAN): http://www.thedatahub.org/
      - 49 bibliographic data sources as LOD: http://thedatahub.org/group/bibliographic?tags=lod
