Processing Linked Data with Catmandu

An introduction to library data processing with the help of Catmandu

http://librecat.org

Published in: Education

  1. Processing Linked Data with Catmandu | Patrick Hochstenbach, UGent | http://librecat.org
  2. LUND, GHENT, BIELEFELD
  3. RATIONALE
  4. KAHN-WILENSKY: WEB, HANDLE, SERVICE PROVIDER, REPOSITORY, REPOSITORY. "I am searching for a paper about..."
  5. Hypothesis 1: one network with a common schema. Hypothesis 2: object-oriented design. Hypothesis 3: the resource is the message.
  6. Hypothesis 1: one network with a common schema. GOOGLE, EUROPEANA, OPENAIRE, CRIS. Videos, Images, Books, Data sets.
  7. Hypothesis 2: object-oriented design. Drive, Race, Park, Economy, Compact, Minivan, Convertible, Wheel, Half car, Bicycle, Zeppelin.
  8. Hypothesis 3: the resource researcher is the message. DNS, GOOGLE, REPOSITORY, CLOUD, Dr. Müller.
  9. LIBRECAT/CATMANDU
  10. CATMANDU: PubMed, MARC, MODS, EXCEL, DSPACE, Fedora, SRU, OAI-PMH, DBI, ISI, Twitter, DBI, Atom, EXCEL, RDF, JSON, XML, Solr, ElasticSearch, MongoDB, Fedora, Aleph, Fix
  11. FUNCTIONAL DESIGN. JSON. each, slice, take, group, select, map, reduce. add_field, join_field, lookup, remove_field, marc_mab, count.
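The functional design on slide 11 (records as JSON-like items, processed with operations such as select, map, take and reduce) can be sketched roughly as follows. This is a Python illustration only, not Catmandu's Perl implementation; the sample records and the `add_field` helper are hypothetical.

```python
from functools import reduce
from itertools import islice

# Hypothetical records standing in for items read by an importer.
records = [
    {"_id": "1", "title": "War and peace", "year": 1952},
    {"_id": "2", "title": "Anna Karenina", "year": 1954},
    {"_id": "3", "title": "Resurrection", "year": 1899},
]


def add_field(record, key, value):
    """Return a copy of the record with one extra field (mimics a fix)."""
    return {**record, key: value}


# select -> map -> take, evaluated lazily one record at a time.
selected = (r for r in records if r["year"] > 1900)
mapped = (add_field(r, "source", "catalog") for r in selected)
first_two = list(islice(mapped, 2))

# reduce: fold the resulting records into a single value (here, a count).
count = reduce(lambda acc, _record: acc + 1, first_two, 0)
```

Because each step is a generator, no record is read before a later step asks for it, which is the property that lets such a pipeline handle large record sets.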
  12. LIBRECAT: Institutional Repositories, Search Engines, Image Databases, Archival Systems, Data cleaning workbench, Citation Style Processor
  13. CATMANDU: CONVERT
      catmandu convert MARC to JSON < records.mrc
      catmandu convert OAI --url http://server/OAI to JSON
      catmandu convert SRU --url http://server/SRU --query dna to JSON
      catmandu convert DBI --query 'SELECT * FROM table' to JSON
      catmandu convert MARC to JSON < records.mrc
      catmandu convert OAI --url http://server/OAI to XML
      catmandu convert SRU --url http://server/SRU --query dna to YAML
      catmandu convert ArXiv --query 'all:electron' to CSV
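The convert commands on slide 13 all pair an importer with an exporter. A minimal Python sketch of that pairing, assuming a CSV importer and a JSON exporter, might look like this; the function names are hypothetical and are not part of Catmandu.

```python
import csv
import io
import json


def import_csv(text):
    """Importer: yield one dict per CSV row."""
    yield from csv.DictReader(io.StringIO(text))


def export_json(records):
    """Exporter: serialize all records as one JSON array."""
    return json.dumps(list(records), ensure_ascii=False)


def convert(text, importer, exporter):
    """Feed the importer's record stream straight into the exporter."""
    return exporter(importer(text))


sample = "id,title\n1,War and peace\n"
output = convert(sample, import_csv, export_json)
```

Swapping either function changes the input or output format without touching the other side, which is the idea behind `convert X to Y`.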
  14. CATMANDU: FIX
      catmandu convert X to Y --fix 'marc_map("245","title")'
      catmandu convert X to Y --fix 'prepend("title","abcd-")'
      catmandu convert X to Y --fix fixes.txt
      fixes.txt:
      remove_field("_id");
      marc_map("001", "merge.id");
      prepend("merge.id", "author:");
      add_field("merge.source","author");
      copy_field("merge.id","_id");
      Fixes: set_field, add_field, move_field, copy_field, remove_field, upcase, downcase, capitalize, trim, substring, prepend, append, lookup, lookup_in_store, count, cmd, split_field, join_field, retain_field, replace_all, collapse, expand, clone, if_all_match, if_any_match, if_exists
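To make the effect of the `fixes.txt` example on slide 14 concrete, here is a rough Python sketch of the same five steps applied in order. It is an illustration under assumptions, not the Fix language itself: the record layout is simplified, and `control_field_map` is a hypothetical stand-in for `marc_map` that reads a flat `marc` dictionary.

```python
import copy


def set_path(record, path, value):
    """Set a value at a dotted path such as 'merge.id', creating dicts as needed."""
    *parents, last = path.split(".")
    node = record
    for key in parents:
        node = node.setdefault(key, {})
    node[last] = value


def get_path(record, path):
    """Read the value at a dotted path."""
    node = record
    for key in path.split("."):
        node = node[key]
    return node


# Each helper returns a function that mutates one record in place.
def remove_field(key):
    return lambda rec: rec.pop(key, None)  # top-level keys only, for brevity


def control_field_map(tag, path):
    return lambda rec: set_path(rec, path, rec["marc"][tag])


def prepend(path, prefix):
    return lambda rec: set_path(rec, path, prefix + get_path(rec, path))


def add_field(path, value):
    return lambda rec: set_path(rec, path, value)


def copy_field(src, dst):
    return lambda rec: set_path(rec, dst, copy.deepcopy(get_path(rec, src)))


FIXES = [
    remove_field("_id"),
    control_field_map("001", "merge.id"),
    prepend("merge.id", "author:"),
    add_field("merge.source", "author"),
    copy_field("merge.id", "_id"),
]


def apply_fixes(record, fixes):
    """Apply the fixes in order to a copy, leaving the input untouched."""
    result = copy.deepcopy(record)
    for fix in fixes:
        fix(result)
    return result


record = {"_id": "tmp-1", "marc": {"001": "000000008"}}
fixed = apply_fixes(record, FIXES)
```

In this sketch the record ends up with a new `_id` derived from field 001 and a `merge` section that records where the identifier came from.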
  15. CATMANDU: IMPORT / EXPORT
      catmandu import JSON to MongoDB --opt ... --opt ...
      catmandu import MARC to ElasticSearch
      catmandu import DC to FedoraCommons
      catmandu import CSV to DBI
      catmandu export MongoDB to JSON
      catmandu export Solr to YAML
      catmandu export DBI to CSV
      catmandu export FedoraCommons to Template --template test.tt
      test.tt (Template Toolkit):
      [%- FOREACH f IN record %]
      [% _id %] [% f.shift %][% f.shift %][% f.shift %][% f.join(":") %]
      [%- END %]
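The `test.tt` template on slide 15 prints one line per field: the record `_id`, then the first three elements of the field, then the remaining elements joined with colons. A Python sketch of that rendering, together with a toy in-memory store for the import and export steps, could look like the following. It assumes each field is a list of the form `[tag, ind1, ind2, code, value, ...]`; the `STORE` dictionary and all function names are hypothetical.

```python
# Toy in-memory store keyed by record _id (stands in for MongoDB, Solr, ...).
STORE = {}


def import_records(records):
    """Import: add each record to the store under its _id."""
    for record in records:
        STORE[record["_id"]] = record


def export_records():
    """Export: yield every record currently in the store."""
    yield from STORE.values()


def render(record):
    """Mimic test.tt: '_id tag+ind1+ind2+rest-joined-by-colon', one line per field."""
    lines = []
    for field in record["record"]:
        tag, ind1, ind2, *rest = field
        lines.append(f'{record["_id"]} {tag}{ind1}{ind2}{":".join(rest)}')
    return "\n".join(lines)


import_records([
    {"_id": "000000008", "record": [["245", "1", "0", "a", "War and peace /"]]},
])
rendered = [render(r) for r in export_records()]
```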
  16. CATMANDU: https://metacpan.org/pod/Catmandu https://github.com/LibreCat/Catmandu http://librecat.org/tutorial/ http://librecat.org/catmandu/2013/06/21/catmandu-cheat-sheet.html
  17. LIBRECAT: http://biblio.ugent.be
  18. LIBRECAT: http://pub.uni-bielefeld.de/en
  19. LIBRECAT: http://adore.ugent.be
  20. LIBRECAT: http://libnew.ugent.be
  21. Architecture: FEDORA, MEDIAMOSA, VLE, BIBLIO, SCANNING, RECEIVE, ALEPH, ABS, CLOUD, DEDUP/MERGE/AUGMENT, BLACKLIGHT
  22. LINKED DATA
  23. PRODUCTION. CATALOG; MARC: 245 $$a ... $$b, 260 $$a ..., 700 $$a ...; JSON/YAML; LINKED DATA
  24. STAGE 1: CATALOG to MARC. CATALOG; MARC: 245 $$a ... $$b, 260 $$a ..., 700 $$a ...
      $ catmandu export ALEPH to MARC
  25. STAGE 2: MARC to JSON. MARC: 245 $$a ... $$b, 260 $$a ..., 700 $$a ...; JSON/YAML
  26. STAGE 2: MARC to JSON. MARC: 245 $$a ... $$b, 260 $$a ..., 700 $$a ...; JSON/YAML
      Author: Tolstoj, Lev Nikolaevič,
      Title: War and peace /
      Publication Year: 1952.
      Subject: Napoleonic Wars,
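Stage 2 on slides 25 and 26 maps MARC subfields to named JSON fields. The sketch below shows one way such a mapping can work in Python. The tag and subfield choices (100$a, 245$a, 260$c, 650$a) follow common MARC 21 usage and are an assumption, since the slides do not spell out the mapping; the field layout and the `marc_map` function here are illustrative, not Catmandu's own.

```python
# Hypothetical record: each field is [tag, ind1, ind2, code, value, code, value, ...].
MARC_RECORD = [
    ["100", "1", " ", "a", "Tolstoj, Lev Nikolaevič,"],
    ["245", "1", "0", "a", "War and peace /"],
    ["260", " ", " ", "c", "1952."],
    ["650", " ", "0", "a", "Napoleonic Wars,"],
]

# Assumed mapping from (tag, subfield code) to JSON key.
MAPPING = {
    ("100", "a"): "author",
    ("245", "a"): "title",
    ("260", "c"): "year",
    ("650", "a"): "subject",
}


def marc_map(fields, tag, code):
    """Return the first value of subfield `code` found in a field `tag`, or None."""
    for field_tag, _ind1, _ind2, *subfields in fields:
        if field_tag != tag:
            continue
        # Subfields alternate between code and value.
        for sub_code, value in zip(subfields[0::2], subfields[1::2]):
            if sub_code == code:
                return value
    return None


record = {}
for (tag, code), key in MAPPING.items():
    value = marc_map(MARC_RECORD, tag, code)
    if value is not None:
        record[key] = value
```

The values still carry their cataloguing punctuation at this point, exactly as on slide 26.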
  27. STAGE 2: MARC to JSON. MARC: 245 $$a ... $$b, 260 $$a ..., 700 $$a ...; JSON/YAML; FIX
      Before the fix:
      Author: Tolstoj, Lev Nikolaevič,
      Title: War and peace /
      Year: 1952.
      Subject: Napoleonic Wars,
      After the fix:
      Author: Tolstoj, Lev Nikolaevič
      Title: War and peace
      Year: 1952
      Subject: Napoleonic Wars
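The cleanup shown on slide 27 removes the trailing punctuation that cataloguing rules leave on MARC values. A minimal Python sketch of such a cleanup step follows; the regular expression is an assumption about which characters to strip, not the fix used in the talk.

```python
import re

# Trailing whitespace and separator punctuation: comma, period, slash, colon, semicolon.
TRAILING = re.compile(r"[\s,./:;]+$")


def clean(value):
    """Strip trailing separator punctuation from one field value."""
    return TRAILING.sub("", value)


raw = {
    "author": "Tolstoj, Lev Nikolaevič,",
    "title": "War and peace /",
    "year": "1952.",
    "subject": "Napoleonic Wars,",
}
cleaned = {key: clean(value) for key, value in raw.items()}
```

A rule this simple would also strip the final period from a value that legitimately ends in an abbreviation, so real data usually needs per-field rules.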
  28. STAGE 3a: JSON to RDF. JSON/YAML; LINKED DATA; FIX
      Author: Tolstoj, Lev Nikolaevič
      Title: War and peace
      Year: 1952
      Subject: Napoleonic Wars
      ?
      dc:creator: Tolstoj, Lev Nikolaevič
      dc:title: War and peace
      dc:date: 1952
      dc:subject: Napoleonic Wars
  29. STAGE 3a: JSON to RDF. JSON/YAML; LINKED DATA; FIX
      <http://example.org/000000008>
      <http://purl.org/dc/elements/1.1/creator> "Tolstoj, Lev Nikolaevič" ;
      <http://purl.org/dc/elements/1.1/title> "War and peace" ;
      <http://purl.org/dc/elements/1.1/date> "1952" ;
      <http://purl.org/dc/elements/1.1/subject> "Napoleonic Wars" ;
      a <http://www.europeana.eu/schemas/edm/Book> .
      RDF/Turtle http://demo.librecat.org/
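The step from named JSON fields to Dublin Core triples on slides 28 and 29 amounts to renaming keys to predicate URIs and serializing the result. The Python sketch below builds a Turtle string by hand to show the idea. The `MAPPING` table and `to_turtle` function are hypothetical; the predicate and class URIs are the ones printed on slide 29.

```python
DC = "http://purl.org/dc/elements/1.1/"
BOOK_CLASS = "http://www.europeana.eu/schemas/edm/Book"

# JSON key -> Dublin Core element name, as on slide 28.
MAPPING = {
    "author": "creator",
    "title": "title",
    "year": "date",
    "subject": "subject",
}


def escape(value):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def to_turtle(record, base="http://example.org/"):
    """Serialize one record as a single Turtle subject with literal objects."""
    subject = f"<{base}{record['_id']}>"
    pairs = [
        f'<{DC}{MAPPING[key]}> "{escape(record[key])}"'
        for key in MAPPING
        if key in record
    ]
    pairs.append(f"a <{BOOK_CLASS}>")
    return subject + "\n    " + " ;\n    ".join(pairs) + " .\n"


record = {
    "_id": "000000008",
    "author": "Tolstoj, Lev Nikolaevič",
    "title": "War and peace",
    "year": "1952",
    "subject": "Napoleonic Wars",
}
turtle = to_turtle(record)
```

For anything beyond a demonstration, an RDF library would be a safer way to produce valid serializations than string formatting.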
  30. STAGE 3b: RDF to Linked Data. JSON/YAML; LINKED DATA; FIX
      <http://example.org/000000008>
      <http://purl.org/dc/elements/1.1/creator> "Tolstoj, Lev Nikolaevič" ;
      <http://purl.org/dc/elements/1.1/title> "War and peace" ;
      <http://purl.org/dc/elements/1.1/date> "1952" ;
      <http://purl.org/dc/elements/1.1/subject> "Napoleonic Wars" ;
      a <http://www.europeana.eu/schemas/edm/Book> .
      <http://example.org/000000008>
      <http://purl.org/dc/elements/1.1/creator> <http://viaf.org/viaf/96987389> ;
      <http://purl.org/dc/elements/1.1/title> "War and peace" ;
      <http://purl.org/dc/elements/1.1/date> "1952" ;
      <http://purl.org/dc/elements/1.1/subject> <http://dbpedia.org/page/Napoleonic_Wars> ;
      a <http://www.europeana.eu/schemas/edm/Book> .
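Stage 3b on slide 30 replaces string literals with URIs that point to external authorities. A rough Python sketch of that lookup step is shown below. The `AUTHORITIES` table is a hypothetical hard-coded stand-in for whatever lookup source is used in practice; its two URIs are the ones shown on the slide.

```python
# Hypothetical lookup table: literal value -> authority URI (URIs from slide 30).
AUTHORITIES = {
    "Tolstoj, Lev Nikolaevič": "http://viaf.org/viaf/96987389",
    "Napoleonic Wars": "http://dbpedia.org/page/Napoleonic_Wars",
}


def link(value):
    """Return a Turtle URI reference if the value is known, else a plain literal."""
    uri = AUTHORITIES.get(value)
    if uri is not None:
        return f"<{uri}>"
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


objects = {
    "creator": link("Tolstoj, Lev Nikolaevič"),
    "title": link("War and peace"),
    "subject": link("Napoleonic Wars"),
}
```

Values without a match stay as literals, which matches the slide: title and date remain strings while creator and subject become links.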
  31. THANK YOU. Nicolas Steenlant, Nicolas Franck, Snorri Briem, Jörgen Eriksson, Maria Hedberg, Dave Sheroman, Friedrich Summann, Najko Jahn, Vitali Peil, Petra Kohorst, Christian Pietsch, Mathias Lösch, Johan Rolschewski, Jakob Voß. UGENT, LUND, BIELEFELD, GBV, STAATSBIBLIOTHEK ZU BERLIN. Wouter Willaert, INUITS.
