Wikidata as opportunity for special collections: the 20th Century Press Archives use case

Presentation at LIBER 2019, Linked Open Data Working Group,
26.06.2019, Dublin (Ireland)

  1. Wikidata as opportunity for special collections: the 20th Century Press Archives use case. Joachim Neubert, ZBW – Leibniz Information Centre for Economics, Kiel/Hamburg (ZBW is a member of the Leibniz Association). LIBER 2019, Linked Open Data Working Group, 26.06.2019, Dublin (Ireland)
  2. Agenda: 1. What are we dealing with? 2. Why Wikidata? 3. Transferring metadata (a. Link to existing items, b. Create missing items, c. Add metadata to the items) 4. Using the data 5. Future work
  3. What are we dealing with? (Photo: https://commons.wikimedia.org/wiki/File:ZBW-Personenarchiv_2015.jpg by Max-Michael Wannags)
  4. What are we dealing with? Historic press archives, founded in 1909 (Hamburg) and 1914 (Kiel) • Some material dating back to 1826 • Collections closed in 2005 • Thematic dossiers covering: persons • companies • products • general subjects and events
  5. Current state: former DFG-funded project, resulting in • Digitized roll films (material before 1949) • Relational database about dossiers, often with GND ID • Large file system (containing more than 2 million pages) • Accessible via the custom application “Pressemappe 20. Jahrhundert” and via the DFG-Viewer (METS/MODS files, per dossier) • All metadata available under a CC0 license
  6. Long-term sustainability? The specialized application for discovery and access is architecturally outdated and expensive to maintain (http://webopac.hwwa.de/pressemappe20)
  7. Why Wikidata?
  8. Wikidata basics • Knowledge base for Wikimedia projects • All kinds of entities: concepts, places, people, works … • Editable and extensible by everyone • Data available under CC0 • http://query.wikidata.org/ (SPARQL) • JSON API & database dumps • Sustainable foundation for keeping data available in the long term
  9. Wikidata statements
  10. (no text on this slide)
  11. Linking mechanism: external identifiers • Property value: a unique ID from an external database • Plus a URL stub in the property definition (“formatter URL”) • Almost 4,000 external identifier properties • Examples: GND • proteins • African plants • Swedish cultural heritage objects
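
The formatter URL mechanism on slide 11 amounts to a string substitution: the external ID stored as the property value replaces a placeholder in the URL stub. The following is a minimal sketch of that idea, assuming the `$1` placeholder convention of Wikidata property definitions. The function name is hypothetical, and the GND formatter URL and the sample ID are illustrative values, not taken from the slides.

```python
def expand_formatter_url(formatter_url: str, external_id: str) -> str:
    """Replace the $1 placeholder of a formatter URL with an external ID."""
    if "$1" not in formatter_url:
        raise ValueError("formatter URL has no $1 placeholder")
    return formatter_url.replace("$1", external_id)


# Illustrative values (assumptions, not from the slides)
GND_FORMATTER_URL = "https://d-nb.info/gnd/$1"
print(expand_formatter_url(GND_FORMATTER_URL, "118540238"))
```

Because the property definition holds the URL stub, a change of the target URL pattern only has to be made once, in the property, rather than in every item.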
  12. Transferring collection metadata to Wikidata: 1. What are we dealing with? 2. Why Wikidata? 3. Transferring metadata (a. Link to existing items, b. Create missing items, c. Add metadata to the items) 4. Using the data 5. Future work
  13. Wikidata property P4293 (PM20 folder ID) • Property proposal and discussion within the community • Additional prerequisite: an RDF representation of the PM20 contents and a SPARQL endpoint, allowing federated queries with the Wikidata endpoint
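
To illustrate what a federated query between a PM20 endpoint and the Wikidata endpoint could look like, the sketch below builds a SPARQL query that lists folder IDs not yet present in Wikidata as P4293 values. It is meant to be sent to the PM20 endpoint and reaches Wikidata through a `SERVICE` clause. The PM20 RDF vocabulary is not described in the slides, so the predicate that carries the folder ID is a parameter, and the Dublin Core predicate used in the example call is purely an assumption; the function name is hypothetical as well.

```python
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_unlinked_folders_query(id_predicate: str,
                                 wikidata_endpoint: str = WIKIDATA_ENDPOINT) -> str:
    """Return a SPARQL query for folder IDs that no Wikidata item carries yet."""
    return f"""PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?folder ?pm20Id WHERE {{
  ?folder <{id_predicate}> ?pm20Id .
  MINUS {{
    SERVICE <{wikidata_endpoint}> {{
      ?item wdt:P4293 ?pm20Id .
    }}
  }}
}}"""


# Assumed predicate, for illustration only
query = build_unlinked_folders_query("http://purl.org/dc/terms/identifier")
print(query)
```

Dropping the `MINUS` wrapper and keeping only the `SERVICE` block would instead join each folder with its Wikidata item.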
  14. Link to existing items • Automatically inserted links derived from GND IDs • Tool-supported manual linking: Wikidata's Mix'n'match (great for persons, crowd-sourced) • custom tools (like this) • others (OpenRefine, …) • About 95% of PM20 person folders linked by mid-June 2019!
  15. Checking proposed matches in Mix'n'match
  16. Add missing items to Wikidata automatically. Recommendations for item creation: • Pay attention to Wikidata's notability criteria • Explain your plan and ask for feedback in the Wikidata project chat • Apply for a bot account to make mass edits (example) • Source every statement. Process: • Transform query results into a QuickStatements input file • Copy & paste into QuickStatements
  17. QuickStatements input from PM20 • Using a federated query to exclude existing Wikidata items • Query output transformed by a script
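
The transformation step on slide 17 can be pictured as follows. This is a minimal sketch, not the script used in the project: it turns query result rows into tab-separated QuickStatements (version 1) commands that create one item per person, with an English label, "instance of: human" (P31, Q5), the PM20 folder ID (P4293), and a reference URL (S854) on every statement. The row keys, the function name, the sample person, and the `example.org` URL template are hypothetical.

```python
def to_quickstatements(rows, folder_url_template):
    """Turn result rows into QuickStatements V1 commands (tab-separated)."""
    lines = []
    for row in rows:
        label = row["label"]
        pm20_id = row["pm20_id"]
        folder_url = folder_url_template.replace("$1", pm20_id)
        reference = f'S854\t"{folder_url}"'  # S854 = reference URL
        lines.append("CREATE")  # start a new item
        lines.append(f'LAST\tLen\t"{label}"')  # English label
        lines.append(f"LAST\tP31\tQ5\t{reference}")  # instance of: human
        lines.append(f'LAST\tP4293\t"{pm20_id}"\t{reference}')  # PM20 folder ID
    return "\n".join(lines)


# Hypothetical sample row and URL template
rows = [{"label": "Erika Mustermann", "pm20_id": "pe/000000"}]
print(to_quickstatements(rows, "https://example.org/pm20/folder/$1"))
```

The printed text is what would be pasted into QuickStatements; attaching the reference to each statement follows the recommendation to source every statement.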
  18. Added Wikidata item • All 5,200 PM20 person folders now linked from Wikidata!
  19. Add metadata to Wikidata items, e.g., for all persons in Wikidata with a PM20 ID and the PM20 “field of activity” “economics” or “business economics”, insert the corresponding occupation into the Wikidata person item (script, query)
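
As a rough illustration of the step on slide 19, the sketch below maps a PM20 “field of activity” to a Wikidata occupation and emits one QuickStatements line per matching person, adding an occupation statement (P106) with a reference URL. It is an assumption about how such a script might look, not the project's actual script. Only “economics” is mapped here (to Q188094, economist); the item for “business economics” would have to be looked up and added to the mapping. The person QIDs and URLs in the example are placeholders.

```python
OCCUPATION_BY_FIELD = {
    "economics": "Q188094",  # economist
}


def occupation_statements(persons, occupation_by_field=OCCUPATION_BY_FIELD):
    """persons: iterable of (wikidata_qid, field_of_activity, folder_url)."""
    lines = []
    for qid, field, folder_url in persons:
        occupation = occupation_by_field.get(field)
        if occupation is None:
            continue  # no mapping for this field: leave the item untouched
        lines.append(f'{qid}\tP106\t{occupation}\tS854\t"{folder_url}"')
    return lines


# Placeholder input (QIDs and URLs are not real PM20 data)
persons = [
    ("Q4115189", "economics", "https://example.org/pm20/folder/pe/000000"),
    ("Q13406268", "politics", "https://example.org/pm20/folder/pe/000001"),
]
print("\n".join(occupation_statements(persons)))
```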
  20. Using the data on Wikidata
  21. “Proof of concept” example: Map of economists. Query link (on the Wikidata SPARQL endpoint; see also the list of all PM20 economists)
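
The query behind the map is not reproduced in this transcript. A query of that kind might look like the sketch below, which selects persons with a PM20 folder ID (P4293) and the occupation economist (Q188094) and plots them with the Wikidata Query Service map view. Plotting by place of birth (P19) and its coordinates (P625) is an assumption, as is the helper that turns the query into a shareable query-service link.

```python
from urllib.parse import quote

MAP_QUERY = """#defaultView:Map
SELECT ?item ?itemLabel ?pm20Id ?coord WHERE {
  ?item wdt:P4293 ?pm20Id ;
        wdt:P106 wd:Q188094 ;
        wdt:P19 ?birthplace .
  ?birthplace wdt:P625 ?coord .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}"""


def query_service_link(query: str) -> str:
    """Build a link that opens the query in the Wikidata Query Service UI."""
    return "https://query.wikidata.org/#" + quote(query, safe="")


print(query_service_link(MAP_QUERY))
```

Removing the `#defaultView:Map` line and the coordinate pattern would yield a plain list of PM20 economists instead of a map.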
  22. Display via DFG Viewer link
  23. Future work • Build community support for further extension of the PM20 metadata • Create an item structure for the subject and ware archives, and link the folders (~ 12,000) • Link/create items for company folders (~ 8,000) • Create a static HTML site with one page per folder (+ additional navigation pages) on the PM20 website which hosts the digitized images (= permanent reference) • Optionally, create additional Wikidata-based searching/browsing facilities • Retire the present ColdFusion application
  24. WikiProject 20th Century Press Archives: https://www.wikidata.org/wiki/Wikidata:WikiProject_20th_Century_Press_Archives
  25. Thanks for listening! Joachim Neubert, ZBW – Leibniz Information Centre for Economics, j.neubert@zbw.eu, http://zbw.eu/labs, https://www.wikidata.org/wiki/User:Jneubert