Linking Library Data using Fusepool

Published in: Technology, Education

  1. Linking Library Data with Fusepool. Johannes Hercher (Free University Berlin), June 25, 2014, @jhercher
  2. Fusepool final public workshop! Brussels, June 25th – Johannes Hercher, Free University Berlin, University Library. Context: I care for metadata. "Ugh! Your OPAC sucks." We cooperate… How to link library data with the "oceans" of the WWW? The German National Library published authority data.
  3. Example: a search in the subject index (with GND identifiers) compared with a search in full text. http://primo.fu-berlin.de
     • GND = thesaurus for subject indexing in Germany
     • Search with GND is limited to local resources
  4. You* should use identifiers (*publishers, authors, aggregators)
     • search beyond the local holdings => easier, more reliable
     • suggest content using semantic relations (GND is a thesaurus!)
     Reality: assigning IDs is time-consuming. Vision: assigning IDs is fun.
  5. Questions & Tasks
     • Could machines do the subject indexing? -> Use SMA to enrich DBpedia pages with GND IDs
     • Can we support librarians in subject indexing? -> Build an Annotator prototype
     https://github.com/jhercher/LEE/
  6. Demonstrator AnnotatorApp: filters stop words and displays library entities for your text
  7. Review concepts and start a search using concept IDs. https://github.com/jhercher/LEE
  8. How to Fusepool
  9. Workflow
     1. Select a subset of GND Subject Headings using SPARQL
     2. Import the Subject Headings
     3. Configure the SMA dictionary component
     4. Import documents (graph)
     5. Batch-match documents against the dictionaries using Fusepool's DLC
     6. Review the results and build services on top
  10. SPARQL endpoint: http://zbw.eu/beta/sparql/gnd
      GND ontology: http://d-nb.info/standards/elementset/gnd
      Classes shown:
      • SubjectHeading
      • NomenclatureInBiologyOrChemistry
      • SubjectHeadingSensoStricto
      • ProductNameOrBrandName
      • HistoricSingleEventOrEra
      • EthnographicName
      • GroupOfPersons
      • Language
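
Step 1 of the workflow (selecting a subset of GND Subject Headings with SPARQL) could look roughly like the sketch below. It only builds the query text; sending it to an endpoint is left out. The namespace `http://d-nb.info/standards/elementset/gnd#` and the label property `preferredNameForTheSubjectHeading` are assumptions that are not stated on the slides and should be checked against the GND ontology. The helper name `build_subject_heading_query` is hypothetical.

```python
# Assumed namespace and label property of the GND ontology (not given on
# the slides); verify both against the element set before relying on them.
GND_NS = "http://d-nb.info/standards/elementset/gnd#"
LABEL_PROPERTY = "preferredNameForTheSubjectHeading"


def build_subject_heading_query(class_names, limit=1000):
    """Return a SPARQL SELECT query for GND entries of the given classes."""
    # VALUES restricts ?type to the chosen subject heading classes.
    values = " ".join(f"gnd:{name}" for name in class_names)
    return (
        f"PREFIX gnd: <{GND_NS}>\n"
        "SELECT ?concept ?label WHERE {\n"
        f"  VALUES ?type {{ {values} }}\n"
        "  ?concept a ?type ;\n"
        f"           gnd:{LABEL_PROPERTY} ?label .\n"
        "}\n"
        f"LIMIT {limit}"
    )


query = build_subject_heading_query(
    ["SubjectHeadingSensoStricto", "HistoricSingleEventOrEra"], limit=100
)
print(query)
```

The printed query can be pasted into the endpoint's web form or sent with any SPARQL client; restricting the classes keeps the imported dictionary small.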

  11. http://localhost:8080/admin/graphs/
  14. Results
  15. Example result (Turtle):
      <http://de.dbpedia.org/resource/Wilder_Streik_bei_Ford_(1973)>
        <http://purl.org/dc/elements/1.1/subject>
          <http://d-nb.info/gnd/7708211-4> , # Drug-eluting Stent (syn: DES)
          <http://d-nb.info/gnd/4302110-4> , # Ford
          <http://d-nb.info/gnd/4578282-9> , # sich ["self"@en]
          <http://d-nb.info/gnd/4248646-4> , # Spitzel ["spy"@en] (syn: IM)
          <http://d-nb.info/gnd/4389837-3> , # August (month)
          <http://d-nb.info/gnd/4291333-0> , # Niederlage ["defeat"@en]
          <http://d-nb.info/gnd/4002623-1> . # Arbeitnehmer ["employee"@en]
      • The GND dictionary includes articles, prepositions, adjectives…
      • Acronyms ("IM", "DES") -> activate "Case Sensitivity"
      • Not every match is useful in the context ("August", "Defeat")
      http://localhost:8080/graph?name=urn:x-localinstance:/dlc/{yourDataset}/enhance.graph
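
The effect of the "Case Sensitivity" setting on acronyms such as "IM" and "DES" can be illustrated with a small dictionary lookup. This is a minimal sketch and not Fusepool's SMA implementation: the function `match_terms`, the four-entry dictionary, and the sample sentence are made up for illustration; only the GND URIs are taken from the slide.

```python
import re

# Illustrative dictionary: surface form -> GND URI (URIs from the slide).
DICTIONARY = {
    "IM": "http://d-nb.info/gnd/4248646-4",   # synonym of Spitzel
    "DES": "http://d-nb.info/gnd/7708211-4",  # synonym of Drug-eluting Stent
    "Ford": "http://d-nb.info/gnd/4302110-4",
    "Arbeitnehmer": "http://d-nb.info/gnd/4002623-1",
}


def match_terms(text, dictionary, case_sensitive=True):
    """Return the URIs of dictionary entries that occur as words in text."""
    tokens = re.findall(r"\w+", text)
    if case_sensitive:
        lookup = dictionary
    else:
        # Fold both the dictionary keys and the tokens to lower case.
        lookup = {key.lower(): uri for key, uri in dictionary.items()}
        tokens = [token.lower() for token in tokens]
    found = []
    for token in tokens:
        uri = lookup.get(token)
        if uri is not None and uri not in found:
            found.append(uri)
    return found


# Hypothetical input sentence containing the function words "im" and "des".
text = "Die Arbeitnehmer streikten im August bei Ford, des Lohnes wegen."

print(len(match_terms(text, DICTIONARY, case_sensitive=False)))  # 4
print(len(match_terms(text, DICTIONARY, case_sensitive=True)))   # 2
```

Without case sensitivity the preposition "im" and the article "des" are matched to the acronyms "IM" and "DES"; with it, only "Arbeitnehmer" and "Ford" remain.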
  16. Prototype: GND Annotator (Persons, Locations, Topics, Time); manual evaluation only for Topics.
      Labels given to the suggestions: ok, ok, not relevant, false, not relevant, ok, not relevant
      • human (found in GND) = 1
      • SMA GND suggestions = 7
      • SMA correct = 3
      • SMA false = 1
      • precision = 33%
      • recall = 100%
  17. Results (1): Recall: 78%, Precision: 73%
  18. Results (2): Recall: 90%, Precision: 72%
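
The slides do not say exactly how recall and precision were computed. Under the usual set-based definitions (precision = correct suggestions / all suggestions, recall = correct suggestions / all relevant concepts), the calculation could be sketched as follows. The identifiers `id-1` to `id-5` are placeholders, not evaluation data from the talk.

```python
def precision_recall(suggested, relevant):
    """Return (precision, recall) for suggested vs. relevant identifiers."""
    suggested, relevant = set(suggested), set(relevant)
    true_positives = suggested & relevant
    # Guard against empty sets to avoid division by zero.
    precision = len(true_positives) / len(suggested) if suggested else 0.0
    recall = len(true_positives) / len(relevant) if relevant else 0.0
    return precision, recall


# Placeholder data: four machine suggestions, three concepts a human chose.
suggested = {"id-1", "id-2", "id-3", "id-4"}
relevant = {"id-1", "id-2", "id-5"}

precision, recall = precision_recall(suggested, relevant)
print(f"precision = {precision:.0%}, recall = {recall:.0%}")
# prints: precision = 50%, recall = 67%
```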
  19. Fusepool in the wild (1): http://primo.kobv.de/docId=TN_thieme_articles10.1055/s-0029-1237743
      Labels shown: no exact string match, chemical term, geographic, financial, education, too broad
  20. Fusepool in the wild (2): Abstract, Reviews, TOC. ISBN: 9783642371103
      Drawback: the quality of the annotations depends on the text input
  21. Feedback
  22. Why Fusepool?
      1. Ready for the Semantic Web
         • can handle graphs (Clerezza, TDB, …)
         • data I/O using REST
      2. String matching (SMA)
         • import & configuration of dictionaries (e.g. a thesaurus)
         • batch matching & annotation using the Data Life Center (DLC)
      3. Easy to install: builds at http://jenkins.fusepool.info
  23. Conclusion
      • Fusepool: infrastructure to build new services
      • … better linking beyond the aquarium(s)
      • TODO:
        • build tailored interfaces for annotation, search, recommender
        • improve the dictionaries
  24. Thank You! twitter: @jhercher, github: https://github.com/jhercher/, mail: hercher@ub.fu-berlin.de
