Linking Library Data
with Fusepool
Johannes Hercher (Free University Berlin)
June 25, 2014
@jhercher
Fusepool final public workshop!
Brussels, June 25th – Johannes Hercher, Free University Berlin, University Library
Context
I care for metadata Ugh! 

Your OPAC sucks
We cooperate…
How to link library data with the „Oceans“ of the WWW?
German National Library published authority data
Example
a search in subject index (with GND Identifiers)
a search in full text http://primo.fu-berlin.de
• GND = thesaurus for subject indexing in Germany
• Search with GND limited to local resources
• Search beyond the local holdings => easier, more reliable
• Suggest content using semantic relations (GND is a thesaurus!)
You* should use identifiers
*publishers, authors, aggregators
Reality: assigning IDs is time-consuming
Vision: assigning IDs is fun
Questions & Tasks
• Could machines do the subject indexing?
-> Use SMA to enrich DBpedia pages with GND IDs
• Can we support librarians in subject indexing?
-> Build Annotator Prototype: https://github.com/jhercher/LEE/
Demonstrator
AnnotatorApp: filters stop words and displays library entities for your text
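The stop-word filtering step can be sketched roughly as follows. This is a minimal Python illustration and not the LEE code: the `STOP_WORDS` set, the `candidate_terms` function, and the sample sentence are all assumptions made for the example.

```python
import re

# Hypothetical, deliberately tiny German stop-word list; a real list is much longer.
STOP_WORDS = {"der", "die", "das", "und", "bei", "sich", "im", "in"}


def candidate_terms(text):
    """Return the words of `text` that are not stop words, in original order."""
    words = re.findall(r"\w+", text)
    return [w for w in words if w.lower() not in STOP_WORDS]


print(candidate_terms("Der wilde Streik bei Ford im August"))
```

Only the remaining words would then be looked up in the GND dictionary.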
Review concepts and start a search using concept IDs
https://github.com/jhercher/LEE
How to Fusepool
Workflow
1. Select a subset of GND Subject Headings using SPARQL
2. Import Subject Headings
3. Configure SMA dictionary component
4. Import documents (Graph)
5. Batch matching of documents with dictionaries using Fusepool's DLC
6. Review results and build services on top
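Steps 3 to 5 boil down to matching documents against a label dictionary. The sketch below shows that idea in plain Python; it does not use or reproduce Fusepool's SMA or DLC. The function names, the `docs` sample, and the whole-word matching rule are assumptions; the three GND URIs and labels are the ones shown on the results slide.

```python
import re

# Small illustrative dictionary: label -> GND URI (taken from the results slide).
DICTIONARY = {
    "Ford": "http://d-nb.info/gnd/4302110-4",
    "Arbeitnehmer": "http://d-nb.info/gnd/4002623-1",
    "Niederlage": "http://d-nb.info/gnd/4291333-0",
}


def match_document(text, dictionary):
    """Return the sorted GND URIs whose label occurs as a whole word in `text`."""
    tokens = set(re.findall(r"\w+", text))
    return sorted(uri for label, uri in dictionary.items() if label in tokens)


def match_batch(documents, dictionary):
    """Map each document ID to the GND URIs matched in its text."""
    return {doc_id: match_document(text, dictionary) for doc_id, text in documents.items()}


# Hypothetical documents for illustration only.
docs = {
    "doc1": "Die Arbeitnehmer bei Ford streikten.",
    "doc2": "Kein Treffer hier.",
}
print(match_batch(docs, DICTIONARY))
```

This toy matcher handles single-word labels only; multi-word headings would need phrase matching.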
http://zbw.eu/beta/sparql/gnd
http://d-nb.info/standards/elementset/gnd


NomenclatureInBiologyOrChemistry
SubjectHeadingSensoStricto
ProductNameOrBrandName
HistoricSingleEventOrEra
EthnographicName
GroupOfPersons
SubjectHeading
Language
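A query selecting concepts of the classes listed above might be assembled like this. It is a sketch under assumptions, not the query used in the talk: the `#` suffix on the namespace, the label property `gndo:preferredNameForTheSubjectHeading`, and the `build_query` helper are assumed, and the code only builds the query string without sending it to the endpoint.

```python
# Assumed namespace of the GND element set (the slide gives the URL without '#').
GNDO = "http://d-nb.info/standards/elementset/gnd#"

# Class names as listed on the slide.
CLASSES = [
    "NomenclatureInBiologyOrChemistry",
    "SubjectHeadingSensoStricto",
    "ProductNameOrBrandName",
    "HistoricSingleEventOrEra",
    "EthnographicName",
    "GroupOfPersons",
    "SubjectHeading",
    "Language",
]


def build_query(classes, limit=100):
    """Build a SPARQL 1.1 SELECT query for concepts typed with any of `classes`."""
    values = " ".join(f"gndo:{name}" for name in classes)
    return (
        f"PREFIX gndo: <{GNDO}>\n"
        "SELECT ?concept ?label WHERE {\n"
        f"  VALUES ?type {{ {values} }}\n"
        "  ?concept a ?type ;\n"
        # Assumed label property; adjust to the properties present in the data.
        "           gndo:preferredNameForTheSubjectHeading ?label .\n"
        "}\n"
        f"LIMIT {limit}"
    )


print(build_query(CLASSES, limit=10))
```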

http://localhost:8080/admin/graphs/
Results
<http://de.dbpedia.org/resource/Wilder_Streik_bei_Ford_(1973)>
    <http://purl.org/dc/elements/1.1/subject>
        <http://d-nb.info/gnd/7708211-4> , # Drug-eluting Stent (syn: DES)
        <http://d-nb.info/gnd/4302110-4> , # Ford
        <http://d-nb.info/gnd/4578282-9> , # sich [„self“@en]
        <http://d-nb.info/gnd/4248646-4> , # Spitzel [„spy“@en] (syn: IM)
        <http://d-nb.info/gnd/4389837-3> , # August (month)
        <http://d-nb.info/gnd/4291333-0> , # Niederlage [„defeat“@en]
        <http://d-nb.info/gnd/4002623-1> . # Arbeitnehmer [„employee“@en]
• GND Dictionary includes: articles, prepositions, adjectives…
• Acronyms („IM, DES“) -> activate „Case Sensitivity“
• Not every match is useful in the context („August, Defeat“)
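The effect of case sensitivity on acronyms can be illustrated with a small comparison function. This is a sketch of the idea only, not Fusepool's SMA option; the `matches` function and its `case_sensitive_acronyms` parameter are hypothetical names.

```python
def matches(label, token, case_sensitive_acronyms=True):
    """Compare a dictionary label with a token from the text.

    All-uppercase labels (acronyms such as "IM" or "DES") must match exactly
    when `case_sensitive_acronyms` is on; other labels match case-insensitively.
    """
    if case_sensitive_acronyms and label.isupper():
        return label == token
    return label.lower() == token.lower()


print(matches("IM", "im"))                                 # preposition "im" vs. acronym "IM"
print(matches("IM", "im", case_sensitive_acronyms=False))
print(matches("Spitzel", "spitzel"))
```

With the option on, the German preposition "im" no longer triggers the acronym "IM". Matches that are correct strings but irrelevant in context (such as "August") are not addressed by this check.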
http://localhost:8080/graph?name=urn:x-localinstance:/dlc/{yourDataset}/enhance.graph
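Filling in the dataset placeholder can be done with a one-line helper. The sketch below follows the URL pattern from the slide; the function name `enhance_graph_url` and the dataset name `myDataset` are hypothetical, and no percent-encoding is applied because the slide shows the URL in plain form.

```python
def enhance_graph_url(dataset, base="http://localhost:8080"):
    """Build the URL of the enhanced graph for one DLC dataset."""
    return f"{base}/graph?name=urn:x-localinstance:/dlc/{dataset}/enhance.graph"


print(enhance_graph_url("myDataset"))
```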
human (found in GND) = 1
SMA GND suggestions = 7
SMA correct = 3
precision = 33%
recall = 100%
SMA false = 1
Prototype: GND Annotator
Persons Locations Topics Time
Manual evaluation only for Topics
ok
ok
not relevant
false
not relevant
ok
not relevant
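For reference, precision and recall in their usual set-based form can be computed as below. The slides do not spell out which counts feed their percentages, so this sketch uses hypothetical identifier sets rather than the figures above; `precision_recall` and the `gnd:A` style IDs are made up for illustration.

```python
def precision_recall(suggested, reference):
    """Return (precision, recall) of `suggested` against the human `reference` set."""
    hits = suggested & reference
    precision = len(hits) / len(suggested) if suggested else 0.0
    recall = len(hits) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical identifiers, not taken from the evaluation.
reference = {"gnd:A", "gnd:B", "gnd:C"}           # assigned by a human indexer
suggested = {"gnd:B", "gnd:C", "gnd:D", "gnd:E"}  # proposed by the matcher

p, r = precision_recall(suggested, reference)
print(f"precision={p:.2f} recall={r:.2f}")
```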
Results (1)
Recall: 78%
Precision: 73%
Results (2)
Recall: 90%
Precision: 72%
http://primo.kobv.de/docId=TN_thieme_articles10.1055/s-0029-1237743
Fusepool in the wild (1)
no exact string match
chemical term
geographic
financial
education
too broad
Fusepool in the wild (2)
Abstract
Reviews
TOC
ISBN: 9783642371103
Drawback: quality of annotations depends on text input
Feedback
Why Fusepool?
1. Ready for the Semantic Web
• Can handle graphs (Clerezza, TDB, …)
• Data I/O using REST
2. String matching (SMA)
• Import & configuration of dictionaries (e.g. a thesaurus)
• Batch matching & annotation using Data Life Center (DLC)
3. Easy to install: builds at http://jenkins.fusepool.info
Conclusion
• Fusepool: Infrastructure to build new services
• … better linking beyond the aquarium(s)
• TODO:
• build tailored interfaces for annotation, search, recommender
• improve the dictionaries
Thank You!
twitter: @jhercher
github: https://github.com/jhercher/
mail: hercher@ub.fu-berlin.de
