What can Linked Data do for Digital
Libraries?
Sören Auer
What is Linked Data?
Creating Knowledge
out of Interlinked Data
Problem: Try to search for these things on the current Web:
• Apartments near German-English bilingual childcare in Berlin
• ERP service providers with offices in Vienna and London
• Researchers working on Digital Library topics in Eastern Europe
Information is available on the Web, but opaque to current search.
Why do we need the Data Web?
passau.de: has everything about childcare in Passau.
Immobilienscout.de: knows all about real estate offers in Germany.
[Figure: DB and Web server for each site, a search engine consuming their HTML, and RDF published by each server]
Solution: complement the text on Web pages with structured Linked
Open Data, and intelligently combine, integrate and join such
structured information from different sources.
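A minimal sketch of what joining structured information from two sources could look like once both publish triples. Everything below is an assumption made for illustration: the `ex:` identifiers, the predicate names and the data are invented and not taken from passau.de or Immobilienscout.de, and "near" is approximated as "in the same district".

```python
# Hypothetical triples, as two independent sources might publish them.
# All names and values are invented for illustration.
childcare_source = [
    ("ex:KitaSonnenschein", "rdf:type", "ex:Childcare"),
    ("ex:KitaSonnenschein", "ex:language", "de-en"),
    ("ex:KitaSonnenschein", "ex:district", "ex:Innstadt"),
    ("ex:KitaRegenbogen", "rdf:type", "ex:Childcare"),
    ("ex:KitaRegenbogen", "ex:language", "de"),
    ("ex:KitaRegenbogen", "ex:district", "ex:Altstadt"),
]
real_estate_source = [
    ("ex:Apartment1", "rdf:type", "ex:Apartment"),
    ("ex:Apartment1", "ex:district", "ex:Innstadt"),
    ("ex:Apartment2", "rdf:type", "ex:Apartment"),
    ("ex:Apartment2", "ex:district", "ex:Altstadt"),
]


def subjects(triples, predicate, obj):
    """Return all subjects that carry the given predicate/object pair."""
    return {s for s, p, o in triples if p == predicate and o == obj}


def objects(triples, subject, predicate):
    """Return all objects stated for the given subject/predicate pair."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def apartments_near_bilingual_childcare(childcare, real_estate):
    """Join both sources on the shared district resource."""
    bilingual = (subjects(childcare, "rdf:type", "ex:Childcare")
                 & subjects(childcare, "ex:language", "de-en"))
    districts = set()
    for kita in bilingual:
        districts |= objects(childcare, kita, "ex:district")
    apartments = subjects(real_estate, "rdf:type", "ex:Apartment")
    return sorted(a for a in apartments
                  if objects(real_estate, a, "ex:district") & districts)


print(apartments_near_bilingual_childcare(childcare_source, real_estate_source))
```

The join only works because both sources use the same identifier for the district; agreeing on such shared identifiers is what interlinking provides.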
Linked Data in a Nutshell
1. Uses the RDF data model
[Figure: RDF graph. Uni Malta -organizes-> TPDL2013; TPDL2013 -starts-> 24.9.2013; TPDL2013 -takesPlaceIn-> Valletta]
2. Is serialised in triples (subject, predicate, object):
Uni_Malta organizes TPDL2013 .
TPDL2013 starts "2013-09-24"^^xsd:date .
TPDL2013 takesPlaceIn Valletta .
3. Uses content negotiation
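A small sketch of points 2 and 3 together: the same triples are served as Turtle or as HTML depending on the HTTP Accept header. The `ex:` namespace, the function names and the simplified header handling (no q-values, no wildcards) are assumptions made for this illustration, not part of the talk.

```python
from html import escape

# Namespace choice is an assumption; the slide uses bare names.
PREFIXES = (
    "@prefix ex: <http://example.org/> .\n"
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n"
)

# (subject, predicate, object) triples about the conference.
TRIPLES = [
    ("ex:Uni_Malta", "ex:organizes", "ex:TPDL2013"),
    ("ex:TPDL2013", "ex:starts", '"2013-09-24"^^xsd:date'),
    ("ex:TPDL2013", "ex:takesPlaceIn", "ex:Valletta"),
]


def to_turtle(triples):
    """Serialise triples as Turtle, one statement per line."""
    return PREFIXES + "\n".join(f"{s} {p} {o} ." for s, p, o in triples) + "\n"


def to_html(triples):
    """Render the same triples as an HTML table for human readers."""
    rows = "".join(
        f"<tr><td>{escape(s)}</td><td>{escape(p)}</td><td>{escape(o)}</td></tr>"
        for s, p, o in triples
    )
    return f"<table>{rows}</table>"


RENDERERS = {"text/turtle": to_turtle, "text/html": to_html}


def negotiate(accept_header, default="text/html"):
    """Return the first supported media type listed in the Accept header.

    Deliberately simplified: q-values and wildcards are ignored.
    """
    for part in accept_header.split(","):
        media_type = part.split(";")[0].strip().lower()
        if media_type in RENDERERS:
            return media_type
    return default


def respond(accept_header):
    """Pick a representation and render the triples in it."""
    media_type = negotiate(accept_header)
    return media_type, RENDERERS[media_type](TRIPLES)


content_type, body = respond("text/turtle, text/html;q=0.8")
print(content_type)
print(body)
```

A browser asking for `text/html` and a Linked Data client asking for `text/turtle` can thus dereference the same URI and each receive a suitable representation.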
The emerging Web of Data
[Figure: Linking Open Data cloud diagrams, 2007 to 2010, by Richard Cyganiak and Anja Jentzsch.]
Linked Data Lifecycle
• Extraction
• Storage / Querying
• Manual revision / authoring
• Interlinking / Fusing
• Classification / Enrichment
• Quality Analysis
• Evolution / Repair
• Search / Browsing / Exploration
Extraction
Storage and Querying
Authoring
© CC-BY-NC-ND by ~Dezz~ (residae on flickr)
Linking
Enrichment
Quality Analysis
CC BY SA Wikipedia
Evolution
© CC-BY-SA by alasis on flickr
Exploration
LOD2 Stack Components
[Figure: components of the LOD2 Stack arranged around the Linked Data Lifecycle stages]
Components shown: Virtuoso RDF Store (conductor, sparql, isparql, ods), Virtuoso sponger, lod2webapi, RDFAuthor, Limes, ORE, D2R, Semantic Spatial Browser, sparqlproxy, SigmaEE, OntoWiki, LOD open refine, Silk-latc, stanbol, valiant, DBpedia Spotlight, SPARQLed, PoolParty, Sieve, PoolParty Extractor, dl-learner, CubeViz, LOD2 demonstrator, R2R, rdf-dataset-integration, Silk, SIREn, Sparqlify, CSVImport, DBpedia Spotlight UI, CKAN (source), Mondeca Sparql Endpoint Status, Sindice (source), Web Linkage Validator, DBpedia (source), lodms, LOD2 stat. Workbench, LOD2 Authentication Component, Data Cube Validation Tool, Data Cube Merging Tool, Data Cube Slicing Tool, LOD2 Provenance Component.
What is a Digital Library?
In a library I can …
• Search for stuff… (photo © dfulmer)
• Look at stuff… (photo © ~Brenda-Starr~)
In a Digital Library I can …
• Search for stuff…
• Look at stuff…
Digital Libraries
Two definitions:
• Online access to digitized/digital artefacts (articles, books, manuscripts, photographs, …)
• Digital knowledge hubs: new ways of sharing knowledge online
Hypothesis
In our digital world, with completely new technology (internet, crowd-sourcing, linked data) and devices (ultrabooks, smart TVs/phones, tablets), it is not sufficient to just "digitize" the concept of a library.
We must re-invent the digital library as a place for knowledge sharing on the Web.
How can we re-invent the Library online?
We need new ingredients
© dhaun
Think about new types of artefacts
• Thesauri, ontologies / knowledge bases
• Courseware / learning objects
• Data / knowledge assets
• Semantic descriptions of the content of publications
• …
Think about new types of collaboration and interaction
• Crowd-sourcing
• Social networking
• Serious games
• …
Think about new technologies
• Semantic Web / Linked Data
• Wikis
• Mashups
• Mobile Apps
• …
Mixing & mashing the ingredients
Artefacts
• Thesauri, ontologies / knowledge bases
• Courseware / learning objects
• Knowledge assets
• Semantic descriptions of publications
• …
Collaboration & interaction
• Crowd-sourcing
• Social networking
• Serious games
• …
New technologies
• Semantic Web / Linked Data
• Wikis
• Mashups
• Mobile Apps
• …
Photo © sylvar
Are these digital libraries?
1. OntoWiki – a semantic data wiki
2. Cortex – a semantic digital library search backend
3. SlideWiki – a platform for crowd-sourcing multilingual OpenCourseWare
4. SemanticPapers – capturing the meaning of scientific publications
OntoWiki – a semantic data wiki
Two Kinds of Semantic Wikis
1. Semantic (Text) Wikis
• Authoring of semantically annotated texts
2. Semantic Data Wikis
• Direct authoring of structured information (i.e. RDF, RDF-Schema, OWL)
OntoWiki: dynamic views on knowledge bases
OntoWiki for the Catalogus Professorum Lipsiensis
• RDF triples on the resource details page
• Dynamic suggestions from the Data Web
• CPM ontology
Cortex – a semantic digital library
search backend
Cortex – flexible and future-proof architecture
[Figure: Cortex architecture. Labels: Import Manager, Search Manager, License Manager, Availability Manager, REST API, ARCHIV, INDEX, BMS, DICT, DS, CMS / User management, Presentation, Import]
CORTEX Performance
Metric | Description | Performance
Queries per second (qps) | Number of search requests that can be processed per second | 2000
Search response time | Maximum response time (up to 100,000,000 objects and 2000 qps) | < 100 ms
Number of objects | Number of objects (resources) for which CORTEX was developed and tested | 100,000,000
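As a rough reading aid for these figures, Little's law (average number of requests in the system = arrival rate × time in the system) gives an upper bound on how many searches are being processed at once. This is a back-of-the-envelope sketch, not a measurement; it assumes the stated maximum response time applies to every request.

```python
# Figures taken from the CORTEX performance table.
QUERIES_PER_SECOND = 2000
MAX_RESPONSE_TIME_MS = 100  # "< 100 ms", used here as an upper bound


def max_requests_in_flight(arrival_rate_per_s, response_time_ms):
    """Little's law: requests in flight = arrival rate * time in system."""
    return arrival_rate_per_s * response_time_ms / 1000


# Upper bound on concurrently processed searches at full load.
print(max_requests_in_flight(QUERIES_PER_SECOND, MAX_RESPONSE_TIME_MS))  # 200.0
```

Under these assumptions the backend never has to handle more than about 200 searches concurrently, even at the full 2000 qps.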
Try it out – www.ddb.de
Jochen Schon
SlideWiki – a platform for crowd-sourcing
multilingual OpenCourseWare
How is SlideWiki different?
There are a number of online tools for presentations, such as Google Docs Presentations, Prezi and SlideShare. SlideWiki differs from these in its focus on:
• E-learning: you can add questions to slides and thus compose comprehensive self-assessment tests for learners
• Collaboration: SlideWiki aims to empower whole communities to create presentations collaboratively
• Translation: SlideWiki content can easily be translated into more than 50 languages
No other tool provides this combination, so SlideWiki offers a unique feature set.
Semantic Publications
Researchers spend a lot of time
• encoding information in text
• decoding information from text
Can we make this more efficient?
Vision of scientific publishing
Researchers publish their findings in structured form (e.g. encoded in an RDF knowledge base).
This would enormously simplify:
• Finding related work
• Creating a survey
• Assessing a contribution
• …
Semantically describing the content of scientific publications
limes-paper describes appr123
appr123 a approach
appr123 for Link_Discovery
appr123 hasProp lossless
...
limes-paper describes impl123
impl123 a implementation
impl123 implements appr123
impl123 language Java
...
limes-paper describes eval123
eval123 a evaluation
eval123 evaluates impl123
eval123 uses DBpedia
...
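To illustrate how such a description could simplify finding related work, the sketch below loads the triples from the slide into a plain Python list and asks which papers describe an approach for a given task whose implementation was evaluated on a given dataset. The helper names `match` and `papers_evaluated_on` are hypothetical, the elided triples are left out, and a real system would more likely use an RDF store with SPARQL.

```python
# Triples from the slide as (subject, predicate, object).
# "a" is read as rdf:type, as in Turtle; elided triples ("...") are omitted.
PAPER_KB = [
    ("limes-paper", "describes", "appr123"),
    ("appr123", "a", "approach"),
    ("appr123", "for", "Link_Discovery"),
    ("appr123", "hasProp", "lossless"),
    ("limes-paper", "describes", "impl123"),
    ("impl123", "a", "implementation"),
    ("impl123", "implements", "appr123"),
    ("impl123", "language", "Java"),
    ("limes-paper", "describes", "eval123"),
    ("eval123", "a", "evaluation"),
    ("eval123", "evaluates", "impl123"),
    ("eval123", "uses", "DBpedia"),
]


def match(kb, s=None, p=None, o=None):
    """Yield triples matching the pattern; None acts as a wildcard."""
    for triple in kb:
        if all(want is None or want == got
               for want, got in zip((s, p, o), triple)):
            yield triple


def papers_evaluated_on(kb, task, dataset):
    """Papers describing an approach for `task` that was evaluated using `dataset`."""
    papers = set()
    for approach, _, _ in match(kb, p="for", o=task):
        for impl, _, _ in match(kb, p="implements", o=approach):
            for evaluation, _, _ in match(kb, p="evaluates", o=impl):
                if any(match(kb, s=evaluation, p="uses", o=dataset)):
                    for paper, _, _ in match(kb, p="describes", o=approach):
                        papers.add(paper)
    return sorted(papers)


print(papers_evaluated_on(PAPER_KB, "Link_Discovery", "DBpedia"))
```

With many papers described this way, the same query would return a ready-made list of related work for a survey.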
Digital Libraries in the Linked Data Lifecycle
• Extract and publish structured (meta-)data for library content
• Provide storage facilities for Linked Data
• Support facilities for knowledge-based authoring & collaboration
• Become linking hubs for the Data Web
• Library data is valuable background knowledge for KB enrichment & repair
• Authoritative Linked Data for quality assessment
• Hosting & maintenance of exploration tools
• Be the "lighthouse" for the LOD ocean
Wrap-up
• Digital libraries must support new types of structured artefacts, interaction & collaboration paradigms, and technologies.
• The Linked Data paradigm helps to connect knowledge from distributed heterogeneous sources.
EU-FP7 LOD2 Project Overview . http://lod2.eu
AKSW Team
The LOD2 Gang
Thanks for your attention!
Sören Auer
http://www.iai.uni-bonn.de/~auer | http://aksw.org | http://lod2.org
auer@cs.uni-bonn.de


Editor's Notes

  • #7 Storage. RDF data management is still more challenging than relational data management. We aim to close this performance gap by employing column-store technology, dynamic query optimization, adaptive caching of joins, optimized graph processing and cluster/cloud scalability.
    Authoring. LOD2 facilitates the authoring of rich semantic knowledge bases by leveraging Semantic Wiki technology, the WYSIWYM paradigm (What You See Is What You Mean) and distributed social, semantic collaboration and networking techniques.
    Interlinking. Creating and maintaining links in a (semi-)automated fashion is still a major challenge and is crucial for establishing coherence and facilitating data integration. We aim at linking approaches that yield high precision and recall and that configure themselves automatically or based on end-user feedback.
    Classification. Linked Data on the Web is mainly raw instance data. For data integration, fusion, search and many other applications, however, this raw instance data needs to be linked and integrated with upper-level ontologies.
    Quality. The quality of content on the Data Web varies, just as it does on the document Web. LOD2 develops techniques that help to assess quality based on characteristics such as provenance, context, coverage or structure.
    Evolution/Repair. Data on the Web is dynamic. We need to facilitate the evolution of data while keeping things stable. Changes and modifications to knowledge bases, vocabularies and ontologies should be transparent and observable. LOD2 also develops methods to spot problems in knowledge bases and to automatically suggest repair strategies.
    Search/Browsing/Exploration. For many users, the Data Web is still invisible below the surface. LOD2 develops search, browsing, exploration and visualization techniques for different kinds of Linked Data (i.e. spatial, temporal, statistical), which make the Data Web tangible for real users.
  • #11 http://www.flickr.com/photos/residae/2560241604/#/
  • #14 http://www.flickr.com/photos/alasis/3541341601/sizes/l/in/photostream/
  • #38 In the first part of the technical presentation, I want to focus in particular on the backend part of the architecture, consisting of the import and discovery systems.
  • #39 The performance figures listed were measured on the reference architecture of the operator of the Deutsche Digitale Bibliothek (German Digital Library).
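
The interlinking note (#7) names precision and recall as the targets for link discovery. A minimal sketch of how the two measures could be computed for a set of discovered links against a manually curated reference set; all identifiers below are invented for illustration and do not come from any LOD2 dataset.

```python
def precision_recall(found_links, reference_links):
    """Compare discovered links with a curated reference set.

    precision = correct links / all discovered links
    recall    = correct links / all reference links
    """
    found, reference = set(found_links), set(reference_links)
    correct = found & reference
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical sameAs links between a library catalogue and another dataset.
reference = {
    ("lib:person/1", "other:Person_A"),
    ("lib:person/2", "other:Person_B"),
    ("lib:person/3", "other:Person_C"),
    ("lib:person/4", "other:Person_D"),
    ("lib:person/5", "other:Person_E"),
    ("lib:person/6", "other:Person_F"),
}
found = {
    ("lib:person/1", "other:Person_A"),
    ("lib:person/2", "other:Person_B"),
    ("lib:person/3", "other:Person_C"),
    ("lib:person/4", "other:Person_X"),  # wrong target, counts against precision
}

print(precision_recall(found, reference))  # (0.75, 0.5)
```

Three of the four discovered links are correct (precision 0.75), but only three of the six reference links were found (recall 0.5).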