Linked (Open) Data
Opportunities and challenges
Makx Dekkers
mail@makxdekkers.com
Outline
• Basic notions
• Recent developments
• Comparing objectives
• Opportunities and risks
• Conclusions
© 2011 Makx Dekkers, Journées ABES 2011
BASIC NOTIONS
The idea and its history
• 1989: Tim Berners-Lee already talked about
linking documents and data together
(http://www.w3.org/History/1989/proposal.html)
• 2001: Tim Berners-Lee and Ora Lassila
introduced the “Semantic Web”
(http://www.scientificamerican.com/article.cfm?id=the-semantic-web)
• 2006: Tim Berners-Lee presented the initial
design issues (rules) for Linked Data
(http://www.w3.org/DesignIssues/LinkedData.html)
W3C Semantic Web initiative
• Objective
– to create a universal medium for the exchange of
data […] to smoothly interconnect personal
information management, enterprise application
integration, and the global sharing of commercial,
scientific and cultural data
• Main results
– Resource Description Framework (RDF), RDFa
(RDF-in-HTML), SPARQL Query Language
Core Linked Data Specifications
• Transport
– HTTP Hypertext Transfer Protocol
• Identification
– URI Uniform Resource Identifier
• Description and linking
– RDF Resource Description Framework
• Search and access
– SPARQL Query Language for RDF
The four rules of Linked Data
• Tim Berners-Lee’s recommendations:
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those
names
3. When someone looks up a URI, provide useful
information, using the standards (RDF*, SPARQL)
4. Include links to other URIs so that they can
discover more things
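Rules 2 and 3 can be illustrated with a minimal Python sketch of how a client might prepare the lookup of an HTTP URI while asking for RDF rather than HTML. The DBpedia resource URI and the choice of Turtle as the requested format are assumptions made for illustration; the request is only built here, not sent.

```python
from urllib.request import Request

def build_lookup(uri: str, media_type: str = "text/turtle") -> Request:
    """Prepare an HTTP GET for a URI, asking the server for an RDF serialization."""
    # The Accept header states the preferred format (HTTP content negotiation).
    return Request(uri, headers={"Accept": media_type})

# Assumed example URI; any HTTP URI that names a "thing" would do.
request = build_lookup("http://dbpedia.org/resource/Montpellier")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Whether a given server honours the Accept header depends on that server; the sketch only shows the client side of the idea.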
The basic model of RDF
• Resource Description Framework “triple”:
– Subject: the “thing” (resource) described
– Predicate: the characteristic of the resource
– Object: the value of the characteristic
[Subject] --Predicate--> [Object]
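As a rough illustration of the triple model, the following Python sketch represents one statement and writes it out as a line in the N-Triples syntax. The subject URI under example.org is hypothetical, and the use of the Dublin Core "creator" element as predicate is an assumption chosen for the example.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str    # URI of the resource being described
    predicate: str  # URI of the characteristic
    obj: str        # value of the characteristic (a plain literal here)

def to_ntriples(t: Triple) -> str:
    """Serialize one triple with a literal object as an N-Triples line."""
    escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{t.subject}> <{t.predicate}> "{escaped}" .'

# Hypothetical subject URI; the predicate is the Dublin Core "creator" element.
t = Triple("http://example.org/presentation",
           "http://purl.org/dc/elements/1.1/creator",
           "Makx Dekkers")
print(to_ntriples(t))
```

The object could equally be another URI, which is what turns isolated statements into linked data.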
Complex structures in RDF
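• Diagram: a graph of connected triples
– This presentation --presenter--> Makx Dekkers
– This presentation --partOf--> Journées ABES
– Makx Dekkers --hometown--> Barcelona
– Journées ABES --organizer--> ABES
– Journées ABES --location--> Montpellier
– Journées ABES --date--> 17-18 May 2011
– ABES --location--> Montpellier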
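The graph on this slide can be sketched in Python as a list of triples, with a small pattern matcher and a "follow the links" helper. The pairing of edge labels to nodes is one plausible reading of the diagram and should be treated as an assumption; plain strings stand in for URIs, and the matcher is a toy illustration, not SPARQL.

```python
# Each statement is a (subject, predicate, object) tuple.
# Assumption: this edge-to-node pairing is one reading of the slide's diagram.
TRIPLES = [
    ("This presentation", "presenter", "Makx Dekkers"),
    ("This presentation", "partOf", "Journées ABES"),
    ("Makx Dekkers", "hometown", "Barcelona"),
    ("Journées ABES", "organizer", "ABES"),
    ("Journées ABES", "location", "Montpellier"),
    ("Journées ABES", "date", "17-18 May 2011"),
    ("ABES", "location", "Montpellier"),
]

def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

def follow(triples, start, *predicates):
    """Walk a chain of predicates from a start node and return the end nodes."""
    nodes = {start}
    for p in predicates:
        nodes = {t[2] for n in nodes for t in match(triples, s=n, p=p)}
    return nodes

# Where does the event that this presentation is part of take place?
where = follow(TRIPLES, "This presentation", "partOf", "location")
# Which things are located in Montpellier?
in_montpellier = sorted(t[0] for t in match(TRIPLES, p="location", o="Montpellier"))
print(where)
print(in_montpellier)
```

Because the object of one triple is the subject of another, a question about the presentation can be answered from statements made about the conference.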
Linked (Open / Enterprise) Data
• Commonalities
– Using Semantic Web technologies (RDF)
– Linking information resources, people, places
• Differences
– Open Data with open licenses; Enterprise Data
mostly for closed, controlled environments
– Open Data links to other Open Data and is available for
external use; Enterprise Data may link to external
data but is not openly available for external use
Linked Data -- Open Data
• Linked Data: focus on technology
– Semantic Web: Resource Description Framework,
and other Web standards
– Final solutions still under development
• Open Data: focus on strategy
– Based on notion that sharing is important and
benefits all
– Technology is secondary
The five-star system
Source: http://inkdroid.org/journal/2010/06/04/the-5-stars-of-open-linked-data/
The LOD diagram: 2007 to 2010
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
• 2007: 25 datasets
• 2008: 45 datasets
• 2009: 95 datasets
• 2010: 203 datasets
RECENT DEVELOPMENTS
W3C communities
• LinkingOpenData SWEO Community Project
– Goal: to extend the Web with a data commons by
publishing various open data sets as RDF on the
Web and by setting RDF links between data items
from different data sources
(http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData)
• Library Linked Data Incubator Group
– to help increase global interoperability of library
data on the Web (http://www.w3.org/2005/Incubator/lld/)
More W3C communities
• Government Linked Data Working Group
– to provide standards and other information which
help governments around the world publish their
data as effective and usable linked data
(http://www.w3.org/2011/gld/charter)
• Semantic Web Health Care and Life Sciences
(HCLS) Interest Group
– to develop, advocate for, and support the use of
Semantic Web technologies for health care and
life science (e.g. biology, medicine)
(http://www.w3.org/2001/sw/hcls/)
Open Knowledge Foundation, okfn.org
• not-for-profit organization promoting open
knowledge: any kind of data and content that
can be freely used, reused, and redistributed
• Working and Interest Groups, e.g.
– Open Data in Science, Open Government Data,
Open Bibliographic Data, Cultural Heritage etc.
• CKAN.net: registry of open datasets and other
“knowledge resources”
Linked Data initiatives
• Predicate vocabularies (descriptors)
– Resource Description and Access (RDA): http://metadataregistry.org/rdabrowse.htm
– The Bibliographic Ontology (BIBO): http://bibliontology.com/
– Dublin Core: http://dublincore.org/
• Object vocabularies (values)
– Virtual International Authority File (VIAF): http://viaf.org/
– Library of Congress authorities: http://id.loc.gov/authorities/
– AGROVOC (agricultural terminology), e.g. http://aims.fao.org/aos/agrovoc/c_550
– DBpedia (based on Wikipedia), e.g. http://dbpedia.org/page/Montpellier
• Bibliographic data
– LIBRIS Sweden, e.g. http://libris.kb.se/library/S
– British Library: http://www.bl.uk/bibliographic/datasamples.html
– CrossRef (DOI metadata): http://www.crossref.org/CrossTech/linked_data/
More Linked Data initiatives
• Broadcasting, publishing
– BBC: http://www.bbc.co.uk/blogs/bbcinternet/linked_data/
– New York Times: http://data.nytimes.com/
• Governments (small sample)
– USA: http://data.gov/
– France (Paris): http://opendata.paris.fr/
– Finland: http://data.suomi.fi/
– UK: http://data.gov.uk/
– Spain (Cataluña): http://dadesobertes.gencat.cat/
– Norway: http://data.norge.no
– Netherlands: http://www.overheid.nl/opendata
– Australia: http://data.gov.au/
COMPARING OBJECTIVES
Strategic aspects: Linked Data
• Achieving global interoperability with minimal
coordination
• Aggregating human knowledge
• Supporting democracy, transparency and
accountability
• Enhancing and enriching information
• Enabling user-driven and user-generated
applications
Strategic aspects: libraries
• Organizing information for use by specific
users for specific goals
• Ensuring and maintaining quality
• Sustaining services economically
• Preserving information for the long term
• Providing trusted services
Functional aspects: Linked Data
• Searching distributed collections
• “Following your nose” – navigating links
between pieces of content
• Distributing responsibility for making
statements about things
• Leaving to the user whom and what to trust
• Leaving development of products and services
to an open market (apps)
Functional aspects: libraries
• Describing information by professionals
• Bringing together and managing aggregations
of information
• Selecting relevant information
• Mixing analogue and digital resources
Technical aspects: Linked Data
• Publishing and using machine-readable
statements (“data that speak for themselves”)
• Focusing on Semantic Web technology
• Enabling inferences across large distributed
data sets
• (Still to be done) Solving issues around
harvesting, caching and real-time updating
Technical aspects: libraries
• Using proven technology to provide
high-quality services
• Managing production systems and services
• Guaranteeing performance, uptime,
consistency across data
Agility versus sustainability
• In the Linked Data space:
– Things move fast
– Trial-and-error
– Lots of development by volunteers (hackers)
• In the library domain:
– Operational systems need to evolve
– Need to handle legacy data
– Development by professionals in managed
projects
Data versus services
• In the Linked Data space:
– Focus on availability of “raw data”
– Quality is secondary
– Data and technology should lead to useful results
• In the library domain:
– Focus on services
– Quality is essential
– Data and technology in support of the service
Economic aspects
• In the Linked Data space:
– “Information wants to be free” – a human right?
– Short-term thinking: today is hot, yesterday is not
– Focus on applications to create value out of data
• In the library domain:
– Long-term view: sustainability is crucial
– Public money to provide community services
– Expected to do more with less money
OPPORTUNITIES AND RISKS
Strong points: Linked Data
• Attempt to create a common technical
platform for machine-readable data
• Lots of enthusiasm for publishing open data
• Promise of global interoperability
• Mix of researchers, user communities,
hackers, professional data providers
• High visibility at the political level
Risks: Linked Data
• Driven by technology, not by requirements
• Technology may not (yet) be stable – RDF 2.0?
• Operational issues far from solved (reliability,
performance, quality, security, trust)
• Hope for general agreement across domains
may not be realistic
• Promise may turn into disappointment
Strong points: libraries
• Long-standing operational experience in managing
information
• Professional intermediaries between users
and the information they need
• Sustainable business models (albeit with
eternally shrinking budgets)
• Long-term perspective: the past (legacy data)
as well as the future (preservation)
Risks: libraries
• Technologies change rapidly
• New skills difficult to spread through the
organization
• Some people see libraries as a thing of the
past (“the book museum”)
• Underestimation of information handling skills
• Information overload, human intervention
does not scale, need for better tools
Meeting both worlds
• An example: Europeana.eu
– Started out with domain perspectives (libraries,
archives, museums, audiovisual archives)
– “Traditional” approach (metadata mappings)
works but is insufficient
– Using Linked Data approach preserves domain
specifics but allows for generalization to support
common services
– Cross-domain (but co-ordinated) interoperability
Europeana Data Model
[Figures: classes and properties; a simple example; a complex example]
Source: http://version1.europeana.eu/web/europeana-project/technicaldocuments/
CONCLUSION
Libraries and Linked Data
• Using Linked Data technology as the next step
in connecting services
• Offering information management skills to the
technology domain
• Creating a quality hub in the Linked Data
space
Best of both worlds
• Libraries providing stability and sustainability
to Linked Data spaces
• Library professionals helping to manage the
distributed collections
• Libraries delivering high-quality linked data to
the Web
• Technologists providing the next generation
of systems and tools
Linked (Open) Data:
opportunity for libraries!
Thank you!
Makx Dekkers
mail@makxdekkers.com

Jabes 2011 - Opening keynote "Linked Open Data: opportunities and challenges"
