SlideShare a Scribd company logo
1 of 52
Download to read offline


Linked Data -
Evolving the Web into a Global Dataspace
Anja Jentzsch - @anjeve	

Hasso Plattner Institute, Potsdam, Germany	

!
!
!
Open Data Lecture, HTW Berlin	

2015/01/12
Architecture of the classic Web
B C
HTMLHTML
Web 

Browsers
Search 

Engines
hyper-

links
A
HTML
• Single global document space	

• Small set of simple standards	

• HTML as document format	

• HTTP URLs as globally unique IDs	

!
• Retrieval mechanism: Hyperlinks to connect
everything
Web 2.0 APIs and Mashups
No single global dataspace	

!
Shortcomings	

1. APIs have proprietary interfaces	

2. Mashups are based on a fixed set of
data sources	

3. No hyperlinks between data items
within different APIs
Web

API
A
Mashup
Web

API
B
Web

API
C
Web

API
D
Web APIs slice the Web into Walled Gardens
Image: Bob Jagensdorf, http://flickr.com/photos/darwinbell/, CC-BY
Extend the Web with a single global dataspace	

1. by using RDF to publish structured data on the Web	

2. by setting links between data items within different data sources
Linked Data
B C
RDF
RDF

Links
A D E
RDF

Links
RDF

Links
RDF

Links
RDF
RDF
RDF
RDF
RDF RDF
RDF
RDF
RDF
Linked Data Principles
Set of best practices for publishing structured data on the Web in
accordance with the general architecture of the Web.	

1. Use URIs as names for things.	

2. Use HTTP URIs so that people can look up those names.	

3. When someone looks up a URI, provide useful RDF information.	

4. Include RDF statements that link to other URIs so that they can discover
related things.	

Tim Berners-Lee, http://www.w3.org/DesignIssues/LinkedData.html, 2006
The RDF Data Model
Anja Jentzsch
dbpedia:Berlin
foaf:name
foaf:based_near
foaf:Person
rdf:type
ns:anja
Data Items are identified with HTTP URIs
ns:anja = http://www.anjeve.de#anja

dbpedia:Berlin = http://dbpedia.org/resource/Berlin
foaf:name
foaf:based_near
foaf:Person
rdf:type
ns:anja
dbpedia:Berlin
Anja Jentzsch
Resolving URIs over the Web
dp:Cities_in_Germany
3.499.879
dp:population
skos:subject
dbpedia:Berlin
foaf:name
foaf:based_near
foaf:Person
rdf:type
ns:anja
Anja Jentzsch
Dereferencing URIs over the Web
dbpedia:Hamburg
dbpedia:Muenchen
skos:subject
skos:subject
dp:Cities_in_Germany
3.499.879
dp:population
skos:subject
dbpedia:Berlin
foaf:name
foaf:based_near
foaf:Person
rdf:type
ns:anja
Anja Jentzsch
RDF Representation Formats
• RDF/XML	

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"	

xmlns:foaf="http://xmlns.com/foaf/0.1/">	

<foaf:Person rdf:about="http://anjeve.de#anja">	

<foaf:name>Anja Jentzsch</foaf:name>	

</foaf:Person>	

!
• RDF N-Triples	

<http://anjeve.de#anja> <http://www.w3.org/1999/02/22-rdf-syntax-
ns#type> <http://xmlns.com/foaf/0.1/Person> .	

<http://anjeve.de#anja> <http://xmlns.com/foaf/0.1/name> „Anja
Jentzsch“ .
RDF Representation Formats
!
!
!
<http://anjeve.de#anja> <http://www.w3.org/1999/02/22-
rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person>.	

!
<Subject> <Predicate> <Object> .
!
In the end it‘s all triples!
foaf:Person
rdf:type
ns:anja
Properties of the Web of Linked Data
• Global, distributed dataspace build on a simple set of standards	

• RDF, URIs, HTTP	

• Entities are connected by links	

• creating a global data graph that spans data sets and 	

• enables the discovery of new data sources	

• Provides for data-coexistence	

• Everyone can publish data to the Web of Linked Data	

• Everyone can express their personal view on things	

• Everybody can use the vocabularies/schemas that they like
W3C Linking Open Data Project
• Grassroots community effort to	

• publish existing open license datasets as Linked Data on the Web	

• interlink things between different data sets
LOD Data Sets on the Web: May 2007
• 12 data sets	

• 500+ million RDF triples 	

• 120,000+ RDF links between data sets
LOD Data Sets on the Web: November 2007
• 28 data sets
LOD Data Sets on the Web: September 2008
• 45 data sets	

• 2+ billion RDF triples
LOD Data Sets on the Web: July 2009
• 95 data sets	

• 6.5+ billion RDF triples
LOD Data Sets on the Web: September 2010
• 203 data sets	

• Over 24,7 billion RDF triples 	

• Over 436 million RDF links between data sets
LOD Data Sets on the Web: September 2011
• 295 data sets	

• 31+ billion RDF triples	

• 504+ million RDF links between data sets http://lod-cloud.net
LOD Data Sets on the Web:August 2014
http://lod-cloud.net
• 1,019 data sets	

• 84+ billion RDF triples	

• 808+ million RDF links between data sets
LOD Data Set statistics as of 08/2014
LOD Cloud Data Catalog on the Data Hub	

• http://datahub.io/group/lodcloud 	

More statistics	

• http://lod-cloud.net/state/
Heterogeneity on the Web of Data
• The Web of Data is heterogeneous	

• Many different vocabularies are in use (469 as of January 2015)	

• Different data formats	

• Many different ways to represent the same information
DBpedia –The Hub 

on the Web of Data
• DBpedia is a joint project with the following goals	

• extracting structured information from Wikipedia	

• publish this information under an open license on the Web	

• setting links to other data sources

!
• Partners	

• Universität Mannheim (Germany)	

• Universität Leipzig (Germany)	

• OpenLink Software (UK)
Extracting structured data from Wikipedia
Extracting structured data from Wikipedia
dbpedia:Berlin rdf:type dbpedia-owl:City ,	

dbpedia-owl:PopulatedPlace ,	

dbpedia-owl:Place ;	

rdfs:label "Berlin"@en , "Berlino"@it ;	

dbpedia-owl:population 3499879 ;	

wgs84:lat 52.500557 ;	

wgs84:long 13.398889 .	

! 	

dbpedia:SoundCloud dbpedia-owl:location dbpedia:Berlin .
• access to DBpedia data:	

• dumps	

• SPARQL endpoint	

• Linked Data interface
The DBpedia Data Set
• Information on more than 4.58 million “things”	

• 1,445,000 persons	

• 241,000 organisations	

• 735,000 places	

• 123,000 music albums	

• 87,000 movies	

• 251,000 species	

• overall more than 3 billion RDF triples	

• title and abstract in 125 different languages	

• 25,200,000 links to images	

• 29,800,000 links to external web pages	

• 50,000,000 links to other Linked Data sets
DBpedia Mappings
• since March 2010 collaborative editing of	

• DBpedia ontology	

• mappings from Wikipedia infoboxes and tables to DBpedia ontology	

• curated in a public wiki with instant validation methods	

• http://mappings.dbpedia.org	

• multilingual mappings to the DBpedia ontology:	

• ar, be, bg, bn, ca, cs, cy, de, el, en, eo, es, et, eu, fr, ga, hi, hr, hu, id, it, ja, ko, nl,
pl, pt, ru, sk, sl, sr, tr, ur, zh	

!
• allows for a significant increase of the extracted data’s quality	

• each domain has its experts
DBpedia Use Cases
1. Hub for the growing Web of Data	

2. Improvement of Wikipedia search	

3. Data source for applications and mashups	

4. Text analysis and annotation
DBpedia Mobile
• displays Wikipedia data on a map	

• aggregates different data sources
Faceted Wikipedia Search
• faceted browsing and free text search
http://spotlight.dbpedia.org
Uptake in the Government Domain
• The EU is pushing Linked Data (LOD2, LATC, EuroStat)	

• W3C eGovernment Interest Group
Uptake in the Library Community
• Libraries publishing Linked Data	

• Library of Congress (subject headings)	

• German National Library (PND dataset and subject headings)	

• Swedish National Library (Libris - catalog)	

• Hungarian National Library (OPAC and Digital Library)	

• Europeana	

• W3C Library Linked Data Incubator Group	

• Goals: 	

• Integrate library catalogs on global scale	

• Interconnect resources between repositories (by topic, by location, by
historical period, by ...)
Uptake in Life Sciences
• W3C Linking Open Drug Data Effort	

• Bio2RDF Project	

• Allen Brain Atlas	

!
!
!
!
!
!
• Goal: Smoothly integrate internal and external data in a pay-as-you-go-fashion.
Uptake in the Media Industry
• Publish data as Linked Data or RDFa	

• Goal: Drive traffic to websites via
search engines
Linked Data Applications
Linked Data Browsers
• Tabulator Browser (MIT, USA)	

• Marbles (MES / Uni Mannheim, DE)	

• OpenLink RDF Browser (OpenLink, UK)	

• Zitgist RDF Browser (Zitgist, USA)	

• Humboldt (HP Labs, UK)	

• Disco Hyperdata Browser (Uni Mannheim, DE)	

• Fenfire (DERI, Irland)
Web of Data Search Engines
• Sig.ma (DERI, Ireland)	

• Falcons (IWS, China)	

• Swoogle (UMBC, USA)	

• VisiNav (DERI, Ireland)	

• Watson (Open University, UK)
44
Finding Linked Data sets
• Search engines	

• find data sets based on keywords	

!
• Data catalogs / directories	

• explore data sets and faceted search	

!
• Data Marketplaces	

• explore and consume data sets
45
Is your data 5 star?
Tim Berners-Lee, http://www.w3.org/DesignIssues/LinkedData.html, 2010
Make your data available on the Web (in whatever format)
under an open license.	

Make it available as structured data (e.g., Excel instead of image
scan of a table) so that it can be reused.	

Use non-proprietary, open formats (e.g., CSV instead of Excel).	

!
Use URIs to identify things, so that people can point at your stuff
and serve RDF from it.	

Link your data to other data to provide context.
★	

★ ★ 	

★ ★ ★ 	

★ ★ ★ ★ 	

★ ★ ★ ★ ★
Economic Opportunities
• Huge reservoir of free content which can be used to provide background facts
about 	

• places, 	

• people, 	

• topics	

• …
Background Facts
schema.org
• jointly proposed vocabularies for embedding data into HTML pages (Microdata)	

• available since June 2011
Options for Search Companies
• New vertical search engines	

• Sig.ma, FalconS, …	

• Google Knowledge Graph	

• 1.6 billion facts (2014)	

• Yahoo	

• Improving text search with background knowledge
Lower Data Integration Costs
Overall data integration effort is split between:	

!
• Data Publisher	

– publishes data as RDF	

– sets identity links	

– reuses terms or publishes mappings	

• Third Parties	

– set identity links pointing at your data	

– publish mappings to the Web	

• Data Consumer	

– has to do the rest	

– using record linkage and schema matching
techniques
Conclusion
• The Web of Data is growing fast	

• Linked Data provides a standardized data access interface	

• Allows for easy data integration, enhancement and browsing	

• Web search is evolving into query answering	

• Search engines will increasingly rely on structured data from the Web	

• Next step: Linked Data within enterprises	

• Alternative to data warehouses and EAI middleware	

• Advantages: schema-less data model, pay-as-you go data integration
Thanks!
References:	

• Tom Heath, Christian Bizer: Linked Data: Evolving the Web into a Global Data Space	

http://linkeddatabook.com/	

• Linking Open Data Project Wiki 	

http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
Email: anja@anjeve.de	

Twitter: @anjeve

More Related Content

What's hot

2014-02-27 Wikidata talk Cambridge
2014-02-27 Wikidata talk Cambridge2014-02-27 Wikidata talk Cambridge
2014-02-27 Wikidata talk CambridgeMagnus Manske
 
Linked open data and libraries
Linked open data and librariesLinked open data and libraries
Linked open data and librariesAlison Hitchens
 
What is #LODLAM?! (revised January 2015)
What is #LODLAM?! (revised January 2015)What is #LODLAM?! (revised January 2015)
What is #LODLAM?! (revised January 2015)Alison Hitchens
 
DBpedia: A Public Data Infrastructure for the Web of Data
DBpedia: A Public Data Infrastructure for the Web of DataDBpedia: A Public Data Infrastructure for the Web of Data
DBpedia: A Public Data Infrastructure for the Web of DataSebastian Hellmann
 
20170501 Distributed Network of Digital Heritage Information
20170501  Distributed Network of Digital Heritage Information20170501  Distributed Network of Digital Heritage Information
20170501 Distributed Network of Digital Heritage InformationEnno Meijers
 
Discovering Related Data Sources in Data Portals
Discovering Related Data Sources in Data PortalsDiscovering Related Data Sources in Data Portals
Discovering Related Data Sources in Data PortalsPeter Haase
 
Linked Data Implementations—Who, What and Why?
Linked Data Implementations—Who, What and Why?Linked Data Implementations—Who, What and Why?
Linked Data Implementations—Who, What and Why?OCLC
 
New approaches for data acquisition at europeana iiif, sitemaps and schema.o...
New approaches for data acquisition at europeana  iiif, sitemaps and schema.o...New approaches for data acquisition at europeana  iiif, sitemaps and schema.o...
New approaches for data acquisition at europeana iiif, sitemaps and schema.o...Nuno Freire
 
Viaf and isni ifla 2013 08-16
Viaf and isni  ifla 2013 08-16Viaf and isni  ifla 2013 08-16
Viaf and isni ifla 2013 08-16Janifer Gatenby
 
Elephant in the Room: Scaling Storage for the HathiTrust Research Center
Elephant in the Room: Scaling Storage for the HathiTrust Research CenterElephant in the Room: Scaling Storage for the HathiTrust Research Center
Elephant in the Room: Scaling Storage for the HathiTrust Research CenterRobert H. McDonald
 
Scripting User Contributed Interlinking
Scripting User Contributed InterlinkingScripting User Contributed Interlinking
Scripting User Contributed Interlinkingwhalb
 
Session 1.2 improving access to digital content by semantic enrichment
Session 1.2   improving access to digital content by semantic enrichmentSession 1.2   improving access to digital content by semantic enrichment
Session 1.2 improving access to digital content by semantic enrichmentsemanticsconference
 
lodlam summit session browsable linked data
lodlam summit session browsable linked datalodlam summit session browsable linked data
lodlam summit session browsable linked dataEnno Meijers
 
Clipper, research data network
Clipper, research data networkClipper, research data network
Clipper, research data networkJisc RDM
 
DBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
DBpedia Mappings Wiki, SMWCon Fall 2013, BerlinDBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
DBpedia Mappings Wiki, SMWCon Fall 2013, BerlinAnja Jentzsch
 
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...Martin Klein
 
Methodological Guidelines for Publishing Linked Data
Methodological Guidelines for Publishing Linked DataMethodological Guidelines for Publishing Linked Data
Methodological Guidelines for Publishing Linked DataBoris Villazón-Terrazas
 
WG5: A data wrangling experiment
WG5: A data wrangling experimentWG5: A data wrangling experiment
WG5: A data wrangling experimentWARCnet
 
Semantic web 101: Benefits for geologists
Semantic web 101: Benefits for geologistsSemantic web 101: Benefits for geologists
Semantic web 101: Benefits for geologistsdgarijo
 
ORDS, research data network
ORDS, research data networkORDS, research data network
ORDS, research data networkJisc RDM
 

What's hot (20)

2014-02-27 Wikidata talk Cambridge
2014-02-27 Wikidata talk Cambridge2014-02-27 Wikidata talk Cambridge
2014-02-27 Wikidata talk Cambridge
 
Linked open data and libraries
Linked open data and librariesLinked open data and libraries
Linked open data and libraries
 
What is #LODLAM?! (revised January 2015)
What is #LODLAM?! (revised January 2015)What is #LODLAM?! (revised January 2015)
What is #LODLAM?! (revised January 2015)
 
DBpedia: A Public Data Infrastructure for the Web of Data
DBpedia: A Public Data Infrastructure for the Web of DataDBpedia: A Public Data Infrastructure for the Web of Data
DBpedia: A Public Data Infrastructure for the Web of Data
 
20170501 Distributed Network of Digital Heritage Information
20170501  Distributed Network of Digital Heritage Information20170501  Distributed Network of Digital Heritage Information
20170501 Distributed Network of Digital Heritage Information
 
Discovering Related Data Sources in Data Portals
Discovering Related Data Sources in Data PortalsDiscovering Related Data Sources in Data Portals
Discovering Related Data Sources in Data Portals
 
Linked Data Implementations—Who, What and Why?
Linked Data Implementations—Who, What and Why?Linked Data Implementations—Who, What and Why?
Linked Data Implementations—Who, What and Why?
 
New approaches for data acquisition at europeana iiif, sitemaps and schema.o...
New approaches for data acquisition at europeana  iiif, sitemaps and schema.o...New approaches for data acquisition at europeana  iiif, sitemaps and schema.o...
New approaches for data acquisition at europeana iiif, sitemaps and schema.o...
 
Viaf and isni ifla 2013 08-16
Viaf and isni  ifla 2013 08-16Viaf and isni  ifla 2013 08-16
Viaf and isni ifla 2013 08-16
 
Elephant in the Room: Scaling Storage for the HathiTrust Research Center
Elephant in the Room: Scaling Storage for the HathiTrust Research CenterElephant in the Room: Scaling Storage for the HathiTrust Research Center
Elephant in the Room: Scaling Storage for the HathiTrust Research Center
 
Scripting User Contributed Interlinking
Scripting User Contributed InterlinkingScripting User Contributed Interlinking
Scripting User Contributed Interlinking
 
Session 1.2 improving access to digital content by semantic enrichment
Session 1.2   improving access to digital content by semantic enrichmentSession 1.2   improving access to digital content by semantic enrichment
Session 1.2 improving access to digital content by semantic enrichment
 
lodlam summit session browsable linked data
lodlam summit session browsable linked datalodlam summit session browsable linked data
lodlam summit session browsable linked data
 
Clipper, research data network
Clipper, research data networkClipper, research data network
Clipper, research data network
 
DBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
DBpedia Mappings Wiki, SMWCon Fall 2013, BerlinDBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
DBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
 
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
 
Methodological Guidelines for Publishing Linked Data
Methodological Guidelines for Publishing Linked DataMethodological Guidelines for Publishing Linked Data
Methodological Guidelines for Publishing Linked Data
 
WG5: A data wrangling experiment
WG5: A data wrangling experimentWG5: A data wrangling experiment
WG5: A data wrangling experiment
 
Semantic web 101: Benefits for geologists
Semantic web 101: Benefits for geologistsSemantic web 101: Benefits for geologists
Semantic web 101: Benefits for geologists
 
ORDS, research data network
ORDS, research data networkORDS, research data network
ORDS, research data network
 

Similar to Evolving the Web into a Global Dataspace with Linked Data

Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)Anja Jentzsch
 
Open Data - Principles and Techniques
Open Data - Principles and TechniquesOpen Data - Principles and Techniques
Open Data - Principles and TechniquesBernhard Haslhofer
 
Introduction to linked data
Introduction to linked dataIntroduction to linked data
Introduction to linked dataLaura Po
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked DataMarin Dimitrov
 
CLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage informationCLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage informationEnno Meijers
 
Linked Open Data in Romania
Linked Open Data in RomaniaLinked Open Data in Romania
Linked Open Data in RomaniaVlad Posea
 
Linked Open Data for Cultural Heritage
Linked Open Data for Cultural HeritageLinked Open Data for Cultural Heritage
Linked Open Data for Cultural HeritageNoreen Whysel
 
Linked Data for the Masses: The approach and the Software
Linked Data for the Masses: The approach and the SoftwareLinked Data for the Masses: The approach and the Software
Linked Data for the Masses: The approach and the SoftwareIMC Technologies
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked DataEUCLID project
 
Cloud-based Linked Data Management for Self-service Application Development
Cloud-based Linked Data Management for Self-service Application DevelopmentCloud-based Linked Data Management for Self-service Application Development
Cloud-based Linked Data Management for Self-service Application DevelopmentPeter Haase
 
What is New in W3C land?
What is New in W3C land?What is New in W3C land?
What is New in W3C land?Ivan Herman
 
The Web of data and web data commons
The Web of data and web data commonsThe Web of data and web data commons
The Web of data and web data commonsJesse Wang
 
Informal presentation about RES
Informal presentation about RESInformal presentation about RES
Informal presentation about RESChristophe Guéret
 
Linked open data project
Linked open data projectLinked open data project
Linked open data projectFaathima Fayaza
 

Similar to Evolving the Web into a Global Dataspace with Linked Data (20)

Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)Linked Data (1st Linked Data Meetup Malmö)
Linked Data (1st Linked Data Meetup Malmö)
 
Linked Data Basics
Linked Data BasicsLinked Data Basics
Linked Data Basics
 
Open Data - Principles and Techniques
Open Data - Principles and TechniquesOpen Data - Principles and Techniques
Open Data - Principles and Techniques
 
Linked (Open) Data
Linked (Open) DataLinked (Open) Data
Linked (Open) Data
 
Introduction to linked data
Introduction to linked dataIntroduction to linked data
Introduction to linked data
 
Finding Data Sets
Finding Data SetsFinding Data Sets
Finding Data Sets
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked Data
 
CLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage informationCLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage information
 
Linked data 20171106
Linked data 20171106Linked data 20171106
Linked data 20171106
 
Linked Open Data in Romania
Linked Open Data in RomaniaLinked Open Data in Romania
Linked Open Data in Romania
 
Linked Data
Linked DataLinked Data
Linked Data
 
Linked Open Data for Cultural Heritage
Linked Open Data for Cultural HeritageLinked Open Data for Cultural Heritage
Linked Open Data for Cultural Heritage
 
Linked Data for the Masses: The approach and the Software
Linked Data for the Masses: The approach and the SoftwareLinked Data for the Masses: The approach and the Software
Linked Data for the Masses: The approach and the Software
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked Data
 
Cloud-based Linked Data Management for Self-service Application Development
Cloud-based Linked Data Management for Self-service Application DevelopmentCloud-based Linked Data Management for Self-service Application Development
Cloud-based Linked Data Management for Self-service Application Development
 
What is New in W3C land?
What is New in W3C land?What is New in W3C land?
What is New in W3C land?
 
NISO Webinar: Library Linked Data: From Vision to Reality
NISO Webinar: Library Linked Data: From Vision to RealityNISO Webinar: Library Linked Data: From Vision to Reality
NISO Webinar: Library Linked Data: From Vision to Reality
 
The Web of data and web data commons
The Web of data and web data commonsThe Web of data and web data commons
The Web of data and web data commons
 
Informal presentation about RES
Informal presentation about RESInformal presentation about RES
Informal presentation about RES
 
Linked open data project
Linked open data projectLinked open data project
Linked open data project
 

Recently uploaded

SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)riyaescorts54
 
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxGenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxBerniceCayabyab1
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.PraveenaKalaiselvan1
 
Transposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptTransposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptArshadWarsi13
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensorsonawaneprad
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxyaramohamed343013
 
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdfPests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdfPirithiRaju
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxFarihaAbdulRasheed
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfSELF-EXPLANATORY
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real timeSatoshi NAKAHIRA
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trssuser06f238
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxSwapnil Therkar
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxNandakishor Bhaurao Deshmukh
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfSwapnil Therkar
 
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPirithiRaju
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxpriyankatabhane
 
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |aasikanpl
 

Recently uploaded (20)

SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
 
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxGenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
 
Volatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -IVolatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -I
 
Transposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptTransposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.ppt
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensor
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docx
 
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdfPests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 tr
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
 
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
 
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
 

Evolving the Web into a Global Dataspace with Linked Data

  • 1. 
 Linked Data - Evolving the Web into a Global Dataspace Anja Jentzsch - @anjeve Hasso Plattner Institute, Potsdam, Germany ! ! ! Open Data Lecture, HTW Berlin 2015/01/12
  • 2. Architecture of the classic Web B C HTMLHTML Web 
 Browsers Search 
 Engines hyper-
 links A HTML • Single global document space • Small set of simple standards • HTML as document format • HTTP URLs as globally unique IDs ! • Retrieval mechanism: Hyperlinks to connect everything
  • 3. Web 2.0 APIs and Mashups No single global dataspace ! Shortcomings 1. APIs have proprietary interfaces 2. Mashups are based on a fixed set of data sources 3. No hyperlinks between data items within different APIs Web
 API A Mashup Web
 API B Web
 API C Web
 API D
  • 4. Web APIs slice the Web into Walled Gardens Image: Bob Jagensdorf, http://flickr.com/photos/darwinbell/, CC-BY
  • 5. Extend the Web with a single global dataspace 1. by using RDF to publish structured data on the Web 2. by setting links between data items within different data sources Linked Data B C RDF RDF
 Links A D E RDF
 Links RDF
 Links RDF
 Links RDF RDF RDF RDF RDF RDF RDF RDF RDF
  • 6. Linked Data Principles Set of best practices for publishing structured data on the Web in accordance with the general architecture of the Web. 1. Use URIs as names for things. 2. Use HTTP URIs so that people can look up those names. 3. When someone looks up a URI, provide useful RDF information. 4. Include RDF statements that link to other URIs so that they can discover related things. Tim Berners-Lee, http://www.w3.org/DesignIssues/LinkedData.html, 2006
  • 7. The RDF Data Model Anja Jentzsch dbpedia:Berlin foaf:name foaf:based_near foaf:Person rdf:type ns:anja
  • 8. Data Items are identified with HTTP URIs ns:anja = http://www.anjeve.de#anja
 dbpedia:Berlin = http://dbpedia.org/resource/Berlin foaf:name foaf:based_near foaf:Person rdf:type ns:anja dbpedia:Berlin Anja Jentzsch
  • 9. Resolving URIs over the Web dp:Cities_in_Germany 3.499.879 dp:population skos:subject dbpedia:Berlin foaf:name foaf:based_near foaf:Person rdf:type ns:anja Anja Jentzsch
  • 10. Dereferencing URIs over the Web dbpedia:Hamburg dbpedia:Muenchen skos:subject skos:subject dp:Cities_in_Germany 3.499.879 dp:population skos:subject dbpedia:Berlin foaf:name foaf:based_near foaf:Person rdf:type ns:anja Anja Jentzsch
  • 11. RDF Representation Formats • RDF/XML <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:foaf="http://xmlns.com/foaf/0.1/"> <foaf:Person rdf:about="http://anjeve.de#anja"> <foaf:name>Anja Jentzsch</foaf:name> </foaf:Person> ! • RDF N-Triples <http://anjeve.de#anja> <http://www.w3.org/1999/02/22-rdf-syntax- ns#type> <http://xmlns.com/foaf/0.1/Person> . <http://anjeve.de#anja> <http://xmlns.com/foaf/0.1/name> „Anja Jentzsch“ .
  • 12. RDF Representation Formats ! ! ! <http://anjeve.de#anja> <http://www.w3.org/1999/02/22- rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person>. ! <Subject> <Predicate> <Object> . ! In the end it‘s all triples! foaf:Person rdf:type ns:anja
  • 13. Properties of the Web of Linked Data • Global, distributed dataspace build on a simple set of standards • RDF, URIs, HTTP • Entities are connected by links • creating a global data graph that spans data sets and • enables the discovery of new data sources • Provides for data-coexistence • Everyone can publish data to the Web of Linked Data • Everyone can express their personal view on things • Everybody can use the vocabularies/schemas that they like
  • 14. W3C Linking Open Data Project • Grassroots community effort to • publish existing open license datasets as Linked Data on the Web • interlink things between different data sets
  • 15. LOD Data Sets on the Web: May 2007 • 12 data sets • 500+ million RDF triples • 120,000+ RDF links between data sets
  • 16. LOD Data Sets on the Web: November 2007 • 28 data sets
  • 17. LOD Data Sets on the Web: September 2008 • 45 data sets • 2+ billion RDF triples
  • 18. LOD Data Sets on the Web: July 2009 • 95 data sets • 6.5+ billion RDF triples
  • 19. LOD Data Sets on the Web: September 2010 • 203 data sets • Over 24,7 billion RDF triples • Over 436 million RDF links between data sets
  • 20. LOD Data Sets on the Web: September 2011 • 295 data sets • 31+ billion RDF triples • 504+ million RDF links between data sets http://lod-cloud.net
  • 21. LOD Data Sets on the Web:August 2014 http://lod-cloud.net • 1,019 data sets • 84+ billion RDF triples • 808+ million RDF links between data sets
  • 22. LOD Data Set statistics as of 08/2014 LOD Cloud Data Catalog on the Data Hub • http://datahub.io/group/lodcloud More statistics • http://lod-cloud.net/state/
  • 23. Heterogeneity on the Web of Data • The Web of Data is heterogeneous • Many different vocabularies are in use (469 as of January 2015) • Different data formats • Many different ways to represent the same information
  • 24. DBpedia –The Hub 
 on the Web of Data • DBpedia is a joint project with the following goals • extracting structured information from Wikipedia • publish this information under an open license on the Web • setting links to other data sources
 ! • Partners • Universität Mannheim (Germany) • Universität Leipzig (Germany) • OpenLink Software (UK)
  • 25. Extracting structured data from Wikipedia
  • 26. Extracting structured data from Wikipedia dbpedia:Berlin rdf:type dbpedia-owl:City , dbpedia-owl:PopulatedPlace , dbpedia-owl:Place ; rdfs:label "Berlin"@en , "Berlino"@it ; dbpedia-owl:population 3499879 ; wgs84:lat 52.500557 ; wgs84:long 13.398889 . ! dbpedia:SoundCloud dbpedia-owl:location dbpedia:Berlin . • access to DBpedia data: • dumps • SPARQL endpoint • Linked Data interface
  • 27. The DBpedia Data Set • Information on more than 4.58 million “things” • 1,445,000 persons • 241,000 organisations • 735,000 places • 123,000 music albums • 87,000 movies • 251,000 species • overall more than 3 billion RDF triples • title and abstract in 125 different languages • 25,200,000 links to images • 29,800,000 links to external web pages • 50,000,000 links to other Linked Data sets
  • 28. DBpedia Mappings • since March 2010 collaborative editing of • DBpedia ontology • mappings from Wikipedia infoboxes and tables to DBpedia ontology • curated in a public wiki with instant validation methods • http://mappings.dbpedia.org • multilingual mappings to the DBpedia ontology: • ar, be, bg, bn, ca, cs, cy, de, el, en, eo, es, et, eu, fr, ga, hi, hr, hu, id, it, ja, ko, nl, pl, pt, ru, sk, sl, sr, tr, ur, zh ! • allows for a significant increase of the extracted data’s quality • each domain has its experts
  • 29. DBpedia Use Cases 1. Hub for the growing Web of Data 2. Improvement of Wikipedia search 3. Data source for applications and mashups 4. Text analysis and annotation
  • 30.
  • 31. DBpedia Mobile • displays Wikipedia data on a map • aggregates different data sources
  • 32. Faceted Wikipedia Search • faceted browsing and free text search
  • 33.
  • 35. Uptake in the Government Domain • The EU is pushing Linked Data (LOD2, LATC, EuroStat) • W3C eGovernment Interest Group
  • 36. Uptake in the Library Community • Libraries publishing Linked Data • Library of Congress (subject headings) • German National Library (PND dataset and subject headings) • Swedish National Library (Libris - catalog) • Hungarian National Library (OPAC and Digital Library) • Europeana • W3C Library Linked Data Incubator Group • Goals: • Integrate library catalogs on global scale • Interconnect resources between repositories (by topic, by location, by historical period, by ...)
  • 37. Uptake in Life Sciences • W3C Linking Open Drug Data Effort • Bio2RDF Project • Allen Brain Atlas ! ! ! ! ! ! • Goal: Smoothly integrate internal and external data in a pay-as-you-go-fashion.
  • 38. Uptake in the Media Industry • Publish data as Linked Data or RDFa • Goal: Drive traffic to websites via search engines
  • 40. Linked Data Browsers • Tabulator Browser (MIT, USA) • Marbles (MES / Uni Mannheim, DE) • OpenLink RDF Browser (OpenLink, UK) • Zitgist RDF Browser (Zitgist, USA) • Humboldt (HP Labs, UK) • Disco Hyperdata Browser (Uni Mannheim, DE) • Fenfire (DERI, Irland)
  • 41.
  • 42. Web of Data Search Engines • Sig.ma (DERI, Ireland) • Falcons (IWS, China) • Swoogle (UMBC, USA) • VisiNav (DERI, Ireland) • Watson (Open University, UK)
  • 43.
  • 44. 44 Finding Linked Data sets • Search engines • find data sets based on keywords ! • Data catalogs / directories • explore data sets and faceted search ! • Data Marketplaces • explore and consume data sets
  • 45. 45
  • 46. Is your data 5 star? Tim Berners-Lee, http://www.w3.org/DesignIssues/LinkedData.html, 2010 Make your data available on the Web (in whatever format) under an open license. Make it available as structured data (e.g., Excel instead of image scan of a table) so that it can be reused. Use non-proprietary, open formats (e.g., CSV instead of Excel). ! Use URIs to identify things, so that people can point at your stuff and serve RDF from it. Link your data to other data to provide context. ★ ★ ★ ★ ★ ★ ★ ★ ★ ★ ★ ★ ★ ★ ★
  • 47. Economic Opportunities • Huge reservoir of free content which can be used to provide background facts about • places, • people, • topics • … Background Facts
  • 48. schema.org • jointly proposed vocabularies for embedding data into HTML pages (Microdata) • available since June 2011
  • 49. Options for Search Companies • New vertical search engines • Sig.ma, FalconS, … • Google Knowledge Graph • 1.6 billion facts (2014) • Yahoo • Improving text search with background knowledge
  • 50. Lower Data Integration Costs Overall data integration effort is split between: ! • Data Publisher – publishes data as RDF – sets identity links – reuses terms or publishes mappings • Third Parties – set identity links pointing at your data – publish mappings to the Web • Data Consumer – has to do the rest – using record linkage and schema matching techniques
  • 51. Conclusion • The Web of Data is growing fast • Linked Data provides a standardized data access interface • Allows for easy data integration, enhancement and browsing • Web search is evolving into query answering • Search engines will increasingly rely on structured data from the Web • Next step: Linked Data within enterprises • Alternative to data warehouses and EAI middleware • Advantages: schema-less data model, pay-as-you go data integration
  • 52. Thanks! References: • Tom Heath, Christian Bizer: Linked Data: Evolving the Web into a Global Data Space http://linkeddatabook.com/ • Linking Open Data Project Wiki http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData Email: anja@anjeve.de Twitter: @anjeve