Linking Open Data. Danny Ayers, SemAst2009
Obligatory Children Slide
Credits: Many slides derived from those of Tom Heath, Michael Hausenblas, Chris Bizer, Richard Cyganiak, Olaf Hartig
Quote: "Linked Data is the Semantic Web done right, and the Web done right." - Tim Berners-Lee
Objectives: Introduce the concept of Linked Data; highlight why you would want to publish Linked Data on the Web; introduce the principles and best practices of publishing Linked Data on the Web; demonstrate the creation and consumption of Linked Data; answer your burning Linked Data publishing questions.
Overview: From a Web of Documents to a Web of Data; Web APIs, Microformats, and Linked Data; Linked Data deployment on the Web (what data is out there?); applications (what is being done with the data?).
Classic Data : in silos Image: Bob Jagensdorf, http://flickr.com/photos/darwinbell/, CC-BY
The Classic Web: single global information space; URLs as globally unique IDs and retrieval mechanism; HTML as shared content format; hyperlinks. Shortcomings: content is not well structured; you cannot ask expressive queries; you cannot process content within applications. [Diagram: HTML documents A, B and C connected by hyperlinks, consumed by Web browsers and search engines]
What do we actually want? Use the Web like a  single global  database.
Solution: publish structured data directly on the Web. Different approaches: Web APIs, Microformats, Linked Data.
Web APIs
Mashups. Positive: APIs expose structured data; APIs enable new applications. Negative: usually proprietary interfaces; mashups are based only on a fixed set of sources; you usually can't set hyperlinks between data objects. [Diagram: a mashup built on top of Web APIs A, B, C and D]
Web APIs slice the Web into separate data silos Image: Bob Jagensdorf, http://flickr.com/photos/darwinbell/, CC-BY
What do we want? Use the Web as a single global database  using what we know works on the Web
Connecting Worlds One Web! Data Documents
What works on the Web? Uniform identifiers (URIs); a common interface protocol (HTTP); standard representation formats (HTML, Atom/RSS etc.)
What works on the Web? The Hyperlink
Evolving the Link: <a href="http://example.org/home.html">Home Page</a> [Diagram: page.html links to home.html]
Evolving the Link: <a href="http://example.org/home.html" rel="home">Home Page</a> [Diagram: page.html links to home.html, link labelled "home"]
Microformats: embed structured data into HTML pages (hCard, hCalendar, hReview, XFN, ...). Compatible with the idea of the Web as a single information space. Shortcomings: only a fixed set of microformats exists; no direct way to connect data items. Example: <div class="vevent"> <span class="summary">bdigital</span> <abbr class="dtstart" title="2008-05-20">May 20</abbr> - <abbr class="dtend" title="2007-05-22">22</abbr> </div>
Evolving the Link: <rdf:Description rdf:about="http://example.org/page"> <x:home rdf:resource="http://example.org/home" /> </rdf:Description> [Diagram: page links to home, link labelled x:home]
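To make the "embed structured data into HTML" idea concrete, here is a minimal sketch of how a consumer might pull the hCalendar fields out of the snippet on this slide. It uses only Python's standard `html.parser`; the class and function names are illustrative assumptions, and it handles only the three properties shown, not the full hCalendar format.

```python
from html.parser import HTMLParser


class HCalendarParser(HTMLParser):
    """Collect summary/dtstart/dtend from a single vevent (illustrative subset)."""

    PROPS = {"summary", "dtstart", "dtend"}

    def __init__(self):
        super().__init__()
        self.event = {}
        self._pending = None  # property whose value is the element's text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        for prop in classes & self.PROPS:
            if tag == "abbr" and "title" in attrs:
                # abbr pattern: the machine-readable value lives in @title
                self.event[prop] = attrs["title"]
            else:
                self._pending = prop

    def handle_data(self, data):
        if self._pending and data.strip():
            self.event[self._pending] = data.strip()
            self._pending = None


def parse_vevent(html):
    parser = HCalendarParser()
    parser.feed(html)
    parser.close()
    return parser.event


# The markup from the slide, with attribute quotes restored.
SNIPPET = """<div class="vevent">
  <span class="summary">bdigital</span>
  <abbr class="dtstart" title="2008-05-20">May 20</abbr> -
  <abbr class="dtend" title="2007-05-22">22</abbr>
</div>"""

event = parse_vevent(SNIPPET)
```

Note that the parser has to know the class names in advance, which is the "fixed set of microformats" shortcoming in practice.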
Linked Data: use Semantic Web technologies to publish structured data on the Web and to set links between data from one data source and data within other data sources. [Diagram: data sources A to E, each containing Things, connected by typed links]
RDF Relations: <rdf:Description rdf:about="http://example.org/page"> <x:home rdf:resource="http://example.org/home" /> </rdf:Description> [Diagram: subject http://example.org/page, predicate x:home, object http://example.org/home]
Identifiers: identification by description; non-HTTP URIs; HTTP URIs
Linked Data Principles: 1. Use URIs as names for things. 2. Use HTTP URIs so that people can look up those names. 3. When someone looks up a URI, provide useful information. 4. Include links to other URIs, so that they can discover more things. Tim Berners-Lee 2007 http://www.w3.org/DesignIssues/LinkedData.html
The RDF Data Model. [Graph: pd:cygri foaf:name "Richard Cyganiak"; pd:cygri rdf:type foaf:Person; pd:cygri foaf:based_near dbpedia:Berlin]
Data objects are identified with HTTP URIs. [Graph: pd:cygri foaf:name "Richard Cyganiak"; pd:cygri rdf:type foaf:Person; pd:cygri foaf:based_near dbpedia:Berlin] pd:cygri = http://richard.cyganiak.de/foaf.rdf#cygri; dbpedia:Berlin = http://dbpedia.org/resource/Berlin
Follow Your Nose... [Graph, extended: dbpedia:Berlin dp:population 3.405.259; dbpedia:Berlin skos:subject dp:Cities_in_Germany]
Dereferencing URIs over the Web. [Graph, extended further: dbpedia:Hamburg skos:subject dp:Cities_in_Germany; dbpedia:Muenchen skos:subject dp:Cities_in_Germany]
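The subject/predicate/object reading of the RDF/XML above can be sketched with Python's standard `xml.etree.ElementTree`. This is not a real RDF parser: it handles only the `rdf:Description` / `rdf:resource` pattern on this slide, and the namespace bound to `x:` is an assumption, since the slide does not declare one.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
X_NS = "http://example.org/vocab#"  # assumed namespace for the x: prefix

# The slide's snippet wrapped in rdf:RDF so the prefixes are declared.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:x="{X_NS}">
  <rdf:Description rdf:about="http://example.org/page">
    <x:home rdf:resource="http://example.org/home"/>
  </rdf:Description>
</rdf:RDF>"""


def triples(rdfxml):
    """Yield (subject, predicate, object) for simple resource-valued properties."""
    root = ET.fromstring(rdfxml)
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree spells qualified names as "{namespace}local"
            predicate = prop.tag.replace("{", "").replace("}", "")
            obj = prop.get(f"{{{RDF_NS}}}resource")
            yield (subject, predicate, obj)


result = list(triples(DOC))
```

The link is now a first-class statement: the predicate is itself a URI rather than a string in a `rel` attribute.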
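Principles 2 and 3 amount to an ordinary HTTP GET with content negotiation. The sketch below, using only Python's `urllib`, shows one way a client might ask for RDF; the function names are illustrative assumptions, and whether a given server honours the `Accept` header is up to that server.

```python
import urllib.request


def rdf_request(uri):
    """Build a GET request asking for an RDF/XML description of `uri`."""
    # The fragment (#cygri) identifies the thing; only the document part
    # of the URI is sent to the server.
    doc_uri = uri.split("#", 1)[0]
    return urllib.request.Request(
        doc_uri, headers={"Accept": "application/rdf+xml"}
    )


def look_up(uri, timeout=10):
    """Dereference `uri`; returns (content type, raw body). Needs network access."""
    with urllib.request.urlopen(rdf_request(uri), timeout=timeout) as resp:
        return resp.headers.get_content_type(), resp.read()


req = rdf_request("http://richard.cyganiak.de/foaf.rdf#cygri")
```

Calling `look_up("http://dbpedia.org/resource/Berlin")` would perform the actual lookup, following any redirect the server issues.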
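The short names on these slides are abbreviations for full HTTP URIs. A minimal sketch of the expansion, assuming a plain prefix table (the `pd:` and `dbpedia:` bindings are the ones given on the slide; `expand` is an illustrative helper, not part of any library):

```python
PREFIXES = {
    "pd": "http://richard.cyganiak.de/foaf.rdf#",
    "dbpedia": "http://dbpedia.org/resource/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def expand(qname, prefixes=PREFIXES):
    """Turn a prefixed name such as 'pd:cygri' into a full URI."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {qname!r}")
    return prefixes[prefix] + local


cygri = expand("pd:cygri")
berlin = expand("dbpedia:Berlin")
```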
Universal Connector <resourceB> rdfs:seeAlso <resourceA>
Universal Connector <resourceB> rdfs:seeAlso <resourceA> foaf:Person rdf:type
Take Care! <resourceB> owl:sameAs <resourceA>
The Disco – Hyperdata Browser
 
Semantic Web Specifications: HTTP + URIs; RDF + RDFS; OWL; SKOS; SPARQL; RDFa; GRDDL; Turtle/N3. Coming soon: POWDER, OWL2, RIF
What of Web Services? "Web APIs Are Just Web Sites" - Paul Downey. Linked Data is exposed as (very simple) RESTful services.
What of RDF Stores? A triplestore is just a cache of a chunk of the Semantic Web. Convenient for data merging, querying (SPARQL) and inference. See also: Semantic Web Client Library
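As a rough illustration of "a cache of a chunk of the Semantic Web", the sketch below keeps triples in a Python set, merges two sources by simple union, and answers wildcard pattern queries. `TripleStore` is a toy written for this example, not a real triplestore API, and the sample data is taken from the graph slides earlier in the deck.

```python
class TripleStore:
    """Toy in-memory store: a set of (subject, predicate, object) tuples."""

    def __init__(self):
        self.triples = set()

    def add_all(self, triples):
        # Merging data from another source is just set union.
        self.triples.update(triples)

    def match(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self.triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


foaf_profile = [
    ("pd:cygri", "foaf:name", "Richard Cyganiak"),
    ("pd:cygri", "foaf:based_near", "dbpedia:Berlin"),
]
dbpedia_data = [
    ("dbpedia:Berlin", "dp:population", "3.405.259"),
    ("dbpedia:Berlin", "skos:subject", "dp:Cities_in_Germany"),
]

store = TripleStore()
store.add_all(foaf_profile)
store.add_all(dbpedia_data)

# Two-step lookup across the merged sources: where is cygri based,
# and what is that place's population?
place = store.match(s="pd:cygri", p="foaf:based_near")[0][2]
population = store.match(s=place, p="dp:population")[0][2]
```

Because both sources use the same URI for Berlin, the merge needs no mapping step; a SPARQL engine generalises the two-step lookup into a single query.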
What of the Real World? source:  http://danbri.org/words/2008/04/15/300 based on timbl 1994
W3C Linking Open Data Project: a community effort to publish existing open-license datasets as Linked Data on the Web and to interlink things between different data sources.
LOD Datasets on the Web:  May 2007
LOD Datasets on the Web:  August 2007
LOD Datasets on the Web: February 2008
LOD Datasets on the Web: September 2008
LOD Datasets on the Web:  March 2009
Spotlight: Geonames. Over 8 million geographical locations; feature hierarchy.
Spotlight: DBpedia. Extracts structured data from Wikipedia; covers over 2.2 million concepts from various domains.
Example RDF Links
    RDF links from DBpedia to other data sources:
    <http://dbpedia.org/resource/Berlin> owl:sameAs <http://sws.geonames.org/2950159> .
    <http://dbpedia.org/resource/Tim_Berners-Lee> owl:sameAs <http://www4.wiwiss.fu-berlin.de/dblp/resource/person/100007> .
    RDF link from a FOAF profile to DBpedia:
    <http://richard.cyganiak.de/foaf.rdf#cygri> foaf:topic_interest <http://dbpedia.org/resource/Semantic_Web> .
Organizations publishing Linked Data. Universities and research institutes: Massachusetts Institute of Technology (USA); University of Southampton (UK); Freie Universität Berlin (DE); DERI (IRE); KMi, Open University (UK); University of London (UK); Universität Hannover (DE); University of Pennsylvania (USA); Universität Leipzig (DE); Universität Karlsruhe (DE); Joanneum (AT); University of Toronto (CA). Companies: BBC (UK); OpenLink (UK); Zitgist (USA); Talis (UK); Garlik (UK); Mondeca (FR); Cyc Foundation (USA).
Applications: What can I do with this? Linked Data browsers; Linked Data mashups; search engines. [Diagram: data sources A to E, each containing Things, connected by typed links]
Linked Data Browsers: Tabulator Browser (MIT, USA); Marbles (FU Berlin, DE); OpenLink RDF Browser (OpenLink, UK); Zitgist RDF Browser (Zitgist, USA); Disco Hyperdata Browser (FU Berlin, DE); Fenfire (DERI, Ireland)
Tabulator
Linked Data Mashups: domain-specific applications using Linked Data from the Web
DBpedia Mobile: geospatial entry point into the Web of Data. Starts with DBpedia, Revyu and Flickr data.
Web of Data Search Engines: Falcons (IWS, China); Sindice (DERI, Ireland); MicroSearch (Yahoo, Spain); Watson (Open University, UK); SWSE (DERI, Ireland); Swoogle (UMBC, USA)
Falcons
Why publish Linked Data on the Web? Your data becomes part of a single global data space (the Web of Data = Semantic Web). People can use various data browsers to explore your data. Your data is crawled by Semantic Web search engines and can be used by independent applications. People start setting links to your data, which allows more people to find and use your data.
Why publish Linked Data on the Web? Linked Data builds on the classic architecture of the Web. Linked Data is more generic than Web APIs or Microformats. It builds on standards, in contrast to proprietary Web APIs. It enables applications that work against an unbounded set of data sources and incorporate new data sources as they become available on the Web.
Linked Data Principles: 1. Use URIs as names for things. 2. Use HTTP URIs so that people can look up those names. 3. When someone looks up a URI, provide useful information. 4. Include links to other URIs, so that they can discover more things. Tim Berners-Lee 2007 http://www.w3.org/DesignIssues/LinkedData.html
Example : Source Data
Example: First Pass (RDF Turtle/N3 syntax)
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    @prefix x: <http://purl.org/stuff/astro#> .
    x:Megrez rdf:type x:A3_Dwarf ;
        x:magnitude "3.1" ;
        x:partOf [ rdf:type x:Constellation ; x:hasName "Ursa Major" ] .
Example: Vocabulary
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix x: <http://purl.org/stuff/astro#> .
    x:Star rdf:type rdfs:Class .
    x:Constellation rdf:type rdfs:Class .
    x:A3_Dwarf rdfs:subClassOf x:Star .
    x:magnitude rdf:type rdf:Property .
    x:partOf rdf:type rdf:Property .
    x:hasName rdf:type rdf:Property .
But... the principles again: 1. Use URIs as names for things. 2. Use HTTP URIs so that people can look up those names. 3. When someone looks up a URI, provide useful information. 4. Include links to other URIs, so that they can discover more things. Existing vocabularies and resource URIs are likely to be well-linked already: reuse is good!
Example: Better Linkage
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix x: <http://purl.org/stuff/astro#> .
    @prefix astro: <http://archive.astro.umd.edu/ont/astronomy.owl#> .
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix dolce: <http://www.loa-cnr.it/ontologies/DOLCE-Lite.owl#> .
    @prefix dbpedia: <http://dbpedia.org/resource/> .
    dbpedia:Delta_Ursae_Majoris rdf:type x:A3_Dwarf ;
        foaf:name "Megrez" ;
        x:magnitude "3.1" ;
        dolce:part-of dbpedia:Ursa_Major ;
        rdfs:seeAlso <http://www.dcs.gla.ac.uk/workshops/semast09> .
    dbpedia:Ursa_Major rdf:type astro:Constellation .
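One thing the vocabulary buys you is inference: because x:A3_Dwarf is declared a subclass of x:Star, anything typed as an A3 dwarf is also a star. The sketch below applies that single RDFS rule over tuples of prefixed names. `infer_types` is an illustrative helper rather than a full RDFS reasoner, and the Megrez triple comes from the First Pass example.

```python
def infer_types(triples):
    """Apply 'x type C, C subClassOf D => x type D' until nothing new appears."""
    triples = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_of = {(s, o) for s, p, o in triples if p == "rdfs:subClassOf"}
        for s, p, o in list(triples):
            if p != "rdf:type":
                continue
            for sub, sup in subclass_of:
                inferred = (s, "rdf:type", sup)
                if sub == o and inferred not in triples:
                    triples.add(inferred)
                    changed = True
    return triples


DATA = [
    ("x:Star", "rdf:type", "rdfs:Class"),
    ("x:A3_Dwarf", "rdfs:subClassOf", "x:Star"),
    ("x:Megrez", "rdf:type", "x:A3_Dwarf"),
]

closure = infer_types(DATA)
```

A query for all stars can then find Megrez even though the source data never says so directly.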
Mass-Publication Techniques. Generally either: bulk conversion, or live mapping from existing databases.
New Vocabularies: voiD (Vocabulary of Interlinked Datasets). Classes: Dataset, Linkset, TechnicalFeature. Properties: statItem, feature, subset, target, sparqlEndpoint, linkPredicate, exampleResource, vocabulary, subjectsTarget, objectsTarget, dataDump, uriLookupEndpoint, uriRegexPattern.
New Vocabularies: Scovo (Statistical Core Vocabulary)
Finishing Quotes: "It is important to look at it as an interconnection bus, and it hasn't started working in earnest until one person's data is being used by some other unplanned use. This unexpected reuse is the measure." - Tim Berners-Lee, re. State of the Semantic Web
Finishing Quotes: "Engineer for serendipity." - Roy T. Fielding, re. REST
Related Links etc.: http://hyperdata.org/astronomy/
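Bulk conversion can be as simple as a script that walks a table and prints one triple per cell. The sketch below turns CSV rows into N-Triples lines using Python's standard `csv` module. The property namespace is the `x:` vocabulary from the earlier examples; the base URI for minting subject URIs and the sample CSV are hypothetical.

```python
import csv
import io
from urllib.parse import quote

ASTRO = "http://purl.org/stuff/astro#"  # the x: vocabulary used in the examples
BASE = "http://example.org/star/"       # hypothetical base for minted subject URIs

CSV_DATA = "name,magnitude\nMegrez,3.1\n"  # hypothetical source table


def literal(text):
    """Quote a string as an N-Triples plain literal (minimal escaping)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def rows_to_ntriples(csv_text):
    """One triple per non-key cell; the 'name' column also mints the subject URI."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row["name"]
        subject = f"<{BASE}{quote(name)}>"
        lines.append(f"{subject} <{ASTRO}hasName> {literal(name)} .")
        lines.append(f"{subject} <{ASTRO}magnitude> {literal(row['magnitude'])} .")
    return lines


ntriples = rows_to_ntriples(CSV_DATA)
```

Live mapping does the same translation on demand, per request, instead of producing a dump up front.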
