Introduction to linked data and the semantic web

Talk on linked data for the BCS Data Management Specialist Group

Presentation Transcript

  • Linked data and its role in the semantic web
    • Dave Reynolds, Epimorphics Ltd, @der42
  • Roadmap (image: Leo Oosterloo @ flickr.com)
    • What is linked data?
    • Examples
    • Modelling
    • Access
    • Strengths and weaknesses
    • Other topics
  • Linked data intro
    • Linked data ...
    • publishing data on the web ...
    • ... to enable integration, linking and reuse across silos
  • Can’t we just publish data as files?
    • pdf
      • easy to read and publish 
    • Excel
      • allows further processing and analysis 
    • csv
      • processing without need for proprietary tools 
    • But ...
      • structure of data not explained
      • no connection between different data sets, silos
      • static and fixed – can’t retrieve just slices relevant to problem
  • Linked data
    • Apply the principles of the web to publication of data
    • The web :
      • is a global network of pages
      • each identified by a URL
      • fetching a URL gives a document
      • pages connected by links
      • open, anyone can say anything about anything else
  • Linked data
    • Apply the principles of the web to publication of data
    • The linked data web :
      • is a global network of things
      • each identified by a URI
      • fetching a URI gives a set of statements
      • things connected by typed links
      • open, anyone can say anything about anything else
    • Linked data is “data you can click on”
     
  • Example schools information http://education.data.gov.uk/id/school/401874
  • Example schools information (literal values)
    • http://education.data.gov.uk/id/school/401874
      • label: “Cardiff High School”
      • phase: “Secondary”
      • district: “Cardiff”
  • Example schools information (values replaced by links)
    • http://education.data.gov.uk/id/school/401874
      • label: “Cardiff High School”
      • phase: school:PhaseOfEducation_Secondary
      • district: http://statistics.data.gov.uk/id/local-authority-district/00PT
    • http://statistics.data.gov.uk/id/local-authority-district/00PT
      • label: “Cardiff”
      • same as: http://data.ordnancesurvey.co.uk/id/7000000000025484
    • http://data.ordnancesurvey.co.uk/id/7000000000025484
      • label
      • contains ward
      • contains parish
      • extent: GML: 310499.4 184176.6 310476.5 ...
  • Linked data principles
    • Use URIs as names for things
    • Use HTTP URIs so that people can look up those names
    • When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
    • Include links to other URIs, so that they can discover more things
    Pattern of application of semantic web stack
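The second and third principles can be sketched in Python as a content-negotiated lookup. This is a minimal sketch, not a definitive client: the helper name `build_lookup` and the choice of `text/turtle` as the requested media type are assumptions made for illustration, and the request is only constructed here, not sent.

```python
from urllib.request import Request


def build_lookup(uri, media_type="text/turtle"):
    """Build (but do not send) an HTTP GET asking for RDF statements about `uri`.

    `build_lookup` is a hypothetical helper, not part of any library.
    """
    # The Accept header asks the server for a machine-readable representation
    # instead of an HTML page.
    return Request(uri, headers={"Accept": media_type}, method="GET")


req = build_lookup("http://education.data.gov.uk/id/school/401874")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual lookup; whether a given server honours the header depends on how it has been deployed.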
  • Linked open data cloud: 2007 Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
  • Linked open data cloud: 2009 Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
  • Linked open data cloud: 2010 Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
  • Data.gov.uk – linked datasets and APIs
  • Data.gov.uk visualizations on top of linked data
  • Ordnance survey
  • Environment agency - data, API, visualizations
  • BBC – integration and site design
  • E-commerce and rich snippets
    • Overstock.com
    • Peek-cloppenburg.de
  • Internal use
  • Open?
    • Linked open data = linked data + open data
  • Modelling
  • Modelling Thing, entity, concept ... resource
    • resource being described
      • abstract concept
      • real world thing
      • data item, particular measurement
      • document
    • identify by URI
    • provide information making statements about those resources
    • identifier NOT a container c.f. UML
      • open schema
      • critical to open extensibility and integration
      • similar to Entity-Attribute-Value modelling
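The "identifier, not container" point can be sketched in a few lines of Python. This is an illustrative sketch under assumptions: statements are modelled as plain tuples, and the short property names `label` and `district` are placeholders, not real vocabulary terms.

```python
# Statements are (subject, property, value) tuples; a dataset is just a set of them.
SCHOOL = "http://education.data.gov.uk/id/school/401874"
DISTRICT = "http://statistics.data.gov.uk/id/local-authority-district/00PT"

# Two independent publishers say things about the same identifiers.
source_a = {(SCHOOL, "label", "Cardiff High School")}
source_b = {(SCHOOL, "district", DISTRICT), (DISTRICT, "label", "Cardiff")}

# There is no container to extend and no schema to migrate:
# integrating the two sources is a set union.
merged = source_a | source_b


def describe(graph, resource):
    """Collect every statement made about one resource, whatever its source."""
    return sorted((p, o) for s, p, o in graph if s == resource)


print(describe(merged, SCHOOL))
```

Because the URI only names the school and owns no fixed set of fields, a third source could add further properties later without touching either of the first two.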
  • Modelling – RDF – Resource Description Framework
    • Statement, triple, logical assertion
    • Subject | Predicate | Object
  • Modelling – RDF
    • Statement, triple, logical assertion
    • Subject | Predicate | Object
      • some school | has a name/label | some literal
  • Modelling – RDF
    • Statement, triple, logical assertion
    • Subject | Predicate | Object
      • http://education.data.gov.uk/id/school/401874 | has a name/label | “Cardiff High School”
  • Modelling – RDF
    • Statement, triple, logical assertion
    • Subject | Predicate | Object
      • http://education.data.gov.uk/id/school/401874 | http://www.w3.org/2000/01/rdf-schema#label | “Cardiff High School”
  • Modelling – RDF
    • Statement, triple, logical assertion
    • where
      • school: = http://education.data.gov.uk/id/school/
      • rdfs: = http://www.w3.org/2000/01/rdf-schema#
    • Subject | Predicate | Object
      • school:401874 | rdfs:label | “Cardiff High School”
  • Modelling – RDF
    • Statement, triple, logical assertion
    • Subject | Predicate | Object
      • school:401874 | rdfs:label | “Cardiff High School”
      • school:401874 | ont:districtAdministrative | la:00PT
      • la:00PT | rdfs:label | “Cardiff”
  • Modelling – RDF
    • Statement, triple, logical assertion
    • The same three statements drawn as a graph: school:401874 and la:00PT are nodes, ont:districtAdministrative links them, and each has an rdfs:label
    • Subject | Predicate | Object
      • school:401874 | rdfs:label | “Cardiff High School”
      • school:401874 | ont:districtAdministrative | la:00PT
      • la:00PT | rdfs:label | “Cardiff”
  • Modelling – RDF
    • Statement, triple, logical assertion
    • Subject | Predicate | Object
      • school:401874 | rdfs:label | “Cardiff High School”
      • school:401874 | ont:districtAdministrative | la:00PT
      • la:00PT | rdfs:label | “Cardiff”
      • la:00PT | rdfs:label | “Caerdydd”@cy
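The triples in these slides can be written down directly as data. The following Python is a minimal sketch, assuming a simple tuple representation: the helper names `expand` and `labels` are made up for illustration, and literals are modelled as `(text, language)` pairs, which is a simplification of RDF literals.

```python
# Prefix table using the namespaces shown in the slides.
PREFIXES = {
    "school": "http://education.data.gov.uk/id/school/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "ont": "http://education.data.gov.uk/def/school/",
    "la": "http://statistics.data.gov.uk/id/local-authority-district/",
}


def expand(name):
    """Expand a prefixed name such as 'rdfs:label' into a full URI."""
    prefix, local = name.split(":", 1)
    return PREFIXES[prefix] + local


# Literals are (text, language) pairs; language None means no language tag.
TRIPLES = [
    (expand("school:401874"), expand("rdfs:label"), ("Cardiff High School", None)),
    (expand("school:401874"), expand("ont:districtAdministrative"), expand("la:00PT")),
    (expand("la:00PT"), expand("rdfs:label"), ("Cardiff", None)),
    (expand("la:00PT"), expand("rdfs:label"), ("Caerdydd", "cy")),
]


def labels(subject, language=None):
    """Return the rdfs:label texts of `subject` carrying the given language tag."""
    return [o[0] for s, p, o in TRIPLES
            if s == subject and p == expand("rdfs:label") and o[1] == language]


print(labels(expand("la:00PT")))
print(labels(expand("la:00PT"), "cy"))
```

The same subject and predicate can appear with several objects, which is how the Welsh label sits alongside the untagged one.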
  • RDF Syntaxes
    • RDF/XML
      • normative
    • Turtle
      • more human readable/writable
      • being standardized
    • RDFa
      • embed in (X)HTML
    • [others omitted]
  • Modelling – RDF RDF/XML syntax
      <rdf:RDF
          xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns:ont="http://education.data.gov.uk/def/school/"
          xmlns:la="http://statistics.data.gov.uk/id/local-authority-district/"
          xmlns:school="http://education.data.gov.uk/id/school/"
          xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
        <rdf:Description rdf:about="http://education.data.gov.uk/id/school/401874">
          <rdfs:label>Cardiff High School</rdfs:label>
          <ont:districtAdministrative>
            <rdf:Description rdf:about="http://statistics.data.gov.uk/id/local-authority-district/00PT">
              <rdfs:label>Cardiff</rdfs:label>
            </rdf:Description>
          </ont:districtAdministrative>
        </rdf:Description>
      </rdf:RDF>
    • Subject | Predicate | Object
      • school:401874 | rdfs:label | “Cardiff High School”
      • school:401874 | ont:districtAdministrative | la:00PT
      • la:00PT | rdfs:label | “Cardiff”
  • Modelling – RDF Turtle syntax
      @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
      @prefix school: <http://education.data.gov.uk/id/school/> .
      @prefix ont: <http://education.data.gov.uk/def/school/> .
      @prefix la: <http://statistics.data.gov.uk/id/local-authority-district/> .
      school:401874
          rdfs:label "Cardiff High School" ;
          ont:districtAdministrative la:00PT .
      la:00PT rdfs:label "Cardiff" .
    • Subject | Predicate | Object
      • school:401874 | rdfs:label | “Cardiff High School”
      • school:401874 | ont:districtAdministrative | la:00PT
      • la:00PT | rdfs:label | “Cardiff”
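To make the link between syntax and triples concrete, the RDF/XML example can be walked with Python's standard XML parser. This is a rough sketch and not a conforming RDF/XML parser: it assumes only the constructs used in the slide (`rdf:about`, nested `rdf:Description` elements and plain text literals), and the function name `triples_from` is made up for illustration.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """\
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:ont="http://education.data.gov.uk/def/school/"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://education.data.gov.uk/id/school/401874">
    <rdfs:label>Cardiff High School</rdfs:label>
    <ont:districtAdministrative>
      <rdf:Description rdf:about="http://statistics.data.gov.uk/id/local-authority-district/00PT">
        <rdfs:label>Cardiff</rdfs:label>
      </rdf:Description>
    </ont:districtAdministrative>
  </rdf:Description>
</rdf:RDF>
"""


def triples_from(description):
    """Yield (subject, predicate, object) for one rdf:Description element."""
    subject = description.get(f"{{{RDF}}}about")
    for prop in description:
        # ElementTree writes tags as '{namespace}local'; the property URI
        # is the namespace and the local name concatenated.
        predicate = prop.tag[1:].replace("}", "", 1)
        nested = prop.find(f"{{{RDF}}}Description")
        if nested is None:
            yield (subject, predicate, prop.text)
        else:
            # A nested description is the object, and carries its own statements.
            yield (subject, predicate, nested.get(f"{{{RDF}}}about"))
            yield from triples_from(nested)


root = ET.fromstring(RDF_XML)
triples = [t for d in root.findall(f"{{{RDF}}}Description") for t in triples_from(d)]
for t in triples:
    print(t)
```

The three tuples printed correspond to the three rows of the subject, predicate, object table; the Turtle form encodes exactly the same statements.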
  • Modelling Vocabularies
    • so far no actual models, let alone semantics
    • want to define
      • types of thing : Class
      • what you can say about them : Property
    • encode definitions in more RDF and publish at the corresponding URIs
      • link from data to data model
      • reuse published vocabularies to enable integration
      • freely combine different vocabularies or new ones
  • Modelling – vocabularies Logical modelling
    • modelling the domain, not a particular data structure
      • what exists
      • what is asserted? what can you deduce from that?
      • not about constraints as such
      • monotonic, open world
    • controlled vocabulary → taxonomy → thesaurus → ontology → Ontology
  • Modelling – vocabularies
    • unfamiliar terminology but related to
      • information architecture and conceptual modelling
      • domain-driven design
      • ... and yes knowledge representation
  • Modelling – RDFS RDF vocabulary description language
    • classes, types and type hierarchy
      • ont:School rdf:type rdfs:Class
      • ont:School rdfs:label “School”
  • Modelling – RDFS RDF vocabulary description language
    • classes, types and type hierarchy
      • ont:School rdf:type rdfs:Class
      • ont:School rdfs:label “School”
      • ont:WelshEstablishment rdf:type rdfs:Class
      • ont:WelshEstablishment rdfs:subClassOf ont:School
  • Modelling – RDFS RDF vocabulary description language
    • classes, types and type hierarchy
      • ont:School rdf:type rdfs:Class
      • ont:School rdfs:label “School”
      • ont:WelshEstablishment rdf:type rdfs:Class
      • ont:WelshEstablishment rdfs:subClassOf ont:School
      • school:401874 rdf:type ont:WelshEstablishment
  • Modelling – RDFS RDF vocabulary description language
    • classes, types and type hierarchy
      • ont:School rdf:type rdfs:Class
      • ont:School rdfs:label “School”
      • ont:WelshEstablishment rdf:type rdfs:Class
      • ont:WelshEstablishment rdfs:subClassOf ont:School
      • school:401874 rdf:type ont:WelshEstablishment
      • inferred: school:401874 rdf:type ont:School
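The step from the stated type to the broader type is the standard RDFS subclass rule, which can be sketched in Python as a small fixed-point loop. This is a minimal illustration, not a production reasoner: the function name `infer_types` is hypothetical and prefixed names are kept as plain strings.

```python
def infer_types(triples):
    """Add the rdf:type triples implied by rdfs:subClassOf, repeating until stable."""
    graph = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(graph):
            if p != "rdf:type":
                continue
            for cls, q, parent in list(graph):
                # If s is an instance of cls and cls is a subclass of parent,
                # then s is also an instance of parent.
                if q == "rdfs:subClassOf" and cls == o:
                    new = (s, "rdf:type", parent)
                    if new not in graph:
                        graph.add(new)
                        changed = True
    return graph


stated = [
    ("ont:School", "rdf:type", "rdfs:Class"),
    ("ont:WelshEstablishment", "rdf:type", "rdfs:Class"),
    ("ont:WelshEstablishment", "rdfs:subClassOf", "ont:School"),
    ("school:401874", "rdf:type", "ont:WelshEstablishment"),
]

inferred = infer_types(stated) - set(stated)
print(inferred)
```

Looping until nothing changes means longer chains of subclasses are followed as well, not just a single step.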
  • Modelling – RDFS RDF vocabulary description language
    • properties, property hierarchy
      • ont:headOf rdf:type rdf:Property
      • ont:headOf rdfs:subPropertyOf ont:staffAt
      • person:JoeBloggs ont:headOf school:401874
      • inferred: person:JoeBloggs ont:staffAt school:401874
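Property hierarchies work the same way as class hierarchies. The Python below is a small sketch of the sub-property rule; it assumes the diagram reads as "Joe Bloggs is head of the school", and the function name `infer_superproperties` is made up for illustration.

```python
def infer_superproperties(triples):
    """Add the statements implied by rdfs:subPropertyOf, repeating until stable."""
    graph = set(triples)
    sub_of = {(s, o) for s, p, o in graph if p == "rdfs:subPropertyOf"}
    changed = True
    while changed:
        changed = False
        for s, p, o in list(graph):
            for sub, sup in sub_of:
                # Anything related by the narrower property is also
                # related by the broader one.
                if p == sub and (s, sup, o) not in graph:
                    graph.add((s, sup, o))
                    changed = True
    return graph


stated = [
    ("ont:headOf", "rdf:type", "rdf:Property"),
    ("ont:headOf", "rdfs:subPropertyOf", "ont:staffAt"),
    # Assumed direction of the link in the slide's diagram.
    ("person:JoeBloggs", "ont:headOf", "school:401874"),
]

inferred = infer_superproperties(stated) - set(stated)
print(inferred)
```

A query for all staff at the school would then find the head teacher without the publisher having stated `ont:staffAt` explicitly.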
  • Modelling – RDFS RDF vocabulary description language
    • class/property relations
      • domain
      • range
    • Already have power to do some vocabulary mapping
      • declare classes or properties from different vocabularies to be equivalent:
      • A rdfs:subClassOf B
      • B rdfs:subClassOf A
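Both ideas on this slide can be sketched in Python. The sketch rests on labelled assumptions: the `rdfs:domain` and `rdfs:range` declarations for `ont:headOf`, and the `ex:` terms, are hypothetical and are not taken from the real schools vocabulary. Note that domain and range produce new type statements; they do not reject data.

```python
def apply_domain_range(triples):
    """Add the rdf:type triples implied by rdfs:domain and rdfs:range."""
    graph = set(triples)
    domains = [(s, o) for s, p, o in graph if p == "rdfs:domain"]
    ranges = [(s, o) for s, p, o in graph if p == "rdfs:range"]
    for s, p, o in list(graph):
        for prop, cls in domains:
            if p == prop:
                graph.add((s, "rdf:type", cls))  # the subject gets the domain class
        for prop, cls in ranges:
            if p == prop:
                graph.add((o, "rdf:type", cls))  # the object gets the range class
    return graph


def equivalent_classes(triples):
    """Return pairs of classes declared to be subclasses of each other."""
    sub = {(s, o) for s, p, o in triples if p == "rdfs:subClassOf"}
    return {frozenset((a, b)) for a, b in sub if (b, a) in sub and a != b}


data = [
    ("ont:headOf", "rdfs:domain", "ex:Person"),      # hypothetical declaration
    ("ont:headOf", "rdfs:range", "ont:School"),      # hypothetical declaration
    ("person:JoeBloggs", "ont:headOf", "school:401874"),
    ("ont:School", "rdfs:subClassOf", "ex:School"),  # hypothetical mapping
    ("ex:School", "rdfs:subClassOf", "ont:School"),
]

typed = apply_domain_range(data)
print(sorted(t for t in typed if t[1] == "rdf:type"))
print(equivalent_classes(data))
```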
  • Modelling – OWL
    • richer modelling and semantics
    • axioms on properties
      • transitive, symmetric, inverseOf, ...
      • functional, inverse functional
      • equivalent property
    • axioms on classes
      • intersection, union, disjoint, equivalent
    • restrictions on classes
      • some value from, all values from, cardinality, has value, one of, keys
    • axioms on individuals
      • same as, different from, all different
    • imports
  • Modelling – OWL
    • supports much richer modelling
    • consistency checking of model
    • consistency checking of data
      • some surprises if you are used to schema languages
      • open world, no unique name assumption
      • can extend to closed world checking
    • inference
      • classification
      • inferred relationships
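One of the surprises mentioned above can be sketched in Python. A schema language would reject two values for a single-valued field; with no unique name assumption, two values of a functional property instead imply that both names refer to the same thing, unless they are declared different. This is an illustrative sketch only: `check_functional` and the property `ont:headTeacher` are hypothetical, and the values are assumed to be URIs, not literals.

```python
def check_functional(triples, functional_props, different=frozenset()):
    """Return (inferred same-as pairs, inconsistent pairs) for functional properties."""
    values = {}
    for s, p, o in triples:
        if p in functional_props:
            values.setdefault((s, p), set()).add(o)
    same_as, clashes = set(), set()
    for objects in values.values():
        ordered = sorted(objects)
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                pair = frozenset((a, b))
                # Declared-different individuals cannot be merged: inconsistency.
                (clashes if pair in different else same_as).add(pair)
    return same_as, clashes


data = [
    ("school:401874", "ont:headTeacher", "person:JoeBloggs"),
    ("school:401874", "ont:headTeacher", "person:JBloggs"),
]
pair = frozenset({"person:JoeBloggs", "person:JBloggs"})

print(check_functional(data, {"ont:headTeacher"}))
print(check_functional(data, {"ont:headTeacher"}, different={pair}))
```

The first call reports an inferred identity and no error; only the second, where the two people are stated to be different, reports a clash.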
  • Modelling Spectrum of goals and styles
    • Lightweight vocabularies
      • simple modelling
      • just enough agreement to get useful work done
      • removing boundaries to enable information to be found and connected
      • global consistency not possible
      • a little semantics goes a long way
    • Rich ontological models
      • rich domain models
      • need expressivity
      • consistency is critical
      • make complex inferences you can rely on, across data you trust
      • knowledge is power
  • Modelling Ontology reuse
    • invest in complete ontology for a domain
      • rich but general model, may be modular inside
      • strong “ontological commitment”
      • e.g. medical ontologies
    • reuse small, common, vocabularies
      • FOAF, SIOC, Dublin Core, Org ...
      • pick and choose classes and properties you need
      • fill in a few missing links for your domain
    • generic reusable vocabularies
      • Data cube vocabulary
  • Accessing all this data
    • link following
      • HTTP GET, follow links, aggregate relevant statements
    • query
      • SPARQL
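Link following can be sketched in Python without touching the network. In this sketch the dictionary `WEB` stands in for HTTP GET, and its contents, the short property names, and the helper names `fetch` and `crawl` are all assumptions for illustration.

```python
from collections import deque

SCHOOL = "http://education.data.gov.uk/id/school/401874"
DISTRICT = "http://statistics.data.gov.uk/id/local-authority-district/00PT"

# Stand-in for the web: URI -> statements returned when that URI is fetched.
WEB = {
    SCHOOL: [(SCHOOL, "label", "Cardiff High School"),
             (SCHOOL, "district", DISTRICT)],
    DISTRICT: [(DISTRICT, "label", "Cardiff")],
}


def fetch(uri):
    """Pretend HTTP GET: return the statements published at `uri`."""
    return WEB.get(uri, [])


def crawl(start, max_hops=2):
    """Follow links breadth-first from `start`, aggregating statements."""
    seen, statements = {start}, set()
    queue = deque([(start, 0)])
    while queue:
        uri, hops = queue.popleft()
        for s, p, o in fetch(uri):
            statements.add((s, p, o))
            # Objects that look like HTTP URIs are links worth following.
            if o.startswith("http://") and o not in seen and hops < max_hops:
                seen.add(o)
                queue.append((o, hops + 1))
    return statements


for statement in sorted(crawl(SCHOOL)):
    print(statement)
```

The hop limit matters in practice, since an open web of links has no natural boundary.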
  • SPARQL
    • core idea is pattern matching
      • graph patterns with variables
      • any subgraph which matches yields row of bindings
    • syntax based on Turtle syntax for RDF
    • web API endpoints
    • example graph pattern
      • ?school ont:districtAdministrative [ rdfs:label "Cardiff" ]
    • lots of power
      • filters
      • optionals
      • named graphs
      • sub-queries
      • property chains
      • aggregation
      • federated query
      • update
      • construct
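The core idea of graph pattern matching can be sketched in Python. This is a toy illustration and not a SPARQL engine: the function names `match` and `query` are made up, terms are plain strings, and the blank node `[ ]` from the slide is replaced by an ordinary variable `?d`.

```python
def match(pattern, triple, bindings):
    """Try to extend `bindings` so that `pattern` matches `triple`; None on failure."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must bind consistently everywhere it appears.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(patterns, graph):
    """Return one dict of variable bindings per matching subgraph."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [b for sol in solutions for t in graph
                     if (b := match(pattern, t, sol)) is not None]
    return solutions


GRAPH = [
    ("school:401874", "rdfs:label", "Cardiff High School"),
    ("school:401874", "ont:districtAdministrative", "la:00PT"),
    ("la:00PT", "rdfs:label", "Cardiff"),
]

# Roughly corresponds to:
#   SELECT ?school WHERE { ?school ont:districtAdministrative [ rdfs:label "Cardiff" ] }
rows = query([("?school", "ont:districtAdministrative", "?d"),
              ("?d", "rdfs:label", "Cardiff")], GRAPH)
print(rows)
```

Each pattern narrows the set of candidate bindings, and every subgraph that satisfies all patterns contributes one row.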
  • Accessing all this data
    • link following
      • HTTP GET, follow links, aggregate relevant statements
    • query
      • SPARQL
    • linked data API
      • RESTful API onto linked data resources
      • simple query, usable without RDF stack, web dev friendly
      • easy to layer visualizations and UIs on top
    • third parties
      • search engines and aggregators e.g. Sindice, sameAs.org
  • Semantic web layer cake
  • Strengths and weaknesses (image: spcbrass @ flickr.com)
  • Strengths
    • data integration
      • use of global identifiers (URIs)
      • composable – statements v. containers, schemaless
      • linking, vocabulary mapping
    • extensible, incremental, decentralized, resilient
      • no global ontology/schema to develop or maintain
      • freely add terms from other vocabularies
      • open world assumption
    • modelling and data entwined
      • link data to models, data in context
      • use same technology to share, manage and extend models
    • supports inference and classification
    • rich access routes
      • web linking, download, query, web APIs
  • Weaknesses
    • complexity of the stack
      • alphabet soup – RDF, RDFS, OWL, SPARQL, RIF ...
      • unfamiliar “ontology”, “logical entailment”
      • lots of arcane details
      • RDF/XML syntax
    • performance of schema-less stores
      • optimization challenges
    • limited validation and constraints
    • cost of modelling, ontology development
    • no inbuilt notions of time, uncertainty
    • But ...
      • use the parts you need
      • tooling e.g. Linked Data API
      • core ideas not that complex
      • technology improving steadily
      • hybrid solutions
      • closed world checkers
      • ontology reuse
      • generic ontologies (data cube)
      • tools
      • model on top
  • Wrapping up (image: erika g. @ flickr.com)
  • Things we missed out
    • RDF nuances
      • blank nodes, containers and collections
      • named graphs
    • linked data nuances
      • URI for thing v. web page, content negotiation, httpRange-14
      • URI architecture
    • OWL nuances
      • OWL species, serializations, lots of details
    • Other technologies in the stack
      • SPARQL Update, rules (RIF), GRDDL, POWDER, GeoSPARQL, RDB mapping, triple/quad stores
    • Embedding structured data in markup
      • RDFa, microformats, microdata, schema.org and all that
  • Hot topics
    • Government linked data
      • identifiers to seed linked data
      • data publication
        • transparency, improving services, economic growth
    • structured data and search engines
      • rich snippets, structured results, SEO
      • search => question answering
    • user interfaces
      • visualization, exploration, exploiting linking
    • data as a service
  • fin.
  • Spare
  • Case study: Local government payments data
    • model
    • publish
    • use