Intertwingularity, Semantic Web and linked Geo data


A talk given at a GeoNovum workshop in the Netherlands,

Published in: Education

Transcript of "Intertwingularity, Semantic Web and linked Geo data"

  1. Dan Brickley <danbri@few.vu.nl>
     ‘Semantic Web and linked Geo data’
     Geonovum workshop, Wageningen, 2010-10-12
     Tuesday, 2 November 2010
  2. Overview
     • historical origins of the Semantic Web initiative
     • example of SPARQL querying ‘Linked Data’
     • some conclusions and suggestions
     A brief introduction to Semantic Web data sharing, focussing on underlying principles.
  3. Part 1: RDF & history
  13. Part 2: SemWeb today
      • lessons: no global consistency; Web pages that make claims; intertwingularity...
      • what does this mean for modern RDF tools?
      • how can we share and link data in the Web, in practice?
  14. over 24.7 billion triples
      over 436 million links between datasets
  17. USA, UK
  18. Linked Data guidelines
      1. Use URIs as names for things (e.g. schools!)
      2. Use HTTP URIs to allow people to get info.
      3. Publish useful info there (e.g. using RDF).
      4. Include links to other URIs in your data.
      see: http://www.w3.org/DesignIssues/LinkedData.html
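Guidelines 2 and 3 on slide 18 amount to asking an HTTP URI for data about the thing it names. The following is a minimal Python sketch of that request, assuming the server offers RDF/XML through the `Accept` header (the slides mention RDF/XML for this district URI, but whether the server still answers is not something the slides can confirm). The helper name `build_rdf_request` is hypothetical, and the request is only built, not sent.

```python
from urllib.request import Request

# Media type for RDF/XML, the serialisation the slides mention for this URI.
RDF_XML = "application/rdf+xml"


def build_rdf_request(uri: str, media_type: str = RDF_XML) -> Request:
    """Build (but do not send) an HTTP GET request asking for RDF about `uri`."""
    if not uri.startswith(("http://", "https://")):
        # Guideline 2: names should be HTTP URIs so that they can be looked up.
        raise ValueError("Linked Data names should be HTTP URIs")
    return Request(uri, headers={"Accept": media_type})


# District URI taken from the slides (the BANES local authority district).
district = "http://statistics.data.gov.uk/id/local-authority-district/00HA"
request = build_rdf_request(district)
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the lookup; that step is left out here so the sketch runs without network access.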
  19. RDF/SPARQL example
      “Q: Which schools in the BANES area have a nursery?”
      prefix sch-ont: <http://education.data.gov.uk/def/school/>
      prefix xsd: <http://www.w3.org/2001/XMLSchema#>
      SELECT ?name WHERE {
        ?school a sch-ont:School;
          sch-ont:establishmentName ?name;
          sch-ont:districtAdministrative <http://statistics.data.gov.uk/id/local-authority-district/00HA> ;
          sch-ont:nurseryProvision "true"^^xsd:boolean
      }
      ORDER BY ?name
      examples by Leigh Dodds, Talis: http://blogs.talis.com/n2/archives/818
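To make the graph pattern on slide 19 concrete, here is a rough Python sketch that evaluates the same conditions over an in-memory set of triples. It is not how a SPARQL engine works internally, only an illustration of what the pattern asks for. The property URIs come from the query on the slide; the school URIs, names and values in `TRIPLES` are invented sample data, and the function names are hypothetical.

```python
# Vocabulary terms taken from the query on the slide.
SCH = "http://education.data.gov.uk/def/school/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BANES = "http://statistics.data.gov.uk/id/local-authority-district/00HA"
OTHER_DISTRICT = "http://example.org/district/other"  # invented for the sample

# Hypothetical sample data: every school URI, name and value below is invented.
TRIPLES = {
    ("http://example.org/school/1", RDF_TYPE, SCH + "School"),
    ("http://example.org/school/1", SCH + "establishmentName", "Example Primary School"),
    ("http://example.org/school/1", SCH + "districtAdministrative", BANES),
    ("http://example.org/school/1", SCH + "nurseryProvision", True),
    ("http://example.org/school/2", RDF_TYPE, SCH + "School"),
    ("http://example.org/school/2", SCH + "establishmentName", "Another Example School"),
    ("http://example.org/school/2", SCH + "districtAdministrative", BANES),
    ("http://example.org/school/2", SCH + "nurseryProvision", False),
    ("http://example.org/school/3", RDF_TYPE, SCH + "School"),
    ("http://example.org/school/3", SCH + "establishmentName", "Out Of Area School"),
    ("http://example.org/school/3", SCH + "districtAdministrative", OTHER_DISTRICT),
    ("http://example.org/school/3", SCH + "nurseryProvision", True),
    ("http://example.org/school/4", RDF_TYPE, SCH + "School"),
    ("http://example.org/school/4", SCH + "establishmentName", "Beacon Example Infants"),
    ("http://example.org/school/4", SCH + "districtAdministrative", BANES),
    ("http://example.org/school/4", SCH + "nurseryProvision", True),
}


def objects(triples, subject, predicate):
    """Return every object o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def schools_with_nursery(triples, district):
    """Mirror the slide's pattern: a School, in `district`, with nursery provision."""
    names = []
    schools = {s for s, p, o in triples if p == RDF_TYPE and o == SCH + "School"}
    for school in schools:
        if district not in objects(triples, school, SCH + "districtAdministrative"):
            continue
        if True not in objects(triples, school, SCH + "nurseryProvision"):
            continue
        names.extend(objects(triples, school, SCH + "establishmentName"))
    return sorted(names)  # plays the role of ORDER BY ?name


for name in schools_with_nursery(TRIPLES, BANES):
    print(name)
```

Each `if` corresponds to one line of the `WHERE` clause: a school only contributes a name when all conditions hold for the same `?school` node.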
  20. In RDF “nodes and arcs”:
  21. Answer:
      Fosse Way School, Fosseway Infant School, Keynsham Primary School, King Edward's School, Midsomer Norton Primary School, Monkton Prep School, Peasedown St John Primary School, Royal High School, Southdown Community Infant School, St Andrew's CofE Primary School, St Keyna Primary School, St Martin's Garden Primary School, St Saviour's CofE Infant School, The Paragon School, Junior School of Prior Park College, Trinity Coe VC Primary, Twerton Infant School...
      (according to the SPARQL RDF database at http://services.data.gov.uk/education/sparql )
  22. RDF/XML at http://statistics.data.gov.uk/id/local-authority-district/00HA ...
  23. More SPARQL-able queries from UK linked data:
      • Select the name, lowest and highest age ranges, capacity and pupil:teacher ratio for all schools in the Bath & North East Somerset district.
      • What is the URI, name, and opening date of the oldest school in the UK?
      • Select the name, easting and northing for the 100 newest schools in the UK.
      • Select the URI, name, and the reason for closing for all schools that are currently scheduled for closure. The reason is a URI from a controlled vocabulary in the ontology.
      • In which parliamentary constituencies did schools open in 2008?
      examples by Leigh Dodds, Talis: http://blogs.talis.com/n2/archives/818
  24. Lessons from part 1
      • no global consistency: RDF and SPARQL allow for contradictory, competing data
      • semantics: RDF/XML, RDFa, GRDDL - several ways to get RDF statements from a document; several publishing models for RDF in your Web site.
      • intertwingularity: “the interconnectedness of all things” as an engineering problem...
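The "no global consistency" lesson on slide 24 can be illustrated by keeping track of who said what instead of forcing a single answer. The Python sketch below stores each claim together with its source and reports disagreement without resolving it. The dataset URIs, property names and capacity figures are all invented for illustration, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical claims as (source, subject, predicate, object); all values invented.
CLAIMS = [
    ("http://example.org/dataset/a", "http://example.org/school/1", "capacity", 210),
    ("http://example.org/dataset/b", "http://example.org/school/1", "capacity", 240),
    ("http://example.org/dataset/a", "http://example.org/school/1", "name", "Example Primary School"),
]


def claims_about(claims, subject, predicate):
    """Group the values asserted for (subject, predicate) by the source asserting them."""
    by_source = defaultdict(set)
    for source, s, p, o in claims:
        if s == subject and p == predicate:
            by_source[source].add(o)
    return dict(by_source)


def is_contested(claims, subject, predicate):
    """True when more than one distinct value is asserted across all sources."""
    values = set()
    for per_source in claims_about(claims, subject, predicate).values():
        values |= per_source
    return len(values) > 1


school = "http://example.org/school/1"
print(claims_about(CLAIMS, school, "capacity"))
print("capacity contested:", is_contested(CLAIMS, school, "capacity"))
print("name contested:", is_contested(CLAIMS, school, "name"))
```

Both capacity values stay in the data; deciding which source to trust is left to whoever consumes it.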
  25. ‘Scope creep’
      • “intertwingularity” is a silly name for a serious problem: scope creep
      • Schema designers are under constant pressure to change, add, improve their designs. Problems are not tidily packaged.
      • RDF is built to survive this: independent schemas and datasets can be freely mixed together, without always ‘asking permission’.
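The last point on slide 25, that independent schemas and datasets can be freely mixed, follows from RDF graphs being sets of triples: merging is a set union. A small Python sketch, assuming two hypothetical datasets with invented vocabularies (`edu#`, `geo#`) and invented values that happen to describe the same URI:

```python
# Hypothetical school URI shared by both datasets.
SCHOOL = "http://example.org/school/1"

# Dataset 1: an invented education vocabulary.
education_data = {
    (SCHOOL, "http://example.org/edu#name", "Example Primary School"),
    (SCHOOL, "http://example.org/edu#nursery", True),
}

# Dataset 2: an invented geo vocabulary; the coordinates are made up.
geo_data = {
    (SCHOOL, "http://example.org/geo#easting", 375000),
    (SCHOOL, "http://example.org/geo#northing", 165000),
}


def merge(*graphs):
    """Merge graphs by set union; neither schema needs to know about the other."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(triples, subject):
    """Collect every property of `subject` as {predicate: set of objects}."""
    description = {}
    for s, p, o in triples:
        if s == subject:
            description.setdefault(p, set()).add(o)
    return description


merged = merge(education_data, geo_data)
for predicate, values in sorted(describe(merged, SCHOOL).items()):
    print(predicate, values)
```

Because both datasets use the same URI for the school, the merged description carries properties from both vocabularies without either schema having been changed.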
  26. In practice
      • Each school could have an HTML/RDFa page (or RDF/XML too)
      • Datasets that distinguish institution from location might publish one set of RDF; others that flatten these aspects together can do likewise with their data.
      • Cross-dataset consistency comes later, if at all.
  27. Problems don't come nicely scoped and packaged into cleanly distinct domains. Whenever you try to solve one problem, it borders on a dozen others that are a higher priority for people elsewhere. You think you're working with 'events' data but find yourself with information describing musicians; you think you're describing musicians, but find yourself describing digital images; you think you're describing digital images, but find yourself describing geographic locations; you think you're building a database of geographic locations, and find yourself modeling the opening hours of the businesses based at those locations. To a poet or idealist, these interconnections might be beautiful or inspiring; to a project manager or product manager, they are as likely to be terrifying.

      By dropping in identifiers that link to a big pile of other people's data, we can hopefully make it easier to keep projects nicely scoped without needlessly restricting future functionality. An events database can remain an events database, but use identifiers for artists and performers, making it possible to filter events by properties of those participants. A database of places can be only a link or two away from records describing the opening hours or business offerings of the things at those places.
  28. “Pay as you go” integration
      • there is no single “right” ontology
      • data can be mixed and merged ad-hoc
      • relations like owl:sameAs, skos:closeMatch can be used to interlink datasets later
      • common models emerge from bottom up, “pave the cowpaths...” *
      * analogy by Richard Cyganiak
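Slide 28's idea of interlinking later can be sketched as follows: two datasets are published with their own identifiers, and an `owl:sameAs` link added afterwards lets them be queried as one. This Python sketch is one possible approach (grouping URIs with a small union-find and rewriting them to a representative), not something the slides prescribe. All URIs except the `owl:sameAs` property are invented, as are the function names. Only `owl:sameAs` is treated as identity here; `skos:closeMatch` is a weaker relation and is deliberately not merged.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Two hypothetical datasets that name the same school differently.
dataset_a = {("http://example.org/a/school/1", "name", "Example Primary School")}
dataset_b = {("http://example.org/b/place/77", "openingHours", "08:45-15:15")}

# A link published later, possibly by a third party.
links = {("http://example.org/a/school/1", OWL_SAME_AS, "http://example.org/b/place/77")}


def canonical_map(triples):
    """Map each linked URI to one representative of its owl:sameAs group."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == OWL_SAME_AS:
            root_s, root_o = find(s), find(o)
            if root_s != root_o:
                # Use the lexicographically smaller URI as the representative.
                parent[max(root_s, root_o)] = min(root_s, root_o)
    return {uri: find(uri) for uri in list(parent)}


def smush(triples):
    """Rewrite subjects and objects to their representatives and drop the links."""
    canon = canonical_map(triples)
    return {
        (canon.get(s, s), p, canon.get(o, o))
        for s, p, o in triples
        if p != OWL_SAME_AS
    }


before = smush(dataset_a | dataset_b)
after = smush(dataset_a | dataset_b | links)
print("subjects before linking:", sorted({s for s, _, _ in before}))
print("subjects after linking:", sorted({s for s, _, _ in after}))
```

Neither dataset had to change: the link is extra data, which is what makes the integration "pay as you go".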
  29. Geo questions
      • Can GML, KML etc. be handled in RDF?
        • yes, either as links, textual ‘islands’ or some RDF systems have extensions to support spatial queries within SPARQL.
      • Which geo-related ontology to use?
        • several exist, simple and complex. It depends.
      • Is it better to use a common ontology, or capture our data exactly in a custom one?
        • you can do both and let others decide.
  30. Suggestions
      • Build a Linked Data test-bed with several datasets whose coverage overlaps in scope
      • each dataset initially mapped to its own RDF
      • experiment with finding common models, schemas/ontologies, and shared identifiers
      • evaluate against use cases expressed as SPARQL queries
  31. Conclusions
      • The Semantic Web project applies Web ideas to data sharing.
      • Linked RDF datasets have different emphasis (e.g. geo, schools, politics, events), accuracy and focus.
      • Treated properly this is a strength, as it allows the Web of data to grow organically without central control.
      • Location-related data is a natural ‘hub’, often mixed with non-geo data. RDF and SPARQL offer Web standards for sharing and querying such mixed data, allowing for decentralised schemas.
  32. Questions?
      Credits: original NeXT browser, see http://en.wikipedia.org/wiki/WorldWideWeb
      Images: Tim Berners-Lee, Richard Cyganiak, Anja Jentzsch