Bigdive 2014 - RDF, principles and case studies

Published in: Technology, Education

  1. 1. RDF principles and case studies Diego Valerio Camarda regesta.exe www.regesta.com diego.camarda@regesta.com dvcama @ github&twitter
  2. 2. a brief introduction to linked open data
  3. 3. why things instead of documents: today's Web [figure: a cloud of HTML pages]
  4. 4. why things instead of documents: today's Web holds at least 1.85 billion indexed documents; some say 1 trillion online documents [figure: a cloud of HTML pages]
  5. 5. why things instead of documents: today's Web holds at least 1.85 billion indexed documents; some say 1 trillion online documents. Actually, the best HTML parser is still the HUMAN BRAIN [figure: a cloud of HTML pages]
  6. 6. why things instead of documents: today's Web is not the Web that Tim proposed in 1998
  7. 7. why things instead of documents: today's Web is not the Web that Tim proposed in 1998
  8. 8. why things instead of documents: today's Web is not the Web that Tim proposed in 1998
  9. 9. what about URIs and RDF: a new way to publish data on the web (ids are ambiguous and suck!)
      Linked data principles (Tim Berners-Lee, July 27, 2006):
      › Use URIs as names for things
      › Use HTTP URIs so that people can look up those names
      › When someone looks up a URI, provide useful information using the standards (RDF, SPARQL)
      › Include links to other URIs so that they can discover more things
  10. 10. what about URIs and RDF: turning web pages into “real” data (ids are ambiguous and suck!) HTTP://yourdomain.com/something
  11. 11. what about URIs and RDF: turning web pages into “real” data (ids are ambiguous and suck!)
  12. 12. It’s time for machines (to parse pages)
      “[…] l’animaletto venne indicato come: ‘il tasso del tasso del Tasso’” (Achille Campanile)
      In English: “[…] the little animal came to be referred to as ‘the badger of the yew of Tasso’”. In Italian, “tasso” means both badger and yew, and Tasso is the poet.
  13. 13. It’s time for machines (to parse pages)
      “[…] l’animaletto venne indicato come: ‘il tasso del tasso del Tasso’” (Achille Campanile)
      http://it.dbpedia.org/resource/Meles_meles (the badger)
      http://it.dbpedia.org/resource/Taxus (the yew)
      http://it.dbpedia.org/resource/Torquato_Tasso (the poet)
      http://it.dbpedia.org/resource/Achille_Campanile (author of the sentence)
  14. 14. A new way to design databases RDF (aka ’define knowledge’)
  15. 15. Go Triples, go! the standard (old) approach
      ID_P  COGNOME  NOME   REF_ID_SOCIETA  GENERE
      1     Camarda  Diego  1               maschio
      2     …        …      …               …

      ID_SOCIETA  DENOMINAZIONE    SITO
      1           Regesta.exe srl  www.regesta.com
  16. 16. Go Triples, go! the new (cool) approach
      <http://www.regesta.com/diego>    (Subject)
  17. 17. Go Triples, go! the new (cool) approach
      <http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/familyName>    (Subject, Predicate)
  18. 18. Go Triples, go! the new (cool) approach
      <http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/familyName> "Camarda" .    (Subject, Predicate, Object)
  19. 19. Go Triples, go! the new (cool) approach
      <http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/familyName> "Camarda" .
      <http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/firstName> "Diego" .
      <http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/gender> "male" .
  20. 20. Go Triples, go! the new (cool) approach
      <http://www.regesta.com/diego>
          <http://xmlns.com/foaf/0.1/familyName> "Camarda" ;
          <http://xmlns.com/foaf/0.1/firstName> "Diego" ;
          <http://xmlns.com/foaf/0.1/gender> "male" .
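
The move from one full statement per line to the ";" shorthand can be sketched in a few lines of Python. This is only an illustration, not a real serializer: the function names `to_ntriples` and `to_turtle` are made up here, every object is assumed to be a plain string literal, and no escaping is done.

```python
# Illustrative sketch: the three statements about <diego> as plain tuples.
FOAF = "http://xmlns.com/foaf/0.1/"
DIEGO = "http://www.regesta.com/diego"

triples = [
    (DIEGO, FOAF + "familyName", "Camarda"),
    (DIEGO, FOAF + "firstName", "Diego"),
    (DIEGO, FOAF + "gender", "male"),
]


def to_ntriples(triples):
    """One complete 'subject predicate object .' statement per line."""
    return "\n".join(f'<{s}> <{p}> "{o}" .' for s, p, o in triples)


def to_turtle(triples):
    """Group by subject: ';' keeps the same subject, '.' closes the group."""
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, []).append((p, o))
    blocks = []
    for s, pairs in by_subject.items():
        body = " ;\n".join(f'    <{p}> "{o}"' for p, o in pairs)
        blocks.append(f"<{s}>\n{body} .")
    return "\n\n".join(blocks)


print(to_ntriples(triples))
print()
print(to_turtle(triples))
```

Both outputs say exactly the same thing; the second form only avoids repeating the subject.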
  21. 21. Go Triples, go! ok, but what is a “diego”?
  22. 22. Go Triples, go! it’s a person! <http://www.regesta.com/diego> a <http://xmlns.com/foaf/0.1/Person> .
  23. 23. Go Triples, go! adding a Class
      <http://www.regesta.com/diego>
          <http://xmlns.com/foaf/0.1/familyName> "Camarda" ;
          <http://xmlns.com/foaf/0.1/firstName> "Diego" ;
          <http://xmlns.com/foaf/0.1/gender> "male" .
      <http://www.regesta.com/diego> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
  24. 24. Go Triples, go! building a graph
      <http://www.regesta.com/diego>
          <http://xmlns.com/foaf/0.1/familyName> "Camarda" ;
          <http://xmlns.com/foaf/0.1/firstName> "Diego" ;
          <http://xmlns.com/foaf/0.1/gender> "male" ;
          <http://www.w3.org/1999/...#type> <http://xmlns.com/foaf/0.1/Person> .
      <http://www.regesta.com/diego> <http://www.w3.org/ns/org#memberOf> <http://www.regesta.com/about> .
  25. 25. Go Triples, go! building a graph
      <http://www.regesta.com/diego>
          <http://xmlns.com/foaf/0.1/familyName> "Camarda" ;
          <http://xmlns.com/foaf/0.1/firstName> "Diego" ;
          <http://xmlns.com/foaf/0.1/gender> "male" ;
          <http://www.w3.org/1999/...#type> <http://xmlns.com/foaf/0.1/Person> ;
          <http://www.w3.org/ns/org#memberOf> <http://www.regesta.com/about> .
      <http://www.regesta.com/about> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/org#Organization> .
  26. 26. Go Triples, go! building a graph
      <http://www.regesta.com/diego>
          <http://xmlns.com/foaf/0.1/familyName> "Camarda" ;
          <http://xmlns.com/foaf/0.1/firstName> "Diego" ;
          <http://xmlns.com/foaf/0.1/gender> "male" ;
          <http://www.w3.org/1999/...#type> <http://xmlns.com/foaf/0.1/Person> ;
          <http://www.w3.org/ns/org#memberOf> <http://www.regesta.com/about> .
      <http://www.regesta.com/about> <http://www.w3.org/1999/...#type> <http://www.w3.org/ns/org#Organization> .
  27. 27. Go Triples, go! building a graph
      <http://www.regesta.com/diego>
          <http://xmlns.com/foaf/0.1/familyName> "Camarda" ;
          <http://xmlns.com/foaf/0.1/firstName> "Diego" ;
          <http://xmlns.com/foaf/0.1/gender> "male" ;
          <http://www.w3.org/1999/...#type> <http://xmlns.com/foaf/0.1/Person> ;
          <http://www.w3.org/ns/org#memberOf> <http://www.regesta.com/about> .
      <http://www.regesta.com/about>
          <http://www.w3.org/1999/...#type> <http://www.w3.org/ns/org#Organization> ;
          <http://www.w3.org/2004/02/skos/core#prefLabel> "Regesta.exe srl" ;
          <http://xmlns.com/foaf/0.1/homepage> <http://www.regesta.com> .
  28. 28. Go Triples, go! Objects could be Subjects diego
  29. 29. Go Triples, go! considering diego and regesta diego regesta
  30. 30. Go Triples, go! <diego> <memberOf> <regesta> diego regesta
  31. 31. Go Triples, go! but, <regesta> <locatedIn> <rome> diego regesta rome
  32. 32. Go Triples, go! <diego> <placeOfBirth> <rome> diego regesta rome
  33. 33. Go Triples, go! <rome> <parentADM> <italy> diego regesta rome italy
  34. 34. Go Triples, go! <silvia> <placeOfBirth> <italy> diego regesta silvia rome italy
  35. 35. Go Triples, go! <silvia> <…> <…> diego regesta silvia rome italy
  36. 36. Go Triples, go! <…> <…> <…> = a knowledge graph! diego regesta silvia rome italy
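
The "objects could be subjects" idea is what turns a pile of statements into a graph you can walk. A minimal sketch, assuming the short names from the slides stand in for full URIs and using a hypothetical helper called `reachable`:

```python
from collections import deque

# The statements drawn on the slides, with short names instead of URIs.
graph = {
    ("diego", "memberOf", "regesta"),
    ("regesta", "locatedIn", "rome"),
    ("diego", "placeOfBirth", "rome"),
    ("rome", "parentADM", "italy"),
    ("silvia", "placeOfBirth", "italy"),
}


def reachable(graph, start):
    """Follow subject -> object links breadth-first.

    Every object found becomes the subject of the next lookup.
    """
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for s, _p, o in graph:
            if s == node and o not in seen:
                seen.add(o)
                queue.append(o)
    return seen - {start}


print(sorted(reachable(graph, "diego")))   # ['italy', 'regesta', 'rome']
print(sorted(reachable(graph, "silvia")))  # ['italy']
```

Starting from diego you end up in italy without any statement that links the two directly.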
  37. 37. A lot of sentences to achieve (descriptive) freedom
      <http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/familyName> "Camarda" .
      <http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/firstName> "Diego" .
      <http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/gender> "male" .
      <http://www.regesta.com/diego> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
      <http://www.regesta.com/diego> <http://www.w3.org/ns/org#memberOf> <http://www.regesta.com> .
      <http://www.regesta.com/silvia> <http://xmlns.com/foaf/0.1/familyName> "Mazzini" .
      <http://www.regesta.com/silvia> <http://xmlns.com/foaf/0.1/firstName> "Silvia" .
      <http://www.regesta.com/silvia> <http://xmlns.com/foaf/0.1/gender> "female" .
      <http://www.regesta.com/silvia> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
      <http://www.regesta.com/silvia> <http://www.w3.org/ns/org#memberOf> <http://www.regesta.com> .
      <http://www.regesta.com> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/org#Organization> .
      <http://www.regesta.com> <http://www.w3.org/2004/02/skos/core#prefLabel> "Regesta.exe srl" .
      <http://www.regesta.com/silvia> <http://xmlns.com/foaf/0.1/knows> <http://www.regesta.com/diego> .
      <…> <…> <…>.
      <noBeer> <makeGoCreazy> <homer>. <noTv> <makeGoCreazy> <homer>. (repeated over and over on the slide) …
  38. 38. Standards for semantic web
  39. 39. It's time for a new standard. Did you study HTML? Good!
      RDF http://www.w3.org/standards/techs/rdf
      SPARQL http://www.w3.org/standards/techs/sparql
      ONTOLOGIES http://www.w3.org/standards/semanticweb/ontology
  40. 40. The Resource Description Framework is a general-purpose language for representing information in the Web. It's time for a new standard RDF
  41. 41. The SPARQL Protocol and RDF Query Language is a query language and protocol for RDF. It's time for a new standard SPARQL
  42. 42. On the Semantic Web, vocabularies define the concepts and relationships (also referred to as “terms”) used to describe and represent an area of concern. It's time for a new standard Ontologies
  43. 43. PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> foaf:firstName dc:title rdfs:label Pre:fixes (ontologies) just a few words
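
A prefix is only an abbreviation for the start of a URI. The sketch below shows the expansion and its reverse; the helper names `expand` and `shorten` are invented for this example and the prefix table is the one on the slide.

```python
# Prefix table taken from the slide.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(name, prefixes=PREFIXES):
    """Turn a prefixed name such as 'foaf:firstName' into a full URI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {name!r}")
    return prefixes[prefix] + local


def shorten(iri, prefixes=PREFIXES):
    """Turn a full URI back into a prefixed name when a prefix matches."""
    for prefix, namespace in prefixes.items():
        if iri.startswith(namespace):
            return f"{prefix}:{iri[len(namespace):]}"
    return f"<{iri}>"


print(expand("foaf:firstName"))  # http://xmlns.com/foaf/0.1/firstName
print(shorten("http://www.w3.org/2000/01/rdf-schema#label"))  # rdfs:label
```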
  44. 44. Browsing the web of data
  45. 45. Resource Description Framework › SPARQL endpoint › dereferenceable URIs › content negotiation › standard ports, like 80 (HTTP) › JSONP support MUST!
  46. 46. Resource Description Framework › SPARQL endpoint › dereferenceable URIs › content negotiation › standard ports, like 80 (HTTP) › JSONP support › up-to-date › the endpoint URL is easy to deduce from resources › the resources are described by dc:title or rdfs:label › the endpoint hosts a page for humans › the resources and the endpoint are on the same domain SHOULD! (please do it, for me)
  47. 47. One single API a world to explore
  48. 48. One single API interlinking <a href="…">click here</a> owl:sameAs rdfs:seeAlso …
  49. 49. SELECT * {?minnesota ?banana ?sun} SPARQL a must-know query language
  50. 50. SPARQL group graph pattern diego regesta silvia rome italy diego regesta silvia rome italy
  51. 51. SPARQL group graph pattern diego regesta rome silvia italy silvia italy
  52. 52. SELECT ?person { ?person <placeOfBirth> ?place ; <memberOf> ?company . ?company <locatedIn> ?place . } SPARQL group graph pattern <diego>
  53. 53. SELECT ?person ?prop ?obj { ?person <placeOfBirth> ?place ; <memberOf> ?company ; ?prop ?obj . ?company <locatedIn> ?place . } SPARQL group graph pattern (turn the page)
  54. 54. SPARQL group graph pattern
      person   prop             obj
      <diego>  rdf:type         foaf:Person
      <diego>  foaf:firstName   "Diego"
      <diego>  foaf:familyName  "Camarda"
      <diego>  foaf:gender      "male"
      <diego>  org:memberOf     <regesta>
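
What a group graph pattern does can be imitated with a toy matcher. This is not how a triplestore is implemented; it is a rough sketch in which `match` and `solve` are hypothetical helpers, terms starting with "?" are variables, and the data is the small graph drawn earlier with short names instead of URIs.

```python
graph = [
    ("diego", "memberOf", "regesta"),
    ("regesta", "locatedIn", "rome"),
    ("diego", "placeOfBirth", "rome"),
    ("rome", "parentADM", "italy"),
    ("silvia", "placeOfBirth", "italy"),
]


def match(pattern, triple, binding):
    """Unify one triple pattern with one triple.

    Returns the extended binding, or None when they do not fit.
    """
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was already bound to.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def solve(patterns, graph):
    """Keep only the bindings that satisfy every pattern in the group."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in graph:
                result = match(pattern, triple, binding)
                if result is not None:
                    extended.append(result)
        bindings = extended
    return bindings


# Same shape as the SELECT ?person query: born in the place
# where the person's company is located.
query = [
    ("?person", "placeOfBirth", "?place"),
    ("?person", "memberOf", "?company"),
    ("?company", "locatedIn", "?place"),
]
print([b["?person"] for b in solve(query, graph)])  # ['diego']
```

Silvia drops out at the second pattern because this toy graph has no memberOf statement for her.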
  55. 55. DESCRIBE <diego> SPARQL describe (turn the page)
  56. 56. SPARQL describe
      <diego> rdf:type foaf:Person .
      <diego> foaf:firstName "Diego" .
      <diego> foaf:familyName "Camarda" .
      <diego> foaf:gender "male" .
      <diego> org:memberOf <regesta> .
      <silvia> foaf:knows <diego> .
  57. 57. CONSTRUCT {<diego> foaf:donaldDuck ?c} WHERE{<diego> ?b ?c. } SPARQL construct (turn the page)
  58. 58. SPARQL construct
      <diego> foaf:donaldDuck foaf:Person .
      <diego> foaf:donaldDuck "Diego" .
      <diego> foaf:donaldDuck "Camarda" .
      <diego> foaf:donaldDuck "male" .
      <diego> foaf:donaldDuck <regesta> .
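
CONSTRUCT is a match followed by a template. A toy version of the donaldDuck query, assuming the same six statements returned by DESCRIBE and a made-up function name:

```python
data = [
    ("diego", "rdf:type", "foaf:Person"),
    ("diego", "foaf:firstName", "Diego"),
    ("diego", "foaf:familyName", "Camarda"),
    ("diego", "foaf:gender", "male"),
    ("diego", "org:memberOf", "regesta"),
    ("silvia", "foaf:knows", "diego"),
]


def construct_donald_duck(graph, subject="diego"):
    """WHERE { <diego> ?b ?c } selects the objects;
    the template re-emits each one under a single new predicate."""
    return [(subject, "foaf:donaldDuck", c) for s, _b, c in graph if s == subject]


for triple in construct_donald_duck(data):
    print(triple)
```

The statement about silvia is left out because diego is its object, not its subject.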
  59. 59. DISTINCT, COUNT GRAPH, PREFIX isBlank, isIRI, isLiteral, isNumeric FILTER, REGEX, STR FILTER NOT EXISTS, MINUS ORDER BY, OFFSET, LIMIT for other stuff http://www.w3.org/TR/sparql11-query/ SPARQL minimum requirements
  60. 60. Please start negotiating content right now!
      Client: “Hi dude, I accept: text/html,application/xhtml+xml” → Server: “Great! I’ll serve you a web page” (HTML page)
      Client: “Hi dude, I accept: application/rdf+xml” → Server: “Great… 303, redirect!” (RDF data)
      Client: “Hi dude, I accept: pizza/margherita” → Server: “mmm… sorry” (406 error)
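
The three conversations above boil down to a small decision on the server. The sketch below is one possible reading, not a reference implementation: the media-type sets, the `/data` redirect path and the function names are all assumptions made up for the example.

```python
# Hypothetical lists of what this imaginary server can produce.
HTML_TYPES = {"text/html", "application/xhtml+xml", "*/*"}
RDF_TYPES = {"application/rdf+xml", "text/turtle", "text/n3"}


def parse_accept(header):
    """Return the media types of an Accept header, highest q-value first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type, q = fields[0].lower(), 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        if media_type and q > 0:
            entries.append((-q, position, media_type))
    return [media_type for _q, _position, media_type in sorted(entries)]


def negotiate(accept_header, resource="/resource/thing"):
    """Return (status, headers) following the three cases on the slide."""
    for media_type in parse_accept(accept_header):
        if media_type in HTML_TYPES:
            return 200, {"Content-Type": "text/html"}
        if media_type in RDF_TYPES:
            # Assumed layout: the RDF document lives under /data.
            return 303, {"Location": f"/data{resource}", "Vary": "Accept"}
    return 406, {}


print(negotiate("text/html,application/xhtml+xml")[0])  # 200
print(negotiate("application/rdf+xml")[0])               # 303
print(negotiate("pizza/margherita")[0])                  # 406
```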
  61. 61. Please start negotiating content right now! application/rdf+xml application/xml text/plain text/turtle application/x-turtle application/trix application/x-trig text/n3 text/rdf+n3 application/trix application/x-trig application/x-binary-rdf text/x-nquads application/ld+json application/rdf+json application/xhtml+xml text/xml application/json application/rdf+xml application/rdf+n3 application/sparql-results+xml application/sparql-results+json
  62. 62. Please start negotiating content using CURL…
      curl -L -H "Accept: application/rdf+xml" http://dati.camera.it/ocd/governo.rdf/g102
      curl -L -H "Accept: text/n3" http://dati.camera.it/ocd/governo.rdf/g102
  63. 63. Please start negotiating content …or a framework!
      Java: Sesame / Jena
      Python: RDFLib
      Ruby: RDF.rb
      Node.js: sparql-client
      or, as I do, a simple HTTP GET + parsing the result as JSON or XML
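
The "simple HTTP GET + parsing" route needs nothing beyond the standard library. A sketch under some assumptions: `build_request` and `rows` are names invented here, the endpoint is the placeholder `http://yourdomain/sparql`, and `SAMPLE` is a hand-written response in the standard SPARQL JSON results layout, not real output from any endpoint. The request is only prepared, never sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_request(endpoint, query):
    """Prepare (but do not send) a GET request asking for JSON results."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def rows(results_json):
    """Flatten a SPARQL JSON result into a list of {variable: value} dicts."""
    payload = json.loads(results_json)
    names = payload["head"]["vars"]
    return [
        {name: binding[name]["value"] for name in names if name in binding}
        for binding in payload["results"]["bindings"]
    ]


# Hand-written sample response, for illustration only.
SAMPLE = """
{"head": {"vars": ["o"]},
 "results": {"bindings": [
   {"o": {"type": "uri", "value": "http://xmlns.com/foaf/0.1/Person"}}
 ]}}
"""

request = build_request("http://yourdomain/sparql",
                        "select distinct ?o where {?s a ?o}")
print(request.full_url)
print(rows(SAMPLE))
```

To actually run the query you would pass `request` to `urllib.request.urlopen` and feed the response body to `rows`.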
  64. 64. RDF data storing and deploying
  65. 65. It’s slow, so keep calm
      usually: 1 record → 15 triples
      e.g. Chamber of deputies: 2,949,771 votes → 64,948,856 triples
      RDF will probably transform your data into big data
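
Reading the flattened figures on this slide as 2,949,771 vote records that became 64,948,856 triples (an interpretation, since the layout is lost), the growth factor is easy to check. The helper `estimate_triples` is made up for this example and uses the slide's rule of thumb of 15 triples per record.

```python
votes = 2_949_771
triples = 64_948_856

# Triples produced per vote record in the Chamber of deputies example.
per_record = triples / votes
print(round(per_record, 1))  # 22.0


def estimate_triples(records, triples_per_record=15):
    """Rough size estimate using the 'usually 15 triples per record' rule."""
    return records * triples_per_record


print(estimate_triples(1_000_000))  # 15000000
```

So this dataset sits somewhat above the usual 15 triples per record.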
  66. 66. Virtuoso Sesame Fuseki (Jena) Owlim / Bigdata (Sesame) AllegroGraph D2R server ARC2 … Triplestores I just need a SPARQL endpoint I just really need http://yourdomain/sparql
  67. 67. Case studies
  68. 68. SPARQL magic: a query for all seasons
      select distinct ?o where {?s a ?o}
      select ?o (count(distinct ?s) as ?n) where {?s a ?o} group by ?o
      select (count(?s) as ?n) where {?s ?p ?o}
      select (count(?s) as ?n) ?class where {?s ?p ?o; a ?class} group by ?class
      select distinct ?p where {?s a <http://classe>; ?p ?o}
      select ?p (count(?p) as ?n) where {?s a <http://classe>; ?p ?o} group by ?p
      select ?s where {?s a <http://classe>}
      select ?p ?o where {<http://URI> ?p ?o}
      select ?p ?o ?p1 ?o2 where {<http://URI> ?p ?o. OPTIONAL{?o ?p1 ?o2. FILTER(isBlank(?o))}}
      select distinct ?s ?title where {?s a <http://classe>; dc:title ?title. FILTER(REGEX(?title,'parola','i'))} LIMIT 100
  69. 69. Case studies Chamber of deputies
  70. 70. http://dati.camera.it/sparql http://storia.camera.it From SPARQL to html
  71. 71. Case studies Central State Archive
  72. 72. http://acs.beniculturali.it/sparql http://labs.regesta.com/reloadProject From SPARQL to html
  73. 73. Useful links
  74. 74. Useful links
      W3C standards: http://www.w3.org/standards/semanticweb/
      OKFN endpoints status (and list): http://sparqles.okfn.org
      LodLive (a SPARQL navigator): http://en.lodlive.it
      a very good intro to RDF: https://github.com/JoshData/rdfabout/blob/gh-pages/intro-to-rdf.md
      Tim Berners-Lee’s “Linked Data – 5 stars ranking”: http://www.w3.org/DesignIssues/LinkedData.html
      My github page: http://github.com/dvcama
      My email: mailto:diego.camarda@regesta.com
