Finding and consuming (Linked) Open Data

This presentation is about Open Data and Linked Open Data. It has been shown to researchers working in the eHumanities group.

Published in: Technology

  1. 1. Finding and consuming (Linked) Open Data. Christophe Guéret (@cgueret). March 8, 2012. http://latc-project.eu http://ehumanities.nl http://www.vu.nl
  2. 2. The next two hours. Open Data: What is it? Why open data? How to find Open Data. How to consume it. Hands-on session. Linked Data & Linked Open Data: What is it? Relation with Open Data? How to get Linked Data. Ways to consume it.
  3. 3. Open Data (image: http://www.flickr.com/photos/-jvl-/4983920242)
  4. 4. Open Data: “A piece of content or data is open if anyone is free to use, reuse, and redistribute it, subject only, at most, to the requirement to attribute and share-alike.” http://opendefinition.org/
  5. 5. Why open data? Data has more value than applications. Data is used more if it is easier to use. Credit: Dorothea Salo, http://www.slideshare.net/cavlec/rdf-rda-and-other-tlas
  6. 6. Open Data for public institutions. Improve transparency: active citizenship and data journalism. Create new opportunities: develop need-focused applications almost for free (see all the AppsforX challenges: Amsterdam, Nederland, …; http://opendatachallenge.org/); let businesses sell services around the data. Improve efficiency: help share data within institutions.
  7. 7. Open Data for researchers. Consider data as an asset: like papers, it can be referenced; like papers, open access increases its usage. “Better” science: reproducibility of experiments; cross-usage of data sets in different studies; improved transparency (and decreased fraud?).
  8. 8. Data workflow. Search for the relevant data sets. Do data integration and clean up. Visualise and/or analyse the data. Re-publish integrated and curated data.
  9. 9. Data workflow. Search for the relevant data sets. Do data integration and clean up. Visualise the data. Re-publish integrated and curated data.
  10. 10. Three ways to search for data. Generic search engine with a specific target: use keywords, or keywords + file type. Browse data archives: focused around particular topic(s), explored by facets and keywords. Use data portals: “Yellow pages” for data archives, with faceted search.
  11. 11. Using a search engine
  12. 12. Data archive → Dryad
  13. 13. Data archive → Easy
  14. 14. Data portal → Overheid.nl
  15. 15. Data portal → Publicdata.eu
  16. 16. Data portal → Kasabi
  17. 17. Data catalogs
  18. 18. Data workflow. Search for the relevant data sets. Do data integration and clean up. Visualise the data. Re-publish integrated and curated data.
  19. 19. Data integration. Unify the different data in a single format: XLS + PDF + CSV => CSV. Integrate the data: connect the bits and pieces. Curate the data: fix errors in the data; process the data in preparation for its usage.
  20. 20. Data integration. Unify the different data in a single format: XLS + PDF + CSV => CSV (use Linked Data to save time there!). Integrate the data: connect the bits and pieces. Curate the data: fix errors in the data; process the data in preparation for its usage.
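The "unify in a single format" step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two small exports, the `COLUMN_MAP` table and the function names are hypothetical, and only the already-tabular case (CSV in, CSV out) is covered. XLS and PDF sources would need a conversion step first.

```python
import csv
import io

# Hypothetical mapping from source-specific headers to shared column names.
COLUMN_MAP = {"Name": "name", "City": "city", "naam": "name", "stad": "city"}


def unify_csv(sources, column_map):
    """Read several CSV documents (as strings) into rows with shared headers."""
    rows = []
    for text in sources:
        for record in csv.DictReader(io.StringIO(text)):
            # Rename each column; unknown headers are kept unchanged.
            rows.append({column_map.get(key, key): value for key, value in record.items()})
    return rows


def to_csv(rows, fieldnames):
    """Serialise the unified rows back into a single CSV document."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Illustrative sample exports with different header conventions.
export_a = "Name,City\nChristophe,Amsterdam\n"
export_b = "naam,stad\nPeter,Barcelona\n"

rows = unify_csv([export_a, export_b], COLUMN_MAP)
unified = to_csv(rows, ["name", "city"])
```

The hand-written column mapping is exactly the kind of effort the slide suggests Linked Data can reduce, since shared identifiers replace per-file header conventions.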
  21. 21. Data workflow. Search for the relevant data sets. Do data integration and clean up. Visualise the data. Re-publish integrated and curated data.
  22. 22. Visualise data → DataMarket
  23. 23. Visualise data → Google explorer
  24. 24. Visualise data → Microsoft explorer
  25. 25. Visualise data → Wolfram Alpha
  26. 26. Data workflow. Search for the relevant data sets. Do data integration and clean up. Visualise the data. Re-publish integrated and curated data.
  27. 27. Publish processed data. How? Send it to a data archive; publish it on web sites. Why? Re-usability; community process (“if I do it, others will do it”); scientific process.
  28. 28. Hands-on session
  29. 29. In 2001, what were the council election results in the county of Warwickshire (UK)?
  30. 30. What is the evolution of the literacy rate in Tanzania since 1988?
  31. 31. Can you make this plot of unemployment rates using the Google Public Data Explorer?
  32. 32. Linked Data & Linked Open Data. Linked Data (image: http://www.flickr.com/photos/erikcharlton/3337465138)
  33. 33. What is the problem? Frank and Christophe publish some open data; Roi wants to combine and enrich it. Frank's table (Kennissen | Stad): Christophe | Amsterdam; Peter | Barcelona; David | Parijs. Christophe's table (Ville | Pays): Barcelone | Espagne; Paris | France; Amsterdam | Pays-Bas. Marvel icons: mermer, DeviantArt.
  34. 34. What is the problem? Table (Kennissen | Stad) + table (Ville | Pays) = ? Data integration issue: what do “Kennissen”, “Stad”, “Ville” and “Pays” mean? Is “Paris” = “Parijs”? Is “Amsterdam” = “Amsterdam”? A lot of work for the data consumer.
  35. 35. Why is this so problematic? Uneven balance of information: Christophe and Frank have more of it than Roi.
  36. 36. Solution: share more information. “Amsterdam” = “Amsterdam”? Replace “Amsterdam” by “Amsterdam, Netherlands”. “Kennissen”, “Stad”, “Ville”, “Pays”? Provide a description of the meaning of the columns as a separate document. “Paris” = “Parijs”? Use English names instead of local ones.
  37. 37. But is that enough? There could still be several “Amsterdam, Netherlands”: names must be made more precise until 100% certain of uniqueness. Documentation of columns is one more thing to consume in order to use the data. It is hard to enforce the usage of a single language to name things.
  38. 38. Linked Data idea. Data integration at the data level: define “things” in the data set; use unambiguous identifiers for the things; associate descriptions to the identifiers; connect things together. Diagram: thing 1 (name is “Christophe”, ...) works in thing 2 (name in French is “Paris”, name in Dutch is “Parijs”, ...).
  39. 39. Linked Data and the Web. Proposal: use the Web as a platform. Identifiers = URIs. Descriptions = de-referenced documents. Diagram: ex:Christophe ex:worksIn dbpedia:Amsterdam, labelled “This is a triple” and “This is a resource”. Use of compact URIs: dbpedia = http://dbpedia.org/resource/, ex = http://example.org/
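As a small illustration of compact URIs, the sketch below expands the slide's triple into full URIs using the two prefixes given on the slide. The `expand` helper and the tuple representation of a triple are assumptions made for this example, not part of any RDF library.

```python
# Prefix table taken from the slide.
PREFIXES = {
    "dbpedia": "http://dbpedia.org/resource/",
    "ex": "http://example.org/",
}


def expand(curie, prefixes=PREFIXES):
    """Turn a compact URI such as 'ex:worksIn' into a full URI."""
    prefix, _, local_name = curie.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return prefixes[prefix] + local_name


# A triple is (subject, predicate, object); here each term is a resource.
triple = ("ex:Christophe", "ex:worksIn", "dbpedia:Amsterdam")
full_triple = tuple(expand(term) for term in triple)
```

Because every term expands to a URI, each one can in principle be opened on the Web to retrieve its description.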
  40. 40. What is at dbpedia:Amsterdam?
  41. 41. Benefits of Linked Data. Data model of triples and resources: everything is defined as described things and relations; it copes easily with heterogeneous descriptions; it is easy to cross-reference things between data sets. The network contains both the data and its description. It uses the Web and other open standards (RDF, SPARQL, ...).
  42. 42. Frank publishes his data. Table (Kennissen | Stad): Christophe | Amsterdam; Peter | Barcelona; David | Parijs. Graph: ex:Christophe, ex:Peter and ex:David each have rdf:type ex:Acquaintance; ex:Christophe ex:worksIn dbpedia:Amsterdam; ex:Peter ex:worksIn dbpedia:Barcelona; ex:David ex:worksIn dbpedia:Paris.
  43. 43. Christophe re-uses part of Frank's data to publish his data. Table (Ville | Pays): Barcelone | Espagne; Paris | France; Amsterdam | Pays-Bas. Graph: Frank's triples, plus dbpedia:Amsterdam ex:isIn dbpedia:Netherlands; dbpedia:Barcelona ex:isIn dbpedia:Spain; dbpedia:Paris ex:isIn dbpedia:France.
  44. 44. Roi adds some more information. Graph: Frank's and Christophe's triples, plus ex:Acquaintance rdfs:label “Conocido”@es; dbpedia:Netherlands, dbpedia:Spain and dbpedia:France each ex:isIn dbpedia:Europe.
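The three publishing steps can be mimicked with plain Python sets of triples. This is a sketch under assumptions: the helper names are hypothetical, terms are kept as compact URIs, and the label property is written as `rdfs:label`, the standard RDFS property for labels. It shows that combining the contributions is a set union, with no header matching needed.

```python
# Frank's contribution: who the acquaintances are and where they work.
frank = {
    ("ex:Christophe", "rdf:type", "ex:Acquaintance"),
    ("ex:Peter", "rdf:type", "ex:Acquaintance"),
    ("ex:David", "rdf:type", "ex:Acquaintance"),
    ("ex:Christophe", "ex:worksIn", "dbpedia:Amsterdam"),
    ("ex:Peter", "ex:worksIn", "dbpedia:Barcelona"),
    ("ex:David", "ex:worksIn", "dbpedia:Paris"),
}

# Christophe re-uses the city identifiers to state the countries.
christophe = {
    ("dbpedia:Amsterdam", "ex:isIn", "dbpedia:Netherlands"),
    ("dbpedia:Barcelona", "ex:isIn", "dbpedia:Spain"),
    ("dbpedia:Paris", "ex:isIn", "dbpedia:France"),
}

# Roi adds a Spanish label and the continent.
roi = {
    ("ex:Acquaintance", "rdfs:label", '"Conocido"@es'),
    ("dbpedia:Netherlands", "ex:isIn", "dbpedia:Europe"),
    ("dbpedia:Spain", "ex:isIn", "dbpedia:Europe"),
    ("dbpedia:France", "ex:isIn", "dbpedia:Europe"),
}

# Integration is a set union because all three use the same identifiers.
merged = frank | christophe | roi


def objects(graph, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def countries_of_work(graph, person):
    """Follow ex:worksIn, then ex:isIn, across the merged contributions."""
    return {
        country
        for city in objects(graph, person, "ex:worksIn")
        for country in objects(graph, city, "ex:isIn")
    }
```

A question such as "in which country does David work?" spans Frank's and Christophe's data, yet `countries_of_work(merged, "ex:David")` answers it without any knowledge of the original column names.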
  45. 45. Reasoning with semantics. Bonus! Given dbpedia:Amsterdam ex:isIn dbpedia:Netherlands and dbpedia:Netherlands ex:isIn dbpedia:Europe, adding ex:isIn rdf:type owl:TransitiveProperty yields dbpedia:Amsterdam ex:isIn dbpedia:Europe. Example usage: materialize implicit information; check for consistency.
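The inference on this slide can be reproduced with a naive fixed-point loop. The sketch below is not how a real OWL reasoner is implemented; `materialize_transitive` is a hypothetical helper that only handles the transitivity of one given property.

```python
def materialize_transitive(triples, predicate):
    """Return the triples plus those implied by treating predicate as transitive."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot of the current (subject, object) pairs for the predicate.
        pairs = [(s, o) for s, p, o in closed if p == predicate]
        for s1, o1 in pairs:
            for s2, o2 in pairs:
                # a -> b and b -> c imply a -> c.
                if o1 == s2 and (s1, predicate, o2) not in closed:
                    closed.add((s1, predicate, o2))
                    changed = True
    return closed


graph = {
    ("dbpedia:Amsterdam", "ex:isIn", "dbpedia:Netherlands"),
    ("dbpedia:Netherlands", "ex:isIn", "dbpedia:Europe"),
}
inferred = materialize_transitive(graph, "ex:isIn") - graph
```

Here `inferred` holds the single implicit fact from the slide; the loop repeats until no new triple appears, so longer chains are also covered.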
  46. 46. Linked Data vs Linked Open Data. Linked Data doesn't imply Open Data: it is possible to apply Linked Data principles to closed data. Open Data doesn't imply Linked Data: much open data is not yet published as Linked Data. Linked Data + Open Data = Linked Open Data: a global, web-scale data space of open data.
  47. 47. Rough estimate of size: 295 data sets, 31B facts in the LOD Cloud. Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
  48. 48. Everyone can enrich the cloud. Graph: ex:Acquaintance rdfs:label “Conocido”@es; ex:Christophe, ex:Peter and ex:David each have rdf:type ex:Acquaintance; ex:Christophe ex:worksIn dbpedia:Amsterdam; ex:Peter ex:worksIn dbpedia:Barcelona; ex:David ex:worksIn dbpedia:Paris; dbpedia:Amsterdam ex:isIn dbpedia:Netherlands; dbpedia:Barcelona ex:isIn dbpedia:Spain; dbpedia:Paris ex:isIn dbpedia:France; dbpedia:Netherlands, dbpedia:Spain and dbpedia:France each ex:isIn dbpedia:Europe.
  49. 49. Get Linked Open Data. Linked Data is a graph database on the Web. It can be consumed in two ways. As documents on the Web: open the resources and ask for RDF content to get a graph. As a database: query the data with SPARQL (the equivalent of SQL).
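Both ways of consuming Linked Data come down to an HTTP request. The sketch below only builds the requests with Python's standard library and does not send them. The function names, the choice of `Accept` media types and the `http://example.org/sparql` endpoint are assumptions for illustration; passing the query in a `query` URL parameter follows the SPARQL protocol.

```python
from urllib.parse import urlencode
from urllib.request import Request


def rdf_document_request(resource_uri):
    """Way 1: open the resource itself and ask for RDF instead of HTML."""
    return Request(
        resource_uri,
        headers={"Accept": "text/turtle, application/rdf+xml;q=0.9"},
    )


def sparql_request(endpoint, query):
    """Way 2: send a SPARQL query to an endpoint as a GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


document = rdf_document_request("http://dbpedia.org/resource/Amsterdam")
# Hypothetical endpoint address; replace it with a real SPARQL endpoint.
select = sparql_request(
    "http://example.org/sparql",
    "SELECT * WHERE { ?s ?p ?o } LIMIT 1",
)
```

Either request could then be passed to `urllib.request.urlopen`; whether a server returns RDF depends on whether it supports content negotiation for the requested formats.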
  50. 50. Search for RDF documents
  51. 51. Look for the RDF export
  52. 52. Look for the RDF export
  53. 53. Look for the RDF export
  54. 54. Sindice Web data inspector
  55. 55. Hands-on session
  56. 56. Get the RDF of a BestBuy product
  57. 57. Get RDF out of rottentomatoes
  58. 58. Use-case: building a social network of musicians
  59. 59. Goal: make a network. Nodes = artists. Edges = play(ed) in the same band.
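Once (artist, band) memberships are available, the network described in the goal can be built by grouping artists per band and linking every pair within a band. The sketch below assumes that input shape; `band_network` and the placeholder names are hypothetical and stand in for real data.

```python
from collections import defaultdict
from itertools import combinations


def band_network(memberships):
    """Build edges between artists who play(ed) in the same band.

    memberships: iterable of (artist, band) pairs.
    Returns a set of (artist_a, artist_b) tuples with artist_a < artist_b.
    """
    members = defaultdict(set)
    for artist, band in memberships:
        members[band].add(artist)
    edges = set()
    for artists in members.values():
        # Every pair of members of one band is connected by an edge.
        edges.update(combinations(sorted(artists), 2))
    return edges


# Placeholder memberships, not real data.
memberships = [
    ("artist_a", "band_x"),
    ("artist_b", "band_x"),
    ("artist_c", "band_x"),
    ("artist_c", "band_y"),
    ("artist_d", "band_y"),
]
edges = band_network(memberships)
```

Storing each edge as a sorted pair keeps the network undirected: two artists sharing several bands still produce a single edge.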
  60. 60. Use Freebase as data source
  61. 61. Getting the data. First option: get all the pages for all the artists as RDF, merge them, and filter the data to keep only the desired relations. Second option: extract a sub-graph out of the data graph of Freebase.
  62. 62. SPARQL query: PREFIX fb: <http://rdf.freebase.com/ns/> SELECT DISTINCT ?name1 ?name2 WHERE { ?g1 fb:music.group_membership.group ?group. ?g1 fb:music.group_membership.member [...]
  63. 63. Result. Use factforge.net: it contains a copy of the data from Freebase and understands SPARQL queries. Results: http://bit.ly/music_sn
  64. 64. Hotline for Linked (Open) Data. Christophe Guéret: c.d.m.gueret@vu.nl, http://www.few.vu.nl/~cgueret, @cgueret. Rinke Hoekstra: rinke.hoekstra@vu.nl, http://www.rinkehoekstra.nl/, @rinkehoekstra.
