Is Linked Data something for me? Christophe Guéret, Clément Levallois. eHumanities group meeting, November 22, 2012.

Slides prepared with Clément Levallois for the tutorial held at the Meertens Institute. The presentation goes over the need for using Linked Data to make data machine-readable. The hands-on part focuses on annotating a profile page with RDFa.

Transcript of "Is linked data something for me?"

  1. Is Linked Data something for me? Christophe Guéret, Clément Levallois. eHumanities group meeting, November 22, 2012.
  2. Get ready! Goal of today: learn about Linked Data and see whether it is of interest for your activities.
  3. Hands-on tutorial. Make groups, one per table. Pick a famous person of your choice per group. Grab the material at http://bit.ly/ehg_tutorial or catch a USB stick.
  4. Big data, but how to get it? We can't always gather all the information manually.
  5. Big data, but how to get it? Data is scattered across different information systems.
  6. Big data, but how to get it? Data comes in different formats.
  7. What if we could? If all data were “readable”, connections between datasets could be made. We would simply know more than we do today. “Linked data” is an attempt to do that.
  8. Why is it so hard? Machines cannot read the text and extract data. What is the name of that person?
  9. Ouch! You just faced the same problem as machines: you can't read the document and extract the data. Linked Data is a solution to this problem. Note: in the following we take the example of data “buried” in web pages (HTML documents), but the same logic applies to other kinds of documents (CSV files, databases, your collection of pictures…).
  10. Use case for the hands-on.
  11. What we will do: take the web page of a researcher (one page per group!); explain why the data in this page is “buried”; solve the issue by introducing some linked data sweetness in the web page; show what we gained: now we can connect the researchers!
  12. Template 1: the name is in the title; the city is ambiguous.
  13. Template 2: the name is not visible on the page; the city is ambiguous.
  14. Template 3: the name is in the description; the city is ambiguous.
  15. Hands-on: check out the templates. Open the templates in a web browser and look at their HTML source code.
  16. Hands-on: check out the templates. Change “William Smith” into a name of your own (one name per group). Change and pick another name!
  17. First part of the hands-on.
  18. In what sense do we mean that the name of this researcher is buried in this web page? There is no way for software reading this page to guess: Is there a name on this page? If so, what is this name? What does this name represent? What does it relate to? But wait, my Internet browser can read HTML pages, so why can't it figure out the name of the researcher? Because the HTML code gives information about how to display the page, but none about what the content means!
  19. Two roads from there… We could design software that understands English: this is the approach of natural language processing, statistics, etc. Or we can add extra code that tells the software directly what the data means: this is the linked data approach! This extra code in HTML pages is called “RDFa”.
  20. Annotate the data. We use a VOCABULARY for these annotations, for example foaf:name.
  21. Wait! What is that “foaf:name”? It is a term from a vocabulary: foaf:name comes from the FOAF vocabulary and is used to annotate the name of a person. Key concept!!! Vocabulary = set of unambiguous, consensual terms used to annotate pages with data. Vocabularies are an agreement between data publishers and consumers, and are generally focused on particular topics.
  22. Annotate the page with the data.
  23. Hands-on: annotate with foaf:name. Add the “foaf:name” annotation to the three templates.
      Step 1: declare the FOAF vocabulary:
          <html xmlns:foaf="http://xmlns.com/foaf/0.1/">
      Step 2: annotate the data:
          <span property="foaf:name">William Smith</span>
      Template 2 does not display the name, so we use a meta element:
          <meta property="foaf:name" content="William Smith"/>
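The `xmlns:foaf` declaration in step 1 is what lets the short form `foaf:name` stand for a full URI. A minimal Python sketch of that expansion, assuming a hand-written prefix table; the names `PREFIXES` and `expand` are illustrative and not part of any RDFa tool:

```python
# Prefix table mirroring the xmlns:foaf declaration from step 1.
# PREFIXES and expand() are hypothetical names used only for this sketch.
PREFIXES = {"foaf": "http://xmlns.com/foaf/0.1/"}


def expand(term: str) -> str:
    """Expand a prefixed term such as 'foaf:name' into a full URI."""
    prefix, _, local_name = term.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"undeclared prefix: {prefix}")
    return PREFIXES[prefix] + local_name


print(expand("foaf:name"))  # http://xmlns.com/foaf/0.1/name
```

An undeclared prefix raises an error here, which mirrors why step 1 has to come before step 2.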
  24. Hands-on: extract annotations. Use the RDFa extractor at http://bit.ly/RDFaParser to get the annotations from the three templates. Command line tool:
          java -jar RDFaParser-0.0.6.jar template1.html
          java -jar RDFaParser-0.0.6.jar template2.html
          java -jar RDFaParser-0.0.6.jar template3.html
      All three return the same result!
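To give a feel for what such an extractor does, here is a deliberately simplified Python sketch. It is not the RDFaParser tool and not a conforming RDFa processor: it only collects `property` attributes, ignores subjects and prefixes, and stops collecting text at the first closing tag. The three HTML snippets are made-up stand-ins for the workshop templates, which are not reproduced here.

```python
from html.parser import HTMLParser


class PropertyExtractor(HTMLParser):
    """Collects (property, value) pairs from `property` attributes."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._open_property = None  # property whose text is being collected
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop is None:
            return
        if "content" in attrs:
            # <meta property="..." content="..."/>: the value is the attribute
            self.pairs.append((prop, attrs["content"]))
        else:
            # <span property="...">text</span>: the value is the element text
            self._open_property = prop
            self._text = []

    def handle_data(self, data):
        if self._open_property is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._open_property is not None:
            value = "".join(self._text).strip()
            self.pairs.append((self._open_property, value))
            self._open_property = None


def extract(html):
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.pairs


# Hypothetical stand-ins for the three templates.
TEMPLATE_1 = '<h1><span property="foaf:name">William Smith</span></h1>'
TEMPLATE_2 = '<head><meta property="foaf:name" content="William Smith"/></head>'
TEMPLATE_3 = '<p>About <span property="foaf:name">William Smith</span>, researcher.</p>'

for template in (TEMPLATE_1, TEMPLATE_2, TEMPLATE_3):
    print(extract(template))
```

The name sits in a different place in each snippet, yet the annotation makes the extraction identical for all three.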
  25. Bingo! We get exactly the same result for the three templates: foaf:name = William Smith.
  26. How this should look now (here showing template 1).
  27. How to choose a vocabulary? Vocabulary => consensus. Therefore, it is better to avoid obscure vocabularies nobody knows and to focus on well-organised and maintained vocabularies. Why did we use FOAF? It is specialised for personal profiles and widely accepted, has W3C support, and is recommended for use by EU members: http://joinup.ec.europa.eu/asset/core_person/description
  28. What vocabularies are available? Many are well established: FOAF, SIOC, Dublin Core, BIBO, … Creating vocabularies is doable, but beware: new vocabularies won't necessarily gain adoption; you need to maintain the vocabulary; you need to host it on the Web. A vocabulary can borrow terms from other vocabularies.
  29. EU initiative: “Core Vocabularies” from the ISA programme. They combine existing terms and new ones.
  30. Google/Bing/Yahoo/Yandex initiative. Vocabulary: Schema.org. Used by search engines to extract data from pages.
  31. Facebook initiative. Vocabulary: Open Graph protocol. Used to put the “Like” buttons on pages.
  32. How to use a vocabulary? Look at the documentation, e.g. http://xmlns.com/foaf/spec/, and map your concepts to terms from the vocabulary: Naam (name) → foaf:name; Voornaam (first name) → foaf:firstName; Achternaam (last name) → foaf:lastName; Werklocatie (work location) → foaf:based_near.
  33. Triples and subjects. Remember, we created this annotation: foaf:name "William Smith". But what entity has “William Smith” for a name? <template1.html> foaf:name "William Smith". Meaning: this document has “William Smith” for a name. This is a “triple” made of a subject, a predicate and an object: Subject = <template1.html>, Predicate = foaf:name, Object = "William Smith".
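The triple on this slide can be written down as a small data structure. The following Python sketch is only an illustration: the `Triple` type and the `make_triple` helper are hypothetical names, and the fallback to the page name mimics the default behaviour described in the hands-on rather than the full RDFa rules.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def make_triple(predicate: str, value: str, page: str,
                about: Optional[str] = None) -> Triple:
    """Build a triple; with no explicit subject, the page is the subject."""
    subject = about if about is not None else page
    return Triple(subject, predicate, value)


# No subject declared: the document itself ends up "named" William Smith.
t = make_triple("foaf:name", "William Smith", page="template1.html")
print(t.subject, t.predicate, t.obj)
```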
  34. We did not declare a subject. The annotation says that this is the foaf:name but does not define a subject → the page name is used by default.
  35. Why does this matter? Subjects can be used as objects to create links (diagram labels: foaf:knows, foaf:name). A common subject is needed to group annotations (diagram labels: foaf:name William Smith, foaf:based_near Durham).
  36. Picking a resource. It needs to be stable, web accessible and re-used. Consensus again, for example: Amsterdam: http://dbpedia.org/resource/Amsterdam; TBL: http://www.w3.org/People/Berners-Lee/card#i. Subjects like <C:/MyDirectory/templateX.html> are not valid Web-based resources; we need to change that.
  37. Hands-on: set the subject.
      Step 1: decide on a resource for the person, e.g.
          http://example.org/william_smith
          http://myurl.com/john_doe
      Step 2: add the resource with an “about” attribute in the same span as the foaf:name. Example:
      You had:
          <span property="foaf:name">
      It becomes:
          <span about="http://example.org/william_smith" property="foaf:name">
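The effect of the “about” attribute can be sketched in Python as follows. This is a simplified stand-in for a real RDFa processor: it only looks at elements carrying a `property` attribute, takes the first chunk of text as the value, and falls back to the page name when no `about` is present. The class and function names, and the page name `template1.html`, are assumptions made for the example.

```python
from html.parser import HTMLParser


class TripleExtractor(HTMLParser):
    """Simplified: one triple per element carrying a `property` attribute."""

    def __init__(self, page_name):
        super().__init__()
        self.page_name = page_name
        self.triples = []
        self._pending = None  # (subject, predicate) waiting for element text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "property" in attrs:
            # No "about" attribute: fall back to the page itself as subject.
            subject = attrs.get("about", self.page_name)
            self._pending = (subject, attrs["property"])

    def handle_data(self, data):
        if self._pending is not None:
            subject, predicate = self._pending
            self.triples.append((subject, predicate, data.strip()))
            self._pending = None


def triples_from(html, page_name):
    parser = TripleExtractor(page_name)
    parser.feed(html)
    parser.close()
    return parser.triples


before = '<span property="foaf:name">William Smith</span>'
after = ('<span about="http://example.org/william_smith" '
         'property="foaf:name">William Smith</span>')

print(triples_from(before, "template1.html"))
print(triples_from(after, "template1.html"))
```

Only the subject changes between the two outputs: the name is now attached to a resource for the person instead of to the file.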
  38. 5-star Linked Data. Rules (see http://5stardata.info/): resources are valid URIs; machine-readable data is associated with the resource; the data contains links to other resources. Example: http://dbpedia.org/resource/Amsterdam
  39. Great! We're done now! We added this structured piece of data to all the templates: <http://example.org/william_smith> foaf:name "William Smith". This data can be extracted by software. We can build an application that fetches people's names, but there are still no links between them :-/
  40. One of the new code files. All the annotated templates have their name suffixed with “_with_name_and_subject”.
  41. Second part of the hands-on: create some links.
  42. Creating links. Links are used to connect two resources. Example: William Smith knows Tim Berners-Lee: <http://example.org/william_smith> foaf:knows <http://www.w3.org/People/Berners-Lee/card#i>. Two usages: create (social) networks by connecting resources; disambiguate text by pointing to the exact resource.
  43. Hands-on: getting social.
      Step 1: ask 3 other groups in this workshop for their subject (remember, a subject is what you set in the “about” attribute):
          <span about="http://example.org/william_smith" property="foaf:name">
      Step 2: use the 3 subjects you got to annotate the links. Example:
          I know
          <span rel="foaf:knows" resource="http://example.org/john_doe">John Doe</span>, and
          <span rel="foaf:knows" resource="http://myUrl.com/nchomsky">Noam Chomsky</span>, and also
          <span rel="foaf:knows" resource="http://ehumanities.knaw.nl/sally_wyatt">Sally Wyatt</span>
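The links annotated in step 2 translate into triples whose object is another resource rather than a piece of text. A rough Python sketch, assuming the subject is passed in by hand (a real RDFa processor would work it out from the enclosing “about”); `LinkExtractor` and `links_from` are made-up names:

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Turns rel/resource attribute pairs into (subject, rel, resource)."""

    def __init__(self, subject):
        super().__init__()
        self.subject = subject
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "rel" in attrs and "resource" in attrs:
            self.triples.append((self.subject, attrs["rel"], attrs["resource"]))


def links_from(html, subject):
    parser = LinkExtractor(subject)
    parser.feed(html)
    parser.close()
    return parser.triples


snippet = (
    'I know '
    '<span rel="foaf:knows" resource="http://example.org/john_doe">John Doe</span>, and '
    '<span rel="foaf:knows" resource="http://myUrl.com/nchomsky">Noam Chomsky</span>, and also '
    '<span rel="foaf:knows" resource="http://ehumanities.knaw.nl/sally_wyatt">Sally Wyatt</span>'
)

for triple in links_from(snippet, "http://example.org/william_smith"):
    print(triple)
```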
  44. Let's make some links.
  45. Remember, there are two Durhams: one in the US, one in the UK, of similar importance. Which one is the “Durham” on the profile? http://sws.geonames.org/4464368 or http://sws.geonames.org/2650628?
  46. Finding a resource on Geonames: search by name, follow the RDF link, and strip out the “/about.rdf” part.
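The last step, stripping the “/about.rdf” part, is simple string handling. A small Python sketch, assuming the RDF link has the form of a resource URI followed by `/about.rdf` (the example link below is built from the Durham identifier used in this tutorial, not copied from the Geonames site); `resource_from_rdf_link` is a hypothetical helper name:

```python
def resource_from_rdf_link(rdf_link: str) -> str:
    """Drop a trailing '/about.rdf' to get the resource URI."""
    suffix = "/about.rdf"
    if rdf_link.endswith(suffix):
        return rdf_link[: -len(suffix)]
    # Leave anything that does not match the expected pattern untouched.
    return rdf_link


print(resource_from_rdf_link("http://sws.geonames.org/4464368/about.rdf"))
# http://sws.geonames.org/4464368
```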
  47. Hands-on: disambiguate Durham. Annotate “Durham” with a link to the exact resource.
      Step 1: decide on which Durham to use.
      Step 2: annotate Durham with the link:
          <span rel="foaf:based_near" about="http://example.org/william_smith" resource="http://sws.geonames.org/4464368">Durham</span>
  48. Hands-on: extract annotations. Use the RDFa extractor at http://bit.ly/RDFaParser to get the annotations from the three templates. Command line tool:
          java -jar RDFaParser-0.0.6.jar template1.html
          java -jar RDFaParser-0.0.6.jar template2.html
          java -jar RDFaParser-0.0.6.jar template3.html
      All three return the same result!
  49. Hands-on: extract a network! Now use a small program from the Dropbox.
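The workshop program itself is not reproduced here, but the idea of extracting a network can be sketched in a few lines of Python: once every group's page yields foaf:knows triples, grouping them by subject gives a "who knows whom" graph. The function name and the sample triples below are invented for illustration.

```python
from collections import defaultdict


def build_network(triples):
    """Group foaf:knows triples into {person: set of people they know}."""
    network = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "foaf:knows":
            network[subject].add(obj)
    return dict(network)


# Made-up triples, as they might come out of several annotated pages.
triples = [
    ("http://example.org/william_smith", "foaf:name", "William Smith"),
    ("http://example.org/william_smith", "foaf:knows", "http://example.org/john_doe"),
    ("http://example.org/william_smith", "foaf:knows", "http://myUrl.com/nchomsky"),
    ("http://example.org/john_doe", "foaf:knows", "http://example.org/william_smith"),
]

for person, known in build_network(triples).items():
    print(person, "knows", sorted(known))
```

This only works because the groups agreed on shared subjects: the same URI has to appear on both ends of a link for the pages to connect.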
  50. That's all for now! (But there is more to discover: ontologies, reasoning, SPARQL, ...)