Is Linked Data something for me?

      Christophe Guéret, Clément Levallois
  eHumanities group meeting, November 22, 2012




                                                 1/
Get ready !
 Goal of today

  Learn about Linked Data



  See if that is something interesting for your activities




                                                             2/
Hands-on tutorial
 Make groups, one per table


 Pick a famous person of your choice per group


 Grab the material on http://bit.ly/ehg_tutorial or
catch a USB stick




                                                      3/
Big data, but how to get it?
  Can't always
gather all the
information
manually




                               4/
Big data, but how to get it?
 Data scattered in
different information
systems




                               5/
Big data, but how to get it?
 Data in different formats




                               6/
What if we could?
  If all data were “readable”, connections between
datasets could be made. We would simply know
more than we do today.


 “Linked data” is an attempt to do that




                                                      7/
Why is it so hard?
 Machines cannot read the text and extract data




            What is the name of that person?       8/
Ouch!
  You just faced the same problem as machines:
   Can't read the document and extract the data


  Linked Data is a solution to this problem


Note: in the following we take the example of data “buried” in
webpages (html documents), but the same logic applies to other
kinds of docs (csv files, databases, your collection of pictures…)




                                                                     9/
Use case for the hands-on




                            10/
What we will do...
  Take the webpage of a researcher (one page per
group!)


 Explain why the data in this page is “buried”


 Solve the issue by introducing some linked data
sweetness in the webpage


  Show what we gained: now, we can connect the
researchers!
                                                     11/
Template 1
 The name is in the title
 City is ambiguous




                            12/
Template 2
 The name is not visible on the page
 City is ambiguous




                                       13/
Template 3
 The name is in the description
 City is ambiguous




                                  14/
Hands-on: check out the templates
  Open the templates in a web browser and look at
their HTML source code




                                                    15/
Hands-on: check out the templates
  Change “William Smith” into a name of your own
(one name per group)
                          Change and pick another name!




                                                          16/
First part of the hands-on




                             17/
In what sense do we mean that the name of this
researcher is buried in this web page?
 There is no way for software reading this page to guess:
   is there a name on this page?
  if so, what is this name?
  What does this name represent? What does it relate to?


 But wait, my Internet browser can read html pages,
why can’t it figure out the name of the researcher?
  Because the html code gives info about how to display
 the page, but no info about what the content means!


                                                              18/
Two roads from there…


 We could design software that understands English
  This is the approach of natural language processing,
 statistics, etc...


 We can add extra code that tells the software directly
what the data means
  This is the linked data approach! This extra code in html
 pages is called “RDFa”



                                                              19/
Annotate the data
 We use a VOCABULARY for these annotations
           foaf:name




                                             20/
Wait! What is that “foaf:name” ?
 It is a term from a vocabulary
  foaf:name comes from the vocabulary FOAF and is used
 to annotate the name of a person          Key concept!!!




  Vocabulary = set of unambiguous consensual
terms used to annotate pages with data


 Vocabularies are
  An agreement between data publisher and consumers
  Generally focused on particular topics                    21/
Annotate the page with the data




                                  22/
Hands-on: annotate with foaf:name
  Add the “foaf:name” annotation to the three
templates


 Step 1: declare the vocabulary FOAF
  <html xmlns:foaf="http://xmlns.com/foaf/0.1/">


 Step 2: annotate the data
  <span property="foaf:name">William Smith</span>
  Template 2 does not display the name, so we use a meta element:
  <meta property="foaf:name" content="William Smith"/>
                                                       23/
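
To see why the annotation helps, here is a minimal Python sketch of what an extractor does with the "property" attribute. It is not the RDFaParser tool used in the hands-on and not a full RDFa processor: the class name PropertyExtractor, the function extract and the three small template strings are assumptions made up for illustration, and nested tags inside an annotated element are not handled.

```python
from html.parser import HTMLParser


class PropertyExtractor(HTMLParser):
    """Collects (property, value) pairs from 'property' attributes (simplified)."""

    def __init__(self):
        super().__init__()
        self.found = []     # list of (property, value) pairs
        self._open = None   # property whose element text is being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop is None:
            return
        if "content" in attrs:
            # <meta property="..." content="..."/>: the value is in the attribute
            self.found.append((prop, attrs["content"]))
        else:
            # <span property="...">text</span>: the value is the element text
            self._open = prop
            self._text = []

    def handle_data(self, data):
        if self._open is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Simplification: the first closing tag ends the annotated element
        if self._open is not None:
            self.found.append((self._open, "".join(self._text).strip()))
            self._open = None


def extract(html):
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.found


# Hypothetical stand-ins for the three templates (name in heading, hidden, in text)
TEMPLATE_1 = '<html><body><h1 property="foaf:name">William Smith</h1></body></html>'
TEMPLATE_2 = ('<html><head><meta property="foaf:name" content="William Smith"/></head>'
              '<body><p>Welcome</p></body></html>')
TEMPLATE_3 = ('<html><body><p>Home page of <span property="foaf:name">William Smith'
              '</span>, Durham</p></body></html>')

for page in (TEMPLATE_1, TEMPLATE_2, TEMPLATE_3):
    print(extract(page))
```

The three pages place the name differently, yet the extractor never has to guess: it only looks for the annotation.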
Hands-on: extract annotations
  Use the RDFa extractor at http://bit.ly/RDFaParser
to get the annotations from the three templates


 Command line tool:
  java -jar RDFaParser-0.0.6.jar template1.html
  java -jar RDFaParser-0.0.6.jar template2.html
  java -jar RDFaParser-0.0.6.jar template3.html


 All three return the same result!
                                                       24/
Bingo!
  We get exactly the same result for the three
templates
  foaf:name = William Smith




                                                 25/
What this should look like now
 (here showing template 1)




                                26/
How to choose a vocabulary?
 Vocabulary => consensus


 Therefore, it is better to
  Avoid obscure vocabularies nobody knows
  Focus on well organised and maintained vocabularies


 Why did we use FOAF?
  Specialised for personal profiles and widely accepted
  W3C support & recommended for use by EU members
    http://joinup.ec.europa.eu/asset/core_person/description   27/
What vocabularies are available?
 Many are well established: FOAF, SIOC, Dublin
Core, BIBO, …


 Creating vocabularies is doable but beware that:
  New vocabularies won't necessarily gain adoption
  Need to maintain the vocabulary
  Need to host it on the Web


 A vocabulary can borrow terms from other vocabs.
                                                     28/
EU initiative
 “Core Vocabularies” from ISA program
 Combine existing terms and new ones




                                        29/
Google/Bing/Yahoo/Yandex initiative
 Vocabulary: Schema.org
 Used by search engines to extract pages' data




                                                 30/
Facebook initiative
 Vocabulary: Open graph protocol
 Used to put the “Like” buttons on pages




                                           31/
How to use a vocabulary?
 Look at the documentation, e.g.
  http://xmlns.com/foaf/spec/


 Map your concepts to terms from the vocabulary
  Naam → foaf:name
  Voornaam → foaf:firstName
  Achternaam → foaf:lastName
  Werklocatie → foaf:based_near
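
As a rough illustration, the mapping above can be written down as a lookup table and applied to a record. This is only a sketch: the function name to_foaf, the sample record and its extra "Kamer" field are invented for the example.

```python
# Mapping from the local (Dutch) field names to FOAF terms, as listed above
FIELD_TO_FOAF = {
    "Naam": "foaf:name",
    "Voornaam": "foaf:firstName",
    "Achternaam": "foaf:lastName",
    "Werklocatie": "foaf:based_near",
}


def to_foaf(record):
    """Rename the keys of a record to FOAF terms; unmapped fields are skipped."""
    return {FIELD_TO_FOAF[key]: value
            for key, value in record.items()
            if key in FIELD_TO_FOAF}


# Hypothetical record; "Kamer" (room) has no FOAF term in the mapping
record = {"Naam": "William Smith", "Werklocatie": "Durham", "Kamer": "2.14"}
print(to_foaf(record))
```

Fields without an agreed term are simply left out here; in practice you would look for a term in another vocabulary.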


                                                  32/
Triples and subjects
 Remember, we created this annotation
  . foaf:name "William Smith"


 But what entity has “William Smith” for a name?
  <template1.html> foaf:name "William Smith"
        Meaning: This document has the name “William Smith”




 This is a “triple” made of a subject, a predicate and an object
  Subject = <template1.html>
  Predicate = foaf:name
  Object = "William Smith"
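
In code, a triple is just three values kept together. The sketch below uses a Python named tuple; the class name Triple and the method as_text are illustrative choices, not part of any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate, object (illustrative type)."""
    subject: str
    predicate: str
    obj: str

    def as_text(self):
        # Same notation as on the slide: <subject> predicate "object"
        return f'<{self.subject}> {self.predicate} "{self.obj}"'


triple = Triple("template1.html", "foaf:name", "William Smith")
print(triple.as_text())
```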

                                                                   33/
We did not declare a subject
 This says that this is the foaf:name but does not
define a subject → Use the page name by default
              foaf:name




                                                     34/
Why does this matter?
 Subjects can be used as objects to create links
             foaf:knows                     foaf:name




 Need a common subject to group annotations

                           foaf:name
                                                   William Smith


                          foaf:based_near
                                                        Durham

                                                                   35/
Picking a resource
 Needs to be stable, Web-accessible and re-used


 Consensus again, example:
  Amsterdam: http://dbpedia.org/resource/Amsterdam
  TBL: http://www.w3.org/People/Berners-Lee/card#i


 The <C:/MyDirectory/templateX.html> subjects are not valid
Web-based resources; we need to change that

                                                     36/
Hands-on: set the subject
 Step 1: decide on a resource for the person
  http://example.org/william_smith
  http://myurl.com/john_doe


 Step 2: add the resource with an “about” attribute in the
same span as the foaf:name
 Example:
 You had: <span property="foaf:name">
 It becomes:
 <span about="http://example.org/william_smith" property="foaf:name">
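
The effect of the “about” attribute on the extracted triple can be sketched in a few lines of Python. The function triple_from_span and the page address used as the default subject are hypothetical; the person URI is the one chosen in Step 1.

```python
def triple_from_span(attrs, text, page_uri):
    """Build a (subject, predicate, object) triple from one annotated span.

    If the span has no 'about' attribute, the page itself is the subject.
    """
    subject = attrs.get("about", page_uri)
    return (subject, attrs["property"], text)


PAGE = "http://example.org/template1.html"  # hypothetical page address

# Before: no subject declared, so the page is used by default
before = triple_from_span({"property": "foaf:name"}, "William Smith", PAGE)

# After: the "about" attribute names the person explicitly
after = triple_from_span(
    {"about": "http://example.org/william_smith", "property": "foaf:name"},
    "William Smith",
    PAGE,
)

print(before)
print(after)
```

Only the subject changes between the two triples: the name now belongs to the person, not to the document.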

                                                                             37/
5-star Linked Data
 Rules (see http://5stardata.info/ ):
  Resources are valid URIs
  Machine readable data is associated to the resource
  The data contains links to other resources
 Example http://dbpedia.org/resource/Amsterdam




                                                        38/
Great! We're done now!
  We added this structured piece of data to all the
templates:
 <http://example.org/william_smith> foaf:name "William Smith"



 This data can be extracted by software


 We can build an application that fetches persons'
names, but there are still no links between them :-/


                                                                39/
About the new code
 All the annotated templates have their name
suffixed with “_with_name_and_subject”




                                               40/
Second part of the hands-on


    Create some links




                              41/
Creating links
 Links are used to connect two resources


 Example: William Smith knows Tim Berners-Lee
  <http://example.org/william_smith> foaf:knows
 <http://www.w3.org/People/Berners-Lee/card#i>


 Two usages:
  Create (social) networks by connecting resources
  Disambiguate text by pointing to the exact resource
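
Once links exist as triples, building a network is mostly a matter of grouping them. The following Python sketch assumes triples are plain (subject, predicate, object) tuples; the function name knows_network and the link from John Doe back to William Smith are invented for the example.

```python
from collections import defaultdict


def knows_network(triples):
    """Group foaf:knows links by subject; other predicates are ignored."""
    network = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "foaf:knows":
            network[subject].add(obj)
    return dict(network)


WILLIAM = "http://example.org/william_smith"
TBL = "http://www.w3.org/People/Berners-Lee/card#i"
JOHN = "http://example.org/john_doe"

triples = [
    (WILLIAM, "foaf:name", "William Smith"),  # a name, not a link
    (WILLIAM, "foaf:knows", TBL),
    (WILLIAM, "foaf:knows", JOHN),
    (JOHN, "foaf:knows", WILLIAM),            # hypothetical link back
]

for person, acquaintances in knows_network(triples).items():
    print(person, "knows", sorted(acquaintances))
```

Because every group uses the same URI for the same person, links published on different pages end up in the same network.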
                                                        42/
Hands-on: getting social
Step 1: ask 3 other groups in this workshop for their subject
(remember, the subject is the URI in the “about” attribute, as in:
<span about="http://example.org/william_smith" property="foaf:name">)

Step 2: use the 3 subjects you got to annotate the links
Example:

I know
<span rel="foaf:knows" resource="http://example.org/john_doe">John Doe</span>
, and
<span rel="foaf:knows" resource="http://myUrl.com/nchomsky">Noam Chomsky</span>
, and also
<span rel="foaf:knows" resource="http://ehumanities.knaw.nl/sally_wyatt">Sally
Wyatt</span>



                                                                                  43/
Let's make some links




                        44/
Remember, there are two Durhams
 One in the US, one in the UK, similar importance
 Which one is the “Durham” on the profile?




  http://sws.geonames.org/4464368   http://sws.geonames.org/2650628

                                                                      45/
Finding a resource on Geonames
  Search by name, follow the RDF link, strip out the
“/about.rdf” part
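
The last step, stripping “/about.rdf”, is easy to get wrong by hand, so here is a small Python sketch of it. The function name resource_from_rdf_link is made up, and the example RDF link is assumed to be the resource URI of one of the two Durhams followed by “/about.rdf”.

```python
def resource_from_rdf_link(rdf_link):
    """Return the resource URI for an RDF link by dropping '/about.rdf'."""
    suffix = "/about.rdf"
    if rdf_link.endswith(suffix):
        return rdf_link[: -len(suffix)]
    # Not an RDF document link: leave it unchanged
    return rdf_link


# Assumed form of the RDF link shown on the Geonames page
print(resource_from_rdf_link("http://sws.geonames.org/4464368/about.rdf"))
```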




                                                       46/
Hands-on: disambiguate Durham
  Annotate “Durham” with a link to the exact
resource


 Step 1: decide on which Durham to use


 Step 2: annotate Durham with the link
  <span rel="foaf:based_near"
 about="http://example.org/william_smith"
 resource="http://sws.geonames.org/4464368">Durham</span>
                                                       47/
Hands-on: extract annotations
  Use the RDFa extractor at http://bit.ly/RDFaParser
to get the annotations from the three templates


 Command line tool:
  java -jar RDFaParser-0.0.6.jar template1.html
  java -jar RDFaParser-0.0.6.jar template2.html
  java -jar RDFaParser-0.0.6.jar template3.html


 All three return the same result!
                                                       48/
Hands-on: extract a network!
 Now use the small program from the Dropbox folder




                                              49/
That's all for now!

(but there is more to discover: ontologies, reasoning, SPARQL, ...)




                                                                      50/

More Related Content

What's hot

Context is King: On Semantic Publishing
Context is King: On Semantic PublishingContext is King: On Semantic Publishing
Context is King: On Semantic PublishingStefan Gradmann
 
Articulo sobre foros (completo ingles)
Articulo sobre foros (completo ingles)Articulo sobre foros (completo ingles)
Articulo sobre foros (completo ingles)MrAxe Huerta
 
Top 100 Tools for Learning 2008
Top 100 Tools for Learning 2008Top 100 Tools for Learning 2008
Top 100 Tools for Learning 2008Jane Hart
 
Make useof.com dropbox
Make useof.com dropboxMake useof.com dropbox
Make useof.com dropboxwilkmjw
 
Metadata Training for Staff and Librarians for the New Data Environment
Metadata Training for Staff and Librarians for the New Data EnvironmentMetadata Training for Staff and Librarians for the New Data Environment
Metadata Training for Staff and Librarians for the New Data EnvironmentDiane Hillmann
 

What's hot (8)

Context is King: On Semantic Publishing
Context is King: On Semantic PublishingContext is King: On Semantic Publishing
Context is King: On Semantic Publishing
 
Linked Data
Linked DataLinked Data
Linked Data
 
Articulo sobre foros (completo ingles)
Articulo sobre foros (completo ingles)Articulo sobre foros (completo ingles)
Articulo sobre foros (completo ingles)
 
Top 100 Tools for Learning 2008
Top 100 Tools for Learning 2008Top 100 Tools for Learning 2008
Top 100 Tools for Learning 2008
 
Slides.ppt
Slides.pptSlides.ppt
Slides.ppt
 
Make useof.com dropbox
Make useof.com dropboxMake useof.com dropbox
Make useof.com dropbox
 
Downloading Steps
Downloading StepsDownloading Steps
Downloading Steps
 
Metadata Training for Staff and Librarians for the New Data Environment
Metadata Training for Staff and Librarians for the New Data EnvironmentMetadata Training for Staff and Librarians for the New Data Environment
Metadata Training for Staff and Librarians for the New Data Environment
 

Viewers also liked

Your next data viz gear should be a Wii-U
Your next data viz gear should be a Wii-UYour next data viz gear should be a Wii-U
Your next data viz gear should be a Wii-UChristophe Guéret
 
Clarifier le sens de vos données publiques avec le Web de données
Clarifier le sens de vos données publiques avec le Web de donnéesClarifier le sens de vos données publiques avec le Web de données
Clarifier le sens de vos données publiques avec le Web de donnéesChristophe Guéret
 
Informal presentation about RES
Informal presentation about RESInformal presentation about RES
Informal presentation about RESChristophe Guéret
 
UXSG2014 Lightning Talks - UX and Semantic web making web more human (Nurgul ...
UXSG2014 Lightning Talks - UX and Semantic web making web more human (Nurgul ...UXSG2014 Lightning Talks - UX and Semantic web making web more human (Nurgul ...
UXSG2014 Lightning Talks - UX and Semantic web making web more human (Nurgul ...ux singapore
 
Semantic Web questions we couldn't ask 10 years ago
Semantic Web questions we couldn't ask 10 years agoSemantic Web questions we couldn't ask 10 years ago
Semantic Web questions we couldn't ask 10 years agoFrank van Harmelen
 
Semantic Web in the Plateau of Productivity
Semantic Web in the Plateau of ProductivitySemantic Web in the Plateau of Productivity
Semantic Web in the Plateau of ProductivityIoannis Stavrakantonakis
 
Semantic Web for Advanced Engineering
Semantic Web for Advanced EngineeringSemantic Web for Advanced Engineering
Semantic Web for Advanced EngineeringMarta Sabou
 
Introduction to Semantic Web Technologies
Introduction to Semantic Web TechnologiesIntroduction to Semantic Web Technologies
Introduction to Semantic Web TechnologiesIvan Herman
 
Introducing design thinking
Introducing design thinkingIntroducing design thinking
Introducing design thinkingZaana Jaclyn
 

Viewers also liked (12)

Your next data viz gear should be a Wii-U
Your next data viz gear should be a Wii-UYour next data viz gear should be a Wii-U
Your next data viz gear should be a Wii-U
 
The data behind the HuisKluis
The data behind the HuisKluisThe data behind the HuisKluis
The data behind the HuisKluis
 
Clarifier le sens de vos données publiques avec le Web de données
Clarifier le sens de vos données publiques avec le Web de donnéesClarifier le sens de vos données publiques avec le Web de données
Clarifier le sens de vos données publiques avec le Web de données
 
Informal presentation about RES
Informal presentation about RESInformal presentation about RES
Informal presentation about RES
 
UXSG2014 Lightning Talks - UX and Semantic web making web more human (Nurgul ...
UXSG2014 Lightning Talks - UX and Semantic web making web more human (Nurgul ...UXSG2014 Lightning Talks - UX and Semantic web making web more human (Nurgul ...
UXSG2014 Lightning Talks - UX and Semantic web making web more human (Nurgul ...
 
UXSG#8 Workshop
UXSG#8 WorkshopUXSG#8 Workshop
UXSG#8 Workshop
 
Semantic Web questions we couldn't ask 10 years ago
Semantic Web questions we couldn't ask 10 years agoSemantic Web questions we couldn't ask 10 years ago
Semantic Web questions we couldn't ask 10 years ago
 
Semantic Web in the Plateau of Productivity
Semantic Web in the Plateau of ProductivitySemantic Web in the Plateau of Productivity
Semantic Web in the Plateau of Productivity
 
Semantic Web for Advanced Engineering
Semantic Web for Advanced EngineeringSemantic Web for Advanced Engineering
Semantic Web for Advanced Engineering
 
Neuroblastoma
NeuroblastomaNeuroblastoma
Neuroblastoma
 
Introduction to Semantic Web Technologies
Introduction to Semantic Web TechnologiesIntroduction to Semantic Web Technologies
Introduction to Semantic Web Technologies
 
Introducing design thinking
Introducing design thinkingIntroducing design thinking
Introducing design thinking
 

Similar to Is linked data something for me?

Data Portability with SIOC and FOAF
Data Portability with SIOC and FOAFData Portability with SIOC and FOAF
Data Portability with SIOC and FOAFUldis Bojars
 
Understanding the Standards Gap
Understanding the Standards GapUnderstanding the Standards Gap
Understanding the Standards GapDan Brickley
 
Building Secure Open & Distributed Social Networks
Building Secure Open & Distributed Social NetworksBuilding Secure Open & Distributed Social Networks
Building Secure Open & Distributed Social NetworksHenry Story
 
Semantic Web and the Social Web
Semantic Web and the Social WebSemantic Web and the Social Web
Semantic Web and the Social Webrobin fay
 
Final copyofopensourcesites softwareandpresentationoutlineforslideshowfinal5-...
Final copyofopensourcesites softwareandpresentationoutlineforslideshowfinal5-...Final copyofopensourcesites softwareandpresentationoutlineforslideshowfinal5-...
Final copyofopensourcesites softwareandpresentationoutlineforslideshowfinal5-...alysonkaye
 
Semantic Web 2.0: Creating Social Semantic Information Spaces
Semantic Web 2.0: Creating Social Semantic Information SpacesSemantic Web 2.0: Creating Social Semantic Information Spaces
Semantic Web 2.0: Creating Social Semantic Information SpacesJohn Breslin
 
Copyofopensourcesites softwareandpresentationoutlineforslideshowfinal5-10-12
Copyofopensourcesites softwareandpresentationoutlineforslideshowfinal5-10-12Copyofopensourcesites softwareandpresentationoutlineforslideshowfinal5-10-12
Copyofopensourcesites softwareandpresentationoutlineforslideshowfinal5-10-12alysonkaye
 
Information Searches
Information SearchesInformation Searches
Information SearchesBlair E
 
Implementing Semantic Queries in Online Social Networks
Implementing Semantic Queries in Online Social NetworksImplementing Semantic Queries in Online Social Networks
Implementing Semantic Queries in Online Social NetworksOtávio Calaça Xavier
 
2011 05-01 linked data
2011 05-01 linked data2011 05-01 linked data
2011 05-01 linked datavafopoulos
 
Computer and internet applications in medicine
Computer and internet applications in medicineComputer and internet applications in medicine
Computer and internet applications in medicineAhmed-Refat Refat
 
Getting out of Silo, Using Open Source Software to Share your Data
Getting out of Silo, Using Open Source Software to Share your DataGetting out of Silo, Using Open Source Software to Share your Data
Getting out of Silo, Using Open Source Software to Share your DataBoris Mann
 
2011 05-02 linked data intro
2011 05-02 linked data intro2011 05-02 linked data intro
2011 05-02 linked data introvafopoulos
 
The Semantic Web An Introduction
The Semantic Web An IntroductionThe Semantic Web An Introduction
The Semantic Web An Introductionshaouy
 
To the Cloud! How to Compile and Analyze Reference Statistics Easily and for ...
To the Cloud! How to Compile and Analyze Reference Statistics Easily and for ...To the Cloud! How to Compile and Analyze Reference Statistics Easily and for ...
To the Cloud! How to Compile and Analyze Reference Statistics Easily and for ...Sonnet Ireland
 
Linked Data: so what?
Linked Data: so what?Linked Data: so what?
Linked Data: so what?MIUR
 
FOAF for Social Network Portability
FOAF for Social Network PortabilityFOAF for Social Network Portability
FOAF for Social Network PortabilityUldis Bojars
 
DataPortability and Me: Introducing SIOC, FOAF and the Semantic Web
DataPortability and Me: Introducing SIOC, FOAF and the Semantic WebDataPortability and Me: Introducing SIOC, FOAF and the Semantic Web
DataPortability and Me: Introducing SIOC, FOAF and the Semantic WebJohn Breslin
 
Jarrar: The Next Generation of the Web 3.0: The Semantic Web
Jarrar: The Next Generation of the Web 3.0: The Semantic WebJarrar: The Next Generation of the Web 3.0: The Semantic Web
Jarrar: The Next Generation of the Web 3.0: The Semantic WebMustafa Jarrar
 

Similar to Is linked data something for me? (20)

Data Portability with SIOC and FOAF
Data Portability with SIOC and FOAFData Portability with SIOC and FOAF
Data Portability with SIOC and FOAF
 
Understanding the Standards Gap
Understanding the Standards GapUnderstanding the Standards Gap
Understanding the Standards Gap
 
Building Secure Open & Distributed Social Networks
Building Secure Open & Distributed Social NetworksBuilding Secure Open & Distributed Social Networks
Building Secure Open & Distributed Social Networks
 
Semantic Web and the Social Web
Semantic Web and the Social WebSemantic Web and the Social Web
Semantic Web and the Social Web
 
Final copyofopensourcesites softwareandpresentationoutlineforslideshowfinal5-...
Final copyofopensourcesites softwareandpresentationoutlineforslideshowfinal5-...Final copyofopensourcesites softwareandpresentationoutlineforslideshowfinal5-...
Final copyofopensourcesites softwareandpresentationoutlineforslideshowfinal5-...
 
Semantic Web 2.0: Creating Social Semantic Information Spaces
Semantic Web 2.0: Creating Social Semantic Information SpacesSemantic Web 2.0: Creating Social Semantic Information Spaces
Semantic Web 2.0: Creating Social Semantic Information Spaces
 
Copyofopensourcesites softwareandpresentationoutlineforslideshowfinal5-10-12
Copyofopensourcesites softwareandpresentationoutlineforslideshowfinal5-10-12Copyofopensourcesites softwareandpresentationoutlineforslideshowfinal5-10-12
Copyofopensourcesites softwareandpresentationoutlineforslideshowfinal5-10-12
 
Information Searches
Information SearchesInformation Searches
Information Searches
 
NISO/DCMI Webinar: Schema.org and Linked Data: Complementary Approaches to Pu...
NISO/DCMI Webinar: Schema.org and Linked Data: Complementary Approaches to Pu...NISO/DCMI Webinar: Schema.org and Linked Data: Complementary Approaches to Pu...
NISO/DCMI Webinar: Schema.org and Linked Data: Complementary Approaches to Pu...
 
Implementing Semantic Queries in Online Social Networks
Implementing Semantic Queries in Online Social NetworksImplementing Semantic Queries in Online Social Networks
Implementing Semantic Queries in Online Social Networks
 
2011 05-01 linked data
2011 05-01 linked data2011 05-01 linked data
2011 05-01 linked data
 
Computer and internet applications in medicine
Computer and internet applications in medicineComputer and internet applications in medicine
Computer and internet applications in medicine
 
Getting out of Silo, Using Open Source Software to Share your Data
Getting out of Silo, Using Open Source Software to Share your DataGetting out of Silo, Using Open Source Software to Share your Data
Getting out of Silo, Using Open Source Software to Share your Data
 
2011 05-02 linked data intro
2011 05-02 linked data intro2011 05-02 linked data intro
2011 05-02 linked data intro
 
The Semantic Web An Introduction
The Semantic Web An IntroductionThe Semantic Web An Introduction
The Semantic Web An Introduction
 
To the Cloud! How to Compile and Analyze Reference Statistics Easily and for ...
To the Cloud! How to Compile and Analyze Reference Statistics Easily and for ...To the Cloud! How to Compile and Analyze Reference Statistics Easily and for ...
To the Cloud! How to Compile and Analyze Reference Statistics Easily and for ...
 
Linked Data: so what?
Linked Data: so what?Linked Data: so what?
Linked Data: so what?
 
FOAF for Social Network Portability
FOAF for Social Network PortabilityFOAF for Social Network Portability
FOAF for Social Network Portability
 
DataPortability and Me: Introducing SIOC, FOAF and the Semantic Web
DataPortability and Me: Introducing SIOC, FOAF and the Semantic WebDataPortability and Me: Introducing SIOC, FOAF and the Semantic Web
DataPortability and Me: Introducing SIOC, FOAF and the Semantic Web
 
Jarrar: The Next Generation of the Web 3.0: The Semantic Web
Jarrar: The Next Generation of the Web 3.0: The Semantic WebJarrar: The Next Generation of the Web 3.0: The Semantic Web
Jarrar: The Next Generation of the Web 3.0: The Semantic Web
 

More from Christophe Guéret

HHAI June 2022 - KGs and Hybrid Intelligence
HHAI June 2022 - KGs and Hybrid IntelligenceHHAI June 2022 - KGs and Hybrid Intelligence
HHAI June 2022 - KGs and Hybrid IntelligenceChristophe Guéret
 
Stop making tools! Nobody likes them anyway...
Stop making tools! Nobody likes them anyway...Stop making tools! Nobody likes them anyway...
Stop making tools! Nobody likes them anyway...Christophe Guéret
 
The Entity Registry System: Collaborative Editing of Entity Data in Poorly Co...
The Entity Registry System: Collaborative Editing of Entity Data in Poorly Co...The Entity Registry System: Collaborative Editing of Entity Data in Poorly Co...
The Entity Registry System: Collaborative Editing of Entity Data in Poorly Co...Christophe Guéret
 
Introduction about WorldWideSemanticWeb.org for the workshop "Making it Matter"
Introduction about WorldWideSemanticWeb.org for the workshop "Making it Matter"Introduction about WorldWideSemanticWeb.org for the workshop "Making it Matter"
Introduction about WorldWideSemanticWeb.org for the workshop "Making it Matter"Christophe Guéret
 
The Entity Registry System (ERS)
The Entity Registry System (ERS)The Entity Registry System (ERS)
The Entity Registry System (ERS)Christophe Guéret
 
Let's downscale the semantic web !
Let's downscale the semantic web !Let's downscale the semantic web !
Let's downscale the semantic web !Christophe Guéret
 
The road towards a Web-based data ecosystem
The road towards a Web-based data ecosystemThe road towards a Web-based data ecosystem
The road towards a Web-based data ecosystemChristophe Guéret
 
Linked Open Data for Digital Humanities
Linked Open Data for Digital HumanitiesLinked Open Data for Digital Humanities
Linked Open Data for Digital HumanitiesChristophe Guéret
 
Downscaling information systems for education
Downscaling information systems for educationDownscaling information systems for education
Downscaling information systems for educationChristophe Guéret
 
ICT4D course 2013 - Low resources infrastructure
ICT4D course 2013 - Low resources infrastructureICT4D course 2013 - Low resources infrastructure
ICT4D course 2013 - Low resources infrastructureChristophe Guéret
 
ICT4D course 2013 - OLPC deployments
ICT4D course 2013 - OLPC deploymentsICT4D course 2013 - OLPC deployments
ICT4D course 2013 - OLPC deploymentsChristophe Guéret
 
Exposing the data from NARCIS with VIVO
Exposing the data from NARCIS with VIVOExposing the data from NARCIS with VIVO
Exposing the data from NARCIS with VIVOChristophe Guéret
 
Embedding young learners into the information society
Embedding young learners into the information societyEmbedding young learners into the information society
Embedding young learners into the information societyChristophe Guéret
 
Decentralised entity registry “WikiReg”
Decentralised entity registry “WikiReg”Decentralised entity registry “WikiReg”
Decentralised entity registry “WikiReg”Christophe Guéret
 
Evolutionary and Swarm Computing for scaling up the Semantic Web
Evolutionary and Swarm Computing for scaling up the Semantic WebEvolutionary and Swarm Computing for scaling up the Semantic Web
Evolutionary and Swarm Computing for scaling up the Semantic WebChristophe Guéret
 
Decentralised Open Data for World Citizens
Decentralised Open Data  for World CitizensDecentralised Open Data  for World Citizens
Decentralised Open Data for World CitizensChristophe Guéret
 
Assessing Linked Data Mappings using Network Measures
Assessing Linked Data Mappings using Network MeasuresAssessing Linked Data Mappings using Network Measures
Assessing Linked Data Mappings using Network MeasuresChristophe Guéret
 

More from Christophe Guéret (20)

HHAI June 2022 - KGs and Hybrid Intelligence
HHAI June 2022 - KGs and Hybrid IntelligenceHHAI June 2022 - KGs and Hybrid Intelligence
HHAI June 2022 - KGs and Hybrid Intelligence
 

Is linked data something for me?

  • 1. Is Linked Data something for me? Christophe Guéret, Clément Levallois. eHumanities group meeting, November 22, 2012.
  • 2. Get ready! Goal of today: learn about Linked Data, and see if that is something interesting for your activities.
  • 3. Hands-on tutorial. Make groups, one per table. Pick a famous person of your choice per group. Grab the material on http://bit.ly/ehg_tutorial or catch a USB stick.
  • 4. Big data, but how to get it? Can't always gather all the information manually.
  • 5. Big data, but how to get it? Data scattered in different information systems.
  • 6. Big data, but how to get it? Data in different formats.
  • 7. What if we could? If all data were “readable”, connections between datasets could be made. We would simply know more than we do today. “Linked data” is an attempt to do that.
  • 8. Why is it so hard? Machines cannot read the text and extract data. What is the name of that person?
  • 9. Ouch! You just faced the same problem as machines: can't read the document and extract the data. Linked Data is a solution to this problem. Note: in the following we take the example of data “buried” in webpages (HTML documents), but the same logic applies to other kinds of documents (CSV files, databases, your collection of pictures…).
  • 10. Use case for the hands-on.
  • 11. What we will do... Take the webpage of a researcher (one page per group!). Explain why the data in this page is “buried”. Solve the issue by introducing some linked data sweetness in the webpage. Show what we gained: now, we can connect the researchers!
  • 12. Template 1: the name is in the title. City is ambiguous.
  • 13. Template 2: the name is not visible on the page. City is ambiguous.
  • 14. Template 3: the name is in the description. City is ambiguous.
  • 15. Hands-on: check out the templates. Open the templates in a web browser and look at their HTML source code.
  • 16. Hands-on: check out the templates. Change “William Smith” into a name of your own (one name per group). Change and pick another name!
  • 17. First part of the hands-on.
  • 18. In what sense do we mean that the name of this researcher is buried in this web page? There is no way for software reading this page to guess: Is there a name on this page? If so, what is this name? What does this name represent? What does it relate to? But wait, my Internet browser can read HTML pages, so why can't it figure out the name of the researcher? Because the HTML code gives info about how to display the page, but no info about what the content means!
  • 19. Two roads from there… (1) We could design software that understands English. This is the approach of natural language processing, statistics, etc. (2) We can add extra code that tells the software directly what the data means. This is the linked data approach! This extra code in HTML pages is called “RDFa”.
  • 20. Annotate the data. We use a VOCABULARY for these annotations: foaf:name.
  • 21. Wait! What is that “foaf:name”? It is a term from a vocabulary. foaf:name comes from the vocabulary FOAF and is used to annotate the name of a person. Key concept!!! Vocabulary = set of unambiguous, consensual terms used to annotate pages with data. Vocabularies are an agreement between data publishers and consumers, and are generally focused on particular topics.
  • 22. Annotate the page with the data.
  • 23. Hands-on: annotate with foaf:name. Add the “foaf:name” annotation to the three templates. Step 1: declare the vocabulary FOAF: <html xmlns:foaf="http://xmlns.com/foaf/0.1/">. Step 2: annotate the data: <span property="foaf:name">William Smith</span>. Template 2 does not display the name, so we use a meta element: <meta property="foaf:name" content="William Smith"/>.
  • 24. Hands-on: extract annotations. Use the RDFa extractor at http://bit.ly/RDFaParser to get the annotations from the three templates. Command line tool:
        java -jar RDFaParser-0.0.6.jar template1.html
        java -jar RDFaParser-0.0.6.jar template2.html
        java -jar RDFaParser-0.0.6.jar template3.html
    All three return the same result: nothing!
  • 25. Bingo! We get exactly the same result for the three templates: foaf:name = William Smith.
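
To make the extraction step concrete, here is a minimal sketch in Python (standard library only). It is not the RDFaParser tool used in the workshop and it is far from a complete RDFa parser: the `PropertyExtractor` class, the `extract` function and the three HTML fragments are hypothetical stand-ins for the templates. It only handles the two patterns from slide 23, a `property` attribute on an element with text and a `property` attribute with a `content` value.

```python
from html.parser import HTMLParser


class PropertyExtractor(HTMLParser):
    """Collect (property, value) pairs from `property` attributes.

    Hypothetical helper: it only covers the two patterns from the slides and
    assumes the annotated element contains plain text (no nested tags).
    """

    def __init__(self):
        super().__init__()
        self.pairs = []     # extracted (property, value) pairs
        self._open = None   # property whose element text is being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop is None:
            return
        if "content" in attrs:
            # <meta property="..." content="..."/>: value is in the attribute
            self.pairs.append((prop, attrs["content"]))
        else:
            # <span property="...">text</span>: value is the element text
            self._open = prop
            self._text = []

    def handle_data(self, data):
        if self._open is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._open is not None:
            self.pairs.append((self._open, "".join(self._text).strip()))
            self._open = None


def extract(html):
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.pairs


# Hypothetical fragments standing in for the three workshop templates.
FOAF = 'xmlns:foaf="http://xmlns.com/foaf/0.1/"'
template1 = f'<html {FOAF}><body><h1><span property="foaf:name">William Smith</span></h1></body></html>'
template2 = f'<html {FOAF}><head><meta property="foaf:name" content="William Smith"/></head></html>'
template3 = f'<html {FOAF}><body><p>Profile of <span property="foaf:name">William Smith</span>.</p></body></html>'

for page in (template1, template2, template3):
    print(extract(page))
```

The point of the sketch is the one made on the slides: once the pages are annotated, the position of the name in the page no longer matters to the software reading it.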
  • 26. How this should look now (here showing template 1).
  • 27. How to choose a vocabulary? Vocabulary => consensus. Therefore, it is better to avoid obscure vocabularies nobody knows and to focus on well-organised and maintained vocabularies. Why did we use FOAF? It is specialised for personal profiles and widely accepted. W3C support & recommended for use by EU members: http://joinup.ec.europa.eu/asset/core_person/description
  • 28. What vocabularies are available? Many are well established: FOAF, SIOC, Dublin Core, BIBO, … Creating vocabularies is doable, but beware that: new vocabularies won't necessarily gain adoption; you need to maintain the vocabulary; you need to host it on the Web. A vocabulary can borrow terms from other vocabularies.
  • 29. EU initiative: “Core Vocabularies” from the ISA program. Combines existing terms and new ones.
  • 30. Google/Bing/Yahoo/Yandex initiative. Vocabulary: Schema.org. Used by search engines to extract pages' data.
  • 31. Facebook initiative. Vocabulary: Open Graph protocol. Used to put the “Like” buttons on pages.
  • 32. How to use a vocabulary? Look at the documentation (e.g. http://xmlns.com/foaf/spec/). Map your concepts to terms from the vocabulary: Naam (name) → foaf:name; Voornaam (first name) → foaf:firstName; Achternaam (last name) → foaf:lastName; Werklocatie (work location) → foaf:based_near.
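
The mapping step can be sketched as a simple lookup table. In this Python sketch the four field-to-term pairs come from the slide; the `annotate` function, the sample record and its `Kamer` field are made up for illustration.

```python
# Mapping from local (Dutch) field names to FOAF terms, as listed on the slide.
FIELD_TO_TERM = {
    "Naam": "foaf:name",
    "Voornaam": "foaf:firstName",
    "Achternaam": "foaf:lastName",
    "Werklocatie": "foaf:based_near",
}


def annotate(record):
    """Return (term, value) pairs for the fields that have a mapping."""
    return [
        (FIELD_TO_TERM[field], value)
        for field, value in record.items()
        if field in FIELD_TO_TERM
    ]


# Hypothetical record; "Kamer" (room) has no mapping and is left out.
record = {"Naam": "William Smith", "Werklocatie": "Durham", "Kamer": "B12"}
print(annotate(record))
```

Fields without a matching term are silently skipped here; in practice that is the moment to look for another vocabulary that covers them.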
  • 33. Triples and subjects. Remember, we created this annotation: foaf:name "William Smith". But what entity has “William Smith” for a name? <template1.html> foaf:name "William Smith". Meaning: this document has the name “William Smith”. This is a “triple” made of a subject, a predicate and an object: Subject = <template1.html>, Predicate = foaf:name, Object = "William Smith".
  • 34. We did not declare a subject. The foaf:name annotation says that this is the name but does not define a subject → the page name is used by default.
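
The triple structure and the default-subject rule described above can be sketched in a few lines of Python. The `Triple` type and the `make_triple` function are illustrative names, not part of any RDFa library; the optional `subject` argument stands for a subject that has been declared explicitly.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def make_triple(predicate, obj, page_url, subject=None):
    """Use the declared subject if there is one, else fall back to the page."""
    return Triple(subject if subject is not None else page_url, predicate, obj)


# No subject declared: the page itself becomes the subject.
print(make_triple("foaf:name", "William Smith", "template1.html"))

# Subject declared explicitly: the page name is no longer used.
print(make_triple("foaf:name", "William Smith", "template1.html",
                  subject="http://example.org/william_smith"))
```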
  • 35. Why does this matter? Subjects can be used as objects to create links (foaf:knows, foaf:name). We need a common subject to group annotations: foaf:name William Smith, foaf:based_near Durham.
  • 36. Picking a resource. It needs to be stable, Web accessible and re-used. Consensus again; examples: Amsterdam: http://dbpedia.org/resource/Amsterdam, TBL: http://www.w3.org/People/Berners-Lee/card#i. The <C:/MyDirectory/templateX.html> subjects are not valid Web-based resources; we need to change that.
  • 37. Hands-on: set the subject. Step 1: decide on a resource for the person, e.g. http://example.org/william_smith or http://myurl.com/john_doe. Step 2: add the resource with an “about” attribute in the same span as the foaf:name. Example: you had <span property="foaf:name">; it becomes <span about="http://example.org/william_smith" property="foaf:name">.
  • 38. 5-star Linked Data. Rules (see http://5stardata.info/): resources are valid URIs; machine-readable data is associated with the resource; the data contains links to other resources. Example: http://dbpedia.org/resource/Amsterdam
  • 39. Great! We're done now! We added this structured piece of data to all the templates: <http://example.org/william_smith> foaf:name "William Smith". This data can be extracted by software. We can build an application that fetches persons' names, but there are still no links between them :-/
  • 40. One example of the new code. All the annotated templates have their name suffixed with “_with_name_and_subject”.
  • 41. Second part of the hands-on: create some links.
  • 42. Creating links. Links are used to connect two resources. Example: William Smith knows Tim Berners-Lee: <http://example.org/william_smith> foaf:knows <http://www.w3.org/People/Berners-Lee/card#i>. Two usages: create (social) networks by connecting resources; disambiguate text by pointing to the exact resource.
  • 43. Hands-on: getting social. Step 1: ask 3 other groups in this workshop for their subject (remember, a subject is the value of “about”, as in <span about="http://example.org/william_smith" property="foaf:name">). Step 2: use the 3 subjects you got to annotate the links. Example: I know <span rel="foaf:knows" resource="http://example.org/john_doe">John Doe</span>, and <span rel="foaf:knows" resource="http://myUrl.com/nchomsky">Noam Chomsky</span>, and also <span rel="foaf:knows" resource="http://ehumanities.knaw.nl/sally_wyatt">Sally Wyatt</span>.
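
As a rough illustration of what software can read from these link annotations, the Python sketch below turns every element carrying both `rel` and `resource` into a (subject, predicate, object) triple. `LinkExtractor` is a hypothetical helper, not the workshop's extractor. It also simplifies real RDFa: the subject is taken from an `about` attribute on the same element, or from a `default_subject` passed in by the caller, whereas full RDFa would inherit it from enclosing elements.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (subject, predicate, object) triples from rel/resource pairs."""

    def __init__(self, default_subject):
        super().__init__()
        self.default_subject = default_subject  # assumed subject when no "about"
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "rel" in attrs and "resource" in attrs:
            subject = attrs.get("about", self.default_subject)
            self.triples.append((subject, attrs["rel"], attrs["resource"]))


# The example sentence from the slide.
fragment = (
    'I know <span rel="foaf:knows" resource="http://example.org/john_doe">John Doe</span>, '
    'and <span rel="foaf:knows" resource="http://myUrl.com/nchomsky">Noam Chomsky</span>, '
    'and also <span rel="foaf:knows" '
    'resource="http://ehumanities.knaw.nl/sally_wyatt">Sally Wyatt</span>'
)

parser = LinkExtractor(default_subject="http://example.org/william_smith")
parser.feed(fragment)
parser.close()
for triple in parser.triples:
    print(triple)
```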
  • 44. Let's make some links.
  • 45. Remember, there are two Durhams: one in the US, one in the UK, of similar importance. Which one is the “Durham” on the profile? http://sws.geonames.org/4464368 or http://sws.geonames.org/2650628
  • 46. Finding a resource on Geonames: search by name, follow the RDF link, strip out the “/about.rdf” part.
  • 47. Hands-on: disambiguate Durham. Annotate “Durham” with a link to the exact resource. Step 1: decide on which Durham to use. Step 2: annotate Durham with the link: <span rel="foaf:based_near" about="http://example.org/william_smith" resource="http://sws.geonames.org/4464368">Durham</span>
  • 48. Hands-on: extract annotations. Use the RDFa extractor at http://bit.ly/RDFaParser to get the annotations from the three templates. Command line tool:
        java -jar RDFaParser-0.0.6.jar template1.html
        java -jar RDFaParser-0.0.6.jar template2.html
        java -jar RDFaParser-0.0.6.jar template3.html
    All three return the same result!
  • 49. Hands-on: extract a network! Now use a small piece of software from the Dropbox folder.
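
The workshop's network tool is not shown in the slides, so the following Python sketch is only a guess at the idea behind it: once every page yields triples, the foaf:knows triples alone are enough to build a network. The `build_network` function and the sample triples are hypothetical; the URIs are the example ones used earlier in the slides.

```python
def build_network(triples):
    """Build an adjacency mapping {person: set of persons they know}."""
    network = {}
    for subject, predicate, obj in triples:
        if predicate != "foaf:knows":
            continue  # names, locations, etc. are not edges
        network.setdefault(subject, set()).add(obj)
        network.setdefault(obj, set())  # make sure the target is a node too
    return network


# Hypothetical triples, as they could come out of two annotated pages.
triples = [
    ("http://example.org/william_smith", "foaf:name", "William Smith"),
    ("http://example.org/william_smith", "foaf:knows", "http://example.org/john_doe"),
    ("http://example.org/john_doe", "foaf:knows", "http://www.w3.org/People/Berners-Lee/card#i"),
]

for person, known in build_network(triples).items():
    print(person, "knows", sorted(known))
```

Because every group used shared, Web-based subjects, triples extracted from different pages connect without any extra matching step.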
  • 50. That's all for now! (But there is more to discover: ontologies, reasoning, SPARQL, ...)