
Linked Data and Semantic Web - EUDAT Summer School (Yann Le Franc, e-Science Data Factory)

Going from a web of documents to a web of knowledge is one of the key goals set by the creator of the World Wide Web, Sir Tim Berners-Lee. This vision comes closer to reality each day as new formats and technologies are developed and integrated to represent data as knowledge graphs that interlink concepts within documents or databases. This presentation gives an overview of the generic concepts behind Linked Data, including the formats and the technologies that support them, and introduces the key initiatives that rely on these technologies. We will also address the challenge of semantic/knowledge modeling in science and in other domains, and the need for more tools to support the use of these technologies. In particular, we will present the semantic annotation service B2NOTE and show how these formats and technologies are used to extend the description of datasets within EUDAT and to make it possible to create new datasets from multiple sources and multiple domains.

Visit https://eudat.eu/eudat-summer-school

Published in: Data & Analytics

  1. 1. www.eudat.eu EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065 Introduction to Linked Data and Semantic Web Yann Le Franc, PhD This work is licensed under the Creative Commons CC-BY 4.0 licence. Attribution: EUDAT – www.eudat.eu Version 2017-1
  2. 2. How to cope with an expanding universe of scientific data? “The Hitchhiker’s guide to the Semantic Web Galaxy”
  4. 4. EUDAT Summer School, 3-7 July 2017, Crete. Outline: (1) Introduction: a bit of context; (2) The general principles of Linked Data and standards; (3) Application: data annotations with B2NOTE
  5. 5. Problem: the volume of scientific data is expanding
  6. 6. Challenge: aggregating multi-dimensional data from multiple data sources
  7. 7. Similar problem and challenge in Neuroscience
  8. 8. Similar problem and challenge in Neuroscience: multiple species; multi-scale data; connectivity; genes; molecules; electrical activity; functional; data aggregation
  9. 9. Similar problem and challenge in Neuroscience: multiple species; multi-scale data; connectivity; genes; molecules; electrical activity; functional; data aggregation; data analysis; modeling
  11. 11. The current situation: distributed data resources in a large variety of formats. Data are enclosed in information silos: distinct APIs, data published within HTML or unstructured. 2710 databases are related to Neurosciences (Neuroscience Information Framework). How can we make these data resources interoperable and link them together?
  12. 12. A global problem: the World Wide Web is a global document space. Documents are interconnected with links. Data is hidden in HTML pages: easy to use by humans but not by machines. Large diversity of Web APIs. Impossible to access and interlink data. Need for semantics to transform the global document space into a global data space. (Image: https://fr.wikipedia.org/wiki/Tim_Berners-Lee)
  13. 13. A solution for Life Science, the Universe and Everything
  14. 14. What is Linked Data? Tim Berners-Lee (2006), Design Issues: (1) Use URIs as names for things. (2) Use HTTP URIs so that people can look up those names (dereferenceable). (3) When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL). (4) Include links to other URIs, so that they can discover more things. https://www.w3.org/DesignIssues/LinkedData.html
  15. 15. Name things with URIs: use HTTP URIs instead of URNs (Uniform Resource Names) and DOIs. Separate the URI representing the real object or concept from its descriptions. Example: real person: http://www.esciencedatafactory.com/people/yann_le_franc; RDF description (for machines): http://www.esciencedatafactory.com/people/yann_le_franc.rdf; HTML description (for humans): http://www.esciencedatafactory.com/people/yann_le_franc.html
  16. 16. Make URIs dereferenceable: make use of HTTP content negotiation (https://www.w3.org/Protocols/rfc2616/rfc2616-sec12.html). Two technical solutions for designing the URIs (https://www.w3.org/TR/cooluris/): 1 - Content negotiation with a 303 (See Other) redirect; 2 - Hash URIs
  17. 17. Make URIs dereferenceable: 1 - Content negotiation with a 303 (See Other) redirect. Diagram: Client, Server
  18. 18. Make URIs dereferenceable: 1 - Content negotiation with a 303 (See Other) redirect. Step 1, GET URI. Client header: GET /people/yann_le_franc HTTP/1.1 Host: esciencedatafactory.com Accept: text/html, application/rdf+xml
  19. 19. Make URIs dereferenceable: 1 - Content negotiation with a 303 (See Other) redirect. Step 2, 303 See URI2. Client header: GET /people/yann_le_franc HTTP/1.1 Host: esciencedatafactory.com Accept: text/html, application/rdf+xml. Server answer: HTTP/1.1 303 See Other Location: http://www.esciencedatafactory.com/people/yann_le_franc.rdf Vary: Accept
  20. 20. Make URIs dereferenceable: 1 - Content negotiation with a 303 (See Other) redirect. Step 3, GET URI2. Client header: GET /people/yann_le_franc.rdf HTTP/1.1 Host: esciencedatafactory.com Accept: text/html, application/rdf+xml
  21. 21. Make URIs dereferenceable: 1 - Content negotiation with a 303 (See Other) redirect. Step 4, Content URI2. Client header: GET /people/yann_le_franc.rdf HTTP/1.1 Host: esciencedatafactory.com Accept: text/html, application/rdf+xml. Server answer: HTTP/1.1 200 OK Content-Type: application/rdf+xml …
  22. 22. Make URIs dereferenceable: 1 - Content negotiation with a 303 (See Other) redirect. Steps: GET URI; 303 See URI2; GET URI2; Content URI2. Requires two HTTP round trips (four messages) per item
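The 303 exchange on the preceding slides can be sketched without a network. This is a minimal simulation, not the configuration of any real server: the paths follow the slides, while `REPRESENTATIONS`, `parse_accept` and `dereference` are hypothetical names introduced for illustration, and the 406 fallback is an assumption about how a server might answer when no listed format is acceptable.

```python
# Hypothetical mapping from media types to description documents.
# The paths follow the slides; the function names are illustrative only.
THING_PATH = "/people/yann_le_franc"
REPRESENTATIONS = {
    "application/rdf+xml": THING_PATH + ".rdf",
    "text/html": THING_PATH + ".html",
}


def parse_accept(header):
    """Return media types from an Accept header, best quality first."""
    ranked = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        quality = 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                quality = float(param[2:])
        # Negative quality sorts highest first; position keeps header order on ties.
        ranked.append((-quality, position, parts[0]))
    return [media_type for _, _, media_type in sorted(ranked)]


def dereference(path, accept):
    """Simulate the server: 303 for the thing URI, 200 for a description document."""
    if path == THING_PATH:
        for media_type in parse_accept(accept):
            if media_type in REPRESENTATIONS:
                return 303, {"Location": REPRESENTATIONS[media_type], "Vary": "Accept"}
        return 406, {}  # assumed behaviour when nothing acceptable is available
    for media_type, doc_path in REPRESENTATIONS.items():
        if path == doc_path:
            return 200, {"Content-Type": media_type}
    return 404, {}


# First round trip: ask for the thing itself, get redirected.
status, headers = dereference(THING_PATH, "application/rdf+xml, text/html;q=0.5")
print(status, headers["Location"])
# Second round trip: fetch the description document.
status, headers = dereference(headers["Location"], "application/rdf+xml")
print(status, headers["Content-Type"])
```

The thing URI never returns a document itself; it only points to one, which is why each item costs a second request.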
  23. 23. Make URIs dereferenceable: 2 - Hash URIs. Step 1, GET URI. Document http://www.esciencedatafactory.com/people holds the list of people: http://www.esciencedatafactory.com/people#yann_le_franc and http://www.esciencedatafactory.com/people#john_doe. Client header: GET /people HTTP/1.1 Host: esciencedatafactory.com Accept: application/rdf+xml
  24. 24. Make URIs dereferenceable: 2 - Hash URIs. Step 2, Content URI. Client header: GET /people HTTP/1.1 Host: esciencedatafactory.com Accept: application/rdf+xml. Server answer: HTTP/1.1 200 OK Content-Type: application/rdf+xml (the whole list)
  25. 25. Make URIs dereferenceable: 2 - Hash URIs. The client keeps the server answer (the whole list) in a cache
  26. 26. Make URIs dereferenceable: 2 - Hash URIs. Get the whole file, then look into the file to find the items with the hash, e.g. http://www.esciencedatafactory.com/people#yann_le_franc or http://www.esciencedatafactory.com/people#john_doe
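The hash URI approach can be illustrated with the standard library alone. In this sketch the two thing URIs come from the slides; `WHOLE_LIST`, `fetch` and `lookup` are hypothetical stand-ins (no real HTTP request is made), and the labels are illustrative values rather than real data.

```python
from urllib.parse import urldefrag

# Thing URIs from the slides; both live in the same document.
people = [
    "http://www.esciencedatafactory.com/people#yann_le_franc",
    "http://www.esciencedatafactory.com/people#john_doe",
]

# Stand-in for the server's single 200 OK answer ("the whole list").
# The labels are illustrative values, not real data.
WHOLE_LIST = {
    "yann_le_franc": {"label": "Yann Le Franc"},
    "john_doe": {"label": "John Doe"},
}

cache = {}
requests_sent = []


def fetch(document_url):
    """Pretend to GET the document; a real client would use HTTP here."""
    requests_sent.append(document_url)
    return WHOLE_LIST


def lookup(thing_uri):
    """Resolve a hash URI: fetch the document once, then search it locally."""
    # The fragment is stripped by the client and never sent to the server.
    document_url, fragment = urldefrag(thing_uri)
    if document_url not in cache:
        cache[document_url] = fetch(document_url)
    return cache[document_url].get(fragment)


results = [lookup(uri) for uri in people]
print(results)
print(requests_sent)
```

Both lookups share one request because the part after `#` is resolved on the client side. The trade-off is that the whole file is transferred even when only one item is needed.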
  27. 27. The RDF Data Model: an RDF triple (subject, predicate, object) links Resource A (URI) to Resource B (URI) through a relation (URI). Example: My website (http://www.example.com/index.html), created by, Me (http://myprofile/name)
  28. 28. RDF in action: labeled directed graph. From W3C RDF 1.1 Primer, https://www.w3.org/TR/rdf11-primer/
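A triple and the labeled directed graph it forms can be written down in a few lines. This is a sketch under assumptions: the subject and object URIs are the ones on the slide, but the predicate URI (Dublin Core `creator`) is chosen here to stand in for "created by", and `to_ntriples` only handles triples whose three parts are all URIs.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Subject and object URIs come from the slide; the predicate URI is an
# assumption (Dublin Core "creator") standing in for "created by".
triple = Triple(
    subject="http://www.example.com/index.html",
    predicate="http://purl.org/dc/terms/creator",
    object="http://myprofile/name",
)


def to_ntriples(t):
    """Serialize one URI-only triple as an N-Triples line."""
    return "<{}> <{}> <{}> .".format(t.subject, t.predicate, t.object)


def to_graph(triples):
    """Group triples into a labeled directed graph: node -> [(edge label, node)]."""
    graph = {}
    for t in triples:
        graph.setdefault(t.subject, []).append((t.predicate, t.object))
    return graph


print(to_ntriples(triple))
print(to_graph([triple]))
```

Each triple is one labeled edge, so adding triples that share a subject or object grows the graph without any schema change.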
  29. 29. RDF serializations. RDF/XML: <?xml version="1.0" encoding="UTF-8"?> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:foaf="http://xmlns.com/foaf/0.1/"> <rdf:Description rdf:about="http://www.esciencedatafactory.com/people/yann_le_franc"> <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/> <foaf:name>Yann Le Franc</foaf:name> </rdf:Description> </rdf:RDF> Turtle: @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . <http://www.esciencedatafactory.com/people/yann_le_franc> rdf:type foaf:Person ; foaf:name "Yann Le Franc" .
  30. 30. RDF serializations. RDFa: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:foaf="http://xmlns.com/foaf/0.1/"> <head> <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8"/> <title>Profile page for Yann Le Franc</title> </head> <body> <div about="http://www.esciencedatafactory.com/people/yann_le_franc" typeof="foaf:Person"> <span property="foaf:name">Yann Le Franc</span> </div> </body> </html>
  31. 31. Publishing RDF: triples (subject, predicate, object) such as (Alice, is a friend of, Bob); (Bob, is interested in, The Mona Lisa); (Bob, is a, Person); (Bob, is born, 14 July 1990); (The Mona Lisa, was created by, Leonardo Da Vinci); (La Joconde in Washington, is about, The Mona Lisa). Triple store; SPARQL endpoint; SPARQL queries
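What a triple store does when it answers a query can be approximated with pattern matching over a list. The sketch below is not SPARQL and not a real triple store: the triples are copied from the slide with plain strings standing in for URIs, and `match` is a hypothetical helper in which `None` plays the role of a query variable.

```python
# Triples copied from the slide; plain strings stand in for URIs.
TRIPLES = [
    ("Alice", "is a friend of", "Bob"),
    ("Bob", "is interested in", "The Mona Lisa"),
    ("Bob", "is a", "Person"),
    ("Bob", "is born", "14 July 1990"),
    ("The Mona Lisa", "was created by", "Leonardo Da Vinci"),
    ("La Joconde in Washington", "is about", "The Mona Lisa"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# "What do we know about Bob?" (one pattern with two variables)
about_bob = [(p, o) for (_, p, o) in match(TRIPLES, subject="Bob")]

# A two-pattern join: who created something that Bob is interested in?
creators = [
    creator
    for (_, _, thing) in match(TRIPLES, subject="Bob", predicate="is interested in")
    for (_, _, creator) in match(TRIPLES, subject=thing, predicate="was created by")
]
print(about_bob)
print(creators)
```

The join in `creators` chains two patterns through a shared value, which is the basic operation a SPARQL endpoint performs over a much larger graph.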
  32. 32. Technologies to publish RDF: RDF triple store; graph database. References: M. Junghanns and A. Petermann, “Management and Analysis of Big Graph Data: Current Systems and Open Challenges,” … (eds: S Sakr, 2017. B. Haslhofer, E. Momeni Roochi, B. Schandl, and S. Zander, “Europeana RDF Store Report,” Mar. 2011. Z. Kaoudi and G. Weikum, “RDF in the clouds: a survey,” The VLDB Journal, 2014.
  33. 33. Do we need anything else? Resource 1: http://www.incf.org/images/newsroom/le-franc Resource 2: http://m.c.lnkd.licdn.com/mpr/mpr/shrink_200_200/p/2/000/22d/056/2bdc24c.jpg Last Name: Le Franc, <last_name>Le Franc</last_name>. Family Name: Le Franc, <family_name>Le Franc</family_name>
  34. 34. Do we need anything else? Resource 1: http://www.incf.org/images/newsroom/le-franc Resource 2: http://m.c.lnkd.licdn.com/mpr/mpr/shrink_200_200/p/2/000/22d/056/2bdc24c.jpg Last Name: Le Franc, <last_name>Le Franc</last_name>. Family Name: Le Franc, <family_name>Le Franc</family_name>. Synonym/Equivalent
  35. 35. Do we need anything else? Resource 1: http://www.incf.org/images/newsroom/le-franc Resource 2: http://m.c.lnkd.licdn.com/mpr/mpr/shrink_200_200/p/2/000/22d/056/2bdc24c.jpg Last Name: Le Franc, <last_name>Le Franc</last_name>. Family Name: Le Franc, <family_name>Le Franc</family_name>. Synonym/Equivalent? WE NEED COMMON VOCABULARIES TO SHARE THE SAME SEMANTICS
  36. 36. Do we really need vocabularies? Yes, if you are interested in sharing data with others or in aggregating data from multiple sources. Not if you are a lone scientist in your ivory tower
  37. 37. What is an ontology? “In computer science and information science, an ontology formally represents knowledge as a set of concepts within a domain, using a shared vocabulary to denote the types, properties and interrelationships of the concepts” (Wikipedia). How do you encode this in practice? You need to create a controlled vocabulary, also called an ontology, that can be used as a common “standardized” vocabulary to annotate your resources. W3C semantic web standards: RDF Schema, OWL (Web Ontology Language), SKOS (Simple Knowledge Organization System). How can we make it better?
  38. 38. What is an ontology in practice? Class
  39. 39. What is an ontology in practice? Class: unique identifier, label, human-readable definition, other metadata (creator, version, date, …)
  40. 40. What is an ontology in practice? Superclass: unique identifier, label, human-readable definition, other metadata (creator, version, date, …). Subclass: unique identifier, label, human-readable definition. is_a: subsumption relation, e.g. Macaca mulatta is an animal
  41. 41. What is an ontology in practice? Person: unique identifier, label, human-readable definition, other metadata (creator, version, date, …). Yann Le Franc: unique identifier, label, human-readable definition. is_a: subsumption relation (Yann Le Franc is_a Person)
  42. 42. What is an ontology in practice? is_a: subsumption relation (Subclass is_a Superclass). has_a: associative relation (to Superclass 2)
  43. 43. What is an ontology in practice? is_a: subsumption relation (Yann Le Franc is_a Person). has_a: associative relation (to Name). Relations between concepts are based on first-order logic. Use reasoners/classifiers, machine learning algorithms
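The simplest inference a reasoner draws from is_a relations is that subsumption is transitive. The sketch below shows only that one rule and is far from a full OWL reasoner. "Macaca mulatta is an animal" reflects the slides (the rhesus macaque); the intermediate classes `Primate` and `Mammal` are assumptions added so that there is something to infer, and the function names are hypothetical.

```python
# Hypothetical mini-ontology: each class points to its direct superclass.
# "Macaca mulatta is an animal" comes from the slides; the intermediate
# classes are assumptions added to make the inference visible.
IS_A = {
    "Macaca mulatta": "Primate",
    "Primate": "Mammal",
    "Mammal": "Animal",
}


def superclasses(cls, is_a=IS_A):
    """Return every class that subsumes cls, nearest first."""
    found = []
    while cls in is_a:
        cls = is_a[cls]
        if cls in found:  # guard against cycles in a malformed hierarchy
            break
        found.append(cls)
    return found


def is_subclass_of(cls, ancestor, is_a=IS_A):
    """True if the subsumption chain from cls reaches ancestor."""
    return ancestor in superclasses(cls, is_a)


print(superclasses("Macaca mulatta"))
print(is_subclass_of("Macaca mulatta", "Animal"))
```

Only three direct is_a links are stated, yet the link from Macaca mulatta to Animal can be derived, so data annotated with the specific class can be found by a query on the general one.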
  44. 44. Structuring RDF: RDF Schema, OWL
  45. 45. Structuring RDF: SKOS
  46. 46. Microformat and Schema.org: http://microformats.org/wiki/Main_Page
  49. 49. http://schema.org/
  50. 50. Examples of vocabularies: FOAF (Friend Of A Friend); DCAT (Data Catalog Vocabulary); PROV (Provenance vocabulary); Web Annotation; Music Ontology; SIOC (Semantically-Interlinked Online Communities)
  51. 51. The semantic web stack. Image by user:Marobi1 [CC0], via Wikimedia Commons, https://en.wikipedia.org/wiki/Semantic_Web_Stack
  52. 52. Limits of the approach: limitation of a unique formal model (monolithic ontologies); difficulty in reconciling different models; lack of validation and quality testing for ontologies; difficulty in reaching consensus on research topics; slow integration of new concepts into existing ontologies; hard to use for scientists. However, designing common terminologies is valuable and Mostly Harmless?
  53. 53. Some major RDF resources: Google Knowledge Graph (https://www.google.com/intl/bn/insidesearch/features/search/knowledge.html); Facebook graph (https://developers.facebook.com/docs/graph-api/overview/); Wikidata (https://www.wikidata.org/wiki/Wikidata:Main_Page); Freebase; DBpedia; https://datahub.io/dataset; EBI RDF store
  55. 55. Metadata: different types of metadata describe the context, the content, the format and the history of the data. Metadata are generally frozen after publication of a data record. Descriptive metadata can be incomplete and/or biased by the data publisher's perspective. How to add new metadata/information in a flexible way? Annotations
  56. 56. What do we mean by annotation? By definition, an annotation is “a note added to a text, book, drawing, etc., as a comment or an explanation” (from Merriam Webster). In our context, it is an assertion we want to make about a digital resource, e.g. a text file, an image, a recording, a movie, ...
  57. 57. Semantic Annotation: General Principles
  58. 58. Web Annotation Data Model: use the W3C Web Annotation data model (https://www.w3.org/TR/annotation-model/), serialized in JSON-LD (https://www.w3.org/TR/json-ld/), a JSON-based representation of RDF graphs
  59. 59. The annotation “use-cases”: manual annotations of data elements (semantic tagging and file linking); semi-automatic annotations of data element content (related to the LTER Data Pilot); data curation (curation status tags); creation of aggregated datasets from multi-scale or multi-domain datasets
  60. 60. B2NOTE. Crowdsourcing annotator: all annotations are public; private annotations in the next release. Easy to use: auto-completion with terms from domain-specific controlled vocabularies; intuitive user interface; easily create new datasets selected based on annotations. Easy integration based on a widget/iframe approach: integrate with EUDAT services; integrate with community web UIs. Easy to deploy: stores triples as JSON-LD in a MongoDB backend; uses Django as CMS
  61. 61. B2NOTE architecture
  62. 62. B2NOTE Annotation Model (graph): anno1 rdf:type oa:Annotation; anno1 oa:motivatedBy oa:tagging; anno1 oa:hasBody body1; anno1 oa:hasTarget “http://b2share.eudat.eu/record/30”; anno1 dcterms:creator person1; person1 rdf:type foaf:Person; person1 foaf:nick “pseudo”; anno1 as:generator client1; client1 rdf:type as:Application; client1 foaf:name “B2Note v1.0”; client1 foaf:homepage “http://b2note.bsc.es”; anno1 dcterms:created “2017-01-17T09:51:02Z”; anno1 dcterms:issued “2017-01-17T09:51:02Z”; body1 rdf:type oa:Composite (semantic tag) or oa:TextualBody (keyword and comment)
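The annotation graph on this slide can be approximated as a JSON-LD document using the W3C Web Annotation context. This is a sketch, not B2NOTE's actual output format: the target, nickname, generator details and timestamp are taken from the slide, while the keyword value and the `build_annotation` helper are hypothetical. In the W3C context, the keys `Software`, `nickname` and `generated` correspond to `as:Application`, `foaf:nick` and `dcterms:issued`.

```python
import json


def build_annotation(target, keyword, nickname, timestamp):
    """Assemble a Web Annotation as a JSON-LD-ready dictionary."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "motivation": "tagging",
        # A free-text keyword body; a semantic tag would point to a concept URI.
        "body": {"type": "TextualBody", "value": keyword, "purpose": "tagging"},
        "target": target,
        "creator": {"type": "Person", "nickname": nickname},
        "generator": {
            "type": "Software",
            "name": "B2Note v1.0",
            "homepage": "http://b2note.bsc.es",
        },
        "created": timestamp,
        "generated": timestamp,
    }


annotation = build_annotation(
    target="http://b2share.eudat.eu/record/30",
    keyword="neuron",  # hypothetical keyword, not taken from the slides
    nickname="pseudo",
    timestamp="2017-01-17T09:51:02Z",
)
serialized = json.dumps(annotation, indent=2)
print(serialized)
```

Because the annotation lives in its own document and only points at the target, the data record itself is never modified.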
  63. 63. B2NOTE at work: try it @ http://b2note.bsc.es. Login/Register; annotation interface; access to annotations
  64. 64. B2NOTE at work: access semantic term information; search files using annotations; export annotations and selected data for reuse
  65. 65. Test integration with B2SHARE: https://trng-b2share.eudat.eu/
  66. 66. The added value of annotations: enrich digital content with your own keywords without modifying the data record; structure data differently using annotations; support data curation before and after publication; create aggregated datasets from multi-scale or multi-domain datasets
  67. 67. Additional Resources: EUDAT Webinar “Organise, retrieve and aggregate data using annotations with B2NOTE”
  68. 68. www.eudat.eu
