Pal gov.tutorial2.session15 1.linkeddata


Published in: Education

  1. أكاديمية الحكومة الإلكترونية الفلسطينية
     The Palestinian eGovernment Academy, www.egovacademy.ps
     Tutorial II: Data Integration and Open Information Systems
     Session 15.1: The Data Web and Linked Data
     Dr. Mustafa Jarrar, University of Birzeit
     PalGov © 2011
  2. About
     This tutorial is part of the PalGov project, funded by the TEMPUS IV program of the Commission of the European Communities, grant agreement 511159-TEMPUS-1-2010-1-PS-TEMPUS-JPHES. The project website: www.egovacademy.ps
     Project Consortium:
     • Birzeit University, Palestine (Coordinator)
     • University of Trento, Italy
     • Palestine Polytechnic University, Palestine
     • Vrije Universiteit Brussel, Belgium
     • Palestine Technical University, Palestine
     • Université de Savoie, France
     • Ministry of Telecom and IT, Palestine
     • University of Namur, Belgium
     • Ministry of Interior, Palestine
     • TrueTrust, UK
     • Ministry of Local Government, Palestine
     Coordinator: Dr. Mustafa Jarrar, Birzeit University, P.O. Box 14, Birzeit, Palestine. Telfax: +972 2 2982935, mjarrar@birzeit.edu
  3. © Copyright Notes
     Everyone is encouraged to use this material, or part of it, but should properly cite the project (logo and website), and the author of that part. No part of this tutorial may be reproduced or modified in any form or by any means, without prior written permission from the project, who have the full copyrights on the material.
     Attribution-NonCommercial-ShareAlike (CC-BY-NC-SA): This license lets others remix, tweak, and build upon your work non-commercially, as long as they credit you and license their new creations under the identical terms.
  4. Tutorial Map

     Topic                                                                Hours
     Session 1: XML Basics and Namespaces                                 3
     Session 2: XML DTDs                                                  3
     Session 3: XML Schemas                                               3
     Session 4: Lab-XML Schemas                                           3
     Session 5: RDF and RDFS                                              3
     Session 6: Lab-RDF and RDFS                                          3
     Session 7: OWL (Web Ontology Language)                               3
     Session 8: Lab-OWL                                                   3
     Session 9: Lab-RDF Stores: Challenges and Solutions                  3
     Session 10: Lab-SPARQL                                               3
     Session 11: Lab-Oracle Semantic Technology                           3
     Session 12_1: The Problem of Data Integration                        1.5
     Session 12_2: Architectural Solutions for the Integration Issues     1.5
     Session 13_1: Data Schema Integration                                1
     Session 13_2: GAV and LAV Integration                                1
     Session 13_3: Data Integration and Fusion using RDF                  1
     Session 14: Lab-Data Integration and Fusion using RDF                3
     Session 15_1: Data Web and Linked Data                               1.5
     Session 15_2: RDFa                                                   1.5
     Session 16: Lab-RDFa                                                 3

     Intended Learning Objectives
     A: Knowledge and Understanding
     – 2a1: Describe tree and graph data models.
     – 2a2: Understand the notation of XML, RDF, RDFS, and OWL.
     – 2a3: Demonstrate knowledge about querying techniques for data models, such as SPARQL and XPath.
     – 2a4: Explain the concepts of identity management and Linked Data.
     – 2a5: Demonstrate knowledge about integration and fusion of heterogeneous data.
     B: Intellectual Skills
     – 2b1: Represent data using tree and graph data models (XML & RDF).
     – 2b2: Describe data semantics using RDFS and OWL.
     – 2b3: Manage and query data represented in RDF, XML, OWL.
     – 2b4: Integrate and fuse heterogeneous data.
     C: Professional and Practical Skills
     – 2c1: Use Oracle Semantic Technology and/or Virtuoso to store and query RDF stores.
     D: General and Transferable Skills
     – 2d1: Working in a team.
     – 2d2: Presenting and defending ideas.
     – 2d3: Use of creativity and innovation in problem solving.
     – 2d4: Develop communication skills and logical reasoning abilities.
  5. Module ILOs
     After completing this module students will be able to:
     – Explain the concepts of identity management and linked data.
     – Understand basic concepts of the Data Web.
     – Integrate and fuse heterogeneous data.
  6. Semantic / Data Web / Web 3.0?
     “The goal of the Semantic Web is to create a universal medium for the exchange of DATA”, W3C.
     “The Semantic Web is a web of data, in some ways like a global database”, Tim Berners-Lee, inventor of the WWW.
  7. Web of Data
     • The Data Web envisions the web as worldwide, interlinked, structured data.
     • The Web as we know it today is a global information space of linked documents.
     • The same vision is applied to data: publishing and connecting structured data on the web.
  8. Classical Web
     Diagram source: Christian Bizer
     • The classical web is a global information space of linked documents.
     • The primary units of the hypertext Web are:
       – HTML documents,
       – connected by hyperlinks.
  9. The Challenge
     • The problem is that the information on the classical web is not structured.
       – Programs cannot use such information in a useful way.
     • The solution is to increase the structure of published information.
  10. Web APIs and Mashups
      Diagram source: Christian Bizer
      • Many major data sources such as Amazon, Yahoo!, eBay, and Google provide access to their data through APIs.
      • Up to 14 Sep 2011, 3891 APIs and 6101 mashups had been listed.
  11. Web APIs and Mashups
      Picture source: Christian Bizer
      • However:
        – APIs provide proprietary interfaces.
        – Data retrieved from these APIs is represented using different formats (different data models).
        – Mashups created using these APIs are based on a fixed set of data sources, because entities in different APIs are not linked.
        – You cannot set hyperlinks between entities.
      • In short, APIs separate data.
  12. Beyond Web APIs and Mashups: The Data Web and Linked Data
      • The Data Web envisions the web as worldwide, interlinked, structured data.
      • Linked Data refers to the set of best practices for publishing and connecting structured data on the web.
      • Linked Data best practices have led to the extension of the web, connecting data from diverse domains such as:
        – People, companies, books, scientific publications, films, music, television and radio programs, genes, proteins, drugs, clinical trials, online communities, statistical and scientific data, reviews, …
  13. The Data Web and Linked Data
      Diagram source: Christian Bizer
      • While the primary units of the hypertext Web are HTML documents connected by un-typed hyperlinks, Linked Data relies on documents containing data in RDF. However, rather than simply connecting these documents, Linked Data uses RDF to make typed statements that link arbitrary things in the world.
      • The result is a web of things in the world, described by data on the Web.
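The difference between an un-typed hyperlink and a typed RDF statement can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `example.org` URIs and the `locatedIn` and `birthPlace` properties are hypothetical names, not taken from any real dataset or from the slides.

```python
from typing import NamedTuple


class Hyperlink(NamedTuple):
    """Classical web link: says only that one document points to another."""
    source: str
    target: str


class Triple(NamedTuple):
    """RDF-style statement: the predicate names the type of the link."""
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace, used only for illustration.
EX = "http://example.org/"

link = Hyperlink(EX + "page/Bethlehem.html", EX + "page/Palestine.html")

statements = [
    Triple(EX + "thing/Bethlehem", EX + "vocab/locatedIn", EX + "thing/Palestine"),
    Triple(EX + "thing/Person1", EX + "vocab/birthPlace", EX + "thing/Bethlehem"),
]


def objects_of(triples, subject, predicate):
    """Return every object reached from `subject` through the given typed link."""
    return [t.obj for t in triples if t.subject == subject and t.predicate == predicate]


located_in = objects_of(statements, EX + "thing/Bethlehem", EX + "vocab/locatedIn")
```

Because each link carries a predicate, a program can ask what kind of relationship holds between two things, which the plain `Hyperlink` cannot express.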
  14. The Data Web and Linked Data
      Berners-Lee (2006) outlined a set of rules for publishing data on the Web in a way that all published data becomes part of a single global data space:
      1. Use URIs as names for things.
      2. Use HTTP URIs so that people can look up those names.
      3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
      4. Include links to other URIs, so that they can discover more things.
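Rules 2 and 3 can be illustrated with a prepared HTTP lookup. The sketch below only builds the request and does not send it. It assumes a hypothetical URI on `example.org` and assumes the server would honour an `Accept` header asking for Turtle (`text/turtle`), a common practice that the slide itself does not spell out.

```python
import urllib.request

# Hypothetical HTTP URI that names a thing (rules 1 and 2).
THING_URI = "http://example.org/resource/Bethlehem"


def build_lookup(uri, rdf_format="text/turtle"):
    """Prepare (but do not send) an HTTP GET asking for an RDF description."""
    return urllib.request.Request(uri, headers={"Accept": rdf_format})


request = build_lookup(THING_URI)
# Sending it would be: urllib.request.urlopen(request)  (requires a live server)
```

A server following rule 3 would answer such a request with RDF describing the thing, and rule 4 asks that this description contain further URIs to look up.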
  15. Properties of the Web of Linked Data
      • Anyone can publish data to the Web of Linked Data.
      • Entities are connected by links
        – creating a global data graph that spans data sources and enables the discovery of new data sources.
      • Data is self-describing
        – If an application encounters data represented using an unfamiliar vocabulary, the application can resolve the URIs that identify vocabulary terms in order to find their RDFS or OWL definition.
      • The Web of Data is open
        – meaning that applications can discover new data sources at run-time by following links.
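Discovery of new data sources at run-time by following links can be sketched as a breadth-first traversal. To stay runnable offline, this example replaces real HTTP lookups with a small in-memory dictionary; every host and URI in it is hypothetical.

```python
from collections import deque
from urllib.parse import urlparse

# A tiny in-memory stand-in for the Web: URI -> URIs its description links to.
# All hosts and URIs are hypothetical.
WEB = {
    "http://a.example/Bethlehem": ["http://b.example/place/1", "http://a.example/Palestine"],
    "http://a.example/Palestine": [],
    "http://b.example/place/1": ["http://c.example/map/77"],
    "http://c.example/map/77": [],
}


def discover_sources(seed):
    """Follow links breadth-first and return the hosts reached from `seed`."""
    seen, queue, hosts = {seed}, deque([seed]), set()
    while queue:
        uri = queue.popleft()
        hosts.add(urlparse(uri).netloc)
        for linked in WEB.get(uri, []):  # stands in for dereferencing `uri`
            if linked not in seen:
                seen.add(linked)
                queue.append(linked)
    return hosts


sources = discover_sources("http://a.example/Bethlehem")
```

Starting from a single URI, the application ends up knowing about data sources it was never configured with, which is the sense in which the Web of Data is open.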
  16. Realization: Linking Open Data Project
  17. Realization: Linking Open Data Project
      • Grassroots community effort to:
        – publish existing open-license datasets as Linked Data on the Web,
        – interlink things between different data sources.
      • By September 2010 the cloud had grown to 25 billion RDF triples, interlinked by around 395 million RDF links.
  18. Linking Data
      • How are the same entities, described in different datasets, linked?
      • AGAIN: by linking the global identifier, that is, the URI!
      • Let’s have a look at real examples from real datasets:
        <> owl:sameAs <>
        – Links the entity “Bethlehem” between the DBpedia dataset and the Geonames dataset in the Linking Open Data cloud.
        – This is done by linking the URIs of “Bethlehem” in both datasets using owl:sameAs.
        <> owl:sameAs <>
        – Links the entity “Tim Berners-Lee” between the DBpedia dataset and the DBLP dataset.
        – This is done by linking the URIs of “Tim Berners-Lee” in both datasets using owl:sameAs.
      NOTE: The student is encouraged to visit the URIs specified above.
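What an application can do with an owl:sameAs link is sketched below: it groups the URIs that are declared identical and pools the statements made about them. Since the actual URIs are missing from this transcript, the two dataset URIs, the `label` and `country` properties, and their values are hypothetical stand-ins. Only the owl:sameAs URI itself is the real one.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"  # the real owl:sameAs URI

# Hypothetical stand-ins for two datasets describing the same town.
A = "http://dataset-a.example/Bethlehem"
B = "http://dataset-b.example/place/42"

triples = [
    (A, "http://example.org/vocab/label", "Bethlehem"),
    (B, "http://example.org/vocab/country", "Palestine"),
    (A, OWL_SAME_AS, B),
]


def merge_same_as(triples):
    """Group URIs joined by owl:sameAs and pool their other statements."""
    parent = {}

    def find(uri):
        # Union-find lookup with path halving.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]
            uri = parent[uri]
        return uri

    # First pass: union every pair of URIs linked by owl:sameAs.
    for s, p, o in triples:
        if p == OWL_SAME_AS:
            parent[find(s)] = find(o)

    # Second pass: attach each remaining statement to its group's representative.
    merged = {}
    for s, p, o in triples:
        if p != OWL_SAME_AS:
            merged.setdefault(find(s), set()).add((p, o))
    return merged


entities = merge_same_as(triples)
```

After the merge, the statements from both hypothetical datasets sit under one entity, which is the practical effect of linking two URIs with owl:sameAs.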
  19. Resources
      • (Bethlehem URI in DBpedia)
      • (Bethlehem URI in Geonames)
  20. Applications
      Diagram source: Christian Bizer
      • What can I do with this?
  21. Let’s draw a graph of our example!
      (Graph figure; recoverable labels: v:Person, “George Mousa”, v:nickname “Geno”, v:Address, v:city “Nablus”, “Palestine”.)
  22. References
      • Christian Bizer: The Emerging Web of Linked Data. Presentation at SRI International, Artificial Intelligence Center, Menlo Park, USA, 2009.
      • W3C:
      • Linking Open Data: