Linked Open Data - Seminar 25.04.12


Linked Open Data presentation from seminar 25.04.12.
Seminar on linked data, Vestlandsforsking, 25.04.12

Published in: Technology, Education

  1. 1. Linked Open Data. Rajendra Akerkar
  2. 2. Outline
        • Web evolution
        • Semantic Web
        • Why do we need it?
        • Linked Data paradigm
        • Tools
  3. 3. Web
  4. 4. From Gopher to Super-Mashups
  5. 5. Why do we want to add meaning to data?
      When a computer understands what data means, it can do
        • search,
        • reasoning and
  6. 6. Meaning is about understanding To understand we need a language A language starts with
  7. 7. Things mean something in words Online, we describe things with
  8. 8. Look at my coin collection. The first coin is called “Silver Tram” and is from Armenia. It was made in 1246-47 AD. The second coin is called “Gold Stater of Lahor” and is from India. It was made in 127-151 AD. < ... etc >
  9. 9. <?xml version="1.0" encoding="ISO-8859-1"?> <collection name="My coin collection"> <coin> <title>Silver Tram</title> <country>Armenia</country> <year>1246-47 AD</year> </coin> <coin> <title>Gold Stater of Lahor</title> <country>India</country> <year>127-151 AD</year> </coin> </collection>
  10. 10. We can’t understand words alone. We also need grammar. Online, grammar is RDF (Resource Description Framework).
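The XML on this slide gives the coin descriptions a machine-readable shape, though not yet a meaning. As a minimal sketch, assuming the markup is stored as bytes in a variable (the names `COLLECTION_XML` and `read_coins` are made up for this illustration), Python's standard library can read it back into plain records:

```python
import xml.etree.ElementTree as ET

# The coin collection from the slide, as bytes so the ISO-8859-1
# declaration in the XML prolog is honoured by the parser.
COLLECTION_XML = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<collection name="My coin collection">
  <coin>
    <title>Silver Tram</title>
    <country>Armenia</country>
    <year>1246-47 AD</year>
  </coin>
  <coin>
    <title>Gold Stater of Lahor</title>
    <country>India</country>
    <year>127-151 AD</year>
  </coin>
</collection>"""


def read_coins(xml_bytes):
    """Return one dict per <coin> element, keyed by child tag name."""
    root = ET.fromstring(xml_bytes)
    return [
        {child.tag: child.text for child in coin}
        for coin in root.findall("coin")
    ]


coins = read_coins(COLLECTION_XML)
for coin in coins:
    print(coin["title"], "is from", coin["country"])
```

The parser recovers the words (title, country, year) but knows nothing about what "country" means; that is the gap the following slides address with RDF.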
  11. 11. This coin is from
  12. 12. predicate subject object This coin is from
  13. 13. With RDF Schema we can define concepts and make simple relations between
  14. 14. This coin is from India, hence from South
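The inference on this slide can be sketched with triples held as plain Python tuples. This is an illustration only, not RDF Schema itself: the predicate names `isFrom` and `partOf` are hypothetical, and "South Asia" is an assumption because the slide text is cut off after "South".

```python
# Each fact is a (subject, predicate, object) triple.
TRIPLES = {
    ("coin:GoldStaterOfLahor", "isFrom", "India"),
    ("India", "partOf", "South Asia"),  # assumption: slide text ends at "South"
    ("South Asia", "partOf", "Asia"),   # assumption, added to show chaining
}


def infer_origins(triples):
    """Add (x, isFrom, region) whenever x isFrom a place that is partOf region.

    The loop repeats until no new triple appears, so chains of partOf
    links are followed to the end.
    """
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            if p != "isFrom":
                continue
            for s2, p2, o2 in list(inferred):
                new_fact = (s, "isFrom", o2)
                if p2 == "partOf" and s2 == o and new_fact not in inferred:
                    inferred.add(new_fact)
                    changed = True
    return inferred


closure = infer_origins(TRIPLES)
print(sorted(t for t in closure if t[1] == "isFrom"))
```

A real RDFS reasoner works on class and property hierarchies (rdfs:subClassOf, rdfs:subPropertyOf); the fixed-point loop above only mirrors the general idea of deriving new statements from stated ones.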
  15. 15. But RDF Schema is limited. A language needs more expression and logic to make good reasoning possible. That’s why OWL (the Web Ontology Language) was
  16. 16. Next, to reason you need
  17. 17. I got this coin from my
  18. 18. [Diagram labels: I; mother or father; father; son of] The rule for calling someone my grandfather is that one of my parents has a father.
  19. 19. Rules are formulated in Rule
  20. 20. <ruleml:imp> <ruleml:_rlab ruleml:href="#example1"/> <ruleml:_body> <swrlx:individualPropertyAtom swrlx:property="hasParent"> <ruleml:var>x1</ruleml:var> <ruleml:var>x2</ruleml:var> </swrlx:individualPropertyAtom> <swrlx:individualPropertyAtom swrlx:property="hasFather"> <ruleml:var>x2</ruleml:var> <ruleml:var>x3</ruleml:var> </swrlx:individualPropertyAtom> </ruleml:_body> <ruleml:_head> <swrlx:individualPropertyAtom swrlx:property="hasGrandfather"> <ruleml:var>x1</ruleml:var> <ruleml:var>x3</ruleml:var> </swrlx:individualPropertyAtom> </ruleml:_head> </ruleml:imp>
  21. 21. So: words in XML; grammar in RDF (Schema) and OWL; rules in RL. There are a lot of things that can be described using standard
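Read as logic, the rule on this slide says: hasParent(x1, x2) and hasFather(x2, x3) imply hasGrandfather(x1, x3). A minimal Python sketch of applying that one rule to a set of facts follows; the individuals (`me`, `my_mother`, `grandpa`, `my_father`) are invented for the example, and this is not how a RuleML or SWRL engine is implemented.

```python
def apply_grandfather_rule(facts):
    """Derive hasGrandfather(x1, x3) from hasParent(x1, x2) and hasFather(x2, x3).

    Facts are (property, subject, object) tuples.
    """
    derived = set()
    for prop1, x1, x2 in facts:
        if prop1 != "hasParent":
            continue
        for prop2, parent, x3 in facts:
            # The shared variable x2 must bind to the same individual.
            if prop2 == "hasFather" and parent == x2:
                derived.add(("hasGrandfather", x1, x3))
    return derived


FACTS = {
    ("hasParent", "me", "my_mother"),
    ("hasFather", "my_mother", "grandpa"),
    ("hasFather", "me", "my_father"),
}

grandfathers = apply_grandfather_rule(FACTS)
print(grandfathers)
```

Only the maternal grandfather is derived here, because the facts state `hasFather(me, my_father)` but not `hasParent(me, my_father)`; a fuller rule set would also say that every father is a parent.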
  22. 22. Suppose, I want to search for a specific
  23. 23. “I want all the golden coins, designed in Asia, but used in Europe, between 1958 and 1989”
  24. 24. We can use SPARQL (SPARQL Protocol and RDF Query Language)
  25. 25. Because the Web is decentralized and data lives in many places, language is not the only thing that matters. Exchange of data between different databases for knowledge creation is an ultimate
  26. 26. To make a connection, a machine needs a source. For this, we use resource identifiers. The best-known resource identifier is the URI, which can act as a name (URN), as a location (URL), or as both.
  27. 27. URI URL URN Gold Stater of Lahor
  28. 28. With all this background we are capable of using the power of all different data resources on the
  29. 29. Linked Data vs. Semantic Web
        • The Semantic Web, or the Web of Data, is the ultimate goal
        • Linked Data provides the means to reach that goal
        • Linked Data helps build the Web of Data that later can be exploited by more advanced technologies such as intelligent
  30. 30. Linked Data vs. Linked Open
  31. 31. Databases store data to answer questions (1)
      Persons:
        • How old is Rajendra?
        • Where does Rajendra work?
        • What is Rajendra interested in?
      Organisations:
        • When was VF founded?
        • Where is VF located?
        • What can VF do for me?
  32. 32. Databases store data to answer questions (2)
      Persons:
        • Rajendra is .. years old.
        • Rajendra works in Sogndal.
        • Rajendra is interested in Linked Data.
        name      date_birth  work_place  interests
        Rajendra  08-08       Sogndal     Linked Data
        Svein     ….          ….          ….
      Organisations:
        • VF was founded 27 years ago.
        • VF is located in Norway.
        • VF offers IT-Consulting & Research.
        organisation  date_founded  location  services
        VF            1985          Norway    IT-Consulting & Research
        nLink         ….            ….        ….
  33. 33. Data from databases can be exposed to the Web via HTML
  34. 34. Data from databases can be accessed via APIs: getWorkplace("Rajendra") returns <workPlace>Sogndal</workPlace>; getLocation("VF") returns <location>Norway</location>
  35. 35. (Some) information on the Web can be found via search engines such as Google, but questions won't be answered
  36. 36. But how to get answers on complex questions? (1) Who is interested in „Linked Data“ and works in the country where VF is located?
  37. 37. But how to get answers on complex questions? (2) Who is interested in „Linked Data“ and works in the country where VF is located?
      Persons:
        name      date_birth  work_place  interests
        Rajendra  08-08       Sogndal     Linked Data
        Svein     ….          ….          ….
      Organisations:
        organisation  date_founded  location  services
        VF            1985          Norway    IT-Consulting & Research
        nLink         ….            ….        ….
      Are work_place and location the same thing? Are Sogndal and Norway in the same country? Still no answer.
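The dead end on this slide can be reproduced with an in-memory SQLite database. This is a sketch under assumptions: the table names `persons` and `organisations` are chosen here, and only the rows visible on the slide are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE persons (
    name TEXT, date_birth TEXT, work_place TEXT, interests TEXT
);
CREATE TABLE organisations (
    organisation TEXT, date_founded INTEGER, location TEXT, services TEXT
);
INSERT INTO persons VALUES ('Rajendra', '08-08', 'Sogndal', 'Linked Data');
INSERT INTO organisations VALUES ('VF', 1985, 'Norway', 'IT-Consulting & Research');
""")

# Naive attempt: treat work_place and location as if they were comparable.
rows = conn.execute("""
SELECT p.name
FROM persons AS p
JOIN organisations AS o ON p.work_place = o.location
WHERE p.interests = 'Linked Data' AND o.organisation = 'VF'
""").fetchall()

print(rows)  # the join finds nothing: 'Sogndal' is not equal to 'Norway'
```

The join compares a town name with a country name, so it returns no rows. Nothing in either table says that Sogndal lies in Norway, which is the missing link the later slides supply.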
  38. 38. Is Mapping the solution?
      Mapped: work_place and location. But are Sogndal and Norway in the same country? Still not clear.
      And what if we need to add another database?
      Students:
        name      date_birth  university  course
        Rajendra  08-08       NTNU        Computer Science
        Svein     ….          ….          ….
      What if DB owners can't agree on a common model?
  39. 39. Mapping is no solution for a distributed Web of
  40. 40. Before I come up with a solution, let us understand four simple
  41. 41. Resources [Diagram: Norway isA place (type); Sogndal isA work_place (type); Sogndal partOf Norway]
  42. 42. URIs & Namespaces [Diagram labels: dbpedia:Norway, dbpedia:Sogndal, umbel:place, geo:point, geonames:country; links: rdf:type, rdfs:subClassOf, p:subdivisionName] Prefixed names such as dbpedia:Sogndal and rdfs:subClassOf abbreviate full URIs. A namespace is an abstract container or environment created to hold a logical grouping of unique identifiers or
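Prefixed names like the ones on this slide are shorthand: the part before the colon stands for a namespace URI, and the part after it is appended. A minimal sketch of that expansion follows. The prefix table is an assumption for illustration (the slide's own expansions did not survive in this transcript), using the well-known RDF, RDFS and DBpedia namespaces; `expand` is a made-up helper name.

```python
# Assumed prefix table; extend it with whatever namespaces a dataset declares.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dbpedia": "http://dbpedia.org/resource/",
}


def expand(prefixed_name, prefixes=PREFIXES):
    """Turn 'prefix:local' into a full URI using the prefix table."""
    prefix, separator, local = prefixed_name.partition(":")
    if not separator or prefix not in prefixes:
        raise KeyError(f"unknown or missing prefix in {prefixed_name!r}")
    return prefixes[prefix] + local


print(expand("dbpedia:Sogndal"))
print(expand("rdfs:subClassOf"))
```

Because two datasets can bind different prefixes to the same namespace, tools compare the expanded URIs, never the short forms.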
  43. 43. Ontologies [Diagram labels: place, work_place, location, Norway, Sogndal, Person, Organisation; links: type, isA, partOf, has, worksFor, studiesAt]
  44. 44. What if each resource (classes and individuals) had a URI?
  45. 45. Expose data from databases as resources & triples on the Web
      Persons:
        (resource)        foaf:name  foaf:birthday  foaf:based_near  foaf:topic_interest
        persons:Rajendra  Rajendra   08-08          dbpedia:Sogndal  dbpedia:LinkedData
        persons:Svein     Svein      ….             ….               ….
      Organisations:
        (resource)  foaf:name  foaf:birthday  foaf:based_near  orgs:services
        orgs:VF     VF         1985           dbpedia:Norway   IT-Consulting & Research
        orgs:nLink  nLink      ….             ….               ….
      Resulting links: persons:Rajendra foaf:based_near dbpedia:Sogndal; orgs:VF foaf:based_near dbpedia:Norway
  46. 46. Link data and do queries all over the Web
      Triples: persons:Rajendra foaf:topic_interest dbpedia:LinkedData; persons:Rajendra foaf:based_near dbpedia:Sogndal; dbpedia:Sogndal p:subdivisionName dbpedia:Norway; orgs:VF foaf:based_near dbpedia:Norway
      Who is interested in „Linked Data“ and works in the country where VF is located?
  47. 47. Link data from more than 40 datasets. Make use of more than 2 billion triples!
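Once the link between Sogndal and Norway exists as a triple, the question on this slide becomes a pattern-matching problem, which is what a SPARQL engine solves. The following is a toy matcher written for this illustration, not a SPARQL implementation; terms starting with `?` are variables, and the function names are invented.

```python
TRIPLES = [
    ("persons:Rajendra", "foaf:topic_interest", "dbpedia:LinkedData"),
    ("persons:Rajendra", "foaf:based_near", "dbpedia:Sogndal"),
    ("dbpedia:Sogndal", "p:subdivisionName", "dbpedia:Norway"),
    ("orgs:VF", "foaf:based_near", "dbpedia:Norway"),
]


def match(pattern, triple, binding):
    """Unify one pattern with one triple; return the extended binding or None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def query(patterns, triples):
    """Return every variable binding that satisfies all patterns together."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


answers = query(
    [
        ("?who", "foaf:topic_interest", "dbpedia:LinkedData"),
        ("?who", "foaf:based_near", "?place"),
        ("?place", "p:subdivisionName", "?country"),
        ("orgs:VF", "foaf:based_near", "?country"),
    ],
    TRIPLES,
)
print([answer["?who"] for answer in answers])
```

The shared variable `?country` is what joins the person's data with the organisation's data; no schema mapping between the two source databases was needed.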
  48. 48. The Linking Open Data cloud diagram Link data from more than 295 datasets Last updated: 2011-09-19
  49. 49. How to get answers on really complex questions?
      Triples: persons:Rajendra foaf:topic_interest dbpedia:LinkedData; persons:Rajendra foaf:based_near dbpedia:Sogndal; dbpedia:Sogndal p:subdivisionName dbpedia:Norway; orgs:VF foaf:based_near dbpedia:Norway; dbpedia:Norway owl:sameAs Scandinavia:Norge; Scandinavia:Norge Scandinavia:unemployment_rate_total 3.6
      Who is interested in „Linked Data“ and is working in a country where the unemployment rate is lower than 4%?
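The extra ingredient on this slide is owl:sameAs, which states that two URIs from different datasets name the same thing. A rough sketch of using such links while answering the question follows; the helper names are invented, the prefix is spelled `Scandinavia:` throughout, and a real reasoner would treat owl:sameAs far more generally.

```python
TRIPLES = [
    ("persons:Rajendra", "foaf:topic_interest", "dbpedia:LinkedData"),
    ("persons:Rajendra", "foaf:based_near", "dbpedia:Sogndal"),
    ("dbpedia:Sogndal", "p:subdivisionName", "dbpedia:Norway"),
    ("dbpedia:Norway", "owl:sameAs", "Scandinavia:Norge"),
    ("Scandinavia:Norge", "Scandinavia:unemployment_rate_total", 3.6),
]


def objects(subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def same_as(resource):
    """The resource plus everything reachable through owl:sameAs links."""
    aliases = {resource}
    frontier = [resource]
    while frontier:
        current = frontier.pop()
        for s, p, o in TRIPLES:
            if p == "owl:sameAs" and current in (s, o):
                for other in (s, o):
                    if other not in aliases:
                        aliases.add(other)
                        frontier.append(other)
    return aliases


def interested_with_low_unemployment(topic, limit):
    """People interested in topic who are based in a country below the limit."""
    people = []
    for who, predicate, obj in TRIPLES:
        if predicate != "foaf:topic_interest" or obj != topic:
            continue
        for place in objects(who, "foaf:based_near"):
            for country in objects(place, "p:subdivisionName"):
                rates = [
                    rate
                    for alias in same_as(country)
                    for rate in objects(alias, "Scandinavia:unemployment_rate_total")
                ]
                if any(rate < limit for rate in rates):
                    people.append(who)
    return people


result = interested_with_low_unemployment("dbpedia:LinkedData", 4.0)
print(result)
```

Without the owl:sameAs triple the statistics dataset and the DBpedia data would stay disconnected, and the same call would return an empty list.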
  50. 50. New way to get knowledge and answers — not by searching the web, but by doing dynamic computations based on a vast collection of data, algorithms, and methods
  51. 51. Comprehensive Knowledge Archive Network Open Knowledge Foundation Licensed under the Open
  52. 52. A collaboration between: Norwegian Press Association, Association of Norwegian Editors, Norwegian Union of Journalists, and Department of Journalism
  53. 53. Linked data ... publishing data on the Web ... ... to enable integration, linking and reuse across
  54. 54. Six Steps to Publishing Linked Data 1. Understand the Principles 2. Model Your Data 3. Choose URIs for Things in your Data 4. Set Up Your Infrastructure 5. Link to other Data Sets 6. Describe and Publicise your
  55. 55. Can’t we just publish data as files?
        • PDF: easy to read and publish
        • Excel: allows further processing and analysis
        • CSV: processing without need for proprietary tools
      But ...
        • structure of data not explained
        • no connection between different data sets, silos
        • static and fixed: can’t retrieve just slices relevant to
  56. 56. Linked data: apply the principles of the Web to publication of data. The Web:
        • is a global network of pages
        • each identified by a URL
        • fetching a URL gives a document
        • pages connected by links
        • open, anyone can say anything about anything
  57. 57. Linked data: apply the principles of the Web to publication of data. The linked data web:
        • is a global network of things
        • each identified by a URI
        • fetching a URI gives a set of statements
        • things connected by typed links
        • open, anyone can say anything about anything else
      Linked data is “data you can click on”
  58. 58. Linked Data - Paradigm
        • Use URIs as names for things
        • Use HTTP URIs so that people can look up those names
        • When someone looks up a URI, provide useful information
        • Include links to other URIs, so that they can discover more
  59. 59. LOD Benefits
        • other humans and applications can
            • easily access your data using Web technologies
            • follow the links in order to obtain further contextual information
        • links to your data and search engine indices can increase the visibility of your
  60. 60. JSON-LD - JSON for Linking Data
        • JSON-LD (JavaScript Object Notation for Linking Data) is a lightweight Linked Data format that gives your data context.
        • It is easy for humans to read and write. It is easy for machines to parse and generate.
        • It is based on the already successful JSON format and provides a way to help JSON data interoperate at Web scale.
        • If you are already familiar with JSON, writing JSON-LD is very easy.
        • These properties make JSON-LD an ideal Linked Data interchange language for JavaScript environments, Web services, and unstructured databases such as CouchDB and MongoDB.
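Looking up an HTTP URI, as the second and third principles describe, is an ordinary HTTP request; a client that wants data rather than a web page commonly says so in the Accept header. The sketch below only builds such a request and does not send it. The DBpedia URI and the `application/rdf+xml` media type are assumptions chosen for illustration, and `linked_data_request` is a made-up helper name.

```python
from urllib.request import Request


def linked_data_request(uri, media_type="application/rdf+xml"):
    """Build (but do not send) a GET request asking for an RDF representation."""
    return Request(uri, headers={"Accept": media_type})


request = linked_data_request("http://dbpedia.org/resource/Sogndal")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual lookup; whether a server answers with RDF, HTML, or a redirect depends on how the publisher has set up its infrastructure.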
  61. 61. This RDF model in standard XML notation: <rdf:RDF xmlns:rdf=" rdf-syntax-ns#" xmlns:dc=" /"> <rdf:Description rdf:about="/wiki/Tony_Benn"> <dc:title>Tony Benn</dc:title> <dc:publisher>Wikipedia</dc:publisher> </rdf:Description> </rdf:RDF>
  62. 62. is written in JSON-LD like this: { "@context": { "title": "", "publisher": " r" }, "@id": "/wiki/Tony_Benn", "title": "Tony Benn", "publisher": "Wikipedia" } A context is used to allow developers to use aliases for
  63. 63. JSON-LD object
        • An Internationalized Resource Identifier (IRI) is a mechanism for representing unique identifiers on the web.
        • In Linked Data, IRIs (or URI references) are commonly used for describing entities and properties.
      { "a": "Person", "name": "Manu Sporny", "homepage": "", "avatar": " anusporny" }
  64. 64. Unambiguous Identifiers for JSON
        • If a set of terms, like Person, name, and homepage, is defined in a context, and that context is used to resolve the names in JSON objects, machines could automatically expand the terms to something meaningful and unambiguous
      { " ns#type": "", "": "Manu Sporny", "": "", "": " y" }
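The expansion described on this slide can be sketched as a dictionary lookup: each short term is replaced by the IRI that the context assigns to it. The IRIs below are hypothetical placeholders under `vocab.example.org` and `example.org`, not the ones from the original slide (which did not survive in this transcript), and the function covers only the simplest case, not the full JSON-LD expansion algorithm.

```python
# Hypothetical context: short term -> full IRI.
CONTEXT = {
    "name": "http://vocab.example.org/name",
    "homepage": "http://vocab.example.org/homepage",
}

person = {
    "name": "Manu Sporny",
    "homepage": "http://example.org/manu",  # placeholder value
    "nickname": "manu",                     # deliberately not in the context
}


def expand_terms(document, context):
    """Replace every key that the context defines by its full IRI."""
    return {context.get(key, key): value for key, value in document.items()}


expanded = expand_terms(person, CONTEXT)
print(expanded)
```

Two documents that use different short names but map them to the same IRI expand to the same keys, which is what makes the data unambiguous for machines.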
  65. 65. JSON-LD Example
        • Let's start by building up a fictitious bike store called "Links Bike Shop". We've already got our bike store setup at and are using linked data principles.
        • Here's some of the URLs:
            • The home page of the store.
            • A chain product.
            • A chain lube
  66. 66. We want to start creating some linked data for this fictitious store and start with rough JSON data on the store itself. { "@id": "", "@type": "Store", "name": "Links Bike Shop", "description": "The most \"linked\" bike store on earth!" }
  67. 67. Next, let's create some rough data for our two premier products { "@id": " swift-chain", "@type": "Product", "name": "Links Swift Chain", "description": "A fine chain with many links.", "category": [" ts", " ns"], "price": "10.00", "stock": 10 }
  68. 68. { "@id": " speedy-lube", "@type": "Product", "name": "Links Speedy Lube", "description": "Lubricant for your chain links.", "category": [" es", " ns"], "price": "5.00", "stock": 20 }
  69. 69. To make this into a full JSON-LD document we combine the data, add a @context, and adjust some values. { "@id": "", "@type": "Store", "name": "Links Bike Shop", "description": "The most \"linked\" bike store on earth!", "product": [ ...
  70. 70. ], "@context": { "Store": "", "Product": "", "product": "", "category": { "@id": "", "@type": "@id" }, "price": "", "stock": "", "name": "", "description": "", "p": "", "cat": "" } }
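The combination step from the last two slides can be sketched in Python, where `json.dumps` also takes care of escaping the inner quotes in the store description. Every URL below is a hypothetical placeholder under `example.com` or `example.org`, since the original ones are missing from this transcript, and the context is shortened to three terms.

```python
import json

STORE_URL = "http://store.example.com/"  # hypothetical store address

chain = {
    "@id": STORE_URL + "products/links-swift-chain",  # hypothetical path
    "@type": "Product",
    "name": "Links Swift Chain",
    "description": "A fine chain with many links.",
    "price": "10.00",
    "stock": 10,
}

lube = {
    "@id": STORE_URL + "products/links-speedy-lube",  # hypothetical path
    "@type": "Product",
    "name": "Links Speedy Lube",
    "description": "Lubricant for your chain links.",
    "price": "5.00",
    "stock": 20,
}

store = {
    "@context": {  # hypothetical vocabulary IRIs
        "name": "http://vocab.example.org/name",
        "description": "http://vocab.example.org/description",
        "product": "http://vocab.example.org/product",
    },
    "@id": STORE_URL,
    "@type": "Store",
    "name": "Links Bike Shop",
    "description": 'The most "linked" bike store on earth!',
    "product": [chain, lube],
}

document = json.dumps(store, indent=2)
print(document)
```

Generating the document from data structures, rather than writing the JSON by hand, avoids syntax slips such as unescaped quotes or missing commas.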
  71. 71. Publishing Solutions and Tools: Triplify
        • Goal: expose semantics available in RDBMS as simply as possible
        • Available for most popular Web app languages
            • PHP (ready), Ruby/Python (under dev.)
        • Works with most popular Web app databases
            • MySQL, PHP-PDO DBs (SQLite, Oracle, DB2, MS SQL, PostgreSQL)
  72. 72. Virtuoso RDF Views
        • transforms the result of SQL SELECT statements into RDF
        • mapping steps
            • define RDFS class IRIs for each table
            • define construction of subject IRIs from primary key column values
            • define construction of predicate IRIs from each non-key
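The three mapping steps listed above can be illustrated with SQLite and plain tuples. This is not Virtuoso's API or mapping language, only a sketch of the idea; the base IRI `http://example.org/`, the IRI layout, and the function name `table_to_triples` are assumptions made for the example.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical base IRI


def table_to_triples(conn, table, key):
    """Map every row of a table to triples.

    - class IRI: one per table
    - subject IRI: built from the primary key value
    - predicate IRI: one per non-key column
    The table name is interpolated into the SQL, so pass trusted names only.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    class_iri = f"{BASE}schema/{table}"
    triples = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key]}"
        triples.append((subject, "rdf:type", class_iri))
        for column, value in record.items():
            if column != key and value is not None:
                triples.append((subject, f"{BASE}schema/{table}#{column}", value))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (name TEXT PRIMARY KEY, work_place TEXT)")
conn.execute("INSERT INTO persons VALUES ('Rajendra', 'Sogndal')")

triples = table_to_triples(conn, "persons", key="name")
for triple in triples:
    print(triple)
```

In this naive form the object "Sogndal" is still a plain string; linking it to a shared URI such as a DBpedia resource is the separate step of connecting to other data sets.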
  73. 73. Marrying DBs with RDF & Ontologies
                    Relational Databases                RDF & Ontologies
        Data model  Relational (tables, columns, rows)  Triples (subject, predicate, object)
      Further comparison criteria: schema and data separation; implicit information; scalability; schema flexibility; Web data integration readiness
      Using DBs for storage and querying of RDF & ontologies. Publishing DB content as
  74. 74. DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link other data sets on the Web to Wikipedia data. The DBpedia knowledge base currently describes more than 2.6 million things, including at least 213,000 persons, 328,000 places, 57,000 music albums, 36,000 films, 20,000 companies. The knowledge base consists of 274 million pieces of information (RDF triples). DBpedia and all other linked data is searchable with SPARQL
  75. 75. OpenStreetMap: OpenStreetMap is a free editable map of the whole world. It is made by people like you. OpenStreetMap allows you to view, edit and use geographical data in a collaborative way from anywhere on Earth. GeoNames: The GeoNames geographical database is available for download free of charge under a Creative Commons Attribution license. It contains over eight million geographical names and consists of 6.5 million unique features.
  76. 76. Creating Open Data
        • Public Domain: only after the expiration of copyright
        • Science Commons protocol for open data
        • Creative Commons Zero
        • Public Domain Dedication & License with Community Norms
            o Avoid technical protection measures
            o Give credit where credit’s due
            o Use open formats
            o Let others know!
            o Share your work too!
      Photo by suttonhoo @ Flickr, CC
  77. 77. Examples
  78. 78. The road to open knowledge begins here! Thank you!