Linked Open Data - Seminar 25.04.12

Linked Open Data presentation from seminar 25.04.12.
Seminar on linked data, Vestlandsforsking, 25.04.12

Published in: Technology, Education


  1. Linked Open Data. Rajendra Akerkar, rak@vestforsk.no
  2. Outline
     • Web evolution
     • Semantic Web
     • Why we need it?
     • Linked Data Paradigm
     • Tools
     • JSON-LD
     www.vestforsk.no
  3. Web Evolution
  4. From Gopher to Super-Mashups: http://reegle.info/countries
  5. Why do we want to add meaning to data? When a computer understands what data means, it can do
     • search,
     • reasoning and
     • combining
  6. Meaning is about understanding. To understand we need a language. A language starts with words.
  7. Things mean something in words. Online, we describe things with XML.
  8. Look at my coin collection. The first coin is called “Silver Tram” and is from Armenia. It was made in 1246-47 AD. The second coin is called “Gold Stater of Lahor” and is from India. It was made in 127-151 AD. < ... etc >
  9. <?xml version="1.0" encoding="ISO-8859-1"?>
     <collection name="My coin collection">
       <coin>
         <title>Silver Tram</title>
         <country>Armenia</country>
         <year>1246-47 AD</year>
       </coin>
       <coin>
         <title>Gold Stater of Lahor</title>
         <country>India</country>
         <year>127-151 AD</year>
       </coin>
     </collection>
  10. We can’t understand words alone. We also need grammar. Online, the grammar is RDF (Resource Description Framework).
  11. This coin is from India
  12. This coin (subject) is from (predicate) India (object)
  13. With RDF Schema we can define concepts and make simple relations between them
  14. This coin is from India, hence from South Asia
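The subject-predicate-object statement and the step from India to South Asia can be sketched in plain Python. This is only an illustration: the identifiers `coin:goldStater`, `ex:isFrom` and `ex:partOf` are hypothetical, and the `partOf` step stands in for the kind of simple relation a schema lets you declare. It is not an RDFS reasoner.

```python
# Each statement is a (subject, predicate, object) triple.
# All identifiers below are hypothetical, chosen for illustration.
triples = {
    ("coin:goldStater", "ex:isFrom", "place:India"),
    ("place:India", "ex:partOf", "place:SouthAsia"),
}


def infer_origins(facts):
    """Add 'X isFrom B' whenever 'X isFrom A' and 'A partOf B' hold."""
    inferred = set(facts)
    changed = True
    while changed:  # repeat until no new triple appears
        changed = False
        for (s, p, o) in list(inferred):
            if p != "ex:isFrom":
                continue
            for (s2, p2, o2) in list(inferred):
                new = (s, "ex:isFrom", o2)
                if p2 == "ex:partOf" and s2 == o and new not in inferred:
                    inferred.add(new)
                    changed = True
    return inferred


all_facts = infer_origins(triples)
```

After the call, `all_facts` holds the two stated triples plus the derived statement that the coin is from South Asia.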
  15. But RDF Schema is limited. A language needs more expression and logic to make good reasoning possible. That’s why OWL (the Web Ontology Language) was invented.
  16. Next, to reason you need rules
  17. I got this coin from my grandfather.
  18. [Diagram: I am the son of my mother or father, who has a father.] The rule for calling someone my grandfather is that one of my parents has a father.
  19. Rules are formulated in a rule language
  20. <ruleml:imp>
       <ruleml:_rlab ruleml:href="#example1"/>
       <ruleml:_body>
         <swrlx:individualPropertyAtom swrlx:property="hasParent">
           <ruleml:var>x1</ruleml:var>
           <ruleml:var>x2</ruleml:var>
         </swrlx:individualPropertyAtom>
         <swrlx:individualPropertyAtom swrlx:property="hasFather">
           <ruleml:var>x2</ruleml:var>
           <ruleml:var>x3</ruleml:var>
         </swrlx:individualPropertyAtom>
       </ruleml:_body>
       <ruleml:_head>
         <swrlx:individualPropertyAtom swrlx:property="hasGrandfather">
           <ruleml:var>x1</ruleml:var>
           <ruleml:var>x3</ruleml:var>
         </swrlx:individualPropertyAtom>
       </ruleml:_head>
     </ruleml:imp>
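To make the rule concrete, here is a minimal Python sketch of what applying it amounts to: whenever hasParent(x1, x2) and hasFather(x2, x3) both hold, conclude hasGrandfather(x1, x3). The individuals `me`, `myMother` and `myGrandfather` are hypothetical, and a real rule engine would do far more than this single join.

```python
# Facts as (subject, property, object); the individuals are hypothetical.
facts = {
    ("me", "hasParent", "myMother"),
    ("myMother", "hasFather", "myGrandfather"),
}


def apply_grandfather_rule(facts):
    """hasParent(x1, x2) and hasFather(x2, x3) imply hasGrandfather(x1, x3)."""
    derived = set()
    for (x1, p1, x2) in facts:
        if p1 != "hasParent":
            continue
        for (y, p2, x3) in facts:
            if p2 == "hasFather" and y == x2:
                derived.add((x1, "hasGrandfather", x3))
    return derived


grandfathers = apply_grandfather_rule(facts)
```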
  21. So:
     • Words in XML
     • Grammar in RDF (Schema) and OWL
     • Rules in a rule language (RL)
     There are a lot of things that can be described using standard formats.
  22. Suppose I want to search for a specific coin
  23. “I want all the gold coins designed in Asia but used in Europe between 1958 and 1989”
  24. We can use SPARQL (SPARQL Protocol and RDF Query Language)
  25. Because the Web is decentralized and data is in many places, language is not the only thing that matters. Exchanging data between different databases to create knowledge is the ultimate goal.
  26. To make a connection, a machine needs a source. For this, we use resource identifiers. The best-known resource identifier is the URI, which can act as a name (URN), as a location (URL), or as both.
  27. URI for the Gold Stater of Lahor: URL http://www.mycollection.in/ plus URN goldStater
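The slide's picture of an identifier as a location plus a name can be explored with Python's standard `urllib.parse`. The combined string `http://www.mycollection.in/goldStater` is an assumption about how the two parts on the slide would be joined, and the ISBN URN is just a commonly used illustration of a name without a location.

```python
from urllib.parse import urlsplit

# Hypothetical identifier: the slide's location joined with its name.
coin_uri = "http://www.mycollection.in/goldStater"
parts = urlsplit(coin_uri)
location = f"{parts.scheme}://{parts.netloc}/"  # where to look
name = parts.path.lstrip("/")                   # what is identified

# A URN names a thing without saying where to fetch it,
# so it has a scheme but no network location.
urn_parts = urlsplit("urn:isbn:0451450523")
```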
  28. With all this background, we are able to use the power of all the different data resources on the Web
  29. Linked Data vs. Semantic Web
     • The Semantic Web, or the Web of Data, is the ultimate goal
     • Linked Data provides the means to reach that goal
     • Linked Data helps build the Web of Data that later can be exploited by more advanced technologies such as intelligent agents
  30. Linked Data vs. Linked Open Data
  31. Databases store data to answer questions (1)
     Persons:
     • How old is Rajendra?
     • Where does Rajendra work?
     • What is Rajendra interested in?
     Organisations:
     • When was VF founded?
     • Where is VF located?
     • What can VF do for me?
  32. Databases store data to answer questions (2)
     Persons:
     • Rajendra is .. years old.
     • Rajendra works in Sogndal.
     • Rajendra is interested in Linked Data.
       name     | date_birth | work_place | interests
       Rajendra | 08-08      | Sogndal    | Linked Data
       Svein    | ….         | ….         | ….
     Organisations:
     • VF was founded 27 years ago.
     • VF is located in Norway.
     • VF offers IT-Consulting & Research.
       organisation | date_founded | location | services
       VF           | 1985         | Norway   | IT-Consulting & Research
       nLink        | ….           | ….       | ….
  33. Data from databases (Persons, Organisations) can be exposed to the Web via HTML
  34. Data from databases can be accessed via APIs
     • Persons: getWorkplace("Rajendra") returns <workPlace>Sogndal</workPlace>
     • Organisations: getLocation("VF") returns <location>Norway</location>
  35. (Some) information on the Web can be found via search engines such as Google. Questions won’t necessarily be answered.
  36. But how to get answers to complex questions? (1) Who is interested in “Linked Data” and is working in the same country where VF is located?
  37. But how to get answers to complex questions? (2) Who is interested in “Linked Data” and is working in the same country where VF is located?
     Persons:
       name     | date_birth | work_place | interests
       Rajendra | 08-08      | Sogndal    | Linked Data
       Svein    | ….         | ….         | ….
     Organisations:
       organisation | date_founded | location | services
       VF           | 1985         | Norway   | IT-Consulting & Research
       nLink        | ….           | ….       | ….
     • work_place and location: same thing?
     • Sogndal and Norway: same country?
     Still no answer.
  38. Is mapping the solution?
     • Mapped: work_place and location.
     • Sogndal and Norway: same country? Still not clear.
     • And what if we need to add another database?
     Students:
       name     | date_birth | university | course
       Rajendra | 08-08      | NTNU       | Computer Science
       Svein    | ….         | ….         | ….
     • What if DB owners can’t agree on a common model?
  39. Mapping is no solution for a distributed Web of data
  40. Before I come up with a solution, let us understand four simple things
  41. Resources [diagram; node and edge labels: place, type, Norway, isA, isA, type, partOf, work_place, Sogndal, location]
  42. URIs & Namespaces [diagram; node and edge labels: rdf:type, umbel:place, dbpedia:Norway, rdfs:subClassOf, p:subdivisionName, rdf:type, rdfs:subClassOf, geo:point, dbpedia:Sogndal, geonames:country]
     • dbpedia:Sogndal = http://dbpedia.org/resource/Sogndal
     • rdfs:subClassOf = http://www.w3.org/2000/01/rdf-schema#subClassOf
     A namespace is an abstract container or environment created to hold a logical grouping of unique identifiers or symbols.
  43. Ontologies [diagram; node and edge labels: place, type, Norway, isA, isA, type, partOf, work_place, Sogndal, location, has, has, Person, worksFor, Organisation, isA, studiesAt, University]
  44. What if each resource (classes and individuals) had a URI?
  45. Expose data from databases as resources & triples on the Web
     • persons:Rajendra foaf:based_near dbpedia:Sogndal
     • orgs:VF foaf:based_near dbpedia:Norway
     Persons:
       resource         | foaf:name | foaf:birthday | foaf:based_near | foaf:topic_interest
       persons:Rajendra | Rajendra  | 08-08         | dbpedia:Sogndal | dbpedia:LinkedData
       persons:Svein    | Svein     | ….            | ….              | ….
     Organisations:
       resource   | foaf:name | foaf:birthday | foaf:based_near | orgs:services
       orgs:VF    | VF        | 1985          | dbpedia:Norway  | IT-Consulting & Research
       orgs:nLink | nLink     | ….            | ….              | ….
  46. Link data and do queries all over the Web. Who is interested in “Linked Data” and is working in the same country where VF is located?
     • persons:Rajendra foaf:topic_interest dbpedia:LinkedData
     • persons:Rajendra foaf:based_near dbpedia:Sogndal
     • orgs:VF foaf:based_near dbpedia:Norway
     • dbpedia:Sogndal p:subdivisionName dbpedia:Norway
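Once the data is linked, the question on this slide becomes a join over triples. The sketch below answers it in plain Python over the four statements from the diagram. It stands in for what a SPARQL engine would do and makes one simplifying assumption: a place is treated as its own country unless a `p:subdivisionName` link says otherwise.

```python
# Triples from the slide; prefixes are kept as plain strings.
triples = [
    ("persons:Rajendra", "foaf:based_near", "dbpedia:Sogndal"),
    ("persons:Rajendra", "foaf:topic_interest", "dbpedia:LinkedData"),
    ("orgs:VF", "foaf:based_near", "dbpedia:Norway"),
    ("dbpedia:Sogndal", "p:subdivisionName", "dbpedia:Norway"),
]


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is a known triple."""
    return {o for (s, p, o) in triples if s == subject and p == predicate}


def country_of(place):
    """Follow the subdivision link if there is one, else keep the place."""
    return objects(place, "p:subdivisionName") or {place}


def interested_in_same_country(topic, org):
    """People interested in `topic` who are based in the country of `org`."""
    org_countries = set()
    for place in objects(org, "foaf:based_near"):
        org_countries |= country_of(place)
    people = {s for (s, p, o) in triples
              if p == "foaf:topic_interest" and o == topic}
    result = set()
    for person in people:
        for place in objects(person, "foaf:based_near"):
            if country_of(place) & org_countries:
                result.add(person)
    return result


answer = interested_in_same_country("dbpedia:LinkedData", "orgs:VF")
```

The link from Sogndal to Norway is what lets the two databases meet; without that one triple the result would be empty.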
  47. Link data from more than 40 datasets. Make use of more than 2 billion triples! http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
  48. The Linking Open Data cloud diagram. Link data from more than 295 datasets. Last updated: 2011-09-19. http://richard.cyganiak.de/2007/10/lod/
  49. How to get answers to really complex questions? Who is interested in “Linked Data” and is working in a country where the unemployment rate is lower than 4%?
     • persons:Rajendra foaf:topic_interest dbpedia:LinkedData
     • persons:Rajendra foaf:based_near dbpedia:Sogndal
     • orgs:VF foaf:based_near dbpedia:Norway
     • dbpedia:Sogndal p:subdivisionName dbpedia:Norway
     • dbpedia:Norway owl:sameAs Scandinavia:Norge
     • Scandinavia:Norge Scandinavia:unemployment_rate_total 3.6
  50. New way to get knowledge and answers: not by searching the web, but by doing dynamic computations based on a vast collection of data, algorithms, and methods. http://www.wolframalpha.com/
  51. Comprehensive Knowledge Archive Network (CKAN), Open Knowledge Foundation. http://no.ckan.net/ Licensed under the Open Database
  52. A collaboration between: Norwegian Press Association, Association of Norwegian Editors, Norwegian Union of Journalists, and Department of Journalism. http://www.offentlighet.no/Registeroffentlighet/Alle-registre
  53. Linked data ... publishing data on the Web ... to enable integration, linking and reuse across silos
  54. Six Steps to Publishing Linked Data
     1. Understand the Principles
     2. Model Your Data
     3. Choose URIs for Things in your Data
     4. Set up Your Infrastructure
     5. Link to other Data Sets
     6. Describe and Publicise your Data
  55. Can’t we just publish data as files?
     • pdf: easy to read and publish
     • Excel: allows further processing and analysis
     • csv: processing without need for proprietary tools
     But ...
     • structure of data not explained
     • no connection between different data sets, silos
     • static and fixed: can’t retrieve just the slices relevant to a problem
  56. Linked data: apply the principles of the Web to publication of data. The Web:
     • is a global network of pages
     • each identified by a URL
     • fetching a URL gives a document
     • pages connected by links
     • open, anyone can say anything about anything else
  57. Linked data: apply the principles of the Web to publication of data. The linked data web:
     • is a global network of things
     • each identified by a URI
     • fetching a URI gives a set of statements
     • things connected by typed links
     • open, anyone can say anything about anything else
     Linked data is “data you can click on”
  58. Linked Data - Paradigm
     • Use URIs as names for things.
     • Use HTTP URIs so that people can look up those names.
     • When someone looks up a URI, provide useful information.
     • Include links to other URIs, so that they can discover more things.
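Looking up a name, as the second and third principles describe, is an ordinary HTTP request. The sketch below prepares one with Python's standard `urllib.request` for the DBpedia URI quoted earlier in the deck. The `text/turtle` media type is an assumption about what a server might offer, and the request is only built, not sent, so the sketch runs without network access.

```python
from urllib.request import Request

# dbpedia:Sogndal, written out in full.
uri = "http://dbpedia.org/resource/Sogndal"

# Ask for a machine-readable description rather than an HTML page.
# The media type is an assumption about what the server can return.
request = Request(uri, headers={"Accept": "text/turtle"})

# Sending it would be: urllib.request.urlopen(request).read()
# (left out here so the sketch runs offline).
```

The same URI serves both people and programs; the `Accept` header is how a client says which kind of answer it wants.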
  59. LOD Benefits
     • other humans and applications can
        o easily access your data using Web technologies
        o follow the links in order to obtain further contextual information
     • links to your data and search engine indices can increase the visibility of your data
  60. JSON-LD - JSON for Linking Data
     • JSON-LD (JavaScript Object Notation for Linking Data) is a lightweight Linked Data format that gives your data context.
     • It is easy for humans to read and write. It is easy for machines to parse and generate.
     • It is based on the already successful JSON format and provides a way to help JSON data interoperate at Web scale.
     • If you are already familiar with JSON, writing JSON-LD is very easy.
     • These properties make JSON-LD an ideal Linked Data interchange language for JavaScript environments, Web services, and unstructured databases such as CouchDB and MongoDB.
     http://json-ld.org/spec/latest/json-ld-syntax/
  61. This RDF model in standard XML notation:
       <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                xmlns:dc="http://purl.org/dc/elements/1.1/">
         <rdf:Description rdf:about="/wiki/Tony_Benn">
           <dc:title>Tony Benn</dc:title>
           <dc:publisher>Wikipedia</dc:publisher>
         </rdf:Description>
       </rdf:RDF>
  62. written in JSON-LD like this:
       {
         "@context": {
           "title": "http://purl.org/dc/elements/1.1/title",
           "publisher": "http://purl.org/dc/elements/1.1/publisher"
         },
         "@id": "/wiki/Tony_Benn",
         "title": "Tony Benn",
         "publisher": "Wikipedia"
       }
     A context is used to allow developers to use aliases for IRIs.
  63. JSON-LD object
     • An Internationalized Resource Identifier (IRI) is a mechanism for representing unique identifiers on the web.
     • In Linked Data, IRIs (or URI references) are commonly used for describing entities and properties.
       {
         "a": "Person",
         "name": "Manu Sporny",
         "homepage": "http://manu.sporny.org/",
         "avatar": "http://twitter.com/account/profile_image/manusporny"
       }
  64. Unambiguous Identifiers for JSON
     • If a set of terms, like Person, name, and homepage, are defined in a context, and that context is used to resolve the names in JSON objects, machines could automatically expand the terms to something meaningful and unambiguous:
       {
         "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": "http://xmlns.com/foaf/0.1/Person",
         "http://xmlns.com/foaf/0.1/name": "Manu Sporny",
         "http://xmlns.com/foaf/0.1/homepage": "http://manu.sporny.org",
         "http://rdfs.org/sioc/ns#avatar": "http://twitter.com/account/profile_image/manusporny"
       }
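The expansion described here is essentially a dictionary lookup. The sketch below uses the term-to-IRI pairs shown on the two slides and a deliberately naive `expand` helper (a hypothetical name). It also expands any value that happens to match a term, which is a simplification; the actual JSON-LD expansion algorithm is considerably more careful.

```python
# Context: short term -> full IRI, taken from the two objects on the slides.
context = {
    "a": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    "Person": "http://xmlns.com/foaf/0.1/Person",
    "name": "http://xmlns.com/foaf/0.1/name",
    "homepage": "http://xmlns.com/foaf/0.1/homepage",
    "avatar": "http://rdfs.org/sioc/ns#avatar",
}

compact = {
    "a": "Person",
    "name": "Manu Sporny",
    "homepage": "http://manu.sporny.org/",
}


def expand(obj, ctx):
    """Replace keys, and values that are themselves terms, by their IRIs."""
    return {ctx.get(k, k): ctx.get(v, v) for k, v in obj.items()}


expanded = expand(compact, context)
```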
  65. JSON-LD Example
     • Let’s start by building up a fictitious bike store called "Links Bike Shop". We’ve already got our bike store set up at http://store.example.com/ and are using linked data principles.
     • Here are some of the URLs:
        o http://store.example.com/: The home page of the store.
        o http://store.example.com/products/links-swift-chain: A chain product.
        o http://store.example.com/products/links-speedy-lube: A chain lube product.
  66. We want to start creating some linked data for this fictitious store and start with rough JSON data on the store itself.
       {
         "@id": "http://store.example.com/",
         "@type": "Store",
         "name": "Links Bike Shop",
         "description": "The most \"linked\" bike store on earth!"
       }
  67. Next, let’s create some rough data for our two premier products:
       {
         "@id": "http://store.example.com/products/links-swift-chain",
         "@type": "Product",
         "name": "Links Swift Chain",
         "description": "A fine chain with many links.",
         "category": ["http://store.example.com/categories/parts",
                      "http://store.example.com/categories/chains"],
         "price": "10.00",
         "stock": 10
       }
  68.  {
         "@id": "http://store.example.com/products/links-speedy-lube",
         "@type": "Product",
         "name": "Links Speedy Lube",
         "description": "Lubricant for your chain links.",
         "category": ["http://store.example.com/categories/lubes",
                      "http://store.example.com/categories/chains"],
         "price": "5.00",
         "stock": 20
       }
  69. To make this into a full JSON-LD document we combine the data, add a @context, and adjust some values.
       {
         "@id": "http://store.example.com/",
         "@type": "Store",
         "name": "Links Bike Shop",
         "description": "The most \"linked\" bike store on earth!",
         "product": [ ... ...
  70.    ],
         "@context": {
           "Store": "http://ns.example.com/store#Store",
           "Product": "http://ns.example.com/store#Product",
           "product": "http://ns.example.com/store#product",
           "category": {
             "@id": "http://ns.example.com/store#category",
             "@type": "@id"
           },
           "price": "http://ns.example.com/store#price",
           "stock": "http://ns.example.com/store#stock",
           "name": "http://purl.org/dc/terms/title",
           "description": "http://purl.org/dc/terms/description",
           "p": "http://store.example.com/products/",
           "cat": "http://store.example.com/category/"
         }
       }
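The `p` and `cat` entries in the context look like prefixes for shortening product and category IRIs. The sketch below shows how such a prefix could be used and expanded again, and builds the document with Python's `json` module so that the inner quotes in the description are escaped automatically. The compact form `p:links-swift-chain` and the helper `expand_compact_iri` are assumptions for illustration, and the context is trimmed to the two prefixes.

```python
import json

# Only the two prefix entries from the slide's context are kept here.
context = {
    "p": "http://store.example.com/products/",
    "cat": "http://store.example.com/category/",
}


def expand_compact_iri(value, ctx):
    """Expand 'prefix:suffix' when the prefix is defined in the context."""
    prefix, sep, suffix = value.partition(":")
    if sep and prefix in ctx and not suffix.startswith("//"):
        return ctx[prefix] + suffix
    return value  # absolute IRIs and plain strings are left alone


store = {
    "@context": context,
    "@id": "http://store.example.com/",
    "@type": "Store",
    "name": "Links Bike Shop",
    # json.dumps escapes the inner quotes when serializing.
    "description": 'The most "linked" bike store on earth!',
    "product": [{"@id": "p:links-swift-chain"}],  # assumed compact form
}

serialized = json.dumps(store, indent=2)
product_iri = expand_compact_iri(store["product"][0]["@id"], context)
```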
  71. Publishing Solutions and Tools
     • Triplify
        o Goal: expose semantics available in RDBMS as simply as possible
        o Available for most popular Web app languages: PHP (ready), Ruby/Python (under dev.)
        o Works with most popular Web app databases: MySQL, PHP-PDO DBs (SQLite, Oracle, DB2, MS SQL, PostgreSQL)
  72. Virtuoso RDF Views
     • transforms the result of SQL SELECT statements into RDF
     • mapping steps
        o define RDFS class IRIs for each table
        o define construction of subject IRIs from primary key column values
        o define construction of predicate IRIs from each non-key column
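The three mapping steps can be sketched generically with Python's built-in `sqlite3`. This is not Virtuoso's mapping language or API, only an illustration of the idea; the base IRI `http://example.org/`, the table layout and the helper `table_to_triples` are all hypothetical.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical base IRI

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE persons (id INTEGER PRIMARY KEY, name TEXT, work_place TEXT)"
)
conn.execute("INSERT INTO persons VALUES (1, 'Rajendra', 'Sogndal')")


def table_to_triples(conn, table, key):
    """Map each row of `table` to triples following the three steps."""
    class_iri = f"{BASE}schema/{table}"           # step 1: class IRI per table
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cursor.description]
    triples = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key]}"  # step 2: subject from key
        triples.append((subject, "rdf:type", class_iri))
        for column, value in record.items():
            if column != key:                     # step 3: predicate per column
                predicate = f"{BASE}schema/{table}#{column}"
                triples.append((subject, predicate, value))
    return triples


triples = table_to_triples(conn, "persons", "id")
```

One row with two non-key columns yields three triples: one type statement and one statement per column.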
  73. Marrying DBs with RDF & Ontologies
     Relational Databases vs. RDF & Ontologies
     • Data model: relational (tables, columns, rows) vs. triples (subject, predicate, object)
     • Compared on: schema and data separation, implicit information, scalability, schema flexibility, Web data integration readiness
     • Using DBs for storage and querying of RDF & ontologies
     • Publishing DB content as RDF
  74. DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link other data sets on the Web to Wikipedia data. The DBpedia knowledge base currently describes more than 2.6 million things, including at least 213,000 persons, 328,000 places, 57,000 music albums, 36,000 films and 20,000 companies. The knowledge base consists of 274 million pieces of information (RDF triples). http://dbpedia.org/
     DBpedia and all other linked data is searchable with SPARQL. http://en.wikipedia.org/wiki/SPARQL
  75. OpenStreetMap: OpenStreetMap is a free editable map of the whole world. It is made by people like you. OpenStreetMap allows you to view, edit and use geographical data in a collaborative way from anywhere on Earth. www.openstreetmap.org
     GeoNames: The GeoNames geographical database is available for download free of charge under a Creative Commons attribution license. It contains over eight million geographical names and consists of 6.5 million unique features. www.geonames.org
  76. Creating Open Data
     • Public Domain: only after the expiration of copyright
     • Science Commons protocol for open data
     • Creative Commons Zero
     • Public Domain Dedication & License with Community Norms
        o Avoid technical protection measures
        o Give credit where credit’s due
        o Use open formats
        o Let others know!
        o Share your work too!
     Photo by suttonhoo @ Flickr, CC BY-NC-SA
  77. Examples
     • http://data-gov.tw.rpi.edu/wiki
     • http://dbrec.net/
     • http://fanhu.bz/
     • http://data.nytimes.com/schools/schools.html
     • http://sig.ma
     • http://visinav.deri.org/semtech2010/
  78. The road to open knowledge begins here! Thank you!
