Linked Data tutorial at Semtech 2012

My Linked Data tutorial presentation that I presented at Semtech 2012.

http://semtechbizsf2012.semanticweb.com/sessionPop.cfm?confid=65&proposalid=4724

    Linked Data tutorial at Semtech 2012: Presentation Transcript

    • Linked Data. Juan F. Sequeda and Daniel P. Miranker, Capsenta. Semantic Tech & Business Conference 2012, June 4, 2012. www.capsenta.com
    • Outline. Part 1: Introduction to Linked Data. Part 2: Linked Data Principles. Part 3: Linked Data Architectures. Part 4: Linked Enterprise Data.
    • Part 1: Introduction to Linked Data
    • The Web is a Data Shredder (diagram labels: Structured Data, Unstructured Data). Thanks, Martin Hepp.
    • The Web of Documents (diagram labels: Search, Search Engine, Crawler)
    • What would we like? Make it easy for computers/software to find THINGS. Do you SEARCH or do you FIND?
    • Search for football players who went to the University of Texas at Austin and played for the Dallas Cowboys as cornerback
    • Why can’t we just FIND it…
    • Guess how I FOUND out?
    • On a Semantic Web: besides publishing documents on the web, which computers can’t understand easily, let’s publish on the web something that computers can understand: DATA
    • The Semantic Web is a web of data. The current web is a web of documents.
    • But wait… doesn’t the web already have data?
    • Current Data on the Web: relational databases, APIs, XML, CSV, XLS, … Can’t computers and applications already consume that data on the web?
    • Yes! But it is all in different formats and data models!
    • This makes it hard to integrate data
    • The data in different data sources aren’t linked
    • For example, how do I state that the Juan Sequeda in Facebook is the same as the Juan Sequeda in Twitter?
    • Or if I create a mashup from different services, I have to learn different APIs and I get different formats of data back
    • Data is Siloed
    • Wouldn’t it be great if we had a standard way of publishing data on the Web?
    • We have a standardized way of publishing documents on the web, right? HTML
    • Then why can’t we have a standard way of publishing data on the Web?
    • Good question! And the answer is YES. There is! RDF
    • Resource Description Framework (RDF). A data model is a way to model data; for example, relational databases use the relational data model. RDF is a graph data model.
    • RDF is a Graph
        • <JuanSequeda> <firstName> "Juan"
        • <JuanSequeda> <lastName> "Sequeda"
        • <JuanSequeda> <livesIn> "Austin"
        • <JuanSequeda> <knows> <DanielMiranker>
        • ..
        • <DanielMiranker> <firstName> "Daniel"
        • <DanielMiranker> <lastName> "Miranker"
        • <DanielMiranker> <livesIn> "Austin"
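The triples on this slide can be held in memory as plain tuples, which makes the "graph" reading concrete. The following is only a minimal sketch: the short names stand in for full URIs, and the helper names `objects` and `describe` are invented for illustration.

```python
# Toy in-memory graph: each triple is a (subject, predicate, object) tuple.
# Short names stand in for full URIs here (an assumption made for brevity).
TRIPLES = [
    ("JuanSequeda", "firstName", "Juan"),
    ("JuanSequeda", "lastName", "Sequeda"),
    ("JuanSequeda", "livesIn", "Austin"),
    ("JuanSequeda", "knows", "DanielMiranker"),
    ("DanielMiranker", "firstName", "Daniel"),
    ("DanielMiranker", "lastName", "Miranker"),
    ("DanielMiranker", "livesIn", "Austin"),
]


def objects(triples, subject, predicate):
    """Return every object o such that (subject, predicate, o) is in the graph."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def describe(triples, subject):
    """Collect all outgoing edges of a node as a predicate -> [objects] dict."""
    description = {}
    for s, p, o in triples:
        if s == subject:
            description.setdefault(p, []).append(o)
    return description


# Follow the "knows" edge from Juan, then read the first name of each friend.
friends = objects(TRIPLES, "JuanSequeda", "knows")
friend_names = [name for f in friends for name in objects(TRIPLES, f, "firstName")]
print(friend_names)  # prints ['Daniel']
```

Because a node such as DanielMiranker appears both as an object and as a subject, following an edge is just a second lookup, which is the practical meaning of "RDF is a graph".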
    • RDF can be serialized in different ways: RDF/XML, RDFa (RDF in HTML), N3, Turtle, JSON
    • RDFa
    • RDF/XML
    • RDF/N-Triples
    • RDF/Turtle
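The serialization slides were images, so as a rough illustration here is how the simplest of these syntaxes, N-Triples, could be produced by hand. This is a sketch under assumptions: the `http://example.org/` namespace and the `(kind, value)` term encoding are made up for the example, and only basic literal escaping is handled (no language tags or datatypes).

```python
def nt_term(term):
    """Render one (kind, value) term in N-Triples syntax."""
    kind, value = term
    if kind == "uri":
        return "<" + value + ">"
    # Literal: escape backslashes first, then double quotes and newlines.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def to_ntriples(triples):
    """Serialize triples one per line, each terminated by ' .'."""
    return "\n".join(
        " ".join(nt_term(term) for term in triple) + " ." for triple in triples
    )


EX = "http://example.org/"  # hypothetical namespace, not from the slides
FOAF = "http://xmlns.com/foaf/0.1/"

triples = [
    (("uri", EX + "JuanSequeda"), ("uri", FOAF + "firstName"), ("literal", "Juan")),
    (("uri", EX + "JuanSequeda"), ("uri", FOAF + "knows"), ("uri", EX + "DanielMiranker")),
]
document = to_ntriples(triples)
print(document)
```

The same triples could equally be written as RDF/XML, Turtle, or RDFa; the graph stays the same and only the syntax changes.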
    • So does that mean that I have to publish my data in RDF now?
    • You don’t have to… but we would like you to
    • An example
    • Document on the Web
    • Databases back up documents. THINGS have PROPERTIES: a Book has a Title, an Author, … This is a THING: a book titled “Programming the Semantic Web” by Toby Segaran, …
        • Isbn | Title | Author | PublisherID | ReleasedDate
        • 978-0-596-15381-6 | Programming the Semantic Web | Toby Segaran | 1 | July 2009
        • … | … | … | … | …
        • PublisherID | PublisherName
        • 1 | O’Reilly Media
        • … | …
    • Let’s represent the data in RDF (the slide repeats the two tables and draws them as a graph)
        • <book> <title> "Programming the Semantic Web"
        • <book> <author> "Toby Segaran"
        • <book> <isbn> "978-0-596-15381-6"
        • <book> <publisher> <Publisher>
        • <Publisher> <name> "O’Reilly"
    • Remember that we are on the web. Everything on the web is identified by a URI.
    • And now let’s link the data to other data
        • <http://…/isbn978> <title> "Programming the Semantic Web"
        • <http://…/isbn978> <author> "Toby Segaran"
        • <http://…/isbn978> <isbn> "978-0-596-15381-6"
        • <http://…/isbn978> <publisher> <http://…/publisher1>
        • <http://…/publisher1> <name> "O’Reilly"
    • And now consider the data from Revyu.com
        • <http://…/isbn978> <hasReview> <http://…/review1>
        • <http://…/review1> <description> "Awesome Book"
        • <http://…/review1> <reviewer> <http://…/reviewer>
        • <http://…/reviewer> <name> "Juan Sequeda"
    • Let’s start to link data (figure: the Revyu review graph and the book graph, with an owl:sameAs link between their two http://…/isbn978 book URIs)
    • Juan Sequeda publishes data too
        • <http://juansequeda.com/id> <livesIn> <http://dbpedia.org/Austin>
        • <http://juansequeda.com/id> <name> "Juan Sequeda"
    • Let’s link more data (figure: the Revyu reviewer http://…/reviewer is linked with sameAs to http://juansequeda.com/id)
    • And more (figure: the book, review, and personal data combined into one graph through the owl:sameAs links)
    • Data on the Web that is in RDF and is linked to other RDF data is LINKED DATA
    • Linked Data makes the web appear as ONE GIANT HUGE GLOBAL DATABASE!
    • I can query a database with SQL. Is there a way to query Linked Data with a query language?
    • Yes! There is actually a standardized language for that: SPARQL
    • FIND all the reviews on the book “Programming the Semantic Web” by people who live in Austin
    • SPARQL
        SELECT ?review ?comment
        WHERE {
          isbn:978 ex:hasReview ?review .
          ?review ex:description ?comment .
          ?review ex:hasReviewer ?person .
          ?person ex:lives dbpedia:Austin .
        }
    • (Figure: the same SPARQL query shown over the linked graph of book, review, and reviewer data.)
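To show roughly what a SPARQL engine does with the four triple patterns in this query, here is a small sketch of basic graph pattern matching over in-memory triples. It is not how a real triplestore is implemented. It assumes the owl:sameAs links have already been merged so the reviewer carries the `ex:lives` triple directly, and the second review, `person:other`, and `dbpedia:Dallas` are invented test data.

```python
def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`; return None on mismatch."""
    extended = dict(binding)
    for want, have in zip(pattern, triple):
        if want.startswith("?"):
            # A variable: bind it, or check it against its earlier binding.
            if extended.setdefault(want, have) != have:
                return None
        elif want != have:
            return None
    return extended


def select(triples, patterns, variables):
    """Join the patterns one after another, then project the chosen variables."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return [tuple(b[v] for v in variables) for b in bindings]


DATA = [
    ("isbn:978", "ex:hasReview", "review:1"),
    ("isbn:978", "ex:hasReview", "review:2"),          # hypothetical second review
    ("review:1", "ex:description", "Awesome Book"),
    ("review:1", "ex:hasReviewer", "person:juan"),
    ("person:juan", "ex:lives", "dbpedia:Austin"),
    ("review:2", "ex:description", "Too long"),         # hypothetical
    ("review:2", "ex:hasReviewer", "person:other"),     # hypothetical
    ("person:other", "ex:lives", "dbpedia:Dallas"),     # hypothetical
]

PATTERNS = [
    ("isbn:978", "ex:hasReview", "?review"),
    ("?review", "ex:description", "?comment"),
    ("?review", "ex:hasReviewer", "?person"),
    ("?person", "ex:lives", "dbpedia:Austin"),
]

results = select(DATA, PATTERNS, ["?review", "?comment"])
print(results)  # prints [('review:1', 'Awesome Book')]
```

The shared variables (`?review`, `?person`) are what join the patterns, playing the role that join conditions play in SQL.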
    • This looks cool, but let’s be realistic. What is the incentive to publish Linked Data on the Web?
    • What was your incentive to publish an HTML page in 1990?
    • 1) Share data in documents 2) Because your neighbor was doing it … later on … 3) Marketing, Advertising, …, SEO
    • So why should we publish Linked Data in 2012?
    • 1) Share data as data 2) Because your neighbor is doing it … later on … 3) Marketing, Advertising, …, SEO
    • Linked Data Publishers: US and UK Government, BBC, NY Times, Best Buy, Sears, Kmart, Overstock, … too many more to name
    • Linked Open Data
    • http://www.w3.org/DesignIssues/LinkedData.html
    • May 2007
    • Oct 2007
    • Nov 2007
    • Feb 2008
    • Mar 2008
    • Sept 2008
    • Mar 2009 (1)
    • Mar 2009 (2)
    • July 2009
    • September 2010
    • September 2011. Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
    • YOU GET THE PICTURE: IT’S BIG and getting BIGGER and BIGGER
    • Part 2: Linked Data Principles
    • Linked Data is a set of best practices to publish and interlink data on the web
    • Linked Data Principles
        • 1. Use URIs as names for things.
        • 2. Use HTTP URIs so that people can look up (dereference) those names.
        • 3. When someone looks up a URI, provide useful information.
        • 4. Include links to other URIs so that they can discover more things.
    • 1. Use URIs as names for things
    • 1) Use URIs as names for things
        • http://dbpedia.org/resource/Austin,_Texas
        • http://xmlns.com/foaf/0.1/based_near
        • http://juansequeda.com/foaf.rdf#me
        • http://www.w3.org/People/Berners-Lee/card#i
        • http://xmlns.com/foaf/0.1/knows
    • 2. Use HTTP URIs so that people can look up (dereference) those names.
    • 2) Use HTTP URIs: an HTTP client can look up the URI using the HTTP protocol and retrieve a description. http://dbpedia.org/resource/Austin,_Texas
    • What’s with the redirection (303)?
    • http://upload.wikimedia.org/wikipedia/commons/0/06/AustinSkylineLouNeffPoint-2010-03-29-b.JPG
    • http://dbpedia.org/page/Austin,_Texas
    • http://dbpedia.org/resource/Austin,_Texas identifies the abstract concept of “the city of Austin, Texas”
        • Accept: text/html redirects to http://dbpedia.org/page/Austin,_Texas, which identifies an HTML document that describes “the city of Austin, Texas”
        • Accept: application/rdf+xml redirects to http://dbpedia.org/data/Austin,_Texas.xml, which identifies an RDF document that describes “the city of Austin, Texas”
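The 303 redirection described on this slide can be sketched as a small server-side decision function. This is an illustrative sketch only, not DBpedia's actual implementation: the function name `negotiate` is invented, the 404 and 406 fallbacks are assumptions, and the Accept header parsing deliberately ignores q-values.

```python
RESOURCE_PREFIX = "http://dbpedia.org/resource/"

# Media type -> template of the document URI that describes the resource.
DOCUMENTS = {
    "text/html": "http://dbpedia.org/page/{name}",
    "application/rdf+xml": "http://dbpedia.org/data/{name}.xml",
}


def negotiate(uri, accept_header):
    """Return (status, location) for a lookup of a 'thing' URI.

    The thing itself (a city) cannot be sent over HTTP, so the server answers
    303 See Other and points at a document about it, chosen by media type.
    """
    if not uri.startswith(RESOURCE_PREFIX):
        return 404, None
    name = uri[len(RESOURCE_PREFIX):]
    # Simplified parsing: take media types in listed order, ignoring q-values.
    for item in accept_header.split(","):
        media_type = item.split(";")[0].strip()
        if media_type in DOCUMENTS:
            return 303, DOCUMENTS[media_type].format(name=name)
    return 406, None


austin = "http://dbpedia.org/resource/Austin,_Texas"
print(negotiate(austin, "text/html"))
print(negotiate(austin, "application/rdf+xml"))
```

A browser sending `Accept: text/html` ends up on the HTML page, while a Linked Data client asking for `application/rdf+xml` is sent to the RDF document, yet both started from the same URI for the city.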
    • Minting HTTP URIs: if you own the domain name and run a web server at that location, mint URIs in this namespace. I own the domain capsenta.com. I run the web server at http://capsenta.com. I can mint URIs in this namespace: http://capsenta.com/person/Juan-Sequeda
    • Cool URIs (http://www.w3.org/TR/cooluris/)
        • Don’t misuse a namespace that you don’t own: http://www.imdb.com/title
        • Avoid implementation details: http://capsenta.com/person.php?id=123&format=rdf
        • Use natural keys: http://capsenta.com/person/123
    • 3. When someone looks up a URI, provide useful information.
    • 3) Provide useful information. How do we provide useful information in document form on the web? HTML. How do we provide useful information in data form on the web? RDF.
    • What to publish?
        • Literal triples: <http://dbpedia.org/resource/Austin,_Texas> <http://xmlns.com/foaf/0.1/name> "City of Austin"
        • Outgoing link triples: <http://dbpedia.org/resource/Austin,_Texas> <http://www.w3.org/2002/07/owl#sameAs> <http://rdf.freebase.com/ns/m/0vzm>
        • Incoming link triples: <http://dbpedia.org/resource/Dakota_Johnson> <http://dbpedia.org/ontology/birthPlace> <http://dbpedia.org/resource/Austin,_Texas>
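One way to read the three kinds of triples above is as a recipe for assembling the description returned when a URI is looked up. The sketch below sorts a dataset's triples into those groups for one resource. It assumes, for simplicity, that any object starting with `http://` or `https://` is a URI and anything else is a literal; the extra "Dakota Johnson" name triple is added only to show that unrelated triples are left out.

```python
def is_uri(term):
    """Crude URI test used only in this sketch: the scheme prefix decides."""
    return term.startswith(("http://", "https://"))


def describe_resource(triples, resource):
    """Group the triples that mention `resource` by the role they play."""
    groups = {"literal": [], "outgoing": [], "incoming": []}
    for s, p, o in triples:
        if s == resource:
            groups["outgoing" if is_uri(o) else "literal"].append((s, p, o))
        elif o == resource:
            groups["incoming"].append((s, p, o))
    return groups


AUSTIN = "http://dbpedia.org/resource/Austin,_Texas"
DAKOTA = "http://dbpedia.org/resource/Dakota_Johnson"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

DATASET = [
    (AUSTIN, FOAF_NAME, "City of Austin"),
    (AUSTIN, "http://www.w3.org/2002/07/owl#sameAs", "http://rdf.freebase.com/ns/m/0vzm"),
    (DAKOTA, "http://dbpedia.org/ontology/birthPlace", AUSTIN),
    (DAKOTA, FOAF_NAME, "Dakota Johnson"),  # unrelated to Austin; illustrative only
]

description = describe_resource(DATASET, AUSTIN)
for role, found in description.items():
    print(role, len(found))
```

Incoming links are worth including because they let a client that lands on Austin discover Dakota Johnson, not only the other way around.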
    • What to publish? Description of the data set  Semantic Sitemaps  voiD (Vocabulary of Interlinked Datasets) Provenance Metadata Licenses Informationwww.capsenta.com June 4, 2012 97
    • Vocabularies (or Schemas or Ontologies)  Create your own using  RDFS/OWL/ SKOS  Reuse vocabularies  Dublin Core: metadata attributes  Friend of a Friend (FOAF): persons and relationships  Semantically Interlinked Online Communities (SIOC): describing users, posts, blogs, etc  Description of a Project (DOAP)  Music Ontology  Programmes Ontology: TV and radio programs  Good Relations: describing products and services  Review Vocabulary  Basic Geo (WGS84) Vocabularywww.capsenta.com June 4, 2012 98
    • 4. Include links to other URIs so that they can discover more things.www.capsenta.com June 4, 2012 99
    • 4) Include links to other things Set external RDF links into other data sources on the Web  Subject of the triple is in the namespace of one data set  Object of the triple is a URI in the namespace of another data set Connect siloed data islands Enable discoverywww.capsenta.com June 4, 2012 100
    • 4) Include links to other things  Relationship Link Triples <http://juansequeda.com/foaf.rdf#me> <http://xmlns.com/foaf/0.1/based_near> <http://dbpedia.org/resource/Austin,_Texas>  Identity Link Triples <http://dbpedia.org/resource/Austin,_Texas> <http://www.w3.org/2002/07/owl#sameAs> <http://rdf.freebase.com/ns/m/0vzm>  Vocabulary Link Triples <http://capsenta.com/vocab/name> <http://www.w3.org/2002/07/owl#equivalentProperty> <http://xmlns.com/foaf/0.1/name>www.capsenta.com June 4, 2012 101
    • Which predicate for linking to choose? Depends on your domain Is it widely used?  owl:sameAs  foaf:knows  foaf:based_near … If you create your own, relate it to a widely used predicatewww.capsenta.com June 4, 2012 102
    • Part 3: Linked Data Architectures
    • Static RDF Files
        • Small amount of data (personal FOAF file)
        • Use RDF/XML serialization
        • Save as .rdf file and upload it to your server: http://www.capsenta.com/company.rdf, http://www.capsenta.com/company.rdf#this
        • Configure MIME types: AddType application/rdf+xml .rdf
        • Make RDF discoverable from HTML: <link rel="alternate" type="application/rdf+xml" href="company.rdf">
    • RDF in HTML (RDFa): another syntax for RDF. Useful if you have template HTML pages. Drupal 7 will do this out of the box.
    • Triplestores (aka RDF db, …)
        • Commercial: Oracle, IBM, OntoText (OWLIM), Franz (AllegroGraph), OpenLink (Virtuoso), C&P (Stardog), Ontoprise (OntoBroker), Meronymy
        • Open source: Jena, Sesame, Mulgara, 4Store (Garlik), BigData (Systap)
    • RDB2RDF
        • Upcoming W3C RDB2RDF standards: R2RML (mapping language), Direct Mapping (default automatic mapping)
        • Two approaches: dynamic (SPARQL to SQL), ETL (dump RDB to RDF)
        • Ultrawrap: supports the W3C standard and more; SPARQL as fast as SQL
    • Unstructured to RDF (diagram: Unstructured, Entity Extractor, Triplestore)
    • Semi-structured to RDF (diagram: Semi-structured; XML2RDF, XLS2RDF, CSV2RDF; Triplestore)
    • RDB to RDF (diagram: Relational Database; RDB2RDF ETL; Triplestore; RDB2RDF (SPARQL to SQL); CMS with RDFa, Semantic Wiki)
    • Creating Linked Data (diagram, thanks Heath and Bizer). Type of Data: Unstructured, Semi-structured, Structured. Data Preparation: Entity Extractor; XML2RDF, XLS2RDF, CSV2RDF. Data Storage: Triplestore, RDB, data source with API. Data Publication: Linked Data web server/interface, CMS with RDFa or Semantic Wiki, RDB2RDF (i.e. Ultrawrap), custom Linked Data wrapper. Result: Linked Data.
    • Consuming Linked Data (diagram layers: Application; Schema Mapping, Record Linkage, Provenance Tracking; Data Access; Linked Data; Creating Linked Data)
    • Schema Matching
        • Renaming: <ex:name> to <foaf:name>; owl:equivalentClass and owl:equivalentProperty; rdfs:subClassOf or rdfs:subPropertyOf
        • Structural transformation: <ex:Juan> <ex:lives> "Austin" becomes <ex:Juan> <foaf:based_near> <db:Austin> . <db:Austin> <rdfs:label> "Austin" .
        • SPARQL CONSTRUCT, RIF, R2R
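The two kinds of mapping on this slide, renaming and structural transformation, can be sketched as plain functions over triples. In practice this would more likely be written as a SPARQL CONSTRUCT query or an R2R mapping, as the slide suggests; the Python below is only an illustration, and the lookup table from city names to `db:` URIs is an assumed input.

```python
def rename_predicates(triples, renames):
    """Renaming: swap a proprietary predicate for its widely used equivalent."""
    return [(s, renames.get(p, p), o) for s, p, o in triples]


def lift_city_literals(triples, city_uris):
    """Structural transformation: turn a city-name literal into a linked resource.

    (s, ex:lives, "Austin") becomes (s, foaf:based_near, db:Austin) plus a
    label triple on the city, so the city can carry links of its own.
    """
    out = []
    for s, p, o in triples:
        if p == "ex:lives" and o in city_uris:
            city = city_uris[o]
            out.append((s, "foaf:based_near", city))
            label = (city, "rdfs:label", o)
            if label not in out:  # emit each city label only once
                out.append(label)
        else:
            out.append((s, p, o))
    return out


source = [
    ("ex:Juan", "ex:name", "Juan Sequeda"),
    ("ex:Juan", "ex:lives", "Austin"),
]
renamed = rename_predicates(source, {"ex:name": "foaf:name"})
mapped = lift_city_literals(renamed, {"Austin": "db:Austin"})
for triple in mapped:
    print(triple)
```

Renaming keeps the shape of the graph, while the structural transformation adds a node, which is why it needs the extra lookup table.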
    • Record Linkage: different URIs that identify the same thing. Create owl:sameAs links between them. Manual lookup: Sindice. (Semi-)automatic: SILK.
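Once owl:sameAs links exist, a consumer typically wants to treat the linked URIs as one node. A minimal sketch of that step, using a union-find structure to group URIs and picking the lexicographically smallest URI of each group as its representative (an arbitrary choice made for this example); the name triple attached to the Freebase URI is invented test data:

```python
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def canonical_map(triples):
    """Map every URI that takes part in a sameAs link to its group representative."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for s, p, o in triples:
        if p == SAME_AS:
            a, b = find(s), find(o)
            if a != b:
                parent[max(a, b)] = min(a, b)  # smaller URI becomes the root
    return {node: find(node) for node in list(parent)}


def merge_identities(triples):
    """Rewrite all triples onto representatives and drop the sameAs links."""
    canon = canonical_map(triples)
    merged = []
    for s, p, o in triples:
        if p == SAME_AS:
            continue
        fact = (canon.get(s, s), p, canon.get(o, o))
        if fact not in merged:
            merged.append(fact)
    return merged


DBPEDIA_AUSTIN = "http://dbpedia.org/resource/Austin,_Texas"
FREEBASE_AUSTIN = "http://rdf.freebase.com/ns/m/0vzm"

LINKED = [
    (DBPEDIA_AUSTIN, SAME_AS, FREEBASE_AUSTIN),
    (FREEBASE_AUSTIN, "http://xmlns.com/foaf/0.1/name", "City of Austin"),  # illustrative
    ("http://juansequeda.com/foaf.rdf#me", "http://xmlns.com/foaf/0.1/based_near", DBPEDIA_AUSTIN),
]

merged = merge_identities(LINKED)
for triple in merged:
    print(triple)
```

After merging, the name published under the Freebase URI and the `based_near` link pointing at the DBpedia URI meet on a single node, which is what makes a query across both sources possible.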
    • Provenance Keep track where the data is coming from  Quality  Trust Named Graphs SPARQL Graphwww.capsenta.com June 4, 2012 115
    • Centralized (diagram: Application, SPARQL, Triplestore, Creating Linked Data)
    • Centralized
        • Advantages: include the datasets that you need; complex queries and high performance; reasoning
        • Drawbacks: depends on RDF dumps or crawling; effort to set up the centralized triplestore; queried data may be out of date
    • Federated (diagram: Application; SPARQL Federator; SPARQL endpoints over RDB2RDF on Relational Databases and over Triplestores)
    • Federated
        • Advantages: include the datasets that you need; queried data is up to date
        • Drawbacks: requires existence of a SPARQL endpoint; effort to set up the federator
    • Linked Traversal (diagram: Application; SPARQL; Linked Traversal Query Engine; Linked Data; RDB2RDF, Triplestore, Relational Database)
    • Linked Traversal
        • Advantages: no need to know the data sources in advance; does not depend on the existence of SPARQL endpoints or RDF dumps; queried data is up to date
        • Drawbacks: query execution time is slow; unsuitable for some queries; results may be incomplete; still in research
    • Applications
        • Linked Data browsers: http://browse.semanticweb.org/
        • Linked Data (Semantic Web) search engines: Falcons, SWSE, VisiNav, Sindice, Sigma, Swoogle, Watson
        • Search engines: Google, Bing, Yahoo!
        • Faceted browsers: http://dbpedia.neofonie.de/browse/
    • Domain-Specific Applications: BBC World Cup, Seevl.net, Linked Life Data, government apps
    • Part 4: Linked Enterprise Data
    • Use Linked Data Principles internally. Consume Linked (Open) Data. Publish Linked (Open) Data.
    • Linked Enterprise Data: Linked Data can be used as an architectural style for integrating data in the enterprise. 1. Standard data access mechanism: HTTP. 2. Standard address and identifier scheme: URI. 3. Standard data model: RDF.
    • Linked Enterprise Data: information creation, information sharing. Produce and consume data specific to your needs, but also produce it in a way that it can be connected to other data in the enterprise. Distributed but connected! Data that you create may benefit others! Share it!
    • Benefits of RDF/Linked Data
        • RDF (graphs) is a least common denominator: Text, CSV, XML, XLS, RDB to RDF. Imagine modeling a social network in XML.
        • Dynamic and flexible: adding a column to a table in my RDBMS takes 6 months to authorize! With RDF, simply add the triple! Incremental.
    • Benefits of RDF/Linked Data
        • Power of the URI and links: universal identifier; create a “foreign key” to a table that I have no control of
        • Scalability in months, not only seconds: “More can be done with less and faster”; “Cooperation without coordination”
    • What’s next?
        • W3C Linked Data Platform Working Group: http://www.w3.org/2012/ldp/charter
        • Linked Data Basic Profile 1.0: http://www.w3.org/Submission/ldbp/
    • Summary
    • Linked Data Checklist (thanks, Heath & Bizer)
        • Does your data link to other data sets?
        • Do you provide provenance metadata?
        • Do you provide licensing metadata?
        • Do you reuse common vocabularies?
        • Do you map proprietary vocabulary terms to common vocabularies?
        • Do you provide other access methods?
    • Acknowledgements
        • RiBS Lab, UT Austin
        • Olaf Hartig, Humboldt University Berlin
        • Patrick Sinclair, BBC
        • Jamie Taylor, Google
        • Tom Heath & Chris Bizer. Linked Data: Evolving the Web into a Global Data Space
        • David Wood (Ed.). Linking Enterprise Data
    • Thanks! Juan F. Sequeda (juan@capsenta.com, @juansequeda) and Daniel P. Miranker (miranker@capsenta.com). www.capsenta.com