What is the Semantic Web?
  Juan F. Sequeda
  Department of Computer Science, University of Texas at Austin
  April 2011
How many people are familiar with HTML, CSS, Browser, HTTP, XML, JSON, RDF, OWL, SPARQL?
What is the Web?
What is the Web?
  “… the Web, is a system of interlinked hypertext documents accessed via the Internet. With a web browser, one can view web pages that may contain text, images […] and navigate between them via hyperlinks”
  Source: http://en.wikipedia.org/wiki/World_Wide_Web
Current Web = internet + links + docs
History of the Web
  • Created by Tim Berners-Lee at CERN in 1989
  • Mosaic browser in 1993
  • W3C created in 1994
  • Exponential growth in the mid-90s
  • Search engines – Google 1998
  • Dot-com boom 1999 – 2001
  • Web 2.0 – blogs, Facebook, Twitter, etc.
What is the problem?
  • The web is full of documents
  • We aren’t always interested in documents
  • We are interested in THINGS
  • These THINGS might be in documents
  • We can read an HTML document rendered in a browser and find what we are searching for
  • This is hard for computers. Computers have to guess (even though they are pretty good at it)
The Web is a Data Shredder
  [Figure: Structured Data in, Unstructured Data out]
  Thanks Martin Hepp
What would we like?
  Make it easy for computers/software to find THINGS
Do you SEARCH or do you FIND?
Search for: football players who went to the University of Texas at Austin and played for the Dallas Cowboys as cornerback
Why can’t we just FIND it…
Guess how I FOUND out?
Semantic Web!
What is the Semantic Web?
  • Besides publishing documents on the web, which computers can’t understand easily,
  • let’s publish on the web something that computers can understand:
  • DATA
The Semantic Web is a web of linked data
The current web is a web of linked documents
But wait… doesn’t the web already have data?
Current Data on the Web
  • Relational Databases
  • APIs
  • XML
  • CSV
  • XLS
  • …
  Can’t computers and applications already consume that data on the web?
True! But it is all in different formats and data models!
This makes it hard to integrate data
The data in different data sources aren’t linked
For example, how do I know that the Juan Sequeda on Facebook is the same as the Juan Sequeda on Twitter?
Or if I create a mashup from different services, I have to learn different APIs and I get different formats of data back
Data is Siloed
Wouldn’t it be great if we had a standard way of publishing data on the Web?
We have a standardized way of publishing documents on the web, right?
  HTML
Then why can’t we have a standard way of publishing data on the Web?
Good question! And the answer is YES. There is!
  RDF
Resource Description Framework (RDF)
  • A data model: a way to model data
  • For comparison, relational databases use the relational data model
  • RDF is a triple data model
  • Labeled graph
  • Subject, Predicate, Object
    <Juan> <was born in> <California>
    <California> <is part of> <the USA>
    <Juan> <has hobby> <Salsa dancing>
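To make the triple model concrete, here is a minimal Python sketch that stores the three statements from the slide as (subject, predicate, object) tuples and walks the labeled graph. Plain strings stand in for the URIs that real RDF would use, and the helper name `objects` is made up for illustration.

```python
# Each RDF statement is one (subject, predicate, object) triple.
# Plain strings stand in for URIs; real RDF would identify things with URIs.
triples = [
    ("Juan", "was born in", "California"),
    ("California", "is part of", "the USA"),
    ("Juan", "has hobby", "Salsa dancing"),
]


def objects(graph, subject, predicate):
    """Return every object reached from `subject` over an edge labeled `predicate`."""
    return [o for s, p, o in graph if s == subject and p == predicate]


# Follow two edges of the labeled graph: Juan -> California -> the USA.
birthplace = objects(triples, "Juan", "was born in")[0]
country = objects(triples, birthplace, "is part of")[0]
print(birthplace, "/", country)
```

Because every fact has the same three-part shape, adding a new kind of fact means adding a tuple, not changing a schema.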
RDF can be serialized in different ways
  • RDF/XML
  • RDFa (RDF in HTML)
  • N3
  • Turtle
  • JSON
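As a rough illustration of what "serializing" means, the sketch below writes triples out in the simplest line-per-statement form that Turtle accepts: full URIs in angle brackets, literals in double quotes, each statement ending with a period. The `http://example.org/` namespace and the property names are hypothetical, and the escaping only covers backslashes and double quotes, so treat this as a sketch and not a complete serializer.

```python
EX = "http://example.org/"  # hypothetical namespace, used only for illustration


def term(value, is_literal=False):
    """Format one term: <uri> for resources, "text" for literals."""
    if is_literal:
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return f"<{value}>"


def to_turtle(statements):
    """Serialize (subject, predicate, object, object_is_literal) tuples, one per line."""
    lines = []
    for s, p, o, o_is_literal in statements:
        lines.append(f"{term(s)} {term(p)} {term(o, o_is_literal)} .")
    return "\n".join(lines)


data = [
    (EX + "Juan", EX + "wasBornIn", EX + "California", False),
    (EX + "Juan", EX + "hasHobby", "Salsa dancing", True),
]
print(to_turtle(data))
```

The same two statements could equally be written as RDF/XML or embedded in HTML with RDFa; the data model stays the same and only the syntax changes.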
So does that mean that I have to publish my data in RDF now?
You don’t have to… but we would like you to
An example
Document on the Web
Databases back up documents
  • THINGS have PROPERTIES: a Book has a Title, an Author, …
  • This is a THING: a book titled “Programming the Semantic Web” by Toby Segaran, …
Let’s represent the data in RDF
  <book> <title> <Programming the Semantic Web>
  <book> <author> <Toby Segaran>
  <book> <isbn> <978-0-596-15381-6>
  <book> <publisher> <Publisher>
  <Publisher> <name> <O’Reilly>
Remember that we are on the web
  Everything on the web is identified by a URI
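And now let’s link the data to other data
  <http://…/isbn978> <title> <Programming the Semantic Web>
  <http://…/isbn978> <author> <Toby Segaran>
  <http://…/isbn978> <isbn> <978-0-596-15381-6>
  <http://…/isbn978> <publisher> <http://…/publisher1>
  <http://…/publisher1> <name> <O’Reilly>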
And now consider the data from Revyu.com
  <http://…/isbn978> <hasReview> <http://…/review1>
  <http://…/review1> <description> <Awesome Book>
  <http://…/review1> <reviewer> <http://…/reviewer>
  <http://…/reviewer> <name> <Juan Sequeda>
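Let’s start to link data
  <http://…/isbn978> (Revyu) <hasReview> <http://…/review1>
  <http://…/review1> <description> <Awesome Book>
  <http://…/review1> <hasReviewer> <http://…/reviewer>
  <http://…/reviewer> <name> <Juan Sequeda>
  <http://…/isbn978> (Revyu) <sameAs> <http://…/isbn978> (book data)
  <http://…/isbn978> (book data) <title> <Programming the Semantic Web>
  <http://…/isbn978> (book data) <author> <Toby Segaran>
  <http://…/isbn978> (book data) <isbn> <978-0-596-15381-6>
  <http://…/isbn978> (book data) <publisher> <http://…/publisher1>
  <http://…/publisher1> <name> <O’Reilly>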
Juan Sequeda publishes data too
  <http://juansequeda.com/id> <livesIn> <http://dbpedia.org/Austin>
  <http://juansequeda.com/id> <name> <Juan Sequeda>
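Let’s link more data
  <http://…/isbn978> <hasReview> <http://…/review1>
  <http://…/review1> <description> <Awesome Book>
  <http://…/review1> <hasReviewer> <http://…/reviewer>
  <http://…/reviewer> <name> <Juan Sequeda>
  <http://…/reviewer> <sameAs> <http://juansequeda.com/id>
  <http://juansequeda.com/id> <livesIn> <http://dbpedia.org/Austin>
  <http://juansequeda.com/id> <name> <Juan Sequeda>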
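And more
  <http://…/isbn978> (Revyu) <hasReview> <http://…/review1>
  <http://…/review1> <description> <Awesome Book>
  <http://…/review1> <hasReviewer> <http://…/reviewer>
  <http://…/reviewer> <name> <Juan Sequeda>
  <http://…/reviewer> <sameAs> <http://juansequeda.com/id>
  <http://juansequeda.com/id> <livesIn> <http://dbpedia.org/Austin>
  <http://juansequeda.com/id> <name> <Juan Sequeda>
  <http://…/isbn978> (Revyu) <sameAs> <http://…/isbn978> (book data)
  <http://…/isbn978> (book data) <title> <Programming the Semantic Web>
  <http://…/isbn978> (book data) <author> <Toby Segaran>
  <http://…/isbn978> (book data) <isbn> <978-0-596-15381-6>
  <http://…/isbn978> (book data) <publisher> <http://…/publisher1>
  <http://…/publisher1> <name> <O’Reilly>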
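One way to see what the sameAs links buy you is to merge the datasets in code. The sketch below is an assumption-laden illustration: the `books.example` and `revyu.example` URIs are hypothetical stand-ins for the elided ones on the slides, the short predicate names are copied from the diagrams, and the merge simply groups nodes connected by `sameAs` (in practice this would be `owl:sameAs`) with a small union-find.

```python
# Hypothetical URIs standing in for the elided "http://…/" ones on the slides.
STORE_BOOK = "http://books.example/isbn978"
REVYU_BOOK = "http://revyu.example/isbn978"
REVIEW = "http://revyu.example/review1"
REVYU_REVIEWER = "http://revyu.example/reviewer"
JUAN = "http://juansequeda.com/id"
AUSTIN = "http://dbpedia.org/Austin"

triples = [
    (STORE_BOOK, "title", "Programming the Semantic Web"),
    (STORE_BOOK, "author", "Toby Segaran"),
    (REVYU_BOOK, "hasReview", REVIEW),
    (REVIEW, "description", "Awesome Book"),
    (REVIEW, "hasReviewer", REVYU_REVIEWER),
    (REVYU_REVIEWER, "name", "Juan Sequeda"),
    (JUAN, "livesIn", AUSTIN),
    (REVYU_BOOK, "sameAs", STORE_BOOK),
    (REVYU_REVIEWER, "sameAs", JUAN),
]


def make_finder(graph):
    """Build a union-find lookup that maps each node to its sameAs representative."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for s, p, o in graph:
        if p == "sameAs":
            parent[find(s)] = find(o)
    return find


def merge(graph):
    """Rewrite every triple onto representative nodes and drop the sameAs edges."""
    find = make_finder(graph)
    return {(find(s), p, find(o)) for s, p, o in graph if p != "sameAs"}


merged = merge(triples)
# The review and the title now hang off the same book node.
print(sorted(p for s, p, o in merged if s == STORE_BOOK))
```

After the merge, facts that were published separately (the title from one source, the review from another) can be read off a single node, which is the point of linking.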
Data on the Web that is in RDF and is linked to other RDF data is LINKED DATA
Linked Data Principles
  1. Use URIs as names for things
  2. Use HTTP URIs so that people can look up (dereference) those names
  3. When someone looks up a URI, provide useful information
  4. Include links to other URIs so that they can discover more things
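Looking up (dereferencing) a URI is just an HTTP GET, usually with an Accept header asking for an RDF format instead of HTML. The sketch below uses only the Python standard library. It assumes the server supports content negotiation and can return Turtle (`text/turtle`), which is not guaranteed for any given URI; the helper names are made up for illustration.

```python
from urllib.request import Request, urlopen


def build_lookup(uri, accept="text/turtle"):
    """Prepare an HTTP GET that asks for RDF (here Turtle) instead of HTML."""
    return Request(uri, headers={"Accept": accept})


def dereference(uri, accept="text/turtle", timeout=10):
    """Send the lookup and return (content type, raw body). Needs network access."""
    with urlopen(build_lookup(uri, accept), timeout=timeout) as response:
        return response.headers.get("Content-Type"), response.read()


# Build (but do not send) a lookup for a DBpedia resource URI.
request = build_lookup("http://dbpedia.org/resource/Semantic_Web")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Calling `dereference(...)` on a URI that follows the principles should return a description of the thing, including links to other URIs that can be looked up the same way.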
Linked Data makes the web appear as ONE GIANT HUGE GLOBAL DATABASE!
I can query a database with SQL. Is there a way to query Linked Data with a query language?
Yes! There is actually a standardized language for that
  SPARQL
FIND all the reviews on the book “Programming the Semantic Web” by people who live in Austin
[Figure: the complete linked graph from the “And more” slide (book, review, reviewer, and publisher data joined by sameAs links), over which the query is answered]
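To give a feel for how such a query works, here is a small Python sketch. It first shows how the question might look in SPARQL (the `ex:` vocabulary is hypothetical), then evaluates the same triple patterns with a tiny matcher over an in-memory graph. The graph is assumed to be already merged across sameAs links, the `example.org` URIs are placeholders, and the second review is invented purely to show the filter doing something.

```python
# How the question might be phrased in SPARQL (hypothetical "ex:" vocabulary).
SPARQL = """
PREFIX ex: <http://example.org/vocab#>
SELECT ?review ?text WHERE {
  ?book   ex:title       "Programming the Semantic Web" .
  ?book   ex:hasReview   ?review .
  ?review ex:hasReviewer ?person .
  ?review ex:description ?text .
  ?person ex:livesIn     <http://dbpedia.org/Austin> .
}
"""

AUSTIN = "http://dbpedia.org/Austin"
BOOK = "http://example.org/isbn978"  # placeholder URI
graph = [
    (BOOK, "title", "Programming the Semantic Web"),
    (BOOK, "hasReview", "http://example.org/review1"),
    ("http://example.org/review1", "description", "Awesome Book"),
    ("http://example.org/review1", "hasReviewer", "http://juansequeda.com/id"),
    ("http://juansequeda.com/id", "livesIn", AUSTIN),
    # Invented second review by someone outside Austin, to exercise the filter.
    (BOOK, "hasReview", "http://example.org/review2"),
    ("http://example.org/review2", "description", "Another review"),
    ("http://example.org/review2", "hasReviewer", "http://example.org/someone"),
    ("http://example.org/someone", "livesIn", "http://example.org/Elsewhere"),
]


def match(pattern, triple, binding):
    """Extend `binding` so `pattern` equals `triple`, or return None if impossible."""
    result = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if p_term.startswith("?"):  # variable: bind it or check the existing binding
            if result.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:      # constant: must match exactly
            return None
    return result


def query(data, patterns):
    """Join the triple patterns one at a time, keeping consistent bindings."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in data:
                new_binding = match(pattern, triple, binding)
                if new_binding is not None:
                    extended.append(new_binding)
        bindings = extended
    return bindings


results = query(graph, [
    ("?book", "title", "Programming the Semantic Web"),
    ("?book", "hasReview", "?review"),
    ("?review", "hasReviewer", "?person"),
    ("?review", "description", "?text"),
    ("?person", "livesIn", AUSTIN),
])
print([r["?text"] for r in results])
```

A real SPARQL engine does far more (optimization, filters, optional patterns), but the core idea is the same: a query is a graph pattern with variables, and answers are the variable bindings that make the pattern match.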
This looks cool, but let’s be realistic. What is the incentive to publish Linked Data?
What was your incentive to publish an HTML page in 1990?
1) Share data in documents
2) Because your neighbor was doing it
… later on …
3) Marketing, Advertising, SEO
So why should we publish Linked Data in 2011?
1) Share data as data
2) Because your neighbor is doing it…
3) (Semantic) SEO ++
Linked Data Publishers
  • UK Government
  • US Government
  • BBC
  • Open Calais – Thomson Reuters
  • Freebase/Google
  • NY Times
  • Best Buy
  • CNET
  • DBpedia
  • Overstock.com
  • O’Reilly Media
  • …
Publishing Linked Data
  • Legacy data in relational databases: D2R Server, Virtuoso, Triplify, Ultrawrap
  • CMS: Drupal 7
  • Native RDF databases: AllegroGraph, Jena, Sesame, Virtuoso, Talis Platform
  • In HTML with RDFa
Links to other URIs
<span rel="foaf:interest">
  <a href="http://dbpedia.org/resource/Database" property="dcterms:title">Database</a>,
  <a href="http://dbpedia.org/resource/Data_integration" property="dcterms:title">Data Integration</a>,
  <a href="http://dbpedia.org/resource/Semantic_Web" property="dcterms:title">Semantic Web</a>,
  <a href="http://dbpedia.org/resource/Linked_Data" property="dcterms:title">Linked Data</a>,
  etc.
</span>
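To show that the links in this markup are machine-readable, the sketch below pulls them out with Python's standard `html.parser`. It is not a conforming RDFa processor: it only pairs the `rel` value of an enclosing `<span>` with the `href` of each nested `<a>`, which happens to be enough for this snippet. The class name `InterestLinks` is made up for illustration.

```python
from html.parser import HTMLParser

HTML = """
<span rel="foaf:interest">
  <a href="http://dbpedia.org/resource/Database" property="dcterms:title">Database</a>,
  <a href="http://dbpedia.org/resource/Semantic_Web" property="dcterms:title">Semantic Web</a>
</span>
"""


class InterestLinks(HTMLParser):
    """Collect (rel, href) pairs for <a> tags nested inside a <span rel="...">."""

    def __init__(self):
        super().__init__()
        self.rel_stack = []  # rel value (or None) for each currently open <span>
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span":
            self.rel_stack.append(attrs.get("rel"))
        elif tag == "a" and "href" in attrs:
            # Use the nearest enclosing span that declares a rel, if any.
            rel = next((r for r in reversed(self.rel_stack) if r), None)
            if rel:
                self.links.append((rel, attrs["href"]))

    def handle_endtag(self, tag):
        if tag == "span" and self.rel_stack:
            self.rel_stack.pop()


parser = InterestLinks()
parser.feed(HTML)
for rel, href in parser.links:
    print(rel, href)
```

Each pair corresponds to one link from the page's subject to a DBpedia URI, which is exactly the "links to other URIs" that the fourth Linked Data principle asks for.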
(Semantic) SEO ++
  • Mark up your HTML with RDFa
  • Use standard vocabularies (ontologies)
    • Google Vocabulary
    • GoodRelations
    • Dublin Core
  • Google and Yahoo will crawl this data and use it for better rendering
[Figures: a series of snapshots dated May 2007, Oct 2007, Nov 2007, Feb 2008, Mar 2008, Sept 2008, Mar 2009 (1), Mar 2009 (2), July 2009, September 2010]
April 2011
  YOU GET THE PICTURE
  IT’S BIG and getting BIGGER and BIGGER
Now what can we do with this data?
Query it!
Find all the locations of all the original paintings of Modigliani
Linked Data Browsers
Linked Data Browsers
  • Not actually separate browsers; they run inside HTML browsers
  • View the data that is returned after looking up a URI in tabular form
  • (IMO) No usability
Linked Data Browsers
  • Tabulator: http://www.w3.org/2005/ajar/tab
  • OpenLink: http://ode.openlinksw.com/
  • Zitgist DataViewer: http://dataviewer.zitgist.com/
  • Marbles: http://www5.wiwiss.fu-berlin.de/marbles/
  • Explorator: http://www.tecweb.inf.puc-rio.br/explorator
Faceted Browsers
http://dbpedia.neofonie.de
http://dev.semsol.com/2010/semtech/
On-the-fly Mashups
http://sig.ma
What’s next?
Time to create new and innovative ways to interact with Linked Data
  New and improved search
This may be one of the Killer Apps that we have all been waiting for
  Image: http://en.wikipedia.org/wiki/File:Mosaic_browser_plaque_ncsa.jpg
It’s time to partner with the HCI community
  Semantic Web UIs don’t have to be ugly
Linked Data Applications
  A software system that makes use of data on the web from multiple datasets and that benefits from links between the datasets
Characteristics of Linked Data Applications
Consume data that is published on the web following the Linked Data principles: an application should be able to request, retrieve and process the accessed data
Discover further information by following the links between different data sources: the fourth principle enables this.
Combine the consumed linked data with data from other sources (not necessarily Linked Data)
Expose the combined data back to the web following the Linked Data principles
Offer value to end-users
Semantic Web technologies integrate data across boundaries
  "National Instruments is currently releasing an internal system based on an RDF triple store. This technology provides increased flexibility and agility in managing and accessing data for our complex and ever-evolving product offerings. With RDF powering one of our key information delivery infrastructures, we will enable greater capabilities at lower total cost than what we have seen with any competing platform."
RiBS - Miranker Lab
  • Ultrawrap
    • Virtualizes an RDBMS as a graph (RDF)
    • Automatically generates the ontology from the schema
    • Query an RDBMS in SPARQL (language for RDF)
    • Leverages the SQL optimizer to do all the hard work
    • Insert arbitrary RDF into your RDBMS without altering the schema
  • Diamond
    • Linked Data query engine
    • Link-traversal-based query execution
    • Start with a URI that returns RDF and follow links
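The "start with a URI and follow links" idea can be sketched in a few lines of Python. This is not Diamond's implementation, only an illustration of link traversal in general: a dictionary named `WEB` plays the role of the web, `lookup` stands in for an HTTP dereference, and all URIs and property names are hypothetical.

```python
from collections import deque

# Hypothetical documents: URI -> triples returned when that URI is looked up.
WEB = {
    "http://a.example/taxon1": [
        ("http://a.example/taxon1", "fullName", "Example taxon"),
        ("http://a.example/taxon1", "subject", "http://b.example/category1"),
    ],
    "http://b.example/category1": [
        ("http://b.example/category1", "label", "Example category"),
    ],
}


def lookup(uri):
    """Stand-in for dereferencing a URI over HTTP and parsing the RDF returned."""
    return WEB.get(uri, [])


def traverse(seed, max_lookups=10):
    """Breadth-first link traversal: collect triples and follow URIs found in objects."""
    seen = {seed}
    queue = deque([seed])
    collected = []
    lookups = 0
    while queue and lookups < max_lookups:
        uri = queue.popleft()
        lookups += 1
        for triple in lookup(uri):
            collected.append(triple)
            target = triple[2]
            if target.startswith("http://") and target not in seen:
                seen.add(target)
                queue.append(target)
    return collected


found = traverse("http://a.example/taxon1")
print(len(found), "triples from", len({s for s, p, o in found}), "sources")
```

The `max_lookups` cap matters in practice: on the open web the set of reachable URIs is effectively unbounded, so a traversal needs some stopping rule.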
Ultrawrap enables your RDBMS to be linked with other RDF data
  [Figure: Specify, Morphster, and Morphbank, each behind Ultrawrap]
  NOW WE WANT TO QUERY THIS
Query the Web of Linked Data with Diamond
  [Figure: a SPARQL query goes to Diamond, which reaches Specify, Morphster, and Morphbank through Ultrawrap]
Example 1 (Specify – DBpedia)
  Get the full name and guid from the taxon with id http://tata.csres.utexas.edu:8080/specify/data/taxon51807#thing AND find any subjects it may have (“skos:subject”)
Result Example 1
  Note that http://dbpedia.org/resource/Category:Fish_of_Australia comes from a different data source (dbpedia.org)
Example 2 (Specify-Morphbank)
  Get the full name and guid from the taxon with id http://tata.csres.utexas.edu:8080/specify/data/taxon42947#thing AND the rank and kingdom from Morphbank
