Semantic Web Landscape 2009
 

These slides were originally a tutorial presented for the SIG preceding the May 2009 meeting of the PRISM Forum.

They attempt to give a survey of the technologies, tools, and state of the world with respect to the Semantic Web as of the first half of 2009.

Usage Rights: CC Attribution-NonCommercial License

  • Susie Stephens, Ben Adida, Eric Prud’hommeaux, Chris Bizer, Chris Becker
  • Executive summary.
  • Courtesy W3C SWEO group, http://linkeddata.org/docs/eswc2007-poster-linking-open-data.pdf
  • http://linkeddata.org/tools
  • http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/SemWebClients
  • See http://beckr.org/DBpediaMobile/ and http://wiki.dbpedia.org/DBpediaMobile
  • One of the goals of this tutorial is to de-mystify all of the names of technologies, tools, projects, etc. that swirl around the Semantic Web story. And since I saw, as I researched this presentation, that everyone seems to like this particular Gary Larson cartoon, it behooved me to include it.
  • Thanks to Fabien Gandon for the POWDER slides: http://www.slideshare.net/fabien_gandon/powder-in-a-nutshell-presentation
  • The good: emphasizes the importance of the foundational layers (URIs and RDF); emphasizes the long-term roadmap/vision of what’s needed for the Semantic Web.
    The bad: implies that perhaps things can’t be taken seriously until all the pieces are in place; implies an order to the research; various versions of the cake tell different stories (importance of XML, absence of query, lack of UI/application layer, …).
    Valentin Zacharias wrote about the “infamy” part of the layer cake here: http://www.valentinzacharias.de/blog/2007/04/ban-semantic-web-layer-cake.html
  • http://www.w3.org/2001/sw/sweo/public/UseCases/
  • Definition.
  • Prescriptive.
  • Descriptive.
  • Formal.
  • The first is as opposed to relational tables or XML schemas, where the schema needs to be explicitly adjusted to accommodate whatever data is being merged.
    The second is due to the expressivity of the model: it can handle lists, trees, n-ary relations, etc.
    The third is as opposed to table & column identifiers or XML attribute names.
  • Quotation from http://xtech06.usefulinc.com/schedule/paper/61
  • Definition.
  • Prescriptive.
  • Descriptive.
  • Descriptive (part 2). This is leagues ahead of the situation with SQL!
  • To run for real: http://dbpedia.org/sparql
      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      PREFIX type: <http://dbpedia.org/class/yago/>
      PREFIX prop: <http://dbpedia.org/property/>
      SELECT ?country_name ?population
      WHERE {
        ?country a type:LandlockedCountries ;
                 rdfs:label ?country_name ;
                 prop:populationEstimate ?population .
        FILTER (?population > 15000000 && langMatches(lang(?country_name), "EN")) .
      }
      ORDER BY DESC(?population)
  • http://bio2rdf.org/
  • Definition.
  • Definition.
  • Thanks to Bijan Parsia for much of this material: http://www.cs.man.ac.uk/~bparsia/2009/comp60462/17-03-casestudies.pdf

Semantic Web Landscape 2009: Presentation Transcript

  • The 2009 Semantic Web Landscape: Technologies, tools, and projects
      Lee Feigenbaum, VP Technology & Standards, Cambridge Semantics; Co-chair, W3C SPARQL Working Group
      For PRISM Forum SIG on Semantic Web, May 12, 2009
  • Thanks Upfront: Much material & wisdom used with gracious permission of:
      • Ivan Herman, W3C Semantic Web Activity Lead
      • Bijan Parsia, Co-editor of the core OWL 2 specification
      • Ian Horrocks, Co-chair of the W3C OWL 2 Working Group
      • Phil Archer, Chair of the W3C POWDER Working Group
  • Thanks Upfront: Much material & wisdom used with gracious permission of:
      • Michael Hausenblas, Evangelist for RDFa, Linked Data, and Multimedia Semantics
      • Fabien Gandon, Member, GRDDL and OWL 2 Working Groups
      • Susie Stephens, Co-chair W3C HCLS Interest Group
      • Eric Prud’hommeaux, W3C team member, Semantic Web expert
  • Executive Summary: The Semantic Web in 2009
      • The Semantic Web in 2009 is characterized by a healthy environment of stable, broadly-implemented core standard technologies complemented by a number of continually emerging new standards.
      • Adopters of Semantic Web technologies in 2009 can choose from a wide range of commercial and open-source interoperable tools and systems.
      • Enterprise Semantic Web projects are beginning to move beyond proofs of concept to serious production implementations.
      • Community projects on the World Wide Web have linked hundreds of public data sets into an emergent Semantic Web.
  • Agenda: Introduction; The data model (RDF); The query language (SPARQL); Adding structure & semantics (RDFS, OWL, RIF); Working in the real world (GRDDL, RDB2RDF); Working on the Web (Linked Data, RDFa, POWDER)
  • A Motivating Example: Drug Discovery. The W3C HCLS interest group set out to use Semantic Web technologies to receive precise answers to a complex question: Find me genes involved in signal transduction that are related to pyramidal neurons.
  • General search: 223,000 hits, 0 results
  • Domain-limited search: 2,580 potential results
  • Specific databases: Too many silos!
  • A Semantic Web Approach: Integrate disparate databases (MeSH, PubMed, Entrez Gene, Gene Ontology, …)…
  • A Semantic Web Approach (cont’d): …so that one query…
  • A Semantic Web Approach (cont’d): …(trivially) spans several databases…
  • A Semantic Web Approach (cont’d): …to deliver targeted results…
  • What’s the trick?
      1. Agreement on common terms and relationships
      2. Incremental, flexible data structure
      3. Good-enough modeling
      4. Query interface tailored to the data model
  • Names
  • Branding: Semantic Web; Web of Data; Giant Global Graph; Data Web; Web 3.0; Linked Data Web; Semantic Data Web
  • What is it & why do we care? (1) “The Semantic Web”
      • Augments the World Wide Web
      • Represents the Web’s information in a machine-readable fashion
      • Enables targeted search, data browsing, and automated agents
      • World Wide Web : Web pages :: The Semantic Web : Data
  • What is it & why do we care? (2) “Semantic Web technologies”
      • A family of technology standards that ‘play nice together’, including a flexible data model, an expressive ontology language, and a distributed query language
      • Drive Web sites, enterprise applications
      • The technologies enable us to build applications and solutions that were not possible, practical, or feasible traditionally.
  • A Common & Coherent Set of Technology Standards
      • A common set of technologies enables diverse uses and encourages interoperability
      • A coherent set of technologies encourages incremental application and provides a substantial base for innovation
      • A standard set of technologies reduces proprietary vendor lock-in and encourages many choices for tool sets
  • The (In)Famous Layer Cake
  • Semantic Web Technology Timeline (figure; years shown: 1999, 2001, 2004, 2007, 2008, 2009; labels include RIF and HCLS)
  • 2009: Where we are. As technologies & tools have evolved, Semantic Web advocates have progressed through stages:
      Report on… | Execute on…
      Semantic Web vision | Initial experiments
      Experiments | Technology standards
      Technology standards | Software packages
      Software packages | Proofs of concept
      Proofs of concept | Production implementations
  • 2009: Where we are (cont’d): http://www.w3.org/2001/sw/sweo/public/UseCases/
  • 2009: Where we’re not. Semantic Web technologies are not a ‘magic crank’ for discovering new drugs (or solving other problems, for that matter)! (Image from Trey Ideker via Enoch Huang)
  • 2009: Where we’re not (cont’d). XML vs. RDF? “Ontology” vs. “ontology”? Data integration vs. Semantic Web vs. reasoning vs. KBs Linked Data? vs. search vs. app. development vs. … The Semantic Web still suffers from confusing and conflicting messaging, each of which asserts it’s “correct”.
  • 2009: Where we’re not (cont’d). People with appropriate skill sets for designing & building Semantic Web solutions are not widely available.
  • 2009: Where we’re not (cont’d). We don’t yet have standard solutions for privacy, trust, probability, and other elements of the Semantic Web vision.
  • Introduction to the Semantic Web approach: How does a Semantic Web approach help us merge data sets, infer new relations, and integrate outside data sources? (Thanks to Ivan Herman for this example)
  • The rough structure of data integration
      1. Map the various data onto an abstract data representation (make the data independent of its internal representation…)
      2. Merge the resulting representations
      3. Start making queries on the whole (queries not possible on the individual data sets)
  • Data set “A”: A simplified book store
      Books: ID | Author | Title | Publisher | Year
             ISBN0-00-651409-X | id_xyz | The Glass Palace | id_qpr | 2000
      Authors: ID | Name | Home page
               id_xyz | Ghosh, Amitav | http://www.amitavghosh.com
      Publishers: ID | Publisher Name | City
                  id_qpr | Harper Collins | London
  • 1st: Export your data as a set of relations
  • Some notes on the data export
      • Data export does not necessarily mean physical conversion of the data
      • Relations can be virtual, generated on-the-fly at query time via SQL “bridges”, scraping HTML pages, extracting data from Excel sheets, etc.
      • One can export part of the data
  • Data set “F”: Another book store’s data (spreadsheet; columns A, B, D, E)
      Row 1: ID | Titre | Traducteur | Original
      Row 2: ISBN0 2020386682 | Le Palais des miroirs | A13 | ISBN-0-00-651409-X
      Row 6: ID | Auteur
      Row 7: ISBN-0-00-651409-X | A12
      Row 11: Nom
      Row 12: Ghosh, Amitav
      Row 13: Besse, Christianne
  • 2nd: Export your second set of data
  • 3rd: start merging your data
  • 3rd: start merging your data (cont’d)
  • 4th: Merge identical resources
  • Start making queries… User of data set “F” can now ask queries like: “What is the title of the original version of Le Palais des miroirs?” This information is not in the data set “F”... …but can be retrieved after merging with data set “A”!
  • 5th: Query the merged data set
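The export, merge, and query steps above can be sketched in a few lines of Python. This is a toy illustration rather than the tooling used in the tutorial: triples are plain tuples, the prefixed names (`a:title`, `f:titre`, `f:original`, and so on) are assumed stand-ins for the relations drawn on the slides, and the `urn:isbn:` form of the shared identifier is an assumption.

```python
# Toy triples: (subject, predicate, object). All names here are illustrative.
BOOK = "urn:isbn:0-00-651409-X"  # shared identifier (assumed URI form)

dataset_a = {
    (BOOK, "a:title", "The Glass Palace"),
    (BOOK, "a:year", "2000"),
    (BOOK, "a:author", "a:id_xyz"),
    ("a:id_xyz", "a:name", "Ghosh, Amitav"),
    ("a:id_xyz", "a:homepage", "http://www.amitavghosh.com"),
}

dataset_f = {
    ("urn:isbn:2020386682", "f:titre", "Le Palais des miroirs"),
    ("urn:isbn:2020386682", "f:original", BOOK),
    (BOOK, "f:auteur", "f:A12"),
    ("f:A12", "f:nom", "Ghosh, Amitav"),
}

# Merging two graphs is a set union; identical URIs join automatically.
merged = dataset_a | dataset_f


def objects(graph, subject, predicate):
    """Return all objects of triples matching (subject, predicate, ?)."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def subjects(graph, predicate, obj):
    """Return all subjects of triples matching (?, predicate, obj)."""
    return {s for s, p, o in graph if p == predicate and o == obj}


def original_title(graph, french_title):
    """Follow f:titre -> f:original -> a:title."""
    titles = set()
    for book in subjects(graph, "f:titre", french_title):
        for original in objects(graph, book, "f:original"):
            titles |= objects(graph, original, "a:title")
    return titles
```

Calling `original_title(dataset_f, "Le Palais des miroirs")` finds nothing, because the English title lives only in data set “A”; the same call on `merged` can reach it through the shared ISBN node.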
  • However, more can be achieved…
      • We “know” that a:author and f:auteur are really the same
      • But our automatic merge does not know that!
      • Let us add some extra information to the merged data: a:author is the same as f:auteur; both identify a Person, a category (type) for certain resources
  • 3rd revisited: Use the extra knowledge
  • Start making richer queries! User of data set “F” can now query: “What is the home page of Le Palais des miroirs’s ‘auteur’?” The information is not in data set “F” or “A”… …but was made available by merging data sets “A” and “F” and adding three simple “glue” statements
  • 6th: Richer queries
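A rough sketch of how one “glue” statement unlocks the richer query, again with tuples standing in for RDF. Only the “a:author is the same as f:auteur” statement is modelled (the two Person type statements are left out), and every identifier below is an illustrative assumption, not taken from the slides’ diagrams.

```python
BOOK = "urn:isbn:0-00-651409-X"  # assumed URI form for the shared ISBN

# A small merged graph: triples from both book stores.
merged = {
    ("urn:isbn:2020386682", "f:titre", "Le Palais des miroirs"),
    ("urn:isbn:2020386682", "f:original", BOOK),
    (BOOK, "f:auteur", "f:A12"),
    (BOOK, "a:author", "a:id_xyz"),
    ("a:id_xyz", "a:homepage", "http://www.amitavghosh.com"),
}

# The glue, stated as data: these two relations mean the same thing.
SAME_PROPERTY = {"f:auteur": "a:author"}


def apply_glue(graph):
    """Copy every triple across equivalent properties, in both directions."""
    inferred = set(graph)
    for s, p, o in graph:
        for left, right in SAME_PROPERTY.items():
            if p == left:
                inferred.add((s, right, o))
            elif p == right:
                inferred.add((s, left, o))
    return inferred


def objects(graph, subject, predicate):
    return {o for s, p, o in graph if s == subject and p == predicate}


def auteur_homepages(graph, french_title):
    """Follow f:titre -> f:original -> f:auteur -> a:homepage."""
    pages = set()
    books = {s for s, p, o in graph if p == "f:titre" and o == french_title}
    for book in books:
        for original in objects(graph, book, "f:original"):
            for person in objects(graph, original, "f:auteur"):
                pages |= objects(graph, person, "a:homepage")
    return pages
```

Before the glue is applied, the only `f:auteur` of the original book is the node from data set “F”, which carries no home page; after `apply_glue`, the node from data set “A” is also reachable through `f:auteur`, so the home page can be found.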
  • Bring in other data sources
      • We can integrate new information into our merged data set from other sources, e.g. additional information about author Amitav Ghosh
      • Perhaps the largest public source of general knowledge is Wikipedia
      • Structured data can be extracted from Wikipedia using dedicated tools
  • 7th: Merge with Wikipedia data
  • 7th (cont’d): Merge with Wikipedia data
  • 7th (cont’d): Merge with Wikipedia data
  • Is that surprising? It may look like it but, in fact, it should not be… What happened via automatic means is done every day by Web users! The difference: a bit of extra rigour so that machines could do this, too
  • What did we do?
      • We combined different data sets that may be internal or somewhere on the Web, are of different formats (RDBMS, Excel spreadsheet, (X)HTML, etc), and have different names for the same relations
      • We could combine the data because some URIs were identical (i.e. the ISBNs in this case)
      • We could add some simple additional information (the “glue”) to help further merge data sets
      • The result? Answer queries that could not previously be asked
  • What did we do? (cont’d)
  • The abstraction pays off because…
      • …the graph representation is independent of the details of the native structures
      • …a change in local database schemas, HTML structures, etc. does not affect the whole (“schema independence”)
      • …new data, new connections can be added seamlessly & incrementally
  • So where is the Semantic Web? Semantic Web technologies make such integration possible. The rest of this tutorial introduces many of these technologies.
  • Agenda: Introduction; The data model (RDF); The query language (SPARQL); Adding structure & semantics (RDFS, OWL, RIF); Working in the real world (GRDDL, RDB2RDF); Working on the Web (Linked Data, RDFa, POWDER)
  • RDF is… Resource Description Framework
  • RDF is… The data model of the Semantic Web.
  • RDF is… A schema-less data model that features unambiguous identifiers and named relations between pairs of resources.
  • RDF is… A labeled, directed graph of relations between resources and literal values.
      • RDF graphs are collections of triples
      • Triples are made up of a subject, a predicate, and an object (subject → predicate → object)
      • Resources and relationships are named with URIs
  • Example RDF triples
      • “Lee Feigenbaum works for Cambridge Semantics”: Lee Feigenbaum → works for → Cambridge Semantics
      • “Lee Feigenbaum was born in 1978”: Lee Feigenbaum → born in → 1978
      • “Cambridge Semantics is headquartered in Massachusetts”: Cambridge Semantics → headquartered → Massachusetts
  • Triples connect to form graphs (figure; nodes: Lee Feigenbaum, Cambridge Semantics, Massachusetts, 1978, Boston; edges: works for, headquartered, born in, lives in, capital)
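To make the “triples connect to form graphs” idea concrete, here is a minimal sketch that stores the example triples as tuples and walks a chain of predicates. The `http://example.org/` namespace and the property names are hypothetical, and the edge directions are read off the flattened diagram, so treat them as assumptions.

```python
EX = "http://example.org/"  # hypothetical namespace for illustration

triples = {
    (EX + "LeeFeigenbaum", EX + "worksFor", EX + "CambridgeSemantics"),
    (EX + "LeeFeigenbaum", EX + "bornIn", "1978"),
    (EX + "LeeFeigenbaum", EX + "livesIn", EX + "Massachusetts"),
    (EX + "CambridgeSemantics", EX + "headquarteredIn", EX + "Massachusetts"),
    (EX + "Massachusetts", EX + "capital", EX + "Boston"),
}


def follow(graph, start, *predicates):
    """Walk a chain of predicates from a start node; return the end nodes."""
    frontier = {start}
    for predicate in predicates:
        frontier = {o for s, p, o in graph if s in frontier and p == predicate}
    return frontier


# "What is the capital of the state where Lee's employer is headquartered?"
capital = follow(
    triples,
    EX + "LeeFeigenbaum",
    EX + "worksFor",
    EX + "headquarteredIn",
    EX + "capital",
)
```

Because the object of one triple is the subject of the next, no join table or schema is needed to chain the three statements together.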
  • Why RDF? What’s different here?
      • The graph data structure makes merging data with shared identifiers trivial (as we saw earlier)
      • Triples act as a least common denominator for expressing data
      • URIs for naming remove ambiguity: the same identifier means the same thing
  • Why RDF? Incremental Integration (figure; labels: Relational Database, RDF, Agile/Flexible Graph Model, URIs for naming, Incremental Integration)
  • Types of RDF Tools
      • Triple stores (built on relational database, or native RDF store)
      • Development libraries
      • Full-featured application servers
      • Most RDF tools contain some elements of each of these.
  • Finding RDF Tools
      • Community-maintained lists: http://esw.w3.org/topic/SemanticWebTools
      • Emphasis on large triple stores: http://esw.w3.org/topic/LargeTripleStores
      • Michael Bergman’s Sweet Tools searchable list: http://www.mkbergman.com/?page_id=325
  • RDF Tools – (Some) Triple Stores
      Tool | Commercial or Open-source | Environment
      Anzo | Both | Java
      ARC | Open-source | PHP
      AllegroGraph | Commercial | Java, Prolog
      Jena | Open-source | Java
      Mulgara | Open-source | Java
      Oracle RDF | Commercial | SQL / SPARQL
      RDF::Query | Open-source | Perl
      Redland | Open-source | C, many wrappers
      Sesame | Open-source | Java
      Talis Platform | Commercial | HTTP (Hosted)
      Virtuoso | Both | C++
  • Agenda: Introduction; The data model (RDF); The query language (SPARQL); Adding structure & semantics (RDFS, OWL, RIF); Working in the real world (GRDDL, RDB2RDF); Working on the Web (Linked Data, RDFa, POWDER)
  • Motivating SPARQL: With a query language, a client can design their own interface. --Leigh Dodds, Talis
  • SPARQL is… SPARQL Protocol And RDF Query Language
  • SPARQL is… The query language of the Semantic Web.
  • SPARQL is… A SQL-like language for querying sets of RDF graphs.
  • SPARQL is… A simple protocol for issuing queries and receiving results over HTTP. So… Every SPARQL client works with every SPARQL server!
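The protocol half of SPARQL can be illustrated with nothing more than the Python standard library. The sketch below builds, but does not send, an HTTP GET request that carries a query in the `query` parameter and asks for the XML results format. The DBpedia endpoint address is the one quoted in the speaker notes; the query text and the helper name `build_sparql_request` are illustrative assumptions.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request


def build_sparql_request(endpoint, query):
    """Build (but do not send) an HTTP GET request carrying a SPARQL query."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+xml"})


request = build_sparql_request(
    "http://dbpedia.org/sparql",
    "SELECT ?s WHERE { ?s ?p ?o } LIMIT 1",
)

# The query travels URL-encoded in a single request parameter.
sent_query = parse_qs(urlsplit(request.full_url).query)["query"][0]
```

Passing `request` to `urllib.request.urlopen` would actually issue the query; since any endpoint that follows the protocol accepts the same request shape, the client code does not change from server to server.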
  • Why SPARQL? SPARQL lets us:
      • Pull information from structured and semi-structured data.
      • Explore data by discovering unknown relationships.
      • Query and search an integrated view of disparate data sources.
      • Glue separate software applications together by transforming data from one vocabulary to another.
  • (figure; data sources: Dealer 1, Dealer 2, Dealer 3, Employee Directory, ERP / Budget System, Web EPA Fuel Efficiency Spreadsheet; all feed a SPARQL Query Engine behind a Web dashboard) What automobiles get more than 25 miles per gallon, fit within my department’s budget, and can be purchased at a dealer located within 10 miles of one of my employees?
      SELECT ?automobile
      WHERE {
        ?automobile a ex:Car ;
                    epa:mpg ?mpg ;
                    ex:dealer ?dealer .
        ?employee a ex:Employee ;
                  geo:loc ?loc .
        ?dealer geo:loc ?dealerloc .
        FILTER(?mpg > 25 && geo:dist(?loc, ?dealerloc) <= 10) .
      }
  • SPARQL Example: Querying Wikipedia. Find me all landlocked countries with a population greater than 15 million.
      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      PREFIX type: <http://dbpedia.org/class/yago/>
      PREFIX prop: <http://dbpedia.org/property/>
      SELECT ?country_name ?population
      WHERE {
        ?country a type:LandlockedCountries ;
                 rdfs:label ?country_name ;
                 prop:populationEstimate ?population .
        FILTER ( ?population > 15000000 && langMatches(lang(?country_name), "EN") ) .
      }
      ORDER BY DESC(?population)
  • SPARQL Example: Querying Wikipedia DBPedia SPARQL Endpoint
  • SPARQL Example: Querying Wikipedia
  • Types of SPARQL Tools
      • Query engines: things that can run queries (most RDF stores provide a SPARQL engine)
      • Query rewriters: e.g. to query relational databases (more later)
      • Endpoints: things that accept queries on the Web and return results
      • Client libraries: things that make it easy to ask queries
  • Finding SPARQL Tools
      • Community-maintained list of query engines: http://esw.w3.org/topic/SparqlImplementations
      • Publicly accessible SPARQL endpoints: http://esw.w3.org/topic/SparqlEndpoints
      • Michael Bergman’s Sweet Tools searchable list: http://www.mkbergman.com/?page_id=325
  • (Some) SPARQL’able Data Sets
  • bio2rdf.org – querying life sciences data
  • bio2rdf.org – querying life sciences data
  • Agenda: Introduction; The data model (RDF); The query language (SPARQL); Adding structure & semantics (RDFS, OWL, RIF); Working in the real world (GRDDL, RDB2RDF); Working on the Web (Linked Data, RDFa, POWDER)
  • Where’s the magic? We haven’t seen anything yet that begins to approach the long-term Semantic Web vision
  • From the explicit to the inferred: 3 pieces of the Semantic Web technology stack (RDF Schema, OWL, RIF) are about describing a domain well enough to capture (some of) the meaning of resources and relationships in the domain. Apply knowledge to data to get more data.
  • RDFS is… RDF Schema
  • RDF Schema is… Elements of:
      • Vocabulary (defining terms): I define a relationship called “prescribed dose.”
      • Schema (defining types): “prescribed dose” relates “treatments” to “dosages”
      • Taxonomy (defining hierarchies): Any “doctor” is a “medical professional”
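The taxonomy element is the easiest to show in code: if any “doctor” is a “medical professional”, then every resource typed as a doctor can be inferred to be a medical professional too. The sketch below applies that one rule until nothing new appears. It is not a full RDFS implementation, and the `ex:` names (including `ex:Person` and `ex:alice`) are made up for illustration.

```python
SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "rdf:type"

schema_and_data = {
    ("ex:Doctor", SUBCLASS_OF, "ex:MedicalProfessional"),
    ("ex:MedicalProfessional", SUBCLASS_OF, "ex:Person"),
    ("ex:alice", TYPE, "ex:Doctor"),
}


def infer_types(graph):
    """Apply 'x has type C, C is a subclass of D => x has type D' to a fixpoint."""
    inferred = set(graph)
    changed = True
    while changed:
        changed = False
        for x, p1, c in list(inferred):
            if p1 != TYPE:
                continue
            for c2, p2, d in list(inferred):
                if p2 == SUBCLASS_OF and c2 == c and (x, TYPE, d) not in inferred:
                    inferred.add((x, TYPE, d))
                    changed = True
    return inferred


closure = infer_types(schema_and_data)
```

This is the “apply knowledge to data to get more data” step in miniature: one explicit type statement plus two schema statements yields two further type statements.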
  • WOL OWL is… Web Ontology Language
  • OWL is… Elements of ontology
      • Same/different identity
          • “author” and “auteur” are the same relation
          • two resources with the same “ISBN” are the same “book”
      • More expressive type definitions
          • A “cycle” is a “vehicle” with at least one “wheel”
          • A “bicycle” is a “cycle” with exactly two “wheels”
      • More expressive relation definitions
          • “sibling” is a symmetric predicate
          • the value of the “favorite dwarf” relation must be one of “happy”, “sleepy”, “sneezy”, “grumpy”, “dopey”, “bashful”, “doc”
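Two of these ideas can be approximated in plain Python, with the caveat that this is a hand-rolled sketch and not an OWL reasoner. `symmetric_closure` mimics what declaring “sibling” symmetric would license, and `same_by_key` mimics the “same ISBN means same book” inference; OWL expresses these with constructs such as owl:SymmetricProperty, owl:InverseFunctionalProperty, and owl:sameAs. All resource and property names below are hypothetical.

```python
def symmetric_closure(graph, predicate):
    """Add (o, predicate, s) for every (s, predicate, o) in the graph."""
    return graph | {(o, p, s) for s, p, o in graph if p == predicate}


def same_by_key(graph, key_predicate):
    """Pairs of distinct resources that share a value for an identifying property."""
    by_value = {}
    for s, p, o in graph:
        if p == key_predicate:
            by_value.setdefault(o, set()).add(s)
    return {
        (first, second)
        for group in by_value.values()
        for first in group
        for second in group
        if first < second
    }


graph = {
    ("ex:ann", "ex:sibling", "ex:bob"),
    ("a:book1", "ex:isbn", "0-00-651409-X"),
    ("f:livre7", "ex:isbn", "0-00-651409-X"),
    ("a:book2", "ex:isbn", "2020386682"),
}

with_symmetry = symmetric_closure(graph, "ex:sibling")
same_books = same_by_key(graph, "ex:isbn")
```

A real reasoner derives such conclusions from the ontology’s declarations; here each rule is hard-coded to show only the shape of the inference.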
  • What can we do with OWL? Answer questions of:
      • Consistency: Are there any contradictions in this model?
      • Classification: What are all the inferred types of this resource?
      • Satisfiability: Are there any classes in this ontology that cannot possibly have any members?
  • Building Useful Ontologies (http://www.aber.ac.uk/compsci/public/media/presentations/OUCL-seminar.ppt)
      • Developing and maintaining quality ontologies is very challenging
      • Users need tools and services, e.g., to help check if an ontology is:
          • Meaningful: all named classes can have instances
          • Correct: captures intuitions of domain experts
          • Minimally redundant: no unintended synonyms (e.g. “Banana split” vs. “Banana sundae”)
  • Example: SNOMED
      • Large: 373,731 concepts & over 1 million terms
      • NHS version extended to 542,380 classes with 19,828 additional named classes
      • 148,821 class drug taxonomy (primitive hierarchy)
      • OWL reasoner (FaCT++) classified NHS ontology
      • Able to classify whole ontology in <4 hours
      • Interesting results come from 19,828 additional named classes
      • 180 missing subClass relationships were found, e.g.: Periocular_dermatitis subClassOf Disease_of_face
  • Example: SNOMED
  • RIF is… Rule Interchange Format
  • RIF is… Standard representation for exchanging sets of logical and business rules
      • Logical rules
          • A buyer buys an item from a seller if the seller sells the item to the buyer
          • A customer becomes a "Gold" customer as soon as his cumulative purchases during the current year top $5000
      • Production rules
          • Customers that become "Gold" customers must be notified immediately, and a golden customer card will be printed and sent to them within one week
          • For shopping carts worth more than $1000, "Gold" customers receive an additional discount of 10% of the total amount
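RIF itself is an interchange format, not a programming language, but the “Gold” customer rules above are easy to restate as ordinary functions to show what such a rule set computes. The following is a sketch under that framing only: the function and constant names are invented, and amounts are whole dollars to keep the arithmetic exact.

```python
GOLD_THRESHOLD = 5000        # cumulative purchases this year, in dollars
CART_THRESHOLD = 1000        # cart value above which the discount applies
GOLD_DISCOUNT_PERCENT = 10   # additional discount for "Gold" customers


def is_gold(cumulative_purchases):
    """Logical rule: a customer is "Gold" once purchases top $5000."""
    return cumulative_purchases > GOLD_THRESHOLD


def cart_total(cart_value, cumulative_purchases):
    """Production rule: "Gold" customers get 10% off carts worth more than $1000."""
    if is_gold(cumulative_purchases) and cart_value > CART_THRESHOLD:
        return cart_value - cart_value * GOLD_DISCOUNT_PERCENT // 100
    return cart_value
```

The point of a standard rule format is that thresholds and conditions like these can be exchanged between rule engines instead of being buried in application code as they are here.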
  • Developing Tools and Infrastructure
      • Editors/environments: OilEd, Protégé, Swoop, TopBraid, Ontotrack, …
      • Reasoning systems: Cerebra, FaCT++, KAON2, Pellet, Racer, CEL, …
  • Visualizing and Publishing Vocabularies
  • Reusable, public ontologies: FOAF; The Event Ontology; Measurement Units Ontology
  • Agenda: Introduction; The data model (RDF); The query language (SPARQL); Adding structure & semantics (RDFS, OWL, RIF); Working in the real world (GRDDL, RDB2RDF); Working on the Web (Linked Data, RDFa, POWDER)
  • Fantasy Land Architecture (figure; labels: Ontology / Schema, six Custom UIs)
  • Reality (figure; labels: Internet, DB2, XML, LDAP Directory, Oracle RDB, six Custom UIs)
  • GRDDL is… Gleaning Resource Descriptions from Dialects of Languages
  • GRDDL is… A method for authoritatively getting RDF data from XML and XHTML documents.
  • GRDDL is… A mechanism for authoritatively deriving RDF data from families of XML and XHTML documents.
  • GRDDL tools: Most GRDDL tools are adapters to existing RDF stores or SPARQL engines to allow loading or querying data from XML and XHTML sources. Community-maintained list: http://esw.w3.org/topic/GrddlImplementations
      Host System | GRDDL tool
      Jena | GRDDL Reader for Jena
      RDFLib | GRDDL.py
      Redland | (built in)
      Swignition | (built in)
      Virtuoso | GRDDL “Sponger”
  • RDB2RDF is… Relational Database to RDF
  • RDB2RDF is… A proposed W3C Working Group to define a standard way to map from relational databases to RDF (and SPARQL).
  • RDB2RDF tools. Survey of existing approaches: http://www.w3.org/2005/Incubator/rdb2rdf/RDB2RDF_SurveyReport.pdf
      Tool | Mapping Approach | Dynamic vs. Static (ETL)
      Anzo | D2RQ configuration graph | Both
      Asio Tools | OWL file, SWRL rules | Both
      Dartgrid | XML file, visual mapper | Dynamic
      D2RQ | D2RQ configuration file | Both
      R2O | R2O XML file | Both
      RDBtoOnto | Constraint rules | Static (ETL)
      SDS | EII Query Engine/OOM XML | Both
      Triplify | SQL config file | Linked Data
      Virtuoso | RDF View Meta-Schema Language | Both
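As a rough picture of the static (ETL) style of mapping, the sketch below reads rows from an in-memory SQLite table and emits one triple per non-key column. It does not reproduce any of the tools in the table: the base URI, the URI patterns, and the helper name `rows_to_triples` are assumptions chosen for the example, and the table name is interpolated directly into the SQL, so it must come from trusted code.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical base URI


def rows_to_triples(connection, table, key_column):
    """Map each row to triples: subject from the key, one predicate per column."""
    cursor = connection.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    triples = set()
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key_column]}"
        for column, value in record.items():
            if column != key_column and value is not None:
                triples.add((subject, f"{BASE}{table}#{column}", value))
    return triples


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE authors (id TEXT, name TEXT, homepage TEXT)")
connection.execute(
    "INSERT INTO authors VALUES (?, ?, ?)",
    ("id_xyz", "Ghosh, Amitav", "http://www.amitavghosh.com"),
)
author_triples = rows_to_triples(connection, "authors", "id")
```

A dynamic mapping would instead rewrite each incoming SPARQL query into SQL at query time, leaving the data in the relational database.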
  • What about… everything else? Standards don’t yet exist, but many tools exist to derive RDF and/or run SPARQL queries against other sources of data.
  • LDAP Directories: Squirrel RDF, http://jena.sourceforge.net/SquirrelRDF/
  • Excel spreadsheets: Anzo for Excel, http://www.cambridgesemantics.com/products/anzo_for_excel
  • Excel spreadsheets: Semantic Discovery System, http://insilicodiscovery.com/installation/index.php
  • Web-based data sources: Virtuoso Sponger Cartridges, http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSponger
  • Unstructured Text: Calais, http://www.opencalais.com/
  • Unstructured Text: Zemanta Web Service, http://developer.zemanta.com/
  • Agenda: Introduction; The data model (RDF); The query language (SPARQL); Adding structure & semantics (RDFS, OWL, RIF); Working in the real world (GRDDL, RDB2RDF); Working on the Web (Linked Data, RDFa, POWDER)
  • Linked Data is… A simple set of 4 guidelines for publishing RDF data on the Web (over HTTP), developed by Tim Berners-Lee in 2006
      1. Use URIs as names for things (globally unique identity)
      2. Use HTTP URIs (everyone has a Web browser/client)
      3. When someone looks up a URI, provide useful information (…in the form of RDF data)
      4. Include links to other URIs (foster discovery of additional information)
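Guidelines 1, 2, and 4 can be checked mechanically on a small description, which gives a feel for what “following the guidelines” means in data terms. The sketch below is one possible check, not an official validator: the `example.org` subject URI is made up, and the DBpedia resource URI is included only as an illustrative outbound link.

```python
from urllib.parse import urlsplit


def is_http_uri(value):
    """Guidelines 1 and 2: the name is a URI, and specifically an HTTP URI."""
    parts = urlsplit(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def external_links(description, subject):
    """Guideline 4: objects that are HTTP URIs on a different host than the subject."""
    home = urlsplit(subject).netloc
    return {
        o
        for s, p, o in description
        if s == subject and is_http_uri(o) and urlsplit(o).netloc != home
    }


AUTHOR = "http://example.org/id/amitav-ghosh"  # hypothetical identifier

description = {
    (AUTHOR, "http://xmlns.com/foaf/0.1/name", "Amitav Ghosh"),
    (AUTHOR, "http://www.w3.org/2002/07/owl#sameAs",
     "http://dbpedia.org/resource/Amitav_Ghosh"),
}

links = external_links(description, AUTHOR)
```

Guideline 3 is about server behaviour (returning RDF when the URI is looked up), so it cannot be checked from the data alone.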
  • The Linking Open Data Project is...
      • A community project started within the W3C Semantic Web Education & Outreach group in 2007
      • A wealth of existing, open Web-based data sets exposed in RDF and linked together
      • A growing number of publicly available SPARQL endpoints
      • The first steps of “The” Semantic Web?
      • No longer easily measured or depicted!
  • The LOD “cloud”, May 2007
  • The LOD “cloud”, March 2008
  • The LOD “cloud”, September 2008
  • The LOD “cloud”, March 2009
  • Application specific portions of the cloud: Notably, bio-related data sets (in light purple), some by the W3C “Linking Open Drug Data” task force
  • Sindice - Another view of data on the Web
  • Tools: Publishing linked data
      • Many tools we’ve already seen publish RDF data according to linked data principles (e.g. Talis platform, Virtuoso, Triplify)
      • Others sit on top of existing systems and make the data available as Linked Data (e.g. pubby)
  • Tools: the Data Browser
      • World Wide Web : Web pages :: The Semantic Web : Data
      • World Wide Web : Web browser :: Linked Data Web : Data browser
  • Tabulator: Generic Data Browser
  • Disco Hyperdata Browser
  • OpenLink Data Explorer
  • Marbles Linked Data Browser
  • DBPedia Mobile
  • DBPedia Mobile
  • DBPedia Mobile
  • DBPedia Mobile
  • QDOS – your online digital status
  • BBC Music Beta
  • Producer-oriented Web to consumer-oriented Web
      • On the current Web… Content publishers decide what can be done with the data (via links, script)
      • On the Semantic Web… Content publishers publish actionable data; content consumers decide how to act on it
  • UltraLink: UltraLink is Novartis’s solution for cross-linking over 1,500,000 biologic and chemical terms, including synonyms, taxonomies, and pointers into data repositories.
  • UltraLink: What if an acquisition brings with it a new Web-based corpus of pathway data that uses terms not recognized by the annotators? New text miners must be created & deployed. Finding & consuming data are too tightly coupled.
  • RDFa is… RDF in Attributes
  • RDFa is… A collection of HTML attributes that allow RDF to be embedded directly in Web pages.
  • Why RDFa?
      • Don’t Repeat Yourself (DRY)
      • In-context metadata (copy & paste)
      • Authoritative (no screen scraping)
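To show what “RDF embedded in attributes” looks like to a consumer, here is a deliberately tiny extractor built on Python’s standard `html.parser`. It understands only the `about`, `property`, and `content` attributes, does not resolve prefixes, ignores the rest of RDFa (`rel`, `resource`, `typeof`, …), and assumes every element is explicitly closed, so treat it as a teaching sketch rather than an RDFa processor. The sample markup and the `dc:` property names are illustrative.

```python
from html.parser import HTMLParser


class TinyRDFaParser(HTMLParser):
    """Collect (subject, property, value) from about/property/content only."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self._subjects = [""]   # innermost `about` value for each open element
        self._pending = None    # (subject, property) waiting for text content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        subject = attrs.get("about", self._subjects[-1])
        self._subjects.append(subject)
        if "property" in attrs:
            if "content" in attrs:
                # An explicit content attribute overrides the visible text.
                self.triples.append((subject, attrs["property"], attrs["content"]))
            else:
                self._pending = (subject, attrs["property"])

    def handle_data(self, data):
        if self._pending and data.strip():
            self.triples.append((*self._pending, data.strip()))
            self._pending = None

    def handle_endtag(self, tag):
        self._pending = None
        if len(self._subjects) > 1:
            self._subjects.pop()


page = """
<div about="urn:isbn:0-00-651409-X">
  <span property="dc:title">The Glass Palace</span>
  <span property="dc:date" content="2000">published in 2000</span>
</div>
"""

parser = TinyRDFaParser()
parser.feed(page)
```

The same markup that a person reads (“published in 2000”) also carries the machine-readable value, which is the Don’t Repeat Yourself argument in practice.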
  • Who’s using RDFa? STW Thesaurus for Economics
  • RDFa in action
  • POWDER is… Protocol for Web Description Resources
  • descriptions applied to groups of online resources (source: http://www.slideshare.net/fabien_gandon/powder-in-a-nutshell-presentation)
  • many resources, one description
  • grouping mechanisms...
      • ... list URIs
      • ... domain names, paths
      • ... regular expressions on URIs
  • descriptions may be grouped; queries are on individual resources
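The grouping idea (one description, many resources, lookups on a single URI) can be sketched as a membership test over hosts, path prefixes, and regular expressions. This does not use POWDER’s actual XML vocabulary; the parameter names, the sample description, and the `example.org` URIs are all hypothetical.

```python
import re
from urllib.parse import urlsplit


def in_group(uri, hosts=(), path_prefixes=(), patterns=()):
    """True if the URI satisfies every kind of constraint that was supplied."""
    parts = urlsplit(uri)
    host = parts.hostname or ""
    if hosts and not any(host == h or host.endswith("." + h) for h in hosts):
        return False
    if path_prefixes and not any(parts.path.startswith(p) for p in path_prefixes):
        return False
    if patterns and not any(re.search(p, uri) for p in patterns):
        return False
    return True


# One description applied to a whole group of resources.
descriptions = [
    ({"hosts": ("example.org",), "path_prefixes": ("/kids/",)}, "suitable for children"),
    ({"patterns": (r"\.pdf$",)}, "PDF document"),
]


def describe(uri):
    """Query an individual resource: return every description whose group contains it."""
    return [text for group, text in descriptions if in_group(uri, **group)]
```

For example, `describe("http://www.example.org/kids/games.html")` picks up the first description because the URI falls inside that group, even though the description never mentions that page by name.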
  • description…
      • Which resources does the DR describe?
      • What is the description?
      • Who has created the description?
      • When was the description created?
      • Until when is the description considered valid?
      • From when is the description considered valid?
      • Does anybody agree with this description?
      • Do other descriptions exist about this group of resources?
  • in order to... adapt, authorize, protect, trust, search, monitor
  • Thanks & Questions: lee@cambridgesemantics.com