Semantic Web Landscape 2009



These slides were originally a tutorial presented for the SIG preceding the May 2009 meeting of the PRISM Forum.

They attempt to give a survey of the technologies, tools, and state of the world with respect to the Semantic Web as of the first half of 2009.

Published in: Education, Technology

  • Susie Stephens, Ben Adida, Eric Prud’hommeaux, Chris Bizer, Chris Becker
  • Executive summary.
  • Courtesy W3C SWEO group, http://linkeddata.org/docs/eswc2007-poster-linking-open-data.pdf
  • http://linkeddata.org/tools
  • http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/SemWebClients
  • See http://beckr.org/DBpediaMobile/ and http://wiki.dbpedia.org/DBpediaMobile
  • One of the goals of this tutorial is to demystify all of the names of technologies, tools, projects, etc. that swirl around the Semantic Web story. And since, as I researched this presentation, I saw that everyone seems to like this particular Gary Larson cartoon, it behooved me to include it.
  • Thanks to Fabien Gandon for the POWDER slides: http://www.slideshare.net/fabien_gandon/powder-in-a-nutshell-presentation
  • The good: emphasizes the importance of the foundational layers (URIs and RDF); emphasizes the long-term roadmap/vision of what’s needed for the Semantic Web. The bad: implies that perhaps things can’t be taken seriously until all the pieces are in place; implies an order to the research; various versions of the cake tell different stories (importance of XML, absence of query, lack of UI/application layer, …). Valentin Zacharias wrote about the “infamy” part of the layer cake here: http://www.valentinzacharias.de/blog/2007/04/ban-semantic-web-layer-cake.html
  • http://www.w3.org/2001/sw/sweo/public/UseCases/
  • Definition.
  • Prescriptive.
  • Descriptive.
  • Formal.
  • The first is as opposed to relational tables or XML schemas, where the schema needs to be explicitly adjusted to accommodate whatever data is being merged. The second is due to the expressivity of the model: it can handle lists, trees, n-ary relations, etc. The third is as opposed to table & column identifiers or XML attribute names.
  • Quotation from http://xtech06.usefulinc.com/schedule/paper/61
  • Definition.
  • Prescriptive.
  • Descriptive.
  • Descriptive (part 2). This is leagues ahead of the situation with SQL!
  • To run for real: http://dbpedia.org/sparql PREFIX type: <http://dbpedia.org/class/yago/> PREFIX prop: <http://dbpedia.org/property/> SELECT ?country_name ?population WHERE { ?country a type:LandlockedCountries ; rdfs:label ?country_name ; prop:populationEstimate ?population . FILTER (?population > 15000000 && langMatches(lang(?country_name), "EN")) . } ORDER BY DESC(?population)
  • http://bio2rdf.org/
  • Definition.
  • Definition.
  • Thanks to Bijan Parsia for much of this material: http://www.cs.man.ac.uk/~bparsia/2009/comp60462/17-03-casestudies.pdf

    1. 1. The 2009 Semantic Web Landscape Technologies, tools, and projects Lee Feigenbaum VP Technology & Standards, Cambridge Semantics Co-chair, W3C SPARQL Working Group For PRISM Forum SIG on Semantic Web May 12, 2009
    2. 2. Thanks Upfront Much material & wisdom used with gracious permission of: Ivan Herman W3C Semantic Web Activity Lead Bijan Parsia Co-editor of the core OWL 2 specification Ian Horrocks Co-chair of the W3C OWL 2 Working Group Phil Archer Chair of the W3C POWDER Working Group May 12, 2009 2
    3. 3. Thanks Upfront Much material & wisdom used with gracious permission of: Michael Hausenblas Evangelist for RDFa, Linked Data, and Multimedia Semantics Fabien Gandon Member, GRDDL and OWL 2 Working Groups Susie Stephens Co-chair W3C HCLS Interest Group Eric Prud’hommeaux W3C team member, Semantic Web expert May 12, 2009 3
    4. 4. Executive Summary: The Semantic Web in 2009 The Semantic Web in 2009 is characterized by a healthy environment of stable, broadly-implemented core standard technologies complemented by a number of continually emerging new standards. Adopters of Semantic Web technologies in 2009 can choose from a wide range of commercial and open-source interoperable tools and systems. Enterprise Semantic Web projects are beginning to move beyond proofs of concept to serious production implementations. Community projects on the World Wide Web have linked hundreds of public data sets into an emergent Semantic Web. May 12, 2009 4
    5. 5. Agenda: Introduction; The data model (RDF); The query language (SPARQL); Adding structure & semantics (RDFS, OWL, RIF); Working in the real world (GRDDL, RDB2RDF); Working on the Web (Linked Data, RDFa, POWDER)
    6. 6. A Motivating Example: Drug Discovery The W3C HCLS interest group set out to use Semantic Web technologies to receive precise answers to a complex question: Find me genes involved in signal transduction that are related to pyramidal neurons. May 12, 2009 6
    7. 7. General search 223,000 hits, 0 results May 12, 2009 7
    8. 8. Domain-limited search 2,580 potential results May 12, 2009 8
    9. 9. Specific databases Too many silos! May 12, 2009 9
    10. 10. A Semantic Web Approach Integrate disparate databases… MeSH PubMed Entrez Gene Gene Ontology … May 12, 2009 10
    11. 11. A Semantic Web Approach (cont’d) …so that one query… May 12, 2009 11
    12. 12. A Semantic Web Approach (cont’d) …(trivially) spans several databases… May 12, 2009 12
    13. 13. A Semantic Web Approach (cont’d) …to deliver targeted results… May 12, 2009 13
    14. 14. What’s the trick? 1. Agreement on common terms and relationships 2. Incremental, flexible data structure 3. Good-enough modeling 4. Query interface tailored to the data model May 12, 2009 14
    15. 15. Names May 12, 2009 15
    16. 16. Branding Semantic Web Web of Data Giant Global Graph Data Web Web 3.0 Linked Data Web Semantic Data Web May 12, 2009 16
    17. 17. What is it & why do we care? (1) “The Semantic Web” Augments the World Wide Web Represents the Web’s information in a machine-readable fashion Enables… …targeted search …data browsing …automated agents World Wide Web : Web pages :: The Semantic Web : Data
    18. 18. What is it & why do we care? (2) “Semantic Web technologies” A family of technology standards that ‘play nice together’, including: Flexible data model Expressive ontology language Distributed query language Drive Web sites, enterprise applications The technologies enable us to build applications and solutions that were not possible, practical, or feasible traditionally. May 12, 2009 18
    19. 19. A Common & Coherent Set of Technology Standards A common set of technologies: ...enables diverse uses ...encourages interoperability A coherent set of technologies: …encourage incremental application …provide a substantial base for innovation A standard set of technologies: ...reduces proprietary vendor lock-in ...encourages many choices for tool sets May 12, 2009 19
    20. 20. The (In)Famous Layer Cake May 12, 2009 20
    21. 21. Semantic Web Technology Timeline [Timeline figure covering 1999, 2001, 2004, 2007, 2008, and 2009; legible labels include RIF and HCLS]
    22. 22. 2009: Where we are As technologies & tools have evolved, Semantic Web advocates have progressed through stages (report on… / execute on…): Semantic Web vision / Initial experiments; Experiments / Technology standards; Technology standards / Software packages; Software packages / Proofs of concept; Proofs of concept / Production implementations
    23. 23. 2009: Where we are (cont’d) http://www.w3.org/2001/sw/sweo/public/UseCases/ May 12, 2009 23
    24. 24. 2009: Where we’re not Image from Trey Ideker via Enoch Huang Semantic Web technologies are not a ‘magic crank’ for discovering new drugs (or solving other problems, for that matter)! May 12, 2009 24
    25. 25. 2009: Where we’re not (cont’d) XML vs. RDF? “Ontology” vs. “ontology”? Data integration vs. Semantic Web vs. reasoning vs. KBs Linked Data? vs. search vs. app. development vs. … The Semantic Web still suffers from confusing and conflicting messaging, each of which asserts it’s “correct”. May 12, 2009 25
    26. 26. 2009: Where we’re not (cont’d) People with appropriate skill sets for designing & building Semantic Web solutions are not widely available. May 12, 2009 26
    27. 27. 2009: Where we’re not (cont’d) We don’t yet have standard solutions for privacy, trust, probability, and other elements of the Semantic Web vision. May 12, 2009 27
    28. 28. Introduction to the Semantic Web approach How does a Semantic Web approach help us merge data sets, infer new relations, and integrate outside data sources? Thanks to Ivan Herman for this example May 12, 2009 28
    29. 29. The rough structure of data integration 1. Map the various data onto an abstract data representation (make the data independent of its internal representation…) 2. Merge the resulting representations 3. Start making queries on the whole (queries not possible on the individual data sets)
    30. 30. Data set “A”: A simplified book store. Books (ID; Author; Title; Publisher; Year): ISBN0-00-651409-X; id_xyz; The Glass Palace; id_qpr; 2000. Authors (ID; Name; Home page): id_xyz; Ghosh, Amitav; http://www.amitavghosh.com. Publishers (ID; Publisher Name; City): id_qpr; Harper Collins; London.
    31. 31. 1st: Export your data as a set of relations
    32. 32. Some notes on the data export Data export does not necessarily mean physical conversion of the data Relations can be virtual, generated on-the-fly at query time via SQL “bridges” scraping HTML pages extracting data from Excel sheets etc. One can export part of the data May 12, 2009 32
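A minimal sketch of the export step, assuming the Books row from data set “A” above. The helper name `row_to_triples`, the `a:` prefix, and the lower-cased column names used as predicates are illustrative assumptions, not something the slides prescribe.

```python
# Hypothetical helper: turn one relational row into a set of
# (subject, predicate, object) relations. The key column supplies the
# subject; every other column becomes one relation.
def row_to_triples(row, key, prefix):
    subject = row[key]
    return {
        (subject, f"{prefix}:{column.lower()}", value)
        for column, value in row.items()
        if column != key
    }

# The Books row from data set "A".
book_row = {
    "ID": "ISBN0-00-651409-X",
    "Author": "id_xyz",
    "Title": "The Glass Palace",
    "Publisher": "id_qpr",
    "Year": "2000",
}

triples = row_to_triples(book_row, key="ID", prefix="a")
for triple in sorted(triples):
    print(triple)
```

Because the relations are computed from the row on demand, the same function could sit behind a virtual, generated-at-query-time export rather than a physical conversion.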
    33. 33. Data set “F”: Another book store’s data (spreadsheet). Row 1 (headers): ID; Titre; Traducteur; Original. Row 2: ISBN0 2020386682; Le Palais des miroirs; A13; ISBN-0-00-651409-X. Row 6 (headers): ID; Auteur. Row 7: ISBN-0-00-651409-X; A12. Row 11 (header): Nom. Row 12: Ghosh, Amitav. Row 13: Besse, Christianne.
    34. 34. 2nd: Export your second set of data May 12, 2009 34
    35. 35. 3rd: start merging your data May 12, 2009 35
    36. 36. 3rd: start merging your data (cont’d) May 12, 2009 36
    37. 37. 4th: Merge identical resources May 12, 2009 37
    38. 38. Start making queries… User of data set “F” can now ask queries like: “What is the title of the original version of Le Palais des miroirs?” This information is not in the data set “F”... …but can be retrieved after merging with data set “A”! May 12, 2009 38
    39. 39. 5th: Query the merged data set May 12, 2009 39
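The merge-then-query steps above can be sketched with plain Python sets. This is only an illustration: the `urn:isbn:` identifiers and the `a:` / `f:` predicate names are assumptions standing in for the URIs used in the original figures.

```python
# Each data set is a set of (subject, predicate, object) triples.
BOOK = "urn:isbn:000651409X"         # assumed shared URI for the original book
TRANSLATION = "urn:isbn:2020386682"  # assumed URI for the French edition

data_a = {
    (BOOK, "a:title", "The Glass Palace"),
    (BOOK, "a:year", "2000"),
}
data_f = {
    (TRANSLATION, "f:titre", "Le Palais des miroirs"),
    (TRANSLATION, "f:original", BOOK),
}

# Merging two graphs is set union; identical URIs line up automatically.
merged = data_a | data_f

def objects(graph, subject, predicate):
    return {o for s, p, o in graph if s == subject and p == predicate}

def subjects(graph, predicate, obj):
    return {s for s, p, o in graph if p == predicate and o == obj}

def original_titles(graph, french_title):
    """Title of the original version of the book with the given French title."""
    titles = set()
    for book in subjects(graph, "f:titre", french_title):
        for original in objects(graph, book, "f:original"):
            titles |= objects(graph, original, "a:title")
    return titles

print(original_titles(data_f, "Le Palais des miroirs"))  # set()
print(original_titles(merged, "Le Palais des miroirs"))  # {'The Glass Palace'}
```

The query returns nothing against data set “F” alone and succeeds only on the merged graph, which is the point of the slide.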
    40. 40. However, more can be achieved… We “know” that a:author and f:auteur are really the same But our automatic merge does not know that! Let us add some extra information to the merged data: a:author is the same as f:auteur Both identify a Person, a category (type) for certain resources May 12, 2009 40
    41. 41. 3rd revisited: Use the extra knowledge May 12, 2009 41
    42. 42. Start making richer queries! User of data set “F” can now query: “What is the home page of Le Palais des miroirs’s ‘auteur’?” The information is not in data set “F” or “A”… …but was made available by: Merging data sets “A” and “F” Adding three simple “glue” statements May 12, 2009 42
    43. 43. 6th: Richer queries May 12, 2009 43
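A rough sketch of how one “glue” statement enables the richer query, again with plain Python sets. The node names (`_:ghosh_a`, `_:ghosh_f`), the `urn:isbn:` identifier, and the predicate names are assumptions for illustration; a real system would express the glue in OWL or RDFS and use a reasoner.

```python
BOOK = "urn:isbn:000651409X"  # assumed shared URI for the book

merged = {
    (BOOK, "a:author", "_:ghosh_a"),
    ("_:ghosh_a", "a:homepage", "http://www.amitavghosh.com"),
    (BOOK, "f:auteur", "_:ghosh_f"),
    ("_:ghosh_f", "f:nom", "Ghosh, Amitav"),
}

# "Glue": pairs of predicates declared to be the same relation.
same_relation = {("a:author", "f:auteur")}

def apply_glue(graph, equivalences):
    """Copy every statement made with one predicate onto its equivalent."""
    inferred = set(graph)
    for p1, p2 in equivalences:
        for s, p, o in graph:
            if p == p1:
                inferred.add((s, p2, o))
            elif p == p2:
                inferred.add((s, p1, o))
    return inferred

def homepages_of_auteur(graph, book):
    return {
        home
        for s, p, person in graph if s == book and p == "f:auteur"
        for s2, p2, home in graph if s2 == person and p2 == "a:homepage"
    }

print(homepages_of_auteur(merged, BOOK))                             # set()
print(homepages_of_auteur(apply_glue(merged, same_relation), BOOK))  # {'http://www.amitavghosh.com'}
```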
    44. 44. Bring in other data sources We can integrate new information into our merged data set from other sources e.g. additional information about author Amitav Ghosh Perhaps the largest public source of general knowledge is Wikipedia Structured data can be extracted from Wikipedia using dedicated tools May 12, 2009 44
    45. 45. 7th: Merge with Wikipedia data May 12, 2009 45
    46. 46. 7th (cont’d): Merge with Wikipedia data May 12, 2009 46
    47. 47. 7th (cont’d): Merge with Wikipedia data May 12, 2009 47
    48. 48. Is that surprising? It may look like it but, in fact, it should not be… What happened via automatic means is done every day by Web users! The difference: a bit of extra rigour so that machines could do this, too May 12, 2009 48
    49. 49. What did we do? We combined different data sets that ...may be internal or somewhere on the Web ...are of different formats (RDBMS, Excel spreadsheet, (X)HTML, etc) ...have different names for the same relations We could combine the data because some URIs were identical i.e. the ISBNs in this case We could add some simple additional information (the “glue”) to help further merge data sets The result? Answer queries that could not previously be asked May 12, 2009 49
    50. 50. What did we do? (cont’d) May 12, 2009 50
    51. 51. The abstraction pays off because… …the graph representation is independent of the details of the native structures …changes in local database schemas, HTML structures, etc. do not affect the whole (“schema independence”) …new data, new connections can be added seamlessly & incrementally
    52. 52. So where is the Semantic Web? Semantic Web technologies make such integration possible The rest of this tutorial introduces many of these technologies. May 12, 2009 52
    53. 53. Agenda: Introduction; The data model (RDF); The query language (SPARQL); Adding structure & semantics (RDFS, OWL, RIF); Working in the real world (GRDDL, RDB2RDF); Working on the Web (Linked Data, RDFa, POWDER)
    54. 54. RDF is… Resource Description Framework May 12, 2009 54
    55. 55. RDF is… The data model of the Semantic Web. May 12, 2009 55
    56. 56. RDF is… A schema-less data model that features unambiguous identifiers and named relations between pairs of resources. May 12, 2009 56
    57. 57. RDF is… A labeled, directed graph of relations between resources and literal values. RDF graphs are collections of triples. Triples are made up of a subject, a predicate, and an object (subject → predicate → object). Resources and relationships are named with URIs.
    58. 58. Example RDF triples “Lee Feigenbaum works for Cambridge Semantics”: Lee Feigenbaum → works for → Cambridge Semantics. “Lee Feigenbaum was born in 1978”: Lee Feigenbaum → born in → 1978. “Cambridge Semantics is headquartered in Massachusetts”: Cambridge Semantics → headquartered → Massachusetts.
    59. 59. Triples connect to form graphs works for Lee Cambridge Feigenbaum Semantics headquartered born in lives in Massachusetts 1978 capital Boston May 12, 2009 59
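As a small illustration of triples connecting into a graph, the three example triples can be stored as tuples and walked like any directed graph. Plain strings stand in for the URIs that real RDF would use, and the function name `reachable` is an assumption of this sketch.

```python
from collections import defaultdict

# The three example triples, as (subject, predicate, object).
triples = [
    ("Lee Feigenbaum", "works for", "Cambridge Semantics"),
    ("Lee Feigenbaum", "born in", "1978"),
    ("Cambridge Semantics", "headquartered", "Massachusetts"),
]

# Index outgoing edges by subject: a labeled, directed graph.
graph = defaultdict(list)
for s, p, o in triples:
    graph[s].append((p, o))

def reachable(graph, start):
    """Everything that can be reached from `start` by following edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for _predicate, target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen

print(sorted(reachable(graph, "Lee Feigenbaum")))
```

“Massachusetts” is reachable from “Lee Feigenbaum” only because the two triples share the node “Cambridge Semantics”.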
    60. 60. Why RDF? What’s different here? The graph data structure makes merging data with shared identifiers trivial (as we saw earlier) Triples act as a least common denominator for expressing data URIs for naming remove ambiguity …the same identifier means the same thing May 12, 2009 60
    61. 61. Why RDF? Incremental Integration Agile, Flexible URIs for Incremental Graph naming Model Integration Relational RDF Database May 12, 2009 61
    62. 62. Types of RDF Tools Triple stores Built on relational database Native RDF store Development libraries Full-featured application servers Most RDF tools contain some elements of each of these. May 12, 2009 62
    63. 63. Finding RDF Tools Community-maintained lists http://esw.w3.org/topic/SemanticWebTools Emphasis on large triple stores http://esw.w3.org/topic/LargeTripleStores Michael Bergman’s Sweet Tools searchable list: http://www.mkbergman.com/?page_id=325 May 12, 2009 63
    64. 64. RDF Tools – (Some) Triple Stores. Tool (commercial or open-source; environment): Anzo (both; Java), ARC (open-source; PHP), AllegroGraph (commercial; Java, Prolog), Jena (open-source; Java), Mulgara (open-source; Java), Oracle RDF (commercial; SQL / SPARQL), RDF::Query (open-source; Perl), Redland (open-source; C, many wrappers), Sesame (open-source; Java), Talis Platform (commercial; HTTP, hosted), Virtuoso (both; C++)
    65. 65. Agenda: Introduction; The data model (RDF); The query language (SPARQL); Adding structure & semantics (RDFS, OWL, RIF); Working in the real world (GRDDL, RDB2RDF); Working on the Web (Linked Data, RDFa, POWDER)
    66. 66. Motivating SPARQL With a query language, a client can design their own interface. --Leigh Dodds, Talis May 12, 2009 66
    67. 67. SPARQL is… SPARQL Protocol And RDF Query Language May 12, 2009 67
    68. 68. SPARQL is… The query language of the Semantic Web. May 12, 2009 68
    69. 69. SPARQL is… A SQL-like language for querying sets of RDF graphs. May 12, 2009 69
    70. 70. SPARQL is… A simple protocol for issuing queries and receiving results over HTTP. So… Every SPARQL client works with every SPARQL server! May 12, 2009 70
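A minimal sketch of the protocol side, assuming the common case where a query is sent with HTTP GET and carried in a `query` parameter. The helper name and the placeholder query are illustrative; the endpoint is the DBpedia one named in the notes. No request is actually sent here.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "http://dbpedia.org/sparql"          # endpoint named in the notes
QUERY = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 1"  # placeholder query

def sparql_get_url(endpoint, query):
    """Build the URL for a SPARQL query issued via HTTP GET."""
    return endpoint + "?" + urlencode({"query": query})

url = sparql_get_url(ENDPOINT, QUERY)
print(url)

# Round trip: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY)
```

Because the request is just a URL, any HTTP client can act as a SPARQL client, which is what makes clients and servers interchangeable.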
    71. 71. Why SPARQL? SPARQL lets us: Pull information from structured and semi-structured data. Explore data by discovering unknown relationships. Query and search an integrated view of disparate data sources. Glue separate software applications together by transforming data from one vocabulary to another.
    72. 72. [Diagram: Dealer 1, Dealer 2, Dealer 3, Employee Directory, ERP / Budget System, Web, and EPA Fuel Efficiency Spreadsheet feed a SPARQL Query Engine, which serves a Web dashboard.] Question: What automobiles get more than 25 miles per gallon, fit within my department’s budget, and can be purchased at a dealer located within 10 miles of one of my employees? SPARQL query: SELECT ?automobile WHERE { ?automobile a ex:Car ; epa:mpg ?mpg ; ex:dealer ?dealer . ?employee a ex:Employee ; geo:loc ?loc . ?dealer geo:loc ?dealerloc . FILTER(?mpg > 25 && geo:dist(?loc, ?dealerloc) <= 10) . }
    73. 73. SPARQL Example: Querying Wikipedia Find me all landlocked countries with a population greater than 15 million. PREFIX type: <http://dbpedia.org/class/yago/> PREFIX prop: <http://dbpedia.org/property/> SELECT ?country_name ?population WHERE { ?country a type:LandlockedCountries ; rdfs:label ?country_name ; prop:populationEstimate ?population . FILTER ( ?population > 15000000 && langMatches(lang(?country_name), "EN") ) . } ORDER BY DESC(?population)
    74. 74. SPARQL Example: Querying Wikipedia DBPedia SPARQL Endpoint
    75. 75. SPARQL Example: Querying Wikipedia
    76. 76. Types of SPARQL Tools Query engines Things that can run queries Most RDF stores provide a SPARQL engine Query rewriters E.g. to query relational databases (more later) Endpoints Things that accept queries on the Web and return results Client libraries Things that make it easy to ask queries May 12, 2009 76
    77. 77. Finding SPARQL Tools Community-maintained list of query engines http://esw.w3.org/topic/SparqlImplementations Publicly accessible SPARQL endpoints http://esw.w3.org/topic/SparqlEndpoints Michael Bergman’s Sweet Tools searchable list: http://www.mkbergman.com/?page_id=325 May 12, 2009 77
    78. 78. (Some) SPARQL’able Data Sets May 12, 2009 78
    79. 79. bio2rdf.org – querying life sciences data May 12, 2009 79
    80. 80. bio2rdf.org – querying life sciences data May 12, 2009 80
    81. 81. Agenda: Introduction; The data model (RDF); The query language (SPARQL); Adding structure & semantics (RDFS, OWL, RIF); Working in the real world (GRDDL, RDB2RDF); Working on the Web (Linked Data, RDFa, POWDER)
    82. 82. Where’s the magic? We haven’t seen anything yet that begins to approach the long-term Semantic Web vision May 12, 2009 82
    83. 83. From the explicit to the inferred 3 pieces of the Semantic Web technology stack are about describing a domain well enough to capture (some of) the meaning of resources and relationships in the domain RDF Schema OWL RIF Apply knowledge to data to get more data. May 12, 2009 83
    84. 84. RDFS is… RDF Schema May 12, 2009 84
    85. 85. RDF Schema is… Elements of: Vocabulary (defining terms) I define a relationship called “prescribed dose.” Schema (defining types) “prescribed dose” relates “treatments” to “dosages” Taxonomy (defining hierarchies) Any “doctor” is a “medical professional”
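The schema and taxonomy statements above can be turned into inferred facts with a few lines of Python. This sketch covers only three RDFS entailment patterns (domain, range, subclass); the `ex:` names and the individuals `ex:alice`, `ex:treatment1`, and `ex:dose1` are made up for illustration.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"

schema = {
    ("ex:Doctor", SUBCLASS, "ex:MedicalProfessional"),
    ("ex:prescribedDose", DOMAIN, "ex:Treatment"),
    ("ex:prescribedDose", RANGE, "ex:Dosage"),
}
data = {
    ("ex:alice", RDF_TYPE, "ex:Doctor"),
    ("ex:treatment1", "ex:prescribedDose", "ex:dose1"),
}

def rdfs_closure(triples):
    """Apply domain, range, and subclass rules until nothing new appears."""
    inferred = set(triples)
    changed = True
    while changed:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                if p2 == DOMAIN and s2 == p:
                    new.add((s, RDF_TYPE, o2))   # subject gets the domain type
                if p2 == RANGE and s2 == p:
                    new.add((o, RDF_TYPE, o2))   # object gets the range type
                if p == RDF_TYPE and p2 == SUBCLASS and s2 == o:
                    new.add((s, RDF_TYPE, o2))   # instance of the superclass
        changed = not new <= inferred
        inferred |= new
    return inferred

closure = rdfs_closure(schema | data)
for triple in sorted(closure - schema - data):
    print(triple)
```

The printed triples are exactly the ones that were never stated explicitly: the doctor is also a medical professional, and the two ends of “prescribed dose” are typed as a treatment and a dosage.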
    86. 86. WOL OWL is… Web Ontology Language May 12, 2009 86
    87. 87. OWL is… Elements of ontology Same/different identity “author” and “auteur” are the same relation two resources with the same “ISBN” are the same “book” More expressive type definitions A “cycle” is a “vehicle” with at least one “wheel” A “bicycle” is a “cycle” with exactly two “wheels” More expressive relation definitions “sibling” is a symmetric predicate the value of the “favorite dwarf” relation must be one of “happy”, “sleepy”, “sneezy”, “grumpy”, “dopey”, “bashful”, “doc” May 12, 2009 87
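Two of the ideas listed above, a symmetric predicate and identity by a shared key, can be sketched without any OWL tooling. All names here (`ex:sibling`, `ex:isbn`, the individuals, the function names) are illustrative assumptions; an OWL reasoner would derive these facts from declarations in the ontology rather than from hand-written functions.

```python
def symmetric_closure(triples, symmetric_predicates):
    """If p is symmetric and (s, p, o) holds, then (o, p, s) holds too."""
    return set(triples) | {
        (o, p, s) for s, p, o in triples if p in symmetric_predicates
    }

def same_by_key(triples, key_predicate):
    """Group subjects that share a value for an identifying predicate."""
    by_value = {}
    for s, p, o in triples:
        if p == key_predicate:
            by_value.setdefault(o, set()).add(s)
    return [group for group in by_value.values() if len(group) > 1]

facts = {
    ("ex:anna", "ex:sibling", "ex:ben"),
    ("a:book1", "ex:isbn", "000651409X"),
    ("f:livre7", "ex:isbn", "000651409X"),
    ("f:livre8", "ex:isbn", "2020386682"),
}

closed = symmetric_closure(facts, {"ex:sibling"})
print(("ex:ben", "ex:sibling", "ex:anna") in closed)
print(same_by_key(facts, "ex:isbn"))
```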
    88. 88. What can we do with OWL? Answer questions of Consistency Are there any contradictions in this model? Classification What are all the inferred types of this resource? Satisfiability Are there any classes in this ontology that cannot possibly have any members? May 12, 2009 88
    89. 89. Building Useful Ontologies Developing and maintaining quality ontologies is very challenging Users need tools and services, e.g., to help check if ontology is: Meaningful: all named classes can have instances http://www.aber.ac.uk/compsci/public/media/presentations/OUCL-seminar.ppt
    90. 90. Building Useful Ontologies Developing and maintaining quality ontologies is very challenging Users need tools and services, e.g., to help check if ontology is: Meaningful: all named classes can have instances Correct: captures intuitions of domain experts
    91. 91. Building Useful Ontologies Developing and maintaining quality ontologies is very challenging Users need tools and services, e.g., to help check if ontology is: Meaningful: all named classes can have instances Correct: captures intuitions of domain experts Minimally redundant: no unintended synonyms (pictured: Banana split, Banana sundae)
    92. 92. Example: SNOMED Large: 373,731 concepts & over 1 million terms NHS version extended to 542,380 classes with 19,828 additional named classes 148,821 class drug taxonomy (primitive hierarchy) OWL reasoner (FaCT++) classified NHS ontology Able to classify whole ontology in <4 hours Interesting results come from 19,828 additional named classes 180 missing subClass relationships were found, e.g.: Periocular_dermatitis subClassOf Disease_of_face May 12, 2009 92
    93. 93. Example: SNOMED May 12, 2009 93
    94. 94. RIF is… Rule Interchange Format
    95. 95. RIF is… Standard representation for exchanging sets of logical and business rules Logical rules A buyer buys an item from a seller if the seller sells the item to the buyer A customer becomes a "Gold" customer as soon as his cumulative purchases during the current year top $5000 Production rules Customers that become "Gold" customers must be notified immediately, and a golden customer card will be printed and sent to them within one week For shopping carts worth more than $1000, "Gold" customers receive an additional discount of 10% of the total amount
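To make the kind of rule being exchanged concrete, here are two of the rules above written as ordinary Python functions. This is not RIF syntax and says nothing about how a rule engine would encode them; the function names and the reading of “top $5000” as “strictly more than $5000” are assumptions of this sketch.

```python
def is_gold(purchases_this_year):
    """Logical rule: Gold once cumulative purchases this year top $5000."""
    return sum(purchases_this_year) > 5000

def cart_total(cart_value, gold):
    """Production rule: carts over $1000 get an extra 10% off for Gold customers."""
    if gold and cart_value > 1000:
        return cart_value - cart_value * 10 / 100
    return cart_value

purchases = [3000, 2500]
gold = is_gold(purchases)
print(gold)
print(cart_total(2000, gold))
print(cart_total(800, gold))
```

The value of an interchange format is that rules like these could be written once and loaded into different rule engines instead of being re-coded per system.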
    96. 96. Developing Tools and Infrastructure Editors/environments OilEd, Protégé, Swoop, TopBraid, Ontotrack, …
    97. 97. Developing Tools and Infrastructure Editors/environments OilEd, Protégé, Swoop, TopBraid, Ontotrack, … Reasoning systems Cerebra, FaCT++, Kaon2, Pellet, Racer, CEL, …
    98. 98. Visualizing and Publishing Vocabularies May 12, 2009 101
    99. 99. Reusable, public ontologies FOAF The Event Ontology Measurement Units Ontology May 12, 2009 102
    100. 100. Agenda: Introduction; The data model (RDF); The query language (SPARQL); Adding structure & semantics (RDFS, OWL, RIF); Working in the real world (GRDDL, RDB2RDF); Working on the Web (Linked Data, RDFa, POWDER)
    101. 101. Fantasy Land Architecture [Diagram: Ontology / Schema, plus six Custom UI boxes]
    102. 102. Reality [Diagram: Internet, DB2, XML, LDAP Directory, Oracle RDB, and six Custom UI boxes]
    103. 103. GRDDL is… Gleaning Resource Descriptions from Dialects of Languages
    104. 104. GRDDL is… A method for authoritatively getting RDF data from XML and XHTML documents. May 12, 2009 107
    105. 105. GRDDL is… A mechanism for authoritatively deriving RDF data from families of XML and XHTML documents. May 12, 2009 108
    106. 106. GRDDL tools Most GRDDL tools are adapters to existing RDF stores or SPARQL engines to allow loading or querying data from XML and XHTML sources. Community-maintained list: http://esw.w3.org/topic/GrddlImplementations Host system: GRDDL tool. Jena: GRDDL Reader for Jena; RDFLib: GRDDL.py; Redland: (built in); Swignition: (built in); Virtuoso: GRDDL “Sponger”.
    107. 107. RDB2RDF is… Relational Database to RDF May 12, 2009 110
    108. 108. RDB2RDF is… A proposed W3C Working Group to define a standard way to map from relational databases to RDF (and SPARQL). May 12, 2009 111
    109. 109. RDB2RDF tools Survey of existing approaches: http://www.w3.org/2005/Incubator/rdb2rdf/RDB2RDF_SurveyReport.pdf Tool (mapping approach; dynamic vs. static (ETL)): Anzo (D2RQ configuration graph; Both), Asio Tools (OWL file, SWRL rules; Both), Dartgrid (XML file, visual mapper; Dynamic), D2RQ (D2RQ configuration file; Both), R2O (R2O XML file; Both), RDBtoOnto (Constraint rules; Static (ETL)), SDS (EII Query Engine/OOM XML; Both), Triplify (SQL config file; Linked Data), Virtuoso (RDF View Meta-Schema Language; Both)
    110. 110. What about… everything else? Standards don’t yet exist, but many tools exist to derive RDF and/or run SPARQL queries against other sources of data. May 12, 2009 113
    111. 111. LDAP Directories Squirrel RDF http://jena.sourceforge.net/SquirrelRDF/ May 12, 2009 114
    112. 112. Excel spreadsheets Anzo for Excel http://www.cambridgesemantics.com/products/anzo_for_excel May 12, 2009 115
    113. 113. Excel spreadsheets Semantic Discovery System http://insilicodiscovery.com/installation/index.php May 12, 2009 116
    114. 114. Web-based data sources Virtuoso Sponger Cartridges http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSponger May 12, 2009 117
    115. 115. Unstructured Text Calais http://www.opencalais.com/ May 12, 2009 118
    116. 116. Unstructured Text Zemanta Web Service http://developer.zemanta.com/ May 12, 2009 119
    117. 117. Agenda: Introduction; The data model (RDF); The query language (SPARQL); Adding structure & semantics (RDFS, OWL, RIF); Working in the real world (GRDDL, RDB2RDF); Working on the Web (Linked Data, RDFa, POWDER)
    118. 118. Linked Data is… A simple set of 4 guidelines for publishing RDF data on the Web (over HTTP), developed by Tim Berners-Lee in 2006: 1. Use URIs as names for things (globally unique identity). 2. Use HTTP URIs (everyone has a Web browser/client). 3. When someone looks up a URI, provide useful information (…in the form of RDF data). 4. Include links to other URIs (foster discovery of additional information).
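Guideline 3 in practice usually means an ordinary HTTP request that asks for RDF instead of HTML. The sketch below only builds such a request and does not send it; the helper name, the example URI (which follows DBpedia’s resource naming pattern), and the choice of `application/rdf+xml` as the requested media type are assumptions for illustration.

```python
from urllib.request import Request

def linked_data_request(uri, media_type="application/rdf+xml"):
    """Prepare a GET request that asks the server for an RDF representation."""
    return Request(uri, headers={"Accept": media_type})

# Illustrative URI only; nothing is fetched in this sketch.
request = linked_data_request("http://dbpedia.org/resource/Amitav_Ghosh")

print(request.get_method())          # GET
print(request.full_url)
print(request.get_header("Accept"))  # application/rdf+xml
```

Passing the prepared request to `urllib.request.urlopen` would perform the lookup; the RDF that comes back would, per guideline 4, contain further URIs to follow.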
    119. 119. The Linking Open Data Project is... A community project started within the W3C Semantic Web Education & Outreach group in 2007 A wealth of existing, open Web-based data sets exposed in RDF and linked together A growing number of publicly available SPARQL endpoints The first steps of “The” Semantic Web? No longer easily measured or depicted! May 12, 2009 122
    120. 120. The LOD “cloud”, May 2007 May 12, 2009 123
    121. 121. The LOD “cloud”, March 2008 May 12, 2009 124
    122. 122. The LOD “cloud”, September 2008 May 12, 2009 125
    123. 123. The LOD “cloud”, March 2009 May 12, 2009 126
    124. 124. Application specific portions of the cloud Notably, bio-related data sets (in light purple) some by the W3C “Linking Open Drug Data” task force May 12, 2009 127
    125. 125. Sindice - Another view of data on the Web May 12, 2009 128
    126. 126. Tools: Publishing linked data Many tools we’ve already seen publish RDF data according to linked data principles E.g. Talis platform, Virtuoso, Triplify Others sit on top of existing systems and make the data available as Linked Data E.g. pubby May 12, 2009 129
    127. 127. Tools: the Data Browser World Wide Web : Web pages :: The Semantic Web : Data World Wide Web : Web browser :: Linked Data Web : Data browser May 12, 2009 130
    128. 128. Tabulator: Generic Data Browser May 12, 2009 131
    129. 129. Disco Hyperdata Browser May 12, 2009 132
    130. 130. OpenLink Data Explorer May 12, 2009 133
    131. 131. Marbles Linked Data Browser May 12, 2009 134
    132. 132. DBPedia Mobile May 12, 2009 135
    133. 133. DBPedia Mobile May 12, 2009 136
    134. 134. DBPedia Mobile May 12, 2009 137
    135. 135. DBPedia Mobile May 12, 2009 138
    136. 136. QDOS – your online digital status May 12, 2009 139
    137. 137. BBC Music Beta May 12, 2009 140
    138. 138. Producer-oriented Web to consumer-oriented Web On the current Web… Content publishers decide what can be done with the data (via links, script) On the Semantic Web… Content publishers publish actionable data Content consumers decide how to act on it
    139. 139. UltraLink UltraLink is Novartis’s solution for cross-linking over 1,500,000 biologic and chemical terms, including synonyms, taxonomies, and pointers into data repositories. May 12, 2009 142
    140. 140. UltraLink What if an acquisition brings with it a new Web-based corpus of pathway data that uses terms not recognized by the annotators? New text miners must be created & deployed Finding & consuming data are too tightly coupled May 12, 2009 143
    141. 141. RDFa is… RDF in Attributes May 12, 2009 144
    142. 142. RDFa is… A collection of HTML attributes that allow RDF to be embedded directly in Web pages. May 12, 2009 145
    143. 143. Why RDFa? Don’t Repeat Yourself (DRY) In-context metadata (copy & paste) Authoritative (no screen scraping)
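A drastically simplified sketch of what an RDFa-aware client does: read triples straight out of the attributes on a page. It handles only `about`, `property`, and `content`, assumes every element has a closing tag, and ignores the rest of the RDFa processing rules (prefix resolution, `rel`, `typeof`, and so on). The class name, the sample markup, and the `dc:` property names are illustrative assumptions.

```python
from html.parser import HTMLParser

class TinyRDFa(HTMLParser):
    """Toy extractor: supports only @about, @property and @content."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self.subjects = [""]   # stack of subjects; "" stands for the page itself
        self.pending = None    # (subject, property) waiting for element text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # A nested element inherits the subject of its parent unless it sets @about.
        self.subjects.append(attrs.get("about", self.subjects[-1]))
        if "property" in attrs:
            if "content" in attrs:
                self.triples.append(
                    (self.subjects[-1], attrs["property"], attrs["content"]))
            else:
                self.pending = (self.subjects[-1], attrs["property"])

    def handle_data(self, data):
        if self.pending and data.strip():
            self.triples.append((*self.pending, data.strip()))
            self.pending = None

    def handle_endtag(self, tag):
        self.pending = None
        if len(self.subjects) > 1:
            self.subjects.pop()

page = (
    '<div about="urn:isbn:000651409X">'
    '<span property="dc:title">The Glass Palace</span> by '
    '<span property="dc:creator">Amitav Ghosh</span>'
    '</div>'
)

parser = TinyRDFa()
parser.feed(page)
for triple in parser.triples:
    print(triple)
```

The visible text and the data are the same characters in the page, which is the “Don’t Repeat Yourself” argument in miniature.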
    144. 144. Who’s using RDFa? STW Thesaurus for Economics May 12, 2009 147
    145. 145. RDFa in action May 12, 2009 148
    146. 146. POWDER is… Protocol for Web Description Resources May 12, 2009 149
    147. 147. http://www.slideshare.net/fabien_gandon/powder-in-a-nutshell-presentation descriptions applied to groups of online resources 150
    148. 148. many resources one description 151
    149. 149. grouping mechanisms... ... list URIs ... domain names, paths ... regular expressions on URIs 152
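The grouping idea can be illustrated with a small membership test: a group is defined by host names, path prefixes, and regular expressions, and a description attached to the group applies to any URI that matches. This is only a sketch of the concept in Python, not POWDER’s actual vocabulary or matching semantics; the function name, parameters, and example URIs are assumptions.

```python
import re
from urllib.parse import urlsplit

def in_group(uri, hosts=(), path_prefixes=(), patterns=()):
    """True if the URI satisfies every kind of constraint that was supplied."""
    parts = urlsplit(uri)
    host = parts.hostname or ""
    # Host constraint: the host itself or any subdomain of a listed host.
    if hosts and not any(host == h or host.endswith("." + h) for h in hosts):
        return False
    if path_prefixes and not any(parts.path.startswith(p) for p in path_prefixes):
        return False
    if patterns and not any(re.search(p, uri) for p in patterns):
        return False
    return True

# One description, many resources.
group = {"hosts": ["example.org"], "path_prefixes": ["/docs/"]}
description = {"audience": "general"}

for uri in ("http://www.example.org/docs/a.html",
            "http://www.example.org/private/b.html",
            "http://example.com/docs/a.html"):
    print(uri, description if in_group(uri, **group) else None)
```

Queries are still made about individual resources, as the next slide notes; the group definition just saves writing the same description once per URI.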
    150. 150. descriptions may be grouped queries are on individual resources 153
    151. 151. description… • Which resources does the DR describe? • What is the description? • Who has created the description? • When was the description created? • Until when is the description considered valid? • From when is the description considered valid? • Does anybody agree with this description? • Do other descriptions exist about this group of resources? 154
    152. 152. in order to... adapt authorize protect trust search monitor 155
    153. 153. Thanks & Questions lee@cambridgesemantics.com May 12, 2009 156
