See http://beckr.org/DBpediaMobile/ and http://wiki.dbpedia.org/DBpediaMobile
One of the goals of this tutorial is to demystify all of the names of technologies, tools, projects, etc. that swirl around the Semantic Web story. And since I saw, as I researched this presentation, that everyone seems to like this particular Gary Larson cartoon, it behooved me to include it.
Thanks to Fabien Gandon for the POWDER slides: http://www.slideshare.net/fabien_gandon/powder-in-a-nutshell-presentation
The good – emphasizes the importance of the foundational layers (URIs and RDF); emphasizes the long-term roadmap/vision of what’s needed for the Semantic Web.
The bad – implies that perhaps things can’t be taken seriously until all the pieces are in place; implies an order to the research; various versions of the cake tell different stories (importance of XML, absence of query, lack of UI/application layer, …).
Valentin Zacharias wrote about the “infamy” part of the layer cake here: http://www.valentinzacharias.de/blog/2007/04/ban-semantic-web-layer-cake.html
http://www.w3.org/2001/sw/sweo/public/UseCases/
Definition.
Prescriptive.
Descriptive.
Formal.
The first is as opposed to relational tables or XML schemas where the schema needs to be explicitly adjusted to accommodate whatever data is being merged.
The second is due to the expressivity of the model – can handle lists, trees, n-ary relations, etc.
The third is as opposed to table & column identifiers or XML attribute names.
Quotation from http://xtech06.usefulinc.com/schedule/paper/61
Definition.
Prescriptive.
Descriptive.
Descriptive (part 2). This is leagues ahead of the situation with SQL!
To run for real: http://dbpedia.org/sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX type: <http://dbpedia.org/class/yago/>
PREFIX prop: <http://dbpedia.org/property/>
SELECT ?country_name ?population
WHERE {
  ?country a type:LandlockedCountries ;
    rdfs:label ?country_name ;
    prop:populationEstimate ?population .
  FILTER (?population > 15000000 && langMatches(lang(?country_name), "EN")) .
}
ORDER BY DESC(?population)
http://bio2rdf.org/
Definition.
Definition.
Thanks to Bijan Parsia for much of this material: http://www.cs.man.ac.uk/~bparsia/2009/comp60462/17-03-casestudies.pdf
Semantic Web Landscape 2009: Presentation Transcript
The 2009 Semantic Web Landscape
Technologies, tools, and projects
Lee Feigenbaum
VP Technology & Standards, Cambridge Semantics
Co-chair, W3C SPARQL Working Group
For PRISM Forum SIG on Semantic Web
May 12, 2009
Thanks Upfront
Much material & wisdom used with gracious
permission of:
Ivan Herman
W3C Semantic Web Activity Lead
Bijan Parsia
Co-editor of the core OWL 2 specification
Ian Horrocks
Co-chair of the W3C OWL 2 Working Group
Phil Archer
Chair of the W3C POWDER Working Group
Michael Hausenblas
Evangelist for RDFa, Linked Data, and Multimedia Semantics
Fabien Gandon
Member, GRDDL and OWL 2 Working Groups
Susie Stephens
Co-chair W3C HCLS Interest Group
Eric Prud’hommeaux
W3C team member, Semantic Web expert
Executive Summary: The Semantic
Web in 2009
The Semantic Web in 2009 is characterized by a healthy
environment of stable, broadly-implemented core standard
technologies complemented by a number of continually emerging
new standards.
Adopters of Semantic Web technologies in 2009 can choose from
a wide range of commercial and open-source interoperable tools
and systems.
Enterprise Semantic Web projects are beginning to move beyond
proofs of concept to serious production implementations.
Community projects on the World Wide Web have linked
hundreds of public data sets into an emergent Semantic Web.
Agenda
Introduction
The data model (RDF)
The query language (SPARQL)
Adding structure & semantics (RDFS, OWL, RIF)
Working in the real world (GRDDL, RDB2RDF)
Working on the Web (Linked Data, RDFa,
POWDER)
A Motivating Example: Drug Discovery
The W3C HCLS interest group set out to use
Semantic Web technologies to receive precise
answers to a complex question:
Find me genes involved in signal transduction that
are related to pyramidal neurons.
General search
223,000 hits, 0 results
Domain-limited search
2,580 potential results
Specific databases
Too many silos!
A Semantic Web Approach
Integrate disparate databases…
MeSH
PubMed
Entrez Gene
Gene Ontology
…
A Semantic Web Approach (cont’d)
…so that one query…
A Semantic Web Approach (cont’d)
…(trivially) spans several databases…
A Semantic Web Approach (cont’d)
…to deliver targeted results…
What’s the trick?
1. Agreement on common terms and
relationships
2. Incremental, flexible data structure
3. Good-enough modeling
4. Query interface tailored to the data
model
Names
Branding
Semantic Web
Web of Data
Giant Global Graph
Data Web
Web 3.0
Linked Data Web
Semantic Data Web
What is it & why do we care? (1)
“The Semantic Web”
Augments the World Wide Web
Represents the Web’s information in a machine-
readable fashion
Enables…
…targeted search
…data browsing
…automated agents
World Wide Web : Web pages :: The Semantic Web : Data
What is it & why do we care? (2)
“Semantic Web technologies”
A family of technology standards that ‘play nice
together’, including:
Flexible data model
Expressive ontology language
Distributed query language
Drive Web sites, enterprise applications
The technologies enable us to build applications and solutions that
were not possible, practical, or feasible traditionally.
A Common & Coherent Set of Technology
Standards
A common set of technologies:
...enables diverse uses
...encourages interoperability
A coherent set of technologies:
…encourages incremental application
…provides a substantial base for innovation
A standard set of technologies:
...reduces proprietary vendor lock-in
...encourages many choices for tool sets
The (In)Famous Layer Cake
Semantic Web Technology Timeline
[Timeline figure. Axis years: 1999, 2001, 2004, 2007, 2008, 2009. Surviving labels: RIF, HCLS]
2009: Where we are
As technologies & tools have evolved, Semantic
Web advocates have progressed through stages:
Report on…             Execute on…
Semantic Web vision    Initial experiments
Experiments            Technology standards
Technology standards   Software packages
Software packages      Proofs of concept
Proofs of concept      Production implementations
2009: Where we are (cont’d)
http://www.w3.org/2001/sw/sweo/public/UseCases/
2009: Where we’re not
Image from Trey Ideker via Enoch Huang
Semantic Web technologies are not a ‘magic crank’ for discovering
new drugs (or solving other problems, for that matter)!
2009: Where we’re not (cont’d)
XML vs. RDF?
“Ontology” vs. “ontology”?
Data integration vs. reasoning vs. KBs vs. search vs. app. development vs. …
Semantic Web vs. Linked Data?
The Semantic Web still suffers from confusing and conflicting
messages, each of which claims to be the “correct” one.
2009: Where we’re not (cont’d)
People with appropriate skill sets for designing & building Semantic
Web solutions are not widely available.
2009: Where we’re not (cont’d)
We don’t yet have standard solutions for privacy, trust, probability,
and other elements of the Semantic Web vision.
Introduction to the Semantic Web
approach
How does a Semantic Web approach help us
merge data sets, infer new relations, and integrate
outside data sources?
Thanks to Ivan Herman for this example
The rough structure of data integration
1. Map the various data onto an abstract data
representation
• Make the data independent of its internal representation…
2. Merge the resulting representations
3. Start making queries on the whole
• Queries not possible on the individual data sets
Data set “A”: A simplified book store
Books
ID                 Author  Title             Publisher  Year
ISBN0-00-651409-X  id_xyz  The Glass Palace  id_qpr     2000
Authors
ID      Name           Home page
id_xyz  Ghosh, Amitav  http://www.amitavghosh.com
Publishers
ID      Publisher Name  City
id_qpr  Harper Collins  London
1st: Export your data as a set of relations
Some notes on the data export
Data export does not necessarily mean
physical conversion of the data
Relations can be virtual, generated on-the-fly at
query time
via SQL “bridges”
scraping HTML pages
extracting data from Excel sheets
etc.
One can export part of the data
Data set “F”: Another book store’s data
Row  A                   B                      D           E
1    ID                  Titre                  Traducteur  Original
2    ISBN0 2020386682    Le Palais des miroirs  A13         ISBN-0-00-651409-X
6    ID                  Auteur
7    ISBN-0-00-651409-X  A12
11   Nom
12   Ghosh, Amitav
13   Besse, Christianne
2nd: Export your second set of data
3rd: start merging your data
3rd: start merging your data (cont’d)
4th: Merge identical resources
Start making queries…
User of data set “F” can now ask queries like:
“What is the title of the original version of Le
Palais des miroirs?”
This information is not in the data set “F”...
…but can be retrieved after merging with data
set “A”!
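A minimal Python sketch of this merge-then-query step, treating each data set as a plain set of (subject, predicate, object) tuples. The `urn:isbn:` spellings, the `a:`/`f:` predicate names, and the `_:` node labels are illustrative assumptions; only the ISBNs, titles, and names come from the slides.

```python
# Shared identifier: both exports must use the same URI for the English book.
BOOK_EN = "urn:isbn:000651409X"   # assumed URI form of ISBN 0-00-651409-X
BOOK_FR = "urn:isbn:2020386682"   # assumed URI form of ISBN 2020386682

# Triples exported from data set "A".
data_a = {
    (BOOK_EN, "a:title", "The Glass Palace"),
    (BOOK_EN, "a:year", 2000),
    (BOOK_EN, "a:author", "_:ghosh_a"),
    ("_:ghosh_a", "a:name", "Ghosh, Amitav"),
    ("_:ghosh_a", "a:homepage", "http://www.amitavghosh.com"),
}

# Triples exported from data set "F".
data_f = {
    (BOOK_FR, "f:titre", "Le Palais des miroirs"),
    (BOOK_FR, "f:original", BOOK_EN),
    (BOOK_EN, "f:auteur", "_:ghosh_f"),
    ("_:ghosh_f", "f:nom", "Ghosh, Amitav"),
}

# Merging two graphs is a set union; the shared ISBN URI becomes one node.
merged = data_a | data_f


def objects(graph, subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def subjects(graph, predicate, obj):
    """All subjects of triples with the given predicate and object."""
    return {s for s, p, o in graph if p == predicate and o == obj}


def original_titles(graph, french_title):
    """Follow f:titre back to the book, then f:original, then a:title."""
    titles = set()
    for book in subjects(graph, "f:titre", french_title):
        for original in objects(graph, book, "f:original"):
            titles |= objects(graph, original, "a:title")
    return titles
```

Calling `original_titles(data_f, "Le Palais des miroirs")` finds nothing, while the same call on `merged` reaches the title stored in data set “A”.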
5th: Query the merged data set
However, more can be achieved…
We “know” that a:author and f:auteur are
really the same
But our automatic merge does not know that!
Let us add some extra information to the
merged data:
a:author is the same as f:auteur
Both identify a Person, a category (type) for certain
resources
3rd revisited: Use the extra knowledge
Start making richer queries!
User of data set “F” can now query:
“What is the home page of Le Palais des miroirs’s
‘auteur’?”
The information is not in data set “F” or “A”…
…but was made available by:
Merging data sets “A” and “F”
Adding three simple “glue” statements
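One way to picture what a “glue” statement buys, as a hedged Python sketch: declaring `a:author` and `f:auteur` equivalent lets a query phrased in data set “F” terms reach a home page that only data set “A” records. The triple layout, URIs, and function names are assumptions for illustration, and only the property-equivalence statement is modeled here, not the Person typing.

```python
# A small merged graph; URIs and node labels are illustrative assumptions.
merged = {
    ("urn:isbn:2020386682", "f:titre", "Le Palais des miroirs"),
    ("urn:isbn:2020386682", "f:original", "urn:isbn:000651409X"),
    ("urn:isbn:000651409X", "f:auteur", "_:ghosh_f"),
    ("urn:isbn:000651409X", "a:author", "_:ghosh_a"),
    ("_:ghosh_a", "a:homepage", "http://www.amitavghosh.com"),
}

# The glue: these two predicates name the same relation.
glue = {("a:author", "f:auteur")}


def apply_equivalences(graph, equivalent_pairs):
    """Copy each triple onto the equivalent predicate, in both directions."""
    inferred = set(graph)
    for p1, p2 in equivalent_pairs:
        for s, p, o in graph:
            if p == p1:
                inferred.add((s, p2, o))
            elif p == p2:
                inferred.add((s, p1, o))
    return inferred


def objects(graph, subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def homepages_of_auteur(graph, titre):
    """Path: f:titre -> f:original -> f:auteur -> a:homepage."""
    found = set()
    books = {s for s, p, o in graph if p == "f:titre" and o == titre}
    for book in books:
        for original in objects(graph, book, "f:original"):
            for person in objects(graph, original, "f:auteur"):
                found |= objects(graph, person, "a:homepage")
    return found
```

Without the glue, `homepages_of_auteur(merged, "Le Palais des miroirs")` is empty; on `apply_equivalences(merged, glue)` it returns the home page.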
6th: Richer queries
Bring in other data sources
We can integrate new information into our
merged data set from other sources
e.g. additional information about author Amitav
Ghosh
Perhaps the largest public source of general
knowledge is Wikipedia
Structured data can be extracted from Wikipedia
using dedicated tools
7th: Merge with Wikipedia data
Is that surprising?
It may look like it but, in fact, it should not be…
What happened via automatic means is done
every day by Web users!
The difference: a bit of extra rigour so that
machines could do this, too
What did we do?
We combined different data sets that
...may be internal or somewhere on the Web
...are of different formats (RDBMS, Excel spreadsheet,
(X)HTML, etc)
...have different names for the same relations
We could combine the data because some URIs were
identical
i.e. the ISBNs in this case
We could add some simple additional information (the
“glue”) to help further merge data sets
The result? Answer queries that could not previously be
asked
What did we do? (cont’d)
The abstraction pays off because…
…the graph representation is independent of
the details of the native structures
…changes in local database schemas, HTML
structures, etc. do not affect the whole
“schema independence”
…new data, new connections can be added
seamlessly & incrementally
So where is the Semantic Web?
Semantic Web technologies make such integration possible
The rest of this tutorial introduces many of
these technologies.
Agenda
Introduction
The data model (RDF)
The query language (SPARQL)
Adding structure & semantics (RDFS, OWL, RIF)
Working in the real world (GRDDL, RDB2RDF)
Working on the Web (Linked Data, RDFa,
POWDER)
RDF is…
Resource Description Framework
RDF is…
The data model of the Semantic Web.
RDF is…
A schema-less data model that features
unambiguous identifiers and named relations
between pairs of resources.
RDF is…
A labeled, directed graph of relations between
resources and literal values.
RDF graphs are collections of triples
Triples are made up of a subject, a predicate,
and an object
subject --predicate--> object
Resources and relationships are named with
URIs
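As a rough illustration of the model, a graph can be held in Python as a set of three-element tuples. This is only a sketch: the `http://example.org/` URIs and the predicate names are hypothetical placeholders built from statements used elsewhere in this deck, and a real application would use an RDF library rather than bare tuples.

```python
# A triple is (subject, predicate, object); a graph is a set of triples.
# The example.org URIs are hypothetical placeholders, not real identifiers.
EX = "http://example.org/"

graph = {
    (EX + "LeeFeigenbaum", EX + "worksFor", EX + "CambridgeSemantics"),
    (EX + "LeeFeigenbaum", EX + "bornIn", 1978),  # a literal value as object
    (EX + "CambridgeSemantics", EX + "headquarteredIn", EX + "Massachusetts"),
}


def objects(graph, subject, predicate):
    """Return every object reachable from `subject` along `predicate`."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}


def outgoing_edges(graph, subject):
    """Return the (predicate, object) pairs leaving one node."""
    return {(p, o) for (s, p, o) in graph if s == subject}
```

Because the graph is just a set, adding a statement is `graph.add(...)` and no schema has to change first.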
Example RDF triples
“Lee Feigenbaum works for Cambridge Semantics”
Lee Feigenbaum --works for--> Cambridge Semantics
“Lee Feigenbaum was born in 1978”
Lee Feigenbaum --born in--> 1978
“Cambridge Semantics is headquartered in Massachusetts”
Cambridge Semantics --headquartered--> Massachusetts
Triples connect to form graphs
[Graph figure. Nodes: Lee Feigenbaum, Cambridge Semantics, Massachusetts, 1978, Boston. Edge labels: works for, headquartered, born in, lives in, capital]
Why RDF? What’s different here?
The graph data structure makes merging data
with shared identifiers trivial (as we saw
earlier)
Triples act as a least common denominator for
expressing data
URIs for naming remove ambiguity
…the same identifier means the same thing
Why RDF? Incremental Integration
[Figure labels: Graph Model; URIs for naming; Agile, Flexible, Incremental Integration; Relational Database; RDF]
Types of RDF Tools
Triple stores
Built on relational database
Native RDF store
Development libraries
Full-featured application servers
Most RDF tools contain some elements of each of
these.
Finding RDF Tools
Community-maintained lists
http://esw.w3.org/topic/SemanticWebTools
Emphasis on large triple stores
http://esw.w3.org/topic/LargeTripleStores
Michael Bergman’s Sweet Tools searchable list:
http://www.mkbergman.com/?page_id=325
RDF Tools – (Some) Triple Stores
Tool | Commercial or Open-source | Environment
Anzo Both Java
ARC Open-source PHP
AllegroGraph Commercial Java, Prolog
Jena Open-source Java
Mulgara Open-source Java
Oracle RDF Commercial SQL / SPARQL
RDF::Query Open-source Perl
Redland Open-source C, many wrappers
Sesame Open-source Java
Talis Platform Commercial HTTP (Hosted)
Virtuoso Both C++
Agenda
Introduction
The data model (RDF)
The query language (SPARQL)
Adding structure & semantics (RDFS, OWL, RIF)
Working in the real world (GRDDL, RDB2RDF)
Working on the Web (Linked Data, RDFa,
POWDER)
Motivating SPARQL
With a query language, a client can design their
own interface.
--Leigh Dodds, Talis
SPARQL is…
SPARQL Protocol And RDF Query Language
SPARQL is…
The query language of the Semantic Web.
SPARQL is…
A SQL-like language for querying sets of RDF
graphs.
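To give a feel for what querying a graph means, here is a toy Python matcher for conjunctions of triple patterns, where terms starting with `?` are variables. It is a sketch of the core matching idea only, not a SPARQL engine (no FILTER, OPTIONAL, or ordering); the `ex:` names and function names are hypothetical.

```python
def is_variable(term):
    """Pattern terms that start with '?' are variables."""
    return isinstance(term, str) and term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    extended = dict(binding)
    for pattern_term, value in zip(pattern, triple):
        if is_variable(pattern_term):
            # A variable must keep the value it was already bound to.
            if extended.setdefault(pattern_term, value) != value:
                return None
        elif pattern_term != value:
            return None
    return extended


def query(graph, patterns):
    """Return every variable binding that satisfies all patterns (a join)."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in graph
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings


graph = {
    ("ex:Lee", "ex:worksFor", "ex:CambridgeSemantics"),
    ("ex:CambridgeSemantics", "ex:headquarteredIn", "ex:Massachusetts"),
}

# "Where is the employer of each person headquartered?"
results = query(graph, [
    ("?person", "ex:worksFor", "?org"),
    ("?org", "ex:headquarteredIn", "?place"),
])
```

The shared variable `?org` is what joins the two patterns, much as a shared column joins two tables in SQL.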
SPARQL is…
A simple protocol for issuing queries and
receiving results over HTTP. So…
Every SPARQL client works with every SPARQL
server!
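A sketch of what a protocol request might look like from Python's standard library: the query travels as a URL-encoded `query` parameter on an HTTP GET. The endpoint is the DBpedia one named in this deck; the helper name, the sample query, and the choice of the JSON results media type are assumptions for illustration. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://dbpedia.org/sparql"  # public endpoint named in this deck


def build_sparql_request(endpoint, query):
    """Build (but do not send) a SPARQL protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for SPARQL results in JSON; servers may offer other formats too.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, "SELECT ?s WHERE { ?s ?p ?o } LIMIT 1")
```

Passing `request` to `urllib.request.urlopen` would send it; since only the URL and headers are involved, the same client code can be pointed at a different endpoint by changing `ENDPOINT`.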
Why SPARQL?
SPARQL lets us:
Pull information from structured and semi-
structured data.
Explore data by discovering unknown
relationships.
Query and search an integrated view of disparate
data sources.
Glue separate software applications together by
transforming data from one vocabulary to
another.
[Figure labels: Dealer 1, Dealer 2, Dealer 3, Employee Directory, ERP / Budget System, Web EPA Fuel Efficiency Spreadsheet, SPARQL Query Engine]
What automobiles get more than 25 miles per gallon, fit within my
department’s budget, and can be purchased at a dealer located within 10 miles
of one of my employees?
SELECT ?automobile
WHERE {
?automobile a ex:Car ; epa:mpg ?mpg ;
ex:dealer ?dealer .
?employee a ex:Employee ; geo:loc ?loc .
?dealer geo:loc ?dealerloc .
FILTER(?mpg > 25 &&
geo:dist(?loc, ?dealerloc) <= 10) .
}
Web dashboard SPARQL query
SPARQL Example: Querying Wikipedia
Find me all landlocked countries with a population
greater than 15 million.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX type: <http://dbpedia.org/class/yago/>
PREFIX prop: <http://dbpedia.org/property/>
SELECT ?country_name ?population
WHERE {
  ?country a type:LandlockedCountries ;
    rdfs:label ?country_name ;
    prop:populationEstimate ?population .
  FILTER (
    ?population > 15000000 &&
    langMatches(lang(?country_name), "EN")
  ) .
}
ORDER BY DESC(?population)
SPARQL Example: Querying Wikipedia
DBPedia SPARQL Endpoint
SPARQL Example: Querying Wikipedia
Types of SPARQL Tools
Query engines
Things that can run queries
Most RDF stores provide a SPARQL engine
Query rewriters
E.g. to query relational databases (more later)
Endpoints
Things that accept queries on the Web and return
results
Client libraries
Things that make it easy to ask queries
Finding SPARQL Tools
Community-maintained list of query engines
http://esw.w3.org/topic/SparqlImplementations
Publicly accessible SPARQL endpoints
http://esw.w3.org/topic/SparqlEndpoints
Michael Bergman’s Sweet Tools searchable list:
http://www.mkbergman.com/?page_id=325
(Some) SPARQL’able Data Sets
bio2rdf.org – querying life sciences data
Agenda
Introduction
The data model (RDF)
The query language (SPARQL)
Adding structure & semantics (RDFS, OWL, RIF)
Working in the real world (GRDDL, RDB2RDF)
Working on the Web (Linked Data, RDFa,
POWDER)
Where’s the magic?
We haven’t seen anything yet that begins to
approach the long-term Semantic Web vision
From the explicit to the inferred
3 pieces of the Semantic Web technology stack
are about describing a domain well enough to
capture (some of) the meaning of resources
and relationships in the domain
RDF Schema
OWL
RIF
Apply knowledge to data to get more data.
RDFS is…
RDF Schema
RDF Schema is…
Elements of:
Vocabulary (defining terms)
I define a relationship called “prescribed dose.”
Schema (defining types)
“prescribed dose” relates “treatments” to “dosages”
Taxonomy (defining hierarchies)
Any “doctor” is a “medical professional”
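The taxonomy element can be sketched in a few lines of Python: given subclass statements, every asserted type implies all of its ancestors. The class names follow the doctor example above; the extra `ex:Person` level and the function names are hypothetical additions for illustration.

```python
# subClassOf statements: child class -> set of direct parent classes.
SUBCLASS_OF = {
    "ex:Doctor": {"ex:MedicalProfessional"},
    "ex:MedicalProfessional": {"ex:Person"},  # hypothetical extra level
}


def superclasses(cls, subclass_of):
    """All classes reachable from `cls` by following subClassOf links."""
    seen, todo = set(), [cls]
    while todo:
        for parent in subclass_of.get(todo.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                todo.append(parent)
    return seen


def inferred_types(asserted_types, subclass_of):
    """Asserted types plus everything the taxonomy implies."""
    types = set(asserted_types)
    for asserted in asserted_types:
        types |= superclasses(asserted, subclass_of)
    return types
```

Stating only that someone is a `ex:Doctor` is enough for a query about medical professionals to find them.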
WOL OWL is…
Web Ontology Language
OWL is…
Elements of ontology
Same/different identity
“author” and “auteur” are the same relation
two resources with the same “ISBN” are the same “book”
More expressive type definitions
A “cycle” is a “vehicle” with at least one “wheel”
A “bicycle” is a “cycle” with exactly two “wheels”
More expressive relation definitions
“sibling” is a symmetric predicate
the value of the “favorite dwarf” relation must be one of
“happy”, “sleepy”, “sneezy”, “grumpy”, “dopey”, “bashful”,
“doc”
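Two of these definitions, sketched loosely in Python. Note the big simplification: the wheel counting below is closed-world (it assumes the listed wheels are distinct and are all there are), whereas an OWL reasoner works under an open-world assumption and would not conclude “exactly two” from two asserted wheels alone. All `ex:` names and function names are hypothetical.

```python
def symmetric_closure(graph, symmetric_predicates):
    """For a symmetric predicate, (s, p, o) also implies (o, p, s)."""
    reversed_triples = {
        (o, p, s) for (s, p, o) in graph if p in symmetric_predicates
    }
    return graph | reversed_triples


def classify_cycles(graph):
    """Closed-world toy: a Vehicle with >= 1 wheel is a Cycle, with 2 a Bicycle."""
    vehicles = {s for s, p, o in graph if p == "rdf:type" and o == "ex:Vehicle"}
    inferred = set()
    for vehicle in vehicles:
        wheels = {o for s, p, o in graph if s == vehicle and p == "ex:hasWheel"}
        if len(wheels) >= 1:
            inferred.add((vehicle, "rdf:type", "ex:Cycle"))
        if len(wheels) == 2:
            inferred.add((vehicle, "rdf:type", "ex:Bicycle"))
    return inferred


graph = {
    ("ex:Ann", "ex:sibling", "ex:Bob"),
    ("ex:bike1", "rdf:type", "ex:Vehicle"),
    ("ex:bike1", "ex:hasWheel", "ex:wheel1"),
    ("ex:bike1", "ex:hasWheel", "ex:wheel2"),
    ("ex:uni1", "rdf:type", "ex:Vehicle"),
    ("ex:uni1", "ex:hasWheel", "ex:wheel3"),
}
```

Here `symmetric_closure(graph, {"ex:sibling"})` adds the Bob-to-Ann statement, and `classify_cycles(graph)` types `ex:bike1` as both a cycle and a bicycle but `ex:uni1` only as a cycle.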
What can we do with OWL?
Answer questions of
Consistency
Are there any contradictions in this model?
Classification
What are all the inferred types of this resource?
Satisfiability
Are there any classes in this ontology that cannot
possibly have any members?
Building Useful Ontologies
Developing and maintaining quality ontologies is very challenging
Users need tools and services, e.g., to help check if ontology is:
Meaningful: all named classes can have instances
Correct: captures intuitions of domain experts
Minimally redundant: no unintended synonyms
(e.g. “Banana split” and “Banana sundae”)
http://www.aber.ac.uk/compsci/public/media/presentations/OUCL-seminar.ppt
Example: SNOMED
Large: 373,731 concepts & over 1 million terms
NHS version extended to 542,380 classes with
19,828 additional named classes
148,821 class drug taxonomy (primitive hierarchy)
OWL reasoner (FaCT++) classified NHS ontology
Able to classify whole ontology in <4 hours
Interesting results come from 19,828 additional named
classes
180 missing subClass relationships were found, e.g.:
Periocular_dermatitis subClassOf Disease_of_face
RIF is…
Rule Interchange Format
RIF is…
Standard representation for exchanging sets of logical
and business rules
Logical rules
A buyer buys an item from a seller if the seller sells the item
to the buyer
A customer becomes a "Gold" customer as soon as his
cumulative purchases during the current year top $5000
Production rules
Customers that become "Gold" customers must be notified
immediately, and a golden customer card will be printed
and sent to them within one week
For shopping carts worth more than $1000, "Gold"
customers receive an additional discount of 10% of the total
amount
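To make the two kinds of rule concrete, here is a hedged Python sketch of the "Gold" customer rules. It is not RIF syntax (RIF is an interchange format for rules, not a programming API); the dictionary layout, function names, and the `notify` callback are hypothetical.

```python
GOLD_THRESHOLD = 5000        # cumulative purchases this year, in dollars
BIG_CART = 1000              # cart value above which the discount applies
GOLD_DISCOUNT_PERCENT = 10


def is_gold(purchases_this_year):
    """Logical rule: Gold status follows from the purchase total."""
    return sum(purchases_this_year) > GOLD_THRESHOLD


def record_purchase(customer, amount, notify):
    """Production rule: when a customer *becomes* Gold, fire an action."""
    was_gold = is_gold(customer["purchases"])
    customer["purchases"].append(amount)
    if not was_gold and is_gold(customer["purchases"]):
        notify(customer["name"])


def cart_total(customer, cart_value):
    """Gold customers get 10% off carts worth more than $1000."""
    if is_gold(customer["purchases"]) and cart_value > BIG_CART:
        return cart_value - cart_value * GOLD_DISCOUNT_PERCENT / 100
    return cart_value
```

The logical rule only derives a fact, while the production rule triggers an action exactly once, at the moment the fact first becomes true.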
Developing Tools and Infrastructure
Editors/environments
Oiled, Protégé, Swoop, TopBraid, Ontotrack, …
Reasoning systems
Cerebra, FaCT++, Kaon2, Pellet, Racer, CEL, …
Visualizing and Publishing Vocabularies
Reusable, public ontologies
FOAF
The Event Ontology
Measurement Units Ontology
Agenda
Introduction
The data model (RDF)
The query language (SPARQL)
Adding structure & semantics (RDFS, OWL, RIF)
Working in the real world (GRDDL, RDB2RDF)
Working on the Web (Linked Data, RDFa,
POWDER)
Fantasy Land Architecture
[Figure labels: Ontology / Schema; Custom UI (×6)]
Reality
[Figure labels: Internet, DB2, XML, LDAP Directory, Oracle RDB; Custom UI (×6)]
GRDDL is…
Gleaning Resource Descriptions from Dialects of
Language
GRDDL is…
A method for authoritatively getting RDF data
from XML and XHTML documents.
GRDDL is…
A mechanism for authoritatively deriving RDF
data from families of XML and XHTML
documents.
GRDDL tools
Most GRDDL tools are adapters to existing RDF
stores or SPARQL engines to allow loading or
querying data from XML and XHTML sources.
Community-maintained list:
http://esw.w3.org/topic/GrddlImplementations
Host System GRDDL tool
Jena GRDDL Reader for Jena
RDFLib GRDDL.py
Redland (built in)
Swignition (built in)
Virtuoso GRDDL “Sponger”
RDB2RDF is…
Relational Database to RDF
RDB2RDF is…
A proposed W3C Working Group to define a
standard way to map from relational databases
to RDF (and SPARQL).
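Since no standard mapping existed at the time of this talk, the following Python sketch shows just one naive possibility: each row becomes a subject URI and each non-key column becomes a predicate. The base URI, the URI patterns, and the function name are assumptions; the row is the Books example from data set “A”.

```python
import sqlite3

BASE = "http://example.org/bookstore/"  # hypothetical base URI


def table_to_triples(conn, table, key_column):
    """Emit one triple per non-key, non-NULL cell: (row URI, column URI, value).

    `table` is interpolated into SQL, so it must come from trusted code.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    triples = set()
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key_column]}"
        for column, value in record.items():
            if column != key_column and value is not None:
                triples.add((subject, f"{BASE}{table}#{column}", value))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Books (ID TEXT, Author TEXT, Title TEXT, Publisher TEXT, Year INTEGER)"
)
conn.execute(
    "INSERT INTO Books VALUES ('ISBN0-00-651409-X', 'id_xyz', 'The Glass Palace', 'id_qpr', 2000)"
)
triples = table_to_triples(conn, "Books", "ID")
```

This is the static (ETL) style; a dynamic mapping would instead rewrite each incoming query into SQL and leave the data where it is.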
RDB2RDF tools
Survey of existing approaches:
http://www.w3.org/2005/Incubator/rdb2rdf/RDB2RDF_SurveyReport.pdf
Tool Mapping Approach Dynamic vs. Static (ETL)
Anzo D2RQ configuration graph Both
Asio Tools OWL file, SWRL rules Both
Dartgrid XML file, visual mapper Dynamic
D2RQ D2RQ configuration file Both
R2O R2O XML file Both
RDBtoOnto Constraint rules Static (ETL)
SDS EII Query Engine/OOM XML Both
Triplify SQL config file Linked Data
Virtuoso RDF View Meta-Schema Language Both
What about… everything else?
Standards don’t yet exist, but many tools exist to
derive RDF and/or run SPARQL queries against
other sources of data.
LDAP Directories
Squirrel RDF
http://jena.sourceforge.net/SquirrelRDF/
Excel spreadsheets
Anzo for Excel
http://www.cambridgesemantics.com/products/anzo_for_excel
Excel spreadsheets
Semantic Discovery System
http://insilicodiscovery.com/installation/index.php
Web-based data sources
Virtuoso Sponger Cartridges
http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSponger
Unstructured Text
Calais
http://www.opencalais.com/
Unstructured Text
Zemanta Web Service
http://developer.zemanta.com/
Agenda
Introduction
The data model (RDF)
The query language (SPARQL)
Adding structure & semantics (RDFS, OWL, RIF)
Working in the real world (GRDDL, RDB2RDF)
Working on the Web (Linked Data, RDFa,
POWDER)
Linked Data is…
A simple set of 4 guidelines for publishing RDF data on the
Web (over HTTP)
Developed by Tim Berners-Lee in 2006
1. Use URIs as names for things
• Globally unique identity
2. Use HTTP URIs
• Everyone has a Web browser/client
3. When someone looks up a URI, provide useful information
• …in the form of RDF data
4. Include links to other URIs
• Foster discovery of additional information
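The guidelines above suggest a simple “follow your nose” client, sketched here in Python. The `lookup` function is a hypothetical stand-in for an HTTP GET plus RDF parsing, and the in-memory `FAKE_WEB` with its example.org URIs replaces the real Web so the sketch runs offline.

```python
def crawl(start_uri, lookup, max_documents=10):
    """Look up a URI, keep its triples, and follow HTTP URIs found as objects."""
    graph, queue, seen = set(), [start_uri], set()
    while queue and len(seen) < max_documents:
        uri = queue.pop(0)
        if uri in seen:
            continue
        seen.add(uri)
        for triple in lookup(uri):
            graph.add(triple)
            linked = triple[2]
            # Guideline 4: links to other URIs lead to more information.
            if isinstance(linked, str) and linked.startswith("http://"):
                queue.append(linked)
    return graph


# A hypothetical two-document "Web": URI -> triples served at that URI.
FAKE_WEB = {
    "http://example.org/book": [
        ("http://example.org/book", "ex:author", "http://example.org/ghosh"),
    ],
    "http://example.org/ghosh": [
        ("http://example.org/ghosh", "ex:name", "Amitav Ghosh"),
    ],
}

graph = crawl("http://example.org/book", lambda uri: FAKE_WEB.get(uri, []))
```

Starting from the book's URI alone, the client ends up knowing the author's name because the first document linked to the second.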
The Linking Open Data Project is...
A community project started within the W3C
Semantic Web Education & Outreach group in
2007
A wealth of existing, open Web-based data sets
exposed in RDF and linked together
A growing number of publicly available SPARQL
endpoints
The first steps of “The” Semantic Web?
No longer easily measured or depicted!
The LOD “cloud”, May 2007
The LOD “cloud”, March 2008
The LOD “cloud”, September 2008
The LOD “cloud”, March 2009
Application specific portions of the cloud
Notably, bio-related data sets (in light purple)
some by the W3C “Linking Open Drug Data” task force
Sindice - Another view of data on the Web
Tools: Publishing linked data
Many tools we’ve already seen publish RDF
data according to linked data principles
E.g. Talis platform, Virtuoso, Triplify
Others sit on top of existing systems and make
the data available as Linked Data
E.g. pubby
Tools: the Data Browser
World Wide Web : Web pages :: The Semantic Web : Data
World Wide Web : Web browser :: Linked Data Web : Data browser
Tabulator: Generic Data Browser
Disco Hyperdata Browser
OpenLink Data Explorer
Marbles Linked Data Browser
DBPedia Mobile
QDOS – your online digital status
BBC Music Beta
Producer-oriented Web to consumer-
oriented Web
On the current Web…
Content publishers decide what can be done with
the data (via links, script)
On the Semantic Web…
Content publishers publish actionable data
Content consumers decide how to act on it
UltraLink
UltraLink is Novartis’s solution for cross-linking over 1,500,000 biologic
and chemical terms, including synonyms, taxonomies, and pointers
into data repositories.
UltraLink
What if an acquisition brings with it a new
Web-based corpus of pathway data that uses
terms not recognized by the annotators?
New text miners must be created & deployed
Finding & consuming data are too tightly coupled
RDFa is…
RDF in Attributes
RDFa is…
A collection of HTML attributes that allow RDF to
be embedded directly in Web pages.
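A deliberately tiny Python sketch of the idea, using only the standard library. It understands just two RDFa attributes, `about` (the subject) and `property` (a predicate whose value is the element's text), and ignores nesting, prefixes, and everything else a real RDFa processor handles. The class name and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class TinyRDFaParser(HTMLParser):
    """Collects (subject, property, text) triples from `about`/`property`."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self._subject = None    # most recent about="..." value seen
        self._property = None   # property still waiting for its text

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "about" in attributes:
            self._subject = attributes["about"]
        self._property = attributes.get("property")

    def handle_data(self, data):
        text = data.strip()
        if self._property and self._subject and text:
            self.triples.append((self._subject, self._property, text))
            self._property = None


PAGE = (
    '<div about="http://example.org/book">'
    '<span property="dc:title">The Glass Palace</span> by '
    '<span property="dc:creator">Amitav Ghosh</span>'
    "</div>"
)

parser = TinyRDFaParser()
parser.feed(PAGE)
```

The human-readable text and the machine-readable values are the same characters in the page, which is the “Don't Repeat Yourself” point made for RDFa.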
Why RDFa?
Don’t Repeat Yourself (DRY)
In-context metadata (copy & paste)
Authoritative (no screen scraping)
Who’s using RDFa?
STW Thesaurus for Economics
RDFa in action
POWDER is…
Protocol for Web Description Resources
http://www.slideshare.net/fabien_gandon/powder-in-a-nutshell-presentation
descriptions applied to
groups of online resources
many resources, one description
grouping mechanisms...
... list URIs
... domain names, paths
... regular expressions on URIs
descriptions
may be grouped
queries
are on individual resources
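A rough Python sketch of the grouping idea: a description is attached to a group of URIs, and a query asks which descriptions cover one individual URI. This is not POWDER syntax. The dictionary layout, the function names, and the matching semantics (an explicitly listed URI always matches; otherwise every constraint that is given must hold) are assumptions for illustration.

```python
import re
from urllib.parse import urlparse


def in_group(uri, listed=(), hosts=(), path_prefixes=(), regexes=()):
    """Decide whether one URI belongs to a group of resources."""
    if uri in listed:
        return True
    if not (hosts or path_prefixes or regexes):
        return False
    parsed = urlparse(uri)
    host = parsed.hostname or ""
    # A host constraint also covers its subdomains (www.example.org, ...).
    if hosts and not any(host == h or host.endswith("." + h) for h in hosts):
        return False
    if path_prefixes and not any(parsed.path.startswith(p) for p in path_prefixes):
        return False
    if regexes and not any(re.search(pattern, uri) for pattern in regexes):
        return False
    return True


def describe(uri, description_resources):
    """Descriptions of every group the individual resource falls into."""
    return [
        resource["description"]
        for resource in description_resources
        if in_group(uri, **resource["group"])
    ]


kids_section = {
    "group": {"hosts": ("example.org",), "path_prefixes": ("/kids/",)},
    "description": "suitable for children",  # hypothetical description
}
```

One description resource then answers questions about any number of pages, for example `describe("http://www.example.org/kids/games.html", [kids_section])`.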
description…
• Which resources does the DR describe?
• What is the description?
• Who has created the description?
• When was the description created?
• Until when is the description considered valid?
• From when is the description considered valid?
• Does anybody agree with this description?
• Do other descriptions exist about this group of resources?
in order to...
adapt
authorize
protect
trust
search
monitor
Thanks & Questions
lee@cambridgesemantics.com
Note that many of the slides have accompanying notes.
Please also note that slide 61 (“Why RDF? Incremental Integration”) works a bit better in the PowerPoint version, due to animations.