LD & the research and
education space
Christophe Guéret
Archive Development (TV) / TSA (D&E)
@cgueret
disclaimer: will be experimenting
following guidelines from Alice
Bartlett and Russell Davies (mostly)
@mildlydiverting made the Google Template
1. Linked Data
2. Research and Education Space
1. Linked Data
Linked Data
Linked Open Data
Semantic Web
Three different concepts!
Linked Data
Method for publishing data using HTTP as a storage and
communication layer
More info on https://en.wikipedia.org/wiki/Linked_data
Linked Open Data
Linked Data applied to Open Data
For a definition of Open Data : http://opendefinition.org/
Semantic Web
Using the semantic capabilities of Linked Data to derive
information
Why Linked Data?
Slides derived from http://www.slideshare.net/cgueret/linking-knowledge-spaces
Dealing with documents until 1989
1. Find a source for the document
2. Find a way to parse the document
3. Create links and index the content
4. Repeat on each update
Then came the Web...
Standardized, connected, decentralised, easy
[Diagram: documents hosted on one server and on another server]
Dealing with data until, well, now
1. Find a source for the data
2. Find a way to parse the data
3. Create links and index the data
4. Repeat on each update
We deal with data
the way we dealt
with documents
20 years ago
Many formats, no links, ETL
Okay it’s not that bad...
● We have Web APIs now
● All the APIs are RESTful
● All the APIs do JSON, or XML
But what about schemas? And links?
Linked Data = doing the
Web again but for data
instead of documents
It’s possible
because
“Factual
knowledge is a
graph”
http://videolectures.net/iswc2011_van_harmelen_universal/
Concretely
“Lille is in France and called Rijsel in Dutch”
http://dbpedia.org/resource/Lille
  http://dbpedia.org/ontology/country => http://dbpedia.org/resource/France
  http://www.w3.org/2000/01/rdf-schema#label => “Rijsel”@nl
Not rocket science:
● Use IRIs for identifiers
● Bind identifiers to data about them
● Bind identifiers to other identifiers
● Use IRIs for typing the links
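As a minimal sketch of the four points above, the Lille example can be written as subject, predicate, object triples in plain Python. The tuple layout, the (text, language) form for literals and the `objects` helper are assumptions made for illustration; a real application would more likely rely on an RDF library.

```python
# IRIs taken from the Lille example; the tuple layout is an illustrative assumption.
LILLE = "http://dbpedia.org/resource/Lille"
FRANCE = "http://dbpedia.org/resource/France"
COUNTRY = "http://dbpedia.org/ontology/country"
LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# A graph is a set of (subject, predicate, object) triples.
# Literals are modelled here as (text, language) pairs.
graph = {
    (LILLE, COUNTRY, FRANCE),          # identifier bound to another identifier
    (LILLE, LABEL, ("Rijsel", "nl")),  # identifier bound to data about it
}


def objects(graph, subject, predicate):
    """Return every object linked from `subject` through `predicate`."""
    return [o for s, p, o in graph if s == subject and p == predicate]


print(objects(graph, LILLE, COUNTRY))
print(objects(graph, LILLE, LABEL))
```

Both the things and the link types are IRIs, so the predicate `COUNTRY` can itself be looked up like any other identifier.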
Vocabularies
● QB : statistics
● PROV-O : provenance
● SIOC : social media
● Schema.org : search engine results
● And many more ...
Every data set is a graph that is part of a
bigger graph
You only need to know the vocabulary
used to consume it meaningfully
Vocabularies are part of the graph too!
http://dbpedia.org/ontology/country
http://www.w3.org/2000/01/rdf-schema#comment
“The country where the thing is located.”@en
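The following sketch illustrates the point that a vocabulary term is described in the same graph as the data that uses it. The tuple representation and the `describe` helper are assumptions for illustration only; the IRIs and the comment text come from the example above.

```python
LILLE = "http://dbpedia.org/resource/Lille"
FRANCE = "http://dbpedia.org/resource/France"
COUNTRY = "http://dbpedia.org/ontology/country"
COMMENT = "http://www.w3.org/2000/01/rdf-schema#comment"

# Data and vocabulary documentation live side by side in one graph.
graph = [
    (LILLE, COUNTRY, FRANCE),
    (COUNTRY, COMMENT, ("The country where the thing is located.", "en")),
]


def describe(graph, term, lang="en"):
    """Return the rdfs:comment of `term` in language `lang`, or None."""
    for subject, predicate, obj in graph:
        if subject == term and predicate == COMMENT and obj[1] == lang:
            return obj[0]
    return None


# Ask the graph what the predicate used on Lille means.
print(describe(graph, COUNTRY))
```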
Content negotiation
http://dbpedia.org/resource/Cardiff
  => HTML page: http://dbpedia.org/page/Cardiff
  => Turtle data: http://dbpedia.org/data/Cardiff.ttl
Content negotiation
http://www.bbc.co.uk/programmes/b006q2x0#programme
  => HTML page: http://www.bbc.co.uk/programmes/b006q2x0
  => RDF data: http://www.bbc.co.uk/programmes/b006q2x0.rdf
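Content negotiation can be sketched with the Python standard library: the client asks for the same IRI twice and only changes the `Accept` header. The helper names `negotiated_request` and `fetch` are made up for this example, and which document a server actually returns for each media type depends on that server.

```python
from urllib.request import Request, urlopen

CARDIFF = "http://dbpedia.org/resource/Cardiff"


def negotiated_request(iri, media_type):
    """Build a GET request asking for `iri` in the given media type."""
    return Request(iri, headers={"Accept": media_type})


def fetch(iri, media_type):
    """Send the request (needs network access) and report where it ended up.

    urlopen follows redirects, so the returned URL is the document that
    was finally served, together with its Content-Type header.
    """
    with urlopen(negotiated_request(iri, media_type)) as response:
        return response.geturl(), response.headers.get("Content-Type")


# Same identifier, two different representations requested.
html_request = negotiated_request(CARDIFF, "text/html")
turtle_request = negotiated_request(CARDIFF, "text/turtle")
print(html_request.get_header("Accept"))
print(turtle_request.get_header("Accept"))
```

Calling `fetch(CARDIFF, "text/turtle")` would perform the actual lookup; it is left out of the module-level code so that the sketch runs without a network connection.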
Two major traps to avoid
● Using the same IRI for a thing and
the document describing it
● Applying a license to a thing
instead of applying it to a document
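A small sketch of how the two traps can be avoided with a hash IRI such as the BBC one: the thing keeps the fragment, the document is the IRI without it, and the license is attached to the document. The `document_iri` helper and the license IRI are hypothetical placeholders; the predicate is the Dublin Core Terms `license` property. Sites that use redirects instead of fragments (as DBpedia does) need a different way to find the document IRI.

```python
from urllib.parse import urldefrag

# The programme itself (a thing), identified with a hash IRI.
THING = "http://www.bbc.co.uk/programmes/b006q2x0#programme"

DCTERMS_LICENSE = "http://purl.org/dc/terms/license"
# Hypothetical placeholder, not the license actually used by the BBC.
EXAMPLE_LICENSE = "http://example.org/licenses/open"


def document_iri(thing_iri):
    """Return the IRI of the document fetched when dereferencing a hash IRI."""
    return urldefrag(thing_iri).url


DOCUMENT = document_iri(THING)

# The license statement has the document as subject, never the thing.
license_statement = (DOCUMENT, DCTERMS_LICENSE, EXAMPLE_LICENSE)

print(THING)
print(DOCUMENT)
```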
And the Semantic Web then?
LOD + Semantics = Semantic Web
● Document vocabularies with logic => new
data gets derived
● “Lille is in France” + “All cities in France
are in Europe” => “Lille is in Europe”
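The Lille derivation can be sketched as a tiny forward-chaining loop. This is a simplification under stated assumptions: the predicate name `locatedIn` is invented for the example, the rule ignores the "city" typing, and real Semantic Web reasoners work from vocabularies written in languages such as RDFS or OWL rather than Python tuples.

```python
LOCATED_IN = "locatedIn"  # hypothetical predicate name

# "Lille is in France"
facts = {("Lille", LOCATED_IN, "France")}

# "All cities in France are in Europe", simplified to:
# if X locatedIn France then X locatedIn Europe.
rules = [((LOCATED_IN, "France"), (LOCATED_IN, "Europe"))]


def derive(facts, rules):
    """Apply the rules repeatedly until no new triple can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for (if_pred, if_obj), (then_pred, then_obj) in rules:
            for subject, predicate, obj in list(known):
                if predicate == if_pred and obj == if_obj:
                    new_triple = (subject, then_pred, then_obj)
                    if new_triple not in known:
                        known.add(new_triple)
                        changed = True
    return known


for triple in sorted(derive(facts, rules)):
    print(triple)
```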
http://www.slideshare.net/ConnectedDataLondon/ten-years-of-linked-data-at-the-bbc
2. Research and Education Space
http://res.space
The Research and Education Space
In practice we
● Help GLAMs to publish LOD
● Crawl and index that LOD
● Provide a search interface over the
data crawled
Our crawler follows links across data
publishers to hunt for (properly
licensed) LOD
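The idea of following links while keeping only properly licensed data can be sketched as a breadth-first traversal. Everything below is hypothetical: the in-memory `WEB` dictionary stands in for real publishers, the license labels are placeholders, and skipping the links of a rejected document is an assumption of this sketch, not a description of how the RES crawler behaves.

```python
from collections import deque

# Hypothetical stand-in for documents published on the Web.
WEB = {
    "http://a.example/data": {
        "license": "open",
        "links": ["http://b.example/data", "http://c.example/data"],
    },
    "http://b.example/data": {
        "license": "closed",
        "links": ["http://d.example/data"],
    },
    "http://c.example/data": {"license": "open", "links": ["http://a.example/data"]},
    "http://d.example/data": {"license": "open", "links": []},
}

ACCEPTED_LICENSES = {"open"}  # placeholder for a list of accepted licenses


def crawl(seed, web, accepted):
    """Follow links breadth-first, indexing only acceptably licensed documents."""
    indexed = []
    seen = {seed}
    queue = deque([seed])
    while queue:
        iri = queue.popleft()
        document = web.get(iri)
        if document is None or document["license"] not in accepted:
            continue  # assumption: a rejected document's links are not followed
        indexed.append(iri)
        for link in document["links"]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return indexed


print(crawl("http://a.example/data", WEB, ACCEPTED_LICENSES))
```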
There is a set of rules used to
interpret the data in a specific way
All of that is open source! Both what we use and what we code :-)
https://github.com/orgs/bbcarchdev
http://acropolis.org.uk/

Informal presentation about RES