Linking bbc.co.uk to the linked data cloud
by Tom Scott
Introduction to Linked Data and what the BBC has done to link to the LOD cloud. As presented at the Open Knowledge Conference.



Although, curiously, it can also be thought of as ‘the web done right’ - the web as it was originally designed to be.
But what is it?
Well it can be described with 4 simple rules.
Those documents make assertions about things in the real world but that doesn’t mean the identifiers can only be used to identify web documents.
Just as a passport or driving licence, in the real world, can be thought of as an identifier for a person, so URIs can be used as identifiers for people, concepts or things on the web.
Minting URIs for things rather than pages helps make the web more human literate because it means we are identifying those things that people care about.
URIs are globally unique, open to all and decentralised.
Don’t go using DOI or any other identifier - on the web all you need is an HTTP URI.
Providing the data as RDF means that machines can process that information for people to use, which makes it more useful.
And that means contextual links to other resources elsewhere on the web, not just your site.
And that’s it. Pretty simple.
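As a rough illustration of those rules (a sketch only, not how the BBC publishes its data), a thing can be given an HTTP URI, described with machine-readable statements, and linked to resources on other sites. The example.org and example.com URIs below are hypothetical placeholders; the FOAF and OWL property URIs are the standard ones.

```python
from urllib.parse import urlparse

# Hypothetical identifier for a thing (an artist), not a real BBC URI.
ARTIST = "http://example.org/artists/u2#artist"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Each statement is a (subject, predicate, object) triple.
# Objects wrapped in double quotes are literals; everything else is a URI.
triples = [
    (ARTIST, FOAF_NAME, '"U2"'),
    (ARTIST, OWL_SAME_AS, "http://example.com/other-site/u2"),  # hypothetical link
]

def to_ntriples(statements):
    """Serialise the triples in a simple N-Triples-style text form."""
    lines = []
    for subject, predicate, obj in statements:
        rendered = obj if obj.startswith('"') else f"<{obj}>"
        lines.append(f"<{subject}> <{predicate}> {rendered} .")
    return "\n".join(lines)

def external_links(statements):
    """Return object URIs that live on a different host from their subject."""
    links = []
    for subject, _predicate, obj in statements:
        if obj.startswith('"'):
            continue  # literals are not links
        if urlparse(obj).netloc != urlparse(subject).netloc:
            links.append(obj)
    return links

print(to_ntriples(triples))
print(external_links(triples))
```

The second function is the "links to elsewhere" rule in miniature: the statements about the artist point off the site as well as within it.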
And I would argue that, other than the RDF bit, these principles should be followed for any website - they just make sense.
Those in their late 30s will probably remember the film WarGames. Because it was filmed before the Web, the characters had to find and connect to each computer themselves and know where that computer was. If you remember, they plugged their phone into a modem and phoned up the computer.
The joy of the web is that it adds a level of abstraction - freeing you from the networking, routing and server location and letting you focus on the document.
It helps us design a system that is more human literate, and more useful.
This is possible because we are identifying real world stuff and the relationships between them.
Each of those APIs tends to be proprietary and specific to the site. As a result there’s an overhead every time someone wants to add that data source.
These APIs give you access to the silo - but the silo still remains.
Using RDF and LOD means there is a generic method to access data on the web.
First up, it’s worth pointing out the obvious: the BBC is a big place, so it would be wrong to assume that everything we’re doing online follows these principles. But there’s quite a lot of stuff going on.
Well the BBC’s programme support, music discovery and, soon, natural history content are all adopting these principles.
In other words persistent HTTP URIs that can be dereferenced to HTML, RDF, JSON and mobile views for programmes, artists, species and habitats.
So for example the previous artist page transcludes this resource - but the resource also has its own URI.
If it doesn’t have a URI it’s not on the web.
So the URI for this programme is:
bbc.co.uk/programmes/b00ht655#programme
Through content negotiation we are able to serve an HTML
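Content negotiation can be sketched roughly as follows. This is a simplified illustration, not the BBC's server code: the media-type table is an assumption, the `http://` scheme has been added to the programme URI above, and only exact media types and `*/*` are matched.

```python
from urllib.parse import urldefrag

# Assumed mapping from media types to available representations.
REPRESENTATIONS = {
    "text/html": "html",
    "application/rdf+xml": "rdf",
    "application/json": "json",
}

def parse_accept(header):
    """Return the media types in an Accept header, most preferred first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        quality = 1.0  # default when no q parameter is given
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    quality = float(field[2:])
                except ValueError:
                    quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            entries.append((-quality, position, media_type))
    return [media_type for _q, _pos, media_type in sorted(entries)]

def negotiate(header, default="html"):
    """Pick the representation to serve for a given Accept header."""
    for media_type in parse_accept(header):
        if media_type in REPRESENTATIONS:
            return REPRESENTATIONS[media_type]
        if media_type == "*/*":
            return default
    return default

# The URI with the fragment names the programme itself;
# the part before the fragment is the document that gets fetched.
thing_uri = "http://bbc.co.uk/programmes/b00ht655#programme"
document_uri, fragment = urldefrag(thing_uri)

print(document_uri, fragment)
print(negotiate("application/rdf+xml"))
print(negotiate("text/html;q=0.5, application/json"))
```

The same URI serves a browser and a data client alike; only the `Accept` header differs between the two requests.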
So that, for example, you can go from a tracklist on an episode page of Jo Whiley on the Radio 1 site to the U2 artist page and then from there to all episodes of Chris Evans which have played U2.
Or from an episode of Nature’s Great Events to the page about Brown Bears to all BBC TV programmes about Brown Bears.
So for example here are all the URIs we know about that are about an artist. Note this set is also at a URI.
Where URIs already exist to represent a concept we use them rather than minting our own.
Rather than minting our own URI for artist biographic info we use Wikipedia’s.
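The "reuse before minting" rule can be put as a small sketch. Everything here is hypothetical and for illustration only: the lookup table, the `uri_for` helper and the example.org base are not the BBC's actual system, and the Wikipedia address simply stands in for an existing URI.

```python
from urllib.parse import quote

# Hypothetical table of concepts that already have a URI elsewhere on the web.
EXISTING = {
    "U2 biography": "http://en.wikipedia.org/wiki/U2",
}

def uri_for(concept, existing=EXISTING, base="http://example.org/concepts/"):
    """Return (uri, minted): reuse an existing URI if one is known, else mint one."""
    if concept in existing:
        return existing[concept], False
    slug = quote(concept.lower().replace(" ", "-"))
    return base + slug, True

print(uri_for("U2 biography"))
print(uri_for("Brown Bear"))
```

Only the second call mints anything; the first simply points at the identifier that was already there.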