Making more human literate (ISKO 2009)


  • The BBC is a big place and I would like to suggest that in some ways it represents a microcosm of the wider web. We produce so much content that traditional 'left hand nav' style navigation doesn't work, either from a UX point of view or from a coordination and governance point of view.

    As a result the BBC has historically created a series of microsites, each coherent in its own right but not across the breadth of BBC content.

    Consider, for example: I can navigate around a Radio 4 site about the opening of the LHC... but...
  • I can’t find everything the BBC knows about Brian Cox...
  • ...nor everything about Paul Weller, or any other artist or subject.
  • But things are changing..

    It has been my honour to work on a few projects where we took a different approach: starting with the data and how people think about it, rather than starting from the web page down.

    And when I say data I really mean starting with understanding what things people care about and giving each of those things a URI.

    /programmes - ensures every programme the BBC broadcasts has a web presence, has a URI. And that that URI can be dereferenced to return an HTML document, an RDF document, JSON, iCal or mobile views.

    /music - is built with MusicBrainz and gives us a page for every artist the BBC plays. Because it’s built with MusicBrainz, not only do we get a URI per artist, we also get links into the rest of the web and lovely web-scale identifiers.

    I’m now working on a new project, BBC Earth, which is seeking to bring the BBC’s Natural History content online in a similar fashion.

    A page per species, habitat, behaviour and adaptation - all linked to the programme space and the wider web through dbpedia and elsewhere.

    And of course as with programmes and music the API is the website - the URIs can return RDF, JSON etc. as well as HTML.
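The "API is the website" idea can be sketched as content negotiation over a single URI. This is a minimal, illustrative Python sketch, not the BBC's implementation; the renderer functions and media types here are placeholders.

```python
# Minimal sketch of "the API is the website": one resource, one URI,
# several representations chosen by the requested media type.
# Renderers and media types are illustrative placeholders.

RENDERERS = {
    "text/html": lambda res: "<h1>%s</h1>" % res["title"],
    "application/rdf+xml": lambda res: '<rdf:Description rdfs:label="%s"/>' % res["title"],
    "application/json": lambda res: '{"title": "%s"}' % res["title"],
}

def dereference(resource, accept="text/html"):
    """Return a representation of the same resource for the given Accept type."""
    renderer = RENDERERS.get(accept, RENDERERS["text/html"])  # fall back to HTML
    return renderer(resource)

episode = {"title": "The Big Bang Machine"}
html_view = dereference(episode)                      # HTML for browsers
json_view = dereference(episode, "application/json")  # data for machines
```

The point of the sketch is that there is no separate API endpoint: the same identifier serves every consumer, and only the representation varies.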
  • Of course what I’m talking about is Linked Data... even if we didn’t quite realise that when we started.

    But the idea that we should care about our URIs, care about having one per concept, care about having machine representations for those resources instead of a separate API has helped us build a coherent, scalable, sane service. One that we hope is a bit more human literate.

    Linking Open Data is a grassroots project to use web technologies to expose data on the web. For many people it is synonymous with the semantic web, and while this isn’t quite true, it does, as far as I’m concerned, represent a very large subset of the semantic web project.

    Although, curiously, it can also be thought of as ‘the web done right’ - as it was originally designed to be.

    But what is it?

    Well it can be described with 4 simple rules.
  • The web was designed to be a web of things, not just a web of documents.

    Those documents make assertions about things in the real world but that doesn’t mean the identifiers can only be used to identify web documents.

    Just as a passport or driving licence, in the real world, can be thought of as an identifier for a person, so URIs can be used as identifiers for people, concepts or things on the web.

    Minting URIs for things rather than pages helps make the web more human literate because it means we are identifying those things that people care about.
  • The beauty of the web is its ubiquitous nature - the fact it is decentralised and able to function on any platform. This is because of TimBL’s key invention, the HTTP URI.

    URIs are globally unique, open to all and decentralised.

    Don’t go using DOIs or any other identifier scheme - on the web all you need is an HTTP URI.
  • And obviously you need to provide some information at that URI. When people dereference it you need to give them some data - ideally as RDF as well as HTML.

    Providing the data as RDF means that machines can process that information for people to use, making it more useful.
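What "providing the data as RDF" amounts to can be sketched as emitting machine-readable subject-predicate-object triples. The serializer below is a minimal illustration using the N-Triples format; the example URIs and properties are invented for the sketch, not the BBC's actual vocabulary.

```python
# Sketch: machine-readable assertions as (subject, predicate, object)
# triples, serialized as N-Triples. All URIs below are illustrative.

def to_ntriples(triples):
    """Serialize triples; objects already wrapped in quotes are literals."""
    lines = []
    for s, p, o in triples:
        obj = o if o.startswith('"') else "<%s>" % o  # literal vs. resource
        lines.append("<%s> <%s> %s ." % (s, p, obj))
    return "\n".join(lines)

triples = [
    ("http://example.org/artists/1", "http://xmlns.com/foaf/0.1/name",
     '"Paul Weller"'),
    ("http://example.org/artists/1", "http://www.w3.org/2002/07/owl#sameAs",
     "http://dbpedia.org/resource/Paul_Weller"),
]
doc = to_ntriples(triples)
```

The second triple is the important one: an owl:sameAs-style link that ties the local identifier to the wider web, which is what makes the data linkable rather than merely published.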
  • And of course you also need to provide links to other resources so people can continue their journey.

    And that means contextual links to other resources elsewhere on the web, not just your site.

    And that’s it. Pretty simple.

    And I would argue that, other than the RDF bit, these principles should be followed for any website - they just make sense.
  • Before the Web people still networked their computers - but to access that network you needed to know about the network, the routing and the computers themselves.

    For those in their late 30s, you’ll probably remember the film WarGames - because it was made before the Web, the characters had to find and connect to each computer individually. The joy of the web is that it adds a level of abstraction - freeing you from the networking, routing and server location and letting you focus on the document.
  • The evolution from ARPANET, to Internet, to Web gave us increasing levels of abstraction, making the technology more and more human centric because it allowed us to stop worrying about the location of servers and their networking and start thinking about documents.

    The semantic web project I hope will help the BBC to move away from caring about the document and towards the ideas, concepts and things we care about.

    So you can find all things Brian Cox, or follow your nose from an episode of Blue Planet via a page about dolphins to everything the BBC knows about echolocation.

    Following the principles of Linked Data allows us to add a further level of abstraction - freeing us from the document and letting us focus on the things, people and stuff that matters to people.

    It helps us design a system that is more human literate, and more useful.

    This is possible because we are identifying real-world things and the relationships between them.
  • Making data available as data means it’s more useful because other people can build with it to create interesting things for people.

    Of course you don’t have to use Linked Data to achieve this - there are other ways of doing it - lots of sites now provide APIs, which is good, just not great.

    Each of those APIs tends to be proprietary and specific to its site. As a result there’s an overhead every time someone wants to add a new data source.

    These APIs give you access to the silo - but the silo still remains.

    Using RDF and LOD means there is a generic method to access data on the web.
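That generic access method can be illustrated in miniature: because Linked Data names things with shared global URIs, combining two datasets is just a set union, and "follow your nose" navigation is a plain graph traversal. The triples and identifiers below are invented for illustration.

```python
# Sketch: Linked Data sources that share global identifiers merge by
# simple set union -- no per-site API glue. All identifiers illustrative.

bbc_data = {
    ("ex:BluePlanet", "ex:features", "ex:Dolphin"),
}
other_data = {
    ("ex:Dolphin", "ex:usesAdaptation", "ex:Echolocation"),
}

merged = bbc_data | other_data  # one graph, because the URIs line up

def follow_your_nose(graph, start, hops=2):
    """Collect every resource reachable from start within the given hops."""
    seen = {start}
    frontier = {start}
    for _ in range(hops):
        frontier = {o for s, p, o in graph if s in frontier} - seen
        seen |= frontier
    return seen
```

With a proprietary API per silo, the merge step would instead mean writing bespoke client code for each source.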
  • So what are we doing at the BBC?

    First up it’s worth pointing out the obvious, the BBC is a big place and so it would be wrong to assume that everything we’re doing online is following these principles. But there’s quite a lot of stuff going on.

    There are two areas of activity: the first, root-and-branch development tackling a specific domain; the second, retrofitting technologies to the existing legacy content to make it more coherent.
  • There’s a lot of it. Loads. And we can’t very well ignore it or leave it behind. We need a way to describe and link the existing microsites together.
  • We are doing that by using Wikipedia text as an evidence set to recommend IDs.

    We can start to automate some of this, extracting concepts and recommending tags before publishing, and then putting a navigation badge back on the page.
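The extract-and-recommend step described above can be sketched as a lookup against a controlled vocabulary, including the equivalence mapping of synonyms to one identifier. The labels and DBpedia-style IDs below are illustrative, not the actual BBC pipeline.

```python
# Sketch of concept extraction with a controlled vocabulary: labels found
# in the text are mapped to identifiers; synonyms ("equivalence mapping")
# share one ID. Vocabulary entries and IDs are illustrative.

VOCABULARY = {
    "gm crops": "dbpedia:Genetically_modified_crops",
    "gm food": "dbpedia:Genetically_modified_food",
    "frankenstein food": "dbpedia:Genetically_modified_food",  # synonym
    "pig farming": "dbpedia:Pig_farming",
}

def suggest_tags(text):
    """Recommend concept IDs whose labels appear in the text."""
    lowered = text.lower()
    return sorted({cid for label, cid in VOCABULARY.items() if label in lowered})

tags = suggest_tags("Jimmy Doherty on pig farming and GM crops")
```

A real pipeline would use statistical evidence rather than substring matching, but the shape is the same: free text in, controlled-vocabulary identifiers out, with an editor approving the recommendations before publication.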
  • These navigation badges then link off to topic or aggregation pages, which in turn link through to other content pages, allowing people to navigate between pieces of content, via topic pages, following their areas of interest.
  • What about the new stuff?

    Well, the BBC’s programme support, music discovery and, soon, natural history content are all being designed and built from the ground up following the LOD principles described earlier.

    In other words persistent HTTP URIs that can be dereferenced to HTML, RDF, JSON and mobile views for programmes, artists, species and habitats each with further links on to other resources.
  • That means we have a page for every programme the BBC broadcasts on TV and Radio.
  • And we have separate pages for every artist the BBC plays on the new music site.
  • Making more human literate (ISKO 2009)

    1. Making more human literate Tom Scott
    2. The BBC has historically created a series of microsites – each coherent in their own right but not across the breadth of BBC content Radio 4 Big Bang
    3. so I can’t currently find everything Brian Cox...
    4. ...Paul Weller... Paul Weller
    5. ...Lion...
    6. ...or even Jeremy Clarkson
    7. Nor can I follow my nose, I can’t browse by meaning, from one page to the next following a semantic thread Snickers
    8. But things are changing
    9. Linked Data has helped us build a coherent, scalable, sane service. One that we hope is a bit more human literate. Linked Data cloud diagram
    10. Use URIs to identify things not only documents How it works: The Web
    11. Use HTTP URIs - globally unique names that anyone can dereference Colon Slash Slash
    12. Provide useful information [in RDF] when someone looks up a URI Information Desk
    13. Include links to other URIs to let people discover related information Links
    14. But why? Good Question
    15. Increasing levels of abstraction make technology more and more human centric an old lady with an umbrella in Ravangla market!
    16. Linked Data frees information from data silos Silos
    17. Linked Data at the BBC Test Card X
    18. The legacy content BBC B - keyboard
    19. Concept extraction Pig farming, GM crops, environmental catastrophes, Argentina, Pennsylvania, Uganda, Jimmy Doherty Equivalence mapping GM crops, GM food, genetically-modified food, frankenstein food (etc.) You might like… GM Food Tagging Jimmy Doherty GM food, domestic pig, environmental disaster, Argentina, Pennsylvania (US state), Uganda, Jimmy Doherty (farmer) Pigs Environmental disaster Argentina Publishing engine GM Food, Jimmy Doherty, Pigs, Environmental disaster, Argentina DBpedia (Wikipedia) - as a controlled vocabulary
    20. You might like You might like Cold War Nuclear warfare Soviet Union John F Kennedy Espionage Communism John Le Carre Cuba CONTENT PAGE NAVIGATION PAGE CONTENT PAGE Navigation badges and topic pages join legacy pages together
    21. The new stuff... Internet
    22. A page per programme
    23. In the music domain we have a page for every artist the BBC plays
    24. And for natural history... species pages...
    25. ...habitat pages...
    26. ...and adaptation pages
    27. But the context lives in the joins between these domains
    28. Programmes on species pages
    29. Track listings on episode pages linked to artist pages
    30. Programmes on artist pages
    31. And because the web is about URIs not pages
    32. Separate URLs for each resource
    33.
    34. and the rest of the web
    35. In other words everything is addressable and therefore meshable across the web and across domains with the BBC Droplet highway
    36. One URI many representations
    37. One URI many representations
    38. One URI many representations
    39. Programmes ontology Understanding the big BBC graph Music ontology My blog