Transcript of "BBC Linked Data Platform (SemTechBiz San Fran 2013)"
5 June 2013
BBC Linked Data Platform
Using semantic technologies to make our content more connected and more discoverable
A (very) short history
✤ Dynamic Semantic Publishing
✤ BBC Sport - Transition from ‘static’ to ‘dynamic’
✤ Introduction of Semantic Technologies for World Cup 2010
✤ Raising the bar for Olympics 2012
✤ Linked Data Platform & The Creative Work
Athletes & Medals: from trackside to our audience
✤ Minimal metadata
✤ Enough non-semantic metadata to support ‘rich links’ in a wide range of applications
✤ Enough semantic metadata (tags) to support discovery through
✤ Full metadata requires a content-type-specific metadata API
✤ Access to content requires a content API
✤ Automated index pages/feeds
✤ Semantic navigation
✤ Semantic search
✤ A typical query:
✤ Top 10, most recent, BBC News Items about Politicians who are members of The Labour Party
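The typical query above can be sketched over a tiny in-memory set of triples. This is only an illustration of the shape of the query (type filter, membership filter, sort by recency, limit), not the platform's actual SPARQL or data model: every identifier, predicate name, and date below is a hypothetical stand-in.

```python
from datetime import date

# Hypothetical (subject, predicate, object) triples; names are illustrative only.
TRIPLES = [
    ("person:1", "rdf:type", "Politician"),
    ("person:1", "memberOf", "party:labour"),
    ("person:2", "rdf:type", "Politician"),
    ("person:2", "memberOf", "party:other"),
    ("news:a", "rdf:type", "NewsItem"),
    ("news:a", "about", "person:1"),
    ("news:a", "dateCreated", date(2013, 5, 1)),
    ("news:b", "rdf:type", "NewsItem"),
    ("news:b", "about", "person:2"),
    ("news:b", "dateCreated", date(2013, 5, 20)),
    ("news:c", "rdf:type", "NewsItem"),
    ("news:c", "about", "person:1"),
    ("news:c", "dateCreated", date(2013, 6, 1)),
]


def subjects(predicate, obj):
    """Return every subject that has the given predicate/object pair."""
    return {s for s, p, o in TRIPLES if p == predicate and o == obj}


def objects(subject, predicate):
    """Return every object attached to a subject via the given predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def recent_news_about_party_politicians(party, limit=10):
    """Most recent news items tagged with a politician who belongs to `party`."""
    politicians = subjects("rdf:type", "Politician") & subjects("memberOf", party)
    matches = [
        item
        for item in subjects("rdf:type", "NewsItem")
        if any(topic in politicians for topic in objects(item, "about"))
    ]
    # Newest first, then keep only the top `limit` items.
    matches.sort(key=lambda item: objects(item, "dateCreated")[0], reverse=True)
    return matches[:limit]
```

Calling `recent_news_about_party_politicians("party:labour")` on this sample data selects the two items tagged with the Labour politician and orders them newest first. In a real triple store the same joins, ordering, and limit would be expressed in the query language rather than in application code.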
Powered by LDP
BBC Olympics 2012
BBC Knowledge & Learning Beta
BBC News Local Beta
BBC Sport Mobile App
Our own URIs
✤ Everything has a ‘Thing URI’:
✤ Opaque ID, dereferenceable*
✤ BBC controls identity, therefore quality & consistency
✤ bbc:sameAs to DBpedia, Wikidata, Freebase etc
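As a rough illustration of the idea, an opaque, publisher-controlled identifier can be linked out to external datasets and looked up in either direction. Only the `bbc:sameAs` predicate name comes from the slide; the URIs below are placeholders on example.org, not real BBC or external identifiers, and the helper functions are hypothetical.

```python
# Placeholder URIs: the opaque ID carries no meaning, the links carry the identity.
SAME_AS_LINKS = [
    ("http://example.org/things/3f2a9c#id", "bbc:sameAs",
     "http://dbpedia.example.org/resource/Some_Person"),
    ("http://example.org/things/3f2a9c#id", "bbc:sameAs",
     "http://wikidata.example.org/entity/Q0000"),
    ("http://example.org/things/77b1d0#id", "bbc:sameAs",
     "http://dbpedia.example.org/resource/Some_Place"),
]


def external_equivalents(thing_uri):
    """All external URIs declared equivalent to one of our own Thing URIs."""
    return sorted(o for s, p, o in SAME_AS_LINKS if s == thing_uri and p == "bbc:sameAs")


def thing_for_external(external_uri):
    """Map an external URI back to our own Thing URI, or None if unknown."""
    for s, p, o in SAME_AS_LINKS:
        if p == "bbc:sameAs" and o == external_uri:
            return s
    return None
```

Keeping the links as data, separate from the identifier itself, is what lets the publisher keep control of identity while still mapping to several external datasets at once.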
Our own ontologies
✤ Core set of ontologies that are BBC owned
✤ Creative Work, BBC, (Organisational) Provenance, etc
✤ Ability to change regularly and unilaterally
✤ Provide ‘mappings’ to more widely used ontologies
✤ Domain ontologies can be shared or reused
✤ Sport, Politics, GeoLocation, etc
✤ Provided through Mashery
✤ ‘Connected Studio’ events will validate
✤ Public beta to follow
✤ JSON-LD & Turtle
✤ Self-provisioned, cloud-based
✤ Data Dumps
Managing concepts across BBC
✤ Which domain ‘owns’ Arnold Schwarzenegger?
✤ News? Entertainment? History? Politics?
✤ Can domains ‘own’ predicates?
✤ Layering information over shared concepts
✤ High quality sub-sets vs. lower quality ‘long-tail’
✤ Synchronisation with external datasets
✤ Tools for creating and managing concepts
✤ Emerging, splitting & combining concepts
✤ Linked Data gives us a language to solve these problems
Often subjective, never complete
✤ What is this TV programme about?
✤ Manual tag curation
✤ Long-term expense
✤ Automated tag generation
✤ Short-term expense
✤ Value in data or algorithm?
✤ Relies on assumptions
✤ Our approach? Invest in both. Validate learnings.
When to reason?
✤ Our options...
✤ Before writing to the triple store
✤ Materialised in the triple store (Forward-chaining inference)
✤ Inferred by the SPARQL engine (Backward-chaining inference)
✤ After SPARQL results have returned
✤ None/some/all of the above
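The difference between two of these options, materialising inferences up front and inferring them at query time, can be sketched with a toy subclass hierarchy. This is a simplified illustration, assuming a single-parent class hierarchy and only type inference; the class names and identifiers are hypothetical and no particular triple store or reasoner is implied.

```python
# Hypothetical single-parent class hierarchy and asserted instance types.
SUBCLASS_OF = {"Footballer": "Athlete", "Athlete": "Person"}
ASSERTED_TYPES = {("thing:1", "Footballer"), ("thing:2", "Athlete")}


def superclasses(cls):
    """Yield the class itself followed by each of its ancestors."""
    while cls is not None:
        yield cls
        cls = SUBCLASS_OF.get(cls)


def materialise(asserted):
    """Forward chaining: compute every inferred type once, at write time."""
    return {(s, ancestor) for s, cls in asserted for ancestor in superclasses(cls)}


def instances_of_forward(target, store):
    """Query a materialised store: a plain lookup, no reasoning needed."""
    return {s for s, cls in store if cls == target}


def instances_of_backward(target, asserted):
    """Backward chaining: keep only asserted types, walk the hierarchy per query."""
    return {s for s, cls in asserted if target in superclasses(cls)}
```

Both routes return the same answers. The materialised store is larger (five type statements here instead of two) but cheap to query, while the backward-chaining version stores less and pays the cost on every query.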
Maturity of SemanticTech
✤ From a Software Industry perspective, Semantic (RDF) Technology is not mainstream and is therefore hard to sell
✤ Library/application immaturity can be a hindrance to innovation
✤ I believe the Sem Tech industry needs to focus on simplicity and abstraction
✤ Semantic Technology is complex, but using it need not be
Find out more
✤ Video from QCon London 2013:
✤ BBC Internet Blog: