Bbc semantic


    Bbc semantic Presentation Transcript

    • Media Meets Semantic Web – How the BBC Uses DBpedia and Linked Data to Make Connections
      • Georgi Kobilarov et al., ESWC 2009
      • © www.sti-innsbruck.at, Copyright 2008 STI Innsbruck
    • • The BBC is working to integrate data and link documents across BBC domains
      • Collaboration with Freie Universität Berlin, Rattle Research (and Ontotext)
      • Semantic Web context: use of Linked Data from MusicBrainz and DBpedia
    • Problem
      • The BBC publishes large amounts of online content (text, video and audio)
      • Most of it is data for broadcast brands and domain-specific microsites
      • Services are divided by domain, e.g. food, music, news
      • Result: no interlinking between these domain-specific sites, so the full potential of the available data goes unused
    • Objectives
      • DBpedia provides a common "controlled" vocabulary and equivalency service, which in turn is used to add "topic badges" to existing legacy web pages
      • Soft transition from the old system to the new one
        – Develop a new service that supports the branding of BBC radio stations, TV channels and programmes (bbc.co.uk/programmes)
        – Develop a new music offering (bbc.co.uk/music/beta) that builds on existing open web standards and is fully integrated with the programme support service
        – Simple navigational elements (i.e. topic badges and term extraction) to support contextual, semantic navigation
        – A common set of web-scale identifiers to help classify all BBC online content (and external URLs) and to create equivalency between multiple vocabularies
    • Cross-Linking Legacy Content with Legacy Systems
      • Desire to link to further BBC domains (apart from programmes and music)
        – Through an "about" relationship between programmes, people, places and subjects
      • The existing data was created with a legacy auto-categorization system called CIS
      • CIS holds a hierarchy of terms in five main top-level classes:
        – Proper names
        – Subjects
        – Brands
        – Time periods
        – Places
      • Objects identified within /programmes and /music also appear in other domains, so a mechanism is needed to map between equivalent terms
      • Approach: link CIS concepts to DBpedia
    • Linking BBC Domains
    • Linking BBC Domains
      • Weighted DBpedia label lookup, using Wikipedia inter-article links as the weight indicator
        – links(redirect) * log2(weight(article))
      • Context-based disambiguation
        – Disambiguate candidate concept matches by clustering them to identify the shared context of related CIS terms, then finding the corresponding context in DBpedia
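The weighting formula on this slide can be illustrated with a short sketch. The slide does not define its terms, so the reading used here is an assumption: `redirect_links` stands for the number of Wikipedia links pointing at the matching label or redirect, and `article_weight` for the number of inter-article links pointing at the target article. The `Candidate` type and the `score` and `rank` functions are hypothetical names, not part of the BBC system.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One possible DBpedia match for a CIS label (hypothetical structure)."""
    uri: str
    redirect_links: int   # assumption: links pointing at the matching label/redirect
    article_weight: int   # assumption: inter-article links pointing at the article


def score(candidate: Candidate) -> float:
    """Apply links(redirect) * log2(weight(article)) from the slide."""
    if candidate.article_weight < 1:
        # log2 is undefined for values <= 0; treat unlinked articles as score 0.
        return 0.0
    return candidate.redirect_links * math.log2(candidate.article_weight)


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Return candidates ordered from highest to lowest score."""
    return sorted(candidates, key=score, reverse=True)


if __name__ == "__main__":
    matches = [
        Candidate("http://dbpedia.org/resource/Example_A", 4, 8),
        Candidate("http://dbpedia.org/resource/Example_B", 10, 2),
    ]
    for c in rank(matches):
        print(c.uri, score(c))
```

Under this reading the logarithm dampens the influence of very heavily linked articles, so a label with many direct links does not automatically lose to a popular but loosely related article.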
    • Linking Documents to Concepts
      • Named entity extraction system: Muddy Boots
        – Chosen over solutions from OpenCalais, Twine and Zemanta because it reuses existing web identifiers, i.e. Wikipedia/DBpedia URIs
      • Recognizes entities in BBC News articles
      • Uses the DBpedia identifier for each of those entities
      • A Content Link Tool lets editors add or remove DBpedia identifiers for any given BBC URL
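The Content Link Tool is described only by its function: adding or removing DBpedia identifiers for a BBC URL. A minimal in-memory sketch of that idea follows. The `ContentLinks` class and its method names are hypothetical, and the URLs in the usage example are placeholders rather than real BBC pages.

```python
from collections import defaultdict


class ContentLinks:
    """Hypothetical store mapping page URLs to sets of DBpedia URIs."""

    def __init__(self) -> None:
        self._links: dict[str, set[str]] = defaultdict(set)

    def add(self, url: str, dbpedia_uri: str) -> None:
        """Attach a DBpedia identifier to a page; adding twice is harmless."""
        self._links[url].add(dbpedia_uri)

    def remove(self, url: str, dbpedia_uri: str) -> None:
        """Detach an identifier, e.g. to correct a wrong automatic match."""
        if url in self._links:
            self._links[url].discard(dbpedia_uri)

    def concepts_for(self, url: str) -> list[str]:
        """All identifiers currently attached to a page, sorted."""
        return sorted(self._links.get(url, ()))

    def urls_about(self, dbpedia_uri: str) -> list[str]:
        """All pages tagged with a given identifier, sorted."""
        return sorted(u for u, uris in self._links.items() if dbpedia_uri in uris)


if __name__ == "__main__":
    links = ContentLinks()
    article = "http://example.org/news/1"  # placeholder URL
    links.add(article, "http://dbpedia.org/resource/London")
    links.add(article, "http://dbpedia.org/resource/Paris")
    links.remove(article, "http://dbpedia.org/resource/Paris")
    print(links.concepts_for(article))
```

Keeping identifiers in a set per URL makes manual corrections idempotent, which suits a tool whose purpose is to let editors fix automatic extraction results.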
    • Create User Journeys: Topic Pages and Navigation Badges
      • Topic pages
        – Creation of aggregation pages of unstructured and structured content
        – Pull together the modeled world of BBC programmes (CIS identifiers mapped to DBpedia) and the unstructured world of BBC News articles
      • Navigational badges
        – Once a user has entered an area of BBC content there are few links through to other related content
        – Providing this link is the role of the navigation badge
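The topic-page idea can be sketched as a grouping step: programmes carry CIS identifiers, news articles carry DBpedia identifiers, and a CIS-to-DBpedia mapping lets both be collected under one topic. The function name, data shapes and identifiers below are assumptions made for illustration and do not reflect the BBC's actual implementation.

```python
from collections import defaultdict


def build_topic_pages(
    cis_to_dbpedia: dict[str, str],
    programmes: dict[str, list[str]],
    news_articles: dict[str, list[str]],
) -> dict[str, dict[str, list[str]]]:
    """Group programme and news URLs under the DBpedia URI they are about.

    cis_to_dbpedia: CIS term id -> DBpedia URI (hypothetical mapping table)
    programmes:     programme URL -> CIS term ids
    news_articles:  article URL -> DBpedia URIs from entity extraction
    """
    pages: dict[str, dict[str, list[str]]] = defaultdict(
        lambda: {"programmes": [], "news": []}
    )
    for url, cis_terms in programmes.items():
        for term in cis_terms:
            uri = cis_to_dbpedia.get(term)
            # Skip CIS terms that have no DBpedia equivalent yet.
            if uri is not None and url not in pages[uri]["programmes"]:
                pages[uri]["programmes"].append(url)
    for url, uris in news_articles.items():
        for uri in uris:
            if url not in pages[uri]["news"]:
                pages[uri]["news"].append(url)
    return dict(pages)


if __name__ == "__main__":
    pages = build_topic_pages(
        {"cis:123": "http://dbpedia.org/resource/Polar_bear"},
        {"http://example.org/programmes/p1": ["cis:123", "cis:999"]},
        {"http://example.org/news/n1": ["http://dbpedia.org/resource/Polar_bear"]},
    )
    print(pages)
```

A navigation badge on any of the grouped pages could then link to the topic page keyed by the shared DBpedia URI.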
    • Conclusions
      • User experience is at the center of the BBC's efforts
      • Semantics as enabler
      • What we can learn from the BBC
        – The user should be at the center of efforts
        – Pages are not strictly structured according to the domain model
        – Semantics primarily enable smart interlinking to additional content
        – Well-hidden magic
        – Simplicity of domain models is beauty
      • For more information, refer to the "Beyond the polar bear" presentation
        – http://www.slideshare.net/reduxd/beyond-the-polar-bear