Introducing linked data into BBC News online


A presentation for the 2013 AGM of the IPTC held in Paris on June 24-26

Published in: Technology, Education

  • UK's most popular news website: 6 million unique browsers every day (3rd biggest site in the UK after Google and Facebook); publish around 500 articles every day (local, national, global); publish in 27 languages as World Service (plus 2 UK languages alongside English); hundreds of journalists, many working cross-media (TV/radio/online)
  • articles created in a home-grown Content Management System; flat page publishing via FTP - good for high-load events but limits our UX and data potential
  • - need to minimise impact on journalists - integration with existing tools and workflow as much as possible
  • pilot - can we automate the production of the local news region sub-index pages? (maintaining these pages is currently a manual task) - GET articles about or mentioning places that fall within the BBC News region
  • - a simple ontology for people, organisations, places and intangibles (themes) and their intersection with events - based on rNews, the Event ontology and PA's SNaP Stuff ontology - annotate articles with events, where the event:place is Birmingham etc.
  • - IPTC rNews terms in RDFa - basic publishing metadata in the <head> for rich snippets - linked open data in the body
  • - immediate results - rich snippets for articles - apparently better ranking by topic (anecdotal)
  • - we introduced the change in the first week of May - by the end of May we were seeing some positive press coverage; people were noticing
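
The pilot query in the notes above (articles about or mentioning places that fall within a BBC News region) can be sketched as a simple filter. This is a minimal illustration, not the BBC's implementation: the `Article` class, the `REGION_PLACES` mapping, the `example-region` key and the grouping of places under it are all hypothetical, and a production system would presumably query a linked data store rather than in-memory sets.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    uri: str
    headline: str
    published: str  # ISO 8601 date, so plain string sort is chronological
    places: set = field(default_factory=set)  # place tags accepted by a journalist


# Hypothetical mapping from a region key to the places it covers.
REGION_PLACES = {
    "example-region": {"Birmingham", "Solihull"},
}


def region_index(articles, region):
    """Return articles tagged with any place in the region, newest first."""
    places = REGION_PLACES[region]
    matching = [a for a in articles if a.places & places]
    return sorted(matching, key=lambda a: a.published, reverse=True)
```

Because the index is derived from the tags, an article tagged with several places in one region still appears only once. The duplication and pinning issues noted in the slides would need extra logic that this sketch does not attempt.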

    1. 1. Linked Data in BBC News IPTC AGM June 25th 2013
    2. 2.
    3. 3. moving to linked data • moving from static HTML to dynamic, responsive site • introducing linked data to power content aggregations around related topics • starting to embed linked open data in every page as RDFa • using the IPTC rNews vocabulary to describe content in a machine-readable way
    4. 4. impact on journalists • annotating (“tagging”) content with topics • tool embedded into existing CMS • concept extraction/NLP for topic suggestion • journalists accept/reject suggested topics for annotation
    5. 5. pilot - local indexes
    6. 6. learning from the pilot • generally - it works • but duplication for big events • also need pinning • concept extraction poor • journalists gaming the system
    7. 7. corenews model
    8. 8. pilot - publishing RDFa • using RDFa + rNews to embed machine-readable metadata in article source code • discoverability: rich snippets + better ranking • publish Linked Open Data: <articleURI> rdf:type rnews:Article; <articleURI> rnews:about <thingURI>; etc.
    9. 9. learning from the pilot
    10. 10. learning from the pilot
    11. 11. next steps • rolling out tagging to journalists throughout BBC News • making better use of rNews/RDFa - full mark-up integration • piloting the use of storyline in data-driven news
    12. 12. more info • News-Linked-Data-Ontology • -05-01.shtml • • twitter: @jeremytarling
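
The two statements on the "publishing RDFa" slide (`<articleURI> rdf:type rnews:Article` and `<articleURI> rnews:about <thingURI>`) can be written out as full triples. The sketch below emits them as N-Triples lines rather than RDFa markup, purely to show the shape of the data. The `article_triples` helper and the `example.org` URIs are hypothetical, and the rNews namespace used is assumed to be the rNews 1.0 one; check it against the IPTC documentation before relying on it.

```python
# Standard RDF namespace for rdf:type.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Assumed rNews 1.0 namespace (verify against the IPTC rNews documentation).
RNEWS = "http://iptc.org/std/rNews/2011-10-07#"


def article_triples(article_uri, thing_uris):
    """Build N-Triples lines typing an article and linking it to its topics."""
    lines = [f"<{article_uri}> <{RDF_TYPE}> <{RNEWS}Article> ."]
    for thing_uri in thing_uris:
        lines.append(f"<{article_uri}> <{RNEWS}about> <{thing_uri}> .")
    return lines


if __name__ == "__main__":
    # Hypothetical URIs, for illustration only.
    for line in article_triples(
        "http://example.org/news/article-1",
        ["http://example.org/things/birmingham"],
    ):
        print(line)
```

In the RDFa form described in the slides, the same statements would sit in attributes on the article's HTML elements instead of in a separate file.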