SlideShare a Scribd company logo
1 of 78
BBC Dynamic Semantic Publishing [DSP]
               MarkLogic User Group July 2012
               •   Jem Rayfield : Lead Technical Architect
               •   BBC Future Media




Future Media                                                 © BBC MMXII
Outline


               BBC News Online

               BBC World Cup 2010

               BBC Sport 2012 + Olympics

               BBC News Mobile




Future Media                               © BBC MMXII
Radio since 1922   TV Since 1930   Web since 1994




 Future Media                              © BBC MMXII
online
http://bbc.co.uk/news
  Future Media          © BBC MMXII
BBC News [Static Publishing]




Future Media                   © BBC MMXII
Static News Architecture




Future Media               © BBC MMXII
BBC
CPS/CMS

Asset
Authoring




Future Media   © BBC MMXII
BBC
CPS/CMS

Index
Authoring




Future Media   © BBC MMXII
Static News
The Good

               1) Simple

               2) Scales cheaply

               3) Difficult to break
                [bad rendering logic etc..]

               4) Handles high load




Future Media                                  © BBC MMXII
Static News
The BAD        1) Relational taxonomic
               meta model
               2) Static! Inflexible! SSI!
               3) Document publishing
               4) Content non re-usable
               5) Content non
               repurpose-able
               6) Difficult to personalize

               7) Publication per output

Future Media                                 © BBC MMXII
BBC World
                       Cup 2010




http://bbc.co.uk/worldcup
  Future Media               © BBC MMXII
World Cup 2010


2. 32 teams, 8 groups, 736 players  776 pages

4. Fixtures & Results, Groups & Teams pages

6. To many web pages for too few journalists

8. Improve the publishing system to help achieve all of this


  Future Media                                                 © BBC MMXII
Page Per Player
http://news.bbc.co.uk/sport/football/world_cup_2010/groups_and_teams/team/england/wayne_rooney




     Future Media                                                                  © BBC MMXII
Page
Per
Team




 Future Media   © BBC MMXII
Page
Per
Group




 Future Media   © BBC MMXII
Open Sport Ontology
BBC Sport: http://www.bbc.co.uk/ontologies/sport




   Future Media                                    © BBC MMXII
Semantic publishing




               TRIPLE STORE




                              ONTOLOGY




                                             USER EXPERIENCE


Future Media                                                   © BBC MMXII
Graffiti: Suggest -> Tag [Player]




Future Media                        © BBC MMXII
Graffiti: Suggest -> Tag [Location] (Geonames)




  Future Media                           © BBC MMXII
Graffiti Demo…
  (maybe video,
  depending on wifi… )




Journalism               © BBC MMIX
Rationale
  • Automated content publishing
  • Huge increase in content breadth (number of manageable pages)
  • Content re-use and re-purposing, increasing reach
  • Simplified content management
  • Journalist headcount reduction
  • Multi-dimensional entry points and semantic navigation
  • Improved user experience with high levels of user engagement
  • Dynamic, state (time|event) and semantic driven page layout
  • Personalized content aggregations
  • Open data and API’s
Future Media                                                 © BBC MMXII
World Cup DSP Architecture




Future Media                 © BBC MMXII
API
Stack




Future Media   © BBC MMXII
Highly Scalable Clustered BigOWLIM




Journalism                           © BBC MMIX
Extendable
Domain Driven
Asset
Tagging




Journalism      © BBC MMIX
Open Ontology/Dataset reuse
Event | Geonames | Foaf | Etc.




Journalism                       © BBC MMIX
Infer… player->team->competition




Journalism                         © BBC MMIX
REST API
             Content negotiation: json rdf, xml rdf, turtle

             Publically accessible (with SSL cert)

             GET /sport/football/teams/<TEAM>
             Accept: application/rdf+json

             GET /sport/football/<COMPETITION>
             Accept: application/rdf+xml

             GET /assets/<ASSET>
             Accept: text/rdf+n3

Journalism   Etc….                                            © BBC MMIX
GET Accept text/rdf+n3
https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea

      <http://www.chelseafc.com/>
         domain:documentType <http://www.bbc.co.uk/things/document-types/homepage> , <http://www.bbc.co.uk/things/document-types/external> .

      <http://www.bbc.co.uk/sport/football/teams/chelsea>
         domain:documentType <http://www.bbc.co.uk/things/document-types/bbc-document> , <http://www.bbc.co.uk/things/document-types/homepage> .

      <http://www.bbc.co.uk/things/2acacd19-6609-1840-9c2b-b0820c50d281#id>
         a      sport:CompetitiveSportingOrganisation ;
         domain:canonicalName
               "Chelsea"^^<xsd:string> ;
         domain:document <http://www.chelseafc.com/> , <http://www.bbc.co.uk/sport/football/teams/chelsea> ;
         domain:externalId <http://dbpedia.org/resource/Chelsea_F.C.> , <urn:sports-stats:137316635> ;
         domain:name "Chelsea" ;
         domain:shortName "Chelsea"^^<xsd:string> ;
         sport:competesIn <http://www.bbc.co.uk/things/5cd4682a-7643-f445-8b1f-bcbaf450bc89#id> .

      <http://dbpedia.org/resource/Chelsea_F.C.>
         domain:externalIdType
               <http://www.bbc.co.uk/things/external-id-types/dbpedia> .

      <urn:sports-stats:137316635>
         domain:externalIdType
             <http://www.bbc.co.uk/things/external-id-types/bbc-sport-stats> .

      <http://www.bbc.co.uk/things/5cd4682a-7643-f445-8b1f-bcbaf450bc89#id>
         domain:canonicalName
               "Premier League"^^<xsd:string> ;
         domain:externalId <urn:sports-stats:118996114> ;
         sport:competitionType
               <http://www.bbc.co.uk/things/competition-types/domestic-league> .

 Journalism                                                                                                                          © BBC MMIX
PHP->EasyRDF->API
 PHP Render layer consumes RDF from REST API via EasyRDF
 (http://www.aelius.com/njh/easyrdf/)

 EasyRDF open PHP library
 (Primary committer Nicholas Humfrey BBC)

             protected function getOptions() {
                         return array(
                            "config" => array("usecert" => true),
                                    "headers" => array(
                                    "Accept" => "application/rdf+json",
                                    "X-Expect" => "http://www.bbc.co.uk/things/platforms/hiweb"
                        )
             );

             $options = $this->getOptions()
             $response = $this->get("https://api.test.bbc.co.uk/dsp/sport/football/teams/chelsea", $options)
             $this->data = new EasyRdf_Graph("http://www.bbc.co.uk", $response->getBody());
             $teams = $this->data->allofType("sport:CompetitiveSportingOrganisation”)
Journalism                                                                                                     © BBC MMIX
World Cup statistics the GOOD
• 750+ Dynamic aggregations/pages (Player, Squad, Group, etc..)

• Average unique page requests a day : 2 million +

• Average OWLIM SPARQL queries a day : 1 million

• 100s RDF statement updates/inserts per minute with full OWL reasoning
and associated inference.

• Multi data center fully resilient, clustered 6 node triple store

• RDF graph model ideally suited to model domain representations such as
sport
 Future Media                                                        © BBC MMXII
World Cup statistics the BAD

• Sports stories and indices static

• Sport content not responsive or personalized

• RDF Store unable to handle thousands of statistic updates a second

• RDF Store forward-chained closures expensive increase write latency

• RDF graph model and SPARQL not ideally suited to the BBC’s News and
Sport document publication model


 Future Media                                                           © BBC MMXII
BBC Sport 2012;
                 Online Refresh




http://bbc.co.uk/sport
  Future Media                     © BBC MMXII
Sport Refresh 2012
• Page per Athlete [10,000+], Page per country [200+], Page per
  Discipline [400-500], Page per venue, Page per team
   A lot of output…

• Almost real time statistics and live event pages

• Time coded, metadata annotated, on demand video, 58,000
  hours of content

• Far too many web pages for far too few journalists

• DSP annotation architecture to automate content aggregation
   Future Media                                           © BBC MMXII
Sport Refresh 2012; Red -> Bright Yellow




 Future Media                          © BBC MMXII
10000+ Dynamic Aggregations




 Future Media                 © BBC MMXII
Lots of Dynamic (Live) sports stats




 Future Media                         © BBC MMXII
Dynamic Navigation




 Future Media        © BBC MMXII
Static Stories +
Dynamic includes




 Future Media      © BBC MMXII
Olympics - 27 Live Video Steams
Live Stats overlays
Stats -> Ontology driven aggregations




 Future Media                           © BBC MMXII
XML powered
  video chapter
  points


Page URL
http://www.test.bbc.co.uk/sport/olympics/2012/live-video/p00g2lqp


Xml api to get video log events
https://api.test.bbc.co.uk/olympicdata/bdf-log/pid/p00g2lqp




     Future Media                                                   © BBC MMXII
Augment architecture with a Content Store
2. Atomic content assets stored in MarkLogic XML store

4. XML content queryable via Xquery

6. Content Assets searchable

8. Sports statistics searchable/queryable via XQuery

10. Ontological SPARQL via BigOWLIM, assets Xquery via
 MarkLogic
  Future Media                                           © BBC MMXII
Static
Sports stats
(OLD)




 Future Media   © BBC MMXII
Dynamic
Sports stats




 Future Media   © BBC MMXII
Olympics XML etc..




 Future Media        © BBC MMXII
Ontology
  Aware
  NLP


• Information Workbench
• OWLIM
• (Spice) GATE+Ontotext




    Future Media          © BBC MMXII
Extraction Highlights
                    Ex-England         ? Roy Hodgson:
                    boss       Sven-   coach            Generic
                                       ? Roy Hodgson:
                    Goran Eriksson
                    says a "smear      hockey player    Analysis
                                       ? ……….
                    campaign" has                              …                   Update




                                                                         CES APP
                    been aimed at
                    Roy Hodgson                         KB Gazetteer
                    for omitting Rio
                    Ferdinand.
                                                               …
                                                               …
V Sven-Goran        V Rio Ferdinand    V Roy Hodgson:
                    -…….                                       …
Eriksson
-…….                - ……….
                                       coach
                                       -Roy Hodgson:           …                        OWLIM
- ……….                                 hockey player
                                       - ……….           Disambiguation   Retrain &
                                                               …         Adapt
                                                               …
               1.   Eriksson      (78%)                        …
               2.   Roy Hodgson (69%)                   Relevance
               3.   Rio Ferdinand (58%)                                                Curate
                                                        Ranking
               4.   …


     Future Media                                                                               © BBC MMXII
Disambiguation of Locations
•Geospatial distance - a feature of OWLIM
•Super region – GeoNames hierarchy and containment relations, e.g.
parentFeature
•RDF Rank
•Human approval score (on the basis of curated documents)
•Class/code based priority – fine grained ontology may allow a rule or machine
learning prioritization of classes and entities based on learning we already have.
•Asset geo association - some entities could be disambiguated by using the
asset domain association. BBC UK local sports is more likely to talk about national
entities.




    Future Media                                                           © BBC MMXII
Entity Relevance: Objective
• Rank entities by their relatedness to the article
• Accuracy 75%
• We consider various frequencies of entity mentions in the article
  and in the entire set of articles
• Positions in the article fields or in the first paragraphs of the
  body boost the relevance




   Future Media                                             © BBC MMXII
DSP
Architecture




  Future Media   © BBC MMXII
API
Stack

MarkLogic!!




 Future Media   © BBC MMXII
“You could run Nasa on that”
Lee Pollington




  Future Media                 © BBC MMXII
Xquery master<->master rep




Future Media                 © BBC MMXII
MarkLogic 5 & XA Tx   (just… thanks John Snelson)




Future Media                                        © BBC MMXII
Plenty of
Caching




Future Media   © BBC MMXII
Sport Stats REST API




  Future Media         © BBC MMXII
Sport Stats REST API




  Future Media         © BBC MMXII
Sport Stats REST API Demo….




  Future Media                © BBC MMXII
Olympics API (RDF)
           /tripod2012
                      /athletes
                                   /{uid}
                      /countries
                                 /{country}
                      /countries-iso
                                 /{iso}
                      /sports
                                 /stories
                                 /{discipline}
                                 /{discipline}/events
                                 /{discipline}/events/stories
                                 /{discipline}/events/{event}
                      /metadata
                                 /disciplines
                      /onestowatch
                                 /countries/{countryUrlName}
                                 /london2012
                                 /sports/{disciplineUrlName}
                                 /sports/{disciplineUrlName}/events/{eventUrlName}
                      /podium/events
                                 /{rscCode}
                      /record/events
                      /venues
                                 /{urlName}
  Future Media                                                                       © BBC MMXII
@prefix domain: <http://www.bbc.co.uk/ontologies/domain/> .


Olympics API (RDF)
             @prefix sesame: <http://www.openrdf.org/schema/sesame#> .
             @prefix owlim: <http://www.ontotext.com/> .
             @prefix oly: <http://www.bbc.co.uk/ontologies/2012olympics/> .
             @prefix par: <http://purl.org/vocab/participation/schema#> .
             @prefix dc: <http://purl.org/dc/elements/1.1/> .


             <http://www.bbc.co.uk/things/82f5db84-0591-49ee-b6f4-a1d26e9381fb#id> a sport:Person ;
                             rdfs:label "Usain Bolt"^^xsd:string , "Bolt Usain-athletics-jam-1986-08-21"^^xsd:string ;
                             foaf:name "Usain Bolt"^^xsd:string , "Bolt Usain-athletics-jam-1986-08-21"^^xsd:string ;
                             domain:canonicalName "Bolt Usain-athletics-jam-1986-08-21"^^xsd:string ;
                             foaf:givenName "Usain"^^xsd:string ;
                             foaf:familyName "Bolt"^^xsd:string ;
                             domain:name "Usain Bolt"^^xsd:string ;
                             oly:dateOfBirth "1986-08-21"^^xsd:date ;
                             oly:gender "M"^^xsd:string ;
                             oly:height "195.0"^^xsd:float ;
                             oly:weight "94.0"^^xsd:float ;
                             oly:worldOlympicDream "true"^^xsd:boolean ;
                             sport:discipline <http://www.bbc.co.uk/things/b3a086df-ab42-2b44-be8b-76b600bfcdce#id> ;
                             sport:competesIn <http://www.bbc.co.uk/things/1b499a08-4f02-4196-aa6c-c43ea353138b#id> .

             <http://www.bbc.co.uk/things/b3a086df-ab42-2b44-be8b-76b600bfcdce#id> a sport:SportsDiscipline ;
                             domain:name "Athletics"^^xsd:string ;
                             domain:document <http://www.bbc.co.uk/sport/olympics/2012/sports/athletics> .

             <http://www.bbc.co.uk/things/1b499a08-4f02-4196-aa6c-c43ea353138b#id> a sport:MedalCompetition ;
                             domain:name "Men's 100m"^^xsd:string ;
                             domain:shortName "Men's 100m"^^xsd:string ;
                             domain:document <http://www.bbc.co.uk/sport/olympics/2012/sports/athletics/events/mens-100m> ;
                             oly:measurementType <http://www.bbc.co.uk/things/measurement-types/time> ;
                             domain:externalId <urn:ioc2012:ATM001000> .

             <http://www.bbc.co.uk/things/903ef380-bdae-4a45-9a8b-5e5a270a7d6c#id> oly:oneToWatch <http://www.bbc.co.uk/things/82f5db84-0591-49ee-b6f4-a1d26e9381fb#id> .

             <http://news.bbc.co.uk/sport1/hi/athletics/16554814.stm#asset> tag:tag <http://www.bbc.co.uk/things/a50dc8ba-947e-4856-8eb0-1cdbbf208ef7#thing> ;
                             dc:title "Event Guide: ATHLETICS"^^xsd:string ;
                             asset:storyType <http://www.bbc.co.uk/things/story-types/profile> ;
                             domain:document <http://news.bbc.co.uk/sport1/mobile/athletics/16554814.stm> .

             <http://www.bbc.co.uk/things/a50dc8ba-947e-4856-8eb0-1cdbbf208ef7#thing> tag:taggedWithTag <http://www.bbc.co.uk/things/b3a086df-ab42-2b44-be8b-76b600bfcdce#id> .

             <http://news.bbc.co.uk/sport1/mobile/athletics/16554814.stm> domain:platform <http://www.bbc.co.uk/things/platforms/mobile> .

  Future Media                                                                                                                                                    © BBC MMXII
Olympics API (XML)
       /olympicdata/

                       /athletes
                                      /simulator/{scenarioName}
                                      /{guid}
                       /bdf-log
                                      /pid/{pid}
                                      /pid/{pid}/chapter-points
                                      /{logId}
                       /chapter-points
                                      /pid/{pid}
                                      /{logId}
                       /days-to-go
                       /live-text
                                      /assets
                                      /assets/{id}
                       /medallists
                                      /athletes/{guid}
                                      /{medalGroup}
                       /medals
                                      /athletes/{athleteGuid}
                                      /countries/{country}
                       /medaltable
                                      /countries/{country}
                                      /disciplines/medals
                                      /disciplines/{rsc}
                                      /overall
                       /sportcontent
                       /obsvideosessions
                                      /full
                                      /update
                       /podium
                                      /countries/{country}
                                      /disciplines/{rsccode}
                                      /events/{rsccode}
  Future Media                        /latest                     © BBC MMXII
Olympics API (XML)
                 /olympicdata/

                                 /pulse
                                                /beat
                                                /beats
                                 /records
                                                /athletes/{guid}
                                                /events/{rsc}
                                 /results
                                                /athletes/{guid}
                                 /schedule
                                                /detail/days/{date}
                                                /detail/disciplines-code/{rsccode}
                                                /detail/disciplines-code/{rsccode}/events/{eventrsccode}
                                                /detail/disciplines-code/{rsc}/days/{date}
                                                /detail/disciplines/{rsc}/days/{date}
                                                /detail/disciplines/{urlname}
                                                /detail/disciplines/{urlname}/days/{date}
                                                /detail/disciplines/{urlname}/events/{eventrsccode}
                                                /overview/days
                                                /overview/disciplines
                                                /overview/disciplines/{rsc}
                                 /stats
                                                /count/{directory:.+}
                                                /sessionCode/{sessionCode}
                                                /simulator/{scenarioName}
                                                /{documentSerial}
                                 /team
                                                /{odfid}
                                 /unit-status
                                                /{sessionCode}
                                 /video
                                                /days/{date}
                                                /videosessionid/{videoSessionId}
                                                /{pid}
  Future Media                                                                                             © BBC MMXII
Olympics API (XML - Medallist)
https://api.int.bbc.co.uk/olympicdata/public/medallists/men

<?xml version="1.0" encoding="UTF-8"?>
<!--Generated on: 30/03/2012 17:42:13 | Transform Time: 111-->
<document>
         <header>
                  <documentGroup>general</documentGroup>
                  <documentType>medalistCompact</documentType>
                  <rsc code="GEM000000"/>
                  <documentSerial>b1df2feb-1bcd-43b4-aa61-8ae9c82c3bb9</documentSerial>
                  <timeStamp>2012-03-30T15:42:13.843Z</timeStamp>
         </header>
         <medalist>
                 <m o="1" r="1" i="82f5db84-0591-49ee-b6f4-a1d26e9381fb" g=“2" s="1" b="0" t="2" c=”JAM">Usain Bolt</m>
                 <m o="2" r="1" i="56bdea69-dce5-4cab-91ff-802be5077350" g="1" s="0" b="0" t="1" c="NED">VOGELS Guus</m>
                 <m o="3" r="1" i="1f199d68-3a7d-44fa-adae-a5cde084395b" g="1" s="0" b="0" t="1" c="NED">DERIKX Rob</m>
                 <!- blah blah and blah ->
        </medalist>
</document>


XML and RDF data inter-linked via GUIDS




          Future Media                                                                                                     © BBC MMXII
Olympics API (XML – Video catch example)
                 https://api.stage.live.co.uk/olympicdata/public/videos/catchup

                 <?xml version="1.0" encoding="UTF-8"?>
                 <catchup poll-interval-in-seconds="60">
                   <slot index="1" type="session">
                    <pid>b017t8b4</pid>
                    <videoSessionId>OBS-1284552321</videoSessionId>
                    <sessionCode>FB001</sessionCode>
                    <discipline RSC="FB0000000" name="Football" url="http://www.bbc.co.uk/sport/olympics/2012/sports/football" urlName="football" icon="http://static.live.bbci.
                 /ivp2012/images/icons/sports/football.png"/>
                    <scheduleStart>2001-12-17T09:30:47Z</scheduleStart>
                    <scheduleEnd>2001-12-17T09:30:47Z</scheduleEnd>
                    <actualStart>2001-12-17T09:30:47Z</actualStart>
                    <actualEnd>2001-12-17T09:30:47Z</actualEnd>
                    <available>true</available>
                    <editorsPick>true</editorsPick>
                    <live>false</live>
                    <title>28/11/2011</title>
                    <shortSynopsis>A desperate Roxy makes the unwise decision to dig for dirt on Derek.</shortSynopsis>
                    <myImageBaseUrl>http://node2.bbcimg.co.uk/iplayer/images/episode/</myImageBaseUrl>
                  </slot>
                   <slot index="2" type="session">
                    <pid>hbtest1</pid>
                    <videoSessionId>OBS-64835275893</videoSessionId>
                    <sessionCode>HB001</sessionCode>
                    <discipline RSC="HB0000001"/>
                    <scheduleStart>2001-12-17T09:30:47Z</scheduleStart>
                    <scheduleEnd>2001-12-17T09:30:47Z</scheduleEnd>
                    <actualStart>2001-12-17T09:30:47Z</actualStart>
                    <actualEnd>2001-12-17T09:30:47Z</actualEnd>
                    <available>true</available>
                    <editorsPick>true</editorsPick>
                    <live>false</live>
                   </slot>
  Future Media    </catchup>                                                                                                                            © BBC MMXII
Olympics API (Video log/chapter points)
                 GET https://api.test.bbc.co.uk/olympicdata/bdf-log/pid/p00g2lqp

                 <document>
                   <logEvent>
                      <header>
                        <documentGroup>general</documentGroup>
                        <documentType>videoLogging</documentType>
                        <rsc code="FE0000000"/>
                        <documentSerial>fdf38714-45d3-4d89-5100-8883be340700</documentSerial>
                        <timeStamp>2012-07-10T10:42:38+01:00</timeStamp>
                        <video>
                           <pid>p00g2lqp</pid>
                           <timeCode>2012-07-10T08:56:38Z</timeCode>
                        </video>
                        <taggingtool>Version 0.1.10</taggingtool>
                      </header>
                      <Log>
                        <LogId>BBC-TT-1341913358.6783</LogId>
                        <Action>UPDATE</Action>
                        <Date>2012-07-10</Date>
                        <TimeCode>09:56:38</TimeCode>
                        <RSC>FE0000000</RSC>
                        <Keywords>
                           <Keyword>BBC free text</Keyword>
                        </Keywords>
                        <bbcLabel>Test One</bbcLabel>
                      </Log>
                   </logEvent>
                   <logEvent>
                      <header>
                        <documentGroup>general</documentGroup>
                        <documentType>videoLogging</documentType>
                        <rsc code=""/>
                        <documentSerial>20dc4019-d3be-4bb1-6a75-256ba7622b5d</documentSerial>
                        <timeStamp>2012-07-10T10:42:51+01:00</timeStamp>
                        <video>
                           <pid>p00g2lqp</pid>
                           <timeCode>2012-07-10T09:16:38Z</timeCode>
                        </video>
                        <taggingtool>Version 0.1.10</taggingtool>
  Future Media        </header>                                                                 © BBC MMXII
                      <Log>
Olympics Architecture



 Large A1 printout….
 Way
 Too big for a slide..




   Future Media          © BBC MMXII
online
2012  http://m.bbc.co.uk/news
  Future Media                   © BBC MMXII
Dynamic News mobile
• Multi device capability

• Responsive Web design

• Built on a dynamic service API

• New re-usable content model

• Dynamic assets


   Future Media                    © BBC MMXII
Responsive Dynamic News mobile (iPad)




  Future Media                          © BBC MMXII
Responsive Dynamic News mobile (iPhone)




  Future Media                       © BBC MMXII
Recap  Static News Architecture




Future Media                       © BBC MMXII
Dynamic
News Architecture




Future Media        © BBC MMXII
New Content Model (Re-usable XML/RDF )




  Future Media                      © BBC MMXII
MarkLogics handy Xinclude resolution
Including story data on news index XML

<item>
   <xi:include href="http://www.bbc.co.uk/asset/13447877"
    xpointer="xmlns(bbc=http://www.bbc.co.uk/content/asset)
xpointer(/bbc:story/bbc:itemMeta)">
    <xi:fallback>
       <!-- Unable to find href="http://www.bbc.co.uk/asset/13447877"
xpointer="xmlns(bbc=http://www.bbc.co.uk/content/asset) xpointer(/bbc:story/bbc:itemMeta)" -->
     </xi:fallback>
   </xi:include>
   ...




     Future Media                                                                      © BBC MMXII
News Index API
Including story data on news index XML

HTTP GET
https://api.live.bbc.co.uk/content/asset/news/technology/

HTTP Headers
X-Candy-Audience: Domestic
X-Candy-Platform: EnhancedMobile
                                                            Contextualised output
Accept: application/json
                                                            •Audience
Or                                                          •Platform
                                                            •Response type
HTTP Headers
X-Candy-Audience: Domestic
X-Candy-Platform: EnhancedMobile
Accept: application/xml
     Future Media                                                          © BBC MMXII
News Story API
Including story data on news index XML

HTTP GET
https://api.live.bbc.co.uk/content/asset/news/uk-17829360

HTTP Headers
X-Candy-Audience: Domestic
X-Candy-Platform: EnhancedMobile
Accept: application/json

Or

HTTP Headers
X-Candy-Audience: Domestic
X-Candy-Platform: EnhancedMobile
Accept: application/xml
     Future Media                                           © BBC MMXII
Platform future…..
BBC sport site re-engineered to use fully dynamic approach (News Mobile style)

BBC news high web site re-engineered to use fully dynamic approach (News Mobile style)

MarkLogic as CMS repository (iSite)

MarkLogic Binary storage R&D

Etc….etc..




 Future Media                                                                    © BBC MMXII
Questions?


               Slides:


               jem.rayfield
               @bbc.co.uk
               Twitter: @jemrayfield




Future Media                           © BBC MMXII

More Related Content

Similar to Mark logic user-group-2012

Transforming the User Experience of the BBC
Transforming the User Experience of the BBCTransforming the User Experience of the BBC
Transforming the User Experience of the BBCRichard Titus
 
BBC Backstage: APIs & Feeds 2009
BBC Backstage: APIs & Feeds 2009BBC Backstage: APIs & Feeds 2009
BBC Backstage: APIs & Feeds 2009Rain Ashford
 
Hacks & Hackers BBC R&D
Hacks & Hackers  BBC R&DHacks & Hackers  BBC R&D
Hacks & Hackers BBC R&DGeorge Wright
 
BBC Olympics: An accessibility case study
BBC Olympics: An accessibility case studyBBC Olympics: An accessibility case study
BBC Olympics: An accessibility case studyAlistair Duggin
 
Linked Data on the BBC
Linked Data on the BBCLinked Data on the BBC
Linked Data on the BBCYves Raimond
 
Guardian Open Platform Launch Event
Guardian Open Platform Launch EventGuardian Open Platform Launch Event
Guardian Open Platform Launch EventMatt McAlister
 
BBC Olympics: An Accessibility Study
BBC Olympics: An Accessibility StudyBBC Olympics: An Accessibility Study
BBC Olympics: An Accessibility StudyNomensa
 
Bbc rd preso_southampton
Bbc rd preso_southamptonBbc rd preso_southampton
Bbc rd preso_southamptonGeorge Wright
 
BBC olympics 2012 experience oct18
BBC olympics 2012 experience oct18BBC olympics 2012 experience oct18
BBC olympics 2012 experience oct18Matt Turner
 
FIFA World Cup 2014 packages - BBC
FIFA World Cup 2014 packages - BBC  FIFA World Cup 2014 packages - BBC
FIFA World Cup 2014 packages - BBC Spark Media
 
Web rtc 동향과 이슈 2017년_정리
Web rtc 동향과 이슈 2017년_정리Web rtc 동향과 이슈 2017년_정리
Web rtc 동향과 이슈 2017년_정리sung young son
 
BBC Backstage 2009
BBC Backstage 2009BBC Backstage 2009
BBC Backstage 2009Rain Ashford
 
BBC JUICER API Presentation - for SeedHack 4.0 - BBC News Labs
BBC JUICER API Presentation - for SeedHack 4.0 - BBC News LabsBBC JUICER API Presentation - for SeedHack 4.0 - BBC News Labs
BBC JUICER API Presentation - for SeedHack 4.0 - BBC News LabsBBC News Labs
 
NoTube project results. Bringing TV and Web together.
NoTube project results. Bringing TV and Web together. NoTube project results. Bringing TV and Web together.
NoTube project results. Bringing TV and Web together. MODUL Technology GmbH
 
Notube slides-120130071559-phpapp01
Notube slides-120130071559-phpapp01Notube slides-120130071559-phpapp01
Notube slides-120130071559-phpapp01Nur Karadeniz
 
What Is Backstage Overview Google Day
What Is Backstage Overview Google DayWhat Is Backstage Overview Google Day
What Is Backstage Overview Google DayIan Forrester
 
Adrian Woolard Beyond Digital V1
Adrian Woolard Beyond Digital V1Adrian Woolard Beyond Digital V1
Adrian Woolard Beyond Digital V1BBC
 
CIPR Social Media Conference - James Howard, BBC Future Media, The Semantic W...
CIPR Social Media Conference - James Howard, BBC Future Media, The Semantic W...CIPR Social Media Conference - James Howard, BBC Future Media, The Semantic W...
CIPR Social Media Conference - James Howard, BBC Future Media, The Semantic W...Chartered Institute of Public Relations
 
Building a Bank with Go
Building a Bank with GoBuilding a Bank with Go
Building a Bank with GoC4Media
 

Similar to Mark logic user-group-2012 (20)

Transforming the User Experience of the BBC
Transforming the User Experience of the BBCTransforming the User Experience of the BBC
Transforming the User Experience of the BBC
 
BBC Backstage: APIs & Feeds 2009
BBC Backstage: APIs & Feeds 2009BBC Backstage: APIs & Feeds 2009
BBC Backstage: APIs & Feeds 2009
 
Hacks & Hackers BBC R&D
Hacks & Hackers  BBC R&DHacks & Hackers  BBC R&D
Hacks & Hackers BBC R&D
 
BBC Olympics: An accessibility case study
BBC Olympics: An accessibility case studyBBC Olympics: An accessibility case study
BBC Olympics: An accessibility case study
 
Linked Data on the BBC
Linked Data on the BBCLinked Data on the BBC
Linked Data on the BBC
 
Guardian Open Platform Launch Event
Guardian Open Platform Launch EventGuardian Open Platform Launch Event
Guardian Open Platform Launch Event
 
BBC Olympics: An Accessibility Study
BBC Olympics: An Accessibility StudyBBC Olympics: An Accessibility Study
BBC Olympics: An Accessibility Study
 
Bbc rd preso_southampton
Bbc rd preso_southamptonBbc rd preso_southampton
Bbc rd preso_southampton
 
BBC olympics 2012 experience oct18
BBC olympics 2012 experience oct18BBC olympics 2012 experience oct18
BBC olympics 2012 experience oct18
 
FIFA World Cup 2014 packages - BBC
FIFA World Cup 2014 packages - BBC  FIFA World Cup 2014 packages - BBC
FIFA World Cup 2014 packages - BBC
 
Web rtc 동향과 이슈 2017년_정리
Web rtc 동향과 이슈 2017년_정리Web rtc 동향과 이슈 2017년_정리
Web rtc 동향과 이슈 2017년_정리
 
BBC Backstage 2009
BBC Backstage 2009BBC Backstage 2009
BBC Backstage 2009
 
BBC JUICER API Presentation - for SeedHack 4.0 - BBC News Labs
BBC JUICER API Presentation - for SeedHack 4.0 - BBC News LabsBBC JUICER API Presentation - for SeedHack 4.0 - BBC News Labs
BBC JUICER API Presentation - for SeedHack 4.0 - BBC News Labs
 
NoTube project results. Bringing TV and Web together.
NoTube project results. Bringing TV and Web together. NoTube project results. Bringing TV and Web together.
NoTube project results. Bringing TV and Web together.
 
Notube slides-120130071559-phpapp01
Notube slides-120130071559-phpapp01Notube slides-120130071559-phpapp01
Notube slides-120130071559-phpapp01
 
What Is Backstage Overview Google Day
What Is Backstage Overview Google DayWhat Is Backstage Overview Google Day
What Is Backstage Overview Google Day
 
Into The Box 2023 Keynote day 2
Into The Box 2023 Keynote day 2Into The Box 2023 Keynote day 2
Into The Box 2023 Keynote day 2
 
Adrian Woolard Beyond Digital V1
Adrian Woolard Beyond Digital V1Adrian Woolard Beyond Digital V1
Adrian Woolard Beyond Digital V1
 
CIPR Social Media Conference - James Howard, BBC Future Media, The Semantic W...
CIPR Social Media Conference - James Howard, BBC Future Media, The Semantic W...CIPR Social Media Conference - James Howard, BBC Future Media, The Semantic W...
CIPR Social Media Conference - James Howard, BBC Future Media, The Semantic W...
 
Building a Bank with Go
Building a Bank with GoBuilding a Bank with Go
Building a Bank with Go
 

Recently uploaded

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetEnjoy Anytime
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 

Recently uploaded (20)

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 

Mark logic user-group-2012

  • 1. BBC Dynamic Semantic Publishing [DSP] MarkLogic User Group July 2012 • Jem Rayfield : Lead Technical Architect • BBC Future Media Future Media © BBC MMXII
  • 2. Outline BBC News Online BBC World Cup 2010 BBC Sport 2012 + Olympics BBC News Mobile Future Media © BBC MMXII
  • 3. Radio since 1922 TV Since 1930 Web since 1994 Future Media © BBC MMXII
  • 5. BBC News [Static Publishing] Future Media © BBC MMXII
  • 6. Static News Architecture Future Media © BBC MMXII
  • 9. Static News The Good 1) Simple 2) Scales cheaply 3) Difficult to break [bad rendering logic etc..] 4) Handles high load Future Media © BBC MMXII
  • 10. Static News The BAD 1) Relational taxonomic meta model 2) Static! Inflexible! SSI! 3) Document publishing 4) Content non re-usable 5) Content non repurpose-able 6) Difficult to personalize 7) Publication per output Future Media © BBC MMXII
  • 11. BBC World Cup 2010 http://bbc.co.uk/worldcup Future Media © BBC MMXII
  • 12. World Cup 2010 2. 32 teams, 8 groups, 736 players  776 pages 4. Fixtures & Results, Groups & Teams pages 6. To many web pages for too few journalists 8. Improve the publishing system to help achieve all of this Future Media © BBC MMXII
  • 16. Open Sport Ontology BBC Sport: http://www.bbc.co.uk/ontologies/sport Future Media © BBC MMXII
  • 17. Semantic publishing TRIPLE STORE ONTOLOGY USER EXPERIENCE Future Media © BBC MMXII
  • 18. Graffiti: Suggest -> Tag [Player] Future Media © BBC MMXII
  • 19. Graffiti: Suggest -> Tag [Location] (Geonames) Future Media © BBC MMXII
  • 20. Graffiti Demo… (maybe video, depending on wifi… ) Journalism © BBC MMIX
  • 21. Rationale • Automated content publishing • Huge increase in content breadth (number of manageable pages) • Content re-use and re-purposing, increasing reach • Simplified content management • Journalist headcount reduction • Multi-dimensional entry points and semantic navigation • Improved user experience with high levels of user engagement • Dynamic, state (time|event) and semantic driven page layout • Personalized content aggregations • Open data and API’s Future Media © BBC MMXII
  • 22. World Cup DSP Architecture Future Media © BBC MMXII
  • 23. API Stack Future Media © BBC MMXII
  • 24. Highly Scalable Clustered BigOWLIM Journalism © BBC MMIX
  • 26. Open Ontology/Dataset reuse Event | Geonames | Foaf | Etc. Journalism © BBC MMIX
  • 28. REST API Content negotiation: json rdf, xml rdf, turtle Publically accessible (with SSL cert) GET /sport/football/teams/<TEAM> Accept: application/rdf+json GET /sport/football/<COMPETITION> Accept: application/rdf+xml GET /assets/<ASSET> Accept: text/rdf+n3 Journalism Etc…. © BBC MMIX
  • 29. GET Accept text/rdf+n3 https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea <http://www.chelseafc.com/> domain:documentType <http://www.bbc.co.uk/things/document-types/homepage> , <http://www.bbc.co.uk/things/document-types/external> . <http://www.bbc.co.uk/sport/football/teams/chelsea> domain:documentType <http://www.bbc.co.uk/things/document-types/bbc-document> , <http://www.bbc.co.uk/things/document-types/homepage> . <http://www.bbc.co.uk/things/2acacd19-6609-1840-9c2b-b0820c50d281#id> a sport:CompetitiveSportingOrganisation ; domain:canonicalName "Chelsea"^^<xsd:string> ; domain:document <http://www.chelseafc.com/> , <http://www.bbc.co.uk/sport/football/teams/chelsea> ; domain:externalId <http://dbpedia.org/resource/Chelsea_F.C.> , <urn:sports-stats:137316635> ; domain:name "Chelsea" ; domain:shortName "Chelsea"^^<xsd:string> ; sport:competesIn <http://www.bbc.co.uk/things/5cd4682a-7643-f445-8b1f-bcbaf450bc89#id> . <http://dbpedia.org/resource/Chelsea_F.C.> domain:externalIdType <http://www.bbc.co.uk/things/external-id-types/dbpedia> . <urn:sports-stats:137316635> domain:externalIdType <http://www.bbc.co.uk/things/external-id-types/bbc-sport-stats> . <http://www.bbc.co.uk/things/5cd4682a-7643-f445-8b1f-bcbaf450bc89#id> domain:canonicalName "Premier League"^^<xsd:string> ; domain:externalId <urn:sports-stats:118996114> ; sport:competitionType <http://www.bbc.co.uk/things/competition-types/domestic-league> . Journalism © BBC MMIX
  • 30. PHP->EasyRDF->API PHP Render layer consumes RDF from REST API via EasyRDF (http://www.aelius.com/njh/easyrdf/) EasyRDF open PHP library (Primary committer Nicholas Humfrey BBC) protected function getOptions() { return array( "config" => array("usecert" => true), "headers" => array( "Accept" => "application/rdf+json", "X-Expect" => "http://www.bbc.co.uk/things/platforms/hiweb" ) ); $options = $this->getOptions() $response = $this->get("https://api.test.bbc.co.uk/dsp/sport/football/teams/chelsea", $options) $this->data = new EasyRdf_Graph("http://www.bbc.co.uk", $response->getBody()); $teams = $this->data->allofType("sport:CompetitiveSportingOrganisation”) Journalism © BBC MMIX
  • 31. World Cup statistics the GOOD • 750+ Dynamic aggregations/pages (Player, Squad, Group, etc..) • Average unique page requests a day : 2 million + • Average OWLIM SPARQL queries a day : 1 million • 100s RDF statement updates/inserts per minute with full OWL reasoning and associated inference. • Multi data center fully resilient, clustered 6 node triple store • RDF graph model ideally suited to model domain representations such as sport Future Media © BBC MMXII
  • 32. World Cup statistics the BAD • Sports stories and indices static • Sport content not responsive or personalized • RDF Store unable to handle thousands of statistic updates a second • RDF Store forward-chained closures expensive increase write latency • RDF graph model and SPARQL not ideally suited to the BBC’s News and Sport document publication model Future Media © BBC MMXII
  • 33. BBC Sport 2012; Online Refresh http://bbc.co.uk/sport Future Media © BBC MMXII
  • 34. Sport Refresh 2012 • Page per Athlete [10,000+], Page per country [200+], Page per Discipline [400-500], Page per venue, Page per team  A lot of output… • Almost real time statistics and live event pages • Time coded, metadata annotated, on demand video, 58,000 hours of content • Far too many web pages for far too few journalists • DSP annotation architecture to automate content aggregation Future Media © BBC MMXII
  • 35. Sport Refresh 2012; Red -> Bright Yellow Future Media © BBC MMXII
  • 36. 10000+ Dynamic Aggregations Future Media © BBC MMXII
  • 37. Lots of Dynamic (Live) sports stats Future Media © BBC MMXII
  • 38. Dynamic Navigation Future Media © BBC MMXII
  • 39. Static Stories + Dynamic includes Future Media © BBC MMXII
  • 40. Olympics - 27 Live Video Steams Live Stats overlays Stats -> Ontology driven aggregations Future Media © BBC MMXII
  • 41. XML powered video chapter points Page URL http://www.test.bbc.co.uk/sport/olympics/2012/live-video/p00g2lqp Xml api to get video log events https://api.test.bbc.co.uk/olympicdata/bdf-log/pid/p00g2lqp Future Media © BBC MMXII
  • 42. Augment architecture with a Content Store 2. Atomic content assets stored in MarkLogic XML store 4. XML content queryable via Xquery 6. Content Assets searchable 8. Sports statistics searchable/queryable via XQuery 10. Ontological SPARQL via BigOWLIM, assets Xquery via MarkLogic Future Media © BBC MMXII
  • 43. Static Sports stats (OLD) Future Media © BBC MMXII
  • 44. Dynamic Sports stats Future Media © BBC MMXII
  • 45. Olympics XML etc.. Future Media © BBC MMXII
  • 46. Ontology Aware NLP • Information Workbench • OWLIM • (Spice) GATE+Ontotext Future Media © BBC MMXII
  • 47. Extraction Highlights Ex-England ? Roy Hodgson: boss Sven- coach Generic ? Roy Hodgson: Goran Eriksson says a "smear hockey player Analysis ? ………. campaign" has … Update CES APP been aimed at Roy Hodgson KB Gazetteer for omitting Rio Ferdinand. … … V Sven-Goran V Rio Ferdinand V Roy Hodgson: -……. … Eriksson -……. - ………. coach -Roy Hodgson: … OWLIM - ………. hockey player - ………. Disambiguation Retrain & … Adapt … 1. Eriksson (78%) … 2. Roy Hodgson (69%) Relevance 3. Rio Ferdinand (58%) Curate Ranking 4. … Future Media © BBC MMXII
  • 48. Disambiguation of Locations •Geospatial distance - a feature of OWLIM •Super region – GeoNames hierarchy and containment relations, e.g. parentFeature •RDF Rank •Human approval score (on the basis of curated documents) •Class/code based priority – fine grained ontology may allow a rule or machine learning prioritization of classes and entities based on learning we already have. •Asset geo association - some entities could be disambiguated by using the asset domain association. BBC UK local sports is more likely to talk about national entities. Future Media © BBC MMXII
  • 49. Entity Relevance: Objective • Rank entities by their relatedness to the article • Accuracy 75% • We consider various frequencies of entity mentions in the article and in the entire set of articles • Positions in the article fields or in the first paragraphs of the body boost the relevance Future Media © BBC MMXII
  • 50. DSP Architecture Future Media © BBC MMXII
  • 52. “You could run Nasa on that” Lee Pollington Future Media © BBC MMXII
  • 53. Xquery master<->master rep Future Media © BBC MMXII
  • 54. MarkLogic 5 & XA Tx (just… thanks John Snelson) Future Media © BBC MMXII
  • 56. Sport Stats REST API Future Media © BBC MMXII
  • 57. Sport Stats REST API Future Media © BBC MMXII
  • 58. Sport Stats REST API Demo…. Future Media © BBC MMXII
  • 59. Olympics API (RDF) /tripod2012 /athletes /{uid} /countries /{country} /countries-iso /{iso} /sports /stories /{discipline} /{discipline}/events /{discipline}/events/stories /{discipline}/events/{event} /metadata /disciplines /onestowatch /countries/{countryUrlName} /london2012 /sports/{disciplineUrlName} /sports/{disciplineUrlName}/events/{eventUrlName} /podium/events /{rscCode} /record/events /venues /{urlName} Future Media © BBC MMXII
  • 60. @prefix domain: <http://www.bbc.co.uk/ontologies/domain/> . Olympics API (RDF) @prefix sesame: <http://www.openrdf.org/schema/sesame#> . @prefix owlim: <http://www.ontotext.com/> . @prefix oly: <http://www.bbc.co.uk/ontologies/2012olympics/> . @prefix par: <http://purl.org/vocab/participation/schema#> . @prefix dc: <http://purl.org/dc/elements/1.1/> . <http://www.bbc.co.uk/things/82f5db84-0591-49ee-b6f4-a1d26e9381fb#id> a sport:Person ; rdfs:label "Usain Bolt"^^xsd:string , "Bolt Usain-athletics-jam-1986-08-21"^^xsd:string ; foaf:name "Usain Bolt"^^xsd:string , "Bolt Usain-athletics-jam-1986-08-21"^^xsd:string ; domain:canonicalName "Bolt Usain-athletics-jam-1986-08-21"^^xsd:string ; foaf:givenName "Usain"^^xsd:string ; foaf:familyName "Bolt"^^xsd:string ; domain:name "Usain Bolt"^^xsd:string ; oly:dateOfBirth "1986-08-21"^^xsd:date ; oly:gender "M"^^xsd:string ; oly:height "195.0"^^xsd:float ; oly:weight "94.0"^^xsd:float ; oly:worldOlympicDream "true"^^xsd:boolean ; sport:discipline <http://www.bbc.co.uk/things/b3a086df-ab42-2b44-be8b-76b600bfcdce#id> ; sport:competesIn <http://www.bbc.co.uk/things/1b499a08-4f02-4196-aa6c-c43ea353138b#id> . <http://www.bbc.co.uk/things/b3a086df-ab42-2b44-be8b-76b600bfcdce#id> a sport:SportsDiscipline ; domain:name "Athletics"^^xsd:string ; domain:document <http://www.bbc.co.uk/sport/olympics/2012/sports/athletics> . <http://www.bbc.co.uk/things/1b499a08-4f02-4196-aa6c-c43ea353138b#id> a sport:MedalCompetition ; domain:name "Men's 100m"^^xsd:string ; domain:shortName "Men's 100m"^^xsd:string ; domain:document <http://www.bbc.co.uk/sport/olympics/2012/sports/athletics/events/mens-100m> ; oly:measurementType <http://www.bbc.co.uk/things/measurement-types/time> ; domain:externalId <urn:ioc2012:ATM001000> . <http://www.bbc.co.uk/things/903ef380-bdae-4a45-9a8b-5e5a270a7d6c#id> oly:oneToWatch <http://www.bbc.co.uk/things/82f5db84-0591-49ee-b6f4-a1d26e9381fb#id> . <http://news.bbc.co.uk/sport1/hi/athletics/16554814.stm#asset> tag:tag <http://www.bbc.co.uk/things/a50dc8ba-947e-4856-8eb0-1cdbbf208ef7#thing> ; dc:title "Event Guide: ATHLETICS"^^xsd:string ; asset:storyType <http://www.bbc.co.uk/things/story-types/profile> ; domain:document <http://news.bbc.co.uk/sport1/mobile/athletics/16554814.stm> . <http://www.bbc.co.uk/things/a50dc8ba-947e-4856-8eb0-1cdbbf208ef7#thing> tag:taggedWithTag <http://www.bbc.co.uk/things/b3a086df-ab42-2b44-be8b-76b600bfcdce#id> . <http://news.bbc.co.uk/sport1/mobile/athletics/16554814.stm> domain:platform <http://www.bbc.co.uk/things/platforms/mobile> . Future Media © BBC MMXII
  • 61. Olympics API (XML) /olympicdata/ /athletes /simulator/{scenarioName} /{guid} /bdf-log /pid/{pid} /pid/{pid}/chapter-points /{logId} /chapter-points /pid/{pid} /{logId} /days-to-go /live-text /assets /assets/{id} /medallists /athletes/{guid} /{medalGroup} /medals /athletes/{athleteGuid} /countries/{country} /medaltable /countries/{country} /disciplines/medals /disciplines/{rsc} /overall /sportcontent /obsvideosessions /full /update /podium /countries/{country} /disciplines/{rsccode} /events/{rsccode} Future Media /latest © BBC MMXII
  • 62. Olympics API (XML) /olympicdata/ /pulse /beat /beats /records /athletes/{guid} /events/{rsc} /results /athletes/{guid} /schedule /detail/days/{date} /detail/disciplines-code/{rsccode} /detail/disciplines-code/{rsccode}/events/{eventrsccode} /detail/disciplines-code/{rsc}/days/{date} /detail/disciplines/{rsc}/days/{date} /detail/disciplines/{urlname} /detail/disciplines/{urlname}/days/{date} /detail/disciplines/{urlname}/events/{eventrsccode} /overview/days /overview/disciplines /overview/disciplines/{rsc} /stats /count/{directory:.+} /sessionCode/{sessionCode} /simulator/{scenarioName} /{documentSerial} /team /{odfid} /unit-status /{sessionCode} /video /days/{date} /videosessionid/{videoSessionId} /{pid} Future Media © BBC MMXII
  • 63. Olympics API (XML - Medallist) https://api.int.bbc.co.uk/olympicdata/public/medallists/men <?xml version="1.0" encoding="UTF-8"?> <!--Generated on: 30/03/2012 17:42:13 | Transform Time: 111--> <document> <header> <documentGroup>general</documentGroup> <documentType>medalistCompact</documentType> <rsc code="GEM000000"/> <documentSerial>b1df2feb-1bcd-43b4-aa61-8ae9c82c3bb9</documentSerial> <timeStamp>2012-03-30T15:42:13.843Z</timeStamp> </header> <medalist> <m o="1" r="1" i="82f5db84-0591-49ee-b6f4-a1d26e9381fb" g=“2" s="1" b="0" t="2" c=”JAM">Usain Bolt</m> <m o="2" r="1" i="56bdea69-dce5-4cab-91ff-802be5077350" g="1" s="0" b="0" t="1" c="NED">VOGELS Guus</m> <m o="3" r="1" i="1f199d68-3a7d-44fa-adae-a5cde084395b" g="1" s="0" b="0" t="1" c="NED">DERIKX Rob</m> <!- blah blah and blah -> </medalist> </document> XML and RDF data inter-linked via GUIDS Future Media © BBC MMXII
  • 64. Olympics API (XML – Video catch example) https://api.stage.live.co.uk/olympicdata/public/videos/catchup <?xml version="1.0" encoding="UTF-8"?> <catchup poll-interval-in-seconds="60"> <slot index="1" type="session"> <pid>b017t8b4</pid> <videoSessionId>OBS-1284552321</videoSessionId> <sessionCode>FB001</sessionCode> <discipline RSC="FB0000000" name="Football" url="http://www.bbc.co.uk/sport/olympics/2012/sports/football" urlName="football" icon="http://static.live.bbci. /ivp2012/images/icons/sports/football.png"/> <scheduleStart>2001-12-17T09:30:47Z</scheduleStart> <scheduleEnd>2001-12-17T09:30:47Z</scheduleEnd> <actualStart>2001-12-17T09:30:47Z</actualStart> <actualEnd>2001-12-17T09:30:47Z</actualEnd> <available>true</available> <editorsPick>true</editorsPick> <live>false</live> <title>28/11/2011</title> <shortSynopsis>A desperate Roxy makes the unwise decision to dig for dirt on Derek.</shortSynopsis> <myImageBaseUrl>http://node2.bbcimg.co.uk/iplayer/images/episode/</myImageBaseUrl> </slot> <slot index="2" type="session"> <pid>hbtest1</pid> <videoSessionId>OBS-64835275893</videoSessionId> <sessionCode>HB001</sessionCode> <discipline RSC="HB0000001"/> <scheduleStart>2001-12-17T09:30:47Z</scheduleStart> <scheduleEnd>2001-12-17T09:30:47Z</scheduleEnd> <actualStart>2001-12-17T09:30:47Z</actualStart> <actualEnd>2001-12-17T09:30:47Z</actualEnd> <available>true</available> <editorsPick>true</editorsPick> <live>false</live> </slot> Future Media </catchup> © BBC MMXII
  • 65. Olympics API (Video log/chapter points) GET https://api.test.bbc.co.uk/olympicdata/bdf-log/pid/p00g2lqp <document> <logEvent> <header> <documentGroup>general</documentGroup> <documentType>videoLogging</documentType> <rsc code="FE0000000"/> <documentSerial>fdf38714-45d3-4d89-5100-8883be340700</documentSerial> <timeStamp>2012-07-10T10:42:38+01:00</timeStamp> <video> <pid>p00g2lqp</pid> <timeCode>2012-07-10T08:56:38Z</timeCode> </video> <taggingtool>Version 0.1.10</taggingtool> </header> <Log> <LogId>BBC-TT-1341913358.6783</LogId> <Action>UPDATE</Action> <Date>2012-07-10</Date> <TimeCode>09:56:38</TimeCode> <RSC>FE0000000</RSC> <Keywords> <Keyword>BBC free text</Keyword> </Keywords> <bbcLabel>Test One</bbcLabel> </Log> </logEvent> <logEvent> <header> <documentGroup>general</documentGroup> <documentType>videoLogging</documentType> <rsc code=""/> <documentSerial>20dc4019-d3be-4bb1-6a75-256ba7622b5d</documentSerial> <timeStamp>2012-07-10T10:42:51+01:00</timeStamp> <video> <pid>p00g2lqp</pid> <timeCode>2012-07-10T09:16:38Z</timeCode> </video> <taggingtool>Version 0.1.10</taggingtool> Future Media </header> © BBC MMXII <Log>
  • 66. Olympics Architecture Large A1 printout…. Way Too big for a slide.. Future Media © BBC MMXII
  • 67. online 2012  http://m.bbc.co.uk/news Future Media © BBC MMXII
  • 68. Dynamic News mobile • Multi device capability • Responsive Web design • Built on a dynamic service API • New re-usable content model • Dynamic assets Future Media © BBC MMXII
  • 69. Responsive Dynamic News mobile (iPad) Future Media © BBC MMXII
  • 70. Responsive Dynamic News mobile (iPhone) Future Media © BBC MMXII
  • 71. Recap  Static News Architecture Future Media © BBC MMXII
  • 73. New Content Model (Re-usable XML/RDF ) Future Media © BBC MMXII
  • 74. MarkLogics handy Xinclude resolution Including story data on news index XML <item> <xi:include href="http://www.bbc.co.uk/asset/13447877" xpointer="xmlns(bbc=http://www.bbc.co.uk/content/asset) xpointer(/bbc:story/bbc:itemMeta)"> <xi:fallback> <!-- Unable to find href="http://www.bbc.co.uk/asset/13447877" xpointer="xmlns(bbc=http://www.bbc.co.uk/content/asset) xpointer(/bbc:story/bbc:itemMeta)" --> </xi:fallback> </xi:include> ... Future Media © BBC MMXII
  • 75. News Index API Including story data on news index XML HTTP GET https://api.live.bbc.co.uk/content/asset/news/technology/ HTTP Headers X-Candy-Audience: Domestic X-Candy-Platform: EnhancedMobile Contextualised output Accept: application/json •Audience Or •Platform •Response type HTTP Headers X-Candy-Audience: Domestic X-Candy-Platform: EnhancedMobile Accept: application/xml Future Media © BBC MMXII
  • 76. News Story API Including story data on news index XML HTTP GET https://api.live.bbc.co.uk/content/asset/news/uk-17829360 HTTP Headers X-Candy-Audience: Domestic X-Candy-Platform: EnhancedMobile Accept: application/json Or HTTP Headers X-Candy-Audience: Domestic X-Candy-Platform: EnhancedMobile Accept: application/xml Future Media © BBC MMXII
  • 77. Platform future….. BBC sport site re-engineered to use fully dynamic approach (News Mobile style) BBC news high web site re-engineered to use fully dynamic approach (News Mobile style) MarkLogic as CMS repository (iSite) MarkLogic Binary storage R&D Etc….etc.. Future Media © BBC MMXII
  • 78. Questions? Slides: jem.rayfield @bbc.co.uk Twitter: @jemrayfield Future Media © BBC MMXII

Editor's Notes

  1. These relationships mean more interesting user journeys, greater link density, and more interesting queries on the data.
  2. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup
  3. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup
  4. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup
  5. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup
  6. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup
  7. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup
  8. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup
  9. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup
  10. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup
  11. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup
  12. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup
  13. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup
  14. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup
  15. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup
  16. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup
  17. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup
  18. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup
  19. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup
  20. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup
  21. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup
  22. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup
  23. Demo: GET https://api.live.bbc.co.uk/dsp/sport/football/teams/chelsea Accept text/rdf+n3 GET https://api.live.bbc.co.uk/dsp/sport/football/facup