The Semantic Web, Linked Data and News
Stuart Myles, Associated Press
9th Feb 2011

Agenda
  • Questions I will answer:
      • What are “the Semantic Web” and “Linked Data”?
      • How is the IPTC exploring these technologies?
      • What has the IPTC learnt so far?
      • What are some example uses in the news industry?
  • Questions I will ask:
      • Can MINDS identify business or editorial applications?
© 2011 IPTC (www.iptc.org) All rights reserved

The Semantic Web and Linked Data
  • Moving beyond a web of documents to a web of data
  • Semantic Web technologies extend today’s web with:
      • machine-readable information
      • links between data and services
  • Also known as “The Giant Global Graph”, “Web 3.0”, “A Web of Things” …
  • There are different theories about what it is and how to get there.

Semantic Web and Linked Data
  • http://bnode.org/media/2009/07/08/semantic_web_technology_stack.png

IPTC and the Semantic Web
The IPTC is following three paths into the Semantic Web world:
  • Create a news ontology, based on NewsML-G2
      • Formal semantics for news, specified using OWL
  • Express news metadata in HTML
      • Using RDFa (rNews), microformats (hNews)
  • Turn IPTC subject codes into Linked Data
      • Connect related data across the web using URIs, HTTP & RDF
      • A set of principles from Tim Berners-Lee: http://www.w3.org/DesignIssues/LinkedData.html

Following the Linked Data Path
  • The Linked Data principles, as specified by Tim Berners-Lee (TBL):
      • Use URIs as names for things
      • Use HTTP URIs so that people can look up those names
      • When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
      • Include links to other URIs, so that they can discover more things
  • Apply the principles to IPTC’s NewsCodes
      • Already published as XML (G2 Knowledge Items)
      • And as HTML
      • The experiment: convert XML into RDF

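The "convert XML into RDF" experiment on this slide can be pictured with a minimal sketch. It is not the IPTC's actual conversion: the XML below is a hypothetical, simplified stand-in for a G2 Knowledge Item (no namespaces, only a few elements), the sample concepts are illustrative, and the base URI and the choice of SKOS properties are assumptions made for the example. The output is a list of N-Triples lines.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for a G2 Knowledge Item. The real
# NewsML-G2 format uses XML namespaces and many more elements.
SAMPLE_XML = """\
<knowledgeItem>
  <conceptSet>
    <concept>
      <conceptId qcode="subj:15000000"/>
      <name xml:lang="en">sport</name>
    </concept>
    <concept>
      <conceptId qcode="subj:15054000"/>
      <name xml:lang="en">soccer</name>
      <broader qcode="subj:15000000"/>
    </concept>
  </conceptSet>
</knowledgeItem>
"""

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
# Assumed base URI for illustration; substitute the published one.
BASE_URI = "http://cv.iptc.org/newscodes/subjectcode/"


def qcode_to_uri(qcode):
    """Turn a QCode such as 'subj:15000000' into an HTTP URI."""
    _, _, code = qcode.partition(":")
    return BASE_URI + code


def knowledge_item_to_ntriples(xml_text):
    """Return one N-Triples line per statement found in the XML."""
    root = ET.fromstring(xml_text)
    lines = []
    for concept in root.iter("concept"):
        subject = qcode_to_uri(concept.find("conceptId").get("qcode"))
        lines.append(f"<{subject}> <{RDF_TYPE}> <{SKOS}Concept> .")
        for name in concept.findall("name"):
            lang = name.get(XML_LANG, "en")
            # Escape backslashes and quotes for an N-Triples literal.
            label = name.text.replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'<{subject}> <{SKOS}prefLabel> "{label}"@{lang} .')
        for broader in concept.findall("broader"):
            target = qcode_to_uri(broader.get("qcode"))
            lines.append(f"<{subject}> <{SKOS}broader> <{target}> .")
    return lines


triples = knowledge_item_to_ntriples(SAMPLE_XML)
for line in triples:
    print(line)
```

Each concept becomes a URI that can be looked up over HTTP, and the broader link points at another URI, which is how the first and fourth principles show up in the data.
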
Linked Data Lessons Learnt
  • One Model (RDF), Multiple Vocabularies (SKOS, DC…)
  • Tool support is a key benefit
  • Basics well-documented
  • Pull is better than push
  • Linking is really mapping
  • SemWeb and Linked Data generate lots of interest
  • IPTC’s NewsCodes as Linked Data publicly launched: http://www.iptc.org/site/NewsCodes/NewsCodes_Retrieval_in_Different_Formats

News Semantic Web Examples
  • The BBC created their World Cup 2010 site using Semantic Web technologies
      • http://www.bbc.co.uk/blogs/bbcinternet/2010/07/the_world_cup_and_a_call_to_ac.html
  • hNews allows publishers to mark up news-specific metadata in their HTML pages. Services are now starting to exploit the microformat in innovative ways
      • http://microformats.org/wiki/hnews
      • http://www.newsregistry.com/
      • https://www.readability.com/publishers/guidelines/

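To make the hNews point concrete, here is a minimal sketch of how a service might read hNews fields out of a page. The HTML fragment is hypothetical, and the class names (entry-title, source-org, dateline) are taken as assumptions from the hNews draft on the microformats wiki. The parser is a deliberately small one built on Python's standard library, not a full microformats parser.

```python
from html.parser import HTMLParser

# Class names assumed from the hNews draft; only a subset is handled here.
HNEWS_FIELDS = {"entry-title", "source-org", "dateline"}
# Elements with no closing tag; skipped so the open-element stack stays aligned.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}

# Hypothetical page fragment marked up with hNews class names.
SAMPLE_HTML = """
<div class="hnews hentry">
  <h1 class="entry-title">Example headline</h1>
  <span class="source-org vcard"><span class="org fn">Example News Agency</span></span>
  <span class="dateline">ALBANY, N.Y.</span>
  <div class="entry-content">Body text.</div>
</div>
"""


class HNewsExtractor(HTMLParser):
    """Collects the text inside elements that carry an hNews class name."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._stack = []  # one entry per open element: a field name or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        field = next((c for c in classes if c in HNEWS_FIELDS), None)
        self._stack.append(field)
        if field is not None:
            self.fields.setdefault(field, "")

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Text belongs to every hNews field that is currently open.
        for field in {f for f in self._stack if f is not None}:
            self.fields[field] += data


def extract_hnews(html):
    """Return a dict of hNews field name to whitespace-normalised text."""
    parser = HNewsExtractor()
    parser.feed(html)
    parser.close()
    return {name: " ".join(text.split()) for name, text in parser.fields.items()}


fields = extract_hnews(SAMPLE_HTML)
print(fields)
```

Because the metadata lives in ordinary class attributes, a consumer only needs the published HTML page; no separate feed is required.
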
Other News Linked Data Examples
  • Journalisted is a database of journalists
      • A Linked Data URI per journalist
      • http://mediastandardstrust.org/blog/journalisted-as-a-linked-data-resource
  • The Guardian added linked data support to their “Open Platform” Content API
      • But not using RDF…
      • http://www.guardian.co.uk/open-platform/blog/linked-data-open-platform

A Solution in Search of a Problem?
  • Georgi Kobilarov recently opined that:
      • The technical approach of linked data isn't the issue (although the technology can be hard)
      • The difficulty is seeing the problem that is to be solved
      • http://blog.georgikobilarov.com/2011/01/making-linked-data-work-isnt-the-problem/
  • News + Linked Data + ? = Profit!
  • IPTC has experimented with the technical aspect
  • Can MINDS supply the business or editorial problems?

Linked Data for News Introduction


Editor's Notes

  • #3 Standard machine-readable way to express the meaning of content, data or anything else you can represent on the web
      • Goal is to express relationships between concepts more completely and enable more automated linking of data
      • Bringing disparate data together allows us to answer new questions, and access data from different angles
          • E.g. Alzheimer’s advances just announced
          • Explore the AP archive based on your particular perspective, e.g. show me all the AP content about people who are alums of a particular university
      • The Semantic Web is enabling users to share preferences and other information across multiple social networking sites: when I go to a news site, that site could take advantage of what my Facebook account knows about me and deliver a customized experience
      • Key concepts:
          • Primary model is the RDF standard, with extensions for specific types of concepts, e.g. people, products, content
          • RDF expresses relationships between 2 concepts: Alex Rodriguez <plays for> the New York Yankees
          • Each concept has its own unique URL, known as a URI (a web page per thing)
          • Flexible architecture
      • This is an emerging field that is gaining traction, but it’s important for the AP to lead in this space, to help define the semantic web standards for news
  • #4 We are already expressing the semantics of AP content through our automated tagging, rich entity and taxonomy data and the rollout of hNews
      • Dion Lewis example: college football player, running back for University of Pittsburgh, hometown is Albany, NY
      • hNews provides a simple way to express some of the semantics of news content
  • #5 Publishing Linked Data
      • Making our taxonomy and vocabulary data available in standard semantic web formats
      • Making our tagging service available externally, so that we become the taxonomy standard for news
    Making News Content Semantic-Web Ready
  • #6 There are several ways this could benefit the AP
      • Internal
          • Enable editorial to generate new types of data visualizations and interactives (Elections, Economic Stress Map)
      • External
          • Publish vocabulary data using RDF and URIs to support delivery for CAPI
          • Publish as web pages to help drive traffic to AP and member web properties (e.g. Topic Pages)
          • Make the AP tagging service available externally to drive adoption of our standard taxonomy and increase news participation in the Semantic Web
              • Offer an alternative to OpenCalais for news publishers
              • Establish AP as the standard for news tagging
              • Opportunities with CMS vendors, aggregators like Yahoo and Google
              • Benefits: news focus, more detailed subject tagging, hierarchy, higher accuracy/disambiguation
          • Map our vocabularies to others (NY Times, BBC) to enable aggregation of related content and data, e.g. mash-ups of AP and other content and data
      • These are all possibilities enabled by putting some common semantic web infrastructure in place.
  • #7 Support additional formats for content delivery and tracking
      • RDF
          • IPTC is looking into developing a version of NewsML-G2 in RDF
          • Thomson Reuters and OpenCalais deliver tagging in RDF
          • BBC: using RDF to produce their World Cup site http://news.bbc.co.uk/sport2/hi/football/world_cup_2010/default.stm
      • RDFa
          • RDFa is a version of RDF for HTML
          • NY Times is interested in partnering on RDFa for news as part of their SEO strategy
          • Facebook’s Open Graph is based on RDFa
          • Google Rich Snippets support RDFa
          • Others are seeing SEO benefits from RDFa (Best Buy)
      • Customers could choose to receive either hNews or RDFa depending on their use case
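
The note for #3 describes RDF as expressing a relationship between two concepts, and suggests exploring an archive by following such relationships. The sketch below illustrates that idea with plain Python tuples rather than an RDF library. All URIs are hypothetical (example.org), the story identifiers are invented for the example, and only the two "plays for" relationships come from the notes.

```python
# Hypothetical URIs; in Linked Data each would resolve over HTTP.
EX = "http://example.org/"
PLAYS_FOR = EX + "rel/playsFor"
ABOUT = EX + "rel/about"

# Each statement is a (subject, predicate, object) triple.
TRIPLES = {
    (EX + "person/alex-rodriguez", PLAYS_FOR, EX + "team/new-york-yankees"),
    (EX + "person/dion-lewis", PLAYS_FOR, EX + "team/university-of-pittsburgh"),
    # Invented story records, for illustration only.
    (EX + "story/1", ABOUT, EX + "person/alex-rodriguez"),
    (EX + "story/2", ABOUT, EX + "person/dion-lewis"),
    (EX + "story/3", ABOUT, EX + "person/dion-lewis"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples that agree with every position that is not None."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def stories_about_players_of(triples, team):
    """Join two patterns: who plays for the team, then which stories are about them."""
    players = {s for s, _, _ in match(triples, p=PLAYS_FOR, o=team)}
    return sorted(s for s, _, o in match(triples, p=ABOUT) if o in players)


pitt_stories = stories_about_players_of(TRIPLES, EX + "team/university-of-pittsburgh")
print(pitt_stories)
```

The query never mentions a player by name; it reaches the stories by following links between concepts, which is the kind of question the note has in mind.
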