NITF 2010 Spring Working Group


The IPTC's News Industry Text Format (NITF) is an XML format for news article content and metadata. This presentation discusses progress on the roadmap to NITF 4.0: incorporating the Semantic Web, completing namespace support, and aligning NITF with the IPTC's NewsML-G2.

Published in: Technology

  1. NITF
     Stuart Myles, Associated Press
     Paris, France / March 8th, 2010
  2. Agenda
     Approval of minutes from previous meeting
     Matters arising
     Chairman’s Report
       • NITF 4.0
       • Other text markup
       • Documentation
     © IPTC
  3. NITF Minutes
     Approval of minutes from previous meeting, held on 9th October 2009
  5. NITF Matters
     Matters arising?
  6. Chairman’s Report
     NITF = “News Industry Text Format”
     Defines the content and structure of articles
     IPTC’s most widely used XML standard
     421 members on the Y! list, down from 435 in October
     4 emails since October
     NITF 3.5 released in December 2009
  7. NITF 4.0 Road Map
     In October 2009 we proposed a road map:
       • Kick off NITF 4.0 in Spring 2010
       • Discuss G2ization, RDFization and namespaces
       • Target NITF 4.0 for the end of 2010
  8. NITF 4.0
     Unlocking the power of NITF
  9. NITF 4.0 – Semantic Web
     Dear IPTC Standards Committee,
     Please set up a Working Group to consider RDF, Semantic Web and Linked Data.
     How might they relate to IPTC standards?
     Regards,
     NITF Working Group
     October 2009
  10. NITF and the Semantic Web
      For a Dow Jones project, I created a representation of key article information
      I used semantic web vocabularies – chiefly FOAF and Dublin Core Terms
      But there was no match for “byline”
      I considered using G2’s <by> element
      But NITF’s <byline> was actually what I needed
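To make the "no match for byline" gap concrete, here is a minimal sketch in Python that writes key article information as N-Triples. The FOAF and Dublin Core Terms namespace URIs are the published ones; the `NEWS` namespace, the `byline` property in it, and all example.org URIs are hypothetical placeholders, not anything the IPTC has defined.

```python
# Published namespace URIs for the two vocabularies named on the slide.
DCTERMS = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Hypothetical namespace standing in for a news vocabulary that does not exist yet.
NEWS = "http://example.org/news-vocab#"


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return "<" + value + ">"


def literal(value):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def article_triples(article_uri, title, author_uri, author_name, byline):
    """Return N-Triples lines describing one article."""
    triples = [
        (uri(article_uri), uri(DCTERMS + "title"), literal(title)),
        (uri(article_uri), uri(DCTERMS + "creator"), uri(author_uri)),
        (uri(author_uri), uri(RDF_TYPE), uri(FOAF + "Person")),
        (uri(author_uri), uri(FOAF + "name"), literal(author_name)),
        # Neither vocabulary has a term for the printed byline text,
        # so this line falls back to the hypothetical news vocabulary.
        (uri(article_uri), uri(NEWS + "byline"), literal(byline)),
    ]
    return [" ".join(t) + " ." for t in triples]


lines = article_triples(
    "http://example.org/articles/1",
    "Example headline",
    "http://example.org/people/reporter",
    "A. Reporter",
    "By A. Reporter, Example News",
)
print("\n".join(lines))
```

The first four triples use existing terms; only the last one needs something new, which is the case for a small news-specific vocabulary.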
  11. Semantic Web: News Vocabulary
      IPTC could create a news-specific vocabulary of terms.
      I saw a need, as have the New York Times and others
  12. Semantic Web Vocabularies
      The best-known RDF vocabularies are
        • FOAF = Friend of a Friend
        • DCMI Terms = Dublin Core Metadata Initiative Terms
      Other examples at
  13. Semantic Web Vocabulary
      An example from Dublin Core Terms:
  14. Semantic Web Vocabularies
      An example from Dublin Core Terms:
      There are some news-specific terms that aren’t defined in other vocabularies, such as “byline”.
      We could define a news vocabulary (a relatively simple data model) or a full ontology (richer but more work).
  15. NITF 4.0 and Semantic Web
      Should IPTC take a lead role?
      Other organizations are starting to create news vocabularies
      Are there meaningful differences between NITF and the G2 family?
      Maybe a way to bring the two closer together
      Note that NITF has always been “semantic”
  16. Geographic Information
      Gerd Kamp from DPA Infocom discusses using NITF to represent locations:
      He found everything he needed
      Except for a way to represent a centroid
      A centroid is the central point of a place
      Expressed as a latitude and longitude
  17. A georss:point in NITF
      Adding a centroid using georss
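The slide's own example is not reproduced in this transcript, so the following is only a sketch of the idea. It assumes a `georss:point` (GeoRSS Simple: latitude then longitude, separated by whitespace) placed directly inside an NITF `<location>` element; that placement and the sample values are illustrative assumptions, and the `centroid` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URI used by GeoRSS Simple.
GEORSS_NS = "http://www.georss.org/georss"

# Illustrative fragment only: putting georss:point inside <location>
# is an assumption made for this sketch, not a documented NITF rule.
SAMPLE = """
<location xmlns:georss="http://www.georss.org/georss">
  <city>Paris</city>
  <country>France</country>
  <georss:point>48.8566 2.3522</georss:point>
</location>
"""


def centroid(location):
    """Return (latitude, longitude) from a georss:point child, or None."""
    point = location.find("{%s}point" % GEORSS_NS)
    if point is None or not point.text:
        return None
    lat, lon = point.text.split()
    return float(lat), float(lon)


location = ET.fromstring(SAMPLE.strip())
print(centroid(location))
```

Because the point lives in its own namespace, a consumer that does not understand GeoRSS can skip the element and still read the city and country.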
  18. Adding Latitude and Longitude
      We could add latitude and longitude to NITF’s location-related elements
      Maps as user interfaces to news are growing in popularity
      But geographic information can be quite complex
      Centroid, Bounding Box, Bounding Polygon…
      So can we consider a different approach?
  19. The GeoRSS Namespace
      GeoRSS is widely used in RSS and Atom
      Designed to be embedded in XML
      So why recreate those structures in NITF?
  20. Foreign Namespace
      In NITF 3.5, we completed the support for “foreign namespaces” introduced into the schema in v3.4
      Specifically, the “enriched text” has a choice of
        <any namespace="##other"/>
      This allows other namespaces to be used within such NITF elements as caption, tagline, etc.
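As a rough illustration of what the `##other` wildcard admits, the sketch below lists the children of a caption that are namespace-qualified and not in the schema's own namespace. The NITF namespace URI shown is assumed for this example, the `x:place` extension element is invented, and the check only mimics the wildcard's namespace rule; it is not schema validation.

```python
import xml.etree.ElementTree as ET

# Assumed NITF namespace URI for this sketch.
NITF_NS = "http://iptc.org/std/NITF/2006-10-18/"

# <x:place> is a made-up extension element in a made-up namespace.
SAMPLE = """
<caption xmlns="http://iptc.org/std/NITF/2006-10-18/"
         xmlns:x="http://example.org/extension">
  The mayor spoke in <x:place>Paris</x:place> on <em>Monday</em>.
</caption>
"""


def namespace_of(tag):
    """Extract the namespace from an ElementTree '{ns}local' tag name."""
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return ""


def foreign_children(element, home_ns):
    """Children an ##other wildcard would match: qualified, and not in home_ns."""
    return [
        child for child in element
        if namespace_of(child.tag) not in ("", home_ns)
    ]


caption = ET.fromstring(SAMPLE.strip())
for child in foreign_children(caption, NITF_NS):
    print(child.tag, child.text)
```

Here `<em>` stays in the NITF namespace and is ignored, while `<x:place>` is reported as foreign content.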
  21. Foreign Namespaces Elsewhere?
      So far, we have only allowed non-NITF namespaces within enriched text
      This means that NITF is a “closed” schema
      All innovation in the use of NITF needs to be centralized within the IPTC
      Do we want to allow other namespaces to be mixed in with NITF documents?
      That would allow proprietary extensions to be “legal”
  22. NITF 4.0 and G2
      IPTC’s G2 standard is a unified framework
        • Packaging and exchanging news content
        • Standard model for news metadata regardless of the content or media type
      However, NITF predates and stands outside the G2 framework
      Can NITF join the G2 family of standards?
  23. NITF and G2
      We studied how SportsML became part of the G2 family
      It seems a similar path is possible for NITF
      The biggest change will be the inline adoption of QCodes in NITF
      Colon-separated scheme:code syntax for controlled vocabularies
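The scheme:code syntax can be sketched in a few lines of Python. The idea is that a catalog maps each scheme alias to a base URI, and the code is appended to that URI. The catalog entries, the sample code value, and both function names below are illustrative assumptions, not an official IPTC catalog or API.

```python
# Illustrative catalog: scheme alias -> scheme URI.
# Both entries are assumptions made for this sketch.
CATALOG = {
    "subj": "http://cv.iptc.org/newscodes/subjectcode/",
    "ex": "http://example.org/codes/",
}


def split_qcode(qcode):
    """Split 'scheme:code' at the first colon; the code may contain colons."""
    scheme, sep, code = qcode.partition(":")
    if not sep or not scheme or not code:
        raise ValueError("not a scheme:code value: %r" % qcode)
    return scheme, code


def resolve_qcode(qcode, catalog):
    """Expand a QCode into a full URI using the catalog's scheme aliases."""
    scheme, code = split_qcode(qcode)
    if scheme not in catalog:
        raise KeyError("unknown scheme alias: %r" % scheme)
    return catalog[scheme] + code


print(resolve_qcode("subj:15000000", CATALOG))
```

Splitting at the first colon keeps the compact inline form short while still letting each value expand to a globally unique identifier.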
  24. NITF and G2
      With work, NITF can be brought within the G2 framework
      NITF would bring inline semantics (entities) into G2
      Should NITF Classic live on?
  25. NITF 4.0
      Unlocking the power of NITF
        • Joining the Semantic Web
        • Opening up to other namespaces
        • Joining the G2 family of standards
  26. Other Text Markup
      NITF isn’t the only text markup effort
      Or even the most active
        • HTML5
        • hNews
        • IPTC 7901
  27. HTML5 New Elements
      HTML5 is introducing several new structural elements, including
        <section> <article>
        <aside> <header> <footer>
      HTML5 is moving confidently beyond presentation into news-like structure
  28. hNews
      A microformat for adding some news-specific semantics into display-ready HTML
      Adopted by the Associated Press for the recent Winter Games and forthcoming World Cup websites
      We know of around 200 other websites using hNews
      We are starting to see some tools being built
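For readers who have not seen a microformat, the sketch below shows the general shape: ordinary HTML whose `class` attributes carry the semantics, read back here with Python's standard `html.parser`. The sample markup and the `HNewsParser` class are illustrative; the class names follow the hAtom and hNews drafts as I understand them, and this is not a conforming microformat parser.

```python
from html.parser import HTMLParser

# Illustrative display-ready HTML carrying hAtom/hNews class names.
SAMPLE = """
<div class="hnews hentry">
  <h1 class="entry-title">Example headline</h1>
  <span class="author vcard"><span class="fn">A. Reporter</span></span>
  <span class="source-org vcard"><span class="org fn">Example News</span></span>
  <span class="dateline">PARIS</span>
</div>
"""

WANTED = ("entry-title", "author", "source-org", "dateline")


class HNewsParser(HTMLParser):
    """Collect the text found inside elements that carry a wanted class."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        # One entry per open element: the wanted class names it carries.
        # Limitation: unclosed void tags such as <br> would unbalance this stack.
        self._stack = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self._stack.append([c for c in classes if c in WANTED])

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Credit the text to every enclosing element with a wanted class.
        for names in self._stack:
            for name in names:
                previous = self.fields.get(name, "")
                self.fields[name] = (previous + " " + text).strip()


parser = HNewsParser()
parser.feed(SAMPLE)
print(parser.fields)
```

The same page renders normally in a browser, which is the appeal of the approach compared with a separate XML feed.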
  29. IPTC 7901
      An idea to add markup to this pre-XML text format
      Can we use Markdown?
      The idea will be discussed later during the Standards Meeting
  30. NITF Documentation
      Upgrading the NITF website. Some ideas:
        • Simplify getting to the NITF specs
        • Perhaps adopt Subversion for previous versions?
        • Supply NITF <-> XHTML XSLT transforms
        • Copy NITF DTD documentation into the XSD
        • Modernize the documentation
        • Discuss NITF and G2?
      Volunteers to take on any of the work?
  31. NITF
      Any other business?
  32. NITF
      Date and place of next meeting:
      San Francisco, USA - Summer 2010
      Merci!