
IPTC Approach to News in JSON



ninjs is IPTC's news in JSON standard. How was the design of ninjs approached? What were the different options which were considered? What is different about designing in JSON versus other formats, such as XML and RDF?

Published in: Technology

  1. News in JSON Activity: The ninjs Approach to ... News in JSON
  2. What ninjs is Not
     • Not a restricted news data model
     • Not XML in JSON
     • Not RDF in JSON
     © 2017 IPTC. All rights reserved.
  3. ninjs is comprehensive
  4. ninjs Data Model
  5. Data Model ninjs
     • The ninjs data model is more comprehensive than other IPTC data models
       – We selected a set of priority properties to represent
       – NewsML-G2, NewsML 1, rNews, NITF
       – We are ready to add more
     • ninjs is a JSON representation of a news item
       – Text, Photo, Graphic, Video, Audio, Package
       – You can represent a complete item, with all properties
       – Or you may want to convey key properties
       – Associations are themselves ninjs documents
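Because associations are themselves ninjs documents, a consumer can treat an item as a recursive structure. The following is a minimal sketch of that idea, not taken from the slides: the `example.com` URIs, the association names (`featured`, `thumbnail`), the `"picture"` type value, and the assumption that `associations` is an object keyed by name are all illustrative.

```python
import json

# Hypothetical item: URIs, association names and the "picture" type value
# are assumptions made for this sketch, not values from the slides.
doc = json.loads("""
{
  "uri": "http://example.com/text/1",
  "type": "text",
  "headline": "Captain of wrecked cruise ship on trial in Italy",
  "associations": {
    "featured": {
      "uri": "http://example.com/picture/1",
      "type": "picture",
      "associations": {
        "thumbnail": {
          "uri": "http://example.com/picture/1-small",
          "type": "picture"
        }
      }
    }
  }
}
""")


def walk(item, depth=0):
    """Yield (depth, uri, type) for an item and every nested association."""
    yield depth, item.get("uri"), item.get("type")
    # Each association value is handled exactly like a top-level item.
    for child in item.get("associations", {}).values():
        yield from walk(child, depth + 1)


items = list(walk(doc))
for depth, uri, kind in items:
    print("  " * depth + f"{kind}: {uri}")
```

The same `walk` function handles the top-level item and every nested one, which is the practical benefit of making associations full documents rather than a separate reference type.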
  6. A Complete NINJS 1.1 Article
     {
       "uri": "",
       "type": "text",
       "versioncreated": "2013-07-09T10:37:00Z",
       "byline": "Paulo Santalucia and Frances d'Emilio",
       "headline": "Captain of wrecked cruise ship on trial in Italy",
       "body_text": "GROSSETO, Italy (EP) -- The trial of the captain of the shipwrecked Costa Concordia cruise liner has begun in a theater converted into a courtroom …"
     }
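As a rough illustration of how little code a consumer needs for an item like the one on this slide, the sketch below parses it with Python's standard library. The property names and values are copied from the slide (the `uri` is empty in the transcript and is left empty here); the timestamp format string is an assumption based on the single value shown.

```python
import json
from datetime import datetime

# The article from the slide; the body text is truncated in the source.
raw = """
{
  "uri": "",
  "type": "text",
  "versioncreated": "2013-07-09T10:37:00Z",
  "byline": "Paulo Santalucia and Frances d'Emilio",
  "headline": "Captain of wrecked cruise ship on trial in Italy",
  "body_text": "GROSSETO, Italy (EP) -- The trial of the captain of the shipwrecked Costa Concordia cruise liner has begun in a theater converted into a courtroom …"
}
"""

item = json.loads(raw)  # a plain dict, no binding layer required

# Assumed timestamp layout, inferred from the one value on the slide.
created = datetime.strptime(item["versioncreated"], "%Y-%m-%dT%H:%M:%SZ")

print(item["type"], "|", item["headline"])
print("by", item["byline"], "on", created.date())
```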
  7. XML and RDF: Powerful Tools
     • XML
       – Namespaces
       – XSLT
       – XPath and XQuery
       – Schema Validation
     • RDF
       – Object graphs
       – Sets of triples
       – Object lists
  8. And Yet Developers Prefer JSON
     • How to measure “preferences”?
       – “Most Popular”: 5/10 JSON only, 4/10 JSON+XML, 1/10 XML only
       – JSON only: Facebook Graph, Google Maps, Twitter, AccuWeather, Pinterest, Reddit, Foursquare
       – XML and JSON: Google Cloud Storage, LinkedIn, Flickr
     • Databases – trends towards JSON
       – Only JSON: MongoDB, CouchDB, Elasticsearch
       – Added JSON: eXistDB, BaseX, MarkLogic, Oracle Database, PostgreSQL
     • For AP – the number one request is “can we get this in JSON instead?”
  9. Why JSON?
     • Maps easily into modern programming data structures
       – Feels “more natural” to developers
     • No namespaces
       – Biggest strength of XML and RDF
       – Biggest headache for developers
     • JSON ecosystem is improving (XML history repeating)
       – Elasticsearch dominates
       – Improved developer tools, e.g. jq, XQuery support for JSON
     • Many developers see JSON as simpler and better than XML and may never have heard of RDF
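The namespace point can be made concrete with a small comparison. This is a sketch only: the XML fragment and its namespace URI (`http://example.com/ns/news`) are invented for illustration and do not correspond to any IPTC schema.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical documents carrying the same headline.
xml_doc = (
    '<newsItem xmlns="http://example.com/ns/news">'
    "<headline>Captain on trial</headline>"
    "</newsItem>"
)
json_doc = '{"headline": "Captain on trial"}'

root = ET.fromstring(xml_doc)

# A lookup by bare tag name finds nothing, because the element actually
# lives in the default namespace declared on the root.
missing = root.find("headline")

# The lookup only succeeds once the namespace is spelled out.
ns = {"n": "http://example.com/ns/news"}
xml_headline = root.find("n:headline", ns).text

# The JSON version is a dictionary lookup.
json_headline = json.loads(json_doc)["headline"]

print(missing, xml_headline, json_headline)
```

The namespaced lookup is what makes vocabularies safely mixable in XML, and it is also the step developers most often get wrong, which matches the "biggest strength, biggest headache" bullets above.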
  10. News in JSON Approach
      • Create a JSON representation of news that feels natural
        – Not XML or RDF mapped into JSON
        – But a “hand-crafted” JSON designed from scratch
      • Process
        – Educate ourselves on JSON best practices
        – Select the news feature to model in JSON
        – Identify various representation alternatives in JSON
        – Try them out with a variety of tools
        – Pick the “best” one
      • Goal: a JSON developer would look at ninjs and recognize it as a native implementation
  11. Text Markup in JSON
      • How to represent richly marked up text in JSON?
      • A sweet spot for document-oriented XML
      • Could be HTML, XHTML, NITF ...
      • We experimented with two existing text markup examples
      • NITF: fishing.xml
      • HTML: 5-Microdata-in-IPTC-namespace
  12. Text Markup Options in JSON
      • Plain text, stripped of markup
      • Preserved but escaped markup
        – HTML:
        – XML:
        – See to-escape-in-my-html-json-response for a discussion of how to escape markup in JSON
      • Mechanically create JSON structures to mimic the original markup
        – We used JSONML as an example
        – NITF:
        – HTML:
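The three options on this slide can be compared side by side on one fragment. The sketch below is an illustration under stated assumptions, not the working group's test code: the sample markup is invented, `body_xhtml` is used as an assumed property name for the escaped variant (`body_text` appears in the slide 6 example), and `to_jsonml` is a simplified converter that covers elements, attributes and text only.

```python
import json
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Invented sample fragment (well-formed, so it parses as XML too).
markup = '<p>The trial has <em class="hl">begun</em> in a "theater".</p>'


class TextOnly(HTMLParser):
    """Collects character data and ignores every tag."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_markup(html):
    """Option 1: plain text, stripped of markup."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


def to_jsonml(el):
    """Option 3: JsonML-style array [tag, {attributes}?, children...]."""
    node = [el.tag]
    if el.attrib:
        node.append(dict(el.attrib))
    if el.text:
        node.append(el.text)
    for child in el:
        node.append(to_jsonml(child))
        if child.tail:
            node.append(child.tail)  # text that follows the child element
    return node


plain = strip_markup(markup)
# Option 2: keep the markup and let the JSON serializer escape it.
escaped = json.dumps({"body_xhtml": markup})
jsonml = to_jsonml(ET.fromstring(markup))

print(plain)
print(escaped)
print(json.dumps(jsonml))
```

Note that the escaped variant needs no custom code at all: any JSON serializer escapes the quotes, and parsing the JSON returns the original markup unchanged.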
  13. What We Learnt
      • Both plain text (no markup) and escaped markup have clear use cases
        – Plain text can be useful for search, for example
        – Escaped markup works well for direct display on a webpage
      • Markup translated (like JSONML) works OK if you have a library to implement the rules
        – But what is the added benefit beyond just working directly with XML or HTML?
        – Who will write and maintain the libraries for every language?
      • ninjs supports both plain and escaped text via pattern properties
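A pattern property means a family of property names defined by a regular expression rather than a fixed list. The sketch below shows how a consumer might pick up every body variant an item carries. The pattern `^body_[a-zA-Z0-9_]+$` and the `body_xhtml` name are assumptions for illustration; check the ninjs schema for the exact pattern.

```python
import re

# Assumed pattern for the body family; not copied from the ninjs schema.
BODY_KEY = re.compile(r"^body_[a-zA-Z0-9_]+$")

# Hypothetical item carrying both a plain and an escaped-markup body.
item = {
    "type": "text",
    "headline": "Captain of wrecked cruise ship on trial in Italy",
    "body_text": "The trial has begun.",
    "body_xhtml": "<p>The trial has <em>begun</em>.</p>",
}

# Collect every property whose name matches the pattern.
bodies = {key: value for key, value in item.items() if BODY_KEY.match(key)}

# A consumer picks the variant that suits its use case:
# plain text for search indexing, markup for display.
for_search = bodies.get("body_text")
for_display = bodies.get("body_xhtml", for_search)

print(sorted(bodies))
```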
  14. Things We Considered But Decided Against
      • Translating from an existing XML standard into JSON
        – Not all IPTC standards are XML
        – Not all publishers use the same IPTC standards
        – Not all publishers use any IPTC standards
      • “Mechanically” translating from XML into JSON
        – There are many libraries that can do this
        – Different choices for how to represent certain XML features
        – So each technique results in a slightly different JSON
        – We felt that a more “natural” JSON would be more valuable
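To see why mechanical translation gives inconsistent results, consider one element converted under two common-looking conventions. Both converters below are hypothetical, written for this sketch rather than taken from any particular library, and they handle only a single element with attributes and text (no children).

```python
import xml.etree.ElementTree as ET

# Invented one-element example.
xml_doc = '<headline lang="en">Captain on trial</headline>'
element = ET.fromstring(xml_doc)


def convention_a(el):
    """Attributes become '@'-prefixed keys, text goes under '#text'."""
    body = {"@" + name: value for name, value in el.attrib.items()}
    body["#text"] = el.text
    return {el.tag: body}


def convention_b(el):
    """Explicit 'name', 'attributes' and 'text' keys."""
    return {"name": el.tag, "attributes": dict(el.attrib), "text": el.text}


a = convention_a(element)
b = convention_b(element)

print(a)
print(b)
```

Both outputs are faithful to the XML, yet a consumer written for one cannot read the other, and neither looks like what a JSON developer would design by hand (for example `{"headline": "Captain on trial"}` with language handled elsewhere).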
  15. Beyond JSON – Binary Formats
      • IPTC started on ninjs in 2012 – five years ago!
        – Developer interest is moving on…
      • AP is looking at binary formats
        – Row-based, e.g. Avro
        – Columnar, e.g. ORC or Parquet
      • Same issues
        – Mechanically translate from other XML or (more likely) JSON?
        – Or handcraft for most natural / best benefit?
        – How to quickly become experts in the best practices?
  16. News in JSON / ninjs
      • IPTC already has a lot of overlapping standards
      • Let’s try to avoid creating duplicate JSON standards too
      • Let’s build on what we have
        – For example, incorporating IKOS into ninjs
        – Solve problems of co-branding (ninjs vs NewsML-G2 in JSON)