News in JSON Activity
http://www.flickr.com/photos/jondresner/5789254800/
The ninjs Approach to ...
News in JSON
What ninjs is Not
• Not a restricted news data model
• Not XML in JSON
• Not RDF in JSON
© 2017 IPTC (www.iptc.org) All rights reserved 2
ninjs is comprehensive
http://groups.yahoo.com/neo/groups/iptc-news-in-json-dev
ninjs Data Model
http://dev.iptc.org/ninjs
Data Model ninjs
• The ninjs data model is more comprehensive than other
IPTC data models
– We selected a set of priority properties to represent
– NewsML-G2, NewsML 1, rNews, NITF
– We are ready to add more
• ninjs is a JSON representation of a news item
– Text, Photo, Graphic, Video, Audio, Package
– You can represent a complete item, with all properties
– Or you may want to convey key properties
– Associations are themselves ninjs documents
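A minimal sketch of what "associations are themselves ninjs documents" can look like in practice. It assumes an `associations` property holding an object whose values are nested ninjs items; the association names (`main`, `featured`), the URIs, and the `composite` and `picture` type values are illustrative assumptions, not taken from the slides.

```python
# Hypothetical package: a composite item whose story carries its own photo.
package = {
    "uri": "http://ninjs.example.com/newsitems/package1",
    "type": "composite",  # assumed type value for a package
    "associations": {
        "main": {
            "uri": "http://ninjs.example.com/newsitems/story1",
            "type": "text",
            "associations": {
                "featured": {
                    "uri": "http://ninjs.example.com/newsitems/photo1",
                    "type": "picture",  # assumed type value for a photo
                }
            },
        }
    },
}


def walk(item, depth=0):
    """Yield (depth, type, uri) for an item and every nested association."""
    yield depth, item.get("type"), item.get("uri")
    for child in item.get("associations", {}).values():
        # Each association is handled exactly like a top-level item.
        yield from walk(child, depth + 1)


for depth, item_type, uri in walk(package):
    print("  " * depth + f"{item_type}: {uri}")
```

Because every association has the same shape as its parent, one recursive function covers items nested to any depth.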
A Complete ninjs 1.1 Article
{
  "uri" : "http://ninjs.example.com/newsitems/20130709simp123",
  "type" : "text",
  "versioncreated" : "2013-07-09T10:37:00Z",
  "byline" : "Paulo Santalucia and Frances d'Emilio",
  "headline" : "Captain of wrecked cruise ship on trial in Italy",
  "body_text" : "GROSSETO, Italy (EP) -- The trial of the captain of the shipwrecked Costa Concordia cruise liner has begun in a theater converted into a courtroom …"
}
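As a sketch of how a consumer might read this item, the Python below parses the article with the standard `json` module and converts `versioncreated` into a timezone-aware datetime. The variable names are illustrative. Each string value sits on a single line because a raw line break inside a JSON string is not valid JSON.

```python
import json
from datetime import datetime

# The example article from the slide, one string value per line.
NINJS_ITEM = """
{
  "uri": "http://ninjs.example.com/newsitems/20130709simp123",
  "type": "text",
  "versioncreated": "2013-07-09T10:37:00Z",
  "byline": "Paulo Santalucia and Frances d'Emilio",
  "headline": "Captain of wrecked cruise ship on trial in Italy",
  "body_text": "GROSSETO, Italy (EP) -- The trial of the captain of the shipwrecked Costa Concordia cruise liner has begun in a theater converted into a courtroom …"
}
"""

item = json.loads(NINJS_ITEM)

# "%z" accepts the trailing "Z" (UTC) on Python 3.7 and later.
created = datetime.strptime(item["versioncreated"], "%Y-%m-%dT%H:%M:%S%z")

print(item["type"], "|", item["headline"])
print("created:", created.isoformat())
```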
XML and RDF: Powerful Tools
• XML
– Namespaces
– XSLT
– XPath and XQuery
– Schema Validation
• RDF
– Object graphs
– Sets of triples
– Object lists
And Yet Developers Prefer JSON
• How to measure “preferences”?
• https://www.programmableweb.com/apis
– “Most Popular”: 5/10 JSON only, 4/10 JSON+XML, 1 XML only
– JSON only: Facebook Graph, Google Maps, Twitter,
AccuWeather, Pinterest, Reddit, Foursquare
– XML and JSON: Google Cloud Storage, LinkedIn, Flickr
• Databases – trends towards JSON
– Only JSON: MongoDB, CouchDB, Elasticsearch
– Added JSON: eXistDB, BaseX, MarkLogic, Oracle Database,
PostgreSQL
• For AP – the number one request is “can we get this in
JSON instead?”
Why JSON?
• Maps easily into modern programming data structures
– Feels “more natural” to developers
• No namespaces
– Biggest strength of XML and RDF
– Biggest headache for developers
• JSON ecosystem is improving (XML history repeating)
– Elasticsearch dominates
– Improved developer tools e.g. jq, XQuery support for JSON
• Many developers see JSON as simpler and better than
XML and may never have heard of RDF
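To illustrate the namespace point, here is a small sketch comparing how the same headline is read from a namespaced XML document and from JSON. The namespace URI and both documents are invented for the example.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical documents carrying the same single property.
XML_DOC = (
    '<item xmlns="http://example.com/news">'
    "<headline>Trial begins</headline>"
    "</item>"
)
JSON_DOC = '{"headline": "Trial begins"}'

root = ET.fromstring(XML_DOC)

# A plain lookup misses: the element's real name includes the namespace.
unqualified = root.find("headline")

# The lookup only succeeds once the namespace is supplied.
NS = {"n": "http://example.com/news"}
xml_headline = root.find("n:headline", NS).text

# In JSON the parsed result is a dict, so the key is used directly.
json_headline = json.loads(JSON_DOC)["headline"]

print(unqualified, xml_headline, json_headline)
```

The XML version is more precise about which vocabulary `headline` belongs to, and that precision is exactly the extra step that developers tend to trip over.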
News in JSON Approach
• Create a JSON representation of news that feels natural
– Not XML or RDF mapped into JSON
– But a “hand crafted” JSON built from scratch
• Process – educate ourselves on JSON best practices
– Select the news feature to model in JSON
– Identify various representation alternatives in JSON
– Try them out with a variety of tools
– Pick the “best” one
• Goal – a JSON developer would look at ninjs and
recognize it as a native implementation
Text Markup in JSON
• How to represent richly marked up text in JSON?
• A sweet spot for document-oriented XML
• Could be HTML, XHTML, NITF ...
• We experimented with two existing text markup examples
• NITF: http://www.iptc.org/std/NITF/3.2/examples/nitf-fishing.xml
• HTML: http://dev.iptc.org/Implementation-Guide-HTML-5-Microdata-in-IPTC-namespace
Text Markup Options in JSON
• Plain text, stripped of markup
• Preserved but escaped markup
– HTML: https://gist.github.com/anonymous/4996653
– XML: https://gist.github.com/anonymous/4996676
– See http://stackoverflow.com/questions/993970/what-do-i-need-to-escape-in-my-html-json-response
for a discussion of how to escape markup in JSON
• Mechanically create JSON structures to mimic the
original markup
– We used JSONML as an example http://www.jsonml.org/
– NITF: https://gist.github.com/anonymous/4996697
– HTML: https://gist.github.com/anonymous/4996720
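The three options can be sketched side by side on one small fragment. This is an illustration under assumptions, not the code behind the linked gists: the markup fragment is invented, `body_xhtml` is used as an assumed property name, and `to_jsonml` is a hypothetical helper that follows the JsonML convention of `[tag, {attributes}, children...]`.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup fragment.
MARKUP = '<p class="lead">The trial has <em>begun</em> in Grosseto.</p>'
root = ET.fromstring(MARKUP)

# Option 1: plain text, stripped of markup.
plain = "".join(root.itertext())

# Option 2: markup preserved as a string; json.dumps escapes the quotes.
escaped = json.dumps({"body_xhtml": MARKUP})


# Option 3: mechanically mirror the markup as nested arrays (JsonML style).
def to_jsonml(el):
    node = [el.tag]
    if el.attrib:
        node.append(dict(el.attrib))
    if el.text:
        node.append(el.text)
    for child in el:
        node.append(to_jsonml(child))
        if child.tail:
            # Text that follows a child element belongs to the parent.
            node.append(child.tail)
    return node


jsonml = to_jsonml(root)

print(plain)
print(escaped)
print(json.dumps(jsonml))
```

Option 2 round-trips through `json.loads` back to the original markup string, whereas option 3 needs a matching converter on the receiving side to turn the arrays back into markup.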
What We Learnt
• Both plain text (no markup) and escaped markup have
clear use cases
– Plain text can be useful for search, for example
– Escaped markup works well for direct display on a webpage
• Markup translated (like JSONML) works OK if you have
a library to implement the rules
– But what is the added benefit beyond just working directly with
XML or HTML?
– Who will write and maintain the libraries for every language?
• ninjs supports both plain and escaped text via pattern
properties
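A rough sketch of how a consumer might pick up every body variant through a pattern rather than a fixed property name. The regular expression is an assumption modelled on the `body_text` naming shown earlier, `body_xhtml` is an assumed second variant, and the item is invented.

```python
import re

# Assumed pattern for body properties: "body_" followed by a suffix.
BODY_PATTERN = re.compile(r"^body_[a-zA-Z0-9_]+$")

# Hypothetical item carrying the same body as plain text and as markup.
item = {
    "headline": "Captain of wrecked cruise ship on trial in Italy",
    "body_text": "The trial has begun in Grosseto.",
    "body_xhtml": "<p>The trial has <em>begun</em> in Grosseto.</p>",
}

# Collect every property whose name matches the pattern.
bodies = {key: value for key, value in item.items() if BODY_PATTERN.match(key)}

# A search indexer might prefer plain text; a web page the markup.
for_search = bodies.get("body_text")
for_display = bodies.get("body_xhtml", for_search)

print(sorted(bodies))
```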
Things We Considered But Decided Against
• Translating from an existing XML standard into JSON
– Not all IPTC standards are XML
– Not all publishers use the same IPTC standards
– Not all publishers use any IPTC standards
• “Mechanically” translating from XML into JSON
– There are many libraries that can do this
– Different choices for how to represent certain XML features
– So each technique results in a slightly different JSON
– We felt that a more “natural” JSON would be more valuable
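To make the "slightly different JSON" point concrete, this sketch applies two plausible mechanical conventions to the same XML element. Both converters are hypothetical, handle only a leaf element with attributes and text, and do not reproduce any particular library.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical XML element with one attribute and text content.
XML = '<headline lang="en">Trial begins</headline>'
el = ET.fromstring(XML)


def convention_a(el):
    """Attributes prefixed with "@", text stored under "#text"."""
    out = {"@" + name: value for name, value in el.attrib.items()}
    out["#text"] = el.text
    return {el.tag: out}


def convention_b(el):
    """Element name, attributes, and content as separate keys."""
    return {"name": el.tag, "attributes": dict(el.attrib), "content": el.text}


a = convention_a(el)
b = convention_b(el)

print(json.dumps(a))
print(json.dumps(b))
```

Both outputs are faithful to the XML, yet a client written against one will not read the other, which is the interoperability problem a hand-crafted format avoids.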
Beyond JSON – Binary Formats
• IPTC started on ninjs in 2012 – five years ago!
– Developer interest is moving on…
• AP is looking at binary formats
– Row-based – e.g. Avro
– Columnar – e.g. ORC or Parquet
• Same issues
– Mechanically translate from other XML or (more likely) JSON?
– Or handcraft for most natural / best benefit?
– How to quickly become experts in the best practices?
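The row-based versus columnar distinction can be sketched in plain Python. This shows only the difference in data layout, not the actual Avro, ORC, or Parquet encodings, and the records are invented.

```python
# Row-based layout: one complete record after another.
rows = [
    {"uri": "urn:example:1", "type": "text", "headline": "Trial begins"},
    {"uri": "urn:example:2", "type": "picture", "headline": "Courtroom"},
]

# Columnar layout: all values of one property stored together.
columns = {key: [row[key] for row in rows] for key in rows[0]}

# Reading one whole item is natural in the row layout...
second_item = rows[1]

# ...while scanning a single property across items suits the columnar one.
all_types = columns["type"]

print(second_item)
print(all_types)
```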
News in JSON / ninjs
• IPTC already has a lot of overlapping standards
• Let’s try to avoid creating duplicate JSON standards too
• Let’s build on what we have
– For example, incorporating IKOS into ninjs
– Solve problems of co-branding (ninjs vs NewsML-G2 in JSON)

IPTC Approach to News in JSON
