Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

2,194 views

Published on

Published in: Technology, Education
  • Be the first to comment

Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

  1. 1. PostgreSQL 9.3 and JSON Andrew Dunstan andrew@dunslane.net andrew.dunstan@pgexperts.com
  2. 2. Overview ● What is JSON? ● Why use JSON? ● Quick review of 9.2 features ● 9.3 new features ● Building on 9.3 features ● Some useful extensions ● Future work
  3. 3. What is JSON? ● Data serialization format ● Lightweight ● Human readable ● Becoming ubiquitous ● Simpler and more compact than XML
  4. 4. What it looks like { "books" : [ { "title": "Catch 22”, "author": "Joseph Heller"}, { "title": "Catcher in the Rye", "author": "J. D. Salinger"} ], "publishers": [ { "name": "Random House" }, { "name": "Penguin" } ], "active": true, "version": 35, "date": "2003-09-13", “reference”: null } Scalars: ● quoted strings ● numbers ● true, false, null No extensions No date/time types
  5. 5. Why use it? ● Everyone is moving that way ● Understood everywhere there is a JavaScript interpreter – Especially browsers ● ... and in a large number of other languages – e.g. Perl, Python ● node.js is becoming very widely used ● More compact than XML ● Most applications don't need the richer structure of XML
  6. 6. Why not use it? ● Overly verbose ● Field names are repeated ● Arguably less readable than, say, YAML ● Not suitable for huge objects
  7. 7. Review – pre 9.2 facilities Nothing – store JSON as text ● No validation ● No JSON production ● No JSON extraction
  8. 8. Review – 9.2 data type New JSON type ● Stored as text ● Reasonably performant state-based validating parser ● Kudos: Robert Haas
  9. 9. Review – 9.2 production functions ● Turn non-JSON data into JSON ● row_to_json(anyrecord) ● array_to_json(anyarray) ● Optional second param for pretty printing ● My humble contribution ☺
  10. 10. What's missing? ● JSON production features are incomplete ● JSON processing is totally absent ● Have to use PLV8, PLPerl or some such
  11. 11. 9.3 Features – JSON production ● to_json(any) ● Can be used on any datum, not just arrays and records ● json_agg(record) ● Much faster than array_to_json(array_agg(record))
  12. 12. 9.3 and casts to JSON ● Production functions honor casts to JSON for non-builtin types ● Not needed for builtins, as we know how to convert them ● Saves syscache lookups where we know it's not necessary
  13. 13. 9.3 hstore and JSON ● hstore_to_json(hstore) ● Also used as a cast function ● hstore_to_json_loose(hstore) ● Uses heuristics about whether or not certain possibly numeric and boolean values need to be quoted.
  14. 14. 9.3 JSON parser rewrite ● New parser uses recursive descent pattern ● Caller can supply event handlers for certain events ● c.f. XML SAX parsers ● Validator uses NULL handlers for all events ● Tokenizing routines of previous parser largely kept
  15. 15. 9.3 JSON processsing functions ● All leverage new parser API ● Operators give a more natural style to extraction operations ● Many have two forms, producing either JSON output, which can be further processed, or text output, which cannot. ● Text output is de-escaped and dequoted
  16. 16. 9.3 extraction operators (1) ● -> fetch an array element or object member as json ● '[4,5,6]'::json->2 ⟹ 6 – json arrays are 0 based, unlike SQL arrays ● '{"a":1,"b":2}'::json->'b' ⟹ 2
  17. 17. 9.3 extraction operators (2) ● ->> fetch an array element orobject member as text ● '["a","b","c"]'::json->2 ⟹ c – Instead of "c"
  18. 18. 9.3 extraction operators (3) ● #> and #>> fetch data pointed at by a path ● Path is an array of text elements ● Treats arrays correctly by some trying to treat path element as an integer of necessary ● '{"a":[6,7,8]}'::json#>'{a,1}' ⟹ 7 ●
  19. 19. 9.3 extraction functions ● json_extract_path(json, VARIADIC path_elems text[]); ● json_extract_path_text(json, VARIADIC path_elems text[]); ● Same as #> and #>> operators, but you can pass the path as a variadic array ● json_extract_path('{"a":[6,7,8]}','a','1') ⟹ 7
  20. 20. 9.3 turn JSON into records ● CREATE TYPE x AS (a int, b int); ● SELECT * FROM json_populate_record(null::x, '{"a":1,"b":2}', false); ● SELECT * FROM json_populate_recordset(null::x,'[{"a":1," b":2},{"a":3,"b":4}]', false);
  21. 21. 9.3 turn JSON into key/value pairs ● SELECT * FROM json_each('{"a":1,"b":"foo"}') ● SELECT * FROM json_each_text('{"a":1,"b":"foo"}') ● Deliver columns named “key” and “value”
  22. 22. 9.3 get keys from JSON object ● SELECT * FROM json_object_keys('{"a":1,"b":"foo"}')
  23. 23. 9.3 JSON array processing ● SELECT json_array_length('[1,2,3,4]'); ● SELECT * FROM json_array_elements('[1,2,3,4]')
  24. 24. 9.3 the JSON C API ● Need ● A state object ● Some handlers ● Stash these in a JsonSemAction object ● call pg_json_parse()
  25. 25. 9.3 API example ● Code can be cloned from https://bitbucket.org/adunstan/json_typeof ● CREATE FUNCTION json_typeof(json) RETURNS text AS 'MODULE_PATHNAME' LANGUAGE C STRICT IMMUTABLE;
  26. 26. 9.3 API example state object ● #include <utils/jsonapi.h> ● typedef struct typeof_state { JsonLexContext *lex; JsonTokenType result; } TypeofState;
  27. 27. 9.3 API example event handlers ● Need one for each of: ● Array start ● Object start ● Scalar
  28. 28. 9.3 API example array/object start handlers ● static void json_typeof_astart (void *state) { TypeofState *_state = (TypeofState *) state; if (_state->lex->lex_level == 0) _state->result = JSON_TOKEN_ARRAY_START; } static void json_typeof_ostart (void *state) { TypeofState *_state = (TypeofState *) state; if (_state->lex->lex_level == 0) _state->result = JSON_TOKEN_OBJECT_START; }
  29. 29. 9.3 API example scalar handler ● static void json_typeof_scalar( void *state, char *token, JsonTokenType tokentype) { TypeofState *_state = (TypeofState *) state; if (_state->lex->lex_level == 0) _state->result = tokentype; }
  30. 30. 9.3 example function code Datum json_typeof(PG_FUNCTION_ARGS) { text *json = PG_GETARG_TEXT_P(0); TypeofState *state; JsonLexContext *lex = makeJsonLexContext(json, false); JsonSemAction *sem; state = palloc0(sizeof(TypeofState)); sem = palloc0(sizeof(JsonSemAction)); state->lex = lex; sem->semstate = (void *) state; sem->object_start = json_typeof_ostart; sem->array_start = json_typeof_astart; sem->scalar = json_typeof_scalar; pg_parse_json(lex, sem); PG_RETURN_TEXT_P(token_to_type(state->result)); }
  31. 31. 9.3 CAPI – what else is it good for? ● Transformations of all kinds ● XML ● YAML ● .... ● An XML transformation was the original working model for creating the API
  32. 32. Other useful modules ● Developed for use by my clients ● json_build lets you compose json of arbitrary complexity ● json_object creates a json object from a 1- or 2-dimensional array of text, similar to hstore facility.
  33. 33. json_build ● Two functions: ● build_json_object(VARIADIC “any”) ● build_json_array(VARIADIC “any”) ● Together these let you compose deeply nested tree structures safely.
  34. 34. json_build example SELECT build_json_object( 'a', build_json_object('b',false,'c',99), 'd', build_json_object('e',array[9,8,7]::int[], 'f', (select row_to_json(r) from ( select relkind, oid::regclass as name from pg_class where relname = 'pg_class') r))); {"a" : {"b" : false, "c" : 99}, "d" : {"e" : [9,8,7], "f" : {"relkind":"r","name":"pg_class"}}}
  35. 35. json_object examples SELECT json_object('{a,1,b,2,3,NULL,"d e f","a b c"}'); json_object ------------------------------------------------------- {"a" : "1", "b" : "2", "3" : null, "d e f" : "a b c"} -- same but with two dimensions SELECT json_object('{{a,1},{b,2},{3,NULL},{"d e f","a b c"}}'); json_object ------------------------------------------------------- {"a" : "1", "b" : "2", "3" : null, "d e f" : "a b c"}
  36. 36. json_object examples SELECT json_object('{a,1,b,2,3,NULL,"d e f","a b c"}'); json_object ------------------------------------------------------- {"a" : "1", "b" : "2", "3" : null, "d e f" : "a b c"} -- same but with two dimensions SELECT json_object('{{a,1},{b,2},{3,NULL},{"d e f","a b c"}}'); json_object ------------------------------------------------------- {"a" : "1", "b" : "2", "3" : null, "d e f" : "a b c"}
  37. 37. Extension repos ● https://github.com/pgexperts/json_build ● https://bitbucket.org/qooleot/json_object ● Also on PGXN
  38. 38. Future of JSON in PostgreSQL ● Possible binary representation ● Avoid reparsing ● Could be vastly faster for some operations ● Should we use a dictionary and store structure separately? ● General document store ● Need to get around the “rewrite a whole datum” issue
  39. 39. Credit ● 9.3 core development work sponsored by Heroku ● Other work sponsored by PGExperts clients and IVC
  40. 40. Questions?

×