PostgreSQL 9.3 and JSON
Andrew Dunstan
andrew@dunslane.net
andrew.dunstan@pgexperts.com
Overview
● What is JSON?
● Why use JSON?
● Quick review of 9.2 features
● 9.3 new features
● Building on 9.3 features
● So...
What is JSON?
● Data serialization format
● Lightweight
● Human readable
● Becoming ubiquitous
● Simpler and more compact ...
What it looks like
{
"books" : [
{ "title": "Catch 22”, "author": "Joseph Heller"},
{ "title": "Catcher in the Rye", "auth...
Why use it?
● Everyone is moving that way
● Understood everywhere there is a JavaScript
interpreter
– Especially browsers
...
Why not use it?
● Overly verbose
● Field names are repeated
● Arguably less readable than, say, YAML
● Not suitable for hu...
Review – pre 9.2 facilities
Nothing – store JSON as text
● No validation
● No JSON production
● No JSON extraction
Review – 9.2 data type
New JSON type
● Stored as text
● Reasonably performant state-based validating
parser
● Kudos: Rober...
Review – 9.2 production functions
● Turn non-JSON data into JSON
● row_to_json(anyrecord)
● array_to_json(anyarray)
● Opti...
What's missing?
● JSON production features are incomplete
● JSON processing is totally absent
● Have to use PLV8, PLPerl o...
9.3 Features – JSON production
● to_json(any)
● Can be used on any datum, not just
arrays and records
● json_agg(record)
●...
9.3 and casts to JSON
● Production functions honor casts to JSON
for non-builtin types
● Not needed for builtins, as we kn...
9.3 hstore and JSON
● hstore_to_json(hstore)
● Also used as a cast function
● hstore_to_json_loose(hstore)
● Uses heuristi...
9.3 JSON parser rewrite
● New parser uses recursive descent
pattern
● Caller can supply event handlers for
certain events
...
9.3 JSON processsing functions
● All leverage new parser API
● Operators give a more natural style to
extraction operation...
9.3 extraction operators (1)
● -> fetch an array element or object
member as json
● '[4,5,6]'::json->2 ⟹ 6
– json arrays a...
9.3 extraction operators (2)
● ->> fetch an array element orobject
member as text
● '["a","b","c"]'::json->2 ⟹ c
– Instead...
9.3 extraction operators (3)
● #> and #>> fetch data pointed at by a
path
● Path is an array of text elements
● Treats arr...
9.3 extraction functions
● json_extract_path(json, VARIADIC
path_elems text[]);
● json_extract_path_text(json, VARIADIC
pa...
9.3 turn JSON into records
● CREATE TYPE x AS (a int, b int);
● SELECT * FROM
json_populate_record(null::x,
'{"a":1,"b":2}...
9.3 turn JSON into key/value pairs
● SELECT * FROM
json_each('{"a":1,"b":"foo"}')
● SELECT * FROM
json_each_text('{"a":1,"...
9.3 get keys from JSON object
● SELECT * FROM
json_object_keys('{"a":1,"b":"foo"}')
9.3 JSON array processing
● SELECT json_array_length('[1,2,3,4]');
● SELECT * FROM
json_array_elements('[1,2,3,4]')
9.3 the JSON C API
● Need
● A state object
● Some handlers
● Stash these in a JsonSemAction object
● call pg_json_parse()
9.3 API example
● Code can be cloned from
https://bitbucket.org/adunstan/json_typeof
● CREATE FUNCTION json_typeof(json)
R...
9.3 API example state object
● #include <utils/jsonapi.h>
● typedef struct typeof_state {
JsonLexContext *lex;
JsonTokenTy...
9.3 API example event handlers
● Need one for each of:
● Array start
● Object start
● Scalar
9.3 API example array/object
start handlers
● static void json_typeof_astart (void *state)
{
TypeofState *_state = (Typeof...
9.3 API example scalar handler
● static void json_typeof_scalar(
void *state,
char *token,
JsonTokenType tokentype)
{
Type...
9.3 example function code
Datum json_typeof(PG_FUNCTION_ARGS)
{
text *json = PG_GETARG_TEXT_P(0);
TypeofState *state;
Json...
9.3 CAPI – what else is it good for?
● Transformations of all kinds
● XML
● YAML
● ....
● An XML transformation was the or...
Other useful modules
● Developed for use by my clients
● json_build lets you compose json of arbitrary
complexity
● json_o...
json_build
● Two functions:
● build_json_object(VARIADIC “any”)
● build_json_array(VARIADIC “any”)
● Together these let yo...
json_build example
SELECT build_json_object(
'a', build_json_object('b',false,'c',99),
'd', build_json_object('e',array[9,...
json_object examples
SELECT json_object('{a,1,b,2,3,NULL,"d e f","a b c"}');
json_object
---------------------------------...
json_object examples
SELECT json_object('{a,1,b,2,3,NULL,"d e f","a b c"}');
json_object
---------------------------------...
Extension repos
● https://github.com/pgexperts/json_build
● https://bitbucket.org/qooleot/json_object
● Also on PGXN
Future of JSON in PostgreSQL
● Possible binary representation
● Avoid reparsing
● Could be vastly faster for some operatio...
Credit
● 9.3 core development work sponsored by
Heroku
● Other work sponsored by PGExperts clients
and IVC
Questions?
Upcoming SlideShare
Loading in...5
×

Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

1,661

Published on

Published in: Technology, Education
0 Comments
3 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
1,661
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
19
Comments
0
Likes
3
Embeds 0
No embeds

No notes for slide

Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

  1. 1. PostgreSQL 9.3 and JSON Andrew Dunstan andrew@dunslane.net andrew.dunstan@pgexperts.com
  2. 2. Overview ● What is JSON? ● Why use JSON? ● Quick review of 9.2 features ● 9.3 new features ● Building on 9.3 features ● Some useful extensions ● Future work
  3. 3. What is JSON? ● Data serialization format ● Lightweight ● Human readable ● Becoming ubiquitous ● Simpler and more compact than XML
  4. 4. What it looks like { "books" : [ { "title": "Catch 22”, "author": "Joseph Heller"}, { "title": "Catcher in the Rye", "author": "J. D. Salinger"} ], "publishers": [ { "name": "Random House" }, { "name": "Penguin" } ], "active": true, "version": 35, "date": "2003-09-13", “reference”: null } Scalars: ● quoted strings ● numbers ● true, false, null No extensions No date/time types
  5. 5. Why use it? ● Everyone is moving that way ● Understood everywhere there is a JavaScript interpreter – Especially browsers ● ... and in a large number of other languages – e.g. Perl, Python ● node.js is becoming very widely used ● More compact than XML ● Most applications don't need the richer structure of XML
  6. 6. Why not use it? ● Overly verbose ● Field names are repeated ● Arguably less readable than, say, YAML ● Not suitable for huge objects
  7. 7. Review – pre 9.2 facilities Nothing – store JSON as text ● No validation ● No JSON production ● No JSON extraction
  8. 8. Review – 9.2 data type New JSON type ● Stored as text ● Reasonably performant state-based validating parser ● Kudos: Robert Haas
  9. 9. Review – 9.2 production functions ● Turn non-JSON data into JSON ● row_to_json(anyrecord) ● array_to_json(anyarray) ● Optional second param for pretty printing ● My humble contribution ☺
  10. 10. What's missing? ● JSON production features are incomplete ● JSON processing is totally absent ● Have to use PLV8, PLPerl or some such
  11. 11. 9.3 Features – JSON production ● to_json(any) ● Can be used on any datum, not just arrays and records ● json_agg(record) ● Much faster than array_to_json(array_agg(record))
  12. 12. 9.3 and casts to JSON ● Production functions honor casts to JSON for non-builtin types ● Not needed for builtins, as we know how to convert them ● Saves syscache lookups where we know it's not necessary
  13. 13. 9.3 hstore and JSON ● hstore_to_json(hstore) ● Also used as a cast function ● hstore_to_json_loose(hstore) ● Uses heuristics about whether or not certain possibly numeric and boolean values need to be quoted.
  14. 14. 9.3 JSON parser rewrite ● New parser uses recursive descent pattern ● Caller can supply event handlers for certain events ● c.f. XML SAX parsers ● Validator uses NULL handlers for all events ● Tokenizing routines of previous parser largely kept
  15. 15. 9.3 JSON processsing functions ● All leverage new parser API ● Operators give a more natural style to extraction operations ● Many have two forms, producing either JSON output, which can be further processed, or text output, which cannot. ● Text output is de-escaped and dequoted
  16. 16. 9.3 extraction operators (1) ● -> fetch an array element or object member as json ● '[4,5,6]'::json->2 ⟹ 6 – json arrays are 0 based, unlike SQL arrays ● '{"a":1,"b":2}'::json->'b' ⟹ 2
  17. 17. 9.3 extraction operators (2) ● ->> fetch an array element orobject member as text ● '["a","b","c"]'::json->2 ⟹ c – Instead of "c"
  18. 18. 9.3 extraction operators (3) ● #> and #>> fetch data pointed at by a path ● Path is an array of text elements ● Treats arrays correctly by some trying to treat path element as an integer of necessary ● '{"a":[6,7,8]}'::json#>'{a,1}' ⟹ 7 ●
  19. 19. 9.3 extraction functions ● json_extract_path(json, VARIADIC path_elems text[]); ● json_extract_path_text(json, VARIADIC path_elems text[]); ● Same as #> and #>> operators, but you can pass the path as a variadic array ● json_extract_path('{"a":[6,7,8]}','a','1') ⟹ 7
  20. 20. 9.3 turn JSON into records ● CREATE TYPE x AS (a int, b int); ● SELECT * FROM json_populate_record(null::x, '{"a":1,"b":2}', false); ● SELECT * FROM json_populate_recordset(null::x,'[{"a":1," b":2},{"a":3,"b":4}]', false);
  21. 21. 9.3 turn JSON into key/value pairs ● SELECT * FROM json_each('{"a":1,"b":"foo"}') ● SELECT * FROM json_each_text('{"a":1,"b":"foo"}') ● Deliver columns named “key” and “value”
  22. 22. 9.3 get keys from JSON object ● SELECT * FROM json_object_keys('{"a":1,"b":"foo"}')
  23. 23. 9.3 JSON array processing ● SELECT json_array_length('[1,2,3,4]'); ● SELECT * FROM json_array_elements('[1,2,3,4]')
  24. 24. 9.3 the JSON C API ● Need ● A state object ● Some handlers ● Stash these in a JsonSemAction object ● call pg_json_parse()
  25. 25. 9.3 API example ● Code can be cloned from https://bitbucket.org/adunstan/json_typeof ● CREATE FUNCTION json_typeof(json) RETURNS text AS 'MODULE_PATHNAME' LANGUAGE C STRICT IMMUTABLE;
  26. 26. 9.3 API example state object ● #include <utils/jsonapi.h> ● typedef struct typeof_state { JsonLexContext *lex; JsonTokenType result; } TypeofState;
  27. 27. 9.3 API example event handlers ● Need one for each of: ● Array start ● Object start ● Scalar
  28. 28. 9.3 API example array/object start handlers ● static void json_typeof_astart (void *state) { TypeofState *_state = (TypeofState *) state; if (_state->lex->lex_level == 0) _state->result = JSON_TOKEN_ARRAY_START; } static void json_typeof_ostart (void *state) { TypeofState *_state = (TypeofState *) state; if (_state->lex->lex_level == 0) _state->result = JSON_TOKEN_OBJECT_START; }
  29. 29. 9.3 API example scalar handler ● static void json_typeof_scalar( void *state, char *token, JsonTokenType tokentype) { TypeofState *_state = (TypeofState *) state; if (_state->lex->lex_level == 0) _state->result = tokentype; }
  30. 30. 9.3 example function code Datum json_typeof(PG_FUNCTION_ARGS) { text *json = PG_GETARG_TEXT_P(0); TypeofState *state; JsonLexContext *lex = makeJsonLexContext(json, false); JsonSemAction *sem; state = palloc0(sizeof(TypeofState)); sem = palloc0(sizeof(JsonSemAction)); state->lex = lex; sem->semstate = (void *) state; sem->object_start = json_typeof_ostart; sem->array_start = json_typeof_astart; sem->scalar = json_typeof_scalar; pg_parse_json(lex, sem); PG_RETURN_TEXT_P(token_to_type(state->result)); }
  31. 31. 9.3 CAPI – what else is it good for? ● Transformations of all kinds ● XML ● YAML ● .... ● An XML transformation was the original working model for creating the API
  32. 32. Other useful modules ● Developed for use by my clients ● json_build lets you compose json of arbitrary complexity ● json_object creates a json object from a 1- or 2-dimensional array of text, similar to hstore facility.
  33. 33. json_build ● Two functions: ● build_json_object(VARIADIC “any”) ● build_json_array(VARIADIC “any”) ● Together these let you compose deeply nested tree structures safely.
  34. 34. json_build example SELECT build_json_object( 'a', build_json_object('b',false,'c',99), 'd', build_json_object('e',array[9,8,7]::int[], 'f', (select row_to_json(r) from ( select relkind, oid::regclass as name from pg_class where relname = 'pg_class') r))); {"a" : {"b" : false, "c" : 99}, "d" : {"e" : [9,8,7], "f" : {"relkind":"r","name":"pg_class"}}}
  35. 35. json_object examples SELECT json_object('{a,1,b,2,3,NULL,"d e f","a b c"}'); json_object ------------------------------------------------------- {"a" : "1", "b" : "2", "3" : null, "d e f" : "a b c"} -- same but with two dimensions SELECT json_object('{{a,1},{b,2},{3,NULL},{"d e f","a b c"}}'); json_object ------------------------------------------------------- {"a" : "1", "b" : "2", "3" : null, "d e f" : "a b c"}
  36. 36. json_object examples SELECT json_object('{a,1,b,2,3,NULL,"d e f","a b c"}'); json_object ------------------------------------------------------- {"a" : "1", "b" : "2", "3" : null, "d e f" : "a b c"} -- same but with two dimensions SELECT json_object('{{a,1},{b,2},{3,NULL},{"d e f","a b c"}}'); json_object ------------------------------------------------------- {"a" : "1", "b" : "2", "3" : null, "d e f" : "a b c"}
  37. 37. Extension repos ● https://github.com/pgexperts/json_build ● https://bitbucket.org/qooleot/json_object ● Also on PGXN
  38. 38. Future of JSON in PostgreSQL ● Possible binary representation ● Avoid reparsing ● Could be vastly faster for some operations ● Should we use a dictionary and store structure separately? ● General document store ● Need to get around the “rewrite a whole datum” issue
  39. 39. Credit ● 9.3 core development work sponsored by Heroku ● Other work sponsored by PGExperts clients and IVC
  40. 40. Questions?
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×