These days you can't go far without encountering XML or JSON and in the world of the web these data types are ubiquitous. Since version 8.3 XML has been supported as a data type and JSON support was introduced in 9.2. We'll be looking at what advantages there are in storing your data with these data types and how we can query and manipulate our data once it's stored.
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Handling XML and JSON in the Database
1. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
PGDay UK 2013
Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
12th July 2013
2. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
Overview
● XML
– XML Primer
– History of XML in PostgreSQL
– Using PostgreSQL's XML features
● JSON
– JSON Primer
– History of JSON in PostgreSQL
– Using PostgreSQL's JSON features
3. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
About Me
● Been using PostgreSQL for ~10 years
● Contributed some XML support
– XMLEXISTS/xpath_exists()
– xml_is_well_formed()
● Buildfarm member piapiac
– Amazon EC2 based build for JDBC driver
– Has lead to a number of bugfix patches for JDBC
http://www.pgbuildfarm.org/cgi-bin/show_status.pl?
member=piapiac
4. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
Reasons to store XML/JSON
● Client application uses it
– Configuration
– Serialised objects
● Data format/schema is highly complex/variable
● You just don't “care” about the data!
– Audit data
– Application log files
5. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
Primer: XML
● eXtensible Markup Language
● Human readable data interchange &
serialisation format
● Consists of a root element containing a mix of
child elements and text content with any
element having optional attributes
<root>
<child attribute=”value”>
text based content
</child>
</root>
6. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
XML Support
● ANSI and ISO standards exist
– Introduced in SQL/XML 2003
– Augmented in SQL/XML 2006
● Prior to 8.3 XML support was a contrib module
● Added to core in 8.3 but remains a compile
time option enabled with:
configure --with-libxml
7. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
XML Support
● xml datatype (internally stored as text)
● Keywords from the standards
– DOCUMENT, CONTENT
– XMLPARSE, XMLSERIALIZE ...
● Predicates, also from the standards
– IS [NOT] DOCUMENT
– XMLEXISTS (9.1)
● A number of support functions, not standard
– xmlconcat(), xpath() ...
8. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
XML: The Hard Way
● Using “standard” SQL, XML is inserted:
INSERT INTO demo (data) VALUES (XMLPARSE (DOCUMENT '<?
xml version="1.0"?> <book> <title>Manual</title>
<chapter>...</chapter> </book>'))
● To retrieve using “standard” SQL:
SELECT XMLSERIALIZE(DOCUMENT data AS text) FROM ...
9. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
XML: The Easy Way
● It's a normal datatype, use normal casting!
● INSERT (::xml is optional as text will
implicitly cast to xml even in 9.3)
pgday=# INSERT INTO demo (data) VALUES ('<?xml
version="1.0"?> pgday-# <book> <title>Manual</title>
pgday-# <chapter>...</chapter> </book>'::xml);
INSERT 0 1
10. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
XML: The Easy Way
● SELECT (::text is optional as far as
rendering in the psql client is concerned)
pgday=# SELECT data::text FROM demo WHERE ...
data
----------------------------------------------------------
<book><title>Manual</title><chapter>...</chapter></book>
(1 row)
11. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
XML: xmloption
● When casting without XMLPARSE or
XMLSERIALIZE the choice of DOCUMENT or
CONTENT is determined by the value of the 'XML
option' session variable
SET XML OPTION { DOCUMENT | CONTENT}
SET xmloption TO { DOCUMENT |
CONTENT}
12. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
XML: Predicates
● IS DOCUMENT / IS NOT DOCUMENT
– Use to filter between DOCUMENT and CONTENT
– Only works with data that is already cast as XML
pgday=# SELECT XMLSERIALIZE(DOCUMENT data AS text)
FROM
pgday-# WHERE data IS DOCUMENT;
pgday=# SELECT XMLSERIALIZE(CONTENT data AS text)
FROM
pgday-# WHERE data IS NOT DOCUMENT;
13. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
XML: Predicates
● xml_is_well_formed()
– Introduced in 9.1 and takes text as a parameter
– Sensitive to XMLOPTION
● SET xmloption DOCUMENT;
– Also xml_is_well_formed_documet() and
xml_is_well_formed_content()
14. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
XML: xpath()
xpath(xpath, xml [, nsarray])
● Allows you to extract elements and text
● Supports namespaces
● Returns an array of XML
● Also returns empty arrays when there is no match
SELECT xpath('//title/text()',data)
FROM ...
15. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
XML: XMLEXISTS
XMLEXISTS(text PASSING [BY REF] xml [BY REF])
● From the standard, useful for predicates
● First parameter is an xpath expression
SELECT xpath('//title/text()',data)
FROM demo WHERE XMLEXISTS('//title'
PASSING data);
16. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
XML: xpath_exists()
xpath_exists(xpath, xml [, nsarray])
● Serves the same purpose as XMLEXISTS but:
– Supports namespaces
– Syntax is simplier
SELECT xpath('//title/text()',data)
FROM demo WHERE
xpath_exists('//title', data);
17. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
Primer: JSON
● JavaScript Object Notation
● Also a human readable data interchange &
serialisation format
● { } denote objects containing a comma
separated list of name value pairs where
values can be nested objects or arrays
{
“name”:”value”,
“nestedObject”: {
“array”:[“value1”,”value2”]
},
“numeric”:13
}
18. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
JSON Support
● Not a SQL standard
● PostgreSQL is currently the only RDBMS with
native support
● Design borrows heavily from PostgreSQL's
existing XML support
– Internal storage format is TEXT
– Basic validation (e.g. well formed)
● Does not require special libraries to enable
19. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
JSON Support
● Basic support introduced in 9.2
– JSON datatype
– 2 support functions
● row_to_json()
● array_to_json()
● Operators and additional functions in 9.3
– ->,->>,#>,#>>
– 10 more support functions
20. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
JSON Support
● INSERT always ensures type validity
pgday=# INSERT INTO demo VALUES
pgday-# ('{"name":"Mike","hungry":true}');
INSERT 0 1
pgday=# INSERT INTO demo VALUES
pgday-# ('{"name":"Mike",hungry:true}');
ERROR: invalid input syntax for type json at
character 28
21. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
JSON: Operators
● Retrieve the value of an attribute
SELECT data->'name' AS name FROM …
● Use the value of an attribute in a predicate
SELECT data FROM demo WHERE
data->>'name' = 'Mike';
22. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
JSON: Functions
row_to_json(record [, pretty_bool])
Returns a result set where each row is a JSON object
pgday=# SELECT * FROM demo;
username | posts | email
----------+-------+-------------------
mlfowler | 121 | mike@mlfowler.com
fowlerm | 9 | mike@work.com
pgday=# SELECT row_to_json(demo) FROM demo;
row_to_json
-----------------------------------------------------------------
{"username":"mlfowler","posts":121,"email":"mike@mlfowler.com"}
{"username":"fowlerm","posts":9,"email":"mike@work.com"}
23. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
JSON: Functions
json_object_keys(json)
Returns all the keys in the top level
pgday=# SELECT json_object_keys(data) FROM demo;
json_object_keys
------------------
name
hungry
24. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
Caveats
● They are not indexable
– You will need to plan how best to retrieve
● Best used programmaticly
– Syntax of XML and JSON can be unwieldy
● Non-UTF8 encoded databases
● Not portable
25. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
Summary
● Using the PostgreSQL XML and JSON
datatypes allows you finer control of an
otherwise free format field
● PostgreSQL's XML non-standard support is
much easier to use than the standard
● JSON is still being very actively developed
26. Handling XML and JSON in the Database
Mike Fowler, mike@mlfowler.com
PGDayUK 2013
Thank you!
Mike Fowler
mike@mlfowler.com