Agenda
• About Anders Karlsson
• JSON, the new CSV – The basics!
• mysqljson - JSON import and export
with MariaDB and MySQL
• MariaDB JSON Support Extensions
• Dynamic columns
• MariaDB 10.0.1 new stuff
• Examples
• Questions and Answers
About Anders Karlsson
• Senior Sales Engineer at SkySQL
• Former Database Architect at Recorded Future, Sales
Engineer and Consultant with Oracle, Informix,
TimesTen, MySQL / Sun / Oracle etc.
• Has been in the RDBMS business for 20+ years
• Has also worked as Tech Support engineer, Porting
Engineer and in many other roles
• Outside SkySQL I build websites (www.papablues.com),
develop Open Source software (MyQuery,
mycleaner etc), am a keen photographer, have
an affection for English Real Ales and a great
interest in computer history
29/04/2013 SkySQL Ab 2011 Confidential 3
JSON, The new CSV – The basics
• JSON = JavaScript Object Notation
– Not for JavaScript only!
• JSON is easy to use, write and read
• JSON is reasonably well standardized
– Not standardized to the point of
becoming useless to mere mortals (like XML)
– Rather, a simple, no frills, standard
– But more so than, say, CSV
• JSON is not tied to a specific platform,
database or application
JSON, The new CSV – The basics
• The JSON value types are simple
– Number
– String
– null
– true / false
– Object
– Array
• An object is a collection of elements, each
with a unique name and a value of one of the
basic types (including object)
• An array is an ordered list of values
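To make the type list concrete, here is a minimal sketch of what each JSON value type becomes after parsing. Python and its standard json module are assumptions made for illustration; the slides do not prescribe a language. Note that the JSON literals themselves are written in lowercase (null, true, false).

```python
import json

# One sample of each JSON value type from the list above.
samples = {
    "number": "1600",
    "string": '"Smith"',
    "null": "null",
    "boolean": "true",
    "object": '{"name": "Smith", "age": 57}',
    "array": '[1, "two", null]',
}

# Parse every sample and record the name of the Python type it maps to.
parsed_types = {
    kind: type(json.loads(text)).__name__ for kind, text in samples.items()
}

for kind, type_name in parsed_types.items():
    print(f"{kind:8} -> {type_name}")
```

Objects arrive as dictionaries and arrays as lists, so nested values can be reached with ordinary indexing.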
JSON, The new CSV – The basics
• An example of a simple JSON value:
[
{"name": "Smith", "age": 57},
{"name": "Allen", "salary": 1600},
{"name": "King", "job": "Manager", "salary": "5000"}
]
• Another example:
{
"John": "The first name",
"Doe": "The last name"
}
JSON, The new CSV – The basics
• So what about this example:
{
"John": "The first name",
"Doe": "The last name",
"John": "Some other guy's name"
}
• How many members does this
object have?
– 3?
– 2?
– 57?
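The answer depends on the parser, since the JSON specification only says that names within an object should be unique. As a sketch of one parser's behaviour, assuming Python's standard json module: by default the last duplicate wins, and object_pairs_hook exposes every pair so duplicates can be counted or rejected. The helper reject_duplicates is a hypothetical name used for illustration.

```python
import json

text = """{"John": "The first name",
           "Doe": "The last name",
           "John": "Some other guy's name"}"""

# Default behaviour: a plain dict, so the later "John" replaces the earlier one.
as_dict = json.loads(text)

# object_pairs_hook receives every name/value pair in document order,
# duplicates included.
as_pairs = json.loads(text, object_pairs_hook=list)


def reject_duplicates(pairs):
    """Build a dict, but raise ValueError if a name occurs more than once."""
    result = {}
    for name, value in pairs:
        if name in result:
            raise ValueError(f"duplicate member name: {name!r}")
        result[name] = value
    return result


print(len(as_dict))   # 2
print(len(as_pairs))  # 3
```

Passing object_pairs_hook=reject_duplicates to json.loads turns the silent overwrite into an error, which is usually the safer choice when importing data.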
JSON, The new CSV – The basics
• String specifics
– Unicode / UTF-8 only
– Backslash escapes, so binary data can be
represented
• Numbers are implementation defined,
regrettably, but mostly you get
– 32-bit signed integer
– 64-bit IEEE Double
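Why the number representation matters can be shown with one value. This is a small sketch, again assuming Python's json module: Python keeps integers exact, and the parse_int=float argument is used here only to imitate a parser that stores every number as a 64-bit double.

```python
import json

big = "9007199254740993"  # 2**53 + 1, not exactly representable as a double

# Python's json module parses integer literals into arbitrary-precision ints.
as_int = json.loads(big)

# Imitate a parser that keeps every number as a 64-bit IEEE double.
as_double = json.loads(big, parse_int=float)

# Check whether the value would fit in a 32-bit signed integer.
fits_int32 = -2**31 <= as_int <= 2**31 - 1

print(as_int)           # 9007199254740993
print(int(as_double))   # 9007199254740992
print(fits_int32)       # False
```

The same JSON text therefore yields different values in different consumers, which is worth remembering when large identifiers are exchanged as JSON numbers rather than strings.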
JSON in a file
• JSON can appear in multiple ways in files, for
example (not exhaustive):
– As separate objects
{"col1": "value1", "col2": "value2"}
{"col1": "value1_2", "col3": "value3"}
– As an array of objects
[{"emp": [{"name": "Smith"},{"name": "Allen"}]},
{"dept": {"name": "dev"}}]
– As an array of objects with simple, non-object, values
[{"col1": "value1", "col2": "value2"},
{"col1": "value1_2", "col3": "value3"}]
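A reader that accepts both the "separate objects" layout and the "array of objects" layout can be sketched as follows. This assumes Python's json module; load_records is a hypothetical helper name, and treating a file that holds exactly one array as a list of records is a simplification chosen for this sketch.

```python
import json


def load_records(text):
    """Return the top-level JSON values from either file layout.

    Handles (a) a single JSON array and (b) JSON values written one after
    another, such as one object per line.
    """
    decoder = json.JSONDecoder()
    values, pos = [], 0
    while pos < len(text):
        # Skip whitespace between values; raw_decode does not skip it itself.
        if text[pos].isspace():
            pos += 1
            continue
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)
    # A file holding exactly one array is treated as a list of records.
    if len(values) == 1 and isinstance(values[0], list):
        return values[0]
    return values


separate = (
    '{"col1": "value1", "col2": "value2"}\n'
    '{"col1": "value1_2", "col3": "value3"}\n'
)
as_array = (
    '[{"col1": "value1", "col2": "value2"},\n'
    ' {"col1": "value1_2", "col3": "value3"}]'
)

print(load_records(separate) == load_records(as_array))  # True
```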
So, why is JSON useful?
• JSON works with more complex data than CSV
• JSON is better standardized than CSV
• JSON is great for interoperability
– If you want to combine relational data, with its
stricter schema and datatypes, with the more
flexible schema-less NoSQL options, then JSON is
great!
• JSON is used by JavaScript (of course),
MongoDB, Elasticsearch, CouchDB and many
others and can be used with many more!
• JSON is also a bit of fun!
Why JSON? Why not XML?
• JSON has numerous good support libraries
that are well-thought-out, stable and easy to
use
– I tend to use Jansson, a C library for manipulating
JSON
– Most scripting languages have JSON parsers, so a
JSON object can easily be transformed into a Ruby
or Python object
• XML, on the other hand, is complex, takes a
rocket scientist to use, and is also
hard to read.
mysqljson – Export and Import
• My project for JSON import and export for
MySQL and MariaDB
• Available on SourceForge
• Supports several file formats
– Object format import is still not released, although
the code is mostly done
• Table and column name mapping
• Column values can be generated
– Fixed
– Incremental
mysqljson – Export and Import
• Does not resolve, say, Foreign Key lookups
• Export allows simple table exports, as well as
ad-hoc SQL export
• Import is parallel
– Parallel on table by table
– Parallel on table level
JSON support in MariaDB
• MariaDB supports dynamic columns in version
5.3 and up
– Dynamic columns are a column type that allows
structured data to be stored in it
– Dynamic columns are stored as BLOBs
– Dynamic columns consist of arbitrary key-value
pairs, similar to JSON objects. Keys are unique within
an object
– Supported by a client-side API
JSON support in recent MariaDB
• MariaDB 10.0.1 adds a lot to dynamic columns
– Support for named keys (pre MariaDB 10.0.1 the
key was an integer)
– Support for JSON export of dynamic columns
• To be added are
– Support for JSON arrays
– Support for parsing JSON objects
– Support for more advanced JSON manipulation
MariaDB dynamic columns functions
• COLUMN_CREATE
– Create a dynamic column
– Dynamic columns may be nested
• COLUMN_GET
– Get the value of an item in a dynamic column
• COLUMN_ADD / COLUMN_DELETE
– Add / update / delete an item from a dynamic
column
• And more…
MariaDB 10.0.1 additions
• COLUMN_JSON
– Extract the value of a dynamic column as a
valid JSON object
• COLUMN_CHECK
– Check that a BLOB is a correctly formatted
dynamic column
JSON with MariaDB 10.0.1
CREATE TABLE presidents(id INT NOT NULL
PRIMARY KEY AUTO_INCREMENT, info BLOB);
INSERT INTO presidents(id, info)
VALUES(NULL, COLUMN_CREATE('firstname',
'Richard', 'nickname', 'Tricky Dick',
'lastname', 'Nixon'));
INSERT INTO presidents(id, info)
VALUES(NULL, COLUMN_CREATE('firstname',
'George', 'lastname', 'Bush'));
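The INSERT statements above spell out each name and value by hand. When the data already lives in a dictionary in application code, the COLUMN_CREATE argument list can be generated instead. The following is a minimal sketch under stated assumptions: column_create_sql is a hypothetical helper, it handles flat dictionaries only, and it emits %s placeholders in the style used by common Python MySQL/MariaDB drivers, leaving all quoting to the driver.

```python
def column_create_sql(item):
    """Return (sql_fragment, params) for a flat dict of names and values.

    Hypothetical helper: only the placeholders go into the SQL text, the
    names and values are passed separately as parameters.
    """
    if not item:
        raise ValueError("COLUMN_CREATE needs at least one name/value pair")
    params = []
    for name, value in item.items():
        # COLUMN_CREATE takes its arguments as name, value, name, value, ...
        params.extend([name, value])
    placeholders = ", ".join(["%s"] * len(params))
    return f"COLUMN_CREATE({placeholders})", params


fragment, params = column_create_sql(
    {"firstname": "Richard", "nickname": "Tricky Dick", "lastname": "Nixon"}
)
sql = f"INSERT INTO presidents(id, info) VALUES(NULL, {fragment})"

# With a DB-API cursor this would be run as: cursor.execute(sql, params)
print(sql)
print(params)
```

Keeping the values out of the SQL text avoids hand-written quoting and the injection problems that come with it.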
JSON with MariaDB 10.0.1
mysql> SELECT id, COLUMN_JSON(info) FROM presidents;
+----+---------------------------------------------------------------------+
| id | COLUMN_JSON(info)                                                   |
+----+---------------------------------------------------------------------+
|  1 | {"lastname":"Nixon","nickname":"Tricky Dick","firstname":"Richard"} |
|  2 | {"lastname":"Bush","firstname":"George"}                            |
+----+---------------------------------------------------------------------+
mysql> UPDATE presidents SET info = COLUMN_ADD(info, 'nickname', 'W') WHERE
id = 2;
mysql> SELECT id, COLUMN_JSON(info) FROM presidents;
+----+---------------------------------------------------------------------+
| id | COLUMN_JSON(info)                                                   |
+----+---------------------------------------------------------------------+
|  1 | {"lastname":"Nixon","nickname":"Tricky Dick","firstname":"Richard"} |
|  2 | {"lastname":"Bush","nickname":"W","firstname":"George"}             |
+----+---------------------------------------------------------------------+
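Because COLUMN_JSON returns ordinary JSON text, a client can hand the result straight to its JSON parser. A minimal sketch, assuming Python's json module and using the two values from the result set above as literal strings rather than a live database connection:

```python
import json

# The COLUMN_JSON(info) values from the result set above, keyed by id.
rows = {
    1: '{"lastname":"Nixon","nickname":"Tricky Dick","firstname":"Richard"}',
    2: '{"lastname":"Bush","nickname":"W","firstname":"George"}',
}

# Parse each JSON text into a dictionary.
presidents = {row_id: json.loads(text) for row_id, text in rows.items()}

for row_id, info in presidents.items():
    print(row_id, info["firstname"], f'"{info["nickname"]}"', info["lastname"])
```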
Indexing JSON in MariaDB
• JSON items can be indexed in MariaDB, using
virtual columns
• This is not optimal, but it is what is currently
available
CREATE TABLE presidents(id INT NOT NULL
PRIMARY KEY AUTO_INCREMENT, info BLOB,
lastname VARCHAR(64) AS (COLUMN_GET(info,
'lastname' AS CHAR(64))) PERSISTENT);
CREATE INDEX president_lastname ON
presidents(lastname);
Indexing JSON in MariaDB
mysql> SELECT rows_read FROM information_schema.index_statistics WHERE index_name =
'president_lastname';
+-----------+
| rows_read |
+-----------+
|         6 |
+-----------+
1 row in set (0.00 sec)
mysql> select COLUMN_JSON(info) from presidents where lastname = 'Bush';
+---------------------------------------------------------+
| COLUMN_JSON(info)                                       |
+---------------------------------------------------------+
| {"lastname":"Bush","nickname":"W","firstname":"George"} |
+---------------------------------------------------------+
1 row in set (0.00 sec)
mysql> SELECT rows_read FROM information_schema.index_statistics WHERE index_name =
'president_lastname';
+-----------+
| rows_read |
+-----------+
|         7 |
+-----------+
1 row in set (0.00 sec)
Triggers on JSON in MariaDB
• Again: Use virtual columns
mysql> CREATE TRIGGER presidents_change AFTER UPDATE ON
presidents FOR EACH ROW INSERT INTO changelog VALUES(NOW(),
CONCAT('Name change from ', old.lastname, ' to ', new.lastname));
Query OK, 0 rows affected (0.10 sec)
mysql> UPDATE presidents SET info = column_add(info, 'lastname',
'Obama') WHERE lastname = 'Bush';
Query OK, 1 row affected (0.05 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> SELECT * FROM changelog;
+---------------------+--------------------------------+
| logtime             | logtext                        |
+---------------------+--------------------------------+
| 2013-04-18 22:06:07 | Name change from Bush to Obama |
+---------------------+--------------------------------+
1 row in set (0.00 sec)
The missing stuff
• JSON Parser for JSON input
• Support for all JSON datatypes
– NULL
– Numeric
– Boolean
• Support for JSON arrays
• Better indexing without resorting
to virtual columns
The missing stuff
• More JSON manipulation functions
• A proper JSON datatype
– Enforced UTF8
– JSON even in the SCHEMA
– Default JSON output format
• To SELECT a JSON column without having to resort to
COLUMN_JSON to get JSON out