A story about developing an application for an online store, persisting all the data as JSON.
Gives an overview of JSON functionality in Oracle Database 19c.
Agile Database Development
with JSON
Chris Saxon
Developer Advocate, @ChrisRSaxon & @SQLDaily
blogs.oracle.com/sql
youtube.com/c/TheMagicofSQL
asktom.oracle.com
The following is intended to outline our general product direction. It is intended for information purposes
only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code,
or functionality, and should not be relied upon in making purchasing decisions. The development,
release, timing, and pricing of any features or functionality described for Oracle’s products may change
and remains at the sole discretion of Oracle Corporation.
Statements in this presentation relating to Oracle’s future plans, expectations, beliefs, intentions and
prospects are “forward-looking statements” and are subject to material risks and uncertainties. A detailed
discussion of these factors and other risks that affect our business is contained in Oracle’s Securities and
Exchange Commission (SEC) filings, including our most recent reports on Form 10-K and Form 10-Q
under the heading “Risk Factors.” These filings are available on the SEC’s website or on Oracle’s website
at http://www.oracle.com/investor. All information in this presentation is current as of September 2019
and Oracle undertakes no duty to update any statement in light of new information or future events.
Safe Harbor
User Story #1
We must be able to store
product & order details
So we need to create the tables
and define CRUD operations on them
create table products (
product_id integer
not null
primary key,
product_json ##TODO##
not null,
check (
product_json is json
)
);
create table orders (
order_id integer
not null
primary key,
order_json ##TODO##
not null,
check (
order_json is json
)
);
The tables are just a
primary key, JSON column,
& is json constraint
But which data type
to use for JSON?!
Which data type should you use for JSON?
"Small" documents (<= 4,000 bytes / 32k): varchar2
"Large" documents: ???
create table products (
product_id integer
not null
primary key,
product_json blob
not null,
check (
product_json is json
)
);
create table orders (
order_id integer
not null
primary key,
order_json blob
not null,
check (
order_json is json
)
);
insert into products ( product_id, product_json )
values ( 1, utl_raw.cast_to_raw ( '{
"productName": "..."
}' ) );
BLOBs need extra processing on insert
select product_json from products;
PRODUCT_JSON
7B202274686973223A20227468617422207D
and select to make them human readable
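To see why the raw BLOB output is unreadable, it can help to decode the hex by hand. The sketch below is not from the slides; it simply feeds the hex string shown above through utl_raw.cast_to_varchar2, the inverse of the cast_to_raw call used on insert.

```sql
-- Convert the hex string back to RAW, then cast the bytes to text.
-- The literal is copied from the PRODUCT_JSON output above.
select utl_raw.cast_to_varchar2 (
         hextoraw ( '7B202274686973223A20227468617422207D' )
       ) jdata
from   dual;
```

Those bytes decode to { "this": "that" }, so the column does hold ordinary JSON text; it is only the default display of a BLOB that hides it.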
select json_serialize (
product_json
returning clob
pretty
) jdata
from products;
JDATA
{
"productName": "..."
}
Added in 19c,
json_serialize
converts JSON data to
text, which you can
pretty print for
readability
select json_query (
product_json,
'$' returning clob
pretty
) jdata
from products;
JDATA
{
"productName": "..."
}
In earlier releases use
json_query
The clob return type
was added in 18c
User Story #2
Customers must be able to
search by price
So we need to query the products table for JSON
where the unitPrice is in the specified range
{
"productName": "GEEKWAGON",
"description": "Ut commodo in …",
"unitPrice": 35.97,
"bricks": [ {
"colour": "red", "shape": "cube",
"quantity": 13
}, {
"colour": "green", "shape": "cube",
"quantity": 17
}, …
]
}
We need to search for this
value in the documents
select * from products p
where p.product_json.unitPrice <= :max_price;
But remember it returns
varchar2
=> implicit conversion!
Use simple dot-notation to access the value
select * from products p
where json_value (
product_json,
'$.unitPrice' returning number
) <= :max_price;
json_value gives you more control
So this returns number
=> no implicit conversion! :)
select * from products p
where p.product_json.unitPrice.number()
<= :max_price;
From 19c you can state
the return type with
simple dot-notation
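The story asks for a price range, while the queries above only apply an upper bound. A minimal sketch of the two-sided search follows, reusing the json_value form so the comparison stays numeric. The :min_price bind variable is an assumption added for illustration; only :max_price appears in the slides.

```sql
-- Return products whose unitPrice falls inside the requested range.
-- "returning number" avoids an implicit varchar2-to-number conversion.
select *
from   products p
where  json_value (
         product_json,
         '$.unitPrice' returning number
       ) between :min_price and :max_price;
```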
User Story #3
Customers must be able to view their
orders
Showing order details and a list of what they bought
So we need to join the order productIds to products
{
"customerId" : 2,
"orderDatetime" : "2019-01-01T03:25:43",
"products" : [ {
"productId" : 1,
"unitPrice" : 74.95
}, {
"productId" : 10,
"unitPrice" : 35.97
}, …
]
}
We need to extract these
from the product array
select json_query (
order_json, '$.products[*].productId'
with array wrapper
) products
from orders o;
PRODUCTS
[2,8,5]
[3,9,6]
[1,10,7,4]
...
But to join these to
products, we need to
convert them to rows…
…or with json_query
with order_items as (
select order_id, t.*
from orders o, json_table (
order_json
columns (
customerId,
nested products[*] columns (
productId,
unitPrice
) )
) t
)
Simplified syntax 18c
This tells the database to
return a row for each
element in the products
array…
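The order_items subquery above stops short of the join the story calls for. One way to finish it is sketched below, under a few assumptions: it uses the explicit json_table path syntax rather than the 18c shorthand, the snake_case column names are illustrative, and the product name is read with json_value.

```sql
-- Turn each element of the products array into a row,
-- then join those rows to the products table.
with order_items as (
  select o.order_id, t.*
  from   orders o, json_table (
    o.order_json, '$'
    columns (
      customer_id number path '$.customerId',
      nested path '$.products[*]' columns (
        product_id number path '$.productId',
        unit_price number path '$.unitPrice'
      )
    )
  ) t
)
select oi.order_id,
       oi.customer_id,
       oi.unit_price,
       json_value ( p.product_json, '$.productName' ) product_name
from   order_items oi
join   products p
on     p.product_id = oi.product_id;
```

Declaring product_id as number in the columns clause means the join compares numbers on both sides, with no implicit conversion.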
User Story #4
Sales must be able to view
today's orders
We need to create a dashboard counting orders
So we need to search for orders placed today
{
"customerId" : 2,
"orderDatetime" : "2019-01-01T03:25:43",
"products" : [ {
"productId" : 1,
"unitPrice" : 74.95
}, {
"productId" : 10,
"unitPrice" : 35.97
}, …
]
}
We need to search
for this value in the
documents
select * from orders o
where o.order_json.orderDatetime >=
trunc ( sysdate );
ORA-01861: literal does
not match format string
Remember the
implicit conversions?
It fails for dates!
Use simple dot-notation to access the value
select * from orders o
where json_value (
order_json,
'$.orderDatetime' returning date
) >= trunc ( sysdate )
So you need to define the
return type. JSON dates must
conform to ISO 8601
But the query is very slow…
User Story #4b
… and make it fast!
Currently the query does a full table scan
To speed it up we need to create an index!
create index orders_date_i
on orders ( order_json );
ORA-02327: cannot create index on
expression with datatype LOB
You can't index LOB data
create search index orders_json_i
on orders ( order_json )
for json
parameters ( 'sync (on commit)' );
Added in 12.2, a json search
index enables JSON queries
to use an index
JSON Search Indexes
-----------------------------------------------------
| Id | Operation | Name |
-----------------------------------------------------
| 0 | SELECT STATEMENT | |
|* 1 | TABLE ACCESS BY INDEX ROWID| ORDERS |
|* 2 | DOMAIN INDEX | ORDERS_JSON_I |
-----------------------------------------------------
With the search index in place,
the optimizer can use it
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(JSON_VALUE("ORDER_JSON" FORMAT JSON ,
'$.orderDatetime' RETURNING TIMESTAMP NULL
ON ERROR) >= TIMESTAMP' 2019-01-15 00:00:00')
2 - access("CTXSYS"."CONTAINS"("O"."ORDER_JSON",
'sdatap(TMS_orderDatetime >=
"2019-01-15T00:00:00+00:00" /orderDatetime)')>0)
Under the covers, this uses Oracle Text
create index order_date_i
on orders (
json_value (
order_json,
'$.orderDatetime'
returning date
error on error
null on empty
)
);
It's more efficient to
create a function-
based index,
matching the search
you'll do
This has some other
benefits…
Data validation!
If the value is not
a JSON date,
inserts will raise
an exception
From 12.2 you can
also raise an error
if the attribute is
not present
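The index shown above uses null on empty, so documents without orderDatetime are still accepted. A sketch of the stricter variant follows, intended as an alternative to that index rather than an addition to it. The index name order_date_required_i is made up for illustration.

```sql
-- "error on empty" makes a missing orderDatetime raise an error,
-- so every stored order has to carry the attribute.
create index order_date_required_i
on orders (
  json_value (
    order_json,
    '$.orderDatetime'
    returning date
    error on error
    error on empty
  )
);
```

If existing rows lack the attribute, the create index statement itself should fail, so this is easiest to adopt before loading data.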
------------------------------------------------------------
| Id | Operation | Name |
------------------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| ORDERS |
|* 2 | INDEX RANGE SCAN | ORDER_DATE_I |
------------------------------------------------------------
The function-based index is more
efficient, so the optimizer will choose
this over the search index
Search vs. Function-Based Indexes
               JSON Search Index   Function-based Index
Applicability  Any JSON query      Matching function
Performance    Slower              Faster
Use            Ad-hoc queries      Application queries
"bricks": [ {
"colour": "red",
"shape": "cube",
"quantity": 13
}, {
"colour": "green",
"shape": "cuboid",
"quantity": 17
}, …
]
Join on
colour, shape
We need to combine the spreadsheet
data with the stored JSON
Step 1: transform JSON
to rows-and-columns
Step 2: join
the costs
Step 3: convert
back to JSON
Buckle up!
This will be a bumpy ride!
select * from external ( (
colour varchar2(30),
shape varchar2(30),
unit_cost number
)
default directory tmp
location ( 'costs.csv' )
)
From 18c you can query files "on the
fly" with an inline external table
select product_id, j.*
from products, json_table (
product_json columns (
nested bricks[*] columns (
pos for ordinality,
colour path '$.colour',
shape path '$.shape',
quantity number path '$.quantity',
brick format json path '$'
)
)
) j
Using json_table to
extract the bricks as rows
with costs as (
select * from external …
), bricks as (
select product_id, j.*
from products, json_table (
…
)
)
select …
from bricks join costs
on …
We've joined the data, but how do
we convert it back to JSON?
select json_object (
'colour' value b.colour,
'shape' value b.shape,
'quantity' value b.quantity,
'unitCost' value c.unit_cost
)
from bricks b
join costs c
on b.colour = c.colour
and b.shape = c.shape;
So you can create a brick
object with json_object…
select json_mergepatch (
brick,
'{ "unitCost": ' || c.unit_cost || '}'
)
from bricks b
join costs c
on b.colour = c.colour
and b.shape = c.shape;
… or use
json_mergepatch (19c)
to add it to the brick object:
add/replace the patch (second argument)
in the document (first argument)
{
"colour": "red",
"shape": "cube",
"quantity": 13,
"unitCost": 0.59
}
{
"colour": "green",
"shape": "cuboid",
"quantity": 17,
"unitCost": 0.39
}
This returns a row
for each brick
To combine them
into an array for
each product, use
json_arrayagg
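Putting the three steps together, the aggregation might look like the sketch below. It reuses the inline external table and the json_table query from the earlier slides. The quantity column and the bricks alias are assumptions added for illustration, and the tmp directory and costs.csv file are taken as given.

```sql
with costs as (
  -- Step 2 input: brick costs from the spreadsheet export
  select * from external ( (
      colour    varchar2(30),
      shape     varchar2(30),
      unit_cost number
    )
    default directory tmp
    location ( 'costs.csv' )
  )
), bricks as (
  -- Step 1: one row per element of each product's bricks array
  select product_id, j.*
  from   products, json_table (
    product_json columns (
      nested bricks[*] columns (
        pos      for ordinality,
        colour   path '$.colour',
        shape    path '$.shape',
        quantity number path '$.quantity'
      )
    )
  ) j
)
-- Step 3: rebuild one JSON array of brick objects per product
select b.product_id,
       json_arrayagg (
         json_object (
           'colour'   value b.colour,
           'shape'    value b.shape,
           'quantity' value b.quantity,
           'unitCost' value c.unit_cost
         )
         order by b.pos
         returning clob
       ) bricks
from   bricks b
join   costs c
on     b.colour = c.colour
and    b.shape = c.shape
group  by b.product_id;
```

Ordering by pos keeps the bricks in their original array order. Because this is an inner join, a brick with no matching row in costs.csv would drop out of the array.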
{
"productName": "GEEKWAGON",
"description": "Ut commodo in …",
"unitPrice": 35.97,
"bricks": [ {
…, "unitCost": 0.59
}, {
…, "unitCost": 0.39
}, …
]
}
Finally!
We've added
unitCost to every
element in the array
We just need to
update the table…
desc orders
Name Null? Type
ORDER_ID NOT NULL NUMBER(38)
ORDER_JSON NOT NULL BLOB
ORDER_JSON$customerId NUMBER
ORDER_JSON$orderDatetime VARCHAR2(32)
ORDER_JSON$code VARCHAR2(8)
ORDER_JSON$discountAmount NUMBER
Sadly the JSON Data Guide virtual columns
only expose scalar (non-array) values
PRODUCT_ID SHAPE COLOUR
1 cube green
1 cube red
1 cylinder blue
1 cylinder blue
1 cylinder green
1 cylinder green
… … …
The unique key
for a brick is
(colour, shape)
Some products have
duplicate entries
in the bricks array!
We're shipping too
many bricks!
User Story #8
FIX ALL THE DATAZ!
We need to remove all the duplicate entries
from the product brick arrays
Wrong Data Model
PRODUCTS BRICKS
The JSON models the relationship between
products and bricks as 1:M
This is the wrong data model:
the relationship is M:M
Fixed It!
PRODUCTS { JSON }
PRODUCT_BRICKS { JSON }
unique (
product_id,
brick_id
)
BRICKS { JSON }
You need a junction table
between products and bricks
This avoids duplication &
enables constraints
select distinct "PRODUCT_JSON$shape" shape,
"PRODUCT_JSON$colour" colour,
"PRODUCT_JSON$unitCost" unit_cost
from product_bricks_vw
Moving from 1:M to M:M
Using the JSON Data Guide
view, you can find all the
unique brick types…
with vals as (
select distinct "PRODUCT_JSON$shape" shape,
"PRODUCT_JSON$colour" colour,
"PRODUCT_JSON$unitCost" unit_cost
from product_bricks_vw
)
select rownum brick_id,
v.*
from vals v;
…assign a unique ID to each
( colour, shape ) …
create table bricks as
with vals as (
select distinct "PRODUCT_JSON$shape" shape,
"PRODUCT_JSON$colour" colour,
"PRODUCT_JSON$unitCost" unit_cost
from product_bricks_vw
)
select rownum brick_id,
v.*
from vals v;
…and create a table
from the results!
create table bricks as
with vals as (
select distinct "PRODUCT_JSON$shape" "shape",
"PRODUCT_JSON$colour" "colour",
"PRODUCT_JSON$unitCost" "unitCost"
from product_bricks_vw
)
select rownum brick_id,
json_object ( v.* ) brick_json
from vals v;
19c simplification
(Storing the values as
JSON if you want)
create table product_bricks as
select distinct product_id, brick_id
from product_bricks_vw
join bricks
on ...
Create the Join Table
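Tables created with create table ... as select come without keys, so the constraints from the "Fixed It!" diagram still have to be added. A minimal sketch follows; all constraint names are invented for illustration, and it assumes bricks has a brick_id column as created above.

```sql
-- Each brick type gets a key the junction table can reference.
alter table bricks
  add constraint bricks_pk primary key ( brick_id );

-- A product may list a given brick only once: no more duplicates.
alter table product_bricks
  add constraint product_bricks_u
  unique ( product_id, brick_id );

-- Junction rows must point at real products and real bricks.
alter table product_bricks
  add constraint product_bricks_product_fk
  foreign key ( product_id ) references products ( product_id );

alter table product_bricks
  add constraint product_bricks_brick_fk
  foreign key ( brick_id ) references bricks ( brick_id );
```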
json_mergepatch (
product_json,
'{ "bricks": null }'
)
If you pass a null value for an
attribute to JSON_mergepatch,
it's removed from the source
Removing the bricks array from products
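Applied to the whole table, the call above could sit inside an update such as the sketch below. The returning blob clause is an assumption chosen to match the BLOB column type used in the earlier slides.

```sql
-- Strip the bricks array from every product document.
-- A null value in the patch removes that attribute from the target.
update products
set    product_json = json_mergepatch (
         product_json,
         '{ "bricks": null }'
         returning blob
       );
```

It is probably wise to run this only after product_bricks has been populated and checked, since the array is the sole source for that data.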