You, Me, and JSONB
(a lightning talk)
3.15.17
You
• Are you doing a lot of JSON manipulation inside PostgreSQL?
• Do you need indexed lookups for arbitrary key searches?
• …if “yes”, have you looked at JSONB?
• http://stackoverflow.com/questions/22654170/explanation-of-jsonb-introduced-by-postgresql
Me
• Work for Tableau
• Have a lot of crufty old code lying around…particularly conversion
formulas for JSON to GeoJSON
• Discovered JSONB while trying to optimize a very selective SELECT
from a table I had created using auditing triggers…
JSONB
JSONB vs JSON
JSON
• Stores data in text format
• Input is fast, as no conversion is required
• Processing functions must re-parse the data on each execution
• Indexing is not supported
• All white space and line feeds in the input are preserved as-is
• Duplicate keys are retained, processing functions only consider the last value
• Order of the keys is preserved
JSONB
• Stores data in decomposed binary format
• Input is slightly slower, as there is conversion overhead involved
• Re-parsing is not needed, making data processing significantly faster
• Indexing is supported
• Extra white space and line feeds are stripped
• Duplicate keys are purged at input, only the last value is stored
• Order is not preserved
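The whitespace, duplicate-key, and key-order differences listed above can be seen by casting the same literal to both types. This is a minimal sketch that needs no table; the literal is made up for illustration.

```sql
-- Same input text, cast to each type.
SELECT '{"b": 1,   "a": 2, "a": 3}'::json  AS as_json,
       '{"b": 1,   "a": 2, "a": 3}'::jsonb AS as_jsonb;
-- as_json  keeps the input verbatim: {"b": 1,   "a": 2, "a": 3}
-- as_jsonb is normalized:            {"a": 3, "b": 1}
```

The json value is stored as the text that was typed, while the jsonb value has dropped the extra spaces, kept only the last "a", and no longer reflects the input key order.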
INSERT w/ JSON
• Query returned successfully:
• 111396 rows affected,
• 10.6 secs execution time.
INSERT w/ JSONB
• Query returned successfully:
• 111396 rows affected,
• 17.4 secs execution time.
https://datahub.io/dataset/william-shakespeare-plays/resource/f357200a-d71c-44f4-8271-b00348a6f9c4
INSERT INTO … (SELECT * SHAKESPEARE)
..But wait!!! There is INDEXING!
CREATE TABLE public.will_play_text_raw_idx (
row_id SERIAL PRIMARY KEY,
data jsonb);
CREATE INDEX idxgin ON will_play_text_raw_idx USING gin (data);
Query returned successfully: 111396 rows affected,
19.2 secs execution time.
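A quick way to check whether a given query can use idxgin is to look at its plan. The sketch below assumes the table and index created above; the text_entry value is the line searched for later in the talk. On a small table the planner may still choose a sequential scan.

```sql
-- The default GIN operator class on jsonb serves @>, ?, ?| and ?&.
EXPLAIN
SELECT row_id, data
FROM public.will_play_text_raw_idx
WHERE data @> '{"text_entry": "To be, or not to be: that is the question:"}';
-- A "Bitmap Index Scan on idxgin" node in the output means the GIN index is used.
```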
Find out what type of json objects there are…
SELECT json(b)_object_keys(data) FROM
public.will_play_text_raw
GROUP BY json(b)_object_keys(data);
JSON
• 1 sec execution time.
JSONB
• 6.1 secs execution time.
JSONB + INDEX (*just slightly faster than JSONB)
• 6.1 secs execution time.
Running with EXPLAIN ANALYZE VERBOSE…
JSON (1 sec)
"HashAggregate (cost=87267.47..87368.97 rows=20000 width=173)
(actual time=1051.083..1051.087 rows=6 loops=1)"
" Output: (json_object_keys(data))"
" Group Key: json_object_keys(will_play_text_raw.data)"
" -> Seq Scan on public.will_play_text_raw (cost=0.00..59418.47 rows=11139600 width=173)
(actual time=0.045..808.669 rows=668376 loops=1)"
" Output: json_object_keys(data)"
"Planning time: 0.175 ms"
"Execution time: 1051.170 ms"
Running with EXPLAIN ANALYZE VERBOSE…
JSONB (6.1 sec)
"Group (cost=4676148.26..4787265.77 rows=11139600 width=202)
(actual time=5242.724..6106.799 rows=6 loops=1)"
" Output: (jsonb_object_keys(data))"
" Group Key: (jsonb_object_keys(will_play_text_raw.data))"
" -> Sort (cost=4676148.26..4703997.26 rows=11139600 width=202)
(actual time=5242.719..6004.605 rows=668376 loops=1)"
" Output: (jsonb_object_keys(data))"
" Sort Key: (jsonb_object_keys(will_play_text_raw.data))"
" Sort Method: external merge Disk: 13352kB"
" -> Seq Scan on public.will_play_text_raw (cost=0.00..59794.47 rows=11139600 width=202)
(actual time=0.038..431.132 rows=668376 loops=1)"
" Output: jsonb_object_keys(data)"
"Planning time: 0.180 ms"
"Execution time: 6120.343 ms"
Doing a sort + sequential scan.
Notice that the first-row (startup) cost of the sort is very similar to
its all-rows (total) cost. Usually, when you see a step with very similar
first-row and all-rows costs, that operation needs all the data from the
preceding steps before it can return anything.
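The plan reports "Sort Method: external merge Disk: 13352kB", so the sort spilled to disk. One way to test how much of the 6 seconds comes from that spill is to raise work_mem for the session and re-run the plan. This is only a sketch: the 64MB value is an arbitrary assumption, not a setting from the talk, and the planner may choose a different plan altogether.

```sql
-- Assumption: 64MB is an illustrative value, not a tuned setting.
SET work_mem = '64MB';

EXPLAIN (ANALYZE, VERBOSE)
SELECT jsonb_object_keys(data)
FROM public.will_play_text_raw
GROUP BY jsonb_object_keys(data);

-- Restore the session default afterwards.
RESET work_mem;
```

Whatever the setting, every row still has to be read and expanded into its keys, so the GIN index is not expected to help this query.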
Running with EXPLAIN ANALYZE VERBOSE…
JSONB + IDX (6.1 secs…slightly faster than just JSONB)
"Group (cost=4676148.26..4787265.77 rows=11139600 width=202)
(actual time=5231.472..6090.149 rows=6 loops=1)"
" Output: (jsonb_object_keys(data))"
" Group Key: (jsonb_object_keys(will_play_text_raw_idx.data))"
" -> Sort (cost=4676148.26..4703997.26 rows=11139600 width=202)
(actual time=5231.466..5987.161 rows=668376 loops=1)"
" Output: (jsonb_object_keys(data))"
" Sort Key: (jsonb_object_keys(will_play_text_raw_idx.data))"
" Sort Method: external merge Disk: 13352kB"
" -> Seq Scan on public.will_play_text_raw_idx (cost=0.00..59794.47 rows=11139600 width=202)
(actual time=0.034..424.509 rows=668376 loops=1)"
" Output: jsonb_object_keys(data)"
"Planning time: 0.182 ms"
"Execution time: 6103.038 ms"
When To Avoid JSONB in a PostgreSQL Schema
https://blog.heapanalytics.com/when-to-avoid-jsonb-in-a-postgresql-schema/
- Slow queries due to a lack of table statistics (in other words, reconsider if you are
doing a lot of aggregation and grouping)
- Larger table footprint
(note: read this if you want to think more about “working around JSONB’s lack of
stats”)
https://www.postgresql.org/message-id/54C738E1.8080405@agliodbs.com
So…when does using JSONB **WORK?!**
Where doeth Shakespeare use the phrase
‘thou art’?
JSON
• 594 msecs execution time.
JSONB
• 133 msecs execution time.
JSONB + INDEX
• 125 msecs execution time.
ALMOST 5x faster!!!
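The slide does not show the query behind the ‘thou art’ timings. One plausible form, offered purely as an assumption, is a substring match on the text_entry key:

```sql
-- Hypothetical reconstruction; the talk does not show the actual query.
SELECT data->>'text_entry' AS line
FROM public.will_play_text_raw
WHERE data->>'text_entry' ILIKE '%thou art%';
```

A pattern match like this is not one of the operators a plain GIN index on the column supports, which would be consistent with the index making almost no difference in these timings.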
Where and Who said “To be, or not to be”
SELECT * FROM public.will_play_text_raw
WHERE data @> '{"text_entry" : "To be, or not to be: that is the question:"}';
JSONB
• 78 msecs execution time.
JSONB + INDEX
• 31 msecs execution time.
https://www.postgresql.org/docs/9.4/static/functions-json.html#FUNCTIONS-JSONB-OP-TABLE
Note: the ‘@>’ operator is only available with JSONB
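Containment asks whether the right-hand document is a subset of the left-hand one, including nested objects. The literal examples below are made up for illustration and need no table. The last query is a sketch of how the operator can still be tried against a json column by casting, at the cost of converting every row.

```sql
SELECT '{"a": 1, "b": {"c": 2}}'::jsonb @> '{"b": {"c": 2}}'::jsonb;  -- true
SELECT '{"a": 1}'::jsonb @> '{"a": 1, "b": 2}'::jsonb;                 -- false
SELECT '[1, 2, 3]'::jsonb @> '[3, 1]'::jsonb;                          -- true (array order is ignored)

-- Assumes the data column is json rather than jsonb; the cast runs per row.
SELECT * FROM public.will_play_text_raw
WHERE data::jsonb @> '{"text_entry": "To be, or not to be: that is the question:"}';
```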
Creating GeoJSON using JSON operations on a regular PostGIS enabled spatial table
JSON (16 msecs)
SELECT row_to_json(fc)
FROM ( SELECT 'FeatureCollection' As type, array_to_json(array_agg(f)) As features
FROM (SELECT 'Feature' As type
, ST_AsGeoJSON(lg.geom, 4)::json As geometry
, row_to_json((SELECT l FROM (SELECT gid) As l
)) As properties
FROM public.smallworld As lg ) As f ) As fc;
Creating GeoJSON using JSONB operations on a regular PostGIS enabled spatial table
JSONB (31 msecs)
SELECT to_jsonb(fc)
FROM ( SELECT 'FeatureCollection' As type, array_to_json(array_agg(f)) As features
FROM (SELECT 'Feature' As type
, ST_AsGeoJSON(lg.geom, 4)::jsonb As geometry
, to_jsonb((SELECT l FROM (SELECT gid) As l
)) As properties
FROM public.smallworld As lg ) As f ) As fc;
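An alternative shape for the same FeatureCollection, sketched here with jsonb_build_object and jsonb_agg (PostgreSQL 9.5 or later). It is not the query timed in the talk. It assumes the same public.smallworld table with gid and geom columns, and it stays in jsonb throughout instead of passing through array_to_json.

```sql
SELECT jsonb_build_object(
         'type',     'FeatureCollection',
         -- COALESCE gives an empty array rather than NULL when the table is empty
         'features', COALESCE(jsonb_agg(f.feature), '[]'::jsonb)
       ) AS geojson
FROM (
  SELECT jsonb_build_object(
           'type',       'Feature',
           'geometry',   ST_AsGeoJSON(lg.geom, 4)::jsonb,
           'properties', jsonb_build_object('gid', lg.gid)
         ) AS feature
  FROM public.smallworld AS lg
) AS f;
```

Because jsonb does not preserve key order, the keys in the output may not appear in the order written here. That does not affect validity, since JSON objects are unordered.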
Is it valid GeoJSON?
Check here: http://geojsonlint.com/
Weather Data – JSON to GeoJSON
http://openweathermap.org/api

You, me, and jsonb

  • 1. You, Me, and JSONB (a lightning talk) 3.15.17
  • 2. You
      • Are you doing a lot of JSON manipulation inside PostgreSQL?
      • Do you need indexed lookups for arbitrary key searches?
      • …if “yes”, have you looked at JSONB?
      • http://stackoverflow.com/questions/22654170/explanation-of-jsonb-introduced-by-postgresql
  • 3. Me
      • Work for Tableau
      • Have a lot of crufty old code lying around…particularly conversion formulas for JSON to GeoJSON
      • Discovered JSONB while trying to optimize a very selective SELECT from a table I had created using auditing triggers…
  • 5. JSONB vs JSON
      JSON
      • Stores data in text format
      • Input is fast, as no conversion is required
      • Processing functions must re-parse the data on each execution
      • Indexing is not supported
      • All white space and line feeds in the input are preserved as-is
      • Duplicate keys are retained; processing functions only consider the last value
      • Order of the keys is preserved
      JSONB
      • Stores data in decomposed binary format
      • Input is slightly slower, as there is conversion overhead involved
      • Re-parsing is not needed, making data processing significantly faster
      • Indexing is supported
      • Extra white space and line feeds are stripped
      • Duplicate keys are purged at input; only the last value is stored
      • Order is not preserved
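The whitespace, duplicate-key, and key-order differences listed for JSON and JSONB can be seen by casting one literal to both types. This is a minimal sketch; the literal is made up for illustration and is not from the talk.

```sql
-- Illustrative literal (not from the talk): extra spaces, keys out of
-- order, and a duplicate key "a".
SELECT '{"b": 1,    "a": 2, "a": 3}'::json  AS as_json,
       '{"b": 1,    "a": 2, "a": 3}'::jsonb AS as_jsonb;
-- as_json  keeps the input text exactly as typed.
-- as_jsonb is normalized on input: {"a": 3, "b": 1}

-- Both types report the last value for a duplicated key when queried.
SELECT '{"a": 2, "a": 3}'::json  ->> 'a' AS json_value,
       '{"a": 2, "a": 3}'::jsonb ->> 'a' AS jsonb_value;
```

The json column pays for this fidelity at read time, since each operator call has to parse the stored text again.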
  • 6. INSERT w/ JSON
      • Query returned successfully:
      • 111396 rows affected,
      • 10.6 secs execution time.
      INSERT w/ JSONB
      • Query returned successfully:
      • 111396 rows affected,
      • 17.4 secs execution time.
      https://datahub.io/dataset/william-shakespeare-plays/resource/f357200a-d71c-44f4-8271-b00348a6f9c4
      INSERT INTO … (SELECT * SHAKESPEARE)
  • 7. ..But wait!!! There is INDEXING!
      CREATE TABLE public.will_play_text_raw_idx (
        row_id SERIAL PRIMARY KEY,
        data jsonb);
      CREATE INDEX idxgin ON will_play_text_raw_idx USING gin (data);
      Query returned successfully: 111396 rows affected, 19.2 secs execution time.
  • 8. Find out what type of json objects there are…
      SELECT json(b)_object_keys(data)
      FROM public.will_play_text_raw
      GROUP BY json(b)_object_keys(data);
      • JSON: 1 sec execution time.
      • JSONB: 6.1 secs execution time.
      • JSONB + INDEX (*just slightly faster than JSONB): 6.1 secs execution time.
  • 9. Running with EXPLAIN ANALYZE VERBOSE… JSON (1 sec)
      "HashAggregate (cost=87267.47..87368.97 rows=20000 width=173) (actual time=1051.083..1051.087 rows=6 loops=1)"
      "  Output: (json_object_keys(data))"
      "  Group Key: json_object_keys(will_play_text_raw.data)"
      "  -> Seq Scan on public.will_play_text_raw (cost=0.00..59418.47 rows=11139600 width=173) (actual time=0.045..808.669 rows=668376 loops=1)"
      "        Output: json_object_keys(data)"
      "Planning time: 0.175 ms"
      "Execution time: 1051.170 ms"
  • 10. Running with EXPLAIN ANALYZE VERBOSE… JSONB (6.1 sec)
      "Group (cost=4676148.26..4787265.77 rows=11139600 width=202) (actual time=5242.724..6106.799 rows=6 loops=1)"
      "  Output: (jsonb_object_keys(data))"
      "  Group Key: (jsonb_object_keys(will_play_text_raw.data))"
      "  -> Sort (cost=4676148.26..4703997.26 rows=11139600 width=202) (actual time=5242.719..6004.605 rows=668376 loops=1)"
      "        Output: (jsonb_object_keys(data))"
      "        Sort Key: (jsonb_object_keys(will_play_text_raw.data))"
      "        Sort Method: external merge Disk: 13352kB"
      "        -> Seq Scan on public.will_play_text_raw (cost=0.00..59794.47 rows=11139600 width=202) (actual time=0.038..431.132 rows=668376 loops=1)"
      "              Output: jsonb_object_keys(data)"
      "Planning time: 0.180 ms"
      "Execution time: 6120.343 ms"
      This plan does a sort on top of a sequential scan. Notice that the sort's first-row (startup) cost is very close to its all-rows (total) cost. Usually, when a step has very similar first-row and all-rows costs, that operation needs all the data from the preceding steps before it can return anything.
  • 11. Running with EXPLAIN ANALYZE VERBOSE… JSONB + IDX (6.1 secs…slightly faster than just JSONB)
      "Group (cost=4676148.26..4787265.77 rows=11139600 width=202) (actual time=5231.472..6090.149 rows=6 loops=1)"
      "  Output: (jsonb_object_keys(data))"
      "  Group Key: (jsonb_object_keys(will_play_text_raw_idx.data))"
      "  -> Sort (cost=4676148.26..4703997.26 rows=11139600 width=202) (actual time=5231.466..5987.161 rows=668376 loops=1)"
      "        Output: (jsonb_object_keys(data))"
      "        Sort Key: (jsonb_object_keys(will_play_text_raw_idx.data))"
      "        Sort Method: external merge Disk: 13352kB"
      "        -> Seq Scan on public.will_play_text_raw_idx (cost=0.00..59794.47 rows=11139600 width=202) (actual time=0.034..424.509 rows=668376 loops=1)"
      "              Output: jsonb_object_keys(data)"
      "Planning time: 0.182 ms"
      "Execution time: 6103.038 ms"
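Both JSONB plans report "Sort Method: external merge Disk: 13352kB", meaning the sort spilled to disk. One way to test whether memory is the bottleneck is to raise work_mem for a single transaction and rerun the same query. This is a sketch only: the 64MB value is an assumption chosen for illustration, not a figure from the talk, and the planner may or may not switch to an in-memory sort.

```sql
BEGIN;

-- Assumption: 64MB is an illustrative value, not a recommendation.
-- SET LOCAL only lasts until the end of this transaction.
SET LOCAL work_mem = '64MB';

-- Same query as on the slides, against the indexed JSONB table.
EXPLAIN (ANALYZE, VERBOSE)
SELECT jsonb_object_keys(data)
FROM public.will_play_text_raw_idx
GROUP BY jsonb_object_keys(data);

ROLLBACK;
```

If the "Sort Method" line no longer mentions disk, the setting had an effect. The GIN index plays no part in this query either way, because every row has to be read to list its keys.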
  • 12. When To Avoid JSONB in a PostgreSQL Schema
      https://blog.heapanalytics.com/when-to-avoid-jsonb-in-a-postgresql-schema/
      - Slow queries due to a lack of table statistics (in other words, reconsider if you are doing a lot of aggregate and grouping functions)
      - Larger table footprint
      (note: read this if you want to think more about “working around JSONB’s lack of stats”)
      https://www.postgresql.org/message-id/54C738E1.8080405@agliodbs.com
  • 13. So…when does using JSONB **WORK?!**
  • 14. Where doeth Shakespeare use the phrase ‘thou art’?
      • JSON: 594 msecs execution time.
      • JSONB: 133 msecs execution time.
      • JSONB + INDEX: 125 msecs execution time.
  • 16. Where and Who said “To be, or not to be”
      SELECT *
      FROM public.will_play_text_raw
      WHERE data @> '{"text_entry" : "To be, or not to be: that is the question:"}';
      • JSONB: 78 msecs execution time.
      • JSONB + INDEX: 31 msecs execution time.
      https://www.postgresql.org/docs/9.4/static/functions-json.html#FUNCTIONS-JSONB-OP-TABLE
      Note: the ‘@>’ operator is only available with JSONB
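To confirm that the containment query actually uses the GIN index from slide 7, the same predicate can be run under EXPLAIN against the indexed table. The second statement is an optional variation that the talk does not cover: a jsonb_path_ops index, which supports @> but not the key-existence operators such as ?. The index name idxgin_path is hypothetical.

```sql
-- Check the plan for the containment lookup on the indexed table.
EXPLAIN (ANALYZE)
SELECT *
FROM public.will_play_text_raw_idx
WHERE data @> '{"text_entry": "To be, or not to be: that is the question:"}';

-- Optional alternative (not from the talk): a GIN index restricted to
-- containment-style lookups. The name idxgin_path is hypothetical.
CREATE INDEX idxgin_path
    ON public.will_play_text_raw_idx
 USING gin (data jsonb_path_ops);
```

A plan that shows a bitmap scan on idxgin instead of a sequential scan means the index was used for the lookup.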
  • 18. Creating GeoJSON using JSON operations on a regular PostGIS enabled spatial table
      JSON (16 msecs)
      SELECT row_to_json(fc)
      FROM (
        SELECT 'FeatureCollection' As type,
               array_to_json(array_agg(f)) As features
        FROM (
          SELECT 'Feature' As type,
                 ST_AsGeoJSON(lg.geom, 4)::json As geometry,
                 row_to_json((SELECT l FROM (SELECT gid) As l)) As properties
          FROM public.smallworld As lg
        ) As f
      ) As fc;
  • 19. Creating GeoJSON using JSONB operations on a regular PostGIS enabled spatial table
      JSONB (31 msecs)
      SELECT to_jsonb(fc)
      FROM (
        SELECT 'FeatureCollection' As type,
               array_to_json(array_agg(f)) As features
        FROM (
          SELECT 'Feature' As type,
                 ST_AsGeoJSON(lg.geom, 4)::jsonb As geometry,
                 to_jsonb((SELECT l FROM (SELECT gid) As l)) As properties
          FROM public.smallworld As lg
        ) As f
      ) As fc;
      Is it valid GeoJSON? Check here: http://geojsonlint.com/
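The JSONB query on slide 19 still passes through array_to_json, so it mixes json and jsonb values. As a sketch of an alternative, and not the author's method, the same FeatureCollection can be assembled with jsonb functions only, assuming PostgreSQL 9.5 or later (where jsonb_build_object and jsonb_agg are available) and the same public.smallworld table with gid and geom columns.

```sql
-- Build each Feature as jsonb, then aggregate into a FeatureCollection.
SELECT jsonb_build_object(
         'type',     'FeatureCollection',
         -- COALESCE yields an empty array when the table has no rows.
         'features', COALESCE(jsonb_agg(f.feature), '[]'::jsonb)
       ) AS geojson
FROM (
  SELECT jsonb_build_object(
           'type',       'Feature',
           'geometry',   ST_AsGeoJSON(lg.geom, 4)::jsonb,
           'properties', jsonb_build_object('gid', lg.gid)
         ) AS feature
  FROM public.smallworld AS lg
) AS f;
```

The talk gives no timing for this form, so whether it is faster than the 31 msecs version would need measuring. Because jsonb does not preserve key order, "type" may not come first in the output, which GeoJSON permits.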
  • 20. Weather Data – JSON to GeoJSON http://openweathermap.org/api

Editor's Notes

  1. JSONB support can be nice for infrequently accessed attributes. jsonb supports subset operators (@> and <@), which are handy for complex filter conditions. Indexing helps…but it mostly helps with querying the data from the table (it wouldn’t help you much for JOINs).
  2. Could increase work_mem so the sort runs in memory; the JSONB plans show an external merge sort using 13352kB of disk.
  3. The author of the Heap Analytics post found disk space savings of about 30% by pulling 45 commonly used fields OUT of JSONB format