Better Than You Think
JSON Data in ClickHouse
Robert Hodges
Altinity
V 2021-09
© 2021 Altinity, Inc.
Presenter and Company Bio
www.altinity.com
Enterprise provider for ClickHouse, a popular, open source data warehouse. We run Altinity.Cloud, the first managed ClickHouse in AWS and GCP.
Robert Hodges - Altinity CEO
30+ years on DBMS plus virtualization and security. Using ClickHouse since 2019.
JSON is pervasive as raw data
Web site access log in JSON format:
{
"@timestamp": 895873059,
"clientip": "54.72.5.0",
"request": "GET /images/home_bg_stars.gif HTTP/1.1",
"status": 200,
"size": 2557
}
How to model JSON data in ClickHouse
Three SQL table designs:
● Tabular: every key is a column
● Arrays: header values with key-value pairs (array of keys, array of values)
● JSON: header values with JSON string (“blob”)
Simplest design maps all keys to columns
CREATE TABLE http_logs_tabular
(
`@timestamp` DateTime,
`clientip` IPv4,
`status` UInt16,
`request` String,
`size` UInt32
)
ENGINE = MergeTree
PARTITION BY toStartOfDay(`@timestamp`)
ORDER BY `@timestamp`
SETTINGS index_granularity = 8192
Loading *might* be this easy
head http_logs.json
{"@timestamp": 895873059, "clientip":"54.72.5.0", "request":
"GET /images/home_bg_stars.gif HTTP/1.1", "status": 200,
"size": 2557}
{"@timestamp": 895873059, "clientip":"53.72.5.0", "request":
"GET /images/home_tool.gif HTTP/1.0", "status": 200, "size":
327}
...
clickhouse-client --query \
'INSERT INTO http_logs_tabular FORMAT JSONEachRow' \
< http_logs.json
But what happens if...
● You don’t know the columns up front?
● JSON keys and values differ between records?
● JSON values are dirty or don’t map trivially to columns?
How can I handle that in ClickHouse?
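One way to size up these questions before choosing a table design is to profile the raw file outside ClickHouse. The sketch below is in Python and is not from the talk. It assumes one JSON object per line, as in http_logs.json, and the helper name key_profile is hypothetical. It counts how often each top-level key appears, so you can see which keys occur in every record and which are optional.

```python
import json
from collections import Counter


def key_profile(lines):
    """Count how many records contain each top-level key.

    Assumes every non-empty line is a JSON object (hypothetical helper,
    not part of ClickHouse).
    """
    total = 0
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        total += 1
        counts.update(json.loads(line).keys())
    return total, counts


# Two made-up records; the second one has no "size" key.
sample = [
    '{"@timestamp": 895873059, "clientip": "54.72.5.0", "status": 200, "size": 2557}',
    '{"@timestamp": 895873059, "clientip": "53.72.5.0", "status": 200}',
]
total, counts = key_profile(sample)
# Keys seen in every record are candidates for real columns.
always_present = sorted(k for k, n in counts.items() if n == total)
print(always_present)
```

Keys that show up in only some records are the ones that make the purely tabular design awkward.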
Start by storing the JSON as text
CREATE TABLE http_logs
(
`file` String,
`message` String -- the JSON “blob”
)
ENGINE = MergeTree
PARTITION BY file
ORDER BY tuple()
SETTINGS index_granularity = 8192
Load data in whatever way is easiest...
head http_logs.csv
"file","message"
"documents-211998.json","{""@timestamp"": 895873059,
""clientip"":""54.72.5.0"", ""request"": ""GET
/images/home_bg_stars.gif HTTP/1.1"", ""status"": 200, ""size"":
2557}"
"documents-211998.json","{""@timestamp"": 895873059,
""clientip"":""53.72.5.0"", ""request"": ""GET /images/home_tool.gif
HTTP/1.0"", ""status"": 200, ""size"": 327}"
...
clickhouse-client --query \
'INSERT INTO http_logs FORMAT CSVWithNames' \
< http_logs.csv
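The CSV above wraps each raw JSON document in a quoted field with its inner double quotes doubled. A minimal Python sketch of producing that layout follows. It is not from the talk, the function name json_lines_to_csv is hypothetical, and it assumes the input holds one JSON document per line.

```python
import csv
import io


def json_lines_to_csv(lines, file_name):
    """Wrap each raw JSON line in a (file, message) CSV row.

    QUOTE_ALL quotes every field and doubles embedded double quotes,
    which matches the sample http_logs.csv shown above.
    """
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(["file", "message"])  # header row for CSVWithNames
    for line in lines:
        line = line.strip()
        if line:
            writer.writerow([file_name, line])
    return out.getvalue()


# A shortened, made-up record keeps the example small.
sample = ['{"@timestamp": 895873059, "status": 200}\n']
csv_text = json_lines_to_csv(sample, "documents-211998.json")
print(csv_text)
```

The JSON text is passed through untouched, so nothing is parsed or validated at this stage; that work is left to ClickHouse queries later.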
You can query using JSON* functions
-- Get a JSON string value
SELECT JSONExtractString(message, 'request') AS request
FROM http_logs LIMIT 3
-- Get a JSON numeric value
SELECT JSONExtractInt(message, 'status') AS status
FROM http_logs LIMIT 3
-- Use values to answer useful questions.
SELECT JSONExtractInt(message, 'status') AS status, count() AS count
FROM http_logs
WHERE status >= 400
AND toDateTime(JSONExtractUInt(message, '@timestamp')) BETWEEN
'1998-05-20 00:00:00' AND '1998-05-20 23:59:59'
GROUP BY status ORDER BY status
JSON* vs visitParam functions
-- JSON* functions: SLOWER, but use a complete JSON parser.
SELECT JSONExtractString(message, 'request')
FROM http_logs LIMIT 3
-- visitParam functions: FASTER, but cannot distinguish the same
-- name in different structures.
SELECT visitParamExtractString(message, 'request')
FROM http_logs LIMIT 3
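To see why a fast key scan can return the wrong field, here is a toy Python illustration. It is not how ClickHouse implements either function family. The regular-expression scanner stands in for any extractor that searches for a key without tracking nesting, and both function names are hypothetical.

```python
import json
import re


def naive_extract_int(text, key):
    """Toy scanner: integer after the first "key": found at any depth."""
    match = re.search(r'"%s"\s*:\s*(-?\d+)' % re.escape(key), text)
    return int(match.group(1)) if match else 0


def parsed_extract_int(text, key):
    """Full parse: look the key up at the top level only."""
    value = json.loads(text).get(key, 0)
    return value if isinstance(value, int) else 0


# Made-up document where "status" appears in two different structures.
doc = '{"upstream": {"status": 502}, "status": 200}'
naive = naive_extract_int(doc, "status")    # finds the nested one first
parsed = parsed_extract_int(doc, "status")  # finds the top-level one
print(naive, parsed)
```

The scanner does less work per row, which is the appeal, but it is only safe when key names are unique across the whole document.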
We can improve usability by ordering data
CREATE TABLE http_logs_sorted (
`file` String,
`message` String,
timestamp DateTime DEFAULT
toDateTime(JSONExtractUInt(message, '@timestamp'))
)
ENGINE = MergeTree
PARTITION BY toStartOfMonth(timestamp)
ORDER BY timestamp
INSERT INTO http_logs_sorted
SELECT file, message FROM http_logs
And still further by adding more columns
ALTER TABLE http_logs_sorted
ADD COLUMN `status` Int16 DEFAULT JSONExtractInt(message,
'status') CODEC(ZSTD(1))
ALTER TABLE http_logs_sorted
ADD COLUMN `request` String DEFAULT
JSONExtractString(message, 'request')
-- Force columns to be materialized
ALTER TABLE http_logs_sorted
UPDATE status=status, request=request
WHERE 1
Our query is now simpler...
SELECT
status, count() as count
FROM http_logs_sorted WHERE status >= 400 AND
timestamp BETWEEN
'1998-05-20 00:00:00' AND '1998-05-20 23:59:59'
GROUP BY status ORDER BY status
And MUCH faster!
SELECT
status, count() as count
FROM http_logs_sorted WHERE status >= 400 AND
timestamp BETWEEN
'1998-05-20 00:00:00' AND '1998-05-20 23:59:59'
GROUP BY status ORDER BY status
0.014 seconds vs 9.8 seconds!
Can use primary key index to drop blocks
100x less I/O to read
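The speedup depends on the sorted layout: with ORDER BY timestamp, ClickHouse keeps a sparse primary key index (roughly one entry per index_granularity rows) and can skip blocks whose key range cannot match the filter. The Python sketch below is a simplified model of that idea, not ClickHouse code. The function name granules_to_read and the tiny list of marks are made up for illustration.

```python
import bisect


def granules_to_read(marks, lo, hi):
    """Return indices of granules that may hold keys in [lo, hi].

    marks[i] is the first key stored in granule i. Keys are sorted,
    so granule i covers marks[i] up to marks[i + 1].
    """
    # The granule before the first mark >= lo may still contain lo.
    start = max(bisect.bisect_left(marks, lo) - 1, 0)
    # Granules whose first key is greater than hi cannot match.
    end = bisect.bisect_right(marks, hi)
    return list(range(start, end))


# Hypothetical marks: one entry per granule of a table sorted by timestamp.
marks = [0, 100, 200, 300, 400]
selected = granules_to_read(marks, 150, 250)
print(selected)  # granules 1 and 2; granules 0, 3 and 4 are skipped
```

A table with ORDER BY tuple(), like the original http_logs, offers no such ordering, so every block has to be read and parsed.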
Extracted columns are like indexes
(Here’s how we got that column size data)
-- System tables are your friends
SELECT
table, name,
data_compressed_bytes,
formatReadableSize(data_compressed_bytes) AS tc,
formatReadableSize(data_uncompressed_bytes) AS tu,
data_compressed_bytes / data_uncompressed_bytes AS ratio,
type, compression_codec
FROM system.columns
WHERE database = currentDatabase()
AND table LIKE 'http%'
ORDER BY table, name
Another way to store JSON objects: Maps
CREATE TABLE http_logs_map (
`file` String, `message` Map(String, String),
timestamp DateTime
DEFAULT toDateTime(toUInt32(message['@timestamp']))
CODEC(Delta, ZSTD(1))
)
ENGINE = MergeTree
PARTITION BY toStartOfMonth(timestamp)
ORDER BY timestamp
Loading and querying JSON in Maps
-- Load data
INSERT into http_logs_map(file, message)
SELECT file,
JSONExtractKeysAndValues(message, 'String') message
FROM http_logs
-- Run a query.
SELECT message['status'] status, count()
FROM http_logs_map
GROUP BY status ORDER BY status
4-5x faster than accessing JSON string objects
Storing JSON in paired arrays
CREATE TABLE http_logs_arrays (
`file` String,
`keys` Array(String),
`values` Array(String),
timestamp DateTime CODEC(Delta, ZSTD(1))
)
ENGINE = MergeTree
PARTITION BY toStartOfMonth(timestamp)
ORDER BY timestamp
Querying values in arrays
-- Run a query.
SELECT values[indexOf(keys, 'status')] status, count()
FROM http_logs_arrays
GROUP BY status ORDER BY status
status|count() |
------|--------|
   200|24917090|
   206|   64935|
   302|    1941|
   304| 4899616|
   400|     888|
   404|  115005|
   500|     525|
4-5x faster than accessing JSON string objects
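The expression values[indexOf(keys, 'status')] finds the position of the key and reads the value at the same position. The Python sketch below mimics that lookup on made-up rows. It is an illustration rather than ClickHouse code, and the helper name lookup is hypothetical. It also assumes, as far as I know ClickHouse's behaviour, that indexOf returns 0 for a missing key and that reading at that position yields the type's default value, an empty string here.

```python
from collections import Counter


def lookup(keys, values, key):
    """Mimic values[indexOf(keys, key)]; empty string if the key is absent."""
    try:
        return values[keys.index(key)]
    except ValueError:
        return ""


# Made-up rows of (keys, values); the third row has no "status" key.
rows = [
    (["status", "size"], ["200", "954"]),
    (["status", "size"], ["404", "0"]),
    (["size"], ["12"]),
    (["status"], ["200"]),
]
# Equivalent of GROUP BY status with count().
counts = Counter(lookup(k, v, "status") for k, v in rows)
print(sorted(counts.items()))
```

Note that every value comes back as a string, so numeric comparisons need an explicit conversion.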
So... What do paired arrays look like?
SELECT * FROM http_logs_arrays LIMIT 3
Row 1:
──────
file: documents-211998.json
keys: ['@timestamp','clientip','request','status','size']
values: ['895435201','30.20.6.0','GET /french/index.html
HTTP/1.0','200','954']
timestamp: 1998-05-17 20:00:01
...
keys holds each key; values holds the value at the same position.
Loading JSON to paired arrays
-- Load data. Might be better to format outside ClickHouse.
INSERT into http_logs_arrays(file, keys, values, timestamp)
SELECT file,
arrayMap(x -> x.1,
JSONExtractKeysAndValues(message, 'String')) keys,
arrayMap(x -> x.2,
JSONExtractKeysAndValues(message, 'String')) values,
toDateTime(JSONExtractUInt(message, '@timestamp'))
timestamp
FROM http_logs limit 30000000
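If you do format outside ClickHouse, the split into paired arrays is short. The Python sketch below is one possible approach, not the author's tooling, and the function name to_paired_arrays is hypothetical. It keeps string values as they are and renders everything else as JSON text, which for this log data gives the same string values as the sample row shown earlier.

```python
import json


def to_paired_arrays(message):
    """Split one JSON object into parallel lists of keys and string values."""
    obj = json.loads(message)
    keys = list(obj.keys())
    # Strings pass through; numbers and nested values become JSON text.
    values = [v if isinstance(v, str) else json.dumps(v) for v in obj.values()]
    return keys, values


message = (
    '{"@timestamp": 895435201, "clientip": "30.20.6.0", '
    '"request": "GET /french/index.html HTTP/1.0", "status": 200, "size": 954}'
)
keys, values = to_paired_arrays(message)
print(keys)
print(values)
```

Parsing each document once here avoids the two JSONExtractKeysAndValues calls per row in the INSERT above.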
Must we always copy between tables?
It seems painful. Surely there is a better way!
Use materialized views to help with loading
Diagram: Nginx logs, web logs -> INSERT INTO http_logs_etl -> MV -> http_logs_arrays
Enrich data with materialized view
Create the base ETL table
CREATE TABLE http_logs_etl (
`file` String,
`message` String
)
ENGINE = Null
-- Null engine: does not store data but triggers materialized views
Create the target table (as before…)
CREATE TABLE http_logs_arrays (
`file` String,
`keys` Array(String),
`values` Array(String),
timestamp DateTime CODEC(Delta, ZSTD(1))
)
ENGINE = MergeTree
PARTITION BY toStartOfMonth(timestamp)
ORDER BY timestamp
Create materialized view for ETL
-- Fires whenever a block arrives at http_logs_etl.
CREATE MATERIALIZED VIEW http_logs_etl_mv
TO http_logs_arrays
AS
SELECT file,
arrayMap(x -> x.1,
JSONExtractKeysAndValues(message, 'String')) keys,
arrayMap(x -> x.2,
JSONExtractKeysAndValues(message, 'String')) values,
toDateTime(JSONExtractUInt(message, '@timestamp')) timestamp
FROM http_logs_etl
Add WHERE conditions to filter data
Do any transforms you like!
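The flow of a Null table feeding a materialized view can be modelled in a few lines. The Python sketch below is a toy model of the data flow only, not ClickHouse code. The class NullTable and the function etl_view are hypothetical names, and the list http_logs_arrays merely stands in for the MergeTree target table.

```python
import json


class NullTable:
    """Toy stand-in for a table with ENGINE = Null.

    It stores nothing; each inserted block is only handed to the
    attached views. Hypothetical class, not a ClickHouse API.
    """

    def __init__(self):
        self.views = []

    def insert(self, block):
        for view in self.views:
            view(block)


http_logs_arrays = []  # plays the role of the MergeTree target table


def etl_view(block):
    """Transform (file, message) rows roughly as the view's SELECT does."""
    for file, message in block:
        obj = json.loads(message)
        keys = list(obj)
        values = [v if isinstance(v, str) else json.dumps(v) for v in obj.values()]
        http_logs_arrays.append((file, keys, values, obj["@timestamp"]))


http_logs_etl = NullTable()
http_logs_etl.views.append(etl_view)
# One block with one made-up row arrives at the ETL table.
http_logs_etl.insert([
    ("documents-211998.json", '{"@timestamp": 895873059, "status": 200}'),
])
print(http_logs_arrays)
```

The transformed rows land in the target as part of the insert itself, and the staging object keeps no copy of the raw messages.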
Now load data and have at it...
-- Clear the target table to start from scratch.
TRUNCATE TABLE http_logs_arrays
-- Load data via ETL table.
INSERT INTO http_logs_etl(file, message)
SELECT file, message FROM http_logs LIMIT 1000000
-- Confirm rows arrived in the arrays table.
SELECT count() FROM http_logs_arrays
count()|
-------|
1000000|
New data are instantly queryable
Roadmap and more information
ClickHouse JSON support is not perfect
Complex JSON is hard to access!
SELECT
JSONExtractUInt(
JSONExtractRaw(
JSONExtractRaw(
'{"a": {"b": {"c": 1}}}', 'a'), 'b'), 'c')
AS anInt
JSON performance is hard to optimize!
What’s coming to help?
Semi-structured data!
ClickHouse will map JSON structure to columnar
storage and provide convenient query syntax
https://github.com/ClickHouse/ClickHouse/issues/17623
You can go a long way in the meantime!
Lots of options to represent JSON in ClickHouse tables
● Transform fully to columns
● Store JSON blob and add columns with indexes as needed
● Transform to Map
● Transform to paired arrays
Load JSON blobs to staging table and transform with materialized views
Mix and match to process JSON in a way that works best for your apps!
Fan Favorite
Questions?
Thank you!
Altinity: https://altinity.com
ClickHouse: https://github.com/ClickHouse/ClickHouse
Altinity.Cloud: https://altinity.com/cloud-database/
We are hiring!
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
 
ClickHouse ReplacingMergeTree in Telecom Apps
ClickHouse ReplacingMergeTree in Telecom AppsClickHouse ReplacingMergeTree in Telecom Apps
ClickHouse ReplacingMergeTree in Telecom Apps
 
Adventures with the ClickHouse ReplacingMergeTree Engine
Adventures with the ClickHouse ReplacingMergeTree EngineAdventures with the ClickHouse ReplacingMergeTree Engine
Adventures with the ClickHouse ReplacingMergeTree Engine
 
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with  Apache Pulsar and Apache PinotBuilding a Real-Time Analytics Application with  Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
 
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdfAltinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
 
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
 
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdfOSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
 
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
 
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
 
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
 
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
 
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
 
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdfOSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
 

Recently uploaded

IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaManalVerma4
 
Statistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfStatistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfnikeshsingh56
 
Role of Consumer Insights in business transformation
Role of Consumer Insights in business transformationRole of Consumer Insights in business transformation
Role of Consumer Insights in business transformationAnnie Melnic
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 
Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfNicoChristianSunaryo
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelBoston Institute of Analytics
 
Non Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfNon Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfPratikPatil591646
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfblazblazml
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBoston Institute of Analytics
 
Presentation of project of business person who are success
Presentation of project of business person who are successPresentation of project of business person who are success
Presentation of project of business person who are successPratikSingh115843
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 
DATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etcDATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etclalithasri22
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...Jack Cole
 

Recently uploaded (17)

IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in India
 
Statistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfStatistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdf
 
Role of Consumer Insights in business transformation
Role of Consumer Insights in business transformationRole of Consumer Insights in business transformation
Role of Consumer Insights in business transformation
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdf
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
 
Non Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfNon Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdf
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
Presentation of project of business person who are success
Presentation of project of business person who are successPresentation of project of business person who are success
Presentation of project of business person who are success
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
DATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etcDATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etc
 
2023 Survey Shows Dip in High School E-Cigarette Use
2023 Survey Shows Dip in High School E-Cigarette Use2023 Survey Shows Dip in High School E-Cigarette Use
2023 Survey Shows Dip in High School E-Cigarette Use
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
 

Better than you think: Handling JSON data in ClickHouse

  • 1. Better Than You Think: JSON Data in ClickHouse. Robert Hodges, Altinity. V 2021-09
  • 2. © 2021 Altinity, Inc. Presenter and Company Bio
    ● www.altinity.com: Enterprise provider for ClickHouse, a popular, open source data warehouse. We run Altinity.Cloud, the first managed ClickHouse in AWS and GCP.
    ● Robert Hodges, Altinity CEO: 30+ years on DBMS plus virtualization and security. Using ClickHouse since 2019.
  • 3. JSON is pervasive as raw data
    Web site access log in JSON format:
    {
      "@timestamp": 895873059,
      "clientip": "54.72.5.0",
      "request": "GET /images/home_bg_stars.gif HTTP/1.1",
      "status": 200,
      "size": 2557
    }
  • 4. How to model JSON data in ClickHouse
    ● Tabular: every key is a column
    ● Arrays: header values with key-value pairs (an array of keys and an array of values)
    ● JSON: header values with JSON string (“blob”)
  • 5. Simplest design maps all keys to columns
    CREATE TABLE http_logs_tabular
    (
        `@timestamp` DateTime,
        `clientip` IPv4,
        `status` UInt16,
        `request` String,
        `size` UInt32
    )
    ENGINE = MergeTree
    PARTITION BY toStartOfDay(`@timestamp`)
    ORDER BY `@timestamp`
    SETTINGS index_granularity = 8192
  • 6. Loading *might* be this easy
    head http_logs.json
    {"@timestamp": 895873059, "clientip":"54.72.5.0", "request": "GET /images/home_bg_stars.gif HTTP/1.1", "status": 200, "size": 2557}
    {"@timestamp": 895873059, "clientip":"53.72.5.0", "request": "GET /images/home_tool.gif HTTP/1.0", "status": 200, "size": 327}
    ...
    clickhouse-client --query \
      'INSERT INTO http_logs_tabular FORMAT JSONEachRow' \
      < http_logs.json
  • 7. But what happens if...
    ● You don’t know the columns up front?
    ● JSON keys and values differ between records?
    ● JSON values are dirty or don’t map trivially to columns?
    How can I handle that in ClickHouse?
  • 8. Start by storing the JSON as text
    CREATE TABLE http_logs
    (
        `file` String,
        `message` String  -- the JSON “blob”
    )
    ENGINE = MergeTree
    PARTITION BY file
    ORDER BY tuple()
    SETTINGS index_granularity = 8192
  • 9. Load data in whatever way is easiest...
    head http_logs.csv
    "file","message"
    "documents-211998.json","{""@timestamp"": 895873059, ""clientip"":""54.72.5.0"", ""request"": ""GET /images/home_bg_stars.gif HTTP/1.1"", ""status"": 200, ""size"": 2557}"
    "documents-211998.json","{""@timestamp"": 895873059, ""clientip"":""53.72.5.0"", ""request"": ""GET /images/home_tool.gif HTTP/1.0"", ""status"": 200, ""size"": 327}"
    ...
    clickhouse-client --query \
      'INSERT INTO http_logs FORMAT CSVWithNames' \
      < http_logs.csv
  • 10. You can query using JSON* functions
    -- Get a JSON string value
    SELECT JSONExtractString(message, 'request') AS request
    FROM http_logs LIMIT 3;
    -- Get a JSON numeric value
    SELECT JSONExtractInt(message, 'status') AS status
    FROM http_logs LIMIT 3;
    -- Use values to answer useful questions.
    SELECT JSONExtractInt(message, 'status') AS status, count() AS count
    FROM http_logs
    WHERE status >= 400
      AND toDateTime(JSONExtractUInt(message, '@timestamp'))
          BETWEEN '1998-05-20 00:00:00' AND '1998-05-20 23:59:59'
    GROUP BY status
    ORDER BY status;
  • 11. JSON* vs visitParam functions
    -- Get using JSON function
    -- SLOWER: complete JSON parser
    SELECT JSONExtractString(message, 'request')
    FROM http_logs LIMIT 3;
    -- Get using visitParam function
    -- FASTER: but cannot distinguish the same name in different structures
    SELECT visitParamExtractString(message, 'request')
    FROM http_logs LIMIT 3;
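The trade-off on this slide can be illustrated outside ClickHouse. The Python sketch below is not how either function family is implemented. It assumes a simple regular-expression scan as a stand-in for a name-based lookup, and it uses a made-up document in which the key `status` appears at two nesting levels.

```python
import json
import re

# Hypothetical document: "status" appears both nested and at the top level.
doc = '{"meta": {"status": "cached"}, "status": "ok"}'


def full_parse_extract(text, key):
    """Parse the whole document, then read the key at the top level only."""
    value = json.loads(text).get(key)
    return value if isinstance(value, str) else ""


def naive_scan_extract(text, key):
    """Return the first string value that follows "key": anywhere in the text."""
    pattern = r'"%s"\s*:\s*"([^"]*)"' % re.escape(key)
    match = re.search(pattern, text)
    return match.group(1) if match else ""


parsed = full_parse_extract(doc, "status")    # top-level value
scanned = naive_scan_extract(doc, "status")   # first textual match, here the nested one
print(parsed, scanned)
```

The scan never builds the document structure, which is why it can be cheaper, and also why it returns the nested value here while the full parse returns the top-level one.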
  • 12. We can improve usability by ordering data
    CREATE TABLE http_logs_sorted
    (
        `file` String,
        `message` String,
        timestamp DateTime DEFAULT toDateTime(JSONExtractUInt(message, '@timestamp'))
    )
    ENGINE = MergeTree
    PARTITION BY toStartOfMonth(timestamp)
    ORDER BY timestamp;
    INSERT INTO http_logs_sorted
    SELECT file, message FROM http_logs;
  • 13. And still further by adding more columns
    ALTER TABLE http_logs_sorted
      ADD COLUMN `status` Int16 DEFAULT JSONExtractInt(message, 'status') CODEC(ZSTD(1));
    ALTER TABLE http_logs_sorted
      ADD COLUMN `request` String DEFAULT JSONExtractString(message, 'request');
    -- Force columns to be materialized
    ALTER TABLE http_logs_sorted
      UPDATE status=status, request=request WHERE 1;
  • 14. Our query is now simpler...
    SELECT status, count() AS count
    FROM http_logs_sorted
    WHERE status >= 400
      AND timestamp BETWEEN '1998-05-20 00:00:00' AND '1998-05-20 23:59:59'
    GROUP BY status
    ORDER BY status
  • 15. And MUCH faster!
    SELECT status, count() AS count
    FROM http_logs_sorted
    WHERE status >= 400
      AND timestamp BETWEEN '1998-05-20 00:00:00' AND '1998-05-20 23:59:59'
    GROUP BY status
    ORDER BY status
    0.014 seconds vs 9.8 seconds!
    ● Can use primary key index to drop blocks
    ● 100x less I/O to read
  • 16. Extracted columns are like indexes
  • 17. (Here’s how we got that column size data)
    -- System tables are your friends
    SELECT table, name, data_compressed_bytes,
        formatReadableSize(data_compressed_bytes) AS tc,
        formatReadableSize(data_uncompressed_bytes) AS tu,
        data_compressed_bytes / data_uncompressed_bytes AS ratio,
        type, compression_codec
    FROM system.columns
    WHERE database = currentDatabase() AND table LIKE 'http%'
    ORDER BY table, name
  • 18. Another way to store JSON objects: Maps
    CREATE TABLE http_logs_map
    (
        `file` String,
        `message` Map(String, String),
        timestamp DateTime
            DEFAULT toDateTime(toUInt32(message['@timestamp']))
            CODEC(Delta, ZSTD(1))
    )
    ENGINE = MergeTree
    PARTITION BY toStartOfMonth(timestamp)
    ORDER BY timestamp
  • 19. Loading and querying JSON in Maps
    -- Load data
    INSERT INTO http_logs_map(file, message)
    SELECT file, JSONExtractKeysAndValues(message, 'String') message
    FROM http_logs;
    -- Run a query.
    SELECT message['status'] status, count()
    FROM http_logs_map
    GROUP BY status ORDER BY status;
    4-5x faster than accessing JSON string objects
  • 20. Storing JSON in paired arrays
    CREATE TABLE http_logs_arrays
    (
        `file` String,
        `keys` Array(String),
        `values` Array(String),
        timestamp DateTime CODEC(Delta, ZSTD(1))
    )
    ENGINE = MergeTree
    PARTITION BY toStartOfMonth(timestamp)
    ORDER BY timestamp
  • 21. Querying values in arrays
    -- Run a query.
    SELECT values[indexOf(keys, 'status')] status, count()
    FROM http_logs_arrays
    GROUP BY status ORDER BY status
    status|count() |
    ------|--------|
    200   |24917090|
    206   |   64935|
    302   |    1941|
    304   | 4899616|
    400   |     888|
    404   |  115005|
    500   |     525|
    4-5x faster than accessing JSON string objects
  • 22. So... What do paired arrays look like?
    SELECT * FROM http_logs_arrays LIMIT 3
    Row 1:
    ──────
    file:      documents-211998.json
    keys:      ['@timestamp','clientip','request','status','size']
    values:    ['895435201','30.20.6.0','GET /french/index.html HTTP/1.0','200','954']
    timestamp: 1998-05-17 20:00:01
    ...
  • 23. Loading JSON to paired arrays
    -- Load data. Might be better to format outside ClickHouse.
    INSERT INTO http_logs_arrays(file, keys, values, timestamp)
    SELECT file,
        arrayMap(x -> x.1, JSONExtractKeysAndValues(message, 'String')) keys,
        arrayMap(x -> x.2, JSONExtractKeysAndValues(message, 'String')) values,
        toDateTime(JSONExtractUInt(message, '@timestamp')) timestamp
    FROM http_logs
    LIMIT 30000000
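Since the slide notes that it might be better to format the data outside ClickHouse, here is a minimal Python sketch of the same reshaping: one flat JSON object becomes a list of keys and a parallel list of string values. The function names `to_paired_arrays` and `lookup` are hypothetical, and the sketch assumes flat records like the sample from the slides.

```python
import json

# Sample record taken from the slides.
message = (
    '{"@timestamp": 895873059, "clientip": "54.72.5.0", '
    '"request": "GET /images/home_bg_stars.gif HTTP/1.1", '
    '"status": 200, "size": 2557}'
)


def to_paired_arrays(text):
    """Split a flat JSON object into parallel lists of keys and string values."""
    keys, values = [], []
    for key, value in json.loads(text).items():
        keys.append(key)
        # Strings are kept as-is; other scalars are rendered as JSON text.
        values.append(value if isinstance(value, str) else json.dumps(value))
    return keys, values


def lookup(keys, values, key):
    """Mimic values[indexOf(keys, key)]; returning '' for a missing key is a choice of this sketch."""
    return values[keys.index(key)] if key in keys else ""


keys, values = to_paired_arrays(message)
status = lookup(keys, values, "status")
print(keys)
print(values)
print(status)
```

Python lists are 0-based while ClickHouse arrays are 1-based, so the index arithmetic differs even though the lookup idea is the same.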
  • 24. Must we always copy between tables? It seems painful. Surely there is a better way!
  • 25. Use materialized views to help with loading
    [Diagram: Nginx logs and web logs are loaded with INSERT INTO http_logs_etl; a materialized view (MV) then writes the rows to http_logs_arrays.]
    Enrich data with materialized view
  • 26. Create the base ETL table
    CREATE TABLE http_logs_etl
    (
        `file` String,
        `message` String
    )
    ENGINE = Null  -- Does not store data but triggers materialized views
  • 27. Create the target table (as before…)
    CREATE TABLE http_logs_arrays
    (
        `file` String,
        `keys` Array(String),
        `values` Array(String),
        timestamp DateTime CODEC(Delta, ZSTD(1))
    )
    ENGINE = MergeTree
    PARTITION BY toStartOfMonth(timestamp)
    ORDER BY timestamp
  • 28. Create materialized view for ETL
    -- Fires whenever a block arrives at http_logs_etl.
    CREATE MATERIALIZED VIEW http_logs_etl_mv
    TO http_logs_arrays
    AS SELECT file,
        arrayMap(x -> x.1, JSONExtractKeysAndValues(message, 'String')) keys,
        arrayMap(x -> x.2, JSONExtractKeysAndValues(message, 'String')) values,
        toDateTime(JSONExtractUInt(message, '@timestamp')) timestamp
    FROM http_logs_etl
    ● Add WHERE conditions to filter data
    ● Do any transforms you like!
  • 29. Now load data and have at it...
    -- Clear the target table to start from scratch.
    TRUNCATE TABLE http_logs_arrays;
    -- Load data via ETL table.
    INSERT INTO http_logs_etl(file, message)
    SELECT file, message FROM http_logs
    LIMIT 1000000;
    -- Confirm rows arrived in the arrays table.
    SELECT count() FROM http_logs_arrays;
    count()|
    -------|
    1000000|
    New data are instantly queryable
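The flow across slides 25 to 29 (insert into a staging table that keeps nothing, let a view transform each batch, read from the target) can be mimicked in a few lines of Python. This is only a conceptual sketch: the class `NullTable` and the helper `make_arrays_view` are hypothetical names, the sample rows are abbreviated from the slides, and nothing here reflects how ClickHouse executes materialized views internally.

```python
import json


class NullTable:
    """Stand-in for a table that stores nothing but forwards inserts to views."""

    def __init__(self):
        self.views = []

    def insert(self, rows):
        for view in self.views:
            view(rows)
        # Nothing is kept here: the rows are dropped after the views have run.


def make_arrays_view(target):
    """Build a view that reshapes each (file, message) row and appends it to target."""

    def view(rows):
        for file, message in rows:
            record = json.loads(message)
            target.append({
                "file": file,
                "keys": list(record),
                "values": [v if isinstance(v, str) else json.dumps(v)
                           for v in record.values()],
                "timestamp": record["@timestamp"],
            })

    return view


http_logs_arrays = []            # plays the role of the target table
http_logs_etl = NullTable()      # plays the role of the Null-engine table
http_logs_etl.views.append(make_arrays_view(http_logs_arrays))

# Abbreviated sample rows based on the slides.
http_logs_etl.insert([
    ("documents-211998.json",
     '{"@timestamp": 895873059, "clientip": "54.72.5.0", "status": 200}'),
    ("documents-211998.json",
     '{"@timestamp": 895873059, "clientip": "53.72.5.0", "status": 200}'),
])
print(len(http_logs_arrays))
```

The point of the design is that the loader only ever writes raw `(file, message)` rows; any filtering or reshaping lives in the view, so it can change without touching the loader.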
  • 30. Roadmap and more information
  • 31. ClickHouse JSON support is not perfect
    Complex JSON is hard to access!
    SELECT JSONExtractUInt(
             JSONExtractRaw(
               JSONExtractRaw(
                 '{"a": {"b": {"c": 1}}}',
                 'a'),
               'b'),
             'c') AS anInt
    JSON performance is hard to optimize!
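For comparison, the nested lookup on this slide amounts to following a path of keys through the parsed document. The Python sketch below shows that idea with a hypothetical helper, `extract_path`; it is an illustration of the access pattern, not a ClickHouse feature. (ClickHouse's JSONExtract* functions also accept several keys or indexes after the JSON argument, which can shorten chains like the one above; it is worth checking the behavior against your version.)

```python
import json
from functools import reduce


def extract_path(text, *keys):
    """Parse once, then follow the keys one level at a time."""
    return reduce(lambda node, key: node[key], keys, json.loads(text))


# Same document and path as on the slide.
an_int = extract_path('{"a": {"b": {"c": 1}}}', "a", "b", "c")
print(an_int)
```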
  • 32. What’s coming to help? Semi-structured data!
    ClickHouse will map JSON structure to columnar storage and provide convenient query syntax.
    https://github.com/ClickHouse/ClickHouse/issues/17623
  • 33. You can go a long way in the meantime!
    Lots of options to represent JSON in ClickHouse tables:
    ● Transform fully to columns
    ● Store JSON blob and add columns with indexes as needed
    ● Transform to Map
    ● Transform to paired arrays
    Load JSON blobs to staging table and transform with materialized views.
    Mix and match to process JSON in a way that works best for your apps!
    (Slide callout: “Fan Favorite”)
  • 34. Questions? Thank you!
    ● Altinity: https://altinity.com
    ● ClickHouse: https://github.com/ClickHouse/ClickHouse
    ● Altinity.Cloud: https://altinity.com/cloud-database/
    We are hiring!