SlideShare a Scribd company logo
1 of 43
Download to read offline
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Keisuke Suzuki
Software engineer
PlazmaDB
ペタバイトオーダのデータ分析
基盤を支える分散ストレージの
アーキテクチャとその運用
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Who am I?
Keisuke Suzuki
• Backend Engineer @ Treasure Data KK
– Ex. Fujitsu
• DB / Distributed system / Performance
optimization
• Twitter: @yajilobee
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Treasure Data
& PlazmaDB
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Arm Treasure Data eCDP
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
PlazmaDB
Streaming data
Bulk load
PlazmaDB
Metadata
(PostgreSQL)
AWS S3 /
Riak CS
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Daily Workload & Storage Size
Import Query Storage size
500 Billion Records / day
~ 5.8 Million Records / sec
5 PB (+5~10 TB / day)
55 Trillion Records
600,000 Queries / day
15 Trillion Records / day
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
PlazmaDB Features
- Columnar format
- Partitioned by time and optionally user defined column
- Schema less
- Partition index
- Partition optimization
- Merge partitions
- Realtime Storage & Archive Storage
- Transaction
- Read committed isolation
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
PlazmaDB Features
- Columnar format
- Partitioned by time and optionally user defined column
- Schema less
- Partition index
- Partition optimization
- Merge partitions
- Realtime Storage & Archive Storage
- Transaction
- Read committed isolation
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Log data
time orderid user region price ...
2018-01-01 10:00:00 1 1 ‘A’ 10000
2018-01-01 10:03:03 2 7 ‘C’ 40000
2018-01-01 10:23:03 3 6 ‘B’ 3000
2018-01-01 10:23:12 4 3 ‘A’ 5500
2018-01-01 11:04:44 5 1 ‘A’ 20000
2018-01-01 11:30:00 6 8 ‘C’ 3000
...
Many columns (attributes)
Accumulate
over time
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Analytical Query
SELECT
region,
SUM(price)
FROM
orders
WHERE time >= ‘2018-01-01 10:00’
AND time <= ‘2018-01-01 11:00’
GROUP BY
region
Few part of columns
Filter by time window
time orderid user region price ...
2018-01-01 10:00:00 1 1 ‘A’ 10000
2018-01-01 10:03:03 2 7 ‘C’ 40000
2018-01-01 10:23:03 3 6 ‘B’ 3000
2018-01-01 10:23:12 4 3 ‘A’ 5500
2018-01-01 11:04:44 5 1 ‘A’ 20000
2018-01-01 11:30:00 6 8 ‘C’ 3000
...
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
time orderid user region price ...
2018-01-01 10:00:00 1 1 ‘A’ 10000
2018-01-01 10:03:03 2 7 ‘C’ 40000
2018-01-01 10:23:03 3 6 ‘B’ 3000
2018-01-01 10:23:12 4 3 ‘A’ 5500
2018-01-01 11:04:44 5 1 ‘A’ 20000
2018-01-01 11:30:00 6 8 ‘C’ 3000
...
Inefficiency of Row Based Format
SELECT
region,
SUM(price)
FROM
orders
WHERE time >= ‘2018-01-01 10:00’
AND time <= ‘2018-01-01 11:00’
GROUP BY
region
Scan direction Scanned data
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
time orderid user region price ...
2018-01-01 10:00:00 1 1 ‘A’ 10000
2018-01-01 10:03:03 2 7 ‘C’ 40000
2018-01-01 10:23:03 3 6 ‘B’ 3000
2018-01-01 10:23:12 4 3 ‘A’ 5500
2018-01-01 11:04:44 5 1 ‘A’ 20000
2018-01-01 11:30:00 6 8 ‘C’ 3000
...
Columnar Format
SELECT
region,
SUM(price)
FROM
orders
WHERE time >= ‘2018-01-01 10:00’
AND time <= ‘2018-01-01 11:00’
GROUP BY
region
Scan direction Scanned data
Few part of columns
Filter by time window
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
2018-01-01 10:23:03 3 6 ...
2018-01-01 10:23:12 4 3 ...
PlazmaDB
PlazmaDB Partitions
2018-01-01 10:00:00 1 1 ...
2018-01-01 10:03:03 2 7 ...
2018-01-01 11:04:44 5 1 ...
{“time”: “2018-01-01 10:00:00”, “orderid”: 1, …},
{“time”: “2018-01-01 10:03:03”, “orderid”: 2, …}
Worker{“time”: “2018-01-01 10:23:03”, “orderid”: 3, …},
{“time”: “2018-01-01 10:23:12”, “orderid”: 4, …}
{“time”: “2018-01-01 11:04:44”, “orderid”: 5, …}
Application
Partition file: A S3/RiakCS Object
Send logs periodically
Convert to columnar
& Store a partition
Table is collection of partitions
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
2018-01-01 10:23:03 3 6 ...
2018-01-01 10:23:12 4 3 ...
Meta DB (PostgreSQL)
PlazmaDB Metadata
2018-01-01 10:00:00 1 1 ...
2018-01-01 10:03:03 2 7 ...
2018-01-01 11:04:44 5 1 ...
data_set_id path ...
1
1
1
2
2018-01-01 10:00:00 1 1 ...
2018-01-01 10:03:03 2 7 ...
PlazmaDB is Multi tenant
data_set_id: ID combination of User, Database, Table
Data set 1
Data set 2
AWS S3 / RiakCS
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
2018-01-01 10:23:03 3 6 ...
2018-01-01 10:23:12 4 3 ...
Meta DB (PostgreSQL)
Partition Index
2018-01-01 10:00:00 1 1 ...
2018-01-01 10:03:03 2 7 ...
2018-01-01 11:04:44 6 1 ...
data_set_id time_range path ...
1 [2018-01-01 10:00:00,
2018-01-01 10:03:03]
1 [2018-01-01 10:23:03,
2018-01-01 10:23:12]
1 [2018-01-01 11:04:44,
2018-01-01 10:04:44]
2
2018-01-01 10:00:00 1 1 ...
2018-01-01 10:03:03 2 7 ...
Data set 1
Data set 2
AWS S3 / RiakCS
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Partition Lookup
SELECT
region,
SUM(price)
FROM
orders -- assume this is data set 1
WHERE time >= ‘2018-01-01 10:00’
AND time <= ‘2018-01-01 11:00’
GROUP BY
region
Meta DB (PostgreSQL)
data_set_id time_range path ...
1 [2018-01-01 10:00:00,
2018-01-01 10:03:03]
1 [2018-01-01 10:23:03,
2018-01-01 10:23:12]
1 [2018-01-01 11:04:44,
2018-01-01 10:04:44]
2
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
2018-01-01 10:23:03 3 6 ‘B’ 3000
2018-01-01 10:23:12 4 3 ‘A’ 5500
time orderid user region price ...
2018-01-01 10:00:00 1 1 ‘A’ 10000
2018-01-01 10:03:03 2 7 ‘C’ 40000
Skip Partition Scan
SELECT
region,
SUM(price)
FROM
orders
WHERE time >= ‘2018-01-01 10:00’
AND time <= ‘2018-01-01 11:00’
GROUP BY
region
Scan direction Scanned data
2018-01-01 11:04:44 5 1 ‘A’ 20000
2018-01-01 11:30:00 6 8 ‘C’ 3000
...
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
How to find Partitions?
SELECT
region,
SUM(price)
FROM
orders -- assume this is data set 1
WHERE time >= ‘2018-01-01 10:00’
AND time <= ‘2018-01-01 11:00’
GROUP BY
region
Meta DB (PostgreSQL)
data_set_id time_range path ...
1 [2018-01-01 10:00:00,
2018-01-01 10:03:03]
1 [2018-01-01 10:23:03,
2018-01-01 10:23:12]
1 [2018-01-01 11:04:44,
2018-01-01 10:04:44]
2
Number of partitions in a data set can be
large (1M+) for large tables.
?
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Range Type and GiST Index of PostgreSQL
Meta DB (PostgreSQL)
data_set_id time_range path ...
1 [2018-01-01 10:00:00,
2018-01-01 10:03:03]
1 [2018-01-01 10:23:03,
2018-01-01 10:23:12]
1 [2018-01-01 11:04:44,
2018-01-01 10:04:44]
2
GiST Index
Range type
• Overlap operator
time_range && [2018-01-01 10:00, 2018-01-01 11:00]
Overlap is checked by index scan
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
User Defined Partition (Beta)
SELECT … FROM … WHERE time > … AND time <... AND region = ‘A’
Time Time
Region(userdefinedkey)
A
C
B
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
PlazmaDB
Partition Fragmentation
Data set 1
Data set 2
Presto worker
Hive worker
● Latency (30ms+)
● Get operation cost
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Data set 2
PlazmaDB
Data set 2
Data set 1
Realtime Storage & Archive Storage
Realtime Storage Archive Storage
Partitions
imported 1 hour
Merge
Data set 1
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Data set 2
PlazmaDB
Data set 2
Data set 1
Realtime Storage & Archive Storage
Realtime Storage Archive Storage
Partitions
imported 1 hour
Merge
Data set 1
Reduced to 1/20 - 1/100 partitions
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Streaming & Bulk data upload
Realtime Storage Archive Storage
Bulk loadStreaming data
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Fragmentation of Archive Storage
data_set_id time_range import_time ...
1 [... 10:00:00, ... 10:03:03] … 10:05:00
1 [... 10:23:03, ... 10:23:12] … 10:25:00
...
1 [... 20:30:00, ... 20:34:44] … 20:35:00
1 [... 10:00:44, ... 20:44:14] … 20:45:00
...
data_set_id time_range ...
1 [... 10:00:00, ... 11:00:00]
...
1 [... 10:00:44, … 11:00:00]
1 [... 15:04:38, ... 15:34:44]
1 [... 20:00:00, ... 21:00:00]
...
Realtime Storage Archive Storage
Delayed data Split by 1 hour window
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Remerge Partitions
Archive Storage
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Re: PlazmaDB features
- Columnar format
- Partitioned by time and optionally user defined column
- Schema less
- Partition index
- Partition optimization
- Merge partitions
- Realtime Storage & Archive Storage
- Transaction
- Read committed isolation
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Current PlazmaDB
& Future Challenges
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Data Volume
PlazmaDB
Meta DB (PostgreSQL)
Realtime Storage Archive Storage
AWS S3 / Riak CS
5 PB
GiST GiST
Partition
Metadata
Partition
Metadata
1 TB
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Monitoring
• Arm Treasure Data
– Detailed log analyze
• DataDog
– Metrics visualization
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Read Workload on Meta DB
• Metadata size ~ 1TB
• Shared Buffer size ~ 150GB
• But, Hot Data size is much
smaller than Shared Buffer
# of Read Requested on Data Set /day
#ofPartitionsRead/day
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Write Workload on Meta DB
200k transaction / min
~ 3k transaction / sec
3 MB / sec
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
PostgreSQL Auto VACUUM (FREEZE)
VACUUM FREEZE: Vacuum to prevent transaction ID wraparound failures
• Force full scan on relation (as of PostgreSQL 9.4)
– Hot data may be evicted to scan relations for vacuum
=> Read workload can be affected
– PostgreSQL 9.6 or later mitigate the problem
• The more transaction IDs are consumed, the more vacuum can be happened
– In our case, it happens every 2-3 day
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Impact of VACUUM FREEZE
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
GiST Index Bloat
100GB
150GB
50GB
200GB
B-tree
GiST
Bloat 30-40GB/month
Reindex (~20h)
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Metadata Table Partitioning
GiST
Partition
Metadata
One relation’s reindex becomes shorter and space saving
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Meta DB Scale out
DB1 DB2 DB3
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
More Partition Skip
SELECT
region,
SUM(price)
FROM
orders
WHERE time >= ‘2018-01-01 10:00’
AND time <= ‘2018-01-01 11:00’
AND user_age >= 20
AND user_age <= 30
GROUP BY
region
Meta DB (PostgreSQL)
data_set_id time_range path ...
1 [2018-01-01 10:00:00,
2018-01-01 10:03:03]
1 [2018-01-01 10:23:03,
2018-01-01 10:23:12]
1 [2018-01-01 11:04:44,
2018-01-01 10:04:44]
2
Scan partition & Filter
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
More Partition Skip
SELECT
region,
SUM(price)
FROM
orders
WHERE time >= ‘2018-01-01 10:00’
AND time <= ‘2018-01-01 11:00’
AND user_age >= 20
AND user_age <= 30
GROUP BY
region
Meta DB (PostgreSQL)
data_set_id time_range user_age_range
1 [2018-01-01 10:00:00,
2018-01-01 10:03:03]
[35, 40]
1 [2018-01-01 10:23:03,
2018-01-01 10:23:12]
[25, 30]
1 [2018-01-01 11:04:44,
2018-01-01 10:04:44]
2
Store metadata on frequently accessed columns
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Smart Partition Selection for Remerge
Remerge is resource consuming
• Current
– Data set size
– # of Partitions
• Idea
– Access frequency
– Data freshness
Fresh data is likely to be hot
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
Summary
- PlazmaDB: Storage Layer of Arm Treasure Data Analytics Platform
- Optimization for Analytical Queries
- Columnar + Time Partitioning + Partition Index
- Optimization for Streaming data
- Realtime & Archive Storage + Merge Partition
- Challenges
- Reduce impact of PostgreSQL VACUUM FREEZE
- GiST index management
- More Partition Optimization
- Enrich Metadata
- Smart Remerge
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.
We are Hiring!!
https://www.treasuredata.com/company/careers/
Thank You!
Danke!
Merci!
谢谢!
Gracias!
Kiitos!
Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.

More Related Content

What's hot

ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...Altinity Ltd
 
MongoDB - External Authentication
MongoDB - External AuthenticationMongoDB - External Authentication
MongoDB - External AuthenticationJason Terpko
 
Data pipeline and data lake
Data pipeline and data lakeData pipeline and data lake
Data pipeline and data lakeDaeMyung Kang
 
Rainbird: Realtime Analytics at Twitter (Strata 2011)
Rainbird: Realtime Analytics at Twitter (Strata 2011)Rainbird: Realtime Analytics at Twitter (Strata 2011)
Rainbird: Realtime Analytics at Twitter (Strata 2011)Kevin Weil
 
Meetup: Streaming Data Pipeline Development
Meetup:  Streaming Data Pipeline DevelopmentMeetup:  Streaming Data Pipeline Development
Meetup: Streaming Data Pipeline DevelopmentTimothy Spann
 
[NDC18] 야생의 땅 듀랑고의 데이터 엔지니어링 이야기: 로그 시스템 구축 경험 공유 (2부)
[NDC18] 야생의 땅 듀랑고의 데이터 엔지니어링 이야기: 로그 시스템 구축 경험 공유 (2부)[NDC18] 야생의 땅 듀랑고의 데이터 엔지니어링 이야기: 로그 시스템 구축 경험 공유 (2부)
[NDC18] 야생의 땅 듀랑고의 데이터 엔지니어링 이야기: 로그 시스템 구축 경험 공유 (2부)Hyojun Jeon
 
Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Sadayuki Furuhashi
 
Etl is Dead; Long Live Streams
Etl is Dead; Long Live StreamsEtl is Dead; Long Live Streams
Etl is Dead; Long Live Streamsconfluent
 
Spark + S3 + R3를 이용한 데이터 분석 시스템 만들기
Spark + S3 + R3를 이용한 데이터 분석 시스템 만들기Spark + S3 + R3를 이용한 데이터 분석 시스템 만들기
Spark + S3 + R3를 이용한 데이터 분석 시스템 만들기AWSKRUG - AWS한국사용자모임
 
ClickHouse Introduction by Alexander Zaitsev, Altinity CTO
ClickHouse Introduction by Alexander Zaitsev, Altinity CTOClickHouse Introduction by Alexander Zaitsev, Altinity CTO
ClickHouse Introduction by Alexander Zaitsev, Altinity CTOAltinity Ltd
 
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystemStrata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystemShirshanka Das
 
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...Flink Forward
 
Your first ClickHouse data warehouse
Your first ClickHouse data warehouseYour first ClickHouse data warehouse
Your first ClickHouse data warehouseAltinity Ltd
 
Optimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL JoinsOptimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL JoinsDatabricks
 
[236] 카카오의데이터파이프라인 윤도영
[236] 카카오의데이터파이프라인 윤도영[236] 카카오의데이터파이프라인 윤도영
[236] 카카오의데이터파이프라인 윤도영NAVER D2
 
Practical Memory Tuning for PostgreSQL
Practical Memory Tuning for PostgreSQLPractical Memory Tuning for PostgreSQL
Practical Memory Tuning for PostgreSQLGrant McAlister
 
How I learned to time travel, or, data pipelining and scheduling with Airflow
How I learned to time travel, or, data pipelining and scheduling with AirflowHow I learned to time travel, or, data pipelining and scheduling with Airflow
How I learned to time travel, or, data pipelining and scheduling with AirflowPyData
 
Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드
Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드
Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드confluent
 
Amazon SageMaker를 통한 대용량 모델 훈련 방법 살펴보기 - 김대근 AWS AI/ML 스페셜리스트 솔루션즈 아키텍트 / 최영준...
Amazon SageMaker를 통한 대용량 모델 훈련 방법 살펴보기 - 김대근 AWS AI/ML 스페셜리스트 솔루션즈 아키텍트 / 최영준...Amazon SageMaker를 통한 대용량 모델 훈련 방법 살펴보기 - 김대근 AWS AI/ML 스페셜리스트 솔루션즈 아키텍트 / 최영준...
Amazon SageMaker를 통한 대용량 모델 훈련 방법 살펴보기 - 김대근 AWS AI/ML 스페셜리스트 솔루션즈 아키텍트 / 최영준...Amazon Web Services Korea
 

What's hot (20)

ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
 
MongoDB - External Authentication
MongoDB - External AuthenticationMongoDB - External Authentication
MongoDB - External Authentication
 
Data pipeline and data lake
Data pipeline and data lakeData pipeline and data lake
Data pipeline and data lake
 
Rainbird: Realtime Analytics at Twitter (Strata 2011)
Rainbird: Realtime Analytics at Twitter (Strata 2011)Rainbird: Realtime Analytics at Twitter (Strata 2011)
Rainbird: Realtime Analytics at Twitter (Strata 2011)
 
Meetup: Streaming Data Pipeline Development
Meetup:  Streaming Data Pipeline DevelopmentMeetup:  Streaming Data Pipeline Development
Meetup: Streaming Data Pipeline Development
 
[NDC18] 야생의 땅 듀랑고의 데이터 엔지니어링 이야기: 로그 시스템 구축 경험 공유 (2부)
[NDC18] 야생의 땅 듀랑고의 데이터 엔지니어링 이야기: 로그 시스템 구축 경험 공유 (2부)[NDC18] 야생의 땅 듀랑고의 데이터 엔지니어링 이야기: 로그 시스템 구축 경험 공유 (2부)
[NDC18] 야생의 땅 듀랑고의 데이터 엔지니어링 이야기: 로그 시스템 구축 경험 공유 (2부)
 
Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1
 
Etl is Dead; Long Live Streams
Etl is Dead; Long Live StreamsEtl is Dead; Long Live Streams
Etl is Dead; Long Live Streams
 
Spark + S3 + R3를 이용한 데이터 분석 시스템 만들기
Spark + S3 + R3를 이용한 데이터 분석 시스템 만들기Spark + S3 + R3를 이용한 데이터 분석 시스템 만들기
Spark + S3 + R3를 이용한 데이터 분석 시스템 만들기
 
ClickHouse Introduction by Alexander Zaitsev, Altinity CTO
ClickHouse Introduction by Alexander Zaitsev, Altinity CTOClickHouse Introduction by Alexander Zaitsev, Altinity CTO
ClickHouse Introduction by Alexander Zaitsev, Altinity CTO
 
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystemStrata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem
 
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...
 
Your first ClickHouse data warehouse
Your first ClickHouse data warehouseYour first ClickHouse data warehouse
Your first ClickHouse data warehouse
 
ClickHouse Intro
ClickHouse IntroClickHouse Intro
ClickHouse Intro
 
Optimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL JoinsOptimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL Joins
 
[236] 카카오의데이터파이프라인 윤도영
[236] 카카오의데이터파이프라인 윤도영[236] 카카오의데이터파이프라인 윤도영
[236] 카카오의데이터파이프라인 윤도영
 
Practical Memory Tuning for PostgreSQL
Practical Memory Tuning for PostgreSQLPractical Memory Tuning for PostgreSQL
Practical Memory Tuning for PostgreSQL
 
How I learned to time travel, or, data pipelining and scheduling with Airflow
How I learned to time travel, or, data pipelining and scheduling with AirflowHow I learned to time travel, or, data pipelining and scheduling with Airflow
How I learned to time travel, or, data pipelining and scheduling with Airflow
 
Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드
Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드
Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드
 
Amazon SageMaker를 통한 대용량 모델 훈련 방법 살펴보기 - 김대근 AWS AI/ML 스페셜리스트 솔루션즈 아키텍트 / 최영준...
Amazon SageMaker를 통한 대용량 모델 훈련 방법 살펴보기 - 김대근 AWS AI/ML 스페셜리스트 솔루션즈 아키텍트 / 최영준...Amazon SageMaker를 통한 대용량 모델 훈련 방법 살펴보기 - 김대근 AWS AI/ML 스페셜리스트 솔루션즈 아키텍트 / 최영준...
Amazon SageMaker를 통한 대용량 모델 훈련 방법 살펴보기 - 김대근 AWS AI/ML 스페셜리스트 솔루션즈 아키텍트 / 최영준...
 

Similar to 201809 DB tech showcase

InfluxDB 101 – Concepts and Architecture by Michael DeSa, Software Engineer |...
InfluxDB 101 – Concepts and Architecture by Michael DeSa, Software Engineer |...InfluxDB 101 – Concepts and Architecture by Michael DeSa, Software Engineer |...
InfluxDB 101 – Concepts and Architecture by Michael DeSa, Software Engineer |...InfluxData
 
Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftAmazon Web Services
 
Recent Changes and Challenges for Future Presto
Recent Changes and Challenges for Future PrestoRecent Changes and Challenges for Future Presto
Recent Changes and Challenges for Future PrestoKai Sasaki
 
Improve data engineering work with Digdag and Presto UDF
Improve data engineering work with Digdag and Presto UDFImprove data engineering work with Digdag and Presto UDF
Improve data engineering work with Digdag and Presto UDFKentaro Yoshida
 
Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftAmazon Web Services
 
Data Warehousing with Amazon Redshift: Data Analytics Week SF
Data Warehousing with Amazon Redshift: Data Analytics Week SFData Warehousing with Amazon Redshift: Data Analytics Week SF
Data Warehousing with Amazon Redshift: Data Analytics Week SFAmazon Web Services
 
Extending Analytics Beyond the Data Warehouse, ft. Warner Bros. Analytics (AN...
Extending Analytics Beyond the Data Warehouse, ft. Warner Bros. Analytics (AN...Extending Analytics Beyond the Data Warehouse, ft. Warner Bros. Analytics (AN...
Extending Analytics Beyond the Data Warehouse, ft. Warner Bros. Analytics (AN...Amazon Web Services
 
Deep Dive on Amazon Aurora with PostgreSQL Compatibility (DAT305-R1) - AWS re...
Deep Dive on Amazon Aurora with PostgreSQL Compatibility (DAT305-R1) - AWS re...Deep Dive on Amazon Aurora with PostgreSQL Compatibility (DAT305-R1) - AWS re...
Deep Dive on Amazon Aurora with PostgreSQL Compatibility (DAT305-R1) - AWS re...Amazon Web Services
 
Data Warehousing with Amazon Redshift: Data Analytics Week at the SF Loft
Data Warehousing with Amazon Redshift: Data Analytics Week at the SF LoftData Warehousing with Amazon Redshift: Data Analytics Week at the SF Loft
Data Warehousing with Amazon Redshift: Data Analytics Week at the SF LoftAmazon Web Services
 
Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftAmazon Web Services
 
(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon RedshiftAmazon Web Services
 
Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftAmazon Web Services
 
Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftAmazon Web Services
 
Barga IC2E & IoTDI'16 Keynote
Barga IC2E & IoTDI'16 KeynoteBarga IC2E & IoTDI'16 Keynote
Barga IC2E & IoTDI'16 KeynoteRoger Barga
 
Managing your Black Friday Logs NDC Oslo
Managing your  Black Friday Logs NDC OsloManaging your  Black Friday Logs NDC Oslo
Managing your Black Friday Logs NDC OsloDavid Pilato
 
Presto At Arm Treasure Data - 2019 Updates
Presto At Arm Treasure Data - 2019 UpdatesPresto At Arm Treasure Data - 2019 Updates
Presto At Arm Treasure Data - 2019 UpdatesTaro L. Saito
 
Query your data in S3 with SQL and optimize for cost and performance
Query your data in S3 with SQL and optimize for cost and performanceQuery your data in S3 with SQL and optimize for cost and performance
Query your data in S3 with SQL and optimize for cost and performanceAWS Germany
 
Data preparation and transformation - Spin your straw into gold - Tel Aviv Su...
Data preparation and transformation - Spin your straw into gold - Tel Aviv Su...Data preparation and transformation - Spin your straw into gold - Tel Aviv Su...
Data preparation and transformation - Spin your straw into gold - Tel Aviv Su...Amazon Web Services
 
Big data processing with PubSub, Dataflow, and BigQuery
Big data processing with PubSub, Dataflow, and BigQueryBig data processing with PubSub, Dataflow, and BigQuery
Big data processing with PubSub, Dataflow, and BigQueryThuyen Ho
 

Similar to 201809 DB tech showcase (20)

201810 td tech_talk
201810 td tech_talk201810 td tech_talk
201810 td tech_talk
 
InfluxDB 101 – Concepts and Architecture by Michael DeSa, Software Engineer |...
InfluxDB 101 – Concepts and Architecture by Michael DeSa, Software Engineer |...InfluxDB 101 – Concepts and Architecture by Michael DeSa, Software Engineer |...
InfluxDB 101 – Concepts and Architecture by Michael DeSa, Software Engineer |...
 
Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon Redshift
 
Recent Changes and Challenges for Future Presto
Recent Changes and Challenges for Future PrestoRecent Changes and Challenges for Future Presto
Recent Changes and Challenges for Future Presto
 
Improve data engineering work with Digdag and Presto UDF
Improve data engineering work with Digdag and Presto UDFImprove data engineering work with Digdag and Presto UDF
Improve data engineering work with Digdag and Presto UDF
 
Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon Redshift
 
Data Warehousing with Amazon Redshift: Data Analytics Week SF
Data Warehousing with Amazon Redshift: Data Analytics Week SFData Warehousing with Amazon Redshift: Data Analytics Week SF
Data Warehousing with Amazon Redshift: Data Analytics Week SF
 
Extending Analytics Beyond the Data Warehouse, ft. Warner Bros. Analytics (AN...
Extending Analytics Beyond the Data Warehouse, ft. Warner Bros. Analytics (AN...Extending Analytics Beyond the Data Warehouse, ft. Warner Bros. Analytics (AN...
Extending Analytics Beyond the Data Warehouse, ft. Warner Bros. Analytics (AN...
 
Deep Dive on Amazon Aurora with PostgreSQL Compatibility (DAT305-R1) - AWS re...
Deep Dive on Amazon Aurora with PostgreSQL Compatibility (DAT305-R1) - AWS re...Deep Dive on Amazon Aurora with PostgreSQL Compatibility (DAT305-R1) - AWS re...
Deep Dive on Amazon Aurora with PostgreSQL Compatibility (DAT305-R1) - AWS re...
 
Data Warehousing with Amazon Redshift: Data Analytics Week at the SF Loft
Data Warehousing with Amazon Redshift: Data Analytics Week at the SF LoftData Warehousing with Amazon Redshift: Data Analytics Week at the SF Loft
Data Warehousing with Amazon Redshift: Data Analytics Week at the SF Loft
 
Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon Redshift
 
(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift
 
Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon Redshift
 
Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon Redshift
 
Barga IC2E & IoTDI'16 Keynote
Barga IC2E & IoTDI'16 KeynoteBarga IC2E & IoTDI'16 Keynote
Barga IC2E & IoTDI'16 Keynote
 
Managing your Black Friday Logs NDC Oslo
Managing your  Black Friday Logs NDC OsloManaging your  Black Friday Logs NDC Oslo
Managing your Black Friday Logs NDC Oslo
 
Presto At Arm Treasure Data - 2019 Updates
Presto At Arm Treasure Data - 2019 UpdatesPresto At Arm Treasure Data - 2019 Updates
Presto At Arm Treasure Data - 2019 Updates
 
Query your data in S3 with SQL and optimize for cost and performance
Query your data in S3 with SQL and optimize for cost and performanceQuery your data in S3 with SQL and optimize for cost and performance
Query your data in S3 with SQL and optimize for cost and performance
 
Data preparation and transformation - Spin your straw into gold - Tel Aviv Su...
Data preparation and transformation - Spin your straw into gold - Tel Aviv Su...Data preparation and transformation - Spin your straw into gold - Tel Aviv Su...
Data preparation and transformation - Spin your straw into gold - Tel Aviv Su...
 
Big data processing with PubSub, Dataflow, and BigQuery
Big data processing with PubSub, Dataflow, and BigQueryBig data processing with PubSub, Dataflow, and BigQuery
Big data processing with PubSub, Dataflow, and BigQuery
 

Recently uploaded

VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineeringmalavadedarshan25
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learningmisbanausheenparvam
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxvipinkmenon1
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx959SahilShah
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girlsssuser7cb4ff
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLDeelipZope
 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and usesDevarapalliHaritha
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSCAESB
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...srsj9000
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
 

Recently uploaded (20)

VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineering
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learning
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptx
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girls
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCL
 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and uses
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentation
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 

201809 DB tech showcase

  • 1. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Keisuke Suzuki Software engineer PlazmaDB ペタバイトオーダのデータ分析 基盤を支える分散ストレージの アーキテクチャとその運用
  • 2. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Who am I? Keisuke Suzuki • Backend Engineer @ Treasure Data KK – Ex. Fujitsu • DB / Distributed system / Performance optimization • Twitter: @yajilobee
  • 3. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Treasure Data & PlazmaDB
  • 4. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Arm Treasure Data eCDP
  • 5. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. PlazmaDB Streaming data Bulk load PlazmaDB Metadata (PostgreSQL) AWS S3 / Riak CS
  • 6. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Daily Workload & Storage Size Import Query Storage size 500 Billion Records / day ~ 5.8 Million Records / sec 5 PB (+5~10 TB / day) 55 Trillion Records 600,000 Queries / day 15 Trillion Records / day
  • 7. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. PlazmaDB Features - Columnar format - Partitioned by time and optionally user defined column - Schema less - Partition index - Partition optimization - Merge partitions - Realtime Storage & Archive Storage - Transaction - Read committed isolation
  • 8. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. PlazmaDB Features - Columnar format - Partitioned by time and optionally user defined column - Schema less - Partition index - Partition optimization - Merge partitions - Realtime Storage & Archive Storage - Transaction - Read committed isolation
  • 9. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Log data time orderid user region price ... 2018-01-01 10:00:00 1 1 ‘A’ 10000 2018-01-01 10:03:03 2 7 ‘C’ 40000 2018-01-01 10:23:03 3 6 ‘B’ 3000 2018-01-01 10:23:12 4 3 ‘A’ 5500 2018-01-01 11:04:44 5 1 ‘A’ 20000 2018-01-01 11:30:00 6 8 ‘C’ 3000 ... Many columns (attributes) Accumulate over time
  • 10. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Analytical Query SELECT region, SUM(price) FROM orders WHERE time >= ‘2018-01-01 10:00’ AND time <= ‘2018-01-01 11:00’ GROUP BY region Few part of columns Filter by time window time orderid user region price ... 2018-01-01 10:00:00 1 1 ‘A’ 10000 2018-01-01 10:03:03 2 7 ‘C’ 40000 2018-01-01 10:23:03 3 6 ‘B’ 3000 2018-01-01 10:23:12 4 3 ‘A’ 5500 2018-01-01 11:04:44 5 1 ‘A’ 20000 2018-01-01 11:30:00 6 8 ‘C’ 3000 ...
  • 11. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. time orderid user region price ... 2018-01-01 10:00:00 1 1 ‘A’ 10000 2018-01-01 10:03:03 2 7 ‘C’ 40000 2018-01-01 10:23:03 3 6 ‘B’ 3000 2018-01-01 10:23:12 4 3 ‘A’ 5500 2018-01-01 11:04:44 5 1 ‘A’ 20000 2018-01-01 11:30:00 6 8 ‘C’ 3000 ... Inefficiency of Row Based Format SELECT region, SUM(price) FROM orders WHERE time >= ‘2018-01-01 10:00’ AND time <= ‘2018-01-01 11:00’ GROUP BY region Scan direction Scanned data
  • 12. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. time orderid user region price ... 2018-01-01 10:00:00 1 1 ‘A’ 10000 2018-01-01 10:03:03 2 7 ‘C’ 40000 2018-01-01 10:23:03 3 6 ‘B’ 3000 2018-01-01 10:23:12 4 3 ‘A’ 5500 2018-01-01 11:04:44 5 1 ‘A’ 20000 2018-01-01 11:30:00 6 8 ‘C’ 3000 ... Columnar Format SELECT region, SUM(price) FROM orders WHERE time >= ‘2018-01-01 10:00’ AND time <= ‘2018-01-01 11:00’ GROUP BY region Scan direction Scanned data Few part of columns Filter by time window
  • 13. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. 2018-01-01 10:23:03 3 6 ... 2018-01-01 10:23:12 4 3 ... PlazmaDB PlazmaDB Partitions 2018-01-01 10:00:00 1 1 ... 2018-01-01 10:03:03 2 7 ... 2018-01-01 11:04:44 5 1 ... {“time”: “2018-01-01 10:00:00”, “orderid”: 1, …}, {“time”: “2018-01-01 10:03:03”, “orderid”: 2, …} Worker{“time”: “2018-01-01 10:23:03”, “orderid”: 3, …}, {“time”: “2018-01-01 10:23:12”, “orderid”: 4, …} {“time”: “2018-01-01 11:04:44”, “orderid”: 5, …} Application Partition file: A S3/RiakCS Object Send logs periodically Convert to columnar & Store a partition Table is collection of partitions
  • 14. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. 2018-01-01 10:23:03 3 6 ... 2018-01-01 10:23:12 4 3 ... Meta DB (PostgreSQL) PlazmaDB Metadata 2018-01-01 10:00:00 1 1 ... 2018-01-01 10:03:03 2 7 ... 2018-01-01 11:04:44 5 1 ... data_set_id path ... 1 1 1 2 2018-01-01 10:00:00 1 1 ... 2018-01-01 10:03:03 2 7 ... PlazmaDB is Multi tenant data_set_id: ID combination of User, Database, Table Data set 1 Data set 2 AWS S3 / RiakCS
  • 15. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. 2018-01-01 10:23:03 3 6 ... 2018-01-01 10:23:12 4 3 ... Meta DB (PostgreSQL) Partition Index 2018-01-01 10:00:00 1 1 ... 2018-01-01 10:03:03 2 7 ... 2018-01-01 11:04:44 6 1 ... data_set_id time_range path ... 1 [2018-01-01 10:00:00, 2018-01-01 10:03:03] 1 [2018-01-01 10:23:03, 2018-01-01 10:23:12] 1 [2018-01-01 11:04:44, 2018-01-01 10:04:44] 2 2018-01-01 10:00:00 1 1 ... 2018-01-01 10:03:03 2 7 ... Data set 1 Data set 2 AWS S3 / RiakCS
  • 16. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Partition Lookup SELECT region, SUM(price) FROM orders -- assume this is data set 1 WHERE time >= ‘2018-01-01 10:00’ AND time <= ‘2018-01-01 11:00’ GROUP BY region Meta DB (PostgreSQL) data_set_id time_range path ... 1 [2018-01-01 10:00:00, 2018-01-01 10:03:03] 1 [2018-01-01 10:23:03, 2018-01-01 10:23:12] 1 [2018-01-01 11:04:44, 2018-01-01 10:04:44] 2
  • 17. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. 2018-01-01 10:23:03 3 6 ‘B’ 3000 2018-01-01 10:23:12 4 3 ‘A’ 5500 time orderid user region price ... 2018-01-01 10:00:00 1 1 ‘A’ 10000 2018-01-01 10:03:03 2 7 ‘C’ 40000 Skip Partition Scan SELECT region, SUM(price) FROM orders WHERE time >= ‘2018-01-01 10:00’ AND time <= ‘2018-01-01 11:00’ GROUP BY region Scan direction Scanned data 2018-01-01 11:04:44 5 1 ‘A’ 20000 2018-01-01 11:30:00 6 8 ‘C’ 3000 ...
  • 18. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. How to find Partitions? SELECT region, SUM(price) FROM orders -- assume this is data set 1 WHERE time >= ‘2018-01-01 10:00’ AND time <= ‘2018-01-01 11:00’ GROUP BY region Meta DB (PostgreSQL) data_set_id time_range path ... 1 [2018-01-01 10:00:00, 2018-01-01 10:03:03] 1 [2018-01-01 10:23:03, 2018-01-01 10:23:12] 1 [2018-01-01 11:04:44, 2018-01-01 10:04:44] 2 Number of partitions in a data set can be large (1M+) for large tables. ?
  • 19. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Range Type and GiST Index of PostgreSQL Meta DB (PostgreSQL) data_set_id time_range path ... 1 [2018-01-01 10:00:00, 2018-01-01 10:03:03] 1 [2018-01-01 10:23:03, 2018-01-01 10:23:12] 1 [2018-01-01 11:04:44, 2018-01-01 10:04:44] 2 GiST Index Range type • Overlap operator time_range && [2018-01-01 10:00, 2018-01-01 11:00] Overlap is checked by index scan
  • 20. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. User Defined Partition (Beta) SELECT … FROM … WHERE time > … AND time <... AND region = ‘A’ Time Time Region(userdefinedkey) A C B
  • 21. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. PlazmaDB Partition Fragmentation Data set 1 Data set 2 Presto worker Hive worker ● Latency (30ms+) ● Get operation cost
  • 22. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Data set 2 PlazmaDB Data set 2 Data set 1 Realtime Storage & Archive Storage Realtime Storage Archive Storage Partitions imported 1 hour Merge Data set 1
  • 23. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Data set 2 PlazmaDB Data set 2 Data set 1 Realtime Storage & Archive Storage Realtime Storage Archive Storage Partitions imported 1 hour Merge Data set 1 Reduced to 1/20 - 1/100 partitions
  • 24. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Streaming & Bulk data upload Realtime Storage Archive Storage Bulk loadStreaming data
  • 25. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Fragmentation of Archive Storage data_set_id time_range import_time ... 1 [... 10:00:00, ... 10:03:03] … 10:05:00 1 [... 10:23:03, ... 10:23:12] … 10:25:00 ... 1 [... 20:30:00, ... 20:34:44] … 20:35:00 1 [... 10:00:44, ... 20:44:14] … 20:45:00 ... data_set_id time_range ... 1 [... 10:00:00, ... 11:00:00] ... 1 [... 10:00:44, … 11:00:00] 1 [... 15:04:38, ... 15:34:44] 1 [... 20:00:00, ... 21:00:00] ... Realtime Storage Archive Storage Delayed data Split by 1 hour window
  • 26. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Remerge Partitions Archive Storage
  • 27. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Re: PlazmaDB features - Columnar format - Partitioned by time and optionally user defined column - Schema less - Partition index - Partition optimization - Merge partitions - Realtime Storage & Archive Storage - Transaction - Read committed isolation
  • 28. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Current PlazmaDB & Future Challenges
  • 29. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Data Volume PlazmaDB Meta DB (PostgreSQL) Realtime Storage Archive Storage AWS S3 / Riak CS 5 PB GiST GiST Partition Metadata Partition Metadata 1 TB
  • 30. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Monitoring • Arm Treasure Data – Detailed log analyze • DataDog – Metrics visualization
  • 31. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Read Workload on Meta DB • Metadata size ~ 1TB • Shared Buffer size ~ 150GB • But, Hot Data size is much smaller than Shared Buffer # of Read Requested on Data Set /day #ofPartitionsRead/day
  • 32. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Write Workload on Meta DB 200k transaction / min ~ 3k transaction / sec 3 MB / sec
  • 33. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. PostgreSQL Auto VACUUM (FREEZE) VACUUM FREEZE: Vacuum to prevent transaction ID wraparound failures • Force full scan on relation (as of PostgreSQL 9.4) – Hot data may be evicted to scan relations for vacuum => Read workload can be affected – PostgreSQL 9.6 or later mitigate the problem • The more transaction IDs are consumed, the more vacuum can be happened – In our case, it happens every 2-3 day
  • 34. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Impact of VACUUM FREEZE
  • 35. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. GiST Index Bloat 100GB 150GB 50GB 200GB B-tree GiST Bloat 30-40GB/month Reindex (~20h)
  • 36. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Metadata Table Partitioning GiST Partition Metadata One relation’s reindex becomes shorter and space saving
  • 37. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Meta DB Scale out DB1 DB2 DB3
  • 38. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. More Partition Skip SELECT region, SUM(price) FROM orders WHERE time >= ‘2018-01-01 10:00’ AND time <= ‘2018-01-01 11:00’ AND user_age >= 20 AND user_age <= 30 GROUP BY region Meta DB (PostgreSQL) data_set_id time_range path ... 1 [2018-01-01 10:00:00, 2018-01-01 10:03:03] 1 [2018-01-01 10:23:03, 2018-01-01 10:23:12] 1 [2018-01-01 11:04:44, 2018-01-01 10:04:44] 2 Scan partition & Filter
  • 39. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. More Partition Skip SELECT region, SUM(price) FROM orders WHERE time >= ‘2018-01-01 10:00’ AND time <= ‘2018-01-01 11:00’ AND user_age >= 20 AND user_age <= 30 GROUP BY region Meta DB (PostgreSQL) data_set_id time_range user_age_range 1 [2018-01-01 10:00:00, 2018-01-01 10:03:03] [35, 40] 1 [2018-01-01 10:23:03, 2018-01-01 10:23:12] [25, 30] 1 [2018-01-01 11:04:44, 2018-01-01 10:04:44] 2 Store metadata on frequently accessed columns
  • 40. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Smart Partition Selection for Remerge Remerge is resource consuming • Current – Data set size – # of Partitions • Idea – Access frequency – Data freshness Fresh data is likely to be hot
  • 41. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. Summary - PlazmaDB: Storage Layer of Arm Treasure Data Analytics Platform - Optimization for Analytical Queries - Columnar + Time Partitioning + Partition Index - Optimization for Streaming data - Realtime & Archive Storage + Merge Partition - Challenges - Reduce impact of PostgreSQL VACUUM FREEZE - GiST index management - More Partition Optimization - Enrich Metadata - Smart Remerge
  • 42. Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved. We are Hiring!! https://www.treasuredata.com/company/careers/
  • 43. Thank You! Danke! Merci! 谢谢! Gracias! Kiitos! Copyright 1995-2018 Arm Limited (or its affiliates). All rights reserved.