OLTP+OLAP=HTAP
Postgres Professional
Konstantin Knizhnik
OLAP- and OLTP-oriented databases

OLTP: MySQL, Oracle, SQLite, SQL Server, Postgres
OLAP: Clickhouse, Greenplum, SparkSQL, Teradata, Vertica
Lambda architecture
Hybrid OLAP and OLTP databases

HTAP: HyPer, SAP HANA, VoltDB, ...

OLAP (Clickhouse, Greenplum, SparkSQL, Teradata, Vertica)
  + OLTP (MySQL, Oracle, SQLite, SQL Server, Postgres)
  = HTAP
Speedup executor

Interpretation overhead elimination: JIT
Vectorized executor: operate on batches of values, e.g. (1,2,3) + (3,4,5) = (4,6,8)
Why is Postgres slow on OLAP queries?
1. Unpacking tuple overhead (heap_deform_tuple)
2. Interpretation overhead (invocation of query plan node functions)
3. Abstraction penalty (user defined types and operations)
4. Pull model overhead (saving/restoring context on each access to the
page)
5. MVCC overhead (~20 bytes per tuple space overhead + visibility
check overhead)
Typical OLAP query profile
16.57% postgres postgres [.] slot_deform_tuple
13.39% postgres postgres [.] ExecEvalExpr
8.64% postgres postgres [.] advance_aggregates
8.58% postgres postgres [.] advance_transition_function
5.83% postgres postgres [.] float8_accum
5.14% postgres postgres [.] tuplehash_insert
3.89% postgres postgres [.] float8pl
3.60% postgres postgres [.] slot_getattr
2.66% postgres postgres [.] bpchareq
2.56% postgres postgres [.] heap_getnext
Query execution plan

select count(*) from … where salary > 100000;

count(*)
  ^
filter: salary > 100000
  ^
heap scan
Traditional query execution

shipdate    quantity  price
21.02.2017  100       99
23.02.2017  200       60
24.02.2017  150       120

SELECT sum(quantity*price) FROM lineitems;

One row at a time:
100 * 99  = 9900
200 * 60  = 12000
150 * 120 = 18000
9900 + 12000 + 18000 = 39900
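The one-row-at-a-time flow above can be sketched in a few lines. This is an illustrative toy, not Postgres internals: each plan node pulls single rows from its child, so every row pays the per-node interpretation overhead listed earlier.

```python
# Minimal sketch (not Postgres internals) of "pull"-model, row-at-a-time
# execution: each row is dragged through a chain of plan nodes one by one.

rows = [
    {"shipdate": "21.02.2017", "quantity": 100, "price": 99},
    {"shipdate": "23.02.2017", "quantity": 200, "price": 60},
    {"shipdate": "24.02.2017", "quantity": 150, "price": 120},
]

def seq_scan(table):
    for row in table:          # plan node 1: produce rows one by one
        yield row

def project(child):
    for row in child:          # plan node 2: evaluate quantity*price per row
        yield row["quantity"] * row["price"]

def aggregate(child):
    total = 0
    for value in child:        # plan node 3: advance the sum() transition state
        total += value
    return total

print(aggregate(project(seq_scan(rows))))  # 39900
```

Every value crosses three function boundaries here; that per-row cost is exactly what JIT and vectorization attack.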
Vectorized query execution

Data is split into tiles of column vectors:

Tile 1: shipdate = (21.02.2017, 23.02.2017, 24.02.2017), quantity = (100, 200, 150), price = (99, 60, 120)
Tile 2: shipdate = (25.02.2017, 26.02.2017, 28.02.2017), quantity = (300, 110, 80), price = (100, 60, 230)

SELECT sum(quantity*price) FROM lineitems;

Tile 1: quantity * price = (100, 200, 150) * (99, 60, 120) = (9900, 12000, 18000), Sum = 39900
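The tile-at-a-time idea can be sketched the same way. This is a hypothetical illustration, not the `vectorize_engine` API: plan nodes exchange small column vectors ("tiles") instead of single rows, so the interpretation overhead is amortized over the tile size.

```python
# Sketch of tile-at-a-time execution: one node invocation processes a whole
# tile, so per-row interpretation overhead is paid once per tile instead.

tiles = [
    {"quantity": [100, 200, 150], "price": [99, 60, 120]},
    {"quantity": [300, 110, 80], "price": [100, 60, 230]},
]

def mul_tiles(source):
    for tile in source:   # one "multiply" node call per tile, not per row
        yield [q * p for q, p in zip(tile["quantity"], tile["price"])]

def sum_tiles(source):
    return sum(sum(tile) for tile in source)

print(sum_tiles(mul_tiles(tiles)))  # 94900 (the first tile alone gives 39900)
```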
Row store vs. column store

Row store: optimal for OLTP (postgres heap)
Column store: optimal for OLAP (cstore, zedstore)
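The difference is purely in layout, which a small sketch makes concrete (the sample values reuse the lineitems rows above; this is an illustration, not how any of these engines store pages):

```python
# The same three rows laid out both ways.
# Row store: whole tuples adjacent - cheap to fetch/update one record (OLTP).
# Column store: one column adjacent - cheap to scan a single attribute (OLAP).

row_store = [
    ("21.02.2017", 100, 99),
    ("23.02.2017", 200, 60),
    ("24.02.2017", 150, 120),
]

column_store = {
    "shipdate": ["21.02.2017", "23.02.2017", "24.02.2017"],
    "quantity": [100, 200, 150],
    "price": [99, 60, 120],
}

# OLTP-style access: read one full tuple
print(row_store[1])                   # ('23.02.2017', 200, 60)

# OLAP-style access: scan one attribute without touching the others
print(sum(column_store["quantity"]))  # 450
```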
Columnar store & vectorized executor

Results of TPCH-10G/Q1:

Configuration                      par.workers=0  par.workers=4
PG9_6    vectorize=off             36             10
PG9_6    vectorize=on              20             -
master   vectorize=off  jit=on     16             5
master   vectorize=off  jit=off    25.5           7
master   vectorize=on   jit=on     15             -
master   vectorize=on   jit=off    17.5           -
zedstore vectorize=off  jit=on     18             5
zedstore vectorize=off  jit=off    26             7
zedstore vectorize=on   jit=on     17             -
zedstore vectorize=on   jit=off    19             -

1. For 9.6, vectorization gives a ~2x speed improvement
2. PG 13 is almost 2 times faster than 9.6 (thanks to JIT)
3. The effect of vectorized execution is much smaller on PG 13
4. Zedstore doesn't give any noticeable performance advantages
To index or not to index?
Fast indexes: LSM

C0 (memory) -> C1 -> C2 -> ... -> Cn (disk)
Levels are periodically combined by a merge join.
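A toy sketch of the LSM idea (this is an illustration of the general technique, not RocksDB's implementation; the four-entry `C0_LIMIT` is an arbitrary assumption): writes land in a small in-memory level C0, and when it overflows it is merged out into larger sorted runs, turning random inserts into sequential merge writes.

```python
# Toy LSM: in-memory C0 plus a list of sorted on-disk runs (C1, C2, ...).

C0_LIMIT = 4

c0 = {}        # in-memory level C0
levels = []    # sorted runs flushed to "disk", oldest first

def insert(key, value):
    c0[key] = value
    if len(c0) >= C0_LIMIT:            # C0 full: flush it as a sorted run
        levels.append(sorted(c0.items()))
        c0.clear()

def lookup(key):
    if key in c0:                      # newest data first
        return c0[key]
    for run in reversed(levels):       # then newer runs before older ones
        for k, v in run:
            if k == key:
                return v
    return None

for i in range(10):
    insert(i, i * 10)
print(lookup(3), lookup(9))  # 30 90
```

Lookups must consult every level, which is why reads get slower as runs pile up and why real LSM engines compact levels in the background.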
Postgres + RocksDB FDW

Index             Clients  TPS
Inclusive B-Tree  1        9387
Inclusive B-Tree  10       18761
RocksDB FDW       1        138350   (x15)
RocksDB FDW       10       122369   (x6)
RocksDB           1        166333
RocksDB           10       141482

Benchmark: insertion of 250 million records with a random key into an inclusive
index containing one bigint key and 8 bigint fields.
LSM simulation with B-Tree

Inserts go to the Current index; when it grows too large it becomes the
Merging index and is merged into the Main index; selects consult all three
indexes and fetch tuples from the Heap table.
B-Tree*3 = Lsm3

Index             Clients  TPS
Inclusive B-Tree  1        9387
Inclusive B-Tree  10       18761
RocksDB FDW       1        138350
RocksDB FDW       10       122369
RocksDB           1        166333
RocksDB           10       141482
Lsm3              1        151699   (x16)
Lsm3              10       65997
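The three-B-tree scheme can be sketched with plain dictionaries standing in for the B-trees (a simplified illustration of the slide's design, not the Lsm3 extension's code; `TOP_LIMIT` and `merge_step` are assumptions for the sketch):

```python
# Lsm3 idea: a small "current" index absorbs inserts; when full it is rotated
# to become the "merging" index and drained into the big "main" index in the
# background; lookups consult all three, newest first.

TOP_LIMIT = 4

current = {}   # small index receiving inserts (fits in memory)
merging = {}   # previous top index, being merged into main
main = {}      # big persistent index

def insert(key, value):
    global current, merging
    current[key] = value
    if len(current) >= TOP_LIMIT and not merging:
        merging, current = current, {}   # rotate: start a fresh top index

def merge_step():
    # background merge: move everything from merging into main
    main.update(merging)
    merging.clear()

def lookup(key):
    for index in (current, merging, main):  # newest version wins
        if key in index:
            return index[key]
    return None

for i in range(6):
    insert(i, i)
merge_step()
print(lookup(0), lookup(5))  # 0 5
```

The payoff is that inserts always hit a small, cache-resident B-tree, while the expensive restructuring of the big index happens in bulk during the merge.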
Write amplification problem
Why Uber Engineering Switched from Postgres to MySQL?
Postgres MVCC

B-Tree:
Key  TID
ABC  <100,1>
ABC  <100,3>
ABC  <200,2>

Heap: Block 100 tuples, Block 200 tuples — every tuple version gets its own
index entry.
Hot update

B-Tree:
Key  TID
ABC  <100,1>
ABC  <200,2>

Heap: Block 100 tuples (hot update chain), Block 200 tuples — versions within
a page are chained, so no new index entry is needed.
TOAST

Main table:              TOAST table:
TID  Main attributes     BlockNo  TID  ToastNo  Extended attributes
1                        100      1    1
2                        100      1    2
3                        100      1    3
4                        100      2    1
UNDAM tuple chains

A tuple is stored as a chain of chunks: a header chunk (Next, Undo, Data)
followed by extension chunks (Next, Data).
UNDAM fixed size allocator

Chain headers (Chain, Next) point to pages of fixed-size chunks; each page
carries a bitmask (e.g. 00101, 10001, 00001, 11100, 00111, 10101) marking
which of its chunks (Chunk 1 ... Chunk 5) are allocated.
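A bitmask allocator like this fits in a few lines. The sketch below is an assumption-laden illustration of the general technique, not UNDAM's code: 5 chunks per page, Python ints as bitmasks, and bit 0 standing for the first chunk (the slide does not say in which direction its bitmask strings are read).

```python
# Fixed-size chunk allocator: each page stores a bitmask of occupied chunks,
# so allocation is just finding and setting a clear bit.

CHUNKS_PER_PAGE = 5

pages = [0b00101, 0b10001, 0b00001]   # example bitmasks from the slide

def allocate():
    """Return (page, chunk) of a free chunk, marking it allocated."""
    for pageno, mask in enumerate(pages):
        for chunk in range(CHUNKS_PER_PAGE):
            if not mask & (1 << chunk):
                pages[pageno] |= 1 << chunk
                return pageno, chunk
    raise MemoryError("no free chunks")

def free(pageno, chunk):
    """Clear the chunk's bit, making it available again."""
    pages[pageno] &= ~(1 << chunk)

pos = allocate()
print(pos)  # (0, 1): the lowest clear bit of page 0 (0b00101)
```

Because chunks are fixed-size, there is no fragmentation bookkeeping beyond the bitmask, which keeps allocation and free O(1) per page.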
UNDAM relation forks

Main fork: head chunks of the last versions
Extension fork (fsm): extension chunks + undo chains (undo tail)
UNDAM and WAL

Old block:                     New block:
Chunk header                   Chunk header
{1,ABB,100,29.05.2020}         {1,ABB,100,29.05.2020}
{2,SAP,200,29.05.2020}         {2,SAP,300,01.06.2020}   <- Update
empty                          empty
{3,IBM,300,29.05.2020}         {3,IBM,29.05.2020}

Delta: only the changed chunks differ between the old and the new block image.
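The delta idea can be sketched as a chunk-wise diff of the two block images. This is an assumption about the intent of the slide, not UNDAM's actual WAL record format:

```python
# Log only the chunks that changed between the old and new block image,
# instead of the whole page.

old_block = ["hdr", "{1,ABB,100,29.05.2020}", "{2,SAP,200,29.05.2020}", "empty"]
new_block = ["hdr", "{1,ABB,100,29.05.2020}", "{2,SAP,300,01.06.2020}", "empty"]

def wal_delta(old, new):
    """Return [(chunk_index, new_content)] for every changed chunk."""
    return [(i, n) for i, (o, n) in enumerate(zip(old, new)) if o != n]

print(wal_delta(old_block, new_block))  # [(2, '{2,SAP,300,01.06.2020}')]
```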
Performance comparison: pgbench

pgbench -c 10          logged  unlogged
Standard heap          7810    8167
UNDAM chunk_size=64    6367    8010
UNDAM chunk_size=150   7011    8475
zheap                  6174    5800
Update only performance

pgbench -c 10 -P 1 -T 1000 -M prepared -f update.sql postgres

update.sql:
\set aid random(1, 100000 * :scale)
update pgbench_accounts set abalance=abalance+1 WHERE aid = :aid;

Configuration    TPS
heap+logged      83339
heap+unlogged    110267
undam+logged     156484
undam+unlogged   163072
zheap+logged     91846
zheap+unlogged   106987
Past: OLAP hypercubes

Materialized views: hypercubes for PG

Timeline (OLTP -> OLAP):
7.1 7.2 7.3 7.4 8.0 8.1 8.2 8.3 8.4 9.0 9.1 9.2 9.3 9.4 9.5 9.6 10 11 12 13 ?
Views -> Materialized Views (9.3) -> Incremental Materialized Views (?)
Incremental materialized view
create incremental materialized view teller_sums as
select t.tid,sum(abalance)
from pgbench_accounts a join pgbench_tellers t
on a.bid=t.bid group by t.tid;
Benchmarking 1
pgbench -i -s 100 postgres
done in 26.07 s
create incremental materialized view teller_sums as select t.tid,sum(abalance) from
pgbench_accounts a join pgbench_tellers t on a.bid=t.bid group by t.tid;
Time: 20805.230 ms (00:20.805)
select * from teller_sums where tid=1;
Time: 0.871 ms
select t.tid,sum(abalance) from pgbench_accounts a join pgbench_tellers t on a.bid=t.bid
group by t.tid having t.tid=1;
Time: 915.508 ms (~1000x slower than the materialized view)
Insertion speed

pgbench -M prepared -N -c 10 -j 4 -T 30 -P 1 postgres

without matview: 10194 TPS
with matview:    141 TPS

Oops!
Investigations...
Reasons for slow updates

1. Lack of index: you have to create an index for the materialized view to
allow efficient updates
2. Exclusive lock on the materialized view kills concurrent execution
3. Adding more views causes a proportional increase in update time
Incremental update using triggers

Table O: x = 1
Table I: y = 2

create materialized view V as select x,y from O,I;

View V:
x y
1 2
Concurrent update of view

Transaction 1                      Transaction 2
begin;                             begin;
insert into O (x) values (3);      insert into I (y) values (4);
end;
                                   end;

View after transaction 1:          View after transaction 2:
x y                                x y
1 2                                1 2
3 2                                3 2
                                   1 4

The row (3, 4) is missing: each trigger worked from its own snapshot, so the
view no longer matches select x,y from O,I.
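The anomaly can be simulated directly (illustrative Python, not real trigger code; the snapshot variables are hypothetical names for what each transaction can see): each trigger computes its view delta against its own snapshot, so neither sees the other's concurrent insert.

```python
# Trigger-maintained view under concurrent inserts: the cross pair (3, 4)
# is never produced because each delta uses a stale snapshot.

O = [1]          # table O, column x
I = [2]          # table I, column y
view = [(1, 2)]  # materialized V = select x, y from O, I

# Transaction 1: insert x=3 into O; its snapshot of I is [2]
snapshot_I = list(I)
O.append(3)
view += [(3, y) for y in snapshot_I]    # delta: (3, 2)

# Transaction 2 (concurrent): insert y=4 into I; its snapshot of O was
# taken before transaction 1 committed, so it is still [1]
snapshot_O = [1]
I.append(4)
view += [(x, 4) for x in snapshot_O]    # delta: (1, 4)

print(sorted(view))   # [(1, 2), (1, 4), (3, 2)] - (3, 4) is missing
```

This is why trigger-based incremental maintenance needs either serialization (the exclusive lock from the previous slide) or a smarter delta protocol.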
Conclusion

1. JIT almost eliminates the need for a vectorized executor
2. An LSM index makes it possible to combine high insertion speed with fast
index scans.
3. UNDO storage provides in-place updates and can significantly increase
update speed
4. Materialized views dramatically decrease insertion speed
Some links:
RocksDB FDW: https://github.com/postgrespro/lsm
LSM3: https://github.com/postgrespro/lsm3
UNDAM: https://github.com/postgrespro/undam
Vectorized engine: https://github.com/zhangh43/vectorize_engine
VOPS: https://github.com/postgrespro/vops
Questions
postgres=# ?
