PostgreSQL 13 New Features

Speaker: 林宗禧 @ COSCUP 2020
Taiwan PostgreSQL User Group

About
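林宗禧
• PostgreSQL enthusiast and advocate
• Before: developed FDW extensions (in both C and Python)
• Later: integrating PG into applications wherever possible
• Promoting Industry 4.0, so that clients end up adopting PG without even noticing
• Promoting Smart City Solutions, with PG as the foundation
• Client feedback has been very positive…
• Because PG has always been stable and easy to use
• Database technology always sits quietly behind every system (yet it is absolutely critical!)
2020/8/3 Taiwan PostgreSQL User Group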

Agenda
01. Overview
02. Architecture
1) Deduplication in B-Tree Indexes
2) Logical Replication Partitioning
3) Incremental Sort
4) Parallel vacuum
03. SQL Statement
04. Utilities

01. Overview

Release timeline
PostgreSQL 13 Beta 1 Released @ 2020-05-21
PostgreSQL 13 Beta 2 Released @ 2020-06-25
Since PG 10, a new major version has been released once a year.

Amazon RDS now offers PostgreSQL 13 Beta 1 (in the Database Preview Environment)
https://aws.amazon.com/tw/about-aws/whats-new/2020/06/postgresql-13-beta-1-now-available-in-amazon-rds-database-preview-environment/

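How Amazon RDS describes PG 13
• PostgreSQL 13 brings improved functionality and performance, for example:
• Better handling of duplicate data in B-tree indexes
• Improvements to partitioning
• Incremental sort, which speeds up sorting of data
• Parallel processing of indexes with the VACUUM command
• More ways to monitor PostgreSQL database activity
• New security features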

EDB 13 Agenda
• Performance benchmarking: Four years of going faster
• Logical subscription for partitioned tables
• Partition wise joins
• Before row-level triggers
• Parallel vacuum for indexes
• Corruption checking: pg_catcheck
• Improved security: libpq w. channel binding

HPE new feature verification results (1/3)
• HPE Noriyoshi Shinoda https://www.hpe.com/jp/ja/servers/linux/technical/edlin.html

3.1. Architecture
3.1.1. Modified catalogs
3.1.2. Data types
3.1.3. Disk based hash aggregation
3.1.4. Incremental sort
3.1.5. Backup manifests
3.1.6. Partitioned table
3.1.7. Log output for Autovacuum
3.1.8. Wait events
3.1.9. libpq connection string
3.1.10. libpq functions
3.1.11. Hook
3.1.12. Column trigger
3.1.13. Local connection key
3.1.14. Trusted Extension
3.1.15. Replication slot
3.1.16. Text search
3.2. SQL statement
3.2.1. ALTER NO DEPENDS ON
3.2.2. ALTER STATISTICS SET STATISTICS
3.2.3. ALTER TABLE
3.2.4. ALTER TYPE
3.2.5. ALTER VIEW
3.2.6. CREATE DATABASE
3.2.7. CREATE INDEX
3.2.8. CREATE TABLE
3.2.9. CREATE TABLESPACE
3.2.10. DROP DATABASE FORCE
3.2.11. EXPLAIN ANALYZE
3.2.12. INSERT
3.2.13. JSON
3.2.14. MAX/MIN pg_lsn
3.2.15. ROW
3.2.16. SELECT FETCH FIRST WITH TIES
3.2.17. VACUUM PARALLEL
3.2.18. Operator <->
3.2.19. Functions

3.4. Utilities
3.4.1. dropdb
3.4.2. pg_basebackup
3.4.3. pg_dump
3.4.4. pg_rewind
3.4.5. pg_verifybackup
3.4.6. pg_waldump
3.4.7. psql
3.4.8. reindexdb
3.4.9. vacuumdb
3.4.10. Other
3.5. Contrib modules
3.5.1. adminpack
3.5.2. auto_explain
3.5.3. dict_int
3.5.4. ltree
3.5.5. pageinspect
3.5.6. pg_stat_statements
3.5.7. postgres_fdw
3.5.8. bool_plperl

02. Architecture

1) Deduplication in B-Tree Indexes
[Figure: deduplication implementation concept (general case, for files)]
https://www.researchgate.net/figure/Overview-of-deduplication-processing_fig1_305801856
https://www.highgo.ca/2020/07/06/features-in-pg13-deduplication-in-b-tree-indexes/

Hands-on test (create table, insert 1 million rows, create index)
postgres=# CREATE TABLE btree_dups AS (SELECT GENERATE_SERIES(1, 1000000)::BIGINT AS val);
SELECT 1000000
postgres=# CREATE INDEX btree_idx ON btree_dups(val);
CREATE INDEX
postgres=# SELECT c.relname, c.relkind, pg_size_pretty(pg_relation_size(c.oid))
FROM pg_class c
WHERE c.relname = 'btree_dups'
OR c.oid IN (
  SELECT i.indexrelid FROM pg_index i WHERE i.indrelid = 'btree_dups'::regclass );
 relname    | relkind | pg_size_pretty
------------+---------+----------------
 btree_dups | r       | 35 MB
 btree_idx  | i       | 21 MB
(2 rows)
postgres=# UPDATE btree_dups SET val = val + 1;
UPDATE 1000000

Comparison (performance)
-- 12.2
postgres=# EXPLAIN SELECT val FROM btree_dups;
QUERY PLAN
--------------------------------------
Seq Scan on btree_dups (cost=0.00..23275.00 rows=1000000 width=8)
(1 row)
Time: 2.415 ms
postgres=# DO
postgres-# $$BEGIN
postgres$# PERFORM * FROM btree_dups;
postgres$# END;
postgres$# $$;
DO
Time: 190.101 ms
-- 13 Beta 2
postgres=# EXPLAIN SELECT val FROM btree_dups;
QUERY PLAN
--------------------------------------
Seq Scan on btree_dups (cost=0.00..23274.00 rows=1000000 width=8)
(1 row)
Time: 2.221 ms
postgres=# DO
postgres-# $$BEGIN
postgres$# PERFORM * FROM btree_dups;
postgres$# END;
postgres$# $$;
DO
Time: 174.843 ms

Comparison (space)
[Table: relname | relkind | pg_size_pretty for btree_dups and btree_idx, 12.2 vs. 13 Beta 2; the size values were not preserved]
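
One way to see the effect of deduplication in isolation is to build two indexes on the same column, one with deduplicate_items on and one with it off, and compare their sizes. The following is only a minimal sketch for PostgreSQL 13: the table and index names (dedup_demo, dedup_on_idx, dedup_off_idx) are made up for illustration, and the sizes reported will differ from system to system.

```sql
-- Hypothetical demo table: 1,000,000 rows but only 1,000 distinct values,
-- so every key has many duplicates that the B-tree can collapse.
CREATE TABLE dedup_demo AS
  SELECT (g % 1000)::bigint AS val
  FROM generate_series(1, 1000000) AS g;

-- Same column, same data; only the storage parameter differs.
CREATE INDEX dedup_on_idx  ON dedup_demo (val) WITH (deduplicate_items = on);
CREATE INDEX dedup_off_idx ON dedup_demo (val) WITH (deduplicate_items = off);

-- Compare the on-disk size of the two indexes.
SELECT c.relname, pg_size_pretty(pg_relation_size(c.oid)) AS index_size
FROM pg_class c
WHERE c.relname IN ('dedup_on_idx', 'dedup_off_idx')
ORDER BY c.relname;

-- Clean up the demo objects.
DROP TABLE dedup_demo;
```

deduplicate_items defaults to on in PostgreSQL 13, so the first index only spells out the default.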

2) Logical Replication Partitioning
Adds logical replication support for partitioned tables
Past: no partitioning support
Now: partitioning supported
https://severalnines.com/database-blog/logical-replication-partitioning-postgresql-13

Comparison (SQL)
-- 12.2
CREATE PUBLICATION rep_part_pub FOR TABLE stock_sales
WITH (publish_via_partition_root);
ERROR: "stock_sales" is a partitioned table
DETAIL: Adding partitioned tables to publications is not supported.
HINT: You can add the table partitions individually.
-- 13 Beta 2 (publisher)
CREATE PUBLICATION rep_part_pub FOR TABLE stock_sales
WITH (publish_via_partition_root);
-- 13 Beta 2 (subscriber)
CREATE SUBSCRIPTION rep_part_sub
CONNECTION 'host=192.168.56.101 port=5432 user=rep_usr password=rep_pwd dbname=postgres'
PUBLICATION rep_part_pub;
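
For readers who want to try the publisher side end to end, here is a rough sketch for PostgreSQL 13. The table, partition, and publication names (sales_demo, sales_demo_2020, sales_demo_pub) and the column layout are assumptions made up for this example; they are not the stock_sales definition used in the talk.

```sql
-- Hypothetical range-partitioned table on the publisher.
CREATE TABLE sales_demo (
    sale_date date NOT NULL,
    item_id   int  NOT NULL,
    qty       int
) PARTITION BY RANGE (sale_date);

CREATE TABLE sales_demo_2020 PARTITION OF sales_demo
    FOR VALUES FROM ('2020-01-01') TO ('2021-01-01');

-- Publish changes under the identity of the root table rather than of each partition.
CREATE PUBLICATION sales_demo_pub FOR TABLE sales_demo
    WITH (publish_via_partition_root = true);

-- Inspect the publication flag and which table names it exposes.
SELECT pubname, pubviaroot
FROM pg_publication
WHERE pubname = 'sales_demo_pub';

SELECT pubname, schemaname, tablename
FROM pg_publication_tables
WHERE pubname = 'sales_demo_pub';

-- Clean up the demo objects.
DROP PUBLICATION sales_demo_pub;
DROP TABLE sales_demo;
```

With publish_via_partition_root enabled, the subscriber needs a table matching the root table, and it does not have to be partitioned the same way.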

3) Incremental sort
New parameter: enable_incrementalsort
-- 13 Beta, enable_incrementalsort=on
postgres=# set enable_incrementalsort=on ;
SET
postgres=# SHOW enable_incrementalsort ;
enable_incrementalsort
------------------------
on
(1 row)

3) Incremental sort
Hands-on test (create table, data, index)
-- CREATE
CREATE TABLE t_is(a int4, b int4, ctime timestamp(6) without time zone);
-- INSERT 2 times
INSERT INTO t_is(a,b,ctime) SELECT n, round(random()*100000000), clock_timestamp()
FROM generate_series(1,1000000) n;
INSERT INTO t_is(a,b,ctime) SELECT n, round(random()*100000000), clock_timestamp()
FROM generate_series(1,1000000) n;
-- CREATE INDEX
CREATE INDEX idx_t_is_a ON t_is USING BTREE(a);
-- Check
postgres=# SELECT * FROM t_is ORDER BY a,b LIMIT 10;
a | b | ctime
---+----------+----------------------------
1 | 60379526 | 2020-07-21 16:28:42.034869
1 | 73197294 | 2020-07-21 16:28:45.496297
2 | 943408 | 2020-07-21 16:28:45.496346
2 | 27584454 | 2020-07-21 16:28:42.036121
3 | 31616182 | 2020-07-21 16:28:45.496348
3 | 88997913 | 2020-07-21 16:28:42.036134
4 | 21557231 | 2020-07-21 16:28:45.49635
4 | 23206459 | 2020-07-21 16:28:42.036136
5 | 13268559 | 2020-07-21 16:28:45.496351
5 | 33672766 | 2020-07-21 16:28:42.036137
(10 rows)
https://postgres.fun/20200721193000.html

3) Incremental sort
Comparison (version 13, incremental sort enabled)
postgres=# EXPLAIN ANALYZE SELECT * FROM t_is ORDER BY a,b LIMIT 10;
QUERY PLAN
----------------------------------------------------------------------------
Limit (cost=0.51..1.16 rows=10 width=16) (actual time=0.042..0.044 rows=10 loops=1)
  -> Incremental Sort (cost=0.51..130115.72 rows=2000000 width=16) (actual time=0.041..0.042 rows=10 loops=1)
       Sort Key: a, b
       Presorted Key: a
       Full-sort Groups: 1 Sort Method: quicksort Average Memory: 25kB Peak Memory: 25kB
       -> Index Scan using idx_t_is_a on t_is (cost=0.43..58848.31 rows=2000000 width=16) (actual time=0.021..0.027 rows=11 loops=1)
Planning Time: 0.106 ms
Execution Time: 0.064 ms
(8 rows)

3) Incremental sort
Comparison (version 13, incremental sort disabled)
-- 13 Beta, enable_incrementalsort=off
QUERY PLAN
----------------------------------------------------------------------------
Limit (cost=38152.38..38153.55 rows=10 width=16) (actual time=157.108..157.112 rows=10 loops=1)
  -> Gather Merge (cost=38152.38..232610.33 rows=1666666 width=16) (actual time=157.106..159.975 rows=10 loops=1)
       Workers Planned: 2
       Workers Launched: 2
       -> Sort (cost=37152.36..39235.69 rows=833333 width=16) (actual time=145.993..145.993 rows=10 loops=3)
            Sort Key: a, b
            Sort Method: top-N heapsort Memory: 25kB
            Worker 0: Sort Method: top-N heapsort Memory: 25kB
            -> Parallel Seq Scan on t_is (cost=0.00..19144.33 rows=833333 width=16) (actual time=0.067..68.815 rows=666667 loops=3)
(12 rows)

3) Incremental sort
Comparison (version 12, no incremental sort feature)
-- 12
QUERY PLAN
----------------------------------------------------------------------------
Limit (cost=38152.38..38153.55 rows=10 width=16) (actual time=144.892..144.896 rows=10 loops=1)
  -> Gather Merge (cost=38152.38..232610.33 rows=1666666 width=16) (actual time=144.891..147.211 rows=10 loops=1)
       Workers Planned: 2
       Workers Launched: 2
       -> Sort (cost=37152.36..39235.69 rows=833333 width=16) (actual time=141.592..141.593 rows=8 loops=3)
            Sort Key: a, b
            Sort Method: top-N heapsort Memory: 25kB
            -> Parallel Seq Scan on t_is (cost=0.00..19144.33 rows=833333 width=16) (actual time=0.016..72.129 rows=666667 loops=3)
(12 rows)

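3) Incremental sort
Comparison (actual time at the Limit node, taken from the three plans)
-- 13 Beta, enable_incrementalsort=on: about 0.04 ms (Incremental Sort over an Index Scan)
-- 13 Beta, enable_incrementalsort=off: about 157 ms (Gather Merge over a Parallel Seq Scan)
-- 12, no enable_incrementalsort parameter: about 145 ms (same plan shape as 13 with the parameter off)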
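
The two version-13 plans can be reproduced in a single session by toggling the parameter around the same query. This is a small sketch that assumes the t_is table and the idx_t_is_a index from the test setup already exist; timings and plan choices will vary with hardware and settings.

```sql
-- Force the pre-13 behaviour: the planner may not use Incremental Sort.
SET enable_incrementalsort = off;
EXPLAIN (ANALYZE, COSTS OFF)
  SELECT * FROM t_is ORDER BY a, b LIMIT 10;

-- Return to the configured default (on, unless the server config changed it).
RESET enable_incrementalsort;
EXPLAIN (ANALYZE, COSTS OFF)
  SELECT * FROM t_is ORDER BY a, b LIMIT 10;
```

In the second plan, look for an Incremental Sort node with "Presorted Key: a". That line indicates the index order on a is being reused, so only b is sorted within each group of equal a values.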

4) Parallel vacuum performance
Test description
BACKGROUND
For testing the parallel vacuum performance we have constructed a scenario where vacuum is on
the verge of freezing by executing 50 million (vacuum_freeze_min_age) transactions. We
executed non-in-place updates, which create huge bloat in the table as well as in the
indexes. After this point, we kept a copy of the database and executed the
vacuum with different numbers of workers on the same database state, and measured the
execution time.
OBSERVATION
We have observed that when the database is in dire need of completing the vacuum, the
non-parallel vacuum took more than an hour to execute, whereas parallel vacuum completed in just 16 minutes,
which is nearly 4 times faster.
https://www.enterprisedb.com/postgres-tutorials/what-parallel-vacuum-postgresql-13
https://www.highgo.ca/2020/02/28/parallel-vacuum-in-upcoming-postgresql-13/

Parameter settings
Parameters for parallel vacuum
min_parallel_index_scan_size 512kB
max_parallel_maintenance_workers 8
Other performance parameters
shared_buffers 128GB
maintenance_work_mem 1GB
Create table
CREATE TABLE pgbench_accounts (
aid bigint,
bid bigint,
abalance bigint,
filler1 text DEFAULT md5(random()::text),
filler12 text DEFAULT md5(random()::text)
);

Create data & indexes
INSERT INTO pgbench_accounts SELECT i, i%10, 0 FROM generate_series(1,100000000) AS i;
CREATE UNIQUE INDEX pgb_a_aid ON pgbench_accounts(aid);
CREATE INDEX pgb_a_bid ON pgbench_accounts(bid);
CREATE INDEX pgb_a_abalance ON pgbench_accounts(abalance);
CREATE INDEX pgb_a_filler1 ON pgbench_accounts(filler1);

4) Parallel vacuum performance
WORKLOAD
./pgbench -c32 -j32 -t15000000 -M prepared -f script.sql postgres
SCRIPT (script.sql)
\set aid random(1, 100000000)
\set bid random(1, 100000000)
\set delta random(-5000, 5000)
BEGIN;
UPDATE pgbench_accounts SET bid=:bid WHERE aid = :aid;
END;

4) Parallel vacuum performance
postgres=# VACUUM (PARALLEL 4, VERBOSE) pgbench_accounts;
INFO: vacuuming "public.pgbench_accounts"
INFO: launched 2 parallel vacuum workers for index vacuuming (planned: 2)
...
...
VACUUM
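
The PARALLEL value is an upper bound, not a guarantee. The number of workers actually launched also depends on max_parallel_maintenance_workers and on how many indexes are at least min_parallel_index_scan_size in size. A minimal sketch for checking these inputs, assuming the pgbench_accounts table and its indexes from the test setup exist:

```sql
-- Settings that cap the number of parallel vacuum workers.
SHOW max_parallel_maintenance_workers;
SHOW min_parallel_index_scan_size;

-- Indexes on the table and their sizes; small indexes do not count toward parallelism.
SELECT i.indexrelid::regclass AS index_name,
       pg_size_pretty(pg_relation_size(i.indexrelid)) AS index_size
FROM pg_index i
WHERE i.indrelid = 'pgbench_accounts'::regclass;

-- Request up to 2 workers for index vacuuming.
VACUUM (PARALLEL 2, VERBOSE) pgbench_accounts;

-- PARALLEL 0 disables parallel vacuum for this run.
VACUUM (PARALLEL 0, VERBOSE) pgbench_accounts;
```

The VERBOSE output reports both the planned and the launched worker counts, which makes it easy to see when one of the limits above applied.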

5) Others
a) Modified catalogs: 13 added, 5 dropped, 3 changed
b) Data types: 64-bit transaction IDs…
c) Disk-based hash aggregation
d) Backup manifests: with consistency checks
e) libpq connection string: ssl_min_protocol_version and ssl_max_protocol_version parameters
f) Column trigger: now executed on the subscription side
Note on c): PostgreSQL 13 performs disk-based hash aggregation if the hash table does not fit in work memory (work_mem).
In the beta, this can be controlled by the parameter enable_hashagg_disk (default 'on') and, for the GROUPING SETS
clause, by the parameter enable_groupingsets_hash_disk (default 'off'). Neither parameter is present in the final 13
release, which provides hash_mem_multiplier instead.
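
To watch disk-based hash aggregation happen, one approach is to shrink work_mem for the session and aggregate over a column with many distinct values. The sketch below reuses the t_is table from the incremental sort test as an assumption, and the planner may still pick a sort-based plan instead of HashAggregate, in which case nothing spills.

```sql
-- Make the in-memory hash table budget small for this session only.
SET work_mem = '1MB';

-- Roughly two million distinct b values; far more than fits in 1MB.
EXPLAIN (ANALYZE, COSTS OFF)
  SELECT b, count(*) FROM t_is GROUP BY b;

-- Restore the configured value.
RESET work_mem;
```

If a HashAggregate node is chosen, its EXPLAIN ANALYZE output in version 13 includes batch and disk usage figures, which show that the hash table was spilled rather than kept entirely in memory.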

03. SQL Statement

03. SQL Statement (1/3)
a) ALTER NO DEPENDS ON (for functions, indexes, materialized views, and triggers)
b) ALTER TABLE DROP EXPRESSION statement
c) ALTER VIEW
d) CREATE DATABASE
e) CREATE INDEX
-- a)
ALTER FUNCTION func1 DEPENDS ON EXTENSION cube ;
ALTER FUNCTION func1 NO DEPENDS ON EXTENSION cube ;
-- b)
CREATE TABLE gen1(c1 INT, c2 INT, c3 INT GENERATED ALWAYS AS (c1 + c2) STORED) ;
ALTER TABLE gen1 ALTER COLUMN c3 DROP EXPRESSION IF EXISTS ;
-- c)
ALTER VIEW view_name RENAME COLUMN old_name TO new_name ;
-- d)
CREATE DATABASE database_name [[WITH] LOCALE [=] locale_name]
-- e)
CREATE INDEX idx2_dedup ON data1(c3) WITH (deduplicate_items = on) ;

f) CREATE TABLE (added parameters)
ALTER TABLE data1 SET (toast.vacuum_index_cleanup = OFF) ;
ALTER TABLE data1 SET (autovacuum_vacuum_insert_threshold = 10000) ;
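
The same storage parameters can also be given when the table is created. A short sketch for PostgreSQL 13 follows; the table name insert_only_demo and the chosen values are hypothetical.

```sql
-- Set the insert-driven autovacuum parameters at creation time.
CREATE TABLE insert_only_demo (
    id      bigint,
    payload text
) WITH (
    autovacuum_vacuum_insert_threshold = 10000,
    autovacuum_vacuum_insert_scale_factor = 0.05
);

-- Confirm what was stored for the table.
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'insert_only_demo';

-- Clean up the demo table.
DROP TABLE insert_only_demo;
```

These parameters let autovacuum be triggered by inserts alone, which matters for append-mostly tables that otherwise rarely get vacuumed.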

g) INSERT OVERRIDING USER VALUE
h) JSON
• Allow Unicode escape
• Datetime method of jsonpath
i) MAX/MIN pg_lsn
j) ROW expressions
k) Distance operators (<->)
l) gen_random_uuid() ;
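
A few of the items above can be tried directly in psql on PostgreSQL 13. These statements are illustrative one-liners only; the literal values are arbitrary.

```sql
-- h) jsonpath datetime() method: parse a JSON string as a date/time value.
SELECT jsonb_path_query('"2020-08-03"', '$.datetime()');

-- i) min/max aggregates now accept pg_lsn.
SELECT min(lsn), max(lsn)
FROM (VALUES ('0/16B3748'::pg_lsn), ('0/16B6C50'::pg_lsn)) AS v(lsn);

-- j) ROW expressions: extract a member with suffix notation.
SELECT (ROW(4, 5.0)).f1;

-- l) gen_random_uuid() is available without installing an extension.
SELECT gen_random_uuid();
```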

04. Utilities

04. Utilities (1/2)
a) dropdb
b) pg_basebackup
• added manifest-related parameters
• added checksum-related parameters
c) pg_dump
d) pg_verifybackup
• Manifest file version
• The checksum of the manifest file itself
• File size
• File checksum
• WAL file integrity
$ dropdb --force --echo demodb
$ pg_dump -d demodb --include-foreign-data=svr1

04. Utilities (2/2)
e) reindexdb
f) vacuumdb
$ reindexdb --concurrently --jobs 2 postgres
reindexdb: warning: cannot reindex system catalogs concurrently, skipping all
$ vacuumdb --parallel=4 postgres
vacuumdb: vacuuming database "postgres"

Reference
• HPE Noriyoshi Shinoda https://www.hpe.com/jp/ja/servers/linux/technical/edlin.html
• EnterpriseDB
https://www.enterprisedb.com/postgres-tutorials/what-parallel-vacuum-postgresql-13
• HighGo
https://www.highgo.ca/2020/02/28/parallel-vacuum-in-upcoming-postgresql-13/
https://www.highgo.ca/2020/07/06/features-in-pg13-deduplication-in-b-tree-indexes/
• Severalnines
https://severalnines.com/database-blog/logical-replication-partitioning-postgresql-13

Thank you.
林宗禧 linjose@postgresql.tw
If you have any questions, feel free to get in touch at the email address above.
You are welcome to join the Taiwan PostgreSQL user community:
Facebook group: PostgreSQL.TW
Facebook page: @pgsqlTaiwan
Website: http://postgresql.tw
