PGDay.IT 2017 - 11th edition
Milan, October 13th, 2017
Giuseppe Broccolo
g.broccolo.7@gmail.com
Viralize.com
Indexes in PostgreSQL (10)
The outline
• Indexes in PostgreSQL
• What’s new in v10:
– Parallelism
– Hash indexing
– New support for SP-GiST (inet data)
– Summarization of BRINs
~$ whoami
Giuseppe Broccolo
- data engineer at
- member of
@giubro
gbroccolo7
gbroccolo
gemini__81
g.broccolo.7@gmail.com
PostgreSQL indexes
• AKA Access Methods
– allow concurrent changes (MVCC compliant)
– persist the information (WAL)
– speed up access to data:
• index entries link to data blocks (visiting the data block can sometimes be avoided)
• index blocks live in shared buffers, as well as data blocks
[Figure: 8kB index and data blocks in shared buffers; changes are logged to WAL]
The default AMs – the trees
• binary structure hierarchically sorted
– nodes (values, links to pointed nodes, etc.)
– pointers depend on hierarchical criteria
– allow skipping orders of magnitude of values
• N ~ O(a^n) → n ~ O(logN) (N indexed values, n tree levels, a children per node)
• balanced structures speed up point (exact-match) searches
• unbalanced ones are noticeably faster for range searches
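As a rough worked example of the N ~ O(a^n) relation above, the number of tree levels can be estimated with PostgreSQL's two-argument log(). The fan-out values (2 and 256) and N = 10 million are illustrative assumptions, not measurements of a real B-tree.

```sql
-- Estimated number of tree levels n for N indexed values and fan-out a,
-- from N ~ a^n  =>  n ~ log_a(N).
-- The fan-out values below are assumptions chosen for illustration only.
SELECT a AS fanout,
       ceil(log(a::numeric, 10000000::numeric)) AS estimated_levels
FROM (VALUES (2), (256)) AS f(a);
```

With these inputs the estimate is 24 levels for a fan-out of 2 and 3 levels for a fan-out of 256, which is why a wide node fan-out keeps the number of page reads per lookup small.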
The default AMs – the hashes
• maps of (k: v) pairs
– k: the hash of the search key (the bucket)
– v: the address where the key is stored
– just one kind of search: =
– search complexity: ~O(1)
– like trees, their size is comparable with the indexed dataset: ~O(N)
[Figure: a search key is hashed to a (k: value) entry; plot of complexity vs. N comparing ~O(logN) with ~O(1)]
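A minimal sketch of the "just one kind of search" point, assuming a throwaway table named hash_demo (a hypothetical name, not from the talk). The first query uses the only operator a hash index supports; the second cannot use it, so the planner has to pick another access path.

```sql
-- hash_demo and hash_demo_idx are hypothetical names used for illustration.
CREATE TABLE hash_demo AS
  SELECT i FROM generate_series(1, 100000) AS t(i);

CREATE INDEX hash_demo_idx ON hash_demo USING hash (i);
ANALYZE hash_demo;

-- Equality: the hash index is a candidate access path.
EXPLAIN SELECT * FROM hash_demo WHERE i = 123;

-- Range predicate: not supported by the hash AM, so the index cannot be used.
EXPLAIN SELECT * FROM hash_demo WHERE i < 123;
```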
The default AMs – the BRINs
• Block Range Indexes:
– Á. Herrera, S. Riggs, H. Linnakangas (PG 9.5)
– Range: summarization of adjacent-on-disk blocks
– search complexity: ~O(N/K), K ~ 10-100
– really small indexes, faster creation: size ~O(N/K’), K’ ~ 1000-10000
– can be used for low-selectivity queries
– low performance for “dynamic” data
[Figure: 8kB heap blocks grouped into block ranges 0-7; each range X stores a summary of its blocks (blk n. xxxxx, blk n. yyyyy, blk n. zzzzz, ...)]
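The "really small indexes" claim can be checked with a sketch like the following, which builds a BRIN and a B-tree index on the same column and compares their sizes. The table and index names are hypothetical, and pages_per_range = 128 is simply the documented default written out explicitly.

```sql
-- brin_demo and its indexes are hypothetical names used for illustration.
CREATE TABLE brin_demo AS
  SELECT i FROM generate_series(1, 1000000) AS t(i);

-- One summary entry per range of 128 heap pages (the default).
CREATE INDEX brin_demo_brin  ON brin_demo USING brin  (i) WITH (pages_per_range = 128);
CREATE INDEX brin_demo_btree ON brin_demo USING btree (i);

-- Compare the on-disk size of the table and of the two indexes.
SELECT relname, pg_size_pretty(pg_relation_size(oid)) AS size
FROM pg_class
WHERE relname IN ('brin_demo', 'brin_demo_brin', 'brin_demo_btree')
ORDER BY relname;
```

A smaller pages_per_range makes the summaries more precise at the cost of a larger index, so it is the main knob to experiment with.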
The default AMs
• B-tree, GIN, GiST, SP-GiST, Hash, BRIN
• new user-defined access methods can be added
– fully supported since 9.6 (thanks to postgrespro & 2ndQuadrant)
• CREATE ACCESS METHOD
[Figure: tree AMs classified as sortable/generalized and balanced/unbalanced]
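To see which access methods a given server actually provides, the pg_am catalog can be queried. The commented-out statement only sketches the shape of CREATE ACCESS METHOD: my_am and my_am_handler are hypothetical names, and the handler would have to be a C function shipped by an extension.

```sql
-- List the access methods known to this server (amtype 'i' = index AM).
SELECT amname, amtype
FROM pg_am
ORDER BY amname;

-- Shape of the statement used by extensions to register a new index AM.
-- my_am and my_am_handler are hypothetical; the handler must already exist.
-- CREATE ACCESS METHOD my_am TYPE INDEX HANDLER my_am_handler;
```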
Extend AMs to datatypes: the OpClasses
• access methods use operator classes (opclass):
CREATE INDEX idx_name
ON table_name USING method (column_name opclass_name)
WITH (opt = value);
• opclasses define:
– operators for the needed types
– support functions depending on the access method
• can be extended to specific datatypes:
CREATE OPERATOR CLASS opclass_name
FOR TYPE datatype
USING method AS
OPERATOR strategy_number operator_name,
[...],
FUNCTION support_number func1(arg_type),
[...];
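The operator classes available for a given access method can be listed from the system catalogs. This sketch filters on SP-GiST; changing the amname literal shows the opclasses of any other AM.

```sql
-- Operator classes registered for one access method, with the indexed type
-- and whether the opclass is the default for that type.
SELECT am.amname,
       opc.opcname,
       opc.opcintype::regtype AS indexed_type,
       opc.opcdefault         AS is_default
FROM pg_opclass AS opc
JOIN pg_am      AS am ON am.oid = opc.opcmethod
WHERE am.amname = 'spgist'
ORDER BY opc.opcname;
```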
Execution plans
• IndexScan: needs to inspect data pages for row visibility
• IndexOnlyScan: reads just index pages, using the visibility map to skip the heap for all-visible pages (PG 9.2)
• BitmapIndexScan + BitmapHeapScan:
1) reduce the number of heap accesses using a bitmap
2) used by BRIN to inspect block ranges
[Figure: plot of complexity vs. N, ~O(logN)]
What’s new in PG 10?
Parallelization in index scans
• parallelization is not new in PG (since 9.6; see G. Ciolli's talk later); PG 10 adds:
– parallel B-tree index scans
– parallel BitmapHeapScan (different areas of the heap are processed by parallel workers)
– R. Syed, A. Kapila, R. Haas, R. Sabih, D. Kumar, J. Rouhaud
Parallelization in Index Scans
• for B-tree
– Workers inspect leaf pages in parallel
• for bitmap heap scan
– Workers inspect heap chunks in parallel
[Figure: workers #1 ... #N feed a gather node]
Parallelization in Index Scans
• The parameters:
– max_parallel_workers (bounded by max_worker_processes)
– max_parallel_workers_per_gather (bounded by max_parallel_workers)
– min_parallel_index_scan_size (512kB)
• heuristic for # workers: index size > 512kB * 3^(# workers)
– parallel_setup_cost (1000.0)
– parallel_tuple_cost (0.1)
– force_parallel_mode (off)
• tune them based on the underlying hardware!
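Before tuning, the current values of the parameters listed above can be read from pg_settings. This is a sketch for a PostgreSQL 10 server; on other versions some names may be missing or renamed, in which case the query simply returns fewer rows.

```sql
-- Current values of the parallel-query parameters discussed above.
SELECT name, setting, unit, short_desc
FROM pg_settings
WHERE name IN ('max_worker_processes',
               'max_parallel_workers',
               'max_parallel_workers_per_gather',
               'min_parallel_index_scan_size',
               'parallel_setup_cost',
               'parallel_tuple_cost',
               'force_parallel_mode')
ORDER BY name;
```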
When is parallelization used?
• Ex. IndexOnlyScan on B-tree
• table/B-tree ~O(300MB)
=# CREATE TABLE test AS
=# SELECT i FROM generate_series(1,10000000) t(i);
SELECT 10000000
=# CREATE INDEX btree_idx ON test USING btree (i);
CREATE INDEX
When parallelization is disabled:
• Ex. IndexOnlyScan on B-tree:
=# EXPLAIN ANALYZE SELECT * FROM test WHERE i=5;
QUERY PLAN
----------------------------------------------------------
Index Only Scan using btree_idx on test
(cost=0.43..8.45 rows=1 width=4)
(actual time=0.433..0.434 rows=1 loops=1)
Index Cond: (i = 5)
Heap Fetches: 1
Planning time: 0.525 ms
Execution time: 0.461 ms
(5 rows)
When is parallelization used?
• Set up parallel execution:
=# SET max_parallel_workers TO 8;
SET
=# SET max_parallel_workers_per_gather TO 8; -- up to 6 workers
SET
• The plan does not change! Force parallelization...
=# SET force_parallel_mode TO true;
SET
When is parallelization used?
• Ex. IndexOnlyScan on B-tree
=# EXPLAIN ANALYZE SELECT * FROM test WHERE i=5;
QUERY PLAN
----------------------------------------------------------
Gather (cost=1000.43..1008.45 rows=1 width=4)
(actual time=2.523..2.579 rows=1 loops=1)
Workers Planned: 6
Workers Launched: 6
Single Copy: true
-> Index Only Scan using btree_idx on test
(cost=0.43..8.45 rows=1 width=4)
(actual time=0.030..0.032 rows=1 loops=1)
Index Cond: (i = 5)
Heap Fetches: 0
Planning time: 0.063 ms
Execution time: 3.934 ms
(9 rows)
When is parallelization used?
• try to “trick” the planner with lower tuple costs:
=# SET force_parallel_mode TO false;
SET
=# SET parallel_tuple_cost TO 0.01;
SET
• the same plan is obtained, and it is still disadvantageous!
– the default cost parameters are (almost) always fine
– parallelization overhead pays off only with (really) big data
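For contrast with the single-row lookup, a query that has to visit a large part of the index is a more plausible candidate for a parallel plan. The sketch below assumes the 10M-row table test and index btree_idx created earlier; the worker count of 4 is an arbitrary illustrative value, and whether the planner picks a parallel index scan, a parallel sequential scan or a serial plan depends on the server's cost settings.

```sql
-- Go back to the default cost settings (PG 10 parameter names).
RESET force_parallel_mode;
RESET parallel_tuple_cost;

-- Illustrative value, not a recommendation.
SET max_parallel_workers_per_gather TO 4;

-- Refresh statistics and the visibility map so index-only scans are possible.
VACUUM ANALYZE test;

-- A range aggregate touching half of the index: inspect the chosen plan.
EXPLAIN SELECT count(*) FROM test WHERE i BETWEEN 1 AND 5000000;
```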
Hash indexes are now logged!
• Before PG 10, the hash AM did not define how index changes had to be logged into WAL:
– hash indexes were not crash safe (they had to be rebuilt with REINDEX after a crash)
– hash indexes could not be physically replicated
• The hash AM now includes WAL logging (R. Haas, G. Ghosh, A. Kapila, A. Sharma)
Hash indexes are now logged!
• Ex. physical replication, with a hash index that already exists before the first base backup:
master
=# \d hash_example
Table "public.hash_example"
Column | Type | Modifiers
--------+---------+-----------
i | integer |
Indexes:
"hash_idx" hash (i)
hot standby
=# \d hash_example
Table "public.hash_example"
Column | Type | Modifiers
--------+---------+-----------
i | integer |
Indexes:
"hash_idx" hash (i)
[Figure: WAL shipped from the master to the hot standby]
Hash indexes are now logged!
• pre PostgreSQL 10:
master
=# explain analyze select * from
=# hash_example where i = 123;
QUERY PLAN
-----------------------------------------
Index Scan using hash_idx on hash_example
(cost=0.00..8.02 rows=1 width=21)
(actual time=1.526..1.529 rows=1 loops=1)
[...]
hot standby
=# explain analyze select * from
=# hash_example where i = 123;
ERROR: could not read block 0 in file
"base/16402/458955269": read only 0 of
8192 bytes
=# SET enable_indexscan TO false;
SET
[Figure: WAL shipped from the master to the hot standby]
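Since hash indexes created before PostgreSQL 10 were never WAL-logged, they have to be rebuilt after a crash, and also after upgrading to 10 with pg_upgrade. A minimal sketch for finding them and rebuilding the one used in this example (hash_idx comes from the slides; everything else is standard catalog access):

```sql
-- Find every hash index in the current database.
SELECT c.relname            AS index_name,
       i.indrelid::regclass AS table_name
FROM pg_index AS i
JOIN pg_class AS c  ON c.oid  = i.indexrelid
JOIN pg_am    AS am ON am.oid = c.relam
WHERE am.amname = 'hash';

-- Rebuild the index from the example; on PG 10 the rebuilt index is WAL-logged
-- and therefore reaches the standbys.
REINDEX INDEX hash_idx;
```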
SP-GiST support for inet
• Unbalanced indexes perform better in case of inclusion searches:
– Ex. Quad-tree (bbox overlap search with &&)
• E. Hasegeli extended the use case to IPv4/IPv6 addresses (inet, 7 bytes/19 bytes):
– defined the OpClass for inet to be interfaced with the SP-GiST AM
• inet_ops → && >> >>= > >= <> << <<= < <= =
– important improvement in SP-GiST AM: # of child nodes is limited
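A few of the inet_ops operators listed above, evaluated on literal values so that the semantics of containment and overlap are visible without any table. The addresses are arbitrary examples.

```sql
SELECT inet '192.168.1.5'    <<  inet '192.168.1.0/24'  AS is_contained_by,    -- true
       inet '192.168.1.0/24' >>= inet '192.168.1.0/24'  AS contains_or_equals, -- true
       inet '192.168.1.0/24' &&  inet '192.168.1.80/28' AS overlaps,           -- true
       inet '10.0.0.1'       &&  inet '192.168.1.0/24'  AS disjoint_overlaps;  -- false
```

The && operator is the one used in the join of the following examples: it is true whenever one network contains, or is equal to, the other.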
SP-GiST support for inet
• Ex.
=# CREATE TABLE network_a AS SELECT ((random() * 255)::int::text || '.' ||
=# (random() * 255)::int::text || '.' ||
=# (random() * 255)::int::text || '.' ||
=# (random() * 255)::int::text || '/' ||
=# (random() * 32)::int::text)::inet as addr
=# FROM generate_series(1, 1000);
SELECT 1000
=# CREATE INDEX gist_idx_a ON network_a USING gist (addr inet_ops);
CREATE INDEX
=# CREATE INDEX spgist_idx_a ON network_a USING spgist (addr inet_ops);
CREATE INDEX
=# CREATE TABLE network_b AS (
=# SELECT * FROM network_a ORDER BY random() LIMIT 100);
SELECT 100
SP-GiST support for inet
• Ex. no indexes
=# EXPLAIN ANALYZE SELECT * FROM network_a a JOIN network_b b ON b.addr && a.addr;
QUERY PLAN
-----------------------------------------------------------------------------------
Nested Loop (cost=0.00..15032.50 rows=78724 width=14)
(actual time=0.017..185.134 rows=94973 loops=1)
Join Filter: (a.addr && b.addr)
Rows Removed by Join Filter: 905027
-> Seq Scan on network_a a (cost=0.00..15.00 rows=1000 width=7)
(actual time=0.008..0.187 rows=1000 loops=1)
-> Materialize (cost=0.00..20.00 rows=1000 width=7)
(actual time=0.000..0.061 rows=1000 loops=1000)
-> Seq Scan on network_b b (cost=0.00..15.00 rows=1000 width=7)
(actual time=0.005..0.083 rows=1000 loops=1)
Planning time: 0.522 ms
Execution time: 190.120 ms
(8 rows)
SP-GiST support for inet
• Ex. GiST index
=# EXPLAIN ANALYZE SELECT * FROM network_a a JOIN network_b b ON b.addr && a.addr;
QUERY PLAN
-----------------------------------------------------------------------------------
Nested Loop (cost=0.14..631.40 rows=13600 width=39)
(actual time=0.048..112.023 rows=94973 loops=1)
-> Seq Scan on network_b b (cost=0.00..23.60 rows=1360 width=32)
(actual time=0.016..0.153 rows=1000 loops=1)
-> Index Only Scan using gist_idx_a on network_a a
(cost=0.14..0.35 rows=10 width=7)
(actual time=0.018..0.093 rows=95 loops=1000)
Index Cond: (addr && b.addr)
Heap Fetches: 94973
Planning time: 0.111 ms
Execution time: 119.433 ms
(7 rows)
SP-GiST support for inet
• Ex. SP-GiST index
=# EXPLAIN ANALYZE SELECT * FROM network_a a JOIN network_b b ON b.addr && a.addr;
QUERY PLAN
-----------------------------------------------------------------------------------
Nested Loop (cost=0.14..667.40 rows=13600 width=39)
(actual time=0.034..58.196 rows=94973 loops=1)
-> Seq Scan on network_b b (cost=0.00..23.60 rows=1360 width=32)
(actual time=0.009..0.105 rows=1000 loops=1)
-> Index Only Scan using spgist_idx_a on network_a a
(cost=0.14..0.37 rows=10 width=7)
(actual time=0.008..0.042 rows=95 loops=1000)
Index Cond: (addr && b.addr)
Heap Fetches: 94973
Planning time: 0.109 ms
Execution time: 63.562 ms
(7 rows)
BRIN summarization for new INSERTs
• pre PG 10: perform VACUUM, or call brin_summarize_new_values()
• NOW (Á. Herrera):
– the autovacuum daemon is now able to summarize new data as block ranges fill up:
• CREATE INDEX ON table USING brin (column) WITH (autosummarize=on);
– it is possible to summarize/desummarize a single block range, identified by a block number (bigint):
• brin_summarize_range / brin_desummarize_range
• BRIN indexes are (still) not able to “shrink” summarized data
– if you update/delete boundary data, you need to REINDEX
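The options and functions above can be tried on a small table. In this sketch brin_events and brin_events_idx are hypothetical names, and block number 0 is used only because it always exists in a non-empty table; the range functions need PostgreSQL 10 or later.

```sql
-- Hypothetical table and index, for illustration only.
CREATE TABLE brin_events (ts timestamptz);

-- Let autovacuum summarize block ranges as they fill up (new in PG 10).
CREATE INDEX brin_events_idx ON brin_events
  USING brin (ts) WITH (autosummarize = on);

INSERT INTO brin_events
  SELECT now() + (i || ' seconds')::interval
  FROM generate_series(1, 100000) AS t(i);

-- Manual alternative available before PG 10: summarize all unsummarized ranges.
SELECT brin_summarize_new_values('brin_events_idx');

-- PG 10: drop and rebuild the summary of the range containing block 0.
SELECT brin_desummarize_range('brin_events_idx', 0);
SELECT brin_summarize_range('brin_events_idx', 0);
```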
Other features about indexes
• Improve hash index performance
(A. Kapila, M. Cy, A. Sharma)
• Improve accuracy in determining if a BRIN index scan is beneficial
(D. Rowley, E. Hasegeli)
• Allow faster GiST INSERTs/UPDATEs by reusing index space efficiently
(A. Borodin)
• Reduce page locking during vacuuming of GIN indexes
(A. Borodin)
The future of indexes in PostgreSQL
• Allow compression/decompression AM functions in SP-GiST
OpClasses (good for PostGIS!)
• CREATE GLOBAL INDEX
Conclusions
• PostgreSQL has a long tradition of index development
• different types for different goals
• an eye to the future
Creative Commons license
This work is licensed under a Creative Commons
Attribution-ShareAlike 4.0 International License
https://creativecommons.org/licenses/by-nc-sa/4.0/
© 2017 Giuseppe Broccolo, ITPUG – www.itpug.org/
