Alex Brasetvik // Senior Principal Architect @ Cognite
Postgres can do THAT?
Survey of select Postgres features, some of which you should
probably learn more about!
▪ Link to in-depth material is posted on the last slide
▪ Many topics could be long talks on their own. I want to
make you interested in these topics, not to cover them
fully (at all!)
▪ Questions? Tweet me @alexbrasetvik
GETTING STARTED
$ whoami
Alex Brasetvik @ Cognite
● Office of the CTO: focusing on
database-related stuff
● Previously co-founded the
Elasticsearch platform that became
Elastic Cloud
● Postgres and Elasticsearch are
some of my favorite hammers
● As a proper cloud engineer, I enjoy
jumping out of planes
AGENDA
generate_series
EXPLAIN
RETURNING
WITH/Common Table Expressions
● Materialized vs inlined CTEs
● Writable CTEs
● Recursive CTEs
What's up in my database? pg_stat_(activity|statements)
Less locky migrations
Index tricks, range types
Deferrable constraints
Exclusion constraints
JSON
tl;dr: Hold on to your butts
generate_series()
Make lots of dummy data
generate_series(1, 1000)
# create table a (id int);
CREATE TABLE
# insert into a select generate_series(1, 100);
INSERT 0 100
# select * from a limit 3;
id
----
1
2
3
(3 rows)
-- A million rows in a few seconds:
# \timing on
# create table some_numbers as
select generate_series(1, 1000*1000) as i;
SELECT 1000000
Time: 2245.545 ms (00:02.246)
# create index on some_numbers(i);
CREATE INDEX
Time: 994.635 ms
● generate_series(start_inclusive, end_inclusive)
● Useful to easily create an arbitrarily large sample set
● As we'll come back to, always test with realistically
sized data sets!
○ Behaviour can drastically change with changes
in data set sizes
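generate_series() also accepts a step argument and timestamp/interval
variants; a minimal sketch (the values are arbitrary):
# select generate_series(1, 10, 2); -- every other number: 1, 3, 5, 7, 9
# select generate_series('2021-01-01'::timestamptz,
                         '2021-01-02'::timestamptz,
                         interval '6 hours'); -- five timestamps, 6 hours apart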
(Lines prefixed with # are psql commands; the lines below them are their output.)
EXPLAIN
Asking Postgres to detail its plans
Please EXPLAIN
# explain select * from a where id=42;
-- QUERY PLAN --
Seq Scan on a (cost=0.00..2.25 rows=1 width=4)
Filter: (id = 42)
# explain analyze select * from a where id=42;
-- QUERY PLAN --
Seq Scan on a (cost=0.00..2.25 rows=1 width=4)
(actual time=0.034..0.041 rows=1 loops=1)
Filter: (id = 42)
Rows Removed by Filter: 99
Planning Time: 0.137 ms
Execution Time: 0.057 ms
● EXPLAIN [query goes here] shows the plan of the
statement without executing the query.
● EXPLAIN ANALYZE [query] executes the query while
profiling it, emitting the plan with profiling information.
Analyze = Execute
Please EXPLAIN
# explain select * from a where id=42;
-- QUERY PLAN --
Seq Scan on a (cost=0.00..2.25 rows=1 width=4)
Filter: (id = 42)
# explain analyze select * from a where id=42;
-- QUERY PLAN --
Seq Scan on a (cost=0.00..2.25 rows=1 width=4)
(actual time=0.034..0.041 rows=1 loops=1)
Filter: (id = 42)
Rows Removed by Filter: 99
Planning Time: 0.137 ms
Execution Time: 0.057 ms
Scan the entire table
Then remove rows
# create index on a(id);
CREATE INDEX
# explain (analyze true, verbose true, buffers true)
select * from a where id=42;
-- QUERY PLAN --
Seq Scan on public.a (cost=0.00..2.25 rows=1 width=4)
(actual time=0.019..0.027 rows=1 loops=1)
Output: id
Filter: (a.id = 42)
Rows Removed by Filter: 99
Buffers: shared hit=1
EXPLAIN options; FORMAT JSON can be useful too.
Index not
used…?
Let's create an index
Postgres reads 1 page. That
plan is impossible to beat. It
knows there's hardly any data in
the table
# set enable_seqscan to off; -- Disable "seq scans" if possible, for testing
SET
# explain (analyze true, verbose true, buffers true)
select * from a where id=42;
-- QUERY PLAN --
Index Only Scan using a_id_idx on public.a (cost=0.14..8.16 rows=1 width=4)
(actual time=0.015..0.016 rows=1 loops=1)
Output: id
Index Cond: (a.id = 42)
Heap Fetches: 1
Buffers: shared hit=2
Planner setting
# set enable_seqscan to on; -- Revert to default setting
SET
# insert into a select generate_series(101, 1000*1000); -- 1 million rows in total
INSERT 0 999900
# explain (analyze true, verbose true, buffers true) select * from a where id=42;
-- QUERY PLAN --
Index Only Scan using a_id_idx on public.a (cost=0.42..8.44 rows=1 width=4)
(actual time=0.045..0.050 rows=1 loops=1)
Output: id
Index Cond: (a.id = 42)
Heap Fetches: 1
Buffers: shared hit=2
Planning Time: 0.074 ms
Execution Time: 0.075 ms
More data in the table. Picks a plan
with the index.
# drop index a_id_idx; -- Blow away the index, forcing a seq scan
DROP INDEX
# explain (analyze true, verbose true, buffers true) select * from a where id=42;
-- QUERY PLAN --
Gather (cost=1000.00..10634.90 rows=1 width=4)
(actual time=0.170..102.491 rows=1 loops=1)
Output: id
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=4426
→ Parallel Seq Scan on public.a (cost=0.00..9634.80 rows=1 width=4)
(actual time=50.188..77.625 rows=1 loops=3)
Output: id
Filter: (a.id = 42)
Rows Removed by Filter: 333363
Buffers: shared hit=4426
Worker 0: actual time=75.269..75.270 rows=0 loops=1
Buffers: shared hit=1326
Worker 1: actual time=75.277..75.277 rows=0 loops=1
Buffers: shared hit=1334
Planning Time: 0.180 ms
Execution Time: 102.523 ms (vs 0.075 ms with the index)
Visualizing plans
● Spotting performance problems in text can be hard,
especially as queries grow larger
● Useful visualization tools:
○ explain.dalibo.com
○ explain.depesz.com
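These tools accept the plain-text plan; a JSON plan is also easy to produce
if you prefer to feed tooling. A minimal sketch, reusing the earlier table a:
# explain (analyze true, buffers true, format json)
select * from a where id=42;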
Explaining writes
# create table foo(id int primary key);
CREATE TABLE
# insert into foo select generate_series(1, 1000*1000);
INSERT 0 1000000
# create table bar(id int primary key, foo int references foo(id) on delete cascade on update cascade);
CREATE TABLE
# insert into bar select generate_series(1, 1000*1000), generate_series(1, 1000*1000);
INSERT 0 1000000
-- foo and bar both have 1 million rows
-- bar.foo points to foo.id
# delete from bar where id = 42; -- This is fast
DELETE 1
Time: 1.130 ms
# delete from foo where id=1000; -- This is slooow
DELETE 1
Time: 405.049 ms
foo
id int
bar
id int
foo int
# explain (analyze true, verbose true, buffers true)
delete from foo where id=1000;
-- QUERY PLAN --
Delete on public.foo (cost=0.42..8.44 rows=1 width=6) (actual time=18.937..18.939 rows=0 loops=1)
Buffers: shared hit=3 read=6 dirtied=2
-> Index Scan using foo_pkey on public.foo (cost=0.42..8.44 rows=1 width=6)
(actual time=0.481..0.485 rows=1 loops=1)
Output: ctid
Index Cond: (foo.id = 1000)
Buffers: shared hit=1 read=3
Planning Time: 1.081 ms
Trigger RI_ConstraintTrigger_a_26211344
for constraint bar_foo_fkey: time=379.013 calls=1
Execution Time: 398.084 ms
foo
id int
bar
id int
foo int No index
on bar.foo
The cascading delete is very slow!
Trigger for reverse FK must scan all
of bar per deleted row.
(Note that the analyze causes the write to go through! "analyze" = "execute")
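The fix implied above is to index the referencing column; a minimal sketch
(the index gets Postgres' default name, and the deleted id is arbitrary):
# create index on bar(foo); -- lets the FK trigger use an index scan instead of scanning all of bar
CREATE INDEX
# delete from foo where id=1001; -- the cascade should now be fast
DELETE 1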
EXPLAIN that again?
● Test with real-sized data sets
● Experiment with low memory settings (work_mem) to
see behaviour when Postgres flushes to disk
● You can explain UPDATE and DELETE, not just SELECT
● EXPLAIN (ANALYZE) shows the plan (with profiling
data)
● Spot performance problems, find missing indexes
● … but also get a better understanding of how Postgres
executes things!
○ … which can help you avoid performance debugging
down the road
● There are many "node types" other than Seq and
Index scan. Learn more about them!
EXPLAIN your queries
Improve your intuition for how queries execute, not just when you have a
performance problem
RETURNING *
Get back changes made
# create table person (
id int generated always as identity primary key,
name text,
is_crazy boolean
);
CREATE TABLE
# insert into person (name, is_crazy) values
('Alice', false),
('Bob', false),
('Mallory', true)
returning *; -- Emit every row inserted, which includes the auto generated ID
id | name | is_crazy
----+---------+----------
1 | Alice | f
2 | Bob | f
3 | Mallory | t
(3 rows)
INSERT 0 3
# update person set is_crazy = not is_crazy returning name, is_crazy;
name | is_crazy
---------+----------
Alice | t
Bob | t
Mallory | f
(3 rows)
UPDATE 3
# delete from person where is_crazy returning id;
id
----
1
2
(2 rows)
RETURNING will … return in a
few slides
WITH
Also called "Common Table Expressions"
(or CTEs)
WITH a tiny graph
# select * from edges;
a | b | type
----------------+----------------+--------------
A | B | friend
B | C | friend
B | D | friend
root | A | parentOf
root | something-else | bff
something-else | A pump?! | !
A pump?! | D | tree-breaker
(7 rows)
# copy (select 'digraph G { ' ||
string_agg('"' || a || '" -> "' || b || '" [label="' || type || '"]; ', E'')
|| '}' from edges)
to program 'dot -Tsvg > /tmp/test.svg'; -- Pipe to Graphviz
COPY 1
# copy (select * from edges) to '/tmp/file.csv' with csv header; -- Make a CSV
COPY 7
# copy (select * from edges) to program 'pbcopy' with csv header;
-- CSV now on clipboard, paste straight to Google Sheets
# with out_per_node as (
select a as node,
count(*) as out_degree
from edges
group by a
), in_per_node as (
select b as node,
count(*) as in_degree
from edges
group by b
)
select node,
coalesce(out_degree, 0) as out_degree,
coalesce(in_degree, 0) as in_degree
from out_per_node
full outer join in_per_node
using(node)
order by node;
node | out_degree | in_degree
----------------+------------+-----------
A | 1 | 1
A pump?! | 1 | 1
B | 2 | 1
C | 0 | 1
D | 0 | 2
root | 2 | 0
something-else | 1 | 1
(7 rows)
# with out_per_node as (
select a as node,
count(*) as out_degree
from edges
group by a
), in_per_node as (
select b as node,
count(*) as in_degree
from edges
group by b
)
select node,
coalesce(out_degree, 0) as out_degree,
coalesce(in_degree, 0) as in_degree
from out_per_node
full outer join in_per_node
using(node)
order by node;
node | out_degree
----------------+------------
B | 2
A pump?! | 1
something-else | 1
root | 2
A | 1
(5 rows)
node | in_degree
----------------+-----------
B | 1
A pump?! | 1
C | 1
something-else | 1
D | 2
A | 1
(6 rows)
You can refer to these as if they
were real tables in following
statements
# with out_per_node as (
select a as node,
count(*) as out_degree
from edges
group by a
), in_per_node as (
select b as node,
count(*) as in_degree
from edges
group by b
)
select node,
coalesce(out_degree, 0) as out_degree,
coalesce(in_degree, 0) as in_degree
from out_per_node
full outer join in_per_node
using(node)
order by node;
Equivalent sub-select:
select node,
coalesce(out_degree, 0) as out_degree,
coalesce(in_degree, 0) as in_degree
from (
select a as node, count(*) as out_degree
from edges
group by a
) as out_per_node
full outer join (
select b as node, count(*) as in_degree
from edges
group by b
) as in_per_node
using(node)
order by node;
# with out_per_node as (
select a as node,
count(*) as out_degree
from edges
group by a
), in_per_node as (
select b as node,
count(*) as in_degree
from edges
group by b
)
select node,
coalesce(out_degree, 0) as out_degree,
coalesce(in_degree, 0) as in_degree
from out_per_node
full outer join in_per_node
using(node)
where node = 'root'
order by node;
Equivalent sub-select:
select node,
coalesce(out_degree, 0) as out_degree,
coalesce(in_degree, 0) as in_degree
from (
select a as node, count(*) as out_degree
from edges
where a = 'root'
group by a
) as out_per_node
full outer join (
select b as node, count(*) as in_degree
from edges
where b = 'root'
group by b
) as in_per_node
using(node)
order by node;
How equivalent?
# explain with out_per_node as (
select a as node,
count(*) as out_degree
from edges
group by a
), in_per_node as (
select b as node,
count(*) as in_degree
from edges
group by b
)
select node,
coalesce(out_degree, 0) as out_degree,
coalesce(in_degree, 0) as in_degree
from out_per_node
full outer join in_per_node
using(node)
where node = 'root'
order by node;
QUERY PLAN
---------------------------------------------------------------------------------------
Sort (cost=36.62..36.64 rows=9 width=48)
Sort Key: (COALESCE(edges.a, edges_1.b))
-> Hash Full Join (cost=18.24..36.48 rows=9 width=48)
Hash Cond: (edges.a = edges_1.b)
-> GroupAggregate (cost=0.00..18.17 rows=3 width=40)
Group Key: edges.a
-> Seq Scan on edges (cost=0.00..18.12 rows=3 width=32)
Filter: (a = 'root'::text)
-> Hash (cost=18.20..18.20 rows=3 width=40)
-> GroupAggregate (cost=0.00..18.17 rows=3 width=40)
Group Key: edges_1.b
-> Seq Scan on edges edges_1 (cost=0.00..18.12 rows=3 width=32)
Filter: (b = 'root'::text)
# explain select node,
coalesce(out_degree, 0) as out_degree,
coalesce(in_degree, 0) as in_degree
from (
select a as node, count(*) as out_degree
from edges
where a = 'root'
group by a
) as out_per_node
full outer join (
select b as node, count(*) as in_degree
from edges
where b = 'root'
group by b
) as in_per_node
using(node)
order by node;
QUERY PLAN
---------------------------------------------------------------------------------------
Sort (cost=36.62..36.64 rows=9 width=48)
Sort Key: (COALESCE(edges.a, edges_1.b))
-> Hash Full Join (cost=18.24..36.48 rows=9 width=48)
Hash Cond: (edges.a = edges_1.b)
-> GroupAggregate (cost=0.00..18.17 rows=3 width=40)
Group Key: edges.a
-> Seq Scan on edges (cost=0.00..18.12 rows=3 width=32)
Filter: (a = 'root'::text)
-> Hash (cost=18.20..18.20 rows=3 width=40)
-> GroupAggregate (cost=0.00..18.17 rows=3 width=40)
Group Key: edges_1.b
-> Seq Scan on edges edges_1 (cost=0.00..18.12 rows=3 width=32)
Filter: (b = 'root'::text)
(The plans are identical!)
WITH queries
● Those two queries were not identical until Postgres 12
● Consider
● "Materialization", i.e. saving a temporary result, causes this to
materialize the entirety of table a
○ (That's a very inefficient way to get two rows.)
● Older blog posts about CTEs emphasize this as an "optimisation
barrier" and often warn against using them
with a_million_numbers as (
select * from a -- table from earlier with index on id
)
select * from a_million_numbers
where id in (42, 43);
Postgres ≤11 will always
materialize the result of
a "table expression" –
either in memory or on disk
# set work_mem to '64 kB'; -- force disk flushing with very low limit
# explain (analyze true, verbose true, buffers true)
with a_million_numbers as MATERIALIZED (
select * from a
)
select * from a_million_numbers where id in (42, 43);
-- QUERY PLAN --
CTE Scan on a_million_numbers (cost=14426.90..36928.92 rows=10001 width=4)
(actual time=0.048..1356.717 rows=4 loops=1)
Output: a_million_numbers.id
Filter: (a_million_numbers.id = ANY ('{42,43}'::integer[]))
Rows Removed by Filter: 1000086
Buffers: shared hit=4426, temp written=1709
CTE a_million_numbers
-> Seq Scan on public.a (cost=0.00..14426.90 rows=1000090 width=4) (actual
time=0.015..334.729 rows=1000090 loops=1)
Output: a.id
Buffers: shared hit=4426
Planning Time: 0.095 ms
Execution Time: 1358.238 ms
Force Postgres ≤11 behaviour
Low on memory, flushing to disk
# set work_mem to '64 kB'; -- force disk flushing with very low limit
# explain (analyze true, verbose true, buffers true)
with a_million_numbers as [NOT MATERIALIZED] (
select * from a
)
select * from a_million_numbers where id in (42, 43);
-- QUERY PLAN --
Index Only Scan using a_id_idx on a (cost=0.42..12.88 rows=2 width=4) (actual
time=0.020..0.026 rows=4 loops=1)
Index Cond: (id = ANY ('{42,43}'::integer[]))
Heap Fetches: 4
Planning Time: 0.108 ms
Execution Time: 0.043 ms -- vs 1358ms for materialized plan
Default Postgres ≥12 behaviour
# set work_mem to '64 kB'; -- force disk flushing with very low limit
# explain (analyze true, verbose true, buffers true)
with a_million_numbers as [NOT MATERIALIZED] (
select * from a where id in (42, 43)
)
select * from a_million_numbers where id in (42, 43)
-- QUERY PLAN --
Index Only Scan using a_id_idx on a (cost=0.42..12.88 rows=2 width=4) (actual
time=0.020..0.026 rows=4 loops=1)
Index Cond: (id = ANY ('{42,43}'::integer[]))
Heap Fetches: 4
Planning Time: 0.108 ms
Execution Time: 0.043 ms -- vs 1358ms for materialized plan
as if inlined
Writable CTEs
● DELETE/UPDATE/INSERT statements with RETURNING
can be used as "tables" too
● Note!
○ RETURNING is the only way to let other table
expressions see the result.
○ Different table expressions cannot otherwise
see the effect of other modifications in the
same statement.
-- Delete rows while simultaneously
-- inserting them elsewhere
WITH moved_rows AS (
DELETE FROM products
WHERE
date >= '2010-10-01' AND
date < '2010-11-01'
RETURNING *
)
INSERT INTO products_log
SELECT * FROM moved_rows;
Recursive CTEs
# WITH RECURSIVE t(n) AS (
VALUES (1) -- starting value(s)
UNION ALL
SELECT n+1 FROM t -- t will keep growing
WHERE n < 100 -- until termination condition is true
)
SELECT sum(n) FROM t;
sum
------
5050
# with recursive graph_traversal(a, b, path, depth) as (
select a, b, ARRAY[a] as path, 0 as depth
from edges
where a='root'
union all
select edges.a, edges.b, path || ARRAY[edges.a], depth + 1
from edges join graph_traversal on(edges.a=graph_traversal.b)
-- Avoid looping forever if there's a cycle
where not(edges.a = ANY(path))
)
select * from graph_traversal
order by depth, a;
a | b | path | depth
----------------+----------------+----------------------------------+-------
root | A | {root} | 0
root | something-else | {root} | 0
A | B | {root,A} | 1
something-else | A pump?! | {root,something-else} | 1
A pump?! | D | {root,something-else,"A pump?!"} | 2
B | C | {root,A,B} | 2
B | D | {root,A,B} | 2
(7 rows)
-- Mandelbrot set
WITH RECURSIVE x(i)
AS (
VALUES(0)
UNION ALL
SELECT i + 1 FROM x WHERE i < 101
),
Z(Ix, Iy, Cx, Cy, X, Y, I)
AS (
SELECT Ix, Iy, X::float, Y::float, X::float, Y::float, 0
FROM
(SELECT -2.2 + 0.031 * i, i FROM x) AS xgen(x,ix)
CROSS JOIN
(SELECT -1.5 + 0.031 * i, i FROM x) AS ygen(y,iy)
UNION ALL
SELECT Ix, Iy, Cx, Cy, X * X - Y * Y + Cx AS X, Y * X * 2 + Cy, I + 1
FROM Z
WHERE X * X + Y * Y < 16.0
AND I < 27
),
Zt (Ix, Iy, I) AS (
SELECT Ix, Iy, MAX(I) AS I
FROM Z
GROUP BY Iy, Ix
ORDER BY Iy, Ix
)
SELECT array_to_string(
array_agg(
SUBSTRING(
' .,,,-----++++%%%%@@@@#### ',
GREATEST(I,1),
1
)
),''
)
FROM Zt
GROUP BY Iy
ORDER BY Iy;
What's up in my database?
pg_stat_activity
# select * from pg_stat_activity where state='idle';
-[ RECORD 1 ]----+------------------------------------------------
datid | 26211311
datname | sample
pid | 5291
usesysid | 16384
usename | alex
application_name | psql
client_addr |
client_hostname |
client_port | -1
backend_start | 2021-06-14 16:11:47.491729+02
xact_start |
query_start | 2021-06-14 16:34:41.027633+02
state_change | 2021-06-14 16:34:41.02857+02
wait_event_type | Client
wait_event | ClientRead
state | idle
backend_xid |
backend_xmin |
query | delete from person where is_crazy returning id;
backend_type | client backend
# set application_name to 'my-app:some-role'
jdbc:postgresql://localhost:5435/MyDB?ApplicationName=MyApp
"idle in transaction" is typically bad,
especially if holding locks
What's going on?
● application_name lets you identify your connection in pg_stat_activity
● pg_terminate_backend(pid) to kill a bad connection
● pg_stat_activity for what's up right now
● pg_stat_statements extension for tracking what has happened
○ Install: create extension if not exists pg_stat_statements
○ # install on all subsequently created databases:
$ psql template1 -c 'create extension if not exists pg_stat_statements'
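Tying the pg_terminate_backend and idle-in-transaction points together, a
sketch for cleaning up sessions stuck "idle in transaction" (the 5 minute
threshold is an arbitrary assumption):
# select pid, query, pg_terminate_backend(pid)
from pg_stat_activity
where state = 'idle in transaction'
and state_change < now() - interval '5 minutes';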
pg_stat_statements
● pg_stat_statements extension for tracking continuous activity
● What are the most expensive queries over time?
● One of the most useful extensions!
# select * from pg_stat_statements order by total_time desc limit 1;
-[ RECORD 1 ]-------+------------------------------------------------------------
[…]
query | with a_million_numbers as materialized (select * from a)
select * from a_million_numbers where id in(42, 43)
calls | 2
total_time | 2500.088486
min_time | 1160.500207
max_time | 1339.588279
mean_time | 1250.044243
stddev_time | 89.544036
rows | 0
shared_blks_hit | 8852
[…]
temp_blks_written | 1710
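Note that the pg_stat_statements module must also be preloaded before the
extension collects anything; a sketch of the setting (takes effect after a
server restart):
# alter system set shared_preload_libraries = 'pg_stat_statements';
ALTER SYSTEM
-- Without preloading, querying the pg_stat_statements view errors out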
pg_stat_statements
● Some ORMs/toolkits fill pg_stat_statements with junk:
○ WHERE id IN ($1, $2, …, $987)
○ WHERE id IN ($1, $2, …, $5432)
● Consider id = ANY($parameter_as_array)
# select * from pg_stat_statements order by total_time desc limit 1;
-[ RECORD 1 ]-------+------------------------------------------------------------
[…]
query | with a_million_numbers as materialized (select * from a)
select * from a_million_numbers where id in ($1, $2, $3, …)
calls | 2
total_time | 2500.088486
min_time | 1160.500207
max_time | 1339.588279
mean_time | 1250.044243
stddev_time | 89.544036
rows | 0
shared_blks_hit | 8852
[…]
temp_blks_written | 1710
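A sketch of the ANY(array) form suggested above, which keeps
pg_stat_statements down to one normalized entry no matter how many ids are
passed (table a from earlier, int[] parameter assumed):
# prepare fetch_ids as
select * from a where id = any($1::int[]);
PREPARE
# execute fetch_ids('{42,43}');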
Locks
The likely reason for a migration causing an outage
Sample lock block
# begin;
# create index on person(name);
CREATE INDEX
Client A Client B Client C
In Postgres, DDL like create table, alter table,
create index, etc. are transactional.
Unlike Oracle and MySQL, "CREATE TABLE"
does not imply commit.
"Implicit commit" is not a thing in Postgres.
Sample lock block
# begin;
# create index on person(name);
CREATE INDEX
-- Index creation needs a share lock
-- Block to see effects on others
# select pg_sleep(600);
# begin;
-- Reading is fine
# select * from person where id=3;
id | name | is_crazy
----+---------+----------
3 | Mallory | f
-- But cannot update:
# update person set name='Not Mallory'
where id=3;
-- We're BLOCKED by client A :(
-- Meanwhile, checking the status:
# select query, wait_event_type from
pg_stat_activity where state='active' and
pid != pg_backend_pid();
-[ RECORD 1 ]---+----------------------
query | select pg_sleep(600);
wait_event_type | Timeout
-[ RECORD 2 ]---+----------------------
query | update person set
name='Not Mallory' where id=3;
wait_event_type | Lock
Client A Client B Client C
pg_sleep is useful to
artificially slow down
operations to observe
locking effects in dev
Locky migrations
● CREATE INDEX prevents writes on the target table for the duration of the
transaction. Locks are released (only!) on commit/rollback.
● ALTER TABLE (add column and/or constraints, etc) will need an exclusive lock
on the table.
● An exclusive lock on a table will block all concurrent access = OUTAGE
● A share lock will block all concurrent write access = read only access
● You generally want to minimise time spent holding or waiting for such locks
○ statement_timeout
○ lock_timeout
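A sketch of capping lock waits around a migration so a blocked ALTER TABLE
fails fast instead of queueing up and blocking the traffic behind it; the
timeouts and the added column are arbitrary assumptions:
# begin;
# set local lock_timeout = '2s';
# set local statement_timeout = '30s';
# alter table person add column nickname text;
# commit;
-- If the lock is not acquired within 2s, the ALTER fails and can simply be retried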
Less locky migrations
● CREATE INDEX CONCURRENTLY does not block concurrent writing to the
table
○ Reads the table twice. Costs more IO.
○ Cannot be done in a transaction, i.e. it must be the only operation in its
own transaction
○ Caveats apply, consult the docs
● ALTER TABLE ADD CONSTRAINT … NOT VALID
○ Adds a constraint that only applies to subsequent writes
○ Existing data is not validated, so an exclusive lock needed only briefly
● ALTER TABLE … VALIDATE CONSTRAINT …
○ Holds a much weaker lock while reading the entire table
○ Can of course fail if rows exist that violate the constraint
● CREATE UNIQUE INDEX CONCURRENTLY for a uniqueness constraint
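A sketch of that last approach, reusing the person table from earlier (the
constraint name is an arbitrary choice):
# create unique index concurrently person_name_key on person(name);
CREATE INDEX
# alter table person
add constraint person_name_key unique using index person_name_key;
ALTER TABLE
-- The ALTER needs only a brief lock; the expensive index build happened concurrently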
Less locky migrations
# select * from equipment;
id | mass_in_grams
----+---------------
1 | 42
2 | -1
(2 rows)
# alter table equipment
add constraint mass_not_negative
check (mass_in_grams >= 0) not valid;
ALTER TABLE
# insert into equipment values (3, -42);
ERROR: new row for relation "equipment" violates check
constraint "mass_not_negative"
DETAIL: Failing row contains (3, -42).
# alter table equipment
validate constraint mass_not_negative;
ERROR: check constraint "mass_not_negative" is violated
by some row
Needs exclusive lock for
some milliseconds
Scans the entire table without
strong lock
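Once the offending rows are fixed, validation goes through; a sketch
(zeroing out the bad value is just an assumption about the fix):
# update equipment set mass_in_grams = 0 where mass_in_grams < 0;
UPDATE 1
# alter table equipment validate constraint mass_not_negative;
ALTER TABLE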
Deferred constraints
● A constraint can be deferred to commit time
● Useful e.g. for cyclic foreign keys
● A cyclic foreign key is useful when partitioning out
certain columns
○ This is useful for "last_modified" kinds of use
cases.
○ Read more about "heap only tuples" if you have
that use case
# create table something (
-- change_info does not exist yet, adding
-- FK via ALTER below:
id int primary key,
what text
);
# create table change_info (
id int primary key
references something(id)
deferrable initially deferred,
last_modified timestamptz not null default now()
);
# alter table something
add constraint must_have_change_info
foreign key (id) references change_info(id)
deferrable initially deferred;
Deferred constraints
# insert into something values (1, 'this-will-fail');
ERROR: insert or update on table "something" violates
foreign key constraint "must_have_change_info"
DETAIL: Key (id)=(1) is not present in table
"change_info".
# begin;
-- Not failing, FK check deferred to commit
# insert into something values (1, 'with-version-info');
INSERT 0 1
# insert into change_info values (1, now());
INSERT 0 1
# end;
COMMIT
# begin;
# insert into change_info values (2, now());
INSERT 0 1
-- Will fail:
# end;
ERROR: insert or update on table "change_info" violates
foreign key constraint "change_info_id_fkey"
DETAIL: Key (id)=(2) is not present in table "something".
# create table something (
-- change_info does not exist yet, adding
-- FK via ALTER below:
id int primary key,
what text
);
# create table change_info (
id int primary key
references something(id)
deferrable initially deferred,
last_modified timestamptz not null default now()
);
# alter table something
add constraint must_have_change_info
foreign key (id) references change_info(id)
deferrable initially deferred;
Functional indexes
-- Functional indexes:
# create table users (
id int primary key,
username text
);
# create unique index on users(lower(username));
# insert into users values (1, 'alex'), (2, 'ALEX');
ERROR: duplicate key value violates unique
constraint "users_lower_idx"
DETAIL: Key (lower(username))=(alex) already exists.
-- Note: Index lookups must use the
-- same function
-- This can use the index:
# select * from users
where lower(username)='alex';
-- This will NOT be able to use that index:
# select * from users
where username='alex';
Partial indexes
# create table assets (
id int primary key,
external_id text not null,
deleted_at timestamp
);
# create unique index on assets(external_id)
where deleted_at is null;
# insert into assets values
(1, 'one', now()), (2, 'one', null);
INSERT 0 2
# insert into assets values
(3, 'not one', null), (4, 'one', null);
ERROR: duplicate key value violates unique
constraint "assets_external_id_idx"
DETAIL: Key (external_id)=(one) already exists.
-- This can use the partial index:
# select * from assets
where external_id='one' AND
deleted_at is null;
-- This will NOT be able to use that index:
# select * from assets
where external_id='one';
-- missing (deleted_at is null)
Range Types and Exclusion Constraints
Range types
● A range is an interval of numeric-like data
● int4range(0, 10): 0 <= n < 10
● tstzrange(now(), null): any time >= now()
● Index support for overlaps and containment
○ overlaps: int4range(0, 10) && int4range(9, 20)
○ contains:
■ int4range(0, 10) @> 2
■ int4range(0, 10) @> int4range(2, 4)
# create table intervals (
id int primary key,
start timestamptz,
"end" timestamptz
);
-- Insert 1 million random intervals
# with random_starts as (
select generate_series(1, 1000*1000) as id,
'2020-01-01'::timestamptz +
(1000 * random()) * interval '1 day' as start
)
insert into intervals
select id, start, start +
(1000 * random() * interval '1 hour') as "end"
from random_starts;
INSERT 0 1000000
# create index on intervals
using gist(tstzrange(start, "end"));
"Generalised search tree",
R-tree and more
Range types
# create table intervals (
id int primary key,
start timestamptz,
"end" timestamptz
);
-- [Insert random intervals
-- happened here]
# create index on intervals
using
gist(tstzrange(start, "end"));
# explain analyze
select * from intervals
where tstzrange(start, "end") &&
tstzrange(now(), now() + interval '7 days');
-- QUERY PLAN --
Bitmap Heap Scan on intervals
(cost=1368.16..8447.01 rows=28354 width=20)
(actual time=46.379..80.761 rows=27837 loops=1)
Recheck Cond: (tstzrange(start, "end") &&
tstzrange(now(), (now() + '7 days'::interval)))
Heap Blocks: exact=6278
-> Bitmap Index Scan on intervals_tstzrange_idx
(cost=0.00..1361.08 rows=28354 width=0)
(actual time=44.309..44.309 rows=27837 loops=1)
Index Cond: (tstzrange(start, "end") &&
tstzrange(now(), (now() + '7 days'::interval)))
Planning Time: 0.118 ms
Execution Time: 56.151 ms
Exclusion constraints
● Define ranges that cannot overlap
● No two reservations of the same resource can overlap
● Exclusion constraints cannot be added "concurrently",
i.e. without write-locking the table.
# create table reservations (
room text,
start timestamptz,
"end" timestamptz,
constraint no_double_booking exclude using gist(
room with =,
tstzrange(start, "end") with &&
)
);
# insert into reservations values
('zelda', '2021-06-15', '2021-06-16'),
('zelda', '2021-06-17', '2021-06-18'),
('portal', '2021-06-01', '2021-07-01');
INSERT 0 3
# insert into reservations values
('zelda', '2021-06-14', '2021-06-16');
ERROR: conflicting key value violates exclusion
constraint "no_double_booking"
DETAIL: [omitted]
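Note: mixing plain equality (room with =) and a range in one GiST exclusion
constraint relies on the btree_gist extension, so the table definition above
assumes it is installed:
# create extension if not exists btree_gist;
CREATE EXTENSION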
Exclusion constraints
● Define constraint that all elements must be the same
○ "Exclude dissimilar"
● Example:
○ Don't put humans and lions in the same cage at
the same time
# create table cages (
cage text,
animal text,
start timestamptz,
"end" timestamptz,
constraint just_same_animals exclude using gist(
cage with =,
animal with !=,
tstzrange(start, "end") with &&
)
);
# insert into cages values
('cellar', 'human', '2021-06-15', '2021-06-16'),
('bedroom', 'lion', '2021-06-15', '2021-06-16'),
('cellar', 'human', '2021-06-01', '2021-07-01');
INSERT 0 3
# insert into cages values
('bedroom', 'human', '2021-06-14', '2021-06-16');
ERROR: conflicting key value violates exclusion
constraint "just_same_animals"
DETAIL: [omitted]
Upsert: insert or update
Upsert
● INSERT can take an ON CONFLICT
● Given a conflict, either
○ DO NOTHING
○ DO UPDATE SET … [ WHERE … ]
# create table users (
user_id bigint primary key,
username text not null,
name text
);
create unique index on users(lower(username));
insert into users values
(1, 'alex', 'Alex'),
(2, 'bob', 'Bob');
# prepare sample_upsert as
insert into users values ($1, $2, $3)
on conflict(user_id)
do update set
username=excluded.username,
name=excluded.name
where
row(users.username, users.name)
is distinct from
row(excluded.username, excluded.name)
returning *;
# execute sample_upsert(1, 'alex', 'AlexB');
user_id | username | name
---------+----------+-------
1 | alex | AlexB
-- Repeat the same, then what?
# execute sample_upsert(1, 'alex', 'AlexB');
user_id | username | name
---------+----------+------
(0 rows)
Upsert
● ON CONFLICT DO NOTHING does not return data
# prepare fixed_upsert as
with maybe_upserted as (
insert into users values ($1, $2, $3)
on conflict(user_id)
[ … same as previous …]
returning *
)
select * from maybe_upserted
union all
select * from users where user_id = $1
limit 1;
# execute fixed_upsert(1, 'alex', 'AlexB');
user_id | username | name
---------+----------+-------
1 | alex | AlexB
-- Repeat the same, then what?
# execute fixed_upsert(1, 'alex', 'AlexB');
user_id | username | name
---------+----------+-------
1 | alex | AlexB
Upsert
● ON CONFLICT DO NOTHING does not return data
# prepare fixed_upsert as
with maybe_upserted as (
insert into users values ($1, $2, $3)
on conflict(user_id)
[ … same as previous …]
returning *
)
select * from maybe_upserted
union all
select * from users where user_id = $1
limit 1;
-- (note: union all for this trick)
# explain analyze
execute fixed_upsert(1, 'alex', 'Changed Name');
QUERY PLAN
---------------------------------------------------
Limit (..) (actual ...)
CTE maybe_upserted
-> Insert on users users_1 (...)
Conflict Resolution: UPDATE
Conflict Arbiter Indexes: users_pkey
Conflict Filter: ([…snip…])
Tuples Inserted: 0
Conflicting Tuples: 1
-> Result (...)
-> Append (...) -- (this is UNION ALL)
-> CTE Scan on maybe_upserted (...)
-> Index Scan using users_pkey on users
(...) (never executed)
Index Cond: (user_id = '1'::bigint)
Planning Time: 0.170 ms
Execution Time: 0.082 ms
JSON
JSON
● jsonb types, functions and aggregates
● Validates JSON well-formedness only
○ Not a replacement for proper schemas!
● Use with care
○ Not a replacement for proper schemas! :)
# create table metadata (
user_id int,
key text,
metadata jsonb
);
# insert into metadata values
(1, 'settings', '{"foo": "bar"}'),
(1, 'searches', '["where", "what"]');
# select user_id, jsonb_agg(metadata)
from metadata group by 1;
user_id | jsonb_agg
---------+-------------------------------------
1 | [{"foo": "bar"}, ["where", "what"]]
# select user_id, jsonb_object_agg(key, metadata)
from metadata group by 1;
user_id | jsonb_object_agg
---------+------------------------------------------
1 | {"searches": ["where", "what"],
"settings": {"foo": "bar"}}
JSON object graphs
● Compose complete JSON object graphs via
LATERAL joins.
● Does not require the rows to have JSON types
● LATERAL is a bit like "for each"
○ The subquery gets to reference the row
● GraphQL implementations on top of Postgres do
this (e.g. Hasura, PostGraphile)
# select * from users join metadata using(user_id);
user_id | username | key | metadata
---------+----------+----------+-------------------
1 | alex | settings | {"foo": "bar"}
1 | alex | searches | ["where", "what"]
# select jsonb_build_object(
'username', username,
'metadata', aggregated_metadata
) as user_object
from users
left join lateral ( -- For each user:
select jsonb_object_agg(key, metadata) as metadata from metadata
where metadata.user_id=users.user_id
) as aggregated_metadata on true;
user_object
--------------------------
{
"metadata": {
"searches": [
"where",
"what"
],
"settings": {
"foo": "bar"
}
},
"username": "alex"
}
Upserting object graphs
● Convert JSON object graphs to records via
LATERAL joins
● Upsert objects to different tables with a single
writable WITH-query
# create table users (
user_id bigint generated by default as identity primary key,
username text unique not null,
name text
);
# create table user_settings (
user_id bigint primary key references users(user_id),
settings jsonb not null
);
-- TODO: Define 'upsert' query that either creates or patches
-- across multiple tables with a single parameter
# execute upsert('[
{
"user": {"username": "alex", "name": "Alex"},
"settings": {"foo": "bar"}
},
{
"user": {"username": "mallory", "name": "Mallory"}
}
]');
# execute upsert('[
{
"user": {"username": "alex", "name": "AlexB"}
},
{
"user": {"username": "mallory", "name": "Mallory"},
"settings": {"bar": "baz"}
}
]');
# prepare upsert as
with records as (
select * from jsonb_to_recordset($1::jsonb)
as _("user" jsonb, "settings" jsonb)
), maybe_upserted_users as (
insert into users (username, name)
select username, name
from records
join lateral
jsonb_populate_record(null::users, records.user)
on(true)
on conflict (username) do -- insert or update
update set
username=excluded.username,
name=coalesce(excluded.name, users.name)
where
row(users.username, users.name) is distinct from
row(excluded.username, excluded.name)
returning *
)
select * from maybe_upserted_users;
# execute upsert('[...]');
user_id | username | name
---------+----------+---------
1 | alex | Alex
2 | mallory | Mallory
# create table users (
user_id bigint ... primary key,
username text unique not null,
name text
);
# create table user_settings (
user_id bigint primary key
references users(user_id),
settings jsonb not null
);
# execute upsert('[
{
"user": {"username": "alex",
"name": "Alex"},
"settings": {"foo": "bar"}
},
{
"user": {"username": "mallory",
"name": "Mallory"}
}
]');
# prepare upsert as
with records as (
select * from jsonb_to_recordset($1::jsonb)
as _("user" jsonb, "settings" jsonb)
), maybe_upserted_users as (
insert into users (username, name)
select username, name
from records
join lateral
jsonb_populate_record(null::users, records.user)
on(true)
on conflict (username) do -- insert or update
update set
username=excluded.username,
name=coalesce(excluded.name, users.name)
where
row(users.username, users.name) is distinct from
row(excluded.username, excluded.name)
returning *
)
select * from maybe_upserted_users;
# execute upsert('[
{
"user": {"username": "alex",
"name": "Alex"},
"settings": {"foo": "bar"}
},
{
"user": {"username": "mallory",
"name": "Mallory"}
}
]');
user_id | username | name
---------+----------+---------
1 | alex | Alex
2 | mallory | Mallory
-- The ON CONFLICT WHERE condition means nothing is returned:
# execute upsert('[...]');
user_id | username | name
---------+----------+------
(0 rows)
# prepare upsert as
with records as (
select * from jsonb_to_recordset($1::jsonb)
as _("user" jsonb, "settings" jsonb)
), maybe_upserted_users as (
insert into users (username, name)
select username, name
from records
join lateral
jsonb_populate_record(null::users, records.user)
on(true)
on conflict (username) do -- insert or update
update set
username=excluded.username,
name=coalesce(excluded.name, users.name)
where
row(users.username, users.name) is distinct from
row(excluded.username, excluded.name)
returning *
), all_users as (
...
)
select * from all_users
all_users as (
select * from maybe_upserted_users
union all
select * from users where username in (
select "user" ->> 'username' from records
except all
select username from maybe_upserted_users
)
)
-- We now get all users: created or updated or neither
# execute upsert('[...]');
user_id | username | name
---------+----------+---------
1 | alex | Alex
2 | mallory | Mallory
# execute upsert('[...]');
user_id | username | name
---------+----------+---------
1 | alex | Alex
2 | mallory | Mallory
# prepare upsert as
with records as (
...
), maybe_upserted_users as (
...
), all_users as (
...
), updated_settings as (
insert into user_settings
select user_id, settings
from all_users
join records on (records.user ->> 'username' = username)
where settings is not null
on conflict(user_id) do
update set
settings=excluded.settings
where
user_settings.settings is distinct from
excluded.settings
)
select username, user_id from all_users;
# execute upsert('[
{
"user": {"username": "alex",
"name": "Alex"},
"settings": {"foo": "bar"}
},
{
"user": {"username": "mallory",
"name": "Mallory"}
}
]');
username | user_id
----------+---------
alex | 1
mallory | 2
# select * from user_settings;
user_id | settings
---------+----------------
1 | {"foo": "bar"}
Coordinating via Postgres
NOTIFY+LISTEN and advisory locks
● LISTEN: Get an async callback when something does NOTIFY
○ LISTEN channel; -- Get notifications between transactions
● NOTIFY: Send callbacks. Delivered between transactions to LISTEN-ers
○ NOTIFY channel; -- Wake up listeners
○ Triggers can ensure notifications are sent
● Advisory locks:
○ "Leader election" through Postgres
○ Locks that last until you disconnect
○ Need a limited number of background processes to pick up work?
○ SELECT pg_try_advisory_lock(1234);
● SKIP LOCKED
○ SELECT * FROM work_items
LIMIT 1
FOR UPDATE SKIP LOCKED
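A sketch of a simple job queue built on that last bullet, combined with a
writable CTE from earlier; the work_items table and its done flag are
assumptions:
# create table work_items (id bigint primary key, payload text, done boolean default false);
# with next_item as (
select id from work_items
where not done
order by id
limit 1
for update skip locked -- concurrent workers skip rows someone else holds
)
update work_items
set done = true
from next_item
where work_items.id = next_item.id
returning work_items.*;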
Learn more!
● tinyurl.com/jz21-psql
● medium.com/@alexbrasetvik/postgres-can-do-that-f221a8046e
● medium.com/cognite
● Tweet me questions or feedback: @alexbrasetvik
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 

Postgres can do THAT?

  • 1. Alex Brasetvik // Senior Principal Architect @ Cognite Postgres can do THAT? Survey of select Postgres features, some which you should probably learn more about!
  • 2. ▪ Link to in-depth material posted in the last slide ▪ Many topics could be long talks on their own. I want to make you interested in these topics, not trying to fully cover them (at all!) ▪ Questions? Tweet me @alexbrasetvik GETTING STARTED
  • 3. $ whoami Alex Brasetvik @ Cognite ● Office of the CTO: focusing on database related stuff ● Previously co-founded the Elasticsearch platform that became Elastic Cloud ● Postgres and Elasticsearch are some of my favorite hammers ● As a proper cloud engineer, I enjoy jumping out of planes
  • 4. AGENDA generate_series EXPLAIN RETURNING WITH/Common Table Expressions ● Materialized vs inlined CTEs ● Writable CTEs ● Recursive CTEs What's up in my database? pg_stat_(activity|statements) Less locky migrations Index tricks, range types Deferrable constraints Exclusion constraints JSON
  • 5. AGENDA generate_series EXPLAIN RETURNING WITH/Common Table Expressions ● Materialized vs inlined CTEs ● Writable CTEs ● Recursive CTEs What's up in my database? pg_stat_(activity|statements) Less locky migrations Index tricks, range types Deferrable constraints Exclusion constraints JSON tl;dr: Hold on to your butts
  • 7. generate_series(1, 1000) # create table a (id int); CREATE TABLE # insert into a select generate_series(1, 100); INSERT 0 100 # select * from a limit 3; id ---- 1 2 3 (3 rows) -- A million rows in a few seconds: # timing on # create table some_numbers as select generate_series(1, 1000*1000) as i; SELECT 1000000 Time: 2245.545 ms (00:02.246) # create index on some_numbers(i); CREATE INDEX Time: 994.635 ms ● generate_series(start_inclusive, end_inclusive) ● Useful to easily create an arbitrarily large sample set ● As we'll come back to, always test with realistically sized data sets! ○ Behaviour can drastically change with changes in data set sizes
  • 8. generate_series(1, 1000) # create table a (id int); CREATE TABLE # insert into a select generate_series(1, 100); INSERT 0 100 # select * from a limit 3; id ---- 1 2 3 (3 rows) -- A million rows in a few: # timing on # create table some_numbers as select generate_series(1, 1000*1000) as i; SELECT 1000000 Time: 2245.545 ms (00:02.246) # create index on some_numbers(i); CREATE INDEX Time: 994.635 ms ● generate_series(start_inclusive, end_inclusive) ● Useful to easily create an arbitrarily large sample set ● As we'll come back to, always test with realistically sized data sets! ○ Behaviour can drastically change with changes in data set sizes (psql) command output
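A note on flexibility: generate_series also takes an optional step argument and works directly on timestamps, which makes time-series-shaped test data easy to fabricate. A minimal sketch (the readings table and its columns are made up for illustration):

# select generate_series(1, 10, 3);   -- 1, 4, 7, 10
# select generate_series(now(), now() + interval '1 day', interval '6 hours');

-- Combined with random() for more realistic-looking dummy data:
# create table readings as
select ts, random() * 100 as value
from generate_series('2021-01-01'::timestamptz,
                     '2021-01-31'::timestamptz,
                     interval '1 minute') as ts;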
  • 9. EXPLAIN Asking Postgres to detail its plans
  • 10. Please EXPLAIN # explain select * from a where id=42; -- QUERY PLAN -- Seq Scan on a (cost=0.00..2.25 rows=1 width=4) Filter: (id = 42) # explain analyze select * from a where id=42; -- QUERY PLAN -- Seq Scan on a (cost=0.00..2.25 rows=1 width=4) (actual time=0.034..0.041 rows=1 loops=1) Filter: (id = 42) Rows Removed by Filter: 99 Planning Time: 0.137 ms Execution Time: 0.057 ms ● EXPLAIN [query goes here] shows the plan of the statement without executing the query. ● EXPLAIN ANALYZE [query] executes the query while profiling it, emitting the plan with profiling information. Analyze = Execute
  • 11. Please EXPLAIN # explain select * from a where id=42; -- QUERY PLAN -- Seq Scan on a (cost=0.00..2.25 rows=1 width=4) Filter: (id = 42) # explain analyze select * from a where id=42; -- QUERY PLAN -- Seq Scan on a (cost=0.00..2.25 rows=1 width=4) (actual time=0.034..0.041 rows=1 loops=1) Filter: (id = 42) Rows Removed by Filter: 99 Planning Time: 0.137 ms Execution Time: 0.057 ms Scan the entire table Then remove rows
  • 12. # create index on a(id); CREATE INDEX # explain (analyze true, verbose true, buffers true) select * from a where id=42; -- QUERY PLAN -- Seq Scan on public.a (cost=0.00..2.25 rows=1 width=4) (actual time=0.019..0.027 rows=1 loops=1) Output: id Filter: (a.id = 42) Rows Removed by Filter: 99 Buffers: shared hit=1 Settings. FORMAT JSON can be useful too. Index not used…? Let's create an index Postgres reads 1 page. That plan is impossible to beat. It knows there's hardly any data in the table
  • 13. # set enable_seqscan to off; -- Disable "seq scans" if possible, for testing SET # explain (analyze true, verbose true, buffers true) select * from a where id=42; -- QUERY PLAN -- Index Only Scan using a_id_idx on public.a (cost=0.14..8.16 rows=1 width=4) (actual time=0.015..0.016 rows=1 loops=1) Output: id Index Cond: (a.id = 42) Heap Fetches: 1 Buffers: shared hit=2 Planner setting
  • 14. # set enable_seqscan to on; -- Revert to default setting SET # insert into a select generate_series(101, 1000*1000); -- 1 million rows in total INSERT 0 999900 # explain (analyze true, verbose true, buffers true) select * from a where id=42; -- QUERY PLAN -- Index Only Scan using a_id_idx on public.a (cost=0.42..8.44 rows=1 width=4) (actual time=0.045..0.050 rows=1 loops=1) Output: id Index Cond: (a.id = 42) Heap Fetches: 1 Buffers: shared hit=2 Planning Time: 0.074 ms Execution Time: 0.075 ms More data in the table. Picks a plan with the index.
  • 15. # drop index a_id_idx; -- Blow away the index, forcing a seq scan DROP INDEX # explain (analyze true, verbose true, buffers true) select * from a where id=42; -- QUERY PLAN -- Gather (cost=1000.00..10634.90 rows=1 width=4) (actual time=0.170..102.491 rows=1 loops=1) Output: id Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=4426 → Parallel Seq Scan on public.a (cost=0.00..9634.80 rows=1 width=4) (actual time=50.188..77.625 rows=1 loops=3) Output: id Filter: (a.id = 42) Rows Removed by Filter: 333363 Buffers: shared hit=4426 Worker 0: actual time=75.269..75.270 rows=0 loops=1 Buffers: shared hit=1326 Worker 1: actual time=75.277..75.277 rows=0 loops=1 Buffers: shared hit=1334 Planning Time: 0.180 ms Execution Time: 102.523 ms (vs 0.075 ms with the index)
  • 16. Visualizing plans ● Spotting performance problems in text can be hard, especially as queries grow larger ● Useful visualization tools: ○ explain.dalibo.com ○ explain.depesz.com
  • 17. Explaining writes # create table foo(id int primary key); CREATE TABLE # insert into foo select generate_series(1, 1000*1000); INSERT 0 1000000 # create table bar(id int primary key, foo int references foo(id) on delete cascade on update cascade); CREATE TABLE # insert into bar select generate_series(1, 1000*1000), generate_series(1, 1000*1000); INSERT 0 1000000 -- foo and bar both have 1 million rows -- bar.foo points to foo.id # delete from bar where id = 42; -- This is fast DELETE 1 Time: 1.130 ms # delete from foo where id=1000; -- This is slooow DELETE 1 Time: 405.049 ms foo id int bar id int foo int
  • 18. # explain (analyze true, verbose true, buffers true) delete from foo where id=1000; -- QUERY PLAN -- Delete on public.foo (cost=0.42..8.44 rows=1 width=6) (actual time=18.937..18.939 rows=0 loops=1) Buffers: shared hit=3 read=6 dirtied=2 -> Index Scan using foo_pkey on public.foo (cost=0.42..8.44 rows=1 width=6) (actual time=0.481..0.485 rows=1 loops=1) Output: ctid Index Cond: (foo.id = 1000) Buffers: shared hit=1 read=3 Planning Time: 1.081 ms Trigger RI_ConstraintTrigger_a_26211344 for constraint bar_foo_fkey: time=379.013 calls=1 Execution Time: 398.084 ms foo id int bar id int foo int No index on bar.foo The cascading delete is very slow! Trigger for reverse FK must scan all of bar per deleted row. (Note that the analyze causes the write to go through! "analyze" = "execute")
  • 19. EXPLAIN that again? ● Test with real-sized data sets ● Experiment with low memory settings (work_mem) to see behaviour when Postgres flushes to disk ● You can explain UPDATE and DELETE, not just SELECT ● EXPLAIN (ANALYZE) shows the plan (with profiling data) ● Spot performance problems, find missing indexes ● … but also get a better understanding of how Postgres executes things! ○ … which can avoid the performance debugging down the road ● There are many "node types" other than Seq and Index scan. Learn more about them!
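Since EXPLAIN ANALYZE really executes the statement, one way to profile a write without keeping its effect is to wrap it in a transaction and roll back. A minimal sketch using the earlier foo table (rollback undoes the row changes, though not side effects like sequence advances):

# begin;
# explain (analyze, buffers) delete from foo where id = 1000;
# rollback;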
  • 20. EXPLAIN your queries Improve your intuition for how queries execute, not just when you have a performance problem
  • 21. RETURNING * Get back changes made
  • 22. # create table person ( id int generated always as identity primary key, name text, is_crazy boolean ); CREATE TABLE # insert into person (name, is_crazy) values ('Alice', false), ('Bob', false), ('Mallory', true) returning *; -- Emit every row inserted, which includes the auto generated ID id | name | is_crazy ----+---------+---------- 1 | Alice | f 2 | Bob | f 3 | Mallory | t (3 rows) INSERT 0 3
  • 23. # update person set is_crazy = not is_crazy returning name, is_crazy; name | is_crazy ---------+---------- Alice | t Bob | t Mallory | f (3 rows) UPDATE 3 sample=# delete from person where is_crazy returning id; id ---- 1 2 (2 rows) RETURNING will … return in a few slides
  • 24. WITH Also called "Common Table Expressions" (or CTEs)
  • 25. WITH a tiny graph # select * from edges; a | b | type ----------------+----------------+-------------- A | B | friend B | C | friend B | D | friend root | A | parentOf root | something-else | bff something-else | A pump?! | ! A pump?! | D | tree-breaker (7 rows) # copy (select 'digraph G { ' || string_agg('"' || a || '" -> "' || b || '" [label="' || type || '"]; ', E'') || '}' from edges) to program 'dot -Tsvg > /tmp/test.svg'; -- Pipe to Graphviz COPY 1 # copy (select * from edges) to '/tmp/file.csv' with csv header; -- Make a CSV COPY 7 # copy (select * from edges) to program 'pbcopy' with csv header; -- CSV now on clipboard, paste straight to Google Sheets
  • 26. # with out_per_node as ( select a as node, count(*) as out_degree from edges group by a ), in_per_node as ( select b as node, count(*) as in_degree from edges group by b ) select node, coalesce(out_degree, 0) as out_degree, coalesce(in_degree, 0) as in_degree from out_per_node full outer join in_per_node using(node) order by node; node | out_degree | in_degree ----------------+------------+----------- A | 1 | 1 A pump?! | 1 | 1 B | 2 | 1 C | 0 | 1 D | 0 | 2 root | 2 | 0 something-else | 1 | 1 (7 rows)
  • 27. # with out_per_node as ( select a as node, count(*) as out_degree from edges group by a ), in_per_node as ( select b as node, count(*) as in_degree from edges group by b ) select node, coalesce(out_degree, 0) as out_degree, coalesce(in_degree, 0) as in_degree from out_per_node full outer join in_per_node using(node) order by node; node | out_degree ----------------+------------ B | 2 A pump?! | 1 something-else | 1 root | 2 A | 1 (5 rows) node | in_degree ----------------+----------- B | 1 A pump?! | 1 C | 1 something-else | 1 D | 2 A | 1 (6 rows) You can refer to these as if they were real tables in following statements
  • 28. # with out_per_node as ( select a as node, count(*) as out_degree from edges group by a ), in_per_node as ( select b as node, count(*) as in_degree from edges group by b ) select node, coalesce(out_degree, 0) as out_degree, coalesce(in_degree, 0) as in_degree from out_per_node full outer join in_per_node using(node) order by node; Equivalent sub-select: select node, coalesce(out_degree, 0) as out_degree, coalesce(in_degree, 0) as in_degree from ( select a as node, count(*) as out_degree from edges group by a ) as out_per_node full outer join ( select b as node, count(*) as in_degree from edges group by b ) as in_per_node using(node) order by node;
  • 29. # with out_per_node as ( select a as node, count(*) as out_degree from edges group by a ), in_per_node as ( select b as node, count(*) as in_degree from edges group by b ) select node, coalesce(out_degree, 0) as out_degree, coalesce(in_degree, 0) as in_degree from out_per_node full outer join in_per_node using(node) where node = 'root' order by node; Equivalent sub-select: select node, coalesce(out_degree, 0) as out_degree, coalesce(in_degree, 0) as in_degree from ( select a as node, count(*) as out_degree from edges where a = 'root' group by a ) as out_per_node full outer join ( select b as node, count(*) as in_degree from edges where b = 'root' group by b ) as in_per_node using(node) order by node;
  • 30. # with out_per_node as ( select a as node, count(*) as out_degree from edges group by a ), in_per_node as ( select b as node, count(*) as in_degree from edges group by b ) select node, coalesce(out_degree, 0) as out_degree, coalesce(in_degree, 0) as in_degree from out_per_node full outer join in_per_node using(node) where node = 'root' order by node; Equivalent sub-select: select node, coalesce(out_degree, 0) as out_degree, coalesce(in_degree, 0) as in_degree from ( select a as node, count(*) as out_degree from edges where a = 'root' group by a ) as out_per_node full outer join ( select b as node, count(*) as in_degree from edges where b = 'root' group by b ) as in_per_node using(node) order by node; How equivalent?
  • 31. # explain with out_per_node as ( select a as node, count(*) as out_degree from edges group by a ), in_per_node as ( select b as node, count(*) as in_degree from edges group by b ) select node, coalesce(out_degree, 0) as out_degree, coalesce(in_degree, 0) as in_degree from out_per_node full outer join in_per_node using(node) where node = 'root' order by node; QUERY PLAN --------------------------------------------------------------------------------------- Sort (cost=36.62..36.64 rows=9 width=48) Sort Key: (COALESCE(edges.a, edges_1.b)) -> Hash Full Join (cost=18.24..36.48 rows=9 width=48) Hash Cond: (edges.a = edges_1.b) -> GroupAggregate (cost=0.00..18.17 rows=3 width=40) Group Key: edges.a -> Seq Scan on edges (cost=0.00..18.12 rows=3 width=32) Filter: (a = 'root'::text) -> Hash (cost=18.20..18.20 rows=3 width=40) -> GroupAggregate (cost=0.00..18.17 rows=3 width=40) Group Key: edges_1.b -> Seq Scan on edges edges_1 (cost=0.00..18.12 rows=3 width=32) Filter: (b = 'root'::text)
  • 32. # explain select node, coalesce(out_degree, 0) as out_degree, coalesce(in_degree, 0) as in_degree from ( select a as node, count(*) as out_degree from edges where a = 'root' group by a ) as out_per_node full outer join ( select b as node, count(*) as in_degree from edges where b = 'root' group by b ) as in_per_node using(node) order by node; QUERY PLAN --------------------------------------------------------------------------------------- Sort (cost=36.62..36.64 rows=9 width=48) Sort Key: (COALESCE(edges.a, edges_1.b)) -> Hash Full Join (cost=18.24..36.48 rows=9 width=48) Hash Cond: (edges.a = edges_1.b) -> GroupAggregate (cost=0.00..18.17 rows=3 width=40) Group Key: edges.a -> Seq Scan on edges (cost=0.00..18.12 rows=3 width=32) Filter: (a = 'root'::text) -> Hash (cost=18.20..18.20 rows=3 width=40) -> GroupAggregate (cost=0.00..18.17 rows=3 width=40) Group Key: edges_1.b -> Seq Scan on edges edges_1 (cost=0.00..18.12 rows=3 width=32) Filter: (b = 'root'::text) (The plans are identical!)
  • 33. WITH queries ● Those two queries were not identical until Postgres 12 ● Consider ● "Materialization", i.e. saving a temporary result, causes this to materialize the entirety of table a ○ (That's a very inefficient way to get two rows..) ● Older blog posts about CTEs will emphasize this as an "optimisation barrier", probably warning about them with a_million_numbers as ( select * from a -- table from earlier with index on id ) select * from a_million_numbers where id in (42, 43); } Postgres ≤11 will always materialize the result of a "table expression" – either in memory or disk
  • 34. # set work_mem to '64 kB'; -- force disk flushing with very low limit # explain (analyze true, verbose true, buffers true) with a_million_numbers as MATERIALIZED ( select * from a ) select * from a_million_numbers where id in (42, 43); -- QUERY PLAN -- CTE Scan on a_million_numbers (cost=14426.90..36928.92 rows=10001 width=4) (actual time=0.048..1356.717 rows=4 loops=1) Output: a_million_numbers.id Filter: (a_million_numbers.id = ANY ('{42,43}'::integer[])) Rows Removed by Filter: 1000086 Buffers: shared hit=4426, temp written=1709 CTE a_million_numbers -> Seq Scan on public.a (cost=0.00..14426.90 rows=1000090 width=4) (actual time=0.015..334.729 rows=1000090 loops=1) Output: a.id Buffers: shared hit=4426 Planning Time: 0.095 ms Execution Time: 1358.238 ms Force Postgres ≤11 behaviour Low on memory, flushing to disk
  • 35. # set work_mem to '64 kB'; -- force disk flushing with very low limit # explain (analyze true, verbose true, buffers true) with a_million_numbers as [NOT MATERIALIZED] ( select * from a ) select * from a_million_numbers where id in (42, 43); -- QUERY PLAN -- Index Only Scan using a_id_idx on a (cost=0.42..12.88 rows=2 width=4) (actual time=0.020..0.026 rows=4 loops=1) Index Cond: (id = ANY ('{42,43}'::integer[])) Heap Fetches: 4 Planning Time: 0.108 ms Execution Time: 0.043 ms -- vs 1358ms for materialized plan Default Postgres ≥12 behaviour
  • 36. # set work_mem to '64 kB'; -- force disk flushing with very low limit # explain (analyze true, verbose true, buffers true) with a_million_numbers as [NOT MATERIALIZED] ( select * from a where id in (42, 43) ) select * from a_million_numbers where id in (42, 43) -- QUERY PLAN -- Index Only Scan using a_id_idx on a (cost=0.42..12.88 rows=2 width=4) (actual time=0.020..0.026 rows=4 loops=1) Index Cond: (id = ANY ('{42,43}'::integer[])) Heap Fetches: 4 Planning Time: 0.108 ms Execution Time: 0.043 ms -- vs 1358ms for materialized plan as if inlined
  • 37. Writable CTEs ● DELETE/UPDATE/INSERT statements with RETURNING can be used as "tables" too ● Note! ○ RETURNING is the only way to let other table expressions see the result. ○ Different table expressions cannot otherwise see the effect of other modifications in the same statement. -- Delete rows while simultaneously -- inserting them elsewhere WITH moved_rows AS ( DELETE FROM products WHERE date >= '2010-10-01' AND date < '2010-11-01' RETURNING * ) INSERT INTO products_log SELECT * FROM moved_rows;
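Writable CTEs chain: a later table expression can consume the RETURNING output of an earlier one, and the final SELECT can report on the whole pipeline. A sketch building on the products/products_log example above:

# with moved_rows as (
  delete from products
  where date >= '2010-10-01' and date < '2010-11-01'
  returning *
), archived as (
  insert into products_log
  select * from moved_rows
  returning *
)
select count(*) as rows_moved from archived;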
  • 38. Recursive CTEs # WITH RECURSIVE t(n) AS ( VALUES (1) -- starting value(s) UNION ALL SELECT n+1 FROM t -- t will keep growing WHERE n < 100 -- until termination condition is true ) SELECT sum(n) FROM t; sum ------ 5050
  • 39. # with recursive graph_traversal(a, b, path, depth) as ( select a, b, ARRAY[a] as path, 0 as depth from edges where a='root' union all select edges.a, edges.b, path || ARRAY[edges.a], depth + 1 from edges join graph_traversal on(edges.a=graph_traversal.b) -- Avoid looping forever if there's a cycle where not(edges.a = ANY(path)) ) select * from graph_traversal order by depth, a; a | b | path | depth ----------------+----------------+----------------------------------+------- root | A | {root} | 0 root | something-else | {root} | 0 A | B | {root,A} | 1 something-else | A pump?! | {root,something-else} | 1 A pump?! | D | {root,something-else,"A pump?!"} | 2 B | C | {root,A,B} | 2 B | D | {root,A,B} | 2 (7 rows)
  • 40. -- Mandelbrot set WITH RECURSIVE x(i) AS ( VALUES(0) UNION ALL SELECT i + 1 FROM x WHERE i < 101 ), Z(Ix, Iy, Cx, Cy, X, Y, I) AS ( SELECT Ix, Iy, X::float, Y::float, X::float, Y::float, 0 FROM (SELECT -2.2 + 0.031 * i, i FROM x) AS xgen(x,ix) CROSS JOIN (SELECT -1.5 + 0.031 * i, i FROM x) AS ygen(y,iy) UNION ALL SELECT Ix, Iy, Cx, Cy, X * X - Y * Y + Cx AS X, Y * X * 2 + Cy, I + 1 FROM Z WHERE X * X + Y * Y < 16.0 AND I < 27 ), Zt (Ix, Iy, I) AS ( SELECT Ix, Iy, MAX(I) AS I FROM Z GROUP BY Iy, Ix ORDER BY Iy, Ix ) SELECT array_to_string( array_agg( SUBSTRING( ' .,,,-----++++%%%%@@@@#### ', GREATEST(I,1), 1 ) ),'' ) FROM Zt GROUP BY Iy ORDER BY Iy;
  • 41. What's up in my database?
  • 42. pg_stat_activity # select * from pg_stat_activity where state='idle'; -[ RECORD 1 ]----+------------------------------------------------ datid | 26211311 datname | sample pid | 5291 usesysid | 16384 usename | alex application_name | psql client_addr | client_hostname | client_port | -1 backend_start | 2021-06-14 16:11:47.491729+02 xact_start | query_start | 2021-06-14 16:34:41.027633+02 state_change | 2021-06-14 16:34:41.02857+02 wait_event_type | Client wait_event | ClientRead state | idle backend_xid | backend_xmin | query | delete from person where is_crazy returning id; backend_type | client backend
  • 43. pg_stat_activity # select * from pg_stat_activity where state='idle'; -[ RECORD 1 ]----+------------------------------------------------ datid | 26211311 datname | sample pid | 5291 usesysid | 16384 usename | alex application_name | psql client_addr | client_hostname | client_port | -1 backend_start | 2021-06-14 16:11:47.491729+02 xact_start | query_start | 2021-06-14 16:34:41.027633+02 state_change | 2021-06-14 16:34:41.02857+02 wait_event_type | Client wait_event | ClientRead state | idle backend_xid | backend_xmin | query | delete from person where is_crazy returning id; backend_type | client backend # set application_name to 'my-app:some-role' jdbc:postgresql://localhost:5435/MyDB?ApplicationName=MyApp "idle in transaction" is typically bad, especially if holding locks
  • 44. What's going on? ● application_name lets you identify your connection in pg_stat_activity ● pg_terminate_backend(pid) to kill a bad connection ● pg_stat_activity for what's up right now ● pg_stat_statements extension for tracking what has happened ○ Install: create extension if not exists pg_stat_statements ○ # install on all subsequently created databases: $ psql template1 -c 'create extension if not exists pg_stat_statements'
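As a hedged example of the kind of housekeeping these views enable, the query below lists sessions that have sat "idle in transaction" for a while, and then terminates one by pid (the threshold and the pid are illustrative):

# select pid, usename, application_name, xact_start, query
from pg_stat_activity
where state = 'idle in transaction'
  and xact_start < now() - interval '5 minutes';

# select pg_terminate_backend(12345); -- pid taken from the query above

One caveat the slide glosses over: create extension only exposes the pg_stat_statements view; the module also has to be listed in shared_preload_libraries (which requires a restart) before it collects anything.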
  • 45. pg_stat_statements ● pg_stat_statements extension for tracking continuous activity ● What are the most expensive queries over time? ● One of the most useful extensions! # select * from pg_stat_statements order by total_time desc limit 1; -[ RECORD 1 ]-------+------------------------------------------------------------ […] query | with a_million_numbers as materialized (select * from a) select * from a_million_numbers where id in(42, 43) calls | 2 total_time | 2500.088486 min_time | 1160.500207 max_time | 1339.588279 mean_time | 1250.044243 stddev_time | 89.544036 rows | 0 shared_blks_hit | 8852 […] temp_blks_written | 1710
  • 46. pg_stat_statements ● Some ORMs/toolkits fill pg_stat_statements with junk: ○ WHERE id IN ($1, $2, …, $987) ○ WHERE id IN ($1, $2, …, $5432) ● Consider id = ANY($parameter_as_array) # select * from pg_stat_statements order by total_time desc limit 1; -[ RECORD 1 ]-------+------------------------------------------------------------ […] query | with a_million_numbers as materialized (select * from a) select * from a_million_numbers where id in ($1, $2, $3, …) calls | 2 total_time | 2500.088486 min_time | 1160.500207 max_time | 1339.588279 mean_time | 1250.044243 stddev_time | 89.544036 rows | 0 shared_blks_hit | 8852 […] temp_blks_written | 1710
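What the = ANY alternative looks like in practice: one statement shape no matter how many ids the application passes, so pg_stat_statements aggregates it as a single entry. A sketch against the a table from earlier (the prepared-statement name is made up):

# prepare by_ids as
select * from a where id = any($1::int[]);
# execute by_ids('{42,43}');

When you want a fresh baseline, pg_stat_statements_reset() clears the accumulated statistics.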
  • 47. Locks The likely reason for a migration causing an outage
  • 48. Sample lock block # begin; # create index on person(name); CREATE INDEX Client A Client B Client C In Postgres, DDL statements like create table, alter table, and create index are transactional. Unlike Oracle and MySQL, "CREATE TABLE" does not imply commit. "Implicit commit" is not a thing in Postgres.
  • 49. Sample lock block # begin; # create index on person(name); CREATE INDEX -- Index creation needs a share lock -- Block to see effects on others # select pg_sleep(600); # begin; -- Reading is fine # select * from person where id=3; id | name | is_crazy ----+---------+---------- 3 | Mallory | f -- But cannot update: # update person set name='Not Mallory' where id=3; -- We're BLOCKED by client A :( -- Meanwhile, checking the status: # select query, wait_event_type from pg_stat_activity where state='active' and pid != pg_backend_pid(); -[ RECORD 1 ]---+---------------------- query | select pg_sleep(600); wait_event_type | Timeout -[ RECORD 2 ]---+---------------------- query | update person set name='Not Mallory' where id=3; wait_event_type | Lock Client A Client B Client C pg_sleep is useful to artificially slow down operations to observe locking effects in dev
  • 50. Locky migrations ● CREATE INDEX prevents writes on the target table for the duration of the transaction. Locks are released (only!) on commit/rollback. ● ALTER TABLE (add column and/or constraints, etc) will need an exclusive lock on the table. ● An exclusive lock on a table will block all concurrent access = OUTAGE ● A share lock will block all concurrent write access = read only access ● You generally want to minimise time spent holding or waiting for such locks ○ statement_timeout ○ lock_timeout
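A hedged sketch of using those timeouts so a migration fails fast instead of queueing behind a long-running transaction and blocking everyone else (the values and the added column are illustrative; retry the migration if it times out):

# begin;
# set local lock_timeout = '2s';
# set local statement_timeout = '30s';
# alter table person add column nickname text;
# commit;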
  • 51. Less locky migrations ● CREATE INDEX CONCURRENTLY does not block concurrent writing to the table ○ Reads the table twice. Costs more IO. ○ Cannot be done in a transaction, i.e. it must be the only operation in its own transaction ○ Caveats apply, consult the docs ● ALTER TABLE ADD CONSTRAINT … NOT VALID ○ Adds a constraint that only applies to subsequent writes ○ Existing data is not validated, so an exclusive lock needed only briefly ● ALTER TABLE … VALIDATE CONSTRAINT … ○ Holds a much weaker lock while reading the entire table ○ Can of course fail if data exists that cannot validate ● CREATE UNIQUE INDEX CONCURRENTLY for a uniqueness constraint
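A sketch of the concurrent-index and NOT VALID foreign-key patterns above, reusing the earlier foo/bar tables and assuming the constraint does not already exist:

-- Outside of any transaction block:
# create index concurrently bar_foo_idx on bar(foo);

-- Brief exclusive lock only; existing rows are not checked yet:
# alter table bar
add constraint bar_references_foo foreign key (foo) references foo(id)
not valid;

-- Full-table read under a much weaker lock:
# alter table bar validate constraint bar_references_foo;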
  • 52. Less locky migrations # select * from equipment; id | mass_in_grams ----+--------------- 1 | 42 2 | -1 (2 rows) # alter table equipment add constraint mass_not_negative check (mass_in_grams >= 0) not valid; ALTER TABLE # insert into equipment values (3, -42); ERROR: new row for relation "equipment" violates check constraint "mass_not_negative" DETAIL: Failing row contains (3, -42). # alter table equipment validate constraint mass_not_negative; ERROR: check constraint "mass_not_negative" is violated by some row Needs exclusive lock for some milliseconds Scans the entire table without strong lock
  • 54. Deferred constraints ● A constraint can be deferred to commit time ● Useful e.g. for cyclic foreign keys ● Cyclic foreign key useful when partitioning out certain columns ○ This is useful for "last_modified" kinds of use cases. ○ Read more about "heap only tuples" if you have that use case # create table something ( -- change_info does not exist yet, adding -- FK via ALTER below: id int primary key, what text ); # create table change_info ( id int primary key references something(id) deferrable initially deferred, last_modified timestamptz not null default now() ); # alter table something add constraint must_have_change_info foreign key (id) references change_info(id) deferrable initially deferred;
  • 55. Deferred constraints # insert into something values (1, 'this-will-fail'); ERROR: insert or update on table "something" violates foreign key constraint "must_have_change_info" DETAIL: Key (id)=(1) is not present in table "change_info". # begin; -- Not failing, FK check deferred to commit # insert into something values (1, 'with-version-info'); INSERT 0 1 # insert into change_info values (1, now()); INSERT 0 1 # end; COMMIT # begin; # insert into change_info values (2, now()); INSERT 0 1 -- Will fail: # end; ERROR: insert or update on table "change_info" violates foreign key constraint "change_info_id_fkey" DETAIL: Key (id)=(2) is not present in table "something". # create table something ( -- change_info does not exist yet, adding -- FK via ALTER below: id int primary key, what text ); # create table change_info ( id int primary key references something(id) deferrable initially deferred, last_modified timestamptz not null default now() ); # alter table something add constraint must_have_change_info foreign key (id) references change_info(id) deferrable initially deferred;
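Deferrable constraints can also be toggled per transaction with SET CONSTRAINTS, for example to surface a failure earlier than commit while debugging. A small sketch using the constraint above:

# begin;
# insert into something values (2, 'added-without-change-info');
# set constraints must_have_change_info immediate;
-- the deferred FK check runs here and fails, instead of at commit
# rollback;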
  • 56. Functional indexes -- Functional indexes: # create table users ( id int primary key, username text ); # create unique index on users(lower(username)); # insert into users values (1, 'alex'), (2, 'ALEX'); ERROR: duplicate key value violates unique constraint "users_lower_idx" DETAIL: Key (lower(username))=(alex) already exists. -- Note: Index lookups must use the -- same function -- This can use the index: # select * from users where lower(username)='alex'; -- This will NOT be able to use that index: # select * from users where username='alex';
  • 57. Partial indexes # create table assets ( id int primary key, external_id text not null, deleted_at timestamp ); # create unique index on assets(external_id) where deleted_at is null; # insert into assets values (1, 'one', now()), (2, 'one', null); INSERT 0 2 # insert into assets values (3, 'not one', null), (4, 'one', null); ERROR: duplicate key value violates unique constraint "assets_external_id_idx" DETAIL: Key (external_id)=(one) already exists. -- This can use the partial index: # select * from assets where external_id='one' AND deleted_at is null; -- This will NOT be able to use that index: # select * from users where external_id='one'; -- missing (deleted_at is null)
  • 58. Range Types and Exclusion Constraints
  • 59. Range types ● A range is an interval of numeric-like data ● int4range(0, 10): 0 <= n < 10 ● tstzrange(now(), null): any time >= now() ● Index support for overlaps and containment ○ overlaps: int4range(0, 10) && int4range(9, 20) ○ contains: ■ int4range(0, 10) @> 2 ■ int4range(0, 10) @> int4range(2, 4) # create table intervals ( id int primary key, start timestamptz, "end" timestamptz ); -- Insert 1 million random intervals # with random_starts as ( select generate_series(1, 1000*1000) as id, '2020-01-01'::timestamptz + (1000 * random()) * interval '1 day' as start ) insert into intervals select id, start, start + (1000 * random() * interval '1 hour') as "end" from random_starts; INSERT 0 1000000 # create index on intervals using gist(tstzrange(start, "end")); "Generalised search tree", R-tree and more
  • 60. Range types # create table intervals ( id int primary key, start timestamptz, "end" timestamptz ); -- [Insert random intervals -- happened here] # create index on intervals using gist(tstzrange(start, "end")); # explain analyze select * from intervals where tstzrange(start, "end") && tstzrange(now(), now() + interval '7 days'); -- QUERY PLAN -- Bitmap Heap Scan on intervals (cost=1368.16..8447.01 rows=28354 width=20) (actual time=46.379..80.761 rows=27837 loops=1) Recheck Cond: (tstzrange(start, "end") && tstzrange(now(), (now() + '7 days'::interval))) Heap Blocks: exact=6278 -> Bitmap Index Scan on intervals_tstzrange_idx (cost=0.00..1361.08 rows=28354 width=0) (actual time=44.309..44.309 rows=27837 loops=1) Index Cond: (tstzrange(start, "end") && tstzrange(now(), (now() + '7 days'::interval))) Planning Time: 0.118 ms Execution Time: 56.151 ms
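Containment queries can use the same GiST index, as long as they spell out the same tstzrange(start, "end") expression:

# select count(*) from intervals
where tstzrange(start, "end") @> now(); -- ranges containing this instant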
  • 61. Exclusion constraints ● Define ranges that cannot overlap ● No two reservations of the same resource can overlap ● Exclusion constraints cannot be added "concurrently", i.e. without write-locking the table. # create table reservations ( room text, start timestamptz, "end" timestamptz, constraint no_double_booking exclude using gist( room with =, tstzrange(start, "end") with && ) ); # insert into reservations values ('zelda', '2021-06-15', '2021-06-16'), ('zelda', '2021-06-17', '2021-06-18'), ('portal', '2021-06-01', '2021-07-01'); INSERT 0 3 # insert into reservations values ('zelda', '2021-06-14', '2021-06-16'); ERROR: conflicting key value violates exclusion constraint "no_double_booking" DETAIL: [omitted]
  • 62. Exclusion constraints ● Define constraint that all elements must be the same ○ "Exclude dissimilar" ● Example: ○ Don't put humans and lions in the same cage at the same time # create table cages ( cage text, animal text, start timestamptz, "end" timestamptz, constraint just_same_animals exclude using gist( cage with =, animal with !=, tstzrange(start, "end") with && ) ); # insert into cages values ('cellar', 'human', '2021-06-15', '2021-06-16'), ('bedroom', 'lion', '2021-06-15', '2021-06-16'), ('cellar', 'human', '2021-06-01', '2021-07-01'); INSERT 0 3 # insert into cages values ('bedroom', 'human', '2021-06-14', '2021-06-16'); ERROR: conflicting key value violates exclusion constraint "just_same_animals" DETAIL: [omitted]
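One practical note, hedged because it depends on the installation: the scalar = and != parts of these GiST exclusion constraints (on room and animal) typically rely on the btree_gist extension, so the table definitions above generally need it created first:

# create extension if not exists btree_gist;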
  • 64. Upsert ● INSERT can take an ON CONFLICT ● Given a conflict, either ○ DO NOTHING ○ DO UPDATE SET … [ WHERE … ] # create table users ( user_id bigint primary key, username text not null, name text ); create unique index on users(lower(username)); insert into users values (1, 'alex', 'Alex'), (2, 'bob', 'Bob'); # prepare sample_upsert as insert into users values ($1, $2, $3) on conflict(user_id) do update set username=excluded.username, name=excluded.name where row(users.username, users.name) is distinct from row(excluded.username, excluded.name) returning *; # execute sample_upsert(1, 'alex', 'AlexB'); user_id | username | name ---------+----------+------- 1 | alex | AlexB -- Repeat the same, then what? # execute sample_upsert(1, 'alex', 'AlexB'); user_id | username | name ---------+----------+------ (0 rows)
  • 65. Upsert ● ON CONFLICT DO NOTHING does not return data # prepare fixed_upsert as with maybe_upserted as ( insert into users values ($1, $2, $3) on conflict(user_id) [ … same as previous …] returning * ) select * from maybe_upserted union all select * from users where user_id = $1 limit 1; # execute fixed_upsert(1, 'alex', 'AlexB'); user_id | username | name ---------+----------+------- 1 | alex | AlexB -- Repeat the same, then what? # execute fixed_upsert(1, 'alex', 'AlexB'); user_id | username | name ---------+----------+------- 1 | alex | AlexB
  • 66. Upsert ● ON CONFLICT DO NOTHING does not return data # prepare fixed_upsert as with maybe_upserted as ( insert into users values ($1, $2, $3) on conflict(user_id) [ … same as previous …] returning * ) select * from maybe_upserted union all select * from users where user_id = $1 limit 1; -- (note: union all for this trick) # explain analyze execute fixed_upsert(1, 'alex', 'Changed Name'); QUERY PLAN --------------------------------------------------- Limit (..) (actual ...) CTE maybe_upserted -> Insert on users users_1 (...) Conflict Resolution: UPDATE Conflict Arbiter Indexes: users_pkey Conflict Filter: ([…snip…]) Tuples Inserted: 0 Conflicting Tuples: 1 -> Result (...) -> Append (...) -- (this is UNION ALL) -> CTE Scan on maybe_upserted (...) -> Index Scan using users_pkey on users (...) (never executed) Index Cond: (user_id = '1'::bigint) Planning Time: 0.170 ms Execution Time: 0.082 ms
  • 67. JSON
  • 68. JSON ● jsonb types, functions and aggregates ● Validates conformity only ○ Not a replacement for proper schemas! ● Use with care ○ Not a replacement for proper schemas! :) # create table metadata ( user_id int, key text, metadata jsonb ); # insert into metadata values (1, 'settings', '{"foo": "bar"}'), (1, 'searches', '["where", "what"]'); # select user_id, jsonb_agg(metadata) from metadata group by 1; user_id | jsonb_agg ---------+------------------------------------- 1 | [{"foo": "bar"}, ["where", "what"]] # select user_id, jsonb_object_agg(key, metadata) from metadata group by 1; user_id | jsonb_object_agg ---------+------------------------------------------ 1 | {"searches": ["where", "what"], "settings": {"foo": "bar"}}
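jsonb also gets index support: a GIN index accelerates containment (@>) and key-existence queries. A sketch on the metadata table above (jsonb_path_ops is an optional, smaller operator class that supports only @>):

# create index on metadata using gin (metadata);
# select * from metadata where metadata @> '{"foo": "bar"}';

-- Smaller and often faster alternative, containment only:
# create index on metadata using gin (metadata jsonb_path_ops);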
  • 69. JSON object graphs ● Compose complete JSON object graphs via LATERAL joins. ● Does not require the rows to have JSON types ● LATERAL is a bit like "for each" ○ The subquery gets to reference the row ● GraphQL-implementations on top of Postgres do this (e.g. Hasura, Postgraphile) # select * from users join metadata using(user_id); user_id | username | key | metadata ---------+----------+----------+------------------- 1 | alex | settings | {"foo": "bar"} 1 | alex | searches | ["where", "what"] # select jsonb_build_object( 'username', username, 'metadata', aggregated_metadata ) as user_object from users left join lateral ( -- For each user: select jsonb_object_agg(key, metadata) as metadata from metadata where metadata.user_id=users.user_id ) as aggregated_metadata on true; user_object -------------------------- { "metadata": { "searches": [ "where", "what" ], "settings": { "foo": "bar" } }, "username": "alex" }
  • 70. Upserting object graphs ● Convert JSON object graphs to records via LATERAL joins ● Upsert objects to different tables with a single writable WITH-query # create table users ( user_id bigint generated by default as identity primary key, username text unique not null, name text ); # create table user_settings ( user_id bigint primary key references users(user_id), settings jsonb not null ); -- TODO: Define 'upsert' query that either creates or patches -- across multiple tables with a single parameter # execute upsert('[ { "user": {"username": "alex", "name": "Alex"}, "settings": {"foo": "bar"} }, { "user": {"username": "mallory", "name": "Mallory"} } ]'); # execute upsert('[ { "user": {"username": "alex", "name": "AlexB"} }, { "user": {"username": "mallory", "name": "Mallory"}, "settings": {"bar": "baz"} } ]');
  • 71. # prepare upsert as with records as ( select * from jsonb_to_recordset($1::jsonb) as _("user" jsonb, "settings" jsonb) ), maybe_upserted_users as ( insert into users (username, name) select username, name from records join lateral jsonb_populate_record(null::users, records.user) on(true) on conflict (username) do -- insert or update update set username=excluded.username, name=coalesce(excluded.name, users.name) where row(users.username, users.name) is distinct from row(excluded.username, excluded.name) returning * ) select * from maybe_upserted_users; # execute upsert('[...]'); user_id | username | name ---------+----------+--------- 1 | alex | Alex 2 | mallory | Mallory # create table users ( user_id bigint ... primary key, username text unique not null, name text ); # create table user_settings ( user_id bigint primary key references users(user_id), settings jsonb not null ); # execute upsert('[ { "user": {"username": "alex", "name": "Alex"}, "settings": {"foo": "bar"} }, { "user": {"username": "mallory", "name": "Mallory"} } ]');
  • 72. # prepare upsert as with records as ( select * from jsonb_to_recordset($1::jsonb) as _("user" jsonb, "settings" jsonb) ), maybe_upserted_users as ( insert into users (username, name) select username, name from records join lateral jsonb_populate_record(null::users, records.user) on(true) on conflict (username) do -- insert or update update set username=excluded.username, name=coalesce(excluded.name, users.name) where row(users.username, users.name) is distinct from row(excluded.username, excluded.name) returning * ) select * from maybe_upserted_users; # execute upsert('[ { "user": {"username": "alex", "name": "Alex"}, "settings": {"foo": "bar"} }, { "user": {"username": "mallory", "name": "Mallory"} } ]'); user_id | username | name ---------+----------+--------- 1 | alex | Alex 2 | mallory | Mallory -- On conflict's WHERE condition makes nothing be returned: # execute upsert('[...]'); user_id | username | name ---------+----------+------ (0 rows)
  • 73. # prepare upsert as with records as ( select * from jsonb_to_recordset($1::jsonb) as _("user" jsonb, "settings" jsonb) ), maybe_upserted_users as ( insert into users (username, name) select username, name from records join lateral jsonb_populate_record(null::users, records.user) on(true) on conflict (username) do -- insert or update update set username=excluded.username, name=coalesce(excluded.name, users.name) where row(users.username, users.name) is distinct from row(excluded.username, excluded.name) returning * ), all_users as ( ... ) select * from all_users all_users as ( select * from maybe_upserted_users union all select * from users where username in ( select "user" ->> 'username' from records except all select username from maybe_upserted_users ) ) -- We now get all users: created or updated or neither # execute upsert('[...]'); user_id | username | name ---------+----------+--------- 1 | alex | Alex 2 | mallory | Mallory # execute upsert('[...]'); user_id | username | name ---------+----------+--------- 1 | alex | Alex 2 | mallory | Mallory
  • 74. # prepare upsert as with records as ( ... ), maybe_upserted_users as ( ... ), all_users as ( ... ), updated_settings as ( insert into user_settings select user_id, settings from all_users join records on (records.user ->> 'username' = username) where settings is not null on conflict(user_id) do update set settings=excluded.settings where user_settings.settings is distinct from excluded.settings ) select username, user_id from all_users; # execute upsert('[ { "user": {"username": "alex", "name": "Alex"}, "settings": {"foo": "bar"} }, { "user": {"username": "mallory", "name": "Mallory"} } ]'); username | user_id ----------+--------- alex | 1 mallory | 2 # select * from user_settings; user_id | settings ---------+---------------- 1 | {"foo": "bar"}
  • 76. NOTIFY+LISTEN and advisory locks ● LISTEN: Get an async callback when something does NOTIFY ○ LISTEN channel; -- Get notifications between transactions ● NOTIFY: Send callbacks. Delivered between transactions to LISTEN-ers ○ NOTIFY channel; -- Wake up listeners ○ Triggers can ensure notifications are sent ● Advisory locks: ○ "Leader election" through Postgres ○ Locks that last until you disconnect ○ Need a limited number of background processes to pick up work? ○ SELECT pg_try_advisory_lock(1234); ● SKIP LOCKED ○ SELECT * FROM work_items LIMIT 1 FOR UPDATE SKIP LOCKED
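A hedged sketch tying those pieces together as a tiny job queue; work_items comes from the slide's SKIP LOCKED example, while the trigger, function and channel names are made up for illustration:

# create table work_items (
  id bigint generated always as identity primary key,
  payload jsonb,
  done boolean not null default false
);

# create function notify_new_work() returns trigger language plpgsql as $$
begin
  perform pg_notify('new_work', new.id::text);
  return new;
end;
$$;

# create trigger work_items_notify
after insert on work_items
for each row execute function notify_new_work();

-- A worker: wake up on notifications, then claim one item nobody else holds
# listen new_work;
# begin;
# select * from work_items where not done
order by id for update skip locked limit 1;
-- ...do the work, mark it done, then:
# commit;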
  • 77. Learn more! ● tinyurl.com/jz21-psql ● medium.com/@alexbrasetvik/postgres-can-do-that-f221a8046e ● medium.com/cognite ● Tweet me questions or feedback: @alexbrasetvik all the same