Maxim Boguk
How to teach
an elephant
to rock’n’roll
About Data Egret
dataegret.com
At Data Egret (formerly known as PostgreSQL-Consulting.com) we
provide database maintenance solutions and system administration
support covering the PostgreSQL database life cycle, for large and
small businesses alike.

24/7 Support

Consulting

Health Check

Monitoring

Training
dataegret.com
How to teach an elephant to rock’n’roll
dataegret.com

Some queries can’t be automatically optimized

An alternative approach

Manual fast algorithm – huge performance boost
Really?
dataegret.com

Demonstrate very useful fast alternative algorithms

Provide methods for converting PL/PgSQL to plain SQL

Additionally: a self-contained learning package
Goals
dataegret.com

Basic query optimization techniques

PostgreSQL query planner and optimizer limitations

Compare PL/PgSQL performance vs plain SQL
(The PL/PgSQL samples are written to be easy to
understand; they are not the fastest possible
implementations)
Non goals
dataegret.com

PL/PgSQL knowledge

PostgreSQL SQL features:

WITH [RECURSIVE]

[JOIN] LATERAL

UNNEST [WITH ORDINALITY]
Prerequisites
dataegret.com

PostgreSQL version 9.6

Works on 9.5 and 9.4 without (major) changes

Running on 9.3 (and down to 8.4) is possible, but requires
work-arounds for missing features (see the sketch below).
PostgreSQL version
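For example, UNNEST ... WITH ORDINALITY (used later in this deck)
only appeared in 9.4. A minimal work-around sketch for older versions
(an illustration added here, not from the original talk), emulating it
with generate_subscripts:
-- emulate UNNEST ... WITH ORDINALITY on pre-9.4 PostgreSQL
SELECT ('{20,10,30,100,16}'::int[])[i] AS _a_id, i AS _pos
FROM generate_subscripts('{20,10,30,100,16}'::int[], 1) AS i;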
dataegret.com

Problem description

Native approach - a simple SQL query

EXPLAIN ANALYZE

Alternative algorithm on PL/PgSQL

The same algorithm implemented on SQL

EXPLAIN ANALYZE

Performance results
Structure
01
Initial data
preparation
dataegret.com
Initial data preparation
Schema
CREATE UNIQUE INDEX blog_post_test_author_id_ctime_ukey
ON b_p_t USING btree (author_id, ctime);
dataegret.com
Create the blog posts table:
DROP TABLE IF EXISTS b_p_t;
CREATE TABLE b_p_t (
id BIGSERIAL PRIMARY KEY,
ctime TIMESTAMP NOT NULL,
author_id INTEGER NOT NULL,
payload text);
Initial data preparation
Part 1
dataegret.com
Populate the table with test data:
-- generate 10,000,000 blog posts from ~1,000 authors, an average of
-- 10,000 posts per author, over the last 5 years
-- expect a few hours of run time
INSERT INTO b_p_t (ctime, author_id, payload)
SELECT
-- random in last 5 years
now()-(random()*365*24*3600*5)*'1 second'::interval AS ctime,
-- 1001 authors
(random()*1000)::int AS author_id,
-- random text-like payload 100-2100 bytes long
(SELECT
string_agg(substr('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 ', (random() * 72)::integer + 1, 1), '')
FROM generate_series(1, 100+i%10+(random() * 2000)::integer)) AS payload
-- 10M posts
FROM generate_series(1, 10000000) AS g(i);
Initial data preparation
Part 2
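If a multi-hour load is impractical, a scaled-down sketch (an
assumption of this write-up, not the original script; 100,000 rows
shows the same plan shapes, though the timings quoted later will differ):
-- quick variant: 100k posts with a simple payload
INSERT INTO b_p_t (ctime, author_id, payload)
SELECT
now()-(random()*365*24*3600*5)*'1 second'::interval,
(random()*1000)::int,
repeat('x', 100+(random()*2000)::integer)
FROM generate_series(1, 100000) AS g(i);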
Initial data preparation
Part 3
Populate the table with test data (continued):
--delete generated duplicates
DELETE FROM b_p_t where (author_id, ctime) IN (select author_id, ctime
from b_p_t group by author_id, ctime having count(*)>1);
CREATE INDEX bpt_ctime_key on b_p_t(ctime);
CREATE UNIQUE INDEX bpt_a_id_ctime_ukey on b_p_t(author_id, ctime);
-- create authors table
DROP TABLE IF EXISTS a_t;
CREATE TABLE a_t AS select distinct on (author_id) author_id AS id,
'author_'||author_id AS name FROM b_p_t;
ALTER TABLE a_t ADD PRIMARY KEY (id);
ALTER TABLE b_p_t ADD CONSTRAINT author_id_fk FOREIGN KEY (author_id)
REFERENCES a_t(id);
ANALYZE a_t;
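It may also be worth analyzing the big table explicitly instead of
waiting for autovacuum (an optional addition, not in the original script):
ANALYZE b_p_t;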
dataegret.com
01
02
IOS for
large
offsets
dataegret.com

Queries with a large OFFSET are always slow

To produce the 1,000,001st row, the database first needs to
iterate over all 1,000,000 preceding rows

Alternative: use a fast Index Only Scan to skip the first
1,000,000 rows
IOS for large offsets
dataegret.com
SELECT
*
FROM b_p_t
ORDER BY id
OFFSET 1000000
LIMIT 10
IOS for large offsets
Native SQL
02
dataegret.com
Limit (actual time=503..503 rows=10 loops=1)
-> Index Scan using b_p_t_pkey on b_p_t
(actual time=0..386 rows=1000010 loops=1)
IOS for large offsets
Native SQL EXPLAIN
02
dataegret.com
CREATE OR REPLACE FUNCTION ios_offset_test
(a_offset BIGINT, a_limit BIGINT) RETURNS SETOF b_p_t LANGUAGE plpgsql
AS $function$
DECLARE start_id b_p_t.id%TYPE;
BEGIN
--find a starting id using IOS to skip OFFSET rows
SELECT id INTO start_id FROM b_p_t
ORDER BY id OFFSET a_offset LIMIT 1;
--return result using normal index scan
RETURN QUERY SELECT * FROM b_p_t WHERE id>=start_id
ORDER BY ID LIMIT a_limit;
END;
$function$;
IOS for large offsets
PL/PgSQL
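A usage sketch for the function above (same offset and limit as the
native query):
SELECT * FROM ios_offset_test(1000000, 10);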
02
dataegret.com
SELECT bpt.* FROM
(
--find a starting id using IOS to skip OFFSET rows
SELECT id FROM b_p_t
ORDER BY id OFFSET 1000000 LIMIT 1
) AS t,
LATERAL (
--return result using normal index scan
SELECT * FROM b_p_t WHERE id>=t.id
ORDER BY id LIMIT 10
) AS bpt;
IOS for large offsets
Advanced SQL
02
dataegret.com
-> Index Only Scan using blog_post_test_pkey on b_p_t
(actual time=0..236 rows=1000001 loops=1)
-> Index Scan using blog_post_test_pkey on b_p_t
(actual time=0.016..0.026 rows=10 loops=1)
Index Cond: (id >= b_p_t.id)
IOS for large offsets
Advanced SQL EXPLAIN
dataegret.com
IOS for large offsets
Performance
OFFSET Native PL/PgSQL Advanced
1000 0.7ms 0.5ms 0.4ms
10000 3.0ms 1.2ms 1.1ms
100000 26.2ms 7.7ms 7.4ms
1000000 273.0ms 71.0ms 70.9ms
02
03
Pull down
LIMIT under
JOIN
dataegret.com

A combination of JOIN, ORDER BY and LIMIT

The database performs the join for every row
(not only for the LIMIT rows)

In many cases it’s unnecessary

Alternative: pull down LIMIT+ORDER BY under JOIN
Pull down LIMIT under JOIN
dataegret.com
SELECT
*
FROM b_p_t
JOIN a_t ON a_t.id=author_id
WHERE author_id IN (1,2,3,4,5)
ORDER BY ctime
LIMIT 10
Pull down LIMIT under JOIN
Native SQL
03
dataegret.com
-> Sort (actual time=345..345 rows=10 loops=1)
-> Nested Loop (actual time=0.061..295.832 rows=50194
loops=1)
-> Index Scan using b_p_t_author_id_ctime_ukey on
b_p_t (actual time=0..78 rows=50194 loops=1)
Index Cond: (author_id = ANY
('{1,2,3,4,5}'::integer[]))
-> Index Scan using a_t_pkey on a_t
(actual time=0.002 rows=1 loops=50194)
Index Cond: (id = b_p_t.author_id)
Pull down LIMIT under JOIN
Native SQL EXPLAIN
03
dataegret.com
CREATE OR REPLACE FUNCTION join_limit_pulldown_test (a_authors BIGINT[],
a_limit BIGINT)
RETURNS TABLE (id BIGINT, ctime TIMESTAMP, author_id INT, payload TEXT, name
TEXT) LANGUAGE plpgsql
AS $function$
DECLARE t record;
BEGIN
FOR t IN (
-- find ONLY required rows first
SELECT * FROM b_p_t
WHERE b_p_t.author_id=ANY(a_authors)
ORDER BY ctime LIMIT a_limit
) LOOP
-- and only after join with authors
RETURN QUERY SELECT t.*, a_t.name FROM a_t
WHERE a_t.id=t.author_id;
END LOOP;
END;
$function$;
Pull down LIMIT under JOIN
PL/PgSQL
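A usage sketch for the function above (the cast matches the BIGINT[]
parameter):
SELECT * FROM join_limit_pulldown_test(ARRAY[1,2,3,4,5]::bigint[], 10);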
03
dataegret.com
SELECT bpt_with_a_name.* FROM
-- find ONLY required rows first
(
SELECT * FROM b_p_t
WHERE author_id IN (1,2,3,4,5)
ORDER BY ctime LIMIT 10
) AS t, LATERAL (
-- and only after join with the authors
SELECT t.*,a_t.name FROM a_t
WHERE a_t.id=t.author_id
) AS bpt_with_a_name
-- second ORDER BY required
ORDER BY ctime LIMIT 10;
Pull down LIMIT under JOIN
Advanced SQL
03
dataegret.com
-> Nested Loop (actual time=68..68 rows=10 loops=1)
-> Sort (actual time=68..68 rows=10 loops=1)
-> Index Scan using b_p_t_author_id_ctime_ukey on b_p_t
(actual time=0..49 rows=50194 loops=1)
Index Cond: (author_id = ANY ('{1,2,3,4,5}'::integer[]))
-> Index Scan using a_t_pkey on a_t (actual
time=0.002..0.002 rows=1 loops=10)
Index Cond: (id = b_p_t.author_id)
Pull down LIMIT under JOIN
Advanced SQL EXPLAIN
03
dataegret.com
author_id IN (1,2,3,4,5) / LIMIT 10
Native SQL: 155ms
PL/PgSQL: 52ms
Advanced SQL: 51ms
Pull down LIMIT under JOIN
Performance
04
DISTINCT
04
dataegret.com

DISTINCT on a large table is always slow

It scans the whole table

Even if the number of distinct values is low

Alternative: creative index use to perform DISTINCT

Technique known as LOOSE INDEX SCAN:
https://wiki.postgresql.org/wiki/Loose_indexscan
DISTINCT
04
dataegret.com
SELECT
DISTINCT author_id
FROM b_p_t
DISTINCT
Native SQL
04
dataegret.com
Unique (actual time=0..5235 rows=1001 loops=1)
-> Index Only Scan using b_p_t_author_id_ctime_ukey
on b_p_t
(actual time=0..3767 rows=9999964 loops=1)
DISTINCT
Native SQL EXPLAIN
04
dataegret.com
CREATE OR REPLACE FUNCTION fast_distinct_test
() RETURNS SETOF INTEGER
LANGUAGE plpgsql
AS $function$
DECLARE _author_id b_p_t.author_id%TYPE;
BEGIN
--start from least author_id
SELECT min(author_id) INTO _author_id FROM b_p_t;
LOOP
--finish if nothing found
EXIT WHEN _author_id IS NULL;
--return found value
RETURN NEXT _author_id;
--find the next author_id > current author_id
SELECT author_id INTO _author_id FROM b_p_t WHERE
author_id>_author_id ORDER BY author_id LIMIT 1;
END LOOP;
END;
$function$;
DISTINCT
PL/PgSQL
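A usage sketch for the function above:
SELECT * FROM fast_distinct_test();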
04
dataegret.com
WITH RECURSIVE t AS (
--start from least author_id
(SELECT author_id AS _author_id FROM b_p_t
ORDER BY author_id LIMIT 1
)
UNION ALL
SELECT author_id AS _author_id
FROM t, LATERAL (
--find the next author_id > current author_id
SELECT author_id FROM b_p_t
WHERE author_id>t._author_id
ORDER BY author_id LIMIT 1
) AS a_id
)
--return found values
SELECT _author_id FROM t;
DISTINCT
Advanced SQL
04
dataegret.com
-> Index Only Scan using
b_p_t_author_id_ctime_ukey on b_p_t b_p_t_1 (actual
time=0.015..0.015 rows=1 loops=1)
-> Index Only Scan using
b_p_t_author_id_ctime_ukey on b_p_t
(actual time=0.007..0.007 rows=1 loops=1001)
DISTINCT
Advanced SQL EXPLAIN
04
dataegret.com
Native SQL: 2660ms
PL/PgSQL: 18ms
Advanced SQL: 10ms
DISTINCT
Performance
05
DISTINCT
ON with IN
()
05
dataegret.com

DISTINCT ON with IN() is used when an application needs to
fetch the latest data for SELECTED authors in a single query

DISTINCT ON queries are usually slow

Alternative: creative use of an index to calculate DISTINCT ON
DISTINCT ON with IN ()
05
dataegret.com
SELECT
DISTINCT ON (author_id)
*
FROM b_p_t
WHERE author_id IN (1,2,3,4,5)
ORDER BY author_id, ctime DESC
DISTINCT ON with IN ()
Native SQL
05
dataegret.com
Unique (actual time=160..204 rows=5 loops=1)
-> Sort (actual time=160..195 rows=50194
loops=1)
-> Index Scan using
b_p_t_author_id_ctime_ukey on b_p_t (actual
time=0..50 rows=50194 loops=1)
Index Cond: (author_id = ANY
('{1,2,3,4,5}'::integer[]))
DISTINCT ON with IN ()
Native SQL EXPLAIN
05
dataegret.com
CREATE OR REPLACE FUNCTION distinct_on_test
(a_authors INT[]) RETURNS SETOF b_p_t
LANGUAGE plpgsql
AS $function$
DECLARE _a b_p_t.author_id%TYPE;
BEGIN
-- loop over authors list
FOREACH _a IN ARRAY a_authors LOOP
-- return the latest post for author
RETURN QUERY SELECT * FROM b_p_t
WHERE author_id=_a ORDER BY ctime DESC LIMIT 1;
END LOOP;
END;
$function$;
DISTINCT ON with IN ()
PL/PgSQL
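A usage sketch for the function above:
SELECT * FROM distinct_on_test(ARRAY[1,2,3,4,5]);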
05
dataegret.com
SELECT bpt.*
-- loop over authors list
FROM unnest(ARRAY[1,2,3,4,5]::INT[]) AS t(_author_id),
LATERAL (
-- return the latest post for author
SELECT * FROM b_p_t
WHERE author_id=t._author_id
ORDER BY ctime DESC LIMIT 1
) AS bpt;
DISTINCT ON with IN ()
Advanced SQL
05
dataegret.com
Nested Loop (actual time=0.02..0.06 rows=5 loops=1)
-> Index Scan Backward using b_p_t_author_id_ctime_ukey on
b_p_t
(actual time=0.009..0.009 rows=1 loops=5)
Index Cond: (author_id = t._author_id)
DISTINCT ON with IN ()
Advanced SQL EXPLAIN
05
dataegret.com
Native SQL: 110.0ms
PL/PgSQL: 0.5ms
Advanced SQL: 0.4ms
DISTINCT ON with IN ()
Performance
06
DISTINCT
ON over all
table
06
dataegret.com

DISTINCT ON is used when an application needs to fetch the
latest data for ALL authors in a single query

DISTINCT ON on a large table is always a performance killer

Alternative: again, creative use of indexes and SQL saves
the day
DISTINCT ON over all table
06
dataegret.com
SELECT
DISTINCT ON (author_id)
*
FROM b_p_t
ORDER BY author_id, ctime DESC
DISTINCT ON over all table
Native SQL
06
dataegret.com
Unique (actual time=29938..39450 rows=1001 loops=1)
-> Sort (actual time=29938..37845 rows=9999964 loops=1)
Sort Key: author_id, ctime DESC
Sort Method: external merge Disk: 10002168kB
-> Seq Scan on b_p_t (actual time=0.004..2472
rows=9999964 loops=1)
DISTINCT ON over all table
Native SQL EXPLAIN
06
dataegret.com
DISTINCT ON over all table
PL/PgSQL
CREATE OR REPLACE FUNCTION fast_distinct_on_test() RETURNS SETOF b_p_t
LANGUAGE plpgsql
AS $function$
DECLARE _b_p_t record;
BEGIN
--start from greatest author_id
SELECT * INTO _b_p_t FROM b_p_t
ORDER BY author_id DESC, ctime DESC LIMIT 1;
LOOP
--finish if nothing found
EXIT WHEN _b_p_t IS NULL;
--return found value
RETURN NEXT _b_p_t;
--latest post from next author_id < current author_id
SELECT * FROM b_p_t INTO _b_p_t
WHERE author_id<_b_p_t.author_id
ORDER BY author_id DESC, ctime DESC LIMIT 1;
END LOOP;
END;
$function$;
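A usage sketch for the function above:
SELECT * FROM fast_distinct_on_test();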
06
dataegret.com
WITH RECURSIVE t AS (
--start from greatest author_id
(SELECT * FROM b_p_t
ORDER BY author_id DESC, ctime DESC LIMIT 1)
UNION ALL
SELECT bpt.* FROM t, LATERAL (
--latest post from the next author_id < current author_id
SELECT * FROM b_p_t
WHERE author_id<t.author_id
ORDER BY author_id DESC, ctime DESC LIMIT 1
) AS bpt
)
--return found values
SELECT * FROM t;
DISTINCT ON over all table
Advanced SQL
06
dataegret.com
-> Index Scan Backward using b_p_t_author_id_ctime_ukey on
b_p_t
(actual time=0.008..0.008 rows=1 loops=1)
-> Index Scan Backward using b_p_t_author_id_ctime_ukey
on b_p_t
(actual time=0.007..0.007 rows=1 loops=1001)
Index Cond: (author_id < t_1.author_id)
DISTINCT ON over all table
Advanced SQL EXPLAIN
06
dataegret.com
--fast distinct(author_id) implementation (from part 4)
WITH RECURSIVE t AS (
--start from least author_id
(SELECT author_id AS _author_id FROM b_p_t
ORDER BY author_id LIMIT 1)
UNION ALL
SELECT author_id AS _author_id
FROM t, LATERAL (
--find the next author_id > current author_id
SELECT author_id FROM b_p_t WHERE author_id>t._author_id
ORDER BY author_id LIMIT 1
) AS a_id
)
SELECT bpt.*
-- loop over authors list (from part 5)
FROM t,
LATERAL (
-- return the latest post for each author
SELECT * FROM b_p_t WHERE b_p_t.author_id=t._author_id
ORDER BY ctime DESC LIMIT 1
) AS bpt;
DISTINCT ON over all table
Alternative advanced SQL
06
dataegret.com
Native SQL: 3755ms
PL/PgSQL: 27ms
Advanced SQL: 13ms
Alternative advanced SQL: 18ms
DISTINCT ON over all table
Performance
07
Let the real
fun begin:
the basic
news feed
dataegret.com
The idea of a basic news feed is very simple:
 find the N newest posts from the subscription-list authors
 maybe with some offset
 easy to implement
 hard to make it run fast
The basic news feed
dataegret.com
SELECT *
FROM b_p_t
WHERE
author_id IN (1,2,3,4,5)
ORDER BY ctime DESC
LIMIT 20 OFFSET 20
The basic news feed
Native SQL
07
dataegret.com
Limit (actual time=146..146 rows=20 loops=1)
-> Sort (actual time=146..146 rows=40 loops=1)
Sort Key: ctime DESC
Sort Method: top-N heapsort Memory: 84kB
-> Index Scan using bpt_a_id_ctime_ukey on b_p_t
(actual time=0.015..105 rows=50194 loops=1)
Index Cond: (author_id = ANY
('{1,2,3,4,5}'::integer[]))
The basic news feed
EXPLAIN
07
dataegret.com

The query fetches all the posts from all listed authors

It fetches whole rows (with all the payloads)

The more posts, the slower the query

The longer the list of authors, the slower the query
The basic news feed
Problems
07
dataegret.com

An ideal feed query implementation should require no
more than OFFSET+LIMIT index probes on the blog post table

Must be fast at least for small OFFSET/LIMIT values
(100-1000)
The basic news feed
Requirements
07
dataegret.com
The basic news feed
Alternative idea Step 0
Let's start with an array of author_ids (_a_ids):
'{20,10,30,100,16}'::int[]
pos author_id
1 20
2 10
3 30
4 100
5 16
07
dataegret.com
The basic news feed
Alternative idea Step 1
Now, using ideas from part 5, populate the ctime of
the latest post for each author (_a_ctimes):
SELECT
array_agg((
SELECT ctime
FROM b_p_t WHERE author_id=_a_id
ORDER BY ctime DESC LIMIT 1
) ORDER BY _pos)
FROM
UNNEST('{20,10,30,100,16}'::int[])
WITH ORDINALITY AS u(_a_id, _pos)
pos ctime
1 01-02-2017
2 03-02-2017
3 10-02-2017
4 28-02-2017
5 21-02-2017
07
dataegret.com
The basic news feed
Alternative idea Step 2
pos author_id
1 20
2 10
3 30
4 100
5 16
pos ctime
1 01-02-2017
2 03-02-2017
3 10-02-2017
4 28-02-2017
5 21-02-2017
Select the position of the latest
post from _a_ctimes. It will
be the first row of the final result set.
SELECT pos FROM
UNNEST(_a_ctimes) WITH ORDINALITY AS u(a_ctime, pos)
ORDER BY a_ctime DESC NULLS LAST LIMIT 1
07
dataegret.com
The basic news feed
Alternative idea Step 3
pos author_id
1 20
2 10
3 30
4 100
5 16
pos ctime
1 01-02-2017
2 03-02-2017
3 10-02-2017
4 28-02-2017
5 21-02-2017
Now, replace row 4 in _a_ctimes with the ctime of the previous post from the
same author.
SELECT ctime AS _a_ctime FROM b_p_t WHERE author_id=_a_ids[pos]
AND ctime<_a_ctimes[pos] ORDER BY ctime DESC LIMIT 1
Found pos ctime
1 4 28-02-2017
07
dataegret.com
The basic news feed
Alternative idea
pos author_id
1 20
2 10
3 30
4 100
5 16
pos ctime
1 01-02-2017
2 03-02-2017
3 10-02-2017
4 20-02-2017
5 21-02-2017
“Rinse and repeat” steps 2 and 3, collecting found rows, until the requested
LIMIT of rows is found.
Found pos ctime
1 4 28-02-2017
2 5 21-02-2017
dataegret.com
The basic news feed
PL/PgSQL
CREATE OR REPLACE FUNCTION feed_test
(a_authors INT[], a_limit INT, a_offset INT) RETURNS SETOF b_p_t
LANGUAGE plpgsql
AS $function$
DECLARE _a_ids INT[] := a_authors;
DECLARE _a_ctimes TIMESTAMP[];
DECLARE _rows_found INT := 0;
DECLARE _pos INT;
DECLARE _a_ctime TIMESTAMP;
DECLARE _a_id INT;
BEGIN
-- loop over authors list
FOR _pos IN SELECT generate_subscripts(a_authors, 1) LOOP
--populate the latest post ctime for every author
SELECT ctime INTO _a_ctime FROM b_p_t WHERE
author_id=_a_ids[_pos] ORDER BY ctime DESC LIMIT 1;
_a_ctimes[_pos] := _a_ctime;
END LOOP;
dataegret.com
The basic news feed
PL/PgSQL (continued)
WHILE _rows_found<a_limit+a_offset LOOP
--seek position of the latest post in ctime array
SELECT pos INTO _pos
FROM UNNEST(_a_ctimes) WITH ORDINALITY AS u(a_ctime, pos)
ORDER BY a_ctime DESC NULLS LAST LIMIT 1;
--get ctime of previous post of the same author
SELECT ctime INTO _a_ctime FROM b_p_t
WHERE author_id=_a_ids[_pos] AND ctime<_a_ctimes[_pos]
ORDER BY ctime DESC LIMIT 1;
dataegret.com
The basic news feed
PL/PgSQL (continued)
--offset rows done, start returning results
IF _rows_found>=a_offset THEN
RETURN QUERY SELECT * FROM b_p_t
WHERE author_id=_a_ids[_pos]
AND ctime=_a_ctimes[_pos];
END IF;
--increase found rows count
_rows_found := _rows_found+1;
--replace ctime for author with previous message ctime
_a_ctimes[_pos] := _a_ctime;
END LOOP;
END;
$function$;
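A usage sketch for the function above (10 newest posts from 5 authors,
no offset):
SELECT * FROM feed_test(ARRAY[1,2,3,4,5], 10, 0);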
07
dataegret.com
WITH RECURSIVE
r AS (
SELECT
--empty result
NULL::b_p_t AS _return,
--zero rows found yet
0::integer AS _rows_found,
--populate author ARRAY
'{1,2,3,4,5}'::int[] AS _a_ids,
--populate author ARRAY of latest blog posts
(SELECT
array_agg((
SELECT ctime FROM b_p_t WHERE author_id=_a_id ORDER BY ctime DESC LIMIT 1
) ORDER BY _pos)
FROM UNNEST('{1,2,3,4,5}'::int[]) WITH ORDINALITY AS u(_a_id, _pos)
) AS _a_ctimes
UNION ALL
SELECT
--return found row to the result set if we have already done OFFSET or more entries
CASE WHEN _rows_found>=100 THEN (
SELECT b_p_t FROM b_p_t WHERE author_id=_a_ids[_pos] AND ctime=_a_ctimes[_pos]
) ELSE NULL END,
--increase found row count
_rows_found+1,
--pass through the same a_ids array
_a_ids,
--replace current ctime for author with previous message ctime for the same author
_a_ctimes[:_pos-1]||_a_ctime||_a_ctimes[_pos+1:]
FROM r,
LATERAL (
SELECT _pos FROM UNNEST(_a_ctimes) WITH ORDINALITY AS u(_a_ctime, _pos)
ORDER BY _a_ctime DESC NULLS LAST LIMIT 1
) AS t1,
LATERAL (
SELECT ctime AS _a_ctime FROM b_p_t WHERE author_id=_a_ids[_pos] AND ctime<_a_ctimes[_pos]
ORDER BY ctime DESC LIMIT 1
) AS t2
--recurse until the required number of rows is found (offset+limit, here 100+5)
WHERE _rows_found<105
)
SELECT (_return).* FROM r WHERE _return IS NOT NULL ORDER BY _rows_found;
The basic news feed
Advanced SQL
07
dataegret.com
WITH RECURSIVE
r AS (
--initial part of recursive union
SELECT
...
UNION ALL
--main part of recursive union
SELECT
…
--exit condition
WHERE _rows_found<200
)
--produce final ordered result
SELECT (_return).* FROM r WHERE _return IS NOT NULL
ORDER BY _rows_found
Alternative implementation
Advanced SQL (structure overview)
07
dataegret.com
--empty result
NULL::b_p_t AS _return,
--zero rows found so far
0::integer AS _rows_found,
--author ARRAY
'{1,2,3,4,5}'::int[] AS _a_ids,
--populate author ARRAY of latest blog posts (see part 5)
(SELECT
array_agg((
SELECT ctime FROM b_p_t WHERE author_id=_a_id ORDER BY
ctime DESC LIMIT 1
) ORDER BY _pos)
FROM UNNEST('{1,2,3,4,5}'::int[]) WITH ORDINALITY AS
u(_a_id, _pos)
) AS _a_ctimes
Alternative implementation
Advanced SQL (initial part)
07
dataegret.com
--return found row to the result set if we have already done OFFSET or more entries
CASE WHEN _rows_found>=100 THEN (
SELECT b_p_t FROM b_p_t WHERE author_id=_a_ids[_pos] AND ctime=_a_ctimes[_pos]
) ELSE NULL END,
--increase found row count
_rows_found+1,
--pass through the same a_ids array
_a_ids,
--replace current ctime for author with previous message ctime
_a_ctimes[:_pos-1]||_a_ctime||_a_ctimes[_pos+1:]
FROM r,
LATERAL (
SELECT _pos FROM UNNEST(_a_ctimes) WITH ORDINALITY AS u(_a_ctime, _pos)
ORDER BY _a_ctime DESC NULLS LAST LIMIT 1
) AS t1,
LATERAL (
SELECT ctime AS _a_ctime FROM b_p_t
WHERE author_id=_a_ids[_pos] AND ctime<_a_ctimes[_pos]
ORDER BY ctime DESC LIMIT 1
) AS t2
Alternative implementation
Advanced SQL (main part)
07
dataegret.com
Performance
(with 10 authors)
LIMIT OFFSET Native PL/PgSQL Alternative
1 0 180ms 1.0ms 1.0ms
10 0 182ms 1.8ms 1.3ms
10 10 182ms 1.9ms 1.9ms
10 1000 187ms 53.0ms 24.7ms
1000 0 194ms 100.8ms 35.1ms
100 100 185ms 13.2ms 6.2ms
08
Final notes
dataegret.com

Efficient execution of some popular queries requires
implementing an alternative procedural algorithm

Implementing custom algorithms is usually easier in
PL/PgSQL

The same algorithm implemented in plain SQL runs faster

Process:
1. Implement and debug the algorithm in PL/PgSQL
2. Convert it to SQL
Final notes
dataegret.com
No questions?
Really?
Questions?
dataegret.com
Maxim Boguk
Thank you!