SlideShare a Scribd company logo
1 of 56
Download to read offline
Explaining the Postgres Query Optimizer 
BRUCE MOMJIAN 
January, 2012 
The optimizer is the "brain" of the database, interpreting SQL 
queries and determining the fastest method of execution. This 
talk uses the EXPLAIN command to show how the optimizer 
interprets queries and determines optimal execution. 
Creative Commons Attribution License http://momjian.us/presentations 
1 / 56
Postgres Query Execution 
User 
Terminal 
Code 
Database 
Server 
Application 
Queries 
Results 
PostgreSQL 
Libpq 
Explaining the Postgres Query Optimizer 2 / 56
Postgres Query Execution 
Postgres Postgres 
utility 
Postmaster 
Parse Statement 
Traffic Cop 
Query 
Generate Paths 
Optimal Path 
Plan 
Libpq 
Main 
Generate Plan 
Execute Plan 
e.g. CREATE TABLE, COPY 
SELECT, INSERT, UPDATE, DELETE 
Rewrite Query 
Utility 
Command 
Utilities Catalog Storage Managers 
Access Methods Nodes / Lists 
Explaining the Postgres Query Optimizer 3 / 56
Postgres Query Execution 
utility 
Parse Statement 
Traffic Cop 
Query 
Generate Paths 
Optimal Path 
Generate Plan 
Plan 
Execute Plan 
e.g. CREATE TABLE, COPY 
SELECT, INSERT, UPDATE, DELETE 
Rewrite Query 
Utility 
Command 
Explaining the Postgres Query Optimizer 4 / 56
The Optimizer Is the Brain 
http://www.wsmanaging.com/ 
Explaining the Postgres Query Optimizer 5 / 56
What Decisions Does the Optimizer Have to Make? 
I Scan Method 
I Join Method 
I Join Order 
Explaining the Postgres Query Optimizer 6 / 56
Which Scan Method? 
I Sequential Scan 
I Bitmap Index Scan 
I Index Scan 
Explaining the Postgres Query Optimizer 7 / 56
A Simple Example Using pg_class.relname 
SELECT relname 
FROM pg_class 
ORDER BY 1 
LIMIT 8; 
relname 
----------------------------------- 
_pg_foreign_data_wrappers 
_pg_foreign_servers 
_pg_user_mappings 
administrable_role_authorizations 
applicable_roles 
attributes 
check_constraint_routine_usage 
check_constraints 
(8 rows) 
Explaining the Postgres Query Optimizer 8 / 56
Let’s Use Just the First Letter of pg_class.relname 
SELECT substring(relname, 1, 1) 
FROM pg_class 
ORDER BY 1 
LIMIT 8; 
substring 
----------- 
_ 
_ 
_ 
a 
a 
a 
c 
c 
(8 rows) 
Explaining the Postgres Query Optimizer 9 / 56
Create a Temporary Table with an Index 
CREATE TEMPORARY TABLE sample (letter, junk) AS 
SELECT substring(relname, 1, 1), repeat(’x’, 250) 
FROM pg_class 
ORDER BY random(); -- add rows in random order 
SELECT 253 
CREATE INDEX i_sample on sample (letter); 
CREATE INDEX 
All the queries used in this presentation are available at 
http://momjian.us/main/writings/pgsql/optimizer.sql. 
Explaining the Postgres Query Optimizer 10 / 56
Create an EXPLAIN Function 
CREATE OR REPLACE FUNCTION lookup_letter(text) RETURNS SETOF text AS $$ 
BEGIN 
RETURN QUERY EXECUTE ’ 
EXPLAIN SELECT letter 
FROM sample 
WHERE letter = ’’’ || $1 || ’’’’; 
END 
$$ LANGUAGE plpgsql; 
CREATE FUNCTION 
Explaining the Postgres Query Optimizer 11 / 56
What is the Distribution of the sample Table? 
WITH letters (letter, count) AS ( 
SELECT letter, COUNT(*) 
FROM sample 
GROUP BY 1 
) 
SELECT letter, count, (count * 100.0 / (SUM(count) OVER ()))::numeric(4,1) AS "%" 
FROM letters 
ORDER BY 2 DESC; 
Explaining the Postgres Query Optimizer 12 / 56
What is the Distribution of the sample Table? 
letter | count | % 
--------+-------+------ 
p | 199 | 78.7 
s | 9 | 3.6 
c | 8 | 3.2 
r | 7 | 2.8 
t | 5 | 2.0 
v | 4 | 1.6 
f | 4 | 1.6 
d | 4 | 1.6 
u | 3 | 1.2 
a | 3 | 1.2 
_ | 3 | 1.2 
e | 2 | 0.8 
i | 1 | 0.4 
k | 1 | 0.4 
(14 rows) 
Explaining the Postgres Query Optimizer 13 / 56
Is the Distribution Important? 
EXPLAIN SELECT letter 
FROM sample 
WHERE letter = ’p’; 
QUERY PLAN 
------------------------------------------------------------------------ 
Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=32) 
Index Cond: (letter = ’p’::text) 
(2 rows) 
Explaining the Postgres Query Optimizer 14 / 56
Is the Distribution Important? 
EXPLAIN SELECT letter 
FROM sample 
WHERE letter = ’d’; 
QUERY PLAN 
------------------------------------------------------------------------ 
Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=32) 
Index Cond: (letter = ’d’::text) 
(2 rows) 
Explaining the Postgres Query Optimizer 15 / 56
Is the Distribution Important? 
EXPLAIN SELECT letter 
FROM sample 
WHERE letter = ’k’; 
QUERY PLAN 
------------------------------------------------------------------------ 
Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=32) 
Index Cond: (letter = ’k’::text) 
(2 rows) 
Explaining the Postgres Query Optimizer 16 / 56
Running ANALYZE Causes 
a Sequential Scan for a Common Value 
ANALYZE sample; 
ANALYZE 
EXPLAIN SELECT letter 
FROM sample 
WHERE letter = ’p’; 
QUERY PLAN 
--------------------------------------------------------- 
Seq Scan on sample (cost=0.00..13.16 rows=199 width=2) 
Filter: (letter = ’p’::text) 
(2 rows) 
Autovacuum cannot ANALYZE (or VACUUM) temporary tables because 
these tables are only visible to the creating session. 
Explaining the Postgres Query Optimizer 17 / 56
Sequential Scan 
D 
T 
A 
D 
A 
T 
A 
D 
A 
T 
A 
D 
A 
T 
A 
D 
A 
T 
A 
D 
A 
T 
A 
D 
A 
T 
A 
D 
A 
T 
A 
D 
A 
T 
A 
D 
T 
8K 
Heap 
A 
A 
D 
A 
T 
A 
D 
A 
T 
A 
A 
Explaining the Postgres Query Optimizer 18 / 56
A Less Common Value Causes a Bitmap Heap Scan 
EXPLAIN SELECT letter 
FROM sample 
WHERE letter = ’d’; 
QUERY PLAN 
----------------------------------------------------------------------- 
Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2) 
Recheck Cond: (letter = ’d’::text) 
-> Bitmap Index Scan on i_sample (cost=0.00..4.28 rows=4 width=0) 
Index Cond: (letter = ’d’::text) 
(4 rows) 
Explaining the Postgres Query Optimizer 19 / 56
Bitmap Index Scan 
Combined 
Index 1 Table 
col1 = ’A’ 
col2 = ’NS’ 
0 
& = 
’A’ AND ’NS’ 
0 
1 
0 
1 
Index 2 
1 
Index 
0 
1 0 
0 
1 
0 
Explaining the Postgres Query Optimizer 20 / 56
An Even Rarer Value Causes an Index Scan 
EXPLAIN SELECT letter 
FROM sample 
WHERE letter = ’k’; 
QUERY PLAN 
----------------------------------------------------------------------- 
Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) 
Index Cond: (letter = ’k’::text) 
(2 rows) 
Explaining the Postgres Query Optimizer 21 / 56
Index Scan 
D 
A 
D 
A 
T 
A 
D 
A 
T 
A 
D 
A 
T 
A 
D 
A 
T 
A 
D 
A 
T 
A 
D 
A 
T 
A 
< Key = > 
< Key = > 
Index 
Heap 
< Key = > 
D 
A 
T 
A 
D 
A 
T 
A 
D 
A 
T 
A 
D 
A 
T 
A 
D 
A 
T 
A 
A 
T 
Explaining the Postgres Query Optimizer 22 / 56
Let’s Look at All Values and their Effects 
WITH letter (letter, count) AS ( 
SELECT letter, COUNT(*) 
FROM sample 
GROUP BY 1 
) 
SELECT letter AS l, count, lookup_letter(letter) 
FROM letter 
ORDER BY 2 DESC; 
l | count | lookup_letter 
---+-------+----------------------------------------------------------------------- 
p | 199 | Seq Scan on sample (cost=0.00..13.16 rows=199 width=2) 
p | 199 | Filter: (letter = ’p’::text) 
s | 9 | Seq Scan on sample (cost=0.00..13.16 rows=9 width=2) 
s | 9 | Filter: (letter = ’s’::text) 
c | 8 | Seq Scan on sample (cost=0.00..13.16 rows=8 width=2) 
c | 8 | Filter: (letter = ’c’::text) 
r | 7 | Seq Scan on sample (cost=0.00..13.16 rows=7 width=2) 
r | 7 | Filter: (letter = ’r’::text) 
… 
Explaining the Postgres Query Optimizer 23 / 56
OK, Just the First Lines 
WITH letter (letter, count) AS ( 
SELECT letter, COUNT(*) 
FROM sample 
GROUP BY 1 
) 
SELECT letter AS l, count, 
(SELECT * 
FROM lookup_letter(letter) AS l2 
LIMIT 1) AS lookup_letter 
FROM letter 
ORDER BY 2 DESC; 
Explaining the Postgres Query Optimizer 24 / 56
Just the First EXPLAIN Lines 
l | count | lookup_letter 
---+-------+----------------------------------------------------------------------- 
p | 199 | Seq Scan on sample (cost=0.00..13.16 rows=199 width=2) 
s | 9 | Seq Scan on sample (cost=0.00..13.16 rows=9 width=2) 
c | 8 | Seq Scan on sample (cost=0.00..13.16 rows=8 width=2) 
r | 7 | Seq Scan on sample (cost=0.00..13.16 rows=7 width=2) 
t | 5 | Bitmap Heap Scan on sample (cost=4.29..12.76 rows=5 width=2) 
f | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2) 
v | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2) 
d | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2) 
a | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2) 
_ | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2) 
u | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2) 
e | 2 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) 
i | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) 
k | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) 
(14 rows) 
Explaining the Postgres Query Optimizer 25 / 56
We Can Force an Index Scan 
SET enable_seqscan = false; 
SET enable_bitmapscan = false; 
WITH letter (letter, count) AS ( 
SELECT letter, COUNT(*) 
FROM sample 
GROUP BY 1 
) 
SELECT letter AS l, count, 
(SELECT * 
FROM lookup_letter(letter) AS l2 
LIMIT 1) AS lookup_letter 
FROM letter 
ORDER BY 2 DESC; 
Explaining the Postgres Query Optimizer 26 / 56
Notice the High Cost for Common Values 
l | count | lookup_letter 
---+-------+------------------------------------------------------------------------- 
p | 199 | Index Scan using i_sample on sample (cost=0.00..39.33 rows=199 width=2) 
s | 9 | Index Scan using i_sample on sample (cost=0.00..22.14 rows=9 width=2) 
c | 8 | Index Scan using i_sample on sample (cost=0.00..19.84 rows=8 width=2) 
r | 7 | Index Scan using i_sample on sample (cost=0.00..19.82 rows=7 width=2) 
t | 5 | Index Scan using i_sample on sample (cost=0.00..15.21 rows=5 width=2) 
d | 4 | Index Scan using i_sample on sample (cost=0.00..15.19 rows=4 width=2) 
v | 4 | Index Scan using i_sample on sample (cost=0.00..15.19 rows=4 width=2) 
f | 4 | Index Scan using i_sample on sample (cost=0.00..15.19 rows=4 width=2) 
_ | 3 | Index Scan using i_sample on sample (cost=0.00..12.88 rows=3 width=2) 
a | 3 | Index Scan using i_sample on sample (cost=0.00..12.88 rows=3 width=2) 
u | 3 | Index Scan using i_sample on sample (cost=0.00..12.88 rows=3 width=2) 
e | 2 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) 
i | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) 
k | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) 
(14 rows) 
RESET ALL; 
RESET 
Explaining the Postgres Query Optimizer 27 / 56
This Was the Optimizer’s Preference 
l | count | lookup_letter 
---+-------+----------------------------------------------------------------------- 
p | 199 | Seq Scan on sample (cost=0.00..13.16 rows=199 width=2) 
s | 9 | Seq Scan on sample (cost=0.00..13.16 rows=9 width=2) 
c | 8 | Seq Scan on sample (cost=0.00..13.16 rows=8 width=2) 
r | 7 | Seq Scan on sample (cost=0.00..13.16 rows=7 width=2) 
t | 5 | Bitmap Heap Scan on sample (cost=4.29..12.76 rows=5 width=2) 
f | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2) 
v | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2) 
d | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2) 
a | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2) 
_ | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2) 
u | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2) 
e | 2 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) 
i | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) 
k | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) 
(14 rows) 
Explaining the Postgres Query Optimizer 28 / 56
Which Join Method? 
I Nested Loop 
I With Inner Sequential Scan 
I With Inner Index Scan 
I Hash Join 
I Merge Join 
Explaining the Postgres Query Optimizer 29 / 56
What Is in pg_proc.oid? 
SELECT oid 
FROM pg_proc 
ORDER BY 1 
LIMIT 8; 
oid 
----- 
31 
33 
34 
35 
38 
39 
40 
41 
(8 rows) 
Explaining the Postgres Query Optimizer 30 / 56
Create Temporary Tables 
from pg_proc and pg_class 
CREATE TEMPORARY TABLE sample1 (id, junk) AS 
SELECT oid, repeat(’x’, 250) 
FROM pg_proc 
ORDER BY random(); -- add rows in random order 
SELECT 2256 
CREATE TEMPORARY TABLE sample2 (id, junk) AS 
SELECT oid, repeat(’x’, 250) 
FROM pg_class 
ORDER BY random(); -- add rows in random order 
SELECT 260 
These tables have no indexes and no optimizer statistics. 
Explaining the Postgres Query Optimizer 31 / 56
Join the Two Tables 
with a Tight Restriction 
EXPLAIN SELECT sample2.junk 
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) 
WHERE sample1.id = 33; 
QUERY PLAN 
--------------------------------------------------------------------- 
Nested Loop (cost=0.00..234.68 rows=300 width=32) 
-> Seq Scan on sample1 (cost=0.00..205.54 rows=50 width=4) 
Filter: (id = 33::oid) 
-> Materialize (cost=0.00..25.41 rows=6 width=36) 
-> Seq Scan on sample2 (cost=0.00..25.38 rows=6 width=36) 
Filter: (id = 33::oid) 
(6 rows) 
Explaining the Postgres Query Optimizer 32 / 56
Nested Loop Join 
with Inner Sequential Scan 
Outer Inner 
aag 
aai 
aay aag 
aar 
aas 
aar 
aay 
aaa 
aag 
No Setup Required 
aai 
Used For Small Tables 
Explaining the Postgres Query Optimizer 33 / 56
Pseudocode for Nested Loop Join 
with Inner Sequential Scan 
for (i = 0; i < length(outer); i++) 
for (j = 0; j < length(inner); j++) 
if (outer[i] == inner[j]) 
output(outer[i], inner[j]); 
Explaining the Postgres Query Optimizer 34 / 56
Join the Two Tables with a Looser Restriction 
EXPLAIN SELECT sample1.junk 
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) 
WHERE sample2.id > 33; 
QUERY PLAN 
---------------------------------------------------------------------- 
Hash Join (cost=30.50..950.88 rows=20424 width=32) 
Hash Cond: (sample1.id = sample2.id) 
-> Seq Scan on sample1 (cost=0.00..180.63 rows=9963 width=36) 
-> Hash (cost=25.38..25.38 rows=410 width=4) 
-> Seq Scan on sample2 (cost=0.00..25.38 rows=410 width=4) 
Filter: (id > 33::oid) 
(6 rows) 
Explaining the Postgres Query Optimizer 35 / 56
Hash Join 
Hashed 
Must fit in Main Memory 
aak 
aar 
aak 
aam aay aar 
aao aaw 
aay 
aag 
aas 
Outer Inner 
Explaining the Postgres Query Optimizer 36 / 56
Pseudocode for Hash Join 
for (j = 0; j < length(inner); j++) 
hash_key = hash(inner[j]); 
append(hash_store[hash_key], inner[j]); 
for (i = 0; i < length(outer); i++) 
hash_key = hash(outer[i]); 
for (j = 0; j < length(hash_store[hash_key]); j++) 
if (outer[i] == hash_store[hash_key][j]) 
output(outer[i], inner[j]); 
Explaining the Postgres Query Optimizer 37 / 56
Join the Two Tables with No Restriction 
EXPLAIN SELECT sample1.junk 
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id); 
QUERY PLAN 
------------------------------------------------------------------------- 
Merge Join (cost=927.72..1852.95 rows=61272 width=32) 
Merge Cond: (sample2.id = sample1.id) 
-> Sort (cost=85.43..88.50 rows=1230 width=4) 
Sort Key: sample2.id 
-> Seq Scan on sample2 (cost=0.00..22.30 rows=1230 width=4) 
-> Sort (cost=842.29..867.20 rows=9963 width=36) 
Sort Key: sample1.id 
-> Seq Scan on sample1 (cost=0.00..180.63 rows=9963 width=36) 
(8 rows) 
Explaining the Postgres Query Optimizer 38 / 56
Merge Join 
Sorted 
Sorted 
Outer Inner 
aaa 
aab 
aac 
aad 
aaa 
aab 
aab 
aac 
aae 
aaf 
aaf 
Ideal for Large Tables 
An Index Can Be Used to Eliminate the Sort 
Explaining the Postgres Query Optimizer 39 / 56
Pseudocode for Merge Join 
sort(outer); 
sort(inner); 
i = 0; 
j = 0; 
save_j = 0; 
while (i < length(outer)) 
if (outer[i] == inner[j]) 
output(outer[i], inner[j]); 
if (outer[i] <= inner[j] && j < length(inner)) 
j++; 
if (outer[i] < inner[j]) 
save_j = j; 
else 
i++; 
j = save_j; 
Explaining the Postgres Query Optimizer 40 / 56
Order of Joined Relations Is Insignificant 
EXPLAIN SELECT sample2.junk 
FROM sample2 JOIN sample1 ON (sample2.id = sample1.id); 
QUERY PLAN 
------------------------------------------------------------------------ 
Merge Join (cost=927.72..1852.95 rows=61272 width=32) 
Merge Cond: (sample2.id = sample1.id) 
-> Sort (cost=85.43..88.50 rows=1230 width=36) 
Sort Key: sample2.id 
-> Seq Scan on sample2 (cost=0.00..22.30 rows=1230 width=36) 
-> Sort (cost=842.29..867.20 rows=9963 width=4) 
Sort Key: sample1.id 
-> Seq Scan on sample1 (cost=0.00..180.63 rows=9963 width=4) 
(8 rows) 
The most restrictive relation, e.g. sample2, is always on the outer side of 
merge joins. All previous merge joins also had sample2 in outer position. 
Explaining the Postgres Query Optimizer 41 / 56
Add Optimizer Statistics 
ANALYZE sample1; 
ANALYZE sample2; 
Explaining the Postgres Query Optimizer 42 / 56
This Was a Merge Join without Optimizer Statistics 
EXPLAIN SELECT sample2.junk 
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id); 
QUERY PLAN 
------------------------------------------------------------------------ 
Hash Join (cost=15.85..130.47 rows=260 width=254) 
Hash Cond: (sample1.id = sample2.id) 
-> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=4) 
-> Hash (cost=12.60..12.60 rows=260 width=258) 
-> Seq Scan on sample2 (cost=0.00..12.60 rows=260 width=258) 
(5 rows) 
Explaining the Postgres Query Optimizer 43 / 56
Outer Joins Can Affect Optimizer Join Usage 
EXPLAIN SELECT sample1.junk 
FROM sample1 RIGHT OUTER JOIN sample2 ON (sample1.id = sample2.id); 
QUERY PLAN 
-------------------------------------------------------------------------- 
Hash Left Join (cost=131.76..148.26 rows=260 width=254) 
Hash Cond: (sample2.id = sample1.id) 
-> Seq Scan on sample2 (cost=0.00..12.60 rows=260 width=4) 
-> Hash (cost=103.56..103.56 rows=2256 width=258) 
-> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=258) 
(5 rows) 
Use of hashes for outer joins was added in Postgres 9.1. 
Explaining the Postgres Query Optimizer 44 / 56
Cross Joins Are Nested Loop Joins 
without Join Restriction 
EXPLAIN SELECT sample1.junk 
FROM sample1 CROSS JOIN sample2; 
QUERY PLAN 
---------------------------------------------------------------------- 
Nested Loop (cost=0.00..7448.81 rows=586560 width=254) 
-> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=254) 
-> Materialize (cost=0.00..13.90 rows=260 width=0) 
-> Seq Scan on sample2 (cost=0.00..12.60 rows=260 width=0) 
(4 rows) 
Explaining the Postgres Query Optimizer 45 / 56
Create Indexes 
CREATE INDEX i_sample1 on sample1 (id); 
CREATE INDEX i_sample2 on sample2 (id); 
Explaining the Postgres Query Optimizer 46 / 56
Nested Loop with Inner Index Scan Now Possible 
EXPLAIN SELECT sample2.junk 
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) 
WHERE sample1.id = 33; 
QUERY PLAN 
--------------------------------------------------------------------------------- 
Nested Loop (cost=0.00..16.55 rows=1 width=254) 
-> Index Scan using i_sample1 on sample1 (cost=0.00..8.27 rows=1 width=4) 
Index Cond: (id = 33::oid) 
-> Index Scan using i_sample2 on sample2 (cost=0.00..8.27 rows=1 width=258) 
Index Cond: (sample2.id = 33::oid) 
(5 rows) 
Explaining the Postgres Query Optimizer 47 / 56
Nested Loop Join with Inner Index Scan 
Outer Inner 
aag 
aai 
aay aag 
aar 
aai 
aas 
aar 
aay 
aaa 
aag 
Index Lookup 
No Setup Required 
Index Must Already Exist 
Explaining the Postgres Query Optimizer 48 / 56
Pseudocode for Nested Loop Join 
with Inner Index Scan 
for (i = 0; i < length(outer); i++) 
index_entry = get_first_match(outer[j]) 
while (index_entry) 
output(outer[i], inner[index_entry]); 
index_entry = get_next_match(index_entry); 
Explaining the Postgres Query Optimizer 49 / 56
Query Restrictions Affect Join Usage 
EXPLAIN SELECT sample2.junk 
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) 
WHERE sample2.junk ˜ ’^aaa’; 
QUERY PLAN 
------------------------------------------------------------------------------- 
Nested Loop (cost=0.00..21.53 rows=1 width=254) 
-> Seq Scan on sample2 (cost=0.00..13.25 rows=1 width=258) 
Filter: (junk ˜ ’^aaa’::text) 
-> Index Scan using i_sample1 on sample1 (cost=0.00..8.27 rows=1 width=4) 
Index Cond: (sample1.id = sample2.id) 
(5 rows) 
No junk rows begin with ’aaa’. 
Explaining the Postgres Query Optimizer 50 / 56
All ’junk’ Columns Begin with ’xxx’ 
EXPLAIN SELECT sample2.junk 
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) 
WHERE sample2.junk ˜ ’^xxx’; 
QUERY PLAN 
------------------------------------------------------------------------ 
Hash Join (cost=16.50..131.12 rows=260 width=254) 
Hash Cond: (sample1.id = sample2.id) 
-> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=4) 
-> Hash (cost=13.25..13.25 rows=260 width=258) 
-> Seq Scan on sample2 (cost=0.00..13.25 rows=260 width=258) 
Filter: (junk ˜ ’^xxx’::text) 
(6 rows) 
Hash join was chosen because many more rows are expected. The 
smaller table, e.g. sample2, is always hashed. 
Explaining the Postgres Query Optimizer 51 / 56
Without LIMIT, Hash Is Used 
for this Unrestricted Join 
EXPLAIN SELECT sample2.junk 
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id); 
QUERY PLAN 
------------------------------------------------------------------------ 
Hash Join (cost=15.85..130.47 rows=260 width=254) 
Hash Cond: (sample1.id = sample2.id) 
-> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=4) 
-> Hash (cost=12.60..12.60 rows=260 width=258) 
-> Seq Scan on sample2 (cost=0.00..12.60 rows=260 width=258) 
(5 rows) 
Explaining the Postgres Query Optimizer 52 / 56
LIMIT Can Affect Join Usage 
EXPLAIN SELECT sample2.id, sample2.junk 
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) 
ORDER BY 1 
LIMIT 1; 
QUERY PLAN 
------------------------------------------------------------------------------------------ 
Limit (cost=0.00..1.83 rows=1 width=258) 
-> Nested Loop (cost=0.00..477.02 rows=260 width=258) 
-> Index Scan using i_sample2 on sample2 (cost=0.00..52.15 rows=260 width=258) 
-> Index Scan using i_sample1 on sample1 (cost=0.00..1.62 rows=1 width=4) 
Index Cond: (sample1.id = sample2.id) 
(5 rows) 
Explaining the Postgres Query Optimizer 53 / 56
LIMIT 10 
EXPLAIN SELECT sample2.id, sample2.junk 
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) 
ORDER BY 1 
LIMIT 10; 
QUERY PLAN 
------------------------------------------------------------------------------------------ 
Limit (cost=0.00..18.35 rows=10 width=258) 
-> Nested Loop (cost=0.00..477.02 rows=260 width=258) 
-> Index Scan using i_sample2 on sample2 (cost=0.00..52.15 rows=260 width=258) 
-> Index Scan using i_sample1 on sample1 (cost=0.00..1.62 rows=1 width=4) 
Index Cond: (sample1.id = sample2.id) 
(5 rows) 
Explaining the Postgres Query Optimizer 54 / 56
LIMIT 100 Switches to Hash Join 
EXPLAIN SELECT sample2.id, sample2.junk 
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) 
ORDER BY 1 
LIMIT 100; 
QUERY PLAN 
------------------------------------------------------------------------------------ 
Limit (cost=140.41..140.66 rows=100 width=258) 
-> Sort (cost=140.41..141.06 rows=260 width=258) 
Sort Key: sample2.id 
-> Hash Join (cost=15.85..130.47 rows=260 width=258) 
Hash Cond: (sample1.id = sample2.id) 
-> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=4) 
-> Hash (cost=12.60..12.60 rows=260 width=258) 
-> Seq Scan on sample2 (cost=0.00..12.60 rows=260 width=258) 
(8 rows) 
Explaining the Postgres Query Optimizer 55 / 56
Conclusion 
http://momjian.us/presentations http://www.vivapixel.com/photo/14252 
Explaining the Postgres Query Optimizer 56 / 56

More Related Content

What's hot

pg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLpg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLCommand Prompt., Inc
 
PostgreSQL Internals (1) for PostgreSQL 9.6 (English)
PostgreSQL Internals (1) for PostgreSQL 9.6 (English)PostgreSQL Internals (1) for PostgreSQL 9.6 (English)
PostgreSQL Internals (1) for PostgreSQL 9.6 (English)Noriyoshi Shinoda
 
Getting started with postgresql
Getting started with postgresqlGetting started with postgresql
Getting started with postgresqlbotsplash.com
 
Troubleshooting PostgreSQL Streaming Replication
Troubleshooting PostgreSQL Streaming ReplicationTroubleshooting PostgreSQL Streaming Replication
Troubleshooting PostgreSQL Streaming ReplicationAlexey Lesovsky
 
What's New in MySQL 5.7 Optimizer @MySQL User Conference Tokyo 2015
What's New in MySQL 5.7 Optimizer @MySQL User Conference Tokyo 2015What's New in MySQL 5.7 Optimizer @MySQL User Conference Tokyo 2015
What's New in MySQL 5.7 Optimizer @MySQL User Conference Tokyo 2015Mikiya Okuno
 
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEOClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEOAltinity Ltd
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesAltinity Ltd
 
PostgreSQL Database Slides
PostgreSQL Database SlidesPostgreSQL Database Slides
PostgreSQL Database Slidesmetsarin
 
What is new in PostgreSQL 14?
What is new in PostgreSQL 14?What is new in PostgreSQL 14?
What is new in PostgreSQL 14?Mydbops
 
How to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better PerformanceHow to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better Performanceoysteing
 
Scaling Django with gevent
Scaling Django with geventScaling Django with gevent
Scaling Django with geventMahendra M
 
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEOTricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEOAltinity Ltd
 
Blazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBlazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBrendan Gregg
 
Feature Engineering
Feature EngineeringFeature Engineering
Feature EngineeringSri Ambati
 
Chasing the optimizer
Chasing the optimizerChasing the optimizer
Chasing the optimizerMauro Pagano
 
Feature Importance Analysis with XGBoost in Tax audit
Feature Importance Analysis with XGBoost in Tax auditFeature Importance Analysis with XGBoost in Tax audit
Feature Importance Analysis with XGBoost in Tax auditMichael BENESTY
 
PostgreSQL- An Introduction
PostgreSQL- An IntroductionPostgreSQL- An Introduction
PostgreSQL- An IntroductionSmita Prasad
 
Oracle sql high performance tuning
Oracle sql high performance tuningOracle sql high performance tuning
Oracle sql high performance tuningGuy Harrison
 
[pgday.Seoul 2022] POSTGRES 테스트코드로 기여하기 - 이동욱
[pgday.Seoul 2022] POSTGRES 테스트코드로 기여하기 - 이동욱[pgday.Seoul 2022] POSTGRES 테스트코드로 기여하기 - 이동욱
[pgday.Seoul 2022] POSTGRES 테스트코드로 기여하기 - 이동욱PgDay.Seoul
 

What's hot (20)

pg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLpg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQL
 
PostgreSQL Internals (1) for PostgreSQL 9.6 (English)
PostgreSQL Internals (1) for PostgreSQL 9.6 (English)PostgreSQL Internals (1) for PostgreSQL 9.6 (English)
PostgreSQL Internals (1) for PostgreSQL 9.6 (English)
 
Getting started with postgresql
Getting started with postgresqlGetting started with postgresql
Getting started with postgresql
 
Troubleshooting PostgreSQL Streaming Replication
Troubleshooting PostgreSQL Streaming ReplicationTroubleshooting PostgreSQL Streaming Replication
Troubleshooting PostgreSQL Streaming Replication
 
What's New in MySQL 5.7 Optimizer @MySQL User Conference Tokyo 2015
What's New in MySQL 5.7 Optimizer @MySQL User Conference Tokyo 2015What's New in MySQL 5.7 Optimizer @MySQL User Conference Tokyo 2015
What's New in MySQL 5.7 Optimizer @MySQL User Conference Tokyo 2015
 
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEOClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
 
PostgreSQL Database Slides
PostgreSQL Database SlidesPostgreSQL Database Slides
PostgreSQL Database Slides
 
What is new in PostgreSQL 14?
What is new in PostgreSQL 14?What is new in PostgreSQL 14?
What is new in PostgreSQL 14?
 
How to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better PerformanceHow to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better Performance
 
Scaling Django with gevent
Scaling Django with geventScaling Django with gevent
Scaling Django with gevent
 
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEOTricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
 
Blazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBlazing Performance with Flame Graphs
Blazing Performance with Flame Graphs
 
Feature Engineering
Feature EngineeringFeature Engineering
Feature Engineering
 
Chasing the optimizer
Chasing the optimizerChasing the optimizer
Chasing the optimizer
 
Vi Editor
Vi EditorVi Editor
Vi Editor
 
Feature Importance Analysis with XGBoost in Tax audit
Feature Importance Analysis with XGBoost in Tax auditFeature Importance Analysis with XGBoost in Tax audit
Feature Importance Analysis with XGBoost in Tax audit
 
PostgreSQL- An Introduction
PostgreSQL- An IntroductionPostgreSQL- An Introduction
PostgreSQL- An Introduction
 
Oracle sql high performance tuning
Oracle sql high performance tuningOracle sql high performance tuning
Oracle sql high performance tuning
 
[pgday.Seoul 2022] POSTGRES 테스트코드로 기여하기 - 이동욱
[pgday.Seoul 2022] POSTGRES 테스트코드로 기여하기 - 이동욱[pgday.Seoul 2022] POSTGRES 테스트코드로 기여하기 - 이동욱
[pgday.Seoul 2022] POSTGRES 테스트코드로 기여하기 - 이동욱
 

Similar to Explaining the Postgres Query Optimizer

Explaining the Postgres Query Optimizer - PGCon 2014
Explaining the Postgres Query Optimizer - PGCon 2014Explaining the Postgres Query Optimizer - PGCon 2014
Explaining the Postgres Query Optimizer - PGCon 2014EDB
 
Explaining the Postgres Query Optimizer (Bruce Momjian)
Explaining the Postgres Query Optimizer (Bruce Momjian)Explaining the Postgres Query Optimizer (Bruce Momjian)
Explaining the Postgres Query Optimizer (Bruce Momjian)Ontico
 
Using PostgreSQL statistics to optimize performance
Using PostgreSQL statistics to optimize performance Using PostgreSQL statistics to optimize performance
Using PostgreSQL statistics to optimize performance Alexey Ermakov
 
SQL Functions and Operators
SQL Functions and OperatorsSQL Functions and Operators
SQL Functions and OperatorsMohan Kumar.R
 
New Tuning Features in Oracle 11g - How to make your database as boring as po...
New Tuning Features in Oracle 11g - How to make your database as boring as po...New Tuning Features in Oracle 11g - How to make your database as boring as po...
New Tuning Features in Oracle 11g - How to make your database as boring as po...Sage Computing Services
 
Basic Query Tuning Primer - Pg West 2009
Basic Query Tuning Primer - Pg West 2009Basic Query Tuning Primer - Pg West 2009
Basic Query Tuning Primer - Pg West 2009mattsmiley
 
Histograms : Pre-12c and Now
Histograms : Pre-12c and NowHistograms : Pre-12c and Now
Histograms : Pre-12c and NowAnju Garg
 
pg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLpg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLMark Wong
 
A few things about the Oracle optimizer - 2013
A few things about the Oracle optimizer - 2013A few things about the Oracle optimizer - 2013
A few things about the Oracle optimizer - 2013Connor McDonald
 
SQL Performance Tuning and New Features in Oracle 19c
SQL Performance Tuning and New Features in Oracle 19cSQL Performance Tuning and New Features in Oracle 19c
SQL Performance Tuning and New Features in Oracle 19cRachelBarker26
 
Query optimizer vivek sharma
Query optimizer vivek sharmaQuery optimizer vivek sharma
Query optimizer vivek sharmaaioughydchapter
 
Managing Statistics for Optimal Query Performance
Managing Statistics for Optimal Query PerformanceManaging Statistics for Optimal Query Performance
Managing Statistics for Optimal Query PerformanceKaren Morton
 
Managing Statistics for Optimal Query Performance
Managing Statistics for Optimal Query PerformanceManaging Statistics for Optimal Query Performance
Managing Statistics for Optimal Query PerformanceKaren Morton
 
Sydney Oracle Meetup - access paths
Sydney Oracle Meetup - access pathsSydney Oracle Meetup - access paths
Sydney Oracle Meetup - access pathspaulguerin
 
Histograms: Pre-12c and now
Histograms: Pre-12c and nowHistograms: Pre-12c and now
Histograms: Pre-12c and nowAnju Garg
 
12c SQL Plan Directives
12c SQL Plan Directives12c SQL Plan Directives
12c SQL Plan DirectivesFranck Pachot
 

Similar to Explaining the Postgres Query Optimizer (20)

Explaining the Postgres Query Optimizer - PGCon 2014
Explaining the Postgres Query Optimizer - PGCon 2014Explaining the Postgres Query Optimizer - PGCon 2014
Explaining the Postgres Query Optimizer - PGCon 2014
 
Explaining the Postgres Query Optimizer (Bruce Momjian)
Explaining the Postgres Query Optimizer (Bruce Momjian)Explaining the Postgres Query Optimizer (Bruce Momjian)
Explaining the Postgres Query Optimizer (Bruce Momjian)
 
Using PostgreSQL statistics to optimize performance
Using PostgreSQL statistics to optimize performance Using PostgreSQL statistics to optimize performance
Using PostgreSQL statistics to optimize performance
 
SQLQueries
SQLQueriesSQLQueries
SQLQueries
 
SQL Functions and Operators
SQL Functions and OperatorsSQL Functions and Operators
SQL Functions and Operators
 
sqltuningcardinality1(1).ppt
sqltuningcardinality1(1).pptsqltuningcardinality1(1).ppt
sqltuningcardinality1(1).ppt
 
New Tuning Features in Oracle 11g - How to make your database as boring as po...
New Tuning Features in Oracle 11g - How to make your database as boring as po...New Tuning Features in Oracle 11g - How to make your database as boring as po...
New Tuning Features in Oracle 11g - How to make your database as boring as po...
 
Basic Query Tuning Primer
Basic Query Tuning PrimerBasic Query Tuning Primer
Basic Query Tuning Primer
 
Basic Query Tuning Primer - Pg West 2009
Basic Query Tuning Primer - Pg West 2009Basic Query Tuning Primer - Pg West 2009
Basic Query Tuning Primer - Pg West 2009
 
Oracle 12c SPM
Oracle 12c SPMOracle 12c SPM
Oracle 12c SPM
 
Histograms : Pre-12c and Now
Histograms : Pre-12c and NowHistograms : Pre-12c and Now
Histograms : Pre-12c and Now
 
pg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLpg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQL
 
A few things about the Oracle optimizer - 2013
A few things about the Oracle optimizer - 2013A few things about the Oracle optimizer - 2013
A few things about the Oracle optimizer - 2013
 
SQL Performance Tuning and New Features in Oracle 19c
SQL Performance Tuning and New Features in Oracle 19cSQL Performance Tuning and New Features in Oracle 19c
SQL Performance Tuning and New Features in Oracle 19c
 
Query optimizer vivek sharma
Query optimizer vivek sharmaQuery optimizer vivek sharma
Query optimizer vivek sharma
 
Managing Statistics for Optimal Query Performance
Managing Statistics for Optimal Query PerformanceManaging Statistics for Optimal Query Performance
Managing Statistics for Optimal Query Performance
 
Managing Statistics for Optimal Query Performance
Managing Statistics for Optimal Query PerformanceManaging Statistics for Optimal Query Performance
Managing Statistics for Optimal Query Performance
 
Sydney Oracle Meetup - access paths
Sydney Oracle Meetup - access pathsSydney Oracle Meetup - access paths
Sydney Oracle Meetup - access paths
 
Histograms: Pre-12c and now
Histograms: Pre-12c and nowHistograms: Pre-12c and now
Histograms: Pre-12c and now
 
12c SQL Plan Directives
12c SQL Plan Directives12c SQL Plan Directives
12c SQL Plan Directives
 

More from EDB

Cloud Migration Paths: Kubernetes, IaaS, or DBaaS
Cloud Migration Paths: Kubernetes, IaaS, or DBaaSCloud Migration Paths: Kubernetes, IaaS, or DBaaS
Cloud Migration Paths: Kubernetes, IaaS, or DBaaSEDB
 
Die 10 besten PostgreSQL-Replikationsstrategien für Ihr Unternehmen
Die 10 besten PostgreSQL-Replikationsstrategien für Ihr UnternehmenDie 10 besten PostgreSQL-Replikationsstrategien für Ihr Unternehmen
Die 10 besten PostgreSQL-Replikationsstrategien für Ihr UnternehmenEDB
 
Migre sus bases de datos Oracle a la nube
Migre sus bases de datos Oracle a la nube Migre sus bases de datos Oracle a la nube
Migre sus bases de datos Oracle a la nube EDB
 
EFM Office Hours - APJ - July 29, 2021
EFM Office Hours - APJ - July 29, 2021EFM Office Hours - APJ - July 29, 2021
EFM Office Hours - APJ - July 29, 2021EDB
 
Benchmarking Cloud Native PostgreSQL
Benchmarking Cloud Native PostgreSQLBenchmarking Cloud Native PostgreSQL
Benchmarking Cloud Native PostgreSQLEDB
 
Las Variaciones de la Replicación de PostgreSQL
Las Variaciones de la Replicación de PostgreSQLLas Variaciones de la Replicación de PostgreSQL
Las Variaciones de la Replicación de PostgreSQLEDB
 
NoSQL and Spatial Database Capabilities using PostgreSQL
NoSQL and Spatial Database Capabilities using PostgreSQLNoSQL and Spatial Database Capabilities using PostgreSQL
NoSQL and Spatial Database Capabilities using PostgreSQLEDB
 
Is There Anything PgBouncer Can’t Do?
Is There Anything PgBouncer Can’t Do?Is There Anything PgBouncer Can’t Do?
Is There Anything PgBouncer Can’t Do?EDB
 
Data Analysis with TensorFlow in PostgreSQL
Data Analysis with TensorFlow in PostgreSQLData Analysis with TensorFlow in PostgreSQL
Data Analysis with TensorFlow in PostgreSQLEDB
 
Practical Partitioning in Production with Postgres
Practical Partitioning in Production with PostgresPractical Partitioning in Production with Postgres
Practical Partitioning in Production with PostgresEDB
 
A Deeper Dive into EXPLAIN
A Deeper Dive into EXPLAINA Deeper Dive into EXPLAIN
A Deeper Dive into EXPLAINEDB
 
IOT with PostgreSQL
IOT with PostgreSQLIOT with PostgreSQL
IOT with PostgreSQLEDB
 
A Journey from Oracle to PostgreSQL
A Journey from Oracle to PostgreSQLA Journey from Oracle to PostgreSQL
A Journey from Oracle to PostgreSQLEDB
 
Psql is awesome!
Psql is awesome!Psql is awesome!
Psql is awesome!EDB
 
EDB 13 - New Enhancements for Security and Usability - APJ
EDB 13 - New Enhancements for Security and Usability - APJEDB 13 - New Enhancements for Security and Usability - APJ
EDB 13 - New Enhancements for Security and Usability - APJEDB
 
Comment sauvegarder correctement vos données
Comment sauvegarder correctement vos donnéesComment sauvegarder correctement vos données
Comment sauvegarder correctement vos donnéesEDB
 
Cloud Native PostgreSQL - Italiano
Cloud Native PostgreSQL - ItalianoCloud Native PostgreSQL - Italiano
Cloud Native PostgreSQL - ItalianoEDB
 
New enhancements for security and usability in EDB 13
New enhancements for security and usability in EDB 13New enhancements for security and usability in EDB 13
New enhancements for security and usability in EDB 13EDB
 
Best Practices in Security with PostgreSQL
Best Practices in Security with PostgreSQLBest Practices in Security with PostgreSQL
Best Practices in Security with PostgreSQLEDB
 
Cloud Native PostgreSQL - APJ
Cloud Native PostgreSQL - APJCloud Native PostgreSQL - APJ
Cloud Native PostgreSQL - APJEDB
 

More from EDB (20)

Cloud Migration Paths: Kubernetes, IaaS, or DBaaS
Cloud Migration Paths: Kubernetes, IaaS, or DBaaSCloud Migration Paths: Kubernetes, IaaS, or DBaaS
Cloud Migration Paths: Kubernetes, IaaS, or DBaaS
 
Die 10 besten PostgreSQL-Replikationsstrategien für Ihr Unternehmen
Die 10 besten PostgreSQL-Replikationsstrategien für Ihr UnternehmenDie 10 besten PostgreSQL-Replikationsstrategien für Ihr Unternehmen
Die 10 besten PostgreSQL-Replikationsstrategien für Ihr Unternehmen
 
Migre sus bases de datos Oracle a la nube
Migre sus bases de datos Oracle a la nube Migre sus bases de datos Oracle a la nube
Migre sus bases de datos Oracle a la nube
 
EFM Office Hours - APJ - July 29, 2021
EFM Office Hours - APJ - July 29, 2021EFM Office Hours - APJ - July 29, 2021
EFM Office Hours - APJ - July 29, 2021
 
Benchmarking Cloud Native PostgreSQL
Benchmarking Cloud Native PostgreSQLBenchmarking Cloud Native PostgreSQL
Benchmarking Cloud Native PostgreSQL
 
Las Variaciones de la Replicación de PostgreSQL
Las Variaciones de la Replicación de PostgreSQLLas Variaciones de la Replicación de PostgreSQL
Las Variaciones de la Replicación de PostgreSQL
 
NoSQL and Spatial Database Capabilities using PostgreSQL
NoSQL and Spatial Database Capabilities using PostgreSQLNoSQL and Spatial Database Capabilities using PostgreSQL
NoSQL and Spatial Database Capabilities using PostgreSQL
 
Is There Anything PgBouncer Can’t Do?
Is There Anything PgBouncer Can’t Do?Is There Anything PgBouncer Can’t Do?
Is There Anything PgBouncer Can’t Do?
 
Data Analysis with TensorFlow in PostgreSQL
Data Analysis with TensorFlow in PostgreSQLData Analysis with TensorFlow in PostgreSQL
Data Analysis with TensorFlow in PostgreSQL
 
Practical Partitioning in Production with Postgres
Practical Partitioning in Production with PostgresPractical Partitioning in Production with Postgres
Practical Partitioning in Production with Postgres
 
A Deeper Dive into EXPLAIN
A Deeper Dive into EXPLAINA Deeper Dive into EXPLAIN
A Deeper Dive into EXPLAIN
 
IOT with PostgreSQL
IOT with PostgreSQLIOT with PostgreSQL
IOT with PostgreSQL
 
A Journey from Oracle to PostgreSQL
A Journey from Oracle to PostgreSQLA Journey from Oracle to PostgreSQL
A Journey from Oracle to PostgreSQL
 
Psql is awesome!
Psql is awesome!Psql is awesome!
Psql is awesome!
 
EDB 13 - New Enhancements for Security and Usability - APJ
EDB 13 - New Enhancements for Security and Usability - APJEDB 13 - New Enhancements for Security and Usability - APJ
EDB 13 - New Enhancements for Security and Usability - APJ
 
Comment sauvegarder correctement vos données
Comment sauvegarder correctement vos donnéesComment sauvegarder correctement vos données
Comment sauvegarder correctement vos données
 
Cloud Native PostgreSQL - Italiano
Cloud Native PostgreSQL - ItalianoCloud Native PostgreSQL - Italiano
Cloud Native PostgreSQL - Italiano
 
New enhancements for security and usability in EDB 13
New enhancements for security and usability in EDB 13New enhancements for security and usability in EDB 13
New enhancements for security and usability in EDB 13
 
Best Practices in Security with PostgreSQL
Best Practices in Security with PostgreSQLBest Practices in Security with PostgreSQL
Best Practices in Security with PostgreSQL
 
Cloud Native PostgreSQL - APJ
Cloud Native PostgreSQL - APJCloud Native PostgreSQL - APJ
Cloud Native PostgreSQL - APJ
 

Recently uploaded

The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 

Recently uploaded (20)

The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 

Explaining the Postgres Query Optimizer

  • 1. Explaining the Postgres Query Optimizer BRUCE MOMJIAN January, 2012 The optimizer is the "brain" of the database, interpreting SQL queries and determining the fastest method of execution. This talk uses the EXPLAIN command to show how the optimizer interprets queries and determines optimal execution. Creative Commons Attribution License http://momjian.us/presentations 1 / 56
  • 2. Postgres Query Execution User Terminal Code Database Server Application Queries Results PostgreSQL Libpq Explaining the Postgres Query Optimizer 2 / 56
  • 3. Postgres Query Execution Postgres Postgres utility Postmaster Parse Statement Traffic Cop Query Generate Paths Optimal Path Plan Libpq Main Generate Plan Execute Plan e.g. CREATE TABLE, COPY SELECT, INSERT, UPDATE, DELETE Rewrite Query Utility Command Utilities Catalog Storage Managers Access Methods Nodes / Lists Explaining the Postgres Query Optimizer 3 / 56
  • 4. Postgres Query Execution utility Parse Statement Traffic Cop Query Generate Paths Optimal Path Generate Plan Plan Execute Plan e.g. CREATE TABLE, COPY SELECT, INSERT, UPDATE, DELETE Rewrite Query Utility Command Explaining the Postgres Query Optimizer 4 / 56
  • 5. The Optimizer Is the Brain http://www.wsmanaging.com/ Explaining the Postgres Query Optimizer 5 / 56
  • 6. What Decisions Does the Optimizer Have to Make? I Scan Method I Join Method I Join Order Explaining the Postgres Query Optimizer 6 / 56
  • 7. Which Scan Method? I Sequential Scan I Bitmap Index Scan I Index Scan Explaining the Postgres Query Optimizer 7 / 56
  • 8. A Simple Example Using pg_class.relname SELECT relname FROM pg_class ORDER BY 1 LIMIT 8; relname ----------------------------------- _pg_foreign_data_wrappers _pg_foreign_servers _pg_user_mappings administrable_role_authorizations applicable_roles attributes check_constraint_routine_usage check_constraints (8 rows) Explaining the Postgres Query Optimizer 8 / 56
  • 9. Let’s Use Just the First Letter of pg_class.relname SELECT substring(relname, 1, 1) FROM pg_class ORDER BY 1 LIMIT 8; substring ----------- _ _ _ a a a c c (8 rows) Explaining the Postgres Query Optimizer 9 / 56
  • 10. Create a Temporary Table with an Index CREATE TEMPORARY TABLE sample (letter, junk) AS SELECT substring(relname, 1, 1), repeat(’x’, 250) FROM pg_class ORDER BY random(); -- add rows in random order SELECT 253 CREATE INDEX i_sample on sample (letter); CREATE INDEX All the queries used in this presentation are available at http://momjian.us/main/writings/pgsql/optimizer.sql. Explaining the Postgres Query Optimizer 10 / 56
  • 11. Create an EXPLAIN Function CREATE OR REPLACE FUNCTION lookup_letter(text) RETURNS SETOF text AS $$ BEGIN RETURN QUERY EXECUTE ’ EXPLAIN SELECT letter FROM sample WHERE letter = ’’’ || $1 || ’’’’; END $$ LANGUAGE plpgsql; CREATE FUNCTION Explaining the Postgres Query Optimizer 11 / 56
  • 12. What is the Distribution of the sample Table? WITH letters (letter, count) AS ( SELECT letter, COUNT(*) FROM sample GROUP BY 1 ) SELECT letter, count, (count * 100.0 / (SUM(count) OVER ()))::numeric(4,1) AS "%" FROM letters ORDER BY 2 DESC; Explaining the Postgres Query Optimizer 12 / 56
  • 13. What is the Distribution of the sample Table? letter | count | % --------+-------+------ p | 199 | 78.7 s | 9 | 3.6 c | 8 | 3.2 r | 7 | 2.8 t | 5 | 2.0 v | 4 | 1.6 f | 4 | 1.6 d | 4 | 1.6 u | 3 | 1.2 a | 3 | 1.2 _ | 3 | 1.2 e | 2 | 0.8 i | 1 | 0.4 k | 1 | 0.4 (14 rows) Explaining the Postgres Query Optimizer 13 / 56
  • 14. Is the Distribution Important? EXPLAIN SELECT letter FROM sample WHERE letter = ’p’; QUERY PLAN ------------------------------------------------------------------------ Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=32) Index Cond: (letter = ’p’::text) (2 rows) Explaining the Postgres Query Optimizer 14 / 56
  • 15. Is the Distribution Important? EXPLAIN SELECT letter FROM sample WHERE letter = ’d’; QUERY PLAN ------------------------------------------------------------------------ Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=32) Index Cond: (letter = ’d’::text) (2 rows) Explaining the Postgres Query Optimizer 15 / 56
  • 16. Is the Distribution Important? EXPLAIN SELECT letter FROM sample WHERE letter = ’k’; QUERY PLAN ------------------------------------------------------------------------ Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=32) Index Cond: (letter = ’k’::text) (2 rows) Explaining the Postgres Query Optimizer 16 / 56
  • 17. Running ANALYZE Causes a Sequential Scan for a Common Value ANALYZE sample; ANALYZE EXPLAIN SELECT letter FROM sample WHERE letter = ’p’; QUERY PLAN --------------------------------------------------------- Seq Scan on sample (cost=0.00..13.16 rows=199 width=2) Filter: (letter = ’p’::text) (2 rows) Autovacuum cannot ANALYZE (or VACUUM) temporary tables because these tables are only visible to the creating session. Explaining the Postgres Query Optimizer 17 / 56
  • 18. Sequential Scan D T A D A T A D A T A D A T A D A T A D A T A D A T A D A T A D A T A D T 8K Heap A A D A T A D A T A A Explaining the Postgres Query Optimizer 18 / 56
  • 19. A Less Common Value Causes a Bitmap Heap Scan EXPLAIN SELECT letter FROM sample WHERE letter = ’d’; QUERY PLAN ----------------------------------------------------------------------- Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2) Recheck Cond: (letter = ’d’::text) -> Bitmap Index Scan on i_sample (cost=0.00..4.28 rows=4 width=0) Index Cond: (letter = ’d’::text) (4 rows) Explaining the Postgres Query Optimizer 19 / 56
  • 20. Bitmap Index Scan Combined Index 1 Table col1 = ’A’ col2 = ’NS’ 0 & = ’A’ AND ’NS’ 0 1 0 1 Index 2 1 Index 0 1 0 0 1 0 Explaining the Postgres Query Optimizer 20 / 56
  • 21. An Even Rarer Value Causes an Index Scan EXPLAIN SELECT letter FROM sample WHERE letter = ’k’; QUERY PLAN ----------------------------------------------------------------------- Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) Index Cond: (letter = ’k’::text) (2 rows) Explaining the Postgres Query Optimizer 21 / 56
  • 22. Index Scan D A D A T A D A T A D A T A D A T A D A T A D A T A < Key = > < Key = > Index Heap < Key = > D A T A D A T A D A T A D A T A D A T A A T Explaining the Postgres Query Optimizer 22 / 56
  • 23. Let’s Look at All Values and their Effects WITH letter (letter, count) AS ( SELECT letter, COUNT(*) FROM sample GROUP BY 1 ) SELECT letter AS l, count, lookup_letter(letter) FROM letter ORDER BY 2 DESC; l | count | lookup_letter ---+-------+----------------------------------------------------------------------- p | 199 | Seq Scan on sample (cost=0.00..13.16 rows=199 width=2) p | 199 | Filter: (letter = ’p’::text) s | 9 | Seq Scan on sample (cost=0.00..13.16 rows=9 width=2) s | 9 | Filter: (letter = ’s’::text) c | 8 | Seq Scan on sample (cost=0.00..13.16 rows=8 width=2) c | 8 | Filter: (letter = ’c’::text) r | 7 | Seq Scan on sample (cost=0.00..13.16 rows=7 width=2) r | 7 | Filter: (letter = ’r’::text) … Explaining the Postgres Query Optimizer 23 / 56
  • 24. OK, Just the First Lines WITH letter (letter, count) AS ( SELECT letter, COUNT(*) FROM sample GROUP BY 1 ) SELECT letter AS l, count, (SELECT * FROM lookup_letter(letter) AS l2 LIMIT 1) AS lookup_letter FROM letter ORDER BY 2 DESC; Explaining the Postgres Query Optimizer 24 / 56
  • 25. Just the First EXPLAIN Lines l | count | lookup_letter ---+-------+----------------------------------------------------------------------- p | 199 | Seq Scan on sample (cost=0.00..13.16 rows=199 width=2) s | 9 | Seq Scan on sample (cost=0.00..13.16 rows=9 width=2) c | 8 | Seq Scan on sample (cost=0.00..13.16 rows=8 width=2) r | 7 | Seq Scan on sample (cost=0.00..13.16 rows=7 width=2) t | 5 | Bitmap Heap Scan on sample (cost=4.29..12.76 rows=5 width=2) f | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2) v | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2) d | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2) a | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2) _ | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2) u | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2) e | 2 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) i | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) k | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) (14 rows) Explaining the Postgres Query Optimizer 25 / 56
  • 26. We Can Force an Index Scan SET enable_seqscan = false; SET enable_bitmapscan = false; WITH letter (letter, count) AS ( SELECT letter, COUNT(*) FROM sample GROUP BY 1 ) SELECT letter AS l, count, (SELECT * FROM lookup_letter(letter) AS l2 LIMIT 1) AS lookup_letter FROM letter ORDER BY 2 DESC; Explaining the Postgres Query Optimizer 26 / 56
  • 27. Notice the High Cost for Common Values l | count | lookup_letter ---+-------+------------------------------------------------------------------------- p | 199 | Index Scan using i_sample on sample (cost=0.00..39.33 rows=199 width=2) s | 9 | Index Scan using i_sample on sample (cost=0.00..22.14 rows=9 width=2) c | 8 | Index Scan using i_sample on sample (cost=0.00..19.84 rows=8 width=2) r | 7 | Index Scan using i_sample on sample (cost=0.00..19.82 rows=7 width=2) t | 5 | Index Scan using i_sample on sample (cost=0.00..15.21 rows=5 width=2) d | 4 | Index Scan using i_sample on sample (cost=0.00..15.19 rows=4 width=2) v | 4 | Index Scan using i_sample on sample (cost=0.00..15.19 rows=4 width=2) f | 4 | Index Scan using i_sample on sample (cost=0.00..15.19 rows=4 width=2) _ | 3 | Index Scan using i_sample on sample (cost=0.00..12.88 rows=3 width=2) a | 3 | Index Scan using i_sample on sample (cost=0.00..12.88 rows=3 width=2) u | 3 | Index Scan using i_sample on sample (cost=0.00..12.88 rows=3 width=2) e | 2 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) i | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) k | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) (14 rows) RESET ALL; RESET Explaining the Postgres Query Optimizer 27 / 56
  • 28. This Was the Optimizer’s Preference l | count | lookup_letter ---+-------+----------------------------------------------------------------------- p | 199 | Seq Scan on sample (cost=0.00..13.16 rows=199 width=2) s | 9 | Seq Scan on sample (cost=0.00..13.16 rows=9 width=2) c | 8 | Seq Scan on sample (cost=0.00..13.16 rows=8 width=2) r | 7 | Seq Scan on sample (cost=0.00..13.16 rows=7 width=2) t | 5 | Bitmap Heap Scan on sample (cost=4.29..12.76 rows=5 width=2) f | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2) v | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2) d | 4 | Bitmap Heap Scan on sample (cost=4.28..12.74 rows=4 width=2) a | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2) _ | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2) u | 3 | Bitmap Heap Scan on sample (cost=4.27..11.38 rows=3 width=2) e | 2 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) i | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) k | 1 | Index Scan using i_sample on sample (cost=0.00..8.27 rows=1 width=2) (14 rows) Explaining the Postgres Query Optimizer 28 / 56
  • 29. Which Join Method? I Nested Loop I With Inner Sequential Scan I With Inner Index Scan I Hash Join I Merge Join Explaining the Postgres Query Optimizer 29 / 56
  • 30. What Is in pg_proc.oid? SELECT oid FROM pg_proc ORDER BY 1 LIMIT 8; oid ----- 31 33 34 35 38 39 40 41 (8 rows) Explaining the Postgres Query Optimizer 30 / 56
  • 31. Create Temporary Tables from pg_proc and pg_class CREATE TEMPORARY TABLE sample1 (id, junk) AS SELECT oid, repeat(’x’, 250) FROM pg_proc ORDER BY random(); -- add rows in random order SELECT 2256 CREATE TEMPORARY TABLE sample2 (id, junk) AS SELECT oid, repeat(’x’, 250) FROM pg_class ORDER BY random(); -- add rows in random order SELECT 260 These tables have no indexes and no optimizer statistics. Explaining the Postgres Query Optimizer 31 / 56
  • 32. Join the Two Tables with a Tight Restriction EXPLAIN SELECT sample2.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) WHERE sample1.id = 33; QUERY PLAN --------------------------------------------------------------------- Nested Loop (cost=0.00..234.68 rows=300 width=32) -> Seq Scan on sample1 (cost=0.00..205.54 rows=50 width=4) Filter: (id = 33::oid) -> Materialize (cost=0.00..25.41 rows=6 width=36) -> Seq Scan on sample2 (cost=0.00..25.38 rows=6 width=36) Filter: (id = 33::oid) (6 rows) Explaining the Postgres Query Optimizer 32 / 56
  • 33. Nested Loop Join with Inner Sequential Scan Outer Inner aag aai aay aag aar aas aar aay aaa aag No Setup Required aai Used For Small Tables Explaining the Postgres Query Optimizer 33 / 56
  • 34. Pseudocode for Nested Loop Join with Inner Sequential Scan for (i = 0; i < length(outer); i++) for (j = 0; j < length(inner); j++) if (outer[i] == inner[j]) output(outer[i], inner[j]); Explaining the Postgres Query Optimizer 34 / 56
  • 35. Join the Two Tables with a Looser Restriction EXPLAIN SELECT sample1.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) WHERE sample2.id > 33; QUERY PLAN ---------------------------------------------------------------------- Hash Join (cost=30.50..950.88 rows=20424 width=32) Hash Cond: (sample1.id = sample2.id) -> Seq Scan on sample1 (cost=0.00..180.63 rows=9963 width=36) -> Hash (cost=25.38..25.38 rows=410 width=4) -> Seq Scan on sample2 (cost=0.00..25.38 rows=410 width=4) Filter: (id > 33::oid) (6 rows) Explaining the Postgres Query Optimizer 35 / 56
  • 36. Hash Join Hashed Must fit in Main Memory aak aar aak aam aay aar aao aaw aay aag aas Outer Inner Explaining the Postgres Query Optimizer 36 / 56
  • 37. Pseudocode for Hash Join for (j = 0; j < length(inner); j++) hash_key = hash(inner[j]); append(hash_store[hash_key], inner[j]); for (i = 0; i < length(outer); i++) hash_key = hash(outer[i]); for (j = 0; j < length(hash_store[hash_key]); j++) if (outer[i] == hash_store[hash_key][j]) output(outer[i], inner[j]); Explaining the Postgres Query Optimizer 37 / 56
  • 38. Join the Two Tables with No Restriction EXPLAIN SELECT sample1.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id); QUERY PLAN ------------------------------------------------------------------------- Merge Join (cost=927.72..1852.95 rows=61272 width=32) Merge Cond: (sample2.id = sample1.id) -> Sort (cost=85.43..88.50 rows=1230 width=4) Sort Key: sample2.id -> Seq Scan on sample2 (cost=0.00..22.30 rows=1230 width=4) -> Sort (cost=842.29..867.20 rows=9963 width=36) Sort Key: sample1.id -> Seq Scan on sample1 (cost=0.00..180.63 rows=9963 width=36) (8 rows) Explaining the Postgres Query Optimizer 38 / 56
  • 39. Merge Join Sorted Sorted Outer Inner aaa aab aac aad aaa aab aab aac aae aaf aaf Ideal for Large Tables An Index Can Be Used to Eliminate the Sort Explaining the Postgres Query Optimizer 39 / 56
  • 40. Pseudocode for Merge Join sort(outer); sort(inner); i = 0; j = 0; save_j = 0; while (i < length(outer)) if (outer[i] == inner[j]) output(outer[i], inner[j]); if (outer[i] <= inner[j] && j < length(inner)) j++; if (outer[i] < inner[j]) save_j = j; else i++; j = save_j; Explaining the Postgres Query Optimizer 40 / 56
  • 41. Order of Joined Relations Is Insignificant EXPLAIN SELECT sample2.junk FROM sample2 JOIN sample1 ON (sample2.id = sample1.id); QUERY PLAN ------------------------------------------------------------------------ Merge Join (cost=927.72..1852.95 rows=61272 width=32) Merge Cond: (sample2.id = sample1.id) -> Sort (cost=85.43..88.50 rows=1230 width=36) Sort Key: sample2.id -> Seq Scan on sample2 (cost=0.00..22.30 rows=1230 width=36) -> Sort (cost=842.29..867.20 rows=9963 width=4) Sort Key: sample1.id -> Seq Scan on sample1 (cost=0.00..180.63 rows=9963 width=4) (8 rows) The most restrictive relation, e.g. sample2, is always on the outer side of merge joins. All previous merge joins also had sample2 in outer position. Explaining the Postgres Query Optimizer 41 / 56
  • 42. Add Optimizer Statistics ANALYZE sample1; ANALYZE sample2; Explaining the Postgres Query Optimizer 42 / 56
  • 43. This Was a Merge Join without Optimizer Statistics EXPLAIN SELECT sample2.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id); QUERY PLAN ------------------------------------------------------------------------ Hash Join (cost=15.85..130.47 rows=260 width=254) Hash Cond: (sample1.id = sample2.id) -> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=4) -> Hash (cost=12.60..12.60 rows=260 width=258) -> Seq Scan on sample2 (cost=0.00..12.60 rows=260 width=258) (5 rows) Explaining the Postgres Query Optimizer 43 / 56
  • 44. Outer Joins Can Affect Optimizer Join Usage EXPLAIN SELECT sample1.junk FROM sample1 RIGHT OUTER JOIN sample2 ON (sample1.id = sample2.id); QUERY PLAN -------------------------------------------------------------------------- Hash Left Join (cost=131.76..148.26 rows=260 width=254) Hash Cond: (sample2.id = sample1.id) -> Seq Scan on sample2 (cost=0.00..12.60 rows=260 width=4) -> Hash (cost=103.56..103.56 rows=2256 width=258) -> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=258) (5 rows) Use of hashes for outer joins was added in Postgres 9.1. Explaining the Postgres Query Optimizer 44 / 56
  • 45. Cross Joins Are Nested Loop Joins without Join Restriction EXPLAIN SELECT sample1.junk FROM sample1 CROSS JOIN sample2; QUERY PLAN ---------------------------------------------------------------------- Nested Loop (cost=0.00..7448.81 rows=586560 width=254) -> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=254) -> Materialize (cost=0.00..13.90 rows=260 width=0) -> Seq Scan on sample2 (cost=0.00..12.60 rows=260 width=0) (4 rows) Explaining the Postgres Query Optimizer 45 / 56
  • 46. Create Indexes CREATE INDEX i_sample1 on sample1 (id); CREATE INDEX i_sample2 on sample2 (id); Explaining the Postgres Query Optimizer 46 / 56
  • 47. Nested Loop with Inner Index Scan Now Possible EXPLAIN SELECT sample2.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) WHERE sample1.id = 33; QUERY PLAN --------------------------------------------------------------------------------- Nested Loop (cost=0.00..16.55 rows=1 width=254) -> Index Scan using i_sample1 on sample1 (cost=0.00..8.27 rows=1 width=4) Index Cond: (id = 33::oid) -> Index Scan using i_sample2 on sample2 (cost=0.00..8.27 rows=1 width=258) Index Cond: (sample2.id = 33::oid) (5 rows) Explaining the Postgres Query Optimizer 47 / 56
  • 48. Nested Loop Join with Inner Index Scan Outer Inner aag aai aay aag aar aai aas aar aay aaa aag Index Lookup No Setup Required Index Must Already Exist Explaining the Postgres Query Optimizer 48 / 56
  • 49. Pseudocode for Nested Loop Join with Inner Index Scan for (i = 0; i < length(outer); i++) index_entry = get_first_match(outer[j]) while (index_entry) output(outer[i], inner[index_entry]); index_entry = get_next_match(index_entry); Explaining the Postgres Query Optimizer 49 / 56
  • 50. Query Restrictions Affect Join Usage EXPLAIN SELECT sample2.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) WHERE sample2.junk ˜ ’^aaa’; QUERY PLAN ------------------------------------------------------------------------------- Nested Loop (cost=0.00..21.53 rows=1 width=254) -> Seq Scan on sample2 (cost=0.00..13.25 rows=1 width=258) Filter: (junk ˜ ’^aaa’::text) -> Index Scan using i_sample1 on sample1 (cost=0.00..8.27 rows=1 width=4) Index Cond: (sample1.id = sample2.id) (5 rows) No junk rows begin with ’aaa’. Explaining the Postgres Query Optimizer 50 / 56
  • 51. All ’junk’ Columns Begin with ’xxx’ EXPLAIN SELECT sample2.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) WHERE sample2.junk ˜ ’^xxx’; QUERY PLAN ------------------------------------------------------------------------ Hash Join (cost=16.50..131.12 rows=260 width=254) Hash Cond: (sample1.id = sample2.id) -> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=4) -> Hash (cost=13.25..13.25 rows=260 width=258) -> Seq Scan on sample2 (cost=0.00..13.25 rows=260 width=258) Filter: (junk ˜ ’^xxx’::text) (6 rows) Hash join was chosen because many more rows are expected. The smaller table, e.g. sample2, is always hashed. Explaining the Postgres Query Optimizer 51 / 56
  • 52. Without LIMIT, Hash Is Used for this Unrestricted Join EXPLAIN SELECT sample2.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id); QUERY PLAN ------------------------------------------------------------------------ Hash Join (cost=15.85..130.47 rows=260 width=254) Hash Cond: (sample1.id = sample2.id) -> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=4) -> Hash (cost=12.60..12.60 rows=260 width=258) -> Seq Scan on sample2 (cost=0.00..12.60 rows=260 width=258) (5 rows) Explaining the Postgres Query Optimizer 52 / 56
  • 53. LIMIT Can Affect Join Usage EXPLAIN SELECT sample2.id, sample2.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) ORDER BY 1 LIMIT 1; QUERY PLAN ------------------------------------------------------------------------------------------ Limit (cost=0.00..1.83 rows=1 width=258) -> Nested Loop (cost=0.00..477.02 rows=260 width=258) -> Index Scan using i_sample2 on sample2 (cost=0.00..52.15 rows=260 width=258) -> Index Scan using i_sample1 on sample1 (cost=0.00..1.62 rows=1 width=4) Index Cond: (sample1.id = sample2.id) (5 rows) Explaining the Postgres Query Optimizer 53 / 56
  • 54. LIMIT 10 EXPLAIN SELECT sample2.id, sample2.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) ORDER BY 1 LIMIT 10; QUERY PLAN ------------------------------------------------------------------------------------------ Limit (cost=0.00..18.35 rows=10 width=258) -> Nested Loop (cost=0.00..477.02 rows=260 width=258) -> Index Scan using i_sample2 on sample2 (cost=0.00..52.15 rows=260 width=258) -> Index Scan using i_sample1 on sample1 (cost=0.00..1.62 rows=1 width=4) Index Cond: (sample1.id = sample2.id) (5 rows) Explaining the Postgres Query Optimizer 54 / 56
  • 55. LIMIT 100 Switches to Hash Join EXPLAIN SELECT sample2.id, sample2.junk FROM sample1 JOIN sample2 ON (sample1.id = sample2.id) ORDER BY 1 LIMIT 100; QUERY PLAN ------------------------------------------------------------------------------------ Limit (cost=140.41..140.66 rows=100 width=258) -> Sort (cost=140.41..141.06 rows=260 width=258) Sort Key: sample2.id -> Hash Join (cost=15.85..130.47 rows=260 width=258) Hash Cond: (sample1.id = sample2.id) -> Seq Scan on sample1 (cost=0.00..103.56 rows=2256 width=4) -> Hash (cost=12.60..12.60 rows=260 width=258) -> Seq Scan on sample2 (cost=0.00..12.60 rows=260 width=258) (8 rows) Explaining the Postgres Query Optimizer 55 / 56