SlideShare a Scribd company logo
Joining 1 million tables
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
Joining 1 million tables
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
The goal
Creating and joining 1 million tables
Pushing PostgreSQL to its limits (and definitely beyond)
How far can we push this without patching the core?
Being the first one to try this ;)
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
Creating 1 million tables is easy, no?
Here is a script:
#!/usr/bin/perl
print "BEGIN;n";
for (my $i = 0; $i < 1000000; $i++)
{
print "CREATE TABLE t_tab_$i (id int4);n";
}
print "COMMIT;n";
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
Creating 1 million tables
Who expects this to work?
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
Well, it does not ;)
CREATE TABLE
CREATE TABLE
WARNING: out of shared memory
ERROR: out of shared memory
HINT: You might need to increase
max_locks_per_transaction.
ERROR: current transaction is aborted, commands ignored
until end of transaction block
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
Why does it fail?
There is a variable called max locks per transaction
max locks per transaction * max connections = maximum
number of locks
100 connection x 64 is by far not enough
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
A quick workaround
Use single transactions
Avoid nasty disk flushes on the way
SET synchronous_commit TO off;
time ./create_tables.pl | psql test > /dev/null
real 14m45.320s
user 0m45.778s
sys 0m18.877s
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
Reporting success . . .
test=# SELECT count(tablename)
FROM pg_tables
WHERE schemaname = ’public’;
count
---------
1000000
(1 row)
One million tables have been created
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
Creating an SQL statement
Writing this by hand is not feasible
Desired output:
SELECT 1
FROM t_tab_1, t_tab_2, t_tab_3, t_tab_4, t_tab_5
WHERE t_tab_1.id = t_tab_2.id AND
t_tab_2.id = t_tab_3.id AND
t_tab_3.id = t_tab_4.id AND
t_tab_4.id = t_tab_5.id AND
1 = 1;
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
A first attempt to write a script
#!/usr/bin/perl
my $joins = 5; my $sel = "SELECT 1 ";
my $from = ""; my $where = "";
for (my $i = 1; $i < $joins; $i++)
{
$from .= "t_tab_$i, n";
$where .= " t_tab_" . ($i) . ".id = t_tab_".
($i+1) . ".id AND n";
}
$from .= " t_tab_$joins ";
$where .= " 1 = 1; ";
print "$sel n FROM $from n WHERE $wheren";
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
1 million tables: What it means
If you do this for 1 million tables . . .
perl create.pl | wc
2000001 5000003 54666686
54 MB of SQL is quite a bit of code
First observation: The parser works ;)
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
Giving it a try
runtimes with PostgreSQL default settings:
n = 10: 158 ms
n = 100; 948 ms
n = 200: 3970 ms
n = 400: 18356 ms
n = 800: 87095 ms
n = 1000000: prediction = 1575 days :)
ouch, this does not look linear
1575 days was too close to the conference
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
GEQO:
At some point GEQO will kick in
Maybe it is the problem?
SET geqo TO off;
n = 10: 158 ms
n = 100: ERROR: canceling statement due to user request
Time: 680847.127 ms
It clearly is not . . .
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
It seems GEQO is needed
Let us see how far we can get by adjusting GEQO settings
SET geqo_effort TO 1;
- n = 10: 158 ms
- n = 100: 916 ms
- n = 200: 1464 ms
- n = 400: 4694 ms
- n = 800: 18896 ms
- n = 10000000: maybe 1 year? :)
=> the conference deadline is coming closer
=> still 999.000 tables to go ...
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
Trying with 10.000 tables
./make_join.pl 10000 | psql test
ERROR: stack depth limit exceeded
HINT: Increase the configuration parameter
max_stack_depth (currently 2048kB), after ensuring
the platform’s stack depth limit is adequate.
this is easy to fix
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
Removing limitations
Problem can be fixed easily
OS: ulimit -s unlimited
postgresql.conf: set max stack depth to a very high value
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
Next stop: Exponential planning time
Obviously planning time goes up exponentially
There is no way to do this without patching
As long as we are ways slower than linear it is impossible to
join 1 million tables.
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
This is how PostgreSQL joins tables:
Tables a, b, c, d form the following joins of 2:
[a, b], [a, c], [a, d], [b, c], [b, d], [c, d]
By adding a single table to each group, the following joins of 3 are
constructed:
[[a, b], c], [[a, b], d], [[a, c], b], [[a, c], d], [[a, d], b], [[a, d], c], [[b,
c], a], [[b, c], d], [[b, d], a], [[b, d], c], [[c, d], a], [[c, d], b]
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
More joining
Likewise, add a single table to each to get the final joins:
[[[a, b], c], d], [[[a, b], d], c], [[[a, c], b], d], [[[a, c], d], b], [[[a, d],
b], c], [[[a, d], c], b], [[[b, c], a], d], [[[b, c], d], a], [[[b, d], a], c],
[[[b, d], c], a], [[[c, d], a], b], [[[c, d], b], a]
Got it? ;)
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
A little patching
A little patch ensures that the list of tables is grouped into
“sub-lists”, where the length of each sub-list is equal to
from collapse limit. If from collapse limit is 2, the list of 4
becomes . . .
[[a, b], [c, d]]
[a, b] and [c, d] are joined separately, so we only get
one “path”:
[[a, b], [c, d]]
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
The downside:
This ensures minimal planning time but is also no real
optimization.
There is no potential for GEQO to improve things
Speed is not an issue in our case anyway. It just has to work
“somehow”.
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
Are we on a path to success?
Of course not ;)
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
The next limitation
the join fails again
ERROR: too many range table entries
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
This time life is easy
Solution: in include/nodes/primnodes.h
#define INNER_VAR 65000
/* reference to inner subplan */
#define OUTER_VAR 65001
/* reference to outer subplan *
#define INDEX_VAR 65002
/* reference to index column */
to
#define INNER_VAR 2000000
#define OUTER_VAR 2000001
#define INDEX_VAR 2000002
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
Next stop: RAM starts to be an issue
| Tables (k) | Planning time (s) | Memory
|------------+-------------------+---------------------|
| 2 | 1 | 38.5 MB |
| 4 | 3 | 75.6 MB |
| 8 | 31 | 192 MB |
| 16 | 186 | 550 MB |
| 32 | 1067 | 1.7 GB |
(this is for planning only)
Expected amount of memory: not available ;)
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
Is there a way out?
Clearly: The current path leads to nowhere
Some other approach is needed
Divide and conquer?
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
Changing the query
Currently there is . . .
SELECT 1 ...
FROM tab1, ..., tabn
WHERE ...
The problem must be split
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
CTEs might come to the rescue
How about creating WITH-clauses joining smaller sets of
tables?
At the end those WITH statements could be joined together
easily.
1.000 CTEs with 1.000 tables each gives 1.000.000 tables
Scary stuff: Runtime + memory Unforeseen stuff
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
A skeleton query
SET enable_indexscan TO off;
SET enable_hashjoin TO off;
SET enable_mergejoin TO off;
SET from_collapse_limit TO 2;
SET max_stack_depth TO ’1GB’;
WITH cte_0(id_1, id_2) AS (
SELECT t_tab_0.id, t_tab_16383.id
FROM t_tab_0, ...
cte_1 ...
SELECT ... FROM cte_0, cte_1 WHERE ...
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
Finally: A path to enlightenment
With all the RAM we had . . .
With all the patience we had . . .
With a fair amount of luck . . .
And with a lot of confidence . . .
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
Finally: It worked . . .
Amount of RAM needed: around 40 GB
Runtime: 10h 36min 53 sec
The result:
?column?
----------
(0 rows)
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
Finally
Hans-J¨urgen Sch¨onig
www.postgresql-support.de
Thank you for your attention
Cybertec Sch¨onig & Sch¨onig GmbH
Hans-J¨urgen Sch¨onig
Gr¨ohrm¨uhlgasse 26
A-2700 Wiener Neustadt
www.postgresql-support.de
Hans-J¨urgen Sch¨onig
www.postgresql-support.de

More Related Content

What's hot

PostgreSQL Performance Tuning
PostgreSQL Performance TuningPostgreSQL Performance Tuning
PostgreSQL Performance Tuningelliando dias
 
Solving PostgreSQL wicked problems
Solving PostgreSQL wicked problemsSolving PostgreSQL wicked problems
Solving PostgreSQL wicked problems
Alexander Korotkov
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Altinity Ltd
 
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
Altinity Ltd
 
ClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
ClickHouse Mark Cache, by Mik Kocikowski, CloudflareClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
ClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
Altinity Ltd
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.
Alexey Lesovsky
 
PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs
PGConf APAC
 
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Altinity Ltd
 
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
Altinity Ltd
 
PostgreSQL and RAM usage
PostgreSQL and RAM usagePostgreSQL and RAM usage
PostgreSQL and RAM usage
Alexey Bashtanov
 
High Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseHigh Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouse
Altinity Ltd
 
Introduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundIntroduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparound
Masahiko Sawada
 
Faster, better, stronger: The new InnoDB
Faster, better, stronger: The new InnoDBFaster, better, stronger: The new InnoDB
Faster, better, stronger: The new InnoDB
MariaDB plc
 
Materialized Views and Secondary Indexes in Scylla: They Are finally here!
Materialized Views and Secondary Indexes in Scylla: They Are finally here!Materialized Views and Secondary Indexes in Scylla: They Are finally here!
Materialized Views and Secondary Indexes in Scylla: They Are finally here!
ScyllaDB
 
押さえておきたい、PostgreSQL 13 の新機能!!(Open Source Conference 2021 Online/Hokkaido 発表資料)
押さえておきたい、PostgreSQL 13 の新機能!!(Open Source Conference 2021 Online/Hokkaido 発表資料)押さえておきたい、PostgreSQL 13 の新機能!!(Open Source Conference 2021 Online/Hokkaido 発表資料)
押さえておきたい、PostgreSQL 13 の新機能!!(Open Source Conference 2021 Online/Hokkaido 発表資料)
NTT DATA Technology & Innovation
 
ClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic ContinuesClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic Continues
Altinity Ltd
 
Postgresql
PostgresqlPostgresql
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 ViennaAutovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
PostgreSQL-Consulting
 
Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021
Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021
Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021
Altinity Ltd
 
PostgreSQL Administration for System Administrators
PostgreSQL Administration for System AdministratorsPostgreSQL Administration for System Administrators
PostgreSQL Administration for System Administrators
Command Prompt., Inc
 

What's hot (20)

PostgreSQL Performance Tuning
PostgreSQL Performance TuningPostgreSQL Performance Tuning
PostgreSQL Performance Tuning
 
Solving PostgreSQL wicked problems
Solving PostgreSQL wicked problemsSolving PostgreSQL wicked problems
Solving PostgreSQL wicked problems
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
 
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
 
ClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
ClickHouse Mark Cache, by Mik Kocikowski, CloudflareClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
ClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.
 
PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs
 
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
 
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
 
PostgreSQL and RAM usage
PostgreSQL and RAM usagePostgreSQL and RAM usage
PostgreSQL and RAM usage
 
High Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseHigh Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouse
 
Introduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundIntroduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparound
 
Faster, better, stronger: The new InnoDB
Faster, better, stronger: The new InnoDBFaster, better, stronger: The new InnoDB
Faster, better, stronger: The new InnoDB
 
Materialized Views and Secondary Indexes in Scylla: They Are finally here!
Materialized Views and Secondary Indexes in Scylla: They Are finally here!Materialized Views and Secondary Indexes in Scylla: They Are finally here!
Materialized Views and Secondary Indexes in Scylla: They Are finally here!
 
押さえておきたい、PostgreSQL 13 の新機能!!(Open Source Conference 2021 Online/Hokkaido 発表資料)
押さえておきたい、PostgreSQL 13 の新機能!!(Open Source Conference 2021 Online/Hokkaido 発表資料)押さえておきたい、PostgreSQL 13 の新機能!!(Open Source Conference 2021 Online/Hokkaido 発表資料)
押さえておきたい、PostgreSQL 13 の新機能!!(Open Source Conference 2021 Online/Hokkaido 発表資料)
 
ClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic ContinuesClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic Continues
 
Postgresql
PostgresqlPostgresql
Postgresql
 
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 ViennaAutovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
 
Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021
Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021
Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021
 
PostgreSQL Administration for System Administrators
PostgreSQL Administration for System AdministratorsPostgreSQL Administration for System Administrators
PostgreSQL Administration for System Administrators
 

Viewers also liked

PostgreSQL instance encryption: More database security
PostgreSQL instance encryption: More database securityPostgreSQL instance encryption: More database security
PostgreSQL instance encryption: More database security
Hans-Jürgen Schönig
 
5min analyse
5min analyse5min analyse
5min analyse
Hans-Jürgen Schönig
 
PostgreSQL Replication Tutorial
PostgreSQL Replication TutorialPostgreSQL Replication Tutorial
PostgreSQL Replication Tutorial
Hans-Jürgen Schönig
 
Explain explain
Explain explainExplain explain
Explain explain
Hans-Jürgen Schönig
 
Walbouncer: Filtering PostgreSQL transaction log
Walbouncer: Filtering PostgreSQL transaction logWalbouncer: Filtering PostgreSQL transaction log
Walbouncer: Filtering PostgreSQL transaction log
Hans-Jürgen Schönig
 
PostgreSQL: Eigene Aggregate schreiben
PostgreSQL: Eigene Aggregate schreibenPostgreSQL: Eigene Aggregate schreiben
PostgreSQL: Eigene Aggregate schreiben
Hans-Jürgen Schönig
 
PostgreSQL: The NoSQL way
PostgreSQL: The NoSQL wayPostgreSQL: The NoSQL way
PostgreSQL: The NoSQL way
Hans-Jürgen Schönig
 
PostgreSQL: Advanced indexing
PostgreSQL: Advanced indexingPostgreSQL: Advanced indexing
PostgreSQL: Advanced indexing
Hans-Jürgen Schönig
 
PostgreSQL: Data analysis and analytics
PostgreSQL: Data analysis and analyticsPostgreSQL: Data analysis and analytics
PostgreSQL: Data analysis and analytics
Hans-Jürgen Schönig
 

Viewers also liked (9)

PostgreSQL instance encryption: More database security
PostgreSQL instance encryption: More database securityPostgreSQL instance encryption: More database security
PostgreSQL instance encryption: More database security
 
5min analyse
5min analyse5min analyse
5min analyse
 
PostgreSQL Replication Tutorial
PostgreSQL Replication TutorialPostgreSQL Replication Tutorial
PostgreSQL Replication Tutorial
 
Explain explain
Explain explainExplain explain
Explain explain
 
Walbouncer: Filtering PostgreSQL transaction log
Walbouncer: Filtering PostgreSQL transaction logWalbouncer: Filtering PostgreSQL transaction log
Walbouncer: Filtering PostgreSQL transaction log
 
PostgreSQL: Eigene Aggregate schreiben
PostgreSQL: Eigene Aggregate schreibenPostgreSQL: Eigene Aggregate schreiben
PostgreSQL: Eigene Aggregate schreiben
 
PostgreSQL: The NoSQL way
PostgreSQL: The NoSQL wayPostgreSQL: The NoSQL way
PostgreSQL: The NoSQL way
 
PostgreSQL: Advanced indexing
PostgreSQL: Advanced indexingPostgreSQL: Advanced indexing
PostgreSQL: Advanced indexing
 
PostgreSQL: Data analysis and analytics
PostgreSQL: Data analysis and analyticsPostgreSQL: Data analysis and analytics
PostgreSQL: Data analysis and analytics
 

Similar to PostgreSQL: Joining 1 million tables

Alternative cryptocurrencies
Alternative cryptocurrenciesAlternative cryptocurrencies
Alternative cryptocurrencies
vpnmentor
 
Alternative cryptocurrencies
Alternative cryptocurrencies Alternative cryptocurrencies
Alternative cryptocurrencies
vpnmentor
 
Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...
Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...
Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...
Spark Summit
 
Engineering fast indexes
Engineering fast indexesEngineering fast indexes
Engineering fast indexes
Daniel Lemire
 
Naughty And Nice Bash Features
Naughty And Nice Bash FeaturesNaughty And Nice Bash Features
Naughty And Nice Bash Features
Nati Cohen
 
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Databricks
 
Introduction to Presto at Treasure Data
Introduction to Presto at Treasure DataIntroduction to Presto at Treasure Data
Introduction to Presto at Treasure Data
Taro L. Saito
 
How to make a large C++-code base manageable
How to make a large C++-code base manageableHow to make a large C++-code base manageable
How to make a large C++-code base manageable
corehard_by
 
MapReduce: teoria e prática
MapReduce: teoria e práticaMapReduce: teoria e prática
MapReduce: teoria e prática
PET Computação
 
High Performance Systems Without Tears - Scala Days Berlin 2018
High Performance Systems Without Tears - Scala Days Berlin 2018High Performance Systems Without Tears - Scala Days Berlin 2018
High Performance Systems Without Tears - Scala Days Berlin 2018
Zahari Dichev
 
Segmentation Faults, Page Faults, Processes, Threads, and Tasks
Segmentation Faults, Page Faults, Processes, Threads, and TasksSegmentation Faults, Page Faults, Processes, Threads, and Tasks
Segmentation Faults, Page Faults, Processes, Threads, and Tasks
David Evans
 
Large data with Scikit-learn - Boston Data Mining Meetup - Alex Perrier
Large data with Scikit-learn - Boston Data Mining Meetup  - Alex PerrierLarge data with Scikit-learn - Boston Data Mining Meetup  - Alex Perrier
Large data with Scikit-learn - Boston Data Mining Meetup - Alex Perrier
Alexis Perrier
 
High-Performance Haskell
High-Performance HaskellHigh-Performance Haskell
High-Performance HaskellJohan Tibell
 
DA_02_algorithms.pptx
DA_02_algorithms.pptxDA_02_algorithms.pptx
DA_02_algorithms.pptx
Alok Mohapatra
 
Building a DSL with GraalVM (VoxxedDays Luxembourg)
Building a DSL with GraalVM (VoxxedDays Luxembourg)Building a DSL with GraalVM (VoxxedDays Luxembourg)
Building a DSL with GraalVM (VoxxedDays Luxembourg)
Maarten Mulders
 
Next Generation Indexes For Big Data Engineering (ODSC East 2018)
Next Generation Indexes For Big Data Engineering (ODSC East 2018)Next Generation Indexes For Big Data Engineering (ODSC East 2018)
Next Generation Indexes For Big Data Engineering (ODSC East 2018)
Daniel Lemire
 
Pycon 2011 talk (may not be final, note)
Pycon 2011 talk (may not be final, note)Pycon 2011 talk (may not be final, note)
Pycon 2011 talk (may not be final, note)c.titus.brown
 
Queuing Sql Server: Utilise queues to increase performance in SQL Server
Queuing Sql Server: Utilise queues to increase performance in SQL ServerQueuing Sql Server: Utilise queues to increase performance in SQL Server
Queuing Sql Server: Utilise queues to increase performance in SQL Server
Niels Berglund
 
PyCon 2011 talk - ngram assembly with Bloom filters
PyCon 2011 talk - ngram assembly with Bloom filtersPyCon 2011 talk - ngram assembly with Bloom filters
PyCon 2011 talk - ngram assembly with Bloom filtersc.titus.brown
 
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacket
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacketCsw2016 wheeler barksdale-gruskovnjak-execute_mypacket
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacket
CanSecWest
 

Similar to PostgreSQL: Joining 1 million tables (20)

Alternative cryptocurrencies
Alternative cryptocurrenciesAlternative cryptocurrencies
Alternative cryptocurrencies
 
Alternative cryptocurrencies
Alternative cryptocurrencies Alternative cryptocurrencies
Alternative cryptocurrencies
 
Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...
Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...
Engineering Fast Indexes for Big-Data Applications: Spark Summit East talk by...
 
Engineering fast indexes
Engineering fast indexesEngineering fast indexes
Engineering fast indexes
 
Naughty And Nice Bash Features
Naughty And Nice Bash FeaturesNaughty And Nice Bash Features
Naughty And Nice Bash Features
 
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
 
Introduction to Presto at Treasure Data
Introduction to Presto at Treasure DataIntroduction to Presto at Treasure Data
Introduction to Presto at Treasure Data
 
How to make a large C++-code base manageable
How to make a large C++-code base manageableHow to make a large C++-code base manageable
How to make a large C++-code base manageable
 
MapReduce: teoria e prática
MapReduce: teoria e práticaMapReduce: teoria e prática
MapReduce: teoria e prática
 
High Performance Systems Without Tears - Scala Days Berlin 2018
High Performance Systems Without Tears - Scala Days Berlin 2018High Performance Systems Without Tears - Scala Days Berlin 2018
High Performance Systems Without Tears - Scala Days Berlin 2018
 
Segmentation Faults, Page Faults, Processes, Threads, and Tasks
Segmentation Faults, Page Faults, Processes, Threads, and TasksSegmentation Faults, Page Faults, Processes, Threads, and Tasks
Segmentation Faults, Page Faults, Processes, Threads, and Tasks
 
Large data with Scikit-learn - Boston Data Mining Meetup - Alex Perrier
Large data with Scikit-learn - Boston Data Mining Meetup  - Alex PerrierLarge data with Scikit-learn - Boston Data Mining Meetup  - Alex Perrier
Large data with Scikit-learn - Boston Data Mining Meetup - Alex Perrier
 
High-Performance Haskell
High-Performance HaskellHigh-Performance Haskell
High-Performance Haskell
 
DA_02_algorithms.pptx
DA_02_algorithms.pptxDA_02_algorithms.pptx
DA_02_algorithms.pptx
 
Building a DSL with GraalVM (VoxxedDays Luxembourg)
Building a DSL with GraalVM (VoxxedDays Luxembourg)Building a DSL with GraalVM (VoxxedDays Luxembourg)
Building a DSL with GraalVM (VoxxedDays Luxembourg)
 
Next Generation Indexes For Big Data Engineering (ODSC East 2018)
Next Generation Indexes For Big Data Engineering (ODSC East 2018)Next Generation Indexes For Big Data Engineering (ODSC East 2018)
Next Generation Indexes For Big Data Engineering (ODSC East 2018)
 
Pycon 2011 talk (may not be final, note)
Pycon 2011 talk (may not be final, note)Pycon 2011 talk (may not be final, note)
Pycon 2011 talk (may not be final, note)
 
Queuing Sql Server: Utilise queues to increase performance in SQL Server
Queuing Sql Server: Utilise queues to increase performance in SQL ServerQueuing Sql Server: Utilise queues to increase performance in SQL Server
Queuing Sql Server: Utilise queues to increase performance in SQL Server
 
PyCon 2011 talk - ngram assembly with Bloom filters
PyCon 2011 talk - ngram assembly with Bloom filtersPyCon 2011 talk - ngram assembly with Bloom filters
PyCon 2011 talk - ngram assembly with Bloom filters
 
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacket
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacketCsw2016 wheeler barksdale-gruskovnjak-execute_mypacket
Csw2016 wheeler barksdale-gruskovnjak-execute_mypacket
 

Recently uploaded

2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
Georgi Kodinov
 
Into the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdfInto the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdf
Ortus Solutions, Corp
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
Matt Welsh
 
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
informapgpstrackings
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
Globus
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Natan Silnitsky
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
Globus
 
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdfEnhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Jay Das
 
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
Tier1 app
 
Using IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New ZealandUsing IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New Zealand
IES VE
 
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Globus
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
Juraj Vysvader
 
Graphic Design Crash Course for beginners
Graphic Design Crash Course for beginnersGraphic Design Crash Course for beginners
Graphic Design Crash Course for beginners
e20449
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
Paco van Beckhoven
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Globus
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
Globus
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
XfilesPro
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Globus
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
AMB-Review
 

Recently uploaded (20)

2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
 
Into the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdfInto the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdf
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
 
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
 
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdfEnhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
 
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
 
Using IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New ZealandUsing IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New Zealand
 
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
 
Graphic Design Crash Course for beginners
Graphic Design Crash Course for beginnersGraphic Design Crash Course for beginners
Graphic Design Crash Course for beginners
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
 

PostgreSQL: Joining 1 million tables

  • 1. Joining 1 million tables Hans-J¨urgen Sch¨onig www.postgresql-support.de Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 2. Joining 1 million tables Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 3. The goal Creating and joining 1 million tables Pushing PostgreSQL to its limits (and definitely beyond) How far can we push this without patching the core? Being the first one to try this ;) Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 4. Creating 1 million tables is easy, no? Here is a script: #!/usr/bin/perl print "BEGIN;n"; for (my $i = 0; $i < 1000000; $i++) { print "CREATE TABLE t_tab_$i (id int4);n"; } print "COMMIT;n"; Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 5. Creating 1 million tables Who expects this to work? Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 6. Well, it does not ;) CREATE TABLE CREATE TABLE WARNING: out of shared memory ERROR: out of shared memory HINT: You might need to increase max_locks_per_transaction. ERROR: current transaction is aborted, commands ignored until end of transaction block Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 7. Why does it fail? There is a variable called max locks per transaction max locks per transaction * max connections = maximum number of locks 100 connection x 64 is by far not enough Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 8. A quick workaround Use single transactions Avoid nasty disk flushes on the way SET synchronous_commit TO off; time ./create_tables.pl | psql test > /dev/null real 14m45.320s user 0m45.778s sys 0m18.877s Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 9. Reporting success . . . test=# SELECT count(tablename) FROM pg_tables WHERE schemaname = ’public’; count --------- 1000000 (1 row) One million tables have been created Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 10. Creating an SQL statement Writing this by hand is not feasible Desired output: SELECT 1 FROM t_tab_1, t_tab_2, t_tab_3, t_tab_4, t_tab_5 WHERE t_tab_1.id = t_tab_2.id AND t_tab_2.id = t_tab_3.id AND t_tab_3.id = t_tab_4.id AND t_tab_4.id = t_tab_5.id AND 1 = 1; Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 11. A first attempt to write a script #!/usr/bin/perl my $joins = 5; my $sel = "SELECT 1 "; my $from = ""; my $where = ""; for (my $i = 1; $i < $joins; $i++) { $from .= "t_tab_$i, n"; $where .= " t_tab_" . ($i) . ".id = t_tab_". ($i+1) . ".id AND n"; } $from .= " t_tab_$joins "; $where .= " 1 = 1; "; print "$sel n FROM $from n WHERE $wheren"; Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 12. 1 million tables: What it means If you do this for 1 million tables . . . perl create.pl | wc 2000001 5000003 54666686 54 MB of SQL is quite a bit of code First observation: The parser works ;) Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 13. Giving it a try runtimes with PostgreSQL default settings: n = 10: 158 ms n = 100; 948 ms n = 200: 3970 ms n = 400: 18356 ms n = 800: 87095 ms n = 1000000: prediction = 1575 days :) ouch, this does not look linear 1575 days was too close to the conference Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 14. GEQO: At some point GEQO will kick in Maybe it is the problem? SET geqo TO off; n = 10: 158 ms n = 100: ERROR: canceling statement due to user request Time: 680847.127 ms It clearly is not . . . Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 15. It seems GEQO is needed Let us see how far we can get by adjusting GEQO settings SET geqo_effort TO 1; - n = 10: 158 ms - n = 100: 916 ms - n = 200: 1464 ms - n = 400: 4694 ms - n = 800: 18896 ms - n = 10000000: maybe 1 year? :) => the conference deadline is coming closer => still 999.000 tables to go ... Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 16. Trying with 10.000 tables ./make_join.pl 10000 | psql test ERROR: stack depth limit exceeded HINT: Increase the configuration parameter max_stack_depth (currently 2048kB), after ensuring the platform’s stack depth limit is adequate. this is easy to fix Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 17. Removing limitations Problem can be fixed easily OS: ulimit -s unlimited postgresql.conf: set max stack depth to a very high value Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 18. Next stop: Exponential planning time Obviously planning time goes up exponentially There is no way to do this without patching As long as we are ways slower than linear it is impossible to join 1 million tables. Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 19. This is how PostgreSQL joins tables: Tables a, b, c, d form the following joins of 2: [a, b], [a, c], [a, d], [b, c], [b, d], [c, d] By adding a single table to each group, the following joins of 3 are constructed: [[a, b], c], [[a, b], d], [[a, c], b], [[a, c], d], [[a, d], b], [[a, d], c], [[b, c], a], [[b, c], d], [[b, d], a], [[b, d], c], [[c, d], a], [[c, d], b] Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 20. More joining Likewise, add a single table to each to get the final joins: [[[a, b], c], d], [[[a, b], d], c], [[[a, c], b], d], [[[a, c], d], b], [[[a, d], b], c], [[[a, d], c], b], [[[b, c], a], d], [[[b, c], d], a], [[[b, d], a], c], [[[b, d], c], a], [[[c, d], a], b], [[[c, d], b], a] Got it? ;) Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 21. A little patching A little patch ensures that the list of tables is grouped into “sub-lists”, where the length of each sub-list is equal to from collapse limit. If from collapse limit is 2, the list of 4 becomes . . . [[a, b], [c, d]] [a, b] and [c, d] are joined separately, so we only get one “path”: [[a, b], [c, d]] Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 22. The downside: This ensures minimal planning time but is also no real optimization. There is no potential for GEQO to improve things Speed is not an issue in our case anyway. It just has to work “somehow”. Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 23. Are we on a path to success? Of course not ;) Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 24. The next limitation the join fails again ERROR: too many range table entries Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 25. This time life is easy Solution: in include/nodes/primnodes.h #define INNER_VAR 65000 /* reference to inner subplan */ #define OUTER_VAR 65001 /* reference to outer subplan * #define INDEX_VAR 65002 /* reference to index column */ to #define INNER_VAR 2000000 #define OUTER_VAR 2000001 #define INDEX_VAR 2000002 Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 26. Next stop: RAM starts to be an issue | Tables (k) | Planning time (s) | Memory |------------+-------------------+---------------------| | 2 | 1 | 38.5 MB | | 4 | 3 | 75.6 MB | | 8 | 31 | 192 MB | | 16 | 186 | 550 MB | | 32 | 1067 | 1.7 GB | (this is for planning only) Expected amount of memory: not available ;) Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 27. Is there a way out? Clearly: The current path leads to nowhere Some other approach is needed Divide and conquer? Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 28. Changing the query Currently there is . . . SELECT 1 ... FROM tab1, ..., tabn WHERE ... The problem must be split Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 29. CTEs might come to the rescue How about creating WITH-clauses joining smaller sets of tables? At the end those WITH statements could be joined together easily. 1.000 CTEs with 1.000 tables each gives 1.000.000 tables Scary stuff: Runtime + memory Unforeseen stuff Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 30. A skeleton query SET enable_indexscan TO off; SET enable_hashjoin TO off; SET enable_mergejoin TO off; SET from_collapse_limit TO 2; SET max_stack_depth TO ’1GB’; WITH cte_0(id_1, id_2) AS ( SELECT t_tab_0.id, t_tab_16383.id FROM t_tab_0, ... cte_1 ... SELECT ... FROM cte_0, cte_1 WHERE ... Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 31. Finally: A path to enlightenment With all the RAM we had . . . With all the patience we had . . . With a fair amount of luck . . . And with a lot of confidence . . . Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 32. Finally: It worked . . . Amount of RAM needed: around 40 GB Runtime: 10h 36min 53 sec The result: ?column? ---------- (0 rows) Hans-J¨urgen Sch¨onig www.postgresql-support.de
  • 34. Thank you for your attention Cybertec Sch¨onig & Sch¨onig GmbH Hans-J¨urgen Sch¨onig Gr¨ohrm¨uhlgasse 26 A-2700 Wiener Neustadt www.postgresql-support.de Hans-J¨urgen Sch¨onig www.postgresql-support.de