Robert Haas Query Planning Gone Wrong Presentation @ Postgres Open
Click to edit Master subtitle style
Query Planning Gone Wrong
Chief Architect, Database Server
Why This Talk?
● 2010: The PostgreSQL Query Planner (Robert Haas)
● How does the query planner actually work from a user
perspective? What does it really do?
● Very common audience question: What do I do when the query
planner fails? How do I fix my query?
● 2011: Hacking the Query Planner (Tom Lane)
● How does the query planner actually work from a developer
perspective? What does it *really* do?
● Plea for help to improve the query planner.
● But... what should we be improving?
Methodology: Which Problems Matter?
● Read hundreds of email threads on pgsql-performance over a
period of almost two years.
● Disregarded all those that were not about query performance
● Decided what I thought the root cause (or, occasionally, causes) of
each complaint was.
● Skipped a very small number where I couldn't form an opinion.
● Counted the number of times each problem was reported.
Methodology: Possible Critiques
● The problems reported on pgsql-performance aren't necessarily
representative of all the problems PostgreSQL users encounter
● In particular, confusing problems might be more likely to be
● I might not have correctly identified the cause of each problem
Statistically Speaking, Why Is My Query Slow? (168)
● Settings (23). Includes anything you can fix with postgresql.conf
changes, DDL, or operating systems settings changes.
● Just Plain Slow (23). Includes anything that amounts to an
unreasonable expectation on the part of the user. These are often
questions of the form “why is query A slower than query B?” when
A is actually doing something much more expensive than B.
● We're Bad At That (22). Includes anything that could be faster in
some other database product, but isn't fast in PostgreSQL for
some reason (not implemented yet, or architectural artifact).
● Planner Error (83). Bad decisions about the cost of one plan vs.
another plan due to limitations of the optimizer.
● Bugs (14). Bugs in the query planner, or in one case, the Linux
● User Error (3). User got confused and did something illogical.
● Planner Cost Constants (8). Adjustments needed to
seq_page_cost, random_page_cost, and perhaps cpu_tuple_cost
to accurately model real costs.
● Missing Index (4)
● Cost for @@ Operator Is Too Low (2)
● work_mem Too Low (2)
● Statistics Target Too Low (2)
● Statistics Target Too High (1)
● n_distinct Estimates Aren't Accurate On Large Tables (1)
● Not Analyzing Tables Often Enough (1)
● TOAST Decompression is Slow (1)
● vm.zone_reclaim_mode = 1 Causes Extra Disk I/O (1)
Just Plain Slow (23)
● It Takes a While to Process a Lot of Data (6)
● Disks Are Slower Than Memory (6)
● Clauses Involving Multiple Tables Can't Be Pushed Down (2)
● Random I/O is Slower Than Sequential I/O (1)
● Linearly Scanning an Array is O(n) (1)
● One Regular Expression is Faster Than Two (1)
● Can't Figure Out Which Patterns Match a String Without Trying
Them All (1)
● xmlagg Is Much Slower Than string_agg (1)
● Scanning More Tables is Slower Than Scanning Fewer Tables (1)
● Replanning Isn't Free (1)
● Repeated Concatenation Using xmlconcat Is Slow (1)
● UNION is Slower than UNION ALL (1)
We're Bad At That (22)
● Plan Types We Can't Generate (11)
● Parameterized Paths (7). Two of these are post-9.2 complaints,
involving cases where 9.2 can't parameterize as needed.
● Merge Append (3). Fixed in 9.1.
● Batched Sort of Data Already Ordered By Leading Columns (1).
● Executor Limitations (3)
● Indexing Unordered Data Causes Random I/O (1)
● <> is Not Indexable (1)
● DISTINCT + HashAggregate Reads All Input Before Emitting
Any Results (1). This matters if there is a LIMIT.
● Architecture (8)
● No Parallel Query (2), Table Bloat (1), Backend Startup Cost (1),
Redundant Updates Are Expensive (1), AFTER Trigger Queue
Size (1), On-Disk Size of numeric (1), Autovacuum Not Smart
About Inherited Tables (1)
Planner Errors (83)
● Conceptual Errors (28). The planner isn't able to recognize that
two different queries are equivalent, so it doesn't even consider the
● Estimation Errors (55). The planner considers the optimal plan, but
rejects it as too expensive.
● Row Count Estimation Errors (48). The planner mis-estimates
the number of rows that will be returned by some scan, join, or
● Cost Estimation Errors (7). The planner estimates the row
count correctly but incorrectly estimates the relative cost.
Grand Prize Winners
● Selectivity of filter conditions involving correlated columns is
estimated inaccurately (13)
● Suppose we want all the rows from a table where a = 1 and b =
1 and c = 1 and d = 1 and e = 1. The planner must estimate the
number of rows that will match, but only has statistics on each
● Planner incorrectly thinks that “SELECT * FROM foo WHERE a = 1
ORDER BY b LIMIT n” will fill the limit after reading a small
percentage of the index (11)
● It can scan an index on b and filter for rows where a = 1.
● Or it can scan an index on a, find all rows where a = 1, and
perform a top-N sort.
● It often prefers the former when the latter would be faster.
● Can often be worked around with a composite or functional
Planner Error: Row Count Estimation – Others (24)
● Using WITH Results in a Bad Plan (5). Some of these are query
fattening issues, while others result from failure to dig out variable
● Generic Plans Can Have Wildly Wrong Estimates (4). Improved.
● Selectivity Estimates on Arbitrary Estimates are Poor (4)
● Join Selectivity Doesn't Know about Cross-Table Correlations (3)
● Uncommitted Tuples Don't Affect Statistics (2)
● No Stats for WITH RECURSIVE (1) or GROUP BY (1) Results
● Redundant Equality Constraints Not Identified As Such (1)
● IN/NOT IN Estimation Doesn't Assume Array Elements Distinct (1).
● Histogram Bounds Can Slide Due to New Data (1). Fixed.
● Inheritance Parents Aren't Assumed to be Completely Empty (1).
Planner Error: Cost Estimation (7)
● Planner doesn't account for de-TOASTing cost (4)
● Plan change causes volume of data to exceed server memory (2)
● Hash join sometimes decides to hash the larger table when it
should probably be hashing the smaller one (1)
Planner Error: Conceptual (28)
● Cross-data type comparisons are not always indexable (3)
● Inlining the same thing multiple times can lose (3)
● NOT IN is hard to optimize – and we don't try very hard (3)
● Target lists are computed too early or unnecessary targets are
● Can't rewrite SELECT max(a) FROM foo WHERE b IN (…) as max of
index scans (2)
● Can't rearrange joins and aggregates relative to one another (2)
● Can't deduce implied inequalities (2)
● Ten other issues that came up once each