I know greater-than-or-equal-to when I see it! (PGCon 2014)

This talk explores the theory behind operator classes and families, including the assumptions data type-independent code is and is not entitled to make based on available operator classes. We will walk through the creation of new operator classes, some practical and others deliberately perverse, and examine some exceptional operator classes already present in PostgreSQL.


  1. I know greater-than-or-equal-to when I see it!
     Noah Misch | 2014-05-22
     >= <> && <@
     © 2013 EDB All rights reserved.
  2. Layers of Index Support
     ■ Index access methods (pg_am)
       − Type-independent; specific to certain index layout
       − btree, hash, gist, gin, spgist
     ■ Operator classes (pg_opclass)
       − Specific to a data type + index access method
       − Tightly related: operator families (pg_opfamily)
       − int4_ops, text_ops
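
     The two layers can be inspected directly in the system catalogs. The query below is a minimal sketch that joins pg_opclass to pg_am for the two operator class names mentioned on the slide; the column aliases are illustrative, and the exact rows returned depend on the PostgreSQL version.

     ```sql
     -- One row per (access method, operator class) pair for the named classes.
     SELECT am.amname               AS access_method,
            opc.opcname             AS operator_class,
            opc.opcintype::regtype  AS indexed_type,
            opc.opcdefault          AS is_default
     FROM pg_opclass opc
     JOIN pg_am am ON am.oid = opc.opcmethod
     WHERE opc.opcname IN ('int4_ops', 'text_ops')
     ORDER BY am.amname, opc.opcname;
     ```

     The same operator class name can appear under more than one access method, which is why the access method is part of the key.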
  3. What is an operator class?
     ■ In general: ties a data type to an access method
     ■ The case of btree: comparison function and operators

         CREATE TABLE t (c date PRIMARY KEY);
         INSERT INTO t VALUES ('2014-01-01');
         INSERT INTO t VALUES ('2015-01-01');
         ...
         -- <(date,date) operator
         SELECT * FROM t WHERE c < current_date;
  4. What is an operator family?
     ■ Extends operator support to multiple data types
     ■ Relevant for btree and hash only

         CREATE TABLE t (c date PRIMARY KEY);
         INSERT INTO t VALUES ('2014-01-01');
         INSERT INTO t VALUES ('2015-01-01');
         ...
         -- <(date,timestamptz) operator
         SELECT * FROM t WHERE c < now();
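
     To see which cross-type operators a family contributes, one can list its pg_amop entries. This sketch assumes the btree family covering date is named datetime_ops, as it is in stock PostgreSQL; it shows every operator in that family whose left argument is date.

     ```sql
     -- Operators usable with a btree index on a date column,
     -- including cross-type ones such as <(date, timestamptz).
     SELECT ao.amoprighttype::regtype AS right_type,
            ao.amopstrategy           AS strategy,
            ao.amopopr::regoperator   AS operator
     FROM pg_amop ao
     JOIN pg_opfamily f ON f.oid = ao.amopfamily
     JOIN pg_am am      ON am.oid = f.opfmethod
     WHERE am.amname = 'btree'
       AND f.opfname = 'datetime_ops'
       AND ao.amoplefttype = 'date'::regtype
     ORDER BY right_type::text, strategy;
     ```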
  5. btree int4_ops walk-through
     ■ FUNCTION entries maintain the index
     ■ List of OPERATOR qualified to exploit the index
     ■ “equal-sign operator” vs. “equality operator”

         CREATE OPERATOR FAMILY integer_ops USING btree;
         CREATE OPERATOR CLASS int4_ops
             DEFAULT FOR TYPE integer
             USING btree FAMILY integer_ops AS
             FUNCTION 1 btint4cmp(integer, integer),
             OPERATOR 1 <,
             OPERATOR 2 <=,
             OPERATOR 3 =,
             OPERATOR 4 >=,
             OPERATOR 5 >;
  6. System Catalog Representation (diagram; catalog and example entry)
         pg_am        btree
         pg_opclass   int4_ops
         pg_opfamily  integer_ops
         pg_opclass   int2_ops
         pg_amproc    btint4cmp
         pg_amproc    btint24cmp
         pg_amproc    btint2cmp
         pg_amop      <(int4,int4)
         pg_amop      <(int2,int4)
         pg_amop      <(int2,int2)
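
     The pg_amproc side of this picture can be reproduced with a catalog query. The following is a sketch, not part of the original slides; it lists the support functions registered in the btree integer_ops family, which should include btint4cmp, btint24cmp, and btint2cmp alongside entries for other integer widths.

     ```sql
     -- Support functions (pg_amproc) of the btree integer_ops family.
     SELECT p.amproclefttype::regtype    AS left_type,
            p.amprocrighttype::regtype   AS right_type,
            p.amprocnum                  AS support_number,
            p.amproc::oid::regprocedure  AS support_function
     FROM pg_amproc p
     JOIN pg_opfamily f ON f.oid = p.amprocfamily
     JOIN pg_am am      ON am.oid = f.opfmethod
     WHERE am.amname = 'btree'
       AND f.opfname = 'integer_ops'
     ORDER BY left_type::text, right_type::text, support_number;
     ```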
  7. Multiple Operator Classes
     ■ btree: other sort orders
       − text_pattern_ops
     ■ hash: not done in practice
     ■ gin, gist, spgist: fruitful opportunities
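
     As an illustration of "other sort orders", here is a sketch of a second, non-default btree operator class for integer that orders values by absolute value. Every name in it (int4_abs_cmp, the int4_abs_* functions, the |<| family of operators, int4_abs_ops) is hypothetical and not taken from the talk. It assumes a superuser session, uses SQL-language functions for brevity where real operator classes would normally use C, and fails on the most negative integer because abs() overflows there.

     ```sql
     -- Comparison support function: three-way compare on absolute values.
     CREATE FUNCTION int4_abs_cmp(integer, integer) RETURNS integer
         LANGUAGE sql IMMUTABLE STRICT
         AS 'SELECT btint4cmp(abs($1), abs($2))';

     -- Boolean functions backing the five btree strategies.
     CREATE FUNCTION int4_abs_lt(integer, integer) RETURNS boolean
         LANGUAGE sql IMMUTABLE STRICT AS 'SELECT abs($1) <  abs($2)';
     CREATE FUNCTION int4_abs_le(integer, integer) RETURNS boolean
         LANGUAGE sql IMMUTABLE STRICT AS 'SELECT abs($1) <= abs($2)';
     CREATE FUNCTION int4_abs_eq(integer, integer) RETURNS boolean
         LANGUAGE sql IMMUTABLE STRICT AS 'SELECT abs($1) =  abs($2)';
     CREATE FUNCTION int4_abs_ge(integer, integer) RETURNS boolean
         LANGUAGE sql IMMUTABLE STRICT AS 'SELECT abs($1) >= abs($2)';
     CREATE FUNCTION int4_abs_gt(integer, integer) RETURNS boolean
         LANGUAGE sql IMMUTABLE STRICT AS 'SELECT abs($1) >  abs($2)';

     -- Operators with deliberately unusual names: the names carry no meaning.
     CREATE OPERATOR |<|  (LEFTARG = integer, RIGHTARG = integer, PROCEDURE = int4_abs_lt);
     CREATE OPERATOR |<=| (LEFTARG = integer, RIGHTARG = integer, PROCEDURE = int4_abs_le);
     CREATE OPERATOR |=|  (LEFTARG = integer, RIGHTARG = integer, PROCEDURE = int4_abs_eq);
     CREATE OPERATOR |>=| (LEFTARG = integer, RIGHTARG = integer, PROCEDURE = int4_abs_ge);
     CREATE OPERATOR |>|  (LEFTARG = integer, RIGHTARG = integer, PROCEDURE = int4_abs_gt);

     -- Non-default class: an index must name it explicitly.
     CREATE OPERATOR CLASS int4_abs_ops
         FOR TYPE integer USING btree AS
         OPERATOR 1 |<|,
         OPERATOR 2 |<=|,
         OPERATOR 3 |=|,
         OPERATOR 4 |>=|,
         OPERATOR 5 |>|,
         FUNCTION 1 int4_abs_cmp(integer, integer);
     ```

     An index would opt in with something like CREATE INDEX ON some_table (n int4_abs_ops), where some_table and n are placeholders, and a query could then sort with ORDER BY n USING |<|.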
  8. ORDER BY

         -- uses btree text_ops
         ORDER BY textcol;
         -- uses btree text_pattern_ops
         ORDER BY textcol USING ~<~;
         -- can use e.g. gist_trgm_ops
         ORDER BY textcol <-> 'search condition';
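
     For the middle case to be served by an index, the index has to be built with the matching operator class. A minimal sketch follows; the table name words and the index name are made up for the example, and whether the planner actually chooses the index depends on table size and statistics.

     ```sql
     -- Hypothetical table with a text column named as on the slide.
     CREATE TABLE words (textcol text);

     -- Index ordered by text_pattern_ops rather than the default text_ops.
     CREATE INDEX words_textcol_pattern_idx
         ON words (textcol text_pattern_ops);

     -- Sorts with the ~<~ ordering, which this index can provide.
     SELECT textcol FROM words ORDER BY textcol USING ~<~;

     -- Left-anchored pattern matches are the usual reason to build such an index.
     SELECT textcol FROM words WHERE textcol LIKE 'abc%';
     ```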
  9. Equality
     ■ UNION
     ■ GROUP BY, DISTINCT
     ■ array, composite type comparisons
     ■ Choice of default equality semantics is important

         [local] test=# SELECT DISTINCT x FROM unnest(array[1.00, 1.1, 1.0]) t(x);
           x
         ──────
          1.1
          1.00
         (2 rows)
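
     The DISTINCT result above collapses 1.00 and 1.0 because numeric's default equality ignores display scale. A one-line check makes the contrast with textual comparison explicit; the column aliases are illustrative.

     ```sql
     -- numeric equality treats 1.00 and 1.0 as the same value,
     -- while their text representations differ.
     SELECT 1.00 = 1.0             AS numeric_equal,  -- true
            1.00::text = 1.0::text AS text_equal;     -- false
     ```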
  10. Equality Surprises
     ■ Operator names like “=” and “<” are not special ...
     ■ … excepting CASE, IN, IS DISTINCT FROM, etc.

         [local] test=# SELECT DISTINCT x
         FROM unnest(array['(1,1),(0,0)', '(2,2),(1,1)']::box[]) t(x);
         ERROR: could not identify an equality operator for type box
         [local] test=# SELECT '(1,1),(0,0)'::box IN ('(2,2),(1,1)'::box);
          ?column?
         ──────────
          t
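
     The IN result is true because IN resolves to the operator named =, and for box that operator compares areas. It is not registered as equality in any default btree or hash operator class, which is why DISTINCT fails. The sketch below contrasts it with ~=, the "same as" operator for box.

     ```sql
     -- Both boxes have area 1, so the equal-sign operator reports true,
     -- while the "same as" operator compares the actual corners.
     SELECT '(1,1),(0,0)'::box =  '(2,2),(1,1)'::box AS equal_sign,  -- true
            '(1,1),(0,0)'::box ~= '(2,2),(1,1)'::box AS same_as;     -- false
     ```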
  11. Merge Join

         [local] test=# SET enable_hashjoin = off;
         SET
         [local] test=# EXPLAIN (costs off)
         SELECT opfmethod, opfname, array_agg(amopopr)
         FROM pg_amop ao JOIN pg_opfamily f ON amopfamily = f.oid
         GROUP BY 1,2;
                          QUERY PLAN
         ─────────────────────────────────────────────
          HashAggregate
            Group Key: f.opfmethod, f.opfname
            ->  Merge Join
                  Merge Cond: (f.oid = ao.amopfamily)
                  ->  Sort
                        Sort Key: f.oid
                        ->  Seq Scan on pg_opfamily f
                  ->  Sort
                        Sort Key: ao.amopfamily
                        ->  Seq Scan on pg_amop ao
  12. Hash Join

         [local] test=# EXPLAIN (costs off)
         SELECT opfmethod, opfname, array_agg(amopopr)
         FROM pg_amop ao JOIN pg_opfamily f ON amopfamily = f.oid
         GROUP BY 1,2;
                          QUERY PLAN
         ─────────────────────────────────────────────
          HashAggregate
            Group Key: f.opfmethod, f.opfname
            ->  Hash Join
                  Hash Cond: (ao.amopfamily = f.oid)
                  ->  Seq Scan on pg_amop ao
                  ->  Hash
                        ->  Seq Scan on pg_opfamily f
  13. Writing Generic Data Type Consumers
     ■ Don't hard-code “=”
     ■ Which equality semantics?
       − btree/hash default equality
       − exact match (output comparison; record_image_ops)
     ■ Do look up equality by operator class
       − backend: TYPECACHE_EQ_OPR
       − frontend: copy its algorithm
     ■ Not all types have these operations
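
     For frontend code, a rough approximation of the lookup is to find the strategy-3 (equality) member of the type's default btree operator class. The query below is a simplified sketch only, using numeric as the example type. It does not reproduce everything the backend type cache does: it ignores the hash fallback, domains, and types such as arrays whose default class is declared on a polymorphic type.

     ```sql
     -- Default btree equality operator for a given type (here: numeric).
     -- Returns no row when the type has no default btree operator class.
     SELECT ao.amopopr::regoperator AS equality_operator
     FROM pg_opclass opc
     JOIN pg_am am   ON am.oid = opc.opcmethod
     JOIN pg_amop ao ON ao.amopfamily    = opc.opcfamily
                    AND ao.amoplefttype  = opc.opcintype
                    AND ao.amoprighttype = opc.opcintype
     WHERE am.amname = 'btree'
       AND opc.opcdefault
       AND opc.opcintype = 'numeric'::regtype
       AND ao.amopstrategy = 3;  -- btree strategy 3 is equality
     ```

     Substituting 'box'::regtype returns zero rows, matching the "Not all types have these operations" bullet.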
  14. Implementing Data Types
     ■ Choice of default equality semantics is important
       − Option to omit them entirely (xml, json, box)
     ■ Try to include a default btree operator class
     ■ Default hash operator class is then easy
     ■ Other access methods are situation-specific
       − gin for container-like types
       − gist often starts with the search strategy, not the type
  15. Questions?
  16. Further Reading
     ■ http://www.postgresql.org/docs/current/static/xindex.html
     ■ contrib/btree_gist, contrib/btree_gin
     ■ Other built-in and contrib operator classes
     ■ ATAddForeignKeyConstraint()
  17. hash int4_ops

         CREATE OPERATOR CLASS int4_ops
             DEFAULT FOR TYPE integer
             USING hash FAMILY integer_ops AS
             FUNCTION 1 hashint4(integer),
             OPERATOR 1 =;
  18. Array Element Searches: gin _int4_ops

         CREATE OPERATOR CLASS _int4_ops
             DEFAULT FOR TYPE integer[]
             USING gin FAMILY array_ops AS
             STORAGE integer,
             FUNCTION 1 btint4cmp(integer,integer),
             FUNCTION 2 ginarrayextract(...),
             FUNCTION 3 ginqueryarrayextract(...),
             FUNCTION 4 ginarrayconsistent(...),
             OPERATOR 1 &&(anyarray,anyarray),
             OPERATOR 2 @>(anyarray,anyarray),
             OPERATOR 3 <@(anyarray,anyarray),
             OPERATOR 4 =(anyarray,anyarray);
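
     From the user's side, the default gin operator class for integer[] is picked up without being named. The sketch below uses a hypothetical table docs with a tags column; the four operators listed in the class definition are the ones such an index can serve.

     ```sql
     -- Hypothetical table holding integer arrays.
     CREATE TABLE docs (id serial PRIMARY KEY, tags integer[]);

     -- No operator class named, so the default gin class for integer[] applies.
     CREATE INDEX docs_tags_idx ON docs USING gin (tags);

     -- Containment: rows whose tags include both 1 and 2.
     SELECT id FROM docs WHERE tags @> ARRAY[1, 2];

     -- Overlap: rows sharing at least one element with the given array.
     SELECT id FROM docs WHERE tags && ARRAY[3, 4];
     ```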