I know greater-than-or-equal-to when I see it! (PGCon 2014)
Noah Misch | 2014-05-22
© 2013 EDB All rights reserved.
>= <> && <@
Layers of Index Support
■ Index access methods (pg_am)
− Type-independent; specific to a particular index layout
− btree, hash, gist, gin, spgist
■ Operator classes (pg_opclass)
− Specific to a data type + index access method
− Tightly related: operator families (pg_opfamily)
− Examples: int4_ops, text_ops
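To see these layers in a live database, a catalog query along the following lines may help. It is a minimal sketch: the choice of integer and text as the types to inspect is only for illustration, and the exact rows depend on the PostgreSQL version and installed extensions.

```sql
-- List operator classes per index access method for two sample types.
-- opcdefault marks the class used when CREATE INDEX names no operator class.
SELECT am.amname,
       opc.opcname,
       opc.opcintype::regtype AS indexed_type,
       opc.opcdefault
FROM pg_opclass opc
JOIN pg_am am ON am.oid = opc.opcmethod
WHERE opc.opcintype IN ('integer'::regtype, 'text'::regtype)
ORDER BY am.amname, opc.opcname;
```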
What is an operator class?
■ In general: ties a data type to an access method
■ The case of btree: comparison function and operators
CREATE TABLE t (c date PRIMARY KEY);
INSERT INTO t VALUES ('2014-01-01');
INSERT INTO t VALUES ('2015-01-01');
...
-- <(date,date) operator
SELECT * FROM t WHERE c < current_date;
What is an operator family?
■ Extends operator support to multiple data types
■ Relevant for btree and hash only
CREATE TABLE t (c date PRIMARY KEY);
INSERT INTO t VALUES ('2014-01-01');
INSERT INTO t VALUES ('2015-01-01');
...
-- <(date,timestamptz) operator
SELECT * FROM t WHERE c < now();
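The cross-type operators that make the query above indexable can be listed from the catalogs. The sketch below finds the family through the default btree operator class for date rather than hard-coding a family name; the output varies by PostgreSQL version.

```sql
-- Operators in the btree family of date's default operator class
-- whose left operand is date; right operand types other than date
-- are the cross-type members contributed by the operator family.
SELECT amop.amopstrategy,
       amop.amopopr::regoperator AS operator
FROM pg_opclass opc
JOIN pg_am am ON am.oid = opc.opcmethod
JOIN pg_amop amop ON amop.amopfamily = opc.opcfamily
WHERE am.amname = 'btree'
  AND opc.opcdefault
  AND opc.opcintype = 'date'::regtype
  AND amop.amoplefttype = 'date'::regtype
ORDER BY amop.amoprighttype, amop.amopstrategy;
```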
btree int4_ops walk-through
■ FUNCTION entries maintain the index
■ OPERATOR entries list the operators qualified to exploit the index
■ “equal-sign operator” vs. “equality operator”
CREATE OPERATOR FAMILY integer_ops USING btree;
CREATE OPERATOR CLASS int4_ops
    DEFAULT FOR TYPE integer
    USING btree FAMILY integer_ops AS
    FUNCTION 1 btint4cmp(integer, integer),
    OPERATOR 1 <,
    OPERATOR 2 <=,
    OPERATOR 3 =,
    OPERATOR 4 >=,
    OPERATOR 5 >;
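The strategy numbers in the definition above are what the catalogs record. A query like this one, offered as a sketch, should list the same-type integer operators of the built-in btree integer_ops family by strategy number.

```sql
-- Strategy number and operator for integer-vs-integer comparisons
-- in the btree integer_ops family.
SELECT amop.amopstrategy,
       amop.amopopr::regoperator AS operator
FROM pg_amop amop
JOIN pg_opfamily opf ON opf.oid = amop.amopfamily
JOIN pg_am am ON am.oid = opf.opfmethod
WHERE am.amname = 'btree'
  AND opf.opfname = 'integer_ops'
  AND amop.amoplefttype = 'integer'::regtype
  AND amop.amoprighttype = 'integer'::regtype
ORDER BY amop.amopstrategy;
```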
System Catalog Representation
pg_am: btree
pg_opfamily: integer_ops
pg_opclass: int4_ops, int2_ops
pg_amproc: btint4cmp, btint24cmp, btint2cmp
pg_amop: <(int4,int4), <(int2,int4), <(int2,int2)
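The pg_amproc side of this picture can be inspected the same way. This sketch lists the comparison functions registered for the btree integer_ops family; newer PostgreSQL versions register additional support function numbers, so the filter on amprocnum is deliberate.

```sql
-- Comparison support functions (support function 1) of btree integer_ops,
-- one row per pair of operand types.
SELECT p.amproclefttype::regtype  AS left_type,
       p.amprocrighttype::regtype AS right_type,
       p.amproc                   AS comparison_function
FROM pg_amproc p
JOIN pg_opfamily f ON f.oid = p.amprocfamily
JOIN pg_am am ON am.oid = f.opfmethod
WHERE am.amname = 'btree'
  AND f.opfname = 'integer_ops'
  AND p.amprocnum = 1
ORDER BY 1, 2;
```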
Multiple Operator Classes
■ btree: other sort orders
− text_pattern_ops
■ hash: not done in practice
■ gin, gist, spgist: fruitful opportunities
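A non-default operator class is chosen per column in CREATE INDEX. The sketch below uses a hypothetical table docs with a text column title; both names are assumptions for illustration only.

```sql
-- Hypothetical table used only for this example.
CREATE TABLE docs (title text);

-- No operator class named: the default btree class for text (text_ops).
CREATE INDEX docs_title_idx ON docs (title);

-- Same column, different sort order: character-by-character comparison.
CREATE INDEX docs_title_pattern_idx ON docs (title text_pattern_ops);

-- A left-anchored pattern match is a typical candidate for the
-- text_pattern_ops index when the database does not use the C locale.
SELECT * FROM docs WHERE title LIKE 'Post%';
```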
ORDER BY
-- uses btree text_ops
ORDER BY textcol;
-- uses btree text_pattern_ops
ORDER BY textcol USING ~<~;
-- can use e.g. gist_trgm_ops
ORDER BY textcol <-> 'search condition';
Equality
■ UNION
■ GROUP BY, DISTINCT
■ array, composite type comparisons
■ Choice of default equality semantics is important
[local] test=# SELECT DISTINCT x
FROM unnest(array[1.00, 1.1, 1.0]) t(x);
x
──────
1.1
1.00
(2 rows)
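The result above folds 1.00 and 1.0 into one row because the default equality for numeric ignores trailing zeros, even though the two values print differently. A small check illustrating the difference:

```sql
-- equal_by_operator is true: numeric "=" treats 1.00 and 1.0 as equal.
-- same_output is false: their text forms are '1.00' and '1.0'.
SELECT 1.00::numeric = 1.0::numeric AS equal_by_operator,
       (1.00::numeric)::text = (1.0::numeric)::text AS same_output;
```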
Equality Surprises
■ Operator names like “=” and “<” are not special ...
■ ... except in CASE, IN, IS DISTINCT FROM, etc., which look up an operator named “=”
[local] test=# SELECT DISTINCT x
FROM unnest(array['(1,1),(0,0)',
'(2,2),(1,1)']::box[]) t(x);
ERROR: could not identify an equality
operator for type box
[local] test=# SELECT '(1,1),(0,0)'::box IN
('(2,2),(1,1)'::box);
?column?
──────────
t
Merge Join
[local] test=# SET enable_hashjoin = off;
SET
[local] test=# EXPLAIN (costs off)
SELECT opfmethod, opfname, array_agg(amopopr)
FROM pg_amop ao JOIN pg_opfamily f
ON amopfamily = f.oid GROUP BY 1,2;
                 QUERY PLAN
─────────────────────────────────────────────
 HashAggregate
   Group Key: f.opfmethod, f.opfname
   ->  Merge Join
         Merge Cond: (f.oid = ao.amopfamily)
         ->  Sort
               Sort Key: f.oid
               ->  Seq Scan on pg_opfamily f
         ->  Sort
               Sort Key: ao.amopfamily
               ->  Seq Scan on pg_amop ao
Hash Join
[local] test=# EXPLAIN (costs off)
SELECT opfmethod, opfname, array_agg(amopopr)
FROM pg_amop ao JOIN pg_opfamily f
ON amopfamily = f.oid GROUP BY 1,2;
                 QUERY PLAN
─────────────────────────────────────────────
 HashAggregate
   Group Key: f.opfmethod, f.opfname
   ->  Hash Join
         Hash Cond: (ao.amopfamily = f.oid)
         ->  Seq Scan on pg_amop ao
         ->  Hash
               ->  Seq Scan on pg_opfamily f
Writing Generic Data Type Consumers
■ Don't hard-code “=”
■ Which equality semantics?
− btree/hash default equality
− exact match (output comparison; record_image_ops)
■ Do look up equality by operator class
− backend: TYPECACHE_EQ_OPR
− frontend: copy its algorithm
■ Not all types have these operations
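For a frontend, looking up equality by operator class might start from a query like the one below. This is a simplified sketch, not the backend's full algorithm: it only consults the default btree operator class declared for exactly the given type, and it omits the fallback to a default hash operator class and the handling of types that reuse another type's operator class. The type numeric is just a sample input.

```sql
-- Default btree equality operator for a type, found via its
-- default btree operator class (btree strategy 3 is equality).
-- Zero rows means the type has no such operator class, which is the
-- situation the box example in this deck runs into.
SELECT amop.amopopr::regoperator AS equality_operator
FROM pg_opclass opc
JOIN pg_am am ON am.oid = opc.opcmethod
JOIN pg_amop amop ON amop.amopfamily = opc.opcfamily
WHERE am.amname = 'btree'
  AND opc.opcdefault
  AND opc.opcintype = 'numeric'::regtype
  AND amop.amoplefttype = opc.opcintype
  AND amop.amoprighttype = opc.opcintype
  AND amop.amopstrategy = 3;
```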
Implementing Data Types
■ Choice of default equality semantics is important
− Option to omit them entirely (xml, json, box)
■ Try to include a default btree operator class
■ Default hash operator class is then easy
■ Other access methods are situation-specific
− gin for container-like types
− gist often starts with the search strategy, not the type
Further Reading
■ http://www.postgresql.org/docs/current/static/xindex.html
■ contrib/btree_gist, contrib/btree_gin
■ Other built-in and contrib operator classes
■ ATAddForeignKeyConstraint()
hash int4_ops
CREATE OPERATOR CLASS int4_ops
    DEFAULT FOR TYPE integer
    USING hash FAMILY integer_ops AS
    FUNCTION 1 hashint4(integer),
    OPERATOR 1 =;
Array Element Searches: gin _int4_ops
CREATE OPERATOR CLASS _int4_ops
    DEFAULT FOR TYPE integer[]
    USING gin FAMILY array_ops AS
    STORAGE integer,
    FUNCTION 1 btint4cmp(integer,integer),
    FUNCTION 2 ginarrayextract(...),
    FUNCTION 3 ginqueryarrayextract(...),
    FUNCTION 4 ginarrayconsistent(...),
    OPERATOR 1 &&(anyarray,anyarray),
    OPERATOR 2 @>(anyarray,anyarray),
    OPERATOR 3 <@(anyarray,anyarray),
    OPERATOR 4 =(anyarray,anyarray);
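From the user's side, this operator class is what a plain GIN index on an integer array column picks up by default. A usage sketch, with a hypothetical table posts and column tag_ids assumed for illustration:

```sql
-- Hypothetical table used only for this example.
CREATE TABLE posts (tag_ids integer[]);

-- No operator class named: GIN uses the default class for integer[].
CREATE INDEX posts_tag_ids_idx ON posts USING gin (tag_ids);

-- Queries using the operators listed in the class can use the index.
SELECT * FROM posts WHERE tag_ids @> ARRAY[1, 2];  -- contains both 1 and 2
SELECT * FROM posts WHERE tag_ids && ARRAY[3, 4];  -- overlaps: has 3 or 4
```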