Chetan postgresql partitioning
  • Should this table be partitioned? Partitioning adds complexity and administration cost; weigh these against the number of rows, the types of queries, and data growth.
  • Inheritance does not automatically propagate data from INSERT or COPY commands to other tables in the inheritance hierarchy
  • It may not be a good idea to set this parameter (constraint_exclusion) for all queries in the system, as not all queries require this feature.
  • Modifying a Trigger function requires no special locking.
  • Partition keys are not supposed to be updated. If a key must change, delete the row from the child table and insert the new row into the base table.
  • now() is a STABLE function. STABLE indicates that within a single table scan the function will consistently return the same result for the same argument values, but that its result could change across SQL statements. This is the appropriate selection for functions whose results depend on database lookups, parameter variables (such as the current time zone), etc. Also note that the current_timestamp family of functions qualify as stable, since their values do not change within a transaction.
  • A table can inherit from more than one parent table, in which case it has the union of the columns defined by the parent tables. Any columns declared in the child table's definition are added to these. If the same column name appears in multiple parent tables, or in both a parent table and the child's definition, then these columns are "merged" so that there is only one such column in the child table. To be merged, columns must have the same data types, else an error is raised. The merged column will have copies of all the check constraints coming from any one of the column definitions it came from, and will be marked not-null if any of them are.
  • When checking the foreign key, child tables are not considered. You can work around this with an additional table indyvidual_pks (indyvidual_pk integer primary key) holding all primary keys from both parent and child, maintained by simple triggers: insert into indyvidual_pks on insert, delete from it on delete, and update it on update if indyvidual_pk changes. Then point foreign keys at this additional table instead of at a child. There will be a small performance hit, but only when adding or deleting rows.
  • Race condition between the management trigger and the partitioning trigger. The management trigger creates a new partition and updates the partitioning trigger to account for it; the partitioning trigger redirects each row into the appropriate partition.
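The note on updating partition keys can be sketched as follows. This is a minimal illustration, assuming the `orders` master table, its two child partitions, and the BEFORE INSERT routing trigger shown in the transcript; the row id and the new date are made-up values.

```sql
-- Move order 2 from the 2010 partition to the 2011 partition.
-- Updating order_date in place would violate the child's CHECK constraint,
-- so the row is deleted from the child and re-inserted through the master.
BEGIN;

DELETE FROM orders_part_2010 WHERE id = 2;

-- The BEFORE INSERT trigger on orders routes this row to orders_part_2011.
INSERT INTO orders (id, address, order_date)
VALUES (2, 'bengaluru', TIMESTAMP '2011-03-15 00:00:00');

COMMIT;
```

Wrapping both statements in one transaction keeps other sessions from seeing the row as missing between the delete and the insert.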

Chetan postgresql partitioning: Presentation Transcript

  • PostgreSQL Partitioning Using Inheritance and Check Constraints. Presented by Chetan Suttraway, November 22, 2011. Chetan.Suttraway@enterprisedb.com. EnterpriseDB, Postgres Plus and Dynatune are trademarks of EnterpriseDB Corporation. Other names may be trademarks of their respective owners. © 2011. All rights reserved.
  • Partitioning
    • Range Partitioning
    • Specific range of data
    • timestamp
    • List Partitioning
    • Specific list of values
    • name IN ( 'PUNE', 'BENGALURU')
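As a rough sketch of the list case, a child table can carry a CHECK constraint listing the values it accepts. The `customers` table and the partition name below are hypothetical; only the `name IN (...)` condition comes from the slide.

```sql
-- Hypothetical master table; it stays empty.
CREATE TABLE customers (
    id   INT  NOT NULL,
    name TEXT NOT NULL   -- city name, used as the partition key
);

-- List partition: accepts only rows whose key is one of a fixed set of values.
CREATE TABLE customers_part_pune_blr (
    CHECK ( name IN ('PUNE', 'BENGALURU') )
) INHERITS (customers);
```

A range partition looks the same except that the CHECK constraint bounds the key between two values, as in the `orders` examples that follow.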
  • Partitioning via Inheritance
    • Master Table
    • Empty
    CREATE TABLE orders( id INT NOT NULL, address TEXT NOT NULL, order_date TIMESTAMP NOT NULL );
    • Child Tables
    • Use non-overlapping Constraints
    CREATE TABLE orders_part_2011 (
        CHECK ( order_date >= DATE '2011-01-01' AND order_date < DATE '2012-01-01' )
    ) INHERITS (orders) TABLESPACE tbsp2;
    CREATE TABLE orders_part_2010 (
        CHECK ( order_date < DATE '2011-01-01' )
    ) INHERITS (orders);
    • No Primary Key!
    • Match datatype on constraints
    • Simple partition names
    • Plan reading becomes easy
  • Partitioning via Inheritance
    • Create Indexes on Child Tables
    • On key columns for each partition
    CREATE INDEX orders_part_2011_idx ON orders_part_2011(order_date);
    CREATE INDEX orders_part_2010_idx ON orders_part_2010(order_date);
    • Constraint Exclusion configuration parameter
    • postgresql.conf
    constraint_exclusion = on
    • Session
    SET constraint_exclusion = 'PARTITION';
    • Database
    ALTER DATABASE somedb SET constraint_exclusion = on;
  • Partitioning via Inheritance
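To check the effect of these settings, something along the following lines can be run in a session. It assumes the `orders` table and partitions defined above; the date is an arbitrary example.

```sql
-- Inspect the current value ('partition' is the default since PostgreSQL 8.4).
SHOW constraint_exclusion;

-- Enable exclusion for this session only, limited to inheritance children
-- and UNION ALL subqueries.
SET constraint_exclusion = partition;

-- Look at the plan to confirm which child tables are scanned.
EXPLAIN SELECT * FROM orders WHERE order_date = TIMESTAMP '2011-01-02';
```

Setting the parameter per session or per database, rather than globally to `on`, avoids paying the extra planning cost for queries that do not touch partitioned tables.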
    • Trigger or Rule
    • Redirect data inserted into master table to appropriate child table.
    CREATE OR REPLACE FUNCTION orders_insert1() RETURNS TRIGGER AS $$
    DECLARE
        vsql TEXT;
    BEGIN
        -- Build the INSERT for the partition named after the order year.
        vsql := 'INSERT INTO orders_part_' || to_char(NEW.order_date, 'YYYY')
             || ' VALUES (' || NEW.id || ','
             || quote_literal(NEW.address) || ','
             || quote_literal(NEW.order_date) || ')';
        EXECUTE vsql;   -- run the generated statement
        RETURN NULL;    -- suppress the insert into the master table
    END;
    $$ LANGUAGE plpgsql;
    • Low maintenance
    • Quoting and NULL issues
    • Fast; no locking needed to update the function
    • Either rules or triggers can be used.
    • Compute partition name on the fly.
    • Different variations for insert, update, delete.
    CREATE TRIGGER orders_insert_trigger BEFORE INSERT ON orders FOR EACH ROW EXECUTE PROCEDURE orders_insert();
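One possible way to keep the on-the-fly partition name while sidestepping the quoting and NULL issues is to pass the row as a parameter with `EXECUTE ... USING` (available from PostgreSQL 8.4). This is a sketch, not the presenter's code; the function name `orders_insert_dynamic` is hypothetical.

```sql
-- Hypothetical variant of the dynamic routing trigger function.
CREATE OR REPLACE FUNCTION orders_insert_dynamic() RETURNS TRIGGER AS $$
BEGIN
    -- The partition name is still computed from the order year, but the row
    -- is passed as parameter $1, so no column values are spliced into the
    -- SQL text and NULL columns are handled normally.
    EXECUTE 'INSERT INTO '
         || quote_ident('orders_part_' || to_char(NEW.order_date, 'YYYY'))
         || ' SELECT ($1).*'
    USING NEW;
    RETURN NULL;  -- the row now lives in the child table, not the master
END;
$$ LANGUAGE plpgsql;
```

If no partition exists for the computed year, the statement fails with an undefined-table error, so partitions still need to be created ahead of the data.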
  • Partitioning via Inheritance
    CREATE OR REPLACE FUNCTION orders_insert() RETURNS TRIGGER AS $$
    BEGIN
        IF (NEW.order_date >= DATE '2011-01-01' AND NEW.order_date < DATE '2012-01-01') THEN
            INSERT INTO orders_part_2011 VALUES (NEW.*);
        ELSIF (NEW.order_date < DATE '2011-01-01') THEN
            INSERT INTO orders_part_2010 VALUES (NEW.*);
        ELSE
            RAISE EXCEPTION 'Date out of range. check orders_insert() function!';
        END IF;
        RETURN NULL;
    END;
    $$ LANGUAGE plpgsql;
    CREATE TRIGGER orders_insert_trigger BEFORE INSERT ON orders
        FOR EACH ROW EXECUTE PROCEDURE orders_insert();
    • High maintenance
    • No quoting or NULL issues
    • Fast; no locking needed to update the function
  • Partitioning via Inheritance
    pg=# INSERT INTO orders VALUES(1, 'pune', '2011-08-22');
    INSERT 0 0
    pg=# INSERT INTO orders VALUES(2, 'pune', '2010-02-22');
    INSERT 0 0
    pg=# UPDATE orders SET address = 'bengaluru' WHERE id = 2;
    UPDATE 1
    SELECT * FROM orders;
     id |  address  |     order_date
    ----+-----------+--------------------
      1 | pune      | 22-AUG-11 00:00:00
      2 | bengaluru | 22-FEB-10 00:00:00
    (2 rows)
    SELECT * FROM orders_part_2011;
     id | address |     order_date
    ----+---------+--------------------
      1 | pune    | 22-AUG-11 00:00:00
    (1 row)
    SELECT * FROM orders_part_2010;
     id |  address  |     order_date
    ----+-----------+--------------------
      2 | bengaluru | 22-FEB-10 00:00:00
    (1 row)
    SELECT * FROM ONLY orders;
     id | address | order_date
    ----+---------+------------
    (0 rows)
  • Partitioning via Inheritance
    EXPLAIN SELECT * FROM orders WHERE order_date = '02-JAN-11';
                                           QUERY PLAN
    ----------------------------------------------------------------------------------------
     Result  (cost=0.00..26.01 rows=7 width=40)
       ->  Append  (cost=0.00..26.01 rows=7 width=40)
             ->  Seq Scan on orders  (cost=0.00..23.75 rows=6 width=44)
                   Filter: (order_date = '02-JAN-11 00:00:00'::timestamp without time zone)
             ->  Seq Scan on orders_part_2011 orders  (cost=0.00..2.26 rows=1 width=18)
                   Filter: (order_date = '02-JAN-11 00:00:00'::timestamp without time zone)
    (6 rows)
    EXPLAIN SELECT * FROM orders WHERE order_date = '02-JAN-10';
                                           QUERY PLAN
    ----------------------------------------------------------------------------------------
     Result  (cost=0.00..24.76 rows=7 width=44)
       ->  Append  (cost=0.00..24.76 rows=7 width=44)
             ->  Seq Scan on orders  (cost=0.00..23.75 rows=6 width=44)
                   Filter: (order_date = '02-JAN-10 00:00:00'::timestamp without time zone)
             ->  Seq Scan on orders_part_2010 orders  (cost=0.00..1.01 rows=1 width=44)
                   Filter: (order_date = '02-JAN-10 00:00:00'::timestamp without time zone)
    (6 rows)
  • Query Planning
    EXPLAIN SELECT * FROM orders WHERE order_date = now();
                                         QUERY PLAN
    ------------------------------------------------------------------------------------
     Result  (cost=0.00..30.03 rows=8 width=41)
       ->  Append  (cost=0.00..30.03 rows=8 width=41)
             ->  Seq Scan on orders  (cost=0.00..26.50 rows=6 width=44)
                   Filter: (order_date = now())
             ->  Seq Scan on orders_part_2011 orders  (cost=0.00..2.51 rows=1 width=18)
                   Filter: (order_date = now())
             ->  Seq Scan on orders_part_2010 orders  (cost=0.00..1.01 rows=1 width=44)
                   Filter: (order_date = now())
    (8 rows)
    • Planner analyses the query before values from parameters or stored procedures are substituted.
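Because `now()` is not a constant at plan time, one workaround is to evaluate it first and embed the result in the query text as a literal. The sketch below does this with dynamic SQL in PL/pgSQL; the function name `orders_since_today` is hypothetical, and the same idea applies if the client application computes the value and sends it as a constant.

```sql
-- Hypothetical helper: returns orders placed since the start of today.
CREATE OR REPLACE FUNCTION orders_since_today() RETURNS SETOF orders AS $$
BEGIN
    -- date_trunc(...) is evaluated here, and its result is embedded in the
    -- query text as a quoted literal, so the planner sees a constant.
    RETURN QUERY EXECUTE
        'SELECT * FROM orders WHERE order_date >= '
        || quote_literal(date_trunc('day', now())::timestamp);
END;
$$ LANGUAGE plpgsql;

SELECT * FROM orders_since_today();
```

With a literal in the WHERE clause the planner should be able to skip partitions whose CHECK constraints rule the value out; running EXPLAIN on the generated statement is the way to confirm it.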
  • Constraint Exclusion
    • Works for range or equality check constraints.
    • No automatic way to verify that all of the CHECK constraints are mutually exclusive
      ALTER TABLE product_items_j ADD CONSTRAINT chk_item_name CHECK ( item_name LIKE 'J%' );
    • WHERE condition has to be similar to the constraint.
    • Constraint Exclusion works when WHERE clause has constants.
      ALTER TABLE product_items_j ADD CONSTRAINT chk_item_name CHECK ( item_name BETWEEN 'J' AND 'JZ' );
      SELECT item_name FROM product_items WHERE item_name LIKE 'K%';
      SELECT item_name FROM product_items WHERE item_name = 'Kettle';
    • Always look at the plan!
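As an illustration of making the WHERE clause resemble the constraint, a prefix search can be written as a range of constants instead of a LIKE pattern. This assumes the `product_items` hierarchy from the slide with the BETWEEN form of the constraint; whether the `J` partition is actually excluded depends on the collation in use, so the plan remains the final word.

```sql
-- Prefix search for names starting with 'K', written as a range of constants
-- so that it has the same shape as the partitions' range CHECK constraints.
EXPLAIN SELECT item_name
FROM product_items
WHERE item_name >= 'K' AND item_name < 'L';

-- Equality against a constant, for comparison.
EXPLAIN SELECT item_name
FROM product_items
WHERE item_name = 'Kettle';
```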
  • Inheritance
    • Child tables inherit:
    • All NOT NULL and CHECK constraints
    • Column default values
    • Child tables do not inherit:
    • Indexes
    • All other constraints – unique, primary and foreign key
    • Ownership
    • Permissions
    • Changes to parent are propagated to child
    • Cannot rename inherited columns on child tables
    • Can remove NOT NULL and default value on child tables
    • ALTER TABLE can enable or disable inheritance on child tables
    • Additional columns can be added to child tables.
    • ANALYZE all child tables individually
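A few of these points can be sketched with the `orders` tables from the earlier slides. The statements are standard PostgreSQL, but the scenario (detaching and re-attaching the 2010 partition) is only an example.

```sql
-- Detach a child from the hierarchy; its rows no longer appear in
-- queries against orders, but the table and its data remain.
ALTER TABLE orders_part_2010 NO INHERIT orders;

-- Re-attach it. The child must already have all of the parent's columns
-- and constraints matching the parent's CHECK constraints.
ALTER TABLE orders_part_2010 INHERIT orders;

-- Collect planner statistics for each child individually.
ANALYZE orders_part_2010;
ANALYZE orders_part_2011;
```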
  • Uniqueness
    • Primary Key and Unique Constraints
    • Index for uniqueness
    • No Multi-Table Indexes
    • Indexing Partition Key Columns
    • Non-overlapping check constraints
    • Unique Index over each Partition
    • Indexing Non-Partition Key Columns
    • Unique Index over each Partition
    • Custom functions to scan over all Partitions
      - Not what we want!
    • Foreign Keys
    • Additional table with primary keys from both parent and child
    • Triggers to maintain this table
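The foreign-key workaround might look roughly like this. The names `order_ids`, `order_ids_sync`, and `order_items` are hypothetical; the sketch assumes the `orders` partitions defined earlier and attaches one trigger to each partition.

```sql
-- Hypothetical lookup table holding every order id across all partitions.
CREATE TABLE order_ids (
    id INT PRIMARY KEY
);

-- Keeps order_ids in sync with the partition the trigger is attached to.
CREATE OR REPLACE FUNCTION order_ids_sync() RETURNS TRIGGER AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO order_ids (id) VALUES (NEW.id);
    ELSIF TG_OP = 'UPDATE' THEN
        IF NEW.id IS DISTINCT FROM OLD.id THEN
            UPDATE order_ids SET id = NEW.id WHERE id = OLD.id;
        END IF;
    ELSIF TG_OP = 'DELETE' THEN
        DELETE FROM order_ids WHERE id = OLD.id;
    END IF;
    RETURN NULL;  -- the return value is ignored for AFTER row triggers
END;
$$ LANGUAGE plpgsql;

-- One trigger per partition, because rows are stored in the child tables.
CREATE TRIGGER orders_part_2011_ids
    AFTER INSERT OR UPDATE OR DELETE ON orders_part_2011
    FOR EACH ROW EXECUTE PROCEDURE order_ids_sync();
CREATE TRIGGER orders_part_2010_ids
    AFTER INSERT OR UPDATE OR DELETE ON orders_part_2010
    FOR EACH ROW EXECUTE PROCEDURE order_ids_sync();

-- Hypothetical referencing table: the foreign key targets the lookup table.
CREATE TABLE order_items (
    order_id INT  NOT NULL REFERENCES order_ids (id),
    item     TEXT NOT NULL
);
```

As a side effect, the primary key on the lookup table also rejects duplicate ids across partitions, which per-partition unique indexes alone cannot do.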
  • Automating Maintenance
    • Pre-create deterministic partitions
    • Rules, Triggers, Functions can be dynamically created
    • Partition structure doesn't need to be complete
    • http://wiki.postgresql.org/wiki/Table_partitioning
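Pre-creating partitions can be scripted. The following is a minimal sketch assuming the yearly `orders_part_YYYY` naming used in the earlier slides; the function `create_orders_partition` is hypothetical and does no checking for existing or overlapping partitions.

```sql
-- Hypothetical helper that creates the partition for one year, plus its index.
CREATE OR REPLACE FUNCTION create_orders_partition(p_year INT) RETURNS VOID AS $$
DECLARE
    part_name   TEXT := 'orders_part_' || p_year;
    lower_bound TEXT := p_year || '-01-01';        -- inclusive
    upper_bound TEXT := (p_year + 1) || '-01-01';  -- exclusive
BEGIN
    EXECUTE 'CREATE TABLE ' || quote_ident(part_name)
         || ' ( CHECK ( order_date >= DATE ' || quote_literal(lower_bound)
         || ' AND order_date < DATE ' || quote_literal(upper_bound)
         || ' ) ) INHERITS (orders)';
    EXECUTE 'CREATE INDEX ' || quote_ident(part_name || '_idx')
         || ' ON ' || quote_ident(part_name) || ' (order_date)';
END;
$$ LANGUAGE plpgsql;

-- Pre-create the next two years ahead of the data.
SELECT create_orders_partition(2012);
SELECT create_orders_partition(2013);
```

A routing function that computes the partition name on the fly picks up the new tables automatically, while the hard-coded IF/ELSIF version of `orders_insert()` needs a new branch for each partition.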
  • Querying Over Partitions
    • Tricky
    • Static Values
      - PostgreSQL's version of Static!
    • Explain
    • Test
    • Datatype issues
  • Querying Over Partitions...MAX
    pg=# EXPLAIN SELECT MAX(order_date) FROM orders;
                                       QUERY PLAN
    --------------------------------------------------------------------------------
     Result  (cost=26.56..26.57 rows=1 width=0)
       InitPlan 1 (returns $0)
         ->  Limit  (cost=26.52..26.56 rows=1 width=8)
               ->  Merge Append  (cost=26.52..73.51 rows=1196 width=8)
                     Sort Key: public.orders.order_date
                     ->  Sort  (cost=26.47..29.21 rows=1094 width=8)
                           Sort Key: public.orders.order_date
                           ->  Seq Scan on orders  (cost=0.00..21.00 rows=1094 width=8)
                                 Filter: (order_date IS NOT NULL)
                     ->  Index Scan Backward using orders_part_2011_idx on orders_part_2011 orders  (cost=0.00..14.02 rows=101 width=8)
                           Index Cond: (order_date IS NOT NULL)
                     ->  Index Scan Backward using orders_part_old_idx on orders_part_old orders  (cost=0.00..8.27 rows=1 width=8)
                           Index Cond: (order_date IS NOT NULL)
    (13 rows)
  • Questions, Inputs...
      Thank You :)