Table partitioning in PostgreSQL + Rails


An introductory presentation about table partitioning in PostgreSQL and how to integrate it into your Rails application. Given at the Cambridge Ruby User Group meetup on Mar 27th 2014.

Published in: Software

  1. Table Partitioning: PostgreSQL + Rails. Agnieszka Figiel, Cambridge Ruby User Group, Mar 2014
  2. What is table partitioning in PostgreSQL? 1 logical table = n smaller physical tables
  3. What are the benefits? Improved query performance for big tables (works best when the results come from a single partition)
  4. When to consider partitioning? When a big table cannot be queried efficiently because its index is enormous as well. Rule of thumb: the table does not fit into memory
  5. How is data split? By ranges: year > 2010 AND year <= 2012. By lists of key values: company_id = 5
  6. How does it work? At the heart of it lie 2 mechanisms:
      ➔ table inheritance
      ➔ table CHECK constraints
  7. Table inheritance: schema

        CREATE TABLE A (value INT);
        CREATE TABLE B () INHERITS (A);

               Table "public.a"
         Column |  Type   | Modifiers
        --------+---------+-----------
         value  | integer |

               Table "public.b"
         Column |  Type   | Modifiers
        --------+---------+-----------
         value  | integer |
        Inherits: a
  8. Table inheritance: schema

        CREATE TABLE C (extra_field TEXT) INHERITS (A);

                  Table "public.c"
           Column    |  Type   | Modifiers
        -------------+---------+-----------
         value       | integer |
         extra_field | text    |
        Inherits: a

      NB: a table may inherit from multiple tables
  9. Table inheritance: querying

        INSERT INTO A VALUES (0);
        INSERT INTO B VALUES (10);
        INSERT INTO C VALUES (20, 'zonk');

        SELECT * FROM A WHERE value < 20;
         value
        -------
             0
            10

        SELECT * FROM ONLY A WHERE value < 20;
         value
        -------
             0
  10. What happened?

        EXPLAIN ANALYZE SELECT * FROM A WHERE value < 20;
                                                 QUERY PLAN
        ----------------------------------------------------------------------------------------------------
         Append  (cost=0.00..69.38 rows=1290 width=4) (actual time=0.020..0.033 rows=2 loops=1)
           ->  Seq Scan on a  (cost=0.00..4.00 rows=80 width=4) (actual time=0.019..0.021 rows=1 loops=1)
                 Filter: (value < 20)
           ->  Seq Scan on b  (cost=0.00..40.00 rows=800 width=4) (actual time=0.005..0.006 rows=1 loops=1)
                 Filter: (value < 20)
           ->  Seq Scan on c  (cost=0.00..25.38 rows=410 width=4) (actual time=0.005..0.005 rows=0 loops=1)
                 Filter: (value < 20)
                 Rows Removed by Filter: 1
  11. Table CHECK constraint

        CREATE TABLE A (value INT);
        CREATE TABLE B (CHECK (value < 20)) INHERITS (A);
        CREATE TABLE C (CHECK (value >= 20)) INHERITS (A);

        INSERT INTO B VALUES (10);
        INSERT INTO C VALUES (20);
        INSERT INTO C VALUES (5);
        ERROR:  new row for relation "c" violates check constraint "c_value_check"
        DETAIL:  Failing row contains (5).
  12. Ta da! With the CHECK constraints in place, partition c is no longer scanned:

        EXPLAIN ANALYZE SELECT * FROM A WHERE value < 20;
                                                 QUERY PLAN
        ----------------------------------------------------------------------------------------------------
         Append  (cost=0.00..40.00 rows=801 width=4) (actual time=0.015..0.017 rows=1 loops=1)
           ->  Seq Scan on a  (cost=0.00..0.00 rows=1 width=4) (actual time=0.002..0.002 rows=0 loops=1)
                 Filter: (value < 20)
           ->  Seq Scan on b  (cost=0.00..40.00 rows=800 width=4) (actual time=0.012..0.013 rows=1 loops=1)
                 Filter: (value < 20)
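  13. Gotcha #1: Make sure that constraint exclusion is not turned off. SET constraint_exclusion = on; enables it for all queries; the default setting, partition, applies it to queries on inheritance children, which is enough here.
  14. Gotcha #2: All check constraints and not-null constraints on a parent table are automatically inherited by its children. Other types of constraints (unique, primary key, and foreign key constraints) are not inherited.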
  15. Solution #2: create constraints or indexes in all partitions

        CREATE TABLE A (id serial NOT NULL, value INT NOT NULL);

        CREATE TABLE B (
          CONSTRAINT B_pkey PRIMARY KEY (id),
          CHECK (value < 20)
        ) INHERITS (A);

        CREATE TABLE C (
          CONSTRAINT C_pkey PRIMARY KEY (id),
          CHECK (value >= 20)
        ) INHERITS (A);
  16. Gotcha #3: There is no practical way to enforce uniqueness of a SERIAL id across partitions.

        INSERT INTO B (value) VALUES (10);
        INSERT INTO C (value) VALUES (20);
        INSERT INTO C VALUES (1, 30);

        SELECT * FROM A;
         id | value
        ----+-------
          1 |    10
          2 |    20
          1 |    30
  17. Solution #3
      ➔ carry on and always filter by the partitioning key
      ➔ use UUID (uuid-ossp)
  18. Gotcha #4: Specifying that another table's column REFERENCES a(value) would allow the other table to contain “a values”, but not “b or c values”. There is no good workaround for this case.
  19. How to insert / update rows?
      ➔ PostgreSQL docs recommend using triggers
      ➔ also possible to do it using rules (more overhead than a trigger, but paid once per query rather than once per row, so they can pay off for bulk inserts)
  20. Trigger example

        CREATE OR REPLACE FUNCTION a_insert_trigger()
        RETURNS TRIGGER AS $$
        BEGIN
          IF NEW.value < 20 THEN
            INSERT INTO b VALUES (NEW.*);
          ELSE
            INSERT INTO c VALUES (NEW.*);
          END IF;
          RETURN NULL;
        END;
        $$ LANGUAGE plpgsql;

        CREATE TRIGGER insert_a_trigger
          BEFORE INSERT ON a
          FOR EACH ROW EXECUTE PROCEDURE a_insert_trigger();
  21. Partitioned gem https://github.com/fiksu/partitioned

        class Company < ActiveRecord::Base; end

        class ByCompanyId < Partitioned::ByForeignKey
          self.abstract_class = true

          belongs_to :company

          def self.partition_foreign_key
            return :company_id
          end

          partitioned do |partition|
            partition.index :id, :unique => true
          end
        end

        class Employee < ByCompanyId; end

        employee = Employee.from_partition(1).find(1)
  22. UUID in Rails

      Rails 4:

        enable_extension 'uuid-ossp'

        create_table :documents, id: :uuid do |t|
          t.string :title
          t.string :author
          t.timestamps
        end

      Rails 3: gem postgres_ext https://github.com/dockyard/postgres_ext
  23. Actual performance improvements?
      ➔ table with ~13 million rows
      ➔ partitioned by date into 8 partitions
      ➔ complex, long-running query
      ➔ baseline: 0.07 tps
      ➔ when results are in a single partition: 0.15 tps
      ➔ when results are in 2 partitions: like baseline
  24. References
      ➔ http://www.postgresql.org/docs/9.3/static/ddl-partitioning.html
      ➔ PostgreSQL 9.0 High Performance, Gregory Smith
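
The inheritance, CHECK constraint and trigger slides can be tied together in a small Ruby sketch that generates the SQL for a set of range partitions. It is only an illustration under stated assumptions: the `measurements` table, its `year` column and the helper names `partition_ddl` and `insert_trigger_ddl` are hypothetical, and do not come from the talk or from the Partitioned gem. Ranges are treated as half-open (`from <= value < to`) so that neighbouring partitions never overlap.

```ruby
# Hypothetical helpers (not part of the talk or of any gem): build the SQL for
# range partitions of a parent table using table inheritance plus CHECK
# constraints. Table and column names are assumptions for illustration only.

# Returns one CREATE TABLE statement per [from, to) range.
def partition_ddl(parent, column, ranges)
  ranges.map do |from, to|
    child = "#{parent}_#{from}_#{to}"
    "CREATE TABLE #{child} (" \
      "CHECK (#{column} >= #{from} AND #{column} < #{to})" \
      ") INHERITS (#{parent});"
  end
end

# Returns a plpgsql trigger function that routes each new row to the child
# table whose range contains NEW.<column>. Expects at least one range.
def insert_trigger_ddl(parent, column, ranges)
  branches = ranges.each_with_index.map do |(from, to), i|
    keyword = i.zero? ? "IF" : "ELSIF"
    "  #{keyword} NEW.#{column} >= #{from} AND NEW.#{column} < #{to} THEN\n" \
      "    INSERT INTO #{parent}_#{from}_#{to} VALUES (NEW.*);"
  end

  <<~SQL
    CREATE OR REPLACE FUNCTION #{parent}_insert_trigger() RETURNS TRIGGER AS $$
    BEGIN
    #{branches.join("\n")}
      ELSE
        RAISE EXCEPTION '#{column} out of range for #{parent}';
      END IF;
      RETURN NULL;
    END;
    $$ LANGUAGE plpgsql;
  SQL
end

ranges = [[2010, 2012], [2012, 2014]]
puts partition_ddl("measurements", "year", ranges)
puts insert_trigger_ddl("measurements", "year", ranges)
```

The generated function still has to be attached to the parent table with a `CREATE TRIGGER ... BEFORE INSERT` statement, as in the trigger slide. Raising an exception for out-of-range values is a design choice of this sketch; the slide's version sends everything else to the last partition instead.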
