Your SlideShare is downloading. ×
Getting Started with PL/Proxy
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×

Introducing the official SlideShare app

Stunning, full-screen experience for iPhone and Android

Text the download link to your phone

Standard text messaging rates apply

Getting Started with PL/Proxy

5,769
views

Published on

presentation from PgEast 2011

presentation from PgEast 2011

Published in: Technology

0 Comments
5 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
5,769
On Slideshare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
Downloads
103
Comments
0
Likes
5
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. Getting Started with PL/Proxy Peter Eisentraut peter@eisentraut.org F-Secure Corporation PostgreSQL Conference East 2011 CC-BY
  • 2. Concept • a database partitioning system implemented as a procedural language • “sharding”/horizontal partitioning • PostgreSQL’s No(t-only)SQL solution
  • 3. Concept application application application application frontend partition 1 partition 2 partition 3 partition 4
  • 4. Areas of Application • high write load • (high read load) • allow for some “eventual consistency” • have reasonable partitioning keys • use/plan to use server-side functions
  • 5. Example Have:1 CREATE TABLE products ( prod_id serial PRIMARY KEY , category integer NOT NULL , title varchar (50) NOT NULL , actor varchar (50) NOT NULL , price numeric (12 ,2) NOT NULL , special smallint , common_prod_id integer NOT NULL ); INSERT INTO products VALUES (...) ; UPDATE products SET ... WHERE ...; DELETE FROM products WHERE ...; plus various queries 1 dellstore2 example database
  • 6. Installation • Download: http://plproxy.projects.postgresql.org, Deb, RPM, . . . • Create language: psql -d dellstore2 -f ...../plproxy.sql
  • 7. Backend Functions I CREATE FUNCTION insert_product ( p_category int , p_title varchar , p_actor varchar , p_price numeric , p_special smallint , p_common_prod_id int ) RETURNS int LANGUAGE plpgsql AS $$ DECLARE cnt int ; BEGIN INSERT INTO products ( category , title , actor , price , special , common_prod_id ) VALUES ( p_category , p_title , p_actor , p_price , p_special , p_common_prod_id ) ; GET DIAGNOSTICS cnt = ROW_COUNT ; RETURN cnt ; END ; $$ ;
  • 8. Backend Functions II CREATE FUNCTION update_product_price ( p_prod_id int , p_price numeric ) RETURNS int LANGUAGE plpgsql AS $$ DECLARE cnt int ; BEGIN UPDATE products SET price = p_price WHERE prod_id = p_prod_id ; GET DIAGNOSTICS cnt = ROW_COUNT ; RETURN cnt ; END ; $$ ;
  • 9. Backend Functions III CREATE FUNCTION delete_product_by_title ( p_title varchar ) RETURNS int LANGUAGE plpgsql AS $$ DECLARE cnt int ; BEGIN DELETE FROM products WHERE title = p_title ; GET DIAGNOSTICS cnt = ROW_COUNT ; RETURN cnt ; END ; $$ ;
  • 10. Frontend Functions I CREATE FUNCTION insert_product ( p_category int , p_title varchar , p_actor varchar , p_price numeric , p_special smallint , p_common_prod_id int ) RETURNS SETOF int LANGUAGE plproxy AS $$ CLUSTER dellstore_cluster ; RUN ON hashtext ( p_title ) ; $$ ; CREATE FUNCTION update_product_price ( p_prod_id int , p_price numeric ) RETURNS SETOF int LANGUAGE plproxy AS $$ CLUSTER dellstore_cluster ; RUN ON ALL ; $$ ;
  • 11. Frontend Functions II CREATE FUNCTION delete_product_by_title ( p_title varchar ) RETURNS int LANGUAGE plpgsql AS $$ CLUSTER dellstore_cluster ; RUN ON hashtext ( p_title ) ; $$ ;
  • 12. Frontend Query Functions I CREATE FUNCTION get_product_price ( p_prod_id int ) RETURNS SETOF numeric LANGUAGE plproxy AS $$ CLUSTER dellstore_cluster ; RUN ON ALL ; SELECT price FROM products WHERE prod_id = p_prod_id ; $$ ;
  • 13. Frontend Query Functions II CREATE FUNCTION get_products_by_category ( p_category int ) RETURNS SETOF products LANGUAGE plproxy AS $$ CLUSTER dellstore_cluster ; RUN ON ALL ; SELECT * FROM products WHERE category = p_category ; $$ ;
  • 14. Unpartitioned Small Tables CREATE FUNCTION insert_category ( p_categoryname ) RETURNS SETOF int LANGUAGE plproxy AS $$ CLUSTER dellstore_cluster ; RUN ON 0; $$ ;
  • 15. Which Hash Key? • natural keys (names, descriptions, UUIDs) • not serials (Consider using fewer “ID” fields.) • single columns • group sensibly to allow joins on backend
  • 16. Set Basic Parameters • number of partitions (2n ), e. g. 8 • host names, e. g. • frontend: dbfe • backends: dbbe1, . . . , dbbe8 • database names, e. g. • frontend: dellstore2 • backends: store01, . . . , store08 • user names, e. g. storeapp • hardware: • frontend: lots of memory, normal disk • backends: full-sized database server
  • 17. Set Basic Parameters • number of partitions (2n ), e. g. 8 • host names, e. g. • frontend: dbfe • backends: dbbe1, . . . , dbbe8 (or start at 0?) • database names, e. g. • frontend: dellstore2 • backends: store01, . . . , store08 (or start at 0?) • user names, e. g. storeapp • hardware: • frontend: lots of memory, normal disk • backends: full-sized database server
  • 18. Configuration CREATE FUNCTION plproxy . get_cluster_partitions ( cluster_name text ) RETURNS SETOF text LANGUAGE plpgsql AS $$ ... $$ ; CREATE FUNCTION plproxy . get_cluster_version ( cluster_name text ) RETURNS int LANGUAGE plpgsql AS $$ ... $$ ; CREATE FUNCTION plproxy . get_cluster_config ( IN cluster_name text , OUT key text , OUT val text ) RETURNS SETOF record LANGUAGE plpgsql AS $$ ... $$ ;
  • 19. get_cluster_partitions Simplistic approach: CREATE FUNCTION plproxy . get_cluster_partitions ( cluster_name text ) RETURNS SETOF text LANGUAGE plpgsql AS $$ BEGIN IF cluster_name = dellstore_cluster THEN RETURN NEXT dbname = store01 host = dbbe1 ; RETURN NEXT dbname = store02 host = dbbe2 ; ... RETURN NEXT dbname = store08 host = dbbe8 ; RETURN ; END IF ; RAISE EXCEPTION Unknown cluster ; END ; $$ ;
  • 20. get_cluster_version Simplistic approach: CREATE FUNCTION plproxy . get_cluster_version ( cluster_name text ) RETURNS int LANGUAGE plpgsql AS $$ BEGIN IF cluster_name = dellstore_cluster THEN RETURN 1; END IF ; RAISE EXCEPTION Unknown cluster ; END ; $$ LANGUAGE plpgsql ;
  • 21. get_cluster_config CREATE OR REPLACE FUNCTION plproxy . get_cluster_config ( IN cluster_name text , OUT key text , OUT val text ) RETURNS SETOF record LANGUAGE plpgsql AS $$ BEGIN -- same config for all clusters key := connection_lifetime ; val := 30*60; -- 30 m RETURN NEXT ; RETURN ; END ; $$ ;
  • 22. Table-Driven Configuration I CREATE TABLE plproxy . partitions ( cluster_name text NOT NULL , host text NOT NULL , port text NOT NULL , dbname text NOT NULL , PRIMARY KEY ( cluster_name , dbname ) ); INSERT INTO plproxy . partitions VALUES ( dellstore_cluster , dbbe1 , 5432 , store01 ) , ( dellstore_cluster , dbbe2 , 5432 , store02 ) , ... ( dellstore_cluster , dbbe8 , 5432 , store03 ) ;
  • 23. Table-Driven Configuration II CREATE TABLE plproxy . cluster_users ( cluster_name text NOT NULL , remote_user text NOT NULL , local_user NOT NULL , PRIMARY KEY ( cluster_name , remote_user , local_user ) ); INSERT INTO plproxy . cluster_users VALUES ( dellstore_cluster , storeapp , storeapp ) ;
  • 24. Table-Driven Configuration III CREATE TABLE plproxy . remote_passwords ( host text NOT NULL , port text NOT NULL , dbname text NOT NULL , remote_user text NOT NULL , password text , PRIMARY KEY ( host , port , dbname , remote_user ) ); INSERT INTO plproxy . remote_passwords VALUES ( dbbe1 , 5432 , store01 , storeapp , Thu1Ued0 ) , ... -- or use . pgpass ?
  • 25. Table-Driven Configuration IV CREATE TABLE plproxy . cluster_version ( id int PRIMARY KEY ); INSERT INTO plproxy . cluster_version VALUES (1) ; GRANT SELECT ON plproxy . cluster_version TO PUBLIC ; /* extra credit : write trigger that changes the version when one of the other tables changes */
  • 26. Table-Driven Configuration V CREATE OR REPLACE FUNCTION plproxy . get_cluster_partitions ( p_cluster_name text ) RETURNS SETOF text LANGUAGE plpgsql SECURITY DEFINER AS $$ DECLARE r record ; BEGIN FOR r IN SELECT host = || host || port = || port || dbname = || dbname || user = || remote_user || password = || password AS dsn FROM plproxy . partitions NATURAL JOIN plproxy . cluster_users NATURAL JOIN plproxy . remote_passwords WHERE cluster_name = p_cluster_name AND local_user = session_user ORDER BY dbname -- important LOOP RETURN NEXT r. dsn ; END LOOP ; IF NOT found THEN RAISE EXCEPTION no such cluster : % , p_cluster_name ; END IF ; RETURN ; END ; $$ ;
  • 27. Table-Driven Configuration VI CREATE FUNCTION plproxy . get_cluster_version ( p_cluster_name text ) RETURNS int LANGUAGE plpgsql AS $$ DECLARE ret int ; BEGIN SELECT INTO ret id FROM plproxy . cluster_version ; RETURN ret ; END ; $$ ;
  • 28. SQL/MED Configuration CREATE SERVER dellstore_cluster FOREIGN DATA WRAPPER plproxy OPTIONS ( connection_lifetime 1800 , p0 dbname = store01 host = dbbe1 , p1 dbname = store02 host = dbbe2 , ... p7 dbname = store08 host = dbbe8 ); CREATE USER MAPPING FOR storeapp SERVER dellstore_cluster OPTIONS ( user storeapp , password sekret ) ; GRANT USAGE ON SERVER dellstore_cluster TO storeapp ;
  • 29. Hash Functions RUN ON hashtext ( somecolumn ) ; • want a fast, uniform hash function • typically use hashtext • problem: implementation might change • possible solution: https://github.com/petere/pgvihash
  • 30. Sequences shard 1: ALTER SEQUENCE products_prod_id_seq MINVALUE 1 MAXVALUE 100000000 START 1; shard 2: ALTER SEQUENCE products_prod_id_seq MINVALUE 100000001 MAXVALUE 200000000 START 100000001; etc.
  • 31. Aggregates Example: count all products Backend: CREATE FUNCTION count_products () RETURNS bigint LANGUAGE SQL STABLE AS $$SELECT count (*) FROM products$$ ; Frontend: CREATE FUNCTION count_products () RETURNS SETOF bigint LANGUAGE plproxy AS $$ CLUSTER dellstore_cluster ; RUN ON ALL ; $$ ; SELECT sum ( x ) AS count FROM count_products () AS t(x);
  • 32. Dynamic Queries I a. k. a. “cheating” ;-) CREATE FUNCTION execute_query ( sql text ) RETURNS SETOF RECORD LANGUAGE plproxy AS $$ CLUSTER dellstore_cluster ; RUN ON ALL ; $$ ; CREATE FUNCTION execute_query ( sql text ) RETURNS SETOF RECORD LANGUAGE plpgsql AS $$ BEGIN RETURN QUERY EXECUTE sql ; END ; $$ ;
  • 33. Dynamic Queries II SELECT * FROM execute_query ( SELECT title , price FROM products ) AS ( title varchar , price numeric ) ; SELECT category , sum ( sum_price ) FROM execute_query ( SELECT category , sum ( price ) FROM products GROUP BY category ) AS ( category int , sum_price numeric ) GROUP BY category ;
  • 34. Repartitioning • changing partitioning key is extremely cumbersome • adding partitions is somewhat cumbersome, e. g., to split shard 0: COPY ( SELECT * FROM products WHERE hashtext ( title :: text ) & 15 <> 0) TO somewhere ; DELETE FROM products WHERE hashtext ( title :: text ) & 15 <> 0; Better start out with enough partitions!
  • 35. PgBouncer application application application application frontend PgBouncer PgBouncer PgBouncer PgBouncer partition 1 partition 2 partition 3 partition 4 Use pool_mode = statement
  • 36. Development Issues • foreign keys • notifications • hash key check constraints • testing (pgTAP), no validator
  • 37. Administration • centralized logging • distributed shell (dsh) • query canceling/timeouts • access control, firewalling • deployment
  • 38. High Availability Frontend: • multiple frontends (DNS, load balancer?) • replicate partition configuration (Slony, Bucardo, WAL) • Heartbeat, UCARP, etc. Backend: • replicate backends shards individually (Slony, WAL, DRBD) • use partition configuration to configure load spreading or failover
  • 39. Advanced Topics • generic insert, update, delete functions • frontend joins • backend joins • finding balance between function interface and dynamic queries • arrays, SPLIT BY • use for remote database calls • cross-shard calls • SQL/MED (foreign table) integration
  • 40. The End