Developing and Deploying Apps with the Postgres FDW

My Love of Developing
with the Postgres FDW
...and how production tested those feelings.
Jonathan S. Katz
PGConf EU 2015 -...

Hi! I'm Jonathan!
2

A Bit About Me
• @jkatz05
• Chief Technology Officer @ VenueBook
• Using Postgres since ~2004
• Been using it decently ~201...

I couldn't wait to use the Postgres Foreign Data Wrapper (postgres_fdw) in a project; imagine being able to read and write data to many databases all from a single database! I finally found a project where it made sense to use this amazing technology.

I mapped out my architecture and began to code, and realized there were some things that did not work as expected: I could not call remote functions or insert into a table with a serial primary key and have it autoupdate. I found workarounds (which I will share), so the project went on.

We tested the setup, everything seemed to work well, and then we went to deploy to production. And then the real fun began.

Despite the title, I still love the Postgres FDW but wanted to provide some cautionary tales from a hybrid developer/DBA perspective on how to properly use them in your working environment. This talk will cover:

* Basic Postgres FDW setup in a development environment vs. production environment
* Handling some common FDW use cases that you think are trivial but are not
* Working with advanced Postgres constructs such as schemas and sequences with FDWs
* Putting it all together to make sure your production application is safe with your FDWs
* ...and when you really, really need to make a remote call and it is not supported by an FDW, how to do that too!
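
As a rough illustration of the basic production-style setup listed above, the following is a minimal sketch rather than the talk's exact procedure. The object names (`app_server`, the `app` schema, the `api` role, the `venues` table) come from the slides. The `IF NOT EXISTS` clauses and the `CREATE SCHEMA ... AUTHORIZATION` line are assumptions added for illustration, and the password is a placeholder.

```sql
-- Run on the "api" database as a Postgres superuser.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;
GRANT USAGE ON FOREIGN DATA WRAPPER postgres_fdw TO api;

-- Assumption: give the "api" role its own schema for foreign tables.
CREATE SCHEMA IF NOT EXISTS app AUTHORIZATION api;

-- Run on the "api" database as the "api" role.
CREATE SERVER app_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (dbname 'app');

-- A non-superuser mapping needs a password; 'test' is a placeholder.
CREATE USER MAPPING FOR api
    SERVER app_server
    OPTIONS (user 'api', password 'test');

-- The remote table lives in "public", so schema_name is set explicitly
-- because the local schema ("app") has a different name.
CREATE FOREIGN TABLE app.venues (
    id int,
    name text
)
SERVER app_server
OPTIONS (schema_name 'public', table_name 'venues');
```

Queries against `app.venues` will still fail until the remote role has been granted access on the "app" database, which is the part of the story the production half of the talk walks through.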

Developing and Deploying Apps with the Postgres FDW

  1. 1. My Love of Developing with the Postgres FDW ...and how production tested those feelings. Jonathan S. Katz, PGConf EU 2015 - October 30, 2015
  2. 2. Hi! I'm Jonathan!
  3. 3. A Bit About Me
          • @jkatz05
          • Chief Technology Officer @ VenueBook
          • Using Postgres since ~2004
          • Been using it decently ~2010
          • One day hope to use it well ;-)
          • Active Postgres community member
          • Co-Chair, PGConf US
          • Co-organizer, NYC PostgreSQL User Group
          • Director, United States PostgreSQL Association
          • Have been to every PGConf.EU except Madrid :(
  4. 4. PGConf US 2016, April 18 - 20, New York City, http://www.pgconf.us/
  5. 5. Disclaimer: I loooooove PostgreSQL
  6. 6. Disclaimer #2: I'm some sort of weird dev / DBA / business-person hybrid
  7. 7. Okay, done with the boilerplate. Let's do this.
  8. 8. Foreign Data Wrappers in a Nutshell
          • Provide a unified interface (i.e. SQL) to access different data sources
              • RDBMS (like Postgres!)
              • NoSQL
              • APIs (HTTP, Twitter, etc.)
              • Internet of things
  9. 9. IMHO: This is a killer feature
  10. 10. History of FDWs
          • Released in 9.1 with a few read-only interfaces
              • SQL-MED
              • Did not include Postgres :(
          • 9.3: Writeable FDWs
              • ...and did include Postgres :D
          • 9.4: Considers triggers on foreign tables
          • 9.5
              • IMPORT FOREIGN SCHEMA
              • Push Down API (WIP)
              • Inheritance children
  11. 11. Not Going Anywhere
          • 9.6
              • Join Push Down
              • Aggregate API?
              • Parallelism?
              • "Hey we need some data from you, we will check back later"
  12. 12. So I was just waiting for a good problem to solve with FDWs
  13. 13. And then a couple of them came.
  14. 14. Some Background: VenueBook is revolutionizing the way people think about event booking. Our platform lets venues and bookers plan together, creating a smarter and better-connected experience for all. We simplify planning, so you can have more fun!
  15. 15. Translation
          • We have two main products:
              • A CRM platform that allows venue managers to control everything around an event.
              • A marketplace that allows event planners to source venues and book events.
  16. 16. Further Translation: There are a lot of moving pieces with our data.
  17. 17. So The Following Conversation Happened
  18. 18. "Hey, can we build an API?"
          "Sure, but I would want to run it as a separate application so that we can isolate the load from our primary database."
          "Okay, that makes sense."
          "Great. There is a feature in Postgres that makes it easy to talk between two separate Postgres databases, so it shouldn't be too difficult to build."
          "That sounds good. Let's do it!"
          "There's one catch..."
  19. 19. This could be a bit experimental...
  20. 20. I want to experiment with this thing called a "Foreign Data Wrapper" but it should make maintenance easier overall.
  21. 21. "OK"
  22. 22. Assumptions
          • We are running PostgreSQL 9.4
          • The schema I'm working with is slightly contrived for the purposes of demonstration
  23. 23. So, let's build something in our development environment
  24. 24. local:app jkatz$ createuser
          Enter name of role to add: jkatz
          Shall the new role be a superuser? (y/n) y

          Yeah, of course I want superuser
  25. 25. # "local" is for Unix domain socket connections only
          local   all   all   trust

          Yeah, of course I don't care about authentication settings. (Pro-tip: "trust" means user privileges don't matter)
  26. 26. local:app jkatz$ createdb app

          Let's pretend this is how I created the main database.
  27. 27. CREATE TABLE venues (
              id serial PRIMARY KEY,
              name varchar(255) NOT NULL
          );

          CREATE TABLE events (
              id serial PRIMARY KEY,
              venue_id int REFERENCES venues (id),
              name text NOT NULL,
              total int NOT NULL DEFAULT 0,
              guests int NOT NULL,
              start_time timestamptz NOT NULL,
              end_time timestamptz NOT NULL,
              created_at timestamptz DEFAULT CURRENT_TIMESTAMP NOT NULL
          );

          And let's pretend this is how I created the schema for it.
  28. 28. And this magic function to check for availability.

          CREATE FUNCTION get_availability(
              venue_id int,
              start_time timestamptz,
              end_time timestamptz
          )
          RETURNS bool
          AS $$
              SELECT NOT EXISTS(
                  SELECT 1
                  FROM events
                  WHERE
                      events.venue_id = $1 AND
                      ($2, $3) OVERLAPS (events.start_time, events.end_time)
                  LIMIT 1
              );
          $$ LANGUAGE SQL STABLE;
  29. 29. local:app jkatz$ createdb api

          So let's make the API schema
  30. 30. CREATE SCHEMA api;

          We are going to be a bit smarter about how we organize the code.
  31. 31. CREATE TABLE api.users (
              id serial PRIMARY KEY,
              key text UNIQUE NOT NULL,
              name text NOT NULL
          );

          CREATE TABLE api.venues (
              id serial PRIMARY KEY,
              remote_venue_id int NOT NULL
          );

          CREATE TABLE api.events (
              id serial PRIMARY KEY,
              user_id int REFERENCES api.users (id) NOT NULL,
              venue_id int REFERENCES api.venues (id) NOT NULL,
              remote_bid_id text,
              ip_address text,
              data json,
              created_at timestamptz DEFAULT CURRENT_TIMESTAMP NOT NULL
          );

          Our API schema
  32. 32. CREATE EXTENSION postgres_fdw;

          CREATE SERVER app_server
              FOREIGN DATA WRAPPER postgres_fdw
              OPTIONS (dbname 'app');

          CREATE USER MAPPING FOR CURRENT_USER
              SERVER app_server;

          Our setup to pull the information from the main application
  33. 33. CREATE SCHEMA app;

          CREATE FOREIGN TABLE app.venues (
              id int,
              name text
          )
          SERVER app_server
          OPTIONS (table_name 'venues');

          We will isolate the foreign tables in their own schema
  34. 34. SELECT * FROM app.venues;

          So that means this returns...
  35. 35. SELECT * FROM app.venues;

          ERROR: relation "app.venues" does not exist
          CONTEXT: Remote SQL command: SELECT id, name FROM app.venues
  36. 36. ...what?
  37. 37. CREATE FOREIGN TABLE app.venues (
              id int,
              name text
          )
          SERVER app_server
          OPTIONS (
              table_name 'venues',
              schema_name 'public'
          );

          If there is a schema mismatch between local and foreign table, you have to set the schema explicitly.
  38. 38. SELECT * FROM app.venues;

           id |     name
          ----+--------------
            1 | Venue A
            2 | Restaurant B
            3 | Bar C
            4 | Club D
  39. 39. CREATE FOREIGN TABLE app.events (
              id int,
              venue_id int,
              name text,
              total int,
              guests int,
              start_time timestamptz,
              end_time timestamptz
          )
          SERVER app_server
          OPTIONS (
              table_name 'events',
              schema_name 'public'
          );

          Adding in our foreign table for events
  40. 40. INSERT INTO app.events (
              venue_id, name, total, guests, start_time, end_time
          )
          VALUES (
              1, 'Conference Party', 50000, 400,
              '2015-10-28 18:00', '2015-10-28 21:00'
          )
          RETURNING id;

          ERROR: null value in column "id" violates not-null constraint
          DETAIL: Failing row contains (null, 1, Conference Party, 50000, 400, 2015-10-28 22:00:00+00, 2015-10-29 01:00:00+00, 2015-10-27 22:19:10.555695+00).
          CONTEXT: Remote SQL command: INSERT INTO public.events(id, venue_id, name, total, guests, start_time, end_time) VALUES ($1, $2, $3, $4, $5, $6, $7)
  41. 41. Huh.
  42. 42. Two Solutions.
  43. 43. Solution #1
  44. 44. CREATE FOREIGN TABLE app.events (
              id serial NOT NULL,
              venue_id int,
              name text,
              total int,
              guests int,
              start_time timestamptz,
              end_time timestamptz
          )
          SERVER app_server
          OPTIONS (
              table_name 'events',
              schema_name 'public'
          );
  45. 45. INSERT INTO app.events (
              venue_id, name, total, guests, start_time, end_time
          )
          VALUES (
              1, 'Conference Party', 50000, 400,
              '2015-10-28 18:00', '2015-10-28 21:00'
          )
          RETURNING id;

           id
          ----
            1
          (1 row)
  46. 46. WARNING
          • This is using a sequence on the local database
          • If you do not want to generate overlapping primary keys, this is not the solution for you.
          • Want to use the sequence generating function on the foreign database
          • But FDWs cannot access foreign functions
          • However...
  47. 47. Solution #2
  48. 48. (on the "app" database)

          CREATE SCHEMA api;

          CREATE VIEW api.events_id_seq_view AS
              SELECT nextval('public.events_id_seq') AS id;
  49. 49. (on the "api" database)

          CREATE FOREIGN TABLE app.events_id_seq_view (
              id int
          )
          SERVER app_server
          OPTIONS (
              table_name 'events_id_seq_view',
              schema_name 'api'
          );

          CREATE FUNCTION app.events_id_seq_nextval()
          RETURNS int
          AS $$
              SELECT id FROM app.events_id_seq_view
          $$ LANGUAGE SQL;

          CREATE FOREIGN TABLE app.events (
              id int DEFAULT app.events_id_seq_nextval(),
              venue_id int,
              name text,
              total int,
              guests int,
              start_time timestamptz,
              end_time timestamptz
          )
          SERVER app_server
          OPTIONS (
              table_name 'events',
              schema_name 'public'
          );
  50. 50. INSERT INTO app.events (
              venue_id, name, total, guests, start_time, end_time
          )
          VALUES (
              1, 'Conference Party', 50000, 400,
              '2015-10-28 18:00', '2015-10-28 21:00'
          )
          RETURNING id;

           id
          ----
            4
          (1 row)
  51. 51. Hey, can we check the availability on the api server before making an insert on the app server?
  52. 52. Sure, we have a function for that on "app" but... FDWs do not support foreign functions. And we cannot use a view. However...
  53. 53. dblink
          • Written in 2001 by Joe Conway
          • Designed to make remote PostgreSQL database calls
          • The docs say:
              • See also postgres_fdw, which provides roughly the same functionality using a more modern and standards-compliant infrastructure.
  54. 54. (on the "api" database)

          -- set up the extensions (if not already done)
          CREATE EXTENSION IF NOT EXISTS plpgsql;
          CREATE EXTENSION IF NOT EXISTS dblink;

          -- create the wrapper function that calls the remote function
          CREATE FUNCTION app.get_availability(
              venue_id int,
              start_time timestamptz,
              end_time timestamptz
          )
          RETURNS bool
          AS $get_availability$
          DECLARE
              is_available bool;
              remote_sql text;
          BEGIN
              remote_sql := format('SELECT get_availability(%L, %L, %L)', venue_id, start_time, end_time);
              SELECT availability.is_available INTO is_available
              FROM dblink('dbname=app', remote_sql) AS availability(is_available bool);
              RETURN is_available;
          EXCEPTION
              WHEN others THEN
                  RETURN NULL::bool;
          END;
          $get_availability$ LANGUAGE plpgsql;
  55. 55. SELECT app.get_availability(1, '2015-10-28 18:00', '2015-10-28 20:00');

           get_availability
          ------------------
           f
          (1 row)

          SELECT app.get_availability(1, '2015-10-28 12:00', '2015-10-28 14:00');

           get_availability
          ------------------
           t
          (1 row)

          Works great!
  56. 56. Summary So Far...
          • We created two separate databases with logical schemas
          • We wrote some code using postgres_fdw and dblink that can
              • Read data from "app" to "api"
              • Insert data from "api" to the "app"
                  • ...with the help of the sequence trick
              • Make a remote function call
  57. 57. Awesome! Let's Deploy
  58. 58. (And because we are good developers, we are going to test the deploy configuration in a staging environment, but we can all safely assume that, right? :-)
  59. 59. (Note: when I say "superuser" I mean a Postgres superuser)
  60. 60. Network Topology
          • db01: 10.0.0.80 (app, api)
          • api01: 10.0.0.20 (api)
          • app01: 10.0.0.10 (app)
  61. 61. db01:postgresql postgres$ createdb -O app app
          db01:postgresql postgres$ createdb -O app api

          How we are setting things up
  62. 62. pg_hba.conf setup

          # TYPE  DATABASE  USER  ADDRESS        METHOD
          # for the main user
          host    app       app   10.0.0.10/32   md5
          host    api       api   10.0.0.20/32   md5
          # for foreign table access
          local   api       app                  md5
          local   app       api                  md5
  63. 63. CREATE EXTENSION postgres_fdw;
          CREATE EXTENSION dblink;

          So we already know to run these as a superuser on "api" right? ;-)
  64. 64. But if we log in as the "api" user and try to run this...

          CREATE SERVER app_server
              FOREIGN DATA WRAPPER postgres_fdw
              OPTIONS (dbname 'app');

          ERROR: permission denied for foreign server app_server
  65. 65. As a superuser, grant permission

          GRANT USAGE ON FOREIGN DATA WRAPPER postgres_fdw TO api;
  66. 66. CREATE SERVER app_server
              FOREIGN DATA WRAPPER postgres_fdw
              OPTIONS (dbname 'app');

          CREATE FOREIGN TABLE app.venues (
              id int,
              name text
          )
          SERVER app_server
          OPTIONS (
              table_name 'venues',
              schema_name 'public'
          );

          Now this works! Let's run a query...
  67. 67. SELECT * FROM app.venues;

          ERROR: user mapping not found for "api"
  68. 68. CREATE USER MAPPING FOR api
              SERVER app_server
              OPTIONS (
                  user 'api',
                  password 'test'
              );

          So we create the user mapping and...
  69. 69. SELECT * FROM app.venues;

          ERROR: permission denied for relation venues
          CONTEXT: Remote SQL command: SELECT id, name FROM public.venues

          You've got to be kidding me...
  70. 70. Go to "app" and as a superuser run this

          GRANT SELECT ON venues TO api;
          GRANT SELECT, INSERT, UPDATE ON events TO api;
  71. 71. Meanwhile, back on "api"

          SELECT * FROM app.venues;

           id |     name
          ----+--------------
            1 | Venue A
            2 | Restaurant B
            3 | Bar C
            4 | Club D
  72. 72. Time to make the events work.
  73. 73. Get things started on the "app" database

          CREATE SCHEMA api;

          CREATE VIEW api.events_id_seq_view AS
              SELECT nextval('public.events_id_seq') AS id;
  74. 74. Back on the "api" database

          -- setup the sequence functionality
          CREATE FOREIGN TABLE app.events_id_seq_view (
              id int
          )
          SERVER app_server
          OPTIONS (
              table_name 'events_id_seq_view',
              schema_name 'api'
          );

          CREATE FUNCTION app.events_id_seq_nextval()
          RETURNS int
          AS $$
              SELECT id FROM app.events_id_seq_view
          $$ LANGUAGE SQL;
  75. 75. And when we test the sequence function...
  76. 76. SELECT app.events_id_seq_nextval();

          ERROR: permission denied for schema api
          CONTEXT: Remote SQL command: SELECT id FROM api.events_id_seq_view
          SQL function "events_id_seq_nextval" statement 1

          Here we go again...
  77. 77. On the "app" database

          GRANT USAGE ON SCHEMA api TO api;
  78. 78. On "api" - ARGH...

          SELECT app.events_id_seq_nextval();

          ERROR: permission denied for relation events_id_seq_view
          CONTEXT: Remote SQL command: SELECT id FROM api.events_id_seq_view
          SQL function "events_id_seq_nextval" statement 1
  79. 79. On the "app" database

          GRANT SELECT ON api.events_id_seq_view TO api;
  80. 80. On "api" - STILL?!?!?!?!

          SELECT app.events_id_seq_nextval();

          ERROR: permission denied for sequence events_id_seq
          CONTEXT: Remote SQL command: SELECT id FROM api.events_id_seq_view
          SQL function "events_id_seq_nextval" statement 1
  81. 81. On the "app" database

          GRANT USAGE ON SEQUENCE events_id_seq TO api;
  82. 82. And on "api" - YES!

          SELECT app.events_id_seq_nextval();

           events_id_seq_nextval
          -----------------------
                               1
  83. 83. CREATE FOREIGN TABLE app.events (
              id int DEFAULT app.events_id_seq_nextval(),
              venue_id int,
              name text,
              total int,
              guests int,
              start_time timestamptz,
              end_time timestamptz
          )
          SERVER app_server
          OPTIONS (
              table_name 'events',
              schema_name 'public'
          );

          We can now create the foreign table and test the INSERT...
  84. 84. INSERT INTO app.events (
              venue_id, name, total, guests, start_time, end_time
          )
          VALUES (
              1, 'Conference Party', 50000, 400,
              '2015-10-28 18:00', '2015-10-28 21:00'
          )
          RETURNING id;

           id
          ----
            2

          Yup...we ran "GRANT SELECT, INSERT, UPDATE ON events TO api;" on "app" earlier!
  85. 85. And install our availability function...

          CREATE FUNCTION app.get_availability(
              venue_id int,
              start_time timestamptz,
              end_time timestamptz
          )
          RETURNS bool
          AS $get_availability$
          DECLARE
              is_available bool;
              remote_sql text;
          BEGIN
              remote_sql := format('SELECT get_availability(%L, %L, %L)', venue_id, start_time, end_time);
              SELECT availability.is_available INTO is_available
              FROM dblink('dbname=app user=api password=test', remote_sql) AS availability(is_available bool);
              RETURN is_available;
          EXCEPTION
              WHEN others THEN
                  RETURN NULL::bool;
          END;
          $get_availability$ LANGUAGE plpgsql;
  86. 86. SELECT app.get_availability(1, '2015-10-28 18:00', '2015-10-28 20:00');

           get_availability
          ------------------
           f
          (1 row)

          SELECT app.get_availability(1, '2015-10-28 13:00', '2015-10-28 17:00');

           get_availability
          ------------------
           t
          (1 row)

          ...and wow.
  87. 87. WE DID IT!!!
  88. 88. What did we learn?
  89. 89. We Learned That...
          • PostgreSQL has a robust permission system
              • http://www.postgresql.org/docs/current/static/sql-grant.html
              • ...there is much more we could have done too.
          • Double the databases, double the problems
          • Always have a testing environment that can mimic your production environment
          • ...when it all works, it is so sweet.
  90. 90. Conclusion
          • Foreign data wrappers are incredible
          • The postgres_fdw is incredible
          • ...and it is still a work in progress
          • Make sure you understand its limitations
          • Research what is required to properly install in production
  91. 91. Questions?
          • @jkatz05
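
The remote-side grants that the production walkthrough discovers one error at a time can be collected into a single script. This is a sketch under the slides' assumptions (a role named `api`, the `api` schema and `events_id_seq_view` view on the "app" database, and the default sequence name `events_id_seq`). The closing privilege check is an addition for illustration and is not part of the talk.

```sql
-- Run on the "app" database as a Postgres superuser.

-- Tables read and written through the foreign tables.
GRANT SELECT ON venues TO api;
GRANT SELECT, INSERT, UPDATE ON events TO api;

-- The sequence trick needs access to the schema, the view, and the
-- sequence that the view calls nextval() on.
GRANT USAGE ON SCHEMA api TO api;
GRANT SELECT ON api.events_id_seq_view TO api;
GRANT USAGE ON SEQUENCE events_id_seq TO api;

-- Illustrative check (not from the talk): each column should be true
-- once the grants above have been applied.
SELECT
    has_table_privilege('api', 'public.venues', 'SELECT') AS venues_select,
    has_table_privilege('api', 'public.events', 'INSERT') AS events_insert,
    has_schema_privilege('api', 'api', 'USAGE') AS api_schema_usage,
    has_table_privilege('api', 'api.events_id_seq_view', 'SELECT') AS seq_view_select,
    has_sequence_privilege('api', 'public.events_id_seq', 'USAGE') AS seq_usage;
```

Running the check in a staging environment that mirrors production surfaces a missing grant before the first foreign-table query does.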
