Geographically Distributed PostgreSQL

This presentation surveys different ways one can geographically distribute PostgreSQL, including master-slave and multi-master solutions. It discusses pitfalls and emphasizes understanding requirements. The presentation covers some of the existing tools that are available in the community. It also touches upon upcoming PostgreSQL solutions.


Transcript

  • 1. Geographically Distributed PostgreSQL. PGConf NYC, April 3, 2014. Mason Sharp, Chief Architect, msharp@translattice.com
  • 2. Agenda
    – Why geographically distribute your data?
    – General replication background
    – PostgreSQL options
    – Custom PostgreSQL configurations
    – Upcoming solutions
  • 3. Why geographically distribute your data?
    – Improved availability
    – Better performance (in some cases…)
        – Read vs. write
        – Data closer to applications and users
    – Regulatory or corporate compliance
        – Data placement concerns
  • 4. Availability Issues Remain Headline News
  • 5. Data Center Outages: Causes and Frequency
    – Primary causes of data center outage [1]:
        – Hardware failure: 34.4%
        – Power loss/interruption: 31.5%
        – Natural disaster: 13.3%
        – (together: 79.2%)
    – Most recent unplanned data center outage [1]:
        – Last 6 months: 42%
        – Last year: 34%
        – 76% experienced an outage in the last year
    [1] Survey by Zerto, July 2013, 356 IT professionals from 10 industries
  • 6. Data Center Outage Costs Increasing
    – Average cost of an outage is increasing [2]:
        – 2010: $5,617/minute
        – 2013: $7,908/minute (a 41% increase)
    – Length of unplanned outage:
        – Average: 86 minutes [2]
        – 25%+ of Oracle users had 8+ hours of unplanned downtime in the last year [3]
    [2] 2013 Cost of Data Center Outages, Ponemon Institute, December 2013
    [3] Bringing Continuous Availability to Oracle Environments, 2013 Mission-Critical Application Availability Survey, Unisphere Research
  • 7. Current State of Data Replication
    – Top data management issues for IT executives [4]:
        – Providing business continuity at a reasonable cost
        – Deploying applications in multiple geographies consistently
        – The continued ability to use SQL
    – "Among respondents with at least two data centers and rapid replication solutions, 46% indicate they are less than satisfied with their current strategies." [5]
    [4] DBMS Evaluation Criteria, IDG Research Services, October 2013
    [5] Bringing Continuous Availability to Oracle Environments, 2013 Mission-Critical Application Availability Survey, Unisphere Research
  • 8. Replication
    – Master-slave: one master, one or more slaves
    – Multi-master: multiple masters
    – Multi-source fan-in (example: consolidate multiple sites)
    – Fan-out
  • 9. Master-Slave (diagram: one master replicating to three slaves)
  • 10. Master-Slave
    – All writes go to one master
    – Hot standby reads can be done from any node
    – Synchronous / asynchronous
    – Slaves get transactions via either:
        – Native streaming replication (see the config sketch below)
        – Statement-based replication
            – Could be synchronous, could be via 2PC
            – Could be a replay mechanism via queues or triggers
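    A minimal sketch of the native streaming-replication setup this slide refers to, as it looked in the PostgreSQL 9.3 era; the host name, user, and application_name are placeholders:

        # postgresql.conf on the master
        wal_level = hot_standby
        max_wal_senders = 3
        synchronous_standby_names = 'standby1'   # omit for asynchronous replication

        # recovery.conf on the standby (with hot_standby = on in its postgresql.conf)
        standby_mode = 'on'
        primary_conninfo = 'host=master_host port=5432 user=replicator application_name=standby1'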
  • 11. Multi-master (diagram)
  • 12. Multi-master
    – Writes can occur at any location
    – Synchronous 2PC
        – MVCC concerns
        – May make sense to always write at one location first, acquiring the lock
    – Asynchronous
    – Conflict resolution
    – Conflict avoidance through commit ordering
        – Paxos
        – Raft
  • 13. Multi-source fan-in (diagram: Loc1, Loc2, Loc3 feeding Central)
    – Consolidated centrally for reporting
  • 14. Multi-source fan-out (diagram: Central feeding Loc1, Loc2, Loc3)
    – Subset sent to remote locations
  • 15. Understand Your Requirements
    – Availability: is read-only access to some data OK in a degraded state?
    – Immediacy of data: nightly refresh? Immediate? 2-second lag?
    – Performance and latency: read vs. write
    – Correctness versus performance
    – Conflicts: prevent or resolve
  • 16. Understand Your Requirements (continued)
    – Data segregation
    – Data ownership
        – Can each location be the "master" for a subset of data?
        – Example: regional customers
        – Expressed either as a subtable, or as an expression on a table (e.g., region_code = 'US')
        – Different availability requirements?
    – "Staticity" classification
        – Static tables that rarely change
        – Frequently updated tables
  • 17. Static Tables
    – Less concerned about write performance
    – Writing to the table (see the sketch below):
        – BEGIN;
        – Execute the DML statement on the agreed "master"
        – On success, we have acquired all of the row locks
        – Safely execute on the other nodes without risk of deadlock
        – PREPARE TRANSACTION; COMMIT;
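    A minimal SQL sketch of this write sequence using PostgreSQL two-phase commit, assuming a coordinating script that holds a connection to each node; the table, values, and transaction id are hypothetical, and every node needs max_prepared_transactions > 0:

        -- On the agreed "master" node first (acquires the row locks):
        BEGIN;
        UPDATE static_ref SET val = 42 WHERE id = 1;
        PREPARE TRANSACTION 'static_ref_tx_1';

        -- Then the same statement on each other node:
        BEGIN;
        UPDATE static_ref SET val = 42 WHERE id = 1;
        PREPARE TRANSACTION 'static_ref_tx_1';

        -- Once every node has prepared successfully, commit everywhere:
        COMMIT PREPARED 'static_ref_tx_1';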
  • 18. Careful with Reflexive UPDATEs

    UPDATE inventory SET qty = qty - 1 WHERE ...;

    – What if this happens on multiple nodes?
    – If the conflict resolution policy is last-one-wins, inventory is reduced by only 1, not 2
    – May expect inventory that is not there
    – Some tables may need special handling: SELECT FOR UPDATE on a master (see the sketch below)
        – Will block if another transaction is modifying the row
        – Locks won't propagate to other nodes
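    A hedged sketch of the SELECT FOR UPDATE approach: serialize decrements through one designated master so two sites cannot both consume the same stock. The inventory table and qty column are from the slide; item_id is a hypothetical key:

        -- On the designated master for the inventory table:
        BEGIN;
        SELECT qty FROM inventory WHERE item_id = 100 FOR UPDATE;  -- blocks concurrent writers
        -- Check the returned qty in the application, then decrement:
        UPDATE inventory SET qty = qty - 1 WHERE item_id = 100 AND qty > 0;
        COMMIT;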
  • 19. Looking in the PostgreSQL Toolbox: Master-Slave
    – Native streaming replication
        – All databases in the instance are replicated
        – Synchronous and asynchronous options
        – Hot, queryable standby option
    – Slony
        – Trigger-based, asynchronous replication
        – Flexibility for a subset of data
        – More complex administration
    – Londiste
  • 20. Looking in the PostgreSQL Toolbox: pgpool-II
    – Middle layer
    – Synchronous statement-based replication
        – Can instead be combined with other replication, including native streaming replication
    – Load balancer
        – All writes must go through the master node
  • 21. Looking in the PostgreSQL Toolbox: Postgres-XC
    – Can connect to any one of multiple nodes
    – Good push-down join and operation handling
    – Ensures cluster-wide consistency, BUT:
        – Requires access to the Global Transaction Manager from each node
        – Nodes are a modified version of PostgreSQL
        – Not currently suited for this use case
  • 22. Looking in the PostgreSQL Toolbox: PL/Proxy
    – Everything is a stored function
        – More cumbersome, but flexible
  • 23. Looking in the PostgreSQL Toolbox: Multi-master
    – Bucardo
        – Perl-based
        – Limited to two masters
        – Custom conflict resolution possible
    – RubyRep
        – Ruby-based
        – Limited to two masters
        – Custom conflict resolution possible
    – Postgres-R
        – Modified PostgreSQL 9.0
  • 24. Looking in the PostgreSQL Toolbox: Custom
    – Triggers
    – Foreign data wrappers
    – Subtable partitioning
    – Two-phase commit
  • 25. Looking in the PostgreSQL Toolbox: Considerations
    – Connections and MVCC across multiple instances
    – Sequences/SERIAL
        – UUID as an alternative
    – Timestamps (see the sketch below)
        – Use timestamp with time zone
        – Network Time Protocol (NTP)
        – Custom functions to compensate for time lag
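    A small sketch combining these two considerations, assuming the uuid-ossp extension from the configuration slides; the event_log table is hypothetical:

        -- Globally unique keys avoid cross-site sequence collisions;
        -- timestamptz stores an absolute point in time regardless of
        -- each server's local time zone setting.
        CREATE TABLE event_log (
            event_id   uuid        DEFAULT uuid_generate_v4() PRIMARY KEY,
            recorded   timestamptz DEFAULT now(),
            payload    text
        );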
  • 26. Custom Example
    – Multiple locations, largely independent
    – Most writes occur locally: each site is the "master" for local data
    – Want to be able to write data on a remote site
    – Want local read performance for remotely originating data
    – If the remote site is down, local read-only access is acceptable
    – Occasional updates to static data require all nodes online
  • 27. Custom Example (diagram: DC1 runs the DC1 master and a DC2 hot standby; DC2 runs the DC2 master and a DC1 hot standby)
  • 28. Custom Example (diagram: as above, with table customer_dc1 on the DC1 master and customer_dc2 on the DC2 master)
  • 29. Custom Example (diagram: as above, with a customer view on each side combining the local table and, via FDW, the remote data)
  • 30. Configuration

    configure --with-ossp-uuid
    CREATE EXTENSION "uuid-ossp";
    CREATE EXTENSION "postgres_fdw";
  • 31. Configuration
    From dc1:

    CREATE SERVER dc2_master FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host 'dc2_host', dbname 'dc2', port '5434');
    CREATE SERVER dc2_slave FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host 'localhost', dbname 'dc2', port '5434');
  • 32. Configuration
    From dc2:

    CREATE SERVER dc1_master FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host 'dc1_host', dbname 'dc1', port '5433');
    CREATE SERVER dc1_slave FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host 'localhost', dbname 'dc1', port '5433');
  • 33. Configuration

    CREATE USER MAPPING FOR user1 SERVER dc2_master OPTIONS (user 'user1');
    CREATE USER MAPPING FOR user1 SERVER dc2_slave OPTIONS (user 'user1');
  • 34. Configuration

    CREATE TABLE customer_dc1
        (cust_id uuid, cust_name varchar, cust_loc char(5));
  • 35. Configuration
    On dc1:

    CREATE FOREIGN TABLE customer_dc2_master
        (cust_id uuid, cust_name varchar, cust_loc char(5))
        SERVER dc2_master;
    CREATE FOREIGN TABLE customer_dc2_slave
        (cust_id uuid, cust_name varchar, cust_loc char(5))
        SERVER dc2_slave;
  • 36. View Handling
    – Create a customer view: a union of the local table and the local slave of the remote table
    – Include a cust_loc condition

    CREATE VIEW customer AS
        SELECT * FROM customer_dc1 WHERE cust_loc = 'DC1'
        UNION ALL
        SELECT * FROM customer_dc2_slave WHERE cust_loc = 'DC2';
  • 37. View Handling

    # explain select * from customer;
                                 QUERY PLAN
    --------------------------------------------------------------------------
     Append  (cost=0.00..140.82 rows=8 width=72)
       ->  Seq Scan on customer_dc1
             Filter: (cust_loc = 'DC1'::bpchar)
       ->  Foreign Scan on customer_dc2_slave
  • 38. Configuration
    – PostgreSQL takes qualifications into account for better plans!

    # explain select * from customer where cust_loc = 'DC1';
                             QUERY PLAN
    ----------------------------------------------------------------
     Append  (cost=0.00..20.04 rows=4 width=72)
       ->  Seq Scan on customer_dc1  (cost=...)
             Filter: (cust_loc = 'DC1'::bpchar)

    – Smart enough to use just one branch of the UNION
        – Leaves off the foreign-table branch entirely
        – Consider this in the design of the application
  • 39. Triggers

    CREATE TRIGGER tr_customer
        INSTEAD OF INSERT OR UPDATE OR DELETE ON customer
        FOR EACH ROW EXECUTE PROCEDURE update_customer();
  • 40. Trigger Function

    CREATE OR REPLACE FUNCTION update_customer() RETURNS TRIGGER AS $$
    BEGIN
        -- TODO: Handle updating cust_loc
        IF (TG_OP = 'UPDATE') THEN
            IF OLD.cust_loc = 'DC1' THEN
                UPDATE customer_dc1 SET cust_name = NEW.cust_name
                    WHERE cust_id = OLD.cust_id;
            ELSEIF OLD.cust_loc = 'DC2' THEN
                UPDATE customer_dc2_master SET cust_name = NEW.cust_name
                    WHERE cust_id = OLD.cust_id;
            END IF;
            RETURN NEW;
        END IF;
        -- ... (INSERT and DELETE branches elided on the slide)
    END;
    $$ LANGUAGE plpgsql;
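    With the view and trigger in place, applications can write through the unified view and the trigger routes the row to the owning site; the UUID below is just a placeholder:

        UPDATE customer SET cust_name = 'New Name'
            WHERE cust_id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11';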
  • 41. Caveats
    – Performance will be poor for some queries
        – No join push-down
    – Two-phase commit is not used by the FDW
        – No consistency guarantees!
        – FWIW, it will commit remotely before committing locally
    – Repeatable Read is used by the FDW
        – Keeps results the same when a foreign table is scanned multiple times
    – Differing locale settings may cause problems
  • 42. Custom Example: Further Enhancement
    – Want to reduce loss of the ability to write new data
    – Add a local table for local inserts when the remote side is down (see the sketch below)
        – Especially helpful for append-only workloads
    – Change the trigger functions to use the local table when the remote side is down
    – Allow updates and deletes on these as well
    – When the remote side is available again, apply the changes to the remote side and truncate the local table
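    A hedged sketch of that fallback, using a hypothetical customer_dc2_pending queue table with the same shape as the customer tables; the trigger tries the remote master and, if the FDW call fails, queues the row locally for later replay:

        CREATE TABLE customer_dc2_pending
            (LIKE customer_dc1 INCLUDING ALL);

        -- Inside the trigger function's INSERT branch:
        BEGIN
            INSERT INTO customer_dc2_master
                VALUES (NEW.cust_id, NEW.cust_name, NEW.cust_loc);
        EXCEPTION WHEN OTHERS THEN
            -- Remote side unreachable: queue the change locally, to be
            -- replayed (and the queue truncated) when it comes back.
            INSERT INTO customer_dc2_pending
                VALUES (NEW.cust_id, NEW.cust_name, NEW.cust_loc);
        END;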
  • 43. Custom Example: Another Attempt
    – Tried using table inheritance and adding a rule on a subtable to query a remote table instead, but encountered issues
  • 44. Another Custom Example
    – All tables in just one database on each node
    – No streaming replication
    – Changes applied at both locations
        – Either via 2PC
        – Or asynchronously via triggers
  • 45. Upcoming PostgreSQL Multi-master Replication
    – Logical Log Streaming Replication (LLSR) in PostgreSQL 9.4 (see the sketch below)
    – The WAL is read to determine logical changes
    – Can be decoded to SQL
    – Less overhead than other projects
    – Unlike existing streaming replication, will allow a subset of data to be replicated rather than the entire instance
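    A minimal way to see 9.4 logical decoding in action, using the test_decoding plugin shipped with PostgreSQL; this assumes wal_level = logical and max_replication_slots > 0, and tab1 stands in for any existing table:

        -- Create a logical replication slot using the test_decoding plugin:
        SELECT * FROM pg_create_logical_replication_slot('demo_slot', 'test_decoding');

        -- Make a change, then read the decoded stream of logical changes:
        INSERT INTO tab1 VALUES (1, 2);
        SELECT * FROM pg_logical_slot_get_changes('demo_slot', NULL, NULL);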
  • 46. Upcoming PostgreSQL Multi-master Replication
    – A goal in a future PostgreSQL release is multi-master replication with last-one-wins conflict resolution (9.5?)
    – Possibly a 9.4 extension for the apply side in the future
    – Improvements over subsequent releases
        – Improved DDL support may be phased in over time
  • 47. Bucardo Example

    createdb db1
    createdb -p 5433 db1
    psql -c "CREATE TABLE tab1 (col1 int, col2 int, PRIMARY KEY(col1))" db1
    psql -c "CREATE TABLE tab1 (col1 int, col2 int, PRIMARY KEY(col1))" -p 5433 db1
  • 48. Bucardo Example

    bucardo_ctl install
    bucardo_ctl add database db1 name=db1a
    bucardo_ctl add database db1 name=db1b port=5433
    bucardo_ctl add all tables db=db1a

    In psql, connected to the bucardo database:

    UPDATE bucardo.goat SET standard_conflict = 'latest'
        WHERE tablename = 'tab1';
  • 49. Bucardo Example

    bucardo_ctl add sync sync_tab1 type=swap source=db1a targetdb=db1b tables=tab1
    bucardo_ctl stop
    bucardo_ctl start

    Updates to tab1 are now visible on both servers.
  • 50. Bucardo Notes
    – If having trouble, try "bucardo_ctl install" again
    – Also try bucardo_ctl stop and bucardo_ctl start
    – It seemed to get confused when the same table name existed in multiple databases
  • 51. Alternative: TransLattice Elastic Database (TED)
    – PostgreSQL-based
    – Geo-distributed multi-master RDBMS with sharding
    – Policy-configurable
        – Degree of redundancy
        – Data location
    – Uses Fast Generalized Paxos for global commit ordering
    – Easily add nodes
        – New locations
        – Existing locations, for scalability
    – Nodes recover automatically
    – Easy transition
        – Can operate in conjunction with existing database systems
  • 52. Each TransLattice Node Delivers Capabilities That Replace Numerous Disparate Technologies
    – A single node type simplifies scaling and management: TL replication, storage management, cluster management, compliance tools, a fully relational database, management tools, and data integration tools
  • 53. Thank You! msharp@translattice.com  @mason_db  @TransLattice