Postgres-XC Write Scalable PostgreSQL Cluster

An overview of Postgres-XC is provided. Postgres-XC is a free, open source, PostgreSQL based write scalable cluster. It runs on multiple servers, is fully ACID and consistent. Postgres-XC is a traditional relational alternative to cases where NoSQL solutions are being considered.

This presentation was given in San Francisco on August 7th, 2012, by Mason Sharp, one of the original architects of Postgres-XC and co-founder of StormDB (http://www.stormdb.com).

Postgres-XC Write Scalable PostgreSQL Cluster: Presentation Transcript

  • Postgres-XC: Write-Scalable PostgreSQL Cluster
    Mason Sharp
    August 7th, 2012
    CC License: Attribution-NonCommercial-ShareAlike
  • Content Attribution
    • Koichi Suzuki
    • Michael Paquier
    • Ashutosh Bapat
    • Pavan Deolasee
    • Mason Sharp
    • ...?
    Aug 7, 2012
  • Who am I
    ● Mason Sharp
    ● Co-organizer of NYC PUG
    ● Co-founder of StormDB
    ● Previously worked at EnterpriseDB
    ● Original architect of Stado (GridSQL)
    ● One of the original architects of Postgres-XC
  • PostgreSQL User Groups
    San Francisco: 616 Members
    New York: 502 Members
    New: Philadelphia, Los Angeles
    Tokyo: 2000? Members
  • NYC PUG Meetup Membership
  • NYC PUG Speakers
    ● Recent speakers include
    ● Bruce Momjian
    ● Greg Smith
    ● Greg Stark
    ● Joe Conway
    ● Joachim Wieland
  • NYC PUG Speakers
    We want you!
  • Postgres-XC Talk
    ● Background
    ● Postgres-XC Introduction & Usage
    ● Postgres-XC Components
    ● Postgres-XC Details
  • Background
  • Data Tier Scaling
    ● Up versus Out
    ● More memory, more cores
    ● Read-only Replicated Slaves
    ● Caching
    ● Memcached
    ● Sharding
    ● NoSQL
    ● NewSQL
  • XC Origins
    Koichi Suzuki, NTT Data
    Mason Sharp
  • PostgreSQL-Related Clustering Projects
    ● pgpool-II
    ● Read replicated slaves
    ● PL/Proxy
    ● Used by Skype, meetme (myYearbook)
    ● All access is over a stored function
    ● Postgres-R, PostgresForest
    ● Stado (GridSQL)
    ● Parallel Query
    ● Not write-scalable
    Can we make it write scalable?
  • Postgres-XC Introduction
  • Overview
    ● PostgreSQL-based database cluster
    ● Same API to Apps as PostgreSQL
        – Same drivers
    ● Currently based upon PG 9.1. Soon: 9.2.
    ● Symmetric Multi-headed Cluster
    ● No master, no slave
        – Not just PostgreSQL replication.
        – Application can read/write to any coordinator server
    ● Consistent database view to all the transactions
        – Complete ACID property to all the transactions in the cluster
    ● Scales both for Write and Read
  • Postgres-XC Cluster
    Application can connect to any server to have the same database view and service.
    (Diagram: several PG-XC servers, each with a Coordinator and a Data Node; add PG-XC servers as needed; communication among PG-XC servers; Global Transaction Manager, GTM)
  • Read/Write Scalability
    (Chart: DBT-1 throughput scalability)
  • I Consistency
  • Is XC right for you?
    ● I need write scalability
    ● I like ACID
    ● I like SQL
    ● I don't want to rewrite my existing SQL applications
    ● I want to leverage the PostgreSQL community for all of their contrib modules
  • Why XC may not be right for you
    ● I need MPP parallel query capability
    ● Parallel query in XC is limited
    ● Try Stado: www.stado.us
    ● I need a solution with built-in HA
    ● I need massive scale and have loose consistency requirements
    ● I would rather use a NoSQL solution so I can put it on my resume
  • Postgres-XC Components
  • Coordinator Overview
    ● Based on PostgreSQL 9.1 (9.2 soon)
    ● Accepts connections from clients
    ● Parses and plans requests
    ● Interacts with Global Transaction Manager
    ● Uses pooler for Data Node connections
    ● Sends down XIDs and snapshots to Data Nodes
    ● Collects results and returns to client
    ● Uses two-phase commit if necessary
  • Data Node Overview
    ● Based on PostgreSQL 9.1 (9.2 soon)
    ● Where user-created data is actually stored
    ● Coordinators (not clients) connect to Data Nodes
    ● Accepts XIDs and snapshots from the Coordinator
    ● The rest is fairly similar to vanilla PostgreSQL
  • Global Transaction Manager
    (Diagram: GTM and cluster nodes; the items exchanged are XID, Snapshot, Timestamp, Sequence values)
  • Summary
    ● Coordinator
    ● Visible to apps
    ● SQL analysis, planning, execution
    ● Connection pooling
    ● Datanode (or simply “NODE”)
    ● Actual database store
    ● Local SQL execution
    (Coordinator and Datanode: Postgres-XC core, based upon vanilla PostgreSQL. Share same binary. May want to colocate.)
    ● GTM (Global Transaction Manager)
    ● Provides consistent database view to transactions
        – GXID (Global Transaction ID)
        – Snapshot (List of active transactions)
        – Other global values such as SEQUENCE
    ● GTM Proxy, integrates server-local transaction requirement for performance
    (GTM and GTM Proxy: different binaries)
  • Data Distribution: Distribution Strategies
  • Distributing the data
    ● Replicated table
    ● Each row in the table is replicated to the datanodes
    ● Statement based replication
    ● Distributed table
    ● Each row of the table is stored on one datanode, decided by one of following strategies
        – Hash
        – Round Robin
        – Modulo
        – Range and user defined function (future)
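The placement rules on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the datanode names, the function names, and the use of CRC32 as the hash are all hypothetical, and the slides do not show which hash function Postgres-XC actually uses.

```python
import zlib

NODES = ["dn1", "dn2", "dn3"]  # hypothetical datanode names


def place_replicated(row):
    """Replicated table: every datanode stores the row."""
    return list(NODES)


def place_hash(row, col):
    """Hash distribution: a stable hash of the column value picks one node.

    CRC32 is only a stand-in; it is not Postgres-XC's hash function.
    """
    digest = zlib.crc32(str(row[col]).encode("utf-8"))
    return [NODES[digest % len(NODES)]]


def place_modulo(row, col):
    """Modulo distribution: integer column value modulo the node count."""
    return [NODES[row[col] % len(NODES)]]


def make_round_robin():
    """Round robin: rows go to the nodes in turn, whatever they contain."""
    position = 0

    def place(row):
        nonlocal position
        node = NODES[position % len(NODES)]
        position += 1
        return [node]

    return place


if __name__ == "__main__":
    row = {"id": 7, "name": "example"}
    print("replicated :", place_replicated(row))
    print("hash       :", place_hash(row, "id"))
    print("modulo     :", place_modulo(row, "id"))
    round_robin = make_round_robin()
    print("round robin:", [round_robin(row)[0] for _ in range(4)])
```

Hash and modulo placement depend only on the distribution column, so the coordinator can route a lookup on that column to a single datanode. Round robin spreads rows evenly but gives the coordinator no way to locate a row by value.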
  • Table Distribution and Replication
    ● Each table can be distributed or replicated
    ● Strategy based on usage
        – Transaction tables → Distributed
        – Static lookup tables → Replicate
        – Distribute parent-children together
    ● Join pushdown when possible
    ● Where clause pushdown
    ● Simple parallel aggregates
  • Defining Tables
    ● Table Distribution/Replication
    ● CREATE TABLE tab (…) DISTRIBUTE BY HASH(col) | MODULO(col) | ROUND ROBIN | REPLICATION
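To make the syntax on this slide concrete, here is a small Python helper that only builds the statement text. The clause spelling follows the slide; the helper names and the `orders` and `countries` tables are hypothetical, and nothing here talks to a database.

```python
def distribute_clause(strategy, column=None):
    """Return the DISTRIBUTE BY clause in the form shown on the slide."""
    strategy = strategy.upper()
    if strategy in ("HASH", "MODULO"):
        if not column:
            raise ValueError(f"{strategy} distribution needs a column")
        return f"DISTRIBUTE BY {strategy}({column})"
    if strategy in ("ROUND ROBIN", "REPLICATION"):
        return f"DISTRIBUTE BY {strategy}"
    raise ValueError(f"unknown distribution strategy: {strategy}")


def create_table(name, columns_sql, strategy, column=None):
    """Build a CREATE TABLE statement with a distribution clause."""
    clause = distribute_clause(strategy, column)
    return f"CREATE TABLE {name} ({columns_sql}) {clause};"


if __name__ == "__main__":
    # Hypothetical transaction table: distributed by hash on its key.
    print(create_table("orders", "order_id int, customer_id int",
                       "hash", "customer_id"))
    # Hypothetical static lookup table: replicated to the datanodes.
    print(create_table("countries", "code char(2), name text",
                       "replication"))
```

The two example tables mirror the guidance from the previous slide: transaction tables are distributed, static lookup tables are replicated.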
  • Replicated Tables
    (Diagram: a read goes to one datanode; a write goes to every datanode. Each datanode holds the same rows of (val, val2): (1, 2), (2, 10), (3, 4))
  • Distributed Tables
    (Diagram: a write goes to one datanode; a read goes to all datanodes and the results pass through a combiner. Each datanode holds different rows of (val, val2): the first (1, 2), (2, 10), (3, 4); the second (11, 21), (21, 101), (31, 41); the third (10, 20), (20, 100), (30, 40))
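The read and write paths in these two diagrams can be mimicked with plain Python lists. This is a toy model, not Postgres-XC code: the class names and node names are invented, and the distributed table uses modulo placement purely for simplicity.

```python
class ReplicatedTable:
    """Toy replicated table: writes go everywhere, a read needs one node."""

    def __init__(self, node_names):
        self.nodes = {name: [] for name in node_names}

    def write(self, row):
        for rows in self.nodes.values():
            rows.append(row)

    def read(self, node_name):
        # Any single datanode can answer, because all hold the same rows.
        return list(self.nodes[node_name])


class DistributedTable:
    """Toy distributed table: a write hits one node, a read combines all."""

    def __init__(self, node_names, key):
        self.names = list(node_names)
        self.key = key
        self.nodes = {name: [] for name in self.names}

    def write(self, row):
        # Modulo placement on the key column (an assumption of this sketch).
        target = self.names[row[self.key] % len(self.names)]
        self.nodes[target].append(row)
        return target

    def read(self):
        # The coordinator acts as the combiner: it merges per-node results.
        combined = []
        for name in self.names:
            combined.extend(self.nodes[name])
        return combined


if __name__ == "__main__":
    names = ["dn1", "dn2", "dn3"]
    rows = [{"val": 1, "val2": 2}, {"val": 2, "val2": 10}, {"val": 3, "val2": 4}]

    replicated = ReplicatedTable(names)
    distributed = DistributedTable(names, key="val")
    for row in rows:
        replicated.write(row)
        print("distributed write went to", distributed.write(row))

    print("replicated read from dn2:", replicated.read("dn2"))
    print("distributed read (combined):", distributed.read())
```

The trade-off is visible in the method bodies: a replicated write costs one operation per datanode while its read costs one, and a distributed write costs one operation while a full read touches every datanode.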
  • Join Pushdown
    (Table: which joins can be pushed down, by the distribution of the two tables)
    ● Hash/Modulo distributed joined with:
        – Hash/Modulo distributed: inner join with equality condition on the distribution column with same data type and same distribution strategy
        – Round Robin: no
        – Replicated: inner join if the replicated table's distribution list is a superset of the distributed table's distribution list
    ● Round Robin joined with:
        – Hash/Modulo distributed: no
        – Round Robin: no
        – Replicated: inner join if the replicated table's distribution list is a superset of the distributed table's distribution list
    ● Replicated joined with:
        – Hash/Modulo distributed: inner join if the replicated table's distribution list is a superset of the distributed table's distribution list
        – Round Robin: inner join if the replicated table's distribution list is a superset of the distributed table's distribution list
        – Replicated: all kinds of joins
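The pushdown matrix can be read as a small decision function. The sketch below encodes only what the slide states; the `Table` type, the parameter names, and the reading of "distribution list" as the set of datanodes holding the table are assumptions of this illustration, and the real planner may check more conditions.

```python
from dataclasses import dataclass
from typing import Optional

KEYED = ("hash", "modulo")


@dataclass(frozen=True)
class Table:
    strategy: str                        # "hash", "modulo", "round_robin", "replicated"
    nodes: frozenset                     # datanodes holding the table
    dist_col_type: Optional[str] = None  # type of the distribution column, if any


def can_push_down(left, right, join_type, equi_join_on_dist_cols=False):
    """Return True if the slide's matrix allows pushing the join down."""
    left_rep = left.strategy == "replicated"
    right_rep = right.strategy == "replicated"

    if left_rep and right_rep:
        return True  # slide: all kinds of joins

    if left_rep or right_rep:
        replicated, distributed = (left, right) if left_rep else (right, left)
        # Inner join only, and the replicated copy must exist on every
        # datanode that holds part of the distributed table.
        return join_type == "inner" and replicated.nodes >= distributed.nodes

    # Both tables are distributed; round robin on either side rules it out.
    return (
        left.strategy in KEYED
        and left.strategy == right.strategy
        and left.dist_col_type == right.dist_col_type
        and join_type == "inner"
        and equi_join_on_dist_cols
    )


if __name__ == "__main__":
    all_nodes = frozenset({"dn1", "dn2"})
    orders = Table("hash", all_nodes, "int4")
    items = Table("hash", all_nodes, "int4")
    lookup = Table("replicated", all_nodes)
    print(can_push_down(orders, items, "inner", equi_join_on_dist_cols=True))
    print(can_push_down(orders, lookup, "inner"))
    print(can_push_down(orders, lookup, "left"))
```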
  • Constraints
    ● XC does not support Global constraints
        – i.e. constraints across datanodes
    ● Constraints within a datanode are supported
    (Table: constraint support by distribution strategy)
    ● Replicated
        – Unique, primary key constraints: supported
        – Foreign key constraints: supported if the referenced table is also replicated on the same nodes
    ● Hash/Modulo distributed
        – Unique, primary key constraints: supported if the primary OR unique key is the distribution key
        – Foreign key constraints: supported if the referenced table is replicated on the same nodes OR it is distributed by primary key in the same manner and on the same nodes
    ● Round Robin
        – Unique, primary key constraints: not supported
        – Foreign key constraints: supported if the referenced table is replicated on the same nodes
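The constraint rules on this slide reduce to two short checks. The function and parameter names below are hypothetical; the logic is a direct transcription of the table and says nothing about how Postgres-XC enforces it internally.

```python
STRATEGIES = ("replicated", "hash", "modulo", "round_robin")


def unique_supported(strategy, key_is_distribution_key=False):
    """Can a unique or primary key constraint be enforced on this table?"""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    if strategy == "replicated":
        return True
    if strategy in ("hash", "modulo"):
        # All rows with the same key value land on the same datanode,
        # so a local check on that node is enough.
        return key_is_distribution_key
    return False  # round robin: equal keys may land on different nodes


def foreign_key_supported(strategy,
                          referenced_replicated_on_same_nodes,
                          referenced_distributed_alike=False):
    """Can a foreign key on this table be checked within one datanode?"""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    if referenced_replicated_on_same_nodes:
        return True
    # Only hash/modulo tables can rely on the referenced table being
    # distributed by its primary key in the same manner on the same nodes.
    return strategy in ("hash", "modulo") and referenced_distributed_alike


if __name__ == "__main__":
    print(unique_supported("hash", key_is_distribution_key=True))
    print(unique_supported("round_robin"))
    print(foreign_key_supported("round_robin", True))
```

Both checks express the same idea as the slide's first bullets: a constraint is supported only when it can be verified entirely inside one datanode.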
  • Demo
  • Transaction Management
    Why MVCC is Important for Consistency
    Global Transaction Manager
  • Multi-version Concurrency Control (MVCC) (quick overview)
    ● Readers do not block writers
    ● Writers do not block readers
    ● Transaction IDs (XIDs)
    ● Every transaction gets an ID
    ● Snapshots contain a list of running XIDs
  • Multi-version Concurrency Control (MVCC) (quickly discussed)
    Example:
        T1 Begin...
        T2 Begin; INSERT...; Commit
        T3 Begin...
        T4 Begin; SELECT
    ● T4's snapshot contains T1 and T3
    ● T2 already committed
    ● It can see T2's commits, but not T1's nor T3's
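The T1 to T4 example can be replayed with a tiny visibility check. This is a deliberately simplified model of snapshot visibility, written for illustration: the `Snapshot` type and `is_visible` function are hypothetical, and PostgreSQL's real rules also cover a transaction's own writes, subtransactions, and more.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    xmax: int            # first XID not yet assigned when the snapshot was taken
    running: frozenset   # XIDs still in progress at that moment


def is_visible(row_xid, snapshot, committed):
    """Simplified rule: a row is visible if its writer committed
    before the snapshot was taken."""
    if row_xid >= snapshot.xmax:
        return False          # writer started after the snapshot
    if row_xid in snapshot.running:
        return False          # writer was still running at snapshot time
    return row_xid in committed


if __name__ == "__main__":
    T1, T2, T3, T4 = 1, 2, 3, 4
    committed = {T2}                      # only T2 has committed
    t4_snapshot = Snapshot(xmax=T4, running=frozenset({T1, T3}))

    for xid in (T1, T2, T3):
        print(f"rows written by T{xid} visible to T4:",
              is_visible(xid, t4_snapshot, committed))
```

T4 sees T2's rows because T2 is neither in the snapshot's running list nor newer than the snapshot. T1 and T3 stay invisible even if they commit later, since the snapshot already recorded them as running.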
  • Multi-version Concurrency Control (MVCC) on 2 Independent Nodes
    Example:
        T1 Begin...
        T2 Begin; INSERT..; Commit;
        T3 Begin...
        T4 Begin; SELECT
    ● Node 1: T2 Commit, T4 SELECT
    ● Node 2: T4 SELECT, T2 Commit
    ● T4's SELECT statement returns inconsistent data
    ● Includes data from Node 1, but not Node 2.
    ● C in ACID Fails
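The anomaly on this slide can be reproduced with two toy nodes that each decide visibility from their own local commit state. The `Node` class and `run_scenario` function are hypothetical and model nothing beyond the ordering of events listed above.

```python
class Node:
    """Toy independent node: visibility depends only on local commits."""

    def __init__(self, name):
        self.name = name
        self.rows = []          # list of (writer_xid, value)
        self.committed = set()  # XIDs committed on this node

    def insert(self, xid, value):
        self.rows.append((xid, value))

    def commit(self, xid):
        self.committed.add(xid)

    def select(self):
        return [value for xid, value in self.rows if xid in self.committed]


def run_scenario():
    """Replay the slide: T2 inserts on both nodes, T4 reads both nodes."""
    T2 = 2
    node1, node2 = Node("node1"), Node("node2")
    node1.insert(T2, "T2 row on node1")
    node2.insert(T2, "T2 row on node2")

    # Node 1: T2 commits first, then T4's SELECT arrives.
    node1.commit(T2)
    seen_on_node1 = node1.select()

    # Node 2: T4's SELECT arrives first, then T2 commits.
    seen_on_node2 = node2.select()
    node2.commit(T2)

    return seen_on_node1, seen_on_node2


if __name__ == "__main__":
    seen_on_node1, seen_on_node2 = run_scenario()
    print("T4 sees on node 1:", seen_on_node1)
    print("T4 sees on node 2:", seen_on_node2)
```

T4 ends up with half of T2's insert: the row from node 1 but not the row from node 2. No single-node database could return that result, which is the consistency failure the slide describes.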
  • Global Transaction Manager (GTM)
    ● Provides Global Transaction Consistency
    (Diagram: GTM and cluster nodes; the items exchanged are XID, Snapshot, Timestamp, Sequence values)
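A global source of XIDs and snapshots removes the per-node timing dependence. The sketch below is a minimal model under stated assumptions: the `GTM` and `Node` classes and their methods are invented for illustration, and a real GTM also hands out timestamps and sequence values, which are omitted here.

```python
class GTM:
    """Toy Global Transaction Manager: issues GXIDs and global snapshots."""

    def __init__(self):
        self.next_gxid = 1
        self.running = set()

    def begin(self):
        gxid = self.next_gxid
        self.next_gxid += 1
        self.running.add(gxid)
        return gxid

    def end(self, gxid):
        # Called once the transaction has finished on all nodes.
        self.running.discard(gxid)

    def snapshot(self):
        return self.next_gxid, frozenset(self.running)


class Node:
    """Toy datanode: visibility is decided by the snapshot it is given."""

    def __init__(self):
        self.rows = []          # list of (writer_gxid, value)
        self.committed = set()

    def insert(self, gxid, value):
        self.rows.append((gxid, value))

    def commit(self, gxid):
        self.committed.add(gxid)

    def select(self, snapshot):
        xmax, running = snapshot
        return [value for gxid, value in self.rows
                if gxid < xmax and gxid not in running
                and gxid in self.committed]


def run_scenario():
    gtm, node1, node2 = GTM(), Node(), Node()
    t1, t2, t3, t4 = (gtm.begin() for _ in range(4))
    node1.insert(t2, "T2 row on node1")
    node2.insert(t2, "T2 row on node2")

    # T4 takes ONE snapshot from the GTM while T2 is still globally running.
    t4_snapshot = gtm.snapshot()

    node1.commit(t2)                       # node 1 commits before T4 reads
    early = (node1.select(t4_snapshot), node2.select(t4_snapshot))
    node2.commit(t2)                       # node 2 commits after T4 reads
    gtm.end(t2)

    # A snapshot taken after T2 finished everywhere sees both rows.
    later_snapshot = gtm.snapshot()
    late = (node1.select(later_snapshot), node2.select(later_snapshot))
    return early, late


if __name__ == "__main__":
    early, late = run_scenario()
    print("T4's snapshot:", early)
    print("later snapshot:", late)
```

With the shared snapshot, T4 sees none of T2's rows on either node, even though node 1 had already committed locally. A later snapshot sees all of them, so every reader gets an all-or-nothing view of T2.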
  • Transaction Management
    ● 2PC is used to guarantee transactional consistency across nodes
    ● When more than one node is involved OR
    ● When there are explicit 2PC transactions
    ● Only those nodes where write activity has happened participate in 2PC
    ● In PostgreSQL, 2PC cannot be applied if temporary tables are involved. The same restriction applies in Postgres-XC
    ● When a single coordinator command needs multiple datanode commands, we encase those in a transaction block
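The commit rule on this slide (two-phase commit only when more than one node was written) can be sketched as follows. The `DataNode` class, the `commit_transaction` function, and the `can_prepare` failure switch are hypothetical; the logged strings are ordinary PostgreSQL commands (`PREPARE TRANSACTION`, `COMMIT PREPARED`, `ROLLBACK PREPARED`), but the sketch only records them and never executes SQL.

```python
class DataNode:
    """Toy datanode that records the commands the coordinator sends."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare  # set False to simulate a failed prepare
        self.log = []

    def run(self, command):
        self.log.append(command)


def commit_transaction(gid, written_nodes):
    """Commit on the nodes that saw writes, using 2PC only when needed."""
    if not written_nodes:
        return "read-only"

    if len(written_nodes) == 1:
        # A single writer cannot disagree with itself: plain commit.
        written_nodes[0].run("COMMIT")
        return "one-phase"

    # Phase 1: ask every written node to prepare; stop at the first failure.
    prepared = []
    for node in written_nodes:
        if not node.can_prepare:
            break
        node.run(f"PREPARE TRANSACTION '{gid}'")
        prepared.append(node)

    if len(prepared) < len(written_nodes):
        for node in written_nodes:
            if node in prepared:
                node.run(f"ROLLBACK PREPARED '{gid}'")
            else:
                node.run("ROLLBACK")
        return "aborted"

    # Phase 2: every node prepared, so the commit can no longer fail halfway.
    for node in written_nodes:
        node.run(f"COMMIT PREPARED '{gid}'")
    return "two-phase"


if __name__ == "__main__":
    dn1, dn2 = DataNode("dn1"), DataNode("dn2")
    print(commit_transaction("tx1", [dn1, dn2]), dn1.log, dn2.log)
    dn3 = DataNode("dn3")
    print(commit_transaction("tx2", [dn3]), dn3.log)
```

Nodes that were only read never appear in `written_nodes`, which matches the slide's point that only nodes with write activity participate in 2PC.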
  • Postgres-XC Considerations
  • Can GTM be a Performance Bottleneck?
    • Depending on implementation
        – Current Implementation
          (Diagram labels: Coordinators, Coordinator Backend, GTM Client Library, GTM, GTM Threads, GTM Main Thread, Snapshot Data, Lock, Create, Terminate)
        – Applicable up to five PG-XC servers (DBT-1)
        – Large snapshot size and number
        – Too many interactions between GTM and Coordinators
    July 12th, 2012
  • Can GTM be a Performance Bottleneck?
    Proxy Implementation
    (Diagram labels: Coordinators, Coordinator Backend, GTM Client Library, GTM Proxy Thread, Proxy Main Thread, GTM, GTM Worker Threads, GTM Main Thread, Snapshot Data, Lock, Unix Domain Socket, Connection Assignment, Create, Terminate)
    • Request/Response grouping
    • Single representative snapshot applied to multiple transactions
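The two proxy ideas on this slide, request grouping and one representative snapshot shared by several transactions, can be sketched like this. The classes and method names are invented for illustration, and the XIDs in the example are arbitrary; the real GTM Proxy protocol is not shown in the slides.

```python
class GTM:
    """Toy GTM that counts how many snapshot requests it has to serve."""

    def __init__(self, running):
        self.running = set(running)
        self.requests_served = 0

    def get_snapshot(self):
        self.requests_served += 1
        return frozenset(self.running)


class GTMProxy:
    """Toy proxy: queues backend requests and answers them in one batch."""

    def __init__(self, gtm):
        self.gtm = gtm
        self.pending = []

    def request_snapshot(self, backend_id):
        self.pending.append(backend_id)

    def flush(self):
        if not self.pending:
            return {}
        # One round trip to the GTM, one snapshot shared by every waiter.
        snapshot = self.gtm.get_snapshot()
        replies = {backend_id: snapshot for backend_id in self.pending}
        self.pending = []
        return replies


if __name__ == "__main__":
    gtm = GTM(running={101, 103})   # arbitrary example XIDs
    proxy = GTMProxy(gtm)
    for backend_id in ("backend-1", "backend-2", "backend-3"):
        proxy.request_snapshot(backend_id)
    replies = proxy.flush()
    print("backends answered:", len(replies))
    print("GTM round trips:  ", gtm.requests_served)
```

Three backends are served with a single GTM request, which addresses both problems listed on the previous slide: fewer interactions and fewer snapshots to build and ship.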
  • Can GTM be a SPOF?
    • Implement GTM Standby
      (Diagram: GTM Master sends a checkpoint of the next starting point, GXID and Sequence, to the GTM Standby)
    The standby can take over from the master without referring to the GTM master's information.
  • Parallel Query
    ● OK for simple queries
    ● Also when all joins can be pushed down
        – Star schema with replicated dimensions
    ● Even aggregates
    ● SELECT SUM(col1) FROM tab1
    ● Performs poorly if a cross-node join is needed
    ● Data on one node needs to join with another
    ● Ships all data to the coordinator for joining
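The `SELECT SUM(col1) FROM tab1` case works well because each datanode can reduce its own rows to a single number before anything crosses the network. A minimal sketch, with hypothetical function names and made-up sample rows:

```python
def partial_sum(rows, col):
    """Runs on a datanode: reduce the local rows to one partial result."""
    return sum(row[col] for row in rows)


def coordinator_sum(datanodes, col):
    """Runs on the coordinator: combine one partial result per datanode."""
    return sum(partial_sum(rows, col) for rows in datanodes.values())


if __name__ == "__main__":
    # Hypothetical contents of tab1, spread over three datanodes.
    tab1 = {
        "dn1": [{"col1": 1}, {"col1": 2}, {"col1": 3}],
        "dn2": [{"col1": 11}, {"col1": 21}, {"col1": 31}],
        "dn3": [{"col1": 10}, {"col1": 20}, {"col1": 30}],
    }
    print("SUM(col1) =", coordinator_sum(tab1, "col1"))
    print("values shipped to coordinator:", len(tab1))
```

Only one value per datanode reaches the coordinator. A cross-node join has no such shortcut: as the slide notes, all the rows are shipped to the coordinator, which is why it performs poorly.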
  • High Availability
    ● GTM-standby provides basic HA
    ● No native HA for nodes
    ● Use HA middleware such as Pacemaker
    ● Each data node should be configured with synchronous replication
  • Status
    Settings and options
  • Present Status
    ● Project/Developer site
    ● http://postgres-xc.sourceforge.net/
    ● http://sourceforge.net/projects/postgres-xc/
    ● Version 1.0 available
    ● Base PostgreSQL version: 9.1
    ● Soon, PostgreSQL 9.2!
        – Group commit: even more write scalability
        – “Index-only Scans”
    ● Get Involved
    ● Even as just a tester
  • Easy way of trying it out?
    ● www.stormdb.com
    ● Not Postgres-XC, but similar
    ● Nothing to install, cloud hosted
    ● Free beta
  • Thank You
    mason@stormdb.com
    Twitter: mason_db