1. Scaling out by distributing and replicating data in Postgres-XC
   Ashutosh Bapat
   @Postgres Open 2012
2. Agenda
   ● What is Postgres-XC
   ● Postgres-XC architecture overview
   ● Data distribution in XC
   ● Effect of data distribution on performance
   ● Example DBT-1 schema
3. What is Postgres-XC
   ● Shared Nothing Cluster
     – Multiple collaborating PostgreSQL-like servers
     – No resources shared
     – Scaling by adding commodity hardware
   ● Write-scalable
     – Write/read scalable by adding nodes
     – Multiple nodes where writes can be issued
   ● Synchronous
     – Writes to one node are reflected on all the nodes
   ● Transparent
     – Applications need not care about the data distribution
4. Postgres-XC architecture
   [Diagram: applications send SQL statements over the SQL + libpq interface to the coordinators; coordinators fan work out to the datanodes; both coordinators and datanodes can be added to the cluster; the GTM supplies transaction information to all nodes.]
5. Distribution strategies
   ● Replicated tables
     – Each row of the table is stored on all the datanodes where the table is replicated
   ● Distributed tables
     – Each row exists only on a single datanode
     – Distribution strategies:
       ● HASH
       ● MODULO
       ● ROUNDROBIN
       ● User-defined functions (TBD)
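A minimal sketch of the DDL behind these strategies, using Postgres-XC's DISTRIBUTE BY clause; the table and column names are illustrative, not part of the talk:

    -- Replicated: every datanode holds a full copy of the table.
    CREATE TABLE item_repl (id int PRIMARY KEY, title text)
        DISTRIBUTE BY REPLICATION;

    -- Hash-distributed: each row is placed by hashing the distribution column.
    CREATE TABLE orders_hash (order_id int, customer_id int, total numeric)
        DISTRIBUTE BY HASH (customer_id);

    -- Modulo-distributed: placement by the column value modulo the node count.
    CREATE TABLE events_mod (event_id int, payload text)
        DISTRIBUTE BY MODULO (event_id);

    -- Round-robin: rows are spread evenly; there is no distribution column.
    CREATE TABLE audit_rr (logged_at timestamptz, line text)
        DISTRIBUTE BY ROUNDROBIN;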
6. Replicated Table
   [Diagram: a write is applied to every copy of the table, leaving identical rows (val, val2) on all three datanodes; a read is served from a single copy.]
7. Replicated Tables
   ● Statement-level replication
   ● Each write needs to be replicated
     – Writes are costly
   ● Reads can happen on any node (where the table is replicated)
     – Reads from different coordinators can be routed to different nodes
   ● Useful for relatively static tables with a high read load
8. Distributed Tables
   [Diagram: each datanode holds a disjoint subset of the rows (val, val2); a read fans out to all datanodes and a combiner merges the results; a write goes only to the datanode holding the affected row.]
9. Distributed Tables
   ● A write to a single row is applied only on the node where the row resides
     – Multiple rows can be written in parallel
   ● Scanning rows spanning across the nodes (e.g. table scans) can hamper performance
   ● Point reads and writes based on the distribution column value show good performance
     – The datanode where the operation happens can be identified from the distribution column value
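A hedged illustration using the hypothetical orders_hash table sketched under slide 5, distributed by HASH(customer_id):

    -- Point read: the predicate pins the distribution column, so the
    -- coordinator can route the query to exactly one datanode.
    SELECT order_id, total FROM orders_hash WHERE customer_id = 4242;

    -- Point write: likewise touches only the datanode owning customer 4242.
    UPDATE orders_hash SET total = total * 0.9 WHERE customer_id = 4242;

    -- No distribution-column predicate: every datanode must be scanned
    -- and the results combined on the coordinator.
    SELECT count(*) FROM orders_hash WHERE total > 100;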
10. Distributed query processing in Postgres-XC
11. Distributed query processing in Postgres-XC
    ● Coordinator
      – Accepts queries and plans them
      – Finds the right datanodes from where to fetch the data
      – Frames the queries to be sent to these datanodes
      – Gathers the data from the datanodes
      – Processes it to get the desired result
    ● Datanode
      – Executes queries from the coordinator, like PostgreSQL
      – Has the same capabilities as PostgreSQL
12. Query processing balance
    ● The coordinator tries to delegate maximum query processing to the datanodes
      – Indexes are located on the datanodes
      – Materialization of huge results is avoided in the case of sorting, aggregation, grouping, JOINs etc.
      – The coordinator is freed to handle a large number of connections
    ● Distributing data wisely helps the coordinator delegate maximum query processing and improves performance
    ● Delegation is often termed "shipping"
13. SQL prompt
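The deck presumably switched to a live demo here. A sketch of the kind of session that fits, assuming the illustrative tables defined above; EXPLAIN is used to check how much of a query is shipped (exact plan output differs across XC versions, so none is reproduced):

    -- A fully shippable query: aggregation grouped on the distribution
    -- column can be pushed down to the datanodes, since no group spans nodes.
    EXPLAIN VERBOSE
    SELECT customer_id, sum(total)
    FROM orders_hash
    GROUP BY customer_id;

    -- A join on non-distribution columns: the coordinator may have to
    -- fetch rows from several datanodes and join them itself.
    EXPLAIN VERBOSE
    SELECT o.order_id, e.payload
    FROM orders_hash o JOIN events_mod e ON o.order_id = e.event_id;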
14. Deciding the right distribution strategy
15. Read-write load on tables
    ● High point reads (based on the distribution column)
      – Distributed or replicated
    ● High read activity but no frequent writes
      – Better be replicated
    ● High point writes
      – Better be distributed
    ● High insert load, but no frequent update/delete/read
      – Better be round-robin
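Read as DDL, these rules might map onto hypothetical tables like this (names are illustrative only):

    -- High point reads and writes keyed by one column: hash-distribute on it.
    CREATE TABLE session_state (session_id text, state text)
        DISTRIBUTE BY HASH (session_id);

    -- Read-mostly reference data: replicate it.
    CREATE TABLE country (co_id int, co_name text)
        DISTRIBUTE BY REPLICATION;

    -- Insert-heavy, rarely read or updated: round-robin spreads the insert load.
    CREATE TABLE access_log (logged_at timestamptz, line text)
        DISTRIBUTE BY ROUNDROBIN;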
16. Query analysis (frequently occurring queries)
    ● Find the relations/columns participating in equi-JOIN conditions, WHERE clauses etc.
      – Distribute on those columns
    ● Find the columns participating in GROUP BY and DISTINCT clauses
      – Distribute on those columns
    ● Find the columns/tables which are part of primary key and foreign key constraints
      – Global constraints are not yet supported in XC
      – Distribute on those columns
17. Thumb rules
    ● Infrequently written tables participating in JOINs with many other tables (dimension tables)
      – Replicated table
    ● Frequently written tables participating in JOINs with replicated tables
      – Distributed table
    ● Frequently written tables participating in JOINs with each other, with equi-JOINing columns of the same data type
      – Distribute both of them on the same nodes, by the columns participating in the JOIN (see the sketch below)
    ● Referenced tables
      – Better be replicated
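A sketch of that co-distribution rule with two hypothetical write-heavy tables; because both are hashed on the join column over the same nodes, matching rows live together and the join is shippable:

    -- Hash both tables on the equi-JOIN column (same data type, same nodes).
    CREATE TABLE t1 (a int, payload text) DISTRIBUTE BY HASH (a);
    CREATE TABLE t2 (a int, detail  text) DISTRIBUTE BY HASH (a);

    -- Matching rows of t1 and t2 land on the same datanode, so this
    -- equi-JOIN can be evaluated entirely on the datanodes.
    SELECT * FROM t1 JOIN t2 USING (a);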
18. DBT-1 schema
    [ER diagram of the DBT-1 (TPC-W) schema showing the tables CUSTOMER, ADDRESS, ORDERS, ORDER_LINE, CC_XACTS, ITEM, AUTHOR, STOCK, SHOPPING_CART, SHOPPING_CART_LINE and COUNTRY, each annotated as "distributed with customer ID", "distributed with item ID", "distributed with shopping cart ID", or "replicated".]
19. Example DBT-1 (1)
    ● author, item
      – Less frequently written
      – Frequently read from
      – author and item are frequently JOINed
    ● Dimension tables
      – Hence replicated on all nodes
20. Example DBT-1 (2)
    ● customer, address, orders, order_line, cc_xacts
      – Frequently written
        ● Hence distributed
      – Participate in JOINs amongst each other with customer_id as the JOIN key
      – Point SELECTs based on customer_id
        ● Hence distributed by hash on customer_id, so that the JOINs are shippable (see the sketch below)
      – Participate in JOINs with item
        ● Having item replicated helps push the JOINs to the datanodes
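A hedged sketch of the DDL this implies (column lists abbreviated; the real DBT-1 tables have many more columns):

    -- Write-heavy, customer-keyed tables: co-distributed by hash on the
    -- customer id so the joins amongst them ship to the datanodes.
    CREATE TABLE customer   (c_id  int, c_uname text /* ... */)
        DISTRIBUTE BY HASH (c_id);
    CREATE TABLE orders     (o_id  int, o_c_id  int  /* ... */)
        DISTRIBUTE BY HASH (o_c_id);
    CREATE TABLE order_line (ol_id int, ol_c_id int  /* ... */)
        DISTRIBUTE BY HASH (ol_c_id);

    -- Read-mostly dimension tables: replicated, so joins against them
    -- can also be evaluated on the datanodes.
    CREATE TABLE item   (i_id int, i_title text /* ... */)
        DISTRIBUTE BY REPLICATION;
    CREATE TABLE author (a_id int, a_fname text /* ... */)
        DISTRIBUTE BY REPLICATION;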
21. Example DBT-1 (3)
    ● shopping_cart, shopping_cart_line
      – Frequently written
        ● Hence distributed
      – Point selects based on the column shopping_cart_id
        ● Hence distributed by hash on shopping_cart_id
      – JOINs with item
        ● Having item replicated helps
22. DBT-1 scale-up
    ● Old data; we will publish benchmarks for 1.0 soon
    ● DBT-1 (TPC-W) benchmark with some minor modifications to the schema
    ● 1 server = 1 coordinator + 1 datanode on the same machine
    ● The coordinator is CPU bound
    ● The datanode is I/O bound
23. Other scaling tips
24. Using GTM proxy
    ● GTM can be a bottleneck
      – All nodes get snapshots, transaction ids etc. from the GTM
    ● GTM-proxy helps reduce the load on the GTM
      – Runs on each physical server
      – Caches information about snapshots, transaction ids etc.
      – Serves the logical nodes on that server
25. Adding coordinators and datanodes
    ● Coordinator
      – Scaling connection load
      – Too much load on the coordinator
      – Query processing mostly happens on the coordinator
    ● Datanode
      – Data scalability
        ● The number of tables grows – new nodes for new tables/databases
        ● Distributed table sizes grow – new nodes provide space for additional data
      – Redundancy
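A sketch of registering a new node, assuming Postgres-XC's node management commands (the node name, host and port are illustrative):

    -- Register a new datanode with a coordinator ...
    CREATE NODE datanode_3 WITH (TYPE = 'datanode', HOST = '10.0.0.13', PORT = 15433);

    -- ... and have the coordinator's connection pooler pick up the change.
    SELECT pgxc_pool_reload();

Existing tables keep their original set of nodes; as the slide notes, new nodes serve new tables and databases unless data is redistributed.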
26. Impact of transaction management on performance
    ● 2PC is used when
      – More than one node performs a write in a transaction
      – Explicit 2PC is used
      – More than one node performs writes during a single statement
    ● Only the nodes performing writes participate in 2PC
    ● Design transactions such that they span as few nodes as possible (see the sketch below)
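A hedged illustration with the hypothetical co-distributed tables t1 and t2 from slide 17; whether 2PC fires depends on how many datanodes the transaction writes to:

    -- Both writes share the distribution value 7, so they land on the
    -- same datanode and the transaction can commit without 2PC.
    BEGIN;
    UPDATE t1 SET payload = 'x' WHERE a = 7;
    UPDATE t2 SET detail  = 'y' WHERE a = 7;
    COMMIT;

    -- These writes hash to (potentially) different datanodes, so the
    -- commit needs two-phase commit across all of them.
    BEGIN;
    UPDATE t1 SET payload = 'x' WHERE a = 7;
    UPDATE t1 SET payload = 'z' WHERE a = 8;
    COMMIT;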
27. DBT-2 (sneak peek)
    ● Like TPC-C
    ● Early results show 4.3 times scaling with 5 servers
      – More details to come ...
28. Thank you
    ashutosh.bapat@enterprisedb.com