Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

The Future of Postgres Sharding / Bruce Momjian (PostgreSQL)

1,782 views

Published on

Database sharding involves spreading database contents across multiple servers, with each server holding only part of the database. While it is possible to vertically scale Postgres, and to scale read-only workloads across multiple servers, only sharding allows multi-server read-write scaling. This presentation will cover the advantages of sharding and future Postgres sharding implementation requirements, including foreign data wrapper enhancements, parallelism, and global snapshot and transaction control. This is a followup to my Postgres Scaling Opportunities presentation.

Published in: Engineering
  • Be the first to comment

  • Be the first to like this

The Future of Postgres Sharding / Bruce Momjian (PostgreSQL)

  1. 1. The Future of Postgres Sharding BRUCE MOMJIAN This presentation will cover the advantages of sharding and future Postgres sharding implementation requirements. Creative Commons Attribution License http://momjian.us/presentations Last updated: October, 2015 1 / 22
  2. 2. Outline 1. Scaling 2. Vertical scaling options 3. Non-sharding horizontal scaling 4. Existing sharding options 5. Future sharding The Future of Postgres Sharding 2 / 22
  3. 3. Scaling Database scaling is the ability to increase database throughput by utilizing additional resources such as I/O, memory, CPU, or additional computers. However, the high concurrency and write requirements of database servers make scaling a challenge. Sometimes scaling is only possible with multiple sessions, while other options require data model adjustments or server configuration changes. Postgres Scaling Opportunities http://momjian.us/main/presentations/overview.html#scaling The Future of Postgres Sharding 3 / 22
  4. 4. Vertical Scaling Vertical scaling can improve performance on a single server by: ◮ Increasing I/O with ◮ faster storage ◮ tablespaces on storage devices ◮ striping (RAID 0) across storage devices ◮ Moving WAL to separate storage ◮ Adding memory to reduce read I/O requirements ◮ Adding more and faster CPUs The Future of Postgres Sharding 4 / 22
  5. 5. Non-sharding Horizontal Scaling Non-sharding horizontal scaling options include: ◮ Read scaling using Pgpool and streaming replication ◮ CPU/memory scaling with asynchronous multi-master The entire data set is stored on each server. The Future of Postgres Sharding 5 / 22
  6. 6. Why Use Sharding? ◮ Only sharding can reduce I/O, by splitting data across servers ◮ Sharding benefits are only possible with a shardable workload ◮ The shard key should be one that evenly spreads the data ◮ Changing the sharding layout can cause downtime ◮ Additional hosts reduce reliability; additional standby servers might be required The Future of Postgres Sharding 6 / 22
  7. 7. Typical Sharding Criteria ◮ List ◮ Range ◮ Hash The Future of Postgres Sharding 7 / 22
  8. 8. Existing Sharding Solutions ◮ Application-based sharding ◮ PL/Proxy ◮ Postgres-XC ◮ pg_shard ◮ Hadoop The data set is sharded (striped) across servers. The Future of Postgres Sharding 8 / 22
  9. 9. Sharding Using Foreign Data Wrappers (FDW) 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 SQL Queries SQL Queries PG FDW Foreign Server Foreign Server Foreign Server The Future of Postgres Sharding 9 / 22
  10. 10. FDW Sort/Join/Aggregate Pushdown 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 SQL Queries SQL Queries PG FDW with joins, sorts, aggregates Foreign Server Foreign Server Foreign Server The Future of Postgres Sharding 10 / 22
  11. 11. Advantages of FDW Sort/Join/Aggregate Pushdown ◮ Sort pushdown reduces CPU and memory overhead on the coordinator ◮ Join pushdown reduces coordinator join overhead, and reduces the number of rows transferred ◮ Aggregate pushdown causes summarized values to be passed back from the shards ◮ WHERE clause restrictions are already pushed down The Future of Postgres Sharding 11 / 22
  12. 12. Pushdown of Static Tables 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 SQL Queries SQL Queries with joins to static data PG FDW and static data restrictions Foreign S. Foreign S. Foreign S. static static static The Future of Postgres Sharding 12 / 22
  13. 13. Implementing Pushdown of Static Tables Static table pushdown allows join pushdown where the query restriction is on the static table and not on the shared key. Static tables can be pushed to shards using: ◮ Triggers with Slony-like distribution ◮ WAL logical decoding The optimizer must also know which tables are duplicated on the shards for proper join pushdown. The Future of Postgres Sharding 13 / 22
  14. 14. Query Routing and DDL to Proper Shards 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 SQL Queries SQL Queries PG FDW Foreign Server Foreign Server Foreign Server The Future of Postgres Sharding 14 / 22
  15. 15. Implementing Query Routing and DDL to Proper Shards ◮ Could use the existing partitioning system ◮ inheritance ◮ CHECK constraint ◮ constraint exclusion ◮ trigger for INSERT, UPDATE, and DELETE The Future of Postgres Sharding 15 / 22
  16. 16. Parallel FDW Access 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 00000000000000000000000 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 11111111111111111111111 SQL Queries PG FDW Parallel Queries Foreign Server Foreign Server Foreign Server The Future of Postgres Sharding 16 / 22
  17. 17. Advantages of Parallel FDW Access ◮ Can use libpq’s asynchronous API to issue multiple pending queries ◮ Ideal for queries that must run on every shard, e.g. ◮ restrictions on static tables ◮ queries with no sharded-key reference ◮ queries with multiple shared-key references ◮ Aggregates like AVG() must be computed on the controller by having shards return SUM() and COUNT() The Future of Postgres Sharding 17 / 22
  18. 18. Global Snapshot Manager 00000000000000000000000000000000000000000000 0000000000000000000000 0000000000000000000000 0000000000000000000000 00000000000000000000000000000000000000000000 11111111111111111111111111111111111111111111 1111111111111111111111 1111111111111111111111 1111111111111111111111 11111111111111111111111111111111111111111111 00000000000000000000000000000000000000000000 0000000000000000000000 0000000000000000000000 0000000000000000000000 00000000000000000000000000000000000000000000 11111111111111111111111111111111111111111111 1111111111111111111111 1111111111111111111111 1111111111111111111111 11111111111111111111111111111111111111111111 00000000000000000000000000000000000000000000 0000000000000000000000 0000000000000000000000 0000000000000000000000 00000000000000000000000000000000000000000000 11111111111111111111111111111111111111111111 1111111111111111111111 1111111111111111111111 1111111111111111111111 11111111111111111111111111111111111111111111 SQL Queries SQL Queries Manager Global Snapshot PG FDW Foreign Server Foreign Server Foreign Server The Future of Postgres Sharding 18 / 22
  19. 19. Implementing a Global Snapshot Manager ◮ We already support sharing snapshots among clients with pg_export_snapshot() ◮ We already support exporting snapshots to other servers with the GUC hot_standby_feedback The Future of Postgres Sharding 19 / 22
  20. 20. Global Transaction Manager 00000000000000000000000000000000000000000000 0000000000000000000000 0000000000000000000000 0000000000000000000000 00000000000000000000000000000000000000000000 11111111111111111111111111111111111111111111 1111111111111111111111 1111111111111111111111 1111111111111111111111 11111111111111111111111111111111111111111111 00000000000000000000000000000000000000000000 0000000000000000000000 0000000000000000000000 0000000000000000000000 00000000000000000000000000000000000000000000 11111111111111111111111111111111111111111111 1111111111111111111111 1111111111111111111111 1111111111111111111111 11111111111111111111111111111111111111111111 00000000000000000000000000000000000000000000 0000000000000000000000 0000000000000000000000 0000000000000000000000 00000000000000000000000000000000000000000000 11111111111111111111111111111111111111111111 1111111111111111111111 1111111111111111111111 1111111111111111111111 11111111111111111111111111111111111111111111 SQL Queries SQL Queries Foreign Server Manager Global Transaction PG FDW Foreign Server Foreign Server The Future of Postgres Sharding 20 / 22
  21. 21. Implementing a Global Transaction Manager ◮ Can use prepared transactions (two-phase commit) ◮ Transaction manager can be internal or external ◮ Can use an industry-standard protocol like XA The Future of Postgres Sharding 21 / 22
  22. 22. Conclusion http://momjian.us/presentations https://www.flickr.com/photos/anotherpintplease/ The Future of Postgres Sharding 22 / 22

×