Distributed Postgres with Citus / Will Leinweber (PostgreSQL)

HighLoad++ 2017

Cape Town hall, November 7, 15:00

Abstract:
http://www.highload.ru/2017/abstracts/3043.html

Citus is an open-source extension to Postgres that transforms it into a multi-node, distributed database. It allows you to horizontally scale out both the.

In this session you'll learn how Citus takes care of sharding, distributed transactions, and even masterless writes. You'll learn how to transition your database from single-node Postgres in order to scale up your database to bigger and bigger sizes as your data grows.

Published in: Engineering

  1. 1. Distributed Postgres with Citus Will Leinweber
  2. 2. Will Leinweber. Principal Cloud Engineer at Citus. Previously at Heroku Postgres. @leinweber, bitfission.com (warning: autoplays MIDI)
  3. 3. Developers Love Postgres. [Chart: Postgres, MySQL, MongoDB, SQL Server + Oracle. RDBMS: Postgres, MySQL, Microsoft SQL Server, Oracle]
  4. 4. Possible Paths: A. Start with SQL, need to scale out and migrate to NoSQL. B. Start with NoSQL, hope you actually need to scale out later. C. Start with SQL, need to scale out and stay with SQL?
  5. 5. What is Citus? 1. Scales out Postgres (using sharding & replication; query engine parallelizes SQL queries across many nodes). 2. Extension to Postgres (using Postgres extension APIs). 3. Available in 3 ways.
  6. 6. Citus, Packaged Three Ways: Open Source; Enterprise Software; Fully-Managed Database as a Service. github.com/citusdata/citus
  7. 7. Simplified Citus Architecture
  8. 8. (coordinator node)=# \d
         Schema |    Name
        --------+------------
         public | cw_metrics
         public | events

        (worker node)=# \d
         Schema |       Name
        --------+-------------------
         public | cw_metrics_102008
         public | cw_metrics_102012
         public | cw_metrics_102016
         public | cw_metrics_102064
         public | cw_metrics_102068
         public | events_102104
         public | events_102108
         public | events_102112
         public | events_102116
         ...
  9. 9. citus=> select * from pg_dist_shard limit 10;
         logicalrelid | shardid | shardminvalue | shardmaxvalue
        --------------+---------+---------------+---------------
                19395 |  102040 | -2147483648   | -2013265921
                19395 |  102041 | -2013265920   | -1879048193
                19395 |  102042 | -1879048192   | -1744830465
                19395 |  102043 | -1744830464   | -1610612737
                19395 |  102044 | -1610612736   | -1476395009
                19395 |  102045 | -1476395008   | -1342177281
                19395 |  102046 | -1342177280   | -1207959553
                19395 |  102047 | -1207959552   | -1073741825
                19395 |  102048 | -1073741824   | -939524097
                19395 |  102049 | -939524096    | -805306369
         ...
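The shard ranges in the `pg_dist_shard` output above are consistent with a signed 32-bit hash space cut into equal, contiguous slices. The following is a minimal sketch of that arithmetic, not Citus code: the shard count of 32 is inferred from the width of the ranges shown (2^27 values each), the names `shard_ranges` and `shard_index_for` are hypothetical, and the hash function Citus applies to the distribution column is not reproduced here.

```python
from bisect import bisect_right

HASH_MIN = -2**31    # lowest signed 32-bit value, matching the first shardminvalue
HASH_SPACE = 2**32   # number of distinct 32-bit hash values


def shard_ranges(shard_count):
    """Split the signed 32-bit hash space into equal, contiguous ranges.

    Assumes shard_count divides 2**32 evenly (for example, a power of two).
    """
    width = HASH_SPACE // shard_count
    ranges = []
    for i in range(shard_count):
        lo = HASH_MIN + i * width
        hi = lo + width - 1
        ranges.append((lo, hi))
    return ranges


def shard_index_for(hash_value, ranges):
    """Return the index of the range that contains hash_value."""
    mins = [lo for lo, _ in ranges]
    return bisect_right(mins, hash_value) - 1


ranges = shard_ranges(32)          # 32 is an inference from the slide, not a stated fact
print(ranges[0])                   # first row of the slide's output
print(ranges[9])                   # tenth row of the slide's output
print(shard_index_for(-1000000000, ranges))
```

Because the ranges are sorted and non-overlapping, routing a hashed value to its shard is a binary search over the minimum values.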
  10. 10. 3 Challenges Distributing Postgres: 1. Postgres and High Availability. 2. To build a new distributed database, or to fork? 3. Distributed transactions.
  11. 11. Postgres & High Availability (HA): Designing for a Cloud-native world
  12. 12. Why is High Availability hard? Postgres replication uses one primary & multiple secondary nodes. Two challenges: 1. Most Postgres clients aren’t smart. When the primary fails, they retry the same IP. 2. Postgres replicates entire state. This makes it resource intensive to reconstruct new nodes from a primary.
  13. 13. Database Failures Should Be Transparent
  14. 14. Database Failures Shouldn’t Be a Big Deal. 3 Methods for HA & Backups in Postgres: 1. Postgres streaming replication to replicate from primary to secondary. Back up to S3. 2. Volume-level replication to replicate to secondary’s volume. Back up to S3. 3. Incremental backups to S3. Reconstruct secondary nodes from S3.
  15. 15. Postgres - Streaming Replication (1). [Diagram: a primary and a secondary, each holding Table foo, Table bar, and WAL logs; write-ahead logs stream from primary to secondary; monitoring agents handle streaming replication setup and auto failover; a backup process writes to encrypted S3 / blob storage.]
  16. 16. Postgres – AWS RDS & Azure (2). [Diagram: a Postgres primary and a Postgres standby, each with a persistent volume holding Table foo, Table bar, and WAL logs; monitoring agents handle auto node failover; backup processes write to encrypted S3 / blob storage.]
  17. 17. Postgres – Reconstruct from WAL (3). [Diagram: a Postgres primary and a Postgres secondary, each with a persistent volume holding Table foo, Table bar, and WAL logs; monitoring agents handle auto node failover; backup processes connect both nodes to encrypted S3 / blob storage.]
  18. 18. How do these approaches compare?
         Approach                                        | Who does this?        | Primary benefits
        -------------------------------------------------+-----------------------+--------------------------------------------------------------
         Streaming Replication (local / ephemeral disk)  | On-prem, Manual EC2   | Simple to set up; Direct I/O: High I/O & large storage
         Disk Mirroring                                  | RDS, Azure Preview    | Works for MySQL and Postgres; Data durability in cloud environments
         Reconstruct from WAL                            | Heroku, Citus Data    | Enables Fork and PITR; Node reconstruction in background (Data durability in cloud environments)
  19. 19. wal-e: github.com/wal-e/wal-e, github.com/wal-g/wal-g
  20. 20. Summary • In Postgres, a database node’s state gets replicated in its entirety. The replication can be set up in three ways. • Reconstructing a secondary node from S3 makes bringing up or shutting down nodes easy. • When you shard your database, the state you need to replicate per node becomes smaller.
  21. 21. Postgres has a huge ecosystem. How do you keep up with it?
  22. 22. 3 ways to build a distributed database 1. Build a distributed database from scratch 2. Middleware sharding (mimic the parser) 3. Fork your favorite database (like Postgres)
  23. 23. Example Transaction Block
  24. 24. Postgres Features, Tools & Frameworks • Postgres manual (US Letter) • Clients for different programming languages • ORMs, libraries, GUIs • Tools (dump, restore, analyze) • New features
  25. 25. At First, Forked Postgres with Style
  26. 26. Two Stage Query Optimization 1. Plan to minimize network I/O 2. Nodes talk to each other using SQL over libpq 3. Learned to cooperate with planner / executor bit by bit (Volcano style executor)
  27. 27. Citus Architecture (Simplified). The coordinator (which holds the table metadata) receives SELECT avg(revenue) FROM sales. It sends SELECT sum(revenue), count(revenue) FROM table_1001 and SELECT sum … FROM table_1003 to worker node 1 (Table_1001, Table_1003), and SELECT sum … FROM table_1002 and SELECT sum … FROM table_1004 to worker node 2 (Table_1002, Table_1004), and so on up to worker node N. Each node: Postgres with Citus installed. 1 shard = 1 Postgres table.
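The architecture slide rewrites `avg(revenue)` into per-shard `sum(revenue), count(revenue)` queries. A rough illustration of why that rewrite is needed follows: averages cannot be combined directly, but sums and counts can. The shard contents are made-up numbers, and `worker_partial` and `coordinator_avg` are hypothetical names, not Citus functions.

```python
def worker_partial(rows):
    """What one per-shard query returns: (sum(revenue), count(revenue))."""
    return (sum(rows), len(rows))


def coordinator_avg(partials):
    """Merge per-shard (sum, count) pairs into one global average."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None


# Made-up revenue values per shard; the shard names follow the slide.
shards = {
    "table_1001": [10.0, 20.0],
    "table_1002": [30.0],
    "table_1003": [40.0, 50.0, 60.0],
    "table_1004": [],
}

partials = [worker_partial(rows) for rows in shards.values()]
print(coordinator_avg(partials))
```

Averaging the per-shard averages instead would weight a one-row shard the same as a three-row shard and give a different, incorrect result.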
  28. 28. Unfork Citus using Extension APIs CREATE EXTENSION citus; • System catalogs – Distributed metadata • Planner hook – Insert, Update, Delete, Select • Executor hook – Insert, Update, Delete, Select • Utility hook – Alter Table, Create Index, Vacuum, etc. • Transaction & resources handling – file descriptors, etc. • Background worker process – Maintenance processes (distributed deadlock detection, task tracker, etc.) • Logical decoding – Online data migrations
  29. 29. Postgres has transactions. How to handle distributed transactions?
  30. 30. BEGIN INSERT UPDATE SELECT COMMIT ROLLBACK
  31. 31. Consistency in Distributed Databases 1. 2PC: All participating nodes need to be up 2. Paxos: Achieves consensus with quorum 3. Raft: More understandable alternative to Paxos
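The first point on the consistency slide, that two-phase commit (2PC) needs every participating node to be up, can be sketched as below. This is a toy in-memory model under stated assumptions: `Participant` and `two_phase_commit` are hypothetical names, there is no networking, logging, or recovery, and it does not represent how Citus or Postgres implement prepared transactions.

```python
class Participant:
    """Toy stand-in for one node taking part in a distributed transaction."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.state = "idle"

    def prepare(self):
        # A node that is down cannot vote, which counts as a "no".
        if not self.up:
            return False
        self.state = "prepared"
        return True

    def commit(self):
        self.state = "committed"

    def rollback(self):
        if self.up:
            self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1: every participant must successfully prepare.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only on a unanimous yes, otherwise roll everyone back.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.rollback()
    return "aborted"


print(two_phase_commit([Participant("w1"), Participant("w2")]))
print(two_phase_commit([Participant("w1"), Participant("w2", up=False)]))
```

The second call aborts because one node is unreachable. That availability requirement is the trade-off against the quorum-based protocols listed next to 2PC on the slide.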
  32. 32. Concurrency in Distributed Databases
  33. 33. Locks Locks
  34. 34. What is a Lock? • Protects against concurrent modifications. • Locks are released at the end of a transaction. Deadlocks
  35. 35. Transactions Block on 1st Conflicting Lock. Session 1: BEGIN; UPDATE data SET y = 2 WHERE x = 1; <obtained lock on rows with x = 1> COMMIT; <all locks released> Session 2: BEGIN; UPDATE data SET y = 5 WHERE x = 1; <waiting for lock on rows with x = 1> <obtained lock on rows with x = 1> COMMIT;
  36. 36. Transactions and Concurrency • Transactions that don’t modify the same row can run concurrently. Transactions block on 1st lock that conflicts. Session 1: BEGIN; UPDATE data SET y = y - 1 WHERE x = 1; COMMIT; <all locks released> Session 2: BEGIN; UPDATE data SET y = y + 1 WHERE x = 2; UPDATE data SET y = y + 1 WHERE x = 1; <waiting for lock on rows with x = 1> <obtained lock on rows with x = 1> COMMIT;
  37. 37. (Distributed) deadlock! But what if they start blocking each other? Session 1: BEGIN; UPDATE data SET y = y - 1 WHERE x = 1; UPDATE data SET y = y + 1 WHERE x = 2; Session 2: BEGIN; UPDATE data SET y = y - 1 WHERE x = 2; UPDATE data SET y = y + 1 WHERE x = 1;
  38. 38. Deadlock detection in PostgreSQL: Deadlock detection builds a graph of processes that are waiting for each other.
  39. 39. Deadlock detection in PostgreSQL: Transactions are cancelled until the cycle is gone.
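The two deadlock-detection slides describe a wait-for graph and cancelling transactions until no cycle remains. A minimal sketch of that idea follows. It is not PostgreSQL's implementation: the function names are hypothetical, and the victim rule (cancel the largest id in the cycle) is an arbitrary choice made only so the example is deterministic.

```python
def find_cycle(waits_for):
    """Return the transactions forming a cycle in the wait-for graph, or None.

    waits_for maps a transaction to the set of transactions it is waiting on.
    """
    visiting, done, path = set(), set(), []

    def visit(node):
        visiting.add(node)
        path.append(node)
        for nxt in sorted(waits_for.get(node, ())):
            if nxt in visiting:
                return path[path.index(nxt):]
            if nxt not in done:
                found = visit(nxt)
                if found:
                    return found
        visiting.discard(node)
        done.add(node)
        path.pop()
        return None

    for start in list(waits_for):
        if start not in done:
            cycle = visit(start)
            if cycle:
                return cycle
    return None


def resolve_deadlocks(waits_for):
    """Cancel transactions until the graph has no cycle; return the victims."""
    graph = {txn: set(blockers) for txn, blockers in waits_for.items()}
    cancelled = []
    while True:
        cycle = find_cycle(graph)
        if cycle is None:
            return cancelled
        victim = max(cycle)  # arbitrary rule, chosen only for illustration
        cancelled.append(victim)
        graph.pop(victim, None)
        for blockers in graph.values():
            blockers.discard(victim)


# The two sessions from the deadlock slide: each waits on a row the other holds.
print(resolve_deadlocks({"T1": {"T2"}, "T2": {"T1"}}))
```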
  40. 40. Deadlocks in Citus: Citus delegates transactions to nodes.
  41. 41. Deadlocks in Citus: PostgreSQL’s deadlock detector still works.
  42. 42. Deadlocks in Citus: When deadlocks span across nodes, PostgreSQL cannot help us.
  43. 43. Deadlock detection in Citus 7: Citus 7 adds distributed deadlock detection.
  44. 44. Deadlock detection in Citus 7: Citus 7 adds distributed deadlock detection.
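The slides state only that Citus 7 adds distributed deadlock detection. One way to see why a cross-node view is needed is to merge the wait edges observed on each node and look for a cycle in the combined graph. The sketch below assumes that approach for illustration. The node names, transaction ids, and `has_cycle` are hypothetical and do not reflect the Citus API.

```python
def has_cycle(edges):
    """Depth-first search for a cycle in a set of (waiter, holder) edges."""
    graph = {}
    for waiter, holder in edges:
        graph.setdefault(waiter, set()).add(holder)
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        state[node] = "visiting"
        for nxt in graph.get(node, ()):
            if state.get(nxt) == "visiting":
                return True
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(visit(n) for n in list(graph) if n not in state)


# Assumed placement: the row with x = 1 lives on node A, the row with x = 2 on node B.
# Edges are keyed by a distributed transaction id shared across nodes.
node_a_edges = {("txn2", "txn1")}  # on node A, txn2 waits for txn1's lock on x = 1
node_b_edges = {("txn1", "txn2")}  # on node B, txn1 waits for txn2's lock on x = 2

print(has_cycle(node_a_edges))                  # node A alone
print(has_cycle(node_b_edges))                  # node B alone
print(has_cycle(node_a_edges | node_b_edges))   # merged view across nodes
```

Neither node sees a cycle locally, which is why each node's own PostgreSQL deadlock detector cannot resolve this case. Only the merged graph contains the cycle.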
  45. 45. Summary: Distributed transactions are a complex topic. Most articles on this topic focus on data consistency. Data consistency is only one side of the coin. If you’re using a relational database, your application benefits from another key feature: deadlock detection. https://www.citusdata.com/blog/2017/08/31/databases-and-distributed-deadlocks-a-faq
  46. 46. Conclusion: Postgres High Availability (HA); Extension APIs; Distributed Deadlock Detection
  47. 47. SQL is hard, not impossible, to scale
  48. 48. © 2017 Citus Data. All rights reserved. will@citusdata.com Questions? @citusdata Will Leinweber www.citusdata.com
