
DAT316_Report from the field on Aurora PostgreSQL Performance


Tatsuo Ishii from SRA OSS has done extensive testing to compare the Aurora PostgreSQL-compatible Edition with standard PostgreSQL. In this session, he will present his performance testing results, and his work on Pgpool-II with Aurora; Pgpool-II is an open source tool which provides load balancing, connection pooling, and connection management for PostgreSQL.


  1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Report from the field on Aurora PostgreSQL Performance. Tatsuo Ishii, Japan President, SRA OSS, Inc. Mark Porter, General Manager, Amazon RDS, Aurora, RDS for PostgreSQL. DAT 316
  2. • Introduction to Aurora PostgreSQL • Performance Results from SRA OSS • Aurora Architecture • Pgpool-II Announcement • Performance Insights • Q&A
  3. • Introduction to Aurora PostgreSQL • Performance Results from SRA OSS • Aurora Architecture • Pgpool-II Announcement • Performance Insights • Q&A
  4. Reimagining the relational database. What if you were inventing the database today? You would break apart the stack. You would build something that: • Can scale out • Is self-healing • Leverages distributed services. You would use open source software.
  5. A service-oriented architecture applied to the database. (1) Move the logging and storage layer into a multitenant, scale-out, database-optimized storage service. (2) Integrate with other AWS services like Amazon EC2, Amazon VPC, Amazon DynamoDB, Amazon SWF, and Amazon Route 53 for control and monitoring. (3) Make it a managed service using Amazon RDS, which takes care of management and administrative functions. [Diagram labels: SQL, Transactions, Caching, Logging + Storage, Amazon DynamoDB, Amazon SWF, Amazon Route 53, Amazon S3, Amazon RDS.]
  6. Why PostgreSQL? • Open source database • In active development for 20 years • Owned by a foundation, not a single company • Permissive, innovation-friendly open source license • High performance out of the box • Object-oriented and ANSI-SQL:2008 compatible • Most geospatial features of any open source database • Supports stored procedures in 12 languages (Java, Perl, Python, Ruby, Tcl, C/C++, its own Oracle-like PL/pgSQL, etc.) • Most Oracle-compatible open source database • Highest AWS Schema Conversion Tool automatic conversion rates are from Oracle to PostgreSQL
  7. In 2014, we launched Amazon Aurora MySQL. Now we have added PostgreSQL compatibility, creating Amazon Aurora PostgreSQL. Customers can now choose how to use Amazon’s cloud-optimized relational database, with the performance and availability of commercial databases and the simplicity and cost-effectiveness of open source databases.
  8. • Introduction to Aurora PostgreSQL • Performance Results from SRA OSS • Aurora Architecture • Pgpool-II Announcement • Performance Insights • Q&A
  9. Environment. Amazon Aurora PostgreSQL: db.r3.8xlarge, vCPU 32, mem 244 GB, Multi-AZ. Amazon RDS for PostgreSQL: db.r3.8xlarge, vCPU 32, mem 244 GB, Provisioned IOPS 10,000, Multi-AZ. Client: Amazon EC2 m4.10xlarge, vCPU 40, mem 160 GB, running pgbench. • Using the same number of CPUs and the same memory size on Aurora and RDS • Using 10,000 Provisioned IOPS storage so that the hourly price matches Aurora • Using PostgreSQL 9.6.2 on both. [Diagram labels: client connected to each service across AZ1 and AZ2; Write & Read, Read Only, Multi-AZ (Backup).]
  10. Scenario • Using pgbench • 250, 500, 750, and 1,000 connections • Loading data, creating indexes, and running vacuum for each test • DB size is 30 GB and the large table contains 200 million rows • Executing one SELECT, three UPDATEs, and one INSERT within a transaction

          for NUM in 250 500 750 1000
          do
            # Initialization
            pgbench -i -s 2000
            # Benchmark
            pgbench --progress=1 --protocol=prepared -T 3600 -r -c $NUM -j $NUM -s 2000
          done
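The figures in the scenario can be cross-checked with a little arithmetic. pgbench creates 100,000 `pgbench_accounts` rows per unit of scale factor, so `-s 2000` gives the 200 million rows quoted on the slide. The sketch below is only a sanity check; the function names are illustrative and do not come from the deck.

```python
# pgbench creates 100,000 rows in pgbench_accounts per unit of scale factor (-s).
ROWS_PER_SCALE_UNIT = 100_000


def accounts_rows(scale: int) -> int:
    """Rows in pgbench_accounts after `pgbench -i -s <scale>`."""
    return scale * ROWS_PER_SCALE_UNIT


def total_benchmark_seconds(connection_counts, seconds_per_run: int) -> int:
    """Total time spent in the timed (-T) runs, excluding initialization."""
    return len(connection_counts) * seconds_per_run


if __name__ == "__main__":
    print(accounts_rows(2000))                                   # 200000000
    print(total_benchmark_seconds([250, 500, 750, 1000], 3600))  # 14400
```

The four one-hour runs add up to four hours of measured load per service, not counting the data load before each run.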
  11. Loading Data [Bar chart: elapsed time (lower is better) for copy, vacuum, index, and total, Aurora vs. RDS; average of 4 tests. Annotated ratios: 1/2, 3/4, 1/3, 1/8.]
  12. Throughput [Chart: TPS (higher is better) at 250, 500, 750, and 1,000 connections, Aurora vs. RDS. Annotated ratios: x1.7, x2.2, x2.7, x3.]
  13. Average wait time of transactions [Charts: wait time in ms (lower is better) over elapsed seconds with 1,000 connections, Aurora vs. RDS, each with a zoomed view of the 300 seconds starting at 1,800 sec.] Aurora is more stable!
  14. Comparison of CPU utilization • CPU utilization on CloudWatch (1,000 connections) • Aurora uses CPU more efficiently than RDS (IO waits consume CPU time on RDS) [Charts: Aurora, RDS]
  15. Comparison of Write IOPS (count/second) • Write IOPS on CloudWatch • Write IOPS on Aurora is lower than on RDS, which means Aurora is handling writes more efficiently [Charts: Aurora, RDS]
  16. Comparison of Replica Lag • Added a read replica on RDS to compare RDS streaming replication with Aurora’s replication • Replica lag on Aurora was low, and replication finished within tens of milliseconds • Streaming replication on RDS could not catch up during the benchmark [Charts: Aurora, RDS]
  17. Summary. Compared to RDS, Amazon Aurora PostgreSQL showed: • 3 times faster data loading • 3 times higher throughput • Quick and stable response • No performance degradation with increasing connections • Low replica lag
  18. • Introduction to Aurora PostgreSQL • Performance Results from SRA OSS • Aurora Architecture • Pgpool-II Announcement • Performance Insights • Q&A
  19. Amazon Aurora PostgreSQL Performance Architecture
  20. How does Amazon Aurora achieve high performance? DATABASES ARE ALL ABOUT I/O. NETWORK-ATTACHED STORAGE IS ALL ABOUT PACKETS/SECOND. HIGH-THROUGHPUT PROCESSING NEEDS CPU AND MEMORY OPTIMIZATIONS. DO LESS WORK: • Do fewer IOs • Minimize network packets • Offload the database engine. BE MORE EFFICIENT: • Process asynchronously • Reduce latency path • Use lock-free data structures • Batch operations together
  21. Write IO Traffic in an Amazon Aurora database node. IO FLOW: • Boxcar log records, fully ordered by LSN • Shuffle to appropriate segments, partially ordered • Boxcar to storage nodes and issue writes. OBSERVATIONS: • Only write WAL records; all steps asynchronous • No data block writes (checkpoint, cache replacement) • 6X more log writes, but 9X less network traffic • Tolerant of network and storage outlier latency. PERFORMANCE: • 2x or better PostgreSQL Community Edition performance on write-only or mixed read-write workloads. [Diagram labels: primary database node and read replica/secondary nodes across AZ 1, AZ 2, and AZ 3; async 4/6 quorum distributed writes; Amazon Aurora; Amazon S3. Legend (type of write): WAL, log, data, commit log & files.]
  22. IO Traffic in Aurora Replicas. POSTGRESQL READ SCALING: • Physical: ship redo (WAL) to replica • Write workload similar on both instances • Independent, duplicated storage • Heavy write load impairs read performance [Diagram: PostgreSQL master (30% read, 70% write) and PostgreSQL replica (30% new reads, 70% write; single-threaded WAL apply), each with its own data volume.] AMAZON AURORA READ SCALING: • Physical: ship redo (WAL) from master to replica • Cached pages have redo applied • Replica shares storage: no writes performed • Replica can do more read work • Advance read view as commits are seen from master [Diagram: Aurora master (30% read, 70% write) and Aurora replica (100% new reads; page cache update) on shared Multi-AZ storage.]
  23. Write IO Traffic in an Amazon Aurora storage node. IO FLOW: ① Receive record and add to in-memory queue ② Persist record and acknowledge ③ Organize records and identify gaps in log ④ Gossip with peers to fill in holes ⑤ Coalesce log records into new data block versions ⑥ Periodically stage log and new block versions to Amazon S3 ⑦ Periodically garbage-collect old versions ⑧ Periodically validate CRC codes on blocks. OBSERVATIONS: • All steps are asynchronous • Only steps 1 and 2 are in the foreground latency path • Input queue is far smaller than PostgreSQL’s • Favors latency-sensitive operations • Uses disk space to buffer against spikes in activity. [Diagram labels: primary database node, log records, storage node, incoming queue, ACK, update queue, hot log, data blocks, sort/group, coalesce, peer-to-peer gossip with peer storage nodes, point-in-time snapshot, GC, scrub, Amazon S3 backup.]
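Steps ① to ③ of the IO flow above can be pictured with a toy model. This is only a sketch under stated assumptions: the class and method names are hypothetical, LSNs are treated as consecutive integers so that a gap is simply a missing number, and "persist" is a dictionary write. None of this is taken from Aurora's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class StorageNodeSketch:
    """Toy model of steps 1-3; consecutive integer LSNs are an assumption."""

    incoming: list = field(default_factory=list)  # step 1: in-memory queue
    hot_log: dict = field(default_factory=dict)   # step 2: stand-in for durable log

    def receive(self, lsn: int, payload: bytes) -> None:
        """Step 1: add the record to the in-memory queue."""
        self.incoming.append((lsn, payload))

    def persist_and_ack(self) -> list:
        """Step 2: move queued records to the hot log and return the acked LSNs."""
        acked = []
        for lsn, payload in self.incoming:
            self.hot_log[lsn] = payload
            acked.append(lsn)
        self.incoming.clear()
        return acked

    def gaps(self) -> list:
        """Step 3: list LSNs missing between the lowest and highest seen."""
        if not self.hot_log:
            return []
        lo, hi = min(self.hot_log), max(self.hot_log)
        return [lsn for lsn in range(lo, hi + 1) if lsn not in self.hot_log]


if __name__ == "__main__":
    node = StorageNodeSketch()
    for lsn in (1, 2, 5):
        node.receive(lsn, b"record")
    print(node.persist_and_ack())  # [1, 2, 5]
    print(node.gaps())             # [3, 4]
```

In this picture the missing LSNs are what step ④ would request from peer storage nodes; only `receive` and `persist_and_ack` sit on the foreground path, which matches the slide's observation about steps 1 and 2.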
  24. Amazon Aurora PostgreSQL Durability and Availability Architecture
  25. Amazon Aurora storage engine overview • Data is replicated 6 times across 3 Availability Zones • Continuous backup to Amazon S3 (built for 11 9s durability) • Continuous monitoring of nodes and disks for repair • 10 GB segments as unit of repair or hotspot rebalance • Quorum system for read/write; latency tolerant • Quorum membership changes do not stall writes • Storage volume automatically grows up to 64 TB [Diagram labels: database node and six storage nodes across AZ 1, AZ 2, and AZ 3; Amazon S3; storage monitoring.]
  26. Scale-out, distributed, log structured storage [Diagram labels: AWS Region; Availability Zones 1, 2, and 3; master (primary database node) and three replicas (read replica/secondary nodes); shared storage volume (transaction aware); storage monitoring; database and instance monitoring.]
  27. Amazon Aurora Storage fault tolerance. What can fail? • Segment failures (disks) • Node failures (machines) • AZ failures (network or datacenter). Optimizations: • 4 out of 6 write quorum • 3 out of 6 read quorum • Peer-to-peer replication for repairs [Diagram labels: SQL, Transaction, and Caching layers above storage in AZ 1, AZ 2, and AZ 3.]
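The quorum numbers on the fault-tolerance slide can be checked with the usual quorum conditions: a read quorum must overlap every write quorum, and any two write quorums must overlap. The sketch below assumes the six copies are spread evenly, two per Availability Zone; the slides give the totals (6 copies, 3 AZs) but not the per-AZ split, and the function names are illustrative.

```python
COPIES = 6
COPIES_PER_AZ = 2   # assumption: 6 copies spread evenly over 3 AZs
WRITE_QUORUM = 4    # "4 out of 6 write quorum"
READ_QUORUM = 3     # "3 out of 6 read quorum"


def quorums_are_consistent(copies: int, write_q: int, read_q: int) -> bool:
    """True if reads always see the latest write and writes cannot conflict."""
    return read_q + write_q > copies and 2 * write_q > copies


def available(copies: int, quorum: int, failed: int) -> bool:
    """True if enough copies survive `failed` losses to form the quorum."""
    return copies - failed >= quorum


if __name__ == "__main__":
    print(quorums_are_consistent(COPIES, WRITE_QUORUM, READ_QUORUM))  # True
    # Losing a whole AZ (2 copies): writes and reads both continue.
    print(available(COPIES, WRITE_QUORUM, COPIES_PER_AZ))             # True
    # Losing an AZ plus one more copy: reads continue, writes do not.
    print(available(COPIES, READ_QUORUM, COPIES_PER_AZ + 1))          # True
    print(available(COPIES, WRITE_QUORUM, COPIES_PER_AZ + 1))         # False
```

Under that even-split assumption, the volume stays writable through the loss of one AZ and stays readable through the loss of one AZ plus one additional copy.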
  28. Amazon Aurora Replicas. Availability: • Failing database nodes are automatically detected and replaced • Failing database processes are automatically detected and recycled • Replicas are automatically promoted to primary if needed (failover) • Customer-specifiable failover order. Performance: • Customer applications can scale out read traffic across read replicas • Read balancing across read replicas [Diagram labels: primary database node and read replicas across AZ 1, AZ 2, and AZ 3; database and instance monitoring.]
  29. Faster, more predictable failover with Amazon Aurora. Amazon RDS for PostgreSQL is good: failover times of ~60 seconds [Timeline: database failure, failure detection, DNS propagation, recovery, app running]. Amazon Aurora is better: failover times < 30 seconds [Timeline for a replica-aware app: database failure, failure detection, DNS propagation, recovery, app running; annotated 15-20 sec and 3-10 sec].
  30. • Introduction to Aurora PostgreSQL • Performance Results from SRA OSS • Aurora Architecture • Pgpool-II Announcement • Performance Insights • Q&A
  31. Introducing Pgpool-II • Provides clustering features between the application and PostgreSQL • Normally, Pgpool-II is used with PostgreSQL’s streaming replication • Open source software (BSD license) • One major version released per year • Latest version is 3.7 • Pgpool-II details: pgpool.net/mediawiki/index.php/Main_Page [Diagram labels: client, read/write query, Pgpool-II, write, read, PostgreSQL primary, two standbys, replication.]
  32. WE ARE ANNOUNCING… Pgpool-II 3.7 supports Amazon Aurora PostgreSQL and provides: • Automatic distribution of queries (UPDATE to the master, SELECT to a read replica) • Connection pooling and query cache • A sample configuration is included
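The query distribution described in the announcement (writes to the master, SELECTs to a read replica) can be illustrated with a deliberately naive router. This is not Pgpool-II's code or API: the class, the endpoint names, and the first-keyword rule are all assumptions made for the sketch, and it ignores transactions and SELECTs that have side effects.

```python
import itertools
from typing import Sequence


class QueryRouter:
    """Toy read/write splitter; the first-keyword rule is a simplification."""

    def __init__(self, writer: str, readers: Sequence[str]):
        self.writer = writer
        # Round-robin over readers; fall back to the writer if there are none.
        self._readers = itertools.cycle(readers) if readers else None

    def route(self, sql: str) -> str:
        """Return the endpoint name that should receive this statement."""
        words = sql.split(None, 1)
        first = words[0].upper() if words else ""
        if first == "SELECT" and self._readers is not None:
            return next(self._readers)
        return self.writer


if __name__ == "__main__":
    # Hypothetical endpoint names, for illustration only.
    router = QueryRouter("writer-endpoint", ["reader-1", "reader-2"])
    print(router.route("SELECT * FROM pgbench_accounts WHERE aid = 1"))     # reader-1
    print(router.route("UPDATE pgbench_accounts SET abalance = 0"))         # writer-endpoint
    print(router.route("select count(*) FROM pgbench_history"))             # reader-2
```

Anything that is not a plain SELECT goes to the writer, which is the conservative default for a splitter of this kind.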
  33. • Introduction to Aurora PostgreSQL • Performance Results from SRA OSS • Aurora Architecture • Pgpool-II Announcement • Performance Insights • Q&A
  34. What is RDS Performance Insights? Customers ask for: • Visibility into the performance of RDS databases • A way to optimize cloud database workloads • An easy tool (often only a part-time DBA or no DBA) • A single pane of glass
  35. First Step: RDS Enhanced Monitoring • Released 2016 • OS metrics • Process/thread list • Up to 1 second granularity
  36. Introducing: RDS Performance Insights. Dashboard: • DB load • Adjustable timeframe • Filterable by attribute (SQL, User, Host, Wait) • SQL causing load. Phased RDS delivery: • Aurora, MySQL/MariaDB, PostgreSQL, Oracle, SQL Server. Guided discovery of performance problems: • For both beginners and experts • Core metric is “Database Load”
  37. What is “Database Load”? All engines have a connections list showing: • Active • Idle. We sample every second: • For each active session, collect SQL, State (CPU, I/O, Lock, Commit log wait, etc.), Host, and User. Expose as “Average Active Sessions” (AAS)
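The "Database Load" metric described above is an average over per-second samples: count the active sessions in each sample, then average the counts over the window. The sketch below shows that arithmetic on made-up data; the function names and the sample values are illustrative and are not the Performance Insights API.

```python
from collections import Counter


def average_active_sessions(samples) -> float:
    """samples: one list per 1-second sample, holding the state of each active session."""
    if not samples:
        return 0.0
    return sum(len(sample) for sample in samples) / len(samples)


def load_by_state(samples) -> dict:
    """Break the average down by session state (CPU, IO, Lock, ...)."""
    if not samples:
        return {}
    counts = Counter(state for sample in samples for state in sample)
    return {state: count / len(samples) for state, count in counts.items()}


if __name__ == "__main__":
    # Four one-second samples; idle sessions are simply not listed.
    samples = [["CPU", "IO"], ["CPU"], [], ["CPU", "Lock", "IO"]]
    print(average_active_sessions(samples))  # 1.5
    print(load_by_state(samples))            # {'CPU': 0.75, 'IO': 0.5, 'Lock': 0.25}
```

The per-state values sum to the overall AAS, which is why the load can be sliced by state, SQL, host, or user without changing the total.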
  38. RDS Performance Insights dashboard
  39. RDS Performance Insights
  40. Sampling Every Second [Diagram: timelines for User 1, User 2, and User 3 showing a query run often, a fast query run rarely, and a slow query.]
  41. Sampling is like film
  42. AAS load graph [Diagram: User 1, User 2, User 3, User 4; active sessions = 1, 2, 3, 4.]
  43. Active Session State [Diagram: timelines for Query 1, Query 2, and Query 3 showing CPU, IO, Wait, and idle states.]
  44. AAS over 1 minute averages
  45. Access to RDS Performance Insights
  46. Access to RDS Performance Insights: High Load
  47. Summary: Amazon RDS Performance Insights • DB Load: Average Active Sessions • Identifies database bottlenecks • Easy • Powerful • Top SQL • Identifies source of bottleneck • Enables problem discovery • Adjustable timeframe • Hour, day, week, and longer • Questions: rdspi@amazon.com
  48. • Introduction to Aurora PostgreSQL • Performance Results from SRA OSS • Aurora Architecture • Pgpool-II Announcement • Performance Insights • Questions?
  49. THANK YOU! Tatsuo Ishii, Japan President, SRA OSS, Inc. Mark Porter, General Manager, Amazon RDS, Aurora, RDS for PostgreSQL. (And please fill out your session reviews)
