"In this workshop, we focus on the hands-on journey for migrating Oracle databases to the Aurora PostgreSQL-compatible Edition. Participants deploy an instance of Amazon Aurora, migrate or generate a test workload, and manually monitor the database to understand the workload. Participants also review multiple ways to track queries and their execution plans, and they determine how to optimize the queries. Finally, participants also learn how to use Amazon RDS Performance Insights for query-analysis and tuning.
Below are the prerequisites for the workshop.
Active AWS account with Admin privileges. (IAM user should have administrator access). Please refer the link on how to create IAM administrator user here
Existing EC2 key pair created in the AWS region you are launching the CloudFormation template in. Please refer below on how to first create a new Key pair as shown here
Pre-installed AWS Schema Conversion Tool software on your machine. Details on how to download and install AWS Schema Conversion Tool shown below
Install and launch SCT on your local machine from http://docs.aws.amazon.com/SchemaConversionTool/latest/userguide/CHAP_SchemaConversionTool.Installing.html
Download required drivers from links in the “Installing the Required Database Drivers” section from the above link. You will need to download Oracle and PostgreSQL drivers for this workshop. Alternatively, you can download the required drivers for this lab from
http://bit.ly/2phVpPk -> Oracle JDBC driver
http://bit.ly/2pt04ZT -> PostgreSQL JDBC driver
Download the Workshop Hands on lab guide http://bit.ly/2zYpnvS"
6. A Service-Oriented Architecture Applied to the
Database
Move the logging and storage layer into a
multitenant, scale-out, database-optimized
storage service
Integrate with other AWS services like Amazon
Elastic Compute Cloud (Amazon EC2), Amazon
Virtual Private Cloud (Amazon VPC), Amazon
DynamoDB, Amazon Simple Workflow Service
(Amazon SWF), and Amazon Route 53 for
control and monitoring
Make it a managed service – using Amazon
Relational Database Service (Amazon RDS).
Takes care of management and administrative
functions.
Amazon
DynamoDB
Amazon SWF
Amazon Route 53
Logging + Storage
SQL
Transactions
Caching
Amazon S3
1
2
3
Amazon RDS
8. PostgreSQL Compatibility
PostgreSQL 9.6 + Amazon Aurora cloud-optimized storage
Performance: 2x to 3x higher throughput than PostgreSQL alone
Availability: failover time of < 30 seconds
Durability: six copies across three Availability Zones
Read Replicas: single-digit millisecond lag times on up to 15 replicas
Amazon Aurora Storage
9. PostgreSQL Compatibility
Cloud-native security and encryption
AWS Key Management Service (AWS KMS)
AWS Identity and Access Management (IAM)
Easy to manage with Amazon RDS
Easy to load and unload
AWS Database Migration Service
AWS Schema Conversion Tool
Fully compatible with PostgreSQL, now and for the
foreseeable future
Not a compatibility layer – native PostgreSQL
implementation
AWS DMS
Amazon RDS
PostgreSQL
11. Scale-Out, Distributed, Log Structured Storage
Master Replica Replica Replica
Availability Zone 1
Shared Storage Volume – Transaction Aware
Primary
Database
Node
Read
Replica/
Secondary
Node
Read
Replica/
Secondary
Node
Read
Replica/
Secondary
Node
Availability Zone 2 Availability Zone 3
AWS Region
Storage
Monitoring
Database and
Instance
Monitoring
12. Amazon Aurora Storage Engine Overview
Data is replicated six times across three
Availability Zones
Continuous backup to Amazon S3
(built for 11 9s durability)
Continuous monitoring of nodes and disks for
repair
10 GB segments as unit of repair or hotspot
rebalance
Quorum system for read/write; latency tolerant
Quorum membership changes do not stall writes
Storage volume automatically grows up to 64 TB
AZ 1 AZ 2 AZ 3
Amazon S3
Database
Node
Storage
Node
Storage
Node
Storage
Node
Storage
Node
Storage
Node
Storage
Node
Storage
Monitoring
13. Faster, More Predictable Failover with Amazon Aurora
App
RunningFailure Detection DNS Propagation
Recovery
Database
Failure
Amazon RDS for PostgreSQL is good: failover times of ~60 seconds
Replica-Aware App Running
Failure Detection DNS Propagation
Recovery
Database
Failure
Amazon Aurora is better: failover times < 30 seconds
1 5 - 2 0 s e c 3 - 1 0 s e c
App
Running
15. Do fewer IOs
Minimize network packets
Offload the database engine
DO LESS WORK
Process asynchronously
Reduce latency path
Use lock-free data structures
Batch operations together
BE MORE EFFICIENT
How Does Amazon Aurora Achieve High Performance?
DATABASES ARE ALL ABOUT I/O
NETWORK-ATTACHED STORAGE IS ALL ABOUT PACKETS/SECOND
HIGH-THROUGHPUT PROCESSING NEEDS CPU AND MEMORY OPTIMIZATIONS
16. Write IO Traffic in an Amazon Aurora Database Node
AZ 1 AZ 3
Primary
Database
Node
Amazon S3
AZ 2
Read
Replica/
Secondary
Node
AMAZON AURORA
ASYNC
4/6 QUORUM
DISTRIBUTED
WRITES
DATAAMAZON AURORA + WAL LOG COMMIT LOG & FILES
IO FLOW
Only write WAL records; all steps asynchronous
No data block writes (checkpoint, cache replacement)
6x more log writes, but 9x less network traffic
Tolerant of network and storage outlier latency
OBSERVATIONS
2x or better PostgreSQL Community Edition performance on
write-only or mixed read-write workloads
PERFORMANCE
Boxcar log records – fully ordered by LSN
Shuffle to appropriate segments – partially ordered
Boxcar to storage nodes and issue writes
WAL
T Y P E O F W R IT E
Read
Replica/
Secondary
Node
17. Write IO Traffic in an Amazon Aurora Storage Node
LOG RECORDS
Primary
Database
Node
INCOMING QUEUE
STORAGE NODE
AMAZON S3 BACKUP
1
2
3
4
5
6
7
8
UPDATE
QUEUE
ACK
HOT
LOG
DATA
BLOCKS
POINT IN TIME
SNAPSHOT
GC
SCRUB
COALESCE
SORT
GROUP
PEER TO PEER GOSSIPPeer
Storage
Nodes
All steps are asynchronous
Only steps 1 and 2 are in foreground latency path
Input queue is far smaller than PostgreSQL
Favors latency-sensitive operations
Uses disk space to buffer against spikes in activity
OBSERVATIONS
IO FLOW
① Receive record and add to in-memory queue
② Persist record and acknowledge
③ Organize records and identify gaps in log
④ Gossip with peers to fill in holes
⑤ Coalesce log records into new data block versions
⑥ Periodically stage log and new block versions to Amazon
S3
⑦ Periodically garbage collect old versions
⑧ Periodically validate CRC codes on blocks
18. IO Traffic in Aurora Replicas
PAGE CACHE
UPDATE
Aurora Master
30% Read
70% Write
Aurora Replica
100% New Reads
Shared Multi-AZ Storage
PostgreSQL Master
30% Read
70% Write
PostgreSQL Replica
30% New Reads
70% Write
SINGLE-THREADED
WAL APPLY
Data Volume Data Volume
Physical: Ship redo (WAL) to Replica
Write workload similar on both instances
Independent storage
Physical: Ship redo (WAL) from Master to Replica
Replica shares storage: No writes performed
Cached pages have redo applied
Advance read view when all commits seen
POSTGRESQL READ SCALING AMAZON AURORA READ SCALING
19. Amazon Aurora with PostgreSQL Compatibility
Performance as opposed to PostgreSQL
20. System Configuration
PostgreSQL – Single AZ, no backup
Amazon Aurora
EBS EBS EBS
60,000 total IOPS
AZ 1 AZ 2 AZ 3
Amazon S3
r4.16xlarge
database
instance
Storage
Node
Storage
Node
Storage
Node
Storage
Node
Storage
Node
Storage
Node
r4.8xlarge
client
driver
r4.16xlarge
database
instance
r4.8xlarge
client
driver
r4.8xlarge
client
driver
Amazon S3
Amazon RDS
r4.16xlarge
30,000 iops
21. Amazon Aurora Is Faster on PgBench
Running the standard pgbench benchmark, Amazon Aurora delivers
1.6x the peak throughput of PostgreSQL and 2.9x at high client counts
0
5
10
15
20
25
30
35
40
45
128 256 512 768 1024 1280 1536 1792 2048
Throughput(tps,thousands)
Number of clients
pgbench tpcb-like throughput, 150 GiB
PostgreSQL (Single AZ) Amazon Aurora (Three AZs)
22. Amazon Aurora Is 2x–5x Faster on Sysbench
0
20
40
60
80
100
120
140
256 512 768 1024 1280 1536 1792 2048 2305 2560
writes/second,thousands
Number of clients
sysbench write-only 30 GiB
PostgreSQL (Single AZ, No Backup) Amazon Aurora (Three AZs, Continuous Backup)
2.2x 5.3x
Running the standard sysbench benchmark, Amazon Aurora delivers
> 2x the absolute peak of PostgreSQL and 5x at high client counts.
23. Amazon Aurora Is 4x Faster at Large Scale
Scales from 1.8x to 4.4x better as database grows from 10 GiB to 100
GiB
74
49
30
136 134 131
0
20
40
60
80
100
120
140
160
10 GiB 30 GiB 100 GiB
writes/second,thousands
Database size
sysbench write-only
PostgreSQL (Single AZ, No Backup) Amazon Aurora (Three AZs, Continuous Backup)
4.4x1.8x 2.8x
24. Amazon Aurora Improvements for GA
GA release improves performance on larger working sets
112
92
83
136 134 131
0
20
40
60
80
100
120
140
160
10 GiB 30 GiB 100 GiB
writes/sec.thousands
Database Size
sysbench write-only
Amazon Aurora Preview Amazon Aurora GA
25. Amazon Aurora Loads Data Faster
Database initialization is > 2x faster than PostgreSQL using the
standard pgbench benchmark
Copy In
Copy In
Vacuum
Vacuum
Index Build
Index Build
0 500 1000 1500 2000 2500 3000 3500 4000
PostgreSQL
Amazon Aurora
Runtime (seconds)
pgbench initialization, scale 10000 (150 GiB)
86% reduction in vacuum time
26. Amazon Aurora Gives > 2x Lower Response Times
Response time under heavy write load > 2x lower than PostgreSQL,
variance reduced by 99%
0.00
100.00
200.00
300.00
400.00
500.00
600.00
700.00
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Responsetime,ms
Minutes
SYSBENCH RESPONSE TIME (p95), 30 GiB, 1024 CLIENTS
PostgreSQL (Single AZ, No Backup) Amazon Aurora (Three AZs, Continuous Backup)
27. Amazon Aurora Has More Consistent Throughput
While running pgbench at load, throughput is 3x more consistent
than PostgreSQL
0
5000
10000
15000
20000
25000
30000
35000
40000
45000
10 15 20 25 30 35 40 45 50 55 60
Throughput,tps
Minutes
pgbench throughput over time, 150 GiB, 1024 clients
PostgreSQL (Single AZ) Amazon Aurora (Three AZs)
28. Amazon Aurora Reduced Recovery Time Up to 97%
Transaction-aware storage system recovers almost instantly
3 GiB redo
recovered in 19 seconds
10 GiB redo
recovered in 50 seconds
30 GiB redo
recovered in 123 seconds
Amazon Aurora has no redo.
Recovered in 3 seconds while maintaining
significantly greater throughput.
0
20
40
60
80
100
120
140
160
0 20000 40000 60000 80000 100000 120000 140000
Recoverytimeinseconds(lessisbetter)
writes/second (more is better)
Recovery time from crash under load
Bubble size represents redo log which must be recovered
As PostgreSQL throughput
goes up, so does log size
and crash recovery time.
29. Amazon Aurora with PostgreSQL Compatibility
Performance by the Numbers
Measurement Result
PgBench Up to 2.9x faster
Sysbench 2x–5x faster
Vacuum Time reduced up to 86%
Response time > 2x lower, 99% < variance
Throughput jitter 3x more consistent
Throughput at scale 4x faster
Recovery speed Time reduced up to 97%