Shootout at the
AWS Corral
EC2
RDS
Heroku
Josh Berkus
PostgreSQL
Experts Inc.
SCALE 13
https://github.com/manageacloud/cloud-benchmark-postgres
Thanks!
Ruben Rubio Rey
What is Amazon
Web Services?
Magic?
Image by Sperlingsmaedchen. Free for non-profit use only.
a bunch of servers
with virtualization
shared storage
… and a great API with stuff
The Good
● Fast deployment
– new servers in minutes, with a script
● Easy scale-out
– add replicas in minutes
● Minimize ops staff
– no HW wranglers
¢heap
(at the low end)
ex$pensive
(at the high end)
The Bad
● Low system resources
– VMs are small/slow
● Security
– more attack vectors
The Ugly
● Everything is shared
– Network
– IO / Storage
– CPU (partly)
Your performance depends on somebody
else's peak load.
Sharing
is
Not
Caring
ephemeral cloud
● DR is not optional
● your virtual DB server will go away
● you need replicas & backup
why Postgres?
transactional DB workout
● works CPU
● works RAM
● works IO
● works network IO
at the same time,
in parallel
Cast of Three
The Gunslinger
Roll-Your-Own
The Rancher
RDS
The Dandy
Heroku
The Gunslinger
Roll Your Own
on EC2
Become a Gunslinger
I. Create an EC2 instance
II. Install PostgreSQL on it
III. Configure PostgreSQL
Roll-Your-Own ++
● cheapest option
● highly configurable
● install whatever you want
– version
– extensions
Roll-Your-Own --
● you still do all the admin
– installation
– backup/redundancy
– updates
– OS updates
● configuration required
A Fistful of Services
● AMIs
– “clone” your database server setup
● AWS's other services
– caching, queueing, s3 storage, etc.
Instance Types
● m3.* general-purpose
● c3.* small, CPU-bound DBs $$
● r3.* maximize caching $$$
● i2.* Data warehousing $$$$
instance tips
● Use m3.* if you don't know what to use
● Get enough RAM to cache your
database if you can
Storage Types
● EBS + Provisioned IOPS
– large size, latency issues
– reliable, snapshots
– choose “EBS optimized”
● General SSD
– better for bursts (about 20% better)
– high variability
Storage Types
● Instance Storage + SSD option
– low-latency, limited size
– risky: data loss, data corruption
– just for “running with scissors”
PrIOPS != throughput
PrIOPS fallacy
● not a guarantee, a limit
– but mostly pretty consistent
● each IOP is no more than 8K
● random access → each page is an IOP
– no real prefetch
● PrIOPS ~ rows/second
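A quick sketch of the arithmetic behind the PrIOPS fallacy. It assumes, as the slide says, that each I/O moves at most 8 KB and that random access costs one I/O per page. The function name is made up for illustration, and the result is a rough ceiling, not a measured figure.

```python
# Rough ceiling on random-access throughput implied by a PrIOPS setting.
# Assumption (from the slide): one I/O operation moves at most 8 KB,
# and every randomly accessed page costs one I/O.
IO_SIZE_KB = 8


def max_random_throughput_mb_s(priops: int, io_size_kb: int = IO_SIZE_KB) -> float:
    """Upper bound in MB/s (1 MB = 1024 KB) for purely random access."""
    return priops * io_size_kb / 1024


if __name__ == "__main__":
    # 1000 and 4000 are the PrIOPS levels used for the two test nodes in this talk.
    for priops in (1000, 4000):
        ceiling = max_random_throughput_mb_s(priops)
        print(f"{priops} PrIOPS -> at most {ceiling:.1f} MB/s of random I/O")
```

So 1000 PrIOPS tops out near 8 MB/s of random page reads, which is why PrIOPS tracks rows per second rather than bulk throughput.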
Stuff to Set Up
● Backup: WAL-E to S3
● Replication: not optional
– in another Availability Zone
● Monitoring for instance failure
● Secure your instance
– SSL
– pg_hba.conf
Configuration Tips
● random_page_cost = 1.5
● wal_buffers = 32 to 64MB
● stats_temp_directory = /mnt/tmpfs/stats
● synchronous_commit = off
– if you can afford it
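The tips above could be written out as a postgresql.conf fragment roughly like this. It is a sketch only: the slide gives wal_buffers as a range, so 32MB is an arbitrary pick, and render_conf is a hypothetical helper. Note that stats_temp_directory only exists in PostgreSQL releases before 15.

```python
# Settings taken from the Configuration Tips slide.
SETTINGS = {
    "random_page_cost": "1.5",
    # The slide says 32 to 64MB; 32MB is an arbitrary choice within that range.
    "wal_buffers": "32MB",
    # Path from the slide; it assumes a tmpfs is already mounted there.
    "stats_temp_directory": "'/mnt/tmpfs/stats'",
    # Only "if you can afford it": a crash can lose the most recent commits.
    "synchronous_commit": "off",
}


def render_conf(settings: dict) -> str:
    """Render settings as 'name = value' lines for postgresql.conf."""
    return "\n".join(f"{name} = {value}" for name, value in settings.items()) + "\n"


if __name__ == "__main__":
    print(render_conf(SETTINGS), end="")
```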
Junior Gunslinger
● “small”
“economical”
● m3.medium
● 1 core
3.75GB RAM
● 40GB
+ 1000 PrIOPS
Senior Gunslinger
● “large”
“performance”
● r3.2xlarge
● 8 cores
61GB RAM
● 200GB
+ 4000 PrIOPS
Junior Gunslinger Cost
instance: $36.50/month
EBS PrIOPS: $105/month
S3 Archive: $5/month
X2 for replica
== $288.00 a month
(+ misc charges)
Cheaper Gunslinger
instance: $36.50/month
EBS PrIOPS: $105/month
S3 Archive: $5/month
no replica
== $146.50 a month
(+ misc charges)
Senior Gunslinger Cost
archive-only: $760.70/month
with replica: $1509.40/month
(+ misc charges)
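The Gunslinger totals are consistent with paying for the instance and EBS volume once per node and for the S3 archive only once. That reading is inferred from the numbers, not stated on the slides. A small sketch under that assumption, with hypothetical function names:

```python
def monthly_cost(instance: float, ebs: float, archive: float, replica: bool) -> float:
    """Instance and EBS are paid per node; the S3 archive is assumed to be paid once."""
    nodes = 2 if replica else 1
    return round(nodes * (instance + ebs) + archive, 2)


def implied_split(single: float, with_replica: float) -> tuple:
    """Back out (per-node cost, archive cost) from two totals, assuming the same structure."""
    per_node = round(with_replica - single, 2)
    archive = round(single - per_node, 2)
    return per_node, archive


if __name__ == "__main__":
    junior = dict(instance=36.50, ebs=105.00, archive=5.00)  # figures from the slide
    print("junior, no replica:  ", monthly_cost(**junior, replica=False))
    print("junior, with replica:", monthly_cost(**junior, replica=True))
    # The senior slide gives only totals, so the split below is derived, not quoted.
    print("senior (per node, archive):", implied_split(760.70, 1509.40))
```

Under this reading the junior numbers come out to the slide's $146.50 and $288.00, and the senior totals imply about $748.70 per node plus $12 of archive storage.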
The Rancher
RDS
Relational
Database
Service
In
The
Middle
Ranching 101
1. Go to AWS RDS
2. Choose “PostgreSQL”
3. Select instance size and storage
4. Launch
5. Connect over port 5432
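For step 5, a minimal sketch that builds a libpq-style connection URI for an RDS endpoint. The host, database, and user below are placeholders, rds_conninfo is a hypothetical helper, and requiring SSL is a choice made here rather than something the slide specifies.

```python
from urllib.parse import quote


def rds_conninfo(host: str, dbname: str, user: str, port: int = 5432) -> str:
    """Build a postgresql:// URI; the password is left to PGPASSWORD or ~/.pgpass."""
    return (
        f"postgresql://{quote(user, safe='')}@{host}:{port}/"
        f"{quote(dbname, safe='')}?sslmode=require"
    )


if __name__ == "__main__":
    # Placeholder endpoint: substitute the one shown in the RDS console.
    print(rds_conninfo("mydb.example.us-east-1.rds.amazonaws.com", "app", "appuser"))
```

The resulting string can be handed to psql or to any driver that accepts libpq connection URIs.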
RDS ++
● Simpler deployment
● AWS manages updates, uptime
● Easy replicas
● Double redundancy
– multizone warm standby
OR replicas
– regular DB snapshots
RDS --
● Limited extensions
● 9.3 only
– and not promptly updated
● No shell access
● Still might have to configure
Postgres
Rancher equipment
● Integration with some AWS services
– caching
– S3 snapshotting
– plus regular access to other services
● 2 dozen extensions available
RDS Options
● Instance types:
– same m3.* and r3.* options
– no c3 or i2 instances currently
● again, get enough RAM
● All storage is EBS
– take PrIOPS storage options
RDS redundancy
● Do Multi-AZ instances or replication
– Multi-AZ: automated failover
– Replication: better performance
● Set up auto DB snapshots
– automatically deleted snapshots?
RDS configuration
● Same as Roll-Your-Own
– except: be cautious, defaults are OK
● Except fewer security options
RDS Cost
Small, single: $ 184.35
Small, redundant: $ 358.70
Large, single: $1119.35
Large, redundant: $2342.70
(+ misc charges)
The Dandy
Heroku
Heroku
for
white-glove
service
Doing the Dandy
1. Choose “Create Database”
2. Pick a size
3. Launch
4. Connect using supplied credentials
Heroku ++
● Heroku manages everything
– updates, backups, availability,
configuration
● really no Ops staff
● Heroku-only features
● Latest Postgres stuff
– sometimes feature previews
Heroku --
● No configurability
– webapp assumed
– can't control AZ, etc.
● Limited extensions & versions
● Costs escalate
● No shell
Dandy Bling
● git-based instance management
– works really well with Rails/Django
● Dataclips
– web-sharable matviews!
● Followers == replicas
Dandy Bling
● About 20 extensions
● Heroku addons and apps
● encryption
● Access all AWS services
Heroku options
● 5 database “sizes”
● 3 levels of HA/uptime
● that's it
Heroku Sizing
● Small
Standard 2: 3.5 GB RAM
shared hosting
● Large
Standard 6: 60GB RAM
dedicated instance
Heroku Pricing
● Small
archive: $200
HA: $350
● Large
archive: $2000
HA: $3500
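With all three price lists in hand, the premiums over roll-your-own EC2 can be worked out directly. The sketch below copies the monthly figures from the cost slides. Treating EC2 "no replica" and RDS "single" as the "archive" tier, and "with replica" and "redundant" as "HA", is an assumption made for this comparison.

```python
# Monthly prices in USD from the cost slides (misc charges excluded).
PRICES = {
    "small": {
        "archive": {"EC2": 146.50, "RDS": 184.35, "Heroku": 200.00},
        "HA": {"EC2": 288.00, "RDS": 358.70, "Heroku": 350.00},
    },
    "large": {
        "archive": {"EC2": 760.70, "RDS": 1119.35, "Heroku": 2000.00},
        "HA": {"EC2": 1509.40, "RDS": 2342.70, "Heroku": 3500.00},
    },
}


def relative_to_ec2(size: str, tier: str) -> dict:
    """Each option's price as a multiple of the EC2 price for the same size and tier."""
    row = PRICES[size][tier]
    return {name: round(price / row["EC2"], 2) for name, price in row.items()}


if __name__ == "__main__":
    for size in PRICES:
        for tier in PRICES[size]:
            print(size, tier, relative_to_ec2(size, tier))
```

The multiples stay modest at the small size and grow at the large size, which matches the "costs escalate" point above.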
the shootout
pgbench++
● ships with Postgres
● microbenchmark
– very simple “bank trade” workload
● fast to set up and run
pgbench--
● doesn't do complex queries
● pure random data / access
● unrealistic balance of work
– too reliant on single-row write speed
● not very tunable
pgbench sizing
1. memory read-write (RW):
– 50% of RAM, write transactions
2. memory read-only (RO):
– 50% of RAM, read-only queries
3. disk read-write (RW):
– 200% of RAM, write transactions
pgbench small
● memory RW
● pgbench -i -s 100 --foreign-keys
● pgbench -c 4 -T 900
● memory RO
● pgbench -i -s 100 --foreign-keys
● pgbench -c 4 -T 900 -S
● disk RW
● pgbench -i -s 400 --foreign-keys
● pgbench -c 4 -T 900
pgbench large
● memory RW
● pgbench -i -s 1000 --foreign-keys
● pgbench -c 16 -T 900
● memory RO
● pgbench -i -s 1000 --foreign-keys
● pgbench -c 16 -T 900 -S
● disk RW
● pgbench -i -s 7000 --foreign-keys
● pgbench -c 16 -T 900
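The six runs above differ only in scale factor, client count, and the read-only flag, so they can be generated from a small table. A sketch: the TESTS table and function name are illustrative, and, like the slides, the commands omit the database name and connection options.

```python
# (scale factor, clients, read-only?) for each test, as listed on the slides.
TESTS = {
    "small": {
        "memory RW": (100, 4, False),
        "memory RO": (100, 4, True),
        "disk RW": (400, 4, False),
    },
    "large": {
        "memory RW": (1000, 16, False),
        "memory RO": (1000, 16, True),
        "disk RW": (7000, 16, False),
    },
}


def pgbench_commands(scale: int, clients: int, read_only: bool = False, seconds: int = 900):
    """Return (init, run) argument lists for one pgbench test."""
    init = ["pgbench", "-i", "-s", str(scale), "--foreign-keys"]
    run = ["pgbench", "-c", str(clients), "-T", str(seconds)]
    if read_only:
        run.append("-S")  # select-only workload
    return init, run


if __name__ == "__main__":
    for node, tests in TESTS.items():
        for name, (scale, clients, read_only) in tests.items():
            init, run = pgbench_commands(scale, clients, read_only)
            print(f"{node} / {name}:")
            print("  " + " ".join(init))
            print("  " + " ".join(run))
```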
metrics
● TPS: transactions-per-second
– measures multiple things
● Load Time: time to build the database
run many many times
Box Plot
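[Figure: example box plot of TPS across many runs. Values shown: 495, 587, 1685, 2537, and 6156 TPS. Marked points: Min, 10th percentile, 50th percentile (median), 90th percentile, Max. Annotations relative to the median: 0.3X, 0.4X, 1.7X, 4X.]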
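The five points of a box plot like this one (min, 10th percentile, median, 90th percentile, max) can be computed from a list of per-run TPS results. A minimal sketch using Python's statistics module: the sample numbers are made up for illustration and are not results from the talk, and the "inclusive" quantile method is just one reasonable choice.

```python
import statistics


def summarize(tps_runs):
    """Return the five box-plot points and each point as a multiple of the median."""
    deciles = statistics.quantiles(tps_runs, n=10, method="inclusive")
    median = statistics.median(tps_runs)
    points = {
        "min": min(tps_runs),
        "p10": deciles[0],   # first decile cut point = 10th percentile
        "median": median,
        "p90": deciles[-1],  # last decile cut point = 90th percentile
        "max": max(tps_runs),
    }
    relative = {name: round(value / median, 1) for name, value in points.items()}
    return points, relative


if __name__ == "__main__":
    # Hypothetical TPS results from repeated runs of one test.
    sample = [510, 620, 980, 1400, 1650, 1700, 1900, 2300, 2600, 5900]
    points, relative = summarize(sample)
    for name in points:
        print(f"{name:>6}: {points[name]:8.1f} TPS  ({relative[name]}X of median)")
```

Reporting the spread relative to the median makes the "somebody else's peak load" variability visible at a glance.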
when the smoke clears ...
features
          Versions          Extensions   Superuser   Replication
EC2       Any               All          Yes         Yes
RDS       9.3 only          Some         No          Yes
Heroku    9.3, 9.4, betas   Some         No          Yes

          Auto-Failover   Snapshots   Extras   Support
EC2       No              DIY         DIY      No
RDS       Yes*            Yes         No       No
Heroku    Yes             Yes         Yes      Yes
[Chart: Small Instance Pricing. Cost per month ($), Archive vs. HA, for EC2, RDS, and Heroku.]
[Chart: Large Instance Pricing. Cost per month ($), Archive vs. HA, for EC2, RDS, and Heroku.]
[Chart: Small Node Load Time, In-Memory DB (smaller is faster). Minutes, median and 90%, for EC2, Heroku, RDS, and RDS HA.]
[Chart: Small Node Load Time, On-Disk DB (smaller is faster). Minutes, median and 90%, for EC2, Heroku, RDS, and RDS HA.]
[Chart: Large Node Load Time, In-Memory DB (smaller is faster). Minutes, median and 90%, for EC2, Heroku, RDS, and RDS HA.]
[Chart: Large Node Load Time, On-Disk DB (smaller is faster). Minutes, median and 90%, for EC2, Heroku, RDS, and RDS HA.]
[Chart: In-Memory RW Test, Small Node (taller is faster). TPS, median and 90%, for EC2, Heroku, RDS, and RDS HA.]
[Chart: In-Memory RW Test, Large Node (taller is faster). TPS, median and 90%, for EC2, Heroku, RDS, and RDS HA.]
[Chart: In-Memory RO Test, Small Node (taller is faster). TPS, median and 90%, for EC2, Heroku, RDS, and RDS HA.]
[Chart: In-Memory RO Test, Large Node (taller is faster). TPS, median and 90%, for EC2, Heroku, RDS, and RDS HA.]
[Chart: On-Disk RW Test, Small Node (taller is faster). TPS, median and 90%, for EC2, Heroku, RDS, and RDS HA.]
[Chart: On-Disk RW Test, Large Node (taller is faster). TPS, median and 90%, for EC2, Heroku, RDS, and RDS HA.]
What's Next
More Clouds
● Rackspace
● Digital Ocean
● OpenShift
● Google Compute Engine
More Benchmarks
● OLTPBench?
– Wikipedia, Auctionmark, Epinions
● DVDStore?
● New benchmark?
– really need something more “webby”
● NoSQLish benchmark?
Better Visualizations
● better graphs
● automated graph
generation
● detailed response
times and time
graphs
“running with scissors”
● test for pure ephemeral instances
● no transaction log
● local SSD
● just for RO load-balancing
more shooting
● Josh Berkus: josh@pgexperts.com
– www.pgexperts.com
● More Shootouts
– www.databasesoup.com
– https://github.com/manageacloud/cloud-benchmark-postgres/
– pgConf.US NYC, pgCon Ottawa
Copyright 2015 PostgreSQL Experts Inc. Released under the Creative Commons
Share-Alike 3.0 License. All images and trademarks are the property of their
respective owners.
