Latency: the #1 metric of your
cloud
CloudStack User Group, 14 Mar 2019
ops/transactionspersecond
Dependency of any transaction processing system
Latency
Databases
Web Servers
Load balancers
Storage systems
Etc.
core core core core
task
Balancing work to be done with system resources
core core core core
task task
Balancing work to be done with system resources – 50%
core core core core
task task task task
Balancing work to be done with system resources – 100%
Latency
opspersecond
increasingload0->100%
Elasticity of an “ideal system in vacuum”
Latency
opspersecond
Elasticity of real systems
core core core core
task task task task task task
System overload
Latency
opspersecond
Regimes of any transaction processing system
Latency
opspersecond
best service
Latency
opspersecond
best service
lowest cost per
delivered resource
Latency
opspersecond
best service
lowest cost per
delivered resource
only pain
Latency
opspersecond
best service
lowest cost per
delivered resource
only pain
systemthroughput
Latency
opspersecond
best service
lowest cost per
delivered resource
only pain
benchmarks
Latency
opspersecond
10kIOPS
https://www.digitalocean.com/docs/volumes/
https://www.digitalocean.com/docs/volumes/overview/
Metrics of a Cloud Storage Layer
https://www.digitalocean.com/docs/volumes/overview/
Latency
The most
important
metric is
Missing!
Where is my Latency?!
random read/write 50/50, 4k, QD 1 avg. latency
DreamHost/DreamCompute (Ceph) 4.51 ms
DigitalOcean (Ceph) 1.75 ms
OVH (Ceph) 1.53 ms
Comparing some public Clouds
DreamHost/DreamCompute (Ceph) 4.51 ms
DigitalOcean (Ceph) 1.75 ms
OVH (Ceph) 1.53 ms
AWS EBS gp2 10k ($333/mo) 0.29 ms
Comparing some public Clouds
random read/write 50/50, 4k, QD 1 avg. latency
5 to 15 times faster!
For a lot of $$$ thought!
DreamHost/DreamCompute (Ceph) 4.51 ms
DigitalOcean (Ceph) 1.75 ms
OVH (Ceph) 1.53 ms
AWS EBS gp2 10k ($333/mo) 0.29 ms
eApps ($50/mo) 0.41 ms
Togglebox ($20/mo) 0.21 ms
StorPool BCP (Best Current Practice) 0.17 ms
random read/write 50/50, 4k, QD 1 avg. latency
Best-of-breed Clouds
A small experiment
Latency
opspersecond
10kIOPS
2ms
A volume with 2ms latency…
Latency
opspersecond
10kIOPS
0.3ms
…Vs. a volume with 0.3ms latency
4 vCPUs, 4 GB RAM
vdisk with 10k IOPS
"fast" volume with approx 0.3 ms latency (QD 1)
"slow" volume with 2 ms latency (QD 1)
with dm-delay in host
Both volumes are on the same SSD pool
Both volumes measure 10k IOPS flat
Same test VMs in all Cloud providers 2 disks – slow & fast
4 vCPUs, 4 GB RAM
vdisk with 10k IOPS
pgbench --client=8 --jobs=4 
--progress=1 --time=5 pgbench4x
database size: 16 GB (4x RAM)
https://wiki.postgresql.org/wiki/Pgbenchtesting
Same test
DEMO
(embedded video on next slide)
/or click here if video does not play/
Click here if video does not play
4 vCPUs, 4 GB RAM
vdisk with 10k IOPS
2ms storage latency = 900 TPS @ 8 ms pgbench
Results – slow volume
4 vCPUs, 4 GB RAM
vdisk with 10k IOPS
2ms storage latency = 900 TPS @ 8 ms pgbench
-> if we ask for 1200 TPS -> pile up
Results – slow volume under pressure – congestion!
4 vCPUs, 4 GB RAM
vdisk with 10k IOPS
0.3ms storage latency = 1600 TPS @ 5 ms pgbench
Results – fast volume
4 vCPUs, 4 GB RAM
vdisk with 10k IOPS
0.3ms storage latency = 1600 TPS @ 5 ms pgbench
-> if we ask for 1200 TPS - no problem
Results – fast volume under pressure – no problem
8 vCPUs, 16 GB RAM, dedicated if offered, database size = 4x RAM
https://wiki.postgresql.org/wiki/Pgbenchtesting
Comparing Clouds – AWS seems to rule them all?
8 vCPUs, 16 GB RAM, dedicated if offered, database size = 4x RAM
https://wiki.postgresql.org/wiki/Pgbenchtesting
Comparing Clouds – who can run a database?
Boyan Ivanov
StorPool Storage
info@storpool.com
www.storpool.com
@storpool
Thank you + Q&A!

Boyan Ivanov - latency, the #1 metric of your cloud