This document discusses scaling heterogeneous systems on the cloud. It notes that complex systems will not be homogeneous, with different subsystems scaling in different ways. It emphasizes that high availability is critical and that scaling can be done vertically by upgrading hardware or horizontally by adding more nodes. The document provides advice on choosing scale-friendly systems like key-value stores as well as caveats about properly estimating cluster size and non-linear scalability.
Scaling Heterogeneous Systems on the Cloud
1. Scaling heterogeneous systems on the cloud
John D. Rowell
jd@escalari.com
@jdrowell
http://www.flickr.com/photos/klearchos/4632744945
Thursday, March 24, 2011
2. Works fine on my machine
http://www.flickr.com/photos/klearchos/4632744945
4. What to scale
Storage capacity
Processing power
Concurrency
Availability
http://www.flickr.com/photos/kwl/3219157599
5. Any complex system won't be homogeneous
Web servers
Databases
Caches
Queues
Workers
http://www.flickr.com/photos/core-materials/3838557749
6. Different subsystems scale differently
Master / Slave
Client sharding
Map / Reduce
Workers
http://www.flickr.com/photos/licassuncao/2500282164
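As a rough illustration of the "Client sharding" bullet, here is a minimal sketch in which the client itself decides which shard holds a key. The shard names and the `shard_for` helper are hypothetical, and simple modulo hashing is an assumption; the slides do not say which scheme the speaker had in mind.

```python
import hashlib

# Hypothetical shard names, for illustration only.
SHARDS = ["db-0", "db-1", "db-2"]


def shard_for(key, shards):
    """Pick a shard for a key on the client side.

    Uses a stable hash (MD5 of the key) instead of Python's built-in
    hash(), which is salted per process and would route the same key
    differently on different clients.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


if __name__ == "__main__":
    for user in ("alice", "bob", "carol"):
        print(user, "->", shard_for(user, SHARDS))
```

With modulo hashing, changing the number of shards remaps most keys, which is one reason sharded subsystems scale differently from stateless ones.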
7. No failure is not an option
Monitoring
Auto respawn
Live spares
http://www.flickr.com/photos/bfishadow/5197774708
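The "Auto respawn" bullet could look something like the sketch below: a tiny supervisor that restarts a process whenever it exits with an error. The `supervise` function, its restart limit, and its delay are all assumptions made for illustration; in practice an existing process supervisor would normally do this job.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=3, delay=0.1):
    """Run cmd and restart it each time it exits with a non-zero status.

    Returns the number of restarts that were needed before a clean exit.
    Raises RuntimeError once max_restarts is exhausted.
    """
    restarts = 0
    while True:
        exit_code = subprocess.call(cmd)
        if exit_code == 0:
            return restarts
        if restarts >= max_restarts:
            raise RuntimeError("gave up after %d restarts" % restarts)
        restarts += 1
        time.sleep(delay)  # brief pause so a crash loop does not spin


if __name__ == "__main__":
    # A child that exits cleanly needs no restarts.
    print(supervise([sys.executable, "-c", "pass"]))
```

Capping the restarts matters: a process that can never start should surface as an alert to monitoring rather than respawn forever.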
8. No sysadm? No problem!
Full featured APIs
Thresholds
Remote monitoring
Cloud monitoring
http://www.flickr.com/photos/deltamike/3536991945
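One way to read the "Thresholds" bullet is as a rule that turns monitoring samples into a scaling action. The sketch below assumes CPU utilisation samples between 0 and 1 and made-up thresholds of 75% and 25%; the function name and the numbers are illustrative, not taken from the talk.

```python
def scaling_decision(cpu_samples, high=0.75, low=0.25):
    """Return +1 to add a node, -1 to remove one, or 0 to do nothing.

    cpu_samples is a list of recent utilisation readings in [0, 1].
    The high and low thresholds are assumed values for illustration.
    """
    if not cpu_samples:
        return 0  # no data: do not act
    average = sum(cpu_samples) / len(cpu_samples)
    if average > high:
        return 1
    if average < low:
        return -1
    return 0


if __name__ == "__main__":
    print(scaling_decision([0.9, 0.8, 0.95]))
    print(scaling_decision([0.1, 0.2]))
    print(scaling_decision([0.5]))
```

Keeping a gap between the two thresholds avoids flapping, where the cluster adds and removes a node on alternating checks.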
9. Scaling Vertically
Stop and Go
Stick to 32 or 64 bits
Rolling upgrade
http://www.flickr.com/photos/doctorvee/3766965528
10. Scaling Horizontally
Network latency
Bandwidth use
Security
http://www.flickr.com/photos/thefangmonster/4024861156
11. Where's the node?
Cluster is dynamic
No broadcast or multicast
Use the API, Luke
http://www.flickr.com/photos/silvery/2414538926
12. Make all nodes equal
Dynamo
Memcached
ZeroConf
http://www.flickr.com/photos/jurvetson/3327872958
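Dynamo-style stores and many memcached client libraries let every node play the same role by placing nodes on a consistent-hash ring. The sketch below is a minimal, assumed implementation of that idea; the `HashRing` class, the node names, and the number of virtual points per node are hypothetical and do not reproduce any particular system's code.

```python
import bisect
import hashlib


def _hash(value):
    """Stable 64-bit hash of a string."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Minimal consistent-hash ring with virtual points per node."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual points per node (assumed value)
        self._points = []         # sorted positions on the ring
        self._owner = {}          # position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.replicas):
            point = _hash("%s#%d" % (node, i))
            if point in self._owner:
                continue  # skip the (very unlikely) position collision
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def node_for(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # First point clockwise from the key's position, wrapping around.
        i = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[i]]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    print(ring.node_for("user:42"))
```

The property that matters for scaling is that removing a node only moves the keys that node owned; keys on the remaining nodes stay where they were.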
13. Scale-friendly systems
Web servers
Riak
memcached
ElasticSearch
http://www.flickr.com/photos/xiaming/50391986
14. Scale semi-friendly systems
MongoDB
MySQL
PostgreSQL
http://www.flickr.com/photos/fenng/5489161388
15. Scale un-friendly systems
Redis*
Queues
Stream consumers
http://www.flickr.com/photos/addedentry/631590447
16. One AMI to rule them all
/opt is your friend
data on EBS
cloud-init
http://www.flickr.com/photos/thecaucas/3573910044
17. There is no spoon
Leverage your services
Key/value =~ scoreboard
Queue =~ Job list
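The two analogies above can be sketched with in-process stand-ins: a `queue.Queue` plays the shared job list and a plain dictionary plays the key/value scoreboard. In a real deployment these would be networked services; the `run_jobs` function, the status strings, and the squaring "work" are placeholders invented for this example.

```python
import queue
import threading


def run_jobs(jobs, worker_count=4):
    """Process jobs with a pool of workers and return the final scoreboard.

    jobs maps a job id to an integer payload. The queue stands in for a
    queue service (the job list) and the dict for a key/value store
    (the scoreboard that anyone can read to see progress).
    """
    job_list = queue.Queue()
    scoreboard = {}
    lock = threading.Lock()

    for job_id, payload in jobs.items():
        scoreboard[job_id] = "queued"
        job_list.put((job_id, payload))

    def worker():
        while True:
            try:
                job_id, payload = job_list.get_nowait()
            except queue.Empty:
                return  # job list drained: this worker is done
            with lock:
                scoreboard[job_id] = "running"
            result = payload * payload  # placeholder for real work
            with lock:
                scoreboard[job_id] = "done:%d" % result

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return scoreboard


if __name__ == "__main__":
    print(run_jobs({"a": 2, "b": 3}))
```

Because workers only talk to the job list and the scoreboard, adding capacity is a matter of starting more workers, with no coordination between them.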