Scaling heterogeneous systems on the cloud

                            John D. Rowell
                           jd@escalari.com
                              @jdrowell


                               http://www.flickr.com/photos/klearchos/4632744945
Thursday, March 24, 2011
Works fine on my machine




                            http://www.flickr.com/photos/klearchos/4632744945
Yeah, but does it scale?




What to scale

                           Storage capacity
                           Processing power
                             Concurrency
                              Availability

                                  http://www.flickr.com/photos/kwl/3219157599
Any complex system won't be homogeneous
                           Web servers
                            Databases
                             Caches
                             Queues
                             Workers

                            http://www.flickr.com/photos/core-materials/3838557749
Different subsystems scale differently

                           Master / Slave
                           Client sharding
                            Map / Reduce
                               Workers

                                 http://www.flickr.com/photos/licassuncao/2500282164
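
Of the strategies on this slide, client sharding is the easiest to show in a few lines. The following is a minimal sketch, assuming the client simply hashes each key and takes the result modulo the number of shards. The shard host names and the `shard_for` function are hypothetical and do not come from the talk.

```python
import hashlib


def shard_for(key: str, shards: list) -> str:
    """Pick a shard for a key by hashing the key on the client side."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# Hypothetical shard addresses, for illustration only.
SHARDS = ["db-0.internal", "db-1.internal", "db-2.internal"]

if __name__ == "__main__":
    for user in ("alice", "bob", "carol"):
        print(user, "->", shard_for(user, SHARDS))
```

Note that with plain modulo placement most keys move to a different shard when the shard count changes, which is one reason sharded stores are harder to grow than stateless web servers.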
No failure is not an option

                            Monitoring
                           Auto respawn
                            Live spares


                                http://www.flickr.com/photos/bfishadow/5197774708
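
"Auto respawn" can be sketched as a small supervisor loop. This is an illustrative assumption about what such a loop might look like, not the tooling used in the talk: the `supervise` function, its `max_restarts` limit and the `delay` parameter are all hypothetical.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=3, delay=0.0):
    """Run cmd and restart it whenever it exits with a non-zero status.

    Returns the number of restarts that were performed.
    """
    restarts = 0
    while True:
        exit_code = subprocess.call(cmd)
        if exit_code == 0:
            return restarts      # clean exit, nothing to respawn
        if restarts >= max_restarts:
            return restarts      # give up; a real monitor would alert here
        restarts += 1
        time.sleep(delay)        # optional pause before respawning


if __name__ == "__main__":
    # A child process that always fails, to exercise the respawn path.
    failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print("restarts:", supervise(failing, max_restarts=2))
```

In practice a process supervisor or the cloud provider's own health checks would usually play this role; the loop only shows the idea.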
No sysadmin? No problem!

                           Full featured APIs
                               Thresholds
                           Remote monitoring
                            Cloud monitoring

                                   http://www.flickr.com/photos/deltamike/3536991945
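
The "Thresholds" item can be illustrated with a small decision function. This is a sketch under stated assumptions: the metric (average CPU utilisation as a fraction between 0 and 1), the `high` and `low` limits and the function name are hypothetical choices, not values from the talk.

```python
def scaling_decision(cpu_samples, high=0.75, low=0.25):
    """Return +1 (add a node), -1 (remove a node) or 0 (do nothing).

    cpu_samples is a list of recent CPU utilisation readings in [0, 1].
    """
    if not cpu_samples:
        return 0                 # no data, take no action
    average = sum(cpu_samples) / len(cpu_samples)
    if average > high:
        return 1
    if average < low:
        return -1
    return 0


if __name__ == "__main__":
    print(scaling_decision([0.9, 0.8]))   # sustained high load
    print(scaling_decision([0.5]))        # within the band
```

Keeping a gap between the two thresholds helps avoid adding and removing nodes in quick succession.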
Scaling Vertically


                          Stop and Go
                      Stick to 32 or 64 bits
                        Rolling upgrade


                                   http://www.flickr.com/photos/doctorvee/3766965528
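
A rolling upgrade can be sketched as upgrading nodes in small batches so the remaining nodes keep serving traffic. The code below assumes the actual per-node step (for example stop, resize, start) is supplied by the caller; `rolling_batches`, `rolling_upgrade` and the `upgrade` callable are hypothetical names for illustration.

```python
def rolling_batches(nodes, batch_size=1):
    """Yield groups of nodes to upgrade, one group at a time."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(nodes), batch_size):
        yield nodes[start:start + batch_size]


def rolling_upgrade(nodes, upgrade, batch_size=1):
    """Upgrade nodes batch by batch and return them in completion order.

    `upgrade` is a caller-supplied callable that handles a single node.
    """
    done = []
    for batch in rolling_batches(nodes, batch_size):
        for node in batch:
            upgrade(node)        # hypothetical step: stop, resize, start
            done.append(node)
    return done


if __name__ == "__main__":
    rolling_upgrade(["web-1", "web-2", "web-3"], print, batch_size=2)
```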
Scaling Horizontally


                           Network latency
                           Bandwidth use
                              Security


                                http://www.flickr.com/photos/thefangmonster/4024861156
Where's the node?


            Cluster is dynamic
         No broadcast or multicast
             Use the API, Luke


                            http://www.flickr.com/photos/silvery/2414538926
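
Without broadcast or multicast, nodes can be found by asking the provider's API which instances exist. The sketch below deliberately avoids any real SDK: `list_instances` is a hypothetical stand-in for such an API call, and the dictionary keys (`state`, `tags`, `private_ip`) and the `role` tag are assumptions made for illustration.

```python
def discover_nodes(list_instances, role):
    """Return private addresses of running instances tagged with `role`.

    `list_instances` is a hypothetical stand-in for a cloud API call; it
    must return dicts with "state", "tags" and "private_ip" keys.
    """
    return sorted(
        inst["private_ip"]
        for inst in list_instances()
        if inst["state"] == "running" and inst["tags"].get("role") == role
    )


def fake_list_instances():
    """Canned data standing in for a real API response."""
    return [
        {"state": "running", "tags": {"role": "web"}, "private_ip": "10.0.0.12"},
        {"state": "running", "tags": {"role": "cache"}, "private_ip": "10.0.0.20"},
        {"state": "stopped", "tags": {"role": "web"}, "private_ip": "10.0.0.13"},
        {"state": "running", "tags": {"role": "web"}, "private_ip": "10.0.0.11"},
    ]


if __name__ == "__main__":
    print(discover_nodes(fake_list_instances, "web"))
```

Because the cluster is dynamic, a caller would typically repeat this lookup periodically rather than cache the result forever.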
Make all nodes equal


                            Dynamo
                           Memcached
                            ZeroConf


                               http://www.flickr.com/photos/jurvetson/3327872958
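
One technique that lets all nodes be equal is consistent hashing, which Dynamo uses for data placement and which many memcached clients offer. The following is a minimal sketch of a hash ring with virtual nodes, not the implementation of either system; the `HashRing` class, its method names and the `replicas` default are hypothetical.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Minimal consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), replicas=64):
        self.replicas = replicas
        self._points = []    # sorted ring positions
        self._owner = {}     # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.replicas):
            point = _hash(f"{node}#{i}")
            if point in self._owner:
                continue     # extremely unlikely collision, skip it
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node):
        for i in range(self.replicas):
            point = _hash(f"{node}#{i}")
            if self._owner.get(point) == node:
                del self._owner[point]
                self._points.remove(point)

    def node_for(self, key):
        """Return the node owning the first ring position after the key."""
        if not self._points:
            raise LookupError("ring is empty")
        index = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[index]]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    print(ring.node_for("session:42"))
```

When a node leaves the ring, only the keys it owned are reassigned; keys owned by the other nodes stay where they were.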
Scale-friendly systems

                           Web servers
                               Riak
                            memcached
                           ElasticSearch

                                  http://www.flickr.com/photos/xiaming/50391986
Scale semi-friendly systems

                            MongoDB
                             MySQL
                           PostgreSQL


                                http://www.flickr.com/photos/fenng/5489161388
Scale un-friendly systems

                                Redis*
                                Queues
                           Stream consumers


                                  http://www.flickr.com/photos/addedentry/631590447
One AMI to rule them all


                           /opt is your friend
                              data on EBS
                                cloud-init


                                   http://www.flickr.com/photos/thecaucas/3573910044
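
One way to read "one AMI" together with cloud-init is that every node boots the same image and receives a small user-data document telling it which role to run. The sketch below generates such a document using cloud-init's `#cloud-config` header and `runcmd` key. The role names, the `/opt/app` directory and the `start-<role>` script layout are hypothetical, as is the `render_user_data` function.

```python
def render_user_data(role, app_dir="/opt/app"):
    """Build a cloud-config document that starts the service for one role.

    The role names and the start-<role> script layout are hypothetical.
    """
    if role not in ("web", "worker", "cache"):
        raise ValueError(f"unknown role: {role}")
    lines = [
        "#cloud-config",
        "runcmd:",
        f"  - [ {app_dir}/bin/start-{role} ]",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_user_data("web"))
```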
There is no spoon


           Leverage your services
           Key/value =~ scoreboard
              Queue =~ Job list


Caveats


                   Estimate cluster size
                   Non-linear scalability


                             http://www.flickr.com/photos/justin_glass/3793492335
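
The slide does not say which model to use for non-linear scalability. As one possible illustration, Amdahl's law (an assumption brought in here, not taken from the talk) shows why doubling the node count rarely doubles throughput, and gives a rough way to estimate cluster size. The function names and the `limit` parameter are hypothetical.

```python
def amdahl_speedup(nodes, parallel_fraction):
    """Speedup predicted by Amdahl's law for a given node count."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / nodes)


def nodes_for_speedup(target, parallel_fraction, limit=10_000):
    """Smallest node count reaching `target` speedup, or None if unreachable."""
    for n in range(1, limit + 1):
        if amdahl_speedup(n, parallel_fraction) >= target:
            return n
    return None


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        print(n, round(amdahl_speedup(n, 0.9), 2))
```

With 90% of the work parallelisable the speedup can never exceed 10, no matter how many nodes are added, so `nodes_for_speedup(20, 0.9)` returns `None`.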
Thank you!


                            John D. Rowell
                           jd@escalari.com
                              @jdrowell


                               http://www.flickr.com/photos/klearchos/4632744945