How To Scale v2


Slightly updated scaling presentation to include information on EC2

Published in: Technology
  • First barcamp Rails but principles applied elsewhere blog

    1. How to scale (with Ruby on Rails). George Palmer [email_address]
    2. Overview
       • Starting out
       • Scaling the database
       • Scaling the web server
       • User clusters
       • Caching
       • Elastic architectures
       • Links and Questions
    3. How you start out
       • Shared hosting
       • One web server and DB on the same machine
       • Application designed for one machine
       • Volume of traffic will depend on the host
       [Diagram: web server and DB together on shared hosting]
    4. Two servers
       • Possibly still shared hosting
       • Web server and DB on different machines
       • Minimal changes to code
       • Volume of traffic will depend on whether you have made it to dedicated machines
       [Diagram: web server and DB on separate machines]
    5. Scaling the database (1)
       • DB setup more suited to read-intensive applications (MySQL replication)
       • Should be on dedicated hosts
       • Minimal changes to code
       [Diagram: web server, master DB, three slaves]
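A minimal sketch of what the "minimal changes to code" could look like for a replicated setup: send writes to the master and rotate reads across the slaves. The `ReplicatedRouter` class, the connection names, and the naive "anything that is not a SELECT is a write" rule are assumptions for illustration, not something the slides specify.

```ruby
# Hypothetical router: writes go to the master, reads rotate across slaves.
class ReplicatedRouter
  def initialize(master, slaves)
    @master = master
    @slaves = slaves
    @next = 0
  end

  # Pick a connection name for a SQL statement.
  def connection_for(sql)
    return @master if write?(sql) || @slaves.empty?

    slave = @slaves[@next % @slaves.size]
    @next += 1
    slave
  end

  private

  # Naive assumption: anything that is not a SELECT is treated as a write.
  def write?(sql)
    sql !~ /\A\s*select\b/i
  end
end

router = ReplicatedRouter.new("master", %w[slave1 slave2 slave3])
puts router.connection_for("SELECT * FROM users")            # slave1
puts router.connection_for("UPDATE users SET name = 'Bob'")  # master
puts router.connection_for("select * from posts")            # slave2
```

Because replication is asynchronous, a read sent to a slave may briefly lag behind the master, so a real application may need to read its own recent writes from the master.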
    6. Scaling the database (2)
       • DB setup more suited to equal read/write applications (MySQL cluster)
       • Should be on dedicated hosts
       • Minimal changes to code
       [Diagram: web server, two master DBs in a MySQL cluster]
    7. Scaling the web server
       • A web server comprises "worker threads" that process work as it comes in
       [Diagram: web server with four worker threads in front of the DB farm]
    8. Load balancing
       • The choice of app server depends on the platform:
         • Rails (Mongrel, FastCGI)
         • PHP
         • J2EE
       • Some changes to code will be required
       [Diagram: load balancer, three app servers, DB farm]
    9. The story so far…
       • App servers continue to scale but the database side is somewhat limited…
       [Diagram: load balancer, three app servers, master DB with three slaves]
    10. User Clusters
       • For each user registered on the service, add an entry to a master database detailing where their user data is stored:
         • UserID
         • DB cluster
         • Basic authorisation details such as username, password, and any NLS settings
    11. User Clusters (2)
       • User clusters are themselves one of the two database setups outlined earlier
       • Query against the master DB: SELECT * FROM users WHERE username='Bob' AND …
       • Result: user_id=91732, db_cluster=2
       [Diagram: app server, master DB, User Cluster 1, User Cluster 2]
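The two-step lookup on this slide can be sketched as follows. The hashes stand in for the master DB and the user clusters; Bob's `user_id` and `db_cluster` come from the slide, while the "Alice" row, the `bio` field, and the `load_user_data` helper are made up for illustration.

```ruby
# Hypothetical in-memory stand-ins for the master DB and the user clusters.
MASTER_DB = {
  "Bob"   => { user_id: 91732, db_cluster: 2 },
  "Alice" => { user_id: 10417, db_cluster: 1 }  # invented example row
}.freeze

CLUSTERS = {
  1 => { 10417 => { bio: "Alice's data" } },
  2 => { 91732 => { bio: "Bob's data" } }
}.freeze

# Step 1: ask the master DB which cluster holds the user.
# Step 2: fetch the user's data from that cluster only.
def load_user_data(username)
  entry = MASTER_DB[username]
  return nil unless entry

  CLUSTERS.fetch(entry[:db_cluster])[entry[:user_id]]
end

p load_user_data("Bob")
```

Only the small directory lookup touches the master DB; all of the user's own data comes from a single cluster, which is what lets clusters be added independently.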
    12. User Clusters (3)
       • ID management becomes an issue
         • Best to use the master DB id as the user_id in the user cluster, or use UUIDs
         • If you let each cluster allocate ids, make sure to use an offset and increment (not plain auto_increment)
       • Other DBs, such as the session DB, must reference a user by id and DB cluster
       • Serious code changes may be required
       • You will want the ability to move users between clusters
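A small sketch of the offset-and-increment idea: each cluster allocates ids from its own arithmetic sequence so that two clusters can never hand out the same id. The `ClusterIdAllocator` class is hypothetical; MySQL exposes a similar idea through its `auto_increment_increment` and `auto_increment_offset` settings.

```ruby
# Each cluster hands out ids from its own sequence:
# offset, offset + increment, offset + 2 * increment, ...
class ClusterIdAllocator
  def initialize(offset:, increment:)
    # Offsets must be distinct per cluster and lie in 1..increment.
    unless (1..increment).cover?(offset)
      raise ArgumentError, "offset must be in 1..increment"
    end

    @next_id = offset
    @increment = increment
  end

  def next_id
    id = @next_id
    @next_id += @increment
    id
  end
end

cluster1 = ClusterIdAllocator.new(offset: 1, increment: 2)
cluster2 = ClusterIdAllocator.new(offset: 2, increment: 2)
p 3.times.map { cluster1.next_id }  # [1, 3, 5]
p 3.times.map { cluster2.next_id }  # [2, 4, 6]
```

The increment caps the number of clusters that can share the id space, so it is worth choosing one larger than the number of clusters you expect to run.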
    13. Architecture so far
       • As the number of app servers grows, it's a good idea to add a database connection manager (e.g. SQLRelay)
       • Extract the session, search, and translation databases onto their own machines
       • Add a background processor for long-running tasks (so they don't block app servers)
       • Use MySQL cluster (or equivalent) for any critical database
         • In a replication setup, a slave can be made a backup master
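To illustrate the background-processor point, here is a minimal in-process sketch using Ruby's built-in `Queue` and a single worker thread. It is a stand-in for the idea only: the `BackgroundProcessor` class is hypothetical, and a real deployment would run the worker as a separate process (the architecture diagram names BackgroundRB for this).

```ruby
# Hypothetical background processor: requests enqueue work and return at once,
# while a single worker thread runs the jobs in FIFO order.
class BackgroundProcessor
  def initialize
    @jobs = Queue.new
    @results = Queue.new
    @worker = Thread.new { run }
  end

  # Called from the request path: enqueue the job and return immediately.
  def enqueue(&job)
    @jobs << job
    :accepted
  end

  # Let the worker finish the queued jobs, stop it, and collect the results.
  def shutdown
    @jobs << :stop
    @worker.join
    results = []
    results << @results.pop until @results.empty?
    results
  end

  private

  def run
    while (job = @jobs.pop) != :stop
      @results << job.call
    end
  end
end

processor = BackgroundProcessor.new
puts processor.enqueue { (1..100).sum }  # accepted
p processor.shutdown                     # [5050]
```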
    14. Non-cached architecture
       [Diagram: load balancer; App Server 1, App Server 2, … App Server 50; DB connection manager; Master DB (shown twice); Session DB; Search DB; NLS DB; User Cluster 1 and User Cluster 2, each a master with three slaves; static files; BackgroundRB]
    15. Issues
       • The load balancer and database connection manager are single points of failure
         • Easily solved
       • 2PC (two-phase commit) is needed for some operations, for example when a user wants to be removed from the search database
         • 2PC is not supported in Rails
       • Rails doesn't support database switching for a given model
         • Can be done explicitly on each request, but this is expensive due to connection establishment overhead
         • Can be worked around with a connection manager, but a proper solution is required (a few gems are starting to emerge for this)
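For readers unfamiliar with 2PC, a minimal sketch of the protocol's shape: every database first votes on whether it can apply the change, and the change is committed everywhere only if all of them vote yes. The `Participant` class and the `can_prepare` flag are hypothetical stand-ins; a real implementation also needs durable logging and recovery when a participant or the coordinator fails mid-protocol.

```ruby
# Hypothetical participant wrapping one database.
class Participant
  attr_reader :name, :state

  def initialize(name, can_prepare: true)
    @name = name
    @can_prepare = can_prepare
    @state = :idle
  end

  # Phase 1: stage the change and vote yes (true) or no (false).
  def prepare
    @state = @can_prepare ? :prepared : :aborted
    @can_prepare
  end

  # Phase 2, all voted yes: make the staged change permanent.
  def commit
    @state = :committed
  end

  # Phase 2, someone voted no: discard the staged change.
  def rollback
    @state = :aborted
  end
end

# Coordinator: commit only if every participant votes yes.
# Note that all? stops asking at the first "no" vote.
def two_phase_commit(participants)
  if participants.all?(&:prepare)
    participants.each(&:commit)
    :committed
  else
    participants.each(&:rollback)
    :aborted
  end
end

dbs = [Participant.new("user_cluster"),
       Participant.new("search_db", can_prepare: false)]
p two_phase_commit(dbs)  # :aborted
p dbs.map(&:state)       # [:aborted, :aborted]
```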
    16. Making the most of your assets
       • In a lot of web applications a huge % of the hits are read only. Hence the need for caching:
         • Squid: a reverse proxy (or web server accelerator)
         • Memcached: a distributed memory caching solution
         • Language-specific caching, e.g. Rails fragment caching
    17. Squid
       • Lookup of pages is in memory; storage of files is on disk
       • Can also act as a load balancer
       • Pages can be expired by sending a PURGE request to the proxy
       • Any load balancer can be programmed to pick up pages cached by your app servers (if you know the rules under which it operates)
       [Diagram: Squid in front of App Server 1, App Server 2, …; pages in the cache are served from storage, pages not in the cache go to the app servers]
    18. Memcached
       • Location of data is independent of the physical machine
       • A really nice, simple API
         • SET
         • GET
         • DELETE
       • In Rails only a few lines of code will make a model cached
       • Also useful for tracking cross-machine information, e.g. dodgy user behaviour
       [Diagram: app servers talk to memcached spread across physical machines, and go to the DB farm when the data is not in memcached]
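A sketch of how SET, GET, and DELETE combine into a cached model lookup. `FakeMemcached` is an in-memory stand-in written for this example, not a real memcached client, and `UserFinder` with its `user:<id>` key scheme is likewise an assumption; the point is only the order of calls.

```ruby
# Hypothetical in-memory stand-in exposing the three calls named on the slide.
class FakeMemcached
  def initialize
    @store = {}
  end

  def set(key, value)
    @store[key] = value
  end

  def get(key)
    @store[key]
  end

  def delete(key)
    @store.delete(key)
  end
end

class UserFinder
  attr_reader :db_hits

  def initialize(cache, db)
    @cache = cache
    @db = db
    @db_hits = 0
  end

  # GET from the cache first; on a miss, read the DB and SET the result.
  def find(id)
    key = "user:#{id}"
    cached = @cache.get(key)
    return cached if cached

    @db_hits += 1
    row = @db[id]
    @cache.set(key, row) if row
    row
  end

  # On update, write the DB and DELETE the stale cache entry.
  def update(id, row)
    @db[id] = row
    @cache.delete("user:#{id}")
  end
end

finder = UserFinder.new(FakeMemcached.new, { 1 => { name: "Bob" } })
finder.find(1)
finder.find(1)
p finder.db_hits  # 1
```

Deleting on update, rather than overwriting the cached value, keeps the write path simple: the next read repopulates the entry from the database.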
    19. Cached architecture
       • Introduce Squid or nginx
       • Introduce memcached
         • Can go on every machine that has spare memory
         • Best suited to application servers, which have high CPU usage but low memory requirements
       • Introduce language-specific caching
    20. Cached architecture
       [Diagram: load balancer; App Server 1, App Server 2, … App Server 50, each with memcached (MC); DB connection manager; Master DB (shown twice); Session DB; Search DB; NLS DB; User Cluster 1 and User Cluster 2, each a master with three slaves; BackgroundRB; storage]
    21. Cached architecture
       • Wikipedia quote a cache hit rate of 78% for Squid and 7% for memcached
         • So only 15% of hits actually get to the DB!!
       • Performance is a whole new ball game, but we recently gained 15-20% by optimising our Rails configuration
         • But don't get carried away: at some point the time you spend exceeds the money saved
       • It's very easy to scale this architecture down to one machine
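The 15% figure follows from treating both hit rates as shares of total hits: 100 - 78 - 7 = 15. As a quick worked example (the request count is a made-up number, and `db_share_pct` is a helper named here for illustration):

```ruby
# Share of hits (in percent) that fall through both caches to the DB,
# assuming both hit rates are expressed as percentages of total hits.
def db_share_pct(squid_hit_pct, memcached_hit_pct)
  100 - squid_hit_pct - memcached_hit_pct
end

requests = 1_000_000  # hypothetical traffic figure
share = db_share_pct(78, 7)
puts "#{share}% of hits reach the DB: #{requests * share / 100} of #{requests}"
# prints: 15% of hits reach the DB: 150000 of 1000000
```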
    22. Elastic architectures
       • Based upon Amazon EC2
         • Allows you to create server images and launch instances on demand
         • Very cheap as you only pay for what you use
       • Currently no way to mount Amazon S3
         • Strictly speaking there are a few projects ongoing…
       • Still in beta
         • We've had network performance issues
       • An American VC was quoted as saying “Are you using EC2 for scaling? If not, you better have a good reason”
    23. Elastic architectures
       • WeoCeo now offer a similar service
       [Diagram: load balancer; App Servers 1 to 3, each with memcached (MC); a monitor; under high load the app server image in the EC2 cloud produces App Server 4 (with MC)]
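The monitor in this diagram can be thought of as a small decision rule: add a server when load is high, remove one when load is low, and stay within fixed bounds. The sketch below shows only that rule; the `ScalingMonitor` class, the thresholds, and the load metric are assumptions, and no real EC2 calls are made.

```ruby
# Hypothetical monitor: decides how many app servers should be running.
class ScalingMonitor
  def initialize(min_servers:, max_servers:, high_load:, low_load:)
    @min_servers = min_servers
    @max_servers = max_servers
    @high_load = high_load
    @low_load = low_load
  end

  # current: number of running app servers.
  # load_per_server: average load (0.0..1.0) reported by those servers.
  def desired_count(current, load_per_server)
    if load_per_server > @high_load
      [current + 1, @max_servers].min  # launch one more from the image
    elsif load_per_server < @low_load
      [current - 1, @min_servers].max  # shut one down to save money
    else
      current
    end
  end
end

monitor = ScalingMonitor.new(min_servers: 3, max_servers: 10,
                             high_load: 0.8, low_load: 0.3)
p monitor.desired_count(3, 0.9)  # 4
p monitor.desired_count(4, 0.1)  # 3
p monitor.desired_count(3, 0.1)  # 3
```

Keeping a gap between the high and low thresholds avoids launching and killing instances on every small fluctuation in load.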
    24. How far can it go?
       • For a truly global application with millions of users, in order of ease:
         • Have a cache on each continent
         • Make user clusters based on user location
           • Distribute the clusters physically around the world
         • Introduce app servers on each continent
         • If you must replicate your site globally, use transaction replication software, e.g. GoldenGate
    25. Useful Links
       [Links were not preserved in this transcript]
    26. Questions?