How to scale your web app

Scaling web applications, as presented at Barcamp London 2 by George Palmer

Published in: Technology
  • First barcamp Rails but principles applied elsewhere blog
  • Transcript of "How to scale your web app"

    1. How to scale (with Ruby on Rails)
       George Palmer, [email_address], 3dogsbark.com
    2. Overview
       • One server
       • Two servers
       • Scaling the database
       • Scaling the web server
       • User clusters
       • Final architecture
       • Caching
       • Cached architecture
       • Links
       • Questions
    3. How you start out
       • Shared hosting
       • One web server and DB on same machine
       • Application designed for one machine
       • Volume of traffic will depend on host
       [Diagram labels: DB, Web Server, Shared Hosting]
    4. Two servers
       • Possibly still shared hosting
       • Web server and DB on different machines
       • Minimal changes to code
       • Volume of traffic will depend on whether you have moved to dedicated machines
       [Diagram labels: DB, Web Server]
    5. Scaling the database (1)
       • DB setup more suited to read-intensive applications (MySQL replication)
       • Should be on dedicated hosts
       • Minimal changes to code
       [Diagram labels: Web Server, Master DB, Slave, Slave, Slave]
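
To make the replication setup on this slide concrete, here is a minimal sketch of read/write splitting: writes go to the master and reads rotate across the slaves. The `ReplicatedRouter` class and the host names are hypothetical and not from the talk; a real application would do this through its database adapter or a connection manager.

```ruby
# Hypothetical router: picks a host name for each SQL statement.
class ReplicatedRouter
  WRITE_VERBS = %w[INSERT UPDATE DELETE REPLACE].freeze

  def initialize(master, slaves)
    @master = master
    @slaves = slaves
    @next = 0
  end

  # Returns the name of the host that should run the statement.
  def host_for(sql)
    verb = sql.strip.split(/\s+/, 2).first.to_s.upcase
    # Writes (and everything, if there are no slaves) go to the master.
    return @master if WRITE_VERBS.include?(verb) || @slaves.empty?

    # Reads rotate across the slaves in round-robin order.
    host = @slaves[@next % @slaves.size]
    @next += 1
    host
  end
end

router = ReplicatedRouter.new("master", %w[slave1 slave2 slave3])
puts router.host_for("SELECT * FROM users")          # slave1
puts router.host_for("UPDATE users SET name = 'x'")  # master
puts router.host_for("SELECT * FROM posts")          # slave2
```

MySQL replication is asynchronous by default, so a read sent to a slave immediately after a write may not see that write yet; reads that must be current are usually sent to the master.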
    6. Scaling the database (2)
       • DB setup more suited to equal read/write applications (MySQL cluster)
       • Should be on dedicated hosts
       • Minimal changes to code
       [Diagram labels: Web Server, MySQL Cluster containing Master DB and Master DB]
    7. Scaling the web server
       • The web server comprises “worker threads” that process work as it comes in
       [Diagram labels: Web Server containing four Worker threads, DB Farm]
    8. Load balancing
       • Choice of app server depends on the platform:
         • Rails (Mongrel, FastCGI)
         • PHP
         • J2EE
       • Some changes to code will be required
       [Diagram labels: Load balancer, App Server, App Server, App Server, DB Farm]
    9. The story so far…
       • App servers continue to scale but the database side is somewhat limited…
       [Diagram labels: Load balancer, App Server, App Server, App Server, Master DB, Slave, Slave, Slave]
    10. User Clusters
        • For each user registered on the service, add an entry to a master database detailing where their user data is stored:
          • UserID
          • DB Cluster
          • Basic authorisation details such as username, password, any NLS settings
    11. User Clusters (2)
        User clusters are themselves one of the two database setups outlined earlier.
        Lookup against the master DB: SELECT * FROM users WHERE username='Bob' AND …
        Result: user_id=91732, db_cluster=2
        [Diagram labels: App Server, Master DB, User Cluster 1, User Cluster 2]
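
The lookup on this slide can be sketched as a two-step routing function: ask the master directory which cluster holds the user, then pick that cluster's connection. In-memory hashes stand in for the master DB here, and the host names, the second user and the `locate_user` helper are all hypothetical.

```ruby
# Stand-in for the master DB's users table (rows are illustrative).
MASTER_USERS = {
  "Bob"   => { user_id: 91732, db_cluster: 2 },
  "Alice" => { user_id: 91733, db_cluster: 1 }
}.freeze

# Stand-in for per-cluster connection settings (hypothetical host names).
CLUSTERS = {
  1 => "user-cluster-1.internal",
  2 => "user-cluster-2.internal"
}.freeze

# Returns the user's id and the host of the cluster holding their data,
# or nil when the username is not in the master directory.
def locate_user(username)
  row = MASTER_USERS[username]
  return nil unless row

  { user_id: row[:user_id], host: CLUSTERS.fetch(row[:db_cluster]) }
end

location = locate_user("Bob")
puts "user #{location[:user_id]} lives on #{location[:host]}"
```

Every later query for that user's data then goes to the returned host, so the master DB only ever handles the small directory lookup.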
    12. User Clusters (3)
        • ID management becomes an issue
          • Best to use the master DB id as user_id in the user cluster
          • If you let the cluster allocate ids, make sure it uses an offset and increment (not plain auto_increment)
        • Other DBs, such as session, must reference a user by id and DB cluster
        • Serious code changes may be required
        • You will want the ability to move users between clusters
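
A minimal sketch of the offset-and-increment idea, assuming two clusters: each cluster starts at its own offset and steps by the number of clusters, so the id ranges never collide. `ClusterIdAllocator` is a hypothetical illustration; in MySQL the same effect is normally configured with the `auto_increment_offset` and `auto_increment_increment` server variables.

```ruby
# Hypothetical per-cluster id allocator.
class ClusterIdAllocator
  def initialize(offset:, increment:)
    unless (1..increment).cover?(offset)
      raise ArgumentError, "offset must be between 1 and increment"
    end

    @next_id = offset
    @increment = increment
  end

  # Hands out the next id and advances by the shared increment.
  def next_id
    id = @next_id
    @next_id += @increment
    id
  end
end

cluster1 = ClusterIdAllocator.new(offset: 1, increment: 2)
cluster2 = ClusterIdAllocator.new(offset: 2, increment: 2)

ids1 = Array.new(3) { cluster1.next_id }  # 1, 3, 5
ids2 = Array.new(3) { cluster2.next_id }  # 2, 4, 6
puts ids1.inspect
puts ids2.inspect
```

Because the id sets are disjoint, a user can later be moved between clusters without renumbering. The increment fixes the maximum number of clusters, so it is worth choosing it larger than the current count.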
    13. The final architecture
        • As the number of app servers grows, it’s a good idea to add a database connection manager (e.g. SQLRelay)
        • Extract session, search and translation databases onto their own machines
        • Use MySQL cluster (or equivalent) for any critical database
          • In a replication setup you can make a slave a backup master
        • Add an NFS/SAN for static files
    14. The final architecture (2)
        [Diagram labels: Load balancer; App Server 1, App Server 2, … App Server 50; DB Connection Manager; Master DB, Master DB; Session DB; Search DB; NLS DB; User Cluster 1 (Master, Slave, Slave, Slave); User Cluster 2 (Master, Slave, Slave, Slave); NFS/SAN]
    15. Issues
        • Load balancer and database connection manager are single points of failure
          • Easily solved
        • 2PC (two-phase commit) is needed for some operations, for example when a user wants to be removed from the search database
          • 2PC is not supported in Rails
        • Rails doesn’t support database switching for a given model
          • Can be done explicitly on each request, but this is expensive due to connection establishment overhead
          • Can be worked around with a connection manager, but a proper solution is required (I may write a gem to do this)
    16. Making the most of your assets
        • In a lot of web applications a huge % of the hits are read only. Hence the need for caching:
          • Squid
            • A reverse-proxy (or webserver accelerator)
          • Memcached
            • Distributed memory caching solution
    17. Squid
        • Lookup of pages is in memory, storing of files is on disk
        • Can also act as a load balancer
        • Pages can be expired by sending a DELETE request to the proxy
        [Diagram labels: Squid (In cache / Not in cache), App Server 1, App Server 2, …, NFS/SAN]
    18. Memcached
        • Location of data is irrespective of physical machine
        • A really nice, simple API
          • SET
          • GET
          • DELETE
        • In Rails only a few lines of code will make a model cached
        • Also useful for tracking cross-machine information, e.g. dodgy user behaviour
        [Diagram labels: App Server and Memcached on one Physical Machine, App Server and Memcached on another Physical Machine, DB Farm (used when not in memcached)]
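
The SET/GET/DELETE API above is typically used in a read-through pattern: try the cache, fall back to the database on a miss, then store the result. The sketch below assumes nothing about any real memcached client; `FakeCache` is a hypothetical in-process stand-in with the same three operations, and `load_user_from_db` simulates the database.

```ruby
# Hypothetical stand-in for a memcached client (same three operations).
class FakeCache
  def initialize
    @store = {}
  end

  def set(key, value)
    @store[key] = value
  end

  def get(key)
    @store[key]
  end

  def delete(key)
    @store.delete(key)
  end
end

# Counts simulated database reads per user id.
DB_READS = Hash.new(0)

# Simulated database lookup (illustrative data only).
def load_user_from_db(id)
  DB_READS[id] += 1
  { id: id, name: "user-#{id}" }
end

# Read-through lookup: only a cache miss reaches the database.
def fetch_user(cache, id)
  key = "user:#{id}"
  cached = cache.get(key)
  return cached if cached

  user = load_user_from_db(id)
  cache.set(key, user)
  user
end

cache = FakeCache.new
fetch_user(cache, 7)    # miss: reads the database and fills the cache
fetch_user(cache, 7)    # hit: served from the cache
puts DB_READS[7]        # 1
cache.delete("user:7")  # expire the entry after the user is updated
```

Deleting the key whenever the underlying row changes keeps the cache from serving stale data; the next read repopulates it.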
    19. Cached Architecture
        • Introduce Squid
          • Acts as load balancer (note there are higher performing load balancers)
        • Introduce memcached
          • Can go on every machine that has spare memory
            • Best suited to application servers which have high CPU usage but low memory requirements
    20. Cached architecture
        [Diagram labels: Squid; App Server 1, App Server 2, … App Server 50, each with MC (MC = memcached); DB Connection Manager; Master DB, Master DB; Session DB; Search DB; NLS DB; User Cluster 1 (Master, Slave, Slave, Slave); User Cluster 2 (Master, Slave, Slave, Slave); NFS/SAN]
    21. Cached architecture
        • Wikipedia quotes a cache hit rate of 78% for Squid and 7% for memcached
          • So only 15% of hits actually get to the DB!!
        • Performance is a whole new ball game, but we recently gained 15-20% by optimising our Rails configuration
          • But don’t get carried away - at some point the time you spend exceeds the money saved
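
The 15% figure is simply what is left after both caches: 100% - 78% - 7% = 15%. A short worked example, assuming (as the slide's sum implies) that both hit rates are measured against total hits; the request volume and the `requests_reaching_db` helper are hypothetical.

```ruby
# Requests that fall through both caches and reach the database.
# Percentages are integers measured against total requests.
def requests_reaching_db(total, squid_hit_pct:, memcached_hit_pct:)
  total * (100 - squid_hit_pct - memcached_hit_pct) / 100
end

puts requests_reaching_db(1_000_000, squid_hit_pct: 78, memcached_hit_pct: 7)  # 150000
```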
    22. Cached architecture – 1 machine
        [Diagram labels: Squid; App Server 1, App Server 2, … App Server 5; Memcached; DB Connection Manager; Master DB, Master DB; Session DB; Search DB; NLS DB; User Cluster 1 (Master, Slave, Slave, Slave); NFS/SAN; Physical Machine]
    23. How far can it go?
        • For a truly global application, with millions of users, in order of ease:
          • Have a cache on each continent
          • Make user clusters based on user location
            • Distribute the clusters physically around the world
          • Introduce app servers on each continent
          • If you must replicate your site globally then use transaction replication software, e.g. GoldenGate
    24. Useful Links
        • http://www.squid-cache.org/
        • http://www.danga.com/memcached/
        • http://sqlrelay.sourceforge.net/
        • http://railsexpress.de/blog/
    25. Questions?