How to scale your web app

From Georgio_1999, 1 year ago

Scaling web applications, as presented at Barcamp London 2 by George Palmer

Slideshow transcript

Slide 1: How to scale (with Ruby on Rails)
George Palmer
george.palmer@gmail.com
3dogsbark.com

Slide 2: Overview
• One server
• Two servers
• Scaling the database
• Scaling the web server
• User clusters
• Final architecture
• Caching
• Cached architecture
• Links
• Questions
George Palmer, 17th February 2007

Slide 3: How you start out
[Diagram: Web Server and DB on shared hosting]
• Shared hosting
• One web server and DB on the same machine
• Application designed for one machine
• Volume of traffic will depend on the host

Slide 4: Two servers
[Diagram: Web Server, DB]
• Possibly still shared hosting
• Web server and DB on different machines
• Minimal changes to code
• Volume of traffic will depend on whether you have made it to dedicated machines

Slide 5: Scaling the database (1)
[Diagram: Web Server, Master DB, three Slaves]
• DB setup more suited to read-intensive applications (MySQL replication)
• Should be on dedicated hosts
• Minimal changes to code
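
The replication setup on this slide implies that writes go to the master while reads can be spread over the slaves. A minimal sketch of that routing follows, assuming a round-robin choice of slave and a crude "starts with SELECT" test for reads. `ReplicatedRouter`, `host_for`, and the host names are hypothetical, not part of the talk.

```python
import itertools


class ReplicatedRouter:
    """Route writes to the master and spread reads across the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin over the slaves; fall back to the master if there are none.
        self._readers = itertools.cycle(slaves or [master])

    def host_for(self, sql):
        """Return the name of the host that should run this statement."""
        # Simplification: treat any statement starting with SELECT as a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.master


router = ReplicatedRouter("master", ["slave1", "slave2", "slave3"])
reads = [router.host_for("SELECT * FROM users") for _ in range(4)]
write = router.host_for("UPDATE users SET name = 'Bob' WHERE id = 1")
```

Because MySQL replication is asynchronous, a read sent to a slave straight after a write may return stale data, which is one reason this layout suits read-heavy applications.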

Slide 6: Scaling the database (2)
[Diagram: Web Server, MySQL Cluster with two Master DBs]
• DB setup more suited to equal read/write applications (MySQL Cluster)
• Should be on dedicated hosts
• Minimal changes to code

Slide 7: Scaling the web server
[Diagram: Web Server containing four worker threads, DB Farm]
• The web server comprises “worker threads” that process work as it comes in

Slide 8: Load balancing
[Diagram: Load balancer, three App Servers, DB Farm]
• App server depends on your platform:
  – Rails (Mongrel, FastCGI)
  – PHP
  – J2EE
• Some changes to code will be required
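
The slide does not say which code changes are needed. One common case, assumed here, is session state: once requests rotate across app servers, anything held in one server's memory is invisible to the others. The sketch below uses hypothetical `AppServer` and `RoundRobinBalancer` classes, with a plain dict standing in for a shared session store.

```python
import itertools


class AppServer:
    """Toy app server; `shared_sessions` stands in for a session DB."""

    def __init__(self, name, shared_sessions=None):
        self.name = name
        # Without a shared store, each server only sees its own sessions.
        self.sessions = shared_sessions if shared_sessions is not None else {}

    def login(self, session_id, user):
        self.sessions[session_id] = user

    def whoami(self, session_id):
        return self.sessions.get(session_id)


class RoundRobinBalancer:
    """Hand out app servers in strict rotation."""

    def __init__(self, servers):
        self._servers = itertools.cycle(servers)

    def next_server(self):
        return next(self._servers)


# Local sessions: the second request lands on a server that never saw the login.
lb = RoundRobinBalancer([AppServer("app1"), AppServer("app2")])
lb.next_server().login("s1", "bob")
local_result = lb.next_server().whoami("s1")

# Shared sessions: every server reads and writes the same store.
store = {}
lb = RoundRobinBalancer([AppServer("app1", store), AppServer("app2", store)])
lb.next_server().login("s1", "bob")
shared_result = lb.next_server().whoami("s1")
```

With per-server sessions the lookup comes back empty, while the shared store returns the logged-in user regardless of which server handles the request.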

Slide 9: The story so far…
[Diagram: Load balancer, three App Servers, Master DB, three Slaves]
• App servers continue to scale but the database side is somewhat limited…

Slide 10: User Clusters
• For each user registered on the service, add an entry to a master database detailing where their user data is stored:
  – UserID
  – DB Cluster
  – Basic authorisation details such as username, password, any NLS settings

Slide 11: User Clusters (2)
[Diagram: App Server, Master DB, User Cluster 1, User Cluster 2]
Query from App Server to Master DB: SELECT * FROM users WHERE username='Bob' AND …
Result: user_id=91732 db_cluster=2
• User clusters are themselves one of the two database setups outlined earlier
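
The two-step lookup on this slide can be sketched as follows, with in-memory dicts standing in for the master DB and the user clusters. The `user_id` and `db_cluster` values for Bob come from the slide; everything else (`MASTER_DB`, `USER_CLUSTERS`, the function names, the second user, and the email fields) is hypothetical.

```python
# Hypothetical in-memory stand-ins for the master DB and the user clusters.
MASTER_DB = {
    "Bob": {"user_id": 91732, "db_cluster": 2},
    "Alice": {"user_id": 91733, "db_cluster": 1},
}
USER_CLUSTERS = {
    1: {91733: {"email": "alice@example.com"}},
    2: {91732: {"email": "bob@example.com"}},
}


def locate_user(username):
    """Step 1: ask the master DB which cluster holds this user."""
    row = MASTER_DB.get(username)
    if row is None:
        raise KeyError(f"unknown user: {username}")
    return row["user_id"], row["db_cluster"]


def load_user_data(username):
    """Step 2: fetch the user's data from the cluster the master named."""
    user_id, cluster = locate_user(username)
    return USER_CLUSTERS[cluster][user_id]


bob_location = locate_user("Bob")
bob_data = load_user_data("Bob")
```

Every request pays for the master lookup first, so that small table is a natural candidate for caching.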

Slide 12: User Clusters (3)
• ID management becomes an issue
  – Best to use the master DB id as user_id in the user cluster
  – If you let the cluster allocate ids, make sure to use offset and increment (not auto_increment)
• Other DBs, such as session, must reference a user by id and DB cluster
• Serious code changes may be required
• You will want the ability to move users between clusters
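
A minimal sketch of the "offset and increment" idea, assuming each cluster is given a distinct offset and all clusters share an increment equal to the number of clusters. `ClusterIdAllocator` is a hypothetical name; in MySQL the same effect is normally configured with the `auto_increment_offset` and `auto_increment_increment` server variables.

```python
class ClusterIdAllocator:
    """Hand out ids offset, offset + increment, offset + 2 * increment, ..."""

    def __init__(self, offset, increment):
        if not 1 <= offset <= increment:
            raise ValueError("offset must be between 1 and increment")
        self._next = offset
        self._increment = increment

    def allocate(self):
        value = self._next
        self._next += self._increment
        return value


# Two clusters: same increment, different offsets, so ids never collide.
cluster1 = ClusterIdAllocator(offset=1, increment=2)
cluster2 = ClusterIdAllocator(offset=2, increment=2)
ids1 = [cluster1.allocate() for _ in range(3)]
ids2 = [cluster2.allocate() for _ in range(3)]
```

Ids allocated this way stay unique when a user is later moved between clusters, which a per-cluster counter starting at 1 would not guarantee.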

Slide 13: The final architecture
• As the number of app servers grows, it’s a good idea to add a database connection manager (e.g. SQLRelay)
• Extract session, search and translation databases onto their own machines
• Use MySQL Cluster (or equivalent) for any critical database
  – In a replication setup you can make a slave a backup master
• Add an NFS/SAN for static files

Slide 14: The final architecture (2)
[Diagram: Load balancer; App Servers 1 … 50; DB Connection Manager; NFS/SAN; two Master DBs; Session DB; Search DB; NLS DB; User Clusters 1 and 2, each with a Master and three Slaves]

Slide 15: Issues
• Load balancer and database connection manager are single points of failure
  – Easily solved
• 2PC needed for some operations, for example when a user wants to be removed from the search database
  – 2PC not supported in Rails
• Rails doesn’t support database switching for a given model
  – Can do it explicitly on each request, but that is expensive due to connection establishment overhead
  – Can get round it if using a connection manager, but a proper solution is required (I may write a gem to do this)
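
To make the 2PC (two-phase commit) point concrete, here is a toy in-process coordinator for the slide's example of removing a user from more than one database. It is a sketch only: `Participant` and `remove_user_everywhere` are hypothetical, and a real implementation would also need durable logging, timeouts, and recovery.

```python
class Participant:
    """Toy resource manager, e.g. the user cluster or the search database."""

    def __init__(self, name, rows, can_commit=True):
        self.name = name
        self.rows = set(rows)
        self.can_commit = can_commit
        self._pending = None

    def prepare(self, user_id):
        """Phase 1: stage the delete and vote yes (True) or no (False)."""
        if not self.can_commit:
            return False
        self._pending = user_id
        return True

    def commit(self):
        """Phase 2, all voted yes: apply the staged delete."""
        self.rows.discard(self._pending)
        self._pending = None

    def rollback(self):
        """Phase 2, someone voted no: drop the staged delete."""
        self._pending = None


def remove_user_everywhere(user_id, participants):
    """Delete the user from every participant, or from none of them."""
    prepared = []
    for p in participants:
        if p.prepare(user_id):
            prepared.append(p)
        else:
            for q in prepared:
                q.rollback()
            return False
    for p in participants:
        p.commit()
    return True


users = Participant("users", {91732})
search = Participant("search", {91732})
ok = remove_user_everywhere(91732, [users, search])

users2 = Participant("users", {91732})
search_down = Participant("search", {91732}, can_commit=False)
failed = remove_user_everywhere(91732, [users2, search_down])
```

In the second run the search database votes no, so the user row in the other database is left untouched rather than being deleted on its own.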

Slide 16: Making the most of your assets
• In a lot of web applications a huge percentage of the hits are read-only. Hence the need for caching:
  – Squid
    • A reverse proxy (or web server accelerator)
  – Memcached
    • Distributed memory caching solution

Slide 17: Squid
[Diagram: Squid in front of App Server 1, App Server 2, … and the NFS/SAN, with separate paths for “In cache” and “Not in cache”]
• Lookup of pages is in memory, storing of files is on disk
• Can also act as a load balancer
• Pages can be expired by sending a DELETE request to the proxy

Slide 18: Memcached
[Diagram: two physical machines, each running an App Server and Memcached; DB Farm used when data is not in memcached]
• Location of data is irrespective of physical machine
• A really nice simple API
  – SET
  – GET
  – DELETE
• In Rails only a few lines of code will make a model cached
• Also useful for tracking cross-machine information
  – e.g. dodgy user behaviour
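
The SET/GET/DELETE calls are enough to put a cache in front of a model. The sketch below shows the usual read-through pattern, using a dict-backed `FakeMemcached` stand-in rather than a real client. `DB`, `find_user`, `rename_user`, and the key format are hypothetical.

```python
class FakeMemcached:
    """Dict-backed stand-in exposing the three calls named on the slide."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)


DB = {91732: {"username": "Bob"}}  # hypothetical database table
db_reads = 0                       # counts how often we fall through to the DB
cache = FakeMemcached()


def find_user(user_id):
    """Read through the cache; only hit the DB on a miss."""
    global db_reads
    key = f"user:{user_id}"
    user = cache.get(key)
    if user is None:
        db_reads += 1
        user = DB[user_id]
        cache.set(key, user)
    return user


def rename_user(user_id, username):
    """Write to the DB, then drop the stale cache entry."""
    DB[user_id] = {"username": username}
    cache.delete(f"user:{user_id}")


find_user(91732)
find_user(91732)
reads_after_two_lookups = db_reads
rename_user(91732, "Robert")
renamed = find_user(91732)["username"]
```

Deleting the key on write, rather than updating it, keeps the write path simple: the next read repopulates the cache from the database.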

Slide 19: Cached Architecture
• Introduce Squid
  – Acts as load balancer (note there are higher-performing load balancers)
• Introduce memcached
  – Can go on every machine that has spare memory
    • Best suited to application servers which have high CPU usage but low memory requirements

Slide 20: Cached architecture
[Diagram: Squid; App Servers 1 … 50, each with memcached (MC); DB Connection Manager; NFS/SAN; two Master DBs; Session DB; Search DB; NLS DB; User Clusters 1 and 2, each with a Master and three Slaves]

Slide 21: Cached architecture
• Wikipedia quotes a cache hit rate of 78% for Squid and 7% for memcached
  – So only 15% of hits actually get to the DB!!
• Performance is a whole new ball game, but we recently gained 15-20% by optimising our Rails configuration
  – But don’t get carried away - at some point the time you spend exceeds the money saved
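
The 15% figure follows from treating both hit rates as shares of all hits, as the slide does. A short worked check, where the request volume is a hypothetical number chosen for illustration:

```python
squid_hit_pct = 78      # from the slide
memcached_hit_pct = 7   # from the slide

# Whatever neither cache answers has to go to the database.
db_pct = 100 - squid_hit_pct - memcached_hit_pct

requests = 1_000_000    # hypothetical traffic volume
db_requests = requests * db_pct // 100
```

Under these assumptions, one million hits translate into 150,000 requests that reach the database.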

Slide 22: Cached architecture – 1 machine
[Diagram: one physical machine running Squid; App Servers 1 … 5; Memcached; DB Connection Manager; NFS/SAN; two Master DBs; Session DB; Search DB; NLS DB; User Cluster 1 with a Master and three Slaves]

Slide 23: How far can it go?
• For a truly global application with millions of users, in order of ease:
  – Have a cache on each continent
  – Make user clusters based on user location
    • Distribute the clusters physically around the world
  – Introduce app servers on each continent
  – If you must replicate your site globally then use transaction replication software, e.g. GoldenGate

Slide 24: Useful Links
• http://www.squid-cache.org/
• http://www.danga.com/memcached/
• http://sqlrelay.sourceforge.net/
• http://railsexpress.de/blog/

Slide 25: Questions?