How To Scale v2

From Georgio_1999, 1 year ago

Slightly updated scaling presentation to include information on EC


Tags

scale web applications barcampsheffield scalability scaling rails ruby architecture nginx

Slideshow transcript

Slide 1: How to scale (with Ruby on Rails)
George Palmer, george@meecard.com, 3dogsbark.com

Slide 2: Overview
• Starting out
• Scaling the database
• Scaling the web server
• User clusters
• Caching
• Elastic architectures
• Links and Questions
George Palmer 26th May 2007

Slide 3: How you start out
[Diagram: web server and DB on one shared host]
• Shared hosting
• One web server and DB on the same machine
• Application designed for one machine
• Volume of traffic will depend on the host

Slide 4: Two servers
[Diagram: web server and DB on separate machines]
• Possibly still shared hosting
• Web server and DB on different machines
• Minimal changes to code
• Volume of traffic will depend on whether you have made it to dedicated machines

Slide 5: Scaling the database (1)
[Diagram: web server, master DB, three slave DBs]
• DB setup more suited to read-intensive applications (MySQL replication)
• Should be on dedicated hosts
• Minimal changes to code
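The master/slave setup on this slide is usually paired with read/write splitting in the application or connection layer. The following is a minimal sketch of that idea, not the presenter's code: `ReplicatedRouter` and the server names are hypothetical, and real MySQL replication is asynchronous, so reads from a slave may briefly lag behind the master.

```python
import itertools


class ReplicatedRouter:
    """Hypothetical router: SELECTs go to slaves, everything else to the master."""

    def __init__(self, master, slaves):
        self.master = master
        # With no slaves configured, reads fall back to the master.
        self._slaves = itertools.cycle(list(slaves) or [master])

    def pick(self, sql):
        """Return the name of the server that should run this statement."""
        words = sql.split(None, 1)
        verb = words[0].upper() if words else ""
        if verb == "SELECT":
            # Reads are spread over the slaves in turn.
            return next(self._slaves)
        # Writes (and anything unrecognised) stay on the master.
        return self.master


router = ReplicatedRouter("master", ["slave1", "slave2", "slave3"])
print(router.pick("SELECT * FROM users"))
print(router.pick("UPDATE users SET name = 'Bob' WHERE id = 1"))
```

Sending only recognised reads to the slaves is the conservative choice: a statement the router does not understand can never be run against a read-only copy by mistake.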

Slide 6: Scaling the database (2)
[Diagram: web server in front of a MySQL Cluster with two master DBs]
• DB setup more suited to equal read/write applications (MySQL Cluster)
• Should be on dedicated hosts
• Minimal changes to code

Slide 7: Scaling the web server
[Diagram: web server containing four worker threads, in front of the DB farm]
• A web server consists of "worker threads" that process work as it comes in

Slide 8: Load balancing
[Diagram: load balancer in front of three app servers, which talk to the DB farm]
• The app server depends on your platform:
  – Rails (Mongrel, FastCGI)
  – PHP
  – J2EE
• Some changes to code will be required
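The slide does not say which balancing strategy is used. As an illustration only, here is a sketch of the simplest one, round robin, with a way to take a failed app server out of rotation. `RoundRobinBalancer` and the server names are hypothetical.

```python
class RoundRobinBalancer:
    """Hypothetical balancer that hands requests to app servers in turn."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._next = 0

    def pick(self):
        """Return the app server that should handle the next request."""
        if not self.servers:
            raise RuntimeError("no app servers available")
        server = self.servers[self._next % len(self.servers)]
        self._next += 1
        return server

    def add(self, server):
        """Put a new app server into rotation."""
        self.servers.append(server)

    def remove(self, server):
        """Take an app server out of rotation, e.g. after a failed health check."""
        self.servers.remove(server)


balancer = RoundRobinBalancer(["app1", "app2", "app3"])
print([balancer.pick() for _ in range(4)])
```

Because consecutive requests from one user can land on different app servers, per-user state such as sessions has to live somewhere shared, which is one likely source of the code changes the slide mentions.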

Slide 9: The story so far…
[Diagram: load balancer, three app servers, master DB with three slaves]
• App servers continue to scale but the database side is somewhat limited…

Slide 10: User Clusters
• For each user registered on the service, add an entry to a master database detailing where their user data is stored:
  – UserID
  – DB Cluster
  – Basic authorisation details such as username, password, any NLS settings

Slide 11: User Clusters (2)
[Diagram: the app server sends "SELECT * FROM users WHERE username='Bob' AND …" to the master DB, which returns user_id=91732 and db_cluster=2; the app server then uses User Cluster 1 or User Cluster 2]
• User clusters are themselves one of the two database setups outlined earlier
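The two-step lookup in this diagram can be sketched as follows. This is an illustration under stated assumptions: plain dictionaries stand in for the master DB and the user clusters, `load_user_data` is a hypothetical helper, and the stored profile is made up. Only `user_id=91732` and `db_cluster=2` come from the slide.

```python
# Stand-in for the master database: username -> where the user's data lives.
MASTER_DB = {
    "Bob": {"user_id": 91732, "db_cluster": 2},
}

# Stand-ins for the user clusters, keyed by cluster number then user_id.
# The profile content is invented for the example.
USER_CLUSTERS = {
    1: {},
    2: {91732: {"display_name": "Bob", "favourites": 26}},
}


def load_user_data(username):
    """Look the user up in the master DB, then read from their cluster."""
    entry = MASTER_DB.get(username)
    if entry is None:
        return None  # unknown user: no cluster is touched at all
    cluster = USER_CLUSTERS[entry["db_cluster"]]
    return cluster.get(entry["user_id"])


print(load_user_data("Bob"))
```

The master DB stays small because it holds one routing row per user, while the bulk of the data and the query load is spread across the clusters.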

Slide 12: User Clusters (3)
• ID management becomes an issue
  – Best to use the master DB id as user_id in the user cluster, or UUIDs
  – If you let the clusters allocate IDs, make sure they use an offset and increment (not plain auto_increment)
• Other DBs such as session must reference a user by id and DB cluster
• Serious code changes may be required
• You will want the ability to move users between clusters
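The offset-and-increment scheme can be sketched with a small generator. This is an illustration, with `cluster_ids` as a hypothetical name: each cluster starts at its own offset and steps by the total number of clusters, so two clusters never hand out the same ID. MySQL exposes the same idea through its `auto_increment_offset` and `auto_increment_increment` server variables.

```python
import itertools


def cluster_ids(offset, increment):
    """Yield IDs offset, offset + increment, offset + 2 * increment, ..."""
    next_id = offset
    while True:
        yield next_id
        next_id += increment


# Two clusters: both step by 2, one starts at 1 and the other at 2.
cluster_1 = cluster_ids(offset=1, increment=2)
cluster_2 = cluster_ids(offset=2, increment=2)

print(list(itertools.islice(cluster_1, 3)))
print(list(itertools.islice(cluster_2, 3)))
```

One consequence of this design is that the increment caps the number of clusters, so it is worth choosing a step larger than the number of clusters in use today.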

Slide 13: Architecture so far
• As the number of app servers grows it's a good idea to add a database connection manager (e.g. SQLRelay)
• Extract session, search and translation databases onto their own machines
• Add a background processor for long-running tasks (so they don't block the app servers)
• Use MySQL Cluster (or equivalent) for any critical database
  – In a replication setup you can make a slave a backup master

Slide 14: Non-cached architecture
[Diagram: load balancer, App Servers 1 … 50, BackgroundRB, DB connection manager, two master DBs, session DB, search DB, NLS DB, static files, and User Clusters 1 and 2 (each a master with three slaves)]

Slide 15: Issues
• Load balancer and database connection manager are single points of failure
  – Easily solved
• Two-phase commit (2PC) is needed for some operations, for example when a user wants to be removed from the search database
  – 2PC is not supported in Rails
• Rails doesn't support database switching for a given model
  – Can do it explicitly on each request, but that is expensive due to connection establishment overhead
  – Can get round this if using a connection manager, but a proper solution is required (a few gems are starting to emerge on this)

Slide 16: Making the most of your assets
• In a lot of web applications a huge % of the hits are read-only. Hence the need for caching:
  – Squid
    • A reverse proxy (or web server accelerator)
  – Memcached
    • Distributed memory caching solution
  – Language-specific caching
    • e.g. Rails fragment caching

Slide 17: Squid
[Diagram: Squid in front of App Servers 1, 2, …; pages in the cache are served from storage, pages not in the cache go to the app servers]
• Lookup of pages is in memory, storing of files is on disk
• Can also act as a load balancer
• Pages can be expired by sending DELETE request to proxy
• Can program any load balancer to pick up pages cached by your app servers (if you know the rules under which it operates)

Slide 18: Memcached
[Diagram: two physical machines, each running an app server and memcached; the DB farm is used for data not in memcached]
• Location of data is irrespective of physical machine
• A really nice simple API
  – SET
  – GET
  – DELETE
• In Rails only a few lines of code will make a model cached
• Also useful for tracking cross-machine information
  – e.g. dodgy user behaviour
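The SET/GET/DELETE API lends itself to a simple read-through pattern: try the cache, fall back to the database on a miss, and delete the key when the record changes. The sketch below illustrates that pattern only. `DictCache` is a hypothetical in-memory stand-in, not a real memcached client, and `CachedUserStore` is an invented name.

```python
class DictCache:
    """In-memory stand-in exposing the three operations named on the slide."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


class CachedUserStore:
    """Hypothetical model wrapper: reads go through the cache first."""

    def __init__(self, cache, db):
        self.cache = cache
        self.db = db          # a dict standing in for the database
        self.db_reads = 0     # counts how often the database is hit

    def find(self, user_id):
        key = f"user:{user_id}"
        user = self.cache.get(key)
        if user is None:
            # Cache miss: read from the database and remember the result.
            self.db_reads += 1
            user = self.db.get(user_id)
            if user is not None:
                self.cache.set(key, user)
        return user

    def update(self, user_id, user):
        # Write to the database, then drop the stale cache entry.
        self.db[user_id] = user
        self.cache.delete(f"user:{user_id}")


store = CachedUserStore(DictCache(), {1: "Bob"})
store.find(1)
store.find(1)
print(store.db_reads)
```

Deleting on update rather than overwriting keeps the write path simple: the next read repopulates the cache from the database.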

Slide 19: Cached architecture
• Introduce squid or nginx
• Introduce memcached
  – Can go on every machine that has spare memory
• Best suited to application servers which have high CPU usage but low memory requirements
• Introduce language-specific caching

Slide 20: Cached architecture
[Diagram: as the non-cached architecture, with memcached (MC) on each of App Servers 1 … 50 and "Storage" alongside the session, search and NLS DBs; load balancer, BackgroundRB, DB connection manager, two master DBs, and User Clusters 1 and 2 (each a master with three slaves)]

Slide 21: Cached architecture
• Wikipedia quotes a cache hit rate of 78% for squid and 7% for memcached
  – So only 15% of hits actually get to the DB!!
• Performance is a whole new ball game but we recently gained 15-20% by optimising our Rails configuration
  – But don't get carried away - at some point the cost of the time you spend exceeds the money saved
• It's very easy to scale this architecture down to one machine
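The 15% figure follows from treating both hit rates as shares of all incoming hits, which is how the slide adds them up. A short worked version is below. The `db_requests` helper and the request volume are hypothetical, chosen only to make the arithmetic concrete.

```python
SQUID_HIT_PERCENT = 78      # from the slide
MEMCACHED_HIT_PERCENT = 7   # from the slide

# Whatever neither cache answers has to be served by the database.
DB_PERCENT = 100 - SQUID_HIT_PERCENT - MEMCACHED_HIT_PERCENT


def db_requests(total_requests):
    """Number of requests expected to reach the DB at these hit rates."""
    return total_requests * DB_PERCENT // 100


print(DB_PERCENT)
print(db_requests(1_000_000))  # hypothetical volume of one million hits
```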

Slide 22: Elastic architectures
• Based upon Amazon EC2
  – Allows you to create server images and launch instances on demand
  – Very cheap as you only pay for what you use
• Currently no way to mount Amazon S3
  – Strictly speaking there are a few projects ongoing…
• Still in Beta
  – We've had network performance issues
• An American VC was quoted as saying "Are you using EC2 for scaling? If not, you better have a good reason"

Slide 23: Elastic architectures
[Diagram: a monitor detects high load on App Servers 1 to 3 (each with memcached) behind the load balancer; the EC2 cloud produces App Server 4 from the app server image]
• WeoCeo now offer a similar service
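The monitor's decision in this diagram can be sketched as a pure function: given the current load and a per-instance capacity, work out how many app server instances to launch or shut down. Everything here is an assumption made for illustration. The function names, the load units and the capacity figure are hypothetical, and no real EC2 API call is shown.

```python
import math


def instances_needed(total_load, capacity_per_instance, min_instances=1):
    """Smallest instance count that covers the load, never below the minimum."""
    return max(min_instances, math.ceil(total_load / capacity_per_instance))


def scaling_action(running, total_load, capacity_per_instance):
    """Positive result: launch that many instances. Negative: terminate them."""
    return instances_needed(total_load, capacity_per_instance) - running


# Three app servers running, each assumed to handle 100 requests per second,
# and the monitor sees 350 requests per second in total.
print(scaling_action(running=3, total_load=350, capacity_per_instance=100))
```

Keeping the decision separate from the code that actually starts and stops instances makes the policy easy to test without touching a cloud account.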

Slide 24: How far can it go?
• For a truly global application, with millions of users - in order of ease:
  – Have a cache on each continent
  – Make user clusters based on user location
    • Distribute the clusters physically around the world
  – Introduce app servers on each continent
  – If you must replicate your site globally then use transaction replication software, e.g. GoldenGate

Slide 25: Useful Links
• http://www.squid-cache.org/
• http://nginx.net/
• http://www.danga.com/memcached/
• http://sqlrelay.sourceforge.net/
• http://railsexpress.de/blog/

Slide 26: Questions?