Ajax For Scalability

My talk from PHP Barcelona 2009. I discuss how Tuenti's AJAX architecture helps us scale and deliver a better user experience.

  1. AJAX for Scalability
     Erik Schultink
     PHP Barcelona – 30 Oct, 2009
  2. AJAX
     What is AJAX?
     “Asynchronous JavaScript and XML”
     Paradigm for client-server interaction
     Change state on client, without loading a complete HTML page
  3. Traditional HTML Browsing
     User clicks link
     Browser sends request
     Server receives, parses request, generates response
     Browser receives response and begins rendering
     Dependent objects (images, js, css) load and render
     Page appears
  4. AJAX Browsing
     User clicks link
     Browser sends request
     Server receives, parses request, generates response
     Browser receives response and begins rendering
     Dependent objects (images, js, css) load and render
     Page appears
  5. Why Asynchronous?
     Not always necessary to wait for server response
     Most of response is predictable
  6. Scalability?
     Throughput, not speed
     Ideally, capacity is a linear function of the number of machines
     In practice, linear below some boundary
  7. What is Tuenti.com?
  8. Tuenti.com
     Started 2007
     More than 20 billion pageviews/month
     More than 20k/second at peak
     Based in Madrid
     ~80 employees, 40 engineers
  9. How does Tuenti use AJAX?
     Only pageloads are login and home page
     Loader pulls in all JS/CSS
     Afterwards stay within one HTML page, rotating canvas area content
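One way to picture the "rotating canvas area" idea is a small controller that swaps content into a single page region and ignores responses that arrive after the user has already clicked elsewhere. This is a minimal sketch under assumptions: the class name `CanvasArea`, the ticket scheme, and the HTML strings are all hypothetical and not taken from the talk. The network call is left out so the logic runs anywhere.

```typescript
// Hypothetical sketch; names are illustrative, not Tuenti's code.
class CanvasArea {
  content = "";
  private latestTicket = 0;

  // Call when the user clicks a link; returns a ticket for the pending request.
  begin(): number {
    this.latestTicket += 1;
    return this.latestTicket;
  }

  // Call when a response arrives. A stale response (the user has navigated
  // again since it was requested) is dropped. Returns true if content changed.
  complete(ticket: number, html: string): boolean {
    if (ticket !== this.latestTicket) return false;
    this.content = html;
    return true;
  }
}

const canvasArea = new CanvasArea();
const profileTicket = canvasArea.begin(); // user clicks "profile"
const photosTicket = canvasArea.begin();  // then clicks "photos" before it loads
const photosApplied = canvasArea.complete(photosTicket, "<div>photos</div>");
const profileApplied = canvasArea.complete(profileTicket, "<div>profile</div>");
```

In a browser, `complete` would write into the canvas element instead of a string field; the late "profile" response is discarded, so the page keeps showing what the user asked for last.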
  10. How does Tuenti use AJAX?
      PHOTO UPLOAD
  11. How does Tuenti use AJAX?
      CHAT
  12. How does Tuenti use AJAX?
      CHAT
  13. How does Tuenti use AJAX?
  14. How does Tuenti use AJAX?
  15. Traditional Systems Architecture
      [Diagram labels: www.tuenti.com; Load Balancer; 3 × Web server farm]
  16. Traditional Systems Architecture
      [Diagram labels: www.tuenti.com; 12.45.34.179, 12.45.34.178; 2 × Load Balancer; 4 × Web server farm]
  17. Client-side Routing
      [Diagram labels: www.tuenti.com; wwwb1.tuenti.com, wwwb2.tuenti.com, wwwb3.tuenti.com, wwwb4.tuenti.com; 4 × Load Balancer; 4 × Web server farm]
      Linearly scalable …
  18. Client-side Routing
      [Same diagram as slide 17]
      Linearly scalable … except for top level
  19. Client-side Routing
      [Same diagram as slide 17]
      lots of content creation
      = lots of dynamic data
  20. Client-side Routing
      [Same diagram as slide 17, plus: Cache Farm]
      lots of dynamic data
      = lots of cache
      = internal network traffic
  21. Client-side Routing
      [Same diagram as slide 17, plus: 4 × Cache Farm]
      Partition cache
      Route requests to a farm near cache needed to respond
  22. Internal network savings
  23. Balancing Load
      Top-level requests to www.tuenti.com
      Each response tells the client which farm it should be using, based on a mapping
      Mapping can be changed to balance load, perform maintenance, etc.
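The routing scheme on the slides above can be sketched roughly as follows. The slides only say that a mapping exists and that the client is told which farm to use; everything else here is an assumption made for illustration: the partitioning rule (user id modulo the number of partitions), the `farm` field on the response, the `https` scheme, and the names `farmFor` and `FarmRouter`. Only the `wwwbN.tuenti.com` hostnames come from the slides.

```typescript
// Server side (hypothetical): partition index -> farm hostname.
type FarmMapping = string[];

// Assumed partitioning rule: user id modulo the number of partitions.
function farmFor(userId: number, mapping: FarmMapping): string {
  return mapping[userId % mapping.length];
}

// Client side: remember the farm named in the most recent response and
// send subsequent AJAX requests there.
class FarmRouter {
  private farm: string;

  constructor(initialFarm: string) {
    this.farm = initialFarm;
  }

  // `farm` is an assumed response field, not a documented one.
  observe(response: { farm?: string }): void {
    if (response.farm) this.farm = response.farm;
  }

  url(path: string): string {
    return `https://${this.farm}${path}`;
  }
}

let mapping: FarmMapping = [
  "wwwb1.tuenti.com", "wwwb2.tuenti.com", "wwwb3.tuenti.com", "wwwb4.tuenti.com",
];
const router = new FarmRouter(farmFor(6, mapping)); // 6 % 4 = 2, so wwwb3
const urlBefore = router.url("/photos");

// Maintenance on wwwb3: the operator repoints partition 2 to wwwb4.
mapping = [mapping[0], mapping[1], "wwwb4.tuenti.com", mapping[3]];
router.observe({ farm: farmFor(6, mapping) });
const urlAfter = router.url("/photos");
```

Because the client learns its farm from ordinary responses, changing the mapping on the server moves users gradually, without a redirect or a page reload.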
  24. Robust Browsing
      Don’t tightly couple layers
      Client and server are separate layers
  25. Robust Browsing
      Client-side code should be robust against server-side failures
      What can go wrong?
      Client loses connectivity
      Server-side error
      Web server overloaded
      Entire server farm overloaded
      Server farm loses connectivity
  26. Robust Browsing
      Options:
      Retry (carefully)
      Switch to alternate farm (carefully)
      Detect error, give user feedback
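The three options on slide 26 can be read as a small decision function. The sketch below is one plausible interpretation of "carefully", not the approach described in the talk: it caps the number of attempts, uses exponential backoff with random jitter so that many clients do not retry in lockstep, and only switches farms when the failure looks like overload. The type names, the policy fields, and the failure categories are all hypothetical.

```typescript
type FailureKind = "offline" | "server-error" | "overloaded";

type NextStep =
  | { kind: "retry"; farm: string; delayMs: number }
  | { kind: "switch"; farm: string; delayMs: number }
  | { kind: "give-up"; message: string };

interface RetryPolicy {
  maxAttempts: number; // failed attempts allowed before telling the user
  baseDelayMs: number; // backoff ceiling after the first failure
  maxDelayMs: number;  // upper bound on the backoff ceiling
}

// `farms` is assumed non-empty: current farm first, alternates after it.
function nextStep(
  failure: FailureKind,
  failedAttempts: number,
  farms: string[],
  policy: RetryPolicy,
  random: () => number = Math.random,
): NextStep {
  if (failedAttempts >= policy.maxAttempts) {
    return { kind: "give-up", message: "Could not reach the server. Please try again later." };
  }
  // Exponential backoff with full jitter: wait a random time below a
  // ceiling that doubles with each failure, up to maxDelayMs.
  const ceiling = Math.min(
    policy.maxDelayMs,
    policy.baseDelayMs * Math.pow(2, failedAttempts - 1),
  );
  const delayMs = Math.floor(random() * ceiling);
  // Switching farms cannot help an offline client, so only do it on overload.
  if (failure === "overloaded" && farms.length > 1) {
    return { kind: "switch", farm: farms[1], delayMs };
  }
  return { kind: "retry", farm: farms[0], delayMs };
}

const retryPolicy: RetryPolicy = { maxAttempts: 3, baseDelayMs: 200, maxDelayMs: 5000 };
const candidateFarms = ["wwwb1.tuenti.com", "wwwb2.tuenti.com"];
const half = () => 0.5; // fixed "random" value so the example is repeatable
const stepOne = nextStep("overloaded", 1, candidateFarms, retryPolicy, half);
const stepTwo = nextStep("server-error", 2, candidateFarms, retryPolicy, half);
const stepThree = nextStep("server-error", 3, candidateFarms, retryPolicy, half);
```

The caution matters because naive retries and failover multiply load exactly when a farm is already struggling; the cap and the jitter are there to keep the client from making an outage worse.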
  27. Caching
      Think like server-side scalability
      Cache data that is likely to be requested in the future
      Doesn’t have consistency issues
      Doesn’t change
  28. Cache Data in Client Browser
      More responsive for user
      Fewer requests to your servers
  29. Some details
      Lots of dependencies to load initially
      Loading can be brittle and slow
      Browser performance
  30. Some details
      Beware of external libraries
      Google JS libs are not static, not indefinitely cacheable
  31. Some details
      Friends do have consistency issues
      Add/remove friends
      Change avatar (rare)
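The caching slides above suggest keeping data in the client and dealing explicitly with the few items, such as the friend list, that can change. A minimal sketch of that idea follows, assuming a time-to-live plus explicit invalidation on add/remove events. The class `ClientCache`, the 60-second TTL, and the sample names are hypothetical. The loader is synchronous here only to keep the example short; a real one would be an asynchronous request.

```typescript
class ClientCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached value if it is still fresh; otherwise load and store it.
  get(key: string, load: () => V): V {
    const hit = this.entries.get(key);
    if (hit !== undefined && this.now() < hit.expiresAt) return hit.value;
    const value = load();
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }

  // Call when the client knows the data changed, e.g. after adding a friend.
  invalidate(key: string): void {
    this.entries.delete(key);
  }
}

let clockMs = 0;       // injected clock so the example is repeatable
let serverLoads = 0;   // counts simulated requests to the server
const loadFriends = () => {
  serverLoads += 1;
  return ["ana", "luis"];
};

const friendCache = new ClientCache<string[]>(60000, () => clockMs);
friendCache.get("friends", loadFriends); // miss: goes to the server
friendCache.get("friends", loadFriends); // hit: no request
const loadsAfterHit = serverLoads;

friendCache.invalidate("friends");       // user added a friend
friendCache.get("friends", loadFriends); // miss again
const loadsAfterInvalidate = serverLoads;

clockMs = 60000;                         // TTL elapsed
friendCache.get("friends", loadFriends); // expired: reload
const loadsAfterExpiry = serverLoads;
```

Invalidation covers changes the client itself makes; the TTL is a fallback for changes it never hears about, such as a friend changing an avatar.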
  32. Image Serving
      Tuenti serves ~2.5 billion images/day
      At peak, this is >6 Gbps and >70k hits/sec
      We use CDNs
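The figures on this slide can be related to each other with some back-of-the-envelope arithmetic. The numbers below are the slide's; the derived values are rough estimates only, because the slide gives an approximate daily total and lower bounds for the peaks, and because the sketch assumes that one hit is one image and that the bandwidth and request peaks coincide.

```typescript
const imagesPerDay = 2.5e9;   // "~2.5 billion images/day"
const peakHitsPerSec = 70000; // ">70k hits/sec" (lower bound)
const peakBitsPerSec = 6e9;   // ">6 Gbps" (lower bound)
const secondsPerDay = 24 * 60 * 60;

// Average request rate over a whole day: about 28,935 images/sec.
const avgHitsPerSec = imagesPerDay / secondsPerDay;

// Peak is roughly 2.4 times the daily average.
const peakToAverage = peakHitsPerSec / avgHitsPerSec;

// Implied average response size at peak: about 10.7 kB per image.
const bytesPerHitAtPeak = peakBitsPerSec / 8 / peakHitsPerSec;
```

An average response of around 10 kB would be consistent with traffic dominated by small images, although the slide does not state that directly.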
  33. What is a CDN?
      Content Delivery Network
  34. What is a CDN?
      Examples: Akamai, Limelight
      also dozens more, including Amazon
      Big, distributed object cache
      Pay per use
      either per request, per TB transfer, or per peak Mbps
  35. What is a CDN?
      Advantages:
      Outsource dev and infrastructure
      Geographically distributed
      Economies of scale
      Disadvantages:
      High cost
      Less control and transparency
      Commitments
  36. What affects image load time?
      Client internet connection
      Response time of CDN
      CDN cache hit rate
      Response time of Origin
  37. What affects image load time?
      Client internet connection
      Response time of CDN
      CDN cache hit rate
      Response time of Origin
  38. (no text)
  39. Monitor Performance from Client
      Closer to performance experienced by end-user
      Only way to get view of network issues faced by users (i.e., the last mile)
  40. (no text)
  41. Good Monitoring
      Percentiles > averages
      Less noise, more stable
      Comparable to SLA terms
      Establish baselines
      performance relative to past more important than absolute
      Avoid false positives
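To illustrate why percentiles tend to be more stable than averages, the sketch below computes both over a small, made-up set of image load times that contains one very slow sample. The nearest-rank percentile definition, the 25% regression tolerance, the baseline value, and the sample data are all assumptions chosen for the example, not values from the talk.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Flag a regression only when the current value exceeds the baseline by more
// than `tolerance`, which helps avoid false positives from normal variation.
function regressed(current: number, baseline: number, tolerance = 0.25): boolean {
  return current > baseline * (1 + tolerance);
}

// Hypothetical client-side load times in milliseconds.
const loadTimesMs = [80, 90, 95, 100, 105, 110, 120, 130, 150, 2000];

const meanMs = loadTimesMs.reduce((sum, x) => sum + x, 0) / loadTimesMs.length;
const p50Ms = percentile(loadTimesMs, 50);
const p90Ms = percentile(loadTimesMs, 90);
const p95Ms = percentile(loadTimesMs, 95);

const baselineP90Ms = 110; // assumed value from an earlier period
const p90Regressed = regressed(p90Ms, baselineP90Ms);
```

Here the mean (298 ms) is dominated by the single 2000 ms sample, while the median stays at 105 ms. The slow sample still shows up in the 95th percentile, which is the part of the distribution worth inspecting rather than discarding.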
  42. How to fix slow ISP?
      Choose better transit provider
      Set up peering (or get CDN to)
      Traffic management
  43. What affects image load time?
      Client internet connection
      Response time of CDN
      CDN cache hit rate
      Response time of Origin
  44. (no text)
  45. (no text)
  46. Quality of End-User Experience
      vs.
      Cost
  47. We use multiple CDNs, and shift content based on price/performance.
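The talk does not say how content is shifted between CDNs, so the following is only one conceivable rule: pick the cheapest CDN whose measured 90th-percentile latency meets a target, and fall back to the fastest one if none does. The CDN names, prices, latencies, and the function `pickCdn` are all invented for illustration.

```typescript
interface CdnStats {
  id: string;
  p90Ms: number;     // client-measured 90th-percentile response time
  costPerTb: number; // price per TB transferred, in arbitrary units
}

function pickCdn(cdns: CdnStats[], maxP90Ms: number): CdnStats {
  if (cdns.length === 0) throw new Error("no CDNs configured");
  const fastEnough = cdns.filter((c) => c.p90Ms <= maxP90Ms);
  if (fastEnough.length > 0) {
    // Cheapest among those that meet the latency target.
    return fastEnough.reduce((best, c) => (c.costPerTb < best.costPerTb ? c : best));
  }
  // Nothing meets the target: prefer the fastest available.
  return cdns.reduce((best, c) => (c.p90Ms < best.p90Ms ? c : best));
}

// Hypothetical measurements.
const cdnStats: CdnStats[] = [
  { id: "cdn-a", p90Ms: 180, costPerTb: 40 },
  { id: "cdn-b", p90Ms: 250, costPerTb: 25 },
  { id: "cdn-c", p90Ms: 400, costPerTb: 15 },
];

const relaxedChoice = pickCdn(cdnStats, 300).id; // "cdn-b"
const strictChoice = pickCdn(cdnStats, 100).id;  // "cdn-a"
```

A rule like this depends on the client-side measurements described earlier in the deck: without per-CDN latency data from real users, the quality side of the trade-off against cost cannot be evaluated.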
  48. Know your content
  49. Know your content
  50. Know your content
  51. Know your content
      30
      75
      200
  52. Know your content
      600
  53. Know your content
      120
  54. Know your content
  55. Pre-fetch Content
      Exploit predictable user behavior
      Ex: clicking to next photo in an album
      Simple solution – load next image hidden
      Client browser will cache it (next response < 100 ms)
      Increase tolerance for slow response time
  56. Pre-fetch Content
      More complex solution
      Pre-fetch next canvas (full html), render in background – rotate in on Next
      Even more complex
      Instantiate HTML template w/ data on client
      Pre-fetch data X photos in advance, render Y templates in advance with this data
  57. Pre-fetch Content
      Problems:
      Rendering still takes time
      Increases browser load
      Need to set cache headers correctly
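The "simple solution" from the pre-fetch slides can be sketched as a small helper that, whenever a photo is viewed, starts loading the next few photos in the album exactly once. The class `Prefetcher`, the look-ahead of two, and the URLs are hypothetical. The loader is injected so the logic can run outside a browser; in a page it could be something like `(url) => { new Image().src = url; }`, which relies on the browser cache and therefore on correct cache headers.

```typescript
class Prefetcher {
  private requested = new Set<string>();
  private urls: string[];
  private ahead: number;
  private load: (url: string) => void;

  constructor(urls: string[], ahead: number, load: (url: string) => void) {
    this.urls = urls;
    this.ahead = ahead;
    this.load = load;
  }

  // Call when the user views the photo at `index`; starts loading the next
  // `ahead` photos that have not been requested yet.
  onView(index: number): void {
    const last = Math.min(index + this.ahead, this.urls.length - 1);
    for (let i = index + 1; i <= last; i++) {
      const url = this.urls[i];
      if (!this.requested.has(url)) {
        this.requested.add(url);
        this.load(url);
      }
    }
  }
}

const albumUrls = ["p0.jpg", "p1.jpg", "p2.jpg", "p3.jpg", "p4.jpg"];
const startedLoads: string[] = [];
const prefetcher = new Prefetcher(albumUrls, 2, (url) => {
  startedLoads.push(url);
});

prefetcher.onView(0); // starts p1.jpg and p2.jpg
prefetcher.onView(1); // p2.jpg already requested, so only p3.jpg starts
prefetcher.onView(4); // last photo: nothing left to fetch
```

Keeping the look-ahead small is a deliberate trade-off: each extra photo fetched in advance adds browser load and bandwidth cost for content the user may never view.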
  58. Image delivery
      Small images: High request, low volume
      Most cost-effective to cache in memory
      Large images: High volume, low requests, greater tolerance for latency
  59. What affects image load time?
      Client internet connection
      Response time of CDN
      CDN cache hit rate
      Response time of Origin
  60. Monitor Performance from Client
      cold servers online
  61. What affects image load time?
      Client internet connection
      Response time of CDN
      CDN cache hit rate
      Response time of Origin
  62. Improve Origin Response
      Deploy cache layers (Squid, Varnish)
      Optimize web servers for static content delivery
      More machines
      Avoid RAID, file system
  63. Summary
      Cache and pre-fetch content
      Client-side routing
      Client-side monitoring
  64. General caveats
      Be careful with browser performance
      Look carefully at what’s in the worst 5%
      Don’t assume that high values are incorrect
  65. More
      jobs.tuenti.com
      Presentation
      Saturday 14:20
      Challenge (win a PS3)
      Talk: Continuous Integration
      Davide Mendolia, Saturday 12:00
  66. More? Ask us!
  67. More? Ask us!
  68. Q & A
