AJAX for Scalability, by Erik Schultink (PHP Barcelona, 30 Oct 2009)
  • JavaScript and XML are just technologies; “Asynchronous” is what’s important – the shift in thinking from web browsing as serial page by page to more fluid navigation that’s wholly contained within the same HTML page. I’m not going to go much into implementation, etc – it’s a lot of detail, and talking about cross-browser compatibility isn’t so fun or interesting. Focus on approaches – what we’ve learned from scaling on the server side can be applied to client side.
  • Using AJAX in application design allows steps 1-6 to collapse a bit
  • ComScore numbers show that we have more traffic than all Google properties combined. ComScore estimates 1 in 6 web pages viewed in Spain is from Tuenti. ComScore numbers are lower than our internal measurements.
  • Consider breaking into slides that illustrate this process
  • Not just well structure
  • Competitive market – only 2 (Akamai and Limelight) are financially very healthy – and Limelight is losing money if you consider investments

    1. AJAX for Scalability
        • Erik Schultink
        • PHP Barcelona – 30 Oct, 2009
    2. AJAX
        • What is AJAX?
        • “Asynchronous JavaScript and XML”
        • Paradigm for client-server interaction
        • Change state on client, without loading a complete HTML page
    3. Traditional HTML Browsing
        • User clicks link
        • Browser sends request
        • Server receives, parses request, generates response
        • Browser receives response and begins rendering
        • Dependent objects (images, js, css) load and render
        • Page appears
    4. AJAX Browsing
        • User clicks link
        • Browser sends request
        • Server receives, parses request, generates response
        • Browser receives response and begins rendering
        • Dependent objects (images, js, css) load and render
        • Page appears
    5. Why Asynchronous?
        • Not always necessary to wait for server response
        • Most of response is predictable
    6. Scalability?
        • Throughput, not speed
        • Ideally, capacity is a linear function of the number of machines
        • In practice, linear below some boundary
    7. What is Tuenti.com?
    8. Tuenti.com
        • Started 2007
        • More than 20 billion pageviews/month
        • More than 20k/second at peak
        • Based in Madrid
        • ~80 employees, 40 engineers
    9. How does Tuenti use AJAX?
        • Only pageloads are login and home page
        • Loader pulls in all JS/CSS
        • Afterwards stay within one HTML page, rotating canvas area content
    10. How does Tuenti use AJAX?
        • PHOTO UPLOAD
    11. How does Tuenti use AJAX?
        • CHAT
    12. How does Tuenti use AJAX?
        • CHAT
    13. How does Tuenti use AJAX?
    14. How does Tuenti use AJAX?
    15. Traditional Systems Architecture
        • Diagram labels: www.tuenti.com; Load Balancer; Web server farm (×3)
    16. Traditional Systems Architecture
        • Diagram labels: www.tuenti.com; 12.45.34.179; 12.45.34.178; Load Balancer (×2); Web server farm (×4)
    17. Client-side Routing
        • Diagram labels: www.tuenti.com; wwwb1.tuenti.com, wwwb2.tuenti.com, wwwb3.tuenti.com, wwwb4.tuenti.com; Load Balancer (×4); Web server farm (×4)
        • Linearly scalable …
    18. Client-side Routing
        • Diagram labels: as on slide 17
        • Linearly scalable … except for top level
    19. Client-side Routing
        • Diagram labels: as on slide 17
        • lots of content creation = lots of dynamic data
    20. Client-side Routing
        • Diagram labels: as on slide 17, plus Cache Farm
        • lots of dynamic data = lots of cache = internal network traffic
    21. Client-side Routing
        • Diagram labels: as on slide 17, plus Cache Farm (×4)
        • Partition cache
        • Route requests to a farm near the cache needed to respond
    22. Internal network savings
    23. Balancing Load
        • Top-level requests to www.tuenti.com
        • Each request tells client which farm it should be using, based on a mapping
        • Mapping can be changed to balance load, perform maintenance, etc.
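The mapping idea on slides 17 to 23 can be sketched in TypeScript as below. This is an illustration under stated assumptions, not Tuenti's implementation: the partitioning rule (user id modulo the number of partitions), the `FarmMapping` shape, and the `farmFor` and `FarmRouter` names are all hypothetical. Only the `wwwbN.tuenti.com` host naming comes from the slides.

```typescript
// Hypothetical mapping from cache partition to farm host prefix.
// Index = partition number, value = farm name such as "wwwb2".
type FarmMapping = string[];

// Assumed partitioning rule: non-negative user id modulo partition count.
function partitionOf(userId: number, partitions: number): number {
  return userId % partitions;
}

// Pick the farm that sits near the cache partition holding this user's data.
function farmFor(userId: number, mapping: FarmMapping): string {
  const farm = mapping[partitionOf(userId, mapping.length)];
  if (farm === undefined) {
    throw new Error("no farm mapped for this user");
  }
  return farm;
}

// Client-side router: remembers the farm it was told to use and builds
// request URLs against that farm instead of the top-level host.
class FarmRouter {
  private readonly baseDomain: string;
  private farm: string;

  constructor(baseDomain: string, initialFarm: string) {
    this.baseDomain = baseDomain;
    this.farm = initialFarm;
  }

  // Called whenever a response carries a (possibly changed) farm assignment.
  assign(farm: string): void {
    this.farm = farm;
  }

  urlFor(path: string): string {
    return "https://" + this.farm + "." + this.baseDomain + path;
  }
}
```

Because the client only follows the assignment it is given, editing one entry of the mapping on the server is enough to drain a farm for maintenance or to move load elsewhere.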
    24. Robust Browsing
        • Don’t tightly couple layers
        • Client and server are separate layers
    25. Robust Browsing
        • Client-side code should be robust against server-side failures
        • What can go wrong?
        • Client loses connectivity
        • Server-side error
        • Web server overloaded
        • Entire server farm overloaded
        • Server farm loses connectivity
    26. Robust Browsing
        • Options:
        • Retry (carefully)
        • Switch to alternate farm (carefully)
        • Detect error, give user feedback
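One reading of "carefully" on slide 26 is that a client must not hammer a farm that is already overloaded. The sketch below shows one common way to do that: capped exponential backoff with jitter, then a bounded number of alternate farms, then user feedback. The failure categories, the retry limit, the delay constants, and the function names are assumptions made for illustration; the slides do not say which policy Tuenti used.

```typescript
type Failure = "offline" | "server-error" | "overloaded";

type Action =
  | { kind: "retry"; delayMs: number }
  | { kind: "switch-farm"; farm: string; delayMs: number }
  | { kind: "notify-user"; message: string };

// Capped exponential backoff with full jitter: the delay is drawn uniformly
// from [0, min(capMs, baseMs * 2^attempt)).
function backoffMs(
  attempt: number,
  baseMs: number,
  capMs: number,
  random: () => number
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

const MAX_RETRIES = 2; // assumed limit on retries against the same farm

// Decide what to do after the attempt-th consecutive failure (0-based).
function decide(
  failure: Failure,
  attempt: number,
  alternates: string[],
  random: () => number = Math.random
): Action {
  if (failure === "offline") {
    // Retrying cannot help while the client itself has no connectivity.
    return { kind: "notify-user", message: "You appear to be offline." };
  }
  if (attempt < MAX_RETRIES) {
    return { kind: "retry", delayMs: backoffMs(attempt, 500, 8000, random) };
  }
  // Same-farm retries are exhausted: try each alternate farm once.
  const alternate = alternates[attempt - MAX_RETRIES];
  if (alternate !== undefined) {
    return {
      kind: "switch-farm",
      farm: alternate,
      delayMs: backoffMs(attempt, 500, 8000, random),
    };
  }
  return { kind: "notify-user", message: "Something went wrong. Please try again later." };
}
```

The jitter matters as much as the backoff: without it, every client that saw the same failure would retry at the same instant and recreate the overload.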
    27. Caching
        • Think like server-side scalability
        • Cache data that is likely to be requested in the future
        • Doesn’t have consistency issues
        • Doesn’t change
    28. Cache Data in Client Browser
        • More responsive for user
        • Fewer requests to your servers
    29. Some details
        • Lots of dependencies to load initially
        • Loading can be brittle and slow
        • Browser performance
    30. Some details
        • Beware of external libraries
        • Google JS libraries are not static, not indefinitely cacheable
    31. Some details
        • Friends do have consistency issues
        • Add/remove friends
        • Change avatar (rare)
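Slides 27 to 31 describe keeping data in the browser when it rarely changes, and note that a friend list is a case where it does change. A minimal sketch of such a client-side cache follows. The `ClientCache` class, the per-entry time-to-live, and the explicit `invalidate` call are assumptions chosen for illustration; the slides do not describe how Tuenti's cache was structured.

```typescript
// In-memory cache for data held in the page between canvas changes.
class ClientCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly now: () => number;

  // The clock is injectable so expiry can be exercised without waiting.
  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  set(key: string, value: V, ttlMs: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }

  // Returns the cached value, or undefined if it is missing or expired.
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) {
      return undefined;
    }
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  // Drop an entry as soon as the client knows it is stale, for example
  // right after the user adds or removes a friend.
  invalidate(key: string): void {
    this.entries.delete(key);
  }
}
```

A caller would check `get` before issuing a request and only go to the server on `undefined`. The time-to-live bounds how long a change made elsewhere can stay invisible, while `invalidate` handles changes the client makes itself.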
    32. Image Serving
        • Tuenti serves ~2.5 billion images/day
        • At peak, this is >6 Gbps and >70k hits/sec
        • We use CDNs
    33. What is a CDN?
        • Content Delivery Network
    34. What is a CDN?
        • Examples: Akamai, Limelight
        • also dozens more, including Amazon
        • Big, distributed object cache
        • Pay per use
        • either per request, per TB transfer, or per peak Mbps
    35. What is a CDN?
        • Advantages:
        • Outsource dev and infrastructure
        • Geographically distributed
        • Economies of scale
        • Disadvantages:
        • High cost
        • Less control and transparency
        • Commitments
    36. What affects image load time?
        • Client internet connection
        • Response time of CDN
        • CDN cache hit rate
        • Response time of Origin
    37. What affects image load time?
        • Client internet connection
        • Response time of CDN
        • CDN cache hit rate
        • Response time of Origin
    38. (no text on slide)
    39. Monitor Performance from Client
        • Closer to performance experienced by end-user
        • Only way to get a view of network issues faced by users (i.e. the last mile)
    40. (no text on slide)
    41. Good Monitoring
        • Percentiles > averages
        • Less noise, more stable
        • Comparable to SLA terms
        • Establish baselines
        • performance relative to past more important than absolute
        • Avoid false positives
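The "percentiles > averages" point on slide 41 is easy to see with a small calculation. The sketch below computes a nearest-rank percentile over load times reported by clients and compares it with a stored baseline. The function names, the nearest-rank definition, and the tolerance parameter are illustrative assumptions, not something the slides specify.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (p <= 0 || p > 100) {
    throw new Error("p must be in (0, 100]");
  }
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  const value = sorted[rank - 1];
  if (value === undefined) {
    throw new Error("no samples");
  }
  return value;
}

// Flag a regression only when the current value exceeds the baseline by more
// than the given fraction, which helps avoid false positives from noise.
function regressed(current: number, baseline: number, tolerance: number): boolean {
  return current > baseline * (1 + tolerance);
}
```

For the samples 120, 90, 110, 100 and 5000 ms, the median is 110 ms while the mean is 1084 ms: a single slow client dominates the average but leaves the median untouched, and it shows up where it belongs, in the 95th percentile.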
    42. How to fix slow ISP?
        • Choose better transit provider
        • Set up peering (or get CDN too)
        • Traffic management
    43. What affects image load time?
        • Client internet connection
        • Response time of CDN
        • CDN cache hit rate
        • Response time of Origin
    44. (no text on slide)
    45. (no text on slide)
    46. Quality of End-User Experience
        • vs.
        • Cost
    47. We use multiple CDNs, and shift content based on price/performance.
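As a rough illustration of a price/performance trade-off like the one on slide 47, the sketch below picks the cheapest CDN whose measured 95th-percentile load time fits a latency budget, and falls back to the fastest one when none fits. The selection rule, the `CdnStats` fields, and the example figures are all hypothetical; the slides do not say how Tuenti actually decided where to send content.

```typescript
// Hypothetical per-CDN figures: a contract price and a client-measured p95.
interface CdnStats {
  name: string;
  pricePerTb: number;
  p95Ms: number;
}

// Cheapest CDN within the latency budget, else the fastest one available.
function pickCdn(candidates: CdnStats[], latencyBudgetMs: number): CdnStats {
  if (candidates.length === 0) {
    throw new Error("no CDNs configured");
  }
  const withinBudget = candidates.filter((c) => c.p95Ms <= latencyBudgetMs);
  if (withinBudget.length > 0) {
    return withinBudget.reduce((best, c) => (c.pricePerTb < best.pricePerTb ? c : best));
  }
  return candidates.reduce((best, c) => (c.p95Ms < best.p95Ms ? c : best));
}
```

Giving each class of content its own budget (tight for small thumbnails, looser for large photos) would let the same rule send different content to different providers.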
    48. Know your content
    49. Know your content
    50. Know your content
    51. Know your content
        • 30
        • 75
        • 200
    52. Know your content
        • 600
    53. Know your content
        • 120
    54. Know your content
    55. Pre-fetch Content
        • Exploit predictable user behavior
        • Ex: clicking to next photo in an album
        • Simple solution – load next image hidden
        • Client browser will cache it (next response < 100 ms)
        • Increase tolerance for slow response time
    56. Pre-fetch Content
        • More complex solution
        • Pre-fetch next canvas (full html), render in background – rotate in on Next
        • Even more complex
        • Instantiate HTML template with data on client
        • Pre-fetch data X photos in advance, render Y templates in advance with this data
    57. Pre-fetch Content
        • Problems:
        • Rendering still takes time
        • Increases browser load
        • Need to set cache headers correctly
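The "simple solution" on slide 55 can be sketched as follows. The function names, the `lookahead` parameter, and the injected `load` callback are assumptions made so the logic can run outside a browser. In a real page, `load` could be `(url) => { new Image().src = url; }`, which makes the browser request the image without showing it.

```typescript
// URLs of the next `lookahead` photos after the current one, skipping any
// that were already requested.
function prefetchTargets(
  album: string[],
  currentIndex: number,
  lookahead: number,
  alreadyRequested: Set<string>
): string[] {
  return album
    .slice(currentIndex + 1, currentIndex + 1 + lookahead)
    .filter((url) => !alreadyRequested.has(url));
}

// Request each target once and remember it, so that moving to the next photo
// only triggers a request for the one newly exposed image.
function prefetch(
  album: string[],
  currentIndex: number,
  lookahead: number,
  alreadyRequested: Set<string>,
  load: (url: string) => void
): void {
  prefetchTargets(album, currentIndex, lookahead, alreadyRequested).forEach((url) => {
    alreadyRequested.add(url);
    load(url);
  });
}
```

This only pays off if the image responses carry cache headers that let the browser reuse them, which is the third problem listed on slide 57. A small lookahead also keeps the extra browser load in check.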
    58. Image delivery
        • Small images: High request, low volume
        • Most cost-effective to cache in memory
        • Large images: High volume, low requests, greater tolerance for latency
    59. What affects image load time?
        • Client internet connection
        • Response time of CDN
        • CDN cache hit rate
        • Response time of Origin
    60. Monitor Performance from Client
        • cold servers online
    61. What affects image load time?
        • Client internet connection
        • Response time of CDN
        • CDN cache hit rate
        • Response time of Origin
    62. Improve Origin Response
        • Deploy cache layers (Squid, Varnish)
        • Optimize web servers for static content delivery
        • More machines
        • Avoid RAID, file system
    63. Summary
        • Cache and pre-fetch content
        • Client-side routing
        • Client-side monitoring
    64. General caveats
        • Be careful with browser performance
        • Look carefully at what’s in the worst 5%
        • Don’t assume that high values are incorrect
    65. More
        • jobs.tuenti.com
        • Presentation
        • Saturday 14:20
        • Challenge (win a PS3)
        • Talk: Continuous Integration
        • Davide Mendolia, Saturday 12:00
    66. More? Ask us!
    67. More? Ask us!
    68. Q & A
