Jonathan Abrams geekSessions July 31, 2007 Web Infrastructure: Surviving The “Hockey Stick”
Geek Sessions Talk

Transcript of "Geek Sessions Talk"

  1. Jonathan Abrams, geekSessions, July 31, 2007. Web Infrastructure: Surviving The “Hockey Stick”
  2. Friendster Growth
  3. How to scale a webapp
       • Lightweight non-sticky sessions
       • Cache almost everything (memcached)
       • Decouple slow processes from webapp
       • Segment the database (don’t use replication to scale)
       • Scale out, not up (innovate on your app, not your infrastructure)
  4. Session Management
       • Use simple lightweight sessions, stored in a centralized location that can be volatile (memcached at Socializr)
       • 9F6077E3322B4E24C90C43178F42B9C0FFD1E4A43AED1BCA -> 656280029
       • Keep other user data in cache or cookies
       • Avoid sticky sessions, keep load balancing simple
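The token-to-user-id mapping on the Session Management slide can be sketched roughly as follows. This is only an illustration, not Socializr's implementation: `VolatileStore` is a hypothetical in-memory stand-in for memcached, and the key prefix, the 30-minute lifetime, and the function names are assumptions.

```python
import secrets
import time


class VolatileStore:
    """Hypothetical in-memory stand-in for a memcached-like store."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, ttl_seconds):
        # Store the value together with its absolute expiry time.
        self._data[key] = (value, time.monotonic() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Expired entries behave exactly like missing ones.
            del self._data[key]
            return None
        return value


SESSION_TTL = 30 * 60  # assumed 30-minute session lifetime


def create_session(store, user_id):
    # 24 random bytes -> 48 hex characters, the same length as the slide's example token.
    token = secrets.token_hex(24).upper()
    store.set("session:" + token, user_id, SESSION_TTL)
    return token


def user_for_token(store, token):
    # Returns None when the token is unknown, expired, or evicted.
    return store.get("session:" + token)
```

Because the session holds nothing but a user id, any web node can resolve the cookie, so the load balancer does not need sticky routing. If the store loses an entry, the user simply logs in again.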
  5. Tim Bray - Nov 2006 “Comparing Frameworks”
       • “For Web apps, I’ve given PHP the edge, because I think building scalable PHP is a little easier. By default, PHP gives you a “shared-nothing” (or at least “shared very little”) architecture, which means you’re going to scale out pretty well until your database hits the wall. Java is a much richer system and assumes you’re smart enough to know whether a shared-nothing architecture is appropriate or not. The effect is, you have to be smarter to get the same kind of scaling out of Java.”
  6. Evite Session Management
  7. Caching
       • Use memcached, don’t invent your own
       • Put a large memcached instance on every webapp node
       • Cache almost everything, but think about your expiration strategy and invalidation rules
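One common way to combine an expiration strategy with invalidation rules is the cache-aside pattern, sketched below. The slide does not name a specific pattern, so treat this as an assumption. `TTLCache` is a hypothetical in-memory stand-in for memcached, and `DB`, `stats`, and the function names exist only for illustration.

```python
import time


class TTLCache:
    """Hypothetical in-memory stand-in for memcached with per-key expiry."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or time.monotonic() >= entry[1]:
            self._data.pop(key, None)
            return None
        return entry[0]

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, time.monotonic() + ttl_seconds)

    def delete(self, key):
        self._data.pop(key, None)


DB = {1: {"name": "Tim"}}   # pretend database table (assumption)
stats = {"db_reads": 0}     # counts how often the "database" is hit


def load_profile_from_db(user_id):
    stats["db_reads"] += 1
    return dict(DB[user_id])


def get_profile(cache, user_id, ttl_seconds=300):
    key = f"profile:{user_id}"
    profile = cache.get(key)
    if profile is None:
        # Cache miss: read from the database, then populate the cache.
        profile = load_profile_from_db(user_id)
        cache.set(key, profile, ttl_seconds)
    return profile


def update_profile(cache, user_id, **fields):
    DB[user_id].update(fields)
    # Invalidation rule: drop the cached copy whenever the row changes.
    cache.delete(f"profile:{user_id}")
```

The TTL bounds how stale an entry can get if an invalidation is missed. The explicit delete keeps the common case fresh.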
  8. Avoid queries in loops
       • Queries in loops are SLOW and strain the database
       • Friendster in 2006 – 100s of db and cache queries per page
       • Don’t be afraid of joins when they are optimized and well-indexed and the results are cached
       • Cache big results
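To make the "queries in loops" point concrete, here is a small sketch using an in-memory SQLite database. The schema (`users`, `friends`) and the function names are invented for illustration. The talk concerns MySQL, but the pattern is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE friends (user_id INTEGER, friend_id INTEGER);
    CREATE INDEX idx_friends_user ON friends (user_id);
    INSERT INTO users VALUES (1, 'Tim'), (2, 'Ann'), (3, 'Bob');
    INSERT INTO friends VALUES (1, 2), (1, 3);
""")


def friend_names_loop(user_id):
    """Anti-pattern: one query for the friend list, then one more per friend."""
    queries = 1
    friend_ids = [row[0] for row in conn.execute(
        "SELECT friend_id FROM friends WHERE user_id = ?", (user_id,))]
    names = []
    for friend_id in friend_ids:
        row = conn.execute(
            "SELECT name FROM users WHERE id = ?", (friend_id,)).fetchone()
        names.append(row[0])
        queries += 1
    return sorted(names), queries


def friend_names_join(user_id):
    """Single indexed join: one round trip regardless of friend count."""
    rows = conn.execute(
        "SELECT u.name FROM friends f "
        "JOIN users u ON u.id = f.friend_id "
        "WHERE f.user_id = ? ORDER BY u.name", (user_id,)).fetchall()
    return [row[0] for row in rows], 1
```

Both functions return the same names. The loop version issues one query per friend plus one, while the join issues exactly one, and its result is a natural candidate for caching.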
  9. MySQL replication is not for scaling
       • If you have mostly reads, just use memcached, not slaves
       • If you have many writes, the master will still be a bottleneck, and you will experience slave lag
       • Scaling requires you to segment the db, not replicate (especially blobs)
       • Use replication only for redundancy
       • (some exceptions to this, e.g. joins on shards)
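Segmenting the database means each row lives on exactly one shard, chosen from a key. The slide does not say how Friendster or Socializr picked shards. The modulo routing below is one simple possibility, and `NUM_SHARDS`, `shard_for`, and `group_by_shard` are hypothetical names.

```python
NUM_SHARDS = 4  # assumed shard count, for illustration only


def shard_for(user_id, num_shards=NUM_SHARDS):
    # Every row for a given user lands on the same shard.
    return user_id % num_shards


def group_by_shard(user_ids, num_shards=NUM_SHARDS):
    """Group ids by shard so each shard is queried once, not once per id."""
    groups = {}
    for user_id in user_ids:
        groups.setdefault(shard_for(user_id, num_shards), []).append(user_id)
    return groups
```

Writes for different users spread across shards, so no single master absorbs them all. Note that plain modulo routing makes adding shards later expensive, since most keys move.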
  10. How not to architect a “friend tracker” (slide 11)
       Tim adds a new photo -> update his 100 friends’ trackers
       Each user has their own tracker rows, sharded based on owner:
           insert into tracker (owner, user, date, type, param) values …
       For each trackable event, # of db writes = # of friends for that user
       …
  11. A better “friend tracker” approach (slide 12)
       Tim adds a new photo -> update his 100 friends’ trackers
       Each user has their own update rows, sharded based on user:
           insert into tracker (user, date, type, param) values …
       For each trackable event, # of db writes = 1
       To compute someone’s “tracker”:
           for each of n shards: select * from tracker where user in (list), aggregate in the code
       …
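The better "friend tracker" approach (one write per event, then a fan-out read across shards with aggregation in code) might look roughly like this. It is a sketch using in-memory SQLite databases as stand-ins for shards. The shard count, modulo routing, index, `limit`, and function names are assumptions. Only the `tracker` columns and the `where user in (list)` query shape come from the slide.

```python
import sqlite3

NUM_SHARDS = 2  # assumed shard count
shards = [sqlite3.connect(":memory:") for _ in range(NUM_SHARDS)]
for db in shards:
    db.execute("CREATE TABLE tracker (user INTEGER, date TEXT, type TEXT, param TEXT)")
    db.execute("CREATE INDEX idx_tracker_user ON tracker (user)")


def shard_for(user_id):
    # Rows are sharded by the user who performed the event (assumed modulo routing).
    return shards[user_id % NUM_SHARDS]


def record_event(user_id, date, event_type, param):
    """One db write per trackable event, no matter how many friends the user has."""
    shard_for(user_id).execute(
        "INSERT INTO tracker (user, date, type, param) VALUES (?, ?, ?, ?)",
        (user_id, date, event_type, param))


def tracker_for(friend_ids, limit=20):
    """Query every shard for the friend list, then merge and sort in code."""
    if not friend_ids:
        return []
    placeholders = ",".join("?" * len(friend_ids))
    rows = []
    for db in shards:
        rows.extend(db.execute(
            "SELECT user, date, type, param FROM tracker "
            f"WHERE user IN ({placeholders})", list(friend_ids)).fetchall())
    rows.sort(key=lambda row: row[1], reverse=True)  # newest first
    return rows[:limit]
```

The cost moves from write time (one row per friend) to read time (one query per shard), which pairs well with caching the computed tracker.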
  12. Decouple slow processes (slide 13)
       • Expensive computations (e.g. graph)
       • Uploads & photo processing
       • External content integration (via screen scraping, APIs, RSS, etc.)
       • How: iframe, AJAX, POSTs and redirects, subdomains, etc.
  13. How not to decouple (slide 14)
       [Diagram labels: WebApp, Queue, Servers; add photo; POST url: trackevent.php]
       • …
       • …
       • trackevent.php?user=1…
  14. Thank You (slide 15)
       • [email_address]
       • www.socializr.com/jobs