Geek Sessions Talk
Published in: Technology
  • 1. Web Infrastructure: Surviving The “Hockey Stick”
    • Jonathan Abrams
    • geekSessions, July 31, 2007
  • 2. Friendster Growth
  • 3. How to scale a webapp
    • Lightweight non-sticky sessions
    • Cache almost everything (memcached)
    • Decouple slow processes from webapp
    • Segment the database (don’t use replication to scale)
    • Scale out not up (innovate on your app, not your infrastructure)
  • 4. Session Management
    • Use simple lightweight sessions, store in centralized location that can be volatile (memcached at Socializr)
    • 9F6077E3322B4E24C90C43178F42B9C0FFD1E4A43AED1BCA –> 656280029
    • Keep other user data in cache or cookies
    • Avoid sticky sessions, keep load balancing simple
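
The session approach on this slide can be sketched roughly as follows. This is a minimal illustration, not the Socializr implementation: a plain dict stands in for the shared memcached store, and `create_session` and `lookup_session` are hypothetical names. The token is 48 hex characters to match the shape of the example on the slide.

```python
import secrets

# Assumption: a plain dict stands in for the shared, volatile store
# (memcached in the talk). Every web node would talk to the same store.
session_store = {}


def create_session(user_id):
    """Issue an opaque token and map it to the user id in the shared store."""
    token = secrets.token_hex(24).upper()  # 24 random bytes -> 48 hex characters
    session_store[token] = user_id
    return token


def lookup_session(token):
    """Resolve a token to a user id on any node.

    Returns None when the entry is missing (for example, evicted from the
    volatile store), in which case the user simply has to log in again.
    """
    return session_store.get(token)
```

Because the only per-session state is the token-to-user-id mapping in the shared store, any node can serve any request, which is what lets the load balancer stay simple and non-sticky.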
  • 5. Tim Bray - Nov 2006 “Comparing Frameworks”
    • “For Web apps, I’ve given PHP the edge, because I think building scalable PHP is a little easier. By default, PHP gives you a “shared-nothing” (or at least “shared very little”) architecture, which means you’re going to scale out pretty well until your database hits the wall. Java is a much richer system and assumes you’re smart enough to know whether a shared-nothing architecture is appropriate or not. The effect is, you have to be smarter to get the same kind of scaling out of Java.”
  • 6. Evite Session Management
  • 7. Caching
    • Use memcached, don’t invent your own
    • Put a large memcached instance on every webapp node
    • Cache almost everything but think of your expiration strategy and invalidation rules
  • 8. Avoid queries in loops
    • Queries in loops are SLOW and strain the database
    • Friendster in 2006 – 100s of db and cache queries per page
    • Don’t be afraid of joins when they are optimized and well-indexed and the results are cached
    • Cache big results
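
The difference between a query in a loop and a single batched query can be illustrated as below. This is a sketch using an in-memory SQLite database and a hypothetical `users` table; the talk's systems used MySQL, but the pattern is the same.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "Tim"), (2, "Ana"), (3, "Raj")],
)


def names_with_loop(friend_ids):
    """One query per friend: N round trips to the database."""
    names = {}
    for fid in friend_ids:
        row = conn.execute(
            "SELECT name FROM users WHERE id = ?", (fid,)
        ).fetchone()
        if row:
            names[fid] = row[0]
    return names


def names_with_one_query(friend_ids):
    """One query for the whole list: a single round trip."""
    if not friend_ids:
        return {}
    placeholders = ",".join("?" for _ in friend_ids)
    rows = conn.execute(
        f"SELECT id, name FROM users WHERE id IN ({placeholders})",
        list(friend_ids),
    )
    return dict(rows.fetchall())
```

Both functions return the same mapping; the second does it with one statement, and its single result is also a natural unit to cache.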
  • 9. MySQL replication is not for scaling
    • If you have mostly reads, just use memcached, not slaves
    • If you have many writes, the master will still be a bottleneck, and you will experience slave lag
    • Scaling requires you to segment the db, not replicate (especially blobs)
    • Use replication only for redundancy
    • (some exceptions to this, e.g. joins on shards)
  • 10.  
  • 11. How not to architect a “friend tracker”
    • Tim adds a new photo -> update his 100 friends’ trackers
    • Each user has their own tracker rows, sharded based on owner:
    • insert into tracker (owner, user, date, type, param) values …
    • For each trackable event, # of db writes = # of friends for that user
    • …
  • 12. A better “friend tracker” approach
    • Tim adds a new photo -> update his 100 friends’ trackers
    • Each user has their own update rows, sharded based on user:
    • insert into tracker (user, date, type, param) values …
    • For each trackable event, # of db writes = 1
    • To compute someone’s “tracker”: for each of n shards: select * from tracker where user in (list), aggregate in the code
    • …
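
The write-once, aggregate-on-read approach from this slide might look roughly like the following. It is a sketch under stated assumptions: two in-memory SQLite databases stand in for the shards, the modulo routing in `shard_for` is a hypothetical choice, and `record_event` and `compute_tracker` are illustrative names. Only the table columns come from the slide.

```python
import sqlite3

NUM_SHARDS = 2  # hypothetical shard count
shards = [sqlite3.connect(":memory:") for _ in range(NUM_SHARDS)]
for db in shards:
    db.execute("CREATE TABLE tracker (user INTEGER, date TEXT, type TEXT, param TEXT)")


def shard_for(user_id):
    """Assumed routing rule: pick the shard from the acting user's id."""
    return shards[user_id % NUM_SHARDS]


def record_event(user_id, date, event_type, param):
    """One write per event, stored on the acting user's shard."""
    shard_for(user_id).execute(
        "INSERT INTO tracker (user, date, type, param) VALUES (?, ?, ?, ?)",
        (user_id, date, event_type, param),
    )


def compute_tracker(friend_ids):
    """Query every shard for the friends' events and aggregate in code."""
    events = []
    if not friend_ids:
        return events
    placeholders = ",".join("?" for _ in friend_ids)
    for db in shards:
        rows = db.execute(
            f"SELECT user, date, type, param FROM tracker WHERE user IN ({placeholders})",
            list(friend_ids),
        )
        events.extend(rows.fetchall())
    events.sort(key=lambda event: event[1], reverse=True)  # newest first
    return events
```

The write cost no longer depends on how many friends a user has; the read fans out to each shard instead, and the aggregated result is a good candidate for caching.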
  • 13. Decouple slow processes
    • Expensive computations (e.g. graph)
    • Uploads & photo processing
    • External content integration (via screen scraping, APIs, RSS, etc.)
    • How: iframe, AJAX, POSTs and redirects, subdomains, etc.
  • 14. How not to decouple
    • [Diagram labels: WebApp, Queue, Servers; add photo; POST url: trackevent.php?user=1…]
  • 15. Thank You
    • [email_address]