Geek Sessions Talk


Transcript

  • 1. Jonathan Abrams geekSessions July 31, 2007 Web Infrastructure: Surviving The “Hockey Stick”
  • 2. Friendster Growth
  • 3. How to scale a webapp
    • Lightweight non-sticky sessions
    • Cache almost everything (memcached)
    • Decouple slow processes from webapp
    • Segment the database (don’t use replication to scale)
    • Scale out not up (innovate on your app, not your infrastructure)
  • 4. Session Management
    • Use simple lightweight sessions, store in centralized location that can be volatile (memcached at Socializr)
    • 9F6077E3322B4E24C90C43178F42B9C0FFD1E4A43AED1BCA –> 656280029
    • Keep other user data in cache or cookies
    • Avoid sticky sessions, keep load balancing simple
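A minimal sketch of the lightweight session idea on this slide: an opaque token maps to nothing more than a user ID, and the store is allowed to forget entries. The `VolatileSessionStore` class and its method names are hypothetical, and a plain dict stands in for the central memcached pool so the example runs on its own.

```python
import secrets
import time


class VolatileSessionStore:
    """Dict-based stand-in for a central, volatile store such as memcached."""

    def __init__(self, ttl_seconds=3600):
        self.ttl_seconds = ttl_seconds
        self._data = {}  # token -> (user_id, expires_at)

    def create(self, user_id, now=None):
        now = time.time() if now is None else now
        # 24 random bytes -> 48 hex characters, the same length as the slide's token.
        token = secrets.token_hex(24).upper()
        self._data[token] = (user_id, now + self.ttl_seconds)
        return token

    def lookup(self, token, now=None):
        now = time.time() if now is None else now
        entry = self._data.get(token)
        if entry is None or entry[1] <= now:
            # Missing or expired: treat the user as logged out.
            self._data.pop(token, None)
            return None
        return entry[0]
```

Because any web node can resolve the token against the shared store, the load balancer does not need to pin a user to one node.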
  • 5. Tim Bray - Nov 2006 “Comparing Frameworks”
    • “For Web apps, I’ve given PHP the edge, because I think building scalable PHP is a little easier. By default, PHP gives you a “shared-nothing” (or at least “shared very little”) architecture, which means you’re going to scale out pretty well until your database hits the wall. Java is a much richer system and assumes you’re smart enough to know whether a shared-nothing architecture is appropriate or not. The effect is, you have to be smarter to get the same kind of scaling out of Java.”
  • 6. Evite Session Management
  • 7. Caching
    • Use memcached, don’t invent your own
    • Put a large memcached instance on every webapp node
    • Cache almost everything but think of your expiration strategy and invalidation rules
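One way to read "think of your expiration strategy and invalidation rules" is the cache-aside pattern sketched below: reads fill the cache with a time-to-live, and writes delete the cached key. `TTLCache`, `get_profile`, `update_profile`, and the 300-second TTL are all assumptions for illustration; a dict stands in for both memcached and the database.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for memcached with per-key expiry."""

    def __init__(self):
        self._items = {}  # key -> (value, expires_at)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        item = self._items.get(key)
        if item is None:
            return None
        value, expires_at = item
        if expires_at <= now:
            del self._items[key]  # expired entries count as misses
            return None
        return value

    def set(self, key, value, ttl, now=None):
        now = time.time() if now is None else now
        self._items[key] = (value, now + ttl)

    def delete(self, key):
        self._items.pop(key, None)


def get_profile(cache, db, user_id, now=None):
    """Read path: try the cache, fall back to the database, then fill the cache."""
    key = f"profile:{user_id}"
    cached = cache.get(key, now=now)
    if cached is not None:
        return cached
    profile = db[user_id]  # db is a dict standing in for a real query
    cache.set(key, profile, ttl=300, now=now)
    return profile


def update_profile(cache, db, user_id, profile):
    """Write path: update the database, then invalidate the cached copy."""
    db[user_id] = profile
    cache.delete(f"profile:{user_id}")
```

The TTL bounds how long a stale value can survive a write that bypasses `update_profile`; the explicit delete handles the writes that go through it.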
  • 8. Avoid queries in loops
    • Queries in loops are SLOW and strain the database
    • Friendster in 2006 – 100s of db and cache queries per page
    • Don’t be afraid of joins when they are optimized and well-indexed and the results are cached
    • Cache big results
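To make the "queries in loops" point concrete, the sketch below fetches the same rows two ways: one query per ID, and a single `IN (...)` query. The `users` table and both function names are hypothetical, and SQLite is used only so the example is self-contained.

```python
import sqlite3


def names_with_loop(conn, user_ids):
    """Anti-pattern: one database round trip per user ID."""
    names = {}
    for uid in user_ids:
        row = conn.execute(
            "SELECT name FROM users WHERE id = ?", (uid,)
        ).fetchone()
        if row:
            names[uid] = row[0]
    return names


def names_batched(conn, user_ids):
    """Same result with a single query using an IN list."""
    user_ids = list(user_ids)
    if not user_ids:
        return {}
    placeholders = ",".join("?" for _ in user_ids)
    rows = conn.execute(
        f"SELECT id, name FROM users WHERE id IN ({placeholders})", user_ids
    ).fetchall()
    return dict(rows)
```

The batched result is also a natural unit to cache, in line with "cache big results".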
  • 9. MySQL replication is not for scaling
    • If you have mostly reads, just use memcached, not slaves
    • If you have many writes, the master will still be a bottleneck, and you will experience slave lag
    • Scaling requires you to segment the db, not replicate (especially blobs)
    • Use replication only for redundancy
    • (There are some exceptions, e.g. joins on shards)
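A rough illustration of what "segment the db" can mean in code: each user's rows live on exactly one shard, so writes spread across machines instead of funnelling through a single master. The modulo rule, `NUM_SHARDS`, and both function names are assumptions; the slides do not say which partitioning scheme was used.

```python
from collections import Counter

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(user_id, num_shards=NUM_SHARDS):
    """Pick the shard that owns this user's rows (simple modulo rule)."""
    return user_id % num_shards


def write_load(user_ids, num_shards=NUM_SHARDS):
    """Count how many writes each shard would receive for these users."""
    return Counter(shard_for(uid, num_shards) for uid in user_ids)
```

With replication, every one of those writes would still have to pass through the master and be replayed on each slave.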
  • 10.  
  • 11. How not to architect a “friend tracker”
    • Tim adds a new photo -> update his 100 friends’ trackers
    • Each user has their own tracker rows, sharded based on owner:
    • insert into tracker (owner, user, date, type, param) values …
    • For each trackable event, # of db writes = # of friends for that user …
  • 12. A better “friend tracker” approach
    • Tim adds a new photo -> update his 100 friends’ trackers
    • Each user has their own update rows, sharded based on user:
    • insert into tracker (user, date, type, param) values …
    • For each trackable event, # of db writes = 1
    • To compute someone’s “tracker”: for each of n shards: select * from tracker where user in (list), aggregate in the code …
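The write-once, read-from-every-shard approach on this slide could look roughly like the sketch below. The `tracker` columns come from the slide; everything else (three in-memory SQLite shards, the modulo shard rule, the function names, sorting by date) is an assumption made so the example runs.

```python
import sqlite3

NUM_SHARDS = 3  # hypothetical shard count


def open_shards(n=NUM_SHARDS):
    """Create n in-memory databases, each holding one slice of the tracker table."""
    shards = []
    for _ in range(n):
        conn = sqlite3.connect(":memory:")
        conn.execute(
            "CREATE TABLE tracker (user INTEGER, date TEXT, type TEXT, param TEXT)"
        )
        shards.append(conn)
    return shards


def record_event(shards, user, date, type_, param):
    """One write per event, on the shard of the user who acted."""
    conn = shards[user % len(shards)]
    conn.execute(
        "INSERT INTO tracker (user, date, type, param) VALUES (?, ?, ?, ?)",
        (user, date, type_, param),
    )


def friend_tracker(shards, friend_ids, limit=20):
    """Query every shard for the friends' events and aggregate in code."""
    friend_ids = list(friend_ids)
    if not friend_ids:
        return []
    placeholders = ",".join("?" * len(friend_ids))
    events = []
    for conn in shards:
        events.extend(
            conn.execute(
                "SELECT user, date, type, param FROM tracker "
                f"WHERE user IN ({placeholders})",
                friend_ids,
            ).fetchall()
        )
    events.sort(key=lambda row: row[1], reverse=True)  # newest first
    return events[:limit]
```

The cost moves from writes (one row per friend) to reads (one query per shard), and the aggregated read is the kind of result the caching slides suggest keeping in memcached.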
  • 13. Decouple slow processes
    • Expensive computations (e.g. graph)
    • Uploads & photo processing
    • External content integration (via screen scraping, APIs, RSS, etc.)
    • How: iframe, AJAX, POSTs and redirects, subdomains, etc.
  • 14. How not to decouple
    • [Diagram: WebApp, Queue Servers; add photo; POST url: trackevent.php]
    • trackevent.php?user=1…
  • 15. Thank You
    • www.socializr.com/jobs