Geek Sessions Talk
Presentation Transcript

  • Web Infrastructure: Surviving The “Hockey Stick” (Jonathan Abrams, geekSessions, July 31, 2007)
  • Friendster Growth
  • How to scale a webapp
    • Lightweight non-sticky sessions
    • Cache almost everything (memcached)
    • Decouple slow processes from webapp
    • Segment the database (don’t use replication to scale)
    • Scale out not up (innovate on your app, not your infrastructure)
  • Session Management
    • Use simple lightweight sessions, store in centralized location that can be volatile (memcached at Socializr)
    • The session token maps to a user ID: 9F6077E3322B4E24C90C43178F42B9C0FFD1E4A43AED1BCA -> 656280029
    • Keep other user data in cache or cookies
    • Avoid sticky sessions, keep load balancing simple
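The session bullets above can be sketched roughly as follows. This is a minimal illustration, not the Socializr implementation: a plain dict stands in for memcached, and `SessionStore`, `create`, and `lookup` are hypothetical names. The store holds only the token-to-user-ID mapping, so any web node can resolve a session without sticky load balancing.

```python
import secrets
import time


class SessionStore:
    """Maps opaque session tokens to user IDs in a volatile store.

    Assumption: a dict stands in for a shared memcached instance.
    """

    def __init__(self, ttl_seconds=3600):
        self.ttl_seconds = ttl_seconds
        self._store = {}  # token -> (user_id, expires_at)

    def create(self, user_id, now=None):
        now = time.time() if now is None else now
        # 24 random bytes -> 48 hex characters, the same length as the
        # example token on the slide.
        token = secrets.token_hex(24).upper()
        self._store[token] = (user_id, now + self.ttl_seconds)
        return token

    def lookup(self, token, now=None):
        now = time.time() if now is None else now
        entry = self._store.get(token)
        if entry is None:
            return None  # unknown or evicted: the user logs in again
        user_id, expires_at = entry
        if now >= expires_at:
            del self._store[token]
            return None
        return user_id
```

Because the store is volatile, losing an entry only costs the user a fresh login; nothing else is kept in the session.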
  • Tim Bray - Nov 2006 “Comparing Frameworks”
    • “For Web apps, I’ve given PHP the edge, because I think building scalable PHP is a little easier. By default, PHP gives you a “shared-nothing” (or at least “shared very little”) architecture, which means you’re going to scale out pretty well until your database hits the wall. Java is a much richer system and assumes you’re smart enough to know whether a shared-nothing architecture is appropriate or not. The effect is, you have to be smarter to get the same kind of scaling out of Java.”
  • Evite Session Management
  • Caching
    • Use memcached, don’t invent your own
    • Put a large memcached instance on every webapp node
    • Cache almost everything but think of your expiration strategy and invalidation rules
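As a rough sketch of "expiration strategy and invalidation rules", the cache-aside pattern below reads through a cache with a time-to-live and deletes the key on every write. The `TTLCache` class is a hypothetical in-memory stand-in for memcached, and the `db` dict stands in for a slow database; neither is from the talk.

```python
import time


class TTLCache:
    """Tiny in-memory cache with per-key expiration (memcached stand-in)."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self._data.get(key)
        if entry is None or now >= entry[1]:
            self._data.pop(key, None)  # drop expired entries lazily
            return None
        return entry[0]

    def set(self, key, value, ttl, now=None):
        now = time.time() if now is None else now
        self._data[key] = (value, now + ttl)

    def delete(self, key):
        self._data.pop(key, None)


def get_profile(cache, db, user_id, ttl=300):
    """Read path: try the cache first, fall back to the database."""
    key = f"profile:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = db[user_id]  # stands in for a slow database query
    cache.set(key, value, ttl)
    return value


def update_profile(cache, db, user_id, value):
    """Write path: update the database, then invalidate the cached copy."""
    db[user_id] = value
    cache.delete(f"profile:{user_id}")
```

The TTL bounds how long a stale value can survive if an invalidation is missed; the explicit delete keeps normal writes visible immediately.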
  • Avoid queries in loops
    • Queries in loops are SLOW and strain the database
    • Friendster in 2006 – 100s of db and cache queries per page
    • Don’t be afraid of joins when they are optimized and well-indexed and the results are cached
    • Cache big results
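To illustrate the difference, the sketch below fetches the same rows once with a query per ID and once with a single `IN (...)` query. It uses Python's built-in `sqlite3` with an in-memory database purely for demonstration; the `users` table and the function names are hypothetical.

```python
import sqlite3


def make_db():
    """Build a small in-memory table to query against."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (id, name) VALUES (?, ?)",
        [(1, "tim"), (2, "ann"), (3, "bob")],
    )
    return conn


def names_query_per_id(conn, ids):
    """Anti-pattern: one database round trip per ID."""
    names = {}
    for user_id in ids:
        row = conn.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        if row:
            names[user_id] = row[0]
    return names


def names_single_query(conn, ids):
    """Preferred: one query for the whole list of IDs."""
    if not ids:
        return {}
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, name FROM users WHERE id IN ({placeholders})", list(ids)
    ).fetchall()
    return dict(rows)
```

Both functions return the same mapping; the second issues one query regardless of how many IDs are requested, and its result is a natural candidate for caching as a unit.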
  • MySQL replication is not for scaling
    • If you have mostly reads, just use memcached, not slaves
    • If you have many writes, the master will still be a bottleneck, and you will experience slave lag
    • Scaling requires you to segment the db, not replicate (especially blobs)
    • Use replication only for redundancy
    • (There are some exceptions to this, e.g. joins on shards)
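Segmenting the database means routing each user's rows to one of several independent databases. The sketch below uses modulo routing on the user ID, which is an assumption for illustration (the talk does not say how shards are chosen), with dicts standing in for the shard databases. `shard_for` and `ShardedStore` are hypothetical names.

```python
def shard_for(user_id, num_shards):
    """Pick the shard that owns a user. Modulo routing is assumed here."""
    return user_id % num_shards


class ShardedStore:
    """Routes reads and writes to one of several independent stores."""

    def __init__(self, num_shards):
        # Each dict stands in for a separate database server.
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, user_id, record):
        shard = self.shards[shard_for(user_id, len(self.shards))]
        shard[user_id] = record

    def get(self, user_id):
        shard = self.shards[shard_for(user_id, len(self.shards))]
        return shard.get(user_id)
```

Each write lands on exactly one shard, so write load spreads across servers instead of funnelling through a single master. Plain modulo routing makes adding shards later harder, so a lookup table or consistent hashing may be preferable in practice.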
  • How not to architect a “friend tracker”
    • Tim adds a new photo -> update his 100 friends’ trackers
    • Each user has their own tracker rows, sharded based on owner
    • insert into tracker (owner, user, date, type, param) values …
    • For each trackable event, # of db writes = # of friends for that user …
  • A better “friend tracker” approach
    • Tim adds a new photo -> write one row for Tim, not one per friend
    • Each user has their own update rows, sharded based on user
    • insert into tracker (user, date, type, param) values …
    • For each trackable event, # of db writes = 1
    • To compute someone’s “tracker”: for each of n shards, select * from tracker where user in (list), then aggregate in the code …
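The write-once, gather-on-read approach can be sketched as below. This is an illustration under assumptions: each shard is an in-memory `sqlite3` database, shards are chosen by modulo on the user ID, and `make_shards`, `record_event`, and `compute_tracker` are hypothetical names. The table columns follow the insert statement on the slide.

```python
import sqlite3


def make_shards(n):
    """Create n independent in-memory databases, one per shard."""
    shards = []
    for _ in range(n):
        conn = sqlite3.connect(":memory:")
        conn.execute(
            "CREATE TABLE tracker (user INTEGER, date TEXT, type TEXT, param TEXT)"
        )
        shards.append(conn)
    return shards


def record_event(shards, user, date, type_, param):
    """One db write per event, on the shard that owns the acting user."""
    conn = shards[user % len(shards)]
    conn.execute(
        "INSERT INTO tracker (user, date, type, param) VALUES (?, ?, ?, ?)",
        (user, date, type_, param),
    )


def compute_tracker(shards, friend_ids):
    """Query every shard for the friends' events and aggregate in code."""
    if not friend_ids:
        return []
    placeholders = ",".join("?" for _ in friend_ids)
    events = []
    for conn in shards:
        events.extend(
            conn.execute(
                "SELECT user, date, type, param FROM tracker "
                f"WHERE user IN ({placeholders})",
                list(friend_ids),
            ).fetchall()
        )
    # Newest first; ISO dates sort correctly as strings.
    events.sort(key=lambda event: event[1], reverse=True)
    return events
```

The read now costs one query per shard, but that result is a single aggregate that can be cached, whereas the fan-out-on-write design pays one write per friend for every event.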
  • Decouple slow processes
    • Expensive computations (e.g. graph)
    • Uploads & photo processing
    • External content integration (via screen scraping, APIs, RSS, etc.)
    • How: iframe, AJAX, POSTs and redirects, subdomains, etc.
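One generic way to keep slow work out of the request path is to accept the request immediately and process it out of band. The sketch below is not one of the mechanisms listed above (iframe, AJAX, POSTs and redirects, subdomains); it swaps in an in-process queue and worker thread from Python's standard library purely to show the shape of the idea. `start_worker` and `handle_upload` are hypothetical names, and the string formatting stands in for real photo processing.

```python
import queue
import threading


def start_worker(jobs, results):
    """Start a background thread that processes jobs until it sees None."""

    def run():
        while True:
            job = jobs.get()
            if job is None:  # sentinel value: stop the worker
                jobs.task_done()
                break
            # Stand-in for slow work such as resizing a photo.
            results.append(f"thumbnail:{job}")
            jobs.task_done()

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker


def handle_upload(jobs, photo_name):
    """Request handler: enqueue the slow work and return right away."""
    jobs.put(photo_name)
    return "accepted"
```

A short usage example: create `jobs = queue.Queue()` and `results = []`, call `start_worker(jobs, results)`, then call `handle_upload(jobs, "a.jpg")` from the request path. The handler returns without waiting for the processing to finish.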
  • How not to decouple (diagram: WebApp, Queue, Servers; “add photo” POST url: trackevent.php)
    • trackevent.php?user=1…
  • Thank You
    • [email_address]
    • www.socializr.com/jobs