Big Bird.
(scaling twitter)
Rails Scales.
(but not out of the box)
First, Some Facts
• 600 requests per second. Growing fast.
• 180 Rails Instances (Mongrel). Growing fast.
• 1 Database Ser...
Joy          Pain




Oct   Nov   Dec    Jan   Feb     March   Apr
IM IN UR RAILZ




     MAKIN EM GO FAST
It’s Easy, Really.
1. Realize Your Site is Slow
2. Optimize the Database
3. Cache the Hell out of Everything
4. Scale Mess...
It’s Easy, Really.
1. Realize Your Site is Slow
2. Optimize the Database
3. Cache the Hell out of Everything
4. Scale Mess...
the
     more
      you
        know

{ Part the First }
We Failed at This.
Don’t Be Like Us

• Munin
• Nagios
• AWStats & Google Analytics
• Exception Notifier / Exception Logger
• Immediately add r...
Test Everything

•   Start Before You Start

•   No Need To Be Fancy

•   Tests Will Save Your Life

•   Agile Becomes
   ...
<!-- served to you through a copper wire by sampaati at 22 Apr
    15:02 in 343 ms (d 102 / r 217). thank you, come again....
The Database
  { Part the Second }
“The Next Application I Build is Going
to Be Easily Partitionable” - S. Butterfield
“The Next Application I Build is Going
to Be Easily Partitionable” - S. Butterfield
“The Next Application I Build is Going
to Be Easily Partitionable” - S. Butterfield
Too Late.
Index Everything
class AddIndex < ActiveRecord::Migration
     def self.up
       add_index :users, :email
     end

     def self.down
   ...
Denormalize A Lot
class DenormalizeFriendsIds < ActiveRecord::Migration
  def self.up
    add_column "users", "friends_ids", :text
  end

  ...
class Friendship < ActiveRecord::Base
  belongs_to :user
  belongs_to :friend

 after_create :add_to_denormalized_friends
...
Don’t be Stupid
bob.friends.map(&:email)
     Status.count()
“email like ‘%#{search}%’”
That’s where we are.
                  Seriously.
  If your Rails application is doing anything more
complex than that, yo...
Partitioning Comes Later.
   (we’ll let you know how it goes)
The Cache
 { Part the Third }
MemCache
MemCache
MemCache
!
class Status < ActiveRecord::Base
  class << self
    def count_with_memcache(*args)
      return count_without_memcache u...
class User < ActiveRecord::Base
  def friends_statuses
    ids = CACHE.get(“friends_statuses:#{id}”)
    Status.find(:all,...
The Future


            ve d
          ti r
         co
         Ac
           e
         R
90% API Requests
     Cache Them!
“There are only two hard things in CS:
 cache invalidation and naming things.”

             – Phil Karlton, via Tim Bray
Messaging
{ Part the Fourth }
You Already Knew All
That Other Stuff, Right?
Producer             Consumer
           Message
Producer             Consumer
           Queue
Producer             Consumer
DRb
• The Good:
 • Stupid Easy
 • Reasonably Fast
• The Bad:
 • Kinda Flaky
 • Zero Redundancy
 • Tightly Coupled
ejabberd


            Jabber Client
                (drb)




           Incoming         Outgoing
Presence
           Me...
Server
     DRb.start_service ‘druby://localhost:10000’, myobject




                         Client
myobject = DRbObject...
Rinda

• Shared Queue (TupleSpace)
• Built with DRb
• RingyDingy makes it stupid easy
• See Eric Hodel’s documentation
• O...
Timestamp: 12/22/06 01:53:14 (4 months ago)
      Author: lattice
      Message: Fugly. Seriously. Fugly.




        SELE...
It Scales.
(except it stopped on Tuesday)
Options

• ActiveMQ (Java)
• RabbitMQ (erlang)
• MySQL + Lightweight Locking
• Something Else?
erlang?


What are you doing?
 Stabbing my eyes out with a fork.
Starling

• Ruby, will be ported to something faster
• 4000 transactional msgs/s
• First pass written in 4 hours
• Speaks ...
Use Messages to
Invalidate Cache
   (it’s really not that hard)
Abuse
{ Part the Fifth }
The Italians
9000 friends in 24 hours
        (doesn’t scale)
http://flickr.com/photos/heather/464504545/
http://flickr.com/photos/curiouskiwi/165229284/
http://flickr.com/photo_zoom.g...
Scaling Twitter
Scaling Twitter
Scaling Twitter
Upcoming SlideShare
Loading in...5
×

Scaling Twitter

229,091

Published on

Scaling Twitter - Slides for a talk presented at the SDForum Silicon Valley Ruby Conference 2007 on Twitter's challenges scaling Rails.

Published in: Technology, Sports
30 Comments
340 Likes
Statistics
Notes
No Downloads
Views
Total Views
229,091
On Slideshare
0
From Embeds
0
Number of Embeds
29
Actions
Shares
0
Downloads
4,738
Comments
30
Likes
340
Embeds 0
No embeds

No notes for slide

Scaling Twitter

  1. 1. Big Bird. (scaling twitter)
  2. 2. Rails Scales. (but not out of the box)
  3. 3. First, Some Facts • 600 requests per second. Growing fast. • 180 Rails Instances (Mongrel). Growing fast. • 1 Database Server (MySQL) + 1 Slave. • 30-odd Processes for Misc. Jobs • 8 Sun X4100s • Many users, many updates.
  4. 4. Joy Pain Oct Nov Dec Jan Feb March Apr
  5. 5. IM IN UR RAILZ MAKIN EM GO FAST
  6. 6. It’s Easy, Really. 1. Realize Your Site is Slow 2. Optimize the Database 3. Cache the Hell out of Everything 4. Scale Messaging 5. Deal With Abuse
  7. 7. It’s Easy, Really. 1. Realize Your Site is Slow 2. Optimize the Database 3. Cache the Hell out of Everything 4. Scale Messaging 5. Deal With Abuse 6. Profit
  8. 8. the more you know { Part the First }
  9. 9. We Failed at This.
  10. 10. Don’t Be Like Us • Munin • Nagios • AWStats & Google Analytics • Exception Notifier / Exception Logger • Immediately add reporting to track problems.
  11. 11. Test Everything • Start Before You Start • No Need To Be Fancy • Tests Will Save Your Life • Agile Becomes Important When Your Site Is Down
  12. 12. <!-- served to you through a copper wire by sampaati at 22 Apr 15:02 in 343 ms (d 102 / r 217). thank you, come again. --> <!-- served to you through a copper wire by kolea.twitter.com at 22 Apr 15:02 in 235 ms (d 87 / r 130). thank you, come again. --> <!-- served to you through a copper wire by raven.twitter.com at 22 Apr 15:01 in 450 ms (d 96 / r 337). thank you, come again. --> Benchmarks? let your users do it. <!-- served to you through a copper wire by kolea.twitter.com at 22 Apr 15:00 in 409 ms (d 88 / r 307). thank you, come again. --> <!-- served to you through a copper wire by firebird at 22 Apr 15:03 in 2094 ms (d 643 / r 1445). thank you, come again. --> <!-- served to you through a copper wire by quetzal at 22 Apr 15:01 in 384 ms (d 70 / r 297). thank you, come again. -->
  13. 13. The Database { Part the Second }
  14. 14. “The Next Application I Build is Going to Be Easily Partitionable” - S. Butterfield
  15. 15. “The Next Application I Build is Going to Be Easily Partitionable” - S. Butterfield
  16. 16. “The Next Application I Build is Going to Be Easily Partitionable” - S. Butterfield
  17. 17. Too Late.
  18. 18. Index Everything
  19. 19. class AddIndex < ActiveRecord::Migration def self.up add_index :users, :email end def self.down remove_index :users, :email end end Repeat for any column that appears in a WHERE clause Rails won’t do this for you.
  20. 20. Denormalize A Lot
  21. 21. class DenormalizeFriendsIds < ActiveRecord::Migration def self.up add_column "users", "friends_ids", :text end def self.down remove_column "users", "friends_ids" end end
  22. 22. class Friendship < ActiveRecord::Base belongs_to :user belongs_to :friend after_create :add_to_denormalized_friends after_destroy :remove_from_denormalized_friends def add_to_denormalized_friends user.friends_ids << friend.id user.friends_ids.uniq! user.save_without_validation end def remove_from_denormalized_friends user.friends_ids.delete(friend.id) user.save_without_validation end end
  23. 23. Don’t be Stupid
  24. 24. bob.friends.map(&:email) Status.count() “email like ‘%#{search}%’”
  25. 25. That’s where we are. Seriously. If your Rails application is doing anything more complex than that, you’re doing something wrong*. * or you observed the First Rule of Butterfield.
  26. 26. Partitioning Comes Later. (we’ll let you know how it goes)
  27. 27. The Cache { Part the Third }
  28. 28. MemCache
  29. 29. MemCache
  30. 30. MemCache
  31. 31. !
  32. 32. class Status < ActiveRecord::Base class << self def count_with_memcache(*args) return count_without_memcache unless args.empty? count = CACHE.get(“status_count”) if count.nil? count = count_without_memcache CACHE.set(“status_count”, count) end count end alias_method_chain :count, :memcache end after_create :increment_memcache_count after_destroy :decrement_memcache_count ... end
  33. 33. class User < ActiveRecord::Base def friends_statuses ids = CACHE.get(“friends_statuses:#{id}”) Status.find(:all, :conditions => [“id IN (?)”, ids]) end end class Status < ActiveRecord::Base after_create :update_caches def update_caches user.friends_ids.each do |friend_id| ids = CACHE.get(“friends_statuses:#{friend_id}”) ids.pop ids.unshift(id) CACHE.set(“friends_statuses:#{friend_id}”, ids) end end end
  34. 34. The Future ve d ti r co Ac e R
  35. 35. 90% API Requests Cache Them!
  36. 36. “There are only two hard things in CS: cache invalidation and naming things.” – Phil Karlton, via Tim Bray
  37. 37. Messaging { Part the Fourth }
  38. 38. You Already Knew All That Other Stuff, Right?
  39. 39. Producer Consumer Message Producer Consumer Queue Producer Consumer
  40. 40. DRb • The Good: • Stupid Easy • Reasonably Fast • The Bad: • Kinda Flaky • Zero Redundancy • Tightly Coupled
  41. 41. ejabberd Jabber Client (drb) Incoming Outgoing Presence Messages Messages MySQL
  42. 42. Server DRb.start_service ‘druby://localhost:10000’, myobject Client myobject = DRbObject.new_with_uri(‘druby://localhost:10000’)
  43. 43. Rinda • Shared Queue (TupleSpace) • Built with DRb • RingyDingy makes it stupid easy • See Eric Hodel’s documentation • O(N) for take(). Sigh.
  44. 44. Timestamp: 12/22/06 01:53:14 (4 months ago) Author: lattice Message: Fugly. Seriously. Fugly. SELECT * FROM messages WHERE substring(truncate(id,0),-2,1) = #{@fugly_dist_idx}
  45. 45. It Scales. (except it stopped on Tuesday)
  46. 46. Options • ActiveMQ (Java) • RabbitMQ (erlang) • MySQL + Lightweight Locking • Something Else?
  47. 47. erlang? What are you doing? Stabbing my eyes out with a fork.
  48. 48. Starling • Ruby, will be ported to something faster • 4000 transactional msgs/s • First pass written in 4 hours • Speaks MemCache (set, get)
  49. 49. Use Messages to Invalidate Cache (it’s really not that hard)
  50. 50. Abuse { Part the Fifth }
  51. 51. The Italians
  52. 52. 9000 friends in 24 hours (doesn’t scale)
  53. 53. http://flickr.com/photos/heather/464504545/ http://flickr.com/photos/curiouskiwi/165229284/ http://flickr.com/photo_zoom.gne?id=42914103&size=l http://flickr.com/photos/madstillz/354596905/ http://flickr.com/photos/laughingsquid/382242677/ http://flickr.com/photos/bng/46678227/
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×