Scaling Twitter

Blaine
Big Bird.
(scaling twitter)
Rails Scales.
(but not out of the box)
First, Some Facts
• 600 requests per second. Growing fast.
• 180 Rails Instances (Mongrel). Growing fast.
• 1 Database Server (MySQL) + 1 Slave.
• 30-odd Processes for Misc. Jobs
• 8 Sun X4100s
• Many users, many updates.
Scaling Twitter
Scaling Twitter
Scaling Twitter
Joy          Pain




Oct   Nov   Dec    Jan   Feb     March   Apr
IM IN UR RAILZ




     MAKIN EM GO FAST
It’s Easy, Really.
1. Realize Your Site is Slow
2. Optimize the Database
3. Cache the Hell out of Everything
4. Scale Messaging
5. Deal With Abuse
It’s Easy, Really.
1. Realize Your Site is Slow
2. Optimize the Database
3. Cache the Hell out of Everything
4. Scale Messaging
5. Deal With Abuse
6. Profit
the
     more
      you
        know

{ Part the First }
We Failed at This.
Don’t Be Like Us

• Munin
• Nagios
• AWStats & Google Analytics
• Exception Notifier / Exception Logger
• Immediately add reporting to track problems.
Test Everything

•   Start Before You Start

•   No Need To Be Fancy

•   Tests Will Save Your Life

•   Agile Becomes
    Important When Your
    Site Is Down
<!-- served to you through a copper wire by sampaati at 22 Apr
    15:02 in 343 ms (d 102 / r 217). thank you, come again. -->
 <!-- served to you through a copper wire by kolea.twitter.com at
22 Apr 15:02 in 235 ms (d 87 / r 130). thank you, come again. -->
 <!-- served to you through a copper wire by raven.twitter.com at
22 Apr 15:01 in 450 ms (d 96 / r 337). thank you, come again. -->



                  Benchmarks?
                       let your users do it.
 <!-- served to you through a copper wire by kolea.twitter.com at
22 Apr 15:00 in 409 ms (d 88 / r 307). thank you, come again. -->
  <!-- served to you through a copper wire by firebird at 22 Apr
   15:03 in 2094 ms (d 643 / r 1445). thank you, come again. -->
   <!-- served to you through a copper wire by quetzal at 22 Apr
     15:01 in 384 ms (d 70 / r 297). thank you, come again. -->
The Database
  { Part the Second }
“The Next Application I Build is Going
to Be Easily Partitionable” - S. Butterfield
“The Next Application I Build is Going
to Be Easily Partitionable” - S. Butterfield
“The Next Application I Build is Going
to Be Easily Partitionable” - S. Butterfield
Too Late.
Index Everything
class AddIndex < ActiveRecord::Migration
     def self.up
       add_index :users, :email
     end

     def self.down
       remove_index :users, :email
     end
   end


Repeat for any column that appears in a WHERE clause

             Rails won’t do this for you.
Denormalize A Lot
class DenormalizeFriendsIds < ActiveRecord::Migration
  def self.up
    add_column "users", "friends_ids", :text
  end

  def self.down
    remove_column "users", "friends_ids"
  end
end
class Friendship < ActiveRecord::Base
  belongs_to :user
  belongs_to :friend

 after_create :add_to_denormalized_friends
 after_destroy :remove_from_denormalized_friends

  def add_to_denormalized_friends
    user.friends_ids << friend.id
    user.friends_ids.uniq!
    user.save_without_validation
  end

  def remove_from_denormalized_friends
    user.friends_ids.delete(friend.id)
    user.save_without_validation
  end
end
Don’t be Stupid
bob.friends.map(&:email)
     Status.count()
“email like ‘%#{search}%’”
That’s where we are.
                  Seriously.
  If your Rails application is doing anything more
complex than that, you’re doing something wrong*.



        * or you observed the First Rule of Butterfield.
Partitioning Comes Later.
   (we’ll let you know how it goes)
The Cache
 { Part the Third }
MemCache
MemCache
MemCache
!
class Status < ActiveRecord::Base
  class << self
    def count_with_memcache(*args)
      return count_without_memcache unless args.empty?
      count = CACHE.get(“status_count”)
      if count.nil?
        count = count_without_memcache
        CACHE.set(“status_count”, count)
      end
      count
    end
    alias_method_chain :count, :memcache
  end
  after_create :increment_memcache_count
  after_destroy :decrement_memcache_count
  ...
end
class User < ActiveRecord::Base
  def friends_statuses
    ids = CACHE.get(“friends_statuses:#{id}”)
    Status.find(:all, :conditions => [“id IN (?)”, ids])
  end
end

class Status < ActiveRecord::Base
  after_create :update_caches
  def update_caches
    user.friends_ids.each do |friend_id|
      ids = CACHE.get(“friends_statuses:#{friend_id}”)
      ids.pop
      ids.unshift(id)
      CACHE.set(“friends_statuses:#{friend_id}”, ids)
    end
  end
end
The Future


            ve d
          ti r
         co
         Ac
           e
         R
90% API Requests
     Cache Them!
“There are only two hard things in CS:
 cache invalidation and naming things.”

             – Phil Karlton, via Tim Bray
Messaging
{ Part the Fourth }
You Already Knew All
That Other Stuff, Right?
Producer             Consumer
           Message
Producer             Consumer
           Queue
Producer             Consumer
DRb
• The Good:
 • Stupid Easy
 • Reasonably Fast
• The Bad:
 • Kinda Flaky
 • Zero Redundancy
 • Tightly Coupled
ejabberd


            Jabber Client
                (drb)




           Incoming         Outgoing
Presence
           Messages         Messages


              MySQL
Server
     DRb.start_service ‘druby://localhost:10000’, myobject




                         Client
myobject = DRbObject.new_with_uri(‘druby://localhost:10000’)
Rinda

• Shared Queue (TupleSpace)
• Built with DRb
• RingyDingy makes it stupid easy
• See Eric Hodel’s documentation
• O(N) for take(). Sigh.
Timestamp: 12/22/06 01:53:14 (4 months ago)
      Author: lattice
      Message: Fugly. Seriously. Fugly.




        SELECT * FROM messages WHERE
substring(truncate(id,0),-2,1) = #{@fugly_dist_idx}
It Scales.
(except it stopped on Tuesday)
Options

• ActiveMQ (Java)
• RabbitMQ (erlang)
• MySQL + Lightweight Locking
• Something Else?
erlang?


What are you doing?
 Stabbing my eyes out with a fork.
Starling

• Ruby, will be ported to something faster
• 4000 transactional msgs/s
• First pass written in 4 hours
• Speaks MemCache (set, get)
Use Messages to
Invalidate Cache
   (it’s really not that hard)
Abuse
{ Part the Fifth }
The Italians
9000 friends in 24 hours
        (doesn’t scale)
http://flickr.com/photos/heather/464504545/
http://flickr.com/photos/curiouskiwi/165229284/
http://flickr.com/photo_zoom.gne?id=42914103&size=l
http://flickr.com/photos/madstillz/354596905/
http://flickr.com/photos/laughingsquid/382242677/
http://flickr.com/photos/bng/46678227/
1 of 56

Recommended

Introduction to Redis by
Introduction to RedisIntroduction to Redis
Introduction to RedisDvir Volk
121.1K views24 slides
BI, Reporting and Analytics on Apache Cassandra by
BI, Reporting and Analytics on Apache CassandraBI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache CassandraVictor Coustenoble
27K views39 slides
Kafka replication apachecon_2013 by
Kafka replication apachecon_2013Kafka replication apachecon_2013
Kafka replication apachecon_2013Jun Rao
21.3K views31 slides
Introduction to Apache ZooKeeper by
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeperSaurav Haloi
128.5K views30 slides
Extending Flink SQL for stream processing use cases by
Extending Flink SQL for stream processing use casesExtending Flink SQL for stream processing use cases
Extending Flink SQL for stream processing use casesFlink Forward
117 views33 slides
RedisConf17- Using Redis at scale @ Twitter by
RedisConf17- Using Redis at scale @ TwitterRedisConf17- Using Redis at scale @ Twitter
RedisConf17- Using Redis at scale @ TwitterRedis Labs
3.8K views26 slides

More Related Content

What's hot

Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021 by
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021StreamNative
539 views18 slides
From cache to in-memory data grid. Introduction to Hazelcast. by
From cache to in-memory data grid. Introduction to Hazelcast.From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.Taras Matyashovsky
42.6K views83 slides
MySQL Deep dive with FusionIO by
MySQL Deep dive with FusionIOMySQL Deep dive with FusionIO
MySQL Deep dive with FusionIOI Goo Lee
904 views29 slides
The columnar roadmap: Apache Parquet and Apache Arrow by
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowDataWorks Summit
3.3K views49 slides
Hello, kafka! (an introduction to apache kafka) by
Hello, kafka! (an introduction to apache kafka)Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)Timothy Spann
321 views35 slides
Stability Patterns for Microservices by
Stability Patterns for MicroservicesStability Patterns for Microservices
Stability Patterns for Microservicespflueras
1.9K views18 slides

What's hot(20)

Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021 by StreamNative
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
StreamNative539 views
From cache to in-memory data grid. Introduction to Hazelcast. by Taras Matyashovsky
From cache to in-memory data grid. Introduction to Hazelcast.From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.
Taras Matyashovsky42.6K views
MySQL Deep dive with FusionIO by I Goo Lee
MySQL Deep dive with FusionIOMySQL Deep dive with FusionIO
MySQL Deep dive with FusionIO
I Goo Lee904 views
The columnar roadmap: Apache Parquet and Apache Arrow by DataWorks Summit
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache Arrow
DataWorks Summit3.3K views
Hello, kafka! (an introduction to apache kafka) by Timothy Spann
Hello, kafka! (an introduction to apache kafka)Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)
Timothy Spann321 views
Stability Patterns for Microservices by pflueras
Stability Patterns for MicroservicesStability Patterns for Microservices
Stability Patterns for Microservices
pflueras1.9K views
PostgreSQL at 20TB and Beyond by Chris Travers
PostgreSQL at 20TB and BeyondPostgreSQL at 20TB and Beyond
PostgreSQL at 20TB and Beyond
Chris Travers8.8K views
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage Tiering by Erik Krogen
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage TieringHadoop Meetup Jan 2019 - Router-Based Federation and Storage Tiering
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage Tiering
Erik Krogen516 views
Etsy Activity Feeds Architecture by Dan McKinley
Etsy Activity Feeds ArchitectureEtsy Activity Feeds Architecture
Etsy Activity Feeds Architecture
Dan McKinley112.1K views
High-speed Database Throughput Using Apache Arrow Flight SQL by ScyllaDB
High-speed Database Throughput Using Apache Arrow Flight SQLHigh-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQL
ScyllaDB1.2K views
2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da... by Andrew Lamb
2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...
2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...
Andrew Lamb189 views
Real-time Analytics with Trino and Apache Pinot by Xiang Fu
Real-time Analytics with Trino and Apache PinotReal-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache Pinot
Xiang Fu1.2K views
Twitter - Architecture and Scalability lessons by Aditya Rao
Twitter - Architecture and Scalability lessonsTwitter - Architecture and Scalability lessons
Twitter - Architecture and Scalability lessons
Aditya Rao24K views
Spark and S3 with Ryan Blue by Databricks
Spark and S3 with Ryan BlueSpark and S3 with Ryan Blue
Spark and S3 with Ryan Blue
Databricks3.9K views
Pulsar - Distributed pub/sub platform by Matteo Merli
Pulsar - Distributed pub/sub platformPulsar - Distributed pub/sub platform
Pulsar - Distributed pub/sub platform
Matteo Merli4.2K views
Monitoring Kafka without instrumentation using eBPF with Antón Rodríguez | Ka... by HostedbyConfluent
Monitoring Kafka without instrumentation using eBPF with Antón Rodríguez | Ka...Monitoring Kafka without instrumentation using eBPF with Antón Rodríguez | Ka...
Monitoring Kafka without instrumentation using eBPF with Antón Rodríguez | Ka...
HostedbyConfluent1.7K views
Building Software Systems at Google and Lessons Learned by parallellabs
Building Software Systems at Google and Lessons LearnedBuilding Software Systems at Google and Lessons Learned
Building Software Systems at Google and Lessons Learned
parallellabs29K views

Similar to Scaling Twitter

Hiveminder - Everything but the Secret Sauce by
Hiveminder - Everything but the Secret SauceHiveminder - Everything but the Secret Sauce
Hiveminder - Everything but the Secret SauceJesse Vincent
2.9K views265 slides
Beijing Perl Workshop 2008 Hiveminder Secret Sauce by
Beijing Perl Workshop 2008 Hiveminder Secret SauceBeijing Perl Workshop 2008 Hiveminder Secret Sauce
Beijing Perl Workshop 2008 Hiveminder Secret SauceJesse Vincent
1.2K views238 slides
Microblogging via XMPP by
Microblogging via XMPPMicroblogging via XMPP
Microblogging via XMPPStoyan Zhekov
2.5K views39 slides
Aprendendo solid com exemplos by
Aprendendo solid com exemplosAprendendo solid com exemplos
Aprendendo solid com exemplosvinibaggio
1.1K views52 slides
Socket applications by
Socket applicationsSocket applications
Socket applicationsJoão Moura
601 views121 slides
From crash to testcase by
From crash to testcaseFrom crash to testcase
From crash to testcaseRoel Van de Paar
1.6K views54 slides

Similar to Scaling Twitter(20)

Hiveminder - Everything but the Secret Sauce by Jesse Vincent
Hiveminder - Everything but the Secret SauceHiveminder - Everything but the Secret Sauce
Hiveminder - Everything but the Secret Sauce
Jesse Vincent2.9K views
Beijing Perl Workshop 2008 Hiveminder Secret Sauce by Jesse Vincent
Beijing Perl Workshop 2008 Hiveminder Secret SauceBeijing Perl Workshop 2008 Hiveminder Secret Sauce
Beijing Perl Workshop 2008 Hiveminder Secret Sauce
Jesse Vincent1.2K views
Microblogging via XMPP by Stoyan Zhekov
Microblogging via XMPPMicroblogging via XMPP
Microblogging via XMPP
Stoyan Zhekov2.5K views
Aprendendo solid com exemplos by vinibaggio
Aprendendo solid com exemplosAprendendo solid com exemplos
Aprendendo solid com exemplos
vinibaggio1.1K views
Socket applications by João Moura
Socket applicationsSocket applications
Socket applications
João Moura601 views
Dynomite at Erlang Factory by moonpolysoft
Dynomite at Erlang FactoryDynomite at Erlang Factory
Dynomite at Erlang Factory
moonpolysoft594 views
Performance Optimization of Rails Applications by Serge Smetana
Performance Optimization of Rails ApplicationsPerformance Optimization of Rails Applications
Performance Optimization of Rails Applications
Serge Smetana8.8K views
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv... by MongoDB
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
MongoDB602 views
WebPerformance: Why and How? – Stefan Wintermeyer by Elixir Club
WebPerformance: Why and How? – Stefan WintermeyerWebPerformance: Why and How? – Stefan Wintermeyer
WebPerformance: Why and How? – Stefan Wintermeyer
Elixir Club576 views
NPW2009 - my.opera.com scalability v2.0 by Cosimo Streppone
NPW2009 - my.opera.com scalability v2.0NPW2009 - my.opera.com scalability v2.0
NPW2009 - my.opera.com scalability v2.0
Cosimo Streppone1K views
Fisl - Deployment by Fabio Akita
Fisl - DeploymentFisl - Deployment
Fisl - Deployment
Fabio Akita934 views
SD, a P2P bug tracking system by Jesse Vincent
SD, a P2P bug tracking systemSD, a P2P bug tracking system
SD, a P2P bug tracking system
Jesse Vincent13.6K views
RubyEnRails2007 - Dr Nic Williams - Keynote by Dr Nic Williams
RubyEnRails2007 - Dr Nic Williams - KeynoteRubyEnRails2007 - Dr Nic Williams - Keynote
RubyEnRails2007 - Dr Nic Williams - Keynote
Dr Nic Williams2.8K views
MongoDB: Optimising for Performance, Scale & Analytics by Server Density
MongoDB: Optimising for Performance, Scale & AnalyticsMongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & Analytics
Server Density555 views
JDD2015: Sharding with Akka Cluster: From Theory to Production - Krzysztof Ot... by PROIDEA
JDD2015: Sharding with Akka Cluster: From Theory to Production - Krzysztof Ot...JDD2015: Sharding with Akka Cluster: From Theory to Production - Krzysztof Ot...
JDD2015: Sharding with Akka Cluster: From Theory to Production - Krzysztof Ot...
PROIDEA125 views
Web 2.0 Performance and Reliability: How to Run Large Web Apps by adunne
Web 2.0 Performance and Reliability: How to Run Large Web AppsWeb 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web Apps
adunne2.4K views
How to avoid hanging yourself with Rails by Rowan Hick
How to avoid hanging yourself with RailsHow to avoid hanging yourself with Rails
How to avoid hanging yourself with Rails
Rowan Hick1.3K views
Monkeybars in the Manor by martinbtt
Monkeybars in the ManorMonkeybars in the Manor
Monkeybars in the Manor
martinbtt820 views

More from Blaine

Social Privacy for HTTP over Webfinger by
Social Privacy for HTTP over WebfingerSocial Privacy for HTTP over Webfinger
Social Privacy for HTTP over WebfingerBlaine
5.5K views23 slides
Social Software for Robots by
Social Software for RobotsSocial Software for Robots
Social Software for RobotsBlaine
1.4K views46 slides
OAuth by
OAuthOAuth
OAuthBlaine
4K views37 slides
Building the Real Time Web by
Building the Real Time WebBuilding the Real Time Web
Building the Real Time WebBlaine
2.6K views77 slides
You & Me & Everyone We Know by
You & Me & Everyone We KnowYou & Me & Everyone We Know
You & Me & Everyone We KnowBlaine
2.4K views54 slides
Social Software for Robots by
Social Software for RobotsSocial Software for Robots
Social Software for RobotsBlaine
14.2K views50 slides

More from Blaine(6)

Social Privacy for HTTP over Webfinger by Blaine
Social Privacy for HTTP over WebfingerSocial Privacy for HTTP over Webfinger
Social Privacy for HTTP over Webfinger
Blaine5.5K views
Social Software for Robots by Blaine
Social Software for RobotsSocial Software for Robots
Social Software for Robots
Blaine1.4K views
OAuth by Blaine
OAuthOAuth
OAuth
Blaine4K views
Building the Real Time Web by Blaine
Building the Real Time WebBuilding the Real Time Web
Building the Real Time Web
Blaine2.6K views
You & Me & Everyone We Know by Blaine
You & Me & Everyone We KnowYou & Me & Everyone We Know
You & Me & Everyone We Know
Blaine2.4K views
Social Software for Robots by Blaine
Social Software for RobotsSocial Software for Robots
Social Software for Robots
Blaine14.2K views

Recently uploaded

CloudStack and GitOps at Enterprise Scale - Alex Dometrius, Rene Glover - AT&T by
CloudStack and GitOps at Enterprise Scale - Alex Dometrius, Rene Glover - AT&TCloudStack and GitOps at Enterprise Scale - Alex Dometrius, Rene Glover - AT&T
CloudStack and GitOps at Enterprise Scale - Alex Dometrius, Rene Glover - AT&TShapeBlue
112 views34 slides
2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue by
2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue
2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlueShapeBlue
103 views23 slides
Automating a World-Class Technology Conference; Behind the Scenes of CiscoLive by
Automating a World-Class Technology Conference; Behind the Scenes of CiscoLiveAutomating a World-Class Technology Conference; Behind the Scenes of CiscoLive
Automating a World-Class Technology Conference; Behind the Scenes of CiscoLiveNetwork Automation Forum
50 views35 slides
Kyo - Functional Scala 2023.pdf by
Kyo - Functional Scala 2023.pdfKyo - Functional Scala 2023.pdf
Kyo - Functional Scala 2023.pdfFlavio W. Brasil
449 views92 slides
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT by
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBITUpdates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBITShapeBlue
166 views8 slides
20231123_Camunda Meetup Vienna.pdf by
20231123_Camunda Meetup Vienna.pdf20231123_Camunda Meetup Vienna.pdf
20231123_Camunda Meetup Vienna.pdfPhactum Softwareentwicklung GmbH
50 views73 slides

Recently uploaded(20)

CloudStack and GitOps at Enterprise Scale - Alex Dometrius, Rene Glover - AT&T by ShapeBlue
CloudStack and GitOps at Enterprise Scale - Alex Dometrius, Rene Glover - AT&TCloudStack and GitOps at Enterprise Scale - Alex Dometrius, Rene Glover - AT&T
CloudStack and GitOps at Enterprise Scale - Alex Dometrius, Rene Glover - AT&T
ShapeBlue112 views
2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue by ShapeBlue
2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue
2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue
ShapeBlue103 views
Automating a World-Class Technology Conference; Behind the Scenes of CiscoLive by Network Automation Forum
Automating a World-Class Technology Conference; Behind the Scenes of CiscoLiveAutomating a World-Class Technology Conference; Behind the Scenes of CiscoLive
Automating a World-Class Technology Conference; Behind the Scenes of CiscoLive
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT by ShapeBlue
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBITUpdates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT
ShapeBlue166 views
The Power of Heat Decarbonisation Plans in the Built Environment by IES VE
The Power of Heat Decarbonisation Plans in the Built EnvironmentThe Power of Heat Decarbonisation Plans in the Built Environment
The Power of Heat Decarbonisation Plans in the Built Environment
IES VE69 views
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti... by ShapeBlue
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...
ShapeBlue98 views
Elevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlue by ShapeBlue
Elevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlueElevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlue
Elevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlue
ShapeBlue179 views
Why and How CloudStack at weSystems - Stephan Bienek - weSystems by ShapeBlue
Why and How CloudStack at weSystems - Stephan Bienek - weSystemsWhy and How CloudStack at weSystems - Stephan Bienek - weSystems
Why and How CloudStack at weSystems - Stephan Bienek - weSystems
ShapeBlue197 views
Data Integrity for Banking and Financial Services by Precisely
Data Integrity for Banking and Financial ServicesData Integrity for Banking and Financial Services
Data Integrity for Banking and Financial Services
Precisely78 views
Declarative Kubernetes Cluster Deployment with Cloudstack and Cluster API - O... by ShapeBlue
Declarative Kubernetes Cluster Deployment with Cloudstack and Cluster API - O...Declarative Kubernetes Cluster Deployment with Cloudstack and Cluster API - O...
Declarative Kubernetes Cluster Deployment with Cloudstack and Cluster API - O...
ShapeBlue88 views
CloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlue by ShapeBlue
CloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlueCloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlue
CloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlue
ShapeBlue93 views
Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda... by ShapeBlue
Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda...Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda...
Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda...
ShapeBlue120 views
Developments to CloudStack’s SDN ecosystem: Integration with VMWare NSX 4 - P... by ShapeBlue
Developments to CloudStack’s SDN ecosystem: Integration with VMWare NSX 4 - P...Developments to CloudStack’s SDN ecosystem: Integration with VMWare NSX 4 - P...
Developments to CloudStack’s SDN ecosystem: Integration with VMWare NSX 4 - P...
ShapeBlue154 views
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha... by ShapeBlue
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
ShapeBlue138 views
State of the Union - Rohit Yadav - Apache CloudStack by ShapeBlue
State of the Union - Rohit Yadav - Apache CloudStackState of the Union - Rohit Yadav - Apache CloudStack
State of the Union - Rohit Yadav - Apache CloudStack
ShapeBlue253 views
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ... by ShapeBlue
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...
ShapeBlue123 views

Scaling Twitter

  • 2. Rails Scales. (but not out of the box)
  • 3. First, Some Facts • 600 requests per second. Growing fast. • 180 Rails Instances (Mongrel). Growing fast. • 1 Database Server (MySQL) + 1 Slave. • 30-odd Processes for Misc. Jobs • 8 Sun X4100s • Many users, many updates.
  • 7. Joy Pain Oct Nov Dec Jan Feb March Apr
  • 8. IM IN UR RAILZ MAKIN EM GO FAST
  • 9. It’s Easy, Really. 1. Realize Your Site is Slow 2. Optimize the Database 3. Cache the Hell out of Everything 4. Scale Messaging 5. Deal With Abuse
  • 10. It’s Easy, Really. 1. Realize Your Site is Slow 2. Optimize the Database 3. Cache the Hell out of Everything 4. Scale Messaging 5. Deal With Abuse 6. Profit
  • 11. the more you know { Part the First }
  • 12. We Failed at This.
  • 13. Don’t Be Like Us • Munin • Nagios • AWStats & Google Analytics • Exception Notifier / Exception Logger • Immediately add reporting to track problems.
  • 14. Test Everything • Start Before You Start • No Need To Be Fancy • Tests Will Save Your Life • Agile Becomes Important When Your Site Is Down
  • 15. <!-- served to you through a copper wire by sampaati at 22 Apr 15:02 in 343 ms (d 102 / r 217). thank you, come again. --> <!-- served to you through a copper wire by kolea.twitter.com at 22 Apr 15:02 in 235 ms (d 87 / r 130). thank you, come again. --> <!-- served to you through a copper wire by raven.twitter.com at 22 Apr 15:01 in 450 ms (d 96 / r 337). thank you, come again. --> Benchmarks? let your users do it. <!-- served to you through a copper wire by kolea.twitter.com at 22 Apr 15:00 in 409 ms (d 88 / r 307). thank you, come again. --> <!-- served to you through a copper wire by firebird at 22 Apr 15:03 in 2094 ms (d 643 / r 1445). thank you, come again. --> <!-- served to you through a copper wire by quetzal at 22 Apr 15:01 in 384 ms (d 70 / r 297). thank you, come again. -->
  • 16. The Database { Part the Second }
  • 17. “The Next Application I Build is Going to Be Easily Partitionable” - S. Butterfield
  • 18. “The Next Application I Build is Going to Be Easily Partitionable” - S. Butterfield
  • 19. “The Next Application I Build is Going to Be Easily Partitionable” - S. Butterfield
  • 22. class AddIndex < ActiveRecord::Migration def self.up add_index :users, :email end def self.down remove_index :users, :email end end Repeat for any column that appears in a WHERE clause Rails won’t do this for you.
  • 24. class DenormalizeFriendsIds < ActiveRecord::Migration def self.up add_column "users", "friends_ids", :text end def self.down remove_column "users", "friends_ids" end end
  • 25. class Friendship < ActiveRecord::Base belongs_to :user belongs_to :friend after_create :add_to_denormalized_friends after_destroy :remove_from_denormalized_friends def add_to_denormalized_friends user.friends_ids << friend.id user.friends_ids.uniq! user.save_without_validation end def remove_from_denormalized_friends user.friends_ids.delete(friend.id) user.save_without_validation end end
  • 27. bob.friends.map(&:email) Status.count() “email like ‘%#{search}%’”
  • 28. That’s where we are. Seriously. If your Rails application is doing anything more complex than that, you’re doing something wrong*. * or you observed the First Rule of Butterfield.
  • 29. Partitioning Comes Later. (we’ll let you know how it goes)
  • 30. The Cache { Part the Third }
  • 34. !
  • 35. class Status < ActiveRecord::Base class << self def count_with_memcache(*args) return count_without_memcache unless args.empty? count = CACHE.get(“status_count”) if count.nil? count = count_without_memcache CACHE.set(“status_count”, count) end count end alias_method_chain :count, :memcache end after_create :increment_memcache_count after_destroy :decrement_memcache_count ... end
  • 36. class User < ActiveRecord::Base def friends_statuses ids = CACHE.get(“friends_statuses:#{id}”) Status.find(:all, :conditions => [“id IN (?)”, ids]) end end class Status < ActiveRecord::Base after_create :update_caches def update_caches user.friends_ids.each do |friend_id| ids = CACHE.get(“friends_statuses:#{friend_id}”) ids.pop ids.unshift(id) CACHE.set(“friends_statuses:#{friend_id}”, ids) end end end
  • 37. The Future ve d ti r co Ac e R
  • 38. 90% API Requests Cache Them!
  • 39. “There are only two hard things in CS: cache invalidation and naming things.” – Phil Karlton, via Tim Bray
  • 41. You Already Knew All That Other Stuff, Right?
  • 42. Producer Consumer Message Producer Consumer Queue Producer Consumer
  • 43. DRb • The Good: • Stupid Easy • Reasonably Fast • The Bad: • Kinda Flaky • Zero Redundancy • Tightly Coupled
  • 44. ejabberd Jabber Client (drb) Incoming Outgoing Presence Messages Messages MySQL
  • 45. Server DRb.start_service ‘druby://localhost:10000’, myobject Client myobject = DRbObject.new_with_uri(‘druby://localhost:10000’)
  • 46. Rinda • Shared Queue (TupleSpace) • Built with DRb • RingyDingy makes it stupid easy • See Eric Hodel’s documentation • O(N) for take(). Sigh.
  • 47. Timestamp: 12/22/06 01:53:14 (4 months ago) Author: lattice Message: Fugly. Seriously. Fugly. SELECT * FROM messages WHERE substring(truncate(id,0),-2,1) = #{@fugly_dist_idx}
  • 48. It Scales. (except it stopped on Tuesday)
  • 49. Options • ActiveMQ (Java) • RabbitMQ (erlang) • MySQL + Lightweight Locking • Something Else?
  • 50. erlang? What are you doing? Stabbing my eyes out with a fork.
  • 51. Starling • Ruby, will be ported to something faster • 4000 transactional msgs/s • First pass written in 4 hours • Speaks MemCache (set, get)
  • 52. Use Messages to Invalidate Cache (it’s really not that hard)
  • 55. 9000 friends in 24 hours (doesn’t scale)