Twitter!  Architecture and Scalability Aditya B 05IT04
WHAT IS TWITTER?
It's an addictive micro-blogging platform: text-based posts of up to 140 characters in length, and followers receive updates.
 
 
Who uses Twitter?
Web Traffic: Twitter's web-based traffic, plus Twitter's API traffic, which is 10x the site's.
As it often happens...
 
Downtimes!! 2008
 
So why the problem? Over 350,000 users (the actual numbers are, as always, top secret). 600 requests per second. An average of 200-300 connections per second, spiking to 800 connections per second. MySQL handled 2,400 requests per second.
What? Why so? When a user, Abhinav, writes a simple “I’m hanging out with…” message, Twitter has two choices: PUSH the message to the queues of each of his 6,864 followers, or wait for the 6,864 followers to log in and then PULL the message. It is not as easy as it looks.
A 6000x multiplication factor. Do you see a scaling problem with this scenario? Scoble writes something: boom, 6,800 writes are kicked off, 1 for each follower. Michael Arrington replies: boom, another 6,600 writes. Jason Calacanis jumps in: boom, another 6,500 writes.
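The push/pull choice above can be made concrete with a small sketch. This is only an illustration under assumed names (`PushTimeline`, `PullTimeline`, and `demo` are hypothetical, and in-memory dictionaries stand in for real queues and storage); it is not Twitter's implementation. It counts how many writes one post costs under each strategy.

```python
from collections import defaultdict


class PushTimeline:
    """Fan-out on write: copy each message into every follower's queue."""

    def __init__(self):
        self.followers = defaultdict(set)  # author -> follower names
        self.inbox = defaultdict(list)     # follower -> delivered messages
        self.writes = 0                    # number of queue writes performed

    def follow(self, follower, author):
        self.followers[author].add(follower)

    def post(self, author, text):
        # One write per follower: this is the multiplication factor.
        for follower in self.followers[author]:
            self.inbox[follower].append((author, text))
            self.writes += 1

    def read(self, user):
        return list(self.inbox[user])


class PullTimeline:
    """Fan-out on read: store the message once, assemble it when read."""

    def __init__(self):
        self.following = defaultdict(set)  # follower -> authors followed
        self.outbox = defaultdict(list)    # author -> own messages
        self.writes = 0

    def follow(self, follower, author):
        self.following[follower].add(author)

    def post(self, author, text):
        # A single write, no matter how many followers the author has.
        self.outbox[author].append((author, text))
        self.writes += 1

    def read(self, user):
        # The cost moves to read time: one lookup per followed author.
        return [m for a in sorted(self.following[user]) for m in self.outbox[a]]


def demo(follower_count=6864):
    """Return (push_writes, pull_writes) for one post by 'abhinav'."""
    push, pull = PushTimeline(), PullTimeline()
    for i in range(follower_count):
        push.follow(f"user{i}", "abhinav")
        pull.follow(f"user{i}", "abhinav")
    push.post("abhinav", "I'm hanging out with...")
    pull.post("abhinav", "I'm hanging out with...")
    return push.writes, pull.writes
```

With the slide's 6,864 followers, `demo()` reports 6,864 writes for push and 1 for pull. Pull is cheap to write but makes every read more expensive, which is why neither choice is as easy as it looks.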
How many men were there? ~350,000. And you, boss? 1 database. Boss, that is a great injustice!
Bottlenecks: Single MySQL database. No monitoring, no graphs, no statistics. Abuses. Plan to partition in the future.
SOLUTION?
Caching: Getting your friends' statuses is complicated; there are security and other issues. So rather than running a query, a friend's status is updated in the cache instead, and the read never touches the database. This gives a predictable response time (upper bound 20 ms).
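A minimal sketch of that idea, assuming a plain dictionary stands in for a real cache such as memcached. The class and method names are hypothetical and nothing here reflects Twitter's actual code: the cache is updated when a status is posted, so reading friends' statuses is a set of cache lookups and no database query.

```python
class StatusCache:
    """Keeps each user's latest status in memory, updated on write."""

    def __init__(self):
        self._latest = {}  # user -> latest status text (stand-in for memcached)

    def update_status(self, user, text):
        # Called when a user posts: the cache entry is overwritten in place.
        self._latest[user] = text

    def friends_status(self, friends):
        # Pure cache reads; friends with no cached status are skipped.
        return {f: self._latest[f] for f in friends if f in self._latest}


cache = StatusCache()
cache.update_status("scoble", "writing something")
statuses = cache.friends_status(["scoble", "arrington"])
```

Because each read is a fixed number of key lookups, its cost does not depend on database load, which is what makes a bounded response time plausible.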
Partitioning: Planned for the future; currently they don't partition. The partition scheme will be based on time, not users, because most requests are very temporally local.
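As a rough illustration of time-based partitioning, the sketch below routes a status to a partition by its creation month. The monthly granularity, the `statuses_YYYY_MM` naming, and the function name are all assumptions made for the example; the slides do not say how the partitions would be cut.

```python
from datetime import datetime, timezone


def partition_for(created_at: datetime) -> str:
    """Return a hypothetical partition name based only on the timestamp."""
    return f"statuses_{created_at:%Y_%m}"


# Two different users posting in the same month land in the same partition,
# so requests for recent updates touch only the newest partition.
scoble_post = partition_for(datetime(2008, 5, 17, 9, 30, tzinfo=timezone.utc))
arrington_post = partition_for(datetime(2008, 5, 29, 22, 5, tzinfo=timezone.utc))
older_post = partition_for(datetime(2007, 11, 2, tzinfo=timezone.utc))
```

A user-based scheme would instead spread one user's recent reading across many partitions, since the people they follow would hash to different places.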
Abuse Prevention: Bots crawl the site and add everyone as friends: 9000 friends in 24 hours. It would take down the site. Be ruthless. Delete them as users. Example account (Saraha): 9000 following, 14 followers, 2 updates.
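One way to catch this pattern early is a per-user limit on follows within a sliding 24-hour window. The sketch below is an assumption-laden example: the `FollowLimiter` name and the default cap of 1000 are invented for illustration, and the slides give no actual threshold.

```python
from collections import defaultdict, deque


class FollowLimiter:
    """Sliding-window cap on how many follows a user may make."""

    def __init__(self, max_follows=1000, window_seconds=24 * 60 * 60):
        self.max_follows = max_follows      # hypothetical limit
        self.window = window_seconds
        self._events = defaultdict(deque)   # user -> timestamps of follows

    def allow(self, user, now):
        """Record and permit a follow at time `now` (seconds), or refuse it."""
        events = self._events[user]
        # Drop follows that have aged out of the window.
        while events and now - events[0] >= self.window:
            events.popleft()
        if len(events) >= self.max_follows:
            return False
        events.append(now)
        return True


limiter = FollowLimiter(max_follows=2, window_seconds=10)
results = [limiter.allow("bot", t) for t in (0, 1, 2, 10)]
```

In the small example, the third follow is refused and the fourth is allowed again once the first has aged out of the window. An account that keeps hitting the cap is a candidate for the "delete them as users" treatment.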
Scalability -- Doing It Right: Asynchronous event-driven design. Partitioning/shards. Parallel execution. Replication (read-mostly).
Are we all doomed to go through this painful process when we are successful? Time-to-market vs. architecture: Good, Fast, Cheap - pick two :P
LESSONS LEARNED
Talk to the community. Treat your scaling plan like a business plan. Build it yourself. Build in user limits. Don't make the database the central bottleneck of doom. Make your application easily partitionable from the start.
Optimize the database. Cache the hell out of everything. Most performance comes not from the language, but from application design. Turn your website into an open service by creating an API. Their API is the single most powerful reason for Twitter's success.
References: http://twitter.com/ ; http://highscalability.com/scaling-twitter-making-twitter-10000-percent-faster ; http://www.slideshare.net/Blaine/scaling-twitter ; http://dev.twitter.com/2008/05/twittering-about-architecture.html ; http://www.danga.com/memcached/ ; http://geekandpoke.com/
QUESTIONS
ADITYA http://twitter.com/arbitya
 
 
 
 
