by the #s
Giving a talk entitled “Twitter by the Numbers” with @twitterU at Cal. Lots of ##s! ...
What’s a Tweet?

It’s a short message that's sent through




                                           140 characters
How many are there?
[Chart: axis marks at 60M and 70M; “Today!” is marked *]
...
Photo used under Creative Commons from loop_oh
47M serve...
70M tweets per day = 800 tweets per second
How big are they?


1 tweet text = 140 characters ≈ 200 bytes
800 tweets per second ≈ 160 KB/sec ≈ 9 MB/min ≈ 12 GB/day
      Just tweet t...
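
The arithmetic on these slides can be retraced in a few lines. This is a minimal sketch that uses only the figures quoted above (70M tweets per day, roughly 200 bytes per tweet, a rounded 800 tweets per second); the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400

tweets_per_day = 70_000_000      # figure quoted on the slides
bytes_per_tweet = 200            # slide estimate for 140 characters of text

# Exact rate, before the slides round it to 800 tweets per second.
tweets_per_second = tweets_per_day / SECONDS_PER_DAY

# Bandwidth for tweet text only, starting from the rounded rate.
rounded_tps = 800
bytes_per_second = rounded_tps * bytes_per_tweet
bytes_per_minute = bytes_per_second * 60
bytes_per_day = bytes_per_second * SECONDS_PER_DAY

print(f"{tweets_per_second:.0f} tweets/sec")
print(f"{bytes_per_second / 1e3:.0f} KB/sec")
print(f"{bytes_per_minute / 1e6:.1f} MB/min")
print(f"{bytes_per_day / 1e9:.1f} GB/day")
```

The unrounded products come out slightly above the slide figures (about 9.6 MB/min and 13.8 GB/day), which is consistent with the slides rounding down at each step.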
MySQL
         Can’t generate IDs fast enough
Centralized and a single point of failure




                              ...
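
To illustrate what uncoordinated ID generation can look like (the approach the talk attributes to snowflake, as opposed to a central MySQL allocator), here is a minimal sketch. The bit layout (41-bit millisecond timestamp, 10-bit worker ID, 12-bit sequence), the epoch value, and the `IdGenerator` class are assumptions made for illustration, not the actual snowflake code.

```python
import threading
import time

EPOCH_MS = 1262304000000   # arbitrary custom epoch (2010-01-01 UTC); an assumption
WORKER_BITS = 10           # assumed layout: up to 1024 workers
SEQUENCE_BITS = 12         # assumed layout: up to 4096 IDs per worker per millisecond
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class IdGenerator:
    """Hypothetical per-worker generator; workers never talk to each other."""

    def __init__(self, worker_id):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    @staticmethod
    def _now_ms():
        return int(time.time() * 1000)

    def next_id(self):
        with self._lock:
            now = self._now_ms()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._now_ms()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self._sequence
            )


gen = IdGenerator(worker_id=3)
ids = [gen.next_id() for _ in range(10_000)]
```

Because the worker ID is baked into every value, two workers can never emit the same ID, and the leading timestamp keeps each worker's IDs in increasing order without a shared counter.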
Photo used under Creative Commons from champura
...
8 TB per day in total ≈ 100 MB ...
Where do they go?
Following
Followed by
Asymmetric Digraph
Tweets multiply
Digraph: need to represent this
[Diagram: digraph with numbered nodes]

                            ...
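
One common way to represent an asymmetric follow graph without a dense matrix is a pair of adjacency sets per user. A full matrix over 150M users would need 2.25 × 10^16 cells, almost all of them empty. The sketch below is a single-process illustration with hypothetical names (`FollowGraph`, `fanout`) and made-up edges; it is not how flockdb is implemented.

```python
from collections import defaultdict


class FollowGraph:
    """Asymmetric follow graph stored as two adjacency-set maps."""

    def __init__(self):
        self._following = defaultdict(set)   # user -> users they follow
        self._followers = defaultdict(set)   # user -> users who follow them

    def follow(self, src, dst):
        # One directed edge src -> dst, indexed from both ends.
        self._following[src].add(dst)
        self._followers[dst].add(src)

    def unfollow(self, src, dst):
        self._following[src].discard(dst)
        self._followers[dst].discard(src)

    def following(self, user):
        return set(self._following.get(user, ()))

    def followers(self, user):
        return set(self._followers.get(user, ()))

    def mutual(self, user):
        # Set arithmetic: users followed by `user` who also follow back.
        return self.following(user) & self.followers(user)

    def fanout(self, author):
        # A tweet by `author` is delivered to every follower.
        return self.followers(author)


graph = FollowGraph()
for src, dst in [(1, 2), (2, 1), (3, 1), (4, 1), (1, 4)]:   # made-up edges
    graph.follow(src, dst)

print(sorted(graph.followers(1)))   # [2, 3, 4]
print(sorted(graph.mutual(1)))      # [2, 4]
```

Storage grows with the number of edges rather than with the square of the number of users, and keeping both directions indexed makes "who follows X" as cheap as "whom does X follow".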
150M registered users
[Chart: time axis marked 2006, 2008, 2010]
Photo used under Creative Commons from jurvetson




flockdb
Distributed graph database
High rate of CRUD ope...
@ladygaga
mother mons†er
6.1 million followers

@BarackObama
44th President of the United States
5.3 million followers

@ju...
How do they get out?


6B API calls per day ≈ 70,000 calls per second
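
As a quick check of this figure, and to compare it with the 70M tweets per day quoted earlier in the talk, the following sketch divides the daily totals. The calls-per-tweet ratio is a derived illustration, not a number from the slides.

```python
SECONDS_PER_DAY = 24 * 60 * 60

api_calls_per_day = 6_000_000_000   # figure quoted on the slide
tweets_per_day = 70_000_000         # figure quoted earlier in the talk

calls_per_second = api_calls_per_day / SECONDS_PER_DAY
calls_per_tweet = api_calls_per_day / tweets_per_day

print(f"{calls_per_second:,.0f} API calls/sec")
print(f"{calls_per_tweet:.1f} API calls per tweet written")
```

Under these figures the API handles on the order of 85 calls for every tweet written, which suggests a heavily read-dominated workload.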
REST API
XML/JSON API over HTTP
Poll-based system / pseudo real-time

hosebird
...
752% in 2008
1358% in 2009
Where do we want to be?

Today - 150M people generate ~1000 TPS
Tomorrow - we want to support half the world a...
Real challenges in front of us
Real time
Indexing, search, and analytics
Relevance ...
Questions? Follow me at twitter.com/raffi
Twitter by the Numbers
The slides to a tech talk I gave as part of @TwitterU at UC Berkeley on 9 September 2010.

See the blog post at http://mehack.com/twitter-by-the-numbers, and an animated version of the slide deck at http://www.youtube.com/watch?v=TdY0jU697lY

Published in: Technology
Twitter by the Numbers

  1. by the #s
  2. Giving a talk entitled “Twitter by the Numbers” with @twitterU at Cal. Lots of ##s! (1 minute ago via Twitter for iPhone. Retweeted by 1 person.)
  3. What’s a Tweet? It’s a short message that's sent through 140 characters
  4. How many are there?
  5. How many are there? [Chart: axis marks at 60M and 70M] Today!* (*off the chart)
  6. Photo used under Creative Commons from loop_oh. 47M served per day; 70M tweets per day
  7. 70M tweets per day = 800 tweets per second
  8. How big are they? 1 tweet text = 140 characters ≈ 200 bytes
  9. 800 tweets per second ≈ 160 KB/sec ≈ 9 MB/min ≈ 12 GB/day. Just tweet text!
  10. MySQL: can’t generate IDs fast enough; centralized and a single point of failure. snowflake: highly available and uncoordinated (10kqps); compatible with the ecosystem. http://github.com/twitter/snowflake
  11. Photo used under Creative Commons from champura. 1 TB generated per day; 8 TB generated per day
  12. 8 TB per day in total ≈ 100 MB per sec. [Photo used under Creative Commons from Mac Users Guide] = 80 MB per sec
  13. Where do they go? Following; Followed by; Asymmetric Digraph
  14. Tweets multiply
  15. Digraph (nodes 1 to 4): need to represent this. Matrix (rows and columns 1 to 4): naïve implementation is not scalable
  16. 150M registered users [Chart: time axis marked 2006, 2008, 2010]
  17. Photo used under Creative Commons from jurvetson. flockdb: distributed graph database; high rate of CRUD operations; complex set arithmetic queries. http://github.com/twitter/flockdb
  18. @ladygaga (mother mons†er): 6.1 million followers. @BarackObama (44th President of the United States): 5.3 million followers. @justinbieber (Justin Bieber): 5.1 million followers. @raffi (me!): 4.1 thousand followers
  19. How do they get out? 6B API calls per day ≈ 70,000 calls per second
  20. REST API: XML/JSON API over HTTP; poll-based system / pseudo real-time. hosebird: Streaming API; long poll HTTP; near real-time delivery of Tweets
  21. 752% in 2008
  22. 1358% in 2009
  23. Where do we want to be? Today - 150M people generate ~1000 TPS. Tomorrow - we want to support half the world and all its devices (5B phones and 6B people)
  24. Real challenges in front of us: real time; indexing, search, and analytics; relevance systems; graph databases; storage; scalability and efficiency
  25. Questions? Follow me at twitter.com/raffi
