Twitter by the Numbers

The slides from a tech talk I gave as part of @TwitterU at UC Berkeley on 9 September 2010.

See the blog post at http://mehack.com/twitter-by-the-numbers and an animated version of the slide deck at http://www.youtube.com/watch?v=TdY0jU697lY

Published in: Technology

Twitter by the Numbers

  1. Twitter by the #s
  2. Giving a talk entitled “Twitter by the Numbers” with @twitterU at Cal. Lots of ##s! (1 minute ago via Twitter for iPhone. Retweeted by 1 person.)
  3. What’s a Tweet? It’s a short message that's sent through Twitter. 140 characters.
  4. How many are there?
  5. How many are there? [Chart of tweets per day; axis labels: 60M, 70M] Today!* (*off the chart)
  6. 47M served per day vs. 70M tweets per day. (Photo used under Creative Commons from loop_oh)
  7. 70M tweets per day ≈ 800 tweets per second
  8. How big are they? 1 tweet text = 140 characters ≈ 200 bytes
  9. 800 tweets per second ≈ 160 KB/sec ≈ 9 MB/min ≈ 12 GB/day. Just tweet text!
  10. MySQL: can’t generate IDs fast enough; centralized and a single point of failure. snowflake: highly available and uncoordinated (10k qps); compatible with the ecosystem. http://github.com/twitter/snowflake
  11. 1 TB generated per day vs. 8 TB generated per day. (Photo used under Creative Commons from champura)
  12. 8 TB per day in total ≈ 100 MB per sec. [Photo] = 80 MB per sec. (Photo used under Creative Commons from Mac Users Guide)
  13. Where do they go? Followed by / Following: an asymmetric digraph.
  14. Tweets multiply
  15. Digraph (nodes 1 to 4): need to represent this. Matrix (rows and columns 1 to 4): naïve implementation is not scalable.
  16. 150M registered users [chart: 2006, 2008, 2010]
  17. flockdb: distributed graph database; high rate of CRUD operations; complex set arithmetic queries. http://github.com/twitter/flockdb (Photo used under Creative Commons from jurvetson)
  18. @ladygaga (mother mons†er): 6.1 million followers. @BarackObama (44th President of the United States): 5.3 million followers. @justinbieber (Justin Bieber): 5.1 million followers. @raffi (me!): 4.1 thousand followers.
  19. How do they get out? 6B API calls per day ≈ 70,000 calls per second
  20. REST API: XML/JSON API over HTTP; poll-based system / pseudo real-time. hosebird (Streaming API): long poll HTTP; near real-time delivery of Tweets.
  21. 752% in 2008
  22. 1358% in 2009
  23. Where do we want to be? Today: 150M people generate ~1000 TPS. Tomorrow: we want to support half the world and all its devices (5B phones and 6B people).
  24. Real challenges in front of us: real time; indexing, search, and analytics; relevance systems; graph databases; storage; scalability and efficiency.
  25. Questions? Follow me at twitter.com/raffi
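
The per-second figures on slides 7, 9, 12, and 19 all come from dividing a daily total by the number of seconds in a day. A minimal Python sketch of that arithmetic follows; the variable names are illustrative, and decimal units (1 TB = 10^12 bytes) are an assumption, since the slides do not say which units they use.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_second(per_day):
    """Convert a daily total into an average per-second rate."""
    return per_day / SECONDS_PER_DAY


# Daily figures quoted on the slides.
TWEETS_PER_DAY = 70_000_000
BYTES_PER_TWEET_TEXT = 200             # slide 8: 140 characters ≈ 200 bytes
TOTAL_BYTES_PER_DAY = 8 * 10**12       # slide 12: 8 TB (decimal units assumed)
API_CALLS_PER_DAY = 6 * 10**9          # slide 19: 6B API calls

tweets_per_sec = per_second(TWEETS_PER_DAY)
text_bytes_per_sec = tweets_per_sec * BYTES_PER_TWEET_TEXT
text_bytes_per_day = TWEETS_PER_DAY * BYTES_PER_TWEET_TEXT
total_bytes_per_sec = per_second(TOTAL_BYTES_PER_DAY)
api_calls_per_sec = per_second(API_CALLS_PER_DAY)

if __name__ == "__main__":
    print(f"tweets/sec:      {tweets_per_sec:,.0f}")
    print(f"tweet text:      {text_bytes_per_sec / 1e3:,.0f} KB/sec")
    print(f"tweet text:      {text_bytes_per_day / 1e9:,.0f} GB/day")
    print(f"all data:        {total_bytes_per_sec / 1e6:,.0f} MB/sec")
    print(f"API calls/sec:   {api_calls_per_sec:,.0f}")
```

The unrounded results differ slightly from the slides, which round at each step (for example, 70M per day works out to about 810 per second, which the deck rounds to 800).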
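
Slide 10 describes snowflake as an uncoordinated ID generator: each worker produces unique IDs on its own, without asking a central database. The Python sketch below illustrates that general idea only; it is not the snowflake code. The bit layout (timestamp, then 10 worker bits, then 12 sequence bits), the epoch constant, and the `IdGenerator` class are assumptions made for illustration.

```python
import threading
import time

# Assumed layout: [milliseconds since EPOCH_MS][10-bit worker][12-bit sequence]
EPOCH_MS = 1288834974657   # assumed custom epoch; any fixed past instant works
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_WORKER_ID = (1 << WORKER_BITS) - 1
SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1


class IdGenerator:
    """Hypothetical per-worker generator of time-ordered 64-bit IDs."""

    def __init__(self, worker_id):
        if not 0 <= worker_id <= MAX_WORKER_ID:
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    @staticmethod
    def _now_ms():
        return int(time.time() * 1000)

    def next_id(self):
        with self._lock:
            now = self._now_ms()
            if now < self._last_ms:
                # Refuse to issue IDs that could collide with earlier ones.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                # Same millisecond: bump the sequence counter.
                self._sequence = (self._sequence + 1) & SEQUENCE_MASK
                if self._sequence == 0:
                    # Counter wrapped: wait for the next millisecond.
                    while now <= self._last_ms:
                        now = self._now_ms()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self._sequence
            )
```

Because the worker number is baked into every ID, two workers never need to talk to each other to avoid collisions, and IDs from a single worker sort by creation time.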
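
Slide 15 says the naïve matrix representation of the follow graph does not scale. A rough Python estimate shows why: a dense matrix grows with the square of the user count, while storing only the edges grows with the number of follow relationships. The average of 100 follows per user and the 16 bytes per edge are hypothetical numbers chosen for illustration, not figures from the talk.

```python
def dense_matrix_bytes(num_users):
    """Bytes for an N x N adjacency matrix at one bit per cell."""
    return num_users * num_users // 8


def edge_list_bytes(num_edges, bytes_per_edge=16):
    """Bytes for storing only existing edges.

    16 bytes per edge assumes two 64-bit user IDs (source, destination).
    """
    return num_edges * bytes_per_edge


REGISTERED_USERS = 150_000_000        # slide 16
ASSUMED_AVG_FOLLOWS = 100             # hypothetical, for illustration only

matrix = dense_matrix_bytes(REGISTERED_USERS)
edges = edge_list_bytes(REGISTERED_USERS * ASSUMED_AVG_FOLLOWS)

if __name__ == "__main__":
    print(f"dense matrix: {matrix / 1e15:.1f} PB")
    print(f"edge list:    {edges / 1e9:.0f} GB")
```

Under these assumptions the matrix needs petabytes, almost all of it zeros, while the edge list fits in hundreds of gigabytes.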
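
Slides 13, 14, and 17 describe an asymmetric follow graph, tweets that multiply as they fan out to followers, and set arithmetic queries over the graph. The in-memory Python sketch below shows what those operations look like on a toy scale. `FollowGraph` and its method names are hypothetical and are not the flockdb API.

```python
from collections import defaultdict


class FollowGraph:
    """Toy asymmetric digraph: an edge (a, b) means 'a follows b'."""

    def __init__(self):
        self._following = defaultdict(set)   # user -> users they follow
        self._followers = defaultdict(set)   # user -> users following them

    def follow(self, src, dst):
        self._following[src].add(dst)
        self._followers[dst].add(src)

    def unfollow(self, src, dst):
        self._following[src].discard(dst)
        self._followers[dst].discard(src)

    def followers(self, user):
        return set(self._followers.get(user, ()))

    def following(self, user):
        return set(self._following.get(user, ()))

    def common_followers(self, a, b):
        """Set intersection: users who follow both a and b."""
        return self.followers(a) & self.followers(b)

    def reciprocal(self, user):
        """Users who follow `user` and are followed back."""
        return self.following(user) & self.followers(user)

    def fan_out(self, author):
        """Number of timelines one tweet by `author` is delivered to."""
        return len(self._followers.get(author, ()))


if __name__ == "__main__":
    g = FollowGraph()
    g.follow("alice", "carol")
    g.follow("bob", "carol")
    g.follow("alice", "bob")
    g.follow("bob", "alice")
    print(g.common_followers("carol", "bob"))
    print(g.reciprocal("alice"))
    print(g.fan_out("carol"))
```

Keeping both directions of each edge makes "who follows X" and "who does X follow" equally cheap, at the cost of writing every follow twice.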
