Twitter by the Numbers (Columbia University)

The version of "Twitter by the Numbers" that I delivered at Columbia University for @TwitterU. 22 Feb 2011.

Transcript

  • 1. by the #s with @raffi
  • 2. Giving a @twitter talk at Columbia University talking about Twitter’s Numbers! 22 Feb via Twitter for iPhone, from Mudd Building at Columbia University, 500 West 120th Street, New York, New York
  • 3. http://twitter.com/#!/toptweets/status/12483108178
  • 4. http://twitter.com/#!/Emergency_In_SF/status/29440739442
  • 5. http://twitter.com/#!/i80chains/status/9726084734
  • 6. http://twitter.com/#!/remedyoakland/status/29002198672
  • 7. http://twitter.com/#!/AlbionsOven/status/10015063036
  • 8. What’s a Tweet? It’s a short message that’s sent in up to 140 characters
  • 9. How many are there?
  • 10. How many are there? 110M!
  • 11. 110M tweets per day ≈ 1200 tweets per second
  • 12. How big are they? 1 tweet text = 140 characters ≈ 200 bytes
  • 13. 1200 tweets per second ≈ 230 KB/sec ≈ 14 MB/min ≈ 19 GB/day. Just tweet text!
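
The arithmetic on slides 11 to 13 can be checked with a short sketch. It assumes the slides' own inputs (110M tweets per day, about 200 bytes per tweet text) and assumes the KB/MB/GB figures are binary units, which is not stated in the deck; all variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

tweets_per_day = 110_000_000    # from slide 10
bytes_per_tweet = 200           # from slide 12 (tweet text only)

# Exact rate; the slides round this down to 1200 tweets per second.
tweets_per_second = tweets_per_day / SECONDS_PER_DAY

# Use the slides' rounded rate for the bandwidth figures.
rounded_tps = 1200
bytes_per_second = rounded_tps * bytes_per_tweet

# Assumption: the slide's KB/MB/GB are binary units (1024-based).
kib_per_second = bytes_per_second / 1024
mib_per_minute = bytes_per_second * 60 / 1024**2
gib_per_day = bytes_per_second * SECONDS_PER_DAY / 1024**3

print(f"{tweets_per_second:.0f} tweets/sec")
print(f"{kib_per_second:.0f} KiB/sec, {mib_per_minute:.1f} MiB/min, {gib_per_day:.1f} GiB/day")
```

Under these assumptions the results land close to the slide's rounded 230 KB/sec, 14 MB/min, and 19 GB/day.
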
  • 14. MySQL: can’t generate IDs fast enough; centralized and a single point of failure. snowflake: highly available and uncoordinated (10k qps); compatible with the ecosystem. http://github.com/twitter/snowflake
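
To illustrate what "uncoordinated" ID generation on slide 14 can look like, here is a minimal sketch of a timestamp-plus-worker-plus-sequence generator. The bit layout, the epoch value, and the `IdGenerator` class are assumptions made for illustration; they are not taken from the slides and this is not the snowflake codebase itself.

```python
import threading
import time

# All constants below are assumptions for this sketch, not from the slides.
EPOCH_MS = 1288834974657        # arbitrary custom epoch in milliseconds
WORKER_BITS = 10                # up to 1024 independent workers
SEQUENCE_BITS = 12              # up to 4096 IDs per worker per millisecond
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class IdGenerator:
    """Each worker creates IDs locally, with no shared database or lock server."""

    def __init__(self, worker_id: int):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000)
            if now < self.last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & MAX_SEQUENCE
                if self.sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now
            timestamp = now - EPOCH_MS
            return (
                (timestamp << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self.sequence
            )


generator = IdGenerator(worker_id=1)
ids = [generator.next_id() for _ in range(5000)]
```

Because the worker id is baked into every ID, two workers never collide even though they never talk to each other, and IDs from one worker sort roughly by creation time.
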
  • 15. 1 TB generated per day; 10 TB generated per day (Photo used under Creative Commons from champura)
  • 16. 10 TB per day in total ≈ 120 MB per sec; (photo) = 80 MB per sec (Photo used under Creative Commons from Mac Users Guide)
  • 17. Where do they go? Followed by Following Asymmetric Digraph
  • 18. Digraph (nodes 1 to 4): need to represent this. Matrix (rows 1 to 4, columns 1 to 4): naïve implementation is not scalable
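
A small sketch of why the matrix on slide 18 does not scale: it stores a cell for every possible pair of users, whereas adjacency sets store only the edges that exist. The four-node edge list here is hypothetical (the slide's actual edges are not recoverable from the transcript), and `matrix_cells` is an illustrative helper.

```python
# Hypothetical follow edges as (follower, followed) pairs over nodes 1 to 4.
edges = [(1, 2), (1, 3), (2, 4), (3, 1)]
n = 4

# Naive representation: an n x n matrix, matrix[i][j] == 1 if i+1 follows j+1.
matrix = [[0] * n for _ in range(n)]
for src, dst in edges:
    matrix[src - 1][dst - 1] = 1

# Sparse representation: one set per user, in both directions,
# because the graph is asymmetric (following is not the same as followed by).
following = {}
followers = {}
for src, dst in edges:
    following.setdefault(src, set()).add(dst)
    followers.setdefault(dst, set()).add(src)


def matrix_cells(users: int) -> int:
    """Number of cells a full adjacency matrix needs for this many users."""
    return users * users


# 200M registered users (slide 19) would need 4 * 10**16 matrix cells.
cells_at_twitter_scale = matrix_cells(200_000_000)
```

The matrix grows with the square of the user count regardless of how many follow relationships exist, while the adjacency sets grow only with the number of edges.
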
  • 19. 200M registered users (chart axis: 2006, 2008, 2010, 2011)
  • 20. flockdb: distributed graph database; high rate of CRUD operations; complex set arithmetic queries. http://github.com/twitter/flockdb (Photo used under Creative Commons from jurvetson)
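
The "complex set arithmetic queries" on slide 20 can be pictured with plain Python sets. This is only a sketch of the kind of query involved: the user names and helper functions are hypothetical, and nothing here reflects FlockDB's actual API or storage model.

```python
from collections import defaultdict

# Hypothetical follow edges as (follower, followed) pairs.
edges = [
    ("bob", "alice"), ("carol", "alice"), ("dave", "alice"),
    ("alice", "bob"), ("carol", "bob"), ("erin", "bob"),
]

followers = defaultdict(set)   # user -> who follows them
following = defaultdict(set)   # user -> who they follow
for follower, followed in edges:
    followers[followed].add(follower)
    following[follower].add(followed)


def common_followers(a: str, b: str) -> set:
    """Intersection: users who follow both a and b."""
    return followers[a] & followers[b]


def not_followed_back(user: str) -> set:
    """Difference: users who follow `user` but whom `user` does not follow."""
    return followers[user] - following[user]


def combined_audience(a: str, b: str) -> set:
    """Union: users who follow a, b, or both."""
    return followers[a] | followers[b]
```

For example, `common_followers("alice", "bob")` yields `{"carol"}`, and `not_followed_back("alice")` yields `{"carol", "dave"}`.
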
  • 21. @ladygaga, mother mons†er: 8.3 million followers; @justinbieber, Justin Bieber: 7.5 million followers; @BarackObama, 44th President of the United States: 6.7 million followers; @raffi, me!: 0.007 million followers
  • 22. How do they get out? 10B API calls per day ≈ 100,000 calls per second
  • 23. REST API: XML/JSON API over HTTP; poll-based system / pseudo real-time. hosebird (Streaming API): long poll HTTP; near real-time delivery of Tweets
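
The difference between "pseudo real-time" polling and "near real-time" streaming on slide 23 can be sketched with a small model that needs no network. The poll interval, the event times, and the fixed streaming latency are all hypothetical values chosen for illustration.

```python
def polling_delays(event_times_ms, poll_interval_ms):
    """Delay between each event and the first poll at or after it.

    Polls are assumed to happen at 0, poll_interval_ms, 2 * poll_interval_ms, ...
    """
    delays = []
    for t in event_times_ms:
        # Integer ceiling division to find the next poll time.
        next_poll = -(-t // poll_interval_ms) * poll_interval_ms
        delays.append(next_poll - t)
    return delays


def streaming_delays(event_times_ms, transport_latency_ms):
    """With a held-open connection, each event is pushed as soon as it occurs."""
    return [transport_latency_ms for _ in event_times_ms]


# Hypothetical workload: one tweet every 7 ms for a minute, polled every 10 s.
events = list(range(1, 60_001, 7))
poll = polling_delays(events, poll_interval_ms=10_000)
stream = streaming_delays(events, transport_latency_ms=100)

mean_poll_delay = sum(poll) / len(poll)
mean_stream_delay = sum(stream) / len(stream)
```

In this model the average polling delay is roughly half the poll interval, so a client can only reduce latency by polling more often, while the streaming delay stays at the transport latency no matter how events are spaced.
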
  • 24. Latency (chart axis: 200ms, 100ms, 0ms)
  • 25. 752% in 2008
  • 26. 1358% in 2009
  • 27. Where do we want to be? Today - 200M people generate ~1200 TPS. Tomorrow - we want to support half the world and all its devices (5B phones and 6B people)
  • 28. Real challenges in front of us: real time; indexing, search, and analytics; relevance systems; graph databases; storage; scalability and efficiency
  • 29. Questions? Follow me at twitter.com/raffi