UC Berkeley - CS10
Giving a talk at Cal’s CS10 - The
Beauty and Joy of Computing.
inst.eecs.berkeley.edu/~cs10/fa10
20 Oct via Twit...
To start... What’s a Tweet?
Follow me!
twitter.com/raffi
or send an SMS to 40404
follow raffi
“Rudimentary communication
among individuals in real time
allows many to move together as
one” - @biz
One Word
April 10, 2008
  4:33:23 PM
 And it’s over
April 11, 2008
1:27:26 PM
Haiti
January 2010
Chile
February 2010
MTV VMAs
September 2010
Emmys
August 2010
The humble beginnings
How does Twitter work?
Twitter is just a “struct”, a graph,
& a list
A Tweet is a simple thing
140 characters
What’s the data structure of a
tweet?

  {
    "id" : 787167620,
    "text" : "Free",
    "date" : "2008-04-11 13:27:26",
    "user" : "jamesbuck"
  }*

  *not quite, but something like this
Where do they go?
Following
Followed by
Tweets multiply!
It’s a huge graph
Viewing Tweets
Home Timeline
[27886578700,
 27856404785,
 27727644786,
 ...
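
The three pieces named so far (a tweet "struct", a follow graph, and a per-user list of tweet IDs) can be sketched in a few lines of Python. This is only an illustration of the idea, not how Twitter is actually built: the class `TinyTwitter`, its method names, the in-memory dictionaries, and the follow relationship in the usage lines are all assumptions made for this example. The field names follow the tweet structure shown earlier.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Tweet:
    """The "struct": field names follow the example tweet in the deck."""
    id: int
    text: str
    date: str
    user: str


class TinyTwitter:
    """Hypothetical in-memory model: a struct, a graph, and a list."""

    def __init__(self):
        self.tweets = {}                     # tweet id -> Tweet
        self.followers = defaultdict(set)    # the graph: user -> users who follow them
        self.timelines = defaultdict(list)   # the list: user -> tweet ids, newest first

    def follow(self, follower, followee):
        self.followers[followee].add(follower)

    def post(self, tweet):
        """Store the tweet, then deliver its ID to every follower's home timeline."""
        if len(tweet.text) > 140:
            raise ValueError("a tweet is at most 140 characters")
        self.tweets[tweet.id] = tweet
        for follower in self.followers[tweet.user]:
            self.timelines[follower].insert(0, tweet.id)
        return len(self.followers[tweet.user])  # number of deliveries made

    def home_timeline(self, user):
        return [self.tweets[tweet_id] for tweet_id in self.timelines[user]]


# Usage with a made-up follow relationship.
twitter = TinyTwitter()
twitter.follow("raffi", "jamesbuck")
deliveries = twitter.post(Tweet(787167620, "Free", "2008-04-11 13:27:26", "jamesbuck"))
print(deliveries, twitter.timelines["raffi"])
```

Note that the home timeline stores only tweet IDs, as in the list above; the tweet bodies are looked up separately when the timeline is rendered.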
Is that it? ... YES! ... sorta ...
things get more complicated at scale
And scaling real-time is hard
How many Tweets are there?

90M!
90M tweets per day ≈ 1000 tweets per second
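
The conversion from tweets per day to tweets per second is simple arithmetic. A quick check, assuming the 90M figure is spread evenly over a 24-hour day (real traffic is burstier than that):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

tweets_per_day = 90_000_000
tweets_per_second = tweets_per_day / SECONDS_PER_DAY

print(f"{tweets_per_second:.0f} tweets per second on average")
```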
Lots of deliveries
Tweets multiply
5,162 people need to see my tweet
@ladygaga
mother mons†er
6.8 million followers

@justinbieber
Justin Bieber
5.7 million followers

@BarackObama
44th President of the United States
5.6 million followers

@raffi
me!
5.1 thousand followers
@ladygaga
mother mons†er
6.8 × 10^6 followers

@raffi
me!
5.1 × 10^3 followers

differs by 10^3

@ladygaga
mother mons†er
6.8 × 10^6 followers

scaling 10^6

6000 10^6
deliveries per minute = deliveries per millisecond
A System is only as strong as its
weakest link
80 MB per sec
≈ 10 ms per seek
10 ms per seek = 100 seeks per second
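
To see why seek time can be the weak link, here is a rough sketch that compares one disk's seek budget with the average tweet rate quoted earlier. It assumes, purely for illustration, that every stored tweet costs one full seek; caching, batching, and sequential writes change the picture in practice.

```python
import math

seek_time_ms = 10
seeks_per_second = 1000 // seek_time_ms      # 100 seeks per second

tweets_per_second = 1000                     # average rate quoted earlier in the deck

# Assumption for this sketch: one seek per stored tweet, no batching.
disks_needed = math.ceil(tweets_per_second / seeks_per_second)

print(seeks_per_second, "seeks/sec per disk ->", disks_needed, "disks just to keep up")
```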
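MySQL
Can’t generate IDs fast enough
Centralized and a single point of failure

snowflake
Highly available and uncoordinated (10kqps)
Compatible with the ecosystem
http://github.com/twitter/snowflake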
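
One way around a centralized ID generator is to let every machine mint IDs on its own by packing a timestamp, a machine number, and a per-millisecond counter into a single integer. The sketch below shows that general idea only. The class name, the bit widths, and the use of the Unix epoch are assumptions chosen for this example; they are not taken from the slides or from the snowflake source.

```python
import threading
import time


class UncoordinatedIdGenerator:
    """Hypothetical generator: timestamp | worker id | sequence."""

    WORKER_BITS = 10      # assumed width: up to 1024 workers
    SEQUENCE_BITS = 12    # assumed width: up to 4096 IDs per worker per millisecond

    def __init__(self, worker_id):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    @staticmethod
    def _now_ms():
        return int(time.time() * 1000)

    def next_id(self):
        with self.lock:
            now = self._now_ms()
            if now < self.last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) % (1 << self.SEQUENCE_BITS)
                if self.sequence == 0:
                    # Counter exhausted for this millisecond: wait for the next one.
                    while now <= self.last_ms:
                        now = self._now_ms()
            else:
                self.sequence = 0
            self.last_ms = now
            shift = self.WORKER_BITS + self.SEQUENCE_BITS
            return (now << shift) | (self.worker_id << self.SEQUENCE_BITS) | self.sequence


# Two workers generate IDs without talking to each other.
a = UncoordinatedIdGenerator(worker_id=1)
b = UncoordinatedIdGenerator(worker_id=2)
ids_a = [a.next_id() for _ in range(5000)]
ids_b = [b.next_id() for _ in range(5000)]
```

Because the worker number is part of every ID, two workers can never collide, and because the timestamp sits in the high bits, each worker's IDs increase over time. No shared database is needed to hand them out.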
How big are they?
1 tweet text = 140 characters ≈ 200 bytes
1000 tweets per second ≈ 195 KB/sec ≈ 11 MB/min ≈ 16 GB/day
Just tweet text!
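
These storage figures work out if KB, MB, and GB are read as powers of 1024 (an assumption; the slide does not say which convention it uses). A short check of the arithmetic:

```python
BYTES_PER_TWEET = 200        # the deck's estimate for 140 characters of text
TWEETS_PER_SECOND = 1000

bytes_per_second = BYTES_PER_TWEET * TWEETS_PER_SECOND

kb_per_second = bytes_per_second / 1024
mb_per_minute = bytes_per_second * 60 / 1024**2
gb_per_day = bytes_per_second * 86_400 / 1024**3

print(f"{kb_per_second:.0f} KB/sec, {mb_per_minute:.0f} MB/min, {gb_per_day:.0f} GB/day")
```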
Lots of deliveries
Digraph (nodes 1, 2, 3, 4)
Need to represent this

Matrix (rows and columns 1, 2, 3, 4)
Naïve implementation is not scalable
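
A small sketch of why a matrix is the naive choice for a follow graph: an adjacency matrix needs one cell per ordered pair of users, while adjacency sets only store the edges that exist. The four-node edge list below is made up for illustration (the slide's actual edges are not recoverable here), and the 160M user count is the figure quoted elsewhere in this deck.

```python
from collections import defaultdict

# Hypothetical edges for a 4-node digraph: (follower, followee).
edges = [(1, 2), (1, 3), (2, 4), (3, 1), (4, 3)]
nodes = [1, 2, 3, 4]

# Adjacency matrix: n x n cells, almost all of them zero for a sparse graph.
index = {node: i for i, node in enumerate(nodes)}
matrix = [[0] * len(nodes) for _ in nodes]
for src, dst in edges:
    matrix[index[src]][index[dst]] = 1

# Adjacency sets: storage grows with the number of edges, not with n squared.
following = defaultdict(set)
for src, dst in edges:
    following[src].add(dst)

users = 160_000_000
matrix_cells = users ** 2   # 2.56e16 cells; even at one bit each, about 3.2 PB

print(matrix)
print(dict(following))
print(f"{matrix_cells:.2e} cells for a full matrix")
```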
flockdb
Distributed graph database
High rate of CRUD operations
Complex set arithmetic queries
http://github.com/twitter/flockdb
Photo used under Creative Commons from jurvetson
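
The "complex set arithmetic queries" that this deck attributes to flockdb can be pictured with plain Python sets. This is not FlockDB's API, just an in-memory sketch with made-up users and hypothetical function names; each function is one basic set operation over the follow graph.

```python
# Hypothetical follower sets: user -> set of users who follow them.
followers = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice", "carol", "erin"},
}


def followers_of_both(a, b):
    """Intersection: users who follow both a and b."""
    return followers[a] & followers[b]


def followers_of_either(a, b):
    """Union: users who follow at least one of a and b."""
    return followers[a] | followers[b]


def followers_only_of(a, b):
    """Difference: users who follow a but not b."""
    return followers[a] - followers[b]


print(sorted(followers_of_both("alice", "bob")))
print(sorted(followers_only_of("alice", "bob")))
```

At Twitter's scale these sets hold millions of entries and live on many machines, which is the part a distributed graph database has to solve and this sketch does not.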
160M registered users
2006   2008   2010
Using the system
6B API calls per day ≈ 70,000 calls per second
Where do we want to be?
Today - 160M people generate ~1000 TPS
Tomorrow - we want to support half the world and all its devices (right now, there are 6B people and 5B phones)
Questions?
Follow me at twitter.com/raffi
Twitter - Guest Lecture UC Berkeley CS10 Fall 2010

I was invited to give a guest lecture at Cal's CS10 ("The Beauty and Joy of Computing"). These are the slides on Twitter.

Published in: Technology

Transcript of "Twitter - Guest Lecture UC Berkeley CS10 Fall 2010"

  1. UC Berkeley - CS10
  2. Giving a talk at Cal’s CS10 - The Beauty and Joy of Computing. inst.eecs.berkeley.edu/~cs10/fa10 20 Oct via Twitter for iPhone from UC Berkeley 155 Donner Lab Berkeley, CA View Tweets at this place
  3. To start... What’s a Tweet?
  4. Follow me! twitter.com/raffi or send an SMS to 40404 follow raffi
  5. “Rudimentary communication among individuals in real time allows many to move together as one” - @biz
  6. One Word April 10, 2008 4:33:23 PM
  7. One Word April 10, 2008 4:33:23 PM And it’s over April 11, 2008 1:27:26 PM
  8. Haiti January 2010 Chile February 2010
  9. MTV VMAs September 2010 Emmys August 2010
  10. The humble beginnings
  11. How does Twitter work?
  12. Twitter is just a “struct”, a graph, & a list
  13. A Tweet is a simple thing 140 characters
  14. What’s the data structure of a tweet? { "id" : 787167620, "text" : "Free", "date" : "2008-04-11 13:27:26", "user" : "jamesbuck" }* *not quite, but something like this
  15. Where do they go? Following Followed by
  16. Tweets multiply!
  17. It’s a huge graph
  18. Viewing Tweets
  19. Home Timeline [27886578700, 27856404785, 27727644786, 27679412331, 27594550593, 27583976236, 27580246968, 27576503025, 27545926564, 27545450496, 27539864336...
  20. Is that it? ... YES! ... sorta ...
  21. things get more complicated at scale
  22. And scaling real-time is hard
  23. How many Tweets are there?
  24. How many Tweets are there? 90M!
  25. 90M tweets per day ≈ 1000 tweets per second
  26. Lots of deliveries
  27. Tweets multiply 5,162 people need to see my tweet
  28. @ladygaga mother mons†er 6.8 million followers @justinbieber Justin Bieber 5.7 million followers @BarackObama 44th President of the United States 5.6 million followers @raffi me! 5.1 thousand followers
  29. @ladygaga mother mons†er 6.8 × 10^6 followers @raffi me! 5.1 × 10^3 followers differs by 10^3
  30. @ladygaga mother mons†er 6.8 × 10^6 followers scaling 10^6
  31. 6000 10^6 deliveries per minute = deliveries per millisecond
  32. A System is only as strong as its weakest link
  33. 80 MB per sec ≈ 10 ms per seek
  34. 10 ms per seek = 100 seeks per second
  35. MySQL: Can’t generate IDs fast enough. Centralized and a single point of failure. snowflake: Highly available and uncoordinated (10kqps). Compatible with the ecosystem. http://github.com/twitter/snowflake
  36. How big are they? 1 tweet text = 140 characters ≈ 200 bytes
  37. 1000 tweets per second ≈ 195 KB/sec ≈ 11 MB/min ≈ 16 GB/day Just tweet text!
  38. Lots of deliveries
  39. Digraph (nodes 1, 2, 3, 4): Need to represent this. Matrix (rows and columns 1, 2, 3, 4): Naïve implementation is not scalable
  40. Photo used under Creative Commons from jurvetson. flockdb: Distributed graph database. High rate of CRUD operations. Complex set arithmetic queries. http://github.com/twitter/flockdb
  41. 160M registered users 2006 2008 2010
  42. Using the system 6B API calls per day ≈ 70,000 calls per second
  43. Where do we want to be? Today - 160M people generate ~1000 TPS Tomorrow - we want to support half the world and all its devices (right now, there are 6B people and 5B phones)
  44. Questions? Follow me at twitter.com/raffi
