Giving a talk entitled “Twitter by the Numbers” with @twitterU at Cal. Lots of ##s!
1 minute ago via Twitter for iPhone
Retweeted by 1 person
What’s a Tweet?
It’s a short message that's sent through
140 characters
How many are there?
[Chart: tweets per day, with axis marks at 60M and 70M; today's value is off the chart]
Photo used under Creative Commons from loop_oh
47M served per day
70M tweets per day
70M tweets per day ≈ 800 tweets per second
How big are they?
1 tweet text = 140 characters ≈ 200 bytes
800 tweets per second
≈ 160 KB/sec
≈ 9 MB/min
≈ 12 GB/day
Just tweet text!
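The arithmetic behind these figures can be checked with a short sketch. The variable names are illustrative, and the 200-byte size is the slide's own estimate. In decimal units the results come out at about 9.6 MB/min and about 13.8 GB/day (roughly 12.9 GiB), the same order as the rounded figures on the slide.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

tweets_per_day = 70_000_000
bytes_per_tweet = 200  # slide's estimate for 140 characters of tweet text

# Exact rate implied by 70M tweets per day (the slide rounds this to 800).
tweets_per_second = tweets_per_day / SECONDS_PER_DAY

# Carry the slide's rounded rate forward, as the slide does.
rounded_rate = 800
bytes_per_second = rounded_rate * bytes_per_tweet
bytes_per_minute = bytes_per_second * 60
bytes_per_day = bytes_per_second * SECONDS_PER_DAY

print(f"{tweets_per_second:.0f} tweets/sec")
print(f"{bytes_per_second / 1e3:.0f} KB/sec")
print(f"{bytes_per_minute / 1e6:.1f} MB/min")
print(f"{bytes_per_day / 1e9:.1f} GB/day")
```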
MySQL
Can’t generate IDs fast enough
Centralized and a single point of failure
snowflake
Highly available and uncoordinated (10kqps)
Compatible with the ecosystem
http://github.com/twitter/snowflake
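A minimal sketch of how uncoordinated ID generation can work: each worker packs a millisecond timestamp, its own worker ID, and a per-millisecond sequence number into one 64-bit integer, so no central database is consulted. The bit layout (41/10/12), the epoch value, and the class name IdGenerator are assumptions made for illustration, not details taken from the slide.

```python
import threading
import time

# Assumed layout (not stated on the slide): milliseconds since a custom
# epoch in the high bits, then 10 bits of worker ID, then 12 bits of sequence.
EPOCH_MS = 1288834974657  # assumed custom epoch, in Unix milliseconds
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_WORKER_ID = (1 << WORKER_BITS) - 1
SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1


class IdGenerator:
    """Hypothetical per-worker generator; workers never talk to each other."""

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id <= MAX_WORKER_ID:
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._clock = clock
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                # Same millisecond: bump the sequence number.
                self._sequence = (self._sequence + 1) & SEQUENCE_MASK
                if self._sequence == 0:
                    # Sequence exhausted: spin until the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp = now - EPOCH_MS
            return (
                (timestamp << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self._sequence
            )


generator = IdGenerator(worker_id=7)
ids = [generator.next_id() for _ in range(1000)]
```

Because the timestamp sits in the high bits, IDs from one worker sort roughly by creation time, and two workers with different worker IDs cannot collide even without coordination.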
Photo used under Creative Commons from champura
1 TB generated per day
8 TB generated per day
8 TB per day in total ≈ 100 MB per sec
Photo used under Creative Commons from Mac Users Guide
= 80 MB per sec
Where do they go?
Followed by
Following
Asymmetric Digraph
Tweets multiply
Need to represent this
[Figure: a digraph on nodes 1 to 4, shown next to its matrix representation with rows and columns 1 to 4]
Naïve implementation is not scalable
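To illustrate why the matrix form does not scale, the sketch below builds a small follow graph both as a dense matrix and as per-user sets. The edge list is hypothetical, since the edges of the slide's figure are not recoverable. The size estimate uses the 150M registered users quoted in this talk and assumes one bit per matrix cell.

```python
from collections import defaultdict

# Hypothetical (follower, followed) pairs; not the edges from the slide.
edges = [(1, 2), (1, 3), (2, 3), (4, 1), (4, 3)]
n = 4

# Dense matrix: n * n cells regardless of how many edges exist.
matrix = [[0] * n for _ in range(n)]
for src, dst in edges:
    matrix[src - 1][dst - 1] = 1

# Sparse alternative: store only the edges, indexed in both directions
# because the relationship is asymmetric.
following = defaultdict(set)
followers = defaultdict(set)
for src, dst in edges:
    following[src].add(dst)
    followers[dst].add(src)

# At one bit per cell, a dense matrix for 150M users needs users**2 / 8 bytes.
users = 150_000_000
matrix_bytes = users * users // 8
print(f"{matrix_bytes / 1e15:.2f} PB for the dense matrix")
```

The matrix cost grows with the square of the user count, while the set form grows with the number of follow relationships actually present.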
150M registered users
[Chart: registered users, 2006 to 2010]
Photo used under Creative Commons from jurvetson
flockdb
Distributed graph database
High rate of CRUD operations
Complex set arithmetic queries
http://github.com/twitter/flockdb
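As a rough illustration of what "set arithmetic queries" on a follow graph look like, the sketch below uses plain Python sets. The user names and data are hypothetical, and this is not FlockDB's API, only the kind of intersection, union, and difference a graph store has to answer.

```python
# Hypothetical follower sets keyed by user name.
followers = {
    "alice": {"carol", "dave", "erin"},
    "bob": {"dave", "erin", "frank"},
}

# Intersection: users who follow both alice and bob.
follow_both = followers["alice"] & followers["bob"]

# Union: users who follow at least one of them.
follow_either = followers["alice"] | followers["bob"]

# Difference: users who follow alice but not bob.
follow_only_alice = followers["alice"] - followers["bob"]

print(sorted(follow_both))
print(sorted(follow_either))
print(sorted(follow_only_alice))
```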
@ladygaga
mother mons†er
6.1 million followers
@BarackObama
44th President of the United States
5.3 million followers
@justinbieber
Justin Bieber
5.1 million followers
@raffi
me!
4.1 thousand followers
How do they get out?
6B API calls per day ≈ 70,000 calls per second
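The per-second figure follows from dividing the daily total by the number of seconds in a day. A quick check, with illustrative variable names:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

api_calls_per_day = 6_000_000_000
api_calls_per_second = api_calls_per_day / SECONDS_PER_DAY

print(f"{api_calls_per_second:,.0f} calls per second")
```

This is an average over the whole day; peak load is not stated in the talk.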
REST API
XML/JSON API over HTTP
Poll-based system / pseudo real-time
hosebird
Streaming API
Long poll HTTP
Near real-time delivery of Tweets
752% in 2008
1358% in 2009
Where do we want to be?
Today - 150M people generate ~1000 TPS
Tomorrow - we want to support half the world and all its devices
(5B phones and 6B people)
Real challenges in front of us
Real time
Indexing, search, and analytics
Relevance systems
Graph databases
Storage
Scalability and efficiency