The little engine(s) that could: scaling online social networks

Presentation of the SIGCOMM 2010 paper: "The little engine(s) that could: scaling online social networks".

The full paper is available at http://ccr.sigcomm.org/online/?q=node/642 or at http://research.tid.es/jmps/

  1. The Little Engine(s) that could: Scaling Online Social Networks
     Josep M. Pujol, Vijay Erramilli, Georgos Siganos, Xiaoyuan Yang, Nikos Laoutaris, Parminder Chhabra and Pablo Rodriguez
     Telefonica Research, Barcelona. ACM SIGCOMM 2010, New Delhi, India.
  2. Outline: Problem, Solution, Implementation, Conclusions
  3. Scalability is a pain: the designers’ dilemma
     “Scalability is a desirable property of a system, a network, or a process, which indicates its ability to either handle growing amounts of work in a graceful manner or to be readily enlarged.”
     The designers’ dilemma: wasted complexity and long time to market vs. short time to market but risk of death by success.
  4. Towards transparent scalability
     The advent of the Cloud: virtualized machinery + gigabit networks. Hardware scalability vs. application scalability.
  5. Towards transparent scalability
     Elastic resource allocation for the Presentation and Logic layers: more machines when load increases. Their components are stateless, therefore independent and duplicable. Data-backend components are interdependent, therefore non-duplicable on demand.
  6. Scaling the data backend...
     • Full replication
       – Load of read requests decreases with the number of servers
       – State does not decrease with the number of servers
  7. Scaling the data backend...
     • Full replication
       – Load of read requests decreases with the number of servers
       – State does not decrease with the number of servers
       – Maintains data locality
  8. Scaling the data backend...
     • Full replication
       – Load of read requests decreases with the number of servers
       – State does not decrease with the number of servers
       – Maintains data locality
     Operations on the data (e.g. a query) can be resolved in a single server from the application's perspective.
     (Diagram: centralized vs. distributed deployments of the application logic and data store.)
  9. Scaling the data backend...
     • Full replication
       – Load of read requests decreases with the number of servers
       – State does not decrease with the number of servers
       – Maintains data locality
     • Horizontal partitioning or sharding
       – Load of read requests decreases with the number of servers
       – State decreases with the number of servers
       – Maintains data locality as long as...
  10. Scaling the data backend...
      • Full replication
        – Load of read requests decreases with the number of servers
        – State does not decrease with the number of servers
        – Maintains data locality
      • Horizontal partitioning or sharding
        – Load of read requests decreases with the number of servers
        – State decreases with the number of servers
        – Maintains data locality as long as... the splits are disjoint (independent). Bad news for OSNs!
  11. Why is sharding not enough for OSNs?
      • Shards in an OSN can never be disjoint, because operations on user i require access to the data of other users, at least those one hop away.
      • From graph theory: if the social graph is a single connected component, there is no partition in which every node and all of its neighbors land in the same partition.
      Data locality cannot be maintained by partitioning a social network! (A small brute-force check of this claim is sketched right after this slide.)
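
The graph-theory claim on slide 11 can be checked by brute force on a toy graph. The sketch below is my own illustration in Python, not code from the paper; the ring graph and helper names are made up. It enumerates every split of a small connected graph into two non-empty shards and confirms that none keeps every user on the same shard as all of its neighbors.

```python
from itertools import product

edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]   # a small connected ring
nodes = sorted({u for e in edges for u in e})

def preserves_locality(assignment):
    """True if every user shares a shard with all of its neighbors."""
    return all(assignment[u] == assignment[v] for u, v in edges)

found = any(
    preserves_locality(dict(zip(nodes, shards)))
    for shards in product([0, 1], repeat=len(nodes))
    if len(set(shards)) == 2                 # skip the trivial one-shard split
)
print("locality-preserving non-trivial split exists:", found)   # -> False
```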
  12. A very active area...
      • Online Social Networks are massive systems spanning hundreds of machines in multiple datacenters.
      • Partition and placement to minimize network traffic and delay:
        – Karagiannis et al., “Hermes: Clustering Users in Large-Scale Email Services”, SoCC 2010
        – Agarwal et al., “Volley: Automated Data Placement for Geo-Distributed Cloud Services”, NSDI 2010
      • Partition and replication towards scalability:
        – Pujol et al., “Scaling Online Social Networks without Pains”, NetDB / SOSP 2009
  13. Outline: Problem, Solution, Implementation, Conclusions
  14. OSN operations 101. (Diagram: a small social graph with users 0-9.)
  15. OSN operations 101. (Diagram: the same graph, with one user highlighted as “Me”.)
  16. OSN operations 101. (Diagram: the user’s friends and their inbox, initially the empty list [].)
  17. OSN operations 101. (Diagram: a status update stored under key-101 appears; the inbox is still [].)
  18. OSN operations 101. (Diagram: key-101 is added to the inbox: [key-101].)
  19. OSN operations 101. (Diagram: after more updates, the inbox is [key-146, key-121, key-105, key-101], pointing to statuses key-101, key-105, key-121 and key-146.)
  20. OSN operations 101. “Let’s see what’s going on in the world.”
  21. OSN operations 101. Step 1) FETCH inbox [from server A].
  22. OSN operations 101. Step 1) FETCH inbox [from server A] returns R = [key-146, key-121, key-105, key-101].
  23. OSN operations 101. Step 2) FETCH text WHERE key IN R [from server ?].
  24. OSN operations 101. Step 2) FETCH text WHERE key IN R [from server ?]. Where is the data?
  25. OSN operations 101. If all the keys in R are on server A, then we have data locality!
  26. OSN operations 101. If the keys in R are spread over other servers (B, E, F), then there is no data locality!
  27. OSN operations 101
      1) Relational databases: selects and joins across multiple shards of the database are possible, but performance is poor (e.g. MySQL Cluster, Oracle RAC).
      2) Key-value stores (DHT): more efficient than relational databases, thanks to multi-get primitives that transparently fetch data from multiple servers. But they are not a silver bullet:
         o you lose the SQL query language => programmatic queries
         o you lose the abstraction from data operations
         o they suffer from high traffic, eventually affecting performance: the incast issue, the multi-get hole, and latency dominated by the worst-performing server
      (A sketch of this two-step read path over a key-value store follows this slide.)
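
To make the read path of slides 20-27 concrete, here is an illustrative Python sketch, not code from the paper: the per-server dictionaries and the locate() helper are hypothetical. It performs the two-step read over a key-value store (fetch the inbox key list, then multi-get the referenced statuses), which under random placement touches several servers.

```python
# Illustrative sketch of the two-step read from slides 20-27, over a
# hypothetical in-memory key-value store per server. With random placement,
# step 2 becomes a multi-get that touches many servers.

servers = {                       # server name -> its local key-value data
    "A": {"inbox:me": ["key-146", "key-121", "key-105", "key-101"],
          "key-101": "first tweet"},
    "B": {"key-105": "second tweet", "key-121": "third tweet"},
    "E": {"key-146": "fourth tweet"},
}

def locate(key):
    """Directory lookup: which server holds this key? (hypothetical helper)"""
    return next(name for name, data in servers.items() if key in data)

# Step 1: FETCH inbox [from server A]
inbox_keys = servers["A"]["inbox:me"]

# Step 2: FETCH text WHERE key IN R -- a multi-get across whichever servers
# happen to hold the keys; there is no data locality unless they are all on A.
texts = {k: servers[locate(k)][k] for k in inbox_keys}
touched = {locate(k) for k in inbox_keys}

print(texts)
print("servers touched by the multi-get:", sorted(touched))   # A, B, E
```

The set of servers touched in step 2 is exactly what data locality is meant to shrink down to one.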
  28. Maintaining data locality. (Diagram: the social graph with users 0-9.)
  29. Maintaining data locality: random partition (DHT-like). Users are assigned to servers randomly. (Diagram: the graph split across Server 1 and Server 2.)
  30. Maintaining data locality: random partitioning (DHT).
  31. Maintaining data locality: random partitioning (DHT).
  32. Maintaining data locality: random partitioning (DHT). How do we enforce data locality?
  33. Maintaining data locality: random partitioning (DHT). How do we enforce data locality? Replication.
  34. Maintaining data locality. (Diagram: users 4 and 7 are replicated onto the other server.)
  35. Maintaining data locality. (Diagram: user #4 now has a master plus a replica.)
  36. Maintaining data locality. That does it for user #3, but what about the others?
  37. Maintaining data locality. That does it for user #3, but what about the others? The very same procedure...
  38. Maintaining data locality. After a while... data locality for reads for any user is guaranteed! (Diagram: replicas of many users now exist on both servers.)
  39. Maintaining data locality. Data locality for reads for any user is guaranteed!
  40. Maintaining data locality. Data locality for reads for any user is guaranteed!
  41. Maintaining data locality. Data locality for reads for any user is guaranteed, but random partition + replication can lead to high replication overheads: this is full replication!
  42. Maintaining data locality. Random partition + replication can lead to high replication overheads. Can we do better?
  43. Maintaining data locality. We can leverage the underlying social structure for the partition...
  44. Maintaining data locality. We can leverage the underlying social structure for the partition...
  45. Maintaining data locality: social partition. (Diagram: users regrouped so that socially connected users land on the same server.)
  46. Maintaining data locality: Social Partition and Replication (SPAR). (Diagram: only users 2 and 3 get replicas across the two servers.)
  47. Maintaining data locality: Social Partition and Replication (SPAR). Only two replicas are needed to achieve data locality.
  48. Maintaining data locality: Social Partition and Replication (SPAR). Only two replicas are needed to achieve data locality. (A small sketch comparing the replicas needed under a random and a social partition follows this slide.)
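
The replication savings illustrated on slides 41-48 can be quantified on a toy example. The sketch below uses a toy graph and server assignments of my own, not the paper's data: it counts how many replicas a given master placement needs so that every user's reads stay local, and shows that a community-aware split needs far fewer replicas than an interleaved, random-like one.

```python
# Replicas needed for read locality: each master must be co-located with a
# copy of every one of its neighbors.

edges = [(0, 1), (0, 2), (1, 2), (2, 3),          # community A: users 0-3
         (4, 5), (4, 6), (5, 6), (6, 7),          # community B: users 4-7
         (3, 4)]                                  # one bridge edge

def replicas_for_locality(assignment):
    """Count (server, user) replica pairs forced by cross-server edges."""
    needed = set()
    for u, v in edges:
        if assignment[u] != assignment[v]:
            needed.add((assignment[u], v))   # replica of v on u's server
            needed.add((assignment[v], u))   # replica of u on v's server
    return len(needed)

random_like = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1, 6: 0, 7: 1}   # interleaved
social      = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 1}   # by community

print("random-like partition:", replicas_for_locality(random_like), "replicas")  # 8
print("social partition:     ", replicas_for_locality(social), "replicas")       # 2
```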
  49. SPAR algorithm from 10,000 feet. “All partition algorithms are equal, but some are more equal than others.”
  50. SPAR algorithm from 10,000 feet. “All partition algorithms are equal, but some are more equal than others.”
      1) Online (incremental)
         a) Dynamics of the social network (add/remove node, edge)
         b) Dynamics of the system (add/remove server)
      2) Fast (and simple)
         a) Local information
         b) Hill-climbing heuristic
         c) Load balancing via back-pressure
      3) Stable
         a) Avoid cascades
      4) Effective
         a) Optimize for MIN_REPLICA (i.e. NP-hard)
         b) While maintaining a fixed number of replicas per user for redundancy (e.g. K = 2)
      (A deliberately simplified sketch of such an online, edge-driven heuristic follows this slide.)
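
As a rough illustration of the properties listed on slide 50, the following sketch implements a hypothetical online, edge-driven partitioner: on each new edge it greedily either keeps both users where they are or co-locates one with the other, using only local information and a crude balance check. It is not the SPAR algorithm from the paper; in particular it omits the K-redundancy replicas and the back-pressure load balancer, and all class and parameter names are my own.

```python
from collections import defaultdict

class GreedyPartitioner:
    """Simplified, hypothetical online partitioner in the spirit of slide 50."""

    def __init__(self, num_servers, slack=1.5):
        self.num_servers = num_servers
        self.slack = slack                    # crude balance constraint
        self.master = {}                      # user -> server of its master
        self.neighbors = defaultdict(set)

    def _load(self, server):
        return sum(1 for s in self.master.values() if s == server)

    def _cost(self, moved):
        """Locality replicas forced by the users in `moved` (user -> server),
        assuming everyone else stays where they are."""
        where = lambda x: moved.get(x, self.master[x])
        return sum(1 for u, s in moved.items()
                   for n in self.neighbors[u] if where(n) != s)

    def add_edge(self, u, v):
        for x in (u, v):                      # place unseen users greedily
            if x not in self.master:
                self.master[x] = min(range(self.num_servers), key=self._load)
        self.neighbors[u].add(v)
        self.neighbors[v].add(u)

        avg = max(1.0, len(self.master) / self.num_servers)
        candidates = [{u: self.master[u], v: self.master[v]},   # no move
                      {u: self.master[v], v: self.master[v]},   # u joins v
                      {u: self.master[u], v: self.master[u]}]   # v joins u
        def feasible(c):                      # reject badly unbalanced moves
            return all(self._load(s) <= self.slack * avg for s in c.values())
        best = min((c for c in candidates if feasible(c)),
                   key=self._cost, default=candidates[0])
        self.master.update(best)

# Example: feed a few friendship edges and inspect the master placement.
p = GreedyPartitioner(num_servers=2)
for e in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    p.add_edge(*e)
print(p.master)
```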
  51. Looks good on paper, but... let’s try it for real.
      Real OSN data:
      • Twitter: 2.4M users, 48M edges, 12M tweets over 15 days (Twitter as of Dec 2008)
      • Orkut (MPI): 3M users, 223M edges
      • Facebook (MPI): 60K users, 500K edges
      Partition algorithms:
      • Random (DHT)
      • METIS
      • MO+ (modularity optimization + hack)
      • SPAR online
  52. How many replicas are generated? Replication overhead (and % over K=2) on Twitter:
                    16 partitions        128 partitions
      SPAR          2.44  (+22%)          3.69  (+84%)
      MO+           3.04  (+52%)          5.14  (+157%)
      METIS         3.44  (+72%)          8.03  (+302%)
      Random        5.68  (+184%)        10.75  (+434%)
  53. How many replicas are generated? SPAR on Twitter with 16 partitions needs 2.44 replicas per user on average: 2 to guarantee redundancy + 0.44 to guarantee data locality (+22% over K=2). (The “% over K=2” column follows directly from the averages, as the small computation after this slide shows.)
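
The "% over K=2" column on slides 52-55 is simple arithmetic over the average replica counts; a quick check using the Twitter, 16-partition numbers:

```python
# With K = 2 copies required for redundancy, overhead = (avg_replicas - K) / K.
K = 2
avg_replicas = {"SPAR": 2.44, "MO+": 3.04, "METIS": 3.44, "Random": 5.68}
for algo, r in avg_replicas.items():            # Twitter, 16 partitions
    print(f"{algo:6s} +{(r - K) / K:.0%}")      # SPAR -> +22%, Random -> +184%
```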
  54. How many replicas are generated? Replication with random partitioning is too costly!
  55. How many replicas are generated? Other social-based partitioning algorithms have higher replication overhead than SPAR.
  56. Outline: Problem, Solution, Implementation, Conclusions
  57. The SPAR architecture. (Diagram.)
  58. The SPAR architecture. (Diagram: the application keeps its standard 3-tier design.)
  59. The SPAR architecture. (Diagram: the 3-tier application on top of the SPAR middleware, which is data-store specific.)
  60. The SPAR architecture. (Diagram: the 3-tier application, the data-store-specific SPAR middleware, and the SPAR controller acting as partition manager and directory service. A sketch of how requests could be routed through the directory service follows this slide.)
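
The split of responsibilities on slides 57-60 can be sketched as follows. This is a hypothetical illustration, with class names and interfaces of my own rather than the actual SPAR middleware API: the controller's directory service records where each user's master lives, and the middleware routes every request for that user to that single server, where SPAR also keeps replicas of the user's neighbors.

```python
class DirectoryService:
    """Maintained by the SPAR controller / partition manager."""
    def __init__(self):
        self._master_of = {}                 # user -> server id

    def set_master(self, user, server):
        self._master_of[user] = server

    def lookup(self, user):
        return self._master_of[user]

class SparLikeMiddleware:
    """Sits between the application tier and the per-server data stores."""
    def __init__(self, directory, stores):
        self.directory = directory
        self.stores = stores                 # server id -> dict-like store

    def get(self, user, key):
        # Every read for `user` goes to the one server hosting its master;
        # with SPAR-style replication, neighbors' data is local there too.
        server = self.directory.lookup(user)
        return self.stores[server][key]

# Tiny usage example with two in-memory "servers".
directory = DirectoryService()
directory.set_master("alice", 0)
stores = {0: {"inbox:alice": ["key-101", "key-105"],
              "key-101": "hi", "key-105": "hello"},
          1: {}}
mw = SparLikeMiddleware(directory, stores)
inbox = mw.get("alice", "inbox:alice")
print([mw.get("alice", k) for k in inbox])   # all reads hit server 0
```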
  61. SPAR in the wild
      • Twitter clone (Laconica, now StatusNet): centralized architecture, PHP + MySQL/Postgres
      • Twitter data: Twitter as of the end of 2008, 2.4M users, 12M tweets in 15 days
      • The little engines: 16 commodity desktops (Pentium Duo at 2.33 GHz, 2 GB RAM) connected with a Gigabit Ethernet switch
      • Test SPAR on top of: MySQL (v5.5) and Cassandra (v0.5.0)
  62. SPAR on top of MySQL
      • Read request rates at the application level:
        – 1 call to the LDS, 1 join on the notice and notice_inbox tables to get the last 20 tweets and their content
        – A user is queried only once per session (4 min)
        – SPAR allows the data to be partitioned, improving both query times and cache hit ratios
        – Full replication suffers from a low cache hit ratio and large tables (1B+ rows)
      Throughput at a 99th-percentile latency under 150 ms: MySQL with full replication, 16 req/s; SPAR + MySQL, 2,500 req/s. (An illustrative version of this read query is sketched after this slide.)
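
An illustrative version of the read described on slide 62, with the directory lookup and the single join. The schema, column names and the in-memory SQLite stand-in are my assumptions, modeled loosely on a StatusNet-like notice/notice_inbox layout, and are not taken from the paper.

```python
import sqlite3   # stand-in for the MySQL shard holding this user's partition

LAST_TWEETS_SQL = """
SELECT n.id, n.content
FROM notice_inbox AS i
JOIN notice AS n ON n.id = i.notice_id
WHERE i.user_id = ?
ORDER BY n.created DESC
LIMIT 20
"""

def read_timeline(directory, shards, user_id):
    shard = shards[directory[user_id]]           # 1 call to the directory
    return shard.execute(LAST_TWEETS_SQL, (user_id,)).fetchall()   # 1 join

# Minimal local demo with one in-memory shard.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE notice (id INTEGER PRIMARY KEY, content TEXT, created INTEGER);
CREATE TABLE notice_inbox (user_id INTEGER, notice_id INTEGER);
INSERT INTO notice VALUES (101, 'hi', 1), (105, 'hello', 2);
INSERT INTO notice_inbox VALUES (7, 101), (7, 105);
""")
print(read_timeline({7: 0}, {0: db}, user_id=7))
```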
  63. SPAR on top of Cassandra
      Network bandwidth is not an issue, but network I/O and CPU I/O are:
      – Network delay is reduced (the worst-performing server is what produces delays)
      – The memory hit ratio is increased (random partitioning destroys correlations)
      Throughput at a 99th-percentile latency under 100 ms: vanilla Cassandra, 200 req/s; SPAR + Cassandra, 800 req/s.
  64. Outline: Problem, Solution, Implementation, Conclusions
  65. Conclusions. SPAR provides the means to achieve transparent scalability:
      1) for applications using an RDBMS (not necessarily limited to OSNs), and
      2) a performance boost for key-value stores due to the reduction of network I/O.
  66. Thanks! jmps@tid.es, http://research.tid.es/jmps/, @solso
