Scaling Online Social Networks (OSNs)

The final presentation of a semester project.
Course: Implementation of Distributed Systems (KTH Royal Institute of Technology)

Published in: Technology

  1. Scaling Online Social Networks (OSNs)
     Presented by: Maria Stylianou
     Coworker: Anis Uddin
     Supervisor: Šarūnas Girdzijauskas
     KTH - Royal Institute of Technology
     Implementation of Distributed Systems
     December 6th, 2012
  2. Outline
     ● Motivation
     ● Current Algorithms
       – SPAR
       – JA-BE-JA
     ● Contributions
       – Challenges
       – Solution
     ● Evaluation & Conclusions
  4. Online Social Networks: “Pandora's box”
     Source: http://technorati.com/social-media/article/social-networks-theyre-what-every-local/
     Motivation-Algorithms-Contribution-Evaluation
  5. Online Social Networks: Easy to maintain...
     Source: http://mastersofmedia.hum.uva.nl/2009/09/14/a-review-of-taken-out-of-context/
  6. Online Social Networks: ...or not!
     Source: http://mastersofmedia.hum.uva.nl/2009/09/14/a-review-of-taken-out-of-context/
  7. Scaling Approaches
     Vertical Scaling
     ● Full Replication
     ● Data Locality
     ● But:
       – Expensive
       – Saturation
     Horizontal Scaling
     ● Adding servers
     ● Clean & Disjoint Partitions
     ● But:
       – Not applicable in OSNs
     → Inefficient
  9. Existing Solutions for OSNs
     ● Relational Databases
     ● Key-Value Stores
     → Inefficient
  12. SPAR: Social Partitioning & Replication middleware
      ● Transparent OSN scalability
      ● Data Locality
      ● Load Balancing
      ● Fault Tolerance
      ● Stability
      ● Replication Overhead Minimization
      Note beside the first bullets: avoids performance bottlenecks
  13. SPAR Events
      ● Nodes – Add/Remove
      ● Edges – Add/Remove
      ● Servers – Add/Remove
  14. SPAR Algorithm: Create Edge (1,6)
      [Figure: servers M1, M2 and M3 holding master and replica nodes]
  15. SPAR Algorithm: Create Edge (1,6)
      C1: Create 6 in M1, create 1 in M3
  16. SPAR Algorithm: Create Edge (1,6)
      C2: Move 1 to M3
  17. SPAR Algorithm: Create Edge (1,6)
      C3: Move 6 to M1
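The three configurations on slides 15 to 17 can be compared by counting the replicas each one would require. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy graph, and the tie-breaking rule (prefer no movement) are hypothetical, and the load-balancing constraint that SPAR also applies is left out.

```python
def count_replicas(master, edges):
    """Count the (node, server) replicas needed for data locality.

    Assumption: every node must find a copy of each neighbour on the
    server that hosts its own master.
    """
    replicas = set()
    for u, v in edges:
        if master[u] != master[v]:
            replicas.add((v, master[u]))  # copy of v next to u's master
            replicas.add((u, master[v]))  # copy of u next to v's master
    return len(replicas)


def create_edge(master, edges, u, v):
    """Evaluate the three configurations for a new edge (u, v)."""
    new_edges = edges | {(u, v)}
    candidates = {
        "C1: keep masters, add replicas": dict(master),
        "C2: move u to v's server": {**master, u: master[v]},
        "C3: move v to u's server": {**master, v: master[u]},
    }
    costs = {name: count_replicas(m, new_edges) for name, m in candidates.items()}
    best = min(costs, key=costs.get)  # first minimum wins, so C1 is preferred on ties
    return best, candidates[best], costs


# Hypothetical toy graph (not the one drawn on the slides).
master = {1: "M1", 2: "M1", 3: "M1", 4: "M1", 5: "M2", 6: "M3"}
edges = {(1, 2), (1, 3), (1, 4), (5, 6)}

best, new_master, costs = create_edge(master, edges, 1, 6)
print(best, costs)
```

In this toy graph node 1 has three neighbours on M1, so moving it away is the most expensive option, while moving node 6 next to node 1 needs the fewest replicas.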
  18. JA-BE-JA
      ● Distributed Partitioning Algorithm
      ● K-way Partitioning
      ● Load Balancing
      ● Gossip Learning
  19. JA-BE-JA - Policies
      ● Sampling
        – Local: select neighbors
        – Random: select from random walk
        – Hybrid: Local & Random
      ● Swapping
        – Energy Function: reach minimum
        – Simulated Annealing: escape from local optima
      Source: http://socialnetworking.lovetoknow.com/Growth_of_Online_Social_Networking_in_Business
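The sampling and swapping policies above can be sketched as follows. This is a simplified illustration, not the algorithm as published or as implemented in the project: the exponent `alpha`, the temperature schedule (`t0`, `delta`), and all function names are assumptions. A uniformly random node stands in for the random-walk sample, and the first acceptable partner is taken instead of the best one. Because nodes only exchange colors (partitions), partition sizes never change, which is how load balance is preserved.

```python
import random


def same_color_degree(node, color, colors, adj):
    """Number of neighbours of `node` that currently have `color`."""
    return sum(1 for n in adj[node] if colors[n] == color)


def try_swap(p, q, colors, adj, temperature, alpha=2.0):
    """Swap the colors of p and q if the temperature-scaled new energy wins."""
    cp, cq = colors[p], colors[q]
    if cp == cq:
        return False
    old = (same_color_degree(p, cp, colors, adj) ** alpha
           + same_color_degree(q, cq, colors, adj) ** alpha)
    new = (same_color_degree(p, cq, colors, adj) ** alpha
           + same_color_degree(q, cp, colors, adj) ** alpha)
    # temperature > 1 lets some non-improving swaps through (simulated annealing).
    if new * temperature > old:
        colors[p], colors[q] = cq, cp
        return True
    return False


def edge_cut(colors, adj):
    """Number of edges whose endpoints lie in different partitions."""
    return sum(1 for u in adj for v in adj[u] if u < v and colors[u] != colors[v])


def partition(colors, adj, rounds=50, t0=2.0, delta=0.05, seed=0):
    rng = random.Random(seed)
    nodes = list(adj)
    temperature = t0
    for _ in range(rounds):
        for p in rng.sample(nodes, len(nodes)):
            # Hybrid sampling: direct neighbours first, then one random node.
            for q in list(adj[p]) + [rng.choice(nodes)]:
                if q != p and try_swap(p, q, colors, adj, temperature):
                    break
        temperature = max(1.0, temperature - delta)  # cool down towards 1
    return colors


# Two small stars whose centres (0 and 3) start in the wrong partition.
adj = {0: {1, 2}, 1: {0}, 2: {0}, 3: {4, 5}, 4: {3}, 5: {3}}
colors = {0: "A", 1: "B", 2: "B", 3: "B", 4: "A", 5: "A"}

cut_before = edge_cut(colors, adj)
swapped = try_swap(0, 3, colors, adj, temperature=1.0)
cut_after = edge_cut(colors, adj)
print(cut_before, swapped, cut_after)
```

Calling `partition(colors, adj)` runs the full loop on any adjacency map; the single `try_swap` call shows one exchange in isolation.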
  21. Challenges
      ● SPAR: Global View, Partition Manager requirement → Single Point of Failure
      ● SPAR: Replication Overhead
  22. Our Solution
      ● SPAR: Global View, Partition Manager requirement → Single Point of Failure
      ● SPAR & JA-BE-JA: Local View, Distributed Partition Manager
      ● Replication Overhead
  23. Our Solution (wait for it...)
      [Figure labels: Client Requests, SPAR, Data Store Servers]
  24. Our Solution
      [Figure labels: Client Requests, SPAR & JA-BE-JA, Data Store Servers]
  26. Implementation
      ● SPAR
      ● SPAR-JA
      This is SPARJA!
  27. Datasets
      ● Facebook Graphs by Stanford Network Analysis Project
        – #nodes: 150, #edges: ~3000
        – #nodes: 224, #edges: ~6000
        – #nodes: 786, #edges: ~60000
      Source: http://snap.stanford.edu/
  28. Datasets
      ● Synthesized Graphs
        – using our own Graph Generator
        – #nodes: 1000, #degree: 10
      Graph Visualization Tool: https://gephi.org/
      [Figure panels: Randomized, Clustered, Highly Clustered]
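The slides do not describe how the authors' graph generator works. The sketch below is a hypothetical generator that produces graphs with a tunable amount of clustering, assuming that "#degree: 10" means each node creates 10 edges (which gives the 10000 edges quoted for the synthesized graphs in the experiments). The parameters `n_clusters` and `p_intra` are illustrative only.

```python
import random


def generate_graph(n_nodes=1000, degree=10, n_clusters=10, p_intra=0.5, seed=0):
    """Return (edges, cluster_of) for an undirected synthetic graph.

    Each node creates `degree` new edges. With probability `p_intra` the
    other endpoint is drawn from the node's own cluster, otherwise from
    the whole graph. All parameters are hypothetical.
    """
    rng = random.Random(seed)
    cluster_of = [i % n_clusters for i in range(n_nodes)]
    members = [[i for i in range(n_nodes) if cluster_of[i] == c]
               for c in range(n_clusters)]
    edges = set()
    for u in range(n_nodes):
        added = 0
        while added < degree:
            pool = members[cluster_of[u]] if rng.random() < p_intra else range(n_nodes)
            v = rng.choice(pool)
            edge = (min(u, v), max(u, v))
            if v != u and edge not in edges:  # no self-loops, no duplicates
                edges.add(edge)
                added += 1
    return edges, cluster_of


def intra_fraction(edges, cluster_of):
    """Share of edges whose endpoints belong to the same cluster."""
    return sum(cluster_of[u] == cluster_of[v] for u, v in edges) / len(edges)


# Possible settings for the three graph types (values are assumptions).
for name, p in [("randomized", 0.0), ("clustered", 0.5), ("highly clustered", 0.9)]:
    edges, cluster_of = generate_graph(p_intra=p)
    print(name, len(edges), round(intra_fraction(edges, cluster_of), 2))
```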
  29. Experiments: Replication Overhead on Different Datasets
      #k-replicas: 0 (fault tolerance), #Servers: 4
      ● Synthesized Graphs (10000 edges)
        – synth-r: Randomized
        – synth-c: Clustered
        – synth-hc: Highly Clustered
      ● Facebook Graphs
        – fcbk-1: ~3000 edges
        – fcbk-2: ~6000 edges
        – fcbk-3: ~60000 edges
  30. Experiments: Replication Overhead vs Replication Factor
      [Figure panels: K=0, K=2]
  31. Experiments: Replication Overhead on both algorithms
      Fault Tolerance K=2
      synth-hc: Highly Clustered Synthesized Graph, 10000 edges
  32. Experiments: Replication Overhead on both algorithms
      Fault Tolerance K=2
      fcbk-3: 3rd Facebook graph, 60,000 edges
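The slides do not define how replication overhead is measured. A minimal sketch, assuming it is the average number of replicas per node, where each node needs one replica on every other server that hosts a neighbour's master and at least K replicas in total for fault tolerance. The function name and the toy data are hypothetical, and the sketch assumes more than K servers are available.

```python
def replication_overhead(master, edges, k=0):
    """Average replicas per node under the assumptions stated above."""
    needed = {node: set() for node in master}
    for u, v in edges:
        if master[u] != master[v]:
            needed[u].add(master[v])  # u must be replicated on v's server
            needed[v].add(master[u])  # v must be replicated on u's server
    # Locality replicas count towards the K fault-tolerance replicas.
    total = sum(max(k, len(servers)) for servers in needed.values())
    return total / len(master)


# Hypothetical placement of six nodes on three servers.
master = {1: "M1", 2: "M1", 3: "M1", 4: "M1", 5: "M2", 6: "M3"}
edges = {(1, 2), (1, 3), (1, 4), (5, 6), (1, 6)}

overhead_k0 = replication_overhead(master, edges, k=0)
overhead_k2 = replication_overhead(master, edges, k=2)
print(overhead_k0, overhead_k2)
```

Under this definition the overhead can never fall below K, so a partitioning can only show an advantage on graphs where locality alone would demand more than K replicas per node.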
  33. Conclusions
      ● SPAR + JA-BE-JA = SPAR-JA
        – Highly clustered nodes
        – Achieves fault tolerance by default
        – Better than SPAR on highly clustered graphs
      ● Future Work
        – More datasets
        – Bigger datasets