SPARJA: a Distributed Social Graph Partitioning and Replication Middleware


Course: Implementation of Distributed Systems
KTH - Royal Institute of Technology

Published in: Technology

Stylianou Maria
KTH Royal Institute of Technology
Stockholm, Sweden
mariasty@kth.se

Girdzijauskas Šarūnas
KTH Royal Institute of Technology
Stockholm, Sweden
sarunas@sics.se

ABSTRACT

The rapid growth of Online Social Networks (OSNs) has led to the need for effective and low-cost scalability. Approaches like vertical and horizontal scaling have proven inefficient because of the strong community structure of OSNs. We propose SPARJA, a distributed graph partitioning and replication middleware for scaling OSNs. SPARJA is an extension of SPAR [8] with an improved partitioning algorithm which functions in a distributed manner and eliminates the requirement of a global view. We compare and evaluate SPARJA against a variant of SPAR on synthesized datasets and datasets from Facebook. Our results show that the proposed system is on par with SPAR, and in some cases outperforms it, depending on the nature and clusterization of the graph.

Categories and Subject Descriptors
C.4 [Performance of Systems]: Miscellaneous; D.4.8 [Performance]: Metrics - performance measures

General Terms
Online Social Networks, Scalability, Partitioning, Replication

Keywords
Online Social Networks, scalability, partitioning, replication, SPAR, JA-BE-JA

1. INTRODUCTION

Recently, there has been an abrupt shift of interest from traditional web applications to social applications, and especially to Online Social Networks (OSNs) such as Facebook (http://www.facebook.com) and Twitter (http://www.twitter.com). Both Facebook and Twitter have millions of active users who post, comment and update their status at very high rates. This trend makes OSNs a popular object of analysis and research.

Recent research showed that OSNs produce a non-traditional form of workload [1, 12], mostly because of the different nature of the data. Data are highly personalized and interconnected due to the strong community structure [5–7].
These new characteristics impose new challenges in terms of OSN maintenance and scalability.

To address scalability, two approaches have been followed so far: vertical and horizontal scaling. The former, which implies replacing existing hardware with high-performance servers, tends to be very expensive and is sometimes infeasible because of the very large size of OSNs. The latter proposes partitioning the load among several cheap commodity servers or virtual machines (VMs), the second of which derives from the emergence of cloud computing systems, e.g. Amazon EC2 and Google AppEngine. With this approach, data are partitioned into disjoint components, offering horizontal scaling at low cost. However, problems arise when it comes to OSNs.

In OSNs, users can be members of several social communities [5–7], which makes clean partitioning infeasible. Most operations in OSNs concern a user and her friends, who are her neighbors in the social graph. Thus, if a user belongs to many communities, a clean partitioning is impossible, and queries have to be resolved with high inter-server traffic. One way to eliminate this traffic is to replicate all users' data on multiple or all servers. However, this increases the replication overhead, which makes it harder to keep replicas consistent.

SPAR [8], a social partitioning and replication middleware, addresses the problem of partitioning, and consequently scaling, OSNs. However, it requires a global view of the entire network, so its use for extremely large-scale OSNs may be impractical. In this paper, we present a variant of the SPAR algorithm which uses the partitioning technique proposed in [10]. The new system has an improved, distributed partitioning phase which does not require the global view. We evaluate and compare our heuristic with the initial SPAR algorithm, showing that scalability can be improved with a distributed approach to partitioning.

In the next section, we discuss related work and the background of our research. In Section 3, we present our contribution and describe the system deployed. Section 4 presents the evaluation and the experiments conducted, and Section 5 lists our conclusions.

2. BACKGROUND AND RELATED WORK

Due to the recent emergence of OSNs, scaling and maintaining such networks constitute a new area of research with limited work so far. In this section, we describe approaches followed in the past and relate them to SPAR and our work.

Scaling out web applications is achievable with the use of Cloud providers, like Amazon EC2 and Google AppEngine. Developers can dynamically add or remove computing resources depending on the workload of their applications. This facility requires the applications to be stateless and the data to be independent and easily sharded into clean partitions. OSNs deal with highly interconnected and dependent data, and therefore scaling out in this way is not a viable solution.

Nowadays, Key-Value stores have become the scaling solution for several popular OSNs. Key-Value stores are designed to scale, at the cost of partitioning data randomly across the servers. This random partitioning limits the performance of OSNs. Pujol et al. [8] have shown that SPAR performs better than Key-Value stores. By preserving data locality, SPAR minimizes the inter-server traffic and therefore improves performance.

Another approach for scaling and maintaining applications is the use of Distributed File Systems. Such systems [4, 11] distribute and replicate data to achieve high availability. In the case of OSNs, most queries concern data from several users, which would imply fetching data from multiple servers.
SPAR does not follow the same approach as Distributed File Systems; instead, it replicates data in such a manner that all necessary data can be found locally and more efficiently.

SPAR is the initial work and motivation for our research. It is a partitioning and replication algorithm designed for social applications. SPAR offers transparent scalability [9] by preserving local semantics, i.e. storing all relevant data for a user on one server. Moreover, it aims at minimizing the replication overhead in order to keep the overall performance and system efficiency high. SPAR achieves load balancing by acquiring the global view of the network. However, having access to all data at all times can be very costly and impractical, especially for large-scale systems. Additionally, SPAR has a central partition manager, which introduces a single point of failure. Both drawbacks are addressed in our implementation, which is described in the next section. Furthermore, we tackle the possibility of SPAR's partition manager getting trapped in a local optimum while trying to preserve load balancing. This may result in an increased replication overhead, which we also try to improve in the proposed system.

3. OUR CONTRIBUTION - SPARJA

Our main contribution is the implementation of SPARJA, a variant of SPAR which is based on JA-BE-JA [10]. JA-BE-JA is a distributed graph partitioning algorithm that does not require the global view of the system. SPARJA eliminates the single point of failure of the initial SPAR by replacing its main algorithm with JA-BE-JA. It also aims at minimizing the replication overhead by means of a simple, straightforward technique.

3.1 System Architecture

Figure 1 depicts a three-tier web architecture with SPARJA, based on the architecture of SPAR. The application interacts with SPARJA through the Middleware (MW). When the application requests a read or write operation on a user's data, it calls the MW, which locates the back-end server that contains the data of the specific user.
The MW returns the address of that server to the application, which then opens a data-store interface such as MySQL, Cassandra or others.

Figure 1: SPARJA Architecture

3.2 Description

SPARJA is a dynamic gossip-based local search algorithm. Its goal is to group connected users, i.e. friends, and store them on the same server. In that way, SPARJA aims to reduce the replication overhead as well as the inter-server traffic.

Initially, the system takes a partial graph as input and partitions it into k equal-size components. All components have the same number of users, thus achieving load balancing. Afterwards, each node behaves as an independent processing unit which periodically executes the algorithm based on local information about the graph topology. These periodic executions are essential for repartitioning the graph and minimizing the number of replica nodes. Nodes can work in parallel; nevertheless, SPARJA can also run as a centralized system.

3.3 System Operations

SPARJA is responsible for preserving the scalability and transparency of the application by distributing users, partitioning the network and creating replicas under certain conditions. Below, we describe the operations that SPARJA executes to achieve these goals.

3.3.1 Data Distribution and Partitioning

SPARJA guarantees that users are equally distributed among all servers. When a new user joins the network, a node, called the master node, is created and stored on the server with the minimum number of master nodes. Hence, the data distribution is fair and the load is balanced. In SPAR, users may move from one server to another. In SPARJA, by contrast, users exchange positions: user A can move to the server of user B while user B moves to the server of user A, so that both are co-located with their friends.

3.3.2 Data Replication

Data replication is an important function of SPARJA. By replicating master nodes, two requirements are satisfied: local semantics and fault tolerance. When a new user joins the network, replica nodes are created along with the master node and stored on other servers. The number of these replicas is a configurable value, set before the execution of the algorithm, and serves to preserve fault tolerance. When a new friendship is established, new replicas may be created, if needed, for data locality. SPARJA attempts to keep the number of replicas to a minimum by solving the set-cover problem [2]. In particular, when additional replicas are created for data locality, some of the fault tolerance replicas may be removed. Listing 1 presents the replication algorithm in pseudocode, which is executed periodically by each node. The first part of the algorithm guarantees locality by creating replica nodes of a user, if they do not already exist, on the servers of her friends. The second part guarantees fault tolerance by creating additional replica nodes for a user on servers that do not already hold the user.

    # Part 1: locality. Every user needs a replica on each friend's server.
    # get_server(x) stands for the server holding the master node of x.
    for user in graph:
        for friend in get_friends(user):
            server = get_server(friend)
            if server != get_server(user) and not replica_exists(user, server):
                create_replica(user, server)

    # Part 2: fault tolerance. Top up each user to at least k_replicas replicas.
    for user in graph:
        missing = k_replicas - len(get_replicas(user))
        for server in range(total_servers):
            if missing <= 0:
                break
            if server != get_server(user) and not replica_exists(user, server):
                create_replica(user, server)
                missing -= 1

Listing 1: Algorithm for Data Replication in SPARJA

3.3.3 Sampling and Swapping Policies

Each node runs the SPARJA algorithm as a processing unit. It periodically selects another node and applies the swapping policy, which measures the benefit of swapping its server with the server of the sampled node. The benefit of swapping is measured in terms of energy, as introduced in JA-BE-JA [10]. Each node has an energy and, therefore, the system has a global energy. The energy becomes low when nodes are placed close to their neighbors. SPARJA uses the same energy function to measure the swapping benefit. If the energy decreases for both nodes, which is the desired behavior, the swap is performed; otherwise it is abandoned.

A hybrid node selection policy is followed [10], which consists of two parts. First, the node selects one direct neighbor at random and calculates the benefit. If the energy function does not improve, the node performs a random walk and selects another node from its walk [3].

3.3.4 Simulated Annealing

Local search algorithms tend to get stuck in local optima. SPARJA is vulnerable to the same hazard, which would lead to a higher replication overhead. To address this possibility, we employ the Simulated Annealing technique as described in [13]. Initially, noise, which is analogous to temperature, is introduced to the system, causing it to deviate from the optimum value.
After a number of iterations, the system starts to stabilize and eventually converges to the optimal solution rather than to a local optimum.

4. EVALUATION

For evaluating SPARJA, we have implemented both the SPARJA and SPAR algorithms. Both are implemented in Python, using Cassandra as the data store.

4.1 Metrics

The principal evaluation metric we use is the replication overhead, which consists of the number of replicas created for local semantics and the number of replicas created for fault tolerance.

4.2 Datasets

We use six datasets for evaluating the replication overhead: three synthesized datasets and three Facebook datasets.

Synthesized Datasets

We generated three synthesized datasets with different clusterization levels in order to study the impact of clusterization on the replication overhead. All three datasets contain 1000 nodes each, with node degree equal to 10. The Randomized graph (Synth-R) has no clusterization policy, while the Clustered (Synth-C) and Highly Clustered (Synth-HC) graphs have 75% and 95% clusterization respectively.

Facebook Datasets

The three Facebook datasets have been acquired from the Stanford Large Network Dataset Collection, with approximately 3000, 6000 and 60,000 edges respectively. All details, including nodes, edges and clusterization levels, can be found in Table 1.
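
The replication overhead metric of Section 4.1 can be made concrete with a small sketch. The paper does not give an exact formula, so the following is only one plausible reading, under these assumptions: a user needs one locality replica on every server that hosts a friend's master node, locality replicas also count towards fault tolerance, and only the shortfall up to K is added as fault tolerance replicas. The function name `replication_overhead` and its parameters are hypothetical and not taken from the SPARJA implementation.

```python
from collections import defaultdict


def replication_overhead(edges, placement, num_servers, k=0):
    """Return (locality_replicas, fault_tolerance_replicas) for a placement.

    edges:       iterable of undirected friendships (u, v)
    placement:   dict mapping each user to the server of its master node
    num_servers: total number of servers
    k:           minimum number of replicas per user (fault tolerance factor)

    Illustrative reading of the metric, not the authors' code.
    """
    # For each user, the set of servers (other than its own) that host
    # at least one friend and therefore need a replica of the user.
    replica_servers = defaultdict(set)
    for u, v in edges:
        if placement[u] != placement[v]:
            replica_servers[u].add(placement[v])
            replica_servers[v].add(placement[u])

    locality = sum(len(servers) for servers in replica_servers.values())

    # Assumption: existing locality replicas double as fault tolerance
    # replicas, so only the shortfall up to k is added. A user cannot have
    # more replicas than there are other servers.
    wanted = min(k, num_servers - 1)
    fault_tolerance = 0
    for user in placement:
        have = len(replica_servers.get(user, ()))
        fault_tolerance += max(0, wanted - have)

    return locality, fault_tolerance


# A 4-cycle split across two servers: two edges cross the cut,
# so each of the four users needs one replica on the other server.
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
placement = {0: 0, 1: 0, 2: 1, 3: 1}
print(replication_overhead(edges, placement, num_servers=2, k=0))  # (4, 0)
```

Returning the two counts separately keeps the locality and fault tolerance contributions visible. Whether the reported overhead is their sum or an average per user is not stated in the text and is left open here.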
Dataset      Nodes   Edges    Clusterization
Synth-R      1000    10,000   0%
Synth-C      1000    10,000   75%
Synth-HC     1000    10,000   95%
Facebook-1   150     3386     n/a
Facebook-2   224     6384     n/a
Facebook-3   786     60,050   n/a

Table 1: Description of Datasets

4.3 Environment Preparation

Before conducting any experiments, we set the cooling rate (δ), i.e. the rate of change of temperature, used in the simulated annealing technique. As stated in [13], the number of iterations equals the temperature difference divided by the cooling rate:

$$\text{number of iterations} = \frac{T_o - 1}{\delta}$$

T_o is the Initial Temperature of the network, which declines according to the cooling rate down to a final temperature of 1. Assuming a network with four servers and a fixed Initial Temperature (T_o) equal to 2, we run the algorithm for different numbers of iterations in order to adjust the cooling rate. The datasets used are the synthesized graphs with 0%, 75% and 95% clusterization. Figure 2 shows how the number of non-local nodes changes as the number of iterations increases. From the number of non-local nodes we can deduce how much replication overhead is caused and how well the clusterization is done. As illustrated, the number of non-local nodes decreases as the number of iterations increases, and eventually stabilizes at 200 iterations for the randomized graph and at 300 iterations for both the clustered and highly clustered graphs. The cooling rate of 0.003 listed in Table 2 corresponds to (2 − 1)/0.003 ≈ 333 iterations, which covers the stabilization point of all three graphs.

Figure 2: Number of Iterations vs Number of Non-Local Nodes

In Table 2, we collect all the parameters with fixed values set before running the experiments.

4.4 Experiments

In our experiments we compare SPARJA and SPAR in terms of the replication overhead.
We designed three scenarios to test the impact of different datasets, fault tolerance replication and number of servers.

Parameter                       Value
Initial Temperature (T_o)       2
Final Temperature (T)           1
Cooling Factor (δ)              0.003
Energy Function Parameter (n)   2

Table 2: Parameters for SPARJA

4.4.1 Replication Overhead on Different Datasets

In the first experiment, we study how different datasets and topologies affect the replication overhead of the system. Figures 3 and 4 show the replication overhead of both SPAR and SPARJA on synthesized graphs and Facebook graphs respectively. The number of servers is set to four (S=4) and the replication factor for fault tolerance is set to zero (K=0).

As revealed in Figure 3, higher clusterization leads to lower replication overhead in SPARJA. As expected, SPARJA takes advantage of the existing graph clusterization and continues redistributing nodes based on this divided topology. As a result, SPARJA outperforms SPAR on the clustered graphs while giving worse results than SPAR on the random graph. Similarly, Figure 4 shows how SPARJA and SPAR perform on the Facebook graphs. Again, SPARJA gives better results than SPAR.

Figure 3: Replication Overhead on Synthesized Datasets

Figure 4: Replication Overhead on Facebook Datasets

4.4.2 Replication Overhead vs Replication Factor

Next, we turn our attention to the replication factor and whether it affects the replication overhead. Figure 5 shows the replication overhead of SPARJA on all datasets with the replication factor for fault tolerance set to zero (K=0) and two (K=2). The number of servers is set to four (S=4). As can be seen, the fault tolerance replication factor strongly affects the replication overhead and can decrease it. As expected, fault tolerance replica nodes are also used for preserving data locality.

Figure 5: Replication Overhead for Different Numbers of Fault Tolerance Replicas

4.4.3 Replication Overhead with Different Number of Servers

In our final experiment we measure the replication overhead of both SPARJA and SPAR for different numbers of servers (S=4, 8, 16). The fault tolerance replication factor is set to 2 (K=2) and all datasets are used.

In Figure 6 we plot the results collected from the algorithms, divided into six graphs, one for each dataset. As expected, in all datasets the replication overhead increases with the number of servers.

5. CONCLUSIONS

Online Social Networks have experienced steep growth over the last decade. This popularity has led companies to study the nature of OSNs and offer scalability and maintenance services. However, none of the scalability approaches proposed so far has solved all the scalability issues. The strong community structure of such systems makes Key-Value stores and relational databases inefficient.

We proposed SPARJA, a distributed graph partitioning and replication middleware for scaling OSNs. SPARJA partitions the graph into k balanced components and maintains them without obtaining the global view of the system. It relies on data replication for preserving fault tolerance and locality semantics, while aiming to keep the replication overhead as low as possible.

The evaluation of SPARJA was carried out using synthesized graphs as well as real datasets from Facebook. Our comparisons with SPAR showed that SPARJA offers significant gains in replication overhead, especially when the graph is clustered.
Moreover, with its low replication overhead, it covers both goals of locality semantics and fault tolerance.

We implemented and tested an initial version of SPARJA. We leave the integration of SPARJA in a real system with a three-tier architecture as future work.

6. ACKNOWLEDGEMENTS

We would like to thank our colleague Muhammad Anis uddin Nasir for his valuable contribution and help in the project. We also thank Fatemeh Rahimian for providing sources for the datasets used in the evaluation part of the project.

7. REFERENCES

[1] F. Benevenuto, T. Rodrigues, M. Cha, and V. Almeida. Characterizing user behavior in online social networks. In Proceedings of the 9th ACM SIGCOMM conference on Internet measurement conference, pages 49–62. ACM, 2009.
[2] R. Carr, S. Doddi, G. Konjevod, and M. Marathe. On the red-blue set cover problem. In Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms, pages 345–353. Society for Industrial and Applied Mathematics, 2000.
[3] M. Gjoka, M. Kurant, C. Butts, and A. Markopoulou. Walking in facebook: A case study of unbiased sampling of osns. In INFOCOM, 2010 Proceedings IEEE, pages 1–9. IEEE, 2010.
[4] R. Guy, J. Heidemann, W. Mak, T. Page Jr, G. Popek, D. Rothmeier, et al. Implementation of the ficus replicated file system. In USENIX Conference Proceedings, volume 74, pages 63–71. Citeseer, 1990.
[5] J. Leskovec, K. Lang, A. Dasgupta, and M. Mahoney. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. Internet Mathematics, 6(1):29–123, 2009.
[6] M. Newman. Modularity and community structure in networks. Proceedings of the National Academy of Sciences, 103(23):8577–8582, 2006.
[7] M. Newman and J. Park. Why social networks are different from other types of networks. Physical Review E, 68(3):036122, 2003.
[8] J. Pujol, V. Erramilli, G. Siganos, X. Yang, N. Laoutaris, P. Chhabra, and P. Rodriguez. The little engine(s) that could: scaling online social networks. In ACM SIGCOMM Computer Communication Review, volume 40, pages 375–386. ACM, 2010.
[9] J. Pujol, G. Siganos, V. Erramilli, and P. Rodriguez. Scaling online social networks without pains. In Proc of NETDB. Citeseer, 2009.
[10] F. Rahimian, A. H. Payberah, S. Girdzijauskas, M. Jelasity, and S. Haridi. JA-BE-JA: a distributed algorithm for balanced graph partitioning. Forthcoming.
[11] M. Satyanarayanan, J. Kistler, P. Kumar, M. Okasaki, E. Siegel, and D. Steere. Coda: A highly available file system for a distributed workstation environment. Computers, IEEE Transactions on, 39(4):447–459, 1990.
[12] F. Schneider, A. Feldmann, B. Krishnamurthy, and W. Willinger. Understanding online social network usage from a network perspective. In Proceedings of the 9th ACM SIGCOMM conference on Internet measurement conference, pages 35–48. ACM, 2009.
[13] E. Talbi. Metaheuristics: from design to implementation. 2009.

Figure 6: Replication Overhead with Different Number of Servers