Literature Survey to discuss topographical structure of social networks and information propagation

Sathe, Vaibhav1
Indian Institute of Management Lucknow
IIM Campus, Prabandh Nagar, Off Sitapur Road, Lucknow, Uttar Pradesh – 226013, INDIA
1 email@example.com

I. INTRODUCTION

Facebook's user base, currently 800 million and continuously growing, together with the increasing trend in time spent on the site, has attracted a lot of attention from researchers in various fields. Recently Facebook has been used as a platform for organizing mass protests in countries of the Middle East. Events in India, such as the rise of India Against Corruption and its Facebook following of 500,000 people, have also underscored the rising power of social media. This has resulted in clashes with governments, which are seeking to curtail the power of social networks and their users to spread messages without restrictions. In our research, we want to model this censorship activity. This literature survey is being conducted to support the research by understanding the network concepts required for modelling social networks, primarily the structure of the network and how a message spreads.

We will review some well-cited papers published in top Information Systems journals to identify the various dimensions required for the modelling exercise.

II. PROBLEM DEFINITION

Following are the objectives of this literature review.

(1) Structure of Social Networks: In order to model a social network, we need to determine which model from network science applies to it. Probable options are the small world, random and scale-free network models. It is also noted that different social networks may display different structures due to fundamental differences between them. From the point of view of censorship, we will focus more on social networks like Facebook. Facebook clearly holds the largest interest because its user base, the largest of all, gives it the capability to influence the behaviour of the actors involved in a censorship related study.

(2) Information Propagation Pattern: In order to identify the parameters that model the user interactions which lead to information diffusion on a social network, we need to understand how information spreads on networks and which factors affect it.

III. LITERATURE SEARCH

The literature surveyed for this review is divided into the following sections.

A. Structure of Social Networks

Following articles contribute to the first objective of determining the structure of social networks. Detailed references are included in the references section.

Sr. | Article/Paper | Journal/Publisher
1 | Measurement and Analysis of Online Social Networks | ACM
2 | Linking via Social Similarity: The Emergence of Community Structure in Scale-free Network | IEEE
3 | A fast algorithm for simulating scale-free networks | ICCTA (IEEE)
4 | Social Search in "Small-World" Experiments | World Wide Web Consortium
5 | Reciprocity in evolving social networks | Journal of Evolutionary Economics

B. Information Propagation

Following articles contribute to the second objective of determining patterns in information spread. Detailed references are included in the references section.

Sr. | Article/Paper | Journal/Publisher
1 | Network Effects and Personal Influences: The Diffusion of an Online Social Network | Journal of Marketing Research
2 | Forward or delete: What drives peer-to-peer message propagation across social networks? | Journal of Consumer Behaviour
3 | User Interactions in Social Networks and their Implications | EuroSys'09, ACM
4 | Online organization of offline Protest: From Social to Traditional Media and Back | HICSS 2011
5 | Information propagation analysis in a social network site | IEEE
6 | Detecting and Characterizing Social Spam Campaigns | IMC'10, ACM

IV. TERMINOLOGIES

Let's look in detail at some terminology required to understand the concepts discussed in this review.

Power Law: When the frequency of an event varies inversely with a power of its quantifiable size, the relationship is said to follow a power law. One characteristic of such a distribution is a large difference between mean and median.

Types of networks:

A. Random Networks

Random networks are unstructured networks with low clustering. They do not occur in nature. They are theoretically
studied to provide a baseline for the study of more structured networks such as small world and scale-free networks.

B. Small World Network

Small world networks are networks which have a small average path length, due to a large number of interconnections, and a high clustering coefficient.

C. Scale-Free Network

Scale-free networks are those whose degree distribution follows a power law, i.e. the network consists of a small number of highly connected users and a large number of less connected users.

Terms related to networks:

(1) Network Diameter: The maximum internode distance is called the diameter of the network.
(2) Indegree: Number of inward connections for a given user.
(3) Outdegree: Number of outward connections for a given user. This is a valid measure when networks are directed graphs. Networks like Facebook and Orkut are symmetrical networks, i.e. for any user, indegree and outdegree are equal.
(4) Assortativity: A measure of the likelihood that a node in the network establishes a link with another node which is similar to it on some parameter.

Information shared on Facebook:

The information that is created and shared on Facebook comes from various sources. These are as follows:

(1) Status Messages: Users can share a text message as their status message. This is visible to other users (friends or others) on the user's wall. The message also appears in the news feed of other users who are friends and/or subscribed to the user's updates.
(2) Hyperlink: A hyperlink to some other location on the Internet, typically news of interest, is another source of shared information. Friends can like, share and comment on such links.
(3) Photo: Photographs, typically taken by the user, are frequently shared, liked and commented on.
(4) Community/Group: Facebook has different groups dedicated to various topics. A message posted by or on the community is typically shared by a user so that his subscribers, who may not have access to the community, can view it.
(5) Person: Famous people like Bill Gates have their own personal pages, which are not like groups. These are used for sending personal images and links to thousands of subscribers, in a similar way to how these personalities use Twitter today. This is one-way communication.
(6) Event Invitations: Users can create events and invite people. Users can also forward event invites.

V. DATA EVALUATION

This section is split into the sections below.

A. Social Networks

Before starting, let's look at what social networks mean and how online social networks are different.

The social network concept applies to naturally formed networks like communities, family ties and relationships. For example, in a town, people know each other in one residential area. They also know some more people at the workplace. There is also a tendency for them to want to know more people and to try to gain access to a larger set of contacts through a person they think is well connected. The information exchange may be intentional or unintentional. The study of social networks focusses on critical issues like disease spread, news spread, riots, fads and social awareness.

Online social networks demonstrate similar characteristics, with the exception that users are not in physical contact with each other. Examples of online social networks include Facebook, Twitter, Flickr, YouTube or any other site which facilitates interaction between users. This interaction can be one-to-one (Google Talk), one-to-many (Facebook) or many-to-many (forums), depending on the nature of the site.

B. Structure of Social Networks

What graph structure social networks follow has been a very interesting topic for researchers, as it is a fundamental step in any modelling or simulation of the network.

Mislove et al. in their paper on measurement and analysis of online social networks try to identify various characteristics of social networks. In the experiment they collected data from over 11.3 million users of Orkut, YouTube, Flickr and LiveJournal. When network analysis was done on each network, the networks were found to follow a power law. In addition, the authors identified that these social networks display scale-free and small world properties. All of the networks show high clustering. The authors have identified an interesting parameter: whether consent is required from the second party for the first party to establish a connection. One example is Twitter, where anyone can follow you and you need not follow him. On Facebook, on the other hand, if somebody wants to be friends with you then he needs to send a request, and only when you approve do you both become friends. Twitter is an example of an asymmetric network, which has different indegree and outdegree for each user. Facebook is an example of a symmetric network, where each user has identical indegree and outdegree. The characteristics of the network vary with these parameters. Symmetric networks have more connections among users and hence form stronger clusters, thereby reducing network diameter. Hence, they display the characteristics of a small world network. Among the examples taken for analysis by the authors, we need to focus more on Orkut as it is most closely related to Facebook.

To understand the limitations, we need to note the complex structure of Facebook. Although friendship is one of the prime ways in which Facebook disseminates information, we need to consider other ways, like groups and pages, to which a user subscribes, thereby creating a directed or asymmetric relationship. Nowadays, Facebook also allows users to subscribe to status updates from other users without requiring explicit consent. As a result, Facebook has become a hybrid network with different types of nodes. With regard to cluster formation, the authors state that online social networks score higher on assortativity on the parameter of degree: users of high degree establish relations with other users of high degree, while users of low degree establish relations with other users of low degree. This appears to conflict with scale-free properties, where low degree users tend to attach to high degree users so as to form a hub and spoke model.

Social networks are examples of very large scale networks and they are not random. Measured against the random graph model studied by Erdos and Renyi, networks like social networks evolve with
particular patterns and have a definite structure; they are not random.

Wei Ren and Jianping Li's paper proposes the RX algorithm to simulate scale-free networks, which they claim performs better than the popular Barabasi-Albert (BA) algorithm. The authors state that as the number of nodes increases, the time required by RX is much less than that taken by BA. They conclude that networks that expand continuously exhibit the characteristics of scale-free networks. Since social networks are both very large and continuously expanding, scale-free characteristics apply. The same is true of an online social network like Facebook, which currently has 800 million users and is growing very rapidly both in total users and in average number of friends.

Yixiao Li et al. in their paper make the important observation that the social network model exhibits community structure. The paper establishes its clustering method on the principle "Birds of a feather flock together", stating that users having something in common tend to form clusters or groups with a lot of interconnections among them. This does not agree with the statement in the paper of Mislove, that users with high degree tend to connect to other users with high degree and vice versa. The paper further establishes that communities develop into scale-free networks when they keep expanding.

There is one more factor discussed in the literature: the user's intention. The paper by Goel et al., taking a physical social network standpoint, distinguishes the topological connection from the algorithmic connection (the intention to connect), using the example of the spread of diseases in a social network. The paper thus distinguishes network structure based on the intention of the user. The next paper discussed below extends this concept by looking at what happens when such intentions evolve, making the network very dynamic.

The paper by Jun and Sethi discusses how social network structure develops in a dynamic and continuously evolving environment. Changes in the network arise as random rewiring. Also, to a certain extent, some old links are severed over a period of time. In physical as well as online social networks this is due to changes in one's lifestyle in terms of location, community memberships and so on. Changes may also happen in the intention factor, which is taken as conditional cooperation. Over a period of time, a user's reasons to connect can evolve, e.g. looking for a relationship, friendship or professional networking. Another important observation by the authors concerns the increasing degree of the network. With increasing degree, clustering increases, as the neighbours of one node are likely to be neighbours of each other. This is the same phenomenon that a social network like Facebook follows. Hence, the diameter of the network reduces. The paper identifies future research scope in the influence of the behaviour of non-neighbours on a given user. This is also a valid scenario considering the features of Facebook. User A may receive updates from the interaction of a particular friend B with his friend C, who is not a friend of user A. We will discuss this propagation in the next section.

C. Information Propagation

Harvey et al. in their paper on viral marketing on the Internet researched how users forward or delete a particular message on a social network like YouTube. From our research point of view, observations on this forwarding behaviour are important as they also apply to user behaviour on a social network like Facebook. The authors have identified that the likelihood of a video being forwarded is closely correlated with sender involvement, sender tie strength and the amount of online communication across ties. We explain these factors in short. Sender involvement, as explained by Norman, is the relation of the subject to the person's needs. Sender tie strength means how close the user is to the sender of the message. The third factor is the amount of communication that the sender has with the person to whom he would probably forward the message. The authors reject the idea that knowledge of how to forward a given message has any correlation with forwarding.

Skoric et al. in their paper discuss the parameter of trust, which is similar to the ties with the sender discussed in the previous paper. The authors say that in general, users trust their friends over any other person, such as a political leader or advertiser. This means that when a friend forwards or shares some message, they consider it a serious message. This improves the likelihood that they forward such a message. This research also identifies that groups, events and status messages are the tools on Facebook by which users can reach their immediate and extended friends in a fast, easily accessible and cost effective way. One important contribution of this paper is the identification that the spread of such messages will be limited to individuals who are mostly similar and fall in one category of politically engaged and socially active people. This is typically because such messages spread only through friendship networks, which are formed with different intentions than spreading such a message. Friends generally have a similar thought process and hence are similar on the above parameters.

Katona et al. bring out some critical points based on the sender's influence in their paper. First, they discuss that as the number of contacts of the recipient increases, the influencing effect that a particular individual has on him gets diluted accordingly. The second factor is that of brokers. We have already seen that social networks demonstrate the characteristics of scale-free and small world networks. This means that among different clusters of users there are a few users who are common to them and who form prominent nodes linking the clusters. As shown empirically, since they control a large amount of information, they have higher influential power.

Another very interesting observation is made by Wilson et al. in their paper. The authors say that links or connections on a social network like Facebook are not indicators of interaction among users. This is primarily due to the time constraints that users face. So, not all friendships are equally meaningful. The authors have therefore come up with the new concept of the interaction graph as a more valid indicator for mapping social connectivity than Facebook updates. An interesting observation they have made is that such an interaction graph does not exhibit small world characteristics. Therefore, the authors believe more in the scale-free network pattern when it comes to the interactions that happen between users.

In the paper by Magnani et al., the authors have identified some important dimensions of discussion. The average lifetime of a post or message is the time for which it is available on the news feeds of users. It varies inversely with the number of friends the user has and their frequency of activity on Facebook. Overall, the authors have found that the lifetime of a post also follows a power law. Based on their empirical analysis it was found that 50% of entries survive for around one hour, 85% survive for a day and so on. The authors have also identified a specific time trend in content generation. Since users in a given cluster have some
parameters in common, any temporal factor affecting those parameters will also affect the activity of all those users simultaneously.

One important issue that needs attention is the increasing quantity of spam. The paper by Gao et al. looks at quantifying and characterizing online spam campaigns launched from online social network accounts. An important observation from this empirical study of 3.5 million Facebook users is that over 97% of the accounts involved are compromised accounts and only the rest are fake accounts. Another observation is that spamming activity is generally higher in the early morning hours of the users' local time.

VI. ANALYSIS AND INTERPRETATION

A. Network Structure

Based on the review of articles in the section on network structure above, we find that Mislove's article develops many of the concepts required for understanding how this structure develops. With the help of the community example from Yixiao Li et al., we can get an idea of how social networks evolve. This helps in understanding why social networks display the characteristics of both small world networks and scale-free networks.

Initially a group of individuals with something in common, like belonging to the same school, come together on a network like Facebook. They add each other as links, thereby establishing a community structure. This is also a cluster of users tightly coupled with each other. It behaves like a small world network due to its shorter diameter. As time progresses, an individual from such a cluster may get exposed to a different group or set of users. This particular user now becomes the connection between the two clusters. That way, this individual will have a much higher degree than his earlier cluster peers. This develops into a hub and spoke model and thereby into a scale-free network. Such networks follow a power law, as there are fewer users connected across clusters, who hence have a higher degree, than users connected only within a cluster, who therefore have a lower degree.

Another parameter that impacts the expansion of social networks is how users can search for other users in order to connect to them. Networks like LinkedIn allow users to search only within certain levels of neighbourhood. This limits the capability of less connected users to connect to a large number of users. It further provides an incentive for a user to connect to another user who is highly connected. This simple behaviour contradicts the concept given in the paper of Mislove, that users of similar degree are more likely to connect to each other.

The scenario of linking unintentionally is not applicable to an online social network like Facebook, as there is no reason to believe that two users are connected to each other unless they have some intention to be so. At least one user will have some reason to connect to the other; the second user may approve the request unknowingly. Additionally it should be noted that the intentions of different users connecting to each other may be different. This means that user A may intend to connect to user B for reason X, while user B wants to connect to user A for reason Y, and they can still establish a connection as long as both users agree. But if there is no reason Y for B to connect to A then the link will not be established. However, we could not locate any literature modelling the network taking into account heterogeneous intentions.

B. Information Propagation

As the literature explains, several factors define the pattern of propagation of information. However, we need to alter some conditions when we apply these to our research purpose of understanding how a message spreads over a social network like Facebook, fundamentally because the characteristics of Facebook differ in several ways from those of the social networks considered for empirical research in the literature reviewed.

As against the preferential forwarding discussed in the paper by Harvey et al., on Facebook the user would forward, i.e. share, a message that he likes with all of his friends and those who are subscribed to his updates. Only rarely would he share such a message with a particular Facebook user. However, we need to note that he can preferentially tie in some users, based on the relevance he sees, while sharing the message with the larger audience. The ways to do this are tagging a person or posting the link or image on the wall of the intended user.

We also agree with Harvey's finding that the user's knowledge has little to do with forwarding likelihood. Looking at this observation from Facebook's point of view, we cannot logically think of any reason to believe that a Facebook user would not know how to share the message that he or she is reading, if he or she wants to do that at all.

As we have seen in the structure of social networks, users of a similar nature come together and form clusters. This creates strong bonds between similar people and weaker bonds between dissimilar people. Moreover, we saw that while friendship networks are formed based on consent, the user gives such consent based on different criteria than spreading a particular message. This effectively reduces the velocity of message spread, as the message does not reach dissimilar users with equal intensity.

Wilson et al. have found that small world clustering does not exist, due to the low degree of connection, in their interaction graph, which is different from the friendship link graph. This is because users interact on a regular basis with only a small portion of their friends. As the degree per user from the interaction point of view decreases, the clustering index reduces, and thereby the network becomes more scale-free and less small world.

As described by Katona et al., dilution of influence occurs as the number of contacts increases. This is very logical. As the number of friends on Facebook increases, the frequency of updates in the news feed also increases proportionately. As pointed out by Wilson, every user has limited time on Facebook. Hence, the likelihood that a particular update will be visible in the portion of the news feed that the user scrolls through at a time reduces with an increasing number of contacts. This weakens the influence level and hence the interaction that we are looking for.

The paper written by Magnani et al. discusses the lifetime of a post during which it is active and accessible to friends. Overall it indicates a short lifespan for the message. We also need to note that as clustering increases in Facebook with more and more user activity and more friends, the average lifespan of a particular message would decrease further. This further underlines the point in Wilson's paper that constrained time makes interaction networks, which are scale-free in nature, more important for modelling than connection networks.

Regarding the spread of spam content, the important factor from our study's point of view is that compromised accounts contribute 97% of spam and fake accounts only 3%. This further highlights that users trust their friends. A message
coming from an unknown user is identified as spam more easily than one coming from a friend with whom the user has closer ties. Regarding the timing of spam generation, we do not find any relevance to our study on the spread of information.

But the time of content generation has a critical role to play when it comes to the lifetime for which the message remains active in the news feed of the user. Given the clustering of users, there is significant evidence that most friends are geographically collocated. Hence, if a message is created or shared at peak time for the local user, there will be higher activity in the entire cluster. This further reduces the lifetime of the message in the news feed, but simultaneously increases the likelihood that the user sees the message, because he or she is actively viewing the news feed.

Another important point is that not all content that is frequently shared is genuine. Unfortunately we could not find any conclusive literature on user behaviour where users forward or share spam or incorrect information knowingly, simply for amusement. This typically includes some random so-called "confidential" information about some political leader, or forged images. If users share this information unknowingly, then this behaviour can be considered under trusting the ties, which we just discussed. But many a time the user is completely aware of its fraudulent nature. Still, either for amusement or out of political or ideological conflict with the person or event in question, they find it appealing to share such material. We could not, however, find any empirical research on this behaviour. It should also be noted that users who are aware of spam do not indulge in such activity if they think it may be harmful to them. But when it comes to purely static spam content, which they are sure will not compromise their profiles, they have no objection to sharing or commenting on it. If we look at censorship proposals from governments, we may find that they are largely interested in controlling such content.

VII. LIMITATIONS

Facebook is continuously updating its features. The literature suggests that new features have a significant impact on user behaviour. The newly introduced timeline feature, which allows users to view past important interactions with ease, has great significance for user interactivity. But we could not locate any literature discussing the impact of timeline. Also, we could not find literature conclusively quantifying Facebook events and their impact on social events. Nor did we locate any literature which can explain user bias in sharing fake information knowingly. We understand that the social networking phenomenon is relatively new and hence there is not enough research done on every aspect of the impact of social networks on our real-time interactions.

VIII. CONCLUSION

In this literature survey, we have identified factors that need to be accounted for while modelling information spread on social networks. For simplicity, we have avoided going into the mathematical details supporting the conclusions derived. We have linked the various papers available on this topic to arrive at the following conclusions.

On the network structure side, we conclude that a social network, from the friendship perspective, demonstrates the characteristics of both scale-free and small world networks. But since interactions between users, which are time constrained, display only scale-free characteristics, we need to model the social network as a scale-free network for our research.

We conclude that the following factors, which impact the likelihood and velocity of message spread, should be taken into account by our model.

(1) The number of friends of a user is inversely proportional to the amount of influence a friend has on the user.
(2) The number of friends of a user is inversely proportional to the lifetime for which a message remains active in the user's news feed.
(3) The amount of time a user spends on average on Facebook is directly proportional to the likelihood of spreading a message.
(4) The strength of the bond with the sender is directly proportional to the likelihood of spreading the message further.
(5) The more clustering there is in the user's network, the lower the velocity of message spread, primarily because, with duplication of messages, the message remains confined to the same cluster.
(6) A message shared at peak time will have a shorter lifetime on the news feed but a higher likelihood of being replicated, due to high activity in the entire cluster.
(7) If users perceive a particular message as not harmful to them, then there is a higher likelihood that it will be spread or shared, irrespective of the user's analysis of the message's authenticity. This is typical of sharing such messages for amusement or out of political conflict.

REFERENCES

[1] Katona Z., Zubcsek P., Sarvary M., Network Effects and Personal Influences: The Diffusion of an Online Social Network, Journal of Marketing Research, Vol. XLVIII (June 2011), 425-443, American Marketing Association.
[2] Mislove A., Marcon M., Gummadi K., Druschel P., Bhattacharjee B., Measurement and Analysis of Online Social Networks, proceedings of IMC'07, ACM.
[3] Yixiao Li, Xiaogang Jin, Fansheng Kong and Jiming Li, Linking via Social Similarity: The Emergence of Community Structure in Scale-free Network, IEEE symposium on digital object identifier, 2009.
[4] Wei Ren, Jianping Li, A fast algorithm for simulating scale-free networks, proceedings of ICCTA2009.
[5] Ted G. Lewis, Network Science: Theory and Practice, John Wiley & Sons, Inc. 2009.
[6] P. Erdos, A. Renyi, On the evolution of random graphs, Publ. Math. Inst. Hung. Acad. Sci., vol. 5, pp. 17-60, 1959.
[7] Goel S., Muhamad R., Watts D., Social Search in "Small-World" Experiments, proc. WWW 2009, ACM.
[8] Jun T., Sethi R., Reciprocity in evolving social networks, Journal of Evolutionary Economics, June 2009.
[9] Harvey C., Stewart D., Ewing M., Forward or delete: What drives peer-to-peer message propagation across social networks?, Journal of Consumer Behavior, Vol. 10, 2011, Published by Wiley.
[10] Norman AT, Russell CA. 2006. The Pass-Along Effect: Investigating Word-of-Mouth Effects on Online Survey Procedures. Journal of Computer-Mediated Communication 11(4): 1085–1103.
[11] Wilson C., Boe B., Sala A., Puttaswamy P., Zhao B., User Interactions in Social Networks and their Implications, Proceedings of EuroSys 2009, ACM.
[12] Skoric M., Poor N., Liao Y., Wei S., Online Organization of an Offline Protest: From Social to Traditional Media and Back, proceedings of HICSS 2011, retrieved from IEEE.
[13] Magnani M., Montesi D., Rossi L., Information propagation analysis in a social network site, proceedings of International Conference on Advances in Social Networks Analysis and Mining, 2010, IEEE.
[14] Gao H., Hu J., Wilson C., Li Z., Chen Y., Zhao B., Detecting and Characterizing Social Spam Campaigns, proceedings of IMC'10, ACM.
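
Illustrative sketch. The terminology section describes a scale-free network as one with a few highly connected users and many less connected ones, and notes that a power law shows a large gap between mean and median. The following is a minimal sketch of that idea, assuming a simplified degree-proportional (Barabasi-Albert style) growth rule. It is not the RX algorithm of Wei Ren and Jianping Li, and the function names, the network size of 2000 nodes and the choice of 2 links per new node are hypothetical values chosen only for illustration.

```python
import random
from statistics import mean, median


def preferential_attachment(n_nodes, m_links, seed=0):
    """Grow an undirected graph in which each new node links to m_links
    existing nodes chosen with probability proportional to their degree.
    Simplified Barabasi-Albert style rule (illustrative assumption)."""
    rng = random.Random(seed)
    core = m_links + 1
    # Start from a small, fully connected core of m_links + 1 nodes.
    edges = [(i, j) for i in range(core) for j in range(i + 1, core)]
    # Each node appears in this list once per link it has, so a uniform
    # draw from the list is a degree-proportional draw over nodes.
    endpoints = [node for edge in edges for node in edge]
    for new in range(core, n_nodes):
        targets = set()
        while len(targets) < m_links:
            targets.add(rng.choice(endpoints))
        for target in targets:
            edges.append((new, target))
            endpoints.extend((new, target))
    return edges


def degree_sequence(edges, n_nodes):
    """Count the number of links attached to every node."""
    degree = [0] * n_nodes
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


edges = preferential_attachment(n_nodes=2000, m_links=2, seed=42)
degrees = degree_sequence(edges, 2000)
print("mean degree:  ", mean(degrees))
print("median degree:", median(degrees))
print("max degree:   ", max(degrees))
```

In such a run the few early, well-connected nodes act as hubs, so the maximum degree sits far above the median and the mean is pulled above the median, which is the hub and spoke pattern discussed in the analysis of network structure. A symmetric, friendship-style network is assumed here, so indegree and outdegree are not tracked separately.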