INTERNATIONALComputer EngineeringCOMPUTER ENGINEERING  International Journal of JOURNAL OF and Technology (IJCET), ISSN 09...
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367(Print), ISSN 0976 – 6375(Online) Volu...
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367(Print), ISSN 0976 – 6375(Online) Volu...
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367(Print), ISSN 0976 – 6375(Online) Volu...
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367(Print), ISSN 0976 – 6375(Online) Volu...
Upcoming SlideShare
Loading in …5
×

Multimode network based efficient and scalable learning of collective behavior

382 views

Published on

0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
382
On SlideShare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
9
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

Multimode network based efficient and scalable learning of collective behavior

  1. 1. INTERNATIONALComputer EngineeringCOMPUTER ENGINEERING International Journal of JOURNAL OF and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME & TECHNOLOGY (IJCET)ISSN 0976 – 6367(Print)ISSN 0976 – 6375(Online)Volume 4, Issue 1, January- February (2013), pp. 313-317 IJCET© IAEME: www.iaeme.com/ijcet.aspJournal Impact Factor (2012): 3.9580 (Calculated by GISI) ©IAEMEwww.jifactor.com MULTIMODE NETWORK BASED EFFICIENT AND SCALABLE LEARNING OF COLLECTIVE BEHAVIOR: A SURVEY 1 2 Vibha B. Lahane , Prof. Santoshkumar V. Chobe 1 Department of Computer Engineering,Pad.Dr. DYPIET Pimpri, Pune 2 Department of Computer Engineering, Pad.Dr. DYPIET Pimpri, Pune ABSTRACT Study of collective behavior is to understand how individuals behave in a social networking environment. Oceans of data generated by social media like Facebook, Twitter, Flickr, and YouTube present opportunities and challenges to study collective behavior on a large scale. To learn to predict collective behavior in social media. In particular, given information about some individuals, how to infer the behavior of unobserved individuals in the same network? A social-dimension-based approach has been shown effective in addressing the heterogeneity of connections presented in social media. However, the networks in social media are normally of colossal size, involving hundreds of thousands of actors. The scale of these networks entails scalable learning of models for collective behavior prediction. To address the scalability issue, an edge-centric clustering scheme to extract sparse social dimensions. With sparse social dimensions, the proposed approach can efficiently handle networks of millions of actors while demonstrating a comparable prediction performance to other non-scalable methods. I. INTRODUCTION a) Collective Behavior Collective behavior refers to the behaviors of individuals in a social networking environment, but it is not simply the aggregation of individual behaviors. In a connected environment, individuals’ behaviors tend to be interdependent, influenced by the behavior of friends. This naturally leads to behavior correlation between connected users. Take marketing as an example: if our friends buy something, there is a better-than-average chance that we will buy it, too. 313
  2. 2. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEMEb) Need for Acquirements of Collective Behavior The advancement in computing and communication technologies enables peopleto get together and share information in innovative ways. Social media, in forms of Web2.0 and popular social networking sites like Facebook, Flickr, YouTube, Digg, Blog, etc.,is reshaping various fields including online business, marketing, epidemics and intelligentanalysis. Concomitant with the opportunities indicated by the rocketing online trafficin social media are the challenges for user/customer profiling, accurate user search,matching, recommendation as well as effective advertising and marketing. Takeblogsphere as an example. Bloggers can upload tags for their own blog sites. The tags of ablog site provide the description of the blogger, facilitating blog search, retrieval andother tasks. Unfortunately, not all the bloggers provide tags; even if some do, they mayjust choose some for convenience. Thus, it becomes a challenge to infer the likely tags ofthose bloggers with partial information. Another problem is social networkingadvertising. Currently, advertising in social media has encountered many challenges. A recent study from the research firm IDC suggested that “just 57% of all users ofsocial networks clicked on an ad in the last year, and only 11% of those clicks lead to apurchase”. Note that some social networking sites can only collect very limited userprofile information, either due to the privacy issue or because the user declines to sharethe true information. On the contrary, the friendship network is normally available. If onecan leverage a small portion of user information and the network data wisely, thesituation might improve significantly Social networking sites (a recent phenomenon)empower people of different ages and backgrounds with new forms of collaboration,communication, and collective intelligence. Prodigious numbers of online volunteerscollaboratively write encyclopedia articles of unprecedented scope and scale; onlinemarketplaces recommend products by investigating user shopping behavior andinteractions; and political movements also exploit new forms of engagement andcollective action. In the same process, social media provides ample opportunities to studyhuman interactions and collective behavior on an unprecedented scale.II. LITERATURE SURVEY L.Tang and H.Liu [1] stated that collective behavior refers to how individualsbehave when they are exposed in a social network environment. In this paper, theyexamined how they could predict online behaviors of users in a network, given thebehavior information of some actors in the network. Many social media tasks can beconnected to the problem of collective behavior prediction. Since connections in a socialnetwork represent various kinds of relations, a social-learning framework based on socialdimensions is introduced. This framework suggests extracting social dimensions thatrepresent the latent affiliations associated with actors, and then applying supervisedlearning to determine which dimensions are informative for behavior prediction. Itdemonstrates many advantages, especially suitable for large-scale networks, paving theway for the study of collective behavior in many real-world applications. Collectivebehavior is not simply the aggregation of individuals behavior. In a connectedenvironment, behaviors of individuals tend to be interdependent. That is, ones behaviorcan be influenced by the behavior of his/her friends. This naturally leads to behaviorcorrelation between connected users. Such collective behavior correlation can also be 314
  3. 3. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEMEexplained by homophily. M.McPherson, L.Smith-Lovin, and J.M.Cook [7] discussed that, thepeople who are interacting with each other share certain similarities between them. Theauthor also described that this correlated behavior information also used for prediction ofonline behaviors in a network. M.E.J. Newman, A.L. Barab´asi and D.J.Watts [17] proposed a concept called collectiveinference. It assumes that the behavior of one actor is dependent upon that of his friends. Tomake prediction, collective inference is required to find an equilibrium status such that theinconsistency between connected actors is minimized. This is normally done by iterativelyupdating the possiblebehavior output of one actor while fixing the behavior output (or attributes) of his connectedfriends in the network. It has been shown that considering this network connectivity forbehavior prediction outperforms those that do not. However,connections in social media areoften not homogeneous. The heterogeneity presented in network connectivities can hinder thesuccess of collective inference. P.Singla and M.Richardson [4] applied data mining techniques to study this relationship for apopulation of over 10 million people, by turning to online sources of data. The analysisreveals that people who chat with each other (using instant messaging) are more likely toshare interests (their Web searches are the same or topically similar). The more time theyspend talking, the stronger this relationship is. People who chat with each other are also morelikely to share other personal characteristics, such as their age and location (and, they arelikely to be of opposite gender). Similar findings hold for people who do not necessarily talkto each other but do have a friend in common. Their analysis is based on a well-definedmathematical formulation of the problem, and is the largest such study they were aware of.M.E.J.Newman [3] considered the problem of detecting communities or modules innetworks, groups of vertices with a higher-than-average density of edge connecting them.Previous work indicates that a robust approach to this problem is the maximization of thebenefit function known as “modularity” over possible divisions of a network. Here the authorshowed that this maximization process can be written in terms of the eigen spectrum of amatrix they called the modularity matrix, which plays a role in community detection similarto that played by the graph Laplacian in graph partitioning calculations. This result leads us toa number of possible algorithms for detecting community structure, as well as several otherresults, including a spectral measure of bipartite structure in networks and a new centralitymeasure that identifies those vertices that occupy central positions within the communities towhich they belong. The algorithms and measures proposed are illustrated with applications toa variety of real-world complexnetworks. H. W. Lauw, J. C. Shafer, R. Agrawal, andA. Ntoulas [11] study the phenomenon of homophily in the digital world. Unlike the physicalworld, the digital world doesn’t impose any geographic or organizational constraints onfriendships. Online friends might share common interests, there’s no reason to believe thattwo users with common interests are more likely to be friends. A common assumption abouthuman nature is that people have a tendency to associate with other, similar people (aphenomenon called homophily). Sociology has studied homophily in the physical worldextensively. However, the studies have generally been conducted on a small scale, and thesimilarity factors examined have been limited mostly to easily observed or surveyed sociodemographic characteristics, such as race, gender, religion, and occupation — characteristicsthat don’t necessarily manifest themselves in online social networks. One of the strongestunderlying sources of homophily in the physical world is locality due to geographic 315
  4. 4. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEMEproximity, family ties, and organizational factors, such as school and work. However, in thedigital world, physical locality becomes less important, and other factors such as commoninterests might play a greater role.S.A.Macskassy and F.Provost [9] presented that the classified entities that are interlinkedwith entities for which the class is known and also they use a modular toolkit called NetKitfor classification in networked data. NetKit is based on a node-centric framework in whichclassifiers comprise a local classifier, a relational classifier, and a collective inferenceprocedure. Various existing node-centric relational learning algorithms can be instantiatedwith appropriate choices for these components, and new combinations of components realizenew algorithms. This study focuses on univariate networkclassification, for which the only information used is the structure of class linkage in thenetwork (i.e., only links and some class labels).It also shows that there are two sets oftechniques that are preferable in different situations, namely when few versus many labels areknown initially. They also demonstrated that link selection plays an important role similar totraditional feature selection.Lei Tang, Huan Liu,Jianping Zhang, Zohreh Nazeri[22]stated that a multi-mode networktypically consists of multiple heterogeneous social actors among which various types ofinteractions could occur. Identifying communities in a multi-mode network can helpunderstand the structural properties of the network, address the data shortage and unbalancedproblems ,and assist tasks like targeted marketing and finding influential actors within orbetween groups. In general, a network and the membership of groups often evolve gradually.In a dynamic multi-mode network, both actor membership and interactions can evolve, whichposes a challenging problem of identifying community evolution. In this work, they try toaddress this issue by employing the temporal information to analyze a multi-mode network.A spectral framework and its scalability issue are carefully studied. Experiments on bothsynthetic data and real-world large scale networks demonstrate the efficacy of algorithm andsuggest its generality in solving problems with complex relationships.III. CONCLUSION This paper has surveyed different schemes that are used for collective behaviorprediction. The networks in social media are normally of big in size, involving hundreds ofthousands of actors. The scale of these networks entails scalable learning of models forcollective behavior prediction. To address the scalability issue, an edge-centric clusteringscheme is proposed. The proposed approach can efficiently handle networks of millions ofactors while demonstrating a comparable prediction performance to other non-scalablemethods like Modularity maximization are dense so memory requirement is high andmemory requirements hinder both extractions of social dimensions as well as subsequentdiscriminative learning. With edge-centric view, shows that the extracted social dimensionsare guaranteed to be sparse.REFERENCES[1] L. Tang and H. Liu, “Toward predicting collective behavior via social dimensionextraction,” IEEE Intelligent Systems, vol. 25, pp. 19–25, 2010.[2] N.J. Smelser and N. Joseph, Theory of collective behavior. London (UK): Routledge &Kegan Paul, 1962 316
  5. 5. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME[3] M.Newman, “Finding community structure in networks using the eigenvectors of matrices,”Physical Review E (Statistical, Nonlinear, and Soft Matter Physics), vol. 74, no. 3, 2006. [Online].Available: http://dx.doi.org/10.1103/PhysRevE.74.036104[4] P.Singla and M. Richardson, “Yes, there is a correlation: - from social networks to personalbehavior on the web,” in WWW ’08: Proceeding of the 17th international conference on World WideWeb. New York, NY, USA: ACM, 2008, pp. 655–664.[5]N.J. Smelser and N. Joseph, Theory of collective behavior. London (UK): Routledge & KeganPaul, 1962.[6] L. Tang, H. Liu, “Relational learning via latent social dimensions,” in KDD ’09: Proceedings ofthe 15th ACM SIGKDD international conference on Knowledge discovery and data mining. NewYork, NY, USA: ACM, 2009, pp. 817–826.[7] M.McPherson , L.Smith-Lovin , and J. M. Cook, “Birds of a feather: Homophily in socialnetworks,” Annual Review of Sociology, vol. 27, pp. 415–444, 2001.[8] H. W. Lauw, J. C. Shafer, R. Agrawal, and A. Ntoulas,“Homophily in the digital world: A Live Journal case study,” IEEE Internet Computing, vol. 14, pp.15 23,2010.[9] S. A. Macskassy and F. Provost, “ Classification in networked data: A toolkit and a univariate casestudy,” J. Mach. Learn. Res.,vol. 8, pp. 935–983, 2007.[10] S. Gupta, R. M. Anderson, and R. M. May, Networks of sexual contacts: Implications for thepattern of spread of HIV. AIDS 3, 807–817 (1989).[11] H. W. Lauw, J. C. Shafer, R. Agrawal, and A. Ntoulas, “Homophily in the digital world: ALiveJournal case study,” IEEE Internet Computing, vol. 14, pp. 15–23, 2010.[12] M. E. J. Newman, Fast algorithm for detecting community structure in networks. Phys. Rev. E69, 066133 (2004).[13] J.Reichardt and S. Bornholdt, Statistical mechanics of community detection. Preprint cond-mat/0603718 (2006).[14] J.Duch and A. Arenas, Community detection in complex networks using extremal optimization.Phys. Rev.E 72, 027104 (2005).[15] M. Granovetter. Threshold models of collective behavior. American journal of sociology,83(6):1420, 1978.[16] T. C. Schelling. Dynamic models of segregation. Journal of Mathematical Sociology, 1:143{186,1971.[17] M. E. J. Newman, A.-L. Barab´asi, and D. J. Watts, The Structure and Dynamics of Networks.Princeton University Press, Princeton (2006)[18] M.Girvan and M. E. J. Newman, Community structure in social and biological networks. Proc.Natl. Acad. Sci.USA 99,7821–7826 (2002).[19] P.Holme, M. Huss, and H. Jeong, Subnetwork hierarchies of biochemical pathways.Bioinformatics 19,532–538 (2003).[20] G.W.Flake, S. R. Lawrence, C. L. Giles, and F. M. Coetzee, Self-organization and identificationof Web communities. IEEE Computer 35, 66–71 (2002).[21] R. Guimer`a and L. A. N. Amaral, Functional cartography of complex metabolic networks.Nature 433, 895–900 (2005).[22] L. Tang, H. Liu, J. Zhang, and Z. Nazeri, “Community evolution in dynamic multi-modenetworks,” in KDD ’08: Proceeding of the 14th ACM SIGKDD international conference onKnowledge discovery and data mining. New York, NY, USA: ACM, 2008, pp. 677–685.[23] Sonia and Satinder Pal, “An Effective Approach To Contention Based BandwidthRequest Mechanism In Wimax Networks” International journal of Computer Engineering &Technology (IJCET), Volume 3, Issue 2, 2012, pp. 603 - 620, Published by IAEME.[24] F.Fenita and Sumitha C.H, “Prediction of Customer Behavior Using CMA” Internationaljournal of Computer Engineering & Technology (IJCET), Volume 1, Issue 2, 2010,pp. 192 - 200, Published by IAEME. 317

×