LINK PREDICTION IN
SOCIAL NETWORKING
SUBMITTED BY:
Umang Chaudhary (10103408)
Sanyam Gupta (10103405)
Under the Guidance Of: Dr. Buddha Singh
WHAT IS SOCIAL NETWORKING?
 Social networking is the grouping of individuals into specific
groups, such as small rural communities or a neighbourhood
subdivision. Although social networking is possible in person,
especially in workplaces, universities, and high schools, it is
most popular online.
CHALLENGE…!!
 Social networks are highly dynamic objects; they
grow and change quickly over time through the
addition of new edges. LINK PREDICTION addresses
this by anticipating which edges will appear.
WHAT IS LINK
PREDICTION?
 Link prediction is an important task in analysing
social networks. It estimates, using various algorithms,
whether a link will form between two given nodes.
APPLICATIONS
 Identifying the structure of a criminal
network by predicting missing links from
incomplete data.
 Automatic web hyperlink creation
 Website hyperlink prediction
 Building recommendation systems (e-commerce)
 Protein-protein interaction (bioinformatics)
 Annotating the PPI graph (bioinformatics)
 Identifying hidden groups of terrorists (security)
 Overcoming the data-sparsity problem in
recommender systems using collaborative
filtering.
PROBLEM STATEMENT
 The network is dynamic: it keeps
expanding as new users are added at a
rapid rate, so predicting links between
users is a big challenge. We therefore
implement algorithms such as common
neighbours, the Jaccard coefficient and
Adamic/Adar, which can predict links
efficiently.
Our social networking application lets anyone register,
become a member and stay connected to their friends and
to other users. The work completed in this project:
 We have developed a client-server based
social networking system.
 We have developed link prediction
algorithms that predict on the
basis of common neighbours, common
features, membership of the same community,
etc.
 To measure each user’s strength we have
calculated betweenness, closeness and
degree centrality.
 We have implemented our proposed
algorithm using the Python programming
language, Glade (for the GUI) and SQL (for
the database).
 We have evaluated the performance of
our proposed algorithm.
FEATURES OF OUR
APPLICATION
 New users can register on the application, update their
profile and view other users’ profiles.
 Each user has a unique profile, visible to friends and other users of
the application, where they can upload their pictures and
personal information.
 Users can add other users.
 The most famous person in a group can be predicted by the
application.
 Each user’s strength is found by calculating
closeness, betweenness and degree centrality.
 Users can post status updates and comment on them.
 Users can create and join communities
in their fields of interest.
 Users can create and join groups.
 Links are predicted on the basis of
common (mutual) friends, so users
get friend suggestions based on mutual
friends.
 Users also get friend suggestions for
people in the same community
if they have many features in common.
 Users also get friend suggestions for
people who share many features with them.
 Users can chat with each other.
 The application contains the basic features of any
social networking website, such as liking
pages, adding friends, posting status updates and
following people.
METHODS FOR LINK
PREDICTION
 Many methods can be used for
predicting links. Some of them
are:
1. Jaccard coefficient
2. Adamic/Adar
3. Common neighbours
4. Graph distance
5. Katz
6. Hitting time
7. Friends measure
8. Preferential attachment score
9. Bayesian algorithm, etc.
 The link prediction problem
 Given a snapshot of a social network
at time t, we seek to accurately predict
the edges that will be added to the
network during the interval (t, t’).
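The problem setup above can be sketched in Python. This is a minimal illustration rather than the project's code: the two snapshots, the predicted pairs and the helper names `edge`, `new_edges` and `precision` are all hypothetical.

```python
# Hypothetical snapshots of an undirected network. Each edge is a frozenset,
# so that (A, B) and (B, A) count as the same edge.
def edge(u, v):
    return frozenset((u, v))

snapshot_t = {edge("A", "B"), edge("A", "D"), edge("B", "C"),
              edge("C", "D"), edge("D", "E")}
snapshot_t_prime = snapshot_t | {edge("A", "C"), edge("B", "E")}

def new_edges(before, after):
    """Edges added to the network during the interval (t, t')."""
    return after - before

def precision(predicted, actual):
    """Fraction of predicted edges that really appeared."""
    predicted = set(predicted)
    if not predicted:
        return 0.0
    return len(predicted & actual) / len(predicted)

# Assumed output of some predictor that only saw the snapshot at time t.
predicted = [edge("A", "C"), edge("A", "E")]
actual = new_edges(snapshot_t, snapshot_t_prime)
print(precision(predicted, actual))  # prints 0.5
```

Here one of the two predicted edges (A-C) appears in the later snapshot, so the precision is 0.5.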
Methods we are using for the
prediction
 COMMON NEIGHBOURS
 JACCARD COEFFICIENT
 ADAMIC/ADAR
COMMON NEIGHBOURS
 A and C have 2 common neighbours, so they are
more likely to collaborate
[Figure: example graph with nodes A, B, C, D and E]
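A minimal Python sketch of the common neighbours score follows. The edges of the graph are an assumption, chosen so that A and C share two neighbours as stated on the slide; the function names `common_neighbours` and `rank_candidates` are hypothetical.

```python
# Hypothetical undirected friendship graph stored as adjacency sets.
# Assumed edges: A-B, A-D, B-C, C-D, D-E.
graph = {
    "A": {"B", "D"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"A", "C", "E"},
    "E": {"D"},
}

def common_neighbours(graph, u, v):
    """Set of nodes adjacent to both u and v."""
    return graph[u] & graph[v]

def rank_candidates(graph, u):
    """Rank the non-friends of u by their number of common neighbours."""
    scores = {
        v: len(common_neighbours(graph, u, v))
        for v in graph
        if v != u and v not in graph[u]
    }
    # Highest score first; ties broken alphabetically.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

print(sorted(common_neighbours(graph, "A", "C")))  # prints ['B', 'D']
print(rank_candidates(graph, "A"))  # prints [('C', 2), ('E', 1)]
```

Under these assumed edges, C is the strongest friend suggestion for A because they share both B and D.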
JACCARD COEFFICIENT
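 Same as common neighbours, but adjusted
for degree: the number of common neighbours
is divided by the total number of distinct
neighbours of the two nodes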
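The Jaccard coefficient of two nodes u and v is |N(u) ∩ N(v)| / |N(u) ∪ N(v)|, where N(x) is the set of neighbours of x. A minimal sketch, assuming the same hypothetical example graph (edges A-B, A-D, B-C, C-D, D-E) and a hypothetical function name `jaccard`:

```python
# Hypothetical undirected friendship graph stored as adjacency sets.
graph = {
    "A": {"B", "D"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"A", "C", "E"},
    "E": {"D"},
}

def jaccard(graph, u, v):
    """Common neighbours divided by all distinct neighbours of u and v."""
    union = graph[u] | graph[v]
    if not union:
        return 0.0  # two isolated nodes share nothing
    return len(graph[u] & graph[v]) / len(union)

print(jaccard(graph, "A", "C"))  # prints 1.0
print(jaccard(graph, "A", "E"))  # prints 0.5
```

The pairs (A, C) and (B, D) both have two common neighbours in this graph, but D has one extra neighbour (E), so the Jaccard score of (B, D) is 2/3 while that of (A, C) is 1.0. This is the degree adjustment at work.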
ADAMIC/ADAR
 Weights rarer neighbours more
heavily (gives more weight to common
neighbours that are not shared with
many others)
[Figure: example graph with nodes A, B, C, D and E]
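The Adamic/Adar score sums 1 / log(degree(z)) over every common neighbour z of the two nodes, so a common neighbour with few connections contributes more than one with many. A minimal sketch, assuming the same hypothetical example graph (edges A-B, A-D, B-C, C-D, D-E) and a hypothetical function name `adamic_adar`:

```python
import math

# Hypothetical undirected friendship graph stored as adjacency sets.
graph = {
    "A": {"B", "D"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"A", "C", "E"},
    "E": {"D"},
}

def adamic_adar(graph, u, v):
    """Sum of 1 / log(degree) over the common neighbours of u and v.

    For distinct u and v, a common neighbour is adjacent to both of them,
    so its degree is at least 2 and the logarithm is never zero.
    """
    return sum(1.0 / math.log(len(graph[z])) for z in graph[u] & graph[v])

# B has degree 2 and D has degree 3, so B contributes more than D.
print(round(adamic_adar(graph, "A", "C"), 3))
print(round(adamic_adar(graph, "A", "E"), 3))
```

For the pair (A, C) the common neighbour B (degree 2) contributes 1/ln 2, about 1.44, while D (degree 3) contributes 1/ln 3, about 0.91.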
Methods we are using to find
strengths of users
1. Degree centrality: Centrality is
regarded as one of the most important and
commonly used conceptual tools for
exploring actor roles in social networks. A
node’s degree centrality, in an undirected
graph, is defined as the number of nodes that
are directly connected to that node.
The definition dictates that “central actors
must be the most active in the sense that
they have the most ties to other actors in the
network or graph”.
2. Closeness: Closeness centrality indicates how
near a user is to all other users in the network, and
thus how much influence the user can exert on the
entire network. It is computed from the shortest-path
distances between the user and every other user.
3. Betweenness: Betweenness centrality
reflects the bridge role of a user in the
network. The larger a user’s betweenness,
the more control that user has over the
interaction between other users who are
not directly connected.
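The three measures can be sketched in plain Python as follows. This is an illustration under assumptions, not the project's implementation: the graph (edges A-B, A-D, B-C, C-D, D-E) and all function names are hypothetical, the graph is assumed to be connected and unweighted, and betweenness is computed with Brandes' accumulation scheme.

```python
from collections import deque

# Hypothetical undirected friendship graph stored as adjacency sets.
graph = {
    "A": {"B", "D"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"A", "C", "E"},
    "E": {"D"},
}

def degree_centrality(graph, v):
    """Number of users directly connected to v."""
    return len(graph[v])

def closeness_centrality(graph, s):
    """(reachable nodes) / (sum of shortest-path distances from s)."""
    dist = {s: 0}
    queue = deque([s])
    while queue:  # breadth-first search from s
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

def betweenness_centrality(graph):
    """Shortest paths passing through each node, counted per unordered pair."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        order = []                       # nodes in order of distance from s
        preds = {v: [] for v in graph}   # predecessors on shortest paths
        sigma = {v: 0 for v in graph}    # number of shortest paths from s
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):        # accumulate from the farthest node back
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}

print(degree_centrality(graph, "D"))     # prints 3
print(closeness_centrality(graph, "D"))  # prints 0.8
print(betweenness_centrality(graph)["D"])  # prints 3.5
```

In this assumed graph D scores highest on all three measures, which matches the intuition that D is the user connecting E to everyone else.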
IMPLEMENTATION
 SNAPSHOTS
TEST PLAN
 We have run several test cases
on the modules developed so far:
 When we check prediction on the
basis of common features, the
right person is predicted.
 On changing the features of any user,
new links are predicted, as expected
from the Adamic/Adar algorithm.
 Testing user login by entering the user
name and password: if both are correct
the user is logged in; otherwise the
login fails. The snapshots below show
a successful and a failed login.
 If all the fields have been filled in for registration,
the user is registered with
a unique ID, which is the primary key.
 On adding friends, we test that the friend list
is updated successfully.
THANK YOU
