LINK PREDICTION IN
Umang Chaudhary (10103408)
Sanyam Gupta (10103405)
Under the Guidance Of: Dr. Buddha Singh
WHAT IS SOCIAL NETWORKING?
Social networking is the grouping of individuals into specific
groups, such as small rural communities or a neighbourhood
subdivision. Although social networking is possible
in person, especially in workplaces, universities and high
schools, it is most popular online.
Social networks are highly dynamic objects; they
grow and change quickly over time through the
addition of new edges. The way to anticipate these
new edges is LINK PREDICTION.
WHAT IS LINK PREDICTION?
Link prediction is an important task in analysing
social networks. It is the task of predicting which links
will form between given nodes, using various algorithms.
Identifying the structure of a criminal
network by predicting missing links in a
criminal network using incomplete data.
Automatic web hyperlink creation
Website hyperlink prediction
Build recommendation systems (e-
Protein-protein interaction (bio-informatics)
Annotate PPI graph (bio-informatics)
Identify hidden group of terrorists (security)
Overcoming the data-sparsity problem in
recommender systems using collaborative
The network is dynamic: it keeps
expanding as new users are
added rapidly, so
predicting links between users is a
big challenge. We therefore
implement algorithms such as common
neighbours, the Jaccard coefficient and
Adamic/Adar, which can predict links
Our social networking application is one where
anyone can register, become a member and stay
connected to their friends and to other users. Our
application includes all the features of a social
networking application, such as:
We have developed the social networking
system which is client-server based.
We have developed algorithms for link
prediction that include prediction on the
basis of common neighbours, common
features and membership of the same community
To check each user’s strength we have
calculated betweenness, closeness and
degree centrality.
We have implemented our proposed
algorithm using the Python programming
language, Glade (for GUI) and SQL (for
We have evaluated the performance of
our proposed algorithm.
FEATURES OF OUR
New users can register on the application, update their
profile and view other users’ profiles.
Users have a unique profile visible to the friends and users of
the application, where they can upload their pictures and
Users can add other users.
The most famous person in the group can be predicted in the
We are able to find each user’s strength by calculating
closeness, betweenness and degree centrality.
Users can upload their status and comment on the status as
Users can join communities created by them
in their field of interest.
Users can create and join groups.
Links are predicted on the basis of
common (mutual) friends, so users
get friend suggestions on the basis of mutual
Users also get friend suggestions for
people who belong to the same
community, if they have many features
in common.
Users also get friend suggestions for
people with whom they share many features.
Users can chat with each other.
The application contains basic features of any
social networking website such as liking the
pages, adding friends, uploading status, follow
METHODS FOR LINK
Many methods can be
used for predicting links. Some of them are:
1. Jaccard coefficient
3. Common neighbors
4. Graph distance
6. Hitting time
7. Friends measure
8. Preferential attachment score
9. Bayesian algorithm, etc.
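Two of the simpler scores in the list above, preferential attachment and graph distance, can be sketched in a few lines. This is only an illustration under assumptions: the toy friendship graph, the function names and the choice to negate the distance (so that closer pairs rank higher) are not taken from the application itself.

```python
from collections import deque

# Hypothetical undirected friendship graph: user -> set of friends.
graph = {
    "A": {"B", "D"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"A", "C", "E"},
    "E": {"D"},
}


def preferential_attachment(g, x, y):
    """Product of the two degrees: well-connected users attract more links."""
    return len(g[x]) * len(g[y])


def graph_distance_score(g, x, y):
    """Negated shortest-path length from x to y (breadth-first search).

    Returns -inf when y cannot be reached from x.
    """
    dist = {x: 0}
    queue = deque([x])
    while queue:
        v = queue.popleft()
        if v == y:
            return -dist[v]
        for w in g[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return float("-inf")
```

Both functions take a pair of users and return a number; a higher number means the pair is a stronger candidate for a future link.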
The link prediction problem
Given a snapshot of a social network
at time t, we seek to accurately predict
the edges that will be added to the
network during the interval (t, t’)
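The problem statement above can be turned into a small evaluation harness. The sketch below is illustrative only: the timestamped edge list, the cut-off times and all function names are assumptions, and precision over the top k predictions is just one possible measure.

```python
from itertools import combinations


def split_by_time(timed_edges, t, t_prime):
    """Split (u, v, timestamp) edges into the snapshot at time t and
    the new edges that appear in the open interval (t, t_prime)."""
    snapshot = {frozenset((u, v)) for u, v, ts in timed_edges if ts <= t}
    future = {frozenset((u, v)) for u, v, ts in timed_edges
              if t < ts < t_prime}
    return snapshot, future - snapshot


def candidate_pairs(snapshot):
    """All unordered node pairs that are not yet linked in the snapshot."""
    nodes = sorted({n for edge in snapshot for n in edge})
    return [frozenset(p) for p in combinations(nodes, 2)
            if frozenset(p) not in snapshot]


def precision_at_k(ranked_pairs, new_edges, k):
    """Fraction of the top-k predicted pairs that really appeared."""
    if k <= 0:
        return 0.0
    return sum(1 for p in ranked_pairs[:k] if p in new_edges) / k


# Hypothetical toy data: (user, user, time the friendship was created).
timed_edges = [("A", "B", 1), ("B", "C", 1), ("A", "D", 2),
               ("D", "C", 2), ("A", "C", 3)]
snapshot, new_edges = split_by_time(timed_edges, t=2, t_prime=4)
candidates = candidate_pairs(snapshot)
```

In practice the candidate list would be sorted by a link prediction score before `precision_at_k` is applied; here it is left in its default order to keep the sketch short.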
Methods we are using for the
Common neighbours: A and C have 2 common neighbours,
so they are more likely to collaborate
Jaccard coefficient: same as common neighbours, adjusted
Adamic/Adar: weights rarer neighbours more
heavily (gives more weight to
neighbours that are not shared with
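The three scores named earlier in this deck (common neighbours, Jaccard coefficient and Adamic/Adar) can be sketched as follows. This is a minimal illustration, not the application's actual code: the toy graph, the function names and the `suggest_friends` helper are assumptions, chosen so that A and C share two neighbours as in the example above.

```python
import math

# Hypothetical undirected friendship graph: user -> set of friends.
graph = {
    "A": {"B", "D"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"A", "C", "E"},
    "E": {"D"},
}


def common_neighbours(g, x, y):
    """Number of friends that x and y share."""
    return len(g[x] & g[y])


def jaccard(g, x, y):
    """Shared friends divided by all friends of either user."""
    union = g[x] | g[y]
    return len(g[x] & g[y]) / len(union) if union else 0.0


def adamic_adar(g, x, y):
    """Sum of 1 / log(degree) over shared friends, so a shared friend
    with few connections counts for more than a very popular one."""
    return sum(1.0 / math.log(len(g[z]))
               for z in g[x] & g[y] if len(g[z]) > 1)


def suggest_friends(g, user, score, top_n=3):
    """Rank users who are not yet friends of `user` by the given score."""
    candidates = [v for v in g if v != user and v not in g[user]]
    ranked = sorted(candidates, key=lambda v: (-score(g, user, v), v))
    return ranked[:top_n]
```

For the pair (A, C) the shared friends are B (degree 2) and D (degree 3), so under Adamic/Adar B contributes 1/log 2 and D only 1/log 3. Calling `suggest_friends(graph, "A", common_neighbours)` ranks C ahead of E, because A shares two friends with C and one with E.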
Methods we are using to find
strengths of users
1. Degree Centrality: Centrality is
regarded as one of the most important and
commonly used conceptual tools for
exploring actor roles in social networks. A
node’s degree centrality, in an undirected
graph, is defined as the number of nodes that
are connected to that node.
The definition dictates that “central actors
must be the most active in the sense that
they have the most ties to other actors in the
network or graph”.
2. Closeness: Closeness centrality indicates the
influence of a node on the entire network, and thus
discipline centrality in this research can tell how
“close” each discipline is to the other disciplines
and the influence that a discipline puts on the
3. Betweenness: According to the definition
of betweenness, betweenness centrality
reflects the bridge role of a discipline in a
knowledge communication network. The larger
the discipline betweenness, the more control
that the discipline has over the interaction
between other disconnected disciplines.
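The three strength measures can be computed on a small graph as sketched below. This is a hedged illustration rather than the application's own code: the star-shaped toy graph and the function names are assumptions, closeness is taken here as (reachable nodes - 1) divided by the sum of shortest-path distances, and betweenness follows Brandes' algorithm for unweighted, undirected graphs.

```python
from collections import deque

# Hypothetical undirected graph: B is connected to everyone else.
graph = {"A": {"B"}, "B": {"A", "C", "D"}, "C": {"B"}, "D": {"B"}}


def degree_centrality(g):
    """Number of direct connections of each node."""
    return {v: len(g[v]) for v in g}


def bfs_distances(g, source):
    """Shortest-path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in g[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def closeness_centrality(g):
    """(reachable nodes - 1) / sum of distances; 0 for isolated nodes."""
    result = {}
    for v in g:
        dist = bfs_distances(g, v)
        total = sum(dist.values())
        result[v] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness_centrality(g):
    """Brandes' algorithm: counts the shortest paths through each node."""
    score = {v: 0.0 for v in g}
    for s in g:
        order, dist = [], {s: 0}
        preds = {v: [] for v in g}
        sigma = {v: 0 for v in g}   # number of shortest paths from s
        sigma[s] = 1
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in g[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in g}
        for w in reversed(order):   # accumulate from the farthest nodes
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted from both endpoints.
    return {v: b / 2 for v, b in score.items()}
```

On this toy graph B has the highest value under all three measures, which matches the intuition that the user through whom everyone else is connected is the strongest.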
We have run many test cases
on the modules that have been
developed so far, which are as follows:
When we check the prediction on the
basis of common features, we see that
the right person gets predicted.
On changing the features of any user,
new links should be predicted,
as per the Adamic/Adar algorithm.
Testing the user login by entering the user
name and password. If the user name and
password are correct, the user is logged in;
otherwise the login fails. Below are the
snapshots indicating the login successful and
If all the fields have been filled in for registering
the user, then the user will be registered with
a unique id, which is the primary key.
On adding friends we test that the friend list
should get updated successfully.