Supervised-Learning Link Recommendation in the DBLP co-authoring network

Gabriel P Gimenes, Hugo Gualdron, Thiago R Raddo, Jose F
Rodrigues Jr
Instituto de Ciências Matemáticas e de Computação
Universidade de São Paulo
Av. Trabalhador São-carlense, 400-Centro, São Carlos, SP, Brasil
Click for paper:
http://www.icmc.usp.br/pessoas/junio/PublishedPapers/Gimenes_et_al_IEEE-PerCom-SCI2014.pdf
This work has financial support from FAPESP (2013/10026-7, 2011/13724-1)

Summary
1 Introduction
2 Link Prediction and Metrics
3 Results
4 Conclusions

Context
Advances in the WWW led to improved mechanisms for users to interact
Data became abundant in several scenarios: social networks, co-authoring networks, recommender systems, communication networks
Need for tools that can assist in the decision-making process
Most of the networks produced in our daily lives are dynamic
- Link Recommendation

Objectives
Analysis of the Link Recommendation task on a co-authoring network (DBLP)
Comparison of the most widely used supervised learning algorithms using performance metrics (AUC, F-measure, Precision and Recall)
Including the use of meta-classifiers such as Bagging and Random Forest
Detailed study of the parameters involved in the technique: Core(k) and the intervals
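The performance metrics named above can be illustrated with a minimal sketch in plain Python. The function names `precision_recall_f1` and `auc` are hypothetical and are not taken from the paper; the AUC here is computed by pairwise comparison of positive and negative scores, which is one common formulation and only practical for small samples.

```python
from typing import Sequence, Tuple


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Precision, recall and F-measure for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In link recommendation the positive class (pairs that do form a link) is typically much rarer than the negative class, which is one reason to report AUC and F-measure rather than plain accuracy.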

Link Prediction and Metrics

Problem Definition
A co-authoring network can be modeled as a graph in which nodes represent individuals and edges indicate a collaboration between them
The idea is to predict/recommend new edges with supervised learning techniques, using only past and present information about the network
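As a minimal sketch of this graph model, the snippet below turns a list of papers (each a list of author names) into an undirected adjacency map. The function name `build_coauthor_graph` and the toy author names are illustrative assumptions, not DBLP data.

```python
from itertools import combinations
from typing import Dict, Iterable, List, Set


def build_coauthor_graph(papers: Iterable[List[str]]) -> Dict[str, Set[str]]:
    """Return an undirected adjacency map: author -> set of co-authors."""
    graph: Dict[str, Set[str]] = {}
    for authors in papers:
        unique = sorted(set(authors))
        # Register every author, so single-author papers still create a node.
        for author in unique:
            graph.setdefault(author, set())
        # Every pair of authors on the same paper shares an edge.
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Toy data for illustration only.
papers = [["ana", "bruno", "carla"], ["carla", "davi"], ["eva"]]
graph = build_coauthor_graph(papers)
```

Here `graph["carla"]` contains `ana`, `bruno` and `davi`, while `eva` is an isolated node.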

Applications exist in different domains, such as:
Forecasting suspicious behavior on social networks (terrorism, for example)
Identifying interactions that would otherwise need intense experimentation in biology
Suggesting new collaborations/interactions to individuals in co-authoring networks

Given a snapshot of a network at time t, we are interested in the edges that most likely should/could exist at time t', where t < t'.
A supervised classifier is trained on topological features extracted from the network so that its dynamics can be analyzed
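One way to set this up as a supervised problem is sketched below, under assumptions that are not stated in the slides: papers carry a publication year, the training and test intervals are closed year ranges, and a candidate pair is any pair of training-interval authors who are not yet linked. A pair is labeled 1 if the link appears in the test interval and 0 otherwise. The names `edges_in_interval` and `label_candidates` are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Paper = Tuple[int, List[str]]  # (publication year, author names)


def edges_in_interval(papers: Iterable[Paper], start: int, end: int) -> Set[FrozenSet[str]]:
    """Co-authorship edges from papers published in the closed interval [start, end]."""
    edges: Set[FrozenSet[str]] = set()
    for year, authors in papers:
        if start <= year <= end:
            for a, b in combinations(sorted(set(authors)), 2):
                edges.add(frozenset((a, b)))
    return edges


def label_candidates(
    papers: Iterable[Paper], train: Tuple[int, int], test: Tuple[int, int]
) -> Dict[Tuple[str, str], int]:
    """Label each unlinked pair of training-interval authors: 1 if linked in the test interval."""
    papers = list(papers)
    past = edges_in_interval(papers, *train)
    future = edges_in_interval(papers, *test)
    nodes = sorted({node for edge in past for node in edge})
    labels: Dict[Tuple[str, str], int] = {}
    for a, b in combinations(nodes, 2):
        pair = frozenset((a, b))
        if pair not in past:  # only pairs that are not already connected
            labels[(a, b)] = 1 if pair in future else 0
    return labels
```

The resulting labels, paired with topological features computed on the training-interval graph only, form the input to the classifier.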

Core
Core(k) is the subset of nodes of interest
Nodes that have at least k edges in the training and test intervals belong to Core(k); all other nodes are discarded
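A minimal sketch of the Core(k) filter follows. It assumes that the same threshold k applies to both intervals and that a node must reach it in each interval separately; the slides do not spell this out, so treat that reading as an assumption. The names `degree_counts` and `core` are hypothetical.

```python
from collections import Counter
from typing import FrozenSet, Iterable, Set


def degree_counts(edges: Iterable[FrozenSet[str]]) -> Counter:
    """Count how many edges touch each node."""
    counts: Counter = Counter()
    for edge in edges:
        for node in edge:
            counts[node] += 1
    return counts


def core(train_edges: Iterable[FrozenSet[str]],
         test_edges: Iterable[FrozenSet[str]],
         k: int) -> Set[str]:
    """Nodes with at least k edges in the training interval and at least k in the test interval."""
    train_deg = degree_counts(train_edges)
    test_deg = degree_counts(test_edges)
    # Counter returns 0 for nodes that never appear in the test interval.
    return {n for n in train_deg if train_deg[n] >= k and test_deg[n] >= k}
```

Raising k keeps only the more active authors, which shrinks the candidate set and usually makes the remaining pairs easier to predict.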

Topological Features
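Metric                       Equation
Common Neighbours            $CN(x, y) = |\Gamma(x) \cap \Gamma(y)|$
Jaccard Coefficient          $JC(x, y) = \frac{|\Gamma(x) \cap \Gamma(y)|}{|\Gamma(x) \cup \Gamma(y)|}$
Preferential Attachment      $PA(x, y) = |\Gamma(x)| \cdot |\Gamma(y)|$
Adamic-Adar Coefficient      $AA(x, y) = \sum_{z \in \Gamma(x) \cap \Gamma(y)} \frac{1}{\log |\Gamma(z)|}$
Geodesic Distance            shortest path between x and y
Resource Allocation Index    $RA(x, y) = \sum_{z \in \Gamma(x) \cap \Gamma(y)} \frac{1}{|\Gamma(z)|}$
Local Paths                  LP(x, y) =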
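The neighbourhood-based features can be sketched in plain Python, taking the graph as an adjacency map from each node to its neighbour set (the set written Γ(x) in the usual notation, assumed here). The function names are hypothetical, the natural logarithm is assumed for Adamic-Adar, and Local Paths is left out because its equation is cut off in the source.

```python
import math
from collections import deque
from typing import Dict, Optional, Set

Graph = Dict[str, Set[str]]  # node -> set of neighbours


def common_neighbours(g: Graph, x: str, y: str) -> int:
    return len(g[x] & g[y])


def jaccard(g: Graph, x: str, y: str) -> float:
    union = g[x] | g[y]
    return len(g[x] & g[y]) / len(union) if union else 0.0


def preferential_attachment(g: Graph, x: str, y: str) -> int:
    return len(g[x]) * len(g[y])


def adamic_adar(g: Graph, x: str, y: str) -> float:
    # For x != y, a common neighbour has degree >= 2, so the log is positive.
    return sum(1.0 / math.log(len(g[z])) for z in g[x] & g[y])


def resource_allocation(g: Graph, x: str, y: str) -> float:
    return sum(1.0 / len(g[z]) for z in g[x] & g[y])


def geodesic_distance(g: Graph, x: str, y: str) -> Optional[int]:
    """Breadth-first search; returns None when y is unreachable from x."""
    if x == y:
        return 0
    seen = {x}
    queue = deque([(x, 0)])
    while queue:
        node, dist = queue.popleft()
        for nb in g[node]:
            if nb == y:
                return dist + 1
            if nb not in seen:
                seen.add(nb)
                queue.append((nb, dist + 1))
    return None
```

Each candidate pair (x, y) would then be described by one value per function, giving the feature vector passed to the classifier.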
