Sunbelt 2013 Presentation

Layout Algorithm for Clustered
Graphs to Analyze Community
Interactions in Social Networks
Juan David Cruz
Cécile Bothorel
François Poulet
SUNBELT Conference 2013
May 23rd, 2013, Hamburg, Germany

Institut Mines-Télécom
Introduction – I
Real-world social networks store both social and structural information about the actors.
For example, this social network from Facebook contains the actors' personal information and the links between them…
This social network is described by two types of information, which are integrated into the communities that can be identified…
2 Juan David Cruz

Introduction – II
How can these structural and profile similarities be represented on the same plane while presenting the community configuration?
Combining both types of information helps to identify groups of similar, well-connected nodes:
- Find groups of friends who are similar in terms of their hobbies
- Find groups of friends who are similar in terms of their academic competences
These new partitions can be analyzed using visual analytics approaches, but how can a visual approach exploit all this information?

Layout of communities – Objective
How can these structural and profile similarities be represented on the same plane while presenting the community configuration?
- Graph layout poses several challenges, from computational complexity to readability
- These challenges also apply to visualizing and analyzing communities
We want to:
- Reduce node clutter while showing the relationships between profile and structure in the communities
- Observe interactions between communities and find important nodes

Literature review – Clustered graph layout
Diagram: graph G and a partition of the graph G
Force-based models:
- Multilevel
- LinLog
- Multilevel for weighted graphs
- Kamada-Kawai based
Hierarchical models:
- Hierarchic quotient graph
- Radial hierarchy representation
- Hierarchical visual clustering
Other models:
- Rectangles and straight lines
- Topological features
- Overlapping clustered graphs
In general, these models aim to differentiate the groups by separating them from each other, that is, to establish their limits.

Literature review – Clustered graph layout – Examples
Figures: Multilevel (Eades & Feng); Orthogonal (Di Battista et al.); Weighted Multilevel (Bourqui et al.); Overlapping (Santamaría & Therón)

Clustered graph layout algorithms – Summary
Information used by each method:
Method                                                              Structural  Profiles  Communities
Multilevel / Force directed [Eades & Feng 1999]                     No          No        Yes
Rectangles and straight lines [Di Giacomo 2007]                     No          No        Yes
LinLog / Force model [Noack 2003]                                   Yes         No        No
Hierarchical / Quotient graph [Brockenauer 2001]                    No          No        Yes
Multilevel / Force directed, weighted graphs [Bourqui et al. 2007]  Yes         No        Yes
Overlapping clustered graphs [Santamaría et al. 2008]               No          No        Yes
Radial representation [Mun & Ha 2005]                               Yes         No        No
Hierarchical / Visual clustering [Batagelj et al. 2011]             Yes         No        No
Kamada-Kawai based [Shi et al. 2009]                                No          No        Yes
Topological features based [Archambault et al. 2007]                Yes         No        Yes
Multivariate layout algorithm (our algorithm)                       Yes         Yes       Yes

Visualization of communities
- The algorithm makes it possible to correlate structural and profile information
- Each group contains diverse categories from the profile information
- The algorithm focuses on the individuals who connect communities (boundary connectors)
- Boundary connectors are described by their profile and their neighborhood

Visualization of communities – Multidimensional Scaling
- Multidimensional scaling (MDS) maps similarities into a two- or three-dimensional space
- MDS takes a (dis)similarity matrix as input
- The output is a set of coordinates whose pairwise distances resemble the (dis)similarities
- Examples of (dis)similarities:
• Geographic distances
• Jaccard distance (vectors, sets)
• Geodesic distances (graphs)
Figure: a dissimilarity matrix mapped to 2D coordinates
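
As a rough illustration of the mapping from a dissimilarity matrix to coordinates, the sketch below implements classical (Torgerson) MDS in plain Python. The slides do not say which MDS variant is used, so the variant, the function name `classical_mds`, the power-iteration eigen solver and the example matrix are all assumptions made for illustration.

```python
import math


def classical_mds(dissimilarity, dims=2, iterations=200):
    """Classical (Torgerson) MDS on a symmetric dissimilarity matrix (list of lists)."""
    n = len(dissimilarity)
    if n == 0:
        return []
    squared = [[d * d for d in row] for row in dissimilarity]
    row_mean = [sum(row) / n for row in squared]
    total_mean = sum(row_mean) / n
    # Double centering: b = -1/2 * J * D^2 * J, with J the centering matrix.
    b = [[-0.5 * (squared[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]

    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        # Power iteration for the dominant eigenpair of b. The start vector is
        # not constant because b maps constant vectors to zero.
        v = [float(i + 1) for i in range(n)]
        for _ in range(iterations):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        length = math.sqrt(sum(x * x for x in v))
        v = [x / length for x in v]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(v[i] * bv[i] for i in range(n))
        # Coordinate k is the eigenvector scaled by the square root of its eigenvalue.
        scale = math.sqrt(max(eigenvalue, 0.0))
        for i in range(n):
            coords[i][k] = scale * v[i]
        # Deflation: remove the component just found before the next dimension.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigenvalue * v[i] * v[j]
    return coords


# Hypothetical input: pairwise distances between the corners of a 3 x 4 rectangle.
rectangle = [
    [0.0, 3.0, 4.0, 5.0],
    [3.0, 0.0, 5.0, 4.0],
    [4.0, 5.0, 0.0, 3.0],
    [5.0, 4.0, 3.0, 0.0],
]
coords = classical_mds(rectangle)
```

For Euclidean input such as this one, the distances between the returned points match the input matrix up to rotation and reflection of the layout. For non-Euclidean dissimilarities the result is only an approximation, and power iteration can pick up negative eigenvalues, which this sketch clamps to zero.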

Visualization of communities – Types of nodes
Border nodes: the nodes that connect communities. They have neighbors in other clusters and define the interaction zone.
Inner nodes: the nodes whose edges go only from or to nodes in the same cluster. They are placed outside the interaction zone.
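
The split into border and inner nodes can be sketched as follows. This is a minimal illustration under assumptions: the edge list, the `community` mapping and the function name `classify_nodes` are hypothetical and do not come from the presentation.

```python
def classify_nodes(edges, community):
    """Split nodes into border nodes (a neighbor in another cluster) and inner nodes."""
    border = set()
    for u, v in edges:
        # An edge crossing two clusters makes both endpoints border nodes.
        if community[u] != community[v]:
            border.update((u, v))
    # Every other node only has neighbors in its own cluster (or none at all).
    inner = set(community) - border
    return border, inner


# Hypothetical toy graph: two clusters joined by the single edge (c, d).
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f"), ("d", "f")]
community = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
border, inner = classify_nodes(edges, community)
```

In this toy graph only `c` and `d` have a neighbor in the other cluster, so they would form the interaction zone, while the remaining nodes are inner nodes.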

Visualization of communities – The algorithm
1. For each node set, a dissimilarity matrix is computed from the profile and the structural information
2. The MDS algorithm outputs coordinates that reflect the proximity of the nodes in terms of these two variables
3. A final transformation of the coordinates defines the interaction zone
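
Step 1 can be sketched as below. The slides do not specify how the profile and structural information are combined, so this example assumes a weighted sum (weight `alpha`) of a normalized geodesic distance and a Jaccard distance between profile sets. All function names, the weighting scheme and the toy data are hypothetical.

```python
from collections import deque


def jaccard_distance(a, b):
    """Jaccard distance between two sets of profile attributes."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def geodesic_distances(nodes, adjacency, source):
    """Breadth-first search hop counts from source, restricted to `nodes`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, ()):
            if v in nodes and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def combined_dissimilarity(nodes, adjacency, profiles, alpha=0.5):
    """Assumed combination: alpha * structural + (1 - alpha) * profile."""
    order = sorted(nodes)
    node_set = set(order)
    hops = {u: geodesic_distances(node_set, adjacency, u) for u in order}
    # Hop counts are at most n - 1; unreachable pairs get that maximum (assumption).
    longest = max(len(order) - 1, 1)
    matrix = []
    for u in order:
        row = []
        for v in order:
            structural = hops[u].get(v, longest) / longest
            profile = jaccard_distance(profiles[u], profiles[v])
            row.append(alpha * structural + (1 - alpha) * profile)
        matrix.append(row)
    return order, matrix


# Hypothetical node set: a path a - b - c with small profile sets.
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
profiles = {"a": {"music", "chess"}, "b": {"music"}, "c": {"hiking"}}
order, matrix = combined_dissimilarity({"a", "b", "c"}, adjacency, profiles)
```

The resulting symmetric matrix is the kind of input an MDS routine expects. Changing `alpha` shifts the layout between structure-driven and profile-driven proximity.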

Visualization of communities – Experiments – Setup
- The goal of the experiments is to test the algorithm's ability to identify nodes that are important with respect to the connections between communities and the connections inside them
- The graphs used in the experiments have a low edge density and are therefore expected to have a community structure
- The community structure makes these graphs suitable for our algorithm
Table: clustered graphs used to test the layout algorithm

Experiments – Setup: roles
Roles from Guimerà and Amaral, 2005

Experiments – Facebook
Ambassadors help their communities to get into the interaction zone (where the communities interact). The influence of the structural similarity is reflected in the proximity of the nodes.
Figures: layout using Fruchterman & Reingold; layout using our algorithm (labels: interaction zone, inner nodes)

Experiments – DBLP
In this graph, several well-connected nodes remain inner nodes. These nodes can be seen as gurus in their communities; however, they are not connected with other communities (which deal with other topics).

Experiments – Protein interaction
With our algorithm it is possible to observe the sizes of the inner nodes and to identify the nodes that are important with regard to the interactions. However, this representation has to be analyzed by an expert to give some insight into the configuration.

Visualization of communities – Complexity
Complexity of the algorithm
The overall complexity of the algorithm is:
The algorithm was implemented using two parallelization approaches: threaded CBLAS routines and, where available, GPGPU CUBLAS routines.
Results of the experiments
In general, the complexity is quadratic in the number of border nodes.
Graph                % border nodes  Time (s)
Protein interaction  38%             1021
DBLP network         40%             346
Twitter network      24%             89
Facebook network     25%             36

Table of contents
3 Conclusion and perspectives

Conclusion and perspectives
- Our proposed visualization model focuses on integrating the variables that exist in a social network
- Dividing the nodes into two categories makes it possible to identify the nodes that are important for communication between communities
- This division reduces the complexity (on average) of the layout algorithm
- The nodes are placed so that the distance between them represents their structural similarity
- The model was implemented using PT-CBLAS and CUBLAS to parallelize some operations of the algorithm
- Qualitative studies have to be performed to test the functionality of the model on real research cases

Conclusion and perspectives – Future work
- The visual model can be extended to include the notion of point of view, showing the impact of selecting different elements from the profile information
- This visualization method can be applied to real-world problems, such as identifying influential actors in marketing campaigns

Thank you for your attention
Do you have questions?
