Motivations
Dissimilarities and distances between vertices
Kernel SOM
Application and comments
Graph mining with kernel self-organizing map
Nathalie Villa-Vialaneix
http://www.nathalievilla.org
Joint work with Fabrice Rossi, INRIA, Rocquencourt, France
Institut de Mathématiques de Toulouse, - IUT de Carcassonne, Université de
Perpignan
France
SanTouVal, February 1st, 2008
Nathalie Villa - nathalie.villa@math.univ-toulouse.fr SanTouVal - Feb. 2008
Table of contents
1 Motivations
2 Dissimilarities and distances between vertices
3 Kernel SOM
4 Application and comments
Exploring a large historical database
Data
1000 agrarian contracts,
from four seigneuries (about 10 villages) in the south-west of
France,
established between 1250 and 1350 (before the Hundred
Years' War).
Historian's questions:
Are social links driven by family or by geography?
Are there central people with a major social role?
...
⇒ Data mining is required.
A graph clustering problem
From the database, building a weighted graph:
with n = 615 vertices $x_1, \ldots, x_n$ := peasants found in the
contracts;
with weights $(w_{i,j})_{i,j=1,\ldots,n}$, where $w_{i,j}$ := number of contracts in which $x_i$ and $x_j$ are
both mentioned.
Number of vertices: 615
Number of edges: 4193
Total of weights: 40 329
Diameter: 10
Density: 2.2%
Goal: clustering the vertices into homogeneous social groups to
understand the structure of the peasant community.
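As a quick sanity check of the reported figures, the density of an undirected graph without self-loops is the number of edges divided by the number of possible vertex pairs, n(n-1)/2. The short sketch below recomputes it from the counts given above; the function name is illustrative and not part of the talk.

```python
def graph_density(n_vertices: int, n_edges: int) -> float:
    """Density of a simple undirected graph: edges / possible vertex pairs."""
    possible_pairs = n_vertices * (n_vertices - 1) // 2
    return n_edges / possible_pairs

# Counts reported for the peasant graph.
density = graph_density(615, 4193)
print(f"{100 * density:.1f}%")  # prints "2.2%"
```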
Other fields modeled by large graphs
Computer science: World Wide Web, P2P networks...
Social networks
Biology: protein interactions, neuronal networks...
Business, management: transportation networks, industry
partnerships...
Question: understanding the structure of these large graphs
Clustering: building relevant homogeneous groups;
Graph drawing: giving a global representation of the graph.
Here: Self-Organizing Map for non-vectorial data.
Usual dissimilarities between vertices
The Dice (Jaccard) index:
$$D(x_i, x_j) = \frac{|\Gamma(x_i) \cap \Gamma(x_j)|}{|\Gamma(x_i)| + |\Gamma(x_j)|}$$
where $\Gamma(x)$ is the set of neighbors of $x$ (non-weighted graphs);
Dissimilarities based on the shortest paths;
Dissimilarities or distances based on the Laplacian matrix:
spectral clustering.
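A minimal sketch of the neighborhood index exactly as written above, for a non-weighted graph stored as a dictionary of neighbor sets. The function name and the toy graph are hypothetical. Note that the value grows with the number of shared neighbors, and that the usual Dice coefficient carries an extra factor 2 in the numerator.

```python
def neighborhood_index(adjacency: dict, u, v) -> float:
    """|N(u) & N(v)| / (|N(u)| + |N(v)|), following the formula above."""
    neighbors_u, neighbors_v = adjacency[u], adjacency[v]
    total = len(neighbors_u) + len(neighbors_v)
    return len(neighbors_u & neighbors_v) / total if total else 0.0

# Toy graph (hypothetical): a triangle a-b-c plus a pendant vertex d attached to c.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

print(neighborhood_index(adjacency, "a", "b"))  # 0.25: one shared neighbor, 1 / (2 + 2)
print(neighborhood_index(adjacency, "b", "d"))  # one shared neighbor, 1 / (2 + 1)
```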
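Laplacian
Definitions
For a graph with vertices $V = \{x_1, \ldots, x_n\}$ having positive weights
$(w_{i,j})_{i,j=1,\ldots,n}$ such that, for all $i, j = 1, \ldots, n$, $w_{i,j} = w_{j,i}$, and with degrees $d_i = \sum_{j=1}^{n} w_{i,j}$,
Laplacian: $L = (L_{i,j})_{i,j=1,\ldots,n}$ where
$$L_{i,j} = \begin{cases} -w_{i,j} & \text{if } i \neq j \\ d_i & \text{if } i = j \end{cases}$$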
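The definition translates directly into L = D - W, with D the diagonal matrix of degrees. The sketch below assumes a symmetric weight matrix with no self-loops (any diagonal entry is ignored, which is an assumption of this example); the function name and the 3-vertex weights are illustrative.

```python
import numpy as np

def graph_laplacian(weights) -> np.ndarray:
    """Return L = D - W for a symmetric, non-negative weight matrix W."""
    weights = np.asarray(weights, dtype=float)
    if not np.allclose(weights, weights.T):
        raise ValueError("the weight matrix must be symmetric")
    off_diagonal = weights - np.diag(np.diag(weights))  # drop self-loops (assumption)
    degrees = off_diagonal.sum(axis=1)                  # d_i = sum_j w_ij
    return np.diag(degrees) - off_diagonal

# Toy weights (hypothetical): x1 -- x2 with weight 2, x2 -- x3 with weight 1.
W = [[0, 2, 0],
     [2, 0, 1],
     [0, 1, 0]]
L = graph_laplacian(W)
print(L)
```

Each row of L sums to zero, which is why the constant vector always belongs to the kernel of L.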
Laplacian: property I [von Luxburg, 2007]
Connected subgraphs
$\mathrm{Ker}\, L = \mathrm{Span}\{\mathbb{1}_{A_1}, \ldots, \mathbb{1}_{A_k}\}$ where $A_i$ indicates the positions of the
vertices of the $i$th connected component of the graph.
Example (figure: a graph on 5 vertices with connected components {1, 4, 5} and {2, 3}):
$$\mathrm{Ker}\, L = \mathrm{Span}\left\{ (1, 0, 0, 1, 1)^T ;\ (0, 1, 1, 0, 0)^T \right\}$$
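The 5-vertex example can be checked numerically. The exact edges of the figure are not recoverable, so the edge list below is an assumption chosen only to give the components {1, 4, 5} and {2, 3}; the property itself does not depend on that choice.

```python
import numpy as np

# Hypothetical edges producing the connected components {1, 4, 5} and {2, 3}.
edges = [(1, 4), (4, 5), (2, 3)]
n = 5
W = np.zeros((n, n))
for a, b in edges:
    W[a - 1, b - 1] = W[b - 1, a - 1] = 1.0
L = np.diag(W.sum(axis=1)) - W

# The multiplicity of the eigenvalue 0 equals the number of connected components.
eigenvalues = np.linalg.eigvalsh(L)
n_components = int(np.sum(eigenvalues < 1e-10))

# Indicator vectors of the two components, as in the example above.
indicator_a = np.array([1, 0, 0, 1, 1], dtype=float)
indicator_b = np.array([0, 1, 1, 0, 0], dtype=float)

print(n_components)                      # 2
print(np.allclose(L @ indicator_a, 0))   # True
print(np.allclose(L @ indicator_b, 0))   # True
```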
Laplacian: property II [Boulet et al., 2008]
Perfect community: complete subgraph (clique) whose vertices
share the same neighbors outside the clique.
Laplacian and perfect communities
For a non-weighted graph,
the graph has a perfect community with m vertices
⇔
L has m eigenvectors that all vanish on the same
n − m coordinates.
Application: drawing the graph from its perfect communities (figure not reproduced).
But: only 1/3 of the graph can be drawn this way.
Laplacian: property III [von Luxburg, 2007]
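Min Cut problem: Suppose that we have a connected graph.
Finding a partition of the vertices of the graph, $A_1, \ldots, A_k$, such
that
$$\frac{1}{2} \sum_{i=1}^{k} \sum_{j \in A_i,\ j' \notin A_i} w_{j,j'}$$
is minimum is equivalent to finding
$$H = \arg\min_{h \in \mathbb{R}^{n \times k}} \mathrm{Tr}(h^T L h) \quad \text{subject to} \quad h^T h = I, \quad h_i = \frac{1}{\sqrt{|A_i|}} \mathbb{1}_{A_i}$$
where $h_i$ is the $i$th column of $h$ (strictly speaking, with this scaling each cut term is divided by $|A_i|$, which is the RatioCut criterion of [von Luxburg, 2007]).
⇒ NP-complete problem.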
Relaxation: the Min Cut problem can be approached by dropping the indicator constraint on the columns of $h$:
$$H = \arg\min_{h \in \mathbb{R}^{n \times k}} \mathrm{Tr}(h^T L h) \quad \text{subject to} \quad h^T h = I$$
Spectral clustering: compute the eigenvectors of $L$ associated with its $k$ smallest eigenvalues, store them as the columns of $H$, and
cluster the rows of $H$.
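The relaxed problem can be sketched in a few lines. This is only an illustration under stated assumptions: the rows of H are clustered with a plain k-means (a common choice in [von Luxburg, 2007], not specified in the talk), the initialization is a deterministic farthest-point rule picked for reproducibility, and the toy graph and all function names are hypothetical.

```python
import numpy as np

def spectral_embedding(laplacian: np.ndarray, k: int) -> np.ndarray:
    """Eigenvectors of L for its k smallest eigenvalues, as the columns of H."""
    _, eigenvectors = np.linalg.eigh(laplacian)  # eigenvalues are sorted ascending
    return eigenvectors[:, :k]

def kmeans(points: np.ndarray, k: int, n_iter: int = 50) -> np.ndarray:
    """Lloyd iterations with a farthest-point initialization (illustrative)."""
    centers = [points[0]]
    for _ in range(1, k):
        d2 = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[d2.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = points[labels == c].mean(axis=0)
    return labels

# Toy graph (hypothetical): two triangles joined by one weak edge.
W = np.zeros((6, 6))
for a, b, w in [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
                (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0), (2, 3, 0.1)]:
    W[a, b] = W[b, a] = w
L = np.diag(W.sum(axis=1)) - W

H = spectral_embedding(L, k=2)
labels = kmeans(H, k=2)
print(labels)  # the two triangles end up in two different classes
```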
A regularized version of L
Regularization: the diffusion matrix: for β > 0,
$$K^\beta = e^{-\beta L} = \sum_{k=0}^{+\infty} \frac{(-\beta L)^k}{k!}.$$
⇒
$$k^\beta : V \times V \to \mathbb{R}, \quad (x_i, x_j) \mapsto K^\beta_{i,j}$$
is the diffusion kernel (or heat kernel).
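Since L is symmetric, the matrix exponential can be computed from its eigendecomposition. The sketch below does so and cross-checks the result against the truncated power series (starting at k = 0). The function name, the toy path graph and the value of β are assumptions made for the example.

```python
import numpy as np

def diffusion_kernel(laplacian: np.ndarray, beta: float) -> np.ndarray:
    """K^beta = exp(-beta L) through the eigendecomposition of the symmetric L."""
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    return (eigenvectors * np.exp(-beta * eigenvalues)) @ eigenvectors.T

# Path graph x1 -- x2 -- x3 with unit weights (toy example).
L = np.array([[1.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 1.0]])
beta = 0.5
K = diffusion_kernel(L, beta)

# Cross-check against the truncated series sum_{k >= 0} (-beta L)^k / k!.
series = np.zeros_like(L)
term = np.eye(3)
for k in range(30):
    series += term
    term = term @ (-beta * L) / (k + 1)

print(np.allclose(K, series))  # True
print(K[0, 1] > K[0, 2])       # True: x1 is closer to x2 than to x3
```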
Diffusion process on the graph
If $Z_0 = (1\ 1\ 1 \ldots 1\ 1)^T$
is the "energy" of each vertex at time 0 and
if a small fraction $\epsilon > 0$ of this energy is propagated along the edges
of the graph at each time step, then after $t$ steps, the energy of the
vertices of the graph is:
$$Z_t = (I - \epsilon L)^t Z_0$$
Limits: replacing the time step by $\Delta t$, i.e. $t \to t/\Delta t$ and $\epsilon \to \epsilon \Delta t$, and then letting
$\Delta t \to 0$ (continuous process) gives
$$\lim_{\Delta t \to 0} Z_t = e^{-\epsilon t L} Z_0 = K^{\epsilon t} Z_0$$
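The passage to the limit can be observed numerically. The sketch below uses the sign convention of the diffusion matrix $K^\beta = e^{-\beta L}$, so one discrete step multiplies by I minus a small multiple of L; the names `eps`, `t` and `discrete_diffusion`, as well as the toy path graph, are assumptions of the example.

```python
import numpy as np

# Path graph x1 -- x2 -- x3 with unit weights (toy example).
L = np.array([[1.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 1.0]])
eps, t = 0.3, 2.0  # fraction propagated per unit of time, and total time

# Continuous limit exp(-eps * t * L), from the eigendecomposition of L.
eigenvalues, V = np.linalg.eigh(L)
limit = (V * np.exp(-eps * t * eigenvalues)) @ V.T

def discrete_diffusion(dt: float) -> np.ndarray:
    """(I - eps*dt*L)^(t/dt): t/dt discrete steps of length dt."""
    steps = int(round(t / dt))
    return np.linalg.matrix_power(np.eye(len(L)) - eps * dt * L, steps)

errors = [np.abs(discrete_diffusion(dt) - limit).max() for dt in (1.0, 0.1, 0.001)]
print(errors)  # the gap to the continuous limit shrinks as dt decreases
```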
Properties
1 Diffusion on the graph: $k^\beta(x_i, x_j)$ is the quantity of energy
accumulated in $x_j$ after a given time if energy 1 is injected in $x_i$
at time 0 and if diffusion is done continuously along the edges.
β: intensity of diffusion;
2 Regularization operator: for $u \in \mathbb{R}^n \sim V$, the penalty $u^T (K^\beta)^{-1} u = u^T e^{\beta L} u$ is higher for
vectors $u$ that vary a lot over "close" vertices of the graph.
β: intensity of regularization (for small β, direct neighbors are
more important);
3 Reproducing kernel property: $k^\beta$ is symmetric and positive definite
⇒ ∃ a Hilbert space $(\mathcal{H}, \langle \cdot, \cdot \rangle)$ and $\phi : V \to \mathcal{H}$ such that
$$k^\beta(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle.$$
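For a finite graph, one possible feature map can be written explicitly from the eigendecomposition of the kernel matrix. The sketch below builds it and checks that inner products, and hence distances between vertices, are recovered from the kernel alone. This is one valid choice of φ among many; the function name and the toy graph are hypothetical.

```python
import numpy as np

def feature_map(K: np.ndarray) -> np.ndarray:
    """Rows are phi(x_i), chosen so that Phi @ Phi.T equals K (K symmetric PSD)."""
    eigenvalues, eigenvectors = np.linalg.eigh(K)
    eigenvalues = np.clip(eigenvalues, 0.0, None)  # guard against round-off
    return eigenvectors * np.sqrt(eigenvalues)

# Diffusion kernel (beta = 1) of the path graph x1 -- x2 -- x3.
L = np.array([[1.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 1.0]])
w, V = np.linalg.eigh(L)
K = (V * np.exp(-1.0 * w)) @ V.T

Phi = feature_map(K)

# Squared distance between x1 and x3, in feature space and from the kernel only.
d2_feature = np.sum((Phi[0] - Phi[2]) ** 2)
d2_kernel = K[0, 0] + K[2, 2] - 2.0 * K[0, 2]

print(np.allclose(Phi @ Phi.T, K))        # True
print(np.isclose(d2_feature, d2_kernel))  # True
```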
Kohonen map
Mapping the data onto a 2-dimensional map
Each neuron of the map, $i = 1, \ldots, M$, is associated with a
prototype $p_i \in \mathcal{H}$;
Neurons are related to each other by a neighborhood
relationship ("distance": $d$).
Classifying the vertices on the map
Each $x_i$ is associated with a neuron (cluster or class) of the map,
$f(x_i)$.
Preserving the initial topology
Energy
The goal is to minimize the energy of the map:
$$E = \sum_{i=1}^{M} \int h(d(f(x), i)) \, \|x - p_i\|^2_{\mathcal{H}} \, dP(x)$$
where $h$ is a decreasing function (e.g. $h(t) = \alpha e^{-t/2\sigma^2}$).
The energy is approximated by its empirical version:
$$E^n = \sum_{j=1}^{n} \sum_{i=1}^{M} h(d(f(x_j), i)) \, \|x_j - p_i\|^2_{\mathcal{H}}$$
and the minimization is carried out approximately by the SOM algorithm.
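Batch kernel SOM [Villa and Rossi, 2007]
Initialize randomly $\gamma^0_{ji} \in \mathbb{R}$ ($j = 1, \ldots, M$, $i = 1, \ldots, n$) and $p^0_j = \sum_{i=1}^{n} \gamma^0_{ji} \phi(x_i)$.
Then, for $l = 1, \ldots, n$ repeat
Assignment step
for all $x_i$,
$$f^l(x_i) = \arg\min_{j=1,\ldots,M} \left\| \phi(x_i) - \sum_{i'=1}^{n} \gamma^{l-1}_{ji'} \phi(x_{i'}) \right\|_{\mathcal{H}}$$
Representation step
$$\gamma^l_j = \arg\min_{\gamma \in \mathbb{R}^n} \sum_{i=1}^{n} h(d(f^l(x_i), j)) \left\| \phi(x_i) - \sum_{i'=1}^{n} \gamma_{i'} \phi(x_{i'}) \right\|^2_{\mathcal{H}}$$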
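Using the kernel trick, both steps only involve $k^\beta$:
Assignment step
for all $x_i$,
$$f^l(x_i) = \arg\min_{j=1,\ldots,M} \sum_{u,u'=1}^{n} \gamma^{l-1}_{ju} \gamma^{l-1}_{ju'} k^\beta(x_u, x_{u'}) - 2 \sum_{u=1}^{n} \gamma^{l-1}_{ju} k^\beta(x_u, x_i)$$
Representation step
$$\gamma^l_{ji} = \frac{h(d(f^l(x_i), j))}{\sum_{i'=1}^{n} h(d(f^l(x_{i'}), j))}$$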
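The two kernelized steps can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes a rectangular grid, d taken as the Euclidean distance between neurons on the grid, $h(t) = e^{-t/2\sigma^2}$ with α = 1 and a fixed σ, a fixed number of iterations, and random initial coefficients normalized to sum to one. The function name, its parameters and the toy graph are hypothetical.

```python
import numpy as np

def batch_kernel_som(K, grid_shape=(2, 2), n_iter=20, sigma=1.0, seed=0):
    """Batch kernel SOM on an n x n kernel matrix K.

    Returns (f, gamma): f[i] is the neuron assigned to x_i, and row j of gamma
    holds the coefficients of the prototype p_j = sum_i gamma[j, i] * phi(x_i).
    """
    n = K.shape[0]
    rows, cols = grid_shape
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    n_neurons = len(coords)
    # d: Euclidean distance between neurons on the grid (assumption).
    grid_dist = np.sqrt(((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2))
    h = np.exp(-grid_dist / (2.0 * sigma ** 2))  # h[j, j'] = h(d(j, j'))

    rng = np.random.default_rng(seed)
    gamma = rng.random((n_neurons, n))
    gamma /= gamma.sum(axis=1, keepdims=True)    # normalized start (assumption)

    for _ in range(n_iter):
        # Assignment step: squared distance to p_j, up to the constant k(x_i, x_i).
        quadratic = np.einsum("ju,uv,jv->j", gamma, K, gamma)
        cross = gamma @ K                        # cross[j, i] = sum_u gamma_ju k(x_u, x_i)
        f = np.argmin(quadratic[:, None] - 2.0 * cross, axis=0)
        # Representation step: gamma_ji = h(d(f(x_i), j)) / sum_i' h(d(f(x_i'), j)).
        weights = h[:, f]
        gamma = weights / weights.sum(axis=1, keepdims=True)
    return f, gamma

# Toy graph (hypothetical): two triangles joined by one weak edge.
W = np.zeros((6, 6))
for a, b, w in [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
                (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0), (2, 3, 0.1)]:
    W[a, b] = W[b, a] = w
L = np.diag(W.sum(axis=1)) - W
eigenvalues, V = np.linalg.eigh(L)
K = (V * np.exp(-1.0 * eigenvalues)) @ V.T       # diffusion kernel with beta = 1

f, gamma = batch_kernel_som(K, grid_shape=(2, 2))
print(f)  # index of the neuron assigned to each vertex
```

Only the kernel matrix is needed: the prototypes are never built explicitly, which is what allows the map to handle vertices of a graph rather than vectors.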
Results on a 7 × 7 rectangular map
Expected developments
1 Hierarchical clustering;
2 Achieve a classification based on a density criterion (joint work
with S. Gadat);
3 Adapting the algorithm to very large graphs (thousands of
vertices).
References
Boulet, R., Jouve, B., Rossi, F., and Villa, N. (2008).
Batch kernel SOM and related Laplacian methods for social network
analysis.
Neurocomputing.
To appear.
Villa, N. and Rossi, F. (2007).
A comparison between dissimilarity SOM and kernel SOM for clustering the
vertices of a graph.
In Proceedings of the 6th Workshop on Self-Organizing Maps (WSOM 07),
Bielefeld, Germany.
von Luxburg, U. (2007).
A tutorial on spectral clustering.
Technical Report TR-149, Max Planck Institut für biologische Kybernetik.
Available at http://www.kyb.mpg.de/publications/attachments/luxburg06_TR_v2_4139%5B1%5D.pdf.
Projets autour de l'Hi-CProjets autour de l'Hi-C
Projets autour de l'Hi-C
 
Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?
 
Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...
 
ASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiquesASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiques
 
Autour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWeanAutour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWean
 
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
 
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquesApprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
 
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
 
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
 
Journal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation dataJournal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation data
 
Overfitting or overparametrization?
Overfitting or overparametrization?Overfitting or overparametrization?
Overfitting or overparametrization?
 
Selective inference and single-cell differential analysis
Selective inference and single-cell differential analysisSelective inference and single-cell differential analysis
Selective inference and single-cell differential analysis
 
SOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricesSOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatrices
 
Graph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype PredictionGraph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype Prediction
 
Explanable models for time series with random forest
Explanable models for time series with random forestExplanable models for time series with random forest
Explanable models for time series with random forest
 
Présentation du projet ASTERICS
Présentation du projet ASTERICSPrésentation du projet ASTERICS
Présentation du projet ASTERICS
 
Présentation du projet ASTERICS
Présentation du projet ASTERICSPrésentation du projet ASTERICS
Présentation du projet ASTERICS
 

Recently uploaded

insect taxonomy importance systematics and classification
insect taxonomy importance systematics and classificationinsect taxonomy importance systematics and classification
insect taxonomy importance systematics and classification
anitaento25
 
extra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdfextra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdf
DiyaBiswas10
 
general properties of oerganologametal.ppt
general properties of oerganologametal.pptgeneral properties of oerganologametal.ppt
general properties of oerganologametal.ppt
IqrimaNabilatulhusni
 
insect morphology and physiology of insect
insect morphology and physiology of insectinsect morphology and physiology of insect
insect morphology and physiology of insect
anitaento25
 
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
Health Advances
 
Lateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensiveLateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensive
silvermistyshot
 
Anemia_ different types_causes_ conditions
Anemia_ different types_causes_ conditionsAnemia_ different types_causes_ conditions
Anemia_ different types_causes_ conditions
muralinath2
 
ESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptxESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptx
muralinath2
 
Penicillin...........................pptx
Penicillin...........................pptxPenicillin...........................pptx
Penicillin...........................pptx
Cherry
 
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
Scintica Instrumentation
 
EY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptxEY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptx
AlguinaldoKong
 
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
Sérgio Sacani
 
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
NathanBaughman3
 
Richard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlandsRichard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlands
Richard Gill
 
platelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptxplatelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptx
muralinath2
 
Seminar of U.V. Spectroscopy by SAMIR PANDA
 Seminar of U.V. Spectroscopy by SAMIR PANDA Seminar of U.V. Spectroscopy by SAMIR PANDA
Seminar of U.V. Spectroscopy by SAMIR PANDA
SAMIR PANDA
 
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Sérgio Sacani
 
GBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram StainingGBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram Staining
Areesha Ahmad
 
FAIR & AI Ready KGs for Explainable Predictions
FAIR & AI Ready KGs for Explainable PredictionsFAIR & AI Ready KGs for Explainable Predictions
FAIR & AI Ready KGs for Explainable Predictions
Michel Dumontier
 
Nutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technologyNutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technology
Lokesh Patil
 

Recently uploaded (20)

insect taxonomy importance systematics and classification
insect taxonomy importance systematics and classificationinsect taxonomy importance systematics and classification
insect taxonomy importance systematics and classification
 
extra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdfextra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdf
 
general properties of oerganologametal.ppt
general properties of oerganologametal.pptgeneral properties of oerganologametal.ppt
general properties of oerganologametal.ppt
 
insect morphology and physiology of insect
insect morphology and physiology of insectinsect morphology and physiology of insect
insect morphology and physiology of insect
 
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
 
Lateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensiveLateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensive
 
Anemia_ different types_causes_ conditions
Anemia_ different types_causes_ conditionsAnemia_ different types_causes_ conditions
Anemia_ different types_causes_ conditions
 
ESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptxESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptx
 
Penicillin...........................pptx
Penicillin...........................pptxPenicillin...........................pptx
Penicillin...........................pptx
 
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
 
EY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptxEY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptx
 
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
 
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
 
Richard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlandsRichard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlands
 
platelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptxplatelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptx
 
Seminar of U.V. Spectroscopy by SAMIR PANDA
 Seminar of U.V. Spectroscopy by SAMIR PANDA Seminar of U.V. Spectroscopy by SAMIR PANDA
Seminar of U.V. Spectroscopy by SAMIR PANDA
 
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
 
GBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram StainingGBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram Staining
 
FAIR & AI Ready KGs for Explainable Predictions
FAIR & AI Ready KGs for Explainable PredictionsFAIR & AI Ready KGs for Explainable Predictions
FAIR & AI Ready KGs for Explainable Predictions
 
Nutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technologyNutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technology
 

Graph mining with kernel self-organizing map

  • 1. Motivations Dissimilarities and distances between vertices Kernel SOM Application and comments Graph mining with kernel self-organizing map Nathalie Villa-Vialaneix http://www.nathalievilla.org Joint work with Fabrice Rossi, INRIA, Rocquencourt, France Institut de Mathématiques de Toulouse, - IUT de Carcassonne, Université de Perpignan France SanTouVal, February 1st, 2008 Nathalie Villa - nathalie.villa@math.univ-toulouse.fr SanTouVal - Feb. 2008
  • 2. Table of contents: 1 Motivations; 2 Dissimilarities and distances between vertices; 3 Kernel SOM; 4 Application and comments.
  • 5. Exploring a big historic database. Data: 1000 agrarian contracts, from four seigneuries (about 10 villages) in the South West of France, established between 1250 and 1350 (before the Hundred Years' War). Historian's questions: family or geographical social links? Central people having a main social role? ... ⇒ Data mining is required.
  • 9. A graph clustering problem. From the database, a weighted graph is built: with 615 vertices $x_1, \ldots, x_n$ := peasants found in the contracts; with weights $(w_{i,j})_{i,j=1,\ldots,n}$, $w_{i,j} := |\{\text{contracts where } x_i \text{ and } x_j \text{ are mentioned}\}|$. Number of vertices: 615. Number of edges: 4193. Total of weights: 40 329. Diameter: 10. Density: 2.2%. Goal: clustering the vertices into homogeneous social groups to understand the structure of the peasant community.
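The construction of the weighted graph can be sketched as follows. This is a minimal illustration, assuming each contract is available as the set of peasants it mentions; the names and the three toy contracts are hypothetical, and `build_weights` is a helper introduced here, not something taken from the slides. The density is computed as the share of the $n(n-1)/2$ possible edges that are present, which agrees with the figures above (4193 edges among 615 vertices gives about 2.2%).

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each contract is the set of peasants it mentions.
contracts = [
    {"Arnaud", "Bernard", "Guilhem"},
    {"Arnaud", "Bernard"},
    {"Guilhem", "Peire"},
]


def build_weights(contracts):
    """Return {(x_i, x_j): w_ij}, w_ij being the number of shared contracts."""
    weights = Counter()
    for people in contracts:
        # Sort so that each unordered pair is stored under a single key.
        for a, b in combinations(sorted(people), 2):
            weights[(a, b)] += 1
    return weights


weights = build_weights(contracts)
vertices = sorted(set().union(*contracts))
n_edges = len(weights)
total_weight = sum(weights.values())
# Density: share of the n(n-1)/2 possible edges that are actually present.
density = n_edges / (len(vertices) * (len(vertices) - 1) / 2)
```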
  • 12. Other fields modelled by large graphs. Computer science: World Wide Web, P2P networks, ... Social networks. Biology: protein interactions, neuronal networks, ... Business, management: transportation networks, industry partnerships, ... Question: understanding the structure of these large graphs. Clustering: building relevant homogeneous groups; graph drawing: giving a global representation of the graph. Here: Self-Organizing Map for nonvectorial data.
  • 13. Table of contents (section 2: Dissimilarities and distances between vertices).
  • 16. Usual dissimilarities between vertices. The Dice (Jaccard) index, for non-weighted graphs, with $\Gamma(x)$ the set of neighbors of $x$: $D(x_i, x_j) = \frac{|\Gamma(x_i) \cap \Gamma(x_j)|}{|\Gamma(x_i)| + |\Gamma(x_j)|}$; dissimilarities based on the shortest paths; dissimilarities or distances based on the Laplacian matrix: spectral clustering.
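As a small illustration of the first item, the index can be computed directly from neighbor sets. This sketch follows the formula as written on the slide (the usual Dice coefficient carries an extra factor 2 in the numerator); the toy adjacency and the function name `dice_index` are assumptions made for the example.

```python
def dice_index(neighbors, xi, xj):
    """|Γ(xi) ∩ Γ(xj)| / (|Γ(xi)| + |Γ(xj)|), as written on the slide."""
    gi, gj = neighbors[xi], neighbors[xj]
    if not gi and not gj:
        return 0.0  # convention chosen here for two isolated vertices
    return len(gi & gj) / (len(gi) + len(gj))


# Hypothetical non-weighted graph given by its neighbor sets.
neighbors = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
d12 = dice_index(neighbors, 1, 2)  # one shared neighbor (3) over 2 + 2
d14 = dice_index(neighbors, 1, 4)  # one shared neighbor (3) over 2 + 1
```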
  • 17. Laplacian. Definitions: for a graph with vertices $V = \{x_1, \ldots, x_n\}$ having positive weights $(w_{i,j})_{i,j=1,\ldots,n}$ such that, for all $i, j = 1, \ldots, n$, $w_{i,j} = w_{j,i}$, and with $d_i = \sum_{j=1}^n w_{i,j}$, the Laplacian is $L = (L_{i,j})_{i,j=1,\ldots,n}$ where $L_{i,j} = \begin{cases} -w_{i,j} & \text{if } i \neq j \\ d_i & \text{if } i = j \end{cases}$.
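The definition translates into a few lines of code. A minimal sketch, assuming NumPy is available and the graph is given as a symmetric weight matrix; the 3-vertex weights are hypothetical.

```python
import numpy as np


def laplacian(W):
    """L[i, j] = -w_ij for i != j and L[i, i] = d_i = sum_j w_ij."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)          # degrees d_i
    L = -W                     # off-diagonal entries: -w_ij
    np.fill_diagonal(L, d)     # diagonal entries: d_i
    return L


# Toy weighted path graph on 3 vertices (hypothetical weights).
W = np.array([[0.0, 2.0, 0.0],
              [2.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
L = laplacian(W)
```

With a zero diagonal in `W`, every row of `L` sums to zero, which is why constant vectors lie in the kernel of the Laplacian.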
  • 18. Laplacian: property I [von Luxburg, 2007]. Connected subgraphs: $\operatorname{Ker} L = \operatorname{Span}\{\mathbf{1}_{A_1}, \ldots, \mathbf{1}_{A_k}\}$ where $A_i$ indicates the positions of the vertices of the $i$th connected component of the graph. Example (graph on vertices 1 to 5 whose connected components are {1, 4, 5} and {2, 3}): $\operatorname{Ker} L = \operatorname{Span}\{(1, 0, 0, 1, 1)^T, (0, 1, 1, 0, 0)^T\}$.
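The property can be checked numerically on the 5-vertex example. This is a sketch assuming NumPy; the components {1, 4, 5} and {2, 3} come from the slide, but the edges inside each component are not recoverable from the transcript, so the ones used below are an assumption.

```python
import numpy as np

# Assumed edges: 1-4 and 4-5 in the first component, 2-3 in the second.
W = np.zeros((5, 5))
for i, j in [(1, 4), (4, 5), (2, 3)]:
    W[i - 1, j - 1] = W[j - 1, i - 1] = 1.0
L = np.diag(W.sum(axis=1)) - W

# Indicator vectors of the two connected components.
ind_A1 = np.array([1.0, 0.0, 0.0, 1.0, 1.0])
ind_A2 = np.array([0.0, 1.0, 1.0, 0.0, 0.0])

# The dimension of Ker L is the number of (numerically) zero eigenvalues.
eigenvalues = np.linalg.eigvalsh(L)
n_zero = int(np.sum(np.abs(eigenvalues) < 1e-10))
```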
  • 19. Laplacian: property II [Boulet et al., 2008]. Perfect community: complete subgraph (clique) whose vertices share the same neighbors outside the clique. Laplacian and perfect communities: for a non-weighted graph, the graph has a perfect community with $m$ vertices ⇔ $L$ has $m$ eigenvectors such that each eigenvector has the same $n - m$ coordinates that vanish.
  • 21. Laplacian: property II [Boulet et al., 2008]. Application: (figure not recovered). But: only 1/3 of the graph can be drawn this way.
  • 23. Laplacian: property III [von Luxburg, 2007]. Min Cut problem: suppose that we have a connected graph. Finding a classification of the vertices of the graph, $A_1, \ldots, A_k$, such that $\frac{1}{2} \sum_{i=1}^k \sum_{j \in A_i,\, j' \notin A_i} w_{j,j'}$ is minimum is equivalent to solving $H = \arg\min_{h \in \mathbb{R}^{n \times k}} \operatorname{Tr}(h^T L h)$ subject to $h^T h = I$ and $h_i = \frac{1}{\sqrt{|A_i|}} \mathbf{1}_{A_i}$ (with $h_i$ the $i$th column of $h$) ⇒ NP-complete problem.
  • 25. Laplacian: property III [von Luxburg, 2007]. Min Cut problem: suppose that we have a connected graph. Finding a classification of the vertices of the graph, $A_1, \ldots, A_k$, such that $\frac{1}{2} \sum_{i=1}^k \sum_{j \in A_i,\, j' \notin A_i} w_{j,j'}$ is minimum can be approximated by the relaxed problem $H = \arg\min_{h \in \mathbb{R}^{n \times k}} \operatorname{Tr}(h^T L h)$ subject to $h^T h = I$. Spectral clustering: compute the eigenvectors of $L$ associated with its $k$ smallest eigenvalues, store them as the columns of $H$, and cluster the rows of $H$.
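The spectral clustering recipe can be sketched as follows, assuming NumPy. The slide does not say which method clusters the rows of $H$; a small k-means with a deterministic farthest-point initialisation is used here as an illustrative choice, and the toy graph (two triangles joined by one weak edge) is hypothetical.

```python
import numpy as np


def spectral_embedding(W, k):
    """Columns of H: eigenvectors of L for its k smallest eigenvalues."""
    L = np.diag(W.sum(axis=1)) - W
    # eigh returns the eigenvalues of a symmetric matrix in ascending order.
    _, vectors = np.linalg.eigh(L)
    return vectors[:, :k]


def kmeans(X, k, n_iter=20):
    """Plain k-means on the rows of X (illustrative, not from the slides)."""
    # Farthest-point initialisation keeps the example deterministic.
    centers = [X[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((X[:, None, :] - centers[None]) ** 2).sum(axis=2)  # (n, k)
        labels = np.argmin(sq, axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels


# Two triangles {0, 1, 2} and {3, 4, 5} joined by a weak edge 2-3.
W = np.zeros((6, 6))
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
         (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0), (2, 3, 0.1)]
for i, j, w in edges:
    W[i, j] = W[j, i] = w

H = spectral_embedding(W, k=2)
labels = kmeans(H, k=2)
```

On this toy graph the second column of `H` changes sign between the two triangles, so clustering the rows recovers them.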
  • 26. A regularized version of $L$. Regularization: the diffusion matrix: for $\beta > 0$, $K^\beta = e^{-\beta L} = \sum_{k=0}^{+\infty} \frac{(-\beta L)^k}{k!}$. ⇒ $k^\beta : V \times V \to \mathbb{R}$, $(x_i, x_j) \mapsto K^\beta_{i,j}$, is the diffusion kernel (or heat kernel).
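A minimal sketch of the diffusion matrix, assuming NumPy. Since $L$ is symmetric, $e^{-\beta L}$ can be obtained from its eigendecomposition; the truncated power series (starting at the identity term) is included only as a cross-check. The toy weights and the function names are assumptions made for the example.

```python
import numpy as np


def diffusion_kernel(L, beta):
    """K^beta = exp(-beta L), via the eigendecomposition of the symmetric L."""
    eigenvalues, U = np.linalg.eigh(L)
    # U diag(exp(-beta * lambda)) U^T
    return (U * np.exp(-beta * eigenvalues)) @ U.T


def diffusion_kernel_series(L, beta, n_terms=40):
    """Truncated power series: sum over k < n_terms of (-beta L)^k / k!."""
    term = np.eye(L.shape[0])  # k = 0 term: the identity
    total = term.copy()
    for k in range(1, n_terms):
        term = term @ (-beta * L) / k
        total += term
    return total


# Toy weighted path graph on 3 vertices (hypothetical weights).
W = np.array([[0.0, 2.0, 0.0],
              [2.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
L = np.diag(W.sum(axis=1)) - W
K = diffusion_kernel(L, beta=0.5)
K_series = diffusion_kernel_series(L, beta=0.5)
```

The eigendecomposition route is usually preferable to the series, which needs more terms as $\beta$ or the degrees grow.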
  • 28. Diffusion process on the graph. If $Z_0 = (1\ 1\ 1 \ldots 1\ 1)^T$ is the "energy" of each vertex at time 0 and if a small fraction $\varepsilon$ of this energy is propagated along the edges of the graph at each time step, then after $t$ steps, the energy of the vertices of the graph is $Z_t = (I - \varepsilon L)^t Z_0$. Limits: replacing the time step by $\Delta t$, that is $t \to t/\Delta t$ and $\varepsilon \to \varepsilon \Delta t$, then letting $\Delta t \to 0$ (continuous process) gives $\lim Z_t = e^{-\varepsilon t L} Z_0 = K^{\varepsilon t} Z_0$.
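The limit can be observed numerically. This sketch assumes NumPy and the sign convention $I - \varepsilon L$, which is the one consistent with $K^\beta = e^{-\beta L}$; the symbol $\varepsilon$ for the propagated fraction, the toy graph and the chosen values are assumptions made for the illustration.

```python
import numpy as np

# Toy weighted path graph on 3 vertices (hypothetical weights).
W = np.array([[0.0, 2.0, 0.0],
              [2.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
L = np.diag(W.sum(axis=1)) - W
eps, t = 0.1, 5.0


def discrete_diffusion(L, eps, t, dt):
    """(I - eps * dt * L)^(t / dt): the discrete process with time step dt."""
    steps = int(round(t / dt))
    step_matrix = np.eye(L.shape[0]) - eps * dt * L
    return np.linalg.matrix_power(step_matrix, steps)


# Continuous limit exp(-eps * t * L), from the eigendecomposition of L.
eigenvalues, U = np.linalg.eigh(L)
K_limit = (U * np.exp(-eps * t * eigenvalues)) @ U.T

err_coarse = np.abs(discrete_diffusion(L, eps, t, 1.0) - K_limit).max()
err_fine = np.abs(discrete_diffusion(L, eps, t, 0.01) - K_limit).max()

# A uniform initial energy is left unchanged, because L annihilates constants.
Z0 = np.ones(3)
Zt = discrete_diffusion(L, eps, t, 0.01) @ Z0
```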
  • 31. Properties. 1. Diffusion on the graph: $k^\beta(x_i, x_j)$ is the quantity of energy accumulated in $x_j$ after a given time if energy 1 is injected in $x_i$ at time 0 and if diffusion is done continuously along the edges; $\beta$ is the intensity of diffusion. 2. Regularization operator: for $u \in \mathbb{R}^n \sim V$, the penalty $u^T (K^\beta)^{-1} u$ is higher for vectors $u$ that vary a lot over "close" vertices of the graph; $\beta$ is the intensity of regularization (for small $\beta$, direct neighbors are more important). 3. Reproducing kernel property: $k^\beta$ is symmetric and positive ⇒ there exist a Hilbert space $(\mathcal{H}, \langle \cdot, \cdot \rangle)$ and $\phi : V \to \mathcal{H}$ such that $k^\beta(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle$.
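For a finite graph, the reproducing kernel property can be made explicit. A sketch assuming NumPy: with $L = U \Lambda U^T$, taking $\phi(x_i)$ as the $i$th row of $U e^{-\beta \Lambda / 2}$ gives one possible feature map into $\mathbb{R}^n$ (an illustrative choice, since the feature map is not unique). The toy path graph is hypothetical.

```python
import numpy as np

# Toy non-weighted path graph 0 - 1 - 2 - 3.
W = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3)]:
    W[i, j] = W[j, i] = 1.0
L = np.diag(W.sum(axis=1)) - W
beta = 1.0

eigenvalues, U = np.linalg.eigh(L)
K = (U * np.exp(-beta * eigenvalues)) @ U.T

# One explicit feature map: row i of Phi plays the role of phi(x_i).
Phi = U * np.exp(-beta * eigenvalues / 2.0)

# Squared distance between two vertices in the feature space, two ways.
dist_from_phi = np.sum((Phi[0] - Phi[3]) ** 2)
dist_from_kernel = K[0, 0] + K[3, 3] - 2.0 * K[0, 3]
```

The last two lines show that distances in $\mathcal{H}$ only require kernel values, so $\phi$ never has to be computed in practice.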
  • 32. Table of contents (section 3: Kernel SOM).
  • 33. Kohonen map. Mapping the data onto a 2-dimensional map: each neuron of the map, $i = 1, \ldots, M$, is associated to a prototype $p_i \in \mathcal{H}$; neurons are related to each other by a neighborhood relationship ("distance": $d$). Classifying the vertices on the map: each $x_i$ is associated to a neuron (cluster or class) of the map, $f(x_i)$.
  • 35. Preserving the initial topology. Energy: the goal is to minimize the energy of the map, $E = \sum_{i=1}^M \int h(d(f(x), i)) \, \|x - p_i\|^2_{\mathcal{H}} \, dP(x)$, where $h$ is a decreasing function (e.g. $h(t) = \alpha e^{-t/(2\sigma^2)}$). The energy is approximated by its empirical version, $E_n = \sum_{j=1}^n \sum_{i=1}^M h(d(f(x_j), i)) \, \|x_j - p_i\|^2_{\mathcal{H}}$, and the minimization is approximated by the SOM algorithm.
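The empirical energy can be evaluated directly in the ordinary vector case. This sketch assumes NumPy, data and prototypes in $\mathbb{R}^p$, neurons located at grid coordinates with $d$ the Euclidean distance on the grid, and $f$ assigning each observation to its closest prototype; all of these are illustrative assumptions, as is the function name `som_energy`.

```python
import numpy as np


def som_energy(X, prototypes, grid, alpha=1.0, sigma=1.0):
    """E_n = sum_j sum_i h(d(f(x_j), i)) * ||x_j - p_i||^2."""
    # sq[j, i] = ||x_j - p_i||^2
    sq = ((X[:, None, :] - prototypes[None]) ** 2).sum(axis=2)
    f = sq.argmin(axis=1)  # f(x_j): index of the closest prototype
    # d[j, i] = distance on the map between neuron f(x_j) and neuron i
    d = np.linalg.norm(grid[f][:, None, :] - grid[None], axis=2)
    h = alpha * np.exp(-d / (2.0 * sigma ** 2))
    return float((h * sq).sum())


# Two observations, two neurons side by side on a 1 x 2 map.
X = np.array([[0.0, 0.0], [2.0, 0.0]])
prototypes = X.copy()
grid = np.array([[0.0, 0.0], [1.0, 0.0]])
energy = som_energy(X, prototypes, grid)
```

Even with each prototype sitting exactly on one observation, the energy is not zero: the neighborhood function also charges each observation for its distance to the prototypes of neighboring neurons, which is what ties the clustering to the map topology.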
  • 36. Motivations Dissimilarities and distances between vertices Kernel SOM Application and comments Batch kernel SOM [Villa and Rossi, 2007] Initialize randomly γ0 ji ∈ R (i, j = 1, . . . , n) and p0 j = n i=1 γ0 ji φ(xi). Then, for l = 1, . . . , n repeat Nathalie Villa - nathalie.villa@math.univ-toulouse.fr SanTouVal - Feb. 2008
  • 37. Motivations Dissimilarities and distances between vertices Kernel SOM Application and comments Batch kernel SOM [Villa and Rossi, 2007] Initialize randomly γ0 ji ∈ R (i, j = 1, . . . , n) and p0 j = n i=1 γ0 ji φ(xi). Then, for l = 1, . . . , n repeat Assignment step for all xi, fl (xi) = arg min j=1,...,M φ(xi) − n i=1 γl jiφ(xi) H Nathalie Villa - nathalie.villa@math.univ-toulouse.fr SanTouVal - Feb. 2008
  • 38. Batch kernel SOM [Villa and Rossi, 2007]. Initialize randomly $\gamma^0_{ji} \in \mathbb{R}$ ($j = 1, \ldots, M$; $i = 1, \ldots, n$) and set $p^0_j = \sum_{i=1}^{n} \gamma^0_{ji}\, \phi(x_i)$. Then, for $l = 1, \ldots, n$, repeat:
    Assignment step: for all $x_i$, $f^l(x_i) = \arg\min_{j=1,\ldots,M} \big\| \phi(x_i) - \sum_{u=1}^{n} \gamma^{l-1}_{ju}\, \phi(x_u) \big\|_{\mathcal{H}}$.
    Representation step: for all $j$, $\gamma^l_j = \arg\min_{\gamma \in \mathbb{R}^n} \sum_{i=1}^{n} h\big(d(f^l(x_i), j)\big)\, \big\| \phi(x_i) - \sum_{u=1}^{n} \gamma_u\, \phi(x_u) \big\|_{\mathcal{H}}^2$.
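Both steps can be evaluated without ever computing $\phi$. The derivation below is a sketch that assumes only that $\phi$ is the feature map of a kernel $k$, i.e. $k(x, x') = \langle \phi(x), \phi(x') \rangle_{\mathcal{H}}$; the shorthand $w_i$ and $p$ is introduced here for illustration and does not appear on the slides.

```latex
% Assignment step: expand the squared norm with the kernel.
\begin{align*}
\Big\| \phi(x_i) - \sum_{u=1}^{n} \gamma_{ju}\, \phi(x_u) \Big\|_{\mathcal{H}}^2
  &= k(x_i, x_i)
     - 2 \sum_{u=1}^{n} \gamma_{ju}\, k(x_u, x_i)
     + \sum_{u,u'=1}^{n} \gamma_{ju} \gamma_{ju'}\, k(x_u, x_{u'}).
\end{align*}
% k(x_i, x_i) does not depend on j, so it can be dropped from the arg min over j.

% Representation step: write w_i = h(d(f^l(x_i), j)) and p = \sum_u \gamma_u \phi(x_u).
\begin{align*}
\nabla_p \sum_{i=1}^{n} w_i\, \| \phi(x_i) - p \|_{\mathcal{H}}^2
  = -2 \sum_{i=1}^{n} w_i \big( \phi(x_i) - p \big) = 0
  \quad\Longrightarrow\quad
  p = \frac{\sum_{i=1}^{n} w_i\, \phi(x_i)}{\sum_{i'=1}^{n} w_{i'}},
  \qquad
  \gamma_i = \frac{w_i}{\sum_{i'=1}^{n} w_{i'}}.
\end{align*}
```

The minimizing prototype $p$ is unique because the objective is strictly convex in $p$ when the weights are positive; the coefficients $\gamma$ are unique only if the $\phi(x_i)$ are linearly independent.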
  • 39. Batch kernel SOM [Villa and Rossi, 2007]. Initialize randomly $\gamma^0_{ji} \in \mathbb{R}$ ($j = 1, \ldots, M$; $i = 1, \ldots, n$) and set $p^0_j = \sum_{i=1}^{n} \gamma^0_{ji}\, \phi(x_i)$. Then, for $l = 1, \ldots, n$, repeat:
    Assignment step: for all $x_i$, $f^l(x_i) = \arg\min_{j=1,\ldots,M} \Big( \sum_{u,u'=1}^{n} \gamma^{l-1}_{ju} \gamma^{l-1}_{ju'}\, k^\beta(x_u, x_{u'}) - 2 \sum_{u=1}^{n} \gamma^{l-1}_{ju}\, k^\beta(x_u, x_i) \Big)$.
    Representation step: $\gamma^l_{ji} = \dfrac{h\big(d(f^l(x_i), j)\big)}{\sum_{i'=1}^{n} h\big(d(f^l(x_{i'}), j)\big)}$.
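The kernelized iteration on this slide can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation, and it rests on several assumptions that the slides do not state: the map distance $d$ is the Euclidean distance between grid positions, neurons are numbered from 0, each initial row of $\gamma$ is normalized to sum to 1, the number of iterations `n_iter` and the width `sigma` are fixed by the caller, and `K[u][v]` stands for $k^\beta(x_u, x_v)$ given as any symmetric positive semi-definite matrix. The toy kernel in the usage lines is a linear kernel on six numbers, chosen only to keep the example small.

```python
import math
import random

def grid_distance(a, b, n_cols):
    """Assumed map distance d: Euclidean distance between row-major grid positions."""
    ra, ca = divmod(a, n_cols)
    rb, cb = divmod(b, n_cols)
    return math.hypot(ra - rb, ca - cb)

def batch_kernel_som(K, n_rows, n_cols, n_iter=10, alpha=1.0, sigma=1.0, seed=0):
    """Return (f, gamma): f[i] is the neuron of x_i, gamma[j] the coefficients of prototype j."""
    n, M = len(K), n_rows * n_cols
    rng = random.Random(seed)
    # Random initial coefficients; normalizing each row is an assumption of this sketch.
    gamma = []
    for _ in range(M):
        row = [rng.random() for _ in range(n)]
        total = sum(row)
        gamma.append([g / total for g in row])
    f = [0] * n
    for _ in range(n_iter):
        # Assignment step: minimize quad[j] - 2 * sum_u gamma[j][u] * K[u][i] over j.
        quad = [sum(g[u] * g[v] * K[u][v] for u in range(n) for v in range(n))
                for g in gamma]
        for i in range(n):
            costs = [quad[j] - 2.0 * sum(gamma[j][u] * K[u][i] for u in range(n))
                     for j in range(M)]
            f[i] = min(range(M), key=costs.__getitem__)
        # Representation step: gamma[j][i] = h(d(f(x_i), j)) / sum_i' h(d(f(x_i'), j)).
        for j in range(M):
            w = [alpha * math.exp(-grid_distance(f[i], j, n_cols) / (2.0 * sigma ** 2))
                 for i in range(n)]
            total = sum(w)
            gamma[j] = [wi / total for wi in w]
    return f, gamma

# Toy usage: two well-separated groups of numbers, linear kernel, 1 x 2 map.
xs = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]
K = [[a * b for b in xs] for a in xs]
f, gamma = batch_kernel_som(K, n_rows=1, n_cols=2, sigma=0.5)
print(f)  # the first three points share one neuron, the last three the other
```

The quadratic term is computed once per neuron and per iteration rather than once per pair (vertex, neuron), which is what keeps the assignment step at $O(Mn^2)$. In practice the neighborhood width is often decreased over the iterations; that refinement is left out here.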
  • 40. Table of contents: 1 Motivations; 2 Dissimilarities and distances between vertices; 3 Kernel SOM; 4 Application and comments.
  • 41. Results on a 7 × 7 rectangular map
  • 44. Expected developments: 1 Hierarchical clustering; 2 A classification based on a density criterion (joint work with S. Gadat); 3 Adapting the algorithm to very large graphs (thousands of vertices).
  • 45. References
    Boulet, R., Jouve, B., Rossi, F., and Villa, N. (2008). Batch kernel SOM and related Laplacian methods for social network analysis. Neurocomputing. To appear.
    Villa, N. and Rossi, F. (2007). A comparison between dissimilarity SOM and kernel SOM for clustering the vertices of a graph. In Proceedings of the 6th Workshop on Self-Organizing Maps (WSOM 07), Bielefeld, Germany.
    von Luxburg, U. (2007). A tutorial on spectral clustering. Technical Report TR-149, Max Planck Institut für biologische Kybernetik. Available at http://www.kyb.mpg.de/publications/attachments/luxburg06_TR_v2_4139%5B1%5D.pdf.