SlideShare a Scribd company logo
1 of 26
Download to read offline
Point of View Based
Clustering of
Socio-Semantic Networks
Juan David CRUZ1
Cécile BOTHOREL1
François POULET2
JT – FGG2010
Outline

   1     Introduction
            Social Networks and Points of View
            Some Previous Work

   2     The Point of View of Social Networks

   3     Influencing the Community Detection with the Point of View
           Phase 1
           Phase 2

   4     Preliminary Experiments and Results

   5     Conclusion


page 2          Cruz,Bothorel,Poulet               Clustering by Point of View
Motivation
Introduction   Social Networks and Points of View



               Socio–semantic networks are contains both:
                   •   The social graph
                   •   Semantic information represented by the features of the
                       vertices and the edges.
               Given this information, how to identify communities derived
               from the conjoint use of it?
               It is necessary to measure the quality of the partitions found
               using this information from two perspectives:
                   •   The quality of the graph clustering
                   •   The quality of the semantic information within the
                       communities




    page 3              Cruz,Bothorel,Poulet                    Clustering by Point of View
Social Networks and Points of View
Introduction   Social Networks and Points of View




         It is possible to obtain different partitions from different points
                                       of view



    page 4              Cruz,Bothorel,Poulet              Clustering by Point of View
Quality Measures
Introduction    Some Previous Work




               Type                    Objective                   Examples
                             Reduce the distance between
                                                                Manhattan L1
                                the members of the same
          Similarity                                            Euclidean L2
                                 group while the distance
                                                                Chebyshev L∞
                              between groups is increased.
                             Increase the number of edges
                             within each community while          Coverage
                            the number of edges between          Conductance
             Quality
                               communities is reduced. In        Performance
                            general: index (C) = f (C)+g)(C)
                                                      N(G        Modularity
                                        [Gae05]


    page 5              Cruz,Bothorel,Poulet                   Clustering by Point of View
Graph Clustering Algorithms
Introduction   Some Previous Work



       Several graph clustering algorithms have been developed, among
       others:
               Newman [New01] (Modularity optimization)
               Fast unfolding [BGLL08] (Modularity optimization)
               Maximal cliques enumeration and kernel generation [DWP+ 07]
               (Modularity optimization)
               Genetic algorithm for detecting communities in large graphs
               [LM09] (Fitness function based on modularity)
               Genetic algorithm for detecting overlapped communities
               [Piz09] (Fitness function based on internal edges vs. outgoing
               edges)



    page 6             Cruz,Bothorel,Poulet                Clustering by Point of View
The Representation of a Point of View
The Point of View of Social Networks




                The semantic information can be described by a set of features
                The point of view is PoVi = si × V

                                               Point of View
             Nodes         Feature 1          Feature 2 . . .    Feature f
             Node 1            1                  0        ...       0
             Node 2            0                  1        ...       1
               .
               .               .
                               .                   .
                                                   .                 .
                                                                     .
               .               .                   .                 .
             Node n               1               0        ...       1

                The assignation of features to each node in the network


    page 7             Cruz,Bothorel,Poulet                         Clustering by Point of View
General Architecture
Influencing the Community Detection with the Point of View



              Guide the community detection algorithm according to
              semantic information.




              Use of clustering techniques from different domains.


    page 8             Cruz,Bothorel,Poulet                 Clustering by Point of View
Semantic Clustering
Influencing the Community Detection with the Point of View   Phase 1




    page 9             Cruz,Bothorel,Poulet                           Clustering by Point of View
Semantic Clustering
Influencing the Community Detection with the Point of View   Phase 1



              Clustering of the defined point of view: search nodes with
              similar instances of features.
              Use of Self–Organizing Maps (SOM): non–supervised machine
              learning method [Koh97].
              The proximity between the input vector (instance) and the
              weight vector of the network is measured with the Euclidean
              distance.
              The SOM algorithm will find some number of groups.




   page 10             Cruz,Bothorel,Poulet                           Clustering by Point of View
Weights Assignation and Community
                 Detection
Influencing the Community Detection with the Point of View   Phase 2




   page 11             Cruz,Bothorel,Poulet                           Clustering by Point of View
Weights Assignation and Community
                 Detection
Influencing the Community Detection with the Point of View   Phase 2



              Given the trained SOM network N and a graph G (V , E ):
              For each e (i, j) ∈ E , the weight will be changed according to:

                                              wij = 1 + αd (Nij ) δij

              where α ≥ 1 is constant parameter, d (Nij ), is the distance
              between the node i and the node j in the SOM network and
              δij = 1 if i, j belong to the same group in the SOM network.
              After the weights are set, a classic graph clustering algorithm
              (the fast unfolding algorithm [BGLL08]) is used.




   page 12             Cruz,Bothorel,Poulet                             Clustering by Point of View
Experiments Configuration I
Preliminary Experiments and Results



              In each experiment three algorithm were compared:
                   •   SOM
                   •   Fast unfolding
                   •   Our method
              Performed in two levels:
                   •   The final modularity: to measure the quality of the partition.
                   •   The average intra–cluster Euclidean distance: to measure the
                       quality of the semantic clustering.
              The experiment were executed using a graph of 5389 nodes
              and 27347 edges extracted from a Twitter data set. The initial
              modularity of this graph is −2.5192 × 10−3



    page 13             Cruz,Bothorel,Poulet                     Clustering by Point of View
Experiments Configuration II
Preliminary Experiments and Results


              The experiments were performed using two different points of
              view.
              Point of View 1:
                   •   Composed of 33 features. Each feature represents a time zone
                       from the Twitter data set.
                   •   A feature will be set to 1 if the node has at least one friend in
                       the time zone represented√ the feature.
                                                   by
                   •   Distances vary from 0 to 32
              Point of View 2:
                   •   Composed of 4 features representing the messaging profile of
                       each user.
                   •   The first feature is set to 1 if the user has more friends than
                       followers.


    page 14             Cruz,Bothorel,Poulet                       Clustering by Point of View
Experiments Configuration III
Preliminary Experiments and Results


                   •   The next three features indicate the user behavior according to
                       the number of messages sent: below the mean, between the
                       mean plus three standard deviations and, over mean plus three
                       standard deviations.      √
                   •   Distances vary from 0 to 3




    page 15             Cruz,Bothorel,Poulet                     Clustering by Point of View
Case Twitter – Point of View 1
Preliminary Experiments and Results


             Experiment                         Final Q      Avg. Intracluster Distance
             SOM Graph                         −7.5 × 10−3             0.3697
         Graph based clustering                  0.5728                1.8091
         PoV based Clustering                    0.5747                1.1947

              The average intracluster distance found by our proposed
              method is less than the average intracluster distance found by
              the graph based algorithm.
              The modularity obtained is very similar: the point of view uses
              information associated with the localization of people’s friends.
              The modularity of the graph from the SOM clustering is not
              very different from the modularity of the original graph.
              SOM groups are close to the structure of the non–clustered
              graph.
    page 16             Cruz,Bothorel,Poulet                       Clustering by Point of View
Case Twitter – Point of View 2
Preliminary Experiments and Results


             Experiment                        Final Q   Avg. Intracluster Distance
             SOM Graph                         -0.2991                0
           Classic clustering                   0.5728             0.7100
         PoV based Clustering                   0.6351             0.5507
              The SOM clustered the nodes into six groups, each one
              expressing one of the possible instances described above. This
              explains the average distance found.
              Creating a graph from the SOM clustering will produce better
              semantic clusters, however, the modularity is worst than the
              one from the original graph.
              The SOM groups are totally unrelated with the structure of
              the graph.
              Regarding the modularity and the average intracluster distance,
              the performance of the PoV based algorithm was better.
    page 17             Cruz,Bothorel,Poulet                        Clustering by Point of View
Conclusion I
Conclusion



             The classic community detection algorithms do not take into
             account the semantic information to influence the clustering
             process.
             Changing the weights according to the results of the semantic
             clustering, the semantic information is included into the
             community detection process.
             The two types of informations are merged to find and
             visualize a social network from a selected point of view.




   page 18         Cruz,Bothorel,Poulet                 Clustering by Point of View
Conclusion II
Conclusion


             Outlook
               •   Make tests over the obtained partitions: rand index, robustness
                   tests...
               •   Study the case of overlapping communities.
               •   Include the features of the edges into the point of view
                   generation.
               •   Development of a visualization algorithm for representing the
                   PoV and the transition between two points of view.




   page 19          Cruz,Bothorel,Poulet                     Clustering by Point of View
page 20   Cruz,Bothorel,Poulet   Clustering by Point of View
nav


                  Thanks for your attention
Appendix     Appendix




                                               Questions?




   page 20              Cruz,Bothorel,Poulet                Clustering by Point of View
For Further Reading I
Appendix     Appendix   For Further Reading



              Vincent D Blondel, Jean-Loup Guillaume, Renaud Lambiotte,
              and Etienne Lefebvre.
              Fast unfolding of communities in large networks.
              Journal of Statistical Mechanics: Theory and Experiment,
              2008(10):P10008 (12pp), 2008.
              Nan Du, Bin Wu, Xin Pei, Bai Wang, and Liutong Xu.
              Community detection in large-scale social networks.
              In WebKDD/SNA-KDD ’07: Proceedings of the 9th WebKDD
              and 1st SNA-KDD 2007 workshop on Web mining and social
              network analysis, pages 16–25, New York, NY, USA, 2007.
              ACM.



   page 21              Cruz,Bothorel,Poulet           Clustering by Point of View
For Further Reading II
Appendix     Appendix   For Further Reading



              Marco Gaetler.
              Network Analysis: Methodological Foundations, chapter
              Clustering, pages 178 – 215.
              Springer Berlin / Heidelberg, 2005.
              Teuvo Kohonen.
              Self-Organizing Maps.
              Springer, 1997.




   page 22              Cruz,Bothorel,Poulet            Clustering by Point of View
For Further Reading III
Appendix     Appendix   For Further Reading



              Marek Lipczak and Evangelos Milios.
              Agglomerative genetic algorithm for clustering in social
              networks.
              In GECCO ’09: Proceedings of the 11th Annual conference on
              Genetic and evolutionary computation, pages 1243–1250, New
              York, NY, USA, 2009. ACM.
              M. E. Newman.
              Scientific collaboration networks. ii. shortest paths, weighted
              networks, and centrality.
              Phys Rev E Stat Nonlin Soft Matter Phys, 64, July 2001.




   page 23              Cruz,Bothorel,Poulet               Clustering by Point of View
For Further Reading IV
Appendix     Appendix   For Further Reading



              Clara Pizzuti.
              Overlapped community detection in complex networks.
              In GECCO ’09: Proceedings of the 11th Annual conference on
              Genetic and evolutionary computation, pages 859–866, New
              York, NY, USA, 2009. ACM.




   page 24              Cruz,Bothorel,Poulet           Clustering by Point of View
Quality Measures Summary
Appendix     Appendix   For Further Reading



      Summary of quality measures for graph clustering.




   page 25              Cruz,Bothorel,Poulet         Clustering by Point of View

More Related Content

What's hot

Learning Moving Cast Shadows for Foreground Detection (VS 2008)
Learning Moving Cast Shadows for Foreground Detection (VS 2008)Learning Moving Cast Shadows for Foreground Detection (VS 2008)
Learning Moving Cast Shadows for Foreground Detection (VS 2008)
Jia-Bin Huang
 
A DWT based Dual Image Watermarking Technique for Authenticity and Watermark ...
A DWT based Dual Image Watermarking Technique for Authenticity and Watermark ...A DWT based Dual Image Watermarking Technique for Authenticity and Watermark ...
A DWT based Dual Image Watermarking Technique for Authenticity and Watermark ...
sipij
 
Literature study article
Literature study articleLiterature study article
Literature study article
akaspers
 
A Game Theoretic Framework for Heterogenous Information Network Clustering
A Game Theoretic Framework for Heterogenous Information Network ClusteringA Game Theoretic Framework for Heterogenous Information Network Clustering
A Game Theoretic Framework for Heterogenous Information Network Clustering
Faris Alqadah
 
Improving the global parameter signal to distortion value in music signals
Improving the global parameter signal to distortion value in music signalsImproving the global parameter signal to distortion value in music signals
Improving the global parameter signal to distortion value in music signals
iaemedu
 

What's hot (15)

Block Matching Project
Block Matching ProjectBlock Matching Project
Block Matching Project
 
Learning Moving Cast Shadows for Foreground Detection (VS 2008)
Learning Moving Cast Shadows for Foreground Detection (VS 2008)Learning Moving Cast Shadows for Foreground Detection (VS 2008)
Learning Moving Cast Shadows for Foreground Detection (VS 2008)
 
Deep Reinforcement Learning for Visual Navigation
Deep Reinforcement Learning for Visual NavigationDeep Reinforcement Learning for Visual Navigation
Deep Reinforcement Learning for Visual Navigation
 
Efficient Belief Propagation in Depth Finding
Efficient Belief Propagation in Depth FindingEfficient Belief Propagation in Depth Finding
Efficient Belief Propagation in Depth Finding
 
A DWT based Dual Image Watermarking Technique for Authenticity and Watermark ...
A DWT based Dual Image Watermarking Technique for Authenticity and Watermark ...A DWT based Dual Image Watermarking Technique for Authenticity and Watermark ...
A DWT based Dual Image Watermarking Technique for Authenticity and Watermark ...
 
Literature study article
Literature study articleLiterature study article
Literature study article
 
Secure Image Transfer in The Domain Transform DFT
Secure Image Transfer in The Domain Transform DFTSecure Image Transfer in The Domain Transform DFT
Secure Image Transfer in The Domain Transform DFT
 
A Game Theoretic Framework for Heterogenous Information Network Clustering
A Game Theoretic Framework for Heterogenous Information Network ClusteringA Game Theoretic Framework for Heterogenous Information Network Clustering
A Game Theoretic Framework for Heterogenous Information Network Clustering
 
NEIGHBOUR LOCAL VARIABILITY FOR MULTIFOCUS IMAGES FUSION
NEIGHBOUR LOCAL VARIABILITY FOR MULTIFOCUS IMAGES FUSIONNEIGHBOUR LOCAL VARIABILITY FOR MULTIFOCUS IMAGES FUSION
NEIGHBOUR LOCAL VARIABILITY FOR MULTIFOCUS IMAGES FUSION
 
H3602056060
H3602056060H3602056060
H3602056060
 
Defocus magnification
Defocus magnificationDefocus magnification
Defocus magnification
 
In2414961500
In2414961500In2414961500
In2414961500
 
캡슐 네트워크를 이용한 엔드투엔드 음성 단어 인식, 배재성(KAIST 석사과정)
캡슐 네트워크를 이용한 엔드투엔드 음성 단어 인식, 배재성(KAIST 석사과정)캡슐 네트워크를 이용한 엔드투엔드 음성 단어 인식, 배재성(KAIST 석사과정)
캡슐 네트워크를 이용한 엔드투엔드 음성 단어 인식, 배재성(KAIST 석사과정)
 
International Journal of Engineering Research and Development (IJERD)
 International Journal of Engineering Research and Development (IJERD) International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
Improving the global parameter signal to distortion value in music signals
Improving the global parameter signal to distortion value in music signalsImproving the global parameter signal to distortion value in music signals
Improving the global parameter signal to distortion value in music signals
 

Viewers also liked (7)

Fisrt project poster
Fisrt project posterFisrt project poster
Fisrt project poster
 
A Social Network Based Model for e-Mail Information Visualization - Sunbelt ...
A Social Network Based Model for e-Mail  Information Visualization - Sunbelt ...A Social Network Based Model for e-Mail  Information Visualization - Sunbelt ...
A Social Network Based Model for e-Mail Information Visualization - Sunbelt ...
 
Poster presented at EGC 2011
Poster presented at EGC 2011Poster presented at EGC 2011
Poster presented at EGC 2011
 
Entropy based algorithm for community detection in augmented networks
Entropy based algorithm for community detection in augmented networksEntropy based algorithm for community detection in augmented networks
Entropy based algorithm for community detection in augmented networks
 
ToTeM : une méthode de détection de communautés adaptée à la fouille de résea...
ToTeM : une méthode de détection de communautés adaptée à la fouille de résea...ToTeM : une méthode de détection de communautés adaptée à la fouille de résea...
ToTeM : une méthode de détection de communautés adaptée à la fouille de résea...
 
Poster Journée F&R
Poster Journée F&RPoster Journée F&R
Poster Journée F&R
 
Presentation OOC2013
Presentation OOC2013Presentation OOC2013
Presentation OOC2013
 

Similar to Presentation Journée Thématique - Fouille de Grands Graphes

Life-Cycles and Mutual Effects of Scientific Communities: ASNA 2010
Life-Cycles and Mutual Effects of Scientific Communities: ASNA 2010Life-Cycles and Mutual Effects of Scientific Communities: ASNA 2010
Life-Cycles and Mutual Effects of Scientific Communities: ASNA 2010
Václav Belák
 
Fault tolerance in wireless sensor networks by Constrained Delaunay Triangula...
Fault tolerance in wireless sensor networks by Constrained Delaunay Triangula...Fault tolerance in wireless sensor networks by Constrained Delaunay Triangula...
Fault tolerance in wireless sensor networks by Constrained Delaunay Triangula...
Sigma web solutions pvt. ltd.
 
networkanalysis
networkanalysisnetworkanalysis
networkanalysis
moogway
 

Similar to Presentation Journée Thématique - Fouille de Grands Graphes (20)

Life-Cycles and Mutual Effects of Scientific Communities: ASNA 2010
Life-Cycles and Mutual Effects of Scientific Communities: ASNA 2010Life-Cycles and Mutual Effects of Scientific Communities: ASNA 2010
Life-Cycles and Mutual Effects of Scientific Communities: ASNA 2010
 
iiit delhi unsupervised pdf.pdf
iiit delhi unsupervised pdf.pdfiiit delhi unsupervised pdf.pdf
iiit delhi unsupervised pdf.pdf
 
Community Extracting Using Intersection Graph and Content Analysis in Complex...
Community Extracting Using Intersection Graph and Content Analysis in Complex...Community Extracting Using Intersection Graph and Content Analysis in Complex...
Community Extracting Using Intersection Graph and Content Analysis in Complex...
 
Multi graph encoder
Multi graph encoderMulti graph encoder
Multi graph encoder
 
Fault tolerance in wireless sensor networks by Constrained Delaunay Triangula...
Fault tolerance in wireless sensor networks by Constrained Delaunay Triangula...Fault tolerance in wireless sensor networks by Constrained Delaunay Triangula...
Fault tolerance in wireless sensor networks by Constrained Delaunay Triangula...
 
Object detection with deep learning
Object detection with deep learningObject detection with deep learning
Object detection with deep learning
 
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
 
Community detection in social networks[1]
Community detection in social networks[1]Community detection in social networks[1]
Community detection in social networks[1]
 
Uncovering the Structural Fairness in Graph Contrastive Learning.pptx
Uncovering the Structural Fairness in Graph Contrastive Learning.pptxUncovering the Structural Fairness in Graph Contrastive Learning.pptx
Uncovering the Structural Fairness in Graph Contrastive Learning.pptx
 
MAGNETIC RESONANCE BRAIN IMAGE SEGMENTATION
MAGNETIC RESONANCE BRAIN IMAGE SEGMENTATIONMAGNETIC RESONANCE BRAIN IMAGE SEGMENTATION
MAGNETIC RESONANCE BRAIN IMAGE SEGMENTATION
 
MTAAP12: Scalable Community Detection
MTAAP12: Scalable Community DetectionMTAAP12: Scalable Community Detection
MTAAP12: Scalable Community Detection
 
FUAT – A Fuzzy Clustering Analysis Tool
FUAT – A Fuzzy Clustering Analysis ToolFUAT – A Fuzzy Clustering Analysis Tool
FUAT – A Fuzzy Clustering Analysis Tool
 
Network sampling, community detection
Network sampling, community detectionNetwork sampling, community detection
Network sampling, community detection
 
QUALITY OF CLUSTER INDEX BASED ON STUDY OF DECISION TREE
QUALITY OF CLUSTER INDEX BASED ON STUDY OF DECISION TREE QUALITY OF CLUSTER INDEX BASED ON STUDY OF DECISION TREE
QUALITY OF CLUSTER INDEX BASED ON STUDY OF DECISION TREE
 
Fuzzy Encoding For Image Classification Using Gustafson-Kessel Aglorithm
Fuzzy Encoding For Image Classification Using Gustafson-Kessel AglorithmFuzzy Encoding For Image Classification Using Gustafson-Kessel Aglorithm
Fuzzy Encoding For Image Classification Using Gustafson-Kessel Aglorithm
 
Large Scale Data Clustering: an overview
Large Scale Data Clustering: an overviewLarge Scale Data Clustering: an overview
Large Scale Data Clustering: an overview
 
Community detection algorithms
Community detection algorithmsCommunity detection algorithms
Community detection algorithms
 
11 clusadvanced
11 clusadvanced11 clusadvanced
11 clusadvanced
 
networkanalysis
networkanalysisnetworkanalysis
networkanalysis
 
Dbm630 lecture09
Dbm630 lecture09Dbm630 lecture09
Dbm630 lecture09
 

Recently uploaded

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Recently uploaded (20)

What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 

Presentation Journée Thématique - Fouille de Grands Graphes

  • 1. Point of View Based Clustering of Socio-Semantic Networks Juan David CRUZ1 Cécile BOTHOREL1 François POULET2 JT – FGG2010
  • 2. Outline 1 Introduction Social Networks and Points of View Some Previous Work 2 The Point of View of Social Networks 3 Influencing the Community Detection with the Point of View Phase 1 Phase 2 4 Preliminary Experiments and Results 5 Conclusion page 2 Cruz,Bothorel,Poulet Clustering by Point of View
  • 3. Motivation Introduction Social Networks and Points of View Socio–semantic networks are contains both: • The social graph • Semantic information represented by the features of the vertices and the edges. Given this information, how to identify communities derived from the conjoint use of it? It is necessary to measure the quality of the partitions found using this information from two perspectives: • The quality of the graph clustering • The quality of the semantic information within the communities page 3 Cruz,Bothorel,Poulet Clustering by Point of View
  • 4. Social Networks and Points of View Introduction Social Networks and Points of View It is possible to obtain different partitions from different points of view page 4 Cruz,Bothorel,Poulet Clustering by Point of View
  • 5. Quality Measures Introduction Some Previous Work Type Objective Examples Reduce the distance between Manhattan L1 the members of the same Similarity Euclidean L2 group while the distance Chebyshev L∞ between groups is increased. Increase the number of edges within each community while Coverage the number of edges between Conductance Quality communities is reduced. In Performance general: index (C) = f (C)+g)(C) N(G Modularity [Gae05] page 5 Cruz,Bothorel,Poulet Clustering by Point of View
  • 6. Graph Clustering Algorithms Introduction Some Previous Work Several graph clustering algorithms have been developed, among others: Newman [New01] (Modularity optimization) Fast unfolding [BGLL08] (Modularity optimization) Maximal cliques enumeration and kernel generation [DWP+ 07] (Modularity optimization) Genetic algorithm for detecting communities in large graphs [LM09] (Fitness function based on modularity) Genetic algorithm for detecting overlapped communities [Piz09] (Fitness function based on internal edges vs. outgoing edges) page 6 Cruz,Bothorel,Poulet Clustering by Point of View
  • 7. The Representation of a Point of View The Point of View of Social Networks The semantic information can be described by a set of features The point of view is PoVi = si × V Point of View Nodes Feature 1 Feature 2 . . . Feature f Node 1 1 0 ... 0 Node 2 0 1 ... 1 . . . . . . . . . . . . Node n 1 0 ... 1 The assignation of features to each node in the network page 7 Cruz,Bothorel,Poulet Clustering by Point of View
  • 8. General Architecture Influencing the Community Detection with the Point of View Guide the community detection algorithm according to semantic information. Use of clustering techniques from different domains. page 8 Cruz,Bothorel,Poulet Clustering by Point of View
  • 9. Semantic Clustering Influencing the Community Detection with the Point of View Phase 1 page 9 Cruz,Bothorel,Poulet Clustering by Point of View
  • 10. Semantic Clustering Influencing the Community Detection with the Point of View Phase 1 Clustering of the defined point of view: search nodes with similar instances of features. Use of Self–Organizing Maps (SOM): non–supervised machine learning method [Koh97]. The proximity between the input vector (instance) and the weight vector of the network is measured with the Euclidean distance. The SOM algorithm will find some number of groups. page 10 Cruz,Bothorel,Poulet Clustering by Point of View
  • 11. Weights Assignation and Community Detection Influencing the Community Detection with the Point of View Phase 2 page 11 Cruz,Bothorel,Poulet Clustering by Point of View
  • 12. Weights Assignation and Community Detection Influencing the Community Detection with the Point of View Phase 2 Given the trained SOM network N and a graph G (V , E ): For each e (i, j) ∈ E , the weight will be changed according to: wij = 1 + αd (Nij ) δij where α ≥ 1 is constant parameter, d (Nij ), is the distance between the node i and the node j in the SOM network and δij = 1 if i, j belong to the same group in the SOM network. After the weights are set, a classic graph clustering algorithm (the fast unfolding algorithm [BGLL08]) is used. page 12 Cruz,Bothorel,Poulet Clustering by Point of View
  • 13. Experiments Configuration I Preliminary Experiments and Results In each experiment three algorithm were compared: • SOM • Fast unfolding • Our method Performed in two levels: • The final modularity: to measure the quality of the partition. • The average intra–cluster Euclidean distance: to measure the quality of the semantic clustering. The experiment were executed using a graph of 5389 nodes and 27347 edges extracted from a Twitter data set. The initial modularity of this graph is −2.5192 × 10−3 page 13 Cruz,Bothorel,Poulet Clustering by Point of View
  • 14. Experiments Configuration II Preliminary Experiments and Results The experiments were performed using two different points of view. Point of View 1: • Composed of 33 features. Each feature represents a time zone from the Twitter data set. • A feature will be set to 1 if the node has at least one friend in the time zone represented√ the feature. by • Distances vary from 0 to 32 Point of View 2: • Composed of 4 features representing the messaging profile of each user. • The first feature is set to 1 if the user has more friends than followers. page 14 Cruz,Bothorel,Poulet Clustering by Point of View
  • 15. Experiments Configuration III Preliminary Experiments and Results • The next three features indicate the user behavior according to the number of messages sent: below the mean, between the mean plus three standard deviations and, over mean plus three standard deviations. √ • Distances vary from 0 to 3 page 15 Cruz,Bothorel,Poulet Clustering by Point of View
  • 16. Case Twitter – Point of View 1 Preliminary Experiments and Results Experiment Final Q Avg. Intracluster Distance SOM Graph −7.5 × 10−3 0.3697 Graph based clustering 0.5728 1.8091 PoV based Clustering 0.5747 1.1947 The average intracluster distance found by our proposed method is less than the average intracluster distance found by the graph based algorithm. The modularity obtained is very similar: the point of view uses information associated with the localization of people’s friends. The modularity of the graph from the SOM clustering is not very different from the modularity of the original graph. SOM groups are close to the structure of the non–clustered graph. page 16 Cruz,Bothorel,Poulet Clustering by Point of View
  • 17. Case Twitter – Point of View 2 Preliminary Experiments and Results Experiment Final Q Avg. Intracluster Distance SOM Graph -0.2991 0 Classic clustering 0.5728 0.7100 PoV based Clustering 0.6351 0.5507 The SOM clustered the nodes into six groups, each one expressing one of the possible instances described above. This explains the average distance found. Creating a graph from the SOM clustering will produce better semantic clusters, however, the modularity is worst than the one from the original graph. The SOM groups are totally unrelated with the structure of the graph. Regarding the modularity and the average intracluster distance, the performance of the PoV based algorithm was better. page 17 Cruz,Bothorel,Poulet Clustering by Point of View
  • 18. Conclusion I Conclusion The classic community detection algorithms do not take into account the semantic information to influence the clustering process. Changing the weights according to the results of the semantic clustering, the semantic information is included into the community detection process. The two types of informations are merged to find and visualize a social network from a selected point of view. page 18 Cruz,Bothorel,Poulet Clustering by Point of View
  • 19. Conclusion II Conclusion Outlook • Make tests over the obtained partitions: rand index, robustness tests... • Study the case of overlapping communities. • Include the features of the edges into the point of view generation. • Development of a visualization algorithm for representing the PoV and the transition between two points of view. page 19 Cruz,Bothorel,Poulet Clustering by Point of View
  • 20. page 20 Cruz,Bothorel,Poulet Clustering by Point of View
  • 21. nav Thanks for your attention Appendix Appendix Questions? page 20 Cruz,Bothorel,Poulet Clustering by Point of View
  • 22. For Further Reading I Appendix Appendix For Further Reading Vincent D Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne Lefebvre. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10):P10008 (12pp), 2008. Nan Du, Bin Wu, Xin Pei, Bai Wang, and Liutong Xu. Community detection in large-scale social networks. In WebKDD/SNA-KDD ’07: Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis, pages 16–25, New York, NY, USA, 2007. ACM. page 21 Cruz,Bothorel,Poulet Clustering by Point of View
  • 23. For Further Reading II Appendix Appendix For Further Reading Marco Gaetler. Network Analysis: Methodological Foundations, chapter Clustering, pages 178 – 215. Springer Berlin / Heidelberg, 2005. Teuvo Kohonen. Self-Organizing Maps. Springer, 1997. page 22 Cruz,Bothorel,Poulet Clustering by Point of View
  • 24. For Further Reading III Appendix Appendix For Further Reading Marek Lipczak and Evangelos Milios. Agglomerative genetic algorithm for clustering in social networks. In GECCO ’09: Proceedings of the 11th Annual conference on Genetic and evolutionary computation, pages 1243–1250, New York, NY, USA, 2009. ACM. M. E. Newman. Scientific collaboration networks. ii. shortest paths, weighted networks, and centrality. Phys Rev E Stat Nonlin Soft Matter Phys, 64, July 2001. page 23 Cruz,Bothorel,Poulet Clustering by Point of View
  • 25. For Further Reading IV Appendix Appendix For Further Reading Clara Pizzuti. Overlapped community detection in complex networks. In GECCO ’09: Proceedings of the 11th Annual conference on Genetic and evolutionary computation, pages 859–866, New York, NY, USA, 2009. ACM. page 24 Cruz,Bothorel,Poulet Clustering by Point of View
  • 26. Quality Measures Summary Appendix Appendix For Further Reading Summary of quality measures for graph clustering. page 25 Cruz,Bothorel,Poulet Clustering by Point of View