Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our User Agreement and Privacy Policy.

Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our Privacy Policy and User Agreement for details.

Like this presentation? Why not share!

- Community Detection in Social Media by Symeon Papadopoulos 12206 views
- Community detection in graphs by Nicola Barbieri 5048 views
- Community Detection with Networkx by Erika Fille Legara 355 views
- Community detection in social networks by Francisco Restivo 1283 views
- Group and Community Detection in So... by Ismael A. Ali 995 views
- Network Analysis and Law: Introduct... by Daniel Katz 21247 views

2,046 views

Published on

A method to detect communities in networks

No Downloads

Total views

2,046

On SlideShare

0

From Embeds

0

Number of Embeds

1

Shares

0

Downloads

126

Comments

0

Likes

2

No embeds

No notes for slide

- 1. Community Detection in Social Networks Lei Tang
- 2. Properties of Complex Network Power Law Community Structure Small World
- 3. Why Community Detection?Communities in a citation network might representrelated papers on a single topic;Communities on the web might represent pages ofrelated topics;Community can be considered as a summary ofthe whole network thus easy to visualize andunderstand.Sometimes, community can reveal the propertieswithout releasing the individual privacy information.
- 4. Community Detection,Reinventing the wheel?
- 5. Community Detection = Clustering?As I understand, community detection isessentially clustering.But why so many works on CommunityDetection? (in physical review, KDD, WWW)The network data pose challenges toclassical clustering method.
- 6. DifferenceClustering works on the distance or similarity matrix (k-means, hierarchical clustering, spectral clustering)Network data tends to be “discrete”, leading to algorithmsusing the graph property directly (k-clique, quasi-clique,vertex-betweenness, edge-betweeness etc.)Real-world network is large scale! Sometimes, even n^2 inunbearable for efficiency or space (local/distributedclustering, network approximation, sampling method)
- 7. OutlineTwo recent community detection methodsClustering based on shortest-pathbetweennessClustering based on network modularity
- 8. Basic Idea A simple divisive strategy: Repeat1. Find out one “inter-community” edge2. Remove the edge3. Check if there’s any disconnected components (which corresponds to a community)
- 9. How to measure “inter-community”If two communities are joined by a few inter-communityedges, then all the paths from one community to anothermust pass the edges.Various measures:Edge Betweenness: find the shortest pathsbetween all pairs of nodes and count howmany run along each edge.Random Walk betweenness.Current-flow betweenness
- 10. Shortest-path betweennessComputation could be expensive: calculating the shortestpath between one pair is O(m), and there are O(n^2) pairs.Could be optimized to O(mn)Simple case: only one shortest path When there is only one single path between the Source S and other vertex, then those paths form a tree. Bottom-up: start from the leaves, assign edges to 1. Count of parent edge = sum (count of children edge)+1
- 11. Multiple shortest pathFirst compute the number of paths from source to othervertexThen assign a proper weight for the path counts sum of the betweenness =.number of reachablevertices.
- 12. Calculate #shortest path1.Initial distance W:Number of shortest paths
- 13. Update edge weight Edge weight
- 14. Time ComplexityO(mn) in each iteration.Could be accelerated by noting that only thenodes in the connected component would beaffected.Some other techniques developed: samplingstrategy to approximate the betweenness; usespecific network index for speed.
- 15. Obtain a hierarchical tree, use modularityTo determine the number of clusters.
- 16. ModularitySpectral clustering essentially tries to minimize the numberedges between groups.Modularity consider the number edges which is smaller thanexpected.If the difference is significantly large, there’s a communitystructure inside.The larger, the better.
- 17. QuizGiven a network of m edges, for two nodeswith degree ki, kj, what is the expectededges between these two nodes?
- 18. Modularity CalculationModularity can be used to determine thenumber of clusters, why not maximize itdirectly?Unfortunately, it’s NP-hard
- 19. Relaxation Eigen Value Problem! Modularity MatrixBetai is the eigen value of theEigen vector ui of modularity matrix B
- 20. Properties of Modularity Matrix(1,1,…1) is an eigen vector with zero eigen value.Different from graph Laplacian, the eigen value ofmodularity matrix could be +, 0 or -.What if the maximum eigen value is zero?Essentially, it hints that there’s no strongcommunity pattern. Not necessary to split thenetwork, which is a nice property.
- 21. Here, the spectral partitioning is forced tosplit the network into approximately equal-size clusters.
- 22. ExtensionsDivisive clusteringK - partitioning…
- 23. CommentsI thought spectral clustering is the end of clustering. Buthere a new measure Modularity is proposed and found tobe working very well, which confirms that “research isendless”, or “no last bug”.Since Graph Laplacian and Modularity matrix both boilsdown to a eigen value problem, is there any innateconnection between these two measures?How could it work if we apply it directly to some classicdata representation?Extend modularity to relational data could be a promisingdirection.There could be more opportunities than “wheels” in socialcomputing.Scalability is really a big issue.
- 24. ReferencesM.E.J.Newman, Finding community structure innetworks using the eigenvectors of matrices, Phys.Rev. , 2006M. E. J. Newman, M. Girvan, Finding andevaluating community structure in networks,Phys. Rev. 2004

No public clipboards found for this slide

×
### Save the most important slides with Clipping

Clipping is a handy way to collect and organize the most important slides from a presentation. You can keep your great finds in clipboards organized around topics.

Be the first to comment