2. AGENDA
Introduction
What is a Community
Why Do We Need It?
Types of Communities
Model for Dynamic Community Analysis
More Complex Analysis with Rules
Tracking Communities Across Time Steps
Algorithm
Application in the Real World
Conclusion
3. Introduction
The goal is to identify meaningful group structures in a network.
Real-world social networks from a variety of domains can naturally be modelled as
dynamic graphs.
Tracking the evolution and structure of communities over multiple time steps in
a dynamic network.
We are interested in examining the formation and change of communities, such as groups of friends in
online networks like Facebook or Twitter.
4. What is a Community
Definition : “A community is a group of nodes that are densely connected, and they have few edges connecting
them to nodes outside the community”
Finding groups of nodes in a network that are very similar to each other.
Few edges connect these nodes to nodes outside the community.
An edge between two entities if they have.
E.g., the exchange of ideas, information, and experiences between people on the web can be
modeled as a social network.
5. Why Do We Need It?
The data source is dynamic; static data alone
is not enough to identify group structures in
a network.
Representing graphs as a static network
can lead to wrong predictions.
This static representation misses the
opportunity to detect the evolutionary
behavior of the network and its
communities.
Discovering densely connected user
communities in social networks has
become one of the major challenges.
6. Types of Communities
Common Features: Communities are groups of nodes with similar attributes.
Internal Density: Here we are interested in just maximizing the number of edges inside the
communities.
Action Communities: Nodes are not just static entities; they also perform actions.
Proximal Nodes: Here we want the edges inside the communities to make it easy for a node to be
connected to all other nodes in the community.
Fixed Structure: The algorithm knows what a community looks like and only has to
find that structure in the network.
Link Communities: Here we group the edges, not the nodes. In a social
network, we know different people for different reasons: family, work, free time, etc.
Others: Methods that add features to other community discovery algorithms, or that let the
user define their communities and then try to find them.
7. Model for Dynamic Community Analysis
k’ dynamic communities: D = { D1,
D2, …, Dk’ }
Step communities are identified at
individual time steps.
At time t, Ct = { Ct1, Ct2, …, Ctk }
So, across time steps t = 1, 2, 3, the timelines are:
D1 : { C11, C21, C31 }
D2 : { C22, C32 }
D3 : { C12, C23 }
The most recent observation is referred to as the
front of the dynamic community.
[Figure: timelines of dynamic communities D1, D2, D3 over t = 1, 2, 3, showing step communities C11, C21, C31, C22, C32, C12, C23 and fronts F1, F2, F3.]
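One way to picture the model in code is a mapping from each dynamic community to its timeline, with the front being the latest entry. This is a minimal sketch, not the authors' data structure: the names `timelines` and `front` are hypothetical, and because the slides give no node memberships, each step community is stored only as a (time step, index) label.

```python
# Timelines taken from the slide. Each step community Cti is stored as the
# label (t, i); actual node memberships are not given in the slides.
timelines = {
    "D1": [(1, 1), (2, 1), (3, 1)],  # C11, C21, C31
    "D2": [(2, 2), (3, 2)],          # C22, C32
    "D3": [(1, 2), (2, 3)],          # C12, C23
}


def front(timeline):
    """Return the most recently observed step community of a timeline."""
    return max(timeline, key=lambda step: step[0])


# The front of each dynamic community is its latest observation.
fronts = {name: front(timeline) for name, timeline in timelines.items()}
```

Under this reading, F1 is C31, F2 is C32 and F3 is C23.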
8. More Complex Analysis with Rules
Birth: Emergence of a step community
observed at time t for which there is
no corresponding dynamic community
in D.
Death: Dissolution of a dynamic community Di occurs when it
has not been observed for at least d
consecutive time steps.
[Figure: timelines of dynamic communities D1, D2, D3 over t = 1, 2, 3, showing step communities C11, C21, C31, C22, C32, C12, C23 and their fronts.]
9. More Complex Analysis with Rules
Birth: Emergence of a step community Ctj
for which there is no
corresponding dynamic community in D.
Death: Dissolution of a dynamic community Di occurs when it
has not been observed for at least d
consecutive time steps.
Merging: Two distinct dynamic
communities are observed to match a single
step community.
Splitting: A single dynamic community Di is
matched to two distinct step
communities.
Expansion: The corresponding step
community is significantly larger than the
previous one.
Contraction: The corresponding
step community is significantly smaller
than the previous one.
[Figure: timelines of dynamic communities D1, D2, D3, D4 over t = 1, 2, 3, showing step communities C11, C12, C13, C21, C22, C23, C31, C32, C33.]
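The event rules on this slide can be sketched roughly as follows. This is an illustration under assumptions, not the authors' implementation: the slides do not say what "significantly" larger or smaller means, so the relative threshold `alpha` is hypothetical, as are the function names and the example matches.

```python
from collections import Counter


def size_event(previous_size, current_size, alpha=0.2):
    """Classify a size change; alpha is an assumed relative threshold."""
    if current_size > previous_size * (1 + alpha):
        return "expansion"
    if current_size < previous_size * (1 - alpha):
        return "contraction"
    return None  # no significant size change


def merge_split_events(matches):
    """matches: (dynamic community, step community) pairs found at one time step.

    A step community matched by two or more dynamic communities is a merge;
    a dynamic community matched to two or more step communities is a split.
    """
    per_step = Counter(step for _, step in matches)
    per_dynamic = Counter(dyn for dyn, _ in matches)
    merges = sorted(step for step, n in per_step.items() if n >= 2)
    splits = sorted(dyn for dyn, n in per_dynamic.items() if n >= 2)
    return merges, splits


# Hypothetical matches at one time step (not read off the figure).
example = [("D1", "C31"), ("D2", "C31"), ("D3", "C32"), ("D3", "C33")]
merges, splits = merge_split_events(example)
```

In the hypothetical example, D1 and D2 merge into C31, and D3 splits into C32 and C33.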
10. More Complex Analysis with Rules
One-to-one matchings, or
continuation events, also occur.
Dynamic communities can be intermittent (missing from
some time steps) due to the structure of the network.
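[Figure: intermittent dynamic communities D1 and D2 over t = 1, 2, 3, 4, showing step communities C11, C12, C31, C41, C42; X marks a time step at which a community is not observed.]
[Figure: timelines of dynamic communities D1, D2, D3 over t = 1, 2, 3, showing step communities C11, C21, C31, C22, C32, C12, C23 and their fronts.]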
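The difference between an intermittent community and a dead one comes down to how many consecutive time steps it has been missing, compared with the death parameter d. A small sketch of that check, assuming each dynamic community records the time steps at which it was observed; the function names and example values are hypothetical.

```python
def is_intermittent(observed_steps):
    """True if the timeline has a gap, i.e. a missing time step between observations."""
    steps = sorted(observed_steps)
    return any(later - earlier > 1 for earlier, later in zip(steps, steps[1:]))


def is_dead(observed_steps, current_t, d):
    """True if the community has gone unobserved for at least d consecutive steps."""
    missed = current_t - max(observed_steps)
    return missed >= d


# Hypothetical community observed at t = 1, 3, 4 (missing at t = 2).
observed = [1, 3, 4]
```

With d = 2, a community seen at t = 1, 3, 4 is intermittent but still alive at t = 5; it would be declared dead at t = 6 if it is not observed again.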
11. Tracking Dynamic Communities Across Time Steps
C1 and C2 are generated by applying a chosen static community finding algorithm to graphs g1 and g2.
A distinct dynamic community is created for each step community in C1.
An attempt is made to match the step communities in C2 with the fronts {F1, . . . , Fk’}.
All pairs (C2a, Fi) are compared, and the dynamic community timelines and fronts are updated.
The process continues until all l step graphs have been processed.
We employ the Jaccard coefficient for binary sets. Given a step community Cta and a front Fi, the similarity
between the pair is calculated as:
sim(Cta, Fi) = |Cta ∩ Fi| / |Cta ∪ Fi|
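The similarity above is straightforward to compute when communities are stored as sets of node identifiers. A minimal sketch; the function name `jaccard` and the handling of two empty sets (returning 0.0) are assumptions, not something the slides specify.

```python
def jaccard(step_community, front):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two node sets."""
    a, b = set(step_community), set(front)
    union = a | b
    if not union:
        return 0.0  # assumed convention for two empty sets
    return len(a & b) / len(union)


# Two communities sharing 2 of 4 distinct nodes.
similarity = jaccard({1, 2, 3}, {2, 3, 4})
```

Here the intersection has 2 nodes and the union has 4, so the similarity is 0.5.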
12. ALGORITHM
1. Apply a static community finding algorithm to g1 to extract C1. Initialize D by creating a new
dynamic community for each step community C1i ∈ C1.
2. For each subsequent step t > 1, extract Ct from gt.
3. Process every Cta ∈ Ct as follows:
o Match all dynamic communities Di for which sim(Cta, Fi) > θ.
o If there are no matches, create a new dynamic community containing Cta.
o Otherwise, add Cta to each matching dynamic community.
4. Update the set of fronts so that each dynamic community's front is its latest matched step community.
For each case where one existing dynamic community has been matched to two or more step
communities, create a split dynamic community.
5. Repeat from step 2 until all time step graphs have been processed.
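The steps above can be sketched in a few lines, assuming the step communities have already been extracted by some static algorithm and are passed in as sets of nodes. This is an illustrative reading of the slide, not the authors' code: the function names, the default θ = 0.3, and the choice to keep the first match on the existing timeline while later matches branch off as splits are all assumptions. The death rule (parameter d) is left out for brevity.

```python
def jaccard(a, b):
    """Jaccard coefficient of two node sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def track(step_groupings, theta=0.3):
    """Track dynamic communities across time steps.

    step_groupings[t - 1] is the list of step communities (sets of nodes) at step t.
    Returns a list of timelines; each timeline is a list of (t, community) pairs.
    """
    timelines = []
    for t, communities in enumerate(step_groupings, start=1):
        # Fronts are fixed at the start of the step: latest observation per timeline.
        fronts = [timeline[-1][1] for timeline in timelines]
        matched = {i: [] for i in range(len(timelines))}
        births = []
        for community in communities:
            hits = [i for i, f in enumerate(fronts) if jaccard(community, f) > theta]
            if hits:
                for i in hits:  # several hits for one community means a merge
                    matched[i].append(community)
            else:
                births.append(community)
        for i, matches in matched.items():
            if not matches:
                continue  # not observed at this step
            history = list(timelines[i])  # shared timeline up to t - 1
            timelines[i].append((t, matches[0]))
            for extra in matches[1:]:  # split: branch with a shared history
                timelines.append(history + [(t, extra)])
        for community in births:
            timelines.append([(t, community)])
    return timelines


# Hypothetical toy input: three time steps of already-detected step communities.
steps = [
    [{1, 2, 3}, {4, 5, 6}],
    [{1, 2, 3, 7}, {4, 5}, {8, 9}],
    [{1, 2}, {3, 7}, {4, 5}],
]
result = track(steps, theta=0.3)
```

In the toy input, {8, 9} is born at t = 2 and is not seen again, while the community {1, 2, 3, 7} splits at t = 3 into {1, 2} and {3, 7}, producing a fourth timeline that shares its history up to t = 2.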
13. Application in the Real World
Implemented on a real mobile operator network.
Examined weekly voice call graphs over eight consecutive
weeks.
4 million unique subscribers and 10 million edges.
Time slices of 2 weeks, giving four separate time step graphs.
After applying the algorithm, 150k long-lived dynamic
communities were present in at least 2 steps; 27% were
visible in 3 time steps and 9.6% across the full two-month period.
Overlapping groups were derived; 80% of the communities
contained between 5 and 15 users.
82k long-lived communities exhibited intermittent behavior.
24k community merge events were detected in the four
intervals.
4k community split events were also detected.
14. Conclusion
Tracking communities from dynamic data at consecutive time steps in the individual
snapshot graphs.
Described a general model for tracking communities in dynamic networks.
Identifying key events that characterize the life cycle of a community of users in a
dynamic network.
A fast and effective method based on the model, which scales to graphs with ≈10⁶ nodes and 10⁷
edges.
However, approaches to detecting communities have largely focused on identifying communities in static graphs.
Group structures over shorter periods of time can be difficult to identify or may be lost completely.
->If the data is static, we cannot infer when and where the data changes, so predictions may be wrong.
->Dynamic data helps in examining structures and patterns, which leads to understanding the data and predicting trends in it.
->A static view misses the opportunity to capture evolving patterns in the dynamic network.
-> If we are in a social network and the nodes are people, these attributes may well be the social connections, the movies you like, the songs you listen to. Communities are groups of nodes with similar attributes.
Dynamic community (DC) -> present in the network across one or more time steps.
Step community (SC) -> identified at individual time steps; each represents a specific observation of a dynamic community at a given point in time.
->Birth: a new dynamic community Di containing Ctj is created and added to D.
->Death: assumes that no further step communities are subsequently assigned to its timeline.
->Merging: two distinct dynamic communities observed at time t − 1 match a single step community at time t.
->Splitting: a single dynamic community present at time t − 1 is matched to two distinct step communities at time t.
A branching occurs with the creation of an additional dynamic community Dj that shares the timeline of Di up to time t − 1, but has a distinct timeline from time t onwards.