Towards Maximising Cross-Community Information Diffusion

Published in: Technology, Education

  1. Towards Cross-Community Information Diffusion Maximisation
     Václav Belák, Samantha Lam, Conor Hayes
     Digital Enterprise Research Institute (www.deri.ie), Enabling Networked Knowledge
     © Copyright 2011 Digital Enterprise Research Institute. All rights reserved.
  2. Motivation
     •  Information cascades are of high interest in marketing, CRM, etc.
     •  A common approach is to maximise information diffusion by targeting influential actors
     •  In many online communities (e.g. discussion fora), information is shared with the community as a whole rather than with individual actors
     •  Figure labels: common case (targeting individuals); cross-community case (targeting communities)
  3. Objectives
     •  Our main hypothesis is that a message can be spread efficiently over the information flow network by targeting highly influential communities
     •  The main problem is then formulated as predicting the set of communities to target such that the message spreads over the network as widely as possible
        •  Spread over the actors, i.e. user activation fraction
        •  Spread over the communities, i.e. community activation fraction
  4. Methods: Definition of Impact
     •  We propose (Belák et al., ’12) to take two factors into account:
        1.  degree of community membership of the users
        2.  centrality of the users within each community
     •  The impact of community A on community B is defined as the average centrality of actors from A within B, weighted by their membership in A
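The impact definition on this slide can be read as a membership-weighted mean. The following is a minimal sketch of that reading only: the function name `impact_matrix`, the two input tables, and the choice to normalise by the total membership mass of A are assumptions made for illustration, and the exact formulation is the one given in Belák et al. (’12).

```python
def impact_matrix(membership, centrality):
    """Membership-weighted mean centrality (illustrative reading of the slide).

    membership[u][a]: degree of membership of user u in community a (>= 0).
    centrality[u][b]: centrality of user u within community b.
    Returns impact[a][b]: average centrality of A's actors within B,
    weighted by their membership in A.
    """
    n_users = len(membership)
    n_comms = len(membership[0])
    impact = [[0.0] * n_comms for _ in range(n_comms)]
    for a in range(n_comms):
        # Assumption: normalise by the total membership mass of community a.
        total = sum(membership[u][a] for u in range(n_users))
        if total == 0:
            continue  # an empty community has no impact
        for b in range(n_comms):
            weighted = sum(membership[u][a] * centrality[u][b]
                           for u in range(n_users))
            impact[a][b] = weighted / total
    return impact


# Toy data (hypothetical): three users, two communities.
membership = [[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0]]
centrality = [[0.6, 0.0],
              [0.2, 0.4],
              [0.0, 0.8]]
impact = impact_matrix(membership, centrality)
```

In this toy example the second user belongs half to each community, so their centrality contributes to both rows of the matrix with weight 0.5.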
  5. Methods: Targeting Communities
     •  The level of dispersion (heterogeneity) of the total impact of community i can be measured as the entropy of the i-th row/column of the impact matrix
     •  We propose to target communities by the product of the total impact of community i and its entropy: impact focus (IF)
     •  We simulated the diffusion by extending the Independent Cascade (ICM) and Linear Threshold (LTM) models (Kempe et al., ’03)
        1.  Take q target communities and sample s users from each of them
        2.  Run the original models from the union of sampled users
     •  The information diffusion network is derived from the reply-to network: if j replies to i (r_ji), information flows from i to j (w_ij)
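The impact focus heuristic can be sketched as total impact multiplied by entropy. This is an illustration under stated assumptions, not the authors' implementation: it uses the rows of the impact matrix (the slide allows rows or columns), normalises each row to a probability distribution before taking the entropy, and uses base-2 logarithms. The names `row_entropy`, `impact_focus`, and `top_communities` are hypothetical.

```python
import math


def row_entropy(row):
    """Shannon entropy (base 2) of a non-negative row, normalised to sum to 1."""
    total = sum(row)
    if total <= 0:
        return 0.0
    probs = [x / total for x in row if x > 0]
    return -sum(p * math.log2(p) for p in probs)


def impact_focus(impact):
    """Product of each community's total impact and the entropy of its row."""
    return [sum(row) * row_entropy(row) for row in impact]


def top_communities(impact, q):
    """Indices of the q communities with the highest impact focus."""
    scores = impact_focus(impact)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:q]


# Toy impact matrix (hypothetical values).
impact = [[0.0, 2.0, 2.0],   # total 4, spread evenly over two communities
          [0.0, 0.0, 4.0],   # total 4, concentrated on one community
          [0.5, 0.5, 0.0]]   # total 1, spread evenly over two communities
scores = impact_focus(impact)
targets = top_communities(impact, q=2)
```

Communities 0 and 1 have the same total impact here, but community 1 directs all of it at a single community, so its entropy, and with it its impact focus, is zero.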
  6. Evaluation Strategy
     •  IF was compared with random targeting (R) and group in-degree (GI) (Everett & Borgatti, ’99)
     •  The main aim was to investigate the robustness of our framework with respect to:
        •  Character of the system
        •  Diffusion models
        •  User and community activation fractions
     •  Procedural outline
        1.  Target q communities using one of the heuristics evaluated on the data from time-slice t
        2.  Run the diffusion model on the network from time-slice t+1
        3.  Compute the average user and community spreads over all pairs (t, t+1)
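The procedural outline can be sketched with a community-seeded Independent Cascade. Several details below are assumptions rather than facts from the slides: the snapshot layout (a dict with `members` and `weights`), sampling seed users from the t+1 membership, counting a community as activated when at least one of its members is active, and a single cascade run per pair. All function names are hypothetical.

```python
import random


def seed_users(members_by_community, targets, s, rng):
    """Sample up to s users from each targeted community and take the union."""
    seeds = set()
    for c in targets:
        members = sorted(members_by_community[c])
        seeds.update(rng.sample(members, min(s, len(members))))
    return seeds


def independent_cascade(weights, seeds, rng):
    """weights[i][j]: probability that an active i activates j (flow i -> j)."""
    active = set(seeds)
    frontier = sorted(seeds)
    while frontier:
        newly_active = []
        for i in frontier:
            for j, p in weights.get(i, {}).items():
                # Each active user gets one attempt per inactive neighbour.
                if j not in active and rng.random() < p:
                    active.add(j)
                    newly_active.append(j)
        frontier = newly_active
    return active


def activation_fractions(active, all_users, members_by_community):
    """Assumption: a community is activated if any of its members is active."""
    user_fraction = len(active) / len(all_users)
    hit = sum(1 for m in members_by_community.values() if active & set(m))
    return user_fraction, hit / len(members_by_community)


def evaluate(snapshots, choose_targets, q, s, rng):
    """Choose targets on slice t, diffuse on slice t+1, average over pairs.

    Requires at least two snapshots.
    """
    user_fracs, comm_fracs = [], []
    for prev, nxt in zip(snapshots, snapshots[1:]):
        targets = [c for c in choose_targets(prev, q) if c in nxt["members"]]
        seeds = seed_users(nxt["members"], targets, s, rng)
        active = independent_cascade(nxt["weights"], seeds, rng)
        users = set().union(*nxt["members"].values())
        u, c = activation_fractions(active, users, nxt["members"])
        user_fracs.append(u)
        comm_fracs.append(c)
    n = len(user_fracs)
    return sum(user_fracs) / n, sum(comm_fracs) / n


# Toy snapshot (hypothetical): probabilities of 1.0 and 0.0 keep it deterministic.
snap = {
    "members": {"X": {"a", "b"}, "Y": {"c"}, "Z": {"d"}},
    "weights": {"a": {"c": 1.0}, "b": {"c": 1.0}, "c": {"d": 0.0}},
}
mean_user, mean_comm = evaluate(
    [snap, snap], lambda snapshot, q: ["X"][:q], q=1, s=1, rng=random.Random(42)
)
```

Passing the heuristic as `choose_targets` keeps the loop identical for IF, GI, and random targeting, so only the ranking function changes between strategies.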
  7. Evaluation Data-Sets
     •  51 weeks of data from the largest Irish discussion board system (Boards)
        •  Segmented using a 1-week sliding window
        •  A 1-week window represents approx. 84% of cross-fora posting activity
        •  540 communities, 5.3k users/snapshot (avg)
     •  5 years of data from the technical support fora of SAP
        •  Used only for the diffusion experiments
        •  Segmented using a 2-month sliding window
        •  A 2-month window represents approx. 50% of cross-fora posting activity
        •  33 communities, 2k users/snapshot (avg)
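The segmentation into time-slices might look like the sketch below. It assumes replies are available as (timestamp, replier, replied_to) tuples with numeric timestamps, and that the window advances by its own length unless a `step` is given; the slides do not state the step, the record layout, or how reply counts are turned into the weights w_ij, so only raw counts are produced here. The names `segment` and `flow_counts` are hypothetical.

```python
from collections import defaultdict


def segment(replies, window, step=None):
    """Split (timestamp, replier, replied_to) records into time windows."""
    step = window if step is None else step  # assumption: non-overlapping default
    replies = sorted(replies)
    if not replies:
        return []
    start, end = replies[0][0], replies[-1][0]
    slices = []
    t = start
    while t <= end:
        slices.append([r for r in replies if t <= r[0] < t + window])
        t += step
    return slices


def flow_counts(slice_replies):
    """A reply from j to i is read as information flowing from i to j."""
    counts = defaultdict(lambda: defaultdict(int))
    for _, replier, replied_to in slice_replies:
        counts[replied_to][replier] += 1
    return {src: dict(dst) for src, dst in counts.items()}


# Toy reply log (hypothetical), timestamps in days, 7-day window.
replies = [
    (0, "bob", "alice"),
    (3, "carol", "alice"),
    (8, "alice", "bob"),
    (9, "carol", "bob"),
    (9, "carol", "bob"),
]
slices = segment(replies, window=7)
flows = flow_counts(slices[1])
```

Each slice can then be turned into a network snapshot, with the reply direction reversed to obtain the information flow direction.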
  8. User Activation Fraction: one targeted community
     [Figure: two panels, "q=1, Boards−LTM" and "q=1, SAP−LTM". x-axis: user sample size (s), 5 to 20; y-axis: mean user activation fraction (u); curves: IF, GI, R]
  9. Community Activation Fraction: one targeted community
     [Figure: two panels, "q=1, Boards−LTM" and "q=1, SAP−LTM". x-axis: user sample size (s), 5 to 20; y-axis: mean community activation fraction (c); curves: IF, GI, R]
  10. Community Activation Fraction: five targeted communities
     [Figure: two panels, "q=5, Boards−LTM" and "q=5, SAP−LTM". x-axis: user sample size (s), 5 to 20; y-axis: mean community activation fraction (c); curves: IF, GI, R]
  11. Results Highlights
     •  The diffusion process became saturated at approximately 80% of users or communities in Boards, and 30% in SAP
     •  It is more efficient to target few communities
     •  Impact Focus outperformed the other two strategies with respect to both user and community activation fractions, particularly for small numbers of targeted communities (i.e. [1, 2]) and seed users (i.e. [1, 20])
     •  Diminishing returns
     •  For high numbers of targeted communities and seed users, the random strategy outperformed the other two with respect to the community activation fraction in the SAP data-set
     •  The SAP network was fragmented into many small components, which made it hard to reach peripheral communities
  12. Conclusion
     •  The evaluation demonstrated that the framework
        •  is able to identify highly influential communities
        •  can predict which communities to target such that the message spreads efficiently over both individual users and communities
     •  We aim to extend it with content analysis
        •  E.g. what are the most influential communities with respect to a particular topic?
     •  We will also investigate empirically observed topic cascades and modify our models accordingly if needed
  13. Questions?
     References
     •  V. Belák, S. Lam, and C. Hayes. Cross-Community Influence in Discussion Fora. ICWSM. AAAI, 2012.
     •  M. Everett and S. Borgatti. The centrality of groups and classes. J. of Mathematical Sociology, 23(3):181–201, 1999.
     •  D. Kempe, J. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. SIGKDD. ACM, 2003.
