Towards Maximising Cross-Community Information Diffusion

Published in: Technology, Education

  1. Towards Cross-Community Information Diffusion Maximisation. Václav Belák, Samantha Lam, Conor Hayes. Digital Enterprise Research Institute (www.deri.ie), Enabling Networked Knowledge. © Copyright 2011 Digital Enterprise Research Institute. All rights reserved.
  2. Motivation
     • Information cascades are of high interest in marketing, CRM, etc.
     • A common approach is to maximise information diffusion by targeting influential actors.
     • In many online communities (e.g. discussion fora), however, information is shared with the community as a whole rather than with individual actors.
     • [Figure: the common case (targeting individuals) versus the cross-community case (targeting communities).]
  3. Objectives
     • Our main hypothesis is that a message can be spread efficiently over the information flow network by targeting highly influential communities.
     • The main problem is then formulated as predicting the set of communities to target such that the message spreads over the network as widely as possible:
        • spread over the actors, i.e. the user activation fraction;
        • spread over the communities, i.e. the community activation fraction.
  4. Methods: Definition of Impact
     • We propose (Belák et al., '12) to take two factors into account:
        1. the degree of community membership of the users;
        2. the centrality of the users within each community.
     • The impact of community A on community B is defined as the average centrality of actors from A within B, weighted by their membership in A.
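
The impact definition on slide 4 can be read as a membership-weighted average. The sketch below is one possible reading, not the authors' implementation: the function name `impact_matrix`, the two user-by-community input tables, and the choice to normalise by the total membership weight in A are all assumptions made for illustration. The exact formulation is given in Belák et al. ('12) and may differ.

```python
def impact_matrix(membership, centrality):
    """Membership-weighted average centrality (illustrative sketch).

    membership[u][a]: degree of membership of user u in community a (assumed input).
    centrality[u][b]: centrality of user u within community b (assumed input).
    Returns impact[a][b]: average centrality of a's actors within b,
    weighted by their membership in a.
    """
    n_users = len(membership)
    n_comms = len(membership[0])
    impact = [[0.0] * n_comms for _ in range(n_comms)]
    for a in range(n_comms):
        # Total membership weight of community a (normalisation is an assumption).
        total_weight = sum(membership[u][a] for u in range(n_users))
        if total_weight == 0:
            continue  # an empty community has no impact on anyone
        for b in range(n_comms):
            weighted = sum(
                membership[u][a] * centrality[u][b] for u in range(n_users)
            )
            impact[a][b] = weighted / total_weight
    return impact


# Toy data: three users, two communities (values are made up).
membership = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
centrality = [[0.8, 0.0], [0.2, 0.4], [0.0, 0.6]]
impact = impact_matrix(membership, centrality)
```

Row a of the result describes how strongly community a acts on every community, including itself; the diagonal is the community's impact on its own members.
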
  5. Methods: Targeting Communities
     • The level of dispersion (heterogeneity) of the total impact of community i can be measured as the entropy of the i-th row/column of the impact matrix.
     • We propose to target communities by the product of the total impact of community i and its entropy: impact focus (IF).
     • We simulated the diffusion by extending the Independent Cascade (ICM) and Linear Threshold (LTM) models (Kempe et al., '03):
        1. Take q target communities and sample s users from each of them.
        2. Run the original models from the union of the sampled users.
     • The information diffusion network is derived from the reply-to network. [Figure: a reply edge r_ji (j replies to i) and the derived information-flow edge w_ij (from i to j).]
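
A minimal sketch of the impact focus ranking on slide 5, under stated assumptions: total impact is taken as the row sum of the impact matrix, the entropy is the Shannon entropy (natural logarithm) of the row normalised to sum to one, and the names `impact_focus` and `top_communities` are hypothetical. The slide allows rows or columns; rows are used here only for concreteness.

```python
import math


def impact_focus(impact):
    """Score each community by total impact times entropy of its impact row."""
    scores = []
    for row in impact:
        total = sum(row)  # total impact of the community (assumed: row sum)
        if total <= 0:
            scores.append(0.0)
            continue
        # Normalise the row to a distribution; zero entries contribute nothing.
        probs = [x / total for x in row if x > 0]
        entropy = -sum(p * math.log(p) for p in probs)
        scores.append(total * entropy)
    return scores


def top_communities(scores, q):
    """Indices of the q highest-scoring communities."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:q]


# Toy impact matrix (values are made up): community 0 spreads its impact
# evenly, community 1 concentrates it on a single community.
impact = [[0.5, 0.5], [1.0, 0.0]]
targets = top_communities(impact_focus(impact), q=1)
```

With equal total impact, the community whose impact is dispersed over more communities gets the higher score, which matches the intent of combining impact with entropy.
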
  6. Evaluation Strategy
     • IF was compared with random targeting (R) and group in-degree (GI) (Everett & Borgatti, '99).
     • The main aim was to investigate the robustness of our framework with respect to:
        • the character of the system;
        • the diffusion models;
        • the user and community activation fractions.
     • Procedural outline:
        1. Target q communities using one of the heuristics, evaluated on the data from time-slice t.
        2. Run the diffusion model on the network from time-slice t+1.
        3. Compute the average user and community spread over all pairs (t, t+1).
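
The procedural outline on slide 6 can be sketched as a loop over consecutive time-slices. This is an illustration under several assumptions, none of which come from the slides: each snapshot is a dict with hypothetical keys `"members"` (community to user list) and `"weights"` (user to neighbour-probability map), seeds are sampled from the targeted communities as they appear in slice t+1, a community counts as activated once at least one of its members is active, and only the Independent Cascade model is shown.

```python
import random


def independent_cascade(weights, seeds, rng):
    """Each newly activated user u gets one chance to activate v with prob. weights[u][v]."""
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in weights.get(u, {}).items():
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def sample_seeds(members, targets, s, rng):
    """Union of up to s users sampled from each targeted community."""
    seeds = set()
    for c in targets:
        pool = sorted(members[c])
        seeds.update(rng.sample(pool, min(s, len(pool))))
    return seeds


def activation_fractions(active, members):
    """User and community activation fractions (community rule is an assumption)."""
    users = {u for m in members.values() for u in m}
    activated = [c for c, m in members.items() if active & set(m)]
    return len(active & users) / len(users), len(activated) / len(members)


def evaluate(snapshots, choose_targets, q, s, rng):
    """Target on slice t, diffuse on slice t+1, average over all pairs (t, t+1)."""
    user_fracs, comm_fracs = [], []
    for prev, nxt in zip(snapshots, snapshots[1:]):
        # Keep only targets that still exist in the next slice (assumption).
        targets = [c for c in choose_targets(prev, q) if c in nxt["members"]]
        seeds = sample_seeds(nxt["members"], targets, s, rng)
        active = independent_cascade(nxt["weights"], seeds, rng)
        u_frac, c_frac = activation_fractions(active, nxt["members"])
        user_fracs.append(u_frac)
        comm_fracs.append(c_frac)
    n = len(user_fracs)
    return sum(user_fracs) / n, sum(comm_fracs) / n


# Toy run with a made-up snapshot and a fixed targeting heuristic.
snap = {
    "members": {"X": ["a"], "Y": ["b", "c"], "Z": ["d"]},
    "weights": {"a": {"b": 1.0}, "b": {"c": 1.0}, "c": {"d": 0.0}},
}
result = evaluate([snap, snap], lambda s_, q: ["X"][:q], q=1, s=5, rng=random.Random(0))
```

Any heuristic (IF, GI, or random) can be passed as `choose_targets`, so the same loop compares all three strategies on identical snapshot pairs.
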
  7. Evaluation Data-Sets
     • 51 weeks of data from the largest Irish discussion board system (Boards)
        • Segmented using a 1-week sliding window
        • A 1-week window represents approx. 84% of cross-fora posting activity
        • 540 communities, 5.3k users/snapshot (avg)
     • 5 years of data from the technical support fora of SAP
        • Used only for the diffusion experiments
        • Segmented using a 2-month sliding window
        • A 2-month window represents approx. 50% of cross-fora posting activity
        • 33 communities, 2k users/snapshot (avg)
  8. User Activation Fraction: one targeted community. [Figure: two panels, q=1 Boards-LTM and q=1 SAP-LTM, plotting mean user activation fraction (u) against user sample size (s, 5 to 20) for IF, GI and R.]
  9. Community Activation Fraction: one targeted community. [Figure: two panels, q=1 Boards-LTM and q=1 SAP-LTM, plotting mean community activation fraction (c) against user sample size (s, 5 to 20) for IF, GI and R.]
  10. Community Activation Fraction: five targeted communities. [Figure: two panels, q=5 Boards-LTM and q=5 SAP-LTM, plotting mean community activation fraction (c) against user sample size (s, 5 to 20) for IF, GI and R.]
  11. Results Highlights
     • The diffusion process became saturated at approximately 80% of users or communities in Boards, and 30% in SAP.
     • It is more efficient to target few communities.
     • Impact Focus outperformed the other two strategies with respect to both user and community activation fractions, in particular for a small number of targeted communities (i.e. [1, 2]) and seed users (i.e. [1, 20]).
     • Diminishing returns.
     • For a high number of targeted communities and seed users, the random strategy outperformed the other two with respect to community activation fractions in the SAP data-set.
     • The SAP network is fragmented into many small components, which made it hard to reach peripheral communities.
  12. Conclusion
     • The evaluation demonstrated that the framework:
        • is able to identify highly influential communities;
        • can predict which communities to target such that the message spreads efficiently over both individual users and communities.
     • We aim to extend it with content analysis, e.g. what are the most influential communities with respect to a particular topic?
     • We will also investigate empirically observed topic cascades and modify our models accordingly if needed.
  13. Questions?
     References
     • V. Belák, S. Lam, and C. Hayes. Cross-Community Influence in Discussion Fora. ICWSM. AAAI, 2012.
     • M. Everett and S. Borgatti. The centrality of groups and classes. J. of Mathematical Sociology, 23(3):181–201, 1999.
     • D. Kempe, J. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. SIGKDD. ACM, 2003.