Towards Maximising Cross-Community Information Diffusion
Presentation Transcript

  • Towards Cross-Community Information Diffusion Maximisation
      Václav Belák, Samantha Lam, Conor Hayes
      Digital Enterprise Research Institute (www.deri.ie), Enabling Networked Knowledge
      © Copyright 2011 Digital Enterprise Research Institute. All rights reserved.
  • Motivation
      • Information cascades are of high interest in marketing, CRM, etc.
      • A common approach is to maximise information diffusion by targeting influential actors.
      • In many online communities (e.g. discussion fora), information is shared with the community as a whole rather than with individual actors.
      • [Figure: the common case, targeting individuals, versus the cross-community case, targeting communities.]
  • Objectives
      • Our main hypothesis is that a message can be spread efficiently over the information flow network by targeting highly influential communities.
      • The main problem is then formulated as predicting the set of communities to target such that the message spreads over the network as much as possible:
          • Spread over the actors, i.e. user activation fraction
          • Spread over the communities, i.e. community activation fraction
  • Methods: Definition of Impact
      • We propose (Belák et al., '12) to take two factors into account:
          1. the degree of community membership of the users
          2. the centrality of the users within each community
      • The impact of community A on community B is defined as the average centrality of actors from A within B, weighted by their membership in A.
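The impact definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide does not give the exact formula (it is in Belák et al., '12), so the normalisation by total membership weight, the treatment of missing users as having zero centrality, and the names `impact`, `membership_a` and `centrality_in_b` are all hypothetical.

```python
def impact(membership_a, centrality_in_b):
    """Membership-weighted average centrality of A's actors within B.

    membership_a:    dict user -> degree of membership in community A (>= 0)
    centrality_in_b: dict user -> centrality of that user within community B
    Assumption: users without a centrality value in B count as 0.
    """
    total_weight = sum(membership_a.values())
    if total_weight == 0:
        return 0.0  # A has no members, so it has no impact on B
    weighted = sum(w * centrality_in_b.get(user, 0.0)
                   for user, w in membership_a.items())
    return weighted / total_weight


# Toy data (made up): two members of A with different membership degrees.
membership_a = {"ann": 0.75, "bob": 0.25}
centrality_in_b = {"ann": 0.5, "bob": 1.0, "cat": 0.25}
score = impact(membership_a, centrality_in_b)
```

Evaluating this for every ordered pair of communities would give the impact matrix that the next slide builds on.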
  • Methods: Targeting Communities
      • The level of dispersion (heterogeneity) of the total impact of community i can be measured as the entropy of the i-th row/column of the impact matrix.
      • We propose to target communities by the product of the total impact of community i and its entropy: impact focus (IF).
      • We simulated the diffusion by extending the Independent Cascade (ICM) and Linear Threshold (LTM) Models (Kempe et al., '03):
          1. Take q target communities and sample s users from each of them
          2. Run the original models from the union of the sampled users
      • The information diffusion network is derived from the reply-to network: when j replies to i (weight r_ji), information flows from i to j (weight w_ij).
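As a rough sketch of the impact focus ranking described above, the following Python computes, for each community, the product of its total impact and the entropy of its row of the impact matrix. Several details are assumptions rather than the authors' definitions: total impact is taken as the row sum, the entropy is Shannon entropy (natural logarithm) of the normalised row, communities are ranked in descending order of the score, and the function names and toy matrix are invented for illustration.

```python
import math


def row_entropy(row):
    """Shannon entropy (natural log) of a non-negative row, normalised to sum to 1."""
    total = sum(row)
    if total == 0:
        return 0.0
    probs = [x / total for x in row if x > 0]
    return sum(-p * math.log(p) for p in probs)


def impact_focus(impact_matrix):
    """Score per community: (row sum, assumed total impact) * (row entropy)."""
    return [sum(row) * row_entropy(row) for row in impact_matrix]


def top_q(scores, q):
    """Indices of the q highest-scoring communities."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:q]


# Toy impact matrix (made up): entry [i][j] is the impact of community i on j.
impact_matrix = [
    [0.0, 0.4, 0.4],  # impact spread evenly over two other communities
    [0.9, 0.0, 0.0],  # larger total impact, but concentrated on one community
    [0.1, 0.1, 0.0],  # evenly spread, but small in total
]
scores = impact_focus(impact_matrix)
targets = top_q(scores, 1)
```

In this toy example community 1 has the largest total impact but zero entropy, so the product favours community 0, whose impact is both sizeable and dispersed.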
  • Evaluation Strategy
      • IF is compared with random targeting (R) and group in-degree (GI) (Everett & Borgatti, '99).
      • The main aim was to investigate the robustness of our framework with respect to:
          • Character of the system
          • Diffusion models
          • User and community activation fractions
      • Procedural outline:
          1. Target q communities using one of the heuristics, evaluated on the data from time-slice t
          2. Run the diffusion model on the network from time-slice t+1
          3. Compute the average user and community spread over all pairs (t, t+1)
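The procedural outline above, combined with the community-seeded Independent Cascade extension from the previous slide, might look roughly like the sketch below. It is not the authors' implementation. Assumptions: seeds are sampled from the targeted communities' membership at t+1, a community counts as activated when at least one of its members is active, and the data layout (`slices` as a list of `(communities, out_edges)` pairs) and all function names are hypothetical.

```python
import random


def independent_cascade(out_edges, seeds, rng):
    """Standard ICM: each newly active node gets one chance per out-neighbour.

    out_edges: dict node -> list of (neighbour, activation probability)
    """
    active = set(seeds)
    frontier = list(active)
    while frontier:
        newly_active = []
        for node in frontier:
            for neighbour, prob in out_edges.get(node, []):
                if neighbour not in active and rng.random() < prob:
                    active.add(neighbour)
                    newly_active.append(neighbour)
        frontier = newly_active
    return active


def sample_seeds(communities, targets, s, rng):
    """Union of up to s users sampled from each targeted community."""
    seeds = set()
    for c in targets:
        members = sorted(communities.get(c, ()))
        seeds.update(rng.sample(members, min(s, len(members))))
    return seeds


def evaluate(slices, choose_targets, q, s, rng):
    """Average user and community activation fractions over all pairs (t, t+1)."""
    user_fracs, comm_fracs = [], []
    for (comms_t, edges_t), (comms_next, edges_next) in zip(slices, slices[1:]):
        targets = choose_targets(comms_t, edges_t, q)        # heuristic on slice t
        seeds = sample_seeds(comms_next, targets, s, rng)    # assumption: sample at t+1
        active = independent_cascade(edges_next, seeds, rng)  # diffusion on slice t+1
        users = set().union(*comms_next.values())
        if not users:
            continue
        user_fracs.append(len(active & users) / len(users))
        hit = sum(1 for members in comms_next.values() if members & active)
        comm_fracs.append(hit / len(comms_next))
    n = len(user_fracs)
    return (sum(user_fracs) / n, sum(comm_fracs) / n) if n else (0.0, 0.0)


# Toy data (made up): edge probabilities of 1.0 make the cascade deterministic.
communities = {"a": {1, 2}, "b": {3}, "c": {4}}
edges = {1: [(2, 1.0)], 2: [(3, 1.0)]}
slices = [(communities, edges), (communities, edges)]
pick_a = lambda comms, out_edges, q: ["a"][:q]  # stand-in for IF, GI or R
u, c = evaluate(slices, pick_a, q=1, s=2, rng=random.Random(0))
```

Passing the targeting heuristic as a function keeps the loop identical for IF, GI and random targeting, so only `choose_targets` changes between runs. A Linear Threshold variant would replace `independent_cascade` and leave the rest unchanged.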
  • Evaluation Data-Sets
      • Boards: 51 weeks of data from the largest Irish discussion board system
          • Segmented using a 1-week sliding window
          • A 1-week window represents approx. 84% of cross-fora posting activity
          • 540 communities, 5.3k users per snapshot (avg)
      • SAP: 5 years of data from the technical support fora of SAP
          • Used only for the diffusion experiments
          • Segmented using a 2-month sliding window
          • A 2-month window represents approx. 50% of cross-fora posting activity
          • 33 communities, 2k users per snapshot (avg)
  • User Activation Fraction: one targeted community
      [Figure: two panels, "q=1, Boards-LTM" and "q=1, SAP-LTM". x-axis: user sample size (s); y-axis: mean user activation fraction (u); series: IF, GI, R.]
  • Community Activation Fraction: one targeted community
      [Figure: two panels, "q=1, Boards-LTM" and "q=1, SAP-LTM". x-axis: user sample size (s); y-axis: mean community activation fraction (c); series: IF, GI, R.]
  • Community Activation Fraction: five targeted communities
      [Figure: two panels, "q=5, Boards-LTM" and "q=5, SAP-LTM". x-axis: user sample size (s); y-axis: mean community activation fraction (c); series: IF, GI, R.]
  • Results Highlights
      • The diffusion process became saturated at approximately 80% of users or communities in Boards, and 30% in SAP.
          • It is more efficient to target few communities.
      • Impact Focus outperformed the other two strategies with respect to both user and community activation fractions, in particular for a small number of targeted communities (i.e. [1, 2]) and seed users (i.e. [1, 20]).
          • Diminishing returns
      • For a high number of targeted communities and seed users, the random strategy outperformed the other two with respect to community activation fractions in the SAP data-set.
          • The SAP network is fragmented into many small components, which made it hard to reach peripheral communities.
  • Conclusion
      • The evaluation demonstrated that the framework
          • is able to identify highly influential communities
          • can predict which communities to target such that the message spreads efficiently over both individual users and communities
      • We aim to extend it with content analysis
          • E.g. what are the most influential communities with respect to a particular topic?
      • We will also investigate empirically observed topic cascades and modify our models accordingly if needed.
  • Questions?
      References
      • V. Belák, S. Lam, and C. Hayes. Cross-Community Influence in Discussion Fora. ICWSM. AAAI, 2012.
      • M. Everett and S. Borgatti. The centrality of groups and classes. J. of Mathematical Sociology, 23(3):181–201, 1999.
      • D. Kempe, J. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. SIGKDD. ACM, 2003.