Targeting Communities to Maximise Information Diffusion

  1. Targeting Communities to Maximise Information Diffusion
     Václav Belák, Samantha Lam, Conor Hayes
     Digital Enterprise Research Institute
     © Copyright 2010 Digital Enterprise Research Institute. All rights reserved.
  2. Motivation
     •  Many companies have started to use online communities as a means of communicating with and targeting their customers
     •  A common approach is to maximise information diffusion by targeting influential actors
     •  In many online communities (e.g. discussion fora), however, information is shared with the community as a whole and not with individual actors
  3. Objectives
     •  Our main hypothesis is that a message can be spread efficiently over the information flow network by targeting highly influential communities
     •  We derive the information flow network from the reply-to network between the actors
     •  The main problem is then formulated as predicting the set of communities to target such that the message spreads over as much of the network as possible
  4. Methods: Definition of Impact
     [Figure: example network of users a to g spread across forum A and forum B]
     •  We propose (Belák et al., ’12) to take two factors into account:
        1.  the degree of community membership of the users
        2.  the centrality of the users within each community
            •  we used in-degree (the number of replies a user receives)
     •  For the general case of n users and k communities, define:
        •  an n × k membership matrix M
        •  an n × k centrality matrix C
     •  The cross-community k × k impact matrix J is then obtained as the product of the two matrices: $J = M^{T} C$
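The product J = M^T C from the slide above can be illustrated with a minimal Python sketch. The toy membership and centrality values and the helper name `impact_matrix` are assumptions made for this example only; the slides do not say how M and C are normalised, so no normalisation is applied here.

```python
def impact_matrix(M, C):
    """Return J = M^T C for two n x k matrices given as lists of rows."""
    n = len(M)
    k = len(M[0])
    if len(C) != n or any(len(row) != k for row in M + C):
        raise ValueError("M and C must both be n x k")
    # J[i][j] sums, over all users u, membership of u in community i
    # times centrality of u in community j.
    return [[sum(M[u][i] * C[u][j] for u in range(n)) for j in range(k)]
            for i in range(k)]

# Hypothetical toy data: 3 users, 2 communities.
M = [[1.0, 0.0],   # user 0 belongs only to community 0
     [0.5, 0.5],   # user 1 is split evenly between both
     [0.0, 1.0]]   # user 2 belongs only to community 1
C = [[2, 0],       # in-degree of each user within each community
     [1, 3],
     [0, 4]]

J = impact_matrix(M, C)
print(J)  # [[2.5, 1.5], [0.5, 5.5]]
```

Each entry J[i][j] accumulates membership in community i weighted by centrality in community j, so the off-diagonal entries capture cross-community impact.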
  5. Methods: Targeting Communities
     •  The level of dispersion (heterogeneity) of the total impact of community i can be measured as the entropy of the i-th row/column of the impact matrix
        •  Is a community broadly influential, or does it influence only a few other communities?
     •  We propose to target communities by the product of the total impact of community i and its entropy: impact focus (IF)
     •  IF was compared with random targeting (R) and group in-degree (GI) (Everett & Borgatti, ’99)
     •  We simulated the diffusion by extending the Independent Cascade Model (ICM) (Kempe et al., ’03):
        1.  Take q target communities and sample s users from each of them
        2.  Run the original ICM from the union of the sampled users
     •  Performance was measured by the fraction of all users that were activated during the simulation
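The targeting and simulation steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: total impact is taken as the row sum of J, entropy is the Shannon entropy (natural log) of the normalised row, every edge shares one activation probability `p`, and edges point in the direction information flows. All function names and the toy data are hypothetical.

```python
import math
import random

def impact_focus(J):
    """Score each community: (row sum of J) * (entropy of the normalised row)."""
    scores = []
    for row in J:
        total = sum(row)
        probs = [x / total for x in row if x > 0] if total > 0 else []
        entropy = sum(-p * math.log(p) for p in probs)
        scores.append(total * entropy)
    return scores

def independent_cascade(graph, seeds, p, rng):
    """Plain ICM: a newly active node gets one try per inactive out-neighbour."""
    active = set(seeds)
    frontier = list(active)
    while frontier:
        newly_active = []
        for u in frontier:
            for v in graph.get(u, ()):
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return active

def simulate(graph, members, scores, q, s, p, rng):
    """Seed s users from each of the q top-scored communities, then run ICM."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    seeds = set()
    for i in ranked[:q]:
        pool = sorted(members[i])
        seeds.update(rng.sample(pool, min(s, len(pool))))
    active = independent_cascade(graph, seeds, p, rng)
    users = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    return len(active) / len(users)  # activation fraction

# Hypothetical toy data: 5 users, 2 communities.
J = [[2.5, 1.5], [0.5, 5.5]]
graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": ["e"], "e": []}
members = [{"a", "b"}, {"c", "d", "e"}]

scores = impact_focus(J)
fraction = simulate(graph, members, scores, q=1, s=2, p=0.5, rng=random.Random(0))
print(scores, fraction)
```

In this toy example community 0 has the smaller total impact (4 versus 6) but the higher impact focus, because its impact is spread more evenly across communities. Random targeting (R) would correspond to replacing `ranked` with a random permutation of the communities.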
  6. Evaluation Data-Set
     •  51 weeks of data from the largest Irish discussion board system
     •  Segmented using a 1-week sliding window
        •  A 1-week window represents approx. 84% of cross-fora posting activity
     •  540 communities in total
     •  5,298 avg. nodes per snapshot
     •  26,484 avg. edges per snapshot
  7. Results: Avg. Performance
     •  Impact focus outperformed the other two methods, particularly for small numbers of targeted communities and of seed users sampled from them
     •  The diffusion process became saturated, on average, at approx. 60% of the users activated
     [Figure: five panels, one per number of targeted communities (q = 1 to 5); x-axis: user sample size (s), ticks from 2 to 14; y-axis: mean activation fraction (a), ticks from 0.1 to 0.6; series: IF, GI, R]
  8. Results: IF outperforms GI, R
     [Figure: two panels, for user sample size s = 1 and s = 12; y-axis: activation fraction (a); x-axis: targeting method (IF, GI, R)]
  9. Conclusion
     •  The evaluation demonstrated that the framework
        •  is able to identify highly influential communities
        •  can predict which communities to stimulate (e.g. by posting a message) such that the stimulus spreads efficiently
     •  We aim to extend it with content analysis
        •  E.g. what are the most influential communities with respect to a particular topic?
     •  We will also investigate empirically observed topic cascades and modify our models accordingly if needed
     References
     •  V. Belák, S. Lam, and C. Hayes. Cross-Community Influence in Discussion Fora. ICWSM. AAAI, 2012.
     •  M. Everett and S. Borgatti. The centrality of groups and classes. J. of Mathematical Sociology, 23(3):181–201, 1999.
     •  D. Kempe, J. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. SIGKDD. ACM, 2003.