Motivation
Digital Enterprise Research Institute www.deri.ie
• Many companies have started to use online communities as a means of communicating with and targeting their customers
• A common approach is to maximise information diffusion by targeting influential actors
• In many online communities (e.g. discussion fora), however, information is shared with the community as a whole rather than with individual actors
Objectives
• Our main hypothesis is that a message can be spread efficiently over the information flow network by targeting highly influential communities
• We derive the information flow network from the reply-to network between the actors
• The main problem is then formulated as predicting the set of communities to target such that the message spreads over the network as widely as possible
Methods: Definition of Impact
[Figure: example with users a to g in two fora, forum A and forum B]
• We propose (Belák et al., ’12) to take two factors into account:
  1. the degree of community membership of the users
  2. the centrality of the users within each community
     • we used in-degree (number of replies of a user)
• For the general case of n users and k communities, define:
  • an n × k membership matrix M
  • an n × k centrality matrix C
• The cross-community k × k impact matrix J is then obtained as the product of the transposed membership matrix and the centrality matrix: $J = M^{\top} C$
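The impact matrix definition can be illustrated with a minimal sketch in plain Python. The function name `impact_matrix` and the toy values of M and C (4 users, 2 communities) are hypothetical and chosen only for illustration; they are not taken from the evaluation data.

```python
def impact_matrix(M, C):
    """Return J = M^T C for an n x k membership matrix M and an
    n x k centrality matrix C, both given as lists of rows."""
    if len(M) != len(C) or any(len(m) != len(c) for m, c in zip(M, C)):
        raise ValueError("M and C must have the same n x k shape")
    n, k = len(M), len(M[0])
    # J[i][j] sums, over all users u, the membership of u in community i
    # multiplied by the centrality of u within community j.
    return [[sum(M[u][i] * C[u][j] for u in range(n)) for j in range(k)]
            for i in range(k)]


# Hypothetical toy data: rows are users, columns are communities.
M = [[1.0, 0.0],   # user 0 belongs only to community 0
     [0.5, 0.5],   # user 1 is split between both communities
     [0.0, 1.0],
     [0.0, 1.0]]
C = [[3.0, 0.0],   # assumed in-degree of each user within each community
     [1.0, 2.0],
     [0.0, 4.0],
     [0.0, 1.0]]

J = impact_matrix(M, C)
for row in J:
    print(row)
```

With NumPy arrays the same product could be written as `M.T @ C`; the explicit loops are used here only to keep the sketch dependency-free.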
Methods: Targeting Communities
• The level of dispersion (heterogeneity) of the total impact of community i can be measured as the entropy of the i-th row/column of the impact matrix
  • Is a community broadly influential, or does it influence only a few other communities?
• We propose to target communities by the product of the total impact of community i and its entropy: impact focus (IF)
• IF is compared with random targeting (R) and group in-degree (GI) (Everett & Borgatti, ’99)
• We simulated the diffusion by extending the Independent Cascade Model (ICM) (Kempe et al., ’03):
  1. Take q target communities and sample s users from each of them
  2. Run the original ICM from the union of the sampled users
• Performance is measured by the fraction of all users that have been activated during the simulation
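The targeting and simulation steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: total impact is taken as the row sum of J, entropy is the Shannon entropy of the normalised row, every edge shares a single activation probability p, and `graph[u]` lists the users that u can pass information to. All function names and the toy data are hypothetical.

```python
import math
import random


def impact_focus(J):
    """Score community i as (row sum of J) * (Shannon entropy of row i).
    Using rows and base-2 logarithms is an assumption of this sketch."""
    scores = []
    for row in J:
        total = sum(row)
        # Normalise the row to a distribution; zero entries add no entropy.
        probs = [x / total for x in row if x > 0] if total > 0 else []
        entropy = -sum(p * math.log2(p) for p in probs)
        scores.append(total * entropy)
    return scores


def independent_cascade(graph, seeds, p, rng):
    """Original ICM: each newly activated user gets exactly one chance to
    activate each inactive neighbour, succeeding with probability p."""
    active = set(seeds)
    frontier = sorted(active)
    while frontier:
        newly_active = []
        for u in frontier:
            for v in graph.get(u, ()):
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return active


def targeted_cascade(graph, members, J, q, s, p, rng):
    """Pick the q communities with the highest impact focus, sample up to s
    seed users from each, run ICM from their union, and return the
    fraction of all users that ended up activated."""
    scores = impact_focus(J)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    seeds = set()
    for community in ranked[:q]:
        pool = sorted(members[community])
        seeds.update(rng.sample(pool, min(s, len(pool))))
    users = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    users |= {u for group in members for u in group}
    active = independent_cascade(graph, seeds, p, rng)
    return len(active) / len(users)


# Hypothetical toy data: 5 users, 2 communities.
graph = {"a": ["b"], "b": ["c"], "c": [], "d": ["e"], "e": []}
members = [["a", "b"], ["d", "e"]]
J = [[2.0, 2.0], [4.0, 0.0]]  # equal total impact, different dispersion

fraction = targeted_cascade(graph, members, J, q=1, s=1, p=0.5,
                            rng=random.Random(0))
print(f"activated fraction: {fraction:.2f}")
```

In the toy matrix both communities have the same total impact, but only the first spreads it across two communities, so it receives the higher impact focus and is targeted first. Passing a seeded `random.Random` instance keeps runs repeatable; averaging the returned fraction over many runs would approximate the mean activation fraction.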
Evaluation Data-Set
• 51 weeks of data from the largest Irish discussion board system
• Segmented using a 1-week sliding window
• A 1-week window represents approx. 84% of cross-fora posting activity
• 540 communities in total
• 5,298 nodes per snapshot on average
• 26,484 edges per snapshot on average
Results: Avg. Performance
• Impact focus outperformed the other two strategies, notably for small numbers of targeted communities and of seed users sampled from them
• The diffusion process saturated on average at approx. 60% of the users activated
[Figure: five panels, one per number of targeted communities (q = 1 to 5), plotting mean activation fraction (a) against user sample size (s) for IF, GI, and R]
Results: IF outperforms GI, R
[Figure: two panels, for user sample size s = 1 and s = 12, showing the activation fraction (a) for IF, GI, and R]
Conclusion
• The evaluation demonstrated that the framework
  • is able to identify highly influential communities
  • can predict which communities to stimulate (e.g. by posting a message) such that the stimulus spreads efficiently
• We aim to extend it with content analysis
  • E.g. what are the most influential communities with respect to a particular topic?
• We will also investigate empirically observed topic cascades and modify our models accordingly if needed
References
• V. Belák, S. Lam, and C. Hayes. Cross-Community Influence in Discussion Fora. ICWSM. AAAI, 2012.
• M. Everett and S. Borgatti. The centrality of groups and classes. J. of Mathematical Sociology, 23(3):181–201, 1999.
• D. Kempe, J. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. SIGKDD. ACM, 2003.