STRIP: stream learning of influence probabilities.

2,028 views
1,893 views

Published on

Influence-driven diffusion of information is a fundamental process in social networks. Learning the latent variables of such process, i.e., the influence strength along each link, is a central question towards understanding the structure and function of complex networks, modeling information cascades, and developing applications such as viral marketing.

Motivated by modern microblogging platforms, such as twitter, in this paper we study the problem of learning influence probabilities in a data-stream scenario, in which the network topology is relatively stable and the challenge of a learning algorithm is to keep up with a continuous stream of tweets using a small amount of time and memory. Our contribution is a number of randomized approximation algorithms, categorized according to the available space (superlinear, linear, and sublinear in the number of nodes n) and according to different models (landmark and sliding window). Among several results, we show that we can learn influence probabilities with one pass over the data, using O(nlog n) space, in both the landmark model and the sliding-window model, and we further show that our algorithm is within a logarithmic factor of optimal.

For truly large graphs, when one needs to operate with sublinear space, we show that we can still learn influence probabilities in one pass, assuming that we restrict our attention to the most active users.

Our thorough experimental evaluation on large social graph demonstrates that the empirical performance of our algorithms agrees with that predicted by the theory.

Published in: Technology, Education
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
2,028
On SlideShare
0
From Embeds
0
Number of Embeds
18
Actions
Shares
0
Downloads
28
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

STRIP: stream learning of influence probabilities.

  1. 1. STRIPStream Learning of Influence Probabilities Konstantin Kutzkov – IT University of Copenhagen, Denmark Albert Bifet, Francesco Bonchi – Yahoo! Labs, Barcelona Aristides Gionis – Aalto University and HIIT, Finland 1 KDD 2013
  2. 2. Problem: Learning Influence Probabilities • Learning influence strength along links in social networks • Real-time interactions – Small amount of time and memory – Only one pass The general Propagation log Social graph Learnprobabilities
  3. 3. Data Stream Learning • Sequence is potentially infinite • High amount of data, high speed of arrival • Change over time • Process elements from a data stream in only one pass • Approximation algorithms – Small error rate with high probability
  4. 4. Social graph and log of propagations We have 2 pieces of input data: (1) social graph and (2) a log of past propagations u12 u45 u45 follows u12 – arc u12→ u45 I liked this movie I read this article great movie 09:30 09:00 Action Node Time a u12 1 a u45 2 a u32 3 a u76 8 b u32 1 b u45 3 b u98 7
  5. 5. Learning Influence Probabilities • STRIP – Streaming methods • approximate solutions – Superlinear space – Linear space – Sublinear space • Landmark and sliding window Independent Casca Every arc (u,v) has associated the prob Time proceeds in di At time t, nodes that became active at neighbors, and succeed 15 .1 .1 .1 .1 .1 .2 .2 .2 .2 .3 .3 .3 .3 .4 .4 .4 .4 .4 .1 .1 b a c f e d g h i
  6. 6. The problem • Given a social network with nodes being the users and directed edges the social relations among them. • For two users u and v connected by an edge (u, v), let pu,v ϵ [0,1] be the influence of u on v. • In the Independent Cascade Model the influence is the probability that an action propagates from user u to user v. • The problem of finding the influential users has been widely studied in viral marketing. (Domingos and Richardson KDD’01, Kempe et al. KDD’03) 6
  7. 7. But how do we know the influence probabilities? • It is usually assumed the influence probabilities are known in advance. • However, in real-life (social) networks the relations and influence probabilities are not known in advance. • The problem of learning the probabilities was first considered in Goyal et al. WSDM’10. • In particular the following definition was shown to give a good prediction of propagation: Let Au2v(t) be the number of actions that propagated from user u to user v within time t and Au|v – the total number of actions performed by either u or v. Then pu,v = Au2v(t)/Au|v 7
  8. 8. First solution. Store the whole social graph in memory. • Count exactly for each user u how many actions propagate to its neighbor v within time t, i.e., compute exactly Au2v(t). (Means we have to store all actions performed at most t time units ago.) • Estimate the total number of actions performed by either u or v, i.e., estimate Au|v. (Set size estimation in a streaming setting, many efficient algorithms.) 9
  9. 9. 10 Experiments with Twitter dataset. “+” Very good approximation, storing the network in RAM does not dominate the space usage (real graphs are sparse), for small time threshold t good space usage. “-” Slow processing time per action (one has to check all neighbors), bad space usage for large t. Experiments with Twitter Dataset
  10. 10. Second solution. (Not storing the whole social graph in main memory.) Lower bound: 11 • Intuitively, u and v can perform just one action and have influence probability 1. • We thus need to somehow store information for each user. • Formally, using a reduction from the Set Disjointness problem we show the following result: Let the number of users be n. Any streaming algorithm which distinguishes with probability at least 2/3 between the cases (i) there exists an edge with influence probability 1/2 (ii) all edges have influence probability at most 1/3, needs Ω(n) bits.
  11. 11. Second solution. (Not storing the whole social graph in main memory.) • The definition pu,v = Au2v(t)/Au|v is almost identical to Jaccard coefficient for estimating the similarity between sets A and B. • Jaccard(A, B) = |A∩B|/|A U B| • Min-wise independent hashing (Broder et al., 2000) has become the de facto standard technique for estimating the Jaccard similarity in data streams. 12
  12. 12. Min-wise independent hashing. Example. Similarity in market baskets among supermarket customers. 13 Alice Bob
  13. 13. 14 Min-wise independent hashing. Example. Let h: Items  [0,1] be a random hash function. Find the 4 items bought by either Alice or Bob with the smallest hash values: i) find the 4 smallest hash values for Alice, and the 4 for Bob. ii) Aggregate the results. iii) Estimate the similarity.
  14. 14. 15 Min-wise independent hashing. Example. Let h: Items  [0,1] be a random hash function. Find the 4 items bought by either Alice or Bob with the smallest hash values: i) find the 4 smallest hash values for Alice, and the 4 for Bob. ii) Aggregate the results. iii) Estimate the similarity. Alice Bob h( ) = 0.06 h( ) = 0.09 h( ) = 0.1 h( ) = 0.12 h( ) = 0.03 h( ) = 0.07 h( ) = 0.09 h( ) = 0.096
  15. 15. Min-wise independent hashing. Example. 16 Let h: Items  [0,1] be a random hash function. Find the 4 items bought by either Alice or Bob with the smallest hash values: i) find the 4 smallest hash values for Alice, and the 4 for Bob. ii) Aggregate the results. iii) Estimate the similarity. Alice or Bob: h( ) = 0.03 h( ) = 0.07 h( ) = 0.09h( ) = 0.06
  16. 16. Min-wise independent hashing. Example. 17 Let h: Items  [0,1] be a random hash function. Find the 4 items bought by either Alice or Bob with the smallest hash values: i) find the 4 smallest hash values for Alice, and the 4 for Bob. ii) Aggregate the results. iii) Estimate the similarity. Among the 4 items with the smallest hash values, only is bought by Alice and Bob, thus the estimated similarity is 1/4.
  17. 17. Second solution. Min-wise independent hashing. • Store the k actions with the smallest hash values for each user. • After processing the stream, for an edge (u, v) we count the number of actions among the k actions with the smallest hash values that propagated from u to v within time t. 18 For all edges (u, v) with constant influence probability pu,v, we can obtain a constant factor approximation with probability 2/3 using O(n log n) bits and constant processing time per action.
  18. 18. 19 Experiments with Twitter dataset. “+” Good space usage, fast processing time, reasonably good approximation. “-” One achieves really good space savings for large streams (average number of actions per user is a few thousands). Experiments with Twitter Dataset
  19. 19. Third solution. (Estimate influence only for active users.) • Often, the most interesting users are the most active users. • By sampling only actions a with h(a)<α, for some α = α(k) ϵ [0,1], we can collect the min-wise samples for the k most active users. • Thus, we can estimate influence probabilities pu,v for users u and v who are among the k most active users. • Space efficient for small k and highly skewed user activity. 20
  20. 20. Extending the algorithms to sliding windows. • Influence probabilities may change over time. • Streaming Sliding Window: Exponential histograms technique for bit-counting in data streams (Datar et al. SIAMCOMP, 2002) • Essentially, we obtain the same results, at the price of multiplicative polylog-factors in the time and space complexity. 21
  21. 21. Conclusions and future directions. • The first algorithms for learning influence probabilities in data streams. • We performed only a high-level evaluation of our algorithms, confirming the theoretical findings. • We plan to extend the algorithms to handle adaptive size windows such that the user does not have to decide in advance the optimal window size. • The min-wise independent hashing approach can be distributed such that each processor processes only a subset of the users. 22
  22. 22. STRIP: Stream Learning of Influence Probabilities Thank you! 23

×