
Statistical Network Models: How to Leverage Network Structure to Improve Estimation & Prediction


Networks are a natural representation of complex systems across domains, from social, to biological, to technological. Modeling the dynamics and structure of networks is central to understanding these systems and has a huge economic impact. However, the characteristics of network data captured from online complex systems present a number of challenges to the design and evaluation of machine learning methods. This talk outlines some of the important algorithmic and evaluation challenges that arise from the massive size, streaming nature, and heterogeneous structure of complex networks, and discusses statistical online methods, unbiased and shrinkage estimators, as well as learning techniques to address some of these challenges.


  1. 2nd Workshop on Offline & Online Evaluation of Interactive Systems @KDD 2019
  2. Graphs – a rich and powerful data representation. Examples across domains: social networks, the Human Disease Network [Barabasi 2007], food webs [2007], terrorist networks [Krebs 2002], the Internet AS graph [2005], gene regulatory networks [Decourty 2008], protein interactions (breast cancer), political blogs, power grids.
  3. Heterogeneous graphs: graphs with multiple node/link types, strength attributes, and weights. Examples: user-item, document-word, and question-answer graphs; Wiki-talk; user edits. What is the impact of heterogeneous structure on prediction and error?
  4. Streaming/temporal graphs from online interactive systems. Many examples: emails, user logs, weblogs, transactions, online social networks, news feeds, Q&A forums. How to handle concept drift, partial observability, and higher-order relationships across time?
  5. (1) Statistical Online/Streaming Methods (2) Graph Representation Learning
  6. Streaming Model
  7. Streaming model. Discrete-time models represent a dynamic network as a sequence of static snapshot graphs, aggregated over user-defined time intervals (e.g., t=1 over [1, 5], t=2 over [6, 10]).
  8. Streaming model. Discrete-time models represent a dynamic network as a sequence of static snapshot graphs, aggregated over user-defined time intervals (e.g., t=1 over [1, 5], t=2 over [6, 10]). Drawbacks of the discrete-time model: a very coarse representation with noise/error problems; difficult to manage at large scale; misses higher-order dependencies.
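To make the snapshot aggregation concrete, here is a minimal Python sketch (the toy edge stream and the bucketing rule are my own illustration, not from the talk) that groups a timestamped edge stream into discrete-time snapshot graphs:

```python
# Minimal sketch: bucket a timestamped edge stream into snapshot graphs.
# The toy data and interval choice are illustrative assumptions.
from collections import defaultdict

def snapshots(edge_stream, interval):
    """Group (u, v, t) edges into static snapshots; times 1..interval fall
    into snapshot 0, interval+1..2*interval into snapshot 1, and so on."""
    snaps = defaultdict(set)
    for u, v, t in edge_stream:
        snaps[(t - 1) // interval].add((u, v))
    return dict(snaps)

stream = [("A", "B", 1), ("A", "K", 3), ("B", "C", 6), ("A", "B", 9)]
for idx, edges in sorted(snapshots(stream, interval=5).items()):
    print(f"snapshot t={idx + 1}: {sorted(edges)}")  # [1,5] -> t=1, [6,10] -> t=2
```

Note how the interval choice directly controls the coarseness the slide warns about: too wide and the dynamics are averaged away, too narrow and each snapshot becomes sparse and noisy.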
  9. Sketching / sampling / summarization: turn large streaming data into small, manageable, and useful data. From the graph stream over time, maintain a sketch/sample/summary graph S, then apply statistical estimation. Statistical randomized methods with theoretical guarantees: it is not possible to store the entire data stream; it is faster and more convenient to work with a compact summary; updates are incremental and online.
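As a concrete baseline for a fixed-size, incrementally updated summary, here is a sketch of classic uniform reservoir sampling over an edge stream. This is a standard textbook technique, not the talk's estimator; the APS method on the later slides replaces the uniform rule with weighted priorities:

```python
# Uniform reservoir sampling (Algorithm R): keeps a fixed-size sample of an
# unbounded edge stream in one pass with O(m) memory.
import random

def reservoir_sample(edge_stream, m, seed=0):
    rng = random.Random(seed)
    sample = []
    for t, edge in enumerate(edge_stream):
        if len(sample) < m:
            sample.append(edge)          # fill the reservoir first
        else:
            j = rng.randrange(t + 1)     # each edge survives with prob m/(t+1)
            if j < m:
                sample[j] = edge         # evict a uniformly chosen edge
    return sample
```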
  10. Online ML algorithms operate on the sketch/sample/summary graph S: model learning, estimating network structure, network parameter estimation, feature representation, higher-order dependencies, ...
  11. Sampling/sketching design is shaped by: data characteristics (heavy-tailed distributions, correlations, clusters, rare events); query requirements (accuracy, aggregates, top-k ranks, speed, privacy); resource constraints (bandwidth, storage, memory, access constraints).
  12. The sample/sketch is chosen to match the data characteristics (heavy-tailed distributions, correlations, clusters, rare events), query requirements (accuracy, aggregates, top-k ranks, speed, privacy), and resource constraints (bandwidth, storage, memory, access constraints), as well as the goal: parameter estimation, network estimation, data collection, or learning a model.
  13. [Figure: the bias-variance tradeoff. Estimation error (MSE = Bias^2 + Variance) as a function of statistical estimator complexity, with high bias at one extreme, high variance at the other, and the optimal estimator in between.]
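The decomposition MSE = Bias^2 + Variance sketched in the figure can be checked numerically. The simulation below is entirely my own toy example (a deliberately shrunk sample mean); it shows the two components adding up to the estimation error:

```python
# Numeric check of MSE = Bias^2 + Variance for a deliberately biased
# estimator (0.8 * sample mean); all numbers here are illustrative.
import random
import statistics

rng = random.Random(42)
true_mu, n, runs = 10.0, 20, 20000
estimates = []
for _ in range(runs):
    xs = [rng.gauss(true_mu, 5.0) for _ in range(n)]
    estimates.append(0.8 * statistics.fmean(xs))  # biased, but lower variance

mse = statistics.fmean((e - true_mu) ** 2 for e in estimates)
bias2 = (statistics.fmean(estimates) - true_mu) ** 2
var = statistics.pvariance(estimates)
print(f"MSE {mse:.3f} ~ Bias^2 {bias2:.3f} + Var {var:.3f}")
```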
  14. Unbiased edge estimator (Horvitz-Thompson). § Main idea: we define the selection estimator for an edge as $\hat{S}_{i,t} = I(i \in \hat{K}_t)/p_i$, which is unbiased: $E[\hat{S}_{i,t}] = S_{i,t}$. For each subgraph $J \subset [t]$, we define the sequence of subgraph estimators $\hat{S}_{J,t} = \prod_{i \in J} \hat{S}_{i,t}$, also unbiased: $E[\hat{S}_{J,t}] = S_{J,t}$. See paper for proofs. [Ahmed et al. KDD 2014, TKDD 2014, VLDB 2017, IJCAI 2018]
  15. Unbiased edge estimator (Horvitz-Thompson), continued. The sample $\hat{G}_S$ is a proxy for the input graph stream $G$ at time $t$. [Ahmed et al. KDD 2014, TKDD 2014, VLDB 2017, IJCAI 2018]
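A minimal sketch of the Horvitz-Thompson idea on these slides, under the simplifying assumption that edges are sampled independently with known marginal inclusion probabilities p_i (the papers handle the dependent streaming case; the function and variable names here are illustrative):

```python
# Horvitz-Thompson subgraph estimator sketch: if every edge i of subgraph J
# is in the sample, weight the occurrence by prod(1/p_i); otherwise 0.
# Assumes independent edge sampling with known marginal probabilities.
from math import prod

def ht_subgraph_estimate(J, sample, p):
    """J: edge ids of a candidate subgraph (e.g., a triangle);
    sample: set of sampled edge ids; p: edge id -> inclusion probability."""
    if all(i in sample for i in J):
        return prod(1.0 / p[i] for i in J)  # S_hat_J = prod_i I(i in K_hat)/p_i
    return 0.0

# e.g., an unbiased triangle-count estimate over candidate triangles:
# n_hat = sum(ht_subgraph_estimate(tri, sample, p) for tri in triangles)
```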
  16. Adaptive Graph Priority Sampling, APS(m). Input: edge stream $k_1, k_2, \ldots$; output: sampled edge stream $\hat{K}$; stored state $m = O(|\hat{K}|)$. For each edge $k$: generate a random number $u(k) \sim \mathrm{Uni}(0, 1]$; compute the edge weight $w(k) = W(k, \hat{K})$; compute the edge priority $r(k) = w(k)/u(k)$; insert $\hat{K} = \hat{K} \cup \{k\}$. See paper for algorithm details.
  17. Adaptive Graph Priority Sampling, APS(m), continued. When the sample exceeds its budget: find the edge with the lowest priority, $k^* = \arg\min_{k' \in \hat{K}} r(k')$; update the sample threshold $z^* = \max\{z^*, r(k^*)\}$; remove the lowest-priority edge, $\hat{K} = \hat{K} \setminus \{k^*\}$. Use a priority queue with $O(\log m)$ updates.
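The two APS(m) steps above combine into a single fixed-size priority sampler. The sketch below assumes a constant edge weight W(k, K̂) = 1 for simplicity, whereas the actual method adapts w(k) to the sampled topology (see the paper):

```python
# Fixed-size priority sampling sketch in the spirit of APS(m).
# Simplifying assumption: constant edge weight w(k) = 1; the real method
# computes w(k) = W(k, K_hat) from the sampled topology.
import heapq
import random

def priority_sample(edge_stream, m, seed=0):
    rng = random.Random(seed)
    heap = []        # min-heap of (priority r(k), edge k)
    z_star = 0.0     # sample threshold z*
    for k in edge_stream:
        w = 1.0                            # w(k) = W(k, K_hat), here constant
        r = w / rng.uniform(1e-12, 1.0)    # r(k) = w(k)/u(k), u ~ Uni(0, 1]
        heapq.heappush(heap, (r, k))       # O(log m) insert
        if len(heap) > m:                  # over budget: evict lowest priority
            r_min, _ = heapq.heappop(heap)
            z_star = max(z_star, r_min)    # z* = max{z*, r(k*)}
    # each retained edge k has inclusion probability min(1, w(k)/z_star)
    return [k for _, k in heap], z_star
```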
  18. [Figure, repeated from slide 13: the bias-variance tradeoff. Estimation error (MSE = Bias^2 + Variance) as a function of statistical estimator complexity, with high bias at one extreme, high variance at the other, and the optimal estimator in between.]
  19. § A shrinkage estimator is an estimator that incorporates the effect of shrinkage. § Main idea: an (unbiased) estimator can be improved by combining it with other information, e.g., prior knowledge or other simple estimators. [James & Stein 1992; Gruber 2017]
  20. § A shrinkage estimator is an estimator that incorporates the effect of shrinkage. § Main idea: an (unbiased) estimator can be improved by combining it with other information, e.g., prior knowledge or other simple estimators. § Why would shrunken estimates be better? Shrinkage introduces bias but can significantly decrease the variance; if the variance reduction outweighs the added bias, the estimation error (MSE) decreases. [James & Stein 1992; Gruber 2017]
  21. Estimation error reduction using James-Stein shrinkage (applied for each sampled edge). Unbiased estimators of local subgraph counts suffer high relative variance when the counts are small; more generally, James and Stein observed that unbiased estimators do not necessarily minimize mean squared error. Combine the unbiased estimator $\hat{n}$ with a simple biased estimator $w$ (the maintained edge weight) into the shrinkage/combined estimator $\hat{\eta}_w = \lambda \hat{n} + \bar{\lambda} w$, where $\bar{\lambda} = 1 - \lambda$ is the shrinkage intensity. Minimize the mean squared error (MSE): $L_w(\lambda) = \mathrm{Var}(\hat{\eta}_w) + (E[\hat{\eta}_w] - n)^2 = \lambda^2 \mathrm{Var}(\hat{n}) + \bar{\lambda}^2 \mathrm{Var}(w) + 2\lambda\bar{\lambda}\,\mathrm{Cov}(\hat{n}, w) + \bar{\lambda}^2 (E[\hat{n} - w])^2$. A straightforward computation shows that $L_w(\lambda)$ is minimized when $1 - \lambda = \mathrm{Cov}(\hat{n} - w, \hat{n}) / E[(\hat{n} - w)^2]$; a plug-in estimator $\hat{\lambda}_w$ is obtained by substituting $(\hat{n} - w)^2$ in the denominator and an estimate for $\mathrm{Cov}(\hat{n} - w, \hat{n})$. See paper for proofs. [Ahmed et al. 2019, arXiv:1908.01087]
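A sketch of the combined estimator and its plug-in shrinkage intensity. Here the covariance and second-moment terms are estimated empirically from hypothetical paired estimates; the paper instead derives the covariance relations analytically from the sampling scheme:

```python
# James-Stein-style combined estimator sketch:
# eta_hat = lam * n_hat + (1 - lam) * w, with the plug-in intensity
# 1 - lam = Cov(n_hat - w, n_hat) / E[(n_hat - w)^2].
# The paired inputs and the empirical moment estimates are illustrative.

def _mean(xs):
    return sum(xs) / len(xs)

def js_combined(n_hats, ws):
    """n_hats: unbiased estimates; ws: simple biased estimates (paired)."""
    diffs = [n - w for n, w in zip(n_hats, ws)]
    md, mn = _mean(diffs), _mean(n_hats)
    cov = _mean([(d - md) * (n - mn) for d, n in zip(diffs, n_hats)])
    denom = _mean([d * d for d in diffs])     # plug-in for E[(n_hat - w)^2]
    lam = 1.0 - cov / denom if denom > 0 else 1.0
    lam = min(max(lam, 0.0), 1.0)             # keep a valid convex combination
    return [lam * n + (1.0 - lam) * w for n, w in zip(n_hats, ws)]
```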
  22. Goal: estimate the higher-order motif-weighted network at any time t, e.g., the triangle-weighted graph. [Plots: local edge triangle count vs. top-k edges on soc-livejournal, comparing exact counts against the unbiased estimator (APS, f = 0.40, with upper and lower bounds) and the shrinkage estimator (APS JS, f = 0.40, with upper and lower bounds).] [Ahmed et al. 2019, arXiv:1908.01087]
  23. Goal: estimate the aggregated weighted graph at any time t, e.g., W(A, B) = the number of communications between users A and B. [Plots: edge weight vs. top-k edges on stackoverflow, comparing GPS (f = 0.10) and APS estimates against the ground truth.]
  24. (1) Statistical Online/Streaming Methods ✓ (2) Graph Representation Learning
  25. § Goal: learn representations (features) for a set of graph elements (nodes, edges, etc.). § Key intuition: map the graph elements (e.g., nodes) to a d-dimensional space. § Use the features for any downstream prediction task.
  26. Communities: cohesive subsets of nodes → proximity. Roles: represent structural patterns → structural similarity; two nodes belong to the same role if they have similar structural patterns. [Rossi & Ahmed, TKDE 2015; AAAI 2017]
  27. Existing embedding methods, e.g., DeepWalk, GraRep, node2vec, LINE, etc., offer no guarantee that nearby vertices are structurally similar. [Ahmed et al. IJCAI-StarAI 2018]
  28. A general principled framework for learning graph embeddings (Role2vec) that capture structural similarity (roles): $P(\phi(x_{c_i}) \mid \phi(x_i)) = \prod_{j \in c_i} P(\phi(x_j) \mid \phi(x_i))$. 16.5% avg. improvement in accuracy; space efficient, with 850x space savings. [Ahmed et al. IJCAI-StarAI 2018]
  29. Attributed random walk: $P(\phi(x_{c_i}) \mid \phi(x_i)) = \prod_{j \in c_i} P(\phi(x_j) \mid \phi(x_i))$. A general principled framework for learning graph embeddings (Role2vec) that capture structural similarity (roles); 16.5% avg. improvement in accuracy; space efficient, with 850x space savings. [Ahmed et al. IJCAI-StarAI 2018]
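A minimal sketch of the feature-based (attributed) walk idea behind Role2vec: each node is first mapped to a type via a function phi of its attributes (here a hypothetical degree-binning; the paper allows any feature mapping), and walks emit type sequences rather than node ids, so structurally similar nodes share contexts:

```python
# Attributed random walk sketch: map nodes to types phi(x_v), then emit
# type sequences. The degree-bin phi and all parameters are illustrative.
import random

def attributed_walks(adj, num_walks=10, walk_len=5, seed=0):
    """adj: node -> list of neighbors (non-empty for every node)."""
    rng = random.Random(seed)
    phi = {v: f"deg{min(len(nbrs), 4)}" for v, nbrs in adj.items()}
    walks = []
    for _ in range(num_walks):
        for v in adj:
            walk, cur = [phi[v]], v
            for _ in range(walk_len - 1):
                cur = rng.choice(adj[cur])   # uniform neighbor step
                walk.append(phi[cur])
            walks.append(walk)
    return walks  # feed to any skip-gram trainer (e.g., word2vec) over types
```

Because walks range over the much smaller type vocabulary rather than node ids, the embedding table shrinks accordingly, which is consistent with the space savings claimed on the slide.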
  30. § Statistical online methods for streaming graphs in real-time and interactive systems: evaluation challenges, including bias and variance and their tradeoffs; unbiased estimators; shrinkage estimators to reduce variance. § Graph representation learning: structural similarity vs. proximity; introduced the notion of feature-based walks and a framework for generalizing existing methods based on it.
  31. § Estimation with Quadratic Loss. [James & Stein] – In Breakthroughs in Statistics, 1992
      § Improving Efficiency by Shrinkage: The James-Stein and Ridge Regression Estimators. [Gruber] – Routledge, 2017
      § Graph Sample and Hold. [Ahmed et al.] – ACM SIGKDD 2014
      § Network Sampling: From Static to Streaming Graphs. [Ahmed et al.] – ACM TKDD 2014
      § On Sampling from Massive Graph Streams. [Ahmed et al.] – VLDB 2017
      § Sampling for Approximate Bipartite Network Projection. [Ahmed et al.] – IJCAI 2018
      § Network Shrinkage Estimation. [Ahmed et al.] – arXiv:1908.01087, 2019
      § Learning Role-based Graph Embeddings. [Ahmed et al.] – IJCAI-StarAI 2018
      § Role Discovery in Networks. [Rossi and Ahmed] – TKDE 2015
  32. 2nd Workshop on Offline & Online Evaluation of Interactive Systems @KDD 2019
