
Inside the Atoms: Mining a Network of Networks and Beyond by HangHang Tong at BigMine16

Networks (i.e., graphs) appear in many high-impact applications. Often these networks are collected from different sources, at different times, and at different granularities. In this talk, I will present our recent work on mining such multiple networks. First, we will present two models: one for a set of inter-connected networks (NoN), and the other for a set of inter-connected, co-evolving time series (NoT). For both models, we will show that by treating networks as context, we are able to model more complicated real-world applications. Second, we will present some algorithmic examples of how to do mining with these new models, including ranking, imputation, and prediction. Finally, we will demonstrate the effectiveness of our new models and algorithms in applications including bioinformatics and sensor networks.

Published in: Data & Analytics

  1. Arizona State University
     Inside the Atoms: Mining a Network of Networks and Beyond
     Hanghang Tong
     hanghang.tong@asu.edu
     http://tonghanghang.org
     @KDD BigMine 16: the 5th International Workshop on Big Data, Streams and Heterogeneous Source Mining
  2. Observation: Graphs are everywhere!
     Hospital Networks; US Power Grid; Biological Networks; Collaboration Networks; Traffic Network; Brain Networks
  3. Graph Mining: An Overview
     Levels: graph, subgraph, node/link
     Observation: Mining stops at the node/link (atom) level.
     Q: Is there a level x (x = 4, 5, …)? What is it?
  4. A Motivating Example: Cross-Network Association (e.g., candidate gene prioritization problem)
     §  Problem Definition
        –  Given: (1) two networks P (Phenotype) and G (PPI), and (2) their partial association A;
        –  Find: missing associations in A.
     §  Solutions: Graph Ranking
        –  Given: a green node (disease);
        –  Find: the most relevant blue nodes (genes).
     A powerful primitive in (A1) drug discovery; (A2) social recommendation; (A3) QA post-tagging, etc.
  5. A Motivating Example: Cross-Network Association (e.g., candidate gene prioritization problem)
     §  Problem Definition
        –  Given: (1) two networks P and G, and (2) their partial association A;
        –  Find: missing associations in A.
     §  Solutions: Graph Ranking
        –  Given: a green node (disease);
        –  Find: the most relevant blue nodes (genes).
     §  Limitations: Each green node (disease) might have its own PPI network!
     O. Magger, Y. Y. Waldman, E. Ruppin, and R. Sharan. Enhancing the prioritization of disease-causing genes through tissue specific protein interaction networks. PLoS Computational Biology, 8(9), 2012.
  6. A Motivating Example: Cross-Network Association (e.g., candidate gene prioritization problem)
     One shared PPI network:
     •  A Disease Network P (nodes 1–7)
     •  A PPI Network G (nodes a, b, c, d)
     One PPI network per disease:
     •  A Disease Network P (nodes 1–7)
     •  A set of tissue-specific PPI Networks G1, …, G7 (each over nodes a, b, c, d)
  7. A Set of Networks: More Applications
     Collaborations; System of Systems; Brain Networks; Cyber-Physical Systems
  8. Roadmap
     §  Motivations
     §  NoN: A Network of Networks
        –  NoN Modeling
        –  NoN Mining
     §  Beyond NoN
     §  Some of Our Other Recent Work
  9. Modeling NoN
     §  Q: How to represent a set of inter-connected networks (e.g., Tissue-Specific PPI Networks)?
     [Figure: disease network P (nodes 1–7) linked through A to tissue-specific PPI networks G1, …, G7, each over nodes a, b, c, d]
  10. Introducing the NoN Model
      §  A: each green node (disease) itself is a network
      NoN (A Network of Networks) := a triplet R = <G, A, θ>
      •  G: Main Network (the green, disease-to-disease network)
      •  A: Domain Networks (the blue, tissue-specific PPI networks)
      •  θ: Mapping function (each green, main node → a blue, domain network)
      J. Ni, H. Tong, W. Fan, X. Zhang: Inside the atoms: ranking on a network of networks. KDD 2014
  11. NoN Models: Examples
      Applications      | The Main Network (G)   | Domain Networks (A)
      Gene-Pheno Assoc. | Disease Sim Network    | Tissue-specific PPI Nets
      LBSN              | Geo-proximity network  | Social Networks
      Brain Initiative  | Person-Person Network  | Brain Networks
      Team of Teams     | Project Dependence Net | Team Networks
      Scholarly Data    | Res. Area Sim Network  | Collaboration Networks
      NoN (A Network of Networks) := a triplet R = <G, A, θ>
      •  G: Main Network (the green, disease-to-disease network)
      •  A: Domain Networks (the blue, tissue-specific PPI networks)
      •  θ: Mapping function (each green, main node → a blue, domain network)
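
The triplet R = <G, A, θ> maps naturally onto a small container type. The following is a minimal sketch, not the authors' implementation: the class names `Network` and `NoN`, the network ids `"G1"`/`"G2"`, and the helper `common_nodes` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Hashable, Set, Tuple

Node = Hashable


@dataclass
class Network:
    """An undirected weighted graph: a node set plus an edge-weight map."""
    nodes: Set[Node] = field(default_factory=set)
    edges: Dict[Tuple[Node, Node], float] = field(default_factory=dict)

    def add_edge(self, u: Node, v: Node, w: float = 1.0) -> None:
        self.nodes.update((u, v))
        self.edges[(u, v)] = w
        self.edges[(v, u)] = w  # store both directions to keep it symmetric


@dataclass
class NoN:
    """Hypothetical container for the triplet R = <G, A, theta>."""
    main: Network                # G: the main network
    domains: Dict[str, Network]  # A: domain networks, keyed by an assumed id
    theta: Dict[Node, str]       # theta: main node -> id of its domain network

    def domain_of(self, main_node: Node) -> Network:
        """Return the domain network that theta assigns to a main node."""
        return self.domains[self.theta[main_node]]

    def common_nodes(self, u: Node, v: Node) -> Set[Node]:
        """Nodes shared by the domain networks of main nodes u and v."""
        return self.domain_of(u).nodes & self.domain_of(v).nodes


# Toy instance: two diseases, each with its own tissue-specific PPI network.
main = Network()
main.add_edge("disease1", "disease2", 0.8)
ppi1 = Network()
ppi1.add_edge("a", "b")
ppi1.add_edge("b", "c")
ppi2 = Network()
ppi2.add_edge("a", "c")
ppi2.add_edge("c", "d")
non = NoN(main, {"G1": ppi1, "G2": ppi2}, {"disease1": "G1", "disease2": "G2"})
```

Keeping θ as an explicit dictionary, instead of storing the domain network inside each main node, leaves room for the soft (1-to-many) mappings mentioned under the generalizations.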
  12. NoN - Generalizations
      §  G1: Multi-layered NoN
         –  Candidate Gene Prioritization: Disease-tissue-protein
         –  Geo-social networks: City-district-person
      §  G2: Soft Mapping function θ
         –  1-to-many, or many-to-many
      •  C. Chen, J. He, N. Bliss and H. Tong: "On the Connectivity of Multi-layered Networks: Models, Measures and Optimal Control". ICDM 2015.
  13. NoN vs. Some Popular Multi-Network Models
      §  They are all special cases of our NoN model!
         –  Tensor: a special NoN with
            1)  A full clique main network (G);
            2)  All domain networks (A) sharing the same node sets
         –  Hypergraph: a special NoN with
            1)  All domain networks (A) being empty
         –  Multiplex: a special NoN with
            1)  Two layers
            2)  All domain networks (A) sharing the same node sets
  14. Roadmap
      §  Motivations
      §  NoN: A Network of Networks
         –  NoN Modeling
         –  NoN Mining: Ranking and Clustering
      §  Beyond NoN
      §  Some of Our Other Recent Work
  15. NoN Mining - Ranking
      A1: Given a disease (e.g., P1), what are the most relevant genes (blue nodes)?
      A2: Who is most influential, considering both the within- and cross-area influence?
  16. Ranking on a Single Network
      Background: ranking vector r for query Node 4
      Node:   1     2     3     4     5     6     7     8     9     10    11    12
      Score:  0.13  0.10  0.13  0.22  0.13  0.05  0.05  0.08  0.04  0.03  0.04  0.02
      More red, more relevant. Nearby nodes, higher scores.
      H. Tong, C. Faloutsos, J.-Y. Pan: Fast Random Walk with Restart and Its Applications. ICDM 2006. (Best Paper Award, ICDM 2006; ICDM 2015 10-Year Highest Impact Paper Award)
  17. Ranking on a Single Network
      Background: ranking vector r for query Node 4 (same figure as the previous slide). More red, more relevant. Nearby nodes, higher scores.
      Footnote: "Maxwell Equation" for Web [Soumen Chakrabarti]: $r_i = c A r_i + (1-c)\, e_i$
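
The fixed-point equation r_i = c A r_i + (1-c) e_i can be solved by simple iteration. The sketch below assumes a symmetrically normalized adjacency matrix (D^{-1/2} A D^{-1/2}) and a restart parameter c = 0.85; both are illustrative assumptions, since the slide does not state the normalization or the value of c. The function names `sym_normalize` and `rwr` are hypothetical.

```python
import numpy as np


def sym_normalize(A):
    """Return D^{-1/2} A D^{-1/2} for a symmetric, non-negative adjacency matrix."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    d[d == 0] = 1.0  # isolated nodes: avoid division by zero
    s = 1.0 / np.sqrt(d)
    return A * s[:, None] * s[None, :]


def rwr(A, seed, c=0.85, tol=1e-10, max_iter=1000):
    """Iterate r <- c W r + (1 - c) e until the L1 change drops below tol.

    W is the normalized adjacency and e is the indicator vector of the query
    node `seed`. Because the eigenvalues of W lie in [-1, 1] and 0 < c < 1,
    the update is a contraction and the iteration converges.
    """
    W = sym_normalize(A)
    e = np.zeros(W.shape[0])
    e[seed] = 1.0
    r = e.copy()
    for _ in range(max_iter):
        r_new = c * (W @ r) + (1 - c) * e
        if np.abs(r_new - r).sum() < tol:
            return r_new
        r = r_new
    return r


# Small undirected example graph with 4 nodes; node 0 is the query.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
scores = rwr(A, seed=0)
```

The same vector can be obtained in closed form as (1-c)(I - cW)^{-1} e, which is convenient for checking the iteration on small graphs.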
  18. Ranking on a Single Network
      Background: an optimization viewpoint of the "Maxwell Equation" for Web (symmetric A)
      $r_i = c A r_i + (1-c)\, e_i \iff r_i = \arg\min_{r} \; c\, r^\top (I - A)\, r + (1-c)\, \|r - e_i\|^2$
      First term: network smoothness. Second term: query preference.
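
The equivalence between the fixed-point equation and the minimization can be verified in two steps. This is a short worked derivation under the slide's assumption that A is symmetric, plus the added assumptions that 0 < c < 1 and that the eigenvalues of A lie in [-1, 1] (as they do for a symmetrically normalized adjacency matrix).

```latex
\begin{aligned}
J(r) &= c\, r^\top (I - A)\, r + (1-c)\, \|r - e_i\|^2 \\
\nabla J(r) &= 2c\,(I - A)\, r + 2(1-c)\,(r - e_i) = 0 \\
&\Longrightarrow\; (I - cA)\, r = (1-c)\, e_i \\
&\Longleftrightarrow\; r = c A r + (1-c)\, e_i
\end{aligned}
```

The Hessian is 2(I - cA), which is positive definite under the stated assumptions, so the stationary point is the unique minimizer.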
  19. Ranking on NoN
      §  Optimization Formulation (three terms):
         –  #1: within-network smoothness
         –  #2: query preference
         –  #3: cross-network consistency
      §  Intuition:
         –  An overlapped node should receive similar ranking scores in domain networks i and j if G(i,j) is high.
         –  A set of g correlated random walks
      J. Ni, H. Tong, W. Fan, X. Zhang: Inside the atoms: ranking on a network of networks. KDD 2014
  20. Ranking on NoN
      §  Optimization Formulation (three terms):
         –  #1: within-network smoothness
         –  #2: query preference
         –  #3: cross-network consistency
      §  Equivalence: J(r) = J(r1, …, rg)
         –  Intuition: a single random walk (R.W.) on the integrated graph $\tilde{A}$
         –  Property: J(r) is positive-definite!
  21. Ranking on NoN
      §  Equivalence: J(r) = J(r1, …, rg)
         –  Intuition: one single random walk on the integrated graph $\tilde{A}$
         –  Property: J(r) is positive-definite!
      §  Algorithms:
         –  #1: A linear algorithm → the optimal solution
         –  #2: Any existing fast solution on a single network
         –  #3: Further speedup: O(T(m + ng)) → O(T(g log(g) + z))
            •  g << n and z << m (key idea: using the main network to do pruning)
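
The transcript does not preserve the actual objective, so the following is only an illustrative sketch of how three terms of this kind can combine into one linear system; it is not the formulation from the KDD 2014 paper. It assumes, for simplicity, that all g domain networks share the same n nodes, and it uses the assumed objective J = Σ_i [c r_i'(I - W_i) r_i + (1-c)||r_i - e_i||²] + a Σ_{i<j} G(i,j) ||r_i - r_j||². The weight `a`, the function name `non_rank`, and the normalization are all hypothetical choices.

```python
import numpy as np


def sym_normalize(A):
    """Return D^{-1/2} A D^{-1/2} for a symmetric, non-negative adjacency matrix."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    d[d == 0] = 1.0
    s = 1.0 / np.sqrt(d)
    return A * s[:, None] * s[None, :]


def non_rank(G, domain_adjs, queries, c=0.85, a=0.5):
    """Minimize the assumed three-term objective by solving one linear system.

    G           : (g, g) symmetric main-network weights, zero diagonal.
    domain_adjs : list of g symmetric (n, n) adjacency matrices.
    queries     : (g, n) array; row i is the query vector e_i of network i.
    Setting the gradient to zero gives M r = (1 - c) e with
    M = blockdiag(I - c W_i) + a * kron(L_G, I_n), L_G the Laplacian of G.
    """
    G = np.asarray(G, dtype=float)
    g = len(domain_adjs)
    n = np.asarray(domain_adjs[0]).shape[0]
    L_G = np.diag(G.sum(axis=1)) - G
    M = a * np.kron(L_G, np.eye(n))          # term #3: cross-network consistency
    for i, A in enumerate(domain_adjs):
        blk = slice(i * n, (i + 1) * n)
        M[blk, blk] += np.eye(n) - c * sym_normalize(A)  # terms #1 and #2
    e = np.asarray(queries, dtype=float).reshape(g * n)
    r = np.linalg.solve(M, (1 - c) * e)
    return r.reshape(g, n)


# Two toy domain networks over the same three nodes, linked in the main network.
A1 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
A2 = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]])
G = np.array([[0, 1], [1, 0]])
Q = np.array([[1, 0, 0], [1, 0, 0]])
r_independent = non_rank(G, [A1, A2], Q, a=0.0)
r_coupled = non_rank(G, [A1, A2], Q, a=5.0)
```

With a = 0 the system decouples into g independent single-network rankings; increasing a pulls the rankings of linked networks toward each other. The stacked matrix M plays the role of one integrated graph, which mirrors the "single random walk" intuition, although the integrated graph in the paper may be constructed differently.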
  22. NoN Ranking - Results
      A1: Candidate Gene Prioritization
      •  Which genes are most relevant w.r.t. disease a?
      [Figure: ROC curve comparison]
      A2: Co-authorship Prediction
      •  Which DM authors are most likely to collaborate with a given Med author?
      [Figure: AUC and accuracy]
  23. NoN Mining - Clustering
      §  Obj. Function: similar intuition!
      §  Results: P-value vs. (biologically meaningful) clusters
      J. Ni, H. Tong, W. Fan, X. Zhang: Flexible and Robust Multi-Network Clustering. KDD 2015
  24. Roadmap
      §  Motivations
      §  NoN: A Network of Networks
         –  NoN Modeling
         –  NoN Mining
      §  Beyond NoN: From NoN to NoX
      §  Some of Our Other Recent Work
  25. NoT: A Network of Time Series
      §  Problem Definition
         [Figure: motion-capture marker placement guide]
      §  Models & Algorithms
      §  Results
         [Figure: coordinate vs. frame #; curves: original, DCMF, DMF, dynaMMo]
      •  Y. Cai, H. Tong, W. Fan and P. Ji: Fast Mining of a Network of Coevolving Time Series. SDM 2015.
      •  Y. Cai, H. Tong, W. Fan, P. Ji, Q. He: Facets: Fast Comprehensive Mining of Coevolving High-order Time Series. KDD 2015.
  26. iBall: A Network of Regression Models
      §  Problem Definition
      §  Models & Algorithms
      §  Results
      •  Y. Yao, H. Tong, F. Xu, J. Lu: Predicting long-term impact of CQA posts: a comprehensive viewpoint. KDD 2014
      •  L. Li, H. Tong: The Child is Father of the Man: Foresee the Success at the Early Stage. KDD 2015.
      •  "Data Mining Reveals the Secret to Getting Good Answers", MIT Technology Review, 2013
  27. Fascinate: Cross-Layer Dependence Inference on Multi-Layered Networks
      §  Problem Definition: Infer Unobserved Cross-Layer Links
      §  Methods: Cross-Layer Inference = Collective CF
      §  Results: Effectiveness; Efficiency
      •  C. Chen, J. He, N. Bliss and H. Tong: "On the Connectivity of Multi-layered Networks: Models, Measures and Optimal Control". ICDM 2015.
      •  C. Chen, H. Tong, L. Xie, L. Ying and Q. He: "FASCINATE: Fast Cross-Layer Dependence Inference on Multi-layered Networks". KDD 2016, 3:15pm, Monday, Plaza Room A/B
  28. Conclusion: a Network of X
      §  Summary
         –  NoN: Network + Networks
         –  NoT: Network + Time Series
         –  iBall: Network + Regression
         –  Fascinate: Network + Inference
      §  Take Home Messages
         –  Modeling: "No" (i.e., a Network of X) as the answer
            •  Networks as data → networks as context
         –  Algorithms: Networks as the contextual regularizer
  29. Roadmap
      §  Motivations
      §  NoN: A Network of Networks
      §  Beyond NoN
      §  Some of Our Other Recent Work
         –  Team Replacement
         –  TravelModeLogger
         –  BrainQuest
         –  Network Alignment
         –  Optimal Networks
         –  Visual Influence Sum
  30. Replacing the Irreplaceable: Team Replacement Recommendation
      §  Problem Definition
      §  System
      §  Sol.
      §  Results
      •  L. Li, H. Tong, N. Cao, K. Ehrlich, Y.-R. Lin and N. Buchler: Replacing the Irreplaceable: Fast Algorithms for Team Member Recommendation, WWW 2015
      •  N. Cao, Y.-R. Lin, L. Li, H. Tong: g-Miner: Interactive Visual Group Mining on Multivariate Graphs, ACM CHI 2015
      •  System prototype & video demo: http://team-net-work.org
  31. Travel Mode Identification w/ Smartphones
      §  Prob. Dfn.
      §  Method
      §  Results
      §  Open Challenges
         –  Battery consumption (sampling rates, sensor selection)
         –  On-line algorithms
         –  Adaptive (summer vs. winter; highway vs. local)
      •  X. Su, H. Tong and P. Ji: Accelerometer-based Activity Recognition on Smartphone. CIKM 2014
      •  X. Su, H. Caceres, H. Tong and Q. He: Travel Mode Identification with Smartphones. TRB 2015
  32. BrainQuest: Visual Brain Comparison
      Quest brains to spot picture diff.
      •  L. Shi, H. Tong, X. Mu: BrainQuest: Perception-Guided Visual Brain Comparison, ICDM 2015
      •  L. Shi, H. Tong, M. Daianu, X. Mu and P. Thompson: Block-wise Human Brain Network Visual Comparison Using NodeTrix Representation. VIS'16
  33. BrainQuest: Visual Brain Comparison
      Quest computers to spot brain diff.
  34. BrainQuest: Visual Brain Comparison
      Quest computers to spot brain diff.
      AD group (n1) vs. Control group (n2)
  35. BrainQuest: Visual Brain Comparison
      §  Problem Dfn.: Spot structural diff. between two groups of brain networks
      §  VA Framework
      §  Model & Algorithm
      §  Results
  36. Query-Specific Optimal Networks
      §  Goal: Optimal Networks
         –  Query-Specific
         –  Optimal Topology + Weights
         –  On-line Learning
      §  Methods: VERY efficient way to estimate
         $\frac{\partial Q(x, s)}{\partial A_s(i, j)} \propto Q(j, s) \times Q(x, i)$
         (figure labels: query node s; positive node x; nodes i and j; neighbor of s; neighbor of x)
      §  + Error Estimation
      §  Results: Accuracy (MAP); Scalability
      L. Li, Y. Yao, J. Tang, W. Fan, H. Tong: QUINT: On Query-Specific Optimal Networks. KDD 2016. 10:00am, Monday, Plaza Room A/B
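
The proportionality between the derivative and the product Q(j,s) × Q(x,i) can be checked numerically. The sketch below assumes that Q is the random-walk-with-restart proximity matrix Q = (1-c)(I - cA)^{-1}, which the slide does not state, and treats every entry of A as a free parameter (no normalization or symmetry constraint). Under that assumption, differentiating the matrix inverse gives ∂Q(x,s)/∂A(i,j) = c/(1-c) · Q(x,i) · Q(j,s). The function names are hypothetical.

```python
import numpy as np


def proximity(A, c):
    """Assumed proximity matrix Q = (1 - c) (I - c A)^{-1}."""
    n = A.shape[0]
    return (1 - c) * np.linalg.inv(np.eye(n) - c * A)


def grad_entry(Q, c, x, s, i, j):
    """Analytic derivative of Q[x, s] with respect to A[i, j]."""
    return c / (1 - c) * Q[x, i] * Q[j, s]


# Small weighted graph; row sums stay below 1 so that I - cA is invertible.
A = np.array([[0.0, 0.3, 0.2, 0.0],
              [0.3, 0.0, 0.1, 0.2],
              [0.2, 0.1, 0.0, 0.3],
              [0.0, 0.2, 0.3, 0.0]])
c = 0.5
Q = proximity(A, c)

# Finite-difference estimate of dQ[x, s] / dA[i, j] for one choice of indices.
x, s, i, j = 3, 0, 1, 2
h = 1e-6
A_perturbed = A.copy()
A_perturbed[i, j] += h
finite_diff = (proximity(A_perturbed, c)[x, s] - Q[x, s]) / h
analytic = grad_entry(Q, c, x, s, i, j)
```

Once Q is available, every such derivative costs a single multiplication, which is consistent with the slide's description of the estimate as very efficient.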
  37. Attributed Network Alignment
      §  Problem Dfn.
      §  Formulation
      §  Algorithms
         •  Iterative Alg.
         •  Global Optimal
         •  Same Complexity as IsoRank
         •  Further Speed-up
         •  Low-Rank Approximation
         •  On-Query Alignment (Linear)
      §  Results: Accuracy vs. Time; Accuracy vs. Noise
      •  D. Koutra, H. Tong, D. Lubensky: BIG-ALIGN: Fast Bipartite Graph Alignment. ICDM 2013.
      •  S. Zhang and H. Tong: FINAL: Fast Attributed Network Alignment. KDD 2016, 3:15pm, Monday, Plaza Room A/B
  38. Vegas: Influence Graph Visual Summarization
      §  Prob. Dfn.
      §  Solution
      §  Results
      Figure labels: Who/What; How/Why; "Stochastic High-Level Petri Net and Applications"
      •  L. Shi, H. Tong, J. Tang and C. Lin: Flow-based Influence Graph Visual Summarization, ICDM 2014
      •  L. Shi, H. Tong, J. Tang, C. Lin: VEGAS: Visual influEnce GrAph Summarization on Citation Networks. TKDE 2015
  39. Q&A
      Inside the atom is a whole new world!
      "A whole new world
      Every turn a surprise
      With new horizons to pursue
      Every moment red-letter ……"
  40. Acknowledgement
      §  Collaborators:
         –  Norbou Buchler, Nan Cao, Madelaine Daianu, Kate Ehrlich, Wei Fan, Qing He, Ping Ji, Yu-ru Lin, Lei Shi, Chuang Lin, Jie Tang, Paul M. Thompson, Lei Xie, Yuan Yao, Lei Ying, Xiang Zhang
      §  Students:
         –  Liangyue Li
         –  Chen Chen
         –  Yongjie Cai (now at Google)
         –  Xing Su
         –  Si Zhang
