Published in:
Data & Analytics

- 1. Arizona State University. Inside the Atoms: Mining a Network of Networks and Beyond. Hanghang Tong, hanghang.tong@asu.edu, http://tonghanghang.org. @KDD BigMine 16: the 5th International Workshop on Big Data, Streams and Heterogeneous Source Mining
- 2. Observation: Graphs are everywhere! Examples: Hospital Networks, US Power Grid, Biological Networks, Collaboration Networks, Traffic Network, Brain Networks
- 3. Graph Mining: An Overview. Levels: graph, subgraph, node/link. Observation: Mining stops at the node/link (atom) level. Q: Is there a level x (x = 4, 5, …)? What is it?
- 4. A Motivating Example: Cross-Network Association (e.g., the candidate gene prioritization problem) § Problem Definition – Given: (1) two networks P (phenotype) and G (PPI), and (2) their partial association A; – Find: missing associations in A. § Solutions: Graph Ranking – Given: a green node (disease); – Find: the most relevant blue nodes (genes). A powerful primitive in (A1) drug discovery; (A2) social recommendation; (A3) QA post-tagging, etc.
- 5. A Motivating Example: Cross-Network Association (continued) § Limitations: Each green node (disease) might have its own PPI network! O. Magger, Y. Y. Waldman, E. Ruppin, and R. Sharan. Enhancing the prioritization of disease-causing genes through tissue specific protein interaction networks. PLoS Computational Biology, 8(9), 2012.
- 6. A Motivating Example: Cross-Network Association (e.g., the candidate gene prioritization problem) • Before: a disease network P and a single PPI network G, linked by the association A. • After: a disease network P and a set of tissue-specific PPI networks G1, …, G7, linked by the association A. [Figure: disease nodes 1 to 7 in P; gene nodes a, b, c, d in each PPI network]
- 7. A Set of Networks: More Applications. Collaborations; System of Systems; Brain Networks; Cyber-Physical Systems
- 8. Roadmap § Motivations § NoN: A Network of Networks – NoN Modeling – NoN Mining § Beyond NoN § Some of Our Other Recent Work
- 9. Modeling NoN § Q: How to represent a set of inter-connected networks (e.g., Tissue-Specific PPI Networks)? [Figure: disease network P linked by A to the tissue-specific PPI networks G1, …, G7]
- 10. Introducing the NoN Model § A: each green node (disease) itself is a network. NoN (A Network of Networks) := a triplet R = <G, A, θ> • G: Main Network (the green, disease-to-disease network) • A: Domain Networks (the blue, tissue-specific PPI networks) • θ: Mapping function (each green, main node → a blue, domain network). J. Ni, H. Tong, W. Fan, X. Zhang: Inside the atoms: ranking on a network of networks. KDD 2014
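The triplet R = <G, A, θ> above can be pictured as a small data structure. The following is only a minimal sketch: the class names `Network` and `NoN`, the field names, and the toy disease and tissue labels are all hypothetical and do not come from the slides or the KDD 2014 paper.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, Set, Tuple

Edge = Tuple[Hashable, Hashable]


@dataclass
class Network:
    """An undirected graph stored as a node set plus an edge set (hypothetical helper)."""
    nodes: Set[Hashable]
    edges: Set[Edge]


@dataclass
class NoN:
    """Illustrative container for a network of networks R = <G, A, theta>."""
    main: Network                  # G: main network (e.g. disease-to-disease)
    domains: Dict[str, Network]    # A: domain networks (e.g. tissue-specific PPI)
    theta: Dict[Hashable, str]     # theta: main node -> name of its domain network

    def domain_of(self, main_node: Hashable) -> Network:
        """Return the domain network that a main node expands into."""
        return self.domains[self.theta[main_node]]

    def is_valid(self) -> bool:
        """Every main node must map to an existing domain network."""
        return all(self.theta.get(v) in self.domains for v in self.main.nodes)


# Toy instance: two diseases, each with its own (made-up) PPI network.
main = Network(nodes={"disease_1", "disease_2"},
               edges={("disease_1", "disease_2")})
tissue_1 = Network(nodes={"a", "b", "c"}, edges={("a", "b"), ("b", "c")})
tissue_2 = Network(nodes={"a", "c", "d"}, edges={("a", "c"), ("c", "d")})
non = NoN(main=main,
          domains={"tissue_1": tissue_1, "tissue_2": tissue_2},
          theta={"disease_1": "tissue_1", "disease_2": "tissue_2"})

# Genes that appear in both domain networks ("overlapped" nodes).
shared = non.domain_of("disease_1").nodes & non.domain_of("disease_2").nodes
```

In this toy instance the two domain networks overlap on genes `a` and `c`; such overlapped nodes are what a cross-network consistency term would act on.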
- 11. NoN Models: Examples. Application: Main Network (G) / Domain Networks (A). Gene-Pheno Assoc.: Disease Sim Network / Tissue-specific PPI Nets; LBSN: Geo-proximity network / Social Networks; Brain Initiative: Person-Person Network / Brain Networks; Team of Teams: Project Dependence Net / Team Networks; Scholarly Data: Res. Area Sim Network / Collaboration Networks
- 12. NoN - Generalizations § G1: Multi-layered NoN – Candidate Gene Prioritization: Disease-tissue-protein – Geo-social networks: City-district-person § G2: Soft Mapping function θ – 1-to-many, or many-to-many • C. Chen, J. He, N. Bliss and H. Tong: “On the Connectivity of Multi-layered Networks: Models, Measures and Optimal Control” ICDM 2015.
- 13. NoN vs. Some Popular Multi-Network Models § They are all special cases of our NoN model! – Tensor: a special NoN with 1) a full-clique main network (G); 2) all domain networks (A) sharing the same node set – Hypergraph: a special NoN with 1) all domain networks (A) being empty – Multiplex: a special NoN with 1) two layers; 2) all domain networks (A) sharing the same node set
- 14. Roadmap § Motivations § NoN: A Network of Networks – NoN Modeling – NoN Mining: Ranking and Clustering § Beyond NoN § Some of Our Other Recent Work
- 15. NoN Mining - Ranking. A1: Given a disease (e.g. P1), what are the most relevant genes (blue nodes)? A2: Who is most influential, considering both the within- and cross-area influence?
- 16. Ranking on a Single Network (Background). Ranking vector r4 for query Node 4, listed for Node 1 through Node 12: 0.13, 0.10, 0.13, 0.22, 0.13, 0.05, 0.05, 0.08, 0.04, 0.03, 0.04, 0.02. More red, more relevant; nearby nodes, higher scores. H. Tong, C. Faloutsos, J.-Y. Pan: Fast Random Walk with Restart and Its Applications. ICDM 2006. (Best Paper Award, ICDM 2006; ICDM 2015 10-Year Highest Impact Paper Award)
- 17. Ranking on a Single Network (Background). [Figure: ranking vector r4 for query Node 4 on a 12-node graph; more red, more relevant; nearby nodes, higher scores] Footnote: “Maxwell Equation” for Web [Soumen Chakrabarti]: $r_i = c\,A\,r_i + (1-c)\,e_i$
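The fixed-point equation r_i = c A r_i + (1-c) e_i can be solved by simple iteration. The sketch below assumes a symmetrically normalized adjacency matrix (the next slide mentions a symmetric A, but the exact normalization is an assumption here), and the function names `sym_normalize` and `rwr` and the toy path graph are illustrative only.

```python
import numpy as np


def sym_normalize(W):
    """Return D^{-1/2} W D^{-1/2} for a symmetric adjacency matrix W."""
    d = W.sum(axis=1)
    s = np.zeros(len(d))
    s[d > 0] = 1.0 / np.sqrt(d[d > 0])
    return W * s[:, None] * s[None, :]


def rwr(A, query, c=0.9, tol=1e-10, max_iter=1000):
    """Iterate r <- c A r + (1 - c) e until the update is smaller than tol."""
    n = A.shape[0]
    e = np.zeros(n)
    e[query] = 1.0          # query preference vector e_i
    r = e.copy()
    for _ in range(max_iter):
        r_next = c * (A @ r) + (1 - c) * e
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r


# Toy example: a path graph 0 - 1 - 2 - 3 - 4, queried at the middle node.
W = np.zeros((5, 5))
for u, v in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    W[u, v] = W[v, u] = 1.0
A = sym_normalize(W)
r = rwr(A, query=2)
```

Because the spectral radius of c A is below 1 for 0 < c < 1, the iteration converges to the same vector as the closed form (1-c)(I - cA)^{-1} e_i; on this toy graph the scores decay with distance from the query node.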
- 18. Ranking on a Single Network (Background). An Optimization Viewpoint of the “Maxwell Equation” for Web (Symmetric A): $r_i = c\,A\,r_i + (1-c)\,e_i = \arg\min_{r_i}\; c\,r_i^{\top}(I - A)\,r_i + (1-c)\,\lVert r_i - e_i\rVert^2$, where the first term is the network smoothness and the second term is the query preference.
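The equivalence on this slide can be checked in two lines by setting the gradient of the objective to zero. This is a short derivation under the slide's assumption of a symmetric A, additionally assuming 0 < c < 1 and eigenvalues of A in [-1, 1] so that the objective is strictly convex; the name J for the objective is introduced here only for convenience.

```latex
\begin{aligned}
J(r_i) &= c\, r_i^{\top}(I - A)\, r_i + (1-c)\,\lVert r_i - e_i \rVert^2 ,\\
\nabla J(r_i) &= 2c\,(I - A)\, r_i + 2(1-c)\,(r_i - e_i) = 0 \\
&\Longrightarrow\; (I - cA)\, r_i = (1-c)\, e_i
\;\Longleftrightarrow\; r_i = c\,A\,r_i + (1-c)\,e_i .
\end{aligned}
```

Under those assumptions I - cA is positive definite, so the stationary point is the unique minimizer.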
- 19. Ranking on NoN § Optimization Formulation: #1: within-network smoothness; #2: query preference; #3: cross-network consistency § Intuition: – Similar ranking scores for an overlapped node, if their G(i,j) is high. – A set of g correlated random walks. J. Ni, H. Tong, W. Fan, X. Zhang: Inside the atoms: ranking on a network of networks. KDD 2014
- 20. Ranking on NoN § Optimization Formulation: #1: within-network smoothness; #2: query preference; #3: cross-network consistency § Equivalence: $J(r) = J(r_1, \ldots, r_g)$ – Intuition: a single random walk (R.W.) on the integrated graph $\tilde{A}$ – Property: J(r) is positive-definite!
- 21. Ranking on NoN § Equivalence: $J(r) = J(r_1, \ldots, r_g)$ – Intuition: One single random walk on the integrated graph $\tilde{A}$ – Property: J(r) is positive-definite! § Algorithms: – #1: A linear algorithm → the optimal solution – #2: Any existing fast solution on a single network – #3: Further Speedup: O(T(m+ng)) → O(T(g log(g) + z)) • g << n; and z << m (key idea: using the main network to do pruning)
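The three ingredients named on these slides (within-network smoothness, query preference, cross-network consistency) can be combined into one positive-definite linear system. The sketch below is not the formulation from the KDD 2014 paper, which the transcript does not reproduce. It makes the simplifying assumptions that all g domain networks share the same n nodes and that the objective is J = Σ_i [c r_i'(I - A_i) r_i + (1-c)||r_i - e_i||²] + a Σ_{i<j} G(i,j) ||r_i - r_j||². The weight `a` and the names `sym_normalize` and `non_rank` are hypothetical.

```python
import numpy as np


def sym_normalize(W):
    """Return D^{-1/2} W D^{-1/2} for a symmetric adjacency matrix W."""
    d = W.sum(axis=1)
    s = np.zeros(len(d))
    s[d > 0] = 1.0 / np.sqrt(d[d > 0])
    return W * s[:, None] * s[None, :]


def non_rank(domain_As, G, queries, c=0.9, a=1.0):
    """Jointly rank nodes in g domain networks that share one node set.

    domain_As: list of g symmetric-normalized (n x n) adjacency matrices
    G:         symmetric (g x g) main-network weights with zero diagonal
    queries:   one query node index per domain network
    Returns the (g x n) score matrix and the integrated system matrix M.
    """
    g, n = len(domain_As), domain_As[0].shape[0]
    # Block-diagonal part: smoothness and query preference, I - c * A_i.
    M = np.eye(g * n)
    for i, A in enumerate(domain_As):
        M[i * n:(i + 1) * n, i * n:(i + 1) * n] -= c * A
    # Coupling part: Laplacian of the main network, applied node by node.
    L = np.diag(G.sum(axis=1)) - G
    M += a * np.kron(L, np.eye(n))
    # Stacked query preference vector.
    e = np.zeros(g * n)
    for i, q in enumerate(queries):
        e[i * n + q] = 1.0
    r = np.linalg.solve(M, (1 - c) * e)
    return r.reshape(g, n), M


# Toy example: a path and a star on the same 4 nodes, with different queries.
path = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
star = np.array([[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]], dtype=float)
As = [sym_normalize(path), sym_normalize(star)]
G = np.array([[0.0, 1.0], [1.0, 0.0]])
r_indep, _ = non_rank(As, G, queries=[0, 3], a=0.0)   # no cross-network term
r_joint, M = non_rank(As, G, queries=[0, 3], a=10.0)  # strong consistency
```

With `a = 0` the system decouples into g independent single-network rankings; with a larger `a` the two score vectors are pulled toward each other. A dense solve is used only for clarity and costs far more than the linear-time algorithms the slide refers to.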
- 22. NoN Ranking - Results. A1: Candidate Gene Prioritization • Which genes are most relevant wrt disease a? [Figure: ROC Curve Comparison] A2: Co-authorship Prediction • Which DM authors are most likely to collaborate with a given Med author? [Figure: AUC and Accuracy]
- 23. NoN Mining - Clustering § Obj. Function: Similar Intuition! § Results: P-value vs. (biologically meaningful) clusters. J. Ni, H. Tong, W. Fan, X. Zhang: Flexible and Robust Multi-Network Clustering. KDD 2015
- 24. Roadmap § Motivations § NoN: A Network of Networks – NoN Modeling – NoN Mining § Beyond NoN: From NoN to NoX § Some of Our Other Recent Work
- 25. NoT: A Network of Time Series § Problem Definition § Models & Algorithms § Results [Figure: coordinate vs. frame #, comparing original, DCMF, DMF and dynaMMo] • Y. Cai, H. Tong, W. Fan and P. Ji: Fast Mining of a Network of Coevolving Time Series. SDM 2015. • Y. Cai, H. Tong, W. Fan, P. Ji, Q. He: Facets: Fast Comprehensive Mining of Coevolving High-order Time Series. KDD 2015
- 26. iBall: A Network of Regression Models § Problem Definition [Figure: domains D1, D2, D3, D4] § Models & Algorithms § Results • Y. Yao, H. Tong, F. Xu, J. Lu: Predicting long-term impact of CQA posts: a comprehensive viewpoint. KDD 2014 • L. Li, H. Tong: The Child is Father of the Man: Foresee the Success at the Early Stage. KDD 2015. • “Data Mining Reveals the Secret to Getting Good Answers”, MIT Technology Review, 2013
- 27. Fascinate: Cross-Layer Dependence Inference on Multi-Layered Networks § Problem Definition: Infer Unobserved Cross-Layer Links § Methods: Cross-Layer Inference = Collective CF § Results: Effectiveness; Efficiency • C. Chen, J. He, N. Bliss and H. Tong: “On the Connectivity of Multi-layered Networks: Models, Measures and Optimal Control” ICDM15. • C. Chen, H. Tong, L. Xie, L. Ying and Q. He: “FASCINATE: Fast Cross-Layer Dependence Inference on Multi-layered Networks”, KDD16, 3:15pm, Monday, Plaza Room A/B
- 28. Conclusion: a Network of X § Summary – NoN: Network + Networks – NoT: Network + Time Series – iBall: Network + Regression – Fascinate: Network + Inference § Take Home Messages – Modeling: NoX (i.e., a Network of X) as the answer • Networks as data → as context – Algorithms: Networks as the contextual regularizer
- 29. Roadmap § Motivations § NoN: A Network of Networks § Beyond NoN § Some of Our Other Recent Work – Team Replacement – TravelModeLogger – BrainQuest – Network Alignment – Optimal Networks – Visual Influence Sum
- 30. Replacing the Irreplaceable: Team Replacement Recommendation § Problem Definition § Sol. § Results § System • L. Li, H. Tong, N. Cao, K. Ehrlich, Y.-R. Lin and N. Buchler: Replacing the Irreplaceable: Fast Algorithms for Team Member Recommendation, WWW 2015 • N. Cao, Y.-R. Lin, L. Li, H. Tong: g-Miner: Interactive Visual Group Mining on Multivariate Graphs, ACM CHI 2015 • System prototype & video demo: http://team-net-work.org
- 31. Travel Mode Identification w/ Smartphones § Prob. Dfn. § Method § Results § Open Challenges – Battery Consumption (sampling rates, sensor selection) – On-line algorithms – Adaptive (summer vs. winter; highway vs. local) • X. Su, H. Tong and P. Ji: Accelerometer-based Activity Recognition on Smartphone. CIKM 2014 • X. Su, H. Caceres, H. Tong and Q. He: Travel Mode Identification with Smartphones. TRB 2015
- 32. BrainQuest: Visual Brain Comparison. Quest brains to spot picture diff. • L. Shi, H. Tong, X. Mu: BrainQuest: Perception-Guided Visual Brain Comparison, ICDM 2015 • L. Shi, H. Tong, M. Daianu, X. Mu and P. Thompson: Block-wise Human Brain Network Visual Comparison Using NodeTrix Representation. VIS'16
- 33. BrainQuest: Visual Brain Comparison. Quest computers to spot brain diff.
- 34. BrainQuest: Visual Brain Comparison. Quest computers to spot brain diff.: AD group (n1) vs. Control group (n2)
- 35. BrainQuest: Visual Brain Comparison § Problem Dfn.: Spot structural diff. between two groups of brain networks § Model & Algorithm § VA Framework § Results
- 36. Query-Specific Optimal Networks § Goal: Optimal Networks – Query-Specific – Optimal Topology + Weights – On-line Learning § Methods: VERY efficient way to estimate $\partial Q(x, s) / \partial A_s(i, j) \propto Q(x, i) \times Q(j, s)$ + Error Estimation [Figure: query node s, positive node x, edge (i, j), neighbors of s and neighbors of x] § Results: Accuracy (MAP); Scalability. L. Li, Y. Yao, J. Tang, W. Fan, H. Tong: QUINT: On Query-Specific Optimal Networks. KDD 2016. 10:00am, Monday, Plaza Room A/B
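The gradient on this slide involves a product of two proximity scores, Q(x, i) and Q(j, s). The sketch below illustrates why such a product arises, under the assumption (not stated in the transcript) that Q is the random-walk-with-restart proximity matrix Q = (1-c)(I - cA)^{-1}, with each entry of A treated as an independent variable. In that case the derivative of Q(x, s) with respect to A(i, j) equals c/(1-c) · Q(x, i) · Q(j, s). The function names and the toy matrix are hypothetical.

```python
import numpy as np


def proximity(A, c=0.5):
    """Assumed RWR proximity matrix Q = (1 - c) (I - c A)^{-1}."""
    n = A.shape[0]
    return (1 - c) * np.linalg.inv(np.eye(n) - c * A)


def grad_wrt_edges(Q, x, s, c=0.5):
    """Matrix whose (i, j) entry is dQ(x, s) / dA(i, j).

    Follows from d(M^{-1}) = -M^{-1} dM M^{-1} with M = I - c A,
    which gives c / (1 - c) * Q(x, i) * Q(j, s).
    """
    return (c / (1 - c)) * np.outer(Q[x, :], Q[:, s])


# Toy 3-node network; x = 0 plays the positive node, s = 2 the query node.
A = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])
Q = proximity(A)
grad = grad_wrt_edges(Q, x=0, s=2)


def finite_difference(i, j, x=0, s=2, eps=1e-6):
    """Numerically perturb one entry A(i, j) and measure the change in Q(x, s)."""
    A_pert = A.copy()
    A_pert[i, j] += eps
    return (proximity(A_pert)[x, s] - Q[x, s]) / eps
```

Once Q is available, the gradient for every candidate edge comes from a single outer product of one row and one column of Q. That is one plausible reading of the slide's "VERY efficient" remark, though the actual QUINT derivation may differ.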
- 37. Attributed Network Alignment § Problem Dfn. § Formulation § Algorithms • Iterative Alg. • Global Optimal • Same Complexity as ISORANK • Further Speed-up • Low-Rank Approximation • On-Query Alignment (Linear) § Results: Accuracy vs. Time; Accuracy vs. Noise • D. Koutra, H. Tong, D. Lubensky: BIG-ALIGN: Fast Bipartite Graph Alignment. ICDM 2013. • S. Zhang and H. Tong: Final: Fast Attributed Network Alignment. KDD 2016, 3:15pm, Monday, Plaza Room A/B
- 38. Vegas: Influence Graph Visual Summarization § Prob. Dfn.: Who/What; How/Why § Solution § Results (example: “Stochastic High-Level Petri Net and Applications”) • L. Shi, H. Tong, J. Tang and C. Lin: Flow-based Influence Graph Visual Summarization, ICDM 2014 • L. Shi, H. Tong, J. Tang, C. Lin: VEGAS: Visual influEnce GrAph Summarization on Citation Networks. TKDE 2015
- 39. Q&A. Inside the atom is a whole new world! • “A whole new world • Every turn a surprise • With new horizons to pursue • Every moment red-letter ……”
- 40. Acknowledgement § Collaborators: – Norbou Buchler, Nan Cao, Madelaine Daianu, Kate Ehrlich, Wei Fan, Qing He, Ping Ji, Yu-ru Lin, Lei Shi, Chuang Lin, Jie Tang, Paul M. Thompson, Lei Xie, Yuan Yao, Lei Ying, Xiang Zhang § Students: – Liangyue Li – Chen Chen – Yongjie Cai (now at Google) – Xing Su – Si Zhang
