- 1. Networks, Deep Learning and COVID-19 Tsuyoshi Murata Department of Computer Science School of Computing Tokyo Institute of Technology murata@c.titech.ac.jp http://www.net.c.titech.ac.jp/ The 1st International Conference on Data Science and Official Statistics (ICDSOS 2021) Nov. 13(Sat) 2021
- 2. about me • Tsuyoshi Murata • Department of Computer Science, School of Computing, Tokyo Institute of Technology • Research: artificial intelligence, network science, machine learning – Graph neural networks – Classification / prediction of time series data (such as predicting volcanic eruptions) – Social network analysis • http://www.net.c.titech.ac.jp/ 2
- 3. Table of contents • Networks (graphs) • Networks + Deep Learning • Networks + Deep Learning + COVID-19 3
- 5. Networks (or graphs) • a set of vertices and edges • many objects in the physical, biological, and social sciences can be thought of as networks (examples pictured: social networks, metabolic networks, food webs) • “graph” and “network” are often used interchangeably
- 7. Understanding/analyzing networks • metrics: path length, density, diameter, degree distribution, clustering coefficient, … • algorithms: Dijkstra's algorithm, graph partitioning, centrality computation, … • models: random network, scale-free network, small-world network, power law, configuration model, … • processes: rumor/disease diffusion, influence maximization / minimization, SI model, SIR model, …
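A minimal sketch of three of the metrics named on this slide (density, clustering coefficient, path length / diameter). The toy graph and the function names are assumptions made for illustration and do not come from the talk; breadth-first search stands in for Dijkstra's algorithm because the example graph is unweighted.

```python
from collections import deque

# Hypothetical toy graph as an adjacency dict (undirected, unweighted).
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b", "e"},
    "e": {"d"},
}

def density(g):
    """Fraction of all possible undirected edges that are present."""
    n = len(g)
    m = sum(len(nbrs) for nbrs in g.values()) / 2
    return 2 * m / (n * (n - 1))

def clustering(g, v):
    """Local clustering coefficient: fraction of neighbor pairs of v that are linked."""
    nbrs = list(g[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in g[nbrs[i]]
    )
    return 2 * links / (k * (k - 1))

def path_lengths(g, source):
    """Breadth-first search: hop counts from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in g[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def diameter(g):
    """Longest shortest path over all pairs (assumes a connected graph)."""
    return max(max(path_lengths(g, v).values()) for v in g)
```

On this toy graph, `density(graph)` is 0.5 (5 of 10 possible edges) and `diameter(graph)` is 3 (for example, from `a` to `e`).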
- 8. Topics • Community detection • Link prediction • Centrality (ranking) • Influence maximization • … • Networks can be huge, incomplete, noisy, directed, weighted, signed, temporal, … Sources: https://link.springer.com/article/10.1007/s11042-020-08700-4 https://www.nature.com/articles/s41598-019-57304-y https://www2.slideshare.net/tom.zimmermann/changes-and-bugs-mining-and-predicting-development-activities/19-CentralityDegree_Closeness_BetweennessBlue_binary_has https://link.springer.com/referenceworkentry/10.1007%2F978-1-4939-7131-2_110197
- 9. Community detection in signed networks • two types of edges: friendship and hostility • Detection of nested communities (which often appear in real social networks) Tsuyoshi Murata, Takahiko Sugihara, and Talel Abdessalem, "Community Detection in Signed Networks Based on Extended Signed Modularity", Proceedings of the 8th Conference on Complex Networks (CompleNet 2017), Springer, 2017.
- 10. Transductive classification on heterogeneous networks • the labels of some vertices are given -> classify the labels of the remaining vertices 10 Phiradet Bangcharoensap, Tsuyoshi Murata, Hayato Kobayashi, Nobuyuki Shimizu, “Transductive Classification on Heterogeneous Information Networks with Edge Betweenness-based Normalization”, Proceedings of the 9th ACM International Conference on Web Search and Data Mining (WSDM2016), pp.437-446, 2016.
- 11. Influence maximization in dynamic networks • finding a set of nodes that will spread information most widely in a given social network Tsuyoshi Murata and Hokuto Koga, "Approximation Methods for Influence Maximization in Temporal Networks", Chapter 18, In: Petter Holme and Jari Saramaki (eds.), "Temporal Network Theory", pp.345-368, Springer, 2019.
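To make the problem statement concrete, here is a hedged sketch of the classic greedy heuristic on a static graph, using Monte Carlo runs of an independent cascade model. This is not the approximation method of the cited chapter (which targets temporal networks); the graph, the activation probability `p`, the number of runs, and all function names are assumptions chosen for illustration.

```python
import random

def simulate_cascade(graph, seeds, p, rng):
    """One independent-cascade run; returns the number of activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph[u]:
                # Each newly active node gets one chance to activate each neighbor.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)

def expected_spread(graph, seeds, p, runs, rng):
    """Monte Carlo estimate of the mean cascade size for a seed set."""
    return sum(simulate_cascade(graph, seeds, p, rng) for _ in range(runs)) / runs

def greedy_seeds(graph, k, p=0.5, runs=1000, seed=0):
    """Pick k seeds, each time adding the node with the largest estimated spread."""
    rng = random.Random(seed)
    chosen = []
    for _ in range(k):
        candidates = [v for v in graph if v not in chosen]
        best = max(
            candidates,
            key=lambda v: expected_spread(graph, chosen + [v], p, runs, rng),
        )
        chosen.append(best)
    return chosen

def star(hub, n_leaves):
    """Hypothetical helper: a hub connected to n_leaves leaf nodes."""
    leaves = [f"{hub}_leaf{i}" for i in range(n_leaves)]
    g = {hub: leaves}
    for leaf in leaves:
        g[leaf] = [hub]
    return g

# Two hypothetical communities: a large star and a smaller one.
graph = {**star("h1", 8), **star("h2", 5)}
seeds = greedy_seeds(graph, k=2)
```

On this toy input the greedy procedure selects the two hubs, the larger one first. Each greedy step re-estimates the spread of every candidate, which is why approximation methods matter on large or temporal networks.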
- 12. Detecting Communities of Distant Members Xin Liu, Tsuyoshi Murata, Ken Wakita, "Detecting network communities beyond assortativity-related attributes", Physical Review E 90, 012806, 2014. Paulo Shakarian, Patrick Roos, Devon Callahan, Cory Kirk, "Mining for Geographically Disperse Communities in Social Networks by Leveraging Distance Modularity", KDD2013. • Our method was used by researchers at the U.S. Military Academy for detecting terrorist networks
- 13. Reference (Networks) • Networks (second edition), Mark Newman, Oxford University Press, 2018. https://global.oup.com/academic/product/networks-9780198805090
- 14. Table of contents • Networks (graphs) • Networks + Deep Learning • Networks + Deep Learning + COVID-19 14
- 15. Deep Learning Image recognition Voice recognition Natural Language Processing 15
- 16. Convolutional neural networks • Recognizing local features -> global features https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53 16
- 17. Convolution works for images, sentences, and networks • Images: grid of pixels • Sentences: sequences of words • Networks: – Number of neighbors is not fixed – Topologically complex – Vertices are not ordered 17
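- 18. Graph Neural Networks • Learning features of vertices using their neighbors • Figure: vertex attributes (gender, age, job, income, …) are fed to a GNN, which classifies vertices with unknown labels into one of two classes (illustrated with the Democratic donkey and the Republican elephant) Image source: https://edition.cnn.com/style/article/why-democrats-are-donkeys-republicans-are-elephants-artsy/index.html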
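The idea of learning vertex features from neighbors can be sketched as a single message-passing step: average each vertex's features with those of its neighbors, apply a linear map, then a ReLU. This is a simplified illustration in plain Python, not a specific published architecture; the graph, the feature vectors, and the identity weight matrix are assumptions (in a real GNN the weights are learned).

```python
def gnn_layer(adj, features, weight):
    """One mean-aggregation layer: h_v = ReLU(mean(x_u for u in N(v) + {v}) @ W)."""
    out = {}
    for v, nbrs in adj.items():
        group = [v] + list(nbrs)  # include the vertex itself (self-loop)
        dim = len(features[v])
        mean = [sum(features[u][d] for u in group) / len(group) for d in range(dim)]
        out[v] = [
            max(0.0, sum(mean[d] * weight[d][j] for d in range(dim)))
            for j in range(len(weight[0]))
        ]
    return out

# Hypothetical 3-vertex path graph: 1 - 0 - 2.
adj = {0: [1, 2], 1: [0], 2: [0]}
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [0.0, 1.0]}
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity, standing in for learned weights

hidden = gnn_layer(adj, features, weight)
```

After one layer, vertex 0 carries a mixture of its own feature and its neighbors' features, so a downstream classifier sees neighborhood information. Stacking layers widens the neighborhood each vertex can see.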
- 19. Machine learning tasks • classification • regression (e.g., fitting $f(x) = ax^3 + \dots$) • clustering (group 1, group 2, …) • dimensionality reduction
- 20. Machine learning tasks for graphs • Node classification • Graph classification • Link prediction • Graph generation (a model for generating graphs)
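As a concrete instance of the link prediction task, the sketch below scores every unconnected vertex pair by the Jaccard similarity of their neighborhoods. This is a classical heuristic baseline rather than a GNN, and the toy graph and function name are assumptions for illustration.

```python
from itertools import combinations

def jaccard_scores(adj):
    """Score each non-adjacent pair by |N(u) & N(v)| / |N(u) | N(v)|."""
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked, nothing to predict
        union = adj[u] | adj[v]
        if union:
            scores[(u, v)] = len(adj[u] & adj[v]) / len(union)
    return scores

# Hypothetical undirected graph.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}

scores = jaccard_scores(adj)
best_pair = max(scores, key=scores.get)
```

Here the pair `("a", "d")` scores highest because the two vertices share two of their three distinct neighbors, so it is the most plausible missing link under this heuristic.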
- 21. Applications of Graph Neural Networks • Computer vision – scene graph generation (input : images, output: objects and semantic relations) – realistic image generation (input: scene graph, output: images) • Recommender systems – recommendation as link prediction (input: items & users, output: missing links) • Traffic – Forecast of traffic speed (input: sensors on roads and the distances, output: traffic speed and volume) • Chemistry – classification of molecular graphs (atoms = nodes, bonds = edges) Graph Neural Networks: A Review of Methods and Applications https://arxiv.org/abs/1812.08434 21
- 22. DeepMind article (Sept. 2020) • Traffic prediction with advanced Graph Neural Networks – https://deepmind.com/blog/article/traffic-prediction-with-advanced-graph-neural-networks
- 23. GNN for traffic prediction • Segmenting roads as graphs 23
- 24. Some of our recent attempts • Cross-lingual Transfer for Text Classification with Dictionary-based Heterogeneous Graph (EMNLP 2021) • Predicting Emergency Medical Service Demand with Bipartite Graph Convolutional Networks (IEEE Access 2021) • Graph Neural Networks for Fast Node Ranking Approximation (ACM Trans. on Knowledge Discovery from Data 2021, CIKM 2019) • Graph Convolutional Networks for Graphs Containing Missing Features (Future Generation Computer Systems 2021) • Population Graph-based Multi-Model Ensemble Method for Diagnosing Autism Spectrum Disorder (Sensors 2020) • Learning Community Structure with Variational Autoencoder (ICDM 2018)
- 25. Cross-lingual Transfer for Text Classification (CLTC) • CLTC is transfer learning that uses training data from resource-rich languages to solve classification problems in resource-poor languages. • Collecting task-specific training data in high-resource source languages can be infeasible because of the labeling cost, task characteristics, and privacy concerns. • This paper proposes an alternative solution that uses only task-independent word embeddings of high-resource languages and bilingual dictionaries. Nuttapong Chairatanakul, Noppayut Sriwatanasakdi, Nontawat Charoenphakdee, Xin Liu, Tsuyoshi Murata, “Cross-lingual Transfer for Text Classification with Dictionary-based Heterogeneous Graph”, Findings of EMNLP 2021 (accepted, to appear)
- 26. Predicting Emergency Medical Service Demand with Bipartite Graph Convolutional Networks • New bipartite graph convolutional neural network model to predict emergency demand (high/low) by representing ambulance data in Tokyo as a hospital-region bipartite graph. Ruidong Jin, Tianqi Xia, Xin Liu, Tsuyoshi Murata, Kyoung-Sook Kim, “Predicting Emergency Medical Service Demand With Bipartite Graph Convolutional Networks”, IEEE Access, Vol. 9, pp.9903-9915, 2021. https://doi.org/10.1109/ACCESS.2021.3050607
- 27. Graph Neural Networks for Fast Node Ranking Approximation • A novel GNN for approximating centrality – Aggregation is done separately for incoming and outgoing paths – A node's own features are not aggregated – Nodes with no shortest paths are identified and the corresponding rows in $A$ and $A^T$ are set to zero Sunil Kumar Maurya, Xin Liu, Tsuyoshi Murata, "Graph Neural Networks for Fast Node Ranking Approximation", ACM Transactions on Knowledge Discovery from Data, Vol.15, No.5, Article No.78, 2021. https://doi.org/10.1145/3446217 Sunil Kumar Maurya, Xin Liu, Tsuyoshi Murata, "Fast Approximations of Betweenness Centrality with Graph Neural Networks", Proc. of the 28th ACM Int'l Conf. on Information and Knowledge Management (CIKM'19) pp.2149–2152, 2019. https://doi.org/10.1145/3357384.3358080
- 28. Graph Convolutional Networks for Graphs Containing Missing Features • Integrating missing feature imputation and graph learning within the same neural network architecture. • Represents missing data as a Gaussian mixture model (GMM) and calculates the expected activation of neurons in the first hidden layer of the GCN. 28 Hibiki Taguchi, Xin Liu, Tsuyoshi Murata, "Graph Convolutional Networks for Graphs Containing Missing Features", Future Generation Computer Systems, Vol.117, pp.155-168, Elsevier, 2021. https://doi.org/10.1016/j.future.2020.11.016
- 29. Population Graph-based Multi-Model Ensemble Method for Diagnosing Autism Spectrum Disorder • Advances in brain imaging technology and machine learning have led to major advances in the diagnosis of brain diseases • New multi-model ensemble based on a population graph for distinguishing healthy people from patients Zarina Rakhimberdina, Xin Liu, Tsuyoshi Murata, "Population Graph-Based Multi-Model Ensemble Method for Diagnosing Autism Spectrum Disorder", Sensors, Vol.20, No.21, 18 pages, 2020. https://doi.org/10.3390/s20216001
- 30. Learning Community Structure with Variational Autoencoder • Variational autoencoder (VAE) : generative models for the classification of similar synthetic entities • Variational graph autoencoder (VGAE) : the extension of VAE to graph structures • Variational Graph Autoencoder for Community Detection (VGAECD) : encodes graph structures with multiple Gaussian distributions corresponding to each of the communities 30 Jun Jin Choong, Xin Liu, Tsuyoshi Murata, “Learning Community Structure with Variational Autoencoder”, Proceedings of IEEE ICDM 2018 (IEEE International Conference on Data Mining), pp.69-78, November, 2018. https://doi.org/10.1109/ICDM.2018.00022
- 31. Table of contents • Networks (graphs) • Networks + Deep Learning • Networks + Deep Learning + COVID-19 31
- 32. Table of contents • Networks (graphs) • Networks + Deep Learning • Networks + Deep Learning + COVID-19 – Networks + COVID-19 – Networks + Deep Learning + COVID-19 32
- 34. Network modeling for epidemics • disease spread 34
- 35. Network modeling for epidemics • disease spread (figure labels: network, number of infected people, probability of infection, probability of recovery)
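- 36. SIR model • S: susceptible • I: infected • R: recovered (or removed) • Transitions: S → I → R; in a short time interval $\delta\tau$ an infected individual recovers with probability $\gamma\delta\tau$ and stays infected with probability $1-\gamma\delta\tau$ • NDlib - Network Diffusion Library https://ndlib.readthedocs.io/en/latest/index.html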
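A hedged sketch of how an SIR process can be simulated on a network in discrete time steps. It uses only the Python standard library and does not reproduce NDlib's API; the per-step infection probability `beta`, recovery probability `gamma`, the ring-like graph, and the function name are assumptions for illustration.

```python
import random

def simulate_sir(adj, initially_infected, beta, gamma, seed=0, max_steps=1000):
    """Discrete-time SIR on a graph; returns the S/I/R counts at each step."""
    rng = random.Random(seed)
    state = {v: "S" for v in adj}
    for v in initially_infected:
        state[v] = "I"
    history = []
    for _ in range(max_steps):
        counts = {s: 0 for s in "SIR"}
        for s in state.values():
            counts[s] += 1
        history.append(counts)
        if counts["I"] == 0:
            break  # no infected individuals left, the outbreak is over
        new_state = dict(state)
        for v, s in state.items():
            if s != "I":
                continue
            # An infected node infects each susceptible neighbor with prob. beta ...
            for w in adj[v]:
                if state[w] == "S" and rng.random() < beta:
                    new_state[w] = "I"
            # ... and recovers (or is removed) with prob. gamma.
            if rng.random() < gamma:
                new_state[v] = "R"
        state = new_state
    return history

# Hypothetical contact network: 30 nodes on a ring, each linked to its 4 nearest nodes.
n = 30
adj = {i: [(i - 2) % n, (i - 1) % n, (i + 1) % n, (i + 2) % n] for i in range(n)}
history = simulate_sir(adj, initially_infected=[0], beta=0.3, gamma=0.2, seed=1)
```

Each entry of `history` is a dict of counts for the three states. The susceptible count can only fall and the recovered count can only rise, which mirrors the one-way S → I → R transitions.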
- 37. SIR model • three states – Susceptible (S): not infected – Infected (I) – Recovered (removed) (R): recovery and death are treated alike, since it makes little difference to the disease whether a person is immune or dead • $\tau$: the length of time that an infected individual remains infected before recovering • $\gamma\delta\tau$: probability of recovering in a time interval $\delta\tau$ • $1-\gamma\delta\tau$: probability of not doing so • Probability that the individual is still infected after time $\tau$: $\lim_{\delta\tau\to 0}(1-\gamma\delta\tau)^{\tau/\delta\tau} = e^{-\gamma\tau}$ • Probability $p(\tau)\,d\tau$ that the individual remains infected for time $\tau$ and then recovers between $\tau$ and $\tau+d\tau$: $p(\tau)\,d\tau = \gamma e^{-\gamma\tau}\,d\tau$ • This is an exponential distribution: some individuals might remain in the I state for a long time, which is not realistic for most real diseases
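The limit on this slide can be checked numerically: the probability of surviving $\tau/\delta\tau$ consecutive intervals without recovering approaches $e^{-\gamma\tau}$ as $\delta\tau$ shrinks. The values of `gamma` and `tau` below are arbitrary assumptions chosen only for the check.

```python
import math

def still_infected(gamma, tau, dtau):
    """Probability of not recovering in any of the tau/dtau intervals of length dtau."""
    return (1.0 - gamma * dtau) ** (tau / dtau)

gamma, tau = 0.4, 5.0  # assumed example values
exact = math.exp(-gamma * tau)

# Absolute error of the discrete approximation for shrinking interval lengths.
errors = [abs(still_infected(gamma, tau, dtau) - exact) for dtau in (1.0, 0.1, 0.01, 0.0001)]
```

The entries of `errors` shrink as `dtau` decreases, consistent with the exponential form of the infectious-period distribution.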
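- 38. Equations for the SIR model ($s$, $x$, $r$: fractions of the population in states S, I, R) • $\frac{ds}{dt} = -\beta s x$ • $\frac{dx}{dt} = \beta s x - \gamma x$ • $\frac{dr}{dt} = \gamma x$ • $s + x + r = 1$ • Eliminate $x$: $\frac{1}{s}\frac{ds}{dt} = -\frac{\beta}{\gamma}\frac{dr}{dt}$ • Integrate both sides with respect to $t$: $s = s_0 e^{-\beta r/\gamma}$ • Substitute this and $x = 1 - s - r$ into $\frac{dr}{dt} = \gamma x$: $\frac{dr}{dt} = \gamma\left(1 - r - s_0 e^{-\beta r/\gamma}\right)$ • Figure: time evolution of the SIR model (curves S, I, R) for $\beta = 1$, $\gamma = 0.4$, $s_0 = 0.99$, $x_0 = 0.01$, $r_0 = 0$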
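The time evolution plotted on this slide can be reproduced approximately by integrating the three equations with the forward Euler method. The parameter values are the ones quoted on the slide; the step size `dt`, the end time `t_end`, and the function name are assumptions, and a more accurate ODE solver could be substituted.

```python
def integrate_sir(beta=1.0, gamma=0.4, s0=0.99, x0=0.01, r0=0.0, dt=0.01, t_end=50.0):
    """Forward-Euler integration of ds/dt = -b*s*x, dx/dt = b*s*x - g*x, dr/dt = g*x."""
    s, x, r = s0, x0, r0
    trajectory = [(0.0, s, x, r)]
    steps = int(round(t_end / dt))
    for i in range(1, steps + 1):
        ds = -beta * s * x
        dx = beta * s * x - gamma * x
        dr = gamma * x
        s, x, r = s + ds * dt, x + dx * dt, r + dr * dt
        trajectory.append((i * dt, s, x, r))
    return trajectory

trajectory = integrate_sir()
peak_infected = max(x for _, _, x, _ in trajectory)
_, s_end, x_end, r_end = trajectory[-1]
```

With these parameters the infected fraction rises to a single peak and then decays, while `s_end` stays above zero: part of the population is never infected.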
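- 39. Time evolution of the SIR model • S decreases / R increases monotonically • S does not go to zero (because no I is left as $t \to \infty$) • Asymptotic value of $r$: total size of the outbreak • $\frac{dr}{dt} = \gamma\left(1 - r - s_0 e^{-\beta r/\gamma}\right) = 0$ • $r = 1 - s_0 e^{-\beta r/\gamma}$ • Initial condition: – $c$ infected and $n-c$ susceptible – $s_0 = 1 - \frac{c}{n}$, $x_0 = \frac{c}{n}$, $r_0 = 0$ – When $n \to \infty$, $s_0 \cong 1$ • $r = 1 - e^{-\beta r/\gamma}$ • This has the same form as the equation for the size $S$ of the giant component of a Poisson random graph, $S = 1 - e^{-cS}$, with $c = \beta/\gamma$ (here $c$ is the mean degree, not the number of initially infected individuals) • Figure: time evolution of the SIR model (curves S, I, R) for $\beta = 1$, $\gamma = 0.4$, $s_0 = 0.99$, $x_0 = 0.01$, $r_0 = 0$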
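The outbreak-size equation $r = 1 - s_0 e^{-\beta r/\gamma}$ has no closed-form solution, but it can be solved by fixed-point iteration. The sketch below starts from $r = 1$ so that the iteration settles on the non-zero root when one exists; the function name, tolerance, and iteration cap are assumptions.

```python
import math

def final_size(beta, gamma, s0=1.0, tol=1e-12, max_iter=10000):
    """Solve r = 1 - s0 * exp(-beta * r / gamma) by fixed-point iteration."""
    r = 1.0  # start from a full outbreak and iterate downward
    for _ in range(max_iter):
        nxt = 1.0 - s0 * math.exp(-beta * r / gamma)
        if abs(nxt - r) < tol:
            return nxt
        r = nxt
    return r

r_epidemic = final_size(beta=1.0, gamma=0.4)     # beta > gamma
r_no_epidemic = final_size(beta=0.3, gamma=0.4)  # beta < gamma
```

For $\beta = 1$, $\gamma = 0.4$ and $s_0 \cong 1$ the iteration converges to roughly 0.89, while for $\beta < \gamma$ it collapses to zero, matching the absence of an epidemic below the threshold.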
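- 40. Size of epidemics • If $\beta \le \gamma$ there is no epidemic – $I \to R$ is faster than $S \to I$ • Graphical solution of $S = 1 - e^{-cS}$ with $c = \beta/\gamma$: the solutions are the intersections of $y = S$ and $y = 1 - e^{-cS}$; when the only intersection is $S = 0$ there is no giant component (no epidemic) • Transition between the two regimes: the two curves are tangent at $S = 0$, i.e. $\frac{d}{dS}\left(1 - e^{-cS}\right) = 1$, so $c e^{-cS} = 1$, which at $S = 0$ gives $c = 1$ • Epidemic transition: $\beta = \gamma$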
- 41. Basic reproduction number $R_0$ • The average number of additional people to whom an infected (I) person passes the disease – If each I person passes the disease to two others on average, then $R_0 = 2$ → the disease will grow exponentially – If $R_0 = \frac{1}{2}$ → the disease will die out exponentially – If $R_0 = 1$ → epidemic threshold ($\beta = \gamma$)
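The three cases on this slide follow from simple arithmetic: if every case produces $R_0$ new cases on average, generation $g$ contains about $R_0^g$ times the initial number of cases. The sketch below is an idealized illustration that ignores depletion of susceptibles; the function name and the starting count of one case are assumptions.

```python
def cases_per_generation(r0, generations, initial=1.0):
    """Expected number of new cases in each generation, ignoring depletion of susceptibles."""
    return [initial * r0 ** g for g in range(generations + 1)]

growing = cases_per_generation(2.0, 5)   # R0 = 2
dying = cases_per_generation(0.5, 5)     # R0 = 1/2
threshold = cases_per_generation(1.0, 5) # R0 = 1
```

With $R_0 = 2$ the counts double every generation, with $R_0 = \frac{1}{2}$ they halve, and with $R_0 = 1$ they stay constant, which is the epidemic threshold.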
- 42. Modelling COVID-19 epidemic in Italy • SIDARTHE model (figure labels: diagnosed / not diagnosed, severe / mild) Giordano, G., Blanchini, F., Bruno, R. et al. "Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy", Nature Medicine Vol.26, pp.855–860 (2020). https://doi.org/10.1038/s41591-020-0883-7
- 43. Growth of COVID-19 patients • Power-law growth curves of different countries are highly correlated • theoretically: exponential growth, $y = k^x$ • actually: power-law growth, $y = x^k$ Cesar Manchein, Eduardo L. Brugnago, Rafael M. da Silva, Carlos F. O. Mendes, and Marcus W. Beims, "Strong Correlations Between Power-law Growth of COVID-19 in Four Continents and the Inefficiency of Soft Quarantine Strategies", Chaos Vol.30, No.041102 pp.1-7, 2020. https://doi.org/10.1063/5.0009454 Figure source: http://maps.unomaha.edu/maher/GEOL2300/week10/exp.html
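One simple way to tell the two growth laws apart is to compare straight-line fits: exponential growth is linear in a plot of $\log y$ against $x$, while power-law growth is linear in a plot of $\log y$ against $\log x$. The sketch below applies this to synthetic series; it is not the analysis of the cited paper, and the data and function names are assumptions.

```python
import math

def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. R^2 of a least-squares line through (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

def classify_growth(ts, ys):
    """Return the better-fitting growth law plus both R^2 values (requires ts, ys > 0)."""
    log_t = [math.log(t) for t in ts]
    log_y = [math.log(y) for y in ys]
    r2_exponential = r_squared(ts, log_y)   # log y linear in t
    r2_power_law = r_squared(log_t, log_y)  # log y linear in log t
    label = "power law" if r2_power_law > r2_exponential else "exponential"
    return label, r2_power_law, r2_exponential

# Synthetic, noise-free example series (not real case counts).
days = list(range(1, 31))
power_law_series = [t ** 3 for t in days]
exponential_series = [1.2 ** t for t in days]

label_a, _, _ = classify_growth(days, power_law_series)
label_b, _, _ = classify_growth(days, exponential_series)
```

Real case counts are noisy and often change regime over time, so in practice the fit would be restricted to a chosen time window.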
- 44. Effect of travel restrictions • Evaluating travel bans by computer simulation • The Wuhan travel ban was effective for preventing the spread of COVID-19 outside China, although it was not effective inside China (the disease had already spread) Matteo Chinazzi, Jessica T. Davis, Marco Ajelli, Corrado Gioannini, Maria Litvinova, Stefano Merler, Ana Pastore y Piontti, Kunpeng Mu, Luca Rossi, Kaiyuan Sun, Cécile Viboud, Xinyue Xiong, Hongjie Yu, M. Elizabeth Halloran, Ira M. Longini Jr., Alessandro Vespignani, "The Effect of Travel Restrictions on the Spread of the 2019 Novel Coronavirus (COVID-19) Outbreak", Science 24 Apr 2020, Vol. 368, Issue 6489, pp. 395-400, 2020. https://doi.org/10.1126/science.aba9757
- 45. Network analysis of genomes • Phylogenetics: inference of the evolutionary history and relationships among groups of organisms • Three variants: A, B, and C (figure labels: Bat, Europe and America, East Asia) • Virus mutation emerges in two different hosts Peter Forster, Lucy Forster, Colin Renfrew, and Michael Forster, "Phylogenetic Network Analysis of SARS-CoV-2 Genomes", PNAS, Vol.117, No.17, pp.9241-9243, 2020. https://doi.org/10.1073/pnas.2004999117
- 46. “COVID-19 and Networks” • Tsuyoshi Murata, "COVID-19 and Networks", New Generation Computing, Springer, 2021. • https://doi.org/10.1007/s00354-021-00134-2 - Introduction of Network Epidemics - Influence Maximization Problem - Temporal Networks - COVID-19 papers in early 2020 - Data and resources 46
- 47. Table of contents • Networks (graphs) • Networks + Deep Learning • Networks + Deep Learning + COVID-19 – Networks + COVID-19 – Networks + Deep Learning + COVID-19 47
- 48. Graph Representation Learning and Beyond (GRL+) • A workshop co-located with the International Conference on Machine Learning (ICML 2020) – https://grlplus.github.io/covid19/ • “Graph Methods for COVID-19 Response”, William L. Hamilton (McGill University/Mila) – https://grlplus.github.io/files/graphs-against-covid.pdf
- 49. “Graph Methods for COVID-19 Response” • Three key types of data – Biomedical treatment data – Epidemiological network data – Supply chain networks • heterogeneous and relational structures – Computational drug design – Computational treatment design – Epidemiological forecasting – Demand forecasting and supply chain optimization – Outbreak tracking and tracing William L. Hamilton, "Graph Methods for COVID-19 Response", https://grlplus.github.io/files/graphs-against-covid.pdf
- 50. Computational drug design • Can we design better antivirals to target COVID-19? • Sub-problem 1: Molecule representation and property prediction • Sub-problem 2: Molecule generation and search – How can we generate molecules that have particular properties? How can we effectively search over the space of possible molecules? • Possible application of GNNs (still an open challenge) William L. Hamilton, "Graph Methods for COVID-19 Response", https://grlplus.github.io/files/graphs-against-covid.pdf
- 51. Computational treatment design • Can we design better treatment strategies using existing drugs? • Approach 1: structure-based – similar to computational drug design • Approach 2: network-based – Leverage knowledge of biological interactions between drugs, diseases, and proteins William L. Hamilton, "Graph Methods for COVID-19 Response", https://grlplus.github.io/files/graphs-against-covid.pdf
- 52. Epidemiological forecasting • Can we better predict how and where the infection rate will change over time? William L. Hamilton, "Graph Methods for COVID-19 Response", https://grlplus.github.io/files/graphs-against-covid.pdf
- 53. Demand forecasting and supply chain optimization • Can we forecast COVID-19 related demands to optimize supply chains? 1. Heterogeneous relational data 2. Temporal information and changes 3. Node-level predictions → spatio-temporal GNNs are useful William L. Hamilton, "Graph Methods for COVID-19 Response", https://grlplus.github.io/files/graphs-against-covid.pdf
- 54. Outbreak tracking and tracing • Can we model and predict infection risk at the individual level? William L. Hamilton, "Graph Methods for COVID-19 Response", https://grlplus.github.io/files/graphs-against-covid.pdf
- 55. Table of contents • Networks (graphs) • Networks + Deep Learning • Networks + Deep Learning + COVID-19 55